Feynman Innovations

ajibawa-2023

AI & ML interests

LLM, RL, DL, ML, AGI. Developing LLMs (preferably fully fine-tuned) for various use cases.

Organizations

Stanford AI, AI FILMS, Giskard, AI Vulnerability Database (AVID), FreedomAI, Ontocord's M*DEL, The Waifu Research Department, Media Party 2023, Keynote Technology, ZeroGPU Explorers, Aurora-M, One-Man-Army, MLX Community, Social Post Explorers, Cognitive Computations, M4-ai, Data Is Better Together Contributor

ajibawa-2023's activity

Reacted to davidberenstein1957's post with 🔥 1 day ago
🔥 Dataset Drop - Open Image Preferences

Black Forest Labs Flux Dev vs. Stability AI Stable Diffusion 3.5 Large

Together with the data-is-better-together community, we've worked on an Apache 2.0 licensed open image preference dataset based on the fal.ai imgsys prompts dataset. Thanks to the awesome community, we managed to collect 5K preference pairs in less than 2 days, and agreement among annotators is high too.

Aashish Kumar won a month of Hugging Face Pro by making the most contributions! Congrats from the entire team 🥇

The best thing? We are not done yet! Let's keep the annotations coming for 5K more in the second part of the sprint (with more prizes to go around).

Dataset: data-is-better-together/image-preferences-results
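
For anyone who wants to poke at the data, here is a minimal sketch for loading it with the 🤗 datasets library (assuming the repo is public and exposes a default "train" split):

```python
# A minimal sketch, assuming the repo is public and has a default "train" split.
from datasets import load_dataset

ds = load_dataset("data-is-better-together/image-preferences-results", split="train")
print(ds)     # column names and number of preference pairs
print(ds[0])  # one annotated example
```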
Reacted to MohamedRashad's post with 🚀 1 day ago
New activity in ajibawa-2023/Code-290k-ShareGPT 16 days ago

How is this dataset created?

#3 opened 16 days ago by oo22010
Reacted to qq8933's post with 👍 26 days ago
LLaMA-O1: Open Large Reasoning Model Frameworks For Training, Inference and Evaluation With PyTorch and HuggingFace
Large Reasoning Models powered by Monte Carlo Tree Search (MCTS), Self-Play Reinforcement Learning, PPO, AlphaGo Zero's dual-policy paradigm, and Large Language Models!
https://github.com/SimpleBerry/LLaMA-O1/

What will happen when you compound MCTS ❤ LLM ❤ Self-Play ❤ RLHF?
Just a little bite of strawberry!🍓

Past related works:
LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning (2410.02884)
Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B (2406.07394)
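
For intuition, here is a minimal, self-contained sketch of the MCTS self-refine loop the post alludes to (not the actual LLaMA-O1 code); generate_answer, refine_answer, and score_answer are hypothetical stand-ins for LLM and reward-model calls:

```python
# A minimal sketch of Monte Carlo Tree Self-refine: select a candidate answer,
# refine it, score it, and propagate the reward back up the tree.
# The three functions below are hypothetical stand-ins, not real model calls.
import math
import random


def generate_answer(question: str) -> str:              # hypothetical LLM call
    return f"draft answer to: {question}"


def refine_answer(question: str, answer: str) -> str:   # hypothetical LLM call
    return answer + " [refined]"


def score_answer(question: str, answer: str) -> float:  # hypothetical reward model
    return random.random()


class Node:
    def __init__(self, answer, parent=None):
        self.answer, self.parent = answer, parent
        self.children, self.visits, self.value = [], 0, 0.0

    def uct(self, c=1.4):
        # Upper Confidence Bound for Trees: exploit high value, explore rarely visited nodes.
        if self.visits == 0:
            return float("inf")
        return self.value / self.visits + c * math.sqrt(math.log(self.parent.visits) / self.visits)


def mcts_self_refine(question: str, iterations: int = 16) -> str:
    root = Node(generate_answer(question))
    root.visits = 1
    best_answer, best_reward = root.answer, score_answer(question, root.answer)
    for _ in range(iterations):
        # Selection: descend the tree following the highest UCT score.
        node = root
        while node.children:
            node = max(node.children, key=Node.uct)
        # Expansion: refine the selected answer into a new child node.
        child = Node(refine_answer(question, node.answer), parent=node)
        node.children.append(child)
        # Evaluation: score the refined answer with the reward model.
        reward = score_answer(question, child.answer)
        if reward > best_reward:
            best_answer, best_reward = child.answer, reward
        # Backpropagation: update visit counts and values up to the root.
        while child is not None:
            child.visits += 1
            child.value += reward
            child = child.parent
    return best_answer


print(mcts_self_refine("What is 12 * 13?"))
```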
Reacted to Jaward's post with 🔥 26 days ago
It's work like this that in some way signals the eventual “dominance” of AI over all the sciences.

“We train our model on the six-dimensional N-body phase space, predicting particle velocities as the time derivative of the model’s displacement outputs”

The emulator is capable of predicting the nonlinear displacement and velocity fields for 128^3 particles in half a second on a single GPU 🤯
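
As a toy illustration of the quoted setup (not the paper's architecture), a network can map initial positions plus a time variable to displacements, and the velocities then fall out as the time derivative of those outputs via autodiff:

```python
# A minimal toy sketch of the quoted idea: a network predicts particle
# displacements as a function of time, and velocities are obtained as the
# time derivative of those outputs using forward-mode autodiff.
import torch
import torch.nn as nn
from torch.autograd.functional import jvp


class DisplacementEmulator(nn.Module):
    def __init__(self, n_particles: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3 * n_particles + 1, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 3 * n_particles),
        )

    def forward(self, initial_pos: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        # initial_pos: (batch, n_particles, 3), t: (batch, 1)
        x = torch.cat([initial_pos.flatten(1), t], dim=1)
        return self.net(x).reshape(initial_pos.shape)


n_particles = 8  # toy size; the emulator in the post handles 128**3 particles
model = DisplacementEmulator(n_particles)
pos0 = torch.randn(1, n_particles, 3)
t = torch.tensor([[0.5]])

# displacement = model(pos0, t); velocity = d(displacement)/dt
displacement, velocity = jvp(lambda tt: model(pos0, tt), (t,), (torch.ones_like(t),))
print(displacement.shape, velocity.shape)  # both (1, 8, 3)
```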
replied to their post about 1 month ago
upvoted an article about 1 month ago

🇮🇹🇯🇵🇧🇷 Generating multilingual instruction datasets with Magpie 🐦‍⬛

By anakin87
replied to their post about 1 month ago
Reacted to their post with 👀❤️🔥🚀👍 about 1 month ago
posted an update about 1 month ago
New Dataset: Software-Architecture
Link: ajibawa-2023/Software-Architecture

I am releasing a large dataset covering topics related to software architecture. This dataset consists of around 450,000 lines of data in JSONL format.

I have included the following topics:

Architectural Frameworks

Architectural Patterns for Reliability

Architectural Patterns for Scalability

Architectural Patterns

Architectural Quality Attributes

Architectural Testing

Architectural Views

Architectural Decision-Making

Advanced Research

Cloud-Based Architectures

Component-Based Architecture

Data Architecture

Emerging Trends

Event-Driven Architecture

Evolvability and Maintainability

Microservices and Monolithic

Microservices Architecture

Security Architecture

Service-Oriented Architecture

Software Design Principles

and Many More!

This dataset is useful for LLM development, especially for those building LLMs focused on software development. It should be very useful to researchers as well.
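
A minimal sketch for taking a quick look at the data with the 🤗 datasets library (assuming a default "train" split):

```python
# A minimal sketch, assuming the standard Hugging Face `datasets` API
# and a default "train" split for this repo.
from datasets import load_dataset

ds = load_dataset("ajibawa-2023/Software-Architecture", split="train")
print(ds)     # features and number of rows
print(ds[0])  # one example record
```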
Reacted to MonsterMMORPG's post with 🔥 about 2 months ago
Huge news for Kohya GUI: you can now fully fine-tune / DreamBooth FLUX Dev on GPUs with as little as 6 GB of VRAM, with no quality loss compared to 48 GB GPUs. Moreover, fine-tuning yields better results than any LoRA training could.

Config Files
I published all configs here : https://www.patreon.com/posts/112099700

Tutorials
Fine tuning tutorial in production

Windows FLUX LoRA training (fine-tuning is the same, just config changes): https://youtu.be/nySGu12Y05k

Cloud FLUX LoRA training (RunPod and Massed Compute ultra cheap) : https://youtu.be/-uhL2nW7Ddw

LoRA Extraction
The checkpoint sizes are 23.8 GB, but you can extract a LoRA with almost no quality loss. I researched this and published a public article/guide for it as well.

LoRA extraction guide from Fine Tuned checkpoint is here : https://www.patreon.com/posts/112335162
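
For context, the general idea behind LoRA extraction is to approximate the difference between the fine-tuned and base weights with a low-rank factorization. A minimal sketch of that concept (not the actual Kohya extraction script; layer shapes and the rank here are illustrative assumptions):

```python
# A minimal sketch of LoRA extraction: approximate (W_finetuned - W_base)
# with a rank-r factorization B @ A obtained from a truncated SVD.
# This shows the general idea only, not the actual Kohya extraction script.
import torch


def extract_lora(w_base: torch.Tensor, w_finetuned: torch.Tensor, rank: int = 32):
    delta = (w_finetuned - w_base).float()   # weight difference to approximate
    u, s, vh = torch.linalg.svd(delta, full_matrices=False)
    # Keep only the top-`rank` singular components.
    lora_up = u[:, :rank] * s[:rank]         # (out_features, rank), the "B" matrix
    lora_down = vh[:rank, :]                 # (rank, in_features), the "A" matrix
    return lora_up, lora_down


# Toy example with a random "base" and "fine-tuned" weight matrix.
w_base = torch.randn(1024, 1024)
w_ft = w_base + 0.01 * torch.randn(1024, 1024)
up, down = extract_lora(w_base, w_ft, rank=32)
print(up.shape, down.shape)                  # (1024, 32) and (32, 1024)
rel_error = torch.norm(w_ft - w_base - up @ down) / torch.norm(w_ft - w_base)
print(f"relative approximation error: {rel_error:.3f}")
```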

Info
This is just mind-blowing. The recent improvements Kohya made for block swapping are just amazing.

Speeds are also amazing, as you can see in image 2. Of course, those values are based on my researched config and were tested on an RTX A6000 (almost the same speed as an RTX 3090).

Also, all training experiments were done at 1024x1024 px. If you use a lower resolution, VRAM usage will be lower and speed will be higher.

VRAM usage will vary with your own configuration, and likely speed as well.

Moreover, Fine Tuning / DreamBooth yields better results than any LoRA could

Installers
1. Kohya GUI accurate branch, Windows Torch 2.5 installers, and test prompts are shared here: https://www.patreon.com/posts/110879657

The Kohya GUI link for the accurate branch: https://github.com/bmaltais/kohya_ss/tree/sd3-flux.1