Semantic search for models, datasets, etc. would be awesome and is critically lacking! (E.g., if you wanted to find all the legal datasets on Hugging Face, you're better off using Google: you'd have to search "law", "contract", "legal", etc. one by one, and even then you'd miss things like umarbutler/emubert, which doesn't mention law in its title but is definitely legal-related.)
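In the meantime, a rough workaround is to do the semantic search client-side: fetch candidate repo cards, embed them, and rank them against a natural-language query. A minimal sketch, assuming sentence-transformers and huggingface_hub are installed; the repo ids, query, and embedding model are only illustrative:

```python
# Rough client-side "semantic search" over Hugging Face dataset cards.
# Assumes: pip install sentence-transformers huggingface_hub
from huggingface_hub import DatasetCard
from sentence_transformers import SentenceTransformer, util

# Illustrative repo ids only -- in practice you would page through
# list_datasets() and embed every card you can fetch.
candidate_ids = [
    "MoritzLaurer/synthetic_zeroshot_mixtral_v0.1",
    # "some-org/some-legal-dataset",  # placeholder
]

model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative embedding model

cards = []
for repo_id in candidate_ids:
    try:
        cards.append((repo_id, DatasetCard.load(repo_id).text))
    except Exception:
        continue  # card missing or repo private

query = "legal text corpora (legislation, contracts, case law)"
query_emb = model.encode(query, convert_to_tensor=True)
card_embs = model.encode([text for _, text in cards], convert_to_tensor=True)

# Rank candidate datasets by cosine similarity to the query.
scores = util.cos_sim(query_emb, card_embs)[0]
for (repo_id, _), score in sorted(zip(cards, scores), key=lambda x: -x[1].item()):
    print(f"{score.item():.3f}  {repo_id}")
```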
Umar Butler (umarbutler)
AI & ML interests: Law, technology, AI and everything in between.

Recent activity:
- Liked a dataset about 1 month ago: MoritzLaurer/synthetic_zeroshot_mixtral_v0.1
- New activity about 1 month ago on MoritzLaurer/deberta-v3-large-zeroshot-v2.0: "Why not SNLI?"
umarbutler's activity
Why not SNLI? · 1 · #6 opened about 1 month ago by umarbutler
Premise and hypothesis wrong way around? · 2 · #2 opened 9 months ago by MoritzLaurer
Significant train/test imbalance makes this more tailored to GenAI rather than LLMs in general · 3 · #31 opened 2 months ago by umarbutler
Reacted to MoritzLaurer's post with 👍, 2 months ago:
Why would you fine-tune a model if you can just prompt an LLM? The new paper "What is the Role of Small Models in the LLM Era: A Survey" provides a nice pro/con overview. My go-to approach combines both:
1. Start testing an idea by prompting an LLM/VLM behind an API. It's fast and easy, and I avoid wasting time on tuning a model for a task that might not make it into production anyway. (Sketches of these steps follow after the links below.)
2. The LLM/VLM then needs to be manually validated. Anyone seriously considering putting AI into production has to do at least some manual validation. Setting up a good validation pipeline with a tool like Argilla is crucial, and it can be reused for any future experiments. Note: you can use LLM-as-a-judge to automate some evals, but you always also need to validate the judge!
3. Based on this validation I can then (a) just continue using the prompted LLM if it is accurate enough and it makes sense financially given my load; or (b) if the LLM is not accurate enough or too expensive to run in the long run, I reuse the existing validation pipeline to annotate some additional data for fine-tuning a smaller model. This can be sped up by reusing and correcting synthetic data from the LLM (or just pure distillation).
Paper: https://arxiv.org/pdf/2409.06857
Argilla docs: https://docs.argilla.io/latest/
Argilla is also very easy to deploy with Hugging Face Spaces (or locally): https://huggingface.co/new-space?template=argilla%2Fargilla-template-space
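To make steps 1 and 2 concrete, here is a minimal sketch (not MoritzLaurer's actual setup): prompt a hosted LLM to label documents, then measure agreement against a handful of manually annotated examples before trusting it. The model name, label set, and example texts are placeholders:

```python
# Steps 1-2: prototype a classifier by prompting an LLM, then spot-check it.
# Assumes: pip install openai, with OPENAI_API_KEY set; model and labels are placeholders.
from openai import OpenAI

client = OpenAI()
LABELS = ["contract", "legislation", "case_law", "other"]

def llm_label(text: str) -> str:
    """Ask the hosted LLM to pick exactly one label for a document."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system",
             "content": f"Classify the document into exactly one of: {', '.join(LABELS)}. "
                        "Reply with the label only."},
            {"role": "user", "content": text[:4000]},
        ],
        temperature=0,
    )
    return response.choices[0].message.content.strip()

# Step 2: compare against labels you annotated yourself (in Argilla or any other
# tool) before deciding whether the prompt alone is good enough for production.
validated = [
    ("This agreement is made between the parties...", "contract"),
    ("Section 5 of the Act provides that...", "legislation"),
]
agreement = sum(llm_label(text) == gold for text, gold in validated) / len(validated)
print(f"Agreement with manual labels: {agreement:.0%}")
```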
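And a minimal sketch of step 3, again with placeholder data and hyperparameters: reuse the validated and corrected LLM labels to fine-tune a small classifier with transformers.

```python
# Step 3: distil the validated/corrected LLM labels into a small model.
# Assumes: pip install transformers datasets accelerate; base model and data are placeholders.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

LABELS = ["contract", "legislation", "case_law", "other"]
label2id = {l: i for i, l in enumerate(LABELS)}

# In practice these records come from the validation pipeline (corrected LLM annotations).
records = [
    {"text": "This agreement is made between the parties...", "label": label2id["contract"]},
    {"text": "Section 5 of the Act provides that...", "label": label2id["legislation"]},
]
dataset = Dataset.from_list(records).train_test_split(test_size=0.5, seed=42)

model_name = "distilroberta-base"  # placeholder small model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name, num_labels=len(LABELS),
    id2label={i: l for l, i in label2id.items()}, label2id=label2id,
)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="small-legal-classifier",
                           num_train_epochs=3, per_device_train_batch_size=16),
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
    tokenizer=tokenizer,  # enables default padding collation
)
trainer.train()
```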
Can this be trained? · #4 opened 2 months ago by umarbutler
Conversion to tiktoken · 3 · #4 opened 6 months ago by koyfman
Model card looks a bit messed up · 1 · #3 opened 2 months ago by umarbutler