A Gentle Introduction to 8-bit Matrix Multiplication for transformers at scale using transformers, accelerate and bitsandbytes Aug 17, 2022 • 63
Falcon Mamba: The First Competitive Attention-free 7B Language Model Paper • 2410.05355 • Published Oct 7 • 29
🦅 🐍 FalconMamba 7B Collection This collection features the FalconMamba 7B base model, the instruction-tuned version, their 4-bit and GGUF variants, and the demo. • 15 items • Updated Oct 10 • 29
Multitask Prompt Tuning Enables Parameter-Efficient Transfer Learning Paper • 2303.02861 • Published Mar 6, 2023 • 2
XTTS: a Massively Multilingual Zero-Shot Text-to-Speech Model Paper • 2406.04904 • Published Jun 7 • 4
AQLM+PV Collection Official AQLM quantizations for "PV-Tuning: Beyond Straight-Through Estimation for Extreme LLM Compression": https://arxiv.org/abs/2405.14852 • 25 items • Updated 20 days ago • 19
Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations Paper • 2405.18392 • Published May 28 • 12
Article Overview of natively supported quantization schemes in 🤗 Transformers Sep 12, 2023 • 11
Meta Llama 3 Collection This collection hosts the Transformers-format and original repos of the Meta Llama 3 and Llama Guard 2 releases. • 5 items • Updated Sep 25 • 683
Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study Paper • 2404.10719 • Published Apr 16 • 4
Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent Paper • 2402.09844 • Published Feb 15 • 20
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published Apr 22 • 254
ORPO: Monolithic Preference Optimization without Reference Model Paper • 2403.07691 • Published Mar 12 • 62
Article Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA May 24, 2023 • 94