Florian Zimmermeister

flozi00

AI & ML interests

ASR, German LLM

Recent Activity

updated a dataset about 9 hours ago
flozi00/asr-german-mixed-evals
liked a dataset 7 days ago
AI-MO/NuminaMath-CoT
liked a dataset 8 days ago
HuggingFaceTB/smoltalk

Organizations

Training Transformers Together, Speech Recognition Community Event Version 2, A\\Ware, primeLine AI Services, ZeroGPU Explorers, primeLine Research Community, Hugging Face Discord Community, open/ acc, Data Is Better Together Contributor

flozi00's activity

New activity in primeline/whisper-large-v3-turbo-german 19 days ago

Convert to .bin?

4
#4 opened about 1 month ago by Artmart23
upvoted an article 19 days ago

Releasing the largest multilingual open pretraining dataset

By Pclanglais • 97
upvoted an article 24 days ago

SauerkrautLM's Multi-Phase Spectrum Training: A Technical Deep Dive

By DavidGF • 9
reacted to DavidGF's post with 👍 28 days ago
🎉 Celebrating One Year of #SauerkrautLM with Two Groundbreaking Releases!

We're thrilled to announce the release of SauerkrautLM-v2-14b in two specialized versions: VAGOsolutions/SauerkrautLM-v2-14b-SFT and VAGOsolutions/SauerkrautLM-v2-14b-DPO. Built on the robust Qwen2.5-14B foundation, these models represent a significant leap forward in multilingual AI capabilities.

🔬 Technical Breakthroughs:
💠 Innovative three-phase fine-tuning approach
💠 Two-step Spectrum SFT + one-step Spectrum DPO optimization phase for enhanced performance
💠 Balance of German and English language capabilities
💠 Advanced function calling - almost on par with Claude-3.5-Sonnet-20240620

🇩🇪 German Language Excellence:
What sets this release apart is our unique achievement in simultaneously improving both German and English capabilities. Through our specialized training approach with over 1.2B tokens across two phases, we've managed to:
💠 Enhance German language understanding and generation (SFT version > DPO version)
💠 Maintain authentic German linguistic nuances
💠 Improve cross-lingual capabilities
💠 Preserve cultural context awareness

📊 Training Innovation:
Our three-phase approach targeted specific layer percentages (15%, 20% and 25%) with carefully curated datasets, including:
💠 Mathematics-focused content (proprietary classifier-selected)
💠 High-quality German training data
💠 Specialized function calling datasets
💠 Premium multilingual content

🎁 Community Contribution:
We're also releasing two new datasets in a few days:
1️⃣ SauerkrautLM-Fermented-GER-DPO: 3,300 high-quality German training samples
2️⃣ SauerkrautLM-Fermented-Irrelevance-GER-DPO: 2,000 specialized samples for optimized function call irrelevance handling

Thank you to our incredible community and partners who have supported us throughout this journey. Here's to another year of AI innovation! 🚀
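
For readers wondering what targeting a fixed percentage of layers looks like in code, here is a minimal, hypothetical sketch (not the SauerkrautLM training code): it freezes a Qwen2.5 base model and unfreezes only a chosen fraction of decoder layers before running SFT. The "last N% of layers" rule below is a placeholder; Spectrum itself chooses modules via a signal-to-noise analysis, and the model name and fraction are assumptions for illustration.

```python
# Hypothetical sketch: freeze everything, then unfreeze a chosen fraction of
# decoder layers before SFT. Spectrum proper selects layers by signal-to-noise
# analysis; the "last N%" rule here is only a stand-in.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-14B", torch_dtype=torch.bfloat16
)

TARGET_FRACTION = 0.25  # one of the 15% / 20% / 25% phases mentioned above

# Freeze all parameters first.
for param in model.parameters():
    param.requires_grad = False

# Unfreeze the chosen fraction of decoder layers (placeholder selection rule).
layers = model.model.layers
num_trainable = max(1, int(len(layers) * TARGET_FRACTION))
for layer in layers[-num_trainable:]:
    for param in layer.parameters():
        param.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"Training {trainable / total:.1%} of parameters")
```

The three phases described in the post would presumably repeat a setup like this with different layer percentages and datasets, followed by the Spectrum DPO stage.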
reacted to qq8933's post with 👍 29 days ago
LLaMA-O1: Open Large Reasoning Model Frameworks For Training, Inference and Evaluation With PyTorch and HuggingFace
Large Reasoning Models powered by Monte Carlo Tree Search (MCTS), Self-Play Reinforcement Learning, PPO, AlphaGo Zero's dual policy paradigm, and Large Language Models!
https://github.com/SimpleBerry/LLaMA-O1/

What will happen when you compound MCTS ❤ LLM ❤ Self-Play ❤ RLHF?
Just a little bite of strawberry! 🍓

Past related works:
LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning (2410.02884)
Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B (2406.07394)
  • 2 replies
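
As a toy illustration of compounding MCTS with an LLM, here is a hypothetical sketch in which stub functions stand in for the policy LLM (proposing candidate reasoning steps) and a value model (scoring partial solutions); it is not the LLaMA-O1 implementation and omits self-play and PPO.

```python
# Toy MCTS over reasoning traces. propose_steps() and estimate_value() are
# stand-ins for an LLM policy and a value/reward model.
import math
import random

class Node:
    def __init__(self, state, parent=None):
        self.state = state          # partial reasoning trace
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value_sum = 0.0

    def ucb(self, c=1.4):
        if self.visits == 0:
            return float("inf")
        exploit = self.value_sum / self.visits
        explore = c * math.sqrt(math.log(self.parent.visits) / self.visits)
        return exploit + explore

def propose_steps(state, k=3):
    # Stand-in for sampling k candidate next steps from the policy LLM.
    return [state + f" -> step{random.randint(0, 99)}" for _ in range(k)]

def estimate_value(state):
    # Stand-in for a value/reward model scoring the partial solution.
    return random.random()

def mcts(root_state, iterations=50):
    root = Node(root_state)
    for _ in range(iterations):
        # Selection: descend by UCB until reaching a leaf.
        node = root
        while node.children:
            node = max(node.children, key=Node.ucb)
        # Expansion: let the "LLM" propose continuations.
        node.children = [Node(s, parent=node) for s in propose_steps(node.state)]
        leaf = random.choice(node.children)
        # Evaluation + backpropagation.
        value = estimate_value(leaf.state)
        while leaf is not None:
            leaf.visits += 1
            leaf.value_sum += value
            leaf = leaf.parent
    return max(root.children, key=lambda n: n.visits).state

print(mcts("question"))
```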
New activity in primeline/whisper-large-v3-turbo-german about 1 month ago

german or swiss-german

2
#5 opened about 1 month ago by jschoene