HuggingFaceTB/finemath
FineMath datasets and ablation models
Note FineMath datasets
Note FineMath text classifier that scores text for mathematical reasoning and educational content
Note Ablations on FineMath subsets (continual pre-training of base Llama 3.2 3B on 60B tokens)
Note Ablations on FineMath plus3 and plus4 (continual pre-training of base Llama 3.2 3B on 60B tokens)
Note Ablations on public math datasets, with FineWeb-Edu (FW-Edu) as a baseline (continual pre-training of base Llama 3.2 3B on 60B tokens)
Note Longer ablation on 160B tokens using a mix of 40% FineWeb-Edu and 60% FineMath and InfiWebMath (3plus / 4plus)
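To make the 40% / 60% mix in the longer ablation concrete, here is a minimal sketch of how a 160B budget divides between the two parts. It assumes the 160B figure counts tokens and that the 60% share covers the math data as a whole; the function name `split_token_budget` and the source labels are hypothetical and do not come from the FineMath code.

```python
def split_token_budget(total_tokens: int, weights_percent: dict) -> dict:
    """Split a token budget across sources given integer percentage weights."""
    if sum(weights_percent.values()) != 100:
        raise ValueError("percentage weights must sum to 100")
    # Integer arithmetic keeps the result exact for large token counts.
    return {name: total_tokens * pct // 100 for name, pct in weights_percent.items()}


# Assumed reading of the note: 160B tokens, 40% FineWeb-Edu, 60% math data.
budget = split_token_budget(
    160_000_000_000,
    {"fineweb-edu": 40, "math": 60},
)

for source, tokens in budget.items():
    print(f"{source}: {tokens / 1e9:.0f}B tokens")
```

Under these assumptions the split works out to 64B tokens of FineWeb-Edu and 96B tokens of math data.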