---
license: apache-2.0
datasets:
- MathGenie/MathCode-Pile
language:
- en
metrics:
- accuracy
base_model:
- meta-llama/Meta-Llama-3-8B
pipeline_tag: text-generation
tags:
- math
---
# MathCoder2
### Introduction
The MathCoder2 models are obtained by continued pretraining on [MathCode-Pile](https://huggingface.co/datasets/MathGenie/MathCode-Pile). They are introduced in the paper [MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code](https://arxiv.org/abs/2410.08196).
MathCode-Pile pairs mathematical code with the natural language reasoning steps it implements, which makes it well suited to training models for advanced mathematical reasoning.
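The following is a minimal sketch of how a checkpoint like this could be loaded with the `transformers` library and prompted for a solution. It is not taken from the paper or the official repository. The repository id `MathGenie/MathCoder2-Llama-3-8B`, the `Problem:`/`Solution:` prompt layout, and the helper names `build_prompt` and `generate_solution` are assumptions made for illustration, so check the actual model id on the Hub before running it. `device_map="auto"` additionally assumes the `accelerate` package is installed.

```python
# Assumed repository id; replace it with the id shown on the model page.
MODEL_ID = "MathGenie/MathCoder2-Llama-3-8B"


def build_prompt(problem: str) -> str:
    """Format a problem as a plain completion prompt.

    The model is a continued-pretraining (base) checkpoint, so this sketch
    uses plain text rather than a chat template. The layout is an assumption.
    """
    return f"Problem: {problem.strip()}\nSolution:"


def generate_solution(problem: str, model_id: str = MODEL_ID,
                      max_new_tokens: int = 256) -> str:
    """Load the model and greedily decode a continuation of the prompt."""
    # Imported lazily so the prompt helper works without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # halves memory relative to float32
        device_map="auto",           # requires the accelerate package
    )

    inputs = tokenizer(build_prompt(problem), return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens,
                                do_sample=False)

    # Keep only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_solution("What is the sum of the first 10 positive integers?")` downloads the weights on first use and returns the generated continuation as a string. Greedy decoding is used here only to keep the output deterministic; it is not a recommendation from the authors.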
### Evaluation
![image/png](https://cdn-uploads.huggingface.co/production/uploads/65dd9e7b4a4fce1ec96dc6b7/BEZoDZLjp-fPFlt7oFXBa.png)
### Citation
If you find this repository helpful, please consider citing our papers:
```
@misc{lu2024mathcoder2bettermathreasoning,
title={MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code},
author={Zimu Lu and Aojun Zhou and Ke Wang and Houxing Ren and Weikang Shi and Junting Pan and Mingjie Zhan and Hongsheng Li},
year={2024},
eprint={2410.08196},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2410.08196},
}
```
```
@inproceedings{wang2024mathcoder,
title={MathCoder: Seamless Code Integration in {LLM}s for Enhanced Mathematical Reasoning},
author={Ke Wang and Houxing Ren and Aojun Zhou and Zimu Lu and Sichun Luo and Weikang Shi and Renrui Zhang and Linqi Song and Mingjie Zhan and Hongsheng Li},
booktitle={The Twelfth International Conference on Learning Representations},
year={2024},
url={https://openreview.net/forum?id=z8TW0ttBPp}
}
```