	---
	license: apache-2.0
	datasets:
	- dyyyyyyyy/ScaleQuest-Math
	language:
	- en
	metrics:
	- accuracy
	library_name: transformers
	pipeline_tag: text-generation
	---
	<p align="center"><h2 align="center">Unleashing Reasoning Capability of LLMs via Scalable Question Synthesis from Scratch</h2></p>

	# Model Card for Mistral-7B-ScaleQuest

	We introduce ScaleQuest, a scalable and novel data synthesis method that uses small open-source models to generate questions from scratch, without requiring seed data with complex augmentation constraints.

	* 📑 Project Page: [https://scalequest.github.io](https://scalequest.github.io/)
	* 💻 Code: [https://github.com/yyDing1/ScaleQuest](https://github.com/yyDing1/ScaleQuest/)
	* 📖 Paper: [Unleashing Reasoning Capability of LLMs via Scalable Question Synthesis from Scratch](https://arxiv.org/abs/2410.18693)
	* 💾 Models in the 🤗 HuggingFace Hub: [ScaleQuest-Models](https://huggingface.co/collections/dyyyyyyyy/scalequest-670a7dc2623c91990f28913b)

	<p align="center">
	<img src="https://github.com/yyDing1/ScaleQuest/raw/main/img/results.png">
	</p>

	## Datasets & Models

	Math Dataset: [link](https://huggingface.co/datasets/dyyyyyyyy/ScaleQuest-Math)

	We release two question generator models and four problem-solving models.

	| Model | Type | MATH | Olympiad Bench | 🤗 HuggingFace<br />Download Link |
	| - | :-: | :-: | :-: | :-: |
	| ScaleQuest-DeepSeekMath-7B-QGen | question generator | - | - | [link](https://huggingface.co/dyyyyyyyy/ScaleQuest-DeepSeekMath-7B-QGen) |
	| ScaleQuest-Qwen2-Math-7B-QGen | question generator | - | - | [link](https://huggingface.co/dyyyyyyyy/ScaleQuest-Qwen2-Math-7B-QGen) |
	| Mistral-7B-ScaleQuest | problem solver | 62.9 | 26.8 | [link](https://huggingface.co/dyyyyyyyy/Mistral-7B-ScaleQuest) |
	| Llama3-8B-ScaleQuest | problem solver | 64.4 | 25.3 | [link](https://huggingface.co/dyyyyyyyy/Llama3-8B-ScaleQuest) |
	| DeepSeekMath-7B-ScaleQuest | problem solver | 66.6 | 29.9 | [link](https://huggingface.co/dyyyyyyyy/DeepSeekMath-7B-ScaleQuest) |
	| Qwen2-Math-7B-ScaleQuest | problem solver | 73.4 | 38.5 | [link](https://huggingface.co/dyyyyyyyy/Qwen2-Math-7B-ScaleQuest) |

	## Demo usage

	Below is an example using `Mistral-7B-ScaleQuest`:

	```python
	import torch
	from transformers import AutoModelForCausalLM, AutoTokenizer

	model_name = "dyyyyyyyy/Mistral-7B-ScaleQuest"

	model = AutoModelForCausalLM.from_pretrained(
	    model_name,
	    torch_dtype=torch.bfloat16,
	    device_map="auto",
	)
	tokenizer = AutoTokenizer.from_pretrained(model_name)

	question = "Find the value of $x$ that satisfies the equation $4x+5 = 6x+7$."

	# Alpaca-style prompt template: a system preamble followed by instruction and response headers.
	sys_prompt = "Below is an instruction that describes a task. Write a response that appropriately completes the request." + "\n\n"
	query_prompt = "### Instruction:" + "\n"
	# {query}
	prompt_after_query = "\n\n"
	resp_prompt = "### Response:" + "\n"
	prompt_before_resp = ""
	# {resp}
	delim = "\n\n"

	# Assemble a single-turn prompt that ends right after the response header.
	prefix_prompt = f"{query_prompt}{question}{prompt_after_query}{resp_prompt}{prompt_before_resp}".rstrip(" ")
	full_prompt = sys_prompt + delim.join([prefix_prompt])

	# print(full_prompt)

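	inputs = tokenizer(full_prompt, return_tensors="pt").to(model.device)
	# Greedy decoding (do_sample=False), capped at 512 new tokens.
	outputs = model.generate(**inputs, max_new_tokens=512, do_sample=False)
	# Slice off the prompt tokens so only the generated answer is printed.
	print(tokenizer.decode(outputs[0][len(inputs.input_ids[0]):], skip_special_tokens=True))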
	```
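
The prompt that the demo assembles piece by piece can also be wrapped in a small helper, which is convenient when querying the model with many questions. The following is a minimal sketch rather than part of the released code: `SYSTEM_PROMPT` and `build_prompt` are hypothetical names introduced here, and the string they produce is intended to match `full_prompt` from the demo above.

```python
# Hypothetical helper (not from the ScaleQuest repository): rebuilds the
# prompt used in the demo above as a single function.

SYSTEM_PROMPT = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request."
)


def build_prompt(question: str) -> str:
    """Return the single-turn prompt for one question.

    Layout: system preamble, blank line, instruction header and question,
    blank line, response header followed by a newline.
    """
    return (
        f"{SYSTEM_PROMPT}\n\n"
        f"### Instruction:\n{question}\n\n"
        "### Response:\n"
    )


if __name__ == "__main__":
    # Print the prompt for the question used in the demo.
    print(build_prompt("Find the value of $x$ that satisfies the equation $4x+5 = 6x+7$."))
```

The returned string can be passed to `tokenizer(...)` in place of `full_prompt`; the rest of the demo stays unchanged.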

	## Citation

	```bibtex
	@article{ding2024unleashing,
	    title={Unleashing Reasoning Capability of LLMs via Scalable Question Synthesis from Scratch},
	    author={Ding, Yuyang and Shi, Xinyu and Liang, Xiaobo and Li, Juntao and Zhu, Qiaoming and Zhang, Min},
	    journal={arXiv preprint arXiv:2410.18693},
	    year={2024}
	}
	```