liang.zhao committed on
Commit
07466c6
2 Parent(s): d341f4c 9f25ea2

Merge branch 'main' of hf.co:Skywork/Skywork-13B-Math-8bits into main

Files changed (1)
  1. README.md +270 -0
README.md CHANGED
@@ -4,3 +4,273 @@ license_name: license
  license_link: >-
  https://github.com/SkyworkAI/Skywork/blob/main/Skywork%20Community%20License.pdf
  ---
+ <!-- <div align="center">
+ <h1>
+ ✨Skywork
+ </h1>
+ </div> -->
+ <div align="center"><img src="misc/skywork_logo.jpeg" width="550"/></div>
+
+ <p align="center">
+ 🤗 <a href="https://huggingface.co/Skywork" target="_blank">Hugging Face</a> • 🤖 <a href="https://modelscope.cn/organization/Skywork" target="_blank">ModelScope</a> • 💬 <a href="https://github.com/SkyworkAI/Skywork/blob/main/misc/wechat.png?raw=true" target="_blank">WeChat</a> • 📜 <a href="https://arxiv.org/" target="_blank">Tech Report</a> • 🧮 <a href="https://arxiv.org/" target="_blank">Skymath Paper</a>
+ </p>
+
+ <div align="center">
+
+ [🎉 The Tiangong online chat platform is now open to the public](https://sso.tiangong.cn/?redirect=https://model-platform.tiangong.cn/overview&client_id=200005)
+
+ </div>
+
+ <div align="center">
+
+ [![GitHub Stars](https://img.shields.io/github/stars/SkyworkAI/Skywork)](https://github.com/SkyworkAI/Skywork/stargazers)
+ [![GitHub Forks](https://img.shields.io/github/forks/SkyworkAI/Skywork)](https://github.com/SkyworkAI/Skywork/fork)
+ </div>
+
+ # Introduction
+ **Skywork-13B-Math** has undergone specialized training to strengthen its mathematical abilities. Among 13B-scale models, it ranks first on the GSM8K benchmark and also performs strongly on the MATH and CMATH datasets, placing it among the top 13B models.
+
+ For more details on training and evaluation, please refer to our [technical report](https://arxiv.org/skywork-tech-report) and the [Skywork-Math](https://arxiv.org/skywork-tech-report) paper.
+
+ # Quickstart
+ We have open-sourced the model parameters, configuration files, tokenizer, and more on Hugging Face and ModelScope.
+
+ ## Requirements
+ - Python 3.8 or later
+ - PyTorch 2.0 or later
+ - CUDA 11.4 or later is recommended.
+
+ For the Skywork-13B-Base, Skywork-13B-Chat, and Skywork-13B-Math models, run the following command to install the Python dependencies:
+
+ ```shell
+ pip install -r requirements.txt
+ ```
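The version requirements listed above can be checked before loading a model. The sketch below is illustrative only: `meets_minimum` and `check_environment` are hypothetical helper names, not part of the Skywork repository, and the PyTorch and CUDA checks only run when `torch` can be imported.

```python
import re
import sys


def meets_minimum(version, minimum):
    """Return True if a dotted version string is at least `minimum` (a tuple of ints)."""
    parts = []
    # Drop local build tags such as "+cu118", then read the leading digits of each piece.
    for piece in version.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    # Pad so that "2" compares like "2.0".
    parts += [0] * (len(minimum) - len(parts))
    return tuple(parts[:len(minimum)]) >= tuple(minimum)


def check_environment():
    """Collect human-readable problems; an empty list means every check passed."""
    problems = []
    if sys.version_info[:2] < (3, 8):
        problems.append("Python 3.8 or later is required")
    try:
        import torch  # only checked when PyTorch is installed
    except ImportError:
        problems.append("PyTorch is not installed")
    else:
        if not meets_minimum(torch.__version__, (2, 0)):
            problems.append("PyTorch 2.0 or later is required")
        cuda = torch.version.cuda  # None on CPU-only builds
        if cuda is not None and not meets_minimum(cuda, (11, 4)):
            problems.append("CUDA 11.4 or later is recommended")
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["Environment looks fine"]:
        print(line)
```

Comparing integer tuples rather than raw strings avoids the trap where `"11.10"` sorts before `"11.4"` as text.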
+ ## Hugging Face Model Demonstration
+
+ ### Math Model Inference
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ import torch
+
+ # Set these to the location of the tokenizer and the checkpoint.
+ tokenizer_path = ""
+ checkpoint_path = ""
+
+ tokenizer = AutoTokenizer.from_pretrained(
+     tokenizer_path, use_fast=False, trust_remote_code=True, padding_side='left')
+
+ model = AutoModelForCausalLM.from_pretrained(
+     checkpoint_path, device_map="auto", trust_remote_code=True).eval()
+ tokenizer.add_tokens(["[USER]", "[BOT]", "[SEP]"])
+
+ def special_encode(input, tokenizer):
+     # Wrap the question in the chat template: [USER]question[SEP][BOT]
+     raw_str = "[USER]%s[SEP][BOT]" % input.strip().replace("\r", "")
+     eos_id = tokenizer.eos_token_id
+     bos_id = tokenizer.bos_token_id
+     sep_id = tokenizer.encode("[SEP]")[-1]
+     res_id = [eos_id, bos_id]
+     arr = raw_str.split("[SEP]")
+     for elem_idx in range(len(arr)):
+         elem = arr[elem_idx]
+         # Skip the first token that encode() adds in front of each segment.
+         elem_id = tokenizer.encode(elem)[1:]
+         res_id += elem_id
+         if elem_idx < len(arr) - 1:
+             res_id.append(sep_id)
+
+     return res_id
+
+ def extract_res(response):
+     # Keep only the model's answer: the text after [BOT], without template tokens.
+     if "[BOT]" in response:
+         response = response.split("[BOT]")[1]
+     if "<s>" in response:
+         response = response.split("<s>")[-1]
+     if "</s>" in response:
+         response = response.split("</s>")[0]
+     if "[SEP]" in response:
+         response = response.split("[SEP]")[0]
+     return response
+
+ if __name__ == '__main__':
+     # Chinese word problem: 150 kg of pesticide at 20% concentration is diluted
+     # to 5%; how many kilograms of water must be added?
+     text = "小王要将150千克含药量20%的农药稀释成含药量5%的药水.需要加水多少千克?"
+     text_token_ids = torch.tensor(special_encode(
+         text, tokenizer)).to(model.device).reshape(1, -1)
+     response = model.generate(text_token_ids, do_sample=False, max_length=512)
+     response_text = tokenizer.decode(response.cpu()[0], skip_special_tokens=True)
+
+     response_text = extract_res(response_text)
+     print(response_text)
+     """Skywork-13B-Math Response (in Chinese; the final answer is 450 kg of water):
+     首先,我们需要计算出150千克含药量20%的农药中含有多少千克的药。\n\n150千克 * 20% = 30千克\n\n然后,我们需要计算出要得到含药量5%的药水,需要多少千克的药水。\n\n30千克 / 5% = 600千克\n\n最后,我们需要计算出需要加多少千克的水。\n\n600千克 - 150千克 = 450千克\n\n所以答案是,小王需要加450千克的水。
+     """
+ ```
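To make the prompt layout built by `special_encode` easier to see, here is a minimal sketch that runs the same logic against a toy tokenizer. `ToyTokenizer` and its ids are hypothetical and exist only for illustration; the real Skywork tokenizer produces different ids. `extract_res` is copied from this README so the block runs on its own.

```python
class ToyTokenizer:
    """Hypothetical stand-in for the real tokenizer, for illustration only.

    It prepends BOS, maps each special token to one id, and maps every other
    character to 100 plus its code point.
    """
    bos_token_id = 1
    eos_token_id = 2
    specials = {"[USER]": 3, "[BOT]": 4, "[SEP]": 5}

    def encode(self, text):
        ids = [self.bos_token_id]
        pos = 0
        while pos < len(text):
            for token, token_id in self.specials.items():
                if text.startswith(token, pos):
                    ids.append(token_id)
                    pos += len(token)
                    break
            else:
                # No special token starts here, so encode a single character.
                ids.append(100 + ord(text[pos]))
                pos += 1
        return ids


def special_encode(text, tokenizer):
    """Same logic as the README helper: EOS, BOS, then the [SEP]-joined segments."""
    raw_str = "[USER]%s[SEP][BOT]" % text.strip().replace("\r", "")
    sep_id = tokenizer.encode("[SEP]")[-1]
    res_id = [tokenizer.eos_token_id, tokenizer.bos_token_id]
    segments = raw_str.split("[SEP]")
    for idx, segment in enumerate(segments):
        res_id += tokenizer.encode(segment)[1:]  # drop the BOS added by encode
        if idx < len(segments) - 1:
            res_id.append(sep_id)
    return res_id


def extract_res(response):
    """Keep only the answer: the text after [BOT], without template tokens."""
    if "[BOT]" in response:
        response = response.split("[BOT]")[1]
    if "<s>" in response:
        response = response.split("<s>")[-1]
    if "</s>" in response:
        response = response.split("</s>")[0]
    if "[SEP]" in response:
        response = response.split("[SEP]")[0]
    return response


if __name__ == "__main__":
    print(special_encode("1+1", ToyTokenizer()))      # [2, 1, 3, 149, 143, 149, 5, 4]
    print(extract_res("[USER]1+1[SEP][BOT]2[SEP]"))   # 2
```

With the toy ids, the prompt reads EOS, BOS, `[USER]`, the three characters of the question, `[SEP]`, `[BOT]`, so generation starts right after the `[BOT]` marker.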
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ import torch
+
+ # Set these to the location of the tokenizer and the checkpoint.
+ tokenizer_path = ""
+ checkpoint_path = ""
+
+ tokenizer = AutoTokenizer.from_pretrained(
+     tokenizer_path, use_fast=False, trust_remote_code=True, padding_side='left')
+
+ model = AutoModelForCausalLM.from_pretrained(
+     checkpoint_path, device_map="auto", trust_remote_code=True).eval()
+ tokenizer.add_tokens(["[USER]", "[BOT]", "[SEP]"])
+
+ def special_encode(input, tokenizer):
+     # Wrap the question in the chat template: [USER]question[SEP][BOT]
+     raw_str = "[USER]%s[SEP][BOT]" % input.strip().replace("\r", "")
+     eos_id = tokenizer.eos_token_id
+     bos_id = tokenizer.bos_token_id
+     sep_id = tokenizer.encode("[SEP]")[-1]
+     res_id = [eos_id, bos_id]
+     arr = raw_str.split("[SEP]")
+     for elem_idx in range(len(arr)):
+         elem = arr[elem_idx]
+         # Skip the first token that encode() adds in front of each segment.
+         elem_id = tokenizer.encode(elem)[1:]
+         res_id += elem_id
+         if elem_idx < len(arr) - 1:
+             res_id.append(sep_id)
+
+     return res_id
+
+ def extract_res(response):
+     # Keep only the model's answer: the text after [BOT], without template tokens.
+     if "[BOT]" in response:
+         response = response.split("[BOT]")[1]
+     if "<s>" in response:
+         response = response.split("<s>")[-1]
+     if "</s>" in response:
+         response = response.split("</s>")[0]
+     if "[SEP]" in response:
+         response = response.split("[SEP]")[0]
+     return response
+
+ if __name__ == '__main__':
+     text = "Janet’s ducks lay 16 eggs per day. She eats three for breakfast every morning and bakes muffins for her friends every day with four. She sells the remainder at the farmers' market daily for $2 per fresh duck egg. How much in dollars does she make every day at the farmers' market?"
+     text_token_ids = torch.tensor(special_encode(
+         text, tokenizer)).to(model.device).reshape(1, -1)
+     response = model.generate(text_token_ids, do_sample=False, max_length=512)
+     response_text = tokenizer.decode(response.cpu()[0], skip_special_tokens=True)
+     response_text = extract_res(response_text)
+     print(response_text)
+     """Skywork-13B-Math Response:
+     First, we need to find out how many eggs Janet has left after eating for breakfast and baking for her friends. \n\nShe has 16 eggs per day, eats 3 for breakfast and uses 4 for baking. So, 16 - 3 - 4 = 9 eggs are left for selling at the farmers' market.\n\nSince she sells each egg for $2, she makes 9 * 2 = $<<9*2=18>>18 every day at the farmers' market.\n\nSo, the answer is $18.
+     """
+ ```
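The two sample answers in this section (450 kg of water and $18 per day) can be checked independently of the model. The following sketch just redoes the arithmetic; the function names are made up for this illustration, and exact fractions are used to sidestep floating-point rounding.

```python
from fractions import Fraction


def water_to_add(solution_kg, start_pct, target_pct):
    """Kilograms of water needed to dilute a solution from start_pct to target_pct."""
    active_kg = Fraction(solution_kg) * Fraction(start_pct, 100)  # mass of active ingredient
    final_kg = active_kg / Fraction(target_pct, 100)              # total mass at target strength
    return final_kg - solution_kg


def daily_egg_income(laid, eaten, baked, price):
    """Dollars earned from the eggs left after breakfast and baking."""
    return (laid - eaten - baked) * price


if __name__ == "__main__":
    print(water_to_add(150, 20, 5))       # 450
    print(daily_egg_income(16, 3, 4, 2))  # 18
```

Both results agree with the responses quoted in the inference examples.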
+
+ # Quantization
+
+ ## Int8 Quantization
+
+ Skywork uses the mainstream 8-bit quantization method [BitsAndBytes](https://github.com/TimDettmers/bitsandbytes). Quantization with this method is nearly lossless in performance, and it is already integrated into the transformers library. Building on BitsAndBytes, we offer two options: online quantization and an offline 8-bit model.
+
+ The examples below show how to use the int8-quantized model. Before you start, install the BitsAndBytes library and its dependencies; see the [BitsAndBytes](https://github.com/TimDettmers/bitsandbytes) repository for installation instructions.
+
+ ### Online Quantization
+
+ ```python
+ # Quantize the bf16 checkpoint to 8 bits while loading it.
+ model = AutoModelForCausalLM.from_pretrained("skywork-13B-Base", torch_dtype=torch.bfloat16, load_in_8bit=True, trust_remote_code=True).eval()
+ ```
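The transformers library also lets the 8-bit request be expressed through a `BitsAndBytesConfig` object passed as `quantization_config`. The sketch below is one possible way to wrap that; `load_int8_model` is a hypothetical helper name, the checkpoint path is a placeholder taken from the example above, and running it assumes a CUDA GPU with `bitsandbytes` installed.

```python
def load_int8_model(checkpoint_path):
    """Load a causal LM with on-the-fly 8-bit quantization (sketch, not the official recipe)."""
    # Imported lazily so that merely defining this helper does not require a GPU stack.
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(load_in_8bit=True)
    model = AutoModelForCausalLM.from_pretrained(
        checkpoint_path,
        quantization_config=quant_config,
        torch_dtype=torch.bfloat16,
        device_map="auto",
        trust_remote_code=True,
    )
    return model.eval()


if __name__ == "__main__":
    # "skywork-13B-Base" is the placeholder path used in the example above.
    model = load_int8_model("skywork-13B-Base")
```

Keeping the quantization settings in one config object makes it easier to switch options later without touching the loading call.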
+
+ ### Offline Quantization
+
+ ```python
+ # Load a checkpoint that was already saved in 8-bit form.
+ model = AutoModelForCausalLM.from_pretrained("skywork-13B-Base-8bits", device_map="auto", torch_dtype=torch.bfloat16, trust_remote_code=True).eval()
+ ```
+
+ ### Evaluation
+
+ We evaluated the quantized model on standard benchmark datasets. The results are as follows:
+
+ | Precision | C-Eval | MMLU  | CMMLU |
+ | --------- | ------ | ----- | ----- |
+ | bf16      | 59.5   | 61.6  | 61.6  |
+ | 8bits     | 58.5   | 61.8  | 61.0  |
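To read the table as a cost of quantization, one can compute the per-benchmark difference between the 8-bit and bf16 rows. This is a small illustrative calculation on the numbers above; `score_deltas` is a name chosen for this sketch.

```python
# Scores copied from the evaluation table above.
scores = {
    "bf16": {"C-Eval": 59.5, "MMLU": 61.6, "CMMLU": 61.6},
    "8bits": {"C-Eval": 58.5, "MMLU": 61.8, "CMMLU": 61.0},
}


def score_deltas(table):
    """Return the 8-bit minus bf16 difference per benchmark, rounded to one decimal."""
    return {name: round(table["8bits"][name] - table["bf16"][name], 1)
            for name in table["bf16"]}


if __name__ == "__main__":
    for name, delta in score_deltas(scores).items():
        print(f"{name}: {delta:+.1f}")
```

The differences are -1.0 on C-Eval, +0.2 on MMLU, and -0.6 on CMMLU, that is, within about one point on every benchmark listed.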
+
+ ### GPU Memory Usage (GB)
+
+ | Precision | Skywork-13B |
+ | --------- | ----------- |
+ | bf16      | 25.91       |
+ | 8bits     | 13.57       |
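As a rough sanity check on these figures, the sketch below computes the relative saving from the table and compares it with a back-of-the-envelope weight size. The parameter count of 13e9 is an assumption based only on the "13B" in the model name, and the estimate ignores activations, buffers, and any layers kept in higher precision.

```python
# Figures copied from the memory table above (GB).
BF16_GB = 25.91
INT8_GB = 13.57


def memory_saving(full_gb, quantized_gb):
    """Fraction of GPU memory saved by the quantized model."""
    return 1 - quantized_gb / full_gb


def nominal_weight_gb(num_params, bytes_per_param):
    """Weight size in GB (1 GB = 1e9 bytes), ignoring activations and overhead."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    print(f"saving: {memory_saving(BF16_GB, INT8_GB):.1%}")  # saving: 47.6%
    # Assumed nominal parameter count of 13e9, not a published figure.
    print(nominal_weight_gb(13e9, 2))  # bf16: 2 bytes per parameter
    print(nominal_weight_gb(13e9, 1))  # int8: 1 byte per parameter
```

Under that assumption the nominal sizes come out at 26 GB and 13 GB, in the same range as the reported 25.91 GB and 13.57 GB.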
+
+ # Declaration and License Agreement
+
+ ## Declaration
+
+ We hereby declare that the Skywork model must not be used for any activity that threatens national or societal security or that is unlawful. We also ask users not to deploy the Skywork model in internet services that have not undergone appropriate security review and filing. We hope all users will follow this principle so that technology develops in a regulated and lawful environment.
+
+ We have done our utmost to ensure the compliance of the data used in training. Even so, because of the complexity of the model and the data, unforeseen issues may remain. We therefore assume no responsibility for any problem arising from use of the Skywork open-source model, including but not limited to data security issues, public opinion risks, or any risks and problems caused by the model being misled, abused, disseminated, or improperly used.
+
+ ## License Agreement
+
+ Community use of the Skywork model is governed by the [Skywork Community License](https://github.com/SkyworkAI/Skywork/blob/main/Skywork%20Community%20License.pdf) (Chinese version: [《Skywork 模型社区许可协议》](https://github.com/SkyworkAI/Skywork/blob/main/Skywork%20模型社区许可协议.pdf)). The Skywork model supports commercial use. If you plan to use the Skywork model or its derivatives for commercial purposes, no additional application is required, but please read the license carefully and strictly abide by its terms and conditions.
+
+ [《Skywork 模型社区许可协议》]:https://github.com/SkyworkAI/Skywork/blob/main/Skywork%20模型社区许可协议.pdf
+
+ [skywork-opensource@kunlun-inc.com]: mailto:skywork-opensource@kunlun-inc.com
+
+ # Contact Us and Citation
+ If you find our work helpful, please feel free to cite our papers:
+ ```
+ @article{skyworktechreport,
+   title={},
+   author={},
+   journal={arXiv preprint arXiv:},
+   year={2023}
+ }
+ ```
+
+ ```
+ @article{skyworkmath,
+   title={},
+   author={},
+   journal={arXiv preprint arXiv:},
+   year={2023}
+ }
+ ```