Chars/chatglm3-ggml


Chars committed on Nov 5, 2023

Commit fca5a08 (1 parent: 1b120c5)

Update README.md

Files changed (1)

README.md +53 -0


@@ -1,3 +1,56 @@
 ---
 license: apache-2.0
 ---

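+# ChatGLM3 Model Quantization
+This project demonstrates how to use the ChatGLM3 model and quantize it with the ChatGLM.cpp tool repository.
+## Clone the model and the tool repository
+First, clone the ChatGLM3 model and the ChatGLM.cpp tool repository to your local machine: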
+```shell
+git lfs install
+git clone https://huggingface.co/THUDM/chatglm3-6b
+git clone https://github.com/li-plus/chatglm.cpp.git
+```
+## Install dependencies
+Before quantizing, make sure the following dependencies are installed:
+```shell
+python3 -m pip install -U -q pip
+python3 -m pip install -q torch tabulate tqdm transformers accelerate sentencepiece
+```
+## Quantize the model
+The `convert.py` script in the ChatGLM.cpp tool repository quantizes the ChatGLM3 model.
+```shell
+types=("q4_0" "q4_1" "q5_0" "q5_1" "q8_0")
+for type_str in "${types[@]}"; do
+  python3 ./chatglm.cpp/chatglm_cpp/convert.py -i ./chatglm3-6b -t "${type_str}" -o "chatglm-ggml_${type_str}.bin"
+done
+```
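+The loop above quantizes the ChatGLM3 model once per quantization type and writes a separate model file for each. The types are "q4_0", "q4_1", "q5_0", "q5_1", and "q8_0".
+Run the snippet from the directory that contains both the `chatglm3-6b` and `chatglm.cpp` clones, because the paths in the command are relative.
+Each quantized model is saved in the current directory as `chatglm-ggml_{type_str}.bin`.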
+---
+If you have any questions, feel free to contact us.
+Model: [https://huggingface.co/THUDM/chatglm3-6b](https://huggingface.co/THUDM/chatglm3-6b)
+Tool repository: [https://github.com/li-plus/chatglm.cpp](https://github.com/li-plus/chatglm.cpp)
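
The README's quantization loop writes one file per type, so a quick check after the run can confirm that every expected file was produced and is non-empty. The following is a minimal sketch and not part of the commit; the helper names `expected_file` and `check_outputs` are hypothetical, and the file naming is taken from the `-o` argument in the loop.

```shell
#!/usr/bin/env bash
# Sketch: report which of the expected quantized files exist.
# Assumes the naming scheme chatglm-ggml_<type>.bin from the README loop.

types=("q4_0" "q4_1" "q5_0" "q5_1" "q8_0")

# expected_file (hypothetical helper): map a quantization type to the
# output file name used by the README loop.
expected_file() {
  printf 'chatglm-ggml_%s.bin' "$1"
}

# check_outputs (hypothetical helper): print one status line per type for
# the given directory (default: current directory). Returns non-zero if
# any expected file is missing or empty.
check_outputs() {
  local dir="${1:-.}" missing=0 t f
  for t in "${types[@]}"; do
    f="${dir}/$(expected_file "$t")"
    if [ -s "$f" ]; then
      echo "ok      $f"
    else
      echo "missing $f"
      missing=1
    fi
  done
  return "$missing"
}

# Check the directory given as the first argument, or the current one.
check_outputs "${1:-.}" || echo "some quantized files are missing" >&2
```

Run it from the directory where the loop was executed, or pass that directory as the first argument. The `-s` test only confirms that a file exists and is non-empty; it does not validate the file contents.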