grapevine-AI committed 9f3b057 (1 parent: 02260b9)

Update README.md

[TFMC/imatrix-dataset-for-japanese-llm](https://huggingface.co/datasets/TFMC/imatrix-dataset-for-japanese-llm).<br>
This dataset contains English and many Japanese sentences.
 
# How I made it
First, I made a Q6_K quant and then converted it to i-quants with the `--allow-requantize` option.<br>
Surprisingly, the (re)quantization process completed without problems.<br>
I will show more details below.

### Step 1: Making an f16 GGUF
First, I converted the safetensors to GGUF.
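As a sketch, this step uses llama.cpp's conversion script (the model directory and output file name here are placeholders, not the exact paths I used):

```shell
# Convert the HF safetensors checkpoint to an f16 GGUF.
# In llama.cpp builds around b3065 the script is convert-hf-to-gguf.py.
python convert-hf-to-gguf.py ./Qwen2-57B-A14B-Instruct \
    --outtype f16 \
    --outfile Qwen2-57B-A14B-Instruct-f16.gguf
```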

### Step 2: Converting f16 to Q8_0
Second, I converted the f16 to Q8_0.<br>
This is meant to accelerate the next step, because I don't have enough memory to handle high-precision tensors.
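A sketch of this step (file names are placeholders; the Windows binary in builds around b3065 is `quantize.exe`, while newer llama.cpp releases rename it to `llama-quantize`):

```shell
# Quantize the f16 GGUF down to Q8_0: near-lossless, but much smaller,
# so the imatrix run fits in limited memory and finishes faster.
quantize Qwen2-57B-A14B-Instruct-f16.gguf \
    Qwen2-57B-A14B-Instruct-Q8_0.gguf Q8_0
```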

### Step 3: Calculating the imatrix
I calculated the imatrix with the Q8_0 model.<br>
Some people seem to have succeeded in calculating an imatrix this way, so I think anyone can make one.<br>
I used the `-fa` option because I wanted to finish the calculation as quickly as possible.<br>
However, I later learned that some people claim Qwen2 needs the `-fa` option to work correctly.
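A sketch of the imatrix run (the calibration file name and the `-ngl` layer count are placeholders I chose for illustration, not values from the original run):

```shell
# Compute the importance matrix on the Q8_0 model using the calibration text
# from TFMC/imatrix-dataset-for-japanese-llm.
# -fa enables flash attention; -ngl offloads layers to the GPU (RTX 3090 here).
imatrix -m Qwen2-57B-A14B-Instruct-Q8_0.gguf \
    -f calibration.txt \
    -o imatrix.dat \
    -fa -ngl 24
```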

### Step 4: Making Q6_K temporarily
This is the most important step. First, I converted the f16 to Q6_K.<br>
Never try to make i-quants directly; it seems no one has succeeded in doing so.
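The intermediate quant can be sketched the same way as Step 2 (file names are placeholders):

```shell
# Make the intermediate Q6_K quant from the f16 GGUF.
# This Q6_K is only a stepping stone toward the i-quants.
quantize Qwen2-57B-A14B-Instruct-f16.gguf \
    Qwen2-57B-A14B-Instruct-Q6_K.gguf Q6_K
```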

### Step 5: Converting the Q6_K to i-quants with the imatrix
I converted the Q6_K to i-quants with the imatrix.<br>
Strangely, the process finished, and the resulting i-quants seem to work.
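A sketch of the final requantization (IQ4_XS is just one example target type; file names are placeholders):

```shell
# Requantize the Q6_K to an i-quant, guided by the imatrix.
# --allow-requantize permits quantizing from an already-quantized source,
# which quantize refuses by default.
quantize --allow-requantize --imatrix imatrix.dat \
    Qwen2-57B-A14B-Instruct-Q6_K.gguf \
    Qwen2-57B-A14B-Instruct-IQ4_XS.gguf IQ4_XS
```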

### Environment
GeForce RTX 3090 and the llama.cpp Windows binary b3065

# License
Apache 2.0