grapevine-AI committed 9f3b057 (1 parent: 02260b9)

Update README.md

[TFMC/imatrix-dataset-for-japanese-llm](https://huggingface.co/datasets/TFMC/imatrix-dataset-for-japanese-llm).<br>
This dataset contains English and many Japanese sentences.
 
# How I made it
First, I made a Q6_K quant and then converted it to i-quants with the `--allow-requantize` option.<br>
Surprisingly, the (re)quantization process completed without problems.<br>
I will show more details below.

### Step 1: Making an f16 GGUF
First, I converted the safetensors to GGUF.
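As a sketch, this step uses llama.cpp's conversion script (the model directory and output file name here are placeholders, not the exact paths I used):

```shell
# Convert the HF safetensors checkpoint to an f16 GGUF.
# In llama.cpp builds around b3065 the script is convert-hf-to-gguf.py.
python convert-hf-to-gguf.py ./Qwen2-57B-A14B-Instruct \
    --outtype f16 \
    --outfile Qwen2-57B-A14B-Instruct-f16.gguf
```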

### Step 2: Converting f16 to Q8_0
Second, I converted the f16 to Q8_0.<br>
This is meant to accelerate the next step, because I don't have enough memory to handle high-precision tensors.
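A sketch of this step (file names are placeholders; the Windows binary in builds around b3065 is `quantize.exe`, while newer llama.cpp releases rename it to `llama-quantize`):

```shell
# Quantize the f16 GGUF down to Q8_0: near-lossless, but much smaller,
# so the imatrix run fits in limited memory and finishes faster.
quantize Qwen2-57B-A14B-Instruct-f16.gguf \
    Qwen2-57B-A14B-Instruct-Q8_0.gguf Q8_0
```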

### Step 3: Calculating the imatrix
I calculated the imatrix with the Q8_0 model.<br>
Some people seem to have succeeded in calculating an imatrix this way, so I think anyone can make one.<br>
I used the `-fa` option because I wanted to finish the calculation as quickly as possible.<br>
However, I later learned that some people claim Qwen2 needs the `-fa` option to work correctly.
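A sketch of the imatrix run (the calibration file name and the `-ngl` layer count are placeholders I chose for illustration, not values from the original run):

```shell
# Compute the importance matrix on the Q8_0 model using the calibration text
# from TFMC/imatrix-dataset-for-japanese-llm.
# -fa enables flash attention; -ngl offloads layers to the GPU (RTX 3090 here).
imatrix -m Qwen2-57B-A14B-Instruct-Q8_0.gguf \
    -f calibration.txt \
    -o imatrix.dat \
    -fa -ngl 24
```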

### Step 4: Making Q6_K temporarily
This is the most important step. First, I converted the f16 to Q6_K.<br>
Never try to make i-quants directly; it seems no one has succeeded in doing so.
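The intermediate quant can be sketched the same way as Step 2 (file names are placeholders):

```shell
# Make the intermediate Q6_K quant from the f16 GGUF.
# This Q6_K is only a stepping stone toward the i-quants.
quantize Qwen2-57B-A14B-Instruct-f16.gguf \
    Qwen2-57B-A14B-Instruct-Q6_K.gguf Q6_K
```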

### Step 5: Converting the Q6_K to i-quants with the imatrix
I converted the Q6_K to i-quants with the imatrix.<br>
Strangely, the process finished, and the resulting i-quants seem to work.
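A sketch of the final requantization (IQ4_XS is just one example target type; file names are placeholders):

```shell
# Requantize the Q6_K to an i-quant, guided by the imatrix.
# --allow-requantize permits quantizing from an already-quantized source,
# which quantize refuses by default.
quantize --allow-requantize --imatrix imatrix.dat \
    Qwen2-57B-A14B-Instruct-Q6_K.gguf \
    Qwen2-57B-A14B-Instruct-IQ4_XS.gguf IQ4_XS
```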

### Environment
GeForce RTX 3090 and the llama.cpp Windows binary b3065

# License
Apache 2.0