---
license: apache-2.0
---

# What is this?

This is a GGUF version of [Qwen2-57B-A14B-Instruct](https://huggingface.co/Qwen/Qwen2-57B-A14B-Instruct).<br>

I think this is the world's first successful example of a Qwen2-57B-A14B-Instruct GGUF made with an imatrix.

# imatrix dataset

[TFMC/imatrix-dataset-for-japanese-llm](https://huggingface.co/datasets/TFMC/imatrix-dataset-for-japanese-llm).<br>

This dataset contains English and many Japanese sentences.

# How I made it

First I made a Q6_K quant, then converted it to i-quants with the `--allow-requantize` option.<br>

Surprisingly, the (re)quantization process completed without problems.<br>

I describe each step in more detail below.

### Step 1: Making an f16 GGUF

First, I converted the safetensors weights to an f16 GGUF.
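
A minimal sketch of this step, assuming a llama.cpp checkout around build b3065 (where the converter script was named `convert-hf-to-gguf.py`); the input path and output file name are hypothetical:

```bash
# Convert the original safetensors checkpoint to an f16 GGUF.
python convert-hf-to-gguf.py ./Qwen2-57B-A14B-Instruct \
    --outtype f16 \
    --outfile Qwen2-57B-A14B-Instruct-f16.gguf
```
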
### Step 2: Converting f16 to Q8_0

Next, I converted the f16 GGUF to Q8_0.<br>

The aim is to speed up the next step, because I don't have enough memory to handle the high-precision tensors.
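
A sketch of the quantization call; in builds around b3065 the tool shipped as `quantize` (`quantize.exe` on Windows), and the file names are hypothetical:

```bash
# Requantize the f16 GGUF to Q8_0 so the imatrix pass fits in memory.
quantize Qwen2-57B-A14B-Instruct-f16.gguf Qwen2-57B-A14B-Instruct-Q8_0.gguf Q8_0
```
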
### Step 3: Calculating the imatrix

I calculated the imatrix with the Q8_0 model.<br>

It seems some people have succeeded in calculating an imatrix this way, so I think anyone can make one.<br>

At this point I used the `-fa` option because I wanted to speed up the calculation as much as possible.<br>

However, I later learned that some people claim Qwen2 needs the `-fa` option to work correctly.
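
A sketch of the imatrix run, again with hypothetical file names; `-fa` enables flash attention, and `-ngl` offloads layers to the GPU (the layer count here is only an example and depends on available VRAM):

```bash
# Compute the importance matrix from the Q8_0 model over the calibration text.
imatrix -m Qwen2-57B-A14B-Instruct-Q8_0.gguf \
    -f imatrix-dataset-for-japanese-llm.txt \
    -o imatrix.dat \
    -fa -ngl 10
```
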
### Step 4: Making a temporary Q6_K

This is the most important step. I converted the f16 GGUF to Q6_K.<br>

Never try to make the i-quants directly from f16; it seems no one has succeeded in doing that.
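
The intermediate quantization, sketched with the same hypothetical file names:

```bash
# Build a temporary Q6_K quant from the f16 GGUF.
quantize Qwen2-57B-A14B-Instruct-f16.gguf Qwen2-57B-A14B-Instruct-Q6_K.gguf Q6_K
```
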
### Step 5: Converting Q6_K to i-quants with the imatrix

I converted the Q6_K to i-quants using the imatrix.<br>

Strangely, the process finished, and the i-quants seem to work.
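
A sketch of the final requantization; `--allow-requantize` is required because the source model is already quantized. The target type `IQ4_XS` is only one example of an i-quant:

```bash
# Requantize the Q6_K model to an i-quant, guided by the imatrix.
# --allow-requantize permits quantizing an already-quantized model.
quantize --allow-requantize --imatrix imatrix.dat \
    Qwen2-57B-A14B-Instruct-Q6_K.gguf Qwen2-57B-A14B-Instruct-IQ4_XS.gguf IQ4_XS
```
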
### Environment

GeForce RTX 3090 and the llama.cpp Windows binary, build b3065.

# License

Apache 2.0

# Developer

Alibaba Cloud