## Quantization

The quantization was applied using [LLM Compressor](https://github.com/vllm-project/llm-compressor) with 512 random examples from the [anydef-kilt-tasks-v2](https://huggingface.co/datasets/daisd-ai/anydef-kilt-tasks-v2) dataset. We tested other numbers of examples, but saw no noticeable improvement from using more during quantization.

The recipe for quantization:

```python
from llmcompressor.modifiers.quantization import GPTQModifier
from llmcompressor.modifiers.smoothquant import SmoothQuantModifier

recipe = [
    SmoothQuantModifier(smoothing_strength=0.8),
    GPTQModifier(targets="Linear", scheme="W4A16", ignore=["lm_head"]),
]
```
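The `W4A16` scheme in the recipe stores weights as 4-bit integers while keeping activations in 16-bit precision. As a rough illustration of the underlying idea only (this is not LLM Compressor's implementation, and the function names are made up for this sketch), a minimal per-group symmetric int4 quantize/dequantize could look like:

```python
# Illustrative sketch of symmetric 4-bit weight quantization (W4A16-style).
# Hypothetical helper functions for exposition; real kernels pack two int4
# values per byte and dequantize on the fly inside the matmul.

def quantize_w4(weights, group_size=128):
    """Map a flat list of float weights to int4 values with per-group scales."""
    qweights, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        # Symmetric scheme: scale the largest magnitude onto the int4 range [-8, 7].
        scale = max(abs(w) for w in group) / 7 or 1.0
        scales.append(scale)
        qweights.extend(max(-8, min(7, round(w / scale))) for w in group)
    return qweights, scales

def dequantize_w4(qweights, scales, group_size=128):
    """Recover approximate float weights from int4 values and group scales."""
    return [q * scales[i // group_size] for i, q in enumerate(qweights)]
```

GPTQ goes further than this naive rounding: it quantizes weights column by column and updates the not-yet-quantized weights to compensate for the rounding error, which is why it needs the calibration examples mentioned above.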
## Inference