---
base_model: daisd-ai/anydef-orpo-v2
tags:
- entity linking
datasets:
- arynkiewicz/anydef-kilt-tasks-v2
model-index:
- name: daisd-ai/anydef-v2-linear-W4A16
  results: []
license: apache-2.0
inference: false
---

## Introduction

This model is a quantized version of a linear merge of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) and [daisd-ai/anydef-orpo-v2](https://huggingface.co/daisd-ai/anydef-orpo-v2).

## Merging

The models were merged to improve the quality of the final model ([idea](https://www.reddit.com/r/LocalLLaMA/comments/1fyx27y/im_pretty_happy_with_how_my_method_worked_out/)) and to reduce the quality loss incurred during quantization. Merging was done with [mergekit](https://github.com/arcee-ai/mergekit) using the following spec:
```yaml
models:
  - model: mistralai/Mistral-7B-v0.1
    parameters:
      weight: 0.3
  - model: daisd-ai/anydef-orpo-v2
    parameters:
      weight: 0.7
merge_method: linear
dtype: bfloat16
```
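For reference, a spec like this can be executed with mergekit's `mergekit-yaml` CLI or through its Python API. The snippet below is a minimal sketch using the Python API; the config filename and output path are illustrative, not the exact commands used for this release.

```python
import torch
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Load the linear merge spec shown above (path is illustrative).
with open("anydef-v2-linear.yaml", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

# Run the merge; the output directory is then ready for quantization.
run_merge(
    merge_config,
    out_path="./anydef-v2-linear",
    options=MergeOptions(
        cuda=torch.cuda.is_available(),  # merge on GPU when available
        copy_tokenizer=True,             # copy tokenizer files into the output dir
        lazy_unpickle=False,
        low_cpu_memory=False,
    ),
)
```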
## Quantization

Quantization was applied using [LLM Compressor](https://github.com/vllm-project/llm-compressor) with 512 random examples from the [anydef-kilt-tasks-v2](https://huggingface.co/datasets/daisd-ai/anydef-kilt-tasks-v2) dataset as calibration data.

We also tested other calibration-set sizes, but did not see a noticeable improvement from using more examples.

The recipe for quantization:
```python
from llmcompressor.modifiers.quantization import GPTQModifier
from llmcompressor.modifiers.smoothquant import SmoothQuantModifier

recipe = [
    # Smooth activation outliers before weight quantization
    SmoothQuantModifier(smoothing_strength=0.8),
    # 4-bit weights, 16-bit activations; leave the LM head unquantized
    GPTQModifier(targets="Linear", scheme="W4A16", ignore=["lm_head"]),
]
```
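For completeness, the sketch below shows how such a recipe is typically applied with LLM Compressor's `oneshot` entry point. The model path, dataset split, sequence length, and output directory are illustrative assumptions rather than the exact script used for this checkpoint.

```python
from datasets import load_dataset
from llmcompressor.transformers import oneshot

MODEL_PATH = "./anydef-v2-linear"          # merged bfloat16 model (illustrative path)
SAVE_DIR = "./anydef-v2-linear-W4A16"
NUM_SAMPLES = 512

# 512 random calibration examples; the split name and any prompt formatting are
# illustrative and must match the dataset schema (see the llm-compressor docs).
ds = load_dataset("arynkiewicz/anydef-kilt-tasks-v2", split="train")
ds = ds.shuffle(seed=42).select(range(NUM_SAMPLES))

# One-shot SmoothQuant + GPTQ calibration using the recipe defined above.
# In newer llm-compressor releases, `oneshot` is imported from `llmcompressor` directly.
oneshot(
    model=MODEL_PATH,
    dataset=ds,
    recipe=recipe,
    max_seq_length=2048,
    num_calibration_samples=NUM_SAMPLES,
    output_dir=SAVE_DIR,
)
```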
## Inference

For inference code, see our [GitHub repository](https://github.com/daisd-ai/universal-el).
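Since the checkpoint is stored in W4A16 (compressed-tensors) format, it can also be loaded directly with [vLLM](https://github.com/vllm-project/vllm). The snippet below is a minimal sketch; the prompt shown is only a placeholder, as the actual entity-linking prompt template and output parsing are implemented in the repository above.

```python
from vllm import LLM, SamplingParams

# Load the quantized W4A16 checkpoint with vLLM.
llm = LLM(model="daisd-ai/anydef-v2-linear-W4A16")

# Placeholder prompt; use the entity-linking prompt template from the repository in practice.
prompts = ["Link the entity mentioned in: 'Apple released a new iPhone in September.'"]

outputs = llm.generate(prompts, SamplingParams(temperature=0.0, max_tokens=128))
for out in outputs:
    print(out.outputs[0].text)
```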
## Benchmark results

Precision (%):

| Dataset | anydef-v2 | anydef-v2-quant (this) |
|------------|------------|------------|
| RSS-500 | 66.89 | 64.90 |
| ISTEX-1000 | 85.82 | 84.33 |
| Reuters-128 | 64.88 | 68.28 |
| TweekiGold | 75.93 | 75.93 |

Retrieval rate (%):

| Dataset | anydef-v2 | anydef-v2-quant (this) |
|------------|------------|------------|
| RSS-500 | 84.11 | 83.44 |
| ISTEX-1000 | 97.76 | 97.31 |
| Reuters-128 | 83.33 | 83.87 |
| TweekiGold | 91.67 | 91.44 |