---
base_model: daisd-ai/anydef-orpo-v2
tags:
- entity linking
datasets:
- arynkiewicz/anydef-kilt-tasks-v2
model-index:
- name: daisd-ai/anydef-v2-linear-W4A16
  results: []
license: apache-2.0
inference: false
---

## Introduction

This model is a quantized version of a linear merge of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) and [daisd-ai/anydef-orpo-v2](https://huggingface.co/daisd-ai/anydef-orpo-v2).

## Merging

The models were merged to improve the quality of the final model ([idea](https://www.reddit.com/r/LocalLLaMA/comments/1fyx27y/im_pretty_happy_with_how_my_method_worked_out/)) and to reduce accuracy loss during quantization. Merging was done using [mergekit](https://github.com/arcee-ai/mergekit) with the following spec:
```yaml
models:
  - model: mistralai/Mistral-7B-v0.1
    parameters:
      weight: 0.3
  - model: daisd-ai/anydef-orpo-v2
    parameters:
      weight: 0.7
merge_method: linear
dtype: bfloat16
```
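
A spec like this is normally applied with mergekit's `mergekit-yaml` CLI. The sketch below simply scripts that call from Python; the config and output paths are placeholders, not the ones used to build this model:

```python
import subprocess
from pathlib import Path

# Write the linear-merge spec shown above to disk (placeholder file name).
config = """\
models:
  - model: mistralai/Mistral-7B-v0.1
    parameters:
      weight: 0.3
  - model: daisd-ai/anydef-orpo-v2
    parameters:
      weight: 0.7
merge_method: linear
dtype: bfloat16
"""
Path("merge_config.yaml").write_text(config)

# mergekit-yaml <config> <output-dir> downloads both models and writes the merged bf16 checkpoint.
subprocess.run(["mergekit-yaml", "merge_config.yaml", "./anydef-v2-linear"], check=True)
```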

## Quantization

Quantization was applied using [LLM Compressor](https://github.com/vllm-project/llm-compressor) with 512 random examples from the [anydef-kilt-tasks-v2](https://huggingface.co/datasets/daisd-ai/anydef-kilt-tasks-v2) dataset as calibration data.
We tested other calibration set sizes but did not see a noticeable improvement from using more examples.

The recipe for quantization:
```python
from llmcompressor.modifiers.smoothquant import SmoothQuantModifier
from llmcompressor.modifiers.quantization import GPTQModifier

# SmoothQuant to smooth activation outliers, then 4-bit weight-only GPTQ (W4A16),
# keeping lm_head in full precision.
recipe = [
    SmoothQuantModifier(smoothing_strength=0.8),
    GPTQModifier(targets="Linear", scheme="W4A16", ignore=["lm_head"]),
]
```
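
The full calibration script is not published here, but with LLM Compressor the recipe above is applied roughly as in the sketch below, which continues from the `recipe` defined above. Argument names follow the library's published W4A16 examples; the merged-model path, sequence length, and the dataset's text column are assumptions:

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from llmcompressor.transformers import oneshot

MERGED_MODEL = "./anydef-v2-linear"  # placeholder path to the merged bf16 model
NUM_SAMPLES = 512
MAX_SEQ_LEN = 2048  # assumed value, not stated in this card

model = AutoModelForCausalLM.from_pretrained(MERGED_MODEL, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(MERGED_MODEL)

# 512 random calibration examples from the entity-linking training data.
ds = load_dataset("arynkiewicz/anydef-kilt-tasks-v2", split="train")
ds = ds.shuffle(seed=42).select(range(NUM_SAMPLES))

# Tokenize for calibration; the "text" column name is an assumption, adapt to the dataset schema.
ds = ds.map(
    lambda ex: tokenizer(ex["text"], max_length=MAX_SEQ_LEN, truncation=True, add_special_tokens=False),
    remove_columns=ds.column_names,
)

oneshot(
    model=model,
    dataset=ds,
    recipe=recipe,  # the SmoothQuant + GPTQ recipe defined above
    max_seq_length=MAX_SEQ_LEN,
    num_calibration_samples=NUM_SAMPLES,
)

# Save in compressed-tensors format readable by vLLM / transformers.
model.save_pretrained("anydef-v2-linear-W4A16", save_compressed=True)
tokenizer.save_pretrained("anydef-v2-linear-W4A16")
```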

## Inference

For inference code, see our [GitHub repository](https://github.com/daisd-ai/universal-el).
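
Since the checkpoint is stored in compressed-tensors W4A16 format, it can also be loaded directly with vLLM. A minimal sketch; the entity-linking prompt format is defined in the repository above and only stubbed here:

```python
from vllm import LLM, SamplingParams

# Load the quantized checkpoint; vLLM picks up the compressed-tensors config automatically.
llm = LLM(model="daisd-ai/anydef-v2-linear-W4A16")

# Placeholder prompt; use the entity-linking prompt template from the universal-el repo.
prompts = ["<entity-linking prompt goes here>"]
params = SamplingParams(temperature=0.0, max_tokens=64)

for output in llm.generate(prompts, params):
    print(output.outputs[0].text)
```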

## Benchmark results

Precision (%):
| Dataset     | anydef-v2 | anydef-v2-quant (this model) |
|-------------|-----------|------------------------------|
| RSS-500     | 66.89     | 64.90                        |
| ISTEX-1000  | 85.82     | 84.33                        |
| Reuters-128 | 64.88     | 68.28                        |
| TweekiGold  | 75.93     | 75.93                        |

Retrieval rate (%):
| Dataset     | anydef-v2 | anydef-v2-quant (this model) |
|-------------|-----------|------------------------------|
| RSS-500     | 84.11     | 83.44                        |
| ISTEX-1000  | 97.76     | 97.31                        |
| Reuters-128 | 83.33     | 83.87                        |
| TweekiGold  | 91.67     | 91.44                        |