---
library_name: transformers
license: mit
language:
- fr
- en
tags:
- french
- chocolatine
datasets:
- jpacifico/french-orca-dpo-pairs-revised
pipeline_tag: text-generation
---
### Chocolatine-14B-Instruct-4k-DPO
DPO fine-tune of [microsoft/Phi-3-medium-4k-instruct](https://huggingface.co/microsoft/Phi-3-medium-4k-instruct) (14B params)
using the [jpacifico/french-orca-dpo-pairs-revised](https://huggingface.co/datasets/jpacifico/french-orca-dpo-pairs-revised) RLHF dataset.
Training in French also improves the model in English, surpassing the performance of its base model (MMLU).
Context window: 4k tokens.
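For readers unfamiliar with DPO, the per-pair objective can be sketched in a few lines. This is a minimal illustration of the standard DPO loss, not the training code used for this model; the function name `dpo_loss`, the value `beta=0.1`, and the log-probabilities below are assumptions chosen for the example.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss: -log(sigmoid(beta * (chosen margin - rejected margin)))."""
    # How much more (or less) likely the policy makes each answer than the reference model
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) is the same as log(1 + exp(-x))
    return math.log1p(math.exp(-logits))

# Hypothetical sequence log-probabilities for one (chosen, rejected) pair.
# Before training, the policy equals the reference, so both margins are zero.
loss_at_start = dpo_loss(-12.0, -15.0, -12.0, -15.0)
# After some training, the policy favors the chosen answer and disfavors the rejected one.
loss_after = dpo_loss(-10.0, -18.0, -12.0, -15.0)
print(loss_at_start, loss_after)
```

The loss falls as the policy raises the chosen answer's likelihood relative to the rejected one, measured against the frozen reference model.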
### Benchmarks
Chocolatine-14B is the best-performing model under 30B parameters on MMLU-PRO on the [OpenLLM Leaderboard](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard) (August 2024).
![image/png](https://github.com/jpacifico/Chocolatine-LLM/blob/main/Assets/benchmark_14B_V1.png?raw=false)
| Metric |Value|
|-------------------|----:|
|Avg. |29.83|
|IFEval (0-Shot) |46.89|
|BBH (3-Shot) |48.02|
|MATH Lvl 5 (4-Shot)|14.88|
|GPQA (0-shot) |12.19|
|MuSR (0-shot) |15.15|
|**MMLU-PRO (5-shot)** |**41.82**|
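The reported average is consistent with an unweighted mean of the six benchmark scores. The short check below assumes equal weights, which the table does not state explicitly:

```python
# Scores copied from the table above
scores = {
    "IFEval (0-Shot)": 46.89,
    "BBH (3-Shot)": 48.02,
    "MATH Lvl 5 (4-Shot)": 14.88,
    "GPQA (0-shot)": 12.19,
    "MuSR (0-shot)": 15.15,
    "MMLU-PRO (5-shot)": 41.82,
}

# Unweighted mean (assumption: every benchmark counts equally)
average = sum(scores.values()) / len(scores)
print(f"Average: {average:.3f}")
```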
### MT-Bench-French
Chocolatine-14B-Instruct-4k-DPO outperforms GPT-3.5-Turbo and Phi-3-medium-4k-instruct on the first turn and on average on
[MT-Bench-French](https://huggingface.co/datasets/bofenghuang/mt-bench-french), evaluated with [multilingual-mt-bench](https://github.com/Peter-Devine/multilingual_mt_bench).
```
########## First turn ##########
score
model turn
Chocolatine-14B-Instruct-4k-DPO 1 8.6375
Phi-3-medium-4k-instruct 1 8.2250
gpt-3.5-turbo 1 8.1375
Chocolatine-3B-Instruct-DPO-Revised 1 7.9875
Daredevil-8B 1 7.8875
Chocolatine-3B-Instruct-DPO-v1.0 1 7.6875
NeuralDaredevil-8B-abliterated 1 7.6250
Phi-3-mini-4k-instruct 1 7.2125
Meta-Llama-3-8B-Instruct 1 7.1625
vigostral-7b-chat 1 6.7875
Mistral-7B-Instruct-v0.3 1 6.7500
Mistral-7B-Instruct-v0.2 1 6.2875
########## Second turn ##########
score
model turn
Chocolatine-3B-Instruct-DPO-Revised 2 7.937500
Phi-3-medium-4k-instruct 2 7.750000
Chocolatine-14B-Instruct-4k-DPO 2 7.737500
gpt-3.5-turbo 2 7.679167
Chocolatine-3B-Instruct-DPO-v1.0 2 7.612500
NeuralDaredevil-8B-abliterated 2 7.125000
Daredevil-8B 2 7.087500
Meta-Llama-3-8B-Instruct 2 6.800000
Mistral-7B-Instruct-v0.2 2 6.512500
Mistral-7B-Instruct-v0.3 2 6.500000
Phi-3-mini-4k-instruct 2 6.487500
vigostral-7b-chat 2 6.162500
########## Average ##########
score
model
Chocolatine-14B-Instruct-4k-DPO 8.187500
Phi-3-medium-4k-instruct 7.987500
Chocolatine-3B-Instruct-DPO-Revised 7.962500
gpt-3.5-turbo 7.908333
Chocolatine-3B-Instruct-DPO-v1.0 7.650000
Daredevil-8B 7.487500
NeuralDaredevil-8B-abliterated 7.375000
Meta-Llama-3-8B-Instruct 6.981250
Phi-3-mini-4k-instruct 6.850000
Mistral-7B-Instruct-v0.3 6.625000
vigostral-7b-chat 6.475000
Mistral-7B-Instruct-v0.2 6.400000
```
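The values in the "Average" block match a simple mean of the first-turn and second-turn scores. The sketch below recomputes that mean for the top three models, using numbers copied from the output above; the variable names are illustrative only.

```python
# (first turn, second turn) scores copied from the MT-Bench-French output above
turn_scores = {
    "Chocolatine-14B-Instruct-4k-DPO": (8.6375, 7.7375),
    "Phi-3-medium-4k-instruct": (8.2250, 7.7500),
    "gpt-3.5-turbo": (8.1375, 7.679167),
}

# Mean of the two turns for each model
averages = {model: (t1 + t2) / 2 for model, (t1, t2) in turn_scores.items()}

# Rank models from best to worst average score
ranking = sorted(averages, key=averages.get, reverse=True)
for model in ranking:
    print(f"{model:<35} {averages[model]:.6f}")
```

Chocolatine-14B leads on the first turn and on average, while Phi-3-medium-4k-instruct is slightly ahead of it on the second turn.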
### Usage
You can run this model using my [Colab notebook](https://github.com/jpacifico/Chocolatine-LLM/blob/main/Chocolatine_14B_inference_test_colab.ipynb).
You can also run Chocolatine using the following code:
```python
import transformers
from transformers import AutoTokenizer

# Model repository ID (assumed from this card's title; adjust if your copy lives elsewhere)
new_model = "jpacifico/Chocolatine-14B-Instruct-4k-DPO"

# Format prompt
message = [
    {"role": "system", "content": "You are a helpful assistant chatbot."},
    {"role": "user", "content": "What is a Large Language Model?"}
]
tokenizer = AutoTokenizer.from_pretrained(new_model)
prompt = tokenizer.apply_chat_template(message, add_generation_prompt=True, tokenize=False)

# Create pipeline
pipeline = transformers.pipeline(
    "text-generation",
    model=new_model,
    tokenizer=tokenizer
)

# Generate text (max_length counts prompt tokens plus generated tokens)
sequences = pipeline(
    prompt,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    num_return_sequences=1,
    max_length=200,
)
print(sequences[0]['generated_text'])
```
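By default, the `text-generation` pipeline returns the prompt followed by the completion in `generated_text`. If you only want the model's reply, a small helper like the one below can strip the prompt. The name `extract_reply` and the sample strings are hypothetical, for illustration only.

```python
def extract_reply(generated_text: str, prompt: str) -> str:
    """Return only the text generated after the prompt."""
    if generated_text.startswith(prompt):
        # Drop the echoed prompt and surrounding whitespace
        return generated_text[len(prompt):].strip()
    # Prompt was not echoed back; return the text unchanged apart from whitespace
    return generated_text.strip()

# Made-up strings standing in for `prompt` and `sequences[0]['generated_text']`
example_prompt = "User: What is a Large Language Model?\nAssistant:"
example_output = example_prompt + " A neural network trained on large text corpora."
reply = extract_reply(example_output, example_prompt)
print(reply)
```

Passing `return_full_text=False` in the pipeline call should give the same result without post-processing.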
### Limitations
The Chocolatine model is a quick demonstration that a base model can be easily fine-tuned to achieve compelling performance.
It does not have any moderation mechanism.
- **Developed by:** Jonathan Pacifico, 2024
- **Model type:** LLM
- **Language(s) (NLP):** French, English
- **License:** MIT