File size: 1,728 Bytes
4b40687 370e8c4 5a5a953 4b40687 f283f6b 4b40687 370e8c4 4b40687 8733263 4b40687 370e8c4 fe54a7d 4b40687 370e8c4 fe54a7d 4b40687 f719313 370e8c4 0e9c197 0a4f0a2 370e8c4 0a4f0a2 0e9c197 370e8c4 0e9c197 0a4f0a2 370e8c4 c9d4d1e 0e9c197 0a4f0a2 b39b843 5a5a953 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 |
---
library_name: transformers
license: llama2
datasets:
- aqua_rat
- microsoft/orca-math-word-problems-200k
- m-a-p/CodeFeedback-Filtered-Instruction
- anon8231489123/ShareGPT_Vicuna_unfiltered
---
# Llama-3-Smaug-8B
### Built with Meta Llama 3
![image/png](https://cdn-uploads.huggingface.co/production/uploads/64c14f95cac5f9ba52bbcd7f/OrcJyTaUtD2HxJOPPwNva.png)
This model was built using the Smaug recipe for improving performance on real world multi-turn conversations applied to
[meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct).
### Model Description
- **Developed by:** [Abacus.AI](https://abacus.ai)
- **License:** https://llama.meta.com/llama3/license/
- **Finetuned from model:** [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct).
## Evaluation
### MT-Bench
```
########## First turn ##########
score
model turn
Llama-3-Smaug-8B 1 8.77500
Meta-Llama-3-8B-Instruct 1 8.31250
########## Second turn ##########
score
model turn
Meta-Llama-3-8B-Instruct 2 7.8875
Llama-3-Smaug-8B 2 7.8875
########## Average ##########
score
model
Llama-3-Smaug-8B 8.331250
Meta-Llama-3-8B-Instruct 8.10
```
| Model | First turn | Second Turn | Average |
| :---- | ---------: | ----------: | ------: |
| Llama-3-Smaug-8B | 8.78 | 7.89 | 8.33 |
| Llama-3-8B-Instruct | 8.31 | 7.89 | 8.10 |
This version of Smaug uses new techniques and new data compared to [Smaug-72B](https://huggingface.co/abacusai/Smaug-72B-v0.1), and more information will be released later on. For now, see the previous Smaug paper: https://arxiv.org/abs/2402.13228. |