---
library_name: transformers
tags:
- Chat Model
- SFT
- RLHF
license: llama3
pipeline_tag: text-generation
---

# Llama3-PBM-Nova-70B

## Introduction

Llama3-PBM-Nova-70B is a chat model developed by PKU-Baichuan-MLSysLab and based on Llama-3-70B. To make better use of open-source data, we applied deduplication, quality filtering, and data synthesis to it. We then trained the base model with Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF), which significantly improved its performance on the benchmarks reported in the Evaluation section.

- **Developed by:** [PKU-Baichuan-MLSysLab](https://github.com/PKU-Baichuan-MLSystemLab)
- **Base Model:** [Llama-3-70B](https://huggingface.co/meta-llama/Meta-Llama-3-70B)
- **Model Type:** Chat Model
- **Training Method:** SFT + RLHF
- **Release Date:** August 2024

## Evaluation

| Model                  | Arena-Hard | MixEval-Hard | Alpaca-Eval 2.0 |
|------------------------|------------|--------------|-----------------|
| GPT-4 Turbo (04/09)    | 82.6%      | 62.6         | 55.0%           |
| GPT-4o (05/13)         | 79.2%      | 64.7         | 57.5%           |
| Gemini 1.5 Pro         | 72.0%      | 58.3         | -               |
| Llama3-PBM-Nova-70B    | 74.5%      | 58.1         | 56.9%           |
| Llama-3.1-70B-Instruct | 55.7%      | 61.25        | 38.1%           |
| Llama-3-70B-Instruct   | 46.6%      | 55.9         | 34.4%           |

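
To put the table in perspective, the short sketch below computes the gaps between Llama3-PBM-Nova-70B and Llama-3-70B-Instruct from the scores reported above. The dictionary layout and the `gaps` helper are illustrative names only, not part of any released tooling; the numbers are copied from the table.

```python
# Scores copied from the evaluation table above.
scores = {
    "Llama3-PBM-Nova-70B": {"Arena-Hard": 74.5, "MixEval-Hard": 58.1, "Alpaca-Eval 2.0": 56.9},
    "Llama-3-70B-Instruct": {"Arena-Hard": 46.6, "MixEval-Hard": 55.9, "Alpaca-Eval 2.0": 34.4},
}


def gaps(model, reference):
    """Return per-benchmark score differences (model minus reference), rounded to 0.1."""
    return {
        bench: round(scores[model][bench] - scores[reference][bench], 1)
        for bench in scores[model]
    }


deltas = gaps("Llama3-PBM-Nova-70B", "Llama-3-70B-Instruct")
for bench, delta in deltas.items():
    print(f"{bench}: {delta:+.1f} points")
```

The gap is largest on Arena-Hard and Alpaca-Eval 2.0 and much smaller on MixEval-Hard.
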
## Usage

The example below runs the model through the Transformers `pipeline` API.

```python
import torch
import transformers

model_id = "PKU-Baichuan-MLSystemLab/Llama3-PBM-Nova-70B"

# Load the weights in bfloat16 and spread them across the available devices.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

# Chat-style input: a list of {"role", "content"} messages.
messages = [
    {"role": "user", "content": "Who are you?"},
]

# Stop on either the tokenizer's EOS token or the end-of-turn token.
terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]

outputs = pipeline(
    messages,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)

# The last entry of "generated_text" is the assistant's reply message.
print(outputs[0]["generated_text"][-1])
```
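
The `<|eot_id|>` terminator used above is the end-of-turn marker of the Llama 3 chat format. As a rough illustration, and assuming this model keeps the stock Llama 3 Instruct template (the card does not state this), the sketch below builds the prompt string by hand. `build_llama3_prompt` is a hypothetical helper written for this example; in practice the pipeline applies the chat template shipped with the tokenizer, which is also available through `pipeline.tokenizer.apply_chat_template`.

```python
def build_llama3_prompt(messages):
    """Render chat messages in the stock Llama 3 Instruct format.

    Assumption: this model uses the same template as Llama 3 Instruct.
    """
    prompt = "<|begin_of_text|>"
    for message in messages:
        prompt += (
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content'].strip()}<|eot_id|>"
        )
    # Open an assistant turn so the model generates the reply next.
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt


messages = [{"role": "user", "content": "Who are you?"}]
print(build_llama3_prompt(messages))
```

Every turn in this format closes with `<|eot_id|>`, which is why the usage example stops generation on that token in addition to the tokenizer's EOS token.
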

## License

- [LLAMA3 License](https://huggingface.co/meta-llama/Meta-Llama-3-70B/blob/main/LICENSE)