HINT-lab/llama3-8b-final-ppo-m-v0.3

teapot123 committed 22 days ago

Commit 9ed1b72 • 1 parent: a1eb0e9

Update README.md

Files changed (1)

README.md +1 -6

@@ -24,12 +24,7 @@ We train [OpenRLHF/Llama-3-8b-sft-mixture](https://huggingface.co/OpenRLHF/Llama
 with our calibrated reward model [HINT-lab/llama3-8b-crm-final-v0.1](https://huggingface.co/HINT-lab/llama3-8b-crm-final-v0.1).
 - **Developed by:** Jixuan Leng, Chengsong Huang, Banghua Zhu, Jiaxin Huang
-<!-- - **Funded by [optional]:** [More Information Needed]
-- **Shared by [optional]:** [More Information Needed] -->
-<!-- - **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed] -->
-- **Finetuned from model [optional]:** [OpenRLHF/Llama-3-8b-sft-mixture](https://huggingface.co/OpenRLHF/Llama-3-8b-sft-mixture)
+- **Finetuned from model:** [OpenRLHF/Llama-3-8b-sft-mixture](https://huggingface.co/OpenRLHF/Llama-3-8b-sft-mixture)
 ### Model Sources [optional]