chujiezheng/Llama3-8B-Chinese-Chat-ExPO


chujiezheng committed on May 27

Commit 00293b3 • 1 Parent(s): 3683640

Update README.md

Files changed (1)

README.md +1 -1


@@ -11,7 +11,7 @@ The extrapolated (ExPO) model based on [`shenzhi-wang/Llama3-8B-Chinese-Chat`](h
 Specifically, we obtain this model by extrapolating **(alpha = 0.3)** from the weights of the SFT and DPO/RLHF checkpoints, achieving superior alignment with human preference.
-**Note:** This is an experimental model, as I have not comprehensively evaluated its Chinese ability. There may occur unexpected issues when we apply extrapolation to the new-language (i.e., Chinese) training.
+**Note:** This is an experimental model, as I have not comprehensively evaluated its Chinese ability. **Unexpected issues may occur when we apply extrapolation to the DPO/RLHF alignment training for new languages (e.g., Chinese).**
 ## Evaluation Results
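
For readers of this diff, the weight extrapolation mentioned in the README context line (alpha = 0.3 between the SFT and DPO/RLHF checkpoints) can be illustrated with a small sketch. It assumes the update takes the form `aligned + alpha * (aligned - sft)` applied to each parameter, which the README does not spell out. The function name `extrapolate` and the toy weight dictionaries are hypothetical; real checkpoints would hold tensors loaded from the two models rather than short Python lists.

```python
from typing import Dict, List

# Hypothetical stand-in for a checkpoint: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def extrapolate(sft: Weights, aligned: Weights, alpha: float) -> Weights:
    """Return aligned + alpha * (aligned - sft) for every parameter.

    Assumption: this is the extrapolation form; the README only states
    that alpha = 0.3 and that SFT and DPO/RLHF weights are used.
    """
    if sft.keys() != aligned.keys():
        raise ValueError("checkpoints must have identical parameter names")
    result: Weights = {}
    for name, sft_values in sft.items():
        aligned_values = aligned[name]
        if len(sft_values) != len(aligned_values):
            raise ValueError(f"size mismatch for parameter {name!r}")
        # Step further along the direction from the SFT weights to the aligned weights.
        result[name] = [
            a + alpha * (a - s) for s, a in zip(sft_values, aligned_values)
        ]
    return result


# Toy values for illustration only, not real model weights.
sft = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
dpo = {"layer.weight": [2.0, 2.0], "layer.bias": [-1.0]}
expo = extrapolate(sft, dpo, alpha=0.3)
```

With `alpha = 0` the sketch returns the aligned checkpoint unchanged, so `alpha` controls how far past the DPO/RLHF weights the result moves.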