mrzjy/NovelWriting-Outline-PRM-Qwen2.5-0.5B-Reward

mrzjy committed 3 days ago

Commit bd72bd4 · verified · 1 Parent(s): 3a80cd9

Update README.md

Files changed (1)

README.md +1 -1

@@ -85,7 +85,7 @@ This approach ensures a balanced distribution of positive and negative labels.
 We trained 2 models on the above dataset:
 - NovelWriting-Outline-Qwen2.5-7B-Instruct: The SFT LLM, trained by [Llama-Factory](https://github.com/hiyouga/LLaMA-Factory).
-- [NovelWriting-Outline-PRM-Qwen2.5-0.5B-Reward](https://huggingface.co/mrzjy/NovelWriting-Outline-PRM-Qwen2.5-0.5B-Reward): The PRM for outline generation task, trained by using TRL library [Doc](https://huggingface.co/docs/trl/prm_trainer).
+- [NovelWriting-Outline-PRM-Qwen2.5-0.5B-Reward](https://huggingface.co/mrzjy/NovelWriting-Outline-PRM-Qwen2.5-0.5B-Reward): The PRM for outline generation task, trained by using TRL library ([Refer to Doc](https://huggingface.co/docs/trl/prm_trainer)).
 ## 4. Performance Evaluation