Process Reward Model trained with OpenRLHF.