GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models • Paper • 2410.05229 • Published Oct 7, 2024
Self-Play Preference Optimization for Language Model Alignment • Paper • 2405.00675 • Published May 1, 2024