Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge Paper • 2407.19594 • Published Jul 28, 2024 • 20
System 2 Attention (is something you might need too) Paper • 2311.11829 • Published Nov 20, 2023 • 39
Branch-Solve-Merge Improves Large Language Model Evaluation and Generation Paper • 2310.15123 • Published Oct 23, 2023 • 7
Chain-of-Verification Reduces Hallucination in Large Language Models Paper • 2309.11495 • Published Sep 20, 2023 • 38
Leveraging Implicit Feedback from Deployment Data in Dialogue Paper • 2307.14117 • Published Jul 26, 2023 • 4