Rl - a Collection by anujga

Rl

Updated Sep 10, 2023

RLCD: Reinforcement Learning from Contrast Distillation for Language Model Alignment

Paper • 2307.12950 • Published Jul 24, 2023