AnchorAL: Computationally Efficient Active Learning for Large and Imbalanced Datasets • Paper • 2404.05623 • Published Apr 8
Self-Exploring Language Models: Active Preference Elicitation for Online Alignment • Paper • 2405.19332 • Published May 29
BPO: Supercharging Online Preference Learning by Adhering to the Proximity of Behavior LLM • Paper • 2406.12168 • Published Jun 18
Deep Bayesian Active Learning for Preference Modeling in Large Language Models • Paper • 2406.10023 • Published Jun 14