LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks Paper • 2412.15204 • Published 8 days ago • 31
RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style Paper • 2410.16184 • Published Oct 21 • 23
LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA Paper • 2409.02897 • Published Sep 4 • 44
LongAlign: A Recipe for Long Context Alignment of Large Language Models Paper • 2401.18058 • Published Jan 31 • 20
KoLA: Carefully Benchmarking World Knowledge of Large Language Models Paper • 2306.09296 • Published Jun 15, 2023 • 19