Gradient-Mask Tuning Elevates the Upper Limits of LLM Performance Paper • 2406.15330 • Published Jun 21, 2024
Velocitune: A Velocity-based Dynamic Domain Reweighting Method for Continual Pre-training Paper • 2411.14318 • Published Nov 21, 2024
EpiCoder: Encompassing Diversity and Complexity in Code Generation Paper • 2501.04694 • Published 2 days ago • 7
URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics Paper • 2501.04686 • Published 2 days ago • 40
Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's Reasoning Capability Paper • 2411.19943 • Published Nov 29, 2024 • 56
Horizon-Length Prediction: Advancing Fill-in-the-Middle Capabilities for Code Generation with Lookahead Planning Paper • 2410.03103 • Published Oct 4, 2024 • 7
Model Merging Collection Model merging is a very popular technique for LLMs nowadays. Here is a chronological list of papers in the space that will help you get started with it! • 30 items • Updated Jun 12, 2024 • 224