TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts • Paper 2407.03203 • Published Jul 3, 2024
Accelerated Convergence of Stochastic Heavy Ball Method under Anisotropic Gradient Noise • Paper 2312.14567 • Published Dec 22, 2023
LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning • Paper 2403.17919 • Published Mar 26, 2024