Collections
Discover the best community collections!
Collections including paper arxiv:2402.14547
-
TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization
Paper • 2402.13249 • Published • 10 -
The FinBen: An Holistic Financial Benchmark for Large Language Models
Paper • 2402.12659 • Published • 16 -
Instruction-tuned Language Models are Better Knowledge Learners
Paper • 2402.12847 • Published • 24 -
Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models
Paper • 2402.13064 • Published • 46
-
DocLLM: A layout-aware generative language model for multimodal document understanding
Paper • 2401.00908 • Published • 181 -
Learning Vision from Models Rivals Learning Vision from Data
Paper • 2312.17742 • Published • 15 -
PanGu-π: Enhancing Language Model Architectures via Nonlinearity Compensation
Paper • 2312.17276 • Published • 15 -
Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache
Paper • 2401.02669 • Published • 14
-
DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models
Paper • 2309.14509 • Published • 17 -
LLM Augmented LLMs: Expanding Capabilities through Composition
Paper • 2401.02412 • Published • 36 -
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
Paper • 2401.06066 • Published • 42 -
Tuning Language Models by Proxy
Paper • 2401.08565 • Published • 20
-
MADLAD-400: A Multilingual And Document-Level Large Audited Dataset
Paper • 2309.04662 • Published • 22 -
Neurons in Large Language Models: Dead, N-gram, Positional
Paper • 2309.04827 • Published • 16 -
Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs
Paper • 2309.05516 • Published • 9 -
DrugChat: Towards Enabling ChatGPT-Like Capabilities on Drug Molecule Graphs
Paper • 2309.03907 • Published • 8