PDFTriage: Question Answering over Long, Structured Documents Paper • 2309.08872 • Published Sep 16, 2023 • 53
Adapting Large Language Models via Reading Comprehension Paper • 2309.09530 • Published Sep 18, 2023 • 77
TeacherLM: Teaching to Fish Rather Than Giving the Fish, Language Modeling Likewise Paper • 2310.19019 • Published Oct 29, 2023 • 9
Orca 2: Teaching Small Language Models How to Reason Paper • 2311.11045 • Published Nov 18, 2023 • 70
SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling Paper • 2312.15166 • Published Dec 23, 2023 • 56
Improving Text Embeddings with Large Language Models Paper • 2401.00368 • Published Dec 31, 2023 • 79
DocLLM: A layout-aware generative language model for multimodal document understanding Paper • 2401.00908 • Published Dec 31, 2023 • 181
Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding Paper • 2401.04398 • Published Jan 9 • 21
SliceGPT: Compress Large Language Models by Deleting Rows and Columns Paper • 2401.15024 • Published Jan 26 • 69
Tag-LLM: Repurposing General-Purpose LLMs for Specialized Domains Paper • 2402.05140 • Published Feb 6 • 20
AutoMathText: Autonomous Data Selection with Language Models for Mathematical Texts Paper • 2402.07625 • Published Feb 12 • 11
FinTral: A Family of GPT-4 Level Multimodal Financial Large Language Models Paper • 2402.10986 • Published Feb 16 • 77
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models Paper • 2403.13372 • Published Mar 20 • 62
Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models Paper • 2405.20541 • Published May 30 • 21
Test of Time: A Benchmark for Evaluating LLMs on Temporal Reasoning Paper • 2406.09170 • Published Jun 13 • 24
Instruction Pre-Training: Language Models are Supervised Multitask Learners Paper • 2406.14491 • Published Jun 20 • 85
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale Paper • 2406.17557 • Published Jun 25 • 86
SeaKR: Self-aware Knowledge Retrieval for Adaptive Retrieval Augmented Generation Paper • 2406.19215 • Published Jun 27 • 29
Show Less, Instruct More: Enriching Prompts with Definitions and Guidelines for Zero-Shot NER Paper • 2407.01272 • Published Jul 1 • 8
LETS-C: Leveraging Language Embedding for Time Series Classification Paper • 2407.06533 • Published Jul 9 • 2
SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers Paper • 2407.09413 • Published Jul 12 • 9
Scaling Retrieval-Based Language Models with a Trillion-Token Datastore Paper • 2407.12854 • Published Jul 9 • 29
MMAU: A Holistic Benchmark of Agent Capabilities Across Diverse Domains Paper • 2407.18961 • Published Jul 18 • 39
SaulLM-54B & SaulLM-141B: Scaling Up Domain Adaptation for the Legal Domain Paper • 2407.19584 • Published Jul 28 • 62
Visual Riddles: a Commonsense and World Knowledge Challenge for Large Vision and Language Models Paper • 2407.19474 • Published Jul 28 • 23
Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning Paper • 2407.18248 • Published Jul 25 • 31
MindSearch: Mimicking Human Minds Elicits Deep AI Searcher Paper • 2407.20183 • Published Jul 29 • 38
Knowledge Mechanisms in Large Language Models: A Survey and Perspective Paper • 2407.15017 • Published Jul 22 • 33
AgentInstruct: Toward Generative Teaching with Agentic Flows Paper • 2407.03502 • Published Jul 3 • 48
Text2SQL is Not Enough: Unifying AI and Databases with TAG Paper • 2408.14717 • Published Aug 27 • 24
Efficient Detection of Toxic Prompts in Large Language Models Paper • 2408.11727 • Published Aug 21 • 12
Controllable Text Generation for Large Language Models: A Survey Paper • 2408.12599 • Published Aug 22 • 63
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering Paper • 2408.09174 • Published Aug 17 • 51
MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct Paper • 2409.05840 • Published Sep 9 • 45
Towards a Unified View of Preference Learning for Large Language Models: A Survey Paper • 2409.02795 • Published Sep 4 • 72
Promptriever: Instruction-Trained Retrievers Can Be Prompted Like Language Models Paper • 2409.11136 • Published Sep 17 • 21
Training Language Models to Self-Correct via Reinforcement Learning Paper • 2409.12917 • Published Sep 19 • 135
A Controlled Study on Long Context Extension and Generalization in LLMs Paper • 2409.12181 • Published Sep 18 • 43
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning Paper • 2409.12183 • Published Sep 18 • 36
A Comprehensive Evaluation of Quantized Instruction-Tuned Large Language Models: An Experimental Analysis up to 405B Paper • 2409.11055 • Published Sep 17 • 16
RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval Paper • 2409.10516 • Published Sep 16 • 39
Guiding Vision-Language Model Selection for Visual Question-Answering Across Tasks, Domains, and Knowledge Types Paper • 2409.09269 • Published Sep 14 • 7
A Preliminary Study of o1 in Medicine: Are We Closer to an AI Doctor? Paper • 2409.15277 • Published Sep 23 • 34
HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models Paper • 2409.16191 • Published Sep 24 • 41
MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models Paper • 2409.17481 • Published Sep 26 • 46
MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning Paper • 2409.20566 • Published Sep 30 • 52
TPI-LLM: Serving 70B-scale LLMs Efficiently on Low-resource Edge Devices Paper • 2410.00531 • Published Oct 1 • 29
Law of the Weakest Link: Cross Capabilities of Large Language Models Paper • 2409.19951 • Published Sep 30 • 53
Embodied-RAG: General non-parametric Embodied Memory for Retrieval and Generation Paper • 2409.18313 • Published Sep 26 • 3
Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models Paper • 2410.02740 • Published Oct 3 • 52
Tutor CoPilot: A Human-AI Approach for Scaling Real-Time Expertise Paper • 2410.03017 • Published Oct 3 • 25
LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations Paper • 2410.02707 • Published Oct 3 • 47
ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery Paper • 2410.05080 • Published Oct 7 • 19
LLM Self-Correction with DeCRIM: Decompose, Critique, and Refine for Enhanced Following of Instructions with Multiple Constraints Paper • 2410.06458 • Published Oct 9 • 8
MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering Paper • 2410.07095 • Published Oct 9 • 6
MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code Paper • 2410.08196 • Published Oct 10 • 44
MLLM as Retriever: Interactively Learning Multimodal Retrieval for Embodied Agents Paper • 2410.03450 • Published Oct 4 • 36
Towards Self-Improvement of LLMs via MCTS: Leveraging Stepwise Knowledge with Curriculum Preference Learning Paper • 2410.06508 • Published Oct 9 • 10
Vector-ICL: In-context Learning with Continuous Vector Representations Paper • 2410.05629 • Published Oct 8 • 3
StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization Paper • 2410.08815 • Published Oct 11 • 42
From Generalist to Specialist: Adapting Vision Language Models via Task-Specific Visual Instruction Tuning Paper • 2410.06456 • Published Oct 9 • 35
MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks Paper • 2410.10563 • Published Oct 14 • 37
Thinking LLMs: General Instruction Following with Thought Generation Paper • 2410.10630 • Published Oct 14 • 17
SecCodePLT: A Unified Platform for Evaluating the Security of Code GenAI Paper • 2410.11096 • Published Oct 14 • 12
MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models Paper • 2410.13085 • Published Oct 16 • 20
WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines Paper • 2410.12705 • Published Oct 16 • 29
TransAgent: Transfer Vision-Language Foundation Models with Heterogeneous Agent Collaboration Paper • 2410.12183 • Published Oct 16 • 3
UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models Paper • 2410.14059 • Published Oct 17 • 53
NaturalBench: Evaluating Vision-Language Models on Natural Adversarial Samples Paper • 2410.14669 • Published Oct 18 • 35
MiniPLM: Knowledge Distillation for Pre-Training Language Models Paper • 2410.17215 • Published Oct 22 • 12
MMAU: A Massive Multi-Task Audio Understanding and Reasoning Benchmark Paper • 2410.19168 • Published Oct 24 • 19
WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning Paper • 2411.02337 • Published 24 days ago • 36
HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems Paper • 2411.02959 • Published 23 days ago • 64
From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge Paper • 2411.16594 • Published 3 days ago • 29
GMAI-VL & GMAI-VL-5.5M: A Large Vision-Language Model and A Comprehensive Multimodal Dataset Towards General Medical AI Paper • 2411.14522 • Published 7 days ago • 29
OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs Paper • 2411.14199 • Published 7 days ago • 25