Reading Papers - a speechlessai Collection

DocLLM: A layout-aware generative language model for multimodal document understanding

Paper • 2401.00908 • Published Dec 31, 2023 • 181

Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models

Paper • 2401.01335 • Published Jan 2 • 64

LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning

Paper • 2401.01325 • Published Jan 2 • 26

A Comprehensive Study of Knowledge Editing for Large Language Models

Paper • 2401.01286 • Published Jan 2 • 16

Improving Text Embeddings with Large Language Models

Paper • 2401.00368 • Published Dec 31, 2023 • 79

LARP: Language-Agent Role Play for Open-World Games

Paper • 2312.17653 • Published Dec 24, 2023 • 30

Extending LLMs' Context Window with 100 Samples

Paper • 2401.07004 • Published Jan 13 • 14

DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models

Paper • 2401.06066 • Published Jan 11 • 42

TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones

Paper • 2312.16862 • Published Dec 28, 2023 • 30

Generative AI for Math: Part I -- MathPile: A Billion-Token-Scale Pretraining Corpus for Math

Paper • 2312.17120 • Published Dec 28, 2023 • 25

MobileVLM : A Fast, Reproducible and Strong Vision Language Assistant for Mobile Devices

Paper • 2312.16886 • Published Dec 28, 2023 • 19

Human101: Training 100+FPS Human Gaussians in 100s from 1 View

Paper • 2312.15258 • Published Dec 23, 2023 • 7

WaveCoder: Widespread And Versatile Enhanced Instruction Tuning with Refined Data Generation

Paper • 2312.14187 • Published Dec 20, 2023 • 49

Reasons to Reject? Aligning Language Models with Judgments

Paper • 2312.14591 • Published Dec 22, 2023 • 17

AppAgent: Multimodal Agents as Smartphone Users

Paper • 2312.13771 • Published Dec 21, 2023 • 51

Time is Encoded in the Weights of Finetuned Language Models

Paper • 2312.13401 • Published Dec 20, 2023 • 19

TinySAM: Pushing the Envelope for Efficient Segment Anything Model

Paper • 2312.13789 • Published Dec 21, 2023 • 13

TinyGSM: achieving >80% on GSM8k with small language models

Paper • 2312.09241 • Published Dec 14, 2023 • 37

SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention

Paper • 2312.07987 • Published Dec 13, 2023 • 40

PromptBench: A Unified Library for Evaluation of Large Language Models

Paper • 2312.07910 • Published Dec 13, 2023 • 15

Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations

Paper • 2312.06674 • Published Dec 7, 2023 • 6

LLM360: Towards Fully Transparent Open-Source LLMs

Paper • 2312.06550 • Published Dec 11, 2023 • 56

Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models

Paper • 2312.06585 • Published Dec 11, 2023 • 28

Context Tuning for Retrieval Augmented Generation

Paper • 2312.05708 • Published Dec 9, 2023 • 16

Evaluation of Large Language Models for Decision Making in Autonomous Driving

Paper • 2312.06351 • Published Dec 11, 2023 • 5

Alpha-CLIP: A CLIP Model Focusing on Wherever You Want

Paper • 2312.03818 • Published Dec 6, 2023 • 32

PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding

Paper • 2312.04461 • Published Dec 7, 2023 • 57

Chain of Code: Reasoning with a Language Model-Augmented Code Emulator

Paper • 2312.04474 • Published Dec 7, 2023 • 29

Pearl: A Production-ready Reinforcement Learning Agent

Paper • 2312.03814 • Published Dec 6, 2023 • 14

OneLLM: One Framework to Align All Modalities with Language

Paper • 2312.03700 • Published Dec 6, 2023 • 20

LivePhoto: Real Image Animation with Text-guided Motion Control

Paper • 2312.02928 • Published Dec 5, 2023 • 16

Rank-without-GPT: Building GPT-Independent Listwise Rerankers on Open-Source Large Language Models

Paper • 2312.02969 • Published Dec 5, 2023 • 12

Training Chain-of-Thought via Latent-Variable Inference

Paper • 2312.02179 • Published Nov 28, 2023 • 8

Magicoder: Source Code Is All You Need

Paper • 2312.02120 • Published Dec 4, 2023 • 79

Segment and Caption Anything

Paper • 2312.00869 • Published Dec 1, 2023 • 18

Segment Any 3D Gaussians

Paper • 2312.00860 • Published Dec 1, 2023 • 8

Mamba: Linear-Time Sequence Modeling with Selective State Spaces

Paper • 2312.00752 • Published Dec 1, 2023 • 138

Dolphins: Multimodal Language Model for Driving

Paper • 2312.00438 • Published Dec 1, 2023 • 12

ZipLoRA: Any Subject in Any Style by Effectively Merging LoRAs

Paper • 2311.13600 • Published Nov 22, 2023 • 42

Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model

Paper • 2311.13231 • Published Nov 22, 2023 • 26

Exponentially Faster Language Modelling

Paper • 2311.10770 • Published Nov 15, 2023 • 118

Orca 2: Teaching Small Language Models How to Reason

Paper • 2311.11045 • Published Nov 18, 2023 • 70

TPTU-v2: Boosting Task Planning and Tool Usage of Large Language Model-based Agents in Real-world Systems

Paper • 2311.11315 • Published Nov 19, 2023 • 6

ToolTalk: Evaluating Tool-Usage in a Conversational Setting

Paper • 2311.10775 • Published Nov 15, 2023 • 7

ProAgent: From Robotic Process Automation to Agentic Process Automation

Paper • 2311.10751 • Published Nov 2, 2023 • 8

Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2

Paper • 2311.10702 • Published Nov 17, 2023 • 18

SelfEval: Leveraging the discriminative nature of generative models for evaluation

Paper • 2311.10708 • Published Nov 17, 2023 • 14

The Chosen One: Consistent Characters in Text-to-Image Diffusion Models

Paper • 2311.10093 • Published Nov 16, 2023 • 57

ML-Bench: Large Language Models Leverage Open-source Libraries for Machine Learning Tasks

Paper • 2311.09835 • Published Nov 16, 2023 • 9

Routing to the Expert: Efficient Reward-guided Ensemble of Large Language Models

Paper • 2311.08692 • Published Nov 15, 2023 • 12

A Survey on Language Models for Code

Paper • 2311.07989 • Published Nov 14, 2023 • 21

Instruction-Following Evaluation for Large Language Models

Paper • 2311.07911 • Published Nov 14, 2023 • 19

Fast Chain-of-Thought: A Glance of Future from Parallel Decoding Leads to Answers Faster

Paper • 2311.08263 • Published Nov 14, 2023 • 15

The ART of LLM Refinement: Ask, Refine, and Trust

Paper • 2311.07961 • Published Nov 14, 2023 • 10

ChatAnything: Facetime Chat with LLM-Enhanced Personas

Paper • 2311.06772 • Published Nov 12, 2023 • 34

LayoutPrompter: Awaken the Design Ability of Large Language Models

Paper • 2311.06495 • Published Nov 11, 2023 • 10

Lumos: Learning Agents with Unified Data, Modular Design, and Open-Source LLMs

Paper • 2311.05657 • Published Nov 9, 2023 • 27

LCM-LoRA: A Universal Stable-Diffusion Acceleration Module

Paper • 2311.05556 • Published Nov 9, 2023 • 80

LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents

Paper • 2311.05437 • Published Nov 9, 2023 • 45

Prompt Cache: Modular Attention Reuse for Low-Latency Inference

Paper • 2311.04934 • Published Nov 7, 2023 • 28

Can LLMs Follow Simple Rules?

Paper • 2311.04235 • Published Nov 6, 2023 • 10

Levels of AGI: Operationalizing Progress on the Path to AGI

Paper • 2311.02462 • Published Nov 4, 2023 • 33

S-LoRA: Serving Thousands of Concurrent LoRA Adapters

Paper • 2311.03285 • Published Nov 6, 2023 • 28

CogVLM: Visual Expert for Pretrained Language Models

Paper • 2311.03079 • Published Nov 6, 2023 • 23

Relax: Composable Abstractions for End-to-End Dynamic Machine Learning

Paper • 2311.02103 • Published Nov 1, 2023 • 16

MFTCoder: Boosting Code LLMs with Multitask Fine-Tuning

Paper • 2311.02303 • Published Nov 4, 2023 • 4

CoVLM: Composing Visual Entities and Relationships in Large Language Models Via Communicative Decoding

Paper • 2311.03354 • Published Nov 6, 2023 • 4

ChatCoder: Chat-based Refine Requirement Improves LLMs' Code Generation

Paper • 2311.00272 • Published Nov 1, 2023 • 9

ChipNeMo: Domain-Adapted LLMs for Chip Design

Paper • 2311.00176 • Published Oct 31, 2023 • 8

Learning From Mistakes Makes LLM Better Reasoner

Paper • 2310.20689 • Published Oct 31, 2023 • 28

Does GPT-4 Pass the Turing Test?

Paper • 2310.20216 • Published Oct 31, 2023 • 17

LoRA Fine-tuning Efficiently Undoes Safety Training in Llama 2-Chat 70B

Paper • 2310.20624 • Published Oct 31, 2023 • 12

What's In My Big Data?

Paper • 2310.20707 • Published Oct 31, 2023 • 10

LoRAShear: Efficient Large Language Model Structured Pruning and Knowledge Recovery

Paper • 2310.18356 • Published Oct 24, 2023 • 22

TeacherLM: Teaching to Fish Rather Than Giving the Fish, Language Modeling Likewise

Paper • 2310.19019 • Published Oct 29, 2023 • 9

JudgeLM: Fine-tuned Large Language Models are Scalable Judges

Paper • 2310.17631 • Published Oct 26, 2023 • 32

QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models

Paper • 2310.16795 • Published Oct 25, 2023 • 26

InstructExcel: A Benchmark for Natural Language Instruction in Excel

Paper • 2310.14495 • Published Oct 23, 2023 • 1

Auto-Instruct: Automatic Instruction Generation and Ranking for Black-Box Language Models

Paper • 2310.13127 • Published Oct 19, 2023 • 11

ToolChain: Efficient Action Space Navigation in Large Language Models with A Search

Paper • 2310.13227 • Published Oct 20, 2023 • 12

Tuna: Instruction Tuning using Feedback from Large Language Models

Paper • 2310.13385 • Published Oct 20, 2023 • 10

AgentTuning: Enabling Generalized Agent Abilities for LLMs

Paper • 2310.12823 • Published Oct 19, 2023 • 35

Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection

Paper • 2310.11511 • Published Oct 17, 2023 • 74

Table-GPT: Table-tuned GPT for Diverse Table Tasks

Paper • 2310.09263 • Published Oct 13, 2023 • 39

LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models

Paper • 2310.08659 • Published Oct 12, 2023 • 22

Prometheus: Inducing Fine-grained Evaluation Capability in Language Models

Paper • 2310.08491 • Published Oct 12, 2023 • 53

Lemur: Harmonizing Natural Language and Code for Language Agents

Paper • 2310.06830 • Published Oct 10, 2023 • 30

MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning

Paper • 2310.03731 • Published Oct 5, 2023 • 29

DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines

Paper • 2310.03714 • Published Oct 5, 2023 • 30

SmartPlay : A Benchmark for LLMs as Intelligent Agents

Paper • 2310.01557 • Published Oct 2, 2023 • 12

VMamba: Visual State Space Model

Paper • 2401.10166 • Published Jan 18 • 37

Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads

Paper • 2401.10774 • Published Jan 19 • 53

ActAnywhere: Subject-Aware Video Background Generation

Paper • 2401.10822 • Published Jan 19 • 13

Rambler: Supporting Writing With Speech via LLM-Assisted Gist Manipulation

Paper • 2401.10838 • Published Jan 19 • 8

Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs

Paper • 2401.11708 • Published Jan 22 • 29

Large Language Models are Superpositions of All Characters: Attaining Arbitrary Role-play via Self-Alignment

Paper • 2401.12474 • Published Jan 23 • 33

Orion-14B: Open-source Multilingual Large Language Models

Paper • 2401.12246 • Published Jan 20 • 11

Small Language Model Meets with Reinforced Vision Vocabulary

Paper • 2401.12503 • Published Jan 23 • 31

BiTA: Bi-Directional Tuning for Lossless Acceleration in Large Language Models

Paper • 2401.12522 • Published Jan 23 • 11

MM-LLMs: Recent Advances in MultiModal Large Language Models

Paper • 2401.13601 • Published Jan 24 • 44

Diffuse to Choose: Enriching Image Conditioned Inpainting in Latent Diffusion Models for Virtual Try-All

Paper • 2401.13795 • Published Jan 24 • 65

DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence

Paper • 2401.14196 • Published Jan 25 • 46

Genie: Achieving Human Parity in Content-Grounded Datasets Generation

Paper • 2401.14367 • Published Jan 25 • 6

Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling

Paper • 2401.16380 • Published Jan 29 • 47

MoE-LLaVA: Mixture of Experts for Large Vision-Language Models

Paper • 2401.15947 • Published Jan 29 • 48

Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception

Paper • 2401.16158 • Published Jan 29 • 17

SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning

Paper • 2401.16013 • Published Jan 29 • 20

SymbolicAI: A framework for logic-based approaches combining generative models and solvers

Paper • 2402.00854 • Published Feb 1 • 19

OS-Copilot: Towards Generalist Computer Agents with Self-Improvement

Paper • 2402.07456 • Published Feb 12 • 41

Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models

Paper • 2402.07033 • Published Feb 10 • 16

ChemLLM: A Chemical Large Language Model

Paper • 2402.06852 • Published Feb 10 • 26

LiRank: Industrial Large Scale Ranking Models at LinkedIn

Paper • 2402.06859 • Published Feb 10 • 8

AutoMathText: Autonomous Data Selection with Language Models for Mathematical Texts

Paper • 2402.07625 • Published Feb 12 • 11

Chain-of-Thought Reasoning Without Prompting

Paper • 2402.10200 • Published Feb 15 • 99

Generative Representational Instruction Tuning

Paper • 2402.09906 • Published Feb 15 • 51

A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts

Paper • 2402.09727 • Published Feb 15 • 35

How to Train Data-Efficient LLMs

Paper • 2402.09668 • Published Feb 15 • 38

BitDelta: Your Fine-Tune May Only Be Worth One Bit

Paper • 2402.10193 • Published Feb 15 • 17

DreamMatcher: Appearance Matching Self-Attention for Semantically-Consistent Text-to-Image Personalization

Paper • 2402.09812 • Published Feb 15 • 12

OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset

Paper • 2402.10176 • Published Feb 15 • 34

DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows

Paper • 2402.10379 • Published Feb 16 • 29

LLM Comparator: Visual Analytics for Side-by-Side Evaluation of Large Language Models

Paper • 2402.10524 • Published Feb 16 • 21

Large Language Models as Zero-shot Dialogue State Tracker through Function Calling

Paper • 2402.10466 • Published Feb 16 • 16

LAVE: LLM-Powered Agent Assistance and Language Augmentation for Video Editing

Paper • 2402.10294 • Published Feb 15 • 22

OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement

Paper • 2402.14658 • Published Feb 22 • 82

Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping

Paper • 2402.14083 • Published Feb 21 • 47

TinyLLaVA: A Framework of Small-scale Large Multimodal Models

Paper • 2402.14289 • Published Feb 22 • 19

Copilot Evaluation Harness: Evaluating LLM-Guided Software Programming

Paper • 2402.14261 • Published Feb 22 • 10

LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens

Paper • 2402.13753 • Published Feb 21 • 111

Aria Everyday Activities Dataset

Paper • 2402.13349 • Published Feb 20 • 29

Coercing LLMs to do and reveal (almost) anything

Paper • 2402.14020 • Published Feb 21 • 12

Ouroboros: Speculative Decoding with Large Model Enhanced Drafting

Paper • 2402.13720 • Published Feb 21 • 5

Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research

Paper • 2402.00159 • Published Jan 31 • 59

Specialized Language Models with Cheap Inference from Limited Domain Data

Paper • 2402.01093 • Published Feb 2 • 45

TravelPlanner: A Benchmark for Real-World Planning with Language Agents

Paper • 2402.01622 • Published Feb 2 • 33

Nomic Embed: Training a Reproducible Long Context Text Embedder

Paper • 2402.01613 • Published Feb 2 • 14

Training-Free Consistent Text-to-Image Generation

Paper • 2402.03286 • Published Feb 5 • 64

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Paper • 2402.03300 • Published Feb 5 • 69

OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models

Paper • 2402.01739 • Published Jan 29 • 26

DiffEditor: Boosting Accuracy and Flexibility on Diffusion-based Image Editing

Paper • 2402.02583 • Published Feb 4 • 7

Self-Discover: Large Language Models Self-Compose Reasoning Structures

Paper • 2402.03620 • Published Feb 6 • 109

Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks

Paper • 2402.04248 • Published Feb 6 • 30

MobileVLM V2: Faster and Stronger Baseline for Vision Language Model

Paper • 2402.03766 • Published Feb 6 • 12

Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6 • 12

Multi-line AI-assisted Code Authoring

Paper • 2402.04141 • Published Feb 6 • 9

BiLLM: Pushing the Limit of Post-Training Quantization for LLMs

Paper • 2402.04291 • Published Feb 6 • 48

ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7 • 38

Fine-Tuned Language Models Generate Stable Inorganic Materials as Text

Paper • 2402.04379 • Published Feb 6 • 7

CodeIt: Self-Improving Language Models with Prioritized Hindsight Replay

Paper • 2402.04858 • Published Feb 7 • 14

Grandmaster-Level Chess Without Search

Paper • 2402.04494 • Published Feb 7 • 67

More Agents Is All You Need

Paper • 2402.05120 • Published Feb 3 • 51

Tag-LLM: Repurposing General-Purpose LLMs for Specialized Domains

Paper • 2402.05140 • Published Feb 6 • 20

An Interactive Agent Foundation Model

Paper • 2402.05929 • Published Feb 8 • 27

WebLINX: Real-World Website Navigation with Multi-Turn Dialogue

Paper • 2402.05930 • Published Feb 8 • 39

Training Generative Question-Answering on Synthetic Data Obtained from an Instruct-tuned Model

Paper • 2310.08072 • Published Oct 12, 2023 • 1

Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models

Paper • 2402.13064 • Published Feb 20 • 46

FinTral: A Family of GPT-4 Level Multimodal Financial Large Language Models

Paper • 2402.10986 • Published Feb 16 • 76

Speculative Streaming: Fast LLM Inference without Auxiliary Models

Paper • 2402.11131 • Published Feb 16 • 41

AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling

Paper • 2402.12226 • Published Feb 19 • 40

Reformatted Alignment

Paper • 2402.12219 • Published Feb 19 • 15

Rethinking Data Selection for Supervised Fine-Tuning

Paper • 2402.06094 • Published Feb 8 • 1

What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning

Paper • 2312.15685 • Published Dec 25, 2023 • 17

SelectLLM: Can LLMs Select Important Instructions to Annotate?

Paper • 2401.16553 • Published Jan 29 • 3

Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning

Paper • 2402.04833 • Published Feb 7 • 6

A Systematic Survey of Prompt Engineering in Large Language Models: Techniques and Applications

Paper • 2402.07927 • Published Feb 5 • 1

Simple linear attention language models balance the recall-throughput tradeoff

Paper • 2402.18668 • Published Feb 28 • 18

DiffuseKronA: A Parameter Efficient Fine-tuning Method for Personalized Diffusion Model

Paper • 2402.17412 • Published Feb 27 • 21

OmniACT: A Dataset and Benchmark for Enabling Multimodal Generalist Autonomous Agents for Desktop and Web

Paper • 2402.17553 • Published Feb 27 • 21

Training-Free Long-Context Scaling of Large Language Models

Paper • 2402.17463 • Published Feb 27 • 19

FuseChat: Knowledge Fusion of Chat Models

Paper • 2402.16107 • Published Feb 25 • 36

StructLM: Towards Building Generalist Models for Structured Knowledge Grounding

Paper • 2402.16671 • Published Feb 26 • 26

Seamless Human Motion Composition with Blended Positional Encodings

Paper • 2402.15509 • Published Feb 23 • 14

Design2Code: How Far Are We From Automating Front-End Engineering?

Paper • 2403.03163 • Published Mar 5 • 93

Finetuned Multimodal Language Models Are High-Quality Image-Text Data Filters

Paper • 2403.02677 • Published Mar 5 • 16

MAGID: An Automated Pipeline for Generating Synthetic Multi-modal Datasets

Paper • 2403.03194 • Published Mar 5 • 12

MoAI: Mixture of All Intelligence for Large Language and Vision Models

Paper • 2403.07508 • Published Mar 12 • 75

Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM

Paper • 2403.07816 • Published Mar 12 • 39

Simple and Scalable Strategies to Continually Pre-train Large Language Models

Paper • 2403.08763 • Published Mar 13 • 48

Adapting Large Language Models via Reading Comprehension

Paper • 2309.09530 • Published Sep 18, 2023 • 77

Recurrent Drafter for Fast Speculative Decoding in Large Language Models

Paper • 2403.09919 • Published Mar 14 • 20

RAFT: Adapting Language Model to Domain Specific RAG

Paper • 2403.10131 • Published Mar 15 • 67

mPLUG-DocOwl 1.5: Unified Structure Learning for OCR-free Document Understanding

Paper • 2403.12895 • Published Mar 19 • 30

TnT-LLM: Text Mining at Scale with Large Language Models

Paper • 2403.12173 • Published Mar 18 • 19

LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression

Paper • 2403.12968 • Published Mar 19 • 24

BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text

Paper • 2403.18421 • Published Mar 27 • 22

PowerInfer-2: Fast Large Language Model Inference on a Smartphone

Paper • 2406.06282 • Published Jun 10 • 36

Turbo Sparse: Achieving LLM SOTA Performance with Minimal Activated Parameters

Paper • 2406.05955 • Published Jun 10 • 22

RegMix: Data Mixture as Regression for Language Model Pre-training

Paper • 2407.01492 • Published Jul 1 • 34

ColPali: Efficient Document Retrieval with Vision Language Models

Paper • 2407.01449 • Published Jun 27 • 41

MIRAI: Evaluating LLM Agents for Event Forecasting

Paper • 2407.01231 • Published Jul 1 • 16

Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models

Paper • 2407.01906 • Published Jul 2 • 34

Sentence-wise Speech Summarization: Task, Datasets, and End-to-End Modeling with LM Knowledge Distillation

Paper • 2408.00205 • Published Aug 1 • 4

Smoothed Energy Guidance: Guiding Diffusion Models with Reduced Energy Curvature of Attention

Paper • 2408.00760 • Published Aug 1 • 5

The Llama 3 Herd of Models

Paper • 2407.21783 • Published Jul 31 • 105

Tora: Trajectory-oriented Diffusion Transformer for Video Generation

Paper • 2407.21705 • Published Jul 31 • 25

Towards Achieving Human Parity on End-to-end Simultaneous Speech Translation via LLM Agent

Paper • 2407.21646 • Published Jul 31 • 18

SaulLM-54B & SaulLM-141B: Scaling Up Domain Adaptation for the Legal Domain

Paper • 2407.19584 • Published Jul 28 • 61

Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning

Paper • 2407.18248 • Published Jul 25 • 30

Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge

Paper • 2407.19594 • Published Jul 28 • 19

Wolf: Captioning Everything with a World Summarization Framework

Paper • 2407.18908 • Published Jul 26 • 30

Vript: A Video Is Worth Thousands of Words

Paper • 2406.06040 • Published Jun 10 • 22

AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents

Paper • 2407.18901 • Published Jul 26 • 31

GTA: A Benchmark for General Tool Agents

Paper • 2407.08713 • Published Jul 11 • 14

Sketch2Scene: Automatic Generation of Interactive 3D Game Scenes from User's Casual Sketches

Paper • 2408.04567 • Published Aug 8 • 23

ShortCircuit: AlphaZero-Driven Circuit Design

Paper • 2408.09858 • Published Aug 19 • 16

Factorized-Dreamer: Training A High-Quality Video Generator with Limited and Low-Quality Data

Paper • 2408.10119 • Published Aug 19 • 15

Automated Design of Agentic Systems

Paper • 2408.08435 • Published Aug 15 • 38

MemoRAG: Moving towards Next-Gen RAG Via Memory-Inspired Knowledge Discovery

Paper • 2409.05591 • Published Sep 9 • 28