Models
Datasets
Spaces
Posts
Docs
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2402.10790

Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18 • 143
Orion-14B: Open-source Multilingual Large Language Models

Paper • 2401.12246 • Published Jan 20 • 11
MambaByte: Token-free Selective State Space Model

Paper • 2401.13660 • Published Jan 24 • 50
MM-LLMs: Recent Advances in MultiModal Large Language Models

Paper • 2401.13601 • Published Jan 24 • 44

LLM-LongContext-Compression

Papers in terms of LLM LongContext Compression way, Reading Details: https://www.notion.so/LLM-LongContext-Compression-323cc6da39124c3a97d3502e1bf61b7

Adapting Language Models to Compress Contexts

Paper • 2305.14788 • Published May 24, 2023 • 1
Soaring from 4K to 400K: Extending LLM's Context with Activation Beacon

Paper • 2401.03462 • Published Jan 7 • 26
Flexibly Scaling Large Language Models Contexts Through Extensible Tokenization

Paper • 2401.07793 • Published Jan 15 • 3
Say More with Less: Understanding Prompt Learning Behaviors through Gist Compression

Paper • 2402.16058 • Published Feb 25

TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization

Paper • 2402.13249 • Published Feb 20 • 10
The FinBen: An Holistic Financial Benchmark for Large Language Models

Paper • 2402.12659 • Published Feb 20 • 16
Instruction-tuned Language Models are Better Knowledge Learners

Paper • 2402.12847 • Published Feb 20 • 24
Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models

Paper • 2402.13064 • Published Feb 20 • 46

Transformers Can Achieve Length Generalization But Not Robustly

Paper • 2402.09371 • Published Feb 14 • 12
DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows

Paper • 2402.10379 • Published Feb 16 • 29
In Search of Needles in a 10M Haystack: Recurrent Memory Finds What LLMs Miss

Paper • 2402.10790 • Published Feb 16 • 40

Lee's RoPE Tricks / Context Extension Reads

Set of Long Context (RoPE or otherwise) I'm collecting off of HF

LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens

Paper • 2402.13753 • Published Feb 21 • 111
Data Engineering for Scaling Language Models to 128K Context

Paper • 2402.10171 • Published Feb 15 • 21
LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration

Paper • 2402.11550 • Published Feb 18 • 15
The What, Why, and How of Context Length Extension Techniques in Large Language Models -- A Detailed Survey

Paper • 2401.07872 • Published Jan 15 • 2

InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory

Paper • 2402.04617 • Published Feb 7 • 4
BurstAttention: An Efficient Distributed Attention Framework for Extremely Long Sequences

Paper • 2403.09347 • Published Mar 14 • 20
Resonance RoPE: Improving Context Length Generalization of Large Language Models

Paper • 2403.00071 • Published Feb 29 • 22
Training-Free Long-Context Scaling of Large Language Models

Paper • 2402.17463 • Published Feb 27 • 19

In Search of Needles in a 10M Haystack: Recurrent Memory Finds What LLMs Miss

Paper • 2402.10790 • Published Feb 16 • 40
LLM Comparator: Visual Analytics for Side-by-Side Evaluation of Large Language Models

Paper • 2402.10524 • Published Feb 16 • 21
CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution

Paper • 2410.16256 • Published 19 days ago • 58

Papers - Context

In Search of Needles in a 10M Haystack: Recurrent Memory Finds What LLMs Miss

Paper • 2402.10790 • Published Feb 16 • 40
LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration

Paper • 2402.11550 • Published Feb 18 • 15
A Neural Conversational Model

Paper • 1506.05869 • Published Jun 19, 2015 • 2
Data Engineering for Scaling Language Models to 128K Context

Paper • 2402.10171 • Published Feb 15 • 21

In Search of Needles in a 10M Haystack: Recurrent Memory Finds What LLMs Miss

Paper • 2402.10790 • Published Feb 16 • 40

Papers is all you need

In Search of Needles in a 10M Haystack: Recurrent Memory Finds What LLMs Miss

Paper • 2402.10790 • Published Feb 16 • 40

Previous
1
2
3
Next

Company

© Hugging Face

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs