Collections
Collections including paper arxiv:2402.05099
---
- GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers
  Paper • 2210.17323 • Published • 7
- LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
  Paper • 2208.07339 • Published • 4
- Hydragen: High-Throughput LLM Inference with Shared Prefixes
  Paper • 2402.05099 • Published • 18
- Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads
  Paper • 2401.10774 • Published • 53
---
- Program Induction by Rationale Generation: Learning to Solve and Explain Algebraic Word Problems
  Paper • 1705.04146 • Published • 1
- Red Teaming GPT-4V: Are GPT-4V Safe Against Uni/Multi-Modal Jailbreak Attacks?
  Paper • 2404.03411 • Published • 8
- No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance
  Paper • 2404.04125 • Published • 27
- Hydragen: High-Throughput LLM Inference with Shared Prefixes
  Paper • 2402.05099 • Published • 18
---
- Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction
  Paper • 2404.02905 • Published • 64
- On Speculative Decoding for Multimodal Large Language Models
  Paper • 2404.08856 • Published • 13
- Hydragen: High-Throughput LLM Inference with Shared Prefixes
  Paper • 2402.05099 • Published • 18
---
- Long-context LLMs Struggle with Long In-context Learning
  Paper • 2404.02060 • Published • 35
- Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks
  Paper • 2211.12588 • Published • 3
- StructLM: Towards Building Generalist Models for Structured Knowledge Grounding
  Paper • 2402.16671 • Published • 26
- Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model
  Paper • 2404.04167 • Published • 12
---
- BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text
  Paper • 2403.18421 • Published • 22
- Long-form factuality in large language models
  Paper • 2403.18802 • Published • 24
- stanford-crfm/BioMedLM
  Text Generation • Updated • 2.93k • 394
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
  Paper • 2305.18290 • Published • 48
---
- MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs
  Paper • 2402.15627 • Published • 34
- Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
  Paper • 2402.17177 • Published • 88
- Beyond Language Models: Byte Models are Digital World Simulators
  Paper • 2402.19155 • Published • 49
- Hydragen: High-Throughput LLM Inference with Shared Prefixes
  Paper • 2402.05099 • Published • 18