- Order Matters in the Presence of Dataset Imbalance for Multilingual Learning
  Paper • 2312.06134 • Published • 2
- Efficient Monotonic Multihead Attention
  Paper • 2312.04515 • Published • 6
- Contrastive Decoding Improves Reasoning in Large Language Models
  Paper • 2309.09117 • Published • 37
- Exploring Format Consistency for Instruction Tuning
  Paper • 2307.15504 • Published • 7
Collections
Collections including paper arxiv:2309.10668

- Chain-of-Thought Reasoning Without Prompting
  Paper • 2402.10200 • Published • 99
- How to Train Data-Efficient LLMs
  Paper • 2402.09668 • Published • 38
- BitDelta: Your Fine-Tune May Only Be Worth One Bit
  Paper • 2402.10193 • Published • 17
- A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts
  Paper • 2402.09727 • Published • 35

- Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling
  Paper • 2311.00430 • Published • 56
- SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis
  Paper • 2307.01952 • Published • 82
- Language Modeling Is Compression
  Paper • 2309.10668 • Published • 82
- Pretraining Data Mixtures Enable Narrow Model Selection Capabilities in Transformer Models
  Paper • 2311.00871 • Published • 2

- Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers
  Paper • 2309.08532 • Published • 52
- Contrastive Decoding Improves Reasoning in Large Language Models
  Paper • 2309.09117 • Published • 37
- Adapting Large Language Models via Reading Comprehension
  Paper • 2309.09530 • Published • 77
- Language Modeling Is Compression
  Paper • 2309.10668 • Published • 82