Models
Datasets
Spaces
Posts
Docs
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2402.14547

OmniPred: Language Models as Universal Regressors

Paper • 2402.14547 • Published Feb 22 • 11

Weird LLM Ability Papers

OmniPred: Language Models as Universal Regressors

Paper • 2402.14547 • Published Feb 22 • 11
Scaling Up LLM Reviews for Google Ads Content Moderation

Paper • 2402.14590 • Published Feb 7 • 7

OmniPred: Language Models as Universal Regressors

Paper • 2402.14547 • Published Feb 22 • 11

Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping

Paper • 2402.14083 • Published Feb 21 • 47
OmniPred: Language Models as Universal Regressors

Paper • 2402.14547 • Published Feb 22 • 11

LLM - general ideas

OmniPred: Language Models as Universal Regressors

Paper • 2402.14547 • Published Feb 22 • 11
InternLM2 Technical Report

Paper • 2403.17297 • Published Mar 26 • 29

TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization

Paper • 2402.13249 • Published Feb 20 • 10
The FinBen: An Holistic Financial Benchmark for Large Language Models

Paper • 2402.12659 • Published Feb 20 • 16
Instruction-tuned Language Models are Better Knowledge Learners

Paper • 2402.12847 • Published Feb 20 • 24
Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models

Paper • 2402.13064 • Published Feb 20 • 46

DocLLM: A layout-aware generative language model for multimodal document understanding

Paper • 2401.00908 • Published Dec 31, 2023 • 181
Learning Vision from Models Rivals Learning Vision from Data

Paper • 2312.17742 • Published Dec 28, 2023 • 15
PanGu-π: Enhancing Language Model Architectures via Nonlinearity Compensation

Paper • 2312.17276 • Published Dec 27, 2023 • 15
Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache

Paper • 2401.02669 • Published Jan 5 • 14

DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models

Paper • 2309.14509 • Published Sep 25, 2023 • 17
LLM Augmented LLMs: Expanding Capabilities through Composition

Paper • 2401.02412 • Published Jan 4 • 36
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models

Paper • 2401.06066 • Published Jan 11 • 42
Tuning Language Models by Proxy

Paper • 2401.08565 • Published Jan 16 • 20

MADLAD-400: A Multilingual And Document-Level Large Audited Dataset

Paper • 2309.04662 • Published Sep 9, 2023 • 22
Neurons in Large Language Models: Dead, N-gram, Positional

Paper • 2309.04827 • Published Sep 9, 2023 • 16
Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs

Paper • 2309.05516 • Published Sep 11, 2023 • 9
DrugChat: Towards Enabling ChatGPT-Like Capabilities on Drug Molecule Graphs

Paper • 2309.03907 • Published May 18, 2023 • 8

Company

© Hugging Face

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs