Lysandre

lysandre

http://lysand.re

AI & ML interests

chief open-source officer @ hf

Recent Activity

liked a model 4 days ago

black-forest-labs/FLUX.1-dev

Articles

Fixing Gradient Accumulation

License to Call: Introducing Transformers Agents 2.0

We are hiring interns!

Hugging Face on PyTorch / XLA TPUs

lysandre's activity

upvoted a paper 29 days ago

Lina-Speech: Gated Linear Attention is a Fast and Parameter-Efficient Learner for text-to-speech synthesis

Paper • 2410.23320 • Published Oct 30 • 6

upvoted an article about 1 month ago

Article

Transformers.js v3: WebGPU support, new models & tasks, and more…

Oct 22

• 65

upvoted an article about 2 months ago

Article

Tool Use, Unified

Aug 12

• 65

upvoted a collection 3 months ago

Llama3-8B-1.58

A trio of powerful models: fine-tuned from Llama3-8b-Instruct, with BitNet architecture! • 3 items • Updated Sep 14 • 12

upvoted 2 articles 3 months ago

Article

Fine-tuning LLMs to 1.58bit: extreme quantization made easy

Sep 18

• 205

Article

Don't repeat yourself - 🤗 Transformers Design Philosophy

Apr 5, 2022

• 12

upvoted 3 articles 4 months ago

Article

MobileNet Baselines

Jul 26

• 23

Article

MobileNet-V4 (now in timm)

Jun 17

• 39

Article

WWDC 24: Running Mistral 7B with Core ML

Jul 22

• 56

upvoted a collection 6 months ago

Nemotron 4 340B

Nemotron-4: open models for Synthetic Data Generation (SDG). Includes Base, Instruct, and Reward models. • 4 items • Updated about 1 month ago • 159

upvoted a collection 7 months ago

Embedding Model Datasets

A curated subset of the datasets that work out of the box with Sentence Transformers: https://huggingface.co/datasets?other=sentence-transformers • 67 items • Updated Jul 3 • 79

upvoted 2 articles 7 months ago

Article

License to Call: Introducing Transformers Agents 2.0

May 13

• 118

Article

LLM Comparison/Test: Llama 3 Instruct 70B + 8B HF/GGUF/EXL2 (20 versions tested and compared!)

Apr 24

• 59

upvoted a collection 8 months ago

Gemma release

Groups the Gemma models released by the Google team. • 40 items • Updated Jul 31 • 325

upvoted a collection 10 months ago

Canonical models

This collection lists all the historical (pre-"Hub") canonical model checkpoints, i.e. repos that were not under an org or user namespace • 68 items • Updated Feb 13 • 13

upvoted a collection 11 months ago

SigLIP

Contrastive (sigmoid) image-text models from https://arxiv.org/abs/2303.15343 • 10 items • Updated 15 days ago • 37

upvoted a paper about 1 year ago

Exponentially Faster Language Modelling

Paper • 2311.10770 • Published Nov 15, 2023 • 118

upvoted 3 collections about 1 year ago

Switch-Transformers release

This release included various MoE (Mixture of Experts) models based on the T5 architecture. The base models use from 8 to 256 experts. • 9 items • Updated Jul 31 • 15

zephyr story

sources mentioned by hf.co/thomwolf tweet: x.com/Thom_Wolf/status/1720503998518640703 • 8 items • Updated Jan 24 • 15

Distil-Whisper Models

The first version of the Distil-Whisper models released with the Distil-Whisper paper. • 4 items • Updated Mar 21 • 36