ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference • Paper • arXiv:2410.21465 • Published 12 days ago • 9 upvotes
COAT: Compressing Optimizer states and Activation for Memory-Efficient FP8 Training • Paper • arXiv:2410.19313 • Published 16 days ago • 18 upvotes
PHI-S: Distribution Balancing for Label-Free Multi-Teacher Distillation • Paper • arXiv:2410.01680 • Published Oct 2 • 32 upvotes
Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction • Paper • arXiv:2409.18124 • Published Sep 26 • 31 upvotes
MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models • Paper • arXiv:2409.17481 • Published Sep 26 • 46 upvotes
Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction • Paper • arXiv:2409.17422 • Published Sep 25 • 24 upvotes
Llama 3.2 Collection • This collection hosts the transformers-format and original repos of the Llama 3.2 and Llama Guard 3 models (a loading sketch follows below) • 15 items • Updated 17 days ago • 453 upvotes
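Since the collection hosts transformers-format repos, they can be loaded with the standard Transformers API. A minimal sketch, assuming the 1B base repo id; the Llama 3.2 repos are gated, so an access-granted Hugging Face token is required:

```python
# Minimal sketch: loading a transformers-format Llama 3.2 repo from this collection.
# The repo id below is an assumption (any model in the collection works the same way);
# log in first with an access-granted token, e.g. `huggingface-cli login`.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-3.2-1B"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("The three primary colors are", return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```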
MagpieLM Collection • Aligning LMs with a fully open recipe (data + training configs + logs) • 9 items • Updated Sep 22 • 15 upvotes
RADIO Collection • A collection of foundation vision models that combine multiple teacher models (CLIP, DINOv2, SAM, etc.) • 3 items • Updated Oct 1 • 5 upvotes
Nemotron in vLLM Collection • Nemotron models that have been converted and/or quantized to work well in vLLM (see the sketch below) • 7 items • Updated Jul 25 • 1 upvote
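These checkpoints are intended to be served with vLLM. A minimal sketch of vLLM's offline API, where the repo id is only a placeholder for whichever model in the collection you pick:

```python
# Minimal sketch: serving a converted/quantized Nemotron checkpoint with vLLM's
# offline API. The repo id below is a placeholder, not a specific model from the
# collection; substitute any checkpoint listed there.
from vllm import LLM, SamplingParams

llm = LLM(model="nvidia/nemotron-checkpoint-placeholder")  # placeholder repo id
params = SamplingParams(temperature=0.7, max_tokens=128)

outputs = llm.generate(["Explain structured pruning in two sentences."], params)
print(outputs[0].outputs[0].text)
```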
Show-o: One Single Transformer to Unify Multimodal Understanding and Generation • Paper • arXiv:2408.12528 • Published Aug 22 • 50 upvotes
LLM Pruning and Distillation in Practice: The Minitron Approach • Paper • arXiv:2408.11796 • Published Aug 21 • 53 upvotes
Compact Language Models via Pruning and Knowledge Distillation • Paper • arXiv:2407.14679 • Published Jul 19 • 37 upvotes
🪐 SmolLM Collection • A series of smol LLMs (135M, 360M, and 1.7B); we release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated Aug 18 • 192 upvotes
Minitron Collection • A family of compressed models obtained via pruning and knowledge distillation • 9 items • Updated Oct 3 • 59 upvotes