OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models • arXiv:2402.01739 • Published Jan 29, 2024
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models • arXiv:2401.15947 • Published Jan 29, 2024
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models • arXiv:2401.06066 • Published Jan 11, 2024
MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts • arXiv:2401.04081 • Published Jan 8, 2024
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models • arXiv:2404.02258 • Published Apr 2, 2024