BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games • Paper • 2411.13543 • Published 10 days ago • 17
Bielik 7B v0.1: A Polish Language Model -- Development, Insights, and Evaluation • Paper • 2410.18565 • Published Oct 24, 2024 • 42
Adaptive Computation Modules: Granular Conditional Computation For Efficient Inference • Paper • 2312.10193 • Published Dec 15, 2023 • 1
Exploiting Transformer Activation Sparsity with Dynamic Inference • Paper • 2310.04361 • Published Oct 6, 2023 • 1
MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts • Paper • 2401.04081 • Published Jan 8, 2024 • 71
Approximating Two-Layer Feedforward Networks for Efficient Transformers • Paper • 2310.10837 • Published Oct 16, 2023 • 10