Interesting Papers - a marcelweiss Collection

Interesting Papers

updated 10 days ago

These papers are interesting (to me)

Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models

Paper • 2410.02740 • Published Oct 3 • 52
From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging

Paper • 2410.01215 • Published Oct 2 • 30
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models

Paper • 2409.17146 • Published Sep 25 • 104
EuroLLM: Multilingual Language Models for Europe

Paper • 2409.16235 • Published Sep 24 • 25
Beyond Fine-tuning: Unleashing the Potential of Continuous Pretraining for Clinical LLMs

Paper • 2409.14988 • Published Sep 23 • 21
MMSearch: Benchmarking the Potential of Large Models as Multi-modal Search Engines

Paper • 2409.12959 • Published Sep 19 • 36
Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers

Paper • 2409.04109 • Published Sep 6 • 43
Tutor CoPilot: A Human-AI Approach for Scaling Real-Time Expertise

Paper • 2410.03017 • Published Oct 3 • 26
Addition is All You Need for Energy-efficient Language Models

Paper • 2410.00907 • Published Oct 1 • 144
LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations

Paper • 2410.02707 • Published Oct 3 • 47
RevisEval: Improving LLM-as-a-Judge via Response-Adapted References

Paper • 2410.05193 • Published Oct 7 • 12
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models

Paper • 2411.04905 • Published Nov 7 • 111
ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning

Paper • 2411.05003 • Published Nov 7 • 70
DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion

Paper • 2411.04928 • Published Nov 7 • 48
DynaMem: Online Dynamic Spatio-Semantic Memory for Open World Mobile Manipulation

Paper • 2411.04999 • Published Nov 7 • 16
From Medprompt to o1: Exploration of Run-Time Strategies for Medical Challenge Problems and Beyond

Paper • 2411.03590 • Published Nov 6 • 9
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level

Paper • 2411.03562 • Published Nov 5 • 63
HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems

Paper • 2411.02959 • Published Nov 5 • 64
Zebra-Llama: A Context-Aware Large Language Model for Democratizing Rare Disease Knowledge

Paper • 2411.02657 • Published Nov 4 • 5
AndroidLab: Training and Systematic Benchmarking of Android Autonomous Agents

Paper • 2410.24024 • Published Oct 31 • 48
How Far is Video Generation from World Model: A Physical Law Perspective

Paper • 2411.02385 • Published Nov 4 • 33
Survey of Cultural Awareness in Language Models: Text and Beyond

Paper • 2411.00860 • Published Oct 30 • 23
Training-free Regional Prompting for Diffusion Transformers

Paper • 2411.02395 • Published Nov 4 • 25
DynaSaur: Large Language Agents Beyond Predefined Actions

Paper • 2411.01747 • Published Nov 4 • 18
Multi-expert Prompting Improves Reliability, Safety, and Usefulness of Large Language Models

Paper • 2411.00492 • Published Nov 1 • 6
OS-ATLAS: A Foundation Action Model for Generalist GUI Agents

Paper • 2410.23218 • Published Oct 30 • 46
DELTA: Dense Efficient Long-range 3D Tracking for any video

Paper • 2410.24211 • Published Oct 31 • 8
Precise and Dexterous Robotic Manipulation via Human-in-the-Loop Reinforcement Learning

Paper • 2410.21845 • Published Oct 29 • 12
Robots Pre-train Robots: Manipulation-Centric Robotic Representation from Large-Scale Robot Dataset

Paper • 2410.22325 • Published Oct 29 • 10
A Survey of Small Language Models

Paper • 2410.20011 • Published Oct 25 • 40
Animate-X: Universal Character Image Animation with Enhanced Motion Representation

Paper • 2410.10306 • Published Oct 14 • 54
AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant

Paper • 2410.18603 • Published Oct 24 • 32
LongReward: Improving Long-context Large Language Models with AI Feedback

Paper • 2410.21252 • Published Oct 28 • 17
Teach Multimodal LLMs to Comprehend Electrocardiographic Images

Paper • 2410.19008 • Published Oct 21 • 23
Unleashing Reasoning Capability of LLMs via Scalable Question Synthesis from Scratch

Paper • 2410.18693 • Published Oct 24 • 40
WorldSimBench: Towards Video Generation Models as World Simulators

Paper • 2410.18072 • Published Oct 23 • 18
DynamicCity: Large-Scale LiDAR Generation from Dynamic Scenes

Paper • 2410.18084 • Published Oct 23 • 13
SpectroMotion: Dynamic 3D Reconstruction of Specular Scenes

Paper • 2410.17249 • Published Oct 22 • 41
AutoTrain: No-code training for state-of-the-art models

Paper • 2410.15735 • Published Oct 21 • 58
FrugalNeRF: Fast Convergence for Few-shot Novel View Synthesis without Learned Priors

Paper • 2410.16271 • Published Oct 21 • 80
Web Agents with World Models: Learning and Leveraging Environment Dynamics in Web Navigation

Paper • 2410.13232 • Published Oct 17 • 40
MixEval-X: Any-to-Any Evaluations from Real-World Data Mixtures

Paper • 2410.13754 • Published Oct 17 • 74
MobA: A Two-Level Agent System for Efficient Mobile Task Automation

Paper • 2410.13757 • Published Oct 17 • 31
Exploring Model Kinship for Merging Large Language Models

Paper • 2410.12613 • Published Oct 16 • 19
Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free

Paper • 2410.10814 • Published Oct 14 • 48
Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts

Paper • 2410.10626 • Published Oct 14 • 37
RedPajama: an Open Dataset for Training Large Language Models

Paper • 2411.12372 • Published Nov 19 • 47
Soft Robotic Dynamic In-Hand Pen Spinning

Paper • 2411.12734 • Published Nov 19 • 9
The Dawn of GUI Agent: A Preliminary Case Study with Claude 3.5 Computer Use

Paper • 2411.10323 • Published Nov 15 • 31
Sharingan: Extract User Action Sequence from Desktop Recordings

Paper • 2411.08768 • Published Nov 13 • 10
Hermes: A Large Language Model Framework on the Journey to Autonomous Networks

Paper • 2411.06490 • Published Nov 10 • 6
Large Language Models Can Self-Improve in Long-context Reasoning

Paper • 2411.08147 • Published Nov 12 • 62
CamemBERT 2.0: A Smarter French Language Model Aged to Perfection

Paper • 2411.08868 • Published Nov 13 • 12
GRAPE: Generalizing Robot Policy via Preference Alignment

Paper • 2411.19309 • Published 29 days ago • 42
On Domain-Specific Post-Training for Multimodal Large Language Models

Paper • 2411.19930 • Published 28 days ago • 24
Reverse Thinking Makes LLMs Stronger Reasoners

Paper • 2411.19865 • Published 28 days ago • 19
Large Language Model-Brained GUI Agents: A Survey

Paper • 2411.18279 • Published 30 days ago • 27
DiffusionDrive: Truncated Diffusion Model for End-to-End Autonomous Driving

Paper • 2411.15139 • Published Nov 22 • 15
ShowUI: One Vision-Language-Action Model for GUI Visual Agent

Paper • 2411.17465 • Published about 1 month ago • 76
Star Attention: Efficient LLM Inference over Long Sequences

Paper • 2411.17116 • Published Nov 26 • 47
MH-MoE: Multi-Head Mixture-of-Experts

Paper • 2411.16205 • Published Nov 25 • 23
Patience Is The Key to Large Language Model Reasoning

Paper • 2411.13082 • Published Nov 20 • 7
SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints

Paper • 2412.07760 • Published 17 days ago • 49
SnapGen: Taming High-Resolution Text-to-Image Models for Mobile Devices with Efficient Architectures and Training

Paper • 2412.09619 • Published 15 days ago • 20
