Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization Paper • 2411.10442 • Published 12 days ago • 60
The Dawn of GUI Agent: A Preliminary Case Study with Claude 3.5 Computer Use Paper • 2411.10323 • Published 12 days ago • 27
Vision-Language Models Can Self-Improve Reasoning via Reflection Paper • 2411.00855 • Published 29 days ago • 4
OS-ATLAS: A Foundation Action Model for Generalist GUI Agents Paper • 2410.23218 • Published 28 days ago • 46
AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant Paper • 2410.18603 • Published Oct 24 • 30
MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation? Paper • 2407.04842 • Published Jul 5 • 52
Interactive Evolution: A Neural-Symbolic Self-Training Framework For Large Language Models Paper • 2406.11736 • Published Jun 17 • 5
LLaMAX: Scaling Linguistic Horizons of LLM by Enhancing Translation Capabilities Beyond 100 Languages Paper • 2407.05975 • Published Jul 8 • 34
Symbol-LLM: Towards Foundational Symbol-centric Interface For Large Language Models Paper • 2311.09278 • Published Nov 15, 2023 • 7