NAN

nan1248

AI & ML interests

None yet

Recent Activity

New activity 8 days ago

BAAI/Infinity-MM: Is the dataset entirely single-image?

liked a dataset 12 days ago

allenai/pixmo-docs

upvoted a paper 15 days ago

Large Language Model-Brained GUI Agents: A Survey

Organizations

None yet

nan1248's activity

New activity in BAAI/Infinity-MM 8 days ago

Is the dataset entirely single-image?

#6 opened 8 days ago by nan1248

liked a dataset 12 days ago

allenai/pixmo-docs

Viewer • Updated 8 days ago • 255k • 3.62k • 17

upvoted a paper 15 days ago

Large Language Model-Brained GUI Agents: A Survey

Paper • 2411.18279 • Published 16 days ago • 26

upvoted a collection 16 days ago

PixMo

Collection

A set of vision-language datasets built by Ai2 and used to train the Molmo family of models. Read more at https://molmo.allenai.org/blog • 9 items • Updated 16 days ago • 45

liked a dataset 18 days ago

MosRat2333/ZhEn-latex-ocr

Viewer • Updated Sep 7 • 151k • 105 • 3

liked a dataset 26 days ago

mllmTeam/MobileViews

Updated Nov 12 • 1.7k • 13

liked a dataset about 1 month ago

argilla/Synth-APIGen-v0.1

Viewer • Updated Oct 10 • 49.4k • 246 • 49

upvoted a paper about 1 month ago

OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models

Paper • 2411.04905 • Published Nov 7 • 110

liked a dataset about 1 month ago

opencsg/chinese-fineweb-edu-v2

Viewer • Updated Oct 26 • 188M • 6.48k • 50

liked a dataset about 2 months ago

BAAI/Infinity-MM

Updated about 21 hours ago • 41.7k • 80

liked a model about 2 months ago

microsoft/OmniParser

Image-Text-to-Text • Updated 11 days ago • 8.88k • 1.47k

liked a dataset about 2 months ago

AIDC-AI/Ovis-dataset

Preview • Updated Sep 16 • 1.45k • 22

liked a model 2 months ago

allenai/Molmo-7B-D-0924

Image-Text-to-Text • Updated Oct 10 • 161k • 456

liked a dataset 2 months ago

recursal/SuperWikiImage-7M

Updated Oct 7 • 863 • 20

upvoted a paper 3 months ago

InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning

Paper • 2409.12568 • Published Sep 19 • 47

upvoted 3 papers 4 months ago

Building and better understanding vision-language models: insights and future directions

Paper • 2408.12637 • Published Aug 22 • 123

LongVILA: Scaling Long-Context Visual Language Models for Long Videos

Paper • 2408.10188 • Published Aug 19 • 51

xGen-MM (BLIP-3): A Family of Open Large Multimodal Models

Paper • 2408.08872 • Published Aug 16 • 98

liked a Space 4 months ago

Qwen2-VL-72B

Running • 556

New activity in mlfoundations/MINT-1T-HTML 4 months ago

Request for Dataset Access

#2 opened 4 months ago by nan1248