Jianshu Zhang

Sterzhang

https://sterzhang.github.io/

AI & ML interests

Data-Centric AI, Multi-Modal Understanding

Organizations

None yet

Sterzhang's activity

upvoted a paper 7 days ago

ReferEverything: Towards Segmenting Everything We Can Speak of in Videos

Paper • 2410.23287 • Published 8 days ago • 17

upvoted 4 papers 10 days ago

Can MLLMs Understand the Deep Implication Behind Chinese Images?

Paper • 2410.13854 • Published 21 days ago • 8

DreamVideo-2: Zero-Shot Subject-Driven Video Customization with Precise Motion Control

Paper • 2410.13830 • Published 21 days ago • 23

Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens

Paper • 2410.13863 • Published 21 days ago • 35

SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree

Paper • 2410.16268 • Published 17 days ago • 65

upvoted a paper 11 days ago

ROCKET-1: Master Open-World Interaction with Visual-Temporal Context Prompting

Paper • 2410.17856 • Published 15 days ago • 48

upvoted a paper 14 days ago

MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models

Paper • 2410.17637 • Published 15 days ago • 34

upvoted a paper 29 days ago

Personalized Visual Instruction Tuning

Paper • 2410.07113 • Published 29 days ago • 69

upvoted a paper 5 months ago

Image Textualization: An Automatic Framework for Creating Accurate and Detailed Image Descriptions

Paper • 2406.07502 • Published Jun 11 • 1