Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models Paper • 2411.14432 • Published Nov 21 • 21
Video Understanding with Large Language Models: A Survey Paper • 2312.17432 • Published Dec 29, 2023 • 2
DNAGPT: A Generalized Pretrained Tool for Multiple DNA Sequence Analysis Tasks Paper • 2307.05628 • Published Jul 11, 2023 • 9
Cross Contrasting Feature Perturbation for Domain Generalization Paper • 2307.12502 • Published Jul 24, 2023
Emo-Avatar: Efficient Monocular Video Style Avatar through Texture Rendering Paper • 2402.00827 • Published Feb 1 • 2