- FMViT: A multiple-frequency mixing Vision Transformer
  Paper • 2311.05707 • Published • 5
- DocLLM: A layout-aware generative language model for multimodal document understanding
  Paper • 2401.00908 • Published • 181
- LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report
  Paper • 2405.00732 • Published • 118
- An Introduction to Vision-Language Modeling
  Paper • 2405.17247 • Published • 85
Fleurentin
Zeros66