- SVGDreamer: Text Guided SVG Generation with Diffusion Model
  Paper • 2312.16476 • Published
- DiffSketcher: Text Guided Vector Sketch Synthesis through Latent Diffusion Models
  Paper • 2306.14685 • Published • 1
- Beyond Pixels: Exploring Human-Readable SVG Generation for Simple Images with Vision Language Models
  Paper • 2311.15543 • Published
- StarVector: Generating Scalable Vector Graphics Code from Images
  Paper • 2312.11556 • Published • 27
Collections including paper arxiv:2401.17093
- Learning Universal Predictors
  Paper • 2401.14953 • Published • 19
- Anything in Any Scene: Photorealistic Video Object Insertion
  Paper • 2401.17509 • Published • 16
- SymbolicAI: A framework for logic-based approaches combining generative models and solvers
  Paper • 2402.00854 • Published • 19
- StrokeNUWA: Tokenizing Strokes for Vector Graphic Synthesis
  Paper • 2401.17093 • Published • 19

- TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones
  Paper • 2312.16862 • Published • 30
- Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action
  Paper • 2312.17172 • Published • 26
- Towards Truly Zero-shot Compositional Visual Reasoning with LLMs as Programmers
  Paper • 2401.01974 • Published • 5
- From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations
  Paper • 2401.01885 • Published • 27

- Leveraging Large Language Models for Scalable Vector Graphics-Driven Image Understanding
  Paper • 2306.06094 • Published • 1
- IconShop: Text-Guided Vector Icon Synthesis with Autoregressive Transformers
  Paper • 2304.14400 • Published • 4
- VecFusion: Vector Font Generation with Diffusion
  Paper • 2312.10540 • Published • 21
- StrokeNUWA: Tokenizing Strokes for Vector Graphic Synthesis
  Paper • 2401.17093 • Published • 19

- Kosmos-2.5: A Multimodal Literate Model
  Paper • 2309.11419 • Published • 50
- Mirasol3B: A Multimodal Autoregressive model for time-aligned and contextual modalities
  Paper • 2311.05698 • Published • 9
- Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks
  Paper • 2311.06242 • Published • 84
- PolyMaX: General Dense Prediction with Mask Transformer
  Paper • 2311.05770 • Published • 6