I will be giving a tutorial at AAAI 2025! Quite excited to share the recent advancements in the field and my contributions to it! Stay tuned for more updates. Link: https://x.com/EzgiKorkmazAI/status/1854525141897671111
SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation Paper • 2411.04989 • Published 23 days ago • 14
DynaMem: Online Dynamic Spatio-Semantic Memory for Open World Mobile Manipulation Paper • 2411.04999 • Published 23 days ago • 16
M3SciQA: A Multi-Modal Multi-Document Scientific QA Benchmark for Evaluating Foundation Models Paper • 2411.04075 • Published 24 days ago • 15
M3DocRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding Paper • 2411.04952 • Published 23 days ago • 27
Diff-2-in-1: Bridging Generation and Dense Perception with Diffusion Models Paper • 2411.05005 • Published 23 days ago • 13
DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion Paper • 2411.04928 • Published 23 days ago • 48
Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models Paper • 2411.04996 • Published 23 days ago • 48
TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation Paper • 2411.04709 • Published 25 days ago • 25
ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning Paper • 2411.05003 • Published 23 days ago • 69
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models Paper • 2411.04905 • Published 23 days ago • 109