hysts's picture

hysts

hysts

·

AI & ML interests

Computer Vision

Recent Activity

New activity about 10 hours ago

lemonaddie/geowizard:Apply for community grant: Academic project (gpu and storage)

liked a Space about 14 hours ago

prs-eth/rollingdepth

New activity about 16 hours ago

SimpleBerry/LLaMA-O1-Supervised-1129-Demo:Apply for community grant: Academic project (gpu)

View all activity

Articles

Introduction to the Open Leaderboard for Japanese LLMs

Efficient Controllable Generation for SDXL with T2I-Adapters

Organizations

hysts's activity

upvoted a collection 2 months ago

Moshi v0.1 Release

MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated Sep 18 • 222

upvoted a paper 6 months ago

An Introduction to Vision-Language Modeling

Paper • 2405.17247 • Published May 27 • 86

upvoted a paper 9 months ago

LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens

Paper • 2402.13753 • Published Feb 21 • 112

upvoted 17 papers about 1 year ago

ChatAnything: Facetime Chat with LLM-Enhanced Personas

Paper • 2311.06772 • Published Nov 12, 2023 • 34

Music ControlNet: Multiple Time-varying Controls for Music Generation

Paper • 2311.07069 • Published Nov 13, 2023 • 43

Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models

Paper • 2311.06783 • Published Nov 12, 2023 • 26

I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models

Paper • 2311.04145 • Published Nov 7, 2023 • 32

Learning From Mistakes Makes LLM Better Reasoner

Paper • 2310.20689 • Published Oct 31, 2023 • 28

CapsFusion: Rethinking Image-Text Data at Scale

Paper • 2310.20550 • Published Oct 31, 2023 • 25

Battle of the Backbones: A Large-Scale Comparison of Pretrained Models across Computer Vision Tasks

Paper • 2310.19909 • Published Oct 30, 2023 • 20

VideoCrafter1: Open Diffusion Models for High-Quality Video Generation

Paper • 2310.19512 • Published Oct 30, 2023 • 15

MM-VID: Advancing Video Understanding with GPT-4V(ision)

Paper • 2310.19773 • Published Oct 30, 2023 • 19

CodeFusion: A Pre-trained Diffusion Model for Code Generation

Paper • 2310.17680 • Published Oct 26, 2023 • 69

Wonder3D: Single Image to 3D using Cross-Domain Diffusion

Paper • 2310.15008 • Published Oct 23, 2023 • 21

LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing

Paper • 2311.00571 • Published Nov 1, 2023 • 40

Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling

Paper • 2311.00430 • Published Nov 1, 2023 • 57

An Early Evaluation of GPT-4V(ision)

Paper • 2310.16534 • Published Oct 25, 2023 • 21

A Picture is Worth a Thousand Words: Principled Recaptioning Improves Image Generation

Paper • 2310.16656 • Published Oct 25, 2023 • 40

DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior

Paper • 2310.16818 • Published Oct 25, 2023 • 30

Matryoshka Diffusion Models

Paper • 2310.15111 • Published Oct 23, 2023 • 40