Generate a video based on a text prompt using Mochi
Text-to-Video
Inference with 4-bit or 8-bit quantization