Great work!

#2
by dollarpound - opened

Great work, folks! I hope to see this running on smaller GPUs.

It works on a single RTX 4090 with this repository: https://github.com/victorchall/genmoai-smol.

"Majestic snow-capped mountains at sunrise, with low-lying clouds drifting through deep valleys. Alpine meadows dotted with wildflowers in the foreground, crisp clean air, ray-traced lighting, ultra HD quality"

I am not able to build the wheel for flash-attn. The build uses a lot of RAM; I have reduced the number of workers to two and it now takes 23 GB of RAM, but I am not sure this will work, and it takes ages to finish.

Do you have a Docker image?

After several hours, everything seems to be in place. I am trying to run this in a Docker container, but it seems to get stuck at:

(T2VSynthMochiModel pid=13954) Timing load_text_encs
(T2VSynthMochiModel pid=13954) Timing load_vae
(T2VSynthMochiModel pid=13954) Timing construct_dit
(T2VSynthMochiModel pid=13954) Timing dit_load_checkpoint
(T2VSynthMochiModel pid=13954) Exception raised in creation task: The actor died because of an error raised in its creation task, ray::T2VSynthMochiModel.init() (pid=13954, ip=172.17.0.2, actor_id=2dddf816160cba1b1ed1177a01000000, repr=<mochi_preview.t2v_synth_mochi.T2VSynthMochiModel object at 0x7e8f068885b0>)
(T2VSynthMochiModel pid=13954) File "/workspace/genmoai-smol/src/mochi_preview/t2v_synth_mochi.py", line 289, in init
(T2VSynthMochiModel pid=13954) model.load_state_dict(load_file(dit_checkpoint_path))
(T2VSynthMochiModel pid=13954) File "/workspace/genmoai-smol/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2584, in load_state_dict
(T2VSynthMochiModel pid=13954) raise RuntimeError(
(T2VSynthMochiModel pid=13954) RuntimeError: Error(s) in loading state_dict for AsymmDiTJoint:
(T2VSynthMochiModel pid=13954) Unexpected key(s) in state_dict: "t5_y_embedder.to_kv.bias", "t5_y_embedder.to_kv.weight", "t5_y_embedder.to_out.bias", "t5_y_embedder.to_out.weight", "t5_y_embedder.to_q.bias", "t5_y_embedder.to_q.weight", "t5_yproj.bias", "t5_yproj.weight".
(T2VSynthMochiModel pid=13954) size mismatch for final_layer.linear.weight: copying a param with shape torch.Size([48, 3072]) from checkpoint, the shape in current model is torch.Size([96, 3072]).
(T2VSynthMochiModel pid=13954) size mismatch for final_layer.linear.bias: copying a param with shape torch.Size([48]) from checkpoint, the shape in current model is torch.Size([96]).
Traceback (most recent call last):
File "/workspace/genmoai-smol/venv/lib/python3.10/site-packages/gradio/routes.py", line 439, in run_predict
output = await app.get_blocks().process_api(
File "/workspace/genmoai-smol/venv/lib/python3.10/site-packages/gradio/blocks.py", line 1384, in process_api
result = await self.call_function(
File "/workspace/genmoai-smol/venv/lib/python3.10/site-packages/gradio/blocks.py", line 1089, in call_function
prediction = await anyio.to_thread.run_sync(
File "/workspace/genmoai-smol/venv/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "/workspace/genmoai-smol/venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
return await future
File "/workspace/genmoai-smol/venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
result = context.run(func, *args)
File "/workspace/genmoai-smol/venv/lib/python3.10/site-packages/gradio/utils.py", line 700, in wrapper
response = f(*args, **kwargs)
File "/workspace/genmoai-smol/src/mochi_preview/infer.py", line 99, in generate_video
load_model()
File "/workspace/genmoai-smol/src/mochi_preview/infer.py", line 59, in load_model
model = MochiWrapper(
File "/workspace/genmoai-smol/src/mochi_preview/handler.py", line 25, in init
ray.get(worker.ray_ready.remote())
File "/workspace/genmoai-smol/venv/lib/python3.10/site-packages/ray/_private/auto_init_hook.py", line 21, in auto_init_wrapper
return fn(*args, **kwargs)
File "/workspace/genmoai-smol/venv/lib/python3.10/site-packages/ray/_private/client_mode_hook.py", line 103, in wrapper
return func(*args, **kwargs)
File "/workspace/genmoai-smol/venv/lib/python3.10/site-packages/ray/_private/worker.py", line 2745, in get
values, debugger_breakpoint = worker.get_objects(object_refs, timeout=timeout)
File "/workspace/genmoai-smol/venv/lib/python3.10/site-packages/ray/_private/worker.py", line 903, in get_objects
raise value
ray.exceptions.ActorDiedError: The actor died because of an error raised in its creation task, ray::T2VSynthMochiModel.init() (pid=13954, ip=172.17.0.2, actor_id=2dddf816160cba1b1ed1177a01000000, repr=<mochi_preview.t2v_synth_mochi.T2VSynthMochiModel object at 0x7e8f068885b0>)
File "/workspace/genmoai-smol/src/mochi_preview/t2v_synth_mochi.py", line 289, in init
model.load_state_dict(load_file(dit_checkpoint_path))
File "/workspace/genmoai-smol/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2584, in load_state_dict
raise RuntimeError(
RuntimeError: Error(s) in loading state_dict for AsymmDiTJoint:
Unexpected key(s) in state_dict: "t5_y_embedder.to_kv.bias", "t5_y_embedder.to_kv.weight", "t5_y_embedder.to_out.bias", "t5_y_embedder.to_out.weight", "t5_y_embedder.to_q.bias", "t5_y_embedder.to_q.weight", "t5_yproj.bias", "t5_yproj.weight".
size mismatch for final_layer.linear.weight: copying a param with shape torch.Size([48, 3072]) from checkpoint, the shape in current model is torch.Size([96, 3072]).
size mismatch for final_layer.linear.bias: copying a param with shape torch.Size([48]) from checkpoint, the shape in current model is torch.Size([96]).
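
A report like the one above can be reproduced without loading any weights by comparing parameter names and shapes directly. The following is only a rough sketch: `diff_shapes` is a hypothetical helper, not part of mochi_preview or PyTorch, and the shape given for `t5_yproj.bias` is a placeholder because the log does not show it. With real tensors, a mapping of this kind could be built with `{k: tuple(v.shape) for k, v in state_dict.items()}`.

```python
def diff_shapes(checkpoint_shapes, model_shapes):
    """Compare two {parameter_name: shape} mappings (hypothetical helper)."""
    # Keys present in the checkpoint but absent from the model.
    unexpected = sorted(k for k in checkpoint_shapes if k not in model_shapes)
    # Keys the model expects but the checkpoint does not provide.
    missing = sorted(k for k in model_shapes if k not in checkpoint_shapes)
    # Keys present on both sides whose shapes disagree.
    mismatched = {
        k: (tuple(checkpoint_shapes[k]), tuple(model_shapes[k]))
        for k in checkpoint_shapes
        if k in model_shapes
        and tuple(checkpoint_shapes[k]) != tuple(model_shapes[k])
    }
    return unexpected, missing, mismatched


# Shapes for final_layer.* are taken from the log above.
checkpoint_shapes = {
    "final_layer.linear.weight": (48, 3072),
    "final_layer.linear.bias": (48,),
    "t5_yproj.bias": (1,),  # placeholder shape, not shown in the log
}
model_shapes = {
    "final_layer.linear.weight": (96, 3072),
    "final_layer.linear.bias": (96,),
}

unexpected, missing, mismatched = diff_shapes(checkpoint_shapes, model_shapes)
print("unexpected:", unexpected)
print("missing:", missing)
for name, (ckpt, model) in sorted(mismatched.items()):
    print(f"size mismatch for {name}: checkpoint {ckpt}, model {model}")
```

Running a check like this before calling `load_state_dict` shows whether the problem is in the checkpoint file or in how the model was constructed.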

Genmo org

@MajinAnix Your config might be incorrect, causing the checkpoint to fail to load. Make sure patch_size=2 and in_channels=12.
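
To see why those two values fit the error, here is a small sketch of the arithmetic. It assumes the final projection has `patch_size * patch_size * out_channels` output features, as is common in DiT-style models, and that `out_channels` equals `in_channels`; neither assumption is confirmed in this thread, and `final_layer_out_features` is a hypothetical helper.

```python
def final_layer_out_features(patch_size, out_channels):
    """Output width of the final linear layer under the stated assumption."""
    return patch_size * patch_size * out_channels


# Suggested config: patch_size=2, in_channels=12.
suggested = final_layer_out_features(patch_size=2, out_channels=12)
print(suggested)

# One of several configs that would give the width the failing model reports.
# The value 24 is an illustration only, not taken from the repository.
doubled = final_layer_out_features(patch_size=2, out_channels=24)
print(doubled)
```

Under that assumption the suggested config yields 48, which matches the checkpoint shape `torch.Size([48, 3072])`, while the model that failed to load was built with a config that yields 96.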
