Out of memory when running on a single 4090.
It's not enough even with the bfloat16 format.
OutOfMemoryError: CUDA out of memory. Tried to allocate 1.48 GiB. GPU 0 has a total capacity of 23.64 GiB of which 688.12 MiB is free. Process 2629 has 818.00 MiB memory in use. Including non-PyTorch memory, this process has 22.16 GiB memory in use. Of the allocated memory 21.29 GiB is allocated by PyTorch, and 429.57 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
pipe.enable_model_cpu_offload() doesn't seem to work either.
Try updating to the current GitHub code and diffusers; that solved it for me, 23.9 GB in total.
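(For reference, installing diffusers from the GitHub main branch is normally done with pip's install-from-source command; the exact install steps are not stated in this thread.)
pip install git+https://github.com/huggingface/diffusers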
Do I need to install PyTorch 2.4.0?
2.2 works too; 2.2, 2.3, and 2.4 are all fine.
pipe.enable_model_cpu_offload()
It doesn't seem to take effect.
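(A minimal sketch of how the offload call is usually wired up; the model id and dtype below are placeholders, not taken from this thread. A common reason it looks like it has no effect is also calling pipe.to("cuda"), which moves the whole pipeline onto the GPU and defeats the offload.)
import torch
from diffusers import DiffusionPipeline
# placeholder model id, for illustration only
pipe = DiffusionPipeline.from_pretrained("your/model-id", torch_dtype=torch.bfloat16)
# keeps submodules on the CPU and moves each one to the GPU only while it runs;
# do not also call pipe.to("cuda"), which would keep everything resident in VRAM
pipe.enable_model_cpu_offload()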
Does it still fail when running with PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True? It uses 23.9 GB in total, so make sure nothing else is currently occupying your GPU.
At which step does it run out of memory?
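(A minimal sketch of setting that variable from inside a Python script; it is read when PyTorch's CUDA allocator initializes, so exporting it in the shell or setting it before importing torch is the safe option.)
import os
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"  # set before any CUDA allocation
import torch  # imported after the variable is in place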
The load stage alone takes 36 GB. I set pipe.enable_model_cpu_offload() but it had no effect; diffusers is version 0.30.0.dev.
The load stage? How are you running the code? Maybe update to the current stable version.
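(One possible cause of a roughly 36 GB footprint at load time, an assumption on my part rather than something confirmed above, is loading the checkpoint in float32; passing torch_dtype keeps the weights in bfloat16 and roughly halves that. The model id is again a placeholder.)
import torch
from diffusers import DiffusionPipeline
# without torch_dtype the weights are typically loaded in float32,
# roughly double the memory of a bfloat16 load; "your/model-id" is a placeholder
pipe = DiffusionPipeline.from_pretrained("your/model-id", torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()  # use this instead of pipe.to("cuda")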