I'm using these flags:
python gradio_server.py --i2v --profile 5 --attention xformers --precision fp16 --server-name 127.0.0.1 --open-browser
But it still returns this error:
NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs:
     query     : shape=(1, 49140, 24, 128) (torch.float32)
     key       : shape=(1, 49140, 24, 128) (torch.float32)
     value     : shape=(1, 49140, 24, 128) (torch.float32)
     attn_bias : <class 'xformers.ops.fmha.attn_bias.BlockDiagonalPaddedKeysMask'>
     p         : 0.0
`[email protected]` is not supported because:
    requires device with capability > (9, 0) but your GPU has capability (7, 5) (too old)
    dtype=torch.float32 (supported: {torch.float16, torch.bfloat16})
    operator wasn't built - see `python -m xformers.info` for more info
`[email protected]` is not supported because:
    requires device with capability > (8, 0) but your GPU has capability (7, 5) (too old)
    dtype=torch.float32 (supported: {torch.float16, torch.bfloat16})
`cutlassF-pt` is not supported because:
    attn_bias type is <class 'xformers.ops.fmha.attn_bias.BlockDiagonalPaddedKeysMask'>
It's clearly an old-GPU issue. Can you suggest a config to use, or add code to support older GPUs? Or should I bring this to the Hyvideo team's repo if it's out of the scope of your optimizations?
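For reference, the trace points at two independent blockers: the tensors are torch.float32 while the xformers kernels only accept fp16/bf16, and the flash operators require compute capability 8.0/9.0+, while a 2080 Ti reports (7, 5); the remaining cutlassF path then rejects the BlockDiagonalPaddedKeysMask bias. On a 7.5 card the practical route is PyTorch's scaled_dot_product_attention, which is what the reply below selects instead of xformers. Below is a minimal sketch of the kind of fallback being asked for, assuming plain (batch, seq_len, heads, head_dim) tensors; `attention_fallback` is a hypothetical helper, not a function from this repo, and it does not translate the block-diagonal bias.

# Minimal sketch, not part of the repo: use xformers memory-efficient attention
# when it can actually serve the request, otherwise fall back to PyTorch SDPA.
# Assumes q, k, v are (batch, seq_len, heads, head_dim), as in the trace above.
import torch
import torch.nn.functional as F

try:
    import xformers.ops as xops
except ImportError:
    xops = None


def attention_fallback(q, k, v, attn_bias=None, p=0.0):
    """Hypothetical helper: prefer xformers, fall back to PyTorch SDPA on old GPUs or fp32 inputs."""
    old_gpu = (not torch.cuda.is_available()) or torch.cuda.get_device_capability()[0] < 8  # 2080 Ti reports (7, 5)
    fp32_inputs = q.dtype == torch.float32  # the xformers kernels only accept fp16/bf16

    if xops is not None and not old_gpu and not fp32_inputs:
        return xops.memory_efficient_attention(q, k, v, attn_bias=attn_bias, p=p)

    # Fallback: PyTorch SDPA expects (batch, heads, seq_len, head_dim).
    # NOTE: the BlockDiagonalPaddedKeysMask from the trace is NOT handled here;
    # it would have to be converted to a dense attn_mask first.
    q, k, v = (t.transpose(1, 2) for t in (q, k, v))
    out = F.scaled_dot_product_attention(q, k, v, dropout_p=p)
    return out.transpose(1, 2)

Casting q/k/v to fp16 before the xformers call would clear the dtype complaint, but per the trace the flash kernels still need capability 8.0+, so on a 7.5 GPU the SDPA path is the realistic fix.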
I used sdpa instead of xformers, didn't compile, and didn't enable on-the-fly quantization. I used profile 2 and got it running on my lone 2080 Ti (11 GB). My machine has a 13th-gen i9 CPU and 128 GB of RAM; I run a PyTorch NGC 24.05 Docker container on a headless Ubuntu 22.04 server with PyTorch 2.6, Python 3.10, and CUDA 12.4. I still ran into GPU memory issues, so the simplest way to get it running is a smaller resolution (I just upscale later with another deep-learning computer-vision model), keeping the model's aspect-ratio scaling (4:3 / 3:4). I implemented that logic in gradio_server.py myself, since the smaller resolutions aren't in the hardcoded options. At 512x288, one 2080 Ti (11 GB) and 128 GB of RAM can generate 33 frames at a time; one video takes ~12 minutes on my end.
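Since the smaller resolutions aren't among the hardcoded options, the change boils down to appending a few scaled-down presets to whatever list gradio_server.py feeds its resolution dropdown. A minimal sketch, assuming a Gradio Dropdown and made-up preset values; the repo's actual list and variable names will differ.

# Sketch only: append low-VRAM presets to the resolution dropdown.
# The preset list and variable names below are assumptions, not the repo's real ones.
import gradio as gr

def snap16(x):
    """Round a dimension down to a multiple of 16, which video VAEs typically require."""
    return max(16, (int(x) // 16) * 16)

base_presets = [(1280, 720), (960, 544), (720, 1280)]   # placeholder values
low_vram_scales = [0.5, 0.4]                            # 1280x720 * 0.4 -> 512x288, as used above

presets = list(base_presets)
for w, h in base_presets:
    for s in low_vram_scales:
        presets.append((snap16(w * s), snap16(h * s)))

# Deduplicate and sort from largest to smallest pixel count.
choices = sorted({f"{w}x{h}" for w, h in presets},
                 key=lambda r: -int(r.split("x")[0]) * int(r.split("x")[1]))
resolution = gr.Dropdown(choices=choices, value="512x288",
                         label="Resolution (width x height)")

The rest of the pipeline would then just receive the parsed width/height as usual; the only repo-specific part is where the choices list actually lives in gradio_server.py.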