Closed
Description
Describe the bug
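- Flux with the torchao int8wo quantization config does not work: the script stalls after the pipeline components finish loading.
- enable_sequential_cpu_offload() does not work: it raises AttributeError: 'AffineQuantizedTensor' object has no attribute 'layout_tensor'.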
Reproduction
Example taken from #10009 (merged):
import torch
from diffusers import FluxPipeline, FluxTransformer2DModel, TorchAoConfig

model_id = "black-forest-labs/Flux.1-Dev"
dtype = torch.bfloat16

quantization_config = TorchAoConfig("int8wo")
transformer = FluxTransformer2DModel.from_pretrained(
    model_id,
    subfolder="transformer",
    quantization_config=quantization_config,
    torch_dtype=dtype,
)

pipe = FluxPipeline.from_pretrained(
    model_id,
    transformer=transformer,
    torch_dtype=dtype,
)
# pipe.to("cuda")
# pipe.enable_sequential_cpu_offload()
pipe.vae.enable_tiling()

prompt = "A cat holding a sign that says hello world"
image = pipe(prompt, num_inference_steps=4, guidance_scale=0.0).images[0]
image.save("output.png")
Logs
Without CPU offload, the script gets stuck at this point:
(venv) C:\ai1\diffuser_t2i>python FLUX_torchao.py
Fetching 3 files: 100%|█████████████████████████████████████████████████████| 3/3 [00:00<?, ?it/s]
Loading pipeline components...: 29%|████████▊ | 2/7 [00:00<00:00, 5.36it/s]You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers
Loading checkpoint shards: 100%|████████████████████████████████████| 2/2 [00:00<00:00, 6.86it/s]
Loading pipeline components...: 100%|███████████████████████████████| 7/7 [00:02<00:00, 2.38it/s]
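One way to find out where a run like this is stalled is to have Python dump every thread's stack at a fixed interval. The following is a minimal sketch using the standard-library faulthandler module; the hang_watchdog helper name and the 60-second interval are illustrative assumptions, not part of the original script.

```python
import contextlib
import faulthandler
import sys


@contextlib.contextmanager
def hang_watchdog(seconds=60.0):
    """Dump all thread stacks to stderr every `seconds` while the body runs.

    Hypothetical helper for illustration; not part of diffusers or torchao.
    """
    # sys.__stderr__ is the interpreter's original stderr stream, which is
    # backed by a real file descriptor (faulthandler requires one).
    faulthandler.dump_traceback_later(seconds, repeat=True, file=sys.__stderr__)
    try:
        yield
    finally:
        # Stop the timer once the body finishes or raises.
        faulthandler.cancel_dump_traceback_later()


# Assumed usage: wrap the slow call from the reproduction script.
# with hang_watchdog(60):
#     image = pipe(prompt, num_inference_steps=4, guidance_scale=0.0).images[0]
```

If the stack dumps keep showing the same frame, that frame is a reasonable first suspect; if they keep changing, the run is probably slow rather than hung.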
With sequential CPU offload enabled, it fails with a traceback:
(venv) C:\ai1\diffuser_t2i>python FLUX_torchao.py
Fetching 3 files: 100%|█████████████████████████████████████████████████████| 3/3 [00:00<?, ?it/s]
Loading checkpoint shards: 100%|████████████████████████████████████| 2/2 [00:00<00:00, 6.98it/s]
Loading pipeline components...: 29%|████████▊ | 2/7 [00:00<00:01, 2.62it/s]You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers
Loading pipeline components...: 100%|███████████████████████████████| 7/7 [00:01<00:00, 4.31it/s]
Traceback (most recent call last):
  File "C:\ai1\diffuser_t2i\FLUX_torchao.py", line 21, in <module>
    pipe.enable_sequential_cpu_offload()
  File "C:\ai1\diffuser_t2i\venv\lib\site-packages\diffusers\pipelines\pipeline_utils.py", line 1179, in enable_sequential_cpu_offload
    cpu_offload(model, device, offload_buffers=offload_buffers)
  File "C:\ai1\diffuser_t2i\venv\lib\site-packages\accelerate\big_modeling.py", line 205, in cpu_offload
    attach_align_device_hook(
  File "C:\ai1\diffuser_t2i\venv\lib\site-packages\accelerate\hooks.py", line 518, in attach_align_device_hook
    attach_align_device_hook(
  File "C:\ai1\diffuser_t2i\venv\lib\site-packages\accelerate\hooks.py", line 518, in attach_align_device_hook
    attach_align_device_hook(
  File "C:\ai1\diffuser_t2i\venv\lib\site-packages\accelerate\hooks.py", line 518, in attach_align_device_hook
    attach_align_device_hook(
  File "C:\ai1\diffuser_t2i\venv\lib\site-packages\accelerate\hooks.py", line 509, in attach_align_device_hook
    add_hook_to_module(module, hook, append=True)
  File "C:\ai1\diffuser_t2i\venv\lib\site-packages\accelerate\hooks.py", line 161, in add_hook_to_module
    module = hook.init_hook(module)
  File "C:\ai1\diffuser_t2i\venv\lib\site-packages\accelerate\hooks.py", line 308, in init_hook
    set_module_tensor_to_device(module, name, "meta")
  File "C:\ai1\diffuser_t2i\venv\lib\site-packages\accelerate\utils\modeling.py", line 355, in set_module_tensor_to_device
    new_value.layout_tensor,
AttributeError: 'AffineQuantizedTensor' object has no attribute 'layout_tensor'
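The last frame shows accelerate reading a layout_tensor attribute that the installed torchao tensor does not expose, which suggests a mismatch between the two package versions. Assuming the attribute was renamed between torchao releases (tensor_impl is used below as the newer name, which is an assumption to verify against the torchao source), a version-tolerant lookup could be sketched as follows. The get_packed_data helper is hypothetical, and the two stand-in classes exist only so the example runs without torchao installed.

```python
def get_packed_data(tensor, candidates=("tensor_impl", "layout_tensor")):
    """Return the first attribute in `candidates` that `tensor` exposes.

    Hypothetical helper; the attribute names are assumptions about
    different torchao releases, not a confirmed API.
    """
    for name in candidates:
        if hasattr(tensor, name):
            return getattr(tensor, name)
    raise AttributeError(
        f"{type(tensor).__name__!r} object has none of {candidates}"
    )


class NewStyleTensor:
    """Stand-in for a quantized tensor that uses the assumed newer name."""
    tensor_impl = "packed-int8-data"


class OldStyleTensor:
    """Stand-in for a quantized tensor that uses the name accelerate expects."""
    layout_tensor = "packed-int8-data"


print(get_packed_data(NewStyleTensor()))
print(get_packed_data(OldStyleTensor()))
```

This only illustrates the shape of a possible compatibility shim; whether the real fix belongs in accelerate, diffusers, or a pinned torchao version is not something this report establishes.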
System Info
Windows 11
(venv) C:\ai1\diffuser_t2i>python --version
Python 3.10.11
(venv) C:\ai1\diffuser_t2i>echo %CUDA_PATH%
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6
(venv) C:\ai1\diffuser_t2i>pip list
Package Version
------------------ ------------
accelerate 1.2.0.dev0
aiofiles 23.2.1
annotated-types 0.7.0
anyio 4.7.0
bitsandbytes 0.45.0
certifi 2024.12.14
charset-normalizer 3.4.1
click 8.1.8
colorama 0.4.6
diffusers 0.33.0.dev0
einops 0.8.0
exceptiongroup 1.2.2
fastapi 0.115.6
ffmpy 0.5.0
filelock 3.16.1
fsspec 2024.12.0
gguf 0.13.0
gradio 5.9.1
gradio_client 1.5.2
h11 0.14.0
httpcore 1.0.7
httpx 0.28.1
huggingface-hub 0.25.2
idna 3.10
imageio 2.36.1
imageio-ffmpeg 0.5.1
importlib_metadata 8.5.0
Jinja2 3.1.5
markdown-it-py 3.0.0
MarkupSafe 2.1.5
mdurl 0.1.2
mpmath 1.3.0
networkx 3.4.2
ninja 1.11.1.3
numpy 2.2.1
opencv-python 4.10.0.84
orjson 3.10.13
packaging 24.2
pandas 2.2.3
pillow 11.1.0
pip 23.0.1
protobuf 5.29.2
psutil 6.1.1
pydantic 2.10.4
pydantic_core 2.27.2
pydub 0.25.1
Pygments 2.18.0
python-dateutil 2.9.0.post0
python-multipart 0.0.20
pytz 2024.2
PyYAML 6.0.2
regex 2024.11.6
requests 2.32.3
rich 13.9.4
ruff 0.8.6
safehttpx 0.1.6
safetensors 0.5.0
semantic-version 2.10.0
sentencepiece 0.2.0
setuptools 65.5.0
shellingham 1.5.4
six 1.17.0
sniffio 1.3.1
starlette 0.41.3
sympy 1.13.1
tokenizers 0.21.0
tomlkit 0.13.2
torch 2.5.1+cu124
torchao 0.7.0
torchvision 0.20.1+cu124
tqdm 4.67.1
transformers 4.47.1
typer 0.15.1
typing_extensions 4.12.2
tzdata 2024.2
urllib3 2.3.0
uvicorn 0.34.0
websockets 14.1
wheel 0.45.1
zipp 3.21.0
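Only a handful of the packages above bear on this bug. As a sketch, the relevant versions can be collected with the standard-library importlib.metadata module; the relevant_versions helper and the choice of package names are illustrative assumptions.

```python
from importlib import metadata

# Packages that appear in the reproduction script or the traceback.
RELEVANT = ("accelerate", "diffusers", "torch", "torchao", "transformers")


def relevant_versions(packages=RELEVANT):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in relevant_versions().items():
        print(f"{name:<14}{version or 'not installed'}")
```

Pasting this shorter output alongside a traceback makes version mismatches, such as accelerate 1.2.0.dev0 against torchao 0.7.0 here, easier to spot.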