Flux - torchao inference not working #10470

Closed
@nitinmukesh

Description

Describe the bug

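  1. Flux with torchao int8wo quantization does not run inference: the script hangs after the pipeline components finish loading.
  2. enable_sequential_cpu_offload fails with AttributeError: 'AffineQuantizedTensor' object has no attribute 'layout_tensor'.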

Reproduction

Example taken from the merged PR #10009:

import torch
from diffusers import FluxPipeline, FluxTransformer2DModel, TorchAoConfig

model_id = "black-forest-labs/Flux.1-Dev"
dtype = torch.bfloat16

quantization_config = TorchAoConfig("int8wo")
transformer = FluxTransformer2DModel.from_pretrained(
    model_id,
    subfolder="transformer",
    quantization_config=quantization_config,
    torch_dtype=dtype,
)
pipe = FluxPipeline.from_pretrained(
    model_id,
    transformer=transformer,
    torch_dtype=dtype,
)
# pipe.to("cuda")

# pipe.enable_sequential_cpu_offload()
pipe.vae.enable_tiling()

prompt = "A cat holding a sign that says hello world"
image = pipe(prompt, num_inference_steps=4, guidance_scale=0.0).images[0]
image.save("output.png")

Logs

Without CPU offload, the script hangs at this point:

(venv) C:\ai1\diffuser_t2i>python FLUX_torchao.py
Fetching 3 files: 100%|█████████████████████████████████████████████████████| 3/3 [00:00<?, ?it/s]
Loading pipeline components...:  29%|████████▊                      | 2/7 [00:00<00:00,  5.36it/s]You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers
Loading checkpoint shards: 100%|████████████████████████████████████| 2/2 [00:00<00:00,  6.86it/s]
Loading pipeline components...: 100%|███████████████████████████████| 7/7 [00:02<00:00,  2.38it/s]
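
Since the log above stops without an error, it may help to find out which call the process is sitting in. A minimal sketch using the standard-library `faulthandler` module follows; the helper names `watch_for_hang` and `stop_watching` are hypothetical, and the 120-second default is an arbitrary assumption.

```python
import faulthandler
import sys


def watch_for_hang(timeout_s: float = 120.0, repeat: bool = True, file=None) -> None:
    """Dump the stack of every thread each time `timeout_s` seconds elapse.

    Hypothetical helper: call it at the top of the script, before the
    pipeline is built, so a silent hang still produces a traceback.
    """
    target = file if file is not None else sys.stderr
    # faulthandler writes straight to the file descriptor, so `target`
    # must stay open until stop_watching() is called.
    faulthandler.dump_traceback_later(timeout_s, repeat=repeat, file=target)


def stop_watching() -> None:
    """Cancel the pending dump once the slow section has finished."""
    faulthandler.cancel_dump_traceback_later()
```

Calling `watch_for_hang()` before `pipe(...)` and `stop_watching()` after `image.save(...)` would print a stack trace to stderr every two minutes while the call is still running, which shows whether the time is going into the denoising loop or somewhere earlier.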

With CPU offload (pipe.enable_sequential_cpu_offload() uncommented):

(venv) C:\ai1\diffuser_t2i>python FLUX_torchao.py
Fetching 3 files: 100%|█████████████████████████████████████████████████████| 3/3 [00:00<?, ?it/s]
Loading checkpoint shards: 100%|████████████████████████████████████| 2/2 [00:00<00:00,  6.98it/s]
Loading pipeline components...:  29%|████████▊                      | 2/7 [00:00<00:01,  2.62it/s]You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers
Loading pipeline components...: 100%|███████████████████████████████| 7/7 [00:01<00:00,  4.31it/s]
Traceback (most recent call last):
  File "C:\ai1\diffuser_t2i\FLUX_torchao.py", line 21, in <module>
    pipe.enable_sequential_cpu_offload()
  File "C:\ai1\diffuser_t2i\venv\lib\site-packages\diffusers\pipelines\pipeline_utils.py", line 1179, in enable_sequential_cpu_offload
    cpu_offload(model, device, offload_buffers=offload_buffers)
  File "C:\ai1\diffuser_t2i\venv\lib\site-packages\accelerate\big_modeling.py", line 205, in cpu_offload
    attach_align_device_hook(
  File "C:\ai1\diffuser_t2i\venv\lib\site-packages\accelerate\hooks.py", line 518, in attach_align_device_hook
    attach_align_device_hook(
  File "C:\ai1\diffuser_t2i\venv\lib\site-packages\accelerate\hooks.py", line 518, in attach_align_device_hook
    attach_align_device_hook(
  File "C:\ai1\diffuser_t2i\venv\lib\site-packages\accelerate\hooks.py", line 518, in attach_align_device_hook
    attach_align_device_hook(
  File "C:\ai1\diffuser_t2i\venv\lib\site-packages\accelerate\hooks.py", line 509, in attach_align_device_hook
    add_hook_to_module(module, hook, append=True)
  File "C:\ai1\diffuser_t2i\venv\lib\site-packages\accelerate\hooks.py", line 161, in add_hook_to_module
    module = hook.init_hook(module)
  File "C:\ai1\diffuser_t2i\venv\lib\site-packages\accelerate\hooks.py", line 308, in init_hook
    set_module_tensor_to_device(module, name, "meta")
  File "C:\ai1\diffuser_t2i\venv\lib\site-packages\accelerate\utils\modeling.py", line 355, in set_module_tensor_to_device
    new_value.layout_tensor,
AttributeError: 'AffineQuantizedTensor' object has no attribute 'layout_tensor'
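
The traceback shows accelerate reading `new_value.layout_tensor`, which the installed `AffineQuantizedTensor` does not expose. One way such a mismatch can be tolerated is a fallback lookup over candidate attribute names. The sketch below is illustrative only: the stand-in classes are hypothetical, the name `tensor_impl` is an assumption about what this torchao version calls the attribute, and none of this is accelerate's actual code.

```python
from typing import Any, Iterable


class OldStyleTensor:
    """Hypothetical stand-in exposing the attribute name from the traceback."""

    def __init__(self, impl: Any) -> None:
        self.layout_tensor = impl


class NewStyleTensor:
    """Hypothetical stand-in exposing the assumed newer attribute name."""

    def __init__(self, impl: Any) -> None:
        self.tensor_impl = impl  # assumption: renamed attribute


def get_inner_tensor(
    value: Any,
    names: Iterable[str] = ("tensor_impl", "layout_tensor"),
) -> Any:
    """Return the first attribute in `names` that `value` exposes."""
    names = tuple(names)
    for name in names:
        if hasattr(value, name):
            return getattr(value, name)
    raise AttributeError(
        f"{type(value).__name__!r} object has none of the attributes {names}"
    )
```

With this kind of lookup, both `get_inner_tensor(OldStyleTensor(x))` and `get_inner_tensor(NewStyleTensor(x))` return `x`, so the caller does not depend on which name the installed release uses.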

System Info

Windows 11

(venv) C:\ai1\diffuser_t2i>python --version
Python 3.10.11

(venv) C:\ai1\diffuser_t2i>echo %CUDA_PATH%
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6

(venv) C:\ai1\diffuser_t2i>pip list
Package            Version
------------------ ------------
accelerate         1.2.0.dev0
aiofiles           23.2.1
annotated-types    0.7.0
anyio              4.7.0
bitsandbytes       0.45.0
certifi            2024.12.14
charset-normalizer 3.4.1
click              8.1.8
colorama           0.4.6
diffusers          0.33.0.dev0
einops             0.8.0
exceptiongroup     1.2.2
fastapi            0.115.6
ffmpy              0.5.0
filelock           3.16.1
fsspec             2024.12.0
gguf               0.13.0
gradio             5.9.1
gradio_client      1.5.2
h11                0.14.0
httpcore           1.0.7
httpx              0.28.1
huggingface-hub    0.25.2
idna               3.10
imageio            2.36.1
imageio-ffmpeg     0.5.1
importlib_metadata 8.5.0
Jinja2             3.1.5
markdown-it-py     3.0.0
MarkupSafe         2.1.5
mdurl              0.1.2
mpmath             1.3.0
networkx           3.4.2
ninja              1.11.1.3
numpy              2.2.1
opencv-python      4.10.0.84
orjson             3.10.13
packaging          24.2
pandas             2.2.3
pillow             11.1.0
pip                23.0.1
protobuf           5.29.2
psutil             6.1.1
pydantic           2.10.4
pydantic_core      2.27.2
pydub              0.25.1
Pygments           2.18.0
python-dateutil    2.9.0.post0
python-multipart   0.0.20
pytz               2024.2
PyYAML             6.0.2
regex              2024.11.6
requests           2.32.3
rich               13.9.4
ruff               0.8.6
safehttpx          0.1.6
safetensors        0.5.0
semantic-version   2.10.0
sentencepiece      0.2.0
setuptools         65.5.0
shellingham        1.5.4
six                1.17.0
sniffio            1.3.1
starlette          0.41.3
sympy              1.13.1
tokenizers         0.21.0
tomlkit            0.13.2
torch              2.5.1+cu124
torchao            0.7.0
torchvision        0.20.1+cu124
tqdm               4.67.1
transformers       4.47.1
typer              0.15.1
typing_extensions  4.12.2
tzdata             2024.2
urllib3            2.3.0
uvicorn            0.34.0
websockets         14.1
wheel              0.45.1
zipp               3.21.0
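
Most of the list above is unrelated to the failure. A short sketch for reporting only the packages that appear in the reproduction and traceback is shown below; the choice of packages in `RELEVANT` and the helper name `report_versions` are assumptions, not part of any library.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable

# Assumed to be the packages that matter for this report.
RELEVANT = ("accelerate", "diffusers", "torch", "torchao", "transformers")


def report_versions(packages: Iterable[str] = RELEVANT) -> Dict[str, str]:
    """Map each package name to its installed version, or 'not installed'."""
    found: Dict[str, str] = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = "not installed"
    return found


if __name__ == "__main__":
    for name, ver in report_versions().items():
        print(f"{name:<14}{ver}")
```

Run inside the same virtual environment as the failing script, this prints one line per package, which is easier to compare across setups than a full `pip list`.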

Who can help?

No response

Labels: bug
