GGUF is becoming a preferred means of distributing FLUX fine-tunes.
Transformers recently added general GGUF support and is gradually extending it to additional model types (implemented via a `gguf_file` parameter on the `from_pretrained` method). This PR adds support for loading GGUF files to `T5EncoderModel`.
I've tested the code with the quants available at https://huggingface.co/city96/t5-v1_1-xxl-encoder-gguf/tree/main and it works with the current Flux implementation in diffusers.
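For reference, a minimal sketch of how the tested setup is wired together, using the `gguf_file` parameter mentioned above (the quant filename is illustrative; pick any file from the linked repo):

```python
import torch
from transformers import T5EncoderModel
from diffusers import FluxPipeline

# Load a GGUF-quantized T5 text encoder via transformers' gguf_file support.
text_encoder_2 = T5EncoderModel.from_pretrained(
    "city96/t5-v1_1-xxl-encoder-gguf",
    gguf_file="t5-v1_1-xxl-encoder-Q8_0.gguf",  # illustrative; any quant from the repo
    torch_dtype=torch.bfloat16,
)

# Plug it into the Flux pipeline in place of the default T5 encoder.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    text_encoder_2=text_encoder_2,
    torch_dtype=torch.bfloat16,
)
```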
However, as `FluxTransformer2DModel` is defined in the diffusers library, support has to be added here to load the actual transformer model, which is what most (if not all) Flux fine-tunes ship as (see the sketch after the examples below).
Examples that can be used:
- https://civitai.com/models/657607/gguf-fastflux-flux1-schnell-merged-with-flux1-dev with weights quantized as q4_0, q4_1, q5_0, q5_1
- https://civitai.com/models/662958/flux1-dev-gguf-f16 with weights simply converted from f16
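As a hypothetical sketch only: one plausible shape for the requested diffusers-side API, modeled on the existing single-file loader. GGUF support in `from_single_file` does not exist yet; that is exactly what this issue is asking for.

```python
import torch
from diffusers import FluxPipeline, FluxTransformer2DModel

# HYPOTHETICAL: load a GGUF-quantized Flux transformer, mirroring how
# transformers handles GGUF checkpoints. Not currently supported.
transformer = FluxTransformer2DModel.from_single_file(
    "flux1-dev-Q4_0.gguf",  # local GGUF file downloaded from one of the links above
    torch_dtype=torch.bfloat16,
)

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)
```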
cc: @yiyixuxu @sayakpaul @DN6