GGUF is becoming a preferred means of distributing FLUX fine-tunes.
Transformers recently added general GGUF support and is gradually extending it to additional model types (implemented via a `gguf_file` parameter on the `from_pretrained` method). This PR adds support for loading GGUF files to `T5EncoderModel`.
I've tested the code with the quants available at https://huggingface.co/city96/t5-v1_1-xxl-encoder-gguf/tree/main and it works with the current Flux implementation in diffusers.
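For reference, a minimal sketch of how the tested setup is wired together, using the `gguf_file` parameter mentioned above (the quant filename is illustrative; pick any file from the linked repo):

```python
import torch
from transformers import T5EncoderModel
from diffusers import FluxPipeline

# Load a GGUF-quantized T5 text encoder via transformers' gguf_file support.
text_encoder_2 = T5EncoderModel.from_pretrained(
    "city96/t5-v1_1-xxl-encoder-gguf",
    gguf_file="t5-v1_1-xxl-encoder-Q8_0.gguf",  # illustrative; any quant from the repo
    torch_dtype=torch.bfloat16,
)

# Plug it into the Flux pipeline in place of the default T5 encoder.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    text_encoder_2=text_encoder_2,
    torch_dtype=torch.bfloat16,
)
```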
However, as `FluxTransformer2DModel` is defined in the diffusers library, support has to be added here to load the actual transformer model, which is what most (if not all) Flux fine-tunes ship as (see the sketch after the examples below).
Examples that can be used:
- https://civitai.com/models/657607/gguf-fastflux-flux1-schnell-merged-with-flux1-dev with weights quantized as q4_0, q4_1, q5_0, q5_1
- https://civitai.com/models/662958/flux1-dev-gguf-f16 with weights simply converted from f16
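As a hypothetical sketch only: one plausible shape for the requested diffusers-side API, modeled on the existing single-file loader. GGUF support in `from_single_file` does not exist yet; that is exactly what this issue is asking for.

```python
import torch
from diffusers import FluxPipeline, FluxTransformer2DModel

# HYPOTHETICAL: load a GGUF-quantized Flux transformer, mirroring how
# transformers handles GGUF checkpoints. Not currently supported.
transformer = FluxTransformer2DModel.from_single_file(
    "flux1-dev-Q4_0.gguf",  # local GGUF file downloaded from one of the links above
    torch_dtype=torch.bfloat16,
)

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)
```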
cc: @yiyixuxu @sayakpaul @DN6