
[examples] add controlnet sd3 example #9249

Merged: 11 commits merged into huggingface:main on Sep 11, 2024

Conversation

@DavyMorgan (Contributor) commented Aug 23, 2024

What does this PR do?

Fixes #8834

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@DavyMorgan (Contributor, Author)

My implementation is based on the official examples for ControlNet and SD3 DreamBooth

@kadirnar (Contributor)

@DavyMorgan

Great. Are you going to write ControlNet training code for Flux? I really need it.

@DavyMorgan changed the title from "add controlnet sd3 example" to "[examples] add controlnet sd3 example" on Aug 24, 2024
@DavyMorgan (Contributor, Author)

> @DavyMorgan Great. Are you going to write ControlNet training code for Flux? I really need it.

Hi, a nice implementation of ControlNet Flux with training scripts can be found at https://github.com/XLabs-AI/x-flux.

@kadirnar (Contributor)

> Hi, a nice implementation of ControlNet Flux with training scripts can be found at https://github.com/XLabs-AI/x-flux.

I have been using that library for a week, and multi-GPU training is not working. I need multi-GPU support to train on large datasets.

@DavyMorgan (Contributor, Author)

@yiyixuxu Could you take a look at this PR when you get a chance? It addresses issue #8834 and provides an example for controlnet+sd3. Thanks!

@yiyixuxu (Collaborator) commented Sep 4, 2024

ohh thanks for your PR!

@haofanwang @wangqixun would you be able to give this PR a review too?

@yiyixuxu requested a review from sayakpaul on September 4, 2024 at 01:24
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@sayakpaul (Member) left a comment

Thank you for working on this! I have left a couple of comments.

Could you also share some results from your experiments?

And as @yiyixuxu mentioned, it'd be great to have this PR reviewed by @haofanwang @wangqixun as they were the first ones to have come up with SD3 ControlNets.

@xduzhangjiayu (Contributor) commented Sep 4, 2024

> My implementation is based on the official examples for ControlNet and SD3 DreamBooth

Hi,
I tried to run your script with mixed_precision=fp16 but got an unexpected error in

    model_pred = transformer(
        hidden_states=noisy_model_input,
        timestep=timesteps,
        encoder_hidden_states=prompt_embeds,
        pooled_projections=pooled_prompt_embeds,
        block_controlnet_hidden_states=control_block_res_samples,
        return_dict=False,
    )[0]

The error is RuntimeError: mat1 and mat2 must have the same dtype, but got Float and Half.
noisy_model_input is fp16 in the script, though, so do you have any advice on how to solve this? Thanks so much!

It can be fixed by following the review comments.
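For illustration only, here is a minimal sketch of the kind of dtype fix being discussed: casting the ControlNet residuals to the transformer's weight dtype before the forward pass. The variable names follow the snippet above; `weight_dtype` is an assumption standing in for the fp16 dtype used under mixed precision, and this is not necessarily the exact change adopted in the PR.

```python
import torch

weight_dtype = torch.float16  # assumed dtype under fp16 mixed precision

# Cast the ControlNet residuals to the transformer's weight dtype so the
# matmuls inside the transformer see one consistent dtype.
control_block_res_samples = [
    sample.to(dtype=weight_dtype) for sample in control_block_res_samples
]

model_pred = transformer(
    hidden_states=noisy_model_input,
    timestep=timesteps,
    encoder_hidden_states=prompt_embeds,
    pooled_projections=pooled_prompt_embeds,
    block_controlnet_hidden_states=control_block_res_samples,
    return_dict=False,
)[0]
```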

@DavyMorgan force-pushed the controlnet-sd3-example branch from 3360355 to d9bf0d2 on September 9, 2024 at 01:33
@DavyMorgan (Contributor, Author) commented Sep 9, 2024

@sayakpaul @xduzhangjiayu Thank you very much for your kind reviews. I have updated the code according to your comments and suggestions, and I have added two experimental images as results to the README. Would you mind taking another look?

@DavyMorgan (Contributor, Author)

> Thank you for working on this! I have left a couple of comments.
>
> Could you also share some results from your experiments?
>
> And as @yiyixuxu mentioned, it'd be great to have this PR reviewed by @haofanwang @wangqixun as they were the first ones to have come up with SD3 ControlNets.

Please find the result images at the bottom of README_sd3.md :)

@sayakpaul (Member)

Thanks for the changes @DavyMorgan! Let's also add a test similar to https://github.com/huggingface/diffusers/blob/main/examples/controlnet/test_controlnet.py?
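For reference, a hypothetical sketch of what such an example test could look like, in the spirit of the linked test_controlnet.py. The CLI flags, tiny model id, dataset name, and output file name below are assumptions for illustration, not taken from the final PR.

```python
import os
import subprocess
import tempfile
import unittest


class ControlNetSD3ExampleSmokeTest(unittest.TestCase):
    def test_train_controlnet_sd3_runs(self):
        # Run the training script for a handful of steps on tiny test models,
        # then check that ControlNet weights were written to the output dir.
        with tempfile.TemporaryDirectory() as tmpdir:
            cmd = [
                "python", "examples/controlnet/train_controlnet_sd3.py",
                "--pretrained_model_name_or_path=hf-internal-testing/tiny-sd3-pipe",
                "--dataset_name=hf-internal-testing/fill10",
                f"--output_dir={tmpdir}",
                "--resolution=64",
                "--train_batch_size=1",
                "--max_train_steps=4",
            ]
            subprocess.run(cmd, check=True)
            self.assertTrue(
                os.path.isfile(os.path.join(tmpdir, "diffusion_pytorch_model.safetensors"))
            )
```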


@sayakpaul (Member)

Could you run make style && make quality?

@DavyMorgan (Contributor, Author)

> Could you run make style && make quality?

@sayakpaul Yeah, I have run make style && make quality, which fixed the import order and a few style issues. Thanks a lot!

@DavyMorgan (Contributor, Author) commented Sep 9, 2024

In the Fast tests for PRs / PyTorch Example CPU tests (pull_request):

09/09/2024 13:20:09 - INFO - __main__ - Initializing controlnet weights from transformer
Traceback (most recent call last):
  File "/__w/diffusers/diffusers/examples/controlnet/train_controlnet_sd3.py", line 1415, in <module>
    main(args)
  File "/__w/diffusers/diffusers/examples/controlnet/train_controlnet_sd3.py", line 997, in main
    controlnet = SD3ControlNetModel.from_transformer(transformer)
  File "/__w/diffusers/diffusers/src/diffusers/models/controlnet_sd3.py", line 254, in from_transformer
    controlnet.transformer_blocks.load_state_dict(transformer.transformer_blocks.state_dict(), strict=False)
  File "/opt/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2215, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for ModuleList:
	size mismatch for 0.norm1_context.linear.weight: copying a param with shape torch.Size([64, 32]) from checkpoint, the shape in current model is torch.Size([192, 32]).
	size mismatch for 0.norm1_context.linear.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([192]).

@sayakpaul It seems that hf-internal-testing/tiny-sd3-pipe has a size mismatch with SD3ControlNetModel. Should I use "InstantX/SD3-Controlnet-Canny" in the test script, even though all the existing test code uses pretrained models from hf-internal-testing?

@sayakpaul (Member)

We need to use a smaller ControlNet model. We should be able to initialize it from the transformer of hf-internal-testing/tiny-sd3-pipe, following the https://github.com/huggingface/diffusers/blob/main/examples/controlnet/test_controlnet.py script.
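As an illustration only, a minimal sketch of the initialization pattern being suggested (assuming the tiny pipe exposes a "transformer" subfolder; note that, as the traceback above shows, the ControlNet configuration has to match the tiny transformer's dimensions for this to succeed):

```python
from diffusers import SD3ControlNetModel, SD3Transformer2DModel

# Load the tiny SD3 transformer used by the internal test models.
transformer = SD3Transformer2DModel.from_pretrained(
    "hf-internal-testing/tiny-sd3-pipe", subfolder="transformer"
)

# Build a small ControlNet whose blocks are initialized from the transformer,
# mirroring what train_controlnet_sd3.py does when no ControlNet checkpoint
# is provided ("Initializing controlnet weights from transformer").
controlnet = SD3ControlNetModel.from_transformer(transformer)
```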

@DavyMorgan (Contributor, Author) commented Sep 10, 2024

> We need to use a smaller ControlNet model. We should be able to initialize it from the transformer of hf-internal-testing/tiny-sd3-pipe, following the https://github.com/huggingface/diffusers/blob/main/examples/controlnet/test_controlnet.py script.

@sayakpaul Thanks. I have updated the test to use the smaller SD3 model from the official test script.

@DavyMorgan (Contributor, Author)

I see. I have also added a tiny ControlNet model based on the official test script for ControlNet-SD3. The example test now passes on my local machine. @sayakpaul

@DavyMorgan (Contributor, Author)

@sayakpaul It seems that the failure in the fast pipeline tests is unrelated to this PR. WDYT?

@sayakpaul (Member) left a comment

Thanks for your contributions!

import torch

base_model_path = "stabilityai/stable-diffusion-3-medium-diffusers"
controlnet_path = "sd3-controlnet-out/checkpoint-6500/controlnet"
@sayakpaul (Member)

This seems like a local path. Can we update this to a checkpoint on the Hub?

@DavyMorgan (Contributor, Author)

Yeah sure, I will upload my checkpoint to the hub.
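For illustration, a hedged sketch of what the README inference snippet could look like once the checkpoint lives on the Hub. The repo id "DavyMorgan/sd3-controlnet-fill50k" is a placeholder, not the actual uploaded checkpoint; the prompt and conditioning image come from the results table below.

```python
import torch
from diffusers import SD3ControlNetModel, StableDiffusion3ControlNetPipeline
from diffusers.utils import load_image

base_model_path = "stabilityai/stable-diffusion-3-medium-diffusers"
controlnet_path = "DavyMorgan/sd3-controlnet-fill50k"  # placeholder Hub repo id

controlnet = SD3ControlNetModel.from_pretrained(controlnet_path, torch_dtype=torch.float16)
pipe = StableDiffusion3ControlNetPipeline.from_pretrained(
    base_model_path, controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

control_image = load_image(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet_training/conditioning_image_1.png"
)
image = pipe(
    "pale golden rod circle with old lace background",
    control_image=control_image,
    num_inference_steps=20,
).images[0]
image.save("output.png")
```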

Comment on lines +148 to +151
| | |
|-------------------|:-------------------------:|
| | pale golden rod circle with old lace background |
| ![conditioning image](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet_training/conditioning_image_1.png) | ![pale golden rod circle with old lace background](https://huggingface.co/datasets/DavyMorgan/sd3-controlnet-results/resolve/main/step-6500.png) |
@sayakpaul (Member)

Seems like there are artifacts in the output image but it could also be because of overfitting. Any comments?

@DavyMorgan (Contributor, Author)

I guess I can include more sample images for validation.

Comment on lines +71 to +79
def image_grid(imgs, rows, cols):
    assert len(imgs) == rows * cols

    w, h = imgs[0].size
    grid = Image.new("RGB", size=(cols * w, rows * h))

    for i, img in enumerate(imgs):
        grid.paste(img, box=(i % cols * w, i // cols * h))
    return grid
@sayakpaul (Member)

We can use the make_image_grid() utility function from diffusers.utils.

@DavyMorgan (Contributor, Author)

Got it! Thanks.
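A minimal sketch of the suggested replacement, using the make_image_grid utility shipped in diffusers.utils; the list of solid-color PIL images is purely illustrative stand-in data.

```python
from diffusers.utils import make_image_grid
from PIL import Image

# Illustrative inputs: four solid-color images standing in for validation outputs.
images = [Image.new("RGB", (256, 256), color) for color in ("red", "green", "blue", "white")]

# Replaces the hand-rolled image_grid() helper shown above.
grid = make_image_grid(images, rows=2, cols=2)
grid.save("validation_grid.png")
```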

@sayakpaul merged commit c002731 into huggingface:main on Sep 11, 2024
8 checks passed
@xduzhangjiayu (Contributor) commented Sep 13, 2024 via email

@DavyMorgan (Contributor, Author)

> Yes, I'm sure it is the text encoders that occupy the memory. After computing the embeddings, the text encoders are still in GPU memory (which means clear_objs_and_retain_memory doesn't work). I think it may be related to my accelerate config; I will check later. Have you tried training with a large dataset (at least 1000k images)? I think it may need a lot of CPU RAM when pre-computing the text embeddings.

(The email quoted @DavyMorgan's earlier inline review reply on examples/controlnet/train_controlnet_sd3.py, in the block that builds train_dataloader from make_train_dataset and defines compute_text_embeddings(prompt, text_encoders, tokenizers): "Are you sure it is the text encoders that occupy the memory? The GPU memory can be other models, as the vae, transformer, and controlnet models are still in memory. Besides, as we periodically run validation, the text encoders will also be loaded every validation_steps steps. From my experiments, previously I needed to separate training and validation onto two distinct GPUs, and after the above update I only need one GPU to run the script. During training, the text embeddings from the text encoders are in memory, though there is a cached copy on disk so that they will not be recomputed in your next run as long as the configs are the same.")

I only tested the fill50k dataset. As we use the datasets library, I believe it will handle the memory/disk issue well. You can check https://huggingface.co/docs/datasets/en/about_arrow#memory-mapping.
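As an aside, a generic sketch of the pattern under discussion: pre-compute the prompt embeddings once, then drop the text encoders and release GPU memory. The compute_text_embeddings, text_encoders, and tokenizers names follow the quoted snippet, while the dataset field name and overall structure are illustrative assumptions rather than the script's actual code.

```python
import gc
import torch

# Pre-compute prompt embeddings for the whole dataset (illustrative loop).
with torch.no_grad():
    cached_embeddings = [
        compute_text_embeddings(example["prompt"], text_encoders, tokenizers)
        for example in train_dataset
    ]

# The encoders are no longer needed for training; drop the references and
# release the VRAM they were holding.
del text_encoder_one, text_encoder_two, text_encoder_three, text_encoders
gc.collect()
torch.cuda.empty_cache()
```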

@xduzhangjiayu (Contributor)

OK, once again thank you very much for your reply!

@DavyMorgan DavyMorgan mentioned this pull request Oct 21, 2024
sayakpaul added a commit that referenced this pull request Dec 23, 2024
* add controlnet sd3 example

* add controlnet sd3 example

* update controlnet sd3 example

* add controlnet sd3 example test

* fix quality and style

* update test

* update test

---------

Co-authored-by: Sayak Paul <[email protected]>
Successfully merging this pull request may close these issues.

Will the training code of SD3 Controlnet be released?
6 participants