
Consistently getting noise as output with Intel Arc #556

Closed
bvhari opened this issue Apr 23, 2023 · 17 comments

@bvhari
Contributor

bvhari commented Apr 23, 2023

I set up ComfyUI following the tutorial for Intel Arc. However, I am consistently getting noise as output.
System spec: Windows 10 WSL, Ubuntu 22.04.2 LTS, Python 3.10, Arc A770

@bvhari bvhari changed the title Consistently getting noise as output with Intel Arc Graphics Consistently getting noise as output with Intel Arc Apr 23, 2023
@comfyanonymous
Owner

Are you getting this with all samplers/schedulers?

@kwaa
Contributor

kwaa commented Apr 24, 2023

I think this is more likely to be an upstream (Intel Extension for PyTorch, Intel Compute Runtime, etc.) issue.
Anyway, did you start with the --use-split-cross-attention argument?

In general, lower resolutions (e.g. 512x512 or lower) have a lower probability of producing noise.

Possibly related: intel/intel-extension-for-pytorch#325
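
For reference, starting ComfyUI with that flag would look something like this (a sketch assuming the usual main.py entry point; adjust for your setup):

    python main.py --use-split-cross-attention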

@bvhari
Contributor Author

bvhari commented Apr 24, 2023

@comfyanonymous Yes, I tried multiple combinations.
@kwaa Quite possible, as the system started slowing down after 3-4 back-to-back generations. I had to quit Comfy and restart the GPU driver to fix this.
I am already testing at 512x512, with --use-split-cross-attention active.
I tried with a 2 GB model; same result.

@WASasquatch
Contributor

WASasquatch commented Apr 24, 2023

> I think this is more likely to be an upstream (Intel Extension for PyTorch, Intel Compute Runtime, etc.) issue. Anyway, did you start with the --use-split-cross-attention argument?
>
> In general, lower resolutions (e.g. 512x512 or lower) have a lower probability of producing noise.
>
> Possibly related: intel/intel-extension-for-pytorch#325

Anything under 512x512 will actually give you latent noise artifacting and strange results if it's not img2img. I think this is because of the internal upscaling from the source 64x64 latent (for 1.4/1.5; the VAE downsamples pixels 8x, so 512x512 decodes from a 64x64 latent).

@NoAvailableAlias

Can confirm. I finally got XPU acceleration working via the guide from this merge:
#409
But I'm only getting 3.4 it/s (half of what other people are getting in the webui oneapi fork), and it also only outputs noise...
Thanks, Intel.

@kwaa
Contributor

kwaa commented Apr 25, 2023

> But I'm only getting 3.4 it/s (half of what other people are getting in the webui oneapi fork)

I didn't add ipex.optimize; if I do, it reports an error
(or maybe I just didn't find the right place to add it).
IPEX v2.0.0+xpu may solve this problem, but what I'm really hoping for is ipexrun xpu.
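
If/when that launcher lands, usage would presumably be along these lines (purely a sketch; ipexrun ships with IPEX, and its XPU support and flags depend on the release):

    ipexrun xpu main.py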

@kwaa
Contributor

kwaa commented Apr 25, 2023

> Anything under 512x512 will actually give you latent noise artifacting and strange results if it's not img2img. I think this is because of the internal upscaling from the source 64x64 latent (for 1.4/1.5; the VAE downsamples pixels 8x, so 512x512 decodes from a 64x64 latent).

Between the strange results and the pure noise/black images, I think I have to choose the former...

So, Intel: F--k You!

Note: The last time I used ComfyUI, the maximum size was about 768x704; exceeding that would produce a black image.

@bvhari
Contributor Author

bvhari commented Apr 28, 2023

Finally got it working using the wheel files from this tutorial: https://github.com/TotalDay/Intel_ARC_GPU_WSL_Stable_Diffusion_WEBUI
It basically combines the oneapi branch of this fork: https://github.com/jbaboval/stable-diffusion-webui
with some presumably custom-built wheels.
Interestingly, ipex.optimize is enabled in that fork, and the fork worked when I tried it.
So I tried enabling ipex.optimize in Comfy as well. Unfortunately, I am only getting around half the performance of the A1111 fork. However, the Karras schedule is working.
Hopefully the devs here can figure out the reason for the performance discrepancy.

@kwaa
Contributor

kwaa commented Apr 28, 2023

> So I tried enabling ipex.optimize in Comfy as well. Unfortunately, I am only getting around half the performance of the A1111 fork.

Where did you add ipex.optimize?

> However, the Karras schedule is working.

It looks like TotalDay/Intel_ARC_GPU_WSL_Stable_Diffusion_WEBUI ships an IPEX build that has not been released yet, so this is in line with expectations.

Btw, are you now able to generate high resolution images?

@bvhari
Contributor Author

bvhari commented Apr 28, 2023

> Where did you add ipex.optimize?

In comfy/model_management.py, in the function load_model_gpu.
At the beginning of the function I added the declaration:

    global xpu_available

Then, after the line

    real_model.to(get_torch_device())

I added:

    if xpu_available:
        # hand the freshly loaded model to IPEX for XPU-specific optimization
        ipex.optimize(real_model, inplace=True)

YMMV on the improvement from ipex.optimize though.
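
For anyone reproducing this outside ComfyUI, here is a minimal standalone sketch of the same call (the toy model is just a stand-in for ComfyUI's real_model):

    import torch
    import torch.nn as nn
    import intel_extension_for_pytorch as ipex  # registers the "xpu" device

    # Toy stand-in for the loaded diffusion model
    model = nn.Sequential(nn.Linear(64, 64), nn.ReLU()).eval().to("xpu")

    # ipex.optimize applies XPU-specific weight-layout and kernel
    # optimizations; inplace=True mutates the module instead of copying it.
    model = ipex.optimize(model, dtype=torch.float16, inplace=True)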

> Btw, are you now able to generate high resolution images?

Yes, but only up to 1024x1024.
Beyond that, the driver crashes or the output is noise.
I can go up to 1280x1280 in the A1111 DirectML fork.
Maybe I should try the ComfyUI DirectML fork. I might be able to hit 1536x1536, since Comfy has tiled VAE.

@kwaa
Contributor

kwaa commented May 1, 2023

IPEX has released v1.13.120+xpu (why not v2.0.0?); I'll see what I can do.

@simonlui
Contributor

So, good news, months later: Intel finally released an XPU version of their PyTorch extension with PyTorch 2.0 support, v2.0.110+xpu, and it solves the noise issue. You can get something out without much trouble... as long as you are generating one image at a time, with some other caveats. I'll write up a post later in the discussion thread related to this. But the base issue should be solved.

[Screenshot: 2023-08-13 18-53-08]

@BA8F0D39

@simonlui
IPEX v2.0.110+xpu solves the black images and weird noise.
However, generating an image larger than 512x768 makes it all black, even though not all of the VRAM is used.

@simonlui
Contributor

@BA8F0D39 Yeah, I've hit that, but I'm pretty sure it isn't a ComfyUI issue; it's an Intel issue, specifically with how they handle allocation on the GPU in order to preserve their stateful addressing model. I'm currently digging into their stack and have seen the bug reports you filed about 4GB being the maximum you can allocate. But remember, that is the limit for one single allocation. I will probably open an issue or two and update those, so keep an eye out.

In the meantime, you can try to get the program to chunk its allocations into smaller units so it doesn't hit the limit and uses VRAM better. Use FP16 where possible, and use memory-saving nodes in your workflow, such as the testing Tiled VAE Encode/Decode and TomePatchModel nodes. ComfyUI's latest change, as of barely an hour ago, also helps by allowing text encoder weights to be stored in FP16. With that, I am able to use SD1.5 and generate at 768x768, then latent upscale to 1024x1024, without hitting any image corruption or blackout issues.
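
For concreteness, the FP16 part of that is just a launch flag, roughly like this (flag names from ComfyUI at the time; double-check against python main.py --help):

    python main.py --force-fp16 --use-split-cross-attention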


@BrosnanYuen

@simonlui
I looked through the IPEX code base, and it doesn't seem to allocate arrays directly.
I think memory allocation happens in oneDNN, and allocations larger than 4GB are disabled by default.
They deleted "-cl-intel-greater-than-4GB-buffer-required" in oneDNN, the build option that enables arrays larger than 4GB:

oneapi-src/oneDNN@42a1895#diff-21a382a12fc4d58cceb2ab97c73746f53439a1f739f1573ccdd6060ea62949e1

I think we can try to enable 4GB-and-greater allocations using
intel/compute-runtime#627

@simonlui
Contributor

simonlui commented Aug 23, 2023

@BrosnanYuen Both IPEX and oneDNN use SYCL for allocation. See here and here respectively. That's not the main issue here, though. If you read https://github.com/intel/compute-runtime/blob/master/programmers-guide/ALLOCATIONS_GREATER_THAN_4GB.md, you will realize that there are two requirements for making >4GB allocations happen: you need the build flags, true, but you also need to pass a flag or struct along with the allocation function call. The document only covers Level Zero and OpenCL. IPEX and oneDNN both use SYCL instead, and that's the problem, since SYCL has no provision for doing the same thing. I've opened an enhancement report in Intel's LLVM to try to get something propagated through for this limitation, but it will take a long time, if Intel even considers it.

Again, what can be done right now is mitigation, so that IPEX never makes a single allocation above 4GB; that lets more VRAM be used via the strategies mentioned, but hitting the limit is unavoidable for big images, batches, and more complex workflows, which limits what the GPU can do at this time. The other option is for ComfyUI itself to split allocations into 4GB chunks where possible, but I think it is untenable to ask projects that use IPEX to mitigate something that shouldn't be their problem and implement what is essentially manual memory management in Python.
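
To illustrate the chunking idea (purely a sketch; decode_fn stands in for whatever operation would otherwise trigger one huge allocation):

    import torch

    def run_in_chunks(decode_fn, batch, chunk_size=2):
        # Process the batch a few items at a time so no single
        # intermediate tensor approaches the 4GB per-allocation ceiling.
        outputs = []
        for start in range(0, batch.shape[0], chunk_size):
            outputs.append(decode_fn(batch[start:start + chunk_size]))
        return torch.cat(outputs, dim=0)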

@bvhari
Contributor Author

bvhari commented Aug 8, 2024

Closing, as this has long been fixed.

@bvhari bvhari closed this as completed Aug 8, 2024