Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Vulkan] Track image layout internally #5597

Merged
merged 20 commits into from
Aug 10, 2022

Conversation

PENGUINLIONG
Copy link
Member

For a same reason as discussed in #5540 (comment) this PR propose a mechanism to track image layouts internally:

  1. When an image is created or imported, it's initial state is always VK_IMAGE_LAYOUT_UNDEFINED; VulkanDevice tracks a global state for each image;
  2. When the user inserts image barrier by VulkanCommandList::image_transition, the command list tracks the initial layout of the image when it's first referred by any command, and a pending image layout that will be realized after the execution of the command list.
  3. When the user submits a command list, the initial image layout records are compared with the current records of VulkanDevice; errors are raised if those records mismatch, because a previously submitted command list has invalidated the layouts assumed by the current list.
  4. If the layout records perfectly match, the command list is submitted and tracked layouts in VulkanDevice are updated with pending layouts in VulkanCommandList.

@netlify
Copy link

netlify bot commented Aug 2, 2022

Deploy Preview for docsite-preview canceled.

Name Link
🔨 Latest commit e199ec5
🔍 Latest deploy log https://app.netlify.com/sites/docsite-preview/deploys/62f341e944d71700086aea55

@bobcao3
Copy link
Collaborator

bobcao3 commented Aug 3, 2022

Why do you need to track the layout?

@bobcao3
Copy link
Collaborator

bobcao3 commented Aug 3, 2022

For the user, you do not need to know about the layout, you can always assume they are in "undefined"

A layout_transition(undefined, anything) is always valid. It might be sub-optimal because the underlying access flags and stages can not be specified as detailed as possible with a undefined initial layout, but no one bothers going into that kinds of detail.

Moreover, on desktop GPUs, layouts are not a thing. They are purely here for tiled GPUs such as mobile chips. A layout transition on desktop is simply a memory barrier, so it doesn't matter which layout you specify.

---- edit here:

For mobile it does matter... But this is not the correct way to track layout due to the problem we will have with ordering that I mentioned in the next post. You only know all the layouts when the user submit the command buffer, at which point you can't really modify the commands you have recorded. Thus it is better to just establish a consistent convention in terms of API, where when the user provide the image to the system, it will need to assume Taichi takes full control of that image until execution completes. The user might need to specify a final layout as well

If the tracking is really just for validation purposes (where an incorrect initial layout results in an error), then we should let Vulkan validation layer to do its job. It doesn't really simplify the API in any way.

@bobcao3
Copy link
Collaborator

bobcao3 commented Aug 3, 2022

Furthermore, this tracking is flawed. We have async compute queues in Vulkan, and we can and do submit multiple command buffers instead of just one big one for the entire frame. If other command buffers have their layout transition run first, it will simply break this and be invalid. The order we record the command list is not guaranteed to be the same order when it's executed on the GPU.

@PENGUINLIONG
Copy link
Member Author

For the user, you do not need to know about the layout, you can always assume they are in "undefined"

A layout_transition(undefined, anything) is always valid. It might be sub-optimal because the underlying access flags and stages can not be specified as detailed as possible with a undefined initial layout, but no one bothers going into that kinds of detail.

Moreover, on desktop GPUs, layouts are not a thing. They are purely here for tiled GPUs such as mobile chips. A layout transition on desktop is simply a memory barrier, so it doesn't matter which layout you specify.

I understand your concerns but your reasoning is only valid when the images are always write-only, so there is no need to preserve existing data during layout transition. The image layout actually cannot be assumed undefined, as described in the Vulkan specification.

VK_IMAGE_LAYOUT_UNDEFINED specifies that the layout is unknown. [...] This layout can be used in place of the current image layout in a layout transition, but doing so will cause the contents of the image’s memory to be undefined.

Logically, regardless of any specific implementation, the layout transition involves an inplace R/W of the existing texel data to relocate the texels in memory to serve certain needs of hardware design. So oldLayout guides the driver and the device to collect data from memory correctly. It might not be a problem on the devices we have tested on but we will have to pay for the divergence from the specification sooner or later. We won't be able to preserve memory data if oldLayout is always undefined. We certainly can give up tracking layouts, but that's only possible if images are always in VK_IMAGE_LAYOUT_GENERAL, and there is a price for it in terms of performance.

Furthermore, this tracking is flawed. We have async compute queues in Vulkan, and we can and do submit multiple command buffers instead of just one big one for the entire frame. If other command buffers have their layout transition run first, it will simply break this and be invalid. The order we record the command list is not guaranteed to be the same order when it's executed on the GPU.

I do have considered this in the design of the tracking mechanism; and that's why I proposed double tracking on both Device and CommandList levels. The device records are only updated after a successful submit, and command lists can only be successfully submitted if the initial layouts assumed by them perfectly matches the device records. If we are going to support multi-queues, which is not yet implemented AFAIK, I have integrated event primitives to synchronize different queues for on-device executions, if needed. If "using VK_IMAGE_LAYOUT_UNDEFINED as source layout all the time" is already not conformant to the specification, I don't think it's different from tracking a wrong layout.

@PENGUINLIONG PENGUINLIONG force-pushed the vk-image-layout branch 2 times, most recently from aa8f824 to 5b8995c Compare August 5, 2022 06:07
@PENGUINLIONG
Copy link
Member Author

So I have adapted the impl to track image layouts on a runtime level. @bobcao3 , any comments?

@bobcao3 bobcao3 self-requested a review August 7, 2022 20:30
Copy link
Collaborator

@bobcao3 bobcao3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@PENGUINLIONG
Copy link
Member Author

/rebase

@PENGUINLIONG
Copy link
Member Author

/rebase

@PENGUINLIONG
Copy link
Member Author

/rebase

@PENGUINLIONG PENGUINLIONG merged commit 3895ff7 into taichi-dev:master Aug 10, 2022
@PENGUINLIONG PENGUINLIONG deleted the vk-image-layout branch August 10, 2022 06:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants