-
Notifications
You must be signed in to change notification settings - Fork 760
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL] Enable parameter optimization for SYCL kernels #2236
[SYCL] Enable parameter optimization for SYCL kernels #2236
Conversation
Note this is the CFE component to: #2226 |
ac56302
to
4a1cf73
Compare
I lost the thing, but to do clang format, check this out: git diff -U0 --no-color COMMIT_ID_BEFORE_YOURS | clang-format-diff.py -i -p1 You might have to give it a direct path to clang-format-diff.py, and make sure clang-format executable is in your path (and make sure you are in the llvm 'root' directory), but this tends to work for me. Alternatively, you can manually apply the clang-format-diff file from the build bot. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A handful of small changes, otherwise this LGTM. Re-request a review when you think you've got them all and I'll take a look (likely Monday).
1d93ea0
to
4f36f61
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1 nit left for me.
4f36f61
to
b800ac2
Compare
The kernel arg optimization information is not a kernel descriptor. The information should be bound to the module descriptor. If I compile for 2 targets, there is no guarantees that arguments will be elided in the same way in both cases. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very small nit, otherwise lgtm
Can you clarify this comment? @romanovvlad has taken a look at this solution and seems to think it is workable, so I'd be interested to hear what the two of you believe should be changed here? |
If I compile using So unless I missed something in the integration header evolution and it is emitted per target, the result is likely to fail for one of the target. |
PD.Kind = Kind; | ||
PD.Info = Info; | ||
PD.Offset = Offset; | ||
K->Params.push_back({Kind, Info, Offset, NumOpenCLParams}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.emplace_back()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@keryell : emplace_back doesn't support aggregate initialization until C++20, and our codebase is C++14.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@keryell : emplace_back doesn't support aggregate initialization until C++20, and our codebase is C++14.
Then you know what you have to do. Please replace all the -std=c++14
with -std=c++20
;-)
This is depressing all this time wasted writing old code... :-(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@keryell : emplace_back doesn't support aggregate initialization until C++20, and our codebase is C++14.
Then you know what you have to do. Please replace all the
-std=c++14
with-std=c++20
;-)
This is depressing all this time wasted writing old code... :-(
Wish I could! It was hard enough to get the LLVM project switched over from a terrible-subset-of C++11 to C++14.
The solution is workable from SYCL RT point of view - SYCL RT can skip setting argument which are marked in special way. |
What happens in the driver (@mdtoguchi ?) with this multiple device targets? Do they all share a single integration header, or is there a separate one for each? Short-term we could probably limit the LLVM DSE enablement (via driver) to only single-target invocations. Long-term, could we have a separate integration header per-target? The CFE can be changed to output multiple files if necessary for the integration header. |
The integration header are included into the application, so we will have multiple definitions of things headers define. |
Current behavior is to generate a single integration header that is pulled into the host file. |
So then, what is our solution here? How can we communicate this to the runtime? Do we just prohibit the optimization in multi-device situations? Do we prohibit different opt-settings in that case? |
As an idea we could change the approach and embed "omit array" to the device image as a metadata. But it requires more changes in the RT because in the place where we currently collect arguments we don't know which specific device binary will be used. |
[L0] Fix UR_KERNEL_INFO_ATTRIBUTES returned value
Add SYCL integration header support with kernel arg optimization table to mark kernel parameters that can be omitted.