[SYCL] Enable parameter optimization for SYCL kernels #2236

srividya-sundaram · 2020-07-31T18:21:19Z

Add SYCL integration header support with kernel arg optimization table to mark kernel parameters that can be omitted.

erichkeane · 2020-07-31T18:26:44Z

Note this is the CFE component to: #2226

clang/lib/Sema/SemaSYCL.cpp

clang/test/CodeGenSYCL/integration_header.cpp

sycl/include/CL/sycl/detail/kernel_desc.hpp

clang/lib/Sema/SemaSYCL.cpp

clang/test/CodeGenSYCL/kernel-param-acc-array-ih.cpp

clang/test/CodeGenSYCL/kernel-param-member-acc-array-ih.cpp

clang/lib/Sema/SemaSYCL.cpp

erichkeane · 2020-07-31T23:27:25Z

I lost the thing, but to do clang format, check this out:

git diff -U0 --no-color COMMIT_ID_BEFORE_YOURS | clang-format-diff.py -i -p1

You might have to give it a direct path to clang-format-diff.py, and make sure clang-format executable is in your path (and make sure you are in the llvm 'root' directory), but this tends to work for me.

Alternatively, you can manually apply the clang-format-diff file from the build bot.

erichkeane

A handful of small changes, otherwise this LGTM. Re-request a review when you think you've got them all and I'll take a look (likely Monday).

clang/lib/Sema/SemaSYCL.cpp

clang/test/CodeGenSYCL/kernel-param-member-acc-array-ih.cpp

clang/lib/Sema/SemaSYCL.cpp

erichkeane

1 nit left for me.

Naghasan · 2020-08-03T09:01:33Z

The kernel arg optimization information is not a kernel descriptor. The information should be bound to the module descriptor.

If I compile for 2 targets, there are no guarantees that arguments will be elided in the same way in both cases.

romanovvlad

Q

sycl/include/CL/sycl/detail/kernel_desc.hpp

clang/lib/Sema/SemaSYCL.cpp

sycl/include/CL/sycl/detail/kernel_desc.hpp

Fznamznon

Very small nit, otherwise lgtm

clang/include/clang/Sema/Sema.h

erichkeane · 2020-08-03T15:03:56Z

> The kernel arg optimization information is not a kernel descriptor. The information should be bound to the module descriptor.
>
> If I compile for 2 targets, there are no guarantees that arguments will be elided in the same way in both cases.

Can you clarify this comment? @romanovvlad has taken a look at this solution and seems to think it is workable, so I'd be interested to hear what the two of you believe should be changed here.

Naghasan · 2020-08-03T15:34:08Z

> > The kernel arg optimization information is not a kernel descriptor. The information should be bound to the module descriptor.
> > If I compile for 2 targets, there are no guarantees that arguments will be elided in the same way in both cases.
>
> Can you clarify this comment? @romanovvlad has taken a look at this solution and seems to think it is workable, so I'd be interested to hear what the two of you believe should be changed here.

If I compile using -fsycl-targets=spir64-unknown-linux-sycldevice,spir64_fpga-unknown-linux-sycldevice, what guarantee is there that the compiler will remove exactly the same arguments in my kernels for both modules? If I start to tweak the optimization parameters, it is likely to expose different optimization opportunities, and the dead argument elimination may yield different results. An extreme case would be something like -fsycl-targets=spir64-unknown-linux-sycldevice,spir64_fpga-unknown-linux-sycldevice -Xsycl-target-backend=spir64-unknown-linux-sycldevice "-O3" -Xsycl-target-backend=spir64_fpga-unknown-linux-sycldevice "-O0".

So unless I missed something in the integration header evolution and it is emitted per target, the result is likely to fail for one of the targets.
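To make the concern concrete, here is a minimal sketch of the consistency condition implied above. It assumes each target's compilation yields its own "omit mask" (one flag per kernel parameter); the `OmitMask` alias and `singleTableIsValid` function are hypothetical names for illustration and are not part of this PR or the SYCL runtime.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical: one flag per kernel parameter, true = eliminated on this target.
using OmitMask = std::vector<bool>;

// A single table shared by all targets can only describe every device image
// correctly when each target eliminated exactly the same parameters.
bool singleTableIsValid(const std::vector<OmitMask> &PerTargetMasks) {
  assert(!PerTargetMasks.empty() && "expected at least one target");
  for (std::size_t I = 1; I < PerTargetMasks.size(); ++I)
    if (PerTargetMasks[I] != PerTargetMasks[0])
      return false;
  return true;
}
```

In the -O3 versus -O0 case from the comment, the two masks would likely differ, so a check like this would fail and one shared integration header could not describe both images.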

keryell · 2020-08-03T16:21:18Z

clang/lib/Sema/SemaSYCL.cpp

-  PD.Kind = Kind;
-  PD.Info = Info;
-  PD.Offset = Offset;
+  K->Params.push_back({Kind, Info, Offset, NumOpenCLParams});


.emplace_back()?

erichkeane · 2020-08-03

@keryell: emplace_back doesn't support aggregate initialization until C++20, and our codebase is C++14.

keryell · 2020-08-04

> @keryell: emplace_back doesn't support aggregate initialization until C++20, and our codebase is C++14.

Then you know what you have to do. Please replace all the -std=c++14 with -std=c++20 ;-)
This is depressing, all this time wasted writing old code... :-(

erichkeane · 2020-08-04

> > @keryell: emplace_back doesn't support aggregate initialization until C++20, and our codebase is C++14.
>
> Then you know what you have to do. Please replace all the -std=c++14 with -std=c++20 ;-)
> This is depressing, all this time wasted writing old code... :-(

Wish I could! It was hard enough to get the LLVM project switched over from a terrible-subset-of C++11 to C++14.
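The language point in this thread can be illustrated with a small standalone sketch. The `ParamDesc` struct below is a hypothetical aggregate that mirrors the four values in the diff; it is not the actual type from SemaSYCL.cpp.

```cpp
#include <cassert>
#include <vector>

// Hypothetical aggregate with no user-declared constructors.
struct ParamDesc {
  int Kind;
  int Info;
  int Offset;
  int NumOpenCLParams;
};

std::vector<ParamDesc> buildParams() {
  std::vector<ParamDesc> Params;

  // Fine in C++14: the braced list aggregate-initializes a temporary
  // ParamDesc, which push_back then moves into the vector.
  Params.push_back({1, 4, 0, 1});

  // Params.emplace_back(1, 4, 8, 1);
  // Ill-formed before C++20: emplace_back forwards to ParamDesc(1, 4, 8, 1),
  // and parenthesized initialization of aggregates only arrived in C++20.

  // Works in C++14 as well, but builds the temporary explicitly, so it
  // gains nothing over push_back here.
  Params.emplace_back(ParamDesc{1, 4, 8, 1});

  assert(Params.size() == 2);
  return Params;
}
```

Under C++14, `push_back` with a braced list is therefore the shortest form that compiles for an aggregate like this one.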

romanovvlad · 2020-08-03T18:18:28Z

> > The kernel arg optimization information is not a kernel descriptor. The information should be bound to the module descriptor.
> > If I compile for 2 targets, there are no guarantees that arguments will be elided in the same way in both cases.
>
> Can you clarify this comment? @romanovvlad has taken a look at this solution and seems to think it is workable, so I'd be interested to hear what the two of you believe should be changed here.

The solution is workable from the SYCL RT point of view: the SYCL RT can skip setting arguments that are marked in a special way.
But I believe the problem @Naghasan is describing is a real problem, and it is related to the compilation toolchain.
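A minimal sketch of the runtime-side skipping described above, under stated assumptions: `ParamDesc`, `paramsToSet`, and the omit-table layout are hypothetical illustrations, not the SYCL runtime's actual data structures.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one kernel parameter descriptor.
struct ParamDesc {
  int Kind;
  int Info;
  int Offset;
};

// Returns the parameters the runtime would still set. OmitTable[I] == true
// marks parameter I as eliminated; indices beyond the end of a short table
// are treated as "keep" so that no argument is dropped by accident.
std::vector<ParamDesc> paramsToSet(const std::vector<ParamDesc> &Params,
                                   const std::vector<bool> &OmitTable) {
  assert(OmitTable.size() <= Params.size() && "table longer than param list");
  std::vector<ParamDesc> Kept;
  for (std::size_t I = 0; I < Params.size(); ++I) {
    const bool Omit = I < OmitTable.size() && OmitTable[I];
    if (!Omit)
      Kept.push_back(Params[I]);
  }
  return Kept;
}
```

This only works if the table matches the device binary that actually runs, which is the toolchain problem raised in the thread.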

erichkeane · 2020-08-03T18:27:08Z

> > > The kernel arg optimization information is not a kernel descriptor. The information should be bound to the module descriptor.
> > > If I compile for 2 targets, there are no guarantees that arguments will be elided in the same way in both cases.
> >
> > Can you clarify this comment? @romanovvlad has taken a look at this solution and seems to think it is workable, so I'd be interested to hear what the two of you believe should be changed here.
>
> The solution is workable from the SYCL RT point of view: the SYCL RT can skip setting arguments that are marked in a special way.
> But I believe the problem @Naghasan is describing is a real problem, and it is related to the compilation toolchain.

What happens in the driver (@mdtoguchi ?) with multiple device targets? Do they all share a single integration header, or is there a separate one for each?

Short-term we could probably limit the LLVM DAE enablement (via the driver) to only single-target invocations. Long-term, could we have a separate integration header per target? The CFE can be changed to output multiple files if necessary for the integration header.

romanovvlad · 2020-08-03T23:46:12Z

> > > > The kernel arg optimization information is not a kernel descriptor. The information should be bound to the module descriptor.
> > > > If I compile for 2 targets, there are no guarantees that arguments will be elided in the same way in both cases.
> > >
> > > Can you clarify this comment? @romanovvlad has taken a look at this solution and seems to think it is workable, so I'd be interested to hear what the two of you believe should be changed here.
> >
> > The solution is workable from the SYCL RT point of view: the SYCL RT can skip setting arguments that are marked in a special way.
> > But I believe the problem @Naghasan is describing is a real problem, and it is related to the compilation toolchain.
>
> What happens in the driver (@mdtoguchi ?) with multiple device targets? Do they all share a single integration header, or is there a separate one for each?
>
> Short-term we could probably limit the LLVM DAE enablement (via the driver) to only single-target invocations. Long-term, could we have a separate integration header per target? The CFE can be changed to output multiple files if necessary for the integration header.

The integration headers are included into the application, so we would have multiple definitions of the things the headers define.

mdtoguchi · 2020-08-04T01:01:08Z

> What happens in the driver (@mdtoguchi ?) with multiple device targets? Do they all share a single integration header, or is there a separate one for each?

Current behavior is to generate a single integration header that is pulled into the host file.

erichkeane · 2020-08-04T01:47:59Z

So then, what is our solution here? How can we communicate this to the runtime? Do we just prohibit the optimization in multi-device situations? Do we prohibit different opt-settings in that case?

romanovvlad · 2020-08-04T10:34:07Z

> So then, what is our solution here? How can we communicate this to the runtime? Do we just prohibit the optimization in multi-device situations? Do we prohibit different opt-settings in that case?

As an idea, we could change the approach and embed the "omit array" into the device image as metadata. But that requires more changes in the RT, because at the place where we currently collect arguments we don't know which specific device binary will be used.
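One way to picture the metadata idea is the sketch below. It assumes each device image carries its own per-kernel omit mask and that the runtime consults it only after an image has been selected; `DeviceImage`, `OmitMasks`, and `omitMaskFor` are hypothetical names, not existing runtime APIs.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical device image: the omit information travels with the binary
// instead of living in the single, shared integration header.
struct DeviceImage {
  std::string Target;
  std::map<std::string, std::vector<bool>> OmitMasks; // kernel name -> mask
};

// Looks up the mask for a kernel in the image that was actually selected.
// A kernel with no entry gets an empty mask, meaning "set every argument".
std::vector<bool> omitMaskFor(const DeviceImage &Image,
                              const std::string &KernelName) {
  auto It = Image.OmitMasks.find(KernelName);
  if (It == Image.OmitMasks.end())
    return {};
  return It->second;
}
```

With this layout, two images built with different optimization settings can disagree about which arguments were eliminated without conflicting, at the cost of deferring argument collection until the image is known.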

srividya-sundaram added 4 commits July 28, 2020 15:57

[SYCL] Kernel parameters optimization

88c2708

Remove extra lines

cc96031

wip

0b6a546

Merge branch 'sycl' of https://github.com/srividya-sundaram/llvm into…

3b7e3d7

… unused-args-elimination

srividya-sundaram requested review from elizabethandrews, Fznamznon, premanandrao and a team as code owners July 31, 2020 18:21

srividya-sundaram requested review from rbegam and erichkeane July 31, 2020 18:21

erichkeane reviewed Jul 31, 2020

srividya-sundaram force-pushed the kernel-arg-optimization branch from ac56302 to 4a1cf73 July 31, 2020 22:40

erichkeane reviewed Jul 31, 2020

clang/lib/Sema/SemaSYCL.cpp (outdated, resolved)

erichkeane reviewed Jul 31, 2020

srividya-sundaram changed the title from "Kernel arg optimization" to "[SYCL] Enable parameter optimization for SYCL kernels" Aug 1, 2020

srividya-sundaram force-pushed the kernel-arg-optimization branch from 1d93ea0 to 4f36f61 August 1, 2020 16:24

erichkeane reviewed Aug 1, 2020

clang/lib/Sema/SemaSYCL.cpp (outdated, resolved)

erichkeane reviewed Aug 1, 2020

[SYCL] Enable SYCL integration header support with kernel arg optimiz…

b800ac2

…ation table

srividya-sundaram force-pushed the kernel-arg-optimization branch from 4f36f61 to b800ac2 August 2, 2020 02:19

clang-format-fix

f41a03f

srividya-sundaram requested a review from erichkeane August 2, 2020 02:30

romanovvlad reviewed Aug 3, 2020

sycl/include/CL/sycl/detail/kernel_desc.hpp (outdated, resolved)

Fznamznon reviewed Aug 3, 2020

clang/lib/Sema/SemaSYCL.cpp (outdated, resolved)

clang/lib/Sema/SemaSYCL.cpp (outdated, resolved)

clang/lib/Sema/SemaSYCL.cpp (outdated, resolved)

erichkeane reviewed Aug 3, 2020

clang/lib/Sema/SemaSYCL.cpp (outdated, resolved)

clang/lib/Sema/SemaSYCL.cpp (resolved)

clang/lib/Sema/SemaSYCL.cpp (outdated, resolved)

sycl/include/CL/sycl/detail/kernel_desc.hpp (outdated, resolved)

Review fixes

adb0a54

srividya-sundaram requested review from Fznamznon and romanovvlad August 3, 2020 14:04

erichkeane previously approved these changes Aug 3, 2020

Fznamznon reviewed Aug 3, 2020

clang/include/clang/Sema/Sema.h (outdated, resolved)

Update comment

45a33b0

srividya-sundaram dismissed erichkeane’s stale review via 45a33b0 August 3, 2020 15:34

srividya-sundaram requested a review from Fznamznon August 3, 2020 15:36

Fznamznon approved these changes Aug 3, 2020

srividya-sundaram requested a review from erichkeane August 3, 2020 15:38

keryell reviewed Aug 3, 2020

bader mentioned this pull request Aug 4, 2020

[DAE][SYCL] Enable DAE in SYCL kernel functions #2226

Merged

srividya-sundaram closed this Aug 4, 2020

srividya-sundaram deleted the kernel-arg-optimization branch August 4, 2020 16:12

Chenyang-L pushed a commit that referenced this pull request Feb 18, 2025

Merge pull request #2236 from againull/fix_attr

1b0fe32

[L0] Fix UR_KERNEL_INFO_ATTRIBUTES returned value
