Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CIR][CodeGen][Lowering] Better handling of alloca address space with unified AS #682

Merged
merged 4 commits into from
Aug 1, 2024

Conversation

seven-mile
Copy link
Collaborator

@seven-mile seven-mile commented Jun 13, 2024

TargetCodeGenInfo::getASTAllocaAddressSpace is a bad design because it requires the targets return LangAS, which enforce the targets to consider which languages could be their upstream and unnecessarily complicate the target info. Unified AS in CIR is a better abstraction level for this kind of target info.

Apart from that, the languages also has some requirements on the result address space of alloca.

void func() {
  int x;
  // Here, the AS of pointer `&x` is the alloca AS defined by the language.
}

When we have inconsistency between the alloca AS defined by the language and the one from target info, we have to perform addrspacecast from the target one to the language one.

This PR includes

  • New method CGM.getLangTempAllocaAddressSpace which explicitly specifies the alloca address space defined by languages. It replaces the vague LangAS::Default in the AS comparisons from OG CodeGen.
  • Replace getASTAllocaAddressSpace with getCIRAllocaAddressSpace, which returns CIR unified AS.
  • Also use getCIRAllocaAddressSpace as the result address space of buildAlloca.
  • Fix the lowering of cir.alloca operation to be address-space-aware.

We don't perform any addrspacecast in this PR, i.e. all the related code paths still remain NYI and it's fine. That's because we don't even have a (language, target) pair holding the inconsistency.

After these changes, in the previous OpenCL testcases we will see all the alloca pointers turning into private AS, without any addrspacecast.

@seven-mile
Copy link
Collaborator Author

seven-mile commented Jun 13, 2024

  • Why not query DataLayout in the builder?

No suitable way to access it. The builder should be stateless, but the data layout is stored in the module op. Another approach is to cache the alloca address space in the TypeCache, which is already done for ASTAllocaAddressSpace. Even more trickily, consider to use typeCache.AllocaInt8PtrTy.getAddrSpace() directly. I don't think it's a good idea though.

Update: Julian suggested me to find out the module op by insertion point. After trying a bit, some tests (10% or so) were failing because the current block is unlinked (parentOp is null) at that time. @bcardosolopes Do you know what's going on in this situation?

  • Why not also fix the occurrence in LoweringPrepare?

Let's wait for the ABI related works to get the data layout available anytime. Then we can probably have a standard way to access it from later passes including LoweringPrepare. By the way, I'm not sure which feature flag should I use here. A new one about data layout or the current addressSpaceInAlloca?

  • Why no test cases?

See discussions in #671 and #672 . If there's any idea for test cases, I'm very willing to add it.

  • Where is the logic of addrspace casting?

This is a cyclic dependency. It is the direct dependency of #671, we can only test these paths under the SPIR-V target. Therefore I prefer to defer this part to #671 . You can preview the changes here.

@seven-mile seven-mile force-pushed the non-default-addrspace-alloca branch from 4d50e89 to 94eabdd Compare June 13, 2024 06:36
@jopperm
Copy link
Contributor

jopperm commented Jun 13, 2024

  • Why no test cases?

See discussions in #671 and #672 . If there's any idea for test cases, I'm very willing to add it.

The problem here is that we'd like to test that the data layout's alloca address space is reflected in the AllocaOp's pointer types, but we don't support a target yet for which that AS is != 0 (and hence would be distinguishable in the generated IR).

@@ -526,6 +527,7 @@ static void lowerArrayDtorCtorIntoLoop(CIRBaseBuilderTy &builder,
mlir::Value end = builder.create<mlir::cir::PtrStrideOp>(
loc, eltTy, begin, numArrayElementsConst);

assert(!::cir::MissingFeatures::addressSpaceInAlloca());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? How? What kind of testcase do you need to cover? This method doesn't exist in OG codegen, please add a comment saying what exactly needs to change when this is implemented in the future.

@@ -417,10 +417,14 @@ class CIRGenBuilderTy : public CIRBaseBuilderTy {

// Fetch the type representing a pointer to unsigned int values.
mlir::cir::PointerType getUInt8PtrTy(unsigned AddrSpace = 0) {
return typeCache.UInt8PtrTy;
if (AddrSpace == 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please elaborate on this change and add comments to explain your intent, the way it is right now it's not really possible for anyone to figure out what's the future intent of what needs to be done.

@@ -100,6 +100,16 @@ class CIRDataLayout {
getPointerTypeSizeInBits(Ty), false);
return IntTy;
}

unsigned getAllocaMemorySpace() const {
mlir::Attribute memorySpaceAttr = layout.getAllocaMemorySpace();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is better than a helper method in the operation, thanks

@bcardosolopes
Copy link
Member

@bcardosolopes Do you know what's going on in this situation?

The only reliable way to getParentOf is through something you are 100% sure is an operation, if you are coming from a mlir::Value, the defining op is null in case it's a block argument.

Let's wait for the ABI related works to get the data layout available anytime. Then we can probably have a standard way to access it from later passes including LoweringPrepare. By the way, I'm not sure which feature flag should I use here. A new one about data layout or the current addressSpaceInAlloca?

This seems good. I don't think lowering prepare should be involved here (at least for the context of this PR).

  • Why no test cases?

See discussions in #671 and #672 . If there's any idea for test cases, I'm very willing to add it.

Please rename the PR with "[NFC]" on it then.

I was looking again into the PR where you introduced the address space to pointers, and I think the address space should be optional instead of making the assumption that the default one in "0". This would alleviate some of the assumptions we need to make in general - note that in the source language this is done through an attribute, and we probably don't want to over tag things that were never written in source code.

From @jopperm comments:

The problem here is that we'd like to test that the data layout's alloca address space is reflected in the AllocaOp's pointer types, but we don't support a target yet for which that AS is != 0 (and hence would be distinguishable in the generated IR).

I believe optional address space should help with this distinction. Additionally, if we forget about OpenCL, most of regular C/C++ doesn't care about address space, and we can spare of not having to tie up an integer attribute for something we don't care about.

Before I land this PR, I'd like us to solve this discussion.

Where is the logic of addrspace casting?

This is a cyclic dependency. It is the direct dependency of #671, we can only test these paths under the SPIR-V target. Therefore I prefer to defer this part to #671 . You can preview the changes here.

It's fine if that comes later, but keep in mind that if it's blocking you for longer it might be a good idea to sync with other people and maybe make it happen yourself.

@seven-mile
Copy link
Collaborator Author

Good idea! Thanks for helping me out. The key insight here is to accept the abstraction from the frontend partially. Let me phrase it more precisely as my understanding: Distinguish LangAS::Default from LangAS::FirstTargetAddressSpace in CIR, by letting the former become a pointer type without addrspace attribute.

Further discussion:

  • Abstraction shift
    This solution should work flawlessly. But I still doubt that it's capturing the semantic of address space qualifiers. Is it expected to emit the address space casting from cir.ptr<T> to cir.ptr<T, addrspace(0)>? When creating alloca operations for SPIR-V, it will happen.
    And from the point of view of the final CIR, we can also state that this strategy is just adding an additional mark called addrspace(0) for LangAS::FirstTargetAddressSpace. Equivalent but sounds much more ad-hoc.
  • Possibility to use enum as representation
    IMHO This does not make that much sense, because the frontend already contains abstraction leak (TargetAS in LangAS). We can hardly do better as its consumer.

@bcardosolopes
Copy link
Member

But I still doubt that it's capturing the semantic of address space qualifiers. Is it expected to emit the address space casting from cir.ptr<T> to cir.ptr<T, addrspace(0)>? When creating alloca operations for SPIR-V, it will happen.

If we encode default as non-present than this type of cast should be allowed.

And from the point of view of the final CIR, we can also state that this strategy is just adding an additional mark called addrspace(0) for LangAS::FirstTargetAddressSpace.

Well, it's not like you have proposed a non-adhoc solution yet. I'm trying to poke you for proposing alternatives, but all you seem to be making is assumptions instead of proposing some design.

Equivalent but sounds much more ad-hoc.

Still better than current proposed.

IMHO This does not make that much sense, because the frontend already contains abstraction leak (TargetAS in LangAS). We can hardly do better as its consumer.

This is ClangIR, not GenericIR, if we need to encode Clang specific information, and that makes sense for our representation, that's completely reasonable.

Take a step back and look at the definitions:

enum class LangAS : unsigned {
  // The default value 0 is the value used in QualType for the situation
  // where there is no address space qualifier.
  Default = 0,

  ...
  opencl_global,
  ...
  // Wasm specific address spaces.
  wasm_funcref,

  // This denotes the count of language-specific address spaces and also
  // the offset added to the target-specific address spaces, which are usually
  // specified by address space attributes __attribute__(address_space(n))).
  FirstTargetAddressSpace
};

Few more suggestions:

Solution A

Number based:
Default -> "no address space qualifier"
opencl_global, ... -> "address_space(value_of_enum)"
FirstTargetAddressSpace / __attribute__(address_space(0))) -> "address_space(0)"
FirstTargetAddressSpace+1 / __attribute__(address_space(1))) -> "address_space(1)" (this might clash with existing enums)
...

Solution B

Enum based:
{
opencl_global,
opencl_global_device,
...
target_specific
}

Default -> "no address space qualifier"
opencl_global -> "address_space(opencl_global)"
opencl_global_device -> "address_space(opencl_global_device)"
...
__attribute__((address_space(0))) -> "address_space(target<0>)"
__attribute__((address_space(1))) -> "address_space(target<1>)"
...

For this, address_space attribute would need (a) enum kind, (b) optional value (for target enum).
Extra thoughts? Suggestions?

@bcardosolopes
Copy link
Member

Let me know when this is ready for another round of review, thanks!

@lanza lanza self-requested a review as a code owner June 21, 2024 19:50
@seven-mile seven-mile force-pushed the non-default-addrspace-alloca branch from 94eabdd to 8a3b6ca Compare June 22, 2024 16:36
@seven-mile seven-mile marked this pull request as draft June 22, 2024 16:36
bcardosolopes pushed a commit that referenced this pull request Jun 27, 2024
This PR implements the solution B as discussed in #682.

* Use the syntax `cir.ptr<T>` `cir.ptr<T, addrspace(target<0>)`
`cir.ptr<T, addrspace(opencl_private)>`
* Add a new `AddressSpaceAttr`, which is used as the new type of
addrspace parameter in `PointerType`
* `AddressSpaceAttr` itself takes one single `int64_t $value` as the
parameter
* TableGen templates to generate the conversion between `clang::LangAS
-> int64_t $value <-> text-form CIR`
@bcardosolopes
Copy link
Member

@seven-mile Should we close this draft PR / has this been covered elsewhere?

@seven-mile
Copy link
Collaborator Author

There are still alloca-AS-related changes, which should be covered by this PR. I will update it and mark it ready for review after #724 lands.

@seven-mile seven-mile force-pushed the non-default-addrspace-alloca branch from 288af75 to 024d4da Compare July 27, 2024 18:48
@seven-mile seven-mile marked this pull request as ready for review July 27, 2024 18:58
@seven-mile seven-mile changed the title [CIR][CodeGen][Dialect] Use the correct alloca address space [CIR][CodeGen] Better handling of alloca address space with unified AS Jul 27, 2024
@seven-mile
Copy link
Collaborator Author

@bcardosolopes It's been a small patch and finally ready for another round of review.😉 The PR description demonstrates how it is done now by manipulating language and unified AS.

The only functional change here is the AS attached in buildAlloca.

Comment on lines 338 to 340
/// Returns the address space for temporary alloca in the language. This makes
/// the allocated variable's address matches the expectation of AST, rather
/// than keep the address space of the alloca in CIR.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • "allocated variable's address matches" -> "allocated variable's address space match"
  • do we really "keep" the AS; isn't it rather something like "defer mapping" or so?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch, thanks~

do we really "keep" the AS; isn't it rather something like "defer mapping" or so?

Sorry for my dumbness, could you please elaborate the "defer mapping" approach?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, just a minor nit, I meant that by using the language's AS in CIR, we defer the mapping to the target-specific AS that is used for the alloca instruction later in the pipeline. We don't keep it because that alloca has not been created yet.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I got it. I've forgotten what I wrote in the comment😂Surely will fix it.

@seven-mile seven-mile changed the title [CIR][CodeGen] Better handling of alloca address space with unified AS [CIR][CodeGen][Lowering] Better handling of alloca address space with unified AS Jul 30, 2024
Copy link
Member

@bcardosolopes bcardosolopes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks for your patience! LGTM

@bcardosolopes bcardosolopes merged commit 821534c into llvm:main Aug 1, 2024
6 checks passed
Hugobros3 pushed a commit to shady-gang/clangir that referenced this pull request Oct 2, 2024
)

This PR implements the solution B as discussed in llvm#682.

* Use the syntax `cir.ptr<T>` `cir.ptr<T, addrspace(target<0>)`
`cir.ptr<T, addrspace(opencl_private)>`
* Add a new `AddressSpaceAttr`, which is used as the new type of
addrspace parameter in `PointerType`
* `AddressSpaceAttr` itself takes one single `int64_t $value` as the
parameter
* TableGen templates to generate the conversion between `clang::LangAS
-> int64_t $value <-> text-form CIR`
Hugobros3 pushed a commit to shady-gang/clangir that referenced this pull request Oct 2, 2024
… unified AS (llvm#682)

`TargetCodeGenInfo::getASTAllocaAddressSpace` is a bad design because it
requires the targets return `LangAS`, which enforce the targets to
consider which languages could be their upstream and unnecessarily
complicate the target info. Unified AS in CIR is a better abstraction
level for this kind of target info.

Apart from that, the languages also has some requirements on the result
address space of alloca.

```cpp
void func() {
  int x;
  // Here, the AS of pointer `&x` is the alloca AS defined by the language.
}
```

When we have inconsistency between the alloca AS defined by the language
and the one from target info, we have to perform `addrspacecast` from
the target one to the language one.

This PR includes
* New method `CGM.getLangTempAllocaAddressSpace` which explicitly
specifies the alloca address space defined by languages. It replaces the
vague `LangAS::Default` in the AS comparisons from OG CodeGen.
* Replace `getASTAllocaAddressSpace` with `getCIRAllocaAddressSpace`,
which returns CIR unified AS.
* Also use `getCIRAllocaAddressSpace` as the result address space of
`buildAlloca`.
* Fix the lowering of `cir.alloca` operation to be address-space-aware.

We don't perform any `addrspacecast` in this PR, i.e. all the related
code paths still remain NYI and it's fine. That's because we don't even
have a `(language, target)` pair holding the inconsistency.

After these changes, in the previous OpenCL testcases we will see all
the alloca pointers turning into private AS, without any
`addrspacecast`.
smeenai pushed a commit to smeenai/clangir that referenced this pull request Oct 9, 2024
)

This PR implements the solution B as discussed in llvm#682.

* Use the syntax `cir.ptr<T>` `cir.ptr<T, addrspace(target<0>)`
`cir.ptr<T, addrspace(opencl_private)>`
* Add a new `AddressSpaceAttr`, which is used as the new type of
addrspace parameter in `PointerType`
* `AddressSpaceAttr` itself takes one single `int64_t $value` as the
parameter
* TableGen templates to generate the conversion between `clang::LangAS
-> int64_t $value <-> text-form CIR`
smeenai pushed a commit to smeenai/clangir that referenced this pull request Oct 9, 2024
… unified AS (llvm#682)

`TargetCodeGenInfo::getASTAllocaAddressSpace` is a bad design because it
requires the targets return `LangAS`, which enforce the targets to
consider which languages could be their upstream and unnecessarily
complicate the target info. Unified AS in CIR is a better abstraction
level for this kind of target info.

Apart from that, the languages also has some requirements on the result
address space of alloca.

```cpp
void func() {
  int x;
  // Here, the AS of pointer `&x` is the alloca AS defined by the language.
}
```

When we have inconsistency between the alloca AS defined by the language
and the one from target info, we have to perform `addrspacecast` from
the target one to the language one.

This PR includes
* New method `CGM.getLangTempAllocaAddressSpace` which explicitly
specifies the alloca address space defined by languages. It replaces the
vague `LangAS::Default` in the AS comparisons from OG CodeGen.
* Replace `getASTAllocaAddressSpace` with `getCIRAllocaAddressSpace`,
which returns CIR unified AS.
* Also use `getCIRAllocaAddressSpace` as the result address space of
`buildAlloca`.
* Fix the lowering of `cir.alloca` operation to be address-space-aware.

We don't perform any `addrspacecast` in this PR, i.e. all the related
code paths still remain NYI and it's fine. That's because we don't even
have a `(language, target)` pair holding the inconsistency.

After these changes, in the previous OpenCL testcases we will see all
the alloca pointers turning into private AS, without any
`addrspacecast`.
smeenai pushed a commit to smeenai/clangir that referenced this pull request Oct 9, 2024
)

This PR implements the solution B as discussed in llvm#682.

* Use the syntax `cir.ptr<T>` `cir.ptr<T, addrspace(target<0>)`
`cir.ptr<T, addrspace(opencl_private)>`
* Add a new `AddressSpaceAttr`, which is used as the new type of
addrspace parameter in `PointerType`
* `AddressSpaceAttr` itself takes one single `int64_t $value` as the
parameter
* TableGen templates to generate the conversion between `clang::LangAS
-> int64_t $value <-> text-form CIR`
smeenai pushed a commit to smeenai/clangir that referenced this pull request Oct 9, 2024
… unified AS (llvm#682)

`TargetCodeGenInfo::getASTAllocaAddressSpace` is a bad design because it
requires the targets return `LangAS`, which enforce the targets to
consider which languages could be their upstream and unnecessarily
complicate the target info. Unified AS in CIR is a better abstraction
level for this kind of target info.

Apart from that, the languages also has some requirements on the result
address space of alloca.

```cpp
void func() {
  int x;
  // Here, the AS of pointer `&x` is the alloca AS defined by the language.
}
```

When we have inconsistency between the alloca AS defined by the language
and the one from target info, we have to perform `addrspacecast` from
the target one to the language one.

This PR includes
* New method `CGM.getLangTempAllocaAddressSpace` which explicitly
specifies the alloca address space defined by languages. It replaces the
vague `LangAS::Default` in the AS comparisons from OG CodeGen.
* Replace `getASTAllocaAddressSpace` with `getCIRAllocaAddressSpace`,
which returns CIR unified AS.
* Also use `getCIRAllocaAddressSpace` as the result address space of
`buildAlloca`.
* Fix the lowering of `cir.alloca` operation to be address-space-aware.

We don't perform any `addrspacecast` in this PR, i.e. all the related
code paths still remain NYI and it's fine. That's because we don't even
have a `(language, target)` pair holding the inconsistency.

After these changes, in the previous OpenCL testcases we will see all
the alloca pointers turning into private AS, without any
`addrspacecast`.
keryell pushed a commit to keryell/clangir that referenced this pull request Oct 19, 2024
)

This PR implements the solution B as discussed in llvm#682.

* Use the syntax `cir.ptr<T>` `cir.ptr<T, addrspace(target<0>)`
`cir.ptr<T, addrspace(opencl_private)>`
* Add a new `AddressSpaceAttr`, which is used as the new type of
addrspace parameter in `PointerType`
* `AddressSpaceAttr` itself takes one single `int64_t $value` as the
parameter
* TableGen templates to generate the conversion between `clang::LangAS
-> int64_t $value <-> text-form CIR`
keryell pushed a commit to keryell/clangir that referenced this pull request Oct 19, 2024
… unified AS (llvm#682)

`TargetCodeGenInfo::getASTAllocaAddressSpace` is a bad design because it
requires the targets return `LangAS`, which enforce the targets to
consider which languages could be their upstream and unnecessarily
complicate the target info. Unified AS in CIR is a better abstraction
level for this kind of target info.

Apart from that, the languages also has some requirements on the result
address space of alloca.

```cpp
void func() {
  int x;
  // Here, the AS of pointer `&x` is the alloca AS defined by the language.
}
```

When we have inconsistency between the alloca AS defined by the language
and the one from target info, we have to perform `addrspacecast` from
the target one to the language one.

This PR includes
* New method `CGM.getLangTempAllocaAddressSpace` which explicitly
specifies the alloca address space defined by languages. It replaces the
vague `LangAS::Default` in the AS comparisons from OG CodeGen.
* Replace `getASTAllocaAddressSpace` with `getCIRAllocaAddressSpace`,
which returns CIR unified AS.
* Also use `getCIRAllocaAddressSpace` as the result address space of
`buildAlloca`.
* Fix the lowering of `cir.alloca` operation to be address-space-aware.

We don't perform any `addrspacecast` in this PR, i.e. all the related
code paths still remain NYI and it's fine. That's because we don't even
have a `(language, target)` pair holding the inconsistency.

After these changes, in the previous OpenCL testcases we will see all
the alloca pointers turning into private AS, without any
`addrspacecast`.
lanza pushed a commit that referenced this pull request Nov 5, 2024
This PR implements the solution B as discussed in #682.

* Use the syntax `cir.ptr<T>` `cir.ptr<T, addrspace(target<0>)`
`cir.ptr<T, addrspace(opencl_private)>`
* Add a new `AddressSpaceAttr`, which is used as the new type of
addrspace parameter in `PointerType`
* `AddressSpaceAttr` itself takes one single `int64_t $value` as the
parameter
* TableGen templates to generate the conversion between `clang::LangAS
-> int64_t $value <-> text-form CIR`
lanza pushed a commit that referenced this pull request Nov 5, 2024
… unified AS (#682)

`TargetCodeGenInfo::getASTAllocaAddressSpace` is a bad design because it
requires the targets return `LangAS`, which enforce the targets to
consider which languages could be their upstream and unnecessarily
complicate the target info. Unified AS in CIR is a better abstraction
level for this kind of target info.

Apart from that, the languages also has some requirements on the result
address space of alloca.

```cpp
void func() {
  int x;
  // Here, the AS of pointer `&x` is the alloca AS defined by the language.
}
```

When we have inconsistency between the alloca AS defined by the language
and the one from target info, we have to perform `addrspacecast` from
the target one to the language one.

This PR includes
* New method `CGM.getLangTempAllocaAddressSpace` which explicitly
specifies the alloca address space defined by languages. It replaces the
vague `LangAS::Default` in the AS comparisons from OG CodeGen.
* Replace `getASTAllocaAddressSpace` with `getCIRAllocaAddressSpace`,
which returns CIR unified AS.
* Also use `getCIRAllocaAddressSpace` as the result address space of
`buildAlloca`.
* Fix the lowering of `cir.alloca` operation to be address-space-aware.

We don't perform any `addrspacecast` in this PR, i.e. all the related
code paths still remain NYI and it's fine. That's because we don't even
have a `(language, target)` pair holding the inconsistency.

After these changes, in the previous OpenCL testcases we will see all
the alloca pointers turning into private AS, without any
`addrspacecast`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants