
What about: mixed-size atomic accesses #345

Open

Description (@RalfJung, Member)

So it looks like mixed-size accesses made the rounds on Twitter again recently, which got me thinking about them. rust-lang/rust#97516 carefully chose not to talk about them (and anyway it's not clear that that is the right place to go into more detail about them). So what do we want to do with them in Rust?

Some facts:

  • In C++, you cannot convert a &uint16_t into a reference to an array "because no such array exists at that location in memory"; they insist that memory is strongly typed. This means they don't even have to talk about mixed-size accesses. It also means they are ignoring a large fraction of the programs out there but I guess they are fine with that. We are not. ;)

  • Apparently the x86 manual says you "should" not do this: "Software should access semaphores (shared memory used for signalling between multiple processors) using identical addresses and operand lengths." It is unclear what "should" means (or what anything else here really means, operationally speaking...)

  • In Rust, it is pretty much established that you can safely turn a &mut u16 into a &mut [u8; 2]. So you can do something where you start with a &mut AtomicU16, do some atomic things to it, then use the above conversion and get a &mut AtomicU8 and do atomic things with that -- a bona fide mixed-size atomic access.

    However, this means that there is a happens-before edge between all the 16-bit accesses and the 8-bit accesses. Having the &mut means that there is synchronization. I hope this means not even Intel disagrees with this, but since literally none of the words in that sentence is defined, who knows.
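The conversion described in the last bullet can be sketched as follows. This is only an illustration under stated assumptions: the raw pointer cast stands in for what bytemuck does safely, `AtomicU8::from_ptr` (stable since Rust 1.75) stands in for the unstable `Atomic*::from_mut`, and the function name `word_then_bytes` is made up for this example.

```rust
use std::sync::atomic::{AtomicU16, AtomicU8, Ordering};

/// Stores `value` with a 16-bit atomic, then reads the same memory back
/// through two 8-bit atomics. Returns the bytes in memory order.
fn word_then_bytes(value: u16) -> [u8; 2] {
    let mut word = AtomicU16::new(0);
    word.store(value, Ordering::Relaxed);

    // `get_mut` needs `&mut AtomicU16`, so no other thread can be
    // accessing `word` from here on.
    let raw: &mut u16 = word.get_mut();

    // SAFETY: `u16` and `[u8; 2]` have the same size, `[u8; 2]` has
    // alignment 1, and every bit pattern is valid for both types.
    let bytes: &mut [u8; 2] = unsafe { &mut *(raw as *mut u16 as *mut [u8; 2]) };
    let [b0, b1] = bytes;

    // SAFETY: each pointer comes from a live, exclusive `&mut u8`, and
    // `AtomicU8` has the same size and alignment as `u8`.
    let (a0, a1) = unsafe { (AtomicU8::from_ptr(b0), AtomicU8::from_ptr(b1)) };

    [a0.load(Ordering::Relaxed), a1.load(Ordering::Relaxed)]
}
```

Calling `word_then_bytes(0x0102)` yields the two bytes of `0x0102` in the target's native byte order. The 8-bit accesses are ordered after the 16-bit store purely because of the `&mut` borrow.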

So... it seems like the most restrictive thing we can say, without disallowing code that you can already write entirely safely using bytemuck, is that

  • it is allowed to do differently-sized atomic accesses to the same location over time,
  • but only if any two not-perfectly-overlapping accesses are completely synchronized through other means (i.e., it is not these accesses themselves that add a happens-before edge, there already exists a happens-before edge through other accesses).
  • Any other kind of mixed-size access is UB.
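A sketch of a program that stays within the rule proposed above, assuming thread joins as the "other means" of synchronization. The function name `phased_access` and the two-phase structure are illustrative choices, not something taken from the discussion.

```rust
use std::sync::atomic::{AtomicU16, AtomicU8, Ordering};
use std::thread;

/// Phase 1 races on the location with 16-bit atomics only; phase 2 races
/// on its two bytes with 8-bit atomics only. Returns the final bytes.
fn phased_access(per_thread: u16) -> [u8; 2] {
    let mut word = AtomicU16::new(0);

    // Phase 1: four threads, all using the same 16-bit access size.
    thread::scope(|s| {
        for _ in 0..4 {
            s.spawn(|| {
                for _ in 0..per_thread {
                    word.fetch_add(1, Ordering::Relaxed);
                }
            });
        }
    });
    // `thread::scope` joins every spawned thread before returning, so all
    // 16-bit accesses happen-before everything below. The edge comes from
    // the joins, not from the atomic accesses themselves.

    // Phase 2: split the location into two 8-bit atomics.
    let base = word.get_mut() as *mut u16 as *mut u8;
    // SAFETY: `base` and `base + 1` point into the live `u16`, are valid
    // for reads and writes, and `AtomicU8` has alignment 1.
    let (b0, b1) = unsafe { (AtomicU8::from_ptr(base), AtomicU8::from_ptr(base.add(1))) };
    thread::scope(|s| {
        s.spawn(|| b0.fetch_or(0x80, Ordering::Relaxed));
        s.spawn(|| b1.fetch_or(0x80, Ordering::Relaxed));
    });

    [b0.load(Ordering::Relaxed), b1.load(Ordering::Relaxed)]
}
```

Under the proposed rule, the variant that would be UB is the one where a 16-bit access and an 8-bit access to the same bytes can run without such a pre-existing happens-before edge between them.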

Cc @workingjubilee @m-ou-se @Amanieu @cbeuw @thomcc

Activity

m-ou-se (Member) commented on Jul 2, 2022

That all matches my understanding as well.

(It's part of the reason why I added Atomic*::from_mut.)

> However, this means that there is a happens-before edge between all the 16-bit accesses and the 8-bit accesses. Having the &mut means that there is synchronization. I hope this means not even Intel disagrees with this, but since literally none of the words in that sentence is defined, who knows.

If they would disagree with that, that'd basically imply that after using some memory for an atomic operation, you can never re-use that memory again. E.g. deallocating a Box would be unsafe, and so would be a stack-allocated AtomicU16 that goes out of scope.

They don't say it very clearly, but I don't see how their no-mixed-sizes rule can apply to anything other than atomic operations on the same memory that race with each other.

> It is unclear what "should" means (or what anything else here really means, operationally speaking...)

Yeah, it could very well turn out that "should" just means "for performance", and that it has nothing to do with correctness. They're not very clear.

> So... it seems like the most restrictive thing we can say [..]

That seems like exactly the right thing to say, and matches what you can do in safe Rust (if we include the unstable Atomic*::from_mut).

I don't think it's impossible that this might be less restrictive in the future, if we find more reasons to believe that racing mixed-size atomic operations will work on all platforms.

> In C++, you cannot convert a &uint16_t into a reference to an array

Converting uint16_t* to a char* however is fine, e.g. to memset or memcpy into a uint16_t or struct, etc.

In C++20, you can also have a struct X { int a; int b; } and create an std::atomic_ref<X> first, and an std::atomic_ref<int> to one of the fields later.

In atomics.ref.generic#general-3, they clearly specify mixed-size accesses in the same way as us:

> The lifetime ([basic.life]) of an object referenced by *ptr shall exceed the lifetime of all atomic_refs that reference the object. While any atomic_ref instances exist that reference the *ptr object, all accesses to that object shall exclusively occur through those atomic_ref instances. No subobject of the object referenced by atomic_ref shall be concurrently referenced by any other atomic_ref object.

(Emphasis mine.)
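A rough Rust analogue of that "no concurrent subobject reference" requirement can lean on the borrow checker. The wrapper type `Word64` and its methods are hypothetical names for this sketch; nothing like it is proposed in the thread.

```rust
use std::sync::atomic::{AtomicU32, AtomicU64, Ordering};

/// Hypothetical wrapper: hands out either one 64-bit atomic view or two
/// 32-bit atomic views of the same memory, never both at the same time.
pub struct Word64 {
    inner: AtomicU64,
}

impl Word64 {
    pub fn new(value: u64) -> Self {
        Self { inner: AtomicU64::new(value) }
    }

    /// 64-bit view. Taking `&mut self` ends any earlier `halves` view.
    pub fn whole(&mut self) -> &AtomicU64 {
        &self.inner
    }

    /// Two 32-bit views. Taking `&mut self` ends any earlier `whole` view.
    pub fn halves(&mut self) -> [&AtomicU32; 2] {
        let base = self.inner.get_mut() as *mut u64 as *mut u32;
        // SAFETY: both pointers lie inside the live `u64`, are 4-byte
        // aligned, and the returned references borrow `self`.
        unsafe { [AtomicU32::from_ptr(base), AtomicU32::from_ptr(base.add(1))] }
    }
}

/// Writes through the 64-bit view, then reassembles the value from the
/// two 32-bit views (native byte order).
fn whole_then_halves(value: u64) -> u64 {
    let mut w = Word64::new(0);
    w.whole().store(value, Ordering::Relaxed);

    let [h0, h1] = w.halves();
    let mut bytes = [0u8; 8];
    bytes[..4].copy_from_slice(&h0.load(Ordering::Relaxed).to_ne_bytes());
    bytes[4..].copy_from_slice(&h1.load(Ordering::Relaxed).to_ne_bytes());
    u64::from_ne_bytes(bytes)
}
```

Each view can still be shared across threads (both atomic types are `Sync`), but holding a `whole` view while calling `halves` is rejected at compile time because both methods borrow the wrapper mutably.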

Amanieu (Member) commented on Jul 2, 2022

ARM's memory model (in the section: The AArch64 Application Level Memory Model) seems to fully support mixed-sized atomic accesses.

RalfJung (Member, Author) commented on Jul 2, 2022

> If they would disagree with that, that'd basically imply that after using some memory for an atomic operation, you can never re-use that memory again. E.g. deallocating a Box would be unsafe, and so would be a stack-allocated AtomicU16 that goes out of scope.

Ah, good point. I had forgotten that hardware memory models do not have provenance. 😂

> That seems like exactly the right thing to say, and matches what you can do in safe Rust (if we include the unstable Atomic*::from_mut).

.. and include bytemuck

> Converting uint16_t* to a char* however is fine, e.g. to memset or memcpy into a uint16_t or struct, etc.

Isn't that a C thing? Though C++ might have something similar with std::byte.
But anyway that's a non-atomic type.

> In C++20, you can also have a struct X { int a; int b; } and create an std::atomic_ref<X> first, and an std::atomic_ref<int> to one of the fields later.

Oh, good point. So in some sense this is actually already all covered by rust-lang/rust#97516.

bjorn3 (Member) commented on Jul 2, 2022

> > If they would disagree with that, that'd basically imply that after using some memory for an atomic operation, you can never re-use that memory again. E.g. deallocating a Box would be unsafe, and so would be a stack-allocated AtomicU16 that goes out of scope.
>
> Ah, good point. I had forgotten that hardware memory models do not have provenance. 😂

When reusing memory it is undef, right? Furthermore deallocation requires some kind of synchronization with every thread that has ever accessed it using atomic operations. Together I would assume this is enough to provide consistency by "resetting" the state witnessing that it was accessed using atomic operations of a different size.

m-ou-se (Member) commented on Jul 2, 2022

> Though C++ might have something similar with std::byte.

C++ allows aliasing through char, unsigned char and std::byte: https://wg21.link/basic.lval#11.3

> std::atomic_ref

To add to my previous comment: one of the reasons why C++'s atomic_ref doesn't allow mixed size / overlapping operations, is that it supports objects of any size. If it gets too big for native atomic instructions, it uses a mutex instead, which is probably stored in some kind of global table indexed by the address of the object. It's not completely clear whether it's necessary to be as restrictive when limited to only natively supported atomic operations, like in Rust.
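To make the lock-table concern concrete, here is a hypothetical sketch of an address-indexed lock table. The table size, the hash in `stripe_of`, and all names are assumptions for illustration and are not taken from any real `atomic_ref` implementation.

```rust
use std::sync::Mutex;

const STRIPES: usize = 64;
const UNLOCKED: Mutex<()> = Mutex::new(());
/// Global table of locks, selected by the address of the guarded object.
static LOCKS: [Mutex<()>; STRIPES] = [UNLOCKED; STRIPES];

/// Hypothetical hash: one stripe per 8-byte granule of address space.
fn stripe_of<T>(object: &T) -> usize {
    (object as *const T as usize / 8) % STRIPES
}

#[repr(C)]
struct Pair {
    a: u64,
    b: u64,
}

/// Takes the lock guarding the whole `Pair`, then checks whether the lock
/// guarding its field `b` can still be acquired at the same time.
fn field_escapes_object_lock(pair: &Pair) -> bool {
    let _whole = LOCKS[stripe_of(pair)].lock().unwrap();
    // `try_lock` only succeeds if `b` hashes to a different mutex.
    let field_lock_free = LOCKS[stripe_of(&pair.b)].try_lock().is_ok();
    field_lock_free
}
```

With this hash, `pair` and `pair.b` always land on different stripes, so a lock-based "atomic" operation on the whole object would not exclude one on the field. That is one way to see why overlapping references are ruled out when the implementation may fall back to locks.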

chorman0773 (Contributor) commented on Apr 3, 2023

> Apparently the x86 manual says you "should" not do this: "Software should access semaphores (shared memory used for signalling between multiple processors) using identical addresses and operand lengths." It is unclear what "should" means (or what anything else here really means, operationally speaking...)

I think we can interpret "should" as "It's undefined, by spec, though it works in practice b/c some important people rely on it, but please don't, we want to do fast things".

Therefore, the mixed size access should be considered undefined by Rust, as we expect to be able to compile to x86 where it is undefined.

cbeuw commented on Apr 6, 2023

Regarding x86, I got the following from @thiagomacieira which is very helpful

Even on Intel processors (note: I work for Intel and this is an area I am very familiar
with) you can mix different-sized operations and still retain atomicity,
provided you obey some rules:

  • never cross a cacheline boundary (preferably, align naturally)
  • use operations of 16 bytes or less (ideally: use only 1-uop operations)
    • for all current in-market processors and I believe this applies to AMD
    • for all current P-core processors: 32- and 64-byte accesses are also fine

Note: I don't know how valid this is for the new RAO instructions, but they
should be ok for CMPccXADD.

In fact, I've seen a few codebases that do use the fact that they can mix two atomic_uint32_t with an overlapping atomic_uint64_t, at least when it comes to interfacing with the Linux kernel 32-bit futex support, see
https://codebrowser.dev/glibc/glibc/nptl/sem_waitcommon.c.html#do_futex_wait
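The address arithmetic behind such an overlap can be sketched in Rust. The layout here (a count in the low 32 bits of a 64-bit word) and the names `VALUE_HALF`, `value_half`, and `low_half_via_u32` are assumptions for illustration, not a transcription of the linked code. The sketch takes `&mut AtomicU64`, so it stays within the non-racing case discussed above; whether the concurrent version can be allowed is exactly the open question.

```rust
use std::sync::atomic::{AtomicU32, AtomicU64, Ordering};

/// Index of the 32-bit half that holds the low 32 bits of a `u64`,
/// which depends on the target's byte order.
const VALUE_HALF: usize = if cfg!(target_endian = "little") { 0 } else { 1 };

/// 32-bit atomic view of the low half of a 64-bit atomic. A 32-bit futex
/// wait would be pointed at this address.
fn value_half(data: &mut AtomicU64) -> &AtomicU32 {
    let base = data.get_mut() as *mut u64 as *mut u32;
    // SAFETY: the pointer stays inside the live `u64`, is 4-byte aligned,
    // and the returned reference borrows `data`.
    unsafe { AtomicU32::from_ptr(base.add(VALUE_HALF)) }
}

/// Reads the low 32 bits of `word` through the 32-bit view.
fn low_half_via_u32(word: u64) -> u32 {
    let mut data = AtomicU64::new(word);
    value_half(&mut data).load(Ordering::Relaxed)
}
```

For example, `low_half_via_u32(0x0000_0003_0000_0007)` reads back `7` regardless of endianness, because `VALUE_HALF` selects the half that holds the low bits.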

RalfJung (Member, Author) commented on Apr 10, 2023

> use operations of 16 bytes or less (ideally: use only 1-uop operations)

So this means there are atomic 256-bit accesses, but doing size mixing with those is a problem? (Not sure what "1-uop operations" are.)

chorman0773 (Contributor) commented on Apr 10, 2023

thiagomacieira commented on Apr 10, 2023

SSE loads and stores are atomic. AVX 256- and 512-bit loads and stores are atomic on P-core processors, but not E-core (the 256-bit operation is cracked into two 128-bit operations and therefore not atomic).

There's no RMW SIMD. The best you get is a merge-store, and I'm confident that's atomic on P-core, but I doubt it is on E-core. Therefore, SIMD atomics are very limited, since you can only use them for loads and stores. The most useful thing this could be used for is to load 16 bytes atomically and use CMPXCHG16B to store, but it's still limited and somewhat slow due to the transfer between register files. Seqlocks are more flexible.
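For reference, a minimal seqlock sketch for a 16-byte payload. It assumes a single writer, keeps the payload in two relaxed `AtomicU64` fields so that racing reads are not data races in Rust, and uses only same-sized accesses per location. The type name `SeqLock128` is made up for this example.

```rust
use std::sync::atomic::{fence, AtomicU64, AtomicUsize, Ordering};

/// 16-byte value protected by a sequence counter (single writer assumed).
pub struct SeqLock128 {
    seq: AtomicUsize, // odd while a write is in progress
    lo: AtomicU64,
    hi: AtomicU64,
}

impl SeqLock128 {
    pub const fn new(lo: u64, hi: u64) -> Self {
        Self {
            seq: AtomicUsize::new(0),
            lo: AtomicU64::new(lo),
            hi: AtomicU64::new(hi),
        }
    }

    /// Publishes a new value. Must only be called from one thread at a time.
    pub fn write(&self, lo: u64, hi: u64) {
        let s = self.seq.load(Ordering::Relaxed);
        self.seq.store(s.wrapping_add(1), Ordering::Relaxed); // now odd
        fence(Ordering::Release); // order the odd marker before the data stores
        self.lo.store(lo, Ordering::Relaxed);
        self.hi.store(hi, Ordering::Relaxed);
        self.seq.store(s.wrapping_add(2), Ordering::Release); // even again
    }

    /// Returns a consistent `(lo, hi)` pair, retrying if a write interfered.
    pub fn read(&self) -> (u64, u64) {
        loop {
            let before = self.seq.load(Ordering::Acquire);
            let lo = self.lo.load(Ordering::Relaxed);
            let hi = self.hi.load(Ordering::Relaxed);
            fence(Ordering::Acquire); // order the data loads before the re-check
            let after = self.seq.load(Ordering::Relaxed);
            if before == after && before % 2 == 0 {
                return (lo, hi);
            }
            std::hint::spin_loop();
        }
    }
}
```

Readers never block the writer; they simply retry when the counter changed or was odd, which is what makes this pattern usable for payloads wider than any native atomic.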

19 remaining items


Labels: C-open-question (Category: An open question that we should revisit)


Participants: @comex, @Amanieu, @RalfJung, @m-ou-se, @tmandry
