Reading Pointer bytes as Integers #547

Closed
@chorman0773

Description

@chorman0773
Contributor

This came up in rust-lang/reference#1664. I wanted to ask what T-opsem thinks about the behaviour of reading pointer bytes as integer types (or as char/bool/etc.).

As far as I can tell, there are two "sensible" behaviours, given that integers themselves do not carry provenance:

  • The pointer fragment is ignored,
  • Decoding error (thus undefined behaviour).

Given provenance monotonicity, which would be violated by the decoding error, it seems like the best option is that the fragments are ignored. Is there anything missed here? If not, can we get a formal sign-off on this behaviour?

Note that I'm only considering the runtime behaviour, which can be a point against adopting the behaviour. Given that it's impossible to get the address of certain pointers in const-eval, it does need to be undefined behaviour (or otherwise an error) to read pointer bytes (to at least symbolic allocations) as integer types.
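For concreteness, here is a minimal sketch of the runtime operation in question. The function name `read_pointer_bytes_as_usize` is hypothetical and exists only for illustration. Whether the read is defined is exactly what this issue asks; the sketch assumes the "fragment is ignored" option, under which the result is the pointer's address with no provenance attached.

```rust
use std::mem::size_of;

/// Hypothetical helper: loads the bytes of a stored pointer at integer type.
fn read_pointer_bytes_as_usize(slot: &*const u8) -> usize {
    // Reinterpret the location that holds a pointer as a location holding a usize.
    let as_int: *const usize = (slot as *const *const u8).cast();
    // SAFETY (assumed, under the "fragment is ignored" option only): the slot is
    // initialized, suitably aligned, and has the same size as usize.
    unsafe { as_int.read() }
}

fn main() {
    // Thin pointers and usize have the same size.
    assert_eq!(size_of::<*const u8>(), size_of::<usize>());

    let value = 42u8;
    let ptr: *const u8 = &value;
    let n = read_pointer_bytes_as_usize(&ptr);
    println!("pointer bytes read as usize: {:#x}", n);
}
```

Under the "decoding error" option, the `read` call above would instead be undefined behaviour.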

Activity

saethlin

saethlin commented on Dec 6, 2024

@saethlin
Member

> Given that it's impossible to get the address of certain pointers

Which pointers?

chorman0773

chorman0773 commented on Dec 6, 2024

@chorman0773
Contributor, Author

I failed to clarify that. It was referring to the consteval AM, where allocations that exist outside of the particular constant evaluation (what I call symbolic pointers) can't be assigned an address.

RalfJung

RalfJung commented on Dec 6, 2024

@RalfJung
Member

Const-eval can't assign an address to any allocation, "inside" or "outside". (Not sure what you mean with that distinction.)

RalfJung

RalfJung commented on Dec 6, 2024

@RalfJung
Member

Anyway that sub-discussion seems off-topic here, please move it to Zulip. And please update the issue description to clarify that "certain pointers" refers to const-eval.

chorman0773

chorman0773 commented on Dec 16, 2024

@chorman0773
Contributor, Author

I suppose the third alternative that should be addressed is that the read exposes the pointer bytes, but I don't like that suggestion (and I recall few people did), as it means that reads can result in a side effect, and such reads as an integer type can never be elided.

Is there any other alternative I'm missing?

RalfJung

RalfJung commented on Dec 17, 2024

@RalfJung
Member

Yeah I definitely don't like that suggestion, it pessimizes optimization too much. It is worth mentioning that that third alternative is basically what PNVI-ae-udi mandates for C. I am curious if compilers will actually implement that, though.
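To illustrate what "exposing" means here, the sketch below uses the explicit `as` casts that already expose provenance in Rust today. The function name is hypothetical. The third alternative would make every integer-typed load of pointer bytes behave like the first cast, which is why such loads could never be elided.

```rust
/// Hypothetical helper: round-trips a pointer through an integer using casts.
fn roundtrip_via_exposed_cast(x: &i32) -> i32 {
    let ptr: *const i32 = x;
    // A pointer-to-integer `as` cast is a side effect: it marks the
    // provenance of `ptr` as exposed.
    let addr = ptr as usize;
    // An integer-to-pointer `as` cast may pick up previously exposed provenance.
    let back = addr as *const i32;
    // SAFETY: `back` has the address of `x`, whose provenance was exposed above,
    // and `x` is still live.
    unsafe { *back }
}

fn main() {
    println!("{}", roundtrip_via_exposed_cast(&7));
}
```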

RalfJung

RalfJung commented on Dec 17, 2024

@RalfJung
Member

> Note that I'm only considering the runtime behaviour, which can be a point against adopting the behaviour. Given that it's impossible to get the address of certain pointers in const-eval, it does need to be undefined behaviour (or otherwise an error) to read pointer bytes (to at least symbolic allocations) as integer types.

We could characterize this as an "unsupported in const-eval" error rather than a UB error. (Internally in rustc this is already what we do, ReadPointerAsInt is a variant of UnsupportedOpInfo. However we don't clearly distinguish those cases in the error message AFAIK, and we do call this UB in the transmute docs.)

That would be similar to how is_null is sometimes unsupported in const-eval.
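A small sketch of the const-eval side, with hypothetical names `PTR`, `ADDR`, and `runtime_address`. The commented-out constant is the kind of item const-eval is expected to reject, since no address exists at compile time; it is left as a comment so the sketch compiles.

```rust
// Fine in const-eval: the pointer stays a pointer, so no address is needed.
const PTR: *const u8 = &0u8;

// Expected to be rejected in const-eval (kept as a comment on purpose):
// const ADDR: usize = unsafe { std::mem::transmute::<*const u8, usize>(PTR) };

/// At runtime the allocation has a concrete address, so this is unproblematic.
fn runtime_address() -> usize {
    PTR as usize
}

fn main() {
    println!("runtime address: {:#x}", runtime_address());
}
```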

RalfJung

RalfJung commented on Feb 20, 2025

@RalfJung
Member

@joshlf says they have a use case for these transmutes in zerocopy. Or, to be more precise -- they have a use case for making these transmutes not be UB. The goal isn't actually to ever run these operations, but having them be well-defined allows soundly adding some IntoBytes trait instances that would be useful even if the transmute is never actually executed. I'll let him fill out the details. :)

CatsAreFluffy

CatsAreFluffy commented on Mar 3, 2025

@CatsAreFluffy

This sounds a lot like #286.

RalfJung

RalfJung commented on Mar 3, 2025

@RalfJung
Member

True, those are discussing the same thing.

joshlf

joshlf commented on Mar 6, 2025

@joshlf

I wrote up an example use case in rust-lang/rust#137323 (comment), but the very brief TLDR is that we've designed zerocopy's API so that as many operations as possible "fall out naturally" from a base set of composable atoms. Having to special-case things makes it so that we can't express some operations that way, and as a result, it means that we have to either decide not to support certain operations, or instead create one-off APIs that don't compose with the rest of our machinery. Since most of zerocopy's internals are unsafe, changes take a long time to implement since we move very slowly to make sure we haven't made any mistakes. As a consequence, we often end up just not supporting these operations, despite having users who want them supported.

joshlf

joshlf commented on Mar 6, 2025

@joshlf

Yeah, we do intend to reflect provenance. But we'd prefer to reflect that "ptr-to-int is valid but strips provenance" rather than have to not support ptr-to-int because it's UB.
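As a rough illustration of why the defined-ness matters for trait instances, here is a sketch with a hypothetical trait `IntoBytesSketch`. It is not zerocopy's actual definition, only a stand-in with the same flavour. The `*const T` instance is sound only if reading pointer bytes at an integer type is defined (with provenance stripped), which is the outcome asked for above.

```rust
use std::{mem::size_of, slice};

/// Hypothetical stand-in for a zerocopy-style trait.
///
/// # Safety
/// Implementors promise that every byte of `Self` may be read as a `u8`.
unsafe trait IntoBytesSketch: Sized {
    fn as_bytes_sketch(&self) -> &[u8] {
        // SAFETY: relies on the implementor's promise documented on the trait.
        unsafe { slice::from_raw_parts((self as *const Self).cast::<u8>(), size_of::<Self>()) }
    }
}

// Uncontroversial: integers are plain bytes.
unsafe impl IntoBytesSketch for usize {}

// Assumption: sound only if integer-typed reads of pointer bytes are defined
// and merely strip provenance.
unsafe impl<T> IntoBytesSketch for *const T {}

fn main() {
    let x = 5u32;
    let p: *const u32 = &x;
    println!("{:?}", p.as_bytes_sketch());
}
```

If such reads were undefined behaviour instead, the second `impl` would have to be dropped or special-cased, which is the composability cost described in the comments above.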

RalfJung

RalfJung commented on May 18, 2025

@RalfJung
Member

Closing in favor of #286.

          Reading Pointer bytes as Integers · Issue #547 · rust-lang/unsafe-code-guidelines