Do function pointers behave like data pointers (wrt provenance and other aspects)? #340

Open
@RalfJung
Member

Miri currently treats fn ptrs and data ptrs very similarly, in particular with regard to provenance. When calling a function pointer, its provenance is consulted to identify which function to invoke. This makes int2fnptr transmutes a problem (see rust-lang/rust#97321). fnptr2int transmutes are also UB because fn ptrs carry provenance, which integers must not.

However, the trouble with provenance for data pointers comes from multiple pointers with the same address but different provenance. Function pointers can't be offset and don't have aliasing restrictions or a "one-past-the-end" rule, so none of this applies. Hence we could potentially make them not carry provenance, and we could do the mapping from pointer to function without its provenance (basically, doing the int2ptr cast at the time the call is made).

Beyond these formal details, there are pragmatic concerns on niche architectures, such as whether data and function pointers even have the same size and representation.
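As a rough illustration of the size question, the following sketch compares the sizes of a function pointer, a data pointer, and `usize` on whatever target it is compiled for. The helper name `pointer_sizes` is made up for this example. On mainstream targets all three are expected to match, but that is an observation about those targets, not something this sketch proves for niche architectures.

```rust
use std::mem::size_of;

/// Hypothetical helper: sizes (in bytes) of a function pointer,
/// a thin data pointer, and `usize` on the current target.
fn pointer_sizes() -> (usize, usize, usize) {
    (size_of::<fn()>(), size_of::<*const ()>(), size_of::<usize>())
}

fn main() {
    let (fn_ptr, data_ptr, int) = pointer_sizes();
    // The values depend on the compilation target, so none are hard-coded here.
    println!("fn(): {fn_ptr} bytes, *const (): {data_ptr} bytes, usize: {int} bytes");
}
```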

Also see this Zulip discussion.

Activity

added the labels A-provenance (Topic: Related to when which values have which provenance, but not which alias restrictions follow) and C-open-question (Category: An open question that we should revisit) on May 24, 2022

thomcc (Member) commented on May 24, 2022

It would be pretty nice if they don't, as for a while now we've taught that the way to invert the fn_ptr as usize cast is to transmute back. Given that the justification just seems to be simplification of the model (which doesn't really matter to most programmers, especially given that from their perspective, the incantation to do the inverse cast is less simple now), I think it should be defined.
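For concreteness, the pattern being described might look like the sketch below. The names `double` and `roundtrip_via_usize` are invented for this example. The code compiles and runs on mainstream targets, but whether the transmute is defined behavior under the provenance model is exactly the open question in this issue, so treat it as an illustration of the pattern rather than a recommendation.

```rust
use std::mem;

fn double(x: i32) -> i32 {
    x * 2
}

/// Hypothetical helper: cast a function pointer to an integer,
/// then transmute the integer back and call the result.
fn roundtrip_via_usize(arg: i32) -> i32 {
    let f: fn(i32) -> i32 = double;
    // fnptr -> int: an `as` cast exists for this direction.
    let addr = f as usize;
    // int -> fnptr: there is no `as` cast, so code reaches for transmute.
    // Assumption: `usize` and `fn(i32) -> i32` have the same size on this target.
    // Whether the resulting pointer may be called is the question under discussion.
    let g = unsafe { mem::transmute::<usize, fn(i32) -> i32>(addr) };
    g(arg)
}

fn main() {
    println!("{}", roundtrip_via_usize(21));
}
```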

RalfJung (Member, Author) commented on May 25, 2022

> for a while now we've taught that the way to invert the fn_ptr as usize cast is to transmute back.

Have we?
As far as I found in the stdlib docs, no such pattern is taught there (at least nothing that doctests would cover), and the transmute docs actually cast to a raw ptr first.
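The pattern in the transmute docs goes through a raw pointer instead of an integer. A sketch along those lines follows; the wrapper `call_via_raw_pointer` is added here only to make the snippet easy to call, and the docs themselves note that this is not portable to machines where function pointers and data pointers have different sizes.

```rust
use std::mem;

fn foo() -> i32 {
    0
}

/// Hypothetical wrapper around the raw-pointer round trip.
fn call_via_raw_pointer() -> i32 {
    // Function item -> raw data pointer via an `as` cast.
    let pointer = foo as *const ();
    // Raw data pointer -> function pointer via transmute.
    // Assumption: both pointer kinds have the same size on this target.
    let function = unsafe { mem::transmute::<*const (), fn() -> i32>(pointer) };
    function()
}

fn main() {
    println!("{}", call_via_raw_pointer());
}
```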

> Given that the justification just seems to be simplification of the model (which doesn't really matter to most programmers, especially given that from their perspective, the incantation to do the inverse cast is less simple now), I think it should be defined.

That's fair. I think a large share of programmers who write unsafe code also want to understand the model, so we shouldn't make it more complicated than absolutely necessary. However, the majority of programmers will probably never look at the model, so there is also value in making it "do the expected thing", and it is worth spending some complexity on that. So to me this depends on how much complexity we need to support this.

I would also like to distinguish the two directions:

  • fnptr2int transmutes are currently UB because it is UB to put data with provenance into an integer. This, I think, is tricky to fix as we'd have to figure out how to make sure fn ptrs are created without provenance. Also I have not seen any such transmute in the wild, since casts actually work here.
  • int2fnptr transmutes are UB because when calling the fnptr, we use provenance to determine which function to call. The alternative here is to ignore the provenance and use the absolute address to look up the allocation at the given point (basically, as if a cast had been done at the moment of the call).
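The alternative in the second bullet can be pictured with a toy model. The sketch below is not Miri's implementation; `ToyMachine`, `register`, `call_by_address`, and the addresses are all invented for illustration. It only shows the idea of resolving a call from the absolute address alone, with no provenance consulted.

```rust
use std::collections::HashMap;

fn double(x: i32) -> i32 {
    x * 2
}

/// Hypothetical toy interpreter state: maps the address of each
/// function "allocation" to the function stored there.
struct ToyMachine {
    functions: HashMap<usize, fn(i32) -> i32>,
}

impl ToyMachine {
    fn new() -> Self {
        ToyMachine { functions: HashMap::new() }
    }

    /// Pretend that `f` lives at the made-up address `addr`.
    fn register(&mut self, addr: usize, f: fn(i32) -> i32) {
        self.functions.insert(addr, f);
    }

    /// Resolve the callee purely from the address at the moment of the call.
    /// Returns `None` if no function is registered at that address.
    fn call_by_address(&self, addr: usize, arg: i32) -> Option<i32> {
        self.functions.get(&addr).map(|f| f(arg))
    }
}

/// Hypothetical driver: one call to a registered address, one to an unknown one.
fn demo() -> (Option<i32>, Option<i32>) {
    let mut machine = ToyMachine::new();
    machine.register(0x1000, double);
    (machine.call_by_address(0x1000, 21), machine.call_by_address(0x2000, 21))
}

fn main() {
    println!("{:?}", demo());
}
```

In this model a call through an address with no registered function simply fails to resolve, which is where a real interpreter would report UB.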

digama0 commented on May 25, 2022

> fnptr2int transmutes are currently UB because it is UB to put data with provenance into an integer. This, I think, is tricky to fix as we'd have to figure out how to make sure fn ptrs are created without provenance. Also I have not seen any such transmute in the wild, since casts actually work here.

This isn't just about fnptr2int transmutes so maybe I should bring this up somewhere else, but I really think it is a bad idea to make these transmutes UB instead of simply stripping the provenance. (That is, when converting bytes with provenance to a value of integer type, the provenance is lost, and when it is saved again the provenance is not recovered.) I see no gain in making it immediate UB.

RalfJung (Member, Author) commented on May 25, 2022

That is basically #286. Though that thread is so huge now, not sure how useful it still is...

It's definitely off-topic for this thread though. :)

LunaBorowska commented on Aug 3, 2022

I think function pointers on CHERI do have provenance (you cannot simply make up function pointers from integers) which is an argument supporting function pointers having provenance.

changed the title from "Do function pointers have provenance?" to "Do function pointers behave like data pointers (wrt provenance and other aspects)?" on Apr 27, 2023

RalfJung (Member, Author) commented on Apr 27, 2023

#309 got folded into this issue, so I generalized the title a bit to not just be about provenance; there are also questions around whether these types even have the same size, etc.

Nemo157 (Member) commented on Jan 6, 2025

Somewhat related to this is: what even are the semantics of function {item,pointer} to {pointer,address} casts? These are named in the reference but don't appear in the following semantics section. AFAIK the example in the transmute docs linked above is the only place that implies that transmute<*const (), fn()>(some_fn as _) is a round-trip.

RalfJung (Member, Author) commented on Jan 6, 2025

Function items have no data, so casting those to function pointers is a very special operation that synthesizes a suitable pointer "out of thin air". This operation is non-deterministic; executing it multiple times for the same function can produce different pointers.
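A small sketch of what this means in practice: two casts of the same function item must both call the same function, but nothing here relies on the two resulting pointers having equal addresses. The names `answer` and `call_twice` are invented for the example, and the address comparison is reported without assuming its outcome.

```rust
fn answer() -> i32 {
    42
}

/// Hypothetical helper: cast the same function item to a pointer twice,
/// call both pointers, and report whether their addresses happen to match.
fn call_twice() -> (i32, i32, bool) {
    let a = answer as fn() -> i32;
    let b = answer as fn() -> i32;
    // Per the comment above, this may be true or false; do not rely on it.
    let same_address = (a as usize) == (b as usize);
    (a(), b(), same_address)
}

fn main() {
    let (x, y, same_address) = call_twice();
    println!("a() = {x}, b() = {y}, same address: {same_address}");
}
```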

Function pointers are either like usize or like *const () (depending on the outcome of the discussion in this issue), so either way their casts to usize behave like those of the corresponding carrier type.
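To make the two candidate carrier types concrete, this sketch takes one function pointer and obtains its address both directly and by way of `*const ()`. The names `noop` and `addresses` are made up for the example. With current rustc on mainstream targets the two results are expected to agree; the sketch does not settle which carrier type the model should pick.

```rust
fn noop() {}

/// Hypothetical helper: the address of one function pointer, obtained
/// directly and through an intermediate raw data pointer.
fn addresses() -> (usize, usize) {
    let f: fn() = noop;
    // fnptr -> usize directly.
    let direct = f as usize;
    // fnptr -> *const () -> usize, going through the data-pointer carrier.
    let via_raw = f as *const () as usize;
    (direct, via_raw)
}

fn main() {
    let (direct, via_raw) = addresses();
    println!("direct: {direct:#x}, via raw pointer: {via_raw:#x}");
}
```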
