[Stabilization] Pin APIs #55766

Closed

@withoutboats
Contributor

@rfcbot fcp merge
Feature name: pin
Stabilization target: 1.32.0
Tracking issue: #49150
Related RFCs: rust-lang/rfcs#2349

This is a proposal to stabilize the pin library feature, making the "pinning"
APIs for manipulating pinned memory usable on stable.

(I've tried to write this proposal as a comprehensive "stabilization report.")

Stabilized feature or APIs

[std|core]::pin::Pin

This stabilizes the Pin type in the pin submodule of std/core. Pin is
a fundamental, transparent wrapper around a generic type P, which is intended
to be a pointer type (for example, Pin<&mut T> and Pin<Box<T>> are both
valid, intended constructs). The Pin wrapper modifies the pointer to "pin"
the memory it refers to in place, preventing the user from moving objects out
of that memory.

The usual way to use the Pin type is to construct a pinned variant of some
kind of owning pointer (Box, Rc, etc). The std library owning pointers all
provide a pinned constructor which returns this. Then, to manipulate the
value inside, all of these pointers provide a way to degrade toward Pin<&T>
and Pin<&mut T>. Pinned pointers can deref, giving you back &T, but cannot
safely mutably deref unless the target is Unpin: otherwise this is only
possible using the unsafe get_unchecked_mut function.

As a result, anyone mutating data through a pin will be required to uphold the
invariant that they never move out of that data. This allows other code to
safely assume that the data is never moved, allowing it to contain (for
example) self references.
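
As a rough illustration of the flow described above (not part of the original proposal), the sketch below pins a value in a Box, reads through the pin, and overwrites the value in place. It assumes a current stable toolchain, where the constructor proposed here as Box::pinned is spelled Box::pin; the function name pin_and_update is made up for the example.

```rust
use std::pin::Pin;

/// Pins a value on the heap, reads it through the pin, then overwrites it.
/// Returns the value before and after the write.
fn pin_and_update() -> (u32, u32) {
    // Assumption: `Box::pin` is the stabilized spelling of the `Box::pinned`
    // constructor proposed in this report.
    let mut boxed: Pin<Box<u32>> = Box::pin(1);

    // Shared access: `Pin<P>` implements `Deref`, so `*boxed` reads the value.
    let before = *boxed;

    // Degrade to `Pin<&mut u32>`, then overwrite the pointee in place.
    // `set` assigns a new value without handing out a plain `&mut u32`.
    boxed.as_mut().set(2);

    // Degrade to `Pin<&u32>` and read through a plain shared reference.
    let after = *boxed.as_ref().get_ref();
    (before, after)
}
```

Calling pin_and_update() should yield (1, 2): the heap slot is reused for the new value rather than moved.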

The Pin type will have these stabilized APIs:

impl<P> Pin<P> where P: Deref, P::Target: Unpin

  • fn new(pointer: P) -> Pin<P>

impl<P> Pin<P> where P: Deref

  • unsafe fn new_unchecked(pointer: P) -> Pin<P>
  • fn as_ref(&self) -> Pin<&P::Target>

impl<P> Pin<P> where P: DerefMut

  • fn as_mut(&mut self) -> Pin<&mut P::Target>
  • fn set(&mut self, value: P::Target)

impl<'a, T: ?Sized> Pin<&'a T>

  • unsafe fn map_unchecked<U, F: FnOnce(&T) -> &U>(self, f: F) -> Pin<&'a U>
  • fn get_ref(self) -> &'a T

impl<'a, T: ?Sized> Pin<&'a mut T>

  • fn into_ref(self) -> Pin<&'a T>
  • unsafe fn get_unchecked_mut(self) -> &'a mut T
  • unsafe fn map_unchecked_mut<U, F: FnOnce(&mut T) -> &mut U>(self, f: F) -> Pin<&'a mut U>

impl<'a, T: ?Sized> Pin<&'a mut T> where T: Unpin

  • fn get_mut(self) -> &'a mut T
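
A small tour of the safe parts of this API may help when reading the signatures. The sketch below uses an i32 target, which is Unpin, so every call is safe; safe_api_tour and its local names are illustrative only.

```rust
use std::pin::Pin;

/// Exercises the safe constructors and accessors on an `Unpin` target.
fn safe_api_tour() -> (i32, i32, i32) {
    let mut value = 10;

    // `Pin::new` is safe because `i32: Unpin`.
    let mut pinned: Pin<&mut i32> = Pin::new(&mut value);

    // `as_ref` reborrows as `Pin<&i32>`; `get_ref` unwraps to `&i32`.
    let first = *pinned.as_ref().get_ref();

    // `set` overwrites the pointee in place.
    pinned.set(20);
    let second = *pinned;

    // `get_mut` consumes the pin; it is safe only because the target is `Unpin`.
    let plain: &mut i32 = pinned.get_mut();
    *plain += 1;

    (first, second, value)
}
```

For a target that is not Unpin, Pin::new and get_mut are unavailable, and the unsafe new_unchecked and get_unchecked_mut have to be used instead.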

Trait impls

Most of the trait impls on Pin are fairly rote, but these two are important to
its operation:

  • impl<P: Deref> Deref for Pin<P> { type Target = P::Target }
  • impl<P: DerefMut> DerefMut for Pin<P> where P::Target: Unpin { }
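
To see what these two impls buy in practice, here is a minimal sketch with a Vec<i32> target (which is Unpin). The function name deref_through_pin is hypothetical.

```rust
use std::pin::Pin;

/// Calls `Vec` methods directly on a `Pin<&mut Vec<i32>>`.
fn deref_through_pin() -> usize {
    let mut items = vec![1, 2, 3];
    let mut pinned: Pin<&mut Vec<i32>> = Pin::new(&mut items);

    // `push` needs `&mut Vec<i32>`; this resolves through `DerefMut`,
    // which `Pin` only provides because `Vec<i32>: Unpin`.
    pinned.push(4);

    // `len` needs `&Vec<i32>`; this resolves through `Deref`,
    // which `Pin` provides for every pointer type.
    pinned.len()
}
```

With a target that is not Unpin, the len call would still compile, while the push call would be rejected.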

std::marker::Unpin

Unpin is a safe auto trait which opts out of the guarantees of pinning. If the
target of a pinned pointer implements Unpin, it is safe to mutably
dereference to it. Unpin types do not have any guarantees that they will not
be moved out of a Pin.

This makes it as ergonomic to deal with a pinned reference to something that
does not contain self-references as it would be to deal with a non-pinned
reference. The guarantees of Pin only matter for special case types like
self-referential structures: those types do not implement Unpin, so they have
the restrictions of the Pin type.

Notable implementations of Unpin in std:

  • impl<'a, T: ?Sized> Unpin for &'a T
  • impl<'a, T: ?Sized> Unpin for &'a mut T
  • impl<T: ?Sized> Unpin for Box<T>
  • impl<T: ?Sized> Unpin for Rc<T>
  • impl<T: ?Sized> Unpin for Arc<T>

These codify the notion that pinnedness is not transitive across pointers. That
is, a Pin<&T> only pins the actual memory block represented by T in a
place. Users have occasionally been confused by this and expected that a type
like Pin<&mut Box<T>> pins the data of T in place, but it only pins the
memory the pinned reference actually refers to: in this case, the Box's
representation, which is a pointer into the heap.
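
The non-transitivity can be checked with a short experiment. In the sketch below, NotUnpin, make, and move_target_behind_pinned_box are names invented for the example, and PhantomPinned is the std marker that opts a type out of Unpin on current stable Rust. Even though the boxed type is not Unpin, Box<NotUnpin> is, so safe code can pin a reference to the box and still move the boxed value.

```rust
use std::marker::PhantomPinned;
use std::mem;
use std::pin::Pin;

/// A type that does not implement `Unpin`.
struct NotUnpin {
    value: u8,
    _pin: PhantomPinned,
}

fn make(value: u8) -> Box<NotUnpin> {
    Box::new(NotUnpin { value, _pin: PhantomPinned })
}

/// Pins a `&mut Box<NotUnpin>` and then moves the boxed value, all in safe code.
fn move_target_behind_pinned_box() -> (u8, u8) {
    let mut boxed = make(1);

    // Compiles because `Box<NotUnpin>: Unpin`, regardless of `NotUnpin`.
    let pinned: Pin<&mut Box<NotUnpin>> = Pin::new(&mut boxed);

    // Only the box (a pointer) is pinned, so it can be swapped out freely.
    let old: Box<NotUnpin> = mem::replace(pinned.get_mut(), make(2));

    // The heap value itself was never pinned, so moving it out is fine.
    let moved: NotUnpin = *old;
    (moved.value, boxed.value)
}
```

To pin the NotUnpin value itself, the pin has to wrap the pointer that points at it directly, as in Pin<Box<NotUnpin>>.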

std::marker::Pinned

The Pinned type is a ZST which does not implement Unpin; it allows you to
suppress the auto-implementation of Unpin on stable, where !Unpin impls
would not be stable yet.
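
A minimal sketch of how such a marker is used follows. It assumes a current stable toolchain, where this marker is available as std::marker::PhantomPinned rather than under the Pinned name proposed here; Anchored and bump_pinned are hypothetical names.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

/// Contains the marker, so the auto-implementation of `Unpin` is suppressed.
struct Anchored {
    value: u32,
    _pin: PhantomPinned,
}

/// Mutates a pinned `Anchored` and reports whether its address stayed the same.
fn bump_pinned() -> (u32, bool) {
    let mut pinned: Pin<Box<Anchored>> = Box::pin(Anchored {
        value: 1,
        _pin: PhantomPinned,
    });
    let addr_before = &*pinned as *const Anchored;

    // `pinned.value += 1` would not compile: `Pin` has no `DerefMut` here
    // because `Anchored` is not `Unpin`.
    // SAFETY: we only write to a field in place and never move the value.
    unsafe {
        pinned.as_mut().get_unchecked_mut().value += 1;
    }

    let addr_after = &*pinned as *const Anchored;
    (pinned.value, addr_before == addr_after)
}
```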

Smart pointer constructors

Constructors are added to the std smart pointers to create pinned references:

  • Box::pinned(data: T) -> Pin<Box<T>>
  • Rc::pinned(data: T) -> Pin<Rc<T>>
  • Arc::pinned(data: T) -> Pin<Arc<T>>
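
For reference, a sketch of the three constructors in use. It assumes the names found on current stable Rust (Box::pin, Rc::pin, Arc::pin), which differ from the pinned spelling in this proposal; pinned_constructors is an illustrative name.

```rust
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;

/// Builds one pinned pointer of each kind and reads the values back.
fn pinned_constructors() -> (i32, i32, i32, bool) {
    // Assumption: these are the stabilized spellings of the proposed
    // `Box::pinned`, `Rc::pinned`, and `Arc::pinned`.
    let boxed: Pin<Box<i32>> = Box::pin(1);
    let shared: Pin<Rc<i32>> = Rc::pin(2);
    let atomic: Pin<Arc<i32>> = Arc::pin(3);

    // Cloning a pinned `Rc` yields another pin to the same allocation.
    let alias: Pin<Rc<i32>> = shared.clone();
    let same_target = std::ptr::eq(&*shared, &*alias);

    (*boxed, *shared, *atomic, same_target)
}
```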

Notes on pinning & safety

Over the last 9 months the pinning APIs have gone through several iterations as
we have investigated their expressive power and also the soundness of their
guarantees. I would now say confidently that the pinning APIs stabilized here
are sound and close enough to a local maximum in ergonomics and
expressiveness; that is, ready for stabilization.

One of the trickier issues of pinning is determining when it is safe to perform
a pin projection: that is, to go from a Pin<P<Target = Foo>> to a
Pin<P<Target = Bar>>, where Bar is a field of Foo. Fortunately, we have
been able to codify a set of rules which can help users determine if such a
projection is safe:

  1. It is only safe to pin project if (Foo: Unpin) implies (Bar: Unpin): that
    is, if it is never the case that Foo (the containing type) is Unpin while
    Bar (the projected type) is not Unpin.
  2. It is only safe if Bar is never moved during the destruction of Foo,
    meaning that either Foo has no destructor, or the destructor is carefully
    checked to make sure that it never moves out of the field being projected to.
  3. It is only safe if Foo (the containing type) is not repr(packed),
    because this causes fields to be moved around to realign them.
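
The three rules can be read against a hand-written projection. This is only a sketch under stated assumptions: Inner, Outer, project_inner, and project_and_count are invented names, PhantomPinned is the marker available on current stable Rust, and the safety comment spells out how each rule is meant to be satisfied for this particular type.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

/// The projected field type; not `Unpin`.
struct Inner {
    hits: u32,
    _pin: PhantomPinned,
}

/// The containing type. It has no `Drop` impl, is not `repr(packed)`,
/// and has no manual `Unpin` impl.
struct Outer {
    inner: Inner,
}

impl Outer {
    /// Projects `Pin<&mut Outer>` to `Pin<&mut Inner>`.
    fn project_inner(self: Pin<&mut Self>) -> Pin<&mut Inner> {
        // SAFETY (following the rules above):
        // 1. `Outer` is only auto-`Unpin` if `Inner` is, so `Outer: Unpin`
        //    implies `Inner: Unpin`.
        // 2. `Outer` has no `Drop` impl that could move `inner`.
        // 3. `Outer` is not `repr(packed)`.
        unsafe { self.map_unchecked_mut(|outer| &mut outer.inner) }
    }
}

/// Pins an `Outer`, projects to its field, and mutates the field in place.
fn project_and_count() -> u32 {
    let mut outer = Box::pin(Outer {
        inner: Inner { hits: 0, _pin: PhantomPinned },
    });

    let inner: Pin<&mut Inner> = outer.as_mut().project_inner();
    // SAFETY: we only write to a field and never move `Inner`.
    unsafe { inner.get_unchecked_mut().hits += 1 };

    outer.inner.hits
}
```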

Additionally, the std APIs provide no safe way to pin objects to the stack.
This is because there is no way to implement that safely using a function API.
However, users can unsafely pin things to the stack by guaranteeing that they
never move the object again after creating the pinned reference.

The pin-utils crate on crates.io contains macros to assist with both stack
pinning and pin projection. The stack pinning macro safely pins objects to the
stack using a trick involving shadowing, whereas a macro for projection exists
which is unsafe, but avoids you having to write the projection boilerplate in
which you could possibly introduce other incorrect unsafe code.
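
The shadowing idea can be sketched by hand as follows. This is not the pin-utils macro or its expansion, just an illustration of the technique; Anchored, read_pinned, and pin_on_stack are hypothetical names.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

/// A type that is not `Unpin`.
struct Anchored {
    value: u32,
    _pin: PhantomPinned,
}

fn read_pinned(item: Pin<&Anchored>) -> u32 {
    item.value
}

/// Pins a local variable in place by shadowing its binding.
fn pin_on_stack() -> u32 {
    let mut item = Anchored { value: 7, _pin: PhantomPinned };

    // SAFETY: the new binding shadows the original `item`, so the value can
    // no longer be named, and therefore cannot be moved, for the rest of
    // this scope. It is dropped in place when the function returns.
    let item: Pin<&mut Anchored> = unsafe { Pin::new_unchecked(&mut item) };

    read_pinned(item.as_ref())
}
```

The unsafe block is what a function API cannot hide: a function taking the value by argument would have to move it, whereas the shadowing has to happen in the caller's own scope.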

Implementation changes prior to stabilization

  • Export Unpin from the prelude, remove pin::Unpin re-export

As a general rule, we don't re-export things from multiple places in std unless
one is a supermodule of the real definition (e.g. shortening
std::collections::hash_map::HashMap to std::collections::HashMap). For this
reason, the re-export of std::marker::Unpin from std::pin::Unpin is out of
place.

At the same time, other important marker traits like Send and Sync are included
in the prelude. So instead of re-exporting Unpin from the pin module, by
putting it in the prelude we make it unnecessary to import std::marker::Unpin,
which is the same reason it was put into pin.

  • Change associated functions to methods

Currently, a lot of the associated functions of Pin do not use method syntax.
In theory, this is to avoid conflicting with derefable inner methods. However,
this rule has not been applied consistently, and in our experience has mostly
just made things more inconvenient. Pinned pointers only implement immutable
deref, not mutable deref or deref by value, limiting the ability to deref
anyway. Moreover, many of these names are fairly unique (e.g. map_unchecked)
and unlikely to conflict.

Instead, we prefer to give the Pin methods their due precedence; users who
need to access an interior method always can using UFCS, just as they would be
required to access the Pin methods if we did not use method syntax.

  • Rename get_mut_unchecked to get_unchecked_mut

The current ordering is inconsistent with other uses in the standard library.

  • Remove impl<P> Unpin for Pin<P>

This impl is not justified by our standard justification for Unpin impls: there is no pointer indirection between Pin<P> and P. Its usefulness is covered by the impls for pointers themselves.

This futures impl will need to change to add a P: Unpin bound.

  • Mark Pin as repr(transparent)

Pin should be a transparent wrapper around the pointer inside of it, with the same representation.

Connected features and larger milestones

The pin APIs are important to safely manipulating sections of memory which can
be guaranteed not to be moved out. If the objects in that memory do not
implement Unpin, their address will never change. This is necessary for
creating self-referential generators and asynchronous functions. As a result,
the Pin type appears in the standard library future APIs and will soon
appear in the APIs for generators as well (#55704).

Stabilizing the Pin type and its APIs is a necessary precursor to stabilizing
the Future APIs, which is itself a necessary precursor to stabilizing the
async/await syntax and moving the entire futures 0.3 async IO ecosystem
onto stable Rust.

cc @cramertj @RalfJung

Activity

Label added on Nov 7, 2018: T-libs-api (relevant to the library API team, which will review and decide on the PR/issue)

withoutboats commented on Nov 7, 2018

@withoutboats
Contributor, Author

@rfcbot fcp merge

rfcbot commented on Nov 7, 2018

@rfcbot
Collaborator

Team member @withoutboats has proposed to merge this. The next step is review by the rest of the tagged team members:

Concerns:

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

Labels added on Nov 7, 2018: proposed-final-comment-period (proposed to merge/close by relevant subteam, see T-<team> label; will enter FCP once signed off) and disposition-merge (this issue / PR is in PFCP or FCP with a disposition to merge it)

alexcrichton commented on Nov 7, 2018

@alexcrichton
Member

Thanks for the detailed writeup here @withoutboats! I've also sort of been historically confused by the various guarantees of Pin, and I've currently got a few questions about the APIs being stabilized here wrt the safety guarantees. To help sort this out in my own head though I figured I'd try to write these things down.

In trying to start writing this down though I keep running up against a wall of "what is Unpin?" I'm sort of confused by what that is and the various guarantees around it. Can you say again what it means for T to implement Unpin, and also what it means for T not to? Also, if Unpin is a safe trait to implement it naively seems like it could be used to easily undermine the unsafe guarantees of Pin<T>, but I'm surely missing something.

tinaun commented on Nov 7, 2018

@tinaun
Contributor

if i have got this correct, every type that is not self referential (ie: not a generator) is Unpin

Nemo157 commented on Nov 7, 2018

@Nemo157
Member

It's not just self-referentiality, there are some other use-cases for stable memory addresses that Pin can also support. They are relatively few and far-between though.

How I understand Unpin being safe to implement is that by implementing it you may violate an invariant required of other unsafe code you have written (crucially, only you may write this unsafe code, no external code can be relying on whether you have implemented Unpin). There is nothing you can do with Pin's safe API that will cause unsoundness whether or not you have implemented Unpin. By opting in to using some of Pin's unsafe API you are guaranteeing that you will only implement Unpin when it is safe to do so. That is covered by point 1 of the "Notes on pinning & safety" section above.

alexcrichton commented on Nov 7, 2018

@alexcrichton
Member

Hm I still don't really understand Unpin. I'm at first just trying to understand what it means to implement or not to implement Unpin.

First off, it's probably helpful to know what types implement Unpin automatically! It's mentioned above that common pointer types (Arc/Rc/Box/references) implement Unpin, but I think that's it? If this is an auto trait, does that mean that a type MyType automatically implements Unpin if it only contains pointers? Or do no other types automatically implement Unpin?

I keep trying to summarize or state what Unpin guarantees and such, but I'm finding it really difficult to do so. Can someone help out by reiterating again what it means to implement Unpin as well as what it means to not implement Unpin?

I think I understand the guarantees of Pin<P> where you can't move out of any of the inline members of P::Target, but is that right?

withoutboats commented on Nov 7, 2018

@withoutboats
Contributor, Author

@alexcrichton Thanks for the questions, I'm sure the pinning APIs can be a bit confusing for people who haven't been part of the group focusing on them.

> First off, it's probably helpful to know what types implement Unpin automatically!

Unpin is an auto trait like Send or Sync, so most types implement it automatically. Generators and the types of async functions are !Unpin. Types that could contain a generator or an async function body (in other words, types with type parameters) are also not automatically Unpin unless their type parameters are.

The explicit impls for pointer types are there to make them Unpin even if their type parameters are not. The reason for this will hopefully be clearer by the end of this comment.

> Can someone help out by reiterating again what it means to implement Unpin as well as what it means to not implement Unpin?

Here is the sort of fundamental idea of the pinning APIs. First, given a pointer type P, Pin<P> acts like P except that unless P::Target implements Unpin, it is unsafe to mutably dereference Pin<P>. Second, there are two basic invariants unsafe code related to Pin has to maintain:

  • If you unsafely get an &mut P::Target from a Pin<P>, you must never move P::Target.
  • If you can construct a Pin<P>, it must be guaranteed that you'll never be able to get an unpinned pointer to the data that pointer points to until the destructor runs.

The implication of all of this is that if you construct a Pin<P>, the value pointed to by P will never move again, which is the guarantee we need for self-referential structs as well as intrusive collections. But you can opt out of this guarantee by just implementing Unpin for your type.

So if you implement Unpin for a type, you're saying that the type opts out of any additional safety guarantees of Pin: it's possible to mutably dereference the pointers pointing to it. This means you're saying the type does not need to be immovable to be used safely.

Moving a pointer type like Rc<T> doesn't move T because T is behind a pointer. Similarly, pinning a pointer to an Rc<T> (as in Pin<Box<Rc<T>>>) doesn't actually pin T, it only pins that particular pointer. This is why anything which keeps its generics behind a pointer can implement Unpin even when its generics don't.

withoutboats commented on Nov 7, 2018

@withoutboats
Contributor, Author

> Also, if Unpin is a safe trait to implement it naively seems like it could be used to easily undermine the unsafe guarantees of Pin, but I'm surely missing something

This was one of the trickiest parts of the pinning API, and we got it wrong at first.

Unpin means "even once something has been put into a pin, it is safe to get a mutable reference to it." There is another trait that exists today that gives you the same access: Drop. So what we figured out was that since Drop is safe, Unpin must also be safe. Does this undermine the entire system? Not quite.

To actually implement a self-referential type would require unsafe code - in practice, the only self-referential types anyone cares about are those that the compiler generates for you: generators and async function state machines. These explicitly say they don't implement Unpin and they don't have a Drop implementation, so you know, for these types, once you have a Pin<&mut T>, they will never actually get a mutable reference, because they're an anonymous type that we know doesn't implement Unpin or Drop.

The problem emerges once you have a struct containing one of these anonymous types, like a future combinator. In order to go from a Pin<&mut Fuse<Fut>> to a Pin<&mut Fut>, you have to perform a "pin projection." This is where you can run into trouble: if you pin project to the future field of a combinator, but then implement Drop for the combinator, you can move out of a field that is supposed to be pinned.

For this reason, pin projection is unsafe! In order to perform a pin projection without violating the pinning invariants, you have to guarantee that you never do several things, which I listed in the stabilization proposal.

So, the tl;dr: Drop exists, so Unpin must be safe. But this doesn't ruin the whole thing, it just means that pin projection is unsafe and anyone who wants to pin project needs to uphold a set of invariants.
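
The "Drop exists" point can be made concrete with a small sketch. Holder and drop_moves_out_of_pinned are names invented here, and PhantomPinned is the marker available on current stable Rust. The type below is not Unpin and is pinned in a Box, yet its entirely safe Drop impl still receives &mut self and moves a field out.

```rust
use std::cell::RefCell;
use std::marker::PhantomPinned;
use std::mem;
use std::pin::Pin;
use std::rc::Rc;

/// Not `Unpin`, but with a `Drop` impl written in safe code.
struct Holder {
    payload: String,
    sink: Rc<RefCell<Vec<String>>>,
    _pin: PhantomPinned,
}

impl Drop for Holder {
    fn drop(&mut self) {
        // `drop` is handed a plain `&mut Self` even if the value was pinned,
        // so safe code can move a field out of it.
        let taken = mem::take(&mut self.payload);
        self.sink.borrow_mut().push(taken);
    }
}

/// Pins a `Holder`, drops it, and returns what the destructor moved out.
fn drop_moves_out_of_pinned() -> Vec<String> {
    let sink = Rc::new(RefCell::new(Vec::new()));
    let pinned: Pin<Box<Holder>> = Box::pin(Holder {
        payload: String::from("pinned"),
        sink: Rc::clone(&sink),
        _pin: PhantomPinned,
    });

    drop(pinned);

    let moved = sink.borrow().clone();
    moved
}
```

This is harmless for Holder, since nothing depends on payload staying put, but it shows why code that pin projects to a field has to rule out such a Drop impl on the containing type.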

Matthias247 commented on Nov 8, 2018

@Matthias247
Contributor

> generators and async function state machines. These explicitly say they don't implement Unpin and they don't have a Drop implementation, so you know, for these types, once you have a Pin<&mut T>, they will never actually get a mutable reference, because they're an anonymous type that we know doesn't implement Unpin or Drop.

Shouldn't an async state machine have a Drop implementation? Things that are on the "stack" of the async function (which probably equals fields in the state machine) need to get destructed when the async function completes or gets cancelled. Or does this happen otherwise?

SimonSapin commented on Nov 8, 2018

@SimonSapin
Contributor

I guess what matters in this context is whether an impl Drop for Foo {…} item exists, which would run code with &mut Foo which could use for example mem::replace to "exfiltrate" and move the Foo value.

This is not the same as "drop glue", which can be called through ptr::drop_in_place. Drop glue for a given Foo type will call Drop::drop if it’s implemented, then recursively call the drop glue for each field. But those recursive calls never involve &mut Foo.
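
The ordering described here can be observed with a small sketch; Log, Field, Container, and drop_order are illustrative names. The container's own Drop::drop runs first with &mut Container, and only afterwards does the drop glue visit each field.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<&'static str>>>;

struct Field {
    log: Log,
}

impl Drop for Field {
    fn drop(&mut self) {
        self.log.borrow_mut().push("Field::drop");
    }
}

struct Container {
    log: Log,
    field: Field,
}

impl Drop for Container {
    fn drop(&mut self) {
        // Runs before any field is dropped; this is the only place where
        // user code sees `&mut Container` during destruction.
        self.log.borrow_mut().push("Container::drop");
    }
}

/// Drops a `Container` and returns the order in which destructors ran.
fn drop_order() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let container = Container {
        log: Rc::clone(&log),
        field: Field { log: Rc::clone(&log) },
    };

    drop(container);

    let events = log.borrow().clone();
    events
}
```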

Nemo157 commented on Nov 8, 2018

@Nemo157
Member

Also, while a generator (and therefore an async state machine) has custom drop glue, it's just to drop the correct set of fields based on the current state, it promises to never move any of the fields during drop.

withoutboats commented on Nov 8, 2018

@withoutboats
Contributor, Author

The terminology I use (though I don't think there's any standard): "drop glue" is the compiler generated recursive walking of fields, calling their destructors; "Drop implementation" is an implementation of the Drop trait, and "destructor" is the combination of the drop glue and the Drop implementation. The drop glue never moves anything around, so we only care about Drop implementations.

