Description
@rfcbot fcp merge
Feature name: pin
Stabilization target: 1.32.0
Tracking issue: #49150
Related RFCs: rust-lang/rfcs#2349
This is a proposal to stabilize the pin library feature, making the "pinning" APIs for manipulating pinned memory usable on stable.
(I've tried to write this proposal as a comprehensive "stabilization report.")
Stabilized feature or APIs
[std|core]::pin::Pin
This stabilizes the Pin type in the pin submodule of std/core. Pin is a fundamental, transparent wrapper around a generic type P, which is intended to be a pointer type (for example, Pin<&mut T> and Pin<Box<T>> are both valid, intended constructs). The Pin wrapper modifies the pointer to "pin" the memory it refers to in place, preventing the user from moving objects out of that memory.
The usual way to use the Pin type is to construct a pinned variant of some kind of owning pointer (Box, Rc, etc). The std library owning pointers all provide a pinned constructor which returns this. Then, to manipulate the value inside, all of these pointers provide a way to degrade toward Pin<&T> and Pin<&mut T>. Pinned pointers can deref, giving you back &T, but cannot, in general, safely mutably deref: for targets that do not implement Unpin, this is only possible using the unsafe get_unchecked_mut function.
As a result, anyone mutating data through a pin will be required to uphold the invariant that they never move out of that data. This allows other code to safely assume that the data is never moved, allowing it to contain (for example) self references.
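A minimal sketch of that workflow follows. It assumes the constructor name Box::pin, which is what released versions of std call the constructor this proposal refers to as Box::pinned; Counter, bump and demo are hypothetical names used only for illustration.

```rust
use std::pin::Pin;

// Hypothetical example type; it is `Unpin` because all of its fields are.
struct Counter {
    hits: u32,
}

// Takes a pinned mutable reference, as a poll-style API would.
fn bump(mut counter: Pin<&mut Counter>) -> u32 {
    // Mutable deref is available here only because `Counter: Unpin`.
    counter.hits += 1;
    counter.hits
}

fn demo() -> u32 {
    // Pin an owning pointer; the heap allocation is the pinned memory.
    // Assumption: `Box::pin` is the released name of the proposal's `Box::pinned`.
    let mut boxed: Pin<Box<Counter>> = Box::pin(Counter { hits: 0 });

    // Degrade to `Pin<&mut Counter>` for mutation.
    bump(boxed.as_mut());
    bump(boxed.as_mut());

    // Degrade to `Pin<&Counter>` for shared access; `Deref` always works.
    let shared: Pin<&Counter> = boxed.as_ref();
    shared.hits
}

fn main() {
    println!("hits = {}", demo());
}
```

For a type that is not Unpin, the body of bump would not compile as written and would need the unsafe accessor instead.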
The Pin type will have these stabilized APIs:
impl<P> Pin<P> where P: Deref, P::Target: Unpin
    fn new(pointer: P) -> Pin<P>
impl<P> Pin<P> where P: Deref
    unsafe fn new_unchecked(pointer: P) -> Pin<P>
    fn as_ref(&self) -> Pin<&P::Target>
impl<P> Pin<P> where P: DerefMut
    fn as_mut(&mut self) -> Pin<&mut P::Target>
    fn set(&mut self, value: P::Target)
impl<'a, T: ?Sized> Pin<&'a T>
    unsafe fn map_unchecked<U, F: FnOnce(&T) -> &U>(self, f: F) -> Pin<&'a U>
    fn get_ref(self) -> &'a T
impl<'a, T: ?Sized> Pin<&'a mut T>
    fn into_ref(self) -> Pin<&'a T>
    unsafe fn get_unchecked_mut(self) -> &'a mut T
    unsafe fn map_unchecked_mut<U, F: FnOnce(&mut T) -> &mut U>(self, f: F) -> Pin<&'a mut U>
impl<'a, T: ?Sized> Pin<&'a mut T> where T: Unpin
    fn get_mut(self) -> &'a mut T
Trait impls
Most of the trait impls on Pin are fairly rote, but these two are important to its operation:
impl<P: Deref> Deref for Pin<P> { type Target = P::Target }
impl<P: DerefMut> DerefMut for Pin<P> where P::Target: Unpin { }
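As a rough illustration of how the safe part of this API and the two Deref impls fit together for an Unpin target, here is a small sketch. The function name demo is hypothetical; the calls used (Pin::new, set, get_mut and the deref operators) are the ones listed above.

```rust
use std::pin::Pin;

fn demo() -> (i32, i32, i32) {
    let mut value = 1;

    // `Pin::new` is safe because `i32: Unpin`.
    let mut pinned: Pin<&mut i32> = Pin::new(&mut value);

    // `Deref` gives shared access through the pin.
    let before = *pinned;

    // `set` overwrites the pinned value in place.
    pinned.set(10);

    // `DerefMut` exists because the target is `Unpin`.
    *pinned += 1;
    let after_set = *pinned;

    // `get_mut` consumes the pin and returns a plain `&mut` (requires `Unpin`).
    let plain: &mut i32 = pinned.get_mut();
    *plain *= 2;

    (before, after_set, value)
}

fn main() {
    println!("{:?}", demo());
}
```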
std::marker::Unpin
Unpin is a safe auto trait which opts out of the guarantees of pinning. If the target of a pinned pointer implements Unpin, it is safe to mutably dereference to it. Unpin types do not have any guarantees that they will not be moved out of a Pin.
This makes it as ergonomic to deal with a pinned reference to something that does not contain self-references as it would be to deal with a non-pinned reference. The guarantees of Pin only matter for special case types like self-referential structures: those types do not implement Unpin, so they have the restrictions of the Pin type.
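To make the "opts out" behaviour concrete, the sketch below moves two Unpin values out from behind their pins using only safe code. The helper assert_unpin is a hypothetical compile-time check, not a std API, and swap_through_pins is likewise just an illustrative name.

```rust
use std::mem;
use std::pin::Pin;

// Hypothetical helper: compiles only when `T: Unpin`.
fn assert_unpin<T: Unpin>() {}

fn swap_through_pins() -> (String, String) {
    let mut a = String::from("a");
    let mut b = String::from("b");
    {
        let pa: Pin<&mut String> = Pin::new(&mut a);
        let pb: Pin<&mut String> = Pin::new(&mut b);
        // Moving the values is fine: `String: Unpin`, so the pin promises nothing.
        mem::swap(pa.get_mut(), pb.get_mut());
    }
    (a, b)
}

fn main() {
    // Ordinary types pick up `Unpin` automatically, like `Send` and `Sync`.
    assert_unpin::<u8>();
    assert_unpin::<String>();
    assert_unpin::<Vec<u32>>();

    let (a, b) = swap_through_pins();
    println!("a = {a}, b = {b}");
}
```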
Notable implementations of Unpin in std:
impl<'a, T: ?Sized> Unpin for &'a T
impl<'a, T: ?Sized> Unpin for &'a mut T
impl<T: ?Sized> Unpin for Box<T>
impl<T: ?Sized> Unpin for Rc<T>
impl<T: ?Sized> Unpin for Arc<T>
These codify the notion that pinnedness is not transitive across pointers. That is, a Pin<&T> only pins the actual memory block represented by T in place. Users have occasionally been confused by this and expected that a type like Pin<&mut Box<T>> pins the data of T in place, but it only pins the memory the pinned reference actually refers to: in this case, the Box's representation, which is a pointer into the heap.
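The following sketch shows the confusion in practice: a Pin<&mut Box<T>> does not stop safe code from moving T, even when T is not Unpin. It assumes the marker's released name, std::marker::PhantomPinned (this proposal calls it Pinned); NotUnpin, assert_unpin and replace_box_behind_pin are hypothetical names.

```rust
use std::marker::PhantomPinned;
use std::mem;
use std::pin::Pin;

// Hypothetical type that opts out of `Unpin`.
// Assumption: `PhantomPinned` is the released name of the proposal's `Pinned`.
struct NotUnpin {
    id: u32,
    _pin: PhantomPinned,
}

// Hypothetical helper: compiles only when `T: Unpin`.
fn assert_unpin<T: Unpin>() {}

fn replace_box_behind_pin() -> (u32, u32) {
    // The Box itself is `Unpin` even though its target is not.
    assert_unpin::<Box<NotUnpin>>();

    let mut boxed = Box::new(NotUnpin { id: 1, _pin: PhantomPinned });

    // Safe: this pins only the Box pointer, not the heap value.
    let pinned: Pin<&mut Box<NotUnpin>> = Pin::new(&mut boxed);

    // So we can get `&mut Box<_>` back, swap in another allocation,
    // and move the old `NotUnpin` value out of its box.
    let slot: &mut Box<NotUnpin> = pinned.get_mut();
    let old_box = mem::replace(slot, Box::new(NotUnpin { id: 2, _pin: PhantomPinned }));
    let moved_out: NotUnpin = *old_box;

    (moved_out.id, boxed.id)
}

fn main() {
    println!("{:?}", replace_box_behind_pin());
}
```

To pin the T itself, the pin has to wrap the pointer that points at T, as in Pin<Box<T>>.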
std::marker::Pinned
The Pinned type is a ZST which does not implement Unpin; it allows you to suppress the auto-implementation of Unpin on stable, where !Unpin impls would not be stable yet.
Smart pointer constructors
Constructors are added to the std smart pointers to create pinned references:
Box::pinned(data: T) -> Pin<Box<T>>
Rc::pinned(data: T) -> Pin<Rc<T>>
Arc::pinned(data: T) -> Pin<Arc<T>>
Notes on pinning & safety
Over the last 9 months the pinning APIs have gone through several iterations as we have investigated their expressive power and also the soundness of their guarantees. I would now say confidently that the pinning APIs stabilized here are sound and close enough to a local maximum in ergonomics and expressiveness; that is, ready for stabilization.
One of the trickier issues of pinning is determining when it is safe to perform a pin projection: that is, to go from a Pin<P<Target = Foo>> to a Pin<P<Target = Bar>>, where Bar is a field of Foo. Fortunately, we have been able to codify a set of rules which can help users determine if such a projection is safe:
- It is only safe to pin project if (Foo: Unpin) implies (Bar: Unpin): that is, if it is never the case that Foo (the containing type) is Unpin while Bar (the projected type) is not Unpin.
- It is only safe if Bar is never moved during the destruction of Foo, meaning that either Foo has no destructor, or the destructor is carefully checked to make sure that it never moves out of the field being projected to.
- It is only safe if Foo (the containing type) is not repr(packed), because this causes fields to be moved around to realign them.
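The rules above can be sketched with a hand-written projection. Wrapper and its methods are hypothetical; the sketch relies on Wrapper<T> having no Drop impl, not being repr(packed), and getting Unpin only through the auto trait (so it is Unpin only when T is).

```rust
use std::pin::Pin;

// Hypothetical container. No `Drop` impl, not `repr(packed)`, and no manual
// `Unpin` impl, so `Wrapper<T>: Unpin` holds only when `T: Unpin`.
struct Wrapper<T> {
    inner: T,
    polls: u32,
}

impl<T> Wrapper<T> {
    // Pinned projection to the `inner` field.
    fn inner_pinned(self: Pin<&mut Self>) -> Pin<&mut T> {
        // SAFETY: `inner` is never moved while `Wrapper` is pinned, and the
        // three rules listed above hold for this type.
        unsafe { self.map_unchecked_mut(|w| &mut w.inner) }
    }

    // Unpinned access to a field that is never treated as pinned.
    fn polls_mut(self: Pin<&mut Self>) -> &mut u32 {
        // SAFETY: only the `polls` field is exposed; `inner` is not moved.
        unsafe { &mut self.get_unchecked_mut().polls }
    }
}

fn demo() -> (u32, u32) {
    let mut w = Box::pin(Wrapper { inner: 5u32, polls: 0 });
    *w.as_mut().polls_mut() += 1;
    let inner: Pin<&mut u32> = w.as_mut().inner_pinned();
    let value = *inner;
    (value, w.polls)
}

fn main() {
    println!("{:?}", demo());
}
```

Adding an impl Drop for Wrapper<T> that moves out of inner would silently invalidate the SAFETY comments, which is why the projection has to stay unsafe.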
Additionally, the std APIs provide no safe way to pin objects to the stack.
This is because there is no way to implement that safely using a function API.
However, users can unsafely pin things to the stack by guaranteeing that they
never move the object again after creating the pinned reference.
The pin-utils crate on crates.io contains macros to assist with both stack pinning and pin projection. The stack pinning macro safely pins objects to the stack using a trick involving shadowing, whereas a macro for projection exists which is unsafe, but avoids you having to write the projection boilerplate in which you could possibly introduce other incorrect unsafe code.
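The shadowing trick can be written out by hand roughly as follows. This is a sketch of the idea only, not the pin-utils macro itself; Immovable, read and demo are hypothetical, and PhantomPinned is assumed as the released name of the marker this proposal calls Pinned.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

// Hypothetical type that is not `Unpin`.
struct Immovable {
    value: u32,
    _pin: PhantomPinned,
}

fn read(p: Pin<&Immovable>) -> u32 {
    p.value
}

fn demo() -> u32 {
    let mut slot = Immovable { value: 3, _pin: PhantomPinned };

    // SAFETY: the new binding shadows the original, so the owned value can
    // never be named (and therefore never moved) again; it is dropped in
    // place when this function returns.
    let slot: Pin<&mut Immovable> = unsafe { Pin::new_unchecked(&mut slot) };

    read(slot.as_ref())
}

fn main() {
    println!("value = {}", demo());
}
```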
Implementation changes prior to stabilization
- Export Unpin from the prelude, remove pin::Unpin re-export
As a general rule, we don't re-export things from multiple places in std unless one is a supermodule of the real definition (e.g. shortening std::collections::hash_map::HashMap to std::collections::HashMap). For this reason, the re-export of std::marker::Unpin as std::pin::Unpin is out of place.
At the same time, other important marker traits like Send and Sync are included in the prelude. So instead of re-exporting Unpin from the pin module, by putting it in the prelude we make it unnecessary to import std::marker::Unpin, the same reason it was put into pin.
- Change associated functions to methods
Currently, a lot of the associated functions of Pin do not use method syntax. In theory, this is to avoid conflicting with derefable inner methods. However, this rule has not been applied consistently, and in our experience has mostly just made things more inconvenient. Pinned pointers only implement immutable deref, not mutable deref or deref by value, limiting the ability to deref anyway. Moreover, many of these names are fairly unique (e.g. map_unchecked) and unlikely to conflict.
Instead, we prefer to give the Pin methods their due precedence; users who need to access an interior method always can using UFCS, just as they would be required to do to access the Pin methods if we did not use method syntax.
- Rename get_mut_unchecked to get_unchecked_mut
The current ordering is inconsistent with other uses in the standard library.
- Remove impl<P> Unpin for Pin<P>
This impl is not justified by our standard justification for Unpin impls: there is no pointer indirection between Pin<P> and P. Its usefulness is covered by the impls for pointers themselves.
This futures impl will need to change to add a P: Unpin bound.
- Mark Pin as repr(transparent)
Pin should be a transparent wrapper around the pointer inside of it, with the same representation.
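Two of the changes listed above can be observed from user code, as in this rough sketch: method syntax resolving to the Pin method (with the inner method still reachable through a fully qualified call), and Pin<P> having the same size and alignment as P. The function names are hypothetical, and std::io::Cursor is used only because it happens to have its own get_ref method.

```rust
use std::io::Cursor;
use std::mem::{align_of, size_of};
use std::pin::Pin;

fn method_precedence() -> (usize, usize) {
    let cursor = Cursor::new(vec![1u8, 2, 3]);
    let pinned: Pin<&Cursor<Vec<u8>>> = Pin::new(&cursor);

    // Method syntax resolves to `Pin::get_ref`, yielding the `&Cursor<_>`.
    let outer: &Cursor<Vec<u8>> = pinned.get_ref();

    // The inner `Cursor::get_ref` stays reachable with a qualified call.
    let inner: &Vec<u8> = Cursor::get_ref(&*pinned);

    (outer.get_ref().len(), inner.len())
}

// True when `Pin<P>` and `P` agree on size and alignment.
fn same_layout<P>() -> bool {
    size_of::<Pin<P>>() == size_of::<P>() && align_of::<Pin<P>>() == align_of::<P>()
}

fn main() {
    println!("{:?}", method_precedence());
    println!("thin pointer: {}", same_layout::<Box<u64>>());
    println!("fat pointer:  {}", same_layout::<&str>());
}
```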
Connected features and larger milestones
The pin APIs are important to safely manipulating sections of memory which can be guaranteed not to be moved out. If the objects in that memory do not implement Unpin, their address will never change. This is necessary for creating self-referential generators and asynchronous functions. As a result, the Pin type appears in the standard library future APIs and will soon appear in the APIs for generators as well (#55704).
Stabilizing the Pin type and its APIs is a necessary precursor to stabilizing the Future APIs, which is itself a necessary precursor to stabilizing the async/await syntax and moving the entire futures 0.3 async IO ecosystem onto stable Rust.
Activity
withoutboats commented on Nov 7, 2018
@rfcbot fcp merge
rfcbot commented on Nov 7, 2018
Team member @withoutboats has proposed to merge this. The next step is review by the rest of the tagged team members:
Concerns:
- naming-of-Unpin resolved by [Stabilization] Pin APIs #55766 (comment)
Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!
See this document for info about what commands tagged team members can give me.
alexcrichton commented on Nov 7, 2018
Thanks for the detailed writeup here @withoutboats! I've also sort of been historically confused by the various guarantees of Pin, and I've currently got a few questions about the APIs being stabilized here wrt the safety guarantees. To help sort this out in my own head though I figured I'd try to write these things down.
In trying to start writing this down though I keep running up against a wall of "what is Unpin?" I'm sort of confused by what that is and the various guarantees around it. Can you say again what it means for T to and also to not implement Unpin? Also, if Unpin is a safe trait to implement it naively seems like it could be used to easily undermine the unsafe guarantees of Pin<T>, but I'm surely missing something
tinaun commented on Nov 7, 2018
if i have got this correct, every type that is not self referential (ie: not a generator) is Unpin
Nemo157 commented on Nov 7, 2018
It's not just self-referentiality, there are some other use-cases for stable memory addresses that Pin can also support. They are relatively few and far-between though.
How I understand Unpin being safe to implement is that by implementing it you may violate an invariant required of other unsafe code you have written (crucially, only you may write this unsafe code, no external code can be relying on whether you have implemented Unpin). There is nothing you can do with Pin's safe API that will cause unsoundness whether or not you have implemented Unpin. By opting in to using some of Pin's unsafe API you are guaranteeing that you will only implement Unpin when it is safe to do so. That is covered by point 1 of the "Notes on pinning & safety" section above.
alexcrichton commented on Nov 7, 2018
Hm I still don't really understand Unpin. I'm at first just trying to understand what it means to implement or not to implement Unpin.
First off, it's probably helpful to know what types implement Unpin automatically! It's mentioned above that common pointer types (Arc/Rc/Box/references) implement Unpin, but I think that's it? If this is an auto trait, does that mean that a type MyType automatically implements Unpin if it only contains pointers? Or do no other types automatically implement Unpin?
I keep trying to summarize or state what Unpin guarantees and such, but I'm finding it really difficult to do so. Can someone help out by reiterating again what it means to implement Unpin as well as what it means to not implement Unpin?
I think I understand the guarantees of Pin<P> where you can't move out of any of the inline members of P::Target, but is that right?
withoutboats commented on Nov 7, 2018
@alexcrichton Thanks for the questions, I'm sure the pinning APIs can be a bit confusing for people who haven't been part of the group focusing on them.
Unpin is an auto trait like Send or Sync, so most types implement it automatically. Generators and the types of async functions are !Unpin. Types that could contain a generator or an async function body (in other words, types with type parameters) are also not automatically Unpin unless their type parameters are.
The explicit impls for pointer types are to make them Unpin even if their type parameters are not. The reason for this will hopefully be clearer by the end of this comment.
Here is the sort of fundamental idea of the pinning APIs. First, given a pointer type P, Pin<P> acts like P except that unless P::Target implements Unpin, it is unsafe to mutably dereference Pin<P>. Second, there are two basic invariants unsafe code related to Pin has to maintain:
- If you unsafely get an &mut P::Target from a Pin<P>, you must never move P::Target.
- If you can construct a Pin<P>, it must be guaranteed that you'll never be able to get an unpinned pointer to the data that pointer points to until the destructor runs.
The implication of all of this is that if you construct a Pin<P>, the value pointed to by P will never move again, which is the guarantee we need for self-referential structs as well as intrusive collections. But you can opt out of this guarantee by just implementing Unpin for your type.
So if you implement Unpin for a type, you're saying that the type opts out of any additional safety guarantees of Pin - it's possible to mutably dereference the pointers pointing to it. This means you're saying the type does not need to be immovable to be used safely.
Moving a pointer type like Rc<T> doesn't move T because T is behind a pointer. Similarly, pinning a pointer to an Rc<T> (as in Pin<Box<Rc<T>>>) doesn't actually pin T, it only pins that particular pointer. This is why anything which keeps its generics behind a pointer can implement Unpin even when their generics don't.
withoutboats commented on Nov 7, 2018
This was one of the trickiest parts of the pinning API, and we got it wrong at first.
Unpin means "even once something has been put into a pin, it is safe to get a mutable reference to it." There is another trait that exists today that gives you the same access: Drop. So what we figured out was that since Drop is safe, Unpin must also be safe. Does this undermine the entire system? Not quite.
To actually implement a self-referential type would require unsafe code - in practice, the only self-referential types anyone cares about are those that the compiler generates for you: generators and async function state machines. These explicitly say they don't implement Unpin and they don't have a Drop implementation, so you know, for these types, once you have a Pin<&mut T>, they will never actually get a mutable reference, because they're an anonymous type that we know doesn't implement Unpin or Drop.
The problem emerges once you have a struct containing one of these anonymous types, like a future combinator. In order to go from a Pin<&mut Fuse<Fut>> to a Pin<&mut Fut>, you have to perform a "pin projection." This is where you can run into trouble: if you pin project to the future field of a combinator, but then implement Drop for the combinator, you can move out of a field that is supposed to be pinned.
For this reason, pin projection is unsafe! In order to perform a pin projection without violating the pinning invariants, you have to guarantee that you never do several things, which I listed in the stabilization proposal.
So, the tl;dr: Drop exists, so Unpin must be safe. But this doesn't ruin the whole thing, it just means that pin projection is unsafe and anyone who wants to pin project needs to uphold a set of invariants.
Matthias247 commented on Nov 8, 2018
Shouldn't an async state machine have a Drop implementation? Things that are on the "stack" of the async function (which probably equals fields in the state machine) need to get destructed when the async function completes or gets cancelled. Or does this happen otherwise?
SimonSapin commented on Nov 8, 2018
I guess what matters in this context is whether an impl Drop for Foo {…} item exists, which would run code with &mut Foo which could use for example mem::replace to "exfiltrate" and move the Foo value.
This is not the same as "drop glue", which can be called through ptr::drop_in_place. Drop glue for a given Foo type will call Drop::drop if it’s implemented, then recursively call the drop glue for each field. But those recursive calls never involve &mut Foo.
Nemo157 commented on Nov 8, 2018
Also, while a generator (and therefore an async state machine) has custom drop glue, it's just to drop the correct set of fields based on the current state, it promises to never move any of the fields during drop.
withoutboats commented on Nov 8, 2018
The terminology I use (though I don't think there's any standard): "drop glue" is the compiler generated recursive walking of fields, calling their destructors; "Drop implementation" is an implementation of the Drop trait, and "destructor" is the combination of the drop glue and the Drop implementation. The drop glue never moves anything around, so we only care about Drop implementations.