Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] field projections v2 #3735

Open
wants to merge 33 commits into
base: master
Choose a base branch
from

Conversation

y86-dev
Copy link

@y86-dev y86-dev commented Dec 4, 2024

Rendered

This is my second attempt at a field projection RFC, here is the first.


Add field projections, a very general operation. In simple terms, it is a new operator that turns a generic container type C<T> containing a struct T into a container C<F> where F is a field of the struct T. For example given the struct:

struct Foo {
    bar: i32,
}

One can project from &mut MaybeUninit<Foo> to &mut MaybeUninit<i32> by using the new field projection operator:

impl Foo {
    fn initialize(this: &mut MaybeUninit<Self>) {
        let bar: &mut MaybeUninit<i32> = this->bar;
        bar.write(42);
    }
}

Special cases of field projections are pin projections, or projecting raw pointers to fields *mut Foo to *mut i32 with improved syntax over &raw mut (*ptr).bar.

@y86-dev y86-dev force-pushed the field-projection-v2 branch from c60c1bc to 8d2aaac Compare December 4, 2024 10:19
@y86-dev y86-dev mentioned this pull request Dec 4, 2024
2 tasks
Comment on lines +435 to +436
// Here we use the special projections set up for `Mutex` with fields of type `Rcu<T>`.
let cfg: &Rcu<Box<BufferConfig>> = buf->cfg;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you show the shape of the impl needed here? I'd guess:

impl<F, T, U> Project<F> for Mutex<T>
where F: Field<Base = T, Type = Rcu<U>> {
    ...
}

Doesn't that hit orphan impl limitations? Or is that your own mutex type maybe.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will add it, it is our own mutex type, so we shouldn't hit orphan rules. Also, in RfL we want to relax orphan rules anyways, so for us it might not be an issue.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Originally, I intended to add the code, but I forgot. I now noticed an issue with my idea for how this would work, since in the meantime, I added impl<'a, T, F: Field<Base = T>> Project<F> for &'a T. That overlaps with the following:

impl<'a, T> Projectable for &'a Mutex<T> {
    type Inner = T;
}

impl<'a, T, F, U: 'a> Project<F> for &'a Mutex<T>
where
    F: Field<Base = T, Type = Rcu<U>>
{
    type Output = &'a Rcu<U>;

    fn project(self) -> Self::Output {
        todo!()
    }
}
  • removing the blanket impl for all &'a T can't be done, because we need it to be able to project &UnsafeCell and other types with #[projecting].
  • this looks like something specialization might help with, but I am totally out of the loop wrt that, last time there wasn't much movement there, so I think we have to come up with something else.
  • one thing that we could do is the following:
    give containers with #[projecting] the ability to select which fields are passed through, then for Mutex, we could only allow Rcu<U> fields.
    I haven't thought about this for long, maybe this is unsound in our case, so I'm not too inclined to jump on this, but maybe with more thought, it ends up a good idea...

Copy link

@Coder-256 Coder-256 Dec 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps something like this would work?:

pub struct Mutex<T> {
    pub data: MutexData<T>,
    inner_mutex: InnerMutex, // the actual locking primitive
}

#[projecting]
#[repr(transparent)]
pub struct MutexData<T>(UnsafeCell<T>);

// ...

let cfg: &Rcu<Box<BufferConfig>> = buf->data->cfg;

(Maybe using #[projecting] on MutexData isn't quite accurate here; the point is, maybe somehow you could write custom projections for MutexData). Just one idea.

Edit: or maybe more along the lines of:

pub struct Mutex<T> {
    data: UnsafeCell<T>,
    inner_mutex: InnerMutex, // the actual locking primitive
}

#[projecting]
#[repr(transparent)]
pub struct MutexProjection<'a, T>(*const T, PhantomData<&'a UnsafeCell<T>>);

impl<T> Mutex<T> {
    fn get_projection(&self) -> MutexProjection<'_, T> {
        MutexProjection(self.data.get() as *const T, PhantomData)
    }
}

// ...

let cfg: &Rcu<Box<BufferConfig>> = buf.get_projection()->cfg;

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, yeah having a separate reference type for a mutex that can be projected might be possible. But it also is a bit annoying...

(Note that the MutexProjection type that you defined should implement Projectable/Project instead of being #[projecting])

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could do like for Deref: it wouldn't be &T: ProjectMove, it would be &T: ProjectRef, which doesn't conflict with Mutex<T>: ProjectRef.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused, how exactly would that look like? Is ProjectMove the current version of Project and

pub trait ProjectRef<F: Field<Base = Self::Inner>>: Projectable {
    type Output<'a>;
    fn project(&self) -> Self::Output<'_>;
}

and ProjectMut is the same, but with &mut self as the receiver?

What if &Foo implements ProjectMove and Foo implements ProjectRef? What does -> do in that case?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I didn't realize this would require a GAT. I haven't thought this through but I'd imagine when there's conflict we'd have to figure out a rule to infer the right one to use, like Deref vs DerefMut do today.

Copy link
Contributor

@arielb1 arielb1 Dec 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you have a ProjectibleBase trait, then you could do impl<T> !ProjectibleBase for Mutex<T> [and the impl for refs in libcore should be impl<T: ProjectibleBase, F: Field> ProjectiblePointer<F> for &T)

~~Tho it would be annoying for the implementor of Mutex since they'll have to manually do field accesses with

fn inner_mutex(&self) -> &InnerMutex {
    field_of!(Self, inner).project_ref(self)
}
```~~
ed: no, Projectible should only affect `operator->` not `operator.`

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you have a ProjectibleBase trait, then you could do impl<T> !ProjectibleBase for Mutex<T> [and the impl for refs in libcore should be impl<T: ProjectibleBase, F: Field> ProjectiblePointer<F> for &T)

But that requires negative impls :(

~~Tho it would be annoying for the implementor of Mutex since they'll have to manually do field accesses with

fn inner_mutex(&self) -> &InnerMutex {
    field_of!(Self, inner).project_ref(self)
}

~~ ed: no, Projectible should only affect operator-> not operator.

Even if the solution is a bit more cumbersome for the implementer of the container, we would greatly appreciate it. Our own mutex type already has considerable unsafe code and we have the philosophy of improving driver developer ergonomics over our own library.

}
```

## Interactions
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question I didn't see mentioned: what about simultaneous projections? e.g.

#[derive(PinProject)]
struct FairRaceFuture<F1, F2> {
    #[pin]
    fut1: F1,
    #[pin]
    fut2: F2,
    fair: bool,
}
let fut: Pin<&mut FairRaceFuture<_, _>> = ...;
let fut1 = fut->fut1;
let fut2 = fut->fut2;

Most projectable pointers would theoretically allow fut1 and fut2 to coexist. But that seems really hard to allow with user-provided types: fut should clearly not remain fully accessible while fut1 is live, yet passing it to a function like project likely invalidates fut1. We'd need project to take raw pointers, but then we can't have it for custom types.

Is there a design that could accomodate this in some way?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question I didn't see mentioned: what about simultaneous projections? e.g.

That is indeed something that I should mention.

Most projectable pointers would theoretically allow fut1 and fut2 to coexist. But that seems really hard to allow with user-provided types: fut should clearly not remain fully accessible while fut1 is live, yet passing it to a function like project likely invalidates fut1. We'd need project to take raw pointers, but then we can't have it for custom types.

Is there a design that could accomodate this in some way?

In the Design Meeting on Field Projections, I brought this up, but only for the Pin case. There one needs some kind of "automatic reborrowing". But this still only allows consecutive uses of projected fields, not simultaneous usage.

For simultaneous usage, we would have to involve the borrow checker, and let the user choose a behavior (i.e. should it be like &T, &mut T, *mut T. maybe there are more?). That might be a good way to fix this, either through a trait or through macro intrinsics (that depends on how the borrow checker works internally).

borrow_check!(unsafe <'a, T> Pin<&'a mut T> as &'a mut T);
// or
unsafe impl<'a, T> BorrowCheck<&'a mut T> for Pin<&'a mut T> {}

Copy link
Member

@Nadrieril Nadrieril Dec 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if the borrow-checker understands, we need a design that allows for sound behavior: you can't alias a &mut, and reborrowing the whole &mut T would invalidate any existing references, right? E.g.

fn project1(pin: &mut Pin<&mut Struct>) -> Pin<&mut Field1> { ... }
fn project2(pin: &mut Pin<&mut Struct>) -> Pin<&mut Field2> { ... }
let mut pin = ...;
let field1 = project_field1(&mut pin);
// I think the second borrow of `pin` invalidates field1 regardless of what the borrow-checker says
let field2 = project_field2(&mut pin);

Copy link
Author

@y86-dev y86-dev Dec 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is correct, but if we do that with &mut Struct instead, the borrow checker knows what is going on:

let x: &mut Struct = ...;
let field1 = &mut x.field1;
let field2 = &mut x.field2;

We "just" have to make Pin<&mut T> & its projections behave in the same way as &mut T & normal field access for the borrow checker.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed that we could tell the borrow-checker to consider fut->field1 and fut->field2 to be disjoint, but then we'd need the underlying memory model to allow this without UB. The problem we have is the project function call. As far as I know of the current state of discussion on this, passing a &mut Whatever to a function morally counts as an access, which invalidates any sibling references to that Whatever. I'll let the opsem experts clarify though.

This is similar to how we can't borrow two disjoint fields through a DerefMut (Box cheats on this btw).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, that is a problem...

Copy link
Member

@Nadrieril Nadrieril Dec 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe what we need is:

/// SAFETY: An impl is safe if calling `project` several times on the same `self` but disjoint fields and exposing the resulting outputs to safe code is safe.
unsafe pub trait ProjectMut<F: Field<Base = Self::Inner>>: Projectable {
    type Output;
    /// SAFETY: The input must be cast from a safe `&mut Self`, and possibly passed to this `project` function with disjoint `F` fields and nothing else.
    unsafe fn project(*mut self) -> Self::Output;
}

The safety requirements would need fleshing out to be rigorous but you get the idea: the borrow-checker would ensure we don't hold two projected pointers to the same field at the same time.

There'd also be a ProjectShared for the shared case, that allows multiple aliasing shared projections. And ProjectMove would ask the borrow-checker to track which fields have been moved out and to project+drop the non-moved fields at end of scope. This may run into issues similar to those with DerefMove, I don't recall exactly what blocked that one.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I know of the current state of discussion on this, passing a &mut Whatever to a function morally counts as an access, which invalidates any sibling references to that Whatever. I'll let the opsem experts clarify though.

This is correct. We'd want this to be more like a MaybeDangling<&mut T>, and suppress all aliasing-related requirements temporarily, until we have the field and it can become an &mut F.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have pushed a new design that addresses simultaneous projections by introducing an intermediate type for projection. instead of taking *mut Self, the user can specify the type themselves. When projecting a single value multiple times for different fields, the compiler will just ptr::read(&inter) it without dropping the value and give that to the project function.

I haven't yet laid out the safety requirements for implementing the trait, and maybe using *mut Self is better, but I would like to hear some feedback first.

@ehuss ehuss added the T-lang Relevant to the language team, which will review and decide on the RFC. label Dec 4, 2024
Comment on lines +1268 to +1273
### Field Projection Operator

Current favorite: `$base:expr->$field:ident`.

Alternatives:
- use `~` instead of `->`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for field: T, the expression base->field producing any type other than a place of type T would be very confusing for readers knowing C and C++.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think someone coming from C might find &raw const base->field most familiar while still being in line with the difference between & and &raw const. I'm not sure that's much more ergonomic than &raw const (*base).field though (except that it could be safe which is still a huge improvement).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a lot of things in Rust that are very confusing for readers that know C and C++. I personally don't think this is a huge issue, because for pointer types the operation is very similar to "getting a place" and for types it is just a generalization.

That said, I don't really mind the exact syntax of the operator as long as it's short and conveys the correct meaning. What would be your suggestion for an alternative?
Do you want me to add anything to the RFC (e.g. add this as a drawback)?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I guess adding &raw const really would only be immediately obvious for raw pointers anyway. I agree it should be concise and uniform across containers so I think -> might be the best option. If it's introduced in the context of another container like MaybeUninit or Pin -> shouldn't be an issue.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would personally advocate ~ here, because it feels like it wouldn't have the C++ connotations of ->, and because it is more of a "navigate" operation than a "dereference" operation.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both ~ and -> are fine with me. Just a note, in the previous RFC, it was argued that "[t]he ~ symbol was removed from Rust because it is hard to type on a number of keyboard layouts".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't buy that ~ is removed because of AltGr, the quoted French and German keyboard layouts showed []{}\| all involved AltGr yet we have no problem using any of these 6 symbols. Additionally there is that X: ~const Trait syntax. I think it's more that we don't want to waste the remaining unused symbols.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally there is that X: ~const Trait syntax.

That's placeholder syntax, don't use it as precedent please.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find ~ just looks very odd, so I personally slightly prefer ->. It does something with fields behind pointers, so IMO it's "close enough" to the C/C++ equivalent.

But in the end it's just syntax, I don't care very strongly and find it hard to predict how others will react.

Copy link
Contributor

@tgross35 tgross35 Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would personally advocate ~ here, because it feels like it wouldn't have the C++ connotations of ->, and because it is more of a "navigate" operation than a "dereference" operation.

I don't think the C++ bit holds when ~ is used for bitwise not and destructors, field projection seems more akin to -> member access than either of those :)

(but I agree with Ralf, slight preference for -> but it doesn't really matter)

- make from_pinned_ref `unsafe`
- add safety documentation
`ptr->flags: *mut u32`. Essentially `ptr->field` is a shorthand for `&raw mut (*ptr).field` (for
`*const` the same is true except for the `mut`). However, there is a small difference between the
two: the latter has to be `unsafe`, since `*ptr` requires that `ptr` be dereferencable. But field
projection is a safe operation and thus it uses [`wrapping_add`] under the hood. This is less
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might be better for raw-pointer(-like) field projection to be unsafe (like the "desugared" version), and require the at least the specific offset to be in-bounds (if not the whole container).

In particular, NonNull having field projection be safe would mean that it would have to panic (or something) if the projection would result in null.

It seems like none of the motivating examples would benefit from allowing ptr->field where field is not inbounds? Maybe there are usecases for this but I haven't thought of any where it wouldn't be clearer to explicitly use wrapping_add IMO.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might be better for raw-pointer(-like) field projection to be unsafe (like the "desugared" version), and require the at least the specific offset to be in-bounds (if not the whole container).

In particular, NonNull having field projection be safe would mean that it would have to panic (or something) if the projection would result in null.

Note that wrapping_add is itself a safe function. So the implementation of NonNull::project only contains safe code, making UB impossible. Also, there is no need for a panic, since a projection can never result in null.

It seems like none of the motivating examples would benefit from allowing ptr->field where field is not inbounds? Maybe there are usecases for this but I haven't thought of any where it wouldn't be clearer to explicitly use wrapping_add IMO.

The reason for why I used wrapping_add is that the following code is UB when the -> operation uses add. Making the operation unsafe for raw pointers is probably impossible, since it is backed by the Project trait and the project function can either be unsafe or safe.
Maybe we can split it into two different mutually exclusive traits, but I think it would be better both for consistency and for ergonomic reasons to keep field projections safe.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that wrapping_add is itself a safe function. So the implementation of NonNull::project only contains safe code, making UB impossible. Also, there is no need for a panic, since a projection can never result in null.

NonNull::wrapping_add does not exist, for the reason I mentioned: offsetting a NonNull can result in null.

For a concrete example: What is the behavior of this safe code with your proposed semantics?

use std::{ptr::NonNull, mem::offset_of};
#[repr(C)]
struct Foo {
  x: u8,
  y: u8,
}

let ptr: NonNull<Foo> = NonNull::new(usize::MAX as *mut Foo);
let field: NonNull<u8> = ptr->y; // either this panics or returns the "wrong" pointer, or we have UB somewhere
assert_eq!(ptr.as_ptr().addr(), usize::MAX);
assert_eq!(offset_of!(Foo, y), 1); // Foo is repr(C)
assert_eq!(usize::MAX.wrapping_add(1), 0); // the expected address of (*ptr).y is 0, which is not a valid NonNull
assert_eq!(field.as_ptr().addr(), 0); // NonNull with addr 0?

The reason for why I used wrapping_add is that the following code is UB when the -> operation uses add.

Yes, offsetting outside of an allocation using field projections is what I am saying I think should be UB. Also, what "following code" are you referring to here? The next code snippet after this line of the file is the set_flags_raw example, which unsafely writes to the projected pointer anyway and does not discuss how making the projection safe reduces UB (though it could in some edge cases; I just don't think they are worth it).

Maybe we can split it into two different mutually exclusive traits, but I think it would be better both for consistency and for ergonomic reasons to keep field projections safe.

If we want this operator to be able to be used generically, and not just for concrete pointer types, we could have an UnsafeFoo trait be a supertrait of the Foo trait, such that any Foo is also usable in a where T: UnsafeFoo context (perhaps with a blanket impl<T: ?Sized + Foo> UnsafeFoo for T { ... } so users don't have to implement UnsafeFoo for their Foo types).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that wrapping_add is itself a safe function. So the implementation of NonNull::project only contains safe code, making UB impossible. Also, there is no need for a panic, since a projection can never result in null.

NonNull::wrapping_add does not exist, for the reason I mentioned: offsetting a NonNull can result in null.

Oh, totally overlooked that. Yeah that seems bad...

For a concrete example: What is the behavior of this safe code with your proposed semantics?

Yeah that doesn't really work...

The reason for why I used wrapping_add is that the following code is UB when the -> operation uses add.

Yes, offsetting outside of an allocation using field projections is what I am saying I think should be UB.

Agreed.

Also, what "following code" are you referring to here?

Oops, I meant to add some code, but the point is moot now, since I didn't understand what you meant before.

The next code snippet after this line of the file is the set_flags_raw example, which unsafely writes to the projected pointer anyway and does not discuss how making the projection safe reduces UB (though it could in some edge cases; I just don't think they are worth it).

Maybe we can split it into two different mutually exclusive traits, but I think it would be better both for consistency and for ergonomic reasons to keep field projections safe.

If we want this operator to be able to be used generically, and not just for concrete pointer types, we could have an UnsafeFoo trait be a supertrait of the Foo trait, such that any Foo is also usable in a where T: UnsafeFoo context (perhaps with a blanket impl<T: ?Sized + Foo> UnsafeFoo for T { ... } so users don't have to implement UnsafeFoo for their Foo types).

Yes, we could have:

pub trait UnsafeProject<F>: Projectable
where
    F: Field<Base = Self::Inner>,
{
    type Output;

    unsafe fn project(self) -> Self::Output;
}

impl<P, F> UnsafeProject<F> for P
where
    P: Projectable,
    F: Field<Base = <P as Projectable>::Inner>,
    P: Project<F>,
{
    type Output = <P as Project<F>>::Output;

    unsafe fn project(self) -> Self::Output {
        <P as Project<F>>::project(self)
    }
}

But I don't know if it is possible for the compiler to detect which trait is implemented.

Copy link
Member

@kennytm kennytm Dec 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I don't know if it is possible for the compiler to detect which trait is implemented.

The compiler could just always use Project when available, and fallback to UnsafeProject otherwise.

This is similar how the compiler is able to choose which of Fn/FnMut/FnOnce when desugaring call to an unboxed closure.

But I'm not sure if the compiler also required a hierarchical relationship similar to the Fn traits to make this work.

pub trait UnsafeProject<F>: Projectable
where
    F: Field<Base = Self::Inner>,
{
    type Output;
    unsafe fn project_unchecked(self) -> Self::Output;
}

pub trait Project<F>: UnsafeProject<F>
where
    F: Field<Base = Self::Inner>,
{
    fn project(self) -> Self::Output;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's simply not what unsafe means though. It is safer, just in the sense that it pessimizes codegen.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, what I mean is that offset is the "correct" version to be using by default over wrapping_offset. If we wanted field projections to reflect that and use offset instead of wrapping_offset, the operator could be 'magically' marked unsafe like with primitive addition and deref on pointers.

I do see why it wouldn't be correct to mark wrapping_add as unsafe, but that was a tangent from my main point and I shouldn't have included it at all.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if we use ptr->x for the optimizable, unsafe .offset() and Wrapping(ptr)->x for the safe .wrapping_offset()

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it seems like having both unsafe and safe projection operations is a necessity... Originally I thought that we might get away without them, but this shows that we need them both. I don't have an immediate design idea, this is probably something for a design meeting.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have thought a bit more on this issue and I don't think that we should have a general option of making the -> operator unsafe. It would be the first unsafe generic and user-specifiable operator and keeping track of the safety requirements for using it is going to be a nightmare, as there is no place in the docs where they are currently rendered. For this reason I think we should make -> be a safe operator.
However, I agree that for raw pointers, it should use the unsafe offset function and thus be unsafe. In a way this is exactly how the deref operator *x currently works. User-specified implementations of it must be safe, but raw pointer derefrencing is unsafe.

@Coder-256
Copy link

What is the implementation of Projectable for Cow<'_, T>?

@y86-dev
Copy link
Author

y86-dev commented Dec 4, 2024

@Coder-256

What is the implementation of Projectable for Cow<'_, T>?

added the implementation to the RFC

@Coder-256
Copy link

Coder-256 commented Dec 4, 2024

I see, I'm still a bit confused. The value in Cow::Owned has type <T as ToOwned>::Owned, which is not always the same as T. For example, this wouldn't work for Cow<str> I think. Maybe you just need a T: Clone bound though?

Edit: Cow<str> is a bad example obviously (it has no fields), but you can also implement ToOwned on a non-Clone struct and choose a custom type for Owned.


Alternatives:
- Introduce a more native syntax on the level of types `$Container:ty->$field:ident` akin to
projecting an expression.
Copy link
Member

@joshtriplett joshtriplett Dec 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 for having native syntax for fields of a type.

Copy link
Author

@y86-dev y86-dev Dec 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a natural syntax would be to lift the expression version to types. So if we go with ~, then we would have both a version for types and for expressions:

struct Foo {
    bar: i32,
}

type Foo_bar_field = Foo~bar;
let x: &UnsafeCell<Foo> = todo!();
let bar: &UnsafeCell<i32> = x~bar;

@y86-dev
Copy link
Author

y86-dev commented Dec 4, 2024

I see, I'm still a bit confused. The value in Cow::Owned has type <T as ToOwned>::Owned, which is not always the same as T. For example, this wouldn't work for Cow<str> I think. Maybe you just need a T: Clone bound though?

Edit: Cow<str> is a bad example obviously (it has no fields), but you can also implement ToOwned on a non-Clone struct and choose a custom type for Owned.

Oh, I totally forgot about ToOwned... I fixed it for now by only providing it if the owned type is the same as the borrowed type.

@joshtriplett
Copy link
Member

joshtriplett commented Dec 4, 2024

Could you please add a section discussing how we can handle types where the projection might have a different type than the original?

For instance, due to the in-memory layout, we can't turn an Arc<StructType> into an Arc<FieldType>, but we could have some kind of ArcRef type that references an external reference count, and projection could work for that type. So at a minimum we should document how we can handle things like that. But I'm also wondering if we can have some kind of associated type that would allow type-changing field projections:

let x: Arc<StructType> = Arc::new(StructType { field: "a string".to_string() });
let y: ArcRef<String> = x~field;

That would be nicer than having to do something like x.to_arc_ref()~field.

@y86-dev
Copy link
Author

y86-dev commented Dec 4, 2024

@joshtriplett I added ArcRef in future possibilities. The current proposal already supports changing the type when projecting, so there is no need to do x.to_arc_ref()->field (I'm continuing to use -> until we change it in the RFC :)

```

Then there are two field types for `Opaque<Foo>`:
- one representing `bar` with `Base = Opaque<Foo>` and `Type = Opaque<i32>`
Copy link
Member

@programmerjake programmerjake Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not just use the field types for Foo directly, so the field type for bar impls Field<Base = Foo, Type = i32>?

maybe change the design like the following, so you don't need a magic #[projecting] attribute:

pub unsafe trait FieldType: Copy + Send + Sync + Default + ... {} // maybe omit

/// Self is what used to be `Field::Base`
pub unsafe trait HasUnalignedField<F: FieldType> { // maybe leave `Has` out of name?
    type Type: ?Sized;

    const OFFSET: usize;
}

pub unsafe trait HasField<F: FieldType>: HasUnalignedField<F> {}

pub trait Projectable: Sized {
    type Inner: ?Sized;
}

// no Self::Inner: HasUnalignedField<F> bound for
// future compatibility with enum variant fields or union fields
pub trait Project<F>: Projectable {
    type Output;

    // pass in `field` value in case rust gains
    // fields that need runtime computation
    // to get the offset (e.g. fields of scalable vector types),
    // or things like C++ member pointers, for now it's always a ZST
    fn project(self, field: F) -> Self::Output;
}

unsafe impl<F: FieldType, T: HasUnalignedField<F, Type: Sized>> HasUnalignedField<F> for Cell<T> {
    type Type = Cell<<T as HasUnalignedField<F>>::Type>;
    const OFFSET: usize = <T as HasUnalignedField<F>>::OFFSET;
}

unsafe impl<F: FieldType, T: HasField<F, Type: Sized>> HasField<F> for Cell<T> {}

this allows implementing HasUnalignedField/HasField multiple times for the same field type, it also could allow container-like types that have other non-ZST fields:

struct MyWrapper<T> {
    other_field: u32,
    field: T,
}

unsafe impl<F: FieldType, T: HasUnalignedField<F, Type: Sized>> HasUnalignedField<F> for MyWrapper<T> {
    type Type = <T as HasUnalignedField<F>>::Type;
    const OFFSET: usize = <T as HasUnalignedField<F>>::OFFSET + offset_of!(Self, field);
}

unsafe impl<F: FieldType, T: HasField<F, Type: Sized>> HasField<F> for MyWrapper<T> {}

it also allows you to have a HashMap where the key is the TypeId of the field type and it works even if you access fields of MyStruct through SomeRef<MyStruct> or SomeOtherType<MyStruct>, since in both cases the field types are the same types -- the field types of MyStruct...

Using something like that HashMap is how I would probably implement field access for fayalite's Expr<MyStruct> if I were redesigning the API to use the projection operator introduced by this RFC.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That looks very interesting!

How does the compiler find the right field type to return via field_of! though? Let's say I have the MyWrapper struct from above (it should probably be #[repr(C)]) as well as Foo:

struct Foo {
    other_field: i32,
    field: u32,
}

fn demo(x: &MyWrapper<Foo>) {
    let field = x->field; // which `field` is meant by this?
    let other_field = x->other_field; // same here
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does the compiler find the right field type to return via field_of! though? Let's say I have the MyWrapper struct from above (it should probably be #[repr(C)]) as well as Foo:

#[repr(C)] is unnecessary since we're using offset_of!. I didn't think of finding the right inner type...maybe we need MyWrapper<T>: Projectable<Inner = T> and the compiler follows the chain of Projectable impls kinda like deref?

struct Foo {
    other_field: i32,
    field: u32,
}

fn demo(x: &MyWrapper<Foo>) {
    let field = x->field; // which `field` is meant by this?
    let other_field = x->other_field; // same here
}

assuming MyWrapper<Foo>: Projectable, since MyWrapper opted-in to being projectable, those would resolve to Foo's fields. alternatively they could be ambiguous.

Copy link
Author

@y86-dev y86-dev Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But x->field currently would desugar as:

Project::<field_of!(<typeof(x) as Projectable>::Inner, field)>::project(x, /* I don't know how we would get the value here, but at the moment its a ZST, so it doesn't matter */)

So in my example, since typeof(x) == &MyWrapper<Foo>, we would get the expression to be:

Project::<field_of!(MyWrapper<Foo>, field)>::project(x, /* ... */)

At least that's my reason for creating the Projectable trait. We could have a different trait (or an attribute) that would disable the creation of field types for a certain type (or maybe for individual fields?).

I think it is really important that the field projection expression x->field has at most one field candidate.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm proposing the desugaring would change so a->b desugars to something like:

type Base<...> = {
    let mut base: type = typeof(a);

    // only true if the impl is visible, it can be hidden due to generics or impl Trait
    while base: Projectable {
        base = base::Inner;
    }
    base
};
a.project(field_of!(Base, b) {})

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or maybe a better option is following deref coercion semantics where it stops at the first type with that field where the field is visible (due to pub).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds reasonable, though might lead to people having to do some visibility gymnastics:

mod my_wrapper_def {
    pub struct MyWrapper<T> {
        other_field: u32,
        field: T,
    }
    /* inherit all fields of `field: T`... */
}
pub use my_wrapper_def::MyWrapper;
pub mod foo_def {
    use super::*;
    pub struct Foo {
        bar: i32,
        baz: u32
    }

    impl MyWrapper<Foo> {
        pub fn set_bar(this: &mut MaybeUninit<Wrapper<Foo>>, value: i32) {
            let bar: &mut MaybeUninit<i32> = this->bar;
            bar.write(value);
        }
    }
}

Which probably is rather annoying... So while we should use your suggested algorithm, I think it's probably still very useful to not generate field types for certain fields via an attribute:

pub struct MyWrapper<T> {
    #[no_field_type_bikeshed]
    other_field: u32,
    #[no_field_type_bikeshed]
    field: T,
}
/* inherit all fields of `field: T`... */
pub struct Foo {
    bar: i32,
    baz: u32,
}

impl MyWrapper<Foo> {
    pub fn set_bar(this: &mut MaybeUninit<Wrapper<Foo>>, value: i32) {
        this->bar.write(value);
    }
}

@RalfJung
Copy link
Member

RalfJung commented Dec 6, 2024

I didn't expect a solution for Cell projections, Pin projections, and improved raw ptr ergonomics in a single RFC. This is a nice early Christmas gift. :)

Comment on lines +918 to +924
- [field types],
- `Field` traits,
- `field_of!` macro,
- [`#[projecting]`](#projecting-attribute) attribute
- projection operator `->`,
- `Project` trait,
- `Projectable` trait,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to figure out what will eventually need to be stable so field projection can be used on library types, vs. what will need to be stable to make custom types field projectable. Is this list accurate?

  • With access to ->, a user can make use of basic field projection on library types and pointers
  • With access to #[projecting] and/or the ability to implement Project and Projectable, a user can make their own types field projectable
  • With access to the Field trait members (but not necessarily ability to implement) and field_of!, a user can express field projection in function signatures
  • Field and UnalignedField may never be manually implemented, they are compiler-generated for any field that is accessed by field_of!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to figure out what will eventually need to be stable so field projection can be used on library types, vs. what will need to be stable to make custom types field projectable. Is this list accurate?

It is, but one can add additional caveats/information:

  • With access to ->, a user can make use of basic field projection on library types and pointers

-> also gives access to projections for custom types (if we depend on another crate using unstable features to implement custom projections).

  • With access to #[projecting] and/or the ability to implement Project and Projectable, a user can make their own types field projectable

Yes, but often one needs the information contained in the Field trait members, since how else can one project to a field? (of course if the stdlib already provides the projection for raw pointers and one has a raw pointer then it's fine)

  • With access to the Field trait members (but not necessarily ability to implement) and field_of!, a user can express field projection in function signatures

One only needs the Field trait to express field projection in function signatures. Of course to call them, one needs the field_of! macro (or the syntax).

  • Field and UnalignedField may never be manually implemented, they are compiler-generated for any field that is accessed by field_of!

Yes, that is the current idea, but for some reasons we might want to lift that restriction, see Field trait stability

Copy link
Member

@programmerjake programmerjake Dec 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think an enum variant field implementing something other than Field would work well. (unless the enum has only 1 variant, in which case it's effectively both a struct and an enum.)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand correctly, what are you replying to in this thread? Do you mean as a way to implement enum projections?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, i meant enum projections and that for those, trait Field isn't used so at least in this aspect there isn't a problem with users implementing Field since it won't need to change for enums.

@tmccombs
Copy link

tmccombs commented Dec 7, 2024

Would it make sense to implement Project for types like Option and maybe Result with something like

impl<T: Projectsble> Projectable for Option<T> {
    type Inner =  <T as Projectable>::Inner;
}

impl<T, F> Project<F> for Option< T>
where
    F: Field<Base = Self::Inner>,
    T: Project<F>,
{
    type Output = Option<F::Type>;

    fn project(self) -> Self::Output {
        match self {
          Some(v) => Some(v.project())
          None => None
    }
}

Unfortunately, I don't see how to implement it soundly for all T, since doing so would mean moving the field, so you would need an Option of some reference type.

It could also be implemented for all T if F::Type: Copy, but you can't have both, because of coherence, and I think having it for options of reference types is more useful.

@kennytm
Copy link
Member

kennytm commented Dec 7, 2024

@tmccombs this has the same issue as applying Projectable on Cow<T>, which is that Option<T> is a container of T rather than a pointer to T, meaning accessing its field would be necessarily destructive. Supporting these requires "MoveableField" discussed at the end of the RFC.

If Projectable does not need to be a pointer, the distinction between it and #[projecting] would be blurred, and it raises the question again that why not implement Projectable for #[projecting] containers.

@tmccombs
Copy link

tmccombs commented Dec 7, 2024

accessing its field would be necessarily destructive.

Not if it is projecting through another pointer type, so for some struct Foo, Moveable field would be necessary for Option<Foo> but not for, say Option<&Foo>.

And even if there is a MoveableField, then using that would mean you can't project an Option<&T> for a non-moveable field.

But perhaps it would be more useful if you could project on an &Option or &mut Option. Unfortunately, implementing Project on those would conflict with the impl for &T and &mut T, but perhaps there is a way that #[projecting] could work on enums? I think that might be what the generalized enum projection section is talking about ?

@y86-dev
Copy link
Author

y86-dev commented Dec 7, 2024

accessing its field would be necessarily destructive.

Not if it is projecting through another pointer type, so for some struct Foo, Moveable field would be necessary for Option<Foo> but not for, say Option<&Foo>.

Yes, for Option<P> where P is itself a pointer type, this works. I actually think your implementation from above makes sense:

impl<T: Projectsble> Projectable for Option<T> {
    type Inner =  <T as Projectable>::Inner;
}

impl<T, F> Project<F> for Option< T>
where
    F: Field<Base = Self::Inner>,
    T: Project<F>,
{
    type Output = Option<F::Type>;

    fn project(self) -> Self::Output {
        match self {
          Some(v) => Some(v.project())
          None => None
    }
}

Project is implemented for pointer-like types, so adding this impl would allow projecting all sorts of things (for example Option<&mut MaybeUninit<T>> or even multiple nested stuff Option<&mut UnsafeCell<MaybeUninit<T>>).

And even if there is a MoveableField, then using that would mean you can't project an Option<&T> for a non-moveable field.

But perhaps it would be more useful if you could project on an &Option or &mut Option. Unfortunately, implementing Project on those would conflict with the impl for &T and &mut T, but perhaps there is a way that #[projecting] could work on enums? I think that might be what the generalized enum projection section is talking about ?

I think that projecting through &Option<T> is not possible, since the memory layout of Option<T> is not the same as T, thus a value of type Option<T> does not "contain" a field of type Option<F> if F is a field of T.

The section on enum projection is about enabling to use enums instead of structs: with the current proposal, one can project Pin<&mut T> to Pin<&mut F> if F is a structurally pinned field of T, but if T is instead an enum with a single variant and a single structurally pinned field, the same is not possible. The section on enum projections is about allowing that. Not about enabling projections for container types that enums, since for them, adding #[projecting] is impossible for the above reason and Project can be implemented for any type, so they are already supported.

@Nadrieril

This comment was marked as off-topic.

@kennytm

This comment was marked as off-topic.

Comment on lines +1299 to +1325
When the same projection base is used multiple times, the desugaring is as follows:

```rust
struct T {
field: F,
x: X,
y: Y,
}

let t: C<T> = /* ... */;
let _ = t->field;
let _ = t->x;
let _ = t->y;

// becomes

let __inter_t = Projectable::start_projection(t);
let _ = SimultaneousProject::<field_of!(<C<T> as Projectable>::Inner, field)>::project(
unsafe { ptr::read(&__inter_t) },
);
let _ = SimultaneousProject::<field_of!(<C<T> as Projectable>::Inner, x)>::project(
unsafe { ptr::read(&__inter_t) },
);
let _ = Project::<field_of!(<C<T> as Projectable>::Inner, y)>::project_last(
Projectable::start_projection(t),
);
```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 how do you determine when does the simultaneous projection starts and ends? does it need help from the borrow checker?

let mut t: Pin<&mut Foo> = t1.as_mut();
assert_eq!(t->x.0, 1);
if *(t->a) {
    t = t2.as_mut(); // t is conditionally changed here
}
assert_eq!(t->x.0, 2);

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a very good question. And I think that the answer has to be yes.

I think that we should just mirror the behavior that we already have with &mut T. If I remember its behavior correctly, that would mean all projections in your example above would be project_last and we'd have to add automatic reborrowing to Pin.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That project_last looks like a destructor anyway, could it instead be lowered as

let __inter_t = Projectable::start_projection(t);
let _ = SimultaneousProject::<field_of!(<C<T> as Projectable>::Inner, field)>::project(
   unsafe { ptr::read(&__inter_t) },
);
let _ = SimultaneousProject::<field_of!(<C<T> as Projectable>::Inner, x)>::project(
   unsafe { ptr::read(&__inter_t) },
);
let _ = SimultaneousProject::<field_of!(<C<T> as Projectable>::Inner, y)>::project(  // <-- no more project_last
   unsafe { ptr::read(&__inter_t) },
);
drop(__inter_t);  // <-- `impl Drop for C<T>::Inter` if need custom clean-up behavior

The dropck or whatever can handle where to drop __inter_t.

The disadvantage is that for ArcRef this will incur a redundant inc-ref/dec-ref operation that should be elided with the project_last approach (that sounds like #3680 actually).


I think another alternative is to copy pin-project(-lite)'s API surface, making the projected structure (like __inter_t) explicit. But instead of exposing C<T>::Inter, the output type of x.project() would remain anonymous and its fields are populated using SimultaneousProject<field_of!()> at the beginning.

// normal usage
let this: &mut MaybeUninit<Self>;
...
let bar: &mut MaybeUninit<i32> = this.project().bar;
bar.write(42);
// access simultaneous projection
let mut t: Pin<&mut Foo> = t1.as_mut();
let mut t_proj = t.project();
assert_eq!(t_proj.x.0, 1);
if *(t_proj.a) {
    t = t2.as_mut();
    t_proj = t.project();
}
assert_eq!(t_proj.x.0, 2);

(one could introduce a suffix operator such as x~ to be equivalent to x.project() so we could write this~.bar which has the same length as this->bar.)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That project_last looks like a destructor anyway, could it instead be lowered as

let __inter_t = Projectable::start_projection(t);
let _ = SimultaneousProject::<field_of!(<C<T> as Projectable>::Inner, field)>::project(
   unsafe { ptr::read(&__inter_t) },
);
let _ = SimultaneousProject::<field_of!(<C<T> as Projectable>::Inner, x)>::project(
   unsafe { ptr::read(&__inter_t) },
);
let _ = SimultaneousProject::<field_of!(<C<T> as Projectable>::Inner, y)>::project(  // <-- no more project_last
   unsafe { ptr::read(&__inter_t) },
);
drop(__inter_t);  // <-- `impl Drop for C<T>::Inter` if need custom clean-up behavior

The dropck or whatever can handle where to drop __inter_t.

The disadvantage is that for ArcRef this will incur a redundant inc-ref/dec-ref operation that should be elided with the project_last approach (that sounds like #3680 actually).

Yeah, that's something that I tried to avoid with that approach. I think it's pretty important to not have additional overhead with projecting.

With regards to #3680, I am not sure if that is going to guarantee zero overhead. I am certain that it will help in a lot of cases, but since this is going to be used for very low level operations, I think an implementer of field projections must be able to ensure zero overhead.

I also don't like that the trait has two different project functions, so if there is a solution that only uses one function and doesn't introduce extra overhead that would be very nice.


I think another alternative is to copy pin-project(-lite)'s API surface, making the projected structure (like __inter_t) explicit. But instead of exposing C<T>::Inter, the output type of x.project() would remain anonymous and its fields are populated using SimultaneousProject<field_of!()> at the beginning.

That's an interesting idea. But if we use this approach, projecting an ArcRef to a single field would result in an inc-ref/dec-ref operation for every other field which is rather bad IMO.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.