Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] field projections v2 #3735

Open
wants to merge 30 commits into
base: master
Choose a base branch
from

Conversation

y86-dev
Copy link

@y86-dev y86-dev commented Dec 4, 2024

Rendered

This is my second attempt at a field projection RFC, here is the first.


Add field projections, a very general operation. In simple terms, it is a new operator that turns a generic container type C<T> containing a struct T into a container C<F> where F is a field of the struct T. For example given the struct:

struct Foo {
    bar: i32,
}

One can project from &mut MaybeUninit<Foo> to &mut MaybeUninit<i32> by using the new field projection operator:

impl Foo {
    fn initialize(this: &mut MaybeUninit<Self>) {
        let bar: &mut MaybeUninit<i32> = this->bar;
        bar.write(42);
    }
}

Special cases of field projections are pin projections, or projecting raw pointers to fields *mut Foo to *mut i32 with improved syntax over &raw mut (*ptr).bar.

@y86-dev y86-dev force-pushed the field-projection-v2 branch from c60c1bc to 8d2aaac Compare December 4, 2024 10:19
@y86-dev y86-dev mentioned this pull request Dec 4, 2024
2 tasks
Comment on lines +435 to +436
// Here we use the special projections set up for `Mutex` with fields of type `Rcu<T>`.
let cfg: &Rcu<Box<BufferConfig>> = buf->cfg;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you show the shape of the impl needed here? I'd guess:

impl<F, T, U> Project<F> for Mutex<T>
where F: Field<Base = T, Type = Rcu<U>> {
    ...
}

Doesn't that hit orphan impl limitations? Or is that your own mutex type maybe.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will add it, it is our own mutex type, so we shouldn't hit orphan rules. Also, in RfL we want to relax orphan rules anyways, so for us it might not be an issue.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Originally, I intended to add the code, but I forgot. I now noticed an issue with my idea for how this would work, since in the meantime, I added impl<'a, T, F: Field<Base = T>> Project<F> for &'a T. That overlaps with the following:

impl<'a, T> Projectable for &'a Mutex<T> {
    type Inner = T;
}

impl<'a, T, F, U: 'a> Project<F> for &'a Mutex<T>
where
    F: Field<Base = T, Type = Rcu<U>>
{
    type Output = &'a Rcu<U>;

    fn project(self) -> Self::Output {
        todo!()
    }
}
  • removing the blanket impl for all &'a T can't be done, because we need it to be able to project &UnsafeCell and other types with #[projecting].
  • this looks like something specialization might help with, but I am totally out of the loop wrt that, last time there wasn't much movement there, so I think we have to come up with something else.
  • one thing that we could do is the following:
    give containers with #[projecting] the ability to select which fields are passed through, then for Mutex, we could only allow Rcu<U> fields.
    I haven't thought about this for long, maybe this is unsound in our case, so I'm not too inclined to jump on this, but maybe with more thought, it ends up a good idea...

Copy link

@Coder-256 Coder-256 Dec 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps something like this would work?:

pub struct Mutex<T> {
    pub data: MutexData<T>,
    inner_mutex: InnerMutex, // the actual locking primitive
}

#[projecting]
#[repr(transparent)]
pub struct MutexData<T>(UnsafeCell<T>);

// ...

let cfg: &Rcu<Box<BufferConfig>> = buf->data->cfg;

(Maybe using #[projecting] on MutexData isn't quite accurate here; the point is, maybe somehow you could write custom projections for MutexData). Just one idea.

Edit: or maybe more along the lines of:

pub struct Mutex<T> {
    data: UnsafeCell<T>,
    inner_mutex: InnerMutex, // the actual locking primitive
}

#[projecting]
#[repr(transparent)]
pub struct MutexProjection<'a, T>(*const T, PhantomData<&'a UnsafeCell<T>>);

impl<T> Mutex<T> {
    fn get_projection(&self) -> MutexProjection<'_, T> {
        MutexProjection(self.data.get() as *const T, PhantomData)
    }
}

// ...

let cfg: &Rcu<Box<BufferConfig>> = buf.get_projection()->cfg;

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, yeah having a separate reference type for a mutex that can be projected might be possible. But it also is a bit annoying...

(Note that the MutexProjection type that you defined should implement Projectable/Project instead of being #[projecting])

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could do like for Deref: it wouldn't be &T: ProjectMove, it would be &T: ProjectRef, which doesn't conflict with Mutex<T>: ProjectRef.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused, how exactly would that look like? Is ProjectMove the current version of Project and

pub trait ProjectRef<F: Field<Base = Self::Inner>>: Projectable {
    type Output<'a>;
    fn project(&self) -> Self::Output<'_>;
}

and ProjectMut is the same, but with &mut self as the receiver?

What if &Foo implements ProjectMove and Foo implements ProjectRef? What does -> do in that case?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I didn't realize this would require a GAT. I haven't thought this through but I'd imagine when there's conflict we'd have to figure out a rule to infer the right one to use, like Deref vs DerefMut do today.

Copy link
Contributor

@arielb1 arielb1 Dec 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you have a ProjectibleBase trait, then you could do impl<T> !ProjectibleBase for Mutex<T> [and the impl for refs in libcore should be impl<T: ProjectibleBase, F: Field> ProjectiblePointer<F> for &T)

~~Tho it would be annoying for the implementor of Mutex since they'll have to manually do field accesses with

fn inner_mutex(&self) -> &InnerMutex {
    field_of!(Self, inner).project_ref(self)
}
```~~
ed: no, Projectible should only affect `operator->` not `operator.`

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you have a ProjectibleBase trait, then you could do impl<T> !ProjectibleBase for Mutex<T> [and the impl for refs in libcore should be impl<T: ProjectibleBase, F: Field> ProjectiblePointer<F> for &T)

But that requires negative impls :(

~~Tho it would be annoying for the implementor of Mutex since they'll have to manually do field accesses with

fn inner_mutex(&self) -> &InnerMutex {
    field_of!(Self, inner).project_ref(self)
}

~~ ed: no, Projectible should only affect operator-> not operator.

Even if the solution is a bit more cumbersome for the implementer of the container, we would greatly appreciate it. Our own mutex type already has considerable unsafe code and we have the philosophy of improving driver developer ergonomics over our own library.

}
```

## Interactions
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question I didn't see mentioned: what about simultaneous projections? e.g.

#[derive(PinProject)]
struct FairRaceFuture<F1, F2> {
    #[pin]
    fut1: F1,
    #[pin]
    fut2: F2,
    fair: bool,
}
let fut: Pin<&mut FairRaceFuture<_, _>> = ...;
let fut1 = fut->fut1;
let fut2 = fut->fut2;

Most projectable pointers would theoretically allow fut1 and fut2 to coexist. But that seems really hard to allow with user-provided types: fut should clearly not remain fully accessible while fut1 is live, yet passing it to a function like project likely invalidates fut1. We'd need project to take raw pointers, but then we can't have it for custom types.

Is there a design that could accomodate this in some way?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question I didn't see mentioned: what about simultaneous projections? e.g.

That is indeed something that I should mention.

Most projectable pointers would theoretically allow fut1 and fut2 to coexist. But that seems really hard to allow with user-provided types: fut should clearly not remain fully accessible while fut1 is live, yet passing it to a function like project likely invalidates fut1. We'd need project to take raw pointers, but then we can't have it for custom types.

Is there a design that could accomodate this in some way?

In the Design Meeting on Field Projections, I brought this up, but only for the Pin case. There one needs some kind of "automatic reborrowing". But this still only allows consecutive uses of projected fields, not simultaneous usage.

For simultaneous usage, we would have to involve the borrow checker, and let the user choose a behavior (i.e. should it be like &T, &mut T, *mut T. maybe there are more?). That might be a good way to fix this, either through a trait or through macro intrinsics (that depends on how the borrow checker works internally).

borrow_check!(unsafe <'a, T> Pin<&'a mut T> as &'a mut T);
// or
unsafe impl<'a, T> BorrowCheck<&'a mut T> for Pin<&'a mut T> {}

Copy link
Member

@Nadrieril Nadrieril Dec 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if the borrow-checker understands, we need a design that allows for sound behavior: you can't alias a &mut, and reborrowing the whole &mut T would invalidate any existing references, right? E.g.

fn project1(pin: &mut Pin<&mut Struct>) -> Pin<&mut Field1> { ... }
fn project2(pin: &mut Pin<&mut Struct>) -> Pin<&mut Field2> { ... }
let mut pin = ...;
let field1 = project_field1(&mut pin);
// I think the second borrow of `pin` invalidates field1 regardless of what the borrow-checker says
let field2 = project_field2(&mut pin);

Copy link
Author

@y86-dev y86-dev Dec 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is correct, but if we do that with &mut Struct instead, the borrow checker knows what is going on:

let x: &mut Struct = ...;
let field1 = &mut x.field1;
let field2 = &mut x.field2;

We "just" have to make Pin<&mut T> & its projections behave in the same way as &mut T & normal field access for the borrow checker.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed that we could tell the borrow-checker to consider fut->field1 and fut->field2 to be disjoint, but then we'd need the underlying memory model to allow this without UB. The problem we have is the project function call. As far as I know of the current state of discussion on this, passing a &mut Whatever to a function morally counts as an access, which invalidates any sibling references to that Whatever. I'll let the opsem experts clarify though.

This is similar to how we can't borrow two disjoint fields through a DerefMut (Box cheats on this btw).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, that is a problem...

Copy link
Member

@Nadrieril Nadrieril Dec 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe what we need is:

/// SAFETY: An impl is safe if calling `project` several times on the same `self` but disjoint fields and exposing the resulting outputs to safe code is safe.
unsafe pub trait ProjectMut<F: Field<Base = Self::Inner>>: Projectable {
    type Output;
    /// SAFETY: The input must be cast from a safe `&mut Self`, and possibly passed to this `project` function with disjoint `F` fields and nothing else.
    unsafe fn project(*mut self) -> Self::Output;
}

The safety requirements would need fleshing out to be rigorous but you get the idea: the borrow-checker would ensure we don't hold two projected pointers to the same field at the same time.

There'd also be a ProjectShared for the shared case, that allows multiple aliasing shared projections. And ProjectMove would ask the borrow-checker to track which fields have been moved out and to project+drop the non-moved fields at end of scope. This may run into issues similar to those with DerefMove, I don't recall exactly what blocked that one.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I know of the current state of discussion on this, passing a &mut Whatever to a function morally counts as an access, which invalidates any sibling references to that Whatever. I'll let the opsem experts clarify though.

This is correct. We'd want this to be more like a MaybeDangling<&mut T>, and suppress all aliasing-related requirements temporarily, until we have the field and it can become an &mut F.

@ehuss ehuss added the T-lang Relevant to the language team, which will review and decide on the RFC. label Dec 4, 2024
Comment on lines +1268 to +1273
### Field Projection Operator

Current favorite: `$base:expr->$field:ident`.

Alternatives:
- use `~` instead of `->`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for field: T, the expression base->field producing any type other than a place of type T would be very confusing for readers knowing C and C++.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think someone coming from C might find &raw const base->field most familiar while still being in line with the difference between & and &raw const. I'm not sure that's much more ergonomic than &raw const (*base).field though (except that it could be safe which is still a huge improvement).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a lot of things in Rust that are very confusing for readers that know C and C++. I personally don't think this is a huge issue, because for pointer types the operation is very similar to "getting a place" and for types it is just a generalization.

That said, I don't really mind the exact syntax of the operator as long as it's short and conveys the correct meaning. What would be your suggestion for an alternative?
Do you want me to add anything to the RFC (e.g. add this as a drawback)?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I guess adding &raw const really would only be immediately obvious for raw pointers anyway. I agree it should be concise and uniform across containers so I think -> might be the best option. If it's introduced in the context of another container like MaybeUninit or Pin -> shouldn't be an issue.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would personally advocate ~ here, because it feels like it wouldn't have the C++ connotations of ->, and because it is more of a "navigate" operation than a "dereference" operation.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both ~ and -> are fine with me. Just a note, in the previous RFC, it was argued that "[t]he ~ symbol was removed from Rust because it is hard to type on a number of keyboard layouts".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't buy that ~ is removed because of AltGr, the quoted French and German keyboard layouts showed []{}\| all involved AltGr yet we have no problem using any of these 6 symbols. Additionally there is that X: ~const Trait syntax. I think it's more that we don't want to waste the remaining unused symbols.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally there is that X: ~const Trait syntax.

That's placeholder syntax, don't use it as precedent please.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find ~ just looks very odd, so I personally slightly prefer ->. It does something with fields behind pointers, so IMO it's "close enough" to the C/C++ equivalent.

But in the end it's just syntax, I don't care very strongly and find it hard to predict how others will react.

Copy link
Contributor

@tgross35 tgross35 Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would personally advocate ~ here, because it feels like it wouldn't have the C++ connotations of ->, and because it is more of a "navigate" operation than a "dereference" operation.

I don't think the C++ bit holds when ~ is used for bitwise not and destructors, field projection seems more akin to -> member access than either of those :)

(but I agree with Ralf, slight preference for -> but it doesn't really matter)

- make from_pinned_ref `unsafe`
- add safety documentation
`ptr->flags: *mut u32`. Essentially `ptr->field` is a shorthand for `&raw mut (*ptr).field` (for
`*const` the same is true except for the `mut`). However, there is a small difference between the
two: the latter has to be `unsafe`, since `*ptr` requires that `ptr` be dereferencable. But field
projection is a safe operation and thus it uses [`wrapping_add`] under the hood. This is less
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might be better for raw-pointer(-like) field projection to be unsafe (like the "desugared" version), and require the at least the specific offset to be in-bounds (if not the whole container).

In particular, NonNull having field projection be safe would mean that it would have to panic (or something) if the projection would result in null.

It seems like none of the motivating examples would benefit from allowing ptr->field where field is not inbounds? Maybe there are usecases for this but I haven't thought of any where it wouldn't be clearer to explicitly use wrapping_add IMO.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might be better for raw-pointer(-like) field projection to be unsafe (like the "desugared" version), and require the at least the specific offset to be in-bounds (if not the whole container).

In particular, NonNull having field projection be safe would mean that it would have to panic (or something) if the projection would result in null.

Note that wrapping_add is itself a safe function. So the implementation of NonNull::project only contains safe code, making UB impossible. Also, there is no need for a panic, since a projection can never result in null.

It seems like none of the motivating examples would benefit from allowing ptr->field where field is not inbounds? Maybe there are usecases for this but I haven't thought of any where it wouldn't be clearer to explicitly use wrapping_add IMO.

The reason for why I used wrapping_add is that the following code is UB when the -> operation uses add. Making the operation unsafe for raw pointers is probably impossible, since it is backed by the Project trait and the project function can either be unsafe or safe.
Maybe we can split it into two different mutually exclusive traits, but I think it would be better both for consistency and for ergonomic reasons to keep field projections safe.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that wrapping_add is itself a safe function. So the implementation of NonNull::project only contains safe code, making UB impossible. Also, there is no need for a panic, since a projection can never result in null.

NonNull::wrapping_add does not exist, for the reason I mentioned: offsetting a NonNull can result in null.

For a concrete example: What is the behavior of this safe code with your proposed semantics?

use std::{ptr::NonNull, mem::offset_of};
#[repr(C)]
struct Foo {
  x: u8,
  y: u8,
}

let ptr: NonNull<Foo> = NonNull::new(usize::MAX as *mut Foo);
let field: NonNull<u8> = ptr->y; // either this panics or returns the "wrong" pointer, or we have UB somewhere
assert_eq!(ptr.as_ptr().addr(), usize::MAX);
assert_eq!(offset_of!(Foo, y), 1); // Foo is repr(C)
assert_eq!(usize::MAX.wrapping_add(1), 0); // the expected address of (*ptr).y is 0, which is not a valid NonNull
assert_eq!(field.as_ptr().addr(), 0); // NonNull with addr 0?

The reason for why I used wrapping_add is that the following code is UB when the -> operation uses add.

Yes, offsetting outside of an allocation using field projections is what I am saying I think should be UB. Also, what "following code" are you referring to here? The next code snippet after this line of the file is the set_flags_raw example, which unsafely writes to the projected pointer anyway and does not discuss how making the projection safe reduces UB (though it could in some edge cases; I just don't think they are worth it).

Maybe we can split it into two different mutually exclusive traits, but I think it would be better both for consistency and for ergonomic reasons to keep field projections safe.

If we want this operator to be able to be used generically, and not just for concrete pointer types, we could have an UnsafeFoo trait be a supertrait of the Foo trait, such that any Foo is also usable in a where T: UnsafeFoo context (perhaps with a blanket impl<T: ?Sized + Foo> UnsafeFoo for T { ... } so users don't have to implement UnsafeFoo for their Foo types).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that wrapping_add is itself a safe function. So the implementation of NonNull::project only contains safe code, making UB impossible. Also, there is no need for a panic, since a projection can never result in null.

NonNull::wrapping_add does not exist, for the reason I mentioned: offsetting a NonNull can result in null.

Oh, totally overlooked that. Yeah that seems bad...

For a concrete example: What is the behavior of this safe code with your proposed semantics?

Yeah that doesn't really work...

The reason for why I used wrapping_add is that the following code is UB when the -> operation uses add.

Yes, offsetting outside of an allocation using field projections is what I am saying I think should be UB.

Agreed.

Also, what "following code" are you referring to here?

Oops, I meant to add some code, but the point is moot now, since I didn't understand what you meant before.

The next code snippet after this line of the file is the set_flags_raw example, which unsafely writes to the projected pointer anyway and does not discuss how making the projection safe reduces UB (though it could in some edge cases; I just don't think they are worth it).

Maybe we can split it into two different mutually exclusive traits, but I think it would be better both for consistency and for ergonomic reasons to keep field projections safe.

If we want this operator to be able to be used generically, and not just for concrete pointer types, we could have an UnsafeFoo trait be a supertrait of the Foo trait, such that any Foo is also usable in a where T: UnsafeFoo context (perhaps with a blanket impl<T: ?Sized + Foo> UnsafeFoo for T { ... } so users don't have to implement UnsafeFoo for their Foo types).

Yes, we could have:

pub trait UnsafeProject<F>: Projectable
where
    F: Field<Base = Self::Inner>,
{
    type Output;

    unsafe fn project(self) -> Self::Output;
}

impl<P, F> UnsafeProject<F> for P
where
    P: Projectable,
    F: Field<Base = <P as Projectable>::Inner>,
    P: Project<F>,
{
    type Output = <P as Project<F>>::Output;

    unsafe fn project(self) -> Self::Output {
        <P as Project<F>>::project(self)
    }
}

But I don't know if it is possible for the compiler to detect which trait is implemented.

Copy link
Member

@kennytm kennytm Dec 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I don't know if it is possible for the compiler to detect which trait is implemented.

The compiler could just always use Project when available, and fallback to UnsafeProject otherwise.

This is similar how the compiler is able to choose which of Fn/FnMut/FnOnce when desugaring call to an unboxed closure.

But I'm not sure if the compiler also required a hierarchical relationship similar to the Fn traits to make this work.

pub trait UnsafeProject<F>: Projectable
where
    F: Field<Base = Self::Inner>,
{
    type Output;
    unsafe fn project_unchecked(self) -> Self::Output;
}

pub trait Project<F>: UnsafeProject<F>
where
    F: Field<Base = Self::Inner>,
{
    fn project(self) -> Self::Output;
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. We also have to coordinate this with possibly ProjectRef/ProjectMove... Seems like the combinatorial explosion is going to be bad...

@Coder-256
Copy link

What is the implementation of Projectable for Cow<'_, T>?

@y86-dev
Copy link
Author

y86-dev commented Dec 4, 2024

@Coder-256

What is the implementation of Projectable for Cow<'_, T>?

added the implementation to the RFC

@Coder-256
Copy link

Coder-256 commented Dec 4, 2024

I see, I'm still a bit confused. The value in Cow::Owned has type <T as ToOwned>::Owned, which is not always the same as T. For example, this wouldn't work for Cow<str> I think. Maybe you just need a T: Clone bound though?

Edit: Cow<str> is a bad example obviously (it has no fields), but you can also implement ToOwned on a non-Clone struct and choose a custom type for Owned.


Alternatives:
- Introduce a more native syntax on the level of types `$Container:ty->$field:ident` akin to
projecting an expression.
Copy link
Member

@joshtriplett joshtriplett Dec 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 for having native syntax for fields of a type.

Copy link
Author

@y86-dev y86-dev Dec 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a natural syntax would be to lift the expression version to types. So if we go with ~, then we would have both a version for types and for expressions:

struct Foo {
    bar: i32,
}

type Foo_bar_field = Foo~bar;
let x: &UnsafeCell<Foo> = todo!();
let bar: &UnsafeCell<i32> = x~bar;

@y86-dev
Copy link
Author

y86-dev commented Dec 4, 2024

I see, I'm still a bit confused. The value in Cow::Owned has type <T as ToOwned>::Owned, which is not always the same as T. For example, this wouldn't work for Cow<str> I think. Maybe you just need a T: Clone bound though?

Edit: Cow<str> is a bad example obviously (it has no fields), but you can also implement ToOwned on a non-Clone struct and choose a custom type for Owned.

Oh, I totally forgot about ToOwned... I fixed it for now by only providing it if the owned type is the same as the borrowed type.

@joshtriplett
Copy link
Member

joshtriplett commented Dec 4, 2024

Could you please add a section discussing how we can handle types where the projection might have a different type than the original?

For instance, due to the in-memory layout, we can't turn an Arc<StructType> into an Arc<FieldType>, but we could have some kind of ArcRef type that references an external reference count, and projection could work for that type. So at a minimum we should document how we can handle things like that. But I'm also wondering if we can have some kind of associated type that would allow type-changing field projections:

let x: Arc<StructType> = Arc::new(StructType { field: "a string".to_string() });
let y: ArcRef<String> = x~field;

That would be nicer than having to do something like x.to_arc_ref()~field.

@y86-dev
Copy link
Author

y86-dev commented Dec 4, 2024

@joshtriplett I added ArcRef in future possibilities. The current proposal already supports changing the type when projecting, so there is no need to do x.to_arc_ref()->field (I'm continuing to use -> until we change it in the RFC :)

```

Then there are two field types for `Opaque<Foo>`:
- one representing `bar` with `Base = Opaque<Foo>` and `Type = Opaque<i32>`
Copy link
Member

@programmerjake programmerjake Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not just use the field types for Foo directly, so the field type for bar impls Field<Base = Foo, Type = i32>?

maybe change the design like the following, so you don't need a magic #[projecting] attribute:

pub unsafe trait FieldType: Copy + Send + Sync + Default + ... {} // maybe omit

/// Self is what used to be `Field::Base`
pub unsafe trait HasUnalignedField<F: FieldType> { // maybe leave `Has` out of name?
    type Type: ?Sized;

    const OFFSET: usize;
}

pub unsafe trait HasField<F: FieldType>: HasUnalignedField<F> {}

pub trait Projectable: Sized {
    type Inner: ?Sized;
}

// no Self::Inner: HasUnalignedField<F> bound for
// future compatibility with enum variant fields or union fields
pub trait Project<F>: Projectable {
    type Output;

    // pass in `field` value in case rust gains
    // fields that need runtime computation
    // to get the offset (e.g. fields of scalable vector types),
    // or things like C++ member pointers, for now it's always a ZST
    fn project(self, field: F) -> Self::Output;
}

unsafe impl<F: FieldType, T: HasUnalignedField<F, Type: Sized>> HasUnalignedField<F> for Cell<T> {
    type Type = Cell<<T as HasUnalignedField<F>>::Type>;
    const OFFSET: usize = <T as HasUnalignedField<F>>::OFFSET;
}

unsafe impl<F: FieldType, T: HasField<F, Type: Sized>> HasField<F> for Cell<T> {}

this allows implementing HasUnalignedField/HasField multiple times for the same field type, it also could allow container-like types that have other non-ZST fields:

struct MyWrapper<T> {
    other_field: u32,
    field: T,
}

unsafe impl<F: FieldType, T: HasUnalignedField<F, Type: Sized>> HasUnalignedField<F> for MyWrapper<T> {
    type Type = <T as HasUnalignedField<F>>::Type;
    const OFFSET: usize = <T as HasUnalignedField<F>>::OFFSET + offset_of!(Self, field);
}

unsafe impl<F: FieldType, T: HasField<F, Type: Sized>> HasField<F> for MyWrapper<T> {}

it also allows you to have a HashMap where the key is the TypeId of the field type and it works even if you access fields of MyStruct through SomeRef<MyStruct> or SomeOtherType<MyStruct>, since in both cases the field types are the same types -- the field types of MyStruct...

Using something like that HashMap is how I would probably implement field access for fayalite's Expr<MyStruct> if I were redesigning the API to use the projection operator introduced by this RFC.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That looks very interesting!

How does the compiler find the right field type to return via field_of! though? Let's say I have the MyWrapper struct from above (it should probably be #[repr(C)]) as well as Foo:

struct Foo {
    other_field: i32,
    field: u32,
}

fn demo(x: &MyWrapper<Foo>) {
    let field = x->field; // which `field` is meant by this?
    let other_field = x->other_field; // same here
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does the compiler find the right field type to return via field_of! though? Let's say I have the MyWrapper struct from above (it should probably be #[repr(C)]) as well as Foo:

#[repr(C)] is unnecessary since we're using offset_of!. I didn't think of finding the right inner type...maybe we need MyWrapper<T>: Projectable<Inner = T> and the compiler follows the chain of Projectable impls kinda like deref?

struct Foo {
    other_field: i32,
    field: u32,
}

fn demo(x: &MyWrapper<Foo>) {
    let field = x->field; // which `field` is meant by this?
    let other_field = x->other_field; // same here
}

assuming MyWrapper<Foo>: Projectable, since MyWrapper opted-in to being projectable, those would resolve to Foo's fields. alternatively they could be ambiguous.

Copy link
Author

@y86-dev y86-dev Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But x->field currently would desugar as:

Project::<field_of!(<typeof(x) as Projectable>::Inner, field)>::project(x, /* I don't know how we would get the value here, but at the moment its a ZST, so it doesn't matter */)

So in my example, since typeof(x) == &MyWrapper<Foo>, we would get the expression to be:

Project::<field_of!(MyWrapper<Foo>, field)>::project(x, /* ... */)

At least that's my reason for creating the Projectable trait. We could have a different trait (or an attribute) that would disable the creation of field types for a certain type (or maybe for individual fields?).

I think it is really important that the field projection expression x->field has at most one field candidate.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm proposing the desugaring would change so a->b desugars to something like:

type Base<...> = {
    let mut base: type = typeof(a);

    // only true if the impl is visible, it can be hidden due to generics or impl Trait
    while base: Projectable {
        base = base::Inner;
    }
    base
};
a.project(field_of!(Base, b) {})

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or maybe a better option is following deref coercion semantics where it stops at the first type with that field where the field is visible (due to pub).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds reasonable, though might lead to people having to do some visibility gymnastics:

mod my_wrapper_def {
    pub struct MyWrapper<T> {
        other_field: u32,
        field: T,
    }
    /* inherit all fields of `field: T`... */
}
pub use my_wrapper_def::MyWrapper;
pub mod foo_def {
    use super::*;
    pub struct Foo {
        bar: i32,
        baz: u32
    }

    impl MyWrapper<Foo> {
        pub fn set_bar(this: &mut MaybeUninit<Wrapper<Foo>>, value: i32) {
            let bar: &mut MaybeUninit<i32> = this->bar;
            bar.write(value);
        }
    }
}

Which probably is rather annoying... So while we should use your suggested algorithm, I think it's probably still very useful to not generate field types for certain fields via an attribute:

pub struct MyWrapper<T> {
    #[no_field_type_bikeshed]
    other_field: u32,
    #[no_field_type_bikeshed]
    field: T,
}
/* inherit all fields of `field: T`... */
pub struct Foo {
    bar: i32,
    baz: u32,
}

impl MyWrapper<Foo> {
    pub fn set_bar(this: &mut MaybeUninit<Wrapper<Foo>>, value: i32) {
        this->bar.write(value);
    }
}

Comment on lines +801 to +802
first: self.first.project(),
second: self.second.project(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this work, how does it know which field to project them to?
Is this Project::<F>::project(self.first)? (Not sure if there's a nicer way to fix the F here. Maybe self.first->F should work. ;)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expect this works the same way you can just call self.my_field.fmt(f) in a Debug impl even though that's not the only trait with a fmt function that may be in scope -- it's the only impl in the where bounds for the generic arguments.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly, I was just hoping that the type inference would be good enough to achieve this, I didn't test it though.

@RalfJung
Copy link
Member

RalfJung commented Dec 6, 2024

I didn't expect a solution for Cell projections, Pin projections, and improved raw ptr ergonomics in a single RFC. This is a nice early Christmas gift. :)

Comment on lines +918 to +924
- [field types],
- `Field` traits,
- `field_of!` macro,
- [`#[projecting]`](#projecting-attribute) attribute
- projection operator `->`,
- `Project` trait,
- `Projectable` trait,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to figure out what will eventually need to be stable so field projection can be used on library types, vs. what will need to be stable to make custom types field projectable. Is this list accurate?

  • With access to ->, a user can make use of basic field projection on library types and pointers
  • With access to #[projecting] and/or the ability to implement Project and Projectable, a user can make their own types field projectable
  • With access to the Field trait members (but not necessarily ability to implement) and field_of!, a user can express field projection in function signatures
  • Field and UnalignedField may never be manually implemented, they are compiler-generated for any field that is accessed by field_of!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to figure out what will eventually need to be stable so field projection can be used on library types, vs. what will need to be stable to make custom types field projectable. Is this list accurate?

It is, but one can add additional caveats/information:

  • With access to ->, a user can make use of basic field projection on library types and pointers

-> also gives access to projections for custom types (if we depend on another crate using unstable features to implement custom projections).

  • With access to #[projecting] and/or the ability to implement Project and Projectable, a user can make their own types field projectable

Yes, but often one needs the information contained in the Field trait members, since how else can one project to a field? (of course if the stdlib already provides the projection for raw pointers and one has a raw pointer then it's fine)

  • With access to the Field trait members (but not necessarily ability to implement) and field_of!, a user can express field projection in function signatures

One only needs the Field trait to express field projection in function signatures. Of course to call them, one needs the field_of! macro (or the syntax).

  • Field and UnalignedField may never be manually implemented, they are compiler-generated for any field that is accessed by field_of!

Yes, that is the current idea, but for some reasons we might want to lift that restriction, see Field trait stability

Copy link
Member

@programmerjake programmerjake Dec 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think an enum variant field implementing something other than Field would work well. (unless the enum has only 1 variant, in which case it's effectively both a struct and an enum.)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand correctly, what are you replying to in this thread? Do you mean as a way to implement enum projections?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, i meant enum projections and that for those, trait Field isn't used so at least in this aspect there isn't a problem with users implementing Field since it won't need to change for enums.

@tmccombs
Copy link

tmccombs commented Dec 7, 2024

Would it make sense to implement Project for types like Option and maybe Result with something like

impl<T: Projectsble> Projectable for Option<T> {
    type Inner =  <T as Projectable>::Inner;
}

impl<T, F> Project<F> for Option< T>
where
    F: Field<Base = Self::Inner>,
    T: Project<F>,
{
    type Output = Option<F::Type>;

    fn project(self) -> Self::Output {
        match self {
          Some(v) => Some(v.project())
          None => None
    }
}

Unfortunately, I don't see how to implement it soundly for all T, since doing so would mean moving the field, so you would need an Option of some reference type.

It could also be implemented for all T if F::Type: Copy, but you can't have both, because of coherence, and I think having it for options of reference types is more useful.

@kennytm
Copy link
Member

kennytm commented Dec 7, 2024

@tmccombs this has the same issue as applying Projectable on Cow<T>, which is that Option<T> is a container of T rather than a pointer to T, meaning accessing its field would be necessarily destructive. Supporting these requires "MoveableField" discussed at the end of the RFC.

If Projectable does not need to be a pointer, the distinction between it and #[projecting] would be blurred, and it raises the question again that why not implement Projectable for #[projecting] containers.

@tmccombs
Copy link

tmccombs commented Dec 7, 2024

accessing its field would be necessarily destructive.

Not if it is projecting through another pointer type, so for some struct Foo, Moveable field would be necessary for Option<Foo> but not for, say Option<&Foo>.

And even if there is a MoveableField, then using that would mean you can't project an Option<&T> for a non-moveable field.

But perhaps it would be more useful if you could project on an &Option or &mut Option. Unfortunately, implementing Project on those would conflict with the impl for &T and &mut T, but perhaps there is a way that #[projecting] could work on enums? I think that might be what the generalized enum projection section is talking about ?

@y86-dev
Copy link
Author

y86-dev commented Dec 7, 2024

accessing its field would be necessarily destructive.

Not if it is projecting through another pointer type, so for some struct Foo, Moveable field would be necessary for Option<Foo> but not for, say Option<&Foo>.

Yes, for Option<P> where P is itself a pointer type, this works. I actually think your implementation from above makes sense:

impl<T: Projectsble> Projectable for Option<T> {
    type Inner =  <T as Projectable>::Inner;
}

impl<T, F> Project<F> for Option< T>
where
    F: Field<Base = Self::Inner>,
    T: Project<F>,
{
    type Output = Option<F::Type>;

    fn project(self) -> Self::Output {
        match self {
          Some(v) => Some(v.project())
          None => None
    }
}

Project is implemented for pointer-like types, so adding this impl would allow projecting all sorts of things (for example Option<&mut MaybeUninit<T>> or even multiple nested stuff Option<&mut UnsafeCell<MaybeUninit<T>>).

And even if there is a MoveableField, then using that would mean you can't project an Option<&T> for a non-moveable field.

But perhaps it would be more useful if you could project on an &Option or &mut Option. Unfortunately, implementing Project on those would conflict with the impl for &T and &mut T, but perhaps there is a way that #[projecting] could work on enums? I think that might be what the generalized enum projection section is talking about ?

I think that projecting through &Option<T> is not possible, since the memory layout of Option<T> is not the same as T, thus a value of type Option<T> does not "contain" a field of type Option<F> if F is a field of T.

The section on enum projection is about enabling to use enums instead of structs: with the current proposal, one can project Pin<&mut T> to Pin<&mut F> if F is a structurally pinned field of T, but if T is instead an enum with a single variant and a single structurally pinned field, the same is not possible. The section on enum projections is about allowing that. Not about enabling projections for container types that enums, since for them, adding #[projecting] is impossible for the above reason and Project can be implemented for any type, so they are already supported.

@Nadrieril

This comment was marked as off-topic.

@kennytm

This comment was marked as off-topic.

Comment on lines +900 to +908
## Impact of this Feature

Overall this feature improves readability of code, because it replaces more complex to parse syntax
with simpler syntax:
- `&raw mut (*ptr).foo` is turned into `ptr->foo`
- using `NonNull<T>` as a replacement for `*mut T` becomes a lot better when accessing fields,

There is of course a cost associated with introducing a new operator along with the concept of field
projections.
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One problem that I find very difficult to address is that of how hard this feature is to understand. Since I have been thinking about this feature a lot (and seeing its uses out in the wild for even longer) I feel like I have lost the ability to accurately gauge its difficulty. I also have no idea of any particular hurdles that people come across when learning about field projections for the first time.

This also brings me to the question of how we teach this. I tried to write the guide-level explanation in a rust-book chapter-style, but I don't know if it even reaches that standard, or if we even would like to talk about field projections there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems simple enough to me, then again I'm already familiar with C++ member pointers, so I probably won't be much help on teachability aspects...

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the same concept as the c++ member pointers?

# Summary
[summary]: #summary

Field projections are a very general concept. In simple terms, it is a new operator that turns a
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit of a nit: in the context of this RFC the field projection operation is only applicable to reference-like containers. For example consider the case of Arc<T>: suppose I have some code that wants an Arc<F>, and then I refactor things such that I now have an Arc<T> instead, where T has a field of type F; this proposal does not cover the case of owned value projection - I cannot obtain an Arc<F> from my Arc<T>. Right?

There's some prior art for this kind of projection: https://docs.rs/yoke/latest/yoke/.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit of a nit: in the context of this RFC the field projection operation is only applicable to reference-like containers. For example consider the case of Arc<T>: suppose I have some code that wants an Arc<F>, and then I refactor things such that I now have an Arc<T> instead, where T has a field of type F; this proposal does not cover the case of owned value projection - I cannot obtain an Arc<F> from my Arc<T>. Right?

Nothing prevents us from implementing Project<F> for Arc<T>, but I think it'd be a footgun, since you have to construct an entirely new Arc<F::Type>. Since an Arc<T> is essentially just a NonNull<ArcInner<T>>, we can just look at the layout of that:

#[repr(C)]
struct Foo {
    a: usize,
    b: usize,
}

// layout of `Foo`:
//
// |a-------|b-------|

// layout of `ArcInner<Foo>`
//
// |strong--|weak----|data.a--|data.b--|

// layout of `ArcInner<usize>`
//
// |strong--|weak----|value---|

The problem is now that we have to create a new ArcInner<usize> on the heap, copy the value from Foo and then drop the old Arc. There are two reasons:

  • if the field is not the first one, we have to move it, since the ArcInner expects the value directly after the weak refcount
  • in order for Arc to reliably free the memory it has allocated, it needs to know the size of the data. If we were to just transmute NonNull<ArcInner<Foo>> to NonNull<ArcInner<usize>> when projecting to the field a, then the free function of the allocator could be called with the wrong layout.

Overall, I feel like this is a rather complex operation (and I haven't really used it in code). So it shouldn't be hidden away behind an operator, instead a method taking a generic parameter that is a field type would be a much better fit in my opinion.

Do you think that the RFC is unclear w.r.t. this? If yes, then I'll gladly add the above to it.

There's some prior art for this kind of projection: https://docs.rs/yoke/latest/yoke/.

Oh yeah, I forgot about that crate! But I think that it also can only provide projections for the yoke type. So if you have Yoke<Bar<'static>, Rc<[u8]>> then you can project to its fields (considering the example from the docs), but the cart type Rc<[u8]> will be unaffected.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the thorough answer! It all makes sense to me. It might be good to include a section explaining why such transmutation is not possible in the general case. Otherwise the phrasing implies too much generality.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh one thing that I forgot to say was that if we change how Arc is implemented, then the projection you suggested above is possible (so if we have two pointers, one to the refcounters and another to the value). The RFC already talks about this in the ArcRef section.

But I would argue that Arc<T>/ArcRef<T> is a "reference-like container" type (taking your wording here, I would rather use the term "pointer-like type"), since it essentially is just a NonNull<ArcInner<T>>.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, makes sense - I don't think folks will be overly keen on changing how Arc is implemented since it increases the allocation cost.

Arc<T> is a "reference-like container" but its reference is to an ArcInner<T> which is not reference-like because it holds both the count and the allocation in-line. My suggested wording is probably not enough to clarify the point, but I still think the wording would benefit from some clarification, specifically that intrusive containers (e.g. ArcInner<T> - again please use better language if you have it) aren't appropriate for field projection.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.