From 8d2aaacebe0d1cf4b66a1bec11e7077ec04e31ec Mon Sep 17 00:00:00 2001 From: y86-dev Date: Wed, 4 Dec 2024 11:06:45 +0100 Subject: [PATCH] add `field_projection` v2 rfc --- text/3735-field-projections.md | 1344 ++++++++++++++++++++++++++++++++ 1 file changed, 1344 insertions(+) create mode 100644 text/3735-field-projections.md diff --git a/text/3735-field-projections.md b/text/3735-field-projections.md new file mode 100644 index 00000000000..9f308711749 --- /dev/null +++ b/text/3735-field-projections.md @@ -0,0 +1,1344 @@ +- Feature Name: `field_projection` +- Start Date: 2024-10-24 +- RFC PR: [rust-lang/rfcs#3735](https://github.com/rust-lang/rfcs/pull/3735) +- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) + +# Summary +[summary]: #summary + +Field projections are a very general concept. In simple terms, a field projection is an operation, +written with a new operator, that turns a generic container type `C<T>` containing a struct `T` into +a container `C<F>` where `F` is a field of the struct `T`. For example, given the struct: + +```rust +struct Foo { + bar: i32, +} +``` + +One can project from `&mut MaybeUninit<Foo>` to `&mut MaybeUninit<i32>` by using the new field +projection operator: + +```rust +impl Foo { + fn initialize(this: &mut MaybeUninit<Foo>) { + let bar: &mut MaybeUninit<i32> = this->bar; + bar.write(42); + } +} +``` + +Special cases of field projections are [pin projections] and projecting raw pointers to fields, e.g. +`*mut Foo` to `*mut i32`, with improved syntax over `&raw mut (*ptr).bar`. + +# Motivation +[motivation]: #motivation + +Field projections are a unifying solution to several problems: +- [pin projections], +- ergonomic pointer-to-field access operations for pointer-types (`*const T`, `&mut MaybeUninit<T>`, + `NonNull<T>`, `&UnsafeCell<T>`, etc.), +- projecting custom references and container types. + +[Pin projections] have been a constant pain point and this feature solves them elegantly while at +the same time solving a much broader problem space. 
For example, field projections enable the +ergonomic use of `NonNull<T>` over `*mut T` for accessing fields. + +In the following sections, we will cover the basic usage first. Then we will go over the most +complex version, which is required for [pin projections] and also allows custom projections such +as the abstraction for RCU from the Rust for Linux project (also given below). + +[pin projections]: https://doc.rust-lang.org/std/pin/index.html#projections-and-structural-pinning +[Pin projections]: https://doc.rust-lang.org/std/pin/index.html#projections-and-structural-pinning + +## Ergonomic Pointer-to-Field Operations + +We will use the struct from the summary as a simple example: + +```rust +struct Foo { + bar: i32, +} +``` + +References and raw pointers already possess pointer-to-field operations. Given a variable `foo: &Foo` +one can write `&foo.bar` to obtain a `&i32` pointing to the field `bar` of `Foo`. The same can be +done for `foo: *const Foo`: `&raw const (*foo).bar` (although this operation is `unsafe`) and for +their mutable versions. + +However, the other pointer-like types such as `NonNull<T>`, `&mut MaybeUninit<T>` and +`&UnsafeCell<T>` don't natively support this operation. Of course one can write: + +```rust +unsafe fn project(foo: NonNull<Foo>) -> NonNull<i32> { + let foo = foo.as_ptr(); + // SAFETY: the caller guarantees that `foo` is dereferenceable; the address of a field of a + // valid struct is never null. + unsafe { NonNull::new_unchecked(&raw mut (*foo).bar) } +} +``` + +But this is very annoying to use in practice, since the code depends on the name of the field and +thus cannot be written as a single generic function. For this reason, many people use raw +pointers even though `NonNull<T>` would be more fitting. The same can be said about `&mut +MaybeUninit<T>`. 
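Until such an operator exists, the per-field boilerplate can at best be hidden behind a macro. The following is a minimal sketch on stable Rust, not part of this proposal: `project_non_null!` and `write_bar` are hypothetical names, and `addr_of_mut!` is used as the macro form of `&raw mut`. Note that the projection stays `unsafe`, because the macro has to dereference the raw pointer in order to name the field.

```rust
use std::ptr::{addr_of_mut, NonNull};

struct Foo {
    bar: i32,
    baz: u8,
}

/// Hypothetical helper (an assumption of this sketch, not an API from the RFC):
/// projects a `NonNull<Struct>` to a `NonNull<FieldType>`.
/// Must be expanded inside an `unsafe` block; the pointer must be dereferenceable.
macro_rules! project_non_null {
    ($ptr:expr, $field:ident) => {{
        let base = NonNull::as_ptr($ptr);
        // The address of a field inside a live allocation is never null.
        NonNull::new_unchecked(addr_of_mut!((*base).$field))
    }};
}

/// Writes `value` to `foo.bar`, going through `NonNull` the whole way.
fn write_bar(foo: &mut Foo, value: i32) {
    let ptr: NonNull<Foo> = NonNull::from(foo);
    // SAFETY: `ptr` was just created from a live `&mut Foo`, so it is
    // dereferenceable and valid for writes for the duration of this call.
    unsafe {
        let bar: NonNull<i32> = project_non_null!(ptr, bar);
        bar.as_ptr().write(value);
    }
}
```

Calling `write_bar(&mut foo, 42)` changes only `bar`. A macro is needed here because a plain generic function has no way to take the field name as a parameter, which is the gap that field types are meant to close.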
+ +There are a lot of types that can benefit from this operation: +- `NonNull<T>` +- `*const T`, `*mut T` +- `&T`, `&mut T` +- `&Cell<T>`, `&UnsafeCell<T>` +- `&mut MaybeUninit<T>`, `*mut MaybeUninit<T>` +- `cell::Ref<'_, T>`, `cell::RefMut<'_, T>` +- `Cow<'_, T>` + +## Pin Projections + +The examples from the previous section are very simple, since they all follow the pattern `C<T> -> +C<F>` where `C` is the respective generic container type and `F` is a field of `T`. + +In order to handle `Pin<&mut T>`, the return type of the field projection operator needs to depend +on the field itself. This is needed in order to be able to project structurally pinned fields from +`Pin<&mut T>` to `Pin<&mut F1>` while simultaneously projecting fields that are not structurally +pinned from `Pin<&mut T>` to `&mut F2`. + +Fields marked with `#[pin]` are structurally pinned fields. For example, consider the following +future: + +```rust +struct FairRaceFuture<F1, F2> { + #[pin] + fut1: F1, + #[pin] + fut2: F2, + fair: bool, +} +``` + +One can utilize the following projections when given `fut: Pin<&mut FairRaceFuture<F1, F2>>`: +- `fut->fut1: Pin<&mut F1>` +- `fut->fut2: Pin<&mut F2>` +- `fut->fair: &mut bool` + +Using these, one can concisely implement `Future` for `FairRaceFuture`: + +```rust +impl<F1: Future, F2: Future<Output = F1::Output>> Future for FairRaceFuture<F1, F2> { + type Output = F1::Output; + + fn poll(self: Pin<&mut Self>, ctx: &mut Context) -> Poll<Self::Output> { + let fair: &mut bool = self->fair; + *fair = !*fair; + if *fair { + // self->fut1: Pin<&mut F1> + match self->fut1.poll(ctx) { + Poll::Pending => self->fut2.poll(ctx), + Poll::Ready(res) => Poll::Ready(res), + } + } else { + // self->fut2: Pin<&mut F2> + match self->fut2.poll(ctx) { + Poll::Pending => self->fut1.poll(ctx), + Poll::Ready(res) => Poll::Ready(res), + } + } + } +} +``` + +Without field projection, one would either have to use `unsafe` or reach for a third party library +like [`pin-project`] or [`pin-project-lite`] and then use the provided `project` function. 
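For comparison, here is a sketch of the hand-written `unsafe` route on stable Rust. The `project` method and the `poll_once` helper are illustrative names chosen for this sketch, not APIs proposed by this RFC; the safety argument in the comment is one that the author of such code has to uphold manually.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct FairRaceFuture<F1, F2> {
    fut1: F1,
    fut2: F2,
    fair: bool,
}

impl<F1, F2> FairRaceFuture<F1, F2> {
    /// Hand-written projection returning all three fields at once.
    fn project(self: Pin<&mut Self>) -> (Pin<&mut F1>, Pin<&mut F2>, &mut bool) {
        // SAFETY: `fut1` and `fut2` are treated as structurally pinned: this type
        // never moves them out of a pinned `Self`, has no `Drop` impl and is not
        // `repr(packed)`. `fair` is never considered pinned.
        unsafe {
            let this = self.get_unchecked_mut();
            (
                Pin::new_unchecked(&mut this.fut1),
                Pin::new_unchecked(&mut this.fut2),
                &mut this.fair,
            )
        }
    }
}

impl<F1, F2> Future for FairRaceFuture<F1, F2>
where
    F1: Future,
    F2: Future<Output = F1::Output>,
{
    type Output = F1::Output;

    fn poll(self: Pin<&mut Self>, ctx: &mut Context<'_>) -> Poll<Self::Output> {
        let (fut1, fut2, fair) = self.project();
        // Alternate which future gets polled first.
        *fair = !*fair;
        if *fair {
            match fut1.poll(ctx) {
                Poll::Pending => fut2.poll(ctx),
                ready => ready,
            }
        } else {
            match fut2.poll(ctx) {
                Poll::Pending => fut1.poll(ctx),
                ready => ready,
            }
        }
    }
}

/// Illustrative helper: polls `fut` exactly once with a waker that does nothing.
fn poll_once<F: Future>(fut: F) -> Poll<F::Output> {
    struct NoopWake;
    impl Wake for NoopWake {
        fn wake(self: Arc<Self>) {}
    }
    let waker = Waker::from(Arc::new(NoopWake));
    let mut ctx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    fut.as_mut().poll(&mut ctx)
}
```

The `unsafe` block is the part that field projection (or a crate such as `pin-project`) is meant to take off the user's hands.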
+ +[`pin-project`]: https://crates.io/crates/pin-project +[`pin-project-lite`]: https://crates.io/crates/pin-project-lite + +## Custom Projections + +This proposal also aims to allow custom field projections. For example a custom pointer type for +"always valid pointers" i.e. mutable references that are allowed to alias and that have no +guarantees with respect to race conditions. Those would be rather annoying to use without field +projection, since one would always have to convert them into raw pointers to go to a field. + +In this section, three examples are presented of custom field projections in the Rust for Linux +project. The first is volatile memory, a pointer that ensures only volatile access to the pointee. +The second is untrusted data, requiring validation before the data can be used for logic. And the +last example is a sketch for a safe abstraction (an API that provides only safe functions to use the +underlying feature) of RCU. It probably requires field projection in order to be able to provide +such a safe abstraction. Also note that this example requires to use field projection as pin +projections, so it is beneficial to read that section first. + +### Rust for Linux Example: Volatile Memory + +In the kernel, sometimes there is the need to solely access memory via volatile operations. Since +combining normal and volatile memory accesses will lead to undefined behavior, a safe abstraction is +required. + +```rust +pub struct VolatileMem { + inner: *mut T, +} + +impl VolatileMem { + pub fn write(&mut self, value: T) + // Required, since we can't drop the value + where T: Copy + { + unsafe { std::ptr::write_volatile(self.inner, value) } + } + + pub fn read(&self) -> T + // Required, since we can't drop the value + where T: Copy + { + unsafe { std::ptr::read_volatile(self.inner) } + } +} +``` + +This design is problematic when `T` is a big struct and one is either only interested in reading a +single field or in modifying a single field. 
+ +```rust +#[derive(Clone, Copy)] +struct Data { + x: i64, + y: u64, + /* imagine lots more fields */ +} + +let data: VolatileMem = /* ... */; +data.write(Data { /* ... */ }); + +// later in the program + +// we only want to change `x`, but have to first read and then write the entire struct. +let d = data.read(); +data.write(Data { x: 42, ..d }); +``` + +This is a big problem, also for correctness, since in some applications of volatile memory, the +value of `data` might change after the read, but before the write. Additionally it is very +inefficient, when the struct is very big. + +Any projection operation would have to be `unsafe`, because the pointer stored in `VolatileMem` is a +raw pointer and there is no way to ensure that the resulting, user-supplied pointer points to a +field of the original value. + +But with custom field projections, one could simply do this instead: + +```rust +data->x.write(42); +``` + +### Rust for Linux Example: Untrusted Data + +In the Linux kernel, data coming from hardware or userspace is untrusted. This means that the data +must be validated before it is used for *logic* inside the kernel. Copying it into userspace is fine +without validation, but indexing some structure requires to first validate the index. + +For the exact details, see the [untrusted data patch +series](https://lore.kernel.org/rust-for-linux/20240925205244.873020-1-benno.lossin@proton.me/). It +introduces the `Untrusted` type used to mark data as untrusted. Kernel developers are supposed to +validate such data before it is used to drive logic within the kernel. Thus this type prevents +reading the data without validating it first. + +One use case of untrusted data will be ioctls. 
They were discussed in version 1 in [this +reply](https://lore.kernel.org/rust-for-linux/ZvU6JQEN_92nPH4k@phenom.ffwll.local/) (slightly +adapted the code): +> Example in pseudo-rust: +> +> ```rust +> struct IoctlParams { +> input: u32, +> ouptut: u32, +> } +> ``` +> +> The thing is that ioctl that use the struct approach like drm does, use the same struct if there's +> both input and output paramterers, and furthermore we are not allowed to overwrite the entire +> struct because that breaks ioctl restarting. So the flow is roughly +> +> ```rust +> let userptr: UserSlice; +> let params: Untrusted; +> +> userptr.read(params)); +> +> // validate params, do something interesting with it params.input +> +> // this is _not_ allowed to overwrite params.input but must leave it +> // unchanged +> +> params.write(|x| { x.output = 42; }); +> +> userptr.write(params); +> ``` +> +> Your current write doesn't allow this case, and I think that's not good enough. The one I propsed +> in private does: +> +> ```rust +> Untrusted::write(&mut self, impl Fn(&mut T)) +> ``` + +Importantly, we would like to only overwrite the `output` field of the `IoctlParams` struct. This is +the exact pattern that field projections can help with, instead of exposing a mutable reference to +the untrusted data via the `write` function, we can have: + +```rust +impl Untrusted { + fn write(&mut self, value: T); +} +``` + +In addition to allowing projections of `&mut Untrusted` to `&mut Untrusted`, thus +allowing to overwrite parts of a struct with field projections. + +### Rust for Linux Example: RCU + +RCU stands for read, copy, update. It is a creative locking mechanism that is very efficient for +data that is seldomly updated, but read very often. Below you can find a small summary of how I +understand it to work. No guarantees that I am 100% correct, if you want to make sure that you have +a correct understanding of how RCU works, please read the sources provided in the next section. 
+ +It requires quite a lot of explaining until I can express why field projection comes up in this +instance. However, in this case (similar to `Pin`) it is (to my knowledge) impossible to write a +safe API without field projections, so they would be invaluable for this use case. + +#### RCU Explained + +For a much more extensive explanation, please see . +Since the first paragraph of the first section is invaluable in understanding RCU, it is quoted here +for the reader's convenience: + +> The basic idea behind RCU is to split updates into “removal” and “reclamation” phases. The removal +> phase removes references to data items within a data structure (possibly by replacing them with +> references to new versions of these data items), and can run concurrently with readers. The reason +> that it is safe to run the removal phase concurrently with readers is the semantics of modern CPUs +> guarantee that readers will see either the old or the new version of the data structure rather +> than a partially updated reference. The reclamation phase does the work of reclaiming (e.g., +> freeing) the data items removed from the data structure during the removal phase. Because +> reclaiming data items can disrupt any readers concurrently referencing those data items, the +> reclamation phase must not start until readers no longer hold references to those data items. + +In C, RCU is used like this: +- the data protected by RCU sits behind a pointer, +- readers must use the [`rcu_read_lock()`](https://docs.kernel.org/RCU/whatisRCU.html#rcu-read-lock) + and [`rcu_read_unlock()`](https://docs.kernel.org/RCU/whatisRCU.html#rcu-read-unlock) functions + when accessing any data protected by RCU, within this critical section, blocking is forbidden. +- read accesses of the pointer must only be done after calling + [`rcu_dereference()`](https://docs.kernel.org/RCU/whatisRCU.html#rcu-dereference). 
+- write accesses of the pointer must be done via [`rcu_assign_pointer(, + )`](https://docs.kernel.org/RCU/whatisRCU.html#rcu-assign-pointer). +- before a writer frees the old value (i.e. it enters into the reclamation phase), they must call + [`synchronize_rcu()`](https://docs.kernel.org/RCU/whatisRCU.html#synchronize-rcu). +- multiple writers **still require** some other kind of locking mechanism. + +`synchronize_rcu()` waits for all existing read-side critical sections to complete. It does not have +to wait for new read-side critical sections that are begun after it has been called. + +The big advantage of RCU is that in certain kernel configurations, (un)locking the RCU read lock is +achieved with absolutely no instructions. + +#### A Safe Abstraction for RCU + +In Rust, we will of course use a guard for the RCU read lock, so we have: + +```rust +mod rcu { + pub struct Guard(/* ... */); + + impl Drop for Guard { /* ... */ } + + pub fn read_lock() -> Guard; +} +``` + +The pointers that are protected by RCU must be specially tagged, so we introduce the `Rcu` type. It +exposes the Rust equivalents of `rcu_dereference` and `rcu_assign_pointer` [^1]: + +[^1]: Note that the requirement of not blocking in a critical RCU section is not expressed in code. + Instead we use an external tool called [`klint`] for that purpose. + +[`klint`]: https://rust-for-linux.com/klint + +```rust +mod rcu { + pub struct Rcu
<P> { + inner: UnsafeCell<P>, + // we require this to opt-out of uniqueness of `&mut`. + // if `UnsafePinned` were available, we would use that instead. + _phantom: PhantomPinned, + } + + impl<P: Deref> Rcu<P> { + pub fn read<'a>(&'a self, _guard: &'a RcuGuard) -> &'a P::Target; + pub fn set(self: Pin<&mut Self>, new: P) -> Old<P>; + } + + pub struct Old<P>(/* ... */); + + impl<P> Drop for Old<P> { + fn drop(&mut self) { + unsafe { bindings::synchronize_rcu() }; + } + } +} +``` + +The `Old` type is responsible for calling `synchronize_rcu` before dropping the old value. + +Note that `set` takes a pinned mutable reference to `Rcu`. This deserves an explanation, since it +might not be obvious why there is pinning involved here. Firstly, we need to take a mutable reference, since +writers still need to be synchronized. Secondly, since there are still concurrent shared references, +we must not allow users to use `mem::swap`, since that would change the value without the required +compiler and CPU barriers in place. + +Now to the crux of the issue and why field projection comes up here: A common use-case of RCU is to +protect data inside of a struct that is itself protected by a lock. Since the data is protected by +RCU, we don't need to hold the lock to read the data. However, locks do not allow access to the +inner value without locking it (that's kind of their whole point...). So we need a way to get to the +`Rcu<P>` without locking the lock. Using field projection, we would allow projections for fields of +type `Rcu<P>` from `&Lock<T>` to `&Rcu<P>
`. + +This way, readers can use field projection and the `Rcu::read` function and writers can continue to +lock the lock and then use `Rcu::set`. + +#### RCU API Usage Examples + +```rust +struct BufferConfig { + flush_sensitivity: u8, +} + +struct Buffer { + // We also require `Rcu` to be pinned, because `&mut Rcu` must not exist (otherwise one could + // call mem::swap). + #[pin] + cfg: Rcu>, + buf: Vec, +} + +struct MyDriver { + // The `Mutex` in the kernel needs to be pinned. + #[pin] + buf: Mutex, +} +``` + +Here the struct that is protected by the lock is `Buffer` and the data that is protected by RCU +inside of this struct is `BufferConfig`. To read the config, we now don't have to lock the lock, +instead we can read it using field projection: + +```rust +impl MyDriver { + fn buffer_config<'a>(&'a self, rcu_guard: &'a RcuGuard) -> &'a BufferConfig { + let buf: &Mutex = &self.buf; + // Here we use the special projections set up for `Mutex` with fields of type `Rcu`. + let cfg: &Rcu> = buf->cfg; + cfg.read(rcu_guard) + } +} +``` + +To set the buffer config, one has to hold the lock: + +```rust +impl MyDriver { + fn set_buffer_config(&self, flush_sensitivity: u8) { + // Our `Mutex` pins the value. + let mut guard: Pin> = self.buf.lock(); + let buf: Pin<&mut Buffer> = guard.as_mut(); + // We can use pin-projections since we marked `cfg` as `#[pin]` + let cfg: Pin<&mut Rcu>> = buf->cfg; + cfg.set(Box::new(BufferConfig { flush_sensitivity })); + // ^^ this returns an `Old>` and runs `synchronize_rcu` on drop. + } +} +``` + +And of course one can still use other fields normally, but now requires field projection, since +`Pin<&mut T>` is involved: + +```rust +impl MyDriver { + fn read_to_buffer(&self, data: &[u8]) -> Result { + let mut buf: Pin> = self.buf.lock(); + // This method allocates, so it must be fallible. + // `buf.as_mut()->buf` again uses the field projection for `Pin` to yield a `&mut Vec`. 
+ buf.as_mut()->buf.extend_from_slice(data) + } +} +``` + +# Guide-level explanation +[guide-level-explanation]: #guide-level-explanation + +## Rust Book Chapter: Field Projections + +When programming in Rust, one often has the need to access only a single field of a struct. In the +usual cases of `&T` or `&mut T`, this is simple. Just use dot syntax and you can create a reference +to a field of the struct `&t.field`. + +However, when one has a different type that "contains" or "points" at a `T`, one has to reach for +*field projections* via the *field projection operator* `->`. In this chapter, we will learn what +field projections are and how to use them for the most common types from the standard library. For +example for pointer-like types and [pin projections]. + +### Simple Uses of Field Projections + +Let's say we have a big struct that doesn't fit onto the stack: + +```rust +struct Data { + flags: u32, + buf: [u8; 1024 * 1024], +} +``` + +We would like to initialize the bytes in `buf` to `0xff` and `flags` should be `0x0f`. We start with +a new function returning memory on the heap: + +```rust +impl Data { + fn new() -> Box { + let mut data = Box::new_uninit(); + { + let data: &mut MaybeUninit = &mut *data; +``` + +Now we can use field projection to turn `&mut MaybeUninit` into `&mut MaybeUninit` that +points to the `flags` field: + +```rust + let flags: &mut MaybeUninit = data->flags; + flags.write(0x0f); +``` + +And to initialize `buf`, we can do the same: + +```rust + let buf: &mut MaybeUninit<[u8]> = data->buf; + let buf: &mut [MaybeUninit] = MaybeUninit::slice_as_bytes_mut(buf); + MaybeUninit::fill(buf, 0xff); + } +``` + +Now we only need to unsafely assert that we initialized everything. + +```rust + unsafe { data.assume_init() } + } +} +``` + +A more general explanation of field projection is that it is an operation that turns a generic +container type `C` containing a struct `T` into a container `C` where `F` is a field of the +struct `T`. 
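For comparison, the same initialization can be written today with raw pointers. This is only a sketch under assumptions: the buffer is shrunk to a hypothetical `BUF_LEN` of 4096 bytes so that it is cheap to run, and `addr_of_mut!` plays the role that the projection operator plays in the text above.

```rust
use std::mem::MaybeUninit;
use std::ptr::addr_of_mut;

/// Assumption of this sketch: a small buffer instead of the 1 MiB one from the text.
const BUF_LEN: usize = 4096;

struct Data {
    flags: u32,
    buf: [u8; BUF_LEN],
}

impl Data {
    fn new() -> Box<Data> {
        let mut data: Box<MaybeUninit<Data>> = Box::new(MaybeUninit::uninit());
        let ptr: *mut Data = data.as_mut_ptr();
        // SAFETY: `ptr` points into a live allocation with the size and alignment
        // of `Data`. `addr_of_mut!` only computes field addresses and never reads
        // the uninitialized memory; the writes stay inside the respective fields.
        unsafe {
            addr_of_mut!((*ptr).flags).write(0x0f);
            addr_of_mut!((*ptr).buf).cast::<u8>().write_bytes(0xff, BUF_LEN);
        }
        // SAFETY: every field of `Data` was initialized above, and
        // `MaybeUninit<Data>` has the same layout as `Data`.
        unsafe { Box::from_raw(Box::into_raw(data).cast::<Data>()) }
    }
}
```

Every field access here needs its own `unsafe` reasoning, whereas the projected `&mut MaybeUninit<u32>` only exposes the safe `write` method.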
+ +#### Raw Pointers + +Similarly to `&mut MaybeUninit`, raw pointers also support projections. Given a raw pointer +`ptr: *mut Data`, one can use field projection to obtain a pointer to a field: +`ptr->flags: *mut u32`. Essentially `ptr->field` is a shorthand for `&raw mut (*ptr).field` (for +`*const` the same is true except for the `mut`). However, there is a small difference between the +two: the latter has to be `unsafe`, since `*ptr` requires that `ptr` be dereferencable. But field +projection is a safe operation and thus it uses [`wrapping_add`] under the hood. This is less +efficient, as it prevents certain optimizations. If that is a problem, either use `&raw [mut] +(*ptr).field` or create a custom pointer type that represents an always dereferencable pointer and +implement field projections using `unsafe`. + +[`wrapping_add`]: https://doc.rust-lang.org/std/primitive.pointer.html#method.wrapping_add + +Another pointer type that supports field projection is `NonNull`. For example, if we had to add a +function that sets the `flags` field given only a `NonNull`, we could do so: + +```rust +impl Data { + unsafe fn set_flags_raw(this: NonNull, flags: u32) { + let ptr: NonNull = this->flags; + unsafe { ptr.write(flags) }; + } +} +``` + +#### `RefCell`'s References + +Even the "exotic" references of `RefCell` i.e. [`cell::Ref<'_, T>`] and [`cell::RefMut<'_, T>`] +are supporting field projection. + +[`cell::Ref<'_, T>`]: https://doc.rust-lang.org/std/cell/struct.Ref.html +[`cell::RefMut<'_, T>`]: https://doc.rust-lang.org/std/cell/struct.RefMut.html + +In this example, we create a buffer that tracks the various operations done to it for debug +purposes. + +```rust +struct Buffer { + stats: RefCell, + buf: VecDeque, +} + +struct Stats { + ops: Vec, + elements_pushed: usize, + elements_popped: usize, +} +``` + +There are three operations, one for pushing a number of elements, one other for popping them and the +last one for peeking at the elements in the buffer. 
+ +```rust +enum Operation { + Push(usize), + Pop(usize), + Peek, +} +``` + +When pushing and popping, we have a mutable reference to the buffer and could just use `Stats` +without the `RefCell`. But in the peek case, we only have a shared reference and still require to +record the statistic. + +Pushing and popping are very simple: + +```rust +impl Buffer { + fn push(&mut self, items: &[T]) + where + T: Clone, + { + let mut stats = self.stats.borrow_mut(); + stats.ops.push(Operation::Push(items.len())); + stats.elements_pushed += items.len(); + self.buf.extend(items.iter().cloned()); + } + + fn pop(&mut self, count: usize) -> Option> { + let count = count.min(self.buf.len()); + let mut stats = self.stats.borrow_mut(); + stats.ops.push(Operation::Pop(count)); + stats.elements_popped += count; + if count == 0 { + return None; + } + let mut res = Box::new_uninit_slice(count); + for i in 0..count { + let Some(val) = self.buf.pop_front() else { + // we took the minimum above. + unreachable!() + }; + res[i].write(val); + } + Some(unsafe { res.assume_init() }) + } +} +``` + +Peeking also is rather easy: + +```rust +impl Buffer { + fn peek(&self) -> Option<&T> { + self.stats.borrow_mut().ops.push(Operation::Peek); + self.buf.front() + } +} +``` + +Now we come to the part where we need field projections. We would like to be able to access the +operation statistics from other code. But because it is wrapped in `RefCell`, we cannot give a +reference out: + +```rust +impl Buffer { + fn stats(&self) -> &Vec { +// error[E0515]: cannot return value referencing temporary value + &self.stats.borrow().ops +// ^-------------------^^^^ +// || +// |temporary value created here +// returns a value referencing data owned by the current function + } +} +``` + +That is because the value returned by `borrow` is placed on the stack and must be kept alive for +bookkeeping purposes of `RefCell` until the borrow ends. 
But using field projection, we can return a +`cell::Ref`: + +```rust +impl Buffer { + fn stats(&self) -> cell::Ref<'_, Vec> { + self.stats.borrow()->ops + } +} +``` + +We could even hide the fact that the stats are implemented using `RefCell` using an opaque type: + +```rust +impl Buffer { + fn stats(&self) -> impl Deref> + '_ { + self.stats.borrow()->ops + } +} +``` + +### Complicated Field Projections + +Field projection is even more powerful than what we have seen until now. The returned type of the +projection operator can even depend on the field itself! + +This enables them to be used for making [pin projections] ergonomic. We will discuss how to use +this way of pin projection in the next section. + +#### Pin Projections + +For this section, you should understand what [pin projections] are. If not, then you can just skip +this section. + +Structurally pinned fields are marked with `#[pin]` using the derive macro `PinProject`. For example +consider a future that alternatingly polls two futures: + +```rust +#[derive(PinProject)] +struct FairRaceFuture { + #[pin] + fut1: F1, + #[pin] + fut2: F2, + fair: bool, +} +``` + +Now, it's possible to project a `fut: Pin<&mut FairRaceFuture>`: +- `fut->fut1: Pin<&mut F1>` +- `fut->fut2: Pin<&mut F2>` +- `fut->fair: &mut bool` + +Using these, one can concisely implement `Future` for `FairRaceFuture` without any `unsafe` code: + +```rust +impl> Future for FairRaceFuture { + type Output = F1::Output; + + fn poll(self: Pin<&mut Self>, ctx: &mut Context) -> Poll { + let fair: &mut bool = self->fair; + *fair = !*fair; + if *fair { + // self->fut1: Pin<&mut F1> + match self->fut1.poll(ctx) { + Poll::Pending => self->fut2.poll(ctx), + Poll::Ready(res) => Poll::Ready(res), + } + } else { + // self->fut2: Pin<&mut F2> + match self->fut2.poll(ctx) { + Poll::Pending => self->fut1.poll(ctx), + Poll::Ready(res) => Poll::Ready(res), + } + } + } +} +``` + +## Impact of this Feature + +Overall this feature improves readability of code, 
because it replaces more complex to parse syntax +with simpler syntax: +- `&raw mut (*ptr).foo` is turned into `ptr->foo` +- using `NonNull` as a replacement for `*mut T` becomes a lot better when accessing fields, + +There is of course a cost associated with introducing a new operator along with the concept of field +projections. + +# Reference-level explanation +[reference-level-explanation]: #reference-level-explanation + +## Implementation Details + +In order to facilitate field projections, several interlinked concepts have to be introduced. These +concepts are: + +- [field types], + - `Field` trait, + - `field_of!` macro, + - [`#[projecting]`](#projecting-attribute) attribute +- projection operator `->`, + - `Project` trait, + - `Projectable` trait, + +To ease understanding, here is a short explanation of the interactions between these concepts. The +following subsections explain them in more detail, so refer to them in cases of ambiguity. The +projection operator `->` is governed by the `Project` trait that has `Projectable` as a super trait. +`Projectable` helps to select the struct whose fields are used for projection. Field types store +information about a field (such as the base struct and the field type) via the `Field` trait and the +`field_of!` macro makes it possible to name the [field types]. Finally, the +[`#[projecting]`](#projecting-attribute) attribute allows `repr(transparent)` structs to be ignored +when looking for fields for projection. + +### Field Types +[field type]: #field-types +[field types]: #field-types + +The compiler generates a compiler-internal type for every field of every struct. These types can +only be named via the `field_of!` macro that has the same syntax as `offset_of!`. Only fields +accessible to the current scope can be projected. These types are called *field types*. 
+ +Field types implement the `Field` trait: + +```rust +pub trait Field { + type Base: ?Sized; + type Type: ?Sized; + + const OFFSET: usize; +} +``` + +In the implementation of this trait, `Base` is set to the struct that the field is part of and +`Type` is set to the type of the field. `OFFSET` is set to the offset in bytes of the field in +the struct (i.e. `OFFSET = offset_of!(Base, ident)`). + +In addition to all fields of all structs, field types are generated for transparent, +[`#[projecting]`](#projecting-attribute) container types as follows: given a transparent, generic type annotated with +[`#[projecting]`](#projecting-attribute) and a struct contained in it: + +```rust +#[projecting] +#[repr(transparent)] +pub struct Container { + inner: T, +} + +struct Foo { + bar: i32, +} +``` + +The type `Container` inherits all fields of `Foo` with `Base` and `Type` adjusted accordingly +(i.e. wrapped by `Container`): + +```rust +fn project(r: &F::Base) -> &F::Type; + +let x: Container; +let _: &Container = project::, bar)>(&x); +``` + +The implementation of the `Field` trait sets the associated types and constant like this: +- `Base = Container` +- `Type = Container` +- `OFFSET = offset_of!(Foo, bar)` + +This can even be done for multiple levels: `Container>` also has a [field type] `bar` +of type `Container>`. Mixing different container types is also possible. + +Annotating a struct with [`#[projecting]`](#projecting-attribute) disables projection via that structs own fields. +Continuing the example from above: + +```rust +struct Bar {} + +// ERROR: `Container` does not have a field `inner`. `Container` is annotated with +// `#[projecting]` and thus the field types it exposes are changed to the wrapped type. `Bar` does +// not have a field `inner`. +type X = field_of!(Container, inner); + +struct Baz { + inner: Foo, +} + +// this refers to the field `inner` of `Baz`. 
+type Y = field_of!(Container<Baz>, inner); +// it has the following implementation of `Field`: +impl Field for Y { + type Base = Container<Baz>; + type Type = Container<Foo>; + + const OFFSET: usize = offset_of!(Baz, inner); +} +``` + +#### `Field` Trait + +The `Field` trait is added to `core::marker` and cannot be implemented manually. + +```rust +pub trait Field { + type Base: ?Sized; + type Type: ?Sized; + + const OFFSET: usize; +} +``` + +The compiler automatically implements it for all [field types]. Users of the trait are allowed to rely +on the associated types and constants in `unsafe` code. So for example this piece of code is sound: + +```rust +fn get_field<F: Field>(base: &F::Base) -> &F::Type +where + // required to be able to `cast` + F::Type: Sized, + F::Base: Sized, +{ + let ptr: *const F::Base = base; + let ptr: *const u8 = ptr.cast::<u8>(); + // SAFETY: `ptr` is derived from a reference and the `Field` trait is guaranteed to contain + // correct values. So `F::OFFSET` is still within the `F::Base` type. + let ptr: *const u8 = unsafe { ptr.add(F::OFFSET) }; + let ptr: *const F::Type = ptr.cast::<F::Type>(); + // SAFETY: The `Field` trait guarantees that at `F::OFFSET` we find a field of type `F::Type`. + unsafe { &*ptr } +} +``` + +#### `field_of!` Macro + +Also added to `core::marker` is the following built-in macro: + +```rust +pub macro field_of($Container:ty, $($fields:expr)+ $(,)?) { + /* built-in macro */ +} +``` + +It has the same syntax as the `offset_of!` macro and returns the respective [field type]. It emits an +error in the following cases: + +```rust +pub mod foo { + pub struct Foo { + bar: u32, + pub baz: i32, + } + + // Error: unknown field `barr` of type `Foo` + type FooBar = field_of!(Foo, barr); +} + +pub mod bar { + // Error: field `bar` of type `Foo` is private + type FooBar = field_of!(Foo, bar); +} +``` + +#### `#[projecting]` Attribute + +The `#[projecting]` attribute can be put on a struct or union declaration. 
It requires that the +type is `#[repr(transparent)]` and that there is a unique field that has non-zero size in general. +It results in fields of the single non-zero sized type being considered fields of the outer type. + +So for example: + +```rust +#[projecting] +#[repr(transparent)] +pub struct Container { + inner: T, +} + +struct Foo { + bar: i32, +} +``` + +Now `Container` has a [field type] associated with `bar` implementing `Field` with: +- `Base = Container` +- `Type = Container` +- `OFFSET = offset_of!(Foo, bar)` + +The same is true for the non-generic case: + +```rust +#[projecting] +#[repr(transparent)] +pub struct Container { + inner: Foo, +} + +struct Foo { + bar: i32, +} +``` + +Here are some error examples: + +```rust +// ERROR: missing `#[repr(transparent)]` +#[projecting] +pub struct Container { + inner: T, +} + +// ERROR: no field to project onto found, the struct has no fields +#[projecting] +#[repr(transparent)] +pub struct Container {} + +// ERROR: no field to project onto found, all fields are always zero-sized +#[projecting] +#[repr(transparent)] +pub struct Container { + phantom1: PhantomData T>, + phantom2: PhantomData, +} +``` + +### Field Projection Operator + +The field projection operator `->` has the following syntax: + +> **Syntax**\ +> _ProjectionExpression_ :\ +>    [_Expression_] `->` _ProjectionMember_ +> +> _ProjectionMember_ :\ +>       [IDENTIFIER] +>    | [TUPLE_INDEX] + +[IDENTIFIER]: https://doc.rust-lang.org/reference/identifiers.html +[_Expression_]: https://doc.rust-lang.org/reference/expressions.html +[TUPLE_INDEX]: https://doc.rust-lang.org/reference/tokens.html#tuple-index + +#### Desugaring + +```rust +struct T { + field: F, +} + +let t: C = /* ... */; +let _ = t->field; + +// becomes + +let _ = Project:: as Projectable>::Inner, field)>::project(t); +``` + +The `C` in the `C as Projectable` comes from a type inference variable over the expression +`t`. 
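To make the `Field` machinery above more concrete, it can be approximated on stable Rust with hand-written marker types. This is a sketch, not the proposed implementation: `FooBar` and `FooBaz` are hypothetical stand-ins for `field_of!(Foo, bar)` and `field_of!(Foo, baz)`, and the trait is marked `unsafe` here only because it is implemented by hand instead of by the compiler.

```rust
use std::mem::offset_of;

/// Stand-in for the proposed `core::marker::Field`.
/// Safety contract (assumed for this sketch): a value of type `Type` lives at
/// byte offset `OFFSET` inside every value of type `Base`.
unsafe trait Field {
    type Base;
    type Type;
    const OFFSET: usize;
}

struct Foo {
    bar: i32,
    baz: u64,
}

/// Hypothetical marker playing the role of `field_of!(Foo, bar)`.
struct FooBar;
/// Hypothetical marker playing the role of `field_of!(Foo, baz)`.
struct FooBaz;

// SAFETY: `Foo::bar` has type `i32` and sits at `offset_of!(Foo, bar)`.
unsafe impl Field for FooBar {
    type Base = Foo;
    type Type = i32;
    const OFFSET: usize = offset_of!(Foo, bar);
}

// SAFETY: `Foo::baz` has type `u64` and sits at `offset_of!(Foo, baz)`.
unsafe impl Field for FooBaz {
    type Base = Foo;
    type Type = u64;
    const OFFSET: usize = offset_of!(Foo, baz);
}

/// One generic function that works for every field, selected by the type `F`.
fn get_field<F: Field>(base: &F::Base) -> &F::Type {
    let ptr: *const F::Base = base;
    let ptr: *const u8 = ptr.cast::<u8>();
    // SAFETY: `ptr` comes from a reference, and the `Field` contract says that
    // `F::OFFSET` is still inside the `F::Base` value.
    let ptr: *const u8 = unsafe { ptr.add(F::OFFSET) };
    let ptr: *const F::Type = ptr.cast::<F::Type>();
    // SAFETY: the `Field` contract says an `F::Type` lives at this address, and
    // it is borrowed for no longer than `base` is.
    unsafe { &*ptr }
}
```

The field is chosen purely at the type level, e.g. `get_field::<FooBaz>(&foo)`, which is the property the projection operator relies on when it desugars `t->field` into a call that is generic over a field type.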
+
+
+#### `Projectable` & `Project` Traits
+
+Added to `core::ops`:
+
+```rust
+pub trait Projectable: Sized {
+    type Inner: ?Sized;
+}
+
+pub trait Project<F>: Projectable
+where
+    F: Field<Base = Self::Inner>,
+{
+    type Output;
+
+    fn project(self) -> Self::Output;
+}
+```
+
+## Stdlib Field Projections
+
+All examples from the guide-level explanation work when the standard library is extended with the
+implementations detailed below.
+
+The following pointer types get an implementation for `Projectable` with `Inner = T`. They support
+projections for any field and perform the obvious offset operation.
+
+- `&mut T`
+- `&T`
+- `*mut T`
+- `*const T`
+- `NonNull<T>`
+- `Cow<'_, T>`
+
+The following types get annotated with [`#[projecting]`](#projecting-attribute):
+
+- `MaybeUninit<T>`
+- `Cell<T>`
+- `UnsafeCell<T>`
+- `SyncUnsafeCell<T>`
+
+### Pin Projections
+
+In order to provide [pin projections], a new derive macro `PinProject` and a trait `PinField` are
+required:
+
+```rust
+pub unsafe trait PinField: Field {
+    type Projected<'a>;
+
+    fn from_pinned_ref<'a>(r: &'a mut Self::Type) -> Self::Projected<'a>;
+}
+```
+
+The trait is `unsafe` because it must only be implemented by the `PinProject` derive macro.
+
+```rust
+#[derive(PinProject)]
+struct FairRaceFuture<F1, F2> {
+    #[pin]
+    fut1: F1,
+    #[pin]
+    fut2: F2,
+    fair: bool,
+}
+```
+
+It expands the above to:
+
+```rust
+struct FairRaceFuture<F1, F2> {
+    fut1: F1,
+    fut2: F2,
+    fair: bool,
+}
+
+unsafe impl<F1, F2> PinField for field_of!(FairRaceFuture<F1, F2>, fut1) {
+    type Projected<'a> = Pin<&'a mut F1>;
+
+    fn from_pinned_ref<'a>(r: &'a mut F1) -> Pin<&'a mut F1> {
+        unsafe { Pin::new_unchecked(r) }
+    }
+}
+
+unsafe impl<F1, F2> PinField for field_of!(FairRaceFuture<F1, F2>, fut2) {
+    type Projected<'a> = Pin<&'a mut F2>;
+
+    fn from_pinned_ref<'a>(r: &'a mut F2) -> Pin<&'a mut F2> {
+        unsafe { Pin::new_unchecked(r) }
+    }
+}
+
+unsafe impl<F1, F2> PinField for field_of!(FairRaceFuture<F1, F2>, fair) {
+    type Projected<'a> = &'a mut bool;
+
+    fn from_pinned_ref<'a>(r: &'a mut bool) -> &'a mut bool {
+        r
+    }
+}
+```
+
+Now the only component that is left is an implementation of `Projectable` and `Project` for
+`Pin<&mut T>`:
+
+```rust
+impl<'a, T: ?Sized> Projectable for Pin<&'a mut T> {
+    type Inner = T;
+}
+
+impl<'a, T, F> Project<F> for Pin<&'a mut T>
+where
+    F: Field<Base = T> + PinField,
+{
+    type Output = <F as PinField>::Projected<'a>;
+
+    fn project(self) -> Self::Output {
+        let r: &mut T = unsafe { Pin::into_inner_unchecked(self) };
+        let r: &mut F::Type = <&mut T as Project::<F>>::project(r);
+        <F as PinField>::from_pinned_ref(r)
+    }
+}
+```
+
+## Interactions
+
+There aren't a lot of interactions with other features.
+
+The projection operator binds very tightly:
+
+```rust
+*ctr->field = *(ctr->field);
+
+&mut ctr->field = &mut (ctr->field);
+
+ctr->field.foo() = (ctr->field).foo();
+
+ctr.foo()->field = (ctr.foo())->field;
+
+ctr->field->bar = (ctr->field)->bar;
+```
+
+# Drawbacks
+[drawbacks]: #drawbacks
+
+- [Pin projections] still require library-level support via a proc macro and a trait solely for
+  [field types].
+
+# Rationale and alternatives
+[rationale-and-alternatives]: #rationale-and-alternatives
+
+This proposal is a lot more general than just improving [pin projections].
It not only covers
+pointer-like types, but also permits all sorts of operations generic over fields.
+
+Not adding this feature will result in the proliferation of `*mut T` over more suitable pointer
+types that better express the invariants of the pointer. The ergonomic cost of
+`unsafe { MyPtr::new_unchecked(&raw mut (*my_ptr.as_ptr()).field) }` is just too great to be useful
+in practice.
+
+While [pin projections] can be addressed via a library or a separate feature, not having them in the
+language takes a toll on projects trying to minimize dependencies. The Rust for Linux project is
+already using pinning extensively, since all locking primitives require it; a library solution will
+never be as ergonomic as a language-level construct. Thus that project would benefit greatly from
+this feature.
+
+Additionally, safe RCU abstractions are likely impossible without field projections, since they
+require being generic over the fields of structs.
+
+Field projections are rather difficult to understand on first contact, especially the instantiation
+as [pin projections]. However, they are a very natural operation, extending the already existing
+features of raw pointers and references. Therefore they are fairly easy to adjust to; and in turn,
+they provide a big increase in readability of the code, expressing the concept of field projection
+concisely. The compiler changes are rather manageable, reusing several already existing systems,
+thus increasing the maintenance burden only slightly if at all.
+
+# Prior art
+[prior-art]: #prior-art
+
+Most importantly, see the [old field projection RFC](http://github.com/rust-lang/rfcs/pull/3318).
+There also was a [pre-old-RFC
+discussion](https://internals.rust-lang.org/t/pre-rfc-field-projection/17383/57) and the
+list of crates in the next section is also from the old RFC.
+
+## Crates
+
+There are several crates implementing projections for different types.
+
+- [pin projections]
+  - [`pin-project`] provides pin projections via a proc macro on the type specifying the
+    structurally pinned fields. At the projection-site the user calls a projection function
+    `.project()` and then receives a type with each field replaced with the respective projected
+    field.
+  - [`cell-project`] provides cell projection via a macro at the projection-site: the user writes
+    `cell_project!($ty, $val.$field)` where `$ty` is the type of `$val`. Internally, it uses unsafe
+    to facilitate the projection.
+  - [`pin-projections`] provides pin projections; it differs from [`pin-project`] by providing
+    explicit projection functions for each field. It also can generate other types of getters for
+    fields. [`pin-project`] seems like a more mature solution.
+- `&[mut] MaybeUninit<T>` projections
+  - [`project-uninit`] provides uninit projections via macros at the projection-site and uses `unsafe`
+    internally.
+- multiple of the above
+  - [`field-project`] provides projection for `Pin<&[mut] T>` and `&[mut] MaybeUninit<T>` via a macro
+    at the projection-site: the user writes `proj!($var.$field)` to project to `$field`.
+  - [`field-projection`] is an experimental crate that implements general field projections via a
+    proc-macro that hashes the name of the field to create unique types for each field that can then
+    implement traits to make different output types for projections.
+
+[`pin-project`]: https://crates.io/crates/pin-project
+[`field-project`]: https://crates.io/crates/field-project
+[`cell-project`]: https://crates.io/crates/cell-project
+[`pin-projections`]: https://crates.io/crates/pin-projections
+[`project-uninit`]: https://crates.io/crates/project-uninit
+[`field-projection`]: https://crates.io/crates/field-projection
+
+## Blog Posts and Discussions
+
+- [Safe Cell field projection in
+  Rust](https://www.abubalay.com/blog/2020/01/05/cell-field-projection)
+- [Field projection for `Rc` and
+  `Arc`](https://internals.rust-lang.org/t/field-projection-for-rc-and-arc/15827)
+- [Generic Field Projection](https://internals.rust-lang.org/t/generic-field-projection/16204)
+
+Blog posts about pin (projections):
+- [Pinned places](https://without.boats/blog/pinned-places/)
+- [Overwrite trait](https://smallcultfollowing.com/babysteps/series/overwrite-trait/)
+
+## Rust and Other Languages
+
+Rust already has a precedent for compiler-generated types. All functions and closures have a unique,
+unnameable type.
+
+C++ supports field projections on `std::shared_ptr`: it consists of two pointers, one
+pointing to the reference count and the other to the data. This makes it possible to project down to a
+field while still holding a reference count on the entire struct, which also keeps the field alive.
+
+# Unresolved questions
+[unresolved-questions]: #unresolved-questions
+
+## Syntax Bikeshedding
+
+What is the right syntax for the various operations given in this RFC?
+
+Ideally we would have a strong opinion when this feature is implemented. But the decision should
+only be finalized when stabilizing the feature.
+
+### Field Projection Operator
+
+Current favorite: `$base:expr->$field:ident`.
+
+Alternatives:
+- use `~` instead of `->`
+
+### Naming Field Types
+
+Current favorite: `field_of!($Container:ty, $($fields:expr)+ $(,)?)` macro with `offset_of!` syntax.
+
+Alternatives:
+- Introduce a more native syntax on the level of types `$Container:ty->$field:ident` akin to
+  projecting an expression.
+
+### Declaring a Transparent Container Type
+
+Current favorite: [`#[projecting]`](#projecting-attribute) attribute.
+
+Alternatives:
+- extend the `repr` attribute with
+  - `projecting`
+  - `flatten`
+- use `#[flatten]` instead.
+
+## Other
+
+- should the [`#[projecting]`](#projecting-attribute) attribute have an associated field attribute
+  to mark the field that is projected onto?
+
+# Future possibilities
+[future-possibilities]: #future-possibilities
+
+## Enums
+
+Enums are difficult to support with the same framework as structs. The problem is that many
+containers don't provide sufficient guarantees to read the discriminant (for example raw pointers
+and `&mut MaybeUninit<T>`). However, for types that do provide sufficient guarantees, one could cook
+up a similar feature. Let's call them *enum projections*. They could work like this: projecting is
+done via a new kind of match operator:
+
+```rust
+enum MyEnum {
+    A(i32, String),
+    B(#[pin] F),
+}
+type F = impl Future;
+let x: Pin<&mut MyEnum>;
+match_proj x {
+    MyEnum::A(n, s) => {
+        let _: &mut i32 = n;
+        let _: &mut String = s;
+    }
+    MyEnum::B(fut) => {
+        let _: Pin<&mut F> = fut;
+    }
+}
+```
+
+Here `match_proj` would need to be a new keyword. I dislike the name and syntax, but haven't come up
+with something better.
+
+## Arrays
+
+Arrays can be thought of as structs/tuples where each index is a field. Supporting them would simply
+follow tuples. They might need additional syntax or just use the tuple syntax.
+
+## Unions
+
+Since field access for unions is `unsafe`, projection would also have to be `unsafe`. Since unions
+are rarely used directly, this probably isn't important.
+
+## More Stdlib Additions
+
+Types that might be good candidates for [`#[projecting]`](#projecting-attribute):
+
+- `ManuallyDrop<T>`