Layout and behavior of enums with uninhabited fields in some variants #443

Open
RalfJung opened this issue Aug 8, 2023 · 5 comments

@RalfJung
Member

RalfJung commented Aug 8, 2023

This is about enums like the following:

enum E1 { A, B(!) }
enum E2 { A, B(!), C(i32, !) }

The first interesting point is that E1 does not get any space assigned for storing a discriminant: it has size 0. The algorithm that assigns discriminants skips B entirely. In Miri we had to make SetDiscriminant raise UB early if the requested variant is uninhabited, since otherwise we get ICEs later.

However, E2 has size 8, even though there is only a single valid variant and that variant has size 0. I think we currently always provide storage for all fields of all variants, even if they are uninhabited. This is useful because in theory it lets us compile E2::C(f(), panic!()) into something that does in-place initialization:

let val: E2;
val.C.0 = f();
panic!();
SetDiscriminant(val, C); // this would be UB but we don't get here

(AFAIK we currently don't actually do that, we introduce temporaries instead.)

For structs we have decided that we will always have storage for all fields, and this is unavoidable since safe code can partially initialize a struct. However, safe code cannot partially initialize an enum, so we could soundly decide that E2 has size 0. We have to decide between smaller enums and in-place initialization for arbitrary enums. (We can of course have small enums and then have an analysis that uses in-place initialization where possible -- but in generic MIR, it might not be possible to tell whether the variant we are about to initialize is inhabited.)
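The sizes mentioned above can be checked with a small program. This is only a sketch: it uses an empty enum named `Void` (a name introduced here, not taken from the issue) as a stand-in for `!` so that it builds on stable Rust, on the assumption that the layout algorithm treats both uninhabited types the same way. The layout of default-repr enums is unspecified, so the printed numbers may change between compiler versions; the issue reports 0 for E1 and 8 for E2.

```rust
use std::mem::size_of;

// Hypothetical stand-in for `!`: an empty enum is also uninhabited and
// zero-sized. Assumption: it is laid out like `!` when used as a field.
#[allow(dead_code)]
enum Void {}

// Same shape as E1 from the issue: B can never be constructed.
#[allow(dead_code)]
enum E1 {
    A,
    B(Void),
}

// Same shape as E2 from the issue: C is uninhabited but carries an i32.
#[allow(dead_code)]
enum E2 {
    A,
    B(Void),
    C(i32, Void),
}

fn main() {
    // These values are not guaranteed by the language; they only show
    // what the compiler in use happens to pick.
    println!("size_of::<E1>() = {}", size_of::<E1>());
    println!("size_of::<E2>() = {}", size_of::<E2>());
}
```

If E2 prints a nonzero size, the compiler is reserving storage for the `i32` field of the uninhabited variant C, which is the trade-off discussed in this comment.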

@Nadrieril
Member

Is it guaranteed (and documented somewhere) that ReadDiscriminant cannot return the discriminant of an uninhabited variant (on pain of UB)? Is that true even of repr(whatever) enums?

@RalfJung
Member Author

That is the current behavior as implemented in Miri, but I don't think we guarantee it. ReadDiscriminant is more of an implementation detail anyway; it doesn't show up directly in the surface syntax.

@Nadrieril
Member

Do we not specify what data a pattern-match accesses?

My real question is: is there, or will there ever be, a UB-free execution that can reach a match arm that matches on an uninhabited variant? Such an execution could look like the following (I'm assuming this example is fine for e.g. T=u8, do tell me if I'm wrong):

#![feature(never_type)]

#[repr(u8)]
enum Enum<T> {
    A = 0,
    B = 1,
    C(u8, T) = 2,
}

fn main() {
    let mut x: Enum<!> = Enum::A;
    unsafe { (&raw mut x).cast::<u8>().write(2u8) };
    match x {
        Enum::A => println!("got A!"),
        Enum::B => println!("got B!"),
        Enum::C(..) => println!("got C!"), // is this ever reachable?
    }
}

@RalfJung
Member Author

> Do we not specify what data a pattern-match accesses?

We specify basically nothing here. It's all details of how exactly match gets lowered to MIR / MiniRust, and that isn't really specified in any way.

Personally I think it's reasonable to say that GetDiscriminant will never return the discriminant of an uninhabited variant. Though this is yet another case of special-casing uninhabited types in the semantics (similar to #413 in that sense), which not everyone is happy with. And for repr(C) enums it is particularly tricky since we make so many layout guarantees about them.
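To make the point about layout guarantees concrete, here is a sketch that prints the size of the `Enum<T>` from the earlier example for an inhabited and an uninhabited `T`. It uses `std::convert::Infallible` in place of `!` so that it builds on stable Rust, which is an assumption of this sketch rather than something from the thread. For `T = u8` the size follows from the documented layout of primitive-repr enums with fields (a `u8` tag followed by the variant's fields); what is printed for the uninhabited case is simply what the compiler in use does.

```rust
use std::convert::Infallible;
use std::mem::size_of;

// The enum from the earlier comment: repr(u8) fixes the tag type and
// makes the layout of each variant follow the repr(C)-style rules.
#[allow(dead_code)]
#[repr(u8)]
enum Enum<T> {
    A = 0,
    B = 1,
    C(u8, T) = 2,
}

fn main() {
    // One tag byte plus two u8 fields for variant C.
    println!("size_of::<Enum<u8>>() = {}", size_of::<Enum<u8>>());
    // Variant C is uninhabited here; whether its u8 field still gets
    // storage is exactly what this issue is about.
    println!(
        "size_of::<Enum<Infallible>>() = {}",
        size_of::<Enum<Infallible>>()
    );
}
```

If the second size is larger than one byte, the tag value 2 still has a defined position and field storage behind it, which is why ruling that discriminant out is harder to justify for these enums than for default-repr ones.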

@GKFX

GKFX commented Dec 26, 2024

There has been a good bit of discussion of how this affects code which manually constructs an enum at rust-lang/rust#120141. (Also an IRLO thread.) Essentially, there would be a need for a fallible variant of offset_of, since if the enum is too small to contain the fields of one of its variants there is no valid offset to give for those fields. This is not insurmountably difficult to handle but it is likely to end up generally awkward for users.

Personally, I think that a type smaller than the fields it contains is just too surprising/strange of an optimization, and I doubt that manual construction of an enum is the only thing it would affect.

Are there other ways to achieve the same performance benefits without changing the enum's size? For example, since the Rust calling convention is private to the compiler, functions that take or return Result<T, Uninhabited> could actually (in the ABI) take or return T. More generally: if there is exactly one inhabited variant, pass that variant's payload directly.
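At the source level, the "exactly one inhabited variant" case already behaves like a plain `T`, as the sketch below shows. The helper names `unwrap_infallible` and `compute` are made up for illustration, and `std::convert::Infallible` stands in for the uninhabited error type. This only illustrates the surface semantics; it says nothing about what the compiler currently does at the ABI level.

```rust
use std::convert::Infallible;

// Hypothetical helper: extracts the payload without any panic path,
// because the Err arm can be proven unreachable by an empty match.
fn unwrap_infallible<T>(r: Result<T, Infallible>) -> T {
    match r {
        Ok(v) => v,
        // `never` has an uninhabited type, so matching on it needs no arms.
        Err(never) => match never {},
    }
}

// Hypothetical function whose error type is uninhabited.
fn compute() -> Result<u32, Infallible> {
    Ok(42)
}

fn main() {
    let v = unwrap_infallible(compute());
    println!("{v}");
}
```

Under the suggestion above, a call like `compute()` could be lowered as if it returned a bare `u32`, independently of how much storage `Result<u32, Infallible>` occupies in memory.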
