-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't enforce null checks on fields inside nullable structs #1296
base: master
Are you sure you want to change the base?
Conversation
@zsxwing let me know what you think of this. I was hoping to potentially support updating the null checks to properly check the way Spark handles things, but that was a bit too much. Also open to feature flagging this if you think it's acceptable but don't want this to be the default behavior. |
472dea6
to
84a0b17
Compare
Sorry for the late reply. I'm inclined to not make this change because
|
Does it make sense then to, only on the Spark side, make any children of nullable fields nullable creating schemas for a table (or considering changed schemas on new writes)? It just seems weird to leave a case that can knowingly trigger an error that's not really a bug (for Spark at least) |
Coming back to this since we continue to have issues with this and several others have reported this as erroneous behavior as well
I'm not sure why I didn't respond to this part before but this is the whole issue. Null invariants are added automatically for non null fields, with no way to disable them. So there is no workaround. So it'd be great if you had a change of heart about this. If not I'll make a separate PR that lets users just disable null invariants completely so there actually is a workaround |
Description
Resolves #860, (and maybe #1239 too)
Currently a non-nullable field inside a nullable struct is enforced to be not null. However, the way Spark handles nullability, a non-nullable field could be "null" if its parent is null. The non-nullable on the inner field simply means "if my parent is non-null then I will definitely be non-null". This PR turns a non-nullable field inside a nullable struct into an unenforced not null.
How was this patch tested?
Updated existing unit tests to make sure the parent structs are non-nullable for the same current behavior, and new UTs to test the new behavior.
Does this PR introduce any user-facing changes?
Yes, non-nullable fields inside nullable structs become unenforced not nulls, thereby failing to create the schema by default without the
allowUnenforcedNotNull
setting enabled, instead of creating runtime null checks for the field.