Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AVRO-2032: [Java] Add support for NaN, Infinity and -Infinity in JsonDecoder #3066

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

rroesch1
Copy link

@rroesch1 rroesch1 commented Aug 5, 2024

What is the purpose of the change

JsonEncoder uses special string values to represent NaN, Infinity and -Infinity values for float and double values, but JsonDecoder does not accept these string values. This change adds support for these special values to JsonDecoder.

Verifying this change

This change added tests and can be verified as follows:

  • The change adds an unit tests which verifies all 6 special cases:
    • NaN for float fields
    • NaN for double fields
    • Infinity for float fields
    • Infinity for double fields
    • -Infinity for float fields
    • -Infinity for double fields
  • These values are defined in Jackson.
    StdDeserializer

Documentation

  • This PR doesn't introduce a new feature but implements a behavior which is not documented at all.
  • The Avro spec doesn't mention how these special cases are handled in JSON (RFC 8259 does not define numeric literals for NaN, Infinity and -Infinity)
  • JsonEncoder/JsonDecoder resemble the behavior of Jackson

JsonEncoder uses special string values to represent NaN, Infinity and
-Infinity values for float and double values, but JsonDecoder does not
accept these string values. This change adds support for these special
values to JsonDecoder.
@github-actions github-actions bot added the Java Pull Requests for Java binding label Aug 5, 2024
@martin-g
Copy link
Member

martin-g commented Aug 6, 2024

Similar PR for the Rust SDK: #3051
Note: it also supports inf next to infinity.

@martin-g
Copy link
Member

martin-g commented Aug 6, 2024

Note: it also supports inf next to infinity.

I see that the implementation here also supports Inf. The difference with the Rust PR is that here the values are case-sensitive.

@rroesch1
Copy link
Author

rroesch1 commented Aug 6, 2024

@martin-g

@rroesch1
Copy link
Author

rroesch1 commented Aug 6, 2024

fyi: looks like the C++ implementations also uses the literals "NaN", "Infinity", and "-Infinity" (see: JsonCodec.cc )

Copy link
Contributor

@xxchan xxchan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Avro spec doesn't mention how these special cases are handled in JSON

Indeed. I think we should fix that AVRO-4025.
Perhaps we could add some tests to demonstrate the current behavior (the de-facto standard) of schema / JsonEncoder, and them document the behavior into the spec. I will try to have a PR for that soon.

@xxchan
Copy link
Contributor

xxchan commented Aug 6, 2024

Please note this PR is just about the JSON encoding for records, not about the encoding for schemas

Indeed. But "Except for unions, the JSON encoding is the same as is used to encode field default values (in schema)", so I think they are essentially the same (and SHOULD have the same behavior).

https://avro.apache.org/docs/1.11.1/specification/_print/#json-encoding

@zcsizmadia
Copy link
Contributor

zcsizmadia commented Aug 7, 2024

The same PR for C# #3070. We should specify the accepted strings. NaN, Infinity, -Infinity makes sense, however if if we need to support INF and -INF and/or case insensitivity, we should add all the supprted strings to the doc. I added tests for all of those cases.

martin-g added a commit that referenced this pull request Aug 7, 2024
…Infinity"

This is what the Java SDK (via Jackson library) supports (#3066).
This is what the C# SDK also would support (#3070)

Signed-off-by: Martin Tzvetanov Grigorov <[email protected]>
martin-g added a commit that referenced this pull request Aug 7, 2024
…Infinity"

This is what the Java SDK (via Jackson library) supports (#3066).
This is what the C# SDK also would support (#3070)

Signed-off-by: Martin Tzvetanov Grigorov <[email protected]>
martin-g added a commit that referenced this pull request Aug 7, 2024
…Infinity"

This is what the Java SDK (via Jackson library) supports (#3066).
This is what the C# SDK also would support (#3070)

Signed-off-by: Martin Tzvetanov Grigorov <[email protected]>
martin-g added a commit that referenced this pull request Aug 7, 2024
…Infinity"

This is what the Java SDK (via Jackson library) supports (#3066).
This is what the C# SDK also would support (#3070)

Signed-off-by: Martin Tzvetanov Grigorov <[email protected]>
martin-g added a commit that referenced this pull request Aug 7, 2024
…g numbers with Java and C# (#3073)

* AVRO-4024: [Rust] Improve the error messages when parsing unknown String into Value::Float/Double

Signed-off-by: Martin Tzvetanov Grigorov <[email protected]>

* AVRO-4024: [Rust] Accept only "Nan", "INF", "-INF", "Infinity" and "-Infinity"

This is what the Java SDK (via Jackson library) supports (#3066).
This is what the C# SDK also would support (#3070)

Signed-off-by: Martin Tzvetanov Grigorov <[email protected]>

---------

Signed-off-by: Martin Tzvetanov Grigorov <[email protected]>
martin-g added a commit that referenced this pull request Aug 7, 2024
…g numbers with Java and C# (#3073)

* AVRO-4024: [Rust] Improve the error messages when parsing unknown String into Value::Float/Double

Signed-off-by: Martin Tzvetanov Grigorov <[email protected]>

* AVRO-4024: [Rust] Accept only "Nan", "INF", "-INF", "Infinity" and "-Infinity"

This is what the Java SDK (via Jackson library) supports (#3066).
This is what the C# SDK also would support (#3070)

Signed-off-by: Martin Tzvetanov Grigorov <[email protected]>

---------

Signed-off-by: Martin Tzvetanov Grigorov <[email protected]>
(cherry picked from commit dffc13a)
martin-g added a commit to apache/avro-rs that referenced this pull request Sep 23, 2024
…g numbers with Java and C# (#3073)

* AVRO-4024: [Rust] Improve the error messages when parsing unknown String into Value::Float/Double

Signed-off-by: Martin Tzvetanov Grigorov <[email protected]>

* AVRO-4024: [Rust] Accept only "Nan", "INF", "-INF", "Infinity" and "-Infinity"

This is what the Java SDK (via Jackson library) supports (apache/avro#3066).
This is what the C# SDK also would support (apache/avro#3070)

Signed-off-by: Martin Tzvetanov Grigorov <[email protected]>

---------

Signed-off-by: Martin Tzvetanov Grigorov <[email protected]>
@legal-s
Copy link

legal-s commented Nov 22, 2024

Hello,

Do you know when this pull request will be merged because it would fix a data deserialization problem with an entity containing the NaN value that i have?

Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Java Pull Requests for Java binding
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants