A 69-bit integer value in JSON could cause data loss when using kafka source #13047
Comments
https://datatracker.ietf.org/doc/html/rfc8259#section-6
Also note that proto3 maps 64-bit integers to JSON strings rather than JSON numbers. In the long term we do not want to be the bottleneck that limits JSON numbers to 2^53. But for good interoperability it is better not to assume that every tool in the data pipeline can support larger values.
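To make the 2^53 limit concrete, here is a minimal sketch using only the Rust standard library. The helper name `is_interoperable` is hypothetical and the range check follows the interoperability note in RFC 8259 section 6; it is an illustration, not code from RisingWave or simd-json.

```rust
/// 2^53: beyond this magnitude an f64 can no longer represent every integer.
const TWO_POW_53: u64 = 1 << 53;

/// Hypothetical helper: true if `n` lies in the range RFC 8259 section 6
/// describes as interoperable, i.e. [-(2^53)+1, (2^53)-1].
fn is_interoperable(n: i64) -> bool {
    n.unsigned_abs() < TWO_POW_53
}

/// Round-trips an integer through f64, as a JSON tool that stores every
/// number as a double would effectively do.
fn through_f64(n: i64) -> i64 {
    (n as f64) as i64
}

fn main() {
    let safe = (1_i64 << 53) - 1;
    let unsafe_value = (1_i64 << 53) + 1;

    // Inside the interoperable range the round trip is exact.
    println!("{} -> {}", safe, through_f64(safe));
    // Outside it, the value silently changes.
    println!("{} -> {}", unsafe_value, through_f64(unsafe_value));
    println!("interoperable? {}", is_interoperable(unsafe_value));
}
```

A value such as 2^53 + 1 fits comfortably in an `i64`, yet it does not survive a trip through a double, which is why downstream tools cannot all be assumed to handle it.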
Currently a parse error.
As the user suggests, simd-json 0.13.5 now enables us to convert very large numbers into f64, with a potential loss of precision.
But I still think this is not a good way to deal with such numbers. Converting them to decimal (https://crates.io/crates/bigdecimal) may be better.
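The trade-off between the two options can be sketched with the standard library alone. In this sketch `u128` is only a stand-in for an exact decimal type such as the one in the bigdecimal crate, and the function names and sample value are hypothetical; it does not use the simd-json or bigdecimal APIs.

```rust
/// Lossy path: parse the digits as a double, roughly what converting a
/// very large JSON number to f64 amounts to.
fn parse_lossy(text: &str) -> Option<f64> {
    text.parse::<f64>().ok()
}

/// Exact path: parse into a wider type. `u128` is an assumption standing in
/// for an arbitrary-precision decimal.
fn parse_exact(text: &str) -> Option<u128> {
    text.parse::<u128>().ok()
}

fn main() {
    // Hypothetical sample: 2^64 + 1, one past what a u64 can hold by two.
    let text = "18446744073709551617";

    // A 64-bit integer parse rejects the value outright.
    println!("u64 parse fails: {}", text.parse::<u64>().is_err());

    let lossy = parse_lossy(text).expect("valid number");
    let exact = parse_exact(text).expect("valid number");

    // The f64 result is the nearest representable double, not the input.
    println!("exact: {}", exact);
    println!("lossy: {}", lossy as u128);
    println!("difference: {}", exact - lossy as u128);
}
```

The f64 path accepts the input but returns a neighbouring value, while the wider type keeps every digit, which is the argument for a decimal representation.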
Describe the bug
Hi there. I recently found some parse error logs on a CN node like the one below.
The JSON record mentioned in the log, at partition 0, offset 8 of the Kafka topic, is skipped by the Kafka source, which eventually results in data loss.
After some investigation of the source code, I realized that this is a simd-json parse error caused by a field with a large integer value, so I wrote a minimal Rust snippet to reproduce the error:
output:
In our Kafka upstream, JSON data may contain large integer values inside nested objects, so it is very hard to detect those values and correct them before ingestion.
So, is this expected behavior? Or should the field be replaced by NULL instead of raising a parse error?
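The two behaviors in question can be contrasted in a small standard-library sketch. The function names and the sample value are hypothetical, and this is not how RisingWave or simd-json handles the field; it only shows the difference between failing the parse and falling back to NULL (modeled here as `None`).

```rust
use std::num::ParseIntError;

/// Strict behavior: an out-of-range integer is an error, so a caller that
/// propagates it would reject the whole record.
fn parse_strict(raw: &str) -> Result<i64, ParseIntError> {
    raw.parse::<i64>()
}

/// Lenient behavior: any value that does not parse as an i64 becomes None
/// (NULL). Note this also maps non-numeric input to None.
fn parse_or_null(raw: &str) -> Option<i64> {
    raw.parse::<i64>().ok()
}

fn main() {
    // Hypothetical oversized value: 2^64, which does not fit in an i64.
    let too_big = "18446744073709551616";

    match parse_strict(too_big) {
        Ok(v) => println!("strict: {}", v),
        Err(e) => println!("strict: error ({})", e),
    }
    println!("lenient: {:?}", parse_or_null(too_big));
    println!("lenient: {:?}", parse_or_null("42"));
}
```

The lenient variant keeps the rest of the record but hides the fact that a value was dropped, so it trades visible data loss for silent data loss in that one field.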
Error message/log
No response
To Reproduce
No response
Expected behavior
No response
How did you deploy RisingWave?
Kubernetes
The version of RisingWave
PostgreSQL 9.5-RisingWave-1.3.0 (c4c31bd)
Additional context
No response