Investigate input of overlong UTF-8 sequences #932
Comments
We have standard Angular validators for the form fields. They seem to be well tested and handle such symbols correctly.
I believe there may have been a misunderstanding here. UTF-8's design theoretically allows a code point to be represented in more than one way. Overlong UTF-8 sequences use more bytes than strictly required, while still decoding to the same code point. For example, the ASCII space character (U+0020) is normally encoded as the single byte 0x20, but the two-byte sequence 0xC0 0xA0 carries the same payload bits; the UTF-8 specification forbids such forms, so a conforming decoder must reject them. The concern is that, if software operates directly on UTF-8-encoded strings, such encodings could potentially be used to bypass validation checks. In the above case of the space character, a validation that checks whether a certain input contains no whitespace may naively look only for the byte 0x20 and miss the overlong form. Since this concerns input validation, I believe it is a backend issue, rather than (just) a frontend issue.
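To make the concern above concrete, here is a minimal sketch in Java. The class and method names are hypothetical and not taken from this repository, and the lenient decoder is deliberately non-conforming: it only illustrates what a careless hand-written decoder might do, not how any particular library behaves.

```java
import java.nio.charset.StandardCharsets;

final class OverlongSpaceDemo {
    // Overlong two-byte form of U+0020 (bit pattern 110_00000 10_100000).
    // This sequence is not valid UTF-8.
    static final byte[] OVERLONG_SPACE = {(byte) 0xC0, (byte) 0xA0};

    // Naive validation on raw bytes: only looks for the single byte 0x20.
    static boolean containsSpaceByte(byte[] utf8) {
        for (byte b : utf8) {
            if (b == 0x20) {
                return true;
            }
        }
        return false;
    }

    // Deliberately lenient (non-conforming) decoding of a two-byte sequence:
    // it extracts the payload bits without rejecting overlong forms.
    static int lenientDecodeTwoBytes(byte lead, byte continuation) {
        return ((lead & 0x1F) << 6) | (continuation & 0x3F);
    }

    public static void main(String[] args) {
        byte[] regularSpace = " ".getBytes(StandardCharsets.UTF_8);
        System.out.println("regular space found:  " + containsSpaceByte(regularSpace));
        System.out.println("overlong space found: " + containsSpaceByte(OVERLONG_SPACE));
        int codePoint = lenientDecodeTwoBytes(OVERLONG_SPACE[0], OVERLONG_SPACE[1]);
        System.out.println("lenient decode gives U+" + String.format("%04X", codePoint));
    }
}
```

The byte-level check misses the overlong form, while the lenient decoder still turns it into U+0020. A bypass needs both conditions: validation on raw bytes and a later decoding step that accepts overlong sequences.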
Thank you for the information!
Is this really an issue for our repo, or should it be addressed in Core EDC? // Cc @efiege
Both, since we would have to investigate the behavior of both upstream and our custom code.
Task
Description
Investigate the behavior when input contains overlong UTF-8 sequences, and check whether string validation can be bypassed. This should be fine, since Java decodes all UTF-8 to UTF-16 before exposing it as strings, but it is not yet clear whether the JSON parser reads the UTF-8 stream directly.
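As a starting point for the investigation, the following sketch shows how the JDK's own UTF-8 decoder treats the overlong space sequence 0xC0 0xA0. The class and method names are hypothetical. It covers only `java.lang.String` and `java.nio.charset.CharsetDecoder`; it says nothing about a JSON parser that performs its own byte-level decoding, which would need to be tested separately.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class StrictUtf8Check {
    // Default String decoding: malformed input is replaced with U+FFFD
    // instead of being decoded or raising an error.
    static String decodeWithReplacement(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Strict decoding: returns false if the bytes are not well-formed UTF-8.
    static boolean isWellFormedUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] overlongSpace = {(byte) 0xC0, (byte) 0xA0};
        String decoded = decodeWithReplacement(overlongSpace);
        System.out.println("contains a space:  " + decoded.contains(" "));
        System.out.println("contains U+FFFD:   " + (decoded.indexOf('\uFFFD') >= 0));
        System.out.println("well-formed UTF-8: " + isWellFormedUtf8(overlongSpace));
    }
}
```

If the strict check is applied to the raw request bytes before parsing, overlong sequences are rejected outright rather than silently replaced. Whether that is preferable to replacement is a design decision for the backend.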
Stakeholders
@sybereal
Solution Proposal and Work Breakdown