fix #442 - Prevent crash when cursor is in the middle of a UTF sequence #443

Merged: 2 commits, Feb 24, 2018

@ghost commented Feb 24, 2018

No description provided.

@ghost force-pushed the issue-442 branch from 07b8751 to 7e70c26 (February 24, 2018 12:00)
// since later there's a non-nothrow call to `toUpper`
import std.utf : validate, UTFException;
try validate(partial);
catch (UTFException) partial = "";

Author

I think Phobos is missing a nothrow UTF validator. Using try/catch for this is really horrible.
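
Until such a validator exists, the try/catch can at least be confined to one place. The following is a minimal sketch, assuming a hypothetical helper named `isValidUTF8` (it is not a Phobos function and not part of this PR). It relies only on `std.utf.validate`, which throws `UTFException` on malformed input:

```d
import std.utf : validate;

/// Hypothetical helper (not in Phobos): returns true if `s` is well-formed
/// UTF-8, false otherwise. No exception escapes, so it can be `nothrow`.
bool isValidUTF8(const(char)[] s) nothrow @safe
{
    try
    {
        validate(s); // throws UTFException on malformed input
        return true;
    }
    catch (Exception)
    {
        // Catching the base class lets the compiler prove `nothrow`.
        return false;
    }
}

unittest
{
    // "\u00E9" is two code units in UTF-8; slicing off the second one
    // mimics a cursor placed in the middle of the sequence.
    assert(isValidUTF8("h\u00E9llo"));
    assert(!isValidUTF8("\u00E9"[0 .. 1]));
}
```

Catching `Exception` rather than `UTFException` is what allows the `nothrow` attribute here; the diff above catches only `UTFException` because the enclosing function is not `nothrow` anyway.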

Member

Agreed. I also miss a way to process strings in nothrow code, e.g. by replacing invalid sequences with the Unicode replacement character. There was a DIP for this once, though: https://wiki.dlang.org/DIP76
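
For illustration, the replacement idea can be approximated with the range helpers in `std.utf`, which substitute U+FFFD for invalid sequences instead of throwing. This is only a sketch: `replaceInvalidUTF8` is a hypothetical name, the round trip through `dchar` allocates and is not tuned for speed, and the function is deliberately left without a `nothrow` annotation because whether that attribute holds may depend on the Phobos version:

```d
import std.array : array;
import std.utf : byChar, byDchar, validate;

/// Hypothetical helper: returns a copy of `s` in which invalid UTF-8
/// sequences are replaced by the replacement character U+FFFD.
string replaceInvalidUTF8(const(char)[] s)
{
    // byDchar decodes lazily and yields U+FFFD for invalid input;
    // byChar re-encodes the code points as UTF-8 code units.
    return s.byDchar.byChar.array.idup;
}

unittest
{
    // A lone lead byte, as left behind when a multi-byte character is cut.
    auto truncated = "ab" ~ "\u00E9"[0 .. 1];
    auto cleaned = replaceInvalidUTF8(truncated);
    validate(cleaned); // would throw if `cleaned` were still malformed
    assert(replaceInvalidUTF8("h\u00E9llo") == "h\u00E9llo");
}
```

Compared with resetting the string to empty, this keeps the valid prefix of the input, at the cost of an extra pass and an allocation.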

Member

FYI: I recently hacked a toValidUTF method for run.dlang.io, because of similar problems.
Though it's really ugly (performance-wise), but Phobos is really lacking in this regard :/

dlang-tour/core#673

Member

> Though it's really ugly (performance-wise),

Thinking about it: maybe we should use this approach in the catch (UTFException) case? Silently setting partial to an empty string when there's just one invalid UTF symbol looks a bit like patching over the actual problem.

Author

The actual problem comes from the IDE in the first place; in DCD we just avoid a crash.

@wilzbach left a comment

Thanks! I'm going to merge this as DCD clearly shouldn't crash on invalid UTF sequences. The exact behavior can also be redefined later.

@dlang-bot merged commit a6804db into master Feb 24, 2018
@ghost deleted the issue-442 branch February 24, 2018 12:52