Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

extra guards for empty lines for Mouth::readToken #2215

Merged
merged 2 commits into from
Sep 30, 2023

Conversation

dginev
Copy link
Collaborator

@dginev dginev commented Sep 19, 2023

Fixes #2175

The regression is really subtle, so this PR definitely deserves a careful look-over. As Tim helpfully diagnosed in the issue, the underlying cause were the improvements from commit acaa51d .

The remaining subtlety seemed to be revealed when babel.sty loads its english .ini file on a recent texlive (babel-en.ini). Every time that file read encounters a completely empty line, it seems to terminate the read early. Explaining how that happens is not something I can do with sufficient clarity at the moment, but it has to do with the interplay of Gullet::readUntil, Mouth::readToken and Mouth::getNextChar, where the undef from the empty line needs to be handled at just the right context point - or the assembled token list for the argument is incorrect, raising the higher level Errors of the regression.

I arrived at the PR contents by experiment - trying to tweak in a variety of contexts and settling on the first reasonable combination that worked out. The key change was adding a $$self{nchars} == 0 check to the branch of Mouth::readToken that creates an EOL marker if a line is completely empty - tests continue to pass, and the regression is resolved.

I also changed a couple of checks relying on $ch to instead rely on defined $ch along the way, as I kept worrying about 0 chars.

@dginev dginev requested a review from brucemiller September 19, 2023 22:50
@dginev
Copy link
Collaborator Author

dginev commented Sep 19, 2023

I should also mention that I have tested this on the way the regression manifested itself on texlive 2022 (as I described in the issue) - but I haven't checked a texlive 2023 installation, where the test suite was also reported to be failing. I guess I should install that and double-check the fix is also appropriate there...

@@ -214,7 +214,7 @@ sub handle_escape { # Read control sequence
# Bit I believe that he does NOT mean within control sequences
my $cs = "\\" . $ch; # I need this standardized to be able to lookup tokens (A better way???)
if ((defined $cc) && ($cc == CC_LETTER)) { # For letter, read more letters for csname.
while ((($ch, $cc) = getNextChar($self)) && $ch && ($cc == CC_LETTER)) {
while ((($ch, $cc) = getNextChar($self)) && (length($ch) > 0) && ($cc == CC_LETTER)) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this really supposed to be positive length, rather than defined?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, the length check is extra-paranoid (I believe I was triple-checking I am not misunderstanding the return values).

My fear was that the empty string "" is defined, but empty. But studying getNextChar, there is no code path that could return "". So I can switch this to a "defined" test, thanks.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Also rebased to master - the English babel load continues to succeed.

@dginev dginev force-pushed the babel-english-regression branch from 29cb020 to 9f2645f Compare September 28, 2023 19:13
@brucemiller
Copy link
Owner

Slightly scary, but seems right, and doesn't cause any problems that I can detect. Thanks!

@brucemiller brucemiller merged commit 41bad55 into brucemiller:master Sep 30, 2023
13 checks passed
@brucemiller brucemiller deleted the babel-english-regression branch September 30, 2023 20:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Test fails w/ latest CTAN snapshot (babel/greek)
2 participants