fix: improve `funcId` recognition and squash misdetection bugs #636

novusnota · 2024-07-28T01:22:09Z

That took a looooooooot of RegEx iterations.

Closes #635

- I have updated CHANGELOG.md
- ~~I have documented my contribution in Tact Docs: https://github.com/tact-lang/tact-docs/pull/PR-NUMBER~~
- I have added tests to demonstrate the contribution is correctly implemented: this usually includes both positive and negative tests, showing the happy path(s) and featuring intentionally broken cases
- I have run all the tests locally and no test failure was reported
- I have run the linter, formatter and spellchecker
- I did not do unrelated and/or undiscussed refactorings

src/grammar/test/items-native-fun-funcid.tact

anton-trunov · 2024-07-28T08:21:10Z

To be on the safe side, let's take stdlib.fc and mathlib.fc as libraries that are commonly used and see if we can add more function identifiers from those libs into our positive tests to make sure people will be able to FFI with those.

anton-trunov

This PR accepts, for instance, `+` as a valid FunC identifier, but it isn't one.

Something like this should work:

    funcInvalidId = "_" ")" --notUnderscore
                  | ("+" | "-" | "*" | "/" | "~/" | "^/" | "%" | "~%" | "^%" | "/%") ")" --notArithOperator
                  // more operators here
                  | ("-"? digit+) ")" --notDecimalNumber
                  | ("-"? "0x" hexDigit+) ")" --notHexadecimalNumber
    funcPlainId = ~funcInvalidId (~(whiteSpace | "(" | ")" | "," | "." | ";" | "~") any)+
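
To see what the suggested rules accept and reject, here is a minimal sketch that mirrors them with plain regular expressions instead of Ohm. The names `funcInvalidIdRe`, `funcPlainIdBodyRe` and `matchFuncPlainId` are hypothetical, `\s` only approximates the grammar's `whiteSpace` rule, and the input is assumed to start right where the identifier would begin.

```typescript
// Mirrors `funcInvalidId`: a forbidden token immediately followed by ")".
// Covers only the alternatives listed in the suggestion above.
const funcInvalidIdRe =
    /^(?:_|~\/|\^\/|~%|\^%|\/%|\+|-|\*|\/|%|-?\d+|-?0x[0-9a-fA-F]+)\)/;

// Mirrors the body of `funcPlainId`: one or more characters that are
// not whitespace and not one of ( ) , . ; ~
const funcPlainIdBodyRe = /^[^\s(),.;~]+/;

// Returns the matched identifier, or null when the input is rejected.
function matchFuncPlainId(input: string): string | null {
    // Negative lookahead, like `~funcInvalidId` in the grammar.
    if (funcInvalidIdRe.test(input)) return null;
    const m = funcPlainIdBodyRe.exec(input);
    return m ? m[0] : null;
}

console.log(matchFuncPlainId("store_uint) native")); // accepted
console.log(matchFuncPlainId("+)"));                 // rejected: bare operator
console.log(matchFuncPlainId("_+_)"));               // accepted: not a bare operator
```

The trailing `")"` in the invalid-identifier pattern is what lets `_+_` through while `+` alone is rejected: the forbidden token has to span the whole identifier up to the closing parenthesis.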

novusnota · 2024-07-28T17:12:14Z

Nice observation! I'd try putting lookaheads (`&`) in front of those `")"` so they don't get consumed, but let me try your suggestion first, it should work. UPD: Yeah, it does, clever idea!

Keywords are not allowed and should be excluded too. Basically, the contents of https://github.com/ton-blockchain/ton/blob/master/crypto/func/keywords.cpp have to be added to the exceptions list. I'm on it.
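
A rough sketch of the keyword exclusion, assuming the check applies to the whole identifier so that names merely derived from a reserved word still parse. The set below is a small illustrative subset, not the full list from `keywords.cpp`, and `isReservedWord` is a hypothetical helper name.

```typescript
// Illustrative subset only; the full list lives in FunC's keywords.cpp.
const funcKeywordsSubset = new Set<string>([
    "return", "var", "repeat", "int", "cell", "slice", "asm", "impure", "inline",
]);

// An identifier is rejected only when the whole token equals a keyword.
function isReservedWord(id: string): boolean {
    return funcKeywordsSubset.has(id);
}

console.log(isReservedWord("int"));     // reserved word itself
console.log(isReservedWord("int_"));    // derivative, not reserved
console.log(isReservedWord("returns")); // derivative, not reserved
```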

UPD2: The order of operators in those inner `()` alternations really does matter: an operator that starts with a shorter operator must come before that shorter one.
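
The ordering effect comes from prioritized choice in PEG-style grammars: the first alternative that matches wins, and a later failure does not go back to try the remaining alternatives. Below is a minimal sketch of that behaviour without Ohm; `firstMatch` and `isOperatorThenParen` are hypothetical helpers that emulate `(op1 | op2 | ...) ")"`.

```typescript
// Tries each operator in order and commits to the first one that matches,
// like `|` in a PEG grammar.
function firstMatch(ops: string[], input: string): string | null {
    for (const op of ops) {
        if (input.startsWith(op)) return op;
    }
    return null;
}

// Emulates `(op1 | op2 | ...) ")"`: once an operator is chosen,
// the ")" must follow it directly; there is no retry with another operator.
function isOperatorThenParen(ops: string[], input: string): boolean {
    const op = firstMatch(ops, input);
    return op !== null && input.startsWith(")", op.length);
}

const shortFirst = ["/", "/%"]; // "/" shadows "/%"
const longFirst = ["/%", "/"];  // longer operator gets its chance first

console.log(isOperatorThenParen(shortFirst, "/%)")); // false
console.log(isOperatorThenParen(longFirst, "/%)"));  // true
console.log(isOperatorThenParen(longFirst, "/)"));   // true
```

With the short operator first, `"/"` matches, the following `")"` check fails on `"%"`, and `"/%"` is never tried, so the invalid identifier slips past the exclusion rule.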

novusnota · 2024-07-28T20:37:29Z

There was a suggestion that didn't pass the negative tests, so I hid it.

What we've engineered works, but I may have found a cleaner solution that doesn't require the `")"` and can be used in the FunC parser too (the current approach in this PR fails there when the `")"` is absent). But this one works:

It passes a set of tests similar to the ones we have for function identifiers, but adapted to FunC code:

UPD: Negative tests failed terribly, reverting :)

anton-trunov · 2024-07-30T05:44:32Z

@novusnota So, what's the current status of this PR?

novusnota · 2024-07-30T11:49:47Z

@anton-trunov It's a working and robust solution. The alternative would be to invalidate identifiers during syntactic analysis, which would improve the error messages further, but that seems a bit out of scope for this PR. Wdyt?

novusnota added 2 commits July 28, 2024 03:21

fix: improve funcId recognition and squash misdetection bugs

ac2e10b

That took a looooooooot of RegEx iterations, including going into `grammar.ts` and trying to adjust things in the semantic analysis. Fortunately, that wasn't needed.

chore: CHANGELOG

0163633

novusnota requested a review from anton-trunov July 28, 2024 01:44

test: chaos ensues

65c5417

anton-trunov requested changes Jul 28, 2024

CHANGELOG.md (outdated)

src/grammar/test/items-native-fun-funcid.tact (outdated)

novusnota and others added 2 commits July 28, 2024 11:03

Update CHANGELOG.md

0d7e4a7

Co-authored-by: Anton Trunov <[email protected]>

feat/test: extended the possibilities and added more tests

c955962

Came up with a simple solution for 0x0_ stuff, so our parser can deal with them too!

novusnota requested a review from anton-trunov July 28, 2024 09:31

anton-trunov reviewed Jul 28, 2024

src/grammar/grammar.ohm (outdated)

novusnota and others added 2 commits July 28, 2024 11:40

Update src/grammar/grammar.ohm

c5dadf4

Co-authored-by: Anton Trunov <[email protected]>

test: corrupted and emojified identifiers

155bba3

anton-trunov self-assigned this Jul 28, 2024

anton-trunov added this to the v1.4.2 milestone Jul 28, 2024

anton-trunov requested changes Jul 28, 2024

novusnota added 2 commits July 28, 2024 20:26

fix/test: prohibit keywords, operators and compiler directives

071572f

chore: update snapshots

67b4d35

novusnota added 3 commits July 28, 2024 23:12

test: derivatives of reserved words should parse

097a131

fix: typo in the operator

ca8a50f

fix: prohibit square brackets (used in tuples)

9d40c88

anton-trunov approved these changes Jul 30, 2024

anton-trunov merged commit 7f192a6 into main Jul 30, 2024
3 checks passed

anton-trunov deleted the issues/635 branch July 30, 2024 14:01
