-
Notifications
You must be signed in to change notification settings - Fork 167
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
match merging optimization #363
base: dev
Are you sure you want to change the base?
Conversation
This is not ready to merge, I'll add some tests since it does end up moving things around quite a bit. |
8448557
to
9fd2ac6
Compare
3cd2901
to
c160707
Compare
skip guards catch all case
0e61ffa
to
e1fa89d
Compare
TLDR: Shaves off 3-5% runtime on some heavily nested matches which impact ref counting (e.g. some variants of rbtree), otherwise doesn't impact the runtime. Old fip benchmarks:benchmark variant param elapsed relative stddev rsskk rbtree fip 100000 0.63 1.000 .0148324 4816 kk ftree fip 100000 0.75 1.000 .0083666 4928 kk msort fip 100000 1.03 1.000 .0161245 6368 kk qsort fip 100000 1.485 1.000 .0261725 12112 kk tmap fip 100000 1.24 1.000 .0130384 7968 New fip benchmarks:benchmark variant param elapsed relative stddev rsskk rbtree fip 100000 0.61 1.000 .0109545 4832 kk ftree fip 100000 0.8 1.000 .00894427 4912 kk msort fip 100000 0.01 0 kk qsort fip 100000 1.55 1.000 .0457165 12064 kk tmap fip 100000 1.235 1.000 .0276586 7984 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Tim! This looks pretty cool, I didn't realize we could make the binary search tree benchmarks even faster with a proper pattern match compiler.
I added two nitpicks on the tests, but didn't really read the code. Is there something you wanted me to look at in particular?
d8cd1ab
to
7a4b5ee
Compare
Yes, I originally intended to do this to clean up the generated C code - so there wouldn't be as many duplicated checks for similar structure, but I expected the C compiler at least to mostly optimize these redundant checks. It looks however that it is obscure enough that the C compiler cannot optimize it. Looking at it again, when running the benchmark script I realized I was getting some inconsistent results, especially with
You are one of the only other people besides Daan who could give this a review, and I'd appreciate pointing out any obvious bugs before I take up Daan's time since he is so busy. The file isn't very big <600 lines of heavily commented code. I understand if you are busy as well, so just whatever you are willing to look at I'd appreciate. |
Takes a match that has common superstructure, and reworks it to match on that first.
Note that the transform expects there only to be a single pattern in the match statement, which I believe is a fine assumption to begin with, since tuple types are used for multiple pattern matches, and it rewrites the core from the bottom up. However, it probably should be improved to handle this as well. Branches can get tricky fast.
So
Turns into
In the generated C code, the original results in duplicated Cons checks for the first two Conses.
In the new, those checks are eliminated.
It is smart enough to push the implicit error incomplete match error branches down into subpatterns, but doesn't do any exhaustiveness checking to see if the changes have resulted in any new exhaustive cases.