-
Notifications
You must be signed in to change notification settings - Fork 55
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds support marking exprs as subsumed #301
Adds support marking exprs as subsumed #301
Conversation
A couple of high level thoughts:
|
Thanks for taking a look @oflatt!
Will do, I will move it into the table, as part of the
Oh yeah, I like this behavior. I was already worried that if like you mark something as unextractable, then it could still be matched and return a similar expression that would then be extractable. What if we just had one flag called This would also give a better name to the rewrite flag, of |
I'm guessing there are cases where we might want something to be not extractable, but still matchable or vice-versa. |
Ok I can add actions then for both of them and store the info separately. Maybe the high-level rewrite flag though could enable both and be called like |
* Make row a struct so we can add attribute easier * make function and table debug friendly
c0d174b
to
78bcdb5
Compare
This is ready for review again. I have updated it to:
|
Do you think we should add some sort of warning if a whole e-class has been subsumed? I think it might be a potential foot-gun, and hard to prevent. Example:
|
Maybe we could expose the unextractable and subsumed status in the visualization? I was also thinking this would be helpful for debugging. If we do this though, I would prefer to do it in a follow up PR, since it requires a bit of coordination with the egraph-serialized repo. |
I wanted to post here to re-clarify the driving use case for this PR to help with considering whether it should be updated and merged. This feature would be useful when trying to rewrite a smaller expression into a larger one and extract that larger one. In the array API example, for example, since Numab doesn't support Instead, I have to work around it's absence by making the I want to be able to support extracting it as is when I don't apply the numba ruleset (and for instances that don't have the This workaround is not ideal for two reasons:
So with the features in this PR it would become easier to define a separate ruleset that translates smaller expressions into larger ones, and extract those larger ones if that ruleset is run, and if not extract the original smaller ones. Also, I am not sure if this is relevant, but I did see an open issue in eggcc that also might be related (egraphs-good/eggcc#281), asking for a way to replace an expression with another one. It is possible the subsume and unextractable actions could help there too. |
I joined the egglog meeting today (w/ @mwillsey, @ztatlock, @oflatt, and @ajpal) and we came up with a proposal that we:
Thank you all! I will update this PR to reflect these comments. We also talked about some other possible workarounds, including separating unextractable/costs for functions from the function definitions so you could use different ones. We also remarked that if you have multiple rules that subsume an expression, those could match at the same time and they would both be matched even though it is subsumed. It was expressed this was OK because this is similar to other actions that also act simultaneously after matching, even if it's unexpected. You could avoid this if you want by splitting rules into rulesets and running them one at a time. |
I have updated the PR to address the points from the discussion. This is ready for another review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great! Thanks for adapting the PR to the new refactored codebase and incorporating conclusions from discussions! I left a small question. I believe once @oflatt approves, we should be able to merge this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks almost ready to merge!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! 🎉 🎉
I wanted to note that I am now integrating these changes into the Python code and I can now remove my custom magic cost numbers! https://github.com/egraphs-good/egglog-python/pull/127/files#diff-9a9a81a04022815662adeb365e546523cb4376d5c1c797be3131968605f1c4c8 Thank you all for the time/energy spent refining this. |
EDIT: This PR has been changed to allow "subsuming" nodes, which are both unextractable and cannot be matched during rules: #301 (comment)
Previously, we had support for marking a whole function as unextractable. This PR extends that to also allow marking individual rows of a function as unextractable as well.
It also adds a high-level
:unextractable
parameter for therewrite
command to mark the LHS as unextractable after its been applied.This is one way to support directional transformations/subsumption (closes #256 for now).
As I mentioned in that issue, one concrete use case for this is in my rewrites for numba compatibility of array expressions. In that example, I have to translate
mean
intosum
expressions, when anaxis
parameter is passed to mean. I want the rewritten version extracted, but by default, the cost would be higher so it wouldn't be. My workaround at the moment was to give the wholemean
function a huge cost so it would always be avoided if possible.However, that solution isn't ideal for a few reasons:
With this PR, I could move the logic entirely to the rewrite.
Implementation
This PR implements the feature by storing a set of "unextractable" input values to each function. I am not sure if this is like safe for rebuilding, so I would love some feedback on that. If there is a better way of storing this information, I am happy to change it.