Disappearing variables #391
Great question. This sort of thing sometimes happens when Herbie tries a Taylor expansion and finds that a zeroth-order expansion is just as accurate. I'm surprised it came out that way in this case, though! I'll try to dig into this expression at some point just to double-check that nothing went wrong.
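For intuition, here is a toy illustration in plain Python (this is not Herbie's internals, and not the expression from this issue) of how a zeroth-order expansion can be just as accurate: when the expanded-out variable is far smaller than the others, the truncated result can be bit-for-bit identical to the original.

```python
import math

# Toy example: expand sqrt(x^2 + y^2) in y around y = 0.  The
# zeroth-order term is |x|, so y disappears from the program entirely.
x, y = 2.0 ** 100, 2.0 ** -50         # powers of two keep the check exact
original     = math.sqrt(x * x + y * y)
zeroth_order = abs(x)
print(original == zeroth_order)       # True: y*y is far below x*x's ULP
```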
In case it helps, I think
Hi all, I had some spare time today and did a deep dive into this issue, to get a fuller sense of what's going on than my reply above gives. A quick reminder of the problem: when given this benchmark, Herbie returns a program that calls hypot only once.
This is bizarre, both because y is unused and because a great alternative (the two-hypot version discussed below) is available.
Herbie’s result has an accuracy of something like 99.3%, while the better alternative has an accuracy of something like 99.9999%. On specific inputs, Herbie’s result is especially bad: on (1, 1, 1), the worst case, the error is about 22%. The alternative is off by an ULP on some points, but no more than that.

Let's recall my earlier explanation. The one-hypot version works because most points Herbie samples have x, y, and z with dramatically different orders of magnitude. After sorting, either x or z is the biggest, while y is the one in the middle. So Herbie's one-hypot version uses the biggest input—sometimes the biggest two—and that's more or less all that counts for most inputs. Cases where all three are reasonably big are very rare, which is why Herbie thinks its result is OK.

Today I looked into this a bit more and got a fuller picture. The key is that Herbie ends up considering three similar programs: hypot(x, y), hypot(x, z), and hypot(y, z). For most points (3/4 of them), the three variables have different signs, and hypot(x, z) is best on those, as discussed in the previous paragraph. However, in a few cases all three variables have the same sign, and then either hypot(x, y) or hypot(y, z) is good, because those now consider the two largest inputs. There is a worst-case scenario, (1, 1, 1), where none of the three programs is good, but such points are extremely rare and Herbie doesn't sample them. On the points we can actually sample, one of the three one-hypot variations is going to be good.

This causes Herbie to discard the two-hypot version. Internally, Herbie builds something called an alt-table, which (to simplify a bit) stores only the single best program on each sampled point. Here, each one-hypot version beats the two-hypot version on its points, because both have no error there and the one-hypot version is cheaper. Of course, this is a trap: different inputs need different one-hypot versions, while the two-hypot version is good everywhere, and Herbie doesn't take this into account. At the end of a run, when Herbie is actually called on to output a single program, it picks the best one-hypot version, which is indeed OK, but the two-hypot version that it discarded earlier would have been better.

So the way I characterize the problem is that our alt-table rule (evaluate all programs point by point and keep the best program at each point) is what steers us wrong. On any individual point (except extremely rare ones), the two-hypot option loses to one of the one-hypot options, but it is better overall.

The big challenge here is how to fix things. The alt-table rule is important for filtering, because Herbie constructs thousands of options when working on this problem, and we can't store all of them. (Later passes like regimes are O(N^2) in the number of options.) It also isn't a bad thing that Herbie generates the one-hypot variations; Herbie is supposed to be robust to a few bad options sneaking in, and if you really care about speed the one-hypot version may in fact be preferable.

Anyway, I'll leave this issue open—not because I don't want to fix it but because there's no easy fix. As we keep working on Herbie, perhaps we'll find ways to work around this!
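To make those numbers concrete, here is a quick check in plain Python, outside Herbie. I'm assuming the benchmark computes the 3-D magnitude sqrt(x^2 + y^2 + z^2), which is what the hypot candidates above correspond to; the helper names are mine, not Herbie's.

```python
import math

def exact(x, y, z):                    # reference value of the assumed benchmark
    return math.sqrt(x * x + y * y + z * z)

def one_hypot(x, y, z):                # the kept candidate: y is dropped
    return math.hypot(x, z)

def two_hypot(x, y, z):                # the discarded candidate: uses all three
    return math.hypot(math.hypot(x, y), z)

def rel_err(approx, true):             # error relative to the returned value
    return abs(approx - true) / abs(approx)

# Worst case from the discussion: all three inputs the same size.
t = exact(1.0, 1.0, 1.0)
print(rel_err(one_hypot(1.0, 1.0, 1.0), t))   # ~0.22: sqrt(2) instead of sqrt(3)
print(rel_err(two_hypot(1.0, 1.0, 1.0), t))   # ~0.0: correct to an ULP

# A typical sampled point: magnitudes many orders of magnitude apart.
# Both candidates are essentially exact here, so the alt-table's
# per-point rule keeps the cheaper one-hypot program.
x, y, z = -1e80, 1e-3, 1e120
t = exact(x, y, z)
print(rel_err(one_hypot(x, y, z), t))         # ~0.0
print(rel_err(two_hypot(x, y, z), t))         # ~0.0
```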
@pavpanchekha I read your explanation, dated Apr 14th. Isn't this issue a showstopper bug? The main thing is, earlier versions of Herbie didn't do this.
Taylor expansion has been in Herbie since version 0.9, and we haven't made major changes to it in the last version (though we did make it more obvious when one is used). You can check whether Herbie used a Taylor approximation by looking at the "Derivation" below any alternative Herbie suggests. I take it this was the submission you were talking about? https://herbie.uwplse.org/demo/ea53ed137edb112a5651954cffeb139864572772.2.0/graph.html
Yes, that is the submission. The only reason I suspected Taylor expansion is that it's labelled as such in the derivations you can open, as you mentioned.
I'm trying to optimise an expression with 4 variables:
with precondition
However, the final output is independent of x!
This was on my local install, with the following reproduce block:
x seems to disappear when Herbie Taylor expands around 0.
Running the online version gave me
With reproduce block
However, trying an equivalent version of the same expression on the online demo gives
x also disappears in the Taylor expansion step. The reproduce block is as follows.
I'm quite confused as to what's going on here; does it set x to 0 and then forget about it somehow during the Taylor expansion step?
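For what it's worth, a zeroth-order truncation behaves essentially like that: the series is taken around x = 0 and only the constant term is kept, so x drops out of the result entirely. A small symbolic sketch with SymPy (the expression here is made up for illustration; it is not the benchmark above):

```python
import sympy as sp

x, y = sp.symbols('x y', positive=True)
expr = sp.sqrt(x**2 + y**2)            # stand-in expression, not the benchmark

# Taylor-expand around x = 0 and truncate to the zeroth-order term.
series = expr.series(x, 0, 1)          # constant term plus an order term
print(series.removeO())                # prints y: x is gone from the result
```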