IPOPT via cyipopt #2368
Regarding the timeout: if you do not want to integrate it into the function callable, one could also just ignore it, which means that no timeout can be used in the IPOPT case. Regarding Hessians, this would be the way to integrate them so that no extra call has to be made for them: https://stackoverflow.com/questions/68608105/how-to-compute-objective-gradient-and-hessian-within-one-function-and-pass-it-t Do you have any experience with computing Hessians of the acqf with respect to the inputs via autograd? Is this feasible? Best, Johannes
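The idea from the linked question can be sketched as a small cache that evaluates objective, gradient, and Hessian together and hands them out through separate callables, so a solver asking for all three at the same point triggers only one evaluation. This is only a minimal sketch, not botorch or cyipopt code: the class `CachedDerivatives` and the toy function `quartic` are hypothetical names, and a real version would compute the three quantities from the acquisition function.

```python
import numpy as np


class CachedDerivatives:
    """Hypothetical helper: evaluate (f, grad, hess) once per point and cache them."""

    def __init__(self, fgh):
        # fgh maps a 1-D array x to a tuple (value, gradient, Hessian).
        self._fgh = fgh
        self._x = None
        self._cache = None
        self.n_evals = 0  # number of actual evaluations, kept for illustration

    def _update(self, x):
        x = np.asarray(x, dtype=float)
        # Re-evaluate only if the solver moved to a new point.
        if self._x is None or not np.array_equal(x, self._x):
            self._cache = self._fgh(x)
            self._x = x.copy()
            self.n_evals += 1

    def fun(self, x):
        self._update(x)
        return self._cache[0]

    def jac(self, x):
        self._update(x)
        return self._cache[1]

    def hess(self, x):
        self._update(x)
        return self._cache[2]


def quartic(x):
    """Toy objective sum(x**4) with its analytic gradient and Hessian."""
    return np.sum(x**4), 4 * x**3, np.diag(12 * x**2)


cached = CachedDerivatives(quartic)
x0 = np.array([1.0, 2.0])
# Three requests at the same point, but only one call to quartic.
value, grad, hess = cached.fun(x0), cached.jac(x0), cached.hess(x0)
```

The three bound methods could then be passed as the `fun`, `jac`, and `hess` arguments of a scipy-style minimizer.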
Thanks for putting this up, excited to see some other optimizers that could be useful in cases where SLSQP just ends up being too slow.
Hmm not a bad idea. I think for a v0 we can probably ignore the timeout complications for now, we can follow up to support timeouts once we are sure everything else works.
Yeah I think for now we can have IPOPT + cyipopt as optional dependencies, and just error out at runtime if they are not installed.
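A minimal sketch of what "error out at runtime if they are not installed" could look like, using only the standard library. The helper name `import_optional` and the wording of the error message are hypothetical; the idea is just that the import happens when the IPOPT code path is executed, not when botorch itself is imported.

```python
import importlib
from types import ModuleType


def import_optional(name: str, install_hint: str) -> ModuleType:
    """Import an optional dependency lazily and fail with a helpful message.

    `install_hint` is free text shown to the user, e.g. an install command.
    """
    try:
        return importlib.import_module(name)
    except ImportError as err:
        raise ImportError(
            f"Optional dependency '{name}' is required for this feature "
            f"but is not installed. Hint: {install_hint}"
        ) from err


def gen_candidates_ipopt_stub():
    # Hypothetical call site: the import only happens when this path is used.
    cyipopt = import_optional("cyipopt", "install cyipopt together with IPOPT")
    return cyipopt
```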
I would probably just run this on a bunch of different acquisition functions on a bunch of different models fitted to data from synthetic functions, and then compare optimization performance (e.g. in terms of % improvements or # of best found solutions) and wall times. Maybe @SebastianAment still has some code sitting around for this from the logEI paper?
Indeed!
We've looked into this in the past a bit; since the computation is fully in pytorch it should be possible to use built-in torch functionality for this, see e.g. https://pytorch.org/functorch/stable/notebooks/jacobians_hessians.html#hessian-computation-with-functorch-hessian. I recall that, in the past, there were some coverage gaps with
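As a rough illustration of the built-in route, here is a sketch that uses `torch.autograd.functional.hessian` (a different built-in than the functorch `hessian` linked above, chosen here only because it works on plain callables). The function `toy_acqf` is a hypothetical stand-in for an acquisition function; a real acquisition function takes batched inputs and would need to be reduced to a scalar first, and whether all operations involved support double backward is not checked here.

```python
import torch
from torch.autograd.functional import hessian


def toy_acqf(x: torch.Tensor) -> torch.Tensor:
    """Hypothetical scalar-valued stand-in for an acquisition function."""
    return (x**3).sum() + x[0] * x[1]


x = torch.tensor([1.0, 2.0], dtype=torch.float64)

# For a single 1-D input of size d, the result is a d x d tensor.
H = hessian(toy_acqf, x)
```

For this toy function the Hessian is `[[6 * x0, 1], [1, 6 * x1]]`, which makes the result easy to check by hand.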
So you prefer to include the dependency directly and not rewrite

Regarding the Hessians, I tried several things, all of which lead to errors. The different approaches are based on the basic botorch example:

import torch
from botorch.models import SingleTaskGP
from botorch.fit import fit_gpytorch_mll
from botorch.utils import standardize
from gpytorch.mlls import ExactMarginalLogLikelihood
from botorch.acquisition import UpperConfidenceBound
train_X = torch.rand(10, 2, dtype=torch.float64)
Y = 1 - torch.linalg.norm(train_X - 0.5, dim=-1, keepdim=True)
Y = Y + 0.1 * torch.randn_like(Y)  # add some noise
train_Y = standardize(Y)
gp = SingleTaskGP(train_X, train_Y)
mll = ExactMarginalLogLikelihood(gp.likelihood, gp)
fit_gpytorch_mll(mll)
UCB = UpperConfidenceBound(gp, beta=0.1)
X = torch.tensor([[0.9513, 0.8491]], dtype=torch.float64, requires_grad=True)

I tried the following (I am definitely no expert in torch autograd stuff :-)):
and
Any ideas on this?
Yeah that would work too. One downside is that it would be much less discoverable that way and a sort of "hidden feature". I guess we can cross that bridge when we know that this is working well. I may not get to dig very deep into the Hessian question for a while; have you tried just running this without providing Hessians? I know @yucenli has been around folks in the past who did compute Hessians via autograd, maybe she has some thoughts here?
Ok, I will start benchmarking it without the Hessian for the GP. As soon as I have results, I will share them here ;)
@jduerholt any insights from the benchmarking?
Unfortunately not, as I had no time to work on it further. But it is still on my menu, and I hope to find a motivated student to support me on this ;)
Motivation
As already discussed several times with @Balandat, for some problems (especially large ones with constraints) it could be beneficial to use IPOPT instead of scipy's SLSQP or L-BFGS-B optimizers. Of course, this has to be tested (also with different solvers within IPOPT, such as HSL: https://coin-or.github.io/Ipopt/INSTALL.html#DOWNLOAD_HSL).

The idea behind this PR is to bring IPOPT into botorch via its Python wrapper (cyipopt, https://pypi.org/project/cyipopt/). cyipopt offers a handy interface that behaves 99% like scipy.optimize.minimize, so I currently just copied gen_candidates_scipy and created a method gen_candidates_ipopt. Ideally, one would merge them into one method.

The following problems exist for a common method for both scipy and ipopt optimization:

[...] scipy.optimize.minimize. cyipopt does not support callback methods (in short: the keyword is ignored, https://cyipopt.readthedocs.io/en/stable/reference.html#cyipopt.minimize_ipopt). A solution could be to implement the timeout via the objective function, incorporate it into f_np_wrapper, and remove the current minimize_with_timeout. What do you think?

[...] cyipopt via conda; if installed via pip, only the wrapper is installed and not the actual solver, so cyipopt would be a kind of optional dependency of botorch which is imported on execution. An alternative would be to allow passing a callable as an argument to the new version of gen_candidates_scipy. This callable has to offer the same signature as scipy.optimize.minimize and is then used for the actual optimization. This would make it even more flexible and remove the need for the cyipopt dependency. The only requirement would be to refactor the timeout functionality.

Additional comments and questions:
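To make the two proposals above more concrete, here is a minimal sketch that combines them: the minimizer is passed in as a callable, and the timeout is enforced inside the wrapped objective rather than through a solver callback. All names here (`minimize_with_objective_timeout`, `OptimizationTimeout`, the `clock` parameter) are hypothetical and not part of botorch. The sketch assumes the minimizer follows the `scipy.optimize.minimize` calling convention, returns an object with `.x` and `.fun`, and lets an exception raised in the objective propagate to the caller; whether that last assumption holds for cyipopt would have to be verified. For brevity the objective returns only a scalar, not a (value, gradient) pair.

```python
import time
from typing import Any, Callable, Sequence, Tuple


class OptimizationTimeout(Exception):
    """Raised from inside the objective once the time budget is exhausted."""


def minimize_with_objective_timeout(
    fun: Callable[..., float],
    x0: Sequence[float],
    minimizer: Callable[..., Any],
    timeout_sec: float,
    clock: Callable[[], float] = time.monotonic,
    **minimizer_kwargs: Any,
) -> Tuple[Any, float, bool]:
    """Run minimizer(fun, x0, ...) with a timeout checked in the objective.

    Returns (best_x, best_f, timed_out).
    """
    start = clock()
    best = {"x": None, "f": float("inf")}

    def wrapped(x, *args):
        # The time check lives in the objective, so no solver callback is needed.
        if clock() - start > timeout_sec:
            raise OptimizationTimeout()
        value = fun(x, *args)
        if value < best["f"]:
            # Copy x, since solvers may reuse the underlying buffer.
            best["x"], best["f"] = list(x), value
        return value

    try:
        result = minimizer(wrapped, x0, **minimizer_kwargs)
    except OptimizationTimeout:
        # Fall back to the best point seen before the budget ran out.
        return best["x"], best["f"], True
    return result.x, result.fun, False
```

Under these assumptions, either `scipy.optimize.minimize` or `cyipopt.minimize_ipopt` could be passed as `minimizer`, which would keep cyipopt out of botorch's own imports.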
Have you read the Contributing Guidelines on pull requests?
Yes.
Test Plan
Not yet implemented, as the current implementation is still experimental and has to be finalized.