-
Notifications
You must be signed in to change notification settings - Fork 410
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Bug] Parameter Constraints Do Not Work #2542
Comments
Hi @kayween. Thanks for reporting. This issue would be better to raise on the GPyTorch side since the underlying logic is purely controlled by GPyTorch. We just package these modules together for convenience.
We do see occasional line search failures during model fitting. If this is the root cause, it'd be great to see it fixed :) |
I am also not sure if your gradient step is the correct way to update the parameter values. When you do
you're updating the
it will trigger the setter of
|
Hi @saitcakmak
PyTorch tensors that require gradients do need in-place updates during gradient steps. So Also, we do want to directly update the raw noise, as opposed to updating
So |
The reason why the constraint is not enforced is due to this line in BoTorch.
The argument
I can attach a minimal working example later showing that removing this argument fixes NaN issues. And then we can test if this also resolves occasional line search failures. |
Ah, you're right. I always find this parameter / raw parameter setup confusing. Going back through code history, looks like the noise constraint on the likelihood was added by @Balandat very long time ago (5333d5b), along with |
#2544 would remove |
Going through the code history 5333d5b, it looks like there were some discussions on this previously cornellius-gp/gpytorch#629
Based on the commit message, it seems that In any case, @Balandat probably has more comments on this. Nevertheless, it is probably good to have consistency? Some modules have botorch/botorch/models/utils/gpytorch_modules.py Lines 41 to 50 in 8536468
|
Yes the idea was indeed that when using L-BFGS-B we want to impose explicit constraints on the parameters. The easiest way to do this is to not use a transform in which case the can just constrain the "raw_x" value directly. If gpytorch applies a transformation to enforce the constraint then you can end up with issues as mentioned above. Now this decision was made a long time ago and so we could potentially revisit this. If always using a transform works fine with L-BFGS-B then great. We may consider also using an explicit constraint in addition to the transform as mentioned above, but then we'd have to somehow apply the constraint in the transformed space in the optimizer. Yet another option could be to adaptively change the nature of the constraint (set the transform to |
I came across this again yesterday and looked into the GPyTorch constraints a bit more. Seems like we could disable the application of constraints by mocking a simple property:
All constraints I could find subclass |
🐛 Bug
botorch.models.utils.gpytorch_modules
implements a few utility functions that return kernels and likelihoods. Those functions should enforce constraints onkernel.lengthscale
andlikelihood.noise
to make sure they are always positive.However, the constraints do not work in some cases.
To reproduce
The following is a minimal working example showing that the parameter constraint does not work properly.
The following is the output. The log prior is NaN, because the noise is outside the support of Gamma distributions
Expected Behavior
likelihood.noise
is supposed to be greater than1e-4
.log_prior
should not be NaN.The above two should hold in all cases for any gradient values.
System information
Additional context
I am working on aepsych (heavily relies on botorch) where we use similar outputscale/lengthscale priors. I was fitting a GP model on a synthetic dataset and had NaN issues during hyperparameter optimization (I was using Adam). But those NaN issues might break LBFGS as well, e.g., line search failures.
Those NaN issues are due to the argument
transform=None
.botorch/botorch/models/utils/gpytorch_modules.py
Lines 63 to 71 in 8536468
GPyTorch implements the constraint by a softplus transformation. However, if we override the argument by
transform=None
, thenself.enforce=None
. As a result, no transformation is applied and the constraint is not enforced.https://github.com/cornellius-gp/gpytorch/blob/8825cdd7abd1db7dea5803265067d598f21d6962/gpytorch/constraints/constraints.py#L173-L175
In most cases, the prior pushes the hyperparameter towards the mode of the prior distribution. Thus, this NaN issue does not happen very often. However, I believe this could lead to unintended behavior, e.g., line search failure and early termination of LBFGS.
The text was updated successfully, but these errors were encountered: