Surrogate scaling #315
Merged
Conversation
Preparation for use with sklearn's ColumnTransformer, which outputs plain arrays
AdrianSosic force-pushed the refactor/surrogates/scaling branch 2 times, most recently from c3ade11 to 59eed75 on July 17, 2024 08:06
Scienfitz reviewed on Jul 18, 2024
AdrianSosic force-pushed the refactor/surrogates/scaling branch from 2f5f851 to e44c145 on July 19, 2024 07:32
AdrianSosic commented on Jul 19, 2024
AVHopp reviewed on Jul 21, 2024
AdrianSosic force-pushed the refactor/surrogates/scaling branch from e44c145 to 53ccd4c on July 22, 2024 08:13
AdrianSosic force-pushed the refactor/surrogates/scaling branch from 53ccd4c to b6c56e9 on July 22, 2024 08:21
AdrianSosic force-pushed the refactor/surrogates/scaling branch from b6c56e9 to 1a39f62 on July 22, 2024 08:28
Scienfitz reviewed on Jul 22, 2024
Became necessary due to ruff's `known-third-party` setting
AdrianSosic force-pushed the refactor/surrogates/scaling branch from 8508b23 to cdf6688 on July 22, 2024 19:36
For this PR, we leave the mechanism untouched:
* Parameters are normalized based on search space bounds
* Targets are standardized based on observed measurements
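The two rules above can be sketched as follows. This is a minimal illustration of the idea, not BayBE's actual API; function names and shapes are chosen for the example.

```python
import numpy as np

def normalize_params(X: np.ndarray, bounds: np.ndarray) -> np.ndarray:
    """Min-max scale each parameter column to [0, 1] using search space bounds.

    `bounds` has shape (2, n_params): row 0 holds lower, row 1 upper bounds.
    Note the scaling is defined by the search space, not by the observed data.
    """
    lower, upper = bounds[0], bounds[1]
    return (X - lower) / (upper - lower)

def standardize_targets(y: np.ndarray) -> np.ndarray:
    """Standardize targets to zero mean / unit variance based on observations."""
    return (y - y.mean()) / y.std()

X = np.array([[0.0, 10.0], [5.0, 20.0]])
bounds = np.array([[0.0, 10.0], [10.0, 30.0]])  # lower / upper per parameter
X_scaled = normalize_params(X, bounds)  # → [[0.0, 0.0], [0.5, 0.5]]
```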
In order to differentiate from target scaling
Scienfitz reviewed on Jul 23, 2024
Scienfitz approved these changes on Jul 23, 2024
AdrianSosic force-pushed the refactor/surrogates/scaling branch from 8360c67 to 21953d4 on July 23, 2024 19:01
AVHopp reviewed on Jul 23, 2024
Just two minor comments
The decorator is no longer compatible with the generalized surrogate layout. Instead of upgrading the decorator, a replacement mechanism using Python's built-in typing.Protocol will be introduced.
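The appeal of `typing.Protocol` over a registration decorator is that conformance is purely structural: any class with matching methods satisfies the interface, with no inheritance or decoration required. A minimal sketch (names here are illustrative, not the actual interface):

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class SupportsFit(Protocol):
    """Structural interface: any object with a matching `fit` satisfies it,
    without subclassing and without a registration decorator."""

    def fit(self, X, y) -> None: ...

class MySurrogate:  # no inheritance from SupportsFit, no decorator
    def fit(self, X, y) -> None:
        self.X_, self.y_ = X, y

# The structural check passes because the method signature is present
assert isinstance(MySurrogate(), SupportsFit)
```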
AVHopp reviewed on Jul 24, 2024
AdrianSosic added a commit that referenced this pull request on Aug 8, 2024
PR #315 introduced the new configurable class-based scaling approach, but:
* did so using a rather heavy hierarchy of methods
* introduced a conceptual bug in that the scaling logic was not consistently applied in the correct way (e.g. `best_f` should not have been scaled)
* the resulting layout implied a rather unclear interface for a `SurrogateProtocol`, where users providing such protocols would also need to expose transform-related methods that should actually stay surrogate-internal.

This PR refactors the scaling mechanism with the final interfaces in mind, in particular the newly introduced `SurrogateProtocol` class.

### Problems solved
* Much of the transform-related machinery was removed. Left are only two scaler attributes (one for input, one for output), which are purely class-internal.
* The result is a clean `SurrogateProtocol` interface, which imposes the two required mechanisms on the user:
  * `fit` (i.e. how the custom-defined surrogate is to be trained) and
  * `to_botorch` (i.e. how the trained surrogate is converted to be made compatible with `botorch`'s machinery)
* The `Surrogate` base class now clearly offers the three layers of connection dictated by the surrounding code:
  * a posterior method intended for the user, interfacing experimental representations
  * a posterior method for the computational layer, operating with tensors in computational representation for interplay with `botorch`
  * a posterior method that focuses only on the surrogate architecture, where transformation and scaling are abstracted away, intended for overriding in subclasses
* As a result, scaling is now completely encapsulated inside the surrogate, so that objects outside do not need to bother about surrogate internals. That means questions like "do we need to scale certain quantities before passing them to the surrogate" (such as `best_f` or `X_pending`) are trivially answered with "No", since scaling is not visible outside the surrogate.
* Because scaling happens inside the torch layer, it is now part of the computational torch graph, meaning that backpropagation through the entire surrogate model is supported.
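The two-method contract described above can be sketched with a protocol. This is an illustration of the shape of the interface; the exact signatures in the codebase may differ, and `TableLookupSurrogate` is a toy stand-in invented for the example.

```python
from typing import Any, Protocol, runtime_checkable

@runtime_checkable
class SurrogateProtocol(Protocol):
    """Sketch of the minimal surrogate contract: train, then hand over a
    botorch-compatible model. Scaling stays internal to implementations."""

    def fit(self, searchspace: Any, objective: Any, measurements: Any) -> None:
        """Train the surrogate; any input/output scaling happens inside."""
        ...

    def to_botorch(self) -> Any:
        """Return a botorch-compatible model wrapping the trained surrogate."""
        ...

class TableLookupSurrogate:
    """Toy conforming implementation: no scaling details leak to the caller."""

    def fit(self, searchspace: Any, objective: Any, measurements: Any) -> None:
        self._data = measurements

    def to_botorch(self) -> Any:
        return {"model": self._data}  # stand-in for a real botorch model
```

Because the protocol names only these two methods, a caller never sees (and never needs to replicate) the surrogate's internal transforms.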
AdrianSosic added a commit that referenced this pull request on Aug 29, 2024
Completes the surrogate refactoring, which extended over #278, #309, #315, #325, #337.

### Most important changes
* The transition point from experimental to computational representation has been moved from the recommender to the surrogate. From an architecture/responsibility perspective, this is reasonable since the recommender should not have to bother about algorithmic/computational details.
* The desired consequence is that public `Surrogate` methods like `posterior` and `fit` can now operate on dataframes in experimental representation, meaning they can also be exposed directly to the user.
* The new posterior methods now all return a general `Posterior` object instead of implicitly assuming Gaussian distributions. This paves the way for arbitrary surrogate extensions, such as Bernoulli/Categorical surrogates, etc. At the moment, this introduces an explicit coupling to botorch, which is fine because botorch remains a core dependency and the only backend used for complex surrogate modeling. In the future, this can be further abstracted by introducing our own `Posterior` class.
* The `Surrogate` layout has been refined such that the extracted `SurrogateProtocol`, which now defines the formal interface for all surrogates, imposes minimal requirements on the user.
* Scaling has been completely redesigned, offering the possibility to configure input/output scaling down to the level of individual parameters and targets. The configuration is currently class-specific but can be extended to allow surrogate-instance-specific rules in the future.
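Per-parameter scaler configuration as mentioned in the last point could look roughly like the following. This is a hypothetical sketch: the rule table, scaler names, and `scale_row` helper are invented for illustration and do not reflect BayBE's actual configuration mechanism.

```python
from typing import Callable

# A scaler maps (value, lower bound, upper bound) to a scaled value.
ScalerFn = Callable[[float, float, float], float]

def minmax(v: float, lo: float, hi: float) -> float:
    return (v - lo) / (hi - lo)

def identity(v: float, lo: float, hi: float) -> float:
    return v  # leave the value untouched

# Hypothetical class-specific rule table: which scaler applies to which
# parameter name; unknown parameters fall back to identity.
SCALER_RULES: dict[str, ScalerFn] = {
    "temperature": minmax,   # continuous parameter → min-max to [0, 1]
    "batch_id": identity,    # e.g. an encoded label left unscaled
}

def scale_row(
    row: dict[str, float], bounds: dict[str, tuple[float, float]]
) -> dict[str, float]:
    """Apply the configured scaler to each entry of one input row."""
    return {
        name: SCALER_RULES.get(name, identity)(value, *bounds.get(name, (0.0, 1.0)))
        for name, value in row.items()
    }
```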
This PR refactors our scaling logic, introducing a mechanism that gives fine control over the applied scaling approach.