-
Notifications
You must be signed in to change notification settings - Fork 4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
vpa-recommender: Add support for configuring global max allowed resources #7560
base: master
Are you sure you want to change the base?
vpa-recommender: Add support for configuring global max allowed resources #7560
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: ialidzhikov The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
9967539
to
0a79197
Compare
Sorry, but I've just gotten a PR merged (#7548) that will merge conflict with yours. |
0a79197
to
af3a158
Compare
…14e06e613e531f7dd This PR adopts the changes from kubernetes/autoscaler#7560.
var globalMaxAllowed apiv1.ResourceList | ||
if !maxAllowedCPU.Quantity.IsZero() { | ||
setGlobalMaxAllowed(&globalMaxAllowed, apiv1.ResourceCPU, maxAllowedCPU.Quantity) | ||
} | ||
if !maxAllowedMemory.Quantity.IsZero() { | ||
setGlobalMaxAllowed(&globalMaxAllowed, apiv1.ResourceMemory, maxAllowedMemory.Quantity) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should there be some logic to ensure that the global max is greater than the global min?
It may be a bit strange since one is per container and the other per pod
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In a perfect world, I agree that validation should exists.
However, I am not sure how to add such validation when the global min allowed flags are on Pod-level and global max allowed flags are on container-level.
In #7147 (comment) I suggested to deprecate the global Pod-level min allowed flags because:
- they are on Pod-level, and not on container-level. The recommendation is on container-level. Right now, the recommender splits the Pod-level min allowed values to the number of containers in a Pod.
- they are implemented as ResourceEstimator. This causes the VPA
.status.uncappedTarget
to be capped as well which is a contradiction of the definition of this field. This also causes the global min-allowed flags to overwrite the VPA minAllowed field (if specified), there is also no merge between the global Pod-level min allowed flags and the VPA's minAllowed field.
If you agree, I can open a dedicated issue for deprecated the global Pod-level min allowed flags and introduce new container-level min allowed equivalents. And validation can be added between the global container-level min and max allowed flags. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, I am not sure how to add such validation when the global min allowed flags are on Pod-level and global max allowed flags are on container-level.
Yeah, I agree. I can't figure out a way to make the validation work.
In #7147 (comment) I suggested to deprecate the global Pod-level min allowed flags because:
I don't think deprecation is necessary yet. Pod level resources may be in the VPA's future, so those flags may be used for Pod level resources.
@@ -108,3 +109,16 @@ These options cannot be used together and are mutually exclusive. | |||
It is possible to set the failurePolicy of the webhook to `Fail` by passing `--webhook-failure-policy-fail=true` to the VPA admission controller. | |||
Please use this option with caution as it may be possible to break Pod creation if there is a failure with the VPA. | |||
Using it in conjunction with `--ignored-vpa-object-namespaces=kube-system` or `--vpa-object-namespace` to reduce risk. | |||
|
|||
### Specifying global maximum allowed resources to prevent pods from being unschedulable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about moving this section move to the "features.md" page?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are many sections in examples.md which actually describe features of VPA (Starting multiple recommenders, Custom memory bump-up after OOMKill, Using CPU management with static policy, Controlling eviction behavior based on scaling direction and resource, etc.). I don't think the newly introduced section is different from the existing section.
IMO, examples and features are overlapping conceptually. If you describe a feature, you usually also add example(s) of how the feature can be used. examples.md
and features.md
should be merged IMO. This is out-of-scope of the existing PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, examples and features are overlapping conceptually. If you describe a feature, you usually also add example(s) of how the feature can be used.
examples.md
andfeatures.md
should be merged IMO. This is out-of-scope of the existing PR.
I agree with you here. The features page is new, and I want to start moving the examples across to the features page, with more description about that feature.
I thought it would be nice for the "global max" feature to be added to features from the start, since it's a better fit there, and will require work to move at a later stage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(but it's up to you though, the documentation needs a lot of work in general)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to keep it consistent with the existing doc. In another PR all the examples doc can be reworked as features sections.
Unsure if you want to do this, but an e2e test for the feature would be great! |
12f42ee
to
04e3340
Compare
Do you have a suggestion for the e2e test? It seems that we deploy vpa components to a kind cluster and then create/update/delete VPA objects and finally assert their state. It is relatively easy to test a field in the VPA spec. But I don't find example of e2e tests for such global flags/options. |
/test pull-kubernetes-e2e-autoscaling-vpa-full |
It seems that at the moment we don't have a similar test to base yours off, so I think one would need to be written to deploy the recommender with a global max, and ensure that it doesn't go above that max. It's going to be annoying to write such a test, I think, but I think the test will help us in the long run. What do you think? |
I'm going to give this a I like the idea of an e2e test, but will let an approver decide if it's needed or not |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this!! Sorry for the delay!
} else { | ||
// Set resources from the global maxAllowed if the VPA maxAllowed is missing them. | ||
for resourceName, quantity := range globalMaxAllowed { | ||
if _, ok := maxAllowed[resourceName]; !ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add a comment here that we only override this if the user did not explicitly set a maximum in their container policy in the VPA.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is already
// Set resources from the global maxAllowed if the VPA maxAllowed is missing them. |
Let me know if you have suggestions for improvements.
New changes are detected. LGTM label has been removed. |
Hi @raywainman , I addressed your review feedback in caa47f9. |
What type of PR is this?
/kind feature
What this PR does / why we need it:
See #7147
Which issue(s) this PR fixes:
Fixes #7147
Special notes for your reviewer:
N/A
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: