Pass the whole VPA into cappingRecommendationProcessor.Apply() #7527

Merged
merged 6 commits into from
Dec 11, 2024

Conversation

adrianmoisey
Member

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

The current log message for when no container is found is very misleading and can cause confusion.

This passes the entire VPA object into that function so that it can emit a log message with the relevant VPA name in it.

It kinda feels like surgery with a scalpel, so any alternative approaches would be appreciated.

Take a look at #7491 to see the confusion it can cause.
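
As a rough illustration of the idea, here is a minimal sketch of why handing the function the whole VPA helps. The `VPA` struct, the `missingContainerMessage` helper, and the message wording are all hypothetical stand-ins for this example, not the real `vpa_types` API or the actual log line.

```go
package main

import "fmt"

// VPA is a simplified, hypothetical stand-in for vpa_types.VerticalPodAutoscaler.
type VPA struct {
	Namespace string
	Name      string
}

// missingContainerMessage builds the text a processor could log when a pod
// container has no matching recommendation. Because the whole VPA is passed
// in, the message can say which VPA was consulted. The wording is illustrative.
func missingContainerMessage(vpa *VPA, podName, containerName string) string {
	return fmt.Sprintf("no matching recommendation for container %q of pod %q in VPA %s/%s",
		containerName, podName, vpa.Namespace, vpa.Name)
}

func main() {
	vpa := &VPA{Namespace: "default", Name: "my-vpa"}
	fmt.Println(missingContainerMessage(vpa, "pod1", "sidecar"))
}
```

With only the recommendation, policy, and conditions as arguments, the function would have no name to put in the message.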

Which issue(s) this PR fixes:

Fixes #7491

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Nov 23, 2024
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Nov 23, 2024
pod *apiv1.Pod) (*vpa_types.RecommendedPodResources, ContainerToAnnotationsMap, error) {
// TODO: Annotate if request enforced by maintaining proportion with limit and allowed limit range is in conflict with policy.

policy := vpa.Spec.ResourcePolicy.DeepCopy()
Member

@omerap12 omerap12 Nov 23, 2024

Can we check for some edge cases (just to avoid a panic)?

if vpa == nil {
    return nil, nil, fmt.Errorf("cannot process nil vpa")
}
if pod == nil {
    return nil, nil, fmt.Errorf("cannot process nil pod")
}
if vpa.Status.Recommendation == nil {
    return nil, nil, nil // This matches existing behavior for no recommendation
}

Member Author

Should be fixed in 13d6ffa

pod *v1.Pod) (*vpa_types.RecommendedPodResources, ContainerToAnnotationsMap, error) {
recommendation := podRecommendation

Member

Same here (not sure if it's needed, just my opinion):

if vpa == nil {
    return nil, nil, fmt.Errorf("cannot process nil vpa")
}
if vpa.Status.Recommendation == nil {
    return nil, nil, nil
}

Member Author

Should be fixed in 13d6ffa

Contributor

Should we check pod here as well?

Member Author

I thought about adding a check there, but all that function does is pass Pod on to another function call.
Since this function doesn't actually do anything to Pod, I figured the guards could be in the functions that would do something with Pod.
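
A minimal sketch of that reasoning, using hypothetical stand-in types and function names rather than the real processors: the guard lives in the function that dereferences the pod, and the function that only forwards it still surfaces the error.

```go
package main

import (
	"errors"
	"fmt"
)

// Pod is a simplified, hypothetical stand-in for apiv1.Pod.
type Pod struct {
	Containers []string
}

// countContainers dereferences pod, so it owns the nil guard.
func countContainers(pod *Pod) (int, error) {
	if pod == nil {
		return 0, errors.New("cannot process nil pod")
	}
	return len(pod.Containers), nil
}

// forward only passes pod along. It never dereferences it, so it adds no
// guard of its own and relies on the callee's check.
func forward(pod *Pod) (int, error) {
	return countContainers(pod)
}

func main() {
	_, err := forward(nil)
	fmt.Println(err)
}
```

The trade-off is that the error is reported one call deeper, so the message should make clear which input was nil.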

Member

@omerap12 omerap12 left a comment

If my comments make sense to you, we could add more unit tests to cover those edge cases. For example:

func TestApplyWithNilVPA(t *testing.T) {
    pod := test.Pod().WithName("pod1").AddContainer(test.Container().WithName("ctr-name").Get()).Get()
    processor := NewCappingRecommendationProcessor(&fakeLimitRangeCalculator{})
    
    res, annotations, err := processor.Apply(nil, pod)
    assert.Error(t, err)
    assert.Nil(t, res)
    assert.Nil(t, annotations)
}

func TestApplyWithNilPod(t *testing.T) {
    vpa := test.VerticalPodAutoscaler().WithContainer("container").Get()
    processor := NewCappingRecommendationProcessor(&fakeLimitRangeCalculator{})
    
    res, annotations, err := processor.Apply(vpa, nil)
    assert.Error(t, err)
    assert.Nil(t, res)
    assert.Nil(t, annotations)
}

@raywainman
Contributor

Nice cleanup, thank you :)

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 25, 2024
Contributor

@raywainman raywainman left a comment

Sorry, I realized I didn't submit my comments.

pod *apiv1.Pod) (*vpa_types.RecommendedPodResources, ContainerToAnnotationsMap, error) {
// TODO: Annotate if request enforced by maintaining proportion with limit and allowed limit range is in conflict with policy.

policy := vpa.Spec.ResourcePolicy.DeepCopy()
Contributor

I like the defensiveness here. In practice, the previous code could have allowed someone to mutate the object and cause all kinds of weird behavior.

Curious if you noticed any issue anywhere in this function that could have caused unwanted mutation?

Contributor

That being said, we might incur a performance penalty here (though it should be very small), and I don't think we generally do this anywhere else in the codebase, so it could go either way.
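
To make the trade-off concrete, here is a minimal sketch with hypothetical stand-in types (not the real `vpa_types.PodResourcePolicy`, and a hand-written `DeepCopy` instead of the generated one). Mutating a deep copy leaves the caller's object alone, at the cost of an allocation and a copy per call; mutating through the shared pointer changes it for everyone.

```go
package main

import "fmt"

// ContainerPolicy and ResourcePolicy are simplified, hypothetical stand-ins.
type ContainerPolicy struct {
	ContainerName string
	MaxCPUMilli   int64
}

type ResourcePolicy struct {
	ContainerPolicies []ContainerPolicy
}

// DeepCopy returns a copy that shares no memory with p. It is nil-safe, so it
// can be called on a nil pointer without panicking.
func (p *ResourcePolicy) DeepCopy() *ResourcePolicy {
	if p == nil {
		return nil
	}
	out := &ResourcePolicy{}
	if p.ContainerPolicies != nil {
		out.ContainerPolicies = make([]ContainerPolicy, len(p.ContainerPolicies))
		copy(out.ContainerPolicies, p.ContainerPolicies)
	}
	return out
}

// clampInPlace mutates whatever policy it is handed.
func clampInPlace(p *ResourcePolicy, limit int64) {
	for i := range p.ContainerPolicies {
		if p.ContainerPolicies[i].MaxCPUMilli > limit {
			p.ContainerPolicies[i].MaxCPUMilli = limit
		}
	}
}

func main() {
	shared := &ResourcePolicy{ContainerPolicies: []ContainerPolicy{{ContainerName: "app", MaxCPUMilli: 2000}}}

	private := shared.DeepCopy()
	clampInPlace(private, 500) // only the copy changes
	fmt.Println(shared.ContainerPolicies[0].MaxCPUMilli, private.ContainerPolicies[0].MaxCPUMilli)

	clampInPlace(shared, 500) // the caller's object changes too
	fmt.Println(shared.ContainerPolicies[0].MaxCPUMilli)
}
```

If nothing in the function writes to the policy, the copy buys nothing, which is the case for dropping it.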

Member Author

I've removed them in e9c1c1c

For some reason I thought they were needed. When making this PR I was a little rushed, since I was moving laptops and wanted my work-in-progress code pushed.

Anyway, the problem I was facing was actually fixed in e277df0, but for some reason I thought that DeepCopy fixed it

@adrianmoisey
Member Author

Thanks for the reviews. I was actually rushed when making this, so still need to do a cleanup and respond to comments.
I'll hold for now until I can address all of that
/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 25, 2024
It's unneeded
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 27, 2024
And add tests
@omerap12
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 27, 2024
@adrianmoisey
Member Author

/unhold

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 27, 2024
return nil, nil, fmt.Errorf("cannot process nil vpa")
}
if vpa.Status.Recommendation == nil {
return nil, nil, nil
Contributor

I actually wonder if this should be an error case...

The processors are generally "adjusting" the recommendation one at a time. If there is no recommendation to adjust then this feels like an error case to me.

@omerap12 or @adrianmoisey are you seeing a case where this should be allowed?

Contributor

Oh wait I see that this is being done in the capping processor. Perhaps it is best to be consistent.

However I wonder if this can eventually be made more strict and turned into an error.
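
For comparison, a minimal sketch of the two behaviours being discussed, using hypothetical stand-in types and function names. `applyLenient` mirrors the current "no recommendation means nothing to do" behaviour, while `applyStrict` shows what turning it into an error might look like. `ErrNoRecommendation` is an assumed name for this example only.

```go
package main

import (
	"errors"
	"fmt"
)

// Recommendation and VPA are simplified, hypothetical stand-ins.
type Recommendation struct {
	CPUMilli int64
}

type VPA struct {
	Recommendation *Recommendation
}

// ErrNoRecommendation is a hypothetical sentinel error for the strict variant.
var ErrNoRecommendation = errors.New("vpa has no recommendation to adjust")

// applyLenient treats a missing recommendation as "nothing to do".
func applyLenient(vpa *VPA) (*Recommendation, error) {
	if vpa == nil {
		return nil, errors.New("cannot process nil vpa")
	}
	if vpa.Recommendation == nil {
		return nil, nil
	}
	return vpa.Recommendation, nil
}

// applyStrict treats a missing recommendation as a caller error.
func applyStrict(vpa *VPA) (*Recommendation, error) {
	if vpa == nil {
		return nil, errors.New("cannot process nil vpa")
	}
	if vpa.Recommendation == nil {
		return nil, ErrNoRecommendation
	}
	return vpa.Recommendation, nil
}

func main() {
	empty := &VPA{}
	rec, err := applyLenient(empty)
	fmt.Println(rec == nil, err == nil)

	_, err = applyStrict(empty)
	fmt.Println(errors.Is(err, ErrNoRecommendation))
}
```

A sentinel error would let callers distinguish "no recommendation yet" from other failures if the stricter behaviour is ever adopted.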

Member Author

I am assuming that this case should never happen

@raywainman
Contributor

/lgtm

Contributor

@voelzmo voelzmo left a comment

Thanks, this change really improves the error message and should be a good step forward to improve the situation. I have just a few small comments that I don't want to block the approval on.

Looking at this I'm reminded that we have #6744 still open and never got to reviewing #6745 to the end. This should fix the main reason for seeing these messages.

- Apply(podRecommendation *vpa_types.RecommendedPodResources,
- policy *vpa_types.PodResourcePolicy,
- conditions []vpa_types.VerticalPodAutoscalerCondition,
+ Apply(Vpa *vpa_types.VerticalPodAutoscaler,
Contributor

Nit: this should be lower case

Suggested change
- Apply(Vpa *vpa_types.VerticalPodAutoscaler,
+ Apply(vpa *vpa_types.VerticalPodAutoscaler,

Member Author

Fixed in 3ea36d2

pod *apiv1.Pod) (*vpa_types.RecommendedPodResources, ContainerToAnnotationsMap, error) {
// TODO: Annotate if request enforced by maintaining proportion with limit and allowed limit range is in conflict with policy.

if vpa == nil {
Contributor

@voelzmo voelzmo Dec 3, 2024

Sorry for the back-and-forth, but should we really add these nil checks? You're changing the signature from explicitly passing the recommendations, policy, and conditions to passing an entire VPA object, so the context this is currently called from already ensures that the VPA and Pod aren't nil: https://github.com/kubernetes/autoscaler/pull/7527/files#diff-b21ffa4a9ddd85d3bdd971cf8ff4cdb5050b05f8c349137a3f7e3152d4634d6eR108-R110

Member Author

That is something I had considered, but I landed on the fact that this function should be able to handle any possible input.

I don't actually know if there's a Go or Kubernetes best practice around this. I figured that more defensive code is a good thing, but I'm happy to change it. I'll wait for consensus before doing so.

Member

In my opinion, since this is an exported function, we should include the nil check to prevent panics from invalid usage. But I do agree that we're only changing the signature, so it's unlikely the VPA object would ever be nil in current usage patterns. I'm comfortable with whatever you decide.

Contributor

+1 for defensive approach here.

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 3, 2024
@omerap12
Member

omerap12 commented Dec 6, 2024

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 6, 2024
return nil, nil, nil
}

var recommendation *vpa_types.RecommendedPodResources
Contributor

@raywainman raywainman Dec 9, 2024

One small edge case here is that if there are no processors, this will be nil (where the behavior beforehand was that we would return the existing VPA recommendation).

I know we don't currently have an empty processor but should we match the behavior here and initialize this to the value of vpa.Status.Recommendation?
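
A minimal sketch of that edge case, with hypothetical stand-in types and a simplified `Processor` signature (the real processors take the VPA and pod): seeding the running value with the VPA's existing recommendation means an empty processor list hands back that recommendation instead of nil.

```go
package main

import (
	"errors"
	"fmt"
)

// Recommendation and VPA are simplified, hypothetical stand-ins.
type Recommendation struct {
	CPUMilli int64
}

type VPA struct {
	Recommendation *Recommendation
}

// Processor is a hypothetical, simplified adjustment step that returns a new
// recommendation rather than mutating its input.
type Processor func(*Recommendation) (*Recommendation, error)

// applyAll runs the processors one after another. The running value starts as
// the VPA's own recommendation, so with zero processors it is returned as-is.
func applyAll(vpa *VPA, processors []Processor) (*Recommendation, error) {
	if vpa == nil {
		return nil, errors.New("cannot process nil vpa")
	}
	if vpa.Recommendation == nil {
		return nil, nil
	}
	recommendation := vpa.Recommendation
	for _, process := range processors {
		var err error
		recommendation, err = process(recommendation)
		if err != nil {
			return nil, err
		}
	}
	return recommendation, nil
}

func main() {
	vpa := &VPA{Recommendation: &Recommendation{CPUMilli: 1000}}
	halve := func(r *Recommendation) (*Recommendation, error) {
		return &Recommendation{CPUMilli: r.CPUMilli / 2}, nil
	}
	unchanged, _ := applyAll(vpa, nil)
	adjusted, _ := applyAll(vpa, []Processor{halve})
	fmt.Println(unchanged.CPUMilli, adjusted.CPUMilli)
}
```

Had the running value been declared with `var` and left unset, the zero-processor call would return nil even though the VPA holds a recommendation.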

Member Author

I've made this change in cf1e270, is that what you were thinking?

Contributor

Yes, sorry I added a quick comment on the commit and it isn't appearing anywhere :/

You can fold the two lines:

recommendation := vpa.Status.Recommendation

Member Author

🤦 Yup, good callout.
Should be fixed in 59236c9

@raywainman
Contributor

One last small comment then this is good to go, sorry for the wait.

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 10, 2024
@raywainman
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 11, 2024
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 11, 2024
@raywainman
Contributor

/lgtm

/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 11, 2024
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: adrianmoisey, raywainman


@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 11, 2024
@k8s-ci-robot k8s-ci-robot merged commit 562059b into kubernetes:master Dec 11, 2024
6 of 7 checks passed
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/vertical-pod-autoscaler cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

VPA wrongly determine container name?
5 participants