Wrong VPA is attempting to update pod resources and failing #7499

Open
WesCossick opened this issue Nov 14, 2024 · 18 comments
Labels
area/vertical-pod-autoscaler, kind/documentation

Comments

@WesCossick

Which component are you using?:

vertical-pod-autoscaler

What version of the component are you using?:

Component version: Not sure; using the one auto-installed by GKE.

What k8s version are you using (kubectl version)?:

kubectl version Output
$ kubectl version
Client Version: v1.30.5
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.30.5-gke.1699000

What environment is this in?:

Google Cloud Platform's GKE.

What did you expect to happen?:

For the resources to be updated by the correct VPA.

What happened instead?:

The wrong VPA is updating pod resources and failing.

How to reproduce it (as minimally and precisely as possible):

Create the following two VPAs:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: example-1-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: CronJob
    name: example-1
  updatePolicy:
    updateMode: "Initial"
    minReplicas: 1
---
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: example-2-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: CronJob
    name: example-2
  updatePolicy:
    updateMode: "Initial"
    minReplicas: 1

Create the following two CronJobs:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: example-1
spec:
  schedule: "* * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: example-1
              image: example:latest
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: example-2
spec:
  schedule: "* * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: example-2
              image: example:latest

Using kubectl describe pod, you'll see the following annotations on one of the CronJob's pods:

Annotations:          vpaObservedContainers: example-1
                      vpaUpdates: Pod resources updated by example-2-vpa: container 0: 

When this happens, the other cronjob's pods look correct and have their CPU and memory correctly set by the VPA:

Annotations:          vpaObservedContainers: example-2
                      vpaUpdates: Pod resources updated by example-2-vpa: container 0: cpu request, memory request

Which one is mixed up seems to change based on the order in which you create the VPAs.

Anything else we need to know?:

I pared down the two CronJob resources to simplify them.

@WesCossick added the kind/bug label Nov 14, 2024
@adrianmoisey
Member

/area vertical-pod-autoscaler

@raywainman
Contributor

raywainman commented Nov 15, 2024

I believe this issue would happen with the VPA deployed from OSS as well.

The way VPA knows which pods to target is that it will fetch the CronJob object:

https://github.com/bskiba/autoscaler/blob/8ff3b4fd47bcba514c9e904b6bee481ff0f3703e/vertical-pod-autoscaler/pkg/target/fetcher.go#L187-L192

It then fetches the apiObj.Spec.JobTemplate.Spec.Template.Labels value to do pod selection.

From your example above, you are not setting any labels in your jobTemplate, which basically means it is matching all pods in the same namespace (I'd have to confirm in the code to be 100% sure, but this is fairly likely).

So now the two VPAs are essentially racing and competing against each other.

To fix this, you'll need to set some labels in your two CronJob templates to make sure you are differentiating them from each other.
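For example, something along these lines (a minimal sketch based on the manifests above; the app label key and values are just placeholders, any labels work as long as each CronJob's are unique):

apiVersion: batch/v1
kind: CronJob
metadata:
  name: example-1
spec:
  schedule: "* * * * *"
  jobTemplate:
    spec:
      template:
        metadata:
          labels:
            app: example-1   # example-2 would get a different value, e.g. app: example-2
        spec:
          containers:
            - name: example-1
              image: example:latest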

Hope that helps :)

@WesCossick
Author

@raywainman Yep, that seems to be the issue! In the real environment where we were doing experimentation, the CronJob objects did actually have one label, but it was identical for each CronJob. Providing unique labels to each worked around the issue.

Not sure if y'all are considering this a bug, though. If not, at a minimum it would be helpful if this behavior was documented somewhere clearly so that it doesn't confuse others in the same way.

@raywainman
Contributor

That's a great idea; it's an easy pitfall.

I thought we had something documenting this but I actually couldn't find anything.

Let me see if I can put together a PR with this.

Thanks @WesCossick!

@adrianmoisey
Member

/kind documentation
/remove-kind bug

@k8s-ci-robot added the kind/documentation label and removed the kind/bug label Nov 28, 2024
@omerap12
Member

omerap12 commented Dec 7, 2024

@raywainman I can make a PR for this, but are we sure it's not a bug? Shouldn't we log a warning to let users know their VPA will target everything?

@adrianmoisey
Member

That reminds me... #6460 was made a long time ago to fix this problem.
I haven't read the code, but maybe it doesn't apply to CronJobs, or is bugged in some other way?

@omerap12
Member

omerap12 commented Dec 7, 2024

I'm not sure if this qualifies as a bug, but the issue is that the function: https://github.com/kubernetes/autoscaler/blob/master/vertical-pod-autoscaler/pkg/utils/vpa/api.go#L163 is designed to identify the parent of a pod. However, the behavior is different for CronJobs. In the case of CronJobs, they create Jobs, which in turn create Pods. As a result, the ownerReference of the Pod points to the Job, and the ownerReference of the Job points to the CronJob.
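To illustrate the chain (a hypothetical sketch; the names are made up and the ownerReference fields are trimmed down):

# Job created by the CronJob: its ownerReference points at the CronJob
apiVersion: batch/v1
kind: Job
metadata:
  name: example-1-28903920
  ownerReferences:
    - apiVersion: batch/v1
      kind: CronJob
      name: example-1
      controller: true
---
# Pod created by that Job: its ownerReference points at the Job, not the CronJob
apiVersion: v1
kind: Pod
metadata:
  name: example-1-28903920-x7k2p
  ownerReferences:
    - apiVersion: batch/v1
      kind: Job
      name: example-1-28903920
      controller: true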

@adrianmoisey
Member

I wonder if GKE is running an old version of the VPA?

I tried using the latest OSS one, and I can't reproduce this behaviour.

VPA 1.1.0 has #6460 which fixes this bug.

@maxcao13

maxcao13 commented Dec 10, 2024

I ran into this while using VPA 1.2.0 (installed via hack/vpa-up.sh) to provide recommendations for CronJobs in a kind v0.25.0 cluster running k8s v1.31.2. These are my manifests:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "*/3 * * * *"
  jobTemplate:
    spec:
      template:
        metadata:
          labels:
            app: hello
        spec:
          containers:
          - name: hello
            image: busybox:1.28
            imagePullPolicy: IfNotPresent
            command:
            - /bin/sh
            - -c
            - |
              tries=0
              while [ $tries -lt 30 ]; do
                echo "Hello, World $tries!"
                tries=$((tries+1))
                sleep 6
              done
          restartPolicy: OnFailure
        
---
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: hello
spec:
  targetRef:
      apiVersion: "apps/v1"
      kind:       "CronJob"
      name:       "hello"
  updatePolicy:
    updateMode: "Initial"
    minReplicas: 1

I had to specify the Job template labels, like @raywainman said, in order for it to start working :/ Seems like a bug/docs issue to me if it's reproducible.

@adrianmoisey
Member

You're right, thanks for clarifying that.

The VPA is definitely expecting a CronJob to have spec.jobTemplate.spec.template.metadata.labels, and fails silently when it doesn't.

@omerap12
Member

If that's the case, we should at least emit a log (and maybe throw an error?).

@voelzmo
Contributor

voelzmo commented Dec 11, 2024

Oh, this is interesting (and a bit unexpected, tbh): the CronJobController doesn't use the labels to get the Jobs to reconcile, but instead uses just the OwnerRef. So from the CronJob POV, labels don't matter and would technically even be optional.
However, the JobController even has a check for the case where two Jobs would use the same labelSelectors, and errors out.
So in the scenario that we're discussing here, I would expect that the Jobs would have been in an error state along the way?

Nevertheless, I would also expect some docs on this in the official CronJob docs, and I guess we can also add a reference to it in the VPA docs, doubling down on disjoint labels for separate CronJobs.

WDYT?

@adrianmoisey
Member

However, the JobController even has a check for the case where two Jobs would use the same labelSelectors, and errors out.
So in the scenario that we're discussing here, I would expect that the Jobs would have been in an error state along the way?

I tried to reproduce this error, and I couldn't.
It seems that when a Job is created, if no labelSelector is set (as per the example above), then one is auto-generated.

It seems to be this part of the code: https://github.com/kubernetes/kubernetes/blob/8a39b6062060ffb4fba52f53c011bd00428db194/pkg/registry/batch/job/strategy.go#L223-L226

Based on this, it seems that the CronJob <-> Job relationship in Kubernetes is sound (as you had mentioned with the OwnerRef). And the Job <-> Pod relationship is also sound, due to these labelSelectors being auto-generated. (It does seem that if a user sets .spec.manualSelector on their Job, they can override this behaviour and potentially cause harm.)

This makes me think that grabbing the labels from a CronJob's template isn't the approach that Kubernetes itself uses to tie these objects together, which makes me think that this is a bug.
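For reference, this is roughly what a Job created from the label-less example-1 CronJob above ends up with (a sketch based on that auto-generation code; the name and UID are made up, and on clusters older than 1.27 the label keys are the unprefixed controller-uid and job-name):

apiVersion: batch/v1
kind: Job
metadata:
  name: example-1-28903920
spec:
  selector:
    matchLabels:
      batch.kubernetes.io/controller-uid: 11111111-2222-3333-4444-555555555555
  template:
    metadata:
      labels:
        batch.kubernetes.io/controller-uid: 11111111-2222-3333-4444-555555555555
        batch.kubernetes.io/job-name: example-1-28903920
    spec:
      containers:
        - name: example-1
          image: example:latest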

@voelzmo
Contributor

voelzmo commented Dec 11, 2024

Oooh, great find on the auto-generated labels and selectors on Job! There is even good documentation about this in the Job docs.

So that means that, from the CronJob docs' perspective, it makes sense to allow the user to leave this empty, but this doesn't work for the VPA. If it is empty, you will at least get a condition in the VPA status:

status:
  conditions:
  - lastTransitionTime: "2024-12-11T10:56:03Z"
    message: No pods match this VPA object
    reason: NoPodsMatched
    status: "True"
    type: NoPodsMatched

And if users then configure the same labels on different CronJobs, this is still fine from the CronJob and Job controllers' perspective; it is just the VPA requiring unique labels per CronJob. So this is definitely a VPA topic.

I'm still wondering why #6460 wouldn't help here? Shouldn't it detect the case where the labels match, but following the ownerRef doesn't match?

@adrianmoisey
Member

adrianmoisey commented Dec 15, 2024

I think I've figured out what's going on.

https://github.com/bskiba/autoscaler/blob/35359b902ba00931fcad5be7c72a03bd7be39f7f/vertical-pod-autoscaler/pkg/target/fetcher.go#L167-L168

This piece of code grabs the labels from a CronJob's Job template, assuming that the Pods will be discoverable by using those labels. Since there may not be any labels on the CronJob's spec, the labelSelector can be empty.

Later on, when GetControllingVPAForPod() is executed, it passes a list of VPAs + the discovered labelSelectors (in our case, an empty labelSelector) to PodMatchesVPA(): https://github.com/bskiba/autoscaler/blob/35359b902ba00931fcad5be7c72a03bd7be39f7f/vertical-pod-autoscaler/pkg/utils/vpa/api.go#L172-L175

This then uses those labelSelectors to find a match: https://github.com/bskiba/autoscaler/blob/35359b902ba00931fcad5be7c72a03bd7be39f7f/vertical-pod-autoscaler/pkg/utils/vpa/api.go#L99-L110

Since the labelSelector that came from the CronJob is empty, nothing is matched.

I think we could fix this if we found an active Job and used its labelSelector to find Pods. I assume the implementation may end up being a little gross, though.

@voelzmo
Contributor

voelzmo commented Dec 19, 2024

Still, after #6460, we also check the ownerRef to avoid cases where the labels are coincidentally matching. The ownerRef should never have been the same for the two CronJobs, so the original issue should not have occurred, right?

@adrianmoisey
Member

Not to my understanding. Since we're matching a Pod to the labels discovered by the VPA (which are empty), no match happens.

I think the purpose of #6460 is to filter out Pods that were matched but shouldn't have been.

I think the CronJob situation is the opposite: no pods are being matched, since the labelSelector can't be discovered.
