VK+KIP displays cells from a previous incarnation #44

Closed
myechuri opened this issue Mar 12, 2020 · 2 comments
Labels
bug Something isn't working

Comments

@myechuri
Contributor

  1. Created 1 node vk+kip cluster - incarnation1 (KipControllerID 2qs4mvn2pza6teedd6hfz5hiky)
  2. Deployed nginx deployment with 1 replica - incarnation1
  3. Deleted nginx deployment
  4. Deleted virtual-kubelet deployment
  5. Created virtual-kubelet deployment - incarnation2 (KipControllerID 72oxoqlbgrgjrleablteitvrxu)
    kubectl get pods shows zero pods.
    kubectl get cells shows nginx and kube-proxy cells from incarnation1.
$ kubectl get pods
No resources found in default namespace.

$ kubectl get cells
NAME                                   POD NAME                            POD NAMESPACE   NODE              LAUNCH TYPE   INSTANCE TYPE   INSTANCE ID           IP
0dfd31cf-8d5a-494d-98e7-d8de2f4b4c9d   kube-proxy-rjspt                    kube-system     virtual-kubelet   On-Demand     t3.nano         i-0650833d57a649efd   172.31.74.58
61c44374-33aa-4636-8d09-23479025c3b4   kube-proxy-rjspt                    kube-system     virtual-kubelet   On-Demand     t3.nano         i-005c2fb81f90a1ede   172.31.77.142
66e5f0da-7f74-4a17-ac9b-fe6234ed8369   nginx-deployment-66f967f649-9b8zj   default         virtual-kubelet   On-Demand     t3.nano         i-0ec6eabbae4ca4c6b   172.31.72.109


@justnoise
Contributor

Yes, this is actually expected (but not ideal). I'll go through what's happening to cause this, how we deal with the issue in deploy/virtual-kubelet.yaml, and how I deal with it when working with Kip in minikube. This got a bit long, apologies.

Each Kip pod needs to keep track of which cloud instances it is responsible for (starting, stopping, cleaning up). To do this, each Kip pod, when it first creates its internal state storage, also creates a UUID that identifies the provider. That UUID is the ControllerID.

Each cloud instance gets tagged with the ControllerID when it is created. The controller will only manipulate (start, stop, etc.) instances whose ControllerID tag matches its own internal ControllerID. Likewise, since there might be multiple virtual-kubelet controllers in the same k8s cluster, each controller only maintains cell records whose ControllerID field matches its internal ControllerID value. (The ControllerID field is stored in the Cells CRD but is not a default field for printing; you can see it with kubectl get cells -oyaml.)
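
For example, a quick way to check which controller a leftover cell belongs to (the exact field name in the Cell object isn't reproduced here, hence the case-insensitive grep; treat this as an illustration):

  $ kubectl get cells -oyaml | grep -i controller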

So in the basic case, when you create a virtual-kubelet pod, Kip creates its persistent storage and a ControllerID, which is written to that storage. Unless kube-proxy is patched out, the controller will also launch a kube-proxy pod (tagged with the ControllerID) and create a cell for that pod. Note: we don't actually need or use the kube-proxy pod (I patch it out in my setup, as sketched below).
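
As an illustration of what patching kube-proxy out can look like, one common approach is to add a node affinity rule to the kube-proxy DaemonSet so it never schedules onto the virtual node. This is a rough sketch, not something taken from this repo: the node label key/value (type: virtual-kubelet) and the patch file name are assumptions about how the virtual node is labeled in your cluster.

  # kube-proxy-vk-patch.yaml (hypothetical file name)
  spec:
    template:
      spec:
        affinity:
          nodeAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
              - matchExpressions:
                - key: type            # assumed label on the virtual-kubelet node
                  operator: NotIn
                  values:
                  - virtual-kubelet

  $ kubectl -n kube-system patch daemonset kube-proxy --patch "$(cat kube-proxy-vk-patch.yaml)"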

Now, when the virtual-kubelet pod is deleted, the instance and cell remain in the system, marked with the original ControllerID. When the Kip virtual-kubelet deployment is re-created, the pod gets a new filesystem, the storage area for Kip is re-created, and the controller creates a new ControllerID value. Kip will no longer track the old cells because it has a new identity.

How we deal with this in deploy/virtual-kubelet.yaml and in minikube:

  1. In the manifests we give to users, we create a PersistentVolumeClaim for the virtual-kubelet storage and store Kip's state on that volume. When the pod is restarted or moved, the storage should remain the same (a sketch of this approach follows the minikube example below).
  2. In minikube, the situation is a bit different. I choose to store Kip's data on a hostPath volume that's mounted into the pod. That remains the same across pod restarts but will go away if we do minikube delete:
  containers:
  - name: virtual-kubelet
    volumeMounts:
    - name: data
      mountPath: /opt/kip    # Kip's data directory inside the container
  volumes:
  - name: data
    hostPath:
      path: /opt/kip         # directory on the minikube VM; survives pod restarts
      type: DirectoryOrCreate
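
For comparison, here is a minimal sketch of the PersistentVolumeClaim approach from item 1. The claim name, storage size, and mount path are illustrative and not copied from deploy/virtual-kubelet.yaml:

  apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    name: kip-data             # illustrative name
  spec:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 1Gi

  # in the virtual-kubelet pod spec:
  containers:
  - name: virtual-kubelet
    volumeMounts:
    - name: data
      mountPath: /opt/kip
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: kip-data

As long as the claim (and its underlying volume) sticks around, a re-created virtual-kubelet pod finds the old state and keeps the same ControllerID.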

Both of these are non-ideal, but they work and were among the easiest ways to do it. When the time is right, I'd like to look into alternatives for preserving controller identity across pod restarts.

@ldx
Contributor

ldx commented Jun 15, 2020

Created #131 to deal with all resources left behind by a deleted Kip instance.

@ldx ldx closed this as completed Jun 15, 2020