Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage capacity producer #450

Merged
merged 8 commits into from
Aug 18, 2020
Merged

Conversation

pohly
Copy link
Contributor

@pohly pohly commented Jul 1, 2020

What type of PR is this?
/kind feature

What this PR does / why we need it:

This is the producer side of https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/1472-storage-capacity-tracking

Special notes for your reviewer:

TODO:

  • decide about command line interface
  • release notes
  • documentation
  • more tests
  • import Kubernetes 1.19.0 beta + sig-storage-lib-external-provisioner v6.0.0; right now I am using redirects instead
  • ignore storage classes with immediate binding, because CSIStorageCapacity information is not used for those anyway?
  • support making a Deployment the owner by walking up the owner chain

Does this PR introduce a user-facing change?:

optionally publish storage capacity information (alpha feature)

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jul 1, 2020
@k8s-ci-robot k8s-ci-robot requested review from jsafrane and saad-ali July 1, 2020 13:13
@k8s-ci-robot k8s-ci-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Jul 1, 2020
pkg/features/features.go Outdated Show resolved Hide resolved
go.mod Outdated Show resolved Hide resolved
pkg/capacity/topology/doc.go Outdated Show resolved Hide resolved
pkg/capacity/topology/nodes.go Show resolved Hide resolved
pkg/capacity/topology/nodes_test.go Outdated Show resolved Hide resolved
},
expectedSegments: []*Segment{localStorageNode2},
},
"deep-topology": {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe another tests, with partially matching segments:

  • Topology keys are A, B, C
  • One node has A: US, B: NY, C:1
  • Second node has A: US, B: NY, C:2
  • -> two segments are created.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 182 to 183
// There's no reconciliation at runtime because changes not made by us
// should never happen.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is scary. "Changes not made by us" do happen. If a controller owns an object, it should recreate it when someone deletes it or update it when someone replaces it with some nonsense. This is actually my main issue with this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering about this myself, hence this comment. I'll add reconciliation against against actual objects.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm undecided whether I should use a CSIStorageCapacity informer or just list them periodically.

If I use an informer, the controller will get all of its own changes sent back to it, which adds a permanent, fixed overhead. The advantage is that the controller can react to undesirable changes quickly.

If I list, probably with a long delay, then the overhead might be lower (depending on how often it polls), but undesirable changes will be detected more slowly.

@jsafrane : What would you prefer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jsafrane: ping? ^^^

Copy link
Contributor Author

@pohly pohly Aug 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've convinced myself that an informer is better because:

  • a constant overhead is simpler to measure and plan for
  • the code for handling the update conflict can be removed

Implemented.

pkg/capacity/controller.go Outdated Show resolved Hide resolved
pkg/capacity/controller.go Outdated Show resolved Hide resolved
pkg/capacity/controller.go Outdated Show resolved Hide resolved
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 6, 2020
@pohly pohly force-pushed the storage-capacity branch from 5f82e32 to c67ed1b Compare July 14, 2020 13:05
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 14, 2020
@pohly pohly force-pushed the storage-capacity branch from c67ed1b to f2c7497 Compare July 14, 2020 13:57
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 30, 2020
@pohly pohly force-pushed the storage-capacity branch from f2c7497 to 35a8691 Compare August 4, 2020 12:55
@@ -266,6 +277,50 @@ func main() {
controllerCapabilities,
)

var capacityController *capacity.Controller
if (*capacityFeatures)[capacity.FeatureCentral] {
// TODO: use parameters instead?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jsafrane: do you have an opinion whether it is worth making POD_NAME and POD_NAMESPACE real parameters? The advantage is that they become more visible. The disadvantage is that the YAML file becomes more complex because the variables are still needed (we need the env/valueFrom/fieldRef and the parameter which uses it).

Also, is it worth adding a parameter for how far the ownership chain should be walked up? It's not needed for StatefulSet where the desired owner is one level up (current code, hard-coded), but it may be useful for a Deployment->ReplicaSet->Pod scenario, because then the owner should be the Deployment object.

The problem is that in order to retrieve arbitrary parent objects, I need to pull in code like sigs.k8s.io/controller-runtime/pkg/client which can get objects of arbitrary types.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decided to stick with just the env variables.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Env. variables are pretty common. For ownership, I can see that using Deployment is far better than using ReplicaSet, however, I don't know how to make it generic and nice. Perhaps a special case for ReplicaSet?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've implemented it using sigs.k8s.io/controller-runtime/pkg/client: that API works with arbitrary object types.

I just pushed that updated code.

@pohly pohly force-pushed the storage-capacity branch from 35a8691 to 9a4d37e Compare August 5, 2020 09:49
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 5, 2020
@pohly pohly force-pushed the storage-capacity branch 2 times, most recently from ee5efa9 to bcef05d Compare August 6, 2020 15:53
@pohly
Copy link
Contributor Author

pohly commented Aug 6, 2020

@jsafrane: I've pushed a version of the code with documentation and all of the inline TODOs resolved. Can you perhaps take another look?

I intend to add two more things before merging:

  • walking up the owner chain
  • skipping storage classes with immediate binding

@pohly pohly force-pushed the storage-capacity branch from bcef05d to 75e634a Compare August 6, 2020 15:59
deploy/kubernetes/rbac.yaml Outdated Show resolved Hide resolved
deploy/kubernetes/rbac.yaml Outdated Show resolved Hide resolved
cmd/csi-provisioner/csi-provisioner.go Outdated Show resolved Hide resolved
deploy/kubernetes/rbac.yaml Outdated Show resolved Hide resolved
deploy/kubernetes/rbac.yaml Outdated Show resolved Hide resolved
go.mod Outdated Show resolved Hide resolved
pkg/capacity/controller.go Outdated Show resolved Hide resolved
Comment on lines 200 to 204
AddFunc: func(obj interface{}) { c.onCAddOrUpdate(ctx, obj.(*storagev1alpha1.CSIStorageCapacity)) },
UpdateFunc: func(_ interface{}, newObj interface{}) {
c.onCAddOrUpdate(ctx, newObj.(*storagev1alpha1.CSIStorageCapacity))
},
DeleteFunc: func(obj interface{}) { c.onCDelete(ctx, obj.(*storagev1alpha1.CSIStorageCapacity)) },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the objects are retyped without any checks, please check for c != nil in onCAddOrUpdate / onCDelete

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can that ever happen? I've seen the checks elsewhere but didn't add them myself because the callbacks should never get any other kind of object.

Copy link
Contributor

@jsafrane jsafrane Aug 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least DeleteFunc used to be called with deletion tombstone - not the latest version of the API object, but the latest known by the informer. wrapped in an extra object

https://github.com/kubernetes/kubernetes/blob/ff3e5e06a79bc69ad3d7ccedd277542b6712514b/pkg/controller/volume/persistentvolume/pv_controller_base.go#L180

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, I learned something new, thanks. I wonder whether I could have learned that from reading some API description.

I added cast checking. I see that as something that needs to be handled in the glue code rather than the actual callback, but it's now a bit repetitive, so I could be convinced to move casting into the callback...

pkg/capacity/controller.go Outdated Show resolved Hide resolved
pkg/capacity/controller_test.go Outdated Show resolved Hide resolved
pkg/capacity/controller_test.go Outdated Show resolved Hide resolved
pkg/capacity/controller_test.go Outdated Show resolved Hide resolved
@pohly pohly force-pushed the storage-capacity branch from 75e634a to f18f39b Compare August 10, 2020 13:08
@msau42
Copy link
Collaborator

msau42 commented Aug 14, 2020

cc @verult

@pohly pohly force-pushed the storage-capacity branch from fdc3726 to 17ba059 Compare August 14, 2020 08:30
Copy link
Contributor Author

@pohly pohly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed the updated code.

README.md Outdated

* `--capacity-poll-period <duration>`: How long the external-provisioner waits before checking for storage capacity changes. Defaults to `1m`.

* `--enable-capacity <enumeration>`: Enables producing CSIStorageCapacity objects with capacity information from the driver's GetCapacity call. Can be given more than once and/or with comma-separated values. Currently supported: --enable-capacity=central,immediate-binding.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Originally, --enable-capacity was not necessarily mutually exclusive. Like feature gates, it enabled certain aspects of capacity info publishing. It was just that pruning of the KEP led to just two aspects that now happen to be mutually exclusive.

What would the separate flag for immediate binding be called? --enable-capacity-for-immediate-binding?

README.md Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Show resolved Hide resolved
pkg/capacity/topology/nodes.go Outdated Show resolved Hide resolved
pkg/capacity/topology/nodes.go Outdated Show resolved Hide resolved
pkg/capacity/topology/nodes.go Show resolved Hide resolved
pkg/owner/owner.go Show resolved Hide resolved
pkg/owner/owner.go Outdated Show resolved Hide resolved
@pohly
Copy link
Contributor Author

pohly commented Aug 14, 2020

https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/pull/kubernetes-csi_external-provisioner/450/pull-kubernetes-csi-external-provisioner-1-18-on-kubernetes-1-18/1294189594291474435 failed because:

That download failure can be fixed. But what puzzles me is that the job tries to build Kubernetes from source. It shouldn't be doing that.

@pohly pohly force-pushed the storage-capacity branch 2 times, most recently from 195c868 to 17ba059 Compare August 14, 2020 09:48
@pohly
Copy link
Contributor Author

pohly commented Aug 14, 2020

That download failure can be fixed. But what puzzles me is that the job tries to build Kubernetes from source. It shouldn't
be doing that.

Gotcha! We don't configure which Kubernetes e2e to use for 1.18, so it falls back to building master.

@pohly pohly force-pushed the storage-capacity branch 3 times, most recently from 44841b0 to 12fe135 Compare August 14, 2020 18:29
README.md Outdated

* `--capacity-poll-period <duration>`: How long the external-provisioner waits before checking for storage capacity changes. Defaults to `1m`.

* `--enable-capacity <enumeration>`: Enables producing CSIStorageCapacity objects with capacity information from the driver's GetCapacity call. Can be given more than once and/or with comma-separated values. Currently supported: --enable-capacity=central,immediate-binding.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure I can't really think of anything shorter. How about --capacity-controller-deployment-mode for central and local?

README.md Outdated

* `--capacity-threads <num>`: Number of simultaneously running threads, handling CSIStorageCapacity objects. Defaults to `1`.

* `--capacity-poll-period <interval>`: How long the external-provisioner waits before checking for storage capacity changes. Defaults to `1m`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I meant "interval" in the field name. For example -retry-interval-max, except in this case it's not exponential.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, of course. I overlooked the "period" in the parameter name. Fixed.

const (
// FeatureCentral enables the mode where there is only one
// external-provisioner actively running in the cluster which
// talks to the CSI driver's controller.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add "service" to be more explicit we're talking about the csi service

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

pkg/capacity/capacity.go Outdated Show resolved Hide resolved
@pohly pohly force-pushed the storage-capacity branch from 12fe135 to 1feefc7 Compare August 17, 2020 08:57
pohly added 8 commits August 17, 2020 12:37
This is the producer side of KEP
https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/1472-storage-capacity-tracking.

Only deployment together with a central controller is currently
implemented.

When syncing directly whenever there is a change, there's potentially
a larger number of changes emitted. When there are rapid changes (for
example, while a driver gets deployed), it may be better to delay
processing and thus combine multiple changes in a single sync.
Generic code from the controller runtime is used to retrieve
unstructured objects. This is needed for deployment via Deployment,
because the direct parent of the pod is then a ReplicaSet which itself
will get deleted by the Deployment when rolling out changes.

There are intentionally no unit tests for the feature because that
would bring in even more additional dependencies.
The information is not needed when the consumer is the Kubernetes
scheduler, but it may be useful to publish also information about
storage classes with immediate binding, so it's configurable.
DeletedFinalStateUnknown is something that may be handed to the delete
callback. It can and should be handled.

Because there might be other, currently unexpected objects in the
future, it's better to handle them gracefully with a checked cast and
proper logging.
`--enable-capacity` was originally designed to be a multi-value
boolean map. After removing options from it, replacing it with a
single-value string makes more sense.
@pohly pohly force-pushed the storage-capacity branch from 1feefc7 to e50daf3 Compare August 17, 2020 10:38
@msau42
Copy link
Collaborator

msau42 commented Aug 18, 2020

/approve
Thank you!

Will let @jsafrane do another pass.

I just have some nits on adding a comment and more description to the new field value. And can you update the release note to point to the readme with more information on how to use the feature?

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: msau42, pohly

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 18, 2020
@jsafrane
Copy link
Contributor

/lgtm
/retest

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 18, 2020
@k8s-ci-robot k8s-ci-robot merged commit e909258 into kubernetes-csi:master Aug 18, 2020

* `--capacity-poll-interval <interval>`: How long the external-provisioner waits before checking for storage capacity changes. Defaults to `1m`.

* `--capacity-controller-deployment-mode central|none`: Enables producing CSIStorageCapacity objects with capacity information from the driver's GetCapacity call. 'central' is currently the only supported mode. Use it when there is just one active provisioner in the cluster. Defaults to `none`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I think this merged before I got replied. I think it's not obvious that "none" means disabled. Could you clarify that in the doc, or just have unset as disabled?

Maybe also put this argument first since it basically controls the remaining args.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in #468

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants