-
Notifications
You must be signed in to change notification settings - Fork 336
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
storage capacity producer #450
Conversation
}, | ||
expectedSegments: []*Segment{localStorageNode2}, | ||
}, | ||
"deep-topology": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe another tests, with partially matching segments:
- Topology keys are
A
,B
,C
- One node has
A: US, B: NY, C:1
- Second node has
A: US, B: NY, C:2
- -> two segments are created.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
pkg/capacity/controller.go
Outdated
// There's no reconciliation at runtime because changes not made by us | ||
// should never happen. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is scary. "Changes not made by us" do happen. If a controller owns an object, it should recreate it when someone deletes it or update it when someone replaces it with some nonsense. This is actually my main issue with this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering about this myself, hence this comment. I'll add reconciliation against against actual objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm undecided whether I should use a CSIStorageCapacity informer or just list them periodically.
If I use an informer, the controller will get all of its own changes sent back to it, which adds a permanent, fixed overhead. The advantage is that the controller can react to undesirable changes quickly.
If I list, probably with a long delay, then the overhead might be lower (depending on how often it polls), but undesirable changes will be detected more slowly.
@jsafrane : What would you prefer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jsafrane: ping? ^^^
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've convinced myself that an informer is better because:
- a constant overhead is simpler to measure and plan for
- the code for handling the update conflict can be removed
Implemented.
@@ -266,6 +277,50 @@ func main() { | |||
controllerCapabilities, | |||
) | |||
|
|||
var capacityController *capacity.Controller | |||
if (*capacityFeatures)[capacity.FeatureCentral] { | |||
// TODO: use parameters instead? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jsafrane: do you have an opinion whether it is worth making POD_NAME and POD_NAMESPACE real parameters? The advantage is that they become more visible. The disadvantage is that the YAML file becomes more complex because the variables are still needed (we need the env/valueFrom/fieldRef and the parameter which uses it).
Also, is it worth adding a parameter for how far the ownership chain should be walked up? It's not needed for StatefulSet where the desired owner is one level up (current code, hard-coded), but it may be useful for a Deployment->ReplicaSet->Pod scenario, because then the owner should be the Deployment object.
The problem is that in order to retrieve arbitrary parent objects, I need to pull in code like sigs.k8s.io/controller-runtime/pkg/client which can get objects of arbitrary types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I decided to stick with just the env variables.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Env. variables are pretty common. For ownership, I can see that using Deployment is far better than using ReplicaSet, however, I don't know how to make it generic and nice. Perhaps a special case for ReplicaSet?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've implemented it using sigs.k8s.io/controller-runtime/pkg/client: that API works with arbitrary object types.
I just pushed that updated code.
ee5efa9
to
bcef05d
Compare
@jsafrane: I've pushed a version of the code with documentation and all of the inline TODOs resolved. Can you perhaps take another look? I intend to add two more things before merging:
|
pkg/capacity/controller.go
Outdated
AddFunc: func(obj interface{}) { c.onCAddOrUpdate(ctx, obj.(*storagev1alpha1.CSIStorageCapacity)) }, | ||
UpdateFunc: func(_ interface{}, newObj interface{}) { | ||
c.onCAddOrUpdate(ctx, newObj.(*storagev1alpha1.CSIStorageCapacity)) | ||
}, | ||
DeleteFunc: func(obj interface{}) { c.onCDelete(ctx, obj.(*storagev1alpha1.CSIStorageCapacity)) }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the objects are retyped without any checks, please check for c != nil
in onCAddOrUpdate
/ onCDelete
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can that ever happen? I've seen the checks elsewhere but didn't add them myself because the callbacks should never get any other kind of object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At least DeleteFunc used to be called with deletion tombstone - not the latest version of the API object, but the latest known by the informer. wrapped in an extra object
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting, I learned something new, thanks. I wonder whether I could have learned that from reading some API description.
I added cast checking. I see that as something that needs to be handled in the glue code rather than the actual callback, but it's now a bit repetitive, so I could be convinced to move casting into the callback...
75e634a
to
f18f39b
Compare
cc @verult |
fdc3726
to
17ba059
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed the updated code.
README.md
Outdated
|
||
* `--capacity-poll-period <duration>`: How long the external-provisioner waits before checking for storage capacity changes. Defaults to `1m`. | ||
|
||
* `--enable-capacity <enumeration>`: Enables producing CSIStorageCapacity objects with capacity information from the driver's GetCapacity call. Can be given more than once and/or with comma-separated values. Currently supported: --enable-capacity=central,immediate-binding. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Originally, --enable-capacity
was not necessarily mutually exclusive. Like feature gates, it enabled certain aspects of capacity info publishing. It was just that pruning of the KEP led to just two aspects that now happen to be mutually exclusive.
What would the separate flag for immediate binding be called? --enable-capacity-for-immediate-binding
?
That download failure can be fixed. But what puzzles me is that the job tries to build Kubernetes from source. It shouldn't be doing that. |
195c868
to
17ba059
Compare
Gotcha! We don't configure which Kubernetes e2e to use for 1.18, so it falls back to building master. |
44841b0
to
12fe135
Compare
README.md
Outdated
|
||
* `--capacity-poll-period <duration>`: How long the external-provisioner waits before checking for storage capacity changes. Defaults to `1m`. | ||
|
||
* `--enable-capacity <enumeration>`: Enables producing CSIStorageCapacity objects with capacity information from the driver's GetCapacity call. Can be given more than once and/or with comma-separated values. Currently supported: --enable-capacity=central,immediate-binding. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure I can't really think of anything shorter. How about --capacity-controller-deployment-mode
for central
and local
?
README.md
Outdated
|
||
* `--capacity-threads <num>`: Number of simultaneously running threads, handling CSIStorageCapacity objects. Defaults to `1`. | ||
|
||
* `--capacity-poll-period <interval>`: How long the external-provisioner waits before checking for storage capacity changes. Defaults to `1m`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I meant "interval" in the field name. For example -retry-interval-max
, except in this case it's not exponential.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, of course. I overlooked the "period" in the parameter name. Fixed.
pkg/capacity/features.go
Outdated
const ( | ||
// FeatureCentral enables the mode where there is only one | ||
// external-provisioner actively running in the cluster which | ||
// talks to the CSI driver's controller. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add "service" to be more explicit we're talking about the csi service
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
12fe135
to
1feefc7
Compare
This is the producer side of KEP https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/1472-storage-capacity-tracking. Only deployment together with a central controller is currently implemented. When syncing directly whenever there is a change, there's potentially a larger number of changes emitted. When there are rapid changes (for example, while a driver gets deployed), it may be better to delay processing and thus combine multiple changes in a single sync.
Generic code from the controller runtime is used to retrieve unstructured objects. This is needed for deployment via Deployment, because the direct parent of the pod is then a ReplicaSet which itself will get deleted by the Deployment when rolling out changes. There are intentionally no unit tests for the feature because that would bring in even more additional dependencies.
The information is not needed when the consumer is the Kubernetes scheduler, but it may be useful to publish also information about storage classes with immediate binding, so it's configurable.
DeletedFinalStateUnknown is something that may be handed to the delete callback. It can and should be handled. Because there might be other, currently unexpected objects in the future, it's better to handle them gracefully with a checked cast and proper logging.
`--enable-capacity` was originally designed to be a multi-value boolean map. After removing options from it, replacing it with a single-value string makes more sense.
1feefc7
to
e50daf3
Compare
/approve Will let @jsafrane do another pass. I just have some nits on adding a comment and more description to the new field value. And can you update the release note to point to the readme with more information on how to use the feature? |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: msau42, pohly The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/lgtm |
|
||
* `--capacity-poll-interval <interval>`: How long the external-provisioner waits before checking for storage capacity changes. Defaults to `1m`. | ||
|
||
* `--capacity-controller-deployment-mode central|none`: Enables producing CSIStorageCapacity objects with capacity information from the driver's GetCapacity call. 'central' is currently the only supported mode. Use it when there is just one active provisioner in the cluster. Defaults to `none`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I think this merged before I got replied. I think it's not obvious that "none" means disabled. Could you clarify that in the doc, or just have unset as disabled?
Maybe also put this argument first since it basically controls the remaining args.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in #468
What type of PR is this?
/kind feature
What this PR does / why we need it:
This is the producer side of https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/1472-storage-capacity-tracking
Special notes for your reviewer:
TODO:
Does this PR introduce a user-facing change?: