Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Add KubeVirt rate limiting profile kubelet #2496

Closed

Conversation

jcanocan
Copy link
Contributor

What this PR does / why we need it:
The rate limiting profile kubelet allows users to dynamically set the provided kubelet KubeAPIQPS and KubeAPIBurst to KubeVirt components: API, Webhook, Controller and Handler. The profile could be enabled adding the following to the HCO definition:

apiVersion: hco.kubevirt.io/v1beta1
kind: HyperConverged
...
spec:
  tuningPolicy: kubelet
...

This profile relies on the Machine Configuration Operator (MCO) to detect changes on the kubelet configuration. Therefore, MCO is required in order to work. This operator is available by default on OpenShift except for HyperShift where it's not available. Also, among all MCO objects, the kubelet profile uses the XX-worker-generated-kubelet objects to fetch and keep track of the rate limiting values.

Reviewer Checklist

Reviewers are supposed to review the PR for every aspect below one by one. To check an item means the PR is either "OK" or "Not Applicable" in terms of that item. All items are supposed to be checked before merging a PR.

  • PR Message
  • Commit Messages
  • How to test
  • Unit Tests
  • Functional Tests
  • User Documentation
  • Developer Documentation
  • Upgrade Scenario
  • Uninstallation Scenario
  • Backward Compatibility
  • Troubleshooting Friendly

Jira Ticket:

https://issues.redhat.com/browse/CNV-28519

Release note:

Add `kubelet` rate limting profile

@kubevirt-bot kubevirt-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Aug 31, 2023
@kubevirt-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign sradco for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jcanocan
Copy link
Contributor Author

/hold

@kubevirt-bot kubevirt-bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 31, 2023
@jcanocan
Copy link
Contributor Author

This is a WIP, the idea behind this PR is to get feedback from the community and to find out the best way to get this feature. Alternatives to MCO are listed in CNV-28519.

@@ -32,6 +32,7 @@ type HyperConvergedTuningPolicy string
const (
HyperConvergedAnnotationTuningPolicy HyperConvergedTuningPolicy = "annotation"
HyperConvergedHighBurstProfile HyperConvergedTuningPolicy = "highBurst"
HyperConvergedKubeletProfile HyperConvergedTuningPolicy = "kubelet"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about follow-kubelet

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me! We have also considered naming it as something like mcoKubelet, with the idea of reflecting the dependency of the machine configuration operator (mco). Another possibility is a merge between both ideas like: followMCOKubelet. WDYT? In any case, followKubelet (IMHO without the - for consistency purposes) sounds better than just kubelet :)

@jcanocan jcanocan force-pushed the poc-kubelet-rate-machineconfigs branch 2 times, most recently from c7aaa05 to 73808eb Compare September 26, 2023 07:03
@kubevirt-bot kubevirt-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 26, 2023
@jcanocan jcanocan force-pushed the poc-kubelet-rate-machineconfigs branch 2 times, most recently from fd0c5e3 to 095cd54 Compare September 26, 2023 07:26
The rate limiting profile `kubelet` allows users to dynamically set the
provided kubelet `KubeAPIQPS` and `KubeAPIBurst` to KubeVirt components.
The profile could be enabled adding the following to the HCO definition:

```
apiVersion: hco.kubevirt.io/v1beta1
kind: HyperConverged
...
spec:
  tuningPolicy: kubelet
...
```

This profile relies on the Machine Configuration Operator (MCO) to detect
changes on the kubelet configuration. Therefore, MCO is required in
order to work. This operator is available by default on OpenShift with
the exception of HyperShift where it's not available.
Also, among all MCO objects, the `kubelet` profile uses the
`XX-worker-generated-kubelet` objects to fetch and keep track of the
rate limiting values.
Although, to configure kubelet the `KubeletConfig` objects are used,
this profile relies on the `machineconfig` object
`XX-worker-generated-kubelet` because it has several advantages:

* It is complete, i.e., it contains all kubelet values including default
  configurations in case the user enables the profile but no additional
settings have been provided to the rate limiters. Therefore, in case
there's any `KubeletConfig`, the profile can still get the default
values of kubelet, which may differ in future versions of OCP.
* This object is only updated if the configuration process has been done
  properly. For instance, if the user creates a new `KubeletConfig` but
no `machineconfigpool` has been labeled to use it, the
`XX-worker-generated-kubelet` won't reflect any changes.

Signed-off-by: Javier Cano Cano <[email protected]>
@jcanocan jcanocan force-pushed the poc-kubelet-rate-machineconfigs branch from 095cd54 to c5a1f94 Compare September 26, 2023 07:36
@sonarqubecloud
Copy link

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 5 Code Smells

No Coverage information No Coverage information
0.0% 0.0% Duplication

@coveralls
Copy link
Collaborator

Pull Request Test Coverage Report for Build 6309547484

  • 11 of 169 (6.51%) changed or added relevant lines in 4 files are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage decreased (-2.2%) to 84.054%

Changes Missing Coverage Covered Lines Changed/Added Lines %
controllers/hyperconverged/hyperconverged_controller.go 0 15 0.0%
controllers/operands/kubevirt.go 8 57 14.04%
pkg/util/kubelet.go 0 94 0.0%
Files with Coverage Reduction New Missed Lines %
controllers/operands/kubevirt.go 1 87.76%
Totals Coverage Status
Change from base Build 6301809738: -2.2%
Covered Lines: 5213
Relevant Lines: 6202

💛 - Coveralls

@kubevirt-bot kubevirt-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 3, 2023
@kubevirt-bot
Copy link
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Copy link

openshift-ci bot commented Dec 28, 2023

@jcanocan: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/hco-e2e-upgrade-operator-sdk-sno-aws c5a1f94 link false /test hco-e2e-upgrade-operator-sdk-sno-aws
ci/prow/hco-e2e-upgrade-prev-operator-sdk-aws c5a1f94 link true /test hco-e2e-upgrade-prev-operator-sdk-aws
ci/prow/hco-e2e-upgrade-prev-operator-sdk-sno-aws c5a1f94 link false /test hco-e2e-upgrade-prev-operator-sdk-sno-aws
ci/prow/hco-e2e-upgrade-operator-sdk-aws c5a1f94 link true /test hco-e2e-upgrade-operator-sdk-aws
ci/prow/hco-e2e-operator-sdk-sno-azure c5a1f94 link false /test hco-e2e-operator-sdk-sno-azure
ci/prow/hco-e2e-upgrade-operator-sdk-sno-azure c5a1f94 link false /test hco-e2e-upgrade-operator-sdk-sno-azure
ci/prow/hco-e2e-consecutive-operator-sdk-upgrades-aws c5a1f94 link true /test hco-e2e-consecutive-operator-sdk-upgrades-aws
ci/prow/okd-images c5a1f94 link true /test okd-images
ci/prow/images c5a1f94 link true /test images
ci/prow/hco-e2e-consecutive-operator-sdk-upgrades-azure c5a1f94 link true /test hco-e2e-consecutive-operator-sdk-upgrades-azure

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@kubevirt-bot
Copy link
Contributor

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@kubevirt-bot kubevirt-bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 27, 2024
@kubevirt-bot
Copy link
Contributor

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

/lifecycle rotten

@kubevirt-bot kubevirt-bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 3, 2024
@kubevirt-bot
Copy link
Contributor

@jcanocan: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-hyperconverged-cluster-operator-e2e-k8s-1.29 c5a1f94 link true /test pull-hyperconverged-cluster-operator-e2e-k8s-1.29
pull-hyperconverged-cluster-operator-e2e-k8s-1.30 c5a1f94 link true /test pull-hyperconverged-cluster-operator-e2e-k8s-1.30

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@kubevirt-bot
Copy link
Contributor

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

/close

@kubevirt-bot
Copy link
Contributor

@kubevirt-bot: Closed this PR.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dco-signoff: yes Indicates the PR's author has DCO signed all their commits. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants