[release-4.17] OCPBUGS-39558: Filter out shallowly UpdateEffectNone errors from a MultipleErrors message in the Failing condition #1114
…erVersionStatus

This commit adds tests for setting the Failing condition through the `updateClusterVersionStatus` function, to ensure that no functionality is lost when new changes are made.
…n MultipleErrors in Failing condition

Various errors get propagated to users, such as the summarized task graph error, for example in the message of the Failing condition. However, update errors with the UpdateEffectNone update effect can confuse users, because these primarily informational messages are displayed together with valid update errors that heavily impact the update. This can result in a message such as:

    {
      "lastTransitionTime": "2023-06-20T13:40:12Z",
      "message": "Multiple errors are preventing progress:\n* Cluster operator authentication is updating versions\n* Could not update customresourcedefinition \"alertingrules.monitoring.openshift.io\" (512 of 993): the object is invalid, possibly due to local cluster configuration",
      "reason": "MultipleErrors",
      "status": "True",
      "type": "Failing"
    }

The Failing condition is not true because of the UpdateEffectNone error ("Cluster operator authentication is updating versions"), but its message is still displayed.

This commit makes sure that update errors that do not heavily affect the update are removed, to an extent, from the MultipleErrors error in the Failing condition message. The errors filtered out of the message are still displayed in the logs and in other places, such as the ReconciliationIssues condition.

The original code correctly handles situations where the status failure is a single UpdateEffectNone error, and the new changes leave such errors alone. If the MultipleErrors error contains only UpdateEffectNone errors, the error is left unchanged, to preserve the original logic and keep the commit simple. The goal of this commit is to remove unimportant messages from MultipleErrors errors that also contain valid messages in the Failing condition.

The current code contains an override that sets the Failing condition when history is empty or the CVO is reconciling. This commit keeps this logic functional.
This means the filtering is only applied when history is not empty and the CVO is not reconciling the payload.
Because UpdateError errors are now filtered before the Failing condition is set, the TestCVO_ParallelError test needs to be updated: its errors are rightfully filtered out, as their UpdateEffect is None. This commit also takes the opportunity to change the UpdateEffect of one of the errors so that the filtering is tested here as well.
245e690 to 7e71284
@openshift-cherrypick-robot: Detected clone of Jira Issue OCPBUGS-15200 with correct target version. Will retitle the PR to link to the clone. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
@openshift-cherrypick-robot: This pull request references Jira Issue OCPBUGS-39558, which is valid. The bug has been moved to the POST state. 7 validation(s) were run on this bug
Requesting review from QA contact: The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
/test e2e-hypershift
This is not the first time I see a
Follow up on #1114 (comment)
/test e2e-hypershift
/override ci/prow/e2e-hypershift The CI seems to be flaky. The changes from the PR only filter messages propagated to the ClusterVersion Failing condition. The failing CI is not relevant to the PR. There were three runs:
@DavidHurta: DavidHurta unauthorized: /override is restricted to Repo administrators, approvers in top level OWNERS file, and the following github teams:openshift: openshift-release-oversight openshift-staff-engineers. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
The 4.17 branch has my older username
/override ci/prow/e2e-hypershift
@petr-muller: Overrode contexts on behalf of petr-muller: ci/prow/e2e-hypershift In response to this:
LGTM
/label backport-risk-assessed
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: openshift-cherrypick-robot, petr-muller The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Install a 4.17 cluster and degrade the CO authentication.
Create a custom policy to prohibit the CVO from updating the openshift-samples operator deployment.
Trigger an upgrade to a version that contains the changes.
The upgrade is triggered and the CVO reports a new error after some time.
Check the CVO logs to be sure that the message was filtered as expected.
After un-degrading the CO authentication, the error should change from Multiple errors to UpdatePayloadResourceInvalid.
Delete the created policies.
The upgrade should proceed without errors.
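The message check in the steps above could be automated along these lines. This is only a sketch under stated assumptions: the `condition` struct, the `failingMentions` helper, and the sample JSON (modelled on the Failing condition quoted in the commit message, with the second error shortened) are hypothetical and not part of the PR or of any QE tooling.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// condition models only the fields of a ClusterVersion condition used here.
type condition struct {
	Type    string `json:"type"`
	Status  string `json:"status"`
	Reason  string `json:"reason"`
	Message string `json:"message"`
}

// failingMentions reports whether a Failing=True condition still contains
// the given text in its message. Other conditions always yield false.
func failingMentions(raw []byte, text string) (bool, error) {
	var c condition
	if err := json.Unmarshal(raw, &c); err != nil {
		return false, err
	}
	if c.Type != "Failing" || c.Status != "True" {
		return false, nil
	}
	return strings.Contains(c.Message, text), nil
}

func main() {
	// Sample condition as it might look before the fix (assumed input).
	raw := []byte(`{
		"type": "Failing",
		"status": "True",
		"reason": "MultipleErrors",
		"message": "Multiple errors are preventing progress:\n* Cluster operator authentication is updating versions\n* Could not update customresourcedefinition"
	}`)
	found, err := failingMentions(raw, "is updating versions")
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println("informational message still present:", found)
}
```

With the fix in place, a check like this would be expected to report that the informational text is absent from the Failing message, while the same text remains visible in the CVO logs.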
/label qe-approved
@openshift-cherrypick-robot: This pull request references Jira Issue OCPBUGS-39558, which is valid. 7 validation(s) were run on this bug
Requesting review from QA contact: In response to this:
@jiajliu, PTAL regarding the
/label cherry-pick-approved
@openshift-cherrypick-robot: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
bc60dbd into openshift:release-4.17
@openshift-cherrypick-robot: Jira Issue OCPBUGS-39558: All pull requests linked via external trackers have merged: Jira Issue OCPBUGS-39558 has been moved to the MODIFIED state. In response to this:
[ART PR BUILD NOTIFIER] Distgit: cluster-version-operator
/cherry-pick release-4.16
@DavidHurta: new pull request created: #1128 In response to this:
This is an automated cherry-pick of #1050
/assign DavidHurta