Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shift generated cluster e2e image validation jobs to config-forker/rotator #33927

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Rajalakshmi-Girish
Copy link
Contributor

@Rajalakshmi-Girish Rajalakshmi-Girish commented Dec 11, 2024

This change contains:

  • Add detail about fork-per-release-generic-suffix to releng/config-forker/README.md
  • Add new placeholder releng/cloud_provider_image_validation.yaml to hold cluster e2e cloud provider image validation job definitions.
  • Use Config-forker and generate beta job in config/jobs/kubernetes/sig-release/release-branch-jobs/cloud-provider/1.32.yaml
  • Use Config-rotator and subsequently rotate the version markers.
  • Manual changes to the stable2(1.30) and stable3(1.29) Jobs as per testsuite name mentioned:
    • alphafeatures: remove EventedPLEG=false and ENABLE_AUTH_PROVIDER_GCP=true env args
    • default: remove ENABLE_CACHE_MUTATION_DETECTOR=true env arg
    • serial: remove [sig-cloud-provider-gcp\] regex from --ginkgo.skip test_arg
  • Remove the jobs from directory generated

The upcoming actions would be to:

@xmudrii @BenTheElder @dims ^^

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Dec 11, 2024
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. area/config Issues or PRs related to code in /config area/jobs area/release-eng Issues or PRs related to the Release Engineering subproject sig/release Categorizes an issue or PR as relevant to SIG Release. sig/testing Categorizes an issue or PR as relevant to SIG Testing. labels Dec 11, 2024
@Rajalakshmi-Girish Rajalakshmi-Girish force-pushed the change-generated-jobs branch 2 times, most recently from b631329 to c649f80 Compare December 11, 2024 19:16
@Rajalakshmi-Girish Rajalakshmi-Girish marked this pull request as draft December 11, 2024 19:25
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 11, 2024
@Rajalakshmi-Girish Rajalakshmi-Girish marked this pull request as ready for review December 11, 2024 19:27
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 11, 2024
@Rajalakshmi-Girish Rajalakshmi-Girish marked this pull request as draft December 11, 2024 19:32
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 11, 2024
@Rajalakshmi-Girish
Copy link
Contributor Author

/retest

@Rajalakshmi-Girish Rajalakshmi-Girish marked this pull request as ready for review December 12, 2024 10:46
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 12, 2024
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Rajalakshmi-Girish
Once this PR has been reviewed and has the lgtm label, please assign cici37 for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copy link
Member

@xmudrii xmudrii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I support that we're getting rid of config/jobs/kubernetes/generated/generated.yaml!

One comment as of now, but I'm yet to do a more detailed review.

spec:
containers:
- args:
- --cluster=test-gce-cos-k8sstable3-alphafeatures
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prior to this change, this used to be a randomly generated string (I think). Do we care about this change, is there a chance that this might break something?

cc @BenTheElder

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it will break anything, we are fully on boskos. I don't think we even need to set this flag anymore, but I'd prefer to leave it for now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My biggest concern would be generating a value that is too long or improper character set, if we're changing the generation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The value for the --cluster flag was previously assigned the SHA1 hash of the job_name, generated by this line.

Since the job names only varied by version markers (e.g., beta, stable1) and remained consistent otherwise, the --cluster data stayed aligned with the job name despite version changes.

Given this consistency, I thought assigning the --cluster value directly as the job name should suffice.

@@ -0,0 +1,236 @@
periodics:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this file called "image-validation" ? isn't this just release-branched jobs?

and they're not really sig-release, even if release is doing the forking, cc @aojea
these are just kubernetes/kubernetes jobs, with multiple SIGs involved.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replied here

@BenTheElder
Copy link
Member

It would be nice if we could do a commit with the file split and then further commits with the definition mutations, as-is it's very hard to be confident about which changes landed in the jobs and review them.

Comment on lines +3 to +5
When a release branch of kubernetes is first cut, the jobs defined in [`cloud_provider_image_validation.yaml`]
must be forked to use the new release branch. Use [`releng/config-forker`] to
accomplish this, eg:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some wording improvements:

Suggested change
When a release branch of kubernetes is first cut, the jobs defined in [`cloud_provider_image_validation.yaml`]
must be forked to use the new release branch. Use [`releng/config-forker`] to
accomplish this, eg:
When cutting a new Kubernetes release branch, the jobs defined in
[`cloud_provider_image_validation.yaml`] must be forked for the new
release branch. Use [`releng/config-forker`] to accomplish this, e.g.:

Comment on lines +11 to +12
--version 1.27 \
--go-version 1.31 \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's put placeholders here, e.g.:

Suggested change
--version 1.27 \
--go-version 1.31 \
--version <Kubernetes major.minor version> \
--go-version <used Go major.minor version> \

--job-config $(pwd)/releng/cloud_provider_image_validation.yaml \
--version 1.27 \
--go-version 1.31 \
--output $(pwd)/config/jobs/kubernetes/sig-release/release-branch-jobs/cloud-provider/image-validation-1.31.yaml
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
--output $(pwd)/config/jobs/kubernetes/sig-release/release-branch-jobs/cloud-provider/image-validation-1.31.yaml
--output $(pwd)/config/jobs/kubernetes/sig-release/release-branch-jobs/cloud-provider/image-validation-<Kubernetes major.minor version>.yaml

--output $(pwd)/config/jobs/kubernetes/sig-release/release-branch-jobs/cloud-provider/image-validation-1.31.yaml
```

# How to rotate the k8sbeta job to stable1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should explain why are we doing this.

Comment on lines +18 to +23
```sh
# from test-infra root
$ go run ./releng/config-rotator \
--config-file ./config/jobs/kubernetes/sig-release/release-branch-jobs/cloud-provider/image-validation-1.31.yaml \
--new stable1 --old beta
```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also mention that this needs to be done for each release branch.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, I don't think this really solves the problem. This just shifts the problem from test_config.yaml to a new source that still has to be maintained in addition to the master branch jobs.

Ideally, we'd fork existing master branch jobs and not a dedicated template just for release branches. I can still see maintainers just changing the master branch jobs, and leaving jobs defined in this files as-is.

Also +1 to what @BenTheElder said about file name and ownership.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your feedback 👍
@xmudrii You’re absolutely right that having a dedicated template for release branch jobs introduces the risk of maintainers not keeping it in sync with master branch jobs.
How would you prefer to handle this? Do you think we should configure the jobs in the template to also run against the master branch?

@BenTheElder Could you also suggest an ideal filename and location for these jobs to ensure clarity and ease of maintenance?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xmudrii / @BenTheElder
Below are the corresponding jobs I could find for master branch. Kindly let me know if these can be modified to be compatible with config-forker to generate release branch jobs?

image

Other than the details job wise, there are a bunch of changes that I see when comparing master branch and release branch jobs:

  1. Release branch jobs have build cluster value cluster: k8s-infra-prow-build mentioned in job definitions.
  2. All RB jobs have an extra argument --cluster passed. But as per Ben's comment here, that flag can be removed for RB jobs too!
  3. The master branch jobs have extra annotations like testgrid-alert-email:, testgrid-alert-stale-results-hours:

What could be the next step?

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 19, 2024
@k8s-ci-robot
Copy link
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/config Issues or PRs related to code in /config area/jobs area/release-eng Issues or PRs related to the Release Engineering subproject cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. sig/release Categorizes an issue or PR as relevant to SIG Release. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants