
Enable multi-architecture support, specifically for s390x, in CNAO and the associated CNIs. #1916

Open
ashokpariya0 opened this issue Oct 8, 2024 · 62 comments


@ashokpariya0 (Contributor)

Is your feature request related to a problem? Please describe:
The origin-kube-rbac-proxy image (quay.io/openshift/origin-kube-rbac-proxy@sha256:e2def4213ec0657e72eb790ae8a115511d5b8f164a62d3568d2f1bff189917e8) used by the CNAO operator is not compatible with the s390x architecture.

Describe the solution you'd like:
Use a multi-arch image for kube-rbac-proxy.
The origin-kube-rbac-proxy image should be compatible with the s390x architecture as well.

Describe alternatives you've considered:
We can use quay.io/brancz/kube-rbac-proxy@sha256:e6a323504999b2a4d2a6bf94f8580a050378eba0900fd31335cf9df5787d9a9b
as the kube-rbac-proxy image, since it is compatible with the amd64, arm, arm64, ppc64le, and s390x architectures.
Please check:
https://quay.io/repository/brancz/kube-rbac-proxy/manifest/sha256:e6a323504999b2a4d2a6bf94f8580a050378eba0900fd31335cf9df5787d9a9b
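
For reference, a quick way to confirm which architectures a manifest list covers; a minimal sketch assuming skopeo and jq are installed (docker manifest inspect reports the same information):

```sh
# Read-only query of the registry: print each platform in the manifest list.
skopeo inspect --raw \
  docker://quay.io/brancz/kube-rbac-proxy@sha256:e6a323504999b2a4d2a6bf94f8580a050378eba0900fd31335cf9df5787d9a9b \
  | jq -r '.manifests[].platform | "\(.os)/\(.architecture)"'
```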

@ormergi (Contributor) commented Oct 8, 2024

@ashokpariya0 could you please elaborate on the motivation for supporting the s390x arch?
Is there a tracking issue or enhancement proposal for supporting s390x?

@ashokpariya0 (Contributor Author)

We are enabling s390x (IBM Z) support in most KubeVirt ecosystem components. KubeVirt (virt-operator) already supports s390x, and we are now working on additional components such as CNAO, CDI (almost done), SSP, etc.

@oshoval (Collaborator) commented Oct 8, 2024

Checked again. Regarding the auto bumper, I think in this specific case we don't need to update the bumper scripts, as this parameter is part of CNAO and not part of the components.
So your PR is fine.

What about all the other images? Some of them aren't multi-arch, right?
We won't be able to maintain this right now, but we will prioritize it as a task for all the images once we can.
If you plan to contribute and are able to give support if it breaks, we may be able to take your PR, thanks.
Maybe we first need to map all the required changes across all repos and only then start fixing them one by one.

Also, please consider that kube-rbac-proxy needs to be changed in https://github.com/k8snetworkplumbingwg/kubemacpool as well, so that KMP behaves the same whether it is installed standalone or via CNAO.

Regarding the proposed image, it seems safe to use, as https://github.com/brancz/kube-rbac-proxy is the repository that https://github.com/openshift/kube-rbac-proxy is forked from.

What do you think about fixing it / opening an issue on https://github.com/openshift/kube-rbac-proxy?

@phoracek do you agree? Anything I missed?
Thanks

@ashokpariya0 (Contributor Author) commented Oct 9, 2024

> Checked again. Regarding the auto bumper, I think in this specific case we don't need to update the bumper scripts, as this parameter is part of CNAO and not part of the components. So your PR is fine.

@oshoval Thank you for your feedback.

> What about all the other images? Some of them aren't multi-arch, right? We won't be able to maintain this right now, but we will prioritize it as a task for all the images once we can. If you plan to contribute and are able to give support if it breaks, we may be able to take your PR, thanks. Maybe we first need to map all the required changes across all repos and only then start fixing them one by one.

Yes, you are right. We have noticed that most of the CNI images do not support multiple architectures, and we are in the process of identifying the necessary changes. Our initial focus is to get the CNAO operator running on the s390x architecture (for this we need to change the kube-rbac-proxy image), and then we will address the other CNI images one by one, as you suggested. We are committed to actively contributing to this effort.

> Also, please consider that kube-rbac-proxy needs to be changed in https://github.com/k8snetworkplumbingwg/kubemacpool as well, so that KMP behaves the same whether it is installed standalone or via CNAO.

Okay, noted.

> Regarding the proposed image, it seems safe to use, as https://github.com/brancz/kube-rbac-proxy is the repository that https://github.com/openshift/kube-rbac-proxy is forked from.

Yes, that is correct.

> What do you think about fixing it / opening an issue on https://github.com/openshift/kube-rbac-proxy?

Sure, I will create the issue and follow up on it.

@oshoval (Collaborator) commented Oct 9, 2024

Thank you

> Yes, you are right. We have noticed that most of the CNI images do not support multiple architectures, and we are in the process of identifying the necessary changes. Our initial focus is to get the CNAO operator running on the s390x architecture (for this we need to change the kube-rbac-proxy image), and then we will address the other CNI images one by one, as you suggested. We are committed to actively contributing to this effort.

Just to make sure: I meant the other components that CNAO deploys; are we talking about the same thing?
(I believe we are.)

kubeMacPool
kubeSecondaryDNS
kubevirtIpamController
linuxBridge
macvtap
multus
multusDynamicNetworks
ovs

@ashokpariya0 (Contributor Author) commented Oct 9, 2024

> Thank you
>
> Just to make sure: I meant the other components that CNAO deploys; are we talking about the same thing?
>
> kubeMacPool
> kubeSecondaryDNS
> kubevirtIpamController
> linuxBridge
> macvtap
> multus
> multusDynamicNetworks
> ovs

Yes.
I tried all of the above CNIs; below are my findings (see the binary-check sketch after this list).
Multus: able to deploy with the latest image tag.
Multus Dynamic Networks Controller: we don't have a multi-arch image for this CNI plugin; we need to build one.
linuxBridge: the image is multi-arch, but there is a runtime issue with it (the marker binary is unavailable for s390x).
macvtap: the image is multi-arch, but the cp binary inside the image is not s390x compatible.
ovs: we don't have multi-arch images for this CNI plugin.
KubeSecondaryDNS: we don't have a multi-arch image for this CNI plugin.
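
A minimal sketch of how such per-binary checks can be reproduced; the image reference and binary path below are illustrative assumptions, not the exact artifacts from the findings above:

```sh
# Pull the s390x variant and extract a binary it ships without running it.
docker pull --platform linux/s390x quay.io/example/some-cni:latest
docker create --platform linux/s390x --name probe quay.io/example/some-cni:latest sh
docker cp probe:/usr/bin/cp ./cp-from-image
docker rm probe

# "file" reports e.g. "ELF 64-bit MSB executable, IBM S/390" for a genuine s390x
# build, and "x86-64" when an amd64 binary slipped into the s390x image.
file ./cp-from-image
```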

@oshoval (Collaborator) commented Oct 9, 2024

Amazing, thank you!
What about kubeMacPool / kubevirtIpamController as well, if possible, please?

@oshoval (Collaborator) commented Oct 9, 2024

Please keep in mind, for the changes being made, that we will also need ARM, so it is best if the changes are done in a robust manner (or at least strive to be).
There is no need to test on ARM, of course; just, if new builds are made, it would be nice if the scripts were robust.

@ashokpariya0 (Contributor Author)

> Amazing, thank you! What about kubeMacPool / kubevirtIpamController as well, if possible, please?

Regarding kubemacpool: the image is multi-arch; however, I see a runtime issue, as the manager binary inside the kubemacpool-cert-manager init container is not compatible with s390x.
kubevirtIpamController: looks fine.

@oshoval (Collaborator) commented Oct 9, 2024

Thanks a lot; we are here for whatever is needed.

@oshoval (Collaborator) commented Oct 9, 2024

Hi @zhlhahaha,
This might interest you regarding ARM, as some of the changes might affect it as well.

@oshoval (Collaborator) commented Oct 9, 2024

It might be worth writing a gist about how to emulate s390x, so whoever needs to sanity check or develop it can:
https://www.qemu.org/docs/master/system/target-s390x.html
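
As a starting point for such a gist: on an amd64 host, s390x containers can be run under QEMU user-mode emulation once binfmt_misc handlers are registered; the tonistiigi/binfmt image used below is one widely used option, not the only one:

```sh
# One-time, per host: register QEMU emulation for s390x via binfmt_misc.
docker run --privileged --rm tonistiigi/binfmt --install s390x

# Sanity check: an s390x container should now run (emulated) and report s390x.
docker run --rm --platform linux/s390x alpine uname -m
```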

@ashokpariya0 (Contributor Author)

> It might be worth writing a gist about how to emulate s390x, so whoever needs to sanity check or develop it can: https://www.qemu.org/docs/master/system/target-s390x.html

Sure, thanks!

@zhlhahaha

> Hi @zhlhahaha, this might interest you regarding ARM, as some of the changes might affect it as well.

Let me take a look; thanks for the reminder, @oshoval.

@ashokpariya0 (Contributor Author)

@oshoval Please check PR: #1917

@ashokpariya0 (Contributor Author)

@oshoval I created an issue on GitHub to enable multi-architecture image support for quay.io/openshift/origin-kube-rbac-proxy. You can find the issue here: Issue #113

@ashokpariya0 ashokpariya0 changed the title Enable kube-rbac-proxy image for s390x architecture. Enable multi-architecture support, specifically for s390x, in CNAO and the associated CNIs. Oct 11, 2024
@ashokpariya0 (Contributor Author)

@oshoval @RamLavi The last pushed upstream CNAO image (link) was 13 days ago. Since then, a couple of pull requests have been merged in the CNAO repository (specifically the last two). However, I haven't seen a new operator image published. Could you please help clarify the reason for this?

@oshoval (Collaborator) commented Oct 27, 2024

Hi, we don't release on every PR, only when there is a need; if you need a release, we can cut one.

@ashokpariya0 (Contributor Author)

@phoracek @oshoval @RamLavi
Please review the PR at #1923 when you get a chance.

@ashokpariya0 (Contributor Author)

@oshoval @phoracek could you please direct me to the GitHub repository where the image below is being prepared?
https://quay.io/repository/kubevirt/cni-default-plugins?tab=tags&tag=latest

@oshoval (Collaborator) commented Nov 21, 2024

hack/components/bump-linux-bridge.sh
By this script, I think, which runs in CNAO's GitHub Actions (the bump release action).

@ashokpariya0 (Contributor Author)

> hack/components/bump-linux-bridge.sh
> By this script, I think, which runs in CNAO's GitHub Actions (the bump release action).

Thanks, @oshoval, for the quick reply! I got my answer.

@oshoval (Collaborator) commented Nov 21, 2024

You mean who builds quay.io/openshift/origin-operator-registry:4.2.0?
I don't know offhand, sorry; if I find out, I will update here.

@ashokpariya0 (Contributor Author)

@RamLavi @phoracek @oshoval
I have submitted PRs for multi-platform/architecture support. Please review the following:

For Linux Bridge:
Bridge-Marker: PR #77
Linux-Bridge: PR #1927

For KubeMacPool:
PR #441

@oshoval (Collaborator) commented Nov 28, 2024

Thanks.
FYI, if you want to check this one please: k8snetworkplumbingwg/ovs-cni#336

@ashokpariya0 (Contributor Author)

> Thanks. FYI, if you want to check this one please: k8snetworkplumbingwg/ovs-cni#336

Sure.

@ashokpariya0 (Contributor Author)

> @oshoval Can we add the s390x platform alongside amd64 in the Makefile, similar to how it's done in the ipam-extensions Makefile? By default, it will only build for the amd64 architecture unless s390x is explicitly included. Additionally, we need to remember to export the PLATFORMS variable, like so: export PLATFORMS=linux/arm64,linux/s390x,linux/amd64. I think it would be a good idea to add s390x for now, and we can include other platforms later. I can also add a note to the developer workflow advising how to set or unset this platform configuration.

> Personally I prefer not to, at least not at this stage; let's first finish the first phase of adapting the components and the release lanes, and then we can consider it? I might be wrong here; best to ask for more opinions.

@oshoval Regarding multiplatform support, it's not ideal yet, but we can begin by adding support for the amd64 and s390x architectures. I've already completed testing for these, so once multiplatform support is integrated for the CNAO-supported CNIs and the release is finalized, we'll just need to update the image digest (a sketch of the PLATFORMS approach follows this comment).

Currently, the Multus CNI already supports multiplatform, so we can directly proceed with testing that.
I think it would be a great idea to start publishing multiplatform images for the CNAO operator starting with the next release. This would enable us to test upstream images on the s390x architecture as well.
What do you think? @phoracek @RamLavi

> By the way, what about auto detection? It can win both worlds: light and automatic.

Yes, this is a good suggestion, and I can submit a PR for it.
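
For illustration, the opt-in PLATFORMS approach discussed above could look roughly like this in a build script; this is only a sketch with a placeholder image tag, not the actual ipam-extensions Makefile logic:

```sh
# Default to amd64 only; opt in to more platforms via the environment, e.g.
#   export PLATFORMS=linux/arm64,linux/s390x,linux/amd64
PLATFORMS="${PLATFORMS:-linux/amd64}"

# Multi-platform builds cannot be loaded into the local daemon, so push directly.
docker buildx build \
    --platform "$PLATFORMS" \
    --tag registry.example.org/cnao/operator:dev \
    --push .
```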

@ashokpariya0 (Contributor Author)

> By the way, what about auto detection? It can win both worlds: light and automatic.

Added in PR: #1928
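
For context, this kind of auto detection usually boils down to mapping uname -m onto OCI platform names; a minimal sketch under that assumption (not the exact logic of #1928):

```sh
# Detect the host architecture so the default build targets the local machine.
case "$(uname -m)" in
  x86_64)  ARCH=amd64 ;;
  aarch64) ARCH=arm64 ;;
  s390x)   ARCH=s390x ;;
  ppc64le) ARCH=ppc64le ;;
  *) echo "unsupported architecture: $(uname -m)" >&2; exit 1 ;;
esac
PLATFORMS="${PLATFORMS:-linux/${ARCH}}"
echo "Building for ${PLATFORMS}"
```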

@LakshmiRavichandran1

@RamLavi @phoracek @oshoval,
Thank you for your support from the community in enabling the s390x efforts on CNAO and the related plugins.

  • I take this opportunity to stress the importance of the s390x PRs below from @ashokpariya0 and the urgency of getting them merged before the feature freeze on 02 Dec 2024 (this coming Monday).

  • Additionally, is it possible to get a new release created for each of the CNIs, so that we can update their image digests in the CNAO codebase before 02 Dec?

For your kind visibility, these are the PRs:

For Multus Dynamic:

k8snetworkplumbingwg/multus-dynamic-networks-controller#267

For Linux Bridge:

Bridge-Marker: kubevirt/bridge-marker#77
Linux-Bridge: #1927

For KubeMacPool:

k8snetworkplumbingwg/kubemacpool#441

@oshoval (Collaborator) commented Dec 1, 2024

Thanks for all your contributions.
Sorry, we won't be able to finish it by tomorrow; it needs to be reviewed and prioritized.
We are doing the best we can.

@oshoval (Collaborator) commented Dec 1, 2024

There are also these PRs by @jschintag:
kubevirt/kubesecondarydns#73
kubevirt/kubesecondarydns#75

Thanks

@oshoval (Collaborator) commented Dec 1, 2024

From my point of view, we must hold until #1923 (comment) is fully addressed.

Also, I am going to try CNAO locally; I might have an older Fedora and will need time to make sure it runs locally (or to upgrade, which will take time).
I am on other tasks as well, and won't be able to maintain CNAO otherwise.

@oshoval (Collaborator) commented Dec 1, 2024

About #1916 (comment):
I thought about it again; auto detection is fine, but compiling both is not. Our time is precious and builds should be fast.
Please take that into account in the mentioned PRs (unless already done).

(For completeness, you might want to benchmark with and without in the worst case; if the difference is negligible we can theoretically consider it, but I don't think it makes sense to force it.)

@ashokpariya0 (Contributor Author)

> About #1916 (comment): I thought about it again; auto detection is fine, but compiling both is not. Our time is precious and builds should be fast. Please take that into account in the mentioned PRs (unless already done).
>
> (For completeness, you might want to benchmark with and without in the worst case; if the difference is negligible we can theoretically consider it, but I don't think it makes sense to force it.)

I've noticed that multi-platform builds are much faster with Docker Buildx since it builds in parallel, whereas with Podman, it takes more time (although I can see ways to optimize that). Additionally, could we consider adding a GitHub Action to offload the builds to a CI/CD system for better performance in multi-platform scenarios?
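
For comparison, the two invocations being contrasted look roughly like this (the image name is a placeholder): Buildx fans the platforms out to parallel BuildKit workers, while Podman builds each platform sequentially and assembles a manifest list afterwards.

```sh
# Docker Buildx: platforms are built in parallel by BuildKit workers.
docker buildx build --platform linux/amd64,linux/s390x \
  -t registry.example.org/img:tag --push .

# Podman: per-platform builds run one after another into a manifest list,
# which is then pushed separately.
podman build --platform linux/amd64,linux/s390x \
  --manifest registry.example.org/img:tag .
podman manifest push registry.example.org/img:tag
```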

@oshoval (Collaborator) commented Dec 1, 2024

Hi. About GitHub Actions: not at the moment, please; we won't have time to work on that. Let's do the minimum first.
About compilation: we can benchmark and act accordingly once we can compare. IMHO auto detection is good enough in any case (please let me know why not if you think otherwise).

@oshoval (Collaborator) commented Dec 3, 2024

Here are all your PRs that aren't merged yet, so it will be easier to track;
if I missed anything or you would like to add new ones, please let me know.

Thanks a lot for your effort.

EDIT
Once we are done, this is the list of what we should release / verify:

  • kubeMacPool
  • kubeSecondaryDNS (released)
  • kubevirtIpamController
  • linuxBridge
  • macvtap
  • multus
  • multusDynamicNetworks (release was created)
  • OVS
  • Bridge marker
  • CNAO

@oshoval (Collaborator) commented Dec 3, 2024

We are still missing macvtap, in case you would like to address it as well, please.
Thanks

@ashokpariya0 (Contributor Author)

> We are still missing macvtap, in case you would like to address it as well, please. Thanks

Sure, noted. I will work on that.

@ashokpariya0 (Contributor Author)

@oshoval Once the PR #336 gets merged, I can submit the changes for s390x (which will be a small update). After that, we can request a new release for the OVS CNI.

@oshoval (Collaborator) commented Dec 3, 2024

> @oshoval Once the PR #336 gets merged, I can submit the changes for s390x (which will be a small update). After that, we can request a new release for the OVS CNI.

Awesome, thanks.
k8snetworkplumbingwg/ovs-cni#336 is merged.

@ashokpariya0 (Contributor Author)

> @oshoval Once the PR #336 gets merged, I can submit the changes for s390x (which will be a small update). After that, we can request a new release for the OVS CNI.
>
> Awesome, thanks. k8snetworkplumbingwg/ovs-cni#336 is merged.

I submitted an s390x support PR for OVS CNI: k8snetworkplumbingwg/ovs-cni#337

@oshoval (Collaborator) commented Dec 5, 2024

Thanks

Let's please also run docker / podman locally on the repos to make sure multi-arch / cross builds work with both,
at least for some of the combinations.

If help is needed, let's sync on Slack, thanks.

@ashokpariya0 (Contributor Author)

> Thanks
>
> Let's please also run docker / podman locally on the repos to make sure multi-arch / cross builds work with both, at least for some of the combinations.
>
> If help is needed, let's sync on Slack, thanks.

Thanks @oshoval
I've considered that, and here's what I do (see the sketch after this list):

  1. I build on amd64 and then test on s390x to validate cross-compilation.
  2. I also build directly on s390x and test on the same architecture to ensure everything works natively.
  3. Additionally, I test with both Docker and Podman to ensure compatibility across both container runtimes.
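
As a concrete illustration of item 1, for a Go-based component the cross-build plus verification step can be as small as this; a sketch only, and the package path is an assumption:

```sh
# Cross-compile on an amd64 host for s390x; the Go toolchain supports this natively.
GOOS=linux GOARCH=s390x CGO_ENABLED=0 go build -o bin/manager-s390x ./cmd/manager

# Confirm the binary really targets s390x before baking it into an image.
file bin/manager-s390x   # expect: ELF 64-bit MSB executable, IBM S/390
```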

@oshoval (Collaborator) commented Dec 5, 2024

Awesome, thanks a lot

@oshoval (Collaborator) commented Dec 5, 2024

Hi, I haven't yet managed to set up an end-to-end system that compiles and pushes everything in every combination,
and it seems I won't be able to focus on that, so I can't maintain it myself at the moment.
This means that if multi-arch problems arise, we might need to ask for your help;
will you be able to support that, please?
Our main effort will still be the stability of the main flow releases.
Thanks

@oshoval (Collaborator) commented Dec 5, 2024

/cc @jschintag
Since you helped in this effort, thanks.

@ashokpariya0 (Contributor Author)

> Hi, I haven't yet managed to set up an end-to-end system that compiles and pushes everything in every combination, and it seems I won't be able to focus on that, so I can't maintain it myself at the moment. This means that if multi-arch problems arise, we might need to ask for your help; will you be able to support that, please? Our main effort will still be the stability of the main flow releases. Thanks

@oshoval We're here to assist whenever you need extra hands with multi-arch or any related concerns. Just let us know, and we'll do our best to help!

@oshoval (Collaborator) commented Dec 5, 2024

Thank you very much for all the effort and cooperation.

@jschintag

No problem, thank you @oshoval for your time and help with this.
And as @ashokpariya0 said, if you have a multi-arch problem, feel free to ping us.

@oshoval (Collaborator) commented Dec 12, 2024

Can you please check that the CNAO release jobs themselves support and will use the "all" flag?

@ashokpariya0 (Contributor Author)

> Can you please check that the CNAO release jobs themselves support and will use the "all" flag?

Sure, could you please direct me to the Prow jobs responsible for the release?

@oshoval (Collaborator) commented Dec 12, 2024

Please look at the CNAO post-submit jobs under project-infra, or at the GitHub Actions.
If you don't manage, I will try to help.

@ashokpariya0 (Contributor Author)

> Can you please check that the CNAO release jobs themselves support and will use the "all" flag?

Done, PR link: #1956
