-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal to publish multi-arch buildpacks #288
base: main
Are you sure you want to change the base?
Proposal to publish multi-arch buildpacks #288
Conversation
I haven't really been following the existing arm64 work in the cloud native buildpacks and paketo communities. One thing I am wondering about in the context of this RFC is what the longer-term picture is? Is this RFC a stepping stone towards |
|
||
My proposal is to update jam to create buildpacks for amd64 and arm64. I haven't used jam yet, so I don't have specific code changes for it and would appreciate thoughts on how this could be added. | ||
|
||
These can then be pushed using skopeo with the above mentioned architecture-specific tags. Finally some `docker manifest` commands can be used to create the manifest list image and then push it to the registry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably can't use skopeo
to publish multi-architecture images - see this PR to jam which adds support for multi-arch stacks to jam
directly. We would probably have to do something similar for multi-arch buildpacks - i.e. jam publish-buildpack
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
|
||
As discussed [here](https://github.com/actions/runner-images/issues/5631), github actions currently only supports amd64 runners. This means that unit and integration tests cannot be run on arm64. | ||
|
||
There are two ways to approach this issue. The first is to only run tests for amd64 and label all arm64 artifacts as experimental. The second is to use self-hosted or third-party arm64 runners so everything can be tested on both architectures. Given that paketo bulidpacks are used for production workloads, I think having arm64 runners will be necessary. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can do both - start with experimental support for arm images for buildpacks that don't require dependencies compiled for arm64 (see my comment above) while we work on getting arm64 runners.
For buildpacks that don't have dependencies that need compiling, we should be able to cross-compile the buildpacks themselves and drop-in arm64 dependencies from upstream, and distribute them as "experimental" before we are truly able to test them.
That seems like a helpful path forward - the community gets easy access to arm64 buildpacks and we get feedback on them, while we work on getting the necessary prerequisites (i.e. arm64 runners) to be confident in labeling the buildpacks production ready.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to start this endeavour with an experimental flag at first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should absolutely cross compile the Buildpacks. I have been doing this in the Arm64 scripts I have and it works perfectly.
We do have access to arm runners through BuildJet but we should reserve these for things where it is absolutely necessary like compiling dependencies. They are not free like the standard GitHub runners.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also don't think we need to run all of the test suites on both architectures or use an experimental label. There's very little that is architecture specific about a buildpack. It is really just the code that installs dependencies. That code can be mocked out in tests so we can be confident functionality works correctly regardless of the architecture.
We could potentially look at a small suite of smoke tests for both architectures that we run before release of composites or possibly using some sort of test flags to only run a subset of tests on Arm64. I think we should start without doing this though and see how things go, there will be a period of experimental/beta support for Arm64 as we develop it. We can just watch to see if any platform specific issues crop up and react accordingly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly I'm mostly concerned about the integration / end-to-end testing of the arm64 compiled dependencies. We already run into issues just with the amd64 compiled dependencies and we have fairly decent test suites for those. I would not be confident calling the arm64-compiled-dependency buildpacks production-ready without integration testing them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we're on the same page, but just trying to clarify how I'm thinking about this more.
I think we should treat buildpacks and dependencies as two different things. At the layer of buildpacks, there's an interface between the buildpack and the dependency. It expects files in a certain layout and it expects certain executables to exist. As long as the contract is the same per arch, which it generally should be, then testing buildpacks once should be enough. Even if it's not, you can very like mock or stub those differences and the arch on which those tests runs is irrelevant.
I agree that for the cases where we build and publish dependencies that we should do more testing on those dependencies. If we compile from source we should run test suites that provide some basic level of fitness on those binaries. That's going to require arm runners.
If we download upstream binaries, like with Java or Go, I don't think we need to do that. IMO, that's out of our scope. The publisher of those binaries owns that responsibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes and no. For the buildpacks that don't use libpak
(i.e. everything except java/utilities), in order to get the current level of integration test coverage and confidence we have for amd64, we have to get each and every buildpack working on arm64. The integration tests for each buildpack use the buildpack before it. For example, the integration tests for pip-install
run a pack build
with the pip-install
buildpack but also with cpython
and pip
. Sure, we could just unit-test the buildpack on arm64, but I don't feel comfortable with that reduced level of support.
If you have a different level of comfort for the java buildpacks - which would make sense because you don't really have integration tests for the java buildpacks in the same way as the other Paketo buildpacks - then I'd understand that, but I don't think we can extrapolate that to become a general assertion for all Paketo buildpacks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm OK with agreeing to disagree on this point, I think we can do that and still move forward with this effort. Each language family team needs to make the cost/effort versus benefit evaluation and act accordingly. I'm confident each team will make the right choices for their user bases, so perhaps we can put the following takeaways into the RFC:
- The project has ARM64 runners for teams to use as needed.
- Each language team will need to decide what "production-ready" means. Things for each team to consider: confidence in the code, automated release for timely updates, and feature parity with x86.
Anything else come to mind as a good guideline?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds perfectly reasonable. I don't think I have much else to add to the takeaways.
@loewenstein the current work upstream is adding a lower level feature like Down the road that will be more automatic. The normal In the meantime Paketo will have to fill in the gaps in our pipelines. Maybe with some additional tooling too. That's what we're doing with the stack builds now, enhancing jam to publish a multi arch image for the stack. |
@jpena-r7 Thanks so much for writing this up. Excellent to pull all of this information together and help get everyone focused on this topic. |
|
||
I haven't looked at github workflows for all buildpacks, but I believe [jam](https://github.com/paketo-buildpacks/jam) is being used to package buildpacks. I did see skopeo being used to push images ([cpython](https://github.com/paketo-buildpacks/cpython/blob/main/.github/workflows/push-buildpackage.yml#L62)). | ||
|
||
My proposal is to update jam to create buildpacks for amd64 and arm64. I haven't used jam yet, so I don't have specific code changes for it and would appreciate thoughts on how this could be added. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I'm thinking more about this, it's not obvious to me what, if any, changes are required to jam pack
to support building multi-arch buildpacks. jam pack
doesn't compile the buildpack; it just packages it. It also doesn't create OCI images - pack buildpack package
does this.
Expanding on this in more detail:
The current workflow to create a buildpack is:
- Compile the
build
andrun
binaries (e.g.go build
). - Run either
jam pack
(forpackit
buildpacks) orpackager
(forlibbuildpack
buildpacks). This primarily creates a tgz of the buildpack files. Forpackit
buildpacks, this also does some extra steps like setting the buildpack version and optionally downloading dependencies. I'm not sure what other stepspackager
takes. pack buildpack package
to turn thattgz
into an OCI image.
Currently, jam pack
creates a single tarball of a single buildpack with no concept of multi-architecture. You can optionally filter dependencies by --stack
to only include the stack-specific dependencies in the resultant metadata (and downloaded dependencies), but given that stacks can be multi-architecture, this isn't really sufficient to filter dependencies by architecture.
It's not really obvious to me what it would mean for jam pack
to support packaging up a buildpack in a multi-architecture way. It doesn't compile the binaries, and it doesn't produce OCI images. Filtering the dependencies by architecture would be an optimization to reduce metadata (and resultant buildpack size if downloading dependencies), but it's not a necessary step for creating the buildpack.
I suppose you could make modifications to jam pack
to support packaging multiple versions of build
and run
for each target architecture and create a wrapper around them to invoke the correct one at run-time. But it seems just as easy to invoke jam pack
multiple times - once per architecture - to create architecture-specific tarballs.
I don't know if pack buildpack package
already supports multi-arch images. I took a quick look at the code, and it only seems to be aware of platform
, which only contains OS
(i.e. linux
or windows
). This leads me to think either we'd have to make changes to pack
, or we'd have to create separate docker images for each architecture, and bundle them together. Perhaps that's what docker buildx imagetools create
does - I'm not sure. Perhaps @dmikusa or @dashaun have more info on how to build multi-arch buildpacks in a single manifest?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are correct that pack
does not support multi-arch yet, and the new changes being introduced will allow users to create manifest images after they have already been built.
This is an example of creating multi-arch builders. It creates and publishes arch-specific tags, but it is meant for demonstration purposes and testing so it gives an idea.
Creating multi-arch buildpacks should be simpler and may not even require a local registry because no other images are pulled in. That should allow you to create the arch-specific buildpack images locally and then create a manifest from there and push the manifest.
Let's say you compile a buildpack for amd64 in one folder amd64
and arm64 in another folder arm64
. You then create local buildpack images for each arch such as mybuildpack:0.0.1-amd64
and mybuildpack:0.0.1-arm64
. Then you can just create and push a manifest image as shown below and it would work on both architectures.
docker buildx imagetools create -t mybuildpack:0.0.1 mybuildpack:0.0.1-amd64 mybuildpack:0.0.1-amd64
There is one issue that needs to be considered though. At this time pack seems to set the architecture to amd64
. I had to work around this issue when creating the multi-arch builder here, but I just wanted to call this out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that is exactly what I was thinking. I don't think we need to make any changes to jam pack
to achieve this.
It would look something like the following:
-
for arch in [amd64, arm64]
- compile buildpack executables for target architecture via
go build
; put them in arch-specific temp directory - package buildpack with
jam pack
into arch-specific temp directory - create arch-specific OCI image via
pack buildpack package
, passing in contents of arch-specific temp directory - upload image to local docker daemon/container-runtime/engine (whatever the correct terminology is here) with arch-specific tag
- compile buildpack executables for target architecture via
-
create single multi-arch image via
docker builds imagetools create
and the arch-specific tagged images previously uploaded to local daemon -
publish single image to relevant registries
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My plan was to not actually publish the intermediate images either. Have them live in the Docker daemon, then make the multi-arch index image and publish that. We push to the docker daemon by default, so that's easy. If you're not pushing to it already you can combine multiple docker image archive files into an index image pretty easily (kind of similar to that PR I send to jam recently, but that would need to be done in code. I'm not sure it would be any more efficient, so we'll be going the daemon route.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More of an implementation detail, not sure it needs to go into the RFC, but skopeo
won't work with multi-arch images in OCI image files which get used in a number of places so jam
may need some tooling for that. Not sure if the commands I added for publishing the stacks will work with buildpacks as well? Maybe 🤔
On the pipeline-builder side, we use crane
, and has similar shortcomings when it comes to multi-arch images in OCI image files. We may need a similar tool, although I'm not sure because we don't use OCI image files as much. All in all, it could be a good chance for both pipelines to remove some 3rd party tools and rally around jam
as our solution though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More thoughts. Pack is adding a manifest
command for generating manifest images. This is part of their first step towards multi-arch support. If it is ready in time, it could be something we utilize instead of docker
. Not sure what folks think about this?
Right now binaries are created on/for amd64 and then packaged up into buildpacks using pack. It should be easy to create binaries for arm64 which can then be packaged up into buildpacks using pack. These architecture-specific buildpacks can be pushed to a registry with a platform specific tag as shown in the examples below. | ||
|
||
* `paketobuildpacks/ca-certificates:3.6-amd64` | ||
* `paketobuildpacks/ca-certificates:3.6-arm64` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I'm reading about docker buildx
, a lot of the examples seem to push a single image with multiple tags from a single Dockerfile
. It looks like docker buildx imagetools create
does the same thing but with multiple source images that already exist on a registry.
Is it possible to avoid the intermediate step of publishing buildpacks to the :<version>-<arch>
tag? It seems unnecessary once we've created a single manifest/image for all platforms. Could we create the images on a local registry and just push the single multi-arch image?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that pushing the arch-specific tags is messy and suspect it could be avoided with a local registry. Since creating buildpacks doesn't requiring pulling other images, the entire workflow should be able to run on amd64 (github runner).
I'll do some testing to validate this when I get a chance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Circling back on this, what's the outcome? Can we go for publishing the single multi-arch image?
I plan to start contributing from @jericop going forward so I signed the CLA. Sorry for the confusion. |
[This](https://github.com/buildpacks/rfcs/blob/main/text/0096-remove-stacks-mixins.md) rfc captures a lot of the concepts implemented in the spec changes. The following [PR](https://github.com/buildpacks/spec/pull/336/files#) to the buildpack spec introduces two new environment variables that will be set by the lifecycle during the detect and build phases, `CNB_TARGET_OS` and `CNB_TARGET_ARCH`. These changes are available in version `0.12` of the platform api as outlined [here](https://github.com/buildpacks/spec/pull/335). These changes were partially implemented in [this](https://github.com/buildpacks/lifecycle/issues/743) issue, but they have not yet made into pack ([here](https://github.com/buildpacks/pack/blob/v0.30.0-pre1/builder/config_reader.go)). | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should acknowledge stack removal but mark it as out of scope for this RFC, and just target the existing stack-based implementation. There's a whole ton of work that needs to happen for stack removal and some of it is just going to take time as we transition from one to the other.
I think if we focus on the stack-based implementation we have now, a.) we can simplify this RFC, b.) we can start work immediately on it, and c.) we can still implement because we just assume that a stack will have the require arch or the build wouldn't even run (I believe pack build
would fail cause it couldn't download the right stack images, or it would fallback to amd64 images if on a Mac which permits both).
Once all upstream changes to pack have been fully implemented and the newest version of pack and the lifecycle are being used the values in `CNB_TARGET_OS` and `CNB_TARGET_ARCH` will be set by paketo in the builder config toml. This means the changes I'm proposing to `libpak` and `packit` below can just read those environment variables and match any dependencies to the appropriate os and architecture. | ||
|
||
Since there are a lot of moving pieces to this project, I suggest the change from using stacks to targets (run and build images) be done separately. This would mean that the `CNB_TARGET_OS` and `CNB_TARGET_ARCH` environment variables would need to be set in the function/method that reads the buildpack. At build time buildpacks should attempt to read those environment variables if set, but otherwise fall back to determining os and architecture from another source such as `GOOS` and `GOARCH` environment variables. This should ultimately match whatever is used to set those values in the builder too. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now, I think we should just skip the CNB_TARGET_*
stuff and use GOARCH
to match the right architecture for dependencies where it is required. This should be reliable as the stack that gets picked will have the right architecture. Then the build will use GOARCH
and install deps for that arch.
I also think getting into things like GOOS
in this PR is probably not necessary either and it can be more complicated than just the OS. There are a handful of other CNB_TARGET_*
variables that can be used for matching compatibility with a binary. All of that may need to be taken into consideration, depending on the binaries being installed. For now, I think we are best to bury our heads in the sand and just stick with the stack approach. When we do the work for removing stacks, then we can discuss how to add in the metadata required for matching, which I think is going to have to include some sort of wildcard/pattern.
One thing that might be worth discussing/adding in regards to GOARCH
is what to do for a non-arch-specific dependency. I do not believe GOARCH
has a value defined for that, so we might need to define one (or the lack of one). Not sure if folks have a preference, we could use *
or say that the lack of an architecture value on the dependency means it is not dependent on the architecture? or something else? Thoughts on that?
|
||
The upstream pack project is already creating multi-arch images for pack and the lifecycle using `docker buildx`, which creates manifest list images. An alternative approach is to create and publish separate images that are tagged by architecture. This would be the steps I mentioned above without the final manifest list image. I think this is an anti-pattern at this point given that multi-arch images are already a norm. I also suspect end users would end up creating their own manifest list images anyway. I think the biggest driver for this is to avoid end users cloning and rebuilding paketo buildpacks. | ||
|
||
## Implementation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that this RFC is missing information about stacks and builders. Both are really necessary to provide first-class support for Paketo users. I'd be OK calling it out-of-scope, if we don't want this RFC to get bigger, but I still think we should mention it because it's part of the overall picture to say that we support arm64.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great point and I completely agree. I picked up where you left off on multi-arch stacks and will then look at builders. Once we have those in place I think we can move forward with this RFC.
`Os` and `Architecture` string fields can be added to the [`Dependency`](https://github.com/paketo-buildpacks/packit/blob/v2/postal/buildpack.go#L15) struct in `packit/postal` to allow buildpacks to interrogate and use the new metadata when determining what external dependencies to download. This will allow `buildpack.toml` files to be updated to add in the `os` and `architecture` metadata for all dependencies. They would all be set to something like `os=linux` and `architecture=amd64` initially. | ||
|
||
Some logic can be added to the [Resolve](https://github.com/paketo-buildpacks/packit/blob/18202009038b0df285ba0fb7d8b43abbf60d3ed0/postal/service.go#L89) method of `packit/postal` to pick a version that matches the `GOOS` and `GOARCH` environment variables whenever the `os` and `architecture` fields have been specified. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per the working group today (3/5), will this section be removed/modified? Also the Changes to buildpack.toml files
section? Should we allow this to be specified in #297 maybe, and just allude that some type of buildpack.toml
fields will be added and supported in packit/libpak as a result?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe the packit-specific code recommendations can be captured in an issue also
With work being in progress already to fulfill this not yet accepted RFC... Should we make sure this actually get's merged as a basis of that ongoing work? WDYT @paketo-buildpacks/steering-committee? |
I'm unsure of what we actually need to put into an RFC. I could see us putting together an RFC that signals the intent to publish multi-arch images across Paketo. It would really just be a signal to users that we intend to do this across the whole project. IMO, it's probably better to just keep writing blog posts about our progress. Those are just as easy to site when folks ask questions about multi-arch support. I don't think there's any technical coordination that needs to happen to publish multi-arch images though. Upstream has made tooling available to publish multi-arch images. We've done it for a number of buildpacks and it works fine. I don't see any reason why folks would not use that functionality in their implementations, but I suppose we could include that in an RFC. Maybe there are some details with the stacks/builders that would be helpful to work out in an RFC. I would defer to that team though. From what I've observed, they're already working on it and I don't want to add a process if it's not needed. If everyone is in agreement on how we proceed, then we probably don't need an RFC and we just focus on the work. That said, happy to help push an RFC forward if folks think it's helpful. Just let me know. |
Sure. All I am saying is that this is a draft PR for an RFC - so either we get it merged in one shape or another or we close it with a comment that it doesn't need an RFC. Either way, I would suggest we at least create a tracking issue for all the work involved, similarly to the one I created for adopting Ubuntu Noble. That way we can ping all maintainers and track overall progress in a central place. |
@loewenstein +1 to both those points. I'd prefer we close this RFC-draft. Then use the issue to track the works that's been done and what's still to be completed. |
If anyone objects to closing this RFC-draft. Please let us know, or we'll close this in one week. |
Readable
Checklist