feat: Add rosetta-maxtext #738
We did not run the tests for rosetta-maxtext yet, right? We should check the validity of the rosetta build.
@@ -0,0 +1 @@
local/
Is this a debug line? Does it need to be committed?

We can remove it. I found it convenient to have a local testing directory that isn't checked into git.
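If the `local/` entry is dropped from the committed ignore file, one option for keeping a private scratch directory out of `git status` is the per-clone exclude file `.git/info/exclude`, which git reads like a `.gitignore` but never commits. The following is a minimal sketch run in a throwaway repository so that it does not touch a real checkout; the file name `scratch.txt` is hypothetical and only there for illustration.

```shell
# Create a throwaway repository to demonstrate the behaviour
repo=$(mktemp -d)
git init -q "${repo}"
cd "${repo}"

# Ignore local/ for this clone only; .git/info/exclude is never committed
mkdir -p .git/info local
echo "local/" >> .git/info/exclude
touch local/scratch.txt   # hypothetical scratch file

# `git check-ignore -q` exits 0 when the path is ignored
if git check-ignore -q local/scratch.txt; then
  ignored=yes
else
  ignored=no
fi
echo "local/scratch.txt ignored: ${ignored}"
```

Whether this is preferable to committing the entry depends on how many contributors want the same scratch directory; the exclude file has to be set up once per clone.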
Why are we again building TE here? The base image should be
@@ -0,0 +1,75 @@
# syntax=docker/dockerfile:1-labs
ARG BASE_IMAGE=ghcr.io/nvidia/jax-mealkit:upstream-maxtext
ARG GIT_USER_EMAIL=[email protected]
ARG GIT_USER_NAME=NVIDIA
# If set to "true", then will pull new local patches, the manifest.yaml and create-distribution.sh (in case it was updated).
# This is useful for development if you run `./bump.sh -i manifest.yaml` manually and do not want to trigger a full rebuild all
# the way up to the jax build.
ARG UPDATE_PATCHES=false
# It is common for TE developers to test a different TE against the LLM application. This is a knob to override what's in the manifest
# Accepts git-ref's from NVIDIA/TransformerEngine or pull requests (pull/$number/head)
ARG UPDATED_TE_REF=""

# Rosetta and optionally patches are pulled from this
FROM scratch AS jax-toolbox

###############################################################################
### Download source and add auxiliary scripts
################################################################################

FROM ${BASE_IMAGE} AS mealkit
ARG GIT_USER_EMAIL
ARG GIT_USER_NAME
ARG UPDATE_PATCHES
ARG UPDATED_TE_REF

ENV ENABLE_TE=1

RUN --mount=target=/mnt/jax-toolbox,from=jax-toolbox <<"EOF" bash -exu
MANIFEST_DIR=$(dirname ${MANIFEST_FILE})
if [[ "${UPDATE_PATCHES}" != "true" && "${UPDATE_PATCHES}" != "false" ]]; then
  echo "UPDATE_PATCHES can only be true or false"
  exit 1
fi
if [[ "${UPDATE_PATCHES}" == "true" ]]; then
  cp -r /mnt/jax-toolbox/.github/container/patches ${MANIFEST_DIR}/
  cp /mnt/jax-toolbox/.github/container/manifest.yaml ${MANIFEST_DIR}/manifest.yaml
  cp /mnt/jax-toolbox/.github/container/create-distribution.sh ${MANIFEST_DIR}/create-distribution.sh
fi
cp -r /mnt/jax-toolbox/rosetta /opt/rosetta

if [[ -n "${UPDATED_TE_REF}" ]]; then
  TE_INSTALL_DIR=/opt/transformer-engine
  yq e ".transformer-engine.latest_verified_commit = \"${UPDATED_TE_REF}\"" -i $MANIFEST_FILE
  # Install from source instead of pre-built wheel
  sed -i -E 's@( file:///opt/transformer-engine)/dist/[^ ]*@\1@' /opt/pip-tools.d/requirements-te.in
  git -C $TE_INSTALL_DIR fetch -a
  if [[ "${UPDATED_TE_REF}" =~ ^pull/ ]]; then
    PR_ID=$(cut -d/ -f2 <<<"${UPDATED_TE_REF}")
    git -C $TE_INSTALL_DIR fetch origin ${UPDATED_TE_REF}:PR-${PR_ID}
    git -C $TE_INSTALL_DIR checkout PR-${PR_ID}
  else
    git -C $TE_INSTALL_DIR checkout ${UPDATED_TE_REF}
  fi
fi

# Setting the username/email is required to author commits from patches
git config --global user.email "${GIT_USER_EMAIL}"
git config --global user.name "${GIT_USER_NAME}"

bash ${MANIFEST_DIR}/create-distribution.sh \
  --manifest ${MANIFEST_FILE} \
  --package maxtext
# Remove .gitconfig to avoid end-user authoring commits as the "build user"
rm -f ~/.gitconfig
EOF

WORKDIR /opt/rosetta

###############################################################################
### Install accumulated packages from the base image and the previous stage
################################################################################

FROM mealkit as final

RUN pip-finalize.sh
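The `UPDATED_TE_REF` handling in the Dockerfile above accepts either a plain git ref or a pull-request ref of the form `pull/$number/head`. As a minimal sketch of just that branching, the standalone script below maps a ref to the name that would be handed to `git checkout`. The function name `te_checkout_target` is hypothetical and not part of this PR, and the sketch uses a POSIX `case` statement in place of the heredoc's `[[ ... =~ ^pull/ ]]` test so that it also runs under plain `sh`. No git commands are executed.

```shell
# Hypothetical helper (not part of the PR): prints the name that the
# TE override logic would check out for a given UPDATED_TE_REF value.
te_checkout_target() {
  ref="$1"
  case "${ref}" in
    pull/*)
      # pull/<number>/head: the second '/'-separated field is the PR number
      pr_id=$(printf '%s\n' "${ref}" | cut -d/ -f2)
      echo "PR-${pr_id}"
      ;;
    *)
      # Any other ref (branch, tag, commit SHA) is checked out as-is
      echo "${ref}"
      ;;
  esac
}

te_checkout_target "pull/123/head"
te_checkout_target "main"
```

In the Dockerfile, the value would be supplied at build time with `--build-arg UPDATED_TE_REF=...`; leaving it empty skips the TE override entirely, so the version pinned in the manifest is used.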
Just leaving a reminder that this can be cleaned up if not needed
I could not see the patch file in the repo. Are we using it?