Update front-page readme with link to XLA flag doc #684

Open
wants to merge 3 commits into
base: main

terrykong
Contributor

Follow-up to #674 that links to the new document.

@terrykong terrykong requested review from nouiz and instinct79 April 4, 2024 23:30
@instinct79
Contributor

Should we delete all XLA flags from the front-page README and link to this document, and maybe to the scripts for PAX or T5X for benchmark-specific flags?

@nouiz
Collaborator

nouiz commented Apr 5, 2024

> Should we delete all XLA flags from the front-page README and link to this document, and maybe to the scripts for PAX or T5X for benchmark-specific flags?

We need to tell users which flags are enabled in the container. But maybe we can just list them and point to that page for their descriptions?

@instinct79
Contributor

instinct79 commented Apr 5, 2024

I think we should point to our scripts that enable these flags. Otherwise, the documentation is duplicated. Perhaps the environment variables can stay here.

@terrykong
Contributor Author

Since the flags in there can change over time, what about including instructions to view them instead? Example:

docker run --rm quay.io/skopeo/stable inspect docker://ghcr.io/nvidia/jax:upstream-pax | jq -r '.Env[]' | grep '^XLA_FLAGS='

# which returns

XLA_FLAGS= --xla_gpu_enable_latency_hiding_scheduler=true --xla_gpu_enable_async_all_gather=true --xla_gpu_enable_async_reduce_scatter=true --xla_gpu_enable_triton_gemm=false
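The single-line value above is hard to scan. As a minimal sketch, the line can be split into one flag per line with standard tools. The helper name `list_xla_flags` is hypothetical, and the sample string is copied from the output above rather than fetched from the registry, so the snippet runs without Docker or network access.

```shell
# Sample line copied from the thread; in practice it would come from the
# skopeo | jq | grep pipeline shown above.
sample='XLA_FLAGS= --xla_gpu_enable_latency_hiding_scheduler=true --xla_gpu_enable_async_all_gather=true --xla_gpu_enable_async_reduce_scatter=true --xla_gpu_enable_triton_gemm=false'

# list_xla_flags (hypothetical helper): strip the leading "XLA_FLAGS=",
# turn each run of spaces into a newline, and drop empty lines.
list_xla_flags() {
  printf '%s\n' "${1#XLA_FLAGS=}" | tr -s ' ' '\n' | sed '/^$/d'
}

list_xla_flags "$sample"
```

Piping the real command's output through the same `tr` and `sed` steps should give the same one-flag-per-line listing, assuming the flags themselves contain no spaces.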

@instinct79
Contributor

That's a good idea. Let's do that.

instructions for how to inspect them remotely
@yhtang
Collaborator

yhtang commented Apr 5, 2024

> Since the flags in there can change over time, what about including instructions to view them instead? Example:
>
> docker run --rm quay.io/skopeo/stable inspect docker://ghcr.io/nvidia/jax:upstream-pax | jq -r '.Env[]' | grep '^XLA_FLAGS='
>
> # which returns
>
> XLA_FLAGS= --xla_gpu_enable_latency_hiding_scheduler=true --xla_gpu_enable_async_all_gather=true --xla_gpu_enable_async_reduce_scatter=true --xla_gpu_enable_triton_gemm=false

I would consider that a less readable approach, since readers have to break out of their mental flow and run a script to obtain the flags.

A more user-friendly but also much heavier solution is to set up a proper docs building process (e.g. readthedocs) so that the docs page is automatically populated with the flags set in the containers.
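To illustrate what such an automated step might emit, here is a rough sketch that turns an `XLA_FLAGS` line into a Markdown table. The function name `flags_to_markdown` and the table layout are assumptions made for illustration, not part of any existing docs pipeline in this repository. The sample input is the value quoted earlier in this thread.

```shell
# Sample input copied from the thread; a docs build would instead read it
# from the container image metadata.
sample='XLA_FLAGS= --xla_gpu_enable_latency_hiding_scheduler=true --xla_gpu_enable_async_all_gather=true --xla_gpu_enable_async_reduce_scatter=true --xla_gpu_enable_triton_gemm=false'

# flags_to_markdown (hypothetical helper): print a two-column Markdown table
# with one row per flag. Assumes every flag has the form --name=value.
flags_to_markdown() {
  printf '| Flag | Value |\n'
  printf '| --- | --- |\n'
  printf '%s\n' "${1#XLA_FLAGS=}" \
    | tr -s ' ' '\n' \
    | awk -F= 'NF { printf "| `%s` | %s |\n", $1, $2 }'
}

flags_to_markdown "$sample"
```

A docs build could redirect this output into a generated page, so the published list follows whatever the container actually sets.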

@instinct79
Contributor

Maybe provide sample output from one of our latest containers for now? The flags are not going to change that rapidly, and this will at least give users some idea of what is enabled for our benchmarks.

@yhtang
Collaborator

yhtang commented Apr 10, 2024

> Maybe provide sample output from one of our latest containers for now? The flags are not going to change that rapidly, and this will at least give users some idea of what is enabled for our benchmarks.

Good point. @terrykong what do you think?

@terrykong
Contributor Author

Done 👍

@terrykong terrykong requested a review from yhtang April 24, 2024 15:46

4 participants