
Clean up llava deps and consolidate all HF deps #4320

Closed
wants to merge 1 commit

Conversation


@guangy10 guangy10 commented Jul 19, 2024

python3 -m examples.portable.scripts.export --model_name="llava" works fine on my Mac M1. No strict mode is needed, and it works directly with the most recent transformers lib.

It seems the old restrictions on numpy and transformers are no longer true. If CI gives the green light, we can clean up the stale setup for llava and use the root setup (executorch/install_requirements.sh) as the source of truth.
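For anyone reproducing this, a minimal sketch of the flow described above, assuming a checkout of the executorch repo (both commands are named in this PR):

```sh
# From the executorch repo root: install the consolidated root deps,
# which this PR proposes as the single source of truth.
./install_requirements.sh

# Export llava via the portable flow; no strict-mode workaround needed.
python3 -m examples.portable.scripts.export --model_name="llava"
```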


pytorch-bot bot commented Jul 19, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/4320

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 1 Cancelled Job

As of commit a89f1cf with merge base 39aeff9:

NEW FAILURE - The following job has failed:

CANCELLED JOB - The following job was cancelled. Please retry:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@guangy10 guangy10 requested a review from larryliu0820 July 19, 2024 22:35
@facebook-github-bot facebook-github-bot added the CLA Signed label Jul 19, 2024

```sh
# Newer transformers will give TypeError: LlavaLlamaForCausalLM.forward() got an unexpected keyword argument 'cache_position'
pip install timm==0.6.13
pip install transformers==4.38.2
```
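If anyone wants to re-check whether these pins are still needed, a quick probe along the lines of what the PR describes (the upgrade-then-export sequence is my suggestion, not a command from this thread):

```sh
# Install current HF releases and confirm the old cache_position
# TypeError no longer reproduces during export.
pip install --upgrade timm transformers
python3 -m examples.portable.scripts.export --model_name="llava"
```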
Contributor

This is removed because it's already in root level install_requirements.sh?

Contributor Author

The root script installs the latest versions. When I tried exporting llava locally, the latest HF versions actually worked well. Unless there are specific reasons to hold us back on these old versions, we should use the latest.

Contributor

@larryliu0820 larryliu0820 Jul 19, 2024

Can you fix test-export-llava-linux /

Contributor Author

Can you fix test-export-llava-linux /

I haven't had a chance to look into the failure yet, but I notice the error is thrown from torchao 0.1. @jerryzh168 Is this something you may know about? Also, we are still pinning torchao 0.1 even on main; should we bump it to a newer version? And what version do we want to use for the upcoming releases, including the Beta release?

Contributor Author

Locally I can export llava with XNNPACK without a problem using torch 2.5.0.dev20240716 + torchao 0.1.

Member (@msaroufim)

We are not supporting ao v0.1; please use the latest stable release, 0.3.1.

This error is strange, and I answered here: pytorch/ao#535 (comment). It's strange that triton is even installed on this machine if the expectation is that this is a CPU-only job.

torchao 0.3.1 has no dependencies, so it shouldn't even be installing a torch version.
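If it helps debugging, a quick way to see whether (and via what) triton landed on the runner — plain pip plus pipdeptree, which is my suggestion rather than anything this thread prescribes:

```sh
# Is triton installed at all, and at what version?
pip show triton

# Which installed package declared triton as a dependency?
pip install pipdeptree
pipdeptree --reverse --packages triton
```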

Contributor Author

@msaroufim It turns out that the third-party llava (very old code) pulled in triton and installed 2.1.1, which doesn't work with torchao.

Contributor Author

We are not supporting ao v0.1; please use the latest stable release, 0.3.1.

@msaroufim I prefer to address one issue at a time. We can update the torchao dep to 0.3.1.

torchao is not the only package with this issue. That's why I had to upgrade the HF packages and resolve all the dependency problems manually. I think @dbort has a proposal to use ">=" for deps in general, to make sure our packages don't get stuck on stale versions.
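A sketch of what the ">=" idea could look like in practice — the package names come from this thread, but the version floors are illustrative, not proposed pins:

```sh
# Floor-pin instead of hard-pin so deps can't get stuck on stale versions.
pip install "transformers>=4.38.2" "timm>=0.6.13" "torchao>=0.3.1"
```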

@guangy10 guangy10 force-pushed the cleanup_llava_deps branch from b3cf9ec to 3417c3b on July 23, 2024 01:14
@guangy10 guangy10 force-pushed the cleanup_llava_deps branch from 3417c3b to dba5c74 on July 24, 2024 19:35
@guangy10 guangy10 changed the title from "Clean up llava deps" to "Clean up llava deps and consolidate all HF deps" Jul 24, 2024
@guangy10 guangy10 requested a review from iseeyuan July 24, 2024 19:52
@facebook-github-bot
Contributor

@guangy10 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@guangy10 guangy10 force-pushed the cleanup_llava_deps branch from dba5c74 to 29b0fc8 on July 24, 2024 20:37
@guangy10
Contributor Author

It works now on my Linux devserver, but not in CI, due to OOM on the runner.

@guangy10
Contributor Author

@larryliu0820 could you help review this PR? I'd like to consolidate to one 🤗 transformers version for ET. That will make future model enablement and benchmarking much easier.

```sh
fi
# python3 -m examples.portable.scripts.export --model_name="llama2" should work too
"${PYTHON_EXECUTABLE}" -m examples.portable.scripts.export --model_name="${MODEL_NAME}" "${STRICT}"
```
Contributor

I don't think this works for Llava, does it?

Contributor Author

@larryliu0820 Yeah, the CI passes, and locally I tried the AOT a while ago and it worked fine. Do you recall what the issue was, AOT or runtime?

Contributor Author

@larryliu0820 It just works fine.

@larryliu0820
Contributor

@larryliu0820 could you help review this PR? I'd like to consolidate to one 🤗 transformers version for ET. That will make future model enablement and benchmarking much easier.

Can you please rebase? I just cleaned up the dependencies for llava.

@guangy10 guangy10 force-pushed the cleanup_llava_deps branch from 798c33f to bf4ee71 on August 15, 2024 18:44
@guangy10 guangy10 force-pushed the cleanup_llava_deps branch from bf4ee71 to a89f1cf on August 15, 2024 21:18
@facebook-github-bot
Contributor

@guangy10 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@guangy10 guangy10 closed this Sep 17, 2024
@guangy10 guangy10 deleted the cleanup_llava_deps branch September 17, 2024 22:20