Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FileDataLoader fails to read the file when size > INT32_MAX #4435

Closed
wants to merge 2 commits into from

Conversation

cymbalrush
Copy link
Collaborator

@cymbalrush cymbalrush commented Jul 26, 2024

On macOS, the read function will fail with an EINVAL error if the size parameter exceeds INT32_MAX. This update addresses the issue by adding a check to ensure that the read size does not surpass INT32_MAX. On Linux, the maximum permissible read size is 2,147,479,552 bytes ( < INT32_MAX), so attempting to read beyond this limit is inconsequential.

Test Plan:

Exporting llama3 with python -m examples.models.llama2.export_llama --checkpoint examples/models/llama-2-7B/consolidated.00.pth --params examples/models/llama-2-7B/params.json --coreml --disable_dynamic_shape -kv

Without fix
Fails with invalid argument error.

With fix
Succeeds.

@cymbalrush cymbalrush requested review from shoumikhin and cccclai July 26, 2024 21:30
Copy link

pytorch-bot bot commented Jul 26, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/4435

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 Cancelled Job

As of commit 0a3116c with merge base dbf7d6e (image):

CANCELLED JOB - The following job was cancelled. Please retry:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jul 26, 2024
@dbort dbort self-requested a review July 26, 2024 22:10
@dbort dbort self-assigned this Jul 26, 2024
Copy link
Contributor

@dbort dbort left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much for this fix! One super-minor request but the code changes look great.

I would suggest adding a unit test for this, but I really don't want our test jobs creating and reading a >2GB file. But because there's no automated testing, please add a Test Plan: section to the PR description explaining how you tested that this works.

@@ -189,7 +191,12 @@ Result<FreeableBuffer> FileDataLoader::load(
size_t needed = size;
uint8_t* buf = reinterpret_cast<uint8_t*>(aligned_buffer);
while (needed > 0) {
ssize_t nread = ::read(fd_, buf, needed);
// read on macos would fail with EINVAL if size > INT32_MAX
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please begin comments with uppercase letters and end with a period, to be consistent with the rest of the file. Thanks for adding this comment, though, to explain why this extra logic is necessary, and where to focus testing if it changes.

Suggested change
// read on macos would fail with EINVAL if size > INT32_MAX
// Reads on macOS will fail with EINVAL if size > INT32_MAX.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you get a chance please update this comment, then we can land the PR

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@cymbalrush
Copy link
Collaborator Author

Thank you so much for this fix! One super-minor request but the code changes look great.

I would suggest adding a unit test for this, but I really don't want our test jobs creating and reading a >2GB file. But because there's no automated testing, please add a Test Plan: section to the PR description explaining how you tested that this works.

That's the reason for not adding test, I didn't want to create a 2GB file for unit testing this.

@facebook-github-bot
Copy link
Contributor

@shoumikhin has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@shoumikhin has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@shoumikhin merged this pull request in 1e4603d.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants