Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: support append data file and add e2e test #9

Merged
merged 17 commits into from
Nov 20, 2024

Conversation

ZENOTME
Copy link
Collaborator

@ZENOTME ZENOTME commented Nov 15, 2024

upstream pr: apache#349

This PR:

  1. It adds the FastAppendAction to commit the data file
  2. It init the e2e test for write data file

@ZENOTME ZENOTME requested review from chenzl25 and xxchan November 15, 2024 03:25
@xxchan xxchan changed the base branch from main to dev-rebase-main-20241030 November 15, 2024 03:26
@ZENOTME
Copy link
Collaborator Author

ZENOTME commented Nov 15, 2024

Should we have another PR to rebase with the mainstream first? Or exclude them.

@xxchan
Copy link
Member

xxchan commented Nov 15, 2024

Better to drop upstream commits.

To rebase with upstream, we will need to force push or create a new branch

.filter(|entry| {
entry.has_added_files()
|| entry.has_existing_files()
|| entry.added_snapshot_id == snapshot_produce.snapshot_id
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In which cases will this branch be triggered?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. I ask this in the community and get the answer:

We've copied this from the Java code: https://github.com/apache/iceberg/blob/307593ffd99752b2d62cc91f4928285fc0c62b75/co[…]e/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java

I don't think this is used in PyIceberg today. In Java it is possible to add existing manifests to a commit, while in PyIceberg we only accept datafiles. This is on purpose since the ability to add manifests also comes with problems of its own.

Copy link
Collaborator

@chenzl25 chenzl25 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rest LGTM

@ZENOTME ZENOTME changed the base branch from dev-rebase-main-20241030 to dev-rebase-main-20241119 November 19, 2024 07:20
@ZENOTME ZENOTME requested a review from Li0k November 20, 2024 07:32
@chenzl25 chenzl25 merged commit 85076c7 into dev-rebase-main-20241119 Nov 20, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants