Support batch deletion operation in S3 backend #135

Open
arestifo opened this issue Nov 10, 2022 · 1 comment
@arestifo

Is your feature request related to a problem? Please describe.
Currently, VFS only supports deleting individual files. Deleting more than a handful of files therefore takes one network round trip per file, which is inefficient, especially since S3 provides a native batch deletion operation.

We currently delete multiple files in a for loop, calling Delete on each one. This also wastes money, since we pay for each S3 request.

Is this something that is on the VFS team's radar?

Describe the solution you'd like
One option is to add a new function DeletePrefix(prefix string) that takes a prefix and deletes all objects matching that prefix in as few API calls as possible.

Another option would be DeleteMultiple(objects []string), which deletes all objects in the string slice objects, again using as few API calls as possible. This fits with the existing ListWithPrefix function; ideally, the output of that function could be passed directly into DeleteMultiple.
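
To make the second option concrete, here is a minimal sketch of how DeleteMultiple could split a key list into batches. It assumes the S3 DeleteObjects limit of 1000 keys per request. The batchDeleter interface, the memStore fake, and the maxKeysPerRequest constant are hypothetical names made up for illustration; none of them exist in VFS today, and a real implementation would call the S3 client instead of the in-memory fake.

```go
package main

import (
	"errors"
	"fmt"
)

// maxKeysPerRequest mirrors the S3 DeleteObjects limit of 1000 keys per call.
const maxKeysPerRequest = 1000

// batchDeleter is a hypothetical stand-in for a backend client that has a
// native batch delete. It is not part of the VFS API.
type batchDeleter interface {
	DeleteBatch(keys []string) error
}

// DeleteMultiple (the name proposed in this issue) deletes keys in as few
// batch calls as possible. It returns the number of batch calls attempted
// and stops at the first failing batch.
func DeleteMultiple(d batchDeleter, keys []string) (int, error) {
	calls := 0
	for start := 0; start < len(keys); start += maxKeysPerRequest {
		end := start + maxKeysPerRequest
		if end > len(keys) {
			end = len(keys)
		}
		calls++
		if err := d.DeleteBatch(keys[start:end]); err != nil {
			return calls, fmt.Errorf("batch %d: %w", calls, err)
		}
	}
	return calls, nil
}

// memStore is an in-memory fake used only to exercise the sketch.
type memStore struct {
	objects map[string]bool
}

// DeleteBatch rejects oversized batches, as a real batch endpoint would.
func (m *memStore) DeleteBatch(keys []string) error {
	if len(keys) > maxKeysPerRequest {
		return errors.New("too many keys in one request")
	}
	for _, k := range keys {
		delete(m.objects, k)
	}
	return nil
}

func main() {
	store := &memStore{objects: map[string]bool{}}
	keys := make([]string, 0, 2500)
	for i := 0; i < 2500; i++ {
		k := fmt.Sprintf("logs/%04d.txt", i)
		store.objects[k] = true
		keys = append(keys, k)
	}
	calls, err := DeleteMultiple(store, keys)
	fmt.Println("batch calls:", calls, "remaining:", len(store.objects), "err:", err)
}
```

With this shape, 2500 keys need three batch calls rather than 2500 individual Delete calls. A DeletePrefix(prefix string) could plausibly be layered on top by listing the prefix and handing the result to DeleteMultiple.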

Describe alternatives you've considered
Not including this feature

Additional context
N/A

@funkyshu
Member

funkyshu commented Nov 18, 2022

This is not something we've considered yet, but we've been looking into possibly deprecating the list functions (List, ListPrefix, ListRegex) in favor of an iterator pattern. location.ListIterator() would accept filter options and return an iterator. Potentially we could either pass a list iterator directly into a new function like location.Delete(LocationIterator, ...DeleteOption) or produce a list of relative vfs.File paths from it (or from some other known source) to pass into location.DeleteFiles(relpaths []string, ...DeleteOption). Of course, any backend that has a native batch delete would use it where possible. From my research, it looks like only S3 and Azure would benefit (strangely, GCS does not support this), but it's still worthwhile in my estimation.
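
As a rough illustration of the iterator idea, the sketch below drains a list iterator into fixed-size batches and hands each batch to a delete callback. Everything in it is an assumption made for discussion: the ListIterator interface, the sliceIterator helper, the WithBatchSize option, and the DeleteFromIterator function are hypothetical and do not reflect any existing or planned VFS signature.

```go
package main

import "fmt"

// ListIterator is a hypothetical iterator over relative file paths.
type ListIterator interface {
	Next() (path string, ok bool)
}

// sliceIterator is a trivial ListIterator backed by a slice, for demonstration.
type sliceIterator struct {
	paths []string
	pos   int
}

func (s *sliceIterator) Next() (string, bool) {
	if s.pos >= len(s.paths) {
		return "", false
	}
	p := s.paths[s.pos]
	s.pos++
	return p, true
}

// deleteConfig holds settings adjusted through DeleteOption values.
type deleteConfig struct {
	batchSize int
}

// DeleteOption is a functional option, mirroring the ...DeleteOption idea.
type DeleteOption func(*deleteConfig)

// WithBatchSize (hypothetical) overrides the default batch size; values
// below 1 are ignored.
func WithBatchSize(n int) DeleteOption {
	return func(c *deleteConfig) {
		if n > 0 {
			c.batchSize = n
		}
	}
}

// DeleteFromIterator drains it and calls deleteBatch once per batch.
// The batch slice is reused between calls, so deleteBatch must not retain it.
// It returns the number of paths passed to successful deleteBatch calls.
func DeleteFromIterator(it ListIterator, deleteBatch func([]string) error, opts ...DeleteOption) (int, error) {
	cfg := deleteConfig{batchSize: 1000}
	for _, opt := range opts {
		opt(&cfg)
	}
	batch := make([]string, 0, cfg.batchSize)
	deleted := 0
	flush := func() error {
		if len(batch) == 0 {
			return nil
		}
		if err := deleteBatch(batch); err != nil {
			return err
		}
		deleted += len(batch)
		batch = batch[:0]
		return nil
	}
	for {
		p, ok := it.Next()
		if !ok {
			break
		}
		batch = append(batch, p)
		if len(batch) == cfg.batchSize {
			if err := flush(); err != nil {
				return deleted, err
			}
		}
	}
	// Flush the final partial batch, if any.
	if err := flush(); err != nil {
		return deleted, err
	}
	return deleted, nil
}

func main() {
	it := &sliceIterator{paths: []string{"a.txt", "b.txt", "c.txt", "d.txt", "e.txt"}}
	var sizes []int
	n, err := DeleteFromIterator(it, func(b []string) error {
		sizes = append(sizes, len(b))
		return nil
	}, WithBatchSize(2))
	fmt.Println("deleted:", n, "batch sizes:", sizes, "err:", err)
}
```

A backend without a native batch delete could supply a callback that simply loops over the batch and deletes each file in turn, so callers would see the same interface either way.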

@funkyshu funkyshu self-assigned this Nov 18, 2022