Census/Cleanup Non-Data Files #120

Open
ric-evans opened this issue Oct 28, 2021 · 3 comments
Labels
data fix modify existing persisted data

Comments

@ric-evans
Member

ric-evans commented Oct 28, 2021

The File Catalog (FC) contains entries that are not data files, such as .cxx files. How abundant are they? This mostly happens during bulk ingestion. If there are many, we should clean them up and add guardrails to prevent non-data files from being indexed in the future.
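
As a rough illustration of what such a guardrail could look like, here is a minimal sketch that rejects paths by file extension before they are indexed. Only `.cxx` comes from this issue; the other suffixes, the set name `NON_DATA_SUFFIXES`, and the function name `looks_like_non_data` are assumptions made up for the example, not part of the File Catalog code.

```python
from pathlib import PurePosixPath

# Hypothetical deny-list. Only ".cxx" is mentioned in this issue;
# the remaining suffixes are illustrative guesses at "non-data" files.
NON_DATA_SUFFIXES = frozenset({".cxx", ".h", ".py", ".o"})


def looks_like_non_data(path: str) -> bool:
    """Return True if the path's final suffix is on the deny-list.

    The comparison is case-insensitive, so "Foo.CXX" is caught as well.
    """
    return PurePosixPath(path).suffix.lower() in NON_DATA_SUFFIXES
```

A bulk ingestion script could call this check on each candidate path and skip (or log) anything it flags. A deny-list is only one option; an allow-list of known data extensions would be stricter but needs the full list of valid formats.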

@ric-evans ric-evans added the data fix modify existing persisted data label Oct 28, 2021
@blinkdog
Collaborator

blinkdog commented Nov 7, 2022

  • Fix File Catalog entries that don't begin with /data/ana, /data/exp, or /data/sim
    • Some files may start with /mnt/lfs7
    • Some files may start with /data/wipac
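
The prefix rule above could be checked with something like the following sketch. The three allowed roots are taken from the list; the names `ALLOWED_ROOTS` and `has_allowed_root` are hypothetical, and the sketch assumes each entry's path is available as a plain string.

```python
# Roots named in this comment; everything else is considered suspect.
ALLOWED_ROOTS = ("/data/ana", "/data/exp", "/data/sim")


def has_allowed_root(path: str) -> bool:
    """Return True if path is one of the allowed roots or lies beneath one.

    Matching on root + "/" avoids false positives such as
    "/data/analysis/...", which merely shares a string prefix.
    """
    return any(path == root or path.startswith(root + "/")
               for root in ALLOWED_ROOTS)
```

Entries for which this returns False, for example paths under /mnt/lfs7 or /data/wipac, would be the candidates for fixing.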

@blinkdog
Collaborator

blinkdog commented Nov 7, 2022

I have some more clean-up that's specific to LTA here: WIPACrepo/lta#236

@ric-evans
Member Author

If it's a prescriptive cleanup, it's doable. But crawling the FC (an actual census) is pretty much impossible with the current timeout values (mongo and/or REST, I'm unsure which). So we'd have to iterate through a dump, offline. @dsschult, your thoughts?
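
For the offline route, a census over a dump might look roughly like the sketch below. It assumes the dump is newline-delimited JSON with one record per line and that each record stores its path under a `logical_name` key; both the dump format and the field name are assumptions for illustration, as is the function name `census_by_suffix`.

```python
import json
from collections import Counter
from pathlib import PurePosixPath
from typing import Iterable


def census_by_suffix(lines: Iterable[str], field: str = "logical_name") -> Counter:
    """Tally records by lower-cased file suffix.

    `lines` is any iterable of JSON strings (e.g. an open dump file).
    The default field name is an assumption, not a confirmed schema.
    """
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the dump
        record = json.loads(line)
        suffix = PurePosixPath(record[field]).suffix.lower() or "<none>"
        counts[suffix] += 1
    return counts
```

Because it streams one line at a time, passing an open file handle keeps memory use flat regardless of dump size, and `counts.most_common()` would show which non-data extensions are actually abundant.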
