Migrate ILLiad OCR Process to use hosted Illiad #6412

Open
1 of 10 tasks
tpendragon opened this issue May 24, 2024 · 3 comments · May be fixed by #6576
Comments

@tpendragon
Contributor

tpendragon commented May 24, 2024

Summary

The library will be switching to hosted Illiad in August. @kevinreiss says we will need to do two things:

  1. Change the folder that Figgy's FileWatcher looks at for new ingests - it'll be in a different place.
  2. Upload the resulting OCR'd PDFs via SFTP to hosted Illiad, instead of moving them to a folder.

Acceptance Criteria

  • There's a feature flipper to switch to the new Illiad.
  • When set to hosted Illiad, a new file added to \\lib-fileshare.princeton.edu\ILL_OCR_SCANS (/mnt/hosted_illiad/ILL_OCR_SCANS) gets processed by PdfOcrJob.
  • When set to hosted Illiad, the PdfOcrJob uploads the resulting PDF to the hosted Illiad SFTP location (you'll have to connect to this and figure out the exact path)
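
A minimal sketch of how the criteria above could hang together, assuming the feature flipper boils down to a single boolean. The method name `illiad_settings`, the `:sftp` and `:move_to_folder` symbols, and the `legacy_watch_directory` parameter are hypothetical and are not taken from Figgy's code; only the hosted mount path comes from this issue.

```ruby
# Hypothetical helper: picks the watch directory and delivery method
# from the state of the feature flipper (passed in as `hosted`).
# The legacy directory is supplied by the caller because this issue
# does not state where the current share is mounted.
def illiad_settings(hosted:, legacy_watch_directory:)
  if hosted
    # Mount path taken from the acceptance criteria above.
    { watch_directory: "/mnt/hosted_illiad/ILL_OCR_SCANS", delivery: :sftp }
  else
    { watch_directory: legacy_watch_directory, delivery: :move_to_folder }
  end
end
```

With something like this, the file watcher and PdfOcrJob could both read from one place, so flipping the flag changes the source directory and the delivery step together.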

Research Needed

  • What should happen in staging - what happens in staging right now?
  • What's the SFTP path exactly?

Impact

We're switching to hosted ILLiad in early 2025. We need to have this ready to go before then.

Priority recommendation

  • asap
  • within the next 3 weeks
  • PO will prioritize

Blockers

  • What is the folder we should pull files from?
    • \\lib-fileshare.princeton.edu\ILL_OCR_Scans - the permissions are the same as the existing illiad share.
  • What's the SFTP information?
    • LastPass, Hosted ILLIad Environment Details
@tpendragon
Contributor Author

Note:

Their SFTP server has IP restrictions, so you have to tunnel traffic. To do so:

ssh -L 2222:princeton.illiad.oclc.org:222 pulsys@figgy-web-staging2

Then in FileZilla (or however you want to connect to sftp), you connect to localhost:2222, using the correct auth info.

@tpendragon
Contributor Author

@kevinreiss I've no idea where to put these things. Is it this uploads directory? I don't know how Illiad works. It used to be an images directory.

@VickieKarasic

@tpendragon here is ILLiad's response - let me know if this makes sense to you:

We utilize the ILLiad default location for PDF files that get delivered to patrons:

c:\inetpub\wwwroot\illiad\pdf\

And thus the URL to access them would look like this:

https://princeton.illiad.oclc.org/illiad/pdf/[TN#].pdf

And when you are accessing your OCLC hosted ILLiad web server via SFTP, there is an additional folder due to the storage file system we utilize:

/cygdrive/c/Inetpub/wwwroot/illiad/pdf

During the actual migration, we can add the PDF files to the correct location.
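
To make the mapping in that response concrete, here is a rough sketch of how an upload path and patron URL could be derived from a transaction number. The helper name `hosted_illiad_locations` is hypothetical, and it assumes the file is named `<TN#>.pdf` in the SFTP folder just as it is in the URL, and that transaction numbers are digits only; neither assumption is confirmed in this thread.

```ruby
# Hypothetical helper: maps a transaction number to the SFTP upload path
# and the patron-facing URL. The directory and base URL defaults are the
# ones quoted in ILLiad's response.
def hosted_illiad_locations(transaction_number,
                            sftp_directory: "/cygdrive/c/Inetpub/wwwroot/illiad/pdf",
                            base_url: "https://princeton.illiad.oclc.org/illiad/pdf")
  tn = transaction_number.to_s
  # Assumption: transaction numbers are purely numeric. Rejecting anything
  # else also keeps path separators out of the remote filename.
  raise ArgumentError, "transaction number must be digits only" unless tn.match?(/\A\d+\z/)

  filename = "#{tn}.pdf"
  { sftp_path: "#{sftp_directory}/#{filename}", url: "#{base_url}/#{filename}" }
end
```

For example, `hosted_illiad_locations(12345)[:sftp_path]` gives `/cygdrive/c/Inetpub/wwwroot/illiad/pdf/12345.pdf`.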
