Use OCR to support structuring in metadata editor #5476
Comments
Very good idea! I strongly support this feature request.
An implementation project for the integration of OCR-D and Kitodo is currently underway: https://github.com/slub/ocrd_kitodo. Automated structuring came up in our discussions there, but for various reasons it is not a main priority yet. However, this issue shows us that there is a need for it. We will discuss this and possibly include it in the integration.
To be precise, what we discussed so far is a different scenario: fully automatic structuring – as an external script task (backed by OCR-D) at the end of the Production workflow, when the METS has already been exported and can therefore be amended by automatic structuring tools (which naturally operate on METS/MODS directly). But I fully agree with you, @oliver-stoehr, that we should also support the semi-automatic scenario from our side (ocrd_kitodo). You can already have the OCR (script task) run on a single page and get back ALTO, so that should not be a problem. We can also try to provide OCR-D workflows dedicated to optimal layout analysis and recognition on index (toc) pages. Most of the work for this feature will be on the Kitodo/UI side though (esp. in step 3, where you need to embed some form of ALTO viewer, perhaps from Presentation).
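To illustrate what "getting back ALTO" for a single page could feed into, here is a minimal sketch that reads the text lines out of an ALTO document. It is not part of ocrd_kitodo or Kitodo.Production; the function names and the sample document are made up for illustration, and the sketch ignores hyphenation (`HYP`) elements and reading order.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip the XML namespace so the code does not depend on the ALTO version."""
    return tag.rsplit("}", 1)[-1]


def alto_text_lines(alto_xml: str) -> list[str]:
    """Return the text of every TextLine, joining its String/@CONTENT values."""
    root = ET.fromstring(alto_xml)
    lines = []
    for elem in root.iter():
        if local_name(elem.tag) != "TextLine":
            continue
        words = [
            child.get("CONTENT", "")
            for child in elem
            if local_name(child.tag) == "String"
        ]
        text = " ".join(word for word in words if word)
        if text:
            lines.append(text)
    return lines


# Hypothetical, heavily shortened ALTO output for a table-of-contents page.
SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v4#">
  <Layout><Page><PrintSpace><TextBlock>
    <TextLine><String CONTENT="Einleitung"/><SP/><String CONTENT="1"/></TextLine>
    <TextLine><String CONTENT="Erstes"/><SP/><String CONTENT="Kapitel"/><SP/><String CONTENT="17"/></TextLine>
  </TextBlock></PrintSpace></Page></Layout>
</alto>"""

if __name__ == "__main__":
    for line in alto_text_lines(SAMPLE):
        print(line)
```

Matching on local element names rather than on a fixed namespace is a deliberate simplification, since different OCR workflows may emit different ALTO versions.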
Additional ideas by @BartChris (originally reported to ocrd_kitodo):
@solth I would like to suggest the UI parts of the OCR-D integration for the development fund (what exactly is needed still has to be specified more precisely), but I cannot assign labels. Could you mark the issue as a candidate for the development fund?
@BartChris I updated your role in the repository - could you check if you can assign labels to issues again?
@solth Great, it works now.
Thank you for the clarification. I have made a suggestion towards such fully automatic structuring at #5573 (comment), albeit coming from a different scenario. As far as I understand, the implementation already present in the BAR's fork modifies the metadata before the images are loaded into the metadata editor (since the separator pages are removed before that). Also, as far as I can see, one is free to access the same METS file that Kitodo is going to write the metadata entered in the metadata editor into through the […]. Could you explain why you think the file could/should only be amended after the export? I would like to know how different these two scenarios are, and whether joining the suggestions makes sense or whether we would need a completely new proposal for “fully automatic structuring by OCR”.
No, according to @Kathrin-Huber IIRC that file (and its schema) is:
So even if you can work out something for a particular version, it might break with the next update.
That's the only reason. An external tool cannot know our exact schema. (The additions to Kitodo you sketched in #5573 – scanning physical barcode inserts with structuring information – would require an internal structuring tool, so that can be anywhere in the workflow.) IMO the 3 scenarios are quite different:
Related to: #3837
Great! Thank you for pointing this out.
OK. I will retract my suggestion to have internal UI extensions for controlling fully automatic structuring then, and will try to find out where my other suggestions (automatic structuring by imported metadata or filenames) might fit in. Thank you!
The process of structuring in the metadata editor should be supported by a small tool to increase the level of automation.
It should be possible to select one or multiple pages containing the table of contents and perform OCR on them.
The result of the OCR should be used to create structure elements and to insert the text semi-automatically. The workflow in the editor could be:
A prerequisite for this feature is that @OCR-D is integrated into Kitodo.Production, so that OCR can be performed conveniently.
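As a rough sketch of how OCR text from a table-of-contents page might be turned into candidates for structure elements, the following splits each recognised line into a title and a trailing page label. Everything here is an assumption for illustration: `TocEntry` and `parse_toc_lines` are hypothetical names, not Kitodo.Production APIs, only Arabic page numbers are recognised, and a real editor would still present the candidates to the user for confirmation.

```python
import re
from dataclasses import dataclass


@dataclass
class TocEntry:
    """A candidate structure element derived from one OCR line."""
    title: str
    page_label: str


# Title, then leader dots or whitespace, then an Arabic page number at the end.
_TOC_LINE = re.compile(r"^(?P<title>.*?\S)[\s.·…_]+(?P<page>\d+)$")


def parse_toc_lines(lines: list[str]) -> list[TocEntry]:
    """Keep only lines that look like 'title ..... page' and split them."""
    entries = []
    for line in lines:
        match = _TOC_LINE.match(line.strip())
        if match:
            entries.append(TocEntry(match.group("title"), match.group("page")))
    return entries


if __name__ == "__main__":
    ocr_lines = ["Inhalt", "Einleitung . . . . 1", "Erstes Kapitel .... 17"]
    for entry in parse_toc_lines(ocr_lines):
        print(f"{entry.title!r} starts at page label {entry.page_label}")
```

Lines without a trailing number, such as the heading "Inhalt", are skipped rather than guessed at, which keeps the step semi-automatic: the user decides what to do with anything the heuristic does not recognise.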