
Indexing using multiple threads #5367

Merged (12 commits, Nov 4, 2022)

Conversation

@thomaslow (Collaborator) commented Sep 27, 2022

This pull request improves the batch indexing performance (triggered via the System/Indexing page) by using multiple threads for indexing.

A full re-indexing of 81,000 processes and 4,000 tasks took 12 minutes 12 seconds on my machine (8-core CPU, SSD, default batch size of 250). With these changes, indexing performance improves as follows (speed-up factors relative to the best-case scenario of the current master branch with a batch size of 2500):

| batch size | master | PR threads=1 | PR threads=4 | PR threads=8 |
| --- | --- | --- | --- | --- |
| 250 | 0.57 (12m:12s) | 1.04 (6m:40s) | 3.39 (2m:03s) | 4.85 (1m:26s) |
| 500 | 0.76 (9m:12s) | 0.95 (7m:19s) | 3.50 (1m:59s) | 5.09 (1m:22s) |
| 2500 | 1.00 (6m:57s) | 1.08 (6m:27s) | 3.26 (2m:08s) | 4.74 (1m:28s) |

There are still a bunch more problems with the indexing, but at least now you will notice them much quicker ;-)

Also, the indexing should be more resilient to intermittent errors. I restarted ElasticSearch during the indexing process and the indexing still finished successfully.

The default number of threads is set to 4 and can be configured via the elasticsearch.threads parameter.
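For illustration, here is a minimal, self-contained sketch of the general pattern behind these two parameters: a fixed pool of worker threads, each indexing one batch of objects per task. The class and method names are made up for this sketch and are not the actual Kitodo implementation.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ParallelIndexingSketch {

    public static void main(String[] args) throws InterruptedException {
        int threads = 4;      // corresponds to the elasticsearch.threads parameter
        int batchSize = 250;  // corresponds to the elasticsearch.batch parameter
        int totalObjects = 81_000;

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int offset = 0; offset < totalObjects; offset += batchSize) {
            final int batchOffset = offset;
            // each task loads one batch of objects and sends one bulk request
            pool.submit(() -> indexBatch(batchOffset, batchSize));
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
    }

    private static void indexBatch(int offset, int limit) {
        // placeholder for the real work: read objects from the database,
        // map them to documents and send them to ElasticSearch in one bulk request
        System.out.printf("indexing objects %d to %d%n", offset, offset + limit - 1);
    }
}
```

With threads=1 the pool processes the batches sequentially, which corresponds to the "PR threads=1" column in the table above.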

@henning-gerhardt (Collaborator):

I tried your changes, but you removed the workaround for #5179 (refreshing the page to prevent a high access rate), and within a short time gigabytes of log entries are created in the application proxy in front of the ElasticSearch system.

@thomaslow (Collaborator, Author) commented Oct 5, 2022

You are probably logging HTTP-requests that go to your ElasticSearch server, right?

You can limit the number of HTTP-requests used for indexing via the elasticsearch.batch parameter. A higher batch size will reduce the number of HTTP-requests required to index all objects, roughly number_of_objects / batch_size.

The HTTP-requests that are used to refresh the current number of already indexed objects (issue #5179) are triggered by the user interface and can be avoided by navigating away from the System page (not just switching to another tab within the System page), logging out of Kitodo, or closing the browser window. The indexing should not stop. As mentioned in #5181, there is currently no button to cancel the indexing process. Alternatively, we could increase the refresh interval (e.g. to 5 seconds or so?).

Another approach would be to set up log-rotation in your application proxy to avoid huge log files.

@henning-gerhardt (Collaborator):

> You are probably logging HTTP-requests that go to your ElasticSearch server, right?

Correct. We need and use this application proxy to limit access to the ElasticSearch instance.

> You can limit the number of HTTP-requests used for indexing via the elasticsearch.batch parameter. A higher batch size will reduce the number of HTTP-requests required to index all objects, roughly number_of_objects / batch_size.

We have already increased the value of the batch parameter to 2,500 to lower the number of transfers between both systems.

> The HTTP-requests that are used to refresh the current number of already indexed objects (issue #5179) are triggered by the user interface and can be avoided by navigating away from the System page (not just switching to another tab within the System page), logging out of Kitodo, or closing the browser window. The indexing should not stop. As mentioned in #5181, there is currently no button to cancel the indexing process.

The index page refreshing is the cause of the large number of log entries, not the indexing itself. I will try it and switch to another page after the indexing is started, as a possible new workaround for the open issue with the page refreshing.

> Alternatively, we could increase the refresh interval (e.g. to 5 seconds or so?).

This could be a way, but it should be separated into its own pull request, as it is "fixing" another issue. I never understood why the refreshing is done 10 times per second, but this should be discussed in the other issue.

> Another approach would be to set up log-rotation in your application proxy to avoid huge log files.

There is already log rotation, which works fine for normal use, but the indexing page refresh is the reason for so many log entries.

@henning-gerhardt (Collaborator) left a comment

This is only a code review with small adjustments. I will add some notes regarding the speed improvement later.

Comment on lines 90 to 91
logger.error("stop indexing after maximum amount of attempts");
throw new RuntimeException("stop indexing after maximum amount of attempts");
Collaborator:

Maybe use a string constant to avoid the duplicated text message? I am not sure whether logging and throwing an exception should happen in the same place, and a RuntimeException must be caught explicitly. Maybe @solth should decide here what a good solution is.

Collaborator Author:

The RuntimeException was used to break out of the while-loop. I added a boolean "failed" to gracefully exit the while loop and thread.
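For illustration, a minimal sketch of this pattern; names such as maxAttempts and indexBatch are illustrative and not the actual IndexWorker code.

```java
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

class RetryLoopSketch {
    private static final Logger logger = LogManager.getLogger(RetryLoopSketch.class);
    private static final int maxAttempts = 10;

    void runIndexing(int offset, int batchSize) {
        boolean failed = false;
        boolean done = false;
        int attempt = 1;
        while (!failed && !done) {
            try {
                indexBatch(offset, batchSize);
                done = true;
            } catch (RuntimeException e) {
                if (attempt >= maxAttempts) {
                    logger.error("stop indexing after maximum amount of attempts", e);
                    failed = true; // leave the loop gracefully and let the thread finish
                } else {
                    attempt++;
                }
            }
        }
    }

    private void indexBatch(int offset, int limit) {
        // placeholder for the actual bulk indexing call
    }
}
```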

Comment on lines 67 to 68
Helper.setErrorMessage(e.getLocalizedMessage(), IndexingService.getLogger(), e);
logger.error(e);
Collaborator:

Maybe first log and then show the error message, like a few lines above? If showing the error message itself raises an exception or error, then the original error will not get logged; instead, the exception or error from displaying the message will be logged, and the original error gets lost.

Collaborator Author:

I removed the Helper.setErrorMessage call because a Thread.sleep interrupt is not really a problem.
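For illustration, a minimal sketch of the adjusted handling, assuming the sleep happens between indexing attempts; the class and method names are illustrative, not the actual code.

```java
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

class SleepHandlingSketch {
    private static final Logger logger = LogManager.getLogger(SleepHandlingSketch.class);

    void waitBeforeNextAttempt(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            // an interrupted sleep is not a real problem here, so it is only logged
            // instead of being shown as an error message in the user interface
            logger.error(e);
            Thread.currentThread().interrupt(); // restore the thread's interrupt status
        }
    }
}
```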

// by iterating over all indexed documents in elastic search
try {
while (offset < amountInIndex) {
// TODO: actually iterate over all elastic search documents
Collaborator:

When will this TODO be resolved? Is there already a change ready, or will this be fixed in a mid- or far-future change?

Collaborator Author:

This code was not changed by this pull request. I just added a comment that this is a problem. There is a new issue #5379 for this.

Collaborator:

Thanks, then this TODO / issue will not get lost.

searchService.removeLooseIndexData(searchService.findAllIDs(offset, indexLimit));
offset += indexLimit;
}
} catch (Exception e) {
Collaborator:

You are catching all kinds of exceptions. Is this really expected?

Collaborator Author:

This exception is actually an ElasticSearch exception. Unfortunately, the ElasticSearch Java library is not available in this class. I changed it to a RuntimeException, which is the most specific exception type available at this point.

@henning-gerhardt (Collaborator) commented Oct 7, 2022

The good thing is that your changes did not lower the indexing speed, if you use the workaround and switch to another page or close the tab as with the old implementation. But I cannot confirm the speed improvements at this large scale.

I did a lot of benchmarks; speed improvements in my local environment are visible, but not in a way that breaks records. I did all my benchmarks with a batch limit of 2,500 and an index limit of 2,500. I closed the tab or switched to another non-System page after the indexing of a single type had started (the workaround!), and before an indexing run was started, the corresponding index was cleared and the application restarted (clearing the index was done between application stop and restart).

Indexing around 560,000 processes including metadata files and properties (around 13 million property table entries in the database) needs 1:45 hours on a single core and only 1:15 hours on four cores, so not really faster on four cores than on a single core.

Indexing around 4.4 million tasks takes 6:20 hours on one core and 2:36 hours on four cores. That is an acceptable speed improvement, but not as fast as your timings.

Staying on the System / Indexing page has a massive negative impact on indexing speed, as it slows indexing down a lot: I aborted the process indexing after 3 hours, and only a quarter or a third of all data had been indexed.

The application proxy in front of the ElasticSearch service has little or no influence: the process indexing on four cores goes down to 1:10 hours.

@thomaslow (Collaborator, Author):

@henning-gerhardt Thank you for your detailed feedback. I will have a look at your code review soon. I'm sorry if I raised too high expectations for this pull request. I suppose your setup (multiple virtual machines?) has different bottlenecks than my setup (multiple Docker containers on one non-virtual machine). Also, my database consists mostly of fake data (empty processes, no metadata, only a few METS files, etc.). Maybe there are other bottlenecks with proper real-world data (like disk I/O) or a multi-server setup (like network delays).

Did you see full CPU utilization on each of your virtual machines (Database, ElasticSearch, Tomcat)? With 16 threads, my CPU is fully utilized at nearly 100%. Both the Tomcat server (~48%) and MariaDB (~40%) use the majority of the CPU. ElasticSearch is using only ~10% of the CPU.

If you do not see full CPU utilization, I would suggest trying more than 4 threads, even if your Tomcat server only has 4 CPU cores. There could be other bottlenecks, e.g. network delay or disk I/O delay, that a higher number of threads could compensate for. Try increasing the number of threads until there is no further increase in CPU utilization and no further improvement in indexing time. For example, 8 or even 16 threads should be no problem at all.

Also, at some point (I don't know exactly when), your batch size could be too high. However, I tested your batch size (2500) on my machine and got the same duration of ~2 min with 4 threads. A high batch size might cause additional network delays, because large data packets are sent over the network and the indexing cannot proceed while the data is being transferred. Maybe this could be a problem, I don't know.

Unfortunately, it is not easy to find a good balance for these parameters.

@henning-gerhardt (Collaborator):

Some information about the local test environment I used for benchmarking: a 4-core processor (8 threads with hyper-threading), an NVMe-based SSD and 32 GB RAM. IDEA starts the Tomcat server (9.0.56) and deploys the application inside this Tomcat server. The database (MariaDB 10.5.15) and ElasticSearch (7.17.6) are started on system boot and run as normal system processes. Connections to the database, Tomcat and ElasticSearch go over localhost or the local ethernet card, without any real data transfer over the network. A locally running Apache service acts as application proxy for the Kitodo.Production application and ElasticSearch, but the application proxy has only little influence on the indexing.

Using 4 threads for indexing keeps my system at around 50 - 60% total usage, unless I do not close the indexing page, in which case the CPU usage goes to 100% on all 8 cores. Using 8 or more cores is not really practicable, as I run these benchmarks in the background while doing other important daily work.

If I tried your changes in our real-world system (VMware virtualisation with many VMs: 4-core application, 4-core ElasticSearch, 2-core database, and metadata files accessed through NFS on the application server), benchmarking would not be easy to compare, as network I/O, disk I/O (database + ElasticSearch), VMware host CPU usage and many other factors would have a (big) influence on timing. Maybe I can convince my team mates to stay away from our system for a week or more so I can do some benchmarking in this environment.

I chose a higher batch limit and a similar indexing limit, as lower values need more time for single-core indexing with our data.

@thomaslow (Collaborator, Author):

Ok. If your system is similar to mine and your CPU usage is already at 100% (or as high as you want it to be), then I don't think there is anything that can be improved by tuning the batch or threads parameters.

Btw. I did my tests while keeping the indexing page active (including the automatic refresh every 1s). I also enabled database logging, which prints every SQL statement that is executed. So, I could probably gain a few seconds by optimizing my setup. This is why I wasn't expecting that your performance would be worse than mine. Weird.

I suppose the most likely factor is that I use very simple test data.

@thomaslow (Collaborator, Author):

I found the change that caused the previous workaround for stopping the automatic refresh (polling) to no longer work. The previous behavior is now back (a page refresh will stop polling).

@henning-gerhardt (Collaborator) left a comment

I found only some small remarks; in general, these changes are approved.


@thomaslow (Collaborator, Author):

I refactored the failure logic a bit. Previously, a failure would not stop the indexing of subsequent object types (if the indexing was started via the "index everything" button). I also added a simple cancel link to the UI, since the cancel implementation was very similar to the failure handling.

@solth (Member) commented Oct 10, 2022

> I also added a simple cancel link to the UI, since the cancel implementation was very similar to the failure handling.

Does this resolve #5181 ?

@thomaslow (Collaborator, Author):

The cancel link looks like this:

[screenshot of the cancel link in the indexing user interface]

If this is sufficient, then it will resolve #5181.

@henning-gerhardt (Collaborator):

  1. Adding new functionality after a review is nearly done, when the new function was not part of any review request or other requirement, is not so good in my opinion. I must now find some time in the next days to check the new and existing functions again, as no (Selenium-based) tests were added.
  2. If I may express a wish: the new "cancel" button should appear like the other buttons on this page and should only be visible if an indexing run has been started.

@thomaslow (Collaborator, Author) commented Oct 10, 2022

I'm sorry I mixed features in one pull request. If it helps, I could revert this pull request to the last reviewed commit and add the recent changes to a new pull request.

The cancel link is only visible when the indexing is currently in progress.

@henning-gerhardt (Collaborator):

No, leave these changes in. I will test this tomorrow or the day after tomorrow.

@henning-gerhardt (Collaborator):

Canceling the indexing is not working properly:

I started the indexing of processes with a batch size of 2,500 and an indexing limit of 2,500, and interrupted the indexing in the first batch. In the UI, the state of the process indexing is set to failed, but in the background the process indexing still starts a second attempt:

[INFO] 2022-10-11 10:54:27,851 [IndexManagementThread] org.kitodo.production.services.index.IndexingService - start 1 threads for indexing PROCESS
[INFO] 2022-10-11 10:54:27,857 [pool-6-thread-1] org.kitodo.production.helper.IndexWorker - index PROCESS with offset 0 and attempt 1/10
[TRACE] 2022-10-11 10:54:28,483 [pool-6-thread-1] org.kitodo.filemanagement.FileManagement - Found 7/meta.xml
[TRACE] 2022-10-11 10:54:28,484 [pool-6-thread-1] org.kitodo.filemanagement.FileManagement - Found 7/meta.xml
[TRACE] 2022-10-11 10:54:28,523 [pool-6-thread-1] org.kitodo.filemanagement.FileManagement - Found 7/meta.xml
[INFO] 2022-10-11 10:54:28,533 [pool-6-thread-1] org.kitodo.production.services.dataformat.MetsService - Reading 7/meta.xml


... (Now hitting the cancel button in the UI) ...

[INFO] 2022-10-11 10:54:55,802 [pool-6-thread-1] org.kitodo.production.services.dataformat.MetsService - Reading 1455/meta.xml
[TRACE] 2022-10-11 10:54:55,846 [pool-6-thread-1] org.kitodo.filemanagement.FileManagement - Found 1456/meta.xml
[TRACE] 2022-10-11 10:54:55,846 [pool-6-thread-1] org.kitodo.filemanagement.FileManagement - Found 1456/meta.xml
[TRACE] 2022-10-11 10:54:55,869 [pool-6-thread-1] org.kitodo.filemanagement.FileManagement - Found 1456/meta.xml
[INFO] 2022-10-11 10:54:55,869 [pool-6-thread-1] org.kitodo.production.services.dataformat.MetsService - Reading 1456/meta.xml
[INFO] 2022-10-11 10:54:55,871 [IndexManagementThread] org.kitodo.production.services.index.IndexingService - indexing of PROCESS failed, cleaning up
[TRACE] 2022-10-11 10:54:55,899 [pool-6-thread-1] org.kitodo.filemanagement.FileManagement - Found 1457/meta.xml
[TRACE] 2022-10-11 10:54:55,899 [pool-6-thread-1] org.kitodo.filemanagement.FileManagement - Found 1457/meta.xml
[TRACE] 2022-10-11 10:54:55,900 [pool-6-thread-1] org.kitodo.filemanagement.FileManagement - Found 1457/meta.xml
[INFO] 2022-10-11 10:54:55,900 [pool-6-thread-1] org.kitodo.production.services.dataformat.MetsService - Reading 1457/meta.xml

... more indexing is done until the end of the current indexing batch ...

[INFO] 2022-10-11 10:55:17,612 [pool-6-thread-1] org.kitodo.production.services.dataformat.MetsService - Reading 2702/meta.xml
[ERROR] 2022-10-11 10:55:40,661 [pool-6-thread-1] org.kitodo.production.helper.IndexWorker - java.lang.InterruptedException
org.elasticsearch.ElasticsearchException: java.lang.InterruptedException
        at org.elasticsearch.client.RestHighLevelClient.performClientRequest(RestHighLevelClient.java:2695) ~[elasticsearch-rest-high-level-client-7.17.6.jar:7.17.6]
        at org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:2171) ~[elasticsearch-rest-high-level-client-7.17.6.jar:7.17.6]
        at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:2137) ~[elasticsearch-rest-high-level-client-7.17.6.jar:7.17.6]
        at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:2105) ~[elasticsearch-rest-high-level-client-7.17.6.jar:7.17.6]
        at org.elasticsearch.client.RestHighLevelClient.bulk(RestHighLevelClient.java:620) ~[elasticsearch-rest-high-level-client-7.17.6.jar:7.17.6]
        at org.kitodo.data.elasticsearch.index.IndexRestClient.addTypeSync(IndexRestClient.java:110) ~[kitodo-data-management-3.4.4-SNAPSHOT.jar:?]
        at org.kitodo.data.elasticsearch.index.Indexer.performMultipleRequests(Indexer.java:117) ~[kitodo-data-management-3.4.4-SNAPSHOT.jar:?]
        at org.kitodo.production.services.data.base.SearchService.addAllObjectsToIndex(SearchService.java:210) ~[classes/:3.4.4-SNAPSHOT]
        at org.kitodo.production.services.data.ProcessService.addAllObjectsToIndex(ProcessService.java:331) ~[classes/:3.4.4-SNAPSHOT]
        at org.kitodo.production.helper.IndexWorker.indexObjects(IndexWorker.java:98) ~[classes/:3.4.4-SNAPSHOT]
        at org.kitodo.production.helper.IndexWorker.run(IndexWorker.java:69) ~[classes/:3.4.4-SNAPSHOT]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) ~[?:?]
        at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
        at java.lang.Thread.run(Thread.java:829) ~[?:?]
Caused by: java.lang.InterruptedException
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1343) ~[?:?]
        at org.elasticsearch.common.util.concurrent.BaseFuture$Sync.get(BaseFuture.java:243) ~[elasticsearch-7.17.6.jar:7.17.6]
        at org.elasticsearch.common.util.concurrent.BaseFuture.get(BaseFuture.java:75) ~[elasticsearch-7.17.6.jar:7.17.6]
        at org.elasticsearch.client.RestHighLevelClient.performClientRequest(RestHighLevelClient.java:2692) ~[elasticsearch-rest-high-level-client-7.17.6.jar:7.17.6]
        ... 15 more
[INFO] 2022-10-11 10:55:42,661 [pool-6-thread-1] org.kitodo.production.helper.IndexWorker - index PROCESS with offset 0 and attempt 2/10
[TRACE] 2022-10-11 10:55:42,862 [pool-6-thread-1] org.kitodo.filemanagement.FileManagement - Found 7/meta.xml
[TRACE] 2022-10-11 10:55:42,862 [pool-6-thread-1] org.kitodo.filemanagement.FileManagement - Found 7/meta.xml
[TRACE] 2022-10-11 10:55:42,863 [pool-6-thread-1] org.kitodo.filemanagement.FileManagement - Found 7/meta.xml
[INFO] 2022-10-11 10:55:42,863 [pool-6-thread-1] org.kitodo.production.services.dataformat.MetsService - Reading 7/meta.xml

... now indexing again the indexing batch until end of the batch ...

[TRACE] 2022-10-11 10:56:15,523 [pool-6-thread-1] org.kitodo.filemanagement.FileManagement - Found 2702/meta.xml
[TRACE] 2022-10-11 10:56:15,523 [pool-6-thread-1] org.kitodo.filemanagement.FileManagement - Found 2702/meta.xml
[TRACE] 2022-10-11 10:56:15,547 [pool-6-thread-1] org.kitodo.filemanagement.FileManagement - Found 2702/meta.xml
[INFO] 2022-10-11 10:56:15,547 [pool-6-thread-1] org.kitodo.production.services.dataformat.MetsService - Reading 2702/meta.xml

... no more log entries ...

After the second attempt is "done", indexing is not executed further, but in the UI the number of indexed processes is set to 2,500, so one batch was submitted to the index.

Maybe it should be documented that a running indexing batch is executed until the end of the current batch and is even submitted to ElasticSearch.

@thomaslow (Collaborator, Author):

Yes, after the user cancels the indexing, currently active batches are still being indexed. If you use threads=4, the 4 currently active batches are still indexed and submitted to ElasticSearch. In case of a smaller batch size (e.g. the default 250), the user will most likely not notice any delay.

Actually interrupting (or killing) the indexing of a single batch could cause various problems, e.g. partial HTTP requests being submitted to ElasticSearch, network connections not being closed, files not being closed. I wouldn't recommend that.
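For illustration, a minimal sketch of the described semantics (not the actual implementation of this pull request): a shared flag is checked only between batches, so a batch that has already started always runs to completion and is submitted to ElasticSearch.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class CancelableIndexingSketch {
    private final AtomicBoolean cancelled = new AtomicBoolean(false);

    void cancel() {
        cancelled.set(true); // called when the user clicks the cancel link
    }

    void indexAll(int totalObjects, int batchSize) {
        // the flag is only checked before a new batch starts
        for (int offset = 0; offset < totalObjects && !cancelled.get(); offset += batchSize) {
            indexBatch(offset, batchSize); // once started, a batch runs to completion
        }
    }

    private void indexBatch(int offset, int limit) {
        // placeholder for loading one batch of objects and sending one bulk request
    }
}
```

A consequence of checking such a flag only between batches is that cancelling never interrupts a bulk request that is already in flight.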

When indexing all objects via the index everything button, there is a small time window (in between indexing of two subsequent object types) where the cancel button will currently have no effect and not cancel anything.

As I said, this is a very simple implementation that basically follows the failure logic. I didn't really put much thought into the user experience side of things. If this is not "better than nothing", I would suggest to revert this pull request to a previous commit.

@henning-gerhardt (Collaborator) commented Oct 11, 2022

> Yes, after the user cancels the indexing, currently active batches are still being indexed. If you use threads=4, the 4 currently active batches are still indexed and submitted to ElasticSearch. In case of a smaller batch size (e.g. the default 250), the user will most likely not notice any delay.

This is okay and I'm fine with that. My point was, or is, that the current interruption logic triggers a second index attempt.

> Actually interrupting (or killing) the indexing of a single batch could cause various problems, e.g. partial HTTP requests being submitted to ElasticSearch, network connections not being closed, files not being closed. I wouldn't recommend that.

That is true, and this is not what I want, as we already have enough issues to solve and should not introduce new ones.

> When indexing all objects via the index everything button, there is a small time window (in between indexing of two subsequent object types) where the cancel button will currently have no effect and not cancel anything.

Will the cancel request be forgotten, so that the user must hit the cancel button again after the indexing has started? I did not test this, as I only do single-type indexing.

> As I said, this is a very simple implementation that basically follows the failure logic. I didn't really put much thought into the user experience side of things. If this is not "better than nothing", I would suggest to revert this pull request to a previous commit.

I will add some of my notes to the cancel indexing issue so that it is clear what to expect and what not. Edit: see #5181 (comment)

Whether the refactoring commit, and thus the cancel interaction commit, should be reverted should be decided by @solth.

@solth (Member) commented Oct 11, 2022

> Whether the refactoring commit, and thus the cancel interaction commit, should be reverted should be decided by @solth.

I am a fan of incremental improvements. Even if this is not the optimal solution, having the current simple link is still better than having no way to cancel the indexing process at all.

I would suggest leaving the commit in this pull request but not closing issue #5181. Instead, the issue description could be updated to reflect that the "cancel" functionality is now available but could (and should) be improved upon in a future pull request.

@henning-gerhardt (Collaborator):

@solth @thomaslow:
The changes are okay except for the newly introduced issue, where an additional batch indexing run is executed after the cancel button is clicked. I suggest removing the cancel button from the UI, so that the latest changes (which introduce the cancel mechanic) do not have to be reverted but can be corrected in a separate pull request.
