I have just come across the following: I am using the MultiThreadBatchTaskEngine in a feature ablation experiment in DKPro TC. The batch task fails with the following runtime exception:
Details: de.tudarmstadt.ukp.dkpro.lab.engine.ExecutionException: java.lang.RuntimeException:
-de.tudarmstadt.ukp.dkpro.lab.storage.UnresolvedImportException: Unable to resolve import of task
[de.tudarmstadt.ukp.dkpro.tc.core.task.ExtractFeaturesTask-Test-AIFdbClassificationFE_PRvCvNP_2015-09-04_18-05-38]
pointing to [task-latest://de.tudarmstadt.ukp.dkpro.tc.core.task.ExtractFeaturesTask-Train-AIFdbClassificationFE_PRvCvNP_2015-09-04_18-05-38/output];
nested exception is de.tudarmstadt.ukp.dkpro.lab.storage.TaskContextNotFoundException: Task
[de.tudarmstadt.ukp.dkpro.tc.core.task.ExtractFeaturesTask-Train-AIFdbClassificationFE_PRvCvNP_2015-09-04_18-05-38] has never been executed.
...
This in turn leads to more runtime exceptions about other tasks not having been executed.
When I look into the details, I find the following exception at an earlier stage:
2015-09-05 09:32:49 DEBUG PrimitiveAnalysisEngine_impl:347 - AnalysisEngine de.tudarmstadt.ukp.dkpro.tc.core.feature.UnitContextMetaCollector process begin
2015-09-05 09:32:49 DEBUG PrimitiveAnalysisEngine_impl:413 - AnalysisEngine de.tudarmstadt.ukp.dkpro.tc.core.feature.UnitContextMetaCollector process end
2015-09-05 09:32:49 DEBUG PrimitiveAnalysisEngine_impl:347 - AnalysisEngine de.tudarmstadt.ukp.dkpro.tc.features.ngram.meta.LuceneNGramMetaCollector process begin
2015-09-05 09:32:49 ERROR PrimitiveAnalysisEngine_impl:417 - Exception occurred
org.apache.uima.analysis_engine.AnalysisEngineProcessException: Annotator processing failed.
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:401)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:308)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:570)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:412)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:344)
    at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:265)
    at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:269)
    at de.tudarmstadt.ukp.dkpro.lab.uima.engine.simple.SimpleExecutionEngine.run(SimpleExecutionEngine.java:139)
    at de.tudarmstadt.ukp.dkpro.lab.engine.impl.MultiThreadBatchTaskEngine$ExecutionThread.run(MultiThreadBatchTaskEngine.java:274)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed
    at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:614)
    at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:628)
    at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1508)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1188)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1169)
    at de.tudarmstadt.ukp.dkpro.tc.features.ngram.meta.LuceneBasedMetaCollector.writeToIndex(LuceneBasedMetaCollector.java:165)
    at de.tudarmstadt.ukp.dkpro.tc.features.ngram.meta.LuceneBasedMetaCollector.process(LuceneBasedMetaCollector.java:136)
    at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:385)
    ... 13 more
The experiment runs successfully when I use the single-threaded DefaultBatchTaskEngine, so I presume the error about the closed index writer is a side effect of the multi-threading.
I don't have time right now to investigate this in detail or to produce a minimal configuration that reproduces the error, but I thought I'd report it anyway, in case others come across the same issue.
I guess the way we currently close the index writer is not really thread-safe ...
It works single-threaded because by the time collectionProcessComplete() is called, all Lucene-based meta collectors have already finished. Maybe that is not the case in a multi-threaded scenario?
However, Lucene says that IndexWriter is completely thread-safe ...
    @Override
    public void collectionProcessComplete()
        throws AnalysisEngineProcessException
    {
        super.collectionProcessComplete();
        if (indexWriter != null) {
            try {
                indexWriter.commit();
                indexWriter.close();
                indexWriter = null;
            } catch (AlreadyClosedException e) {
                // ignore, as multiple meta collectors write in the same index
                // and will all try to close the index
            } catch (CorruptIndexException e) {
                throw new AnalysisEngineProcessException(e);
            } catch (IOException e) {
                throw new AnalysisEngineProcessException(e);
            }
        }
    }
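To illustrate the suspicion above: if several collectors really do share one writer across threads, the first one to finish can close it while the others are still adding documents. A minimal sketch of one possible guard is reference counting, where the writer is closed only when the last user releases it. This is not the DKPro TC code and not necessarily the right fix here. `FakeIndexWriter`, `SharedWriterHandle` and `SharedWriterDemo` are hypothetical names, and the fake writer only stands in for Lucene's `IndexWriter` so the example runs with the JDK alone.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a shared writer; it rejects writes after close(). */
class FakeIndexWriter {
    private final List<String> docs = new ArrayList<>();
    private boolean closed = false;

    synchronized void addDocument(String doc) {
        if (closed) {
            throw new IllegalStateException("this writer is closed");
        }
        docs.add(doc);
    }

    synchronized void close() { closed = true; }
    synchronized boolean isClosed() { return closed; }
    synchronized int size() { return docs.size(); }
}

/** Hypothetical handle: closes the shared writer only on the last release(). */
class SharedWriterHandle {
    private final FakeIndexWriter writer = new FakeIndexWriter();
    private final AtomicInteger users = new AtomicInteger();

    FakeIndexWriter acquire() {
        users.incrementAndGet();
        return writer;
    }

    void release() {
        if (users.decrementAndGet() == 0) {
            writer.close();
        }
    }

    FakeIndexWriter writer() { return writer; }
}

class SharedWriterDemo {
    /** Runs the given number of collector threads against one shared writer. */
    static FakeIndexWriter run(int collectors, int docsEach) {
        SharedWriterHandle handle = new SharedWriterHandle();
        List<Thread> threads = new ArrayList<>();
        for (int c = 0; c < collectors; c++) {
            // Register every user before any thread starts, so the count
            // cannot drop to zero while another collector is still pending.
            FakeIndexWriter writer = handle.acquire();
            int id = c;
            threads.add(new Thread(() -> {
                try {
                    for (int d = 0; d < docsEach; d++) {
                        writer.addDocument("collector-" + id + "-doc-" + d);
                    }
                } finally {
                    handle.release(); // plays the role of collectionProcessComplete()
                }
            }));
        }
        for (Thread t : threads) {
            t.start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return handle.writer();
    }

    public static void main(String[] args) {
        FakeIndexWriter w = run(4, 100);
        System.out.println(w.size() + " documents, closed=" + w.isClosed());
    }
}
```

With this arrangement no collector sees a closed writer, because the close happens only after every registered user has released it. Whether the real meta collectors share a writer across threads at all is exactly what remains to be checked.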
Multi-threading on the batch-task level should only parallelize complete task executions. Since each task execution (that includes in particular the UIMA pipeline) is then single-threaded and working on its own context (folder) there should be no problem.
Of course, if you use the CPE UIMA engine, that's another thing - then the writers that are not thread-safe/that must see all data must be derived from (J)CasConsumer_ImplBase or explicitly declare @OperationalProperties(multipleDeploymentAllowed = false).