Why are we seeing two Celery tasks call process_meta() on the same new Datafile upload?
The BioFormats filter code that the nifcert filter was based on had locking code to manage this, so it may not be specific to nifcert.
The nifcert filter retains the locking code from the BioFormats filter, so the duplicate task has no effect other than an ERROR message in the log (duplicate key). I'd guess locking is implemented by inserting a row into a table with a unique key constraint, so this is perhaps expected behaviour.
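To illustrate the guess above (a lock taken by inserting a row under a unique key), here is a minimal sketch using the standard-library `sqlite3` module. It is not the BioFormats or nifcert code; the table name `filter_lock` and the helper `try_acquire` are hypothetical, chosen only to show why the losing task would log a duplicate key error.

```python
import sqlite3


def try_acquire(conn, datafile_id):
    """Return True if this caller inserted the lock row, False if it already existed."""
    try:
        # "with conn" commits on success and rolls back on error.
        with conn:
            conn.execute(
                "INSERT INTO filter_lock (datafile_id) VALUES (?)",
                (datafile_id,),
            )
        return True
    except sqlite3.IntegrityError:
        # The primary key constraint rejected the second insert:
        # this is the "duplicate key" error the second task would see.
        return False


conn = sqlite3.connect(":memory:")
# Hypothetical lock table: one row per Datafile being processed.
conn.execute("CREATE TABLE filter_lock (datafile_id INTEGER PRIMARY KEY)")

first = try_acquire(conn, 42)   # first task gets the lock
second = try_acquire(conn, 42)  # duplicate task is refused
print(first, second)
```

If locking really works this way, the ERROR line is just the database reporting the refused insert, and the duplicate task can safely return without doing any work.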
I've confirmed with debugging output that both tasks have Datafile.verified == True, so the second task is not being triggered by the Datafile being updated after the uploaded file's checksum is verified.
Below: sample log file output produced by clicking the "Add Files..." button on the view Dataset page and completing the subsequent actions to select and confirm a file.
351a420 protects the NIFCert app from the duplicate call.
The verified flag (in the debugging output mentioned in the original issue) was probably set in both NIFCert task instances because our configuration seems to take a few seconds to start tasks. This gives the MyTardis async checksum verification task time to update the database before the NIFCert tasks begin.
Note: the BioFormats code relies on a stale DataFile instance passed in as a parameter (which the Django/Celery docs say is a bad idea).
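A minimal sketch of why passing a model instance to a task is risky, using a plain dict as a stand-in for the database so it runs without Django or Celery. All names here (`DB`, `Record`, `task_with_instance`, `task_with_id`) are hypothetical and only illustrate the general pattern of passing an ID and re-reading the row inside the task.

```python
from dataclasses import dataclass, replace

DB = {}  # hypothetical stand-in for the DataFile table


@dataclass(frozen=True)
class Record:
    id: int
    verified: bool


def task_with_instance(record):
    # Sees the state captured when the task was queued.
    return record.verified


def task_with_id(record_id):
    # Re-reads the current state when the task actually runs.
    return DB[record_id].verified


DB[1] = Record(id=1, verified=False)
queued_instance = DB[1]  # what a task receives if the instance is passed
queued_id = 1            # what a task receives if only the ID is passed

# Checksum verification finishes before the queued task starts.
DB[1] = replace(DB[1], verified=True)

stale = task_with_instance(queued_instance)
fresh = task_with_id(queued_id)
print(stale, fresh)
```

The task that received the instance still sees the old `verified` value, while the task that received the ID sees the update.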
It's unclear whether the locking technique described in the Celery 3.1 docs (used by BioFormats) is robust if the cache is exhausted on a heavily loaded system. BioFormats sometimes skips thumbnail generation for files. This may be the reason.
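As a rough illustration of the concern, the sketch below models a cache-based lock built on an atomic "add if absent" call, in the spirit of the `cache.add` pattern from the Celery docs. `TinyCache` is a hypothetical bounded cache written for this example, not Django's or memcached's implementation; it only shows that if the lock key is evicted under pressure, a second worker can take the lock while the first still believes it holds it.

```python
from collections import OrderedDict


class TinyCache:
    """Hypothetical bounded cache that evicts the oldest key when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def set(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        while len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the oldest entry

    def add(self, key, value):
        """Store the key only if absent; return True if it was stored."""
        if key in self.data:
            return False
        self.set(key, value)
        return True


cache = TinyCache(capacity=2)

first = cache.add("lock-42", "worker-a")      # worker A takes the lock
duplicate = cache.add("lock-42", "worker-b")  # worker B is refused

# Unrelated traffic fills the cache and evicts the lock key.
cache.set("other-1", 1)
cache.set("other-2", 2)

# Worker B now acquires the lock although worker A never released it.
after_eviction = cache.add("lock-42", "worker-b")
print(first, duplicate, after_eviction)
```

Whether this actually happens in our deployment depends on the cache backend and its eviction policy, which I haven't checked.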