Add constraint so that no duplicate records for druid's workflow steps are possible #221
Comments
@jcoyne this is done, right?
@jcoyne dang. How do we proceed? In theory, accessioning is halted Wed am July 1 for the centos upgrades of preservation machines -- does that help?
We’d have to fix all the data first. The data was nonconformant when we migrated off oracle a few years back.
is this still relevant, or can it be closed?
I haven't seen this problem since, but also haven't been fixing many accessioning errors either. I put it on the agenda for the next SDR Admin meeting with Andrew and Tony C. on the 14th to see if others have run into it lately.
I haven't seen an issue here in at least a year, possibly two. I deleted literally millions of duplicate steps, I think in 2020. I never have come back to document how I did it, though I should. I followed the approach documented in the wiki, with a couple of modifications so that I could handle cases where there were more than duplicates (3+ copies of a step), plus some extra analysis to identify all the druids with steps with the issue. This came up because some changes in summer 2020 suddenly made duplicate wf steps a blocker for modifying and/or releasing lots of items. See sul-dlss/argo#2185 I think the question of "is a constraint still desirable?" is different than "does this error still happen?" Someone could check if there's still non-conformant data being created occasionally. I haven't checked since I did the big de-duplication. I thought non-conformant data was the only reason this wasn't done.
I'm not taking this further unless someone asks me to; we added this repo to our board and it seemed to have some tickets that were very old so I enquired. The ticket can hang out AFAIC.
For example: see https://jirasul.stanford.edu/jira/browse/SDRO-282
TL;DR, two druids were stuck in the `accessionWF`, meaning that their status was "Accessioned" but they were still active in the workflow, with some steps showing a `waiting` status.
Trying to set the status of `waiting` steps to `completed` didn't work -- the steps remained in `waiting`.
Looking at the objects in the workflow database revealed that the objects had at least one, sometimes two records for each `completed` step. We were able to remediate the objects by deleting the duplicate `completed` steps, and the remaining `waiting` steps for the object, then re-indexing them. At that point, the objects had an "Accessioned" status and showed they had completed the workflow.
This is how we discovered and remediated the objects:
Presumably this would be a harder position to get into if the workflow service were unable to generate "duplicates" like these, whether duplicate records for a step at a given status, or across different statuses.
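To make the request concrete, here is a minimal sketch of what such a constraint could look like, and why existing duplicates have to be cleaned up before it can be added. It uses an in-memory SQLite database purely for illustration. The table name, column names, step names, and the druid value are all hypothetical and are not the workflow service's actual schema. The rule for which row survives (prefer `completed`, then the oldest row) is also an assumption, loosely modeled on the manual remediation described above.

```python
import sqlite3

# Hypothetical schema for illustration only; not the real workflow database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE workflow_steps ("
    " id INTEGER PRIMARY KEY,"
    " druid TEXT NOT NULL,"
    " workflow TEXT NOT NULL,"
    " process TEXT NOT NULL,"
    " version INTEGER NOT NULL,"
    " status TEXT NOT NULL)"
)
INSERT = (
    "INSERT INTO workflow_steps (druid, workflow, process, version, status) "
    "VALUES (?, ?, ?, ?, ?)"
)
# Made-up rows: "step-a" appears three times (two completed, one waiting).
conn.executemany(INSERT, [
    ("druid:aa111bb2222", "accessionWF", "step-a", 1, "completed"),
    ("druid:aa111bb2222", "accessionWF", "step-a", 1, "completed"),
    ("druid:aa111bb2222", "accessionWF", "step-a", 1, "waiting"),
    ("druid:aa111bb2222", "accessionWF", "step-b", 1, "completed"),
])

KEY = "druid, workflow, process, version"
UNIQUE_INDEX = f"CREATE UNIQUE INDEX uniq_wf_step ON workflow_steps ({KEY})"


def duplicate_keys(conn):
    """Return (druid, workflow, process, version, count) for keys with 2+ rows."""
    return conn.execute(
        f"SELECT {KEY}, COUNT(*) FROM workflow_steps "
        f"GROUP BY {KEY} HAVING COUNT(*) > 1"
    ).fetchall()


def deduplicate(conn):
    """Keep one row per key (a completed row if any, else the oldest); delete the rest."""
    rows = conn.execute(
        f"SELECT id, {KEY}, status FROM workflow_steps ORDER BY id"
    ).fetchall()
    keep = {}
    for row_id, druid, workflow, process, version, status in rows:
        key = (druid, workflow, process, version)
        best = keep.get(key)
        if best is None or (status == "completed" and best[1] != "completed"):
            keep[key] = (row_id, status)
    keep_ids = {row_id for row_id, _ in keep.values()}
    doomed = [(row[0],) for row in rows if row[0] not in keep_ids]
    conn.executemany("DELETE FROM workflow_steps WHERE id = ?", doomed)
    return len(doomed)


# With duplicates present, the unique index cannot be created.
try:
    conn.execute(UNIQUE_INDEX)
    index_failed_before_cleanup = False
except sqlite3.DatabaseError:
    index_failed_before_cleanup = True

removed = deduplicate(conn)

# After cleanup the index can be created, and new duplicates are rejected.
conn.execute(UNIQUE_INDEX)
try:
    conn.execute(INSERT, ("druid:aa111bb2222", "accessionWF", "step-a", 1, "waiting"))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

A unique index on the step key covers both cases mentioned above: duplicate records for a step at the same status, and records for the same step at different statuses. If only same-status duplicates should be blocked, `status` would need to be part of the key instead.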