Common Support Tasks
This page enumerates common support tasks that you may encounter, and suggests paths forward, often using commands on the "Useful commands" page.
There is a SUL-PUB Service Level Agreement (SLA).
Support communication happens via email at [email protected] (routed to the infrastructure team and select managers) as well as in the #sul-cap-collab channel.
This request typically comes in via a HelpSU ticket to the Profiles project manager and is then routed to the sul-pub team. The queries we send for harvesting do not always find all of an author's publications, because of name ambiguity, previous institutions, or both. Steps to take:
- Confirm/request (via the Profiles project manager) that the author update their "publication settings" in Profiles so that their previous institutions are listed. This makes the query more robust by adding "alternate_identities" to their primary identity. This is entirely on the user, and will result in a new harvest being run the day after they update their profile.
- Manually import a list of publication IDs for the user. This requires manually searching the Web of Science via the UI and fetching and confirming publications. I have relied on Grace Baysinger or someone else in the library with more expertise to run these manual searches and give me a list of IDs to import. The technical process to import is outlined here.
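The manual import boils down to handing the harvester a clean list of WoS UIDs. As a minimal sketch (the helper name, the exact UID pattern, and the sample IDs are all hypothetical illustrations, not sul-pub code), you can sanity-check the list before importing:

```ruby
# Illustration only: screen a list of candidate Web of Science UIDs before
# importing. The "WOS:" prefix is the form WoS UIDs take; the exact pattern
# and helper name here are hypothetical.
WOS_UID_PATTERN = /\AWOS:\w+\z/

def screen_wos_uids(ids)
  # strip whitespace, drop blanks, then split into plausible vs. suspect IDs
  ids.map(&:strip).reject(&:empty?).partition { |id| id.match?(WOS_UID_PATTERN) }
end

importable, rejected = screen_wos_uids(['WOS:000318550800690', ' WOS:000317872800004 ', 'not-a-uid', ''])
puts "importable: #{importable.inspect}"
puts "rejected:   #{rejected.inspect}"
```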
If an author has a very large number of publications, it is likely that many of them are incorrect matches. This request also typically comes in via a HelpSU ticket to the Profiles project manager and is then routed to the sul-pub team. Steps to take:
- Confirm that we really want to delete publications for the given author. This is destructive and should be done cautiously; it also requires the Profiles team to delete the publications from their database so that the library sul-pub system and the Profiles system remain in sync. The alternative is for the user to "deny" the publications in the Profiles UI, which is a better and safer choice for smaller numbers and requires no technical intervention, but is perhaps too onerous if it's in the hundreds or thousands.
- Use the rake task outlined here under "author publication cleanup" to delete them. You can use date ranges and publication provenance (i.e. the source of the publication) to limit the deletions.
- Follow up with the Profiles team about what you did so they can delete from their end.
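The scoping the cleanup task offers can be pictured with a toy example. This is purely illustrative Ruby (the real deletion is done by the rake task above), showing how a date range plus provenance narrows the set of candidate deletions:

```ruby
require 'date'

# Toy data standing in for an author's harvested publications; the real
# cleanup rake task operates on database records, not hashes like these.
pubs = [
  { id: 1, provenance: 'wos',    harvested_on: Date.new(2019, 5, 1) },
  { id: 2, provenance: 'pubmed', harvested_on: Date.new(2019, 6, 1) },
  { id: 3, provenance: 'wos',    harvested_on: Date.new(2021, 1, 15) }
]

# limit deletions to WoS-sourced publications harvested during 2019
range = Date.new(2019, 1, 1)..Date.new(2019, 12, 31)
to_delete = pubs.select { |p| p[:provenance] == 'wos' && range.cover?(p[:harvested_on]) }
puts to_delete.map { |p| p[:id] }.inspect  # => [1]
```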
This request also typically comes in via a HelpSU ticket to the Profiles project manager and is then routed to the sul-pub team. The use case is an author who has two database rows in the system with two different IDs that need to be merged into a single author record. You need to know which is the "primary" record and which is the "duped" record, since you will be merging one into the other and want to be sure you end up with all of the publications associated with the correct author record.
Steps to take:
- Confirm the author and confirm that the identifiers you are given make sense by looking at the author records, e.g. if you are given a primary author cap_profile_id='1234' and a duped author cap_profile_id='4567', do a sanity check to make sure both records refer to the same person:
On the rails console:
Author.find_by_cap_profile_id(1234) # the primary author record
Author.find_by_cap_profile_id(4567) # the duped author record
- Merge the profiles using the rake task outlined here.
- Communicate results back to the Profiles team.
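Conceptually, a successful merge leaves the primary record with the de-duplicated union of both records' publications; a trivial sketch of that invariant (illustrative only, with made-up IDs; the real merge is done by the rake task above):

```ruby
# Publication IDs attached to each author record (made-up numbers).
primary_pub_ids = [111, 222, 333]
duped_pub_ids   = [222, 444]

# After merging, the primary author should hold the de-duplicated union.
merged = primary_pub_ids | duped_pub_ids
puts merged.inspect  # => [111, 222, 333, 444]
```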
This may come about if a faculty member notices bad data that was harvested from the Web of Science. To report this to Clarivate, fill in the form at https://support.clarivate.com/ScientificandAcademicResearch/s/datachanges?language=en_US or search for the record in the Web of Science and then click the corrections link. Note that Clarivate will not tell you if/when they correct the data, and even if they do, picking up the fix would require updating the record from Clarivate in our database and then re-generating the pub hash.
If the Clarivate API goes down, you may get a support request. You will need to contact Clarivate support. This process is well outlined in DevOpsDocs
You may need to contact NIH. This process is well outlined in DevOpsDocs
As documented here: https://github.com/sul-dlss/sul_pub/issues/1201
This is a custom report for SMCI. Full documentation is at the top of the lib/smci_report.rb class.
Note that "lookback" windows refer to how far back from today you want to fetch publications. There are two groups of publications fetched:
- users who are in profiles
- users who are not in profiles
For users in profiles, you specify a date in DD/MM/YYYY format and all approved publications harvested after that date are returned. For users not in profiles, you use a "lookback" window that must be a valid value as specified in the WoS API. Allowed values are the "Symbolic Time Span" values described at the bottom of this page: https://github.com/sul-dlss/sul_pub/wiki/Clarivate-APIs Note that these are not specific dates, but rather periods of time to look back over.
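Because the Profiles-user cutoff is DD/MM/YYYY rather than the US-style MM/DD/YYYY, it is easy to misread a value like '5/3/2010'. A quick stdlib check shows how such a date is interpreted:

```ruby
require 'date'

# The cutoff date for Profiles users is DD/MM/YYYY, so '5/3/2010' means
# 5 March 2010, not May 3. Parsing with an explicit format avoids ambiguity.
cutoff = Date.strptime('5/3/2010', '%d/%m/%Y')
puts cutoff.iso8601  # => "2010-03-05"
```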
Other caveats:
- for users who are in profiles, inclusion is determined solely by the sunetid or cap_profile_id passed in the input csv file ... even if these users happen to have their harvesting flag or their profile disabled, so it's possible you will get 0 publications returned for these users
- for users who are not in profiles, the search performed against WoS is a simple name search, so there will be false positives for common names
- the output file is in UTF-8 format; if you open it in Excel or another program that doesn't correctly handle UTF-8, you will see bad character substitutions (such as smart quotes turned into odd characters)
Note that you will probably want to run this using screen since it may take a while:
# get on VPN first and then do a kinit so you can access the server correctly
# you will probably want to sftp the input csv file from your laptop to the server
sftp pub@sul-pub-prod
cd sul-pub/current/tmp
put authors.csv
ssh pub@sul-pub-prod
screen -list # list existing screens
screen # start a new screen
cd sul-pub/current
RAILS_ENV=production bundle exec rake sul:smci_export['tmp/authors.csv','tmp/results.csv','1/1/2000','10year'] # with a cutoff date (Profiles users) and lookback window (non-Profiles users)
RAILS_ENV=production bundle exec rake sul:smci_export['tmp/authors.csv','tmp/results.csv',,] # for all time
# when the process is running, ctrl-a ctrl-d to detach from the screen
screen -r # to re-attach later
exit # to kill the screen when done processing
# tail the log file to see the output when running or view when done
tail -99f log/smci_export.log
# count the results
wc tmp/results.csv
# see the results
less tmp/results.csv
# you will probably want to sftp the output file back to your laptop when it is done running
sftp pub@sul-pub-prod
cd sul-pub/current/tmp
get results.csv
Here is a sample input CSV file with five users. The first three are Stanford authors, two with a sunetid and the third with a cap_profile_id specified. Either works, with sunetid taking precedence if both are supplied. The next two are non-Stanford authors, who leave sunetid and cap_profile_id blank but include names and optionally institutions as a comma-delimited list. If no institutions are provided, Stanford is used by default. You can also supply an ORCID instead of a name for non-Profiles users, which will use the ORCID search against WoS. Finally, you can include an optional symbolicTimeSpan per user for non-Profiles users.
If you are running just for Profiles users, you can instead use a single header column called "sunetid" and ensure each row has a value; you don't need the full header if you are only supplying sunetids as input.
sunetid,cap_profile_id,first_name,middle_name,last_name,institutions,orcid,time_span
altmann,
sunetuser,
,49353
,,bill,a,clinton,
,,Condoleezza,,Rice,"stanford,harvard"
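The split between Profiles and non-Profiles rows described above can be sketched with plain Ruby (illustrative only, not the actual SMCI report code): a row counts as a Profiles user when it has a sunetid or cap_profile_id, and falls back to a name/ORCID search otherwise.

```ruby
require 'csv'

# The sample input file from above.
data = <<~CSV
  sunetid,cap_profile_id,first_name,middle_name,last_name,institutions,orcid,time_span
  altmann,
  sunetuser,
  ,49353
  ,,bill,a,clinton,
  ,,Condoleezza,,Rice,"stanford,harvard"
CSV

rows = CSV.parse(data, headers: true)
# Profiles users supply a sunetid or cap_profile_id; everyone else is
# searched in WoS by name (or ORCID).
profiles, non_profiles = rows.partition do |row|
  !row['sunetid'].to_s.strip.empty? || !row['cap_profile_id'].to_s.strip.empty?
end
puts "Profiles users:     #{profiles.size}"      # => 3
puts "non-Profiles users: #{non_profiles.size}"  # => 2
```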
It is possible to refresh a Pubmed record with updated data from Pubmed and then rebuild the pub hash. This is useful for a Pubmed provenance record that had a typo in the originally harvested data that has since been fixed upstream:
pmid = '25277988'
pub = Publication.find_by_pmid(pmid)
pub.update_from_pubmed # re-fetches the record from Pubmed and rebuilds the pub hash
You may see this HB alert: https://app.honeybadger.io/projects/50046/faults/80045843
We suspect this occurs when a user has removed Stanford as a trusted organization on their ORCID profile but did not edit their permissions in our integration at authorize.stanford.edu. So when we try to use the token we get from the MaIS API, it is refused by ORCID. There isn't much we can do about this. See https://docs.google.com/document/d/1ZfNmfBzPTYm7aJpwrWAx6nXHvVvt1AfkOmSceOxCoXo/edit#heading=h.pdmvl8v0tpqr