Transformation format for OpenRefine to MODS XML
Tools Required
- Download OpenRefine: http://openrefine.org/download.html
OpenRefine's scripting language is GREL. Foundations course https://courses.tranzf.org/course/view.php?id=18
- documentation https://github.com/OpenRefine/OpenRefine/wiki/General-Refine-Expression-Language
- understanding expressions https://github.com/OpenRefine/OpenRefine/wiki/Understanding-Expressions
general artifacts SU sheets
Remove quotes from Templating output: add .replace('"', '') to the value inputs in the XML template. Example: {{jsonize(cells["SU Number"].value).replace('"', '')}}
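To check what the quote-stripping step should produce without opening OpenRefine, here is a minimal Python sketch. It assumes json.dumps behaves like GREL's jsonize for plain string cells; the function name strip_jsonize_quotes and the sample value are hypothetical.

```python
import json

def strip_jsonize_quotes(value):
    """Python stand-in for jsonize(value).replace('"', '') on a string cell."""
    # json.dumps wraps the string in double quotes, like a JSON literal;
    # the replace call then removes every double quote.
    return json.dumps(value).replace('"', '')

# Hypothetical cell value, for illustration only.
print(strip_jsonize_quotes("SU 1001"))
```

In this Python stand-in, a value that itself contains a double quote comes out with a leftover backslash, so it may be worth checking such cells in the real export.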
Fixing Date
OpenRefine changes the date format when a project is opened, and the new format is unwanted. This is h
- Transform to use the correct year. Expression: slice(value, 4, 10) + " 2014"
- Transform Expression: value.toDate().toString('yyyy-MM-dd')
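The two date transforms can be sketched in Python to confirm the expected result. This assumes the unwanted format looks like "Mon Jun 16 00:00:00 EDT 2015" (month and day at positions 4 to 10), which is a guess at what OpenRefine shows; the function name fix_date and the sample value are hypothetical.

```python
from datetime import datetime

def fix_date(value, year="2014"):
    """Keep the month and day, force the year, and reformat as yyyy-MM-dd."""
    # Same character span as slice(value, 4, 10), e.g. "Jun 16".
    month_day = value[4:10]
    # Same idea as appending " 2014" and then calling toDate().
    parsed = datetime.strptime(month_day + " " + year, "%b %d %Y")
    # Same output pattern as toString('yyyy-MM-dd').
    return parsed.strftime("%Y-%m-%d")

# Assumed input format, for illustration only.
print(fix_date("Mon Jun 16 00:00:00 EDT 2015"))  # 2014-06-16
```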
###Catalogued Objects (CO)###
DO NOT store blank rows.
DO NOT store blank cells as null.
Be sure that the spellings of the columns are the same as in the CO template.
DELETE Columns
- 'Cat. Sheet'
- 'Date Found'
- 'Date Catalogued'
- 'Photo Num.'
- 'Photos'
- 'Drawings'
- 'Bibliography'
- 'Comments'
- 'Date Issued'
- For each column above: Edit column -> Remove this column
Rename 'Brief Identification' to 'Object'
- 'Brief Identification' -> Edit column -> Rename this column
- In the popup window type 'Object'
Rename 'Sp.Obj.' to 'Special'
- 'Sp.Obj.' -> Edit column -> Rename this column
- In the popup window type 'Special'
Rename 'Detailed Description' to 'Description'
- 'Detailed Description' -> Edit column -> Rename this column
- In the popup window type 'Description'
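The column deletions and renames above can be summarized in a short Python sketch, useful for checking an exported row against the CO template. It assumes each row is a plain dict keyed by column name; the function name tidy_row and the sample row are hypothetical.

```python
# Columns to delete, as listed under DELETE Columns.
DROP = [
    "Cat. Sheet", "Date Found", "Date Catalogued", "Photo Num.", "Photos",
    "Drawings", "Bibliography", "Comments", "Date Issued",
]

# Old column name -> new column name.
RENAME = {
    "Brief Identification": "Object",
    "Sp.Obj.": "Special",
    "Detailed Description": "Description",
}

def tidy_row(row):
    """Drop the unwanted columns and rename the rest where needed."""
    return {RENAME.get(name, name): cell
            for name, cell in row.items()
            if name not in DROP}

# Hypothetical row, for illustration only.
row = {"Brief Identification": "lamp", "Photos": "3", "Sp.Obj.": "yes"}
print(tidy_row(row))
```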
Split 'Locus/SU' into two columns if not already
- 'Locus/SU' -> Split into several columns
- by separator '.'
- Deselect 'Remove this column' to be safe
- OK
- Rename first column 'Locus'; Rename second column 'SU'
- Remove original column
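As a rough check of the split step, the same separator logic in Python looks like this. It assumes a 'Locus/SU' value has the form locus.su with a single '.'; the function name split_locus_su and the sample value are hypothetical.

```python
def split_locus_su(value):
    """Split a 'Locus/SU' value on '.' into (locus, su)."""
    parts = value.split(".")
    locus = parts[0]
    # If there is no '.', leave SU empty rather than failing.
    su = parts[1] if len(parts) > 1 else ""
    return locus, su

# Hypothetical cell value, for illustration only.
print(split_locus_su("1001.23"))
```

Values with more than one '.' would need a closer look, since this sketch only keeps the first two parts.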
Remember to change the value in the template to today's date.
Generating PURLs using the Cataloged Objects column
- Click on the Cataloged Objects column
- Click 'add column based on this column'
- Name: Purl
- Select 'set to blank'
- Expression: forEach(value.split(", "), v, "http://purl.flvc.org/fsu/fd/FSU_Digital_Cosa_" + v)
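The forEach expression builds one PURL per comma-separated ID. A minimal Python sketch of the same logic, with a hypothetical function name make_purls and made-up IDs:

```python
PURL_BASE = "http://purl.flvc.org/fsu/fd/FSU_Digital_Cosa_"

def make_purls(value):
    """Build one PURL for each ID in a ', '-separated cell value."""
    # Same idea as forEach(value.split(", "), v, PURL_BASE + v).
    return [PURL_BASE + v for v in value.split(", ")]

# Hypothetical IDs, for illustration only.
for purl in make_purls("CO1, CO2"):
    print(purl)
```

The result is a list of URLs; how that list is stored or joined in the Purl column is not covered by this sketch.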
Dealing with notes for Covers, Cuts, Cut By, etc.
Both GREL expressions work; however, the first one checks for blank cells, which is the better practice.
- Expression w/ blank check: {{forNonBlank(cells["Covers"], c, jsonize(c.value), "").replace('"', '')}}
- Expression w/o blank check: {{jsonize(cells["Covered By"].value).replace('"', '')}}
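The difference between the two expressions can be sketched in Python. This assumes a blank cell arrives as None or an empty string and that json.dumps stands in for jsonize; the function names are hypothetical.

```python
import json

def note_with_check(cell_value):
    """Stand-in for the forNonBlank version: blank cells become ''."""
    if cell_value is None or cell_value == "":
        return ""
    return json.dumps(cell_value).replace('"', '')

def note_without_check(cell_value):
    """Stand-in for the plain jsonize version: no guard for blank cells."""
    return json.dumps(cell_value).replace('"', '')

print(repr(note_with_check(None)))     # blank cell gives an empty string
print(repr(note_without_check(None)))  # blank cell gives the text null
```

In this stand-in the unguarded version writes the literal text null for a blank cell, which suggests why the guarded expression is the safer choice for the XML output.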
Correcting the Season if incorrect
- Expression: replace(value, "Fall", "Summer")
Create column for recordCreationDate
- Select 'SU Number' column and click 'add column based on column SU Number' (it can really be any column, I'm just picking SU Number for indecisive folks)
- New Column Name: Record Date
- select 'set to blank'
- Expression: now().toString('yyyy-MM-dd')
- Click 'OK'
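For reference, the value the Record Date column should hold can be reproduced in Python. This is a minimal sketch, assuming the local date is what is wanted; the function name record_date is hypothetical.

```python
from datetime import date

def record_date(today=None):
    """Return a date formatted as yyyy-MM-dd, defaulting to today."""
    # Same idea as now().toString('yyyy-MM-dd'); passing a date in makes
    # the function easy to check against a known value.
    today = today or date.today()
    return today.strftime("%Y-%m-%d")

print(record_date())
```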