Blue Mountain Identifiers and Naming Conventions
Blue Mountain is a database of magazines: that is, it is a system that links information together. In order to link information about specific objects – magazine titles, magazine issues, etc. – Blue Mountain must assign a unique identifier to each one. (This is a well-known feature of information science.) Doing so allows us – programs and people – to refer to these things unambiguously.
Blue Mountain adopts the Uniform Resource Name (URN) conventions to compose unique identifiers for titles and issues, as well as for the metadata records (METS and MODS) used to encode information about them.
Blue Mountain assigns a URI to each magazine, magazine issue, METS record, and MODS record it maintains. We have developed a convention for composing these URIs. Formally, the convention can be expressed like this:
    <BMTNPREFIX>   ::= "urn:PUL:bluemountain"
    <BMTNID>       ::= "bmtn" <a-z><a-z><a-z>
    <DATESTRING>   ::= CCYY-MM-DD | CCYY-MM | CCYY
    <ISSUEINDEX>   ::= <0-9><0-9>
    <ISSUANCE>     ::= <DATESTRING> "_" <ISSUEINDEX>
    <ISSUEID>      ::= <BMTNID> "_" <ISSUANCE>
    <TITLEURI>     ::= <BMTNPREFIX> ":" <BMTNID>
    <ISSUEURI>     ::= <BMTNPREFIX> ":" <ISSUEID>
    <TITLEMETSURI> ::= <BMTNPREFIX> ":td:" <BMTNID>
    <ISSUEMETSURI> ::= <BMTNPREFIX> ":td:" <ISSUEID>
    <TITLEMODSURI> ::= <BMTNPREFIX> ":dmd:" <BMTNID>
    <ISSUEMODSURI> ::= <BMTNPREFIX> ":dmd:" <ISSUEID>
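As an illustration, the grammar can be transcribed into a regular expression that checks a candidate URI and splits it into its parts. This is only a sketch: the names `BMTN_URI` and `parse_bmtn_uri` are hypothetical and not part of any Blue Mountain tooling. It checks only the shape of the date string, not whether the month or day is a real calendar value.

```python
import re

# Building blocks transcribed from the grammar above.
BMTN_PREFIX = r"urn:PUL:bluemountain"
BMTN_ID = r"bmtn[a-z]{3}"
# CCYY, CCYY-MM, or CCYY-MM-DD (shape only; no calendar validation).
DATE_STRING = r"[0-9]{4}(?:-[0-9]{2}(?:-[0-9]{2})?)?"
ISSUE_INDEX = r"[0-9]{2}"
ISSUANCE = rf"{DATE_STRING}_{ISSUE_INDEX}"

# One pattern for all six URI forms: the optional "td:" / "dmd:" token and
# the optional issuance part distinguish them.
BMTN_URI = re.compile(
    rf"{BMTN_PREFIX}:(?:(?P<kind>td|dmd):)?"
    rf"(?P<bmtnid>{BMTN_ID})(?:_(?P<issuance>{ISSUANCE}))?"
)


def parse_bmtn_uri(uri):
    """Return the parts of a URI as a dict, or None if it does not match."""
    match = BMTN_URI.fullmatch(uri)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(parse_bmtn_uri("urn:PUL:bluemountain:dmd:bmtnaad_1920-04_01"))
    print(parse_bmtn_uri("urn:PUL:bluemountain:bmtn123"))
```

In the returned dict, a `kind` of `None` means the URI names the title or issue itself rather than one of its metadata records, and an `issuance` of `None` means the URI refers to a title.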
Let’s take a more discursive look at this syntax.
The Blue Mountain Prefix is a fixed string that follows the functional requirements specified in RFC 1737. It comprises the fixed string “urn” and a namespace identifier (NID), “PUL”[2]. Within the PUL namespace, the string bluemountain is intended to signify resources produced by and for the Blue Mountain project.
Blue Mountain assigns a sequential Blue Mountain identifier, or bmtnid, to each title. The bmtnid takes the form bmtnNNN, where NNN is a hexavigesimal (base-26) number written with the letters a to z (e.g., aaa, aab, aac, etc.)[1]. Blue Mountain will maintain bmtnids in a project registry, a file maintained with other administrative files.
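The sequential numbering can be sketched as a base-26 conversion. The sketch below assumes that the sequence starts at aaa (index 0) and that a stands for 0 and z for 25; the source gives only the examples aaa, aab, aac, so the zero-based start is an assumption. The function names are hypothetical.

```python
import string

ALPHABET = string.ascii_lowercase  # 26 "digits": a = 0, b = 1, ..., z = 25
BASE = len(ALPHABET)


def bmtnid_from_index(index):
    """Convert a zero-based sequence number into a bmtnid (assumed scheme)."""
    if not 0 <= index < BASE ** 3:
        raise ValueError("index must fit in three base-26 digits")
    first, rest = divmod(index, BASE * BASE)
    second, third = divmod(rest, BASE)
    return "bmtn" + ALPHABET[first] + ALPHABET[second] + ALPHABET[third]


def index_from_bmtnid(bmtnid):
    """Inverse of bmtnid_from_index: read the three letters as base-26 digits."""
    suffix = bmtnid[len("bmtn"):]
    if not bmtnid.startswith("bmtn") or len(suffix) != 3:
        raise ValueError(f"not a bmtnid: {bmtnid!r}")
    value = 0
    for letter in suffix:
        # str.index raises ValueError for characters outside a-z.
        value = value * BASE + ALPHABET.index(letter)
    return value


if __name__ == "__main__":
    print([bmtnid_from_index(i) for i in range(3)])
    print(index_from_bmtnid("bmtnaad"))
```

Three letters allow 26 × 26 × 26 = 17,576 distinct titles under this scheme.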
Blue Mountain represents two conceptual objects:
- Titles: A journal or magazine taken as a whole (e.g., The Signature was a magazine published by D. H. Lawrence and John Middleton Murry).
- Issues: The periodic output under a Title (e.g., the first issue of The Signature appeared on October 4th, 1915).
Blue Mountain does not have an explicit conception of Volume, an aggregation of issues. Sometimes volumes were explicitly compiled by the publisher; sometimes they were created by collectors – libraries or individual subscribers. We will address the question of volumes at a later stage.
Blue Mountain represents titles and issues with two kinds of metadata: descriptive metadata about the entity (its title, place and date of publication, and so forth) and technical metadata about the digital files that comprise its representation in Blue Mountain.
Blue Mountain has adopted the MODS framework to encode descriptive metadata and the METS framework to encode technical metadata.
There are, then, six distinct kinds of object in Blue Mountain:
- Titles
- Issues of Titles
- The descriptive metadata for a title
- The descriptive metadata for an issue
- The technical metadata for a title
- The technical metadata for an issue
The URI of an object indicates what type of object it is. A title is represented by its bmtnid and an issue by its issueid. The descriptive metadata for a title or issue uses the same bmtnid or issueid but inserts the token :dmd: between the Blue Mountain Prefix and the id; similarly, the URI for technical metadata contains the token :td: between the prefix and the id.
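A minimal sketch of reading the object type off a URI follows. It assumes that any id containing an underscore is an issueid, which holds for ids built by the convention because a bmtnid never contains one. The function name and the labels it returns are hypothetical.

```python
PREFIX = "urn:PUL:bluemountain:"

# Tokens from the convention, mapped to the record type they signal.
TOKENS = {"dmd:": "MODS", "td:": "METS"}


def object_type(uri):
    """Return (record, entity), e.g. ("MODS", "issue") or ("object", "title")."""
    if not uri.startswith(PREFIX):
        raise ValueError(f"not a Blue Mountain URI: {uri!r}")
    rest = uri[len(PREFIX):]
    record = "object"  # no token: the URI names the title or issue itself
    for token, name in TOKENS.items():
        if rest.startswith(token):
            record, rest = name, rest[len(token):]
            break
    # An issueid is a bmtnid followed by "_" and an issuance string.
    entity = "issue" if "_" in rest else "title"
    return record, entity


if __name__ == "__main__":
    print(object_type("urn:PUL:bluemountain:td:bmtnaad"))
    print(object_type("urn:PUL:bluemountain:bmtnaad_1920-04_01"))
```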
The concept of issuance is critical to the ontology of periodicals. Blue Mountain’s formalization of issuance continues to evolve, but at this stage we have adopted the formalism developed by NDNP and DL Consulting (?) for describing newspaper issues.
In Blue Mountain, each magazine issue is assigned an issue identifier or issueid of the form
bmtnid_issuanceString
where issuanceString corresponds to the date of issuance and takes the form CCYY-MM-DD_II, defined as follows:
- CCYY: A four-digit number representing the year of publication (e.g., 1912).
- MM: A two-digit number representing the month of publication, where January = 01, February = 02, etc.
- DD: A two-digit number representing the day of publication (e.g., 01, 02, ..., 30, 31).
- II: A two-digit index of daily issuance (e.g., the first issue of the day is 01, the second is 02, and so on). This convention is adopted from the issuance of newspapers, which not infrequently issued a morning edition and an evening edition on the same day. Blue Mountain’s adoption allows us to distinguish among magazine texts that were published on the same day (a regular issue and a supplement, for example) and also among editions that bear the same date of issuance but were printed and published in different locations, sometimes with different content.
Issueids, like ISO 8601 dates, are organized from most significant to least significant chronological unit (i.e., from year to day). This format has two advantages: it allows ids to be sorted naturally, and it enables variable precision: the representation of daily, monthly, or yearly issuance.
- If you know the year, month, and day of publication (e.g., you know that the issue was published on January 5th, 1912): the issuance string is 1912-01-05_01.
- If you have two issues published on the same day (e.g., an issue and a special supplement were both issued on January 5th, 1912): the issuance strings are 1912-01-05_01 and 1912-01-05_02.
- If you know only the year and month of publication (e.g., you know it was published in January, 1912, but you do not know on what day): the issuance string is 1912-01_01.
- If you know that the magazine published two issues monthly, but you do not know the dates of publication (e.g., you have two issues published in January, 1912): the issuance strings are 1912-01_01 and 1912-01_02.
- If you know only the year of publication (e.g., you know that the issue was published in 1912, but you do not know the month or, therefore, the day of publication): the issuance string is 1912_01.
- If you have several issues published in the same year, but you know neither the month nor the day of publication (e.g., you know the journal published two issues in 1912, but you do not know the months or days of publication): the issuance strings are 1912_01 and 1912_02.
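The cases above can be sketched as a single helper that takes whatever parts of the date are known. The function name and signature are hypothetical; the sketch assumes the index defaults to 1 and does not check that the month and day form a real calendar date.

```python
def issuance_string(year, month=None, day=None, index=1):
    """Build an issuance string from the known parts of the date.

    Omit `day`, or both `month` and `day`, when they are unknown.
    """
    if day is not None and month is None:
        raise ValueError("a day cannot be given without a month")
    if not 1 <= index <= 99:
        raise ValueError("the issue index must fit in two digits")
    parts = [f"{year:04d}"]
    if month is not None:
        parts.append(f"{month:02d}")
    if day is not None:
        parts.append(f"{day:02d}")
    return "-".join(parts) + f"_{index:02d}"


if __name__ == "__main__":
    print(issuance_string(1912, 1, 5))           # full date known
    print(issuance_string(1912, 1, 5, index=2))  # second issue that day
    print(issuance_string(1912, 1))              # only year and month known
    print(issuance_string(1912, index=2))        # second issue of a known year
```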
The journal le coeur à barbe has the Blue Mountain identifier bmtnaad. Only one issue was published, in April of 1920.
| Object | URI |
|---|---|
| titleid | urn:PUL:bluemountain:bmtnaad |
| title METS id | urn:PUL:bluemountain:td:bmtnaad |
| title MODS id | urn:PUL:bluemountain:dmd:bmtnaad |
| issueid | urn:PUL:bluemountain:bmtnaad_1920-04_01 |
| issue METS id | urn:PUL:bluemountain:td:bmtnaad_1920-04_01 |
| issue MODS id | urn:PUL:bluemountain:dmd:bmtnaad_1920-04_01 |
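The identifiers in this example can be derived mechanically from the bmtnid and the issueid. The sketch below shows one way to do so; the function name `uris_for` and the dictionary keys are hypothetical.

```python
PREFIX = "urn:PUL:bluemountain"


def uris_for(ident):
    """Return the three URIs that belong to one bmtnid or issueid."""
    return {
        "object": f"{PREFIX}:{ident}",    # the title or issue itself
        "METS": f"{PREFIX}:td:{ident}",   # technical metadata record
        "MODS": f"{PREFIX}:dmd:{ident}",  # descriptive metadata record
    }


if __name__ == "__main__":
    for ident in ("bmtnaad", "bmtnaad_1920-04_01"):
        for record, uri in uris_for(ident).items():
            print(f"{record:6} {uri}")
```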
[1] This convention has been adopted to support the naming conventions in Veridian, which prohibit the use of integers in identifiers.
[2] RFC 1737 urges all NIDs to be registered with IANA. PUL is not, to my knowledge, registered with IANA.