Biographical records #2

Merged
merged 8 commits into from
Jul 25, 2024


tijmenbaarda
Contributor

This PR adds fields for biographical records. It also defines a LocationField.

Comment on lines +233 to +241
edpoprec:name a rdf:property ;
skos:prefLabel "Name"@en ;
rdfs:domain edpoprec:BiographicalRecord ;
rdfs:range edpoprec:Field .

edpoprec:variantName a rdf:property ;
skos:prefLabel "Name"@en ;
rdfs:domain edpoprec:BiographicalRecord ;
rdfs:range edpoprec:Field .


It may be cleaner to add variants as edpoprec:name. If there is some crucial distinction between name and variantName, you should explain it here.

Contributor Author

I agree that this is not very clean (same for BibliographicalRecord with alternative titles). My reasoning to use this anyway was that:

  • We need a default if there are multiple options
  • Most catalogues I have seen have a sharp distinction between the title/name and alternative titles/names

But this solution is not very clean and not flexible either.

Perhaps the following option will work:

  • Add a subfield 'default' to each Field with a boolean value
  • Make all fields allow multiple values (the additional advantage is that this implies that all fields will be repeatable, which is in fact already the case in RDF but not in the Python API)
  • Provide some convenience methods on Record in the Python API to quickly get the default field of a certain property
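
A minimal sketch of what this proposal could look like in the Python API. The class names, the `default` attribute and the `get_default` method are hypothetical and only illustrate the idea; the name values are illustrative too.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Field:
    """Hypothetical stand-in for a field value; not the real API class."""
    original_text: str
    default: bool = False  # the proposed boolean 'default' subfield


@dataclass
class Record:
    """Hypothetical record in which every property holds a list of fields."""
    fields: dict[str, list[Field]] = field(default_factory=dict)

    def add(self, prop: str, value: Field) -> None:
        self.fields.setdefault(prop, []).append(value)

    def get_default(self, prop: str) -> Field | None:
        """Return the field marked as default, else the first one, else None."""
        values = self.fields.get(prop, [])
        for value in values:
            if value.default:
                return value
        return values[0] if values else None


record = Record()
record.add("name", Field("Йост ван ден Вондел"))
record.add("name", Field("Joost van den Vondel", default=True))
print(record.get_default("name").original_text)  # prints Joost van den Vondel
```

Note that nothing in this sketch prevents two values from being marked as default; `get_default` simply returns the first one it finds.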

@jgonggrijp @lukavdplas: do you have an opinion about this problem?


Hm, I don't know whether "alternative name" is semantically meaningful (or if other catalogues just include it for technical reasons), but if so, that distinction makes sense.

Some more thoughts:

  • I don't know if there are specific reasons for needing a default name? Perhaps it's not necessary. As long as your interface can display multiple names, that can be fine too.
  • I would not use a default subfield - setting this to true for one and only one property within the range of an entity is hard to enforce.
  • An alternative that is at least easier to validate: you could have :name point to multiple objects and add a functional property :defaultName. Thus, an entity might have five :name relations, one of which is also its :defaultName.
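
To illustrate why a functional :defaultName is easier to validate than a boolean flag, here is a rough sketch that checks both constraints on plain (subject, predicate, object) tuples. The function name and the tuple representation are assumptions made for this example; a real implementation would presumably work on an RDF graph.

```python
def check_default_name(triples, subject):
    """Return a list of problems with the default name of `subject`."""
    names = {o for s, p, o in triples if s == subject and p == "name"}
    defaults = [o for s, p, o in triples if s == subject and p == "defaultName"]
    problems = []
    # A functional property may have at most one value per subject.
    if len(defaults) > 1:
        problems.append("defaultName is functional but has several values")
    # The default name should also be one of the regular names.
    for value in defaults:
        if value not in names:
            problems.append(f"defaultName {value!r} is not among the names")
    return problems


triples = [
    ("vondel", "name", "Joost van den Vondel"),
    ("vondel", "name", "Йост ван ден Вондел"),
    ("vondel", "defaultName", "Joost van den Vondel"),
]
print(check_default_name(triples, "vondel"))  # prints []
```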

Contributor Author

The distinction will be important, because some records have more than ten alternative names/titles. That happens especially with CERL Thesaurus as this database tries to record all variants of a name across Europe. But they have selected the most common or 'normal' one as the main name. We will need that as well; for instance, the entry for Vondel has over 50 variant names, of which the first one is Йост ван ден Вондел, and I don't think we want to show this one first :)

Contributor Author

If you add a separate (functional) property defaultName, I don't really see the advantage over having separate normal properties... I also think that it will make the implementation more complicated.

It will be problematic indeed to enforce that there is only one field marked as default. Not sure if that is necessary, but certainly not very clean either.

Contributor Author

This is how it was solved in BIBFRAME (the linked data successor of MARC 21): https://www.loc.gov/bibframe/docs/pdf/bf2-titles-march2017.pdf

So a work has only one title property, title, of class Title. Since it is RDF it can of course have multiple values. But Title has various subclasses, such as AlternativeTitle, AbbreviatedTitle and also KeyTitle, so that the distinction can still be made.

This could work in our case as well; then we would have to create subclasses of Field both in the ontology and in the API, and we can simply follow the semantics of the catalogues to decide whether something is 'alternative' or not.
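
On the API side, the subclass approach could look roughly like the sketch below. `Field`, `AlternativeField` and `split_values` are hypothetical names chosen for this example, not existing classes or functions.

```python
class Field:
    """Hypothetical base class for field values."""

    def __init__(self, original_text: str) -> None:
        self.original_text = original_text


class AlternativeField(Field):
    """Hypothetical subclass for values the catalogue marks as alternative."""


def split_values(values):
    """Separate main values from alternative ones based on their class."""
    main = [v for v in values if not isinstance(v, AlternativeField)]
    alternative = [v for v in values if isinstance(v, AlternativeField)]
    return main, alternative


# A single 'name' property holds all values; the class carries the distinction.
names = [
    AlternativeField("Йост ван ден Вондел"),
    Field("Joost van den Vondel"),
]
main, alternative = split_values(names)
print([v.original_text for v in main])  # prints ['Joost van den Vondel']
```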

What about this solution?

Member

I would keep it simple and go with name and alternativeName. That is also conventional, e.g. SKOS does that with labels. You can also add a preferredName in the mix, if you think this is useful. All those properties can have multiple values, for example in order to support multiple languages.

variantName sounds like it is the name of a variant of something, so I think that name would be less suitable. But the principle seems correct.
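
As a sketch of how the name / alternativeName split could work with multiple values per property, the helper below picks a display name by language preference. The dictionary layout, the `display_name` helper, the language tags and the abbreviated variant are all assumptions made for illustration.

```python
def display_name(record, preferred_languages):
    """Pick one value of 'name', trying the preferred languages in order."""
    names = record.get("name", [])
    for language in preferred_languages:
        for text, lang in names:
            if lang == language:
                return text
    # No language matched: fall back to the first name, if there is one.
    return names[0][0] if names else None


# Each property holds a list of (text, language tag) pairs.
record = {
    "name": [("Joost van den Vondel", "nl"), ("Йост ван ден Вондел", "ru")],
    "alternativeName": [("J. v. Vondel", "nl")],
}
print(display_name(record, ["en", "nl"]))  # prints Joost van den Vondel
```

Alternative names are kept on the record but are never used for display in this sketch.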

Comment on lines +253 to +268
edpoprec:placeOfActivity a rdf:property ;
skos:prefLabel "Place of activity"@en ;
rdfs:domain edpoprec:BiographicalRecord ;
rdfs:range edpoprec:LocationField .

edpoprec:timespan a rdf:property ;
skos:prefLabel "Timespan"@en ;
skos:description "The years that the person was alive or the entity was existing" ;
rdfs:domain edpoprec:BiographicalRecord ;
rdfs:range edpoprec:DatingField .

edpoprec:activity a rdf:property ;
skos:prefLabel "Activity"@en ;
skos:description "An activity that the entity is known for" ;
rdfs:domain edpoprec:BiographicalRecord ;
rdfs:range edpoprec:Field .

@lukavdplas lukavdplas Oct 17, 2023


I might expect something like

edpoprec:activity rdfs:domain edpoprec:BiographicalRecord ; rdfs:range edpoprec:Activity .
edpoprec:placeOfActivity rdfs:domain edpoprec:Activity .
edpoprec:timespan rdfs:domain edpoprec:Activity .

That is, that placeOfActivity and timespan have activities as their domain.

If a biographical record describes a person with multiple activities, and placeOfActivity is described for each, this setup would be ambiguous.
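
The ambiguity can be sketched in Python as follows. The `Activity` class and the example values are made up for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Activity:
    """Hypothetical nested value: an activity with its own place and timespan."""
    description: str
    place: Optional[str] = None
    timespan: Optional[str] = None


# Flat layout: nothing says which place belongs to which activity.
flat_record = {
    "activity": ["printer", "bookseller"],
    "placeOfActivity": ["Amsterdam", "Leiden"],
}

# Nested layout: each activity carries its own place.
nested_record = {
    "activity": [
        Activity("printer", place="Amsterdam"),
        Activity("bookseller", place="Leiden"),
    ],
}

places = {a.description: a.place for a in nested_record["activity"]}
print(places)  # prints {'printer': 'Amsterdam', 'bookseller': 'Leiden'}
```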

Contributor Author

I agree. We will need nested Fields then. But I think we are going to need them anyway, because there is more information that is grouped in similar ways.

@lukavdplas @jgonggrijp What are your views on nested fields? That is, fields whose subfields can hold not only atomic values like strings, but also other fields? This makes the structure of records more complicated. However, all fields (and subfields) can still be rendered as simple strings, so users who do not need the extra structure can simply disregard it.
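
A rough sketch of a nested field that can still be rendered as a simple string. The `Field` class, its `subfields` attribute and the rendering format are assumptions for this example and do not reflect the actual API.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Union


@dataclass
class Field:
    """Hypothetical field whose subfields are strings or other fields."""
    subfields: dict[str, Union[str, Field]] = field(default_factory=dict)

    def __str__(self) -> str:
        parts = []
        for key, value in self.subfields.items():
            if isinstance(value, Field):
                # Nested fields are rendered recursively, in parentheses.
                parts.append(f"{key}: ({value})")
            else:
                parts.append(f"{key}: {value}")
        return "; ".join(parts)


activity = Field({
    "description": "printer",
    "place": Field({"name": "Amsterdam"}),
})
print(activity)  # prints description: printer; place: (name: Amsterdam)
```

Code that does not care about the nesting can call `str()` on any field, while other code can still walk the `subfields` dictionary.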


Nested fields make sense to me. It's kind of the strength of RDF that you can easily create data structures with any level of depth.

Member

If we can somehow extract that information, then yes, I agree that the nested structure makes more sense.

@tijmenbaarda tijmenbaarda merged commit 910805c into develop Jul 25, 2024
1 check passed