LinkML support for future profile development #2
Comments
The LinkML repo is published on PyPI, so could be added as a dependency. The same goes for pySHACL. I'm not really sure of the advantage of supporting LinkML directly, since it will always be converted to SHACL and so embracing SHACL makes more sense to me. For this reason I think it would make sense to create a separate repository with RO-Crate schemas in SHACL format, then add a If you wanted to also support LinkML then you could create another repo with the LinkML, then add a Happy to help with any of this. |
Hi all, this seems like a good proposal @elichad. We were discussing it with @kikkomep and @simleo just yesterday and we’re all in agreement that being able to use it as an alternative to SHACL could make adding support for additional profiles more approachable. Our first impression is that the best way to start integrating LinkML support would be the second approach you suggested:
The word “automatically” needs some discussion, though. We could have a directory within the package for the LinkML profiles, but the profiles wouldn’t actually be used at run time. Instead, we’d propose the simpler solution of converting the profiles to SHACL as part of the development or packaging process. This should make it easier to test the converted profiles and fix things as necessary before release; keeping the conversion step ahead of run time should also help make the tool more robust and easier to debug. Since we discussed this yesterday, @multimeric joined the conversation and also made some points that we should discuss together.
To implement the LinkML -> SHACL conversion, it looks like we can use the SHACL generator you referenced. @kikkomep ran some experiments and managed to create a LinkML validation profile, convert it to SHACL with the generator and use it within rocrate-validator. The process did expose some small bugs in the internal SHACL parsing (which have been fixed), and there is an open issue about how to manage severities.
For the conversion, there could be either a dedicated script or a subcommand that runs the conversion and lays out the resulting ttl following the directory structure used by rocrate-validator for the validation profiles. The profile.ttl file would have to be created manually (though, if we wanted, it wouldn’t be too hard to write a little script to guide the collection of the required metadata).
As I was saying, one thing that needs careful thought is how to attach severities (MUST, SHOULD, MAY) to the LinkML checks. One solution could be to use annotations, but that would need support from the conversion script/subcommand to parse that information out of the resulting ttl and use it to lay out the checks appropriately in the directory structure.
Another alternative, still using LinkML annotations, would be to implement additional SHACL parsing in rocrate-validator to extract the severity annotations (there's already some parsing to extract metadata). We'd be happy to hear of other, better or simpler alternatives. As for helping, we're happy to receive and support PRs on this issue. Let's just agree on the approach before anyone starts hacking :-) |
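To make the "dedicated script" option a bit more concrete, here is a minimal sketch of what such a conversion step could look like. It assumes LinkML's `gen-shacl` command is installed and writes the generated Turtle to stdout. The `must`/`should`/`may` folder names, the severity-to-level mapping, and the helper names (`gen_shacl_command`, `target_path`, `convert`) are hypothetical and only for illustration; they are not taken from rocrate-validator.

```python
import subprocess
from pathlib import Path

# Assumed mapping from SHACL severities to requirement-level folders.
# The folder names are an illustration, not a confirmed layout.
SEVERITY_TO_LEVEL = {
    "sh:Violation": "must",
    "sh:Warning": "should",
    "sh:Info": "may",
}


def gen_shacl_command(schema: Path) -> list[str]:
    """Build the command line for LinkML's SHACL generator."""
    return ["gen-shacl", str(schema)]


def target_path(profile_dir: Path, severity: str, name: str) -> Path:
    """Pick the output file for a converted check, based on its severity."""
    # sh:Violation is the SHACL default when no severity is declared.
    level = SEVERITY_TO_LEVEL.get(severity, "must")
    return profile_dir / level / f"{name}.ttl"


def convert(schema: Path, profile_dir: Path, severity: str) -> Path:
    """Convert one LinkML schema to SHACL and store it in the profile tree."""
    result = subprocess.run(
        gen_shacl_command(schema), check=True, capture_output=True, text=True
    )
    out = target_path(profile_dir, severity, schema.stem)
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(result.stdout, encoding="utf-8")
    return out
```

Running this at development or packaging time, rather than at run time, keeps the generated ttl files available for review and testing before release.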
Next steps after discussion at the Workflow Run RO-Crate meeting today: Make a proof of concept LinkML-SHACL integration, to check that LinkML is a viable option for writing profiles:
After that (assuming LinkML is shown to be viable), we'll look at adding validation for the Five Safes Crate profile with this LinkML-SHACL approach, as this would be useful for our team at Manchester. We'll work on this on the Manchester side; I've just made a fork from which we'll contribute back: https://github.com/eScienceLab/rocrate-validator |
The PR #8 introduces support for the This feature should simplify the process of converting a LinkML specification to SHACL, as the output from the conversion process can be directly used by the validator without requiring the creation of the mentioned folder structure. From my experiments, simply annotating the LinkML slots with the

    Person:
      is_a: NamedThing
      description: >-
        A person....
      class_uri: schema:Person
      slots:
        - primary_email
      slot_usage:
        primary_email:
          pattern: "^\\S+@[\\S+\\.]+\\S+"
          recommended: true
          annotations:
            sh:severity: sh:Warning
            sh:name: "Primary Email Validation"
            sh:description: "This requirement checks the validity of the primary email address."
            sh:message: "The primary email address is not valid."
    ...

By using the LinkML-SHACL conversion tool with the

    schema1:PersonTest a sh:NodeShape ;
        rdfs:subClassOf personinfo:NamedThing ;
        sh:closed true ;
        sh:description "A person...." ;
        sh:ignoredProperties ( rdf:type ) ;
        sh:property [
            sh:datatype xsd:string ;
            sh:description "This requirement checks the validity of the primary email address."^^xsd:string ;
            sh:maxCount 1 ;
            sh:message "Primary email address is not valid."^^xsd:string ;
            sh:name "Primary Email Validation"^^xsd:string ;
            sh:nodeKind sh:Literal ;
            sh:order 0 ;
            sh:path schema1:email ;
            sh:pattern "^\\S+@[\\S+\\.]+\\S+" ;
            sh:severity sh:Warning
        ],
    ...

All that remains is to place it in the appropriate folder within your validation profile. |
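As a quick sanity check that the severity annotation survives the conversion, something like the following could be run over the generated Turtle. This is only a rough, regex-based sketch using the standard library; a real implementation would more likely use a proper RDF parser. The function name `property_severities` is hypothetical, and the sketch assumes simple, non-nested `sh:property [ ... ]` blocks like the one in this thread.

```python
import re

# String literals are blanked first so that brackets inside regex
# patterns (e.g. "[\\S+\\.]") do not confuse the block matcher.
STRING_LITERAL = re.compile(r'"(?:\\.|[^"\\])*"')
PROPERTY_BLOCK = re.compile(r"sh:property\s*\[(.*?)\]", re.DOTALL)
PATH = re.compile(r"sh:path\s+([^\s;]+)")
SEVERITY = re.compile(r"sh:severity\s+(sh:\w+)")


def property_severities(turtle: str) -> dict[str, str]:
    """Map each property shape's sh:path to its declared sh:severity."""
    cleaned = STRING_LITERAL.sub('""', turtle)
    severities = {}
    for block in PROPERTY_BLOCK.findall(cleaned):
        path = PATH.search(block)
        if path is None:
            continue  # not a shape this sketch understands
        severity = SEVERITY.search(block)
        # SHACL treats a missing sh:severity as sh:Violation.
        severities[path.group(1)] = severity.group(1) if severity else "sh:Violation"
    return severities


sample = r'''
schema1:PersonTest a sh:NodeShape ;
    sh:property [
        sh:path schema1:email ;
        sh:pattern "^\\S+@[\\S+\\.]+\\S+" ;
        sh:severity sh:Warning
    ] .
'''
print(property_severities(sample))
```

For the shape shown in this thread, the email property should come back with `sh:Warning`, matching the annotation on the LinkML slot.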
@kikkomep amazing! Thank you for implementing this so quickly! |
Hi all, have there been any recent updates on LinkML implementation here? |
@multimeric I'm still working on this - here's a branch on my fork where I have one check working in LinkML as a proof of concept (from Workflow RO-Crate, checking that the root dataset has |
Expanding on a discussion with @simleo in the WRROC meeting.
At Manchester we've just started trying to use LinkML to write schemas for RO-Crate validation. LinkML schemas are YAML-based and therefore a lot easier for inexperienced users to comprehend and add to - and crucially, they can also be converted to SHACL. We think that it's important for RO-Crate profile developers to be able to write a validation schema for their profile themselves, and LinkML is a more approachable framework than SHACL to achieve this (as profile developers may not be linked data/RDF experts).
There has been interest and discussion around this previously: see ResearchObject/ro-crate#264 and linkml/linkml#1462
Thinking about how future profiles could be developed using LinkML in a way that's compatible with this validator package, there are a few possible approaches:
Please let me know your thoughts about what the best direction would be.