E-Module Mapping Script #117

Open
4 of 5 tasks
alFrie opened this issue Feb 9, 2023 · 21 comments
@alFrie
Collaborator

alFrie commented Feb 9, 2023

The e-module part got separated from the rest of the CPTO. We have a working mapping script for the simple ontology and metadata to be mapped. So on this branch we will write a mapping script for the e-module. Things to do:

  • Edit the metadata extraction script: Hardcode the specimen age to 28 days and add it to the yaml-file.
  • Edit the metadata extraction script: Extract the experiment duration ("Zeit") and add it to the yaml-file.
  • Edit the metadata extraction script: Make the metadata-dictionary keys match the placeholders.
  • Map the metadata in the yaml-file to the ontology by replacing placeholders.
  • Write a test.
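
A minimal sketch of the mapping step and a matching test could look like the following. It assumes the `$$<key>_Value$$` placeholder form that appears later in this thread; the metadata keys, the `ex:` terms and the helper name `fill_placeholders` are hypothetical and only serve as illustration.

```python
def fill_placeholders(template: str, metadata: dict) -> str:
    """Replace every $$<key>_Value$$ placeholder with its metadata value."""
    for key, value in metadata.items():
        template = template.replace(f"$${key}_Value$$", str(value))
    return template


# Hypothetical metadata, as it might be read from the yaml-file.
metadata = {"SpecimenAge": 28, "ExperimentDuration": 300.5}

# Hypothetical two-line excerpt of a ttl template (ex: terms are made up).
template = (
    'ex:Specimen ex:hasAge "$$SpecimenAge_Value$$"^^xsd:integer .\n'
    'ex:Experiment ex:hasDuration "$$ExperimentDuration_Value$$"^^xsd:decimal .\n'
)

mapped = fill_placeholders(template, metadata)


def test_all_placeholders_replaced():
    # After mapping, no placeholder delimiter should be left in the output.
    assert "$$" not in mapped
```

Keeping the replacement in one small function makes it easy to test it without touching the real ontology files.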
@alFrie
Collaborator Author

alFrie commented Feb 9, 2023

@raviapatel Please add the orange box for the ID, so I can append an ID after the underscore for linking mix and emodule.

@ThiloMuth
Collaborator

Test

@ThiloMuth
Collaborator

I want to review code - please let me do it =)

@alFrie alFrie linked a pull request Feb 9, 2023 that will close this issue
@joergfunger
Member

@ThiloMuth if you accept the invitation to the repo, you can review the merge requests, e.g. this one

@alFrie
Collaborator Author

alFrie commented Feb 13, 2023

@mattheokru Is this way of spelling ("Modul") within the e-module ontology on purpose or predefined by the ontology? It looks German to me.
image

@mattheokru
Collaborator

mattheokru commented Feb 20, 2023

No, nothing is predefined by the ontology; I will change it. I updated the ontologies in the pull request "updating Ontologies".

@raviapatel
Collaborator

@mattheokru Is this way of spelling ("Modul") within the e-module ontology on purpose/ predefined by the ontology or something like that? It looks German to me.
Yes, this is just an individual, and you are right, this is the German way of spelling it. I would propose changing it to YoungsModulusTestSpecimen_ or EModulusTestSpecimen.

@alFrie
Collaborator Author

alFrie commented Feb 23, 2023

I have a question regarding the emodul_metadata_extraction.py:

It should create an entry for the "processedFile" key in the dictionary, whose value is the path to the csv file with the values extracted by emodul_generate_processed_data.py. Currently the value is only a null placeholder.
Where are these files stored? In the dodo file I find
processed_data_emodulus_directory = Path(emodul_output_directory, 'processed_data') # folder with csv data files
so should I do the same?

@joergfunger
Member

Currently, we store the files locally (at the path you mentioned), so for now I would add exactly this path. Ultimately, this would have to go to a file server/MongoDB/openBIS with a URI, but please talk to @AidaZt or @ThiloMuth about how we should then reference these files in our KG.
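
As a rough sketch of "add exactly this path", the extraction script could fill the "processedFile" key as shown below. The output root `emodul`, the helper name `add_processed_file` and the rule that the csv is named after the raw file are assumptions for illustration; only the `processed_data` folder line comes from the dodo file quoted above.

```python
from pathlib import Path

# Hypothetical output root; the dodo file defines the real one.
emodul_output_directory = Path("emodul")
# Folder with csv data files, as in the dodo file.
processed_data_emodulus_directory = Path(emodul_output_directory, "processed_data")


def add_processed_file(metadata: dict, raw_file_name: str) -> dict:
    """Store the local path of the processed csv under 'processedFile'."""
    # Assumption: the processed csv carries the same stem as the raw file.
    csv_name = Path(raw_file_name).stem + ".csv"
    csv_path = Path(processed_data_emodulus_directory, csv_name)
    metadata["processedFile"] = csv_path.as_posix()
    return metadata


metadata = add_processed_file({"processedFile": None}, "specimen_01.dat")
```

Once the files move to a server, only the value written here would need to change from a local path to a URI.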

@AidaZt
Collaborator

AidaZt commented Feb 24, 2023

Andre suggested that we can reference the link/URL to the file.
An example RDF triple would be:
<http://bam.de/material#experiment01> <http://bam.de/properties#rawdata> <http://bam.de/dataserver/rawdata.csv> .
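
To illustrate, such a triple could be produced from the file URL with a small helper like this. It is only a sketch that formats one N-Triples line by hand; the function name `file_triple` is hypothetical, and the URIs are the example ones from above, not existing resources.

```python
def file_triple(subject_uri: str, predicate_uri: str, file_uri: str) -> str:
    """Return one N-Triples line that links a subject to a file URI."""
    return f"<{subject_uri}> <{predicate_uri}> <{file_uri}> ."


triple = file_triple(
    "http://bam.de/material#experiment01",
    "http://bam.de/properties#rawdata",
    "http://bam.de/dataserver/rawdata.csv",
)
print(triple)
```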

@joergfunger
Member

But that link does not exist. How do we intend to store the data such that this link is actually a real reference?

@firmao
Collaborator

firmao commented Feb 24, 2023

But that link does not exist. How do we intend to store the data such that this link is actually a real reference?

Then we need to at least talk about the options: a file server, pointing straight to GitHub raw files, dereferenceable URIs providing RDF content, etc.

I suggest we have a short meeting to agree on the best way for us to deal with the raw files.
What about Monday after 3pm?

Best regards,
Andre Valdestilhas

@alFrie
Collaborator Author

alFrie commented Feb 27, 2023

About the Transducer Column:
According to the drawio we're expecting an integer:
"$$TransducerColumn_Value$$"^^xsd:integer
The value gets defined within the metadata extraction script of the e-module. We talked about saving a list to that key, [1,2,3], which gives this result in the mapped ontology:
con:Transducer_ a con:MeasuringGauge, owl:NamedIndividual ; ns3:hasPmdUnit ns3:Q56402798 ; mid:has_column_index "[1, 2, 3]"^^xsd:integer .
We then have a list of integers instead of a single integer. Is that still valid?

Edit: @raviapatel Is this how you imagined it to be?

@firmao
Collaborator

firmao commented Feb 27, 2023

About the Transducer Column: According to the drawio we're expecting an integer: "$$TransducerColumn_Value$$"^^xsd:integer The value gets defined within the metadata extraction script of the e-module. We talked about saving a list to that key, [1,2,3], which gives this result in the mapped ontology: con:Transducer_ a con:MeasuringGauge, owl:NamedIndividual ; ns3:hasPmdUnit ns3:Q56402798 ; mid:has_column_index "[1, 2, 3]"^^xsd:integer . We then have a list of integers instead of a single integer. Is that still valid?

The expected data type is xsd:integer, so the value is supposed to be a single integer. If you are still not sure about the data type, store it as an xsd:string.

@alFrie
Collaborator Author

alFrie commented Feb 27, 2023

The expected data type is xsd:integer, so the value is supposed to be a single integer. If you are still not sure about the data type, store it as an xsd:string.

And a list of integers doesn't fit the integer type, right?

@AidaZt
Collaborator

AidaZt commented Feb 27, 2023

I think so, because we have either string or integer as a type and can't declare it as xsd:list or something like that. I think for now we should leave it as xsd:string.

@firmao
Collaborator

firmao commented Feb 27, 2023

If you still need to store a kind of list of values in RDF, there is an example here:
https://stackoverflow.com/questions/29669555/dynamic-array-in-rdf-xml
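
The two simplest options discussed here can be sketched side by side: one xsd:integer literal per column index, or the whole list as one xsd:string literal. The helper names are hypothetical and the strings are formatted by hand rather than through an RDF library. Note that a Turtle object list yields one triple per value, so the order of the indices is not preserved in the graph.

```python
def column_index_objects(indices: list[int]) -> str:
    """One xsd:integer literal per index, joined as a Turtle object list."""
    return ", ".join(f'"{i}"^^xsd:integer' for i in indices)


def column_index_as_string(indices: list[int]) -> str:
    """The whole list serialized as a single xsd:string literal."""
    return f'"{indices}"^^xsd:string'


indices = [1, 2, 3]
per_value = f"mid:has_column_index {column_index_objects(indices)} ."
as_string = f"mid:has_column_index {column_index_as_string(indices)} ."
```

If the order of the columns matters, the string variant (or an RDF collection as in the linked example) would be the safer choice.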

@alFrie
Collaborator Author

alFrie commented Feb 28, 2023

image
So this is the current result of the mapping script. Please look at the following three issues:

  1. Only the placeholder EModule_Value doesn't get a key from the metadata.
  2. Height and Width get set to None, since the shape is cylindrical. This results in "None"^^xsd:decimal. That's problematic, since None is not of type decimal, right? The type should stay decimal though, since in the future there won't only be cylindrical specimens, if I got that right.
  3. We have more metadata values than placeholders (e.g. weight and so on have nowhere to get mapped to). This is not a problem for now, I guess. You can still look through the list of unmapped metadata and see if you'd like to create individuals for some of them within the ontology.
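
One possible way to avoid "None"^^xsd:decimal, sketched here only as an option and not as what the script currently does, is to drop the statement when the metadata value is None. The sketch assumes that every triple sits on its own line ending with " ." and that a line holds at most one placeholder; the `ex:` terms, the "Diameter" key and the function name `fill_or_drop` are hypothetical.

```python
import re

# Matches $$<key>_Value$$ and captures <key>.
PLACEHOLDER = re.compile(r"\$\$(\w+)_Value\$\$")


def fill_or_drop(template: str, metadata: dict) -> str:
    """Fill placeholders line by line; drop lines whose value is None."""
    kept = []
    for line in template.splitlines():
        match = PLACEHOLDER.search(line)
        if match is None or match.group(1) not in metadata:
            kept.append(line)  # no placeholder, or no key: leave untouched
            continue
        value = metadata[match.group(1)]
        if value is None:
            continue  # skip the triple instead of writing "None"
        kept.append(line.replace(match.group(0), str(value)))
    return "\n".join(kept)


template = (
    'ex:Specimen ex:hasDiameter "$$Diameter_Value$$"^^xsd:decimal .\n'
    'ex:Specimen ex:hasHeight "$$Height_Value$$"^^xsd:decimal .\n'
    'ex:Specimen ex:hasWidth "$$Width_Value$$"^^xsd:decimal .'
)
metadata = {"Diameter": 100.0, "Height": None, "Width": None}
result = fill_or_drop(template, metadata)
```

The datatype in the template stays xsd:decimal, so non-cylindrical specimens with real values would still be mapped as decimals.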

For your information:

  • Placeholders get generated through a function, so in case we decide on a different placeholder structure, we only need to change this small function and not the main function itself.
  • Tests are still failing because they were designed for Ilias' outdated script. @soudehMasoudian is on it (Update test_mapping_script.py #134).
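
A sketch of such a placeholder function, combined with a check for the two kinds of mismatch from the list above (placeholders without a metadata key, and metadata keys without a placeholder), might look like this. The names `placeholder` and `report_unmapped`, the example keys and the weight value are hypothetical; the check assumes placeholders are delimited by `$$` on both sides.

```python
import re


def placeholder(key: str) -> str:
    """Single place that defines what a placeholder looks like."""
    return f"$${key}_Value$$"


def report_unmapped(template: str, metadata: dict) -> tuple[set, set]:
    """Return (placeholders without a key, metadata keys without a placeholder)."""
    unused_keys = {key for key in metadata if placeholder(key) not in template}
    remaining = template
    for key in metadata:
        remaining = remaining.replace(placeholder(key), "")
    # Whatever is still wrapped in $$...$$ has no matching metadata key.
    leftover = set(re.findall(r"\$\$.+?\$\$", remaining))
    return leftover, unused_keys


template = '"$$EModule_Value$$"^^xsd:decimal ; "$$SpecimenAge_Value$$"^^xsd:integer'
metadata = {"SpecimenAge": 28, "Weight": 5.2}
leftover, unused_keys = report_unmapped(template, metadata)
```

Because the check goes through `placeholder()`, changing the placeholder structure only requires changing that one function.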

@joergfunger @raviapatel, maybe @ThiloMuth wants to have a look at it, too.

@raviapatel
Collaborator

I think so, because we have either string or integer as a type and can't declare it as xsd:list or something like that. I think for now we should leave it as xsd:string.

OK, this is also fine with me.

@alFrie
Collaborator Author

alFrie commented Mar 9, 2023

How will the info about the openBIS raw data location get mapped? Will this be defined in the mapping script, or created during the metadata extraction so that the mapping script maps it automatically? @joergfunger

@joergfunger
Member

That should be done during the extraction of the metadata. In the final setup, all the data (metadata) will be stored in the openBIS system, and this information should then be extracted together with the link to the raw data file. Afterwards, the mapping script will just take that information from the metadata.json and replace the corresponding value in the ttl file obtained from the diagrams.net ttl template.
