Knowledge Bases are not parsed correctly when doing Entity Linking #5070

Open
thetaaaaa opened this issue Sep 25, 2024 · 17 comments
Assignees
Labels
Support request User has a problem and needs help

Comments

@thetaaaaa

Describe the bug
I imported a self-built ontology with instances as the knowledge base for Entity Linking. However, I can only see concepts that have more than one level; the others are missing.

To Reproduce
Steps to reproduce the behavior:

  1. Create a new project based on the built-in layer "Entity Linking" and import a custom ontology with all import options left at their defaults. In my case, the ontology is Ancient Oratory Ontology.
  2. Check "Knowledge Base": only concepts with more than one level are visible; those with just one level are missing.
    The pictures below show the difference between how the ontology is displayed in INCEpTION and Protégé.
    image
    image

Expected behavior
All concepts should be visible in INCEpTION, with the same hierarchy as in Protégé.


Please complete the following information:

  • Version and build ID: 33.7 (2024-09-18 21:04:21, build c156ca2)
  • OS: Windows
  • Browser: Edge


@reckart
Member

reckart commented Sep 26, 2024

INCEpTION requires that concepts are explicitly defined to appear visible. However, ontolex#Form, skos:Concept, and skos:ConceptScheme are not defined in the ontology you link to - they are only used. It looks like Protégé then infers their existence and auto-generates an incomplete definition for them. Protégé has quite a few more capabilities than INCEpTION, including the ability to do inference, which INCEpTION does not offer.
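
To make "used but not defined" concrete, here is a minimal Python sketch. It assumes that "explicitly defined" means the IRI is typed as owl:Class or rdfs:Class, which may differ from what a given knowledge base's IRI mapping expects. The triples are a hypothetical miniature of the situation (the `http://example.org/oratory#` names are invented), and loading a real RDF file would additionally need an RDF parser, which is left out here.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OWL_CLASS = "http://www.w3.org/2002/07/owl#Class"
RDFS_CLASS = "http://www.w3.org/2000/01/rdf-schema#Class"
ONTOLEX_FORM = "http://www.w3.org/ns/lemon/ontolex#Form"

# Assumption: a concept counts as "defined" if it is typed as one of these.
CLASS_TYPES = {OWL_CLASS, RDFS_CLASS}


def undeclared_classes(triples):
    """Return IRIs that are used as a class but never declared as one."""
    declared = {s for s, p, o in triples if p == RDF_TYPE and o in CLASS_TYPES}
    used = {o for s, p, o in triples if p == RDF_TYPE and o not in CLASS_TYPES}
    return used - declared


# Hypothetical example data, not taken from the actual ontology file.
EX = "http://example.org/oratory#"
triples = {
    (EX + "Speech", RDF_TYPE, OWL_CLASS),      # declared class
    (EX + "speech1", RDF_TYPE, EX + "Speech"),  # instance of a declared class
    (EX + "form1", RDF_TYPE, ONTOLEX_FORM),     # instance of an undeclared class
}

missing = undeclared_classes(triples)
print(sorted(missing))  # ['http://www.w3.org/ns/lemon/ontolex#Form']
```

In this toy data, `ontolex#Form` has an instance but no declaration of its own, which matches the situation described above.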

Which entries from your ontology do you want to link against?

@thetaaaaa
Author

thetaaaaa commented Sep 26, 2024

I want to link to the instances under ontolex#Form and skos:Concept, but they are invisible, so I cannot choose them. What elements are required to make a concept visible in INCEpTION? In other words, what do you mean by "INCEpTION requires that concepts are explicitly defined to appear visible"?

Here is a test on my ontology: I added a property "rdfs:isDefinedBy:xxx" to ontolex#Form, but "ontolex#Form" is still invisible in INCEpTION. Below are pictures of the same modified ontology in Protégé and INCEpTION:
image
image

@thetaaaaa thetaaaaa changed the title Knowledge Base are not parsed correctly when doing Entity Linking Knowledge Bases are not parsed correctly when doing Entity Linking Sep 26, 2024
@reckart
Member

reckart commented Sep 26, 2024

I have imported the oratory_v.1.0.rdf into Protégé and then saved it again as RDF/XML so it includes the concepts that are auto-generated by Protégé.

I set up two knowledge bases.

The first I call Oratory Lemon, import the augmented ontology and I configure it like this:

Screenshot 2024-09-26 at 19 17 43 Screenshot 2024-09-26 at 19 18 09 Screenshot 2024-09-26 at 19 18 18

This gives me this content:

Screenshot 2024-09-26 at 19 18 44

The second I call Oratory SKOS, import the augmented ontology and I configure it like this:
Screenshot 2024-09-26 at 19 19 29

Screenshot 2024-09-26 at 19 19 39

Which then gives me this content:

Screenshot 2024-09-26 at 19 20 21

Note that in the current v33 version, the "additional language" cannot be configured in the UI. If you need it, you can configure it by adding the following line to your settings.properties file:

knowledge-base.default-fallback-languages[0]=grc

If you want to use two different annotation features to each link to only one of the "views" of the data, you may configure two concept features and select for each one of the knowledge bases.

Does that work for you?

@reckart reckart added the Support request User has a problem and needs help label Sep 26, 2024
@reckart reckart added this to Support Sep 26, 2024
@reckart reckart self-assigned this Sep 26, 2024
@github-project-automation github-project-automation bot moved this to 🤷 To do in Support Sep 26, 2024
@reckart reckart moved this from 🤷 To do to 🤔 In progress in Support Sep 26, 2024
@thetaaaaa
Author

Thank you for your reply and effort. I followed your steps and got these results:
(1) Your second setup works the same on my device.
(2) Your first setup doesn't work on my device. I re-saved "oratory_v.1.0.rdf" through Protégé and renamed it "lemon". Then I imported "lemon" into INCEpTION and tried to set the Root Concepts to the URI of the concept "ontolex#Form". INCEpTION showed the error below, so it seems that INCEpTION still did not find the concept "ontolex#Form" on my device.
Error shown by INCEpTION:
image

I checked the re-saved ontology "lemon.rdf", and it does contain the concept "#Form". Below is a screenshot of the re-saved ontology:
image

My Protégé version is 5.6.4.

@reckart
Member

reckart commented Sep 26, 2024

For me, that worked. I have attached the RDF file I exported from Protégé here.

oratory.rdf.zip

@reckart
Member

reckart commented Sep 26, 2024

If you scroll up, what size information do you see? It should be like this:

Screenshot 2024-09-26 at 22 08 54

@reckart
Member

reckart commented Sep 26, 2024

Note, to import a file after creating a KB, you need to select the file in the import contents field and then press "Save" on the bottom. Only after that, setting the root will work.

@thetaaaaa
Author

> If you scroll up, what size information do you see. It should be like this:
>
> Screenshot 2024-09-26 at 22 08 54

I imported the ontology you attached during the knowledge base creation step and pressed "Save". The result is shown below; it differs from yours:
image

@thetaaaaa
Author

thetaaaaa commented Sep 26, 2024

I tried creating the knowledge base many times, but it still does not work as expected. I recorded a video of my steps. I hope you can check it to see whether I am doing something wrong.
【Creating KnowledgeBase in INCEPTION】 https://www.bilibili.com/video/BV1drxHeyEFF/?share_source=copy_web&vd_source=c269ff663afb842acff6599950467062

@reckart
Member

reckart commented Sep 27, 2024

Thanks for the video. In fact, in my instructions I forgot an essential step... I am very sorry for that.

I thought that passing the ontology once through Protégé was all that I had done... however, I did one additional step.

I downloaded the lemon ontology from here http://www.w3.org/ns/lemon/ontolex and saved it as ontolex.rdf.

I then imported that file into the knowledge base as well using the Import contents field in the KB settings and saving.

That brings the number of statements in the KB up from 4764 to 5348 and also finally provides the missing http://www.w3.org/ns/lemon/ontolex#Form concept.

Actually, you don't even need to pass the oratory file through Protégé in this case. You could also simply select both the files oratory_v.1.0.rdf and ontolex.rdf (using the CTRL or SHIFT button while clicking in the file selection dialog) during the creation of the knowledge base and then just set the root concept to http://www.w3.org/ns/lemon/ontolex#Form and the label concept to http://www.w3.org/ns/lemon/ontolex#writtenRep - that should already be sufficient.

For some reason, if I select both files during KB creation, I seem to end up with only some ~150 fewer statements in the KB... no idea why. However, it still seems to work.
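
Why the second file helps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: both files are represented as hypothetical one-triple stand-ins rather than parsed from disk, and "defined" is taken to mean "typed as owl:Class", which is an assumption about what the knowledge base looks for.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OWL_CLASS = "http://www.w3.org/2002/07/owl#Class"
ONTOLEX_FORM = "http://www.w3.org/ns/lemon/ontolex#Form"


def is_declared(triples, iri, class_types=(OWL_CLASS,)):
    """True if `iri` itself is typed as one of `class_types`."""
    return any(
        s == iri and p == RDF_TYPE and o in class_types for s, p, o in triples
    )


# Hypothetical stand-ins for oratory_v.1.0.rdf and ontolex.rdf.
oratory = {("http://example.org/oratory#form1", RDF_TYPE, ONTOLEX_FORM)}
ontolex = {(ONTOLEX_FORM, RDF_TYPE, OWL_CLASS)}

# Importing both files into one KB behaves like a union of their statements.
merged = oratory | ontolex

print(is_declared(oratory, ONTOLEX_FORM))  # False
print(is_declared(merged, ONTOLEX_FORM))   # True
```

The oratory data alone only uses the class; the declaration arrives with the second file, which is why the root concept can be set after both are imported.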

@thetaaaaa
Author

thetaaaaa commented Sep 27, 2024

It works now, thank you. But the concept "#Form" and the others cannot be shown automatically in INCEpTION at the same time. Do you mean I need to create two knowledge bases (one for "#Form", one for the others)?

I figured out another way to make all the concepts in one ontology visible at the same time: manually adding all the first-level concepts to "Root Concepts", using settings like those in the picture below. Is this also a correct approach?
image
After that, I will get this result:
image

@reckart
Member

reckart commented Sep 27, 2024

If you can configure a single KB that gives you access to all of the concepts you are interested in, that is fine.

I suggested setting up two KBs because the SKOS hierarchy typically uses different properties to build the tree (e.g. skos#broader) than e.g. an OWL/RDF class hierarchy.
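
The effect of the hierarchy property can be illustrated with a small Python sketch. It is not INCEpTION's actual algorithm, just a toy model: the tree is built from whichever "parent" property is configured, so the same data yields different roots for rdfs:subClassOf and skos:broader. All `http://example.org/` names are hypothetical.

```python
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
SKOS_BROADER = "http://www.w3.org/2004/02/skos/core#broader"


def root_concepts(triples, parent_property):
    """Nodes that act as a parent via `parent_property` but have no parent themselves."""
    children = {s for s, p, o in triples if p == parent_property}
    parents = {o for s, p, o in triples if p == parent_property}
    return parents - children


# Hypothetical data mixing an OWL/RDFS class tree with a SKOS concept tree.
EX = "http://example.org/"
triples = {
    (EX + "Oration", RDFS_SUBCLASS_OF, EX + "Speech"),
    (EX + "rhetoric", SKOS_BROADER, EX + "topic"),
}

print(root_concepts(triples, RDFS_SUBCLASS_OF))  # {'http://example.org/Speech'}
print(root_concepts(triples, SKOS_BROADER))      # {'http://example.org/topic'}
```

With a single hierarchy property per knowledge base, one of the two trees stays out of view, which is the motivation for one KB per "view" of the data.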

Mind that the root concepts setting just gives you a nicer way of browsing the KB. It does not really affect the linking auto-complete on the annotation page. If you want to limit that, you can set a scope on the concept feature.

Glad it works and I guess now you have a bunch of options to try out to see what fits your personal preference most.

@thetaaaaa
Author

thetaaaaa commented Sep 27, 2024

Thank you. I have run into new problems...
I created a new project with the Basic Annotation template and uploaded my CoNLL2002 file as the pre-labeled corpus. I want to annotate both the named entities and their relationships with concepts and properties from my ontology.

(1) It seems the built-in "Named Entity" layer doesn't allow users to use more than one KB to label text. As you can see in the picture below, I cannot create another feature on the built-in "Named Entity" layer. Is this a bug? If so, creating two KBs may not be a good option.
image

(2) I created a custom layer called "Property" in order to annotate the relationship between two built-in "Named Entity" annotations with a property from my ontology. My settings are shown below; I also imported all the related ontologies, such as ontolex.rdf:
image
But when I try to annotate the property between two named entities, the properties in my ontology are invisible. As shown in the pictures below, I cannot see any properties from my ontology:
image
image
I checked the project's Knowledge Base and found that the displayed properties are incomplete and completely different from the properties shown in Protégé (I don't even know where the displayed properties come from...):
image
So my question is: how can I annotate the relationship between two built-in "Named Entity" annotations with a property from my ontology?
Thank you for your time and instructions.

@reckart
Member

reckart commented Sep 29, 2024

You cannot add features to the built-in Named Entity layer. The features of built-in layers can currently not be changed. This is not a bug.

However, if you leave the KB selection on the feature on All knowledge bases, then you can select concepts from both of your configured KBs.

In order to annotate relations between entities with concepts from your knowledge base, you would need to create a custom relation layer. On that custom relation layer, you can create arbitrary features. However, you cannot export that layer in CoNLL format. If you want to export, you will have to use e.g. UIMA CAS JSON.

Does that answer your question?

@thetaaaaa
Author

thetaaaaa commented Oct 1, 2024

Thank you for your instructions. But I still have some problems when annotating manually.

  1. The properties of my own ontologies are not visible and cannot be selected; only classes and instances are visible and selectable. The same thing happens with the official SKOS ontology. I created a video for your reference.
  2. There is always an inconsistency between how the ontologies (the ontologies I used are here for your reference) are displayed in Protégé and INCEpTION, and I cannot find a pattern in the differences that would let me modify my KBs. Is it possible to enhance the KB functionality in INCEpTION?
  3. When I leave the KB selection on the feature at All knowledge bases, none of them can be selected when I annotate manually; they are available for selection only when I set the KB selection to a specific ontology. You can also see this in the same video above.

All the problems above occur with both my own ontologies and the built-in default remote Wikidata ontology. The template I used in my project is the default built-in "Basic annotation (span/relation)".

@reckart
Member

reckart commented Oct 14, 2024

Thanks for the pointers to the ontologies. I don't know when I may find time to look into these though.

Thanks also for the videos. However, I find them a bit hard to follow without explanation. INCEpTION does not aim to be a copy of Protégé. INCEpTION wants to be able to use its IRI mappings to access a configurable concept and property hierarchy from a knowledge base. Depending on how the KB is built, that may or may not be possible without inference. INCEpTION does not do inference though. There are also certain kinds of KB designs that are not supported by INCEpTION, e.g. those modelling the concept hierarchy using the skos:narrower property.
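
For the skos:narrower case, one conceivable workaround is to preprocess the data so the hierarchy is also stated with skos:broader, the inverse property in SKOS. The Python sketch below illustrates that idea only; it is not an INCEpTION feature, it was not tried in this thread, and whether it is sufficient for a particular knowledge base is an open assumption. The `http://example.org/` names are hypothetical.

```python
SKOS_NARROWER = "http://www.w3.org/2004/02/skos/core#narrower"
SKOS_BROADER = "http://www.w3.org/2004/02/skos/core#broader"


def add_broader_from_narrower(triples):
    """Return the triples plus an inverse skos:broader statement for each skos:narrower."""
    inverse = {(o, SKOS_BROADER, s) for s, p, o in triples if p == SKOS_NARROWER}
    return set(triples) | inverse


# Hypothetical data: the hierarchy is only stated top-down.
EX = "http://example.org/"
triples = {(EX + "topic", SKOS_NARROWER, EX + "rhetoric")}

materialized = add_broader_from_narrower(triples)
for triple in sorted(materialized):
    print(triple)
```

This writes out by hand a statement that a reasoner would otherwise infer, so a tool without inference can still find the bottom-up links.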

The KB support in INCEpTION is improved over time step by step. At the moment, I do not know what exactly would need to be done for the ontologies you linked. It might be it is just a matter of properly configuring the IRI mapping. It might also be that they use a structure that is currently not supported and that code changes are necessary.

It is easier for me to process focussed issue reports or questions related to a specific knowledge base. The more you package into an issue, the more time it takes me to look into it, the less likely I am to be able to do anything about it in the near future.

@thetaaaaa
Author

Thank you for your reply, your time, and the explanation! My objective is to use INCEpTION to create instances of a given ontology by annotating raw text, which is why I need all the properties and concepts in an ontology to be parsed and displayed correctly. I will keep exploring INCEpTION to see if I can solve the problem.

Projects
Status: 🤔 In progress

2 participants