feat: add tissue_type and new validation rules for tissue_ontology_term_id and cell_type_ontology_term_id #623
Codecov Report
@@ Coverage Diff @@
## main #623 +/- ##
==========================================
- Coverage 83.08% 83.01% -0.08%
==========================================
Files 19 19
Lines 1709 1684 -25
==========================================
- Hits 1420 1398 -22
+ Misses 289 286 -3
@@ -281,16 +281,31 @@ def test_assay_ontology_term_id(self):

def test_cell_type_ontology_term_id(self):
    """
    cell_type_ontology_term_id categorical with str categories. This MUST be a CL term.
    cell_type_ontology_term_id categorical with str categories. This MUST be a CL term, and must NOT match forbidden
Do we have a local glossary somewhere for these acronyms (CL)?
I don't think it's an acronym; it's a prefix used for Cell Ontology terms. There's a glossary in the user-facing schema documentation.
)
with self.subTest(forbidden_term="EFO:0000001"):
    self.validator.adata.obs.loc[
        self.validator.adata.obs.index[0], "cell_type_ontology_term_id"
What is the benefit of using the NumPy style of indexing, foo[x, y], as opposed to Python's way, foo[x][y]?
This is the syntax for the loc accessor of pandas DataFrames.
I believe this is the standard way to reassign a pandas DataFrame entry, but perhaps both ways work.
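To make the loc question above concrete, here is a minimal sketch of the single-step assignment, assuming a toy obs DataFrame (the row labels and CL values are made up for illustration and are not taken from the test fixtures). Passing the row label and column label together lets pandas write into the original frame in one operation. The chained form, obs["col"][row] = value, goes through an intermediate object first; pandas documents that it can trigger a chained-assignment warning and may not update the original frame, so it is left out of the sketch.

```python
import pandas as pd

# Toy stand-in for validator.adata.obs; labels and values are illustrative only.
obs = pd.DataFrame(
    {"cell_type_ontology_term_id": ["CL:0000066", "CL:0000192"]},
    index=["cell_a", "cell_b"],
)

# .loc receives (row label, column label) in a single indexing call,
# so the assignment is applied directly to obs.
obs.loc[obs.index[0], "cell_type_ontology_term_id"] = "EFO:0000001"

updated = obs["cell_type_ontology_term_id"].tolist()
```

Note that the real column is categorical, so assigning a value outside the existing categories may need extra handling that this sketch does not cover.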
#514
#517
Changes:
- do NOT accept suffixes in term_ids and do not append them to the resulting label
- enforce cell_type_ontology_term_id rules on tissue_ontology_term_id if tissue_type is 'cell culture'
- enforce that tissue_ontology_term_id is a child term of UBERON:0001062 if tissue_type is 'organoid' or 'tissue'
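
The rules listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the validator code from this PR: the function name validate_tissue_term, the ancestors mapping, and the forbidden_cl_terms argument are all hypothetical, the ontology lookup is replaced by a plain dict, and the rejection of unknown tissue_type values is an assumption.

```python
import re

ANATOMICAL_ENTITY = "UBERON:0001062"
# Bare "PREFIX:digits" shape; any trailing suffix makes the match fail.
TERM_ID_PATTERN = re.compile(r"[A-Za-z]+:\d+")


def validate_tissue_term(term_id, tissue_type, ancestors, forbidden_cl_terms):
    """Return a list of error strings (empty when the term passes).

    ancestors: hypothetical dict mapping a term id to the set of its ancestor ids.
    forbidden_cl_terms: hypothetical set of CL ids the caller wants rejected.
    """
    if not TERM_ID_PATTERN.fullmatch(term_id):
        return [f"'{term_id}' is not a bare term id; suffixes are not accepted."]

    errors = []
    if tissue_type == "cell culture":
        # Same rules as cell_type_ontology_term_id: a CL term that is not forbidden.
        if not term_id.startswith("CL:"):
            errors.append(f"'{term_id}' must be a CL term for 'cell culture'.")
        elif term_id in forbidden_cl_terms:
            errors.append(f"'{term_id}' is a forbidden CL term.")
    elif tissue_type in ("organoid", "tissue"):
        # Must descend from the UBERON root named in the PR description.
        if ANATOMICAL_ENTITY not in ancestors.get(term_id, set()):
            errors.append(f"'{term_id}' must be a child term of {ANATOMICAL_ENTITY}.")
    else:
        # Assumption: any other tissue_type value is rejected.
        errors.append(f"Unknown tissue_type '{tissue_type}'.")
    return errors


# Toy ancestor data for illustration only.
toy_ancestors = {"UBERON:0002048": {"UBERON:0001062"}}
tissue_errors = validate_tissue_term("UBERON:0002048", "tissue", toy_ancestors, set())
suffix_errors = validate_tissue_term("UBERON:0002048 (organoid)", "organoid", toy_ancestors, set())
```

Returning a list of messages rather than raising keeps the sketch close to a validator that collects every problem before reporting.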