
Commit

HeloiseCoen committed Jun 19, 2024
2 parents b351b49 + 5249298 commit 9e08c1b
Showing 20 changed files with 309 additions and 26 deletions.
7 changes: 7 additions & 0 deletions vault/Master-Works.Edouard-Brülhart.Collection.md
@@ -0,0 +1,7 @@
---
id: i6scy8dy7a9appqdqvxas6j
title: Collection
desc: ''
updated: 1718557537307
created: 1718557537307
---
4 changes: 2 additions & 2 deletions vault/Master-Works.Edouard-Brülhart.Database.md
@@ -6,9 +6,9 @@ updated: 1718358903442
created: 1718349400476
---

## [[Master-Works.Edouard-Brülhart.Database.Services.Directus]]
## [[Master-Works.Edouard-Brülhart.Services.Directus]]

The database we use is an integration of [[Master-Works.Edouard-Brülhart.Database.Services.PostgreSQL]], a robust relational database system, with [[Master-Works.Edouard-Brülhart.Database.Services.Directus]], an open-source data platform. The combined solution offers a powerful, flexible, and user-friendly environment for managing structured data. This setup enables efficient data storage, retrieval, and management with a modern web-based interface, suitable for diverse applications in scientific research and beyond.
The database we use is an integration of [[Master-Works.Edouard-Brülhart.Services.PostgreSQL]], a robust relational database system, with [[Master-Works.Edouard-Brülhart.Services.Directus]], an open-source data platform. The combined solution offers a powerful, flexible, and user-friendly environment for managing structured data. This setup enables efficient data storage, retrieval, and management with a modern web-based interface, suitable for diverse applications in scientific research and beyond.

This database is the core component that stores the data from beginning to end. It then permits easy and automatic retrieval of all the necessary metadata coming from the different services we use.
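
As a minimal illustration of how such metadata retrieval could look, the sketch below queries a Directus instance over its standard `/items/<collection>` REST endpoint. The base URL, access token, collection name, and field names are hypothetical placeholders, not the project's actual configuration.

```python
# Minimal sketch: retrieve sample metadata from a Directus-managed
# PostgreSQL database via the standard Directus REST API.
# The URL, token, collection, and field names below are hypothetical placeholders.
import requests

DIRECTUS_URL = "https://directus.example.org"   # hypothetical instance
ACCESS_TOKEN = "replace-with-a-real-token"      # hypothetical token
COLLECTION = "field_samples"                    # hypothetical collection

response = requests.get(
    f"{DIRECTUS_URL}/items/{COLLECTION}",
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    params={"limit": 10, "fields": "id,sample_code,collected_on"},  # hypothetical fields
    timeout=30,
)
response.raise_for_status()

# Directus wraps query results in a top-level "data" key.
for item in response.json()["data"]:
    print(item)
```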

7 changes: 7 additions & 0 deletions vault/Master-Works.Edouard-Brülhart.Laboratory.md
@@ -0,0 +1,7 @@
---
id: 4u9167irz9kzo0w3noogs2t
title: Laboratory
desc: ''
updated: 1718557636202
created: 1718557636202
---
7 changes: 7 additions & 0 deletions vault/Master-Works.Edouard-Brülhart.Mass-spectrometry.md
@@ -0,0 +1,7 @@
---
id: gt1q9bzu5eep82sxxog219k
title: Mass spectrometry
desc: ''
updated: 1718557600639
created: 1718557586154
---
10 changes: 6 additions & 4 deletions vault/Master-Works.Edouard-Brülhart.Methods.md
@@ -2,16 +2,18 @@
id: 5u2uiaco359415r2i31ymac
title: Methods
desc: ''
updated: 1718372787886
updated: 1718557636314
created: 1718349372688
---
The methods are organized in four categories, following the chronological order of the workflow.

### [[Master-Works.Edouard-Brülhart.Database]]

### Collection process
### [[Master-Works.Edouard-Brülhart.Collection]]
The collection process is carried out using QGIS and QField, together with a QFieldCloud server.
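
As a rough sketch of what consuming the collected field data could look like (not the project's actual pipeline), the snippet below reads a synchronized QField GeoPackage with GeoPandas; the file path and layer name are assumptions.

```python
# Rough sketch (not the actual DBGI pipeline): read field observations
# collected with QGIS/QField from a synchronized GeoPackage.
# The file path and layer name are hypothetical placeholders.
import geopandas as gpd

observations = gpd.read_file("qfieldcloud_export.gpkg", layer="observations")

# Basic sanity checks before pushing the records downstream.
print(observations.crs)                    # coordinate reference system of the layer
print(observations.head())                 # first few collected records
print(len(observations), "records collected")
```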

### Lab process
### Mass spectrometry process
### [[Master-Works.Edouard-Brülhart.Laboratory]]


### [[Master-Works.Edouard-Brülhart.Mass spectrometry]]

Binary file added vault/assets/images/2024-06-18-12-06-28.png
4 changes: 2 additions & 2 deletions vault/open-notebook.dbgi.dtandon.2024.04.15.md
@@ -2,7 +2,7 @@
id: ig41sdi15e6x9f1o8zaeqwu
title: '2024-04-15'
desc: ''
updated: 1713203654590
updated: 1718354530862
created: 1713166208351
traitIds:
- open-notebook-dbgi-dtandon
@@ -54,7 +54,7 @@ https://www.science-studios.ch/
4. #SciComm Build a graphical overview of the two ideas discussed with Manuela.

## Today I learned that
1. Ontop with duckdb https://www.linkedin.com/pulse/scaling-sparql-querying-billion-observations-ontop-duckdb-gschwend-myghf/
1. Ontop with duckdb https://www..com/pulse/scaling-sparql-querying-billion-observations-ontop-duckdb-gschwend-myghf/

## Paused

6 changes: 3 additions & 3 deletions vault/open-notebook.dbgi.lcappelletti.2023.07.17.md
@@ -2,7 +2,7 @@
id: twrgu2q3yot15pz1v2en7f6
title: '17'
desc: ''
updated: 1689601789408
updated: 1718354530867
created: 1689585522386
---

@@ -22,11 +22,11 @@ Today is 2023.07.17
* What you are currently reading is my first entry in Dendron DBGI.
* [BioCypher](https://arxiv.org/abs/2212.13543) and [PheKnowLator](https://arxiv.org/abs/2307.05727) are primarily pipelines in Python for combining KGs. We need an efficient and robust pipeline to **create** a KG from scratch, and, first things first, Python is likely not a suitable candidate for such objectives, although it is okay when one needs to smash things together. The objective of the Earth Metabolome KG is not to compose several existing ontologies and KGs through ETL, but to create a new KG from the metabolomic data and its metadata. While it is vital for the long-term viability of these data to be compatible with the existing ontologies and graphs, by following, for instance, the species naming conventions of the likes of [NCBITaxon](https://www.ncbi.nlm.nih.gov/taxonomy), it is not the goal of this KG to ingest them. The Earth Metabolome KG is not a KG of KGs; it is a KG of metabolomic data that may be included in several other KGs.
* [KG-Hub](https://academic.oup.com/bioinformatics/article/39/7/btad418/7211646) is similar in nature to the aforementioned libraries, but also provides [web hosting that makes it particularly easy to access versioned KGs.](https://kghub.org/) The standardized nature of the node types and edge types through the Biolink format makes it easier to combine different KGs, though in several instances the level of detail of the Biolink hierarchy effectively used is too coarse and stops at extremely high-level labels such as [biolink:NamedThing](https://biolink.github.io/biolink-model/docs/NamedThing.html). Aiming for a standardized and structured set of labels will be vital for the Earth Metabolome KG, but they will also need to be fine-grained and detailed enough to be useful for the biological samples and metabolomic data. The ease of access and clear versioning of the [KG-Hub](https://academic.oup.com/bioinformatics/article/39/7/btad418/7211646) data is surely something to be imitated. Finally, the tight integration of the [GRAPE](https://github.com/AnacletoLAB/grape) library with [KG-Hub](https://academic.oup.com/bioinformatics/article/39/7/btad418/7211646) is something to be considered for the Earth Metabolome KG.
* Asked [Deepak Unni](https://www.linkedin.com/in/deepakunni3/), who is the main author of [BioLink](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9372416/), whether it provides support for biological samples, as this is unclear from the BioLink paper and website.
* Asked [Deepak Unni](https://www..com/in/deepakunni3/), who is the main author of [BioLink](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9372416/), whether it provides support for biological samples, as this is unclear from the BioLink paper and website.


## Todo tomorrow
* Write an entry describing the benefits of having a hierarchy for the Earth Metabolome KG's node & edge labels. Use [Biolink KG](https://github.com/LucaCappelletti94/kg-biolink) as an example of an effective characterization of metadata based on the metadata topology.
* Delve into whether the data models suggested by [Chris Mungall](https://www.linkedin.com/in/chrismungall/) are a viable solution for the characterization of biological samples.
* Delve into whether the data models suggested by [Chris Mungall](https://www..com/in/chrismungall/) are a viable solution for the characterization of biological samples.
* Bother the authors of the resources that seem to be the most promising for the Earth Metabolome KG so to make sure I am not missing anything and neither reinventing the wheel.
* Start to delve into the [ENPKG Workflow](https://github.com/enpkg/enpkg_workflow) so as to plan a holistic refactoring, if needed.
6 changes: 3 additions & 3 deletions vault/open-notebook.dbgi.lcappelletti.2023.07.18.md
@@ -2,7 +2,7 @@
id: bwu3bavs2hymd3x55ikezub
title: '17'
desc: ''
updated: 1690284260848
updated: 1718354530885
created: 1689672762964
---

@@ -16,7 +16,7 @@ Today is 2023.07.18
## Done

### On Biolink
[Deepak Unni](https://www.linkedin.com/in/deepakunni3/), who is the main author of [BioLink](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9372416/), said the following regarding the support of BioLink for biological samples:
[Deepak Unni](https://www..com/in/deepakunni3/), who is the main author of [BioLink](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9372416/), said the following regarding the support of BioLink for biological samples:

> Biolink Model does have a few concepts that can be used to represent information about samples. But I don't think it is sufficiently modelled to support processing of samples as it is done for genomic sequencing or single cell sequencing. Ideally, it would be good to explore how the model can be extended to support this knowledge.
@@ -32,5 +32,5 @@ Deepak also pointed me to the following resources:
The [BioLink](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9372416/) model comes equipped with two important features that are often underappreciated: the textual description associated with the node and edge labels, and the hierarchy of the node and edge labels. We can create node type and edge type features from the textual features using procedures such as [Okapi BM25-weighted BERT](https://github.com/AnacletoLAB/grape/blob/main/tutorials/BM25_weighted_pretrained_BERT_based_textual_embedding_using_GRAPE.ipynb), and by composing the [BioLink hierarchy into a KG](https://github.com/LucaCappelletti94/kg-biolink) we can use any node embedding model to get additional topological node type and edge type features. These features can be used for many tasks, such as the classification of the nodes and edges, and the prediction of the edges. The [BioLink](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9372416/) hierarchy is not perfect, but it provides several good inspirations that should be followed for the Earth Metabolome KG.
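
As a toy illustration of the idea (a simplified stand-in for the BM25-weighted BERT procedure linked above, not a reimplementation of it), the sketch below builds a textual feature vector for each label by averaging per-token vectors weighted by BM25-style IDF scores. The label descriptions and random token vectors are placeholders for real Biolink descriptions and BERT embeddings.

```python
# Toy sketch of BM25-weighted textual features for node/edge labels.
# Real usage would take Biolink class descriptions and BERT token vectors;
# here the descriptions and vectors are placeholders.
import math
from collections import Counter

import numpy as np


def bm25_idf(corpus_tokens):
    """BM25-style IDF: log((N - df + 0.5) / (df + 0.5) + 1) per token."""
    n_docs = len(corpus_tokens)
    df = Counter(tok for doc in corpus_tokens for tok in set(doc))
    return {tok: math.log((n_docs - d + 0.5) / (d + 0.5) + 1.0) for tok, d in df.items()}


def weighted_embedding(tokens, token_vectors, idf):
    """Average the tokens' vectors, each scaled by its IDF weight."""
    weights = np.array([idf.get(t, 0.0) for t in tokens])
    vectors = np.stack([token_vectors[t] for t in tokens])
    return (weights[:, None] * vectors).sum(axis=0) / max(weights.sum(), 1e-9)


# Placeholder descriptions standing in for Biolink label documentation.
descriptions = {
    "biolink:NamedThing": "a databased entity or concept or class".split(),
    "biolink:BiologicalEntity": "a biological entity such as an organism or gene".split(),
}
idf = bm25_idf(list(descriptions.values()))

# Placeholder token vectors; in practice these would come from a BERT model.
rng = np.random.default_rng(0)
token_vectors = {t: rng.normal(size=8) for doc in descriptions.values() for t in doc}

label_features = {
    label: weighted_embedding(tokens, token_vectors, idf)
    for label, tokens in descriptions.items()
}
print({label: vec.shape for label, vec in label_features.items()})
```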

## Todo tomorrow
* Delve into whether the data models suggested by [Chris Mungall](https://www.linkedin.com/in/chrismungall/) are a viable solution for the characterization of biological samples.
* Delve into whether the data models suggested by [Chris Mungall](https://www..com/in/chrismungall/) are a viable solution for the characterization of biological samples.
* Start to delve into the [ENPKG Workflow](https://github.com/enpkg/enpkg_workflow) so as to plan a holistic refactoring, if needed.
4 changes: 2 additions & 2 deletions vault/open-notebook.dbgi.lcappelletti.2023.07.19.md
@@ -2,7 +2,7 @@
id: 4gh9ubiyqx2c29rph7lyax3
title: '17'
desc: ''
updated: 1689778750988
updated: 1718354530881
created: 1689775221469
---

@@ -31,5 +31,5 @@ git clone git@github.com:enpkg/enpkg_graph_builder.git
```

## Todo tomorrow
* Delve into whether the data models suggested by [Chris Mungall](https://www.linkedin.com/in/chrismungall/) are a viable solution for the characterization of biological samples.
* Delve into whether the data models suggested by [Chris Mungall](https://www..com/in/chrismungall/) are a viable solution for the characterization of biological samples.
* Start writing a test suite for the [data organization](https://github.com/enpkg/enpkg_data_organization/tree/refactoring) repository with the data [made available on Zenodo](https://zenodo.org/record/8152039).
4 changes: 2 additions & 2 deletions vault/open-notebook.dbgi.lcappelletti.2023.07.21.md
@@ -2,7 +2,7 @@
id: qs8zsh67h8a5kj4klmtnxbj
title: '17'
desc: ''
updated: 1689952153135
updated: 1718354530857
created: 1689950746986
---

@@ -51,5 +51,5 @@ The following list of libraries used in the code has been identified, which will


## For next week
* Delve into whether the data models suggested by [Chris Mungall](https://www.linkedin.com/in/chrismungall/) are a viable solution for the characterization of biological samples.
* Delve into whether the data models suggested by [Chris Mungall](https://www..com/in/chrismungall/) are a viable solution for the characterization of biological samples.
* Ask Pierre-Marie for help understanding how to start using the data [made available on Zenodo](https://zenodo.org/record/8152039) for testing the [data organization](https://github.com/enpkg/enpkg_data_organization/tree/refactoring) repository and, in turn, all the others, to avoid breaking the pipeline during the refactoring.
4 changes: 2 additions & 2 deletions vault/open-notebook.dbgi.mvisani.2024.02.09.md
@@ -2,7 +2,7 @@
id: f3wly7juhxjotu33qfw1wvo
title: '2024-02-09'
desc: ''
updated: 1707499094082
updated: 1718354530890
created: 1707499015014
traitIds:
- open-notebook-mvisani
@@ -25,4 +25,4 @@


## Todo on Monday 2024.02.12
- [ ] **Write LinkedIn post about Sirius binding !!!!**
- [ ] **Write post about Sirius binding !!!!**
4 changes: 2 additions & 2 deletions vault/open-notebook.dbgi.mvisani.2024.02.13.md
@@ -2,7 +2,7 @@
id: zlzmw93ezh9g89apr3l77zj
title: '2024-02-13'
desc: ''
updated: 1707834918892
updated: 1718354530876
created: 1707818340915
traitIds:
- open-notebook-mvisani
@@ -15,7 +15,7 @@ Today is 2024.02.13
## Notes
Add time taken to learn the language, tag pma, tag luca

create a linkeding organisation for the EMI
create a g organisation for the EMI
## Todo today
- [ ]

4 changes: 2 additions & 2 deletions vault/open-notebook.dbgi.mwannier.2023.05.11.md
@@ -2,7 +2,7 @@
id: pwgi25wmjyo2lw342x3npi6
title: '2023-05-11'
desc: ''
updated: 1683869987093
updated: 1718354530853
created: 1683793354813
traitIds:
- open-notebook-mwannier
@@ -24,7 +24,7 @@ Today is 2023.05.11
## NOTES

Followed a lesson on D3js:
https://www.linkedin.com/learning/d3-js-essential-training-for-data-scientists/creating-a-linear-scale?autoplay=true
https://www..com/learning/d3-js-essential-training-for-data-scientists/creating-a-linear-scale?autoplay=true

## TODO NEXT

4 changes: 2 additions & 2 deletions vault/open-notebook.dbgi.pmallard.2023.06.15.md
@@ -2,7 +2,7 @@
id: 1n9kw635ospnbwjvismwj9u
title: '2023-06-15'
desc: ''
updated: 1686847854180
updated: 1718354530872
created: 1686813851132
traitIds:
- open-notebook-dbgi-pmallard
@@ -112,7 +112,7 @@ For next meeting:

Justin Reese indicates that these meetings take place every 2 weeks

https://www.linkedin.com/feed/update/urn:li:activity:7074081027261874176/
https://www..com/feed/update/urn:li:activity:7074081027261874176/

Topic of negative sampling. A pretty complex topic.

62 changes: 62 additions & 0 deletions vault/open-notebook.dbgi.pmallard.2024.06.18.md
@@ -0,0 +1,62 @@
---
id: 7s00deq4u6037m0udwo87do
title: '2024-06-18'
desc: ''
updated: 1718701130350
created: 1718701130350
traitIds:
- open-notebook-dbgi-pmallard
---


# This is PMA's DBGI daily open-notebook.

Today is 2024.06.18

## Todo today

### Have a look at the DBGI discussion forum
- https://github.com/orgs/digital-botanical-gardens-initiative/discussions
###
###

## Doing

## Paused

## Done

## Notes

Points with Disha on Traits and Globi

Todo :

use pyotl to resolve
1. wikidata species names
2. TRYDB species names

Only then proceed to the species-name matching (see the sketch below).
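
A minimal sketch of the first resolution step, assuming one goes through the public Open Tree of Life TNRS web service that the pyotl client wraps rather than the client itself; the endpoint and payload reflect the v3 API as I understand it and should be checked against the official documentation, and the species names are arbitrary examples.

```python
# Minimal sketch: resolve species names against the Open Tree of Life TNRS
# before attempting any matching between Wikidata and TRYDB name lists.
# The example names are arbitrary; response parsing is kept defensive because
# the exact response shape should be verified against the API documentation.
import requests


def resolve_names(names):
    """Send a batch of species names to the Open Tree v3 TNRS endpoint."""
    response = requests.post(
        "https://api.opentreeoflife.org/v3/tnrs/match_names",
        json={"names": names, "do_approximate_matching": True},
        timeout=60,
    )
    response.raise_for_status()
    resolved = {}
    for result in response.json().get("results", []):
        matches = result.get("matches", [])
        if matches:
            taxon = matches[0].get("taxon", {})
            resolved[result.get("name")] = taxon.get("ott_id")
    return resolved


# Example batch mixing names as they might come from Wikidata or TRYDB.
print(resolve_names(["Bellis perennis", "Arabidopsis thaliana"]))
```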


BiotXplorer

https://sibils.text-analytics.ch/

https://sibils.text-analytics.ch/doc/api/fetch/






## Todo tomorrow, one day ... or never

###
###
###


## Today I learned that

-