Drug Repositioning via Text Augmented KG Embeddings

Code repository for the NeurIPS 2021 AI for Science Workshop

Repository Setup

Install required libraries with

pip install pykeen==1.5.0 wandb

Refer to link for more instructions on the Weights & Biases (WANDB) result tracker. With it you can monitor various metrics during training and see the evaluation results after training.

Download the following files into the corresponding directories under the repository root

Data and Trained Models

Baseline Training

Train and save a baseline embedding model, where X is one of transe, distmult, proje, rotate, simple, tucker

python baseline/save_model_X.py
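For orientation, a baseline run of this kind can be sketched with the public PyKEEN pipeline API. This is a minimal sketch and not the contents of the repository's scripts: it assumes PyKEEN's built-in Hetionet dataset is used, and the helper names, the epoch count, and the WANDB project name `kgdr-baselines` are hypothetical.

```python
from typing import Any, Dict

# Maps the lowercase names used in this README to PyKEEN model class names.
MODEL_NAMES = {
    "transe": "TransE",
    "distmult": "DistMult",
    "proje": "ProjE",
    "rotate": "RotatE",
    "simple": "SimplE",
    "tucker": "TuckER",
}


def build_pipeline_kwargs(name: str, epochs: int = 5,
                          project: str = "kgdr-baselines") -> Dict[str, Any]:
    """Assemble keyword arguments for pykeen.pipeline.pipeline (hypothetical helper)."""
    return dict(
        dataset="Hetionet",                  # assumption: PyKEEN's built-in dataset
        model=MODEL_NAMES[name.lower()],     # raises KeyError for unknown names
        training_kwargs=dict(num_epochs=epochs),
        result_tracker="wandb",              # log metrics to Weights & Biases
        result_tracker_kwargs=dict(project=project),
        random_seed=0,
    )


def train_and_save(name: str, out_dir: str):
    """Train one baseline and write the model and results to out_dir."""
    # Imported lazily so the helper above can be used without PyKEEN installed.
    from pykeen.pipeline import pipeline

    result = pipeline(**build_pipeline_kwargs(name))
    result.save_to_directory(out_dir)
    return result
```

Calling `train_and_save("rotate", "trained/rotate")` would train RotatE and store the result under the (hypothetical) `trained/rotate` directory.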

Example hyperparameter optimization

python baseline/hpo_search.py
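As a rough illustration of what a hyperparameter search looks like in PyKEEN, the sketch below uses `pykeen.hpo.hpo_pipeline`. The search range, trial count, epoch count, and helper names are assumptions chosen for the example, not the settings used in `baseline/hpo_search.py`.

```python
from typing import Any, Dict


def build_hpo_kwargs(model: str = "TransE", n_trials: int = 10) -> Dict[str, Any]:
    """Assemble keyword arguments for pykeen.hpo.hpo_pipeline (hypothetical helper)."""
    return dict(
        dataset="Hetionet",       # assumption: PyKEEN's built-in dataset
        model=model,
        n_trials=n_trials,        # number of Optuna trials to run
        training_kwargs=dict(num_epochs=5),
        # Illustrative search space: embedding dimension in steps of 64.
        model_kwargs_ranges=dict(
            embedding_dim=dict(type=int, low=64, high=256, q=64),
        ),
    )


def run_search(out_dir: str, **overrides: Any):
    """Run the search and save the study summary to out_dir."""
    # Imported lazily so the helper above can be used without PyKEEN installed.
    from pykeen.hpo import hpo_pipeline

    kwargs = build_hpo_kwargs()
    kwargs.update(overrides)
    result = hpo_pipeline(**kwargs)
    result.save_to_directory(out_dir)
    return result
```

Passing `model="DistMult"` to `run_search` would search over the same range for a different baseline.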

Text Augmented Model Training

(optional) Scrape text from sources

The required JSON files of scraped text are already in the directory. If you want to scrape the text from scratch, first download the Hetionet JSON dataset, then run

python text_scrape.py -n X

where X must be one of the available entity types ('Anatomy', 'Biological Process', 'Cellular Component', 'Compound', 'Disease', 'Gene'(local), 'Molecular Function', 'Pathway', 'Pharmacologic Class')
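The restriction on X can be expressed with `argparse` choices. The sketch below only illustrates that pattern; the destination name `node_type` and the help text are hypothetical, and the actual argument handling in `text_scrape.py` may differ.

```python
import argparse

# Entity types listed in this README.
ENTITY_TYPES = [
    "Anatomy",
    "Biological Process",
    "Cellular Component",
    "Compound",
    "Disease",
    "Gene",
    "Molecular Function",
    "Pathway",
    "Pharmacologic Class",
]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that rejects any -n value outside ENTITY_TYPES."""
    parser = argparse.ArgumentParser(description="Scrape text for one entity type")
    parser.add_argument(
        "-n",
        dest="node_type",          # hypothetical attribute name
        required=True,
        choices=ENTITY_TYPES,
        help="entity type to scrape",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print("Scraping entity type:", args.node_type)
```

Entity types that contain a space have to be quoted on the command line, for example `python text_scrape.py -n "Biological Process"`.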

Text Embedding

We provide the textual embeddings generated with BioBERT v1.1 in the aforementioned polybox folder. If you want to generate these textual embeddings yourself, refer to

text_process/get_embedding_for_hetionet_drugs.py

You will need to install the Hugging Face Transformers library.
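A minimal sketch of producing sentence-level embeddings with Transformers follows. It assumes the `dmis-lab/biobert-v1.1` checkpoint from the Hugging Face Hub and mean pooling over non-padding tokens; both are assumptions for illustration, and the repository's script may use a different checkpoint or pooling strategy. The function names are hypothetical.

```python
from typing import Iterator, List, Sequence

MODEL_NAME = "dmis-lab/biobert-v1.1"  # assumed checkpoint, not confirmed by this README


def batched(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def embed_texts(texts: Sequence[str], batch_size: int = 8):
    """Return one mean-pooled embedding per input text as a torch tensor."""
    if not texts:
        raise ValueError("texts must not be empty")
    # Imported lazily so `batched` can be used without torch or transformers.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModel.from_pretrained(MODEL_NAME)
    model.eval()

    chunks = []
    with torch.no_grad():
        for batch in batched(texts, batch_size):
            enc = tokenizer(batch, padding=True, truncation=True,
                            max_length=512, return_tensors="pt")
            hidden = model(**enc).last_hidden_state            # (batch, tokens, hidden)
            mask = enc["attention_mask"].unsqueeze(-1).to(hidden.dtype)
            # Average only over real tokens; padding positions have mask 0.
            chunks.append((hidden * mask).sum(dim=1) / mask.sum(dim=1))
    return torch.cat(chunks, dim=0)
```

Mean pooling is used here because it gives a single fixed-size vector per description regardless of length; taking the `[CLS]` token instead would be an equally plausible choice.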

Find your pykeen library installation path and replace the corresponding files with the ones in /pykeen-extension/. This step adds the textual interaction and related functionality.
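One way to locate the installation path and copy the replacement files is sketched below. It assumes that the `pykeen-extension` folder mirrors the layout of the installed `pykeen` package, which this README does not state, so check the folder before running anything like it. The helper names and the `.orig` backup suffix are hypothetical.

```python
import importlib.util
import shutil
from pathlib import Path
from typing import List, Optional


def package_dir(name: str) -> Optional[Path]:
    """Return the directory of an installed package, or None if it is not a package."""
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.submodule_search_locations:
        return None
    return Path(next(iter(spec.submodule_search_locations)))


def overlay(src: Path, dst: Path, backup_suffix: str = ".orig") -> List[Path]:
    """Copy every .py file under src onto the same relative path under dst.

    Files that already exist in dst are backed up once before being replaced.
    """
    copied = []
    for file in sorted(src.rglob("*.py")):
        target = dst / file.relative_to(src)
        target.parent.mkdir(parents=True, exist_ok=True)
        if target.exists():
            backup = target.with_name(target.name + backup_suffix)
            if not backup.exists():          # keep the first (original) backup
                shutil.copy2(target, backup)
        shutil.copy2(file, target)
        copied.append(target)
    return copied
```

With pykeen installed, `overlay(Path("pykeen-extension"), package_dir("pykeen"))` would apply the extension files, and the `.orig` backups allow the original installation to be restored.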
Train Text Augmented KG

Run the model you want to train with

python model_with_text/X.py

The file names should be self-explanatory.

To calculate % of Disease @10 and Unique Entities @1, refer to

evaluation/test_evaluate.py
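This README does not define the two metrics, so the sketch below shows only one plausible reading: "% of Disease @10" as the share of Disease entities among the top 10 predictions across all queries, and "Unique Entities @1" as the number of distinct entities ranked first. It also assumes Hetionet-style identifiers of the form `Kind::identifier`. The function names are hypothetical and the definitions in `evaluation/test_evaluate.py` may differ.

```python
from typing import List, Sequence


def entity_kind(entity_id: str) -> str:
    """Extract the kind from an identifier such as 'Disease::DOID:1234' (assumed format)."""
    return entity_id.split("::", 1)[0]


def pct_kind_at_k(rankings: Sequence[Sequence[str]], kind: str, k: int) -> float:
    """Percentage of entities of the given kind among the top-k of every ranking."""
    top = [entity for ranked in rankings for entity in ranked[:k]]
    if not top:
        return 0.0
    return 100.0 * sum(entity_kind(e) == kind for e in top) / len(top)


def unique_at_1(rankings: Sequence[Sequence[str]]) -> int:
    """Number of distinct entities that appear at rank 1 across all rankings."""
    return len({ranked[0] for ranked in rankings if ranked})


if __name__ == "__main__":
    # Toy rankings: one ordered list of predicted tail entities per query.
    demo: List[List[str]] = [
        ["Disease::A", "Gene::1", "Disease::B"],
        ["Disease::A", "Disease::C", "Gene::2"],
    ]
    print(pct_kind_at_k(demo, "Disease", k=2))
    print(unique_at_1(demo))
```

Under this reading, a low Unique Entities @1 value indicates that the model predicts the same top entity for many different queries.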

mianzg/neurips21_ai4sci_kgdr
