DeepGLSTM: Deep Graph Convolutional Network and LSTM based approach for predicting drug-target binding affinity

Quick Links

Abstract
Model Architecture
Preparation
1. Environment Setup
2. Dataset description
Quick Start
Pretrained Models and Dataset
1. Pretrained Models download links
2. Dataset download links
Model Performance Stats
Case studies on SARS-CoV-2 viral proteins
Citation

Abstract

Development of new drugs is an expensive and time-consuming process. Due to the world-wide SARS-CoV-2 outbreak, it is essential that new drugs for SARS-CoV-2 are developed as soon as possible. Drug repurposing techniques can reduce the time span needed to develop new drugs by probing the list of existing FDA-approved drugs and their properties to reuse them for combating the new disease. We propose a novel architecture DeepGLSTM, which is a Graph Convolutional network and LSTM based method that predicts binding affinity values between the FDA-approved drugs and the viral proteins of SARS-CoV-2. Our proposed model has been trained on Davis, KIBA (Kinase Inhibitor Bioactivity), DTC (Drug Target Commons), Metz, ToxCast and STITCH datasets. We use our novel architecture to predict a Combined Score (calculated using Davis and KIBA score) of 2,304 FDA-approved drugs against 5 viral proteins. On the basis of the Combined Score, we prepare a list of the top-18 drugs with the highest binding affinity for 5 viral proteins present in SARS-CoV-2. Subsequently, this list may be used for the creation of new useful drugs. For more details, please see our paper, listed in the Citation section.

Model Architecture

Preparation

Environment Setup

The dependency packages can be installed with the following command:

pip install -r requirements.txt
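After installation, it can be useful to confirm that every listed package is actually present in the active environment. The following is a minimal sketch, not part of the repository: the helper names `requirement_names`, `missing_packages`, and `check_requirements` are hypothetical, and the parsing only handles simple `name` or `name==version` style lines, not URLs or editable installs.

```python
import re
from importlib import metadata


def requirement_names(lines):
    """Extract bare package names from requirements-style lines."""
    names = []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blank lines and pip options
            continue
        # Cut at the first version specifier, extras bracket, or marker.
        names.append(re.split(r"[<>=!~;\[\s]", line, maxsplit=1)[0])
    return names


def missing_packages(names):
    """Return the names that are not installed in the current environment."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


def check_requirements(path="requirements.txt"):
    """Read a requirements file and list the packages that are not installed."""
    with open(path) as handle:
        return missing_packages(requirement_names(handle))
```

Calling `check_requirements()` from the repository root returns an empty list when everything in requirements.txt is installed.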

Dataset description

Our experiments use the Davis, KIBA, DTC, Metz, ToxCast, and STITCH datasets.

Dataset Statistics:

Quick Start

Create Dataset

First, run the script below to create the PyTorch Geometric dataset file. The file is written to the processed folder inside the data folder.

python3 data_creation.py

The argument parser defaults are set for the Davis dataset.
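For readers unfamiliar with the PyTorch Geometric format: it stores graph connectivity as an `edge_index` array of shape 2 x (number of edges), with an undirected edge listed once in each direction. The sketch below illustrates that layout in plain Python for a molecule's bond list. It is not taken from data_creation.py, the function name `bonds_to_edge_index` is hypothetical, and the three-atom example is made up for illustration.

```python
def bonds_to_edge_index(bonds):
    """Turn undirected bonds [(i, j), ...] into a 2 x 2E COO edge index."""
    sources, targets = [], []
    for i, j in bonds:
        # Store each bond in both directions so messages can flow both ways.
        sources += [i, j]
        targets += [j, i]
    return [sources, targets]


# Illustrative three-atom chain: atom 0 - atom 1 - atom 2
edge_index = bonds_to_edge_index([(0, 1), (1, 2)])
```

In a real pipeline the two lists would be converted to a tensor and stored alongside per-atom features; how the repository featurizes atoms and proteins is not shown here.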

Model Training

Run the following script to train the model.

python3 training.py

The argument parser defaults are set for the Davis dataset.

Inference on Pretrained Model

Run the following script to test the model.

python3 inference.py

The argument parser defaults are set for the Davis dataset.
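The three steps above can also be driven from a single Python entry point. This is a minimal sketch that only uses the script names given in this section and runs each with its default arguments. The helper names `pipeline_commands` and `run_pipeline` are hypothetical, and the sketch assumes it is run from the repository root.

```python
import subprocess
import sys

# Script names come from the Quick Start steps; each runs with its defaults.
PIPELINE = ["data_creation.py", "training.py", "inference.py"]


def pipeline_commands(python=sys.executable):
    """Build one command per step, in the order the steps are listed."""
    return [[python, script] for script in PIPELINE]


def run_pipeline():
    """Run every step in order, stopping at the first one that fails."""
    for command in pipeline_commands():
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(command, check=True)
```

Using `sys.executable` keeps every step in the same interpreter and environment in which the dependencies were installed. If only inference is needed, the training step can be dropped from `PIPELINE`, provided a model is available in the pretrained_model folder.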

Pretrained Models and Dataset

Pretrained Models download links

Dataset	Model download link
Davis	Link
Kiba	Link
DTC	Link
Metz	Link
ToxCast	Link
Stitch	Link

Download the model for the desired dataset from the table above and store it in the pretrained_model folder.

Dataset download links

Dataset	Dataset download links
Davis	Link
Kiba	Link
DTC	Link
Metz	Link
ToxCast	Link
Stitch	Link

Download the desired dataset from the table above and store it in the data folder. Each dataset folder behind the links contains two CSV files: a train file and a test file.
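Before running the dataset creation script, a quick check that both split files are in place can save a failed run. The sketch below is an illustration only: the helper name `missing_splits` is hypothetical, and the default file names `train.csv` and `test.csv` are assumptions, so pass the actual file names from the download if they differ.

```python
from pathlib import Path


def missing_splits(dataset_dir, names=("train.csv", "test.csv")):
    """Return the expected split files that are absent from dataset_dir."""
    folder = Path(dataset_dir)
    # Keep only the names for which no regular file exists in the folder.
    return [name for name in names if not (folder / name).is_file()]
```

An empty list means both files were found, and a non-empty list names the files that still need to be downloaded.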

Model Performance Stats

Plots of DeepGLSTM-predicted versus measured binding affinity values for the (a) Davis dataset (b) KIBA dataset (c) DTC dataset (d) Metz dataset (e) ToxCast dataset (f) STITCH dataset. In the figure, Coef_V is the Pearson correlation coefficient.
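For reference, the Pearson correlation coefficient between predicted and measured values is the covariance of the two series divided by the product of their standard deviations. The following is a minimal plain-Python sketch of that standard formula. The function name `pearson_r` is hypothetical, and this is not the evaluation code used by the repository.

```python
import math


def pearson_r(predicted, measured):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(predicted) != len(measured) or len(predicted) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    # Sum of cross-deviations and of squared deviations; the 1/n factors cancel.
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    return cov / math.sqrt(var_p * var_m)
```

A value close to 1 means predictions increase linearly with the measured affinities. The function raises `ZeroDivisionError` if either sequence is constant, since the coefficient is undefined in that case.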

Case studies on SARS-CoV-2 viral proteins

Citation

Please cite our paper if it's helpful to you in your research.

@inbook{doi:10.1137/1.9781611977172.82,
author = {Shrimon Mukherjee and Madhusudan Ghosh and Partha Basuchowdhuri},
title = {DeepGLSTM: Deep Graph Convolutional Network and LSTM based approach for predicting drug-target binding affinity},
booktitle = {Proceedings of the 2022 SIAM International Conference on Data Mining (SDM)},
chapter = {},
pages = {729-737},
doi = {10.1137/1.9781611977172.82},
URL = {https://epubs.siam.org/doi/abs/10.1137/1.9781611977172.82},
eprint = {https://epubs.siam.org/doi/pdf/10.1137/1.9781611977172.82},
    abstract = { Abstract Development of new drugs is an expensive and time-consuming process. Due to the world-wide SARS-CoV-2 outbreak, it is essential that new drugs for SARS-CoV-2 are developed as soon as possible. Drug repurposing techniques can reduce the time span needed to develop new drugs by probing the list of existing FDA-approved drugs and their properties to reuse them for combating the new disease. We propose a novel architecture DeepGLSTM, which is a Graph Convolutional network and LSTM based method that predicts binding affinity values between the FDA-approved drugs and the viral proteins of SARS-CoV-2. Our proposed model has been trained on Davis, KIBA (Kinase Inhibitor Bioactivity), DTC (Drug Target Commons), Metz, ToxCast and STITCH datasets. We use our novel architecture to predict a Combined Score (calculated using Davis and KIBA score) of 2,304 FDA-approved drugs against 5 viral proteins. On the basis of the Combined Score, we prepare a list of the top-18 drugs with the highest binding affinity for 5 viral proteins present in SARS-CoV-2. Subsequently, this list may be used for the creation of new useful drugs. }
}
