TY - JOUR
T1 - WinBinVec
T2 - Cancer-Associated Protein-Protein Interaction Extraction and Identification of 20 Various Cancer Types and Metastasis Using Different Deep Learning Models
AU - Abdollahi, Sina
AU - Lin, Peng Chan
AU - Chiang, Jung Hsien
N1 - Funding Information:
Manuscript received December 15, 2020; revised April 20, 2021; accepted June 24, 2021. Date of publication June 29, 2021; date of current version October 5, 2021. This work was supported by Grant MOST-108-2634-F-006-006. (Corresponding authors: Jung-Hsien Chiang; Peng-Chan Lin.) Sina Abdollahi is with the Department of Computer Science and Information Engineering, National Cheng-Kung University, Tainan 701, Taiwan (e-mail: sina@iir.csie.ncku.edu.tw).
Publisher Copyright:
© 2013 IEEE.
PY - 2021/10/1
Y1 - 2021/10/1
AB - Biophysical protein-protein interactions perform dominant roles in the initiation and progression of many cancer-related pathways. A protein-protein interaction might play different roles in diverse cancer types. Hence, prioritizing the PPIs in each cancer type would help detect cancer-associated pathways, find a better understanding of cancer biology, and facilitate drug discovery. Several studies to date have proposed computational methods for extracting the PPI essentiality of different cancer types based on the PPI network. The main drawback of these studies is not using a rich source such as genomics variant data. An amino acid sequence encodes useful information about protein structure and behavior. We represent each amino acid sequence based on its variants/mutations in seven different ways: binary vectors, pathogenicity scores, binding affinity changes upon mutations, gene expression-based network of the interactions, biophysicochemical properties, g-gap dipeptide, and one-hot vectors. Based on these representations, we design and consider seven different deep learning models. Then, we compare the accuracy of these models in predicting 20 different cancer types from the TCGA cohort. WinBinVec is a window-based model that outperforms the other models. Moreover, WinBinVec contains a PPI essentiality module that helps extract the essentiality probability of each PPI for every cancer type.
UR - http://www.scopus.com/inward/record.url?scp=85112193106&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85112193106&partnerID=8YFLogxK
U2 - 10.1109/JBHI.2021.3093441
DO - 10.1109/JBHI.2021.3093441
M3 - Article
C2 - 34185653
AN - SCOPUS:85112193106
SN - 2168-2194
VL - 25
SP - 4052
EP - 4063
JO - IEEE Journal of Biomedical and Health Informatics
JF - IEEE Journal of Biomedical and Health Informatics
IS - 10
ER -