TY - JOUR
T1 - CFA
T2 - An explainable deep learning model for annotating the transcriptional roles of cis-regulatory modules based on epigenetic codes
AU - Yang, Tzu Hsien
AU - Yu, Yu Huai
AU - Wu, Sheng Hang
AU - Zhang, Fang Yuan
N1 - Funding Information:
This work was supported by National Cheng Kung University, Taiwan, the National University of Kaohsiung, Taiwan, and the National Science and Technology Council of Taiwan (MOST 107-2218-E-390-009-MY3, MOST 110-2222-E-006-017, and MOST 111-2221-E-006-231).
Publisher Copyright:
© 2022 Elsevier Ltd
PY - 2023/1
Y1 - 2023/1
N2 - Metazoa gene expression is controlled by modular DNA segments called cis-regulatory modules (CRMs). CRMs can convey promoter/enhancer/insulator roles, generating additional regulation layers in transcription. Experiments for understanding CRM roles are low-throughput and costly. Large-scale CRM function investigation still depends on computational methods. However, existing in silico tools only recognize enhancers or promoters exclusively, thus accumulating errors when considering CRM promoter/enhancer/insulator roles altogether. Currently, no algorithm can concurrently consider these CRM roles. In this research, we developed the CRM Function Annotator (CFA) model. CFA provides complete CRM transcriptional role labeling based on epigenetic profiling interpretation. We demonstrated that CFA achieves high performance (test macro auROC/auPRC = 94.1%/90.3%) and outperforms existing tools in promoter/enhancer/insulator identification. CFA is also inspected to recognize explainable epigenetic codes consistent with previous findings when labeling CRM roles. By considering the higher-order combinations of the epigenetic codes, CFA significantly reduces false-positive rates in CRM transcriptional role annotation. CFA is available at https://github.com/cobisLab/CFA/.
AB - Metazoa gene expression is controlled by modular DNA segments called cis-regulatory modules (CRMs). CRMs can convey promoter/enhancer/insulator roles, generating additional regulation layers in transcription. Experiments for understanding CRM roles are low-throughput and costly. Large-scale CRM function investigation still depends on computational methods. However, existing in silico tools only recognize enhancers or promoters exclusively, thus accumulating errors when considering CRM promoter/enhancer/insulator roles altogether. Currently, no algorithm can concurrently consider these CRM roles. In this research, we developed the CRM Function Annotator (CFA) model. CFA provides complete CRM transcriptional role labeling based on epigenetic profiling interpretation. We demonstrated that CFA achieves high performance (test macro auROC/auPRC = 94.1%/90.3%) and outperforms existing tools in promoter/enhancer/insulator identification. CFA is also inspected to recognize explainable epigenetic codes consistent with previous findings when labeling CRM roles. By considering the higher-order combinations of the epigenetic codes, CFA significantly reduces false-positive rates in CRM transcriptional role annotation. CFA is available at https://github.com/cobisLab/CFA/.
UR - http://www.scopus.com/inward/record.url?scp=85145491831&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85145491831&partnerID=8YFLogxK
U2 - 10.1016/j.compbiomed.2022.106375
DO - 10.1016/j.compbiomed.2022.106375
M3 - Article
C2 - 36502693
AN - SCOPUS:85145491831
SN - 0010-4825
VL - 152
JO - Computers in Biology and Medicine
JF - Computers in Biology and Medicine
M1 - 106375
ER -