TY - JOUR
T1 - Precise strength prediction of endogenous promoters from Escherichia coli and J-series promoters by artificial intelligence
AU - Huang, Yu Kuan
AU - Yu, Chi Hua
AU - Ng, I. Son
N1 - Publisher Copyright: © 2023 Taiwan Institute of Chemical Engineers
PY - 2023
Y1 - 2023
N2 - Background: Promoter strength plays a critical role in modulating protein expression in genetic engineering. However, only a few studies have examined the strength of promoters drawn from a comprehensive genomic database of sigma factors. To circumvent time- and resource-intensive experiments, artificial intelligence (AI) was used to construct a complete database of proposed promoters from Escherichia coli; prediction algorithms were then applied to evaluate promoter strength, and the predictions were confirmed using the intensity of green fluorescent protein (GFP). Methods: The promoter database was constructed using partial information from EcoCyc, and the predicted strength of the promoters was calculated with the phiSITE hunter tool. The database contained 1744 promoter entries derived from E. coli MG1655, and a total of 935 sigma factor 70 (σ70) promoters were identified. The training database was then used to develop a precise tool for predicting promoter strength using machine learning and six deep learning models. The accuracy of the predictions was confirmed through wet-lab experiments on endogenous and J-series promoters. Significant findings: By employing a deep learning model, particularly the Convolutional Neural Network (CNN), the promoter prediction fitness of phiSITE, which relied on traditional alignment metrics, was approved. On the other hand, phiSITE gave satisfactory results in the fluorescence experiments using 7 endogenous promoters, achieving an R-squared (R²) of 0.93. When the same model was applied to predict the strength of J-series promoters, the best R² reached 0.99. Thus, the CNN model is an effective approach for AI-based evaluation of promoter strength.
UR - http://www.scopus.com/inward/record.url?scp=85176408954&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85176408954&partnerID=8YFLogxK
U2 - 10.1016/j.jtice.2023.105211
DO - 10.1016/j.jtice.2023.105211
M3 - Article
AN - SCOPUS:85176408954
SN - 1876-1070
VL - 160
JO - Journal of the Taiwan Institute of Chemical Engineers
JF - Journal of the Taiwan Institute of Chemical Engineers
M1 - 105211
ER -