TY - JOUR
T1 - N-Ace
T2 - Using solvent accessibility and physicochemical properties to identify protein N-acetylation sites
AU - Lee, Tzong Yi
AU - Hsu, Justin Bo Kai
AU - Lin, Feng Mao
AU - Chang, Wen Chi
AU - Hsu, Po Chiang
AU - Huang, Hsien Da
PY - 2010/11/30
Y1 - 2010/11/30
N2 - Protein acetylation, which is catalyzed by acetyltransferases, is a type of post-translational modification that is crucial to numerous essential biological processes, including transcriptional regulation, apoptosis, and cytokine signaling. As the experimental identification of protein acetylation sites is time-consuming and laboratory-intensive, several computational approaches have been developed for identifying candidates for experimental validation. In this work, solvent accessibility and the physicochemical properties of proteins are utilized to identify acetylated alanine, glycine, lysine, methionine, serine, and threonine. A two-stage support vector machine was applied to learn the computational models from combinations of amino acid sequences, accessible surface area, and the physicochemical properties of proteins. The predictive accuracy thus achieved is 5% to 14% higher than that of models trained using only amino acid sequences. Additionally, the substrate specificity of the acetylated site was investigated in detail with reference to the subcellular colocalization of acetyltransferases and acetylated proteins. The proposed method, N-Ace, was evaluated using independent test sets for various acetylated residues, and predictive accuracies of 90% were achieved, indicating that the performance of N-Ace is comparable with that of other acetylation prediction methods. N-Ace not only provides a user-friendly input/output interface but is also a creative method for predicting protein acetylation sites. This novel analytical resource is now freely available at.
AB - Protein acetylation, which is catalyzed by acetyltransferases, is a type of post-translational modification that is crucial to numerous essential biological processes, including transcriptional regulation, apoptosis, and cytokine signaling. As the experimental identification of protein acetylation sites is time-consuming and laboratory-intensive, several computational approaches have been developed for identifying candidates for experimental validation. In this work, solvent accessibility and the physicochemical properties of proteins are utilized to identify acetylated alanine, glycine, lysine, methionine, serine, and threonine. A two-stage support vector machine was applied to learn the computational models from combinations of amino acid sequences, accessible surface area, and the physicochemical properties of proteins. The predictive accuracy thus achieved is 5% to 14% higher than that of models trained using only amino acid sequences. Additionally, the substrate specificity of the acetylated site was investigated in detail with reference to the subcellular colocalization of acetyltransferases and acetylated proteins. The proposed method, N-Ace, was evaluated using independent test sets for various acetylated residues, and predictive accuracies of 90% were achieved, indicating that the performance of N-Ace is comparable with that of other acetylation prediction methods. N-Ace not only provides a user-friendly input/output interface but is also a creative method for predicting protein acetylation sites. This novel analytical resource is now freely available at.
UR - http://www.scopus.com/inward/record.url?scp=78149461454&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=78149461454&partnerID=8YFLogxK
U2 - 10.1002/jcc.21569
DO - 10.1002/jcc.21569
M3 - Article
C2 - 20839302
AN - SCOPUS:78149461454
SN - 0192-8651
VL - 31
SP - 2759
EP - 2771
JO - Journal of Computational Chemistry
JF - Journal of Computational Chemistry
IS - 15
ER -