The most popular tools for predicting pathogenicity of single amino acid variants (SAVs) were developed based on sequence-based techniques. SAVs may change protein structure and function. In the context of van der Waals force and disulfide bridge calculations, no method directly predicts the impact of mutations on the energies of the protein structure. Here, we combined machine learning methods and energy scores of protein structures calculated by Rosetta Energy Function 2015 to predict SAV pathogenicity. The accuracy level of our model (0.76) is higher than that of six prediction tools. Further analyses revealed that the differential reference energies, attractive energies, and solvation of polar atoms between wildtype and mutant side-chains played essential roles in distinguishing benign from pathogenic variants. These features indicated the physicochemical properties of amino acids, which were observed in 3D structures instead of sequences. We added 16 features to Rhapsody (the prediction tool we used for our data set) and consequently improved its performance. The results indicated that these energy scores were more appropriate and more detailed representations of the pathogenicity of SAVs.
|Number of pages
|IEEE/ACM Transactions on Computational Biology and Bioinformatics
|Published - 2023 Jan 1
All Science Journal Classification (ASJC) codes
- Applied Mathematics