Motivation: Discriminating outer membrane proteins (OMPs) from globular proteins and from membrane proteins of other folding types is an important task, both for identifying OMPs in genomic sequences and for the successful prediction of their secondary and tertiary structures.

Results: We have developed a method based on support vector machines that uses amino acid composition and residue pair information. With amino acid composition, our approach correctly predicted OMPs with a cross-validated accuracy of 94% in a set of 208 proteins. Further, the method successfully excluded 633 of 673 globular proteins and 191 of 206 α-helical membrane proteins. We obtained an overall accuracy of 92% for correctly identifying OMPs in a dataset of 1087 proteins covering all types of globular and membrane proteins. Furthermore, residue pair information improved the accuracy from 92% to 94%. This accuracy of discriminating OMPs is higher than that of other methods in the literature, and the method could be used for identifying OMPs in genomic sequences.
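The two sequence-derived inputs named above can be illustrated with a minimal sketch. It assumes that "amino acid composition" means the fraction of each of the 20 standard residues in a sequence, and that "residue pair information" means the frequencies of adjacent residue pairs (dipeptides); the abstract does not spell out either definition, and the function names below are hypothetical. The resulting vectors are the kind of fixed-length input a support vector machine could be trained on; the classifier itself is not shown.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, e.g. "AA", "AC", ..., "YY".
RESIDUE_PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(sequence):
    """Return 20 fractions, one per standard residue (assumed definition)."""
    residues = [r for r in sequence.upper() if r in AMINO_ACIDS]
    if not residues:
        return [0.0] * len(AMINO_ACIDS)
    counts = Counter(residues)
    return [counts[aa] / len(residues) for aa in AMINO_ACIDS]


def residue_pair_composition(sequence):
    """Return 400 fractions of adjacent residue pairs (assumed definition)."""
    seq = sequence.upper()
    # Keep only pairs in which both residues are standard amino acids.
    pairs = [
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in AMINO_ACIDS and seq[i + 1] in AMINO_ACIDS
    ]
    if not pairs:
        return [0.0] * len(RESIDUE_PAIRS)
    counts = Counter(pairs)
    return [counts[pair] / len(pairs) for pair in RESIDUE_PAIRS]


def feature_vector(sequence):
    """Concatenate both descriptors into one 420-dimensional vector."""
    return amino_acid_composition(sequence) + residue_pair_composition(sequence)
```

For example, `feature_vector("AACD")` yields a list of 420 values: 20 single-residue fractions followed by 400 pair fractions. Both descriptors are normalized by sequence length, so proteins of different sizes map to vectors of the same scale.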
All Science Journal Classification (ASJC) codes
- Statistics and Probability
- Molecular Biology
- Computer Science Applications
- Computational Theory and Mathematics
- Computational Mathematics