TY - JOUR
T1 - Wavelet-based procedures for proteomic mass spectrometry data processing
AU - Chen, Shuo
AU - Hong, Don
AU - Shyr, Yu
N1 - Funding Information:
The authors are grateful to Professor Christophe Croux and the anonymous referees for their valuable comments and suggestions which helped to improve this work. Also, the authors would like to thank Jonathan Xu, Department of Cancer Biology, Vanderbilt University for providing data sets and many useful suggestions in this study. This research was supported in part by Lung Cancer SPORE (Special Program of Research Excellence) (P50 CA90949), Breast Cancer SPORE (1P50 CA98131-01), GI (5P50 CA95103-02), and Cancer Center Support Grant (CCSG) (P30 CA68485) for Y. Shyr, and by NSF IGMS (#0408086 and #0552377), NSA (H98230-05-1-0304), and MTSU REP for D. Hong.
PY - 2007/9/15
Y1 - 2007/9/15
AB - Proteomics aims at determining the structure, function and expression of proteins. High-throughput mass spectrometry (MS) is emerging as a leading technique in the proteomics revolution. Though it can be used to find disease-related protein patterns in mixtures of proteins derived from easily obtained samples, key challenges remain in the processing of proteomic MS data. Multiscale mathematical tools such as wavelets play an important role in signal processing and statistical data analysis. A wavelet-based algorithm for proteomic data processing is developed. A MATLAB implementation of the software package, called WaveSpect0, is presented including processing procedures of step-interval unification, adaptive stationary discrete wavelet denoising, baseline correction using splines, normalization, peak detection, and a newly designed peak alignment method using clustering techniques. Applications to real MS data sets for different cancer research projects in Vanderbilt Ingram Cancer Center show that the algorithm is efficient and satisfactory in MS data mining.
UR - http://www.scopus.com/inward/record.url?scp=34548247344&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=34548247344&partnerID=8YFLogxK
U2 - 10.1016/j.csda.2007.02.022
DO - 10.1016/j.csda.2007.02.022
M3 - Article
AN - SCOPUS:34548247344
SN - 0167-9473
VL - 52
SP - 211
EP - 220
JO - Computational Statistics and Data Analysis
JF - Computational Statistics and Data Analysis
IS - 1
ER -