Analysis of Mass Spectrometry data: Significance Analysis of Microarrays for SELDI-MS Data in Proteomics

Sarra HAMZAOUI, Smail BOUZERGANE, Tiratha Raj SINGH, Hassan BADIR, Ahmed MOUSSA

Abstract


Mass spectrometry (MS) has arguably become the core technology in proteomics. MALDI- and SELDI-TOF techniques enable the study of biological fluids, e.g. human blood. Analysis of these samples can lead to the discovery of new biomarkers, which can ease the diagnosis and prognosis of several diseases, e.g. various cancers. In this work, we focus on MS data from SELDI-TOF experiments. We begin with a preprocessing step to remove noise introduced by the data acquisition process. Then, we apply differential analysis to a SELDI-MS dataset, using the Significance Analysis of Microarrays (SAM) method implemented in Matlab. Results obtained with the SAM method are compared with those of the conventional t-test and Analysis of Variance (ANOVA) in order to evaluate its efficacy and performance. We demonstrate that the SAM method can be adapted for effective significance analysis of SELDI-MS data; it proves powerful and provides better results than the t-test. An easy-to-use Matlab application was developed for mass spectrometry data analysis, from raw spectra to differential analysis, including the SAM method.
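For orientation, the statistic at the heart of SAM, as published by Tusher et al., is a t-like "relative difference" whose denominator is inflated by a small positive constant s0, so that peaks with very low variance do not dominate the ranking. The following is a minimal Python sketch of that statistic for a single peak measured in two groups. It is not the Matlab application described in this work: the function names are illustrative, s0 is passed in by hand here whereas SAM chooses it from the data, and the permutation step that SAM uses to estimate the false discovery rate is omitted.

```python
from statistics import mean


def pooled_se(a, b):
    """Pooled standard error of the difference in means (two-sample t-test form)."""
    n1, n2 = len(a), len(b)
    m1, m2 = mean(a), mean(b)
    # Sum of squared deviations of each group around its own mean.
    ss = sum((x - m1) ** 2 for x in a) + sum((x - m2) ** 2 for x in b)
    return ((1 / n1 + 1 / n2) * ss / (n1 + n2 - 2)) ** 0.5


def sam_statistic(a, b, s0=0.0):
    """Relative difference d = (mean(a) - mean(b)) / (s + s0).

    With s0 = 0 this reduces to the ordinary equal-variance t statistic;
    s0 is an assumed, user-supplied fudge factor in this sketch.
    """
    return (mean(a) - mean(b)) / (pooled_se(a, b) + s0)


# Hypothetical intensities of one peak in two groups of three spectra.
control = [1.0, 2.0, 3.0]
disease = [4.0, 5.0, 6.0]
d_plain = sam_statistic(control, disease)          # ordinary t statistic
d_sam = sam_statistic(control, disease, s0=1.0)    # shrunk towards zero by s0
```

A positive s0 always moves the statistic towards zero, and the effect is strongest for peaks whose pooled standard error is small, which is the situation where the plain t-test is least stable.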

