Determination of protein-protein interaction through Artificial Neural Network and Support Vector Machine: A Comparative study

Himansu Kumar, Swati Srivastava, Pritish Varadwaj


Protein-protein interactions (PPIs) play a considerable role in most cellular processes, and studying them improves the understanding of the molecular mechanisms of the cell. Since the emergence of proteomics, a huge number of protein sequences have been generated, but their interaction patterns remain largely unrevealed. Traditional techniques for identifying PPIs are deficient in terms of accuracy. To overcome the limitations of these experimental approaches, numerous computational approaches have been developed. However, previous computational approaches were based on descriptors, various external factors, and protein sequences. In this article, a sequence-based prediction model is proposed that uses machine learning. A large amount of yeast PPI data was analyzed. The same data were used with two classification approaches, an Artificial Neural Network (ANN) and a Support Vector Machine (SVM), and their results were compared to assess the efficiency of each approach. Existing methods were implemented with additional features to improve the accuracy of the results. It was concluded that the proposed model performs better than existing sequence-based methods and can therefore be useful for future proteomics research.


Protein-Protein Interaction; Machine Learning; Artificial Neural Network; Support Vector Machine
