TpPred: A Tool for Hierarchical Prediction of Transport Proteins Using Cluster of Neural Networks and Sequence Derived Features

Sankalp Jain, Piyush Ranjan, Dipankar Sengupta, Pradeep Kumar Naik

Abstract


A top-down predictor, called TpPred, has been developed. It performs three levels of hierarchical classification using a cascade of neural networks trained on sequence-derived features. The first layer of the prediction engine identifies whether a query protein is a transport protein; the second layer predicts the main functional class; and the third layer predicts the sub-functional class. The overall success rates for all three layers are higher than 65%, as obtained through rigorous cross-validation tests on stringent benchmark datasets in which no protein has 30% or more sequence identity with any other protein in the same class or subclass. TpPred achieved good prediction accuracies and could complement experimental approaches for the identification of transport proteins. TpPred is freely available for in-house use as a standalone version and is accessible at http://www.juit.ac.in/attachments/tppred/Home.html.
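The three-layer, top-down flow described above can be illustrated with a minimal Python sketch. It is not the TpPred implementation: the amino acid composition feature, the function names, the class labels (`class_A`, `class_A.1`) and the threshold-based stand-in classifiers are all assumptions made for illustration, and a real system would place trained neural networks where the stubs are.

```python
from typing import Callable, Dict, List, Optional, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
HYDROPHOBIC = "AILMFVW"  # assumed grouping, used only by the toy layer-1 stub

Features = List[float]


def composition(seq: str) -> Features:
    """Return the 20-dimensional amino acid composition of a sequence."""
    seq = seq.upper()
    counts = [seq.count(a) for a in AMINO_ACIDS]
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(AMINO_ACIDS)


def predict_hierarchy(
    seq: str,
    is_transporter: Callable[[Features], bool],
    main_class: Callable[[Features], str],
    sub_class: Dict[str, Callable[[Features], str]],
) -> Tuple[bool, Optional[str], Optional[str]]:
    """Run the cascade: layer 1 gates layer 2, and layer 2 selects the layer-3 model."""
    x = composition(seq)
    if not is_transporter(x):      # layer 1: transport protein or not
        return (False, None, None)
    main = main_class(x)           # layer 2: main functional class
    sub_model = sub_class.get(main)
    sub = sub_model(x) if sub_model else None  # layer 3: sub-functional class
    return (True, main, sub)


# Hypothetical stand-ins for trained networks (placeholders, not real models).
def toy_is_transporter(x: Features) -> bool:
    hydrophobic = sum(x[AMINO_ACIDS.index(a)] for a in HYDROPHOBIC)
    return hydrophobic > 0.5


def toy_main_class(x: Features) -> str:
    return "class_A" if x[AMINO_ACIDS.index("L")] >= 0.25 else "class_B"


toy_sub_class = {"class_A": lambda x: "class_A.1"}

result = predict_hierarchy("LLLLIIVV", toy_is_transporter, toy_main_class, toy_sub_class)
```

Because each layer runs only when the previous one succeeds, a query rejected at the first layer is never assigned a functional class, and each main class can carry its own sub-class model.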


Keywords


Transport Proteins, Hierarchical classification, Neural networks, Sequence derived features

