PEDS Advance Access originally published online on September 4, 2008
Protein Engineering Design and Selection 2008 21(11):659-664; doi:10.1093/protein/gzn045
A novel hierarchical ensemble classifier for protein fold recognition
Information Engineering College, Xiangtan University, Xiangtan 411105, Hunan, PR China
1 To whom correspondence should be addressed. E-mail: xpgao{at}xtu.edu.cn
Ensemble classifiers play a critical role in protein fold recognition. In this article, a novel hierarchical ensemble classifier named GAOEC (Genetic-Algorithm Optimized Ensemble Classifier) is presented. It is constructed in the following steps. First, a novel optimized classifier named GAET-KNN (Genetic-Algorithm Evidence-Theoretic K Nearest Neighbors) is proposed as a component classifier. Second, six component classifiers in the first layer are used to obtain a potential class index for every query protein. Third, according to the results of the first layer, every component classifier in the second layer generates a 27-dimensional vector whose elements represent the confidence degrees of the 27 folds. Finally, a genetic algorithm is used to generate weights for the outputs of the second layer, which are combined to give the final classification result. The standard percentage accuracy of GAOEC is 64.7% on a widely used benchmark dataset, where the proteins in the testing set have less than 35% sequence identity with those in the training set.
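As an illustration of the final step only, the following minimal Python sketch fuses per-classifier confidence vectors with a weighted sum and picks the fold with the highest fused confidence. It is not the authors' implementation: the names `fuse_confidences` and `peaked`, the use of three second-layer classifiers, the linear weighted-sum rule, and the hand-picked weights are all assumptions made for the example. In GAOEC the weights are produced by a genetic algorithm, which is not reproduced here.

```python
from typing import List, Sequence, Tuple

N_FOLDS = 27  # number of fold classes, as stated in the abstract


def peaked(fold: int, confidence: float) -> List[float]:
    """Hypothetical helper: a 27-dimensional confidence vector with all
    mass on a single fold index (used only to build toy inputs)."""
    vec = [0.0] * N_FOLDS
    vec[fold] = confidence
    return vec


def fuse_confidences(
    confidences: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> Tuple[int, List[float]]:
    """Combine one confidence vector per classifier by a weighted sum.

    Returns the index of the fold with the highest fused confidence,
    together with the fused 27-dimensional vector.
    Assumption: a plain linear combination; the paper's exact rule may differ.
    """
    if len(confidences) != len(weights):
        raise ValueError("need exactly one weight per classifier")
    fused = [0.0] * N_FOLDS
    for vec, w in zip(confidences, weights):
        if len(vec) != N_FOLDS:
            raise ValueError("each confidence vector must have 27 elements")
        for k, c in enumerate(vec):
            fused[k] += w * c
    best = max(range(N_FOLDS), key=fused.__getitem__)
    return best, fused


# Toy second-layer outputs (assumed values): one classifier favours fold 3,
# the other two favour fold 5.
outputs = [peaked(3, 0.9), peaked(5, 0.8), peaked(5, 0.6)]

# Hand-picked weights standing in for the GA-optimized ones.
fold_a, _ = fuse_confidences(outputs, [0.5, 0.3, 0.2])
fold_b, _ = fuse_confidences(outputs, [0.2, 0.4, 0.4])
```

With the first weight vector the strongly weighted first classifier decides the result, while the second weight vector lets the two agreeing classifiers outvote it. This is the kind of trade-off that weight optimization is meant to settle.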
Keywords: ET-KNN/GAET-KNN/genetic algorithm/hierarchical ensemble classifier/protein fold recognition
Received November 14, 2007; revised July 29, 2008; accepted August 1, 2008.