PEDS Advance Access originally published online on September 4, 2008
Protein Engineering Design and Selection 2008 21(11):659-664; doi:10.1093/protein/gzn045

© The Author 2008. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oxfordjournals.org

A novel hierarchical ensemble classifier for protein fold recognition

Xia Guo and Xieping Gao1

Information Engineering College, Xiangtan University, Xiangtan 411105, Hunan, PR China

1 To whom correspondence should be addressed. E-mail: xpgao{at}xtu.edu.cn

Ensemble classifiers play a critical role in protein fold recognition. This article presents a novel hierarchical ensemble classifier, GAOEC (Genetic-Algorithm Optimized Ensemble Classifier), which is constructed in four steps. First, a novel optimized classifier, GAET-KNN (Genetic-Algorithm Evidence-Theoretic K Nearest Neighbors), is proposed as a component classifier. Second, six component classifiers in the first layer produce a potential class index for every query protein. Third, based on the results of the first layer, every component classifier in the second layer generates a 27-dimensional vector whose elements represent the confidence degrees of the 27 folds. Finally, a genetic algorithm generates weights for the outputs of the second layer to obtain the final classification result. GAOEC achieves a standard percentage accuracy of 64.7% on a widely used benchmark dataset in which the proteins in the testing set have less than 35% identity with those in the training set.
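The final step can be illustrated with a minimal sketch. The abstract does not state how the weights are applied to the second-layer outputs, so the sketch assumes one scalar weight per component classifier and a plain weighted sum of the 27-dimensional confidence vectors, with the predicted fold taken as the index of the largest fused value. The function names `fuse_confidences` and `predict_fold` are hypothetical, and the weights are passed in directly rather than found by a genetic algorithm.

```python
from typing import List, Sequence

N_FOLDS = 27  # number of fold classes stated in the abstract


def fuse_confidences(confidences: Sequence[Sequence[float]],
                     weights: Sequence[float]) -> List[float]:
    """Weighted sum of per-classifier confidence vectors.

    Assumption: one scalar weight per component classifier. The paper
    obtains its weights with a genetic algorithm; here they are inputs.
    """
    if len(confidences) != len(weights):
        raise ValueError("one weight per component classifier is required")
    fused = [0.0] * N_FOLDS
    for vector, weight in zip(confidences, weights):
        if len(vector) != N_FOLDS:
            raise ValueError("each confidence vector must have 27 entries")
        for fold, value in enumerate(vector):
            fused[fold] += weight * value
    return fused


def predict_fold(confidences: Sequence[Sequence[float]],
                 weights: Sequence[float]) -> int:
    """Return the 0-based index of the fold with the largest fused confidence."""
    fused = fuse_confidences(confidences, weights)
    return max(range(N_FOLDS), key=fused.__getitem__)


if __name__ == "__main__":
    # Two illustrative (made-up) confidence vectors that disagree.
    first = [0.0] * N_FOLDS
    first[3] = 0.9
    second = [0.0] * N_FOLDS
    second[10] = 0.8
    print(predict_fold([first, second], [0.3, 0.7]))
```

With these made-up inputs the prediction follows whichever classifier carries the larger weight, which is the behaviour a weight-optimizing search would exploit on validation data.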

Keywords: ET-KNN/GAET-KNN/genetic algorithm/hierarchical ensemble classifier/protein fold recognition

Received November 14, 2007; revised July 29, 2008; accepted August 1, 2008.


This article has been cited by other articles:

Q. Dong, S. Zhou, and J. Guan
A new taxonomy-based protein fold recognition approach based on autocross-covariance transformation
Bioinformatics, October 15, 2009; 25(20): 2655-2662.


