
Protein Engineering vol. 16 no. 5 pp. 323-330, 2003
© 2003 Oxford University Press

Reduction of protein sequence complexity by residue grouping

Tanping Li, Ke Fan, Jun Wang and Wei Wang¹

National Laboratory of Solid State Microstructure, Institute of Biophysics and Department of Physics, Nanjing University, Nanjing 210093, China

¹ To whom correspondence should be addressed. e-mail: wangwei{at}nju.edu.cn

Naturally occurring amino acids share well-known similarities. The complexity of protein systems can therefore be reduced by sorting similar amino acids into groups, so that protein sequences can be simplified using reduced alphabets. This paper discusses how to group similar amino acids and whether there is a minimal amino acid alphabet with which proteins can still be folded. Various reduced alphabets are obtained by retaining the maximal information in the simplified protein sequence relative to the parent sequence, as measured by global sequence alignment. With these reduced alphabets and the corresponding simplified similarity matrices, protein folds are recognized from the similarity score of the sequence alignment. The coverage on the SCOP40 dataset is obtained for various levels of reduction in the number of amino acid types; coverage is the ratio of the number of homologous pairs detected by the program BLAST to the number marked as homologous by SCOP40. For reduced alphabets containing 10 types of amino acids, the ability to detect distantly related folds remains at almost the same level as with the full alphabet of 20 types, which implies that 10 types of amino acids may be the number of degrees of freedom needed to characterize the complexity of proteins.
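
As a rough illustration of the two ideas in the abstract, the sketch below maps a sequence onto a reduced alphabet and computes a coverage ratio. The 10-group alphabet shown is a hypothetical grouping chosen only for illustration, not the alphabet derived in the paper, and the function names are likewise assumptions. Coverage is taken here as the fraction of SCOP40-marked homologous pairs that also appear among the detected pairs, which is one plausible reading of the definition above.

```python
# Hypothetical 10-group alphabet for illustration only; it is NOT the
# grouping reported in the paper. Each string is one group of residues.
HYPOTHETICAL_GROUPS = ["C", "FYW", "ML", "IV", "G", "P", "ATS", "NH", "QED", "RK"]


def build_mapping(groups):
    """Map every residue to the first letter of its group (the representative)."""
    mapping = {}
    for group in groups:
        representative = group[0]
        for residue in group:
            mapping[residue] = representative
    return mapping


def reduce_sequence(sequence, mapping):
    """Rewrite a sequence in the reduced alphabet; unknown symbols become 'X'."""
    return "".join(mapping.get(residue, "X") for residue in sequence.upper())


def coverage(detected_pairs, marked_pairs):
    """Fraction of marked homologous pairs that were also detected.

    Pairs are treated as unordered, so (a, b) and (b, a) count as the same pair.
    """
    marked = {frozenset(pair) for pair in marked_pairs}
    detected = {frozenset(pair) for pair in detected_pairs}
    if not marked:
        return 0.0
    return len(detected & marked) / len(marked)


if __name__ == "__main__":
    mapping = build_mapping(HYPOTHETICAL_GROUPS)
    print(reduce_sequence("MKVLAW", mapping))
    print(coverage([("d1", "d2"), ("d3", "d4")], [("d2", "d1"), ("d5", "d6")]))
```

In practice the detected pairs would come from running an alignment tool such as BLAST on the reduced sequences with a matching reduced similarity matrix; that step is outside the scope of this sketch.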

Received November 20, 2002; revised March 10, 2003; accepted April 4, 2003.




