
Protein Engineering, Vol. 15, No. 8, 643-649, August 2002
© 2002 Oxford University Press

Sequence clustering strategies improve remote homology recognitions while reducing search times

Weizhong Li, Lukasz Jaroszewski and Adam Godzik

The Burnham Institute, La Jolla, CA 92037, USA

Sequence databases are growing rapidly and covering more of protein sequence space, but the coverage is uneven because most sequencing efforts have concentrated on a small number of organisms. The resulting granularity of sequence space creates many problems for profile-based sequence comparison programs. In this paper, we suggest several strategies that address these problems while speeding up searches for homologous proteins and improving the ability of profile methods to recognize distant homologies. One strategy combines database clustering, which removes highly redundant sequences, with a two-step PSI-BLAST (PDB-BLAST), which separates the sequence space used for building the profile from the sequence space searched for homologs. Together, these strategies improve distant homology recognition by more than 100% while using only 10% of the CPU time of a standard PSI-BLAST search. Another method, intermediate profile searches, allows exploration of additional search directions that are normally dominated by large protein sub-families within very diverse families. All methods are evaluated with a large fold-recognition benchmark.
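The database clustering step mentioned in the abstract can be illustrated with a small sketch. The abstract does not say which clustering algorithm or identity threshold was used, so everything below is an assumption made for illustration: a greedy incremental scheme that keeps the longest sequence of each cluster as its representative, a 90% threshold, and Python's `difflib` matching ratio as a crude stand-in for alignment-based sequence identity. The function names and toy sequences are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Crude stand-in for sequence identity (an assumption, not a real alignment)."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def greedy_cluster(sequences: dict, threshold: float = 0.9) -> dict:
    """Greedy incremental clustering of a name -> sequence mapping.

    Sequences are visited longest first. Each one joins the first existing
    representative it matches at or above `threshold`; otherwise it becomes
    the representative of a new cluster.
    """
    clusters = {}
    # Longest first, so every cluster is represented by its longest member.
    for name in sorted(sequences, key=lambda n: (-len(sequences[n]), n)):
        seq = sequences[name]
        for rep in clusters:
            if similarity(seq, sequences[rep]) >= threshold:
                clusters[rep].append(name)
                break
        else:
            # No representative was similar enough: start a new cluster.
            clusters[name] = [name]
    return clusters


# Hypothetical toy database: seqB is seqA without its last residue.
toy_db = {
    "seqA": "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ",
    "seqB": "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEV",
    "seqC": "GSHMLEDPVDAFKWNNTRYGECLPSAHI",
}

clusters = greedy_cluster(toy_db, threshold=0.9)
# The non-redundant database keeps only one representative per cluster.
representatives = sorted(clusters)
print(clusters)
```

In this toy run the near-duplicate pair collapses into one cluster, so a search against the representatives alone would scan two sequences instead of three. A real pipeline would replace `similarity` with an alignment-based identity measure.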






