
Protein Engineering, Vol. 12, No. 2, 95-100, February 1999
© 1999 Oxford University Press

Combining sensitive database searches with multiple intermediates to detect distant homologues

Asaf A. Salamov1,2, Makiko Suwa1, Christine A. Orengo3 and Mark B. Swindells1,4,5

1 Helix Research Institute, 1532–3 Yana, Kisarazu-shi, Chiba, 292, Japan, 3 Biomolecular Structure and Modelling Unit, Department of Biochemistry, University College London, Gower Street, London, UK and 4 Tsukuba Advanced Research Alliance, University of Tsukuba, Tsukuba 305, Japan

Using data from the CATH structure classification, we have assessed the BLASTP, FASTA, Smith–Waterman and gapped BLAST algorithms, developed a portable normalization scheme and identified safe thresholds for database searching. Of the four methods assessed, FASTA, Smith–Waterman and gapped BLAST perform similarly, whereas the sensitivity of BLASTP was much lower. Introduction of an intermediate sequence search substantially improved the results. When tested on a set of relationships that could not be identified by BLASTP, intermediate sequences were able to find double the number of relationships identified by the Smith–Waterman algorithm alone. However, we found that the benefit of using intermediates varied considerably between families and depended not only on the number of available sequences, but also on their diversity. In an attempt to increase sensitivity further, a multiple intermediate sequence search (MISS) procedure was developed. When assessed on 1906 cases from a wide range of homologous families that could not be detected by the previous approaches, MISS was able to identify 241 additional relationships. MISS uses the full extent of sequence diversity to detect additional relationships, but does not consider any structure-specific information. For this reason, it is more generally applicable than fold recognition and threading methods, which require a library of known structures.

Keywords: CATH/intermediate searches/sequence analysis/protein structure

2 Present address: Sanger Centre, Wellcome Trust Genome Campus, Cambridge, UK

5 To whom correspondence should be addressed. Present address: Inpharmatica Ltd, 60 Charlotte Street, London W1P 2AX, UK
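
The idea of an intermediate sequence search mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and it is not the MISS procedure: it only shows the general principle that a query and a target with no direct significant match may still be linked through a third sequence that matches both. The table `hits` (sequence identifier mapped to the set of sequences it matches above some threshold) and the function name `intermediate_search` are hypothetical; how the pairwise matches are scored and thresholded is left outside the sketch.

```python
from typing import Dict, Set


def intermediate_search(query: str, hits: Dict[str, Set[str]]) -> Dict[str, str]:
    """Find targets linked to `query` through exactly one intermediate.

    `hits` is an assumed, precomputed table of significant pairwise
    matches (for example from any of the search programs named in the
    abstract). Returns a mapping target -> intermediate that links it.
    """
    direct = hits.get(query, set())
    found: Dict[str, str] = {}
    # Sort for deterministic output when several intermediates qualify.
    for intermediate in sorted(direct):
        for target in sorted(hits.get(intermediate, set())):
            # Skip the query itself, direct hits, and targets already linked.
            if target == query or target in direct or target in found:
                continue
            found[target] = intermediate
    return found


# Toy data: A matches B, B matches C, but A does not match C directly.
example_hits = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B"},
}
linked = intermediate_search("A", example_hits)
print(linked)
```

In this toy table, `C` is reported as related to `A` via `B`. In practice the quality of such links depends on how strict the pairwise threshold is, which is consistent with the abstract's emphasis on identifying safe thresholds.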






