
Protein Engineering, Vol. 12, No. 2, 85-94, February 1999
© 1999 Oxford University Press

Twilight zone of protein sequence alignments

Burkhard Rost1,2,3

1 EMBL, 69 012 Heidelberg, 2 LION Bioscience AG, Im Neuenheimer Feld 517, 69 120 Heidelberg, Germany and 3 Columbia University, Department of Biochemistry and Molecular Biophysics, 650 West 168 Street, New York, NY 10032, USA

Sequence alignments unambiguously distinguish between protein pairs of similar and non-similar structure when the pairwise sequence identity is high (>40% for long alignments). The signal gets blurred in the twilight zone of 20–35% sequence identity. Here, more than a million sequence alignments were analysed between protein pairs of known structures to re-define a line distinguishing between true and false positives for low levels of similarity. Four results stood out. (i) The transition from the safe zone of sequence alignment into the twilight zone is described by an explosion of false negatives. More than 95% of all pairs detected in the twilight zone had different structures. More precisely, above a cut-off roughly corresponding to 30% sequence identity, 90% of the pairs were homologous; below 25%, fewer than 10% were. (ii) Whether or not sequence homology implied structural identity depended crucially on the alignment length. For example, if 10 residues were similar in an alignment of length 16 (>60%), structural similarity could not be inferred. (iii) The 'more similar than identical' rule (discarding all pairs for which percentage similarity was lower than percentage identity) reduced false positives significantly. (iv) Using intermediate sequences for finding links between more distant families was almost as successful: pairs were predicted to be homologous when the respective sequence families had proteins in common. All findings are applicable to automatic database searches.

Keywords: alignment quality analysis/evolutionary conservation/genome analysis/protein sequence alignment/sequence space hopping
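
Results (iii) and (iv) of the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the `Hit` record, its field names, and both function names are hypothetical, and the abstract does not say how percentage similarity was computed or how sequence families were built, so both are taken as given inputs here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One pairwise alignment hit (hypothetical record, not from the paper)."""
    query: str
    target: str
    pct_identity: float    # percentage of identical aligned residues
    pct_similarity: float  # percentage of similar aligned residues


def more_similar_than_identical(hits):
    """Discard hits whose percentage similarity is lower than their percentage identity."""
    return [h for h in hits if h.pct_similarity >= h.pct_identity]


def linked_by_intermediates(family_a, family_b):
    """Predict homology when two sequence families have at least one protein in common."""
    return bool(set(family_a) & set(family_b))


# Illustrative values only; they are not data from the study.
hits = [
    Hit("q1", "t1", pct_identity=28.0, pct_similarity=41.0),
    Hit("q1", "t2", pct_identity=30.0, pct_similarity=24.0),
]
kept = more_similar_than_identical(hits)
```

In this sketch a hit with equal similarity and identity is kept, since the rule as stated only discards pairs where similarity is strictly lower. The length-dependent cut-off from result (ii) is left out because the abstract does not give its functional form.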



