Protein Engineering, Vol. 12, No. 5, 387-394, May 1999
© 1999 Oxford University Press
PSIC: profile extraction from sequence alignments with position-specific counts of independent observations
1 European Molecular Biology Laboratory, Meyerhofstrasse 1, Postfach 10.2209, D-69012 Heidelberg, 2 Max-Delbrück-Centrum für Molekulare Medizin, Robert-Rössle-Strasse 10, D-13122 Berlin-Buch, Germany, 3 V.A. Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Vavilov Street 32, 117984 Moscow, 5 Moscow Institute of Physics and Technology, Institutsky per. 9, Dolgoprudny, Moscow Region and 6 Institute of Control Sciences, Russian Academy of Sciences, Profsoyuznaya Street 65, 117806 Moscow, Russia
Sequence weighting techniques aim to balance the redundant information contributed by subsets of similar sequences in multiple alignments. Traditional approaches apply the same weight to all positions of a given sequence and hence assume equal efficiency of phylogenetic changes along the whole sequence. The new method described in this paper, PSIC (position-specific independent counts), does not require this restrictive assumption. Using statistical concepts, the number of independent observations (counts) of an amino acid type at a given alignment position is calculated from the overall similarity of the sequences that share this amino acid type at that position. This approach allows fast computation of position-specific sequence weights even for alignments containing hundreds of sequences. The PSIC approach has been applied to profile extraction and to the fold family assignment of protein sequences with known structures. The method was shown to be very productive in finding distantly related sequences and, in many cases, more powerful than Hidden Markov Models or the profile methods in WiseTools and PSI-BLAST. The profile extraction routine is available on the WWW (http://www.bork.embl-heidelberg.de/PSIC or http://www.imb.ac.ru/PSIC).
Keywords: fold recognition/motif recognition/profile extraction/position-specific independent counts/PSIC/sequence weighting
4 To whom correspondence should be addressed. E-mail: frank.eisenhaber@embl-heidelberg.de
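The idea summarized in the abstract, counting independent observations of a residue from the similarity of the sequences that share it, can be illustrated with a small sketch. This is not the published PSIC formula. It is a toy estimator built on stated assumptions: a uniform background distribution over the 20 amino acids, gaps treated as ordinary characters, and the effective count `n` chosen so that the chance of `n` independent residues all being identical matches the observed fraction of fully identical columns. The names `identical_fraction`, `expected_identical` and `effective_count` are hypothetical.

```python
from typing import Dict, List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Assumption: uniform background frequencies, chosen only for illustration.
BACKGROUND: Dict[str, float] = {a: 1.0 / len(AMINO_ACIDS) for a in AMINO_ACIDS}


def identical_fraction(rows: List[str], skip: int) -> float:
    """Fraction of columns other than `skip` in which all rows agree."""
    cols = [c for c in range(len(rows[0])) if c != skip]
    if not cols:
        return 1.0  # nothing to compare: treat the rows as fully redundant
    same = sum(1 for c in cols if len({r[c] for r in rows}) == 1)
    return same / len(cols)


def expected_identical(n: float) -> float:
    """Chance that n independent draws from BACKGROUND are all the same residue."""
    return sum(p ** n for p in BACKGROUND.values())


def effective_count(alignment: List[str], column: int, residue: str) -> float:
    """Toy estimate of the number of independent observations of `residue`
    at `column`, based on how similar the sequences carrying it are."""
    rows = [s for s in alignment if s[column] == residue]
    if not rows:
        return 0.0
    if len(rows) == 1:
        return 1.0
    f_obs = identical_fraction(rows, column)
    if f_obs >= 1.0:
        return 1.0  # identical sequences count as one observation
    lo, hi = 1.0, float(len(rows))
    if expected_identical(hi) >= f_obs:
        return hi  # as dissimilar as independent sequences: cap at the row count
    # expected_identical decreases with n, so bisect for the matching n.
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if expected_identical(mid) > f_obs:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    toy = ["ACDEF", "ACDEK", "GCDEF"]
    print(effective_count(toy, 0, "A"))
```

In this sketch, three identical sequences contribute a count of 1 for any residue they share, while two sequences that agree only at the queried column contribute a count of 2. Because the count depends on the column and the residue, the resulting weights are position-specific, unlike a single weight per sequence.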