Protein Engineering vol. 16 no. 12 pp. 861-863, 2003
© 2003 Oxford University Press
COMMUNICATION
Similarity between the C-terminal domain of the prion protein and chimpanzee cytomegalovirus glycoprotein UL9
Department of Biomathematical Sciences, Mount Sinai School of Medicine, Box 1023, One Gustave L. Levy Place, New York, NY 10029, USA
1 To whom correspondence should be addressed at: Department of Molecular Biosciences, 2034 Haworth Hall, 1200 Sunnyside Avenue, The University of Kansas, Lawrence, KS 66045, USA. e-mail: igor{at}ku.edu
| Abstract |
|---|
Prion diseases are a group of fatal neurodegenerative disorders associated with structural conversion of a normal, mostly α-helical cellular prion protein, PrPC, into a pathogenic β-sheet-rich conformation, PrPSc. The structure of PrPC is well studied, whereas the insolubility of PrPSc makes the characterization of its structure problematic. No proteins similar to PrP, except for its paralog with the same fold, PrP-Doppel, are known. However, PrP-Doppel does not undergo a structural transition into a β-sheet-rich conformation. Structural information from proteins that share a weak but significant sequence similarity with PrP may be used to gain additional insights into the conformation of PrPSc. We construct a sequence profile corresponding to the structured domain of PrP and use this profile to search the SWISS-PROT and TrEMBL databases. We identify a significant sequence similarity between PrP and chimpanzee cytomegalovirus glycoprotein UL9. This glycoprotein scores higher than all PrP-Doppel sequences. Fold recognition methods assign a mainly-β fold to UL9. Owing to the observed sequence similarity with PrP and a putative mainly-β fold, the UL9 glycoprotein may represent a potential target for experimental structure determination aimed at obtaining a structural template for PrPSc modeling.
Keywords: alignment/conformational transition/Doppel/sequence profile
| Introduction |
|---|
Prion diseases are a class of fatal neurodegenerative disorders in mammals (CJD, BSE, scrapie, etc.). These diseases may be inherited or may arise sporadically, and are believed to be caused by a unique pathogen that contains no nucleic acid, the prion protein. The prion protein is a rare example of a protein that can exist, under physiological conditions, in two different conformations: the normal cellular protein with unknown function, designated PrPC, and the infectious pathogenic form, designated PrPSc. According to the prion-only hypothesis, pathogenesis begins with the formation of PrPSc, caused by a point mutation or some exogenous factors; PrPSc subsequently interacts with PrPC and converts it. The conformational transition PrPC → PrPSc involves unfolding of α-helices and formation of β-sheets. This transition is not associated with any covalent modifications (Prusiner et al., 1998).
The cellular form of the prion protein is a GPI-anchored outer-membrane glycoprotein that undergoes rapid endocytosis (Lehmann et al., 1999). A number of NMR and X-ray studies aimed at determining the structure of PrPC have revealed that the C-terminal domain of the protein is structured, whereas the N-terminal domain, which contains Gly- and Pro-rich octarepeats, is highly flexible and cannot be assigned a particular conformation (Riek et al., 1998). Recently, a paralog of the prion protein, PrP-Doppel, was identified (Mo et al., 2001). This protein and the C-terminal domain of PrP share ~25% sequence identity and have very similar structures, which consist of three α-helices (A, B and C) and a short β-sheet. However, despite its structural similarity to PrP, Doppel does not undergo a structural transition into a β-sheet-rich conformation (Nicholson et al., 2002).
Little is known about the pathogenic conformation of the prion protein, PrPSc, except for its approximate secondary structure content, protease resistance and the insolubility of some forms (Prusiner et al., 1998). Owing to the insolubility of PrPSc, characterization of its structure by NMR or X-ray crystallography has been problematic. A number of attempts to model the structure of PrPSc using spectroscopic and electron crystallography data have been undertaken (Huang et al., 1995; Wille et al., 2003). Improvement of the quality of such knowledge-based models, and progress in determining the structure of PrPSc, can be achieved by using information derived from proteins that share a weak but significant sequence similarity with PrP. The structural properties of such proteins, especially if they adopt a mainly-β fold, may be used to gain insight into the conformation of PrPSc. Sequence profiles obtained from a multiple sequence alignment of related proteins represent one of the most sensitive methods used to detect structural similarity between proteins with a low degree of sequence identity (Gribskov and Veretnik, 1996). Our aim here is to use sequence profiles to identify proteins that share a significant sequence similarity with the structured C-terminal domain of the prion protein.
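To make the idea of a sequence profile concrete, the following is a minimal sketch of a position-specific scoring profile built from an ungapped toy alignment. It is not the Gribskov profile method or the PROFILESEARCH program: the log-odds scoring against a uniform background, the pseudocount, and the function names `build_profile`, `score_window` and `best_ungapped_hit` are all assumptions made for illustration only.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_profile(alignment, pseudocount=1.0):
    """Per-column log2-odds scores versus a uniform background (toy model)."""
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    profile = []
    for col in range(length):
        # Count residues in this column; gap or unknown characters are ignored.
        counts = Counter(row[col] for row in alignment if row[col] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        profile.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return profile

def score_window(profile, segment):
    """Sum of column scores for an ungapped segment as long as the profile."""
    return sum(col.get(aa, 0.0) for col, aa in zip(profile, segment))

def best_ungapped_hit(profile, sequence):
    """Slide the profile along the sequence; return (best score, start index)."""
    width = len(profile)
    return max(
        (score_window(profile, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )

# Hypothetical toy alignment and query, not real PrP data.
toy_profile = build_profile(["MKVLA", "MKILA", "MRVLG"])
toy_score, toy_start = best_ungapped_hit(toy_profile, "GGGMKVLAGGG")
```

Because each column carries its own scores, conserved positions contribute more to the total than variable ones, which is what gives profiles their sensitivity at low sequence identity. Real profile searches additionally allow gaps and weight columns with a substitution matrix.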
| Methods and results |
|---|
|
|
|---|
We used sequence profile search software (Gribskov and Veretnik, 1996
The highest scoring (z-score = 11.76) non-PrP and non-Doppel sequence is the chimpanzee cytomegalovirus (CMV) glycoprotein UL9 (Table I). UL9 is the only non-PrP sequence which scores higher than all Doppel proteins. It should be noted that the prion protein and Doppel share the same fold and ~25% sequence identity. The next highest scoring non-PrP and non-Doppel sequence (TrEMBL accession No. Q9DFV7) has a z-score of 8.63, which is lower than the z-scores of all but one of the Doppel sequences. The alignment scores produced by the PROFILESEARCH program do not take into account compositional bias, which may result in an artificially high score for sequences with low complexity. Therefore, it is necessary to study the effect of the amino acid composition of UL9 on the z-score obtained from profile analysis. We generated 10³ random sequences by shuffling the UL9 sequence and aligned these random sequences with the PrP profile using the GCG PROFILEGAP program. This procedure produced a distribution of 10³ random scores with an average score of 22.266 ± 3.7688. The local alignment scores are known to follow the extreme value distribution (Pearson, 1998):

$$P(S \geq x) = 1 - \exp\left[-e^{-(x-a)/b}\right] \qquad (1)$$
The random scores were fitted to the extreme value distribution using the STATISTICA software package (Statistica version 6.0; StatSoft, Inc., 2300 East 14 St., Tulsa, OK 74104, USA), giving a = 20.5557 and b = 2.971 (Equation 1). The profile alignment score for UL9 is 53.17, and the probability of observing a score of this magnitude or larger in the sequences with the same amino acid composition and length as those of UL9 obtained using Equation 1 is very low, P(S ≥ 53.17) = 1.7 × 10⁻⁵. Therefore, we conclude that the observed sequence similarity between UL9 and prion protein is highly significant. The local alignment between UL9 and the prion protein profile comprises residues 67–172 of UL9 and is shown in Figure 1A. Pairwise local alignment of chicken PrP and UL9 shows that the best alignment comprises residues 80–130 of UL9, and helices A and B of PrP (Figure 1B). It should be noted that the loop connecting helices A and B is thought to participate in binding the hypothetical PrP ligand, protein X, which may be involved in conformational transition (Kaneko et al., 1997).
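The significance estimate described above can be sketched in a few lines. This is only an illustration under stated assumptions: Equation 1 is taken to be the Gumbel form of the extreme value distribution with location a and scale b, the fit uses a simple method-of-moments estimate rather than the STATISTICA procedure, and scoring the shuffled sequences against the profile (done with PROFILEGAP in the text) is left out. All function names are hypothetical.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329

def shuffled_copies(sequence, n, seed=0):
    """Return n shuffles of a sequence; composition and length are preserved."""
    rng = random.Random(seed)
    residues = list(sequence)
    copies = []
    for _ in range(n):
        rng.shuffle(residues)
        copies.append("".join(residues))
    return copies

def gumbel_moment_fit(scores):
    """Method-of-moments estimate of Gumbel location a and scale b.

    Assumption: stands in for the fitting done in STATISTICA, which may
    use a different estimator and give slightly different values.
    """
    n = len(scores)
    mean = sum(scores) / n
    variance = sum((s - mean) ** 2 for s in scores) / (n - 1)
    b = math.sqrt(6.0 * variance) / math.pi
    a = mean - EULER_GAMMA * b
    return a, b

def evd_tail_probability(score, a, b):
    """P(S >= score) = 1 - exp(-exp(-(score - a) / b))."""
    z = (score - a) / b
    # expm1 keeps precision when the probability is very small.
    return -math.expm1(-math.exp(-z))

# Fitted parameters and UL9 score as reported in the text.
p_ul9 = evd_tail_probability(53.17, a=20.5557, b=2.971)
```

With the reported a = 20.5557, b = 2.971 and the UL9 score of 53.17, `evd_tail_probability` evaluates to roughly 1.7e-5, in line with the reported P-value.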
CMV is a member of the herpesvirus group. It has been proposed as the most prevalent infectious agent causing neurological dysfunction in the developing brain, and therefore has a high affinity for developing brain cells (van den Pol et al., 2002).
We used the mGenThreader fold recognition server (McGuffin and Jones, 2003), which has been shown to have the lowest rate of false positive predictions among all automated fold recognition servers (Bujnicki et al., 2001), to make predictions for the UL9 protein. It should be noted that all highest scoring templates (E-value from 0.03 to 0.06) belong to mainly-β proteins involved in substrate binding: immunoglobulin antigen-binding domains (PDB i.d. 8fab, 12e8, 1a3l, 32c2, 1igt) and T-cell receptors (PDB i.d. 1tcr, 1hxm, 1bec). A different fold recognition method, SAM_T02 (Karplus et al., 2001), also assigns highest scoring hits for UL9 to immunoglobulin antigen-binding domains and T-cell receptors. The same two methods do not find any significant matches for the C-terminal domain of the prion protein, except for the match between PrP and Doppel. The evidence of a putative mainly-β fold of the UL9 protein and its sequence similarity with the prion protein, which undergoes a conformational transition into a mainly-β conformation, identify UL9 as a potential target for experimental structure determination aimed at obtaining a template for modeling the structure of PrPSc. Further progress in structural and functional annotation of UL9 may help to understand the function of PrP and what type of substrate it binds.
| Acknowledgements |
|---|
This work was supported by grant number 1R01 LM06789 from the National Library of Medicine of the National Institutes of Health. I.B.K. is supported by NSF EPSCoR.
| References |
|---|
Bairoch,A. and Apweiler,R. (2000) Nucleic Acids Res., 28, 45–48.
Bujnicki,J.M., Elofsson,A., Fischer,D. and Rychlewski,L. (2001) Proteins, S5, 184–191.
Gribskov,M. and Veretnik,S. (1996) Methods Enzymol., 266, 198–212.
Henikoff,S. and Henikoff,J.G. (1992) Proc. Natl Acad. Sci. USA, 89, 10915–10919.
Huang,Z., Prusiner,S. and Cohen,F.E. (1995) Fold Des., 1, 13–19.
Kaneko,K., Zulianello,L., Scott,M., Cooper,C.M., Wallace,A.C., James,T.L., Cohen,F.E. and Prusiner,S.B. (1997) Proc. Natl Acad. Sci. USA, 94, 10069–10074.
Karplus,K., Karchin,R., Barrett,C., Tu,S., Cline,M., Diekhans,M., Grate,L., Casper,J. and Hughey,R. (2001) Proteins, S5, 86–91.
Lehmann,S., Milhavet,O. and Mange,A. (1999) Biomed. Pharmacother., 53, 39–46.
McGuffin,L.J. and Jones,D.T. (2003) Bioinformatics, 19, 874–881.
Mo,H., Moore,R.C., Cohen,F.E., Westaway,D., Prusiner,S.B., Wright,P.E. and Dyson,H.J. (2001) Proc. Natl Acad. Sci. USA, 98, 2352–2357.
Nicholson,E.M., Mo,H., Prusiner,S.B., Cohen,F.E. and Marqusee,S. (2002) J. Mol. Biol., 316, 807–815.
Pearson,W.R. (1998) J. Mol. Biol., 276, 71–84.
Prusiner,S.B., Scott,M.R., DeArmond,S.J. and Cohen,F.E. (1998) Cell, 93, 337–348.
Riek,R., Wider,G., Billeter,M., Hornemann,S., Glockshuber,R. and Wüthrich,K. (1998) Proc. Natl Acad. Sci. USA, 95, 11667–11672.
van den Pol,A.N., Reuter,J.D. and Santarelli,J.G. (2002) J. Virol., 76, 8842–8854.
Wille,H., Michelitsch,M.D., Guenebaut,V., Supattapone,S., Serban,A., Cohen,F.E., Agard,D.A. and Prusiner,S.B. (2003) Proc. Natl Acad. Sci. USA, 99, 3563–3568.
Received September 2, 2003; accepted September 12, 2003