Protein Engineering, Vol. 12, No. 10, 815-818,
October 1999
© 1999 Oxford University Press
Short Communications
Conserved structural features and sequence patterns in the GroES fold family
Institute of Microbial Technology, Sector 39-A, Chandigarh 160 036, India
| Abstract |
|---|
An irregular, all-β class of proteins, comprising members of the chaperonin-10, quinone oxidoreductase, glucose dehydrogenase and alcohol dehydrogenase families, has earlier been classified as the GroES fold. In this communication, we present an extensive analysis of sequences and three-dimensional structures of proteins belonging to this family. The individual protein structures can be superposed within 1.6 Å for more than 60 structurally equivalent residues. The comparisons show a highly conserved hydrophobic core and conservation of a few key residues. A glycyl-aspartate dipeptide is suggested as being critical for the maintenance of the GroES fold. One of the surprising findings of the study is the non-conservative nature of Ile to Leu mutations in the protein core, although Ile to Val mutations are found to occur frequently.
Keywords: alcohol dehydrogenase/β-barrel/chaperonin-10/GroES-fold
| Introduction |
|---|
The GroES fold has been described as an irregular β-barrel and has been found to occur in at least four different functional classes of proteins: the quinone oxidoreductases (QOR), the alcohol dehydrogenases (ADH), the glucose dehydrogenases (GDH) and the chaperonin-10 (cpn10) (Murzin, 1996).
| Materials and methods |
|---|
The analysis is based on comparison of 157 protein sequences and eight three-dimensional structures. All the sequences were retrieved from Swiss-Prot Release 36.0 (Bairoch and Apweiler, 1999).
The three-dimensional structures for the proteins belonging to the GroES fold family were retrieved from the Protein Data Bank (PDB) (Sussman et al., 1998). Of the eight structures compared, three belonged to the chaperonin-10 family, namely, GroES of Escherichia coli (1AON; Xu et al., 1997), chaperonin-10 of Mycobacterium leprae (1LEP; Mande et al., 1996) and the gp31 protein of the T4 phage (1G31; Hunt et al., 1997), and three to the alcohol dehydrogenase family. The proteins belonging to the alcohol dehydrogenase family were horse alcohol dehydrogenase (2OHX; Eklund et al., 1976), human beta1 alcohol dehydrogenase (1DEH; Hurley et al., 1994) and human sigma alcohol dehydrogenase (1AGN; Xie et al., 1997). The other two structures considered in these comparisons were of E.coli quinone oxidoreductase (1QOR; Thorn et al., 1995) and T.acidophilum glucose dehydrogenase (John et al., 1994). The structures were superposed by visual inspection followed by least-squares fitting using the lsq commands of O (Jones et al., 1991).
The secondary structure assignments and the accessible surface areas for all the protein structures were calculated using the dssp program (Kabsch and Sander, 1983). The amino acid residues with an accessible surface area of less than 5% for at least one representative of each class were classified as residues in the core. These residues were further confirmed as core residues by manually inspecting their positions in the respective three-dimensional structures.
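The core criterion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the accessibilities are taken to be already normalised to percentages and mapped onto a common set of structurally equivalent positions, and the function name `core_positions`, the structure labels and all numerical values are hypothetical, not data from this study.

```python
def core_positions(rel_acc_by_family, cutoff=5.0):
    """Return indices of aligned positions classified as core.

    rel_acc_by_family maps a family name to a dict of
    {structure label: [relative accessibility (%) per aligned position]}.
    A position counts as core when, in every family, at least one
    representative structure has an accessibility below the cutoff.
    """
    families = list(rel_acc_by_family.values())
    n_positions = len(next(iter(families[0].values())))
    core = []
    for pos in range(n_positions):
        buried_in_every_family = all(
            any(profile[pos] < cutoff for profile in family.values())
            for family in families
        )
        if buried_in_every_family:
            core.append(pos)
    return core


# Toy values invented for illustration only (not measurements from the paper).
toy = {
    "cpn10": {"struct_a": [2.0, 40.0, 6.0], "struct_b": [7.5, 35.0, 3.0]},
    "ADH": {"struct_c": [1.0, 4.0, 4.5]},
    "QOR": {"struct_d": [4.9, 55.0, 0.0]},
}
print(core_positions(toy))  # -> [0, 2]
```

The manual inspection step that follows the automatic classification in the text is not represented here.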
| Results and discussion |
|---|
The overall topology of all the structures that were compared is very similar, as shown by Murzin (1996). The three-dimensional structures of the four families can be superimposed very well with one another (Table I).
Figure 1
Another major difference among the various structures is the insertion of a second loop, the dome loop, connecting the second and the third strands of the barrel in the chaperonin-10 structures (Figure 1).
All the members of the cpn10 family are known to be homoheptamers. Structure determination of the E.coli cpn10 (GroES) and its homologue in M.leprae has revealed that the heptamers are arranged in a dome-like structure with approximate sevenfold symmetry (Hunt et al., 1996; Mande et al., 1996). There are two distinct clusters of hydrophobic residues in the cpn10 family, one at the structural core of the monomer and the other at the interface of the monomers in the heptameric assembly. We analyzed sequence conservation at these positions to check whether any definite patterns emerge as fold determinants. As expected, our findings suggest that conservation of residues at the core of the monomer is far more stringent than at the interface.
Among the structural features conserved in all the protein families considered in our sequence analysis is the occurrence of a glycine–aspartate sequence at the end of the second β-strand (positions 62 and 63 of E.coli GroES). Absolute conservation of the glycyl–aspartyl dipeptide across protein families as divergent as a viral sequence, an archaeal sequence and mammalian sequences is as interesting as the conservation of the fold among these groups. This conserved sequence forms part of a type II β-turn (also referred to as a glycine turn; Richardson, 1981) positioned at the initiation of the third β-strand. In type II turns, the second residue is normally in the poly-Pro conformation, while the third residue is in the left-handed 3₁₀ conformation. In all the structures examined in this study, the glycine occupies the third position of the turn and is in the canonical left-handed 3₁₀ conformation. As expected for type II turns, all four α-carbons lie nearly in a plane. Type II turns generally connect two consecutive antiparallel β-strands in protein structures or help the polypeptide reverse its direction (Richardson, 1981). In the GroES fold family, neither of the two cases is observed.
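Absolute conservation of a dipeptide across an alignment, as described above, can be checked mechanically. The following is a minimal sketch in Python, assuming a gap-padded multiple alignment supplied as equal-length strings. The function name `conserved_dipeptides` is hypothetical, and the toy sequences are invented for illustration; they are not sequences from this study.

```python
def conserved_dipeptides(alignment):
    """Return (column index, dipeptide) pairs that are identical in every sequence.

    alignment is a list of equal-length aligned sequences, with '-' for gaps.
    Columns containing a gap are never reported.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    hits = []
    for col in range(length - 1):
        pairs = {seq[col:col + 2] for seq in alignment}
        if len(pairs) == 1:
            pair = pairs.pop()
            if "-" not in pair:
                hits.append((col, pair))
    return hits


# Toy alignment invented for illustration only.
toy_alignment = [
    "KVIGDRVLV",
    "RVLGDKIIV",
    "EIVGDTVAL",
    "KAVGDLVVI",
]
print(conserved_dipeptides(toy_alignment))  # -> [(3, 'GD')]
```

Column indices are zero-based positions in the alignment, not residue numbers of any particular protein.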
The type II turn seems to be important for maintaining the integrity of the fold through a unique side chain–main chain interaction. The side chain carboxylate of the aspartate is hydrogen bonded to the main chain nitrogen of the first residue of the turn (Figure 2), thereby preventing the polypeptide segments on either side of the β-turn from approaching each other. The aspartate thus correctly juxtaposes the second and third β-strands of the barrel with respect to the core of the protein. In the absence of the aspartate, we hypothesize that the second and third β-strands would form an antiparallel β-sheet, as commonly observed in other protein structures. This hypothesis can be tested by site-directed mutagenesis of these two residues.
Occurrence of the 3₁₀ helix inserted between the second and the third strands of the β-barrel appears to be a conserved feature of the GroES fold family. The reasons for the conservation of the 3₁₀ helix among all the protein structures remain unclear. A detailed sequence analysis and site-directed mutagenesis of the residues involved could shed more light on the role of the 3₁₀ helix in the integrity of the fold.
On comparing the three-dimensional structures, we identified eight residues that are shielded from the solvent and form the hydrophobic core of the proteins. These eight residues are highly conserved across the sequences, with very little variation (Table II). Considering the volume of the core to be that of the contributing side chains (Harpaz et al., 1994), the average core volumes of the chaperonin-10, quinone oxidoreductase and alcohol dehydrogenase families are 1171.3, 1102.7 and 1043.8 Å³, respectively. Hence all three families have similar core volumes. The small difference in the volumes of the cpn10 and alcohol dehydrogenase families is due to the predominance of aromatic residues at site 67 of E.coli GroES and the corresponding positions of other cpn10 sequences.
Out of the eight identified positions in the core of the GroES fold proteins (Table II
A majority of the side chains in the hydrophobic core are small and non-polar. The predominantly occurring residue is valine, as highlighted by the mutation patterns at each of the individual sites. Interestingly, valines at the core positions are seen to be mutable into isoleucines, but not into leucines (Table II). Considering that Ile and Leu have similar side chain volumes, the higher frequency of substitution of Ile by Val was rather unexpected. A probable reason could be the higher β-sheet propensity of Ile and Val than Leu (Wilmot and Thornton, 1988). Another possible reason could be the branching of side chains at the Cβ position in both Val and Ile, while it is at the Cγ position in Leu. Therefore, in the event of compensatory mutations, a Val to Leu or Ile to Leu mutation would appear to be non-conservative. A more plausible explanation can be sought from the genetic code. A single base mutation to convert Ile to Val requires a transition at the first position of the triplet codon, whereas Ile to Leu requires a transversion. Since transitions occur at higher rates than transversions (Huelsenbeck and Rannala, 1997), the substitution of Ile/Val by Leu would be less probable. A similar pattern is seen in the amino acid substitution matrices of Dayhoff (1978).
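The codon argument above can be checked by enumeration. The following is a minimal sketch in Python, assuming the standard genetic code (written with DNA letters, T in place of U). The function name `single_base_paths` is a hypothetical helper introduced for illustration and is not part of the original analysis.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
PURINES = {"A", "G"}


def single_base_paths(src, dst):
    """List single-base changes that turn a codon for src into a codon for dst.

    Each entry is (source codon, mutant codon, codon position 1-3, kind),
    where kind is 'transition' (purine<->purine or pyrimidine<->pyrimidine)
    or 'transversion' (purine<->pyrimidine).
    """
    paths = []
    for codon, aa in CODON_TABLE.items():
        if aa != src:
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODON_TABLE[mutant] == dst:
                    same_class = (codon[pos] in PURINES) == (base in PURINES)
                    kind = "transition" if same_class else "transversion"
                    paths.append((codon, mutant, pos + 1, kind))
    return paths


for src, dst in [("I", "V"), ("I", "L"), ("V", "L")]:
    for path in single_base_paths(src, dst):
        print(src, "->", dst, path)
```

Under the standard code, every single-base route from Ile to Val is a first-position transition, whereas every single-base route from Ile or Val to Leu is a first-position transversion.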
The similarities between the different GroES fold proteins therefore suggest a possible evolutionary relatedness among them. Occurrence of ligand binding at the topologically equivalent site may suggest a common evolutionary origin of the four protein families (Murzin, 1996), such as that commonly found in TIM barrel proteins (Farber, 1993). The quinone oxidoreductase and alcohol dehydrogenase proteins do indeed show high sequence similarities, reinforcing the conclusions regarding evolutionary divergence. The divergence of sequences may have preceded the divergence of the different kingdoms, thereby erasing traces of sequence similarity between the chaperonin-10 and the other families. However, evolutionary pressure seems to have preserved the amino acids responsible for core formation, and also the glycyl–aspartyl dipeptide sequence that maintains the integrity of the fold. Further detailed comparison of other β-barrel classes of proteins can help in the identification of such fold determinants, the importance of which can be confirmed by various tools including site-directed mutagenesis. The identification of such fold determinants should provide impetus to the prediction of tertiary structures from first principles.
| Acknowledgments |
|---|
We thank Alexey Murzin for useful comments and suggestions on the manuscript, Garry L. Taylor for providing the coordinates of glucose dehydrogenase and the Bioinformatics facility of the Institute of Microbial Technology for access to computers. B.T. is a CSIR Junior Research Fellow.
| Notes |
|---|
1 To whom correspondence should be addressed. Email: shekhar{at}bragg.imtech.ernet.in
| References |
|---|
Bairoch,A. and Apweiler,R. (1999) Nucleic Acids Res., 27, 49–54.
Dayhoff,M. (1978) Atlas of Protein Sequence and Structure, Vol. 5, Suppl. 3. National Biomedical Research Foundation, Washington, DC, pp. 345–358.
Eklund,H., Nordstrom,B., Soderlund,E.Z.G., Ohlsson,I., Soderberg,T.B.B.-O., Tapia,O. and Branden,C.-I. (1976) J. Mol. Biol., 102, 27–59.
Farber,G.K. (1993) Curr. Opin. Struct. Biol., 3, 409–412.
Harpaz,Y., Gerstein,M. and Chothia,C. (1994) Structure, 2, 641–649.
Huelsenbeck,J.P. and Rannala,B. (1997) Science, 276, 227–232.
Hunt,J.F., Weaver,A.J., Landry,S.J., Gierasch,L. and Deisenhofer,J. (1996) Nature, 379, 37–45.
Hunt,J.F., Saskia,M.V., Henry,L. and Deisenhofer,J. (1997) Cell, 90, 361–371.
Hurley,T.D., Bosron,W.F. and Stone,C.L. (1994) J. Mol. Biol., 239, 415–420.
John,J., Crennell,S.J., Hough,D.W., Danson,M.J. and Taylor,G.L. (1994) Structure, 2, 385–393.
Jones,T.A., Zou,J.Y., Cowan,S.W. and Kjeldgaard,M. (1991) Acta Crystallogr., A47, 110–119.
Kabsch,W. and Sander,C. (1983) Biopolymers, 22, 2577–2637.
Kraulis,P.J. (1991) J. Appl. Crystallogr., 24, 946–950.
Mande,S.C., Mehra,V., Bloom,B.R. and Hol,W.G.J. (1996) Science, 271, 203–207.
Murzin,A.G. (1996) Curr. Opin. Struct. Biol., 6, 386–394.
Murzin,A.G., Lesk,A.M. and Chothia,C. (1994a) J. Mol. Biol., 236, 1369–1381.
Murzin,A.G., Lesk,A.M. and Chothia,C. (1994b) J. Mol. Biol., 236, 1382–1400.
Richardson,J.S. (1981) Adv. Protein Chem., 34, 167–339.
Sussman,J.L., Lin,D., Jiang,J., Manning,N.O., Prilusky,J., Ritter,O. and Abola,E.E. (1998) Acta Crystallogr., D54, 1078–1084.
Thompson,J.D., Higgins,D.G. and Gibson,T.J. (1994) Nucleic Acids Res., 22, 4673–4680.
Thorn,J.M., Barton,J.D., Dixon,N.E., Ollis,D.L. and Edwards,K.J. (1995) J. Mol. Biol., 249, 785–799.
Wilmot,C.M. and Thornton,J.M. (1988) J. Mol. Biol., 203, 221–232.
Xie,P., Parsons,S.H., Speckhard,D.C., Bosron,W.F. and Hurley,T.D. (1997) J. Biol. Chem., 272, 18558–18563.
Xu,Z., Horwich,A.L. and Sigler,P.B. (1997) Nature, 388, 741–750.
Received March 19, 1999; revised June 11, 1999; accepted July 5, 1999.