Protein Engineering, Vol. 12, No. 2, 101–105, February 1999
© 1999 Oxford University Press
Spatial sign-alternating charge clusters in globular proteins
Institute of Protein Research, Russian Academy of Sciences, 142292 Pushchino, Moscow, Russia
| Abstract |
|---|
Large sign-alternating charge clusters formed by the charged side groups of amino acid residues and by N- and C-terminal groups were found in the majority of the globular proteins considered, namely in 235 of 274 protein structures (85.8%). The clusters were determined by the criteria proposed earlier: charged groups were included in a cluster if their charged N and O atoms were located at distances between 2.4 and 7.0 Å. The set of selected proteins consisted of known non-homologous protein structures from the Protein Data Bank with a resolution less than or equal to 2.5 Å and pairwise sequence similarity of less than 25%. Molecular masses of the proteins ranged from 5.5 to 91.5 kDa and chain lengths from 50 to 830 residues. Charged groups on the protein surface were distributed roughly equally among isolated charged groups, small clusters of two or three groups, and large clusters of four or more groups, which made up 33, 35 and 32% of all protein charged groups, respectively. The large sign-alternating charge clusters with four or more charged groups were studied in greater detail. The number of such clusters depends on the protein chain length: small proteins contain 1–3 clusters, while large proteins display 4–6 or more. On average, 1.5 clusters per 100 residues were observed. In contrast, the size of a cluster, i.e. the number of charged groups in it, does not depend on the protein molecular mass, and large clusters are observed for proteins across the whole range of molecular masses. Clusters consisting of four to six charged groups occur most frequently, although extra-large clusters are also often found. We conclude that sign-alternating charge clusters are a common feature of the surface of globular proteins. They are suggested to play a general functional role as a local polar factor of the protein surface.
Keywords: charged groups/globular proteins/hydrophilicity/molecular surface/sign-alternating charge cluster
| Introduction |
|---|
One of the general features of the protein surface is the existence of extended regions with different functional features. At present some such regions have been detected and examined in globular proteins. They are as follows: the polar areas sized by continuous networks of hydrogen bonds between polar side chains and water molecules (Peters and Peters, 1993)
‘Complex salt bridges’, i.e. clusters of ionic pairs with a charge-to-charge distance less than or equal to 4 Å, have recently been examined in a PDB subset of protein structures (Musofia et al., 1995). It was revealed that 60% of the studied proteins contained complex salt bridges and that most of them had an important function in intersubunit interactions. Extended charged polar regions which include spatial sign-alternating charge clusters were found in the calf eye lens proteins, the γ-crystallins (Chirgadze and Tabolina, 1996). Here the charge-to-charge distances were taken to be less than or equal to 7.0 Å. It was shown that γ-crystallin has five such large clusters of four to six charged groups, which comprise 54% of the total charged groups in this protein. The common stereochemical properties of sign-alternating charge clusters have led to their description as surface structural invariants. The evolutionary conservation of these clusters was confirmed for all members of the γ-crystallin family of vertebrates, including fish, frog, mouse, rat, calf and man. The charge clusters play two functional roles in γ-crystallins. One is connected with a decrease of the surface ‘hydrophilic potential’ in the cluster areas, which allows the native protein to exist in the condensed medium of the eye lens (Wistow et al., 1983). The other is an increase in the local stability of the protein internal structure (Chirgadze, 1996). The result obtained for γ-crystallin encouraged us to study the existence of sign-alternating charge clusters in all known protein structures which have been determined at high resolution. We will see below that charge clusters of this type are widespread among globular proteins and are of general interest for studying the protein surface.
| Materials and methods |
|---|
Definition of sign-alternating charge clusters
Ion pairs on the protein surface are formed by the side groups of amino acid residues and can be divided into two types. The first comprises short-contact ion pairs, which are postulated to have intercharge distances of less than 4.0 Å (Barlow and Thornton, 1983). The second comprises distant ion pairs with intercharge distances from 4 to 7 Å. It should be noted that in both cases neither a water molecule nor any other protein atomic group can be situated between the charged atoms. Herein we follow the cluster definition given in the previous paper (Chirgadze and Tabolina, 1996), which includes ion pairs of both types. A sign-alternating charge cluster was assumed to be a system of oppositely charged atoms of amino acid side chains and of N- and C-terminal charged groups with distances between charged atoms of 2.4 < d ≤ 7.0 Å. In this approximation, point charges were placed at the centres of the corresponding oxygen and nitrogen atoms. The positively charged groups of Lys, Arg and the N-terminus and the negatively charged groups of Asp, Glu and the C-terminus were considered. The pH was assumed to be neutral, and the side groups of histidine residues were not taken into account. Such clusters differ from the small clusters formed by conventional short-contact ion pairs in the extent of the local area, which is evenly filled only with charged atoms or atomic groups. As mentioned above, these clusters can be considered as a surface structural invariant. We therefore paid attention mainly to large-size clusters, although the charged groups on the protein surface are distributed among single isolated groups, small-size clusters of two and three groups, and large-size clusters formed by four or more charged groups.
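The distance criterion above can be illustrated with a minimal Python sketch. It is not the CLUSTER program of the cited paper: the names `ChargedAtom` and `find_clusters` and the group labels are hypothetical, and the sketch assumes that charged groups join the same cluster through chains of opposite-sign atom pairs with 2.4 < d ≤ 7.0 Å. The requirement that no water molecule or other protein group lies between the charges is not checked here.

```python
from itertools import combinations
from math import dist
from typing import NamedTuple

D_MIN, D_MAX = 2.4, 7.0  # angstroms; two atoms are linked if D_MIN < d <= D_MAX


class ChargedAtom(NamedTuple):
    group: str                        # hypothetical label of the charged group
    sign: int                         # +1 (Lys, Arg, N-terminus) or -1 (Asp, Glu, C-terminus)
    xyz: tuple[float, float, float]   # point charge at the N or O atom centre


def find_clusters(atoms):
    """Group charged groups connected by opposite-sign atom pairs in range."""
    parent = {a.group: a.group for a in atoms}

    def root(g):
        # Union-find lookup with path halving.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for a, b in combinations(atoms, 2):
        if a.sign != b.sign and D_MIN < dist(a.xyz, b.xyz) <= D_MAX:
            parent[root(a.group)] = root(b.group)

    clusters = {}
    for g in parent:
        clusters.setdefault(root(g), set()).add(g)
    # Largest clusters first; isolated groups appear as clusters of size 1.
    return sorted(clusters.values(), key=len, reverse=True)


# Made-up coordinates for illustration only.
atoms = [
    ChargedAtom("LYS1", +1, (0.0, 0.0, 0.0)),
    ChargedAtom("ASP2", -1, (3.0, 0.0, 0.0)),
    ChargedAtom("ASP2", -1, (4.0, 1.0, 0.0)),
    ChargedAtom("ARG3", +1, (7.0, 0.0, 0.0)),
    ChargedAtom("GLU4", -1, (10.0, 0.0, 0.0)),
    ChargedAtom("LYS5", +1, (30.0, 0.0, 0.0)),
]
print([len(c) for c in find_clusters(atoms)])  # [4, 1]
```

Cluster size is counted here in charged groups rather than atoms, so the two carboxylate oxygens of ASP2 contribute a single group, and LYS5 is reported as an isolated group.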
| Program |
|---|
Searching for sign-alternating charge clusters was performed by means of the CLUSTER program (Chirgadze and Tabolina, 1996).
Data
We analysed protein structure data from the Brookhaven Protein Data Bank (Bernstein et al., 1977). Only high-resolution 3D crystallographic data with a resolution limit equal to or less than 2.5 Å were chosen. Non-homologous proteins with a pairwise sequence identity of no more than 25% were taken into account. Small proteins with a chain length of less than 100 residues were allowed to have a sequence similarity of less than 30%. The smallest proteins, with chain lengths of less than 50 residues, were not examined. To obtain a high-quality data set, a visual inspection of the PDB files was performed at the final step of file selection. Only a few files were discarded, mainly because some data, such as side groups of charged residues, were missing. Very often a few residues at the N- and C-termini or some parts in the middle of the chain were lacking; protein structure files with such deficiencies were also processed. Sometimes twin positions of charged groups were found in data files with a resolution of less than 1.8 Å. In these cases, we chose the first position of the charged atom.
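The choice of the first of the twin positions can be sketched as follows, assuming the files follow the fixed-column PDB coordinate format, in which columns 13–16 hold the atom name, column 22 the chain identifier and columns 23–27 the residue number with its insertion code. The function name `first_positions` is hypothetical, and the sketch is only one possible reading of the rule, not the procedure actually used.

```python
def first_positions(pdb_lines):
    """Keep the first-listed coordinates of each atom and drop later alternates."""
    seen = set()
    kept = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            # 0-based slices of the fixed-column record:
            # chain = line[21], residue number + insertion code = line[22:27],
            # atom name = line[12:16]. The alternate-location flag (line[16])
            # is deliberately left out of the key.
            key = (line[21], line[22:27], line[12:16])
            if key in seen:
                continue  # a later twin position of an atom already kept
            seen.add(key)
        kept.append(line)  # non-coordinate records pass through unchanged
    return kept
```

The sketch assumes a single-model crystallographic file; in a multi-model file the same atom key would repeat in every model and only the first model would survive.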
| Results |
|---|
Number of charge clusters depending on protein chain length
An accurate treatment of charge clusters was carried out at the atomic approximation described in our previous paper (Chirgadze and Tabolina, 1996). However, counting charged groups is a much more common practice. For this reason we considered as large only clusters with four or more charged groups, which consist of at least five charged atoms. The main result is that large charge clusters were observed in 235 of the 274 selected protein structures and were absent from only 39 structures (Table I). Thus the majority of proteins, i.e. 85.8%, display charge clusters on their surfaces. It should also be noted that almost 100% of proteins of medium and large size, starting from chain lengths of 200 residues, contain large charge clusters.
The number of clusters in the protein molecule depends on the protein chain length. We selected protein structures with chain lengths from 50 to 830 amino acid residues, which corresponds approximately to a range of molecular masses from 5.5 to 91.5 kDa. The distribution of the selected proteins according to their chain lengths is presented in Figure 1.
The relative proportion of proteins in groups with different numbers of large-size charge clusters is presented in Table II.
Size of charge clusters
The overall distribution of charge clusters according to their size is presented in Table III for 235 globular proteins. We distinguished three main types of cluster according to their size, which is taken to be simply the number of charged groups composing the cluster. Of a total of 13 754 charged groups considered for these proteins, isolated charged groups make up 32.5%, small-size clusters of 2–3 charged groups contain 35.1%, and large clusters of four or more groups contain 32.4%. This suggests that in globular proteins nearly one third of all charged groups are united in large-size sign-alternating charge clusters on the surface. In total, 805 such clusters were observed. The cluster size as a function of protein chain length is presented in Figure 3. Cluster sizes over a wide range, from 4 to 22 charged groups, were observed. Clusters of four to six charged groups occur most frequently. However, extra-large clusters which include seven or more charged groups are also widespread. It is interesting to note that the existence of extra-large clusters does not depend on the protein chain length: such clusters occur over the whole range of protein chain lengths from 50 to 830 amino acid residues.
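As a rough consistency check, the figures quoted above can be combined to estimate the mean size of a large cluster and the mean number of large clusters per protein. Both values are derived here from rounded percentages and are not reported in the text; the variable names are illustrative only.

```python
total_groups = 13_754        # charged groups in the 235 proteins
large_clusters = 805         # clusters with four or more charged groups
proteins = 235

# Percentage of charged groups in each class, as quoted in the text.
share = {"isolated": 32.5, "small": 35.1, "large": 32.4}

share_sum = round(sum(share.values()), 1)                    # 100.0
groups_in_large = total_groups * share["large"] / 100        # about 4456 groups
mean_cluster_size = groups_in_large / large_clusters         # about 5.5 groups
clusters_per_protein = large_clusters / proteins             # about 3.4 clusters

print(f"shares sum to {share_sum}%")
print(f"mean large-cluster size: {mean_cluster_size:.1f} groups")
print(f"large clusters per protein: {clusters_per_protein:.1f}")
```

A mean of about 5.5 groups per large cluster is in line with the observation that clusters of four to six charged groups are the most frequent.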
Composition of charge clusters
The relative contents of the various charged groups which form clusters are presented in Table IV. In general, charged groups of different kinds are equally distributed among the three cluster types. However, some groups show an obvious preference. The charged side group of Arg has a lower content among single isolated groups but a higher content in large-size clusters. The charged side group of Lys and the carboxyl group of C-termini are found preferentially in small-size clusters but are weakly represented in large-size clusters. For large-size clusters, we can conclude that Arg is over-represented, at 41.1%, while Lys and C-termini are under-represented, at 28.0 and 21.5%, respectively. The result obtained for Arg and Lys agrees fully with the conclusion of Musofia et al. (1995) on the composition of the so-called ‘complex salt bridges’ consisting of 3–5 ionic groups.
| Discussion |
|---|
The results suggest that sign-alternating charge clusters are a rather common feature of globular proteins. About 86% of the protein structures considered were found to contain large-size charge clusters. Charge clusters with four to six charged groups occur most frequently, although extra-large clusters with 7–10 or even more charged groups are also observed. The arginine side group is slightly preferred, whereas the lysine side group and the carboxyl group of C-termini participate weakly in the composition of large-size charge clusters. At present there is no doubt about the functional importance of sign-alternating charge clusters. Their most general function seems to be connected with the local polarity and possibly the stability of the protein surface. Earlier data analysis did not allow one to explain unambiguously an increase of protein stability by introducing salt bridges (Lee and Vasmatzis, 1997)
γ-crystallins (Chirgadze, 1996)
γ-crystallins showed that they display some specific features, such as planar geometry, large linear dimensions, water arrangement along the cluster boundary, etc., which should also be taken into account. However, the most interesting feature, which seems to be meaningful only for future consideration, is a decrease of the ‘hydrophilic potential’ of the protein surface in the cluster area (Wistow et al., 1983).
| Note added in proof |
|---|
The theoretical probability of occurrence of charge clusters was estimated as a function of their size [E.Larionova and Yu Chirgadze (1998) Mol. Biol., 32, 15 (Russian)]. It was shown that the observed distribution of single charged groups, small-size clusters and large charge clusters of 4–6 groups is consistent with random occurrence. However, gigantic clusters consisting of seven or more charged groups were observed in the proteins much more frequently than would be expected by chance.
| Acknowledgments |
|---|
This work was supported by the Russian Academy of Sciences and the Russian Foundation for Basic Research (grant No. 96-04-48585).
| Notes |
|---|
1 To whom correspondence should be addressed
| References |
|---|
Barlow,D.J. and Thornton,J.M. (1983) J. Mol. Biol., 168, 867–885.
Bernstein,F.C., Koetzle,T., Williams,G., Meyer,E., Brice,M., Rogers,J., Kennard,O., Shimanouchi,T. and Tasumi,M. (1977) J. Mol. Biol., 112, 535–542.
Chirgadze,Yu.N. (1996) Molek. Biologia (in Russian), 30, 343–347. English translation, pp. 202–205.
Chirgadze,Yu.N. and Tabolina,O.Yu. (1996) Protein Engng, 9, 745–754.
Lee,B. and Vasmatzis,G. (1997) Curr. Opin. Biotechnol., 8, 423–428.
Lijnzaad,P., Berendsen,H.J.C. and Argos,P. (1996) Proteins, 25, 389–397.
Musofia,B., Buchner,V. and Arad,D. (1995) J. Mol. Biol., 254, 761–770.
Peters,D. and Peters,J. (1993) Mol. Engng, 2, 375–400.
Realini,C., Rogers,S.W. and Rechsteiner,M. (1994) FEBS Lett., 348, 109–113.
Vogt,G. and Argos,P. (1997) Fold. Design, 2, 540–548.
Wistow,G., Turnell,B., Summers,L., Slingsby,C., Moss,D., Miller,L., Lindley,P.F. and Blundell,T.L. (1983) J. Mol. Biol., 170, 175–202.
Received December 17, 1997; revised May 12, 1998; accepted September 22, 1998.