Protein Engineering, Vol. 12, No. 7, 535-548,
July 1999
© 1999 Oxford University Press
Amino acid neighbours and detailed conformational analysis of cysteines in proteins
Biostructure and Protein Engineering Group, Department of Life Science, Aalborg University, Sohngaardsholmsvej 57, DK-9000 Aalborg, Denmark E-mail: Steffan.Petersen{at}civil.ave.dk
Abstract
Here we present an investigation of the contacts that cysteines make with residues in their three-dimensional environment and a comprehensive analysis of the conformational features of 351 disulphide bridges in 131 non-homologous single-chain protein structures. Upstream half-cystines preferentially have downstream neighbours, whereas downstream half-cystines have mainly upstream neighbours. Non-disulphide bridged cysteines (free cysteines) have no preference for upstream or downstream neighbours. Free cysteines have more contacts to non-polar residues and fewer contacts to polar/charged residues than half-cystines, which correlates with our observation that free cysteines are more buried than half-cystines. Free cysteines prefer to be located in α-helices while no clear preference is observed for half-cystines. Histidine and methionine are preferentially seen near free cysteines. Tryptophan is found preferentially near half-cystines. We have merged sequential and spatial information, and highly interesting novel patterns have been discovered. The number of cysteines per protein is typically an even number, peaking at four. The number of residues separating two half-cystines is preferentially 11 and 16. Left-handed and right-handed disulphide bridges display different conformational parameters. Here we present side chain torsion angle information based on a 5–12 times larger number of disulphide bridges than has previously been published. Considering the importance of cysteines for maintaining the 3D-structural scaffold of proteins, it is essential to have as accurate information as possible concerning the packing and conformational preferences. The present work may provide key information for engineering the protein environment around cysteines.
Keywords: free cysteine and disulphide (disulfide) bridge conformation/packing/protein engineering/sequence and spatial contacts/solvent accessibility/thermostability
Introduction
Cysteine residues pair to form unique cross-links (disulphide bridges) in proteins. These bridges have an important role in stability, as suggested by experimental engineering of disulphides, and are very common in extracellular soluble globular proteins. This probably reflects the need for increased stability in such proteins in the extracellular environment (see, for example, Goldenberg, 1985; Pantoliano, 1987; Matsumara et al., 1989a).
The exponential growth in the number of experimentally determined 3D structures by X-ray diffraction and NMR makes it possible to acquire new relevant knowledge about 3D contacts and the preferred geometry adopted by the two half-cystines involved in the disulphide bond. The conclusions presented in this study are essential for the design of putative sites in a protein that might accommodate an extra disulphide bond in order to improve its thermostability. They are also highly relevant for the evaluation of the structural importance of a particular disulphide bond and its contribution to a protein's stability. The present work provides updated and more accurate information about the packing and stereochemical reference parameters with which the 3D environment and the geometry of a particular disulphide bond can be compared.
Principles for the distribution of cysteine residues as well as their possible involvement in disulphide bridges have been extracted with regard to their geometry and conformational energy maps, solvent accessibility, connectivity and preferred interactions with other residues (Perahia and Pullman, 1971; Richardson, 1981; Thornton, 1981; Reid et al., 1985; Mao, 1989; Richardson and Richardson, 1989; Muskal et al., 1990; Srinivasan et al., 1990; Fiser et al., 1992; Benham and Jafri, 1993; Harrison and Sternberg, 1994; Karlin et al., 1994; Bagley and Altman, 1995; Harrison and Sternberg, 1996; Pal and Chakrabarti, 1998).
Families of disulphide bridges have been grouped based on preferred values for side-chain torsion angles. The maximum number of protein structures analysed in previous extensive conformational analysis studies (which included the analysis of atom distances, bond angles and dihedral angles, and combined plots of particular geometrical parameters in order to find correlations and propose families of disulphide bridges) has been 22, containing 72 disulphide bridges (Srinivasan et al., 1990). In the present work, a comprehensive analysis of 351 disulphide bridges in 131 non-homologous single chain (monomeric) protein structures, sharing less than 35% mutual sequence identity and with an average resolution of 1.95 Å, is presented. Since the number of disulphide bridges in the protein subset used in this study was 5–12 times higher than that of earlier studies on disulphide bridges, new and more accurate data were obtained. In several instances our data did not conform with previously published data with respect to torsional values and correlations. The distribution of χ1 and χ'1, χ2 and χ'2 for the upstream and downstream cystines in left-handed and right-handed disulphide bridges was analysed. Novel correlations between these angles were identified, some of which contradict earlier reports based on substantially smaller data sets.
The type and number of residues spatially neighbouring each half-cystine involved in a disulphide bridge were analysed. We believe that the failure to obtain increased thermal stability by introducing new, engineered disulphide bridges can be partly explained by unfavourable spatial neighbour contacts as well as torsion angles. The solvent accessibility, secondary structural preferences and preferred spatial contacts for free cysteines were analysed. The spatial environment of half-cystines and cysteines described in this work is compared with their sequence environment reported previously. New data arise from our 3D neighbourhood analysis since residues close in space are not always close in sequence.
Materials and methods
The 3D packing and geometry around each disulphide bridge and the 3D packing around each free cysteine present in a subset of the Brookhaven Protein Databank published by Hobohm and Sander were analysed. The initial large subset from which we derived our final subset was the 35% threshold list (proteins sharing less than 35% mutual identity) consisting of 951 chains with 211 793 residues (April 1997). The list was incremental and built up on the 25% mutual identity threshold list, which consisted of 608 chains with 162 166 residues. From the 35% identity subset, we excluded the Cα-only files and the multiple chain protein PDB files. Only single chain (monomeric) proteins containing at least one disulphide bridge were included in our analysis. Among the NMR structures, only the PDB files in which a preferred model number was clearly stated were included. In total we analysed 16 NMR structures. The single chain subset consisted of 412 proteins (monomeric) with less than 35% mutual identity. Of these, 131 protein sequences contained at least one disulphide bridge. This was the final subset analysed. Approximately two-thirds of the files belonged to the 25% dataset.
The neighbours of each cysteine residue (whether or not involved in a disulphide bridge) are determined by calculating the distance between non-hydrogen atoms in the respective PDB files (Bernstein et al., 1977). Metal ligand cysteines are considered as free cysteines. For every cysteine the two upstream and the two downstream sequence neighbours are excluded from the spatial analysis. All residues in contact (that is, residues for which the distance between non-hydrogen atoms is shorter than a given cut-off) are reported with the solvent accessibility of the residues involved. We also compare the spatial distribution with the sequential distribution by isolating all residues within five residues, in sequence, from the cysteine residues. The solvent accessibilities are retrieved from the corresponding HSSP file (Sander and Schneider, 1991; Dodge et al., 1998) together with the secondary structure of the cysteine residue. A total of 351 disulphide bridges were analysed.
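The contact criterion described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `find_contacts`, the residue data layout and the default values are assumptions made for illustration. The 5 Å default follows the cut-off mentioned later in the Discussion, and `skip=2` mirrors the exclusion of the two upstream and two downstream sequence neighbours.

```python
import math


def find_contacts(residues, cys_index, cutoff=5.0, skip=2):
    """Return indices of residues in spatial contact with residue `cys_index`.

    residues: list of dicts in sequence order (hypothetical layout), each with
              "name" (str) and "atoms" (list of (x, y, z) tuples holding
              non-hydrogen atoms only).
    cutoff:   distance threshold in angstroms (assumed 5.0).
    skip:     number of sequence neighbours on each side to exclude.
    """
    cys_atoms = residues[cys_index]["atoms"]
    contacts = []
    for j, res in enumerate(residues):
        # Exclude the cysteine itself and its nearest sequence neighbours.
        if abs(j - cys_index) <= skip:
            continue
        # A residue is in contact if any heavy-atom pair is closer than cutoff.
        if any(math.dist(a, b) < cutoff for a in cys_atoms for b in res["atoms"]):
            contacts.append(j)
    return contacts
```

A residue is reported once even if several of its atoms fall within the cut-off, which matches a per-residue contact count; a per-atom count would need a different return value.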
We compared the secondary structural preference for cysteines and half-cystines. We grouped the secondary structure classes from the HSSP files in the following way: helices are defined by the classes G, I and H, strands by B and E, turns by T and bulges by S (Kabsch and Sander, 1983). The remaining residues are classified as having 'other' secondary structures.
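The grouping of secondary structure classes stated above amounts to a small lookup table. The following Python sketch shows one way to express it; the names `SS_GROUPS` and `classify_ss` are hypothetical, and the group labels simply repeat the wording used in the text.

```python
# Mapping from one-letter secondary structure class to the coarse groups
# described in the text (G, I, H -> helix; B, E -> strand; T -> turn; S -> bulge).
SS_GROUPS = {
    "G": "helix",
    "I": "helix",
    "H": "helix",
    "B": "strand",
    "E": "strand",
    "T": "turn",
    "S": "bulge",
}


def classify_ss(code):
    """Return the coarse group for a one-letter class; anything else is 'other'."""
    return SS_GROUPS.get(code.strip().upper(), "other")
```

Blank or unrecognised codes fall through to 'other', which corresponds to the residual category mentioned in the text.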
The geometry around 351 disulphide bridges was analysed using our program GETCYS, written in C, and the interdependence of the side chain torsion angles, as well as their correlation with the handedness of the disulphide bridge, was investigated in detail. The number of cysteine residues and disulphide bridges per protein (only the proteins containing at least one disulphide bond were evaluated) and the sequence distances between two half-cystines forming a disulphide bridge were investigated. We also studied the dependence of the average fraction of disulphide bridges and free cysteine residues on the chain length.
The average Cα–Cα and S–S distances were evaluated as well as the Cα–Cβ and Cβ–S distances for both upstream and downstream cystine residues. The bond angles Cα–Cβ–S and Cβ–S–S were evaluated for each cystine residue. The distribution of the dihedral angles χ1, χ2, χ3, χ'2 and χ'1 as defined below (Figure 18) was studied. The interdependence of χ1, χ'1, χ2 and χ'2 and their correlation with the handedness of the disulphide bridge was also unravelled.
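As an illustration of how such dihedral angles can be computed from atomic coordinates, here is a minimal Python sketch. It is not the GETCYS program. The atom choice for χ3 (Cβ, S, S', Cβ') follows the conventional definition and is an assumption here, since Figure 18 is not reproduced. The sign-based handedness rule is inferred from the Results, where left-handed bridges peak at negative χ3 and right-handed bridges at positive χ3.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def dihedral(p1, p2, p3, p4):
    """Signed dihedral angle in degrees, in the range (-180, 180]."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n2 = _cross(b2, b3)
    # atan2 form: y = |b2| * b1 . (b2 x b3), x = (b1 x b2) . (b2 x b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(_cross(b1, b2), n2)
    return math.degrees(math.atan2(y, x))


def handedness(chi3):
    """Classify a disulphide by the sign of chi3 (assumed convention)."""
    return "left-handed" if chi3 < 0 else "right-handed"
```

For a bridge between residues i and j, χ3 would be obtained as `dihedral(cb_i, s_i, s_j, cb_j)`, where the four arguments are hypothetical coordinate tuples for the Cβ and S atoms of the two half-cystines.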
The additional structures present in the 35% identity dataset and absent from the 25% identity dataset were not responsible for the smaller peaks observed in the χ1, χ'1, χ2 and χ'2 histograms (data not shown). These peaks were seen mostly in structures whose resolution was better than 2 Å. The former data set was selected in order to obtain a better signal to noise ratio.
The geometrical parameters found in this study were compared with the bond and angle parameters used by the forcefield AMBER (Weiner et al., 1984, 1986) for protein minimization, the bond and angle parameters used by XPLOR (Engh and Huber, 1991) for X-ray protein structure refinement, and the geometrical parameters assumed by PROCHECK (Laskowski et al., 1993). The geometrical parameters found in this study were also compared with the bond and angle parameters used by CHARMM (Brooks et al., 1983), a program for macromolecular energy, minimization and dynamics calculations.
A histogram of the resolution of the protein structures used in this study is shown in Figure 19.
The PDB ID of the files used in our analysis were as follows: 1hnf, 1amp, 1arb, 1arv, 1bec, 1bp2, 1btl, 1cfb, 1cid, 1cpo, 1crl, 1ctn, 1cus, 1cyg, 1ddt, 1dpe, 1dsn, 1erg, 1esc, 1esl, 1exg, 1fbr, 1gai, 1gen, 1gof, 1gpl, 1hbq, 1hc4, 1hfh, 1huw, 1hxn, 1iae, 1ilk, 1irl, 1ivd, 1jcv, 1jer, 1jpc, 1kcw, 1lbu, 1lit, 1lki, 1lzr, 1mal, 1mpp, 1mup, 1onc, 1pcn, 1pgs, 1poc, 1ptx, 1qba, 1rcb, 1rie, 1rpa, 1smd, 1sra, 1svb, 1svr, 1tca, 1thv, 1thx, 1tml, 1tpg, 1wba, 2alp, 2ayh, 2cas, 2ctc, 2eng, 2hft, 2hvm, 2lbp, 2mcm, 2prk, 2sas, 2sil, 2tgi, 3cd4, 3grs, 3pte, 3tgl, 6taa, 7rsa, 9pap, 9rnt, 1aba, 1ang, 1apa, 1apf, 1cbg, 1cbh, 1cbn, 1cc5, 1cd8, 1cpy, 1dtx, 1ehs, 1emn, 1ezm, 1fsc, 1gal, 1hoe, 1hpt, 1hyp, 1kte, 1mof, 1ntn, 1obr, 1oxy, 1pdc, 1pex, 1pi2, 1rlr, 1roo, 1rtu, 1sh1, 1ste, 1tib, 1tie, 1tin, 1ton, 1try, 1ukz, 1vjw, 2ace, 2pec, 2plh, 2sga, 2sn3, 7pti.
Results
Environment around cysteine residues
The analysis of the number of cysteine residues (whether or not involved in disulphide bridges) per protein molecule shows a distribution with a clear preference for an even number of cysteines, as seen in Figure 1, indicating the prevalence of disulphide formation. Four cysteines per protein is the most frequent value. In Figure 2, the histogram shows the number of disulphide bridges per protein. One disulphide bridge per protein molecule is the most common observation. The probability of a high number of disulphide bridges decays in an exponential fashion. The histogram shown in Figure 3 reveals that cysteines that are close together in sequence are more likely to form disulphide bridges. The most frequent separations between cysteine residues involved in disulphide bridges (half-cystines) are 11 and 16 residues. Interestingly, some separations between half-cystines are less frequent, e.g. 13 and 21 residues. In the data set analysed, only one disulphide bridge was formed by adjacent neighbour residues (1obr.pdb). Figure 4 shows the dependence of the average value of the fraction of disulphide bridges and fraction of free cysteine residues (not involved in disulphide bridges) on the chain length. Small proteins contain a larger fraction of disulphide bridges than large proteins. For proteins between 0 and 50 residues no free cysteine was found.
The cumulated solvent accessibility distribution shown in Figure 5
The nature of the contact, described by the type of contacting residue, is compared with the distribution of the corresponding residue in our dataset, as shown in Figure 8
The secondary structural preference for cysteines and half-cystines is presented in Figure 9
Atom distances and bond angles
In Table I we display the most common atom distances between the two Cα atoms (Cα–Cα) and the two sulphur atoms (S–S) of half-cystine residues involved in the same disulphide bridge. The bond distances Cβ–S for each cystine residue involved in a disulphide bridge and the bond angles Cα–Cβ–S and Cβ–S–S for each disulphide bridge are displayed. The correlation of each bond angle and atom distance with the handedness of the disulphide bond is also shown. The histograms showing the frequency of values assumed by each atom distance and each bond angle are displayed in Figures 10 and 11, respectively. The distribution of the distance between the sulphur atoms of each cysteine involved in a disulphide bridge is shown in Figure 10a. A single sharp peak is observed at 2.02 Å with a full width at half maximum of 0.04 Å. We observe no difference in bond length for left- and right-handed disulphide bonds (data not shown). In contrast, a clear dependence of the Cα–Cα distance on the handedness of the disulphide bond is seen (Figure 10b). The left-handed disulphide bonds present a single peak at 5.8 Å whereas the right-handed disulphide bonds display a less distinct distribution, where the most intense peak is located at 5.4 Å. This also shows that the Cα–Cα distance for the left-handed disulphide bridges is on average larger than the Cα–Cα distance for the right-handed disulphide bridges, where distances as short as 4 Å have been observed. The distances Si–Cβi (Figure 10c) and Sj–Cβj (Figure 10d) both show a single peak at 1.81 Å, with similar intensities, independently of the handedness of the bond.
In Figure 11a
Cβ–S distribution and its dependence on the value of χ3 are given in Figure 11b
In Table I we summarize our results concerning atom distances and angles and also report previously published values.
Dihedral angles' distribution and their interdependence
The preferred distribution of the dihedral angles χ1, χ2, χ3, χ'2 and χ'1 for disulphides found in the 351 disulphide bridges subset and their correlation with the handedness of the disulphide bond are given in Table II. Also displayed are the previous results found in the literature concerning dihedral angles.
The χ3 distribution in the 351 disulphide bridge subset indeed shows that there are two groups: the left-handed disulphides peaking at −80° with intensity 30 and the right-handed disulphides peaking at +100° with intensity 35 as shown in Figure 12
χ2 and χ'2 and that there is no correlation between χ3, χ2 and χ'2 for the right-handed disulphide bridges.
The distribution of combinations of the pair values (χ1, χ'1) and (χ2, χ'2) are shown in Figure 13
(χ1, χ'1) are observed when χ1 and χ'1 assume values between [−75, −60] (Figure 13a
(χ2, χ'2) are observed when χ2 assumes values in the interval [−75, −60] and χ'2 assumes values between [−85, −60] (Figure 13b
χ2 assumes values in the interval [65, 75] and χ'2 assumes values between [65, 75]. From these two figures it can be observed that the (χ2, χ'2) distribution seems to be more disperse than the (χ1, χ'1) distribution and there is a strong dependence on the handedness of the disulphide bond, as mentioned above.
χ1 and χ'1 distribution and correlation with the values assumed by χ3, χ2 and χ'2
Figures 14 and 15 show the histograms concerning the distribution of χ1 and χ'1 and their dependence on the handedness of the bond and on the values assumed by χ2 and χ'2. χ1 and χ'1 show a clear trimodal distribution for the right-handed disulphide bridges and a weaker one for the left-handed disulphide bridges. For the left-handed disulphide bridges χ1 (Figure 14) peaks at −60°, +60° and −170° (intensities 40, 5 and 8 respectively) and χ'1 (Figure 15) peaks at −60°, +60° and −170° (intensities 37, 4 and 11 respectively). For the right-handed disulphide bridges χ1 peaks at −60°, +60° and +180° (intensities 36, 11 and 14 respectively) and χ'1 peaks at −60°, +60° and −170° (intensities 40, 11 and 12 respectively).
The most intense peak at χ1 = −60° is correlated with χ2 and χ'2 both assuming negative values (between −180° and 0°), especially for the left-handed disulphide bridges. No particular combination of χ2 and χ'2 values is correlated with the χ1 peaks at +60° and (−170°/+180°).
For the left-handed disulphide bridges, the most intense χ'1 peak at −60° is correlated with χ2 and χ'2 both assuming negative values (Figure 15). For the right-handed disulphide bridges the χ'1 peak at −60° is correlated with χ2 and χ'2 both assuming negative values and with χ2 positive when χ'2 is negative (Figure 15). The second most intense peak, χ'1 = −170°, is correlated with χ2 and χ'2 both assuming positive values, as seen in Figure 15.
χ2 and χ'2 distribution and correlation with the values assumed by χ3, χ1 and χ'1
The χ2 distribution can be seen in Figure 16. For the left-handed disulphide bridges one major peak is observed at −60° and a minor peak at +180° (intensities 31 and 8 respectively). For the right-handed disulphide bridges four distinct peaks are observed at −60°, −90°, +60° and +95° (intensities 18, 14, 15 and 16 respectively).
For the left-handed disulphide bridges the most intense χ2 peak (−60°) is correlated with χ1 and χ'1 both assuming negative values, as seen in Figure 16
χ2 peaks (−60°, −90°) are mainly correlated with χ1 and χ'1 both being negative. Positive χ2 peaks seem mainly to be correlated with χ1 and χ'1 both being negative and with χ1 being positive when χ'1 is negative, as seen in Figure 16
The χ'2 distribution can be seen in Figure 17. For the left-handed disulphide bridges, two major peaks at −60° and −80° (intensities 29 and 22 respectively) and a minor peak at +180° (I = 10) are observed. For the right-handed disulphide bridges the broad distribution of χ'2 reflects a dispersion of allowed conformations: the most intense peak at +80° (I = 17) and a broad peak around −80° (Imax = 13 at −60°). Other less intense peaks can be observed in Figure 17. Only values between −40° and +40° seem to be disallowed or less frequent.
For the left-handed disulphide bridges the negative peaks observed for χ'2 (at −60° and −90°) seem to be correlated with χ1 and χ'1 both assuming negative values, as seen in Figure 17
χ'2 is mostly correlated with χ1 and χ'1 both being negative and with χ1 being positive when χ'1 is negative (Figure 17
χ1 and χ'1 values seems to correlate with the most intense peak of χ'2 at +80°.
The above mentioned interdependence of dihedral angles is summarized in Tables III and IV.
Discussion
The periodicity of two observed in the analysis of the number of cysteine residues per protein molecule indicates the prevalence of disulphide bridge formation over the existence of non-paired cysteine residues. The average percentage of free cysteines (non-disulphide bridged) in the data set analysed was found to be rather low (majority of values found between 0 and 0.2% of the total number of residues), as seen in Figure 4
A disulphide bridge is intrinsically a local phenomenon: the most common half-cystine separations are 11 and 16 residues (Figure 3). The importance of disulphide bridges also appears to be higher in small proteins than in large proteins. This observation is related to our finding that the fraction of disulphide bridges in small proteins is higher than in large molecules. The fraction of disulphide bonds was found to decay exponentially with sequence length, as seen in Figure 4. Most likely the role of the disulphide bridge in protein stabilization is more relevant in small proteins than in larger proteins. We propose that the high number of disulphide bonds in small proteins is necessary to compensate for the low number of hydrophobic contacts.
Most disulphide bridges have been reported to be largely inaccessible to solvent as a result of being buried (Thornton, 1981; Srinivasan et al., 1990). Our data shows that free cysteines are even more buried than half-cystines. The solvent accessibilities of the cystines have not been corrected (meaning that the maximum area of cysteine is in use), and because the half-cystine has a lower maximum, its solvent accessibility percentage would increase if corrected. Such a correction would further increase the difference between the free and bonded cysteines. All solvent accessibilities are calculated without heteroatoms (Kabsch and Sander, 1983). Therefore, the reason for the high number of buried free cysteines is not the presence of metal ions.
Our data shows that cysteines not involved in disulphide bridges seem to prefer classical secondary structural elements, especially helices, while no clear preference is seen for half-cystines. With a 5 Å cut-off, residues close in sequence may be excluded from the database. Our finding that free cysteines prefer an α-helical environment is thus consistent with an amphiphilic α-helix populated on the hydrophilic side with polar or charged residues and on the non-polar side with free cysteines as well as more regular non-polar residues. Our finding is contrary to previous reports that half-cystines and cysteines not involved in disulphide bridges are predominantly in coiled regions (Thornton, 1981). The secondary structure proportion as a function of primary sequence distance for disulphide- and non-disulphide-bonded cysteines has been analysed (Muskal et al., 1990). They showed that the sequences surrounding and including half-cystines seem to prefer the extended conformation of β-sheets over that of turns and bends, whereas those sequences containing non-disulphide-bonded cysteines show little, if any, secondary structure preference.
The number of contacts for different distances from the central residue (Figure 6) shows distinct peaks for different types of interactions. First, a sharp peak corresponding to the covalent S–S bond is evident in the cystines. Recall that all sequence neighbours are removed in this analysis and hence there are no covalent bonds between a free cysteine and another residue. Between 3 and 4 Å from the central residue a broad peak of interactions such as H-bonds is apparent. The shape of this peak is similar for the two types of cysteines. At even larger distances a significant difference exists between the two curves. It seems that the presence of a disulphide bond disrupts the packing of residues and gives rise to a more evenly distributed number of contacts. Our choice of cut-off at 5 Å in the later investigation was motivated by the inclusion of hydrogen bonded residues. We believe that the free cysteine peak around 5.2 Å arises from the second layer of residues around the cysteine; it was therefore not included in the further analysis.
The distance in sequence between the cysteine and its spatial neighbour (Figure 7) shows clear preferences. The free cysteines mostly have neighbours close in sequence, with no apparent preference for upstream or downstream neighbours. Upstream half-cystines most often have neighbours further down the sequence, whereas downstream half-cystines have upstream neighbours. This implies that disulphide bridges are mostly in contact with residues between the sequence locations of the two half-cystines, a fact consistent with the role of cystines as 'structure-keepers' (Figure 20). We have repeated this analysis for cysteine residues involved in disulphide bridges being less than 16 residues apart in the sequence and more than 16 residues apart (data not shown). This cut-off value was chosen because 16 residues is one of the most frequently observed distances between half-cystines involved in disulphide bridges. The trend is still the same: upstream cysteines mostly have contacts to downstream residues and vice versa. This leads us to believe that our observation is indeed a general feature of disulphide bridges and that it is not correlated with the sequential distance between the cysteine residues.
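The upstream/downstream bookkeeping behind this observation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, residues are identified by their zero-based position in the chain, and "downstream" is taken to mean a higher sequence position than the cysteine, following the wording "further down the sequence".

```python
from collections import Counter


def neighbour_offsets(cys_index, contact_indices):
    """Signed sequence offsets (contact position minus cysteine position)."""
    return [j - cys_index for j in contact_indices]


def direction_counts(cys_index, contact_indices):
    """Count spatial neighbours lying upstream or downstream in sequence."""
    return Counter(
        "downstream" if j > cys_index else "upstream"
        for j in contact_indices
        if j != cys_index
    )
```

Applied to the upstream half-cystine of a bridge, a predominance of "downstream" counts would correspond to the trend described above; a histogram of `neighbour_offsets` gives the kind of distribution that Figure 7 is described as showing.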
The type of residue involved in spatial contacts with cysteine residues (Figure 8b
New and relevant conclusions arise from the comparison of the sequential neighbourhood and the spatial contact analysis around cysteine residues (Figure 8a and b). While hydrophobic residues such as Val, Leu, Ile and Met are not seen in close sequence proximity around half-cystines, they are overrepresented in the spatial analysis. These residues are also more frequently seen in the spatial neighbourhood around half-cystines than in their sequential neighbourhood. The same is observed for the aromatic Phe and Tyr residues. Tryptophan does not show a preference for half-cystines in the sequential analysis but it is the most overrepresented amino acid in the spatial environment around disulphide bridges. In the sequential analysis Gly residues were seen frequently in the vicinity of both free cysteines and half-cystines but become underrepresented in their 3D space. Histidine shows a clear preference for free cysteines both in sequence and in space. Arg favours half-cystines both in sequence and in space. The charged residues Lys, Glu, Asp and the polar residue Asn are less observed in space than in sequence around cysteines, especially around free cysteines. This also matches our finding that free cysteines are more buried than half-cystines.
It is interesting to note the high number of histidine and methionine contacts to the free cysteines (Figure 8b). Histidine residues are, together with free cysteines, involved in metal binding. Methionine residues are also involved in metal binding, for example, in cytochromes, where a conserved methionine residue is coordinated through its highly polarizable sulphur atom to iron (Creighton, 1993). Since the sulphur atoms of free cysteine and of methionine residues are most susceptible to oxidation, it is likely that throughout molecular evolution the only locations that they could stably occupy were in the buried interior. The high number of contacts between half-cystines and tryptophan residues observed in our study (contacts mainly within 3 to 4 Å) conforms with a previous prediction that in proteins tryptophan–disulphide interactions are very localized in nature and should give rise to detectable anomalous phosphorescence decays (Li et al., 1989). It has been suggested that a sulphur atom can interact favourably with an aromatic ring, via S–π interaction (Morgan et al., 1978; Morgan and McAdon, 1980) and the geometry has been analysed (Reid et al., 1985; Pal and Chakrabarti, 1998). Sulphydryl cysteines, half-cystines and methionine sulphurs have been reported to interact closely with an aromatic ring (Reid et al., 1985; Klingler and Brutlag, 1994). The number of S–π interactions in folded globular proteins has been predicted based on their amino acid composition and it correlates positively with methionine and cysteine residues (Morgan and McAdon, 1980).
Our sequential neighbour analysis does not agree totally with previously reported sequential data. We do not observe the reported positive propensity of isoleucine for half-cystines (Muskal et al., 1990). From our data, isoleucine is 26.5% less represented around half-cystines than expected from the distribution of residues in our data set and more represented for free cysteines (+16.7%). Tyrosine is also reported to be highly conducive towards disulphide bond formation. This is not obvious from our data. We can only say that both for free cysteines and half-cystines we observe fewer tyrosine residues than expected in the sequence flanking cysteines. The sequential neighbour analysis by Muskal et al. (1990) also shows that residues contributing towards disulphide bonds are polar and/or charged and that residues disfavouring disulphide bonds are mainly hydrophobic. However, our data also shows that the polar residue histidine is highly present around free cysteines. Phe and Trp are reported to disfavour disulphide bond formation quite strongly. These results are not obvious from our data. In contrast, we find in our spatial neighbour analysis that Trp is the most overrepresented residue in the spatial environment of half-cystines. Glycine is also reported to be more abundant in the neighbourhood of half-cystines than of free cysteines, in agreement with the work of Fiser et al. (1992). From our sequential data glycine is equally seen flanking half-cystines and free cysteines. We used the same window size for the sequential analysis as Muskal et al. (1990). Our sequential neighbour data is in disagreement with the finding by Fiser et al. (1992) that charged residues occur preferentially in the vicinity of free cysteines.
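Percentages such as those quoted for isoleucine compare the observed share of a residue type among the neighbours with its share in the whole data set. The exact formula used in the paper is not stated in this text, so the Python sketch below is an assumption: it defines the relative representation as the percentage deviation of the observed fraction from the background fraction. The function name and arguments are hypothetical and the numbers in the usage note are invented toy values, not data from the study.

```python
def relative_representation(observed_count, total_neighbours, background_fraction):
    """Percentage over- or under-representation relative to the background.

    observed_count:      neighbours of the residue type of interest
    total_neighbours:    all neighbours counted around the cysteine class
    background_fraction: fraction of that residue type in the whole data set
    Returns a signed percentage; negative values mean under-representation.
    """
    if total_neighbours <= 0 or background_fraction <= 0:
        raise ValueError("counts and background fraction must be positive")
    observed_fraction = observed_count / total_neighbours
    return 100.0 * (observed_fraction / background_fraction - 1.0)
```

With toy values, `relative_representation(30, 1000, 0.05)` gives about −40, that is, a residue type found 40% less often than its background frequency would predict.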
Our results do not agree totally with previous reports on spatial neighbours. When comparing the environment around disulphide-bonded cysteines with the environment around non-disulphide bridged cysteines using a 10 Å cut-off from the sulphur atom of every cysteine, Bagley and Altman (1995) report an abundance of tyrosines in the disulphide environment. Our spatial data shows an equal abundance of tyrosine residues around free cysteines and half-cystines. The work of Bagley and Altman (1995) also concludes that the environment of disulphide bridges has a higher number of contacts to polar residues than the environment of non-disulphide bridged cysteines. This finding is in agreement with the present work. These authors do not see glycine as a frequent residue in the 10 Å cut-off neighbourhood of non-disulphide cysteines, in contrast to the reports by Fiser et al. (1992). This finding is also in agreement with our work, where the number of glycine residues is lower than expected both for free cysteines and half-cystines (Figure 8b). The removal of the two closest sequence neighbours before the calculation of the spatial neighbour preferences did not affect our conclusions.
Our finding that χ3 peaks at +100° for the right-handed disulphide bonds agrees with molecular orbital calculations for χ3 by Perahia and Pullman (1971) where an energy minimum was found at +100°.
For right-handed disulphide bridges, the values assumed by χ1 and χ'1 seem to be much more constrained than the values assumed by χ2 and χ'2, where major positive and negative peaks were observed. The observation that χ2 and χ'2 assume a broader range of values when χ3 is positive than when χ3 is negative might reflect a mechanism of minimizing steric constraints around the right-handed disulphide bond.
The same conformational analysis (analysis of bond angles, atom distances and the dihedral angles' distribution) was carried out on the proteins belonging exclusively to the 25% mutual identity subset (data not shown). No new peaks or other features appeared in the 35% dataset compared with the 25% dataset, aside from an increase in the apparent signal to noise ratio in the 35% dataset. In order to analyse the influence of the resolution of the protein structures on the obtained results, we analysed all geometrical parameters for the proteins with resolution less than or equal to 2 Å and for the proteins with resolution above 2 Å (data not shown). Either side of the 2 Å resolution cut-off leads to the same conclusion: no new features emerge that cannot be accounted for by the loss of signal to noise ratio in the poor resolution dataset (about one-third of the proteins have resolution above 2 Å and two-thirds have resolution less than or equal to 2 Å). NMR structures were not included in this analysis since no comparable resolution figure is provided.
A comparison of the geometrical parameters of disulphide bridges found in our study with previously published data is not straightforward. Primarily this is due to a surprising lack of information about how the earlier data were generated. However, a summary of our data and previously published results is presented in Tables I and II.
The geo
































