PEDS Advance Access originally published online on December 2, 2004
Protein Engineering Design and Selection 2004 17(11):795-808; doi:10.1093/protein/gzh093
Disulfide bonds, their stereospecific environment and conservation in protein structures
Department of Biochemistry, Bose Institute, P-1/12 CIT Scheme VIIM, Calcutta 700 054, India
2 To whom correspondence should be addressed. E-mail: pinak{at}boseinst.ernet.in or pinak_chak{at}yahoo.co.in
Abstract
We studied the specificity of the non-bonded interaction in the environment of 572 disulfide bonds in 247 polypeptide chains selected from the Protein Data Bank. The preferred geometry of interaction of peptide oxygen atoms is along the back of the two covalent bonds at the sulfur atom of half cystine. With aromatic residues, the geometries that direct one of the sulfur lone pairs of electrons into the aromatic π-system are avoided; an orientation in which the sulfide plane is normal or inclined to the aromatic plane and on top of its edge is normally preferred. The importance of the S···aromatic interaction is manifested in the high degree of its conservation across members of homologous protein families. These interactions, while providing extra overall stability to the native fold and reducing the accessibility of the disulfide bond, thereby preventing exchange reactions, also set the orientation of the conserved aromatic rings for further interactions and binding to another molecule. The conformational features and the mode of interactions of disulfide bridges should be useful for molecular design and protein engineering experiments.
Keywords: conservation of interaction/disulfide bond/protein stability/S···aromatic interaction/S···O interaction
Introduction
The specificity of folding is provided by non-bonded interactions, such as hydrogen bonding. Increasingly, other specific interactions, such as aromatic–aromatic, X–H···π (X = N, O or C), C–H···O, etc., are being recognized (Singh and Thornton, 1985).
The existence of covalent bonds in the form of disulfide linkages contributes to the stability and is especially important for extracellular proteins (Thornton, 1981). There have been analyses of residues around half cystines in protein sequences and methods to identify them (Fiser et al., 1992; van Vlijmen et al., 2004). The cross-linking of the polypeptide chain has been used to engineer additional conformational stability into proteins by site-directed mutagenesis (Matsumura et al., 1989; Clarke and Fersht, 1993; Mansfeld et al., 1997). However, the success of this strategy has not been universal (Mitchinson and Wells, 1989; Betz, 1993; van den Burg et al., 1993; Hinck et al., 1996), suggesting that some of the engineered bonds may have a strained conformation (Katz and Kossiakoff, 1986) or that the specific, but as yet not fully understood, environment of the bonds in wild-type proteins has an associated stabilizing role, which may be lacking at the engineered sites. The contribution of the environment in the form of the interaction of Cys with hydrophobic residues has been suggested to give rise to an intermediate with a single disulfide bond in the early stages of folding of BPTI (Dadlez, 1997), which contains a total of three such bonds (Darby and Creighton, 1993). A highly conserved disulfide–Trp interaction has been observed in the immunoglobulin fold (Ioerger et al., 1999).
The thiol group in Cys and the sulfide group in Met are not very adept at forming hydrogen bonds (Ippolito et al., 1990; Gregoret et al., 1991; Allen et al., 1997), but sulfur can still partake in a number of non-bonded interactions, such as that involving aromatic residues (Desiraju and Steiner, 1999; Meyer et al., 2003). In particular, Rosenfield et al. (1977) have observed, from an analysis of the environment of a divalent S atom (Y–S–Z) in organic and inorganic crystals, that a nucleophilic O atom tends to approach the S atom from the backside of the S–Y and S–Z bonds (i.e. along an antibonding orbital) to make the S···O interaction a stabilizing one. In spite of the presence of myriad other non-covalent interactions in proteins, the directional preference of the S···O interaction was essentially maintained in the environment of the S atom of Met (Pal and Chakrabarti, 2001; Iwaoka et al., 2002a). Moreover, the interaction has also been shown to regulate enzymatic function; for example, there exists an S···O interaction between the sulfur atom of S-adenosylmethionine and the carboxylate group of Asp118 in S-adenosylmethionine synthetase (Taylor and Markham, 1999). The concept of electrophile–nucleophile interaction was extended to include the π-electron system of aromatic rings acting as nucleophile and, again, there was marked directionality (Pal and Chakrabarti, 2001). In this paper, we address the question of whether a similar geometric relationship is retained in the interaction of the disulfide group with carbonyl oxygen atoms (of which we consider only the main-chain ones) and the aromatic rings. The importance of an interaction, especially when it involves a side chain, can also be discerned by finding out the degree of conservation of the participating residues during evolution, something we also studied. Although the general characteristics of disulfide bonds have been enumerated in numerous studies (Richardson, 1981; Thornton, 1981; Srinivasan et al., 1990; Morris et al., 1992; Petersen et al., 1999), such features need to be constantly refined using larger databases. Therefore, while addressing our primary concern of the environment of the disulfide moiety, we also analyzed the conformations and the primary and secondary structural features of the half cystines making up the disulfide bond and in this process identified some folding patterns.
Materials and methods
Atomic coordinates were obtained from the Protein Data Bank (PDB) at the Research Collaboratory for Structural Bioinformatics (RCSB) (Berman et al., 2000). [...] ≤20% and resolution of ≤2.0 Å and sequence identity <25%. Of these, 247 chains contain one or more disulfide bonds. Only those Cys and aromatic residues and carbonyl groups were considered for which the fractional occupancies and temperature (or B) factors (of the Sγ atom of Cys, all ring atoms of aromatic residues and carbonyl oxygen atoms) were 1.00 and ≤30 Å², respectively. Disulfide bonds were identified using a cut-off distance of 2.3 Å between the Sγ atoms of two Cys residues. In PDB files, disulfide bonds are also specified with the SSBOND record. However, in five proteins [1cru_A: 338–345 (distance 2.89 Å), 1fjs_L: 89–100 (2.35 Å), 1p5u_A: 98–137 (2.31 Å), 1ubk_L: 84–549 (2.87 Å), 2sic_I: 35–50 (2.46 Å)] the specified disulfide bonds have a larger Sγ–Sγ distance. These disulfide bonds were excluded from our analysis.
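The distance criterion for identifying disulfide bonds can be illustrated with a minimal Python sketch. The function name, the residue labels and the coordinates are hypothetical, and treating the 2.3 Å cut-off as inclusive is an assumption; this is not the authors' program.

```python
import math
from itertools import combinations

SS_CUTOFF = 2.3  # Å, Sγ–Sγ cut-off quoted in the text (inclusive bound assumed)


def find_disulfides(sg_atoms, cutoff=SS_CUTOFF):
    """Return (label_a, label_b, distance) for Sγ pairs within `cutoff` Å.

    `sg_atoms` maps a hypothetical residue label to the (x, y, z)
    coordinates of its Sγ atom.
    """
    bonds = []
    for (res_a, xyz_a), (res_b, xyz_b) in combinations(sorted(sg_atoms.items()), 2):
        d = math.dist(xyz_a, xyz_b)
        if d <= cutoff:
            bonds.append((res_a, res_b, round(d, 2)))
    return bonds


# Made-up coordinates, for illustration only
example = {
    "A:Cys5": (0.0, 0.0, 0.0),
    "A:Cys55": (2.03, 0.0, 0.0),
    "A:Cys30": (10.0, 0.0, 0.0),
    "A:Cys51": (10.0, 2.89, 0.0),  # too far apart, like the excluded SSBOND pairs
}
bonds = find_disulfides(example)
```

With these toy coordinates only the first pair is reported; the second pair, at 2.89 Å, is rejected in the same way as the five SSBOND records listed above.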
To ascertain the relative importance of the non-bonded contacts involving the sulfur atom with different kinds of atoms (aromatic and aliphatic carbon atoms, main-chain carbonyl oxygen atoms and all types of side-chain oxygen atoms), a density function, ρ, was determined for each type of contact at distances r = 3.0, 3.1, ..., 5.5 Å. A shell of width 0.1 Å was assumed at each distance and the number of atoms of a given type in it was found (for a residue in contact, even if more than one atom satisfied the criterion, only one was accepted). If the outer radius r2 had N2 occurrences and the inner radius r1 had N1, then
$$\rho = \frac{N_2 - N_1}{\tfrac{4}{3}\pi\left(r_2^{3} - r_1^{3}\right)}$$
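As an illustration of the shell-density idea, the sketch below assumes that the density is the number of contacts falling in a shell divided by the shell volume, with the shell taken as [r, r + 0.1 Å). The function name, the placement of the shell relative to r and the input distances are assumptions made for this example.

```python
import math


def shell_density(distances, r_inner, width=0.1):
    """Contacts per Å^3 in the shell [r_inner, r_inner + width).

    N1 and N2 are cumulative counts within the inner and outer radii;
    the shell volume is (4/3)*pi*(r2^3 - r1^3).
    `distances` is assumed to hold one S...atom distance per residue.
    """
    r_outer = r_inner + width
    n1 = sum(1 for d in distances if d < r_inner)
    n2 = sum(1 for d in distances if d < r_outer)
    volume = (4.0 / 3.0) * math.pi * (r_outer**3 - r_inner**3)
    return (n2 - n1) / volume


# Made-up distances (Å), for illustration only
distances = [3.65, 3.72, 3.74, 4.9]
radii = [3.0 + 0.1 * i for i in range(26)]  # 3.0, 3.1, ..., 5.5 Å
profile = [(round(r, 1), shell_density(distances, r)) for r in radii]
```

Dividing by the shell volume removes the trivial growth in the number of neighbors with distance, so that peaks and troughs in the profile reflect genuine preferences.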
For the calculation of the geometry, the centroids of aromatic residues were first determined (for His, Phe and Tyr, the center of mass of the five- or six-membered ring; for Trp, the mid-point of the CD2 and CE2 atoms). A molecular axial system was defined with the origin at the centroid of the aromatic residue and the z-axis along the normal to the aromatic plane. The interplanar angle, P, and the angle θ made between the z-axis and the line joining the centroid of the aromatic residue to Sγ were computed (Figure 4a). For each interaction the geometry was placed in one of the elements in a 3 × 3 grid (each element spanning a range of 30° along P and θ) (Figure 4b). Each relative orientation is designated by a two-letter code (fp, ot, en, etc.). The first letter indicates if the aromatic residue is interacting with its face (f) or the Sγ of half cystine is near its edge (e) or located in an intermediate (offset or o) position. The second letter denotes if the sulfide plane (passing through Cß–Sγ–Sγ') is normal (n) or parallel (p) to the aromatic ring or has a tilted (t) orientation; these labels are slightly different from those in the earlier studies where both the interacting residues were aromatic (Samanta et al., 1999; Bhattacharyya et al., 2002, 2003). If Oij is the observed frequency of occurrence in the grid element corresponding to the ith row and jth column (i and j varying from 1 to 3), the corresponding expected value (Eij) can be calculated as the product of the sum of the observed numbers in the elements in the ith row and that in the jth column, divided by the total number of observations in all nine grid elements (Samanta et al., 1999).
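The expected value for each grid element, as described above, is the row total times the column total divided by the grand total. A minimal sketch follows; the function name and the counts are hypothetical and are not data from this study.

```python
def expected_counts(observed):
    """E_ij = (sum of row i) * (sum of column j) / (grand total)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


# Hypothetical 3 x 3 grid of observed counts
observed = [
    [2, 4, 4],
    [6, 10, 4],
    [2, 6, 12],
]
expected = expected_counts(observed)
```

Grid elements in which the observed count clearly exceeds the expected one (here the last element, 12 against 8) are the candidates for preferred geometries.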
To see the directionality of aromatic centroid or carbonyl oxygen atoms (X) relative to the sulfide plane, two additional parameters, θ1 and φ, were used (Figure 4c); θ1 is the polar angle between the normal to the sulfide plane and the Sγ···X vector (if θ1 > 90°, θ1 is made equal to 180° − θ1, so that contacts above or below the plane are assumed to be equivalent; i.e., 0° ≤ θ1 ≤ 90°). φ is the azimuthal angle between the extension of the bisector of the Cß–Sγ–Sγ' angle and the projection of X in the disulfide plane.
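The two angular parameters can be sketched in Python as follows. This is only an illustration under stated assumptions: the function and helper names are hypothetical, and the sign of the azimuthal angle (taken here as positive towards the side of the Cß atom) is an assumption, since the convention is defined in a figure that is not reproduced in the text.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def unit(a):
    n = math.sqrt(dot(a, a))
    return tuple(x / n for x in a)


def polar_azimuth(cb, sg, sg_prime, x):
    """Return (polar, azimuth) in degrees for atom X around the Sγ atom.

    polar: angle between the sulfide-plane normal and Sγ->X, folded to 0-90°.
    azimuth: in-plane angle from the extension of the bisector of the
    Cß-Sγ-Sγ' angle; positive towards the Cß side (assumed convention).
    """
    to_cb = unit(sub(cb, sg))
    to_sp = unit(sub(sg_prime, sg))
    normal = unit(cross(to_cb, to_sp))
    # The extension of the bisector points away from both bonded atoms.
    ext = unit(tuple(-(p + q) for p, q in zip(to_cb, to_sp)))
    v = unit(sub(x, sg))
    polar = math.degrees(math.acos(max(-1.0, min(1.0, dot(normal, v)))))
    if polar > 90.0:
        polar = 180.0 - polar  # above and below the plane are equivalent
    proj = tuple(vi - dot(v, normal) * ni for vi, ni in zip(v, normal))
    in_plane_axis = cross(normal, ext)  # in-plane, perpendicular to ext
    azimuth = math.degrees(math.atan2(dot(proj, in_plane_axis), dot(proj, ext)))
    return polar, azimuth


# Toy geometry with a 90° bond angle at Sγ, for illustration only
behind_ss = polar_azimuth((-1, 1, 0), (0, 0, 0), (-1, -1, 0), (1, 1, 0))
above = polar_azimuth((-1, 1, 0), (0, 0, 0), (-1, -1, 0), (1, 0, 1))
```

In this toy geometry an atom lying in the plane directly behind the Sγ–Sγ' bond gives a polar angle of 90° and an azimuth of 45°, whereas an atom raised 45° out of the plane along the bisector extension gives 45° and 0°.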
The secondary structural elements were determined using the algorithm DSSP (Kabsch and Sander, 1983). Unless mentioned otherwise, we grouped all helix types (H, G, I) as H, ß-strands (E, B) as E and turns (S, T) as T; C corresponds to non-regular structure. The solvent-accessible surface areas (ASA) of the Cys residues were computed with the program NACCESS (Hubbard, 1992), which implements the algorithm of Lee and Richards (1971). The relative accessibility of a Cys residue is the percentage ASA of the residue in the structure as compared with its ASA in an extended Ala–Cys–Ala tripeptide.
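The grouping of DSSP states and the definition of relative accessibility can be written compactly. In this sketch the function names are hypothetical and the reference ASA of the extended tripeptide is left as a parameter, because its value is not given in the text.

```python
# Reduction of DSSP states as described in the text; anything else becomes C
DSSP_GROUPS = {"H": "H", "G": "H", "I": "H",
               "E": "E", "B": "E",
               "S": "T", "T": "T"}


def reduce_dssp(code):
    """Map a one-letter DSSP state to H, E, T or C (non-regular)."""
    return DSSP_GROUPS.get(code, "C")


def relative_accessibility(asa, asa_reference):
    """Percentage ASA relative to the residue in an extended tripeptide."""
    return 100.0 * asa / asa_reference


reduced = "".join(reduce_dssp(c) for c in "HGIEBST ")
```

The string `reduced` shows the four-state alphabet obtained from the eight DSSP symbols (a blank DSSP state is treated as C).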
The degree of conservation of aromatic residues in a protein was found based on the amino acid usage at the same position in all homologous protein families, as delineated in the HOMSTRAD structural alignment database (Mizuguchi et al., 1998; Stebbings and Mizuguchi, 2004). Molecular diagrams were generated using MOLSCRIPT (Kraulis, 1991) and the scatterplot for the distribution of O atoms around S was drawn using the software IsoGen from the Cambridge Structural Database (Allen, 2002). In the text a PDB file is mentioned as the four-lettered code (in lowercase), with the chain identifier, if present, appended (in uppercase). The disulfide-linked Cys residues are designated Cys1 and Cys2 based on the sequential order.
Results
In 247 polypeptide chains, 572 disulfide bonds are present, of which 556 are intra-chain and 16 are inter-chain. If one selects the cysteines and half cystines after applying the temperature factor cut-off, the total number of Cys residues (free and half cystine) in 247 chains is 1269, which indicates that 90% of the Cys in these chains participate in disulfide linkages; hence free Cys residues are fairly rare in proteins containing disulfide bonds.
Of the 16 cases (in 11 PDB files) in which the disulfide bond links two chains, in five the molecule is a heterooligomer, and except in three proteins [myeloperoxidase (length = 466 residues), renal dipeptidase (369) and serine carboxypeptidase (255)], the chain length is <100 residues.
A total of 305 Cys (27% of half cystines) from 222 disulfide bonds (i.e. 39% of the total) are involved in 357 Cys···Arom (Arom = aromatic residue) interactions. If we calculate the ratio of the observed to the expected number of Cys···Arom interactions (Bhattacharyya et al., 2003), the highest value, 1.5, is observed for Trp whereas for the other three aromatic residues, the values are 1.2 (Phe), 0.92 (Tyr) and 0.46 (His). Sixty-six aromatic residues [Phe (34), Tyr (16), Trp (9) and His (7)] interact with both of the half cystines.
A larger number (899) of Cys (79% of half cystines) are found to be involved in 1307 S···O (carbonyl) interactions, of which 1296 are intra-chain and 11 inter-chain. Out of 572 disulfide bonds, 538 (94%) are in interaction with carbonyl oxygens; 204 oxygen atoms are found to interact with both of the half cystines.
Sequence and structural features of disulfide bonds
In Figure 1, the number of intra-chain disulfide bonds per 100-residue length of polypeptide chain is plotted against the chain length. With the data normalized in this way, a power function can be fitted. This number is high in small proteins and reaches a plateau when the chain is about 200 residues long. Consideration of the relative accessibility of half cystines indicates that 50% of them are completely buried, with relative accessibilities <5% (data not shown).
The distribution of the sequence difference between the half cystines (Δ) for intra-chain disulfides is presented in Table I. If the interactions with Δ ≤ 5 are considered local, only 7.9% of disulfide bonds are formed locally. Interestingly, there is no cystine with Δ = 2. Some 49% of disulfide bonds are formed at higher values (Δ > 25). Hence disulfide bonds stabilize the three-dimensional structure by holding together distant regions of the chains.
A number of groups (e.g. Richardson, 1981) have examined the χ1–χ3 conformations of disulfide bonds. As is known, the distribution of the χ3 torsion angle is not symmetric (Figure 2a), with clustering around the left-handed (χ3 = −86.4 ± 8.5°) and right-handed conformers (χ3 = 95.0 ± 10.3°). Whether the two half cystines are local or remote in sequence does not seem to have any effect on the sign of χ3 (Table I). When the side-chain conformation angles (χ1 and χ2) of the half cystines are plotted, separated into two ranges in χ3 and with the secondary structural elements of the residues marked, one can see some interesting trends (Figure 2b and c). The most populated conformation has a negative value of χ3 and those for χ1 and χ2 around −60° for all the secondary structural states of the two residues. Between χ1 and χ2, the former is more restricted to a value around −60° (the g+ conformation) (Janin et al., 1978). Whereas χ1 is centered on the canonical values of ±60 and 180°, χ2 shows considerable deviation. Considering Figure 2c, the average χ2 values for the two half cystines taken together in the ±60° regions are +91(33)° and −77(18)° [in Figure 2b, the most populated region has a χ2 average of −71(17)°]. It is interesting that the preferred values of the χ2 torsion are very similar to the corresponding angle (Cα–Cß–Sγ–M) involving the cation (M) in metal-bound Cys residues, where angles are observed around ±90 and 180° (Chakrabarti, 1989). The ratio of the number of cases with a negative χ3 value to that with a positive value is 1.41 when the secondary structure is helical (crosses), but 0.83 when it is ß-strand (triangles). Another finding is that greater numbers of half cystines with χ2 around 60° are observed (black symbols in Figure 2b and c) when χ3 assumes a value that is on the outside edge of the distribution; this is especially noticeable in Figure 2b, indicating the possibility that such disulfide bonds have some inherent strain. The combination of the secondary structural elements for the half cystines is presented in Table II. This shows that one of the elements being a ß-strand is fairly common. In 62 out of 68 cases of EE type, both the half cystines are in ß-strands, which are mostly antiparallel. Also, in 58% of these cases the disulfide bond links two strands of a single ß-sheet, whereas in the others it connects two ß-sheets. When the disulfide bond links two helices, the latter are antiparallel in 42% of cases, parallel in 16% and have intermediate orientations in the remaining cases. The highest number of observations is found not linking two regular secondary structures, but connecting a strand with a non-regular region. However, cases with two Cys residues with non-regular structure being linked by disulfide bonds are rare.
Distribution of aromatic residues and carbonyl oxygen atoms around half cystines
Figure 3 displays the density (defined in Materials and methods) of some atom types at different distances from Sγ of half cystine. The distribution for Cys···Arom interactions shows a peak at ~3.7 Å, which falls to a minimum value at ~4.3 Å, beyond which it rises slightly to reach a plateau. Similar behavior has been seen in the interaction between aromatic rings (McGaughey et al., 1998) and can be interpreted as the manifestation of a binding interaction between S and aromatic residues inside the minimum in the distribution, beyond which any direct interaction is lost because of random thermal motion. Hence within a limiting distance of 4.3 Å, any preferential orientation between the interacting groups would be revealed. In contrast, the S···C(aliphatic) interaction does not show any sharp peak or trough, suggesting the absence of any specific interaction between the groups. However, S···O(main-chain) interactions have features akin to Cys···Arom interactions and we have used a cut-off distance of 4.0 Å for identifying specific interactions of this type and then delineating their geometric features. The distribution involving the side-chain oxygen atoms, on the other hand, does not have any peak, but only a plateau beyond 3.5 Å, indicating that the higher thermal parameters of the side-chain atoms would mask any directional features of the S···O interaction. Within the cut-off distance used, the average distances of the centroids of different aromatic residues from Sγ are His, 4.5 (±0.5) Å; Phe, 4.8 (0.5) Å; Tyr, 4.8 (0.5) Å; Trp, 5.2 (0.8) Å. These distances are shorter than 6.0 Å, observed by Reid et al. (1985) to be the optimum distance between the centroid of aromatic residues and Sγ of Cys. A half cystine can have more than one aromatic residue or carbonyl oxygen atom in contact. Of the 222 cystines involved in S···Arom interactions, 129 have one such contact, 59 two, 26 three and 8 four; the equivalent numbers for S···O interactions are 117 one contact, 195 two, 139 three, 58 four, 23 five and 6 six.
Geometry of Cys···Arom interactions relative to the aromatic ring
Two parameters, P, the interplanar angle, and θ, the angular displacement of the Sγ atom relative to the aromatic ring (Figure 4a), have been used to study the geometry of interaction between half cystine and the aromatic residue. For visualization, the values (in the range 0–90°) have been grouped in bins of size 30° along the two variables, resulting in nine grid elements, and each geometry is designated by a two-letter tag (Figure 4b) as given in Materials and methods. The observed and expected numbers of occurrences in all the grid elements, and the elements in which the two values differ significantly, are shown in Figure 5. There are some minor differences in the preferred relative orientations depending on the type of aromatic residue. Tyr and Trp interact with half cystine in similar orientations, with et and op geometries having more than the expected number of observations. Instead of et, Phe shows a preference for the adjacent element en. Of the aromatic residues, the number of interactions is the highest (154) with Phe; with His the number is rather small (33) and it is not possible to make any definite statement on the preferred geometry. Examples of some geometric orientations are shown in Figure 6.
Geometry of Cys···Arom interactions relative to the disulfide group
A number of parameters have been calculated to visualize the interacting group from the perspective of the cystine moiety. One of them is the angle, Cß–Sγ···Arcen or Sγ'–Sγ···Arcen, shown in Figure 4c. Whichever is larger is used to draw the histogram, Figure 7a. The highest number of occurrences is observed in the range 121–150° for both the angles, suggesting that the aromatic centroid is positioned nearly at the rear of the Cß–Sγ or Sγ'–Sγ bond. A scatter diagram of θ1 vs φ (Figure 4c), presented in Figure 7b, indicates the same feature. More points are closer to the disulfide plane (θ1 in the range 45–90°) than perpendicular to it (θ1: 0–45°) and φ values span the range between −60 and 60°.
Sequence difference and secondary structural features of Cys···Arom pair
The sequence difference (Δ) between the residues involved in intra-chain Cys···Arom interactions is given in Table III. Only 24% of total interactions, those with |Δ| ≤ 5, may be termed local interactions. In 76% of interactions, the aromatic residue is more than five residues away from the half cystine, suggesting that Cys···Arom interactions contribute to the stability of the tertiary structure. Among the local interactions, separations of four residues, two residues (in either direction along the chain) and one residue are observed in higher numbers than the rest, but a one-residue separation is found in one direction only.
Some 47% of the interacting Cys···Arom pairs belong to three secondary structural motifs, HH, EE and CE (in the last category, C corresponds to Arom). Some representative examples, along with the relative orientation of the residues, are given in Figure 8. When both the half cystine and the interacting aromatic residue are in the same helix, the two residues are invariably separated by one or four positions in one direction along the chain (24 cases) and never in the reverse direction (Figure 8a and b). This indicates the stereospecificity of the interaction between the half cystine and the aromatic residue, which is not possible in the reverse order.
The local interactions observed in a ß-strand involve an aromatic side chain two residues preceding the half cystine (seven cases; the reverse order is found in four cases only) (Figure 8c). In 52 cases (81% of the EE motifs), the half cystine and the interacting aromatic residues are in two different ß-strands; of these, in 67% of cases they are located in two adjacent strands of an antiparallel ß-sheet (Figure 8d) (in the rest, the strands belong to two different ß-sheets).
S···O interactions and geometric features
The average S···O distance involving the main-chain carbonyl oxygen atoms is 3.6(2) Å. As in Cys···Arom interactions (Figure 7a and b), the geometry of S(Cys)···O interactions has also been characterized using angular parameters (Figure 7c and d). Of the two angles in Figure 7c, Sγ'–Sγ···O occurs in a greater number of cases (883) than Cß–Sγ···O (423), indicating, as has also been noted earlier (Iwaoka et al., 2002a), that the interaction along the backside of the S–S bond may be stronger than that along the rear of the S–C bond. However, the peak of the Cß–Sγ···O angle is at 180°, indicating a more linear interaction than the Sγ'–Sγ···O interaction, which peaks at ~150°. This can also be inferred from Figure 7d, in which the angles θ1 and φ (Figure 4c) are plotted. Although there are more points with negative values of φ, these are also scattered more. Overall, however, compared with Cys···Arom contacts, S(Cys)···O interactions are more numerous, with distinct clustering. For example, points are distributed in the bands with φ ≈ ±60°, with points avoiding the region around φ ≈ 0°. This can be clearly seen in Figure 9, which shows no density along a plane perpendicular to and bisecting the sulfide plane.
Sequence difference and the secondary structural features of S···O interactions
The sequence difference (Δ) of intra-chain S···O interactions is given in Table IV. In 25% of cases the half cystines are interacting with their own carbonyl oxygen atom. Excluding these, 48% of the remaining interactions are with |Δ| ≤ 5 and 52% with |Δ| > 5, showing that the S···O interactions can be local, as well as long range (Figure 10).
Two structural motifs involving S···O interactions are conspicuous. One is the interaction between sulfur and the carbonyl oxygen present at four residues before the half cystine within an α-helix (Figure 10a); 64% of the interactions with this sequence separation occur in α-helices. The second is the interaction between half cystine and the carbonyl group of the preceding residue in a ß-strand (Figure 10b); 51% of the interactions with the preceding residue are of this type. There are 82 examples of S···O interactions occurring between two antiparallel ß-strands. Of these, in 45 cases the two strands belong to the same ß-sheet (Figure 10c). Of the interactions involving antiparallel ß-strands, in 35% of cases the half cystine occupies the first or the last position of a strand and 43% of the interacting carbonyl groups are also found at these two positions. In 10% of cases of long-range interactions, the residues involved do not possess any regular secondary structure.
Conservation of aromatic residues in contact with disulfide moieties
Abkevich and Shakhnovich (2000) observed a strong correlation of cystine content with polar residue content in proteins, indicating the possibility that certain amino acid classes may influence the folding kinetics and stability of disulfides, although how the effect is exerted is unclear. The pronounced directionality observed in Cys···Arom interactions suggests that, just like the location of two half cystines, the presence of a neighboring aromatic residue at an optimum orientation may also be the hallmark of the existence of a disulfide bond in a protein family. Using the HOMSTRAD database (Mizuguchi et al., 1998; Stebbings and Mizuguchi, 2004), which has the structure-based alignments of all homologous protein families, one can study the conservation of Cys···Arom interactions in aligned sequences. The families corresponding to our PDB files showing Cys···Arom interactions were considered only if they had at least four members; 52 PDB files satisfied this condition (Table V). Of these, 41 disulfide bonds are fully conserved across the families. Among these conserved moieties, 22 cystines are found to interact with 26 conserved (the criterion used being a degree of conservation >75%) aromatic residues, i.e. ~50% of the conserved cystines have at least one conserved aromatic residue in specific contact. The breakdown of the conserved aromatic residues interacting with the 22 conserved disulfide bridges is Phe, 10; Tyr, 12; Trp, 4; and His, 0, and the geometry in the majority of the cases is en or et.
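The conservation criterion can be illustrated with a short sketch that scores one column of a structure-based alignment. Several points are assumptions made for this example rather than details given in the text: any of the four aromatic residues counts as a match, gaps count against conservation, and the function names and the example columns are hypothetical.

```python
AROMATIC = frozenset("FYWH")  # Phe, Tyr, Trp, His


def aromatic_conservation(column):
    """Percentage of family members with an aromatic residue at one position.

    `column` holds one residue letter per aligned sequence; gaps ('-')
    are counted as non-aromatic (an assumption).
    """
    if not column:
        raise ValueError("empty alignment column")
    return 100.0 * sum(aa in AROMATIC for aa in column) / len(column)


def is_conserved(column, threshold=75.0, min_members=4):
    """Apply the >75% criterion to families with at least four members."""
    return len(column) >= min_members and aromatic_conservation(column) > threshold


score = aromatic_conservation("FFYFL")  # hypothetical five-member family
```

A column such as "FFYFL" scores 80% and passes the criterion, whereas "FFYL" scores exactly 75% and does not, because the threshold is a strict inequality.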
It has been noted that non-conserved disulfide bridges lie on or near the surface of globular proteins (Thornton, 1981). [...] Cys···Arom interactions towards the evolutionary conservation of disulfide bonds. A few individual cases of conservation are discussed below.
In the papain family of cysteine proteinases (PDB file: 1me4; Table V), the cysteine at the active site acts as a nucleophile for peptide bond cleavage. An important structural motif in this family is the presence of three disulfide bonds, of which two have conserved, interacting aromatic residues. The half cystines along with the aromatic residues in contact form the hydrophobic core of the protein (Kamphuis et al., 1985). However, cathepsin B (1the) represents a family with a different topology of disulfide bonds (Jia et al., 1995). Within this family all the half cystines and the associated aromatic residues are conserved. Of these, His110 is also part of the active site and, interestingly, Trp30 is a residue which is conserved in the papain family.
Porcine pancreatic spasmolytic polypeptide (2psp) belongs to the trefoil family of loop peptides and acts as a naturally occurring healing factor for various diseases of the gastrointestinal tract. The basic characteristic of these proteins is a domain of 38 or 39 residues, which includes six bridged Cys residues. 2psp contains two such domains and six disulfide bonds and all conserved residues are reported to be in the vicinity of the cleft areas (Petersen et al., 1996). Interestingly, these conserved aromatic residues (Phe36, Phe47, Phe85 and Phe96) at the cleft areas are found to interact with conserved disulfide bonds.
The members of the serine proteinase inhibitors have three disulfide bonds, of which two are found to interact with conserved aromatic residues (1g6x). The disulfide bond, 14C–38C, in this PDB file was excluded from our analysis as the sulfur atom of Cys38 had fractional occupancy. However, there have been considerable solution studies involving the bridge. For example, a series of 24 mutants of bovine pancreatic trypsin inhibitor showed that the 14C–38C disulfide bond was formed at an early stage of protein folding and the rate of formation of this disulfide bond was affected 2-fold if Tyr35 was mutated to Ala (Dadlez, 1997). This Tyr35 is conserved in the serine proteinase family. We found a homologous PDB file, 5pti, in which the atoms in Cys38 have full occupancy, but in this structure Tyr35 (at 6.4 Å) is beyond the contact distance from Sγ of Cys38. Yet another study found that the mutation of Tyr35 and Tyr23 affected the folding of bovine pancreatic trypsin inhibitor (Goldenberg et al., 1989) and Tyr23 is in interaction with the 30C–51C bridge.
The disulfide connectivity of tick anticoagulant peptide (TAP, 1d0d) is similar to that of the Kunitz family of serine proteinase inhibitors, although its amino acid sequence identity with this family is much less (Antuch et al., 1994). We have found that there are two aromatic residues (Tyr1 and Tyr49) in contact with the 5C–59C bridge. The conserved Tyr49 in TAP (or Phe found in some members) is responsible for the structural integrity of the 3₁₀-helix (Asn2–Leu4), which is responsible for the binding of the molecule to the secondary site of factor Xa (Charles et al., 2000). Tyr1 is crucial for TAP, as it renders the peptide highly specific for factor Xa (by binding to the S1 specificity pocket of Xa), exhibiting little inhibitory activity towards other serine proteases, such as trypsin, thrombin and other blood proteases (Waxman et al., 1990; Charles et al., 2000). This is an example where an aromatic residue is positioned by the disulfide bond (in the op geometry) for its functional role.
Yet another example of the functional importance associated with aromatic residues is provided by phospholipase A2, which hydrolyzes phospholipids to fatty acids and lysophospholipids. These enzymes are abundant in snake venoms of various species and share similar three-dimensional structures. Acutohaemolysin (1mc2) is a lipase that lacks catalytic and hemolytic activity. Three conserved disulfide bonds interact with conserved aromatic residues, some of which have catalytic functions. Among these, Tyr1025 lies in the calcium-binding loop, which is one of the most conserved regions in the structure. Tyr1052 is one of the constituents of the invariant His–Tyr–Asp catalytic triad. Phe1102 (interacting with 1029C–1045C, en geometry), which exists only in acutohaemolysin, blocks the substrate binding to this catalytic triad, resulting in the loss of hemolysis (Liu et al., 2003).
Lesk and Chothia (1982) first identified a high degree of conservation of Cys–Cys and Trp residues in the Fab molecule. Later, this Cys–Cys and Trp structural triad was found to be conserved in almost all immunoglobulin domains, such that Trp is packed against the disulfide bond (Ioerger et al., 1999). These residues, however, do not show up against the entry 1k5n for the heavy chain of the histocompatibility antigen binding domain in Table V, as the distance (4.8 Å) between the half cystine and Trp is beyond our cut-off value and indeed it has been reported that the triad geometry and the distance between Cys–Cys and Trp are to some extent different in the major histocompatibility complex antigen, as compared with other immunoglobulin domains (Ioerger et al., 1999). Nevertheless, in this molecule we have identified another conserved disulfide bond (101C–164C) and an interacting Tyr159 (in on geometry). Interestingly, this Tyr, in turn, interacts with Phe3 of the antigen peptide, giving rise to a disulfide–aromatic–aromatic triad in the complex structure.
Discussion
The tertiary folds of native proteins are determined by a large number of weak interactions, viz., hydrogen bonding, hydrophobic interactions, salt bridges and weakly polar interactions, such as aromatic–aromatic, X–H···π (where X is N, O or C) and C–H···O (Singh and Thornton, 1985 [...] Arom interactions (Morgan et al., 1978 [...] S···O and S···Arom contacts, respectively. Delineating the geometric features of the former was restricted to main-chain oxygen atoms only, as the distribution involving the side-chain O atoms did not reveal any peak indicative of a region within which specific interactions between the atoms would stand out against the background of other non-bonded interactions.
Features of the disulfide bonds
For proteins containing disulfide bonds, one can fit a power function to the number of bonds per 100 residues plotted against the chain length (Figure 1); after an initial fall, the number flattens out at about 150 residues. The left-handed spiral structure (Richardson, 1981) with χ1 ≈ χ2 ≈ −60° and χ3 ≈ −90° (Figure 2) is the most populated conformation. Unlike χ1, the χ2 torsion (Cα–Cβ–Sγ–Sγ′) angle is closer to ±90° and this aspect is similar to what is exhibited by χ2, the equivalent torsion involving the metal ion, when Cys residues act as ligands (Chakrabarti, 1989). The residues with χ3 values in the fringes of the distribution usually have χ1 and/or χ2 around +60°, a nominally higher energy conformation. Such bonds may be strained and it will be worthwhile to study the stability conferred by such bonds to protein structures.
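The side-chain torsions discussed above are ordinary signed dihedral angles, so they can be recomputed from atomic coordinates. The following is a minimal sketch, not the procedure used in the paper: the function names (`dihedral`, `disulfide_torsions`, `handedness`) are hypothetical, coordinates are assumed to be plain (x, y, z) tuples in Å, and the left/right label simply follows the sign of χ3.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed torsion angle p0-p1-p2-p3 in degrees, in the range -180..180."""
    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b1, b2), cross(b2, b3)   # normals of the two planes
    y = dot(b2, cross(n1, n2))
    x = math.sqrt(dot(b2, b2)) * dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def disulfide_torsions(n, ca, cb, sg, sg2, cb2):
    """Return (chi1, chi2, chi3) for one half cystine of a disulfide.

    chi1 = N-CA-CB-SG, chi2 = CA-CB-SG-SG', chi3 = CB-SG-SG'-CB'.
    """
    chi1 = dihedral(n, ca, cb, sg)
    chi2 = dihedral(ca, cb, sg, sg2)
    chi3 = dihedral(cb, sg, sg2, cb2)
    return chi1, chi2, chi3

def handedness(chi3):
    """Label a disulfide by the sign of chi3 (assumed convention: negative = left-handed)."""
    return "left-handed" if chi3 < 0 else "right-handed"
```

In practice the six positions would be read from the N, CA, CB and SG atom records of the two bonded Cys residues in a PDB file.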
Stereospecific interactions
The contour plot showing the distribution of oxygen atoms around the interacting sulfur of the disulfide bond is shown in Figure 9. There is a general avoidance of the circular region around the extension of the bisector to the Cβ–Sγ–Sγ′ angle and the points are clustered at the backside of the Cβ–Sγ and Sγ′–Sγ bonds, suggesting that the carbonyl oxygen atom tends to approach cystine along the antibonding orbital of the bonds. The stabilizing nature of the interaction when the molecular orbitals of a nucleophile (oxygen) and an electrophile (sulfur) interact leads to the directional properties of S···O non-bonded interactions (Rosenfield et al., 1977; Guru Row and Parthasarathy, 1981). This interaction (examples in Figure 10) has been noted earlier in protein structures, especially in the interaction involving the sulfur atom of Met residues (Pal and Chakrabarti, 2001; Iwaoka et al., 2002a). Although there are more points interacting at the backside of the Sγ′–Sγ bond, there is more diffusion of points away from the sulfide plane as compared with the region behind the Cβ–Sγ bond. These geometric features are also revealed from an analysis of angular parameters (Figure 4c) displayed in Figure 7c and d. Ab initio calculations using model systems indicate that the S···O interaction can contribute 2.5–3.2 kcal/mol to the stability (Dixit et al., 1995; Iwaoka et al., 2002a,b).
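One simple way to check whether an oxygen atom lies "at the back" of a bond to sulfur is to measure the angle X–S···O, which approaches 180° when the oxygen sits on the extension of the X–S bond. The sketch below illustrates that idea only; it does not reproduce the angular parameters of Figure 4c, and the function names and dictionary keys are hypothetical.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def norm(v):
    return math.sqrt(dot(v, v))

def angle_at(center, a, b):
    """Angle a-center-b in degrees."""
    u, v = sub(a, center), sub(b, center)
    c = dot(u, v) / (norm(u) * norm(v))
    c = max(-1.0, min(1.0, c))          # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(c))

def s_o_approach(cb, sg, sg2, o):
    """Distance and approach angles of oxygen o relative to the two bonds at sulfur sg."""
    return {
        "distance": norm(sub(o, sg)),        # S...O distance
        "angle_cb_s_o": angle_at(sg, cb, o), # near 180 deg: behind the CB-SG bond
        "angle_s2_s_o": angle_at(sg, sg2, o) # near 180 deg: behind the SG'-SG bond
    }

def nearest_backside(cb, sg, sg2, o):
    """Name the bond whose rear extension the oxygen is closer to."""
    g = s_o_approach(cb, sg, sg2, o)
    return "CB-SG" if g["angle_cb_s_o"] >= g["angle_s2_s_o"] else "SG'-SG"
```

Any distance cut-off for counting a contact is left to the caller, since the value used in the analysis is not restated in this section.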
Like the S···O interaction, the S···Arom interaction also has preferred geometries. From the perspective of the sulfide plane, the distribution of the aromatic centroid is similar to that of the oxygen atoms, as revealed by the observed values (Figure 7) of the two angles defined in Figure 4c. However, as the π-electron cloud is spread over the whole aromatic ring, the clustering is less prominent and a contour map similar to Figure 9 for oxygen atoms could not be obtained. The orientation of the sulfide plane relative to the aromatic ring is also found to be restricted, the preferred geometry (Figure 4b) being en or et (Figures 5 and 6). The orientations having repulsive interaction between the face of the aromatic ring and the sp3 lone pair of electrons in sulfur are avoided. For example, in the stacked fp geometry one lone pair orbital of sulfur points directly to the π-electron cloud of the ring, whereas in fn both orbitals interact unfavorably (Figure 11). In contrast, the repulsive interaction is at its minimum in et and the rear (i.e. the antibonding orbital) of the Cβ–Sγ or Sγ′–Sγ bond can interact favorably with the π-electrons. Although Met···Arom interactions were not analyzed in terms of similar geometric parameters, the preferred geometry of sulfur appears to be on top of the aromatic ring, close to the periphery (Pal and Chakrabarti, 2001). Energy calculations also suggest an interaction of the sulfur atom with the aromatic face, but not all of the possible geometries encountered in protein structures have been investigated (Némethy and Scheraga, 1981; Pranata, 1997). It is normally assumed that the environment of a disulfide bond is hydrophobic. However, the stereospecific location of O atoms and aromatic rings, whose geometry is dictated by the electronic interactions, suggests that electrostatic factors also control the local structure around the bond.
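The relative orientation of the sulfide plane and an aromatic ring can be described by two generic quantities: the angle between the two planes, and how far the sulfur is tilted away from the ring normal through the centroid. The sketch below computes both under stated assumptions: the sulfide plane is taken as the plane through Cβ, Sγ and Sγ′, the ring atoms are supplied in order around the ring, and all names are hypothetical. It deliberately does not assign the en, et, fp or fn labels, whose exact definitions are given in Figure 4b.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(v):
    length = math.sqrt(dot(v, v))
    return (v[0] / length, v[1] / length, v[2] / length)

def centroid(points):
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def ring_normal(ring):
    """Unit normal of a roughly planar ring; atoms must be listed in ring order."""
    c = centroid(ring)
    rel = [sub(p, c) for p in ring]
    total = (0.0, 0.0, 0.0)
    for i in range(len(rel)):
        n = cross(rel[i], rel[(i + 1) % len(rel)])  # accumulate per-edge normals
        total = (total[0] + n[0], total[1] + n[1], total[2] + n[2])
    return unit(total)

def acute_angle(cos_value):
    """Angle in degrees (0..90) from the absolute value of a cosine."""
    return math.degrees(math.acos(min(1.0, abs(cos_value))))

def s_arom_geometry(cb, sg, sg2, ring):
    """Return (centroid distance, interplanar angle, tilt of S from the ring normal)."""
    c = centroid(ring)
    n_ring = ring_normal(ring)
    n_sulfide = unit(cross(sub(cb, sg), sub(sg2, sg)))
    to_s = sub(sg, c)
    distance = math.sqrt(dot(to_s, to_s))
    interplanar = acute_angle(dot(n_ring, n_sulfide))
    tilt = acute_angle(dot(n_ring, to_s) / distance)  # 0 = over the face, 90 = in the ring plane
    return distance, interplanar, tilt
```

A sulfur sitting above the ring edge with the sulfide plane roughly perpendicular to the ring would give a large interplanar angle and an intermediate tilt.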
Conservation of Cys···Arom interactions and their functional role
When a Cys···Arom interaction occurs in a structure, it is found to be an invariant feature in at least 50% of the families to which the protein belongs, as long as the disulfide bond itself is fully conserved (Table V). Such bridges are also fully buried in the structure. In families in which the S–S bond is not fully conserved, constituent members showing the conservation of the bond usually have a Cys···Arom interaction conserved. When they are conserved, Cys···Arom interactions can be considered as the signature motif for a particular family of proteins and may be used as a fingerprint to annotate the function based on the three-dimensional structure of an unknown protein.
Although disulfide bridges are normally associated with extra stability, they may also be needed for function (Chang et al., 2003) and an engineered disulfide bond can also modulate the functional attributes of a molecule (Sauer et al., 1986). For example, human insulin contains two inter-chain and one intra-chain disulfide linkages, which make different contributions to the structure formation of insulin and are formed sequentially in the order A20–B19, A7–B7 and A6–A11 (the letter corresponds to the chain label) in the folding pathway of proinsulin; but all three are essential for receptor binding activity (Chang et al., 2003). Interestingly, in many of the proteins in Table V, the aromatic residues interacting with disulfide bonds have a unique role in the function, especially in binding or even preventing the substrate binding (see Results, last section) and, expectedly, these Cys···Arom interactions are evolutionarily conserved. Given the importance of Arom···Arom interactions (Singh and Thornton, 1985; Burley and Petsko, 1988; Samanta et al., 1999, 2000; Bhattacharyya et al., 2002, 2003) and the prevalence of aromatic residues at protein–protein interfaces (Chakrabarti and Janin, 2002), one can encounter a Cys···Arom···Arom triad stabilizing protein–protein interactions.
Transforming growth factor β (TGFβ) has four disulfide bonds formed by eight conserved half cystines. The hydrophobic clusters at the binding surface are constituted mostly by aromatic residues, which are highly conserved in this family of proteins. Among these conserved aromatic residues, His40, Trp45 and Phe83 are within 4.5 Å of cystine (Greenwald et al., 1999). The mutual exclusiveness of disulfides and aromatic residues in protein families suggests the importance of their interaction in the protein structure and function and this fact can be used in molecular design.
Stability of disulfide bonds and the sequential pattern of residues
The disulfide bonds are susceptible to thiol/disulfide exchanges, provided a nearby thiolate anion can attack them (Gilbert, 1984). The direction of this attack will be along the backside of the Sγ′–Sγ bond. As long as this approach is obstructed by the presence of an incipient electrophile–nucleophile interaction (sulfur with oxygen or aromatic) within the protein structure, with the Cys residues having very low accessibility, the disulfide bond is prevented from undergoing any reaction. This is also the reason why such disulfide bonds have survived the evolutionary pressure of mutation. However, when the cystine is at the enzyme active site, such as 58C–63C in glutathione reductase (Karplus and Schulz, 1987), the half cystines are more exposed, with relative accessibilities of 14 and 39% (in the PDB file 3grs), respectively. The only carbonyl group close by is that of Cys58 (at 3.5 Å), with the two angles defined in Figure 4c being 18 and 121°, a location nearly perpendicular to the sulfide plane, such that the rear side of the Sγ′–Sγ bond is available for nucleophilic attack by a thiolate anion.
When a specific interaction occurs between two residues at a particular sequence difference, it is observed that the order of the two residues in the sequence is also important. For example, the location of a His four residues following a Phe in an α-helix can give rise to an N–H···π (or C–H···π) interaction stabilizing the helix; the interaction is not attainable if the order of the pair is reversed (Bhattacharyya et al., 2002). Likewise in a helix, an interacting aromatic residue preceding a half cystine by four (or one) residues is found to occur more frequently than when the order is the reverse; a similar order is also observed with an interacting carbonyl group (Tables III and IV; Figures 8a and b and 10a) and in the S(Met)···Arom interaction (Pal and Chakrabarti, 2001).
Finally, we have shown in this paper that there are preferred geometries for interaction of half cystines with the peptide oxygen atoms and aromatic side chains and there are compelling reasons to believe that the S···Arom interaction is distinct from the hydrophobic interaction that one expects when the disulfide bond is buried in an environment of aliphatic groups. There is a high degree of conservation of the aromatic residues in contact with the bond and many of these are endowed with a functional role.
| Notes |
|---|
1 Present address: UCLA-DOE Institute for Genomics and Proteomics, Box 951570, University of California at Los Angeles, Los Angeles, CA 90095-1570, USA
| Acknowledgments |
|---|
A Senior Research Fellowship from the Council of Scientific and Industrial Research, India, supported R.B. and the Department of Biotechnology provided some computational facilities.
| References |
|---|
Abkevich,V.I. and Shakhnovich,E.I. (2000) J. Mol. Biol., 300, 975–985.
Allen,F.H. (2002) Acta Crystallogr. B, 58, 380–388; http://www.ccdc.cam.ac.uk/
Allen,F.H., Bird,C.M., Rowland,R.S. and Raithby,P.R. (1997) Acta Crystallogr. B, 53, 696–701.
Antuch,W., Guntert,P., Billeter,M., Hawthorne,T., Grossenbacher,H. and Wuthrich,K. (1994) FEBS Lett., 352, 251–257.
Berman,H.M., Westbrook,J., Feng,Z., Gilliland,G., Bhat,T.N., Weissig,H., Shindyalov,I.N. and Bourne,P.E. (2000) Nucleic Acids Res., 28, 235–242.
Betz,S.F. (1993) Protein Sci., 2, 1551–1558.
Bhattacharyya,R. and Chakrabarti,P. (2003) J. Mol. Biol., 331, 925–940.
Bhattacharyya,R., Samanta,U. and Chakrabarti,P. (2002) Protein Eng., 15, 91–100.
Bhattacharyya,R., Saha,R.P., Samanta,U. and Chakrabarti,P. (2003) J. Proteome Res., 2, 255–263.
Brandl,M., Weiss,M.S., Jabs,A., Sühnel,J. and Hilgenfeld,R. (2001) J. Mol. Biol., 307, 357–377.
Burley,S.K. and Petsko,G.A. (1988) Adv. Protein Chem., 39, 125–189.
Chakrabarti,P. (1989) Biochemistry, 28, 6081–6085.
Chakrabarti,P. and Janin,J. (2002) Proteins, 47, 334–343.
Chakrabarti,P. and Pal,D. (1997) Protein Sci., 6, 851–859.
Chakrabarti,P. and Pal,D. (2001) Prog. Biophys. Mol. Biol., 76, 1–102.
Chang,S.G., Choi,K.D., Jang,S.H. and Shin,H.C. (2003) Mol. Cells, 16, 323–330.
Charles,R.St., Padmanabhan,K., Arni,R.V., Padmanabhan,K.P. and Tulinsky,A. (2000) Protein Sci., 9, 265–272.
Clarke,J. and Fersht,A.R. (1993) Biochemistry, 32, 4322–4329.
Dadlez,M. (1997) Biochemistry, 36, 2788–2797.
Darby,N.J. and Creighton,T.E. (1993) J. Mol. Biol., 232, 873–896.
Desiraju,G.R. and Steiner,T. (1999) The Weak Hydrogen Bond in Structural Chemistry and Biology. Oxford University Press, Oxford.
Dill,K.A. (1990) Biochemistry, 29, 7133–7155.
Dixit,A.N., Reddy,K.V., Rakeeb,A., Deshmukh,A.S., Rajappa,S., Ganguly,B. and Chandrasekhar,J. (1995) Tetrahedron, 51, 1437–1448.
Fiser,A., Cserzö,M., Tüdös,É. and Simon,I. (1992) FEBS Lett., 302, 117–120.
Gilbert,H.F. (1984) Methods Enzymol., 107, 330–351.
Goldenberg,D.P., Frieden,R.W., Haack,J.A. and Morrison,T.B. (1989) Nature, 338, 127–132.
Greenwald,J., Fischer,W.H., Vale,W.W. and Choe,S. (1999) Nat. Struct. Biol., 6, 18–22.
Gregoret,L.M., Rader,S.D., Fletterick,R.J. and Cohen,F.E. (1991) Proteins, 9, 99–107.
Guru Row,T.N. and Parthasarathy,R. (1981) J. Am. Chem. Soc., 103, 477–479.
Hinck,A.P., Truckses,D.M. and Markley,J.L. (1996) Biochemistry, 35, 10328–10338.
Hubbard,S.J. (1992) NACCESS: A Program for Calculating Accessibilities. Department of Biochemistry and Molecular Biology, University College London, London.
Ioerger,T.R., Du,C. and Linthicum,D.S. (1999) Mol. Immunol., 36, 373–386.
Ippolito,J.A., Alexander,R.S. and Christianson,D.W. (1990) J. Mol. Biol., 215, 457–471.
Iwaoka,M., Takemoto,S., Okada,M. and Tomoda,S. (2002a) Bull. Chem. Soc. Jpn., 75, 1611–1625.
Iwaoka,M., Takemoto,S. and Tomoda,S. (2002b) J. Am. Chem. Soc., 124, 10613–10620.
Janin,J., Wodak,S., Levitt,M. and Maigret,B. (1978) J. Mol. Biol., 125, 357–386.
Jia,Z., Hasnain,S., Hirama,T., Lee,X., Mort,J.S., To,R. and Huber,C.P. (1995) J. Biol. Chem., 270, 5527–5533.
Kabsch,W. and Sander,C. (1983) Biopolymers, 22, 2577–2637.
Kamphuis,I.G., Drenth,J. and Baker,E.N. (1985) J. Mol. Biol., 182, 317–329.
Karplus,P.A. and Schulz,G.E. (1987) J. Mol. Biol., 195, 701–729.
Katz,B.A. and Kossiakoff,A. (1986) J. Biol. Chem., 261, 15480–15485.
Kraulis,P.J. (1991) J. Appl. Crystallogr., 24, 946–950.
Lee,B. and Richards,F.M. (1971) J. Mol. Biol., 55, 379–400.
Lesk,A.M. and Chothia,C. (1982) J. Mol. Biol., 160, 325–345.
Liu,Q., Huang,Q., Teng,M., Weeks,C.M., Jelsch,C., Zhang,R. and Niu,L. (2003) J. Biol. Chem., 278, 41400–41408.
Mansfeld,J., Vriend,G., Dijkstra,B.W., Veltman,O.R., Van den Burg,B., Venema,G., Ulbrich-Hofmann,R. and Eijsink,V.G.H. (1997) J. Biol. Chem., 272, 11152–11156.
Matsumura,M., Signor,G. and Matthews,B.W. (1989) Nature, 342, 291–293.
McGaughey,G.B., Gagné,M. and Rappé,A.K. (1998) J. Biol. Chem., 273, 15458–15463.
Meyer,E.A., Castellano,R.K. and Diederich,F. (2003) Angew. Chem. Int. Ed., 42, 1210–1250.
Mitchinson,C. and Wells,J.A. (1989) Biochemistry, 28, 4807–4815.
Mizuguchi,K., Dean,C.M., Blundell,T.L. and Overington,J.P. (1998) Protein Sci., 7, 2469–2471.
Morgan,R.S., Tatsch,C.E., Gushard,R.H., McAdon,J.M. and Warme,P.K. (1978) Int. J. Pept. Protein Res., 11, 209–217.
Morris,A.L., MacArthur,M.W., Hutchinson,E.G. and Thornton,J.M. (1992) Proteins, 12, 345–364.
Némethy,G. and Scheraga,H.A. (1981) Biochem. Biophys. Res. Commun., 98, 482–487.
Pal,D. and Chakrabarti,P. (1998) J. Biomol. Struct. Dyn., 15, 1059–1072.
Pal,D. and Chakrabarti,P. (2001) J. Biomol. Struct. Dyn., 19, 115–128.
Petersen,T.N., Henriksen,A. and Gajhede,M. (1996) Acta Crystallogr. D, 52, 730–737.
Petersen,T.N., Jonson,P.H. and Petersen,S.B. (1999) Protein Eng., 12, 535–548.
Pranata,J. (1997) Bioorg. Chem., 25, 213–219.
Reid,K.S.C., Lindley,P.F. and Thornton,J.M. (1985) FEBS Lett., 190, 209–213.
Richardson,J.S. (1981) Adv. Protein Chem., 34, 167–339.
Rosenfield,R.E.,Jr, Parthasarathy,R. and Dunitz,J.D. (1977) J. Am. Chem. Soc., 99, 4860–4862.
Samanta,U., Pal,D. and Chakrabarti,P. (1999) Acta Crystallogr. D, 55, 1421–1427.
Samanta,U., Pal,D. and Chakrabarti,P. (2000) Proteins, 38, 288–300.
Sauer,R.T., Hehir,K., Stearman,R.S., Weiss,M.A., Jeitler-Nilsson,A., Suchanek,E.G. and Pabo,C.O. (1986) Biochemistry, 25, 5992–5998.
Singh,J. and Thornton,J.M. (1985) FEBS Lett., 191, 1–6.
Srinivasan,N., Sowdhamini,R., Ramakrishnan,C. and Balaram,P. (1990) Int. J. Pept. Protein Res., 36, 147–155.
Stebbings,L.A. and Mizuguchi,K. (2004) Nucleic Acids Res., 32, D203–D207.
Steiner,T. and Koellner,G. (2001) J. Mol. Biol., 305, 535–557.
Taylor,J.C. and Markham,G.D. (1999) J. Biol. Chem., 274, 32909–32914.
Thomas,A., Meurisse,R., Charloteaux,B. and Brasseur,R. (2002) Proteins, 48, 628–634.
Thornton,J.M. (1981) J. Mol. Biol., 151, 261–287.
Umezawa,Y. and Nishio,M. (1998) Bioorg. Med. Chem., 6, 2507–2515.
van den Burg,B., Dijkstra,B.W., van der Vinne,B., Stulp,B.K., Eijsink,V.G.H. and Venema,G. (1993) Protein Eng., 6, 521–527.
van Vlijmen,H.W.T., Gupta,A., Narasimhan,L.S. and Singh,J. (2004) J. Mol. Biol., 335, 1083–1092.
Wang,G. and Dunbrack,R.L.,Jr (2003) Bioinformatics, 19, 1589–1591.
Waxman,L., Smith,D.E., Arcuri,K.E. and Vlasuk,G.P. (1990) Science, 248, 593–596.
Received September 2, 2004; revised November 2, 2004; accepted November 27, 2004.
Edited by P.Balaram