PEDS Advance Access originally published online on February 16, 2007
Protein Engineering Design and Selection 2007 20(3):133-141; doi:10.1093/protein/gzm004

© The Author 2007. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oxfordjournals.org

The structural basis of hyper IgM deficiency – CD40L mutations

J. Thusberg1 and M. Vihinen1,2,3

1 Institute of Medical Technology, FI-33014, University of Tampere, Finland 2 Research Unit, Tampere University Hospital, FI-33520 Tampere, Finland

3 To whom correspondence should be addressed. Institute of Medical Technology, FI-33014, University of Tampere, Finland. Email: mauno.vihinen{at}uta.fi


    Abstract
X-linked hyper-IgM syndrome (XHIGM) is a primary immunodeficiency characterised by an inability to produce immunoglobulins of the IgG, IgA and IgE isotypes. It is caused by mutations of CD40 ligand (CD40L, CD154), expressed on T-lymphocytes. The interaction of CD40L on T-cells and its receptor CD40 on B-cells is essential for lymphocyte signalling leading to immunoglobulin class switching and B-cell maturation. To understand the structural basis for XHIGM, we utilised bioinformatics methods to analyse all the known CD40L missense mutations at both the sequence and structural level. Our results demonstrate that the 35 different missense mutations have diverse effects on CD40L structure and function, affecting structural disorder and aggregation tendencies, stability maintaining contacts and electrostatic properties. Several mutations also affect residues essential in receptor binding and trimerisation. Experimental study of effects of mutations is laborious and time-consuming and at the structural level often almost impossible. By contrast, precise and useful information about effects of mutations on protein structure and function can readily be obtained by theoretical methods. In this study, all the XHIGM causing missense mutations could be explained in terms of CD40L structure and function. Thus, the molecular basis of the syndrome could be elucidated.

Keywords: bioinformatical analysis/disease-causing mutations/immunodeficiencies/structural basis of disease/structure–function relationships


    Introduction
X-linked hyper-IgM syndrome (XHIGM; OMIM 308230) is a rare and severe primary immunodeficiency characterised by the absence or low levels of IgG, IgA and IgE, normal or elevated IgM level in serum, and defective immunoglobulin class switch recombination (Notarangelo et al., 1992; Fuleihan et al., 1993). XHIGM patients are highly susceptible to recurrent bacterial infections and they are prone to autoimmune diseases and neutropenia (Levy et al., 1997). The syndrome is caused by mutations of CD40 ligand (CD40L, CD154), expressed on T-cells, and the inability of the mutated protein to bind to its receptor CD40 on B-cells (Aruffo et al., 1993).

CD40L, a member of the tumour necrosis factor (TNF) family of cytokines, is a 39 kDa Type II membrane glycoprotein expressed primarily on activated CD4+ T-cells (Noelle et al., 1992). The CD40L monomer consists of four distinct structural domains: an N-terminal intracellular tail (amino acids 1–22), a short transmembrane domain (amino acids 23–46), a portion that forms the extracellular unique domain (amino acids 47–122) and the extracellular, C-terminal TNF homology (TNFH) domain (amino acids 123–261). The crystal structure of the CD40L TNFH domain has been determined to 2.0 Å resolution (Karpusas et al., 1995).
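As an illustration only, the domain boundaries quoted above can be turned into a small lookup that assigns a missense label to a domain. The function names `parse_mutation` and `domain_of` are hypothetical and do not come from CD40Lbase or any published tool; the residue ranges are those given in the text.

```python
import re

# Domain boundaries as quoted in the text (inclusive residue numbers).
DOMAINS = [
    ("IC", 1, 22),       # N-terminal intracellular tail
    ("TM", 23, 46),      # transmembrane domain
    ("ECU", 47, 122),    # extracellular unique domain
    ("TNFH", 123, 261),  # TNF homology domain
]

def parse_mutation(label):
    """Split a label such as 'W140C' into (wild type, position, mutant)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a simple missense label: {label!r}")
    wild, pos, mutant = match.groups()
    return wild, int(pos), mutant

def domain_of(position):
    """Return the name of the domain whose range contains the position."""
    for name, start, end in DOMAINS:
        if start <= position <= end:
            return name
    raise ValueError(f"position {position} is outside residues 1-261")

for label in ("M36R", "G116R", "W140C", "G257D"):
    _, pos, _ = parse_mutation(label)
    print(label, domain_of(pos))
```

By these boundaries residue 116 falls in the ECU range, at the edge of the region covered by the crystal structure.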

TNFH domain superfamily members have a highly conserved jelly roll type structure, consisting of two β sheets that have a Greek key topology. The TNFH domains are responsible for receptor binding. The sequence identity between family members is ~20–30% (Bodmer et al., 2002).

The CD40L–CD40 interaction is essential in B-cell activation and antibody isotype switching (Kroczek et al., 1994). Isotype switching by B-cells stimulated by T-dependent signals requires both the ligation of CD40 and a second signal provided by a T-cell derived cytokine (Coffman et al., 1993). CD40 is constitutively expressed on B-cells (Clark and Ledbetter, 1986), whereas CD40L is expressed only after class II major histocompatibility complex (MHC)–T-cell receptor (TCR) interaction and T-cell activation (Armitage et al., 1992). The expression of CD40L is also regulated in an autologous manner, so that the interaction of the ligand with its receptor upregulates its own expression (Pinchuk et al., 1996).

The ligand–receptor interaction triggers a signalling cascade leading to the activation of several genes involved in B-cell proliferation and antibody production (Allen et al., 1993), and the downregulation of genes whose expression has been shown to lead to cell cycle arrest (Dadgostar et al., 2002). The expression of B7 proteins on the B-cell surface is also stimulated by the interaction, which contributes to the stability of the immunological synapse via the formation of co-stimulatory B7–CD28 interactions between T-cells and B-cells (Klaus et al., 1994). The ligation of CD40 stimulates the production in B-cells of cytokines, such as IL2, IL6, IL10, TNFα, LTα, LTβ and TGFβ (Clark and Shu, 1990; Burdin et al., 1993; Kindler et al., 1995; Worm and Geha, 1995; Worm et al., 1998). The CD40–CD40L interaction also induces T-cells to produce cytokines that determine the antibody class to be expressed in B-cells, and contributes to the proliferation of B-cells (Finkelman et al., 1990).

Signalling pathways activated by the ligation of CD40 originate from the interaction of the intracellular domain of the receptor with TNF receptor-associated factors (TRAFs) (Harigai et al., 2004). As a consequence of its interaction with the trimeric ligand, CD40 forms clusters at the B-cell membrane. The clustering of the receptor involves the recruitment and localisation of the TRAFs to membrane microdomains, which enables them to initiate signalling cascades by interacting with downstream signalling proteins (Hostager et al., 2000). Like many TNFR family members, CD40 activates the JNK/SAPK and NF-κB pathways (Berberich et al., 1994, 1996). Both pathways involve protein serine/threonine kinases that activate AP1 and Rel transcription factors, thereby regulating gene expression. The p38 kinase pathway, which leads to the activation of transcription factors such as ATF2 (Raingeaud et al., 1996), has also been reported to be activated by CD40 (Sutherland et al., 1996). The extracellular signal-regulated kinase/mitogen-activated protein kinase pathway, which is also activated by CD40 (Li et al., 1996), contributes to the activation of AP1, NF-κB and NF-AT, and the subsequent induction of cytokine gene expression (Park and Levitt, 1993).

The mutation registry for XHIGM, CD40Lbase (Piirilä et al., 2006; Notarangelo and Peitsch, 1996) (http://bioinf.uta.fi/CD40Lbase/), currently lists 212 XHIGM patient entries with a total of 128 different mutations. Most disease-causing mutations are found in exons (106), 35 of which are missense mutations located mainly in the TNFH domain of the protein (Fig. 1A). We investigated the consequences of all the CD40L missense mutations by applying structural and bioinformatics methods.


Fig. 1. (A) MultiDisp visualisation of the sequence alignment for CD40L and its homologues. The height of the characters indicates the frequency of the amino acids in the alignment positions, and the colour of the objects reflects the chemical nature of the amino acids. The CD40L domain boundaries and secondary structures are presented according to Karpusas et al. (1995). The positions of CD40L missense mutations are indicated by arrowheads below the alignment, together with all mutant forms. XHIGM-causing mutations are clustered almost exclusively in the TNFH domain. The protein consists of an intracellular domain, IC; a transmembrane domain, TM; an extracellular unique domain, ECU; and a TNF homology domain, TNFH. (B) Structure of CD40 ligand (PDB code 1ALY). The residues involved in protein–protein interactions are coded as follows: trimer formation – orange; receptor binding – magenta. (C) CD40L missense mutations coloured according to their principal effects on CD40L structure and function. Change in electrostatic surface potential – red; conformational perturbation – cyan; loss of hydrophobic interactions and structural stability – yellow; protein–protein interactions – magenta. Secondary structures are named according to Karpusas et al. (1995).

 

    Materials and methods
The amino acid sequence and missense mutations for CD40L were obtained from our database CD40Lbase (http://bioinf.uta.fi/CD40Lbase/). The database lists all known mutations for CD40L and stores patient information, such as clinical and immunological phenotype, prognosis and treatment. The database was updated with recently published cases. Sequence homologues (33) were obtained by PSI-BLAST (Altschul et al., 1997), and homologues for the TNFH domain sequence were collected from the Pfam database (Bateman et al., 2004) (seed: 52, full: 175 sequences). Multiple sequence alignments were performed by Clustal W (Thompson et al., 1994). Alignments were visualised using MultiDisp (Riikonen, P. and Vihinen, M., in preparation) and ConSeq (Berezin et al., 2004) for illustration of conserved amino acids in the sequence.

The evolutionary conservation of the sequences was studied, in addition to the visualisation programs, by ProCon, a program for calculating mutual information and entropy in amino acid sequences (Shen and Vihinen, 2004). For entropy calculations, default parameters (p1 = 0.01, p2 = 0.05) were used, whereas parameters for the mutual information were readjusted to p1 = 0.005 and p2 = 0.020. Conservation indices were calculated with the program al2co (Pei and Grishin, 2001) and the ConSurf server (Glaser et al., 2003).
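The two quantities named above can be sketched generically as follows. This is a minimal textbook illustration of Shannon entropy per alignment column and mutual information between two columns. It is not the ProCon implementation and does not model its p1 and p2 parameters; the toy alignment is invented for the example.

```python
from collections import Counter
from math import log2

def column(alignment, index):
    """Extract one column from a list of equal-length aligned sequences."""
    return [seq[index] for seq in alignment]

def entropy(symbols):
    """Shannon entropy (bits) of the symbol frequencies in a column."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((n / total) * log2(n / total) for n in counts.values())

def mutual_information(col_a, col_b):
    """I(A;B) = H(A) + H(B) - H(A,B), using paired symbols for H(A,B)."""
    joint = list(zip(col_a, col_b))
    return entropy(col_a) + entropy(col_b) - entropy(joint)

# Toy alignment (hypothetical): column 0 is invariant,
# columns 1 and 3 vary together, column 2 is invariant.
alignment = ["WLGA", "WIGS", "WLGA", "WIGS"]

print(entropy(column(alignment, 0)))
print(mutual_information(column(alignment, 1), column(alignment, 3)))
```

An invariant column has zero entropy, whereas two columns that always change together share as much information as each carries individually.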

Structural disorder in the protein and the effects of mutations on disordered regions were studied using four predictors, DISOPRED (Ward et al., 2004), DisEMBL (Linding et al., 2003a), GlobPlot (Linding et al., 2003b) and PONDR (Romero et al., 1997). The disorder prediction methods are based on different principles, which are further discussed in the corresponding papers and in Thusberg and Vihinen (2006).

The effects of mutations on aggregation propensities were studied by TANGO (Fernandez-Escamilla et al., 2004), and calculations presented by Chiti et al. (2003), for which α-helical propensities were calculated with the program AGADIR (Muñoz and Serrano, 1997). A script was written to implement the method of Chiti et al. (2003).

The damaging effects of point mutations were analysed using SNPs3D (Yue et al., 2006), SIFT (Ng and Henikoff, 2001), PolyPhen (Sunyaev et al., 2001), PoPMuSiC (Gilis and Rooman, 2000; Kwasigroch et al., 2002) and Pmut (Ferrer-Costa et al., 2005).

Structural analyses were performed based on the crystal structure of the protein (PDB 1ALY). The structure was visualised and the mutations were modelled by PyMOL (DeLano, 2002). Hydrogen atoms were added to the structures using Reduce (Word et al., 1999b). Mutant amino acid side chain χ angles were rotated at intervals of 10° by the Autobondrot function in PROBE (Word et al., 1999a; Lovell et al., 2000) and the best rotamers were selected for further analysis. The acceptable conformations for a mutated side chain have a total score above –1.0, allowing for small local perturbations in the structure (Lovell et al., 2000). The created structures were verified by MolProbity (Lovell et al., 2003), which was also used for converting the PDB files into Kinemage format. MolProbity adds all atom contacts into the structures and flips asparagine and glutamine side chains when necessary. Mutation structures were visualised by the program KiNG (Lovell et al., 2003), to analyse all atom contacts and clashes.

Amino acid contact analysis for the mutant residues in the TNFH domain was performed with CSU (Sobolev et al., 1999), and the nature of the contacts, contact surfaces, as well as solvent accessible surfaces, were elucidated. Contact energies between amino acids in the TNFH domain were analysed using RankViaContact (Shen and Vihinen, 2003). By analysing the wild-type protein, we could determine structurally important amino acids, which contribute to the stability of the protein, or amino acids with weak contacts that may be important for functional specificity. The analysis of changes in the contact energies for mutant structures provided hypotheses for the roles of the mutated amino acids. Electrostatic surface potentials were calculated and visualised with the PyMOL program (DeLano, 2002) using the absolute electrostatic potential in a vacuum.


    Results
Diseases can arise from numerous genetic defects. To understand the basis of diseases, one has to study the effects of mutations at the gene and/or protein level. Missense mutations are, in this respect, the most interesting because they offer the possibility of learning about the functions and properties of the protein. The effects of many other mutation types are self-evident, such as large deletions or insertions, frameshift mutations and nonsense mutations, which affect the size and/or sequence of the protein. All the reported missense mutations causing XHIGM, altogether 35, could be explained at the molecular level by means of sequence and structure analysis.

Sequence conservation and mutations at the conserved residues

Disease-causing mutations are typically located at conserved positions within a protein family, since these positions are usually essential for the structure and/or function of the protein (Miller and Kumar, 2001; Mooney and Klein, 2002; Shen and Vihinen, 2004). In CD40L, the degree of conservation depends on the protein domain. The CD40L TNFH domain (residues 123–261) has many more homologues than the IC and ECU domains, which are very different from the corresponding domains of other family members.

According to the full sequence alignment visualised with MultiDisp (Riikonen et al., in preparation) (Fig. 1A) and the conservation indices calculated by al2co (Pei and Grishin, 2001) and ConSeq (Berezin et al., 2004), there are 10 invariant positions in the TNF family corresponding to the amino acids W140, L161, G167, Y169, Y172, L205, G226, G227, L231 and G257 in CD40L. There are missense mutations at six of these positions: W140C/G/R, Y169D/N, G226A, G227V, L231S and G257D/S. Type II conservation, where the physicochemical nature of the amino acid is conserved, was studied by calculation of information with an alphabet in which amino acids are split into six groups based on their physicochemical properties. Type II conserved amino acids with known XHIGM-causing mutations are Q174, with polarity as the conserved property, and Y170 and V237, the conserved property of which is hydrophobicity. In Fig. 1A, the physicochemical nature of these positions is represented by different colours.

Type III conservation refers to covariation of two or more positions in the protein family. In the TNF family, Type III conservation is evident, but almost none of the covarying amino acids are conserved in the CD40L sequence. Only at positions 222, 239 and 240 is the covarying amino acid the same as in the other family members, and there are no known mutations at these positions.

Mutations predicted to affect structural disorder and protein β-aggregation propensity

None of the mutations was predicted to cause disorder by all the programs we used (DisEMBL (Linding et al., 2003a), DISOPRED (Ward et al., 2004), GlobPlot (Linding et al., 2003b) and PONDR (Li et al., 1999)). G116R is the only mutation likely to cause disorder, because three of the four programs agreed on the disorder-causing nature of the mutation. In addition, V126D, W140G, L155P, A208D, A235P, V237E and L258S might increase disorder in the protein structure, being predicted to do so by half of the programs. E129G, K143T, T176I, H224Y and G227V were predicted to increase the protein aggregation rate when calculated by the methods of Chiti et al. (2003). T176I was also predicted to cause aggregation by the program TANGO (Fernandez-Escamilla et al., 2004; Linding et al., 2004). In Fig. 1C, these effects are presented under the category of structural stability loss.
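The way the four disorder predictions are combined here amounts to a simple vote count, which can be sketched as below. The labels "likely" and "possible" paraphrase the wording of the text. The function name and the individual votes in the example are hypothetical, since the per-program results are not listed in the text.

```python
def disorder_call(votes):
    """Classify a mutation from per-predictor votes.

    votes maps a predictor name to True when that predictor reports
    increased disorder for the mutation.
    """
    agreeing = sum(votes.values())
    total = len(votes)
    if agreeing == total:
        return "unanimous"
    if 2 * agreeing > total:
        return "likely"        # e.g. three of four programs agree
    if 2 * agreeing == total:
        return "possible"      # exactly half of the programs agree
    return "unsupported"

# Hypothetical votes: which three programs agreed on G116R is an assumption.
g116r = {"DisEMBL": True, "DISOPRED": True, "GlobPlot": True, "PONDR": False}
print(disorder_call(g116r))
```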

Structural mutations

In the structure-based studies, only the 33 mutations located within the structurally determined CD40L TNFH domain (PDB ID 1ALY) (Fig. 1A and C) could be analysed. At position 116, where there are two known missense mutations, the structure is not well defined (Karpusas et al., 1995). Consequently, the effects of these mutations cannot be reliably predicted at the structural level. Effects of mutations on protein structure and stability were studied by rotamer analysis and determination of overlapping side chains. The best rotamers according to the PROBE scores were used in the analyses. Most of the mutated side chains fit into the structure without deleterious changes to protein scaffolding, as determined both computationally by the PROBE score (Word et al., 1999a; Word et al., 2000) and visually by the program KiNG (Lovell et al., 2003). Of the 33 mutations screened, 22 gave an acceptable score above –1.0, allowing for small local perturbations in the structure (Lovell et al., 2000). In the visual inspection, 17 of the mutations showed very little or no effect on the structure.

According to PROBE scores, the S128R/E129G double mutant, L155P, T176I, L195P, G226A, G227V, A235P, T254M, G257D and G257S do not fit into the structure in any rotamer (example in Fig. 2A). In addition, mutations A173D, W140C, W140R and A208D were shown to cause serious clashes with other side chains (example in Fig. 2B). Proline is a known secondary structure breaker, and L155 and A235 are located in the middle of β strands. Prolines in these positions have been suggested to cause structural disorder (Karpusas et al., 1995). Mutated amino acids that cannot fit into the structure without clashes lead to changes in the scaffolding, stability and properties of the protein. These mutations are indicated as conformation perturbing in Fig. 1C.
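The acceptance criterion used for the best rotamer can be written as a one-line test. The sketch below applies it to the two scores quoted in the caption of Fig. 2. The function name is hypothetical and nothing here calls PROBE itself.

```python
# Acceptance threshold for the best-rotamer PROBE score, as stated in the text.
PROBE_THRESHOLD = -1.0

def fits_by_score(score, threshold=PROBE_THRESHOLD):
    """True when the best-rotamer score is above the acceptance threshold."""
    return score > threshold

# Best-rotamer scores quoted in the caption of Fig. 2.
best_rotamer_scores = {"T176I": -18.604, "W140C": 0.639}

for label, score in best_rotamer_scores.items():
    verdict = "acceptable" if fits_by_score(score) else "does not fit"
    print(f"{label}: {score} -> {verdict}")
```

W140C passes the score test yet still clashes with V126 on visual inspection, which is why the score is combined with inspection in KiNG rather than used alone.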


Fig. 2. (A) Substitution of T176 by I causes serious clashes with neighbouring residues, also indicated by a negative PROBE score (–18.604). (B) Substitution of W140 by C causes serious clashes with V126, in spite of a positive PROBE score (0.639). Yellow – negligible overlap; red – significant overlap ≥0.25 Å; hot pink – serious clash overlap ≥0.4 Å.

 
Mutations causing changes in contacts maintaining stability

Amino acids located in the core of the protein, with a negligible solvent accessible surface area, typically form several hydrophobic interactions essential for the folding of the protein and for the stability of the protein structure. Mutations in such amino acids may cause detrimental changes to the structure-maintaining contacts. The probability of a mutation being pathogenic has been shown to increase with a decrease in the solvent accessibility of the site (Vitkup et al., 2003). In addition, introduction of charged side chains into the hydrophobic core is known to destabilise protein structure (Chasman and Adams, 2001). Of the 33 CD40L TNFH mutations, 10 cause significant loss of hydrophobic interactions: V126A, V126D, W140G, W140R, W140C, Y169D, Y169N, A173D, L231S and L258S. All these residues are located in the hydrophobic core of the protein, with a solvent accessible surface of 0.0–1.4%, and strong contact energies.

Residues with strong contact energies are important for protein stability (Shen and Vihinen, 2003). Mutations that affect such residues can thus be predicted to decrease stability. Of the 33 investigated mutations, five affect residues from among the 10% most stabilising amino acids: V126D, V237E, G257D, G257S and L258S. V237 forms several hydrophobic interactions with neighbouring residues, but the mutation does not change the number of these contacts. Instead, the contact energy of the residue decreases from –23.850 to –5.550, which indicates a significant weakening of the contacts as a consequence of the mutation. G257, another residue with strong contact energy, is also presumably structurally important. The hydrogen bonds it forms with Y172 and L258 do not change upon substitution with D or S, so the structure destabilising effects of the mutations can be explained by side chain clashes and the restriction of the mobility of the backbone (as usually happens when glycine is replaced by an alternative amino acid). The introduction of a charged residue (aspartate) into the protein core would require another mutation to neutralise the charge.

L258 and V126 form several strong hydrophobic contacts, the number of which decreases markedly when mutated to an S or D. The negative charge introduced by aspartate into the hydrophobic core contributes to the destabilising effect of the mutation V126D. The mutations, whose principal effect is the loss of essential stability maintaining contacts, are categorised as stability reducing in Fig. 1C.

Mutations affecting the electrostatic surface potential of CD40L

G116R, A123E, H125R, V126D, S128R/E129G, K143T, G144E, A173D, Q174R, R203I, A208D, V237E and G257D introduce significant changes into the electrostatic surface potential of CD40L (Figs. 1C and 3). Most mutations make the surface potential more negative, whereas the arginine-introducing mutations, G116R, H125R, Q174R and S128R/E129G, make the potential more positive. W140R does not introduce a significant change to the electrostatic surface potential because the side chain of the mutated residue lies inside the structure. However, the positive charge is not neutralised by interactions with other residues in the protein core.
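The direction of these changes can be checked with a crude bookkeeping of formal side chain charges. This is a simplified sketch, not the vacuum electrostatics calculated in PyMOL. It assumes charges of –1 for D and E, +1 for K and R, and treats histidine as neutral. It also ignores burial, so it cannot capture the W140R case described above.

```python
# Formal side chain charges near neutral pH (assumption: histidine neutral).
FORMAL_CHARGE = {"D": -1, "E": -1, "K": 1, "R": 1}

def charge_change(label):
    """Net change in formal charge for a label such as 'K143T' or 'S128R/E129G'."""
    total = 0
    for part in label.split("/"):
        wild, mutant = part[0], part[-1]
        total += FORMAL_CHARGE.get(mutant, 0) - FORMAL_CHARGE.get(wild, 0)
    return total

mutations = ["G116R", "A123E", "H125R", "V126D", "S128R/E129G", "K143T",
             "G144E", "A173D", "Q174R", "R203I", "A208D", "V237E", "G257D"]

for label in mutations:
    delta = charge_change(label)
    direction = "more positive" if delta > 0 else "more negative"
    print(f"{label}: {delta:+d} ({direction})")
```

Under these assumptions, the four arginine-introducing entries come out positive and the remaining nine negative, in line with the description above.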

Changes in the electrostatic potential affect the properties of a protein in many ways. Electrostatics is a significant factor in protein folding and stability, and it has an effect on protein interactions. Electrostatic surface potential has a major role in CD40 ligand–receptor interactions, since the positive surface of the ligand attracts the negative surface of the receptor (Singh et al., 1998).

Effects of mutations on protein–protein interactions

The function of the CD40 ligand is based on two different kinds of protein–protein interactions. The interaction with the receptor is essential for initiating the signalling cascade leading to B-cell activation, and interactions between CD40L monomers enable CD40L trimer formation, which has to occur in order for the interaction with the receptor to take place (Peitsch and Jongeneel, 1993).

The CD40L–CD40 interaction is based on electrostatic interactions whereby basic side chains on the CD40L surface attract acidic side chains on the surface of CD40. CD40L residues K143, R203 and R207 form salt bridges with receptor surface amino acids (Singh et al., 1998). K143 is mutated to T in XHIGM, and thus a salt bridge with the receptor E66 is lost. K143T also affects the CD40L electrostatic surface potential, which contributes to the loss of affinity between the ligand and receptor. The substitution of R203 by I leads to the loss of the ion pair formed with E74 of CD40. The mutation also makes the surface potential more negative.

In addition to the acid–base contacts formed between the ligand–receptor pair, other CD40L residues forming direct contacts (distance between heavy atoms <5 Å) are: I127, S128, E129, E142, G144, Y145, Y146, C178, S185, Q186, A187, R200, F201, C218, Q220, S248, H249, G250, T251 and G252 (Singh et al., 1998). XHIGM-causing mutations occur in residues 128, 129 and 144. When G144 is replaced by glutamate, which has a long side chain and a negative charge, the specificity of the interaction is affected. The introduction of glutamate might result in the loss of the conformational freedom necessary at that position in order to form the corner of the AA'' loop (Karpusas et al., 1995). The orientation of K143 and Y145 might consequently change, making the ligand–receptor interaction less likely to occur.
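The direct contact criterion used above, a distance of less than 5 Å between heavy atoms, can be sketched as follows. The atom representation, the function name and the coordinates are hypothetical and are not taken from the 1ALY structure or from Singh et al. (1998).

```python
from math import dist

def in_contact(atoms_a, atoms_b, cutoff=5.0):
    """True if any pair of heavy (non-hydrogen) atoms is closer than the cutoff.

    Each atom is an (element, (x, y, z)) tuple with coordinates in angstroms.
    """
    return any(
        el_a != "H" and el_b != "H" and dist(xyz_a, xyz_b) < cutoff
        for el_a, xyz_a in atoms_a
        for el_b, xyz_b in atoms_b
    )

# Hypothetical coordinates for illustration only.
ligand_residue = [("N", (0.0, 0.0, 0.0)), ("H", (0.0, 0.0, 1.0))]
near_residue = [("O", (0.0, 0.0, 4.9))]
far_residue = [("O", (0.0, 0.0, 5.5)), ("H", (0.0, 0.0, 4.6))]

print(in_contact(ligand_residue, near_residue))
print(in_contact(ligand_residue, far_residue))
```

Hydrogens are skipped deliberately, so a hydrogen lying within the cutoff does not by itself make two residues count as being in contact.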

The essential contact-forming residues in homotrimer interactions are Y170 and H224 (Karpusas et al., 1995), both of which are mutated in XHIGM (Y170C and H224Y). Neither of these mutations changes the polar nature of the position, but specific contacts between the monomers are potentially affected or hindered. Complex stabilising interactions also important for CD40L trimerisation are formed by Q121, H125, Y145, T147, Y172, Q174, L195, R203, L205, L206, R207, A208, A209, N210, T211, A215, G219, Q220, Q221, S222, L225, G226, G227, V228, F229, E230, T251, G252, F253, L259 and L261 (Morris et al., 1999; Karpusas et al., 2001). There are eight mutations at positions 125, 147, 174, 195, 203, 208, 226 and 227. The residues involved in protein–protein interactions are presented in Fig. 1B. The effects of mutations on the structure and function of CD40L are summarised in Fig. 1C and Table I.


Table I. Summary of the effects of CD40L mutations on structure and function

 

    Discussion
Here, we have investigated the structural effects and consequences of disease-causing mutations in CD40 ligand. We have collected information about disease-causing mutations in immunodeficiencies into databases called IDbases (Piirilä et al., 2006). Currently there are 115 IDbases and 4587 patient cases in them. We have previously applied bioinformatics and structural analysis methods to reveal the basis of e.g. Bruton's tyrosine kinase mutations in X-linked agammaglobulinemia (Vihinen et al., 1994a, 1994b; Väliaho et al., 2006), SH2D1A mutations in X-linked lymphoproliferative disease (Lappalainen et al., 2000), BLM mutations in Bloom syndrome (Rong et al., 2000), mutations in the WAS protein in Wiskott–Aldrich syndrome (Rong and Vihinen, 2000) and mutations in the methyltransferase domain of DNMT3B in immunodeficiency, centromeric instability and facial abnormalities (ICF) syndrome (Lappalainen and Vihinen, 2002). In addition, we have tested and discussed the applicability of sequence and structure-based bioinformatics methods to reveal structure–function correlations of disease-causing missense mutations (Thusberg and Vihinen, 2006). In addition to understanding the molecular basis of disease, the ability to predict the effects of amino acid substitutions is useful for protein engineering purposes.

Most disease causing mutations affect the stability of protein structure (Wang and Moult, 2001; Steward et al., 2003). Thirteen of the thirty-five mutations (one being a double mutant) in CD40L can be classified as functional, directly changing amino acids involved in trimerisation and ligand–receptor interactions. Eight of these mutated amino acids are also involved in stabilisation of the CD40L trimer, which is why they could also be classified as structural mutations (Table I). Because of the correlation between structure and function, the classifications are overlapping.

Conserved amino acids tend to be essential for structure and function, which is why disease-causing mutations often occur at the corresponding positions (Miller and Kumar, 2001; Mooney and Klein, 2002). The probability that a random mutation will cause a genetic disease has been shown to increase with an increase in the degree of site conservation (Vitkup et al., 2003). In the TNF family, sequence conservation is evident, and 37% of CD40L mutations affect Type I and Type II conserved amino acids (Table I, Fig. 1A). There are many covarying positions in the protein family, but only a few of them are conserved in CD40L. None of these sites has been shown to have disease-causing mutations.
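The 37% figure is consistent with a simple tally of the mutations named in the Results. The sketch below is one plausible reading of those counts: it assumes that the mutations at invariant positions and at the Type II conserved positions Q174, Y170 and V237 are the ones being counted. Table I is not reproduced here, so this cannot be confirmed against it.

```python
# Mutations at invariant (Type I) positions, as listed in the Results.
invariant_hits = ["W140C", "W140G", "W140R", "Y169D", "Y169N",
                  "G226A", "G227V", "L231S", "G257D", "G257S"]

# Mutations at Type II conserved positions (Q174, Y170, V237).
type_ii_hits = ["Q174R", "Y170C", "V237E"]

total_missense = 35  # all reported XHIGM missense mutations

conserved = len(invariant_hits) + len(type_ii_hits)
share = conserved / total_missense
print(f"{conserved} of {total_missense} = {share:.0%}")
```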

The members of the TNF family exhibit structural conservation: all of them have a jelly roll fold and Greek key topology. The specific functions of these proteins are governed by the loops connecting the β strands. The length and properties of the loops vary significantly among the family members (Karpusas et al., 1995). Most of the disease-causing mutations in CD40L are located in the β strands (Fig. 1A and C), and are thus predicted to affect protein structure and stability, thereby hindering protein function.

Some missense mutations may increase disorder in the CD40L structure according to our predictions. Although there are several methods available for disorder prediction, they seldom agree on the effects of mutations. The structure-based predictions of the effects of mutations gave further insight into their role in CD40L structure and function. The three-dimensional structure has been determined only for the TNFH domain, thus missense mutations outside the region (M36R and G38R) could not be analysed at the structural level. At the first residues of the structurally determined domain, the structure is not well defined (Karpusas et al., 1995), which is why the effects of the mutations G116R and G116S cannot be reliably predicted at the structural level. The mutations are likely to cause conformational rearrangements in the structure. Mutations introducing arginine can be predicted to affect protein structure, as problems in side chain packing are common when the replacing residue is larger than the one being substituted, especially when the substituted amino acid is glycine. A positively charged side chain may also change fundamental structure-maintaining contacts. In the membrane spanning α-helix, the introduction of positively charged amino acids may cause problems for the stability of the helix or its insertion into the membrane. It has been hypothesised that positively charged substitutions in transmembrane helices act as signals guiding the protein to be degraded in the endoplasmic reticulum (Bonifacino et al., 1991). Indeed, CD40L forms with the mutations M36R or G38R are expressed on the T-cell membrane to a greatly reduced extent (10%) compared to the wild-type protein (Garber et al., 1999) (Table I). The mutations in the transmembrane helix probably cause the XHIGM phenotype by reducing the expression of CD40L on the T-cell surface, thereby decreasing the number of ligand–receptor contacts.

Forty percent of the missense mutations analysed cause major structural changes to the protein, according to the PROBE score for the best rotamer (Table I). Side chains with a low score do not fit into the structure in any conformation and therefore alter the structure already during the folding process. Side chains predicted to cause clashes also lead to at least local rearrangements of the structure. Even subtle changes in protein scaffolding may influence specific protein–protein interactions. The amino acid contacts in the hydrophobic core are crucial for the folding and stability of the protein. Thirteen of the mutations in CD40L affect amino acids forming strong contacts in the hydrophobic core of the protein, thereby causing the loss of a number of structure- and stability-maintaining contacts (Table I).

Electrostatic surface potentials, calculated with the program PyMOL, are suggestive and qualitative (DeLano, 2002). Electrostatic surface potential is an important property of CD40L, since the ligand–receptor interaction is mainly based on the attraction of molecular surfaces having opposite charges (Singh et al., 1998). Our results indicate that changes in the potential are evident for many of the substitutions (Fig. 3, Table I). Mutations that make the potential more negative are likely to affect the affinity between the ligand and receptor, because the positive surface potential of CD40L attracts the negative surface of CD40 (Singh et al., 1998).


Fig. 3. Electrostatic surface potentials in wild-type (A and C) and mutated (B and D) CD40L. In the mutated structures, all missense mutations are included, except for the cases where a single position is affected by several mutations. In these positions, the following mutations are displayed: G116R, V126A, W140R, Y169N and G257D. A large proportion of the mutations alter the surface potential from positive to negative.

 
The consequences of the mutations are diverse, and the different effects on CD40L structure and function are equally represented. Thirty-seven percent of the mutations affect residues known to be crucial for receptor binding or trimerisation (Table I). Electrostatic surface potential, which is also an important factor in protein–protein interactions, is affected by six additional substitutions (Table I). Thus, more than 50% of XHIGM-causing missense mutations are predicted to affect CD40L ligation and trimerisation (Table I), the proportion of structural mutations being slightly larger (63%). Generally, the majority of pathogenic mutations affect structural rather than functional residues (Wang and Moult, 2001; Mooney and Klein, 2002). The analysis of structural and functional consequences of the CD40L mutations identified in XHIGM patients provides insights into the molecular basis of the syndrome. Further analysis at the experimental level will be needed to test our predictive findings and to fully understand the mechanisms behind the disease.
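The proportions quoted in the preceding paragraph can be checked with simple arithmetic. The sketch below assumes that the percentages refer to the 35 distinct missense mutations mentioned in the abstract and infers the underlying counts by rounding; these counts are a reconstruction for illustration, not values read from Table I, and the helper name `implied_count` is hypothetical.

```python
TOTAL_MUTATIONS = 35  # distinct missense mutations, as stated in the abstract


def implied_count(percent, total=TOTAL_MUTATIONS):
    """Nearest whole number of mutations matching a quoted percentage."""
    return round(percent / 100 * total)


# 37% affect residues involved in receptor binding or trimerisation.
binding_or_trimerisation = implied_count(37)

# Six additional substitutions alter the electrostatic surface potential.
electrostatic_only = 6

# Mutations predicted to affect ligation and trimerisation.
functional = binding_or_trimerisation + electrostatic_only
functional_percent = 100 * functional / TOTAL_MUTATIONS

# 63% are classified as structural mutations.
structural = implied_count(63)

# The two classes sum to more than the total, so they must overlap.
minimum_overlap = functional + structural - TOTAL_MUTATIONS

print(f"binding or trimerisation: {binding_or_trimerisation}")
print(f"functional: {functional} ({functional_percent:.1f}%)")
print(f"structural: {structural}")
print(f"minimum overlap: {minimum_overlap}")
```

Under these assumptions, 13 + 6 = 19 mutations (about 54%) fall into the functional class, consistent with "more than 50%", and 22 fall into the structural class. Because 19 + 22 exceeds 35, at least six mutations would have to belong to both classes.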


    Footnotes
 
Edited by Elizabeth Meiering


    Acknowledgements
The financial support from the Tampere Graduate School in Biomedicine and Biotechnology, the Medical Research Fund of the Tampere University Hospital and the EU is gratefully acknowledged.


    References
Allen R.C., et al. (1993) Science 259:990–993.

Altschul S.F., Madden T.L., Schaffer A.A., Zhang J., Zhang Z., Miller W., Lipman D.J. (1997) Nucleic Acids Res. 25:3389–3402.

Armitage R.J., et al. (1992) Nature 357:80–82.

Aruffo A., et al. (1993) Cell 72:291–300.

Bateman A., et al. (2004) Nucleic Acids Res. 32:D138–D141.

Berberich I., Shu G., Siebelt F., Woodgett J.R., Kyriakis J.M., Clark E.A. (1996) EMBO J. 15:92–101.

Berberich I., Shu G.L., Clark E.A. (1994) J. Immunol. 153:4357–4366.

Berezin C., Glaser F., Rosenberg J., Paz I., Pupko T., Fariselli P., Casadio R., Ben-Tal N. (2004) Bioinformatics 20:1322–1324.

Bodmer J.L., Schneider P., Tschopp J. (2002) Trends Biochem. Sci. 27:19–26.

Bonifacino J.S., Cosson P., Shah N., Klausner R.D. (1991) EMBO J. 10:2783–2793.

Burdin N., Peronne C., Banchereau J., Rousset F. (1993) J. Exp. Med. 177:295–304.

Chasman D. and Adams R.M. (2001) J. Mol. Biol. 307:683–706.

Chiti F., Stefani M., Taddei N., Ramponi G., Dobson C.M. (2003) Nature 424:805–808.

Clark E.A. and Ledbetter J.A. (1986) Proc. Natl Acad. Sci. USA 83:4494–4498.

Clark E.A. and Shu G. (1990) J. Immunol. 145:1400–1406.

Coffman R.L., Lebman D.A., Rothman P. (1993) Adv. Immunol. 54:229–270.

Dadgostar H., Zarnegar B., Hoffmann A., Qin X.F., Truong U., Rao G., Baltimore D., Cheng G. (2002) Proc. Natl Acad. Sci. USA 99:1497–1502.

DeLano W.L. (2002) DeLano Scientific, San Carlos, CA, USA. http://www.pymol.org.

Fernandez-Escamilla A.M., Rousseau F., Schymkowitz J., Serrano L. (2004) Nat. Biotechnol. 22:1302–1306.

Ferrer-Costa C., Gelpi J.L., Zamakola L., Parraga I., de la Cruz X., Orozco M. (2005) Bioinformatics 21:3176–3178.

Finkelman F.D., Holmes J., Katona I.M., Urban J.F., Beckmann M.P., Park L.S., Schooley K.A., Coffman R.L., Mosmann T.R., Paul W.E. (1990) Annu. Rev. Immunol. 8:303–333.

Fuleihan R., Ramesh N., Loh R., Jabara H., Rosen R.S., Chatila T., Fu S.M., Stamenkovic I., Geha R.S. (1993) Proc. Natl Acad. Sci. USA 90:2170–2173.

Garber E., Su L., Ehrenfels B., Karpusas M., Hsu Y.M. (1999) J. Biol. Chem. 274:33545–33550.

Gilis D. and Rooman M. (2000) Protein Eng. 13:849–856.

Glaser F., Pupko T., Paz I., Bell R.E., Bechor-Shental D., Martz E., Ben-Tal N. (2003) Bioinformatics 19:163–164.

Harigai M., et al. (2004) Arthritis Rheum. 50:2167–2177.

Hostager B.S., Catlett I.M., Bishop G.A. (2000) J. Biol. Chem. 275:15392–15398.

Karpusas M., Hsu Y.M., Wang J.H., Thompson J., Lederman S., Chess L., Thomas D. (1995) Structure 3:1031–1039.

Karpusas M., Lucci J., Ferrant J., Benjamin C., Taylor F.R., Strauch K., Garber E., Hsu Y.M. (2001) Structure 9:321–329.

Kindler V., Matthes T., Jeannin P., Zubler R.H. (1995) Eur. J. Immunol. 25:1239–1243.

Klaus S.J., Berberich I., Shu G., Clark E.A. (1994) Semin. Immunol. 6:279–286.

Kroczek R.A., Graf D., Brugnoni D., Giliani S., Korthuer U., Ugazio A., Senger G., Mages H.W., Villa A., Notarangelo L.D. (1994) Immunol. Rev. 138:39–59.

Kwasigroch J.M., Gilis D., Dehouck Y., Rooman M. (2002) Bioinformatics 18:1701–1702.

Lappalainen I., Giliani S., Franceschini R., Bonnefoy J.Y., Duckett C., Notarangelo L.D., Vihinen M. (2000) Biochem. Biophys. Res. Commun. 269:124–130.

Lappalainen I. and Vihinen M. (2002) Protein Eng. 15:1005–1014.

Levy J., et al. (1997) J. Pediatr. 131:47–54.

Li X., Romero P., Rani M., Dunker A.K., Obradovic Z. (1999) Genome Inform. Ser. Workshop Genome Inform. 10:30–40.

Li Y.Y., Baccam M., Waters S.B., Pessin J.E., Bishop G.A., Koretzky G.A. (1996) J. Immunol. 157:1440–1447.

Linding R., Jensen L.J., Diella F., Bork P., Gibson T.J., Russell R.B. (2003a) Structure 11:1453–1459.

Linding R., Russell R.B., Neduva V., Gibson T.J. (2003b) Nucleic Acids Res. 31:3701–3708.

Linding R., Schymkowitz J., Rousseau F., Diella F., Serrano L. (2004) J. Mol. Biol. 342:345–353.

Lovell S.C., Davis I.W., Arendall W.B. 3rd, de Bakker P.I., Word J.M., Prisant M.G., Richardson J.S., Richardson D.C. (2003) Proteins 50:437–450.

Lovell S.C., Word J.M., Richardson J.S., Richardson D.C. (2000) Proteins 40:389–408.

Miller M.P. and Kumar S. (2001) Hum. Mol. Genet. 10:2319–2328.

Mooney S.D. and Klein T.E. (2002) BMC Bioinformatics 3:24.

Morris A.E., Remmele R.L. Jr, Klinke R., Macduff B.M., Fanslow W.C., Armitage R.J. (1999) J. Biol. Chem. 274:418–423.

Muñoz V. and Serrano L. (1997) Biopolymers 41:495–509.

Ng P.C. and Henikoff S. (2001) Genome Res. 11:863–874.

Noelle R.J., Roy M., Shepherd D.M., Stamenkovic I., Ledbetter J.A., Aruffo A. (1992) Proc. Natl Acad. Sci. USA 89:6550–6554.

Notarangelo L.D., Duse M., Ugazio A.G. (1992) Immunodefic. Rev. 3:101–121.

Notarangelo L.D. and Peitsch M.C. (1996) Immunol. Today 17:511–516.

Park J.H. and Levitt L. (1993) Blood 82:2470–2477.

Pei J. and Grishin N.V. (2001) Bioinformatics 17:700–712.

Peitsch M.C. and Jongeneel C.V. (1993) Int. Immunol. 5:233–238.

Piirilä H., Väliaho J., Vihinen M. (2006) Hum. Mutat. 27:1200–1208.

Pinchuk L.M., Klaus S.J., Magaletti D.M., Pinchuk G.V., Norsen J.P., Clark E.A. (1996) J. Immunol. 157:4363–4370.

Raingeaud J., Whitmarsh A.J., Barrett T., Derijard B., Davis R.J. (1996) Mol. Cell Biol. 16:1247–1255.

Romero P., Obradovic Z., Dunker A.K. (1997) Genome Inform. Ser. Workshop Genome Inform. 8:110–124.

Rong S.B., Väliaho J., Vihinen M. (2000) Mol. Med. 6:155–164.

Rong S.B. and Vihinen M. (2000) J. Mol. Med. 78:530–537.

Shen B. and Vihinen M. (2003) Bioinformatics 19:2161–2162.

Shen B. and Vihinen M. (2004) Protein Eng. Des. Sel. 17:267–276.

Singh J., Garber E., Van Vlijmen H., Karpusas M., Hsu Y.M., Zheng Z., Naismith J.H., Thomas D. (1998) Protein Sci. 7:1124–1135.

Sobolev V., Sorokine A., Prilusky J., Abola E.E., Edelman M. (1999) Bioinformatics 15:327–332.

Steward R.E., MacArthur M.W., Laskowski R.A., Thornton J.M. (2003) Trends Genet. 19:505–513.

Sunyaev S., Ramensky V., Koch I., Lathe W. 3rd, Kondrashov A.S., Bork P. (2001) Hum. Mol. Genet. 10:591–597.

Sutherland C.L., Heath A.W., Pelech S.L., Young P.R., Gold M.R. (1996) J. Immunol. 157:3381–3390.

Thompson J.D., Higgins D.G., Gibson T.J. (1994) Nucleic Acids Res. 22:4673–4680.

Thusberg J. and Vihinen M. (2006) Hum. Mutat. 27:1230–1243.

Väliaho J., Smith C.I.E., Vihinen M. (2006) Hum. Mutat. 27:1209–1217.

Vihinen M., Nilsson L., Smith C.I.E. (1994a) Biochem. Biophys. Res. Commun. 205:1270–1277.

Vihinen M., et al. (1994b) Proc. Natl Acad. Sci. USA 91:12803–12807.

Vitkup D., Sander C., Church G.M. (2003) Genome Biol. 4:R72.

Wang Z. and Moult J. (2001) Hum. Mutat. 17:263–270.

Ward J.J., McGuffin L.J., Bryson K., Buxton B.F., Jones D.T. (2004) Bioinformatics 20:2138–2139.

Word J.M., Bateman R.C. Jr, Presley B.K., Lovell S.C., Richardson D.C. (2000) Protein Sci. 9:2251–2259.

Word J.M., Lovell S.C., LaBean T.H., Taylor H.C., Zalis M.E., Presley B.K., Richardson J.S., Richardson D.C. (1999a) J. Mol. Biol. 285:1711–1733.

Word J.M., Lovell S.C., Richardson J.S., Richardson D.C. (1999b) J. Mol. Biol. 285:1735–1747.

Worm M., Ebermayer K., Henz B. (1998) Immunology 94:395–402.

Worm M. and Geha R.S. (1995) Int. Arch. Allergy Immunol. 107:368–369.

Yue P., Melamud E., Moult J. (2006) BMC Bioinform. 7:166.

Received September 19, 2006; revised November 17, 2006; accepted December 19, 2006.

