PEDS Advance Access first published online on December 18, 2007
This version published online on January 9, 2008

Protein Engineering Design and Selection, doi:10.1093/protein/gzm071
All versions of this article: 21/1/11 (most recent), gzm071v2, gzm071v1

© 2007 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

A new expression system for protein crystallization using trimeric coiled-coil adaptors

Birte Hernandez Alvarez, Marcus D. Hartmann, Reinhard Albrecht, Andrei N. Lupas1, Kornelius Zeth and Dirk Linke

Department Protein Evolution, Max Planck Institute for Developmental Biology, Spemannstr. 35, 72076 Tübingen, Germany

1 To whom correspondence should be addressed. E-mail: andrei.lupas{at}tuebingen.mpg.de


    Abstract
 Top
 Abstract
 Introduction
 Materials and methods
 Results
 Discussion
 Funding
 Acknowledgements
 References
 
We repeatedly experienced difficulties in obtaining pure protein of a defined oligomeric state when expressing domains that consist partially or entirely of coiled coils. We therefore modified an established expression vector, pASK-IBA, to generate N- and C-terminal fusions of the cloned domain in heptad register with the GCN4 leucine zipper. GCN4 is a well-characterized coiled coil, for which stable dimeric, trimeric and tetrameric forms exist. To test this expression system, we produced a series of constructs derived from the trimeric autotransporter adhesin STM3691 of Salmonella (SadA), which has a highly repetitive structure punctuated by coiled-coil regions. The constructs begin and end with predicted coiled-coil segments of SadA, each fused in the correct heptad register to the trimeric form of GCN4, GCN4pII. All constructs were expressed at high levels, trimerized either natively or after refolding from inclusion bodies, and yielded crystals that diffracted to high resolution. Thus, fusion to GCN4pII allows for the efficient expression and crystallization of proteins containing trimeric coiled coils. The structure of short constructs can be solved conveniently by molecular replacement using the known GCN4 structure as a search model. The system can be adapted for constructs with dimeric or tetrameric coiled coils, using the corresponding GCN4 variants.

Keywords: coiled coil/expression system/fusion protein/GCN4/trimeric adhesin


    Introduction
Coiled coils are supercoiled bundles of α-helices and represent one of the most frequent structural motifs in proteins (Lupas and Gruber, 2005). Conservative estimates suggest that about 5% of all residues in proteins are present in α-helical coiled coils (Lupas et al., 1991; Jones, 1999; Offer et al., 2002). Their hallmark is the packing geometry of core residues, termed knobs-into-holes (Crick, 1952), which specifies the packing of a residue from one helix (knob) into a space surrounded by four side chains of the facing helix (hole). Thus, symmetry-related hydrophobic residues are arranged side by side to form layers throughout the core of coiled coils. This very regular packing mode results in a strong sequence periodicity, called the ‘heptad repeat’, whose seven positions are denoted a-g, with a and d containing the hydrophobic residues that form the core. This periodicity can be detected easily by sequence inspection or bioinformatic software (Lupas, Van Dyke and Stock, 1991; Delorenzi and Speed, 2002) and provides a natural register, in which coiled coils can be fused together without distortion of their structure. Such fusion proteins can overcome experimental problems that appear when coiled-coil domains are removed from their native context or lack proper trigger sequences that define their oligomeric state (Kammerer et al., 1998, 2005).
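As a minimal illustration of the heptad register described above, the following Python sketch labels each residue of a sequence with its position a-g and lists the core residues at a and d. The function names are hypothetical. The example peptide is the GCN4pII zipper segment mentioned in the Abstract; its sequence is not given in this text and is supplied here as an assumption (its ends match the residues encoded by the FrontGCN4 primers in the Methods), with the register assumed to start on a at the initial methionine.

```python
HEPTAD = "abcdefg"


def heptad_register(sequence, first_position="a"):
    """Label every residue with its heptad position, starting at first_position."""
    offset = HEPTAD.index(first_position)
    return "".join(HEPTAD[(offset + i) % 7] for i in range(len(sequence)))


def core_residues(sequence, first_position="a"):
    """Return (residue number, residue, position) for all a and d positions."""
    register = heptad_register(sequence, first_position)
    return [
        (number, residue, position)
        for number, (residue, position) in enumerate(zip(sequence, register), start=1)
        if position in "ad"
    ]


# Assumed GCN4pII zipper segment (not quoted in the text above).
GCN4PII_SEGMENT = "MKQIEDKIEEILSKIYHIENEIARIKKLI"

if __name__ == "__main__":
    print(GCN4PII_SEGMENT)
    print(heptad_register(GCN4PII_SEGMENT))
    for number, residue, position in core_residues(GCN4PII_SEGMENT):
        print(number, residue, position)
```

Printing the sequence above its register string makes it easy to see which residues fall on the core positions and on which position a segment ends.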

A coiled-coil protein that has been used frequently and successfully in such fusions is GCN4. GCN4 is a transcription factor responsible for the derepression response upon amino acid starvation in yeast (Arndt and Fink, 1986). It is a dimer, and dimerization is mediated by a short domain at the C-terminus, termed the leucine zipper (Landschulz et al., 1988), which forms a very stable coiled coil (Thompson et al., 1993). The oligomeric form of this coiled coil can be altered in defined ways by mutating residues of its hydrophobic core (Harbury et al., 1993). In essence, beta-branched residues in a (Ile, Val) and gamma-branched residues in d (Leu) specify dimers, a reversal of side-chains at these positions specifies tetramers, and a core formed entirely of beta-branched residues forms trimers. Native and mutant GCN4 zippers have been used successfully for the production of chimeric proteins, allowing, among many others, studies of dimeric DNA-binding proteins (Wolfe et al., 2003) and histidine kinases (Wang et al., 2002), dimeric (Perera et al., 2003) and trimeric (Yang et al., 2000) virus envelope proteins, and tetrameric high-affinity artificial antibodies (Pack et al., 1995). In all these cases, the zippers were used to trigger oligomerization of the proteins of interest, and as a side effect, protein stability and solubility were increased.
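The core rules summarized in this paragraph can be written down as a small lookup. This is only a sketch of the simplified rule as stated here (Ile/Val as beta-branched, Leu as gamma-branched); the function name is hypothetical, and real coiled coils with mixed or polar core residues will fall outside it.

```python
BETA_BRANCHED = {"I", "V"}   # residues named in the text as beta-branched
GAMMA_BRANCHED = {"L"}       # residue named in the text as gamma-branched


def expected_oligomer(a_residues, d_residues):
    """Apply the simplified core rule to the residues found at a and d."""
    a, d = set(a_residues), set(d_residues)
    if not a or not d:
        return "undetermined"
    if a <= BETA_BRANCHED and d <= BETA_BRANCHED:
        return "trimer"      # core formed entirely of beta-branched residues
    if a <= BETA_BRANCHED and d <= GAMMA_BRANCHED:
        return "dimer"       # beta-branched in a, gamma-branched in d
    if a <= GAMMA_BRANCHED and d <= BETA_BRANCHED:
        return "tetramer"    # reversal of the dimer pattern
    return "undetermined"    # anything the simplified rule does not cover


if __name__ == "__main__":
    print(expected_oligomer("IIII", "IIII"))
    print(expected_oligomer("IIII", "LLLL"))
    print(expected_oligomer("LLLL", "IIII"))
```

Returning "undetermined" rather than guessing keeps the helper honest for cores that mix residue classes.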

These applications were all customized for specific proteins, so we set out to construct a versatile expression system for medium- to high-throughput applications. We exploit the hydrophobic repeat pattern to connect any protein fragment in the correct structural register to GCN4, provided it itself contains a coiled-coil segment N- or C-terminally. This requires that the native oligomeric state of the target protein is known, so that the appropriate form of GCN4 can be chosen, and that the heptad pattern of the coiled-coil segments where the fusions are to be made is assigned. Seamless fusion is made possible through the use of Type IIS restriction enzymes that have an asymmetric recognition site with cleavage occurring at a defined distance (Pingoud and Jeltsch, 2001). As an extension to previous approaches, we aim to fuse GCN4 at both ends, wherever possible. Because of the high frequency of coiled coils in proteins, many domains are found naturally flanked by coiled coils and can be approached with our expression system. Our primary targets are trimeric autotransporter adhesins, which have a highly modular domain structure with many interspersed coiled-coil segments (Linke et al., 2006). Expression of individual domains from these proteins did not in general yield useful material for structural studies, as the proteins were frequently insoluble, folded inefficiently or formed mixtures of different oligomers. These problems were solved by fusing the trimeric version of the GCN4 coiled coil, GCN4pII (Harbury et al., 1994), to the constructs N- and C-terminally.


    Materials and methods
Cloning of the vector

The following primers were used in the construction of the vector pIBA-GCN4tri:

FrontGCN4-f: CGACAAAAATCTAGATAACGAGGGCAAAAAATGAAACAGATTGAAGATAAAATTGAAG
FrontGCN4-r: GAATTCGGGACCATGGTCTCCAATCAGTTTTTTAATACGCGCAATTTC
FrontGCN4-x: GAAATTGCGCGTATTAAAAAACTGATTGGAGACCATGGTCCCGAATTC
RearGCN4-f: CAGGGGGACCATGGTCTCAATGAAACAGATTGAAGATAAAATTGAAG
RearGCN4-r: CACAGGTCAAGCTTATTAAATCAGTTTTTTAATACGCGCAATTTC
RearGCN4-x: CTTCAATTTTATCTTCAATCTGTTTCATTGAGACCATGGTCCCCCTG

The FrontGCN4-f and -r primers were used in a PCR reaction to produce the first GCN4 sequence in the vector, including the XbaI site, ribosomal binding site (RBS) and an overlap into the multiple cloning site (MCS). The RearGCN4-f and -r primers were used to produce the second GCN4 sequence, including an overlap into the MCS, the stop codon and the HindIII site. For both reactions, a plasmid containing a GCN4 fusion construct (lab collection) was used as a template. In a third PCR reaction, the MCS from the vector pASK-IBA2 (IBA, Göttingen, Germany) was amplified using the primers FrontGCN4-x and RearGCN4-x, producing a short DNA fragment with overlaps into both GCN4 sequences. The resulting PCR fragments were fused, first the front GCN4 part with the MCS (primers FrontGCN4-f and RearGCN4-x) and second the rear GCN4 part with the MCS (primers FrontGCN4-x and RearGCN4-r). In a last PCR reaction, the GCN4-MCS and MCS-GCN4 fragments were fused using the primers FrontGCN4-f and RearGCN4-r. A sufficiently high annealing temperature was applied to avoid mispriming to other parts of the repetitive GCN4 sequences. The resulting PCR fragment representing the MCS with both GCN4 sequences was cut with XbaI/HindIII and ligated into the vector pASK-IBA2, which in turn was cut with XbaI/HindIII after isolating the plasmid from the methylase-deficient strain Escherichia coli GM2163 (Fermentas, St Leon-Roth, Germany). The sequence of the resulting vector pIBA-GCN4tri was verified using the commercially available sequencing primers for the pASK-IBA vectors. To construct the vector pIBA-GCN4tri-His, the stop codon was removed from pIBA-GCN4tri using a QuikChange kit (Stratagene) and the following primers:

StremP1se: GCGCGTATTAAAAAACTGATTAAGCTTGACCTGTGAAGTGAAAAATGGCGC
StremP1as: GCGCCATTTTTCACTTCACAGGTCAAGCTTAATCAGTTTTTTAATACGCGC
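Mutagenesis primer pairs of this kind are normally exact reverse complements of each other, which is easy to check before ordering. The sketch below does this for the two StremP1 sequences quoted above and also confirms that the HindIII site (AAGCTT) used in the next step is present; the helper name is our own, not part of any kit or published protocol.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


STREM_P1_SE = "GCGCGTATTAAAAAACTGATTAAGCTTGACCTGTGAAGTGAAAAATGGCGC"
STREM_P1_AS = "GCGCCATTTTTCACTTCACAGGTCAAGCTTAATCAGTTTTTTAATACGCGC"
HINDIII_SITE = "AAGCTT"

if __name__ == "__main__":
    print("complementary pair:", reverse_complement(STREM_P1_SE) == STREM_P1_AS)
    print("HindIII site present:", HINDIII_SITE in STREM_P1_SE)
```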

The plasmid was then cut with HindIII. The primers HisP1: GCTTCATCATCATCATCATCACTGAGCTAGC and HisP2: AGCTGCTAGCTCAGTGATGATGATGATGATGA were annealed by mixing them, heating them to 95°C and subsequent slow cooling. The resulting double-stranded fragment contained the necessary overhangs to ligate it into the vector, producing a HindIII site before and an NheI site after the 6xHis tag. After transformation, clones were screened by PCR for proper directionality of the inserts and verified by DNA sequencing.

Cloning of SadA fragments

For construction of SadAK1, SadAK2, SadAK3 and SadAK3His, the following primers were used:

SadAK1 P1: TATTCTTTAAGTCAATCCGTCGCCGACCGACTCGGCGGAGGGGCTTCCGTTAATAGTGATGGTACAGTGAATGCGCCCCTCTACG
SadAK1 P2: AGACGTGTTAAGTGCGCTTAATGCACTACCTACGTTATTGTAGATGCCTGTGCCTACCTCGTAGAGGGGCGCATTCAC
SadAK1 P3-fw: GACCATGGTCTCCGATTTATTCTTTAAGTCAATCCGTCGCCG
SadAK1 P4-rev: GACCATGGTCTCCTCATAGACGTGTTAAGTGCGCTTAATGC
SadAK2 P1-fw: GACCATGGTCTCCGATTACTAACACAGAGGCCTCTGTCGCAGGATTAGCCGAAGACGCGCTGTTGTGG
SadAK2 P2: TGGCGTTTCCCGTGTGGCTAGCGCTAAAGGCGCTGATGCTTTCATCCCACAACAGCGCGTCTTCGGC
SadAK2 P3: CTAGCCACACGGGAAACGCCAGCAAAATCACCAATCTGGCGGCGGGTACCCTGGCTGCGGACAGCACCG
SadAK2 P4-rev: GACCATGGTCTCCTCATCTGAGAGCCGTTAACGGCATCGGTGCTGTCCGCAGCCAGG
SadAK3 P1-fw: GACCATGGTCTCCGATTTTTGATACAAATGAGAAAGTGGATCAGAACACCGCTGATATCACCACCAATACCAACAGCATCAATCAGAACAC
SadAK3 P2-rev: GACCATGGTCTCCTCATGGAATCGCTCAGGTTGTTGATATTGGTGGTGTTGGTGGCAATATCAGTGGTGTTCTGATTGATGCTGTTG

The SadAK1 fragment was amplified by PCR in two steps. The first fragment was obtained by elongation of the annealed primers SadAK1 P1 and SadAK1 P2 in a PCR reaction. This was subsequently used as a template in a second PCR with primers SadAK1 P3-fw and SadAK1 P4-rev. SadAK2 was constructed in three steps. First, primers SadAK2 P1-fw annealed with SadAK2 P2, and SadAK2 P3 annealed with SadAK2 P4-rev, were extended in separate PCR reactions. The resulting products served as templates in a second PCR using primers SadAK2 P1-fw and SadAK2 P4-rev. SadAK3 was amplified in one step by elongation of the annealed primers SadAK3 P1-fw and SadAK3 P2-rev. The resulting DNA fragments were digested with Eco31I and ligated into the vectors pIBA-GCN4tri and pIBA-GCN4tri-His, yielding the constructs SadAK1, SadAK2, SadAK3 and SadAK3His, respectively. Correctness of the clones was verified by DNA sequencing.
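The seamless ligation relies on the cut geometry of Eco31I, an isoschizomer of BsaI: the enzyme recognizes GGTCTC and cuts one base downstream on that strand and five bases downstream on the opposite strand, leaving a four-base 5' overhang. The sketch below reads the overhang off the primers listed above and checks that insert and vector ends are complementary. It assumes the first GGTCTC in each primer is the relevant site and ignores everything else about the digest; the function names are hypothetical.

```python
RECOGNITION_SITE = "GGTCTC"   # BsaI / Eco31I
SPACER = 1                    # bases between the site and the cut on this strand
OVERHANG_LENGTH = 4           # length of the 5' overhang

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    return dna.translate(COMPLEMENT)[::-1]


def type_iis_overhang(primer):
    """Return the four bases that form the 5' overhang after cutting."""
    site = primer.find(RECOGNITION_SITE)
    if site < 0:
        raise ValueError("no GGTCTC site in primer")
    start = site + len(RECOGNITION_SITE) + SPACER
    overhang = primer[start:start + OVERHANG_LENGTH]
    if len(overhang) < OVERHANG_LENGTH:
        raise ValueError("primer too short downstream of the site")
    return overhang


# Vector-side primers (sites flanking the two GCN4pII copies).
FRONT_GCN4_R = "GAATTCGGGACCATGGTCTCCAATCAGTTTTTTAATACGCGCAATTTC"
REAR_GCN4_F = "CAGGGGGACCATGGTCTCAATGAAACAGATTGAAGATAAAATTGAAG"
# Insert-side primers for SadAK1.
SADAK1_P3_FW = "GACCATGGTCTCCGATTTATTCTTTAAGTCAATCCGTCGCCG"
SADAK1_P4_REV = "GACCATGGTCTCCTCATAGACGTGTTAAGTGCGCTTAATGC"

if __name__ == "__main__":
    for name, primer in [("FrontGCN4-r", FRONT_GCN4_R), ("RearGCN4-f", REAR_GCN4_F),
                         ("SadAK1 P3-fw", SADAK1_P3_FW), ("SadAK1 P4-rev", SADAK1_P4_REV)]:
        print(name, type_iis_overhang(primer))
```

Read this way, the forward insert overhang pairs with the end of the N-terminal GCN4pII and the reverse insert overhang pairs with the start of the C-terminal GCN4pII, so no extra codons are introduced at either junction.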

Expression

For protein expression, plasmids were transformed into E. coli TOP10 cells (Invitrogen). Shake-flask cultures were grown at 37°C in LB medium to an OD600 of 0.5 and protein expression was subsequently induced with 0.2 µg/ml anhydrotetracycline (AHTC). After 4 h, cells were harvested by centrifugation.

Protein purification/refolding

Bacterial pellets were resuspended in 20 mM Tris/HCl pH 7.4, 40 mM NaCl, 5 mM MgCl2 containing a protease inhibitor mix (Roche), PMSF and DNase I. Cells were lysed using a French press. Lysed cells overexpressing SadAK1 were centrifuged at 35 000 rpm for 40 min at 4°C using an ultracentrifuge (Beckman). The resulting pellet was resuspended in 50 mM Tris/HCl pH 8.0, 5 M Urea, 2% Triton X-100, 20 mM EDTA and stirred at room temperature for 1 h. The inclusion bodies were pelleted at 15 000 rpm for 30 min (Sorvall SS34) and dissolved in 50 mM Tris/HCl pH 8.0, 6 M guanidinium hydrochloride, 20 mM EDTA. Insoluble particles were separated by centrifugation at 15 000 rpm for 30 min (Sorvall SS34). The protein from the supernatant was refolded by dialysis against 20 mM MOPS pH 7.2, 150 mM NaCl. The SadAK2-containing supernatants of lysed cells were cleared by ultracentrifugation. Eight molar Urea in 20 mM MOPS/NaOH, pH 6.5 was added to the soluble protein fractions and stirred for 1 h at room temperature. After 30 min of centrifugation at 15 000 rpm (Sorvall SS34), the supernatant was loaded on an anion exchange column (QHP sepharose, GE Healthcare). Bound proteins were eluted from the column with a linear gradient of 0–1 M NaCl in 20 mM MOPS pH 6.5, 8 M Urea. Fractions were analyzed by SDS–PAGE. SadAK2-containing fractions were pooled, diluted with 0.1 M citrate buffer pH 3.5, 8 M Urea and loaded on a cation exchange column (MonoS, GE Healthcare). A linear gradient of 0–1 M NaCl in the loading buffer was used to elute bound SadAK2. The SadAK2 fractions were identified by SDS–PAGE and pooled. SadAK2 was refolded by dialysis against 20 mM MOPS/NaOH pH 7.2, 150 mM NaCl and finally purified to homogeneity by size exclusion chromatography using a Superdex 75 column (GE Healthcare) and concentrated. SadAK3 was purified from inclusion bodies as described for SadAK1.
SadAK3His was purified under denaturing conditions from the soluble as well as the insoluble fraction of the cell extract. After French press, the extracts were directly diluted in equilibration buffer (20 mM Tris/HCl pH 7.9, 6 M guanidinium hydrochloride, 0.5 M NaCl, 10% glycerol) and incubated for 1 h at room temperature. After centrifugation for 40 min at 35 000 rpm (Beckman ultracentrifuge, Ti60), the supernatant was loaded on a NiNTA column (GE Healthcare). Bound proteins were eluted using a linear gradient of 0–0.5 M imidazole in equilibration buffer. After analysis by SDS–PAGE, SadAK3His fractions were pooled and dialyzed against 20 mM MOPS/NaOH pH 7.2, 400 mM NaCl and 10% glycerol. The protein was finally purified to homogeneity by gel filtration using a Superdex 75 column (GE Healthcare) in the same buffer and concentrated.

Gel shift assay

Samples were mixed with SDS sample buffer and either used directly or heated to 95°C for 15 min. Unheated samples did not denature. For testing the stability of the proteins against proteolytic degradation, 10 µg of the appropriate protein were incubated with 12 ng Proteinase K (600 units/ml, Fermentas) in 20 mM MOPS pH 7.2, 50 mM NaCl for 15 min at 22°C. The reaction was stopped by addition of PMSF (final concentration 1 mM).

FTIR spectroscopy and secondary structure prediction

Infrared spectra were recorded on a Bruker Optics Tensor27 FTIR spectrophotometer equipped with an ultrathin AquaSpec cuvette. Samples were dialyzed to equilibrium against buffer (20 mM MOPS-NaOH pH 7.2, 100 mM NaCl in water), and buffer from this dialysis was used for background measurements. Spectra were accumulated from 32 FTIR scans with a resolution of 4 cm⁻¹ at 25°C. Calculation of protein concentration and protein secondary structure was done with a multivariant pattern recognition method supplied by Bruker Optics. The method uses an FTIR spectra library of more than 40 proteins of known structure and concentration measured in water with a similar instrumental setup as the one used here. Secondary structure predictions were obtained from the Quick2D server at http://toolkit.tuebingen.mpg.de which includes the programs PSIPRED (Jones, 1999), JNET (Cuff and Barton, 2000), PROF (Ouali and King, 2000), and PROFTMB (Rost, 2001). Because of the known problems of profile-based methods in predicting chimeric proteins, which are due to non-homologous sequences being joined into one profile during the sequence search phase, predictions were gathered separately for the GCN4pII linker and for the inserts, and then assembled.

Protein crystallization and X-ray data collection

Proteins SadAK1, -K2 and -K3 were crystallized at concentrations of 7, 19.6 and 5 mg/ml, respectively. Initial crystallization trials were performed by mixing 400 nl of protein with 400 nl of reservoir solution on 3:1, 96-well Corning 3550 plates using the Honeybee 961 crystallization robot (Genomic Solutions). Drop images were obtained with the RockImager 54 device (Formulatrix) and visually inspected. Crystals of sufficient size were mounted directly from the crystallization plates. Highly diffracting crystals of the SadAK1 construct were obtained from condition 24 (12% PEG 4000, 0.1 M Hepes pH 7.5, 0.1 M NaOAc) of the Jena Bioscience I screen. Crystals of the SadAK3 construct in space group P2₁ were obtained from condition 8 (20% PEG 3350, 0.2 M KNO3) of the Wizard III screen (Emerald). Prior to flash-freezing, crystals were stabilized in reservoir solution supplemented with 20% ethylene glycol. Crystals of the SadAK2 construct were obtained from condition 3 (50% PEG 200, 0.1 M Na-citrate pH 5.5) of the Emerald Cryo I screen and frozen directly. Screening of crystals was done at the MPG beamline BW6/DESY (Hamburg, Germany), and data sets were collected at the SLS beamline PXII (Villigen, Switzerland). Data collection was performed at 100 K and all diffraction images were collected on a MAR225 detector (mar research). Diffraction images were integrated and scaled with the programs XDS and XSCALE from the XDS program package (Kabsch, 1993). Structures were solved by molecular replacement, using the GCN4pII structure (PDB entry 1GCM) as the search model for the GCN4pII part of the constructs. Molecular replacement calculations were done with MOLREP (Vagin and Teplyakov, 1997).


    Results
As a basis for our expression system, we chose the commercially available pASK-IBA vectors (IBA, Göttingen, Germany, http://www.iba-go.de). Expression in this system is induced by AHTC; it is thus tightly controlled and does not require E. coli strains that express T7 polymerase. In several PCR steps, an MCS was assembled that retained the general features of the original vector, i.e. the transcriptional start and the ribosomal binding site, the sequencing primer sites and all of the recognition sites for type IIS restriction enzymes. The latter are important for the ability to fuse coiled-coil fragments with GCN4 in the correct heptad register as they cut outside of their recognition sequence.

In our experiments, we used the trimeric variant of GCN4, GCN4pII, which has a core formed entirely of isoleucine (Harbury, Kim and Alber, 1994). Any trimeric coiled coil can be fused to it to form a continuous structure, provided the periodicity of the heptad repeat is maintained. In our vector, the N-terminal GCN4pII ends on a, so the construct to be fused to it should start at b, and the C-terminal GCN4pII starts at a, so the construct to be fused to it should end on g. Using enzymes with cleavage sites in the MCS that cut DNA far from their recognition sequence, such as BsaI, one can design primers that allow for fusions without changes in the amino acid sequence and thus without disturbance of the heptad register (Fig. 1). This results in chimeric proteins, in which two copies of GCN4pII flank the protein of interest.
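The register rule for the two junctions (insert starts on b, insert ends on g) can be sketched as a helper that trims a register-annotated fragment to valid fusion boundaries. The register string in the example is invented for illustration, with '.' marking residues outside predicted coiled coils; the function name and the annotation format are assumptions, not part of the vector system.

```python
def fusion_boundaries(register):
    """Return (start, end) such that register[start:end] starts on b and ends on g.

    The first b is chosen as the start and the last g as the end, so the
    fragment keeps as much of the flanking coiled-coil segments as possible.
    """
    if "b" not in register or "g" not in register:
        raise ValueError("fragment needs a b and a g position to be fused")
    start = register.index("b")
    end = register.rindex("g") + 1
    if start >= end:
        raise ValueError("no b position precedes a g position")
    return start, end


if __name__ == "__main__":
    # Hypothetical annotation: coiled coil, unassigned domain, coiled coil.
    example_register = "efgabcdefg......abcdefgab"
    start, end = fusion_boundaries(example_register)
    print(example_register[start:end])
```

In practice the two flanking segments may be in different registers relative to each other, which is why each end is checked separately rather than by counting residues across the whole insert.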


Fig. 1. Vector map and MCR. Schematic view of the plasmid pIBA-GCN4tri. The plasmids pIBA-GCN4tri and pIBA-GCN4tri-His differ only in the region before the stop codon, as shown in the lower panel. The sequencing primer sites of the original vector (pASK-IBA2) were retained. The heptad register of the GCN4pII coiled coil (a-g) is shown below the amino acid sequence.

 
As a test case for the vector, which we named pIBA-GCN4tri, we used fragments of the Salmonella enterica adhesin STM3691 (McClelland et al., 2001) [called SadA for ‘Salmonella adhesin A’, by analogy to YadA from Yersinia enterocolitica (Hoiczyk et al., 2000) and BadA from Bartonella henselae (Riess et al., 2004)]. SadA belongs to the family of trimeric autotransporter adhesins, a group of highly repetitive, fibrous surface proteins of Gram-negative bacteria, in which domains are frequently separated by short segments of trimeric coiled coil (Linke et al., 2006).

We found it easier to clone fragments of SadA by assembling them from long primers, because the highly repetitive nature of the gene resulted in PCR problems when using genomic DNA as a template. We chose fragments that fulfilled two conditions: first, that their structure could not be predicted from sequence, and second, that they would be flanked N- and C-terminally by coiled coils.

Expression and refolding

The constructs chosen for expression and purification, including their predicted coiled-coil register, are shown in Fig. 2. Their solubility under overexpression conditions in the E. coli cytosol depended on the inserted protein domain: fragment SadAK2 was expressed in a soluble form, but SadAK1 was completely localized in inclusion bodies and SadAK3 as well as SadAK3His, the 6xHis-tagged version obtained by cloning SadAK3 into the vector pIBA-GCN4tri-His, were only partially found in the soluble fraction. Attempts to purify the soluble constructs in native form resulted in lower yields and less purity compared to purification under denaturing conditions, so we decided to purify all proteins under denaturing conditions. Refolding was done by dialysis and worked best in buffers with high ionic strength, with yields above 90%, as estimated from the amount of protein that was soluble after removal of the denaturant (urea or guanidine). The proteins were pure after a final size exclusion chromatography step, as assessed by SDS–PAGE (see also Fig. 3). We found that affinity chromatography with agarose-coupled dyes like Reactive Blue-4 and Reactive Green-19 (Sigma) could also be used to purify the constructs, a property that appeared to be independent of the insert, suggesting a generally applicable alternative for the purification of GCN4-tagged constructs soluble in low-salt buffers (data not shown). At the moment, we are raising antibodies against GCN4pII peptides, which will allow us to detect all constructs and to perform coimmunoprecipitations and other functional assays, independently of the insert.


Fig. 2. Expressed sequences. Schematic view of the constructs used in this study. The SadA parts of the sequence are underlined and are flanked by GCN4pII N- and C-terminally. The heptad register of core residues (a and d) was assigned by sequence inspection, except for the C-terminal end of the SadAK2 insert, where it was assigned by reference to the structure of the homologous YadA neck (1P9H). Secondary structure was predicted with PSIPRED (Jones, 1999), JNET (Cuff and Barton, 2000), PROF (Ouali and King, 2000) and PROFTMB (Rost, 2001), as described in the Methods.

 

Fig. 3. Gel shift assays. SDS–PAGE gels of the constructs described in this study. Samples are not degraded by Proteinase K, indicating a stable, folded structure. SadAK2 and SadAK3, but not SadAK1, migrate as oligomers in unboiled samples.

 
Ascertaining the folded state

After purification and refolding, we tested the fusion proteins for their stability against Proteinase K digestion and heat denaturation (Fig. 3). Proteinase K-treated and untreated samples were either boiled in SDS sample buffer or just mixed with the buffer and subjected to SDS–PAGE directly. All three proteins were resistant to proteolytic degradation, indicating that they were assembled and folded. SadAK2 and SadAK3, but not SadAK1, were also resistant to denaturation by SDS in unheated sample buffer and migrated at an apparent molecular weight indicative of trimers. SadAK3His behaved exactly like untagged SadAK3 in this assay (data not shown). Many coiled-coil proteins, including domains of trimeric autotransporter adhesins, such as the membrane anchors of Yersinia YadA (Wollmann et al., 2006) and Neisseria NhhA (Scarselli et al., 2006), are thermostable and resistant to denaturation by SDS.

We further verified the folded state of the constructs and measured their secondary structure content with FTIR spectroscopy. The difference spectra are shown in Fig. 4. All three constructs gave clear α-helical signals in the amide I and amide II bands, as expected for folded proteins with a high coiled-coil content. The secondary structure content calculated with the Bruker spectrum analysis software agreed well with the predicted secondary structure (Fig. 2 and Table I).


Fig. 4. FTIR spectroscopy. FTIR spectra of the test constructs. The overall spectral features do not differ significantly between the three constructs, as expected from secondary structure prediction. The spectra are dominated by the α-helical signal from the two GCN4 tags in the amide I (ca. 1650 cm⁻¹) and amide II (ca. 1550 cm⁻¹) bands.

 

Table I. FTIR results

 
Crystallography

We performed standard crystallization screens with all three constructs and obtained crystals under many conditions. Most crystals diffracted to better than 2 Å resolution without further refinement of the crystallization conditions. Unfortunately, some crystal forms were prone to twinning. SadAK1 and SadAK3 yielded crystals in multiple space groups and, while the datasets obtained in P2₁ were not twinned, the twinning fraction in other space groups was always above 30%. For SadAK2, all crystals tested so far were of space group P321 with a twinning fraction of about 10%. Statistics for datasets successfully used for structure solution are shown in Table II.


Table II. X-ray data collection

 
For constructs of limited length (~60 amino acids, the length of the two GCN4 linkers combined), the GCN4-fusion proteins are well suited for molecular replacement, using the known GCN4 structure as a search model. In our test cases, the orientation of the trimers in the unit cell was deducible a priori from the chi = 120° section of the self-rotation function as shown in Fig. 5. While the 3-fold symmetry axis of the constructs is tilted versus the c-axis for constructs SadAK1 and SadAK3 in space group P2₁, it runs parallel to it for construct SadAK2 in space group P321. As follows from packing density considerations, the SadAK2 crystals contain only one monomer per asymmetric unit and the trimer is built around the crystallographic threefold axis. Using this approach, the structures of all three constructs could be solved conveniently. We are currently working to solve exemplars of all SadA domains with this approach, in order to reconstruct the entire fiber, and will publish the three structures in that context.
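To make the chi = 120° section concrete: a peak in that section marks an axis about which a 120° rotation maps the structure onto itself, i.e. a 3-fold axis. The sketch below is purely geometric and is not the Molrep calculation; it builds the rotation matrix for an arbitrary axis (Rodrigues' formula) and shows that three successive 120° rotations return a point to its starting position. The function names and the example axis and point are our own choices.

```python
import math


def rotation_matrix(axis, chi_degrees):
    """Rotation matrix for a rotation by chi about the given axis."""
    x, y, z = axis
    norm = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / norm, y / norm, z / norm
    c = math.cos(math.radians(chi_degrees))
    s = math.sin(math.radians(chi_degrees))
    t = 1.0 - c
    return [
        [t * x * x + c, t * x * y - s * z, t * x * z + s * y],
        [t * x * y + s * z, t * y * y + c, t * y * z - s * x],
        [t * x * z - s * y, t * y * z + s * x, t * z * z + c],
    ]


def rotate(matrix, point):
    """Apply a 3x3 rotation matrix to a point given as a 3-tuple."""
    return tuple(sum(matrix[i][j] * point[j] for j in range(3)) for i in range(3))


def is_close(p, q, tolerance=1e-9):
    return all(abs(a - b) < tolerance for a, b in zip(p, q))


if __name__ == "__main__":
    threefold = rotation_matrix((1.0, 1.0, 0.5), 120.0)  # arbitrary tilted axis
    start = (1.0, 2.0, 3.0)
    point = start
    for step in range(1, 4):
        point = rotate(threefold, point)
        print(step, is_close(point, start))
```

A trimer whose axis is tilted relative to the cell axes, as for SadAK1 and SadAK3, gives a peak away from the pole of the section, whereas an axis parallel to c gives a peak at the pole.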


Fig. 5. Crystallography. (A) chi = 120° sections of self-rotation functions for the three datasets listed in Table II, generated with the program Molrep (Vagin and Teplyakov, 1997). (B) Schematic stereo view of crystal packing for the untwinned SadAK3 structure (N-terminal GCN4pII: light grey, K3 insert: black, C-terminal GCN4pII: dark grey).

 
In the untwinned dataset for construct SadAK3, a large fraction of the crystal contacts is mediated by pairs of GCN4 linkers, as illustrated schematically in Fig. 5. A preliminary analysis of one twinned dataset for this construct revealed a packing in which the GCN4 linkers did not interact with each other over their full length (not shown). Hence, a specific interaction of the linkers over their full length appears to promote a robust crystal packing less prone to twinning.


    Discussion
Long, fibrous proteins are generally more difficult to purify and crystallize than their globular counterparts. Many have adhesive properties or assemble into larger complexes. Divide-and-conquer approaches have been proposed, e.g. for intermediate filaments (Strelkov et al., 2001), in order to make these proteins amenable to structure determination. We have used such an approach successfully for the trimeric autotransporter adhesin SadA, by cloning and purifying one fibrous domain at a time, instead of the complete protein.

Symmetry in proteins, and especially homooligomerization, seems favorable for protein crystallization. Even artificial symmetrization, by forcing proteins into a dimeric state via disulfide bonds, has been used successfully in crystallization trials (Banatao et al., 2006). Our cloning targets are highly symmetrical homotrimers and it is thus not surprising that they crystallized well, once the problems with expression and purification were solved by fusion to a trimeric variant of the GCN4 leucine zipper. We previously encountered problems with twinning in the crystallization of coiled-coil domains, which we overcame here for the constructs SadAK1, SadAK2 and SadAK3 by using GCN4pII tags. To further increase the prospect of success with such constructs, we are exploring two approaches: (i) to break the symmetric shape by fusing a bigger, trimeric globular domain to one of the GCN4 linkers, and (ii) to increase the affinity between the GCN4 linkers, in order to promote a packing that is more robust against twinning. Another improvement in progress is to incorporate further methionine residues into the linkers, in order to allow the solution of longer constructs by anomalous dispersion experiments with SeMet-labeled protein.

We modified an existing expression vector to allow for the easy cloning and efficient expression of proteins containing trimeric coiled coils. Protein fragments that are insoluble, fold inefficiently or form mixtures of oligomers, can be expressed and purified with high yields in soluble and stable form using this system. The obtained fusion proteins are well suited for crystallization, but could also be used in functional assays, such as binding assays, provided that GCN4pII alone is used as a control.

Once crystals are obtained, constructs with short inserts (~60 residues) are well suited for molecular replacement, using the N- and C-terminal GCN4 linkers as a search model. For longer insertions of unknown structure, experimental phasing becomes necessary and insertion of further methionine residues into the GCN4 linkers will improve experimental phasing, especially for inserts with few or no methionines. For both molecular replacement and experimental phasing, the a priori knowledge of the orientation of the trimeric constructs in the unit cell facilitates the structure determination.
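As a rough aid for deciding whether a construct carries enough methionines for SeMet-based phasing, one can simply count them in the construct sequence. The following is a minimal sketch rather than part of the published protocol: the function name `methionine_summary` and both example sequences are hypothetical placeholders, not SadA or GCN4pII sequences, and no particular threshold of methionines per residue is implied.

```python
from typing import Optional, TypedDict


class MetSummary(TypedDict):
    length: int
    met_count: int
    residues_per_met: Optional[float]


def methionine_summary(sequence: str) -> MetSummary:
    """Count methionines (M) in a one-letter protein sequence.

    Hypothetical helper for illustration only; it reports how many
    residues there are per methionine, or None if there are none.
    """
    seq = sequence.strip().upper()
    n_met = seq.count("M")
    return {
        "length": len(seq),
        "met_count": n_met,
        "residues_per_met": (len(seq) / n_met) if n_met else None,
    }


# Invented toy sequences (assumptions, not real construct sequences).
toy_insert = "AKLSEQIDNLTKRVEELS"  # contains no methionine
toy_linker = "MKQIEDKMEEILSKM"     # contains three methionines

if __name__ == "__main__":
    for name, seq in (("insert", toy_insert), ("linker", toy_linker)):
        print(name, methionine_summary(seq))
```

An insert for which `met_count` is zero is the case described above, where additional methionines in the GCN4 linkers would be the only source of anomalous signal.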

The system introduced here can be adapted for the expression of dimeric or tetrameric coiled coils, by using the corresponding GCN4 variants (Harbury et al., 1993). Because of the high abundance of coiled-coil proteins, particularly in eukaryotes, the system is a useful addition to the toolkit of high-throughput methods, with applications in structural genomics, functional screens or ‘dictionary’ approaches to elucidate the structure of all domain types found in a protein family, such as we are currently taking with trimeric autotransporter adhesins.


    Funding
 
This work was supported by institutional funds from the Max Planck Society and by the German Science Foundation (FOR449/LU1165 to A.N.L. and SFB766/B4 to A.N.L. and D.L.).


    Footnotes
 
The originally published version of this paper was incorrect. One of the authors' names was published as Alvarez Birte Hernandez when it should be Birte Hernandez Alvarez.

Edited by Adrian Goldman


    Acknowledgements
 
The authors wish to thank Cedric Hobel for help with DNA-related software and Silvia Deiss and Ines Wanke for technical assistance.


    References
 
Arndt K., Fink G.R. Proc. Natl Acad. Sci. USA (1986) 83:8516–8520.

Banatao D.R., Cascio D., Crowley C.S., Fleissner M.R., Tienson H.L., Yeates T.O. Proc. Natl Acad. Sci. USA (2006) 103:16230–16235.

Crick F.H. Nature (1952) 170:882–883.

Cuff J.A., Barton G.J. Proteins (2000) 40:502–511.

Delorenzi M., Speed T. Bioinformatics (2002) 18:617–625.

Harbury P.B., Zhang T., Kim P.S., Alber T. Science (1993) 262:1401–1407.

Harbury P.B., Kim P.S., Alber T. Nature (1994) 371:80–83.

Hoiczyk E., Roggenkamp A., Reichenbecher M., Lupas A., Heesemann J. EMBO J. (2000) 19:5989–5999.

Jones D.T. J. Mol. Biol. (1999) 292:195–202.

Kabsch W. J. Appl. Crystallogr. (1993) 26:795–800.

Kammerer R.A., Schulthess T., Landwehr R., Lustig A., Engel J., Aebi U., Steinmetz M.O. Proc. Natl Acad. Sci. USA (1998) 95:13419–13424.

Kammerer R.A., Kostrewa D., Progias P., Honnappa S., Avila D., Lustig A., Winkler F.K., Pieters J., Steinmetz M.O. Proc. Natl Acad. Sci. USA (2005) 102:13891–13896.

Landschulz W.H., Johnson P.F., McKnight S.L. Science (1988) 240:1759–1764.

Linke D., Riess T., Autenrieth I.B., Lupas A., Kempf V.A. Trends Microbiol. (2006) 14:264–270.

Lupas A.N., Gruber M. Adv. Protein Chem. (2005) 70:37–78.

Lupas A., Van Dyke M., Stock J. Science (1991) 252:1162–1164.

McClelland M., et al. Nature (2001) 413:852–856.

Offer G., Hicks M.R., Woolfson D.N. J. Struct. Biol. (2002) 137:41–53.

Ouali M., King R.D. Protein Sci. (2000) 9:1162–1176.

Pack P., Muller K., Zahn R., Pluckthun A. J. Mol. Biol. (1995) 246:28–34.

Perera R., Navaratnarajah C., Kuhn R.J. J. Virol. (2003) 77:8345–8353.

Pingoud A., Jeltsch A. Nucleic Acids Res. (2001) 29:3705–3727.

Riess T., et al. J. Exp. Med. (2004) 200:1267–1278.

Rost B. J. Struct. Biol. (2001) 134:204–218.

Scarselli M., Serruto D., Montanari P., Capecchi B., Adu-Bobie J., Veggi D., Rappuoli R., Pizza M., Arico B. Mol. Microbiol. (2006) 61:631–644.

Strelkov S.V., Herrmann H., Geisler N., Lustig A., Ivaninskii S., Zimbelmann R., Burkhard P., Aebi U. J. Mol. Biol. (2001) 306:773–781.

Thompson K.S., Vinson C.R., Freire E. Biochemistry (1993) 32:5491–5496.

Vagin A., Teplyakov A. J. Appl. Crystallogr. (1997) 30:1022–1025.

Wang Y., Gao R., Lynn D.G. ChemBioChem (2002) 3:311–317.

Wolfe S.A., Grant R.A., Pabo C.O. Biochemistry (2003) 42:13401–13409.

Wollmann P., Zeth K., Lupas A.N., Linke D. Int. J. Biol. Macromol. (2006) 39:3–9.

Yang X., Farzan M., Wyatt R., Sodroski J. J. Virol. (2000) 74:5716–5725.

Received September 17, 2007; revised November 7, 2007; accepted November 9, 2007.

