PEDS Advance Access originally published online on January 18, 2008
Protein Engineering Design and Selection 2008 21(3):161-164; doi:10.1093/protein/gzm078
A strategy for generating polyglutamine length libraries in model host proteins
1Department of Chemical and Biological Engineering, University of Wisconsin, Madison, WI 53706, USA 2Department of Bacteriology, University of Wisconsin, Madison, WI 53706, USA
3 To whom correspondence should be addressed: E-mail: regina{at}engr.wisc.edu
| Abstract |
|---|
Huntington's disease is one of nine known neurodegenerative diseases in which a disease-specific protein contains an unusually long polyglutamine (polyQ) stretch. The proteins associated with each disease are unrelated in sequence, size, structure, function or location of the mutation. In all cases, there is an apparent critical number of glutamines below which individuals do not develop disease. Expansion of the polyQ domain is closely associated with misfolding and aggregation of the protein. It is not yet well understood how the length of the polyQ tract and its location within a given protein are related to misfolding and to disease. In this work we developed a strategy for generating length libraries of polyQ-containing proteins, with the polyQ inserted at an arbitrary location. This strategy facilitates systematic, detailed study of the relationship among polyQ length, context and misfolding.
Keywords: aggregation/misfolding/polyglutamine/recombination
| Introduction |
|---|
Protein aggregation is implicated in the pathogenesis of a number of neurological diseases, including β-amyloid peptide in Alzheimer's disease (Lichtenthaler et al., 2002), α-synuclein in Parkinson's disease (Gandhi and Wood, 1999)
~20 in spinocerebellar ataxia 6 and ~60 in Machado–Joseph disease (Gusella and MacDonald, 2000).
To examine the role of polyQ length in folding and aggregation, investigators have used synthetic peptides (Bhattacharyya et al., 2005; Lee et al., 2007) and disease-specific proteins including ataxin-1 (Rich and Varadaraj, 2007), ataxin-3 (Bevivino and Loll, 2001; Masino et al., 2003; Shehi et al., 2003; Chow et al., 2004) and the N-terminal fragment of huntingtin (Scherzinger et al., 1999; Wacker et al., 2004; Poirier et al., 2005). Many of the disease-specific proteins are difficult to express and/or are not structurally characterized. Given that the disease proteins are unrelated except for their common polyQ tract, it has been hypothesized that any protein might serve as an appropriate model host for polyQ. This line of reasoning has motivated studies in which polyQ was inserted into glutathione S-transferase (Masino et al., 2002), chymotrypsin inhibitor-2 (Chen et al., 1999) and apomyoglobin (Tanaka et al., 2001). Interestingly, transgenic mice carrying a non-disease-related protein (hypoxanthine phosphoribosyltransferase), mutated to incorporate a polyQ domain, developed symptoms characteristic of classic polyQ neurodegenerative disease (Ordway et al., 1997).
Biophysical studies on the effect of polyQ on the folding and aggregation of a recombinant host protein necessitate the insertion of a new polyCAG DNA sequence, or expansion or contraction of an existing one. In most previous studies, researchers have synthesized only two to four mutants in which the polyQ length was chosen to be above and below the apparent critical length, based on a hypothesis that there is an abrupt and dramatic transition in physical and biological properties as the length changes from non-pathological to pathological. However, more recent evidence suggests that there are not striking qualitative differences between short and long polyQ domains; rather, the conformation and aggregation propensity may change gradually as the polyQ tract lengthens (Klein et al., 2007). Furthermore, it has been hypothesized that any length of polyQ could cause disease given a sufficiently long lifespan (Gusella and MacDonald, 2006).
We are interested in systematically examining the effect of polyQ length and location on protein folding, stability and aggregation. Practical concerns drove our choice of myoglobin as our host protein. Specifically, a high-resolution crystal structure is available, myoglobin is easily produced and (re)folded, and its predominantly helical structure makes the transition to β-sheet aggregates easy to detect spectroscopically. Furthermore, Tanaka et al. (2001) have successfully used myoglobin as a model for polyQ disease. We required a facile method for inserting varying lengths of polyQ at arbitrary sites within myoglobin, a protein that contains no pre-existing polyQ region. We sought to develop a strategy for generating small libraries of myoglobin mutants containing polyQ domains of incremented, biologically relevant lengths inserted at any chosen site within our host protein.
Several methods for inserting expanded CAG domains into genes and generating length diversity have been reported. Ordway and Detloff (1996) employed a polymerase chain reaction (PCR)-based method in which (CAG)20 and (CTG)20 oligonucleotides were subjected to PCR, yielding CAG repeats from 100 bp (~30 glutamines) to 15 kb. This method yielded polyCAG tracts that were much too long for our purpose, and also necessitated the blunt-end ligation of the PCR product into the gene of interest. Takahashi et al. (1999) devised a PCR-based method to expand or contract the existing polyCAG regions within a gene. However, this method yielded too little diversity for our application and required a pre-existing polyCAG region. Other methods for expansion of pre-existing polyCAG regions allowed for joining [doubling] an existing polyCAG region (Peters and Ross, 1999) or the introduction of stable polyQ-encoding DNA of very specific lengths to pre-existing polyCAG regions (Michalik et al., 2001). Lastly, methods exist for synthesizing very long polyCAG tracts (>100 CAG triplets) (Kim et al., 2005) flanked with restriction enzyme sites for use in ligation into a gene (Sasagawa and Ishiura, 2006). These published methods were not readily adaptable to our application, either because they required an endogenous polyQ domain, which myoglobin does not possess, or because they did not generate the CAG lengths and diversity of interest.
In this paper, we present a novel and simple strategy for site-directed introduction of polyQ-encoding DNA sequences and generation of length libraries. The method generates polyQ libraries with biologically relevant lengths and small increments [2–8 glutamines] between the different lengths. This method will allow us to systematically probe subtle differences among mutants of different lengths.
| Materials and methods |
|---|
Plasmid construction
The sperm whale myoglobin gene in a pETBlue1 expression vector (Novagen, Madison, WI, USA) was a gift from Silvia Cavagnero (University of Wisconsin-Madison, Department of Chemistry). A 6x-histidine tag was added to the C-terminus of the myoglobin gene. Briefly, the gene was PCR-amplified using the primers 5'-GAGAGATCTTGATTGGCTAGC-3' and 5'-TGCGTCGGTACCTCATTAATGGTGATGGTGGTGATGACCCTGGTAACCCAGTTCTTT-3'. The PCR product was then digested with KpnI and NheI (New England Biolabs, Ipswich, MA, USA). A pETBlue1 vector was also digested with KpnI and NheI. Digested DNA fragments (gene and vector) were isolated on a 1% agarose gel stained with 0.5 µg/ml of ethidium bromide and purified using a QIAquick Gel Extraction Kit (Qiagen, Valencia, CA, USA). The histidine-tagged myoglobin gene was ligated into the pETBlue1 vector using T4 DNA ligase (Invitrogen, Carlsbad, CA, USA) and overnight incubation at 16°C. The resulting plasmid was named pMb.
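The two cloning primers quoted above can be sanity-checked in a few lines. The following is a minimal sketch, not part of the published protocol: it confirms that the primers carry the NheI (GCTAGC) and KpnI (GGTACC) recognition sequences, and translates the sense strand of the reverse primer to show the appended tag. The reading frame is an assumption, inferred here from the position of the KpnI site; all function and variable names are illustrative.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons, reading from the first base."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Primer sequences as printed in the text (5' to 3').
FORWARD = "GAGAGATCTTGATTGGCTAGC"
REVERSE = "TGCGTCGGTACCTCATTAATGGTGATGGTGGTGATGACCCTGGTAACCCAGTTCTTT"
NHEI, KPNI = "GCTAGC", "GGTACC"

has_nhei = NHEI in FORWARD
has_kpni = KPNI in REVERSE

# Sense strand of the reverse primer; everything upstream of the KpnI
# site is assumed to be in frame with the myoglobin coding sequence.
sense = reverse_complement(REVERSE)
coding = sense[:sense.index(KPNI)]
peptide = translate(coding)

print("NheI in forward primer:", has_nhei)
print("KpnI in reverse primer:", has_kpni)
print("Encoded C-terminus:", peptide)
```

Under that frame assumption the reverse primer encodes the last residues of myoglobin followed by six histidines and two stop codons, which is consistent with the C-terminal 6x-histidine tag described above.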
The site of polyQ codon insertion was chosen to be within the flexible CD loop of apomyoglobin, between D45 and K48, leading to the deletion of R46 and F47 (Tanaka et al., 2001). A silent mutation was introduced into the R46 codon, changing it from CGT to CGA and creating a novel ClaI restriction enzyme site. The mutagenic primers 5'-CTCTGGAAAAATTCGATCGATTCAAACATCTGAAAACTG-3' and 5'-CAGTTTTCAGATGTTTGAATCGATCGAATTTTTCCAGAG-3' were used in concert with a QuikChange Site-Directed Mutagenesis Kit (Stratagene, La Jolla, CA, USA) to perform the mutation. Mutagenesis was confirmed by sequencing.
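The logic of the silent mutation can be illustrated with a short check. This sketch assumes that the R46 codon begins at index 17 of the top-strand mutagenic primer (an inference from aligning the primer with the D45-R46-F47-K48 region, not a value stated in the text) and reconstructs the wild-type sequence by reverting CGA to CGT. Variable names are hypothetical.

```python
CLAI_SITE = "ATCGAT"
# The six arginine codons of the standard genetic code.
ARG_CODONS = {"CGT", "CGC", "CGA", "CGG", "AGA", "AGG"}

# Mutagenic primers as printed in the text (5' to 3').
MUTANT_TOP = "CTCTGGAAAAATTCGATCGATTCAAACATCTGAAAACTG"
MUTANT_BOTTOM = "CAGTTTTCAGATGTTTGAATCGATCGAATTTTTCCAGAG"

# Assumed position of the R46 codon within MUTANT_TOP (0-based).
R46_START = 17


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


mutant_codon = MUTANT_TOP[R46_START:R46_START + 3]

# Revert the single base change to recover the wild-type stretch.
wild_type_top = MUTANT_TOP[:R46_START] + "CGT" + MUTANT_TOP[R46_START + 3:]

# Silent: both the original and the new codon encode arginine.
is_silent = mutant_codon in ARG_CODONS and "CGT" in ARG_CODONS

# The ClaI site must be present only after the change.
creates_site = CLAI_SITE in MUTANT_TOP and CLAI_SITE not in wild_type_top

# The two primers should be exact reverse complements of each other.
primers_pair = reverse_complement(MUTANT_TOP) == MUTANT_BOTTOM

print("R46 codon in mutant:", mutant_codon)
print("Silent change:", is_silent)
print("New ClaI site:", creates_site)
print("Primers complementary:", primers_pair)
```

The check only covers the primer-length window; whether ClaI is unique in the full plasmid would have to be confirmed against the complete pMb sequence, which is not given here.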
Site-directed introduction of polyQ codons
The pMb plasmid was digested overnight at 37°C with ClaI (New England Biolabs, Ipswich, MA, USA), then isolated on a 1% agarose gel, stained with 0.5 µg/ml ethidium bromide and purified using a QIAquick Gel Extraction Kit. Two polyQ mutagenic primers were designed to PCR-amplify the entire linear pMb plasmid excluding R46 and F47. To the non-priming (5') end of each primer, a 5'-(CAACAG)8 or 5'-(CTGTTG)8 sequence was added, yielding the following polyQ mutagenic primers: 5'-(CTGTTG)8ATCGAATTTTTCCAGAGTTTCC-3' and 5'-(CAACAG)8AAACATCTGAAAACTGAAGC-3'. The sextet repeat CAACAG was chosen to promote polyQ repeat codon stability during protein production (Laccone et al., 1999). PCR was conducted using Pfx DNA polymerase (Invitrogen) and the manufacturer's protocol with the following exceptions: the extension time was changed to 4.5 min and the reaction included ~30 ng linear pMb template and 2.5 U of polymerase. PCR was performed for 35 cycles on a PTC-200 Thermocycler (MJ Research, Waltham, MA, USA).
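The primer shorthand above can be expanded and checked programmatically. The sketch below is illustrative only: it writes out the (CAACAG)8 and (CTGTTG)8 tails, confirms that each encodes 16 glutamine codons, that the two tails are reverse complements of one another (the property that allows the PCR product ends to recombine), and that the priming regions flank the deleted R46 and F47 codons. Helper names are assumptions.

```python
GLN_CODONS = {"CAA", "CAG"}  # both encode glutamine
REPEATS = 8

# Priming (3') regions as printed in the text.
FWD_PRIMING = "AAACATCTGAAAACTGAAGC"
REV_PRIMING = "ATCGAATTTTTCCAGAGTTTCC"


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Expand the shorthand into full-length primers.
fwd_tail = "CAACAG" * REPEATS
rev_tail = "CTGTTG" * REPEATS
fwd_primer = fwd_tail + FWD_PRIMING
rev_primer = rev_tail + REV_PRIMING

# Split the sense tail into codons and confirm that all are glutamine.
tail_codons = [fwd_tail[i:i + 3] for i in range(0, len(fwd_tail), 3)]
all_gln = all(codon in GLN_CODONS for codon in tail_codons)

# Complementary tails give the two product ends homology to each other.
tails_complementary = reverse_complement(rev_tail) == fwd_tail

# The forward priming region starts at a lysine codon (AAA, K48) and the
# sense strand of the reverse priming region ends at aspartate (GAT, D45).
resumes_at_lys = FWD_PRIMING.startswith("AAA")
stops_at_asp = reverse_complement(REV_PRIMING).endswith("GAT")

print("Forward primer length:", len(fwd_primer))
print("Reverse primer length:", len(rev_primer))
print("Glutamine codons per tail:", len(tail_codons), "all Gln:", all_gln)
print("Tails complementary:", tails_complementary)
print("Flanks D45/K48:", stops_at_asp and resumes_at_lys)
```

Alternating CAA and CAG keeps the tract glutamine-only at the protein level while breaking up the pure CAG repeat at the DNA level, in line with the stability rationale cited above.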
Ultracompetent MC1061/P3 bacteria (Invitrogen) were used for transformation with 1 µl PCR reaction products. This strain contains the recA gene, unlike most cloning Escherichia coli strains, thus allowing the bacteria to recombine the polyCAA/CAG ends of the PCR products at varying lengths. A 200 µl volume of transformation reaction product was spread on LB agar plates containing 50 µg/ml ampicillin and incubated overnight at 37°C. The following day, single colonies were inoculated in 3 ml LB media containing 50 µg/ml ampicillin and grown in a shaker-incubator at 37°C overnight. The plasmids were purified using a QIAprep Spin Miniprep Kit and subjected to a double restriction enzyme digestion using NheI and KpnI. The DNA fragments from the plasmid digests were separated on a 1% agarose gel stained with 0.5 µg/ml ethidium bromide. Plasmids were sequenced to confirm insertion of polyQ codons.
Protein production and purification
Several of the identified polyQ codon-containing mutated plasmids (named pMb-QXX(CD)) were transformed into TUNER(DE3)pLacI cells (Novagen, Madison, WI, USA). Briefly, single colonies were grown overnight in 100 ml LB media containing 50 µg/ml ampicillin, 34 µg/ml chloramphenicol and 1% glucose. The overnight culture was used to inoculate 1000 ml of identical medium to an initial optical density (600 nm) of 0.2. The culture was grown into the logarithmic phase (OD600 nm = 0.4–0.7) at which time protein production was induced by adding isopropyl β-D-1-thiogalactopyranoside (Sigma, St. Louis, MO, USA) to 1 mM. The culture was grown to a final OD600 nm of ~2.0. Protein was purified under denaturing conditions using nickel affinity chromatography. Briefly, the cells were centrifuged, the cell pellet was resuspended in 50 ml 100 mM NaH2PO4, 10 mM Tris·Cl, 8 M Urea, pH 8.0, and the suspension was stirred on ice for 30 min to lyse the cells. The lysate was centrifuged at 20 000g for 20 min. The supernatant was collected and loaded onto a gravity-fed affinity column packed with Ni-NTA agarose (Qiagen). Non-specific-binding proteins were washed from the column using the lysis buffer at pH 6.3. The proteins were refolded on the affinity column by rapidly washing the column with PBSA. Pure protein was eluted in PBSA+250 mM imidazole and analyzed by SDS–PAGE on a 4–20% polyacrylamide gel (Pierce Biotechnology, Rockford, IL, USA). The samples were boiled for 3 min before loading. Electrophoresis was conducted at 125 V in an XCell SureLock Mini-Cell (Invitrogen) using a running buffer of 100 mM Tris·Cl, 100 mM HEPES and 3 mM SDS at pH 8. Purified horse myoglobin (Sigma) was also run as an additional molecular weight control.
| Results and discussion |
|---|
A schematic depicting our strategy is given in Fig. 1. Briefly, (i) a unique restriction enzyme site was engineered at the desired polyQ insertion site. (ii) The plasmid DNA was digested at the new restriction enzyme site to linearize the plasmid. (iii) After purification, PCR was conducted using mutagenic primers. These primers encode sequences complementary to the ends of the linear plasmid with 5' extensions encoding eight pairs of glutamine sense (CAACAG)8 or anti-sense (CTGTTG)8 codons. (iv) The recA+ E. coli strain MC1061/P3 was transformed directly with the PCR product. Forty-three of the resultant positive transformants were inoculated into LB media with ampicillin and grown overnight. Of these colonies, seven did not grow in liquid media. The plasmid DNA was purified from the remaining 37 colonies and digested. Fragment analysis of the digests showed that 11 plasmids contained either a gene insert of the same size as the wild-type myoglobin (failed polyCAG/CAA introduction) or no plasmid DNA. For the remaining 26 plasmids, the myoglobin gene portion of the plasmid was sequenced and the sequences were analyzed to determine the number of inserted glutamine codons. Table I summarizes these data. Excellent length diversity in the window of interest was obtained. The two most prevalent lengths found were Q16 and Q34. Q16 likely arises due to perfect recombination of the complementary ends of the PCR product. PolyQ inserts of 32 codons likely represent blunt ligation of the PCR product termini, whereas lengths between Q32 and Q16 indicate partial recombination events within the complementary ends. About 20% of the mutants contained tracts longer than 100 codons. We hypothesize that mutants longer than Q32 arise due to either of two mechanisms. First, there may be lengthening of the polyQ codon ends during PCR leading to ends longer than Q16 (i.e. Q20, Q28, etc.) available for recombination after transformation into bacteria. 
Secondly, mutants initially within the Q16–Q32 range may mutate during growth in recA+ bacteria and gain [or lose] codons, as is commonly known to occur (Ohshima et al., 1996).
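The Q16 to Q32 window described above follows from simple arithmetic on the two 16-codon ends. The sketch below is a deliberately simplified model, not the authors' analysis: it assumes that the ends recombine in register with the six-base CAACAG unit, so that each overlapping repeat removes two glutamine codons from the maximum of 32. It does not attempt to model the longer tracts attributed above to PCR lengthening or to instability during growth.

```python
REPEATS_PER_END = 8      # (CAACAG)8 on each end of the PCR product
CODONS_PER_REPEAT = 2    # CAA + CAG


def insert_length(overlap_repeats):
    """Glutamine codons left after the two ends overlap by a whole
    number of CAACAG repeats (simplifying assumption)."""
    if not 0 <= overlap_repeats <= REPEATS_PER_END:
        raise ValueError("overlap must be between 0 and 8 repeats")
    total_repeats = 2 * REPEATS_PER_END - overlap_repeats
    return total_repeats * CODONS_PER_REPEAT


# Full overlap (8 repeats) down to no overlap (blunt joining).
predicted = [insert_length(k) for k in range(REPEATS_PER_END, -1, -1)]

print("Full overlap:", insert_length(REPEATS_PER_END), "codons")
print("No overlap:", insert_length(0), "codons")
print("Predicted lengths:", predicted)
```

Under this assumption the model yields even lengths from Q16 to Q32 in steps of two codons; lengths outside that set would have to arise by the other mechanisms discussed in the text.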
Several plasmids were selected corresponding to myoglobins with polyQ lengths above and below the pathological critical length of 35–40 glutamines. The selected proteins were expressed in E. coli and purified by nickel-affinity chromatography. The gel in Fig. 2 shows the purity of the protein products and confirms their relative sizes. The polyQ variant proteins migrate anomalously, with apparent molecular weights that are somewhat higher than the true molecular weights.
Preliminary results indicate that the wild-type myoglobin and the Q16, Q20 and Q24 mutants can be successfully refolded. Circular dichroism spectra indicated that the helical content of refolded sperm whale myoglobin was consistent with its known crystal structure and was similar to spectra obtained with purchased horse myoglobin (not shown). The helical content of the Q16, Q20 and Q24 mutants decreased as the polyQ tract grew longer. All four proteins bound 1-anilino-8-naphthalene sulfonate similarly (data not shown), suggesting that all retained a heme-binding site. Detailed biophysical studies of these and longer polyQ proteins are underway.
To summarize, we have developed a straightforward method for site-directed insertion of expanded polyCAG/CAA into any gene of interest with generation of a diverse length library. Very preliminary evidence suggests that the method can be further simplified by eliminating steps (i) and (ii) shown in Fig. 1: one may be able to directly perform PCR on circular plasmids and obtain the same PCR product used for transformation (not shown). Using this strategy, polyQ codons can be inserted into proteins at regions that contain no endogenous glutamine codons, and a diverse library of length mutants can be generated from a small pool of plasmids. The incremental changes in polyQ length will allow us to closely study the subtle changes in the folding, stability and aggregation of the mutants as the polyQ tract lengthens.
| Funding |
|---|
National Science Foundation [BES-0330537]; National Institutes of Health [5 T32 GM-08349 to M.D.T., GM-53228 to R.L.K.].
| Footnotes |
|---|
Edited by Lynne Regan
| Acknowledgements |
|---|
We thank Dr. Darrell McCaslin of the University of Wisconsin Biophysics Instrumentation Facility for technical assistance with circular dichroism spectroscopy. We also thank Professor Gary Roberts for several helpful discussions.
| References |
|---|
Bevivino A.E., Loll P.J. Proc. Natl Acad. Sci. USA (2001) 98:11955–11960.
Bhattacharyya A.M., Thakur A.K., Wetzel R. Proc. Natl Acad. Sci. USA (2005) 102:15400–15405.
Chen Y.W., Stott K., Perutz M.F. Proc. Natl Acad. Sci. USA (1999) 96:1257–1261.
Chow M.K.M., Ellisdon A.M., Cabrita L.D., Bottomley S.P. J. Biol. Chem. (2004) 279:47643–47651.
Gandhi S., Wood N.W. Hum. Mol. Genet. (1999) 14:2749–2755.
Gusella J.F., MacDonald M.E. Nat. Rev. Neurosci. (2000) 1:109–115.
Gusella J.F., MacDonald M.E. Trends Biochem. Sci. (2006) 31:533–540.
Jakupciak J.P., Wells R.D. J. Biol. Chem. (1999) 274:23468–23479.
Kim S.-H., Cai L., Pytlos M.J., Edwards S.F., Sinden R.R. BioTechniques (2005) 38:247–253.
Klein F.A.C., Pastore A., Masino L., Zeder-Lutz G., Nierengarten H., Oulad-Abdeighani M., Altschuh D. J. Mol. Biol. (2007) 371:235–244.
Laccone F., Maiwald R., Bingemann S. Hum. Mutat. (1999) 13:497–502.
Lee C.C., Walters R.H., Murphy R.M. Biochemistry (2007) 46:12810–12820.
Lichtenthaler S.F., Beher D., Grimm H.S., Wang R., Shearman M.S., Masters C.L. Proc. Natl Acad. Sci. USA (2002) 99:1365–1370.
Masino L., Kelly G., Leonard K., Trottier Y., Pastore A. FEBS Lett. (2002) 513:267–272.
Masino L., Musi V., Menon R.P., Fusi P., Kelly G., Frenkiel T.A., Trottier Y., Pastore A. FEBS Lett. (2003) 549:21–25.
Michalik A., Kazantsev A., Van Broeckhoven C. BioTechniques (2001) 31:250–254.
Ohshima K., Kang S., Wells R.D. J. Biol. Chem. (1996) 271:1853–1856.
Ordway J.M., Detloff P.J. BioTechniques (1996) 21:609–612.
Ordway J.M., TallaksenGreene S., Gutekunst C.A., Bernstein E.M., Cearley J.A., Wiener H.W., Dure L.S., Linsey R., Hersch S.M., Jope R.S., et al. Cell (1997) 91:753–763.
Peters M.F., Ross C.A. Neurosci. Lett. (1999) 275:129–132.
Poirier M.A., Jiang H., Ross C.A. Hum. Mol. Genet. (2005) 14:765–774.
Prusiner S.B., Scott M.R., DeArmond S.J., Cohen F.E. Cell (1998) 93:337–348.
Rich T., Varadaraj A. PLoS One (2007) 2:e1014. doi:10.1371/journal.pone.0001014.
Sasagawa N., Ishiura S. Anal. Biochem. (2006) 357:308–310.
Scherzinger E., Schweiger K., Lurz R., Lehrach H., Wanker E.E. Phil. Trans. Royal Soc. Lond. B Biol. Sci. (1999) 354:991–994.
Shehi E., Fusi P., Secundo F., Pozzuolo S., Bairati A., Tortora P. Biochemistry (2003) 42:14626–14632.
Takahashi N., Sasagawa N., Suzuki K., Ishiura S. Neurosci. Lett. (1999) 262:45–48.
Tanaka M., Morishima I., Akagi T., Hashikawa T., Nukina N. J. Biol. Chem. (2001) 276:45470–45475.
Wacker J.L., Zareie M.H., Fong H., Sarikaya M., Muchowski P.J. Nat. Struct. Mol. Biol. (2004) 11:1215–1222.
Wanker E.E. Biol. Chem. (2000) 381:937–942.
Received November 16, 2007; revised November 16, 2007; accepted November 20, 2007.