


PEDS Advance Access published online on May 5, 2007

Protein Engineering Design and Selection, doi:10.1093/protein/gzm014

© The Author 2007. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oxfordjournals.org

Incorporating Synthetic Oligonucleotides via Gene Reassembly (ISOR): a versatile tool for generating targeted libraries

Asael Herman1,2 and Dan S. Tawfik1,3

1 Department of Biological Chemistry, Weizmann Institute of Science, Rehovot 76100, Israel

3 To whom correspondence should be addressed. E-mail: tawfik{at}weizmann.ac.il


    Abstract
The directed evolution of proteins has benefited greatly from site-specific methods of diversification such as saturation mutagenesis. These techniques target diversity to a number of chosen positions that are usually non-contiguous in the protein's primary structure. However, the number of targeted positions can be large, thus leading to impractically large library sizes, wherein almost all library variants are inactive and the likelihood of selecting desirable properties is extremely small. We describe a versatile combinatorial method for the partial diversification of large sets of residues. Our library oligonucleotides comprise randomized codons that are flanked by wild-type sequences. Adding these oligonucleotides to an assembly PCR of wild-type gene fragments incorporates the randomized cassettes, at their target sites, into the reassembled gene. Varying the oligonucleotide concentration resulted in library variants that carry a different average number of mutated positions that comprise a random subset of the entire set of diversified codons. This method, dubbed Incorporating Synthetic Oligos via Gene Reassembly (ISOR), was used to create libraries of a cytosine-C5 methyltransferase wherein 45 individual positions were randomized. One library, containing an average of 5.6 mutated residues per gene, was selected, and mutants with wild-type-like activities were isolated. We also created libraries of serum paraoxonase PON1 harboring insertions and deletions (indels) in various areas surrounding the active site. Screening these libraries yielded a range of mutants with altered substrate specificities and indicated that certain regions of this enzyme have a surprisingly high tolerance to indels.

Keywords: directed evolution/rational design/PON1/methyltransferase/insertions/deletions/indels/DNA shuffling


    Introduction
Rational design and directed evolution are the two conceptually contrary strategies that underpin protein engineering. Directed evolution requires no prior knowledge of the target protein, yet it relies on selection capabilities that sample only a minuscule fraction of all possible permutations. Rational and computational designs greatly minimize the number of sequence permutations that are explored (often down to one sequence), but are hampered by the complexity of proteins and our limited knowledge regarding sequence–function relationships. An awareness of the relative strengths and weaknesses of these approaches has led workers to combine them, for example, in ‘semi-rational’ protein engineering (Minshull et al., 2004; Chica et al., 2005; Patrick and Firth, 2005) and in targeting library diversity to a given set of residues (e.g. a defined set of active site positions). Recent examples of this ‘targeted library’ approach include the modification of the substrate specificity of a Cre DNA recombinase (Santoro and Schultz, 2002), an epoxide hydrolase (Rui et al., 2004; Park et al., 2005), a lipase (Reetz et al., 2005) and an esterase (Park et al., 2005). Other ‘targeted library’ approaches utilize computational methods and protein design algorithms. Examples include methods that perform a ‘virtual screening’ of otherwise impossibly large libraries (Hayes et al., 2002), computational methods for the design of enzyme active sites (Dwyer et al., 2004) and algorithms that direct recombination by predicting optimal crossover loci (Voigt et al., 2002).

‘Targeted libraries’ are constructed primarily by directing randomization (by saturation mutagenesis) to specific positions within the gene. Saturation mutagenesis uses synthetic oligonucleotides that encode the desired diversity at the specified positions (for example, see Reetz et al., 2001; Santoro and Schultz, 2002; Antikainen et al., 2003; Reetz, 2004; Rui et al., 2004). The diversified oligonucleotides are incorporated by PCR, or directly cloned into the gene of interest as a cassette. The main drawback of this approach is that it often produces library sizes that are too large to explore fully by available selection strategies. It is possible, therefore, that combining computational design with such repertoire selections might, by allowing the former large degrees of freedom, narrow down the potential library size to a more manageable number.

Rational library design also requires compromises to be made. A typical active site comprises well over 20 non-contiguous residues, and a change of any one of these residues may provide the key to the desired new function. However, simultaneous diversification of many residues creates library sizes that are beyond any available screening capabilities. Typical plate-based screens involve ≤10⁴ variants and can therefore target only two fully randomized positions. Even high-throughput technologies that allow 10¹⁰ variants to be screened can therefore accommodate only six fully randomized positions. Screening only a sample of the library diversity is an option, but one must bear in mind that as the number of diversified positions increases, the number of library variants that are completely inactive (due to the presence of a stop codon, or a mutation that severely undermines stability) increases substantially (Bershtein et al., 2006).
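The arithmetic behind these limits can be sketched as follows. This is a minimal illustration, assuming that each fully randomized position is encoded by an NNS codon (32 codon variants per position, the scheme used later in this work); the function names are hypothetical and chosen only for this example.

```python
# Sketch: how many fully randomized positions fit within a screening capacity.
# Assumption: each position is an NNS codon (N = A/C/G/T, S = C/G).

DEGENERACY = {"N": 4, "S": 2}  # number of bases allowed by each IUPAC symbol used here


def codons_per_position(scheme: str = "NNS") -> int:
    """Number of distinct codons encoded by a degenerate codon scheme."""
    count = 1
    for symbol in scheme:
        count *= DEGENERACY[symbol]
    return count


def library_size(positions: int, scheme: str = "NNS") -> int:
    """Number of DNA-level variants when `positions` codons are fully randomized."""
    return codons_per_position(scheme) ** positions


def max_positions(screen_capacity: int, scheme: str = "NNS") -> int:
    """Largest number of fully randomized positions whose library fits the screen."""
    n = 0
    while library_size(n + 1, scheme) <= screen_capacity:
        n += 1
    return n


if __name__ == "__main__":
    for capacity in (10**4, 10**10):
        print(capacity, max_positions(capacity))
```

Under this assumption, a capacity of 10⁴ accommodates two positions and 10¹⁰ accommodates six, in line with the figures quoted in the text.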

A potential solution to the above obstacle is parsimonious mutagenesis (Balint and Larrick, 1993). As the name suggests, this technique provides a means of partial diversification using oligonucleotides in which the diversified codons comprise a small proportion of mutating bases in an excess of wild-type bases. However, this technique has not been used extensively, possibly because of the high cost of ‘doped’ oligonucleotides and their limited purity.

Our aim was to develop a cost-effective, facile and general method for the creation of targeted libraries by partial diversification. The approach we took, dubbed ISOR, is a simple adaptation of gene shuffling and allows diversification—by substitution, insertion or deletion—of large sets of residues. Each library variant carries a random, and different, subset of mutated residues, with the entire set represented in the complete library. Here we demonstrate ISOR's applicability and versatility in two different systems. Following a bioinformatic analysis, we targeted for diversification 45 individual positions in a DNA methyltransferase (M.HaeIII). We created a series of gene-libraries in which the average number of mutations ranged from 1 to 6, and that combinatorially covered the entire set of 45 positions. We also targeted indels (insertions and deletions) to various structural elements in serum paraoxonase (PON1). Both libraries were screened, and functional variants with wild-type-like specificity, or with altered specificities, were isolated.


    Materials and methods
Oligonucleotides

The libraries described here were constructed with oligonucleotides obtained from MWG Biotech (Ebersberg, Germany), at the lowest purification grade (HPSF purification). More recently, we have also applied oligonucleotides from IDT (Coralville, IA, USA) of standard desalting grade.

Incorporation of oligonucleotides into the M.HaeIII gene

A schematic outline of ISOR is illustrated in Fig. 1. The M.HaeIII gene (992 bp) was initially PCR amplified from the pIVEX2.2-M.HaeIII plasmid using primers LMB2-4 (5'-biotin-labeled) and pIVB-7 (Griffiths and Tawfik, 2003). Approximately 6 µg of this PCR product, in 50 µl of digestion buffer (50 mM Tris–HCl buffer pH 7.5, 10 mM MnCl2), was equilibrated at 20°C in a thermocycler (Eppendorf, Germany). DNaseI (0.05 U) was added and DNA digestion was allowed to proceed for 5 min at 20°C, then terminated by adding 15 µl 0.5 M EDTA and heating at 90°C for 10 min. The gene fragments were separated in a 2% agarose gel and those of 70–100 bp in size were excised and purified using the QIAEX II Gel Extraction Kit (Qiagen). The gene was reassembled by combining 100 ng of purified DNA fragments with varied amounts of oligonucleotides (Supplementary Table II available at PEDS online) and thermocycling in a 50 µl reaction mixture that contained 2.5 U Pfu Turbo DNA polymerase (Stratagene) in the supplied buffer and 0.4 mM of each dNTP. The thermocycling program included one denaturation step at 96°C for 1.5 min, then 35 cycles composed of: (i) a denaturation step at 94°C (30 s); (ii) nine successive hybridization steps separated by 3°C each, from 65°C to 41°C for 1.5 min each (total 13.5 min) and (iii) an elongation step of 1.5 min at 72°C. A final 7 min elongation step at 72°C was added as the last step of the PCR program to allow full elongation of all assembled genes. The full-length assembly product was further amplified in a ‘nested’ PCR reaction with primers LMB2-9 and pIVB-9. In this step, 0.1 µl of the reassembly reaction was used as a template in a standard 50 µl PCR reaction using 2.5 U of Pfu Turbo DNA polymerase (Stratagene). The purified PCR product was digested with EcoRI endonuclease, and the reaction products were analyzed on a 1% agarose gel to establish that our oligonucleotides had been incorporated into the reassembled gene.
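The step-down hybridization ladder in the assembly program can be written out explicitly. The sketch below only restates the temperatures and hold times given above; the function names are hypothetical, and ramping time between steps is ignored, so the total is a lower bound on the real run time.

```python
# Sketch of the assembly thermocycling program described in the text.
# Hold times are in minutes; instrument ramping time is not included.


def hybridization_steps(start_c: int = 65, stop_c: int = 41, step_c: int = 3) -> list[int]:
    """Temperatures (deg C) of the successive hybridization steps in one cycle."""
    return list(range(start_c, stop_c - 1, -step_c))


def program_minutes(cycles: int = 35) -> float:
    """Total hold time of the program, excluding ramping."""
    initial_denaturation = 1.5          # 96 deg C
    denaturation = 0.5                  # 94 deg C, 30 s
    hybridization = 1.5 * len(hybridization_steps())
    elongation = 1.5                    # 72 deg C
    final_elongation = 7.0              # 72 deg C
    per_cycle = denaturation + hybridization + elongation
    return initial_denaturation + cycles * per_cycle + final_elongation


if __name__ == "__main__":
    print(hybridization_steps())   # nine temperatures from 65 down to 41
    print(program_minutes())       # total hold time in minutes
```

With nine 1.5 min hybridization steps per cycle, each cycle holds for 15.5 min, so the 35-cycle program takes a little over 9 h before ramping is counted.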


Fig. 1. Schematic representation of ISOR. As noted in the text, the use of biotinylated DNA, and purification by their capture onto streptavidin-coated beads, is optional.

 
Construction of M.HaeIII gene-libraries

We synthesized 45 oligonucleotides in order to direct random substitutions to 45 different positions within the M.HaeIII gene (Supplementary Table II available at PEDS online). The NNS codon (that gives rise to all 20 amino acid residues and minimizes the frequency of stop codons) was used for all substitutions. Each of the 45 oligonucleotides was 33 bases long and comprised an NNS codon with 30 flanking bases complementary to the wild-type sequence (15 bases from each side of the NNS). All 45 oligonucleotides were mixed with M.HaeIII gene fragments at equimolar ratios and the mixture was assembled as described earlier. To help maintain the diversity created in the assembly reaction (>10¹⁰ genes), the full-length assembly product was enriched, and the un-incorporated oligonucleotides removed as follows: 50 µl of assembly reaction was mixed with 2.5 µl M280 streptavidin-coated magnetic beads (Dynal) and 50 µl buffer (10 mM Tris–HCl buffer, pH 7.4 containing 1 M NaCl, 25 mM EDTA and 15 mM EGTA), and incubated at ambient temperature for 1 h. The beads were rinsed three times with the same buffer, and three times with 50 mM Tris–HCl (pH 8), and resuspended in a 50 µl nested PCR reaction mixture. The resulting PCR product was then cloned back into pIVEX2.2.
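The properties of the NNS codon and the layout of a 33-base oligonucleotide can be illustrated with a short sketch. It assumes the standard genetic code; `design_oligo` and its 0-based `codon_index` argument are hypothetical helpers, and the function writes the oligonucleotide in the sense-strand orientation, whereas the actual sequences used are those listed in Supplementary Table II.

```python
# Sketch: what an NNS codon encodes, and a 15 + NNS + 15 oligonucleotide layout.
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def nns_codons() -> list[str]:
    """All codons matching NNS (N = any base, S = C or G)."""
    return [a + b + c for a in "ACGT" for b in "ACGT" for c in "CG"]


def nns_summary() -> tuple[int, int, int]:
    """Return (number of codons, number of amino acids, number of stop codons)."""
    encoded = [CODON_TABLE[codon] for codon in nns_codons()]
    amino_acids = {aa for aa in encoded if aa != "*"}
    return len(encoded), len(amino_acids), encoded.count("*")


def design_oligo(gene: str, codon_index: int, flank: int = 15) -> str:
    """Replace one codon (0-based index, hypothetical convention) with NNS,
    keeping `flank` wild-type bases on each side."""
    start = 3 * codon_index
    if start < flank or start + 3 + flank > len(gene):
        raise ValueError("not enough flanking sequence around this codon")
    return gene[start - flank:start] + "NNS" + gene[start + 3:start + 3 + flank]


if __name__ == "__main__":
    print(nns_summary())
    toy_gene = "ATG" + "GCT" * 20   # toy sequence, not the M.HaeIII gene
    print(design_oligo(toy_gene, 6))
```

The summary shows 32 codons covering all 20 amino acids with a single stop codon (TAG), which is why NNS is preferred over a fully degenerate NNN codon.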

In vitro methylation assays

Libraries, gene pools from selected libraries and individual cloned genes were all assayed for methylation activity by the same digoxigenin–biotin ELISA-based method (Tawfik and Griffiths, 1998). PCR-amplified DNA (2 nM) was transcribed and translated in vitro with Eco Pro T7 extract (Novagen) for 1 h at 30°C. The temperature was adjusted to 24°C and the reaction mixed with an equal volume of ‘methylation mixture’ (100 mM Tris–HCl buffer, pH 8.5 containing 100 mM NaCl, 20 mM dithiothreitol, 20 mM EDTA, 0.3 mM S-adenosyl-L-Methionine and 30 nM of a 1 kb DIG-folA-3-biotin DNA substrate). Aliquots were collected at various time points, then quenched and incubated in streptavidin-coated 96-well plates (Nunc). The bound DNA was digested with the restriction endonuclease HaeIII (New England Biolabs). Methylation progress was followed by ELISA, using anti-DIG-HRP-conjugated antibodies (Roche). The ELISA signal was plotted against time and the time required to methylate 50% of the restriction/methylation sites (t50) determined.
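One simple way to read t50 off such a time course is linear interpolation between the two time points that bracket the half-maximal signal. The sketch below is an illustration only, not the analysis procedure used here: it assumes that the ELISA signal rises with methylation (methylated DNA is protected from HaeIII and retains its DIG label), that the lowest and highest readings correspond to 0% and 100% methylation, and the function name `t50` is hypothetical.

```python
# Sketch: estimate t50 by linear interpolation of an ELISA time course.
# Assumptions: signal increases with methylation; min and max readings
# represent 0% and 100% methylation of the substrate.


def t50(times: list[float], signals: list[float]) -> float:
    """Time at which the signal first crosses half of its observed range."""
    if len(times) != len(signals) or len(times) < 2:
        raise ValueError("need at least two matched time points")
    low, high = min(signals), max(signals)
    half = low + 0.5 * (high - low)
    points = list(zip(times, signals))
    for (t0, s0), (t1, s1) in zip(points, points[1:]):
        if s0 <= half <= s1 and s1 != s0:
            # Linear interpolation between the bracketing time points.
            return t0 + (half - s0) * (t1 - t0) / (s1 - s0)
    raise ValueError("signal never crosses its half-maximal value")


if __name__ == "__main__":
    # Made-up readings for illustration only.
    print(t50([0, 10, 20, 40], [0.0, 0.4, 0.8, 1.0]))
```

A curve fit would be more robust for noisy data; interpolation is shown only because it makes the definition of t50 concrete.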

In vitro compartmentalization and selection for M.HaeIII activity

Selection for M.HaeIII activity by in vitro compartmentalization took a modified form of that described by Tawfik and Griffiths (1998). Briefly, 100 µl of EcoPro T7 in vitro transcription/translation system (Novagen), plus 0.1 nM biotinylated library DNA and 0.3 mM S-adenosyl-L-Methionine, were added to 1 ml of ice-cold oil mix [4.5% (w/w) Span80, 0.5% (w/w) Tween80 in light mineral oil (Sigma)]. The mixture was homogenized on ice for 5 min at 8000 rpm in an Ultra Turrax T25 (IKA) homogeniser equipped with a disposable shaft (OmniTip). The resulting emulsions were incubated at 25°C for 4 h, and then broken and the biotinylated genes therein captured on streptavidin-coated beads. Non-methylated genes were ‘neutralized’ by digesting with the restriction endonuclease HaeIII (NEB). The methylated and, therefore, undigested genes were subsequently amplified by PCR and re-cloned into pIVEX2.2.

Construction of PON1 libraries

The PON1 variant gene G3C9 (Aharoni et al., 2004) was PCR amplified from a pET32b plasmid using primers 5'-biotin pET-fw (GGCAGCCAACTCAGCTTCC) and pET-bac (CGAACGCCAGCACATGG). The 1065 bp PCR product was used as a template for the creation of ISOR gene libraries harboring indels in different structural elements (Supplementary Table IV available at PEDS online). Once assembled (as earlier), the full-length products were re-cloned into pET32b plasmid.

Screening PON1 libraries

A screen for a range of PON1 activities was applied, essentially as described elsewhere (Aharoni et al., 2004; Harel et al., 2004). Briefly, libraries were transformed into E. coli cells, then grown on nutrient agar plates and replicated with velvet cloth for the esterase screen. A layer of soft agar (0.5%) in activity buffer (50 mM Tris buffer, pH 8.0 containing 1 mM CaCl2) was supplemented with 0.3 mM 2-naphthylacetate (2NA) and 1.3 mg/ml Fast Red (Sigma Aldrich) then poured onto the original agar plates. Colonies that turned red first were picked from the replica plate and used to inoculate 500 µl of LB medium in a 96-deep-well plate. Following growth overnight on a shaker plate (200 rpm) at 30°C, plates were duplicated and lysed with BugBuster (Novagen). The hydrolysis of three substrates, 2NA, paraoxon and γ-thiobutyrolactone, was monitored in cleared crude cell lysates as described elsewhere (Aharoni et al., 2004; Harel et al., 2004). Briefly, aliquots (10–100 µl) of cleared lysate were transferred into transparent polystyrene 96-well plates, and mixed with substrate in activity buffer. Product release was monitored in a plate reader at 405 nm for p-nitrophenol, and 320 nm for 2-naphthol. Hydrolysis of γ-thiobutyrolactone was detected using 5,5'-dithio-bis-2-nitrobenzoic acid at 412 nm as described (Aharoni et al., 2005a, 2005b).


    Results
Incorporation of synthetic oligonucleotides in the process of gene reassembly

Figure 1 presents a schematic of ISOR where a biotinylated PCR product of the target gene is subjected to fragmentation by digestion with DNaseI. The DNaseI fragments are then mixed with a set of synthetic oligonucleotides, and assembled in a process of self-primed extension by Taq polymerase. The assembled genes can be enriched by capture on streptavidin-coated magnetic beads, thereby maintaining the diversity created in the assembly reaction by minimizing mispriming and the amplification of short products. It is worth noting, however, that in most cases magnetic bead separation need not be applied, particularly if the required library diversity is ≤10⁶ genes. In any case, the product (enriched or not) is then amplified in a nested PCR using internal, non-biotinylated primers. It should be noted that the assembly reaction and subsequent PCR amplification will introduce additional point mutations at random. The frequency of such mutations can, however, be controlled by the choice of polymerase. Here, we primarily used a high-fidelity polymerase that gave an average of 0.5 point mutations per gene. The application of ordinary polymerases may result in a much higher frequency of mutations (>2 per 1 kb gene).

Our first goal was to optimize reaction conditions and tune the frequency of oligonucleotide incorporation. To this end, we tested the incorporation of two oligonucleotides, oligo I (40 bases) and oligo II (60 bases), which bore a unique EcoRI restriction site plus two (oligo I) and three (oligo II) randomized (NNS) codons (Supplementary Table II available at PEDS online). Varying amounts of these oligonucleotides were mixed with 100 ng (120 nM) of M.HaeIII gene fragments (generated by DNaseI digestion), and the gene reassembled by self-primed extension. The expected EcoRI restriction pattern was then compared with that obtained with the assembly products (Fig. 2). At an initial oligonucleotide concentration of 320 nM, DNA products with one EcoRI restriction site appeared, indicating a single oligonucleotide had been incorporated. Seventy percent of the DNA products at this concentration, however, had no oligonucleotide incorporated at all. On the other hand, at the highest oligonucleotide concentration tested (800 nM), intact DNA that had no oligonucleotides incorporated was scarce (4%), and the products containing oligo I, oligo II or both were found in equal proportions. It is also notable that the strand used as template for the oligonucleotides' synthesis is generally of no importance, and the libraries described in the subsequent parts of this work were made with oligonucleotides that were all complementary to the same strand. In cases where neighboring residues should be targeted independently, the use of oligonucleotides complementing the opposite strands is recommended. We were able to successfully modify two consecutive codons, using two oligonucleotides that were complementary to opposite strands (data not shown). Using an approach similar to that described earlier, we were able to incorporate oligonucleotides encoding insertions or deletions of 3 or 12 bases (Supplementary Fig. I available at PEDS online).
Thus, both substitutions and indels could be incorporated into genes, and the level of incorporation could be tuned using different oligonucleotide concentrations.


Fig. 2. ISOR with the M.HaeIII gene. Gene fragments obtained by EcoRI restriction digests of ISOR products obtained from oligonucleotide concentrations ranging from 0 to 800 nM. The expected fragment sizes are annotated on the left side of the picture and correspond to: (A) 1500 bp, no incorporation; (B) 1160 bp and (F) 340 bp, incorporation of oligo I; (C) 1029 bp and (E) 471 bp, incorporation of oligo II; (D) 689 bp, (E) 471 bp and (F) 340 bp, incorporation of both oligos.
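The expected EcoRI pattern follows from where the two oligonucleotides place their restriction sites in the 1500 bp product. In the sketch below, the cut positions (471 bp and 1160 bp from one end) are not stated in the text; they are inferred from the fragment sizes in the caption, and the function and constant names are hypothetical.

```python
# Sketch: expected EcoRI fragment sizes for each incorporation outcome.
# Cut positions are inferred from the reported fragment sizes (an assumption).

PRODUCT_BP = 1500      # full-length product, no incorporation (band A)
OLIGO_I_SITE = 1160    # inferred EcoRI site introduced by oligo I
OLIGO_II_SITE = 471    # inferred EcoRI site introduced by oligo II


def digest_fragments(length: int, cut_sites: list[int]) -> list[int]:
    """Fragment sizes, in order along the molecule, after cutting at each site."""
    edges = [0] + sorted(cut_sites) + [length]
    return [right - left for left, right in zip(edges, edges[1:])]


if __name__ == "__main__":
    outcomes = {
        "none": [],
        "oligo I": [OLIGO_I_SITE],
        "oligo II": [OLIGO_II_SITE],
        "both": [OLIGO_I_SITE, OLIGO_II_SITE],
    }
    for label, sites in outcomes.items():
        print(label, digest_fragments(PRODUCT_BP, sites))
```

Each digest must sum to 1500 bp, which is a quick consistency check on the band assignments: the double-incorporation product gives 471, 689 and 340 bp.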

 
Generation of targeted libraries of M.HaeIII

The M.HaeIII gene was diversified by ISOR to demonstrate that our technology could be used to construct highly diverse, yet targeted gene libraries. Our goal was to direct diversity to a subset of residues that are only moderately conserved, working on the assumption that a library with mutations in highly conserved residues would contain a high ratio of inactive variants, whereas mutations in non-conserved residues would be largely neutral. A library based on moderately conserved residues therefore might be an optimal starting point for the evolution of new enzyme variants. We used ConSurf (Landau et al., 2005) to search for M.HaeIII-homologous cytosine-5 DNA methyltransferase sequences. The results were aligned against M.HaeIII, a phylogenetic tree was constructed and the degree of conservation for each M.HaeIII residue calculated. This investigation led to the identification of a group of 45 residues which, judging by ConSurf scores, were moderately conserved (Supplementary Table I available at PEDS online). A set of 45 oligonucleotides was synthesized to target NNS codons to each one of these 45 positions. A range of libraries was made by including different concentrations of an equimolar mixture of all 45 oligonucleotides in the assembly of the M.HaeIII gene. A range of oligonucleotide concentrations (9–144 nM) gave a concentration-dependent average incorporation rate of 1–6 randomizing oligonucleotides (Table I). Oligonucleotide concentrations under 9 nM (0.2 nM each oligo) resulted in no detectable incorporation, whereas concentrations above 360 nM inhibited the assembly reaction, reducing yield significantly. Sequence analysis of library variants did not reveal any bias either in the location of incorporation or in the nature of the amino acids incorporated.


Table I. Gene-libraries derived from the M.HaeIII gene

 
The residual methylation activity of the proteins encoded by the library was determined. The PCR-amplified libraries were transcribed and translated in vitro, and the resulting proteins assayed in pool for methylation activity. Table I shows the relationship between oligonucleotide concentration and methylation activity. Thus, a library containing, on average, one mutation per gene exhibited 11% of wild-type activity, whereas the incorporation of ≥5 mutations per gene reduced the residual activity below the detection threshold.

Selection of the targeted M.HaeIII gene-libraries

Having established the ability of ISOR to controllably incorporate site-specific mutations, we wanted to uncover the potential of ISOR libraries, particularly those containing a high number of mutated residues, to yield active protein variants. We created an initial library by combining the two libraries with the average mutation frequencies of 5 and 6 mutations per gene. Prior to selection, the combined library exhibited no detectable methylation activity. A pool of ~10¹⁰ of these genes was taken through two rounds of selection for M.HaeIII activity (methylation of GGCC) by compartmentalization in emulsions (Tawfik and Griffiths, 1998). The gene pools from the first and second rounds of selection exhibited 3% and 6% of the M.HaeIII wild-type activity, respectively. Ten clones from the second round were randomly picked and their sequence and methylation activity determined (Supplementary Table III available at PEDS online). Of these ten clones, three were not active (two of these encoded truncated proteins due to the incorporation of stop codons in one of the diversified positions). The remaining seven clones had activities ranging from 3% to 42% of wild-type activity, and contained 2–5 mutations per gene. None of the 10 clones had a wild-type sequence, and no mutation repeated among the clones. As expected, most of the substitutions observed in the targeted positions (68%) were to amino acids that appear in homologous C5 methyltransferases, as judged by our multiple sequence alignment (data not shown). These results demonstrate the capability of ISOR to create large, highly complex and diverse, yet functional libraries.
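The appearance of stop codons among the diversified positions is expected from the NNS scheme itself, since one of its 32 codons (TAG) is a stop. A rough estimate of how often a gene carries at least one such stop can be sketched as below. This is a back-of-the-envelope model, assuming that all 32 NNS codons are equally represented and that positions are independent; it describes an unselected library and does not account for the effect of selection on the clones analyzed here.

```python
# Sketch: chance that a gene with n NNS-randomized codons carries a stop codon.
# Assumptions: equal representation of the 32 NNS codons, independent positions,
# no selection applied.


def fraction_with_stop(n_randomized: int, stop_codons: int = 1, codons: int = 32) -> float:
    """Probability that at least one of n randomized codons is a stop codon."""
    p_no_stop_per_codon = 1 - stop_codons / codons
    return 1 - p_no_stop_per_codon ** n_randomized


if __name__ == "__main__":
    for n in (1, 5, 6):
        print(n, round(fraction_with_stop(n), 3))
```

Under these assumptions, roughly one gene in seven with five randomized codons would be truncated, which illustrates why residual activity falls as the number of diversified positions per gene rises.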

Generation and selection of PON1 indel libraries

We also employed ISOR to generate serum paraoxonase (PON1) gene libraries containing indels in various structural elements related to PON1's active site and which are not an integral part of the ß-propeller scaffold. Thirty different active site positions were chosen along PON1's primary structure (Supplementary Table IV available at PEDS online) targeting two major locations of PON1's active site (Supplementary Fig. II available at PEDS online).

The first targeted group is the active site canopy, apparently unique to PONs, which is defined by helices H2 and H3, and the loops connecting them to the ß-propeller scaffold (Harel et al., 2004). We included in this group a long surface loop (residues 68–83) that is mostly disordered in the crystal structure, but whose location suggests that it may comprise part of the active site. The canopy, including the 68–83 loop, is thought to be of principal importance to the function of PONs as it contains most of the residues that seem to determine the substrate specificity of the various PON family members (Harel et al., 2004).

The second group locates to relatively short loops (2–9 residues) at the top of the tertiary structure that are typical of ß-propellers (the upper loops). These loops connect either the outer ß-strand of each blade (strand D) with the inner strand of the next blade (A), or the two strands in the middle of the blades (strands B and C). Many of the residues within these loops face the entrance to the central tunnel of the propeller, and some may influence enzymatic activity (Harel et al., 2004).

Ultimately, 16 canopy positions and 14 upper-loop positions were chosen for the introduction of indels. Three different oligonucleotides were synthesized for each of the 30 targeted positions (Supplementary Table IV available at PEDS online) as follows: one oligonucleotide introduced an insertion of a single randomized codon (NNS), one inserted two NNS codons and the third was designed to delete the targeted position.

Several gene-libraries of PON1 were prepared by incorporating different concentrations of the 90 oligonucleotides (at equal ratios). The relationship between oligonucleotide concentration and mutation frequency was nearly linear within the range of 9–72 nM (or, 0.1–0.8 nM of each oligonucleotide). In addition to the ‘designed’ indels, the libraries carried an average of 0.5 additional random point mutations due to PCR errors. The residual activity of these libraries also correlated with the average number of mutations (Fig. 3).
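Because the oligonucleotides are mixed at equal ratios, the concentration of each individual oligonucleotide is simply the total divided by the number of oligonucleotides in the mix. The short sketch below (with a hypothetical function name) reproduces the per-oligonucleotide figures quoted for the 90-member PON1 mix and for the 45-member M.HaeIII mix.

```python
# Sketch: per-oligonucleotide concentration in an equimolar mix.


def per_oligo_nM(total_nM: float, n_oligos: int) -> float:
    """Concentration of each oligonucleotide when the mix is equimolar."""
    if n_oligos <= 0:
        raise ValueError("the mix must contain at least one oligonucleotide")
    return total_nM / n_oligos


if __name__ == "__main__":
    # PON1 indel mix: 90 oligonucleotides, 9-72 nM total.
    print(per_oligo_nM(9, 90), per_oligo_nM(72, 90))
    # M.HaeIII substitution mix: 45 oligonucleotides, 9 nM total.
    print(per_oligo_nM(9, 45))
```

The point of the calculation is that each oligonucleotide is present at sub-nanomolar concentration, far below the concentration of the gene fragments in the assembly reaction.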


Fig. 3. Mutation rate and residual activity of PON1 indel libraries. Average number of indels per gene (bars, left ordinate), plus residual thio-lactonase (squares), and phosphotriesterase (circles) activities relative to wild-type PON1 (right ordinate). Data were calculated for four different PON1 gene-libraries, each generated with different oligonucleotide concentrations as indicated. The oligonucleotide mix applied contained 90 oligonucleotides encoding indels in various areas surrounding the active site (see main text).

 
Mapping PON1's tolerance to indels

The versatility of ISOR allowed us to create various gene-libraries with indels in different structural elements of the protein, and then to compare the residual activity of libraries carrying a similar (~0.4%) mutation frequency (Table II).


Table II. Residual activity of PON1 libraries with indels incorporated into various structural elements.

 
The residual activity of the canopy library (19%) was similar to the residual activity arising from diversification of its various components [H2 (16%), H3 and connecting loop (13%)]. Interestingly, this level of residual activity (13–19%) is very similar to residual activities observed in libraries with the same mutation frequency, but created by error-prone PCR along the entire gene (L.Gaidukov, unpublished data). Somewhat surprisingly, therefore, the canopy region appears to be as tolerant to indels as PON1 is to point mutations across its entire length. Introducing indels to other regions in the PON1 fold produced rather different residual activities. The upper loops were found to be less tolerant to indels (7% residual activity), possibly a function of their short length. The long, mobile surface loop, on the other hand, was exceptionally tolerant to indels (37.5% residual activity).

Screening the PON1 indel libraries

To demonstrate the capability of our ISOR indel libraries to generate functional variants, we screened a library with indels at all the structural elements described and further characterized variants that markedly differ from wild-type PON1 in their substrate specificity. A library was prepared from all 90 oligonucleotides at a total concentration of 36 nM; this produced an average of 2.75 directed indels and 0.62 point mutations per gene. We applied a screen on agar plates for esterase activity using 2NA. This screen is highly sensitive, and even clones with very weak esterase activity (kcat/KM >25 M⁻¹ s⁻¹; Gould and Tawfik, 2005) appear positive. This method was used to screen ~2000 clones, and we picked 300 active clones that exhibited the highest rate of color formation. The crude cell lysates of each of the 300 active clones were assayed spectrophotometrically for hydrolysis of three different substrates (2NA, paraoxon and γ-thiobutyrolactone). Eighteen out of the 300 active clones had markedly different substrate specificities from wild-type PON1. These were assayed in triplicate and sequenced. None had the wild-type sequence. Five clones harbored indels and substitutions (not shown) and the rest comprised PON1 sequences with indels only, nine of which were unique sequences. Of the nine unique sequences, six indels occurred in Helix 3, highlighting its importance in substrate recognition (Fig. 4). In addition, four of the six indels identified in Helix 3 occurred at residue 291, including insertions of one and two amino acids (Trp in NA087 and Thr-Gly in NA235) and a deletion of 291 (NA311 and NA332). All three of these clones showed higher esterase activity than wild-type, whereas the two other assayed activities (paraoxon and γ-thiobutyrolactone) were reduced by more than 50-fold. Conversely, two other Helix 3 mutants showed a bias toward thiolactonase and phosphotriesterase activities (NA070 and NA079, respectively).


Fig. 4. Activity and sequence changes in a representative group of active variants isolated from the PON1 indel library. Activities of each variant are indicated, relative to wild-type PON1 (=1). Three substrates were tested: an aryl ester, 2NA (grey bar), a phosphotriester (paraoxon; white bar) and γ-thiobutyrolactone (black bar). Activities were measured in cleared cell lysates in triplicate. Lower panel: the indels found in each variant and their locations.

 
Two clones had insertions at the surface loop. NA024 had an insertion of two amino acids (Val-Tyr) between residues 82 and 83 and ‘specialized’ as an esterase. Clone NA311 also carried an insertion at the surface loop (Gly, between residues 74 and 75), as well as a deletion of H3 residue 291. The substrate specificity of this mutant is very similar to that of NA332 and is therefore most likely dictated by the deletion in H3 rather than by the surface loop insertion (although this insertion, not having been examined on its own here, may exert a subtle influence). As expected from the analysis of whole libraries (Table II), only a minority of clones (2/9) harbored insertions in the indel-sensitive upper loops, although interesting improvements were recorded. NA073 had an insertion of Leu in loop 2D3A (i.e. the loop connecting the D strand of the second blade with the A strand of the third blade) and exhibited increased thiolactonase activity. NA198 harbored a Gly insertion at loop 2B2C and exhibited increased phosphotriesterase activity.


    Discussion
Optimization and application of the ISOR protocol

Directed evolution methodologies need not necessarily be seen as a way of circumventing design, but rather as a way of complementing it by allowing for a much larger margin of error. However, despite the availability of numerous methods for directed and random mutagenesis (Arnold and Georgiou, 2003; Neylon, 2004), there is still a need for techniques that allow a systematic design of gene libraries informed by inputs from rational or computational design. Prior to this study, we had attempted to perform gene assembly from long synthetic oligonucleotides (60–80 bp) that were designed to introduce the targeted diversified residues in a manner similar to the synthetic shuffling method (Ness et al., 2002; Zha et al., 2003). We found, however, that the libraries constructed in this way were very sensitive to oligonucleotide quality, and that long oligonucleotides contain a significant fraction of (n – 1) and (n + 1) products. What is more, purification of these oligonucleotides by PAGE resulted in an even higher frequency of frame-shifts. Chastened by these experiences, we directed our efforts at developing a general and versatile technique that targets diversity to predefined, specific positions, thereby creating the desired gene libraries with high precision. ISOR is the result of this effort, and is based on the incorporation of synthetic oligonucleotides via gene reassembly (Fig. 1). The addition of synthetic oligonucleotides to a mixture of gene fragments prior to DNA shuffling was suggested in Stemmer's original report (Stemmer, 1994). Perhaps owing to the lack of a systematic, well-established protocol, this approach has been applied only very rarely (van den Beucken et al., 2001; Stutzman-Engwall et al., 2005). In this work, we describe the optimization of this method and its application toward the generation of a range of different targeted libraries incorporating base substitutions, insertions and deletions.

As shown here, relatively short synthetic oligonucleotides (~30 bp) can encode substitutions, insertions or deletions at any given position with a high degree of precision. The frequency of errors in such short oligonucleotides, and their cost, are much lower than those of long ones, and they require no chromatographic purification, a step that increases costs and biases library content. ISOR, therefore, begins from a reliable starting point, yet is extremely versatile and adaptable. Once a set of oligonucleotides has been synthesized, it can be used for the assembly of various libraries with different rates of diversification, or indeed of libraries created with different subsets of the same oligonucleotides. Since most of the gene sequence is reassembled from DNaseI fragments, and because the oligonucleotides used are short, the method is not very sensitive to oligonucleotide quality.

The major advantage of ISOR is its ‘tuneability’, in that it allows a ‘parsimonious’ representation of diversity in many positions (30–45, as demonstrated here) while affording the opportunity to control the mutation frequency at each targeted residue. In saturation mutagenesis, randomization of more than a few positions results in impossibly large library sizes and in high mutation frequencies that render almost all library variants inactive; the identification of positives therefore depends on screening an extremely large number of variants. Attempts have been made to overcome this drawback, for example by using oligonucleotides in which a small proportion of randomizing bases were ‘doped’ into the wild-type sequence. However, the applicability of this method, often called parsimonious mutagenesis (Balint and Larrick, 1993), has been limited, possibly owing to the high cost and limited quality of doped oligonucleotides. The power of ISOR lies in the fact that the concentration of each oligonucleotide in the assembly reaction determines the frequency of modification at each position (Fig. 2, Tables I and II). This is much more difficult to achieve with synthetic shuffling, especially when mutations in adjacent codons are encoded by the same oligonucleotide.
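The scale of the problem with full saturation mutagenesis can be illustrated with a short calculation. The sketch below is only a back-of-the-envelope illustration, not part of the reported method: it assumes that every randomized position can take any of the 20 amino acids, and the function names are hypothetical. It compares the number of protein variants obtained when all targeted positions are randomized at once with the number obtained when only a small subset of the positions is mutated in any one gene.

```python
from math import comb

AMINO_ACIDS = 20  # assumption: each randomized position samples all 20 residues


def saturation_library_size(n_positions: int) -> int:
    """Protein variants when all n positions are fully randomized together."""
    return AMINO_ACIDS ** n_positions


def partial_library_size(n_positions: int, k_mutated: int) -> int:
    """Variants carrying exactly k substitutions among n targeted positions.

    Choose which k positions are mutated, then one of the 19
    non-wild-type residues at each of them.
    """
    return comb(n_positions, k_mutated) * (AMINO_ACIDS - 1) ** k_mutated


if __name__ == "__main__":
    for n in (3, 6, 45):
        print(f"saturation, {n} positions: {saturation_library_size(n):.2e}")
    for k in (1, 2, 3, 5):
        print(f"45 positions, exactly {k} mutated: {partial_library_size(45, k):.2e}")
```

Under these assumptions, the full 45-position space is far beyond any screening capacity, whereas the sub-space of genes carrying only a few mutations is many orders of magnitude smaller, which is the regime that lowering the oligonucleotide concentration is meant to favor.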

We demonstrated the benefits of ISOR in the preparation of targeted gene libraries of two enzymes. A bioinformatics analysis allowed us to define a set of 45 non-contiguous amino acids as moderately conserved in M.HaeIII. We then used ISOR to target its diversification power to these residues. The result was a series of libraries with a range of mutation frequencies (Table I). Each gene in the library carried different mutations, at a different subset of the targeted positions, and the entire set was therefore explored in a systematic, predictable and combinatorial manner. A library with an average mutation frequency of 5.5 mutations per gene was subjected to iterative rounds of selection, and gave rise to a range of active variants (Supplementary Table III).
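To give a feel for what an average of 5.5 mutations spread over 45 targeted positions means for individual genes, the following sketch models the number of mutated positions per gene as a binomial variable. This is an assumption made purely for illustration: it treats every targeted position as being mutated independently and with equal probability, which the experimental data may not follow exactly, and the function names are hypothetical.

```python
from math import comb


def mutation_count_distribution(n_positions: int, mean_mutations: float) -> list[float]:
    """Binomial probabilities of finding k = 0..n mutated positions in one gene.

    Assumes independent, equally likely incorporation at every targeted
    position, so the per-position probability is mean / n.
    """
    p = mean_mutations / n_positions
    return [
        comb(n_positions, k) * p**k * (1 - p) ** (n_positions - k)
        for k in range(n_positions + 1)
    ]


def fraction_with_at_most(dist: list[float], k: int) -> float:
    """Fraction of genes expected to carry k or fewer mutated positions."""
    return sum(dist[: k + 1])


if __name__ == "__main__":
    dist = mutation_count_distribution(45, 5.5)
    print(f"genes with no mutation: {dist[0]:.4f}")
    print(f"genes with at most 3 mutations: {fraction_with_at_most(dist, 3):.4f}")
    print(f"genes with at most 10 mutations: {fraction_with_at_most(dist, 10):.4f}")
```

Changing `mean_mutations` mimics, under the same simplifying assumption, the effect of preparing libraries at different oligonucleotide concentrations.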

Directed evolution with indels

Almost all gene libraries described to date have employed point mutations. The application of indels libraries to protein engineering has thus far been limited, in part owing to the lack of an appropriate methodology (for a newly developed technique for random indels incorporation see Fujii et al., 2006). In order to study the undiscovered role of indels in the evolution of new enzyme functions, we generated indels libraries by ISOR. The enzyme under investigation was PON1, a calcium-dependent lactonase that exhibits a range of promiscuous activities (Aharoni et al., 2005a, 2005b; Khersonsky and Tawfik, 2005). ISOR was used to target indels to surface loops, and various structural elements within them, that comprise the wall and perimeter of PON1's active site. These regions contain the vast majority of residues that are believed to dictate the substrate selectivity of this enzyme family, and seem to have changed over the course of its natural divergence (Harel et al., 2004). Consequently, they seem the most promising for altering the enzyme's specificity.

We designed a set of 90 oligonucleotides encoding indels throughout the identified structural elements. The versatility of ISOR allowed us to create an array of libraries, including libraries with indels in individual structural elements, and others with indels distributed along several structural elements. Characterization of the constructed libraries before, and after, an activity screen, indicated the potential of ISOR to generate highly complex combinatorial libraries with insertions, deletions or both. Oligonucleotide incorporation proceeded in a highly efficient manner, with no apparent biases and with a minimal number of random point mutations due only to PCR errors.

We also demonstrated the potential of ISOR to elucidate the tolerance of different structural elements to indels. We found that the ‘canopy’ elements, and a highly mobile ‘surface loop’, are much more tolerant to indels than the shorter ‘upper loops’ (Table II). Although these results are obviously preliminary, they seem to indicate that longer, and certainly more mobile, loops (the ‘surface loop’ is disordered in the crystal structure) are more tolerant of indels than short, highly ordered loops (for other recent examples see Scalley-Kim et al., 2003; Mathonet et al., 2006, and references therein). Underlining the dichotomy between short, ordered loops and longer, disordered ones is the finding that indels in several short loops (e.g. 6D1A and 5D6A) were not found among any of the active library variants (Fig. 4; data not shown). Another interesting observation is that most of the modified variants (66.7%) harbored indels in Helix 3 (H3), which is known to play an important role in substrate recognition (Aharoni et al., 2004; Harel et al., 2004). However, incorporation of indels at the surface loop did not give rise to many variants with altered specificities, despite its high tolerance.

In summary, although the above results regarding the role of indels in altering PON1's catalytic activities are preliminary, they do indicate the potential of indels libraries for directed evolution. Foremost, these libraries, and the one of M.HaeIII, demonstrate the versatility and applicability of ISOR.


    Supplementary Data
Supplementary data mentioned in the text are available to online subscribers at http://www.peds.oxfordjournals.org.


    Footnotes
 
2 Present address: Department of Pathology, University of Washington, Seattle, WA 98195, USA

Edited by Phillip Holliger


    Acknowledgments
 
A.H. thanks Professor Michael Fry and Dr Manel Camps for critical reading of the manuscript, and Professor Lawrence A. Loeb for his support. Financial funding was provided by the Israel Science Foundation, and is gratefully acknowledged. D.S.T. is the incumbent of the Elaine Blonde Career development Chair.


    References
Aharoni A., Amitai G., Bernath K., Magdassi S., Tawfik D.S. Chem. Biol. (2005a) 12:1281–1289.

Aharoni A., Gaidukov L., Khersonsky O., McQ Gould S., Roodveldt C., Tawfik D.S. Nat. Genet. (2005b) 37:73–76.

Aharoni A., Gaidukov L., Yagur S., Toker L., Silman I., Tawfik D.S. Proc. Natl Acad. Sci. USA (2004) 101:482–487.

Antikainen N.M., Hergenrother P.J., Harris M.M., Corbett W., Martin S.F. Biochemistry (2003) 42:1603–1610.

Arnold F.H., Georgiou G. Directed Evolution Library Creation. Methods in Molecular Biology (2003) Totowa, NJ: Humana Press.

Balint R.F., Larrick J.W. Gene (1993) 137:109–118.

Bershtein S., Segal M., Bekerman R., Tokuriki N., Tawfik D.S. Nature (2006) 440:929–932.

Chica R.A., Doucet N., Pelletier J.N. Curr. Opin. Biotechnol. (2005) 16:378–384.

Dwyer M.A., Looger L.L., Hellinga H.W. Science (2004) 304:1967–1971.

Fujii R., Kitaoka M., Hayashi K. Nucleic Acids Res. (2006) 34:e30.

Gould S.M., Tawfik D.S. Biochemistry (2005) 44:5444–5452.

Griffiths A.D., Tawfik D.S. EMBO J. (2003) 22:24–35.

Harel M., et al. Nat. Struct. Mol. Biol. (2004) 11:412–419.

Hayes R.J., Bentzien J., Ary M.L., Hwang M.Y., Jacinto J.M., Vielmetter J., Kundu A., Dahiyat B.I. Proc. Natl Acad. Sci. USA (2002) 99:15926–15931.

Khersonsky O., Tawfik D.S. Biochemistry (2005) 44:6371–6382.

Landau M., Mayrose I., Rosenberg Y., Glaser F., Martz E., Pupko T., Ben-Tal N. Nucleic Acids Res. (2005) 33:W299–W302.

Mathonet P., Deherve J., Soumillion P., Fastrez J. Protein Sci. (2006) 15:2323–2334.

Minshull J., Govindarajan S., Cox T., Ness J.E., Gustafsson C. Methods (2004) 32:416–427.

Ness J.E., Kim S., Gottman A., Pak R., Krebber A., Borchert T.V., Govindarajan S., Mundorff E.C., Minshull J. Nat. Biotechnol. (2002) 20:1251–1255.

Neylon C. Nucleic Acids Res. (2004) 32:1448–1459.

Park S., Morley K.L., Horsman G.P., Holmquist M., Hult K., Kazlauskas R.J. Chem. Biol. (2005) 12:45–54.

Patrick W.M., Firth A.E. Biomol. Eng. (2005) 22:105–112.

Reetz M.T. Proc. Natl Acad. Sci. USA (2004) 101:5716–5722.

Reetz M.T., Bocola M., Carballeira J.D., Zha D., Vogel A. Angew. Chem. Int. Ed. Engl. (2005) 44:4192–4196.

Reetz M.T., Wilensek S., Zha D., Jaeger K.E. Angew. Chem. Int. Ed. Engl. (2001) 40:3589–3591.

Rui L., Cao L., Chen W., Reardon K.F., Wood T.K. J. Biol. Chem. (2004) 279:46810–46817.

Santoro S.W., Schultz P.G. Proc. Natl Acad. Sci. USA (2002) 99:4185–4190.

Scalley-Kim M., Minard P., Baker D. Protein Sci. (2003) 12:197–206.

Stemmer W.P. Proc. Natl Acad. Sci. USA (1994) 91:10747–10751.

Stutzman-Engwall K., et al. Metab. Eng. (2005) 7:27–37.

Tawfik D.S., Griffiths A.D. Nat. Biotechnol. (1998) 16:652–656.

van den Beucken T., van Neer T., Sablon E., Desmet J., Celis L., Hoogenboom H.R., Hufton S.E. J. Mol. Biol. (2001) 310:591–601.

Voigt C.A., Martinez C., Wang Z.G., Mayo S.L., Arnold F.H. Nat. Struct. Biol. (2002) 9:553–558.

Zha D., Eipper A., Reetz M.T. Chembiochem (2003) 4:34–39.

Received February 17, 2007; revised February 17, 2007; accepted February 26, 2007.

