Protein Engineering vol. 16 no. 6 pp. 451-457, 2003
© 2003 Oxford University Press
User-friendly algorithms for estimating completeness and diversity in randomized protein-encoding libraries
1Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge CB2 1GA and 2Institute of Astronomy, University of Cambridge, Madingley Road, Cambridge CB3 0HA, UK.
3 To whom correspondence should be addressed. e-mail: jmb50{at}cam.ac.uk
Directed evolution of proteins depends on the production of molecular diversity by random mutagenesis. While a number of methods have been developed for introducing this diversity, the best ways to sample it are not always clear. Here we use simple statistics to analyse completeness and diversity in randomized libraries generated by oligonucleotide-directed mutagenesis, error-prone polymerase chain reaction (epPCR) and in vitro recombination of highly homologous sequences. For oligonucleotide-directed mutagenesis, we derive equations to estimate how complete a given library is expected to be and also to predict the size of library required to give a fixed probability of being 100% complete. We describe the statistical bases for two computer programs, dubbed PEDEL and DRIVeR, which estimate the number of distinct variants represented in epPCR and shuffled libraries, respectively. These programs allow the user to calculate (rather than guess) the diversity represented in a given library and also provide empirical guidelines for maximizing this diversity. PEDEL and DRIVeR are available at www.bio.cam.ac.uk/blackburn/stats.html.
Received November 6, 2002; revised May 6, 2003; accepted May 20, 2003.
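The completeness estimates mentioned in the abstract for oligonucleotide-directed libraries can be illustrated with a short Python sketch. It is not taken from the paper and is not the PEDEL or DRIVeR code. It assumes that all V possible variants are equally likely and that a library of L clones samples them approximately as a Poisson process, which gives an expected completeness of 1 - exp(-L/V) and a probability of full coverage of about exp(-V exp(-L/V)). The function names and the three-codon NNK example are illustrative choices.

```python
import math


def expected_completeness(library_size, num_variants):
    """Expected fraction of the possible variants present in the library.

    Assumes equiprobable variants and Poisson sampling (an approximation).
    """
    return 1.0 - math.exp(-library_size / num_variants)


def prob_fully_complete(library_size, num_variants):
    """Approximate probability that every variant appears at least once."""
    expected_missing = num_variants * math.exp(-library_size / num_variants)
    return math.exp(-expected_missing)


def size_for_full_completeness(num_variants, prob):
    """Library size needed for the given probability of 100% completeness.

    Obtained by inverting prob_fully_complete for library_size.
    """
    if not 0.0 < prob < 1.0:
        raise ValueError("prob must lie strictly between 0 and 1")
    return -num_variants * math.log(-math.log(prob) / num_variants)


if __name__ == "__main__":
    # Hypothetical example: three NNK-randomized codons, counted at the
    # DNA level (32 codons per position), so V = 32**3 sequences.
    v = 32 ** 3
    for fold in (1, 3, 5):
        frac = expected_completeness(fold * v, v)
        print(f"L = {fold}V: expected completeness {frac:.3f}")
    needed = size_for_full_completeness(v, 0.95)
    print(f"L for 95% chance of full coverage: {needed:.0f} clones")
```

Under these assumptions a library equal in size to the number of possible variants is expected to contain only about 63% of them, and a library three times that size about 95%. Guaranteeing full coverage with high probability requires a considerably larger excess. Real libraries with codon bias or unequal variant frequencies will be less complete than this sketch suggests.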



