High-Throughput Protein Production and Purification: Methods and Protocols [1st ed.] 978-1-4939-9623-0;978-1-4939-9624-7

This book compiles key protocols instrumental to the study of high-throughput protein production and purification which


English Pages XVI, 537 [536] Year 2019



Table of contents :
Front Matter ....Pages i-xvi
Front Matter ....Pages 1-1
Overview of High-Throughput Cloning Methods for the Post-genomic Era (Claudia Ortega, Cecilia Abreu, Pablo Oppezzo, Agustín Correa)....Pages 3-32
Overview of a High-Throughput Pipeline for Streamlining the Production of Recombinant Proteins (Joanne E. Nettleship, Heather Rada, Raymond J. Owens)....Pages 33-49
Semiautomated Small-Scale Purification Method for High-Throughput Expression Analysis of Recombinant Proteins (Edward Kraft, Yvonne Franke, Katharine Heeringa, Stephanie Shriver, Inna Zilberleyb, Christine Kugel et al.)....Pages 51-68
High-Throughput Protein Production in Yeast (Francisco J. Fernández, Sara Gómez, M. Cristina Vega)....Pages 69-91
A High-Throughput System for Transient and Stable Protein Production in Mammalian Cells (Sarah M. Rue, Paul W. Anderson, Michelle R. Gaylord, Jessica J. Miller, Scott M. Glaser, Scott A. Lesley)....Pages 93-142
A High-Throughput Automated Protein Folding System (Kenneth W. Walker, Philip An, Dwight Winters)....Pages 143-161
Front Matter ....Pages 163-163
High-Throughput Production of Oxidized Animal Toxins in Escherichia coli (Yoan Duhoo, Ana Filipa Sequeira, Natalie J. Saez, Jeremy Turchetto, Laurie Ramond, Fanny Peysson et al.)....Pages 165-190
High-Throughput Purification of Protein Kinases from Escherichia coli and Insect Cells (Sebastian Mathea, Eidarus Salah, Stefan Knapp)....Pages 191-202
Parallelized Microscale Expression of Soluble scFv (Giulio Russo, Viola Fühner, André Frenzel, Michael Hust, Stefan Dübel)....Pages 203-211
High-Throughput Production of Influenza Virus-Like Particle (VLP) Array by Using VLP-factory™, a MultiBac Baculoviral Genome Customized for Enveloped VLP Expression (Duygu Sari-Ak, Shervin Bahrami, Magdalena J. Laska, Petra Drncova, Daniel J. Fitzgerald, Christiane Schaffitzel et al.)....Pages 213-226
High-Throughput Protein Production of Membrane Proteins in Saccharomyces cerevisiae (Jennifer M. Johnson, Franklin A. Hays)....Pages 227-259
High-Throughput E. coli Cell-Free Expression: From PCR Product Design to Functional Validation of GPCR (Sandra Cortès, Fatima-Ezzahra Hibti, Frydman Chiraz, Safia Ezzine)....Pages 261-279
High-Throughput Site-Directed Mutagenesis (Claire Strain-Damerell, Nicola A. Burgess-Brown)....Pages 281-296
Front Matter ....Pages 297-297
Hot CoFi Blot: A High-Throughput Colony-Based Screen for Identifying More Thermally Stable Protein Variants (Ignacio Asial, Pär Nordlund, Sue-Li Dahlroth)....Pages 299-320
High-Throughput Isolation of Soluble Protein Domains Using a Bipartite Split-GFP Complementation System (Amélie Massemin, Stéphanie Cabantous, Geoffrey S. Waldo, Jean-Denis Pedelacq)....Pages 321-333
High-Throughput Analytical Light Scattering for Protein Quality Control and Characterization (Daniel Some, Vladimir Razinkov)....Pages 335-359
High-Throughput Nano-Scale Characterization of Membrane Proteins Using Fluorescence-Detection Size-Exclusion Chromatography (Alex J. Vecchio, Robert M. Stroud)....Pages 361-388
A Flexible and Scalable High-Throughput Platform for Recombinant Membrane Protein Production (Hui Xu, Thomas Clairfeuille, Christine C. Jao, Hoangdung Ho, Zachary Sweeney, Jian Payandeh et al.)....Pages 389-402
Adaption of the Leishmania Cell-Free Expression System to High-Throughput Analysis of Protein Interactions (Wayne A. Johnston, Shayli Varasteh Moradi, Kirill Alexandrov)....Pages 403-421
High-Throughput Protein–Protein Interaction Assays Using Tripartite Split-GFP Complementation (Jean-Denis Pedelacq, Geoffrey S. Waldo, Stéphanie Cabantous)....Pages 423-437
High-Throughput Production of a New Library of Human Single and Tandem PDZ Domains Allows Quantitative PDZ-Peptide Interaction Screening Through High-Throughput Holdup Assay (Yoan Duhoo, Virginie Girault, Jeremy Turchetto, Laurie Ramond, Fabien Durbesson, Patrick Fourquet et al.)....Pages 439-476
High-Throughput Protein Analysis Using Negative Stain Electron Microscopy and 2D Classification (Christopher P. Arthur, Claudio Ciferri)....Pages 477-485
High-Throughput Protein Production Combined with High- Throughput SELEX Identifies an Extensive Atlas of Ciona robusta Transcription Factor DNA-Binding Specificities (Kazuhiro R. Nitta, Renaud Vincentelli, Edwin Jacox, Agnès Cimino, Yukio Ohtsuka, Daniel Sobral et al.)....Pages 487-517
High-Throughput Micro-Characterization of RNA–Protein Interactions (Sara Gómez, Francisco J. Fernández, M. Cristina Vega)....Pages 519-531
Back Matter ....Pages 533-537


Methods in Molecular Biology 2025

Renaud Vincentelli Editor

High-Throughput Protein Production and Purification Methods and Protocols

METHODS IN MOLECULAR BIOLOGY

Series Editor John M. Walker School of Life and Medical Sciences University of Hertfordshire Hatfield, Hertfordshire, UK

For further volumes: http://www.springer.com/series/7651

For over 35 years, biological scientists have come to rely on the research protocols and methodologies in the critically acclaimed Methods in Molecular Biology series. The series was the first to introduce the step-by-step protocols approach that has become the standard in all biomedical protocol publishing. Each protocol is provided in readily reproducible step-by-step fashion, opening with an introductory overview and a list of the materials and reagents needed to complete the experiment, followed by a detailed procedure that is supported with a helpful notes section offering tips and tricks of the trade as well as troubleshooting advice. These hallmark features were introduced by series editor Dr. John Walker and constitute the key ingredient in each and every volume of the Methods in Molecular Biology series. Tested and trusted, comprehensive and reliable, all protocols from the series are indexed in PubMed.

High-Throughput Protein Production and Purification Methods and Protocols

Edited by

Renaud Vincentelli Architecture et Fonction des Macromolécules Biologiques (AFMB), Unité Mixte de Recherche (UMR) 7257, Centre National de la Recherche Scientifique (CNRS), Aix-Marseille Université, Marseille cedex 9, France

Editor Renaud Vincentelli Architecture et Fonction des Macromolécules Biologiques (AFMB), Unité Mixte de Recherche (UMR) 7257, Centre National de la Recherche Scientifique (CNRS), Aix-Marseille Université, Marseille cedex 9, France

ISSN 1064-3745
ISSN 1940-6029 (electronic)
Methods in Molecular Biology
ISBN 978-1-4939-9623-0
ISBN 978-1-4939-9624-7 (eBook)
https://doi.org/10.1007/978-1-4939-9624-7

© Springer Science+Business Media, LLC, part of Springer Nature 2019

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Humana imprint is published by the registered company Springer Science+Business Media, LLC, part of Springer Nature. The registered company address is: 233 Spring Street, New York, NY 10013, U.S.A.

Preface

In the last decade, the data generated by genome sequencing programs, together with the proliferation of flexible cloning strategies and structural genomics programs, have been instrumental in the development of high-throughput methods for protein production and purification. This book compiles protocols that have been refined and simplified over the years and that are now ready to be transferred to any laboratory. The vast majority of these protocols can be implemented manually, without investing in any particular equipment. However, to reach the maximum throughput, some protocols will require commercial or customized robotic systems.

The chapters of this book are grouped in three parts: Part I describes general procedures for high-throughput protein production, Part II describes high-throughput protocols adapted to the production of specific protein families, and Part III describes protocols combining high-throughput protein production and their micro-characterization.

I have purposefully chosen to dedicate half of this book to the third part. With the possibility to express and purify hundreds of proteins in parallel (see Part I and Part II), the importance of identifying the best-behaving proteins at this early scale becomes paramount. In general, the choice of the expression strategy for scaling up production is based on the basic criterion of the highest soluble yield and rarely considers other functional, biochemical, or biophysical aspects of the sample. But nowadays, the combination of high-throughput protein production and recent developments in protein micro-characterization allows, in some cases, for the inclusion of quality criteria in the selection process.

This book is mainly addressed to biochemists ranging from engineers, PhD students, and postdoctoral fellows to the heads of protein expression facilities and researchers.

Marseille, France
Renaud Vincentelli

Contents

Preface
Contributors

PART I  GENERAL PROCEDURES FOR HIGH-THROUGHPUT PROTEIN PRODUCTION

1 Overview of High-Throughput Cloning Methods for the Post-genomic Era
  Claudia Ortega, Cecilia Abreu, Pablo Oppezzo, and Agustín Correa
2 Overview of a High-Throughput Pipeline for Streamlining the Production of Recombinant Proteins
  Joanne E. Nettleship, Heather Rada, and Raymond J. Owens
3 Semiautomated Small-Scale Purification Method for High-Throughput Expression Analysis of Recombinant Proteins
  Edward Kraft, Yvonne Franke, Katharine Heeringa, Stephanie Shriver, Inna Zilberleyb, Christine Kugel, Trisha Dela Vega, Athena Wong, Bobby Brillantes, Claudio Ciferri, George Dutina, Grace Lee, Isabelle Lehoux, Zhong Rong Li, Lee Lior-Hoffmann, Jiyoung Hwang, Chris Lonergan, Lynn Martin, Kyle Mortara, Lananh Nguyen, Jian Payandeh, Andrew Perez, Jun Sampang, Lovejit Singh, Kurt Schroeder, Christine Tam, Shu Ti, Ye Naing Win, and Krista Bowman
4 High-Throughput Protein Production in Yeast
  Francisco J. Fernández, Sara Gómez, and M. Cristina Vega
5 A High-Throughput System for Transient and Stable Protein Production in Mammalian Cells
  Sarah M. Rue, Paul W. Anderson, Michelle R. Gaylord, Jessica J. Miller, Scott M. Glaser, and Scott A. Lesley
6 A High-Throughput Automated Protein Folding System
  Kenneth W. Walker, Philip An, and Dwight Winters

PART II  HIGH-THROUGHPUT PROTOCOLS ADAPTED TO THE PRODUCTION OF SPECIFIC PROTEIN FAMILIES

7 High-Throughput Production of Oxidized Animal Toxins in Escherichia coli
  Yoan Duhoo, Ana Filipa Sequeira, Natalie J. Saez, Jeremy Turchetto, Laurie Ramond, Fanny Peysson, Joana L. A. Brás, Nicolas Gilles, Hervé Darbon, Carlos M. G. A. Fontes, and Renaud Vincentelli
8 High-Throughput Purification of Protein Kinases from Escherichia coli and Insect Cells
  Sebastian Mathea, Eidarus Salah, and Stefan Knapp
9 Parallelized Microscale Expression of Soluble scFv
  Giulio Russo, Viola Fühner, André Frenzel, Michael Hust, and Stefan Dübel
10 High-Throughput Production of Influenza Virus-Like Particle (VLP) Array by Using VLP-factory™, a MultiBac Baculoviral Genome Customized for Enveloped VLP Expression
  Duygu Sari-Ak, Shervin Bahrami, Magdalena J. Laska, Petra Drncova, Daniel J. Fitzgerald, Christiane Schaffitzel, Frederic Garzoni, and Imre Berger
11 High-Throughput Protein Production of Membrane Proteins in Saccharomyces cerevisiae
  Jennifer M. Johnson and Franklin A. Hays
12 High-Throughput E. coli Cell-Free Expression: From PCR Product Design to Functional Validation of GPCR
  Sandra Cortès, Fatima-Ezzahra Hibti, Frydman Chiraz, and Safia Ezzine
13 High-Throughput Site-Directed Mutagenesis
  Claire Strain-Damerell and Nicola A. Burgess-Brown

PART III  HIGH-THROUGHPUT PROTEIN PRODUCTION COMBINED WITH HIGH-THROUGHPUT MICRO-CHARACTERIZATION FOR PROTEIN STABILITY, SOLUBILITY, QUALITY AND INTERACTIONS

14 Hot CoFi Blot: A High-Throughput Colony-Based Screen for Identifying More Thermally Stable Protein Variants
  Ignacio Asial, Pär Nordlund, and Sue-Li Dahlroth
15 High-Throughput Isolation of Soluble Protein Domains Using a Bipartite Split-GFP Complementation System
  Amélie Massemin, Stéphanie Cabantous, Geoffrey S. Waldo, and Jean-Denis Pedelacq
16 High-Throughput Analytical Light Scattering for Protein Quality Control and Characterization
  Daniel Some and Vladimir Razinkov
17 High-Throughput Nano-Scale Characterization of Membrane Proteins Using Fluorescence-Detection Size-Exclusion Chromatography
  Alex J. Vecchio and Robert M. Stroud
18 A Flexible and Scalable High-Throughput Platform for Recombinant Membrane Protein Production
  Hui Xu, Thomas Clairfeuille, Christine C. Jao, Hoangdung Ho, Zachary Sweeney, Jian Payandeh, and Christopher M. Koth
19 Adaption of the Leishmania Cell-Free Expression System to High-Throughput Analysis of Protein Interactions
  Wayne A. Johnston, Shayli Varasteh Moradi, and Kirill Alexandrov
20 High-Throughput Protein–Protein Interaction Assays Using Tripartite Split-GFP Complementation
  Jean-Denis Pedelacq, Geoffrey S. Waldo, and Stéphanie Cabantous
21 High-Throughput Production of a New Library of Human Single and Tandem PDZ Domains Allows Quantitative PDZ-Peptide Interaction Screening Through High-Throughput Holdup Assay
  Yoan Duhoo, Virginie Girault, Jeremy Turchetto, Laurie Ramond, Fabien Durbesson, Patrick Fourquet, Yves Nominé, Vânia Cardoso, Ana Filipa Sequeira, Joana L. A. Brás, Carlos M. G. A. Fontes, Gilles Travé, Nicolas Wolff, and Renaud Vincentelli
22 High-Throughput Protein Analysis Using Negative Stain Electron Microscopy and 2D Classification
  Christopher P. Arthur and Claudio Ciferri
23 High-Throughput Protein Production Combined with High-Throughput SELEX Identifies an Extensive Atlas of Ciona robusta Transcription Factor DNA-Binding Specificities
  Kazuhiro R. Nitta, Renaud Vincentelli, Edwin Jacox, Agnès Cimino, Yukio Ohtsuka, Daniel Sobral, Yutaka Satou, Christian Cambillau, and Patrick Lemaire
24 High-Throughput Micro-Characterization of RNA–Protein Interactions
  Sara Gómez, Francisco J. Fernández, and M. Cristina Vega

Index

Contributors

CECILIA ABREU  Recombinant Protein Unit, Institut Pasteur de Montevideo, Montevideo, Uruguay; Molecular, Cellular and Animal Technology Program, Institut Pasteur de Montevideo, Montevideo, Uruguay
KIRILL ALEXANDROV  Centre for Tropical Crops and Biocommodities, Queensland University of Technology, Brisbane, QLD, Australia; Institute for Molecular Bioscience, The University of Queensland, St. Lucia, QLD, Australia
PHILIP AN  Amgen Research, Amgen, Inc., One Amgen Center, Thousand Oaks, CA, USA
PAUL W. ANDERSON  Genomics Institute of the Novartis Research Foundation, San Diego, CA, USA; Lilly Biotechnology Center, San Diego, CA, USA
CHRISTOPHER P. ARTHUR  Department of Structural Biology, Cryo-EM Unit. 1 DNA Way, MS, Genentech, Inc., South San Francisco, CA, USA
IGNACIO ASIAL  DotBio Pte. Ltd., 1 Research Link, #05-30, Singapore, Singapore
SHERVIN BAHRAMI  AimVion AS, Aarhus, Denmark
IMRE BERGER  School of Biochemistry and Bristol Synthetic Biology Centre BrisSynBio, University Walk, University of Bristol, Clifton, UK
KRISTA BOWMAN  Department of Biomolecular Resources, Genentech, Inc., South San Francisco, CA, USA
JOANA L. A. BRÁS  NZYTech Genes and Enzymes, Campus do Lumiar, Estrada do paço do Lumiar, Lisbon, Portugal
BOBBY BRILLANTES  Department of Biomolecular Resources, Genentech, Inc., South San Francisco, CA, USA
NICOLA A. BURGESS-BROWN  Structural Genomics Consortium, University of Oxford, Oxford, UK
STÉPHANIE CABANTOUS  Centre de Recherche en Cancérologie de Toulouse (CRCT), Inserm, Université Paul Sabatier-Toulouse III, CNRS, Toulouse, France
CHRISTIAN CAMBILLAU  Unité Mixte de Recherche (UMR) 7257, Architecture et Fonction des Macromolécules Biologiques, CNRS and Aix Marseille University, Marseille, France
VÂNIA CARDOSO  NZYTech Genes and Enzymes, Lisbon, Portugal
FRYDMAN CHIRAZ  Horiba Scientific, Palaiseau, France
CLAUDIO CIFERRI  Department of Biomolecular Resources, Genentech, Inc., South San Francisco, CA, USA; Department of Structural Biology, Cryo-EM Unit. 1 DNA Way, MS, Genentech, Inc., South San Francisco, CA, USA
AGNÈS CIMINO  Architecture et Fonction des Macromolécules Biologiques (AFMB), UMR 7257, CNRS, Aix-Marseille Université, Marseille cedex 9, France
THOMAS CLAIRFEUILLE  Department of Structural Biology, Genentech Inc., South San Francisco, CA, USA
AGUSTÍN CORREA  Recombinant Protein Unit, Institut Pasteur de Montevideo, Montevideo, Uruguay; Research Laboratory on Chronic Lymphocytic Leukemia, Institut Pasteur de Montevideo, Montevideo, Uruguay
SANDRA CORTÈS  Synthelis, Biopolis, Grenoble, France
SUE-LI DAHLROTH  Department of Materials and Environmental Chemistry, Stockholm University, Stockholm, Sweden


HERVÉ DARBON  Unité Mixte de Recherche (UMR) 7257, Centre National de la Recherche Scientifique (CNRS) Aix-Marseille Université, Architecture et Fonction des Macromolécules Biologiques (AFMB), Marseille, France
TRISHA DELA VEGA  Department of Biomolecular Resources, Genentech, Inc., South San Francisco, CA, USA
PETRA DRNCOVA  The European Molecular Biology Laboratory (EMBL), Grenoble Cedex 9, France
STEFAN DÜBEL  Technische Universität Braunschweig, Institut für Biochemie, Biotechnologie und Bioinformatik, Abteilung Biotechnologie, Braunschweig, Germany
YOAN DUHOO  Unité Mixte de Recherche (UMR) 7257, Centre National de la Recherche Scientifique (CNRS) Aix-Marseille Université, Architecture et Fonction des Macromolécules Biologiques (AFMB), Marseille, France
FABIEN DURBESSON  Architecture et Fonction des Macromolécules Biologiques (AFMB), Unité Mixte de Recherche (UMR) 7257, Centre National de la Recherche Scientifique (CNRS) Aix-Marseille Université, Marseille, France
GEORGE DUTINA  Department of Early Stage Cell Culture, Genentech, Inc., South San Francisco, CA, USA
SAFIA EZZINE  Synthelis, Biopolis, Grenoble, France
FRANCISCO J. FERNÁNDEZ  Abvance Biotech srl, Madrid, Spain; Center for Biological Research, Spanish National Research Council, Madrid, Spain
DANIEL J. FITZGERALD  Geneva Biotech SARL, Geneva, Switzerland
CARLOS M. G. A. FONTES  CIISA-Faculdade de Medicina Veterinária, Universidade de Lisboa, Avenida da Universidade Técnica, Lisbon, Portugal; NZYTech Genes and Enzymes, Campus do Lumiar, Estrada do paço do Lumiar, Lisbon, Portugal
PATRICK FOURQUET  Université Aix-Marseille, Inserm, CNRS, Institut Paoli-Calmettes, CRCM, Marseille Protéomique, Marseille, France
YVONNE FRANKE  Department of Biomolecular Resources, Genentech, Inc., South San Francisco, CA, USA
ANDRÉ FRENZEL  Technische Universität Braunschweig, Institut für Biochemie, Biotechnologie und Bioinformatik, Abteilung Biotechnologie, Braunschweig, Germany; YUMAB GmbH, Braunschweig, Germany
VIOLA FÜHNER  Technische Universität Braunschweig, Institut für Biochemie, Biotechnologie und Bioinformatik, Abteilung Biotechnologie, Braunschweig, Germany
FREDERIC GARZONI  Imophoron Ltd., Unit DX, St Philips Central, Bristol, UK
MICHELLE R. GAYLORD  Genomics Institute of the Novartis Research Foundation, San Diego, CA, USA
NICOLAS GILLES  CEA/DRF/Joliot, Service d’Ingénierie Moléculaire des Protéines, Gif-sur-Yvette, France
VIRGINIE GIRAULT  Unité Récepteurs-Canaux, Département of Neuroscience, CNRS UMR 3571, Institut Pasteur, Paris, France
SCOTT M. GLASER  Genomics Institute of the Novartis Research Foundation, San Diego, CA, USA
SARA GÓMEZ  Center for Biological Research (CIB-CSIC), Structural and Chemical Biology, Madrid, Spain; Center for Biological Research, Spanish National Research Council, Madrid, Spain
FRANKLIN A. HAYS  Department of Biochemistry and Molecular Biology, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA; Stephenson Cancer Center, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA; Harold Hamm Diabetes Center, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA
KATHARINE HEERINGA  Department of Biomolecular Resources, Genentech, Inc., South San Francisco, CA, USA
FATIMA-EZZAHRA HIBTI  Horiba Scientific, Palaiseau, France
HOANGDUNG HO  Department of Structural Biology, Genentech Inc., South San Francisco, CA, USA
MICHAEL HUST  Technische Universität Braunschweig, Institut für Biochemie, Biotechnologie und Bioinformatik, Abteilung Biotechnologie, Braunschweig, Germany
JIYOUNG HWANG  23andMe Inc., South San Francisco, CA, USA
EDWIN JACOX  Institute of Developmental Biology of Marseille (IBDM), Aix-Marseille Université/CNRS, Marseille cedex 9, France; Centre de Recherches de Biologie cellulaire de Montpellier (CRBM), Université de Montpellier/CNRS, Montpellier, France
CHRISTINE C. JAO  Department of Structural Biology, Genentech Inc., South San Francisco, CA, USA
JENNIFER M. JOHNSON  Department of Biochemistry and Molecular Biology, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA
WAYNE A. JOHNSTON  Centre for Tropical Crops and Biocommodities, Queensland University of Technology, Brisbane, QLD, Australia
STEFAN KNAPP  Target Discovery Institute and Structural Genomics Consortium, Oxford University, Oxford, UK; Goethe-University Frankfurt, Institute of Pharmaceutical Chemistry and Buchmann Institute for Life Sciences, Frankfurt am Main, Germany; German Cancer Network (DKTK), Frankfurt am Main, Germany
CHRISTOPHER M. KOTH  Department of Structural Biology, Genentech Inc., South San Francisco, CA, USA
EDWARD KRAFT  Department of Biomolecular Resources, Genentech, Inc., South San Francisco, CA, USA
CHRISTINE KUGEL  Department of Biomolecular Resources, Genentech, Inc., South San Francisco, CA, USA
MAGDALENA J. LASKA  Department of Biomedicine, Bartholins Allé 6, University of Aarhus, Aarhus, Denmark
GRACE LEE  Department of Biomolecular Resources, Genentech, Inc., South San Francisco, CA, USA
ISABELLE LEHOUX  Department of Biomolecular Resources, Genentech, Inc., South San Francisco, CA, USA
PATRICK LEMAIRE  Institute of Developmental Biology of Marseille (IBDM), Aix-Marseille Université/CNRS, Marseille cedex 9, France; Centre de Recherches de Biologie cellulaire de Montpellier (CRBM), Université de Montpellier/CNRS, Montpellier cedex 5, France
SCOTT A. LESLEY  Genomics Institute of the Novartis Research Foundation, San Diego, CA, USA; Merck Research Laboratories, South San Francisco, CA, USA
ZHONG RONG LI  Department of Biomolecular Resources, Genentech, Inc., South San Francisco, CA, USA
LEE LIOR-HOFFMANN  Department of Biomolecular Resources, Genentech, Inc., South San Francisco, CA, USA
CHRIS LONERGAN  Department of Biomolecular Resources, Genentech, Inc., South San Francisco, CA, USA
LYNN MARTIN  Department of Biomolecular Resources, Genentech, Inc., South San Francisco, CA, USA


AMÉLIE MASSEMIN  Institut de Pharmacologie et de Biologie Structurale, IPBS, Université de Toulouse, CNRS, UPS, Toulouse, France
SEBASTIAN MATHEA  Target Discovery Institute and Structural Genomics Consortium, Oxford University, Oxford, UK; Goethe-University Frankfurt, Institute of Pharmaceutical Chemistry and Buchmann Institute for Life Sciences, Frankfurt am Main, Germany; German Cancer Network (DKTK), Frankfurt am Main, Germany; German Cancer Centre (DKFZ), Heidelberg, Germany
JESSICA J. MILLER  Genomics Institute of the Novartis Research Foundation, San Diego, CA, USA
SHAYLI VARASTEH MORADI  Centre for Tropical Crops and Biocommodities, Queensland University of Technology, Brisbane, QLD, Australia; Institute for Molecular Bioscience, The University of Queensland, St. Lucia, QLD, Australia
KYLE MORTARA  Department of Biomolecular Resources, Genentech, Inc., South San Francisco, CA, USA
JOANNE E. NETTLESHIP  Research Complex at Harwell, Rutherford Appleton Laboratory Harwell Oxford, Oxford, UK; Division of Structural Biology, Henry Wellcome Building for Genomic Medicine, University of Oxford, Oxford, UK
LANANH NGUYEN  Department of Biomolecular Resources, Genentech, Inc., South San Francisco, CA, USA
KAZUHIRO R. NITTA  Institute of Developmental Biology of Marseille (IBDM), Aix-Marseille Université/CNRS, Marseille cedex 9, France; Division of Genomic Medicine, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, Japan
YVES NOMINÉ  Équipe Labellisée Ligue 2015, Department of Integrated Structural Biology, Institut de Génétique et de Biologie Moléculaire et Cellulaire (IGBMC), INSERM U1258/CNRS UMR 7104/Université de Strasbourg, Illkirch, France
PÄR NORDLUND  DotBio Pte. Ltd., 1 Research Link, #05-30, Singapore, Singapore; Department of Materials and Environmental Chemistry, Stockholm University, Stockholm, Sweden; Department of Oncology-Pathology, Cancer Center Karolinska, Karolinska Institutet, Stockholm, Sweden; School of Biological Sciences, Nanyang Technological University, Singapore, Singapore; Department of Oncology-Pathology, Karolinska Institutet, Stockholm, Sweden
YUKIO OHTSUKA  Institute of Developmental Biology of Marseille (IBDM), Aix-Marseille Université/CNRS, Marseille cedex 9, France; Biomedical Research Institute, National Institute of Advanced Industrial Science and Technology (AIST), Tsukuba, Ibaraki, Japan
PABLO OPPEZZO  Recombinant Protein Unit, Institut Pasteur de Montevideo, Montevideo, Uruguay; Research Laboratory on Chronic Lymphocytic Leukemia, Institut Pasteur de Montevideo, Montevideo, Uruguay
CLAUDIA ORTEGA  Recombinant Protein Unit, Institut Pasteur de Montevideo, Montevideo, Uruguay; Research Laboratory on Chronic Lymphocytic Leukemia, Institut Pasteur de Montevideo, Montevideo, Uruguay
RAYMOND J. OWENS  Research Complex at Harwell, Rutherford Appleton Laboratory Harwell Oxford, Oxford, UK; Division of Structural Biology, Henry Wellcome Building for Genomic Medicine, University of Oxford, Oxford, UK
JIAN PAYANDEH  Department of Biomolecular Resources, Genentech, Inc., South San Francisco, CA, USA; Department of Structural Biology, Genentech Inc., South San Francisco, CA, USA
JEAN-DENIS PEDELACQ  Institut de Pharmacologie et de Biologie Structurale, IPBS, Université de Toulouse, CNRS, UPS, Toulouse, France


ANDREW PEREZ  Department of Biomolecular Resources, Genentech, Inc., South San Francisco, CA, USA
FANNY PEYSSON  Unité Mixte de Recherche (UMR) 7257, Centre National de la Recherche Scientifique (CNRS) Aix-Marseille Université, Architecture et Fonction des Macromolécules Biologiques (AFMB), Marseille, France
HEATHER RADA  Research Complex at Harwell, Rutherford Appleton Laboratory Harwell Oxford, Oxford, UK; Division of Structural Biology, Henry Wellcome Building for Genomic Medicine, University of Oxford, Oxford, UK
LAURIE RAMOND  Unité Mixte de Recherche (UMR) 7257, Centre National de la Recherche Scientifique (CNRS) Aix-Marseille Université, Architecture et Fonction des Macromolécules Biologiques (AFMB), Marseille, France
VLADIMIR RAZINKOV  Drug Product Technologies, Amgen, Inc., Thousand Oaks, CA, USA
SARAH M. RUE  Genomics Institute of the Novartis Research Foundation, San Diego, CA, USA
GIULIO RUSSO  Technische Universität Braunschweig, Institut für Biochemie, Biotechnologie und Bioinformatik, Abteilung Biotechnologie, Braunschweig, Germany
NATALIE J. SAEZ  Unité Mixte de Recherche (UMR) 7257, Centre National de la Recherche Scientifique (CNRS) Aix-Marseille Université, Architecture et Fonction des Macromolécules Biologiques (AFMB), Marseille, France; Institute for Molecular Bioscience, The University of Queensland, St. Lucia, QLD, Australia
EIDARUS SALAH  Target Discovery Institute and Structural Genomics Consortium, Oxford University, Oxford, UK
JUN SAMPANG  Department of Biomolecular Resources, Genentech, Inc., South San Francisco, CA, USA
DUYGU SARI-AK  The European Molecular Biology Laboratory (EMBL), Grenoble Cedex 9, France
YUTAKA SATOU  Department of Zoology, Graduate School of Science, Kyoto University, Kyoto, Japan
CHRISTIANE SCHAFFITZEL  School of Biochemistry and Bristol Synthetic Biology Centre BrisSynBio, University Walk, University of Bristol, Clifton, UK
KURT SCHROEDER  Department of Biomolecular Resources, Genentech, Inc., South San Francisco, CA, USA
ANA FILIPA SEQUEIRA  CIISA-Faculdade de Medicina Veterinária, Universidade de Lisboa, Avenida da Universidade Técnica, Lisbon, Portugal; NZYTech Genes and Enzymes, Campus do Lumiar, Estrada do paço do Lumiar, Lisbon, Portugal
STEPHANIE SHRIVER  Department of Biomolecular Resources, Genentech, Inc., South San Francisco, CA, USA
LOVEJIT SINGH  Department of Biomolecular Resources, Genentech, Inc., South San Francisco, CA, USA
DANIEL SOBRAL  Institute of Developmental Biology of Marseille (IBDM), Aix-Marseille Université/CNRS, Marseille cedex 9, France; Instituto Gulbenkian de Ciência, Rua da Quinta Grande, Oeiras, Portugal
DANIEL SOME  Wyatt Technology Corp., Santa Barbara, CA, USA
CLAIRE STRAIN-DAMERELL  Diamond Light Source Ltd., The Research Complex at Harwell, Harwell Science and Innovation Campus, Didcot, Oxfordshire, UK
ROBERT M. STROUD  Department of Biochemistry and Biophysics, University of California, San Francisco, CA, USA


ZACHARY SWEENEY  Department of Discovery Chemistry, Genentech Inc., South San Francisco, CA, USA; Denali Therapeutics, South San Francisco, CA, USA
CHRISTINE TAM  Department of Biomolecular Resources, Genentech, Inc., South San Francisco, CA, USA
SHU TI  Department of Biomolecular Resources, Genentech, Inc., South San Francisco, CA, USA
GILLES TRAVÉ  Équipe Labellisée Ligue 2015, Department of Integrated Structural Biology, Institut de Génétique et de Biologie Moléculaire et Cellulaire (IGBMC), INSERM U1258/CNRS UMR 7104/Université de Strasbourg, Illkirch, France
JEREMY TURCHETTO  Unité Mixte de Recherche (UMR) 7257, Centre National de la Recherche Scientifique (CNRS) Aix-Marseille Université, Architecture et Fonction des Macromolécules Biologiques (AFMB), Marseille, France
ALEX J. VECCHIO  Department of Biochemistry and Biophysics, University of California, San Francisco, CA, USA; Department of Biochemistry, University of Nebraska–Lincoln, Lincoln, NE, USA
M. CRISTINA VEGA  Center for Biological Research (CIB-CSIC), Structural and Chemical Biology, Madrid, Spain; Center for Biological Research, Spanish National Research Council, Madrid, Spain
RENAUD VINCENTELLI  Architecture et Fonction des Macromolécules Biologiques (AFMB), Unité Mixte de Recherche (UMR) 7257, Centre National de la Recherche Scientifique (CNRS), Aix-Marseille Université, Marseille cedex 9, France
GEOFFREY S. WALDO  Bioscience Division, MS-M888, Los Alamos National Laboratory, Los Alamos, NM, USA
KENNETH W. WALKER  Amgen Research, Amgen, Inc., One Amgen Center, Thousand Oaks, CA, USA
YE NAING WIN  Department of Biomolecular Resources, Genentech, Inc., South San Francisco, CA, USA
DWIGHT WINTERS  Amgen Research, Amgen, Inc., One Amgen Center, Thousand Oaks, CA, USA
NICOLAS WOLFF  Unité Récepteurs-Canaux, Département of Neuroscience, CNRS UMR 3571, Institut Pasteur, Paris Cedex 15, France
ATHENA WONG  Department of Early Stage Cell Culture, Genentech, Inc., South San Francisco, CA, USA
HUI XU  Department of Structural Biology, Genentech Inc., South San Francisco, CA, USA
INNA ZILBERLEYB  Department of Biomolecular Resources, Genentech, Inc., South San Francisco, CA, USA

Part I General Procedures for High-Throughput Protein Production

Chapter 1 Overview of High-Throughput Cloning Methods for the Post-genomic Era

Claudia Ortega, Cecilia Abreu, Pablo Oppezzo, and Agustín Correa

Abstract

The advent of new DNA sequencing technologies has led to a dramatic increase in the number of available genome sequences and therefore of target genes with potential for functional analysis. The insertion of these sequences into suitable expression vectors requires a simple and efficient cloning method. In addition, when expressing a target protein, it is often necessary to evaluate different DNA constructs to achieve soluble and homogeneous expression of the target with satisfactory yields. The development of new molecular methods has made it possible to clone a huge number of DNA sequences in a high-throughput manner, which is necessary to meet the increasing demands for soluble protein expression and characterization. In this chapter several molecular methods suitable for high-throughput cloning are reviewed.

Key words High-throughput cloning, Restriction-based cloning, Ligation-independent cloning, Recombinational cloning, PCR-based cloning, Restriction enzymes

1 Introduction

The discovery of DNA ligase in 1967 [1] and of the first restriction enzymes (HindIII and EcoRI) in the early 1970s [2, 3] gave rise to recombinant DNA technology, where DNA sequences can be joined to form a hybrid molecule not present in nature, as was demonstrated for the first time by Jackson et al. in 1972 [4]. Soon after, the finding that DNA molecules from different species can be cloned and replicated in other species [5] encouraged the cloning of human somatostatin [6] and insulin [7] for expression in Escherichia coli, marking the beginning of heterologous recombinant gene expression technologies. Since their discovery, molecular methods have advanced dramatically, and the cloning of DNA fragments for subsequent heterologous protein expression has become routine practice in a large number of laboratories worldwide.

Renaud Vincentelli (ed.), High-Throughput Protein Production and Purification: Methods and Protocols, Methods in Molecular Biology, vol. 2025, https://doi.org/10.1007/978-1-4939-9624-7_1, © Springer Science+Business Media, LLC, part of Springer Nature 2019


The complete sequencing of many eukaryotic, bacterial, and viral genomes has provided a massive amount of protein sequence information with potential for structural and functional analysis. In this regard, several structural genomics projects have been initiated that implemented high-throughput (HTP) cloning, protein expression, and crystallization screening methods to obtain thousands of protein structures [8–10]. In addition, for many targets, the soluble expression of the gene of interest (GOI) may require the evaluation of different expression conditions, including different promoters, solubility enhancer proteins, and co-expression of chaperones or molecular partners [11–14]. This results in a dramatic increase in the number of gene constructs that are necessary for a particular target, underlining the importance of selecting an efficient cloning method.

Although traditional restriction-ligation based cloning methods are still widely employed in many research laboratories, they are not well suited to HTP cloning projects because of several disadvantages. For example, incomplete DNA digestion and poor ligation yields reduce cloning efficiency, and each target sequence has to be inspected for the absence of internal restriction sites, which makes these methods impractical when cloning several DNA fragments into multiple vectors. In addition, unwanted amino acids derived from the restriction site sequence can be introduced into the expressed protein, and the target DNA can only be cloned at the vector position where the selected restriction site is present, restricting many applications.

In order to overcome these limitations, several different cloning strategies were developed over the last 25 years. They include methods based on the use of rare-cutting enzymes; on the generation of overlapping single-stranded overhangs with exonucleases; or on the site-directed recombination of specific sequences, as well as methods based on PCR for the integration of the target gene into the destination vector. In this chapter, several examples of cloning methods based on these strategies are described.

2 Restriction-Based Methods

Restriction-based methods have been used since the initial stages of genetic engineering. In recent years, several modifications to the initial protocols were developed to circumvent some of the limitations mentioned above.

2.1 Flexi® Cloning System

Promega® developed a technique called the Flexi Vector System. This method is based on two rare-cutting restriction enzymes, SgfI and PmeI, which recognize the DNA sequences GCGATCGC and GTTTAAAC, respectively. They present a combined digestion frequency lower than 1.2% in human ORFs and other organisms such as M. musculus (1.2%), S. cerevisiae (2.96%), A. thaliana (2.4%), and E. coli (6.35%); thus, the probability of having a target gene that contains an internal recognition sequence for these enzymes is very low [15]. The Flexi Vectors were designed to contain various expression or peptide tag options to enable expression of native or fusion proteins.

To clone a GOI into the Flexi system, primers with adaptor sequences are used. The SgfI site is added at the 5′ end while the PmeI site is added at the 3′ end of the coding sequence, keeping the cloning process directional. The digestion of the Flexi vector with these enzymes releases the lethal barnase gene, which is replaced after ligation with the target gene and acts as a positive selection for successful ligation of the insert. The efficiency of Flexi Vector cloning can match or exceed that of recombination cloning both for initial capture of PCR products and for transfer between different vector combinations, leading to savings in time and cost [16]. In addition, several HTP projects were carried out effectively, with more than 3500 human clones generated and expressed with this technique [17, 18].

Although the inspection for internal recognition sites in the target can be avoided, the restriction sites may introduce additional nucleotides to the GOI, leaving a "scar" in the target sequence and thus incorporating unwanted amino acids into the expressed protein. In addition, the cloning will occur only where the restriction sites are present in the vector, limiting other applications.

2.2 Golden Gate Assembly Method

Another method based on the use of special restriction enzymes is golden gate assembly [19–22]. This technique depends on the use of type IIS restriction endonucleases (REases), whose recognition sites are distal from their cut sites. The inserts and cloning vectors are designed to place the type IIS recognition site distal to the cleavage site, so the REase can remove the recognition sequence from the assembly, avoiding the introduction of unwanted sequences to the GOI. There are several different type IIS endonucleases available (BbsI, BsmBI, and Esp3I), of which the most commonly used is BsaI. The BsaI recognition sequence (GGTCTC) is separated from the generated four-base overhang by a single base, and the enzyme activity is independent of the sequences of the single-base spacer and the four-base overhang (Fig. 1). The recognition site for BsaI is not palindromic and is for that reason directional.

The PCR product containing the GOI as well as the insertion site of the destination vector are flanked by two BsaI recognition sites. If only the PCR product is mixed with BsaI and ligase, the PCR product is (reversibly) digested, resulting in three DNA fragments, and ligated back together again. The same is true for the destination vector. However, if the PCR product and the destination vector (each of which contains 4 bp complementary sequences at its 5′ and 3′ ends) are mixed together with BsaI and ligase, the cut linearized destination vector will ligate with the cut PCR product containing the target DNA. This particular ligation is irreversible, because the ligation product no longer contains any BsaI recognition sequences (Fig. 1). This method yields close to 100% correct recombinant plasmids in one tube and one step after a restriction-ligation reaction of just 5 min [20]. In addition, this method also allows the assembly of multiple DNA fragments, even if repetitive elements are present [23], as well as the introduction of multiple site-directed mutations [24]. Moreover, by linearizing the plasmid by PCR with primers that contain the BsaI recognition sequence and are directed to a specific position in the vector, it is possible to clone DNA fragments at any location.

Golden gate circumvents the addition of unwanted sequences to the target, but the DNA fragment and the destination vector must still be inspected for the absence of internal BsaI recognition sites. New England Biolabs developed a web tool that helps in primer design, simplifying the cloning process (https://goldengate.neb.com/editor).

Recently, golden gate was used for the generation of plasmid libraries containing different features for recombinant protein expression. In this regard, six donor plasmids and one acceptor vector were combined to generate a panel of expression constructs in a single golden gate reaction [25]. Each donor plasmid contained only one component: a promoter, His-tag, fusion partner, protease cleavage site, target gene, or transcriptional terminator sequence. For expression screening in E. coli, a panel of different promoters, solubility enhancer fusion proteins, and target genes was combined to generate 27 different constructs. The same strategy was also successful in the generation of DNA constructs for expression in Pichia pastoris, where 18 expression plasmids containing different promoters, secretion factors, and target genes were obtained, facilitating the HTP cloning of multiple fragments for protein expression screening in different hosts [25].

Fig. 1 Golden gate assembly. This method uses Type IIS restriction endonucleases (REases) such as BsaI. The recognition sequence (GGTCTC) is separated from its four bp overhang by a single bp. The PCR product containing the GOI is flanked by two BsaI recognition sites (black), and complementary sequences to the insertion site in the vector (green and red). After BsaI digestion, the BsaI sequences are left behind, with each fragment bearing the designed 4-nt overhangs that direct the assembly by ligation
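Both restriction-based methods described in this section depend on whether the target contains an internal recognition site for the enzymes used. The following Python sketch shows one way such a screen might be run on both strands of a sequence. The recognition sequences are those quoted in the text; the function names and the example ORF are hypothetical and serve only as an illustration, not as part of any published protocol.

```python
# Recognition sequences quoted in the text (5'->3').
RECOGNITION_SITES = {
    "SgfI": "GCGATCGC",
    "PmeI": "GTTTAAAC",
    "BsaI": "GGTCTC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_internal_sites(seq, enzymes=("BsaI",)):
    """Return {enzyme: [(position, strand), ...]} for each recognition site.

    Positions are 0-based indices on the given (top) strand. Strand "-"
    means the site lies on the complementary strand.
    """
    seq = seq.upper()
    hits = {}
    for name in enzymes:
        site = RECOGNITION_SITES[name]
        patterns = {"+": site}
        rc = reverse_complement(site)
        if rc != site:
            # Non-palindromic sites (e.g. BsaI) must be searched on both strands.
            patterns["-"] = rc
        found = []
        for strand, pattern in patterns.items():
            start = seq.find(pattern)
            while start != -1:
                found.append((start, strand))
                start = seq.find(pattern, start + 1)
        hits[name] = sorted(found)
    return hits


# Hypothetical ORF carrying one BsaI site on each strand.
example_orf = "ATGGGTCTCAAAGAGACCTAA"
print(find_internal_sites(example_orf, ("BsaI", "SgfI", "PmeI")))
```

A target for which every list is empty could be cloned without removing internal sites; any hit would need to be eliminated (for example by silent mutation) or the enzyme exchanged.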

3 Ligation-Independent Cloning Methods

The drawbacks of traditional restriction-based methods and their limitations for HTP cloning projects led to the development of ligation-independent cloning (LIC) methods that are simpler, faster, and highly efficient [26–28]. These strategies take advantage of the 3′–5′ exonuclease activity of T4 DNA polymerase to generate DNA fragments flanked with 5′ overhangs, allowing the directional cloning of any insert by complementarity, independent of restriction enzymes and in vitro ligation.

3.1 Ligation-Independent Cloning (LIC)

In the presence of all four deoxyribonucleoside triphosphates (dNTPs), the polymerase activity of T4 DNA polymerase is much faster than the exonuclease activity. However, if only one dNTP is present, the 3′–5′ exonuclease activity will remove nucleotides from the 3′ ends until the same dNTP present in the reaction appears in the DNA sequence. Taking this into account, chimeric primers used to amplify the GOI are designed to contain at their 5′ ends 12–15 nucleotide (nt) sequences complementary to the insertion site in the destination vector and lacking, for example, dCMP, followed by specific sequences for the target gene (usually 18–25 nt) at their 3′ ends. As a result, the amplification products include 12–15 nt sequences lacking dGMP at their 3′ ends, which can be removed by the 3′–5′ exonuclease activity of T4 DNA polymerase in the presence of dGTP, and the remaining overhangs are complementary to the insertion site.

The destination vector for LIC must be linearized. This can be done by PCR or by restriction endonuclease digestion, such that after linearization the vector is flanked on each side by the same 12–15 nt sequences lacking dCMP that are present in the chimeric primers. In this way they can act as the annealing sites for the LIC reaction. The linearized plasmid is also treated with T4 DNA polymerase, which generates single-stranded overhangs that are complementary to those generated in the target gene, enabling the hybridization of plasmid molecules with the PCR fragment (Fig. 2). The recombinant molecule is ligated in vivo after transformation by the bacterial DNA recombination and repair machinery. The LIC procedure is rather simple, since only a single type of enzymatic reaction is required prior to transformation, and it results in good cloning efficiencies.

LIC was successfully used for gene cloning in HTP formats [29]. By using this method, it was possible to assemble in an automated manner more than 600 TALEN (transcription activator-like effector nuclease) genes in a single day with 89% efficiency [30]. Additionally, 130 genes encoding glycoside hydrolases from 13 different organisms were cloned in parallel in two vectors in a 96-well plate cloning format and subjected to a semi-automated protein expression and solubility screening in E. coli [31]. Similarly, the cloning of Pyrococcus furiosus genes at a whole-genome level was achieved using a modified λ exonuclease-based LIC method reaching >80% efficiency [32]. In this regard, the 2125 ORFs from P. furiosus were amplified by two-step PCR and cloned into the expression vector pDEST17 for expression screening in the Rosetta 2(DE3)pLysS strain. Primary results with the recombinant expression library demonstrated that around 70% of P. furiosus proteins could be produced in this E. coli strain [32].

3.2 Sequence- and Ligation-Independent Cloning (SLIC)

Ligation-independent cloning (LIC) has been extensively used as an HTP method due to its uniformity and cost-effectiveness, but it has sequence constraints. As mentioned before, the 12–15 base overhangs must not contain the dNTP present in the reaction to allow T4 polymerase to chew back the entire 12–15 bases, so the use of LIC is often limited to specially designed plasmids. As a result, a modification of the method, known as Sequence- and Ligation-Independent Cloning (SLIC), was developed that eliminates many of LIC's constraints [33]. A key feature of SLIC is that it exploits the homologous recombination system of E. coli, which allows for the repair of gaps and overhangs based on regions of sequence homology. This process can occur through one of two pathways: RecA-mediated recombination or RecA-independent single-stranded annealing. It was demonstrated that it is possible to generate imperfect "recombination intermediates" through PCR and imprecise T4 exonuclease activity. Unlike LIC, the length of the single-stranded DNA generated is controlled by the time of exonuclease treatment. Therefore, a special dCMP-free sequence is no longer required. As long as there is enough sequence homology (20–60 bp) to organize the fragments and hold them together, E. coli is able to "repair" the plasmid, generating recombinant DNA molecules (Fig. 2). Adding purified RecA to the pre-transformation incubation enhances the repair process, increasing efficiency [33].

SLIC is even compatible with incomplete PCR fragments. If PCR does not include a final extension step, many of the products will have single-stranded overhangs due to incomplete extension, and these fragments can induce recombination. SLIC is ideal for multicomponent assembly, as overlapping sequence homology specifies the order of multiple fragments, and the assembly is scarless. With 40 bp homology regions, a five-piece assembly reaction is highly efficient (~80%). Ten-fragment assembly can also be successful, but at a lower efficiency (~20%) [33], so for the construction of plasmids with more than six fragments, In-Fusion [34], the Gibson method [35], or CPEC cloning [36] are recommended; In-Fusion and Gibson assembly are explained in Subheadings 3.3 and 3.4, and CPEC cloning later in this chapter.

Jeong and colleagues optimized SLIC parameters to make it comparable to commercial methods in terms of simplicity, time saving, and cloning efficiency [37]. In one-step SLIC the vector and insert(s) are mixed and incubated at room temperature for 2.5 min with T4 DNA polymerase to generate 5′ overhangs. For optimal results, a molar vector-to-insert ratio of 1:2 to 1:4 is desirable. Then, the reaction mixture is placed on ice for 10 min for single-strand annealing, and competent E. coli cells are transformed directly with the annealed DNA complex, achieving homologous recombination in vivo with high efficiency (100% and >85% when a single fragment or multiple fragments are cloned, respectively) [37].

Several groups have contributed to this method by developing vectors suitable for SLIC cloning. In this regard, Scholz and colleagues generated more than 30 different expression vectors [38]. They modified vectors based on pET, pFastBac, and pTT backbones for parallel cloning and protein expression screening in E. coli, insect, and HEK293E cells, respectively. Another improvement they made was to introduce the ccdB gene under the control of a strong constitutive promoter for counterselection of insert-less vector. The ccdB gene codes for the coupled cell division B protein, a GyrA inhibitor that interferes with DNA gyrase activity, leading to cell death when the ccdB gene is expressed. In contrast to the DpnI treatment commonly used to reduce vector background, ccdB is 100% efficient in killing cells carrying the parental vector [38].

The concept of cloning through the generation of single-stranded DNA overhangs by exonuclease activity in a DNA fragment and a linearized destination vector, followed by annealing and ligation of the recombinant DNA molecules, was the basis for other related cloning methods, including In-Fusion Cloning and Gibson Assembly.

Fig. 2 (Sequence and) ligation-independent cloning. The GOI and destination vector are amplified using chimeric primers that have complementary sequences at their 5′ ends that will act as the annealing sites for the LIC reaction. Then both PCR products are digested by the 3′–5′ exonuclease activity of T4 DNA polymerase, generating 5′ single-strand overhangs. For LIC cloning, the 3′ digestion occurs until the dNTP present in the reaction (dGTP in this case) appears in the DNA sequence, while for SLIC cloning the digestion is controlled by the time of exonuclease treatment. The generated 5′ single-stranded complementary ends allow directional annealing of insert and vector. The generated double-nicked recombinant molecule is repaired in vivo after transformation
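The dNTP-limited digestion that distinguishes LIC from SLIC can be illustrated with a short Python sketch. It is an idealized model under stated assumptions: digestion of one strand proceeds from the 3′ end and stops exactly at the first base that matches the single dNTP supplied, and the tail sequence used in the example is hypothetical rather than taken from any published LIC vector.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def t4_chew_back(strand_5to3, dntp_base):
    """Idealized 3'->5' digestion of one strand with a single dNTP present.

    Bases are removed from the 3' end until the first base equal to
    `dntp_base` is reached; that base is retained. If the base never
    occurs, the model returns an empty string.
    """
    strand = strand_5to3.upper()
    last = strand.rfind(dntp_base.upper())
    return strand[: last + 1]


def digested_length(strand_5to3, dntp_base):
    """Number of nucleotides removed from the 3' end in this model."""
    return len(strand_5to3) - len(t4_chew_back(strand_5to3, dntp_base))


# Hypothetical 13-nt primer tail that lacks C, as required for LIC with dGTP.
primer_tail = "TAGGATTGGATTG"
assert "C" not in primer_tail

# The strand copied from the primer ends (3') with the reverse complement
# of the tail, which therefore lacks G and is fully removed with dGTP.
copied_strand = "ATGGCTAAAG" + reverse_complement(primer_tail)
print(t4_chew_back(copied_strand, "G"))      # strand left after digestion
print(digested_length(copied_strand, "G"))   # nucleotides removed
```

In this model the 13 removed nucleotides expose the primer tail on the opposite strand as a 5′ overhang. In SLIC, by contrast, the removed length would depend on incubation time rather than on sequence, which is why no dCMP-free tail is needed there.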

3.3 In-Fusion Cloning System

In-Fusion is a commercially available cloning system from Clontech Inc., which fuses DNA fragments, e.g., PCR-generated target sequences and linearized vectors, by recognizing a 15 bp overlap at their ends. The process includes a PCR reaction with one set of primers for the amplification of the target containing the 15 bp overlap, followed by a homologous recombination reaction between the insert and the linearized destination vector. This last step is performed by incubating the DNA fragments (insert and vector) at 50 °C for 15 min [39] or 42 °C for 30 min in the presence of a special enzyme mix [34, 40, 41] (Fig. 3a).

In-Fusion cloning is based on the 3′–5′ exonuclease proofreading activity of vaccinia virus DNA polymerase, which is capable of degrading double-stranded DNA ends, thus exposing single-stranded DNA overhangs [41, 42]. DNA molecules containing short sequence homology at their ends, such as a PCR product and a linearized vector, can hybridize through the generated overhangs, forming a hybrid molecule that is resistant to exonuclease activity and can accumulate over time. Additionally, in In-Fusion cloning, the single-strand annealing reaction is enhanced by the action of vaccinia virus single-strand DNA binding protein, increasing the efficiency of the process [42]. The resulting hybrid molecule contains nicks or small gaps at the junctions that are repaired upon transformation into E. coli to covalently join the vector and insert [41, 42] (Fig. 3a). Because this method imposes no sequence requirements, the target gene can be inserted at any location in the destination vector and the introduction of undesired amino acids into the expressed protein can be easily avoided [40].

For In-Fusion cloning, chimeric primers containing overlapping sequences at the 5′ end and gene-specific sequences at the 3′ end should be designed. In this regard, a web tool that facilitates the design of the appropriate primers is available from Clontech (www.clontech.com). By using this system, cloning of DNA fragments with a wide size range (83–12,000 bp), as well as simultaneous insertion of target sequences or mutations at different positions, was performed with high efficiency and accuracy [34].

A versatile vector suite compatible with In-Fusion cloning was generated and incorporated into a semi-automated pipeline for HTP cloning and expression screening, reaching high cloning efficiencies (up to 94%) [40]. With this vector suite, the cloning is done in a single-step reaction and in a parallel manner, where the PCR products can be cloned into different vectors with different fusion tags (N-His, N-His-GST, and N-His-MBP) using a 96-well plate format. In addition, through the design of a multi-promoter vector, it was possible to express recombinant targets in multiple hosts (E. coli, mammalian, and insect cell lines), demonstrating the flexibility of this vector suite [40].

The In-Fusion method was also recently applied in the successful HTP cloning of antibody VH and VL sequences for their expression in human embryonic kidney 293 (HEK293) cells. By coupling the In-Fusion strategy with the use of a bicistronic plasmid that links antibody expression with drug resistance, the time required from cloning V-region cDNA to producing large-scale monoclonal antibodies from stable cell pools in a high-throughput format was reduced from 4–6 months to only 4–6 weeks [43].

Besides the cloning of target DNA fragments into vectors, In-Fusion was also used for the simultaneous assembly of genetic circuits from standardized biological parts called BioBricks. These include different promoters, ribosome-binding sites (RBS), protein or RNA-coding sequences, and transcriptional terminators. By applying In-Fusion cloning, the assembly of different BioBricks was achieved in a faster, more flexible, and more efficient manner than with traditional methods, making it useful for synthetic biology projects [44].

Fig. 3 (a) In-Fusion system. This method requires a 15 bp homology sequence between the linearized destination vector and the GOI. In this regard, the plasmid is linearized by PCR and the GOI is amplified with chimeric primers that contain at their 5′ ends 15 nt complementary to the insertion site in the vector (represented in green and red). The technique depends on the 3′–5′ exonuclease activity that digests the ends of the GOI and linearized plasmid, thereby generating 5′ single-stranded overhangs, which are able to anneal to compatible ends at the destination vector. (b) Gibson assembly in one-step isothermal in vitro recombination. Two adjacent DNA fragments sharing terminal sequence overlaps (red) are generated and mixed together. The action of T5 exonuclease removes nucleotides from the 5′ ends of double-stranded DNA molecules, generating 3′ single-stranded overhangs. After the annealing of the fragments, the gaps are filled by the action of Phusion DNA polymerase and the nicks are repaired by Taq DNA ligase. All the reactions are performed at 50 °C in one single step

3.4 Gibson Assembly Method

Similar to In-Fusion, the Gibson method is based on in vitro recombination with the capacity to assemble and repair overlapping DNA molecules in a single isothermal step. By using this approach, joined products as large as 300 kb were correctly cloned into E. coli [35]. The method is based on the activities of three common enzymes: an exonuclease (to chew back the ends of double-stranded DNA, exposing single-stranded DNA overhangs), a DNA polymerase, and a ligase.

For cloning with Gibson assembly, the target DNA fragment has to be amplified with chimeric primers containing sequences overlapping the selected insertion site in the destination vector at the 5′ end and gene-specific sequences at the 3′ end. After exonuclease digestion of the ends of the amplified DNA fragment and the linearized destination vector, insert-to-plasmid annealing occurs at the cohesive complementary single-stranded ends. The single-stranded gaps are filled in by the action of DNA polymerase and the remaining nicks are then covalently sealed by Taq DNA ligase [45]. Because no special sequence requirements are needed for DNA assembly with Gibson, the introduction of unwanted sequences into the target is avoided and the insertion can be done at any desired position within the vector.

Three different variants of Gibson assembly were developed based on the same basic approach described above, of which the simplest and most suitable for HTP cloning purposes is the one-step isothermal assembly [35, 45]. In this variant, all three enzymes required for DNA assembly are simultaneously active. Moreover, the generated circular products can accumulate since they are not processed by any of the enzymes used, favoring cloning efficiency. In this protocol, the assembly is performed at 50 °C in the presence of a 5′ T5 exonuclease (to create 3′ single-stranded overhangs), Phusion DNA polymerase (to fill in the gaps within each annealed fragment), and Taq DNA ligase (to seal the nicks), and the complete reaction can be done in as few as 15 min [35] (Fig. 3b). With Gibson assembly, multiple DNA fragments with overlapping sequences (15–80 bp) can be efficiently cloned into a linearized destination vector [35, 46], reaching an efficiency of 98% for single-fragment cloning [47]. In addition, all enzymes for Gibson assembly are commercially available as a cloning kit or as separate reagents (New England Biolabs), and a web tool was developed to facilitate the proper design of the primers and of the length of the overlapping regions for Gibson assembly (NEBuilder, www.nebgibson.com).

Finally, there is an automated genomic workstation called the BioXp™ 3200 (SGI-DNA) that clones DNA fragments in a process that is almost hands-free. The instrument can currently assemble and clone 24 DNA fragments of interest simultaneously in an overnight run into a specially designed vector (pUCGA 1.0 vector). The clones generated with the BioXp system are immediately ready for transformation and further downstream analysis [48].
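The chimeric primers used for overlap-based methods such as In-Fusion and Gibson assembly share one layout: a 5′ tail matching the vector end and a 3′ portion specific for the gene. The Python sketch below shows how such primer pairs might be derived from the sequences flanking the insertion site. The function name, the argument names, and the 20-nt gene-specific length are assumptions for illustration (the 15-nt default overlap follows the In-Fusion description); real designs would also consider melting temperature and secondary structure, which are not modeled here.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_overlap_primers(vector_left, vector_right, gene, overlap=15, anneal=20):
    """Return (forward, reverse) chimeric primers, both written 5'->3'.

    vector_left  : top-strand vector sequence immediately upstream of the
                   insertion site
    vector_right : top-strand vector sequence immediately downstream
    gene         : top-strand sequence of the fragment to insert
    overlap      : length of the 5' tail shared with the vector end
    anneal       : length of the 3' gene-specific portion (assumed value)
    """
    if len(vector_left) < overlap or len(vector_right) < overlap:
        raise ValueError("vector flanks are shorter than the requested overlap")
    if len(gene) < anneal:
        raise ValueError("gene is shorter than the gene-specific portion")
    vector_left, vector_right, gene = (
        vector_left.upper(), vector_right.upper(), gene.upper()
    )
    # Forward primer: vector tail, then the start of the gene.
    forward = vector_left[-overlap:] + gene[:anneal]
    # Reverse primer: reverse complement of (gene end + vector start),
    # so the vector-derived tail again sits at the 5' end.
    reverse = reverse_complement(vector_right[:overlap]) + reverse_complement(
        gene[-anneal:]
    )
    return forward, reverse


# Toy sequences with short lengths so the layout is easy to follow.
fwd, rev = design_overlap_primers(
    "AAAACCCC", "GGGGTTTT", "ATGAAACCCGGGTAA", overlap=4, anneal=5
)
print(fwd, rev)
```

Because the tails are taken directly from the vector ends, the same function could be reused for any insertion position by changing the two flanking sequences.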

4 Site-Directed Recombination Methods

Several HTP cloning systems are based on recombinational cloning [49], and many groups have implemented this technology to construct expression libraries [50]. Recombination cloning is mediated by recombinases at site-specific sequences, eliminating the use of restriction enzymes and ligases. These sites are long sequences, ranging from 25 bp up to a couple of hundred base pairs. Recombinase-based systems are highly efficient, robust, fast, and reliable, and are therefore easily automated and suitable for HTP cloning.

4.1 Gateway Cloning

In the late 1990s, Invitrogen™ introduced the recombination-based Gateway Cloning Technology [51], a fast and easy way to transfer a target gene from one vector into another among the multitude of vectors already available. The Gateway Cloning System is based on a recombination event that occurs naturally, in this case between the bacteriophage λ (phage) and its host E. coli. The phage is a virus that infects bacteria and either propagates inside the host, leading to cell lysis, or integrates into the host genome and stays in a dormant form. The integration occurs between specific sites, named attachment sites (att sites), in a reaction mediated by a set of enzymes expressed by both phage and bacteria [52]. In this system, chimeric PCR primers containing a 3′ target gene-specific portion along with a common 5′ tail sequence for adding the flanking attB recombination sites are designed to amplify the GOI [53].


For Gateway cloning, two different recombination reactions are performed. The first recombination corresponds to the BP reaction, which involves the recombination between attB sites present in the target DNA fragment with the attP sites flanking the insertion site in the entry vector by the action of the BP clonase enzyme mix. The attB substrate should be in linear form and the attP-containing entry vector should be supercoiled for an efficient BP reaction [39]. The reaction is specific and directional since the attB1 site at 50 end of the amplified GOI recombines only with the attP1 site present in the entry vector as well as the attB2 at the 30 end recombines only with attP2 site. The product of BP recombination generates attL1 and attL2 sites flanking the GOI in the entry vector [53] (Fig. 4). Once the target gene is cloned into the entry vector, the second recombination reaction is performed in order to clone the GOI in several expression vectors. Expression clones are generated by the LR recombination reaction between the generated attL-containing entry clone and attR-containing Gateway-compatible expression vectors by the LR clonase enzyme mix. Again, attL1 anneals only with attR1 and attL2 only with attR2. The new recombination sites created in the expression vector from LR recombination are attB sites (Fig. 4). Both recombination reactions are performed in less than 1 h each at room temperature, and since both reactions are reversible, this allows for a very flexible approach of the insert DNA from one expression vector to another [54]. The commercially available Gateway vectors contain at the insertion site a cassette that is replaced by the GOI after recombination. The conversion cassette includes the ccdB gene [55], and a chloramphenicol resistance (Cmr) gene flanked by attP (entry vectors) or attR sites (expression vectors) [53]. 
The cloning steps in the Gateway system are highly efficient, and a wide variety of vectors is available owing to the length of time this system has been in use. In this regard, Gateway-compatible destination vectors have been broadly developed for protein production in a variety of model systems, including bacteria, fungi, viruses, drosophila, zebrafish, plants, and mammals [54]. In addition, different vectors for N- or C-terminally tagged proteins were also developed, where the N-terminal tag is placed in the expression vector upstream of the attR1 site and the C-terminal tag is located downstream of the attR2 site. Moreover, a vector suite named pCellFree was developed that carries species-independent translation initiation sequences (SITS), allowing recombinant protein expression in any in vitro translation system [56]. With LR recombination, 125 entry clones derived from the Human ORFeome v5.1 collection were correctly cloned into the pCellFree vector suite in a microwell-plate format designed for robot-assisted handling [56].


Fig. 4 Gateway cloning system. A PCR reaction is performed with specific primers to amplify the GOI and add the attB1 and attB2 sites. BP clonase enzyme mix catalyzes the in vitro recombination of PCR products (containing attB sites) with an entry vector (containing attP sites) to generate entry clones by replacing the death cassette that was present in the vector. LR recombinase enzyme mix catalyzes the in vitro recombination between an entry clone (containing a gene of interest flanked by attL sites) and an expression vector (containing attR sites) to generate the expression clone flanked with attB sites. Note that the entry and expression vectors must be propagated in special ccdB-resistant bacteria

Despite being widely used, Gateway cloning has some disadvantages, such as a decrease in efficiency for inserts longer than 3 kb [51]; the approach is also costly and depends on a single supplier for the enzymes and reagents. A further issue is the addition of extra amino acids to the expressed protein due to translation of the attB sites (nine extra amino acids), which can affect protein expression and solubility [8]. This can be avoided if the recombination sites are placed outside the open reading frame, but this requires long primers that include Shine-Dalgarno or Kozak sequences between the attB1 site and the GOI. Alternatively, a protease cleavage site can be introduced at the N-terminus of the target protein when designing the forward primer for cloning the target sequence, but a long primer will still be necessary [8, 53]. At the start of the twenty-first century, Gateway cloning became very popular because it was the only cloning system that could be adapted to HTP programs. Most of the public and private ORFeome clone collections (Human, C. elegans, etc.) were transferred to a Gateway-compatible format [51] so that they could be easily moved between different expression systems and/or combined with various promoters and fusion tags. Today, with the development of faster and cheaper cloning alternatives, the main advantage of using Gateway is the access it gives to these libraries of entry clones.

5 PCR-Based Cloning Methods

The extraordinarily low error rates that can be obtained with modern high-fidelity polymerases make the correct amplification of large DNA fragments, including entire plasmids, possible. Several cloning techniques have taken advantage of this and have emerged as fast, efficient, and cost-effective alternatives to traditional cloning methods for HTP cloning projects. By using these PCR-based cloning methodologies, any DNA sequence amplified by PCR can be integrated into any desired position of any destination vector without the introduction of unwanted sequences, allowing sequence-independent and “scarless” cloning [57]. Such is the case of Restriction-site-free cloning [58], later known as Restriction-Free cloning (RF-cloning) [59, 60].

5.1 Restriction Free (RF) Cloning

RF cloning allows the precise introduction of a DNA fragment into any desired position within a circular plasmid in a sequence-independent manner, with the advantage that no special pretreatment of the destination vector is required. Several molecular manipulations can be performed with RF-cloning in addition to the insertion of a GOI into a desired vector. These include the simultaneous multicomponent assembly and simultaneous cloning of several DNA fragments into distinct positions within a destination vector, the parallel cloning of the same PCR product into different expression vectors, as well as the simultaneous introduction of deletions, insertions, or mutations at various positions [59, 61]. RF cloning is inspired by the overlap extension site-directed mutagenesis method that was first described by Steffan Ho in 1989 [62] and was then commercialized under the brand name QuikChange™ (Stratagene, La Jolla, CA). While for the QuikChange™ method two overlapping primers are designed and used in a linear amplification reaction for the introduction of a particular mutation into a circular plasmid, in RF cloning two chimeric primers (ChiFor and ChiRev) are designed to carry at their 5′ ends sequences complementary to the insertion site in the destination vector (usually 30 nt), followed by target-specific sequences (usually 20–25 nt) at their 3′ ends. The overlapping sequences are designed to have high and similar melting temperatures in order to improve integration accuracy. These primers are used in a first PCR for the exponential amplification of the target gene, generating what is known as the megaprimer, which corresponds to the gene of interest flanked at both ends by sequences overlapping the integration site in the destination vector (Fig. 5a). The megaprimer is purified and used in a second PCR (RF-reaction) in combination with the circular

Fig. 5 RF-cloning. (a) Megaprimer synthesis. A first PCR with two chimeric primers (ChiFor and ChiRev) is performed for the amplification of the target gene to generate a megaprimer that corresponds to the GOI (blue) flanked with homologous sequences to the integration site in the destination vector (indicated in green and red). (b) RF-reaction. A second PCR is performed using the generated megaprimer and the circular destination vector, resulting in a linear amplification of the whole plasmid and integration of the GOI in the selected site. The parental plasmid is removed by DpnI treatment and the generated double-nicked plasmid is repaired upon transformation


destination vector, for the linear amplification of the whole plasmid and integration of the GOI at the desired site [61] (Fig. 5b). Only high-fidelity PCR polymerases with no strand displacement activity should be used for RF-cloning, to avoid the introduction of errors and the formation of long concatemers. The parental destination plasmid is removed by treatment with an enzyme that cuts only methylated/hemimethylated sequences, such as DpnI, and the newly synthesized double-stranded plasmid containing two nicks is used to transform E. coli cells, where the nicked DNA can be sealed by endogenous enzymatic activity [59–61]. It is important to note that the parental plasmid must be purified from a dam+ E. coli strain in order to be methylated and thus a substrate for DpnI. By using RF-cloning we were able to generate a vector suite that allows the parallel cloning of a PCR product into 12 expression vectors for evaluating the effect of different promoters and distinct fusion proteins on recombinant protein expression [13]. Moreover, we recently extended this vector suite to 20 vectors to allow the evaluation of recombinant protein expression in different cell compartments and expression hosts [63]. The cloning of target genes into this vector suite can be easily adapted to a 96-well format, making it very useful for HTP cloning projects. With RF cloning, DNA fragments of different sizes (100–2000 bp) were precisely inserted into a wide range of vectors with good efficiency [13, 59]. Recently, a web server (www.rf-cloning.org) was developed that automates the primer design process, simplifying RF-cloning projects [64]. For this purpose, the user has to supply the sequences of the DNA insert and the destination vector and specify the insertion sites directly in the plasmid. The program then returns the designed chimeric primers and recommended PCR conditions [64]. In addition, RF-cloning was also adapted for a single-tube reaction.
In this format, the amplification of the target DNA from a donor plasmid and the subsequent integration of the generated megaprimer into the destination vector are done simultaneously in a single tube. This variation was termed Transfer-PCR [65, 66], and combines two sequential amplification stages: thirteen cycles with short elongation times for target amplification, followed by twenty cycles with longer elongation times for the incorporation of the generated megaprimer into the destination vector. A critical parameter for the efficiency of Transfer-PCR is the primer concentration, where low primer concentrations (in the range of 10–20 nM) give the best results [65, 66]. All these advantages make RF-cloning one of the simplest and most cost-efficient methods for routine cloning. In this regard, several research groups have independently developed different protocols based on almost the same concept as RF-cloning, including Megaprimer PCR of whole plasmid (MEGAWHOP cloning) [67] and Overlap extension PCR cloning [68].
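The chimeric primer layout used for RF cloning (a 5′ portion matching the vector on one side of the insertion site, followed by a 3′ target-specific portion) can be illustrated with a short Python sketch. It assumes a pure insertion at a single position of a vector given as a linear string (no wrap-around of the circular sequence, no replacement of vector sequence), uses fixed lengths instead of melting-temperature matching, and is not the algorithm of the rf-cloning.org server. The function and parameter names are hypothetical.

```python
# Illustrative sketch of RF-cloning chimeric primer design (ChiFor / ChiRev).
# ASSUMPTIONS: pure insertion at index `site` of a vector written as a linear
# string; fixed overlap lengths; no melting-temperature optimisation.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def rf_chimeric_primers(vector: str, insert: str, site: int,
                        vector_overlap: int = 30, target_len: int = 22):
    """Return (chi_for, chi_rev, megaprimer), all written 5'->3'.

    chi_for: vector sequence upstream of the site + start of the insert.
    chi_rev: reverse complement of (end of the insert + vector sequence
             downstream of the site).
    megaprimer: expected product of the first PCR (top strand).
    """
    vector, insert = vector.upper(), insert.upper()
    if site - vector_overlap < 0 or site + vector_overlap > len(vector):
        raise ValueError("insertion site too close to the ends of the vector string")
    if len(insert) < target_len:
        raise ValueError("insert is shorter than the target-specific portion")
    upstream = vector[site - vector_overlap:site]
    downstream = vector[site:site + vector_overlap]
    chi_for = upstream + insert[:target_len]
    chi_rev = reverse_complement(insert[-target_len:] + downstream)
    megaprimer = upstream + insert + downstream
    return chi_for, chi_rev, megaprimer


if __name__ == "__main__":
    demo_vector = "ACGT" * 30                  # made-up 120-nt "vector"
    demo_insert = "ATG" + "GCA" * 20 + "TAA"   # made-up 66-nt ORF
    chi_for, chi_rev, mega = rf_chimeric_primers(demo_vector, demo_insert, site=60)
    print("ChiFor:", chi_for)
    print("ChiRev:", chi_rev)
    print("megaprimer length:", len(mega))
```

The same megaprimer definition is what the later RF-reaction anneals to the destination vector on both sides of the insertion site.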


In order to improve efficiency for difficult-to-clone targets, some parameters of the RF-reaction can be evaluated. These include adding DMSO (5–10% v/v), purifying the megaprimer directly from the PCR instead of from an agarose gel, and changing the concentration ratio between destination vector and megaprimer as well as the annealing temperature [61]. The concentration and size of the megaprimer can also affect cloning efficiency: it was proposed that for long inserts reducing the megaprimer concentration has a positive effect, and vice versa [57]. This is probably because larger megaprimers tend to anneal to themselves instead of to the destination vector. As mentioned above, a double-stranded plasmid with two single-strand nicks is generated after the RF reaction, and the nicks are repaired in vivo after transformation. Given that in vivo ligation of doubly nicked plasmids can be inefficient, an alternative to increase the final efficiency of RF-cloning is to perform in vitro ligation before DpnI treatment. For in vitro ligation, the product from the RF reaction is purified and treated with T4 polynucleotide kinase (PNK) to add a 5′ phosphate, and the nicks are sealed with T4 ligase. The product is then used for DpnI digestion and E. coli transformation [69]. Although RF-cloning is a fast, cheap, and easy-to-implement methodology, it was observed that for large DNA fragments (>2.5 kb) cloning efficiency is reduced drastically, making the correct integration of several targets very difficult [57, 68–70]. RF-cloning is based on a linear amplification of the whole plasmid containing the target sequence, which can lead to low product yields. In order to overcome these limitations, some modifications to the original protocol were proposed that are based on an exponential rather than a linear amplification of the whole plasmid. By this means, an increase in product yields and cloning efficiency was achieved [69–72].
In several of these methods, the exponential amplification generates a linear product (corresponding to the recombinant vector) that is then circularized in vitro by phosphorylation/ligation steps [69, 73] or, if complementary sequences are present at both ends, by in vivo homologous recombination [70, 71]. These methods include Exponential Megapriming PCR cloning (EMP cloning) [69] and Recombination-Assisted Megaprimer cloning (RAM-cloning) [70]. A possible drawback of exponentially amplifying the whole plasmid containing the insert is an increased chance of introducing unwanted mutations.

5.2 Exponential Megapriming PCR (EMP) Cloning

For EMP cloning, two consecutive PCR steps are performed that exponentially amplify the templates [69]. In the first PCR, two primers are designed, where the forward primer (For1) contains only sequences for insert amplification while the reverse primer


Fig. 6 EMP-cloning. (a) A first PCR is performed with a target-specific forward primer (For1) and a chimeric reverse primer (ChiRev) to generate a PCR product that contains a complementary sequence to only one vector-integration site at the 3′ end (indicated in red). (b) This product is used in a second PCR along with the destination vector and two additional primers, the For1 primer used before and a second reverse primer (Rev2), complementary to the 5′ region of the insertion site in the vector (indicated in green), leading to an exponential amplification of the vector with the integration of the GOI. The linear PCR product is purified and circularized in vitro and used for transformation. Parental plasmid is removed by DpnI treatment

(ChiRev) is designed in the same manner as in RF-cloning. This generates a PCR product (megaprimer) that contains only one vector-integration site, at the 3′ end (Fig. 6a). For the integration of the target sequence, a second PCR is performed containing the generated megaprimer, the destination vector, a short reverse primer (20–25 nt) complementary to the 5′ region of the insertion site (Rev2), and the For1 primer used before. This results in the exponential amplification of the plasmid containing the target sequence. In the first cycles the sense strand of the megaprimer and the Rev2 primer generate a linear DNA fragment


corresponding to the destination vector containing the target sequence in the selected site. Once the megaprimer is depleted, primers For1 and Rev2 continue with the exponential amplification of the linear product (EMP product). The product is then purified and phosphorylated with PNK, circularized by ligation with T4 ligase, and the parental plasmid is removed by DpnI treatment [69] (Fig. 6b). By using this approach, an increase in cloning efficiency compared with RF-cloning was obtained, especially for larger inserts (>2.5 kb), overcoming fragment size limitations. In addition, it was demonstrated that both PCRs (megaprimer generation and fragment integration) can be combined in a single tube (one-step EMP cloning), and also that EMP cloning can be used for the simultaneous insertion of several fragments into the destination vector [69]. Recently, a gene replacement strategy was employed for the cloning of target genes by RF and EMP cloning methods. In this work, the vector insertion site contains the cytotoxin-encoding ccdB gene, which is replaced with the target gene during the cloning reaction [74]. With this approach, the product generated during the second PCR can be directly transformed into E. coli cells, eliminating the requirement for DpnI treatment and simplifying the clone screening process of these methods, especially when working in HTP cloning projects [74].

5.3 Recombination-Assisted Megaprimer (RAM) Cloning

Although EMP-cloning showed higher efficiencies, the need for product purification and for sequential phosphorylation and ligation steps for plasmid circularization can increase time and cost compared to RF-cloning. In this regard, other methods were developed that are based on in vivo homologous recombination for end-joining and plasmid circularization. Such is the case for RAM-cloning [70]. This technique uses the same chimeric primers as RF-cloning, so the same megaprimer used in RF-cloning can be used for RAM-cloning. The main difference from RF-cloning is in the second PCR. For the exponential amplification of the whole plasmid, an additional reverse primer (CR) is included, together with the chimeric forward primer (ChiFor) that was used for megaprimer synthesis. The CR primer contains a 5′ overlap of 20 nt with the 5′ end of the ChiFor primer (corresponding to the region directly upstream of the desired integration site) and another 20 nt corresponding to the sequence upstream of this 20 bp overlapping region, also in the destination vector. For the second PCR a limited number of cycles is performed (10–15 cycles) to reduce error propagation, and 1 M betaine can be included to increase yield and specificity. The parental plasmid is then removed by DpnI treatment, and the linear plasmid containing the insert and homologous sequences at both ends is used to transform E. coli cells, where it is circularized by in vivo homologous recombination [70] (Fig. 7).
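As a rough illustration of the CR primer described above, the following Python sketch takes the 40 nt of vector sequence directly upstream of the integration site (20 nt overlapping region plus the 20 nt upstream of it) and returns their reverse complement. This reading of the CR design follows the description in the text and in Fig. 7; the function name, the linear-string representation of the vector, and the fixed lengths are assumptions for illustration only.

```python
# Illustrative sketch of the additional reverse primer (CR) used in RAM-cloning.
# ASSUMPTION: the vector is given as a linear string and `site` is the index of
# the integration site, as in a simple RF-cloning primer design.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def ram_cr_primer(vector: str, site: int,
                  overlap: int = 20, extension: int = 20) -> str:
    """Return the CR primer (5'->3').

    The 5' `overlap` nt are complementary to the vector region directly
    upstream of the integration site (shared with ChiFor); the following
    `extension` nt are complementary to the vector sequence further upstream.
    """
    start = site - overlap - extension
    if start < 0:
        raise ValueError("not enough vector sequence upstream of the site")
    return reverse_complement(vector[start:site])


if __name__ == "__main__":
    demo_vector = "ACGT" * 30   # made-up 120-nt "vector"
    print("CR:", ram_cr_primer(demo_vector, site=60))
```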


Fig. 7 RAM-cloning. A megaprimer is generated in the same manner as for RF-cloning with two chimeric primers (ChiFor and ChiRev). A second PCR is performed using the megaprimer, the destination vector, the same ChiFor primer, and an additional reverse primer (CR), corresponding to a region of 40 nt directly upstream of the integration site in the vector (indicated in green and yellow). An exponential amplification of the vector with the integrated target sequence occurs and, after removing the parental plasmid with DpnI and transformation, the linear PCR product is circularized in vivo by homologous recombination

Given that the same primers are used for megaprimer generation in RAM-cloning and RF-cloning, this methodology is very easy to implement as a rescue strategy in cases where RF-cloning fails, requiring only the design and synthesis of one extra primer (the CR primer). A method similar to RAM-cloning, also based on an exponential amplification of the recombinant plasmid coupled with in vivo homologous recombination for vector circularization, is ABI-REC (derived from the combination of Asymmetric Bridge PCR with Intramolecular homologous Recombination)


[71]. ABI-REC combines in the same tube the three primers with the DNA template and the destination vector, so the megaprimer synthesis and its integration into the destination vector can be performed in a single PCR reaction [71].

5.4 QuickStep-Cloning

An additional technique that can overcome some of the limitations of RF-cloning is QuickStep-cloning [72, 75]. This methodology does not rely on phosphorylation/ligation steps or on in vivo homologous recombination for vector circularization, as described before, yet it still achieves an exponential amplification of the whole plasmid. QuickStep is based on the amplification of the target sequence by two parallel asymmetric PCRs that generate predominantly single-stranded DNA containing 3′ overhangs complementary to the integration site in the destination vector. In order to perform the asymmetric PCRs, four different primers are designed: two containing target-specific sequences only (For1 and Rev1), and two chimeric primers containing sequences complementary to the integration site at the 5′ end (25–30 nt) followed by target-specific sequences at the 3′ end (ChiFor and ChiRev) (Fig. 8a). For the generation of sense and antisense strands containing integration sequences at the 3′ ends, an unbalanced concentration of primers is used. Asymmetric PCR1-A is performed with 500 nM For1 and 10 nM ChiRev primers, resulting in the production of an excess of the sense strand, and PCR1-B with 10 nM ChiFor and 500 nM Rev1 primers, resulting in the generation of an excess of the antisense strand (Fig. 8a). The products of these PCRs are purified and mixed together, thus generating a megaprimer flanked with 3′ overhangs that is used in a PCR reaction along with the destination vector for the exponential amplification of the whole plasmid containing the target sequence. A double-nicked circular plasmid is obtained which, after treatment with DpnI to remove the parental plasmid, is used for transformation (Fig. 8b). In addition, QuickStep cloning circumvents another issue of RF-cloning, namely the complete self-annealing of megaprimers.
Because the megaprimer in this technique contains 3′ overhangs, it can anneal to the vector integration site even if the two megaprimer strands self-anneal. By using this approach, higher cloning efficiencies were obtained compared to RF-cloning [72]. Furthermore, the yield of the second PCR can be increased even further by keeping the two megaprimers separated in parallel reactions for the first five PCR cycles and mixing them together for the remaining twenty cycles of the PCR program [75]. As for RF-cloning, efficiency can be increased by performing phosphorylation/ligation steps to seal the plasmid nicks in vitro.
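The unbalanced primer concentrations of the two asymmetric PCRs are easy to get wrong at the bench, so a small calculation may help. The sketch below converts the final concentrations quoted above (500 nM excess primer, 10 nM limiting primer) into pipetting volumes using C1·V1 = C2·V2. The reaction volume (50 µL) and primer stock concentration (10 µM) are assumptions chosen for illustration and are not taken from the cited protocols; the names are hypothetical.

```python
# Primer volumes for the two asymmetric QuickStep PCRs (C1*V1 = C2*V2).
# Final concentrations (nM) are those quoted in the text; the reaction volume
# and stock concentration defaults below are ASSUMPTIONS for illustration.
QUICKSTEP_PRIMERS_NM = {
    "PCR1-A": {"For1": 500.0, "ChiRev": 10.0},   # excess of sense strand
    "PCR1-B": {"ChiFor": 10.0, "Rev1": 500.0},   # excess of antisense strand
}


def primer_volumes_ul(reaction: str,
                      reaction_volume_ul: float = 50.0,
                      stock_um: float = 10.0) -> dict[str, float]:
    """Return the volume (in microlitres) of each primer stock to add."""
    volumes = {}
    for primer, final_nm in QUICKSTEP_PRIMERS_NM[reaction].items():
        # stock_um * 1000 converts the stock from micromolar to nanomolar.
        volumes[primer] = final_nm * reaction_volume_ul / (stock_um * 1000.0)
    return volumes


if __name__ == "__main__":
    for name in QUICKSTEP_PRIMERS_NM:
        print(name, primer_volumes_ul(name))
```

With these assumed values the limiting primer works out to a volume too small to pipette reliably, which suggests preparing a more dilute working stock for it (for example by passing a lower `stock_um`).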


Fig. 8 QuickStep cloning. (a) Two parallel asymmetric PCRs are performed for the generation of an excess of single-stranded DNA (PCR-1A and PCR-1B). (b) After purification, both PCR products are mixed together to generate a megaprimer that contains 3′ single-strand overhangs at both ends that are complementary to the integration site in the vector (indicated in green and red). This megaprimer is used in a PCR reaction along with the destination vector for the exponential amplification of the whole plasmid containing the target sequence. Parental plasmid is removed with DpnI and the generated double-nicked circular vector is used for transformation and repaired in vivo

5.5 Circular Polymerase Extension Cloning (CPEC)

Finally, another popular PCR-based cloning method similar to RF-cloning is CPEC [36, 76, 77]. In this method, overlapping regions of 25–27 nt between the target gene and the destination vector are extended by PCR to form a double-nicked circular plasmid. The principle is similar to that of RF-cloning with the main difference that the destination vector is linearized by PCR prior to the integration reaction, so the insert and vector share overlapping sequences at both ends with the additional advantage


that DpnI treatment is no longer required. A remarkable feature of CPEC is that it is possible to clone target genes with only one PCR cycle, thus the final cloning reaction can be completed in only 5 min [36]. Several molecular biology procedures can be done with CPEC including gene library cloning and multi-fragment cloning although the cycle number must be increased in such cases [36, 77]. Its successful use in combinatorial library cloning makes CPEC an attractive method to be applied in synthetic biology applications and can accelerate the process of protein expression screening by using large quantities of library gene variants [77].

6 Discussion

The enormous progress achieved in molecular cloning has made it possible to overcome many obstacles commonly found when working with traditional restriction-based methods. However, it is still difficult to single out one method as superior to all other cloning techniques. The selection of the cloning strategy will depend on several factors, including the number of target genes to be expressed or of GOI variants to be evaluated (homologs, truncations, etc.), the use of multiple vectors containing different tags or promoters, and the use of distinct expression hosts. In addition, the laboratory equipment as well as the available budget may influence the selection of a particular cloning strategy [78]. Restriction-based methods are not very suitable for cloning several genes into different vectors, as is typical in HTP projects, although some strategies were developed that circumvent some of the common limitations. If a vector suite is available where the targets can always be cloned in the same insertion site among the different vectors, Golden Gate cloning can be appropriate. However, internal restriction sites in the GOI or in the selected vector will have to be eliminated by mutagenesis, increasing the complexity and/or cost of the cloning process. In a different scenario, Gateway and LIC cloning showed good efficiencies and have been used with success in several HTP cloning projects [8]. However, because a defined sequence is required at the insertion site, unwanted sequences can be introduced into the GOI, and the use of these methods in other applications, such as insertions or deletions in the cloned GOI, may be limited. In this regard, a sequence-independent method is an attractive option: it allows cloning at any site in the destination vector and makes it easy to avoid introducing unwanted sequences into the expressed protein.
Depending on the laboratory budget, the use of commercial kits like In-Fusion or Gibson assembly for routine laboratory cloning may be difficult to afford, making the implementation of SLIC or PCR-based methods a cost-effective approach for cloning targets of different lengths in a sequence-independent manner. SLIC and CPEC showed great efficiencies at low cost, but require linearization of the destination vector, which calls for additional primers and increases complexity. RF cloning uses circular destination vectors, so only two primers are required, making the method very simple. However, it was seen that the efficiency decreases significantly with inserts longer than 2.5 kb. In this regard, several PCR-based methods were developed that overcome this limitation by performing an exponential rather than a linear amplification of the vector containing the GOI. By applying these approaches, increased efficiency compared to RF-cloning was observed and large DNA fragments were successfully cloned [69, 70, 72]. Nevertheless, the exponential amplification of the vector can increase the chances of introducing unwanted mutations into the plasmid backbone that could affect expression levels even though the GOI sequence is correct. Taking all this into account, it may be convenient to start with an RF cloning strategy. If it does not work for a particular target, changing RF parameters (such as megaprimer concentration and annealing temperature) and/or improving transformation efficiency by performing in vitro ligation (with PNK and T4 ligase) could be worthwhile. If the results are not satisfactory, RAM cloning can be easily adopted, since the same megaprimer used for RF can be used and only one additional primer is required. EMP or QuickStep cloning are also attractive options and require two additional primers. So, as described above, different techniques with different properties are available that can work as rescue strategies whichever method is initially selected (Table 1).
As shown in this chapter, several molecular methods based on different principles have been developed over the last two decades. They have reached high cloning efficiencies, making the HTP cloning of many DNA targets possible and meeting the increasing demands of protein expression screening and purification.
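The escalation path suggested in this discussion (start with RF cloning, then adjust RF parameters and/or add in vitro ligation, then move to RAM cloning, and finally to EMP or QuickStep cloning) can be written as a simple ordered list. The sketch below is only a bookkeeping aid, for example for tracking targets in a 96-well project; the names `RESCUE_LADDER` and `next_strategy` are hypothetical, and the ordering reflects the recommendation in the text rather than any measured success rate.

```python
# Ordered rescue strategies as recommended in the discussion above.
# Each entry: (strategy, number of primers to order in addition to the two
# chimeric primers already designed for RF cloning).
RESCUE_LADDER = [
    ("RF cloning", 0),
    ("RF cloning with adjusted parameters and/or in vitro ligation (PNK, T4 ligase)", 0),
    ("RAM cloning", 1),
    ("EMP or QuickStep cloning", 2),
]


def next_strategy(failed_attempts: int) -> tuple[str, int]:
    """Return the strategy to try after the given number of failed attempts.

    Once the ladder is exhausted, the last entry is returned again.
    """
    if failed_attempts < 0:
        raise ValueError("failed_attempts must be zero or positive")
    index = min(failed_attempts, len(RESCUE_LADDER) - 1)
    return RESCUE_LADDER[index]


if __name__ == "__main__":
    for attempt in range(5):
        strategy, extra_primers = next_strategy(attempt)
        print(f"after {attempt} failure(s): {strategy} (+{extra_primers} primer(s))")
```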

Acknowledgements

The authors want to thank Maria Clara Miranda for helping in the drawing of the figures. This work was supported by grants from Agencia Nacional de Investigación e Innovación (FMV_1_2014_1_104397 and FCE_3_2016_1_125765).

Table 1 HTP cloning methods and their principal characteristics

| Cloning method | Type of enzymes required (a) | Reported cloning efficiency (reference) (b) | Cloning at any desired location within the vector | Cost | Vector pretreatment | Number of primers required | Addition of extra amino acids to the expressed protein |
|---|---|---|---|---|---|---|---|
| Flexi | SgfI/PmeI restriction enzymes, T4 DNA ligase | 95–98% [16] | No | Cheap | Linearization | 2 or 4 (vector linearization) | No |
| Golden gate | Type II endonuclease, T4 DNA Ligase | 100% [20] | Yes | Cheap | Linearization | 2 | Vector dependent |
| LIC | T4 DNA polymerase | 85% [8] | No | Cheap | Linearization | 2 or 4 (vector linearization) | No |
| SLIC | T4 DNA polymerase | >95% adding RecA [38] | Yes | Cheap | Linearization | 2 or 4 (vector linearization) | No, depends on the expression vector |
| In-Fusion | In-Fusion HD cloning enzyme mix (Clontech) | 94% [40] | Yes | High (kit requirement) | Linearization | 4 (target amplification and vector linearization) | No |
| Gibson assembly | T5 Exonuclease, Phusion DNA polymerase, Taq ligase or Gibson assembly master mix (NEB) | 98% [47] | Yes | High (kit or 3 different enzymes are required) | Linearization | 4 (target amplification and vector linearization) | No |
| Gateway | BP and LR Clonase enzymes mix (Invitrogen) | 97% [16] | No | High (kit requirement) | No | 2 | Yes, but can be avoided with special designed primers |
| RF cloning | High-fidelity DNA polymerase | 50–100% [59] | Yes | Cheap | No | 2 | No |
| EMP cloning | High-fidelity DNA polymerase | 10–100% [68] | Yes | Cheap | No | 3 | No |
| RAM cloning | High-fidelity DNA polymerase | 75–94% [69] | Yes | Cheap | No | 3 | No |
| QuickStep cloning | High-fidelity DNA polymerase | 93–97% [71] | Yes | Cheap | No | 4 | No |
| CPEC | High-fidelity DNA polymerase | >95% [36] | Yes | Cheap | Linearization | 4 (target amplification and vector linearization) | No |

(a) A DNA polymerase will also be required for target amplification or vector linearization
(b) The efficiency can change depending on target sizes, GC content, as well as on the evaluation methods employed

References

1. Zimmerman SB, Little JW, Oshinsky CK, Gellert M (1967) Enzymatic joining of DNA strands: a novel reaction of diphosphopyridine nucleotide. Proc Natl Acad Sci U S A 57(6):1841–1848
2. Smith HO, Wilcox KW (1970) A restriction enzyme from Hemophilus influenzae. I Purification and general properties. J Mol Biol 51(2):379–391
3. Yoshimori R, Roulland-Dussoix D, Boyer HW (1972) R factor-controlled restriction and modification of deoxyribonucleic acid: restriction mutants. J Bacteriol 112(3):1275–1279
4. Jackson DA, Symons RH, Berg P (1972) Biochemical method for inserting new genetic information into DNA of Simian Virus 40: circular SV40 DNA molecules containing lambda phage genes and the galactose operon of Escherichia coli. Proc Natl Acad Sci U S A 69(10):2904–2909
5. Morrow JF, Cohen SN, Chang AC, Boyer HW, Goodman HM, Helling RB (1974) Replication and transcription of eukaryotic DNA in Escherichia coli. Proc Natl Acad Sci U S A 71(5):1743–1747
6. Itakura K, Hirose T, Crea R, Riggs AD, Heyneker HL, Bolivar F et al (1977) Expression in Escherichia coli of a chemically synthesized gene for the hormone somatostatin. Science 198(4321):1056–1063
7. Crea R, Kraszewski A, Hirose T, Itakura K (1978) Chemical synthesis of genes for human insulin. Proc Natl Acad Sci U S A 75(12):5765–5769
8. Alzari PM, Berglund H, Berrow NS, Blagova E, Busso D, Cambillau C et al (2006) Implementation of semi-automated cloning and prokaryotic expression screening: the impact of SPINE. Acta Crystallogr D Biol Crystallogr 62(Pt 10):1103–1113
9. Kim Y, Bigelow L, Borovilos M, Dementieva I, Duggan E, Eschenfeldt W et al (2008) Chapter 3. High-throughput protein purification for x-ray crystallography and NMR. Adv Protein Chem Struct Biol 75:85–105
10. Lee D, de Beer TA, Laskowski RA, Thornton JM, Orengo CA (2011) 1,000 structures and more from the MCSG. BMC Struct Biol 11:2
11. Vincentelli R, Cimino A, Geerlof A, Kubo A, Satou Y, Cambillau C (2011) High-throughput protein expression screening and purification in Escherichia coli. Methods 55(1):65–72
12. Vincentelli R, Romier C (2013) Expression in Escherichia coli: becoming faster and more complex. Curr Opin Struct Biol 23(3):326–334
13. Correa A, Ortega C, Obal G, Alzari P, Vincentelli R, Oppezzo P (2014) Generation of a vector suite for protein solubility screening. Front Microbiol 5:67
14. Correa A, Oppezzo P (2015) Overcoming the solubility problem in E. coli: available approaches for recombinant protein production. Methods Mol Biol 1258:27–44
15. Slater M, Strauss E (2006) The Flexi® vector systems: the easy way to clone. Promega Notes 93:8–10
16. Blommel PG, Martin PA, Wrobel RL, Steffen E, Fox BG (2006) High efficiency single step production of expression plasmids from cDNA clones using the Flexi Vector cloning system. Protein Expr Purif 47(2):562–570
17. Nagase T, Yamakawa H, Tadokoro S, Nakajima D, Inoue S, Yamaguchi K et al (2008) Exploration of human ORFeome: high-throughput preparation of ORF clones and efficient characterization of their protein products. DNA Res 15(3):137–149
18. Yamakawa H (2009) High-throughput construction of ORF clones for production of the recombinant proteins. Methods Mol Biol 577:25–39
19. Engler C, Marillonnet S (2014) Golden Gate cloning. Methods Mol Biol 1116:119–131
20. Engler C, Kandzia R, Marillonnet S (2008) A one pot, one step, precision cloning method with high throughput capability. PLoS One 3(11):e3647
21. Engler C, Gruetzner R, Kandzia R, Marillonnet S (2009) Golden gate shuffling: a one-pot DNA shuffling method based on type IIs restriction enzymes. PLoS One 4(5):e5553
22. Engler C, Marillonnet S (2011) Generation of families of construct variants using golden gate shuffling. Methods Mol Biol 729:167–181
23. Scior A, Preissler S, Koch M, Deuerling E (2011) Directed PCR-free engineering of highly repetitive DNA sequences. BMC Biotechnol 11:87
24. Yan P, Gao X, Shen W, Zhou P, Duan J (2012) Parallel assembly for multiple site-directed mutagenesis of plasmids. Anal Biochem 430(1):65–67
25. Schreiber C, Muller H, Birrenbach O, Klein M, Heerd D, Weidner T et al (2017) A high-throughput expression screening platform to optimize the production of antimicrobial peptides. Microb Cell Factories 16(1):29
26. Aslanidis C, de Jong PJ (1990) Ligation-independent cloning of PCR products (LIC-PCR). Nucleic Acids Res 18(20):6069–6074
27. Shuldiner AR, Scott LA, Roth J (1990) PCR-induced (ligase-free) subcloning: a rapid reliable method to subclone polymerase chain reaction (PCR) products. Nucleic Acids Res 18(7):1920
28. Yang YS, Watson WJ, Tucker PW, Capra JD (1993) Construction of recombinant DNA by exonuclease recession. Nucleic Acids Res 21(8):1889–1893
29. Jia B, Jeon CO (2016) High-throughput recombinant protein expression in Escherichia coli: current status and future perspectives. Open Biol 6(8):160196
30. Schmid-Burgk JL, Schmidt T, Kaiser V, Honing K, Hornung V (2013) A ligation-independent cloning technique for high-throughput assembly of transcription activator-like effector genes. Nat Biotechnol 31(1):76–81
31. Camilo CM, Polikarpov I (2014) High-throughput cloning, expression and purification of glycoside hydrolases using Ligation-Independent Cloning (LIC). Protein Expr Purif 99:35–42
32. Yuan H, Peng L, Han Z, Xie JJ, Liu XP (2015) Recombinant expression library of Pyrococcus furiosus constructed by high-throughput cloning: a useful tool for functional and structural genomics. Front Microbiol 6:943
33. Li MZ, Elledge SJ (2007) Harnessing homologous recombination in vitro to generate recombinant DNA via SLIC. Nat Methods 4(3):251–256
34. Zhu B, Cai G, Hall EO, Freeman GJ (2007) In-fusion assembly: seamless engineering of multidomain fusion proteins, modular vectors, and mutations. BioTechniques 43(3):354–359
35. Gibson DG, Young L, Chuang RY, Venter JC, Hutchison CA 3rd, Smith HO (2009) Enzymatic assembly of DNA molecules up to several hundred kilobases. Nat Methods 6(5):343–345
36. Quan J, Tian J (2009) Circular polymerase extension cloning of complex gene libraries and pathways. PLoS One 4(7):e6441
37.
Jeong JY, Yim HS, Ryu JY, Lee HS, Lee JH, Seen DS et al (2012) One-step sequence- and ligation-independent cloning as a rapid and versatile cloning method for functional genomics studies. Appl Environ Microbiol 78 (15):5440–5443 38. Scholz J, Besir H, Strasser C, Suppmann S (2013) A new method to customize protein

31

expression vectors for fast, efficient and background free parallel cloning. BMC Biotechnol 13:12 39. Park J, Throop AL, LaBaer J (2015) Sitespecific recombinational cloning using gateway and in-fusion cloning schemes. Curr Protoc Mol Biol 110:3.20.1–3.20.23 40. Berrow NS, Alderton D, Sainsbury S, Nettleship J, Assenberg R, Rahman N et al (2007) A versatile ligation-independent cloning method suitable for high-throughput expression screening applications. Nucleic Acids Res 35(6):e45 41. Bird LE, Rada H, Flanagan J, Diprose JM, Gilbert RJ, Owens RJ (2014) Application of In-Fusion cloning for the parallel construction of E. coli expression vectors. Methods Mol Biol 1116:209–234 42. Irwin CR, Farmer A, Willer DO, Evans DH (2012) In-fusion(R) cloning with vaccinia virus DNA polymerase. Methods Mol Biol 890:23–35 43. Spidel JL, Vaessen B, Chan YY, Grasso L, Kline JB (2016) Rapid high-throughput cloning and stable expression of antibodies in HEK293 cells. J Immunol Methods 439:50–58 44. Sleight SC, Bartley BA, Lieviant JA, Sauro HM (2010) In-Fusion BioBrick assembly and re-engineering. Nucleic Acids Res 38 (8):2624–2636 45. Gibson DG (2011) Enzymatic assembly of overlapping DNA fragments. Methods Enzymol 498:349–361 46. Gibson DG, Smith HO, Hutchison CA 3rd, Venter JC, Merryman C (2010) Chemical synthesis of the mouse mitochondrial genome. Nat Methods 7(11):901–903 47. Fu C, Donovan WP, Shikapwashya-Hasser O, Ye X, Cole RH (2014) Hot Fusion: an efficient method to clone multiple DNA fragments as well as inverted repeats without ligase. PLoS One 9(12):e115318 48. Alvarez C (2015) DNA assembly and cloning in an overnight run with the BioXp™ 3200 system. Nat Methods 12:12 49. Marsischky G, LaBaer J (2004) Many paths to many clones: a comparative look at highthroughput cloning methods. Genome Res 14(10B):2020–2028 50. 
Skalamera D, Ranall MV, Wilson BM, Leo P, Purdon AS, Hyde C et al (2011) A highthroughput platform for lentiviral overexpression screening of the human ORFeome. PLoS One 6(5):e20057 51. Walhout AJ, Temple GF, Brasch MA, Hartley JL, Lorson MA, van den Heuvel S et al (2000) GATEWAY recombinational cloning:

32

Claudia Ortega et al.

application to the cloning of large numbers of open reading frames or ORFeomes. Methods Enzymol 328:575–592 52. Lederberg EM (1981) Plasmid reference center registry of transposon (Tn) allocations through July 1981. Gene 16(1–3):59–61 53. Esposito D, Garvey LA, Chakiath CS (2009) Gateway cloning for protein expression. Methods Mol Biol 498:31–54 54. Festa F, Steel J, Bian X, Labaer J (2013) Highthroughput cloning and expression library creation for functional proteomics. Proteomics 13 (9):1381–1399 55. Bernard P, Gabant P, Bahassi EM, Couturier M (1994) Positive-selection vectors using the F plasmid ccdB killer gene. Gene 148(1):71–74 56. Gagoski D, Mureev S, Giles N, Johnston W, Dahmer-Heath M, Skalamera D et al (2014) Gateway-compatible vectors for highthroughput protein expression in pro- and eukaryotic cell-free systems. J Biotechnol 195:1–7 57. Stevenson J, Krycer JR, Phan L, Brown AJ (2013) A practical comparison of ligationindependent cloning techniques. PLoS One 8 (12):e83888 58. Chen GJ, Qiu N, Karrer C, Caspers P, Page MG (2000) Restriction site-free insertion of PCR products directionally into vectors. BioTechniques 28(3):498–500. 504–495 59. Unger T, Jacobovitch Y, Dantes A, Bernheim R, Peleg Y (2010) Applications of the Restriction Free (RF) cloning procedure for molecular manipulations and protein expression. J Struct Biol 172(1):34–44 60. van den Ent F, Lowe J (2006) RF cloning: a restriction-free method for inserting target genes into plasmids. J Biochem Biophys Methods 67(1):67–74 61. Peleg Y, Unger T (2014) Application of the Restriction-Free (RF) cloning for multicomponents assembly. Methods Mol Biol 1116:73–87 62. Ho SN, Hunt HD, Horton RM, Pullen JK, Pease LR (1989) Site-directed mutagenesis by overlap extension using the polymerase chain reaction. Gene 77(1):51–59 63. Ortega C, Prieto D, Abreu C, Oppezzo P, Correa A (2018) Multi-compartment and multihost vector suite for recombinant protein expression and purification. 
Front Microbiol 9:1384 64. Bond SR, Naus CC (2012) RF-Cloning.org: an online tool for the design of restriction-free

cloning projects. Nucleic Acids Res 40(Web Server issue):W209–W213 65. Erijman A, Dantes A, Bernheim R, Shifman JM, Peleg Y (2011) Transfer-PCR (TPCR): a highway for DNA cloning and protein engineering. J Struct Biol 175(2):171–177 66. Erijman A, Shifman JM, Peleg Y (2014) A single-tube assembly of DNA using the transfer-PCR (TPCR) platform. Methods Mol Biol 1116:89–101 67. Miyazaki K (2011) MEGAWHOP cloning: a method of creating random mutagenesis libraries via megaprimer PCR of whole plasmids. Methods Enzymol 498:399–406 68. Bryksin AV, Matsumura I (2010) Overlap extension PCR cloning: a simple and reliable way to create recombinant plasmids. BioTechniques 48(6):463–465 69. Ulrich A, Andersen KR, Schwartz TU (2012) Exponential megapriming PCR (EMP) cloning--seamless DNA insertion into any target plasmid without sequence constraints. PLoS One 7(12):e53360 70. Mathieu J, Alvarez E, Alvarez PJ (2014) Recombination-assisted megaprimer (RAM) cloning. MethodsX 1:23–29 71. Bi Y, Qiao X, Hua Z, Zhang L, Liu X, Li L et al (2012) An asymmetric PCR-based, reliable and rapid single-tube native DNA engineering strategy. BMC Biotechnol 12:39 72. Jajesniak P, Wong TS (2015) QuickStepCloning: a sequence-independent, ligationfree method for rapid construction of recombinant plasmids. J Biol Eng 9:15 73. Spiliotis M (2012) Inverse fusion PCR cloning. PLoS One 7(4):e35407 74. Lund BA, Leiros HK, Bjerga GE (2014) A high-throughput, restriction-free cloning and screening strategy based on ccdB-gene replacement. Microb Cell Factories 13(1):38 75. Jajesniak P, Wong TS (2017) Rapid construction of recombinant plasmids by QuickStepCloning. Methods Mol Biol 1472:205–214 76. Quan J, Tian J (2011) Circular polymerase extension cloning for high-throughput cloning of complex and combinatorial DNA libraries. Nat Protoc 6(2):242–251 77. Quan J, Tian J (2014) Circular polymerase extension cloning. Methods Mol Biol 1116:103–117 78. 
Celie PH, Parret AH, Perrakis A (2016) Recombinant cloning strategies for protein expression. Curr Opin Struct Biol 38:145–154

Chapter 2

Overview of a High-Throughput Pipeline for Streamlining the Production of Recombinant Proteins

Joanne E. Nettleship, Heather Rada, and Raymond J. Owens

Abstract

Production of high quality protein is an essential step for both structural and functional studies. Throughput has increased in the past decade by the use of streamlined workflows with standard operating procedures and automation. In this chapter, we describe the Oxford Protein Production Facility (OPPF) pipeline for protein production, from conception, through vector construction, to expression and purification. Results from projects run in the OPPF demonstrate the value of using parallel expression screening of intracellular proteins in both E. coli and insect cells. Transient expression in Human Embryonic Kidney (HEK) cells is used exclusively for production of secreted glycoproteins. Protein purification and quality assessment are independent of the expression system and enable sample preparation to be simplified and streamlined.

Key words Recombinant protein production, High throughput, E. coli, Insect cells, HEK cells

1 Introduction

The production of high quality recombinant proteins is a critical step in many fields of protein science but especially structural biology where sample quality is crucial for downstream applications including crystallization, cryo-electron microscopy, and nuclear magnetic resonance measurements. Over the past decade, the production of proteins for structural studies has been streamlined by the introduction of standard operating procedures and the use of laboratory automation to enhance throughput. The different stages in design, expression, and purification of recombinant proteins at laboratory scale have become integrated into a single workflow. The pipeline developed by the Oxford Protein Production Facility (OPPF) (Fig. 1) is an example of this approach and comprises three stages: (1) construct design, (2) expression screening, and (3) sample preparation. Each of these steps is considered in the sections that follow.

Renaud Vincentelli (ed.), High-Throughput Protein Production and Purification: Methods and Protocols, Methods in Molecular Biology, vol. 2025, https://doi.org/10.1007/978-1-4939-9624-7_2, © Springer Science+Business Media, LLC, part of Springer Nature 2019


Fig. 1 Overview of the protein production pipeline

2 Construct Design

Designing the construct(s) that will encode the target protein is probably one of the most critical stages in the process of producing a recombinant protein. Therefore, at the outset, it is important to define the intended use of the protein: for example, activity screening, for which only the catalytic domain is required, or determination of the overall architecture of the protein, for which the full-length open reading frame is needed. The approach taken in the OPPF typically involves making both full-length and truncated constructs in order to maximize the output of the experiment.

2.1 Bioinformatic Analyses

Information about the target protein(s) is gathered from the scientific literature and from bioinformatic analyses as part of the design process. Resources such as UniProt and the Protein Data Bank (PDB) are used as entry points to prior knowledge. In addition, selected online resources are used, for example, to predict disordered regions with the RONN algorithm (Fig. 2) [1] and to generate 3-D homology models with the Phyre2 server [2]. A list of the web-based bioinformatic resources routinely used in the OPPF is given in Table 1.
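As a rough illustration of the kind of sequence-level check that such resources automate, the sketch below scans a protein sequence for the N-X-S/T sequon (X being any residue except proline), the consensus motif for N-linked glycosylation. This is not the method used by NetNGlyc, which relies on a trained predictor; the function name and the example sequence are hypothetical and serve only to show the idea.

```python
import re

# Consensus N-glycosylation sequon: Asn, any residue except Pro, then Ser or Thr.
# A lookahead is used so that overlapping sequons are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence: str) -> list[int]:
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    seq = sequence.upper()
    return [m.start() + 1 for m in SEQUON.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical sequence for illustration only.
    example = "MKNASLLNPTAANNTSQ"
    print(find_sequons(example))
```

A motif scan of this kind only flags candidate sites; whether a site is actually occupied has to be established experimentally, for example by mass spectrometry.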

Fig. 2 Screenshot of output from RONN v3.2 showing disorder prediction. The solid line represents the probability of disorder for each amino acid with the dotted line showing the boundary between order and disorder. The amino acid sequence is given below the graph. In this case, an area of disorder can be seen at the C-terminus where the blue line is above the red dotted line for a string of amino acids. Two further short sequences of disorder can be seen within the protein sequence

Table 1 Web-based resources for bioinformatic analysis of protein sequences used routinely in the OPPF

Name | Use | URL | Reference
NetNGlyc | N-glycosylation prediction | http://www.cbs.dtu.dk/services/NetNGlyc/ | [19]
NetOGlyc | O-glycosylation prediction | http://www.cbs.dtu.dk/services/NetOGlyc/ | [20]
PDB | Protein structure database | http://www.rcsb.org/ | [21]
Phyre2 | Structure homology model | http://www.sbg.bio.ic.ac.uk/phyre2/ | [2]
ProtParam | Physical and chemical parameters | http://web.expasy.org/protparam/ | [22]
RONN | Disorder prediction | https://www.strubi.ox.ac.uk/RONN | [1]
SignalP | Signal sequence prediction | http://www.cbs.dtu.dk/services/SignalP/ | [23]
TMHMM | Transmembrane prediction | http://www.cbs.dtu.dk/services/TMHMM/ | [24]
Uniprot | Protein sequence and function information | http://www.uniprot.org/ | [25]

2.2 Primer Design

The information compiled from both the scientific literature and bioinformatics analyses is used to inform the choice of start and stop positions of expression constructs and hence the locations of the forward and reverse primer sequences that will be used for PCR amplification and cloning. An in-house MySQL database called OPTIC [3] has been developed for storing target sequences. On entry into OPTIC, each sequence acquires a unique identifier (OPTIC number). A primer design tool linked to the OPTIC database enables PCR primers to be designed automatically but also adjusted manually as required. The OPTIC database contains a table of all the primer extensions required for ligation-independent cloning, and these are automatically added to the gene-specific forward and reverse primer sequences by selecting the vector that will be used for cloning and expression screening (see below). When a construct is generated, it is automatically given a unique OPPF number, and the PCR primer sequences are stored alongside the sequence that will be amplified. Typically more than one construct is designed for each target protein, such that there is a one-to-many relationship between target (OPTIC number) and constructs (OPPF numbers). The list of OPPF numbers and associated OPTIC identifiers can be exported from the OPTIC database into a construct design spreadsheet, with each OPPF construct corresponding to a position in a 96-well plate (A1-H12). An order form for PCR primers can be created either manually by “cutting and pasting” from the record in OPTIC or automatically using a protocol written in the Protein Information Management System, PiMS [4].
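The primer assembly step described above can be sketched as follows: a vector-specific extension is prepended to each gene-specific primer, and the reverse primer is taken as the reverse complement of the 3' end of the sequence to be amplified. The extension sequences, the vector name, and the function names below are placeholders chosen for illustration; they are not the sequences stored in OPTIC, and the fixed annealing length is a simplifying assumption.

```python
# Hypothetical 15-base extensions keyed by vector name; placeholders only,
# not the real pOPIN extensions held in the OPTIC database.
VECTOR_EXTENSIONS = {
    "exampleVector": {"forward": "AAACCCGGGTTTAAA", "reverse": "TTTGGGCCCAAATTT"},
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


def design_primers(orf: str, vector: str, anneal_len: int = 20) -> tuple[str, str]:
    """Build forward and reverse primers (5'->3') for the coding sequence `orf`.

    Each primer is the vector extension followed by a gene-specific part of
    fixed length `anneal_len`. Real tools usually adjust this length to reach
    a target melting temperature; that step is omitted here.
    """
    orf = orf.upper()
    if len(orf) < anneal_len:
        raise ValueError("ORF is shorter than the annealing length")
    ext = VECTOR_EXTENSIONS[vector]
    forward = ext["forward"] + orf[:anneal_len]
    reverse = ext["reverse"] + reverse_complement(orf[-anneal_len:])
    return forward, reverse
```

Keeping the extensions in a lookup keyed by vector mirrors the idea that choosing the vector determines which extensions are added, so the gene-specific part of a primer never has to be redesigned when the vector changes.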

3 Expression Screening

For all cloning and expression experiments, constructs are organized in a 96-well SBS plate format as defined by the construct design spreadsheet. Each experiment typically comprises between 24 and 96 expression vectors corresponding to one or more targets. The multiple constructs for each target may consist of different domains or sub-domains and/or different fusion tags added to facilitate protein solubility and detection/purification. Subsequently, the different vector configurations can be tested for expression in different hosts (E. coli, insect, and mammalian cells). In this way information is obtained about the optimal construct and expression host for subsequent production of the protein target.
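A minimal sketch of the plate layout referred to above, assigning construct identifiers to wells A1 to H12. Filling the plate row by row is an assumption made here for illustration, since the fill order is not stated, and the function names and identifiers are hypothetical.

```python
import string

ROWS = string.ascii_uppercase[:8]   # A-H
COLUMNS = range(1, 13)              # 1-12


def plate_wells(by_column: bool = False) -> list[str]:
    """Return the 96 well labels, row-wise (A1, A2, ...) or column-wise (A1, B1, ...)."""
    if by_column:
        return [f"{r}{c}" for c in COLUMNS for r in ROWS]
    return [f"{r}{c}" for r in ROWS for c in COLUMNS]


def assign_wells(construct_ids: list[str]) -> dict[str, str]:
    """Map wells to construct identifiers in order, starting at A1."""
    wells = plate_wells()
    if len(construct_ids) > len(wells):
        raise ValueError("more than 96 constructs: a second plate is needed")
    return dict(zip(wells, construct_ids))
```

For an experiment with 24 constructs, `assign_wells` would fill rows A and B and leave the rest of the plate empty, which keeps the same well-to-construct mapping through cloning, expression, and analysis.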

3.1 Vector Construction

Multiple expression vectors are constructed in parallel by ligation-independent cloning using the In-Fusion™ polymerase, which catalyzes the precise joining of DNA fragments (e.g., PCR-generated inserts, synthetic genes, and linearized vectors) by recognizing 15 bp overlaps at their ends [5]. The primer design tool automatically adds the appropriate 15 bp extension that is required for cloning into a particular vector. A suite of vectors (the pOPIN vectors) based on pTriEx 1.1 has been constructed such that the same 15 bp annealing sequences can be used for cloning. Hence the same PCR product can be used to construct a whole series of pOPIN vectors that in turn can be tested in more than one host [5–8]. The pTriEx plasmid backbone incorporates promoter elements for expression in E. coli, mammalian cells, and insect cells using baculoviruses (Fig. 3a). Therefore the same vector, generated by a single cloning step, can be used for screening in different host cells. For expression in insect cells, homologous recombination within the insect cell is used to construct baculoviruses in a single step. The pOPIN transfer vector is co-transfected with a genetically disabled baculovirus propagated as a bacmid, and the recombinant virus is subsequently amplified [9, 10]. All the pOPIN vectors include a hexa/octahistidine tag to facilitate detection and purification of expressed proteins. A subset of pOPIN vectors, based on the pTT vector backbone [12], has been constructed for production of secreted eukaryotic proteins; these vectors carry a resident signal sequence [11] (Fig. 3b). In addition, a number of fusion protein tags have been added that can aid solubility and/or expression levels (Fig. 4).

Fig. 3 Diagrammatic representation of (a) pOPINF and (b) pOPINTTGneo. pOPINF is a vector conferring an N-terminal hexahistidine tag followed by a 3C protease cleavage site onto a gene of interest and is based on pTriEx 1.1. pOPINTTGneo contains the RTPTμ signal sequence at the N-terminus of the gene of interest and a C-terminal histidine tag and is based on the pTT vector (Maps generated using SnapGene Viewer)

Fig. 4 pOPIN vector configurations. (a) N-terminal tags. Vectors containing a hexahistidine tag (green) in combination with different fusion proteins (blue) and a 3C protease cleavage site (black) upstream of the protein-of-interest (red). (b) C-terminal tags. The top two vectors contain the protein-of-interest (red) followed by a lysine before the hexahistidine tag (green). The lysine allows the tag to be removed by carboxypeptidase A. The second of these two vectors contains a signal sequence for periplasmic expression in E. coli (orange). The bottom four vectors contain a 3C protease cleavage site (black) after the protein-of-interest (red) and then longer C-terminal fusion proteins (blue) before the hexahistidine tag (green). (c) Vectors based on pTT with the RTPTμ signal sequence (purple) for secreted expression in eukaryotic systems. pOPIN vector sequences are available on the OPPF website (www.oppf.rc-harwell.ac.uk)

3.2 Screening for the Expression of Intracellular Proteins

Expression screening of prokaryotic proteins is carried out exclusively in E. coli. However, for the production of eukaryotic proteins, experience has shown that testing for expression in both E. coli and insect cells significantly increases the number of expressed targets. In both cases, expression screening is automated and performed in 96-well format. The assay consists of pelleting 1 ml of culture, lysing the cells, and performing small-scale nickel affinity capture using magnetic beads [5] (Fig. 5). The resulting purified protein is then analyzed by SDS-PAGE and the bands are compared to a GFP (green fluorescent protein) positive control. Expression screening in E. coli and insect cells can take place in parallel, although the insect cell screen takes longer because the baculovirus must first be generated and amplified (Fig. 5). An example of the results obtained from running expression trials in E. coli and insect cells in parallel is shown in Fig. 6. Here, no detectable expression of Discoidin Domain Receptor Tyrosine Kinase 1 (DDR1) was seen in E. coli (Fig. 6a), whereas protein of the expected size was expressed in insect cells infected with recombinant DDR1 kinase baculoviruses (Fig. 6b).


Fig. 5 Cloning and expression screening pipelines for intracellular proteins using E. coli and insect cells


Fig. 6 SDS-PAGE gel after NiNTA magnetic bead purification showing expression of constructs for DDR1 in (a) E. coli and (b) insect cells via the baculovirus system. The lanes are numbered according to the construct amino acid start and stop points along with the vector which is either pOPINE “E,” pOPINF “F,” or pOPINJ “J” conferring a C-terminal His tag, an N-terminal His tag, or an N-terminal His-GST tag, respectively. Triangles represent the expected molecular weight of the protein

Table 2 Comparison of E. coli versus insect cell expression for constructs with N- and C-terminal His tags

Category | pOPINE | pOPINF | pOPINE/F
Expression in either system | 94 | 113 | 207
The same expression level | 13 | 11 | 24
Better expression in E. coli | 17 | 17 | 34
Better expression in insect cells | 64 | 85 | 149
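The counts in Table 2 can be cross-checked with a few lines of arithmetic. The sketch below confirms that the three outcome categories in each column add up to the "either system" total and reproduces the percentages quoted in the text for 58 and 183 of the 207 expressing constructs. The dictionary keys and function names are hypothetical labels introduced for this example.

```python
# Counts transcribed from Table 2 (constructs per category).
TABLE_2 = {
    "pOPINE": {"either": 94, "same": 13, "better_ecoli": 17, "better_insect": 64},
    "pOPINF": {"either": 113, "same": 11, "better_ecoli": 17, "better_insect": 85},
}


def check_column(col: dict[str, int]) -> bool:
    """True if the three categories add up to the 'either system' total."""
    return col["same"] + col["better_ecoli"] + col["better_insect"] == col["either"]


def percent(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Total number of constructs that expressed in at least one host.
total = sum(col["either"] for col in TABLE_2.values())

# Combined pOPINE/F counts for each outcome category.
combined = {
    key: sum(col[key] for col in TABLE_2.values())
    for key in ("same", "better_ecoli", "better_insect")
}

if __name__ == "__main__":
    print(all(check_column(col) for col in TABLE_2.values()))
    print(total, combined)
    print(percent(58, total), percent(183, total), percent(total, 427))
```

On these numbers, 207 of the 427 constructs screened (about 48.5%) gave detectable expression in at least one host.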

The results of screening 427 constructs, corresponding to 135 different eukaryotic proteins, for soluble protein expression in both E. coli and insect cells are shown in Table 2. Constructs contained either an N- or C-terminal hexahistidine tag. A total of 207 constructs representing 65 proteins (unique OPTIC numbers) gave expression in E. coli, insect cells, or both. However, only 58 of these (28%) would have been obtained using E. coli alone, whereas insect cells alone accounted for 183 (88%) of the expression hits. For eukaryotic proteins, using insect cells for expression is clearly beneficial. Screening using both E. coli and insect cells in parallel will maximize the number of expression-positive constructs obtained in the shortest possible time.

Fusion of a target protein to a highly soluble carrier protein can enhance the expression of eukaryotic proteins in E. coli. Figure 7a shows the full-length construct of a phospholipase C family protein expressed in E. coli with eight different N-terminal tags. The His-tag alone gave no detectable expression of this protein in E. coli, whereas addition of solubility tags led to detectable expression of the fusion protein (Fig. 7a). However, one of the drawbacks of using solubility tags is that truncated products are often observed; for instance, in the lane with the Halo7-tagged protein, bands can be seen around 35 kDa which correspond to the Halo7 tag alone. When the same constructs were expressed in insect cells, not only the fusion proteins but also the His-tagged version were expressed (Fig. 7b). This construct was scaled to 1 L in insect cells and gave 1.4 mg of purified protein. The conclusion from screening expression of 135 proteins (unique OPTIC numbers used in this dataset) is that changing the expression system is better than trying to produce E. coli fusion proteins.

Fig. 7 SDS-PAGE gel after NiNTA magnetic bead purification showing expression in (a) E. coli and (b) insect cells with eight different N-terminal tags. The solubility tag is shown above the gel. Expected molecular weights are as follows: His (pOPINF) = 38.5 kDa, SUMO small ubiquitin-like modifier (pOPINS3C) = 49.5 kDa, NusA N-utilization substance protein A (pOPINNusA) = 93 kDa, MysB (pOPINMysB) = 52.5 kDa, MBP maltose binding protein (pOPINM) = 79 kDa, TF trigger factor (pOPINTF) = 86.5 kDa, TRX thioredoxin (pOPINTRX) = 50 kDa, Halo7™ (pOPINHalo7) = 71.5 kDa. Triangles represent the expected molecular weight of the protein

3.3 Screening for the Expression of Secreted Glycoproteins

Production of secreted (glyco)proteins forms a major part of the project portfolio of the OPPF, for which transient expression in HEK 293 cells is routinely used [13]. As with intracellular proteins, the same principle of evaluating multiple constructs in parallel is applied to secreted proteins. Cells are grown in either 24- or 96-well tissue culture plates, and expression of secreted proteins is detected by western blotting and/or ELISA of culture supernatants 72 h post-transfection. For projects requiring multiple rounds of screening, the transient transfection process has been automated [10].


Fig. 8 Western blot analysis of secreted expression in HEK 293 cells for human epididymis protein 4 (HE4) with the native signal peptide (pOPINEneo), the RTPTμ signal peptide (pOPINTTGneo) and the RTPTμ signal peptide and a C-terminal CD4 tag (pOPINTTGneo-CD4)

In most cases, proteins are expressed using the RTPTμ signal peptide [11] from the pOPINTTGneo vector, but native signal peptides have also been tested using the pOPINEneo vector. With secreted proteins, the addition of a C-terminal CD4 fusion protein [14] can improve relative expression levels as in the case shown in Fig. 8. Therefore, this tag is routinely tested alongside the C-terminal His tag. In this example, the native signal sequence did not lead to detectable secreted protein in contrast to the RTPTμ signal peptide (Fig. 8). Where tested, we have found that RTPTμ has invariably given the same or better secretion than the native signal peptide.

4 Sample Preparation

In most cases the results from small-scale expression screening translate to the outcomes at larger scale. Overall, screening enables effort to be prioritized so that resources (time and money) are only invested in those constructs that will give sufficient yields for downstream applications.

4.1 Scale Up and Purification

The results from expression screening inform the choice both of construct and production system for sample preparation. Simple fed batch cultures are grown for E. coli, insect, and mammalian cells using commercially available media in 1–2.5 L volumes.

Fig. 9 Pipeline for protein purification for (a) intracellular proteins from E. coli and insect cells and (b) secreted proteins from mammalian cells. 3C protease and EndoF1 are made in-house (3C protease clone from EMBL Heidelberg, Germany and EndoF1 clone from Weizmann Institute, Israel)

Expression pre-screening means that a simple two-step purification method is sufficient to prepare samples of sufficient purity for structural studies (Fig. 9). For intracellular proteins, the first step is a nickel column followed automatically by a size exclusion column using an ÄktaXpress purification system (GE Healthcare). If a cleavable tag has been used, this is then cut using 3C protease before secondary purification (Fig. 9a). Products secreted from mammalian cells undergo the same initial purification steps of nickel column followed automatically by size exclusion, using the ÄktaXpress programme published by Nettleship et al. [15] (Fig. 9b). This programme allows large volumes of media to be loaded onto the nickel column using a cycle of loading 200 ml media followed by 50 ml of wash buffer before continuing with elution from the nickel column and size exclusion chromatography. The method reduces pressure build-up due to viscous components of the media and also reduces nonspecific binding to the nickel column. Nonspecific binding is further reduced by the addition of 2 mM NiCl2 to serum-containing media before loading onto the nickel column. If required, N-glycosylation of the product can be simplified by the addition of kifunensine to the cell media during scale-up of expression [16]. Kifunensine inhibits α-mannosidase I and leaves high-mannose N-glycans on the expressed glycoproteins, which can then be removed after purification by treatment with Endoglycosidase F1 (EndoF1) followed by a second gel filtration step to remove the EndoF1 from the sample.

A standard set of buffers is used for the column chromatography steps, which simplifies the process and facilitates running several protein purifications in parallel. The buffers used for nickel affinity chromatography contain 50 mM Tris, pH 7.5 and 500 mM NaCl with 30 mM or 500 mM imidazole, pH 7.5 for the wash and elution buffers, respectively. The size exclusion buffer is 20 mM Tris, pH 7.5 and 200 mM NaCl with 1 mM TCEP (Tris(2-carboxyethyl)phosphine hydrochloride) if a reducing agent is required.

Examples of size exclusion profiles from one week of protein purification using the standard protocols and buffers are shown in Fig. 10. The majority of the gel filtration profiles (HiLoad 16/600 Superdex 200 or Superdex 75, GE Healthcare) show symmetrical peaks corresponding to a relatively monodisperse product, and further analysis by SDS-PAGE showed that the proteins were >95% pure and hence suitable for biophysical characterization, activity assay, and/or crystallization. The profile for protein OPPF 17541 shows two peaks which, upon further analysis (not shown here), were found to correspond to monomeric and dimeric forms of the protein. OPPF 16681 gives many peaks in the size exclusion profile and therefore needs further purification.

Using the standard buffers, 76% of proteins entering the sample preparation stage in the OPPF have been successfully purified. This success rate is independent of whether the protein is intracellular or secreted and of which host cell was used for production. Among the samples that failed production, the most common problems were an unexpectedly low initial expression level on scale-up, precipitation upon tag removal, and aggregation of the protein. Low yield can be addressed by increasing culture volumes at scale-up to produce more biomass, and poor solubility by optimizing buffer conditions (see below).
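The standard buffer compositions given above translate into weighed amounts as sketched below. The molar masses are standard values; the assumptions that Tris is weighed as Tris base (with pH adjusted afterwards) and that TCEP is added as the hydrochloride salt are made here for illustration, and the function and dictionary names are hypothetical.

```python
# Molar masses in g/mol (standard values; Tris is assumed to be weighed as Tris base).
MOLAR_MASS = {
    "Tris": 121.14,
    "NaCl": 58.44,
    "imidazole": 68.08,
    "TCEP-HCl": 286.65,
}

# Final concentrations in mM, taken from the text.
BUFFERS = {
    "nickel wash": {"Tris": 50, "NaCl": 500, "imidazole": 30},
    "nickel elution": {"Tris": 50, "NaCl": 500, "imidazole": 500},
    "size exclusion": {"Tris": 20, "NaCl": 200, "TCEP-HCl": 1},
}


def grams_needed(buffer: str, volume_l: float) -> dict[str, float]:
    """Mass of each component (g) for `volume_l` litres of the named buffer."""
    recipe = BUFFERS[buffer]
    return {
        name: round(conc_mm / 1000 * MOLAR_MASS[name] * volume_l, 2)
        for name, conc_mm in recipe.items()
    }


if __name__ == "__main__":
    for name in BUFFERS:
        print(name, grams_needed(name, 1.0))
```

Because the wash and elution buffers differ only in imidazole concentration, a single Tris/NaCl base stock could in principle serve both, which fits the stated aim of running several purifications in parallel with one buffer set.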
4.2 Biophysical Characterization

At the OPPF, all purified proteins are characterized by intact protein mass spectrometry as a quality assurance step [17]. This allows the measured mass to be compared with the expected mass and any posttranslational modifications to be assessed. Figure 11 shows intact protein mass spectrometric analyses of a number of purified proteins produced from different host cells. Mass spectrometry can be used to measure modifications, for example, OPPF 2145 where the measured mass relates to labeling of the three methionines with selenomethionine (Fig. 11). Glycoproteins are treated with PNGase F prior to intact protein mass spectrometry to remove the N-glycans. This alters the asparagine at the N-glycosylation site to an aspartic acid resulting in a + 1 Da increase in mass. OPPF 19763 has a + 1 Da shift in mass relating to one N-glycosylation site and OPPF 7040 has a + 2 Da shift showing that it contains two occupied N-glycosylation sites (Fig. 11). The

[Fig. 10 panels: size exclusion chromatograms plotting absorbance (mAU) against retention volume for OPPF 17327, 15112, 15459, and 15691 (insect cells); 16681, 17539, 7591, 17541, 14244, and 6502 mutant (E. coli); 8856 and 17592 (HEK cells, secreted)]

Fig. 10 Size exclusion profiles from standardized purification of 12 products performed over a one week period. Proteins are labeled according to OPPF number and the host cell used for production is indicated. Arrows indicate the peak containing the protein of interest

[Fig. 11 panels: deconvoluted intact protein mass spectra (x-axis: Mass (Da)) for 3325: E. coli (Expected 29882 Da); 15459: Insect cells; 2145 SeMet: E. coli; 7040 PNGaseF: HEK cells (secreted); 15616: Insect cells; 19763 PNGaseF: HEK cells (secreted); 20635: E. coli; 3355: E. coli; 20371 + 18732: HEK cells (secreted). Peak labels in the figure: 18262, 16251, 83361, 13077, 44949, 47924, 21034, and 37546 Da]

Fig. 11 Intact protein mass spectrometry deconvoluted spectra from purified proteins. Proteins are labeled according to their OPPF number and host cell used for production is indicated. Glycoproteins which have been treated with PNGase F to remove the glycans prior to intact protein mass spectrometry are labeled “PNGaseF”

mass spectrum for OPPF 20371 + 18732 shows one peak corresponding to a non-reduced Fab fragment in which the heavy and light chains are linked by a disulfide (cysteine) bridge. This was produced by co-transfection in mammalian cells [18]. In the case of OPPF 3325, although purified protein could be seen by SDS-PAGE, the protein failed quality assurance by mass spectrometry (Fig. 11). The second sample quality measure that is routinely used is the thermal shift assay [17]. Although not applicable to all proteins, this assay is a fast and convenient method for indirectly assessing protein folding. Further, by analyzing the thermal stability of a sample in different buffer conditions, formulation of the sample can be optimized. Figure 12 shows results from a thermal shift assay for a human histone acetyltransferase. Here it can be seen that there


Fig. 12 Results from a thermal shift assay investigating the effect of pH on protein stability for a human histone acetyltransferase. The assay is performed in 96 well format and only a subset of data is shown here

is a shift to a higher melting temperature, and therefore increased protein stability, at pH 6.5 (gray dashed line; Tm = 39.9 °C) rather than the standard pH of 7.5 (black solid line; Tm = 38.6 °C). This information is used to determine the optimum pH for purification and storage of the protein.
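The comparison above amounts to ranking buffer conditions by melting temperature relative to a reference condition. A minimal Python sketch, using only the two Tm values quoted for Fig. 12; the function name and data layout are illustrative assumptions, not part of the OPPF software:

```python
def rank_conditions_by_tm(tm_by_condition, reference):
    """Return (condition, Tm, delta-Tm vs. reference) tuples, most stable first."""
    ref_tm = tm_by_condition[reference]
    ranked = sorted(tm_by_condition.items(), key=lambda kv: kv[1], reverse=True)
    return [(cond, tm, round(tm - ref_tm, 2)) for cond, tm in ranked]


# Tm values (degrees C) quoted in the text for the histone acetyltransferase
tm_values = {"pH 7.5": 38.6, "pH 6.5": 39.9}

ranking = rank_conditions_by_tm(tm_values, reference="pH 7.5")
best_condition, best_tm, best_delta = ranking[0]
print(best_condition, best_tm, best_delta)
```

In a 96-well screen the same ranking would simply be applied to all conditions tested, with the standard buffer as the reference.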

5 Summary

The OPPF pipeline has been developed in order to streamline protein production for structural and functional studies. A number of factors, outlined below, make this possible:

1. Design of multiple constructs at the start of a project.
2. Ligation-independent cloning in 96-well format.
3. Preferential use of short hexa/octahistidine tags as opposed to fusion proteins.
4. Parallelization of small-scale expression screening in multiple hosts through the use of the pOPIN vector system.
5. Changing the host cell from E. coli when little or no expression is detected.
6. Use of standardized buffers and protocols during protein purification.


Acknowledgements

The OPPF was funded by the Medical Research Council, UK (grant MR/K018779/1). The authors wish to thank Louise Bird for help with data analysis.

References

1. Yang ZR et al (2005) RONN: the bio-basis function neural network technique applied to the detection of natively disordered regions in proteins. Bioinformatics 21(16):3369–3376
2. Kelley LA et al (2015) The Phyre2 web portal for protein modeling, prediction and analysis. Nat Protoc 10(6):845–858
3. Pajon A et al (2005) Design of a data model for developing laboratory information management and analysis systems for protein production. Proteins 58(2):278–284
4. Morris C et al (2011) The Protein Information Management System (PiMS): a generic tool for any structural biology research laboratory. Acta Crystallogr D Biol Crystallogr 67(Pt 4):249–260
5. Berrow NS et al (2007) A versatile ligation-independent cloning method suitable for high-throughput expression screening applications. Nucleic Acids Res 35(6):e45
6. Berrow NS, Alderton D, Owens RJ (2009) The precise engineering of expression vectors using high-throughput In-Fusion PCR cloning. Methods Mol Biol 498:75–90
7. Bird LE (2011) High throughput construction and small scale expression screening of multi-tag vectors in Escherichia coli. Methods 55(1):29–37
8. Bird LE et al (2014) Application of In-Fusion cloning for the parallel construction of E. coli expression vectors. Methods Mol Biol 1116:209–234
9. Zhao Y, Chapman DA, Jones IM (2003) Improving baculovirus recombination. Nucleic Acids Res 31(2):E6–E6
10. Nettleship JE et al (2010) Recent advances in the production of proteins in insect and mammalian cells for structural biology. J Struct Biol 172(1):55–65
11. Aricescu AR, Lu W, Jones EY (2006) A time- and cost-efficient system for high-level protein production in mammalian cells. Acta Crystallogr D Biol Crystallogr 62(Pt 10):1243–1250
12. Durocher Y, Perret S, Kamen A (2002) High-level and high-throughput recombinant protein production by transient transfection of suspension-growing human 293-EBNA1 cells. Nucleic Acids Res 30(2):E9
13. Nettleship JE et al (2015) Transient expression in HEK 293 cells: an alternative to E. coli for the production of secreted and intracellular mammalian proteins. Methods Mol Biol 1258:209–222
14. Brown MH, Barclay AN (1994) Expression of immunoglobulin and scavenger receptor superfamily domains as chimeric proteins with domains 3 and 4 of CD4 for ligand analysis. Protein Eng 7(4):515–521
15. Nettleship JE, Rahman-Huq N, Owens RJ (2009) The production of glycoproteins by transient expression in mammalian cells. Methods Mol Biol 498:245–263
16. Chang VT et al (2007) Glycoprotein structural genomics: solving the glycosylation problem. Structure 15(3):267–273
17. Nettleship JE et al (2008) Methods for protein characterization by mass spectrometry, thermal shift (ThermoFluor) assay, and multiangle or static light scattering. Methods Mol Biol 426:299–318
18. Nettleship JE et al (2012) Converting monoclonal antibodies into Fab fragments for transient expression in mammalian cells. Methods Mol Biol 801:137–159
19. Joshi HJ, Gupta R (2015) Eukaryotic glycosylation: online methods for site prediction on protein sequences. In: Lütteke T, Frank M (eds) Glycoinformatics. Springer, New York, pp 127–137
20. Steentoft C et al (2013) Precision mapping of the human O-GalNAc glycoproteome through SimpleCell technology. EMBO J 32(10):1478–1488
21. Berman HM et al (2000) The protein data bank. Nucleic Acids Res 28(1):235–242
22. Gasteiger E et al (2005) Protein identification and analysis tools on the ExPASy server. In: Walker JM (ed) The proteomics protocols handbook. Humana Press, Totowa, NJ, pp 571–607
23. Petersen TN et al (2011) SignalP 4.0: discriminating signal peptides from transmembrane regions. Nat Methods 8(10):785–786
24. Moller S, Croning MD, Apweiler R (2001) Evaluation of methods for the prediction of membrane spanning regions. Bioinformatics 17(7):646–653
25. The UniProt Consortium (2017) UniProt: the universal protein knowledgebase. Nucleic Acids Res 45(D1):D158–D169

Chapter 3

Semiautomated Small-Scale Purification Method for High-Throughput Expression Analysis of Recombinant Proteins

Edward Kraft, Yvonne Franke, Katharine Heeringa, Stephanie Shriver, Inna Zilberleyb, Christine Kugel, Trisha Dela Vega, Athena Wong, Bobby Brillantes, Claudio Ciferri, George Dutina, Grace Lee, Isabelle Lehoux, Zhong Rong Li, Lee Lior-Hoffmann, Jiyoung Hwang, Chris Lonergan, Lynn Martin, Kyle Mortara, Lananh Nguyen, Jian Payandeh, Andrew Perez, Jun Sampang, Lovejit Singh, Kurt Schroeder, Christine Tam, Shu Ti, Ye Naing Win, and Krista Bowman

Abstract

The expression analysis of recombinant proteins is a challenging step in any high-throughput protein production pipeline. Often multiple expression systems and a variety of expression construct designs are considered for the production of a protein of interest. There is a strong need to triage constructs rapidly and systematically. This chapter describes a semiautomated method for the simultaneous purification and characterization of proteins expressed from multiple samples of expression cultures from the E. coli, baculovirus expression vector system, and mammalian transient expression systems. This method assists in the selection of the most promising expression construct(s) or the most favorable expression condition(s) to move forward into large-scale protein production.

Key words Recombinant protein expression, High-throughput protein expression, Parallel protein purification, E. coli, Baculovirus, BEVS, Mammalian transient expression

1 Introduction

High-throughput protein production has matured into a widespread practice over the last decade. Projects at academic institutions, structural biology consortia, and pharmaceutical companies often require the generation of large numbers of proteins from different protein families, different species, and containing assay-specific tags to support complex research and drug discovery. While

Renaud Vincentelli (ed.), High-Throughput Protein Production and Purification: Methods and Protocols, Methods in Molecular Biology, vol. 2025, https://doi.org/10.1007/978-1-4939-9624-7_3, © Springer Science+Business Media, LLC, part of Springer Nature 2019


every protein is different and individual protein expression laboratories take slightly varied approaches, there are now many examples of highly effective high-throughput protein production pipelines [1–6]. There are many considerations when establishing a high-throughput protein production pipeline. Choosing the best expression system and recovery method depends heavily on the characteristics of individual proteins of interest such as size, structure, native localization, solubility, posttranslational modifications, required co-factors, toxicity during expression, and stability in vivo and in vitro. Additionally, researchers must consider the desired quantity, purity, activity, and specific tagging requirements that may result from assay design or desired end-use of the protein. Most high-throughput expression laboratories focus their resources to express proteins primarily using the Escherichia coli (E. coli), baculovirus expression vector system (BEVS), and/or mammalian transient expression systems. The data accumulated over years of protein expression in the scientific community suggest that each expression system offers a distinct potential for success with regard to protein size, structure, posttranslational modification, native localization, and enzymatic activity [7–12]. There are, however, unpredicted successes and failures that appear to be protein-specific. Multiple structural biology consortia working over the last decade to determine three-dimensional protein structures have advanced the field of high-throughput protein production greatly by developing and refining new protein expression, purification, and characterization technologies [13–16]. A review of the creative and efficient pipelines implemented by these groups suggests that one can establish a generic method for successfully generating a broad range of proteins of interest with a high potential for success.
It is common to generate a large variety of proteins and protein variants to support high-throughput screening of small molecule protein modulators, structure-based drug design, antigen generation, and other activities pertinent to drug discovery. At the start of a target protein production project, researchers first investigate all available literature precedents for the protein of interest and for other relevant proteins such as orthologs and protein superfamily members. Scientists examine the characteristics of the protein including primary amino acid sequence, homology with other species or family members, known or predicted structure, posttranslational modifications, and enzymatic activity. A typical expression approach is to generate and screen a series of expression constructs using multiple expression systems in order to identify the most suitable system for the generation of sufficient soluble proteins for downstream applications. This strategy may

High-Throughput Protein Expression in Multiple Expression Hosts


include expressing full-length proteins, specific domains, mutant proteins, or chimeric proteins. Various peptide affinity tags or fusion partners such as poly-Histidine, FLAG, glutathione-S-transferase (GST), and maltose binding protein (MBP) may be investigated to enhance soluble expression and purification of the target protein. For proteins with limited expression information, one may consider adding fusion tags to both the N- and C-termini, individually or together, and consider leveraging the use of multiple expression systems (E. coli, BEVS, and mammalian transient) in parallel to increase the chance of successful delivery of the desired protein. The rapid generation of multiple expression constructs in parallel is vital to this approach and has been greatly enhanced through the implementation of efficient and high-throughput subcloning methods [17–20]. We have designed our expression analysis pipeline for simultaneous analysis of proteins with poly-Histidine, FLAG, and Fc tags using pipet tips filled with Ni-NTA, anti-FLAG, or ProPlus affinity resin, respectively. In cases where a fusion partner such as GST or MBP is necessary to improve recovery of soluble protein, a poly-Histidine tag is added to the expression construct immediately upstream of the fusion partner. Sample preparation of intracellular proteins for purification requires the harvest of cells, lysis, and, in the case of integral membrane proteins, solubilization of the membrane. Sample preparation of secreted proteins for purification involves the removal of cells and cell debris by centrifugation. In this chapter, we describe two variations of this purification method (Fig. 1): one that applies to intracellular and integral membrane proteins and is performed on a Beckman Biomek FX (Subheading 3.1), and one that applies to secreted proteins and is performed on a Hamilton STAR (Subheading 3.2).
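The pairing of affinity tag and resin described above can be written down as a small lookup that also lays out a tip source rack. This is a minimal sketch only: the construct names, the row-major well order, and the rule of taking the first capture tag listed are illustrative assumptions, not details of the authors' liquid-handler methods.

```python
from string import ascii_uppercase

# Tag-to-resin pairing as stated in the text
RESIN_FOR_TAG = {"His": "Ni-NTA", "FLAG": "anti-FLAG", "Fc": "ProPlus"}


def resin_for_construct(tags):
    """Return the resin for the first capture tag found; fusion partners
    such as GST or MBP are ignored because such constructs also carry a
    poly-Histidine tag."""
    for tag in tags:
        if tag in RESIN_FOR_TAG:
            return RESIN_FOR_TAG[tag]
    raise ValueError(f"no supported affinity tag in {tags!r}")


def well_names(rows=8, cols=12):
    """96-well names in row-major order (A1, A2, ..., H12); an assumed order."""
    return [f"{ascii_uppercase[r]}{c}" for r in range(rows) for c in range(1, cols + 1)]


def tip_rack_layout(constructs):
    """Map each well to (construct name, resin) for up to 96 constructs."""
    wells = well_names()
    if len(constructs) > len(wells):
        raise ValueError("more constructs than wells")
    return {w: (name, resin_for_construct(tags)) for w, (name, tags) in zip(wells, constructs)}


# Hypothetical constructs for illustration
constructs = [
    ("construct_1", ["His"]),
    ("construct_2", ["FLAG"]),
    ("construct_3", ["His", "MBP"]),
    ("construct_4", ["Fc"]),
]
layout = tip_rack_layout(constructs)
for well, (name, resin) in layout.items():
    print(well, name, resin)
```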
The major steps in both of these methods include sample preparation, semiautomated purification, and eluted sample analysis. The estimation of expression level is the most challenging and laborious step in this method. Absorbance at 280 nm is used as a measure of protein concentration of the elution samples. SDS-PAGE analysis confirms the presence of full-length or truncated proteins, identifies the MW, and provides information regarding sample purity. Overall estimated expression levels are reported as a range: 10 mg/L. The method outlined in this chapter allows for the concurrent triaging of multiple expression constructs to identify suitable approaches for producing soluble protein for diverse protein targets. Additionally, it can be used for the evaluation of expression, lysis, purification, and formulation conditions to identify optimal strategies for producing soluble stable proteins. The success of this


Fig. 1 Process flow for sample preparation and purification of intracellular, membrane, and extracellular proteins in a 96-well layout. (a) Intracellular and membrane purification workflow for the processing of samples. Cell pellets are collected and arrayed as listed above to allow for lysis of samples in 48-well blocks and filtration of samples in a 96-well format. (b) Extracellular purification workflow for the processing of secreted protein samples. Supernatants of cultures are collected and transferred to multiple 96-well blocks to allow for sequential binding of 1 ml volumes. (c) Resin tip architecture for sample processing. Resin type, size, vendor and screen choice are made according to platform requirements. (d) For both purification workflows samples are bound, washed, and eluted using a 96-well head for processing on Biomek or Hamilton liquid handlers

method relies on small scale expression and purification of multiple constructs in parallel. The expression yield information generated from these steps allows for the prioritization of the best constructs for subsequent large-scale protein production.
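The yield estimate that drives this prioritization (detailed in Subheading 3.1.3) combines the A280-derived concentration, the elution and culture volumes, and the densitometric purity of the target band. A minimal Python sketch of that arithmetic follows; the function names and all numerical inputs are hypothetical, and the extinction coefficient is assumed to be expressed in mass units (absorbance of a 1 mg/mL solution at a 1 cm path length).

```python
def band_purity(band_intensities, target_index):
    """Fraction of the lane's total band intensity found in the target band."""
    lane_total = sum(band_intensities)
    if lane_total <= 0:
        raise ValueError("lane has no detectable bands")
    return band_intensities[target_index] / lane_total


def total_yield_mg_per_l(a280, ext_coeff, elution_volume_ml, culture_volume_l):
    """Total protein yield: (A280 / extinction coefficient) x elution volume / culture volume."""
    concentration_mg_per_ml = a280 / ext_coeff
    return concentration_mg_per_ml * elution_volume_ml / culture_volume_l


# Hypothetical inputs for one eluted sample
purity = band_purity([800.0, 150.0, 50.0], target_index=0)
total = total_yield_mg_per_l(a280=0.5, ext_coeff=1.0,
                             elution_volume_ml=0.1, culture_volume_l=0.005)
target = purity * total  # final yield attributed to the target band, mg/L
print(f"purity={purity:.2f} total={total:.1f} mg/L target={target:.1f} mg/L")
```

The target-band yield can then be binned into whatever reporting ranges the laboratory uses.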

2 Materials

2.1 Purification and Analysis of Intracellular and Membrane Proteins

2.1.1 Reagents

1. Lysis Buffer: 50 mM Tris, pH 7.5, 300 mM NaCl, 5 mM Imidazole, 5 mM MgCl2, 15% Glycerol, 1 mM TCEP, Protease Inhibitor tablet (Complete, EDTA-free, 1 per 50 mL), Lysonase™ (1 μL of Lysonase™ stock per 1 mL of Lysis Buffer). TCEP, Protease Inhibitor, and Lysonase™ (Novagen) should be added immediately before use (see Note 1).
2. FC14: 10% (wt/vol) Fos Choline-14 (FC14) stock solution in water.
3. Acid Buffer: 0.1 M Glycine pH 3.0, 150 mM NaCl.
4. Wash Buffer: 50 mM Tris, pH 7.5, 300 mM NaCl, 20 mM Imidazole, 15% Glycerol, 1 mM TCEP, 0.015% FC14.
5. Elution Buffer: 50 mM Tris, pH 7.5, 300 mM NaCl, 250 mM Imidazole, 15% Glycerol, 0.015% FC14, 1 mM TCEP, 150 μg/mL FLAG peptide (BioBasic; added immediately before use).
6. MES Buffer: 50 mM MES, 50 mM Tris Base, 0.1% SDS, 1 mM EDTA, pH 7.3.
7. 4× LDS Sample Buffer: 40% Glycerol, 4% Lithium Dodecyl Sulfate (LDS), 0.8 M Triethanolamine-Cl, pH 7.6, 4% Ficoll®400, 0.025% Phenol Red, 0.025% Coomassie G250, 2 mM EDTA disodium.
8. 10× Sample Reducing Reagent: 500 mM DTT.
9. Coomassie stain: InstantBlue (Expedeon).
10. Precision Plus Protein™ Kaleidoscope™ prestained protein standards.
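Because Lysonase and protease inhibitor are added immediately before use, it can help to scale them to the number of samples. The sketch below assumes 600 μL of Lysis Buffer per sample (the volume used in Subheading 3.1.1), a hypothetical 10% excess, and rounding up to whole inhibitor tablets; the function name is illustrative.

```python
import math


def lysis_buffer_plan(n_samples, volume_per_sample_ul=600.0, excess_fraction=0.10):
    """Scale the just-before-use Lysis Buffer additives to a sample count.

    Ratios from the reagent list: 1 uL Lysonase stock per 1 mL buffer and
    1 protease inhibitor tablet per 50 mL buffer. The excess fraction and
    rounding up to whole tablets are assumptions made for this sketch.
    """
    buffer_ml = n_samples * volume_per_sample_ul * (1.0 + excess_fraction) / 1000.0
    return {
        "buffer_ml": buffer_ml,
        "lysonase_ul": buffer_ml * 1.0,
        "protease_inhibitor_tablets": math.ceil(buffer_ml / 50.0),
    }


plan = lysis_buffer_plan(n_samples=96)
print(plan)
```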

2.1.2 Equipment and Consumables

1. Beckman Biomek FX liquid handler with 96-well head.
2. Water bath sonicator (Qsonica, LLC) for lysing cells in 48 deep-well plates (E&K Scientific).
3. 200 μL resin-filled tips compatible with Biomek FX liquid handler containing 5 μL Ni-NTA (Qiagen) or 20 μL anti-FLAG affinity resin (Sigma; see Note 2 and Fig. 1c).
4. Plastics: 2 × 48-well deep-well plates, V-bottom (E&K Scientific); 96-well plates with 0.5 mL volume, V-bottom (Costar); 96-well 0.45 μm filter plates with low protein binding, high flow rate (Pall); 96-well PCR plates (VWR) for preparation of samples for SDS-PAGE; DropPlate D+ plates (Trinean) for measuring protein concentration on Dropsense (Trinean).
5. Multichannel pipettes for sample transfer and gel loading.


6. Instrument for measuring absorbance of samples at 280 nm such as Nanodrop (Thermo Scientific) or Dropsense 96 (Trinean).
7. Swinging bucket centrifuge compatible with the required g-forces.
8. Foil adhesive seals (E&K Scientific) for storage of samples in plates.
9. SDS-PAGE running apparatus such as XCell4 Sure lock Midi Cell (Thermofisher).
10. 20-well NuPage Midi Bis-Tris 4–12% gels (Thermofisher).
11. Bio-Rad Image Lab software package or similar for scanning/analyzing Coomassie-stained gels.

2.2 Purification and Analysis of Extracellular Proteins

2.2.1 Reagents

1. Mammalian Neutralization Buffer: 1 M Tris pH 7.5.
2. BEVS Neutralization Buffer: 1 M Tris, pH 8.0.
3. CaCl2 stock solution: 1 M CaCl2.
4. NiCl2 stock solution: 1 M NiCl2.
5. Acid Buffer: 0.1 M Glycine pH 3.0, 150 mM NaCl.
6. Equilibration Buffer: 50 mM Tris, pH 7.5, 300 mM NaCl, 5 mM Imidazole.
7. Wash Buffer 1: 50 mM Tris, pH 7.5, 300 mM NaCl, 15 mM Imidazole.
8. Wash Buffer 2: 150 mM NaCl.
9. Elution Buffer: 50 mM Tris pH 7.5, 300 mM NaCl, 250 mM Imidazole.
10. MES Buffer: 50 mM MES, 50 mM Tris Base, 0.1% SDS, 1 mM EDTA, pH 7.3.
11. 4× LDS Sample Buffer: 40% Glycerol, 4% Lithium Dodecyl Sulfate (LDS), 0.8 M Triethanolamine-Cl pH 7.6, 4% Ficoll®400, 0.025% Phenol Red, 0.025% Coomassie G250, 2 mM EDTA disodium.
12. 10× Sample Reducing Reagent: 500 mM DTT.
13. Coomassie stain: InstantBlue (Expedeon).
14. Precision Plus Protein™ Kaleidoscope™ prestained protein standards.
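The 1 M stock solutions listed above are diluted into conditioned medium to the final concentrations given in Subheading 3.2.1. The following sketch works out the volume of stock to add so that the mixture, including the added volume, reaches the target concentration. The 10 mL sample volume is a hypothetical example, and each additive is treated independently of the others, which is an approximation.

```python
def stock_volume_to_add(sample_volume_ml, stock_mM, final_mM):
    """Volume of stock (mL) to add to a sample to reach final_mM.

    Solves stock_mM * v = final_mM * (sample_volume_ml + v) for v, so the
    dilution caused by the added stock itself is taken into account.
    """
    if not 0 < final_mM < stock_mM:
        raise ValueError("final concentration must be between 0 and the stock concentration")
    return sample_volume_ml * final_mM / (stock_mM - final_mM)


sample_ml = 10.0  # hypothetical volume of conditioned medium

# Mammalian samples: 1 M Tris to 100 mM final
tris_mammalian_ml = stock_volume_to_add(sample_ml, stock_mM=1000.0, final_mM=100.0)

# BEVS samples: 1 M Tris to 50 mM, 1 M NiCl2 to 0.8 mM, 1 M CaCl2 to 5 mM
tris_bevs_ml = stock_volume_to_add(sample_ml, 1000.0, 50.0)
nicl2_ul = stock_volume_to_add(sample_ml, 1000.0, 0.8) * 1000.0
cacl2_ul = stock_volume_to_add(sample_ml, 1000.0, 5.0) * 1000.0

print(f"Tris (mammalian): {tris_mammalian_ml:.3f} mL")
print(f"Tris (BEVS): {tris_bevs_ml:.3f} mL, NiCl2: {nicl2_ul:.1f} uL, CaCl2: {cacl2_ul:.1f} uL")
```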

2.2.2 Equipment and Consumables

1. Hamilton STAR liquid handler.
2. 1 mL resin-filled tips compatible with Hamilton STAR liquid handler containing 10 μL Ni-NTA, 20 μL anti-FLAG, or 10 μL ProPlus (Phynexus) affinity resin (see Note 2 and Fig. 1c).
3. 1 mL CO-RE liquid-sensing tips (Hamilton).
4. Plastics: 4 × 200 mL buffer troughs; 14 × 96-well deep-well plates with 2.2 mL volume (Thermo Scientific), V-bottom (5 for buffers, 8 for samples, 1 for eluted samples from the purification resin tips); blotting paper/plate; 50 mL conical tubes; 96-well PCR plates for preparation of samples for SDS-PAGE; DropPlate D+ plates (Trinean) for measuring protein concentration on Dropsense.
5. Multichannel pipettes for sample transfer and gel loading.
6. Instrument for measuring absorbance of samples at 280 nm such as Nanodrop (Thermo Scientific) or Dropsense 96 (Trinean).
7. Swinging bucket centrifuge compatible with the required g-forces.
8. Foil adhesive seals (E&K Scientific) for storage of samples in plates.
9. SDS-PAGE running apparatus such as XCell4 Sure lock Midi Cell (Thermofisher).
10. 20-well NuPage Midi Bis-Tris 4–12% gels.
11. Bio-Rad Image Lab software package or similar for scanning/analyzing Coomassie-stained gels.

3 Methods

3.1 Purification and Analysis of Intracellular and Membrane Proteins

3.1.1 Sample Preparation for Intracellular and Membrane Protein Purification

We focus on two initial expression conditions for E. coli: autoinduction at 17 °C for 48–64 h [21] and 0.4 mM IPTG induction at 17 °C for 18 h. When expressing proteins in BEVS, our approach relies on Sf9 and Trichoplusia ni (T. ni; Expression Systems, llc.) cell lines grown in suspension culture and infected under low multiplicity of infection (MOI) conditions. We typically culture insect cells in 24-deep well plates, infect at a cell density of 2 E6 cells/mL with an MOI of approximately 0.5 for Sf9 and 1 for T. ni. We harvest the Sf9 cells at 65–72 h and the T. ni cells at 42–48 h post-infection. We express proteins in the mammalian transient system using PEI-mediated transient transfection of CHO and HEK293 cells. Modulation of seeding densities, temperature, media/feed, and chemical additives such as valproic acid can further enhance productivity. For our transient transfections to express secreted proteins, we seed 10 mL of cells at 1.2 E6 cells/mL in 50 mL TubeSpin® Bioreactors, transfect with 5 μg DNA, and harvest 7 days post-transfection. For expression of intracellular and membrane proteins, we seed 24 mL of cells at 1.2 E6 cells/mL, transfect with 12 μg DNA, and harvest 2–4 days post-transfection. This method is written for up to 96 samples.
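For the BEVS infections described above, the inoculum volume follows from the standard relationship MOI = infectious units added / number of cells. The sketch below uses the stated cell density and MOI values; the culture volume per well and the virus titer are hypothetical placeholders that would need to be replaced with measured values.

```python
def virus_volume_ml(moi, cell_density_per_ml, culture_volume_ml, titer_pfu_per_ml):
    """Virus stock volume (mL) needed to infect a culture at the given MOI."""
    total_cells = cell_density_per_ml * culture_volume_ml
    return moi * total_cells / titer_pfu_per_ml


CULTURE_ML = 3.0   # assumed working volume per well of a 24-deep well plate
TITER = 1e8        # assumed virus titer in pfu/mL

sf9_ml = virus_volume_ml(moi=0.5, cell_density_per_ml=2e6,
                         culture_volume_ml=CULTURE_ML, titer_pfu_per_ml=TITER)
tni_ml = virus_volume_ml(moi=1.0, cell_density_per_ml=2e6,
                         culture_volume_ml=CULTURE_ML, titer_pfu_per_ml=TITER)
print(f"Sf9: {sf9_ml * 1000:.0f} uL, T. ni: {tni_ml * 1000:.0f} uL")
```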


1. Resuspend each cell pellet in 600 μL of Lysis Buffer and transfer into 2 × 48 deep-well plates. See Fig. 1a for plate array (see Note 3).
2. Fully suspend sample plates in a water bath sonicator with sufficient ice to surround plate and sonicate using 5 cycles at 50% amplitude of 30 s on/30 s rest for insect or mammalian cells and 10 cycles at 50% amplitude of 30 s on/30 s rest for E. coli cells (see Note 4).
3. For sample groups that contain membrane proteins perform the following steps:
   (a) Solubilize membrane proteins by adding 10% FC14 stock solution to a final concentration of 1% (vol/vol).
   (b) Place 48-well plates on a shaker with a 0.5 in. throw and shake at 225 rpm for 1 h at 4 °C (see Note 5).
4. Centrifuge 48-well plates at 4000 × g in swinging bucket rotor for 1 h to obtain clarified cell lysate.
5. Prepare 2 × 0.45 μm filter plates by placing each filter plate on top of a 0.5 mL 96-well collection plate.
6. Pipette 300 μL (half) of the lysate into each 0.45 μm filter plate.
7. Centrifuge samples at 1500 × g for 10 min in swinging bucket rotor at 4 °C. If lysate does not filter after the first spin completes, transfer remaining lysate to a new filter plate and repeat spin. Combine filtrates as needed to obtain the protein load plates (see Note 6).

3.1.2 Purification of Intracellular and Membrane Proteins

1. Prepare or obtain resin-filled tips (Fig. 1c) that are compatible with the Biomek FX. Manually create a tip source rack (Fig. 2a) to match purification resin with the affinity tag on desired protein.
2. Prepare the following 96-well 500 μL V-bottom plates and place on deck of Biomek FX according to Fig. 2a:
   (a) Acid Buffer plate (Acid): 200 μL/well.
   (b) Lysis Buffer plate (TBS): 200 μL/well.
   (c) Wash Buffer plate (Wash): 500 μL/well.
   (d) Buffer waste (Dump) and Sample Collection (Collect) Plates: empty.
   (e) Elution Buffer Plate (Elution): 100 μL.
3. Place the two load plates on the Biomek FX deck as indicated in Fig. 2a.
4. Wash all resin tips with 100 μL of Acid Buffer by pipetting up and down 1 time at a flow rate of 8 μL/s, including a 30 s pause step after aspirating and after dispensing (see Note 7).


Fig. 2 (a) Biomek (Beckman) deck layout for purification of intracellular and membrane protein samples. Plates containing the lysate, associated buffers and sample collection are placed on the deck according to method setup. (b) Hamilton deck layout for purification of secreted protein samples. Tubes containing the supernatant, plates, associated buffers and sample collection are placed on the deck according to method setup. For sample numbers greater than 48 a second set of racks is loaded with tubes


5. Equilibrate resin tips with 100 μL Lysis Buffer by pipetting up and down 3 times at a flow rate of 8 μL/s, including a 30 s pause step after aspirating and after dispensing.
6. Load protein onto tips by pipetting up and down in each load plate 6 times at a flow rate of 4 μL/s, including a 30 s pause step after aspirating and after dispensing.
7. Wash columns by pipetting up 75 μL Wash Buffer from the Wash Buffer plate and dispensing into dump plate at flow rate of 8 μL/s, including a 30 s pause step after aspirating and after dispensing (see Note 8).
8. Repeat step 7 two additional times.
9. Elute protein by pipetting up 100 μL of Elution Buffer from the Elution Buffer Plate and transferring to the Sample Collection Plate at a flow rate of 8 μL/s.
10. Complete elution by pipetting up and down 100 μL 3 times at a flow rate of 8 μL/s, including a 30 s pause step after aspirating and after dispensing.
11. Touch the tips to the side of the plate after dispensing to ensure any droplets at the base of the tips are recovered.
12. The approximate time required to run the robotic protocol is 1 h per 96-well plate.

3.1.3 Analysis of Intracellular and Membrane Proteins

1. Determine total protein concentration in eluted samples by measuring absorbance at 280 nm on Dropsense 96 of 2 μL of each eluted sample (Fig. 3).
2. Run eluted samples on a 20-well NuPage Midi Bis-Tris 4–12% gel in 1× MES Buffer.
   (a) Prepare reduced gel loading samples (Table 1).
   (b) Heat samples 5 min at 95 °C. See Note 9 for membrane proteins.
   (c) Run gels at 180 V for 60 min.
3. Stain gel with Coomassie-based dye and destain until desired contrast is obtained (see Fig. 4 for representative data).
4. Determine lane purity percentage for the band of interest by measuring band densitometry for all bands of interest in a given lane (Fig. 5).
   (a) Import Coomassie-stained gel image.
   (b) Auto-detect protein bands/lanes and select/deselect/adjust as needed.
   (c) Add molecular weight standards to provide a reference for band molecular weight section.
   (d) Under Image Info above the gel image, record the identity of each lane in the notes section.


Fig. 3 Analysis of eluted samples by absorbance at 280 nm. Calculation of estimated total protein yields in eluted samples using absorbance at 280 nm

Table 1 Preparation of SDS-PAGE gel loading samples for analysis of expressed proteins on a 20-well NuPage Midi Bis-Tris 4–12% gel

   (e) Select the Analysis Table to determine target band percentage and apparent molecular weight for band(s) of interest.
   (f) Record percentage purity and molecular weight of target band in each corresponding elution.
5. Determine yield range of each protein as follows:
   (a) Sample concentration for each eluted sample is calculated according to Beer's Law as follows: concentration (in mg/mL) = absorbance at 280 nm/extinction coefficient.
   (b) Total protein yield for each eluted sample is calculated as follows: total protein yield (in mg/L) = (sample concentration from step 5a) × (elution volume (in mL))/culture volume (in L).
   (c) The final protein yield is reported by multiplying the target band purity in step 4f by the total protein yield calculated in step 5b. Estimated yields for each construct are recorded in ranges: 10 mg/L of culture. Adjust the values/units as necessary to report in desired format (mg/L, etc.).
6. Create report (Fig. 5).

Fig. 4 Examples of proteins analyzed on SDS-PAGE gels, representing the three expression systems. E. coli membrane bound protein: analysis of total (T), soluble (S) samples and eluted sample (E). Mammalian transient system: secreted protein eluted samples under reducing (R) and non-reducing (NR) conditions in two cell lines (HEK293, CHO). BEVS: secreted protein eluted samples under reducing (R) and non-reducing conditions (NR) using Sf9 and T. ni PRO cells

3.2 Purification and Analysis of Extracellular Proteins

3.2.1 Sample Preparation for Extracellular Protein Purification

1. Neutralize protein-containing medium in preparation for loading onto resin.
   (a) For Mammalian transient samples, add Mammalian Neutralization Buffer to 100 mM final concentration.
   (b) For BEVS samples:
       • For all proteins, add BEVS Neutralization Buffer to 50 mM final concentration.
       • For samples containing poly-Histidine-tagged proteins, add 1 M NiCl2 and 1 M CaCl2 to a final concentration of 0.8 mM and 5 mM, respectively.
2. Incubate in 27 °C shaking incubator with a 1-inch throw at 300 rpm for 20 min (see Note 10).
3. Centrifuge conditioned media at 3000 × g for 10 min (see Note 11).

3.2.2 Purification of Extracellular Proteins

Fig. 5 Analysis of eluted samples by SDS-PAGE. (a) Aliquots of eluted samples are run on SDS-PAGE according to Table 1. Coomassie-stained images are analyzed using Image Lab software to determine target band purity. A representative image of the band detection and lane analysis steps are shown. (b) Representative final report generated to capture construct details, key characteristics of each protein and final yield ranges to inform scale-up decisions. Additional bands are shown on SDS-PAGE gels to represent common occurrences of co-purifying (background) proteins

1. Prepare or obtain resin-filled tips (Fig. 1c) that are compatible with the Hamilton STAR. Manually create a tip source rack (Fig. 2b) to match purification resin with the affinity tag on desired protein.
2. Turn on 4 °C water cooler to cool the deck that will hold the 14 × 96-well plates and the blotting paper/plate.
3. To the Elution Buffer plate (ELN start), pipette 100 μL of appropriate Elution Buffer to match purification resin tip for each sample, and place onto deck in the specified deck location.
4. Prepare Hamilton STAR deck according to Fig. 2b:
   (a) 8 × 96-well deep-well Sample plates (S1-S8): empty.
   (b) Acid Buffer plate (Acid): empty.
   (c) Equilibration Buffer plate (EQ): empty.
   (d) Wash Buffer 1 plate (Wash1): empty.
   (e) Wash Buffer 2 plate (Wash2): empty.

64

Edward Kraft et al.

(f) Elution Sample Plates (ELN start and finish): 100 μL/ well in start plate. (g) Blotting Paper in plate (blotting station) (see Note 12). 5. Pour buffers into troughs and place on deck ensuring that troughs contain 20% excess buffer volume to account for tip dead space. 6. Fill carriers with 1 mL CO-RE Liquid Sensing Hamilton Tips to allow for monitoring of aspiration and dispensing. 7. Load racks with uncapped 50 mL tubes containing protein samples onto the deck. 8. Transfer 1 mL each of Acid Buffer, Equilibration Buffer, Wash Buffer 1, and Wash Buffer 2 into the designated 96-well plates. 9. Prepare 8 Sample Plates (S1-S8) by transferring 1 mL culture from 50 mL tubes to each 96-well plate (see Note 13). 10. Wash all resin tips with 1 mL of Acid Buffer by pipetting up and down one time at a flow rate of 8 μL/s. Include a 30 s pause step after aspirating and after dispensing (see Note 7). 11. Equilibrate all resin tips with 1 mL Equilibration Buffer by pipetting up and down 3 times at a flow rate of 8 μL/s, including a 30 s pause step after aspirating and after dispensing. 12. Load protein on to resin-containing tips by pipetting up and down in each Sample Plate 6 times at a flow rate of 4 μL/s, including a 30 s pause step after aspirating and after dispensing. 13. Wash protein bound resin tips by pipetting up and down 2 times with 1 mL Wash Buffer 1 from the Wash Buffer 1 plate at a flow rate of 8 μL/s, including a 30 s pause step after aspirating and after dispensing. 14. Wash protein bound resin tips by pipetting up and down 2 times with 1 mL of Wash Buffer 2 from the Wash Buffer 2 plate at a flow rate of 8 μL/s, including a 30 s pause step after aspirating and after dispensing. 15. Blot protein bound resin tips onto the blotting paper/plate. 16. Elute protein by pipetting up 100 μL of Elution Buffer from the Elution Buffer Plate and transferring to the Elution Finish Plate. 17. 
Complete elution by pipetting up and down 100 μL 3 times at a flow rate of 8 μL/s, including a 30 s pause step after aspirating and after dispensing. 18. Touch the tips to the side of the plate after dispensing to ensure any droplets at the base of the tips are recovered. The approximate time required for this robotic protocol is 3.5 h per 96 well plate.
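Steps 5 and 8 of this subheading imply a simple volume budget for each buffer trough. The following is a minimal sketch of that arithmetic, assuming that "20% excess" means 1.2 times the volume actually dispensed and that one full 96-well plate is filled per buffer. The function name and its parameters are hypothetical and are not part of the Hamilton software.

```python
def trough_volume_ml(wells: int, volume_per_well_ml: float,
                     excess_fraction: float = 0.20) -> float:
    """Buffer volume to pour into a trough, including excess for dead space.

    Assumption: the excess is applied as a simple multiplier on the
    volume that will actually be dispensed.
    """
    if wells <= 0 or volume_per_well_ml <= 0:
        raise ValueError("wells and volume_per_well_ml must be positive")
    if excess_fraction < 0:
        raise ValueError("excess_fraction cannot be negative")
    dispensed_ml = wells * volume_per_well_ml
    return dispensed_ml * (1.0 + excess_fraction)


# One 96-well plate filled with 1 mL per well (step 8), 20% excess (step 5).
for buffer_name in ("Acid", "Equilibration", "Wash 1", "Wash 2"):
    print(f"{buffer_name}: {trough_volume_ml(96, 1.0):.1f} mL")
```

Under these assumptions each of the four buffers needs roughly 115 mL in its trough; the excess fraction can be changed if a different dead volume is expected.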

3.2.3 Analysis of Extracellular Proteins

1. Determine total protein concentration in eluted samples by measuring absorbance at 280 nm on a Dropsense 96 of 2 μL of each eluted sample (see Fig. 3).
2. Run eluted samples on a 20-well NuPage Midi Bis-Tris 4–12% gel in 1× MES Buffer (see Fig. 4b).
   (a) Prepare reduced and non-reduced gel loading samples (see Table 1).
   (b) Heat samples 5 min at 95 °C.
   (c) Run gels at 180 V for 60 min.
3. Stain gel with Coomassie-based dye and destain until desired contrast is obtained (see Fig. 4 for representative data).
4. Determine lane purity percentage for the band of interest by measuring band densitometry for all bands of interest in a given lane (Fig. 5).
   (a) Import Coomassie-stained gel image.
   (b) Auto-detect protein bands/lanes and select/deselect/adjust as needed.
   (c) Add molecular weight standards to provide a reference for band molecular weight section.
   (d) Under Image Info above the gel image, record the identity of each lane in the notes section.
   (e) Select the Analysis Table to determine target band percentage and apparent molecular weight for band(s) of interest.
   (f) Record percentage purity and molecular weight of target band in each corresponding elution.
5. Determine yield range of each protein as follows:
   (a) Sample concentration for each eluted sample is calculated according to Beer's Law as follows: concentration (in mg/mL) = absorbance at 280 nm/extinction coefficient.
   (b) Total protein yield for each eluted sample is calculated as follows: total protein yield (in mg/L) = (sample concentration from step 5a) × (elution volume (in mL))/culture volume (in L).
   (c) The final protein yield is reported by multiplying the target band purity in step 4f by the total protein yield calculated in step 5b. Estimated yields for each construct are recorded in ranges: 10 mg/L of culture. Adjust the values/units as necessary to report in desired format (mg/L, etc.).
6. Create report (Fig. 5).
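The yield calculation in steps 5a to 5c can be sketched as three small functions. This is an illustrative sketch only: the function names are hypothetical, the extinction coefficient is assumed to be expressed per mg/mL for the path length of the reader, and the input values (A280, extinction coefficient, elution and culture volumes, purity) are invented for the example rather than taken from the chapter.

```python
def concentration_mg_per_ml(a280: float, ext_coeff: float) -> float:
    """Step 5a: absorbance at 280 nm divided by the extinction coefficient."""
    if ext_coeff <= 0:
        raise ValueError("extinction coefficient must be positive")
    return a280 / ext_coeff


def total_yield_mg_per_l(conc_mg_per_ml: float, elution_volume_ml: float,
                         culture_volume_l: float) -> float:
    """Step 5b: (concentration x elution volume) / culture volume."""
    if culture_volume_l <= 0:
        raise ValueError("culture volume must be positive")
    return conc_mg_per_ml * elution_volume_ml / culture_volume_l


def target_yield_mg_per_l(total_mg_per_l: float, purity_percent: float) -> float:
    """Step 5c: total yield scaled by the target band purity from step 4f."""
    if not 0 <= purity_percent <= 100:
        raise ValueError("purity must be between 0 and 100 percent")
    return total_mg_per_l * purity_percent / 100.0


# Hypothetical inputs: A280 = 0.75, extinction coefficient = 1.5 per mg/mL,
# 100 uL elution (0.1 mL), 8 mL of culture (0.008 L), 80% target band purity.
conc = concentration_mg_per_ml(a280=0.75, ext_coeff=1.5)
total = total_yield_mg_per_l(conc, elution_volume_ml=0.1, culture_volume_l=0.008)
final = target_yield_mg_per_l(total, purity_percent=80.0)
print(f"{conc:.2f} mg/mL, total {total:.2f} mg/L, target {final:.2f} mg/L")
```

Keeping the three steps separate mirrors the protocol, so the intermediate total yield can be reported alongside the purity-corrected value.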

4 Notes

1. When lysing insect or mammalian cells, use 5 units of Benzonase per 1 mL of Lysis Buffer.
2. To allow for simultaneous processing of poly-histidine-, FLAG-, and IgG-tagged samples, use tips that are identical in height for all resins.
3. Starting cell pellets may be freshly harvested or previously frozen. E. coli samples are always frozen and thawed at least one time.
4. For E. coli samples, collect 20 μL of sample following sonication for use as the total protein sample for SDS-PAGE analysis.
5. This step allows for solubilization of integral membrane proteins. It is not a required step for cytoplasmic proteins, but it must be included to allow concurrent purification of membrane and cytoplasmic proteins in Subheading 3.1.2.
6. For E. coli samples, collect 20 μL of filtrate to be used as the soluble protein sample for SDS-PAGE analysis.
7. The 30 s pause steps are necessary to allow for complete aspiration and dispensing of volume within resin-filled tips. The time may be adjusted depending on the properties of samples or buffers.
8. The addition of 0.015% FC14 to both the Wash and Elution buffers allows for simultaneous purification of cytoplasmic and integral membrane proteins.
9. Heating will cause many membrane proteins to aggregate in SDS-PAGE sample loading buffer. Samples containing membrane proteins should be analyzed unheated.
10. The addition of NiCl2 and CaCl2 is required to precipitate and remove specific components of the insect cell culture medium that will otherwise strip the Ni-affinity resin during purification.
11. Following centrifugation, there will be a pellet containing precipitate.
12. The blotting station is used to blot purification resin tips before elution to remove excess wash buffer.
13. Sequential binding of small sample volumes increases protein recovery.
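The NiCl2 and CaCl2 additions discussed in Note 10 are specified as final concentrations (0.8 mM and 5 mM) reached from 1 M stocks. A minimal sketch of the corresponding dilution arithmetic follows. The 50 mL sample volume is a hypothetical example, the helper name is illustrative, and each additive is treated independently (the small volume contributed by the other stock is ignored).

```python
def stock_volume_ul(sample_volume_ml: float, stock_conc_mm: float,
                    final_conc_mm: float) -> float:
    """Volume of stock (in uL) needed to reach a final concentration.

    Solves C_stock * V_stock = C_final * (V_sample + V_stock), so the
    volume added by the stock itself is taken into account.
    """
    if sample_volume_ml <= 0:
        raise ValueError("sample volume must be positive")
    if not 0 < final_conc_mm < stock_conc_mm:
        raise ValueError("final concentration must be below the stock concentration")
    stock_ml = final_conc_mm * sample_volume_ml / (stock_conc_mm - final_conc_mm)
    return stock_ml * 1000.0


sample_ml = 50.0  # hypothetical volume of conditioned medium
nicl2_ul = stock_volume_ul(sample_ml, stock_conc_mm=1000.0, final_conc_mm=0.8)
cacl2_ul = stock_volume_ul(sample_ml, stock_conc_mm=1000.0, final_conc_mm=5.0)
print(f"1 M NiCl2: {nicl2_ul:.1f} uL; 1 M CaCl2: {cacl2_ul:.1f} uL")
```

For these inputs the result is close to the simple C1V1 = C2V2 estimate (40 μL and 250 μL), because the stocks are far more concentrated than the targets.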


Chapter 4

High-Throughput Protein Production in Yeast

Francisco J. Fernández, Sara Gómez, and M. Cristina Vega

Abstract

Yeasts are versatile single-celled fungi that grow to high cell densities on inexpensive media. With well-studied genetics and metabolism and a wealth of knowledge available about their propagation and growth in academic as well as industrial settings, yeasts have long been used for recombinant protein production of isolated proteins and multisubunit complexes. They can be easily adapted to high-throughput protein expression pipelines. Importantly, the outcome from small-scale expression evaluations in high-throughput mode is scalable to laboratory and industrial scales using well-established procedures. In this chapter, we offer a state-of-the-art perspective on currently available high-throughput pipelines for protein production in S. cerevisiae and P. pastoris and discuss future challenges and avenues for improvement.

Key words High throughput (HTP), HTP protein production (HTPP), Yeast, Saccharomyces cerevisiae, Pichia pastoris, Recombinant expression

Renaud Vincentelli (ed.), High-Throughput Protein Production and Purification: Methods and Protocols, Methods in Molecular Biology, vol. 2025, https://doi.org/10.1007/978-1-4939-9624-7_4, © Springer Science+Business Media, LLC, part of Springer Nature 2019

1 Introduction

The development of high-throughput (HTP) screening technologies in biology has extensively used baker's yeast S. cerevisiae as an advanced model system. S. cerevisiae is not only an essential model of eukaryotic genetics and cell biology [1], metabolism [2], proteomics [3], systems biology [4, 5] and evolution [6], and an honorary mammalian model system [7]; it is also an important platform for the production of small molecules, proteins, and biopharmaceuticals [8]. HTP methods for the analysis of protein-protein interactions (two-hybrid screening) [9–11], protein-DNA interactions (protein arrays) [12], and genomics and transcriptomics analyses in eukaryotes were all set up first for yeast. HTP protein production (HTPP) methods originally optimized for baker's yeast have been progressively adapted to other yeasts like P. pastoris [13].

Both S. cerevisiae and P. pastoris are firmly established microbial cell factories for the production of high-value proteins and biologics by academic and industry groups [14]. Their well-known genetics and physiology, combined with fast growth on inexpensive media, make these two yeasts ideal model systems for protein production. The capacity of yeasts to perform complex posttranslational modifications and the presence of eukaryotic folding pathways distinguish them from bacteria [15]. Other yeast species, including Kluyveromyces lactis and Hansenula polymorpha, have also been tested as potential expression hosts [16], although their use is less widespread [17, 18]. The expansion of the repertoire of yeast species that can be assayed in HTPP platforms to include industrial strains has facilitated the transition from screening to production applications. Prominent application areas include systems biology, production of biopharmaceuticals and fine chemicals, and structural biology. Given the versatility of yeast-based protein expression platforms, the development of novel HTP methods for heterologous protein production in yeast continues to catalyze discovery, exploration, and implementation of new production pipelines (Table 1).

Table 1 Rationale for the use of S. cerevisiae and P. pastoris in HTPP applications

                                            E. coli      S. cerevisiae   P. pastoris
Established expression system               Yes          Yes             Yes
Structures from host-expressed samples (a)  37,681       232             511
Fully sequenced genome                      Yes          Yes             Yes
High-quality functional annotation          Yes          Yes             Yes
Industrial scalability                      Yes          Yes             Yes
Cultivated in 96/384-well plates            Yes          Yes             Yes
Optimum temperature (range) (°C)            37 (20–50)   32 (4–45)       30 (10–42)
Max. specific growth rate, μ (h⁻¹) (b)      0.45–1.54    0.28–0.51       0.15–0.29
Doubling time (min) (b)                     20–30        90–140          90–180
Cost of medium, per liter (€) (c)           0.8–3.1      4.8–63.5        4.5–35.0
OD600 at harvest (d)                        2–8          4–10            8–16
Dry-cell weight at harvest (g)              0.8–3.2      2.4–6.0         4.8–9.6
Max. protein productivity (mg/l) (e)        82           75              400
Efficiency of secretory pathway             Low          Moderate        High

(a) Unique PDB entries (95% sequence identity cutoff)
(b) Variable depending on growth phase and conditions
(c) From prices obtained from commercial suppliers
(d) For cells grown in microtiter plates and shake flasks
(e) From literature. Typical yields are lower
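The growth-rate and doubling-time rows of Table 1 are linked, for exponentially growing cultures, by the standard relation doubling time = ln 2/μ. The sketch below applies that relation to the tabulated growth-rate ranges; the function name is illustrative. Because the table footnotes both rows as variable with growth phase and conditions, the computed values are an orientation only and need not coincide with the tabulated doubling times.

```python
import math


def doubling_time_min(mu_per_h: float) -> float:
    """Doubling time in minutes for a specific growth rate mu (per hour)."""
    if mu_per_h <= 0:
        raise ValueError("specific growth rate must be positive")
    return math.log(2) / mu_per_h * 60.0


# Specific growth rate ranges (per hour) as listed in Table 1.
growth_rates = {
    "E. coli": (0.45, 1.54),
    "S. cerevisiae": (0.28, 0.51),
    "P. pastoris": (0.15, 0.29),
}

for host, (mu_low, mu_high) in growth_rates.items():
    # The fastest growth rate gives the shortest doubling time.
    shortest = doubling_time_min(mu_high)
    longest = doubling_time_min(mu_low)
    print(f"{host}: {shortest:.0f} to {longest:.0f} min")
```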

1.1 Rationale for Yeast-Based HTPP

A precise understanding of protein structure and function and the successful development of protein-dependent biotechnological processes require the production of sufficient quantities of the desired proteins for analysis. The motivation to implement efficient parallel schemes for protein production arose during the late 1990s with the accelerating accumulation of protein-coding gene sequences. Finding appropriate expression hosts, suitable gene constructs, culture conditions, and induction regimes to satisfy the growing demand for high-quality protein samples proved too complex to tackle with conventional laboratory protocols. Proteins display diverse folds and combinations of folds, and they have specific requirements in terms of metals, cofactors, lipid environments, co-folding chaperones, and posttranslational modifications (PTMs), among many other factors. The variable space that must be sampled to find suitable expression conditions for most prokaryotic and eukaryotic proteins is, not surprisingly, huge.

By the end of the last century, structural genomics communities had been set up to meet this challenge. Through a global, concerted effort, HTPP methods were developed, extensively tested, compared with one another, and continuously optimized until they slowly converged on a minimum set of standard operating procedures (SOPs). This collection of SOPs works predictably, generating robust protein expression data and boasting reasonable success rates. At the centre stage of these efforts was E. coli, although other microbes have also been tested, albeit less successfully.

Many biological questions can be interrogated with HTPP SOPs. One of the most common experimental goals is to find all protein products that can be expressed in soluble form from all the protein-encoding genes in one genome (or a few related genomes). This was the central concern of the first structural genomics consortia, which focused on thermophilic microbes such as T. maritima [19] or deadly human pathogens such as M. tuberculosis [20]. A second common goal is to evaluate the recombinant expression of proteins that share common functional traits, for example, protein kinases [21] and venom proteins [22, 23]. As the HTPP methods are particularly powerful for discovery and exploration, they serve a sieving purpose, finding those proteins that fulfill some desired properties and that can be produced in sufficient amounts. Accordingly, the HTPP SOPs aim to dramatically expand the number of independent expression experiments that can be carried out per operator per unit time to accelerate discovery.

Increasing the throughput comes at the expense of reducing the culture volume per protein tested. This reduction in culture volume, in turn, poses its own particular challenges. Can a meaningful assessment of the production of a soluble protein (or a protein complex) be made from experiments involving small volumes (typically, 0.5–2.5 ml)? The miniaturization and automation of expression evaluation experiments therefore require the development of sensitive detection methods capable of reliably estimating functional protein yields at a small scale. Often the goal is to detect the presence of soluble protein in a cell lysate, although the goal could also be to detect properly folded membrane proteins in their lipid environment, or functional enzymes and fluorescent reporter proteins. Even taking into account the concentrating effect of pelleting the cell culture to resuspend it in a smaller volume of a suitable lysis buffer (a 5–10-fold concentration effect), only well-expressed proteins will be detected by Coomassie Brilliant Blue (CBB) staining of SDS-PAGE gels. It has been estimated for E. coli that a protein yield of about 2 mg/l of culture represents the lower limit for detection by SDS-PAGE, although yields as low as 0.5 mg/l would provide sufficient material for most analytic and structural purposes.

For yeasts, which typically have smaller volumetric yields than bacteria, it is not practical to rely exclusively on SDS-PAGE detection. Switching to immunological detection (Western blotting) is possible, and it has been implemented; however, it also increases costs and makes handling more complicated. An alternative method consists of adding a purification capture step after cell lysis, which is a more cost-efficient means of enhancing the detection sensitivity of HTPP schemes and constitutes today's procedure of choice. In most cases, the capture step is implemented using nickel affinity chromatography of poly-histidine-tagged proteins, although other combinations of purification tags and resins are possible. For yeasts, it is not sufficient to monitor the overall yield of tagged protein products (as can be assessed by a dot blot) since proteolytic degradation is more commonly observed than in bacteria. Yeasts synthesize many cytosolic, vacuolar, and extracellular proteases that can come into contact with the expressed recombinant protein at various stages, primarily upon cell lysis. Detecting and avoiding unwanted proteolytic degradation of the expressed product becomes an integral part of an HTPP pipeline.

Numerous parameters have an impact on protein expression, solubility, and total yield. Perhaps the parameters that are easiest to manipulate are the culture conditions, which have a demonstrated effect on expression levels. The composition of the culture medium offers an adjustable multidimensional chemical space worth exploring: whether the medium is rich or minimal, its pH, and its additives may all play a role. Other factors known to affect protein solubility levels (without changing the construct) include the plasmid copy number, the promoter, cultivation conditions (dissolved oxygen, shaking speed), the temperature, the yeast strain or species, the time, and the type and concentration of the inducer. The systematic exploration of a large number of existing parameters is impractical; hence, fixing specific parameters and varying others is a practical necessity. In addition, design of experiment strategies (fractional factorial screenings) must be adopted for efficiency reasons. A good compromise, implemented in E. coli by several groups, is a fractional factorial approach whereby a small number of cell strains (four possible strains), culture media (three media formulations), and induction temperatures (three growth temperatures) are tested. Similar combinations of strains, promoters, culture conditions, and temperature have been assayed for yeast, and an outline of the significant findings appears below.

1.2 Methylotroph or Not

Yeast expression factories are classified according to their metabolic capacity to feed solely on methanol. This is a useful physiologic distinction that roughly corresponds to two differentiated sets of expression protocols [16].

Non-methylotrophs are those yeasts that lack the genetic and biochemical capacity to metabolize methanol, including the well-characterized yeasts S. cerevisiae and K. lactis. Typical expression protocols for non-methylotrophic yeasts involve the use of episomal plasmids, and heterologous expression is driven by intermediary metabolic promoters like the galactose-inducible GAL1 promoter [24, 25] and the strong constitutive ADH1 (gene encoding alcohol dehydrogenase) [26], TEF1 (gene encoding transcriptional elongation factor EF1α) [27], and GAPDH (gene encoding glyceraldehyde 3-phosphate dehydrogenase) promoters. Preferred applications include the expression of metabolic enzymes, protein kinases, and gene regulatory proteins as intracellular products, and antibodies and antibody fragments as secretion products. For energy, S. cerevisiae depends on both the respiratory chain and glycolysis.

Methylotrophs, in contrast, are yeasts that have evolved to utilize methanol as a source of carbon and energy and that are mainly aerobic. The best-known methylotrophic yeasts are P. pastoris and H. polymorpha (Ogataea polymorpha). P. pastoris has been a prevalent yeast for recombinant protein expression [28] because of its methanol-inducible, strong, and tightly regulated alcohol oxidase 1 (AOX1) promoter, which drives the expression of the central enzyme for methanol utilization [29]. A complete toolbox of promoters and terminators in the methanol utilization pathway has been recently characterized to expand the usability of the system. Expression cassettes are brought into methylotrophs by genome integration (although autonomous replicative plasmids also exist). The prototypical protein targets whose expression is attempted in methylotrophs are naturally secreted proteins like hormones, growth factors, degradative enzymes (e.g., carbohydrate-active enzymes), and antibodies [30–35].

When methylotrophic and non-methylotrophic yeasts are compared side by side, several recurrent themes emerge. Methylotrophs tend to produce higher protein yields per volume, thanks to being able to grow to extremely high cell densities, and are more efficient in secreting protein to the extracellular medium; non-methylotrophic yeasts are easier to manipulate genetically and more amenable to high-throughput applications because of the availability of relatively stable plasmids. In fact, the two most commonly used yeasts for protein production, S. cerevisiae and P. pastoris, represent two extreme models along the yeast metabolic spectrum. S. cerevisiae can thrive under low oxygen concentration, therefore allowing less controlled environments for protein production; P. pastoris, on the other hand, when feeding on methanol requires large amounts of dissolved oxygen to achieve optimum growth and protein production levels. HTPP platforms that plan to exploit methylotrophic yeasts must, therefore, be designed to avoid the health and fire hazards of using methanol while supplying sufficient aeration.

The availability of a vast research literature and a wealth of specific mutants and mutant libraries for S. cerevisiae cannot be overstated in any fair comparison between the various yeast-based expression platforms [17, 18]. Despite baker's yeast's starting advantage, the excellent protein production and secretion properties of P. pastoris have spurred intense efforts worldwide to develop and mature high-throughput platforms capable of exploiting the specific genetic and metabolic idiosyncrasies of yeasts.

1.3 General Considerations

Setting up an HTPP laboratory with yeasts as expression workhorses is similar to doing it for E. coli expression. Microbiological techniques, culture volumes, and culture handling are virtually identical for bacteria and yeast. From the standpoint of managing yeast HTPP platforms, there are two main differences between bacteria and yeasts.

Firstly, a prerequisite for performing parallel expression evaluations is the efficient and reliable transformation and maintenance of gene cassettes harboring expression constructs in yeast cells. This is easily accomplished with S. cerevisiae by chemical transformation and either auxotrophic or antibiotic-based selection. In fact, in vivo homologous recombination in S. cerevisiae is a readily automatable procedure for creating expression constructs that reduces the labor and time dedicated to molecular cloning. However, the requirement for genome integration and exhaustive selection in P. pastoris imposes a narrower bottleneck for the development of HTPP strategies for this yeast. Owing to its lower intrinsic efficiency and the absence of efficient homologous recombination processes, transformation in P. pastoris usually involves electroporation and several rounds of auxotrophic or antibiotic-based selection. These deficiencies are partially compensated for by the simultaneous genome integration of multiple copies of the expression cassettes, a statistically rare event; isolating such "jackpot" clones, which are often associated with higher expression yields, may involve several rounds of selection with increasing concentrations of antibiotic. Recently, several new autonomously replicating sequences have been found to sustain stable plasmid propagation in P. pastoris [36, 37], thus opening the possibility of more robust, higher-throughput HTPP protocols for this yeast.

Secondly, yeast inducible promoters tend to be more dependent on culture conditions than the phage T7 promoter widely in use for bacterial expression, thereby leading to less variation in media formulations. In contrast to the diversity of media formulations devised for bacterial expression (e.g., bare LB, low salt LB, rich broths like 2TY, minimal saline, and autoinduction media), media formulations for S. cerevisiae or P. pastoris are less diverse, and they are strongly linked to the type of promoter used. Several attempts have been made to optimize media formulations for yeasts that can be adopted as the default medium by HTPP laboratories.

Based on the strengths and weaknesses of the S. cerevisiae and P. pastoris expression systems, it has been suggested that S. cerevisiae could be used for screening and P. pastoris for scaling up. In medium-throughput or at laboratory scale, the higher protein yields that can be attained with P. pastoris argue for its use for both screening and scaling up, while S. cerevisiae would play a role in salvaging targets that perform poorly in P. pastoris and in troubleshooting. Either strategy has its merits, and the choice between them is influenced by other factors, e.g., desired throughput level, available instrumentation, and level of familiarity, skill, and knowledge with either yeast. The popularity of P. pastoris as an expression host and its success in membrane protein production have undoubtedly stimulated the development of parallelization protocols for the methylotrophic yeast.

1.4 Applications

1.4.1 Secreted Proteins and Antibodies

There are two areas in which yeasts excel in comparison with E. coli and most other eukaryotic expression systems: the production of secreted proteins (including antibodies and antibody fragments) and membrane proteins. Some lessons can be learned from current experience with the HTPP of secreted and membrane protein targets. The lack of an efficient secretory system in E. coli has facilitated the rise of yeasts as preferred hosts for secretory expression. In contrast to bacteria, yeasts possess a very efficient and productive secretory pathway. Both S. cerevisiae and P. pastoris have been extensively used to secrete recombinant proteins to the supernatant both in academic and industrial settings. Secreting recombinant proteins to the supernatant has many advantages over intracellular expression: the transit through the ER allows glycosylation to occur, it exposes the protein to an oxidative environment and chaperones responsible for the correct folding of disulphide bridges, and, once secreted, recombinant proteins can be easily captured in pure form. This makes yeasts desirable platforms for the expression of economically important antibodies. Although optimized mammalian cell culture is best suited for antibody production, the mammalian cell lines in use today have been extensively fine-tuned for antibody production. A similar optimization has not yet been carried out for yeast, with a few exceptions. For example, in S. cerevisiae co-expression of the target protein with the mammalian chaperone BiP, the co-chaperone GRP170, and the peptidyl-prolyl isomerase FKBP2 has improved antibody yields in yeast up to twofold [38]. Given its many favorable growth characteristics, yeast strain/genome optimization may help lift present bottlenecks and raise expression levels significantly. Adapting HTPP methods for the detection of secreted protein products is straightforward. If the target protein is only detectable by SDS-PAGE, a capture purification step is easily performed after removing the cell pellet by centrifugation. For fluorescence or enzyme-based assays (e.g., ELISA), it may be sufficient to sample the culture medium before clarification. The latter approach can significantly benefit from HTP ELISA protocols conducted by automated liquid handling platforms. Microfluidics screening of yeast random mutant libraries generated by UV irradiation has been used to select for improved α-amylase secreting yeast strains [39]. In this approach, individual mutant yeast strains expressing α-amylase are mixed with a fluorogenic substrate, encapsulated in tiny droplets of culture medium surrounded by an immiscible oil layer, incubated and sent through a fluorescence-activated cell sorter. Since the fluorescence measured in a droplet correlates with the amount of secreted α-amylase through the fluorogenic substrate turned over, cell sorting can effectively enrich the cell population in hypersecreting strains. Several rounds of selection and enrichment may be necessary to isolate superior strains. By combining this approach with whole-genome sequencing, many mutations in genes that belong to the secretory pathway of S. cerevisiae were discovered and are now available for targeted exploration with alternative target proteins [39].

1.4.2 Membrane Proteins

A second area where yeast has excelled is in the production of prokaryotic and eukaryotic membrane proteins [14]. Most structures of membrane proteins deposited in the PDB have been produced in a yeast expression host [14]. As pharmacologically important targets, membrane proteins were for many years frustratingly hard to produce in active form. Thanks to intensive research that has revealed the structural and physicochemical factors that govern membrane protein folding and membrane insertion, more advanced expression systems have been devised to facilitate membrane protein expression. Complementary methods aimed at increasing the stability of membrane proteins by mutation or using binding chaperones and antibodies have also played a role in advancing the field. When screening for membrane protein expression, it is useful to test the effect of small molecule chaperones like histidine and dimethyl sulfoxide (DMSO). For specific proteins, the addition of a soluble form of cholesterol may be essential to allow the target protein to correctly insert into the yeast membrane since the naturally occurring sterol in yeasts is ergosterol. HTPP screening of membrane protein expression in yeasts takes into account the foregoing considerations by including 0.4 mg/l histidine and/or 2.5% DMSO as small molecule additives during expression tests. Screening is typically done with C-terminally yEGFP-tagged constructs, whose expression can be monitored by noninvasive fluorescence detection methods [40]. Despite its high sensitivity, fluorescence-based detection cannot, in general, distinguish properly folded protein from misfolded protein as long as the fluorescent reporter is well folded. It is, however, important to keep the reporter, if compatible with the target protein’s fold and function, because during scaling up and purification the application of FSEC to the membrane protein fusion allows screening for the optimum detergent. During the screening, the use of expensive detergents should be minimized if possible.

1.5 UltraHTP Approaches

In general, the speed at which new constructs can be generated outpaces the capacity of expression evaluation platforms, clearly indicating that there is still plenty of room to increase the throughput of protein production initiatives. Fast construct generation through automated or semiautomated library construction methods calls for more efficient methods to run expression evaluation experiments and to reliably detect and rank their outcome. Microfluidics approaches can, in principle, provide the throughput necessary to match current library construction capacity. In particular, RNA-aptamers-in-droplets (RAPID) [41] is a novel experimental approach that uses aptamers to transduce extracellular product titer into fluorescence, allowing the ultrahigh-throughput screening of millions of individual experiments. RAPID was initially applied to solve synthetic biology and metabolic engineering problems, but even from the first application, the detection of secreted protein products was a core capacity. Screening vast libraries of genetically engineered or plasmid-bearing microbes for secreted products (including proteins) can not only revolutionize modern approaches but may also switch the focus from merely a more efficient screening process to a discovery tool implementing random modification of expression constructs. Directed evolution techniques, for example, could easily fit in the RAPID workflow. Given the well-characterized genomes of yeasts and the ease with which they can be manipulated, methodological advances such as RAPID and other microfluidics-based techniques are poised to create a wealth of data on the expression and stability of proteins (and of other macromolecules and biologically active molecules).

1.6 Outlook and Future Challenges

Yeasts offer similar advantages to E. coli when compared with other eukaryotic hosts, even though the tools currently available for HTPP are not as developed and mature. In particular, co-expression in yeast remains tedious despite promising developments and has not yet been fully assessed on an HTPP pipeline. More efficient lysis methods are also necessary to facilitate the analysis of intracellular targets; in contrast, evaluation of secreted targets is as simple as (or simpler than) in bacteria. Over the last 10 years, yeasts have shifted from being considered generalist expression hosts to becoming de facto more specialized hosts, with a proven track record for many types of proteins, from hormones and growth factors to enzymes and membrane proteins. The high cell densities that can be achieved render them especially attractive for upscaling industrial processes. The systematic application of yeast-based HTPP and multi-omics approaches for the better understanding of yeast cellular function should lead to systems biology-driven and mathematical model-guided strain engineering.

2 Materials

Most of the steps described in detail in this chapter were performed in an HTP manner, applying robotics and/or in 96-well formats. For most of the steps, equipment and consumables are similar and are, for simplicity, described only once in Subheading 2. Prepare all solutions using ultrapure water, analytical grade reagents and media components. Prepare and store all reagents at room temperature unless indicated otherwise. Diligently follow all waste disposal regulations.

2.1 High-Throughput Expression Evaluation in S. cerevisiae

2.1.1 Reagents

1. Rich YPD medium for yeast strain propagation and biomass growth (see Note 1). YPD contains 1% (w/v) yeast extract, 2% (w/v) peptone, 2% (w/v) glucose.
2. Synthetic complete minus uracil (SC-Ura) for selection and expression (see Note 2). SC-Ura contains 6.9 g yeast nitrogen base (YNB) without amino acids, 770 mg Complete Supplement Mixture (CSM without uracil) and 20 g glucose. Autoclave SC-Ura and store at 4 °C until use.
3. SC-Ura agar plates, made with 2% (w/v) agar.
4. SC-Ura medium supplemented with 20% (w/v) galactose, autoclaved.
5. Collection of yeast expression plasmids harboring the target gene constructs (see Note 3) prepared by, e.g., homologous recombination. Genes are placed under the control of the strong and inducible GAL1 promoter (inducible by galactose) and fused either to a C-terminal octahistidine (His8) tag or to a DNA sequence coding for a TEV cleavage site, the entire sequence for yeast enhanced green fluorescent protein (yEGFP), and a C-terminal His8 tag.
6. A suitable yeast expression strain prepared for chemical transformation (see Note 4). Two yeast strains used for HTPP applications are FGY217 [Mata, ura3-52, lys2Δ201, pep4Δ] [40, 42] and TM6* [43, 44] (see Note 5). With its exclusively respiratory metabolism, TM6* produces twice as much biomass as conventional respiro-fermentative strains [44] and is, therefore, an interesting strain for HTPP applications.
7. Lysis buffer: Detergent-based yeast protein extraction reagent, e.g., Y-PER (Invitrogen).

Equipment and Consumables

1. Single-channel and multi-channel pipettes (MCPs).
2. Shaker with adjustable temperature between 16 and 30 °C (see Note 6). If possible, use a shaker with humidity control set to 80% to prevent water evaporation during multi-day incubation.
3. Benchtop centrifuge equipped with a microplate rotor (e.g., 5810-R, Eppendorf).
4. Optical and fluorescence 96-well microtiter plate (MTP) reader.
5. PCR plates with adhesive seals.
6. 24-well and 96-well deep-well boxes (DWBs) with square, flat-bottom wells. 24-well DWBs range from 0.5 to 4 mL capacity/well and 96-well DWBs from 0.3 to 1 mL capacity/well.
7. Breathable adhesive film, e.g., Breathable Sealing Tape (Corning).
8. Aluminum, non-sterile adhesive sealing tape, e.g., Microplate sealing tape (Corning).
9. Flat-bottom 96-well optical and fluorescence microtiter plates.

2.2 High-Throughput Expression Evaluation with Pichia pastoris

Reagents

1. YPDS agar plates: YPD with 1 M sorbitol and 2% (w/v) agar, supplemented with the required antibiotics (e.g., Zeocin, G418).
2. Buffered minimal dextrose (BMD) medium is used for the growth phase. BMD medium contains 100 mM potassium phosphate, pH 6.0, 1.34% (w/v) YNB, 4 × 10⁻⁵% biotin and 1% (w/v) glucose.
3. BMD agar plates supplemented with appropriate antibiotics (e.g., Zeocin, G418) depending on the expression plasmids. Prepared with 2% (w/v) agar.

Fig. 1 Scheme of HTPP workflow applied to S. cerevisiae. When target proteins are fused to a C-terminal yEGFP fusion, the fluorescence readout at the time of harvest reports on the expressed protein. Follow-up experiments including SDS-PAGE analysis and FSEC are essential to establish the integrity and monodispersity of the expressed target

4. For induction, glucose in BMD is replaced by 0.5% (v/v) methanol to prepare BMM (buffered minimal methanol) medium (see Note 7). For convenience, induction media BMM2 and BMM10 are used, where BMM2 contains 1% (v/v) methanol, and BMM10 contains 10% (v/v) methanol. When using histidine auxotrophic strains like GS115, 0.004% histidine must be added to the media.
5. Collection of P. pastoris expression plasmids harboring the target gene constructs placed under the control of the strong and tightly controlled AOX1 promoter and, optionally, fused to an N-terminal secretion signal sequence. Target genes are C-terminally tagged with either His8 or TEV-yEGFP-His8 (see Note 8). The expression constructs should have been previously linearized with a restriction enzyme that cuts at the 5′ end of the promoter (Fig. 1).
6. A suitable P. pastoris strain prepared for electroporation. We use P. pastoris NRRL Y-11430 [45] as wild-type strain and SMD1168H (Invitrogen, Carlsbad, CA, USA) as protease-deficient strain (see Note 9).

2.3 HTP Analysis of Expressed Proteins

Reagents

1. IMAC-A: Lysis buffer supplemented with 20 mM imidazole.
2. IMAC-W: 50 mM Tris-HCl, 50 mM sodium phosphate, 0.5 M NaCl, 50 mM imidazole, pH 8.0.
3. IMAC-E: IMAC-W with 500 mM imidazole.
4. Reagents for protein electrophoresis, Coomassie Brilliant Blue (CBB), protein ladder. 4× Protein Gel Loading Buffer (PGLB): 40% (v/v) glycerol, 240 mM Tris-HCl, pH 6.8, 8% (w/v) SDS, 0.04% (w/v) bromophenol blue, 5% (v/v) β-mercaptoethanol.
5. Protease inhibitor cocktail, e.g., Complete EDTA-free (Roche).
6. Magnetic immobilized metal affinity chromatography (IMAC) beads (see Note 10).

Equipment and Consumables

1. Magnetic stand for 96-well plates.
2. Equipment for protein electrophoresis.
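
As a quick check on the 4× PGLB recipe listed under Reagents, the following minimal sketch computes the working (1×) concentrations reached when 30 μl of sample are mixed with 10 μl of 4× PGLB, as done in Subheading 3.3. The dictionary keys and the function name are hypothetical and chosen for illustration only; the concentrations are those given in the reagent list.

```python
# Illustrative sketch only: names are hypothetical; values come from the
# 4x PGLB recipe in the reagent list.
PGLB_4X = {
    "glycerol_pct_v_v": 40.0,
    "tris_hcl_mM": 240.0,
    "sds_pct_w_v": 8.0,
    "bromophenol_blue_pct_w_v": 0.04,
    "beta_mercaptoethanol_pct_v_v": 5.0,
}


def dilute(stock, sample_ul, buffer_ul):
    """Return component concentrations after mixing buffer with sample."""
    factor = buffer_ul / (sample_ul + buffer_ul)
    return {name: conc * factor for name, conc in stock.items()}


# 30 ul sample + 10 ul 4x PGLB gives a fourfold dilution of the buffer.
final = dilute(PGLB_4X, sample_ul=30.0, buffer_ul=10.0)
for name, conc in final.items():
    print(f"{name}: {conc:g}")
```

The same function can be reused for other sample-to-buffer ratios, for instance when smaller aliquots are available.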

3 Methods

Carry out all procedures at room temperature unless otherwise specified. Observe aseptic techniques when picking yeast colonies, inoculating yeast colonies into solid and liquid medium and adding medium/inducer to growing cultures.

3.1 High-Throughput Expression Evaluation in S. cerevisiae

1. Transform chemically competent S. cerevisiae FGY217 cells (50 μl) with 100 ng of the chosen DNA fragments and/or plasmids following standard procedures [40] (Fig. 2). Plate the transformation mixtures on SC-Ura agar plates, let plates dry for 10 min, and place into a plate incubator at 28 °C for 24–48 h until colonies grow to about 2–4 mm in diameter (Fig. 2).
2. Prepare DWBs to grow the starter cultures by filling up 24-well DWBs with 2.5 ml SC-Ura medium or 96-well DWBs with 0.5 ml SC-Ura (Fig. 2). Use as many DWBs as necessary for expression evaluation of all available constructs. Inoculate each well with a freshly grown yeast colony. It is advisable to include both negative and positive controls. Use a yeast strain transformed with the empty version of the same expression plasmid as negative control; if possible, include a positive control consisting of a yeast strain transformed with an already characterized expression plasmid that reliably produces high levels of a stable protein that can be easily detected. Seal the DWBs with oxygen-permeable adhesive films.
3. Place the inoculated DWBs into a temperature and humidity controlled shaker set at 28 °C, 80% humidity, and shaking speed at 350 rpm. Allow the cultures to grow overnight.

Fig. 2 Scheme of HTPP workflow applied to P. pastoris

4. Next morning, retrieve the DWBs from the shaker. Centrifuge the DWBs at 3000 × g for 5 min to collect all the yeast pellet at the bottom of the wells. Carefully remove the seal to avoid carryovers between individual wells. Resuspend the cell pellets. Add 278 μl (24-well DWBs) or 56 μl (96-well DWBs) of fresh SC-Ura medium containing 20% (w/v) galactose and mix gently but thoroughly. The final inducer concentration is 2% (w/v) galactose per well. Re-seal the DWBs and return them to the shaker, keeping the settings as before (see Note 6).
5. Twenty-four [24] h later, retrieve the DWBs from the shaker and collect the cell pellets containing the expressed protein by centrifugation at 6000 × g for 15 min.
6. If the protein target is secreted, transfer the supernatant to a clean DWB and add protease inhibitors (otherwise, discard the supernatant). Freeze the supernatants at −80 °C until ready to analyze them. When ready to perform the expression evaluation analysis, proceed to Subheading 3.3.
7. If the construct encodes the target protein as a C-terminal yEGFP fusion, remove a 100-μl aliquot into a 96-well plate for fluorescence readout at this stage (Fig. 2) (see Note 8). Other analytic methods can be used to detect non-fluorescent protein constructs (e.g., enzymatic assays). Otherwise, proceed to the next step.

8. If not processed immediately, protease inhibitors should be added to the resuspended cell pellets, and the DWB should be sealed with an aluminum seal and stored at −80 °C until use.
9. When ready to process the pellets, add 125–500 μl of lysis buffer to each well and resuspend the pellets thoroughly. Incubate for 30 min with shaking (350 rpm). Pellet the cell debris by centrifuging at maximum speed for 10 min at 4 °C. Transfer the lysate to a clean 96-well MTP.

3.2 High-Throughput Expression Evaluation with Pichia pastoris
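
The inducer volumes quoted for galactose in Subheading 3.1 (step 4) and the equal-volume BMM2 addition used for methanol in Subheading 3.2 (step 4) follow from the same dilution arithmetic. The sketch below reproduces those numbers; the function names are hypothetical, and it assumes that the culture contains no inducer before the addition and that volumes are additive.

```python
# Illustrative sketch only: function names are hypothetical. Assumes the
# culture holds no inducer before the addition and that volumes are additive.

def final_concentration(culture_ml, stock_ml, stock_pct):
    """Final inducer concentration (%) after adding stock to the culture."""
    return stock_pct * stock_ml / (culture_ml + stock_ml)


def stock_volume_for_target(culture_ml, stock_pct, target_pct):
    """Volume (ml) of stock that brings the mixture to target_pct."""
    if not 0 < target_pct < stock_pct:
        raise ValueError("target must be positive and below the stock concentration")
    return culture_ml * target_pct / (stock_pct - target_pct)


# Galactose induction of S. cerevisiae: 20% (w/v) stock, 2% (w/v) final.
for culture_ml in (2.5, 0.5):  # 24-well and 96-well DWB starter cultures
    ul = 1000 * stock_volume_for_target(culture_ml, stock_pct=20.0, target_pct=2.0)
    print(f"{culture_ml} ml culture: add {ul:.0f} ul of 20% galactose medium")

# Methanol induction of P. pastoris: equal volume of BMM2 (1% methanol).
print(final_concentration(culture_ml=1.25, stock_ml=1.25, stock_pct=1.0), "% methanol")
```

With the volumes given in the protocol, the first loop yields 278 μl and 56 μl, and the last line yields 0.5% (v/v) methanol.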

1. Transform linearized expression plasmids into P. pastoris SMD1168H electrocompetent cells following standard procedures (Invitrogen) (see Note 11) (Fig. 1). Plate the transformation mixture onto selective YPDS plates, let them dry and place them in an incubator at 28 °C. Wait until well-developed colonies form (2–4 mm in diameter).
2. Prepare DWBs to grow the starter cultures of P. pastoris strains by filling up 24-well DWBs with 1.25 ml BMD medium or 96-well DWBs with 0.5 ml BMD (Fig. 1) (see Notes 12 and 13). Use as many DWBs as necessary for expression evaluation of all available constructs. Inoculate each well with a P. pastoris colony freshly grown on BMD agar plates or with a glycerol stock of the same yeast strain. At least 2–4 colonies from the same plate are tested in parallel since large variations are sometimes observed, which are attributed to copy number (clonal) variations; these variations should be minimized with plasmid-driven expression. It is advisable to include both negative and positive controls. Use a P. pastoris strain transformed with the empty version of the same expression plasmid as negative control; if possible, include a positive control consisting of a P. pastoris strain transformed with an already characterized expression plasmid that reliably produces high levels of a stable protein that can be easily detected. Seal the DWBs with breathable adhesive films.
3. Place the inoculated DWBs into a temperature and humidity controlled shaker set at 28 °C, 80% humidity, and shaking speed at 350 rpm.
4. After 60 h, when glucose has been depleted, retrieve the DWBs from the shaker and add an equal volume of BMM2 to start induction at 0.5% (v/v) final methanol concentration. Re-seal the DWBs and return them to the shaker, keeping the settings as before (Fig. 1) (see Note 6).
5. During the next 2 days (at 12 h, 24 h and 48 h), add BMM10 to the cultures at one-tenth of the cultivation volume to replenish the exhausted methanol. This additional volume is 250 μl for 24-well DWBs and 50 μl for 96-well DWBs (Fig. 1).

6. Cultures are harvested 24 h later (at 72 h total induction phase) by centrifugation at 1750–2500 × g (Fig. 1).
7. If the protein target is secreted, transfer the supernatant to a clean DWB and add protease inhibitors (otherwise, discard it). Freeze the supernatants at −80 °C until ready to analyze them. When ready to perform the expression evaluation analysis, proceed to Subheading 3.3.
8. If the construct encodes the target protein as a C-terminal yEGFP fusion, remove a 100-μl aliquot into a 96-well plate for fluorescence readout at this stage (see Note 8). Other analytic methods can be used to detect non-fluorescent protein constructs (e.g., enzymatic assays). Otherwise, proceed to the next step.
9. If not processed immediately, protease inhibitors should be added to the resuspended cell pellets, and the DWB should be sealed with an aluminum seal and stored at −80 °C until use.
10. When ready to process the pellets, add 125–500 μl of lysis buffer to each well and resuspend the pellets thoroughly. Incubate for 30 min with shaking (350 rpm). Pellet the cell debris by centrifuging at maximum speed for 10 min at 4 °C. Transfer the lysate to a clean 96-well MTP.

3.3 HTP Analysis of Expressed Proteins

1. The analysis of the supernatants (for secreted proteins) or cell lysates (for intracellular products) can be expedited by using highly specific purification tags to capture His8-tagged protein products. Other affinity tags can be used as well, e.g., StrepII or FLAG tags (see Note 14). We use His8-tagged protein constructs and magnetic beads for all HTPP analyses.
2. Dispense 50 μl of a 50% slurry of IMAC magnetic beads into a clean 96-well plate. Bring the plate in contact with a permanent magnet for 1 min; this time is sufficient for the magnetic beads to deposit onto the wall in contact with the permanent magnet. Carefully remove the preservation solution with an MCP.
3. Rinse and equilibrate the magnetic beads with 500 μl double distilled water and twice 500 μl IMAC-A buffer.
4. Mix up to 500 μl supernatant or cell lysate with the magnetic beads. Incubate for 15 min with shaking (350 rpm) at 4 °C. Separate the magnetic beads from the flow-through of the supernatant or cell lysate and save the latter to a clean plate labeled “FT.”
5. Add 500 μl of IMAC-A (containing 20 mM imidazole) and mix thoroughly. Incubate for 10 min with shaking (350 rpm) at 4 °C. Separate the magnetic beads from the first wash and save the latter to a clean plate labeled “W20.”

6. Add 500 μl of IMAC-W buffer (containing 50 mM imidazole) and mix thoroughly. Incubate for 10 min with shaking (350 rpm) at 4 °C. Separate the magnetic beads from the second wash and save the latter to a clean plate labeled “W50.”
7. Elute bound proteins by adding 100 μl of IMAC-E (containing 500 mM imidazole) and mix thoroughly. Incubate for 10 min with shaking (350 rpm) at 4 °C. Separate the magnetic beads from the eluted fraction and save the latter to a clean plate labeled “E500.”
8. Select the fractions to be analyzed by SDS-PAGE. Various options are possible. We run all elution fractions including negative and positive controls; crude supernatants/lysates, flow-through and wash fractions can also be analyzed if a problem is detected in the elution fractions (e.g., little to no expression, proteolytic degradation, wrong size of the protein band). Mix 30 μl of the selected samples with 10 μl 4× PGLB in a PCR plate, seal it, place it in a thermocycler and heat it at 70 °C for 10 min, then cool it down immediately to room temperature (see Note 15).
9. Load 10–15 μl of each fraction on 10–15% SDS-PAGE gels along with a suitable protein ladder. Run the electrophoresis at 250 V for 45 min. The gel can be stained with Coomassie or blotted for immunodetection (Figs. 1 and 2). In case a positive control has been used, it is possible to estimate from the relative intensities of Coomassie-stained protein bands (or, equivalently, from Western blotting bands) the amount of target protein produced per mL culture. Targets with yields between 0.1 and 1 mg/L should be tested by scaling up the expression cultures, whereas for yields equal to or greater than 1 mg/L the same set of expression conditions is scaled up to the culture volume required to obtain the desired final amount of protein.
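
The yield estimate and the decision rule in step 9 can be sketched as follows. This is only an illustration under stated assumptions: the band intensities and the yield of the positive control are hypothetical numbers, the staining response is assumed to be linear, and equal volumes of target and control are assumed to be loaded. The protocol text does not say how to handle yields below 0.1 mg/L, so the sketch only flags them.

```python
# Illustrative sketch only: function names and example numbers are
# hypothetical. Assumes linear staining response and equal loading.

def estimate_yield_mg_per_l(band_intensity, control_intensity, control_yield_mg_per_l):
    """Scale the known yield of the positive control by relative band intensity."""
    if control_intensity <= 0:
        raise ValueError("control band intensity must be positive")
    return control_yield_mg_per_l * band_intensity / control_intensity


def triage(yield_mg_per_l):
    """Apply the thresholds of step 9 (0.1 and 1 mg/L)."""
    if yield_mg_per_l >= 1.0:
        return "scale up under the same expression conditions"
    if yield_mg_per_l >= 0.1:
        return "test by scaling up the expression culture"
    return "below 0.1 mg/L: not covered by the protocol text"


# Hypothetical densitometry readings (arbitrary units).
estimated = estimate_yield_mg_per_l(1200.0, 4000.0, control_yield_mg_per_l=2.0)
print(f"{estimated:.2f} mg/L -> {triage(estimated)}")
```

In practice, a dilution series of the positive control on the same gel would give a more reliable calibration than a single reference band.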

4 Notes

1. Cultivation media formulations for S. cerevisiae and P. pastoris typically depend on the chosen promoter (via the promoter regulation and the inducer) and the selection marker (especially with auxotrophic markers).
2. SC media allow the complete specification of media components for the correct selection of plasmid-containing cells while permitting protein expression upon the addition of the inducer. Unregulated expression via constitutive promoters is intrinsically less predictable than regulated induction, although in some instances it has been shown to work as well or better than with inducible promoters.

3. All available knowledge on protein construct engineering for bacteria can be applied to yeast expression, including the divide-and-conquer approach consisting of breaking up genes encoding multi-domain proteins into their constituent domains. Experimental approaches are also available, which may require library construction methods. These options have not been as comprehensively explored as in bacteria. Leaving aside considerations about the vector sequences, including promoter, Kozak sequences, terminators, and selection markers, protein constructs can differ in the precise definition of their truncation borders (e.g., full-length, isolated domains, clipping off of flexible flanking residues) and the presence of tags or fusions (small peptide tags, large fusions, placement at the C- or N-terminal end, whether it is cleavable or not). Perhaps one of the most influential decisions that can be made concerning protein production in yeast is whether the construct is targeted to the extracellular space (secretion) or not (intracellular). To direct protein products into the secretory pathway, the recombinant gene is prepended with a sequence encoding a peptide leader. Several peptide leaders have been screened, with S. cerevisiae’s α-mating factor being one of the most commonly used. Tags and fusions are also used in yeast to facilitate purification by affinity chromatography, for immunological detection and to increase the solubility of the target protein. Although a short hexahistidine tag (His6-tag) is a very popular choice for bacterial expression, yeast constructs are more frequently prepared with longer His-tags (octa- or deca-histidine tags) or, preferably, with more specific tags. The short peptidic StrepII or FLAG tags are amongst the most popular. Double tagging at both the C- and N-termini is also more common in yeast than in bacteria, since the tagging of both protein ends allows the detection of full-length protein expression and undesired fragments. Dual-affinity fusions are also available, like the TAP tag that allows for dual purification schemes. Given the drastic differences observed in secretion yield for different target proteins, it is crucial to adopt a hierarchical approach to construct large screening libraries of secretion constructs.
4. When choosing an appropriate S. cerevisiae strain, the most important factors are the available auxotrophic markers and whether the strain carries mutations in any of the genes coding for several vacuolar proteases. Yeast vectors often carry an intact copy of a metabolic gene to allow for the auxotrophic selection of an appropriate yeast strain in SC medium lacking certain specific metabolites. The genes encoding vacuolar proteases are typically mutated in most yeast production strains, to improve the stability and reduce degradation of protein products.
Choosing a yeast strain with at least the protease-encoding gene pep4 knocked out or inactivated is sufficient for expression tests.
5. With its exclusively respiratory metabolism, TM6* produces twice as much biomass as conventional respiro-fermentative strains.
6. Most yeast strains used for recombinant expression grow optimally at 25–30 °C. The impact of cultivation temperature on protein solubility and stability is less evident in yeast than in bacteria. Since yeast expresses a full complement of eukaryotic chaperones, yeasts are better equipped than bacteria to fold eukaryotic proteins properly and, therefore, lowering the temperature might seem less critical. However, factors other than protein folding may be improved by lowering the temperature, e.g., proteolytic degradation of unfolded or misfolded proteins. For secreted protein products, which must traverse the endoplasmic reticulum (ER), one of the greatest risks is the unleashing of the unfolded protein response (UPR). The UPR is a homeostatic response to the buildup of unfolded proteins in the ER, which is transcriptionally programmed via the HAC1p transcription factor. Since lowering the cultivation temperature also reduces the growth, transcription and translation rates, it may help to raise the amount of secreted proteins. This possibility has to be weighed against the lengthier cultivation times before harvest; this time may range from 24 h to many days (sometimes over 1 week). As for S. cerevisiae, the experimental evidence in support of using lower temperatures for P. pastoris expression is still scant. We recommend using two temperatures during the screening phase, 20 and 30 °C, and evaluating the experimental results carefully.
7. The AOX1 promoter is induced by methanol in the absence of other carbon sources; glucose tightly represses the AOX1 promoter.
8. If the target is a membrane protein, we replace the C-terminal His8-tag by a larger TEV (tobacco etch virus)-cleavable yEGFP-His8 fusion as an additional reporter for in-gel fluorescence and, at a larger scale, FSEC (fluorescence-size exclusion chromatography).
9. Several P. pastoris strains offer unique advantages compared with the reference strains NRRL Y-11430 (wild-type), X-33 (wild-type) and GS115 (histidine auxotrophic strain) (Invitrogen, Carlsbad, CA, USA). While the wild-type strains are useful for reference purposes, additional strains exist that lack the main vacuolar protease gene (and also some of the secondary genes), which may confer additional stability to certain target proteins. Protease-deficient strains grow more slowly than wild-type strains. There are also strains lacking the AOX1 coding gene (e.g., KM71) that are necessarily slow growing in the presence of methanol since their metabolism becomes dependent on the less efficient AOX2 gene product. Those strains, termed methanol utilization slow (MutS), grow more slowly and generate less biomass, a feature that may be beneficial for secreted proteins. Strains with a functional AOX1 gene are termed Mut+. For screening purposes, protease-deficient strains should be used, and testing Mut+ and MutS strains side by side is also recommended. The use of episomal plasmids for expression reduces clonal variation by eliminating the possibility of random genome insertion events, including those events leading to the conversion between the various Mut phenotypes.
10. Non-magnetic beads can be used as well. Magnetic beads allow for more straightforward manipulations and, in our hands, they are more resistant to clogging when crude lysates are very dense or insufficiently clarified. In contrast, non-magnetic beads are less expensive, and they are equally effective in terms of total binding capacity and recovery of pure protein. Protocols for HTPP using non-magnetic beads are available for bacteria and yeast expression cultures.
11. Target genes have traditionally been introduced into P. pastoris by electroporation, leading to the genomic insertion of the expression cassette. There are also episomal plasmids that can be stably propagated and drive expression to the same level as (or a higher level than) single-copy integrative cassettes.
12. As an alternative to the protocol outlined here, a recent publication has described a new media formulation that exploits the ordered consumption of glycerol and methanol to accomplish autoinduction. Development of autoinduction media for P. pastoris is certain to have a positive impact on HTPP pipelines, reducing the amount of time and handling of media swapping protocols.
13. P. pastoris strains have been adapted for HTP growth and expression in various formats depending on the required throughput, cultivation volume and available equipment.
14. For FLAG-tagged proteins, scaling up can be performed without recloning since affinity resin-bound protein can be eluted with a 1 M arginine solution, which avoids the use of expensive competing peptides for elution.
15. For convenience, a thermocycler can be programmed to denature the protein samples. A water bath or a block heater can be used alternatively.
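
The screening recommendations in Notes 6 and 9 (two temperatures; Mut+ and MutS strains side by side) and the 2–4 colonies per construct tested in Subheading 3.2 combine into a small factorial design. The sketch below enumerates it and counts the 96-well DWBs needed. The construct names are hypothetical, and the layout rule (one negative and one positive control well reserved per plate, with each temperature run on separate plates) is an assumption made for illustration, not a prescription from the protocol.

```python
# Illustrative sketch only: construct names and the plate layout rule are
# assumptions, not part of the protocol.
from itertools import product
from math import ceil

TEMPERATURES_C = (20, 30)           # two screening temperatures (Note 6)
MUT_PHENOTYPES = ("Mut+", "MutS")   # tested side by side (Note 9)


def screening_conditions(constructs, clones_per_construct=2):
    """Return one (construct, temperature, phenotype, clone) tuple per culture."""
    clones = range(1, clones_per_construct + 1)
    return list(product(constructs, TEMPERATURES_C, MUT_PHENOTYPES, clones))


def plates_needed(n_cultures, wells_per_plate=96, control_wells=2):
    """Plates required when each plate reserves wells for the two controls."""
    usable = wells_per_plate - control_wells
    return ceil(n_cultures / usable)


constructs = [f"construct_{i:02d}" for i in range(1, 25)]  # hypothetical names
conditions = screening_conditions(constructs, clones_per_construct=2)
per_temperature = len(conditions) // len(TEMPERATURES_C)
print(len(conditions), "cultures in total")
print(plates_needed(per_temperature), "96-well DWBs per temperature")
```

For 24 constructs and two clones each, the design contains 192 cultures, i.e., 96 per temperature, which no longer fit on a single 96-well DWB once two control wells are reserved.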

Acknowledgments

We gratefully acknowledge the support received during the preparation of this chapter. MCV has received funding from the Spanish Ministerio de Economía y Competitividad (PET2008_0101, CTQ2015-66206-C2-2-R, and SAF2015-72961-EXP), the Regional Government of Madrid (S2017/BMD-3673), and the European Commission (Framework Programme 7 (FP7)) project ComplexINC (Contract No. 279039). Abvance Biotech srl contributed with salaries (FJF). Neither funder had any role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Chapter 5

A High-Throughput System for Transient and Stable Protein Production in Mammalian Cells

Sarah M. Rue, Paul W. Anderson, Michelle R. Gaylord, Jessica J. Miller, Scott M. Glaser, and Scott A. Lesley

Abstract

Recombinant protein expression and purification is an essential component of biomedical research and drug discovery. Advances in automation and laboratory robotics have enabled the development of highly parallel and rapid processes for cell culture and protein expression, purification, and analysis. Human embryonic kidney (HEK) cells and Chinese hamster ovary (CHO) cells have emerged as the standard host cell workhorses for producing recombinant secreted mammalian proteins by using both transient and stable production strategies. In this chapter we describe a fully automated custom platform, Protein Expression and Purification Platform (PEPP), used for transient protein production from HEK cells and stable protein production from CHO cells. Central to PEPP operation is a suite of custom robotic and instrumentation platforms designed and built at GNF, custom cell culture ware, and custom scheduling software referred to as Runtime. The PEPP platform enables cost-effective, facile, consistent production of proteins at quantities and quality useful for early stage drug discovery tasks such as screening, bioassays, protein engineering, and analytics.

Key words Human embryonic kidney, HEK, Chinese hamster ovary, CHO, Mammalian, Protein, Expression, Protein purification, Transient, Stable, Cell culture, Automation, Robotics, PEPP

1 Introduction

Basic biologic research and biomedical drug discovery often rely on the ability to produce high-quality, purified recombinant proteins. These proteins can be used to support protein structure/function, regulation, and proteomics studies, and can also be used directly as biotherapeutic agents for medical applications. A considerable number of protein expression systems including bacterial, fungal, insect, plant, and mammalian cells are readily available to the researcher, each offering its own unique advantages. However, mammalian cell host systems, in particular human embryonic kidney (HEK) cells and Chinese hamster ovary (CHO) cells, have emerged as the primary workhorses for producing secreted

Renaud Vincentelli (ed.), High-Throughput Protein Production and Purification: Methods and Protocols, Methods in Molecular Biology, vol. 2025, https://doi.org/10.1007/978-1-4939-9624-7_5, © Springer Science+Business Media, LLC, part of Springer Nature 2019


mammalian proteins. Suspension-adapted HEK293 cells are often used for rapid protein expression, as transient transfection of these cells with plasmid DNA encoding the protein of interest can produce up to tens of milligrams per liter of culture in just a few days (for a review see Ref. [1]). CHO cells are the standard manufacturing host for biotherapeutics due to their ability to produce protein at multi-gram amounts per liter, their ability to grow in suspension and in chemically defined media, and their long history with regulatory authorities [2].

Mammalian cell lines offer relative ease of cell handling, cell culture scalability, high protein productivity, and the capability to correctly fold and assemble complex and multidomain proteins. Importantly, they also have the capacity for posttranslational modifications (e.g., disulfide bond formation, proteolytic processing, glycosylation) that, biochemically and functionally, largely reflect those found in the native state [3]. Nevertheless, host cell-dependent posttranslational modifications differ in their details: expressing identical proteins in HEK and CHO cells has revealed distinct N-linked glycoforms that can potentially affect pharmacokinetics and biological activity [4, 5]. For this reason, expressing biotherapeutic candidates in CHO to allow early-stage assessment of functional activity and “developability” profiling is of great importance.

The development of suspension-adapted HEK293 and CHO cell lines greatly simplifies the integration of automated and liquid handling steps for developing high-throughput protein expression workflows. Improvements in serum-free culture media, transfection efficiencies, and expression yields for both HEK293 and CHO permit flexibility in the scale and throughput of protein expression [6]. Microscale transient transfection protocols using HEK293 have been demonstrated to be quick, robust, and inexpensive [7]. Microscale expression in CHO can be performed using transient or stable expression protocols, although stable protocols are preferred at GNF because stable pools can be cryoarchived for future protein resupply.

For stable pool establishment, CHO cells are typically transfected with DNA encoding both the expression cassette for the protein of interest and a selectable marker. Several selectable markers are available; they tend to be genes that encode either enzymes that confer resistance to cytotoxic antibiotics (e.g., puromycin or hygromycin) or enzymes that are essential to cell metabolism (e.g., dihydrofolate reductase or glutamine synthetase). Transfected cells are grown in media containing a selective agent for several weeks to enrich for cells that have stably integrated the expression plasmid into their genomes and are expressing the protein of interest. Cells that survive selection and grow out are referred to as a stable pool. This pool can be archived and then used to seed a new culture for protein production directly. If desired, stable pools


developed in microscale processes can also be single-cell cloned. Single-cell cloning is useful in that it facilitates the identification of cell lines with properties that are different from, or preferable to, those of the parental pool (higher cell-specific productivity, differential posttranslational modifications, improved stability, etc.), but the process can significantly extend project timelines.

With the arrival of the “omics” revolution and the trend toward high-throughput screening and discovery strategies, the demand for rapid, cost-effective, parallel protein expression and purification technologies has risen. The integration of protein expression technologies with steady advances in laboratory automation, robotics, and information management systems has enabled automation of the full process of protein expression and purification, from cell transfection to protein purification.

In this chapter we describe the configuration, operation, and protocols for a fully automated custom robotic system, the Protein Expression and Purification Platform (PEPP), which is used interchangeably for transient protein production in HEK cells and stable protein production in CHO cells. PEPP operation relies on a suite of custom robotic and instrumentation platforms built by GNF Systems in San Diego, CA, custom cell culture ware accommodating both HEK and CHO cell cultures, and custom scheduling software known as Runtime. PEPP is routinely used at GNF to supply recombinant antibodies, IgG Fab fragments, and non-antibody proteins to support drug discovery programs. In addition, PEPP has been used to produce protein collections that enable high-throughput protein-based discovery screens. The largest collection produced by PEPP to date is GNF’s “Secretomics” collection, a nearly 7000-member library consisting of secreted and extracellular domain (ECD) derived proteins from the human secretome [8]. The use of PEPP for mammalian protein production and purification has resulted in a cost-effective increase in throughput, quality, and consistency, and the platform architecture provides the flexibility needed to accommodate a variety of discovery applications.

2 Materials

2.1 Overview of PEPP and Custom Lab Ware

PEPP is a custom robotics platform that can be used to move and manipulate cells maintained in custom barcoded AutoFlasks. It can be used to fill and weigh AutoFlasks, perform transfections, or transfer nucleofected cells into AutoFlasks, and it can incubate up to 972 of these flasks in a controlled environment. PEPP can be used to monitor cell density in the flasks quickly and easily. When cells are ready to harvest, PEPP can be used to pellet AutoFlasks in a low-speed centrifuge, harvest the protein-containing cell supernatants, and purify proteins. Once purified, proteins can be buffer exchanged and aliquoted on the platform. PEPP currently supports


both high-throughput HEK and CHO workflows at GNF. In a typical month on PEPP, approximately 350 proteins are made in HEK and 100–200 stable CHO pools are generated.

2.1.1 PEPP Layout and System Components

The overall layout of PEPP is shown in Fig. 1. Panel (a) shows an overhead view, with components numbered in accordance with the descriptions below. Panel (b) shows a CAD drawing of a side view. Individual system components are described below.

[Fig. 1 panel labels. Panel (a): Control Center, Flask Filling Station, NAP-10 Tecan, Balance, Flask Hotels, Waste Chute, Incubators, Flask Density Reader, Harvester, Centrifuge, Purification Tecan, Transfection Tecan. Panel (b): Incubators, Transfection Tecan, Control Center, Centrifuge, Purification Tecan, Flask Filling Station, Waste Chute, Harvester, NAP-10 Tecan.]

Fig. 1 (a) Overhead diagram showing the layout of PEPP, with components numbered according to the descriptions in Subheading 2.1.1. (b) CAD drawing depicting a side view of the platform


Fig. 2 Central robotic arm outfitted with a GNF Systems Gripper

1. Stäubli® TX90XL Robotic Arm with GNF Systems Gripper (Fig. 2)
This central robotic arm moves barcoded AutoFlasks between different devices on PEPP. It is fitted with a GNF Systems-built gripper which has an integrated barcode scanner that can detect the presence of flasks and scan flask barcodes. With each movement of a flask on the system, the barcode is used to confirm proper labware orientation as well as to amend a detailed activity log for that individual AutoFlask.

2. GNF Systems Flask Filling Station (Fig. 3)
This station consists of two Flask Dispensers that are mirror images of each other and reside together within a positive pressure HEPA enclosure (Fig. 3, Panel a). Each dispenser is equipped with three independently controlled tips that are plumbed with disposable tubing routed through three independent peristaltic pump heads to enable the device to quickly dispense or aspirate large volumes. Two of the fluid paths are routinely used for dispensing while the third is utilized for aspiration. The GNF Systems PDCapp software enables the operator to program the movement of any tip through three operational positions during a flask handling cycle: Dispense/Aspirate,


Fig. 3 (a) The GNF Systems Flask Filling Station. (b) Up-close view of one of the dispensers filling an AutoFlask

Prime, and Wash. Designed specifically for AutoFlasks, the retractable nest is used to position the flask under a fixed hold-down bar (Fig. 3, Panel b), ensuring that it remains properly locked in during dispense/aspirate operations but is still accessible by the robot for direct loading and unloading from the Flask Filling Station. The nest can also be tilted automatically during aspiration to position the septum corner of the flask at the lowest point to allow for nearly complete media removal. The Prime position allows each fluid path to be selectively flushed with a user-specified volume, while the Wash position enables sterilization of the outside of the tip by submerging it in an overflow chimney flooded with cleaning reagent.

3. AutoFlask Weigh Station (Fig. 4)
This device weighs AutoFlasks at user-determined points in the process in order to track the volume in each flask. Empty flasks are weighed before they are filled. The approximate solution densities (mg/L) of media used on the system are known, as are the approximate solution densities of cultures at defined cell densities (numbers of cells/mL), so these can be converted into estimated flask volumes. When reagents are added via the Flask Filling Station or Transfection Tecan, high and low thresholds can be set so that any out-of-bounds conditions trigger a notification to the operator.

4. GNF Systems Flask Septum Sanitizer
This device sterilizes the outer portion of the septum of an AutoFlask before it is moved to the Flask Filling Station or Transfection Tecan in order to prevent potential contamination of flask contents when the septum is pierced. The robot arm moves AutoFlasks one at a time into this station, where a ~100 μL drop of 70% EtOH is dispensed onto the septum. After pausing for two seconds, the robot slowly withdraws the flask from the sanitizer as an air knife blows the 70% EtOH,


Fig. 4 The AutoFlask Weigh Station. The balance is protected from airflow by a custom-built cover, and it sits on a platform that dampens vibrations

along with any contaminants, into a negative pressure snorkel which evacuates waste from the system.

5. Transfection Tecan
A Tecan Freedom EVO® 150 with a custom positive pressure HEPA enclosure, four GNF Systems retractable AutoFlask nests, and various custom-made labware carriers to enable both 293 and CHO workflows. The GNF Systems AutoFlask nests on the Transfection Tecan utilize the same hold-down bar design used on the Flask Filling Station to prevent movement of the flask as reagents are added or removed by the Tecan Liquid Handling (LiHa) arm fitted with disposable custom Tecan tips. Nest movement back and forth to mix flask contents can be programmed through Tecan Freedom EVOware® by the operator to run at any desired point in a liquid handling routine.

6. GNF Systems Incubators
PEPP is equipped with two GNF Systems Automated Incubators, each containing 18 hotels with 27 shelves, yielding a capacity of 486 AutoFlasks per incubator (Fig. 5). The entire carousel of hotels is programmed to rotate in an oscillating fashion in order to shake the flasks and maintain cells in suspension. Temperature, relative humidity and CO2 are controlled through GNF Systems Runtime software. Environmental conditions are captured in a log file every 5 min and out-of-range conditions trigger a notification to users. A vertical array of six independent doors allows direct robot access to any of the 486 flask positions while minimizing the disruptive effects of breaking the incubator’s environmental envelope (Fig. 6).
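The capacity figures and the out-of-range notification described for the incubators can be illustrated with a minimal Python sketch. The hotel, shelf, incubator, and logging-interval numbers come from the text; the acceptance bands, parameter names, and function names are hypothetical placeholders chosen for illustration and are not taken from the GNF Systems Runtime software.

```python
from __future__ import annotations

from dataclasses import dataclass

# Figures stated in the text.
HOTELS_PER_INCUBATOR = 18
SHELVES_PER_HOTEL = 27
INCUBATORS_ON_PEPP = 2
LOG_INTERVAL_MIN = 5


def flask_capacity(incubators: int = INCUBATORS_ON_PEPP) -> int:
    """AutoFlask positions: hotels x shelves x number of incubators."""
    return HOTELS_PER_INCUBATOR * SHELVES_PER_HOTEL * incubators


def log_entries_per_day(interval_min: int = LOG_INTERVAL_MIN) -> int:
    """Number of environmental log records written in 24 h."""
    return (24 * 60) // interval_min


@dataclass(frozen=True)
class Band:
    """Inclusive acceptance band for one environmental parameter."""
    low: float
    high: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


# Hypothetical bands (assumed values, NOT from the chapter).
ASSUMED_BANDS = {
    "temperature_c": Band(36.5, 37.5),
    "relative_humidity_pct": Band(70.0, 90.0),
    "co2_pct": Band(4.5, 8.5),
}


def out_of_range(reading: dict[str, float],
                 bands: dict[str, Band] = ASSUMED_BANDS) -> list[str]:
    """Return the sorted names of parameters that fall outside their band."""
    return sorted(name for name, band in bands.items()
                  if not band.contains(reading[name]))
```

With these numbers, `flask_capacity(1)` reproduces the 486 positions per incubator and `flask_capacity()` the 972 positions quoted for the whole platform; a non-empty result from `out_of_range` is the condition that would prompt a notification in this toy model.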


Fig. 5 View of one of the GNF Systems Incubators from the outer side, which has a door that can be opened to allow access by the operator

Fig. 6 The robotic arm placing a flask into one of the GNF Systems Incubators


Fig. 7 The GNF Systems Flask Density Reader

7. GNF Systems Flask Density Reader (FDR; Fig. 7)
This device consists of a laser light source and receiver housed in a custom nest (for a thorough description please see Ref. [9]). After the robotic arm removes an AutoFlask from the incubator, the flask is seated into the nest and the laser light passes through the culture flask. Any cells present in the flask will reduce the amount of transmitted light (primarily by light scatter) that is sensed by the detector. The AutoFlask is then robotically returned to the incubator. The measured transmitted light can be correlated to the flask cell density using a standard curve derived from known cell densities. All AutoFlask cultures are read on this device each morning in order to inform actions that will need to be taken during that day on the system, such as archiving or seeding protein production cultures from stable CHO pools. Typically, 400–500 flasks are read on this device each day, at a speed of 40 seconds per flask including robot movements.

8. GNF Systems Flask Centrifuge (see Fig. 8 for inside view)
This custom centrifuge has a horizontal clamshell-type rotor and is able to pellet cells inside the AutoFlask into a pocket specifically designed for this application. The centrifuge holds four AutoFlasks and is typically run at 1500 × g for 5 min to pellet cells.

9. GNF Systems Harvester and Supernatant Collector (Fig. 9)
A station used to collect supernatants from AutoFlasks consisting of both a Harvester and a Supernatant Collector. Typically, cells are pelleted in the GNF Systems centrifuge and then the flasks are directly moved to the flask side of the Harvester to be processed in sets of four. A manifold (Fig. 9,


Fig. 8 View of the inside of the GNF Systems Flask Centrifuge

Fig. 9 (a) The GNF Systems Harvester. The harvester can tip AutoFlasks at an angle that maximizes the amount of supernatant that can be withdrawn by the cannulae. (b) The Supernatant Collector side of the GNF Systems Harvester


Panel a) consisting of four cannulas pierces the AutoFlask septums and pumps supernatants over to the Supernatant Collector where they are dispensed through a second manifold into four 50 mL conical tubes (Fig. 9, Panel b). After all supernatant has been collected from each of the four harvest lines, both manifolds are moved to wash stations so that the tips can be cleaned and the lines can be flushed with sterile PBS. Once a set of eight supernatants has been harvested, the conical tubes are automatically slid over to a customized Tecan Freedom EVO® 150 worktable for the protein purification procedure. GNF Systems PDCapp Software on the Harvester computer is used to track progress and index the Supernatant Collector manifold to the correct set of collection tubes, allowing for up to 96 flask supernatants to be harvested in a single run.

10. Flask Hotels (Fig. 10)
Two hotels with capacity for 27 AutoFlasks each are used as a temporary storage location for flasks that are to be taken offline (see Note 1). The entire hotel has a slight tilt to prevent dislodging of cell pellet material in flasks that have been centrifuged.

11. Waste chute (Fig. 11)
Metal chute leading to a biohazard bin for automated AutoFlask disposal on the system.

12. Purification Tecan
A Tecan Freedom EVO® used to purify proteins from harvested cell supernatants by gravity-fed affinity chromatography. This was customized by GNF Systems by modifying the Tecan to elevate the LiHa arm so that it can clear the gravity-fed purification columns layered over the waste trough, which is housed above the 24-well capture plates.

Fig. 10 One of the two flask hotels on PEPP


Fig. 11 The waste chute used to dispose of flasks

A custom GNF Systems liquid waste trough, protein collection deck, and two auto-filling buffer troughs can be programmatically controlled through Tecan EVOware®, enabling unattended purification of up to 96 supernatants at once. Typically, two cycles of 96 harvests and purifications can be run each day.

13. NAP-10 Tecan
A Tecan Freedom EVO® used to exchange purified proteins into final formulation buffer of choice and to aliquot proteins into 2D-barcoded tubes. Like the Purification Tecan, this Tecan was modified by GNF Systems to raise the LiHa arm so that it can clear the NAP-10 columns layered over the waste trough, which sits on top of the 24-well capture plates. The same custom GNF Systems-designed liquid waste and protein collection carriers used on the Purification Tecan are used on this Tecan to accomplish unattended buffer exchange and sample re-array.

14. PEPP Software
Nearly all GNF Systems peripheral devices are controlled using an in-house built program called Peripheral Device Control application (PDCapp). This software allows execution of all of the steps available for a given device and allows the user to create PDCapp methods composed of the various steps in any order. Each step within those methods contains variables specific to that step, so the user has the ability to finely control each action. In order to execute methods and query events on the Transfection Tecan, a GNF Systems-built driver called UberBridge is used to facilitate communication between Runtime and the Freedom EVOware® external programming interface. The balance is another third-party device, but it is run by


feeding commands directly through a serial (RS-232) connection. Written by GNF Systems software engineers, Runtime Environment is the overarching scheduling software that drives PEPP and enables the system operator to design all the required workflows. To design a new workflow, the operator creates a Runtime method that calls specific Tecan or PDCapp methods on each peripheral device. Then, given a list of AutoFlask barcodes, the Runtime software moves each flask in the list through each step of the Runtime method, thus creating a scheduled workflow. Runtime has too many features to list, but a few examples are the ability to build in timers and notifications, to run several projects in parallel, and to capture metrics including timing of system events, AutoFlask weights, and environmental conditions within the incubators over time. To facilitate ease of use in routine system workflows, custom “Dashboards” have been written which overlay Runtime, greatly simplifying operations by only exposing the relevant Runtime options that pertain to the specific workflow. For more information on the software, please contact GNF Systems ([email protected]).

2.1.2 Custom Lab Ware

1. Greiner CELLSTAR® AutoFlask™. A flask designed for automated cell culture (Fig. 12). The flask conforms to the ANSI SLAS standard footprint compatible with common laboratory automation. It has barcodes on both short edges, a pierceable septum that allows for the addition or removal of reagents, and a filter that enables gas exchange. It also has a pocket designed to capture cells during centrifugation in the GNF Systems Centrifuge in order to minimize cell carryover into harvested supernatants.

2. GNF Custom Robotic 1 mL Filtered Tips (Fig. 13). Elongated wide-bore tips specially designed for piercing the AutoFlask septum on a Tecan liquid handler with the ability to reach deeper into the flask than standard tips. Standard Tecan tips will push the septum into the flask if used during aspiration steps.
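To close the overview in Subheading 2.1, the scheduling idea described for Runtime (a method is an ordered series of device steps, and a workflow results when a list of AutoFlask barcodes is moved through those steps while a per-flask activity log is kept) can be mimicked in a few lines of Python. This is only a conceptual sketch: the class names, the step and method names, and the barcode strings are hypothetical, the steps are run strictly one after another rather than truly scheduled, and none of this represents the actual Runtime or PDCapp interfaces.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Step:
    device: str   # device named in the chapter, e.g. "Flask Filling Station"
    method: str   # hypothetical name of a method defined on that device


@dataclass
class Workflow:
    """Toy stand-in for a Runtime method: an ordered list of device steps."""
    steps: list[Step]
    # One activity log per flask barcode, appended to on every movement.
    activity_log: dict[str, list[str]] = field(
        default_factory=lambda: defaultdict(list))

    def run(self, barcodes: list[str]) -> list[tuple[str, str, str]]:
        """Move every flask through every step, in order.

        Returns the resulting schedule as (barcode, device, method) tuples.
        """
        schedule = []
        for step in self.steps:
            for barcode in barcodes:
                schedule.append((barcode, step.device, step.method))
                self.activity_log[barcode].append(
                    f"{step.device}: {step.method}")
        return schedule


# Example with made-up barcodes and method names.
workflow = Workflow(steps=[
    Step("AutoFlask Weigh Station", "weigh_empty"),
    Step("Flask Filling Station", "dispense_media"),
    Step("Incubator", "store"),
])
schedule = workflow.run(["AF0001", "AF0002"])
```

A real scheduler would interleave flasks across devices and honor timers; the nested loop here only shows how a short method definition plus a barcode list expands into a full list of flask movements.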

2.2 Preparation of DNA

2.2.1 Miniprep Using Macherey-Nagel NucleoSpin® 96 Plasmid Miniprep Kit

1. MAX Efficiency® DH5α Chemically Competent cells (Thermo Fisher Scientific #18258012).
2. Whatman™ Uniplate® 96-well deep-well blocks (GE Healthcare #7701-5200).
3. Plasmid+® media (Thomson Instrument Company #446300).
4. Shel Lab SSI5 Floor Model Shaking Incubator.
5. Eppendorf™ 5810R or similar centrifuge.


Fig. 12 An AutoFlask, with features labeled

Fig. 13 A GNF custom Tecan tip used on the Transfection Tecan. The tip has a long, narrow point that can penetrate the AutoFlask septum and reach the bottom of the flask while keeping the septum in place

6. Macherey-Nagel NucleoSpin® 96 Plasmid Miniprep Kit (Macherey-Nagel #740625.24).
7. Analog MultiTube Vortexer (Fisher Scientific #02-215-450).
8. NucleoVac 96 manifold (Macherey-Nagel #740681).
9. House vacuum line.
10. 96-well optical UV transparent flat bottom plates (Corning #3635).
11. Cell Culture Grade H2O (HyClone™ #SH30529.02).
12. Tecan Infinite® F-200 plate reader.

2.2.2 Maxiprep Using QIAGEN® HiSpeed Plasmid Maxi Kit

1. QIAGEN® HiSpeed Plasmid Maxi Kit (QIAGEN #12663).
2. Cell Culture Grade H2O (HyClone™ #SH30529.02).
3. NanoDrop™ ND-1000 (Thermo Scientific).

2.3 Protein Production in 293-F Cells

2.3.1 Handling Parental FreeStyle™ 293-F Cells

1. Orbital incubator/shaker with temperature and CO2 control (Infors HT Multitron Cell or similar).
2. Biosafety cabinet containing a house vacuum line connected to a vacuum trap.
3. FreeStyle™ 293-F Cells (Thermo Fisher Scientific #R79007).
4. Gibco® FreeStyle™ 293 Expression Medium (Thermo Fisher Scientific #12338018).
5. Bead bath set to 37 °C.
6. Eppendorf 5810R or similar centrifuge.
7. Sterile disposable vented PETG cell culture flasks [Thermo Fisher Scientific #4112-0125 (125 mL), #4112-0250 (250 mL), #4112-0500 (500 mL), and #4115-1000 (1 L)].
8. 2.5 L UltraYield™ Flasks (Thomson Scientific #931136-B).
9. AirOtop™ Enhanced Seals for 2.5 L UltraYield Flasks (Thomson Scientific #899425).
10. 5 L Optimum Growth™ flasks (Thomson Scientific #931116).
11. Vi-CELL™ XR (Beckman Coulter) or alternative cell counting device.
12. Vi-CELL™ Sample Vials (Beckman Coulter #383721).
13. Sterile 50 mL conical tubes (Falcon® #352098).
14. Sterile 2 mL aspirating pipettes (VWR #414004-265).
15. Sterile dimethyl sulfoxide (DMSO; Calbiochem® #317275).
16. Steriflip-GV, 0.22 μm, PVDF, radio-sterilized (EMD Millipore #SE1M179M6).
17. Sterile 2 mL Cell Culture Cryogenic Tubes (Thermo Fisher Scientific #368632).
18. Styrofoam rack or Mr. Frosty™ Freezing Container (Thermo Fisher Scientific #5100-0001).
19. −80 °C freezer.
20. Liquid Nitrogen tank or VIP Plus™ Cryogenic Storage Chest Freezer (SANYO Scientific Model MDF-C2156VANC).

2.3.2 Transient Transfection of 293-F cells

1. Masterflex PharMed BPT Tubing, L/S #16, 250 (Cole-Parmer Instrument Company #06508-18). 2. Autoclavable zip ties. 3. Razor blades. 4. 1 L of 70% Ethanol made using water filtered using a Milli-Q® water purification system (EMD Millipore). 5. 1 L of 10% Bleach solution made using water filtered using a Milli-Q® water purification system (EMD Millipore). 6. Sterile 2 mL aspirating pipettes (VWR #414004-265). 7. Gibco® FreeStyle™ 293 Expression Medium (Thermo Fisher Scientific #12338018). 8. CELLSTAR® AutoFlasks™ (Greiner Bio-One #779190-J). 9. 2.5 L UltraYield™ Flasks (Thomson Scientific #931136-B). 10. AirOtop™ Enhanced Seals for 2.5 L UltraYield Flasks (Thomson Scientific #899425). 11. VWR Standard Orbital Shaker, Model 3500. 12. Miniprepped or maxiprepped DNA aliquoted into 0.75 mL 2D Data-Matrix coded V-bottom push cap tubes in Loborack96 (Micronic #MPW52020BC). 13. Eppendorf 5810R or similar centrifuge capable of pelleting 96-well LoBo racks. 14. TempPlate pierceable sealing foil, sterile (USA Scientific #2923-0110). 15. Sterile Tecan 100 mL Disposable Reservoir (Fisher Scientific #NC9405917). 16. Transfection reagent of choice. 17. Spray bottle containing 70% vol/vol EtOH in ddH2O. 18. GNF Custom Tips, Robotic, Filtered, 1 mL (Thermo Fisher Scientific #F175-96RS1000PT). 19. Clean Plastic Manual Decapper-8 (Micronic #MP65010). 20. Nalgene™ Rapid-Flow™ 0.2 μm 250 mL filter units (Thermo Fisher Scientific #153–0020). 21. Biosafety cabinet containing a house vacuum line connected to a vacuum trap. 22. Sterile 96-well deep-well plates (Greiner Bio-One #780271).

2.4 Stable CHO Workflow

2.4.1 Maintenance of Parental CHO Cells

1. Custom CHO maintenance media or commercial media of choice. 2. Bead bath set to 37 °C. 3. Biosafety cabinet containing a house vacuum line connected to a vacuum trap. 4. Orbital incubator/shaker with temperature and CO2 control (Infors HT Multitron Cell or similar). 5. Sterile disposable vented PETG cell culture flasks [Thermo Fisher Scientific #4112–0125 (125 mL), #4112–0250 (250 mL), #4112–0500 (500 mL), and #4115–1000 (1 L)]. 6. Vi-CELL™ XR (Beckman Coulter) or alternative cell counting device. 7. Vi-CELL™ Sample Vials (Beckman Coulter #383721). 8. Sterile 50 mL conical tubes (Falcon® #352098). 9. Sterile 2 mL aspirating pipette (VWR® #414004-265). 10. Sterile 2 mL Cell Culture Cryogenic Tubes (Thermo Fisher Scientific #368632). 11. −80 °C freezer. 12. Styrofoam rack or Mr. Frosty™ Freezing Container (Thermo Fisher Scientific #5100-0001). 13. Liquid nitrogen tank or VIP Plus™ Cryogenic Storage Chest Freezer (SANYO Scientific Model MDF-C2156VANC).

2.4.2 Nucleofection of Suspension CHO Cells

1. Plasmid DNA suspended in cell culture grade H2O. 2. NanoDrop™ ND-1000 (Thermo Scientific). 3. TempPlate No-Skirt 0.2 mL PCR Plates (USA Scientific #1402-9596). 4. TempPlate Sealing Film (USA Scientific #2921-0000). 5. Biosafety cabinet containing a house vacuum line connected to a vacuum trap. 6. Sterile 50 mL conical tubes (Falcon® #352098). 7. Eppendorf 5810R or similar centrifuge capable of pelleting cells in 50 mL conicals. 8. Sterile 2 mL aspirating pipette (VWR® #414004-265). 9. Cell Line 96-well Nucleofector™ Kit (Lonza). 10. 12-channel pipettor such as F1-ClipTip™ 50 (Thermo Fisher Scientific #4661160 N). 11. Sterile disposable tips for 12-channel pipettor. 12. Lonza Amaxa 4D-Nucleofector™ System: (a) 4D-Nucleofector™ Core Unit. (b) 4D-Nucleofector™ X Unit. (c) 96-well Shuttle™. (d) Laptop computer. 13. CELLSTAR® AutoFlasks™ (Greiner Bio-One #779190-J). 14. Custom CHO media or commercial media of choice. 15. Antibiotic/Antimycotic 100× solution (HyClone™ #SV30079.01). 16. Nalgene™ Rapid-Flow™ 0.2 μm 250 mL filter units (Thermo Fisher Scientific #153–0020). 17. GNF Custom Tecan Tips, Robotic, Filtered, 1 mL (Thermo Fisher Scientific #F175-96RS1000PT).

2.4.3 Monitoring Pool Recovery from Selection

1. PEPP FDR. 2. Stable CHO pools maintained in AutoFlasks.

2.4.4 Sampling AutoFlasks on the Transfection Tecan

1. Vi-CELL™ XR (Beckman Coulter) or alternative cell counting device. 2. Vi-CELL™ Sample Vials (Beckman Coulter #383721). 3. CELLSTAR® 24-well Tissue Culture Plates (Greiner Bio-One #662160). 4. Nunc MicroWell 96-Well V-bottom Plates (Thermo Scientific #62409-108).

2.4.5 Archiving Stable CHO Pools

1. GNF Custom Tecan Tips, Robotic, Filtered, 1 mL (Thermo Fisher Scientific #F175-96RS1000PT). 2. Micronic 1.4 mL 2D Data-Matrix, Coded Screw Cap tubes (USA Scientific #1775-2330). 3. Biosafety cabinet. 4. Sterile dimethyl sulfoxide (DMSO; Calbiochem #317275). 5. 12-channel pipettor such as F1-ClipTip™ 50 (Thermo Fisher Scientific #4661160 N). 6. Sterile disposable tips for 12-channel pipettor. 7. Hamilton LabElite Decapper (USA Scientific #1774-7101). 8. Micronic RS210 CryoScanner for 2D tubes (USA Scientific #MP55121). 9. Liquid Nitrogen storage tank OR VIP Plus™ Cryogenic Storage Chest Freezer (SANYO Scientific Model MDF-C2156VANC).

2.4.6 Protein Production from Stable CHO Pools

1. CELLSTAR® AutoFlasks™ (Greiner Bio-One #779190-J). 2. CHO production medium of choice. 3. Antibiotic/Antimycotic 100× solution (HyClone™ #SV30079.01). 4. Nalgene™ Rapid-Flow™ 0.2 μm 1000 mL filter units (Thermo Fisher Scientific #164-0020). 5. GNF Custom Tecan Tips, Robotic, Filtered, 1 mL (Thermo Fisher Scientific #F175-96RS1000PT).

2.5 Harvest and Protein Purification

2.5.1 Harvest

1. Corning™ 50 mL conical tubes without caps (Fisher Scientific #07-201-332). 2. 200 L stock of sterile HyClone™ Dulbecco’s Phosphate-buffered Saline [Dulbecco’s PBS (GE Healthcare #SH3A1759.03)].

2.5.2 Purification of His6-Tagged Proteins

1. Nalgene™ Rapid-Flow™ 0.2 μm 1000 mL filter units (Thermo Fisher Scientific #164–0020). 2. Biosafety cabinet containing a house vacuum line connected to a vacuum trap. 3. Imidazole, BioUltra grade (Sigma-Aldrich #56749). 4. 1.0 N Sodium Hydroxide (VWR #BDH7222-4). 5. Hydrochloric Acid [HCl (Sigma-Aldrich #320331)]. 6. 10× Tris-buffered Saline [10× TBS; (Sigma-Aldrich #T5912)]. 7. 1 M Tris Hydrochloride Buffer, pH 7.5 (Corning #46-030CM). 8. 5 M NaCl (Corning #46-032-CV). 9. 100% Ethanol (Koptec #V1016G). 10. Cell Culture Grade H2O (HyClone™ #SH30529.02). 11. Ni-NTA Agarose (QIAGEN #30250). 12. Poly-Prep® Chromatography Columns (BioRad #7311553). 13. Sterile 24-Well Plates, 10.4 mL, Round Bottom, Polypropylene (Thomson Instrument Company #931565G1). 14. Robotic Reservoirs, 300 mL, Flat Bottom (Fisher Scientific #12-565-572). 15. TempPlate pierceable sealing foil, sterile (USA Scientific #2923-0110). 16. 200 L stock of sterile HyClone™ Dulbecco’s Phosphate-buffered Saline [Dulbecco’s PBS (GE Healthcare #SH3A1759.03)]. 17. TipOne 1000 μL conductive black filter tips for Tecan (USA Scientific #1128-2819).

2.5.3 Purification of IgG and Fc-Fusion Proteins

1. MabSelect™ SuRe™ resin (GE Healthcare #17-5438-03). 2. Protein G Sepharose™ 4 Fast Flow resin (GE Healthcare #170618-05). 3. Pierce™ IgG Elution Buffer (Thermo Fisher Scientific #21009). 4. 1 M Tris–HCl, pH 8.0 (Corning #46-031-CM). 5. Nalgene™ Rapid-Flow™ 0.2 μm 1000 mL filter units (Thermo Fisher Scientific #164-0020). 6. 100% Ethanol (Koptec #V1016G). 7. Poly-Prep® Chromatography Columns (BioRad #7311553). 8. Sterile 24-Well Plates, 10.4 mL, Round Bottom, Polypropylene (Thomson Instrument Company #931565G1). 9. Robotic Reservoirs, 300 mL, Flat Bottom (Fisher Scientific #12-565-572). 10. TempPlate pierceable sealing foil, sterile (USA Scientific #2923-0110). 11. 1.0 N Sodium hydroxide (VWR #BDH7222-4). 12. 200 L stock of sterile HyClone™ Dulbecco’s Phosphate-buffered Saline [Dulbecco’s PBS (GE Healthcare #SH3A1759.03)]. 13. TipOne 1000 μL conductive black filter tips for Tecan (USA Scientific #1128–2819).

2.6 Buffer Exchange and Protein Aliquotting

1. NAP-10 columns, 1 mL (VWR #95017-013). 2. 1.0 N Sodium hydroxide (VWR #BDH7222-4). 3. 200 L stock of sterile HyClone™ Dulbecco’s Phosphate-buffered Saline [Dulbecco’s PBS (GE Healthcare #SH3A1759.03)]. 4. 10× Tris-buffered saline [10× TBS; (Sigma-Aldrich #T5912)]. 5. 0.75 mL and 1.4 mL Screw-capped V-bottom 2D Tubes in Lobo Rack (USA Scientific #1775-2351 and #1775-2330). 6. Hamilton LabElite DeCapper (USA Scientific #1774-7101). 7. Micronic RS210 CryoScanner for 2D tubes (USA Scientific #MP55121). 8. 96-well UV transparent flat bottom plates (Corning #3635).

2.7 Protein Quantitation

1. SpectraMax Plus 384 Microplate Reader (Molecular Devices). 2. 96-well UV transparent flat bottom plates (Corning #3635). 3. HyClone™ Sterile Phosphate Buffered Saline (GE Healthcare # SH30256.02).

3 Methods

3.1 Overview of PEPP and Custom Lab Ware

PEPP is used for two different fully automated protein production workflows that we will describe in this chapter: transient protein production from 293-F cells and stable CHO pool and protein production. The first step in both workflows is preparation of DNA encoding the proteins of interest for transfection.


3.2 Preparation of DNA

3.2.1 Miniprep Using Macherey-Nagel NucleoSpin® 96 Plasmid Miniprep Kit


Either miniprep or maxiprep DNA can be used in transfection of 293-F cells and CHO cells. For all transfections, DNA quality is paramount (see Note 2). We strongly recommend working with high-copy number plasmids and using a high-yielding bacterial strain such as DH5α™ (Thermo Fisher Scientific) for DNA preparation. In our hands, the below-described alkaline lysis kits have reproducibly produced DNA of sufficient quality for both stable CHO pool establishment and 293-F transfection (see Note 3). It is not necessary to use endotoxin-free DNA purification kits to prepare DNA for transfection of either 293-F cells or our proprietary CHO strain; other strains may have different sensitivities. Minipreps are faster, easier, less expensive, and more high-throughput than maxipreps, but the DNA yield is much higher for maxipreps.

1. Grow up bacteria in 96-well deep-well blocks.
   (a) Fill 96-well deep-well block with Plasmid+® media containing the appropriate antibiotic at a 1 mL volume/well.
   (b) Inoculate wells of the block using colonies from a fresh agar plate (see Note 4).
   (c) Incubate block 18–24 h at 37 °C, 800 rpm in a Shel Lab shaking incubator.
2. Pellet deep-well block at 2400 × g for 15–20 min. Supernatants should appear clear.
3. Decant supernatants (see Note 5).
4. Follow manufacturer’s miniprepping protocol [10] either using a NucleoVac 96 manifold (manual process), or a Tecan (automated process; see [11]).
   (a) After the addition of Buffer A1, cover the block with a TempPlate seal and then resuspend cells by vortexing the block on a multi-tube vortexer.
   (b) Proceed with the remainder of the manufacturer’s protocol as described.
   (c) We recommend performing the optional wash step with Buffer AW.
5. Elute DNA into a 96-well flat-bottomed optical plate.
   (a) If DNA will be used for stable CHO pool generation, elute each well with 70 μL cell culture grade water. If DNA will be used for a 293-F transfection, you can elute in either cell culture grade water or in Buffer AE provided with the kit.
   (b) Add water or elution buffer to the NucleoSpin Plasmid Binding Plate and incubate 15–30 min prior to applying the vacuum to pull the liquid through the plate.
   (c) Quantitate DNA using a Tecan Infinite® F-200 plate reader.
   (d) Cover plate with a TempPlate seal or move DNA into 2D-barcoded tubes. Store DNA at −20 °C.

3.2.2 Maxiprep Using QIAGEN® HiSpeed Plasmid Maxi Kit

1. Follow the QIAGEN® HiSpeed Plasmid Maxi Kit protocol for growing up bacteria from a colony on a fresh agar plate and performing the DNA preparation.
2. If DNA will be used for nucleofection of CHO cells, elute in 1 mL cell culture grade water; DNA for 293-F transfection can be eluted in the provided Buffer TE.
3. To elute DNA, push water or TE gently into the QIAprecipitator using the plunger, but stop before any liquid passes through the filter into the elution tube. To maximize yields, allow water to sit on the QIAprecipitator for 15–30 min prior to eluting liquid from the QIAprecipitator with the plunger.
4. Quantitate DNA using a NanoDrop or other device. Store DNA at −20 °C.

3.3 Protein Production in 293-F Cells

3.3.1 Handling Parental FreeStyle™ 293-F Cells

In this procedure, FreeStyle™ 293-F cells are transiently transfected with DNA encoding the proteins to be expressed. AutoFlasks are filled with a premade suspension of cells in media, and the following day the Transfection Tecan is used to mix the DNA and transfection reagent and to transfer the resulting complexes into the flasks. Several different transfection reagents can be used on the system; each has its benefits and drawbacks. Using automation for the transfection process makes it relatively straightforward to execute experiments to compare performance of different reagents, ratios of DNA to transfection reagent, and other parameters. Cells should be handled in a biosafety cabinet using sterile technique. Please also refer to the FreeStyle™ 293 Expression System Manual [12].

1. Thawing cells.
   (a) Pre-warm 30 mL FreeStyle™ media in a 50 mL conical in a 37 °C bead bath.
   (b) Thaw cryo-vial containing 1.0 e7 cells quickly in a 37 °C bead bath.
   (c) Transfer vial contents into 10 mL FreeStyle™ media in a 50 mL conical. Pellet 5 min at 100 × g, then resuspend in 10–20 mL fresh media and transfer to a 125 mL shake flask.
   (d) Place flask in an orbital shaking incubator set to 37 °C, 8% CO2, and 120 rpm.
   (e) Monitor cell growth daily: withdraw 0.5–1 mL of culture into a Vi-CELL vial in a biosafety cabinet. Determine cell density and viability on a Vi-CELL instrument. Once cells reach a viable cell density between 1.0 and 3.0 e6 cells/mL, expand to a larger shake flask (50 mL culture in a 250 mL flask, then 250 mL culture in a 1 L flask), always seeding at a density of 1.0–3.0 e5 cells/mL. Continue scale-up with seeding at the same density until you are able to achieve your desired working stock culture volume (typically 1 L or 2.5 L, see below).
2. Cell maintenance.
   (a) Cells are typically maintained in 1 L culture volume in a 2.5 L UltraYield™ flask or 2.5 L culture volume in a 5 L Optimum Growth™ flask in an orbital shaking incubator set to 37 °C, 8% CO2, and 120 rpm.
   (b) Passage 2–3 times/week, splitting to a density of 0.1–0.5 e6 cells/mL in pre-warmed media in a biosafety cabinet. Once cells have recovered from thaw, they should double approximately every 24 h.
   (c) A new vial of cells is thawed every four weeks. Cells are typically cultured for at least four passages before they are used in a transfection. After 5–6 weeks of use, cells are retired and the more recently thawed stock is used as the working stock.
3. Banking cells.
   (a) Grow cells to a density between 1.0 and 3.0 e6 cells/mL.
   (b) Prepare freezing media (made fresh): mix FreeStyle™ media + DMSO at 10% volume (e.g., 9 mL FreeStyle™ media + 1 mL DMSO). Sterile filter through a 0.22 μm Steriflip-GV unit. Chill on ice.
   (c) Calculate the volume of cells needed to generate desired number of cryovials at 1.0 e7 cells/mL, 1 mL per vial.
   (d) Transfer appropriate volume to 50 mL conicals. Pellet at 100 × g for 5 min and aspirate media using a sterile 2 mL transfer pipette attached to a house vacuum line.
   (e) Resuspend cells in freezing media at a density of 1.0 e7 cells/mL.
   (f) Aliquot to 2 mL screw-capped cryo-tubes, 1 mL/vial.
   (g) Slow-freeze at −80 °C in a Styrofoam rack or room-temperature Mr. Frosty™.
   (h) Once frozen, store in a liquid nitrogen tank or cryogenic storage chest freezer.

3.3.2 Transient Transfection of 293-F Cells
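The splitting arithmetic of Subheading 3.3.1 (step 2b) and the approximately 24 h doubling time quoted there can be sketched in a few lines of Python. This is only an illustrative helper, not part of the PEPP software: the function names and the example numbers are assumptions chosen to fall within the ranges given in the protocol, and the growth projection assumes ideal exponential growth.

```python
def split_volumes(current_density, target_density, final_volume_ml):
    """Return (culture_ml, fresh_media_ml) giving final_volume_ml at target_density.

    Densities are viable cells/mL, as read from the cell counter.
    """
    if not 0 < target_density <= current_density:
        raise ValueError("target density must be positive and not exceed current density")
    culture_ml = final_volume_ml * target_density / current_density
    return culture_ml, final_volume_ml - culture_ml


def projected_density(seed_density, hours, doubling_time_h=24.0):
    """Idealized exponential growth; real cultures lag after a thaw or vessel change."""
    return seed_density * 2 ** (hours / doubling_time_h)


# Hypothetical example: split a 2.0 e6 cells/mL stock to 0.4 e6 cells/mL in 1 L.
culture_ml, media_ml = split_volumes(2.0e6, 0.4e6, 1000.0)
print(f"Split: {culture_ml:.0f} mL culture + {media_ml:.0f} mL fresh media")
print(f"Projected density after 48 h: {projected_density(0.4e6, 48):.2e} cells/mL")
```

Measured Vi-CELL densities should always take precedence over the projection, which is only a planning aid for deciding when the next passage is likely to be due.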

1. Set up the GNF Systems Flask Filling Station. The two Flask Dispensers in this station are in a HEPA-filtered enclosed environment. While setting up the Flask Dispensers and manipulating items in the HEPA enclosure, sterile technique should be used. The below protocol describes the setup of one Flask Dispenser. If desired, both Flask Dispensers can be set up to fill AutoFlasks simultaneously to increase throughput.
   (a) Fold over and zip-tie the ends of one piece of 25-in. tubing. Autoclave the tubing as well as the metal tip for the dispenser.
   (b) Set up the Flask Dispenser in the custom GNF flask filling station enclosure. Carefully cut one end of a piece of sterilized tubing with a clean razor blade and attach the open end to a metal tip. Be sure to handle the tubing and tip with clean gloves, taking care not to touch the point of the tip or the inside of the tubing to prevent contamination.
   (c) Run the tube line through the Flask Dispenser pump. Cut open the remaining end of the tubing and attach a 2 mL transfer pipet to it.
   (d) Place the 2 mL pipet end into 70% EtOH and prime the dispenser with 300 mL using the PDC App software.
   (e) Between uses, be sure to prime the line with at least 50 mL of 10% bleach and at least 100 mL of 70% EtOH.
2. Seed flasks with 293-F Cell Suspension using GNF Systems Flask Filling Station. This is performed on the day before the actual transfection. AutoFlasks are filled with 293-F Cell Suspension and placed in the GNF Systems incubators overnight, allowing the cells to approximately double in the flasks.
   (a) Target having your working cell stock at 1.5–2.0 e6 cells/mL on the day of seed (the day before transfection). Cells should be at least 95% viable.
   (b) Set up the system to weigh the appropriate number of empty AutoFlasks on the PEPP balance hours prior to cell preparation.
   (c) Dilute working stock of 293-F cells in FreeStyle™ media in the biosafety cabinet using sterile technique. Normally, 35 mL of culture at a cell density between 0.8–0.9 e6 cells/mL is required per flask; prepare enough culture to fill the desired number of flasks for your transfection run, plus two controls. Be sure to prepare about 0.5–1.0 L of extra culture for testing and priming of the seed line.
   (d) Verify the density of this Cell Suspension on a Vi-CELL using an average of triplicate samples.
   (e) Transfer the total volume of the Cell Suspension into at least two sterile 2.5 L UltraYield flasks, approximately 2 L per flask. Keep these flasks in an orbital shaking incubator set to 37 °C, 8% CO2, and 120 rpm until you are ready to fill AutoFlasks.
   (f) Set up the orbital shaker on the platform in the flask filling station HEPA-filtered enclosure.
   (g) Just prior to dispensing into the first two AutoFlasks, move the first 2.5 L flask from the orbital shaking incubator into the flask filling station enclosure and set the shaker at a moderate speed to keep cells suspended.
   (h) There are two Flask Dispensers in the Flask Filling Station. To prepare the fluid path for one Flask Dispenser, follow the steps below. If both Flask Dispensers are to be used, repeat the below steps for the second dispenser. If desired, both fluid paths can draw from the same source flask of cells.
      – With the end of the sterile pipet still submerged in 70% EtOH, use the PDCapp method to manually prime 50 mL through the appropriate tip of the Flask Dispenser which will perform the flask seeding.
      – While the prime is actively running and 70% EtOH is being drawn through the fluid path, carefully remove the sterile pipet from the 70% EtOH then invert the tip inside the hood so that EtOH is drawn down the pipet.
      – As soon as the pipet is cleared of EtOH, carefully flip the pipet over and insert it into the flask of cells. (NOTE: By performing this action during the prime, you are introducing an air bubble into the fluid path and preventing any 70% EtOH from coming in contact with your flask of source cells.)
      – With the pipet now submerged in the Cell Suspension, prime another 50 mL to ensure the entire fluid path is cleared of 70% EtOH and no air bubbles remain, leaving a consistent flow of cell culture.
   (i) Queue flasks on the system to seed in the reverse order in which they will be transfected. Cell doubling time slows slightly as the cells are transitioned from growth in shake-flasks in an orbital incubator to growth in AutoFlasks in the PEPP incubators, which have a side-to-side shaking pattern. For this reason, the flask that is filled first will have the lowest cell density. The first flasks to be seeded are transfected last in order to give more time for the cells to grow.
   (j) After each AutoFlask is filled on the dispenser, it is weighed to confirm that the correct volume has been dispensed.
   (k) Every 40 min, exchange the 2.5 L shake-flask being used to fill flasks on the dispenser platform with one being stored in the 37 °C orbital shaking incubator. This helps to minimize the amount of time the cells are kept at room temperature over the course of the seed.
   (l) When the seed is complete, prime each line with 100 mL of 10% bleach, let it sit for ~10 min, and then flush with 100 mL 70% EtOH to ensure the cells are no longer present in the lines.
3. On the day before transfection, move the DNA aliquots to be used from storage at −20 °C to 4 °C, and cover several robotic reservoirs in pierceable foil and autoclave. For each transfection, we typically use 35 μg of DNA, but you may need to optimize the amount of DNA that should be used for your expression vectors in your transfection protocol.
4. On the day of transfection, centrifuge DNA aliquots at 200 × g for 1 min, then temporarily store at room temperature.
5. Set up the Transfection Tecan deck. The Tecan is in a positive pressure HEPA-filtered enclosure and the deck should be treated as a sterile environment. Please refer to Fig. 14. All steps below should be performed inside the Tecan enclosure using aseptic tissue culture technique.
   (a) Unwrap one pre-autoclaved robotic reservoir from aluminum foil and place it on the deck. Fill with transfection reagent of choice. Cover and seal the reservoir top with pierceable foil.
   (b) Place transfection reagent in position #1.
   (c) If setting up a double transfection, repeat steps (a) and (b), placing second reservoir into transfection reagent position #2.
   (d) Once reservoir(s) is (are) in place, spray the top of the foil seal(s) and the inside of the fluid waste trough with 70% EtOH. Do not wipe the ethanol off; leave these wet.
   (e) Unwrap GNF custom tip box and make sure there are an adequate number of tips to conduct the transfection. If more tips are needed, fill tip positions with a sterile tip tray, being careful not to let the racked tips touch the walls of the tip tray holder.

Fig. 14 The Transfection Tecan deck set up for a 293-F transfection

   (f) Prepare DNA plate(s).
      – Remove lid from each DNA plate.
      – Remove rubber plugs from each DNA tube with removal tool and discard plugs in trash.
      – Cover and seal DNA tubes with pierceable foil seal. Be certain to fully seal the tops of the tubes to prevent any leaks or evaporation.
      – Spray the top of the foil thoroughly and leave wet, then place the sealed plate into DNA plate position #1.
   (g) If setting up a double transfection, repeat step (f) for the second DNA plate, placing the second sealed plate in DNA plate position #2.
   (h) Sterile-filter 250 mL FreeStyle™ media using a 0.2 μm filter unit. If you are performing a double transfection, sterile filter a second 250 mL of media.
   (i) Remove lid from one media bottle and seal with pierceable foil, then slide the sealed bottle into media position #1.
   (j) If setting up a double transfection, repeat step (i) with the second media bottle and slide it into media position #2.

   (k) Carefully unwrap one 96 deep-well plate and snap into DNA mix position 1, with well A1 in the upper left position.
   (l) Place a second 96 deep-well plate into DNA mix position 2.
   (m) If setting up a double transfection, repeat steps (k) and (l) with DNA mix positions 3 and 4.
   (n) Finally, spray the foil seals on the media bottles with 70% EtOH and leave wet.
6. Perform transfection.
   (a) Read the flasks that were seeded first and last to determine when to start the transfection. The transfection is ideally started when the flask seeded last is around 1.2 e6 viable cells/mL and the flask seeded first is of slightly lower density. The flasks seeded last and reading closest to 1.2 e6 cells/mL will be transfected first.
   (b) Run the Transfection Tecan Program for 293-F cells. DNA and transfection reagent volumes are provided to the Tecan in a CSV file.
   (c) The Tecan fills all appropriate wells of the two deep-well DNA mix plates that will be used for the starting DNA plate with media according to the Excel file. It uses the data in the file to adjust the volume of media added based on the volume of DNA + transfection reagent that will be used so that each plate ends up with a consistent volume in all wells.
   (d) The Tecan transfers the indicated volume of DNA for the first four wells into DNA Mix Plate 1.
   (e) The Tecan transfers the indicated volume of transfection reagent into the first four wells of DNA Mix Plate 2.
   (f) The contents of four transfection reagent + media wells are combined with the contents of the four DNA + media wells in DNA Mix Plate 2 to allow transfection complex formation. Timing of this step is critical (see Note 6). While the complexes form, four AutoFlasks are retrieved from the incubator and placed in the 4 nests on the Tecan deck. Each of the four wells of complexes is then transferred to an AutoFlask. Upon complex addition, each AutoFlask is shaken 3 times in the custom GNF nest on the Tecan. The flasks are then returned to the GNF Systems incubator and shaken at 37 °C, 75% humidity and 10% CO2 to allow cells to grow until ready for harvest and purification.
   (g) For the Secretomics (non-antibody, secreted protein) workflow, AutoFlasks are incubated for approximately 96 h.
   (h) For antibody expression, AutoFlasks are incubated for about 120 h.
   (i) One to two days prior to harvest and purification, the AutoFlasks should be visually checked for contamination. In addition, if any control plasmids expressing fluorescent proteins (e.g., GFP) were used in the transfection, the cells in the corresponding flasks should be visibly pigmented.

3.4 Stable CHO Workflow
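For the seed step of the 293-F transient transfection (Subheading 3.3.2, step 2c), the amount of diluted cell suspension to prepare follows directly from the per-flask volume, the two control flasks, and the extra culture reserved for priming. The Python sketch below is a planning aid under stated assumptions: the function name, the default seed density of 0.85 e6 cells/mL (midpoint of the quoted 0.8–0.9 e6 range), and the default 500 mL of extra culture (lower end of the quoted 0.5–1.0 L) are illustrative choices, not settings of the PEPP software.

```python
def seed_plan(n_flasks, stock_density, seed_density=0.85e6,
              ml_per_flask=35.0, n_controls=2, extra_ml=500.0):
    """Volumes (mL) needed to prepare the diluted 293-F suspension for one seed.

    stock_density and seed_density are viable cells/mL.
    """
    if stock_density < seed_density:
        raise ValueError("working stock is less dense than the seed target")
    # Flasks for the run plus controls, plus culture for testing and priming the line.
    total_ml = (n_flasks + n_controls) * ml_per_flask + extra_ml
    stock_ml = total_ml * seed_density / stock_density
    return {"total_ml": total_ml, "stock_ml": stock_ml, "media_ml": total_ml - stock_ml}


# Hypothetical run: 96 AutoFlasks seeded from a working stock at 1.7 e6 cells/mL.
plan = seed_plan(n_flasks=96, stock_density=1.7e6)
print(f"Prepare {plan['total_ml']:.0f} mL: "
      f"{plan['stock_ml']:.0f} mL stock + {plan['media_ml']:.0f} mL media")
```

The density of the diluted suspension should still be verified on the cell counter in triplicate, as the protocol specifies, before any flask is filled.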

There are currently three routine applications for HT-CHO at GNF, and different fit-for-purpose protein production workflows are followed for each one. All applications begin with the generation of a stable pool in what we refer to as a “Mother” flask followed by cryo-archive of the pool on PEPP. 1. The fastest-growing HT-CHO application is the construction of renewable protein collections. In this case, protein production is performed in one “Daughter” flask and purification is carried out on PEPP. We have built three such enabling protein collections in CHO to date and are planning to build several more in the near future, the largest of which is a remake of the entire Secretomics collection. This effort should provide material in higher quantity and of higher quality than was obtained in the original Secretomics collection produced in HT-HEK. This will be beneficial in several ways: proteins will be provided at higher starting concentration and purity for use in screens that leverage GNF’s powerful high-throughput screening capabilities, which could increase the likelihood of biological activity being observed. Also, in contrast with the HEK-produced Secretomics collection, sufficient material will be available for biologists to confirm activity using the same batch of protein. Finally, when hits with interesting biological activity have been discovered and confirmed, we will have an easy, scalable path to resupply for follow-on activities: we can thaw the corresponding cryo-archived cell banks and put them through an offline fed-batch process. 2. The second application is mid-scale (up to 30 mg) expression of biotherapeutics to enable lead selection using cell-based assays and in vivo studies. In this case, the Mother flask is used to inoculate three Daughter flasks for protein production followed by purification entirely on PEPP.


3. The final application is large-scale manufacture of antibodies or other proteins for drug discovery project support. Lead therapeutic molecules, frequently-used proteins such as isotype controls, and proteins that need to be modified post-purification often fall into this category. In this application, the Mother flask is taken offline and scaled up to inoculate a larger vessel for fed-batch protein production.

3.4.1 Maintenance of Parental CHO Cells

Always handle cells in a biosafety cabinet using sterile technique.

1. Thawing cells.
   (a) Pre-warm CHO growth media in a 37 °C bead bath.
   (b) Thaw cryo-vial containing 1.0 e7 cells quickly in a 37 °C bead bath.
   (c) Transfer vial contents into a 125 mL shake flask containing 50 mL media.
   (d) Place flask in an orbital shaking incubator set to 36.5 °C, 10% CO2, and 150 rpm.
   (e) After 7 days, monitor cell growth daily: withdraw 0.5–1 mL of culture into a Vi-CELL vial in a biosafety cabinet. Determine cell density and viability on a Vi-CELL instrument.
   (f) Once cells reach a viable cell density between 1.0 and 3.0 e6 cells/mL, expand to a 100 mL culture in a 250 mL shake flask, seeding at a density of 0.2–0.5 e6 cells/mL and allowing cells to grow as described above.
   (g) Repeat scale-up with seeding at the same density until you are able to achieve your desired working stock culture volume.
2. Stock cell maintenance.
   (a) Parental (un-transfected) CHO cells are typically maintained in media of choice in an orbital shaking incubator set to 36.5 °C, 10% CO2, and 150 rpm in a culture volume of 100 mL in a 250 mL shake flask or 200 mL in a 500 mL shake flask.
   (b) Cells are typically kept in culture for at least 3 weeks before they are used in a nucleofection. Cells must be >95% viable and doubling every 24 h.
   (c) A new vial of cells is thawed every 5 weeks. Cell stocks are retired after 10–12 weeks of use, after which the most recently thawed stock is used.
   (d) Cells double approximately every 24 h and are split twice a week to 0.1–0.4 e6 cells/mL.
   (e) During weeks in which a transfection is not planned, the stock can be scaled back to a culture volume of 100 mL in a 250 mL shake flask for maintenance.
3. Banking cells.
   (a) Grow cells to a density between 1.0 and 10.0 e6 cells/mL. Cells should be at least 95% viable.
   (b) Calculate the number of cryovials to be frozen at 1.0 e7 cells/mL, 1 mL per tube.
   (c) Transfer appropriate volume to 50 mL conical centrifuge tubes. Pellet at 100 × g for 5 min.
   (d) Aspirate some of the supernatant (conditioned media) using a sterile 2 mL transfer pipette inserted into a house vacuum line, taking the milliliter volume down to match the number of vials that will be frozen. For example, if 10 vials are to be frozen, aspirate down to the 10 mL mark on the side of the 50 mL conical.
   (e) Add 10% DMSO and resuspend cells.
   (f) Aliquot resuspended cells to 2 mL screw-capped cryo-tubes, 1 mL/vial.
   (g) Slow-freeze at −80 °C in a Styrofoam rack or room-temperature Mr. Frosty™.
   (h) Once frozen, store in a liquid nitrogen tank or cryogenic storage chest freezer.

3.4.2 Nucleofection of Suspension CHO Cells
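The CHO banking arithmetic of Subheading 3.4.1 (step 3b–d) can be checked with a small Python helper. It is a sketch only: the function name and example values are hypothetical, and it assumes, as the protocol states, 1.0 e7 cells and 1 mL per vial, so the volume to aspirate down to (in mL) equals the number of vials. The DMSO addition of step 3e is deliberately left out because the protocol does not specify how the 10% is measured.

```python
import math


def cho_bank_plan(culture_ml, viable_density, cells_per_vial=1.0e7):
    """Number of whole cryovials obtainable and the volume mark to aspirate down to.

    viable_density is viable cells/mL of the culture being pelleted.
    """
    total_cells = culture_ml * viable_density
    n_vials = math.floor(total_cells / cells_per_vial)
    # One mL per vial, so the retained supernatant volume matches the vial count.
    return {"n_vials": n_vials, "aspirate_to_ml": float(n_vials)}


# Hypothetical example: 60 mL of culture at 4.0 e6 viable cells/mL.
plan = cho_bank_plan(60.0, 4.0e6)
print(f"{plan['n_vials']} vials; aspirate down to the {plan['aspirate_to_ml']:.0f} mL mark")
```

If the retained volume exceeds what a single 50 mL conical can hold, the pellet would need to be divided across tubes, which this sketch does not model.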

You will need to optimize the conditions for nucleofection of your CHO strain of choice. See Note 7 for tips on how to do this. Always handle CHO cells in a sterile biosafety cabinet. 1. Preparing for nucleofection. (a) Add the provided Supplement to your 4D-Nucleofector X Solution at a ratio of 1 to 4.5. Once the Supplement has been added to a Nucleofector Solution, the Solution is stable for up to 3 months at 4  C. (b) One or two days before transfection, cells are expanded and split to 0.75 e6 cells/mL or 1.5 e6 c/mL, respectively, so that the culture density on the day of nucleofection is approximately 3.0 e6 cells/mL. Cells need to be in the logarithmic growth phase for optimal results. (c) DNA preparation: determine concentration and ratio of absorbance at 260 nm and 280 nm (260/280 ratio) of sequence-verified miniprep or maxiprep DNA using a NanoDrop ND-1000 device or similar spectrophotometer. Only use DNA of concentration  200 ng/μL and with a 260/280 ratio between 1.70 and 1.90 in a nucleofection.

(d) For each nucleofection, 1.5 μg of DNA and 7.5 e6 CHO cells are used. When performing a 2-vector transfection, 0.75 μg of DNA for each plasmid is used, for a total of 1.5 μg.
(e) Typically, DNA is diluted to a concentration of 100 ng/μL in cell culture-grade water in 60 μL total volume, which is enough DNA for four transfections at a volume of 15 μL per transfection. This 60 μL is apportioned as described below:
– 10–15 μL of this dilution is sent for sequence verification.
– 15 μL (1.5 μg) is aliquoted into a 96-well PCR plate to be used for nucleofection. This plate can be sealed with film and stored at −20 °C until the day of nucleofection.
– The remainder is archived in case any processes need to be repeated.
(f) Formulas:
One expression plasmid:
Volume Plasmid DNA = 1.5 μg × 4 / [Concentration of Plasmid (μg/μL)]
Volume H2O = 60 μL − (Volume Plasmid DNA)
Two expression plasmids:
Volume Plasmid A = 0.75 μg × 4 / [Concentration of Plasmid A (μg/μL)]
Volume Plasmid B = 0.75 μg × 4 / [Concentration of Plasmid B (μg/μL)]
Volume H2O = 60 μL − (Volume Plasmid A + Volume Plasmid B)
(g) Filling Flasks with CHO media (see Subheading 3.3.2, step 1 for setup of Flask Dispenser). The GNF Systems Flask Dispenser is used to fill AutoFlasks with CHO media containing selective agent of choice, typically to a volume of 45 mL/flask. This can be done up to 2 days before the nucleofection, but it is usually done on the day prior. Prefilled flasks are kept in the GNF Systems Incubators set to 36.5 °C, 75% humidity and 10% CO2 until they are used, and are weighed on the day of nucleofection.
2. Nucleofection.
(a) Turn on the Nucleofector and set 96-well Shuttle to the appropriate program after selecting the wells that will contain cells and DNA.

(b) Calculate the appropriate volume of cells to be pelleted based on viable cell density of parental cells and the target number of nucleofections to be performed, with enough overage for ten additional transfections.
(c) Formula:
Volume of Culture to Pellet = 7.5 e6 cells × (Number of Transfections + 10) / Culture Viable Cell Density (cells/mL)
Example: for a culture at a viable cell density of 3.0 e6 cells/mL, for 96 transfections the calculation would be:
Volume of Culture to Pellet = 7.5 e6 cells × 106 / 3.0 e6 cells/mL = 265 mL
(d) Pellet calculated volume of cells via centrifugation at room temperature at 100 × g for 7 min in sterile 50 mL conical tube(s).
(e) Remove the supernatant in a biosafety cabinet using a sterile 2 mL aspirating pipette inserted into house vacuum line connected to a vacuum trap, taking care to leave the cell pellet intact.
(f) Add the appropriate amount of supplemented Nucleofector Solution (20 μL per reaction) to the cells and mix up and down 2 times using a 5 mL pipette. The volume of the cell pellet and any residual media left after aspiration contributes to the final total volume of this Cell + Solution Mix. Note the final volume of this mix and use it to calculate the volume that should be added to each well of DNA for nucleofection. To do this calculation, divide the total volume of Cell + Solution Mix by the total number of transfections to be done. For example, if your total volume of Cell + Solution Mix is 800 μL and you had planned for 20 reactions, your calculation would be: 800 μL / 20 reactions = 40 μL of Cell + Solution Mix would be added to each well of DNA.
(g) Pipet Cell + Solution Mix into a sterile multichannel reservoir.
(h) Using a 12-channel pipettor, transfer the volume of Cell + Solution Mix calculated in step (f) above to each well of the 96-well plate of pre-aliquoted DNA.
(i) Set a 12-channel pipettor to a volume large enough to account for a full transfer of DNA + Cell + Solution Mix (70 μL should be sufficient).
Transfer the total volume of the mixture to the Lonza 96-well shuttle plate, taking care not to pipette bubbles into the plate. Gently tap the bottom of the plate 3 times on a benchtop to remove any bubbles.
(j) When using the Amaxa 4D with the 96-well Shuttle attachment, the 96-well plate must contain a full plate of wells. If any of the 16-well strips are removed, the pulse administered by the instrument will not be uniform. If you have fewer than a full 96-well plate of samples to nucleofect, leave empty wells in the plate, but keep them covered to maintain sterility. Once the pulse is complete, you can remove empty wells in the biosafety cabinet and keep them sterile for later re-assembly into a new 96-well plate and use.
(k) Nucleofect the plate.
(l) After nucleofection, let the plate sit covered for 10 min before starting transfer of nucleofected cells into AutoFlasks on PEPP.
3. Transfer nucleofected cells into AutoFlasks on the Transfection Tecan. After nucleofection, the nucleofected plate is placed on the Transfection Tecan deck, where cells are transferred into AutoFlasks that have been prefilled with 45 mL selective media (Fig. 15). Each well of the 96-well plate is transferred into one AutoFlask, for 96 total flasks.
(a) Set up the Transfection Tecan:
– In a sterile biosafety cabinet, sterile-filter 250 mL of CHO media containing desired selective agent and antibiotic/antimycotic solution at a 1× concentration using a 250 mL filter unit. Place bottle on the Tecan deck.
– Make sure there are sufficient custom robotic tips on the deck for the transfer.
– De-lid the nucleofection plate and place it on the deck.
– The Tecan pulls 550 μL of selective media from the bottle and dispenses 150 μL into four wells of the nucleofection plate at a time, then mixes the contents of each well several times to make sure all cells are resuspended before being drawn into the Tecan tips (Fig. 15, Panel a).
– The Tecan repeats this process with a second tip for each well in an effort to recover as many cells as possible.
– The Tecan transfers entire volume from both tips into an AutoFlask. This is done for four AutoFlasks at a time (Fig. 15, Panel b). – AutoFlasks are placed back in the incubator.

Fig. 15 The Transfection Tecan resuspending CHO cells from the nucleofection plate (a) and transferring them into AutoFlasks (b)
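The arithmetic in steps 1(f), 2(c), and 2(f) above can be sketched in a few lines of Python. This is only an illustration of the stated formulas; the function and parameter names are hypothetical, and splitting the DNA mass equally between plasmids is taken from the 2-vector case described in the text.

```python
def plasmid_dilution_volumes(concentrations_ug_per_ul, total_dna_ug=1.5,
                             n_transfections=4, total_volume_ul=60.0):
    """Volumes (uL) of each plasmid stock and of water for one 60 uL dilution.

    The DNA mass per transfection is split equally between plasmids:
    one plasmid gets 1.5 ug, two plasmids get 0.75 ug each.
    """
    if not concentrations_ug_per_ul:
        raise ValueError("at least one plasmid concentration is required")
    ug_each = total_dna_ug / len(concentrations_ug_per_ul)
    plasmid_ul = [ug_each * n_transfections / c for c in concentrations_ug_per_ul]
    water_ul = total_volume_ul - sum(plasmid_ul)
    if water_ul < 0:
        raise ValueError("stock too dilute: plasmid volume exceeds total volume")
    return plasmid_ul, water_ul


def culture_volume_to_pellet_ml(n_transfections, viable_density_cells_per_ml,
                                cells_per_transfection=7.5e6, overage=10):
    """Culture volume (mL) to pellet, including overage for extra transfections."""
    total_cells = cells_per_transfection * (n_transfections + overage)
    return total_cells / viable_density_cells_per_ml


def mix_volume_per_well_ul(total_mix_ul, n_reactions):
    """Volume (uL) of Cell + Solution Mix to add to each well of DNA."""
    return total_mix_ul / n_reactions


if __name__ == "__main__":
    # Worked example from the text: 96 transfections at 3.0 e6 cells/mL.
    print(culture_volume_to_pellet_ml(96, 3.0e6))  # 265.0 mL
    # One plasmid at 0.5 ug/uL: 12 uL plasmid plus 48 uL water.
    print(plasmid_dilution_volumes([0.5]))
```

The `ValueError` on a negative water volume flags stocks that are too dilute to reach 100 ng/μL in 60 μL; the protocol's requirement of at least 200 ng/μL normally rules this out.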

3.4.3 Monitoring Pool Recovery from Selection

Stable pool establishment is a process that spans several weeks (see Note 8). Once the cells are transferred into selective media, they undergo a crisis. The viable cell density will be very low initially, and a pool is considered recovered once it returns to high viability and density (>90% viability and >1.0 e6 cells/mL). The cell density of each flask is monitored every morning using the Flask Density Reader (FDR), and these data are used to determine when a stable pool is ready for the downstream steps of cryo-archiving and protein production. It is important to establish an understanding of the maximal cell density at which your CHO strain can be maintained in your media of choice. Once a stable pool has recovered, it will typically double every 24 h. As the cell density increases, nutrients in the media can be depleted, resulting in a drop in viability. Understanding the behavior of your cells on PEPP and closely monitoring cell density with the FDR will allow you to avoid pool crisis. Once pools reach a density read of >2.0 e6 cells/mL according to the FDR data, they are ready for cryo-archive and protein production. If desired, pools can be sampled on the system at this point to measure cell viability using the Vi-CELL and to make sure that the pools are producing protein via Octet, flow cytometry, or other assay of choice.
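As a rough illustration of how daily FDR reads could be turned into a readiness list, the sketch below flags pools above the 2.0 e6 cells/mL threshold and projects how long a recovered pool needs to reach a target density. The function names and the dictionary input are hypothetical, and the projection assumes simple exponential growth with the ~24 h doubling time mentioned above; it does not apply to pools still in crisis.

```python
import math

READY_DENSITY = 2.0e6  # cells/mL; threshold for archive/production from the text


def pools_ready(fdr_reads, threshold=READY_DENSITY):
    """Return flask IDs whose latest FDR density exceeds the threshold."""
    return sorted(flask for flask, density in fdr_reads.items()
                  if density > threshold)


def days_to_density(current, target, doubling_time_h=24.0):
    """Days until `target` density is reached, assuming exponential growth."""
    if current <= 0 or target <= 0:
        raise ValueError("densities must be positive")
    if current >= target:
        return 0.0
    return math.log2(target / current) * doubling_time_h / 24.0


if __name__ == "__main__":
    reads = {"F1": 2.4e6, "F2": 1.1e6, "F3": 0.5e6}  # made-up example values
    print(pools_ready(reads))
    print(days_to_density(0.5e6, READY_DENSITY))
```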

3.4.4 Sampling AutoFlasks on the Transfection Tecan

If desired, a small volume of culture can be removed from AutoFlasks and transferred into 96-well plates and/or Vi-CELL vials using scripts written for the Transfection Tecan (Fig. 16). Up to 96 samples can be collected by the system without the need for an operator to be present. Sampling can be set up to finish early in the morning so that assays can be run as soon as the operator arrives.
1. Set up the Transfection Tecan deck.
(a) Fill 24-well tissue culture plates with the appropriate number of Vi-CELL vials. Up to four plates can be placed on the deck at a time (96 vials).
(b) Place one or two 96-well sample collection plates on the deck. Either one or two copies of samples can be transferred into these plates using the Tecan, enabling the user to generate identical plates ready for two independent assays if desired.
2. The robotic arm pulls an AutoFlask from the incubator, takes it to the GNF Systems Flask Sanitizer, and then places the AutoFlask in one of the four nests on the deck.
3. Once all four nests contain flasks, the sampling script on the Tecan is initiated.
4. The Tecan gently agitates each AutoFlask by shaking the nests in order to mitigate any cell settling and ensure that contents are uniformly mixed prior to sampling.
5. The Tecan LiHa arm picks up custom tips and withdraws 1 mL of culture from each flask. It dispenses 600 μL into a Vi-CELL vial and then dispenses 200 μL into each of two 96-well plates.

3.4.5 Archiving Stable CHO Pools

The Transfection Tecan is used to create four cryo-archived vials of each stable pool upon recovery from selection. It is desirable that each pool is at a viable cell density of at least 2.0 e6 cells/mL and >90% viability at the point of archive so that time to recovery from thaw is as short as possible. Since the GNF Systems Flask Density Reader cannot determine the viability of cultures, you may want to run the AutoFlask sampling process described in Subheading 3.4.4 prior to archive to make sure that the pools are >90% viable.

Fig. 16 The Transfection Tecan sampling AutoFlasks containing CHO pools (a) and transferring samples into Vi-CELL vials (b) and 96-well V-bottom plates, which are located behind the Vi-CELL vials on the deck

1. The robotic arm pulls an AutoFlask from the incubator, takes it to the GNF Systems Flask Sanitizer, and then places the AutoFlask in one of the four nests on the deck.
2. Once all four nests contain flasks, the archiving script on the Tecan is initiated.
3. The Tecan gently agitates each AutoFlask by shaking the nests.
4. The LiHa arm uses custom tips to aspirate four aliquots of 900 μL from each flask and to dispense the cells into four 1.4 mL 2D-barcoded tubes.
5. Once the Tecan has taken aliquots for all flasks to be archived, lid the LoBo rack of tubes manually and take it to a sterile biosafety cabinet. Add 100 μL sterile DMSO to the tubes using a 12-channel pipettor, and mix the contents of each tube by pipetting up and down twice.
6. Cap cryo-tubes and immediately place the rack at −80 °C.
7. Once frozen, move the cryo-tubes to the cryogenic storage freezer.

3.4.6 Protein Production from Stable CHO Pools

In order to produce protein from stable pools that have been established on PEPP, new “Daughter” flasks are seeded by transferring cells from the original “Mother” flask culture using the Transfection Tecan (see Note 9). Up to three Daughter flasks can be seeded from one Mother flask since there are four flask nests on the Transfection Tecan. At GNF, different workflows have been established for protein production. For the production of protein collections in CHO, typically one Daughter flask is seeded and purified on PEPP. For production of biotherapeutics and other proteins to support drug discovery projects, three Daughter flasks are typically seeded. These three Daughter flasks can be taken all the way through harvest and protein purification on PEPP and their products can be pooled post-purification to generate one large batch of protein. Alternatively, if a large amount of protein is desired, the three Daughter flasks can be maintained on PEPP and eventually taken offline and expanded to seed a fed-batch culture.
1. Prepare protein production media by adding antibiotic/antimycotic to a 1× concentration and sterile filtering in a biosafety cabinet.
2. Prefill appropriate number of AutoFlasks with 50 mL of protein production media using the GNF Systems Flask Dispenser.
3. Calculate seed volumes for Daughter flasks based on Mother flask FDR or Vi-CELL data.
(a) Daughter flasks are typically seeded at a density between 0.1 and 0.4 e6 cells/mL. The seed density is decided based on system availability for harvest (weekend harvests are typically avoided) and the projected cell density at which cultures will be in danger of falling below ~70% viability (see Note 10).
(b) Mother flask densities of over 2.0 e6 cells/mL are ideal so that inoculation volumes are low and only a minimal amount of selective media is transferred to the Daughter flasks.
(c) Formula: Seed Volume = Seed Density (cells/mL) × 50 mL / Mother Flask Cell Density (cells/mL). Example: for a Mother flask at a density of 2.0 e6 cells/mL and a seed density of 0.1 e6 cells/mL, Seed Volume = 0.1 e6 cells/mL × 50 mL / 2.0 e6 cells/mL = 2.5 mL.
4. Seed volumes are provided to the Transfection Tecan using a CSV file, and Daughter flasks are seeded from the Mother flasks accordingly, utilizing Runtime to bring Mother/Daughter pairs out in sequence as the Tecan works its way through the seed volume CSV file.
5. Daughter flask cell densities are monitored daily during protein production using the FDR to help determine the timing of harvest. See Fig. 17 for example FDR growth curves for a set of Daughter flasks.

3.5 Harvest and Protein Purification
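Before moving on to harvest, the Daughter flask seed-volume formula from Subheading 3.4.6, step 3(c) and the CSV hand-off in step 4 can be sketched as follows. The CSV layout expected by the Transfection Tecan is not described in the text, so the column names and two-decimal formatting here are purely hypothetical placeholders.

```python
import csv
import io


def seed_volume_ml(seed_density, mother_density, daughter_volume_ml=50.0):
    """Seed Volume = Seed Density x Daughter volume / Mother Flask Cell Density."""
    if mother_density <= 0:
        raise ValueError("mother flask density must be positive")
    return seed_density * daughter_volume_ml / mother_density


def seed_volume_csv(mother_densities, seed_density):
    """Build CSV text with one row per Mother flask (hypothetical column names)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["mother_flask", "seed_volume_ml"])
    for flask_id, density in mother_densities.items():
        volume = seed_volume_ml(seed_density, density)
        writer.writerow([flask_id, f"{volume:.2f}"])
    return buffer.getvalue()


if __name__ == "__main__":
    # Worked example from the text: 2.0 e6 cells/mL Mother flask, 0.1 e6 cells/mL seed.
    print(seed_volume_ml(0.1e6, 2.0e6))  # 2.5 mL
    print(seed_volume_csv({"M001": 2.0e6, "M002": 4.0e6}, 0.1e6))
```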

3.5.1 Harvest

In this process, AutoFlasks are centrifuged to pellet cells inside the flask cell pocket and then protein-containing cell supernatants are withdrawn through the septum on the Harvester. The Harvester and Purification Tecan share a platform that is fitted with sliders (Fig. 18). Once a column containing eight 50 mL conical tubes has been filled with harvested supernatants, it is automatically slid across a rail into position in the Purification Tecan for purification of protein from the supernatants by gravity-fed affinity chromatography. Purified protein is then buffer exchanged and aliquoted using the NAP-10 Tecan (see Note 11). 1. Set up collection tubes for harvest. (a) Move all the sliders to the Supernatant Collector side using the method in the PDCapp. (b) Load the 12 sliding racks shared by the Supernatant Collector portion of the Harvester and Purification Tecan with clean cap-less 50 mL conical tubes.

Fig. 17 Example CHO pool growth curves generated by measuring transmitted light on the GNF Systems Flask Density Reader and correlating readings to a standard curve derived from known cell densities (x-axis: Day, 1–10; y-axis: Cell Density, e6 cells/mL, 0–12)

Fig. 18 The Purification Tecan and the Supernatant Collector portion of the Harvester share a platform. Sliders automatically move from the Supernatant Collector side over to the Purification Tecan

2. The robot transfers four flasks at a time into the GNF Systems centrifuge.
3. The centrifuge spins the flasks at 1500 × g for 5 min. See Fig. 19 for pictures of 293-F cultures before and after centrifugation.

Fig. 19 Examples of AutoFlasks containing 293-F cells before and after pelleting in the GNF Systems Centrifuge

4. Flasks are then transferred from the centrifuge directly to the GNF Systems harvesting station.
5. On the Harvester, a set of four tips pierces four AutoFlask septums and the clarified supernatants are pumped into recipient 50 mL conical tubes seated on the Supernatant Collector. This process repeats four AutoFlasks at a time for up to 96 AutoFlasks per run. The Harvester tubing and tips are flushed with sterile Dulbecco’s PBS plumbed to the system from a 200 L barrel in between each set of AutoFlasks to minimize cross-contamination.
6. The 50 mL conical tubes on the Supernatant Collector are arrayed similarly to a 96-well plate, with 8 rows and 12 columns. After each column is filled, that slider automatically slides to the purification side.

3.5.2 Purification of His6-tagged Proteins

1. Prepare 0.25 N NaOH.
2. Make stock solutions. Sterile filter all stocks in a biosafety cabinet through a 0.2 μm filtration unit. Store at 4 °C.
(a) Prepare Imidazole 2 M, pH 7.4.
(b) Check endotoxin level of the stock solution prior to use using a LAL assay such as the Kinetic-QCL™ Kinetic Chromogenic LAL Assay (Lonza #50-650U).
(c) Prepare 20× Secretomics buffer: 3 M NaCl, 200 mM Tris–HCl.
(d) Prepare 1× Tris-buffered saline buffer. Dilute 100 mL 10× TBS in 900 mL Cell Culture Grade H2O.
(e) Prepare Wash buffer: 20 mM Tris–HCl, 150 mM NaCl, 20 mM imidazole.

Fig. 20 Setup of the Purification Tecan deck. Not pictured are a waste tray under the columns that can move in and out, as well as four 24-well capture plates for the eluates kept in a chilled nest beneath the tray

(f) Prepare Elution Buffer: 1× TBS, 250 mM imidazole in Cell Culture Grade H2O.
3. Set up the Purification Tecan deck (see Fig. 20).
(a) Pre-label four 24-well collection plates to maintain orientation and place them on the chilled elution collection platform. For antibody purifications, 100 μL of Tris–HCl pH 8.0 is added to each well of the plate to neutralize the pH of the eluate after purification and prior to desalting. The waste tray needs to be pulled out so the user can put the 24 deep-well blocks in place, then the user slides the waste tray in.
(b) Place an appropriate number of clean BioRad chromatography columns in the GNF Systems white plastic holders. Each holder contains 24 purification columns and the holders are arrayed in a format that resembles a 96-well plate. Each column in the holder (set of 8 chromatography columns) must be complete, so add empty chromatography columns accordingly.
(c) Place the filled holders on the Tecan worktable. The script starts out with a waste tray beneath the holders/columns and 24-well capture plates in a chilled (4 °C) nest below the waste tray (see Note 12).
(d) Prepare the Tecan work deck with the correct buffers in the correct positions. For the Tecan purification program, the maximum capacity of samples that can be purified requires more wash buffer and PBS/TBS buffer than the troughs can hold, so automated filling troughs are used.

– Before setting the final clean troughs on the deck, place a used or “throw-away” trough under the automated filler tips.
– Manually run 0.25 N NaOH or 10% bleach through both lines for 60 s to sterilize the fluid path.
– Run sterile PBS through both lines for another 60 s.
– Dispose of the used trough and replace with a new trough under each automated filler tip on the deck.
(e) Make sure the deck is stocked with sufficient conductive Tecan tips for the run, approximately 8 racks of 96 tips per 96 purifications.
(f) On the Tecan, pour 100% EtOH into a robotic reservoir. The Tecan pre-wets columns 8 at a time with 100 μL of 100% EtOH.
(g) Dispose of remaining 100% EtOH properly. Pour approximately 20 mL of resin into the same reservoir. Place the reservoir on the Tecan, propping one end up with a 50 mL conical tube cap to maximize the volume available to the LiHa tips.
(h) In between filling each set of 8 chromatography columns, the Tecan script will pause to prompt the operator to mix the resin slurry. This is done to counteract settling and to ensure that a homogeneous mixture of beads is being loaded into the columns. Mixing is most simply accomplished by pouring the remaining resin slurry from the angled trough back into the resin stock vessel, then pouring approximately 20 mL of resin slurry back into the angled trough on the deck before clicking “OK” on the Tecan prompt.
4. Ni-NTA purification (Secretomics workflow) on the Purification Tecan. The protocols below describe typical purification procedures and volumes used at GNF, but these can be defined and customized by the user.
(a) The Ni-NTA resin is stored in a 25% slurry. The Tecan loads each column with 800 μL of resin, yielding a bed volume of 200 μL.
(b) 20× Secretomics buffer (1800 μL) is dispensed into the 50 mL conical tubes containing the supernatants to adjust the salt concentration of each sample prior to purification.
(c) The Tecan loads 13 mL of the supernatants onto 8 chromatography columns at a time, swaps out the LiHa tips, then moves on to the next set of 8 columns. It runs through a set of 48 columns, then loops back to the beginning of the set and repeats the loading of the columns until all the supernatant is loaded. The time it takes to load the 48 columns is enough time for the 13 mL to run through the columns, so when the Tecan loops back to the beginning of the set all of the supernatant will typically have run through.
(d) The purification script has a clog detection method built in. If the conductive Tecan tips sense that liquid has not dripped through a column, the Tecan will wait 15 min for the clog to clear, then retest. If the clog remains, it will wait another 15 min. If the clog persists, the Tecan will place a tip near the resin in the chromatography column and conduct a forceful mix step to disperse the packed resin to attempt to restore flow. Clogs are tracked in a separate log for the operator to review after purification runs have completed.
(e) The Tecan then dispenses 2 mL (10 column volumes) of Wash Buffer onto the columns (see Note 13).
(f) The Tecan dispenses 50 μL of Elution Buffer as a pre-elution step designed to maximize the purity and concentration of the eluate.
(g) Following the pre-elution step, the waste tray slides out from under the columns and the chilled 24-well capture plates are raised up so that they line up directly under the 96 columns.
(h) Elution buffer (950 μL) is dispensed onto the columns.
(i) Each well of a 24-well plate is then subjected to the desalting protocol in Subheading 3.6.

3.5.3 Purification of IgG and Fc-Fusion Proteins

This protocol is identical to the Ni-NTA purification, with a few key modifications: 1. Nothing needs to be added to the cell supernatants prior to purification. 2. Phosphate-buffered saline (PBS) is used in place of both the TBS and Wash buffers. 3. A larger volume of 25% slurry resin is used: 2400 μL resin is used for a final bed volume of 600 μL. MabSelect™ SuRe™ resin is used for purification of antibodies and Fc-fusion proteins of isotype hIgG1 and mIgG2a. Protein G resin is used for purification of antibodies and Fc-fusion proteins of isotype mIgG1, and also for purification of Fabs. 4. The Wash volume is increased to 6 mL to retain the 10 column volume ratio. 5. Pierce™ IgG Elution Buffer is used instead of imidazole-containing Elution Buffer.

6. The pre-elution volume is increased to 300 μL in order to compensate for a larger resin bed volume.
7. The elution volume is 900 μL.

3.6 Buffer Exchange and Protein Aliquoting

The purpose of this final step is to remove salts left over from the purification process from protein samples through the use of NAP-10 desalting or buffer exchange columns. Depending on the desired final formulation buffer for your proteins, the NAP-10 Tecan buffer line can be plumbed to either the normal Secretomics TBS stock or to the same stock 1× PBS used in the purification of antibodies and Fc-fusion proteins.

Fig. 21 The NAP-10 Tecan deck. Not pictured are a waste tray under the columns that can move in and out and four 24-well capture plates that sit below the tray. Once desalting is complete, the NAP-10 columns must be removed, then the Tecan can consolidate proteins from the 24-well capture plates into the 96 deep-well plate and then aliquot them into the 96-well optical plate and 2D-barcoded tubes

1. Set up the NAP-10 Tecan (Fig. 21).
(a) Retract the waste tray using the batch file.
(b) Place four 24-well plates on the lower plate lifter.
(c) Slide waste tray back into position.
(d) Place the white NAP-10 holders in the designated position over the waste tray.
(e) Remove fresh NAP-10 columns from their plastic shipping trays, remove the cap from the top of each column and the plug from the bottom before placing the column in one hole of the white plastic holding trays (see Note 14).
(f) Once an appropriate number of NAP-10 columns have been positioned in the white holding trays, place each of the holding trays loaded with columns into position above the waste slider. Note that the upper left position is “A1.”
(g) Load conductive Tecan tips into tip racks.
(h) Place one 96-well optical plate and four 2D LoBo racks on the deck.
(i) Load the four purified 24-well protein catch plates onto the deck.
(j) Wash NAP-10 Tecan lines with 0.25 N NaOH and then plumb the appropriate elution buffer to the line and flush with elution buffer.
(k) Place a 300 mL robotic reservoir on the deck and fill with elution buffer.
2. Desalt Samples.
(a) The Tecan equilibrates NAP-10 columns by filling them completely with elution buffer three times. Buffer goes to waste.
(b) Contents of the 24-well catch plate from the Purification Tecan are loaded on the NAP-10 columns.
(c) The Tecan slides the waste tray away and slides the 96 deep-well eluate catch plate into place.
(d) NAP-10 columns are eluted with 1500 μL elution buffer into the 96 deep-well plate.
3. Protein transfer to 2D tubes. The procedure described here is one in routine use at GNF, but steps and volumes can be customized by the user. If one Daughter protein production flask was seeded for a pool, the eluate is aliquoted directly into 2D tubes. If three Daughter protein production flasks were seeded, the three corresponding NAP-10 eluates are pooled into one or two 24 deep-well plates (depending on the number of samples) on the NAP-10 Tecan prior to aliquoting. From the 24 deep-well pooled plate or 96 deep-well NAP-10 eluate catch plate, five aliquots of protein are made.
(a) The Tecan prompts the operator to input a dilution factor for protein to be transferred into the 96-well optical plate for protein quantitation (if needed) and the desired final volume in the optical plate.
– For 293-F proteins, no dilution is made and an aliquot of 200 μL is transferred directly from the 96 deep-well NAP-10 eluate catch plate.
– For CHO-derived proteins, a fivefold dilution is typically made.
An aliquot of 44 μL is taken from the 96- or 24 deep-well plate and diluted into 176 μL of PBS for a final volume of 220 μL.

(b) Four aliquots of each protein are transferred to 2D-barcoded screw-capped tubes in barcoded LoBo racks:
– 293-F-derived proteins are stored in four 0.75 mL tubes at a volume of 320 μL/tube.
– CHO-derived proteins from one Daughter flask are stored in 0.75 mL tubes at a volume of 350 μL/tube.
– CHO-derived proteins from three Daughter flasks are stored in 1.4 mL tubes at a volume of 1.04 mL/tube.
(c) 2D tube aliquots are capped, tube barcodes and plate labels are scanned, and data are entered into a protein database.

3.7 Protein Quantitation

Once proteins are desalted into their final formulation buffers, their concentrations in solution need to be determined. One way to do this is to use a SpectraMax Plus 384 Microplate Reader to measure the absorbance at 280 nm for each sample and then convert this reading to concentration using each protein’s molar extinction coefficient, which is defined here as the A280 of a 1 mg/mL solution of the protein and is calculated from its amino acid composition. For more information on molar extinction coefficients and how to calculate them for your proteins, please refer to [13]. The minimum sample volume for the SpectraMax device to read accurately is approximately 100 μL. Other quantitation methods can also be used, such as the Micro BCA™ Protein Assay Kit (Thermo Fisher Scientific #23235).
1. From the original 96-well optical plate, for CHO-derived proteins a 1:5 dilution of each protein sample (100 μL final volume) is made in elution buffer (PBS or TBS) (20 μL protein into 80 μL buffer) by hand into a second 96-well optical plate. Proteins derived from 293-F are read neat.
2. For background correction, all wells of a 96-well optical plate are filled with 200 μL PBS and the plate is read using the built-in SoftMax Pro program “Protein Quant” on the SpectraMax. The average read across the plate is calculated and entered in the “Water Constant” drop-down for background subtraction.
3. Absorbance of each sample at 280 nm in the protein dilution optical plate is read using the “Protein Quant” program.
4. Protein concentrations are then calculated using the following formula:
Concentration (mg/mL) = (A280 × 5) / Molar Extinction Coefficient
The factor of 5 corrects for the 1:5 dilution of CHO-derived samples; for samples read neat, the dilution factor is 1.
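The conversion in step 4 can be written as a small helper. This is a minimal sketch: the function and parameter names are hypothetical, the A280 value is assumed to be background-corrected already (as done by the plate reader software in step 2), and the coefficient is the A280 of a 1 mg/mL solution as defined in the text.

```python
def concentration_mg_per_ml(a280, ext_coeff_1mg_ml, dilution_factor=5.0):
    """Protein concentration (mg/mL) from a background-corrected A280 reading.

    ext_coeff_1mg_ml: A280 of a 1 mg/mL solution of the protein.
    dilution_factor: 5 for the 1:5 dilution of CHO-derived samples,
                     1 for samples read neat (e.g., 293-F-derived proteins).
    """
    if ext_coeff_1mg_ml <= 0:
        raise ValueError("extinction coefficient must be positive")
    if dilution_factor <= 0:
        raise ValueError("dilution factor must be positive")
    return a280 * dilution_factor / ext_coeff_1mg_ml


if __name__ == "__main__":
    # Made-up reading: A280 of 0.28 after 1:5 dilution, coefficient 1.4.
    print(round(concentration_mg_per_ml(0.28, 1.4), 3))
    # The same protein read neat with A280 of 0.7.
    print(round(concentration_mg_per_ml(0.7, 1.4, dilution_factor=1.0), 3))
```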

Depending on the intended use of proteins produced on PEPP, you may wish to subject them to a number of analytical methods; the protein in the 96-well optical plate is reserved for this purpose. At GNF, LAL testing, analytical SEC, and capillary electrophoresis are standard. Intact mass spectrometry is also frequently performed to verify protein identity. If mass spectrometry is required, a 20 μL aliquot of protein from the 96-well optical plate can be taken and normalized to 1 mg/mL using the NAP-10 Tecan.

To summarize, high-throughput (HT) workflows for transient protein expression in HEK cells and stable pool generation and protein production in CHO cells have been established and are in routine use at GNF. Typical applications of HT-HEK are the expression of antibody libraries for lead optimization, as well as rapid production of some non-antibody protein collections. Typical applications for HT-CHO are building enabling protein collections and protein production for support of drug discovery programs. One major hurdle that we have encountered is that demand for high-throughput protein expression on PEPP has scaled dramatically over the years. In order to handle increased demand, we have needed to conceive of innovative solutions to help coordinate and track activities on the system. Most of these solutions are software-based. We are currently finalizing the build-out of a custom Dashboard that will make day-to-day operation of the HEK transient workflow much easier for the operator by simplifying how standard HT-HEK processes are executed and tracked on PEPP. In addition, the Dashboard will simplify the HT-CHO workflow by automatically executing daily FDR reads and sorting flasks within projects according to cell density to allow the operator to easily decide when to passage, archive, and harvest flasks. One challenge specific to the CHO workflow is that it is difficult to stay apprised of the status of each pool during the multi-week stable pool establishment and protein production processes.
The integration of the FDR onto the system has enabled us to capture flask density data in a single location without physically removing cells from the flasks, and next we plan to build tools to enable interested parties to visualize graphed real-time FDR data so they can track progress of their pools. Further information on GNF Systems custom platforms and standalone devices can be found by visiting www.gnfsystems.com or by contacting [email protected]. For commercial inquiries regarding GNF Systems equipment, please email [email protected].

4 Notes

1. AutoFlasks are sometimes taken offline for downstream steps. Flasks containing CHO pools may be taken offline for scale-up and fed-batch cell culture protocols, and sometimes 293-F or CHO supernatants need to be harvested and purified manually to facilitate custom purification strategies.
2. When establishing DNA preparation methods, it is a good idea to include gel electrophoresis of DNA as a quality control step. This will allow you to check for DNA degradation, contaminating RNA, etc. A quick way to do this is to run a dilution of 1 μL DNA in 9 μL water on a Thermo Fisher Precast Agarose Gel. See Ref. [14] for a gel selection guide.
3. The QIAGEN® QIAprep Spin Miniprep Kit has also consistently given good yields and high-quality DNA in our hands.
4. For optimal miniprep and maxiprep yields, bacterial colonies used to inoculate starter cultures for DNA preps need to be fresh (within ~2 days of re-transformation or streak from a glycerol stock). Plates containing bacterial colonies need to be stored at 4 °C.
5. Once the 96-well bacterial plate has been pelleted and media removed, you can store the plate at −20 °C until you are ready to perform the Macherey-Nagel miniprep procedure.
6. The protocol described in the text allows approximately 5 min for complex formation. Different transfection reagents require different lengths of time to form complexes with DNA. The order and/or time of events can be modified in the Transfection Tecan methods to adjust the length of time allotted for complex formation in each well.
7. You will need to optimize the conditions for nucleofection of your CHO strain of choice. Lonza sells three different 4D-Nucleofector X Solutions: SG, SF, and SE; we have found buffers SG and SF to be best for our CHO strains. In addition, there are myriad possible Nucleofection programs available on the 96-well Shuttle. We recommend you conduct experiments to identify a program/Nucleofector Solution combination that works best for your CHO strain. Refer to Refs.
[15, 16] for Lonza’s recommended optimization protocols; briefly, a GFP expression vector is used in a matrix of program/Nucleofector Solution combinations on the Shuttle, and transfection efficiency and cell health are determined by flow cytometry approximately 24 h later. Once you have identified the program and solution you will use, you will need to identify an appropriate number of cells and amount of DNA to use in the nucleofection. We typically use 7.5 e6 cells and 1.5 μg DNA, but the optimal amount may vary with CHO strain, vector, and selection strategy used. For this optimization experiment, we would recommend using a GFP expression vector containing your selection marker(s) of choice. Set up a 96-well Nucleofector plate such

High-Throughput Protein Expression in Mammalian Cells

141

that DNA amount is titrated down the plate and the number of cells are titrated across the plate. We recommend taking this nucleofection plate all the way through the stable pool establishment process so that the time to pool establishment can be compared across different ratios. Also, at the end of the experiment, pool quality can be visualized via flow cytometry. If conditions are appropriately dialed in, at least 98% of cells in a GFP-expressing pool should be GFP-positive when the pool recovers from selection. 8. With the CHO platform used at GNF, the cells do not need to be given new media during the stable pool establishment process. However, flasks can be fed with additional media using the GNF Systems Flask Dispenser if needed in different CHO platforms. Protocols have been established for washing the Flask Dispenser tip between flasks to eliminate cell carryover. 9. The presence of selective agents during protein production may inhibit pool production. We recommend omitting these agents from production media. 10. It is important to not let CHO protein production cultures drop below ~70% viability as this will lead to looser, more dispersed pellets and a higher likelihood that cell debris will end up in the harvested supernatant and lead to column clogging. 11. With the exception of Tecan system tubing, all tubing used during the harvest, purification and desalting processes is sterilized by flushing with 0.25 N NaOH (made by 1:4 dilution of the stock solution in water) and then neutralized by flushing with buffer immediately before the run. 12. The 24-well plates are placed on a chiller nest block. The block has inlet and outlet tubing running to a chiller and circulates a water/glycerol mixture to keep the block cold. The chiller block enables purification of two runs of up to 96 samples each in a day, where the second run will proceed into the night and the samples are kept cold until they can be run through the desalting process the next day. 13. 
The wash buffer trough is monitored using the conductive tips on the Tecan. When the liquid level in the trough drops below a given threshold a subroutine is executed that turns on a pump, which will fill the trough. During this fill, the Tecan will continue checking the trough every 5 s until it crosses another threshold specified in the subroutine, which signals the trough is full. 14. Some NAP-10 columns dry out during shipping. Be sure to check every column and ensure there is shipping buffer still in the sealed column. Discard dry columns.
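The two-threshold refill logic in Note 13 can be illustrated with a small simulation. This is only a sketch of the idea, not the Tecan subroutine itself: the function name, the level units, and the threshold values are hypothetical, and each reading stands in for one 5 s check.

```python
def refill_states(levels, low, high):
    """Return the pump state (True = on) after each level reading.

    The pump switches on when a reading drops below `low` and stays on
    until a reading reaches `high` (two-threshold hysteresis).
    """
    pump_on = False
    states = []
    for level in levels:
        if not pump_on and level < low:
            pump_on = True       # trough is low: start filling
        elif pump_on and level >= high:
            pump_on = False      # trough is full: stop filling
        states.append(pump_on)
    return states


# Hypothetical readings (arbitrary units), one per periodic check.
readings = [80, 45, 30, 40, 60, 95, 90]
states = refill_states(readings, low=35, high=90)
print(states)
```

Using two thresholds rather than one keeps the pump from switching on and off rapidly when the level hovers near a single set point.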


Acknowledgements

The authors would like to acknowledge Salvatore Fanale, Analisa Benedetto, Heath Klock, Julie Vance, Melisa Low, Sarah Cox, Daniel McMullan, Mark Knuth, Marc Gustafson, Jim Chang, Marie Smith, Daniel Sipes, and James Mainquist for their contributions to the protocols and technologies described in this chapter.

References

1. Geisse S (2009) Reflections on more than 10 years of TGE approaches. Protein Expr Purif 64(2):99–107. https://doi.org/10.1016/j.pep.2008.10.017
2. Kim JY, Kim YG, Lee GM (2012) CHO cells in biotechnology for production of recombinant proteins: current state and further potential. Appl Microbiol Biotechnol 93(3):917–930. https://doi.org/10.1007/s00253-011-3758-5
3. Dalton AC, Barton WA (2014) Overexpression of secreted proteins from mammalian cell lines. Protein Sci 23(5):517–525. https://doi.org/10.1002/pro.2439
4. Suen KF, Turner MS, Gao F, Liu B, Althage A, Slavin A, Ou W, Zuo E, Eckart M, Ogawa T, Yamada M, Tuntland T, Harris JL, Trauger JW (2010) Transient expression of an IL-23R extracellular domain Fc fusion protein in CHO vs. HEK cells results in improved plasma exposure. Protein Expr Purif 71(1):96–102. https://doi.org/10.1016/j.pep.2009.12.015
5. Croset A, Delafosse L, Gaudry JP, Arod C, Glez L, Losberger C, Begue D, Krstanovic A, Robert F, Vilbois F, Chevalet L, Antonsson B (2012) Differences in the glycosylation of recombinant proteins expressed in HEK and CHO cells. J Biotechnol 161(3):336–348. https://doi.org/10.1016/j.jbiotec.2012.06.038
6. Bandaranayake AD, Almo SC (2014) Recent advances in mammalian protein production. FEBS Lett 588(2):253–260. https://doi.org/10.1016/j.febslet.2013.11.035
7. Bos AB, Luan P, Duque JN, Reilly D, Harms PD, Wong AW (2015) Optimization and automation of an end-to-end high throughput microscale transient protein production process. Biotechnol Bioeng 112(9):1832–1842. https://doi.org/10.1002/bit.25601
8. Gonzalez R, Jennings LL, Knuth M, Orth AP, Klock HE, Ou W, Feuerhelm J, Hull MV, Koesema E, Wang Y, Zhang J, Wu C, Cho CY, Su AI, Batalov S, Chen H, Johnson K, Laffitte B, Nguyen DG, Snyder EY, Schultz PG, Harris JL, Lesley SA (2010) Screening the mammalian extracellular proteome for regulators of embryonic human stem cell pluripotency. Proc Natl Acad Sci U S A 107(8):3552–3557. https://doi.org/10.1073/pnas.0914019107
9. Rue SM, Anderson PW, Miller JM, Fanale SG, Chang JY, Glaser SM, Lesley SA (2018) Mammalian cell culture density determination using a laser through-beam sensor. BioTechniques 65(4):224–226. https://doi.org/10.2144/btn-2018-0059
10. Plasmid DNA Purification User Manual (2015) Macherey-Nagel. http://www.mn-net.com/Portals/8/attachments/Redakteure_Bio/Protocols/Plasmid%20DNA%20Purification/UM_pDNA_NS96.pdf. Accessed 22 Mar 2017
11. Application Notes-Tecan (2017) Macherey-Nagel. http://www.mn-net.com/tabid/12425/default.aspx. Accessed 11 Apr 2017
12. FreeStyle 293 Expression System (2010) Invitrogen. https://tools.thermofisher.com/content/sfs/manuals/freestyle293_system_man.pdf. Accessed 11 Mar 2017
13. TR0006.4 Extinction Coefficients (2013) Thermo Fisher Scientific. https://tools.thermofisher.com/content/sfs/brochures/TR0006-Extinction-coefficients.pdf. Accessed 10 Mar 2017
14. E-Gel Precast Agarose Gels. Thermo Fisher Scientific. https://www.thermofisher.com/us/en/home/life-science/dna-rna-purification-analysis/nucleic-acid-gel-electrophoresis/e-gel-electrophoresis-system/e-gel-precast-agarose-gels.html. Accessed 22 Mar 2017
15. Nucleofector Protocols (2012) Lonza. http://bio.lonza.com/resources/productinstructions/protocols. Accessed 22 Mar 2017
16. Amaxa™ 4D-Nucleofector™ Optimization Protocol for Cell Lines For 4D-Nucleofector™ X Unit–Transfection in suspension (2010) Lonza. http://www.lonzabio.jp/catalog/pdf/ri/I548.pdf. Accessed 13 Mar 2017

Chapter 6

A High-Throughput Automated Protein Folding System

Kenneth W. Walker, Philip An, and Dwight Winters

Abstract

In vitro protein folding can be employed to produce complex proteins expressed as insoluble inclusion bodies in E. coli from laboratory to commercial scale. Often the most challenging step is identification of renaturation conditions that will enable the denatured protein to form the native structure at an acceptable yield. Generally this requires screening a matrix of buffers and stabilizers to find an appropriate solution. Herein, we describe an automated and quantitative method to identify optimal in vitro protein folding parameters with a high rate of success.

Key words High throughput, Automation, Protein folding, Screen, Liquid handling robot, Microfluidic capillary electrophoresis, E. coli

1 Introduction

Recombinant protein production can be accomplished through several expression systems including, but not limited to, mammalian, insect, and prokaryotic cells [1–3]. Mammalian expression systems possess the necessary machinery to produce complex soluble proteins, often with the correct native structure; therefore, they are frequently employed to produce secreted and transmembrane proteins; however, they do require substantial time and expertise to develop. Insect cells are frequently used to produce intracellular and some transmembrane proteins, but are rarely used to produce secreted proteins, and they require similar skills and timelines as mammalian cells. Prokaryotes, particularly E. coli, are often used to produce intracellular and extracellular proteins due to their speed of development and ease of use. Although E. coli can produce intracellular proteins with native structure, extracellular proteins are

Electronic supplementary material: The online version of this chapter (https://doi.org/10.1007/978-1-4939-9624-7_6) contains supplementary material, which is available to authorized users. Kenneth W. Walker and Philip An contributed equally to this work. Renaud Vincentelli (ed.), High-Throughput Protein Production and Purification: Methods and Protocols, Methods in Molecular Biology, vol. 2025, https://doi.org/10.1007/978-1-4939-9624-7_6, © Springer Science+Business Media, LLC, part of Springer Nature 2019


predominantly expressed as insoluble inclusion bodies, particularly if they require disulfide bonds, which will not form in the reducing environment of the cytosol [4]. E. coli offers some substantial advantages over mammalian or insect cell expression systems in addition to speed. For example, inclusion bodies represent up to 26% of the host cell mass and are up to 90% pure recombinant protein; however, since proteins in inclusion bodies are not in their native conformation, in vitro protein folding is generally required to obtain the desired product [5–10]. The first step in protein folding generally requires complete denaturation and solubilization of the inclusion bodies followed by transfer of the reduced denatured protein to renaturing conditions conducive to formation of the native structure [11, 12]. Several general methods exist for accomplishing this, including buffer exchange folding, on-column folding, and dilution folding [9, 13–20]. Since dilution folding is the most compatible with high-throughput methods, this is the method we most frequently employ. In addition, there are two major methodologies to screen for optimal renaturation conditions: sparse matrix systems, which screen a wide array of loosely related conditions, and dense matrix systems, which deeply probe a narrower but complete array of related conditions [19, 21, 22]. While both systems have been demonstrated to enable in vitro protein folding, we generally employ the dense matrix method first, since it has a high rate of success, and it is a powerful method for identifying relationships that can be used to home in on optimal folding conditions rapidly. The matrix for this method includes compounds that often aid in protein folding, and in the combinations described herein have demonstrated a high rate of success in our hands.
Since the probability of identifying a good folding condition is directly related to the size of the screening matrix, it is generally desirable to have large matrices; however, these can be labor-intensive to assemble, particularly since some of the components must be prepared fresh. In order to address this, we use a liquid handling robot to assemble a 96-condition primary screening matrix. Since most in vitro folded proteins require disulfide bonds in their native state, differential migration in nonreducing electrophoresis is generally employed to assess renaturation success due to its speed and sensitivity. While traditional protein folding screens generally employ SDS-PAGE or other labor-intensive methods, we prefer microcapillary electrophoresis (MCE) for the analysis due to its ability to rapidly and quantitatively assess the samples [13, 15, 23, 24]. In addition, we employ a liquid handling robot again to eliminate the labor-intensive sample preparation step required for the analytical method. Although automation increases throughput, greatly reduces labor requirements, and increases accuracy and precision, it is not required to employ this system, and alternatives not requiring a liquid handling robot or MCE instrument are also


described herein [25]. For proteins that do not contain disulfide bonds that result in a migration shift on MCE or SDS-PAGE, alternative assay methods, such as a functional or biophysical assay, must be employed. The quantitative data generated by the MCE assessment can be utilized to apply a rigorous well defined analysis to identify the best candidate conditions [25]. Since it is difficult to fully assess the quality of a folding condition at small scale, the lead conditions are then scaled up for a more detailed assessment that enables selection of the optimal in vitro folding condition for large-scale production of the desired product. Taken together, the system described herein provides a comprehensive method for rapid screening and identification of optimal in vitro folding conditions for a wide variety of proteins (see Fig. 1).

Fig. 1 Overview of the protein folding process, which begins with an automated small-scale folding screen, followed by an intermediate folding screen on a subset of conditions, and finally followed by full-scale production of the desired protein. DWIB double-washed inclusion bodies, MCE microcapillary electrophoresis, SEC size exclusion chromatography

2 Materials

For some of the steps, equipment and consumables are similar and are, for simplicity, described only once in the Materials section.

2.1 Inclusion Body Solubilization and Wash

The solubilized inclusion body preparations should be used immediately; however, solubilized inclusion bodies can be frozen. The thawed mixture may contain precipitates, which may need more reducing agent to dissolve.

Reagents

1. Cell paste: E. coli cell culture pellet obtained through centrifugation at 5000 RCF for 30 min.
2. Lysate buffer: At least five times the volume of cell paste, by weight, of 50 mM Tris, 5 mM EDTA, pH 8.0.
3. Solubilization buffer: 8 M guanidine hydrochloride solution (GuHCl).
4. Reducing agent: solid DL-dithiothreitol (DTT).

Equipment and Consumables

1. Standing homogenizer: Ultra-Turrax T 25 (IKA-Works, Staufen im Breisgau, Germany).
2. Microfluidizer: M-110S (Microfluidics Corporation, Westwood, Massachusetts, USA). In the absence of a microfluidizer, see Note 1.
3. Handheld homogenizer: Omni TH (Omni International, Kennesaw, Georgia, USA).

2.2 Small-Scale Folding Screen Automation Setup

Equipment and Consumables

1. Biomek 3000 liquid handling robot (Beckman Coulter, Brea, California, USA).
2. Four Thermoshake units (Inheco, Martinsried, Germany).
3. Multi TEC Control (Inheco, Martinsried, Germany).
4. Four 24-well cell culture plates (Corning Life Sciences).
5. A Biomek tool caddy containing one of each of the following pipetting tools (Beckman Coulter):
(a) P20.
(b) P250.
(c) P1000.
6. One box of AP96 P20 pipet tips (Beckman Coulter).
7. One box of AP96 P250 pipet tips (Beckman Coulter).
8. One box of P1000 pipet tips (Beckman Coulter).
9. Two round-bottom 24-well plates (Whatman).


Fig. 2 Biomek 3000 liquid handling robot work deck equipped with eight Labware brackets and four Thermoshakes (SH1–SH4). SHC identifies the Multi TEC Control unit, Tools identifies the tool caddy containing the pipetting tools, B2 and B5 are 24 well blocks with buffer components, B3 is a sectional reservoir frame with buffer components and B4 is an inverted P1000 pipet box lid to serve as a large reservoir. P1000, P250, and P20 identify the location of the pipette tips of 1000, 250, and 20 μl capacity, respectively

10. Sectional reservoir frame (Beckman Coulter Prod #372795).
11. Four sectional reservoir inserts (Beckman Coulter Prod #372790).
12. One inverted P1000 pipet box lid (Beckman Coulter).

The Biomek 3000 liquid handling robot work deck is equipped with eight Labware brackets and four Thermoshakes (SH1–SH4, Fig. 2) braced to the work deck (Fig. 3a) using custom “L” brackets (Fig. 3b) and controlled by a Multi TEC Control Unit. Antivibration rubber pads (1/8 in.) (Fig. 3c) are attached to the bottom of each bracket to mitigate deck vibration from the shakers. In the absence of a Biomek or of any automated liquid handler, see Notes 2 and 3, respectively.

2.3 Small-Scale Folding Screen Reagent Setup and Run

Fig. 3 Custom components built to assemble the Biomek 3000 protein folding system. Thermoshakes controlled by a Multi TEC control unit (a) (2 of 4 shown), mounted on a custom aluminum shelf (b), and one-eighth inch rubber pads attached to the bottom of each bracket (c) to mitigate vibration

The following reagents, except for the urea solution and redox reagents, can be prepared ahead of time and stored as stock solutions. Urea solutions and redox reagents should be prepared fresh and used immediately. The total volume required for each reagent is shown, including overage, to assemble a 96-condition, 2 mL per condition folding screen matrix (Fig. 4).

Fig. 4 Primary screening matrix conditions in 96 well array, showing the variance of urea, arginine, glycerol, pH, and redox ratio. Color bands indicate the origin of the 24 well block used for carrying out the folding reaction and the SH number [1–4] indicates the Thermoshake number describing the location of that block and the subsequent number identifies the solution contents for that block [1–24]

Reagents

1. pH buffer and arginine solutions.
(a) 5 mL each:
• 50 mM Tris base, 160 mM L-arginine-HCl, pH 8.5.
• 50 mM Tris base, 160 mM L-arginine-HCl, pH 9.5.
• 50 mM Tris base, 160 mM L-arginine-HCl, pH 10.5.
(b) 9 mL each:
• 50 mM Tris base, 400 mM L-arginine-HCl, pH 8.5.
• 50 mM Tris base, 400 mM L-arginine-HCl, pH 9.5.
• 50 mM Tris base, 400 mM L-arginine-HCl, pH 10.5.
For example, combine 30 mg of Tris base and 168.5 mg or 758.3 mg L-arginine-HCl (for 160 mM and 400 mM, respectively); note that 30 mg of Tris base corresponds to 50 mM in 5 mL, and 50 mM in 9 mL works out to about 54.5 mg. Bring to 80% final volume with ultrapure water, adjust the pH with 5 M HCl or 10 M NaOH, and then bring the solution to its final volume with ultrapure water. Filter the solution through a 0.45 μm membrane.
2. 9 M urea solution (40 mL). Resuspend 21.6 g of urea in 20 mL of ultrapure water, dissolve at room temperature, and bring the solution to 40 mL. Dissolution is endothermic, so a water bath can be applied to aid in dissolution, but do not heat beyond 25 °C to prevent decomposition of the urea. Filter the solution through a 0.45 μm membrane.
3. 80% glycerol solution (w/w), 30 mL.
4. 110 mL of ultrapure water.
5. Redox reagents.
(a) 4 mL of 100 mM cystamine-2HCl. Weigh out 90 mg of cystamine-2HCl, bring to 80% final volume with ultrapure water, agitate the mixture to dissolve solids, and then bring the solution to its final volume with ultrapure water.
(b) 7 mL of 100 mM L-cysteine. Weigh out 84.8 mg of L-cysteine, bring to 80% final volume with ultrapure water, agitate the mixture to dissolve solids, and then bring the solution to its final volume with ultrapure water.
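The weigh-out amounts in this reagent list follow from mass = molarity × volume × molecular weight. The sketch below recomputes them as a cross-check. The molecular weights are standard values supplied here as assumptions (the chapter does not list them), and the function name is illustrative.

```python
# Molecular weights in g/mol (standard values, not given in the chapter).
MW = {
    "Tris base": 121.14,
    "L-arginine-HCl": 210.66,
    "urea": 60.06,
    "cystamine-2HCl": 225.20,
    "L-cysteine": 121.16,
}


def mass_mg(compound, molar, volume_ml):
    """Mass in mg needed for `volume_ml` of a `molar` (mol/L) solution."""
    # mol/L * mL = mmol; mmol * g/mol = mg
    return molar * volume_ml * MW[compound]


recipe = [
    ("Tris base", 0.050, 5),
    ("Tris base", 0.050, 9),
    ("L-arginine-HCl", 0.160, 5),
    ("L-arginine-HCl", 0.400, 9),
    ("urea", 9.0, 40),
    ("cystamine-2HCl", 0.100, 4),
    ("L-cysteine", 0.100, 7),
]
for compound, molar, volume_ml in recipe:
    print(f"{compound}: {mass_mg(compound, molar, volume_ml):.1f} mg "
          f"for {volume_ml} mL at {molar} M")
```

With these molecular weights the arginine, urea, cystamine, and cysteine amounts agree with the protocol to within rounding, while 50 mM Tris base comes to roughly 30 mg in 5 mL and roughly 54.5 mg in 9 mL.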


2.4 Small-Scale Folding Screen Microcapillary Electrophoresis Assay Automation and Analysis

Reagents

1. Microcapillary electrophoresis (MCE) HT Protein Express Reagent Kit (PerkinElmer).
2. Nonreducing Protein Express Sample Buffer (PerkinElmer). Add 11.1 mg of iodoacetamide to 3 mL of sample buffer (20 mM final concentration). Prepare fresh.

Equipment and Consumables

1. MCE system: Caliper LabChip GXII (PerkinElmer).
2. MCE HT Protein Express chip (PerkinElmer). In the absence of a Caliper LabChip system, see Note 4.
3. Biomek 3000 liquid handling robot (Beckman Coulter).
4. Four 24-well blocks containing protein folding mixtures.
5. One box of AP96 P250 pipet tips (Beckman Coulter).
6. One box of AP96 P20 pipet tips (Beckman Coulter).
7. One 96-well hard shell PCR plate (BioRad).
8. One round-bottom 24-well plate (Whatman) containing nonreducing sample buffer with fresh iodoacetamide in the first well.

2.5 Protein Purification and Quality Assessment

Reagents

1. Stationary phase: select an appropriate assortment of protein adsorptive media to purify the test article (e.g., affinity, ion exchange, or hydrophobic interaction resins).
2. Mobile phase: select an appropriate set of buffers to perform purification using the selected protein adsorptive media, following the manufacturer’s instructions.
3. Size exclusion chromatography mobile phase: 50 mM sodium phosphate, 250 mM NaCl, pH 6.9.

Equipment and Consumables

1. ÄKTA Explorer (GE Healthcare) preparative liquid chromatography system (or an alternative platform such as the BioRad NGC). In the absence of an automated liquid chromatography system, see Note 5.
2. Spectrophotometer: NanoDrop 2000c (Thermo Fisher). Other instruments and/or spectral based protein quantitation methods may be used for protein quantitation (see Note 6).
3. HPLC system: 1290 Infinity (Agilent) or equivalent (see Note 7).
4. Size exclusion column: Zenix-C SEC-300, 7.8 × 300 mm (Sepax).

3 Methods

3.1 Inclusion Body Solubilization and Wash

The first step in the process requires isolation, washing, and solubilization of the inclusion bodies as the source of denatured protein for the folding screen.

1. Resuspend cell paste in five volumes, by weight, of lysate buffer using a stand homogenizer until the slurry is uniform.
2. Lyse the suspended cells using two passes through a chilled microfluidizer at 15,000 PSI.
3. Centrifuge the lysate at 25,000 RCF for 30 min to pellet the insoluble fraction containing the inclusion bodies, then discard the supernatant.
4. Using a homogenizer, resuspend the pellet in water, using half the volume used in the first resuspension (step 1), then repeat the centrifugation (step 3).
5. Repeat the resuspension and centrifugation (step 4) once more to obtain double-washed inclusion bodies (DWIBs). DWIBs may be stored at -80 °C.
6. Resuspend DWIBs in one volume, by weight, of water using a handheld homogenizer.
7. Add eight volumes, relative to the initial DWIB weight, of 8 M GuHCl (6.4 M final) and continue homogenization until no visible solids are observed.
8. Add solid DTT to the solubilized DWIB solution to a final concentration of 10 mM and agitate until no visible DTT is observed.
9. Incubate the solubilized DWIBs at 37 °C in a water bath for 1 h.
10. Remove the solubilized DWIBs from the water bath and store at room temperature if used immediately; otherwise store at -80 °C for up to 1 month.
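The 6.4 M final GuHCl concentration in step 7 follows from simple dilution arithmetic if the DWIB paste itself is counted as one volume (an assumption made here, treating the paste as roughly 1 g/mL). The sketch below reproduces that figure and estimates the solid DTT needed for step 8; the DTT molecular weight (154.25 g/mol) is a standard value rather than one stated in the chapter, and the 100 mL batch is a hypothetical example.

```python
DTT_MW = 154.25  # g/mol, standard value (not given in the chapter)


def final_molarity(stock_molar, stock_volumes, other_volumes):
    """Concentration after mixing a stock with non-stock volumes."""
    return stock_molar * stock_volumes / (stock_volumes + other_volumes)


def dtt_mg(total_volume_ml, molar=0.010):
    """Solid DTT (mg) for a given final volume at `molar` mol/L."""
    return molar * total_volume_ml * DTT_MW


# 1 volume DWIBs + 1 volume water + 8 volumes of 8 M GuHCl
guhcl_final = final_molarity(8.0, stock_volumes=8, other_volumes=2)
print(guhcl_final)      # molar GuHCl after mixing

# Hypothetical batch: 10 g DWIBs gives about 100 mL of solubilized material
print(dtt_mg(100.0))    # mg DTT for 10 mM
```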

3.2 Small-Scale Folding Screen Automation Setup

The second step in the process is to set up a large matrix of folding conditions at small scale that are known to enhance in vitro folding.

1. Use the Excel spreadsheet with a macro (available upon request) to automatically generate Biomek liquid transfer worklists for each component and each pipetting tool based on the data entered in the spreadsheet.
2. Create a Biomek method that carries out the liquid transfers outlined in the spreadsheet. There are two liquid transfer types for this method: one for the 80% glycerol solution to account for the high viscosity (Fig. 5a), and another for the rest of the folding components (Fig. 5b). Include a command to activate the Thermoshake chilling feature in order to cool the plate-holding platforms to 4 °C for the duration of the experiment (Inheco provides examples of how to write the Thermoshake control software using Microsoft Visual Basic and dynamic-link libraries).
3. Insert a shake command after all folding reagents are assembled in the destination plates, then cease shaking for the addition of solubilized DWIBs.
4. Reactivate shaking after the DWIBs are added, maintaining shaking for the duration of the experiment. Alternatively, the reactions can be transferred to a general shaker in a 4 °C chamber to clear the Biomek work deck for additional processes.
5. Refer to Fig. 6 for labware placement on the work deck:
(a) Four 24-well cell culture plates (SH1–SH4).
(b) 1000 μl pipet tips (P1000).
(c) 250 μl pipet tips (P250).
(d) 20 μl pipet tips (P20).
(e) Two round-bottom 24-well plates (B2, B5).
(f) Sectional reservoir frame with four sectional reservoir inserts (B3).
(g) Inverted P1000 pipet box lid (B4).

Fig. 5 Screenshot of a Biomek method required for liquid transfers: one for the 80% glycerol solution to account for high viscosity (a), and another for the remaining components (b)
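The authors' Excel macro is not reproduced in the chapter, so the following is only a rough sketch of the kind of calculation a worklist generator might perform: converting a target final concentration into a stock volume for a 2 mL reaction (C1V1 = C2V2), choosing a pipetting tool by capacity, and splitting large additions into several transfers. The row fields, the tool-selection rule, and the 1 M urea example are all hypothetical and do not represent the Biomek worklist format.

```python
import math

# Pipetting tools and their nominal capacities in µL (from the tool caddy list).
TOOLS = [("P20", 20.0), ("P250", 250.0), ("P1000", 1000.0)]


def stock_volume_ul(final_conc, stock_conc, total_ul=2000.0):
    """Solve C1*V1 = C2*V2 for the stock volume V1 (µL)."""
    return final_conc * total_ul / stock_conc


def pick_tool(volume_ul):
    """Smallest tool whose capacity covers the volume, else the largest."""
    for name, capacity in TOOLS:
        if volume_ul <= capacity:
            return name
    return "P1000"


def transfers(component, source, dest, well, volume_ul):
    """Split one addition into equal rows that each fit the chosen tool."""
    tool = pick_tool(volume_ul)
    capacity = dict(TOOLS)[tool]
    n = max(1, math.ceil(volume_ul / capacity))
    return [
        {"component": component, "source": source, "dest": dest,
         "well": well, "tool": tool, "volume_ul": round(volume_ul / n, 2)}
        for _ in range(n)
    ]


# Hypothetical example: 1 M final urea from the 9 M stock in a 2 mL reaction.
urea_ul = stock_volume_ul(1.0, 9.0)
for row in transfers("Comp-7", "B3-1", "SH1", 1, urea_ul):
    print(row)
```

Keeping one row per physical transfer makes it straightforward to sort the rows by tool, which matches the chapter's description of one worklist per component and pipetting tool.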


Fig. 6 Biomek 3000 software generated placement of components on the work deck required for assembling the folding matrix. Four 24-well blocks are indicated as SH1–SH4. P1000, P250, and P20 identify the location of the pipette tips of 1000, 250, and 20 μl capacity, respectively. B2 and B5 are 24 well blocks with buffer components, B3 is a sectional reservoir frame with buffer components, and B4 is an inverted P1000 pipet box lid to serve as a large reservoir

3.3 Small-Scale Folding Screen Reagent Setup and Run

After the liquid handler deck is set up, the folding reagents can be placed in their respective labware, and the automated assembly of the folding conditions can proceed.

1. Place all the reagents on the Biomek deck. Each folding reagent listed below is followed parenthetically by its component number (Comp #) as it pertains to the Biomek worklist, and by its position on the Biomek work deck together with its numeric position within the labware (deck position-well number).
(a) 50 mM Tris base, 160 mM L-arginine-HCl, pH 8.5 (Comp-1, B2-1).
(b) 50 mM Tris base, 160 mM L-arginine-HCl, pH 9.5 (Comp-2, B2-2).
(c) 50 mM Tris base, 160 mM L-arginine-HCl, pH 10.5 (Comp-3, B2-3).
(d) 50 mM Tris base, 400 mM L-arginine-HCl, pH 8.5 (Comp-4, B2-4).
(e) 50 mM Tris base, 400 mM L-arginine-HCl, pH 9.5 (Comp-5, B2-5).
(f) 50 mM Tris base, 400 mM L-arginine-HCl, pH 10.5 (Comp-6, B2-6).
(g) Urea solution (Comp-7, B3-1).
(h) Glycerol solution (Comp-9, B3-3).
(i) Water (Comp-11 and 12, B4).
(j) 100 mM cystamine-2HCl (Comp-13, B5-1).
(k) 100 mM L-cysteine (Comp-14, B5-2).
(l) Solubilized DWIBs (Comp-20, B5-8).
2. Verify that the folding components and their physical positions on the work deck match the Excel spreadsheet, and that the transfer volumes in the spreadsheet are correct for assembling the folding matrix.
3. Activate the worklist macro to generate the worklist.
4. Execute the folding screen method on the Biomek.

3.4 Small-Scale Folding Screen Microcapillary Electrophoresis Assay Automation and Analysis

The fourth step in the process is to measure the protein migration shifts of the small-scale folding matrix to determine when the folding reaction is complete. If an activity assay that is compatible with the folding conditions is available, then it can be employed in place of the MCE assay. This step should first be performed approximately 24 h after the small-scale folding screen has begun; however, additional assays may be required until the folding reaction reaches equilibrium (typically no more than 5 days). If available, include a protein control that has enough similarity to the test article to provide a comparative electrophoretic benchmark.

1. Remove the containers on deck positions B2–B5.
2. Place a BioRad PCR plate on deck position B2 and a Whatman round-bottom 24-well plate on deck position B3 (Fig. 7).
3. Add 3 mL of nonreducing sample buffer with iodoacetamide to well A1 of the Whatman plate (B3).
4. Execute a Biomek method that will stop the Thermoshake shaking, transfer 21 μL from well A1 of the Whatman plate to each of the wells in the BioRad plate, then transfer 6 μL from each well of the protein folding plates (SH1–SH4) to the BioRad plate.
5. Heat the assay plate to 85 °C for 5 min.
6. Add 120 μL of water to each sample using a multichannel pipette.
7. Centrifuge the assay plate at 2200 RCF for 3 min.
8. Run the HT Protein Express assay following the manufacturer’s instructions.
9. Analyze the resultant electropherograms and continue to run the same assay daily until there is no substantial change in the assay results, indicating the folding reactions have reached equilibrium. At this point, the folding is complete, and the final set of MCE results should be used for the ranking assessment.
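Step 9 leaves "no substantial change" to the analyst's judgment. One way to make that check explicit, offered only as a sketch, is to compare target-peak heights between consecutive days and call the screen equilibrated when every well has changed by less than a chosen fraction. The 5% tolerance, the function names, and the example values are assumptions, not part of the protocol. The same block also works out the sample dilution implied by steps 4 and 6.

```python
def dilution_factor(sample_ul=6.0, buffer_ul=21.0, water_ul=120.0):
    """Fold dilution of the folding sample in the final MCE well."""
    return (sample_ul + buffer_ul + water_ul) / sample_ul


def at_equilibrium(previous, current, tolerance=0.05):
    """True when no well's peak height changed by more than `tolerance`
    (expressed as a fraction of the previous day's value)."""
    for well, old in previous.items():
        new = current[well]
        if old == 0:
            if new != 0:
                return False     # a peak appeared where none existed
            continue
        if abs(new - old) / old > tolerance:
            return False
    return True


# Hypothetical target-peak heights for two wells on three consecutive days.
day1 = {"A1": 100.0, "A2": 50.0}
day2 = {"A1": 130.0, "A2": 52.0}
day3 = {"A1": 132.0, "A2": 53.0}

print(dilution_factor())            # fold dilution of each sample
print(at_equilibrium(day1, day2))   # still changing between day 1 and day 2
print(at_equilibrium(day2, day3))   # changes are within tolerance
```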


Fig. 7 Biomek 3000 software generated placement of components on the work deck required for assembling the MCE analytical plate. Four 24-well blocks are indicated as SH1–SH4. P1000, P250, and P20 identify the location of the pipette tips of 1000, 250, and 20 μl capacity, respectively. B2 is the BioRad plate to receive the samples for MCE analysis and B3 is a 24 well block holding buffer components needed for preparing the MCE samples

Once the final data set is acquired, the next step in the process is to quantitatively analyze the small-scale folding matrix to identify promising conditions for mid-scale screening.

1. Within the Caliper LabChip software, select the analysis menu from the menu bar, open the analysis settings window, and select the peak find tab.
2. Adjust the slope threshold sample to 0.01, and the inflection threshold sample to 999.0 in order to maximize the peak identification and splitting, respectively.

Since target peak (TP) quality will be utilized to determine folding success, all non-TP species must be accounted for. If a protein control is available, refer to the control TP retention time for test article TP identification. If a protein control is unavailable, select the consensus peak species near the expected migration position from the assay panel as the TP; product verification will require additional resources not outlined herein.

The first criterion for folding success is the ability of the condition to produce the desired product, which is often indicated by the height of the TP (Fig. 8). Rank the folding conditions by TP height, select the top 2 for advancement to the intermediate-scale step, and send the next 28 highest-ranking candidates to the next step.


Fig. 8 Example MCE electropherograms from analysis of two protein folding samples (a) (one reaction in blue the other in red) with the target peak of interest identified as TP (Target Peak). Table of quantitative output from integration of the example MCE electropherograms (b) showing the factors used to assess the folding samples, TP Height, TP FWHM (Full Width Half Maximum), TP Sharpness and TP Percent Purity

The second criterion for folding success is product homogeneity, which can be indicated by peak sharpness, since a broad peak can potentially result from product polydispersity within the TP. A sharp peak will have a greater ratio of height to full width at half maximum (FWHM) (Fig. 8). Rank the 28 folding conditions carried over from the previous step by the ratio of TP height to FWHM. Select the top 2 for advancement to the intermediate-scale step, and send the next 8 highest-ranking candidates to the next step.

The final criterion for folding success is the fraction of material that forms the TP (product purity) (Fig. 8). Rank the 8 candidates from the previous step by product purity percentage, as measured by the LabChip software, and select the top 2 for advancement to the intermediate-scale step.

The 6 folding conditions selected based on TP height, sharpness, and purity in the previous steps should represent the candidates with the best probability of having the highest levels of correctly folded product and should thus be advanced to the intermediate-scale production assessment step (Fig. 9).
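The three-stage ranking described above (and summarized in Fig. 9) can be written compactly. The sketch below assumes the MCE results have been exported as one record per well with TP height, FWHM, and percent purity; the field names and the synthetic data are illustrative and are not the LabChip export format.

```python
def select_candidates(wells):
    """Return the ids of the six conditions chosen by the ranking funnel.

    `wells` is a list of dicts with keys 'id', 'height', 'fwhm', 'purity'.
    """
    # Stage 1: rank all conditions by target-peak height.
    by_height = sorted(wells, key=lambda w: w["height"], reverse=True)
    picks = by_height[:2]
    pool = by_height[2:30]          # next 28 go on to the sharpness ranking

    # Stage 2: rank by sharpness (height divided by FWHM).
    by_sharpness = sorted(pool, key=lambda w: w["height"] / w["fwhm"],
                          reverse=True)
    picks += by_sharpness[:2]
    pool = by_sharpness[2:10]       # next 8 go on to the purity ranking

    # Stage 3: rank by percent purity of the target peak.
    by_purity = sorted(pool, key=lambda w: w["purity"], reverse=True)
    picks += by_purity[:2]
    return [w["id"] for w in picks]


# Synthetic 96-condition data set, for illustration only.
wells = [
    {"id": i, "height": 100 - i, "fwhm": 1 + i % 7, "purity": (i * 37) % 101}
    for i in range(96)
]
picked = select_candidates(wells)
print(picked)
```

Because each stage removes its two winners before passing the pool on, no condition can be selected twice, so the funnel always yields six distinct candidates when at least 30 conditions have a measurable target peak.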


Fig. 9 Method of MCE data utilization to select the top six candidates for intermediate level scale up for final selection. Samples are first ranked by TP height and the top 2 are selected for scale up while the next 28 are ranked for peak sharpness. The top 2 sharpest peaks are selected for scale up, while the next 8 are ranked for purity. The top 2 most pure peaks are then selected for scale up resulting in a total of 6 conditions for intermediate scale assessment (2 highest TPs, 2 sharpest TPs and 2 most pure TPs)

3.5 Protein Purification and Quality Assessment

3.5.1 Intermediate-Scale Production and Purification

The fifth step in the process is to examine a promising set of folding conditions identified in the small-scale screen under conditions that better replicate a scaled process and employ a more advanced assessment of folding success.

1. Scale up the top six folding conditions selected by MCE analysis to 100 mL total reaction volume while maintaining the same volumetric ratio of solubilized DWIBs to folding solution. Stock folding solutions are not required if de novo formulation is more convenient.
2. Purify the folding reactions using the AKTA Explorer and the appropriate combination of stationary and mobile phases. The folding reactions may require conditioning to promote binding to the stationary phase, which is often accomplished by buffer addition or dialysis. The conditioned load may also require clarification prior to purification. Fractions of ¼ column volume or less should be used to ensure optimum pooling can be employed. Depending on the method employed, dilution or buffer exchange may be required before column loading. It is generally best to use high-resolution gradient methods to differentiate properly folded from incorrect isoforms. Since DWIBs are usually mostly target protein, affinity purification is generally not required; however, if the DWIBs are not pure enough, an initial affinity capture step may be required prior to the high-resolution step (e.g., Protein A for Fc-containing molecules and Ni-NTA for polyhistidine tag-containing molecules).


Fig. 10 Example of analytical size exclusion chromatography (SEC) assessment of intermediate scale gradient fractions used as the secondary selection criteria for fraction pooling. Fractions with substantial pre-TP or post-TP content (a) should be avoided, while sharp high purity peaks (b) should be pooled into the final product

3. Screen the gradient fractions for pooling by MCE, employing the same parameters used in the folding screen selection process described above. Uniform selection criteria for the fractions should be employed to ensure the downstream yield and quality comparisons are valid. Secondary screening of the fractions by HPLC size exclusion chromatography (HPLC-SEC) is recommended. Isocratic elution at 1 mL/min can be used, resulting in an 18 min cycle time. Select for single, sharp peaks with the expected retention times based on calibration standards or a benchmark protein (Fig. 10).
4. Select the fractions for pooling based on the MCE and analytical SEC results (Fig. 11).
5. Measure the protein concentration of the pool by Nanodrop 2000c and calculate yield in mg of protein per mL of folding reaction.
6. Analyze the pool by nonreducing MCE per the method previously described.
7. Analyze the pool by HPLC-SEC per the method previously described. If available, a functional assay should be conducted on the products to determine the functional yield.
8. Rank the purified products based on a combination of protein yield, MCE quality and HPLC-SEC quality. An ideal candidate will rank highest in all these categories; however, it may be necessary to balance these characteristics to choose the most appropriate candidate for advancement. In general, product quality should take precedence unless the yield of such a target condition is so low that it may justify taking a lower quality candidate that could be further purified by additional processing steps.
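The yield calculation in step 5 and the ranking in step 8 can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names, the dictionary keys, and the example values are hypothetical, and sorting by quality metrics before yield is just one simple way to let "product quality take precedence".

```python
def folding_yield_mg_per_ml(pool_conc_mg_per_ml, pool_volume_ml,
                            reaction_volume_ml=100.0):
    """Yield in mg of purified protein per mL of folding reaction."""
    if reaction_volume_ml <= 0:
        raise ValueError("reaction volume must be positive")
    # Total protein in the pool divided by the folding reaction volume.
    return pool_conc_mg_per_ml * pool_volume_ml / reaction_volume_ml


def rank_products(products):
    """Sort best first: MCE purity, then SEC main-peak percent, then yield.

    The ordering of the keys is an assumption made for this sketch; in
    practice the balance between quality and yield is a judgment call.
    """
    return sorted(
        products,
        key=lambda p: (p["mce_purity_pct"], p["sec_main_peak_pct"],
                       p["yield_mg_per_ml"]),
        reverse=True,
    )


if __name__ == "__main__":
    # Made-up numbers for three purified pools from 100 mL reactions.
    pools = [
        {"name": "A", "mce_purity_pct": 95.0, "sec_main_peak_pct": 98.0,
         "yield_mg_per_ml": folding_yield_mg_per_ml(1.0, 5.0)},
        {"name": "B", "mce_purity_pct": 90.0, "sec_main_peak_pct": 99.0,
         "yield_mg_per_ml": folding_yield_mg_per_ml(2.0, 10.0)},
        {"name": "C", "mce_purity_pct": 95.0, "sec_main_peak_pct": 98.0,
         "yield_mg_per_ml": folding_yield_mg_per_ml(2.0, 4.0)},
    ]
    for p in rank_products(pools):
        print(p["name"], p["yield_mg_per_ml"])
```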


Fig. 11 Example ion exchange chromatogram highlighting the fractions identified as desirable for pooling (light green) based on MCE and analytical SEC assessment. In this case the late eluting material was determined to not have desirable properties and was discarded

3.5.2 Lead Protein Folding Condition Scale-up

The final step in the process is to scale the best folding condition identified in the mid-scale assessment to produce the desired quantity of product.

1. Scale up the lead folding condition based on intermediate-scale calculated yield (mg/mL) in order to achieve the desired amount of product. If a higher level of purity is required than achieved in the intermediate-scale step, then additional material will be required to support further processing.
2. Purify using appropriate methods mentioned previously. Additional purification may be required depending on target product quality requirements.
3. Subject final pool to advanced analysis including those mentioned in this text, or any other quality assessments required to achieve satisfactory characterization needed for the intended use of the product protein.
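The scale-up arithmetic in step 1 amounts to dividing the desired mass of product by the intermediate-scale yield. A minimal sketch is shown below; the function name and the `expected_recovery` parameter (a way to allow for losses in further processing) are assumptions introduced for illustration and do not come from the chapter.

```python
def scale_up_volume_ml(target_mg, yield_mg_per_ml, expected_recovery=1.0):
    """Folding reaction volume (mL) needed to obtain `target_mg` of product.

    expected_recovery is a hypothetical factor between 0 and 1 that accounts
    for material lost in any additional purification steps.
    """
    if yield_mg_per_ml <= 0:
        raise ValueError("yield must be positive")
    if not 0 < expected_recovery <= 1:
        raise ValueError("expected_recovery must be in (0, 1]")
    return target_mg / (yield_mg_per_ml * expected_recovery)


if __name__ == "__main__":
    # Example: 50 mg wanted, 0.1 mg/mL measured at the 100 mL scale.
    print(scale_up_volume_ml(50.0, 0.1))        # no extra processing losses
    print(scale_up_volume_ml(50.0, 0.1, 0.5))   # assume half is lost downstream
```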

4 Notes

1. Other techniques can be employed for cell disruption, such as sonication, nitrogen decompression, chemical disruption and French press.
2. In place of a Biomek 3000, other liquid handling robots, such as those manufactured by Tecan and Hamilton, may be used; however, the worklist and automation setup described herein would require modification that lies outside the scope of this document.
3. The reagents can be assembled manually, adhering to the same transfer volumes outlined in Table S1 (please refer to the Electronic Supplementary Material for a table with the detail of buffer positions and volumes). A plate shaker and a temperature-controlled enclosure may be employed as an alternative to the Thermoshakes.
4. Standard nonreducing SDS-PAGE may be employed as an alternative to MCE; however, quantitative analysis will not be possible, thus qualitative judgments must be substituted.
5. Alternative protein purification techniques, such as peristaltic pumps, gravity columns or batch purification, may be employed.
6. Alternative protein quantitation methods, such as Bradford, BCA and Lowry assays, may be employed, as well as other spectral scan instruments, such as Thermo MultiSkan and Molecular Devices SpectraMax.
7. Alternative HPLC systems, such as Waters or Shimadzu, may be employed, as well as other comparable SEC columns such as those manufactured by Waters or Phenomenex.

Acknowledgments

We would like to acknowledge Alex Mladenovic and Randy Hecht for contributing to the construction and programming of the Biomek, Tom Boone for protein folding condition guidance and Jeff Lewis for expressing the recombinant protein described in this chapter.

References

1. Huang CJ, Lin H, Yang X (2012) Industrial production of recombinant therapeutics in Escherichia coli and its recent advancements. J Ind Microbiol Biotechnol 39:383–399
2. Rosano GL, Ceccarelli EA (2014) Recombinant protein expression in microbial systems. Front Microbiol 5:341
3. Zhang J (2010) Mammalian cell culture for biopharmaceutical production. In: Baltz RH, Davies JE, Demain AL (eds) Manual of industrial microbiology and biotechnology, 3rd edn. ASM Press, Washington, DC, pp 157–178
4. Swietnicki W (2006) Folding aggregated proteins into functionally active forms. Curr Opin Biotechnol 17:367–372
5. Marston FAO, Lowe PA, Doel MT, Schoemaker JM, White S, Angal S (1984) Purification of calf prochymosin (prorennin) synthesized in Escherichia coli. Nat Biotechnol 2:800–804
6. Marston FAO (1986) The purification of eukaryotic polypeptides synthesized in Escherichia coli. Biochem J 240:1–12
7. Mukhopadhyay A (1997) Inclusion bodies and purification of proteins in biologically active forms. Adv Biochem Eng Biotechnol 56:61–109
8. Schoemaker JM, Brasnett AH, Marston FAO (1985) Examination of calf prochymosin accumulation in Escherichia coli: disulphide linkages are a structural component of prochymosin-containing inclusion bodies. EMBO J 4(3):775–780
9. Vallejo LF, Rinas U (2004) Strategies for the recovery of active proteins through refolding of bacterial inclusion body proteins. Microb Cell Factories 3:11
10. Fischer B, Sumner I, Goodenough P (1993) Isolation, renaturation, and formation of disulfide bonds of eukaryotic proteins expressed in Escherichia coli as inclusion bodies. Biotechnol Bioeng 41(1):3–13
11. Rudolph R, Lilie H (1996) In vitro folding of inclusion body proteins. FASEB J 10(1):49–56
12. Clark EDB (1998) Refolding of recombinant proteins. Curr Opin Biotechnol 9:157–163
13. Middleberg APJ (2002) Preparative protein folding. Trends Biotechnol 10:437–443
14. Coutard B, Danchin EGJ, Oubelaid R, Canard B, Bignon C (2012) Single pH buffer refolding screen for protein from inclusion bodies. Protein Expr Purif 82(2):352–359
15. Dechavanne V, Barrillat N, Borlat F, Hermant A, Magnenat L, Paquet M, Antonsson B, Chevalet L (2011) A high-throughput protein refolding screen in 96-well format combined with design of experiments to optimize the refolding conditions. Protein Expr Purif 75(2):192–203
16. Qoronfleh MW, Hesterberg LK, Seefeldt MB (2007) Confronting high-throughput protein refolding using high pressure and solution screens. Protein Expr Purif 55(2):209–224
17. Buswell AM, Ebtinger M, Vertes AA, Middleberg APJ (2002) Effect of operating variables on the yield of recombinant trypsinogen for a pulse-fed dilution-refolding reactor. Biotechnol Bioeng 77(4):435–444
18. Lin L, Seehra J, Stahl ML (2006) High-throughput identification of refolding conditions for LXRbeta without a functional assay. Protein Expr Purif 47(2):355–366
19. Vincentelli R, Canaan S, Campanacci V, Valencia C, Maurin D, Frassinetti F, Scappucini-Calvo L, Bourne Y, Cambillau C, Bignon C (2004) High-throughput automated refolding screening of inclusion bodies. Protein Sci 13(10):2782–2792
20. Scheich C, Niesen FH, Seckler R, Bussow K (2004) An automated in vitro protein folding screen applied to a human dynactin subunit. Protein Sci 13(2):370–380
21. Tobbell DA, Middleton BJ, Raines S, Needham MRC, Taylor IWF, Beveridge JY, Abbott WM (2002) Identification of in vitro folding conditions for procathepsin S and cathepsin S using fractional factorial screens. Protein Expr Purif 24(2):242–254
22. Armstrong N, De Lencastre A, Gouaux E (1999) A new protein folding screen: application to the ligand binding domains of a glutamate and kainate receptor and to lysozyme and carbonic anhydrase. Protein Sci 8(7):1475–1483
23. Walther C, Mayer S, Jungbauer A, Dürauer A (2014) Getting ready for PAT: Scale up and inline monitoring of protein refolding of Npro fusion proteins. Process Biochem 49(7):1113–1121
24. Lee SH, Carpenter JF, Chang BS, Randolph TW, Kim YS (2006) Effects of solutes on solubilization and refolding of proteins from inclusion bodies with high hydrostatic pressure. Protein Sci 15(2):304–313
25. An P, Winters D, Walker KW (2016) Automated high-throughput dense matrix protein folding screen using a liquid handling robot combined with microfluidic capillary electrophoresis. Protein Expr Purif 120:138–147

Part II High-Throughput Protocols Adapted to the Production of Specific Protein Families

Chapter 7

High-Throughput Production of Oxidized Animal Toxins in Escherichia coli

Yoan Duhoo, Ana Filipa Sequeira, Natalie J. Saez, Jeremy Turchetto, Laurie Ramond, Fanny Peysson, Joana L. A. Brás, Nicolas Gilles, Hervé Darbon, Carlos M. G. A. Fontes, and Renaud Vincentelli

Abstract

High-throughput production (HTP) of synthetic genes is becoming an important tool to explore the biological function of the extensive genomic and meta-genomic information currently available from various sources. One such source is animal venom, which contains thousands of novel bioactive peptides with potential uses as novel therapeutics to treat a plethora of diseases as well as in environmentally benign bioinsecticide formulations. Here, we describe a HTP platform for recombinant bacterial production of oxidized disulfide-rich proteins and peptides from animal venoms. High-throughput, host-optimized, gene synthesis and subcloning, combined with robust HTP expression and purification protocols, generate a semiautomated pipeline for the accelerated production of proteins and peptides identified from genomic or transcriptomic libraries. The platform has been applied to the production of thousands of animal venom peptide toxins for the purposes of drug discovery, but has the power to be universally applicable for high-level production of various and diverse target proteins in soluble form. This chapter details the HTP protocol for gene synthesis and production, which supported high levels of peptide expression in the E. coli periplasm using a cleavable DsbC fusion. Finally, target proteins and peptides are purified using automated HTP methods, before undergoing quality control and screening.

Key words Codon optimization, Gene synthesis, PCR assembly, High-Throughput, PCR, Protein production, DsbC, Cysteine-rich proteins, Venom peptides, Toxins, Recombinant expression, Disulfide bonds

Yoan Duhoo, Ana Filipa Sequeira, and Natalie J. Saez contributed equally to this work.

Renaud Vincentelli (ed.), High-Throughput Protein Production and Purification: Methods and Protocols, Methods in Molecular Biology, vol. 2025, https://doi.org/10.1007/978-1-4939-9624-7_7, © Springer Science+Business Media, LLC, part of Springer Nature 2019

1 Introduction

Animal venoms contain a complex arsenal of disulfide-reticulated peptides that present an enormous structural and pharmacological diversity [1]. However, while the use of venoms for drug discovery is rapidly emerging, it is still mostly an unrealized prospect due to recurrent technical bottlenecks that encumber venom exploration. Venom peptides generally contain between one and eight disulfide bonds that must be oxidized with the correct disulfide-bonding pattern in order to be active and stable. Therefore high-throughput (HTP) production of these highly biologically relevant molecules in E. coli is currently the major limiting step for the exploration of venoms for novel drug molecules. After benchmarking most of the published strategies to express reticulated peptides in E. coli [2], we developed a new HTP system for the efficient production of oxidized disulfide-bond-containing proteins and peptides, and applied this protocol to the 4992 animal toxins of the VENOMICS project [3] with the aim of replicating the diversity of animal venoms, in vitro, and to generate a large recombinant peptide library for drug discovery.

De novo gene synthesis is a powerful tool to boost recombinant protein expression. The ability to make synthetic genes with a codon usage optimized to the recombinant host system promotes the effective production of recombinant proteins. In addition, gene synthesis allows the creation of new DNA molecules with any size or sequence for which tangible DNA is not accessible. The establishment of robust HTP gene synthesis approaches provides a new route to obtain large libraries of genes for recombinant expression. These innovative pipelines use optimized and simple protocols allowing production of hundreds to thousands of synthetic genes simultaneously. A new HTP platform was developed at NZYTech (Portugal) in order to generate and clone the 4992 synthetic genes encoding venom peptides identified from different animal species of the VENOMICS project. This pipeline was coupled with the HTP protein production of cysteine-rich proteins developed at AFMB (Marseille).
Within a 9-month period, out of 4992 toxins trialled, 2736 (55%) could be produced in a soluble and oxidized form in quantities compatible with the drug discovery program. The screening of the VENOMICS toxin bank on G-protein-coupled receptors (GPCRs) (CEA, France) and on cellular assays (Zealand Pharma, Denmark) confirmed that this unique library of animal venom peptides contains recombinant peptides that are both correctly folded and biologically active. This chapter reports the detailed protocol for the production of cysteine-rich peptides in the periplasm of E. coli, using a cleavable DsbC fusion, and covers the various steps from gene synthesis to quality control and quantitation of the pure oxidized peptides. These protocols are universally applicable for protein expression in the milligram scale (per liter of culture) and have been successfully implemented, with or without the removal of the tag, for the production of hundreds of proteins from various origins for customers and collaborators of the Structural Biology core facility of the AFMB (Marseille).

2 Materials

Both the generation of the cysteine-rich clone library and peptide bank library were performed in 96-well format, following procedures that were mostly automated using robotics. Therefore, for generating both libraries, equipment and consumables are similar and are described only once, for simplicity, the first time they are mentioned in this section.

2.1 Generation of the Cysteine-Rich Clone Library

The HTP gene synthesis pipeline comprises four major steps: (1) gene design with codon optimization for E. coli, (2) gene assembly by PCR, (3) cloning, transformation, and plasmid purification, and (4) quality control by DNA sequencing. This section describes the reagents, equipment, and bioinformatics tools required for performing the gene synthesis and cloning protocols for a 96-well cloning format.

2.1.1 Gene Design with Codon Optimization for E. coli

All synthetic genes were designed using the ATGenium codon optimization software∗ in order to improve the expression of cysteine-rich peptides in E. coli. Peptide sequences were back-translated into optimized DNA sequences using this bioinformatics tool, which allows designing multiple DNA sequences simultaneously. All genes encode proteins containing an N-terminal TEV (ENLYFQ) protease cleavage site so that, if necessary, any tags can be removed in the downstream process. The glycine usually added at the end of the TEV recognition site (in the P1′ position, e.g., ENLYFQ/G) has been omitted to recover the native sequence of the toxin without any extra residues after cleavage. We have previously demonstrated that cleavage is possible with any amino acid in position P1′ except a proline (see Ref. [2]).
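As a rough illustration of the design rules above (back-translation with host-adapted codons, an N-terminal ENLYFQ site with no added glycine, and no proline immediately after the cleavage site), a naive one-codon-per-residue sketch follows. It is not the ATGenium algorithm, which is proprietary and not described here; the codon table is an assumed set of commonly used E. coli codons chosen for illustration, and `design_gene` is a hypothetical helper.

```python
# Assumed "preferred" E. coli codon per amino acid (illustrative only;
# a real optimizer weighs many more factors than a single codon choice).
ECOLI_CODONS = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}

TEV_SITE = "ENLYFQ"  # recognition site without the usual trailing glycine


def design_gene(peptide: str) -> str:
    """Back-translate TEV site + peptide into DNA using one codon per residue."""
    peptide = peptide.strip().upper()
    if not peptide:
        raise ValueError("empty peptide sequence")
    unknown = sorted(set(peptide) - set(ECOLI_CODONS))
    if unknown:
        raise ValueError(f"unsupported residues: {unknown}")
    # The first toxin residue sits directly after the cleavage site;
    # the text reports that cleavage works for any residue except proline.
    if peptide[0] == "P":
        raise ValueError("proline after the TEV site is not expected to be cleaved")
    return "".join(ECOLI_CODONS[aa] for aa in TEV_SITE + peptide)


if __name__ == "__main__":
    # Made-up short cysteine-rich peptide, for demonstration only.
    print(design_gene("GCCSDPRCAWRC"))
```

Processing a whole 96-well plate would simply map `design_gene` over a list of peptide sequences.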

2.1.2 Gene Assembly by Polymerase Chain Reaction (PCR)

Reagents

1. Oligonucleotides are widely available from many companies at low cost and different levels of purification. For gene synthesis, we recommend oligonucleotides synthesized at the smallest scale (5 μM), with desalting purification, dissolved in 10 mM Tris-HCl buffer, pH 8.0, and stored at −20 °C. Oligonucleotides required for gene assembly were designed using the NZYOligo designer program∗, a bioinformatics tool for HTP primer design.
2. Enzymes: KOD Hot Start DNA polymerase (1 unit/μL, EMD-Millipore).
3. dNTP solution (2 mM, EMD-Millipore): 2 mM dATP, 2 mM dTTP, 2 mM dCTP, and 2 mM dGTP. Each dNTP is used at 0.2 mM.
4. MgSO4 solution (25 mM, EMD-Millipore), used in the PCR reaction at 1.5 mM.
5. DNA ladder.
6. TAE-agarose gels: agarose routine grade diluted in 1× TAE buffer, plus GreenSafe Premium (NZYTech, Ltd), cast in a self-contained system for routine agarose gel electrophoresis. Use 1× solution of Tris-Acetate-EDTA (TAE) Buffer, pH 8.3.
7. NZYDNA Clean-up 96 well plate kit (NZYTech, Ltd).

Equipment and Consumables

1. PCR machine suitable for 96-well plates (such as T100 Thermal cycler, Bio-Rad).
2. 96-well PCR plates (PCR96) with adhesive PCR seals.
3. Multichannel pipettes (suitable for dispensing volumes from 5 to 50 μL).
4. Multichannel pipettes with variable span (suitable for volumes from 5 to 10 μL), such as Matrix Equalizer Pipettes 125 μL (Thermo Scientific).
5. Microcentrifuge for microtubes.
6. Centrifuge with rotor adapted for 96-well PCR plates (such as Centrifuge 5810R, Eppendorf).
7. Centrifuge with rotor for 24-well deep-well (DW24) plates (such as Avanti J-E centrifuge, Beckman Coulter).
8. 96-well ELISA plates with adhesive PCR seals.
9. Flat-bottomed, clear microtiter plate (Greiner Bio-One, reference 655101), for determining DNA concentration.
10. Transilluminator (such as ChemiDoc™ XRS, Bio-Rad).
11. Liquid handling robot (TECAN Freedom EVO series), used to set up PCR reactions, HTP DNA clean-up and HTP plasmid purification. Alternatively, a vacuum system (such as Manifold, EMD-Millipore) can also be used for the last two steps.
12. 96-well UV-Vis spectrophotometer with at least filters for 260 nm, 280 nm, and 600 nm (such as the Thermo Scientific Multiskan GO UV/Vis microplate spectrophotometer with μDrop Plate) or NanoVue (GE Healthcare Life Sciences).

2.1.3 Cloning, Transformation, and Plasmid Purification

Cloning system: All cloning reactions are performed using the NZYEasy Cloning & Expression kit (NZYTech, Ltd). This system allows direct cloning of synthetic genes into expression vectors, avoiding additional steps of transferring genes from cloning to expression vectors. In the current strategy, synthetic toxin genes, encoding a TEV recognition sequence immediately N-terminal to the toxin, were directly inserted into pHTP4 E. coli expression vector. The toxins are expressed as fusions to disulfide isomerase C (DsbC) to aid high-level expression of cysteine-rich proteins. A signal sequence N-terminal to the DsbC allows export of the proteins to the E. coli periplasmic space. Additionally, an internal hexa-histidine tag is present for purification.

Reagents

1. Competent cells: Chemically competent DH5α E. coli strain from NZYTech.
2. Antibiotic: kanamycin (50 mg/mL in water). Store stock solution at −20 °C, and use at a 1/1000 dilution.
3. LB Broth: Dissolve 10 g tryptone, 5 g yeast extract, and 10 g NaCl in 950 mL of ultrapure water. Adjust the pH to 7.0 using NaOH and add ultrapure water to a final volume of 1 L. Mix well and dispense into appropriate containers. Sterilize in autoclave at 121 °C for 15 min.
4. LB agar: Dissolve 10 g tryptone, 5 g yeast extract, and 10 g NaCl in 950 mL of ultrapure water. Adjust the pH to 7.0 using NaOH. Add 20 g of agar and adjust the volume to 1 L with ultrapure water. Mix well, dispense into appropriate containers, and autoclave.
5. LB agar plates: Slowly melt a bottle of LB agar in a microwave. Once cooled, add the required antibiotic (50 μg/mL kanamycin). Distribute 1.5 mL of LB agar to each well of 24-well sterile tissue culture plates (TC24).
6. SOC medium: Dissolve 20 g tryptone, 5 g yeast extract, and 0.5 g NaCl in 950 mL of ultrapure water, and add 4.5 mL of 0.5 M KCl. Mix well and adjust the medium volume to 1 L. Sterilize in autoclave at 121 °C for 15 min. Once cooled, add 9 mL of 2 M MgCl2 hexahydrate and 20 mL of 1 M (20% w/v) glucose. The prepared medium should be stored at 2–8 °C.
7. NZYMiniprep 96 well plate kit (NZYTech, Ltd).

Equipment and Consumables

1. 100 mL disposable reagent reservoirs.
2. Multi-dispenser (such as MINILAB 201 dispenser, HTL) and corresponding 50 and 25 mL syringes.
3. 24-well LB agar plates: 24-well sterile tissue culture plates (TC24) (Greiner Bio-One). For 96 transformations, use 4 × 24-well sterile tissue culture plates.
4. 24-deep-well plates (DW24, Greiner Bio-One) with air-O-top seals (4-titude Ltd. or similar breathable seals). For 96 cultures, use 4 × 24-DW plates.
5. Centrifuge with rotor for DW24 plates (such as Avanti J-E centrifuge, Beckman Coulter).
6. Water bath set at 42 °C.
7. 96-well ELISA plates with adhesive PCR seals.
8. Liquid handling robot (TECAN Freedom EVO series).
9. Shaking incubator set to 37 °C.
10. Plate incubator set to 37 °C.
11. All synthetic DNA sequences were verified by DNA sequencing. DNA integrity was verified using the NZYMulti Alignment software∗ for high-throughput sequencing data analysis.

∗ The bioinformatics tools are intellectual property of NZYTech Ltd., and are available upon request.

The results described in Ref. [3] confirmed the efficacy of this platform, which displays high efficiencies of PCR assembly and cloning with a low error rate of 1.06 mutations per kilobase of DNA synthesized.

2.2 HTP Cysteine-Rich Peptide Bank Production

The HTP cysteine-rich protein production pipeline comprises three major steps: (1) protein production in E. coli, (2) purification of the His-DsbC toxin, and (3) purification and quality control of the final toxin. These steps are detailed in the subsections below. The protocol is for the production of 96 cysteine-rich proteins in parallel. All the protocols described in this chapter are automated on a TECAN Freedom EVO robot (see Ref. [4] for a video showing the detail of the Tecan set up).

2.2.1 Transformation, Culture, and Cell Harvest

See Subheading 2.1.3 for common reagents, equipment, and consumables.

Reagents

1. Chemically competent BL21 (DE3) pLysS E. coli strain (ThermoFisher). A stock of in-house chemically competent bacteria is made from this purchased strain. The new batch is aliquoted as 1 mL lots in 1.5 mL Eppendorf tubes, flash frozen in liquid nitrogen, and stored at −80 °C.
2. Recombinant pHTP4 expression plasmid (NZYTech, Ltd).
3. Auto-induction expression media: NZY Auto-Induction LB medium powder (NZYTech, MB17901) prepared following the manufacturer’s protocol. Autoclave and store at room temperature.
4. Lysozyme stock (50 mg/mL): Dissolve 0.5 g lysozyme in water to a final volume of 10 mL. Store in 0.5 mL aliquots at −20 °C.
5. Lysis/binding buffer 10× stock: Prepare 1 L of buffer containing 500 mM Tris pH 8, 3 M NaCl, and 100 mM Imidazole ACS grade (Merck, reference 104716) in advance, filter through a 0.22 μm filter, and store at 4 °C. A high-grade imidazole must be used so that it will not interfere with A280 readings for calculating protein yield.
6. Prepare fresh lysis buffer on the day of use by tenfold dilution of the 10× stock and addition of lysozyme to a final concentration of 0.25 mg/mL.
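The volumes needed for the fresh lysis buffer in item 6 follow from simple dilution arithmetic (C1V1 = C2V2). The sketch below uses the stock concentrations listed above; the function name and the 250 mL example volume are assumptions chosen for illustration.

```python
# Composition of the 10x lysis/binding stock as listed in the reagents (mM).
STOCK_10X_MM = {"Tris pH 8": 500.0, "NaCl": 3000.0, "Imidazole": 100.0}


def fresh_lysis_buffer(final_volume_ml, stock_fold=10.0,
                       lysozyme_stock_mg_per_ml=50.0,
                       lysozyme_final_mg_per_ml=0.25):
    """Return the volumes (mL) to combine for a given final buffer volume."""
    stock_ml = final_volume_ml / stock_fold
    # C1 * V1 = C2 * V2, solved for the lysozyme stock volume V1.
    lysozyme_ml = (final_volume_ml * lysozyme_final_mg_per_ml
                   / lysozyme_stock_mg_per_ml)
    water_ml = final_volume_ml - stock_ml - lysozyme_ml
    final_mm = {name: conc / stock_fold for name, conc in STOCK_10X_MM.items()}
    return {
        "stock_ml": stock_ml,
        "lysozyme_ml": lysozyme_ml,
        "water_ml": water_ml,
        "final_mM": final_mm,
    }


if __name__ == "__main__":
    recipe = fresh_lysis_buffer(250.0)  # example volume, not prescribed here
    print(recipe)
```

At 1× the buffer therefore contains 50 mM Tris pH 8, 300 mM NaCl, and 10 mM imidazole.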

Equipment and Consumables

1. Repeat pipettor (Eppendorf, reference 22 26 020-1) and 50 mL combitips (Eppendorf).
2. Small orbital shaking incubator such as the INFORS Multitron (model number AJ103). The shaking speed is 800 rpm for DW96 and 600 rpm for DW24 (see Note 1).

2.2.2 Purification of His-DsbC-Cysteine-Rich Proteins

Reagents

1. DNase stock (2 mg/mL): Dissolve 100 mg of DNase in 50 mL of water. Sterilize by filtration and divide into 1 mL aliquots. Store aliquots at −20 °C.
2. 2 M MgSO4 stock: Dissolve 24 g of MgSO4 in 100 mL of water. Autoclave.
3. LabChip® HT Protein Express Chip (Perkin Elmer, 760499).
4. Protein Express Assay Reagent Kit (Perkin Elmer, CLS960008).
5. HT Protein 200 Sample Buffer (Perkin Elmer 760518).
6. Ni Sepharose 6 Fast Flow resin (GE Healthcare, reference 17-5318-02): The resin is supplied in 20% ethanol. Put aliquots of the resin in 15 mL Falcon tubes. To equilibrate the resin, wash twice in water and then twice in binding buffer. This is done by first centrifuging at 500 × g for 1 min, discarding the supernatant by inverting the tubes and resuspending in water or buffer. Repeat at each step of equilibration. After the final wash, resuspend in binding buffer as a 50% (v/v) (50:50 mL) (resin:buffer) slurry. Store the equilibrated resin at 4 °C when not in use.
7. Lysis/binding buffer 10× stock, see Subheading 2.2.1. On the day of use, dilute 25 mL of 10× stock into a final volume of 250 mL.
8. Wash buffer 10× stock: Prepare 1 L of buffer containing 500 mM Tris pH 8, 3 M NaCl, and 500 mM Imidazole ACS grade (Merck, reference 104716) in advance, filter through a 0.22 μm filter, and store at 4 °C. On the day of use, dilute 25 mL of 10× stock into a final volume of 250 mL.
9. Elution buffer 5× stock: Prepare 1 L of buffer containing 250 mM Tris pH 8, 1.5 M NaCl, and 1.25 M Imidazole ACS grade (Merck, reference 104716) in advance, filter through a 0.22 μm filter, and store at 4 °C. On the day of use, dilute 20 mL of 5× stock into a final volume of 100 mL.

Equipment and Consumables

1. Macherey-Nagel 96-well Receiver/Filter Plate 20 μm, 1.5 mL capacity (Macherey-Nagel, reference 740686.4).
2. Four deep-well 96 (DW96) plates, with 2 mL volume capacity (Greiner Bio-One, reference 780270).
3. Perkin Elmer LabChip GX II (or alternatively an SDS-PAGE electrophoresis system).
4. Spectrophotometer and plate (UVStar Greiner Bio-One) for measuring absorbance at 280 nm (A280) for calculating yield of soluble proteins.
5. Plate sonicator adapted for deep-well plates such as an Ultrasonic processor XL (Misonix Inc., USA) to ensure full cell lysis.

2.2.3 TEV Cleavage of the His-DsbC Tag

Reagents

1. TEV protease: TEVsh (2 mg/mL) in 20 mM HEPES, 300 mM NaCl, 10% Glycerol, pH 7.4, purified following the published protocol [5]. Store at −80 °C.
2. Dithiothreitol (DTT) 1 M in elution buffer stored at −20 °C. Add directly to the eluate immediately prior to dilution, before cleavage, to reach a final concentration of 0.1 mM.

2.2.4 Acidification and Purification of Oxidized Cysteine-Rich Proteins

Reagents

1. Acidification buffer 10× stock. Prepare 1 L of 55% acetonitrile (VWR, HiPerSolv CHROMANORM® LC-MS grade) and 1.1% formic acid (Sigma, 56302, LC-MS grade) in water. Store at room temperature.
2. Dry C18 reversed phase resin (100 Å, Sigma reference 60756) stored in pure acetonitrile (3 volumes of acetonitrile for one volume of beads).
3. SPE (Solid Phase Extraction) Binding buffer: 5% acetonitrile, 0.1% formic acid in water.
4. SPE Wash buffer: Solvent A without formic acid.
5. SPE Elution Buffer: 50% acetonitrile/water.

Equipment and Consumables

1. Portable chemical hood to remove acetonitrile vapors such as a F.T.M Technologies (France) hood with filters (240 m3 flow).

2.2.5 Quality Control and Quantification of Oxidized Cysteine-Rich Proteins

Reagents

1. Solvent A: water with 0.1% formic acid.
2. Solvent B: acetonitrile with 0.1% formic acid.

Equipment and Consumables

1. Reverse phase C18 column (Hypersil GOLD 50 Å, 1.0 mm, 1.9 μm, 175 Å, ThermoScientific).
2. Ultra-high-performance liquid chromatography-mass spectrometry (UHPLC-MS) system with electrospray detection and 96 auto-injector such as an Accela High Speed LC system with detector MSQ+ (ThermoScientific, San Jose, CA). Correct target peptide molecular weight was confirmed using Data Explorer software (version 4.9, Applied Biosystems). The protein concentration was determined using automatic processing with Xcalibur software (ThermoScientific), by measuring A280 and peak area integration.

3 Methods

Large-scale production of oxidized cysteine-rich proteins in E. coli requires efficient and fast procedures in order to facilitate the construction of peptide biobanks. The integrated methodologies and platforms described in this chapter were used to clone, express and purify the 4992 recombinant animal toxins of the VENOMICS project in less than a year [3]. These protocols combine automation, simplicity and robustness and consist of an HTP cloning and an HTP soluble protein expression pipeline that permits 188 cloning reactions and 188 soluble cysteine-rich protein purifications every single week. For the VENOMICS project, the cloning was performed at the company NZYTech (Lisbon, Portugal) and the production at the AFMB (Marseille, France), but both pipelines utilize similar equipment and therefore could be implemented at the same site with the same throughput as this project.

3.1 Generation of the Cysteine-Rich Clone Library

A schematic representation of the HTP gene synthesis and cloning pipeline is presented in Fig. 1. This pipeline includes six main steps that allow the successful synthesis of genes encoding venom peptides in multiples of 96. The first step is gene design with codon optimization: from the peptide sequences, multiple DNA sequences are designed and optimized for expression in E. coli using the NZYTech codon optimization software (ATGenium). In step 2, the primers required for gene assembly are designed and synthesized; they are then assembled by PCR using optimal conditions (step 3). Synthetic genes are cloned using the NZYTech ligation-independent cloning (LIC) protocol into the E. coli expression vector pHTP4, which contains redox-active DsbC as a fusion tag, allowing the efficient formation of correctly folded disulfide bridges (step 4). In the subsequent steps, HTP bacterial transformation (step 4) and HTP plasmid DNA preparations (step 5) are accomplished using high-throughput protocols. Finally, DNA sequences are checked for sequence errors using the Sequencing Analysis tool (step 6).

3.1.1 Gene Design with Codon Optimization

The main purpose of this project was to efficiently produce cysteine-rich peptides in E. coli in order to establish large libraries of highly purified recombinant molecules for drug discovery. Cysteine-rich peptides produced by venomous animals display a codon usage that is not optimal for expression in bacterial hosts. Thus, DNA sequences encoding venom peptides were designed by back-translating the peptide sequence, with codon usage optimized for high levels of expression in E. coli using the ATGenium codon optimization algorithm (see Note 2). This algorithm selects codons used preferentially in highly expressed or average native E. coli genes (see Note 3) and excludes naturally rare codons, while ensuring an equal proportion of the two cysteine codons. Other factors considered for gene design were a GC content between 40 and 60% and a codon adaptation index (CAI) value higher than 0.8. In addition, putative E. coli regulatory sequences such as promoters, activators, or operators, and unwanted restriction sites were removed from the sequences. The presence of G/C islands, which might promote frame-shifting, was minimized by selectively avoiding runs of consecutive G and/or C longer than six nucleotides. Finally, no contiguous strings of nucleotides longer than five nucleotides were allowed. At the 5′ and 3′ ends of the gene sequences, the overhanging LIC sites required for direct cloning into the expression vector were inserted (see Table 1).

Fig. 1 Schematic representation of the HTP gene synthesis and cloning platform. ♦ highlights bioinformatic tools used for HTP gene and primer design, and sequencing analysis (NZYTech, Ltd); ♣ indicates steps that can be performed in an automated format using a liquid handling robot containing a vacuum system

Table 1 Overhang sequences included in the outer primers that allow cloning into the pHTP4 vector

Outer primer overhang    Sequence
Forward                  5′-TCAGCAAGGGCTGAGG...-3′
Reverse                  5′-TCAGCGGAAGCTGAGG...-3′

3.1.2 Gene Assembly by Polymerase Chain Reaction (PCR)

Oligonucleotide Design

Oligonucleotides required to synthesize 96 synthetic genes were designed using the NZYOligo designer program, which generates a primer pool per gene. This program uses the DNA sequence of each gene as template to design the assembly oligonucleotides. For a given gene, the external oligonucleotides are termed outer primers (forward and reverse primers), and the internal oligonucleotides are termed inner primers. Primers used in gene assembly by PCR had a length of 60 bp, an overlap region of 20 bp between forward and reverse primers, and a gap of 20 bp (see Note 4). A 16-bp overhang (see Table 1) was added to the 5′-end of both outer primers to allow ligation-independent cloning into the pHTP4 E. coli expression vector (see Note 5). All oligonucleotides should be obtained desalted at 5 μM in nuclease-free water as described above (see Notes 6 and 7).

PCR Assembly and Cleanup

A PCR reaction with a final volume of 50 μL was performed to synthesize each gene, with the oligonucleotides serving as template for gene construction.

1. For each gene to be synthesized, prepare an inner primer mix at 125 nM containing all inner oligonucleotides (see Note 8). Seal the plate with an adhesive PCR seal and centrifuge briefly to collect the reaction components.
2. On ice, prepare a PCR master mix sufficient for 96 PCR reactions (we recommend preparing the master mix in excess, for example, for 104 reactions). Combine the reagents specified in Table 2 (see Note 9). Mix the solution well and centrifuge briefly to collect it at the bottom of the tube.

Table 2 PCR components used to perform 96 PCR assembly reactions

Components                      Volumes for 1 rx. (μL)    Volumes for 104 rxs. (μL)
10× Reaction Buffer             5                         520
25 mM MgSO4                     3                         312
2 mM dNTP mixture               5                         520
KOD Hot Start DNA polymerase    1                         104
H2O                             12                        1248

3. Add 26 μL of PCR master mix to each well of a 96-well PCR plate, using a multichannel pipette.
4. Using a multichannel pipette, add 8 μL of inner primer mix at 125 nM, followed by 8 μL of each outer primer at 5 μM, into the corresponding well of the PCR96 plate containing the PCR master mix. The two outer primers are thus used at a final concentration of 800 nM each, while the inner primers, pooled in an equimolar mixture, reach a final concentration of 20 nM in the PCR reaction. Seal the plate using an adhesive PCR seal and centrifuge to collect the contents at the bottom of the plate.
5. Perform the PCR amplification with the thermocycler set to the following parameters: 95 °C for 2 min, 1 cycle; 95 °C for 20 s, 60 °C for 8 s, 70 °C for 3 s, 26 cycles; 4 °C, ∞.
6. After the amplification, confirm the efficiency of PCR assembly by analyzing a sample of 8 out of the 96 reactions (one column of the 96-well PCR plate, for example) by agarose gel (1.5% w/v) electrophoresis. Run 5 μL of each PCR product against a DNA ladder on the agarose gel (see Note 10).
7. If the PCR products are the right size and are present as a single band, proceed with silica-column purification of the PCR product using the NZYDNA Clean-up 96 well plate kit (follow the manufacturer’s instructions and use the vacuum automated format) (see Note 11).
8. Determine the DNA concentration using a UV/Vis microplate spectrophotometer, by measuring absorbance at 260 nm. Calculate an average concentration for 16 PCR products (in ng/μL) and use this value to calculate the amount of PCR product to use per cloning reaction, as explained below.
9. Confirm the efficiency of the PCR clean-up step by analyzing a sample of 8 out of the 96 reactions by agarose gel (1.5% w/v) electrophoresis (prepare and run samples as explained above). Agarose gel electrophoresis should reveal eight clear bands with the correct gene size (see Note 10).

3.1.3 Cloning, Transformation, and Plasmid Purification

Cloning and Transformation

Purified PCR products were cloned directly into the pHTP4 expression vector using the NZYEasy Cloning & Expression System, which is based on LIC technology (see Notes 12 and 13). Use the following equation to calculate the amount of PCR product to be used per cloning reaction.

PCR product required (ng)* = PCR product length (bp)* × 0.083

* Average concentration and PCR product length previously calculated for 16 genes.

1. On ice, prepare a cloning master mix sufficient for 96 reactions (we recommend preparing the master mix in excess, for example, for 104 reactions) by combining the reagents specified in Table 3, excluding the purified PCR product, which is added after the master mix has been distributed into the 96-well PCR plate. A single cloning reaction is performed in a final volume of 10 μL. Mix the master mix and centrifuge to collect it at the bottom of the tube.
2. Calculate the volume of master mix to aliquot per well as follows:

Volume of master mix (μL) = 10 − average volume of PCR product (μL)

3. Distribute the cloning master mix into a 96-well PCR plate, using a multichannel pipette.
4. Add x μL of each purified PCR product into the 96-well PCR plate. Seal the plate with an adhesive PCR seal and centrifuge to spin down.
5. Perform the cloning reaction using the following parameters: 37 °C for 60 min; 80 °C for 10 min; 30 °C for 10 min; 4 °C, ∞.
6. After the cloning reaction, centrifuge briefly to collect the reaction components at the bottom of the plate and store samples on ice or at −20 °C for subsequent transformation.
7. Prepare 4 × 24-well LB agar plates containing LB agar plus 50 μg/mL kanamycin with a multi-dispenser (see Note 14).
8. Place a new 96-well PCR plate on ice.

Table 3 Cloning reaction components used to perform 96 cloning reactions

Components              Volumes for 1 rx. (μL)    Volumes for 104 rxs. (μL)
10× Reaction Buffer     1                         104
pHTP4 vector            1                         104
NZYEasy enzyme mix      0.5                       52
Purified PCR product    x                         x × 104
H2O                     up to 10 μL               up to 1040 μL

x: average volume of purified PCR product to use in one cloning reaction. Do not include in preparation of cloning master mix

9. While on ice, distribute 25 μL of DH5α E. coli competent cells into each well of the 96-well PCR plate, using an automatic multichannel pipette (see Note 15).
10. Add 5 μL of cloning reaction to the competent cells, using a multichannel pipette. Seal the plate with an adhesive PCR seal. Do not mix by pipetting.
11. Incubate the plate for 30 min on ice and heat-shock the cells at 42 °C for 40 s, then transfer the 96-well PCR plate back to ice for 2 min (see Note 16).
12. Using an automatic multichannel pipette, add 100 μL of SOC medium into each well and cover with a new adhesive PCR seal to prevent contamination. Incubate at 37 °C for 90 min in a shaking incubator with vigorous agitation.
13. Dispense 60 μL of transformed cells onto the previously prepared 4 × 24-well LB agar plates. To spread the cells, place the four plates in a shaking incubator set to 37 °C for 60 min, with gentle shaking.
14. Invert the plates, transfer them to the plate incubator, and incubate overnight at 37 °C.
15. After overnight incubation, inspect the LB agar plates for the presence of colonies and select one colony from each transformation reaction for DNA isolation.
16. Inoculate one single colony per transformation in 5 mL of LB liquid medium supplemented with 50 μg/mL of kanamycin (use 4 × DW24) and seal the plates with breathable seals. Take care to avoid cross-contamination. Incubate the deep-well plates at 37 °C for ~16 h, with gentle agitation.

Plasmid Purification

1. Harvest cells at 1500 × g for 15 min at 4 °C, using a centrifuge with a rotor for DW24 plates. Discard the supernatant.
2. Isolate plasmid DNA from the bacterial pellets using the NZYMiniprep 96 well plate kit (according to the manufacturer’s instructions for vacuum automated format) (see Note 17).
3. Determine the DNA concentration using a UV/Vis microplate spectrophotometer, by measuring absorbance at 260 nm. Additionally, analyze a sample of 8 out of the 96 DNA preparations by agarose gel (0.8% w/v) electrophoresis, to confirm the concentration of plasmid DNAs. Agarose gel electrophoresis should reveal a clear band with the correct plasmid size (see Note 10).
4. Store plasmid DNA at −20 °C.

Quality Control by DNA Sequencing

1. The integrity of the artificial gene sequences should be confirmed by Sanger sequencing. Use two primers to sequence the DNA in both directions. Quality control should ensure that the new DNA sequence is identical to the designed sequence (see Note 18).
2. Sequencing results are analyzed using the NZYMulti Alignment software or by performing a sequence alignment between template DNA sequences and sequencing results of the 96 plasmids (see Notes 19 and 20).
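The comparison in step 2 can also be scripted. The following is a minimal sketch, not the NZYMulti Alignment software: it assumes clean base calls across the insert, and the function names, the toy gene, and the flanking sequences are all hypothetical. A clone is accepted only if the designed gene is found exactly in the forward read and in the reverse complement of the reverse read.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def insert_is_correct(designed, forward_read, reverse_read):
    """True if the designed gene is present, error-free, in both reads."""
    designed = designed.upper()
    in_forward = designed in forward_read.upper()
    in_reverse = designed in reverse_complement(reverse_read).upper()
    return in_forward and in_reverse


# Toy data: an invented 18-nt "gene" with arbitrary flanking sequence.
designed = "ATGTGCAAATGTGGCTAA"
forward_read = "GGGCTGAGG" + designed + "CCTCAGC"
reverse_read = reverse_complement(forward_read)
# Simulate a single-base deletion in the forward read.
mutated_read = forward_read.replace("AAATGT", "AATGT")

print(insert_is_correct(designed, forward_read, reverse_read))
print(insert_is_correct(designed, mutated_read, reverse_read))
```

An exact-match test of this kind only flags a clone as correct or not. Locating the mutation itself (deletion, substitution, or insertion) still requires an alignment.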

3.2 HTP Cysteine-Rich Peptide Bank Production

All the steps below were performed in an almost fully automated manner using a liquid handling robot (TECAN Freedom EVO series). The whole protein purification procedure is designed to be performed in 7–8 days for 96 proteins. To keep up with the timeline of the VENOMICS project (188 proteins/week), this protocol was run twice every week over several months. The procedure can be broken down into five steps: (1) transformation, culture, and harvest of E. coli cultures, (2) purification of His-DsbC-fusion proteins, (3) cleavage of the tag by TEV protease, (4) purification of correctly oxidized cysteine-rich proteins, and (5) quality control of the proteins. A schematic representation of the HTP protein production pipeline is presented in Fig. 2.

3.2.1 Transformation, Culture, and Cell Harvest

Day 1

1. For a set of 96 constructs to be transformed, prepare four TC24 plates containing 2 mL of LB agar supplemented with 50 μg/mL kanamycin and 34 μg/mL chloramphenicol. Allow to set and dry.
2. Thaw 3 × 1 mL of competent BL21 (DE3) pLysS cells on ice, then aliquot 25 μL of competent cells into each well of a PCR96 plate using a multichannel pipette. Keep the plate on ice until the thermal shock in step 4.

3. Add 1 μL of the expression plasmids (at a concentration > 10 ng/μL for pure plasmids) with a multichannel pipette. Ensure that the plasmid is dispensed into the cells but do not mix by pipetting. Cover the plate with plastic film to avoid contamination.
4. Incubate on ice for 30 min, then place the plate at 42 °C for 45 s (thermal shock), then transfer back to ice for 3 min. Add 100 μL of SOC medium from a reagent reservoir using a multichannel pipette and incubate for 60 min at 37 °C.
5. In the meantime, prepare a sterile DW96 containing 1 mL of LB (with appropriate antibiotics) in each well using a repeat pipettor and seal with plastic adhesive to prevent contamination.
6. At the end of the transformation, dispense 60 μL of transformed cells onto the pre-prepared 24-well LB agar plates (see Note 21). Place in a shaker for 10 min to spread and leave the plates open to dry for 10 min under a hood (or in the incubator). Close the plates and leave them inverted at 37 °C overnight.
7. Dilute 60 μL of transformed cells into the DW96 containing the medium. Seal the deep-well plate with a breathable film to allow culture aeration. Place in a 37 °C shaking incubator at maximum speed (800 rpm) overnight. This is the preculture for the production phase (DW96-Transfo).
8. The next day, the preculture is used to inoculate the cultures in auto-induction medium. The remaining preculture is used to prepare glycerol stocks if desired (see Note 22).

Fig. 2 Schematic representation of the HTP protein production

Day 2

1. Make 2.5 L of NZY auto-induction LB medium supplemented with 50 μg/mL kanamycin and 34 μg/mL chloramphenicol. Dispense 2 mL into each well of 48 × DW24 plates with a repeat pipettor (DW24-Culture).
2. For each target, aspirate 600 μL of preculture and dispense 12 × 50 μL (1/40 dilution) into 12 consecutive wells of a DW24 to inoculate the cultures. With this protocol, two targets are grown per DW24 (wells 1–12, target 1; wells 13–24, target 2; and so on), so 48 DW24 must be inoculated to grow 96 different expressions. The transfer of the preculture from the DW96 into the DW24 can be done using the Matrix multichannel pipette with variable span or the robotic system.
3. Incubate at 25 °C with shaking (600 rpm) for 24 h.
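The plate layout in step 2 can be written down as a small lookup, which is convenient when preparing a worklist for the pipette or robot. This is an illustrative sketch only: it assumes targets are numbered 1 to 96 and fill the DW24 plates in numerical order, with wells numbered 1 to 24, and the function and constant names are our own.

```python
WELLS_PER_PLATE = 24
WELLS_PER_TARGET = 12
CULTURE_VOLUME_ML = 2.0


def culture_wells(target):
    """Return (plate, first_well, last_well), all 1-based, for a target numbered from 1."""
    targets_per_plate = WELLS_PER_PLATE // WELLS_PER_TARGET
    plate = (target - 1) // targets_per_plate + 1
    slot = (target - 1) % targets_per_plate
    first_well = slot * WELLS_PER_TARGET + 1
    return plate, first_well, first_well + WELLS_PER_TARGET - 1


n_targets = 96
plates_needed = culture_wells(n_targets)[0]
medium_ml = n_targets * WELLS_PER_TARGET * CULTURE_VOLUME_ML

print(culture_wells(1), culture_wells(2), culture_wells(96))
print(plates_needed, medium_ml)
```

For 96 targets this gives 48 plates and 2304 mL of medium, consistent with the 2.5 L prepared in step 1.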

Day 3

1. At the end of culture, visually check that growth is homogeneous across the 12 cultures of each target. To determine the OD600nm, take 20 μL of one culture per target and dispense into a flat-bottomed, clear microtiter plate containing 180 μL of medium in each well. Measure the OD600nm, taking into account the tenfold dilution. The OD should reach around 12.
2. Centrifuge the 48 DW24 plates at 3800 × g for 10 min, then discard the supernatant into a waste container; decontaminate the media before disposal. Tap the plates, upside-down, onto absorbent paper to remove any excess medium.
3. In the meantime, prepare 600 mL of lysis buffer containing lysozyme (see Note 23).
4. Add 0.5 mL of lysis buffer to each well and resuspend the pellets by shaking at 20 °C and 800 rpm for 10 min. Store the DW at −20 °C overnight before purification.

3.2.2 Purification of His-DsbC-Cysteine-Rich Proteins

Day 4

1. Take the 12 DW24 corresponding to targets 1 to 24 out of the freezer. Thaw the frozen cell suspensions in a water bath for 10 min at 37 °C, followed by 15 min of shaking at 20 °C in the incubator. The cultures should become viscous (see Note 23).
2. Take 1.5 mL of DNase stock and mix it with 3 mL of MgSO4 stock. Dispense 15 μL into each well of the 12 DW24, to give a final concentration of 10 μg/mL of DNase and 20 mM MgSO4. Re-seal the plate with plastic tape and shake for a further 15 min at 20 °C, after which the cultures should be non-viscous (see Note 23).
3. Pool the 12 × 0.5 mL lysates of each target into a single well of a new DW24 (6 mL/well), to obtain 24 distinct 6 mL lysates of DsbC-His-fusion protein per DW24 (DW24-Lysate).
4. While pooling into the new DW24, check carefully that all the cultures are no longer viscous (see Note 24).
5. Optional: using the Matrix variable span pipette, aspirate 5 μL of the whole cell lysate and dispense into a 96-well PCR plate containing 7 μL of LabChip Sample Buffer. Denature for 5 min at 95 °C. Proceed to the analysis on the LabChip GXII (Perkin Elmer) the same day or freeze the plate (Total fraction). On the day of the analysis on the LabChip GXII (Perkin Elmer), thaw the plate, denature for 3 min at 95 °C, add 32 μL of water, and analyze following the manufacturer’s instructions. This analysis can be replaced by SDS-PAGE.
6. Sonicate the DW24 on ice in a plate sonicator (Ultrasonic processor XL, Misonix Inc., USA) for 5 min (power 5, 30 s ON/OFF cycles).
7. Move the DW24 to the Tecan Freedom worktable (see Note 25).
8. Start the Tecan purification procedure. The protocol initially designed for 96 nickel purifications with a 96-pipetting head [4] is used, with a minor modification to purify 24 targets in quadruplicate in one experiment. Because, with a 96-head, 4 tips can aspirate from or dispense into a single well of a DW24, the 24 targets are purified in one go. The DW96 plates collecting the flowthrough, wash, and elutions are simply replaced by DW24 plates (see schematic). With this protocol, every 4 wells of the 96-well filter plate are combined into a single well of the DW24. Therefore, 1.5 mL of lysate (3 mL of initial culture) is purified on each well of the filter plate and eluted in 4 mL (combining 4 × 1 mL of elution buffer).
9. With the 96-pipetting head of the robot, mix the resin suspension before aspiration to ensure a homogeneous suspension and that an equal amount of resin is dispensed into each well, then transfer 200 μL of the 50% (v/v) Ni Sepharose resin slurry to each well of the DW24.
10. Repeat step 9. Each well of the DW24 (6 mL lysate) now contains the equivalent of 1.6 mL of slurry (800 μL of resin).
11. With the same tips, to ensure good protein binding and homogeneous mixing, perform slow up-and-down pipetting cycles on the slurry/lysate with the 96-pipette head over 10 min.
12. After 10 min, transfer 200 μL of the lysate/bead mixture to a Macherey-Nagel filter/receiver plate (20 μm), mixing before aspiration. Repeat 3 more times to transfer 800 μL/well of the 96-well filter plate.
13. Turn a mild vacuum on for approximately 60 s to filter the lysate, collecting the flowthrough in a DW24. Turn the vacuum off.
14. Repeat steps 12 and 13 two more times so that all the slurry/lysate mix is transferred to the filter plate. Each well of the filter plate now holds around 200 μL of resin.
15. Remove the DW24 containing the flowthrough and replace it with the waste reservoir. Keep the flowthrough aside until the end of the purification.
16. Using the 96-pipette head, wash with 800 μL of binding buffer (50 mM Tris, 300 mM NaCl, 10 mM imidazole, pH 8): transfer 200 μL of binding buffer onto the filter plate, repeating 3 times.
17. Turn a mild vacuum on for approximately 60 s to aspirate the buffer. Turn the vacuum off.
18. Repeat the wash a second and a third time by repeating steps 16 and 17 twice.
19. To remove E. coli proteins weakly bound to the resin, repeat steps 16 and 17, replacing the binding buffer with the wash buffer (50 mM Tris, 300 mM NaCl, 50 mM imidazole, pH 8).
20. Put a DW24 below the filter plate.

21. Using the 96-pipette head, elute the protein with 1000 μL of elution buffer (50 mM Tris, 300 mM NaCl, 250 mM imidazole, pH 8): transfer 200 μL of elution buffer onto the filter plate, repeat 4 times. Wait 3 min.
22. Turn the vacuum on for approximately 60 s to aspirate the buffer, and collect the elution in the DW24 plate (DW24-Purified fusion). Turn the vacuum off.
23. Using the variable span pipette (or robotics), aspirate 5 μL of the elution (Purified fusion) and dispense into a PCR96 plate containing 7 μL of LabChip Sample Buffer. Denature for 5 min at 95 °C, add 32 μL of water, and analyze the samples following the manufacturer’s instructions with the High Sensitivity HT Protein Express protocol (10–100 kDa program) on the LabChip GXII device (Perkin Elmer). This analysis can be replaced by SDS-PAGE if a Perkin Elmer machine is not available in the lab (see Note 26).
24. Transfer 200 μL of elution to a UVStar plate, measure absorbance at 280 nm, and calculate the concentration of the His-DsbC-protein.

This procedure is repeated another three times in 1 day in order to purify the 96 His-DsbC-proteins. Once the purification of the first 24 targets has started on the Tecan robot, thaw the second series of 12 DW24 (targets 25–48) and proceed through the lysis/sonication steps. At the end of the purification of targets 1–24, start the Tecan program for the purification of targets 25–48, thaw the DW for targets 49–72, and repeat one more time for targets 73–96. Applying this protocol to 4992 cysteine-rich proteins, we could detect purified fusion proteins in all cases, with an average yield of 4 mg of His-DsbC-fusion protein purified per culture (>160 mg/L of culture) and up to 10 mg of purified protein in some cases (see Note 27).

3.2.3 TEV Cleavage of the His-DsbC Tag

1. Taking into account the protein concentration calculated in Subheading 3.2.2, step 24, add the correct volume of TEV protease (2 mg/mL stock) to each elution of the nickel purification (4 DW24 containing 3.8 mL each) (DW24-Cleaved). To ensure full cleavage, a TEV/protein ratio of 1/10 (w/w) is recommended. Add fresh DTT to reach 0.1 mM and elution buffer to a final volume of 5 mL. The addition of TEV, DTT, and elution buffer can be done with the help of a macro written for the Tecan software or with pipettes (see Note 28).
2. Incubate with mild agitation overnight at room temperature.
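The volumes for step 1 follow directly from the measured concentration. The sketch below is one possible way to compute them, not the Tecan macro mentioned above. It reads the recommended 1/10 (w/w) ratio as 1 mg of TEV per 10 mg of fusion protein, takes the 3.8 mL eluate and 5 mL final volume from the text, and uses hypothetical function and constant names.

```python
TEV_STOCK_MG_PER_ML = 2.0
TEV_TO_PROTEIN_RATIO = 0.1   # assumed reading of the 1/10 (w/w) ratio
DTT_STOCK_MM = 1000.0        # 1 M stock
DTT_FINAL_MM = 0.1
ELUATE_VOLUME_ML = 3.8
FINAL_VOLUME_ML = 5.0


def cleavage_mix(fusion_mg_per_ml):
    """Volumes (in microliters) of TEV, DTT, and elution buffer to add to one eluate."""
    fusion_mg = fusion_mg_per_ml * ELUATE_VOLUME_ML
    tev_ml = fusion_mg * TEV_TO_PROTEIN_RATIO / TEV_STOCK_MG_PER_ML
    dtt_ml = DTT_FINAL_MM * FINAL_VOLUME_ML / DTT_STOCK_MM
    buffer_ml = FINAL_VOLUME_ML - ELUATE_VOLUME_ML - tev_ml - dtt_ml
    if buffer_ml < 0:
        raise ValueError("eluate too concentrated for a 5 mL reaction")
    return {
        "tev_ul": tev_ml * 1000,
        "dtt_ul": dtt_ml * 1000,
        "buffer_ul": buffer_ml * 1000,
    }


# Example: an eluate measured at 1.0 mg/mL of fusion protein.
mix = cleavage_mix(1.0)
print({name: round(volume, 1) for name, volume in mix.items()})
```

For an eluate at 1.0 mg/mL this gives about 190 μL of TEV stock, 0.5 μL of 1 M DTT, and roughly 1 mL of elution buffer. The sub-microliter DTT volume suggests that an intermediate DTT dilution may be easier to pipette accurately.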

3.2.4 Acidification and Purification of Oxidized Cysteine-Rich Proteins

After the overnight TEV cleavage, the correctly oxidized toxin must be separated from the TEV protease and the DsbC fusion partner. The samples are acidified in order to precipitate the TEV protease and misfolded peptides, then the toxins are separated from the DsbC by a C18 SPE purification.

1. Add 500 μL of a 55% acetonitrile and 1.1% formic acid stock solution to each well of the 4 DW24 containing the overnight cleaved samples (see Note 29).
2. Seal the DW with a plastic cover and incubate for 1 h at room temperature with mild agitation (DW24-Acidified).
3. Centrifuge the 4 DW24 plates at 3800 × g for 10 min to eliminate precipitated TEV, then transfer the supernatant into 4 new DW24.
4. Transfer the 4 DW24 containing the clarified, acidified cleavage reactions to the Tecan deck to continue with the C18 SPE purification.
5. With the 96-pipette head of the robot, mix the 25% (v/v) C18 resin slurry (stored in acetonitrile) to ensure a homogeneous suspension and transfer 200 μL to each well of a Macherey-Nagel filter/receiver plate (20 μm) (see Note 30).
6. Turn on a mild vacuum for approximately 30 s to remove the acetonitrile.
7. To equilibrate the beads, using the 96-pipette head, transfer 2 × 200 μL of binding buffer to the filter plate.
8. Turn on a mild vacuum for approximately 45 s to aspirate the buffer. Turn the vacuum off.
9. With the 8-needle arm of the Tecan robot, transfer 1000 μL of the clarified, acidified cleavage products to the respective wells of the Macherey-Nagel filter/receiver plate (20 μm) containing the C18 resin.
10. Turn a mild vacuum on for approximately 60 s to filter the cleavage products through the plate. The flowthrough is collected in a waste container. Turn the vacuum off.
11. Repeat steps 9 and 10 another 4 times to transfer and bind all the toxins on the C18 resin. At this step, because of the small pore size of the C18 beads, most of the DsbC and the last traces of TEV are eliminated in the flowthrough (FT).
12. Using the 96-pipette head, wash with 800 μL of binding buffer: transfer 200 μL of binding buffer onto the filter plate, repeating 3 times.
13. Turn on a mild vacuum for approximately 60 s to aspirate the buffer. Turn the vacuum off. The last traces of the DsbC are eliminated at this step.

14. Using the 96-pipette head, wash with 800 μL of wash buffer: transfer 200 μL of wash buffer onto the filter plate, repeating 3 times.
15. Turn on a mild vacuum for approximately 60 s to aspirate the buffer. Turn the vacuum off. This step removes the traces of formic acid.
16. Place a DW96 under the filter plate.
17. Using the 96-pipette head, elute the toxins with 560 μL of elution buffer (50% acetonitrile in water): transfer 187 μL of elution buffer onto the filter plate, repeating 3 times.
18. Wait 3 min.
19. Turn on a mild vacuum for approximately 60 s to aspirate the buffer. Turn the vacuum off.
20. Transfer the elution DW to a chemical hood and leave for 16 h with mild agitation at room temperature to evaporate the acetonitrile.
21. Transfer the DW to 4 °C. The pure oxidized toxins are now in 280 μL of water. They are stable for several weeks in these conditions.

3.2.5 Quality Control and Quantification of Oxidized Proteins

To keep up with the pipeline throughput (188 toxins/week) and to ensure that the final peptide bank would be 100% verified by mass spectrometry for the correct mass and position of each toxin, a double control system was set up. First, at the end of the purification, an aliquot was analyzed on site. Then, once the final peptide bank was aliquoted and frozen at constant concentration (this part of the protocol is not covered in this chapter; please refer to Ref. [3] for details), a new aliquot was analyzed by a third-party mass spectrometry lab.

1. Take a 20 μL aliquot from each of the 96 final toxins and transfer to a PCR96 plate.
2. Transfer the plate to the auto-injector module (set to 4 °C) of a UHPLC-MS with electrospray ionization detection.
3. Inject each sample on an analytical C18 column at a flow rate of 200 μL/min. Each 6 min analysis is performed with an acetonitrile gradient going from 5 to 40% in 2 min, followed by an 80% wash and re-equilibration for the next injection.
4. To confirm the molecular weight of target peptides, the resulting mass spectra were deconvoluted using manual calculations and compared with theoretical values determined from the amino acid sequences. The concentration of the proteins was determined using automatic processing with the UHPLC software by A280 measurement and peak integration.

Out of the 4992 toxins of the VENOMICS project, 2736 (55%) were purified with the correct theoretical mass after the mass spectrometry quality control step.
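As an illustration of what a manual deconvolution involves, the sketch below estimates the neutral mass of a peptide from two adjacent charge-state peaks of an electrospray spectrum. It assumes positive-ion mode with protons as the only charge carriers. This is a generic textbook calculation and not necessarily the exact procedure used for VENOMICS; the function names, the synthetic m/z values, and the 1 Da tolerance are hypothetical.

```python
PROTON_MASS = 1.007276  # Da


def neutral_mass(mz_lower_charge, mz_higher_charge):
    """Estimate (charge, mass) from two adjacent charge-state peaks.

    mz_lower_charge is the peak at larger m/z (charge z);
    mz_higher_charge is its neighbour at smaller m/z (charge z + 1).
    """
    spacing = mz_lower_charge - mz_higher_charge
    z = round((mz_higher_charge - PROTON_MASS) / spacing)
    return z, z * (mz_lower_charge - PROTON_MASS)


def matches_theoretical(observed, theoretical, tolerance_da=1.0):
    """True if the observed mass is within the tolerance of the expected mass."""
    return abs(observed - theoretical) <= tolerance_da


# Synthetic peaks for a 4000.0 Da peptide carrying 3 and 4 protons.
mz3 = 4000.0 / 3 + PROTON_MASS
mz4 = 4000.0 / 4 + PROTON_MASS
charge, mass = neutral_mass(mz3, mz4)
print(charge, round(mass, 2), matches_theoretical(mass, 4000.0))
```

When comparing with the theoretical value, remember that each disulfide bridge formed lowers the mass of the fully reduced peptide by about 2 Da (two hydrogens lost).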

4 Notes

1. The production of 96 cysteine-rich proteins described in this chapter requires growing 48 DW24. Only a dedicated short orbital incubator for high speeds, such as an Infors Multitron (model number AJ103), will have enough deck space and provide sufficient aeration in rich media for multiple DW. A throughput of 2304 cultures per week (96 DW24) was necessary to keep up with the 188 (2 × 96) new cysteine-rich proteins received per week on the VENOMICS project. To be able to perform the whole process with only one Multitron incubator, a plexiglass second level was custom-made and added inside the incubator. For a lower-throughput project with few DW24, the Multitron can be replaced by a regular shaking incubator for all culture steps (at a speed of 200 rpm).
2. The ATGenium codon optimization algorithm designs artificial genes with codon usage optimized for expression in E. coli.
3. The ATGenium codon optimization algorithm may be adapted to design genes optimized for many different hosts, depending on the input of the respective codon usage tables.
4. We recommend an oligonucleotide length between 50 and 60 bp, which is the maximum synthetic length that ensures high fidelity for most commercial DNA oligonucleotide providers.
5. The pHTP4 expression vector contains two histidine tags (N- and C-terminal).
6. In the PCR reaction, oligonucleotides are used at a final concentration of 20 and 800 nM (inner and outer primers, respectively).
7. We recommend the use of oligonucleotides with no purification because additional purification steps were previously shown to have a minor impact on the error rates of synthetic genes. Thus, the high cost associated with oligonucleotide purification (such as HPLC, high-performance liquid chromatography, or PAGE, polyacrylamide gel electrophoresis) can be avoided during oligonucleotide synthesis.
8. To avoid pipetting errors, we recommend using a robotic system to prepare the inner primer mix.
9. A high-fidelity DNA polymerase is strongly recommended to minimize errors introduced during gene assembly by PCR.
10. If no bands are visible on the agarose gel, we recommend analyzing more samples in order to verify the success of the PCR assembly, PCR clean-up, or plasmid purification. In case of negative results, consider that the respective step was not correctly performed.

188

Yoan Duhoo et al.

11. To efficiently recover DNA, we recommend adding 80 μL of Elution Buffer or nuclease-free water (pH 8.5) to each well and incubate for 1–3 min at room temperature in order to increase elution efficiency. 12. One of the main advantages of this system is the possibility of cloning synthesized genes directly into E. coli expression vectors, saving time and decreasing costs. 13. If necessary, this system allows cloning the genes of interest directly into the pHTP0 cloning vector (ampicillin resistant). Subsequently, these genes can be transferred to pHTP expression vectors (NZYTech, Ltd.) containing different solubility tags and kanamycin resistance, thus generating different expression clones of the same gene. 14. LB agar plates can be prepared in advance and stored for up to 4 weeks at 4  C. Before usage, they should be pre-warmed and dried at 37  C, by leaving them in a plate incubator placed upside down. 15. We recommend the use of high-efficiency DH5α competent cells for the transformation step. If you fail to get colonies, repeat the transformation using a higher amount of cells (50 μL) plus the remaining 5 μL of cloning reaction. 16. Transformation protocol can be performed in a water bath or on a PCR machine. 17. To improve recovery of plasmid DNA, we recommend adding 120 μL of Elution Buffer or nuclease-free water (pH 8.5) to each well, and incubate for 1–3 min at room temperature in order to increase elution efficiency. 18. For sequencing pHTP expression vectors, we recommend using T7 promoter universal primer (50 - TAATACGACTCACTATAGGG -30 ) and T7 terminator primer (50 - GCTAGTTATTGCTCAGCGG -30 ). 19. The error rate of the large-scale method described here is 1.06 mutations per kilobase of DNA synthesized (single base deletions are more frequent, followed by substitutions and insertions). In case the DNA sequence is not 100% identical to the designed sequence, screen a second or a third colony. 
An average of 1.3 clones are necessary to obtain the correct gene.
20. The low error rate of this gene synthesis method avoids additional steps for the removal of errors. Since the identification of 100% correct genes is performed by screening a maximum of three colonies, the labor required for the selection and validation of recombinant clones is reduced.
21. The agar plates are only back-ups. They can be stored at 4 °C for the scale-up production or for any cases where the liquid preculture does not grow but there are colonies on the plates.

High-Throughput Toxin Production


In that case, toxin production is postponed by 24 h and the precultures are redone by diluting the original preculture in fresh medium and picking colonies for the few missing precultures to complete the plate.
22. Glycerol stocks can be stored at −80 °C and used to inoculate precultures for subsequent rounds of expression or scale-up. Glycerol stocks should be made in replicates. Preparation of glycerol stocks: Dispense 30 μL of 100% glycerol using a multidispensing pipette set to slow speed into each well of a 96-well microtiter plate. Transfer 120 μL of each culture into the corresponding well of the microtiter plate and mix by pipetting slowly and gently. Seal with plastic adhesive tape and store at −80 °C.
23. While it is possible to include DNase and MgSO4 in the lysis buffer, we recommend against it. That way, when the cells are thawed, lysis will be visible because the cell suspension will be viscous. If the lysozyme was accidentally omitted and DNase and MgSO4 are also present in the lysis buffer, it will be impossible to tell whether lysis was successful.
24. This is the most critical point of the whole procedure. If some cultures are still viscous (for example, if the DNase was accidentally omitted in some wells), the filter will clog, generating uneven pressure on the samples, and contamination or total clogging of the filter plate could occur during the purification.
25. If the culture OD at harvesting time is above 12, the filter may clog with cell debris. In that case, centrifuge the DW24 plate at 3800 × g for 10 min, transfer the supernatant into a new DW24 plate, and use this cleared lysate for the nickel purification.
26. The LabChip GXII is much quicker and more precise than SDS-PAGE for the validation of MW and sample concentrations of each elution of His-DsbC-toxins.
However, because the lowest MW detected is 5 kDa, similar to SDS-PAGE, the LabChip turned out to be inadequate for the quantification and MW determination of most of the final purified toxins. Only the UPLC-MS analysis (see below) turned out to be a precise way to quantify and check the exact mass of each toxin.
27. In the production phase of VENOMICS described here, due to time constraints, the samples were not run on the PerkinElmer LabChip GXII; only the OD of the purified samples was used to calculate the protein concentration. It is probable that a fair portion of the purified proteins was in fact not His-DsbC-oxidized toxin but His-DsbC-misfolded toxin or even His-DsbC alone, the misfolded toxin having been eliminated by E. coli proteases in that case (see Refs. [2, 3] for discussion


around these points). For a small set of proteins of interest, we advise checking the MW of the protein after elution and, in the case of a wrong MW or an abundant population of His-DsbC over His-DsbC-toxin, tuning the culture conditions using the quantity of His-DsbC-toxin as the criterion for the quality of the production.
28. Full cleavage is often reached in this buffer condition, but the DTT concentration can be critical; in case of cleavage problems, a screen of DTT concentrations is advisable (see Ref. [2]).
29. Acetonitrile is toxic and must be handled with caution under a chemical hood.
30. To remove any acetonitrile vapors and protect users, a portable chemical hood with filters (flow of 240 m³) has been installed on the Tecan robot next to the acetonitrile-containing buffers and filter plates.
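The figures quoted in Notes 19 and 20 (1.06 mutations per kilobase, an average of 1.3 clones per correct gene, at most three colonies screened) can be related to each other with a simple model. The sketch below is not part of the original protocol: it assumes that errors are independent and Poisson-distributed along the gene, and the 250 bp gene length is a hypothetical example chosen for illustration.

```python
import math

ERRORS_PER_KB = 1.06  # error rate quoted in Note 19


def error_free_fraction(length_bp: float, errors_per_kb: float = ERRORS_PER_KB) -> float:
    """Poisson probability that a clone of this length carries no mutation (assumed model)."""
    return math.exp(-errors_per_kb * length_bp / 1000.0)


def expected_clones_to_screen(length_bp: float, errors_per_kb: float = ERRORS_PER_KB) -> float:
    """Mean number of clones sequenced until the first correct one (geometric distribution)."""
    return 1.0 / error_free_fraction(length_bp, errors_per_kb)


def prob_correct_within(n_clones: int, length_bp: float, errors_per_kb: float = ERRORS_PER_KB) -> float:
    """Probability that at least one of n_clones sequenced clones is error-free."""
    p = error_free_fraction(length_bp, errors_per_kb)
    return 1.0 - (1.0 - p) ** n_clones


if __name__ == "__main__":
    gene_length_bp = 250  # hypothetical gene length, not taken from the chapter
    print(f"error-free fraction: {error_free_fraction(gene_length_bp):.3f}")
    print(f"expected clones to screen: {expected_clones_to_screen(gene_length_bp):.2f}")
    print(f"P(correct within 3 clones): {prob_correct_within(3, gene_length_bp):.3f}")
```

Under these assumptions, a gene of roughly 250 bp gives an expected value close to the 1.3 clones reported, which suggests the reported numbers are mutually consistent for short toxin genes; longer genes would require more colonies.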

Acknowledgements

This work was supported by the VENOMICS European project grant No. 278346 through the Seventh Framework Program (FP7 HEALTH 2011-2015). NZYTech gratefully acknowledges PORTUGAL 2020, Programa Operacional Regional de Lisboa, Project 011199. AFMB was supported by the French Infrastructure for Integrated Structural Biology (FRISBI) ANR-10-INSB-05-01.

References

1. Robinson SD, Undheim EAB, Ueberheide B, King GF (2017) Venom peptides as therapeutics: advances, challenges and the future of venom-peptide discovery. Expert Rev Proteomics 14(10):931–939. https://doi.org/10.1080/14789450.2017.1377613
2. Sequeira AF, Turchetto J, Saez NJ, Peysson F, Ramond L, Duhoo Y, Blemont M, Fernandes VO, Gama LT, Ferreira LM, Guerreiro CI, Gilles N, Darbon H, Fontes CM, Vincentelli R (2017) Gene design, fusion technology and TEV cleavage conditions influence the purification of oxidized disulphide-rich venom peptides in Escherichia coli. Microb Cell Factories 16(1):4. https://doi.org/10.1186/s12934-016-0618-0
3. Turchetto J, Sequeira AF, Ramond L, Peysson F, Bras JL, Saez NJ, Duhoo Y, Blemont M, Guerreiro CI, Quinton L, De Pauw E, Gilles N, Darbon H, Fontes CM, Vincentelli R (2017) High-throughput expression of animal venom toxins in Escherichia coli to generate a large library of oxidized disulphide-reticulated peptides for drug discovery. Microb Cell Factories 16(1):6. https://doi.org/10.1186/s12934-016-0617-1
4. Saez NJ, Nozach H, Blemont M, Vincentelli R (2014) High throughput quantitative expression screening and purification applied to recombinant disulfide-rich venom proteins produced in E. coli. J Vis Exp 89:e51464. https://doi.org/10.3791/51464
5. van den Berg S, Lofdahl PA, Hard T, Berglund H (2006) Improved solubility of TEV protease by directed evolution. J Biotechnol 121(3):291–298. https://doi.org/10.1016/j.jbiotec.2005.08.006

Chapter 8

High-Throughput Purification of Protein Kinases from Escherichia coli and Insect Cells

Sebastian Mathea, Eidarus Salah, and Stefan Knapp

Abstract

Protein kinases are major targets for the development of new medicines and play key roles in cellular signaling. The flexible nature of these proteins, posttranslational modifications, and the large size of some protein kinases pose a particular challenge to obtaining homogeneous, active recombinant protein kinases suitable for functional or structural studies. Here we describe our experience in expressing protein kinases in two frequently used host systems: E. coli and insect cells using the baculovirus expression vector system. In particular, we discuss and provide detailed methods on construct design, high-throughput cloning, parallel expression testing, and scale-up, as well as purification and co-expression strategies leading to stable and homogeneous recombinant protein samples.

Key words Kinase expression, E. coli, Insect cells, Co-expression, Phosphatases

1

Introduction

Protein kinases constitute a large protein superfamily with more than 500 members in humans. Some organisms such as plants may express even larger numbers of protein kinases [1]. Protein kinases use either tyrosine residues or serine/threonine residues as phospho-acceptors, and only a few examples of dual activity have been reported to date [2]. Since kinases play key roles in cellular signaling, they need to switch quickly between inactive conformations and a well-conserved active state that allows efficient catalysis. A multitude of inactive states have been described, all of which misalign catalytic sequence motifs and residues. In the active kinase conformation, hydrophobic residues align to form "hydrophobic spines" that stabilize the kinase domain and accurately position catalytically important residues [3]. This active state is often stabilized by phosphorylation of the activation segment, a sequence of usually 40 or more residues that is unstructured in inactive kinases but which forms the substrate-binding site in phosphorylated active

Renaud Vincentelli (ed.), High-Throughput Protein Production and Purification: Methods and Protocols, Methods in Molecular Biology, vol. 2025, https://doi.org/10.1007/978-1-4939-9624-7_8, © Springer Science+Business Media, LLC, part of Springer Nature 2019


Sebastian Mathea et al.

kinases [4]. Some kinases recognize their own activation loop and readily auto-phosphorylate during heterologous expression [5]. For kinases that require other kinases or certain interacting proteins for activation, the high degree of flexibility of the inactive states often causes protein instability and aggregation, whereas auto-activating kinases may phosphorylate multiple residues during expression, including residues that are buried in the hydrophobic core of the folded protein, which results in heterogeneous proteins with various levels of kinase activity [6]. Thus, homogeneous expression may require co-expression of a phosphatase, which removes randomly phosphorylated sites and may also reduce toxicity due to excessive phosphorylation of endogenous proteins in the expression host. In addition, some kinases such as cytoplasmic tyrosine kinases require flanking domains in the same protein for stable inactive as well as active conformations [7] or require stabilizing interaction partners (e.g., the HSP90 chaperone system). These requirements complicate construct design and efficient expression in heterologous host systems. Our laboratory has a long-standing interest in the protein biochemistry and structural biology of protein kinases. In particular, the expression of kinases for structural studies requires highly purified kinases with homogeneous phosphorylation states. Here we share our experience expressing protein kinases in bacteria as well as insect cells.

2

Material

2.1 Kinase Purification

2.1.1 Reagents

For first purification trials we use a general procedure that includes high-pressure lysis, affinity purification (IMAC), tag cleavage and rebinding, as well as size exclusion chromatography (SEC). For purification, all buffers are chilled to 4 °C and procedures are carried out in the cold room or using temperature-controlled equipment.

1. BL21(DE3)-R3-pRARE2 bacterial strain (Novagen).
2. Sf9 and Hi5 insect cells.
3. Lysis buffer: 50 mM HEPES/NaOH pH 7.4, 500 mM NaCl, 0.5 mM TCEP, 5% glycerol, 20 mM imidazole; for LRRK2 only: 20 μM GDP, 5 mM MgCl2.
4. IMAC resin (Ni Sepharose 6 Fast Flow; GE Healthcare).
5. IMAC elution buffer: Identical to Lysis buffer supplemented with 300 mM imidazole.
6. Dialysis buffer: Identical to Lysis buffer without imidazole.
7. Rebinding IMAC elution buffer: Identical to Lysis buffer supplemented with 45 mM imidazole.

High-Throughput Kinase Purification


8. SEC (size exclusion chromatography) buffer: 20 mM HEPES/NaOH pH 7.4, 150 mM NaCl, 0.5 mM TCEP, 5% glycerol; for LRRK2 only: 20 μM GDP, 2.5 mM MgCl2.
9. TEV (Tobacco etch virus) protease stock solution: 2–3 mg/mL TEV protease in SEC buffer. We express TEV protease using a bacterial expression system and the same purification protocol as described for kinases. The protein can be stored at −80 °C in the presence of 20% glycerol for extended periods.

2.1.2 Equipment and Consumables

1. Micro-Express Glas-Col shaker (Glas-Col, Indiana, US) or similar, set to 37 °C.
2. VibraCell 750 sonicator (Sonics, USA) or equivalent.
3. EmulsiFlex-C5 (ATA-Scientifics) or equivalent.
4. Gravity column (Sigma), inner diameter 25 mm (length 100 mm).
5. SnakeSkin dialysis tubing (Sigma).
6. Amicon Ultra-15 (Merck).
7. ÄKTA Xpress chromatographic system (GE Healthcare).
8. HiLoad 16/600 Superdex 200 pg column (GE Healthcare).
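The buffer recipes in Subheading 2.1.1 translate into pipetting volumes through the dilution relation C1·V1 = C2·V2. The following sketch illustrates this for the Lysis buffer. The final concentrations are those listed above; the stock concentrations (1 M HEPES/NaOH, 5 M NaCl, 0.5 M TCEP, 50% glycerol, 2 M imidazole) are assumptions made for the example and should be replaced by the stocks actually available in the laboratory.

```python
# Final concentrations of the Lysis buffer as listed in Subheading 2.1.1
# (mM for salts and reagents, % v/v for glycerol).
LYSIS_BUFFER = {"HEPES": 50.0, "NaCl": 500.0, "TCEP": 0.5, "glycerol": 5.0, "imidazole": 20.0}

# Hypothetical stock solutions, in the same unit as the matching recipe entry.
STOCKS = {"HEPES": 1000.0, "NaCl": 5000.0, "TCEP": 500.0, "glycerol": 50.0, "imidazole": 2000.0}


def stock_volumes_ml(final_volume_ml: float, recipe: dict, stocks: dict) -> dict:
    """Return the volume (mL) of each stock needed, plus water to reach the final volume."""
    volumes = {}
    for component, final_conc in recipe.items():
        # C1 * V1 = C2 * V2  ->  V1 = C2 * V2 / C1
        volumes[component] = final_conc * final_volume_ml / stocks[component]
    water = final_volume_ml - sum(volumes.values())
    if water < 0:
        raise ValueError("Stocks are too dilute for the requested final volume")
    volumes["water"] = water
    return volumes


if __name__ == "__main__":
    for component, volume in stock_volumes_ml(1000.0, LYSIS_BUFFER, STOCKS).items():
        print(f"{component:>10s}: {volume:7.1f} mL")
```

The same function can be reused for the IMAC elution, dialysis, rebinding, and SEC buffers by changing the recipe dictionary; pH adjustment and the LRRK2-specific additives are not covered by this sketch.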

3

Methods

The protocol for the production and purification of kinases for structural studies starts upstream with the selection of the kinase constructs and their cloning into an expression vector. Design of expression constructs is often an iterative process that requires the cloning and testing of a large number of expression vectors. Several kinases are not stably expressed in bacteria or insect cells. At least small amounts of functional full-length protein can often be obtained using transient or stable transfection of mammalian cells, a topic that is not covered in this chapter. For large-scale expression, the presence of flexible and often unfolded low-complexity regions poses a significant challenge to the design of stable expression constructs. Since the main focus in our laboratory is protein structure determination and in vitro assay development, we typically use iterative cycles of 12 or more expression constructs for new kinase targets for heterologous expression in E. coli and insect cells (Fig. 1). Many structures of kinase catalytic domains have been determined by crystallographic methods. Since crystallization of a protein requires high purity and stability, the boundaries used for expression of these kinases usually represent good starting points for the design of expression constructs. For these kinases it is likely that expression constructs are already available from public clone repositories such as Addgene (https://www.addgene.org/) or the Dundee collection (http://mrcppureagents.dundee.ac.uk). In


Fig. 1 Typical cycles of parallel construct optimization in E. coli and insect cells. See text for details

addition, many commercial screening companies offer detailed descriptions of their expression plasmids (e.g., https://www.proqinase.com/products-service/recombinant-proteins). We typically initiate the construct design process by aligning the protein sequence of the kinase we would like to express against the Protein Data Bank (PDB; http://www.rcsb.org/pdb/home/home.do). This can easily be done using the EXPASY web-based tool kit (http://www.expasy.org/tools/). An overview of available kinase structures is also provided by the KLIFS database (http://klifs.vu-compmedchem.nl/). If expression of the protein kinase has not been reported in the literature, we predict domains and potentially disordered regions in the protein using GlobPlot (http://globplot.embl.de/cgiDict.py). This tool provides a quick overview of folded and stable structures [8] (for an example see Fig. 2). Secondary structure prediction (e.g., PSIPRED, http://bioinf.cs.ucl.ac.uk/psipred/) is also a useful tool that facilitates construct boundary decisions [9]. We avoid N-terminal residues known to lead to fast protein degradation, which in bacteria are arginine, leucine, lysine, and the aromatic residues [10].
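The rule about destabilizing N-terminal residues can be applied mechanically when candidate start positions are chosen. The sketch below flags candidate starts whose first residue is one of the residues named above; it takes "aromatic residues" to mean Phe, Trp, and Tyr, which is an assumption, and the sequence and positions used are hypothetical placeholders, not a real kinase.

```python
# Residues reported in the text as leading to fast degradation in bacteria:
# Arg (R), Leu (L), Lys (K), and the aromatic residues (assumed here: F, W, Y).
DESTABILIZING_N_TERMINI = set("RLKFWY")


def check_n_termini(sequence: str, starts: list) -> list:
    """For each 1-based start position, return (start, residue, acceptable)."""
    results = []
    for start in starts:
        residue = sequence[start - 1].upper()
        results.append((start, residue, residue not in DESTABILIZING_N_TERMINI))
    return results


if __name__ == "__main__":
    toy_sequence = "MKRLSAFWGDE"  # hypothetical sequence for illustration only
    for start, residue, ok in check_n_termini(toy_sequence, [1, 2, 5, 7]):
        verdict = "ok" if ok else "avoid"
        print(f"start {start:>3d} ({residue}): {verdict}")
```

A flagged position can often be rescued by shifting the boundary by one or two residues, which fits the practice of testing several closely spaced start sites.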


Despite these predictions, we typically plan and test at least 12 constructs in each iteration of construct optimization for each kinase. This is mainly due to the often significant influence of terminal residues on expression levels and protein stability, a property that cannot be easily predicted even if highly similar kinases have been successfully expressed previously. At the Structural Genomics Consortium (SGC) we chose Ligation Independent Cloning (LIC) as a simple and cost-effective platform for parallel construct generation for both the E. coli and the baculovirus expression vector systems [11]. The established protocol has recently been described in detail [12] and is therefore not detailed in this chapter. However, more recent protocols such as Fragment Exchange (FX) cloning are now also frequently used [13]. For cloning we usually use the Mach1 strain, a phage-resistant, fast-growing, high-copy-number derivative of the E. coli W strain (ATCC 9637) (Fig. 2).

Fig. 2 Analysis of structural features and disorder of the protein kinase FES (Gene ID: 2242) using GlobPlot. Functional domains are indicated as boxed areas. Folded regions are shown in the bar diagram at the bottom of the figure in green and disordered regions in blue. Low complexity regions are indicated as well (yellow bars at the top)


3.1 Kinase Purification

3.1.1 Lysis of Cell Pellet and Preparation for Purification

Initially we use expression vectors for both host systems that allow N-terminal fusion with a His6 tag followed by a TEV (Tobacco Etch Virus) protease cleavage site, such as the expression vector pNIC28-Bsa4 (www.uniprot.org/taxonomy/420517). The purification tag is small and allows efficient purification using Immobilized Metal Affinity Chromatography (IMAC) followed by TEV protease cleavage. Larger fusion tags (GST, SUMO, MBP) have been advantageous in some systems, but they did not significantly improve yields of soluble protein for most kinases cloned in our laboratory.

For expression in E. coli we initially use the strain BL21(DE3)-R3-pRARE2, a phage-resistant derivative of BL21(DE3) that contains the pRARE2 plasmid (Novagen), which supplies tRNAs for rare codons (AGA, AGG, AUA, CUA, GGA, CCC, and CGG) on a compatible chloramphenicol-resistant plasmid. The protocol that we use for initial expression testing has recently been described in detail by Burgess-Brown et al. [14]. As most kinases require low temperature for soluble expression, we usually test expression at 18 °C.

Insect cell expression using baculoviruses has been developed for more challenging kinase targets. However, because this process takes longer, we often test expression in bacteria and insect cells in parallel using the same construct boundaries (see Fig. 3 for an example of the impact of target boundaries on the level of soluble protein). Baculoviruses are DNA viruses that are not pathogenic to human cells and infect only insects of the order Lepidoptera. We routinely use two main insect cell lines, Sf9, derived from pupal ovarian tissue of the fall armyworm Spodoptera frugiperda, and High Five, derived from ovarian cells of Trichoplusia ni [15, 16], together with the Bac-to-Bac® system (Invitrogen). A general procedure describing parallel testing in Sf9 insect cells has recently been published [17].

1. Resuspend ~50 g of cell pellet in 150 mL Lysis buffer.
2. Lyse the insect cells by sonication with the large probe of a Vibra-Cell 750 sonicator (5 s pulse and 10 s rest, 300 s pulse in total, amplitude 35%).
3. Lyse the E. coli cells by three passages through the high-pressure cell breaker (at 1600 bar).
4. Centrifuge the lysate (40,000 × g, 60 min, 4 °C) to remove cell debris.


Fig. 3 Mapping of domain boundaries for the protein kinase LRRK2. Shown is a Coomassie-stained SDS-PAGE of expression tests done in insect cells using constructs starting at an N-terminal residue located before the LRR domain and ending at the C-terminus. The recombinant LRRK2 protein is indicated by an arrow. Previous expression trials suggested that the entire C-terminus is required for soluble protein expression and that the kinase catalytic domain cannot be expressed in isolation. A boundary was identified N-terminal of the predicted LRR (leucine-rich repeat) domain that resulted in soluble protein in insect cells. Fine-tuning of this boundary identified several constructs starting between residues 930 and 980 that gave good expression levels in small-scale expression tests for these large (~1500 residue) LRRK2 fragments

3.1.2 Kinase Purification by IMAC

Day 1

1. Load 5 mL Ni Sepharose 6 Fast Flow (GE Healthcare) into a gravity column.
2. Equilibrate the column in 5 volumes of Lysis buffer.
3. Transfer the soluble fraction onto the column in 30 mL portions.
4. Wash the beads with 6 × 30 mL of Lysis buffer.
5. Elute the bound protein in 4 × 10 mL IMAC elution buffer (Lysis buffer supplemented with 300 mM imidazole).
6. To cleave the affinity tag, combine the elution fractions and add ~0.5 mg TEV protease.
7. Transfer the mixture into dialysis tubing and dialyze overnight (2 L dialysis buffer, 4 °C).


Day 2

1. Load 5 mL Ni Sepharose 6 Fast Flow (GE Healthcare) into a gravity column.
2. Equilibrate the column in 5 volumes of dialysis buffer.
3. Transfer the dialyzed fraction onto the column. Nonspecifically bound proteins are eluted in 4 × 10 mL rebinding IMAC elution buffer. Contaminants, TEV protease, and the cleaved affinity tag will remain bound to the column, while the target protein is either in the flow-through or in the elution fractions.
4. Analyze the various fractions by SDS-PAGE.

3.1.3 Kinase Purification by Size Exclusion Chromatography

1. Combine the fractions containing the kinase.
2. Concentrate the sample to a volume of 5 mL on an Amicon Ultra-15.
3. During protein concentration, equilibrate a HiLoad 16/600 Superdex 200 pg column attached to an ÄKTA Xpress system in SEC buffer.
4. Load the sample on the column.
5. Collect 3 mL fractions in a 24-well plate.
6. Analyze the various fractions by SDS-PAGE.
7. Combine the fractions containing the kinase.
8. Concentrate the sample to 2–3 mg/mL on an Amicon Ultra-15.
9. Flash-freeze 1 mL aliquots in liquid nitrogen and store at −80 °C. Alternatively, before freezing, use fresh protein to set up crystallization drops.

To get a basic idea of the quality of the protein preparation, we generally provide the SEC profile, an SDS-PAGE of the SEC elution fractions, the ESI-MS, and the UV absorbance spectrum of the concentrated protein. An example is given in Fig. 4.
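The ESI-MS result in the quality report can be read as a phosphorylation count by comparing the measured mass with the mass calculated from the sequence, since each phosphate group adds roughly 80 Da (HPO3, about 79.98 Da average mass). The sketch below shows this estimate using the two masses given in the Fig. 6 caption; the function name and the rounding approach are illustrative assumptions, not the authors' analysis procedure.

```python
PHOSPHATE_SHIFT_DA = 79.98  # approximate average mass added per phosphorylation (HPO3)


def estimate_phosphates(mw_calculated: float, mw_measured: float) -> tuple:
    """Return (estimated number of phosphates, residual mass difference in Da)."""
    difference = mw_measured - mw_calculated
    n_phosphates = round(difference / PHOSPHATE_SHIFT_DA)
    residual = difference - n_phosphates * PHOSPHATE_SHIFT_DA
    return n_phosphates, residual


if __name__ == "__main__":
    # Masses quoted in the Fig. 6 caption for the FES SH2-kinase construct.
    n, residual = estimate_phosphates(42704.2, 42710.3)
    print(f"estimated phosphates: {n}, unexplained difference: {residual:.1f} Da")
```

A residual of a few daltons, as here, is within what is commonly seen for intact-mass measurements of a protein of this size; a large residual would point to another modification or a sequence error rather than phosphorylation.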

3.1.4 Example Case: Expression of FES SH2-Kinase Domain

FES is a cytoplasmic tyrosine kinase harboring an N-terminal BAR domain, an SH2 domain, and a C-terminal kinase domain [7]. Constructs comprising the full-length protein, the SH2-kinase unit, and the kinase domain alone have been extensively screened in our laboratory in both E. coli and insect cells, yielding only constructs that expressed FES either at low yields or in insoluble form. The bacterial constructs were subsequently tested in bacteria that co-expressed Yersinia tyrosine phosphatase (YopH) cloned into pCDFDuet-1 (Novagen) using NcoI/AvrII restriction sites, deleting the NcoI site in the final construct to yield an unambiguous start codon [18]. The different origin of replication and antibiotic resistance (streptomycin used at 50 μg/mL) allow this plasmid to be maintained alongside the kinase expression vector. This successful co-expression strategy has been reported for the kinases c-ABL and c-SRC, expressing constructs harboring both the kinase


Fig. 4 Purification of LRRK2 (residues 970–2527). (a) SDS-PAGE of pooled IMAC elution fractions. The molecular weights of the marker proteins are indicated on the left side of the figure. The same marker has been used for the gel shown in panel (b). (b) After removal of the affinity tag, the target protein binds nonspecifically to the resin and elutes at 45 mM imidazole, while contaminants elute only at 300 mM imidazole. I input, F flow-through, R rinse, E elution at 300 mM imidazole. (c) In SEC, the target protein peak is slightly broadened; the retention volume of 61.8 mL indicates that the protein is monomeric. SDS-PAGE analyses of the SEC elution fractions are shown on top of the SEC trace. (d) From the UV absorbance spectrum it can be concluded that the target protein is not substantially precipitated and that the protein concentration is 3.2 mg/mL

domain alone and in fusion with the flanking SH3 and SH2 domains [18]. Expression screening using these BL21(DE3) E. coli cells transformed with pCDFDuet-1 resulted in high-yield soluble protein expression for only one construct, which harbors the SH2 domain and the kinase domain as well as an N-terminal extension (Fig. 5). The crystal structure of this construct (I448-R882) revealed that the extended N-terminus inserts into the SH2-kinase domain interface, leading to the activation and stabilization of this domain interaction [7]. The identified FES SH2-kinase construct gave high yields of recombinant protein that was easily purified and successfully crystallized (Fig. 6). Autophosphorylation is a major problem for kinase expression and often also leads to toxicity during expression. We therefore express most kinases in hosts co-expressing phosphatases with broad substrate activity such as YopH (for tyrosine kinases) and λ-phosphatase (for serine/threonine-specific kinases) [6].


Fig. 5 Expression test of FES tyrosine kinase constructs in E. coli. SDS-PAGE of constructs of different lengths (indicated in the figure) in E. coli BL21(DE3)-R3 co-expressing YopH phosphatase. Shown are total lysates (T), the soluble fraction after high-speed centrifugation (S), and the eluted fraction (step elution with 500 mM imidazole) from a Ni-affinity purification (IMAC). As shown in the figure, many constructs express this kinase but only one (I448-R882) yielded soluble protein

4

Notes

Despite the research interest in protein kinases, the expression of some kinase targets remains challenging. After initial construct optimization in silico as described above, we usually start with 12–16 constructs (e.g., by systematic combinations of 3 × 4 or 4 × 4 primer pairs) using an E. coli T7 system. Our default system uses an N-terminal His6-TEV tag. In some cases, swapping the tag to the C-terminus resulted in better expression. However, due to the asymmetric cleavage by the TEV protease, cleaved C-terminal purification tags leave larger overhangs. We therefore often use a non-cleavable C-terminal His6 tag instead. Expression of some kinases results in toxicity. Hence, catalytically inactive mutants can help expression in recombinant host systems. One of the main challenges in expressing kinases is heterogeneous phosphorylation of the recombinant protein. If the kinase is highly active and expressed at high levels, many phosphorylation sites may be introduced during translation of the protein. This may lead to inaccessible phosphorylation sites that destabilize the protein and can no longer be removed by phosphatases. Co-expression of phosphatases, as outlined in the case study of FES kinase, may result in stable and unphosphorylated protein. We frequently use co-expression with lambda phosphatase in bacteria and insect


Fig. 6 Purification of FES448–822. (a) SDS-PAGE of IMAC samples. M marker, T total, I input, F flow-through, R rinse, E elution at 300 mM imidazole. The molecular weights of the marker proteins are indicated on the left side of the figure. The same marker has been used for the gel shown in panel (b). (b) After removal of the affinity tag, the target protein does not bind to the resin and elutes in the flow-through. (c) In the final SEC purification step, the retention volume of 88.4 mL indicates that the protein is monomeric. SDS-PAGE analyses of the SEC elution fractions are shown on top of the SEC trace. (d) From the UV absorbance spectrum it can be concluded that the target protein is not substantially precipitated and that the protein concentration is 2.2 mg/mL. (e) The total mass of the protein as determined by ESI-MS confirms the identity of the target protein and allows conclusions on the phosphorylation state (FES448–822 not phosphorylated). MWcalculated is 42704.2 Da, MWmeasured is 42710.3 Da

cells. Finally, inactive kinases are very flexible proteins. Additional stability and better expression levels can be achieved in some cases by introducing an activating mutation in the activation segment, resulting in an active and more stable protein.
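The "3 × 4 or 4 × 4 primer pairs" mentioned in these notes correspond to pairing every candidate N-terminal boundary with every candidate C-terminal boundary. A minimal sketch of that enumeration follows; the target name and the residue numbers are hypothetical placeholders, and the naming scheme is only one possible convention.

```python
from itertools import product


def enumerate_constructs(target: str, starts: list, ends: list) -> list:
    """Combine every start residue with every end residue into a construct name."""
    constructs = []
    for start, end in product(starts, ends):
        if end <= start:
            # Skip nonsensical boundaries where the construct would be empty.
            continue
        constructs.append(f"{target}_{start}-{end}")
    return constructs


if __name__ == "__main__":
    # Hypothetical boundaries: 3 forward primers x 4 reverse primers = 12 constructs.
    names = enumerate_constructs("KIN", [100, 110, 120], [380, 390, 400, 410])
    print(len(names), "constructs")
    for name in names:
        print(name)
```

Three forward and four reverse primers thus cover twelve constructs with only seven oligonucleotides, which is what makes the combinatorial design economical for parallel cloning.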

Acknowledgements

The authors are grateful for financial support by the SGC, a registered charity (number 1097737) that receives funds from AbbVie, Bayer Pharma AG, Boehringer Ingelheim, Canada Foundation for Innovation, Eshelman Institute for Innovation, Genome Canada through Ontario Genomics Institute, Innovative Medicines Initiative (EU/EFPIA) [ULTRA-DD grant no. 115766], Janssen, Merck & Co., Novartis Pharma AG, Ontario Ministry of Economic Development and Innovation, Pfizer, São Paulo Research Foundation-FAPESP, Takeda, the Centre of Excellence


(CEF) Macromolecular Complexes at Frankfurt University, and the Wellcome Trust. SK and SM are grateful for support by the German Cancer Centre (DKFZ) and the German Cancer Network (DKTK).

References

1. Manning G, Whyte DB, Martinez R, Hunter T, Sudarsanam S (2002) The protein kinase complement of the human genome. Science 298(5600):1912–1934. https://doi.org/10.1126/science.1075762
2. Soundararajan M, Roos AK, Savitsky P, Filippakopoulos P, Kettenbach AN, Olsen JV, Gerber SA, Eswaran J, Knapp S, Elkins JM (2013) Structures of down syndrome kinases, DYRKs, reveal mechanisms of kinase activation and substrate recognition. Structure 21(6):986–996. https://doi.org/10.1016/j.str.2013.03.012
3. Kornev AP, Haste NM, Taylor SS, Eyck LF (2006) Surface comparison of active and inactive protein kinases identifies a conserved activation mechanism. Proc Natl Acad Sci U S A 103(47):17783–17788. https://doi.org/10.1073/pnas.0607656103
4. Nolen B, Taylor S, Ghosh G (2004) Regulation of protein kinases; controlling activity through activation segment conformation. Mol Cell 15(5):661–675. https://doi.org/10.1016/j.molcel.2004.08.024
5. Oliver AW, Knapp S, Pearl LH (2007) Activation segment exchange: a common mechanism of kinase autophosphorylation? Trends Biochem Sci 32(8):351–356. https://doi.org/10.1016/j.tibs.2007.06.004
6. Shrestha A, Hamilton G, O'Neill E, Knapp S, Elkins JM (2012) Analysis of conditions affecting auto-phosphorylation of human kinases during expression in bacteria. Protein Expr Purif 81(1):136–143. https://doi.org/10.1016/j.pep.2011.09.012
7. Filippakopoulos P, Kofler M, Hantschel O, Gish GD, Grebien F, Salah E, Neudecker P, Kay LE, Turk BE, Superti-Furga G, Pawson T, Knapp S (2008) Structural coupling of SH2-kinase domains links Fes and Abl substrate recognition and kinase activation. Cell 134(5):793–803. https://doi.org/10.1016/j.cell.2008.07.047
8. Linding R, Russell RB, Neduva V, Gibson TJ (2003) GlobPlot: exploring protein sequences for globularity and disorder. Nucleic Acids Res 31(13):3701–3708
9. McGuffin LJ, Bryson K, Jones DT (2000) The PSIPRED protein structure prediction server. Bioinformatics 16(4):404–405
10. Varshavsky A (1997) The N-end rule pathway of protein degradation. Genes Cells 2(1):13–28
11. Haun RS, Serventi IM, Moss J (1992) Rapid, reliable ligation-independent cloning of PCR products using modified plasmid vectors. BioTechniques 13(4):515–518
12. Strain-Damerell C, Mahajan P, Gileadi O, Burgess-Brown NA (2014) Medium-throughput production of recombinant human proteins: ligation-independent cloning. Methods Mol Biol 1091:55–72. https://doi.org/10.1007/978-1-62703-691-7_4
13. Geertsma ER, Dutzler R (2011) A versatile and efficient high-throughput cloning tool for structural biology. Biochemistry 50(15):3272–3278. https://doi.org/10.1021/bi200178z
14. Burgess-Brown NA, Mahajan P, Strain-Damerell C, Gileadi O, Graslund S (2014) Medium-throughput production of recombinant human proteins: protein production in E. coli. Methods Mol Biol 1091:73–94. https://doi.org/10.1007/978-1-62703-691-7_5
15. Vaughn JL, Goodwin RH, Tompkins GJ, McCawley P (1977) The establishment of two cell lines from the insect Spodoptera frugiperda (Lepidoptera; Noctuidae). In Vitro 13(4):213–217
16. Wickham TJ, Davis T, Granados RR, Shuler ML, Wood HA (1992) Screening of insect cell lines for the production of recombinant proteins and infectious virus in the baculovirus expression system. Biotechnol Prog 8(5):391–396. https://doi.org/10.1021/bp00017a003
17. Mahajan P, Strain-Damerell C, Gileadi O, Burgess-Brown NA (2014) Medium-throughput production of recombinant human proteins: protein production in insect cells. Methods Mol Biol 1091:95–121. https://doi.org/10.1007/978-1-62703-691-7_6
18. Seeliger MA, Young M, Henderson MN, Pellicena P, King DS, Falick AM, Kuriyan J (2005) High yield bacterial expression of active c-Abl and c-Src tyrosine kinases. Protein Sci 14(12):3135–3139. https://doi.org/10.1110/ps.051750905

Chapter 9

Parallelized Microscale Expression of Soluble scFv

Giulio Russo, Viola Fühner, André Frenzel, Michael Hust, and Stefan Dübel

Abstract

Antibody phage display is a key technology to generate recombinant, mainly human, antibodies for diagnostics and therapy, but also as tools for basic research. After antibody selection by "panning," a crucial step is the screening of monoclonal binders to isolate those which show antigen specificity. For this screening procedure, a highly parallelized approach to produce soluble antibody fragments in microtiter plates is essential. In this chapter, we give the protocol for the parallelized microscale production of scFvs for the screening procedure or further assays.

Key words Phage display, Single-chain fragment variable (scFv), Monoclonal antibody screening, Small-scale antibody production in MTP

1 Introduction

Since its discovery in 1990/91, antibody phage display has become the leading technology to generate recombinant antibodies. This in vitro technology is independent of the restrictions of the immune system, and thus suitable to generate antibodies against virtually any target. In this technique the antibody, most commonly in the single-chain fragment variable (scFv) format, is presented (displayed) on the phage surface in fusion to the minor coat protein III (pIII). The genetic information encoding the antibody is packaged into the phage particle, coupling phenotype and genotype. Antibody gene libraries can be generated from any species, allowing in vitro generation of human antibodies without the need for immunization. If available, patient-derived immune libraries can be used [1, 2]. In most cases, however, universal (non-immune) libraries such as the McCafferty library [3], the Pfizer library [4], the Tomlinson libraries [5], or the Human/Hust antibody libraries (HAL) 7/8 and 9/10 [6, 7] are used. Currently, six antibodies generated by phage display are FDA/EMA approved.

Renaud Vincentelli (ed.), High-Throughput Protein Production and Purification: Methods and Protocols, Methods in Molecular Biology, vol. 2025, https://doi.org/10.1007/978-1-4939-9624-7_9, © Springer Science+Business Media, LLC, part of Springer Nature 2019


An overview of therapeutic antibodies derived by phage display is given by Frenzel et al. [8]. The procedure of antibody selection via phage display is named “panning,” after the gold digger’s method to separate the nuggets from the soil [9]. In the panning procedure, the target antigen can be immobilized onto various solid surfaces [10–12]. Plastic surfaces such as polystyrene microtiter plates (MTPs) have high protein binding capacity and are widely used [13]. After blocking of the surface, the antibody phage library is incubated on the antigen. Thorough washing is required to eliminate the vast excess of unbound antibody phage. Only afterward can the bound antibody phage (binders) be eluted and used to infect E. coli for reamplification. Subsequent coinfection of the same E. coli with a helper phage allows the production of new antibody phage, and the selection process is repeated until the number of panning rounds provides a sufficient enrichment of specific antibodies; typically three panning rounds are sufficient. Alternatively, if immobilization of the antigen is not suitable, the selection procedure can be performed in solution followed by a “pull-down” of the antigen-antibody phage complex [14] or, in the case of surface markers, e.g., cancer targets, directly on cells [15, 16].

Different assays can be adopted to identify the antibodies with the desired properties. The typical hit selection assay, which is also suitable for high-throughput applications, is the indirect antigen ELISA [17]. Immunoblots are used to identify antibodies that only recognize denatured antigens [13]. If binding to the native form of a cell surface molecule is crucial for later applications, the antibodies can be directly screened for cell binding by flow cytometry [18]. After identifying the most promising candidates, the antibody genes are sequenced and subcloned into other formats, e.g., bivalent scFv-Fc or IgG [6, 17, 19, 20].
To screen hundreds of monoclonal binders, high-throughput (HTP) production of scFvs as soluble monoclonal antibody fragments is indispensable. Microtiter plates allow for parallelized microscale scFv production in bacteria [21, 22]. E. coli XL1-Blue MRF’ are infected with the phage selected during the last panning round and plated for subsequent picking of single bacterial colonies, where each colony corresponds to an individual antibody clone. For amplification, single colonies are individually transferred to the wells of a 96-well polypropylene (PP) MTP containing growth medium [23] and 100 mM glucose. The glucose provides maximal repression of the Lac promoter, so that no scFv antibodies are produced under this condition. This repression is important because residual scFv production may lead to mutations and deletions. When the glucose is replaced by an inducer of the Lac promoter, such as isopropyl-beta-D-thiogalactopyranoside (IPTG), soluble scFvs can be produced.


Here, we give a stepwise protocol to produce soluble scFv antibodies in MTP for the screening ELISA procedure or further experiments. This protocol allows an experienced operator to easily screen up to 920 monoclonal antibodies in parallel (10 MTPs) in the absence of robotic automation. With the latter, several thousands of antibody clones can be analyzed in parallel.

2 Materials

2.1 Production of Soluble Monoclonal Antibody Fragments in Microtiter Plates

2.1.1 Reagents

1. Potassium phosphate buffer (2.31% (w/v) (0.17 M) KH2PO4 + 12.54% (w/v) (0.72 M) K2HPO4).
2. Glucose stock solution (2 M glucose).
3. Buffered 2xTY pH 7.0 (1.6% (w/v) tryptone, 1% (w/v) yeast extract, 0.5% (w/v) NaCl, 10% (v/v) potassium phosphate buffer).
4. Buffered 2xTY-GA (buffered 2xTY containing 0.1 M glucose + 100 μg/mL ampicillin).
5. Buffered 2xTY-SAI (buffered 2xTY containing 50 mM saccharose + 100 μg/mL ampicillin + 50 μM isopropyl-beta-D-thiogalactopyranoside (IPTG)).
6. Glycerol 80%.
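The additive volumes implied by the reagent list above follow from the dilution relation C1·V1 = C2·V2. The following is a minimal sketch of that arithmetic, not part of the published protocol. The batch size of 20 mL and the ampicillin and IPTG stock concentrations are assumptions chosen for illustration; only the 2 M glucose stock, the 80% glycerol, and the target concentrations come from the text. The glycerol volumes are those used for the master plate in Subheading 3.1.

```python
def stock_volume(final_conc, stock_conc, final_volume):
    """Volume of stock to add so that final_volume has final_conc (C1*V1 = C2*V2).

    Concentrations must share one unit; the result has the unit of final_volume.
    """
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be at least as concentrated as the target")
    return final_conc * final_volume / stock_conc


def mixed_percent(stock_percent, stock_ul, sample_ul):
    """Final percentage after adding stock_ul of a stock to sample_ul of sample."""
    return stock_percent * stock_ul / (stock_ul + sample_ul)


# Assumed batch size: about one 96-well plate at 150 uL/well plus excess.
MEDIUM_ML = 20.0

# Values stated in the protocol.
glucose_ml = stock_volume(0.1, 2.0, MEDIUM_ML)   # 0.1 M glucose from the 2 M stock
glycerol_pct = mixed_percent(80.0, 30.0, 140.0)  # 30 uL of 80% glycerol + 140 uL culture

# Hypothetical stock concentrations, NOT given in the protocol.
AMP_STOCK_MG_ML = 100.0
IPTG_STOCK_MM = 1000.0
amp_ul = stock_volume(0.1, AMP_STOCK_MG_ML, MEDIUM_ML) * 1000  # 100 ug/mL = 0.1 mg/mL
iptg_ul = stock_volume(0.05, IPTG_STOCK_MM, MEDIUM_ML) * 1000  # 50 uM = 0.05 mM

print(f"2 M glucose per {MEDIUM_ML:.0f} mL 2xTY-GA: {glucose_ml:.2f} mL")
print(f"ampicillin stock (assumed 100 mg/mL): {amp_ul:.1f} uL")
print(f"IPTG stock (assumed 1 M): {iptg_ul:.1f} uL")
print(f"final glycerol in master plate: {glycerol_pct:.1f}%")
```

The sketch ignores the small volume change caused by the additives themselves, which is usually acceptable at these dilution factors.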

2.1.2 Equipment and Consumables

1. MTP incubator, e.g., VorTemp™ 56 (Labnet International, Edison, USA); 3 mm circular orbit.
2. 5810R Centrifuge (Eppendorf, Germany).
3. 96-well U-bottom polypropylene (PP) microtiter plates (Greiner Bio-One, Germany).
4. AeraSeal breathable sealing film (Excel Scientific, USA).

2.2 ELISA of Soluble Monoclonal Antibody Fragments

2.2.1 Reagents

1. 50 mM NaHCO3, pH 9.6.
2. PBS pH 7.4 (8.0 g NaCl, 0.2 g KCl, 1.44 g Na2HPO4·2H2O, 0.24 g KH2PO4; add ddH2O to 1 L).
3. PBST for washing (PBS + 0.05% Tween 20).
4. 2% MPBST (2% milk powder in PBS, 0.1% Tween 20).
5. 2% BSA in PBS.
6. Mouse α-His-tag monoclonal antibody (α-Penta His, Qiagen, Hilden, Germany).
7. Mouse α-myc-tag monoclonal antibody (Myc1-9E10) (Sigma).
8. Mouse α-pIII monoclonal antibody PSKAN3 (Mobitec).
9. Goat α-mouse IgG polyclonal antibody, A0168 (Fcγ specific), HRP conjugated (Sigma).
10. TMB substrate solution A (50 mM citric acid, 30 mM potassium citrate, pH 4.1).
11. TMB substrate solution B (240 mg tetramethylbenzidine, 10 mL acetone, 90 mL ethanol, 907 μL 30% H2O2). Store at 4 °C in the dark.
12. TMB substrate developing solution (19 parts TMB A + 1 part TMB B; prepare directly prior to use).
13. Stopping solution: 1 N H2SO4 (0.5 M; molar mass of H2SO4 = 98 g/mol).

2.2.2 Equipment and Consumables

1. ELISA reader (Tecan SUNRISE).
2. Tecan ELISA washer.
3. Microtiter plates (Nunc “MaxiSorp” or Costar “Vinyl Assay Plates”).

3 Methods

The first step in the identification of antigen-specific monoclonal antibodies is the ELISA. For this screening, scFvs are produced as soluble monoclonal antibody fragments in microtiter plates. An ELISA with monoclonal phage preparations should be omitted to avoid antibody clones that bind only as scFv-phage or scFv-pIII fusion but not as soluble scFv fragment [24, 25] (see Note 1). The supernatant produced in MTPs can also be used directly for other screening methods, e.g., immunoblot or FACS. The following protocol describes the production of soluble scFv in MTPs and the corresponding screening ELISA.

3.1 Production of Soluble Monoclonal Antibody Fragments in Microtiter Plates

1. Fill each well of a 96-well U-bottom PP MTP with 150 μL 2xTY-GA (see Note 2).
2. Pick 92 clones with sterile tips from the third panning round and inoculate each well (see Note 3). Seal the plate with a breathable sealing film.
3. Incubate overnight in a microtiter plate shaker at 37 °C and 850 rpm (see Note 2).
4. (a) Fill a new 96-well polypropylene microtiter plate (mirror plate) with 150 μL 2xTY-GA and add 10 μL of the overnight cultures. Incubate for 2 h at 37 °C and 850 rpm (see Note 2).
   (b) Add 30 μL glycerol solution to the remaining 140 μL overnight cultures. Mix by pipetting and store this master plate at −80 °C.
5. Pellet the bacteria in the mirror plate by centrifugation for 10 min at 3200 × g at RT. Remove 180 μL glucose-containing medium by carefully pipetting (do not disturb the pellet).
6. Add 180 μL buffered 2xTY-SAI (containing saccharose, ampicillin, and 50 μM IPTG) (see Note 4) and incubate overnight at 30 °C and 850 rpm (see Note 2).
7. Pellet the bacteria by centrifugation for 10 min at 3200 × g in the microtiter plates. Transfer the supernatant containing the antibody fragments to a new polypropylene microtiter plate and store at 4 °C for up to 24 h or proceed directly with the screening (see Note 5).

3.2 ELISA of Soluble Monoclonal Antibody Fragments

1. To analyze the antigen specificity of the monoclonal soluble antibody fragments, coat 100–200 ng antigen in 100 μL PBS per well of a polystyrene MTP overnight at 4 °C. As control, coat 100–200 ng per well of BSA or streptavidin (see Note 6) or, in case the antibody selection was performed against a recombinant antigen fused to a purification tag, e.g., Fc-fusion, coat an unrelated antigen fused to the same Fc-part as negative control.
2. Wash the coated microtiter plate wells 3× with PBST (see Note 7).
3. Block the antigen-coated wells with MPBST for 1 h at RT. The wells must be completely filled.
4. Fill 50 μL MPBST in each well and add 50 μL of antibody-containing supernatant (see Note 8). Incubate for 1.5 h at RT/37 °C (or overnight at 4 °C).
5. Wash the microtiter plate wells 3× with PBST (see Note 7).
6. Incubate 100 μL/well mouse Myc1-9E10 α-myc tag antibody solution (appropriate dilution in MPBST) for 1 h at RT/37 °C.
7. Wash the microtiter plate wells 3× with PBST (see Note 7).
8. Incubate 100 μL/well goat α-mouse HRP conjugate (appropriate dilution in MPBST) for 1 h at RT/37 °C.
9. Wash the microtiter plate wells 3× with PBST (see Note 7).
10. Shortly before use, mix 19 parts TMB substrate solution A and 1 part TMB substrate solution B. Add 100 μL of this TMB substrate developing solution into each well and incubate for 1–30 min.
11. Stop the color reaction by adding 100 μL 1 N sulfuric acid solution per well. The color turns from blue to yellow.
12. Measure the extinction at 450 nm using an ELISA reader (reference wavelength 620 nm).
13. Identify positive candidates with a signal (on antigen) 10 times over the noise (on control protein, e.g., BSA) (see Note 9).
14. DNA sequencing of binders is performed with appropriate oligonucleotide primers as described before [17]. The antibody sequences can be analyzed by VBASE2 (www.vbase2.org) (Tool: Fab/scFab/scAb/scFv Analysis).
The effective number of unique binders among the ELISA positive hits can be determined after sequencing (see Note 10).
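The hit criterion of steps 12 and 13 can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' analysis script. The well readings are invented example values, the function names are hypothetical, and the use of the background level from Note 9 (about 0.02) as a lower bound for the control signal is an assumption introduced here to avoid dividing by readings close to zero.

```python
def corrected_od(od450, od620):
    """Subtract the reference-wavelength reading (620 nm) from the 450 nm reading."""
    return od450 - od620


def call_hits(antigen, control, min_ratio=10.0, noise_floor=0.02):
    """Return {well: signal-to-noise ratio} for wells that pass the hit criterion.

    antigen, control: dicts mapping well name to corrected OD on antigen and
    on the control protein (e.g., BSA), respectively.
    noise_floor: assumed lower bound for the control signal (see lead-in).
    """
    hits = {}
    for well, signal in antigen.items():
        noise = max(control[well], noise_floor)
        ratio = signal / noise
        if ratio >= min_ratio:
            hits[well] = ratio
    return hits


# Invented example readings (corrected OD), for illustration only.
antigen = {"A1": 1.20, "A2": 0.15, "A3": 0.45}
control = {"A1": 0.03, "A2": 0.02, "A3": 0.01}

hits = call_hits(antigen, control)
for well, ratio in sorted(hits.items()):
    print(f"{well}: signal/noise = {ratio:.1f}")
```

With these example values, A1 and A3 pass the tenfold threshold and A2 does not. Whether a ratio of exactly 10 counts as a hit is a choice made in this sketch; the protocol only states "10 times over the noise".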

4 Notes

1. We observed that antibody::pIII fusion proteins and antibody phage sometimes show differences in antigen binding in comparison to soluble antibody fragments, because some antibodies can bind the corresponding antigen only as pIII fusion. Therefore, we recommend performing the screening procedure only with soluble antibody fragments, to avoid false positive binders.
2. When using different incubators, optimal plate shape, cultivation volume, and rpm may vary.
3. We recommend picking 92 clones. Use the wells H3, H6, H9, and H12 for controls. H3 and H6 are negative controls; these wells will not be inoculated. We inoculate the wells H9 and H12 with a clone containing a phagemid encoding a known antibody fragment. In ELISA, the wells H9 and H12 are coated with the antigen corresponding to this control antibody fragment in order to check scFv production and the success of the ELISA.
4. The appropriate IPTG concentration for induction of antibody or antibody::pIII expression depends on the vector design. A concentration of 50 μM was well suited for vectors with a Lac promoter like pSEX81 [26], pIT2 [27], pHENIX [28], and pHAL14 [6, 29, 30]. The method for the production of soluble antibodies works with vectors with an amber stop codon (e.g., pHAL14 or pHAL30) between the antibody fragment gene and the pIII encoding gene (gIII). If the vector has no amber stop codon, mainly antibody::pIII fusion protein will be produced [31]. Buffered culture media and the addition of saccharose enhance the production of many but not all scFvs [22].
5. Sterile filtration can reduce antibody degradation, allowing for longer-term storage. Stability of the scFv, filtered or in crude bacterial production supernatant, can also be assessed individually to establish the optimal storage conditions.
6. More hydrophobic oligopeptides may need to be dissolved in PBS containing 5% DMSO. If biotinylated oligopeptides are used as antigen for panning, dissolve 200 ng streptavidin in 150 μL PBS and coat overnight at 4 °C. Empty the wells and wash 3× with PBST. Dissolve 100–500 ng biotinylated oligopeptide in PBS and incubate for 1 h at RT. Alternatively, oligopeptides with a terminal cysteine residue can be coupled to BSA and coated overnight at 4 °C.
7. The washing should be performed with an ELISA washer (e.g., TECAN Columbus Plus) to increase the stringency and to provide plate-to-plate reproducibility. To remove antigen, blocking, or antibody solutions, wash 3× with PBST (“standard washing protocol” for TECAN washer). If no ELISA washer is available, wash manually 3× with PBST.
8. Due to evaporation, less than 100 μL of scFv supernatant may be available after production. In this case, the volume of scFv supernatant for the test can be reduced down to 10 μL, and the volume is made up to 50 μL with MPBST. This approach can also be used to test the scFv binding specificity for different antigens in parallel, which is valuable for different protein target variants or cross-species binding assays.
9. The background (noise) signals should be about O.D.450 ~ 0.02 after 5–30 min TMB incubation time.
10. Typically, enough hits yielding antibodies suitable for simple research applications, like western blot, can be identified from a single ELISA plate. For example, to obtain antibodies to 196 different antigens for research provided to consortium partners in the EU initiative “Affinomics,” an average of 1.4 ELISA plates (corresponding to the analysis of 128 clones per antigen) was needed to yield 1368 monoclonal recombinant antibodies (with a median of five different antibodies per target, see Fig. 1). In contrast, “deep” screens are done for therapeutic targets where many additional properties of the antibody candidate have to be considered, like stability, epitope, and production yields. Here, automated ELISA of 30,000+ clones typically yields a few hundred different antibody clones.

Fig. 1 Histogram representing the number of different validated antibodies (identified by different DNA sequence) generated per antigen from highly parallelized antibody selections within the EU program “Affinomics.” For each antigen, 128 clones were analyzed. Antibodies were considered as validated if they worked in at least two different binding assays
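The plate layout recommended in Note 3 can be written down as a small script, which also reproduces the throughput figure of 920 clones for ten plates given in the Introduction. This is a sketch for bookkeeping purposes only; the function and variable names are hypothetical, and the numbering of clones row by row is an assumption (the protocol does not prescribe a picking order).

```python
from string import ascii_uppercase

ROWS = ascii_uppercase[:8]   # rows A to H of a 96-well plate
COLS = range(1, 13)          # columns 1 to 12
NEGATIVE = {"H3", "H6"}      # not inoculated (Note 3)
POSITIVE = {"H9", "H12"}     # inoculated with a known control clone (Note 3)


def plate_layout():
    """Map each well to 'negative control', 'positive control', or 'clone N'."""
    layout = {}
    clone = 0
    for row in ROWS:
        for col in COLS:
            well = f"{row}{col}"
            if well in NEGATIVE:
                layout[well] = "negative control"
            elif well in POSITIVE:
                layout[well] = "positive control"
            else:
                clone += 1  # assumed row-by-row numbering of picked clones
                layout[well] = f"clone {clone}"
    return layout


layout = plate_layout()
clones_per_plate = sum(name.startswith("clone") for name in layout.values())
print(f"clones per plate: {clones_per_plate}")
print(f"clones in 10 plates: {10 * clones_per_plate}")
```

Keeping the control positions in one place makes it easier to apply the same map to the master plate, the mirror plate, and the ELISA plate.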

Acknowledgements

The support within the EU program “Affinomics” is gratefully acknowledged. This chapter is an updated and revised version of the MTP production protocol included in Russo et al. (2018), Methods Mol Biol [17].

References

1. Trott M, Weiß S, Antoni S et al (2014) Functional characterization of two scFv-Fc antibodies from an HIV controller selected on soluble HIV-1 Env complexes: a neutralizing V3- and a trimer-specific gp41 antibody. PLoS One 9:e97478
2. Chan S-W, Bye JM, Jackson P, Allain J-P (1996) Human recombinant antibodies specific for hepatitis C virus core and envelope E2 peptides from an immune phage display library. J Gen Virol 77:2531–2539
3. Schofield DJ, Pope AR, Clementel V et al (2007) Application of phage display to high throughput antibody generation and characterization. Genome Biol 8:R254
4. Glanville J, Zhai W, Berka J et al (2009) Precise determination of the diversity of a combinatorial antibody library gives insight into the human immunoglobulin repertoire. Proc Natl Acad Sci U S A 106:20216–20221
5. de Wildt RMT, Mundy CR, Gorick BD, Tomlinson IM (2000) Antibody arrays for high-throughput screening of antibody–antigen interactions. Nat Biotechnol 18:989–994
6. Hust M, Meyer T, Voedisch B et al (2011) A human scFv antibody generation pipeline for proteome research. J Biotechnol 152:159–170
7. Kügler J, Wilke S, Meier D et al (2015) Generation and analysis of the improved human HAL9/10 antibody phage display libraries. BMC Biotechnol 15:10
8. Frenzel A, Schirrmann T, Hust M (2016) Phage display-derived human antibodies in clinical development and therapy. MAbs 8:1177–1194
9. Parmley SF, Smith GP (1988) Antibody-selectable filamentous fd phage vectors: affinity purification of target genes. Gene 73:305–318
10. Breitling F, Dübel S, Seehaus T et al (1991) A surface expression vector for antibody screening. Gene 104:147–153
11. Hawlisch H, Müller M, Frank R et al (2001) Site-specific anti-C3a receptor single-chain antibodies selected by differential panning on cellulose sheets. Anal Biochem 293:142–145
12. Moghaddam A, Borgen T, Stacy J et al (2003) Identification of scFv antibody fragments that specifically recognise the heroin metabolite 6-monoacetylmorphine but not morphine. J Immunol Methods 280:139–155
13. Hust M, Maiss E, Jacobsen H-J, Reinard T (2002) The production of a genus-specific recombinant antibody (scFv) using a recombinant potyvirus protease. J Virol Methods 106:225–233
14. Schütte M, Thullier P, Pelat T et al (2009) Identification of a putative Crf splice variant and generation of recombinant antibodies for the specific detection of Aspergillus fumigatus. PLoS One 4:e6625
15. Keller T, Kalt R, Raab I et al (2015) Selection of scFv antibody fragments binding to human blood versus lymphatic endothelial surface antigens by direct cell phage display. PLoS One 10:e0127169
16. Rezaei J, RajabiBazl M, Ebrahimizadeh W et al (2016) Selection of single chain antibody fragments for targeting prostate specific membrane antigen: a comparison between cell-based and antigen-based approach. Protein Pept Lett 23:336–342
17. Russo G, Meier D, Helmsing S et al (2018) Parallelized antibody selection in microtiter plates. Methods Mol Biol 1701:273–284
18. Ayriss J, Woods T, Bradbury A, Pavlik P (2007) High-throughput screening of single-chain antibodies using multiplexed flow cytometry. J Proteome Res 6:1072–1082
19. Hoet RM, Cohen EH, Kent RB et al (2005) Generation of high-affinity human antibodies by combining donor-derived and synthetic complementarity-determining-region diversity. Nat Biotechnol 23:344–348
20. Jäger V, Büssow K, Wagner A et al (2013) High level transient production of recombinant antibodies and antibody fusion proteins in HEK293 cells. BMC Biotechnol 13:52
21. Konthur Z, Hust M, Dübel S (2005) Perspectives for systematic in vitro antibody generation. Gene 364:19–29
22. Hust M, Steinwand M, Al-Halabi L et al (2009) Improved microtitre plate production of single chain Fv fragments in Escherichia coli. New Biotechnol 25:424–428
23. Sambrook J, Russell DW (2001) Molecular cloning: a laboratory manual, 3rd edn. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY
24. Goffinet M, Chinestra P, Lajoie-Mazenc I et al (2008) Identification of a GTP-bound Rho specific scFv molecular sensor by phage display selection. BMC Biotechnol 8:34
25. Lillo AM, Ayriss JE, Shou Y et al (2011) Development of phage-based single chain Fv antibody reagents for detection of Yersinia pestis. PLoS One 6:e27756
26. Welschof M, Terness P, Kipriyanov SM et al (1997) The antigen-binding domain of a human IgG-anti-F(ab’)2 autoantibody. Proc Natl Acad Sci U S A 94:1902–1907
27. Goletz S, Christensen PA, Kristensen P et al (2002) Selection of large diversities of anti-idiotypic antibody fragments by phage display. J Mol Biol 315:1087–1097
28. Finnern R, Pedrollo E, Fisch I et al (1997) Human autoimmune anti-proteinase 3 scFv from a phage display library. Clin Exp Immunol 107:269–281
29. Pelat T, Hust M, Laffly E et al (2007) High-affinity, human antibody-like antibody fragment (single-chain variable fragment) neutralizing the lethal factor (LF) of Bacillus anthracis by inhibiting protective antigen-LF complex formation. Antimicrob Agents Chemother 51:2758–2764
30. Kirsch M, Hülseweh B, Nacke C et al (2008) Development of human antibody fragments using antibody phage display for the detection and diagnosis of Venezuelan equine encephalitis virus (VEEV). BMC Biotechnol 8:66
31. Mersmann M, Schmidt A, Tesar M et al (1998) Monitoring of scFv selected by phage display using detection of scFv-pIII fusion proteins in a microtiter scale assay. J Immunol Methods 220:51–58

Chapter 10

High-Throughput Production of Influenza Virus-Like Particle (VLP) Array by Using VLP-factory™, a MultiBac Baculoviral Genome Customized for Enveloped VLP Expression

Duygu Sari-Ak, Shervin Bahrami, Magdalena J. Laska, Petra Drncova, Daniel J. Fitzgerald, Christiane Schaffitzel, Frederic Garzoni, and Imre Berger

Abstract

Baculovirus-based expression of proteins in insect cell cultures has emerged as a powerful technology to produce complex protein biologics for many applications ranging from multiprotein complex structural biology to manufacturing of therapeutic proteins including virus-like particles (VLPs). VLPs are protein assemblies that mimic live viruses but typically do not contain any genetic material, and therefore are safe and attractive alternatives to live attenuated or inactivated viruses for vaccination purposes. MultiBac is an advanced baculovirus expression vector system (BEVS) which consists of an engineered viral genome that can be customized for tailored applications. Here we describe the creation of a MultiBac-based VLP-factory™, based on the M1 capsid protein from influenza, and its application to produce in a parallelized fashion an array of influenza-derived VLPs containing functional mutations in influenza hemagglutinin (HA) thought to modulate the immune response elicited by the VLP.

Key words Baculovirus expression vector system (BEVS), Small-scale production, Virus-like particle (VLP), MultiBac, Cre recombinase, Cre-LoxP fusion, Influenza, Hemagglutinin (HA)

1 Introduction

The inception of recombinant protein production technologies has had a decisive impact on both life sciences and drug discovery. Through recombinant overproduction, proteins that had been elusive before can now be produced in the quantity and quality required to decipher their structure and function. This set the stage for designing and validating intervention strategies to modulate their activity, which can now be translated into medications and therapies, through an elaborate process called the drug-discovery pipeline.

Renaud Vincentelli (ed.), High-Throughput Protein Production and Purification: Methods and Protocols, Methods in Molecular Biology, vol. 2025, https://doi.org/10.1007/978-1-4939-9624-7_10, © Springer Science+Business Media, LLC, part of Springer Nature 2019


Fig. 1 Influenza virus-like particles. Influenza virus structure, as deduced from electron microscopy (EM), is shown (left). The genetic material (green), vital for infectivity, is protected by an envelope made of proteins hemagglutinin (HA, colored in blue), neuraminidase (NA, colored in red), and matrix proteins (M1, colored in brown; M2, colored in purple). HA and NA decorate the lipid membrane envelope. Synthetic influenza virus-like particles (VLPs) can be produced by recombinant protein overproduction (right). The VLPs can contain all envelope proteins as shown here, resulting in particles that are structurally virtually identical to the live virus with the important distinction being that all genetic material is absent from the VLPs, rendering the VLPs noninfectious and safe. Alternatively, VLPs can contain a subset of the envelope proteins, such as only hemagglutinin as presented in this study. Such VLPs are excellent candidates for vaccination. The influenza virus illustration was adapted from www.kimicontrol.com/microorg, courtesy of D. Jordan

Recombinant protein overproduction also enables genetic engineering of complex protein specimens that themselves can be used as therapeutic drugs. For example, viruses contain protein materials, which typically enclose and protect the viral genetic information required by the virus for replication upon infecting an organism. Recombinant production of the protein shell components of a virus in the absence of its genetic material results in virus-like particles (VLPs). VLPs structurally resemble the live virus, but do not contain the genetic material which is essential for infectivity. Such recombinant VLPs are presently being used to vaccinate humans against numerous diseases including influenza (Fig. 1), and also cancers caused by viral infection, a prominent example being cervical cancer caused by papilloma virus [1–6]. Research and development in academia and industry has long focused on small and isolated protein entities. Frequently, these are only the fragments or domains of a protein of interest that can be efficiently produced by the recombinant host systems available, most typically the prokaryote Escherichia coli. To this day, expression in E. coli has dominated the protein production field. For example, >90% of all entries in the Protein Data Bank (PDB) were produced in E. coli. For many therapeutic proteins, E. coli-based


production has proven to be effective, efficient, and relatively inexpensive. For example, the wide availability of E. coli-produced recombinant human insulin has revolutionized the treatment of patients suffering from diabetes, greatly reducing its cost and also alleviating allergic reactions hitherto occurring when insulin extracted from pigs or cows had been administered as a therapeutic [7]. Massive investment over the last 30 years into optimizing protein expression in E. coli has resulted in a large collection of genetically modified strains with widely diverse properties, each tailored to particular experimental needs. An enormous and growing assortment of circular DNA molecules (plasmids) exists for expression in E. coli, to enable the production of heterologous proteins encoded by genes which are inserted into these plasmids. E. coli expression has become commonplace in virtually every molecular biology laboratory in public and private research institutions [8–11]. The explosion of biological data which became available recently thanks to breakthroughs in genomics, proteomics, and interactomics research, however, has revealed that the actual actors of cellular processes, particularly in eukaryotes, are not isolated proteins but rather complex multiprotein assemblies composed of up to a dozen or more protein subunits. Many of the individual eukaryotic proteins and most of the protein complexes, including therapeutically important assemblies such as VLPs, cannot be produced efficiently in E. coli. The failure of the prokaryote E. coli to produce complex human protein specimens is not surprising given that eukaryotic cells and their component proteins are in general much more complex than prokaryotes and demand different folding, processing, and posttranslational modifications (i.e., decorations with small molecules such as sugars or phosphate groups conferring activity) that the E. coli protein production machinery cannot support. 
Consequently, eukaryotic protein expression systems have entered center-stage [12–16]. Among eukaryotic techniques, the baculovirus expression vector system (BEVS) has emerged as a particularly powerful method to produce complex protein biologics that are authentically targeted and posttranslationally processed. We introduced MultiBac, a modular baculovirus-based expression system particularly useful for producing protein specimens comprising many subunits [17–25]. MultiBac consists of an engineered baculovirus from which we eliminated modalities detrimental to protein complex production. To facilitate the insertion of several to many protein encoding genes into the MultiBac virus, we created a set of small, circular DNA modules called Acceptor and Donor plasmids. Genes of interest can be inserted into these Acceptors and Donors by using ligation independent cloning procedures, and then combined into a multigene transfer plasmid by site specific recombination catalyzed by Cre recombinase. The multigene transfer plasmid is


then inserted by transposition into the MultiBac baculoviral genome which exists as a bacterial artificial chromosome in special E. coli cells. Moreover, we outfitted MultiBac with a further site-specific recombination site directly inserted into the baculoviral genome. This site can be accessed by means of a Cre-LoxP fusion reaction with further Donor plasmids, which can contain genes encoding additional subunits of a protein complex of choice or, alternatively, modalities to modify the heterologous protein produced, such as kinases, phosphatases, glycosylases, and others [25–30]. MultiBac is now in use in many laboratories (>1000) worldwide and has contributed critically to accelerating ambitious research programs in academia and industry [25]. In this report, we utilize a version of MultiBac which we designed for efficient expression of virus-like particles based on the influenza M1 capsid protein. We integrated a gene expressing M1 from an H1N1 influenza strain into the viral backbone by site-specific recombination, together with a fluorescent protein (mCherry) to monitor virus performance and VLP production (Fig. 2). Using this customized genome, VLP-factory™, we expressed in a parallelized fashion an array of influenza VLPs containing a series of mutations in a potentially immune-suppressive domain (ISD) within the influenza surface protein hemagglutinin (HA). ISDs were originally discovered in retrovirus surface proteins and recently also in influenza, and evidence suggests that the influenza ISD is identical to or overlapping with the fusion peptide segment within HA [31]. It has recently been shown that virus-mediated membrane fusion is capable of inducing innate immune responses in macrophages and dendritic cells [32]. Apparently, the HA fusion peptide/ISD can thus inhibit this essential function of the immune system.
For proper induction of an immune response in vaccines it is essential to maintain the membrane fusion capability of the HA, while removing its immune suppressive activity. Therefore, ISD mutations must be identified that abolish immune suppressive activity while maintaining membrane fusion activity. This is, however, no trivial task since the ISD is located on a structurally sensitive part of HA. The availability of an array of influenza VLPs containing randomized amino acids at defined positions in the ISD would be instrumental to identify such mutants. The influenza VLP variants, once produced from the array, comprising wild-type HA or HA mutants with randomized amino acids at defined positions in the HA ISD, can then be tested downstream for membrane fusion activity (e.g., by hemolysis assay) and for eliciting and modulating immune response in vivo in animal models. Influenza VLPs containing mutations of the ISDs which may abolish the immune suppressive activity hold the promise to develop into VLP-based hyper-immunogenic antigens that could lead to broadly protective influenza vaccines by eliciting a strong antibody titer upon immunization (in contrast to native HAs, which are weak antigens).

High-Throughput Influenza VLP Array Production by MultiBac Based VLP-Factory™


Fig. 2 MultiBac-based VLP-factory™. The MultiBac baculoviral genome was customized for enveloped virus-like particle production by Cre-LoxP-mediated insertion of a plasmid comprising expression cassettes for influenza H1N1 matrix protein M1 and fluorescent protein mCherry as a marker for tracking virus performance, as shown schematically on the right. Insertion of transfer plasmids comprising expression cassettes for influenza surface proteins (hemagglutinin HA and/or neuraminidase NA) results in efficient expression of influenza virus-like particles (VLPs), as shown in the EM images at the bottom. Recombinant influenza VLPs produced by the customized MultiBac genome (VLP-factory™) are virtually indistinguishable in size, shape, and appearance from live influenza virus (inset, image courtesy of R. Ruigrok). Scale bars (50 μm) are drawn in white

We show here the production and characterization of influenza VLPs, facilitated by our customized MultiBac-derived baculoviral expression tool, VLP-factory™. VLP-factory™ relies on the self-assembling, capsid-forming influenza M1 protein which we supply from the baculoviral genome in our expression experiments. Our VLP-factory™, however, is by no means restricted to the type of VLPs presented in this study. It is conceivable that most viral envelope proteins that can be produced efficiently in insect cells infected with our customized baculovirus will be efficiently incorporated in M1-based VLPs during the budding process. Thus, we anticipate that many other enveloped VLPs, from a wide range of viruses, can be produced efficiently by our approach, opening exciting avenues to produce potent VLP-based vaccines to combat viral disease.

Duygu Sari-Ak et al.

2 Materials

Expression construct design is carried out in silico using a DNA cloning software of choice (e.g., VectorNTI, ApE, and others). We used gene synthesis for DNA encoding M1 from H1N1 influenza virus and hemagglutinin derived from both H1N1 and H5N1 influenza viral strains, and we used commercial mutagenesis services to introduce a set of potentially immune-modulating mutations into the hemagglutinin-encoding genes. All internal restriction sites that may at some point be useful for subcloning into the MultiBac plasmids were eliminated by design, to facilitate possible future inclusion of further proteins, such as NA and M2, or glycosylases to modulate the sugar structure of the heterologous proteins if desired [26–28], in our parallelized expression experiments. We codon-optimized the genes of interest for expression in insect cells using the web-based algorithms provided at no cost by most synthetic DNA suppliers (e.g., www.idtdna.com/CodonOpt, OptimumGene™ from www.genscript.com), concomitantly removing potentially harmful RNA secondary structure elements in the transcripts. During codon optimization, we eliminated any restriction sites that are part of the so-called multiplication modules in the Acceptor and Donor plasmids to allow for flexibility of gene assembly if further proteins, for example, other influenza factors or posttranslational modifiers such as glycosylases, are to be co-expressed.

All reagents are prepared using ultrapure water (Millipore Milli-Q system or equivalent; resistivity of 18.2 MΩ·cm at 25 °C) and analytical grade reagents. Buffers, antibiotics, and enzymes are stored at −20 °C. We have described transfer plasmid generation and insertion into the MultiBac genome in detail elsewhere [18, 19, 33]. Here we therefore focus on the production, purification, and analysis of the influenza VLP array we produced using our parallelized approach.
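The in silico check for internal restriction sites described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the four enzymes and the short test sequence are chosen arbitrarily for the example and are not the sites of the MultiBac multiplication modules, which should be taken from the plasmid documentation. The function name `find_sites` is likewise hypothetical.

```python
# Recognition sequences of a few common enzymes. All four are palindromic,
# so scanning one strand is sufficient. The enzyme selection is illustrative.
SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "XhoI": "CTCGAG",
}


def find_sites(dna, sites=SITES):
    """Return {enzyme: [1-based start positions]} for every site found."""
    dna = dna.upper()
    hits = {}
    for enzyme, motif in sites.items():
        positions = []
        start = dna.find(motif)
        while start != -1:
            positions.append(start + 1)          # report 1-based coordinates
            start = dna.find(motif, start + 1)   # allow overlapping matches
        if positions:
            hits[enzyme] = positions
    return hits


# Hypothetical mini-ORF used only to exercise the function.
orf = "ATGGCGAATTCTGGATCCAAATAA"
print(find_sites(orf))
```

A design passes this check when the returned dictionary is empty for the codon-optimized sequence; any reported position marks a codon that would need a synonymous change.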

2.1 Generating the MultiBac-Based VLP-factory™

2.1.1 Reagents

1. Restriction endonucleases and reaction buffers.
2. T4 DNA ligase and buffer.
3. Cre recombinase enzyme and buffer.
4. Gel extraction kit (such as a kit from Qiagen).
5. Plasmid purification kit (such as a kit from Qiagen).
6. Regular E. coli competent cells (TOP10, HB101, or comparable).
7. E. coli competent cells containing the pir gene (for Donor plasmids).
8. E. coli competent cells DH10MultiBac harboring the MultiBac baculoviral genome [18].
9. An empty plasmid (pUCDM) for the insertion of genes encoding influenza H1N1 M1 protein (GenBank ID ABD59883.1) and fluorescent protein mCherry (GenBank ID ANO45948.1) via multiple cloning sites [18].
10. Antibiotics: ampicillin, kanamycin, chloramphenicol, tetracycline (for concentrations see ref. [18]).
11. Isopropyl β-D-1-thiogalactopyranoside (IPTG).
12. Agar for pouring plates.
13. Media (LB, TB, SOC) for growing minicultures.

2.2 Generating Influenza VLP Producing Baculoviruses

2.2.1 Reagents

1. DH10VLP-factory™ competent cells (see Subheading 2.1).
2. Sf21 or Sf9 insect cells.
3. Media (e.g., SF900 II SFM from Life Technologies or HyClone from Thermo Fisher) to grow insect cells.
4. Transfection reagent (e.g., FuGENE from Promega or JetPEI from Polyplus Transfection).

2.2.2 Equipment and Consumables

1. Sterile plastic ware (6-well tissue culture plates, Eppendorf pipettes).

2.3 Production of Influenza VLP Array

2.3.1 Reagents

1. Sf21 or Sf9 insect cell cultures.
2. Media (e.g., SF900 II SFM from Life Technologies or HyClone from Thermo Fisher) to grow insect cells.
3. Liquid nitrogen for freezing of cell pellets.
4. Phosphate-buffered saline (PBS) solution (Thermo Fisher Scientific).
5. Sucrose (Sigma-Aldrich) solution (20%, 60%) in PBS, autoclaved or sterile filtered (0.2 μm nylon filter).

2.3.2 Equipment and Consumables

1. Sterile glass ware (Erlenmeyer shaker flasks, 250 ml).
2. Sterile plastic ware (Eppendorf pipettes, sterile pipette tips).
3. Sterile Falcon or Greiner tubes (15 ml, 50 ml).
4. Sterile ultracentrifugation tubes (Beckman).
5. Sterile Eppendorf pipettes.
6. Ultracentrifugation equipment (SW28 rotor or comparable, ultracentrifuge).
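For planning the 20% and 60% sucrose solutions listed under the reagents, a short Python sketch can convert percentages and tube counts into amounts to prepare. The chapter does not state whether the percentages are weight per volume or weight per weight; the sketch assumes w/v. The per-tube fill volumes (a 38 ml tube holding roughly 2 ml of 60% cushion, 1 ml of sample, and 20% sucrose for the remainder) are derived from the purification steps in Subheading 3.3, and the function names are hypothetical.

```python
def sucrose_grams(percent_wv, volume_ml):
    """Grams of sucrose for a solution of the given % (w/v, assumed)."""
    return percent_wv / 100.0 * volume_ml


def cushion_volumes(n_tubes, tube_ml=38.0, cushion_ml=2.0, sample_ml=1.0):
    """Approximate volumes of 20% and 60% sucrose needed for n cushion tubes.

    Assumes each tube is filled to tube_ml in total; adjust to the fill
    volume recommended for the tubes actually used.
    """
    return {
        "sucrose_20_ml": n_tubes * (tube_ml - cushion_ml - sample_ml),
        "sucrose_60_ml": n_tubes * cushion_ml,
    }


print(sucrose_grams(20, 500), "g sucrose for 500 ml of 20% (w/v)")
print(cushion_volumes(6))
```

Balance tubes count toward `n_tubes`, since they carry the same sucrose cushion as the sample tubes.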

3 Methods

The genes encoding the influenza M1 and HA subunits and the functional mutants are designed in silico and then inserted into the Donor and Acceptor plasmids of choice. Once designed, mono-, di-, or, if more influenza factors such as NA and M2 are to be added, even polycistronic expression cassettes can be created by a variety of means, including DNA synthesis, restriction/ligation cloning, ligation-independent cloning (LIC), sequence- and ligation-independent cloning (SLIC), or other methods such as Gibson cloning [34], as described in detail before [35, 36], according to individual user preference. Given the highly competitive commercial offers currently available, we recommend custom DNA synthesis (nota bene including complete gene sequencing) to facilitate expression cassette construction, in particular for generating mutant arrays as we did here for influenza HA.

3.1 Generating the MultiBac-Based VLP-factory™

Insert genes encoding influenza H1N1 M1 protein (GenBank ID ABD59883.1) and fluorescent protein mCherry (GenBank ID ANO45948.1) into multiple cloning sites MCS1 and MCS2 of the pUCDM plasmid [18] under polh and p10 promoter control, respectively, to yield pUCDM-M1-mCherry. Use generic restriction enzyme/ligation-based cloning or ligation-independent methods [33, 34] as preferred.

1. Transform pir+ cells with pUCDM-M1-mCherry. Plate on agar containing chloramphenicol. Grow a 25 ml bacterial culture as previously described [18] (see Note 1).
2. Using standard plasmid preparation procedures, prepare 1 μg of pure pUCDM-M1-mCherry plasmid.
3. Prepare chemically competent DH10MultiBacCre cells containing the MultiBac baculoviral genome and expressing Cre recombinase according to published procedures [17].
4. Transform pUCDM-M1-mCherry plasmid into DH10MultiBacCre cells using a standard salt-dependent transformation protocol, and plate on agar plates containing tetracycline, kanamycin, and chloramphenicol as well as IPTG and BluOGal color reagent.
5. Pick a single blue colony and grow a bacterial culture. Prepare competent cells and store in flash-frozen (liquid nitrogen) aliquots at −80 °C. These are DH10VLP-factory™ cells containing the MultiBac-based VLP-factory™ (Fig. 2).
6. Prepare VLP-factory™ V0 virus from an aliquot according to published procedures [19]. Cells infected with VLP-factory™ virus turn bright red due to mCherry expression.

3.2 Generating Influenza VLP Producing Baculoviruses

1. Transform DH10VLP-factory™ aliquots with the transfer plasmid array encoding wild-type and mutant influenza HA proteins and plate on agar containing kanamycin, gentamicin, tetracycline, IPTG, and BluOGal.
2. Pick a single white colony from each transformation and prepare composite VLP-factory™ bacmid, followed by V0 initial virus generation using transfection reagent and Sf21 insect cells as described [19]. Already at the V0 stage in 6-well plates, clones produced with composite VLP-factory™ stain bright red (Fig. 3).
3. Harvest budded VLP-factory™ virus producing influenza VLPs after 48–60 h and amplify V1 virus in 250 ml shaker flasks containing 50 ml insect cell culture [19]. Cell cultures turn red from 24 to 48 h post infection (see Note 2).

3.3 Production of Influenza VLP Array

About three days after proliferation arrest (3 dpa), the cell cultures will have a bright Bordeaux red color due to co-expressed mCherry (Fig. 3). Transfer cells into a sterile Falcon tube.

1. Spin at 4000 rpm (3,000 × g) at RT in a tabletop centrifuge (see Note 3).
2. Repeat step 1 with a fresh Falcon tube to avoid carrying over cell debris.
3. Ultracentrifuge cleared supernatant overnight at 26,000 rpm (122,000 × g) in an SW28 rotor (in centrifuge tubes containing 38 ml of supernatant each).
4. Gently resuspend the pellets in 100 μl PBS.
5. Prepare an ultracentrifuge tube (38 ml volume) containing 20% sucrose and a cushion of around 2 ml of 60% sucrose at the bottom of the tube (see Note 4).
6. Load 1 ml of resuspended VLP pellet (from step 4).
7. Prepare an identical tube with the same sucrose cushion as balance.
8. Ultracentrifuge overnight at 26,000 rpm (122,000 × g) in an SW28 rotor.
9. Remove the 20% sucrose solution by gentle pipetting and collect the white layer between the two sucrose solutions (interface on top of the 60% cushion).
10. Load the white layer on a discontinuous 20–60% sucrose gradient.
11. Ultracentrifuge overnight at 26,000 rpm (122,000 × g) in an SW28 rotor.
12. Collect the visible white band containing purified VLP species and store at 4 °C for further in vitro use, including electron microscopy (EM) (Fig. 4).
13. To remove sucrose for injection in animal models (Fig. 5), repeat ultracentrifugation as above, carefully remove the sucrose-containing supernatant, and gently resuspend the pellet in sterile cold PBS (adjust concentration to the particle density required for injections).

Fig. 3 Production of influenza VLP array comprising HA mutants. VLP-factory™ was utilized for producing an array of functional influenza VLP variants. A transfer plasmid library encoding wild-type and mutant influenza HA5 and HAB proteins was inserted in parallel into VLP-factory™ (top). Initial virus was prepared in 6-well plates, already showing the characteristic red color resulting from expression of mCherry (below). Small-scale amplification in shaker flasks (middle) resulted in sufficient cell pellet to extract and purify the influenza VLP array by sedimentation and gradient centrifugation for functional study (bottom)

Fig. 4 Electron micrographs of purified influenza VLPs. A selection of VLP specimens is shown as analyzed by negative stain electron microscopy (EM). Production of M1 only by VLP-factory™ already results in globular shapes (top row), indicating efficient VLP formation in the absence of influenza envelope protein. Co-expression of influenza HA (HA5, HAB) results in VLPs showing the characteristic pattern of protruding spikes (middle and bottom rows). Scale bars (50 μm) are drawn in white
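The centrifugation steps above quote both rotor speed and relative centrifugal force. When a different rotor is used, the conversion can be sketched with the standard relation RCF = 1.118 × 10⁻⁵ × r × rpm², with r in cm. In the Python sketch below, the maximum radius of the SW28 rotor is taken as 16.1 cm; this value is an assumption that should be checked against the rotor manual, and the function names are hypothetical.

```python
K = 1.118e-5  # constant for radius in cm and speed in rpm


def rcf(rpm, radius_cm):
    """Relative centrifugal force (x g) at a given speed and radius."""
    return K * radius_cm * rpm ** 2


def rpm_for(rcf_target, radius_cm):
    """Speed (rpm) needed to reach a target RCF at a given radius."""
    return (rcf_target / (K * radius_cm)) ** 0.5


def radius_for(rcf_value, rpm):
    """Radius (cm) implied by a quoted pair of RCF and speed."""
    return rcf_value / (K * rpm ** 2)


SW28_RMAX_CM = 16.1  # assumed maximum radius; verify in the rotor manual

print(round(rcf(26000, SW28_RMAX_CM)), "x g at 26,000 rpm")
print(round(radius_for(3000, 4000), 1), "cm radius implied for the tabletop spin")
```

With the assumed radius, 26,000 rpm works out close to the 122,000 × g quoted in the steps; for a rotor with a different radius, `rpm_for` gives the speed that reproduces the same force.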

4 Notes

1. The propagation of a Donor and its derivatives depends on cells that express the pir gene (such as PIR1 and PIR2 from Invitrogen). This is due to the conditional origin present on these plasmids [37]. Acceptors and their derivatives (which we used here for the HA encoding genes), on the other hand, contain a common ColE1-derived origin and can be propagated in any E. coli cloning strains (TOP10, TG1, DH10, HB101).
2. We processed batches of up to 25 different VLP array members using this approach, which allowed us to control logistics with ease. High-throughput approaches to culture insect cells as we described more recently could conceivably allow for parallel processing of many more specimens at the same time in smaller volumes in a tabletop multi-fermenter [38].
3. All steps during influenza VLP purification are performed on ice. Sucrose solutions are prepared in PBS and sterile filtered through a 0.2 μm filter.
4. During influenza VLP production, if removal of sucrose is desired for downstream processing (DSP), the white band can be diluted in PBS and ultracentrifuged overnight at 26,000 rpm (122,000 × g) in an SW28 rotor. The pellet is then resuspended in PBS.

Fig. 5 VLP-factory™ produced influenza VLPs are functionally active. The presence of HA protein in a selected VLP preparation (HA5 M1) assayed by Western blot with specific antibody (H5N1 MIA-0052) is shown (left). The right lane in the Western blot contained VLPs comprising M1 only, and no HA signal is observed. M denotes molecular weight marker (kD). The recombinant influenza VLP was assayed in a hemolysis experiment (middle) showing up to 30% hemolysis of red blood cells (RBCs). VLPs containing M1 only did not cause hemolysis above natural level (red bar). The ELISA plot on the right shows antibody titers (IgG1) elicited after two injections (prime boost) of recombinant influenza VLP produced by using VLP-factory™, validating its protective potential against influenza infection in a murine model. PBS was used as a control

Acknowledgements

We thank all members of the Berger and Schaffitzel laboratories for helpful discussions, as well as Rob Ruigrok, Thibaut Crepin, and Darren Hart for expert insight in influenza biology. We are grateful to Karin Huard for assistance with gradient preparation, and Yan Nie for introduction to negative-stain EM. This work was supported by the European Commission Framework Programme 7 projects ComplexINC (contract nr. 279039) and SynSignal (contract nr. 613879).

Competing Financial Interest Statement

The authors declare competing financial interest. Parts of the technology described here are the subject of international patent EP2403940 and licensed exclusively to Geneva Biotech SARL.

References

1. Cox MM, Hollister JR (2009) FluBlok: a next generation influenza vaccine manufactured in insect cells. Biologicals 37(3):182–189
2. Schiller JT, Lowy DR (2006) Prospects for cervical cancer prevention by human papillomavirus vaccination. Cancer Res 66(21):10229–10232
3. Temchura V, Überla K (2017) Intrastructural help: improving the HIV-1 envelope antibody response induced by virus-like particle vaccines. Curr Opin HIV AIDS 12(3):272–277
4. Charlton Hume HK, Lua LH (2017) Platform technologies for modern vaccine manufacturing. Vaccine 35(35 Pt A):4480–4485
5. Jeong H, Seong BL (2017) Exploiting virus-like particles as innovative vaccines against emerging viral infections. J Microbiol 55(3):220–230
6. Pouyanfard S, Müller M (2017) Human papillomavirus first and second generation vaccines – current status and future directions. Biol Chem 398(8):871–889. https://doi.org/10.1515/hsz-2017-0105
7. Zaykov AN, Mayer JP, DiMarchi RD (2016) Pursuit of a perfect insulin. Nat Rev Drug Discov 15(6):425–439
8. Jia B, Jeon CO (2016) High-throughput recombinant protein expression in Escherichia coli: current status and future perspectives. Open Biol 6(8). pii: 160196
9. Angius F, Ilioaia O, Uzan M, Miroux B (2016) Membrane protein production in Escherichia coli: protocols and rules. Methods Mol Biol 1432:37–52
10. Vincentelli R, Romier C (2016) Complex reconstitution and characterization by combining co-expression techniques in Escherichia coli with high-throughput. Adv Exp Med Biol 896:43–58
11. Vincentelli R, Romier C (2013) Expression in Escherichia coli: becoming faster and more complex. Curr Opin Struct Biol 23(3):326–334
12. Nettleship JE, Assenberg R, Diprose JM, Rahman-Huq N, Owens RJ (2010) Recent advances in the production of proteins in insect and mammalian cells for structural biology. J Struct Biol 172:55–65
13. Nettleship JE, Watson PJ, Rahman-Huq N, Fairall L, Posner MG, Upadhyay A, Reddivari Y, Chamberlain JM, Kolstoe SE, Bagby S, Schwabe JW, Owens RJ (2015) Transient expression in HEK 293 cells: an alternative to E. coli for the production of secreted and intracellular mammalian proteins. Methods Mol Biol 1258:209–222
14. Nie Y, Viola C, Bieniossek C, Trowitzsch S, Vijayachandran LS, Chaillet M, Garzoni F, Berger I (2009) Getting a grip on complexes. Curr Genomics 10:558–572
15. Vijayachandran LS, Viola C, Garzoni F, Trowitzsch S, Bieniossek C, Chaillet M, Schaffitzel C, Busso D, Romier C, Poterszman A et al (2011) Robots, pipelines, polyproteins: enabling multiprotein expression in prokaryotic and eukaryotic cells. J Struct Biol 175:198–208
16. Nie Y, Chaillet M, Becke C, Haffke M, Pelosse M, Fitzgerald D, Collinson I, Schaffitzel C, Berger I (2016) ACEMBL toolkits for high-throughput multigene delivery and expression in prokaryotic and eukaryotic hosts. Adv Exp Med Biol 896:27–42
17. Berger I, Fitzgerald DJ, Richmond TJ (2004) Baculovirus expression system for heterologous multiprotein complexes. Nat Biotechnol 22(12):1583–1587
18. Fitzgerald DJ, Berger P, Schaffitzel C, Yamada K, Richmond TJ, Berger I (2006) Protein complex expression by using multigene baculoviral vectors. Nat Methods 3(12):1021–1032
19. Bieniossek C, Richmond TJ, Berger I (2008) MultiBac: multigene baculovirus-based eukaryotic protein complex production. Curr Protoc Protein Sci Chapter 5:Unit 5.20
20. Trowitzsch S, Bieniossek C, Nie Y, Garzoni F, Berger I (2010) New baculovirus expression tools for recombinant protein complex production. J Struct Biol 172(1):45–54
21. Bieniossek C, Imasaki T, Takagi Y, Berger I (2012) MultiBac: expanding the research toolbox for multiprotein complexes. Trends Biochem Sci 37(2):49–57
22. Trowitzsch S, Palmberger D, Fitzgerald DJ, Takagi Y, Berger I (2012) MultiBac complexomics. Expert Rev Proteomics 9(4):363–373
23. Barford D, Takagi Y, Schultz P, Berger I (2013) Baculovirus expression: tackling the complexity challenge. Curr Opin Struct Biol 23(3):357–364
24. Berger I, Garzoni F, Chaillet M, Haffke M, Gupta K, Aubert A (2013) The MultiBac protein complex production platform at the EMBL. J Vis Exp 11(77):e50159
25. Sari D, Gupta K, Thimiri Govinda Raj DB, Aubert A, Drncová P, Garzoni F, Fitzgerald DJ, Berger I (2016) The MultiBac baculovirus/insect cell expression vector system for producing complex protein biologics. Adv Exp Med Biol 896:199–215
26. Palmberger D, Rendic D (2015) SweetBac: applying MultiBac technology towards flexible modification of insect cell glycosylation. Methods Mol Biol 1321:153–169
27. Palmberger D, Klausberger M, Berger I, Grabherr R (2013) MultiBac turns sweet. Bioengineered 4(2):78–83
28. Palmberger D, Wilson IB, Berger I, Grabherr R, Rendic D (2012) SweetBac: a new approach for the production of mammalianised glycoproteins in insect cells. PLoS One 7(4):e34226
29. Fitzgerald DJ, Schaffitzel C, Berger P, Wellinger R, Bieniossek C, Richmond TJ, Berger I (2007) Multiprotein expression strategy for structural biology of eukaryotic complexes. Structure 15(3):275–279
30. Koehler C, Sauter PF, Wawryszyn M, Girona GE, Gupta K, Landry JJ, Fritz MH, Radic K, Hoffmann JE, Chen ZA et al (2016) Genetic code expansion for multiprotein complex engineering. Nat Methods 13(12):997–1000
31. Bahrami S, Laska MJ, Pedersen FS, Duch M (2016) Immune suppressive activity of the influenza fusion peptide. Virus Res 211:126–131
32. Holm CH, Jensen SB, Jakobsen MR, Cheshenko N, Horan KA, Moeller HB, Gonzalez-Dosal R, Rasmussen SB, Christensen MH, Yarovinsky TO et al (2012) Virus-cell fusion as a trigger of innate immunity dependent on the adaptor STING. Nat Immunol 13(8):737–743
33. Haffke M, Viola C, Nie Y, Berger I (2013) Tandem recombineering by SLIC cloning and Cre-LoxP fusion to generate multigene expression constructs for protein complex research. Methods Mol Biol 1073:131–140
34. Casini A, Storch M, Baldwin GS, Ellis T (2015) Bricks and blueprints: methods and standards for DNA assembly. Nat Rev Mol Cell Biol 16(9):568–576
35. Celie PH, Parret AH, Perrakis A (2016) Recombinant cloning strategies for protein expression. Curr Opin Struct Biol 38:145–154
36. Benoit RM, Ostermeier C, Geiser M, Li JS, Widmer H, Auer M (2016) Seamless insert-plasmid assembly at high efficiency and low cost. PLoS One 11(4):e0153158
37. Metcalf WW, Jiang W, Wanner BL (1994) Use of the rep technique for allele replacement to construct new Escherichia coli hosts for maintenance of R6Kγ origin plasmids at different copy numbers. Gene 138:1–7
38. Monteiro F, Bernal V, Chaillet M, Berger I, Alves PM (2016) Targeted supplementation design for improved production and quality of enveloped viral particles in insect cell-baculovirus expression system. J Biotechnol 233:34–41

Chapter 11

High-Throughput Protein Production of Membrane Proteins in Saccharomyces cerevisiae

Jennifer M. Johnson and Franklin A. Hays

Abstract

This chapter outlines a protocol to assess viability for large-scale protein production and purification for selected targets from an initial medium-throughput cloning strategy. Thus, one can assess a broad number of potential candidate proteins, mutants, or expression variants using an empirically minimalistic approach. In addition, a key output from this protocol is utilization of Saccharomyces cerevisiae as a means for the efficient screening and production of purified proteins. The primary focus in this protocol is overexpression of polytopic integral membrane proteins, though the methods can be equally applied to soluble proteins. The protocol starts by outlining high-throughput (sans robotics) cloning of proteins for expression into a dual-tag yeast expression plasmid. These membrane proteins are then screened for expression level, detergent solubilization, initial purity, and chromatography characteristics. Both small- and large-scale expression methods are discussed along with fermentation.

Key words Yeast expression, Protein expression, Protein purification, Fermentation, Membrane protein

Renaud Vincentelli (ed.), High-Throughput Protein Production and Purification: Methods and Protocols, Methods in Molecular Biology, vol. 2025, https://doi.org/10.1007/978-1-4939-9624-7_11, © Springer Science+Business Media, LLC, part of Springer Nature 2019

1 Introduction

Yeast systems, such as Saccharomyces cerevisiae and Pichia pastoris, are widely used to overproduce target proteins (both heterologous and homologous overexpression) for biochemical and structural characterization [1–4]. Working with yeast has distinct advantages over other systems. Yeast are (1) capable of rapid high-density growth to generate large amounts of biomass with minimal cost and time, (2) unicellular organisms well suited to genetic manipulation, (3) capable of posttranslational modifications that may be required for proper protein folding and targeting, (4) generally nontoxic, so they do not require onerous laboratory safety and management protocols, and (5) model organisms that have been extensively characterized and are well suited to increased-throughput methodologies. As such, the budding yeast S. cerevisiae was adopted early on in structural biology initiatives for the overproduction of protein targets toward structure determination efforts. Such efforts often focused on more complex eukaryotic proteins (e.g., integral membrane proteins) that were resistant to overproduction in bacterial systems [3]. This highlights the progression one often navigates when trying to overexpress eukaryotic polytopic integral membrane proteins (IMPs): one first attempts bacterial systems (e.g., E. coli), then moves to yeast systems (e.g., S. cerevisiae or P. pastoris), insect cell systems (e.g., Sf9), and finally mammalian systems (e.g., CHO or HEK293) until reaching protein production levels sufficient for the studies being pursued (e.g., structural biology or enzymology). The current protocol focuses on utilizing S. cerevisiae to express candidate proteins, followed by empirical screening and evaluation of the overexpressed protein product.

Producing large amounts of purified protein is a prerequisite to determining molecular structures. The structure determination pipeline for IMPs is an empirical minefield of hurdles and failed experiments. Common difficulties associated with this endeavor are (1) overexpression of active protein, (2) extraction (“solubilization”) from cellular membranes, (3) mono- or pauci-dispersity in solution following solubilization (in pragmatic terms, not eluting in the void peak on size exclusion chromatography), (4) crystallization, and (5) diffraction resolution barriers (low-angle data). Recent advances in microfocus beamlines [5, 6], X-ray free electron lasers [7], and electron microscopy [8, 9] have provided some relief from the hurdles associated with crystallogenesis and diffraction barriers. Yet one must still produce sufficient quantities of functional protein to study, which remains a daunting challenge for eukaryotic IMPs [10].
This protocol presents a pragmatic approach to overcoming the overexpression, detergent extraction, and solution dispersity hurdles: funnel a broad number of protein targets into an empirically defined pipeline, with evaluation metrics along the way, as a means to identify suitable targets for downstream structural studies. Thus, the current protocol presents a tapered approach in which a broad number of targets are cloned and only a few ranked proteins survive to the final stages of labor-intensive structure determination or enzymology. IMP crystallization and structure determination are covered extensively elsewhere [3, 10, 11] and are not included in the current protocol. At each step of the pipeline (Fig. 1), one may broaden the empirical approach to capture more targets for downstream progression, though the empirical burden increases as this occurs. For instance, this protocol utilizes only one detergent, 2,2-didecylpropane-1,3-bis-β-D-maltopyranoside (LMNG), for testing solubilization of IMP targets from cellular membranes. LMNG is selected because it has shown an increased propensity for IMP solubilization in functional form and it belongs to a relatively new class of maltose neopentyl glycol amphiphile detergents [12]. One could just as easily pick n-dodecyl-β-D-maltopyranoside (DDM), which is one of the most represented detergents in currently determined IMP structures [10, 12, 13]. The key point is that one is focused on forward progression of selected protein targets toward structure determination. Ideally, one will select a range of constructs or proteins to initiate the pipeline and, ultimately, to yield high-value targets for crystallization trials. Indeed, this approach proved highly successful in studies of the S. cerevisiae membrane proteome, where 351 IMPs were screened through the pipeline (using DDM) with >50% of targets passing solubilization tests and >25% of targets being purified to homogeneity [3].

Fig. 1 Schematic workflow for protein production using S. cerevisiae. Methods described in this protocol are tapered, with high-throughput cloning and plasmid construction feeding into medium-throughput expression and solubilization screening, and positive targets scaled up in relatively low-throughput protein production and characterization methods. This funneled approach is best leveraged in high-attrition empirical pursuits such as structural biology

Protein overexpression remains the primary initial hurdle in any IMP study that requires purified protein. Methods to overcome this have expanded greatly in bacterial systems [14, 15] yet remain less developed for eukaryotic systems. Optimization of protein expression level in S. cerevisiae involves modification of the expression vector (e.g., changing promoter or expression tag), growth conditions (e.g., shake flask vs. fermentation or rich media vs. minimal media), or genotype/strain (e.g., W303 vs. INVSc1) as a starting point. The expression tags used, and their location, can have a significant impact on protein expression levels [16–19]. Inclusion of a C-terminal poly-histidine tag is generally desirable because it allows production of full-length protein to be verified, and the protein to be purified, using commercially available antibodies and resins. A drawback of poly-histidine tags is the nonspecific adsorption of contaminant cellular proteins to metal affinity resins (e.g., nickel and cobalt). For instance, hydrophobic IMPs may remain adsorbed to the resin following low, and even high, concentrations of imidazole or histidine during wash and elution steps. Such contaminants can be difficult, or impossible, to remove unless further purification steps or alternative methods are utilized. One can include multiple tags to facilitate either increased protein expression (e.g., galectin or maltose binding protein) or additional purification steps (e.g., FLAG, HA, or c-myc). For example, the FLAG tag is a polypeptide epitope tag, consisting of the sequence DYKDDDDK, that can be added to the N- or C-terminus of an expressed protein to facilitate one-step purification and protein tracking [20]. In addition, one can choose a different promoter. The current vector is driven by the inducible GAL1 promoter [21]. Changing to a different tightly regulated promoter with a strong transcriptional start signal, such as ADH2, is a viable alternative [22]. Using a different selection marker may also improve protein expression levels, as different media formulations could then be tried. One can also alter growth conditions to facilitate biomass production or more target protein per cell (e.g., nutrient rich/depleted media or alternative secondary carbon sources such as raffinose). If using fermentation, one can dynamically adjust the stir rate, dissolved oxygen (DO), pH, and feeding methods (e.g., fed-batch culture) to maximize cell density and protein expression [23].
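The tag options discussed above can be illustrated at the sequence level with a small Python sketch that adds a FLAG epitope (DYKDDDDK, as given in the text) at either terminus together with a C-terminal poly-histidine tag. The layout is an assumption made for illustration: the actual tag order, linkers, and protease sites of the dual-tag vector are not specified in this section, the default of six histidines is arbitrary, and `add_tags` is a hypothetical helper.

```python
FLAG = "DYKDDDDK"  # FLAG epitope sequence as quoted in the text


def add_tags(protein, flag_terminus="N", n_his=6):
    """Return the protein sequence with a FLAG tag and a C-terminal His tag.

    flag_terminus: "N" places FLAG directly after the initiator Met,
                   "C" places FLAG between the protein and the His tag.
    n_his:         number of histidines (assumed value, vector dependent).
    """
    if flag_terminus not in ("N", "C"):
        raise ValueError("flag_terminus must be 'N' or 'C'")
    his = "H" * n_his
    if flag_terminus == "N":
        # Keep a single initiator Met at the very start of the construct.
        body = protein[1:] if protein.startswith("M") else protein
        return "M" + FLAG + body + his
    return protein + FLAG + his


# Hypothetical five-residue "protein" used only to show the layout.
print(add_tags("MSTLK", "N"))
print(add_tags("MSTLK", "C"))
```

Because a C-terminal tag is only present on fully translated protein, detecting it is one way to confirm that full-length product was made.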
Oxygen availability can affect growth rates and energy availability [24]. Previous studies have demonstrated that reducing growth temperatures can improve yields if toxicity is suspected [25], but it can also be detrimental to the expression of IMPs [1]. Several reports have shown that chemical chaperones can increase protein expression levels by improving the folding of IMPs [26, 27]. These chemical chaperones include DMSO (2.5% v/v), glycerol (10% v/v), and histidine (0.04 mg/ml). One may also alter the media by changing the composition of the carbon sources. Finally, one can utilize gene redesign and codon optimization to improve protein expression levels [28–30]. The current protocol utilizes a synthetic complete histidine dropout medium for selection in both shake flask and fermentation methods.
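The chemical chaperone concentrations quoted above can be converted into amounts per culture with a short Python sketch. The concentrations are those given in the text; treating them as final concentrations in the stated culture volume, and the function name `additive_amount`, are assumptions made for this illustration.

```python
# Final concentrations as quoted in the text: (value, unit)
ADDITIVES = {
    "DMSO": (2.5, "% v/v"),
    "glycerol": (10.0, "% v/v"),
    "histidine": (0.04, "mg/ml"),
}


def additive_amount(name, final_volume_ml):
    """Return (amount, unit) of one additive for a given final culture volume.

    Percent v/v additives are returned in ml; mg/ml additives in mg.
    """
    value, unit = ADDITIVES[name]
    if unit == "% v/v":
        return value / 100.0 * final_volume_ml, "ml"
    return value * final_volume_ml, "mg"


for name in ADDITIVES:
    amount, unit = additive_amount(name, 1000)
    print(f"{name}: {amount:g} {unit} per 1000 ml culture")
```

The additives are alternatives to be screened separately, so each amount is computed independently of the others.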

High-Throughput Membrane Proteins Production in Saccharomyces cerevisiae

2 Materials

2.1 High-Throughput Cloning and Plasmid Construction

2.1.1 Reagents

1. All solutions are prepared using ultrapure water (referred to as MilliQ H2O, prepared by purifying deionized water to attain a resistivity of 18 MΩ·cm at 25 °C).
2. Order synthetic DNA oligomers in desalted form to avoid further purification or modification (see Note 1). For high-throughput cloning, DNA primers can be ordered (e.g., from Sigma-Aldrich) in 96-well format and normalized to 100 μM working concentrations.
3. Phusion DNA Polymerase (ThermoFisher) is a high-fidelity DNA polymerase used for all PCR reactions in this protocol, except colony PCR to verify inserts. Phusion is well suited to high-throughput cloning as it is not prone to introducing mutations. For high GC content sequences, the Advantage 2 Polymerase Mix (Clontech) is recommended.
4. Ethidium bromide (EtBr, 2 mg/ml) is used to visualize DNA under UV light. It is highly carcinogenic and light sensitive. Keep wrapped in aluminum foil and store at RT (see Note 2).
5. 50× TAE (2 M Tris–HCl, 1 M acetic acid, 50 mM EDTA) is used to make 1× TAE for DNA agarose gel electrophoresis.
6. Molecular biology grade ethanol (Sigma) is used to wash or precipitate DNA.
7. Molecular biology grade agarose (Research Products International) is used for DNA agarose gels.
8. Restriction enzymes and corresponding 10× buffer stocks were obtained from New England BioLabs and stored at −20 °C until use (specific buffer compositions can be found in the appendix of the NEB catalogue) (see Note 3).
9. 6× Orange-G dye is used for visualization of migration fronts during DNA electrophoresis.
10. Quick-Load® 2-Log DNA Ladder (0.1–10.0 kb, New England BioLabs) is used as a size marker to identify DNA fragments separated by DNA agarose gel electrophoresis.
11. The QIAquick® gel extraction kit (Qiagen) is used for isolation of DNA fragments from agarose gels.
12. Any suitable T4 DNA ligase and ligation buffer (e.g., New England BioLabs) is sufficient for DNA ligation at this stage.
13. SOC media is used for the enrichment step during E. coli transformations: autoclave 1 L of SOB broth (Research Products International) and add MgCl2 (10 mM final concentration, sterile solution) and glucose (20 mM final concentration, sterile solution). Alternatively, SOC media can be prepared directly using: 20 g/L bacto-tryptone, 5 g/L yeast extract,


Jennifer M. Johnson and Franklin A. Hays

10 mM NaCl, and 2.5 mM KCl with pH adjusted to 7.0 using NaOH prior to autoclaving. Once cooled, add 10 ml of 2 M MgCl2 (10 mM final) and 10 ml of 2 M glucose (20 mM final).
14. Plasmids are maintained in Escherichia coli strain DH5α. Store at −80 °C.
15. E. coli cells are grown in Luria-Bertani (LB) medium or on LB agar (Research Products International) (see Note 4).
16. Ampicillin (200 mg/ml, Research Products International) is used for selection of positive transformants (see Note 5).
17. Plasmids are sequenced with GAL1 promoter and Cyc terminator primers using the services of a DNA sequencing facility (see Note 6).

2.1.2 Equipment and Consumables

1. Concord polycarbonate 96-well skirted PCR plates (Bio-Rad) are used for all plate-based PCR and heating/cooling reactions.
2. The QIAquick® 96 PCR purification kit (Qiagen) is used to purify PCR products.
3. Plasmids are isolated using a QIAprep® spin miniprep kit (Qiagen).
4. 48-grid vented Q-trays with divider were obtained from Genetix Inc (#X6029).
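Several reagents in Subheading 2.1.1 are prepared by diluting concentrated stocks (e.g., 1× TAE from 50× TAE, or the MgCl2 and glucose supplements added to SOC). The following is a minimal sketch of that arithmetic using the standard relation C1·V1 = C2·V2. The function names and the 1 L batch size are illustrative assumptions and not part of the protocol, and the small volume contributed by the added stock is ignored.

```python
def stock_volume_ml(stock_conc, target_conc, final_volume_ml):
    """Volume of stock (ml) needed to reach target_conc in final_volume_ml.

    Applies C1 * V1 = C2 * V2. Both concentrations must share one unit
    (mM, M, or x-fold).
    """
    if stock_conc <= 0 or final_volume_ml <= 0 or target_conc < 0:
        raise ValueError("concentrations and volume must be positive")
    if target_conc > stock_conc:
        raise ValueError("target concentration cannot exceed the stock")
    return target_conc * final_volume_ml / stock_conc


def resulting_conc(stock_conc, added_volume_ml, final_volume_ml):
    """Final concentration after adding added_volume_ml of stock."""
    return stock_conc * added_volume_ml / final_volume_ml


if __name__ == "__main__":
    batch_ml = 1000.0  # ASSUMPTION: a 1 L batch, chosen for illustration

    # 1x TAE from the 50x stock (x-fold units)
    print("50x TAE needed (ml):", stock_volume_ml(50, 1, batch_ml))

    # SOC supplements from 2 M (2000 mM) stocks
    print("2 M MgCl2 for 10 mM (ml):", stock_volume_ml(2000, 10, batch_ml))
    print("2 M glucose for 20 mM (ml):", stock_volume_ml(2000, 20, batch_ml))

    # Concentration reached when 10 ml of a 2 M stock is added to 1 L
    print("10 ml of 2 M stock gives (mM):", resulting_conc(2000, 10, batch_ml))
```

By this arithmetic, 10 ml of a 2 M stock in roughly 1 L corresponds to about 20 mM, so the MgCl2 volume listed in item 13 may be worth checking against the intended 10 mM final concentration before preparing a batch.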

2.2 Generation and Transformation of Competent S. cerevisiae Cells

2.2.1 Reagents

1. Yeast strain used for protein expression in this protocol: W303Δpep4 (leu2-3,112, trp1-1, can1-100, ade2-1, his3-11,15, Δpep4, MATα).
2. YPD broth (Research Products International) is used for initial outgrowths for competent cells. Sterilize by 0.2 μm filtration.
3. YPD agar (Difco) is used to generate agar plates for colony isolation. Plates are poured following a 15 min autoclave cycle and stored at 4 °C.
4. YPD broth (Research Products International) plus 15% (v/v) glycerol. This solution is used for the storage of competent cells at −80 °C following 0.2 μm sterile filtration.
5. PLATE solution is used in the transformation process, and consists of 40% PEG 3350 (w/v) (Sigma), 0.1 M lithium acetate (Sigma), 10 mM Tris–HCl pH 7.0 at RT, and 1 mM EDTA. The solution is brought to volume with MilliQ H2O and 0.2 μm sterile filtered (see Note 7).
6. TE Buffer (1 L) is used to resuspend cells prior to plating and is made with 10 mM Tris–HCl pH 9.0 at RT and 150 mM EDTA.
7. Molecular biology grade sterile DMSO (New England BioLabs) is used in plasmid generation.


8. Sheared salmon sperm DNA (10 mg/ml, Ambion) is used as the carrier DNA. Stored at −20 °C in 2 mg/ml aliquots.
9. CSM-His plates (500 ml = 20 plates) are used for selecting positive transformants, and consist of 0.77 g/L CSM-His (Sunrise Science Products), 20 g/L low melting agar (Research Products International), 10 g/L glucose (Sigma), 3.0 g/L ammonium sulfate (Sigma), and 1.7 g/L yeast nitrogen base without amino acids and ammonium sulfate (Research Products International), brought to volume with MilliQ H2O. Autoclave for 15 min and store plates at 4 °C.

2.3 Target Screening and Initial Expression Profiling

2.3.1 Reagents

1. Glucose solution, 40% (w/v). Autoclave for 15 min and store at RT.
2. 20× Galactose (Carbosynth) solution, 40% (w/v). Add galactose to autoclaved, sterile water. Once dissolved and cooled to room temperature, filter sterilize using a 0.45 μm filter. Store solution at RT (see Note 8).
3. 10× Raffinose (Carbosynth) solution, 10% (w/v). Filter sterilize using a 0.22 μm filter and store at RT (see Note 9).
4. 10× CSM-His solution, 7.9 g/L. Filter sterilize using a 0.22 μm filter and store at 4 °C (see Note 10).
5. 20× YNB solution: 13.4% (w/v) yeast nitrogen base without amino acids (Research Products International). Filter sterilize using a 0.22 μm filter and store at 4 °C.
6. 4× YPG solution is used as the inductant: 8% (w/v) yeast extract (Research Products International), 16% (w/v) peptone (Research Products International), and 8% (w/v) galactose. Add yeast extract and peptone to hot water to dissolve. Once dissolved, autoclave for 15 min. After the solution cools to RT, add galactose using the sterile 20× galactose stock, and stir (see Note 11).
7. Lysis Buffer: 50 mM Tris–HCl, pH 7.4 at RT, 10% (v/v) glycerol, 200 mM NaCl, 10 mM EDTA pH 8.0. Store at 4 °C.
8. Halt™ Protease Inhibitor Cocktail (ThermoFisher) is obtained as a ready-to-use 100× stock solution.
9. Solubilization Buffer: 50 mM Tris–HCl, pH 7.4 at RT, 10% (v/v) glycerol, 200 mM NaCl. Store at 4 °C.
10. 2,2-Didecylpropane-1,3-bis-β-D-maltopyranoside (LMNG) detergent is obtained from Anatrace (NG310) in solid form and prepared as a 200 mM stock solution.
11. SDS Polyacrylamide Gel Components.
(a) Resolving gel buffer: 1.5 M Tris–HCl, pH 8.8.
(b) Stacking gel buffer: 0.5 M Tris–HCl, pH 6.8.


(c) 30% Acrylamide/Bis Solution, 29:1 (Bio-Rad). Store at 4 °C (see Note 12).
(d) Ammonium persulfate: 10% (w/v) solution in water. Store at −20 °C.
(e) N,N,N′,N′-tetramethylethylenediamine (TEMED, Sigma).
(f) SDS-PAGE Running Buffer: 25 mM Tris–HCl, pH 8.3, 192 mM glycine, 0.1% (w/v) SDS (see Note 13).
(g) 4× Laemmli Sample Buffer: 200 mM Tris–HCl, pH 6.8, 8% (w/v) SDS, 40% (v/v) glycerol, 50 mM EDTA, pH 8.0, and 0.08% (w/v) bromophenol blue. Mix thoroughly and make 960 μl aliquots in 1.5 ml microfuge tubes. Prior to use, add 40 μl β-mercaptoethanol to the microfuge tube. Store at −20 °C (see Note 14).
(h) Precision Plus Protein™ Kaleidoscope™ (BioRad) is used as our protein and western blot standard.
(i) 12% SDS-PAGE Gels (makes 4 mini gels): First, prepare and pour the resolving gel using the following components in the order listed (6.6 ml MilliQ H2O, 8.0 ml 30% (w/v) acrylamide mix, 5.0 ml 1.5 M Tris–HCl pH 8.8 at RT, 0.2 ml 10% (w/v) SDS, 0.2 ml 10% (w/v) ammonium persulfate, and 0.008 ml TEMED). Once the gel has been poured, top it off with water to prevent drying. Once the resolving gel has set, pour off any remaining water. Next, prepare and pour the stacking gel (5%) using the following components in the order listed (3.4 ml MilliQ H2O, 0.83 ml 30% (w/v) acrylamide mix, 0.63 ml 1.5 M Tris–HCl pH 8.8 at RT, 0.05 ml 10% (w/v) SDS, 0.05 ml 10% (w/v) ammonium persulfate, and 0.005 ml TEMED). Insert the comb (16-lane is recommended), taking care not to trap bubbles in the wells. Once the stacking gel has set, remove from the gel apparatus and store at 4 °C (see Note 12).
12. Immunoblotting Components.
(a) PVDF transfer membrane (0.2 μm, Thermo Scientific).
(b) Western blot transfer buffer: 25 mM Tris–HCl, 192 mM glycine, and 20% (v/v) methanol.
(c) Tris buffered saline (TBS, 10× stock): 1.5 M NaCl, 0.1 M Tris–HCl, pH 7.4 at RT.
(d) 1× TBS containing 0.05% (v/v) Tween-20 (TBST).
(e) Blocking solution: 5% (w/v) milk in TBST (see Note 15).
(f) Wash buffer: 1× TBS.
(g) Thick blotting paper (BioRad).


(h) Thin blotting paper (BioRad).
(i) Mouse monoclonal HRP conjugated H-3 probe (Santa Cruz Biotechnology #SC-8036HRP).
(j) Mouse monoclonal HRP conjugated FLAG probe (Sigma #A8592–1MG).
(k) SuperSignal West Pico chemiluminescent substrate (Pierce #34080).
(l) Glogos II autorad marker (Stratagene #420201).

2.3.2 Equipment and Consumables

1. Beadbeater supplies and equipment are as follows: Beadbeater (BioSpec Inc. #909 M), Beadbeater canister (Biospec Inc. #110803–50), 0.5 mm glass beads (BioSpec Inc. #11079105), and traceable lab controller and timer (Fisher #06-662-7).
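The acrylamide percentages quoted for the gels in Subheading 2.3.1, item 11(i) can be checked directly from the listed volumes. The sketch below is only an illustration of that check: it assumes the component volumes are simply additive, and the function and dictionary names are hypothetical.

```python
def acrylamide_percent(volumes_ml, stock_key="30% acrylamide mix", stock_percent=30.0):
    """Final acrylamide percentage of a gel mix.

    volumes_ml maps component name -> volume in ml; volumes are assumed additive.
    """
    total_ml = sum(volumes_ml.values())
    return stock_percent * volumes_ml[stock_key] / total_ml


# Volumes copied from the resolving and stacking gel recipes in item 11(i)
RESOLVING_ML = {
    "MilliQ H2O": 6.6,
    "30% acrylamide mix": 8.0,
    "1.5 M Tris-HCl": 5.0,
    "10% SDS": 0.2,
    "10% ammonium persulfate": 0.2,
    "TEMED": 0.008,
}
STACKING_ML = {
    "MilliQ H2O": 3.4,
    "30% acrylamide mix": 0.83,
    "1.5 M Tris-HCl": 0.63,
    "10% SDS": 0.05,
    "10% ammonium persulfate": 0.05,
    "TEMED": 0.005,
}

if __name__ == "__main__":
    print(f"resolving gel: {acrylamide_percent(RESOLVING_ML):.2f}% acrylamide")
    print(f"stacking gel:  {acrylamide_percent(STACKING_ML):.2f}% acrylamide")
```

The same function can be used to work out how much 30% stock is needed when a different gel percentage is tried during optimization.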

2.4 Protein Expression, Fermentation, and Initial Protein Characterization

2.4.1 Reagents

1. Reagents and media in Subheadings 3.1 and 3.2 below are outlined above, except where noted.
2. Antifoam 204 is a mixture of organic polyether dispersions and is obtained from Sigma-Aldrich (#A6426) as an emulsion.
3. O2 electrolyte used in the fermentor dissolved oxygen (DO) probe was obtained from Mettler-Toledo AG (#341002016).
4. 1.5 M KOH/1.5 M NaOH can be used to adjust pH during culture growth when interfaced with a pH monitoring process control unit with a calibrated feed line.
5. Imidazole (Sigma-Aldrich #I2399-500G) is prepared as a 5 M stock solution and maintained in a brown bottle away from direct light.
6. The following purification resins are utilized in this protocol: Ni-NTA agarose affinity resin (Qiagen #1018236), TALON affinity resin (Clontech #635504), and Sepharose 6B benzamidine beads (Amersham #51-5960-00-DD).
7. Thrombin protease was obtained from Novagen (#69671) while 3C protease was prepared in-house as a 3C-MBP-His fusion construct [31].

2.4.2 Equipment and Consumables

1. Column chromatography utilizes a Superdex 200 10/300 GL column (GE Healthcare #17-5175-01).

3 Methods

3.1 High-Throughput Cloning and Plasmid Construction

1. Construct LIC-compatible primers (see Note 16) for gene amplification and insertion into plasmids with LIC-compatible 5′ extensions [32]. Primers are constructed as follows: forward primer consists of 5′-CAAGGACCGAGCAGCCCCTCA-GOI-3′ (see Note 17) and reverse primer consists of 5′-ACCACGGGGAACCAACCCTCC-GOI-3′ (see Note 18), where "GOI" represents 15–21 base pairs of target gene sequence. Do not include the initial start codon in the forward primer. For the current cloning protocol, we will assume primers were ordered in 96-well format at 100 μM working concentrations.
2. Prepare a 96-well primer mix plate, using a multichannel pipette, as follows (Table 1):

Table 1 Primer plate mix

Component                          Volume
100 μM Forward primer              5 μl
100 μM Reverse primer              5 μl
MilliQ water                       90 μl
Total volume                       100 μl

3. Prepare a PCR master stock according to the table below (Table 2).

Table 2 PCR master stock

Component                          Volume
DNA template (e.g., genomic DNA)   50 μl
5× High Fidelity Buffer            1050 μl
dNTPs, 10 mM [stock]               105 μl
DNAse & RNAse free MilliQ water    3470 μl
Phusion polymerase                 50 μl
Total volume                       4725 μl

4. Ensure the PCR master stock is mixed well. Aliquot 45 μl of PCR master stock into each well of a prechilled 96-well PCR plate. Using a 20 μl multichannel pipette, add 5 μl of the primer mix to each well.
5. Run the PCR reaction according to the following protocol (Table 3):

Table 3 PCR protocol

Cycle step             Temp     Time      Number of cycles
Initial denaturation   98 °C    30 s      1
Denaturation           98 °C    10 s      35
Annealing              55 °C    15 s
Extension              72 °C    2 min
Final extension        72 °C    10 min    1
Hold                   4 °C     Hold

6. Add 10 μl of 6× Orange-G dye to each well followed by gentle mixing and, using a multichannel pipette, load 40 μl of PCR product onto a 1% (w/v) agarose gel (see Note 19). Run the gel at 100 V for 60–90 min for separation. Excise individual gel slices with a gel extraction tool (see Note 20) and dissolve each in 400 μl of QG buffer. Extract PCR product using the QIAquick 96 PCR purification kit and elute each sample in 80 μl of sterile MilliQ water (see Notes 21 and 22).

Fig. 2 83ν plasmid map. The 83ν plasmid is a chimeric shuttle vector constructed from a 2 μ backbone. The plasmid contains both histidine (HIS3) and ampicillin (AmpR) selection markers to facilitate plasmid retention in both bacterial and yeast expression systems. Production of the target gene is driven by the GAL1 promoter and terminates via the Cyc termination sequence. This is a LIC-compatible plasmid containing the forward, and reverse, LIC target sequences as noted in Subheading 3.1, step 1. The 83ν plasmid produces protein labeled with an N-terminal FLAG tag that is 3C protease cleavable and a C-terminal, thrombin cleavable, 10-histidine tag. Sequencing primers for insert sequence verification are noted.

7. Prepare blunt-end, SmaI digested, LIC vector (Fig. 2) by preparing the following reaction (Table 4) (see Note 23):

Table 4 LIC vector reaction

Component                          Volume
LIC vector mini/maxi-prep DNA      20 μl
10× NEB Buffer 4                   6 μl
SmaI (20 U/μl)                     4 μl
DNAse & RNAse free MilliQ water    30 μl
Total volume                       60 μl
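The primer layout described in Subheading 3.1, step 1 (a fixed LIC extension followed by 15–21 bases of gene-specific sequence) lends itself to scripting when a 96-well primer order is being prepared. The sketch below is a hedged illustration, not the authors' design tool. It rests on several assumptions that should be checked against Notes 16–18: the gene-specific part of the reverse primer is taken as the reverse complement of the 3′ end of the coding sequence, a terminal stop codon is dropped so that the C-terminal tag can be read through, and a fixed annealing length is used in place of any melting-temperature calculation. The function names are hypothetical.

```python
LIC_FWD = "CAAGGACCGAGCAGCCCCTCA"  # forward LIC extension from step 1
LIC_REV = "ACCACGGGGAACCAACCCTCC"  # reverse LIC extension from step 1

_COMPLEMENT = str.maketrans("ACGT", "TGCA")
_STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_lic_primers(orf, anneal_len=18):
    """Return (forward, reverse) primers, both written 5' to 3'.

    orf is the coding sequence of the gene of interest, 5' to 3'.
    """
    orf = orf.upper().replace(" ", "")
    if set(orf) - set("ACGT"):
        raise ValueError("sequence may contain only A, C, G, T")
    if not 15 <= anneal_len <= 21:
        raise ValueError("the protocol specifies 15-21 gene-specific bases")
    if orf.startswith("ATG"):
        orf = orf[3:]  # step 1: omit the initial start codon
    if orf[-3:] in _STOP_CODONS:
        orf = orf[:-3]  # ASSUMPTION: drop the stop codon for the C-terminal tag
    if len(orf) < anneal_len:
        raise ValueError("coding sequence shorter than annealing region")
    forward = LIC_FWD + orf[:anneal_len]
    # ASSUMPTION: reverse primer anneals to the 3' end of the sense strand
    reverse = LIC_REV + reverse_complement(orf[-anneal_len:])
    return forward, reverse


if __name__ == "__main__":
    # Made-up example sequence, for illustration only
    example_orf = "ATGGCTGCTAAAGGTGAAGAACTGTTCACCTAA"
    fwd, rev = design_lic_primers(example_orf)
    print("forward:", fwd)
    print("reverse:", rev)
```

In practice the annealing length would usually be adjusted per target within the 15–21 base window so that both primers anneal well at the 55 °C step in Table 3.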

8. Incubate the digestion reaction at 25 °C for 5 h or leave at room temperature overnight. Following digestion, run the entire reaction on a 1% (w/v) agarose gel (100 V for 45 min) and extract digested plasmid from the gel using the Qiagen Gel Extraction Kit (or comparable means). Elute DNA with 30 μl MilliQ water. Digested plasmid can be stored at −20 °C or utilized for immediate T4 polymerase treatment as outlined below.
9. DNA polymerase treatment of blunt-end digested LIC-compatible plasmid should be prepared as follows (Table 5):

Table 5 DNA polymerase treatment of blunt-end plasmid

Component                          Volume
Digested LIC plasmid DNA           5.0 μl
10× T4 DNA polymerase buffer       2.0 μl
dTTP (25 mM stock)                 2.0 μl
DTT (100 mM stock)                 1.0 μl
LIC qualified T4 DNA polymerase    0.4 μl
DNAse & RNAse free MilliQ water    9.6 μl
Total volume                       20 μl

10. To prepare polymerase treated, or "chewback," insert DNA: add 5.4 μl of insert master mix (Table 6) to each well of a 96-well, thin-walled, PCR plate. Using a 20 μl multichannel pipette, add 14.6 μl of gel purified insert DNA from Subheading 3.1, step 6 above to each well of the 96-well PCR plate. Set the PCR thermocycler to incubate reactions at 25 °C for 40 min followed by 20 min at 75 °C to heat inactivate the polymerase. Reactions can be stored at −20 °C or used immediately for LIC annealing reactions.

Table 6 Chewback master mix

Component                          Volume
10× T4 DNA polymerase buffer       210 μl
dATP (25 mM stock)                 210 μl
DTT (100 mM stock)                 105 μl
LIC qualified T4 DNA polymerase    42 μl
Total volume                       567 μl

11. Dilute 20 μl of T4 treated vector DNA with 300 μl of MilliQ water. Using a multichannel pipette, add 2.0 μl of diluted vector DNA into each well of a 96-well PCR plate followed by addition of 4.0 μl of gel purified, and T4 treated, insert DNA into each corresponding well (see Note 24). Incubate the reaction at RT for 15 min followed by addition of 2 μl EDTA (25 mM [stock]) and an additional 10 min RT incubation.
12. Aliquot 50 μl of competent E. coli DH5α cells into a prechilled 96-well deep block plate followed by addition of 5 μl of LIC reaction mixture into each well. Incubate the transformation reaction on ice for 20 min, heat shock cells at 42 °C for 45 s, incubate on ice for 2 min, and then add 200 μl SOC medium (prewarmed to 37 °C). Seal the plate with a gas permeable seal and incubate at 37 °C for 1 h and 220 rpm (see Notes 25 and 26).
13. For every GOI plasmid reaction, plate the entire 250 μl culture onto a RT LB-Amp (100 μg/ml) plate and incubate overnight at 37 °C to select for positive transformants (see Notes 27 and 28).
14. At this point in the protocol, one will need to screen colonies for inserts that contain the GOI. An efficient means to screen large numbers of colonies is to utilize colony PCR in 96-well plates. Aliquot 20 μl of colony resuspension buffer into each well of a 96-well plate. Use autoclaved, flathead toothpicks to pick a colony from each plate and dip it briefly into its assigned well (see Note 29).
15. Prepare a colony PCR master mix as outlined below (Table 7) for each 96-well plate screened, where each well contains a 20 μl reaction volume (see Note 30). Aliquot 10 μl of colony PCR master mix into each well of a prechilled 96-well thin wall PCR plate. Using a 20 μl multichannel pipette, add 10 μl of the resuspended colony to each corresponding well in the screening plate.

Table 7 Colony PCR master mix

Component                          Volume
10× Taq polymerase buffer          210 μl
dNTPs (10 mM [stock])              42 μl
100 μM GalF forward primer         10 μl
100 μM CycR reverse primer         10 μl
DNAse & RNAse free MilliQ water    736 μl
Taq polymerase (5 U/μl)            42 μl
Total volume                       1050 μl

16. Run 96-well colony PCR reaction using the following program (Table 8):

Table 8 96-well colony PCR reaction

Cycle step             Temp     Time      Cycles
Initial denaturation   94 °C    5 min     1
Denaturation           94 °C    15 s      30
Annealing              55 °C    15 s
Extension              68 °C    2 min
Final extension        68 °C    5 min     1
Hold                   4 °C     Hold

17. Add 4 μl of 6× Orange-G dye to each well followed by gentle mixing and, using a multichannel pipette, load all 24 μl of PCR product onto a 1% (w/v) DNA Plus agarose gel (USA Scientific) (see Note 19). Run the gel at 100 V for 60–90 min for separation.
18. For positive clones identified via colony PCR above, use 10 μl of resuspended colony to inoculate 1.5 ml of 2× LB media in a 2.0 ml deep-well block. Incubate at 37 °C and 220 rpm for 24 h.
19. Use the Qiagen QIAvac 96-well plasmid miniprep protocol to isolate plasmid DNA from each of the positive transformants identified above. Elute with 125 μl EB buffer into a 96-well PCR plate (see Note 31).
20. Individual inserts should be verified via DNA sequencing encompassing complete reads across the entire GOI. Cloned genes can be further validated using restriction mapping to confirm target gene insertion prior to DNA sequencing. Prepare a restriction mapping master mix as outlined below (Table 9). Aliquot 16.0 μl into each well of a thin-walled PCR plate. Using a multichannel pipette, add 4.0 μl of miniprepped plasmid DNA stock into each corresponding well and incubate at 37 °C for 30 min. Add 4 μl of 6× Orange-G dye to each well followed by gentle mixing and, using a multichannel pipette, load all 24 μl of digested sample onto a 1% (w/v) DNA Plus agarose gel (USA Scientific) (see Note 19). Run the gel at 100 V for 60–90 min for separation.

Table 9 Restriction mapping master mix

Component                          Volume
10× NEB Buffer 3                   210 μl
BamHI (20 U/μl)                    27 μl
XhoI (20 U/μl)                     27 μl
DNAse & RNAse free MilliQ water    1417.5 μl
Total volume                       1681.5 μl

21. Once expression plasmids have been constructed for target genes (following completion of Subheading 3.1, step 19 above), one should construct a backup plasmid repository for each construct. Prepare a 48 grid LB-Ampicillin Q-tray and dry the tray overnight at 37 °C. Approximately 220 ml of LB-Ampicillin is needed for each Q-tray. Thaw E. coli DH5α cells on ice for 30 min prior to transformation. Aliquot 20 μl into each well of a prechilled 96-well PCR plate followed by addition of 1 μl plasmid DNA into each well. Incubate on ice for 5 min, heat shock at 42 °C for 30 s, incubate on ice for 2 min, and then add 100 μl SOC medium (prewarmed to 37 °C). Plate the entire 120 μl from each plasmid construct onto individual wells of the 48 grid Q-tray. Use gentle swirling to spread the culture rather than plating beads or a cell spreader. Incubate the tray overnight at 37 °C without flipping the tray (see Note 32). Plasmid DNA isolation and insert validation can be completed using the methods described above in Subheading 3.1, steps 19 and 20.

3.2 Generation and Transformation of Competent S. cerevisiae Cells

1. Inoculate 10 ml of YPD broth and place in a shaking incubator at 30 °C, 220 rpm overnight.
2. Streak a YPD plate for colony isolation and incubate at 30 °C for 48 h.
3. Place 5 ml of YPD broth in 5 aerated culture tubes (you may increase/decrease this number as desired) and inoculate each tube with a single colony of yeast. Grow at 30 °C, 220 rpm, for 24 h.
4. Spin down growths at 3000 × g for 10 min, and discard the supernatant.
5. Resuspend cell pellets in 5 ml of YPD + 15% (v/v) glycerol broth. Aliquot 500 μl into each microfuge tube. Store at −80 °C or use immediately.


Table 10 Transformation master mix

Component                            Volume
PLATE solution                       14,000 μl
Boiled Salmon Sperm DNA (2 mg/ml)    500 μl
DMSO (100% stock)                    1000 μl
Total volume                         15,500 μl
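Table 10, like the master mixes in Tables 2, 6, 7, and 9, is written for a full 96-well plate. When fewer transformations are planned, the same proportions can be rescaled. The sketch below is one possible way to do this: the function names, the 10% overage default, and the rounding to 0.1 μl are assumptions made for illustration and are not specified in the protocol.

```python
import math

# Volumes from Table 10 (μl), written for one 96-well plate
TABLE_10_UL = {
    "PLATE solution": 14000.0,
    "Boiled salmon sperm DNA (2 mg/ml)": 500.0,
    "DMSO (100% stock)": 1000.0,
}
ALIQUOT_UL = 150.0  # volume dispensed per well in the high-throughput step


def aliquots_available(recipe_ul, aliquot_ul):
    """Number of whole aliquots that one batch of the recipe provides."""
    return math.floor(sum(recipe_ul.values()) / aliquot_ul)


def scale_master_mix(recipe_ul, aliquot_ul, n_wells, overage=0.10):
    """Rescale a master mix to n_wells, keeping component proportions.

    overage is the extra fraction prepared to cover pipetting losses
    (ASSUMPTION: 10% by default).
    """
    if n_wells < 1:
        raise ValueError("n_wells must be at least 1")
    needed_ul = aliquot_ul * n_wells * (1.0 + overage)
    factor = needed_ul / sum(recipe_ul.values())
    return {name: round(vol * factor, 1) for name, vol in recipe_ul.items()}


if __name__ == "__main__":
    print("aliquots per Table 10 batch:", aliquots_available(TABLE_10_UL, ALIQUOT_UL))
    for name, vol in scale_master_mix(TABLE_10_UL, ALIQUOT_UL, n_wells=24).items():
        print(f"{name}: {vol} μl")
```

Passing a different recipe dictionary and aliquot volume (e.g., the Table 2 volumes with 45 μl per well) applies the same rescaling to the PCR master stock.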

6. For single colony transformation: remove a vial of competent cells from the −80 °C freezer. Thaw and pellet the cells. Resuspend cells in 150 μl PLATE solution, then add 5 μl of sheared salmon sperm DNA (carrier DNA, 10 μg) plus ~0.1 μg of plasmid DNA containing the GOI. Add 10 μl of DMSO and vortex briefly followed by a 15-min incubation at RT. Heat shock for 10 min at 42 °C and pellet cells in a microfuge for 30 s. Remove the supernatant. Add 200 μl TE to the cell pellet and gently resuspend cells by aspirating up-and-down with a pipette tip. Plate two volumes (e.g., 50 and 150 μl) onto selective CSM-His plates and incubate at 30 °C for 2 days (see Note 33).
7. For high-throughput transformations: Pellet 5 ml of overnight yeast culture at 3000 × g for 10 min and discard supernatant. Prepare a master mix as follows (Table 10, enough for 96 transformations plus one blank). Mix the solution and add to pelleted cells, then vortex for 5 s to resuspend cells. Using a multichannel pipette, aliquot 150 μl into each well of a 96-well PCR plate (or 8-strip PCR tubes). Add 2 μl plasmid DNA from the 96-well miniprep into each tube, vortex, then incubate at 25 °C for 15 min followed by a 42 °C heat shock for 20 min. This can be performed in a PCR thermocycler suitable for 96-well plates or strips. Pellet cells in a benchtop centrifuge (equipped with a microtiter plate holder) at 2000 × g for 2 min and remove the supernatant using a multichannel pipette. Add 200 μl of TE to each well and gently resuspend cells by aspirating up-and-down with a pipette tip. Spread 200 μl of resuspended cells onto selective SC-HIS plates and incubate at 30 °C for three days (see Note 34).

3.3 Target Screening and Initial Expression Profiling

1. Inoculate 5 ml of SC-HIS media with a single colony from the above selective plate (or grid plate) (see Notes 33 and 34) and incubate for 24 h at 30 °C and at 220 rpm.
2. For each target protein, autoclave 280 ml of water in a 1 L baffled flask along with one 100 ml and one 250 ml graduated cylinder. Using the sterile 100 ml graduated cylinder, add 95 ml of 4× SC-HIS stock medium to each flask to generate 375 ml of total volume (1× SC-HIS) in each flask. Inoculate each flask with 5 ml of preculture and incubate for 24 h at 30 °C and at 220 rpm.
3. Following 24 h of initial outgrowth, the optical density in each flask should range between 15 and 20 for most cultures with a glucose concentration [...]

[...] 90% being the more typical outcome. Yeast cells will not sufficiently lyse using sonication or freeze fracture methods though will readily lyse using a microfluidizer (e.g., Avestin C3 or C5 Emulsiflex operating at >25,000 psi).
39. Total volume of membrane suspension from each 0.5 L total culture volume will be between 1.5 and 3 ml, which can be snap frozen in large or small aliquots, depending on need.
40. Ensure samples are well mixed at this stage and that the stir bar continues to agitate the solubilization solution for the duration of the 1 h window. Improper sample mixing at this stage will yield inaccurate solubilization results downstream. Ensure microfuge tubes used at this stage are sufficient to withstand 180,000 × g during the high-speed spin following solubilization. Solubilization of the membrane pellet with a defined ratio of buffer to membranes will facilitate prep-to-prep comparisons and analysis of solubilization efficiency.
41. Normalizing the loading volumes for the SDS-PAGE gel is essential for comparing expression levels between various constructs or comparing various growth conditions during optimization.
42. An alternative approach for expression screening of large numbers of proteins or protein constructs is to utilize fluorescence tags (e.g., GFP) and assess whole cell expression and initial purification by following fluorescence readouts at each stage. This has been described in detail elsewhere [2] and requires a fluorescence detector interfaced with a liquid chromatography system.
43. One could strip the PVDF membrane following incubation with one antibody and reprobe with a second antibody against the same protein. This approach leads to increased signal variability for the second antibody when studying integral membrane proteins before, and after, detergent solubilization. Thus, for the current protocol, separate PVDF membranes will be generated for each antibody being tested. Membranes can also be blocked overnight at 4 °C if needed.
44. To increase efficiency, autoclave all flasks and additional components at this step.
45. After 24 h of growth, the glucose concentration should be less than 0.1% (w/v). This yeast strain will turn pink due to the ade2-1 mutation. If the glucose concentration is too high, then protein expression may be inhibited (glucose represses the GAL1 promoter).
46. For vessels that are not autoclaved in place (i.e., the entire vessel is autoclaved), ensure at least one line is left open to vent during the autoclaving process. For a BioFlo 115 vessel containing a 10 L working volume, a 1 h autoclave cycle is sufficient to obtain sterility. One can determine the best autoclave time for their specific instrument by doing a "dummy" growth to check for adequate sterility of culture media following autoclaving.
47. The DO probe should be calibrated at 100% in the exact growth conditions that will be used (e.g., airflow, agitation rate, temperature) as these conditions will influence the DO content.
48. Addition of 2.5 L of 4× YPG to the fermentation vessel will decrease DO content as it has not been pre-equilibrated in order to maintain sterility of the rich media. Increasing airflow, and agitation if needed, following addition of the inductant will ensure the culture is not exposed to anoxic growth conditions.
49. All manipulations in this section are performed at 4 °C unless otherwise noted. For larger growth cultures, a general approach for preparing the solubilization mixtures would be as outlined in the table below:
50. IMAC resins are generally available as 50% (w/v) mixtures in an ethanol solution, so it is important to spin the IMAC slurry in a benchtop centrifuge, remove supernatant, and resuspend resin in IMAC buffer prior to adding to lysate. For larger culture volumes of POIs with expression levels in the 10–20 mg/L of culture range, one can start with around 1 ml of IMAC resin for every 3 L of culture growth. POI binding interactions with IMAC resin can be protein-dependent and often fall below manufacturer target ranges for polytopic membrane proteins in lipid-detergent micelles.
51. Following IMAC, and reverse IMAC, one should save a small portion of the resin to run on the final SDS-PAGE gel. A small portion of the resin can be washed in SDS-PAGE sample buffer, spun briefly, and then loaded on the gel to assess the amount of protein that may be retained on the resin and not eluted in high concentrations of imidazole. In our experience, nonspecific and hydrophobic binding of detergent solubilized integral membrane proteins to metal-affinity resins is not uncommon and should be accounted for during purification.
52. Close attention should be paid to protein yield at each step in this process to ensure that material is not lost to precipitation or, if it is, that such a loss is noticed so that adjustments can be made in follow-on purifications. The POI may not be stable in the initial SEC buffer. If heavy protein loss is observed, then one should screen buffer composition (pH, [salt], etc.), detergent identity (e.g., consider shorter or longer chain detergents), inclusion of lipids, addition of stabilizing osmolytes (e.g., glycerol or sucrose), or inclusion/exclusion of reducing agents or metal chelators.
53. Centrifugal spin concentrators can also be used at this step though one should ensure the retentate fraction is well mixed during each spin. This involves stopping each run approximately four times per load to mix the retentate with a pipette. Regardless of method, ensure a proper molecular weight cutoff (MWCO) is used relative to the POI being purified while noting that larger MWCO membranes will allow for faster concentration steps. Pay careful attention to protein aggregation and concentration during this step. For initial screening, it is sufficient to concentrate POIs to 1–2 OD/ml and no further.
54. An autosampler is not required for this protocol. One can manually inject each sample.
55. When using an autoinjection/autosampler method to conduct automated SEC runs in sequence, it is imperative that the operator be available to monitor the duration of the first run and the successful transition to the second run. This will help ensure the sequence runs appropriately.

Acknowledgments

The authors would like to acknowledge Jennifer Washburn, Zygy Roe-Žurž, Hannah Schmitz, and Dr. Robert M. Stroud for their support in previous development and implementation of methods outlined in this chapter. The Hays lab is supported by the National Institute of General Medical Sciences of the National Institutes of Health under grant number R01GM118599.

Chapter 12

High-Throughput E. coli Cell-Free Expression: From PCR Product Design to Functional Validation of GPCR

Sandra Cortès, Fatima-Ezzahra Hibti, Frydman Chiraz, and Safia Ezzine

Abstract

This chapter outlines a protocol for expressing GPCR libraries for target screening. High-throughput screening of GPCR expression has attracted considerable interest for the development of proteomic drug candidates, protein engineering, and microarrays. However, GPCRs form a large family of difficult-to-express proteins, which can nevertheless be produced successfully by cell-free systems in the presence of liposomes. The open and flexible nature of this in vitro expression system allows transcription and translation to be manipulated and the cell-free reaction environment to be modulated, for example by adding adjuvants or incorporating unnatural amino acids. The compatibility of PCR fragments with cell-free protein synthesis, combined with SPRi as a multiplex analytical platform, offers an effective method for rapidly selecting different targets. Large-scale expression and purification of GPCRs in proteoliposome format are discussed at the end of this chapter.

Key words Cell-free protein synthesis, High-throughput screening, GPCRs, Proteoliposomes, SPRi

Renaud Vincentelli (ed.), High-Throughput Protein Production and Purification: Methods and Protocols, Methods in Molecular Biology, vol. 2025, https://doi.org/10.1007/978-1-4939-9624-7_12, © Springer Science+Business Media, LLC, part of Springer Nature 2019

1 Introduction

There is an increasing demand for membrane protein (MP) production in industry and academia for a variety of applications. MPs are often used as antigens for antibody or vaccine development, as well as for target validation and high-throughput screening in the drug discovery process. Indeed, more than 60% of current drug targets are membrane proteins [1]. G-protein-coupled receptors (GPCRs) are among the most popular drug targets today: almost one-third of the approved drugs currently available rely on some kind of interaction with these receptors. However, production of these receptors by conventional cellular expression systems is a major challenge because of their highly hydrophobic content [2, 3]. Membrane proteins are only functional in their natural environment; they must therefore be embedded into a lipid bilayer mimicking the cell membrane.

One of the main advantages of cell-free protein synthesis (CFPS) is the openness of the system, which allows the addition of potential cofactors, chaperones, or ligands that could improve the expression of the target protein [4, 5]. For example, with CFPS it is easy to add liposomes at the beginning of the reaction to produce proteoliposomes in which the protein retains appropriate folding [6]. The combination of cell-free protein synthesis and liposome technology represents an original approach to producing active recombinant membrane proteins directly integrated into a defined lipid bilayer [7]. The resulting proteoliposome provides an ideal model system mimicking the natural environment of the protein.

Furthermore, CFPS is well suited to high-throughput (HTP) techniques, providing an efficient means of producing a large range of membrane proteins in proteoliposome format. Indeed, the proteins can be rapidly produced from a linear PCR product in microscale reactions [8, 9]. PCR products have the advantage of simplicity because they avoid the time-consuming cloning steps required when expressing from plasmids. In addition, liposomes can be customized to allow immobilization of proteoliposomes on a sensor chip surface: liposomes containing biotinylated lipids can be used to produce GPCRs in biotinylated proteoliposome format, which bind to an avidin surface.

Sensor-based screening is emerging as a powerful strategy to uncover novel GPCR ligands. Label-free biomolecular interactions between GPCRs and their partners can be analyzed in a multiplex format with the XelPleX system. This instrument enhances the throughput of conventional SPR and allows up to several hundred interactions to be monitored simultaneously. Here, we describe a high-throughput expression protocol allowing simultaneous multigene expression in CFPS and characterization using SPRi technology.

2 Materials

2.1 Molecular Biology

Reagents

1. Oligonucleotides are widely available from many companies at low cost and at different levels of purification. For gene synthesis, we recommend oligonucleotides synthesized at the smallest scale (5 μM), with desalting purification, dissolved in 10 mM Tris–HCl buffer, pH 8.0, and stored at −20 °C.
2. Enzymes: 0.02 units/μL Phusion DNA polymerase.
3. 200 μM of each dNTP.
4. DNA ladder.
5. TAE-agarose gels: routine-grade agarose dissolved in 1× TAE buffer, plus GreenSafe Premium (NZYTech, Ltd), cast in a self-contained system for routine agarose gel electrophoresis. Use 1× Tris-Acetate-EDTA (TAE) buffer, pH 8.3.

Equipment and Consumables

1. PCR machine suitable for 96-well plates (such as T100 Thermal cycler, Bio-Rad).

2.2 Liposome Preparation

Reagents

1. Lipids (e.g., Avanti Polar Lipids).
2. Chloroform.
3. HEPES buffer, pH 7.5.
4. DEPC-treated H2O (see Note 1).

Equipment and Consumables

1. Probe sonicator (Branson Digital Sonifier 250).
2. 0.22-μm filter.

2.3 Cell-Free Reaction

Reagents

1. 50× Complete™ Protease Inhibitor Cocktail (Roche Diagnostics).
2. Amino acid mixture: 2 mM of each of the 20 natural amino acids (Sigma-Aldrich).
3. Nucleoside monophosphate (NMP) mixture: 170 mM of AMP, 170 mM of GMP, 170 mM of CMP, 170 mM of UMP (Sigma-Aldrich).
4. 3-Phosphoglyceric acid (3PG), 1.5 M (Sigma-Aldrich).
5. tRNA from E. coli MRE600, 25 mg/mL (Roche Diagnostics).
6. Folinic acid salt, 10 mg/mL (Sigma-Aldrich).
7. Potassium glutamate, 3.2 M (Sigma-Aldrich).
8. Magnesium glutamate, 400 mM (Sigma-Aldrich).
9. Hepes, 1.2 M (Sigma-Aldrich).
10. Nicotinamide adenine dinucleotide (NAD), 33 mM (Sigma-Aldrich).
11. Sodium oxalate, 210 mM (Sigma-Aldrich).
12. Spermidine, 300 mM (Sigma-Aldrich).
13. E. coli S30 extract, stored frozen at −80 °C.
14. Template DNA (plasmid DNA or PCR products), 500–1000 ng/μL.

2.4 Analytics

2.4.1 Dot Blot

Reagents

1. Tris 25 mM, pH 7.5.
2. Tris-buffered saline with Tween-20 (TBS-T): add 3.14 g Tris–HCl, 8 g NaCl, and 1 mL Tween-20 to enough dH2O for a total volume of 900 mL, and adjust to pH 7.5 with HCl and NaOH. Add more dH2O for a final volume of 2 L.
3. Blocking buffer: 5% (w/v) milk in TBS-T.
4. Antibody: poly-histidine antibody conjugated with horseradish peroxidase (Sigma-Aldrich, USA).
5. Detection reagent (ECL).

Equipment and Consumables

1. Whatman paper.
2. Nitrocellulose membrane.
3. Vacuum pump.
4. Microfiltration apparatus.
5. Membrane holder box.
6. 96-well plates.

2.4.2 Western Blot

Reagents

1. Mini-PROTEAN System (Bio-Rad).
2. ChemiDoc XRS+ (Bio-Rad).
3. Trans-Blot Turbo™ Transfer System.
4. Bio-Rad PowerPac Basic or Bio-Rad PowerPac HC.
5. Nitrocellulose membrane.
6. Whatman paper.
7. Precision Plus Protein Dual Color Standards (Bio-Rad).
8. 0.5 M Tris–HCl, pH 6.8. This buffer is used when making the stacking gel.
9. 1.5 M Tris–HCl, pH 8.8. This buffer is used when making the resolving gel.
10. 10% and 20% (w/v) sodium dodecyl sulfate (SDS).
11. 10% (w/v) ammonium persulfate (APS).
12. 30% 29:1 acrylamide/bis-acrylamide.
13. Tetramethylethylenediamine (TEMED).
14. Tris base.
15. Glycine.
16. Resolving gel (amounts for 2 gels): 8.6 mL dH2O, 6 mL 30% 29:1 acrylamide/bis-acrylamide, 5 mL Tris–HCl pH 8.8, 0.2 mL SDS, 0.2 mL APS, and 10 μL TEMED.
17. Stacking gel (amounts for 2 gels): 6 mL dH2O, 1.25 mL 30% 29:1 acrylamide/bis-acrylamide, 2.5 mL Tris–HCl pH 6.8, 0.1 mL SDS, 0.1 mL APS, and 10 μL TEMED.
18. Running buffer: add 30.3 g Tris base, 144 g glycine, and 10 g SDS to enough dH2O for a total volume of 900 mL, and adjust to pH 8.3 with HCl and NaOH. Add dH2O for a final volume of 1 L. This buffer is diluted to 1× with dH2O to make gel running buffer.
19. Transfer buffer: add 5.82 g Tris base, 2.93 g glycine, 1.875 mL of 20% SDS, and 200 mL methanol to enough dH2O for a final volume of 1 L.
20. Tris-buffered saline with Tween-20 (TBS-T).
21. Blocking buffer: 5% (w/v) milk in TBS-T.
22. Antibody: poly-histidine antibody conjugated with horseradish peroxidase (Sigma-Aldrich, USA).
23. Detection reagent (ECL).

Equipment and Consumables

1. Molecular Imager Chemidoc™ XRS+ with ImageLab™ software (Bio-Rad, Marnes-la-Coquette, France).

2.5 SPRi

Reagents

1. Proteoliposomes as ligand.
2. Analytes dissolved in DMSO.
3. Bovine serum albumin (BSA) fraction V.
4. Phosphate-buffered saline.

Equipment and Consumables

1. CSe SPRi-Biochip (Horiba Scientific, catalog number: 1123299207).
2. SPRi-Continuous Flow Microspotter (Horiba Scientific, catalog number: 1147200048).
3. XelPleX, multiplex label-free molecular interaction platform (Horiba Scientific, catalog number: 1300010053).
4. EzView and EzAnalysis software (Horiba Scientific).
5. EzFit (based on Scrubber).

2.6 Purification with Sucrose Gradient

Reagents

1. 13.2 mL ultracentrifuge tube (14 × 89 mm).
2. Sucrose.
3. Ultracentrifuge.

3 Methods

Designing a CFPS platform for high-throughput expression and screening of GPCRs requires several steps. First, a molecular biology step is performed to design the PCR fragments. Second, the cell extract is prepared from an E. coli strain. This protocol is optimized for the E. coli BL21 codon-plus RIL strain as the source of the cell extract and should be modified if other strains are used. To validate the activity of the prepared S30 extract, the optimum magnesium and potassium concentrations are determined. Third, the cell-free reaction is carried out at microscale to produce proteoliposome-embedded GPCRs; for this purpose, the reaction mix is supplemented with liposomes of optimized composition. In the same way, the system can be optimized further by adjusting specific parameters in order to increase the expression yield and proper folding of the expressed proteins. Then, after production of GPCRs in biotinylated proteoliposome format, the proteins are captured on the SPRi sensor surface through avidin/biotin binding. Multiple GPCRs may be immobilized on the surface in microarray format, and interactions with specific ligands are monitored by SPRi. Finally, GPCRs of interest are expressed in larger volume and purified on a sucrose gradient (Fig. 1).

[Figure 1: workflow diagram. Panel labels: PCR fragments cloning; Preparation of cell extract; Optimization of Mg2+ and K+ concentrations; Small-scale expression; Identification of optimal expression condition; High-throughput functional assay by SPRi.]

Fig. 1 General overview of the process of cell-free expression in high-throughput multiplex format

3.1 Molecular Biology

Conventional expression in CFPS generally requires a plasmid template as a starting point. The expression plasmid used for the cell-free reaction needs to contain all the regulatory elements required for transcription and translation, as well as for purification and labeling purposes. These regulatory elements include, among others, the T7 promoter, the origin of replication, the RBS, the T7 terminator, and possibly a purification tag. As an example of a vector for CFPS, the pIVEX plasmid is optimized for expression in the Rapid Translation System (RTS) cell-free system under the control of bacteriophage T7 transcription elements. The gene of interest must contain an ATG initiation codon and a stop codon. The sequence upstream of the T7 promoter must contain a minimum of 6–10 nucleotides (nt) for efficient promoter binding (the sequence does not need to be specific). The sequence following the T7 promoter must contain a minimum of 15–20 nt that can form a potential stem-and-loop structure. A sequence of 7–9 nt must be present between the RBS and the ATG initiation codon for optimal translation efficiency of the protein of interest (the sequence does not need to be specific). Finally, a T7 terminator must be located 4–100 nt downstream of the gene of interest for efficient transcription termination and messenger RNA stability.

However, cloning and purification of the plasmid template is time consuming and can be avoided by using a linear DNA template, which is obtained after a few PCR amplifications. Linear DNA templates are, however, subject to nuclease degradation in the cell-free reaction environment. To overcome this issue, different solutions can be considered to decrease nuclease activity: (1) expression at lower temperatures [10], (2) disruption of exonuclease genes on the chromosome of E. coli [11], or (3) use of molecular inhibitors that target native exonucleases [12].

The PCR expression sequences (E-PCR) required for the transcription and translation of the gene of interest (GOI) need to be generated. To do so, two different strategies can be used:

• Overlap extension PCR (OE-PCR): DNA cassettes containing the 3′ and 5′ UTR regulatory elements are generated from a standard cell-free vector such as pIVEX. This method is recommended whenever the starting material is an optimized vector such as pIVEX.

• Nested PCR: in this case, overlapping mega-primers are designed to add the regulatory elements to the coding sequence of the protein of interest. This method is recommended whenever the starting material is a linear DNA.
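The requirements placed on the expression template earlier in this subheading can be written down as a simple checklist. The following Python sketch is illustrative only: the class and function names are hypothetical, and it assumes that the lower bound of each stated range (6 nt upstream of the T7 promoter, 15 nt following it) is the minimum acceptable length, which is one possible reading of the text.

```python
from dataclasses import dataclass
from typing import List

STOP_CODONS = {"TAA", "TAG", "TGA"}


@dataclass
class TemplateLayout:
    """Hypothetical description of a linear or plasmid expression template."""
    upstream_of_promoter_nt: int   # nt located 5' of the T7 promoter
    after_promoter_nt: int         # nt following the T7 promoter (stem-loop region)
    rbs_to_atg_nt: int             # spacer between the RBS and the ATG codon
    stop_to_terminator_nt: int     # distance from the gene to the T7 terminator
    cds: str                       # coding sequence of the gene of interest, 5'->3'


def check_layout(t: TemplateLayout) -> List[str]:
    """Return a list of violated requirements (empty list = layout accepted)."""
    problems = []
    cds = t.cds.upper()
    if not cds.startswith("ATG"):
        problems.append("gene does not start with an ATG initiation codon")
    if len(cds) % 3 != 0 or cds[-3:] not in STOP_CODONS:
        problems.append("gene does not end with an in-frame stop codon")
    if t.upstream_of_promoter_nt < 6:       # assumed minimum (text: 6-10 nt)
        problems.append("fewer than 6 nt upstream of the T7 promoter")
    if t.after_promoter_nt < 15:            # assumed minimum (text: 15-20 nt)
        problems.append("fewer than 15 nt following the T7 promoter")
    if not 7 <= t.rbs_to_atg_nt <= 9:
        problems.append("RBS-ATG spacer is outside 7-9 nt")
    if not 4 <= t.stop_to_terminator_nt <= 100:
        problems.append("T7 terminator is not 4-100 nt downstream of the gene")
    return problems


if __name__ == "__main__":
    layout = TemplateLayout(8, 18, 8, 20, "ATGGCTTAA")
    print(check_layout(layout) or "layout accepted")
```

Such a check only covers element spacing; it does not predict expression yield or secondary structure.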

[Figure 2: schematic of overlap extension PCR. Panel labels: Addition of overlapping regions (1st PCR), with a primer specific to the GOI; Overlap extension PCR: splicing of regulatory elements (2nd PCR); pIVEX; Primer 1/7P; Primer 2/7T; T7 promoter; 5′ UTR regulatory elements; tag; 3′ UTR regulatory elements; T7 terminator.]

Fig. 2 Overlap extension PCR steps: the first PCR amplifies the specific gene with mega-primers. The second PCR (overlap extension PCR) is performed to add the regulatory elements in the 3′ and 5′ UTRs. Primer 1/7P is the primer of the T7 promoter region. Primer 2/7T is the primer of the T7 terminator region

3.1.1 PCR Products OE-PCR

Linear DNA is generated by two PCRs. The first PCR amplifies the specific gene with mega-primers. At the same time, the overlapping regions (DNA fragments coding for the regulatory elements necessary for the expression and purification of proteins) are added. At the end of the first PCR, the PCR1 product is purified by gel extraction. The second PCR (overlap extension PCR) is performed with the cassettes, which are aligned and hybridized with the PCR1 product in the presence of the DNA polymerase. The final (full-length) product is purified on gel and amplified with external primers (Fig. 2).

3.1.2 PCR Product Design Nested PCR

The linear DNA is generated by two successive PCRs (PCR1 then PCR2), with two pairs of mega-primers for each PCR (Fig. 3). The band of interest is purified by gel extraction after PCR1. The regulatory elements are added at each round to the sequence of interest in the 3′ and 5′ UTR regions.
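The way tailed mega-primers build up the flanking elements over two successive PCRs can be illustrated with a toy model. The Python sketch below treats the template as an ordered list of named parts rather than real sequences. The function name is hypothetical, the way the elements are divided between PCR1 and PCR2 is an assumption made for illustration, and the purification tag is not modeled.

```python
from typing import List


def pcr_round(template: List[str], fwd_tail: List[str], rev_tail: List[str]) -> List[str]:
    """One PCR round with tailed mega-primers: the primer tails become the
    new 5' and 3' flanks of the product (parts listed in sense-strand order)."""
    return fwd_tail + template + rev_tail


# Starting material: the gene of interest with its own ATG and stop codon.
gene = ["GOI (ATG ... stop)"]

# PCR1 (assumed split): inner elements closest to the coding sequence.
pcr1 = pcr_round(gene, ["RBS", "spacer 7-9 nt"], ["spacer 4-100 nt"])

# PCR2 (assumed split): outer transcription elements.
pcr2 = pcr_round(
    pcr1,
    ["pad >= 6 nt", "T7 promoter", "leader >= 15 nt"],
    ["T7 terminator"],
)

if __name__ == "__main__":
    print(" | ".join(pcr2))
```

Representing parts by name keeps the example independent of any particular primer design; real primers would additionally need annealing regions matched to the template.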

[Figure 3: schematic of nested PCR. Panel labels: Gene of interest; PCR 1 with Primers A, B, A′, and B′; PCR 2 with Primers C, D, C′, and D′; T7 promoter; RBS; tag; T7 terminator.]

Fig. 3 Nested PCR steps: the linear DNA is generated by two successive PCRs (PCR1 then PCR2), with two pairs of mega-primers for each PCR. The regulatory elements are added during PCR2

3.2 Liposome Preparation

The lipid composition of the liposomes has a strong influence on the expression of the protein and its integration into the lipid vesicles. The structural complexity of the membrane protein (number and size of the transmembrane domains) has to be taken into consideration when choosing the lipid composition. Liposomes are obtained by evaporation of chloroform and hydration of the resulting thin lipid film. Their size is then reduced by sonication in order to obtain small unilamellar vesicles (SUVs) with diameters in the range of 20–50 nm. The following composition has been adapted to enable efficient cell-free protein synthesis. To obtain proteoliposomes, synthetic liposomes are added directly to the cell-free reaction mixture.

1. Dissolve lipid powder in chloroform to a final lipid concentration of 10 mg/mL. Store at −20 °C.
2. Mix the different lipids: 1,2-dioleoyl-sn-glycero-3-phosphocholine:1,2-dioleoyl-sn-glycero-3-phosphoethanolamine:1,2-distearoyl-sn-glycero-3-phosphoethanolamine-N-[biotinyl(polyethylene glycol)-2000]; 40:18:20:20:2 volume ratio.
3. Evaporate overnight to dryness using a Univapo 150H or nitrogen vacuum.
4. Resuspend the lipid film in Hepes buffer, pH 7.5, or in diethyl pyrocarbonate-treated water to obtain a 30 mg/mL lipid suspension.
5. Sonicate using a tip sonicator (Branson Digital Sonifier 250) at 20%, 5 times for 30 s, with a 1 min break on ice between sonications.
6. Filter once through a 0.22 μm PES filter. Store for up to a week at 4 °C.
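The volumes involved in steps 2 and 4 follow directly from the stated ratio and concentrations. The Python sketch below is a minimal illustration: the function names are hypothetical, it assumes that every lipid stock is at 10 mg/mL (as in step 1) so that the volume ratio equals the mass ratio, and it indexes the five ratio components by position rather than by name.

```python
RATIO = (40, 18, 20, 20, 2)     # volume ratio given in step 2
STOCK_MG_PER_ML = 10.0          # lipid stocks in chloroform (step 1)
FINAL_MG_PER_ML = 30.0          # target liposome concentration (step 4)
LIPOSOMES_PER_REACTION_UL = 8.33  # liposome volume per 50 uL reaction (Table 1)


def lipid_mix(total_stock_ml: float):
    """Split a total stock volume according to RATIO and return
    (per-component volumes in mL, total lipid in mg, hydration buffer in mL)."""
    parts = sum(RATIO)
    volumes = [total_stock_ml * r / parts for r in RATIO]
    lipid_mg = total_stock_ml * STOCK_MG_PER_ML
    buffer_ml = lipid_mg / FINAL_MG_PER_ML
    return volumes, lipid_mg, buffer_ml


def reactions_supported(liposome_volume_ul: float) -> int:
    """Number of 50 uL reactions that a liposome preparation can supply."""
    return int(liposome_volume_ul // LIPOSOMES_PER_REACTION_UL)


if __name__ == "__main__":
    volumes, lipid_mg, buffer_ml = lipid_mix(3.0)
    print([round(v, 3) for v in volumes], lipid_mg, round(buffer_ml, 3))
    print(reactions_supported(buffer_ml * 1000))
```

Losses during evaporation, sonication, and filtration are ignored here, so the number of reactions is an upper estimate.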

3.3 Cell-Free Reaction

3.3.1 Cell Extracts Preparation

Cell extracts are prepared by cultivating the E. coli strain in a 10 L batch fermenter with temperature control, agitation, and good aeration. Briefly, the cells are grown, harvested in mid-log phase, and cooled to 10–14 °C before centrifugation and lysis. Cell extract preparation is performed according to Kai et al. [13] with minor modifications: the E. coli strain carries the T7 RNA polymerase gene, which is activated (within 30 min) after induction with 1 mM IPTG. The cells are broken before the OD at 600 nm reaches 7.

3.3.2 Validation of the Prepared Extract

The cell-free reaction is performed using an E. coli extract and energy mix provided by Synthelis SAS. We used a batch method for protein production by the cell-free system in a 50 μL volume. Each new batch of extract should be adjusted to its optimal concentration of Mg2+ (2–14 mM) and K+ (290–410 mM) ions. The validation is performed in 96-well flat-bottomed plates in a total reaction volume of 50 μL (see Note 2).

1. Thaw on ice 400 mM Mg-glutamate (4 °C), 3.2 M K-glutamate (4 °C), E. coli extract (−80 °C), and the energy solution prepared according to Table 1 (see Note 3).
2. Prepare six 5.2 μL Mg-glutamate aliquots, testing a range of 2–14 mM additional Mg-glutamate, by aliquoting set amounts of stock Mg-glutamate into 0.5 mL tubes.

Table 1 Reagents to prepare for a 50 μL reaction mix

Reagents                      Initial concentration   Volume added/50 μL   Storage (°C)
Hepes                         1.2 M                   1.42 μL              4
Folinic acid                  10 mg/mL                0.5 μL               −20
NAD                           33 mM                   0.5 μL               −20
AMP, GMP, UMP, CMP            170 mM each             0.5 μL each          −20
Spermidine                    300 mM                  0.22 μL              −20
3-PGA                         1.5 M                   1 μL                 −20
Sodium oxalate                210 mM                  0.5 μL               −20
E. coli tRNA                  25 mg/mL                1 μL                 −20
Amino acids                   2 mM each               8.33 μL              −20
Protease Inhibitor Cocktail   Stock solution (50×)    1 μL                 −20
Liposomes                     30 mg/mL                8.33 μL              −20

All buffers and stock solutions should be prepared with diethylpyrocarbonate (DEPC)-treated H2O (DNase/RNase-free)
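The volumes quoted in the validation steps of this subheading are mutually consistent, which can be verified with a few lines of arithmetic. The Python sketch below is illustrative: the variable names are hypothetical, and it assumes that the six Mg-glutamate and four K-glutamate aliquots are combined as a 6 × 4 grid of 24 wells. The grid layout is inferred from the stated volumes and is not spelled out in the protocol.

```python
N_MG, N_K = 6, 4              # assumed grid: 6 Mg conditions x 4 K conditions
MG_PER_WELL_UL = 1.3          # Mg-glutamate dispensed per well
K_PER_WELL_UL = 5.5           # K-glutamate dispensed per well
MIX_PER_WELL_UL = 43.2        # reaction mixture added per well
EXTRACT_UL, ENERGY_UL, DNA_UL = 420.0, 604.8, 12.0   # reaction mixture components
DNA_UG_PER_UL = 1.5           # template DNA concentration

wells = N_MG * N_K                                       # 24 wells
well_total = MG_PER_WELL_UL + K_PER_WELL_UL + MIX_PER_WELL_UL   # volume per well
mg_aliquot = MG_PER_WELL_UL * N_K      # each Mg aliquot serves N_K wells
k_aliquot = K_PER_WELL_UL * N_MG       # each K aliquot serves N_MG wells
mix_needed = MIX_PER_WELL_UL * wells   # reaction mixture required for the grid
mix_prepared = EXTRACT_UL + ENERGY_UL + DNA_UL

# Per-well contribution of each reaction mixture component.
extract_per_well = EXTRACT_UL / wells
energy_per_well = ENERGY_UL / wells
dna_ug_per_well = DNA_UL / wells * DNA_UG_PER_UL

if __name__ == "__main__":
    print(f"wells: {wells}, volume per well: {well_total:.1f} uL")
    print(f"Mg aliquot: {mg_aliquot:.1f} uL, K aliquot: {k_aliquot:.1f} uL")
    print(f"mix needed: {mix_needed:.1f} uL, mix prepared: {mix_prepared:.1f} uL")
    print(f"per well: {extract_per_well:.1f} uL extract, "
          f"{energy_per_well:.1f} uL energy mix, {dna_ug_per_well:.2f} ug DNA")
```

No pipetting excess is included, so in practice a small surplus of each solution may be needed.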


3. Prepare four 33 μL K-glutamate aliquots, testing a range of 290–410 mM additional K-glutamate, by aliquoting set amounts of stock K-glutamate into 0.5 mL tubes.
4. Prepare 1036.8 μL of reaction mixture by combining 420 μL of E. coli extract, 604.8 μL of energy mix, and 12 μL of template DNA at 1.5 μg/μL (see Note 4).
5. Dispense 1.3 μL of Mg-glutamate into each well.
6. Dispense 5.5 μL of K-glutamate into each well.
7. Add 43.2 μL of reaction mixture to the wells containing Mg-glutamate and K-glutamate.
8. Place a lid on the plate and wrap with Saran film to avoid evaporation.
9. Run the reaction at 30 °C in an incubator for 16 h with shaking (see Note 5).
10. Determine the optimum Mg-glutamate and K-glutamate concentrations from the expression level using a dot blot technique.
11. Use the values determined in this calibration to define the composition of the optimum reaction mix.

3.3.3 High-Throughput Cell-Free Expression

The cell-free expression system is well suited to high-throughput (HTP) techniques, providing an efficient means of producing a large range of membrane proteins for use in screening and other research applications. The main parameter for optimal membrane protein expression is the liposome composition: proper folding and functionality of membrane proteins are improved by testing a large set of different liposomes in order to find the best composition. The following protocol can be used to screen a large number of genes with a single liposome composition or with several. It is described here for 96 membrane proteins with one liposome composition.

1. Thaw all reagents on ice.
2. Prepare 3952 μL of reaction mix without liposomes by combining 1581.92 μL of energy solution, 1680 μL of E. coli extract, the volumes of Mg-glutamate and K-glutamate giving their optimum concentrations, and DEPC-treated water to a volume of 4000 μL.
3. Dispense 41.66 μL of reaction mix into each well of the 96-well plates.
4. Dispense 0.5 μL of template DNA into each well (see Note 6).
5. Add 8.33 μL of liposomes to each reaction mix/DNA sample. In this one-step reaction, proteins are directly expressed and integrated into the lipid bilayer of the liposomes.
6. Place a lid on the plate and wrap with Saran film.
7. Run the reaction at a temperature between 20 °C and 30 °C in an incubator for 6–16 h with shaking.
8. After expression of the membrane proteins, separate the proteoliposomes from precipitated membrane proteins by centrifuging the 96-well plates at 15,000 × g for 10 min at 4 °C.
9. Collect the supernatant, which contains the proteoliposome fractions, and discard the pellet. The membrane proteins in liposomes are analyzed immediately by dot blot and SDS-PAGE or stored at −20 °C until ready to run the gel. The negative control is the reaction without template DNA; it serves as the background.

3.3.4 Small-Scale Protein Production

Scaled-up production of proteoliposomes can be performed in 1.5 mL Eppendorf tubes with a 1 mL cell-free expression reaction volume using the optimized conditions.

1. Prepare all reagents according to the recommendations.
2. Mix all reagents.
3. Run the reaction at 30 °C in an incubator for 16 h with shaking.
4. After expression of the membrane proteins, purify the proteoliposomes on a sucrose gradient.
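When moving from the 50 μL format to a 1 mL reaction, the reagent volumes of Table 1 can be scaled proportionally. The Python sketch below is a minimal illustration under the assumption that linear scaling is appropriate; the function and dictionary names are hypothetical. E. coli extract, Mg-glutamate, K-glutamate, and template DNA are left out because their amounts depend on the calibration of each extract batch.

```python
# Volumes (uL) per 50 uL reaction, taken from Table 1.
TABLE_1_UL = {
    "Hepes": 1.42,
    "Folinic acid": 0.5,
    "NAD": 0.5,
    "AMP": 0.5,
    "GMP": 0.5,
    "UMP": 0.5,
    "CMP": 0.5,
    "Spermidine": 0.22,
    "3-PGA": 1.0,
    "Sodium oxalate": 0.5,
    "E. coli tRNA": 1.0,
    "Amino acids": 8.33,
    "Protease Inhibitor Cocktail": 1.0,
    "Liposomes": 8.33,
}


def scale_recipe(recipe_ul: dict, from_volume_ul: float, to_volume_ul: float) -> dict:
    """Scale every reagent volume by the ratio of target to reference volume."""
    factor = to_volume_ul / from_volume_ul
    return {name: volume * factor for name, volume in recipe_ul.items()}


if __name__ == "__main__":
    scaled = scale_recipe(TABLE_1_UL, 50.0, 1000.0)   # 50 uL -> 1 mL (factor 20)
    for name, volume in scaled.items():
        print(f"{name:28s} {volume:7.1f} uL")
    print(f"{'Total of listed reagents':28s} {sum(scaled.values()):7.1f} uL")
```

The same function can be reused for plate-scale master mixes by passing the total volume of all wells as the target volume.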

3.4 Analytics

3.4.1 Dot Blot

Western blot analysis provides essential information (molecular weight or presence of truncated forms) about the successful expression of a recombinant protein but does not lend itself to high-throughput analysis. To analyze a large number of samples in parallel before western blotting, we investigated the use of the dot blot method. In addition, the small amount of sample needed for analysis means a significant saving in research resources. The dot blot is a technique for the detection, analysis, and identification of proteins, similar to the western blot, except that the protein samples are not separated electrophoretically but are spotted directly onto the nitrocellulose membrane. The dot blot is a semiquantitative method.

1. Assemble the dot blot apparatus.
2. Have the nitrocellulose membrane and Whatman paper ready.
3. Prewet the Whatman paper in distilled water prior to placing it in the apparatus.
4. Prewet the nitrocellulose membrane in distilled water and place it on the Whatman paper.
5. Adjust the flow valve to ensure proper drainage during filtration (see Note 7).
6. Transfer 2 μL of each reaction sample to new 96-well plates.
7. Prepare 10 mL of 25 mM Tris, pH 7.5.
8. Add 6× loading buffer to the 25 mM Tris, pH 7.5 (see Note 8).
9. Dispense 100 μL into each reaction sample.
10. Transfer the 102 μL of sample into the apparatus. Each well should be filled with the same volume of sample solution to ensure homogeneity.
11. Filter all samples in the wells through the membrane.
12. After the samples have completely drained from the apparatus, stop the vacuum and remove the nitrocellulose membrane from the apparatus.
13. Let the membrane dry.
14. Block nonspecific sites with 5% milk in TBS-T for 30 min at RT.
15. Incubate with a poly-histidine antibody conjugated with horseradish peroxidase (Sigma-Aldrich, USA) diluted 1:50,000 in TBS-T, 5% nonfat milk, for 1 h at RT.
16. Wash three times with TBS-T (3 × 10 min).
17. Incubate the nitrocellulose membrane with ECL reagent for 1 min.

3.4.2 Western Blot

Western Blot allows the identification of expressed protein samples by using a monoclonal anti-histidine HRP-conjugated antibody. Before loading the samples, the total proteins are centrifuged in a microtiter plate at 30,000 × g for 30 min at 4 °C.
1. Remove the supernatant and resuspend the pellet, which corresponds to the proteoliposome fraction, in the same volume of 50 mM Tris pH 7.5.
2. Add 6× loading buffer to the proteoliposomes.
3. Load 5 μL of the final reaction volume on a 12% SDS-PAGE gel and electrophorese at 180 V.
4. Transfer the proteins from the gel to the nitrocellulose membrane (0.22 μm) using the Trans-Blot Turbo transfer system (BioRad).
5. Stain the membrane with Ponceau red to check the efficiency of the transfer.
6. Wash the nitrocellulose membrane with TBS-Tween for 10 min at room temperature.
7. Block nonspecific sites with 5% milk in TBS-T for 45 min at RT.
8. Incubate with a poly-histidine antibody conjugated with horseradish peroxidase (Sigma-Aldrich, USA) diluted 1:10,000 in TBS-Tween buffer, 5% nonfat milk, for 1 h at RT.


9. Wash three times with TBS-T (3 × 10 min).
10. Incubate the nitrocellulose membrane with ECL reagent for 1 min and visualize the protein profiles using a Molecular Imager Chemidoc™ XRS+ with ImageLab™ software (Bio-Rad, Marnes-la-Coquette, France).

3.5 SPRi

This method measures proteoliposome-analyte interactions by Surface Plasmon Resonance imaging (SPRi) using the XelPleX system (Horiba Scientific). XelPleX is a multiplex, label-free molecular interaction platform. Its imaging configuration enhances the throughput of conventional SPR: XelPleX uses a camera that allows many spots (up to 400) to be visualized in real time and requires only a few minutes to screen a large panel of molecules. This protocol is adapted for high-throughput screening of membrane proteins embedded directly into liposomes (proteoliposomes). In this format, the biotinylated proteoliposomes are immobilized (Fig. 4). Membrane proteins are analyzed in native or near-native environments with the practical appeal of easy preparation, stability, patterning, and availability of compatible surface characterization techniques (Fig. 5).

Fig. 4 Immobilization of biotinylated proteoliposomes on CSe surface chemistry


Fig. 5 Integration of SPRi in the Proteoliposomes characterization process

3.5.1 Preparation of Sensor Chip and Immobilization of Proteoliposome Samples

1. Store the Extravidin-coated SPRi-Biochips (CSe) at 4 °C. Bring the chip to room temperature just before the experiment (see Note 9).
2. Dilute the protein sample in 10 mM PBS buffer. The concentration of proteoliposomes should be in the range of 0.5–2 μg/mL. Each concentration is printed in three replicates.
3. Use a sample volume of 100 μL in a standard 96-well microplate. The SPRi-CFM is an automatic printer that uses flow deposition for printing biomolecules. Samples are cycled over the surface and captured from solution, leading to higher biomolecule density, better spot uniformity, and improved assay sensitivity.
4. Put the microplate in the SPRi-CFM holder and start printing.

3.5.2 Experimental Binding Measurements

1. Put the spotted SPRi-Biochip in the XelPleX.
2. Block the surface of the SPRi-Biochip with biotin at 10 μg/mL.
3. Inject the normalization solution.
4. Prepare increasing concentrations of analyte and fill the 96-well microplate. Dilute the analyte with the running buffer used in the SPRi binding experiment.
5. Inject 150 μL of each analyte concentration into the flow cell at 50 μL/min.
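As a quick check of steps 4 and 5, the contact time of each injection follows from the injected volume and the flow rate, and the analyte series can be planned with C1·V1 = C2·V2. The sketch below is illustrative only: the 1 μM analyte stock and the 200 μL prepared per well are assumptions, not values from the protocol, and the function names are hypothetical. The 10 nM and 20 nM targets are the concentrations quoted in Subheading 3.7.

```python
def injection_time_min(volume_ul, flow_ul_per_min):
    """Contact time (min) of an injected sample plug: volume / flow rate."""
    if flow_ul_per_min <= 0:
        raise ValueError("flow rate must be positive")
    return volume_ul / flow_ul_per_min


def dilution_volumes(stock_nm, target_nm, final_ul):
    """Return (stock_ul, buffer_ul) for a C1*V1 = C2*V2 dilution."""
    if not 0 < target_nm <= stock_nm:
        raise ValueError("target must be positive and not exceed the stock")
    stock_ul = target_nm * final_ul / stock_nm
    return stock_ul, final_ul - stock_ul


# Step 5: 150 uL injected at 50 uL/min.
contact_min = injection_time_min(150, 50)

# Assumed values: 1000 nM analyte stock, 200 uL prepared per well.
series = {conc: dilution_volumes(1000, conc, 200) for conc in (10, 20)}

print(f"contact time per injection: {contact_min:.1f} min")
for conc, (stock_ul, buffer_ul) in series.items():
    print(f"{conc} nM: {stock_ul:.1f} uL stock + {buffer_ul:.1f} uL running buffer")
```

Preparing more than the 150 μL that is injected leaves a margin for the dead volume of the well; how much margin is needed depends on the instrument and is not stated in the protocol.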


Molecular binding induces a shift of the plasmon curves and an increase of reflectivity. The kinetic curves (sensorgrams) show the variations of reflectivity versus time (association phase). The process can also be monitored on the SPRi difference image, where white spots correspond to interacting areas of the SPRi-Biochip. When the sample solution leaves the flow cell, the ligand-analyte complexes dissociate. This induces a shift of the plasmon curves back toward their initial position and a decrease of reflectivity. The kinetic curves show the variations of reflectivity versus time (dissociation phase). The process can also be monitored on the SPRi difference image as interacting spots become darker. Ligand/analyte interactions are measured without regeneration. Regeneration is not practical since the harsh regeneration reagents may destroy the bioactivity of the proteoliposomes. When all the ligand-analyte complexes are fully dissociated (sometimes using a regeneration solution), the plasmon curves and the kinetic curves return to the initial state. The SPRi difference image is black again.

3.6 Purification with Sucrose Gradient

In order to purify proteoliposomes from empty liposomes and precipitated proteins, cell-free reactions are loaded on top of a three-step discontinuous sucrose gradient and ultracentrifuged. A discontinuous sucrose density gradient is prepared by layering sucrose solutions of successively decreasing density upon one another.
1. Prepare 60%, 25%, and 5% sucrose solutions in 50 mM Tris pH 7.5 buffer.
2. In an ultracentrifuge tube, layer 3 mL of 60% sucrose followed by 4 mL of 25% sucrose and 4 mL of 5% sucrose solution.
3. Load 1 mL of sample over the 5% sucrose solution.
4. Ultracentrifuge at 280,000 × g for 1 h at 4 °C.
5. Collect the fractions at each interface into 1.5 mL Eppendorf tubes.
6. Dilute the fractions in 50 mM Tris pH 7.5 buffer by combining 500 μL of sucrose fraction with 1 mL of 50 mM Tris pH 7.5.
7. Centrifuge the diluted fractions for 30 min at 30,000 × g, 4 °C.
8. Discard the supernatant and resuspend the pellet in 50 mM Tris pH 7.5 buffer.
9. Analyze using Western Blotting and Coomassie Blue.
10. Store the proteoliposomes at −80 °C until use.
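The volumes in steps 2, 3, and 6 can be checked with a few lines of arithmetic. The sketch below totals the tube contents and estimates the sucrose concentration after the threefold dilution in step 6, which presumably lowers the density enough for the proteoliposomes to pellet at 30,000 × g. The names are hypothetical, and the diluted percentages are upper bounds that assume a collected fraction is no denser than the layer beneath its interface.

```python
# Layers from bottom to top of the tube: (sucrose %, volume in mL), step 2.
LAYERS = [(60, 3.0), (25, 4.0), (5, 4.0)]
SAMPLE_ML = 1.0          # step 3
FRACTION_UL = 500.0      # step 6
BUFFER_UL = 1000.0       # step 6, sucrose-free 50 mM Tris pH 7.5


def total_tube_volume_ml(layers, sample_ml):
    """Total volume loaded into the ultracentrifuge tube."""
    return sum(volume for _, volume in layers) + sample_ml


def diluted_percent(percent, fraction_ul=FRACTION_UL, buffer_ul=BUFFER_UL):
    """Sucrose % after mixing a fraction with sucrose-free buffer."""
    return percent * fraction_ul / (fraction_ul + buffer_ul)


tube_ml = total_tube_volume_ml(LAYERS, SAMPLE_ML)
dilution_factor = (FRACTION_UL + BUFFER_UL) / FRACTION_UL
upper_bounds = {pct: diluted_percent(pct) for pct, _ in LAYERS}

print(f"tube volume: {tube_ml:.0f} mL, dilution factor: {dilution_factor:.0f}x")
for pct, diluted in upper_bounds.items():
    print(f"fraction at up to {pct}% sucrose -> at most {diluted:.1f}% after dilution")
```

The 12 mL total is a useful check against the nominal capacity of the ultracentrifuge tube, which the protocol does not specify.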

3.7 Application to the CXCR4 Receptor Binding Model

The C-X-C chemokine receptor type 4 (CXCR4) is a protein broadly expressed in the plasma membrane of cells of both the immune and the central nervous systems, and can mediate migration of resting leukocytes and hematopoietic progenitors in


response to its ligand, SDF1α. CXCR4 is a G-protein-coupled receptor of 352 amino acids with seven transmembrane helices, and it is also a major receptor for strains of human immunodeficiency virus-1 (HIV-1) that arise during progression to immunodeficiency and AIDS dementia. These roles make CXCR4 a potential target in many therapeutic strategies. The binding properties between CXCR4 and its specific ligand, SDF1α, are studied using an SPRi binding assay as a proof of concept (Fig. 6). CXCR4 proteoliposomes are expressed in the cell-free system, and the best expression conditions are determined by dot blotting and validated by western blot. Here we show that analysis by dot blot and western blot can be used to select the best expression conditions. After immobilization of a number of conditions on the biochip, an SPRi analysis is used to assess the binding between CXCR4 and its natural ligand, SDF1α, in a multiparallel high-throughput manner. The expected size of CXCR4 is 39.7 kDa. After dot blot and western blot analysis, twelve conditions of CXCR4 proteoliposomes are retained (Fig. 6a, b). These conditions are immobilized in triplicate on the gold chip surface and successive injections of SDF1α are performed, followed by regeneration. Results are shown in Fig. 6. The spotted SPRi-Biochip is blocked with 1% BSA to

Fig. 6 Expression and characterization of CXCR4. Dot Blot (a), Western Blot (b), and SPRi (c) analysis applied to the CXCR4 receptor binding model (W: Wash; R: Regeneration)


avoid nonspecific binding. The SDF1α solution is injected at two concentrations (10 nM and 20 nM). The interactions between CXCR4 and the ligand are monitored in real time by SPR imaging; for each injected concentration, the variations of reflectivity, which represent the quantity of analyte interacting with CXCR4, are determined. The experiment is carried out at 25 μL/min using 50 mM HEPES pH 7.5 with 0.05% BSA as running buffer. The temperature of the flow cell is set at 25 °C.

4 Notes

1. Use Milli-Q-purified water or equivalent in all recipes and protocol steps.
2. Use DEPC-treated water to prepare all reagents. The deionized water is treated with 0.001% DEPC, filtered through a 0.2 μm filter, and double autoclaved to ensure sterility and complete inactivation of DEPC.
3. Do not vortex the E. coli extract. Invert the tubes a few times to mix the contents thoroughly and place them on ice before use.
4. Genetically prepared histidine-tagged proteins that have retained antibody binding capacity are used to validate and calibrate the cell extracts. These proteins are built with the same vector as the protein samples of interest. Use the vector encoding the reference protein at a working concentration of 15 μg/mL. Choose a reference protein giving high signal intensity.
5. Run times vary depending on the experiment and the proteins expressed but typically last under 16 h.
6. It is better to have the template DNA at a concentration of 1.5 μg/μL. We use 0.75 μg of DNA per 50 μL reaction.
7. Make sure that all the screws have been tightened under vacuum to ensure that there will not be any contamination.
8. Do not heat the samples before loading.
9. This highly stable surface chemistry enables the immobilization of biotinylated targets thanks to the avidin/biotin affinity. In parallel, this strong affinity enables cycle-to-cycle reproducibility after repeated regeneration cycles.
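The figures in Note 6 translate into a pipetting volume and a final template concentration, which the minimal sketch below works out. The function names and the 96-reaction plate are illustrative assumptions. The resulting 15 μg/mL coincides with the working concentration quoted in Note 4, although the notes do not state that the two values are linked.

```python
def template_volume_ul(dna_ug, stock_ug_per_ul):
    """Volume of template stock that delivers dna_ug of DNA."""
    if stock_ug_per_ul <= 0:
        raise ValueError("stock concentration must be positive")
    return dna_ug / stock_ug_per_ul


def final_conc_ug_per_ml(dna_ug, reaction_ul):
    """Template concentration in the reaction, in ug/mL."""
    return dna_ug * 1000.0 / reaction_ul


# Note 6: 0.75 ug of DNA per 50 uL reaction from a 1.5 ug/uL stock.
per_reaction_ul = template_volume_ul(0.75, 1.5)
conc_ug_per_ml = final_conc_ug_per_ml(0.75, 50)

# Assumed scale-up: one template dispensed into a full 96-well plate.
plate_ul = 96 * per_reaction_ul

print(f"{per_reaction_ul} uL template per reaction ({conc_ug_per_ml} ug/mL final)")
print(f"{plate_ul} uL of stock for 96 reactions, before any pipetting excess")
```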

References

1. Yin H, Flynn AD (2016) Drugging membrane protein interactions. Annu Rev Biomed Eng 18:51–76
2. Ma Y, Münch D, Schneider T, Sahl HG, Bouhss A, Ghoshdastider U, Wang J, Dötsch V, Wang X, Bernhard F (2011) Preparative scale cell-free production and quality optimization of MraY homologues in different expression modes. J Biol Chem 286:38844–38853
3. Berrier C, Guilvout I, Bayan N, Park KH, Mesneau A, Chami M, Pugsley AP, Ghazi A (2011) Coupled cell-free synthesis and lipid vesicle insertion of a functional oligomeric channel MscL: MscL does not need the insertase YidC for insertion in vitro. Biochim Biophys Acta 1808:41–46
4. Lu Y (2017) Cell-free synthetic biology: engineering in an open world. Synth Syst Biotechnol 2:23–27
5. Carlson ED, Gan R, Hodgman E, Jewett MC (2012) Cell-free protein synthesis: applications come of age. Biotechnol Adv 30:1185–1194
6. Rosenblum G, Cooperman BS (2014) Engine out of the chassis: cell-free protein synthesis and its uses. FEBS Lett 588:261–268
7. Liguori L, Marques B, Villegas-Méndez A, Rothe R, Lenormand J-L (2007) Production of membrane proteins using cell-free expression systems. Expert Rev Proteomics 4:79–90
8. Schinn SM, Broadbent A, Bradley WT, Bundy BC (2016) Protein synthesis directly from PCR: progress and applications of cell-free protein synthesis with linear DNA. N Biotechnol 33(4):480–487
9. Ahn J-H, Chu H-S, Kim TW, Oh IS, Choi CY, Hahn G-H, Park C-G, Kim DM (2005) Cell-free synthesis of recombinant proteins from PCR-amplified genes at a comparable productivity to that of plasmid-based reactions. Biochem Biophys Res Commun 338:1346–1352
10. Seki E, Matsuda N, Yokoyama S, Kigawa T (2008) Cell-free protein synthesis system from Escherichia coli cells cultured at decreased temperatures improves productivity by decreasing DNA template degradation. Anal Biochem 377:156–161
11. Michel-Reydellet N, Woodrow K, Swartz J (2005) Increasing PCR fragment stability and protein yields in a cell-free system with genetically modified Escherichia coli extracts. J Mol Microbiol Biotechnol 9:26–34
12. Sitaraman K et al (2004) A novel cell-free protein synthesis system. J Biotechnol 110:257–263
13. Kai L et al (2012) Systems for the cell-free synthesis of proteins. Methods Mol Biol 800:201–225

Chapter 13

High-Throughput Site-Directed Mutagenesis

Claire Strain-Damerell and Nicola A. Burgess-Brown

Abstract

Protein engineering has an array of uses: whether you are studying a disease mutation, removing undesirable sequences, adding stabilizing mutations for structural purposes, or simply dissecting protein function. Protein engineering is almost exclusively performed using site-directed mutagenesis (SDM) as this provides targeted modification of specific amino acids, as well as the option of rewriting the native sequence to include or exclude certain regions. Despite its widespread use, SDM has often proved to be a bottleneck, requiring precision manipulation on a sample-by-sample basis to make it work. When dealing with large volumes of samples it is not possible to use such a low-throughput approach. Here we describe a high-throughput (HTP) method for SDM, optimized and used by the Structural Genomics Consortium (SGC) to complement structural studies.

Key words Protein engineering, Site-directed mutagenesis (SDM), High-throughput (HTP), Ligation-independent cloning (LIC)

1 Introduction

There are many different reasons to mutate a protein and many resources available to help you choose which sites to mutate (e.g., OMIM [1], UniProt [2], DISOPRED [3], FireProt [4]), whether you hope to emulate a natural variant, target a posttranslational modification site (e.g., phosphorylation, glycosylation), remove disordered regions, or introduce stabilizing mutations or domains. Whatever the reason, protein engineering is still predominantly done at the DNA level by site-directed mutagenesis (SDM), using either whole plasmid amplification (or inverse PCR), megaprimer PCR, or the related primer extension PCR [5–7] (see Fig. 1). This provides precise manipulation at specific sites within the protein sequence. Chemical modification of proteins is also an invaluable tool [8] but will not be discussed here. Commercially available kits are commonly based on whole plasmid amplification (e.g., QuikChange II from Agilent) as this provides the shortest path in

Renaud Vincentelli (ed.), High-Throughput Protein Production and Purification: Methods and Protocols, Methods in Molecular Biology, vol. 2025, https://doi.org/10.1007/978-1-4939-9624-7_13, © Springer Science+Business Media, LLC, part of Springer Nature 2019


Fig. 1 Schematic overview of primer extension PCR, showing the main product produced during each reaction. Only a single point mutation is illustrated in this example, requiring two first round reactions to generate the forward and reverse fragments which are then used as the template for the second round PCR

which to generate mutations and requires only two primers. The limitation of whole plasmid amplification is that you run the risk of introducing undesirable mutations in the vector backbone, due to the amplicon size, even when the required high-fidelity polymerase is used. Whole plasmid amplification is usually restricted to single site mutations; however, it can be used to introduce multiple site mutations by generating overlapping fragments, resecting the ends, refilling, and then ligating them [9]. This approach provides a means to limit the amplicon size during PCR for especially large plasmids. It is, however, a ligase-dependent method, making the efficiency low when scaled to an HTP level, and the 5′ exonuclease used runs the risk of introducing frameshifts in the assembled product. Megaprimer PCR requires three primers for a single site mutant and involves two rounds of PCR, with a purification step in between to remove the original primers. Primer extension PCR has the highest primer requirement as it requires a forward and reverse internal mutagenic primer (per mutation site) as well as forward and reverse cloning primers. This method also involves a two-step PCR (or an extension phase) but does not require a purification between these steps. Primer extension PCR is the most amenable to multiple site mutation, especially as there is no


upper or lower size limit applied by a purification step. All three methods are reliant on DpnI to destroy the template DNA; however, the primer extension method appears to be less susceptible to wild-type carry-over caused by the 3′–5′ proofreading ability of the polymerase removing the mismatch in the mutant primer (i.e., the desired mutation). This chapter covers the primer extension PCR process we have optimized for HTP mutagenesis at the SGC, with key steps highlighted and explained. The mutated genes are then cloned in an HTP fashion by ligation-independent cloning (LIC).

2 Materials

Unless otherwise stated, molecular biology grade water (Nuclease-Free Water, ThermoFisher Scientific) is used for all dilutions and reactions set out below. Where ultrapure water is instead specified, it is prepared by purifying deionized water to reach a resistivity of 18 MΩ cm at 25 °C. All reagents should be of analytical grade or higher and all plasticware should be DNAse-free.

2.1 First Round PCR

Reagents

1. Primers: Primers are supplied by MWG-Biotech and are HPSF purified at 0.01 μmol scale. Primer stocks are either supplied at or resuspended (in 10 mM Tris–HCl buffer, pH 8.0) to 100 μM and stored at −20 °C.
2. Template library: Human cDNA clones were obtained from the IMAGE cDNA collection (currently distributed by Source BioScience, UK), from other commercial providers (OriGene, Invitrogen, FivePrime), or isolated in-house by PCR from human cDNA. Synthetic DNA clones, including either the natural cDNA or codon-optimized sequences, were synthesized to order by GenScript or Codon Devices.
3. Enzymes: Herculase II Fusion DNA polymerase (Agilent Technologies).
4. 10 mM dNTP solution: Prepare a stock of 10 mM dATP, 10 mM dTTP, 10 mM dGTP, and 10 mM dCTP in water and store at −20 °C.
5. 50× TAE buffer (1 L): Dissolve 242 g of Tris base, 57.1 mL of glacial acetic acid, and 100 mL of 0.5 M EDTA, pH 8.0 in water, adjust pH to 8.5, and make up to 1 L. Filter through a 0.2 μm membrane filter and use as a 1× solution.
6. 96 well 1.5% TAE-agarose gels: 3 g of agarose powder (Invitrogen), 200 mL of 1× TAE buffer, and 4 μL of peqGREEN DNA gel stain (Peqlab), cast in a Sub-cell Model 96 (BioRad or similar) gel cast.
7. 1 kb Plus DNA ladder (Invitrogen) prepared in 1× BlueJuice™ (Invitrogen).


Equipment and Consumables

1. 96 well PCR plates.
2. Thermal resistant adhesive film.
3. Adhesive seals.
4. Membrane filters, 0.2 μm and unit.
5. Reagent reservoirs for multichannel pipetting.
6. 1.5 mL Eppendorf tubes, autoclaved to remove DNAse contamination.
7. Multichannel pipettes and repeat pipettors are used to dispense reagents into a 96 well format.
8. 96 well PCR thermocycler with heated lid.
9. 96 well gel cast and tank (Subcell Model 96 BioRad or similar).
10. Gel Logic 200 Imaging System (Kodak).

2.2 Removing the Template DNA

The following reagents, consumables, and equipment are required in addition to those listed above.

Reagents

1. DpnI [20 units/μL, New England BioLabs (NEB)].

Equipment and Consumables

2. PCR strip tubes.
3. Centrifuge suitable for 96 well PCR plates (150 × g).
4. Water bath set at 37 °C.

2.3 Second Round PCR

The following reagents, consumables, and equipment are required in addition to those listed above.

Reagents

1. Low Range Quantitative DNA Ladder (Invitrogen) for the E-Gel® system.
2. TE Buffer: 10 mM Tris–HCl and 1 mM EDTA, pH 8.0. Filter through a 0.22 μm syringe filter (Sartorius) and store at room temperature (RT).

Equipment and Consumables

1. E-Gel® 96 Mother base and E-Gel® 96 1% Agarose Gels (Invitrogen).
2. MultiScreen PCR96 filter plate (Millipore).
3. V-bottomed microtiter plates.
4. Minisart syringe filters, 0.20 μm (Sartorius).
5. MultiScreenHTS Vacuum Manifold (Millipore).

2.4 Ligation-Independent Cloning into pNIC28-Bsa4

The following reagents, consumables, and equipment are required in addition to those listed above.

Reagents

1. Screening primers: pLIC-F (TGTGAGCGGATAACAATTCC) and pLIC-R (AGCAGCCAACTCAGCTTCC), prepared as a 10 μM stock and stored at −20 °C.
2. Competent cells: All cloning is performed in Mach1™ cells (originally purchased from Invitrogen), with chemically competent cells produced in-house using the RbCl method [10]. Other cell lines are suitable for cloning, but we recommend using a recA, phage-resistant strain to promote plasmid stability and to reduce the risk of bacteriophage infection during E. coli expression, respectively.
3. Vectors: In this example, we use the N-terminal 6-Histidine tagged bacterial expression vector, pNIC28-Bsa4 (Addgene), which is LIC compatible.
4. Enzymes: T4 DNA Polymerase (3 units/μL, NEB), BsaI (10 units/μL, NEB), and MyTaq™ Red DNA polymerase (5 units/μL, Bioline) for colony PCR screening.
5. 25 mM dGTP and 25 mM dCTP (prepared from 100 mM dNTP set, Invitrogen), stored at −20 °C.
6. 100 mM and 1 M Dithiothreitol (DTT): Prepare in ultrapure water, filter through a 0.20 μm syringe filter (Sartorius), and store as 1 mL aliquots at −20 °C.
7. Bovine serum albumin (BSA) (100×, supplied with most NEB enzymes).
8. 25% (w/v) sucrose: Dissolve 250 g of sucrose in 1 L of ultrapure water and filter through a 0.22 μm filter unit (Millipore).
9. 60% (v/v) glycerol: Prepare in ultrapure water and autoclave to sterilize.
10. 50 mg/mL Kanamycin (1000×): Prepare 10 mL aliquots in ultrapure water, filter through a 0.20 μm syringe filter (Sartorius), and store at −20 °C.
11. LB-agar: Dissolve 22.5 g of premixed LB-broth and 13.5 g of agar in 800 mL of ultrapure water. Adjust the volume to 900 mL and autoclave on the same day.
12. LB-agar plates: Melt LB-agar slowly in a microwave and add sucrose to a final concentration of 5% (w/v). Once cooled to hand-hot, add the appropriate antibiotic and swirl vigorously to mix. Pour 10 mL of the molten agar into each 50 mm petri dish, and once set, upturn and leave open to dry. These can be prepared ahead of time and stored at 4 °C sealed in a plastic bag to prevent over-drying.


13. 1× LB: Dissolve 22.5 g of premixed LB-broth in 800 mL of ultrapure water. Adjust the volume to 900 mL and autoclave on the same day.
14. SOC medium: Dissolve 18 g of tryptone (or peptone from casein), 4.5 g of yeast extract, 0.45 g of NaCl, and 2.25 mL of 1 M KCl in 800 mL of ultrapure water. Adjust the volume to 900 mL and autoclave on the same day. Once cooled, add 9 mL of 2 M MgCl2 hexahydrate and 18 mL of 1 M (18%) glucose. Filter both solutions through a 0.20 μm syringe filter (Sartorius) prior to use.
15. Virkon (Appleton Woods).

Equipment and Consumables

1. PCR purification columns and kit.
2. Montage Plasmid MiniprepHTS 96 Kit (Millipore).
3. 50 mm petri dishes.
4. Disposable sterile spreaders or 2 mm autoclaved glass beads (e.g., Sigma-Aldrich) for spreading; the beads are reusable and allow faster plating for 96 well plates.
5. Disposable sterile inoculation loops (1 μL).
6. 96 deep well blocks (Thomson).
7. AirOtop porous seals (Thomson).
8. Foil seals.
9. Express™ PLUS filter unit, 0.22 μm (Millipore).
10. Microcentrifuge.
11. Centrifuge suitable for 96 deep well blocks (3000 × g).
12. Micro-Express Glas-Col shaker (Glas-Col, Indiana, US) or similar set to 37 °C.
13. Water bath set at 42 °C.
14. Incubator set at 37 °C.
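For the LB-agar plates in the reagent list above (5% sucrose, 50 μg/mL kanamycin), the supplement volumes can be estimated as sketched below. This is a hedged illustration rather than the authors' stated recipe: it assumes the sucrose is added from the 25% (w/v) stock to one 900 mL batch of LB-agar, which the text does not say explicitly, and the function name is hypothetical.

```python
def stock_volume_to_add_ml(base_ml, stock_pct, final_pct):
    """Volume v of stock to add so that the mixture reaches final_pct.

    Solves stock_pct * v = final_pct * (base_ml + v) for v.
    """
    if not 0 < final_pct < stock_pct:
        raise ValueError("final concentration must be below the stock")
    return base_ml * final_pct / (stock_pct - final_pct)


LB_AGAR_ML = 900.0        # one autoclaved batch
PLATE_ML = 10.0           # per 50 mm petri dish

# Assumption: sucrose comes from the 25% (w/v) stock.
sucrose_ml = stock_volume_to_add_ml(LB_AGAR_ML, 25.0, 5.0)
total_ml = LB_AGAR_ML + sucrose_ml

# A 1000x kanamycin stock is added at 1/1000 of the final volume,
# giving approximately 50 ug/mL.
kanamycin_ml = total_ml / 1000.0
plates = int(total_ml // PLATE_ML)

print(f"add {sucrose_ml:.0f} mL of 25% sucrose -> {total_ml:.0f} mL at 5%")
print(f"add {kanamycin_ml:.3f} mL of 1000x kanamycin; pours {plates} plates")
```

Under this assumption, one batch covers a 96-well transformation with a few spare plates.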

3 Methods

When performing site-directed mutagenesis, the decision of which residues to mutate is usually made at the amino acid level, whereas the primer design is done at the DNA level. In order to avoid mutating the wrong residue, it is advisable to refer to the context of the residue rather than the number alone (e.g., KAGLYMKMEPV, rather than M362R); this ensures that, regardless of how the residues are numbered, the correct one is mutated (e.g., M362R, rather than M364R in the above example). The mutagenic primer design strategy followed is dependent on what you hope to achieve: insertion, deletion, or point mutation. For all types of


Fig. 2 Schematic representation of primer design and positioning for insertions, deletions, or point mutations. Note that the point mutation primers are completely overlapping, whereas the insertion and deletion primers are slightly offset

mutation, the primers are designed to flank the target site and either include the sequence being inserted (for insertions), exclude the sequence targeted for deletion (for deletions), or include the desired nucleotide change (for point mutations) (see Fig. 2). In the method described below, at least four primers are required for any mutation: forward and reverse LIC primers, and forward and reverse internal mutagenic primers (see Note 1). The primer design can be optimized for each sequence if desired, but in our hands, we find that a simple 12 bp extension either side of the mutation is sufficient for each mutagenic primer. However, having extra nonoverlapping regions in the mutagenic primers appears to be beneficial for insertions and deletions (see Fig. 2). The forward LIC primer should start with the following sequence: TACTTCCAATCCATG, followed by ~15–20 bp of gene-specific sequence. The “ATG” should be in-frame with the target protein sequence; if your protein is full length you need not include an additional start codon. The reverse LIC primer should start with the sequence: TATCCACCTTTACTG, followed by ~15–20 bp of gene-specific sequence. A stop codon should be included if the gene sequence does not already include one. The described example covers the process of cloning into our standard N-terminally His-tagged bacterial expression vector, pNIC28-Bsa4 (https://www.addgene.org/26103/) (see Fig. 3). For a more comprehensive overview of LIC cloning and the alternative vectors that are available please refer to Strain-Damerell et al. (2014) [11].
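The design rules above (locate the residue by its sequence context, add 12 bp either side of a point mutation, and prefix the LIC tags) can be sketched in a few functions. This is a minimal illustration under stated assumptions, not the authors' design tool: the toy coding sequence, all function names, the choice of CGT for arginine, the 18 nt annealing length, and the use of TAA when no stop codon is present are hypothetical. It also interprets the gene-specific part of the reverse primer as the reverse complement of the 3′ end of the gene, and it skips the gene's own start codon because the forward tag already ends in ATG.

```python
LIC_FWD_TAG = "TACTTCCAATCCATG"
LIC_REV_TAG = "TATCCACCTTTACTG"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def residue_index_from_context(protein, context, offset):
    """0-based index of the residue at `offset` within a unique `context`."""
    if protein.count(context) != 1:
        raise ValueError("context must occur exactly once in the protein")
    return protein.index(context) + offset


def point_mutation_primers(cds, residue_index, new_codon, flank=12):
    """Fully overlapping primer pair with `flank` nt either side of the codon."""
    start = 3 * residue_index
    if start - flank < 0 or start + 3 + flank > len(cds):
        raise ValueError("not enough flanking sequence")
    forward = cds[start - flank:start] + new_codon + cds[start + 3:start + 3 + flank]
    return forward, reverse_complement(forward)


def lic_primers(cds, anneal=18, stop_codon="TAA"):
    """LIC primer pair for a coding sequence (see assumptions in the lead-in)."""
    cds = cds.upper()
    # The forward tag already supplies an in-frame ATG.
    body = cds[3:] if cds.startswith("ATG") else cds
    if cds[-3:] not in ("TAA", "TAG", "TGA"):
        body += stop_codon  # assumption: TAA when the gene lacks a stop codon
    return LIC_FWD_TAG + body[:anneal], LIC_REV_TAG + reverse_complement(body[-anneal:])


# Toy example: a made-up 16-codon gene containing the context from the text.
CDS = "".join("ATG TCT AAA GCT GGT CTG TAT ATG AAA ATG GAA CCG GTT GAT GAA TAA".split())
PROTEIN = "MSKAGLYMKMEPVDE"

# The first Met of the context KAGLYMKMEPV (offset 5) is the one to mutate.
idx = residue_index_from_context(PROTEIN, "KAGLYMKMEPV", 5)
mut_fwd, mut_rev = point_mutation_primers(CDS, idx, "CGT")  # Met -> Arg
lic_fwd, lic_rev = lic_primers(CDS)

for name, primer in [("mut F", mut_fwd), ("mut R", mut_rev),
                     ("LIC F", lic_fwd), ("LIC R", lic_rev)]:
    print(f"{name}: {primer}")
```

Looking up the residue by context rather than by number means the same call works whether the construct is numbered from the full-length protein or from a truncated one. The toy gene is far too short for real use; in practice the mutagenic and LIC primers anneal to well-separated regions.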


Fig. 3 Schematic of the bacterial expression vector pNIC28-Bsa4 containing a gene of interest. Cloning is LIC-dependent, putting the Gene of Interest (GOI) in-frame with the N-terminal 6-Histidine tag. Expression is driven by the T7 promoter and is IPTG-inducible

3.1 First Round PCR

1. Using a multichannel pipette and a reagent reservoir, add 190 μL of water to each well of four 96 well PCR plates. To each well add 10 μL of the 100 μM primers to generate four separate primer plates (one per primer) at 5 μM each, with the primers in the correct well locations (see Note 2). The resulting plates are forward LIC, reverse LIC, forward mutagenic, and reverse mutagenic. Seal each plate with adhesive seal and store at 4 °C.
2. For each template (see Subheading 2.1, item 2) prepare a 2.5 ng/μL dilution in a 1.5 mL Eppendorf tube, mix well, and aliquot 20 μL of this into the appropriate wells of another 96 well PCR plate (see Note 3). Seal this plate with adhesive seal and store at 4 °C.
3. Prepare a PCR master mix as follows: 1 mL of 5× Herculase II Fusion polymerase buffer, 2.65 mL of water, 150 μL of 10 mM dNTP mixture, and 50 μL of Herculase II Fusion polymerase. Mix well, and using a multichannel pipette or repeat pipettor aliquot 20 μL to each well of two 96 well PCR plates (forward and reverse reaction plates) (see Note 4).
4. Using a multichannel pipette, aliquot 2.5 μL of template DNA (see Subheading 3.1, step 2) into each of the wells of the forward and reverse reaction plates (see Subheading 3.1, step 3).
5. Using a multichannel pipette, aliquot 1.5 μL of the forward LIC primers and the reverse mutagenic primers (see Subheading 3.1, step 1) into each well of the forward reaction plate (see Subheading 3.1, step 3).
6. Using a multichannel pipette, aliquot 1.5 μL of the reverse LIC primers and the forward mutagenic primers (see Subheading 3.1, step 1) into each well of the reverse reaction plate (see Subheading 3.1, step 3).
7. Seal each plate with a thermal resistant adhesive film, place each into a separate thermocycler, and cycle as follows:


95 °C for 10 min
(95 °C for 30 s, 62 °C for 30 s, 68 °C for 1–3 min∗) × 15 cycles
68 °C for 5 min
15 °C hold
∗Extension time is dependent on product size. Assume 1 min per 1 kb for the longest product on the plate.
8. While the cycle is running, prepare two 96 well 1.5% TAE gels.
9. Using a multichannel pipette, aliquot 3 μL of 2× BlueJuice into each well of two 96 well PCR plates. Transfer 3 μL of the forward PCR reactions into one and 3 μL of the reverse PCR reactions into the other. Load all 6 μL using a multichannel pipette, noting that the samples will be interleaved (see Fig. 4). Load 6 μL of 1 kb DNA ladder and run the gels at 150 V for 1 h.
10. Analyze the sizes of the products and repeat any missing reactions with altered conditions where required (see Note 5).
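The volumes in steps 1–7 can be cross-checked with some simple arithmetic, sketched below. The variable and function names are hypothetical and the code only restates figures given in the protocol: the master mix composition, the 20 μL aliquots across two plates, the 2.5 μL of template, the two 1.5 μL primer additions, and the 1 min per kb extension rule. The 1500 bp product in the example is an arbitrary illustration.

```python
MASTER_MIX_UL = {
    "5x Herculase II buffer": 1000.0,
    "water": 2650.0,
    "10 mM dNTP mixture": 150.0,
    "Herculase II polymerase": 50.0,
}
WELLS = 2 * 96            # forward and reverse reaction plates
MIX_PER_WELL_UL = 20.0
TEMPLATE_UL = 2.5         # of the 2.5 ng/uL template dilution
PRIMER_UL = 1.5           # each of the two primers per reaction


def diluted_conc(stock_conc, stock_ul, diluent_ul):
    """Concentration after adding diluent to a stock aliquot."""
    return stock_conc * stock_ul / (stock_ul + diluent_ul)


def extension_time_min(longest_product_bp, min_per_kb=1.0):
    """Extension time from the 1 min per kb rule for the longest product."""
    return longest_product_bp / 1000 * min_per_kb


mix_total_ul = sum(MASTER_MIX_UL.values())
mix_needed_ul = WELLS * MIX_PER_WELL_UL
reaction_ul = MIX_PER_WELL_UL + TEMPLATE_UL + 2 * PRIMER_UL

primer_plate_um = diluted_conc(100.0, 10.0, 190.0)       # step 1
primer_final_um = primer_plate_um * PRIMER_UL / reaction_ul
template_ng = 2.5 * TEMPLATE_UL
buffer_final_x = (5.0 * MASTER_MIX_UL["5x Herculase II buffer"] / mix_total_ul
                  * MIX_PER_WELL_UL / reaction_ul)

print(f"master mix: {mix_total_ul:.0f} uL made, {mix_needed_ul:.0f} uL dispensed")
print(f"reaction volume: {reaction_ul} uL, buffer at {buffer_final_x:.2f}x")
print(f"each primer: {primer_final_um:.2f} uM, template: {template_ng} ng")
print(f"extension for a 1500 bp product: {extension_time_min(1500)} min")
```

The master mix exceeds the dispensed volume by only 10 μL, so there is little excess to absorb pipetting losses.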

Fig. 4 Example of first round PCR gels. Samples loaded with a multichannel are interleaved (e.g., A1, B1, A2). The products from both the forward reaction (top) and reverse reactions (bottom) combine to create the template for the second round PCR


3.2 Removing the Template DNA

1. Prepare a master mix of DpnI as follows (see Note 6): Mix 209 μL of 10× NEB CutSmart buffer with 11 μL of DpnI, then aliquot 18 μL across a strip of 12 PCR tubes. Use the remaining DpnI mixture to set up a control reaction (see Subheading 3.2, step 3).
2. Using a multichannel pipette, aliquot 1 μL of the DpnI mixture into each well of the forward and reverse PCR reaction plates (see Subheading 3.1). Seal each plate with an adhesive film and mix well by tapping the side of each plate. Spin briefly at 300 × g for 1 min before incubating at 37 °C for 1 h in a water bath.
3. Set up a DpnI control sample (see Note 7) as follows: Mix 22.5 μL of water with 2.5 μL of any of the 2.5 ng/μL templates (see Subheading 3.1, step 2), followed by 1 μL of the DpnI mixture (see Subheading 3.2, step 1). Incubate this sample alongside the PCR reaction plates (see Subheading 3.2, step 2).
4. No purification step is necessary post-DpnI treatment (see Note 8).
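The DpnI master mix in step 1 is sized closely to what steps 2 and 3 consume, as the sketch below shows. It assumes that a 12-channel pipette draws from the strip of 12 tubes once per plate row, which the protocol implies but does not state; the names are hypothetical.

```python
BUFFER_UL = 209.0
DPNI_UL = 11.0
DPNI_UNITS_PER_UL = 20.0   # stock concentration, Subheading 2.2
STRIP_TUBES = 12
UL_PER_TUBE = 18.0
PLATES = 2                 # forward and reverse reaction plates
ROWS_PER_PLATE = 8
UL_PER_WELL = 1.0
CONTROL_UL = 1.0

mix_total_ul = BUFFER_UL + DPNI_UL
units_per_ul_mix = DPNI_UL * DPNI_UNITS_PER_UL / mix_total_ul
units_per_well = units_per_ul_mix * UL_PER_WELL

in_strip_ul = STRIP_TUBES * UL_PER_TUBE
left_for_control_ul = mix_total_ul - in_strip_ul

# Assumption: each strip tube feeds one column across every row of both plates.
drawn_per_tube_ul = PLATES * ROWS_PER_PLATE * UL_PER_WELL
spare_per_tube_ul = UL_PER_TUBE - drawn_per_tube_ul

print(f"mix: {mix_total_ul:.0f} uL at {units_per_ul_mix:.1f} U/uL "
      f"({units_per_well:.1f} U per well)")
print(f"strip: {in_strip_ul:.0f} uL, spare per tube: {spare_per_tube_ul:.0f} uL")
print(f"left for control: {left_for_control_ul:.0f} uL (needs {CONTROL_UL:.0f} uL)")
```

Each well therefore receives about 1 unit of DpnI, and each strip tube carries 2 μL of dead volume.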

3.3 Second Round PCR

1. Prepare a PCR master mix as follows: 500 μL of 5× Herculase II Fusion polymerase buffer, 1.3 mL of water, 75 μL of 10 mM dNTP mixture, and 25 μL of Herculase II Fusion polymerase. Mix well, and using a multichannel pipette or repeat pipettor aliquot 20 μL to each well of a 96 well PCR plate.
2. Using a multichannel pipette, aliquot 1.5 μL of forward LIC primer, 1.5 μL of reverse LIC primer (see Subheading 3.1, step 1), 1.5 μL of the forward PCR reaction, and 1.5 μL of reverse PCR reaction (see Subheading 3.2, step 2) into each well of the PCR plate (see Subheading 3.3, step 1).
3. Seal the plate with a thermal resistant adhesive film, place into a thermocycler, and cycle as follows:
95 °C for 10 min
(95 °C for 30 s, 62 °C for 30 s, 68 °C for 1–3 min∗) × 10 cycles
68 °C for 5 min
15 °C hold
∗Extension time is dependent on product size. Assume 1 min per 1 kb for the longest product on the plate.
4. Dilute 3 μL of the reaction mixtures with 12 μL of water. Run on an E-Gel® (Invitrogen) including 20 μL of the 2× diluted Low Range Quantitative DNA Ladder (Invitrogen) (see Fig. 5).
5. Analyze the sizes of the products and repeat any missing reactions with altered conditions where required (see Note 9).
6. Transfer all correct products into a fresh PCR plate and make the volume up to 100 μL with water. Transfer to a MultiScreen

Fig. 5 Example of second round PCRs run on an E-Gel®. The intensity and sizes of the final products permit the use of a low resolving gel for analysis (lanes A1–H12 with marker lanes AM–HM; ladder bands labeled at 2000, 800, 400, and 200 bp)

PCR96 filter plate (Millipore) and follow the manufacturer’s instructions to purify the samples. Recover the products in 30 μL of TE buffer and transfer into a V-bottomed microtiter plate (see Note 10). 3.4 LigationIndependent Cloning into pNIC28-Bsa4

1. Digest the pNIC28-Bsa4 vector as follows: Mix 5 μg of vector, 10 μL of NEB CutSmart buffer, 2 μL of BsaI, make up to 100 μL with water and incubate at 37 °C for 2 h in a water bath.
2. Confirm that the vector is completely digested by analyzing 3 μL on a 1.5% TAE/agarose gel. The bands should be 5.3 kb and 1.9 kb in size, with the smaller fragment containing the SacB cassette from the vector (see Note 11).
3. Purify the vector using a PCR purification column, as per the manufacturer’s instructions, eluting in 50 μL of TE buffer.
4. T4-treat the vector as follows: Mix 21.5 μL of water, 50 μL of BsaI-digested plasmid, 10 μL of NEB Buffer 2.1, 10 μL of 25 mM dGTP, 1 μL of BSA, 5 μL of 100 mM DTT, and 2.5 μL of T4 DNA polymerase. Incubate at 22 °C for 30 min, then heat inactivate at 75 °C for 20 min in a thermal cycler.
5. To T4-treat the PCR products, set up a master mix as follows: Mix 215 μL of water, 100 μL of NEB Buffer 2.1, 100 μL of 25 mM dCTP, 10 μL of BSA, 50 μL of 100 mM DTT, and 25 μL of T4 DNA polymerase. Aliquot 5 μL into each well of a 96 well PCR plate, and using a multichannel pipette, transfer 5 μL of purified PCR product. Seal the plate and incubate at 22 °C for 30 min, followed by 75 °C for 20 min in a thermal cycler.
6. In a new 96 well PCR plate, aliquot 1 μL of T4-treated pNIC28-Bsa4 into each well, then using a multichannel pipette, transfer 2 μL of T4-treated PCR product. Incubate at room temperature for 1 h. In the meantime, thaw the competent Mach1™ cells on ice and prepare the LB-agar plates containing 50 μg/mL kanamycin and 5% (w/v) sucrose (see Note 11).
7. Place the PCR plate on ice, and using a reagent reservoir and multichannel pipette add 50 μL of Mach1™ cells to each well, mixing gently but well in the process.
8. Incubate on ice for 30 min and heat shock at 42 °C for 45 s in a water bath, before returning to ice.
9. Add 100 μL of SOC medium to each well, cover with a porous seal, and incubate for 1.5 h at 37 °C.
10. Plate the entire transformation mixture onto the LB-agar plates and incubate at 37 °C overnight.
11. To screen the colonies, set up a master mix as follows: Mix 400 μL of 5× Bioline buffer, 1.49 mL of water, 100 μL of 10 μM pLIC-F and pLIC-R screening primer mix (see Subheading 2.4, item 1) and 10 μL of Bioline MyTaq™ DNA polymerase. Aliquot 20 μL into each well of a 96 well PCR plate.
12. Using a multichannel pipette and a reagent reservoir, add 1 mL of LB medium containing 50 μg/mL kanamycin to each well of a 96 deep well block.
13. Using a 1 μL loop, pick a single colony and inoculate into the appropriate well of the PCR plate (see Subheading 3.4, step 11), followed by the deep well block (see Subheading 3.4, step 12).
14. Seal the PCR plate with a thermal resistant seal and cycle in a thermal cycler as follows:
   95 °C for 10 min
   (95 °C for 30 s, 52 °C for 30 s, 72 °C for 1–3 min*) × 25 cycles
   72 °C for 5 min
   15 °C hold
   * Extension time is dependent on product size. Assume 1 min per 1 kb for the longest product on the plate and allow for the extra 300 bp for the position of the screening primers.
15. While the cycle is running, prepare a 96 well 1.5% TAE gel.
16. Seal the deep well block with a porous seal and incubate at 37 °C overnight in a Glas-Col shaker at 650 rpm.
17. Using a multichannel pipette, analyze 10 μL of the PCR screen, noting that the samples will be interleaved (see Fig. 6). Load 6 μL of 1 kb DNA ladder and run the gels at 150 V for 1 h.
18. Analyze the sizes of the products and screen additional clones where required.
19. Where the DNA sequence of all clones must be verified before proceeding, it is advisable to sequence at this step as the transformants will still be fresh enough to select alternative clones (see Note 12). To sequence, purify the remaining 10 μL of PCR product using a MultiScreen PCR96 filter plate (Millipore), following the manufacturer’s instructions, and recover in 20 μL of TE. Send these samples for sequencing using the screening primers and internal sequencing primers where required.
20. Once the sequences are confirmed, combine the correct clones into a new 96 deep well block by inoculating 20 μL of each culture into 1 mL of LB medium containing 50 μg/mL kanamycin. Grow at 37 °C overnight in a Glas-Col shaker at 650 rpm.
21. Prepare a glycerol stock by adding 30 μL of 60% (v/v) glycerol to each well of a V-bottomed microtiter plate, followed by 120 μL of the overnight culture. Cover with a foil seal and store at −80 °C.

Fig. 6 Example of a PCR screen gel. In this case the plate contained 8 mutations of 12 different targets running in columns across the plate; hence the sizes for neighboring wells are comparable (e.g., A1 and B1)


Claire Strain-Damerell and Nicola A. Burgess-Brown

22. Centrifuge the remaining culture at 3000 × g, discard the supernatant into Virkon (as per the manufacturer’s instructions), and process the pellets using the Montage Plasmid MiniprepHTS 96 Kit (Millipore). Recover the plasmid DNA in 30 μL of TE buffer. Transform into an appropriate expression strain and store the DNA at −20 °C.
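The extension-time rule for the colony screen (1 min per kb for the longest product, plus roughly 300 bp contributed by the screening primers) can be written as a small helper. This is a minimal sketch: the function names and the example insert lengths are hypothetical, and the program strings simply restate the cycling conditions given for the colony screen.

```python
def screen_extension_min(insert_lengths_bp, primer_offset_bp=300, min_per_kb=1.0):
    """Extension time (min) for the longest colony-screen product on a plate."""
    if not insert_lengths_bp:
        raise ValueError("at least one insert length is required")
    # The screening primers sit outside the insert, adding ~300 bp to the product.
    longest_product_bp = max(insert_lengths_bp) + primer_offset_bp
    return longest_product_bp / 1000.0 * min_per_kb


def screen_program(insert_lengths_bp):
    """Colony-screen cycling program with the extension time filled in."""
    ext = screen_extension_min(insert_lengths_bp)
    return [
        "95 C for 10 min",
        f"(95 C for 30 s, 52 C for 30 s, 72 C for {ext:.1f} min) x 25 cycles",
        "72 C for 5 min",
        "15 C hold",
    ]


if __name__ == "__main__":
    # Hypothetical plate with inserts of 900, 1500, and 2100 bp.
    for line in screen_program([900, 1500, 2100]):
        print(line)
```

The same helper can be reused for the second round PCR by setting `primer_offset_bp=0`, since that step uses the LIC primers rather than the screening primers.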

4 Notes

1. Fewer primers are required for alternative mutagenesis strategies but in our hands the success rate for introducing the mutations is higher when both forward and reverse mutagenic primers are used. When a single mutagenic primer is used for the first round, the result is a mismatch in the second round that can be targeted by the proofreading ability of the polymerase, resulting in the loss of the desired mutation from the second round of PCR.
2. You can set up two sets of forward/reverse plates instead if you wish but you will need the LIC primers alone for the second round PCR.
3. The template DNA must be in the form of plasmid DNA isolated from a dcm/dam positive strain for this process to work. It must also be of sufficient quality to be amenable to DpnI digestion.
4. It is vital that the reaction mixture is well mixed prior to aliquoting across the 2 PCR plates.
5. The first round products will be faint as they are the result of only 15 cycles but they should ideally be visible as this increases the success rate of the process. To optimize where there is no visible product, you can reduce the annealing temperature (between 50 °C and 62 °C), try adding 3% (v/v) DMSO or 5% (v/v) glycerol to the reaction, or try a touchdown cycle with annealing temperatures of 65 °C, 60 °C, and 55 °C and 5 cycles of each. As a last resort you can increase the total number of cycles to 25, rather than 15, but this carries an increased risk of introducing undesired mutations.
6. This is the most vital step of the mutagenesis protocol as it removes the template DNA, i.e., the wild-type sample, from the reaction. DpnI selectively degrades methylated DNA (e.g., that purified from most bacterial strains). In contrast, DNA generated by PCR will be un-methylated and therefore will remain intact during the DpnI treatment.
7. As the DpnI treatment is so essential to success, it is important to run a control reaction to test that the template DNA has actually been fully degraded.

High-Throughput Site-Directed Mutagenesis


8. By DpnI-treating the template DNA you ensure that you cannot amplify the wild-type plasmid with the LIC primers, as the overlaps between the DpnI-generated fragments are too short to allow each to act as a primer for the next. When tested, we found no difference in the incidence of wild type between purified and non-purified samples.
9. If there are no discernible products from the first round PCR, then this is most likely the reason that the second round failed; therefore the first step in troubleshooting should be to confirm the first round products. If the first round products are correct, then try to alter the second round conditions (see Note 5). If one of the first round products is particularly dominant compared to the other, then sometimes reducing the amount used for the second round reaction can help. Note that there may well be multiple bands visible but there is no need to purify the correct band away from the others as only the full length product will contain both LIC sites.
10. In order to proceed with the T4-treatment stage, the PCR products must first be purified away from the residual dNTPs in the reaction, as these will inhibit the 3′ resection of the PCR products by the T4 DNA polymerase.
11. The pNIC28-Bsa4 vector contains a SacB cassette, encoding levansucrase, which confers the ability to convert sucrose into toxic polymers. The SacB cassette is released during the cloning process. When plated on sucrose, cells containing the sacB gene will perish, whereas those containing the recombinant vector that lacks the SacB cassette will grow unhindered. This step selects against non-recombinant clones.
12. Where the targets express well, it is possible to verify the mutation by determining the intact mass on a mass spectrometer [12]. However, if the target does not express or is low expressing, then DNA sequencing is required to verify the mutation. As the plasmid concentrations from a 96 well miniprep can often be too low to be amenable to DNA sequencing, it is often easier to use the remaining colony screen PCR product instead, as the requirements for PCR product sequencing are far lower than for plasmids.
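The touchdown option in Note 5 (annealing at 65 °C, 60 °C, and 55 °C with 5 cycles of each) amounts to a 15-cycle annealing schedule. The sketch below only expands that schedule; the function name is hypothetical, and denaturation and extension settings are deliberately left out because they are not restated in the note.

```python
def touchdown_annealing(temps_c=(65, 60, 55), cycles_each=5):
    """Expand a touchdown description into one annealing temperature per cycle."""
    schedule = []
    for temp in temps_c:
        # Each annealing temperature is held for the same number of cycles.
        schedule.extend([temp] * cycles_each)
    return schedule


if __name__ == "__main__":
    plan = touchdown_annealing()
    for cycle, temp in enumerate(plan, start=1):
        print(f"cycle {cycle:2d}: anneal at {temp} C")
```

With the defaults the total stays at 15 cycles, matching the first round cycle number quoted in Note 5.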

Acknowledgments

We would like to thank all the SGC scientists (past and present) who contributed toward the development of the method. The SGC is a registered charity (number 1097737) that receives funds from AbbVie, Bayer Pharma AG, Boehringer Ingelheim, Canada Foundation for Innovation, Eshelman Institute for Innovation, Genome Canada through Ontario Genomics Institute, Innovative Medicines Initiative, Janssen, Merck & Co., Novartis Pharma AG, Ontario Ministry of Research, Innovation and Science (MRIS), Pfizer, São Paulo Research Foundation-FAPESP, Takeda, and the Wellcome Trust.

References

1. Hamosh A, Scott AF, Amberger JS, Bocchini CA, McKusick VA (2005) Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes and genetic disorders. Nucleic Acids Res 33(Database issue):D514–D517
2. The UniProt Consortium (2017) UniProt: the universal protein knowledgebase. Nucleic Acids Res 45(D1):D158–D169
3. Ward JJ, McGuffin LJ, Bryson K, Buxton BF, Jones DT (2004) The DISOPRED server for the prediction of protein disorder. Bioinformatics 20(13):2138–2139
4. Musil M, Stourac J, Bendl J, Brezovsky J, Prokop Z, Zendulka J, Martinek T, Bednar D, Damborský J (2017) FireProt: web server for automated design of thermostable proteins. Nucleic Acids Res 45:W393–W399
5. Hemsley A, Arnheim N, Toney MD, Cortopassi G, Galas DJ (1989) A simple method for site-directed mutagenesis using the polymerase chain reaction. Nucleic Acids Res 17(16):6545–6551
6. Sarkar G, Sommer SS (1990) The “megaprimer” method of site-directed mutagenesis. BioTechniques 8(4):404–407
7. Ho SN, Hunt HD, Horton RM, Pullen JK, Pease LR (1989) Site-directed mutagenesis by overlap extension using the polymerase chain reaction. Gene 77(1):51–59
8. Spicer CD, Davis BG (2014) Selective chemical protein modification. Nat Commun 5:4740
9. Mitchell LA, Cai Y, Taylor M, Noronha AM, Chuang J, Dai L, Boeke JD (2013) Multichange isothermal mutagenesis: a new strategy for multiple site-directed mutations in plasmid DNA. ACS Synth Biol 2(8):473–477
10. Hanahan D, Jessee J, Bloom FR (1991) Plasmid transformation of Escherichia coli and other bacteria. Methods Enzymol 204:63–113
11. Strain-Damerell C, Mahajan P, Gileadi O, Burgess-Brown NA (2014) Medium-throughput production of recombinant human proteins: ligation-independent cloning. Methods Mol Biol 1091:55–72
12. Chalk R, Berridge G, Shrestha L, Strain-Damerell C, Mahajan P, Yue W, Gileadi O, Burgess-Brown N (2014) High-throughput mass spectrometry applied to structural genomics. Chromatography 1(4):159

Part III High-Throughput Protein Production Combined with High-Throughput Micro-Characterization for Protein Stability, Solubility, Quality and Interactions

Chapter 14

Hot CoFi Blot: A High-Throughput Colony-Based Screen for Identifying More Thermally Stable Protein Variants

Ignacio Asial, Pär Nordlund, and Sue-Li Dahlroth

Abstract

Highly soluble and stable proteins are desirable for many different applications, from basic science to reaching a cancer patient in the form of a biological drug. For X-ray crystallography, where production of a protein crystal might take weeks or even months, a stable protein sample of high purity and concentration can greatly increase the chances of producing a well-diffracting crystal. For a patient receiving a specific protein drug, its safety, efficacy, and even cost are factors affected by its solubility and stability. Increased protein expression and protein stability can be achieved by randomly altering the coding sequence. As the number of mutants generated might be overwhelming, a powerful protein expression and stability screen is required. In this chapter, we describe a colony filtration technology, the Hot CoFi blot, which allows us to screen random mutagenesis libraries for increased thermal stability. We share how to create the random mutagenesis library, how to perform the Hot CoFi blot, and how to identify more thermally stable clones. We use the Tobacco Etch Virus protease as a target to exemplify the procedure.

Key words Colony screen, Protein expression, Error-prone PCR, Thermal stability, Hot CoFi

1 Introduction

Protein solubility and stability are highly desired traits for the biopharmaceutical and biotechnology industries, as well as for the overall research community. There are significant unmet needs in this area where new methodologies and techniques to improve solubility and intrinsic stability are required. For the biopharmaceutical industry, increased solubility and stability can result in easier manufacturing, simplified formulation, and increased shelf life. In fact, high intrinsic thermal stability has been shown to be a highly favorable trait for a protein drug to successfully reach the clinic [1–8]. More stable biotherapeutics may also reduce aggregation-linked immunogenicity in patients, allow decreased infusion times, and even enable new alternative routes of administration [9]. Highly stable enzymes are also desired in the industrial biotechnology sectors such as bio-energy, plastic, paper, and textile

Renaud Vincentelli (ed.), High-Throughput Protein Production and Purification: Methods and Protocols, Methods in Molecular Biology, vol. 2025, https://doi.org/10.1007/978-1-4939-9624-7_14, © Springer Science+Business Media, LLC, part of Springer Nature 2019


industries, as well as in household and industrial cleaning products, etc., which today represents a multi-billion dollar market [10]. In the early 2000s different Structural Genomics efforts pioneered huge advancements in high-throughput (HTP) recombinant protein production and purification methods. Because of its cost, ease, and lack of posttranslational modifications Escherichia coli (E. coli) was the preferred production host. New HTP sub-cloning technologies were introduced and the subsequent step of expression and solubility screening was miniaturized to a 96-well format where liquid cultures were lysed and soluble protein was separated from insoluble protein by centrifugation or filtration. To increase the likelihood of soluble recombinant expression in E. coli, scientists would vary and combine different parameters deemed to affect expression such as the host/strain, the vector system, the promoters, fusion tags, expression temperatures, induction times, and media [11–20]. To date, E. coli is still the main workhorse for recombinant protein production, and expression screens are still very much done in a 96-well format. However, there has been an adjustment in strategy when it comes to increasing the success of producing a soluble and stable recombinant protein. Instead of only varying the parameters mentioned above, changing the actual coding sequence to accommodate for rare codon bias or altering the protein’s physical properties has been shown to be an effective strategy. This can be done rationally to remove or replace low complexity or hydrophobic regions or by creating multiple constructs with different translational starts and stops to target smaller protein domains [21–45]. Rational protein design works very well, when there is good knowledge about the target such as its three-dimensional structure and homologues with conserved structural and functional domains, which may guide the construct design [46–52]. 
However, when there is little or no information available about the protein, the task of creating meaningful constructs or mutants can be very difficult, bordering on impossible. Instead a more random approach might be needed. Several different methods like error-prone PCR (epPCR), DNA shuffling, directional truncations, etc. can be used to create a mutant library consisting of an exponential number of variants for one target [53–61]. By randomizing the protein sequence to increase expression, additional and desirable traits might also be created such as increased solubility, stability, or activity. The remaining hurdle for the scientist is then to screen the mutant library for expression, which might be a daunting effort (even if you have an endless amount of 96 well plates at your disposal). Once a protein has been overexpressed and isolated, the scientist might still be left with a protein that is unstable and aggregates. The main goal will then be to identify optimal stabilizing conditions. This can be done by screening protein stability in different buffers, pH, with different additives and ligands, etc. This work can be done either by specific


protein assays or with more generic tools such as thermal stability assays (TSAs), where unfolding and aggregation of the protein is monitored during an increase in temperature. Increased thermal stability of a protein has been shown to positively correlate with the likelihood of crystallization for structural determination [62–64], and TSAs are routinely used in our laboratory to identify, investigate, and/or verify stabilizing conditions or ligands [65–69]. But as mentioned above, thermal stability is not just important for creating a protein sample suitable for crystallization; it is equally important for the pharmaceutical industry, where poor protein stability can affect the safety and efficacy of the protein drug [9, 70]. Seeing as beneficial mutations or truncations might be rare, a powerful expression screen to resolve thousands of protein variants would be ideal. In addition, the expression screen should preferably be protein activity-independent and function at the colony level. To be able to screen large gene libraries, we developed an activity-independent colony screen that relies on a filtration step, at colony level, and has therefore been named Colony Filtration blot or CoFi blot for short [71, 72]. It can be applied on a wide range of proteins and on any colony-forming organism and can be implemented in a modestly equipped laboratory. It does not rely on a reporter protein or on specific protein activity; instead, the protein is detected with antibodies and immuno-reagents. In our case, to be as generic as possible, we fuse our proteins with a 6xHis-tag that can be used for both detection and purification. CoFi blot is a method where colonies, carrying a gene construct or variants, are transferred to a filter membrane, which enables the separation of aggregated protein from soluble protein.
After induction, the filter is placed on top of a nitrocellulose membrane, and during lysis, soluble protein will diffuse through the filter and be captured by the nitrocellulose. The nitrocellulose can then be blocked, probed, and developed with suitable immuno-reagents. The CoFi blot has been shown to be accurate and robust when benchmarked against the commonly used 96-well screens. It has been successfully used to screen for increased expression in several deletion libraries of hard-to-express proteins [73], has been shown to be useful for screening other types of gene collections [74], and has even been adapted for membrane proteins [75]. A few additional colony screens exist, which allow for direct screening of recombinant overexpression [55, 76–82]. All of them have the advantage of being very powerful, in that they can screen large gene libraries in one experiment, but they also suffer from some drawbacks such as poor resolution, false positives, large reporter proteins fused to the protein of interest, and subsequent re-cloning that might ultimately change the expression of the target. In addition, none of these methods allows for screening of increased thermal stability. As for screening for increased stability, there exist a handful of generic screens that are protein-independent, such as Proside, THR, and Tripartite Fusion System [83–86]. All of them have high screening capacity and are relatively inexpensive, but rely on the indirect read-out of a reporter protein or are limited to proteins that can be displayed on a phage.

1.1 Hot CoFi Blot

To allow the identification of more thermally stable protein variants, we expanded the CoFi blot to include a heating step before lysis, the Hot CoFi blot [87]. The heating step will induce irreversible protein aggregation inside the bacterial colony, and upon lysis only thermally stable proteins will diffuse through the filter membrane and be detected on the nitrocellulose (Fig. 1A–H). The Hot CoFi blot has been benchmarked against commonly used TSAs and has allowed us to accurately determine the melting temperature (Tm) for a diverse set of proteins of mammalian and viral origin. We also successfully increased their thermal stability as compared with wild type (WT) by introducing random mutations by epPCR. Our protocol for creating these mutagenesis libraries is straightforward and HTP, as it allows us to create 10–12 libraries in parallel. Below we describe, in detail, how to generate an epPCR library and screen it with the Hot CoFi blot to identify more thermostable variants of a target protein. We also provide, as an example, the detailed sequence, primers, and screening temperatures used to generate a more thermally stable variant of the Tobacco Etch Virus (TEV) protease carrying a C-terminal His-tag. In our hands, after two rounds of random mutagenesis by epPCR and Hot CoFi screening, the Tm of TEV protease was increased by almost 15 °C without loss of activity as compared to WT. The increase in thermal stability was assessed with Differential Scanning Fluorimetry (DSF) and protease activity was assessed with an activity assay.

Fig. 1 Schematic of the Colony Filtration blot procedure. The heating is performed in steps E and F


All steps in the protocol have been highly optimized to yield as much data and results as possible at a low cost while keeping to an HTP framework. For instance, to maintain throughput, epPCR was preferred over other methods such as saturation mutagenesis, MEGAWHOP was preferred over restriction cloning, and transformation by electroporation was preferred over chemical transformation. Freeze-thawing in a home-made lysis buffer has been preferred over adding specific commercial lysis buffers as it has, in our lab, been shown to be as efficient and a less costly alternative. A His-probe is commonly preferred in our lab over a His-antibody because of its lower cost and because it mimics binding to a Ni2+-conjugate, which would be used in any subsequent IMAC purification. Even though we routinely look for overexpression in bacteria, this method can be applied to any colony-forming organism.

2 Materials

2.1 Making Electrocompetent Cells

Reagents

1. E. coli cloning strain, e.g., DH10B (Thermo Fisher Scientific).
2. SOB media.
3. 1 mM HEPES pH 7.5 autoclaved or filter-sterilized (0.2 μm).
4. 10% Glycerol in MilliQ water autoclaved or filter-sterilized (0.2 μm).

2.2 Creating a Randomly Mutated Library with epPCR

Reagents

1. Protein expression vector with C-terminal tag (carrying gene of interest), e.g., pET24a (Novagen).
2. Vector-specific primers flanking the gene of interest.
3. T4 Polynucleotide Kinase, T4 DNA ligase buffer, DpnI and Taq DNA ligase (NEB).
4. Genemorph II Mutagenesis Kit (Agilent Technologies).
5. DNA, PCR and gel purification kits (Qiagen).
6. KOD HotStart and KOD Xtreme polymerases (Merck-Millipore).

2.3 Screening a Mutagenesis Library with the Hot CoFi Blot

1. E. coli strain for protein expression such as Rosetta2 (DE3) from Novagen.
2. LB-Agar plates supplemented with appropriate antibiotics.
3. LB-Agar plates supplemented with appropriate antibiotics and 0.2 mM IPTG.
4. SOC medium.
5. Chicken egg white lysozyme.
6. Bovine Serum Albumin.
7. Benzonase endonuclease (Merck-Millipore).
8. Protease Inhibitor tablets, e.g., Inhibitor Cocktail Set III, EDTA-Free (Merck-Millipore).
9. CoFi lysis buffer: 20 mM Tris, pH 8.0, 100 mM NaCl, 0.2 mg/mL lysozyme, 11.2 U/mL Benzonase endonuclease and Protease Inhibitor (1 tablet per 100 mL of buffer).
10. TBS-T buffer: 20 mM Tris, pH 7.5, 500 mM NaCl, 0.05% (vol/vol) Tween 20.
11. Anti-His probe (Thermo Fisher Scientific).
12. Super Signal West Dura chemiluminescence kit (Thermo Fisher Scientific).

Equipment and Consumables

1. Petri dishes (sterile): round of 9 cm and 15 cm diameter, and square of 24.5 cm side length, e.g., from NUNC (Thermo Fisher Scientific).
2. Durapore membrane (Merck-Millipore).
3. Nitrocellulose membrane (Thermo Fisher Scientific).
4. Whatman paper (Whatman).
5. Sterilized 96-well deep-well flat-bottom plates (Thermo Fisher Scientific).
6. 96-well 0.45 μm PVDF filter plates, e.g., MultiScreen plates (Merck-Millipore).
7. CCD camera such as LAS-4000 (Fujifilm).
8. Incubator for bacterial liquid growth, e.g., LEX bioreactor system from Harbinger Biotechnology & Engineering (Toronto, ON, Canada).
9. Plate incubator with reliable temperature control and temperature limit above 80 °C.
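The CoFi lysis buffer recipe can be turned into pipetting volumes with the usual C1·V1 = C2·V2 dilution rule. The sketch below is illustrative only: the final concentrations come from the recipe above, but every stock concentration (1 M Tris, 5 M NaCl, 50 mg/mL lysozyme, 250 U/μL Benzonase) is an assumption made for this example and should be replaced with the values of the stocks actually at hand.

```python
def stock_volume(final_conc, stock_conc, final_volume):
    """Solve C1*V1 = C2*V2 for V1; both concentrations must share one unit."""
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be more concentrated than the target")
    return final_conc * final_volume / stock_conc


def cofi_lysis_recipe(final_volume_ml=100.0):
    """Volumes needed for the CoFi lysis buffer, using hypothetical stocks."""
    return {
        # Assumed stock: 1 M (1000 mM) Tris pH 8.0; target 20 mM.
        "tris_ml": stock_volume(20.0, 1000.0, final_volume_ml),
        # Assumed stock: 5 M (5000 mM) NaCl; target 100 mM.
        "nacl_ml": stock_volume(100.0, 5000.0, final_volume_ml),
        # Assumed stock: 50 mg/mL lysozyme; target 0.2 mg/mL.
        "lysozyme_ml": stock_volume(0.2, 50.0, final_volume_ml),
        # Assumed stock: 250 U/uL Benzonase; target 11.2 U/mL.
        "benzonase_ul": 11.2 * final_volume_ml / 250.0,
        # One protease inhibitor tablet per 100 mL of buffer.
        "inhibitor_tablets": final_volume_ml / 100.0,
    }


if __name__ == "__main__":
    for name, amount in cofi_lysis_recipe(100.0).items():
        print(f"{name}: {amount:.2f}")
```

The remaining volume is made up with water; pH adjustment of the Tris stock is assumed to have been done beforehand.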

3 Methods

3.1 Making Electrocompetent Cells

1. Start an overnight culture (2.5–5 mL per 250 mL culture) in SOB at 37 °C.
2. The next day, inoculate 2–8 × 250 mL of SOB with 2.5 mL of starter culture each (baffled flasks). Each 250 mL culture will yield approximately 1–3 mL of electrocompetent cells.
3. Incubate at 37 °C, shaking at 200 rpm.
4. When OD600 = 0.6–1.0 (ideally 0.8), place the cultures on ice, in the cold room, for 30 min. From this point, work in the cold room (or on ice) all the time. Record the OD obtained.
5. Centrifuge cultures for 10 min at 5000 × g, at 4 °C, in autoclaved 250 mL flat-bottom tubes (see Note 1).
6. Resuspend the pellets in 200 mL ice-cold 1 mM HEPES, pH 7.5 by placing them in a shaker in the cold room (220 rpm).
7. Centrifuge cultures for 10 min at 5000 × g, at 4 °C (see Note 2).
8. Repeat steps 6 and 7.
9. Resuspend the pellets in 200 mL ice-cold 10% glycerol in MilliQ water by placing them in the shaker.
10. Centrifuge cultures for 10 min at 5000 × g, at 4 °C.
11. Repeat steps 9 and 10.
12. Resuspend the pellet by gentle shaking (no extra buffer or glycerol needed).
13. Aliquot 110 μL cells in 500 μL tubes (everything on ice or cold room).
14. Immediately place tubes in dry ice. Incubate for 30 min.
15. Store at −80 °C (stable for ~6 months).

3.2 Creating a Randomly Mutated Library with epPCR

3.2.1 Primer Design

For targets 1000 bp long or less: Use vector-specific primers flanking the gene of interest (see Note 3).
For targets >1000 bp long: success rates of the whole procedure are reduced for longer genes. Split the gene in two and create 2 libraries instead. To do so, find a short GC-rich sequence in the middle of the gene. Use the sense (5′–3′) strand of this sequence as forward primer for part-2 of the gene, and the reverse-complementary (3′–5′ orientation) as reverse primer for part-1 of the gene. For part-1: vector-specific forward and middle-primer reverse; for part-2: middle-primer forward and vector-specific reverse.

3.2.2 Kinase Treatment of Primers

Phosphorylation of primers is required for improved performance of the MEGAWHOP reaction. Use only freshly prepared phosphorylated primers, and handle them on ice as phosphorylation is thermolabile.

1. Mix the following:
   Forward primer (20 μMᵃ)           5 μL
   Reverse primer (20 μMᵃ)           5 μL
   NEB T4 DNA ligase buffer 10×      2 μL
   NEB T4 PNK (10 U/μL)              1 μL
   H2O                               7 μL
   ᵃ Adjust volumes accordingly if the primer concentration is not 20 μM
2. Incubate at 37 °C for 30–60 min.

3.2.3 Error-Prone PCR

This error-prone protocol is optimized to lead to 1–4 mutations per gene.

1. Create the error-prone PCR mix:
   Template (500 ngᵃ)            X μL
   10× Mutazyme II buffer        5 μL
   40 mM dNTP                    1 μL
   Primer mix (10 μM)            5 μL
   Mutazyme II polymerase        1 μL
   ddH2O                         Up to 50 μL
   ᵃ Template amount can be reduced to increase the mutation rate, and increased to reduce it
2. Program the thermal cycler and run the PCR reaction according to the following:
   Step   Number of cycles   Temperature         Duration
   1      1                  95 °C               2 min
   2      30                 95 °C               30 s
                             Primer Tm − 5 °C    30 s
                             72 °C               1 min/kb
   3      1                  72 °C               10 min
   4      1                  4 °C                Hold
3. Analyze the epPCR product on an agarose gel (see Note 4).

3.2.4 Gel Purification of epPCR Product

1. Cast an agarose gel. To do so, mix 1.5 g agarose + 150 mL 1× TAE buffer. Heat until homogeneous, add 4 μL SYBR-Safe 10,000×. Cast and add combs.
2. Mount the gel into the agarose gel electrophoresis system, and add 1× TAE buffer, just enough to cover the gel.
3. Add 10 μL of 6× Loading Dye to 50 μL of epPCR product. Mix and load 30 μL per well.
4. Run gel at 100 V for 30 min.
5. Take a picture of the gel using blue or UV light (see Fig. 2).
6. Cut the desired band using a brand-new blade. Use different blades for different targets, to avoid cross-contamination (see Note 5). Transfer the gel sections to 2 mL Eppendorf tubes. All gel fragments corresponding to the same PCR insert should be pooled into the same tube, to obtain the highest concentration of insert possible from the purification.

Gel Purification Procedures

1. Mix gel with 3 volumes of QG buffer. If you have many samples, use an excess of QG buffer: 1 mL.
2. Dissolve gel at 50 °C for 10 min. Every 2–3 min, give it a good shake or vortex the tubes.
3. Add 1 volume of isopropanol.
4. Mix well, transfer to a gel purification column. If the volume exceeds 800 μL, load sequentially onto the same column (do not split samples into new columns).
5. Centrifuge at 4000 × g for 15 s. Reload the flow-through and centrifuge again.
6. Discard flow-through and add 750 μL of PE buffer.
7. Centrifuge at 17,900 × g for 1 min. Discard flow-through and centrifuge again.
8. Place column on a new Eppendorf tube.
9. Add 35 μL of EB buffer.
10. Incubate for 15 min at room temperature.
11. Centrifuge at 17,900 × g for 1 min.

Fig. 2 Agarose gel showing examples of epPCR products from 8 reactions run in parallel. [Gel image: ladder lanes flank the error-prone PCR products; template and insert bands are indicated; ladder marks at 5,000, 2,000, 1,000, 500, and 100 bp]

3.2.5 Library Creation by Megaprimer PCR of Whole Plasmid (MEGAWHOP)

1. Create a MEGAWHOP PCR mix with KOD Xtreme (the table below represents an ideal mix).

Component                  Volume
Template (10 ng/μL)        1 μL
2× Xtreme Buffer           25 μL
2 mM dNTP                  10 μL
50 mM NAD+                 1 μL
Taq ligase                 1 μL
Megaprimer                 X μL (1 ng/bp; see footnote a)
KOD Xtreme polymerase      0.5 μL
ddH2O                      Up to 50 μL

a Megaprimer amount: 1 ng of insert per base pair of insert length, e.g., 500 ng for a gene 500 nucleotides long
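The megaprimer row of this mix can be worked out numerically. The following is a minimal sketch, not part of the original protocol: it computes the megaprimer mass (1 ng per bp), the volume needed at a given eluate concentration, and the water required to reach 50 μL. The function name, dictionary keys, and example concentrations are hypothetical; the fixed component volumes are those listed for the KOD Xtreme mix. The 11.5 μL limit stated in the protocol corresponds to 50 μL minus the sum of the fixed components.

```python
# Illustrative helper; names are hypothetical and not taken from the protocol.
# Fixed component volumes (uL) of the KOD Xtreme MEGAWHOP mix, as listed in the mix table.
XTREME_FIXED_UL = {
    "template": 1.0,
    "2x Xtreme buffer": 25.0,
    "2 mM dNTP": 10.0,
    "50 mM NAD+": 1.0,
    "Taq ligase": 1.0,
    "KOD Xtreme polymerase": 0.5,
}
TOTAL_UL = 50.0


def megaprimer_plan(insert_bp, conc_ng_per_ul, fixed_ul=XTREME_FIXED_UL):
    """Return megaprimer mass, volume, water volume, and whether the mix fits in 50 uL."""
    if insert_bp <= 0 or conc_ng_per_ul <= 0:
        raise ValueError("insert length and concentration must be positive")
    mass_ng = 1.0 * insert_bp                      # rule from the footnote: 1 ng per bp
    volume_ul = mass_ng / conc_ng_per_ul
    headroom_ul = TOTAL_UL - sum(fixed_ul.values())  # volume left for megaprimer + water
    return {
        "mass_ng": mass_ng,
        "megaprimer_ul": round(volume_ul, 2),
        "headroom_ul": headroom_ul,
        "water_ul": round(max(headroom_ul - volume_ul, 0.0), 2),
        "fits": volume_ul <= headroom_ul,
    }


if __name__ == "__main__":
    # Hypothetical eluate concentrations for a 500 bp insert.
    print(megaprimer_plan(insert_bp=500, conc_ng_per_ul=50))
    print(megaprimer_plan(insert_bp=500, conc_ng_per_ul=20)["fits"])
```

When `fits` is false, the megaprimer eluate is too dilute for the KOD Xtreme mix, which is the situation the protocol addresses by pooling epPCR reactions or switching to KOD HotStart.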


Ignacio Asial et al.

When using KOD Xtreme, the desired amount of megaprimer must fit in a maximum volume of 11.5 μL. If this condition is not met, KOD HotStart can be used (see below), although this polymerase performs less efficiently. Instead (or in parallel), we recommend setting up two new epPCR reactions per target, optimizing annealing temperatures and extension times if needed. Combine both PCR products for gel purification to increase the yield, thus allowing the use of KOD Xtreme.

2. Program the thermal cycler and run the KOD Xtreme PCR reaction according to the table below.

Step   Number of cycles   Temperature (°C)   Duration
1      1                  94                 2 min
2      30                 98                 10 s
                          65 (a)             30 s
                          68                 1 min/kb (b)
4      1                  4                  Hold

a The annealing temperature can be optimized for higher yield. Use the highest temperature possible to reduce undesired MEGAWHOP products (i.e., unspecific extensions)
b 1 min per kb of vector length

Alternative 1: Only run this PCR reaction with KOD HotStart if the concentration of megaprimer is low.

Component                  Volume
Template (10 ng/μL)        1 μL
10× HotStart Buffer        5 μL
25 mM MgSO4                3 μL
2 mM dNTP                  5 μL
DMSO                       2.5 μL
50 mM NAD+                 1 μL
Taq ligase                 1 μL
Megaprimer                 X μL (1 ng/bp; see megaprimer footnote above)
KOD HotStart polymerase    0.5 μL
ddH2O                      Up to 50 μL

Alternative 2: The reaction for KOD HotStart PCR should be run with the following settings.


Step   Number of cycles   Temperature (°C)   Duration
1      1                  95                 2 min
2      30                 95                 20 s
                          65 (a)             10 s
                          68                 25 s/kb (b)
4      1                  4                  Hold

a The annealing temperature can be optimized for higher yield. Use the highest temperature possible to reduce undesired MEGAWHOP products (i.e., unspecific extensions)
b 25 s per kb of vector length
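Footnote b ties the extension step to the vector length for both polymerases. As a rough sketch (the function name and the 5.3 kb example vector are assumptions for illustration, and rounding up to whole seconds is a choice made here rather than a requirement of the protocol), the per-cycle extension time can be computed as follows.

```python
import math

# Extension rates from the two cycling tables, in seconds per kb of vector.
EXTENSION_S_PER_KB = {"KOD Xtreme": 60, "KOD HotStart": 25}


def extension_seconds(vector_bp, polymerase):
    """Extension time per cycle in seconds, rounded up to the next whole second."""
    if vector_bp <= 0:
        raise ValueError("vector length must be positive")
    rate = EXTENSION_S_PER_KB[polymerase]
    return math.ceil(rate * vector_bp / 1000)


if __name__ == "__main__":
    # 5300 bp is a hypothetical vector length used only as an example.
    for name in EXTENSION_S_PER_KB:
        print(name, extension_seconds(5300, name), "s")
```

For the hypothetical 5.3 kb vector, this gives 318 s with KOD Xtreme and 133 s with KOD HotStart.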

3.2.6 DpnI Treatment of MEGAWHOP Product

1. Add 3 μL of DpnI (10 U/μL) to the PCR product. This is in excess to account for the reduced performance of the enzyme in the PCR reaction buffer, and reduced digestion of hemimethylated DNA.
2. Incubate for 3 h at 37 °C.

3.2.7 PCR Purification of the MEGAWHOP Product

1. Mix sample with 5 volumes of QG buffer and 15 μL of 1.5 M NaOAc, pH 5.2.
2. Mix well and transfer to a PCR purification column.
3. Centrifuge at 4000 × g for 15 s. Reload the flow-through and centrifuge again.
4. Discard the flow-through and add 750 μL of PE buffer.
5. Centrifuge at 17,900 × g for 1 min. Discard the flow-through and centrifuge again.
6. Place the column in a new Eppendorf tube.
7. Add 35 μL of EB buffer.
8. Incubate for 15 min at room temperature. Do not shorten this incubation, as the yield would decrease.
9. Centrifuge at 17,900 × g for 1 min.

3.2.8 Check Purified MEGAWHOP Product on Agarose Gel

1. Load 3 μL of purified MEGAWHOP product onto an agarose gel.
2. Take a picture of the gel (Fig. 3).

3.2.9 Electroporation of MEGAWHOP Product

These procedures are preferably done in the cold room (see Note 6).

1. Chill a 1 mm cuvette on ice for 30 min.
2. Thaw 100 μL of electrocompetent cells (DH10B) on ice for 15 min.
3. Mix 15 μL of ice-cold purified MEGAWHOP product with 100 μL of cells.

Fig. 3 Agarose gel showing examples of MEGAWHOP products from 5 reactions run in parallel. Labels: ladder, MEGAWHOP products (libraries), over-amplified vector, and unused inserts; ladder marks at 5,000, 3,000, 2,000, 1,500, 1,000, 750, 500, 300, and 100 bp

4. Transfer the mix into the cuvette. Tap the cuvette gently to make sure that there are no bubbles.
5. Electroporate with the following settings, making sure that the time constant (Tc) is 3–5 ms.

Voltage        2000 V
Resistance     200 Ω
Capacitance    25 μF

6. Immediately add 1 mL of ice-cold SOC medium.
7. Transfer the cells to a 2 mL Eppendorf tube, and incubate for 1 h at 37 °C, with shaking.
8. Use 10 μL of the cells to determine library size: perform six 1/10 serial dilutions (10⁻¹ to 10⁻⁶). Plate 5 μL of each dilution onto an agar plate containing the appropriate antibiotic. Incubate overnight at 37 °C (see Note 7).
9. Transfer the remaining cells to 50 mL of TB containing the appropriate antibiotic.
10. Incubate overnight at 37 °C with shaking.

3.2.10 Purification of Libraries

1. Transfer 2 × 1 mL of the library culture into two 2 mL Eppendorf tubes (1 mL per tube).
2. Perform library DNA purification using the Qiaprep Spin Miniprep kit (or equivalent) according to the manufacturer’s instructions (see Note 8).
3. Pellet the remainder of the culture (~45 mL) and store at −20 °C in case new purifications are required.
4. Check the quality and estimate the amount of the purified library on an agarose gel (see Note 9).
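The library-size titration in step 8 of Subheading 3.2.9 reduces to simple arithmetic. The sketch below is an illustration under stated assumptions, not the authors' calculation: it assumes a total recovery volume of about 1.1 mL (100 μL cells, 15 μL DNA, and 1 mL SOC) and that 5 μL of each dilution is plated. The function name and the example colony count are hypothetical.

```python
def estimate_library_size(colonies, dilution_exponent,
                          plated_ul=5.0, recovery_ul=1115.0):
    """Estimate total transformants from the colony count of one dilution spot.

    colonies          -- colonies counted for the chosen dilution
    dilution_exponent -- n for a 10^-n dilution (1 to 6 in the protocol)
    plated_ul         -- volume plated per dilution (5 uL in the protocol)
    recovery_ul       -- ASSUMED total volume after adding SOC
                         (100 uL cells + 15 uL DNA + 1000 uL SOC)
    """
    if not 1 <= dilution_exponent <= 6:
        raise ValueError("the protocol uses dilutions 10^-1 to 10^-6")
    if colonies < 0:
        raise ValueError("colony count cannot be negative")
    cfu_per_ul = colonies * 10 ** dilution_exponent / plated_ul
    return cfu_per_ul * recovery_ul


if __name__ == "__main__":
    # Hypothetical count: 30 colonies on the 10^-3 dilution spot.
    print(f"{estimate_library_size(30, 3):.2e} transformants")
```

Counting the dilution spot with well-separated colonies and repeating the estimate on a neighboring dilution gives a quick consistency check.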


3.3 Screening a Mutagenesis Library with the Hot CoFi Blot


Our preferred method of transformation is electroporation, as the transformation efficiencies are very reproducible and the number of colonies obtained can be predicted. Typically, if the electrocompetent cells were prepared according to the protocol described above, 5 ng of library electroporated into 50 μL of electrocompetent Rosetta2 cells gives 10,000–100,000 colonies, using the settings described above.

1. Transform an E. coli expression strain with the library by electroporation.
2. Add 1 mL SOC medium and incubate for 1 h at 37 °C.
3. Plate different volumes onto several 24.5 cm square petri dishes with LB agar supplemented with the appropriate selective antibiotics. You need about 200 mL LB agar to cover one plate. For example, if using Rosetta2 cells and a pET24a vector, use LB agar supplemented with 50 μg/mL kanamycin and 34 μg/mL chloramphenicol.
4. Grow plates overnight at 37 °C. These are your “master” plates (see Fig. 1a).
5. The next day, cut Durapore membranes to an exact size of 22 × 22 cm (see Note 10). Cut one corner of the membrane to facilitate alignment.
6. Select the master plates containing approximately 10,000–20,000 colonies (see Note 11).
7. Using a pair of ethanol-sterilized tweezers, place the Durapore membrane on top of the master plate, making sure to avoid bubbles. Incubate for about 1 min at room temperature (see Fig. 1b).
8. Remove the Durapore membrane with a peeling motion and flip it upside down. Place it, with colonies facing up, on top of an LB agar plate supplemented with antibiotics and an inducer of protein expression, e.g., 0.2 mM IPTG if using a pET24a vector. This is your “induction” plate (see Fig. 1c and Note 12).
9. Incubate the induction plate overnight at RT (see Fig. 1d). Keep the master plates at 4 °C.
10. The next day, cut one nitrocellulose membrane and one Whatman paper per induction plate, to an exact size of 22 × 22 cm.
11. Place the Whatman paper into an empty 24.5 cm square petri dish, and the nitrocellulose membrane on top of it. Cover both with lysis buffer (approximately 100 mL), then remove all the buffer. Remove any bubbles with a plastic brayer.
12. For the colony filtration blot method, transfer the Durapore membrane on top of the nitrocellulose membrane. This is your “lysis” plate. Incubate for 30–60 min at room temperature (see Fig. 1e, f).


13. If performing the Hot CoFi method, incubate the induction plate at the desired temperature, typically 5–10 °C above the wild-type protein melting temperature, for 30–60 min. Then transfer the Durapore membrane on top of the nitrocellulose membrane and incubate at the same temperature for an additional 30–60 min (see Fig. 1e, f).
14. Mark the Durapore membrane’s cut corner with a ballpoint pen. This will enable alignment later on.
15. Perform three freeze-thaw cycles by placing the lysis plate at −80 °C for 30 min, then at RT for 30 min (until the plate is fully thawed). These cycles improve cell lysis efficiency.
16. Discard the Whatman paper and the Durapore membrane from the lysis plate, leaving only the nitrocellulose membrane. Cover the membrane with block buffer, and incubate for 1 h at RT, with very gentle mixing using a rock shaker (see Note 13).
17. Wash the nitrocellulose membrane three times with ~200 mL PBS-T, with 10 min incubations with shaking.
18. Cover the nitrocellulose membrane with an HRP-conjugated antibody or probe directed against your target protein. We use the HisProbe-HRP conjugate diluted 1:5000 in block buffer. Incubate for 1 h at RT with shaking.
19. Wash the nitrocellulose membrane three times with ~200 mL PBS-T, with 10 min incubations with shaking.
20. Cover the nitrocellulose membrane with approximately 7 mL Super Signal West Dura chemiluminescence solution. Make sure the solution covers the whole membrane by tilting the membrane in different directions. Discard any excess liquid.
21. Use a CCD camera to record the chemiluminescence signal over time (see Fig. 1g), e.g., 2 min recording time, with images taken every 10 s. Take note of the orientation of the plate (using the mark made previously); this will facilitate the alignment of the image later.
22. Select the image that gives the highest contrast between positive and negative colonies, while still allowing you to visualize negative ones. Crop the image to the size of the nitrocellulose membrane, and resize it to 22 × 22 cm. Mirror-flip the image and print it. Make sure the printer is set to print without resizing.
23. Take the master plates out of 4 °C and let them warm up to RT. It may be necessary to open the lid to eliminate condensation.
24. Place the printed image below the master plate, and align the colonies on the plate to the dots in the image (see Fig. 1h).


25. Pick the positive colonies and re-streak them onto new LB-agar plates supplemented with antibiotics. Grow overnight at 37 °C.

3.3.1 Validation of Variants Obtained

1. To validate the clones picked from the master plate, pick 4–8 colonies from each streaked agar plate, and grow them individually in 1 mL TB supplemented with antibiotics, overnight at 37 °C. 96-well deep-block plates are ideal for this purpose. Include the wild-type clone in triplicate.
2. Take 2 μL from each culture and plate them, as arrayed dots, on a 15 cm round Petri dish (see Note 14). The remaining culture can be used to make glycerol stocks, which are stored at −80 °C.
3. Let the bacterial lawns grow for 4–5 h at 37 °C, then perform the CoFi or Hot CoFi blot as previously described.
4. The clones giving signals at least two times higher than wild type are considered positives. They should be picked for plasmid isolation and sequencing.
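The positive call in step 4 can be expressed as a simple threshold once spot intensities have been quantified. The sketch below assumes that intensities have already been extracted from the blot image with software of your choice, and that the mean of the wild-type triplicate serves as the reference; both are assumptions, as the protocol does not specify how the signal is quantified. Clone names and values are invented for illustration.

```python
from statistics import mean


def call_positives(signals, wild_type_replicates, fold=2.0):
    """Return the clones whose signal is at least `fold` times the wild-type mean."""
    if not wild_type_replicates:
        raise ValueError("at least one wild-type measurement is required")
    reference = mean(wild_type_replicates)
    return sorted(name for name, value in signals.items()
                  if value >= fold * reference)


if __name__ == "__main__":
    wild_type = [100.0, 110.0, 90.0]          # hypothetical triplicate intensities
    clones = {"A1": 150.0, "A2": 205.0, "A3": 420.0, "A4": 199.0}
    print(call_positives(clones, wild_type))
```

Keeping the fold threshold as a parameter makes it easy to apply a stricter cutoff when many clones pass the twofold criterion.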

3.3.2 Increasing Thermal Stability Further

The positive clones obtained in the first CoFi or Hot CoFi blot can be combined, and used as a new “wild-type” template to perform new rounds of mutagenesis and screening.

3.3.3 An Example: TEV Protease

As an example, we provide the details for performing the Hot CoFi blot on the high-expression variant of TEV protease developed by van den Berg et al. [38].

TEV protease “wild-type” sequence:
ATGCGTTCATCAACAAGTTTGTACAAAAAAGCAGGCTCGG
GAGAAAGCTTGTTTAAGGGACCACGTGATTATAACCCGATATC
GAGCTCCATTTGTCATTTGACGAATGAATCTGATGGGCACACAA
CATCGTTGTATGGTATTGGATTTGGTCCCTTCATCATTACAAA
CAAGCACTTGTTTAGAAGAAATAATGGAACACTGTTGGTCCAAT
CACTACATGGTGTATTCAAGGTCAAGGACACCACGACTTTG
CAACAACACCTCGTTGATGGGAGGGACATGATAATTATTCG
CATGCCTAAGGATTTCCCACCATTTCCTCAAAAGCTGAAATTTA
GAGAGCCACAAAGGGAAGAGCGCATATGTCTTGTGACAAC
CAACTTCCAAACTAAGAGTATGTCTAGCATGGTGTCAGACAC
TAGTTGCACATTCCCTTCATCTGATGGCATATTCTGGAAG
CATTGGATTCAAACCAAGGATGGGCAGTGTGGCAGTCCATTAG
TATCAACTAGAGATGGGTTCATTGTTGGTATACACTCAGCATC
GAATTTCACCAACACAAACAATTATTTCACAAGCGTGCC
GAAAAACTTCATGGAATTGTTGACAAATCAGGAGGCGCAG
CAGTGGGTTAGTGGTTGGCGATTAAATGCTGACTCAG
TATTGTGGGGGGGCCATAAAGTTTTCATGAACAAACCTGAA
GAGCCTTTTCAGCCAGTTAAGGAAGCGACTCAACTCATGAAT
GAATTGGTGTACTCGCAAGACCCAGCTTTCTTGTA
CAAAGTGGTTGATCTGGAGGGCCCGCGGTTCGAAGGTAAGCC
TATCCCTAACCCTCTCCTCGGTCTCGATTCTACGCGTACCGGT
CATCATCACCATCACCATTGA

epPCR forward primer: 5′-GTACAAAAAAGCAGGCTCGGG-3′
epPCR reverse primer: 5′-CACTTTGTACAAGAAAGCTGGGTC-3′
Tm of wild-type protein, as determined by DSF: 52 °C
Temperature for round 1 of Hot CoFi: 60–65 °C
Temperature for round 2 of Hot CoFi: 70–75 °C
Activity assay: Cleavage of a substrate containing a tandem of two SUMO moieties, connected by a linker containing the TEV protease recognition sequence: SUMO-(GGGS)2-ENLYFQ-(GGGS)2-SUMO. Cleavage was monitored by SDS-PAGE. In our hands, roughly 50% of the clones obtained maintained activity.
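As a quick sanity check before ordering primers, one can confirm that both epPCR primers anneal to the template. The sketch below is illustrative only: the function names are hypothetical, the two template strings are short excerpts copied from the 5′ and 3′ regions of the TEV construct sequence given in this example, and the reverse primer is taken to end in C (the residue printed just before its 3′ label).

```python
# Translation table for complementing uppercase DNA.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def primer_sites(template, forward, reverse):
    """0-based start of each primer's site on the sense strand (-1 if absent)."""
    return template.find(forward), template.find(reverse_complement(reverse))


# Excerpts from the 5' and 3' regions of the TEV construct sequence.
FIVE_PRIME = "ATGCGTTCATCAACAAGTTTGTACAAAAAAGCAGGCTCGGGAGAAAGCTTG"
THREE_PRIME = "GAATTGGTGTACTCGCAAGACCCAGCTTTCTTGTACAAAGTGGTTGATCTGGAG"
FORWARD = "GTACAAAAAAGCAGGCTCGGG"
REVERSE = "CACTTTGTACAAGAAAGCTGGGTC"

if __name__ == "__main__":
    fwd_site, _ = primer_sites(FIVE_PRIME, FORWARD, REVERSE)
    _, rev_site = primer_sites(THREE_PRIME, FORWARD, REVERSE)
    print("forward primer found:", fwd_site >= 0)
    print("reverse primer found:", rev_site >= 0)
```

The same check can be run on the full construct sequence to estimate the amplicon length from the two site positions.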

4 Notes

1. Flat-bottom tubes help to obtain more “solid” pellets.
2. Pellets can be very fragile from this point on; pour off the supernatant carefully.
3. Short primers (55 °C. If these conditions are not met (e.g., for TA-rich segments), use gene-specific primers instead.
4. If you experience a low yield after epPCR, consider lowering the annealing temperature or setting up additional epPCR reactions and combining them for gel purification. If you see many undesired bands, consider increasing the annealing temperature or redesigning the primers to be more specific.
5. Cut the gel slice as small as possible, as extra gel considerably reduces the purification yield; the total gel weight per insert should not exceed ~300 mg.
6. Exposure of cells to temperatures above 4 °C reduces electroporation efficiency.
7. If you have too few colonies, check your electroporation procedure. The keys to success are working rapidly and following every step. Also check the quality of the electrocompetent cells: cells of low quality or more than 4 months old perform less efficiently. A poor number of transformants can also be due to low amounts of MEGAWHOP product. Optimal concentrations should be >20 ng/μL.
8. Do not increase the volume of culture to purify, as this leads to a lower purity of the final product.
9. If higher amounts of library are required, 10 mL of the culture can be used to perform a midi-prep, using the Qiagen Plasmid Midi kit according to the manufacturer’s instructions. Once again, do not exceed this volume, as it would lead to lower purity.
10. Precision is important, as it allows the developed nitrocellulose membrane to be aligned with the original master plate. Cut one corner of the membrane to facilitate alignment.
11. If there is a significant amount of condensation, open the plates and let them dry at room temperature until condensation is reduced. This facilitates colony transfer.
12. The IPTG concentration can be varied to find the optimum for the target protein studied; if expressing the target into the periplasm through a signal peptide and a pET24a vector, for example, we typically use lower IPTG concentrations of about 10 μM.
13. It is important to use a rock shaker and very low mixing speeds for the rest of the procedure, to ensure even signals after development. If only circular-motion plate shakers are available, use a shaking speed of 20 rpm or less.
14. If you have more than 96 clones to validate, use the bigger 24.5 cm square Bioassay plates.

Competing Financial Interests

S.D. and P.N. are co-founders of Evitra AB that owns the commercial rights to this method. I.A. and P.N. are co-founders of DotBio Pte. Ltd., which has licensed this method.

References

1. Arndt MA, Krauss J, Schwarzenbacher R, Vu BK, Greene S, Rybak SM (2003) Generation of a highly stable, internalizing anti-CD22 single-chain Fv fragment for targeting non-Hodgkin’s lymphoma. Int J Cancer 107(5):822–829. https://doi.org/10.1002/ijc.11451
2. Chirino AJ, Mire-Sluis A (2004) Characterizing biological products and assessing comparability following manufacturing changes. Nat Biotechnol 22(11):1383–1391. https://doi.org/10.1038/nbt1030
3. Huus K, Havelund S, Olsen HB, van de Weert M, Frokjaer S (2005) Thermal dissociation and unfolding of insulin. Biochemistry 44(33):11171–11177. https://doi.org/10.1021/bi0507940
4. Mulinacci F, Capelle MA, Gurny R, Drake AF, Arvinte T (2011) Stability of human growth hormone: influence of methionine oxidation on thermal folding. J Pharm Sci 100(2):451–463
5. Radek JT, Castellino FJ (1988) A differential scanning calorimetric investigation of the domains of recombinant tissue plasminogen activator. Arch Biochem Biophys 267(2):776–786
6. Schellekens H (2002) Bioequivalence and the immunogenicity of biopharmaceuticals. Nat Rev Drug Discov 1(6):457–462. https://doi.org/10.1038/nrd818
7. Willuda J, Honegger A, Waibel R, Schubiger PA, Stahel R, Zangemeister-Wittke U, Pluckthun A (1999) High thermal stability is essential for tumor targeting of antibody fragments: engineering of a humanized antiepithelial glycoprotein-2 (epithelial cell adhesion molecule) single-chain Fv fragment. Cancer Res 59(22):5758–5767
8. Wakankar AA, Feeney MB, Rivera J, Chen Y, Kim M, Sharma VK, Wang YJ (2010) Physicochemical stability of the antibody-drug conjugate Trastuzumab-DM1: changes due to modification and conjugation processes. Bioconjug Chem 21(9):1588–1595. https://doi.org/10.1021/bc900434c
9. Roberts CJ (2014) Protein aggregation and its impact on product quality. Curr Opin Biotechnol 30:211–217. https://doi.org/10.1016/j.copbio.2014.08.001
10. Erickson B, Nelson WP (2012) Perspective on opportunities in industrial biotechnology in renewable chemicals. Biotechnol J 7(2):176–185. https://doi.org/10.1002/biot.201100069
11. Luan CH, Qiu S, Finley JB, Carson M, Gray RJ, Huang W, Johnson D, Tsao J, Reboul J, Vaglio P, Hill DE, Vidal M, Delucas LJ, Luo M (2004) High-throughput expression of C. elegans proteins. Genome Res 14(10B):2102–2110. https://doi.org/10.1101/gr.2520504
12. Lesley SA (2001) High-throughput proteomics: protein expression and purification in the postgenomic world. Protein Expr Purif 22(2):159–164. https://doi.org/10.1006/prep.2001.1465
13. Sorensen HP, Mortensen KK (2005) Soluble expression of recombinant proteins in the cytoplasm of Escherichia coli. Microb Cell Factories 4(1):1. https://doi.org/10.1186/1475-2859-4-1
14. Sorensen HP, Mortensen KK (2005) Advanced genetic strategies for recombinant protein expression in Escherichia coli. J Biotechnol 115(2):113–128. https://doi.org/10.1016/j.jbiotec.2004.08.004
15. Chen Y, Qiu S, Luan CH, Luo M (2008) A high throughput platform for eukaryotic genes. Methods Mol Biol 426:209–220. https://doi.org/10.1007/978-1-60327-058-8_13
16. Knaust RK, Nordlund P (2001) Screening for soluble expression of recombinant proteins in a 96-well format. Anal Biochem 297(1):79–85. https://doi.org/10.1006/abio.2001.5331
17. Lueking A, Horn M, Eickhoff H, Bussow K, Lehrach H, Walter G (1999) Protein microarrays for gene expression and antibody screening. Anal Biochem 270(1):103–111. https://doi.org/10.1006/abio.1999.4063
18. Makrides SC (1996) Strategies for achieving high-level expression of genes in Escherichia coli. Microbiol Rev 60(3):512–538
19. Christendat D, Yee A, Dharamsi A, Kluger Y, Savchenko A, Cort JR, Booth V, Mackereth CD, Saridakis V, Ekiel I, Kozlov G, Maxwell KL, Wu N, McIntosh LP, Gehring K, Kennedy MA, Davidson AR, Pai EF, Gerstein M, Edwards AM, Arrowsmith CH (2000) Structural proteomics of an archaeon. Nat Struct Biol 7(10):903–909. https://doi.org/10.1038/82823
20. Edwards AM, Arrowsmith CH, Christendat D, Dharamsi A, Friesen JD, Greenblatt JF, Vedadi M (2000) Protein production: feeding the crystallographers and NMR spectroscopists. Nat Struct Biol 7(Suppl):970–972. https://doi.org/10.1038/80751
21. Roodveldt C, Aharoni A, Tawfik DS (2005) Directed evolution of proteins for heterologous expression and stability. Curr Opin Struct Biol 15(1):50–56. https://doi.org/10.1016/j.sbi.2005.01.001
22. Welch M, Govindarajan S, Ness JE, Villalobos A, Gurney A, Minshull J, Gustafsson C (2009) Design parameters to control synthetic gene expression in Escherichia coli. PLoS One 4(9):e7002. https://doi.org/10.1371/journal.pone.0007002
23. Bjork A, Dalhus B, Mantzilas D, Sirevag R, Eijsink VG (2004) Large improvement in the thermal stability of a tetrameric malate dehydrogenase by single point mutations at the dimer-dimer interface. J Mol Biol 341(5):1215–1226. https://doi.org/10.1016/j.jmb.2004.06.079
24. Eijsink VG, Bjork A, Gaseidnes S, Sirevag R, Synstad B, van den Burg B, Vriend G (2004) Rational engineering of enzyme stability. J Biotechnol 113(1–3):105–120. https://doi.org/10.1016/j.jbiotec.2004.03.026
25. Fitzgerald J, Lugovskoy A (2011) Rational engineering of antibody therapeutics targeting multiple oncogene pathways. MAbs 3(3):299–309
26. Courtois F, Schneider CP, Agrawal NJ, Trout BL (2015) Rational design of biobetters with enhanced stability. J Pharm Sci 104(8):2433–2440. https://doi.org/10.1002/jps.24520
27. Ahmad S, Kamal MZ, Sankaranarayanan R, Rao NM (2008) Thermostable Bacillus subtilis lipases: in vitro evolution and structural insight. J Mol Biol 381(2):324–340. https://doi.org/10.1016/j.jmb.2008.05.063
28. Dumon C, Varvak A, Wall MA, Flint JE, Lewis RJ, Lakey JH, Morland C, Luginbuhl P, Healey S, Todaro T, DeSantis G, Sun M, Parra-Gessert L, Tan X, Weiner DP, Gilbert HJ (2008) Engineering hyperthermostability into a GH11 xylanase is mediated by subtle changes to protein structure. J Biol Chem 283(33):22557–22564. https://doi.org/10.1074/jbc.M800936200
29. Giver L, Gershenson A, Freskgard PO, Arnold FH (1998) Directed evolution of a thermostable esterase. Proc Natl Acad Sci U S A 95(22):12809–12813
30. Hao J, Berry A (2004) A thermostable variant of fructose bisphosphate aldolase constructed by directed evolution also shows increased stability in organic solvents. Protein Eng Des Sel 17(9):689–697. https://doi.org/10.1093/protein/gzh081
31. Richardson TH, Tan X, Frey G, Callen W, Cabell M, Lam D, Macomber J, Short JM, Robertson DE, Miller C (2002) A novel, high performance enzyme for starch liquefaction. Discovery and optimization of a low pH, thermostable alpha-amylase. J Biol Chem 277(29):26501–26507. https://doi.org/10.1074/jbc.M203183200
32. Wang Q, Xia T (2008) Enhancement of the activity and alkaline pH stability of Thermobifida fusca xylanase A by directed evolution. Biotechnol Lett 30(5):937–944. https://doi.org/10.1007/s10529-007-9508-1
33. Worn A, Auf der Maur A, Escher D, Honegger A, Barberis A, Pluckthun A (2000) Correlation between in vitro stability and in vivo performance of anti-GCN4 intrabodies as cytoplasmic inhibitors. J Biol Chem 275(4):2795–2803
34. Zumarraga M, Bulter T, Shleev S, Polaina J, Martinez-Arias A, Plou FJ, Ballesteros A, Alcalde M (2007) In vitro evolution of a fungal laccase in high concentrations of organic cosolvents. Chem Biol 14(9):1052–1064. https://doi.org/10.1016/j.chembiol.2007.08.010
35. Warne T, Serrano-Vega MJ, Baker JG, Moukhametzianov R, Edwards PC, Henderson R, Leslie AG, Tate CG, Schertler GF (2008) Structure of a beta1-adrenergic G-protein-coupled receptor. Nature 454(7203):486–491. https://doi.org/10.1038/nature07101
36. Warne T, Serrano-Vega MJ, Tate CG, Schertler GF (2009) Development and crystallization of a minimal thermostabilised G protein-coupled receptor. Protein Expr Purif 65(2):204–213
37. Andrews SR, Taylor EJ, Pell G, Vincent F, Ducros VM, Davies GJ, Lakey JH, Gilbert HJ (2004) The use of forced protein evolution to investigate and improve stability of family 10 xylanases. The production of Ca2+-independent stable xylanases. J Biol Chem 279(52):54369–54379. https://doi.org/10.1074/jbc.M409044200
38. van den Berg S, Lofdahl PA, Hard T, Berglund H (2006) Improved solubility of TEV protease by directed evolution. J Biotechnol 121(3):291–298. https://doi.org/10.1016/j.jbiotec.2005.08.006
39. Burgess-Brown NA, Sharma S, Sobott F, Loenarz C, Oppermann U, Gileadi O (2008) Codon optimization can improve expression of human genes in Escherichia coli: a multi-gene study. Protein Expr Purif 59(1):94–102. https://doi.org/10.1016/j.pep.2008.01.008
40. Reetz MT, Wu S (2009) Laboratory evolution of robust and enantioselective Baeyer-Villiger monooxygenases for asymmetric catalysis. J Am Chem Soc 131(42):15424–15432. https://doi.org/10.1021/ja906212k
41. Gileadi O, Burgess-Brown NA, Colebrook SM, Berridge G, Savitsky P, Smee CE, Loppnau P, Johansson C, Salah E, Pantic NH (2008) High throughput production of recombinant human proteins for crystallography. Methods Mol Biol 426:221–246. https://doi.org/10.1007/978-1-60327-058-8_14
42. Saez NJ, Vincentelli R (2014) High-throughput expression screening and purification of recombinant proteins in E. coli. Methods Mol Biol 1091:33–53. https://doi.org/10.1007/978-1-62703-691-7_3
43. Savitsky P, Bray J, Cooper CD, Marsden BD, Mahajan P, Burgess-Brown NA, Gileadi O (2010) High-throughput production of human proteins for crystallization: the SGC experience. J Struct Biol 172(1):3–13. https://doi.org/10.1016/j.jsb.2010.06.008
44. Structural Genomics C, China Structural Genomics C, Northeast Structural Genomics C, Graslund S, Nordlund P, Weigelt J, Hallberg BM, Bray J, Gileadi O, Knapp S, Oppermann U, Arrowsmith C, Hui R, Ming J, dhe-Paganon S, Park HW, Savchenko A, Yee A, Edwards A, Vincentelli R, Cambillau C, Kim R, Kim SH, Rao Z, Shi Y, Terwilliger TC, Kim CY, Hung LW, Waldo GS, Peleg Y, Albeck S, Unger T, Dym O, Prilusky J, Sussman JL, Stevens RC, Lesley SA, Wilson IA, Joachimiak A, Collart F, Dementieva I, Donnelly MI, Eschenfeldt WH, Kim Y, Stols L, Wu R, Zhou M, Burley SK, Emtage JS, Sauder JM, Thompson D, Bain K, Luz J, Gheyi T, Zhang F, Atwell S, Almo SC, Bonanno JB, Fiser A, Swaminathan S, Studier FW, Chance MR, Sali A, Acton TB, Xiao R, Zhao L, Ma LC, Hunt JF, Tong L, Cunningham K, Inouye M, Anderson S, Janjua H, Shastry R, Ho CK, Wang D, Wang H, Jiang M, Montelione GT, Stuart DI, Owens RJ, Daenke S, Schutz A, Heinemann U, Yokoyama S, Bussow K, Gunsalus KC (2008) Protein production and purification. Nat Methods 5(2):135–146. https://doi.org/10.1038/nmeth.f.202
45. Vincentelli R, Cimino A, Geerlof A, Kubo A, Satou Y, Cambillau C (2011) High-throughput protein expression screening and purification in Escherichia coli. Methods 55(1):65–72. https://doi.org/10.1016/j.ymeth.2011.08.010
46. Lutz S (2010) Beyond directed evolution – semi-rational protein engineering and design. Curr Opin Biotechnol 21(6):734–743. https://doi.org/10.1016/j.copbio.2010.08.011
47. Yang JS, Seo SW, Jang S, Jung GY, Kim S (2012) Rational engineering of enzyme allosteric regulation through sequence evolution analysis. PLoS Comput Biol 8(7):e1002612. https://doi.org/10.1371/journal.pcbi.1002612
48. Steiner K, Schwab H (2012) Recent advances in rational approaches for enzyme engineering. Comput Struct Biotechnol J 2:e201209010. https://doi.org/10.5936/csbj.201209010
49. Jemli S, Ayadi-Zouari D, Hlima HB, Bejar S (2016) Biocatalysts: application and engineering for industrial purposes. Crit Rev Biotechnol 36(2):246–258. https://doi.org/10.3109/07388551.2014.950550
50. Lehmann M, Wyss M (2001) Engineering proteins for thermostability: the use of sequence alignments versus rational design and directed evolution. Curr Opin Biotechnol 12(4):371–375
51. Ewert S, Honegger A, Pluckthun A (2004) Stability improvement of antibodies for extracellular and intracellular applications: CDR grafting to stable frameworks and structure-based framework engineering. Methods 34(2):184–199. https://doi.org/10.1016/j.ymeth.2004.04.007
52. Bloom JD, Labthavikul ST, Otey CR, Arnold FH (2006) Protein stability promotes evolvability. Proc Natl Acad Sci U S A 103(15):5869–5874. https://doi.org/10.1073/pnas.0510098103
53. Hart DJ, Waldo GS (2013) Library methods for structural biology of challenging proteins and their complexes. Curr Opin Struct Biol 23(3):403–408. https://doi.org/10.1016/j.sbi.2013.03.004
54. Yumerefendi H, Desravines DC, Hart DJ (2011) Library-based methods for identification of soluble expression constructs. Methods 55(1):38–43. https://doi.org/10.1016/j.ymeth.2011.06.007
55. Yumerefendi H, Tarendeau F, Mas PJ, Hart DJ (2010) ESPRIT: an automated, library-based method for mapping and soluble expression of protein domains from challenging targets. J Struct Biol 172(1):66–74. https://doi.org/10.1016/j.jsb.2010.02.021
56. Coco WM, Levinson WE, Crist MJ, Hektor HJ, Darzins A, Pienkos PT, Squires CH, Monticello DJ (2001) DNA shuffling method for generating highly recombined genes and evolved enzymes. Nat Biotechnol 19(4):354–359. https://doi.org/10.1038/86744
57. Packer MS, Liu DR (2015) Methods for the directed evolution of proteins. Nat Rev Genet 16(7):379–394. https://doi.org/10.1038/nrg3927
58. Arnold FH (2009) How proteins adapt: lessons from directed evolution. Cold Spring Harb Symp Quant Biol 74:41–46. https://doi.org/10.1101/sqb.2009.74.046
59. Farinas ET, Bulter T, Arnold FH (2001) Directed enzyme evolution. Curr Opin Biotechnol 12(6):545–551
60. Tracewell CA, Arnold FH (2009) Directed enzyme evolution: climbing fitness peaks one amino acid at a time. Curr Opin Chem Biol 13(1):3–9. https://doi.org/10.1016/j.cbpa.2009.01.017
61. Romero PA, Arnold FH (2009) Exploring protein fitness landscapes by directed evolution. Nat Rev Mol Cell Biol 10(12):866–876. https://doi.org/10.1038/nrm2805
62. Ericsson UB, Hallberg BM, Detitta GT, Dekker N, Nordlund P (2006) Thermofluor-based high-throughput stability optimization of proteins for structural studies. Anal Biochem 357(2):289–298. https://doi.org/10.1016/j.ab.2006.07.027
63. Vedadi M, Arrowsmith CH, Allali-Hassani A, Senisterra G, Wasney GA (2010) Biophysical characterization of recombinant proteins: a key to higher structural genomics success. J Struct Biol 172(1):107–119. https://doi.org/10.1016/j.jsb.2010.05.005
64. Deller MC, Kong L, Rupp B (2016) Protein stability: a crystallographer’s perspective. Acta Crystallogr F Struct Biol Commun 72(Pt 2):72–95. https://doi.org/10.1107/S2053230X15024619
65. Hew K, Dahlroth SL, Veerappan S, Pan LX, Cornvik T, Nordlund P (2015) Structure of the Varicella Zoster virus thymidylate synthase establishes functional and structural similarities as the human enzyme and potentiates itself as a target of brivudine. PLoS One 10(12):e0143947. https://doi.org/10.1371/journal.pone.0143947
66. Hew K, Dahlroth SL, Venkatachalam R, Nasertorabi F, Lim BT, Cornvik T, Nordlund P (2013) The crystal structure of the DNA-binding domain of vIRF-1 from the oncogenic KSHV reveals a conserved fold for DNA binding and reinforces its role as a transcription factor. Nucleic Acids Res 41(7):4295–4306. https://doi.org/10.1093/nar/gkt082
67. Larsson EA, Jansson A, Ng FM, Then SW, Panicker R, Liu B, Sangthongpitag K, Pendharkar V, Tai SJ, Hill J, Dan C, Ho SY, Cheong WW, Poulsen A, Blanchard S, Lin GR, Alam J, Keller TH, Nordlund P (2013) Fragment-based ligand design of novel potent inhibitors of tankyrases. J Med Chem 56(11):4497–4508. https://doi.org/10.1021/jm400211f
68. Almqvist H, Axelsson H, Jafari R, Dan C, Mateus A, Haraldsson M, Larsson A, Martinez Molina D, Artursson P, Lundback T, Nordlund P (2016) CETSA screening identifies known and novel thymidylate synthase inhibitors and slow intracellular activation of 5-fluorouracil. Nat Commun 7:11040. https://doi.org/10.1038/ncomms11040
69. Guettou F, Quistgaard EM, Raba M, Moberg P, Low C, Nordlund P (2014) Selectivity mechanism of a bacterial homolog of the human drug-peptide transporters PepT1 and PepT2. Nat Struct Mol Biol 21(8):728–731. https://doi.org/10.1038/nsmb.2860
70. Ratanji KD, Derrick JP, Dearman RJ, Kimber I (2014) Immunogenicity of therapeutic proteins: influence of aggregation. J Immunotoxicol 11(2):99–109. https://doi.org/10.3109/1547691X.2013.821564
71. Cornvik T, Dahlroth SL, Magnusdottir A, Herman MD, Knaust R, Ekberg M, Nordlund P (2005) Colony filtration blot: a new screening method for soluble protein expression in Escherichia coli. Nat Methods 2(7):507–509. https://doi.org/10.1038/nmeth767
72. Dahlroth SL, Nordlund P, Cornvik T (2006) Colony filtration blotting for screening soluble expression in Escherichia coli. Nat Protoc 1(1):253–258. https://doi.org/10.1038/nprot.2006.39
73. Cornvik T, Dahlroth SL, Magnusdottir A, Flodin S, Engvall B, Lieu V, Ekberg M, Nordlund P (2006) An efficient and generic strategy for producing soluble human proteins and domains in E. coli by screening construct libraries. Proteins 65(2):266–273. https://doi.org/10.1002/prot.21090
74. Dahlroth SL, Lieu V, Haas J, Nordlund P (2009) Screening colonies of pooled ORFeomes (SCOOP): a rapid and efficient strategy for expression screening ORFeomes in Escherichia coli. Protein Expr Purif 68(2):121–127. https://doi.org/10.1016/j.pep.2009.07.010
75. Martinez Molina D, Cornvik T, Eshaghi S, Haeggstrom JZ, Nordlund P, Sabet MI (2008) Engineering membrane protein overproduction in Escherichia coli. Protein Sci 17(4):673–680. https://doi.org/10.1110/ps.073242508
76. Cabantous S, Pedelacq JD, Mark BL, Naranjo C, Terwilliger TC, Waldo GS (2005) Recent advances in GFP folding reporter and split-GFP solubility reporter technologies. Application to improving the folding and solubility of recalcitrant proteins from Mycobacterium tuberculosis. J Struct Funct Genom 6(2–3):113–119. https://doi.org/10.1007/s10969-005-5247-5
77. Waldo GS, Standish BM, Berendzen J, Terwilliger TC (1999) Rapid protein-folding assay using green fluorescent protein. Nat Biotechnol 17(7):691–695. https://doi.org/10.1038/10904
78. Maxwell KL, Mittermaier AK, Forman-Kay JD, Davidson AR (1999) A simple in vivo assay for increased protein solubility. Protein Sci 8(9):1908–1911. https://doi.org/10.1110/ps.8.9.1908
79. Fisher AC, Kim W, DeLisa MP (2006) Genetic selection for protein solubility enabled by the folding quality control feature of the twin-arginine translocation pathway. Protein Sci 15(3):449–458. https://doi.org/10.1110/ps.051902606
80. Lockard MA, Listwan P, Pedelacq JD, Cabantous S, Nguyen HB, Terwilliger TC, Waldo GS (2011) A high-throughput immobilized bead screen for stable proteins and multiprotein complexes. Protein Eng Des Sel 24(7):565–578. https://doi.org/10.1093/protein/gzr021
81. Peabody DS, Al-Bitar L (2001) Isolation of viral coat protein mutants with altered assembly and aggregation properties. Nucleic Acids Res 29(22):E113
82. Wigley WC, Stidham RD, Smith NM, Hunt JF, Thomas PJ (2001) Protein solubility and folding monitored in vivo by structural complementation of a genetic marker protein. Nat Biotechnol 19(2):131–136. https://doi.org/10.1038/84389
83.
Chautard H, Blas-Galindo E, Menguy T, Grand’Moursel L, Cava F, Berenguer J, Delcourt M (2007) An activity-independent selection system of thermostable protein variants. Nat Methods 4(11):919–921. https://doi. org/10.1038/nmeth1090 84. Foit L, Morgan GJ, Kern MJ, Steimer LR, von Hacht AA, Titchmarsh J, Warriner SL, Radford SE, Bardwell JC (2009) Optimizing protein

320

Ignacio Asial et al.

stability in vivo. Mol Cell 36(5):861–871. https://doi.org/10.1016/j.molcel.2009.11. 022 85. Martin A, Schmid FX, Sieber V (2003) Proside: a phage-based method for selecting thermostable proteins. Methods Mol Biol 230:57–70. https://doi.org/10.1385/1-59259-396-8:57 86. Sieber V, Pluckthun A, Schmid FX (1998) Selecting proteins with improved stability by a

phage-based method. Nat Biotechnol 16 (10):955–960. https://doi.org/10.1038/ nbt1098-955 87. Asial I, Cheng YX, Engman H, Dollhopf M, Wu B, Nordlund P, Cornvik T (2013) Engineering protein thermostability using a generic activity-independent biophysical screen inside the cell. Nat Commun 4:2901. https://doi. org/10.1038/ncomms3901

Chapter 15

High-Throughput Isolation of Soluble Protein Domains Using a Bipartite Split-GFP Complementation System

Amélie Massemin, Stéphanie Cabantous, Geoffrey S. Waldo, and Jean-Denis Pedelacq

Abstract

The identification of soluble, folded domains of proteins is a recurring task in modern molecular biology. We detail a protocol for identifying compact soluble protein domains using a self-assembling two-part split-GFP comprised of a detector fragment (GFP β-strands 1 through 10, or GFP1–10) and a tagging fragment (GFP β-strand 11, or GFP11). The assay is performed in E. coli cells and in cell extracts. A selection step ensures that the protein fragments are in frame and contain no stop codons, while an inverse PCR is used to enrich protein fragment libraries containing a specific target sequence.

Key words Split-GFP, Domain trapping, Protein solubility, Protein tagging, Protein fragment complementation

1 Introduction

Identification of soluble, folded domains of a large multi-domain protein can aid subsequent downstream steps such as biophysical characterization, directed evolution, raising antibodies, crystallization, enzymology, or protein-protein interaction assays. Since this is a common challenge, high-throughput methods are in demand. In this chapter, we describe a method that uses a scalable high-throughput assay based on two self-assembling split-GFP fragments, a large detector fragment (GFP β-strands 1 through 10, or GFP1–10) and a small tagging fragment (GFP β-strand 11, or GFP11) [1, 2]. The large GFP fragment has been optimized for assembly and folding with the small GFP fragment. The small GFP fragment has been optimized to not perturb fusion proteins. This system can be exploited to form the basis of a high-throughput screen for solubility in living cells or in cell extracts. By cloning a large number of DNA fragments from a cDNA of a protein of interest (POI), and testing the resulting POI-GFP11 protein for

Renaud Vincentelli (ed.), High-Throughput Protein Production and Purification: Methods and Protocols, Methods in Molecular Biology, vol. 2025, https://doi.org/10.1007/978-1-4939-9624-7_15, © Springer Science+Business Media, LLC, part of Springer Nature 2019


solubility using GFP1–10 as a “sensor”, a map of the protein domain boundaries can be constructed and the minimal folded cores identified.

Overview. DNA coding a POI is randomly fragmented, and the ends of the DNA fragments are polished and blunted. Since only 1 in 18 of the DNA fragments would be in frame with GFP11 and in the original (genic, ORF) frame, the fragments are first cloned between two domains of the dihydrofolate reductase (DHFR) from E. coli [3] and selected on trimethoprim (Tmp) to eliminate out-of-frame fragments (which lead to stop codons and loss of trimethoprim resistance). After the DNA fragments have been filtered for correct reading frame and subcloned into the GFP11 fusion vector (with ColE1 ORI and spectinomycin (Spec) resistance marker), the plasmid library is cloned en masse into a recipient E. coli strain expressing the GFP1–10 detector under inducible control of isopropylthiogalactoside (IPTG) on a pET plasmid with compatible p15 ORI and a kanamycin (Kan) selectable marker. The fusion protein (POI-GFP11) library is transiently expressed with anhydrotetracycline (Antet), which allows newly expressed POI-GFP11 fusions to either remain soluble or aggregate while resting. Next, all the cells are induced with IPTG to express GFP1–10. Complementation between POI-GFP11 and GFP1–10 occurs only if the POI-GFP11 is soluble and the GFP11 domain is accessible. Bright cells or colonies are picked and sequenced to identify soluble candidates. By assessing many (several dozen to a few hundred) colonies or clones, a map of domain boundaries can be constructed. This provides information on how truncations of a soluble fragment (fluorescent colony) lead to insoluble versions (faint colonies). This makes it convenient to “pull” all the soluble candidates from the library that contains a target protein sequence.
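The 1-in-18 figure quoted in the overview can be reproduced by simple counting. The sketch below assumes a breakdown that the text does not spell out: a blunt-ended fragment can ligate in either of two orientations, and each of its two junctions can fall in any of three reading-frame registers, with a single combination keeping the original ORF in frame through to GFP11. All names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Assumed sources of variation for a randomly generated, blunt-ligated fragment
ORIENTATIONS = ("forward", "reverse")
START_FRAMES = (0, 1, 2)  # register of the 5' junction relative to the vector frame
END_FRAMES = (0, 1, 2)    # register of the 3' junction relative to GFP11

def in_frame_fraction():
    """Fraction of inserts that are forward and in frame at both junctions."""
    outcomes = list(product(ORIENTATIONS, START_FRAMES, END_FRAMES))
    usable = [o for o in outcomes if o == ("forward", 0, 0)]
    return Fraction(len(usable), len(outcomes))

def expected_in_frame(n_clones):
    """Expected number of usable inserts among n_clones unselected clones."""
    return n_clones * in_frame_fraction()
```

Under these assumptions, a library of 1,800 unselected clones would contain about 100 usable inserts, which is why a reading-frame selection step precedes the solubility screen.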

2 Materials

2.1 DNA Fragmentation

Reagents

1. Phusion polymerase (New England Biolabs, NEB #M0530).
2. PrimeSTAR GXL DNA Polymerase (Takara #R050) for genes that are difficult to amplify.
3. 10 mM dNTP mix, aliquoted (5 μL) and stored at −80 °C.
4. TE buffer: 10 mM Tris–HCl buffer pH 8.5.
5. Salt-free primers at 0.025 μmol scale (Sigma-Aldrich). Stock solutions are prepared in TE buffer at a concentration of 100 μM and stored at −20 °C. Working solutions are diluted to 5 μM in the same buffer.
6. DNase I.
7. Cobalt solution 0.1 M.


8. DNase I quenching buffer: 80 μL gel loading dye 6×, 20 μL EDTA 0.5 M pH 8.0.
9. 50× TAE buffer (1 L): 242 g Tris–HCl, 57.1 mL glacial acetic acid, and 100 mL of 0.5 M EDTA, pH adjusted to 8.5. Used as a 1× solution.
10. 1% agarose-TAE gels are prepared from 1 g type I agarose powder in 100 mL of 1× TAE.
11. 1 kb GeneRuler.
12. Gel loading dye 6× (NEB #B7024).
13. GelRed staining solution: 5 μL of GelRed™ (Biotium #41005) in 50 mL of water.
14. T4 DNA polymerase and T4 Polynucleotide Kinase as part of the Quick Blunting Kit (NEB #E1201).
15. Gel DNA recovery kit.
16. PCR Purification Kit such as QIAquick, Qiagen.

Equipment and Consumables

1. 1.5 mL Eppendorf tubes, autoclaved.
2. 0.2 mL twin-walled tubes, autoclaved.
3. PCR Thermocycler.
4. Horizontal gel electrophoresis system.
5. UV transilluminator.
6. Sonicator Qsonica Q700 (Qsonica, LLC, Newtown, CT 06470).
7. Nanodrop 1000 (Thermo Fisher Scientific).
8. Microcentrifuge.

2.2 The DHFR ORF Trapper

The following reagents, consumables, and equipment are required in addition to some listed in Subheading 2.1.

Reagents

1. pET 15b kanamycin (Kan) resistant insertion DHFR reading frame selector plasmid.
2. ElectroMAX DH10B competent cells (Invitrogen #18290015) are used for high-efficiency transformation of the DNA library.
3. Chemically competent Tuner™ (DE3) cells (Novagen #70623), which contain the lacY permease mutation to enable adjustable levels of protein expression throughout all cells in a culture.
4. Sterilized cold water.
5. 60% (v/v) sterile glycerol.
6. Restriction enzyme StuI (NEB #R0187).
7. Alkaline Phosphatase, Calf Intestinal (CIP) (NEB #M0290).


8. T4 DNA ligase.
9. LB medium: 20 g LB-broth dissolved in 1 L of ultrapure water and autoclaved on the same day.
10. SOC medium: prepare 1 L of SOB following the manufacturer’s instructions and autoclave on the same day. Once cooled, 10 mL of 2 M MgCl2 and 20 mL of 1 M (18%) D-(+)-glucose are added; both solutions are filtered through a 0.45 μm syringe filter prior to use.
11. LB-agar: 35 g powder of LB-broth with agar dissolved in 1 L of ultrapure water and autoclaved on the same day. Cool down at room temperature (

0.3 (typical value ~0.5). The cell pellet should be diluted with SEB to a desired disruption loading; a suggested value is 0.25 (see Note 2). This involves calculating the desired final volume of concentrate as Vfinal = ((P/T)/0.25) × V, where V is the current volume in mL of cell concentrate. Then simply add (Vfinal − V) mL of extra SEB to the concentrate to attain the desired disruption loading in volume Vfinal.

Load the concentrated cells into a 4 °C precooled nitrogen cavitation device and equilibrate it with 70 bar nitrogen for 45 min, then vent the disrupted lysate into a suitable vessel (e.g., detergent-washed and well-rinsed vacuum receiver flask).

Subject the cell lysate to sequential spins at 10,000 × g and 30,000 × g, each time transferring the supernatant carefully to fresh centrifuge tubes. Leave some supernatant behind at the 30,000 × g centrifugation to avoid extra turbidity close to the pellet-supernatant interface, which generally gives lower expression levels if included.

Gel-filter the final supernatant using PD-10 SuperDex columns (GE Healthcare) into fresh EB at 4 °C. The resulting unsupplemented lysate can then be snap frozen in liquid nitrogen and stored at −80 °C (see Note 3).
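The dilution to a target disruption loading can be sketched as a small helper. This is only an illustration of the Vfinal formula given in the text; it assumes that P/T is the measured packed-cell fraction of the concentrate (the definition falls outside this excerpt), and the function and parameter names are hypothetical.

```python
def seb_dilution(pellet_fraction, current_volume_ml, target_loading=0.25):
    """Return (final_volume_ml, seb_to_add_ml) for a target disruption loading.

    pellet_fraction   -- measured P/T of the cell concentrate (assumed meaning)
    current_volume_ml -- V, the current volume of cell concentrate in mL
    target_loading    -- desired disruption loading (0.25 is the suggested value)
    """
    if target_loading <= 0:
        raise ValueError("target loading must be positive")
    if pellet_fraction < target_loading:
        # Adding buffer can only lower the loading, never raise it
        raise ValueError("concentrate is already more dilute than the target")
    final_volume_ml = (pellet_fraction / target_loading) * current_volume_ml
    return final_volume_ml, final_volume_ml - current_volume_ml
```

For example, 10 mL of concentrate with P/T = 0.5 would be brought to 20 mL by adding 10 mL of SEB.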

Leishmania Cell-Free Expression for High-Throughput Interactions Analysis

3.2 Optimizing Feeding Solution for Leishmania Lysate-Based Expression


Besides template DNA encoding the desired product, unsupplemented Leishmania lysate requires the addition of a feeding solution for coupled transcription-translation of DNA through to protein. The critical components are amino acids, an energy regeneration system (creatine phosphate + creatine phosphokinase) and T7 polymerase for transcription. The recipe for the feed solution (plus other miscellaneous contents) is provided in Table 1, made at 5× reaction concentration (i.e., 2 μL per 10 μL reaction). However, Table 1 omits the transcriptional components (T7 polymerase, rNTPs plus magnesium ions), which are added separately to complete the feeding solution. These transcriptional inputs are not standardized because optimum lysate performance requires them to be optimized separately.

The function of rNTPs is twofold in the coupled system: they provide building blocks for transcription of mRNA (all four rNTPs) and the energy supply for translation (ATP and GTP only). Additionally, a tight magnesium optimum exists in lysate activity that is also related to the rNTP level, as rNTPs must complex with magnesium ions for biological activity. Nine different rNTP + T7 mixes (3 × 3) to complete the FS, suitable for a minimal optimization regime, are provided in Table 2.

The complexity of rNTP interaction with the active lysate during coupled expression creates a two-sided optimization problem. If rNTPs in combination are too low, transcription is limited and protein production may be reduced. However, excess transcriptional capacity in the form of excess T7 polymerase and/or rNTPs can cause unwanted short reaction products, presumably as premature terminations (see Note 4). This latter effect is not obvious when expressing eGFP alone and measuring fluorescence as a quality control (as typical in cell-free systems), because

Table 1 Incomplete 5× feed solution recipe omitting T7 polymerase, rNTPs and magnesium

Component                     Stock (mM if unstated)   μL stock/100 μL final 5× FS
Spermidine                    100                      1.2
DTT                           500                      2.0
Creatine phosphate            1000                     20.0
HEPES-KOH pH 7.6              2500                     4.0
PEG3000                       0.5 v/v                  10.0
Protease inhibitor cocktail   120                      4.3
Amino acids                   3.6                      19.0
Anti-splice leader oligo      1                        5.0
Creatine phosphokinase        5 units/μL               4.2
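As a cross-check on Table 1, the listed volumes can be summed and converted to working concentrations. The sketch below is illustrative only: it takes the nine volumes as read from the table (1.2, 2.0, 20.0, 4.0, 10.0, 4.3, 19.0, 5.0 and 4.2 μL), assumes the 5× solution is diluted fivefold in the final reaction (2 μL per 10 μL), and uses hypothetical names throughout.

```python
# (stock concentration in mM, microlitres of stock per 100 uL of 5x feed solution)
TABLE1_MM = {
    "Spermidine": (100.0, 1.2),
    "DTT": (500.0, 2.0),
    "Creatine phosphate": (1000.0, 20.0),
    "HEPES-KOH pH 7.6": (2500.0, 4.0),
    "Amino acids": (3.6, 19.0),
    "Anti-splice leader oligo": (1.0, 5.0),
}

# Components whose stock is not expressed in mM (or whose units are unclear,
# as for the protease inhibitor cocktail listed as "120"): volumes only
TABLE1_OTHER_UL = {
    "PEG3000": 10.0,
    "Protease inhibitor cocktail": 4.3,
    "Creatine phosphokinase": 4.2,
}

def reaction_conc_mM(stock_mM, ul_per_100):
    """Concentration in the final reaction, assuming a fivefold dilution of the 5x mix."""
    conc_in_5x = stock_mM * ul_per_100 / 100.0
    return conc_in_5x / 5.0

def total_volume_ul():
    """Total volume contributed by the Table 1 components per 100 uL of 5x mix."""
    return sum(ul for _, ul in TABLE1_MM.values()) + sum(TABLE1_OTHER_UL.values())
```

With these readings the component volumes total 69.7 μL, which matches the volume of incomplete 5× FS called for in Table 2, and creatine phosphate works out to 40 mM in the final reaction.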


Wayne A. Johnston et al.

Table 2 Suggested rNTP variations to complete 5× feed solution

Component             Stock (mM)                 0.75× rNTP (μL/100 μL)   1.0× rNTP (μL/100 μL)   1.25× rNTP (μL/100 μL)
5× FS (see Table 1)   n/a                        69.7                     69.7                    69.7
T7 polymerase         2.5–5 mg/mL (see below)    12                       12                      12
MgOAc                 1000                       1.3                      1.7                     2.1
ATP                   200                        3.2                      4.3                     5.4
GTP                   200                        1.2                      1.6                     2
UTP                   200                        1.0                      1.3                     1.6
CTP                   200                        1.0                      1.3                     1.6
milliQ water          n/a                        10.6                     8.1                     5.6

It is suggested that 9 total 5× Feed Solution variations are used, with the above 3 rNTP variations repeated using 5 mg/mL (1×), 3.75 mg/mL (0.75×) and 2.5 mg/mL (0.5×) T7 polymerase stock. Using a single starting 5 mg/mL T7 polymerase stock, this can be achieved by adding 12 μL T7 stock for 1.0× T7 as above, 9 μL T7 stock + 3 μL MilliQ water for 0.75× T7, and 6 μL T7 stock + 6 μL MilliQ water for 0.5× T7.
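The 3 × 3 optimization grid described by Table 2 and its footnote can be written out programmatically, which helps when preparing pipetting sheets. The sketch below is a minimal illustration using the table's volumes and a single 5 mg/mL T7 stock; the dictionary keys and function name are hypothetical.

```python
from itertools import product

BASE_FS_UL = 69.7   # incomplete 5x feed solution from Table 1
T7_SLOT_UL = 12.0   # volume reserved for T7 stock plus make-up water

# Per 100 uL of complete 5x feed solution (volumes from Table 2)
RNTP_MIXES = {
    "0.75x": {"MgOAc": 1.3, "ATP": 3.2, "GTP": 1.2, "UTP": 1.0, "CTP": 1.0, "milliQ water": 10.6},
    "1.0x":  {"MgOAc": 1.7, "ATP": 4.3, "GTP": 1.6, "UTP": 1.3, "CTP": 1.3, "milliQ water": 8.1},
    "1.25x": {"MgOAc": 2.1, "ATP": 5.4, "GTP": 2.0, "UTP": 1.6, "CTP": 1.6, "milliQ water": 5.6},
}

# Fraction of the 12 uL slot filled with 5 mg/mL T7 stock
T7_LEVELS = {"1.0x": 1.0, "0.75x": 0.75, "0.5x": 0.5}

def feed_solution_grid():
    """Return {(rNTP level, T7 level): {component: volume in uL}} for all nine mixes."""
    grid = {}
    for (rntp_name, mix), (t7_name, frac) in product(RNTP_MIXES.items(), T7_LEVELS.items()):
        t7_stock_ul = T7_SLOT_UL * frac
        recipe = {
            "5x FS (Table 1)": BASE_FS_UL,
            "T7 stock (5 mg/mL)": t7_stock_ul,
            "water replacing T7": T7_SLOT_UL - t7_stock_ul,
        }
        recipe.update(mix)
        grid[(rntp_name, t7_name)] = recipe
    return grid
```

Each of the nine recipes totals 100 μL, so any entry can be scaled linearly if a different batch volume is preferred.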

prematurely terminated byproducts cannot form a fluorophore and are hence not visible. An example of premature terminations in an N-terminally eGFP-tagged fusion protein due to excess GTP in the feeding solution is provided in Fig. 1.

One critical aspect of premature terminations mediated by excess transcriptional capacity in the Leishmania cell-free system is that they predominantly occur early in the overall reaction (Fig. 1). This has a useful consequence: the unwanted tendency to form premature termination products can be observed in the kinetics of eGFP (only) production, without the need for expressing longer templates or running SDS-PAGE gels. Because premature termination occurs initially in the expression reaction, eGFP production will show clear biphasic kinetics, with initial low production (due to production of incomplete, non-fluorescent protein) transitioning to a higher production rate as more full-length eGFP is produced. Although this can be caused by increasing other transcriptional inputs (i.e., T7 polymerase, total rNTPs), it is particularly seen when increasing GTP (plus Mg) only. Figure 2 shows the effect of small differences in GTP supplementation on biphasic eGFP-only production kinetics (see Note 5).

An additional consideration for assigning feed solution transcriptional capacity is that optimal values depend on the amount of lysate in the reaction. Provided the suggested disruption cell loading (0.25 g/g) in Subheading 3.1 is followed, lysate batches should

Fig. 1 SDS-PAGE gel of time-dependent batch expression of eGFP-PPARγ in the Leishmania cell-free system. Reaction time from time zero (DNA addition to supplemented lysate) is annotated on gel lanes, along with % band purity as determined by gel densitometry. Samples were not heated prior to loading to allow visualization of the fusion eGFP tag, with premature terminations visible as shorter fluorescent products. First lane is protein size ladder (PageRuler Plus)

Fig. 2 Biphasic kinetics of eGFP production in the LTE cell-free system is due to premature terminations resulting from above-optimal GTP addition concentration into the feeding solution. GTP in the feeding solution was added at 0.5, 0.63, and 0.95 mM; CTP, UTP were present at 0.5 mM and ATP was present at 1.2 mM. Magnesium concentration was set at [total rNTP] + 1.5 mM
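Biphasic behavior of the kind shown in Fig. 2 can be flagged numerically from plate-reader time courses. The sketch below is a crude slope-ratio heuristic, not the authors' analysis: it compares the mean production rate in an early window with the fastest rate seen later. The window fraction and ratio threshold are arbitrary assumptions that would need tuning against real traces, and any initial lag before fluorescence appears should be trimmed beforehand.

```python
def interval_rates(times_min, fluorescence):
    """Fluorescence change per minute between consecutive reads."""
    return [
        (fluorescence[i + 1] - fluorescence[i]) / (times_min[i + 1] - times_min[i])
        for i in range(len(times_min) - 1)
    ]

def looks_biphasic(times_min, fluorescence, early_fraction=0.33, ratio_threshold=1.5):
    """Return True if the late rate clearly exceeds the early rate.

    early_fraction and ratio_threshold are hypothetical tuning parameters.
    """
    if len(times_min) != len(fluorescence) or len(times_min) < 3:
        raise ValueError("need at least three paired time points")
    rates = interval_rates(times_min, fluorescence)
    n_early = max(1, int(len(rates) * early_fraction))
    early_rate = sum(rates[:n_early]) / n_early
    late_rate = max(rates[n_early:])
    if early_rate <= 0:
        # No early production at all: any later production counts as a second phase
        return late_rate > 0
    return late_rate / early_rate > ratio_threshold
```

A trace that rises at a constant rate returns False, whereas one that starts slowly and then accelerates returns True; such a flag is only a prompt to inspect the curve, not a substitute for doing so.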


be relatively consistent. If lysate addition is not optimized (as covered below), it is suggested that the 0.25 g/g unsupplemented lysate is simply added to the reactions close to the maximum volumetrically practical (i.e., 6 μL/10 μL reaction). Although not covered in this protocol, increasing the lysate level in reactions beyond this (0.25 g/g disruption; 6 μL/10 μL reaction) is possible and may allow greater expression yield in some circumstances (see Note 6).

A simplified method for estimating the optimum rNTP addition at a given lysate concentration is as follows, with expression reactions described for a 10 μL volume (optimization reaction volume can be scaled up or down if desired):

Mix incomplete 5× base feed solution (sufficient for 1000 μL final feed solution or more) containing no rNTPs or T7 polymerase as per Table 1.

Prepare 9 × 100 μL complete 5× feed solutions with variable rNTPs and T7 polymerase as per Table 2. The “1.0× rNTP” variant represents the standard rNTP mix previously used for Leishmania cell-free expression [2]. Magnesium addition is calculated from the total rNTP concentration using an empirically derived formula, which can be used to derive the approximate magnesium optimum for other rNTP variations if these are desired to extend the optimization (see Note 7).

Mix DNA template to 500 ng/μL. This can encode either eGFP only, if using kinetic fluorescence measurement, or an N-terminally eGFP-tagged fusion protein similar to, or a member of, those intended for HTP screening panels. Adding 10% v/v of RNase inhibitor is recommended (see Note 8). The 500 ng/μL template stock is used at 1 μL/10 μL final reaction; hence the reaction DNA template concentration is 50 ng/μL and the reaction RNase inhibitor concentration is 1% v/v.

Thaw an unsupplemented lysate aliquot by hand warming and vortex mixing, and formulate the reactions as soon as the aliquot is fully thawed and mixed.

Lysate should be added to the reagents already placed in tubes (e.g., PCR strip tubes) or plate wells (e.g., 384-well plates) as 6 μL (lysate + EB)/10 μL total reaction maximum. To extend the optimization, less than the maximum amount of lysate can be added to reactions. If doing so, the recommended dilutions of lysate are 6 μL/10 μL (no dilution), 5 μL/10 μL, 2.5 μL/10 μL, and 1 μL/10 μL. The remainder of the 6 μL/10 μL reaction volume is made up with EB.

Mix well via pipette mixing or vortexing, then incubate for 3 h at 27 °C.

For eGFP expression only, SDS-PAGE gel visualization of protein is not useful as it cannot detect premature terminations. However, expression can be ranked according to end fluorescence, with any eGFP production kinetics showing visible biphasic behavior indicating a general propensity to premature terminations at high


transcriptional capacity. It is suggested that any T7/rNTP regime showing visible biphasic behavior in eGFP production kinetics (such as in Fig. 2) is not suitable for use in HTP expression.

If using a GFP-tagged fusion protein, premature terminations in N-terminally tagged variants will appear as smaller fluorescent products than the full-length main fusion band. Fluorescence will remain visible on the SDS-PAGE gel if samples are mixed with loading buffer but not heated prior to gel loading, as fluorophores such as GFP and mCherry can remain folded and fluorescent even in the presence of SDS (see Note 9). The presence of premature terminations as multiple short expression products can be visually estimated from the gel, and this is generally sufficient for optimization purposes. Alternatively, a % band purity (main band versus total bands) can be calculated if gel densitometry software is available, with optimal expression at maximum % band purity.

The optimization procedure effectively determines the highest transcriptional capacity variant possible without inducing premature terminations, at each of the lysate addition levels if these are included. The choice of configuration used then depends on whether maximum expression level (generally at or close to the highest lysate addition) is desired, or whether lower expression is acceptable in order to gain slower reaction kinetics, which generally leads to less product aggregation.

Once a desired reaction configuration (lysate concentration plus rNTP/T7 concentration) is identified, it is optional but prudent to check the magnesium optimum (which for the previous reactions was theoretically calculated based on an empirically derived formula generally suitable for Leishmania lysates; see Note 7). The magnesium in the feed solution should be reduced by 2 mM, and variants of −2 mM (no further addition), −1 mM, +0 mM (the concentration previously used in the optimization), +1 mM and +2 mM can be used to fine-tune the magnesium level to within 1 mM. Reactions are performed similarly to the main optimization protocol, and the magnesium level for the HTP protein panel is changed if maximum expression is not at the middle (+0 mM) value. Although such magnesium level optimization could be integrated into the main optimization above, it is not recommended as it greatly increases the number of reactions required for the optimization.

3.3 Preparation of the Plasmid Templates for Cell-Free Expression

Cell-free expression for HTP interaction studies requires cell-free suitable vectors containing the ORFs coding for the proteins of interest. The authors’ laboratory has developed a set of vectors for cell-free expression that encode the Species Independent Translation Sequence (SITS), which allows initiation of translation analogously to an Internal Ribosome Entry Site. These vectors can be


used not only in the LTE cell-free system but also in other cell-free systems [3]. The vectors are deposited in the Addgene nonprofit plasmid repository (www.addgene.org), under the ID# pCellFree G01 to G10. These variants are backbones that cover variations of ORF fusion proteins with N- and C-terminally fused eGFP, sfGFP, and mCherry. The pCellFree vectors are designed as destination vectors using the Gateway™ system for cassette exchange. Hence the ORFs of proteins of interest can be inserted into these vectors using commercially available Gateway™ kits available from Thermo Fisher Scientific, combined with standard molecular cloning techniques.

3.4 AlphaLISA Interaction Screening Using Leishmania Lysate

AlphaLISA technology can be deployed in high-throughput screening assays to identify biomolecular interactions. This is a bead-based proximity assay in which Alpha Donor and Acceptor beads are used to capture interacting molecules. Upon excitation at 680 nm, the Alpha Donor bead coupled to binding partners generates singlet oxygen that transfers energy to the Acceptor bead, resulting in the production of a luminescence signal at 615 nm [4, 5]. In our assay, we apply Alpha Donor beads coated with the biotinylated Anti-mCherry VHH antibody fragment and Anti-GFP AlphaLISA Acceptor beads to detect interacting protein pairs tagged with mCherry and GFP. Here, we describe an example for detecting the association of the LTE-expressed rapamycin-FKBP12 complex with FRB. This method is generally suited for detecting binary protein interactions in the LTE system. The interaction between these two proteins is mediated by rapamycin and has been exhaustively characterized [6, 7], making them a suitable system for validation of a protein interaction (Fig. 3).

3.4.1 Production and Biotinylation of mCherry Nanobody for AlphaLISA Assay

The coding sequence for the mCherry VHH fragment, also known as a nanobody, was obtained from the literature [8] and cloned into the pHUE vector using the BamHI and HindIII restriction sites, in frame with the N-terminal GST-tag. The protein was expressed in E. coli strain CodonPlus (DE3)-RIL in TB medium using IPTG induction. The cells were disrupted using a high-pressure fluidizer, and the protein was purified by metal affinity chromatography on a HisTrap FF column followed by size-exclusion fractionation on a Superdex 75 column, using an Akta Purifier FPLC system. The fractions were analyzed by SDS-PAGE, and those containing pure protein were pooled and stored at −80 °C.

The mCherry nanobody protein was chemically biotinylated in order to be coupled to the Alpha Donor beads. For this, the protein was prepared in 100 mM sodium bicarbonate buffer, pH 8.5, at a concentration of 1 mg/mL (see Note 10). A fresh solution (2.5 mg/mL) of NHS-Biotin in DMSO was made prior to setting


Fig. 3 Schematic figure of in vitro protein–protein interaction study: (a) proteins are expressed in LTE; (b) FKBP protein is mixed with rapamycin and FRB protein and subsequently diluted in buffer A; (c) AlphaLISA assay is performed by incubating FKBP/rapamycin/FRB complex with alpha beads followed by measuring the signals using microplate reader


up the reaction. Addition of 40 μL of NHS-Biotin solution to 1 mL of protein (1 mg/mL) was followed by stirring of the reaction for 7 h in a cold room. The biotinylated protein was transferred to a dialysis cassette and dialyzed against three changes of 500 mL PBS buffer for 24 h at 4 °C. The samples were dispensed into Eppendorf tubes and stored at −80 °C.

Streptavidin-coated magnetic beads were used to assess the biotinylation of the mCherry nanobody. Magnetic beads (50 μL) were washed twice with washing/binding buffer (0.5 mM NaCl, 20 mM Tris–HCl, 1 mM EDTA) by decanting the supernatant using a magnetic device. Beads were mixed with 10 μL of washing/binding buffer and loaded with 10 μL of 0.5 mg/mL biotinylated and non-biotinylated control proteins. The samples were incubated for 15 min on a rotating mixer at room temperature and the supernatant was transferred to a new tube. The concentration of the protein was measured using a Nanodrop spectrophotometer; if the proteins are completely biotinylated, no protein should be present in the supernatant. The presence of non-biotinylated proteins in the sample was also estimated by analyzing the supernatant of the magnetic beads on a Bolt™ 4–12% Bis-Tris SDS-PAGE gel. If the protein was efficiently coupled to the magnetic beads, no protein band should be detectable on the gel.

3.4.2 Cell-Free Expression of GFP-FRB and FKBP12-mCherry Fusion Proteins for AlphaLISA Assay

The genes encoding FRB (accession number: 1AUE) and FKBP12 (accession number: NP000792) were cloned in pCellFree_G03 (accession number: KJ541667.1) and pCellFree_G08 (accession number: KJ541672.1) vectors, respectively, using Gateway cloning technology (see also Subheading 3.3). To set up the cell-free expression reaction, mix lysate and feeding solution optimized as described in Subheading 3.2. To start the reactions, combine this supplemented lysate with 0.5 μL RNaseOUT (25 units/10 μL reaction), 1 μL of the plasmid (either FRB or FKBP12 template plasmid; concentration 300 nM), and Ultrapure™ water to a total of 10 μL (each reaction is prepared in duplicate). Incubate the reactions at 27 °C for 3 h for the expression of mCherry fusion proteins and for 4 h for the expression of GFP fusion proteins. The expressed proteins can be detected by measuring GFP and mCherry fluorescence using a microplate reader. In addition, 10 μL of reaction mixture mixed with 2× Bolt sample buffer (1:1 v/v) should be resolved on an SDS-PAGE Bolt gel, loading the samples without heating (Fig. 4). The fluorescence of fusion proteins can be visualized by scanning with the ChemiDoc (Bio-Rad) imaging system (Red filter: 695/55 and green filter: 605/50 for mCherry and GFP fluorescence detection, respectively).


Fig. 4 Expression of GFP-FRB and FKBP12-mCherry in LTE lysate: Expressed proteins were separated on SDS-PAGE gel and visualized by in-gel fluorescence scanning. M: molecular weight marker; (1) GFP-FRB (41 kDa); (2) FKBP12-mCherry (47 kDa)

3.4.3 Detecting the Interaction of In Vitro Expressed GFP-FRB and FKBP12-mCherry Fusion Proteins Using the AlphaLISA Assay

Following the expression of GFP-FRB and FKBP12-mCherry in LTE, a 10 μL reaction containing the FKBP12-mCherry protein was incubated with rapamycin (at a final concentration of 20 nM) at room temperature for 10 min. GFP-FRB protein was added to the rapamycin-FKBP12-mCherry sample and mixed by pipetting up and down. The protein mixture was then serially diluted 50×, 100×, 200×, and 400× with buffer A. The AlphaLISA assay was carried out in 384-well plates (Optiplate-384 Plus) using Anti-GFP AlphaLISA Acceptor and Streptavidin Donor beads. Alpha beads were prepared according to the protocol provided by Perkin Elmer [9]. The Alpha Acceptor and Donor bead stocks (5 mg/mL) were diluted to 100 μg/mL (5×) in 1× AlphaLISA Universal assay buffer. We added 1 μL of biotinylated mCherry nanobody into microplate wells, followed by addition of 15 μL of protein mixture (FKBP-rapamycin-FRB protein mix) or supplemented LTE lysate (as a negative control). The samples were prepared in triplicate. We then added 5 μL of 5× acceptor beads to each well and mixed using a Viaflo electronic pipette (see Note 11). The plate was covered with a transparent sealing film and the mixture incubated for 1 h at room temperature. Finally, 5 μL of 5× donor beads were added to the samples under subdued light, mixed gently, and incubated for 30 min at room temperature (see Note 12). The AlphaLISA signal was detected with a Tecan microplate reader using the following setting: Filter:


Fig. 5 AlphaLISA plot: The average of Alpha signal is plotted vs the interacting protein dilution factor. The dashed line represents the Alpha signal obtained for interacting FRB-FKBP12 protein pairs and the dotted line represents the Alpha signal for the negative control (LTE lysate). Error bars represent standard error of the mean of triplicate samples’ measurement

AlphaLISA, Excitation time: 180 ms, Integration time: 300 ms. The obtained signal values were plotted against the dilution factors. In comparison to the negative control, a strong positive Alpha signal was detected upon incubation of FRB and FKBP proteins in the presence of rapamycin. This result reflects the interaction of these proteins and confirms the high quality of the proteins expressed in the LTE system (Fig. 5).
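The per-dilution summary plotted in Fig. 5 (mean of triplicates with standard error of the mean) can be computed as sketched below. This is an illustrative helper rather than the authors' analysis; the signal-to-background ratio is an added convenience not described in the text, and any counts passed in would be the user's own plate-reader values.

```python
from math import sqrt
from statistics import mean, stdev

def mean_sem(replicates):
    """Mean and standard error of the mean for replicate wells."""
    if len(replicates) < 2:
        raise ValueError("at least two replicates are needed for an SEM")
    return mean(replicates), stdev(replicates) / sqrt(len(replicates))

def summarize(sample_counts, control_counts):
    """Summarize Alpha counts per dilution factor.

    Both arguments map dilution factor -> list of replicate counts.
    Returns rows of (dilution, sample mean, sample SEM, signal/background).
    """
    rows = []
    for dilution in sorted(sample_counts):
        s_mean, s_sem = mean_sem(sample_counts[dilution])
        c_mean, _ = mean_sem(control_counts[dilution])
        rows.append((dilution, s_mean, s_sem, s_mean / c_mean))
    return rows
```

Each returned row corresponds to one point and error bar on a plot of Alpha signal against dilution factor, with the last element indicating how far the interacting pair rises above the lysate-only control.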

4 Notes

1. Although the previous protocol [2] used a 2500 × g spin, we observed that the greater rpm gives a more consistent cell density measurement. This factor is critical in lysate manufacture, as lysate quality is very dependent on the cell density input to the cell disrupter.
2. Although 0.25 g/g disruption loading is recommended here, the previous protocol [2] used a higher value (0.38 g/g albeit with a lower spin rpm). The protocol has evolved to use relatively dilute lysates at disruption, with generally higher addition of the dilute lysate to expression reactions (6 μL/10 μL). This combining of dilute lysates at disruption with minimal further dilution at the reaction stage leads to improved expression and

Leishmania Cell-Free Expression for High-Throughput Interactions Analysis


product quality. However, the feed solution optimization protocol described in Subheading 3.2 does suggest the option of various additions of lysate to the reaction.

3. Previously, lysates were combined with feeding solution into a supplemented lysate prior to snap freezing [2]. Freezing unsupplemented lysates presents some disadvantages; specifically, thawed unsupplemented lysates deteriorate faster on ice than supplemented ones, and if supplementation involves re-aliquoting, this adds an extra freeze-thaw step which can reduce expression levels by ~15%. However, the advantage of being able to tailor the feed solution to the lysate is worth the additional complexity.

4. Transcription-limited expression in the coupled system can be caused not only by insufficient rNTP building blocks, but also by problems with DNA template quality and T7 polymerase activity. Transcriptional deficiencies from all these causes are observable as an increased delay before any fluorescence is detected if the expression kinetics of a GFP-producing template are measured (e.g., in a plate reader). Delays significantly greater than 10 min from the start of the reaction generally indicate insufficient transcription capacity in the Leishmania cell-free system. However, excess transcriptional capacity, especially as excess T7 polymerase, is quite damaging to product quality. The suggested maximum reaction usage of T7 polymerase in Table 2 (5 mg/mL stock; 0.12 mg/mL final reaction) could be revised upward or downward, as individual preparations/sources of T7 polymerase have varying activities. If optimization indicates bad product quality for all feed solution variations, this suggests too much T7 polymerase is present. If good-quality product with increasing expression yield occurs with increasing T7 polymerase, this suggests too little enzyme is present.

5. The relative endpoint amounts of eGFP fluorescence are also suggestive of premature terminations (which do not fluoresce), but endpoint-only data are less useful, as they do not distinguish between low fluorescence due to insufficient transcriptional capacity (less total protein) and excess transcriptional capacity (low quality due to short reaction products).

6. It is straightforward to add less lysate to the reaction (down to 2 μL/10 μL reaction), with the spare volume made up with EB. The protocol for optimizing lysate additions is provided in Subheading 3.2. Lowering lysate addition decreases reaction rate and increases reaction time, which generally leads to less protein aggregation at the expense of protein yield. Alternatively, a more concentrated lysate may be added to the reaction (via >0.25 g/g cell disruption loading during lysate manufacture), which generally leads to greater expression yield at the expense of greater product aggregation. Typical values are 0.25–0.35 g/g at disruption (see Subheading 3.1).

7. For Leishmania cell-free expression, the approximately optimal magnesium addition (as MgOAc) has been empirically derived in the authors' laboratory to be:

$$\mathrm{Mg\ (mM)} = \text{Total } [\mathrm{rNTP}] + 1.5$$

It should be noted that the performance of the Leishmania cell-free system (yield and quality) can be very magnesium sensitive under some reaction conditions. An alternative method of optimizing Leishmania cell-free expression is presented in [10], where all feed solution components are altered simultaneously to minimize the effects of magnesium concentration on reaction outcome. This paper also provides additional information on the effect of individual component variation on reaction outcome.

8. DNA templates that are prepared using mini- or midi-prep purification kits generally contain small amounts of RNAses in the final product that can decrease expression yield by as much as 50%. As a transcriptional input, excess starting DNA can reduce product quality under some circumstances. However, this effect is generally weak, and hence starting DNA concentration should not require separate optimization.

9. Heating of the gel during electrophoresis may lead to unfolding of the fluorophore and lack of visible bands. Fresh electrophoresis buffer should be used, and voltage reduced (e.g., […]

[…] 10 ng/μL) encoding the PDZ domains. Seal the plate using an adhesive PCR seal and centrifuge to spin down.

4. Incubate 1 h at room temperature.

5. Add 1 μL of Proteinase K.

6. Incubate 10 min at 37 °C.

7. Centrifuge briefly to collect the reaction components and store samples on ice or at −20 °C for subsequent transformation.

8. For the transformation and plating protocol, start at step 14 from the “Generation of pHTP0 Clone Collections” protocol but change antibiotics and use 50 μg/mL Zeocin in Low Salt LB for the pENTRY clone selection.

The integrity of the artificial gene sequences was confirmed by Sanger sequencing in both directions.

Generation of pETG41A Clone Collections

To produce and, when necessary, purify the PDZ domains in soluble form, the pENTRY Zeo clones must be transferred to an E. coli Gateway Destination vector. In our previous study [1], the vast majority of the PDZ domains were detected in a soluble form

High-Throughput Production and Interaction of the PDZome


using a His-tagged maltose-binding protein (His-MBP) N-terminal fusion. The His tag is used for purification by affinity chromatography, while the MBP tag is known to improve folding and increase the solubility of recombinant proteins. The MBP tag also increases the molecular weight of the constructs, facilitating their detection and quantification by microfluidic capillary electrophoresis. This strategy is kept for the new libraries, except that the His-pKM596 vector used in the previous study [1] is replaced by the pETG41A vector, the pETM-41 restriction-based vector modified for Gateway cloning [3]. After cloning and expression of the new library following the protocols described here, in the few cases where the level of soluble His-MBP-PDZ was low, the PDZ domains were subcloned into a vector carrying an alternative solubility-enhancing protein, thioredoxin (TRX), as a His-tagged fusion (pETG20A) (see ref. 3).

Using the LR Clonase II enzyme mix from the Gateway technology (Thermofisher), the pDONRZeo libraries are subsequently cloned into the Gateway pETG41A vector (or pETG20A when necessary) to generate the libraries in Destination vectors, enabling production in the E. coli system.

1. On ice, prepare the LR Clonase II master mix as follows (for 120 reactions):
   pETG41A (150 ng/μL): 1 μL
   LR Clonase II Enzyme Mix: 1 μL

2. Transfer 2 μL into each well of a 96-well PCR plate, using a multichannel pipette.

3. Add 3 μL of pDONRZeo plasmid (>10 ng/μL) containing the PDZ domains. Seal the plate using an adhesive PCR seal and centrifuge to spin down.

4. Follow step 4 from the “Generation of pDONR Zeo Clone Collections” protocol but change antibiotics and use 50 μg/mL ampicillin for the pDEST clone selection.

The integrity of the artificial gene sequences is confirmed by Sanger sequencing in both directions.

3.2 Generation of the Soluble Libraries of Single and Tandem PDZ Domains in E. coli

3.2.1 Protein Production

The whole protocol is automated on Freedom EVO 200 (Tecan) liquid handling robots with either a 4-channels Liquid LiHa and an 8-channels Air LiHa (Configuration 1, Fig. 1a, Institut Pasteur) or an 8-channels Liquid LiHa and a 96 pipetting head (MCA) (Configuration 2, Fig. 1b, AFMB). Steps performed by hand are marked with an asterisk (∗). A video describing the protocol for the transformation and culture of 96 samples with Configuration 2 has been published elsewhere [5]; this chapter therefore focuses on the details of the procedure using Configuration 1.


Yoan Duhoo et al.

Fig. 1 Tecan worktables. (a) Configuration 1 with a 4-channels Liquid LiHa and a 8-channels Air LiHa. (b) Configuration 2 with a 8-channels Liquid LiHa and a 96 MCA

The expression clones of the 266 PDZ domains and the 37 PDZ tandems are aliquoted in three 96-well PCR plates (named plates A, B, and C) and one 96-well PCR plate (T), respectively (see Tables 1 and 2 for organization). Each step of the protocol below describes the production of the 96 PDZ of plate A and is to be repeated two more times (on plates B and C) to produce the full library. The same protocol applies for the production of PDZ tandems.

Day 1

Bacterial transformation

1. ∗Preparation of media and plates (under sterile conditions): dispense 7 mL/well of sterile, antibiotic-free LB medium in the first column of a sterile DW24 covered with an Air Permeable adhesive film. Cover a new Sterile-DW96 with an Air Permeable adhesive film. Prepare four Falcon-50 tubes, each containing 30 mL of LB medium supplemented with antibiotics. 10 min before the experiment: thaw on ice one plate of the PDZ expression plasmid library (A, B, C or T) (PCR96-DNA) and one plate of aliquoted (25 μL/well) chemically competent BL21 (DE3) pLysS cells (PCR96-Cells).

2. ∗Preparation of the Tecan worktable: Turn on the Cooling station at 2 °C. Place: Two 96-well aluminum adapters on the Shaker-heater position and one on the Cooling station. Four Falcon-50 tubes containing LB-antibiotics. The PCR96-Cells plate on the Cooling position + 96-aluminum block. The sterile-DW96 plate on the Cooling station. The DW24 plate containing sterile LB on a Flat position. One DiTi-10 μL and one DiTis-200 μL.

3. Add 2 μL of each PDZ expression plasmid (PCR96-DNA) to the respective well of the chemically competent cells plate (PCR96-Cells) at 2 °C (8-channels Air LiHa).

4. Incubate the transformation mixes at 2 °C for 30 min.

5. During the incubation: dispense 1 mL/well of LB medium supplemented with antibiotics into the DW96 plate (4-channels Liquid LiHa).

6. After the incubation, place the PCR96-Cells plate (transformation reactions) on the Shaker/heater position adjusted for a transformation reaction at 37 °C (RoMa).

7. Incubate for 5 min with 1000 rpm shaking and place the plate back at 2 °C.

8. With the 8-channels Air LiHa, add 100 μL of antibiotic-free LB medium to each transformation reaction.

9. Place the PCR96-Cells plate in the incubator hotel MOI; incubate for 1 h at 37 °C, shaking.

10. Turn the Cooling station off to allow it to reach room temperature, and once the transformation incubation is done, place the plate back on the same position, now at room temperature.

11. With the 8-channels Air LiHa, aspirate 100 μL of each transformation reaction and dispense it into the 1 mL of LB-antibiotics medium in the respective well of the DW96 plate (DW96-Transfo).

12. Place the DW96-Transfo plate in the Infors-Multitron overnight, at 37 °C, 220 rpm.
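Before starting the robot script, it can be useful to confirm that the media placed on the deck cover the volumes dispensed in the steps above. The sketch below is a simple bookkeeping aid, not part of the published protocol; the helper name is hypothetical, and the first column of the DW24 plate is assumed to hold four wells (a 4 × 6 layout).

```python
def volume_budget_ul(n_wells, per_well_ul, supplied_ul):
    """Return (required volume, spare volume) in microlitres."""
    required = n_wells * per_well_ul
    return required, supplied_ul - required


# LB + antibiotics: 1 mL per well of the DW96 plate, four tubes of 30 mL.
lb_antibiotics = volume_budget_ul(96, 1000, 4 * 30_000)

# Antibiotic-free LB: 100 uL per transformation, 7 mL in each of the
# four wells of the first DW24 column (4 x 6 layout is an assumption).
lb_plain = volume_budget_ul(96, 100, 4 * 7_000)

for name, (required, spare) in {
    "LB + antibiotics": lb_antibiotics,
    "antibiotic-free LB": lb_plain,
}.items():
    print(f"{name}: need {required} uL, spare {spare} uL")
```

A negative spare volume would flag a reservoir that needs topping up; how much dead volume each labware type requires is instrument specific and is not modelled here.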

Day 2

Inoculation and culture

To increase PDZ production yield, each PDZ culture (2 mL) is done in triplicate and the lysates are pooled at the end to have one sample per PDZ.

1. ∗Preparation of media and plates (under sterile conditions): Supplement 650 mL of auto-inducible expression medium; 0.5% glycerol with antibiotics, and dispense (by hand) 2 mL/well in twelve sterile DW24 plates. Number each plate to identify in which well each PDZ will be expressed. Cover the plates with Air Permeable adhesive film.

2. ∗Preparation of the Tecan worktable: Place the twelve DW24-medium plates on the Cooling station (kept at room temperature) and on the 2 MP4 carrier positions. At this step, it is important to place the DW24 plates correctly to avoid mistakes in the well number of each PDZ culture. Place one DiTis-1000 μL.

3. ∗Take the DW96-Transfo plate out of the incubator and visually check each culture. Note any missing culture and place the DW on one shaker position, without film. The OD600 of the transformation cultures can be checked. All cultures reach an OD600 close to 2 after an overnight pre-culture.

4. The DW96-Transfo is shaken at 1000 rpm until pipetting.

5. With the 8-channels Air LiHa, aspirate 450 μL/well of pre-culture from the DW96-Transfo plate, and dispense 150 μL/well into three different DW24 plates (series a, b, c) with medium (following the pipetting plan described in Fig. 2 at Day 2). The first column of the DW96-Transfo is dispensed into the first two columns of the first DW24 plate. The second column of the DW96-Transfo is dispensed into the next two columns of the first DW24 plate. The third column of the DW96-Transfo is dispensed into the last two columns of the first DW24 plate. The fourth column of the DW96-Transfo is dispensed into the first two columns of the second DW24 plate, and so on until the 96 pre-cultures have inoculated the 3 series of 96 cultures in DW24.

6. Keep the remaining pre-cultures in the shaker to prepare glycerol stocks (see below). ∗For culture growth and protein induction, place the twelve DW24 plates (DW24-Cultures) in the Infors-Multitron for 4 h at 37 °C, 220 rpm, then drop the temperature to 17 °C and leave overnight.
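The column-to-plate assignment in step 5 can be written down as a small lookup, which helps when annotating the DW24 plates. The sketch below follows the text for the plate number and column pair; how the eight wells of a DW96 column are split between the two DW24 columns is not stated in the text (Fig. 2 gives the actual plan), so the row assignment used here (A to D in the first column, E to H in the second) is an assumption, and the function name is hypothetical.

```python
ROWS_96 = "ABCDEFGH"
ROWS_24 = "ABCD"


def dw24_destination(well_96):
    """Map a DW96 well (e.g. 'E5') to (DW24 plate number, DW24 well)."""
    row = ROWS_96.index(well_96[0].upper())
    col = int(well_96[1:])
    if not 1 <= col <= 12:
        raise ValueError(f"invalid DW96 column: {col}")
    # Three DW96 columns fill one DW24 plate (plates 1 to 4 within a series).
    plate = (col - 1) // 3 + 1
    # Each DW96 column occupies a pair of adjacent DW24 columns.
    first_col = 2 * ((col - 1) % 3) + 1
    # Assumption: rows A-D go to the first column of the pair, E-H to the second.
    dest_col = first_col + row // 4
    dest_row = ROWS_24[row % 4]
    return plate, f"{dest_row}{dest_col}"


for well in ("A1", "H1", "A4", "H12"):
    print(well, "->", dest := dw24_destination(well))
```

The same mapping applies to each of the three series (a, b, c), which gives the twelve DW24 plates used per 96-well pre-culture plate.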


Fig. 2 Synopsis of the library production


Protein production from glycerol stock

As an alternative to a fresh transformation, the cultures for production can be inoculated from the glycerol stocks that need to be generated during the first library preparation.

Preparation of the glycerol stock

At the end of the inoculation phase (Day 2), keep the DW96-Transfo shaking on the Tecan deck until everything is ready for the glycerol stock preparation.

1. ∗Preparation of medium (under sterile conditions): Fill one sterile 100 mL trough with 15 mL of 50% glycerol and cover it with a sterile lid.

2. ∗Preparation of the Tecan worktable: place the glycerol-containing 100 mL trough. Three annotated PG96 plates (PG96-Stock) on the Cooling station; two DiTis-200 μL.

3. With the 8-channels Air LiHa, dispense 120 μL/well of 50% glycerol in the first PG96-Stock plate.

4. With the 8-channels Air LiHa, aspirate 180 μL/well of transformation cultures from the DW96-Transfo plate, and dispense the total volume in the respective well of the PG96-Stock plate with 50% glycerol. Mix five times with 190 μL. Aspirate 200 μL/well of this bacterial glycerol stock, then dispense 100 μL in the respective well of each of the two other PG96-Stock plates.

5. ∗Place a sterile adhesive aluminum foil on each PG96-Stock plate and store them at −80 °C.

Protein production from a glycerol stock

This is Day 1 for a new production batch.

1. ∗Preparation of media and plates (under sterile conditions): cover a new Sterile-DW96 plate with an Air Permeable adhesive film. Prepare four Falcon-50 tubes, each containing 30 mL of LB medium supplemented with antibiotics. 10 min before the experiment: thaw on ice one plate of PDZ bacterial glycerol stocks (PG96-Stock).

2. ∗Preparation of the Tecan worktable: place the thawed PG96-Stock plate (without aluminum foil) on one Cooling position. The four Falcon tubes containing LB-antibiotics. One DiTis-200 μL.

3. With the 4-channels Liquid LiHa, dispense 1 mL/well of LB medium supplemented with antibiotics into the Sterile-DW96 plate.

4. With the 8-channels Air LiHa, inoculate 80 μL/well of PDZ bacterial glycerol stock from the PG96-Stock plate into the respective wells of the DW96 plate with LB-antibiotics (these pre-cultures are then equivalent to the DW96-Transfo).


5. ∗Place the pre-cultures plate (DW96-Transfo) in the Infors-Multitron for overnight incubation, at 37 °C, 220 rpm.

6. Start the protein production from the DW96-Transfo plate as described above.

Day 3

Harvesting, aliquoting, and freezing the libraries

1. ∗Preparation of buffers and plates: Prepare 100 mL of Lysis buffer. Annotate three PCR96 plates.

2. ∗Preparation of the Tecan worktable: Turn on the Cooling station. Place: Three PCR96 plates on two cold positions and one Flat position. One DiTis-1000 μL.

3. ∗Centrifuge the twelve DW24 plates sequentially for 10 min, 3000 × g at 4 °C. Discard the supernatants by inverting the plates, blot them on tissue to remove the remaining liquid, and keep the pellets at 4 °C. During the centrifugation, keep the remaining DW24 plates containing the cultures shaking in the incubator to avoid sedimentation.

4. ∗On ice, with the multichannel adjustable spacer pipette, dispense 300 μL/well of Lysis buffer in the three DW24 plates corresponding to a triplicate (series a, b, c) of cultures of the first 24 PDZ over-expressions (i.e., the 3 DW plates of culture 1). Shake for 3 min at 1000 rpm, at 4 °C. On ice, resuspend each well properly by pipetting and pool the triplicate wells in one of the DW24 plates (DW24-Crude-1). The total lysate volume per PDZ is now around 1 mL: 900 μL of buffer plus the volume of the pellets.

5. Place this resuspended-lysates DW24-Crude-1 plate shaking on a cold position (see Note 2).

6. With the 8-channels Air LiHa, thoroughly mix the lysates (5 × 600 μL) from the first column of the DW24-Crude-1 plate, aspirate 150 μL and dispense 50 μL/well into the respective wells of the three copies of PCR96 (PCR96-Crude-Aliquot).

7. ∗While the first culture series is being aliquoted, repeat step 4 to generate DW24-Crude (2-4). Following this protocol, altogether four DW24-Crude (1-4) and three PCR96-Crude-Aliquot plates are generated for 96 PDZ cultures.

8. ∗Seal the four DW24-Crude and three PCR96-Crude-Aliquot plates with adhesive aluminum foil, and store them at −80 °C (see Note 3).

Optional: OD600 measurement of the cultures (Day 3)

After the overnight culture at 17 °C and before harvesting, the OD600 can be taken from an aliquot of the cultures to confirm that all cultures grew at comparable rates. The OD600 in these conditions should be around 12–14.


1. ∗Turn on the Cooling station. Place: One PS96 plate on a Flat position. Four Falcon-50 tubes, each containing 25 mL of PBS. Two DiTis-200 μL.

2. With the 4-channels Liquid LiHa, dispense 180 μL/well of PBS into the PS96 plate.

3. ∗During the PBS distribution: Take all DW24-Cultures out of the incubator, visually check all wells and write down low-density wells. Place one of the three series of the four DW24-Cultures on cold positions. Avoid bacterial sedimentation before the OD measurement as much as possible by placing the DW24-Cultures plates on the worktable just before the start of the Tecan script, and keep the other DW24-Cultures in the incubator to avoid damaging the bacteria.

4. With the 8-channels Air LiHa, mix the 2 mL cultures (3 × 190 μL), aspirate 20 μL and dispense it in the PS96 plate with PBS.

5. With the 8-channels Air LiHa, mix all wells of the diluted cultures in the PS96 plate.

6. Transfer the dilutions PS96 plate to the NanoQuant reader.

7. Measure the OD600.

8. Transfer the PS96 plate back to the worktable.

9. An Excel file with one OD600 value for each well is automatically generated.

Quantification of each PDZ, Generation of the frozen protein libraries and Quality control

Determination of the soluble PDZ concentrations

The lysates are stable for months at −80 °C; therefore, the preparation of the final library can be performed the day after the cultures (Day 4) or later on.

1. ∗Preparation of buffers: Prepare 450 μL of DNAse solution (75 μL DNAse I 1 mg/mL, 150 μL MgSO4 1 M, 225 μL H2O). 50 mL of Lysis buffer, keep on ice. 3 mL of Sample Buffer Caliper supplemented with 5 mM DTT, keep at room temperature.

2. ∗Thaw one PCR96-Crude-Aliquot plate at 37 °C, with vigorous agitation for 5 min. The extracts should be completely viscous.

3. ∗Add 3 μL/well of DNAse solution. Agitate the plate strongly at room temperature for 15 additional minutes. Transfer each extract into one 1.5 mL Eppendorf tube with the Matrix multipipette. During transfer, check that the lysates are not viscous anymore.


4. ∗Clear the extracts by centrifugation at high speed (20,800 × g) for 10 min. Transfer the supernatants into a new PCR96 plate with the Matrix multipipette (PCR96-Soluble-Aliquot). Keep the plate on ice and quickly proceed to the serial dilutions.

5. ∗Preparation of the Tecan worktable: turn on the Cooling station. Place: Three new PCR96 plates (1:8, 1:16 and 1:32 dilution plates), the PCR96-Soluble-Aliquot plate, and one PCR384 plate on cold positions. 50 mL Lysis buffer in a 100 mL trough. 50 mL distilled H2O in a 100 mL trough. The Sample buffer Caliper in a 25 mL Max-Recovery trough. Three DiTis-10 μL and one DiTis-200 μL.

6. With the 4-channels Liquid LiHa, dispense 140 μL/well of Lysis buffer in the PCR96-1:8 plate and 80 μL/well in both the PCR96-1:16 and PCR96-1:32 plates.

7. With the 8-channels Air LiHa, dispense 5.6 μL/well of Sample buffer Caliper into the PCR384 plate. Discard the DiTis after every six columns dispensed.

8. Serial dilutions: with the 8-channels Air LiHa, thoroughly mix (5 × 20 μL) the cleared extracts in the PCR96-Soluble-Aliquot plate. Aspirate 20 μL of cleared extract. Dispense the whole volume into the respective wells of the PCR96-1:8 plate and mix thoroughly (5 × 150 μL). Aspirate 80 μL/well from the PCR96-1:8 plate. Dispense the whole volume into the respective wells of the PCR96-1:16 plate and mix thoroughly. Aspirate 80 μL/well from the PCR96-1:16 plate. Dispense the whole volume into the respective wells of the PCR96-1:32 plate and mix thoroughly.

9. With the 8-channels Air LiHa, aspirate 4 μL/well from the PCR96-1:32 plate and dispense it into the PCR384-Dilutions plate; mix thoroughly (3 × 5 μL). Repeat the operation from the PCR96-1:16 and PCR96-1:8 plates.

10. ∗Place an adhesive plastic film on the PCR384-Dilutions plate.

11. ∗Heat the PCR384-Dilutions plate at 95 °C for 5 min in a dry bath (see Note 4) or PCR384. Remove the adhesive plastic film.

12. With the 4-channels Liquid LiHa, dispense 20.6 μL/well of distilled H2O into the PCR384-Dilutions plate (see Note 5). ∗To remove any air bubbles, centrifuge the PCR384-Dilutions plate for 2 min at 2000 × g.

13. The PCR384-Dilutions plate is analyzed with the High Sensitivity HT Protein Express protocol (10–100 kDa program) on the LabChip GXII device (Perkin Elmer) following the supplier's instructions. The run takes approximately 4.3 h to separate and analyze 384 samples.
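The volumes in steps 6 to 12 can be checked against the nominal plate names (1:8, 1:16, 1:32) with a short calculation. The sketch below is only an arithmetic check under the volumes quoted above; the function name is hypothetical.

```python
from fractions import Fraction


def serial_dilution_factors(steps):
    """Return cumulative fold-dilutions for (transfer_ul, diluent_ul) steps."""
    factors = []
    cumulative = Fraction(1)
    for transfer_ul, diluent_ul in steps:
        # Fold-dilution of one step = total volume / transferred volume.
        cumulative *= Fraction(transfer_ul + diluent_ul, transfer_ul)
        factors.append(cumulative)
    return factors


# 20 uL extract into 140 uL buffer, then two transfers of 80 uL into 80 uL.
steps = [(20, 140), (80, 80), (80, 80)]
factors = serial_dilution_factors(steps)
print([f"1:{f}" for f in factors])

# Final volume per PCR384 well: sample buffer + diluted extract + water.
well_volume_ul = Fraction("5.6") + 4 + Fraction("20.6")
print(float(well_volume_ul), "uL per well")
```

Exact fractions are used instead of floats so that the comparison with the nominal dilution factors is not affected by rounding.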


Table 3 Concentration of the soluble fraction of the “PDZome V.2” and tandem libraries

A. PDZome V.2 (each line lists one plate column, wells A to H)

Plate A
Column | A | B | C | D | E | F | G | H
1 | AHNAK | AHNAK2 | APBA1_1 | APBA1_2 | APBA2_1 | APBA2_2 | APBA3_1 | APBA3_2
2 | ARHGAP21 | ARHGAP23 | ARHGEF11 | ARHGEF12 | CARD11 | CARD14 | CASK | CNKSR1
3 | CNKSR2 | CNKSR3 | DEPTOR | DFNB31_1 | DFNB31_2 | DFNB31_3 | DLG1_1 | DLG1_2
4 | DLG1_3 | DLG2_1 | DLG2_2 | DLG2_3 | DLG3_1 | DLG3_2 | DLG3_3 | DLG4_1
5 | DLG4_2 | DLG4_3 | DLG5_1 | DLG5_2 | DLG5_3 | DLG5_4 | DVL1 | DVL2
6 | DVL3 | FRMPD1 | FRMPD2_1 | FRMPD2_2 | FRMPD2_3 | FRMPD3 | FRMPD4 | GIPC1
7 | GIPC2 | GIPC3 | GOPC | GORASP1 | GORASP2 | GRASP | GRID2IP_1 | GRID2IP_2
8 | GRIP1_1 | GRIP1_2 | GRIP1_3 | GRIP1_4 | GRIP1_5 | GRIP1_6 | GRIP1_7 | GRIP2_1
9 | GRIP2_2 | GRIP2_3 | GRIP2_4 | GRIP2_5 | GRIP2_6 | GRIP2_7 | HTRA1 | HTRA2
10 | HTRA3 | HTRA4 | IL16_1 | IL16_2 | IL16_3 | IL16_4 | InaDl_1 | InaDl_10
11 | InaDl_2 | InaDl_3 | InaDl_4 | InaDl_5 | InaDl_6 | InaDl_7 | InaDl_8 | InaDl_9
12 | INTU | LAP2 | LDB3 | LIMK1 | LIMK2 | LIN7A | LIN7B | LIN7C

Plate B
Column | A | B | C | D | E | F | G | H
1 | LMO7 | LNX1_1 | LNX1_2 | LNX1_3 | LNX1_4 | LNX2_1 | LNX2_2 | LNX2_3
2 | LNX2_4 | LRRC7 | MAGI1_1 | MAGI1_2 | MAGI1_3 | MAGI1_4 | MAGI1_5 | MAGI1_6
3 | MAGI2_1 | MAGI2_2 | MAGI2_3 | MAGI2_4 | MAGI2_5 | MAGI2_6 | MAGI3_1 | MAGI3_2
4 | MAGI3_3 | MAGI3_4 | MAGI3_5 | MAGI3_6 | MAGIX | MAST1 | MAST2 | MAST3
5 | MAST4 | MLLT4 | MPDZ_1 | MPDZ_10 | MPDZ_11 | MPDZ_12 | MPDZ_13 | MPDZ_2
6 | MPDZ_3 | MPDZ_4 | MPDZ_5 | MPDZ_6 | MPDZ_7 | MPDZ_8 | MPDZ_9 | MPP1
7 | MPP2 | MPP3 | MPP4 | MPP5 | MPP6 | MPP7 | MYO18A | NHERF_1
8 | NHERF_2 | NHERF2_1 | NHERF2_2 | NHERF3_1 | NHERF3_2 | NHERF3_3 | NHERF3_4 | NHERF4_1
9 | NHERF4_2 | NHERF4_3 | NHERF4_4 | NOS1 | PARD3_1 | PARD3_2 | PARD3_3 | PARD3B_1
10 | PARD3B_2 | PARD3B_3 | PARD6A | PARD6B | PARD6G | PCLO | PDLIM1 | PDLIM2
11 | PDLIM3 | PDLIM4 | PDLIM5 | PDLIM7 | PDZD11 | PDZD2_1 | PDZD2_2 | PDZD2_3
12 | PDZD2_4 | PDZD2_5 | PDZD2_6 | PDZD4 | PDZD7_1 | PDZD7_2 | PDZD7_3 | PDZD8

Plate C
Column | A | B | C | D | E | F | G | H
1 | PDZD9 | PDZRN3_1 | PDZRN3_2 | PDZRN4_1 | PDZRN4_2 | PICK1 | PPP1R9A | PPP1R9B
2 | PREX1_1 | PREX1_2 | PREX2_1 | PREX2_2 | PRX | PSCDBP | PSMD9 | PTPN13_1
3 | PTPN13_2 | PTPN13_3 | PTPN13_4 | PTPN13_5 | PTPN3 | PTPN4 | RADIL | RAPGEF2
4 | RAPGEF6 | RGS12 | RGS3 | RHPN1 | RHPN2 | RIMS1 | RIMS2 | SCRIB_1
5 | SCRIB_2 | SCRIB_3 | SCRIB_4 | SDCBP_1 | SDCBP_2 | SDCBP2_1 | SDCBP2_2 | SHANK1
6 | SHANK2 | SHANK3 | SHROOM2 | SHROOM3 | SHROOM4 | SIPA1 | SIPA1L1 | SIPA1L2
7 | SIPA1L3 | SNTA1 | SNTB1 | SNTB2 | SNTG1 | SNTG2 | SNX27 | STXB4
8 | SYNJ2BP | SYNP2 | SYNPO2L | TIAM1 | TIAM2 | TJP1_1 | TJP1_2 | TJP1_3
9 | TJP2_1 | TJP2_2 | TJP2_3 | TJP3_1 | TJP3_2 | TJP3_3 | TX1B3 | USH1C_1
10 | USH1C_2 | USH1C_3

B. Tandems
Column | A | B | C | D | E | F | G | H
1 | APBA1_1-2 | APBA2_1-2 | APBA3_1-2 | DFNB31_1-2 | DLG1_1-2 | DLG2_1-2 | DLG3_1-2 | DLG4_1-2
2 | DLG5_1-2 | GRIP1_1-2 | GRIP1_2-3 | GRIP1_4-5 | GRIP1_5-6 | GRIP2_1-2 | GRIP2_2-3 | GRIP2_4-5
3 | GRIP2_5-6 | INADL_1-2 | INADL_2-3 | INADL_4-5 | INADL_8-9 | LNX1_1-2 | LNX2_1-2 | MPDZ_1-2
4 | MPDZ_8-9 | MPDZ_10-11 | NHERF3_2-3 | NHERF4_1-2 | PARD3B_2-3 | PDZD2_5-6 | PDZD7_1-2 | PREX1_1-2
5 | PREX2_1-2 | PTPN13_4-5 | SCRIB_3-4 | SCRIB_3-4_L | USH1C_1-2

Concentration range | PDZome V.2 | Tandems
>100 | 113 | 4
50–100 | 96 | 15
10–50 | 53 | 17
8–10 | 4 | 1
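The summary counts of Table 3 can be cross-checked against the library sizes given in Subheading 3.2.1 (266 PDZ domains and 37 tandems). The sketch below only sums the listed counts and expresses them as percentages; the variable and function names are hypothetical, and the concentration unit is left unspecified because it is not stated in the table summary.

```python
# Counts per concentration range, as listed in the Table 3 summary.
RANGES = [">100", "50-100", "10-50", "8-10"]
PDZOME_COUNTS = [113, 96, 53, 4]
TANDEM_COUNTS = [4, 15, 17, 1]


def summarize(counts):
    """Return (total, percentage per range rounded to one decimal)."""
    total = sum(counts)
    return total, [round(100 * c / total, 1) for c in counts]


for name, counts in (("PDZome V.2", PDZOME_COUNTS), ("Tandems", TANDEM_COUNTS)):
    total, percentages = summarize(counts)
    print(name, "total:", total)
    for label, pct in zip(RANGES, percentages):
        print(f"  {label}: {pct}%")
```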

[…] 0.2 to at least one peptide (see Table 4). We are therefore confident that this new PDZome library is of excellent quality and that, used with the holdup assay or in other assays, it can help improve knowledge in the field of PDZ domains.


Fig. 5 “PDZome V.2” binding profiles of HPV16. Gray line: high-confidence Binding Intensity (BI) threshold

Table 4 Affinity of the “PDZome V.2” for various peptides (each line lists one plate column, wells A to H)

Plate A
Column | A | B | C | D | E | F | G | H
1 | AHNAK | AHNAK2 | APBA1_1 | APBA1_2 | APBA2_1 | APBA2_2 | APBA3_1 | APBA3_2
2 | ARHGAP21 | ARHGAP23 | ARHGEF11 | ARHGEF12 | CARD11 | CARD14 | CASK | CNKSR1
3 | CNKSR2 | CNKSR3 | DEPTOR | DFNB31_1 | DFNB31_2 | DFNB31_3 | DLG1_1 | DLG1_2
4 | DLG1_3 | DLG2_1 | DLG2_2 | DLG2_3 | DLG3_1 | DLG3_2 | DLG3_3 | DLG4_1
5 | DLG4_2 | DLG4_3 | DLG5_1 | DLG5_2 | DLG5_3 | DLG5_4 | DVL1 | DVL2
6 | DVL3 | FRMPD1 | FRMPD2_1 | FRMPD2_2 | FRMPD2_3 | FRMPD3 | FRMPD4 | GIPC1
7 | GIPC2 | GIPC3 | GOPC | GORASP1 | GORASP2 | GRASP | GRID2IP_1 | GRID2IP_2
8 | GRIP1_1 | GRIP1_2 | GRIP1_3 | GRIP1_4 | GRIP1_5 | GRIP1_6 | GRIP1_7 | GRIP2_1
9 | GRIP2_2 | GRIP2_3 | GRIP2_4 | GRIP2_5 | GRIP2_6 | GRIP2_7 | HTRA1 | HTRA2
10 | HTRA3 | HTRA4 | IL16_1 | IL16_2 | IL16_3 | IL16_4 | InaDl_1 | InaDl_10
11 | InaDl_2 | InaDl_3 | InaDl_4 | InaDl_5 | InaDl_6 | InaDl_7 | InaDl_8 | InaDl_9
12 | INTU | LAP2 | LDB3 | LIMK1 | LIMK2 | LIN7A | LIN7B | LIN7C

Plate B
Column | A | B | C | D | E | F | G | H
1 | LMO7 | LNX1_1 | LNX1_2 | LNX1_3 | LNX1_4 | LNX2_1 | LNX2_2 | LNX2_3
2 | LNX2_4 | LRRC7 | MAGI1_1 | MAGI1_2 | MAGI1_3 | MAGI1_4 | MAGI1_5 | MAGI1_6
3 | MAGI2_1 | MAGI2_2 | MAGI2_3 | MAGI2_4 | MAGI2_5 | MAGI2_6 | MAGI3_1 | MAGI3_2
4 | MAGI3_3 | MAGI3_4 | MAGI3_5 | MAGI3_6 | MAGIX | MAST1 | MAST2 | MAST3
5 | MAST4 | MLLT4 | MPDZ_1 | MPDZ_10 | MPDZ_11 | MPDZ_12 | MPDZ_13 | MPDZ_2
6 | MPDZ_3 | MPDZ_4 | MPDZ_5 | MPDZ_6 | MPDZ_7 | MPDZ_8 | MPDZ_9 | MPP1
7 | MPP2 | MPP3 | MPP4 | MPP5 | MPP6 | MPP7 | MYO18A | NHERF_1
8 | NHERF_2 | NHERF2_1 | NHERF2_2 | NHERF3_1 | NHERF3_2 | NHERF3_3 | NHERF3_4 | NHERF4_1
9 | NHERF4_2 | NHERF4_3 | NHERF4_4 | NOS1 | PARD3_1 | PARD3_2 | PARD3_3 | PARD3B_1
10 | PARD3B_2 | PARD3B_3 | PARD6A | PARD6B | PARD6G | PCLO | PDLIM1 | PDLIM2
11 | PDLIM3 | PDLIM4 | PDLIM5 | PDLIM7 | PDZD11 | PDZD2_1 | PDZD2_2 | PDZD2_3
12 | PDZD2_4 | PDZD2_5 | PDZD2_6 | PDZD4 | PDZD7_1 | PDZD7_2 | PDZD7_3 | PDZD8

Plate C
Column | A | B | C | D | E | F | G | H
1 | PDZD9 | PDZRN3_1 | PDZRN3_2 | PDZRN4_1 | PDZRN4_2 | PICK1 | PPP1R9A | PPP1R9B
2 | PREX1_1 | PREX1_2 | PREX2_1 | PREX2_2 | PRX | PSCDBP | PSMD9 | PTPN13_1
3 | PTPN13_2 | PTPN13_3 | PTPN13_4 | PTPN13_5 | PTPN3 | PTPN4 | RADIL | RAPGEF2
4 | RAPGEF6 | RGS12 | RGS3 | RHPN1 | RHPN2 | RIMS1 | RIMS2 | SCRIB_1
5 | SCRIB_2 | SCRIB_3 | SCRIB_4 | SDCBP_1 | SDCBP_2 | SDCBP2_1 | SDCBP2_2 | SHANK1
6 | SHANK2 | SHANK3 | SHROOM2 | SHROOM3 | SHROOM4 | SIPA1 | SIPA1L1 | SIPA1L2
7 | SIPA1L3 | SNTA1 | SNTB1 | SNTB2 | SNTG1 | SNTG2 | SNX27 | STXB4
8 | SYNJ2BP | SYNP2 | SYNPO2L | TIAM1 | TIAM2 | TJP1_1 | TJP1_2 | TJP1_3
9 | TJP2_1 | TJP2_2 | TJP2_3 | TJP3_1 | TJP3_2 | TJP3_3 | TX1B3 | USH1C_1
10 | USH1C_2 | USH1C_3

Binders | 244
No binders | 22

The affinity between the “PDZome V.2” and 30 PDZ-binding motifs has been quantified through the holdup protocol. The PDZ domains that have a BI >0.2 for at least one peptide are depicted in blue.
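The binder/non-binder split in Table 4 follows a simple rule: a PDZ domain counts as a binder if its Binding Intensity (BI) exceeds 0.2 for at least one peptide. A minimal sketch of that classification is shown below; the function names, the data layout and the example profiles are hypothetical and do not reproduce the holdup analysis software or any measured values.

```python
BI_THRESHOLD = 0.2  # high-confidence Binding Intensity threshold from the text


def is_binder(binding_intensities, threshold=BI_THRESHOLD):
    """True if the BI exceeds the threshold for at least one peptide."""
    return any(bi > threshold for bi in binding_intensities)


def count_binders(profiles, threshold=BI_THRESHOLD):
    """Return (number of binders, number of non-binders) for a dict of profiles."""
    binders = sum(1 for bis in profiles.values() if is_binder(bis, threshold))
    return binders, len(profiles) - binders


# Placeholder profiles: one BI value per peptide (illustrative values only).
example_profiles = {
    "PDZ_example_1": [0.05, 0.31, 0.10],
    "PDZ_example_2": [0.02, 0.08, 0.19],
    "PDZ_example_3": [0.45, 0.50, 0.00],
}

print(count_binders(example_profiles))
```

Applied to the full library, this rule corresponds to the 244 binders and 22 non-binders reported in the table (266 domains in total).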

4 Notes

1. The lysozyme added to the lysis buffer is used for lysis, but also to normalize the quantification of the soluble PDZ domains expressed and the quantification of the holdup binding intensities. Therefore, it is crucial that its concentration (4 μM) is precisely adjusted, and that the buffer is freshly prepared before each use to avoid any lysozyme alteration at any step of the process.

2. The manual homogenization of the lysates by pipetting is critical for accurate titration of the PDZ library. Keep the cell pellets/lysis buffer on ice at all times and perform this step as quickly as possible. If the cells start to break, the lysate will not stay homogeneous in the multiple aliquoted plates used for the quantification of the library. To obtain the concentration of each soluble PDZ in the crude extract, three aliquots (50 μL) of the library should be generated in PCR plates (PCR96-Crude-Aliquot). One aliquot is quantified to estimate the concentration of soluble PDZ. The second aliquot is then thawed, the dilution factors determined earlier are applied, and the concentration of the library is confirmed by capillary analysis. If subtle modifications of the dilution matrix are necessary, the dilution factors are modified and the third aliquot is thawed and diluted, and the concentration of the full library is controlled again. This information is used to dilute the final library to 5 μM of soluble PDZ, which is aliquoted in multiple copies and frozen. Determining an accurate dilution factor for each PDZ is a critical step because these factors will be applied to the whole library (step 4), generating tens of plates of the 5 μM libraries. If several PDZ of the library are too diluted, the library would need to be redone from scratch!

3. For the first production of the full library, we recommend growing and aliquoting plate A, then on the following days plate B and finally plate C. While 3 × 266 cultures in parallel are not an issue for several labs, harvesting and aliquoting plates A, B, and C in 1 day carries a high risk of cell lysis before the aliquoting is finished. That would result in an inhomogeneous sampling of the library in the PCR96-Crude-Aliquot for plates B and C.

4. When using the dry bath, let the plate cool down on ice/the cooling station for a minimum of 15 min under a heavy aluminum lid to avoid condensation and plate deformation during the cooling phase.

5. In order to adjust the final volume to the well capacity of a 384-well PCR plate, the volume ratios are slightly different from the manufacturer's specifications for preparing samples for the Caliper analysis (the correct volume of water would be 25.6 μL), but this has been validated to give exactly the same output as following the manufacturer's specifications.

6. For the quantification, select the His-Fusion-PDZ species in each lane and export the quantities determined by the integration software. Check at the same time that the apparent MW is in agreement with the expected MW of each protein. The software can easily do this in an automated manner once a list of expected MW for each position is provided (in .csv format).

7. P1, P2, and P3 are usually filled with the same peptide to determine the binding intensity of 96 PDZ against one peptide in triplicate. However, once the set-up is validated, screens in singlicate can also be conducted. When working with several peptides, it is therefore more economical to first perform an initial screen with 3 different peptides in singlicate to get initial BI values. Then, in a second holdup, accurate BI values can be calculated for the PDZ that were found to bind the peptides in the first screen, this time with triplicate experiments.

8. Because the peptides are in excess compared to the capacity of the beads, it is possible to recover the peptides by placing a new PCR384 plate under the Filter384 plate before applying the vacuum routine. The peptides can safely be used for another holdup.

9. Alternatively, SDS-PAGE sample buffer (SSB) can be used here in order to allow freezing of the PDZ-SSB plate. This gives the option to analyze with microfluidic capillary gel electrophoresis later or to safely ship the plates to a lab equipped with a LabChip GXII apparatus.
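Notes 2 and 6 describe two checks that lend themselves to a short script: computing the per-domain dilution factor needed to reach the 5 μM library concentration, and confirming that the apparent MW agrees with the expected MW. The sketch below is illustrative only; the function names and the 15% MW tolerance are assumptions, not values taken from the protocol or from the LabChip software.

```python
TARGET_UM = 5.0  # final soluble PDZ concentration of the library (Note 2)


def dilution_factor(measured_um, target_um=TARGET_UM):
    """Fold-dilution needed to reach the target, or None if already too dilute."""
    if measured_um < target_um:
        return None  # cannot be reached by dilution
    return measured_um / target_um


def mw_in_agreement(apparent_kda, expected_kda, rel_tolerance=0.15):
    """Compare apparent and expected MW; the 15% tolerance is an assumption."""
    return abs(apparent_kda - expected_kda) <= rel_tolerance * expected_kda


# Placeholder entries: (measured concentration in uM, apparent MW, expected MW).
example = {
    "PDZ_example_1": (50.0, 52.0, 50.0),
    "PDZ_example_2": (4.0, 70.0, 50.0),
}

for name, (conc, apparent, expected) in example.items():
    print(name, dilution_factor(conc), mw_in_agreement(apparent, expected))
```

Domains for which `dilution_factor` returns `None` correspond to the case the note warns about, where the extract is already below the target concentration.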

Acknowledgments

This work received institutional support from Centre National de la Recherche Scientifique (CNRS), Université de Strasbourg, Institut National de la Santé et de la Recherche Médicale (INSERM) and Région Alsace. The work was supported in part by a grant from Ligue contre le Cancer (équipe labellisée 2015). This work was supported by the Programme Transversal de Recherche from Institut Pasteur (PTR grant no. 483 to N.W.) and by the French Infrastructure for Integrated Structural Biology (FRISBI) ANR-10-INSB-05-01 for the AFMB.



Chapter 22

High-Throughput Protein Analysis Using Negative Stain Electron Microscopy and 2D Classification

Christopher P. Arthur and Claudio Ciferri

Abstract

High-throughput protein expression and purification allows for fast triaging of several constructs based on expression levels, protein integrity, and solubility. While this technology has been successfully adopted to prioritize constructs for structural biology, it cannot inform on important biochemical properties such as domain architecture, homogeneity, and flexibility. Negative staining electron microscopy can be used to quickly evaluate these properties and, if coupled to single particle analysis, can inform on the architecture and conformational state of nearly any protein sample. Here we describe a protocol for negative stain sample preparation, imaging, and two-dimensional (2D) data analysis applicable to a variety of protein complexes. We discuss in more detail a specific application of this technology to large molecule studies to determine the binding sites of individual antibodies on target antigens.

Key words Protein expression, Transmission electron microscopy, Negative stain EM, Single particle analysis, 2D classification, Epitope mapping

Renaud Vincentelli (ed.), High-Throughput Protein Production and Purification: Methods and Protocols, Methods in Molecular Biology, vol. 2025, https://doi.org/10.1007/978-1-4939-9624-7_22, © Springer Science+Business Media, LLC, part of Springer Nature 2019

1 Introduction

Recombinant protein expression plays a crucial role in enabling structural biology studies for drug discovery, and the rapid production and analysis of high-quality recombinant proteins is a critical component of the successful development of therapeutics. While this approach is very powerful in prioritizing constructs based on solubility and levels of expression of the resulting protein sample, there is a need for additional tools to quickly visualize these samples and determine their stoichiometry, architecture, and conformational homogeneity. Transmission electron microscopy (TEM) has historically been a particularly relevant tool for examining cellular structural elements. Some of the very first cellular TEM images from Dr. Keith Porter revolutionized how scientists thought of cellular structure and organization [1]. As technology has advanced, so have the structure-solving capabilities of TEM. Currently, it is possible to image individual proteins and protein complexes at near-atomic resolution thanks to advances in electron optics, electron detector technology, image processing algorithms, and protein sample preparation and optimization techniques [2–4]. The preferred method for high-resolution TEM imaging is cryo-TEM, which requires that highly homogeneous samples be flash frozen in a vitreous layer of buffer and imaged in the microscope under low electron dose conditions. Despite the quantum leap made by cryo-TEM in the past few years, its lower throughput and its routine applicability to only fairly large macromolecular complexes make this technology very inefficient and costly (both in time and money) for initial protein sample screening. For this reason, negative staining and single particle analysis have become the preferred choice for high-throughput sample analysis.

Negative stain TEM allows for rapid, relatively simple, and reproducible preparation and imaging of purified protein samples. The first work using negative staining for imaging biological single particles was performed almost 60 years ago [5], and while the technique has undergone modifications, the basic principle remains the same. The protein of interest is surrounded by a shell of heavy metal salts, creating contrast due to the differential electron scattering of the biological material and the surrounding stain. This process generates a "negative" image of the specimen because electrons interacting with the surrounding stain are scattered at higher angles and are thus excluded from the final image, while electrons that pass through the central sample are scattered at lower angles and contribute to the image signal.
This results in a contrast of white protein on a dark background, the opposite of what is seen in cryo-TEM imaging, where the electrons interacting with the surrounding solvent undergo very little, if any, scattering and therefore go on to interact with the detector, leaving an image of dark protein on a white background. Although negative staining gives higher contrast than cryo-TEM imaging, the granularity of the staining solution, which forms a shell around the particle of interest, limits the resolution of this technique to 15–20 Å. One limitation of this technique is sample flattening, caused by sample dehydration during grid preparation (see Subheading 3), especially for proteins having large internal cavities. In addition, preferential views are occasionally observed, arising from the interaction between a specific portion of the protein and the carbon support of the EM grid.

Selecting an appropriate staining agent is quite important. It is desirable to have a heavy metal salt which dries to give an evenly spread, amorphous layer surrounding and supporting the biological sample. The most used negative staining heavy metal salts are uranyl acetate (UA), uranyl formate (UF), sodium/potassium phosphotungstate (PTA), sodium silicotungstate (SS), and ammonium molybdate (AM). There are additional staining agents used less frequently that will not be described here. Ideally, the stain should interact with the biological sample and permeate into its aqueous cavities. This interaction can be limited by a number of factors: overall surface charge of the particle, hydrophobic pockets within the particle, and the size (granularity) and charge of the staining salt. Some of these can be overcome with the selection of the appropriate stain, or with the addition of other factors (detergent, carbohydrate). Since different stains perform differently, it is common practice to test several staining solutions (with different granularities and pH levels) to determine the best possible conditions for any given sample. This approach can also be used to verify that any observed defects (aggregation, protein dissociation) are intrinsic to the sample itself and are not caused by the staining solution. It is also very important to consider any potential interaction and incompatibility between the sample buffer and the staining solution (e.g., UA is incompatible with phosphate buffer). For the majority of samples, uranyl acetate (2%, pH 4.5), or the finer uranyl formate (2%, pH 4.5), will allow satisfactory imaging and can be used as the initial method of staining. There have been a number of reviews and book chapters devoted to negative staining, thoroughly covering a number of technical aspects of the staining protocols.

While raw negative staining micrographs can provide important information regarding the compositional homogeneity of the sample or its propensity to aggregate or disassemble, single particle analysis, and in particular 2D classification and averaging, can reveal important additional biochemical properties of the complexes under examination, such as domain organization, flexibility, and protein-protein interaction domains.

The entire process comprising EM grid preparation, data collection, particle picking, and 2D classification can be applied to any protein complex having a molecular weight of at least ~80–100 kDa and can be completed in about 24 h using an 80–200 keV electron microscope and a desktop computer or computer cluster. While this process is straightforward for the analysis of a handful of complexes, it is labor intensive and could represent a bottleneck for the analysis of recombinant proteins at the 96-well scale. Interestingly, in the past years, it has been shown that it is possible to dispense different samples at the nanoliter scale onto different programmable positions of a single EM grid using inkjet technology [6, 7]. It is therefore conceivable to create a pipeline that makes use of 96-well protein expression, automatic protein purification, staining optimization, and negative staining analysis using nanoliter-inkjet technology and automatic data collection software [8, 9].


We will introduce here our application of this technique to high-throughput sample preparation and evaluation in the pharmaceutical development industry.

2 Materials

2.1 Uranyl Acetate Stain Preparation

Reagents

1. Uranyl acetate (CAS #541-09-3, Electron Microscopy Sciences, USA).

Equipment and Consumables

1. Falcon tubes (Fisher Scientific).
2. Lab compact tube rocker.
3. Ultrasonic sonicator.

2.2 Preparation of Hydrophilic Carbon Support Films on 3 mm Cu TEM Grids

Equipment and Consumables

1. 200, 300, or 400 mesh continuous carbon Cu TEM grids (Ted Pella, Redding, CA). The 200, 300, or 400 mesh Cu TEM grids covered in a continuous carbon support film can be obtained from Ted Pella, or can be manufactured in-house using a number of techniques (see Note 1).
2. PELCO easiGlow™ (Ted Pella, Redding, CA).
3. Whatman filter paper number 1 (Merck-Millipore).

2.3 Staining of Sample for TEM Imaging

Equipment and Consumables

1. Parafilm M roll.

2.4 TEM Imaging

Equipment and Consumables

1. 80–120 keV or 80–200 keV side entry microscopes, despite their entry level, are excellent machines for these analyses.

2.5 Data Processing, Particle Picking, and 2D Classification

1. cisTEM (Computational Imaging System for Transmission Electron Microscopy) Version 1.0beta, freely available at https://cistem.org [10].

3 Methods

3.1 Uranyl Acetate Stain Preparation

1. Weigh out 100 mg of uranyl acetate (see Note 2).
2. Combine 100 mg of uranyl acetate with 5 mL of DH2O in a clean 5 mL falcon tube.
3. Place the solution onto a rocker and rock for 2 h.
4. After 2 h, sonicate the solution for 5 min.
5. Wrap the falcon tube in aluminum foil and store in a dark place (see Note 3).

This operation takes around 2.5 h.

3.2 Preparation of Hydrophilic Carbon Support Films on 3 mm Cu TEM Grids

1. Carbon coated grids are placed into a PELCO easiGlow™ glow discharge cleaning system and subjected to a 20 mA atmospheric plasma current for 20 s (see Note 4). Once finished, the grids should be used for further sample processing within 30 min.

This operation takes around 10 min.

3.3 Staining of Sample for TEM Imaging

1. Place a piece of clean parafilm onto the bench surface.
2. Place three 50 μL drops of DH2O onto the parafilm.
3. With the freshly plasma treated TEM grid held securely by its edge in a pair of fine forceps, apply 4 μL of your protein sample to the carbon surface.
4. After 1 min, apply the edge of the grid to a piece of Whatman No. 1 filter paper; this will slowly wick away the protein sample.
5. Immediately touch the surface of the carbon grid to the surface of the first 50 μL DH2O drop in order to wash the sample (see Note 5).
6. Touch the edge of the grid to a clean area of the Whatman No. 1 filter paper, and repeat with the subsequent two DH2O drops.
7. Once the final drop has been applied and wicked away, apply 4 μL of the filtered 2% uranyl acetate solution.
8. After 1 min, apply the edge of the grid to a clean area of the Whatman No. 1 filter paper and allow all of the stain to wick off the surface of the grid.
9. Allow the grid to air dry for 5 min, then place it into a storage grid box until ready to image.

This operation takes 3 min/grid. See Fig. 1a for an illustration.

3.4 TEM Imaging

1. Make sure that the electron beam is aligned.
2. Load the prepared negative staining grid in a single tilt holder and insert the holder in the microscope according to the manufacturer's protocols.
3. Set the stage at eucentric height at low magnification and search for a region of the grid with the correct amount of staining to visualize particles (see Note 6).
4. Use low dose mode during data acquisition so that the sample is not irradiated while focusing the image.
5. Go to the desired high magnification (ideally associated with a pixel size of ~2–3 Å/pixel on the specimen level and comprising 100–200 particles per field), focus until the minimum contrast is observed (see Note 7), and set the defocus to ~1 μm.
6. Record images using a ~1 s exposure time. Generally, for single particle analysis, 5000–20,000 particles can be required to obtain good quality 2D classes. A representative image of a protein complex imaged by negative staining TEM is depicted in Fig. 1b.

This operation takes around 2.5 h.

Fig. 1 (a) Negative staining sample preparation. Samples are deposited on a Cu grid coated with a thin layer of amorphous carbon. After 30 s incubation, the sample is blotted and washed three times with water before being stained with heavy atom salt solution. (b) Representative negative staining TEM image of a generic protein sample. The image shows that the protein sample is not aggregated and that it can adopt different orientations on the grid. The presence of smaller fragments (arrows), on the other hand, suggests degradation or disassembly

3.5 Particle Picking and 2D Classification

1. Examine the acquired images to ensure that no astigmatism is present and that particles are well stained.
2. Pick particles using the find particles algorithm within cisTEM [10]. During particle picking we choose only particles that are single, with no neighboring particles overlapping or in close proximity. Once the particle picking is completed, create a particle stack for 2D alignment.
3. Perform 2D classification using one of the freely or commercially available software packages. We generally use cisTEM [10] for reference free 2D classification, setting the desired number of output classes to the total number of particles divided by 20–40 (see Note 8). Class averages will inform on the homogeneity of the sample and on potential flexibility or heterogeneity.

These operations require around 3 h.
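The figures quoted in Subheadings 3.4 and 3.5 (100–200 particles per field, 5000–20,000 particles in total, one class per 20–40 particles, and a pixel size of ~2–3 Å) can be turned into a quick planning calculation. The sketch below is illustrative only; the function names are hypothetical and it is not part of cisTEM. The only outside fact it relies on is the sampling theorem, which places the finest resolvable detail (the Nyquist limit) at twice the pixel size.

```python
import math


def micrographs_needed(target_particles, particles_per_field):
    """Number of images to record to reach the target particle count."""
    return math.ceil(target_particles / particles_per_field)


def class_count_range(n_particles, min_per_class=20, max_per_class=40):
    """Range of 2D class numbers when each class holds 20-40 particles."""
    fewest = max(1, n_particles // max_per_class)
    most = max(1, n_particles // min_per_class)
    return fewest, most


def nyquist_limit_angstrom(pixel_size_angstrom):
    """Finest detail representable at a given pixel size (2 x pixel size)."""
    return 2.0 * pixel_size_angstrom


# Best and worst cases from the protocol's ranges.
print(micrographs_needed(5000, 200))    # fewest images
print(micrographs_needed(20000, 100))   # most images
print(class_count_range(10000))         # classes for a 10,000-particle stack
print(nyquist_limit_angstrom(2.5))      # in angstroms
```

With a pixel size of 2–3 Å the Nyquist limit is 4–6 Å, well below the 15–20 Å resolution imposed by the stain, so the recommended magnification does not limit the analysis.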

3.6 Low Resolution Epitope Mapping Using 2D Classification

It is possible to use negative staining TEM and 2D classification to perform fast epitope mapping by comparing a specific antigen in its unbound state and bound to a set of specific monoclonal antibodies in fragment antigen-binding (Fab) format.


Here we present a case study aiming to identify the binding sites of potent neutralizing antibodies against human Cytomegalovirus (HCMV) gH/gL/UL128/UL130/UL131A (Pentamer). This glycoprotein complex is required for HCMV entry in endothelial/epithelial cells and is targeted by potently neutralizing antibodies in the infected host [11–13]. Using negative staining TEM and 2D analysis of purified complexes, each bound to a subset of neutralizing antibodies, it was possible to show that the region of HCMV neutralization is limited to a relatively small region occupied by the ULs components (see Fig. 2, modified from [13]). These images were also used to determine three-dimensional reconstructions of several combinations of HCMV Pentamer-Fab complexes [13]. Taken together, these studies suggested new strategies for the development of antibodies and vaccines against HCMV [8] and were crucial in prioritizing the correct combination of neutralizing antibodies to stabilize these HCMV complexes, allowing their successful crystallization [14].

4 Notes

1. Carbon coated TEM grids can be manufactured by a number of techniques: (a) Bare grids can be coated with a thin layer of formvar, and subsequently coated with a thin layer of evaporated carbon. (b) Bare grids can be submerged into a DH2O bath, and evaporated carbon on the surface of pristine mica can be floated onto the surface of the DH2O and slowly lowered onto the submerged bare TEM grids.
2. Uranyl acetate is a radioactive material and should be handled in accordance with your institute's radiation safety policy. Always wear appropriate PPE when handling.
3. This solution can be stored for at least 6 months in the dark.
4. Grids should be evaluated for their hydrophilicity. This is most easily done visually. The drop of solution on the surface of the grid should spread evenly and not ball up on the surface. If the solution balls up, or if the staining when imaged in the TEM seems very patchy, it may be due to your grid not being hydrophilic enough. This can be remedied by increasing the glow discharge time.
5. Not all samples will behave well when exposed to DH2O with no buffering conditions. You should consider the needed buffering conditions and any possible adverse effects with your sample.


Fig. 2 (a) HCMV Pentamer-specific human neutralizing antibodies. Each slice in the pie represents a different group of antibodies. (b) Reference free 2D EM analysis reveals the position of each binding site on the Pentamer for each group as shown in (a). From our EM analysis, Site 1, Site 4, and Site 6 antibodies bind to the upper portion of the Pentamer with Site 4 and Site 6 binding to a very similar position. Site 5 antibodies bind at the tip of the ULs protrusion, while Site 2, Site 3, and Site 7 antibodies bind on the lower portion of the Pentamer. (c) Schematic representation of the site of interaction between Pentamer and neutralizing antibodies as inferred from EM studies

6. It is important that the particles are completely embedded in the heavy atom stain. Failure to do so will produce, in the less severe cases, a strong halo around the particles that interferes with their alignment and, in the most severe cases, positive staining (black particles on a white background). In either case, the particles will not be suitable for 2D alignment.


7. It is possible to determine the minimum contrast by determining the point at which the live power spectrum has no Fast Fourier Transform (FFT) rings.
8. Negative staining images generally have high contrast. Therefore 15 particles per class are sufficient to generate highly detailed class averages.

References

1. Porter KR, Claude A, Fullam EF (1945) A study of tissue culture cells by electron microscopy. J Exp Med 81(3):233–246. https://doi.org/10.1084/jem.81.3.233
2. Kuhlbrandt W (2014) The resolution revolution. Science 343(6178):1443–1444. https://doi.org/10.1126/science.1251652
3. Subramaniam S, Earl LA, Falconieri V, Milne JL, Egelman EH (2016) Resolution advances in cryo-EM enable application to drug discovery. Curr Opin Struct Biol 41:194–202. https://doi.org/10.1016/j.sbi.2016.07.00
4. Cheng Y, Grigorieff N, Penczek PA, Walz T (2015) A primer to single-particle cryo-electron microscopy. Cell 161(3):438–449. https://doi.org/10.1016/j.cell.2015.03.050
5. Brenner S, Horne RW (1959) A negative staining method for high resolution electron microscopy of viruses. Biochim Biophys Acta 34:103–110
6. Jain T, Sheehan P, Crum J, Carragher B, Potter CS (2012) Spotiton: a prototype for an integrated inkjet dispense and vitrification system for cryo-TEM. J Struct Biol 179(1):68–75
7. Peters JPJ. Gordon research conference 2016. Oral Presentation
8. Suloway C, Pulokas J, Fellmann D, Cheng A, Guerra F, Quispe J, Stagg S, Potter CS, Carragher B (2005) Automated molecular microscopy: the new Leginon system. J Struct Biol 151(1):41–60
9. Mastronarde DN (2005) Automated electron microscope tomography using robust prediction of specimen movements. J Struct Biol 152(1):36–51
10. Grant T, Rohou A, Grigorieff N (2018) cisTEM, user-friendly software for single-particle image processing. Elife 7:7
11. Ciferri C, Chandramouli S, Donnarumma D, Nikitin PA, Cianfrocco MA, Gerrein R, Feire AL, Barnett SW, Lilja AE, Rappuoli R, Norais N, Settembre EC, Carfi A (2015) Structural and biochemical studies of HCMV gH/gL/gO and Pentamer reveal mutually exclusive cell entry complexes. Proc Natl Acad Sci U S A 112(6):1767–1772
12. Chandramouli S, Ciferri C, Nikitin PA, Caló S, Gerrein R, Balabanis K, Monroe J, Hebner C, Lilja AE, Settembre EC, Carfi A (2015) Structure of HCMV glycoprotein B in the postfusion conformation bound to a neutralizing human antibody. Nat Commun 14(6):8176
13. Ciferri C, Chandramouli S, Leitner A, Donnarumma D, Cianfrocco MA, Gerrein R, Friedrich K, Aggarwal Y, Palladino G, Aebersold R, Norais N, Settembre EC, Carfi A (2015) Antigenic characterization of the HCMV gH/gL/gO and pentamer cell entry complexes reveals binding sites for potently neutralizing human antibodies. PLoS Pathog 11(10):e1005230
14. Chandramouli S, Malito E, Nguyen T, Luisi K, Donnarumma D, Xing Y, Norais N, Yu D, Carfi A (2017) Structural basis for potent antibody-mediated neutralization of human cytomegalovirus. Sci Immunol 2(12). https://doi.org/10.1126/sciimmunol.aan1457

Chapter 23

High-Throughput Protein Production Combined with High-Throughput SELEX Identifies an Extensive Atlas of Ciona robusta Transcription Factor DNA-Binding Specificities

Kazuhiro R. Nitta, Renaud Vincentelli, Edwin Jacox, Agnès Cimino, Yukio Ohtsuka, Daniel Sobral, Yutaka Satou, Christian Cambillau, and Patrick Lemaire

Abstract

Transcription factors (TFs) control gene transcription by binding to specific DNA motifs located in cis-regulatory elements across the genome. The identification of TF-binding motifs is thus an important step toward understanding the role of TFs in gene regulation. SELEX, Systematic Evolution of Ligands by EXponential enrichment, is an efficient in vitro method that can be used to determine the DNA-binding specificity of TFs. Thanks to the development of high-throughput (HT) DNA cloning systems and protein production technology, the classical SELEX assay has been extended to high-throughput scale (HT-SELEX). We report here the detailed protocol for the cloning, production, and purification of 420 Ciona robusta DNA-binding domains (DBDs). A total of 263 Ciona robusta TF DNA-binding domain proteins were purified in milligram quantities and analyzed by HT-SELEX. The identification of 139 recognition sequences generates an atlas of protein-DNA-binding specificities that is crucial for the understanding of the gene regulatory network (GRN) of Ciona robusta. Overall, our analysis suggests that the Ciona robusta repertoire of sequence-specific transcription factors comprises less than 500 genes. The protocols for high-throughput protein production and HT-SELEX described in this article for the study of Ciona robusta TF DNA-binding specificity are generic and have been successfully applied to a wide range of TFs from other species, including human, mouse, and Drosophila.

Key words High-throughput protein production, HT-SELEX, Transcription factor, DNA-binding domain (DBD), Ciona robusta

1 Introduction

Gene transcription is orchestrated by the recruitment of sequence-specific transcription factors to cis-regulatory DNA sequences that are mostly located in noncoding regions of the genome.

Kazuhiro R. Nitta and Renaud Vincentelli contributed equally to this work. Renaud Vincentelli (ed.), High-Throughput Protein Production and Purification: Methods and Protocols, Methods in Molecular Biology, vol. 2025, https://doi.org/10.1007/978-1-4939-9624-7_23, © Springer Science+Business Media, LLC, part of Springer Nature 2019


Transcription factors have one or several DNA-binding domains (DBD), which mediate the specific recognition of short, partially degenerate DNA sequences. Living organisms from bacteria to plants and animals have hundreds to thousands of TFs in their genome. For example, Homo sapiens has around 1600 known or likely TFs [1, 2], while Ciona robusta, a member of the vertebrate sister group, only has around 700 TFs [3, 4]. TFs are classified in several families based on the structure of their DBD. Individual members of most structural families tend to display similar, though slightly different, binding properties (e.g., [5, 6]). By contrast, individual members of some classes of zinc finger transcription factors, in particular the C2H2 class, can greatly differ in their DNA-binding specificity [7]. Hence, determining the DNA-binding specificity of the whole repertoire of TFs in an organism is a crucial step toward understanding gene regulation. One of the most efficient ways to characterize the DNA-binding specificity of a TF in vitro is the Systematic Evolution of Ligands by EXponential enrichment (SELEX) protocol, a class of in vitro selection methods that was developed in the early 1990s [8, 9]. In brief, a complex library of short random nucleic acid oligonucleotides (single- or double-stranded DNA or RNA) is first prepared. Each oligonucleotide includes a variable region to which the protein can bind, flanked by constant regions used for PCR amplification of the library. Second, oligonucleotides binding to a target (usually a protein) are selected by incubating the input oligonucleotide library in vitro with the target recombinant protein, immobilizing the target-DNA complexes, and washing away unbound oligos. Selected oligos are then amplified by PCR and used as the input oligonucleotide library in the following cycle of selection. Finally, selected oligos are massively sequenced at the end of each cycle, and enriched DNA sequences are identified computationally.
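The final computational step, finding sequences that become enriched from one selection cycle to the next, can be illustrated with a minimal k-mer counting sketch. This is not the analysis pipeline used in this chapter; the add-one pseudocount, the choice of k, and the toy reads are all assumptions made for the example.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every k-mer (sliding window of length k) in a list of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def kmer_enrichment(input_reads, selected_reads, k, pseudocount=1.0):
    """Ratio of smoothed k-mer frequencies after vs. before selection.
    The pseudocount (an assumed smoothing choice) keeps k-mers that are
    absent from one library from producing zero or infinite ratios."""
    before = kmer_counts(input_reads, k)
    after = kmer_counts(selected_reads, k)
    kmers = set(before) | set(after)
    total_before = sum(before.values()) + pseudocount * len(kmers)
    total_after = sum(after.values()) + pseudocount * len(kmers)
    return {
        kmer: ((after[kmer] + pseudocount) / total_after)
        / ((before[kmer] + pseudocount) / total_before)
        for kmer in kmers
    }


# Toy reads: every selected read contains TGAC, the input library is mixed.
input_reads = ["ACGTAC", "TTGACA", "GGCATC", "CATGCA"]
selected_reads = ["TGACTT", "ATGACC", "GTGACA", "CCTGAC"]

scores = kmer_enrichment(input_reads, selected_reads, k=4)
best = max(scores, key=scores.get)
print(best, scores[best])
```

In this toy case the most enriched 4-mer is the one shared by all selected reads. Real HT-SELEX libraries contain far more reads, and dedicated motif discovery tools are normally used to turn enriched k-mers into a binding model.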
The concept of SELEX is simple, and this technique efficiently selects aptamer ligands that bind to a target of interest [10]. In the case of transcription factors, the synthesis of random double-stranded DNA oligonucleotide libraries and the mass production of recombinant transcription factor proteins are necessary to perform a SELEX assay. As an animal genome encodes hundreds to thousands of TFs, hundreds of proteins need to be produced efficiently and in sufficient amounts to apply SELEX to the whole TF repertoire of an organism. Combining high-throughput protein production and high-throughput SELEX (HT-SELEX) assays is a powerful solution to achieve this difficult task. The full-length protein of a transcription factor and its truncated version limited to its DBD generally have the same DNA-binding specificity [5, 6]. As shorter proteins are usually more efficiently produced in bacteria, efforts can be focused on the production of recombinant DBDs.


In this chapter, we present the high-throughput protein production and HT-SELEX protocols that we developed and successfully applied to the characterization of the DNA-binding specificity of DBDs from the ascidian Ciona robusta, formerly referred to as Ciona intestinalis type A [11]. Ascidians are invertebrate chordates that share their larval body plan with vertebrates, but have much simpler, unduplicated genomes [12] in which the estimated number of transcription factors ranges from 394 [3] to 669 [4]. Because vertebrate genomes have undergone two rounds of duplication, most ascidian transcription factors have several vertebrate orthologs. This smaller repertoire allowed the precise determination of the spatio-temporal expression profiles of most Ciona TFs during embryonic development [3, 4] and the reconstruction of the gene regulatory networks driving early embryonic development [13, 14]. Deciphering the DNA-binding specificity of a large fraction of the TF repertoire in Ciona would improve these preliminary networks. A total of 420 DBDs from Ciona robusta were selected to study transcription factor DNA-binding specificities on a genome-wide scale. Of these, 263 DBDs (63%) could be purified in milligram quantities and studied by HT-SELEX. After sequencing and data extraction, a specific DNA-binding profile could be identified for 139 transcription factors (53% of the soluble proteins). This demonstrated the possibility of producing functional DBDs in E. coli, by far the easiest, cheapest, and quickest way to produce proteins for functional studies. Expressing DBDs in E. coli increased the success rate of protein production compared with full-length protein production, both in terms of the diversity of transcription factor families and in terms of the quantities of protein produced for downstream applications.
These protocols have also been successfully applied to human, mouse, and fruit fly DBDs as well as full-length transcription factors [5, 6] and should therefore be a good starting point for the production and determination of specificities of any class of transcription factors in a species of interest.
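The success rates quoted above follow directly from the three counts given in the text, as the short check below shows. The overall yield from selected DBD to identified motif is not stated in the chapter and is simply derived here from the same numbers.

```python
def percent(part, whole):
    """Percentage of `part` in `whole`, rounded to the nearest integer."""
    return round(100 * part / whole)


# Counts taken from the text.
selected_dbds = 420      # DBDs chosen for cloning
purified_dbds = 263      # purified in milligram quantities
dbds_with_motif = 139    # specific DNA-binding profile identified

purification_rate = percent(purified_dbds, selected_dbds)
motif_rate = percent(dbds_with_motif, purified_dbds)
overall_rate = percent(dbds_with_motif, selected_dbds)  # derived, not quoted

print(purification_rate, motif_rate, overall_rate)
```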

2 Materials

The DBD clone and protein libraries were generated in 96-well formats with HT protocols using multi-pipets or robotics. For most of the steps, equipment and consumables are therefore similar and are, for simplicity, described only once in the material section.


2.1 Generation of a Library of Clones Encoding Most Ciona robusta DNA-Binding Domains

The selected Ciona robusta DNA-binding domains were amplified by PCR, TOPO-cloned in pENTR/D-Topo (Thermofisher) and transferred, using Gateway technology (Thermofisher), into an E. coli expression vector, such that the resulting recombinant proteins were fused to an N-terminal His6-Thioredoxin tag.

2.1.1 Selection of the Clones and Generation of the Ciona robusta pENTR DBD Library

Programs

1. InterProScan sequence search (e.g., http://www.ebi.ac.uk/ interpro/, [15]). 2. Pfam domain search (e.g., http://pfam.xfam.org/, [16]). Reagents

1. A list of Ciona robusta transcription factor genes supported by extensive cDNA sequence information [17]. 2. Ciona robusta Unigene cDNA collections (Satoh collection, [18]). 3. Chemical competent DH5α E. coli bacteria. 4. Antibiotic: Kanamycin (50 mg/mL in water), store stock solution at 20  C, and use a 1/1000 dilution. 5. LB Broth: dissolve 10 g tryptone, 5 g yeast extract, and 10 g NaCl in 950 mL of distilled water. Adjust the pH to 7.0 using NaOH and add distilled water up to 1 L. Mix well and dispense into appropriate containers. Sterilize in autoclave at 121  C for 15 min. 6. LB agar: Dissolve 10 g tryptone, 5 g yeast extract, and 10 g NaCl in 950 mL of distilled water. Adjust the pH to 7.0 using NaOH. Add 20 g of agar and adjust the volume to 1 L with distilled water. Mix well, dispense into appropriate containers, and autoclave. 7. LB agar plates: Melt slowly a bottle of LB agar in a microwave. Let cool to 50  C, add the required antibiotic (50 μg/mL Kanamycin), and distribute 1.5 mL of LB agar to each well of 24-well sterile tissue culture plates. 8. 2 YT broth: dissolve 16 g tryptone, 10 g yeast extract, and 5 g NaCl in 1 L of distilled water. 9. pENTR™/D-TOPO™ Cloning Kit (Invitrogen). 10. High-fidelity PCR enzyme (e.g., Pfx50, Thermo Fisher). 11. Conventional PCR enzyme (Taq). Equipment and Consumables

1. Multichannel pipettes (suitable for dispensing volumes from 0.5 to 20 μL).
2. Multichannel pipettes with variable span (suitable for volumes from 5 to 60 μL), such as Matrix Equalizer Pipettes-125 and Pipettes-1250 μL (Thermo Scientific), which are used to dispense reagents into 24- and 96-well formats.

High-Throughput Production and Interaction of Ciona robusta DNA-Binding Domains

3. Multi-dispenser (such as MINILAB 201 dispenser, HTL) and corresponding 50 and 25 mL syringes.
4. Water bath set at 42 °C (or 37 °C when needed).
5. Shaking incubator and plate incubator set to 37 °C.
6. Centrifuge with rotor adapted for 96-well PCR plates (such as Centrifuge 5810R, Eppendorf).
7. Centrifuge with rotor for 24-DW plates (such as Avanti J-E centrifuge, Beckman Coulter).
8. PCR machine suitable for 96-well plates (such as T100 Thermal cycler, Bio-Rad).
9. 96-well PCR plates (such as MicroAmp™ Optical 96-Well Reaction Plate from Thermo Scientific) with adhesive PCR seals (such as AB0558, Applied Biosystems).
10. Deepwell 96, 2 mL culture capacity (such as the Greiner 780270).
11. Deepwell 24, 10 mL volume (such as the Whatman 77015102).
12. Miniprep 96-well plate kit such as Merck’s Montage Plasmid Miniprep (LSKP09624).
13. Liquid handling robot (TECAN Freedom EVO 200), used to perform HT plasmid purification. Alternatively, a vacuum system (such as Manifold, EMD-Millipore) can be used.
14. UV-Vis spectrophotometer suitable for DNA quantification, such as a microplate reader (Thermo Scientific Multiskan GO UV/Vis microplate spectrophotometer with μDrop Plate) or a NanoVue (GE Healthcare Life Sciences).
15. 100 mL disposable reagent reservoirs.

2.1.2 Generation of the Ciona robusta Library of DBD Clones for E. coli Expression

See Subheading 2.1.1 for common reagents, equipment, and consumables.

Reagents

1. Library of Ciona robusta DBDs in pENTR/D-TOPO stored in 96-well PCR plates (DNA concentration >10 ng/μL).
2. Gateway cloning enzymes: LR clonase II enzyme mix (11791100, Thermofisher).
3. The pETG20A vector is produced in-house. It includes an internal histidine tag, which allows recombinant protein purification through IMAC (immobilized metal affinity chromatography).
4. Antibiotic: ampicillin (200 mg/mL in water). Store the stock solution at −20 °C and use at a 1/1000 dilution.

Equipment and Consumables

1. 24-well LB agar plates with the required antibiotic (200 μg/mL ampicillin): 24-well sterile tissue culture plates (Greiner Bio-One). For 96 transformations, use 4 × 24-well sterile tissue culture plates.
2. Air-permeable adhesive film, for culture in deep-well plates (such as Greiner 676050).
3. Small orbital shaking incubator such as the INFORS Multitron (model number AJ103). The shaking speed is 800 rpm for DW96 plates and 600 rpm for DW24 plates.

2.2 Production and Storage of a Library of Soluble Ciona DBDs in E. coli

2.2.1 Transformation, Culture, and Cell Harvest

See Subheading 2.1.2 for common reagents, equipment, and consumables.

Reagents

1. Library of Ciona robusta DBDs in pETG20A stored in 96-well PCR plates (DNA concentration >10 ng/μL).
2. Chemically competent Rosetta (DE3) pLysS E. coli strain (Invitrogen, Life Technologies). A stock of in-house chemically competent bacteria is made from a commercial strain. The batch is aliquoted in 1.5 mL Eppendorf tubes (1 mL/tube), then flash frozen in liquid nitrogen and stored at −80 °C.
3. Chloramphenicol (Sigma, C0378): stock solution prepared at 34 mg/mL, stored at −20 °C, used in media at 1:1000.
4. Auto-inducible expression medium: 85 g of NZY Auto-Induction LB medium powder (NZYTech, MB17901) is dissolved in 1.7 L of distilled H2O supplemented with 0.5% glycerol (AnalaR NORMAPUR, 24388.295), autoclaved, and stored at room temperature.
5. Lysozyme stock (50 mg/mL): Dissolve 0.5 g lysozyme in water to a final volume of 10 mL. Store in 0.5 mL aliquots at −20 °C.
6. 10× buffer A stock: Prepare 1 L of buffer containing 500 mM Tris pH 8, 3 M NaCl, and 100 mM imidazole ACS grade (Merck, reference 104716) in advance, filter through a 0.22 μm filter, and store at 4 °C. A high-quality grade of imidazole must be used so that it does not interfere with the A280 readings used for calculating protein yield.
7. Prepare fresh lysis buffer on the day of use by diluting the 10× buffer A stock tenfold (final concentration: 50 mM Tris, 300 mM NaCl, 10 mM imidazole, pH 8) and adding lysozyme to a final concentration of 0.25 mg/mL.
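The dilutions in items 5 to 7 follow the usual C1 × V1 = C2 × V2 relation. As a minimal sketch (the helper name `stock_volume` is ours, and the 500 mL batch size is the lysis buffer volume quoted later in Subheading 3.2.1), the volumes can be checked as follows:

```python
def stock_volume(stock_conc, final_conc, final_volume):
    """Volume of stock needed to reach final_conc in final_volume (C1*V1 = C2*V2).

    Concentrations must share one unit; the result has the unit of final_volume.
    """
    if final_conc > stock_conc:
        raise ValueError("final concentration exceeds stock concentration")
    return final_conc * final_volume / stock_conc


# Lysis buffer batch; 500 mL is the volume used in Subheading 3.2.1.
final_ml = 500.0
buffer_a_10x_ml = final_ml / 10                    # tenfold dilution of the 10x stock
lysozyme_ml = stock_volume(50.0, 0.25, final_ml)   # 50 mg/mL stock to 0.25 mg/mL
water_ml = final_ml - buffer_a_10x_ml - lysozyme_ml

# 1x composition (mM) obtained from the 10x buffer A stock.
stock_10x_mM = {"Tris": 500, "NaCl": 3000, "imidazole": 100}
buffer_1x_mM = {name: conc / 10 for name, conc in stock_10x_mM.items()}

if __name__ == "__main__":
    print(f"10x buffer A: {buffer_a_10x_ml} mL, lysozyme stock: {lysozyme_ml} mL, "
          f"water: {water_ml} mL")
    print("1x buffer A (mM):", buffer_1x_mM)
```

The same helper applies to the antibiotic stocks, which are all used at a 1/1000 dilution.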

2.2.2 Purification of DBDs

Reagents

1. DNase stock (2 mg/mL): Dissolve 100 mg of DNase in 50 mL of water. Sterilize by filtration and divide into 1 mL aliquots. Store aliquots at −20 °C.
2. 2 M MgSO4 stock: Dissolve 24 g of MgSO4 in 100 mL of water. Autoclave.
3. LabChip® HT Protein Express Chip (Perkin Elmer, 760499).
4. Protein Express Assay Reagent Kit (Perkin Elmer, CLS960008).
5. HT Protein 200 Sample Buffer (Perkin Elmer, 760518).
6. Ni Sepharose 6 Fast Flow resin (GE Healthcare, reference 17-5318-02): The resin is supplied in 20% ethanol. Put aliquots of the resin in 15 mL Falcon tubes. To equilibrate the resin, wash twice in water and then twice in binding buffer. For each wash, centrifuge at 500 × g for 1 min, discard the supernatant by inverting the tubes, and resuspend in water or buffer. After the final wash, resuspend in buffer A as a 50% (v/v) slurry (resin:buffer, 50:50). Store the equilibrated resin at 4 °C when not in use.
7. 10× buffer A stock, see Subheading 2.2.1. On the day of use, dilute 25 mL of 10× stock into a final volume of 250 mL.
8. 10× wash buffer stock: Prepare 1 L of buffer containing 500 mM Tris, 3 M NaCl, and 500 mM imidazole ACS grade (Merck, reference 104716), pH 8; filter through a 0.22 μm filter and store at 4 °C. On the day of use, dilute 25 mL of 10× stock into a final volume of 250 mL.
9. 5× elution buffer stock: Prepare 1 L of buffer containing 250 mM Tris pH 8, 1.5 M NaCl, and 1.25 M imidazole ACS grade (Merck, reference 104716) in advance, filter through a 0.22 μm filter, and store at 4 °C. On the day of use, dilute 20 mL of 5× stock into a final volume of 100 mL. The final 1× buffer composition is 50 mM Tris, 300 mM NaCl, 250 mM imidazole, pH 8.
10. Storage buffer: Add glycerol (such as AnalaR NORMAPUR, 24388.295) to the purified proteins to a final concentration of 50%; the final storage buffer composition is 25 mM Tris, 150 mM NaCl, 125 mM imidazole, 50% glycerol, pH 8.

Equipment and Consumables

1. Macherey-Nagel 96-well Receiver/Filter Plate 20 μm, 1.5 mL capacity (Macherey-Nagel, reference 740686.4).
2. Four deep-well 96 (DW96) plates, with 2 mL volume capacity (Greiner Bio-One, reference 780270).
3. Perkin Elmer LabChip GX II (or alternatively an SDS-PAGE electrophoresis system).
4. Spectrophotometer and plate (UVStar, Greiner Bio-One) for measuring absorbance at 280 nm (A280) to calculate the yield of soluble proteins.
5. Plate sonicator adapted for deep-well plates, such as Ultrasonic processor XL (Misonix Inc., USA), to ensure full cell lysis.
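The A280 reading mentioned in item 4 is converted to a protein concentration with the Beer-Lambert law (A = ε × c × l). The sketch below is only an illustration: the function names are ours, and the extinction coefficient, molecular weight, and optical path length are hypothetical placeholders that must be replaced by the values of each His-TRX-DBD construct and of the plate actually used. The 8 mL pool volume and the 1 mg threshold are those given in Subheading 3.2.2.

```python
def concentration_mg_per_ml(a280, ext_coeff_per_m_cm, mw_da, path_cm=1.0, blank=0.0):
    """Beer-Lambert conversion of an A280 reading into mg/mL.

    ext_coeff_per_m_cm: molar extinction coefficient at 280 nm (M^-1 cm^-1).
    mw_da: molecular weight of the fusion protein (Da, i.e. g/mol).
    path_cm: optical path length; in a microplate this is NOT 1 cm by default
             and has to be measured or taken from the instrument (assumption).
    """
    molar = (a280 - blank) / (ext_coeff_per_m_cm * path_cm)  # mol/L
    return molar * mw_da                                     # g/L equals mg/mL


def total_yield_mg(conc_mg_per_ml, volume_ml):
    """Total protein in a pool of known volume."""
    return conc_mg_per_ml * volume_ml


if __name__ == "__main__":
    # Hypothetical construct: epsilon = 20,000 M^-1 cm^-1, MW = 25 kDa.
    conc = concentration_mg_per_ml(a280=0.2, ext_coeff_per_m_cm=20000.0, mw_da=25000.0)
    pool_mg = total_yield_mg(conc, volume_ml=8.0)   # 8 mL pool in storage buffer
    print(f"{conc:.3f} mg/mL, {pool_mg:.2f} mg in pool, enough for SELEX: {pool_mg >= 1.0}")
```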


2.3 High-Throughput Systematic Evolution of Ligands by EXponential Enrichment (HT-SELEX)

2.3.1 Preparation of the Degenerate Double-Stranded DNA Oligonucleotides Library

Reagents

1. Generic DNA oligonucleotides.
   5′-TCCATCACGAATGATACGGCGACCACCGAACACTCTTTCCCTACACGACGCTCTTCCGATC-3′
2. 96 different barcode DNA oligonucleotides.
   5′-CGGAGTCGGCAAGCAGAAGACGGCATACGABBN20BBBBBAGATCGGAAGAGCGTCG-3′
   The underlined bases in 1. and 2. (the mutually complementary 3′ ends, CGACGCTCTTCCGATC and GATCGGAAGAGCGTCG) correspond to the annealing sites. B indicates the positions of barcode bases. N20 indicates the position of the 20 fully degenerate (equal probability of A, T, C, G) bases.
3. Phusion DNA polymerase (Thermofisher).

Synthesis: We recommend ordering 0.05 μmoles of the 96 different barcode DNA oligonucleotides as desalted ssDNA oligonucleotides (100 μM concentration). Order 1 μmole (100 μM concentration) of the Generic DNA oligonucleotide. Oligonucleotides are dissolved in 10 mM Tris–HCl buffer, pH 8.0, and stored at −20 °C.

Random region length: We recommend ordering a fully randomized sequence of 20 base pairs to reveal single (monomeric) transcription factor-binding motifs. The randomized sequence is indicated as “N” in 2. Transcription factors recognize motifs of variable length, generally in the 4–8 bp range. Some TFs, however, recognize much longer motifs [19]. In addition, Jolma et al. [5] showed that a SELEX dataset produced with oligonucleotides with 40 random positions (40 N) identified dimeric or multimeric composite motifs that could not have been identified using shorter random sequences. Such a long random region may, however, not be advisable for most projects, as a fully randomized 40-mer library can contain up to 1.2 × 10^24 different molecules (i.e., more than 1 mol), so that only a very small fraction of the full library will be included in each SELEX experiment.

Constant sequences: To be able to amplify by PCR the sequences specifically bound to a DBD at the end of an HT-SELEX, the oligonucleotides must contain “constant regions” on both ends of the randomized library. These sequences should not contain known TF-binding motifs, as this would strongly bias selection (known binding motifs can be found in databases, e.g., JASPAR; http://jaspar.genereg.net/, [20]). Next generation sequencing machines (e.g., Illumina HiSeq series) use their own default primer sets.
While designing oligonucleotides that already include the next generation sequencing primer sequences avoids the sequencing library preparation step, care should be taken that this does not introduce unexpected TF-binding motifs within the constant regions of the oligos. The sequences indicated here are compatible with Illumina HiSeq sequencing.
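The complexity argument made above for the random region can be reproduced with a few lines of arithmetic: a fully degenerate region of n positions has 4^n possible sequences. The sketch below is a worked check, not part of the published protocol; the 1 pmol input used in the example is a hypothetical amount chosen only to illustrate coverage.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def distinct_sequences(n_random):
    """Number of different sequences in a fully degenerate region of n positions."""
    return 4 ** n_random


def mean_copies_per_sequence(n_random, pmol_dna):
    """Average number of copies of each possible sequence in pmol_dna of library."""
    molecules = pmol_dna * 1e-12 * AVOGADRO
    return molecules / distinct_sequences(n_random)


n20 = distinct_sequences(20)       # about 1.1e12 sequences
n40 = distinct_sequences(40)       # about 1.2e24 sequences
moles_for_n40 = n40 / AVOGADRO     # about 2 mol for one copy of every 40-mer

if __name__ == "__main__":
    print(f"N20: {n20:.3e} sequences; N40: {n40:.3e} sequences")
    print(f"One copy of every 40-mer would require {moles_for_n40:.2f} mol of DNA")
    # Hypothetical input of 1 pmol of library per binding reaction.
    print(f"Mean copies per 20-mer in 1 pmol: {mean_copies_per_sequence(20, 1.0):.2f}")
```

With 20 random positions, sampling is already sparse at picomole inputs; with 40 positions the complete library cannot physically be present, which is the reason given above for preferring the shorter design in most projects.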


Barcode: Pools of SELEXed oligos corresponding to the same round of SELEX for each protein should be barcoded and mixed in equimolar ratios in groups of 96 DBDs before mass-sequencing. Any two barcodes should keep a Hamming distance of 2 or more to avoid ambiguities caused by single point mutations/errors in the barcode region during the sequencing step.

2.3.2 HT-SELEX

Reagents

1. Library of Ciona robusta purified proteins at 1 mg/mL in storage buffer (stored at −20 °C).
2. 10× SELEX buffer: 100 mM Tris–HCl (pH 7.5), 500 mM NaCl, 10 mM MgCl2, 5 mM EDTA, 40% glycerol (see Note 1).
3. 250 μg/mL poly (dI-dC).
4. 1 M DTT.
5. Nickel resin: Prepare the resin as in Subheading 2.2.2. After the final wash, resuspend in 1× SELEX buffer as a 6.67% (v/v) (resin:buffer) slurry, such that there are 10 μL of beads in 150 μL of buffer. Store the equilibrated resin at 4 °C when not in use.
6. GoTaq DNA polymerase (Promega).
7. PCR primers:
   Forward: 5′-TCCATCACGAATGATACGGCGACCACCGAACACTCTTTCCCTACACGACGCTCTTC-3′
   Reverse: 5′-CGGAGTCGGCAAGCAGAAGACGGCATACG-3′

Equipment and Consumables

1. Filterplate 96: MSDV N6 plate (Millipore) with 0.65 μm membrane.
2. Liquid handling robot (e.g., TECAN, Biomek, Agilent, etc.).

2.3.3 Library Sequencing and Motif Extraction

1. Short-read, massively parallel NGS system (e.g., Illumina HiSeq series).
2. Computer (a Linux operating system is recommended).
3. Motif analysis programs. We developed an in-house script to extract enriched k-mers (available from the download section of ANISEED). If using it proves difficult, free software is an alternative (e.g., The MEME Suite: Motif-based sequence analysis tools; http://meme-suite.org/).
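Because up to 96 barcoded pools are sequenced together, reads have to be assigned back to their DBD by barcode before any motif analysis. The sketch below shows one way to check the Hamming-distance rule of Subheading 2.3.1 on a barcode set and to assign reads by exact match. It is an illustration only: the function names and the 7-nt example barcodes are hypothetical and are not the barcodes used in this study.

```python
from itertools import combinations


def hamming(a, b):
    """Number of mismatching positions between two equal-length barcodes."""
    if len(a) != len(b):
        raise ValueError("barcodes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def too_close(barcodes, min_distance=2):
    """Return the barcode pairs that violate the minimum Hamming distance."""
    return [(a, b) for a, b in combinations(barcodes, 2)
            if hamming(a, b) < min_distance]


def assign(read_barcode, barcodes):
    """Exact-match assignment.

    With a minimum distance of 2, a single sequencing error can never turn one
    valid barcode into another, so such reads are detected and dropped (None)
    rather than misassigned. Correcting them would need a distance of 3 or more.
    """
    return read_barcode if read_barcode in barcodes else None


if __name__ == "__main__":
    example = ["ACGTACG", "ACGTTGC", "TGCAACG"]   # hypothetical barcodes
    print("Pairs closer than 2:", too_close(example))
    print("Read with one error is assigned to:", assign("ACGTACC", set(example)))
```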

3 Methods

The overall strategy for the identification of Ciona robusta TF-binding DNA motifs is summarized in Fig. 1. The successive steps are: (1) Take a census of all transcription factors detectable in the Ciona genome. (2) Clone their DNA-binding domains in a shuttle vector. (3) Transfer the DNA-binding domains into an E. coli expression vector. (4) Produce and purify the His-tagged DNA-binding domains. (5) Capture oligonucleotides with affinity to the protein by HT-SELEX. (6) Prepare the sequencing library and sequence TF-bound oligonucleotides using short-read massively parallel sequencing. (7) Extract enriched motifs in the sequenced pools and model the DNA-binding site recognized by the TF. The following subheadings describe each step.

Fig. 1 Pipeline for the identification of Ciona robusta transcription factor DNA-binding specificities

3.1 Generation of a Library of Clones Encoding Most Ciona robusta DNA-Binding Domains

3.1.1 Selection of the Clones and Generation of the Ciona robusta pENTR DBD Library

Scanning the KH gene model set (referred to as KH2008) [17] for InterPro or GO terms associated with transcriptional regulation identified 964 putative transcriptional regulators. They were classified into a “Core TF” set of 369 genes encoding a clear transcription factor DNA-binding domain (e.g., basic helix-loop-helix domain, homeodomain) and a “Possible TF” set of 134 genes encoding possible transcription factors (mainly proteins with several C2H2 zinc fingers). The remaining 461 genes, which are unlikely to encode sequence-specific transcription factors but are likely to encode transcription- and chromatin-associated proteins, were used as a control set (“Unlikely” set). The annotated list of factors can be downloaded from the tunicate model organism database, ANISEED (http://www.aniseed.cnrs.fr). The “Core TF” and “Possible TF” gene sets were orthologous to 496 and 83 human transcription factors, respectively, approximately 35% of the estimated human repertoire of 1639 known or likely TFs [2].

To identify the boundaries of the putative DNA-binding domains (DBD), we ran both Pfam (e.g., http://pfam.xfam.org/, [16]) and InterPro (e.g., http://www.ebi.ac.uk/interpro/, [15]) domain searches. To be on the safe side, we took the widest predicted region from both analyses and extended the predicted domains by an additional 10 flanking amino acids on both the N- and C-terminal ends to ensure stable domain folding. In case of multiple DBDs, the selected region included all domains.

Most members of the Core and Possible TF gene sets were represented in the Ciona EST clone collections [18]. We amplified their DBDs by PCR with a high-fidelity PCR enzyme and cloned the amplified fragments into the pENTR/D-TOPO vector according to the manufacturer’s instructions. Because every forward primer started with CACC, amplified fragments were cloned into the vector directionally. After transformation, recombinant E. coli were plated on LB agar plates containing kanamycin and grown overnight. The next morning, we picked three colonies for direct PCR using a conventional PCR enzyme (Taq DNA polymerase) with M13 forward and reverse primers, and confirmed successful cloning by agarose gel electrophoresis. For each TF gene, one clone with a fragment of the expected size was chosen. These clones were incubated overnight in a deep-well 96-well plate (1 mL of 2× YT medium per well). Plasmid DNAs were prepared with the miniprep 96-well plate kit. By sequencing the cloned fragments, we confirmed that no mutations were introduced by PCR.

The resulting collection of Gateway entry clones covers 275 targets (74%), 65 targets (48%), and 102 targets (22%) of the Core, Possible, and Unlikely TF gene sets, respectively. Unlikely TFs were included both to confirm our TF classification and to provide negative controls in our pipeline.

3.1.2 Generation of the Ciona robusta Library of DBD Clones for E. coli Expression

To produce and purify the DBDs in soluble form, the library of Ciona robusta DBDs cloned in the Gateway pENTR/D-TOPO vector must be transferred to a Gateway destination vector optimized for E. coli expression and including a purification/solubilization tag. After investigating several culture conditions and solubilization tags, the majority of the DBDs were produced in a soluble form using a His-tagged thioredoxin (His-TRX) N-terminal fusion vector (pETG20A), see ref. 21. The His-tag is required for purification through affinity chromatography, while the TRX tag was previously shown to optimize the folding and to increase the solubility of several recombinant protein domains and proteins, including DBDs. For this project, in an attempt to complete the protein collection, DBDs found to be insoluble as a His6-TRX fusion were subsequently cloned with alternative fusions (first His-MBP and then His-NusA) and then produced and purified using the same protocol as for the His-TRX constructs. This improved the number of soluble DBDs by only a few percent and is therefore not discussed in this chapter (see ref. 21 for the details of the solubility screen).


Using the LR clonase II enzyme mix from the Gateway technology (Thermofisher), the pENTR/D-TOPO libraries were cloned into the Gateway pETG20A destination vector to generate the library of expression constructs suitable for expression in the E. coli system.

1. On ice, prepare the LR Clonase II master mix as follows (volumes per reaction; scale up to 120 reactions, i.e., 96 reactions plus dead volume):
   pETG20A (150 ng/μL): 1 μL
   LR clonase II enzyme mix: 1 μL
2. Transfer 2 μL into each well of a 96-well PCR plate, using a multichannel pipette.
3. Add 3 μL/well of pENTR/D-TOPO plasmids (>10 ng/μL) containing the DBDs. Seal the plate using an adhesive PCR seal and centrifuge to spin down.
4. Incubate for 1 h at room temperature.
5. Add 1 μL/well of Proteinase K.
6. Incubate for 10 min at 37 °C.
7. Centrifuge briefly to collect the reaction components and store samples on ice or at −20 °C for subsequent transformation.
8. Prepare 4 × 24-well LB agar plates containing LB agar plus 200 μg/mL ampicillin with a multi-dispenser.
9. Thaw 3 × 1 mL of DH5α E. coli chemically competent cells on ice.
10. On ice, distribute 25 μL of DH5α E. coli chemically competent cells into each well of a 96-well PCR plate, using an automatic multichannel pipette.
11. Add 5 μL of cloning reaction to the competent cells, using a multichannel pipette. Seal the plate with an adhesive PCR seal. Do not mix by pipetting.
12. Incubate the plate for 30 min on ice and heat-shock the cells at 42 °C for 40 s, then transfer the 96-well PCR plate back to ice for 2 min.
13. Add 100 μL of SOC medium into each well and cover with a new adhesive PCR seal to prevent contamination. Incubate at 37 °C for 90 min in a shaking incubator with vigorous agitation.
14. Dispense 60 μL of transformed cells onto the previously prepared 4 × 24-well LB agar plates.
15. Invert the plates, transfer them to the plate incubator, and incubate overnight at 37 °C.
16. Inoculate one single colony per transformation in 5 mL of LB liquid medium supplemented with 200 μg/mL of ampicillin (use 4 × 24-deep-well plates) and seal the plates with air-permeable adhesive film. Incubate the deep-well plates with cultures at 37 °C for ~16 h, with gentle agitation.
17. Harvest cells at 1500 × g for 15 min at 4 °C, using a centrifuge with a rotor for 24-DW plates. Discard the supernatant.
18. Isolate plasmid DNA from the bacterial pellets using a Miniprep 96-well plate kit for vacuum automated format (such as the Macherey-Nagel Nucleospins) according to the manufacturer’s instructions.
19. The resulting pETG20A clone collection is stored at −80 °C in a series of five 96-well PCR plates (named A to E). The position of each DBD is identical throughout the various clone collections and protein libraries.
20. The integrity of the gene sequences should be confirmed by Sanger sequencing in both directions.

The resulting collection of 420 sequence-verified Gateway expression clones covers 69% (256/369) and 47% (63/134) of the Core and Possible TF gene sets, as well as 22% (101/461) of the Unlikely TF group.

3.2 Production and Storage of a Library of Soluble Ciona DBDs in E. coli

The protocol described in this chapter is based on a previously described protocol [21], whose automation and throughput were improved by applying advances developed for the VENOMICS project [22] to the production of DBDs. Following this protocol, the 420 DBD clones can be processed for production and purification in 3 weeks. The whole protocol was automated on a Freedom EVO 200 (Tecan) liquid handling robot with an 8-channel liquid handling (LiHa) arm and a 96 MCA head. A video describing the Tecan setup and the protocol to perform the transformation, culture, and purification of 96 samples has been published elsewhere [23]. The protocol below describes the production of the 96 DBDs of plate A; it was repeated four more times to produce the full protein library.
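The plate bookkeeping used in the protocol that follows (eight 2 mL cultures per target, three targets per DW24 plate, purification in batches of 24 targets) is easy to get wrong by hand. The sketch below is a planning aid only, with hypothetical function names; it numbers wells 1 to 24 as in the protocol text and makes no assumption about how those numbers map onto physical rows and columns.

```python
import math

TARGETS_PER_RUN = 96      # one 96-well plate of expression clones
WELLS_PER_DW24 = 24
CULTURES_PER_TARGET = 8   # 8 x 2 mL cultures per target
BATCH_SIZE = 24           # targets purified together on the robot
CULTURE_ML = 2


def culture_position(target):
    """Return (DW24 plate number, list of well numbers) for a target numbered 1..96."""
    if not 1 <= target <= TARGETS_PER_RUN:
        raise ValueError("target must be between 1 and 96")
    targets_per_plate = WELLS_PER_DW24 // CULTURES_PER_TARGET   # 3
    plate = (target - 1) // targets_per_plate + 1
    first_well = ((target - 1) % targets_per_plate) * CULTURES_PER_TARGET + 1
    return plate, list(range(first_well, first_well + CULTURES_PER_TARGET))


def purification_batch(target):
    """Return the purification batch (1..4) in which a target is processed."""
    if not 1 <= target <= TARGETS_PER_RUN:
        raise ValueError("target must be between 1 and 96")
    return (target - 1) // BATCH_SIZE + 1


runs_needed = math.ceil(420 / TARGETS_PER_RUN)                                 # plates A to E
dw24_plates_per_run = TARGETS_PER_RUN * CULTURES_PER_TARGET // WELLS_PER_DW24
medium_ml_per_run = dw24_plates_per_run * WELLS_PER_DW24 * CULTURE_ML

if __name__ == "__main__":
    print("Target 5 is grown in plate/wells:", culture_position(5))
    print("Runs:", runs_needed, "DW24 plates per run:", dw24_plates_per_run,
          "medium per run (mL):", medium_ml_per_run)
```

The computed medium volume per run stays below the 1.7 L batch prepared on day two, leaving some dead volume for dispensing.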

3.2.1 Transformation, Culture, and Cell Harvest

Day one

1. Prepare 4 × 24-well LB agar plates containing 2 mL of LB agar supplemented with 100 μg/mL ampicillin and 34 μg/mL chloramphenicol with a multi-dispenser (see Note 2).
2. Thaw 3 × 1 mL of competent Rosetta (DE3) pLysS strain on ice, then aliquot 25 μL of competent cells into each well of a PCR96 plate using a multichannel pipette.
3. Add 1 μL of the expression plasmids (at a concentration >10 ng/μL for pure plasmids) with a multichannel pipette. Ensure that the plasmid is dispensed into the cells but do not mix by pipetting. Cover the plate with plastic film to avoid contamination.
4. Incubate on ice for 30 min, then place the plate at 42 °C for 45 s (thermal shock), then transfer back to ice for 3 min. Add 100 μL of SOC medium using a multichannel pipette and a reagent reservoir and incubate for 60 min at 37 °C.
5. In the meantime, prepare a sterile DW96 plate containing 1 mL of LB (with the appropriate antibiotics) in each well using a repeat pipettor and seal with plastic adhesive to prevent contamination.
6. At the end of the transformation, dispense 60 μL of transformed cells onto the pre-prepared 24-well LB agar plates (see Note 3 and Fig. 2). Place in a shaker for 10 min to spread and leave the plates open to dry for 10 min under a hood (or in the incubator). Close the plates and leave them inverted at 37 °C overnight (see Note 4).
7. Dilute 60 μL of transformed cells into the DW96 plate containing the medium. Seal the deep-well plate with a breathable film to allow culture aeration. Place in a 37 °C shaking incubator at maximum speed (800 rpm) overnight. This is the preculture for the production phase.
8. The next day, the preculture is used to inoculate the cultures in auto-induction medium. The remaining preculture is used to prepare glycerol stocks if desired (see Note 5).

Fig. 2 Synopsis of the DBD library production

Day two

1. Make 1.7 L of NZY auto-induction LB medium supplemented with 100 μg/mL ampicillin and 34 μg/mL chloramphenicol. Dispense 2 mL into each well of 32 DW24 plates with a repeat pipettor.
2. For each target, aspirate 400 μL of preculture and dispense 8 × 50 μL (1/40 dilution) into 8 consecutive wells of a DW24 plate to inoculate the cultures. With such a protocol, 3 targets are grown per DW24 plate (wells 1–8, target 1; wells 9–16, target 2; wells 17–24, target 3), resulting in the need to inoculate 32 DW24 plates to grow 96 different expressions. The transfer of the preculture from the DW96 plate into the DW24 plates can be done using the Matrix multichannel pipette with variable span or the robotic system.
3. Incubate at 37 °C with shaking (600 rpm) for 4 h, then decrease the temperature to 17 °C for an additional 18 h.

Day three

1. At the end of culture, visually check that the growth is homogeneous for a single target within the 8 cultures. To determine the OD600nm, take 20 μL of one culture per target and dispense into a flat-bottomed, clear microtiter plate containing 180 μL of medium. Measure the OD600nm, taking into account the tenfold dilution. The OD should reach around 12.
2. Centrifuge the 32 DW24 plates at 3800 × g for 10 min, then discard the supernatant into a waste container and decontaminate the media before disposal. Tap the plates, upside-down, onto absorbent paper to remove any excess medium.
3. In the meantime, prepare 500 mL of lysis buffer containing lysozyme (see Note 6).
4. Add 0.5 mL of lysis buffer to each well and resuspend the pellets by shaking them at 20 °C and 800 rpm for 10 min. Store the DW plates at −20 °C overnight before purification.

3.2.2 Purification of DBD Domains

Day four

1. Take out of the freezer the 8 DW24 plates representing targets 1 to 24. Thaw the frozen cell suspensions in a water bath for 10 min at 37 °C, followed by 15 min of shaking at 20 °C in the incubator. The cultures should become viscous (see Note 6).
2. Take 1.5 mL of DNase stock and mix it with 3 mL of MgSO4 stock. Dispense 15 μL into each well of the 8 DW24 plates, to give a final concentration of 10 μg/mL of DNase and 20 mM MgSO4. Reseal the plate with plastic tape and shake for a further 15 min at 20 °C, after which the cultures should be nonviscous (see Note 6).
3. Pool the 8 × 0.5 mL lysates per target into a single well of a new DW24 plate (4 mL/well) to have 24 distinct 4 mL lysates of His-TRX-DBD fusion per DW24 plate.
4. While pooling into the new DW24 plates, check carefully that all the cultures are no longer viscous. This is the most critical point of the whole procedure: if some cultures are still viscous (for example, if the DNase was accidentally forgotten in some wells), the filter will clog, generating an uneven pressure on the samples, and contamination or total clogging of the filter plate could occur during the purification.
5. Optional: using the Matrix variable span pipette, aspirate 5 μL of the whole cell lysate and dispense into a 96-well PCR plate containing 7 μL of LabChip Sample Buffer. Denature for 5 min at 95 °C. Proceed to the analysis on the LabChip GXII (Perkin Elmer) the same day or freeze the plate (Total fraction). On the day of the analysis on the LabChip GXII (Perkin Elmer), thaw the plate, denature for 3 min at 95 °C, add 32 μL of water, and analyze following the manufacturer’s instructions. This analysis can be replaced by an SDS-PAGE.
6. Sonicate the DW24 plate on ice in a plate sonicator (Ultrasonic processor XL, Misonix Inc., USA) for 5 min (power 5, 30 s ON/OFF cycles).
7. Move the DW24 plate to the Tecan Freedom worktable (see Note 7).
8. Start the Tecan purification procedure. The following protocol was initially designed for 96 nickel purifications with a 96-pipetting head [23]. It was slightly modified to purify 24 targets in quadruplicate in one experiment. Because, with a 96-head, 4 tips are able to aspirate/dispense into a single well of a DW24 plate, the 24 targets are purified in one go simply by replacing the DW96 plate collecting the flow through, wash, and elutions with a DW24 plate (see Fig. 2). With such a protocol, each set of 4 wells of the 96-well filter plate is combined in a single well of the DW24 plate. Therefore 1 mL of lysate (4 mL of initial culture) is purified on each well of the filter plate and eluted in 4 mL (combining 4 × 1 mL of elution buffer).
9. With the 96-head pipette of the robot, mix before aspiration to ensure that the Ni sepharose resin is suspended evenly and that an equal amount of resin is dispensed into each well, then transfer 200 μL of the 50% (v/v) Ni sepharose resin slurry to each well of the DW24 plate.
10. Repeat step 9. Each well of the DW24 plate (4 mL lysate) now contains the equivalent of 1.6 mL of slurry (800 μL of resin).
11. With the same tips, to ensure good protein binding and homogeneous mixing, perform slow up-and-down pipetting cycles on the slurry/lysate for 10 min with the 96-head pipette.
12. Transfer 200 μL of the lysate/bead mixture to a Macherey-Nagel filter/receiver plate (20 μm), mixing before aspiration. Repeat three more times to transfer 800 μL/well to the 96-filter plate.
13. Turn a mild vacuum on for approximately 60 s to filter the lysate through the plate into the DW24 plate to collect the flow through. Turn the vacuum off.
14. Repeat steps 12 and 13 two more times to transfer all the slurry/lysate mix onto the filter plate. Each well of the filter plate now has around 200 μL of resin.
15. Remove the DW24 plate containing the flow through and replace it with the waste reservoir. Keep the flow through aside until the end of the purification.
16. Using the 96-head pipette, wash with 800 μL of binding buffer (50 mM Tris, 300 mM NaCl, 10 mM imidazole, pH 8): transfer 200 μL of binding buffer to the filter plate and repeat 3 times.
17. Turn a mild vacuum on for approximately 60 s to aspirate the buffer. Turn the vacuum off.
18. Repeat the binding buffer wash (steps 16 and 17) twice more.
19. In order to remove E. coli proteins weakly bound to the resin, repeat steps 16–18, replacing the binding buffer with the wash buffer (50 mM Tris, 300 mM NaCl, 50 mM imidazole, pH 8).
20. Put a DW24 plate below the filter plate.
21. Using the 96-head pipette, elute the protein with 1000 μL of elution buffer (50 mM Tris, 300 mM NaCl, 250 mM imidazole, pH 8): transfer 200 μL of elution buffer onto the filter plate and repeat four times. Wait 3 min.


22. Turn a mild vacuum on for approximately 60 s to aspirate the buffer. Turn the vacuum off. The elution fraction is 4 mL. 23. Add 4 mL of Glycerol 100% to the elution in the DW24 plate to put the DBD in storage buffer (final protein pools volume is now 8 mL with 50% glycerol). Properly annotate and seal with aluminum foil the DW24 plate. Store at 20  C. 24. Transfer 200 μL in a UVStar plate, measure absorbance at 280 nm, and calculate the concentration of the His-TRXDBD in storage buffer. 25. Using the variable span pipet (or robotics), aspirate 5 μL of the elution (Purified fraction) and dispense into a 96-well PCR plate containing 7 μL of LabChip Sample Buffer. Denature for 5 min at 95  C, add 32 μL of water, and analyze the samples following manufacturer’s instruction with the High Sensitivity HT Protein Express protocol (10–100 kDa program) on the LabChip GXII device (Perkin Elmer). This analysis can be replaced by a SDS-PAGE if a Perkin Elmer machine is not available in the lab (see Note 8). 26. This procedure is reproduced three times in a day, in order to purify the 96 His-TRX-DBDs. Once the Purification of the first 24 targets is started on the Tecan robot, thaw the second series of 8 DW24 plates (Target 25–48) and proceed through the lysis/sonication steps. At the end of the purification of targets 1–24, start the program of the Tecan for the purification of target 25–48, thaw the DW for the targets 49–72 and repeat one more time for targets 73–96. 27. Control the purity, the concentration, and the MW of each recombinant DBD by analyzing the LabChip GX II results following the manufacturer’s instructions (see Note 8). The samples with at least a total of 1 mg of DBD are ready for the SELEX analysis (DBD concentration is 0,125 mg/mL or above). Discard the elution fractions without suitable quantity or quality of His-TRX-DBD for the HT-SELEX. 
Following this purification protocol, out of the 420 clones of the project, 263 Ciona robusta DBD (63%) were purified with the correct molecular Mass and in sufficient quantities to be analyzed by HT-SELEX. The scale of production (1 mg) of this pipeline is sufficient to perform hundreds of HT-SELEX on each DBD. The choice of such a « large » scale was dictated by the wish to be able at every step to confirm the purity and correct MW by Caliper or SDSPAGE analysis of all the DBD. Table 1 details the names and positions of the 420 Ciona robusta DBD selected and cloned in this study (plate A to E) together with the identification of the 263 purified DBDs ready for the HT-SELEX. During several years the quality and stability of the DBDs were controlled by running aliquots on Caliper or

Positions and names of the 420 DNA BD plasmids. The full protein names and sequences can be found in ANISEED. The purified DBDs ready for the HT-SELEX are highlighted in green, and the proteins that failed protein production are highlighted in red

Table 1 Ciona robusta DNA BD library organization

Fig. 3 Schematic of the HT-SELEX protocol

SDS-PAGE. More than 95% of these proteins were stable for at least 4 years in the storage buffer. The exact protein sequences of the 420 Ciona robusta DBDs can be found on the ANISEED gene page of each SELEXed TF gene (e.g., https://www.aniseed.cnrs.fr/aniseed/gene/show_selex?unique_id=Cirobu.g00006940) or as a batch file in the download section of ANISEED [24].

3.3 High-Throughput Systematic Evolution of Ligands by EXponential Enrichment (HT-SELEX)

The identification of the specific DNA sequences bound by each transcription factor involves four steps: (1) preparation of a library of randomized double-stranded DNA oligonucleotides and mixing with the DBD proteins described in the previous section; (2) selection of the DBD protein-DNA complexes; (3) amplification of the selected DNA oligonucleotides by PCR (multiple rounds of steps 2 and 3 are performed); and (4) sequencing of the selected DNA oligonucleotides from each cycle by massively parallel sequencing and extraction of the specific TF-binding motifs (see Fig. 3).
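To see why repeating steps 2 and 3 sharpens the pool, a toy model may help. The Python sketch below is a deliberate simplification and not the authors' analysis: each sequence class is assumed to survive the washes with a fixed probability per cycle, and PCR is assumed to restore the pool without bias. The retention probabilities and starting fractions are invented purely for illustration.

```python
def selex_round(fractions, retention):
    """One selection plus amplification cycle on pool fractions.

    fractions: dict mapping sequence class -> fraction of the pool (sums to 1)
    retention: dict mapping sequence class -> probability of surviving the washes
    """
    selected = {k: fractions[k] * retention[k] for k in fractions}
    total = sum(selected.values())
    # Unbiased amplification: renormalize so the fractions sum to 1 again.
    return {k: v / total for k, v in selected.items()}


def run_selex(fractions, retention, cycles):
    """Return the pool composition after each cycle (index 0 is the input library)."""
    history = [dict(fractions)]
    for _ in range(cycles):
        fractions = selex_round(fractions, retention)
        history.append(fractions)
    return history


# Hypothetical numbers: a rare specific site versus nonspecific background.
start = {"specific": 1e-6, "background": 1.0 - 1e-6}
keep = {"specific": 0.5, "background": 0.01}
history = run_selex(start, keep, cycles=7)
```

In this model the odds of the specific class grow by the ratio of the two retention probabilities at every cycle, which is why a sequence that starts as one in a million can dominate the pool after a handful of rounds. Real pools contain a continuum of affinities and PCR bias, so the number of cycles actually needed has to be determined empirically.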

3.3.1 Preparation of the Degenerate Double-Stranded DNA Oligonucleotides Library

The DNA library was initially synthesized as ssDNA oligonucleotides and rendered double-stranded by extension with DNA polymerase. To prepare the library, mix the Generic oligonucleotides with 96 different barcoded oligonucleotides in a 96-well PCR plate, allocating a specific barcode to each well. Double-stranded DNA is generated by a single cycle of polymerase extension. It is advisable to verify the complexity of the library by massively parallel sequencing before using it in SELEX.

PCR Mix

Component | ×1 | ×104
MilliQ | 4.25 μL | 442 μL
5× Phusion HF buf | 11 μL | 1144 μL
10 mM each dNTP | 2 μL | 208 μL
Generic oligo 100 μM | 1.75 μL | 182 μL
Barcode oligo 5 μM | 35 μL | -
Phusion DNA pol (Thermo) | 1 μL | 104 μL
Final | 55 μL | -

Cycling conditions:
1. 98 °C, 1 min
2. 60 °C, 2 min
3. 72 °C, 20 min
4. 4 °C

3.3.2 HT-SELEX



The whole protocol is automated on a Freedom EVO 200 (Tecan) liquid handling robot at room temperature but could also be performed with a variable span multichannel pipette. To perform the HT-SELEX, mix the randomized DNA library with 96 His-tagged DBDs in a 96-well PCR plate. For each DBD, a complex of Nickel beads, protein, and specific DNA ligand is first formed and is then selected by extensive washing steps. Finally, the specifically bound DNA is amplified by PCR using the constant part of the oligos. Amplified oligos from the previous cycle of SELEX are used as DNA input for the next cycle. SELEX assays using a bead protocol to capture TFs and their bound oligonucleotides were automated in 96-well format and run for 7 cycles for each of the 263 recombinant Ciona DBDs, using an initial population of double-stranded oligonucleotides with a central stretch of 20 fully degenerate nucleotides.

1. Dispense 6 μL of the degenerate DNA library (0.5 pmol) into each well of a 96-well PCR plate (see Note 9).
2. Add 9 μL of 1× SELEX buffer mixture (below) into each well.

Component | Volume
10× SELEX buffer | 1.4 μL
1 M DTT | 0.014 μL
250 μg/mL Poly-dIdC | 0.3 μL
Milli-Q | 7.286 μL

3. Add 5 μL of DBD protein solution (250 ng) (see Notes 10 and 11).
4. Incubate for 20 min at room temperature to allow the DBD to bind its specific target oligonucleotides.
5. Add 150 μL of Nickel beads equilibrated in SELEX Binding buffer.


6. Transfer the mixture from the 96-well PCR plate to the MSDV plate.
7. Shake for 30 min at room temperature to allow the DBD to bind to the Nickel beads.
8. Apply mild aspiration to remove the liquid.
9. Dispense 200 μL of 1× SELEX buffer.
10. Apply mild aspiration to remove the liquid.
11. Repeat steps 9 and 10 eleven more times. These extensive washes (240 Nickel bead volumes) remove unbound DBD and most nonspecifically bound oligonucleotides.
12. Resuspend the beads in 200 μL of distilled water.
13. Transfer 4 μL of the mixture to a new PCR plate and add the PCR Mix.

PCR Mix

Component | ×1 | ×104
Milli-Q | 7.7 μL | 800.8 μL
5× GoTaq buf. | 4 μL | 416 μL
25 mM MgCl2 | 1.6 μL | 166.4 μL
2.5 mM dNTP | 1.6 μL | 166.4 μL
Primers 10 μM (vol for each) | 0.5 μL | 52 μL
GoTaq DNA pol. (Promega) | 0.1 μL | 10.4 μL
Final | 20 μL | 1664 μL

Cycling conditions:
1. 94 °C, 1 min
2. 94 °C, 30 s
3. 55 °C, 30 s
4. 72 °C, 30 s
5. Go to step 2, ×17 cycles
6. 72 °C, 3 min
7. 4 °C

14. Validate oligonucleotide amplification by running 4 μL of the PCR reaction on a 1% agarose gel. The E-Gel 48 electrophoresis system (ThermoFisher), or similar, is recommended.
15. Steps 1–14 constitute one SELEX selection cycle. Repeat this cycle three to six times to complete the HT-SELEX, using the selected oligonucleotides of the previous cycle as the input library for the next. The optimal number of cycles depends on the recombinant protein used and cannot be known before the


analysis of sequences from all cycles (see below). After each cycle, a fraction of the selected oligonucleotides is frozen for subsequent sequencing.

3.3.3 Library Sequencing and Motif Extraction

The DNA libraries prepared in the previous section are sequenced on an Illumina HiSeq2000. Sequencing of the ligand DNA pools generates a dataset from which, after barcode demultiplexing of the different samples, TF-binding motifs can be extracted and represented as Position Weight Matrices (PWMs). Two types of PWMs were generated, based on either 6-mers or 8-mers, using a method similar to the one used by Weeder [25]. For each SELEX cycle of each transcription factor, we collected all sequenced oligos that matched the most enriched 6 (8)-mer of the cycle with 1 (2) mismatch(es). This starting k-mer was used as a seed to align the selected oligo sequences, and a frequency matrix of each base at each position was generated from this aligned set of oligos. A less graphic, but potentially more useful, approach is to represent DNA-binding specificity with a 4^k-dimensional vector in which each k-mer is associated with its enrichment score. Such vectors can be downloaded individually from the ANISEED web site for each gene card, alongside numerous other metrics (e.g., https://www.aniseed.cnrs.fr/aniseed/gene/show_selex?unique_id=Cirobu.g00006940), or as a batch from the download section (https://www.aniseed.cnrs.fr/aniseed/download/download_data). In addition, a public track hub (Normalized SELEX data) is provided on the ANISEED Ciona robusta genome browser to directly visualize the local affinity of DNA sequences for each of the SELEXed Ciona transcription factors (Fig. 4). SELEXed oligos corresponding to all 7 cycles of HT-SELEX for the 263 proteins were mixed in equimolar ratios in groups of 96 and mass-sequenced using Illumina GAIIx or HiSeq2000, leading on average to 162,733 sequences per round and per factor (deposited in the European Nucleotide Archive (ENA, EMBL-EBI) under accession PRJEB29730). High-quality TF-binding motifs could be extracted for 139 of the 263 Ciona DBDs submitted to HT-SELEX and

Fig. 4 Example enrichment and statistics for the transcription factor Tbx2/3. Position weight matrices created for each cycle along with the top enriched 6-mer. The final form of the binding site can be seen to stabilize after cycle 5


sequencing. Table 2 details the names and positions of the 263 purified Ciona robusta DBD proteins processed through the HT-SELEX, together with the identification of the 139 DBDs from which a robust DNA-binding motif could be extracted. The DNA-binding motifs identified per DBD can be found in ANISEED [24]. Table 3 details the results along the pipeline, organized by TF family and ranked by success in protein production and in the identification of specific TF-binding DNA motifs. The 24 structural TF families explored exhibit very different success rates in the protein purification pipeline, ranging from 100% (for 10 families) to 37% and 25% for the Zn finger families that are unlikely to bind DNA directly in a sequence-specific manner. Indeed, the protein production yield is much better for the other families, with 75% success (179/239), than for the Zn finger family, with 46% (84/181). A possible reason is the larger MW of their multi-fingered putative DBDs. As with the recombinant DBD production and purification results, the 24 TF families exhibited very different success rates for the identification of specific TF-binding DNA motifs, ranging from 100% (for 6 families) to 0% for the Zn finger families that were classified as unlikely to bind specific DNA sequences. Overall, the success rate in identifying the target DNA sequence of Zn finger family members (32%; 27/84) is much lower than for the other TF families (63%; 112/179). In particular, out of 20 proteins belonging to the Other Zinc finger class, not a single factor was found to bind DNA in a sequence-specific manner, possibly because most of these factors belong to the "Unlikely transcription factor" class. Consistently, SELEX analysis led to the identification of a specific TF-binding motif for only a single protein from the Unlikely class, while 68% of Core proteins and 32% of Possible TFs were found to bind DNA in a sequence-specific manner. This suggests both that our annotation-based TF classification is accurate and that the HT-SELEX protocol is highly specific. Non-Core proteins represent 61% of the putative transcription factors in the Ciona genome, but only 11% of those that we tested by SELEX appeared to have sequence-specific DNA-binding activity (vs. 68% of Core proteins tested under the same conditions). Core TFs constitute close to 95% of the successfully SELEXed proteins. From these numbers, we estimate that the repertoire of Ciona transcription factors is probably on the order of 500, surprisingly similar to the E. coli repertoire of 314 TFs [26], given the vast increase in regulatory complexity between bacteria and a chordate. Note that enriched motifs were obtained for at least one gene in almost all TF families tested, indicating that the established HT-SELEX protocol is efficient across TF families.
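The seed-based motif extraction described at the start of this subheading can be sketched in a few lines of Python. The example is a simplified illustration and not the pipeline used to produce the ANISEED data: it takes the most frequent k-mer as the seed (the chapter uses the most enriched k-mer, which requires a background model), interprets the mismatch rule as "at most" the stated number, keeps only the first matching window of each read, and ignores the reverse strand. All function names and the example reads are hypothetical.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer occurrence across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def aligned_windows(reads, seed, max_mismatches):
    """For each read, keep the first window within max_mismatches of the seed."""
    k = len(seed)
    windows = []
    for read in reads:
        for i in range(len(read) - k + 1):
            window = read[i:i + k]
            if hamming(window, seed) <= max_mismatches:
                windows.append(window)
                break
    return windows


def frequency_matrix(windows):
    """Per-position base frequencies: one dict over A, C, G, T per motif column."""
    matrix = []
    for column in zip(*windows):
        counts = Counter(column)
        matrix.append({base: counts[base] / len(column) for base in "ACGT"})
    return matrix


# Made-up reads for demonstration only.
reads = ["TTAGGTGTGAAC", "CAGGTGTGATTT", "GGAGGTGTCAAA", "CCAGGTGACCTT"]
seed, _ = count_kmers(reads, 6).most_common(1)[0]
windows = aligned_windows(reads, seed, max_mismatches=1)
pwm = frequency_matrix(windows)
```

For 8-mers the same functions would be called with `k=8` and `max_mismatches=2`. The 4^k-dimensional representation mentioned above corresponds to keeping the full k-mer table (here the output of `count_kmers`) and replacing raw counts with an enrichment score, whose exact definition is not given in this chapter.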

Purified proteins were transferred to new plates (A–C) and subjected to HT-SELEX. The DBDs for which a DNA-binding motif could be identified are highlighted in blue, and the proteins for which no specific DNA-binding motif could be identified are highlighted in gray

Table 2 List of Ciona robusta DNA BD library with identified DNA-binding sequences

Table 3 Protein production/HT-SELEX summary classified by TF family (a) and by TF classification (b)

The data are organized by the number of binding motifs obtained per family. The bar plot next to the columns indicates the percentage of Core (green), Possible (orange), and Unlikely (white) TFs, respectively

It is known that the TF DNA-binding motifs in each family are strictly conserved across species [5, 6]. The Ciona TF-binding motif collection produced in this project also supports the evolutionary conservation of binding affinity between distant orthologs (a comparison of the DNA-binding specificity of Ciona and vertebrate orthologs is presented in the SELEX tab of the gene page of each TF in ANISEED). The C2H2 zinc finger family is a rapidly evolving structural class, and only 66 of the 169 Ciona C2H2 genes (39%) have one or more human orthologs, compared to 44/90 (49%) of homeobox genes. Consistently, some of these factors had a unique DNA-binding specificity (e.g., ZF-C2H2-1, ZF-C2H2-6, ZF-C2H2-7, ZF-C2H2-10, ZF-C2H2-15, ZF-C2H2-35). The DNA-binding specificities of Homo sapiens C2H2 zinc fingers have diverged widely, and the C2H2 zinc-finger TFs bind to specific genomic loci to regulate gene expression [7]. Ciona C2H2 zinc fingers may also have a diverged role in Ciona gene regulation. This possibility should be investigated further to address the evolution of this family.


Overall, we have successfully established a robust HT protein production/purification and HT-SELEX pipeline and shown that it can be used to decipher the DNA-binding specificity of a large part of the Ciona robusta transcription factor repertoire. Starting from 44% of the Ciona robusta TF clone collection (420 clones), and using the protocols described in this chapter, 63% (263) of the Ciona robusta DBDs could be produced and purified. From this set of proteins, the binding motifs of 139 DBDs (53%) could be identified, indicating that the vast majority of the purified proteins are indeed functional [24]. The protocol developed for Ciona robusta transcription factors and detailed in this chapter has also been used successfully to study mammalian TFs [5] and fruit fly TFs [6], and it has been extended to describe multimeric TF DNA-binding specificity [27], TF DNA-binding affinity for methylated DNA [28], and TF DNA-binding affinity for nucleosome-bound DNA [29]. We believe that this protocol is a strong tool for deciphering in vitro transcription factor affinity on a large scale.
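The percentages quoted for the pipeline can be reproduced from the raw counts. The following Python sketch simply recomputes them, including the zinc-finger versus other-family breakdown given in Subheading 3.3.3; the variable names and the rounding helper are illustrative choices, while the counts themselves are taken from the text.

```python
def percent(part, whole):
    """Percentage rounded to the nearest integer, as quoted in the text."""
    return round(100 * part / whole)


# Overall funnel: cloned -> purified -> motif identified.
cloned, purified, with_motif = 420, 263, 139

# Breakdown by family group, as quoted in Subheading 3.3.3.
zn_finger = {"cloned": 181, "purified": 84, "with_motif": 27}
other = {"cloned": 239, "purified": 179, "with_motif": 112}

rates = {
    "purified / cloned": percent(purified, cloned),
    "motif / purified": percent(with_motif, purified),
    "Zn finger purified": percent(zn_finger["purified"], zn_finger["cloned"]),
    "other purified": percent(other["purified"], other["cloned"]),
    "Zn finger motif": percent(zn_finger["with_motif"], zn_finger["purified"]),
    "other motif": percent(other["with_motif"], other["purified"]),
}

for label, value in rates.items():
    print(f"{label}: {value}%")
```

The two family groups add up to the overall totals at each stage (181 + 239 = 420, 84 + 179 = 263, 27 + 112 = 139), which is a useful consistency check when updating such a summary.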

4 Notes

1. Initial tests by EMSA for different families suggested that this simple Tris, NaCl, poly-dIdC buffer without detergent worked for most families of TFs, including Zn finger proteins.
2. Plates can be made ahead of time and stored for up to 4 weeks at 4 °C. They should be pre-warmed and dried at room temperature or 37 °C prior to use. This can be done during the 1-h incubation of the transformations. To dry the plates, leave them inside a hood (or in a plate incubator) with their lids off until all moisture has evaporated.
3. This is most easily done using a multichannel pipette with variable span, using only four consecutive pipette tips at a time. Four transformations can be aspirated at once from the 96-well PCR plate, then the pipette span can be extended to fit over the 24-well plate and the culture dispensed. If the pipette has step-based programming (as for the Matrix Equalizer Pipettes, Thermo Scientific), then steps 6 and 7 can be performed in one step by aspirating 120 μL and dispensing 60 μL onto the LB agar plate and 60 μL into the medium, changing the tip span between each dispensing step.
4. The agar plates are only back-ups. They can be stored at 4 °C in case the liquid preculture does not grow but there are colonies on the plates. In such a case, the DBD production is postponed by 24 h, and the precultures are initiated with a dilution of the original preculture in fresh medium, with colonies picked for the few missing precultures to complete the plate.


5. Glycerol stocks can be stored at −80 °C and used to inoculate precultures for subsequent rounds of expression. Glycerol stocks should be made in replicates. Preparation of glycerol stocks: Dispense 30 μL of 100% glycerol, using a multidispensing pipette set to slow speed, into each well of a 96-well microtiter plate. Transfer 120 μL of each culture into the corresponding well of the microtiter plate and mix by pipetting slowly and gently. Seal with plastic adhesive tape and store at −80 °C.
6. While it is possible to include DNase and MgSO4 in the lysis buffer, we recommend not doing so. That way, when the cells are thawed, the lysis will be visible because the cell suspension will be viscous. If the lysozyme was accidentally omitted and DNase and MgSO4 are also present in the lysis buffer, it will be impossible to tell whether the lysis was successful.
7. If the culture OD at harvesting time is above 12, the filter may be clogged by cell debris. In that case, centrifuge the DW24 plate at 3800 × g for 10 min, transfer the supernatant into a new DW24 plate, and use this cleared lysate for the Nickel purification.
8. The LabChip GXII gives quantitative values and is much quicker and more precise than SDS-PAGE for the validation of purity, molecular weight, and sample concentration of each elution of His-TRX-DBD. Alternatively, use the OD for concentration and SDS-PAGE for the purity and approximate MW validation.
9. There is no need to purify the double-stranded oligonucleotides after each PCR reaction. Primers, nucleotides, polymerase, and PCR buffer will not affect the following reaction. The amount of oligonucleotides added to the binding reaction (0.5 pmol) was chosen to explore a large fraction of the population of all possible random sequences. In the case of 20 N oligos, 4^20 = 1.1 × 10^12 different ligands can be generated, and each individual ligand is expected to be present in 0.3 copies in 0.5 pmol, assuming equimolarity. In the case of 40 N oligos, 4^40 = 1.2 × 10^24 different ligands can be generated, and each oligonucleotide is expected to be present in 2 × 10^-13 copies in 0.5 pmol, assuming equimolarity.
10. The DBDs are stored in a liquid state at −20 °C (50% glycerol in storage buffer). On the day of the HT-SELEX experiment, transfer 5 μL of each of the 96 DBDs from the four DW24 elution plates into a new PCR plate. Adjust the protein concentration to 50 ng/μL with 1× SELEX buffer and transfer 5 μL of diluted protein to the SELEX plate. For an average protein concentration (125 μg/mL), add 3 μL of buffer to 2 μL of protein stock.
11. After testing various DBD concentrations in the assay, 250 ng turned out to be the best consensus to get robust HT-SELEX data. Assuming an average MW of 40 kDa for the DBD, this corresponds to roughly 5 pmol of protein (about a tenfold molar excess over the amount of oligonucleotides). When the SELEX enrichment was unsuccessful, the procedure was repeated with various protein quantities (100 ng–2.5 μg).
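The quantities in Notes 9–11 can be checked with a few lines of arithmetic. The Python sketch below is only a worked example under the assumptions stated in the Notes (equimolar library, 0.5 pmol of DNA input, 250 ng of a 40 kDa protein); the function names are hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def copies_per_sequence(n_random, pmol):
    """Expected copies of each possible sequence in an equimolar library."""
    molecules = pmol * 1e-12 * AVOGADRO
    return molecules / 4 ** n_random


def protein_pmol(mass_ng, mw_kda):
    """Convert a protein mass in ng to pmol (ng divided by kDa gives pmol)."""
    return mass_ng / mw_kda


def dilution_volumes(stock_ng_per_ul, target_ng_per_ul, final_ul):
    """Return (stock volume, buffer volume) in μL for a simple dilution."""
    stock_ul = target_ng_per_ul * final_ul / stock_ng_per_ul
    return stock_ul, final_ul - stock_ul


copies_20n = copies_per_sequence(20, 0.5)   # about 0.27, quoted as 0.3 in Note 9
copies_40n = copies_per_sequence(40, 0.5)   # about 2.5e-13, quoted as 2 x 10^-13
dbd_pmol = protein_pmol(250, 40)            # 6.25 pmol for a 40 kDa DBD
stock_ul, buffer_ul = dilution_volumes(125, 50, 5)
```

For a 125 ng/μL stock the dilution comes out at 2 μL of protein plus 3 μL of buffer, as stated in Note 10. The protein amount evaluates to 6.25 pmol for a 40 kDa DBD, slightly above the figure quoted in Note 11, so the molar excess over 0.5 pmol of DNA is closer to 12-fold for a protein of that size; the exact ratio depends on the MW of each DBD.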

Acknowledgments

This work was supported by French ANR grants (Chor_Reg_Net, ANR-05-BLAN-015; Chor-Evo-Net, 0ANR-08-BLAN-0067-01) and an FP6 EU grant (Transcode, LSHG-CT-2004-511990). PL and RV were members of CNRS. KN was supported by the ANR (Chor_Reg_Net) and by CNRS. We thank Drs. Jussi Taipale and Arttu Jolma for supporting the NGS sequencing. This work was supported by the French Infrastructure for Integrated Structural Biology (FRISBI, ANR-10-INBS-05).

References

1. Vaquerizas JM, Kummerfeld SK, Teichmann SA, Luscombe NM (2009) A census of human transcription factors: function, expression and evolution. Nat Rev Genet 10:252–263. https://doi.org/10.1038/nrg2538
2. Lambert SA, Jolma A, Campitelli LF, Das PK, Yin Y, Albu M, Chen X, Taipale J, Hughes TR, Weirauch MT (2018) The human transcription factors. Cell 175:598–599. https://doi.org/10.1016/j.cell.2018.09.045
3. Imai KS (2004) Gene expression profiles of transcription factors and signaling molecules in the ascidian embryo: towards a comprehensive understanding of gene networks. Development 131:4047–4058. https://doi.org/10.1242/dev.01270
4. Miwata K, Chiba T, Horii R, Yamada L, Kubo A, Miyamura D, Satoh N, Satou Y (2006) Systematic analysis of embryonic expression profiles of zinc finger genes in Ciona intestinalis. Dev Biol 292:546–554. https://doi.org/10.1016/j.ydbio.2006.01.024
5. Jolma A, Yan J, Whitington T, Toivonen J, Nitta KR, Rastas P, Morgunova E, Enge M, Taipale M, Wei G, Palin K, Vaquerizas JM, Vincentelli R, Luscombe NM, Hughes TR, Lemaire P, Ukkonen E, Kivioja T, Taipale J (2013) DNA-binding specificities of human transcription factors. Cell 152:327–339. https://doi.org/10.1016/j.cell.2012.12.009

6. Nitta KR, Jolma A, Yin Y, Morgunova E, Kivioja T, Akhtar J, Hens K, Toivonen J, Deplancke B, Furlong EEM, Taipale J (2015) Conservation of transcription factor binding specificities across 600 million years of bilateria evolution. eLife 4. https://doi.org/10.7554/eLife.04837
7. Najafabadi HS, Mnaimneh S, Schmitges FW, Garton M, Lam KN, Yang A, Albu M, Weirauch MT, Radovani E, Kim PM, Greenblatt J, Frey BJ, Hughes TR (2015) C2H2 zinc finger proteins greatly expand the human regulatory lexicon. Nat Biotechnol 33:555–562. https://doi.org/10.1038/nbt.3128
8. Tuerk C, Gold L (1990) Systematic evolution of ligands by exponential enrichment: RNA ligands to bacteriophage T4 DNA polymerase. Science 249:505–510
9. Ellington AD, Szostak JW (1990) In vitro selection of RNA molecules that bind specific ligands. Nature 346:818–822. https://doi.org/10.1038/346818a0
10. Mallikaratchy P (2017) Evolution of complex target SELEX to identify aptamers against mammalian cell-surface antigens. Molecules 22:215. https://doi.org/10.3390/molecules22020215
11. Brunetti R, Gissi C, Pennati R, Caicci F, Gasparini F, Manni L (2015) Morphological evidence that the molecularly determined Ciona intestinalis type A and type B are different species: Ciona robusta and Ciona intestinalis. J Zool Syst Evol Res 53:186–193. https://doi.org/10.1111/jzs.12101
12. Dehal P (2002) The draft genome of Ciona intestinalis: insights into chordate and vertebrate origins. Science 298:2157–2167. https://doi.org/10.1126/science.1080049
13. Imai KS (2006) Regulatory blueprint for a chordate embryo. Science 312:1183–1187. https://doi.org/10.1126/science.1123404
14. Satou Y, Imai KS (2015) Gene regulatory systems that control gene expression in the Ciona embryo. Proc Jpn Acad Ser B Phys Biol Sci 91:33–51. https://doi.org/10.2183/pjab.91.33
15. Finn RD, Attwood TK, Babbitt PC, Bateman A, Bork P, Bridge AJ, Chang H-Y, Dosztányi Z, El-Gebali S, Fraser M, Gough J, Haft D, Holliday GL, Huang H, Huang X, Letunic I, Lopez R, Lu S, Marchler-Bauer A, Mi H, Mistry J, Natale DA, Necci M, Nuka G, Orengo CA, Park Y, Pesseat S, Piovesan D, Potter SC, Rawlings ND, Redaschi N, Richardson L, Rivoire C, Sangrador-Vegas A, Sigrist C, Sillitoe I, Smithers B, Squizzato S, Sutton G, Thanki N, Thomas PD, Tosatto SCE, Wu CH, Xenarios I, Yeh L-S, Young S-Y, Mitchell AL (2017) InterPro in 2017: beyond protein family and domain annotations. Nucleic Acids Res 45:D190–D199. https://doi.org/10.1093/nar/gkw1107
16. Finn RD, Coggill P, Eberhardt RY, Eddy SR, Mistry J, Mitchell AL, Potter SC, Punta M, Qureshi M, Sangrador-Vegas A, Salazar GA, Tate J, Bateman A (2016) The Pfam protein families database: towards a more sustainable future. Nucleic Acids Res 44:D279–D285. https://doi.org/10.1093/nar/gkv1344
17. Satou Y, Satoh N (2005) Cataloging transcription factor and major signaling molecule genes for functional genomic studies in Ciona intestinalis. Dev Genes Evol 215:580–596. https://doi.org/10.1007/s00427-005-00169
18. Satou Y, Yamada L, Mochizuki Y, Takatori N, Kawashima T, Sasaki A, Hamaguchi M, Awazu S, Yagi K, Sasakura Y, Nakayama A, Ishikawa H, Inaba K, Satoh N (2002) A cDNA resource from the basal chordate Ciona intestinalis. Genesis 33:153–154. https://doi.org/10.1002/gene.10119
19. Sayou C, Monniaux M, Nanao MH, Moyroud E, Brockington SF, Thevenon E, Chahtane H, Warthmann N, Melkonian M, Zhang Y, Wong GK-S, Weigel D, Parcy F, Dumas R (2014) A promiscuous intermediate underlies the evolution of LEAFY DNA binding specificity. Science 343:645–648. https://doi.org/10.1126/science.1248229

20. Mathelier A, Fornes O, Arenillas DJ, Chen C, Denay G, Lee J, Shi W, Shyr C, Tan G, Worsley-Hunt R, Zhang AW, Parcy F, Lenhard B, Sandelin A, Wasserman WW (2016) JASPAR 2016: a major expansion and update of the open-access database of transcription factor binding profiles. Nucleic Acids Res 44:D110–D115. https://doi.org/10.1093/nar/gkv1176
21. Vincentelli R, Cimino A, Geerlof A, Kubo A, Satou Y, Cambillau C (2011) High-throughput protein expression screening and purification in Escherichia coli. Methods 55:65–72. https://doi.org/10.1016/j.ymeth.2011.08.010
22. Turchetto J, Sequeira AF, Ramond L, Peysson F, Brás JLA, Saez NJ, Duhoo Y, Blémont M, Guerreiro CIPD, Quinton L, De Pauw E, Gilles N, Darbon H, Fontes CMGA, Vincentelli R (2017) High-throughput expression of animal venom toxins in Escherichia coli to generate a large library of oxidized disulphide-reticulated peptides for drug discovery. Microb Cell Factories 16. https://doi.org/10.1186/s12934-016-0617-1
23. Saez NJ, Nozach H, Blemont M, Vincentelli R (2014) High throughput quantitative expression screening and purification applied to recombinant disulfide-rich venom proteins produced in E. coli. J Vis Exp. https://doi.org/10.3791/51464
24. Brozovic M, Dantec C, Dardaillon J, Dauga D, Faure E, Gineste M, Louis A, Naville M, Nitta KR, Piette J, Reeves W, Scornavacca C, Simion P, Vincentelli R, Bellec M, Aicha SB, Fagotto M, Guéroult-Bellone M, Haeussler M, Jacox E, Lowe EK, Mendez M, Roberge A, Stolfi A, Yokomori R, Brown CT, Cambillau C, Christiaen L, Delsuc F, Douzery E, Dumollard R, Kusakabe T, Nakai K, Nishida H, Satou Y, Swalla B, Veeman M, Volff J-N, Lemaire P (2018) ANISEED 2017: extending the integrated ascidian database to the exploration and evolutionary comparison of genome-scale datasets. Nucleic Acids Res 46:D718–D725. https://doi.org/10.1093/nar/gkx1108
25. Pavesi G, Mauri G, Pesole G (2001) An algorithm for finding signals of unknown length in DNA sequences. Bioinformatics 17(Suppl 1):S207–S214
26. Pérez-Rueda E, Collado-Vides J (2000) The repertoire of DNA-binding transcriptional regulators in Escherichia coli K-12. Nucleic Acids Res 28:1838–1847
27. Jolma A, Yin Y, Nitta KR, Dave K, Popov A, Taipale M, Enge M, Kivioja T, Morgunova E, Taipale J (2015) DNA-dependent formation of transcription factor pairs alters their binding specificity. Nature 527:384–388. https://doi.org/10.1038/nature15518
28. Yin Y, Morgunova E, Jolma A, Kaasinen E, Sahu B, Khund-Sayeed S, Das PK, Kivioja T, Dave K, Zhong F, Nitta KR, Taipale M, Popov A, Ginno PA, Domcke S, Yan J, Schübeler D, Vinson C, Taipale J (2017) Impact of cytosine methylation on DNA binding specificities of human transcription factors. Science 356:eaaj2239. https://doi.org/10.1126/science.aaj2239
29. Zhu F, Farnung L, Kaasinen E, Sahu B, Yin Y, Wei B, Dodonova SO, Nitta KR, Morgunova E, Taipale M, Cramer P, Taipale J (2018) The interaction landscape between transcription factors and the nucleosome. Nature 562:76–81. https://doi.org/10.1038/s41586-018-0549-5

Chapter 24

High-Throughput Micro-Characterization of RNA–Protein Interactions

Sara Gómez, Francisco J. Fernández, and M. Cristina Vega

Abstract

Many cellular processes depend on and are regulated by nucleic acid–protein interactions. In particular, RNA-binding proteins (RBPs) are involved in transcription, translation, modulating RNA polymerase activity, and stabilizing protein–RNA complexes. Furthermore, RBPs participate in the development of pathologies such as cancer and viral infections, and their dysfunction leads to mutations and the aberrant expression of noncoding RNAs. Therefore, the study of RNA–protein interactions represents a central issue for biology and biomedicine. While many valuable insights have been obtained from electrophoretic mobility shift assays (EMSA) and immunoprecipitation (IP), these standard methods suffer from two main limitations: insufficient sensitivity to capture low-concentration RBP–RNA complexes in vitro and difficulty in identifying interactions in vivo. In recent years, high-throughput (HTP) platforms have emerged that combine methodological improvements over conventional techniques with more sensitive detection systems, thereby catalyzing the simultaneous probing and analysis of a vast number of RBP–RNA interactions by cellular proteomics and interactomics approaches. In this chapter, we summarize a selection of state-of-the-art in vitro, in vivo, and computational HTP platforms for the discovery and characterization of RNA–protein interactions. We also reflect on the wealth of information obtained by the structural analysis of RBPs and their RNA-binding domains as a valuable resource for the rational design and implementation of new RNA-binding discovery platforms.

Key words RNA, Protein, Interaction, RNA-binding domain, High-throughput assay, Microcharacterization

1 RNA–Protein Interactions: A Structural Perspective

In 1970 the central dogma of molecular biology established a simplified scheme of genetic information flow in biological systems whereby the role of RNA was solely to act as an intermediary between DNA and protein production [1]. Since then, a continuous stream of new RNA species and RNA-specific functions have been described that extend the catalytic, structural, and regulatory roles played by noncoding RNAs (ncRNAs) in many critical cellular processes. These studies have been aided by sophisticated

Renaud Vincentelli (ed.), High-Throughput Protein Production and Purification: Methods and Protocols, Methods in Molecular Biology, vol. 2025, https://doi.org/10.1007/978-1-4939-9624-7_24, © Springer Science+Business Media, LLC, part of Springer Nature 2019


Table 1 RNA-binding domains

Domain | Structural and topological features | PDB accession codes | References
RRM (RNA-recognition motif) | βαββαβ | 1URN; 1B7F | [7, 8]
ZF (Zinc finger) | ββα | 1UN6 | [9]
KH (K homology) | βααββα or αββααβ | 2ANR; 1EC6 | [10, 11]
dsRBM (dsRNA-binding motif) | αβββα | 3ADI; 1DI2 | [12, 13]
DEAD-box | Helicase core with two tandem RecA-like domains | 4DB2 | [14]
PUF (Pumilio family) | ααα | 5BZ1 | [15]
PAZ (PIWI/Argonaute/Zwille) | β-barrel juxtaposed to an αβ domain in a clamp-like structure | 4Z4D | [16]
Lsm (Sm-like) | Two conserved motifs Sm1 and Sm2; N-terminal α-helix followed by a twisted five-stranded β-sheet; heteroheptameric ring | 2YLC | [17]
SAM (Sterile alpha motif) | Globular fold | 2B6G | [18]

bioinformatics methods to find noncoding RNA genes in genomes [2]. Recently, RNA molecules have been proposed as drug targets based on their effector functions [3, 4]. As a result of the complexity of RNA biology, >1000 RBP genes are encoded in the human genome, and mutations in these genes can lead to loss-of-function proteins and aberrant RNA expression patterns with dramatic consequences for the cell. RNA-binding domains in RBPs can be categorized on a structural basis [5, 6]. A summary of RNA-binding domains found in RBPs from all domains of life is presented in Table 1. The four most frequent RNA-binding domains are the RNA-recognition motif (RRM), the Zinc-finger domain (ZF), the KH domain, and the double-stranded RNA-binding motif (dsRBM). The RRM, also known as the RNA-binding domain (RBD) and the ribonucleoprotein domain (RNP), is the most abundant RNA-binding domain in higher eukaryotes and the most extensively studied. The ZF domain is about 30 amino acids long and folds into a ββα topology in which the α-helix and the β-hairpin are stabilized by


a Zn2+ ion [19]. ZFs bind DNA, single-stranded (ss) and doublestranded (ds) RNA. They are classified according to the Zn2+chelating amino acid sequence motif, which contains cysteine and histidine residues (C2H2, CCCH, CCHC, and CCCC). ZFs are typically found in multiple tandem repeats in proteins, which allow them to recognize different RNA sequences through a variety of models. Some ZFs recognize only the backbone of dsRNA molecules [9], while other ZFs can probe specific bases by establishing stacking interactions between aromatic residues and hydrogen bonds [20]. The K homology (KH) motif comprises about 70 amino acids, and it is found in two topologically distinct domain versions: the eukaryotic Type I KH domain (βααββα) and the prokaryotic Type II KH domain (αββααβ) [21, 22]. Type I and II differ in the precise topology of the central β-sheet, whereas in Type I all β-strands are antiparallel, in Type II the central β2 strand is parallel to β3 and antiparallel to β1. In both cases, the β-strand is packed against three α-helices. The loop connecting α1 and α2 contains the conserved GXXG motif that identifies the KH family. KH domains can bind to many different four-nucleotide sequences through a variety of interactions, except for stacking interactions. The lack of stacking interactions leads to low-affinity binding events. To overcome the latter limitation, the KH domains have evolved two strategies: extension of the KH domain surface with, e.g., an extended C-terminal helix, and the multimerization of KH domains in the same RBP [23, 24]. In both cases, the various motifs can interact with the same RNA molecule independently or cooperatively. The dsRBM is approximately 70 amino acids and folds in an αβββα topology, usually found in multiple copies. The structural features of RNA that are recognized by the dsRBM are located on the face of a regular A-form helix covering two consecutive minor grooves separated by one major groove. 
Although the dsRBM was first described as binding preferentially to dsRNA, multiple sequence alignments have revealed significant sequence and length divergence across dsRBMs, thereby suggesting multiple binding specificities [25].
There are also RNA-binding motifs with specialized functions. The DEAD-box domain contains a helicase core with two tandem RecA-like domains where RNA recognition is ATP dependent [26]. The Pumilio Repeat Domain (PUF), which is present in a highly conserved set of eukaryotic ssRNA-binding proteins, is characterized by a 35-amino-acid ααα topology that comes in tandem repeats of 6–8 consecutive domains at the C-terminus of PUF family proteins [27]. The PIWI/Argonaute/Zwille (PAZ) domain consists of a 110-amino-acid clamp-like structure formed by the juxtaposition of a β-barrel domain and an αβ domain; PAZ domains recognize dsRNA and ssRNA, which sit on the inner side of the clamp [28]. Much more widespread in nature, the Sm-like


Sara Gómez et al.

family (Lsm) fulfills roles in splicing, nuclear RNA processing, and messenger RNA decay. Lsm proteins are composed of two conserved motifs, Sm1 and Sm2 [29], with the Sm fold consisting of an N-terminal α-helix followed by a twisted five-stranded β-sheet [30]. The Sm proteins assemble into heteroheptameric ring structures that bind a single-stranded region of RNAs, e.g., the U1, U2, U4, and U5 small nuclear RNAs (snRNAs). Finally, the SAM (sterile alpha motif) domains are among the most abundant interaction domains and, despite their sequence conservation, are very versatile in their binding properties, being able to interact with other SAM domains, non-SAM domains, and RNA. The RNA-binding SAM domains are helical globular domains with a large electropositive patch for RNA interaction [31]. Additionally, SAM-containing proteins with iron-sulfur clusters in their structure are responsible for several crucial modifications in tRNA such as methylation, methylthiolation, or carboxymethylation, among others [32]. Intriguingly, folds like the E1-like activating enzyme domain, which were not traditionally considered to be tRNA-binding modules, have recently been found to establish stable interactions with tRNA and to catalyze specific posttranscriptional modifications at nucleotide positions that are crucial for translation efficiency and fidelity [33, 34].
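As a minimal illustration of how a short sequence signature such as the GXXG loop of KH domains can be located in a protein sequence, the following Python sketch scans for the pattern with a regular expression. The function name `find_gxxg` and the test sequence are hypothetical and are not taken from any published tool. A four-residue pattern occurs frequently by chance, so a hit is only a starting point for domain assignment and not evidence of a KH fold.

```python
import re

# Hypothetical helper: report every GXXG occurrence (X = any residue) in a
# protein sequence given in one-letter code. The pattern is wrapped in a
# lookahead so that overlapping matches are also reported.
GXXG = re.compile(r"(?=(G[A-Z]{2}G))")


def find_gxxg(sequence):
    """Return (1-based start position, matched tetrapeptide) pairs."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in GXXG.finditer(seq)]


# Invented test sequence, not a real protein.
toy = "MKVLIPSGKAGAIIGKGGETIKQL"
hits = find_gxxg(toy)
for position, motif in hits:
    print(position, motif)
```

In practice such a scan would be combined with profile-based domain detection, since the GXXG loop is only informative in the context of the surrounding fold.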

2 Methods for Detecting RNA–RBP Interactions

Many assays have been developed to identify RNA-binding proteins and to characterize the affinity and specificity of their interactions, both in vitro and in vivo. Frequently, the identification of novel RBPs requires isolating them together with the associated target RNA, e.g., by immobilizing the target RNA on a solid support, subjecting cell extracts to affinity chromatography, eluting the bound proteins, and separating and identifying them by mass spectrometry [35, 36]. Electrophoretic mobility shift assays (EMSA) are commonly used for the characterization of RBP–RNA interactions, whereby RNA-containing protein complexes are identified by the retardation that RBPs elicit on the electrophoretic mobility of RNA in acrylamide or agarose gels with respect to free RNA [37–43]. Chemical or UV cross-linking can increase the stability of labile RNA–RBP interactions, thereby lowering the detection limit of the technique [42]. Despite their merits, these two approaches are limited in their ability to discriminate physiologically relevant from non-physiological interactions, which can lead to false positives that cannot be validated in in vivo assays, and neither can be easily adapted for high-throughput (HTP) screening. HTP platforms for the characterization of RNA–RBP interactions can be divided into two groups: RNA-centric methods, where


RNA is used as a bait to capture interacting RBPs, and RBP-centric methods, where the RBP is used as a bait to trap interacting RNA [6, 44–55]. These categories are not mutually exclusive and, in fact, RNA- and RBP-centric platforms share aspects of their protocols. For example, both platforms make use of molecular labeling and immunoprecipitation. Both RBP and RNA molecules can be labeled specifically, thereby allowing the detection of an interaction between them. Several genetic or chemical tags are commercially available, including polyhistidine and streptavidin tags, biotin, and aptamer-based tags. Selecting the most appropriate labeling and detection strategies is critical for the success of an RNA–RBP interaction assay; in particular, one must aim to maximize the signal-to-noise ratio while reducing any potential interference that the tag might have on the native structure of the interacting biomolecules [46].
In microarray-based interaction experiments, many different protein samples can be arrayed on a microchip for probing with fluorescently labeled RNA. The detection of a hit (an RNA-interacting protein) relies on scanning the entire microarray with a laser tuned to excite the specific fluorophore attached to the probe RNA [56, 57]. In the following sections, we introduce the main in vitro and in vivo HTP methods devised for the identification and characterization of RNA–protein interactions.
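The affinity information that an EMSA titration can provide may be made concrete with a short Python sketch. It assumes, as a simplification not stated in the text, a single-site equilibrium in which the labeled RNA is present at a concentration well below the dissociation constant, so that the fraction of shifted RNA follows θ = [P] / (Kd + [P]). The function names, the concentration grid, and the titration values are all invented for illustration.

```python
import math


def fraction_bound(protein_conc, kd):
    """Single-site isotherm: fraction of RNA in the shifted band."""
    return protein_conc / (kd + protein_conc)


def fit_kd(protein_concs, fractions, kd_min=1e-3, kd_max=1e3, steps=6001):
    """Least-squares Kd estimate from a scan over a log-spaced grid.

    Kd is returned in the same units as protein_concs.
    """
    best_kd, best_sse = None, math.inf
    log_min, log_max = math.log10(kd_min), math.log10(kd_max)
    for i in range(steps):
        kd = 10 ** (log_min + (log_max - log_min) * i / (steps - 1))
        sse = sum((fraction_bound(p, kd) - f) ** 2
                  for p, f in zip(protein_concs, fractions))
        if sse < best_sse:
            best_kd, best_sse = kd, sse
    return best_kd


# Invented titration: protein in nM, noiseless fractions simulated with Kd = 25 nM.
concs = [1, 3, 10, 30, 100, 300]
observed = [fraction_bound(c, 25.0) for c in concs]
kd_estimate = fit_kd(concs, observed)
print(round(kd_estimate, 1))
```

A grid scan is used here only to keep the sketch free of external dependencies; with real gel quantifications a nonlinear regression routine and replicate measurements would be the more usual choice.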

3 In Vitro High-Throughput Approaches: RNA Compete, RAPID, and MIST

RNA Compete is an HTP method to detect RNA sequences and structures that can bind a specific RBP (Fig. 1). This technique relies on the generation of extensive RNA libraries by in vitro transcription of a diverse pool of different RNA sequences and structures. The transcripts are incubated with the target RBP, which has been previously labeled with a suitable affinity tag. Next, stable RNA–RBP complexes are pulled down, and the binding RNA molecules are detected by hybridization to a detection microarray [49].
Another HTP method to study RNA–protein interactions is RAPID (Fig. 1). In contrast to RNA Compete, RAPID is based on the use of RNA tags. RNA molecules can be chemically tagged and immobilized onto a solid support. In this configuration, various RBP-containing samples (e.g., cell or nuclear extracts) can be passed over the support to discover potential specific binders. Bound RBPs are then eluted and identified by mass spectrometry or Western blot [49].
A recent in vitro approach named MIST (Microarray Identification of Shifted tRNAs) addresses the specific interaction between tRNA and aminoacyl-tRNA synthetases independently of the


Fig. 1 Schematic representation of in vitro high-throughput techniques to detect and characterize RNA–RBP interactions and their workflows

aminoacylation reaction, combining EMSA and microarray analyses. In this technique, total tRNAs from an organism are extracted, labeled, and used for the native electrophoretic separation of protein–tRNA complexes formed in vitro. These complexes are analyzed by EMSA and isolated by gel band extraction before being applied to an array assay to identify the selectively bound tRNAs [58].
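To illustrate how sequence preferences might be read out from a pull-down experiment of the kind described in this section, the following Python sketch compares k-mer frequencies in RBP-bound RNAs with those in the input pool. This is a deliberately simplified stand-in and not the scoring scheme of RNA Compete or of any published pipeline; the function names, the pseudocount, and the toy sequences are assumptions made for the example.

```python
from collections import Counter
import math


def kmer_counts(sequences, k):
    """Count every overlapping k-mer across a list of RNA sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper().replace("T", "U")
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def kmer_enrichment(bound, pool, k=4, pseudocount=1.0):
    """log2 ratio of k-mer frequencies in bound RNAs versus the input pool."""
    b, p = kmer_counts(bound, k), kmer_counts(pool, k)
    kmers = set(b) | set(p)
    # The pseudocount keeps k-mers absent from one set from giving infinite ratios.
    b_total = sum(b.values()) + pseudocount * len(kmers)
    p_total = sum(p.values()) + pseudocount * len(kmers)
    return {
        kmer: math.log2(((b[kmer] + pseudocount) / b_total) /
                        ((p[kmer] + pseudocount) / p_total))
        for kmer in kmers
    }


# Toy sequences invented for illustration only.
pool = ["ACGUACGUAC", "GGCAUUGCAA", "CUAGCUAGCU", "UUUUGCACGA"]
bound = ["AAUGCAUGCA", "CCUGCAUGGA", "GUGCAUGUUC"]
scores = kmer_enrichment(bound, pool)
top = max(scores, key=scores.get)
print(top, round(scores[top], 2))
```

With these toy sequences the top-scoring 4-mer is CAUG, which occurs in every bound sequence and in none of the pool sequences. Real data sets would require many more sequences and a statistical assessment of the enrichment.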

4 In Vivo High-Throughput Approaches: RIP, CLIP-seq, HiTS-CLIP, PAR-CLIP, and iCLIP

RNA immunoprecipitation (RIP) is a protein-centric approach that exploits protein-specific antibodies to purify RNA–RBP complexes from cell and tissue extracts under physiological conditions (Fig. 2). The protein bait is then degraded by proteinases, and the RNA is reverse transcribed and detected by hybridization to microarrays


Fig. 2 Schematic representation of in vivo high-throughput techniques to detect and characterize RNA–RBP interactions and their workflows

(RIP-chip) or next-generation sequencing (RIP-seq) [59–62]. The key advantages of RIP are the near-native working conditions, which allow the formation of RNA–RBP interactions, and the comprehensive coverage of genomic RNA sequences that can be tested in parallel. However, the same favorable conditions for the formation of RNA–protein interactions also explain the high rate of false positive interactions detected with RIP. Modifications of the original RIP method have overcome this limitation. In CLIP (Cross-Linking and ImmunoPrecipitation), UV cross-linking is used to stabilize physiological RNA–RBP interactions before purification under stringent conditions and size selection of the cross-linked complexes by SDS-PAGE (Fig. 2). The co-purified RNA molecules can be reverse transcribed into a cDNA library and identified by HTP sequencing (HiTS-CLIP) [63]. Additional related methods achieve nucleotide resolution in the identification of cognate RNAs, including Photoactivatable-Ribonucleoside-Enhanced CLIP (PAR-CLIP) and Individual-Nucleotide Resolution CLIP (iCLIP) [63–67] (Fig. 2). Cross-Linking and Affinity Purification (iCLAP) is a


variation of iCLIP that resorts to a double affinity purification involving streptavidin/polyhistidine double-tagged RBPs and stringent purification methods to ensure the correct selection of the targeted RBP [49, 68] (Fig. 2). CLIP-based methods are very powerful for detecting physiological RNA–protein interactions but are technically challenging to set up and to interpret. A streamlined version of the CLIP experimental setup, eCLIP (Enhanced CLIP), has recently been developed to accelerate the CLIP workflow [69]. iCLAP is thus a recent example of the alternative RNA-capturing methods that rely on affinity purification with streptavidin or polyhistidine tags to select the targeted RBP.
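One simple way to rank candidate targets from a RIP-seq experiment is to compare read counts in the immunoprecipitate with those in the input sample. The Python sketch below does this with library-size scaling and a pseudocount. It is a hedged illustration only: the function name, the transcript identifiers, the read counts, and the two-fold cutoff are all hypothetical, and such a ratio does not by itself remove the false positives discussed above.

```python
import math


def rip_enrichment(ip_counts, input_counts, pseudocount=1.0):
    """log2(IP / input) per transcript after scaling to counts per million."""
    ip_total = sum(ip_counts.values())
    input_total = sum(input_counts.values())
    result = {}
    for transcript in set(ip_counts) | set(input_counts):
        ip_cpm = 1e6 * (ip_counts.get(transcript, 0) + pseudocount) / ip_total
        in_cpm = 1e6 * (input_counts.get(transcript, 0) + pseudocount) / input_total
        result[transcript] = math.log2(ip_cpm / in_cpm)
    return result


# Invented read counts for three hypothetical transcripts.
ip = {"tx_A": 900, "tx_B": 50, "tx_C": 50}
inp = {"tx_A": 100, "tx_B": 450, "tx_C": 450}
enrichment = rip_enrichment(ip, inp)

# Arbitrary illustrative cutoff: at least two-fold enrichment (log2 >= 1).
candidates = sorted(t for t, score in enrichment.items() if score >= 1.0)
print(candidates)
```

In a real analysis, replicates, a control immunoprecipitation, and a statistical model of count variability would be needed before calling any transcript a target.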

5 Predictive Computational Strategies

In addition to the experimental advances implemented in the in vivo and in vitro approaches, numerous in silico strategies have been developed to predict RNA–protein interactions [2, 49, 51, 52, 70–73]. Some of the software packages and databases are available through public servers. New bioinformatics tools regularly appear that complement and provide valuable alternatives to the

Table 2 Computational methods to discover RNA–RBP interactions

| Method | Software | Features | References |
|---|---|---|---|
| RNA based | MEME, RBPmap, SeAMotE, RNAcontext | Identify motifs enriched in RNA targets using RNA data sets | [74], [75], [76], [77] |
| RNA based | lncProt | Identify likelihood that long noncoding RNAs (lncRNAs) bind RBPs | [78] |
| Protein based | Struct-NB, PRIP, SPOT-Seq, OPRA | Identify regions in the protein surface able to bind RNA using available structures or templates | [79], [80], [81], [82] |
| Protein based | BindN+, Pprint, RNAProtB, RNABindR+ | Identify binding residues based on primary features such as pKa value, hydrophobicity, molecular mass, etc. | [83], [84], [53], [85] |
| Protein based | HomPRIP | Identify binding sites based on sequence homology | [86] |
| Protein based | HMMER | RNA-binding ability of individual amino acids considering the sequence context. Not valid for de novo detection | [87] |
| Protein based | catRAPID signature | Identify RNA-binding regions by considering physicochemical features (called a signature) present in known RBPs | [88] |


existing methods. Table 2 presents a summary of representative computational methods for RBP discovery. Most methods currently in use are based on machine learning algorithms such as Hidden Markov Models (HMMs), Random Forest (RF), and Support Vector Machines (SVM). In these methods, the prediction of an RNA–RBP interaction is addressed by searching for potential RNA-binding motifs, residues, or binding sites in RBPs with structural and sequence comparisons, and by identifying protein-binding motifs in RNA targets [49, 51, 70]. Prior knowledge of the positively charged nature of the surface of nucleic acid-interacting proteins has been incorporated in many computational methods, although not all proteins with positively charged surface patches bind RNA. Inclusion of additional features, such as sequence homology, amino acid composition, polarity, and prior knowledge of other types of interactions (e.g., hydrogen bond networks, van der Waals interactions), has increased the power of software programs aimed at predicting the formation of RNA–protein complexes [70]. The accuracy of bioinformatics predictions can be greatly enhanced by the availability of experimental structural information about the RNA and the RBPs from X-ray crystallography, NMR, and cryo-EM. Nevertheless, in the absence of more direct structural information, homology modeling and docking can be used to build reasonably accurate models of RNA–protein complexes [70].
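The kind of sequence-derived descriptor that such classifiers take as input can be sketched in a few lines of Python. The example below computes charge-related composition features from a protein sequence; it is not the feature set of any tool listed in Table 2, and the function name, the window size, the choice to ignore histidine, and the test sequence are assumptions made for illustration. Sequence-level charge is also only a rough proxy for surface charge.

```python
def charge_features(sequence, window=10):
    """Simple charge-related composition features of a protein sequence."""
    seq = sequence.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")
    positive = sum(seq.count(aa) for aa in "KR")  # Lys and Arg; His is ignored
    negative = sum(seq.count(aa) for aa in "DE")  # Asp and Glu
    # Highest local density of K/R in any window: a crude proxy for a basic patch.
    w = min(window, n)
    best_window = max(
        sum(1 for aa in seq[i:i + w] if aa in "KR") / w
        for i in range(n - w + 1)
    )
    return {
        "frac_positive": positive / n,
        "frac_negative": negative / n,
        "net_charge_per_residue": (positive - negative) / n,
        "max_window_positive": best_window,
    }


# Invented 20-residue sequence, not a real protein.
features = charge_features("MSDEEKRKRGKLAAAEDLLG", window=5)
for name, value in features.items():
    print(name, value)
```

A feature dictionary of this kind could be converted into a numeric vector and passed to an SVM or Random Forest implementation. As the text notes, a basic patch alone does not imply RNA binding, so such features are only useful in combination with others.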

6 Outlook and Future Challenges

RNA is seldom naked in cells. From transcription to degradation, RNA molecules are in tight association with RBPs that catalyze their synthesis, remodel their structure, regulate their functions, and ultimately recycle them. Understanding RNA biology, therefore, requires identifying all RNA–RBP interactions and constructing mechanistic models of the recognition and effector processes. The complexity of such an interaction map has been revealed by experimental techniques that capture, purify, and characterize each partner of an RNA–RBP complex. However, the fact that many proteins that interact with RNA lack recognizable canonical binding motifs remains a major challenge in the field. Refinement and development of HTP platforms to discover RNA–RBP interactions in vitro and in vivo as well as to manage and interpret the vast amount of experimental data thus generated are poised to revolutionize our understanding of the complex interplay between RNA and the RNA interactome.


Acknowledgments

We gratefully acknowledge the support received during the preparation of this chapter. MCV has received funding from the Spanish Ministerio de Economía y Competitividad (CTQ2015-66206-C22-R and SAF2015-72961-EXP) and the Regional Government of Madrid (S2017/BMD-3673). Abvance Biotech srl contributed with salaries (FJF). The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

1. Crick F (1970) Central dogma of molecular biology. Nature 227:561–563. https://doi.org/10.1038/227561a0
2. Abbas Q, Raza SM, Biyabani AA, Jaffar MA (2016) A Review of Computational methods for finding non-coding RNA genes. Genes (Basel). https://doi.org/10.3390/genes7120113
3. Eddy SR (2001) Non-coding RNA genes and the modern RNA world. Nat Rev Genet 2:919–929. https://doi.org/10.1038/35103511
4. Matsui M, Corey DR (2017) Non-coding RNAs as drug targets. Nat Rev Drug Discov 16:167–179. https://doi.org/10.1038/nrd.2016.117
5. Cléry A, Allain FH-T (2013) From structure to function of RNA binding domains. Madame Curie Bioscience Database, NCBI Bookshelf
6. Re A, Joshi T, Kulberkyte E et al (2014) RNA-protein interactions: an overview. Methods Mol Biol 1097:491–521. https://doi.org/10.1007/978-1-62703-709-9_23
7. Oubridge C, Ito N, Evans PR et al (1994) Crystal structure at 1.92 Å resolution of the RNA-binding domain of the U1A spliceosomal protein complexed with an RNA hairpin. Nature 372:432–438. https://doi.org/10.1038/372432a0
8. Handa N, Nureki O, Kurimoto K et al (1999) Structural basis for recognition of the tra mRNA precursor by the sex-lethal protein. Nature 398:579–585. https://doi.org/10.1038/19242
9. Lu D, Searles MA, Klug A (2003) Crystal structure of a zinc-finger-RNA complex reveals two modes of molecular recognition. Nature 426:96–100. https://doi.org/10.1038/nature02088
10. Teplova M, Malinina L, Darnell JC et al (2011) Protein-RNA and protein-protein recognition by dual KH1/2 domains of the neuronal splicing factor Nova-1. Structure 19:930–944. https://doi.org/10.1016/j.str.2011.05.002
11. Lewis HA, Musunuru K, Jensen KB et al (2000) Sequence-specific RNA binding by a Nova KH domain: implications for paraneoplastic disease and the fragile X syndrome. Cell 100:323–332
12. Yang SW, Chen H-Y, Yang J et al (2010) Structure of Arabidopsis HYPONASTIC LEAVES1 and its molecular implications for miRNA processing. Structure 18:594–605. https://doi.org/10.1016/j.str.2010.02.006
13. Ryter JM, Schultz SC (1998) Molecular basis of double-stranded RNA-protein interactions: structure of a dsRNA-binding domain complexed with dsRNA. EMBO J 17:7505–7513. https://doi.org/10.1093/emboj/17.24.7505
14. Mallam AL, Del Campo M, Gilman B et al (2012) Structural basis for RNA-duplex recognition and unwinding by the DEAD-box helicase Mss116p. Nature 490:121–125. https://doi.org/10.1038/nature11402
15. Wilinski D, Qiu C, Lapointe CP et al (2015) RNA regulatory networks diversified through curvature of the PUF protein scaffold. Nat Commun 6:8213. https://doi.org/10.1038/ncomms9213
16. Schirle NT, Sheu-Gruttadauria J, Chandradoss SD et al (2015) Water-mediated recognition of t1-adenosine anchors Argonaute2 to microRNA targets. eLife. https://doi.org/10.7554/eLife.07646
17. Sauer E, Weichenrieder O (2011) Structural basis for RNA 3′-end recognition by Hfq. Proc Natl Acad Sci U S A 108:13065–13070. https://doi.org/10.1073/pnas.1103420108
18. Johnson PE, Donaldson LW (2006) RNA recognition by the Vts1p SAM domain. Nat Struct Mol Biol 13:177–178. https://doi.org/10.1038/nsmb1039
19. Brown RS (2005) Zinc finger proteins: getting a grip on RNA. Curr Opin Struct Biol 15:94–98. https://doi.org/10.1016/j.sbi.2005.01.006
20. Plambeck CA, Kwan AHY, Adams DJ et al (2003) The structure of the zinc finger domain from human splicing factor ZNF265 fold. J Biol Chem 278:22805–22811. https://doi.org/10.1074/jbc.M301896200
21. Grishin NV (2001) KH domain: one motif, two folds. Nucleic Acids Res 29:638–643
22. Valverde R, Edwards L, Regan L (2008) Structure and function of KH domains. FEBS J 275:2712–2726. https://doi.org/10.1111/j.1742-4658.2008.06411.x
23. García-Mayoral MF, Hollingworth D, Masino L et al (2007) The structure of the C-terminal KH domains of KSRP reveals a noncanonical motif important for mRNA degradation. Structure 15:485–498. https://doi.org/10.1016/j.str.2007.03.006
24. Beuth B, Pennell S, Arnvig KB et al (2005) Structure of a Mycobacterium tuberculosis NusA-RNA complex. EMBO J 24:3576–3587. https://doi.org/10.1038/sj.emboj.7600829
25. Liu Y, Lei M, Samuel CE (2000) Chimeric double-stranded RNA-specific adenosine deaminase ADAR1 proteins reveal functional selectivity of double-stranded RNA-binding domains from ADAR1 and protein kinase PKR. Proc Natl Acad Sci U S A 97:12541–12546. https://doi.org/10.1073/pnas.97.23.12541
26. Wang S, Hu Y, Overgaard MT et al (2006) The domain of the Bacillus subtilis DEAD-box helicase YxiN that is responsible for specific binding of 23S rRNA has an RNA recognition motif fold. RNA 12:959–967. https://doi.org/10.1261/rna.5906
27. Spassov DS, Jurecic R (2003) The PUF family of RNA-binding proteins: does evolutionarily conserved structure equal conserved function? IUBMB Life 55:359–366. https://doi.org/10.1080/15216540310001603093
28. Siomi MC, Sato K, Pezic D, Aravin AA (2011) PIWI-interacting small RNAs: the vanguard of genome defence. Nat Rev Mol Cell Biol 12:246–258. https://doi.org/10.1038/nrm3089
29. Hermann H, Fabrizio P, Raker VA et al (1995) snRNP Sm proteins share two evolutionarily conserved sequence motifs which are involved in Sm protein-protein interactions. EMBO J 14:2076–2088


30. Wilusz CJ, Wilusz J (2005) Eukaryotic Lsm proteins: lessons from bacteria. Nat Struct Mol Biol 12:1031–1036. https://doi.org/10.1038/nsmb1037
31. Qiao F, Bowie JU (2005) The many faces of SAM. Sci STKE 2005:re7. https://doi.org/10.1126/stke.2862005re7
32. Kimura S, Suzuki T (2015) Iron-sulfur proteins responsible for RNA modifications. Biochim Biophys Acta 1853:1272–1283. https://doi.org/10.1016/j.bbamcr.2014.12.010
33. López-Estepa M, Ardá A, Savko M et al (2015) The crystal structure and small-angle X-ray analysis of CsdL/TcdA reveal a new tRNA binding motif in the MoeB/E1 superfamily. PLoS One 10:e0118606. https://doi.org/10.1371/journal.pone.0118606
34. Fernández FJ, Ardá A, López-Estepa M et al (2016) Mechanism of sulfur transfer across protein–protein interfaces: the cysteine desulfurase model system. ACS Catal 6:3975–3984. https://doi.org/10.1021/acscatal.6b00360
35. Francisco-Velilla R, Fernandez-Chamorro J, Lozano G et al (2015) RNA-protein interaction methods to study viral IRES elements. Methods 91:3–12. https://doi.org/10.1016/j.ymeth.2015.06.023
36. Tacheny A, Dieu M, Arnould T, Renard P (2013) Mass spectrometry-based identification of proteins interacting with nucleic acids. J Proteome 94:89–109. https://doi.org/10.1016/j.jprot.2013.09.011
37. Fillebeen C, Wilkinson N, Pantopoulos K (2014) Electrophoretic mobility shift assay (EMSA) for the study of RNA-protein interactions: the IRE/IRP example. J Vis Exp. https://doi.org/10.3791/52230
38. Yakhnin AV, Yakhnin H, Babitzke P (2012) Gel mobility shift assays to detect protein-RNA interactions. Methods Mol Biol 905:201–211. https://doi.org/10.1007/978-1-61779-949-5_12
39. Wassarman KM (2012) Native gel electrophoresis to study the binding and release of RNA polymerase by 6S RNA. Methods Mol Biol 905:259–271. https://doi.org/10.1007/978-1-61779-949-5_17
40. Ream JA, Lewis LK, Lewis KA (2016) Rapid agarose gel electrophoretic mobility shift assay for quantitating protein: RNA interactions. Anal Biochem 511:36–41. https://doi.org/10.1016/j.ab.2016.07.027
41. Carey MF, Peterson CL, Smale ST (2013) Electrophoretic mobility-shift assays. Cold Spring Harb Protoc 2013:636–639. https://doi.org/10.1101/pdb.prot075861
42. Scott V, Clark AR, Docherty K (1994) The gel retardation assay. Methods Mol Biol 31:339–347. https://doi.org/10.1385/089603-258-2:339
43. Fernández FJ, Gómez S, Navas-Yuste S et al (2017) Protein-tRNA agarose gel retardation assays for the analysis of the N6-threonylcarbamoyladenosine TcdA function. J Vis Exp. https://doi.org/10.3791/55638
44. Uren PJ, Bahrami-Samani E, Burns SC et al (2012) Site identification in high-throughput RNA-protein interaction data. Bioinformatics 28:3013–3020. https://doi.org/10.1093/bioinformatics/bts569
45. Tome JM, Ozer A, Pagano JM et al (2014) Comprehensive analysis of RNA-protein interactions by high-throughput sequencing-RNA affinity profiling. Nat Methods 11:683–688. https://doi.org/10.1038/nmeth.2970
46. Sutandy FXR, Hsiao FS-H, Chen C-S (2016) High throughput platform to explore RNA-protein interactomes. Crit Rev Biotechnol 36:11–19. https://doi.org/10.3109/07388551.2014.922916
47. Schlundt A, Tants J-N, Sattler M (2017) Integrated structural biology to unravel molecular mechanisms of protein-RNA recognition. Methods 118–119:119–136. https://doi.org/10.1016/j.ymeth.2017.03.015
48. Rinn JL, Ule J (2014) Oming in on RNA-protein interactions. Genome Biol 15:401. https://doi.org/10.1186/gb4158
49. Marchese D, de Groot NS, Lorenzo Gotor N et al (2016) Advances in the characterization of RNA-binding proteins. Wiley Interdiscip Rev RNA 7:793–810. https://doi.org/10.1002/wrna.1378
50. Cook KB, Hughes TR, Morris QD (2015) High-throughput characterization of protein-RNA interactions. Brief Funct Genomics 14:74–89. https://doi.org/10.1093/bfgp/elu047
51. Choi D, Park B, Chae H et al (2017) Predicting protein-binding regions in RNA using nucleotide profiles and compositions. BMC Syst Biol 11:16. https://doi.org/10.1186/s12918-017-0386-4
52. Cheng Z, Huang K, Wang Y et al (2017) Selecting high-quality negative samples for effectively predicting protein-RNA interactions. BMC Syst Biol 11:9. https://doi.org/10.1186/s12918-017-0390-8
53. Cheng C-W, Su EC-Y, Hwang J-K et al (2008) Predicting RNA-binding sites of proteins using support vector machines and evolutionary information. BMC Bioinformatics 9(Suppl 12):S6. https://doi.org/10.1186/1471-2105-9-S12-S6
54. Carneiro DG, Clarke T, Davies CC, Bailey D (2016) Identifying novel protein interactions: proteomic methods, optimisation approaches and data analysis pipelines. Methods 95:46–54. https://doi.org/10.1016/j.ymeth.2015.08.022
55. McHugh CA, Russell P, Guttman M (2014) Methods for comprehensive experimental identification of RNA-protein interactions. Genome Biol 15:203. https://doi.org/10.1186/gb4152
56. Moore CD, Ajala OZ, Zhu H (2016) Applications in high-content functional protein microarrays. Curr Opin Chem Biol 30:21–27. https://doi.org/10.1016/j.cbpa.2015.10.013
57. Abulwerdi FA, Schneekloth JS (2016) Microarray-based technologies for the discovery of selective, RNA-binding molecules. Methods 103:188–195. https://doi.org/10.1016/j.ymeth.2016.04.022
58. Eriani G, Karam J, Jacinto J et al (2015) MIST, a novel approach to reveal hidden substrate specificity in aminoacyl-tRNA synthetases. PLoS One 10:e0130042. https://doi.org/10.1371/journal.pone.0130042
59. Tenenbaum SA, Carson CC, Lager PJ, Keene JD (2000) Identifying mRNA subsets in messenger ribonucleoprotein complexes by using cDNA arrays. Proc Natl Acad Sci U S A 97:14085–14090. https://doi.org/10.1073/pnas.97.26.14085
60. Keene JD, Komisarow JM, Friedersdorf MB (2006) RIP-Chip: the isolation and identification of mRNAs, microRNAs and protein components of ribonucleoprotein complexes from cell extracts. Nat Protoc 1:302–307. https://doi.org/10.1038/nprot.2006.47
61. Trifillis P, Day N, Kiledjian M (1999) Finding the right RNA: identification of cellular mRNA substrates for RNA-binding proteins. RNA 5:1071–1082
62. Brooks SA, Rigby WF (2000) Characterization of the mRNA ligands bound by the RNA binding protein hnRNP A2 utilizing a novel in vivo technique. Nucleic Acids Res 28:E49
63. Licatalosi DD, Mele A, Fak JJ et al (2008) HITS-CLIP yields genome-wide insights into brain alternative RNA processing. Nature 456:464–469. https://doi.org/10.1038/nature07488
64. Hafner M, Landthaler M, Burger L et al (2010) Transcriptome-wide identification of RNA-binding protein and microRNA target sites by PAR-CLIP. Cell 141:129–141. https://doi.org/10.1016/j.cell.2010.03.009
65. König J, Zarnack K, Rot G et al (2010) iCLIP reveals the function of hnRNP particles in splicing at individual nucleotide resolution. Nat Struct Mol Biol 17:909–915. https://doi.org/10.1038/nsmb.1838

66. Wang T, Xiao G, Chu Y et al (2015) Design and bioinformatics analysis of genome-wide CLIP experiments. Nucleic Acids Res 43:5263–5274. https://doi.org/10.1093/nar/gkv439
67. Bottini S, Pratella D, Grandjean V et al (2017) Recent computational developments on CLIP-seq data analysis and microRNA targeting implications. Brief Bioinformatics. https://doi.org/10.1093/bib/bbx063
68. Li X, Song J, Yi C (2014) Genome-wide mapping of cellular protein-RNA interactions enabled by chemical crosslinking. Genomics Proteomics Bioinformatics 12:72–78. https://doi.org/10.1016/j.gpb.2014.03.001
69. Van Nostrand EL, Pratt GA, Shishkin AA et al (2016) Robust transcriptome-wide discovery of RNA-binding protein binding sites with enhanced CLIP (eCLIP). Nat Methods 13:508–514. https://doi.org/10.1038/nmeth.3810
70. Puton T, Kozlowski L, Tuszynska I et al (2012) Computational methods for prediction of protein-RNA interactions. J Struct Biol 179:261–268. https://doi.org/10.1016/j.jsb.2011.10.001
71. Mann CM, Muppirala UK, Dobbs D (2017) Computational Prediction of RNA-Protein Interactions. Methods Mol Biol 1543:169–185. https://doi.org/10.1007/978-1-4939-6716-2_8
72. Liu Z-P, Chen L (2016) Prediction and dissection of protein-RNA interactions by molecular descriptors. Curr Top Med Chem 16:604–615. https://doi.org/10.2174/1568026615666150819110703
73. Wang Y, Chen X, Liu Z-P et al (2013) De novo prediction of RNA-protein interactions from sequence information. Mol BioSyst 9:133–142. https://doi.org/10.1039/c2mb25292a
74. Bailey TL, Johnson J, Grant CE, Noble WS (2015) The MEME suite. Nucleic Acids Res 43:W39–W49. https://doi.org/10.1093/nar/gkv416
75. Paz I, Kosti I, Ares M et al (2014) RBPmap: a web server for mapping binding sites of RNA-binding proteins. Nucleic Acids Res 42:W361–W367. https://doi.org/10.1093/nar/gku406
76. Agostini F, Cirillo D, Ponti RD, Tartaglia GG (2014) SeAMotE: a method for high-throughput motif discovery in nucleic acid sequences. BMC Genomics 15:925. https://doi.org/10.1186/1471-2164-15-925
77. Kazan H, Ray D, Chan ET et al (2010) RNAcontext: a new method for learning the sequence and structure binding preferences of RNA-binding proteins. PLoS Comput Biol 6:e1000832. https://doi.org/10.1371/journal.pcbi.1000832
78. Lu Q, Ren S, Lu M et al (2013) Computational prediction of associations between long non-coding RNAs and proteins. BMC Genomics 14:651. https://doi.org/10.1186/1471-2164-14-651
79. Towfic F, Caragea C, Gemperline DC et al (2010) Struct-NB: predicting protein-RNA binding sites using structural features. Int J Data Min Bioinform 4:21–43
80. Maetschke SR, Yuan Z (2009) Exploiting structural and topological information to improve prediction of RNA-protein binding sites. BMC Bioinformatics 10:341. https://doi.org/10.1186/1471-2105-10-341
81. Zhao H, Yang Y, Zhou Y (2011) Highly accurate and high-resolution function prediction of RNA binding proteins by fold recognition and binding affinity prediction. RNA Biol 8:988–996. https://doi.org/10.4161/rna.8.6.17813
82. Pérez-Cano L, Fernández-Recio J (2010) Optimal protein-RNA area, OPRA: a propensity-based method to identify RNA-binding sites on proteins. Proteins 78:25–35. https://doi.org/10.1002/prot.22527
83. Chu C, Zhang QC, da Rocha ST et al (2015) Systematic discovery of Xist RNA binding proteins. Cell 161:404–416. https://doi.org/10.1016/j.cell.2015.03.025
84. Kumar M, Gromiha MM, Raghava GPS (2011) SVM based prediction of RNA-binding proteins using binding residues and evolutionary information. J Mol Recognit 24:303–313. https://doi.org/10.1002/jmr.1061
85. Terribilini M, Sander JD, Lee J-H et al (2007) RNABindR: a server for analyzing and predicting RNA-binding sites in proteins. Nucleic Acids Res 35:W578–W584. https://doi.org/10.1093/nar/gkm294
86. Ule J, Jensen KB, Ruggiu M et al (2003) CLIP identifies Nova-regulated RNA networks in the brain. Science 302:1212–1215. https://doi.org/10.1126/science.1090095
87. Finn RD, Clements J, Arndt W et al (2015) HMMER web server: 2015 update. Nucleic Acids Res 43:W30–W38. https://doi.org/10.1093/nar/gkv397
88. Livi CM, Klus P, Delli Ponti R, Tartaglia GG (2016) catRAPID signature: identification of ribonucleoproteins and RNA-binding regions. Bioinformatics 32:773–775. https://doi.org/10.1093/bioinformatics/btv629

INDEX A Affinity purification, see Chromatography Agarose gels.................................20, 140, 168, 176, 179, 187, 231, 236, 238, 240, 241, 253, 263, 284, 306, 307, 309, 310, 325–327, 430, 450, 497, 508, 522 Aggregation .................................44, 192, 257, 301, 302, 336, 339, 344, 346–350, 352, 353, 355–358, 374, 375, 387, 393, 398, 413, 419, 424, 479 Auto-induction.................... 57, 170, 181, 447, 492, 501 Automation, see Robotic systems Avi tag, see Fusion tag

B Bac-to-Bac system ....................................... 196, 385, 386 Baculovirus expression vector system (BEVS) .........52, 53, 56, 57, 61, 62, 195, 215 Benzonase.......................................................66, 303, 394 Bioinformatics tools ............................167, 170, 444, 526 Biotin ...........................................79, 262, 277, 406, 420, 449, 468, 469, 472, 523 Biotinylation ................................................ 406, 414, 416 BL21 (DE3) and derivatives, see Strains Buffer exchange......................................95, 99, 112, 130, 136, 144, 157, 249, 396, 400

C Capillary electrophoresis............139, 448, 453, 471, 475 Cell free................................................ 261–277, 403–420 Cell lysis ........................14, 72, 172, 245, 312, 371, 398, 408, 447, 448, 493 Cell(s) bacterial (see Strains, E. coli) Chinese hamster ovary (CHO) ............57, 62, 93–96, 99, 108–110, 113, 114, 117, 123–127, 139 human embryonic kidney (HEK) ... 9, 39, 42, 93–95, 139, 369, 373 insect high five .................................................... 196, 368 Sf9/Sf21 ...........................................192, 196, 219 yeast ..........................................................74, 255, 423 Chaperones............................ 4, 71, 75, 76, 87, 230, 262 Chromatography affinity .........44, 72, 86, 99, 130, 414, 453, 497, 522 fluorescent size exclusion (FSEC) .............. 77, 80, 87, 361–387, 391–393, 395–397, 400 hydrophobic interaction ................................ 150, 338 immobilised metal affinity chromatography (IMAC) ............... 81, 84, 192, 196, 197, 199, 248, 249, 256, 303, 382, 442, 491 ion exchange............................................................ 159 reverse phase............................................................ 172 size exclusion (SEC) ....................145, 150, 158, 192, 193, 198, 228, 247, 250, 251, 336, 361–387 Cloning system Gateway® ............................ 14–17, 26, 416, 442, 453 In-Fusion® ............................................................. 9–13 ligation independent cloning (LIC)............. 6–10, 35, 46, 173, 174, 176, 195, 215, 220, 283, 285–286, 291–294 sequence ligation independent cloning (SLIC) ................................................... 8, 9, 27 Codon bias .................................................................... 300 Co-expression.......................................... 4, 192, 200, 223 Colony PCR ........................................231, 239, 240, 285 Competent cells....................................97, 169, 178, 179, 188, 218–220, 232, 242, 253, 285, 323, 328, 406, 431, 442, 451, 455, 498 Critical micelle concentration (CMC) ................ 387, 399 Cryo-electron microscopy (cryo-EM) ................. 33, 336, 362, 527 Crystallization ...........................4, 33, 44, 193, 199, 228, 229, 301, 321, 349, 362, 369, 371, 483 Crystallography .................................................... 363, 527

D Deep well 24 (DW24) .........................57, 133, 168, 169, 171, 178, 181–183, 185, 187, 189, 443, 455, 456, 458, 492, 500, 501, 503, 504, 514 Deep well 96 (DW96) .................. 63, 79, 108, 113, 120, 136, 137, 286, 292, 293, 304, 325, 428, 436, 493, 497 Detergents .......................... 77, 228, 233, 243–245, 248, 249, 251, 255, 256, 340, 346, 347, 362, 368, 371, 375, 381, 382, 384, 385, 387, 390–400, 408, 434, 479, 510

Renaud Vincentelli (ed.), High-Throughput Protein Production and Purification: Methods and Protocols, Methods in Molecular Biology, vol. 2025, https://doi.org/10.1007/978-1-4939-9624-7, © Springer Science+Business Media, LLC, part of Springer Nature 2019


Dialysis .............................. 157, 192, 193, 197, 198, 341, 406, 416, 420 Differential scanning fluorimetry (DSF) ....................302, 314, 349, 350, 355 Directed evolution ................................................. 77, 321 Disulfide bonds ..............................................94, 144, 145 Dot-blot screen .................................................... 272, 273 DsbA, see Fusion tag DsbC, see Fusion tag

E Electron microscopy (EM) ........................ 214, 223, 228, 477–484 Enzyme-linked immunosorbent assay (ELISA)........39, 76, 168, 169, 204–209, 224 Expression optimization ............................................... 379 Expression screening bacteria..................................................................... 198 cell free............................................................ 403–420 mammalian cell.......................................................... 42 yeast ..................................................................... 71–89 Expression vectors, see Vectors

F Fluorescence size exclusion chromatography (FSEC) .......... 77, 80, 87, 361–387, 391–393, 395–397, 400 Fluorescent protein(s) enhanced green (eGFP)....................... 363, 364, 369, 371, 374, 382, 383, 385, 387, 404, 409–412, 414, 419 Green (GFP)................. 38, 121, 140, 247, 255, 321, 325, 369, 374, 375, 379, 380, 390, 391, 404, 406, 413, 414, 416, 419, 424, 428, 433 split-GFP ....................................... 321–332, 423–437 Functional assays .................................................. 158, 391 Fusion partner (Fusion tag) DsbC..............................................166, 168, 173, 185 glutathione-S-transferase (GST) .............................. 53 hexahistidine (His6).......................... 38, 86, 391, 400 maltose binding protein (MBP) ...............41, 53, 196, 230, 463, 467, 497 N-utilizing substance A (NusA)...................... 41, 497 small ubiquitin-like modifier (SUMO) ........... 41, 196 thioredoxin (TRX) .........41, 453, 462–464, 467, 497

G Gateway® system, see Cloning systems Gel electrophoresis, see Sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) Gel filtration, see SEC, chromatography Glutathione-S-transferase (GST), see Fusion partner


Glycerol stocks ............................82, 140, 181, 189, 293, 312, 435, 447, 456, 458, 501, 514 Glycosylation .................75, 94, 281, 335, 336, 346, 358 G-protein coupled receptors (GPCR) ....... 261, 270, 277

H Hexahistidine (His6), see Fusion partner High performance liquid chromatography (HPLC)............................ 150, 158, 160, 187, 342, 363, 365, 366, 372, 373 Homologous recombination .....9, 20, 21, 23, 35, 74, 78 Human embryonic kidney (HEK), see Cells Hydrophobic .............................191, 192, 208, 230, 256, 261, 300, 338, 339, 343, 346, 349, 350, 354, 390, 479

I Imidazole ............................ 44, 55, 56, 80, 84, 111, 131, 133, 170, 171, 183, 184, 192, 197, 199, 200, 230, 234, 249, 256, 376, 394, 396, 398, 400, 420, 447, 449, 492, 493, 503 Immobilized metal affinity chromatography (IMAC), see Chromatography Inclusion bodies ......................... 144–146, 151, 356, 434 Induction ......................57, 71, 73, 80, 82, 85, 208, 216, 246, 270, 300, 301, 311, 312, 414, 425, 429, 432, 433, 437, 456 In-Fusion, see Cloning system Insect cells, see Cells Insoluble proteins ................................................ 300, 331 Integral membrane protein (IMP).................53, 66, 228, 229, 243, 248, 255, 256 Interaction networks................................... 390, 403, 439 Ion exchange, see Chromatography Isopropyl β-D-1-thiogalactopyranoside (IPTG) .......... 57, 204–206, 208, 219–221, 270, 288, 303, 311, 315, 322, 324, 329, 331, 367, 414, 425, 426, 429, 432–434

K Kinases .............................. 20, 38, 71, 72, 191–201, 216, 303, 305, 323

L LC/MS, see Mass spectrometry; Liquid chromatography Lemo21(DE3), see Strains Library construction .................................................77, 86 Ligation independent cloning (LIC), see Cloning techniques Lysis ..................... 14, 53–55, 58, 60, 66, 72, 78–80, 83, 84, 113, 170, 172, 182, 189, 192, 196, 197, 233, 243, 245–247, 254, 301–304, 311, 312, 371, 380, 394, 395, 398, 408, 435, 447, 448, 458, 460, 461, 463, 465, 470, 474, 492, 501, 514 Lysozyme.......................... 170, 182, 189, 303, 349, 447, 464, 470, 472, 474, 492, 501, 514

M Maltose binding protein (MBP), see Fusion partner Mammalian cells..................... 35, 42, 43, 46, 58, 66, 75, 143, 193, 390 Mass spectrometry (MS)........................43, 46, 139, 186, 336, 467, 522, 523 Medium, culture LB agar ....................... 169, 176, 178, 179, 188, 232, 285, 292, 303, 311, 313, 324, 328, 329, 331, 426, 430, 432, 442, 443, 451, 490, 491, 497, 498, 500, 510 LB-Miller broth............................169, 285, 286, 311, 324, 330, 426, 442, 490 Luria-Bertani (LB) .................................................. 232 SOC ............................ 169, 178, 181, 219, 231, 239, 241, 253, 286, 292, 303, 311, 324, 431, 442, 451, 498, 500 terrific broth (TB) .......................................... 404, 406 Melting temperature (Tm) .............................18, 47, 302, 312, 347, 381 Membrane proteins.............................. 53, 55–62, 66, 72, 75–78, 87, 227–257, 261, 269, 271, 272, 274, 301, 336, 337, 340, 346, 361–387, 389–401 Micelles ..................... 248, 256, 336, 340, 346, 347, 399 Miniprep ............................. 97, 113, 123, 140, 232, 240, 242, 295, 310, 324, 327, 367, 427, 431, 491, 497, 499 Misfolded proteins ....................................................77, 87 MOI, see Multiplicity of infection (MOI) Molecular interactions ......................................... 265, 274 Monodisperse ..............................44, 343, 374, 376, 379, 380, 386, 390, 397, 400 MultiBac ............................................................... 213–225 Multi-channel pipettes (MCPs) ...............................79, 84 Multidomain proteins ..................................................... 94 Multiplicity of infection (MOI) ...................57, 378–380, 386, 387, 447, 455 Multiprotein complexes ................................................ 215 Mutagenesis...........................15, 26, 218, 281–295, 302, 303, 311–313, 366, 376 Mutations .................. 7, 9, 15, 20, 27, 76, 86, 170, 188, 204, 216, 218, 231, 255, 281, 282, 286, 287, 292, 293, 295, 301, 302, 305, 323, 353, 369, 382, 444, 495, 497, 520

N NanoDrop ............................ 56, 57, 106, 109, 114, 123, 158, 323, 328, 329, 365, 406, 416, 427 Nickel affinity purification, see Chromatography Nuclear magnetic resonance (NMR) spectroscopy................................................. 527 NusA, see Fusion partner

O Oligomeric state ..................................335, 336, 347, 358 Oligomerization ................................................... 373, 375 Oligonucleotides (Oligos) ......................... 167, 174, 187, 207, 262, 337, 342, 405, 441, 450, 488, 493, 495, 496, 503, 507, 508, 514, 515 Open reading frames (ORF) ......... 16, 34, 322–330, 414 Optical density (OD) ........................ 182, 189, 243, 246, 249, 251, 257, 270, 304, 448, 460, 501, 514 Origin of replication ..................................................... 198

P PDB, see Protein Data Bank (PDB) Periplasm ...............................................38, 166, 169, 315 Periplasmic expression .................................................... 38 Phosphorylation ............................ 21–23, 191, 192, 200, 201, 281, 305, 327, 331 Polyacrylamide gel electrophoresis (PAGE), see Sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) Polyethyleneimine (PEI) ................................................ 57 Polymerase chain reaction (PCR) ............... 4, 34, 55, 79, 85, 109, 167, 231, 262, 281, 300, 323, 364, 366, 412, 442, 488 Posttranslational modifications (PTMs) ........................ 71 Primers, see Oligonucleotides Promoters ......................... 4, 7, 9, 17, 19, 26, 35, 72, 75, 78, 80, 85–87, 174, 188, 204, 208, 220, 229, 230, 232, 237, 255, 267, 268, 288, 300, 363, 364, 367, 369, 371, 426 Protease ........................7, 17, 37, 38, 43, 55, 72, 80–82, 84, 86, 87, 167, 172, 179, 183, 185, 189, 193, 196, 198, 200, 233, 234, 237, 243, 249, 263, 270, 302, 304, 312, 314, 365, 391, 394, 399, 405, 409, 441, 450 Protease inhibitors ............................ 55, 81, 82, 84, 233, 243, 263, 270, 302, 304, 365, 394, 405, 409 Protease TEV, see Tobacco etch virus (TEV) Protein ........................................................................... 256 complexes .............................. 71, 215, 403, 424, 478, 479, 482, 522, 527 disorder prediction..............................................34, 35 engineering .............................................................. 281 expression ................... 3, 7–9, 15, 16, 19, 26, 27, 40, 52, 70–72, 76, 77, 85, 86, 93–95, 139, 166, 173, 197, 199, 215, 229, 230, 232, 243, 245, 252, 254, 255, 271, 303, 311, 323, 362, 368, 373, 378, 379, 382, 386, 403, 435, 477, 479


Protein (cont.) extraction .........................................79, 371, 428, 434 folding........................ 46, 76, 87, 143–160, 227, 385 interaction(s) protein–ligand ................................................... 449 protein–nucleic acids......................................... 527 protein–protein ........................ 69, 321, 403, 415, 423–437, 439, 479 labeling .................................................................... 237 phosphorylation ............................192, 200, 201, 281 purification .............................. 43, 44, 46, 57–58, 61, 95, 99, 111, 129, 130, 132–135, 150, 157, 158, 160, 173, 179, 343, 362, 393, 442, 479, 491, 510 refolding .................................................................. 436 secondary structure ................................................. 194 secretion.......................................................... 7, 74, 86 signal sequence ...........................................38, 42, 168 solubility ...................... 35, 44, 72, 87, 243, 299, 387 structure...................4, 35, 52, 71, 93, 193, 362, 363 Protein Data Bank (PDB) ....................... 34, 35, 76, 214, 332, 362, 446, 520 Protein of interest (POI) ...................243, 246, 248–250, 256, 257, 321, 322, 363, 364, 366–387 Protein production (expression) in E. coli ...............170, 214, 215, 300, 442, 453–466 in insect cells............................................................ 143 in mammalian cells ................................... 93–141, 143 Proteolysis, in vivo ........................................................ 361 Proteolytic degradation ..................................... 72, 85, 87 Proteomics........................................................69, 93, 215 PTMs, see Posttranslational modifications (PTMs)

Q Quality control ..........................140, 166, 167, 170, 179, 186, 406, 409, 440, 448, 460–466 Quantitation ...............................112, 137, 150, 160, 166

R Random library ................... 76, 303, 305, 493, 503, 507 Random mutants............................................................. 76 Rare codons .......................................................... 174, 300 Refolding ....................................................................... 436 Restriction enzymes ...................... 3–6, 14, 80, 220, 231, 251, 323, 324, 327, 328, 330, 331, 376, 425, 430, 431 Robotic system ......................................95, 181, 187, 501

S SDS-PAGE, see Sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) Seleno-methionine (SeMet) ........................................... 43 SELEX, see Systematic Evolution of Ligands by Exponential Enrichment (SELEX)


Sequence and ligation independent cloning (SLIC) ...........................................8, 9, 27, 220 Site-directed mutagenesis ............ 15, 281–295, 366, 376 Size exclusion chromatography (SEC), see Chromatography Small molecules ................................. 52, 69, 76, 77, 215, 379–381, 391, 401 Small ubiquitin-like modifier (SUMO), see Fusion partner Sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) ....................... 38, 40, 41, 44, 46, 53, 56, 57, 61–63, 66, 72, 76, 80, 85, 144, 160, 172, 189, 197–200, 234, 244, 245, 248–251, 255, 256, 273, 314, 336, 366, 373–376, 378–380, 382, 384, 385, 394–396, 405, 410–412, 414, 416, 417, 467, 475, 493, 500, 504, 525 Sonication ....................................66, 158, 184, 196, 255, 269, 325, 327, 371, 398, 434, 435, 504 Split GFP, see Fluorescent protein(s) Stoichiometry ................................................................ 477 Stop codon ........................ 208, 252, 267, 287, 322, 369 Strain engineering ........................................................... 78 Strains, E. coli BL21 (DE3) CodonPlus(DE3)-RIL ....................... 266 BL21(DE3) ..................................170, 179, 192, 196, 199, 200, 266, 324, 327, 328, 426, 428, 430, 431, 433, 442, 455 BL21(DE3) pLysS ................................ 179, 442, 455 BL21(DE3) pRARE ............................................... 196 DH5α..........169, 178, 232, 239, 442, 451, 490, 498 Mach™ .................................................. 195, 285, 292 Rosetta (DE3) pLysS ...................................8, 492, 498 TOP10............................................................ 218, 224 Streptavidin..... 207, 208, 406, 416, 417, 449, 468, 469, 523, 526 Structures................................ 4, 35, 52, 70, 71, 76, 143, 144, 193, 194, 199, 213, 214, 218, 228, 229, 267, 300, 335, 362, 363, 369, 389–392, 477, 488, 520–523, 526, 527 Superfolder-GFP, see Fusion tag Synchrotron................................................................... 362 Synthetic biology ................................................. 9, 26, 77 Systematic Evolution of Ligands by Exponential Enrichment (SELEX) ......................... 487–515

T Tandem affinity purification (TAP)................................ 86 Temperature control ..................160, 192, 270, 304, 395 T4 DNA ligase ......... 218, 231, 303, 305, 324, 426, 431 T4 DNA polymerase ....................... 6, 8, 9, 28, 238, 239, 285, 292, 295, 323 Thermal stability .........................46, 299, 301, 302, 312, 349, 350, 354, 355 Thioredoxin (Trx), see Fusion tag

Tobacco etch virus protease (TEV) ....................... 79, 87, 167, 168, 172, 179, 183, 185, 193, 196–198, 200, 302, 312, 314, 441, 450 Toxins ................................................................... 165–190 Transcription .........................7–9, 72, 87, 230, 267, 409, 410, 412, 419, 420, 423, 487–515, 523, 527 Transfection ........................ 39, 94, 95, 98, 99, 106, 108, 110, 112–115, 118–121, 123–126, 128–130, 140, 193, 219, 221, 364, 367, 369–371, 373, 377, 383, 386 Transformation............................8, 9, 14, 18, 20, 21, 23, 25, 27, 74, 79, 81, 82, 167–170, 173, 176, 178, 179, 181, 182, 188, 220, 221, 231, 232, 236, 239, 241, 242, 251, 253, 254, 292, 303, 311, 323, 328, 329, 431, 436, 442, 443, 450–453, 455, 456, 458, 492, 497, 498, 500–502, 510 Transient .......................................................................... 57 Transient transfection ............ 39, 57, 94, 106, 115, 364, 369–371 Translation, cell-free ..................................................... 267 Transmembrane domains ............................................. 269 Truncations ........................... 26, 86, 300, 301, 322, 363 T7 expression system .................................................... 200

U Uniprot .............................................................34, 35, 281

V Venom peptides.................................................... 166, 173 Virus.................................... 9, 14, 15, 35, 193, 213–225, 368, 377, 378, 380, 386

W Web server ....................................................................... 19 Western blotting................................ 39, 42, 72, 85, 209, 224, 234, 243–245, 264–265, 272–274, 276, 277, 395, 523

X X-ray crystallography .................................................... 527