Bionanocomposites : integrating biological processes for bioinspired nanotechnologies [First edition] 9781118942239, 111894223X, 9781118942246, 1118942248


English Pages [376] Year 2018


Table of contents :
Content: What Are Bionanocomposites? / Agathe Urvoas, Marie Valerio-Lepiniec, Philippe Minard, Cordt Zollfrank --
Molecular Architecture of Living Matter. Nucleic Acids / Enora Prado, Mónika Ádok-Sipiczki, Corinne Nardin --
Lipids / Carole Aimé, Thibaud Coradin --
Carbohydrates / Mirjam Czjzek --
Proteins / Stéphane Romero, François-Xavier Campbell-Valois --
Functional Biomolecular Engineering. Nucleic Acid Engineering / Enora Prado, Mónika Ádok-Sipiczki, Corinne Nardin --
Protein Engineering / Agathe Urvoas, Marie Valerio-Lepiniec, Philippe Minard --
The Composite Approach. Inorganic Nanoparticles / Carole Aimé, Thibaud Coradin --
Hybrid Particles / Nikola Ž Knežević, Laurence Raehm, Jean-Olivier Durand --
Biocomposites from Nanoparticles / Carole Aimé, Thibaud Coradin --
Applications. Optical Properties / Cordt Zollfrank, Daniel Van Opdenbosch --
Magnetic Bionanocomposites / Wei Li, Yuehan Wu, Xiaogang Luo, Shilin Liu --
Mechanical Properties of Natural Biopolymer Nanocomposites / Biqiong Chen --
Bionanocomposite Materials for Biocatalytic Applications / Sarah Christoph, Francisco M Fernandes --
Nanocomposite Biomaterials / Gisela Solange Alvarez, Martín Federico Desimone --
A Combination of Characterization Techniques / Carole Aimé, Thibaud Coradin.


Bionanocomposites Integrating Biological Processes for Bioinspired Nanotechnologies

Edited by Carole Aimé and Thibaud Coradin

Sorbonne Universités, UPMC Univ Paris 06, Collège de France, UMR CNRS 7574, Laboratoire de Chimie de la Matière Condensée de Paris, Paris, France

This edition first published 2017 © 2017 John Wiley & Sons, Inc. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions. The right of Carole Aimé and Thibaud Coradin to be identified as the editors of this work has been asserted in accordance with law. Registered Office John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA Editorial Office 111 River Street, Hoboken, NJ 07030, USA For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com. Wiley also publishes its books in a variety of electronic formats and by print‐on‐demand. Some content that appears in standard print versions of this book may not be available in other formats. Limit of Liability/Disclaimer of Warranty In view of ongoing research, equipment modifications, changes in governmental regulations, and the constant flow of information relating to the use of experimental reagents, equipment, and devices, the reader is urged to review and evaluate the information provided in the package insert or instructions for each chemical, piece of equipment, reagent, or device for, among other things, any changes in the instructions or indication of usage and for added warnings and precautions. While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. 
No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages. Library of Congress Cataloging‐in‐Publication Data Names: Aimé, Carole, 1981– editor. | Coradin, Thibaud, editor. Title: Bionanocomposites : integrating biological processes for bioinspired nanotechnologies / edited by Carole Aimé, Centre National de la Recherche Scientifique, Paris, France, Thibaud Coradin, Centre National de la Recherche Scientifique, Paris, France. Description: First edition. | Hoboken, NJ, USA : John Wiley & Sons, Inc., 2017. | Includes bibliographical references and index. Identifiers: LCCN 2017009006 (print) | LCCN 2017009534 (ebook) | ISBN 9781118942222 (cloth) | ISBN 9781118942253 (pdf ) | ISBN 9781118942239 (epub) Subjects: LCSH: Nanobiotechnology. | Nanocomposites (Materials) | Biomimetic materials. 
Classification: LCC TP248.25.N35 B564 2018 (print) | LCC TP248.25.N35 (ebook) | DDC 620/.5–dc23 LC record available at https://lccn.loc.gov/2017009006 Cover image: © Pobytov/Getty Images Cover design: Wiley Set in 10/12pt Warnock by SPi Global, Pondicherry, India Printed in the United States of America 10 9 8 7 6 5 4 3 2 1


Contents

List of Contributors

1 What Are Bionanocomposites?
Agathe Urvoas, Marie Valerio‐Lepiniec, Philippe Minard and Cordt Zollfrank
1.1 Introduction
1.2 A Molecular Perspective: Why Biological Macromolecules?
1.3 Challenges for Bionanocomposites
References

2 Molecular Architecture of Living Matter

2.1 Nucleic Acids
Enora Prado, Mónika Ádok‐Sipiczki and Corinne Nardin
2.1.1 Introduction: A Bit of History
2.1.2 Definition and Structure
2.1.2.1 Nomenclature
2.1.2.2 Structure
2.1.3 DNA and RNA Functions
2.1.3.1 Introduction
2.1.3.2 Transcription–Translation Process
2.1.3.3 Replication Process
2.1.4 Specific Secondary Structures
2.1.4.1 Watson–Crick H‐Bonds
2.1.4.1.1 Stem‐Loop
2.1.4.1.2 Kissing Complex
2.1.4.2 Other Kinds of H‐Bonding
2.1.4.2.1 G‐Quartets
2.1.4.2.2 i‐Motifs
2.1.5 Stability
2.1.6 Conclusion
References


2.2 Lipids
Carole Aimé and Thibaud Coradin
2.2.1 Lipids Self‐Assembly
2.2.2 Structural Diversity of Lipids
2.2.2.1 Fatty Acyls (FA)
2.2.2.2 Glycerolipids (GL)
2.2.2.3 Glycerophospholipids (GP)
2.2.2.4 Sphingolipids (SP)
2.2.2.5 Sterol Lipids (ST)
2.2.2.6 Prenol Lipids (PR)
2.2.2.7 Saccharolipids (SL)
2.2.2.8 Polyketides (PK)
2.2.3 Lipid Synthesis and Distribution
2.2.4 The Diversity of Lipid Functions
2.2.4.1 Cellular Architecture
2.2.4.2 Lipid Rafts
2.2.4.3 Energy Storage
2.2.4.4 Regulating Membrane Proteins by Protein–Lipid Interactions
2.2.4.5 Signaling Functions
2.2.5 Lipidomics
References

2.3 Carbohydrates
Mirjam Czjzek
2.3.1 Introduction
2.3.2 Monosaccharides
2.3.3 Oligosaccharides
2.3.3.1 Disaccharides
2.3.3.2 Protein Glycosylations
2.3.4 Polysaccharides
2.3.4.1 Cellulose
2.3.4.2 Hemicelluloses
2.3.4.2.1 Xyloglucan
2.3.4.2.2 Xylan
2.3.4.2.3 Mannan or Glucomannan
2.3.4.2.4 Mixed‐Linkage Glucan (MLG)
2.3.4.3 Pectins
2.3.4.4 Chitin
2.3.4.5 Alginate
2.3.4.6 Marine Galactans
2.3.4.7 Storage Polysaccharides: Starch, Glycogen, and Laminarin
References


2.4 Proteins: From Chemical Properties to Cellular Function: A Practical Review of Actin Dynamics
Stéphane Romero and François‐Xavier Campbell‐Valois
2.4.1 Introduction
2.4.2 Molecular Architecture of Proteins
2.4.2.1 Amino Acids
2.4.2.2 Peptide Bond
2.4.2.3 Primary Structure
2.4.3 Protein Folding
2.4.3.1 Peptide and Protein: Secondary Structure
2.4.3.2 3D Folding: Tertiary Structure
2.4.3.3 Quaternary Structure
2.4.3.4 Protein Folding and De Novo Design
2.4.4 Interacting Proteins for Cellular Functions
2.4.4.1 Protein Interactions
2.4.4.2 Enzymatic Activity of Proteins
2.4.4.3 Molecular Motors
2.4.5 Self‐Assembly and Auto‐Organization: Regulation of the Actin Cytoskeleton Assembly
2.4.5.1 Origin of the Actin Treadmilling
2.4.5.2 Regulation of Actin Treadmilling
2.4.5.3 Arp2/3 and Formin‐Initiated Actin Assembly to Generate Mechanical Forces
2.4.5.4 Self‐Organization Properties and Force Generation Understood Using In Vitro Reconstituted Actin‐Based Nanomovements
2.4.5.5 Applications in Bionanotechnologies
2.4.6 Conclusion
References

3 Functional Biomolecular Engineering

3.1 Nucleic Acid Engineering
Enora Prado, Mónika Ádok‐Sipiczki and Corinne Nardin
3.1.1 Introduction
3.1.2 How to Synthetically Produce Nucleic Acids?
3.1.2.1 The Chemical Approach
3.1.2.2 Polymerase Chain Reaction
3.1.2.3 Combinatorial Synthesis of Oligonucleotides and Gene Libraries: Aptamers
3.1.3 Secondary Structures in Nanotechnologies
3.1.3.1 Watson–Crick H‐Bonds


3.1.3.1.1 Stem‐Loop
3.1.3.1.2 Kissing Complex
3.1.3.2 Other Kind of H‐Bonding
3.1.3.2.1 G‐Quartets
3.1.3.2.2 Origami: Nano‐architecture on Surface
3.1.4 Conclusion
References

3.2 Protein Engineering
Agathe Urvoas, Marie Valerio‐Lepiniec and Philippe Minard
3.2.1 Synthesis of Polypeptides: Chemical or Biological Approach?
3.2.2 Proteins: From Natural to Artificial Sources
3.2.2.1 How to Get the Coding Sequence of the Protein of Interest?
3.2.2.2 E. coli: A Cheap “Protein Factory” with a Diversified Tool Box
3.2.2.3 Common Expression Plasmids
3.2.2.4 Limits of Recombinant Protein Expression in E. coli
3.2.2.5 Some Solutions Are Available to Solve these Expression Problems
3.2.3 Proteins: A Large Repertoire of Functional Objects
3.2.3.1 Looking for Natural Proteins with Desired Function
3.2.3.2 From Protein Engineering to Protein Design
3.2.3.2.1 Modified Proteins Are Often Destabilized
3.2.3.2.2 Natural or Engineered Proteins: From Small Step to Giant Leap in Sequence Space
3.2.3.2.3 Computational Protein Design
3.2.3.2.4 Directed Evolution: A Diverse Repertoire Combined with a Selection Process
3.2.3.3 Combining Chemistry with Biological Objects
3.2.3.3.1 Labeling Natural Amino Acids
3.2.3.3.2 Bioorthogonal Labeling
3.2.3.3.3 Tag‐Mediated Labeling and Enzymatic Coupling
3.2.3.3.4 Enzyme‐Mediated Ligation
3.2.3.3.5 Quality Control of Labeled Biomolecules
References

4 The Composite Approach

4.1 Inorganic Nanoparticles
Carole Aimé and Thibaud Coradin
4.1.1 Introduction
4.1.2 Overview of Inorganic Nanoparticles
4.1.3 Synthesis of Inorganic Nanoparticles


4.1.3.1 Basic Principles
4.1.3.2 Nanoparticles from Solutions
4.1.3.2.1 Ionic Solids
4.1.3.2.2 Metals
4.1.3.2.3 Metal Oxides
4.1.3.2.4 Morphological Control
4.1.4 Some Specific Properties of Inorganic Nanoparticles
4.1.5 Concluding Remarks
References

4.2 Hybrid Particles: Conjugation of Biomolecules to Nanomaterials
Nikola Ž. Knežević, Laurence Raehm and Jean‐Olivier Durand
4.2.1 General Considerations
4.2.2 Functionalization of Nanoparticle Surface
4.2.2.1 Functionalization of Hydroxylated Surfaces
4.2.2.2 Functionalization of Hydride‐Containing Surfaces
4.2.2.3 Functionalization of Metal‐Containing Nanoparticles
4.2.2.4 Functionalization of Carbon‐Based Nanomaterials
4.2.3 Linker‐Mediated Conjugation of Biomolecules to Nanoparticles
4.2.3.1 Conjugation through Carbodiimide Chemistry
4.2.3.2 Carbamate, Urea, and Thiourea Linkage
4.2.3.3 Schiff Base Linkage
4.2.3.4 Multicomponent Linkage Formation
4.2.3.5 Biofunctionalization through Alkylation
4.2.3.6 Bioorthogonal Linkage Formation
4.2.3.7 Conjugation through Host–Guest Interactions
4.2.3.8 Linkage through Metal Coordination
4.2.3.9 Ligation through Complementary Base Pairing
4.2.3.10 Electrostatic Interactions
4.2.4 Conclusions
Acknowledgments
References

4.3 Biocomposites from Nanoparticles: From 1D to 3D Assemblies
Carole Aimé and Thibaud Coradin
4.3.1 General Considerations
4.3.2 One‐Dimensional Bionanocomposites
4.3.3 Two‐Dimensional Organization of Nanoparticles
4.3.4 Three‐Dimensional Organization of Particles
4.3.5 Conclusion and Perspectives
References


5 Applications

5.1 Optical Properties
Cordt Zollfrank and Daniel Van Opdenbosch
5.1.1 Introduction
5.1.2 Interactions of Light with Matter
5.1.3 Optics at the Nanoscale
5.1.3.1 Nanoscale Optical Processes
5.1.3.2 Nanoscale Confinement of Matter
5.1.3.3 Nanoscale Confinement of Radiations
5.1.4 Optical Properties of Bionanocomposites
5.1.4.1 Absorption Properties of Bionanocomposites
5.1.4.2 Emission Properties of Bionanocomposites
5.1.4.3 Structural Colors with Bionanocomposites
5.1.5 Conclusions
References

5.2 Magnetic Bionanocomposites: Current Trends, Scopes, and Applications
Wei Li, Yuehan Wu, Xiaogang Luo and Shilin Liu
5.2.1 Introduction
5.2.2 Construction Strategies for Magnetic Biocomposites
5.2.2.1 The Blending Method
5.2.2.2 In Situ Synthesis Method
5.2.2.3 Grafting‐onto Method
5.2.3 Applications of Magnetic Biocomposites
5.2.3.1 Environmental Applications
5.2.3.1.1 Removal of Toxic Metal Ions
5.2.3.1.2 Removal of Dyes
5.2.3.1.3 Biocatalysis and Bioremediation
5.2.3.2 Biomedical Applications
5.2.3.2.1 Magnetic Resonance Imaging (MRI)
5.2.3.2.2 Cellular Therapy and Labeling
5.2.3.2.3 Tissue Engineering Applications
5.2.3.2.4 Drug Delivery
5.2.3.2.5 Tissue Regeneration
5.2.3.3 Biotechnological and Bioengineering Applications
5.2.3.3.1 Biosensing
5.2.3.3.2 Magnetically Responsive Films
5.2.4 Concluding Remarks and Future Trends
Acknowledgments
References


5.3 Mechanical Properties of Natural Biopolymer Nanocomposites
Biqiong Chen
5.3.1 Introduction
5.3.2 Overview of Mechanical Properties of Polymer Nanocomposites and Their Measurement Methods
5.3.3 Solid Biopolymer Nanocomposites
5.3.4 Porous Biopolymer Nanocomposites
5.3.5 Biopolymer Nanocomposite Hydrogels
5.3.6 Conclusions
References

5.4 Bionanocomposite Materials for Biocatalytic Applications
Sarah Christoph and Francisco M. Fernandes
5.4.1 Bionanocomposites and Biocatalysis
5.4.2 Form and Function in Bionanocomposite Materials for Biocatalysis
5.4.2.1 Bionanocomposites Structure
5.4.2.1.1 Biopolymers
5.4.2.1.2 The Inorganic Fraction
5.4.2.2 Key Biocatalysts
5.4.2.2.1 Nucleotides and Amino Acids
5.4.2.2.2 Enzymes
5.4.2.2.3 Whole Cells
5.4.3 Applications
5.4.3.1 Biosynthesis
5.4.3.2 Sensing Applications
5.4.3.3 Environmental Applications
5.4.3.4 Energy Applications of Biocatalytic Bionanocomposites
5.4.4 Conclusions and Perspectives
References

5.5 Nanocomposite Biomaterials
Gisela Solange Alvarez and Martín Federico Desimone
5.5.1 Introduction
5.5.2 Natural Nanocomposites
5.5.2.1 Cellulosic Materials
5.5.2.2 Chitosan
5.5.2.3 Alginate
5.5.2.4 Collagen
5.5.2.5 Gelatin


5.5.2.6 Silk Fibroin
5.5.3 Synthetic Nanocomposites
5.5.3.1 PLLA and PLGA
5.5.3.2 Polyethylene Glycol
5.5.3.3 Methacrylate
5.5.3.4 Polyvinyl Alcohol
5.5.3.5 Polyurethanes
5.5.4 Conclusions
Acknowledgments
References

6 A Combination of Characterization Techniques
Carole Aimé and Thibaud Coradin
6.1 Introductory Remarks
6.2 Chemical Analyses
6.2.1 Inductively Coupled Plasma
6.2.2 Infrared Spectroscopy
6.2.3 X‐Ray Photoelectron Spectroscopy and Auger Electron Spectroscopy
6.2.4 Energy–Dispersive X‐Ray Spectroscopy and Electron–Energy Loss Spectroscopy
6.3 Determining Size and Structure
6.3.1 Imaging
6.3.1.1 Electron Microscopy
6.3.1.2 Atomic Force Microscopy
6.3.2 Scattering Techniques
6.3.2.1 Small Angle Scattering
6.3.2.2 Dynamic Light Scattering and Zetametry
6.3.3 Monitoring Particle–Biomolecule Interactions
6.3.3.1 Electrophoresis
6.3.3.2 Circular Dichroism Spectroscopy
6.3.3.3 Isothermal Titration Calorimetry and Surface Plasmon Resonance
6.4 Materials Properties
6.4.1 Optical Properties
6.4.2 Mechanical Testing
6.4.2.1 Rheology
6.4.2.2 Compression Tests
6.4.2.3 Tensile Tests
6.4.2.4 Relaxation Tests
6.4.2.5 Dynamic Mechanical Analysis
6.4.2.6 Indentation


6.4.2.7 Mechanical Testing of Hydrogels
6.4.3 Magnetic Measurements
6.4.4 Biological Properties
References

Index


List of Contributors

Mónika Ádok‐Sipiczki
Department of Inorganic and Analytical Chemistry, University of Geneva, Geneva, Switzerland

Carole Aimé
Sorbonne Universités, UPMC Univ Paris 06, Collège de France, UMR CNRS 7574, Laboratoire de Chimie de la Matière Condensée de Paris, Paris, France

Gisela Solange Alvarez
Universidad de Buenos Aires. Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET). Instituto de la Química y Metabolismo del Fármaco (IQUIMEFA). Facultad de Farmacia y Bioquímica. Buenos Aires, Argentina

François‐Xavier Campbell‐Valois
Département de Chimie et Sciences Biomoléculaires, Université d’Ottawa, Ottawa, Ontario, Canada

Biqiong Chen
Department of Materials Science and Engineering, University of Sheffield, Sheffield, UK

Sarah Christoph
Sorbonne Universités, UPMC Univ Paris 06, Collège de France, UMR CNRS 7574, Laboratoire de Chimie de la Matière Condensée de Paris, Paris, France

Thibaud Coradin
Sorbonne Universités, UPMC Univ Paris 06, Collège de France, UMR CNRS 7574, Laboratoire de Chimie de la Matière Condensée de Paris, Paris, France

Mirjam Czjzek
Laboratory of Integrative Biology of Marine Models, Station Biologique de Roscoff, University Sorbonne Paris VI and CNRS, Roscoff, France

Martín Federico Desimone
Universidad de Buenos Aires. Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET). Instituto de la Química y Metabolismo del Fármaco (IQUIMEFA). Facultad de Farmacia y Bioquímica. Buenos Aires, Argentina

Jean‐Olivier Durand
Institut Charles Gerhardt Montpellier UMR‐5253 CNRS‐UM2‐ENSCM‐UM1cc, Montpellier, France


Francisco M. Fernandes
Sorbonne Universités, UPMC Univ Paris 06, Collège de France, UMR CNRS 7574, Laboratoire de Chimie de la Matière Condensée de Paris, Paris, France

Nikola Ž. Knežević
Faculty of Technology and Metallurgy, University of Belgrade, Belgrade, Serbia

Wei Li
College of Food Science and Technology, Huazhong Agricultural University, Wuhan, Hubei, China

Shilin Liu
College of Food Science and Technology, Huazhong Agricultural University, Wuhan, Hubei, China

Xiaogang Luo
School of Chemical Engineering and Pharmacy, Wuhan Institute of Technology, Wuhan, Hubei, China

Philippe Minard
Institute for Integrative Biology of the Cell (I2BC), UMR 9198, Université Paris‐Sud, CNRS, CEA, Orsay, France

Corinne Nardin
Institut pluridisciplinaire de recherche sur l’environnement et les matériaux (IPREM), Equipe Physique Chimie des Polymères (EPCP), Université de Pau et des Pays de l’Adour (UPPA), Pau, France

Enora Prado
Institute of Physics Rennes, UMR UR1‐CNRS 6251, Rennes, France

Laurence Raehm
Institut Charles Gerhardt Montpellier UMR‐5253 CNRS‐UM2‐ENSCM‐UM1cc, Montpellier, France

Stéphane Romero
Equipe Communication Intercellulaire et Infections Microbiennes, Centre de Recherche Interdisciplinaire en Biologie (CIRB), Collège de France, Paris, France; Institut National de la Santé et de la Recherche Médicale U1050, Paris, France; Centre National de la Recherche Scientifique UMR7241, Paris, France; MEMOLIFE Laboratory of Excellence and Paris Science Lettre, Paris, France

Agathe Urvoas
Institute for Integrative Biology of the Cell (I2BC), UMR 9198, Université Paris‐Sud, CNRS, CEA, Orsay, France

Marie Valerio‐Lepiniec
Institute for Integrative Biology of the Cell (I2BC), UMR 9198, Université Paris‐Sud, CNRS, CEA, Orsay, France

Daniel Van Opdenbosch
Biogenic Polymers, Technische Universität München, Straubing, Germany


Yuehan Wu
College of Food Science and Technology, Huazhong Agricultural University, Wuhan, Hubei, China

Cordt Zollfrank
Biogenic Polymers, Technische Universität München, Straubing, Germany


1 What Are Bionanocomposites?

Agathe Urvoas (1), Marie Valerio‐Lepiniec (1), Philippe Minard (1) and Cordt Zollfrank (2)

(1) Institute for Integrative Biology of the Cell (I2BC), UMR 9198, Université Paris‐Sud, CNRS, CEA, Orsay, France
(2) Biogenic Polymers, Technische Universität München, Straubing, Germany

Bionanocomposites: Integrating Biological Processes for Bioinspired Nanotechnologies, First Edition. Edited by Carole Aimé and Thibaud Coradin. © 2017 John Wiley & Sons, Inc. Published 2017 by John Wiley & Sons, Inc.

1.1 Introduction

Almost all natural materials, which are formed through the metabolic processes of an organism, are nanocomposite materials, that is, materials associating at least two distinct phases, one of which has nanometer‐scale dimensions. The term “natural” is most often used synonymously with the term “biological.” Natural nanocomposites can therefore be characterized as bionanocomposites.

Basically, two kinds of solid composite materials are generated in natural systems: soft matter and hard matter (Figure 1.1). Natural soft matter composites are composed of at least two types of organic biomacromolecules. The most prominent example is wood, a hierarchically structured bionanocomposite consisting of polysaccharides (mainly cellulose) and lignin (Figure 1.1a). Biological hard matter is generally composed of an inorganic phase and an organic phase. Biominerals (sea shells) and hard tissue (bone) are two typical forms of biological hard matter (Figure 1.1b).

Figure 1.1 Examples of biological soft and hard matter: (a) trunk disc of an oak tree and (b) lower jawbone of a cow (mandible).

Natural bionanocomposites combine high resilience and tolerance toward failure with adaptation, modularity, and multifunctionality [1, 2]. They were originally designed and optimized to serve the needs of life and to meet the surrounding environmental conditions, so as to guarantee the survival of the species they are associated with. Nature provides a rich pool of raw materials for mankind, with easily accessible constituents for habitation, clothes, weapons, and arts, among many other examples. Further, the development of chemistry allowed for the transformation of this raw matter into synthetic materials.

At the end of the last century, the conjunction of economic and environmental issues, combined with the growing development of multidisciplinary scientific research, led researchers to reconsider natural processes in general, and natural materials in particular, as an enormous pool of inspiration with incredible structural and functional variability. Such bioinspired materials, achieved by using Nature's guidelines to tailor and design a novel class of bionanocomposites or nanostructured biohybrid materials, have the potential to conquer complex multivariant environments [3–7].

However, it is interesting to note that under the constraints of living environments and the required metabolic conversion processes, only a small number of organic compounds (based on the light elements carbon, hydrogen, oxygen, nitrogen, sulfur, and phosphorus) and a few inorganic phases (i.e., calcium phosphates and carbonates, silica, and iron oxides) are used for the formation of bionanocomposites [8]. This strongly contrasts with engineering materials, which are prepared from almost all the elements of the periodic table.

In parallel, the structures and properties of biological polymers have been, and still are, studied by biologists mainly to understand their essential roles in biological systems. However, the potential applications of biological molecules in the design of bionanocomposites require considering them as synthetic “building blocks” that may eventually be used in a context distant from their natural environment or function.


1.2 A Molecular Perspective: Why Biological Macromolecules?

This is a relatively new view, as biomolecules have long been considered, outside the biological or biomedical field, as highly complex systems, difficult to modify, and too fragile to be of any practical utility. Indeed, proteins and nucleic acids have characteristic features that are not common in the synthetic chemical world. Their natural functionality in living cells and their potential applications outside biology result precisely from these properties:

● First, proteins and nucleic acids are very long copolymers in which the different monomers are linked in a defined order. In other words, these polymers have a defined “sequence,” a property that usually does not exist in polymers made by chemical synthesis.

● Second, the specific sequence of any nucleic acid, or the coding sequence of a protein gene, can be viewed, and is actually used by living cells as well as by biologists, not only as a substance but also as information: biological sequences can be duplicated, transmitted, eventually modified, and executed. Information processing occurs naturally between generations of cells and organisms that select, amplify, and replicate genes and control their expression. Information processing similarly occurs when a sequence is designed in a laboratory, transmitted by e‐mail, synthesized as a synthetic gene, amplified by PCR, and translated into protein in a recombinant microorganism.

● Third, biological polymers are self‐assembling materials. The information content embedded within each sequence is often sufficient to allow each nucleic acid or protein to reach its highly organized structure, and the functional properties of biological molecules directly result from their three‐dimensional structure.

● Fourth, nucleic acids and proteins can evolve. Each natural protein or nucleic acid sequence is not simply a molecule: its informational content is the product of a historical process. In the current structure and function of any natural protein, there is the memory of all past successful trials that occurred during its evolution. It is this historical information, accumulated over billions of years, that explains the amazing diversity and extreme sophistication of natural protein structures and functions.
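The second point in the list above, that a sequence is information which can be duplicated, transmitted, and executed, can be illustrated with a minimal sketch. It is not taken from the book: the function names and the 12-base example gene are hypothetical, and the only biological fact relied on is the standard genetic code. The sketch treats a coding strand as a plain string, copies it through its reverse complement, transcribes it to mRNA, and translates it to a one-letter protein sequence.

```python
"""Toy model of a gene handled as information (illustrative sketch only)."""

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))  # "*" marks a stop codon

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, read 5' to 3' (the basis of duplication)."""
    return dna.translate(COMPLEMENT)[::-1]


def transcribe(coding_strand: str) -> str:
    """Return the mRNA that carries the same message as the coding strand."""
    return coding_strand.replace("T", "U")


def translate(coding_strand: str) -> str:
    """'Execute' the sequence: read codons in frame until a stop codon."""
    protein = []
    for i in range(0, len(coding_strand) - 2, 3):
        amino_acid = CODON_TABLE[coding_strand[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    gene = "ATGGCCAAATGA"  # hypothetical example sequence, not from the book
    print("template strand:", reverse_complement(gene))
    print("mRNA:           ", transcribe(gene))
    print("protein:        ", translate(gene))
```

For the hypothetical gene above, translation yields the three-residue peptide MAK, and taking the reverse complement twice returns the original string, which is the sense in which the sequence can be copied without loss.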

1.3 Challenges for Bionanocomposites

Fifty years ago, the design of specific peptide or nucleic acid sequences to control the organization of gold nanoparticles into perfectly controlled crystals was probably as unexpected as the application of the same particles for cell imaging. Thus, the progress made in the field of bionanocomposites over the last decades largely results from the evolution of both the chemistry and biology fields (but also of physics, engineering, and computer science) that, in some specific areas, has led to conceptual and experimental convergences. The processing of natural macromolecules in artificial conditions has been as fruitful as the confrontation of the chemical and biological worlds to define how the two can cohabit. This has led to an impressive list of “hybrid” objects that will be described in the following chapters.

However, several characteristics present in biology have not been translated into engineering materials so far. The extraordinary structures and functions of biological materials strongly relate to their organization over several length scales. In particular, the importance of hierarchical structuring has long been identified and has been widely investigated in recent years [3, 9]. A variety of functional material solutions relying on structural hierarchy have been described in natural materials [10–14] (Figure 1.2). However, whereas such organizations could be obtained for organic or inorganic structures [15–17], they are still challenging to achieve in bionanocomposite systems.

Figure 1.2 Structural hierarchy of soft tissue (oak wood) and hard tissue (bone): (a1) tissue, (a2) cells, (a3) cell walls, (a4) elementary fibrils, and (a5) biomolecules (cellulose and lignin); (b1) compact and spongy bone, (b2) osteons, (b3) collagen fibril, (b4) mineralized collagen triple helices, and (b5) collagen molecule and calcium phosphate nanocrystal.


Another key feature of biological systems yet to be implemented in materials science involves dynamic processes such as evolution, growth, and continuous structure formation (self‐organization, remodeling). This means that, according to local and temporal needs, the organism supplies the required material “on demand” with extremely high spatiotemporal precision [18]. As a consequence, the building blocks for bionanocomposite formation (inorganic and organic phases) are continuously supplied and assembled through finely tuned cooperative interactions, under regulatory processes that respond to external and internal stimuli (Figure 1.3). This ultimately means that natural materials are dynamic in terms of structure and composition. These basic principles are related to two fundamental biological processes termed (i) ontogenesis and (ii) morphogenesis. Neither process is known in engineering materials, but both might lead to advanced materials with unforeseen properties and open new horizons for high‐level applications.

Figure 1.3 Fundamental biological and dynamic processes that are absent in engineering materials. The development of the material during the lifetime of an organism is termed ontogenesis: (a) juvenile bone (woven bone) with unoriented collagen fibrils and (b) adult bone (lamellar bone) with highly oriented collagen fibrils. Morphogenesis is the development of form: (c) juvenile skull and (d) adult skull. Labels in the original figure: cells and other tissue elements; collagen fibril.

To conclude, bionanocomposites are attracting more and more interest not only from natural scientists but also from materials chemists and engineers. Although synthetic materials might never be as sophisticated as natural systems, integrating the key processes involved in the building‐up of biological materials into the fabrication of bionanocomposites would pave the way for the development of a new generation of advanced materials that can cope with spatiotemporal multivariant environments and combine multiple properties.

References

1 Zollfrank, C.; Scheibel, T.; Seitz, H.; Travitzky, N. Ullmann’s Encyclopedia of Industrial Chemistry. Wiley‐VCH, Weinheim: 2014.
2 Zollfrank, C. Scr. Mater. 2014, 74, 3–8.
3 Fratzl, P.; Dunlop, J.; Weinkamer, R. (editors) Materials Design Inspired by Nature: Function Through Inner Architecture. Royal Society of Chemistry, Cambridge: 2013.
4 Meyers, M. M.; Chen, P.‐Y.; Lin A. Y.‐M.; Seki, Y. Prog. Mater. Sci. 2008, 53, 1–206.
5 Mann, S. Biomineralization: Principles and Concepts in Bioinorganic Materials Chemistry. Oxford University Press, New York: 2001.
6 Cölfen, H.; Antonietti, M. Mesocrystals and Nonclassical Crystallization. John Wiley & Sons Inc., Hoboken, NJ: 2008.
7 Mann, S. Biomimetic Materials Chemistry. Wiley‐VCH, New York: 1996.
8 Fratzl, P. J. R. Soc. Interface 2007, 4, 637–642.
9 National Materials Advisory Board, Hierarchical Structures in Biology as a Guide for New Materials Technology. National Academy Press, Washington, DC: 1994.
10 Aizenberg, J.; Weaver, J. C.; Thanawala, M. S.; Sundar, V. C.; Morse, D. E.; Fratzl, P. Science 2005, 309, 275–278.
11 Arzt, E.; Gorb, S.; Spolenak, R. Proc. Natl. Acad. Sci. U. S. A. 2003, 100, 10603–10606.
12 Weiner, S.; Wagner, H. D. Annu. Rev. Mater. Sci. 1998, 28, 271–298.
13 Keckes, J.; Burgert, I.; Frühmann, K.; Müller, M.; Kölln, K.; Hamilton, M.; Burghammer, M.; Roth, S. V.; Stanzl‐Tschegg, S.; Fratzl, P. Nat. Mater. 2003, 2, 810–813.
14 Raabe, D.; Romano, P.; Sachs, C.; Fabritius, H.; Al‐Sawalmih, A.; Yi, S.‐B.; Servos, G.; Hartwig, H. G. Mater. Sci. Eng. A 2006, 421, 143–153.
15 Zollfrank, C. Biotemplating: Polysaccharides in materials engineering. In Design and Nature V, Comparing Design in Nature with Science and Engineering, Brebbia, C. A., Carpi, A. (eds.) WIT Press, Southampton, 441–451, 2010.
16 Van Opdenbosch, D.; Zollfrank, C. Adv. Eng. Mater. 2014, 16, 699–712.
17 Soler‐Illia, G. J.; Sanchez, C.; Lebeau, B.; Patarin, J. Chem. Rev. 2002, 102, 4093–4138.
18 Jeronimides, G. Chapters 1 and 2. In Structural Biological Materials. Design and Structure‐Property Relationships, Elices, M. (ed.) Pergamon, Amsterdam: 2000.


2 Molecular Architecture of Living Matter


2.1 Nucleic Acids

Enora Prado1,*, Mónika Ádok‐Sipiczki2,* and Corinne Nardin3

1 Institute of Physics Rennes, UMR UR1‐CNRS 6251, Rennes, France
2 Department of Inorganic and Analytical Chemistry, University of Geneva, Geneva, Switzerland
3 Institut pluridisciplinaire de recherche sur l’environnement et les matériaux (IPREM), Equipe Physique Chimie des Polymères (EPCP), Université de Pau et des Pays de l’Adour (UPPA), Pau, France

2.1.1 Introduction: A Bit of History

The history of deoxyribonucleic acid (DNA) discovery and characterization began in 1869, when the Swiss biologist Friedrich Miescher isolated a phosphorus‐rich substance from the cell nucleus, which he named nuclein [1]. In 1889, the German pathologist and histologist Richard Altmann separated nuclein into proteins and an acidic substance (the nucleic acid), and, in 1896, Albrecht Kossel (awarded the Nobel Prize in Physiology or Medicine in 1910) discovered the four nucleobases adenine (A), cytosine (C), thymine (T), and guanine (G). Only later, in 1928, was DNA identified by Phoebus Levene and Walter A. Jacobs; it has been named deoxyribonucleic acid since 1935. After initial evidence obtained in 1944 by Avery, MacLeod, and McCarty, Alfred Hershey and Martha Chase confirmed in 1952 that DNA is the carrier of the genetic information through a series of experiments now known as the “Hershey and Chase experiments” [2]. Hershey won the Nobel Prize in Physiology or Medicine in 1969 for this discovery, along with Max Delbrück and Salvador Luria. At about the same time, James Watson and Francis Crick discovered the double helical structure of DNA [3]. They first established that DNA adopts a helical structure before elucidating the full structure. The two keys to this problem were, on the one hand, the helical structure and, on the other hand, the observation that DNA consists of four bases (A, T, G, and C) and that the nucleobase pairs A–T and G–C have sterically complementary structures. This new concept allowed them to develop a double helical model. The discovery would not have been possible without the X‐ray diffraction patterns acquired by Rosalind Franklin. J. Watson and F. Crick received the Nobel Prize in Physiology or Medicine in 1962 for these discoveries, which opened the way to molecular biology and biochemistry.

*  These authors contributed equally to the work.

Bionanocomposites: Integrating Biological Processes for Bioinspired Nanotechnologies, First Edition. Edited by Carole Aimé and Thibaud Coradin. © 2017 John Wiley & Sons, Inc. Published 2017 by John Wiley & Sons, Inc.

2.1.2 Definition and Structure

2.1.2.1 Nomenclature

Nucleic acids are macromolecular polymers divided into two families: DNA and ribonucleic acid (RNA); the basic unit, or monomer, is the nucleotide (nt). Each nucleotide consists of a heterocyclic base (also termed nitrogenous base), a phosphate group, and a pentose sugar (d‐ribofuranose in RNA or 2′‐deoxy‐d‐ribofuranose in DNA). The only difference between the two sugars is the absence (deoxyribose in DNA) or presence (ribose in RNA) of a hydroxyl group at the 2′ position. The nucleobase is covalently linked to the C1′ position of the sugar, whereas the phosphate group is linked via the C5′ position. The monomer without the phosphate group is named a nucleoside (a glycosylamine); it consists simply of a nucleobase and a 5‐carbon sugar. Nucleobases are typically classified as derivatives of two parent compounds, pyrimidine and purine. In DNA, the purine bases are adenine and guanine, while the pyrimidines are thymine and cytosine. RNA uses uracil instead of thymine. The International Union of Pure and Applied Chemistry (IUPAC) has designated an abbreviation code for nucleotides [4], summarized in Figure 2.1 with the corresponding structures.
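The classification above can be written down as a small lookup, which may help when reading sequence data. This is only an illustrative sketch: the function names `classify_base` and `nucleic_acid_type` are hypothetical, and only the five standard one-letter codes mentioned in the text are handled, not the full IUPAC ambiguity code.

```python
# One-letter codes for the five nucleobases named in the text.
PURINES = {"A", "G"}            # adenine, guanine
PYRIMIDINES = {"C", "T", "U"}   # cytosine, thymine (DNA), uracil (RNA)


def classify_base(base):
    """Return 'purine' or 'pyrimidine' for a one-letter nucleobase code."""
    base = base.upper()
    if base in PURINES:
        return "purine"
    if base in PYRIMIDINES:
        return "pyrimidine"
    raise ValueError(f"unknown nucleobase code: {base!r}")


def nucleic_acid_type(sequence):
    """Guess the family of a strand from its bases: thymine implies DNA, uracil implies RNA."""
    bases = set(sequence.upper())
    if "T" in bases and "U" in bases:
        raise ValueError("sequence mixes thymine and uracil")
    if "T" in bases:
        return "DNA"
    if "U" in bases:
        return "RNA"
    return "ambiguous"  # no T or U present, so the bases alone cannot decide
```

For instance, `nucleic_acid_type("GATTACA")` gives `"DNA"`, whereas a strand made only of A, G, and C is reported as ambiguous because the sugar, not the base, would be needed to tell the two families apart.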

Figure 2.1  Structures of the different nucleobases, sugars, and deoxyribonucleotides. [Structures shown: nucleobases adenine (A), guanine (G), cytosine (C), thymine (T), and uracil (U); sugars ribose and deoxyribose; a deoxyribonucleotide.]

2.1.2.2 Structure

The nucleic acid sequence, referred to as the strand or primary structure, is regarded as a linear (co)polymer in which the different units are linked by phosphodiester bonds. The phosphate group binds to the C3′ position of the sugar of one nucleotide and to the C5′ position of the sugar of the next nucleotide via two ester bonds, thus forming the polymer backbone. The commonly used term oligonucleotide refers to a short segment of a few tens of nucleotides. The two strands of DNA are antiparallel and coiled around each other to form a right‐handed double helix maintained by hydrogen bonds between complementary nucleobases, which form base pairs (bp); this is the principle of nucleic acid hybridization. Adenine is always associated with thymine (in DNA) or uracil (in RNA) through two hydrogen bonds, and guanine with cytosine through three hydrogen bonds. These H‐bonds are referred to as Watson–Crick interactions and the double helix as the secondary structure of nucleic acids (Figure 2.2). Stable double helical structures are very common in DNA, in particular for storing the genetic information, whereas they are scarcer in RNA, which instead adopts several single‐stranded conformations depending on its biological function. The DNA double helix is innately a nanoscale entity: its diameter is about 2 nm, the separation between adjacent bases is 0.34 nm, and the helical periodicity is ~3.5 nm per turn, or 10–10.5 nucleotide pairs per turn.

Three common forms of duplex nucleic acids have been identified. The most common form, present in most DNA at neutral pH and physiological ionic strength, is the B‐form, the classical right‐handed double helix. The A‐form, a thicker right‐handed duplex with a shorter distance between base pairs, has been described for RNA–DNA and RNA–RNA duplexes. A third form of duplex DNA has a strikingly different left‐handed helical structure. This Z‐form DNA is formed by stretches of alternating purines and pyrimidines, for example, GCGCGC, especially in negatively supercoiled DNA.1 A small amount of the DNA in a cell exists in the Z‐form (Table 2.1).

At a higher level of structural organization, the tertiary structure, the secondary structure elements are associated through numerous van der Waals interactions and specific hydrogen bonds, via the formation of a small number of additional Watson–Crick pairs and/or non‐Watson–Crick base pairs [7]. Thus, the tertiary structure refers to the three‐dimensional (3D) organization of nucleic acids, which is involved in the specific recognition of elements such as proteins, nucleic acids, and ligands, or provides ion‐binding sites. A classical illustration of the levels of structural organization of nucleic acids is transfer RNA (tRNA). A tRNA is an RNA typically composed of 76–90 nucleotides (primary structure) that serves as the physical link between the nucleotide sequence of nucleic acids (DNA and RNA) and the amino acid sequence of proteins. The structure of tRNA can be decomposed into its primary structure, its secondary structure, usually visualized as a cloverleaf, and its tertiary structure. All tRNAs have a similar L‐shaped 3D structure. The cloverleaf becomes the L‐shaped 3D structure through the coaxial stacking of the helices, which is a common RNA tertiary structure motif [5] (Figure 2.3).

1  Supercoiling is a form of DNA in which the double helix is further twisted around itself, forming a tightly coiled structure. It is the form generally adopted by DNA in nature, since it enables sufficient condensation for the DNA to be packaged into living cells. In negative supercoiling, the DNA is twisted around an axis in a direction opposite to that of the clockwise turns of the (right‐handed) double helix, which decreases the number of turns of one helix around the other. In positive supercoiling, the twist of the supercoils is in the same direction as that of the double helix, which increases the number of turns of one helix around the other. Supercoiling must be temporarily suppressed when DNA replication takes place, and the degree of supercoiling can affect gene transcription [5, 6].

Figure 2.2  Double helical structure of B‐form DNA and nucleobase pair position in the helical structure in the antiparallel orientation. [Labels in the figure: nucleotide; nitrogenous base; sugar (deoxyribose); phosphate group (P); base pairs C–G and T–A; DNA double helix.]

Table 2.1  Comparison of B‐form, A‐form, and Z‐form DNA structural parameters.

Parameter                  B‐form   A‐form   Z‐form
Helix sense                RH       RH       LH
bp per turn                10       11       12
Vertical rise per bp (Å)   3.4      2.56     3.7
Rotation per bp (°)        +36      +33      −30
Helical diameter (Å)       19       23       18
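As a rough illustration of the base‐pairing rules and helical parameters discussed in this section, the sketch below derives the complementary antiparallel strand of a DNA sequence and estimates the length and number of turns of a duplex. The numerical values are those listed in Table 2.1; the names `reverse_complement` and `helix_dimensions` are hypothetical, and the model ignores sequence‐dependent variations in geometry.

```python
# Rise per base pair (in angstroms) and base pairs per turn, as listed in Table 2.1.
HELIX_FORMS = {
    "B": {"bp_per_turn": 10, "rise_angstrom": 3.4},
    "A": {"bp_per_turn": 11, "rise_angstrom": 2.56},
    "Z": {"bp_per_turn": 12, "rise_angstrom": 3.7},
}

# Watson-Crick pairing in DNA: A with T, G with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand):
    """Return the strand that hybridizes with `strand`, written 5'->3'.

    The two strands are antiparallel, so the complement is read backwards.
    """
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


def helix_dimensions(n_bp, form="B"):
    """Return (length in nm, number of helical turns) for a duplex of n_bp base pairs."""
    params = HELIX_FORMS[form]
    length_nm = n_bp * params["rise_angstrom"] / 10.0  # 10 angstroms = 1 nm
    turns = n_bp / params["bp_per_turn"]
    return length_nm, turns
```

With these assumptions, a 100 bp B‐form duplex is about 34 nm long and makes about 10 turns, consistent with the ~3.5 nm periodicity quoted above.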

2.1.3 DNA and RNA Functions

2.1.3.1 Introduction

DNA is the carrier not only of the genetic information but also of its variations. DNA stores the genetic information, like a “hard drive,” along nonrandom sequences of nucleotides. It therefore contains all the information necessary for the formation of proteins while remaining within the cell nucleus. DNA is read during the transcription–translation process, which begins with the synthesis of RNA. RNA may play several roles, especially that of messenger between DNA and proteins, or a catalytic role through its ability to form complex structures. Another essential function of DNA is heredity. Owing to the replication process, DNA can be duplicated, and the genetic information can be transferred from one generation to the next. These two essential processes are explained in more detail in the following sections.
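The flow of information from DNA to protein can be made concrete with a short sketch. It assumes the standard genetic code, which is not tabulated in this chapter, and uses hypothetical function names (`transcribe`, `translate`). It also ignores promoters, splicing, and every regulatory step: the template strand is simply copied into a complementary mRNA, which is then read three nucleotides at a time until a stop codon is met.

```python
# Standard genetic code (an assumption; not tabulated in this chapter), built compactly:
# codons are enumerated in U, C, A, G order and "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Pairing used during transcription: the RNA carries uracil opposite adenine.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_5to3):
    """Return the mRNA (written 5'->3') complementary to a DNA template strand.

    The template is given 5'->3'; because the strands are antiparallel,
    it is traversed in reverse while the mRNA is built.
    """
    return "".join(DNA_TO_RNA[base] for base in reversed(template_5to3.upper()))


def translate(mrna):
    """Translate from the first AUG to the first stop codon (one-letter amino acid codes)."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: the chain is released
            break
        protein.append(amino_acid)
    return "".join(protein)
```

As a usage example, the template `"TTAGGCCAT"` is transcribed into the mRNA `"AUGGCCUAA"`, which this sketch translates into the dipeptide `"MA"` (methionine, alanine) before reaching the stop codon UAA.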

Figure 2.3  Structure‐based cloverleaf models (left) and 3D L‐shaped structure (right) of A. aeolicus tRNASec [Source: Itoh et al. [5]. Reproduced with permission of Oxford University Press]. [Labels in the cloverleaf model: acceptor arm, D arm, anticodon arm, extra arm, and T arm.]

2.1.3.2  Transcription–Translation Process

DNA is the template for protein biosynthesis, the expression of which defines the phenotype of individuals. Because DNA cannot cross the nuclear membrane, one of the roles of RNA is that of an intermediate between DNA and proteins in the transcription–translation process. Transcription corresponds to the reading of the genetic information by a molecular motor named RNA polymerase (polRNA). polRNA binds to a particular DNA sequence, the promoter (marking the beginning of the gene), and progresses along the gene,2 locally separating the two strands to form a bubble of about 10 nucleobases. Progressively, a complementary strand is produced: the messenger RNA (mRNA). The mRNA can cross the nuclear membrane, and, after possible RNA splicing (removal of introns3 in eukaryotic cells), the translation process begins. A “molecular factory” composed of proteins and ribosomal RNA (rRNA), named the ribosome, binds to the mRNA. The mRNA is read in nucleotide triplets (codons); for each codon, the corresponding amino acid is brought to the ribosome by a tRNA and added to the protein chain. At the end of the translation process (when the stop codon is read), the protein is released into the cytoplasm.

Among all the biological processes involved in genetic expression, one interesting mechanism based on nucleic acid interactions concerns small RNAs. Small RNAs are defined as ~20–30 nucleotide noncoding RNAs (ncRNAs) that regulate gene expression. It is interesting to note that up to 98% of the transcriptional output of the human genome is RNA that does not code for protein [8]. This bulk of transcribed RNA either is largely of no use or, alternatively, may fulfill a wide range of yet unexplored functions in both eukaryotic and prokaryotic biology [9]. These ncRNAs can be grouped into several classes based on their length, biogenesis, polarity (sense or antisense), and putative functions. A basic classification criterion is size: long ncRNAs are typically >200 nt long and function without major prior processing. Conversely, small ncRNAs such as piwi‐interacting RNAs (piRNAs) [10], small interfering RNAs (siRNAs), microRNAs (miRNAs), and some bacterial regulatory RNAs [11] are processed from longer precursors. Among these small ncRNAs, siRNAs and miRNAs, which are noncoding single‐stranded RNA molecules, have sparked keen interest in the scientific community: miRNAs as regulators of endogenous genes,4 and siRNAs as defenders of genome integrity in response to foreign or invasive nucleic acids such as viruses, transposons,5 and transgenes6 [12]. Single‐stranded forms of both miRNAs and siRNAs were found to associate with assemblies known as RNA‐induced silencing complexes (RISCs) [12a, 13]. Initially, miRNAs and siRNAs appeared to be distinguished in two primary ways. First, miRNAs were viewed as endogenous and purposefully expressed products of an organism’s own genome, whereas siRNAs were thought to be primarily exogenous in origin, derived directly from the virus, transposon, or transgene trigger. Second, miRNAs appeared to be processed from stem‐loop (see Section 2.1.4.1.1) precursors of incomplete double‐stranded character, whereas siRNAs were found to be excised from long, fully complementary double‐stranded RNAs (dsRNAs) [14]. Despite these differences, the size similarities and sequence‐specific inhibitory functions of miRNAs and siRNAs immediately suggested relatedness in biogenesis and biological functions. Targeting more than 60% of the human genome, miRNAs and siRNAs are important for many critical physiological processes, and abnormal expression of miRNAs/siRNAs has been confirmed to be highly related to the development of most types of cancers [9a, 15]. Owing to great progress in miRNA/siRNA‐related biological studies, miRNAs/siRNAs are emerging as novel biomarkers for cancer diagnosis and therapeutics [16], which has further driven the development of diverse miRNA/siRNA detection methodologies [17].

2  A gene is the basic physical unit of heredity; a linear sequence of nucleotides along a segment of DNA that provides the coded instructions for the synthesis of RNA, which, when translated into a protein, leads to the expression of a hereditary character.
3  An intron corresponds to a segment of a gene situated between exons that is removed before the translation of the messenger RNA and does not function in coding for protein synthesis.
4  Endogenous substances are those that originate from an organism, tissue, or cell. Endogenous genes refer to those that are located in the cell genome, as opposed to genes that appear from viral infection.
5  A transposon, also called a jumping gene, is a gene or set of genes capable of inserting copies of itself into other DNA locations within the same cell.
6  A transgene is a gene that is taken from the genome of one organism and introduced into the genome of another organism by artificial techniques.

2.1.3.3  Replication Process

The replication process duplicates DNA prior to cell division (mitosis) so that each daughter cell retains a copy of the mother cell DNA. It is a semiconservative process: each of the two newly formed DNA molecules consists of a parental strand (which served as the template for replication) and a newly formed strand (complementary to the parental one). In other words, if the two DNA strands are designated as A and B, strand A serves as a template for synthesizing a new strand B, while B templates the synthesis of a new strand A (Figure 2.4). Replication begins at specific sequences (the origins of replication) and progresses in both directions along the two DNA strands. It is important to note that the template DNA is read from the 3′ to the 5′ position, whereas the new strand is synthesized in the 5′ to 3′ direction. Initiation factor proteins, which facilitate the binding of the other proteins involved in replication, recognize the origins of replication. Helicases then unzip the double‐stranded DNA to form replication forks. The replication fork is the structure formed when the DNA is replicated and to which DNA polymerase (polDNA) binds. DNA polymerase is an enzyme that catalyzes the formation of the bonds between nucleotides. DNA polymerases are highly accurate, with an intrinsic error rate of less than one mistake for every 10^7 nucleotides added [18]. The enzyme complex (helicases, polymerases, and other proteins) involved in replication is called the replicase.

Figure 2.4  Illustration of DNA replication. [Labels in the figure: double‐stranded DNA; template strand 5′–3′ with new strand 3′–5′; template strand 3′–5′ with new strand 5′–3′.]
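The semiconservative scheme and the quoted fidelity can be sketched in a few lines. The code below is a deliberately simplified model with hypothetical function names: it ignores origins of replication, replication forks, and the enzymes involved, and it treats the error rate of one mistake per 10^7 nucleotides as an upper bound taken from the text.

```python
# Watson-Crick pairing in DNA.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesize_new_strand(template_5to3):
    """Return the new strand (written 5'->3') templated by a parental strand.

    The template is read 3'->5', i.e. backwards relative to how it is written,
    so that the new strand grows in the 5'->3' direction.
    """
    return "".join(COMPLEMENT[base] for base in reversed(template_5to3))


def replicate(strand_a, strand_b):
    """Return two daughter duplexes, each a (parental strand, new strand) pair."""
    return (
        (strand_a, synthesize_new_strand(strand_a)),
        (strand_b, synthesize_new_strand(strand_b)),
    )


def max_expected_errors(n_nucleotides, error_rate=1e-7):
    """Upper bound on the expected number of misincorporations for n nucleotides added."""
    return n_nucleotides * error_rate
```

Starting from the duplex `("ATGC", "GCAT")`, `replicate` returns `("ATGC", "GCAT")` and `("GCAT", "ATGC")`: each daughter keeps one parental strand and gains one new strand. As an order of magnitude, copying a hypothetical genome of 3 × 10^9 nucleotides at this intrinsic error rate would give at most about 300 mistakes before any proofreading or repair.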

2.1.4 Specific Secondary Structures

The chemical structure of a single‐stranded nucleic acid gives little insight into its biological function as a carrier of genetic information. For a deeper understanding, one must examine the structure at a higher level of organization. This section gives an overview of the structure, biological function, and stability of specific secondary structures.

2.1.4.1  Watson–Crick H‐Bonds

2.1.4.1.1 Stem‐Loop

A classical secondary structure adopted by single‐stranded DNA or, more commonly, by RNA and based on Watson–Crick H‐bonding is the stem‐loop. This structure is also known as a hairpin or hairpin loop. A stem‐loop occurs upon intramolecular base pairing between two regions of the same strand, usually complementary in nucleotide sequence when read in opposite directions. The paired regions assemble into a double helix that ends in an unpaired apical loop. The resulting structure is a key building block of many RNA secondary structures. The formation of a stem‐loop structure depends on the stability of the resulting helix and loop regions. The most important prerequisite is the presence of a sequence that can fold back on itself to form a paired double helix. The stability of this helix is determined by its length, the number of mismatches or bulges it contains (a low number is tolerable, especially along a long helix), and the base composition of the paired region and the loop [19]. Base‐stacking interactions, which align the π orbitals of the bases’ aromatic rings in a favorable orientation, also promote helix formation. Stem‐loop structures are involved in diverse biological functions [20]. For example, stem‐loops occur in pre‐microRNA structures and, most famously, in tRNA, which contains three true stem‐loops and one stem that meet in a cloverleaf pattern. Stem‐loop structures are also identified in many ribozymes or within the 5′ UTR7 [21] of prokaryotes, in which they are often bound to proteins or cause the attenuation of a transcript in order to regulate translation [22]. Another example is the mRNA stem‐loop structure forming at the ribosome binding site, which may control translation initiation [23]. Interestingly, the principle of stem‐loop formation has been used to develop different functional nanodevices [24]. As detailed in the section of the textbook dedicated to Functional Biomolecular Engineering, one original example of these applications concerns the design of molecular motors, such as hybridization‐driven DNA walkers.

7  The 5′ untranslated region (5′ UTR) (also known as a Leader Sequence or Leader RNA) is the region of an mRNA that is directly upstream from the initiation codon. This region is important for the regulation of translation of a transcript by differing mechanisms in viruses, prokaryotes, and eukaryotes.

2.1.4.1.2  Kissing Complex

Stem‐loop structures can also interact with one another when the unpaired bases of two hairpin loops pair [25]. This kind of interaction is named a kissing interaction, also called a loop–loop pseudoknot. When the hairpin loops are located on separate molecules, their intermolecular interaction is called a kissing complex. These intra‐ and intermolecular kissing interactions are important in forming the tertiary or quaternary structure of many RNAs. They generally form between stem‐loops with extensive complementarity; however, stable complexes containing only two intermolecular Watson–Crick base pairs have been observed [26]. Intramolecular kissing interactions are observed in the native structures of a variety of RNAs, including Varkud satellite (VS) RNA8 [27] and tRNA [28]. These kissing interactions contribute to the assembly and stabilization of their respective RNA structures by joining and orienting helices [26a]. Kissing interactions may stabilize both native and nonnative interactions during tertiary folding, which can affect the rate at which the native structure is formed [29]. They often form distorted structures that can serve as recognition sites, for example, for proteins [30] or metal ions [31]. As a result, kissing interactions contribute to the stability of an RNA structure by affecting both global and local RNA interactions. As RNA function often depends on its ability to adopt alternative structures, it is difficult to predict an RNA 3D structure directly from its sequence. Single‐molecule approaches show potential for addressing RNA structural polymorphism by monitoring molecular structures one molecule at a time. An original method for studying the folding and stability of a stem‐loop structure or kissing complex relies on optical tweezers [32]. The transient formation of an intermolecular kissing complex is required for RNA dimerization during the life cycle of retroviruses [33] and for the formation of some antisense target complexes [34].

8  The Varkud satellite (VS) ribozyme is one of the classes of nucleolytic ribozymes that includes the hammerhead, hairpin, and hepatitis delta virus ribozymes. These carry out reversible cleavage and ligation reactions at a specific site by transesterification reactions involving 2′‐ and 5′‐oxygen and 3′‐phosphorus atoms. The VS RNA is an abundant transcript from DNA found in the mitochondria of a number of natural isolates of Neurospora. Collins and coworkers found that the VS RNA contains an element capable of self‐cleavage, which is thought to act in the processing of replication intermediates.
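The sequence requirement for a stem‐loop described in Section 2.1.4.1.1, namely two regions of the same strand that are complementary when read in opposite directions, can be illustrated with a naive search. This is a sketch under strong simplifying assumptions, not a structure prediction method: the function name and its parameters (`stem_len`, `min_loop`, `max_loop`) are hypothetical, only perfect Watson–Crick stems are detected, and mismatches, bulges, and thermodynamic stability are ignored.

```python
# Watson-Crick pairing in RNA.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def find_stem_loops(rna, stem_len=4, min_loop=3, max_loop=8):
    """Return (start, 5' arm, loop, 3' arm) tuples for perfect hairpins in `rna`.

    A hit is reported when a segment of `stem_len` nucleotides is followed,
    after an unpaired loop of min_loop..max_loop nucleotides, by its reverse
    complement, so that the two arms can pair in an antiparallel fashion.
    """
    hits = []
    n = len(rna)
    for i in range(n - 2 * stem_len - min_loop + 1):
        arm5 = rna[i:i + stem_len]
        # Sequence the 3' arm must have to pair with the 5' arm.
        target = "".join(RNA_COMPLEMENT[base] for base in reversed(arm5))
        for loop_len in range(min_loop, max_loop + 1):
            j = i + stem_len + loop_len
            if j + stem_len > n:
                break
            if rna[j:j + stem_len] == target:
                hits.append((i, arm5, rna[i + stem_len:j], rna[j:j + stem_len]))
    return hits
```

For the short sequence `"GGGCAAAAGCCC"`, the search reports a single hairpin in which the arm GGGC pairs with GCCC around the unpaired loop AAAA. The same complementarity test, applied to the loops of two different hairpins, would be one way to look for candidate kissing interactions.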

2.1.4.2  Other Kinds of H‐Bonding

2.1.4.2.1 G‐Quartets

The canonical right‐handed double helical secondary structure assumed by bulk DNA in vivo has been known since its description by Watson and Crick in 1953 [3]. Since then, however, it has become clear that DNA can adopt a variety of alternative conformations based on particular sequence motifs and interactions with various proteins. These spatial arrangements include single‐stranded hairpins, intramolecular duplexes, triplexes, and quadruplexes. Although these structures were considered an interesting phenomenon, little practical meaning was associated with them at first. It has recently become clear, however, that they play important physiological roles. For instance, they have been found in telomeres. Telomeres protect the ends of chromosomes from deterioration or from fusion with neighboring chromosomes and consist of (TTAGGG)n repeated sequences and associated proteins at chromosome ends. These structures also appear in greater numbers in the promoters of proto‐oncogenes, immunoglobulin heavy chain switch regions, and mutational hotspots; they are therefore found to have profound effects on replication, transcription, and genome stability. Proto‐oncogenes are genes that code for proteins responsible for proliferation. Mutations in proto‐oncogenes can lead to an increase in protein expression, hyperactivity (i.e., gain of function), and/or loss of regulation. The mutated form is called an oncogene. Mutation frequencies vary significantly along nucleotide sequences, such that mutations often concentrate at certain positions called hotspots. It is also assumed that quadruplexes are involved in the proliferation of tumor cells [35]. Some of these effects are positive, affecting normal genetic development and ensuring diversity, while other effects are negative and result in a variety of genetic disorders and cancer in humans.

Nucleic acid sequences rich in guanine can form inter‐ or intramolecular hydrogen bonds to build square planar arrays of four guanines known as G‐quartets. In G‐quartets, each guanine is linked with its neighbor via two hydrogen bonds by Hoogsteen pairing. These structures then stack on each other in a helical fashion, forming a structure referred to as a G‐quadruplex, G‐tetraplex, or G4‐DNA (Figure 2.5). The structures of polynucleotides containing purely guanosines are thus four‐stranded analogues of the Watson–Crick AT and GC base‐paired double helices [37]. G‐quadruplexes are stabilized by hydrogen bonds and by the presence of alkali metal ions, which are located in the center between two G‐quartets. These ions are usually potassium or sodium cations, which interact electrostatically with the guanine carbonyl [38]. The topology of a quadruplex varies depending on the monovalent cation (K+ or Na+), the glycosidic conformation (rotation around the bond joining the 1′‐carbon of the deoxyribose sugar to the heterocyclic base gives rise to the different conformations, syn or anti), the number of nucleic acid molecules involved in its formation (intramolecular, bimolecular, or tetramolecular), the relative orientation of the strands (parallel or antiparallel), the number of stacking G‐quartets, and the nucleotide sequence.

RNA is also capable of forming G‐quadruplex structures, and the thermodynamic stability of the resulting structure is even higher than that of the structures assembled by DNA, because the 2′‐OH group of the ribose sugar acts as a scaffold for an ordered network of water molecules and bonding patterns [39]. In addition, whereas DNA in biological systems is typically double stranded, RNA is typically single stranded. Since G‐quadruplex formation then does not have to compete with duplex formation, it is more likely to occur. However, the assembly of a wide range of alternative structures is possible in addition to G‐quadruplex formation. Phan has reviewed the topological and structural description of the different DNA and RNA G‐quadruplexes formed by short and long human telomeric sequences [40]. While the main focus of research has been on DNA G‐quadruplexes and their potential role in biology, the role of RNA G‐quadruplexes as regulatory elements of gene expression has lately begun to emerge. Recent progress in unraveling the function of RNA G‐quadruplexes makes it possible to foresee this structure as a suitable candidate for therapeutic intervention. Considerable efforts are thus invested in manipulating cellular events by strategically targeting this secondary structure [41]. In this regard, the optimization of suitable drug candidates that can interact selectively with RNA quadruplexes is of utmost importance [42]. The initial emphasis was on G‐quadruplexes located in the 5′ UTR of mRNA, which is known to be involved in translational regulation [43].

Figure 2.5  Left: structure of a G‐quartet, with four guanines arranged around a central monovalent cation (M+). Right: schematic structure of a G‐quadruplex [Source: Bochman et al. [36]. Reproduced with permission of Nature Publishing Group].
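Guanine‐rich stretches that could fold into an intramolecular G‐quadruplex are often located with a simple pattern search. The sketch below uses one common heuristic that is not taken from this chapter and should be read as an assumption: four runs of at least three guanines separated by loops of one to seven nucleotides. The function name is hypothetical, and a match only flags a candidate sequence; it says nothing about whether the quadruplex actually forms under given ionic conditions.

```python
import re

# Heuristic (an assumption, not from this chapter): four G-runs of >= 3 guanines
# separated by three loops of 1-7 nucleotides each.
G4_PATTERN = re.compile(r"G{3,}[ACGTU]{1,7}G{3,}[ACGTU]{1,7}G{3,}[ACGTU]{1,7}G{3,}")


def find_g4_motifs(sequence):
    """Return (start index, matched sequence) for non-overlapping candidate G4 motifs."""
    return [(m.start(), m.group()) for m in G4_PATTERN.finditer(sequence.upper())]
```

Applied to four copies of the telomeric repeat mentioned above, `find_g4_motifs("TTAGGG" * 4)` returns one candidate starting at index 3, `GGGTTAGGGTTAGGGTTAGGG`. Because `finditer` reports non‐overlapping matches only, overlapping candidates in long G‐rich regions would need a different scanning strategy.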


2.1.4.2.2 i‐Motifs

Similar to G‐quadruplexes, i‐motifs are four‐stranded DNA secondary structures that can be assembled by sequences rich in cytosine. Stabilized in acidic conditions, they are composed of two parallel‐stranded DNA duplexes held together in an antiparallel orientation by intercalated cytosine–cytosine+ base pairs. Initially, i‐motifs were thought to be unstable at physiological pH, which precluded substantial biological investigation. However, recent advances have shown that this is not always the case and that i‐motif stability is highly dependent on factors such as sequence and environmental conditions. Some of the different i‐motif structures have been described in detail elsewhere [44]. The possibility of forming higher order structures makes these motifs an excellent module for the design of nanodevices.

2.1.5 Stability

The stability of nucleic acid structures depends on intrinsic and environmental parameters. Among the environmental parameters, the two key players are the ionic conditions and the temperature. As nucleic acid molecules are highly charged polyanions, cations such as sodium and magnesium ions are required to neutralize the negative charges of the backbone and reduce the repulsive Coulombic interactions between the phosphate groups, so that the nucleic acid molecules can fold into their compact native structures. Ionic properties, such as ion concentration, charge, and size, play important roles in determining the stability and folding kinetics of nucleic acids [45]. For example, divalent cations such as Mg²⁺ screen the negative charge of the backbone more efficiently than monovalent cations, whereas monovalent cations such as K⁺ stabilize G‐quadruplexes [46].

The second important environmental parameter for nucleic acid stability is temperature. Indeed, the 3D structures of nucleic acids are held together by a number of weak interactions such as hydrogen bonds, stacking, van der Waals, and hydrophobic interactions. These weak interactions follow thermodynamic rules and are thus temperature dependent. The temperature effect can be illustrated by the denaturation of double‐stranded DNA. As the temperature increases, local unwinding of the double‐stranded DNA occurs. When all base interactions are disrupted, the two strands separate, in a process called denaturation. The nucleobases are then exposed to the aqueous environment, although the edges of the bases still form hydrogen bonds with water. At high temperature, single‐stranded DNA is more stable than double‐stranded DNA. As the temperature is lowered, the double‐stranded form becomes more stable than the single strand in solution, and the DNA renatures [47]. The first step is a nucleation event during which two complementary regions come into contact. Nucleation is the rate‐limiting step in renaturation. Once nucleation occurs, the rest of the molecule zips up rapidly.

The denaturation process can be easily followed by measuring UV absorbance at 260 nm, since single‐stranded DNA absorbs more than its double‐stranded counterpart. The temperature course of the process yields a typical melting curve from which the melting point (Tm) can be determined (Figure 2.6) [48]. The Tm is defined as the temperature, in degrees Celsius, at which 50% of all complementary molecules of a given DNA sequence are hybridized into a double strand and 50% are present as single strands. This value is a useful indicator of the stability of nucleic acid structures [49] and depends on the nucleobase composition [24, 50]. It is important to note here that AT‐rich sequences have a lower Tm than GC‐rich sequences, since the average stacking interactions of G/C base pairs are two or three times stronger than those of A–T base pairs, so more thermal energy is needed to disrupt them [51].

Figure 2.6  Typical melting curves of double‐stranded DNA. [Plot: absorbance at 260 nm versus temperature (°C, 40–100); curves labeled native dsDNA and poly(GC); Tm marked.]

As explained previously, nucleic acid stability depends not only on the nucleobase composition but also on the chemical nature of the nucleic acid. Indeed, the presence of the ─OH group on the 2′ carbon of the ribose sugar makes a polyribonucleotide less stable than a polydeoxyribonucleotide, because the 2′‐OH group is susceptible to taking part in a nucleophilic attack. The 2′‐OH can generate a 2′‐O⁻ that attacks the phosphorus atom and converts the phosphodiester group into a 2′,3′‐cyclic nucleotide, thus breaking the polynucleotide chain. Hydrolysis of the cyclic nucleotide produces a mixture of 2′ and 3′ nucleotides at the breakpoint. Moreover, double‐stranded DNA has relatively small grooves as opposed to the larger grooves along RNA molecules. These larger grooves provide ample docking space for damaging enzymes, called nucleases (DNases and RNases). Thus, RNAs are more sensitive to enzymatic degradation than DNA.
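The definition of Tm given above can be turned into a short numerical sketch. The example below is only an illustration under stated assumptions: it treats melting as a two‐state transition with flat baselines (lowest absorbance reading = fully double‐stranded, highest = fully single‐stranded), and the function name `melting_temperature` and the absorbance readings are hypothetical, not data from Figure 2.6.

```python
def melting_temperature(temps_c, absorbances):
    """Estimate Tm as the temperature at which half of the duplexes have melted.

    Assumes a two-state transition and flat baselines: the lowest reading is
    taken as fully double-stranded, the highest as fully single-stranded.
    """
    a_min, a_max = min(absorbances), max(absorbances)
    if a_max == a_min:
        raise ValueError("absorbance does not change; no transition to locate")
    # Normalized absorbance = fraction of strands in single-stranded form.
    fraction = [(a - a_min) / (a_max - a_min) for a in absorbances]
    for i in range(1, len(fraction)):
        f0, f1 = fraction[i - 1], fraction[i]
        if f0 < 0.5 <= f1:
            t0, t1 = temps_c[i - 1], temps_c[i]
            # Linear interpolation between the two points bracketing 50%.
            return t0 + (0.5 - f0) * (t1 - t0) / (f1 - f0)
    raise ValueError("no upward crossing of the half-transition point found")


# Hypothetical absorbance readings at 260 nm (arbitrary units).
temps = [40, 50, 60, 70, 80, 90, 100]
a260 = [1.00, 1.00, 1.02, 1.10, 1.30, 1.38, 1.40]
tm = melting_temperature(temps, a260)
print(f"Estimated Tm = {tm:.1f} °C")
```

Real melting curves usually have sloping baselines, so a practical analysis would fit the baselines or use the maximum of the first derivative instead of this simple half‐height rule.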


2.1.6 Conclusion

In this chapter, after a brief account of the history of the discovery of DNA, we introduced the composition of nucleic acids: the nucleobases, sugars, and nucleotides. Higher order DNA and RNA structures and their functions were illustrated throughout, from the double helix, a secondary structure held together mainly by the hydrogen bonds of Watson–Crick base pairing and found in various forms in nature, to the coaxial stacking of helices, a common RNA tertiary structure motif. These higher orders of DNA and RNA organization are involved in biological processes, such as the replication that duplicates DNA ahead of cell division (mitosis), and in synthetic reactions as well. The chapter ended with a discussion of the stability of nucleic acids, for instance against temperature variations, as illustrated by the denaturation of double‐stranded DNA. Overall, we described nucleic acids with emphasis on the relationship between composition, higher order organization, and function. This knowledge is particularly important for introducing and understanding current and future applications based on nucleic acids.

References

1 Dahm, R. Dev. Biol. 2005, 278, 274–288.
2 Hershey, A. D.; Chase, M. J. Gen. Physiol. 1952, 36, 39–56.
3 Watson, J. D.; Crick, F. H. C. Nature 1953, 171, 737–738.
4 Biochemical Nomenclature and Related Documents, Portland Press, London, 1992, pp. 109–126.
5 Itoh, Y.; Sekine, S.‐i.; Suetsugu, S.; Yokoyama, S. Nucleic Acids Res. 2013, 41, 6729–6738.
6 Dorman, C. J. Sci. Prog. 2006, 89, 151–166.
7 (a) Leontis, N. B.; Stombaugh, J.; Westhof, E. Nucleic Acids Res. 2002, 30, 3497–3531; (b) Westhof, E. FEBS Lett. 2014, 588, 2464–2469.
8 Mattick, J. S. Science 2005, 309, 1527–1528.
9 (a) Breving, K.; Esquela‐Kerscher, A. Int. J. Biochem. Cell Biol. 2010, 42, 1316–1329; (b) Belli Kullan, J.; Lopes Paim Pinto, D.; Bertolini, E.; Fasoli, M.; Zenoni, S.; Tornielli, G. B.; Pezzotti, M.; Meyers, B. C.; Farina, L.; Pe, M. E.; Mica, E. BMC Genomics 2015, 16, 015–1610.
10 Naqvi, A. R.; Islam, M. N.; Choudhury, N. R.; Haq, Q. M. R. Int. J. Biol. Sci. 2009, 5, 97–117.
11 (a) Brosnan, C. A.; Voinnet, O. Curr. Opin. Cell Biol. 2009, 21, 416–425; (b) Hamilton, A.; Voinnet, O.; Chappell, L.; Baulcombe, D. EMBO J. 2002, 21, 4671–4679.
12 (a) Carthew, R. W.; Sontheimer, E. J. Cell 2009, 136, 642–655; (b) He, S.; Yang, Z.; Skogerbo, G.; Ren, F.; Cui, H.; Zhao, H.; Chen, R.; Zhao, Y. Crit. Rev. Microbiol. 2008, 34, 175–188; (c) Castel, S. E.; Martienssen, R. A. Nat. Rev. Genet. 2013, 14, 100–112.
13 Hammond, S. M.; Bernstein, E.; Beach, D.; Hannon, G. J. Nature 2000, 404, 293–296.
14 (a) Voinnet, O. Cell 2009, 136, 669–687; (b) Moazed, D. Nature 2009, 457, 413–420.
15 Esquela‐Kerscher, A.; Slack, F. J. Nat. Rev. Cancer 2006, 6, 259–269.
16 (a) Tebes, S. J.; Kruk, P. A. Gynecol. Oncol. 2005, 99, 736–741; (b) Singh, S. K.; Hajeri, P. B. Drug Discov. Today 2009, 14, 859–865; (c) Hajeri, P. B.; Singh, S. K. Drug Discov. Today 2009, 14, 851–858; (d) Whitehead, K. A.; Langer, R.; Anderson, D. G. Nat. Rev. Drug Discov. 2009, 8, 129–138.
17 (a) Chen, H.‐L.; Guo, M.‐M.; Tang, H.; Wu, Z.; Tang, L.‐J.; Yu, R.‐Q.; Jiang, J.‐H. Anal. Methods 2015, 7, 2258–2263; (b) Baril, P.; Ezzine, S.; Pichon, C. Int. J. Mol. Sci. 2015, 16, 4947–4972.
18 McCulloch, S. D.; Kunkel, T. A. Cell Res. 2008, 18, 148–161.
19 (a) Melnykov, A. V.; Nayak, R. K.; Hall, K. B.; Van Orden, A. Biochemistry 2015, 54, 1886–1896; (b) Kent, J. L.; McCann, M. D.; Phillips, D.; Panaro, B. L.; Lim, G. F. S.; Serra, M. J. RNA 2014, 20, 825–834.
20 (a) Breaker, R. R.; Joyce, G. F. Chem. Biol. 2014, 21, 1059–1065; (b) Lease, R. A.; Arluison, V.; Lavelle, C. Front. Life Sci. 2012, 6, 19–32; (c) Wachter, A. Trends Genet. 2014, 30, 172–181.
21 Mignone, F.; Gissi, C.; Liuni, S.; Pesole, G. Genome Biol. 2002, 3, reviews0004.0001–reviews0004.0010.
22 Deiorio‐Haggar, K.; Anthony, J.; Meyer, M. M. RNA Biol. 2013, 10, 1180–1184.
23 (a) Thapar, R. ACS Chem. Biol. 2015, 10, 652–666; (b) Laalami, S.; Zig, L.; Putzer, H. Cell. Mol. Life Sci. 2014, 71, 1799–1828; (c) Penno, C.; Sharma, V.; Coakley, A.; Motherway, M. O. C.; van Sinderen, D.; Lubkowska, L.; Kireeva, M. L.; Kashlev, M.; Baranov, P. V.; Atkins, J. F. Proc. Natl. Acad. Sci. U. S. A. 2015, 112, E1984–E1993.
24 Krishnan, Y.; Simmel, F. C. Angew. Chem. Int. Ed. Engl. 2011, 50, 3124–3156.
25 Nowakowski, J.; Tinoco Jr, I. Semin. Virol. 1997, 8, 153–165.
26 (a) Sehdev, P.; Crews, G.; Soto, A. M. Biochemistry 2012, 51, 9612–9623; (b) Kim, C.‐H.; Tinoco Jr, I. Proc. Natl. Acad. Sci. U. S. A. 2000, 97, 9396–9401.
27 Lilley, D. M. RNA 2004, 10, 151–158.
28 (a) Grigg, J. C.; Ke, A. RNA Biol. 2013, 10, 1761–1764; (b) Leonova, E. I.; Baranov, M. V.; Galzitskaya, O. V. Mol. Biol. 2012, 46, 34–46.
29 Pan, J.; Deras, M. L.; Woodson, S. A. J. Mol. Biol. 2000, 296, 133–144.
30 Thapar, R.; Denmon, A. P.; Nikonowicz, E. P. Wiley Interdiscip. Rev. RNA 2014, 5, 49–67.
31 Bouchard, P.; Legault, P. Biochemistry 2014, 53, 258–269.
32 (a) Stephenson, W.; Wan, G.; Tenenbaum, S. A.; Li, P. T. X. J. Vis. Exp. 2014, e51542; (b) Li, P. T. X.; Bustamante, C.; Tinoco Jr, I. Proc. Natl. Acad. Sci. U. S. A. 2006, 103, 15847–15852.
33 Sinck, L.; Richer, D.; Howard, J.; Alexander, M.; Purcell, D. F. J.; Marquet, R.; Paillart, J. C. RNA 2007, 13, 2141–2150.
34 Watrin, M.; Dausse, E.; Lebars, I.; Rayner, B.; Bugaut, A.; Toulme, J.‐J. Methods Mol. Biol. 2009, 535, 79–105.
35 (a) Neidle, S.; Parkinson, G. N. Curr. Opin. Struct. Biol. 2003, 13, 275–283; (b) Simonsson, T. Biol. Chem. 2001, 382, 621–628.
36 Bochman, M. L.; Paeschke, K.; Zakian, V. A. Nat. Rev. Genet. 2012, 13, 770–780.
37 Arnott, S.; Chandrasekaran, R.; Marttila, C. M. Biochem. J. 1974, 141, 537–543.
38 (a) Burge, S.; Parkinson, G. N.; Hazel, P.; Todd, A. K.; Neidle, S. Nucleic Acids Res. 2006, 34, 5402–5415; (b) Campbell, N. H.; Neidle, S. Met. Ions Life Sci. 2012, 10, 119–134.
39 (a) Agarwala, P.; Pandey, S.; Maiti, S. Org. Biomol. Chem. 2015, 13, 5570–5585; (b) Tang, C.‐F.; Shafer, R. H. J. Am. Chem. Soc. 2006, 128, 5966–5973.
40 Phan, A. T. FEBS J. 2010, 277, 1107–1117.
41 (a) Collie, G. W.; Parkinson, G. N. Chem. Soc. Rev. 2011, 40, 5867–5892; (b) Xu, Y. Chem. Soc. Rev. 2011, 40, 2719–2740.
42 Sissi, C.; Gatto, B.; Palumbo, M. Biochimie 2011, 93, 1219–1230.
43 Bugaut, A.; Balasubramanian, S. Nucleic Acids Res. 2012, 40, 4727–4741.
44 (a) Guéron, M.; Leroy, J.‐L. Curr. Opin. Struct. Biol. 2000, 10, 326–331; (b) Day, H. A.; Pavlou, P.; Waller, Z. A. Bioorg. Med. Chem. 2014, 22, 4407–4418.
45 Tan, Z.‐J.; Chen, S.‐J. Biophys. J. 2006, 90, 1175–1190.
46 Lane, A. N.; Chaires, J. B.; Gray, R. D.; Trent, J. O. Nucleic Acids Res. 2008, 36, 5482–5515.
47 Hanke, A. Biochem. Soc. Trans. 2013, 41, 639–645.
48 Horton, H. R. Principles of Biochemistry, Pearson Prentice Hall, Upper Saddle River, NJ, 2006.
49 Knez, K.; Spasic, D.; Janssen, K. P. F.; Lammertyn, J. Analyst 2014, 139, 353–370.
50 Lesnik, E. A.; Freier, S. M. Biochemistry 1995, 34, 10807–10815.
51 Ke, C.; Humeniuk, M.; S‐Gracz, H.; Marszalek, P. E. Phys. Rev. Lett. 2007, 99, 018302.


2.2 Lipids

Carole Aimé and Thibaud Coradin

Sorbonne Universités, UPMC Univ Paris 06, Collège de France, UMR CNRS 7574, Laboratoire de Chimie de la Matière Condensée de Paris, Paris, France

Bionanocomposites: Integrating Biological Processes for Bioinspired Nanotechnologies, First Edition. Edited by Carole Aimé and Thibaud Coradin. © 2017 John Wiley & Sons, Inc. Published 2017 by John Wiley & Sons, Inc.

2.2.1 Lipid Self‐Assembly

Lipids are amphiphilic molecules, that is, they bear a hydrophobic part associated with a hydrophilic moiety. This gives lipids the ability to self‐assemble spontaneously into a variety of structures in aqueous solution. The major forces that govern the self‐assembly of amphiphiles into well‐defined structures such as membrane bilayers derive from the hydrophobic attraction. Nonpolar moieties disrupt the isotropic hydrogen bonding of water, causing an entropic loss at their interface with water. As a result, nonpolar molecules tend to aggregate to minimize this interface. In parallel, the hydrophilic, ionic, or steric repulsion of the headgroups requires that they remain in contact with water [1, 2]. A minimum number of molecules is required to reduce the water–hydrophobic interface effectively and to trigger the highly cooperative transition from soluble molecules to the self‐assembled state. The corresponding threshold is the so‐called critical aggregation concentration (cac) or, for micelles, the critical micellization concentration (cmc). Several geometries are possible, including spherical or cylindrical micelles, vesicles, and lamellar layers, depending largely on the molecular geometry of the amphiphile and the resulting packing considerations (Figure 2.7) [1, 3, 4].

In a biological context, the characteristics of lipid molecules have several key consequences. Lipid self‐assembly enables cells to segregate their internal constituents from the external environment. The same principle acts at the subcellular level to assemble the membranes surrounding each cellular organelle [5]. Most biological membrane lipids are double‐chained phospholipids (i.e., bearing a phosphate‐derived polar head) or glycolipids (i.e., bearing a sugar‐based head), with 16–18 carbons per chain, one of which is unsaturated or branched. Because of the unsaturation or branching, the membranes are in the fluid state at physiological temperatures. Hence, biological membranes are fluid and dynamic structures that can easily deform, in which lipids can accommodate each other as well as other molecules. These properties allow various solute molecules to pass and proteins to diffuse [1], and have consequences for the dynamics of membranes [6], for example in intermediates of membrane fusion [7]. In addition, the rich variety of phase behavior of multicomponent lipid mixtures is the basis for the formation of specialized microdomains [8–10]. Indeed, biological membranes exhibit chemically distinct domains that give rise to lateral heterogeneity. These localized membrane environments play an important role in nature, notably in directing the partitioning of various membrane proteins and lipid species such as cholesterol, which may play a major role in domain separation at the nanometer scale [8].

Figure 2.7  Schematic representation of the relationship between the packing parameter p and the self‐assembling properties of amphiphiles [Source: Adapted from Oda [3]. Reproduced with permission of Springer]. [Schematic: the packing parameter is p = v/(a0·l), with a0, l, and v marked on the amphiphile; panels show a spherical micelle (p < 1/3, ~3 nm), a cylindrical micelle (1/3 < p < 1/2), a vesicle (1/2 < p < 1), and a lamellar layer (p ~ 1).]
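The relationship between the packing parameter and the preferred aggregate, as summarized in Figure 2.7, can be sketched in a few lines of code. This is a minimal illustration, assuming the usual reading of the symbols (v, volume of the hydrophobic part; a0, headgroup area at the interface; l, length of the hydrophobic chain). The function names, the tolerance used to decide when p is "about 1", and the numerical values are hypothetical and chosen only for the example.

```python
def packing_parameter(v, a0, l):
    """Packing parameter p = v / (a0 * l), in consistent units (e.g., nm3, nm2, nm)."""
    if v <= 0 or a0 <= 0 or l <= 0:
        raise ValueError("v, a0 and l must be positive")
    return v / (a0 * l)


def preferred_assembly(p, tol=0.05):
    """Map p to the aggregate shapes of Figure 2.7.

    `tol` is an arbitrary (assumed) tolerance for deciding that p ~ 1.
    """
    if p <= 0:
        raise ValueError("p must be positive")
    if abs(p - 1.0) <= tol:
        return "lamellar layer"
    if p < 1 / 3:
        return "spherical micelle"
    if p < 1 / 2:
        return "cylindrical micelle"
    if p < 1.0:
        return "vesicle"
    return "not covered by Figure 2.7"


# Hypothetical amphiphiles: (v in nm3, a0 in nm2, l in nm).
examples = {
    "single-chain surfactant": (0.35, 0.65, 1.7),
    "double-chain lipid": (1.0, 0.7, 1.8),
}
for name, (v, a0, l) in examples.items():
    p = packing_parameter(v, a0, l)
    print(f"{name}: p = {p:.2f} -> {preferred_assembly(p)}")
```

The sharp thresholds are a simplification: in practice the boundaries between regimes are gradual and depend on conditions such as ionic strength and temperature, which change the effective headgroup area.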

2.2.2 Structural Diversity of Lipids

Lipids compose a complex molecular class with a rich structural diversity [11] that may be divided into eight categories: fatty acyls, glycerolipids, glycerophospholipids, sphingolipids, sterol lipids, prenol lipids, saccharolipids, and polyketides (Figure 2.8) containing distinct classes and subclasses of molecules. This classification covers eukaryotic and prokaryotic sources and is equally applicable to archaeal and synthetic lipids [12].

Figure 2.8  Representative structure for each of the eight lipid classes [12]. [Structures shown: fatty acyls (FA): hexadecanoic acid; glycerolipids (GL): 1-hexadecanoyl-2-(9Z-octadecenoyl)-sn-glycerol; glycerophospholipids (GP): 1-hexadecanoyl-2-(9Z-octadecenoyl)-sn-glycero-3-phosphocholine; sphingolipids (SP): N-(tetradecanoyl)-sphing-4-enine; sterol lipids (ST): cholest-5-en-3β-ol; prenol lipids (PR): 2E,6E-farnesol; saccharolipids (SL): UDP-3-O-(3R-hydroxytetradecanoyl)-α-N-acetylglucosamine; polyketides (PK): aflatoxin B1.]

2.2.2.1  Fatty Acyls (FA)

The fatty acyl structure represents the major building block of lipids and is characterized by a repeating series of methylene groups that impart its hydrophobic character. Derivatives of the straight‐chain saturated fatty acids containing a terminal carboxylic acid (Figure 2.8) comprise methyl substituents, unsaturations, and heteroatoms of oxygen, halogen, nitrogen, or sulfur. Cyclic fatty acids, as well as heterocyclic rings containing oxygen or nitrogen, are also found in nature, together with fatty acid esters and long‐chain ethers (Figure 2.9).

Figure 2.9  Representative structures for fatty acyls. [Structures shown: methyl-branched fatty acids: 17-methyl-6Z-octadecenoic acid; lactones: 11-undecanolactone; methoxy fatty acids: 2-methoxy-5Z-hexadecenoic acid; thia fatty acids: R-lipoic acid (1,2-dithiolane-3R-pentanoic acid); docosanoids (neuroprostanes): 4S-hydroxy-8-oxo-(5E,9Z,13Z,16Z,19Z)-neuroprostapentaenoic acid-cyclo[7S,11S]; wax monoesters: 1-hexadecyl hexadecanoate.]
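For readers who handle lipid data computationally, the eight categories introduced at the start of Section 2.2.2 can be captured in a small lookup table. The sketch below simply pairs each abbreviation with its category name and the representative compound shown in Figure 2.8; the dictionary layout and the helper `describe_category` are hypothetical conveniences, not part of any standard library or database interface.

```python
# Category abbreviation -> (category name, representative compound of Figure 2.8).
# Greek letters in compound names are spelled out ("beta", "alpha") for portability.
LIPID_CATEGORIES = {
    "FA": ("fatty acyls", "hexadecanoic acid"),
    "GL": ("glycerolipids", "1-hexadecanoyl-2-(9Z-octadecenoyl)-sn-glycerol"),
    "GP": ("glycerophospholipids",
           "1-hexadecanoyl-2-(9Z-octadecenoyl)-sn-glycero-3-phosphocholine"),
    "SP": ("sphingolipids", "N-(tetradecanoyl)-sphing-4-enine"),
    "ST": ("sterol lipids", "cholest-5-en-3beta-ol"),
    "PR": ("prenol lipids", "2E,6E-farnesol"),
    "SL": ("saccharolipids",
           "UDP-3-O-(3R-hydroxytetradecanoyl)-alpha-N-acetylglucosamine"),
    "PK": ("polyketides", "aflatoxin B1"),
}


def describe_category(abbreviation):
    """Return a one-line description for a category abbreviation (hypothetical helper)."""
    key = abbreviation.upper()
    name, example = LIPID_CATEGORIES[key]
    return f"{key}: {name} (e.g., {example})"


for code in LIPID_CATEGORIES:
    print(describe_category(code))
```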

2.2.2.2  Glycerolipids (GL)

The glycerolipids essentially encompass all glycerol‐containing lipids (except glycerophospholipids). This category is dominated by the mono‐, di‐, and tri‐substituted glycerols, together with glycerolglycans, which bear one or more sugar residues attached to glycerol via a glycosidic linkage, and macrocyclic ether lipids (Figure 2.10). Over the last 10 years, new glycerol dialkyl glycerol tetraether (GDGT) lipids have been identified in the archaea that exhibit specific stereochemistry and/or alkyl chain composition (such as the insertion of cyclopentane rings) [13].

Figure 2.10  Representative structures for glycerolipids. [Structures shown: monoradylglycerols (monoacylglycerols): 1-dodecanoyl-sn-glycerol; diradylglycerols (di-glycerol tetraethers): caldarchaeol; triradylglycerols (triacylglycerols): 1-dodecanoyl-2-hexadecanoyl-3-octadecanoyl-sn-glycerol; diradylglycerols (diacylglycerol glycans): 1,2-di-(9Z,12Z,15Z-octadecatrienoyl)-3-O-β-D-galactosyl-sn-glycerol.]

2.2.2.3  Glycerophospholipids (GP)

Glycerophospholipids form a separate category because of their abundance and importance. They are ubiquitous in nature and are key components of the lipid bilayer of cells. They also serve as metabolic fuels and signaling molecules. Phospholipids may be subdivided based on the nature of the polar headgroup, notably the presence of a second or third glycerol unit and the substituents on the glycerol backbone. Typically, there are alkyl‐linked glycerophospholipids, as well as dialkylether variants, and a separate class, called oxidized glycerophospholipids, in which one or more of the side chains have been oxidized. Phosphatidylcholine (PC) is a rather cylindrical glycerophospholipid that represents 50% of the cellular lipids in animal cells [11]. It carries a zwitterionic phosphocholine headgroup on a glycerol with two fatty acyl chains (diacylglycerol (DAG)), one of which is usually unsaturated, yielding fluid bilayers (Figure 2.11). In this category, phosphatidylethanolamine (PE) constitutes 20 mol% in most membranes. It has a small headgroup and a conical shape and creates a stress in the bilayer: the PE‐containing monolayer has a tendency to adopt a negative curvature. Phosphatidylserine (PS) appears on the cell surface during apoptosis and blood coagulation, and phosphatidylinositol (PI) is the basis for the phosphoinositides, phosphorylated derivatives whose signaling functions depend on the number and position of the phosphates on the inositol ring (Figure 2.11).

Figure 2.11  Representative structures for glycerophospholipids. [Structures shown: phosphatidylcholine, phosphatidylserine, phosphatidylethanolamine, and phosphatidylinositol.]

2.2.2.4  Sphingolipids (SP)

Sphingolipids share a sphingoid base backbone that is synthesized de novo from serine and a long‐chain fatty acyl coenzyme A (CoA) (i.e., a compound formed by association of a fatty acid with CoA). Major classes can be distinguished: notably the sphingoid bases and their simple derivatives, the sphingoid bases with an amide‐linked fatty acid (e.g., ceramides), and more complex sphingolipids with headgroups that are attached via phosphodiester linkages (the phosphosphingolipids) or via glycosidic bonds (the glycosphingolipids (GSLs)) (Figure 2.12). This structural variation has functional significance; for example, sphingoid bases in the dermis have additional hydroxyls that can interact with neighboring molecules, thereby strengthening the permeability barrier of the skin. Sphingomyelin (SM) contains a phosphocholine head, like PC, but has a hydrophobic ceramide backbone consisting of a sphingosine tail and one saturated fatty acid. In cells, SM tends to order membranes via its straight chains and its high affinity for the flat ring structure of cholesterol [11].

Figure 2.12  Representative structures for sphingolipids. [Structures shown: sphingomyelin; sphinganines: sphinganine; phosphonosphingolipids: N-(tetradecanoyl)-sphing-4-enine-1-(2-aminoethylphosphonate); neutral glycosphingolipids (simple Glc series): Glcβ-Cer(d18:1/12:0).]

2.2.2.5  Sterol Lipids (ST)

The sterol category is subdivided primarily on the basis of biological function. The sterols, of which cholesterol (Figure 2.8) and its derivatives are the most widely studied in mammalian systems, constitute an important component of membrane lipids, along with the glycerophospholipids and SMs. The steroids, which also contain the same fused four‐ring core structure, have different biological roles as hormones and signaling molecules (see estrogens, Figure 2.13). These are subdivided on the basis of the number of carbons in the core skeleton.

2.2.2.6  Prenol Lipids (PR)

The simple isoprenoids (linear alcohols, diphosphates, etc.) are formed by the successive addition of C5 units (Figure 2.8), and the resulting carbon number is the basis for their classification (e.g., vitamin A belongs to the C20 isoprenoids, Figure 2.13), with a polyterpene subclass for structures with more than 40 carbons. Another biologically important class of molecules, including vitamins E and K (Figure 2.13), is characterized by an isoprenoid tail attached to a quinonoid core of nonisoprenoid origin. Polyprenols and their phosphorylated derivatives play important roles, notably in the transport of oligosaccharides across membranes.
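The carbon‐count classification of isoprenoids described above lends itself to a very small worked example. The sketch below counts carbons as five per C5 unit and applies the "more than 40 carbons" rule for polyterpenes; the function name and the returned label format are hypothetical, chosen only for illustration.

```python
def isoprenoid_class(n_units):
    """Return (carbon count, class label) for an isoprenoid built from C5 units.

    Follows the rule stated in the text: classes are named after the carbon
    number, and structures with more than 40 carbons are polyterpenes.
    """
    if n_units < 1:
        raise ValueError("an isoprenoid contains at least one C5 unit")
    carbons = 5 * n_units
    if carbons > 40:
        return carbons, "polyterpene"
    return carbons, f"C{carbons} isoprenoid"


# Farnesol (Figure 2.8) is built from three C5 units; retinol (vitamin A) from four.
for name, units in [("farnesol", 3), ("retinol", 4), ("a nine-unit polyprenol", 9)]:
    carbons, label = isoprenoid_class(units)
    print(f"{name}: {units} x C5 = {carbons} carbons -> {label}")
```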

Figure 2.13  Representative structures for sterol and prenol lipids. [Structures shown: sterol lipids (ST): C18 steroids (estrogens) and derivatives: β-estradiol (1,3,5[10]-estratriene-3,17β-diol); secosteroids (vitamin D2 and derivatives): vitamin D2 ((5Z,7E,22E)-(3S)-9,10-seco-5,7,10(19),22-ergostatetraen-3-ol). Prenol lipids (PR): C20 isoprenoids: retinol (vitamin A); vitamin K: vitamin K2(30), menaquinone-6 (2-methyl-3-hexaprenyl-1,4-naphthoquinone).]

2.2.2.7  Saccharolipids (SL)

Saccharolipids possess fatty acids that are linked directly to a sugar backbone, forming structures that are compatible with membrane bilayers. They can occur as glycan or as phosphorylated derivatives. Additional saccharolipids include fatty‐acylated derivatives of glucose and sucrose (Figure 2.14).

2.2.2.8  Polyketides (PK)

Polyketides are represented by macrocyclic lactones, typically ranging from 14 to 40 atoms in size, as well as by complex aromatic ring systems. Polyketide backbones are often further modified by glycosylation, methylation, hydroxylation, oxidation, and/or other processes (Figure 2.14).

Figure 2.14  Representative structures for saccharolipids and polyketides. [Structures shown: saccharolipids (SL): acylaminosugars (monoacylaminosugars): UDP-3-O-(3R-hydroxy-tetradecanoyl)-GlcN; acylaminosugars (tetraacylaminosugars): lipid IVA. Polyketides (PK): macrolide polyketides: 6-deoxyerythronolide B; aromatic polyketides: griseorhodin A.]

2.2.3 Lipid Synthesis and Distribution

The main lipid biosynthetic organelle is the endoplasmic reticulum (ER), which produces the bulk of the structural phospholipids and sterols that are rapidly transported to other organelles. As a result, the lipid composition of different membranes varies throughout the cell.

The major glycerophospholipids assembled in the ER are PC, PE, PI, PS, and phosphatidic acid (PA). In addition, the ER synthesizes ceramides (Cer), galactosylceramide (GalCer), cholesterol (CHOL) in mammals, and ergosterol (ERG) in yeast. Both the ER and lipid droplets participate in steryl ester and triacylglycerol (TG) synthesis. The Golgi lumen is the site of synthesis of SM, complex GSLs, and yeast inositol sphingolipid (ISL). PC is also synthesized in the Golgi and may be coupled to protein secretion at the level of its DAG precursor. Mitochondria and a few other organellar membranes also contribute to the generation of the lipid spectrum in cells (Table 2.2) [15]. Most lipids are moved from the site of synthesis throughout the cell by membrane trafficking or other transfer mechanisms involving, for example, the formation of membrane contacts. Sphingolipids and sterols are continuously moved from their site of synthesis in the ER to the Golgi. Then, they are sorted from the trans‐Golgi network into vesicular carriers for transport to the plasma membrane [14, 16]. As a result, the concentration of sterols and sphingolipids increases from the ER to the cell surface.
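The statement that sterol content rises from the ER to the cell surface can be checked against the sterol‐to‐phospholipid ratios reported in Table 2.2. The snippet below is a minimal sketch using the mammalian CHOL/PL values; it assumes that the single ratio listed for the Golgi refers to cholesterol, and the variable and function names are hypothetical.

```python
# CHOL/PL molar ratios (mammals) as listed in Table 2.2.
CHOL_PER_PL = {
    "endoplasmic reticulum": 0.15,
    "Golgi": 0.2,  # assumption: the single Golgi value is the CHOL/PL ratio
    "plasma membrane": 1.0,
}

# Order in which sterols travel from their site of synthesis to the cell surface.
SECRETORY_ROUTE = ["endoplasmic reticulum", "Golgi", "plasma membrane"]


def is_strictly_increasing(values):
    """True if every value is larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


ratios = [CHOL_PER_PL[organelle] for organelle in SECRETORY_ROUTE]
for organelle, ratio in zip(SECRETORY_ROUTE, ratios):
    print(f"{organelle}: CHOL/PL = {ratio}")
print("sterol enrichment along the route:", is_strictly_increasing(ratios))
```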

Table 2.2  Lipid synthesis and composition of the different cell membranes [14].

Organelle               Lipids                                                        CHOL/PL   ERG/PL
Endoplasmic reticulum   PC; PE; PI; PS; PA; Cer; GalCer; CHOL; TG                     0.15      0.1
Plasma membrane         Cer; Sph; S1P; DAG; PI4P; PI(4,5)P2; PI(3,4)P2; PI(3,4,5)P3   1         0.5
Mitochondrion           PE; PG; CL; PA                                                0.1       0.1
Late endosomes          PI(3,5)P2; PI3P                                               0.5       0.1
Golgi                   PC; PE; GlcCer; PI4P; ISL; GSL; SM                            0.2

Lipids: site of synthesis of the major phospholipids, as well as of lipids that are involved in signaling and organelle recognition pathways (highlighted in the original table). CHOL/PL and ERG/PL: molar ratio of sterol (cholesterol, CHOL, in mammals and ergosterol, ERG, in yeast) to phospholipid. CL, cardiolipin; GlcCer, glucosylceramide; PG, phosphatidylglycerol; PI(3,4)P2, phosphatidylinositol‐(3,4)‐bisphosphate; PI(3,4,5)P3, phosphatidylinositol‐(3,4,5)‐trisphosphate; PI(3,5)P2, phosphatidylinositol‐(3,5)‐bisphosphate; PI(4,5)P2, phosphatidylinositol‐(4,5)‐bisphosphate; PI3P, phosphatidylinositol‐3‐phosphate; PI4P, phosphatidylinositol‐4‐phosphate; S1P, sphingosine‐1‐phosphate; and Sph, sphingosine.

2.2.4 The Diversity of Lipid Functions

Lipid self‐assembly provides the basis for the formation of the cell membrane. As a result, lipids are directly involved in cellular architecture and in creating specific subcompartments in membranes that support membrane trafficking, the regulation of membrane proteins, and, through the collective action of lipids and proteins, cellular function as a whole [16]. In addition to their structural role in cell membranes, lipids are involved in energy storage and have important signaling functions.

2.2.4.1  Cellular Architecture

Cell membranes self‐assemble into stable two‐dimensional lamellar bilayers from a mixture of lipids [1, 9, 10, 16]. This lipid architecture guarantees the compartmentalization of specific chemical reactions and gives membranes the potential for budding and fusion, which are essential for cell division, biological reproduction, and intracellular membrane trafficking [14]. The most common lipids in animal cells are the two phospholipids PC and PE. Their headgroups carry no net charge, and their interactions are entirely due to steric hydration forces. Therefore, they are fairly insensitive to changes in the ionic environment of the cytoplasm, making them ideal structural building blocks for stable membrane organization [1]. Depending on the ratio of these lipids, the resulting bilayer exhibits varying curvature and flexibility. This property is the basis for the dynamic regulation of the cell membrane structure by the lipid synthesis machinery of cells. In the plasma membrane delineating the cell cytoplasm, the lipid bilayer is highly asymmetric. Its outer leaflet contains mostly PC and sphingolipids, whereas the cytosolic leaflet is occupied by PE, PS, and PI, with cholesterol residing in both leaflets. A membrane translocation machinery, which involves P‐type ATPases, consumes large amounts of ATP to actively maintain this asymmetry (Table 2.3) [14, 16].

2.2.4.2  Lipid Rafts

Membranes are typically crowded with proteins, mainly oligomeric, creating patches, called rafts, and resulting in a high variability in bilayer thickness. Membrane rafts are dynamic, nanometer‐sized sterol‐ and sphingolipid‐enriched protein assemblies dominated by lipid–lipid and lipid–protein interactions [1, 5, 16, 17]. The structural compatibility of sphingolipids, sterols, and raft proteins enables the formation of dynamic subcompartments that function in signaling, trafficking, and modulation of membrane protein activity [16].

2.2.4.3  Energy Storage

Lipids are used for energy storage, principally as TG and steryl esters, in lipid droplets. These function primarily as anhydrous reservoirs for the efficient storage of caloric reserves and as caches of fatty acid and sterol components that are needed for membrane biogenesis [14].

Table 2.3  Synthetic view of the relationships between lipid diversity and membrane properties and functions.

                   Lipid composition                 Membrane properties                          Functions
Bacteria           Mainly phosphatidylethanolamine   Robust                                       Incorporation of membrane proteins
                                                     Different shapes                             Membrane budding
Eukaryotes         Glycerophospholipids              Robust                                       Incorporation of membrane proteins
                                                     Different shapes                             Complex organelle morphology
                   Sphingolipids, sterols            Vesicular trafficking                        Establishment and maintenance of distinct organelles
Higher organisms   Tissue‐specific sphingolipids     Complex and specific cellular architecture   Specific functions depending on the cell type

2.2.4.4  Regulating Membrane Proteins by Protein–Lipid Interactions

Many membrane proteins are modified by glycosyl PI anchors or sterol, myristoyl, palmitoyl, or prenyl moieties, driving their association with membranes. In addition, many proteins are known to bear a specific lipid‐binding motif that may have structural and regulatory implications [14, 16]. Typically, protein–lipid interactions play a key role in controlling protein insertion and folding processes and are involved in the oligomerization of multisubunit complexes [18].

2.2.4.5  Signaling Functions

Lipids can act as first and second messengers in signal transduction and molecular recognition processes. The degradation of amphipathic lipids allows for bipartite signaling, which can be transmitted within a membrane by hydrophobic portions of the molecule and also propagated through the cytosol by soluble (polar) portions of the molecule [14].

2.2.5 ­Lipidomics Lipid diversity has an utmost functional importance in cell membranes. A considerable part of our genome is required to synthesize, metabolize, and regulate this diversity. Nevertheless, we are far from understanding the biological significance of this compositional complexity [16]. To this aim, it is necessary to develop a systems‐level analysis of lipids and their interacting partners, called lipidomics. This new approach not only aims to map all lipids of a cell or address lipids synthesis and distribution in cells but also integrates the collective capability of lipids and proteins in cell membranes [16]. Lipidomics gather an important number of different techniques to understand lipid functions at the system level. Typically, recent advances have combined mass spectrometry, chromatography, NMR spectroscopy, and computational strategies for the investigation of the dynamics of lipids metabolism, the trafficking and partitioning into specialized subcellular compartments, the transbilayer movement of lipids, and the dynamics of protein–lipid i­nteractions [6, 11]. Because deregulated lipid metabolism is involved in many human disease including fatty liver and lipotoxic‐induced insulin resistance, Alzheimer’s d­isease, cancer, inflammation, and atherosclerosis [6, 11, 19], improved lipid analysis should benefit molecular medicine and nutritional research. Indeed, lipidomics appears as a promising area of biomedical research, with a variety of applications in drug and biomarker development [6]. In the context of a whole cell, the complex relationships between metabolites will be unraveled by viewing them as an integrated system, the “interactome,” including genes, transcripts, proteins, and metabolites to fully describe cellular functioning [19, 20].

References

1 Israelachvili, J. Intermolecular & Surface Forces, 2nd ed.; Academic Press, Inc., San Diego, 1992.
2 Chen, I. A.; Walde, P. Cold Spring Harb. Perspect. Biol. 2010, 2, a002170.
3 Oda, R. Safin gels with amphiphilic molecules. Weiss, R. G.; Terech, P., ed. Molecular Gels. Springer, Dordrecht, the Netherlands, 2005.
4 (a) Israelachvili, J. N.; Mitchell, D. J.; Ninham, B. W. J. Chem. Soc., Faraday Trans. 2 1976, 72, 1525–1568; (b) Israelachvili, J. N.; Mitchell, D. J.; Ninham, B. W. Biochim. Biophys. Acta 1977, 470, 185–201.
5 Simons, K.; Sampaio, J. L. Cold Spring Harb. Perspect. Biol. 2011, 3, a004697.
6 Wenk, M. R. Nat. Rev. Drug Discov. 2005, 4, 594–610.
7 Chernomordik, L.; Kozlov, M. M.; Zimmerberg, J. J. Membr. Biol. 1995, 146, 1–14.
8 Feigenson, G. W.; Buboltz, J. T. Biophys. J. 2001, 80, 2775–2788.
9 Luzzati, V. Curr. Opin. Struct. Biol. 1997, 7, 661–668.
10 Almsherqi, Z. A.; Kohlwein, S. D.; Deng, Y. J. Cell Biol. 2006, 173, 839–844.
11 Yetukuri, L.; Ekroos, K.; Vidal‐Puig, A.; Oresic, M. Mol. BioSyst. 2008, 4, 121–127.
12 Fahy, E.; Subramaniam, S.; Brown, H. A.; Glass, C. K.; Merrill, Jr., A. H.; Murphy, R. C.; Raetz, C. R. H.; Russell, D. W.; Seyama, Y.; Shaw, W.; Shimizu, T.; Spener, F.; van Meer, G.; Van Nieuwenhze, M. S.; White, S. H.; Witztum, J. L.; Dennis, E. A. J. Lipid Res. 2005, 46, 839–861.
13 Chong, P.L.‐G. Chem. Phys. Lipids 2010, 163, 253–265.
14 van Meer, G.; Voelker, D. R.; Feigenson, G. W. Nat. Rev. Mol. Cell Biol. 2008, 9, 112–124.
15 Engelman, D. M. Nature 2005, 438, 578–580.
16 Shevchenko, A.; Simons, K. Nat. Rev. Mol. Cell Biol. 2010, 11, 593–598.
17 Lingwood, D.; Simons, K. Science 2010, 327, 46–50.
18 Hunte, C.; Richers, S. Curr. Opin. Struct. Biol. 2008, 18, 406–411.
19 van Meer, G. EMBO J. 2005, 24, 3159–3165.
20 Dennis, E. A. Proc. Natl. Acad. Sci. U. S. A. 2009, 106, 2089–2090.


2.3 Carbohydrates

Mirjam Czjzek
Laboratory of Integrative Biology of Marine Models, Station Biologique de Roscoff, University Sorbonne Paris VI and CNRS, Roscoff, France

Bionanocomposites: Integrating Biological Processes for Bioinspired Nanotechnologies, First Edition. Edited by Carole Aimé and Thibaud Coradin. © 2017 John Wiley & Sons, Inc. Published 2017 by John Wiley & Sons, Inc.

2.3.1 Introduction

Among the four classes of biomacromolecules, carbohydrates remain the least studied and understood in terms of occurrence, diversity, and role. This is mainly because we lack well‐developed sequencing techniques analogous to protein or nucleotide sequencing methods. Nevertheless, owing to their important biological roles and features, their chemical nature, variability, and heterogeneity have been, and continue to be, the subject of many biochemical studies. Carbohydrates are certainly the most diverse class of biomacromolecules. To illustrate this diversity, R. A. Laine [1] calculated all possible oligosaccharide isomers, both branched and linear, that theoretically exist for a reducing hexasaccharide, yielding 1.05 × 10¹² structures, as compared with 4096 (4⁶) oligonucleotides and 6.4 × 10⁷ (20⁶) peptides having the same number of “units.”

Carbohydrates and glycoconjugates have many biological functions that can be categorized into three major areas: (i) energy storage (e.g., starch in plants, glycogen in animals and fungi, and laminarin in brown algae); (ii) structural frameworks, both in the extracellular matrix of multicellular organisms and as exopolysaccharides surrounding unicellular organisms, forming rigid (e.g., cellulose and hemicelluloses) or gel networks (e.g., agar, pectin, and bacterial exopolysaccharides); and (iii) higher order functions such as molecular targeting, signaling, regulation, and cell–cell recognition (e.g., glycoproteins and proteoglycans on the cell surface). Most of these higher order biological functions are mediated via carbohydrate–protein interactions, which decode this “language of sugars,” and the large diversity of carbohydrate nature and structure goes hand in hand with a large diversity of proteins and enzymes [2] that specifically bind, recognize, degrade, and synthesize a given carbohydrate structure. A detailed description of the proteins involved in carbohydrate interactions is far beyond the scope of this chapter; for further reading I recommend reviews such as Refs. [3, 4]. Notably, the sequences of these proteins and enzymes are classified in a large “Carbohydrate Active EnZyme” database, the CAZy database (http://www.cazy.org) [5], which is constantly updated [6] and holds a large amount of information about these sequences.

In this chapter we describe current knowledge about carbohydrate structure–function relationships as systematically as possible, covering the most important, though certainly not all, of the naturally occurring carbohydrate structures reported to date.
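The sequence counts quoted in the introduction for linear oligomers follow from simple combinatorics. The short Python sketch below reproduces only the oligonucleotide and peptide figures. The function name and the alphabet sizes (4 nucleotides, 20 standard amino acids) are assumptions made for illustration. The oligosaccharide count from Ref. [1] is not recomputed here, because it additionally depends on linkage position, anomeric configuration, ring size, and branching.

```python
def linear_sequences(alphabet_size: int, length: int) -> int:
    """Number of distinct linear sequences of a given length."""
    return alphabet_size ** length


HEXAMER = 6  # same number of "units" as the hexasaccharide in the text

# Assumed alphabets: 4 nucleotides and 20 standard amino acids.
oligonucleotides = linear_sequences(4, HEXAMER)
peptides = linear_sequences(20, HEXAMER)

print(oligonucleotides)   # 4096
print(f"{peptides:.1e}")  # 6.4e+07
```

Even for the largest of these alphabets, the number of linear hexamers stays several orders of magnitude below the number of hexasaccharide isomers reported in Ref. [1].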

2.3.2 Monosaccharides

Monomeric sugar compounds, or –oses, that occur in biological material range from three‐carbon (glyceraldehyde) to nine‐carbon (sialic acid) polyhydroxy aldehydes (aldoses) or ketones (ketoses) and can be grouped according to their chemical formula, configuration, and stereochemical conformation. Many sugars have the empirical formula (CH₂O)ₙ, which gave rise to the term “carbohydrate.” Numerical prefixes define how many carbons a sugar contains: for example, a triose has three carbon atoms, a pentose five, and a hexose six. The most frequent biological sugar units are pentoses and hexoses, which represent the basic units of many polysaccharides for carbon storage and cell wall structuring. All monosaccharides can adopt a straight‐chain configuration, and –oses with four or more carbons can also adopt heterocyclic ring conformations, five‐membered rings (furanoses) and six‐membered rings (pyranoses) being the most frequent ring structures (Figure 2.15). The number of carbons in a sugar unit does not determine the ring structure, since both pentoses and hexoses can adopt either five‐ or six‐membered rings. The ring structures are not flat but undergo ring puckering [7]: the so‐called “chair” conformation is energetically the most favored for pyranoses, while the most stable furanose conformation is simply called “puckered.” Two chair conformations (⁴C₁ and ¹C₄) are possible for hexopyranoses, placing the hydrogen and OH groups alternately in either equatorial or axial positions. For d‐sugars such as d‐glucopyranose (d‐glucose for short, also abbreviated Glcp), the ⁴C₁ conformation, which places as many hydroxyl groups (or other bulky groups) as possible in the equatorial position, is more favorable, but for other sugars (e.g., the deoxysugar l‐fucose), the ¹C₄ chair is the more favorable conformation [8].
The position of the OH group at the anomeric carbon (C‐1) after ring closure, with respect to the CH₂OH group at the chiral center (C‐5 in d‐glucose and C‐4 in l‐arabinose), determines the α or β nomenclature: the anomer is called “α” when the OH(1) and CH₂OH groups are on opposite sides of the ring plane and “β” when they are on the same side (Figure 2.15).

[Figure 2.15. Panel labels, from left to right: Straight chain; Ball‐and‐stick model; Ring conformation; Haworth projection. Structures shown: β‐D‐glucopyranose and α‐L‐arabinofuranose.]

Figure 2.15 Schematic representation of sugar nomenclature. Both β‐d‐glucose and α‐l‐arabinose are shown in different types of representations that display their stereochemistry and conformations. From left to right: straight‐chain models, ball‐and‐stick models, conformational models, and Haworth projections. The ball‐and‐stick model demonstrates the convention whereby the last asymmetric carbon (marked with a white asterisk on the black carbon) is oriented with its hydrogen atom to the rear. The size of the three asymmetric groups increases clockwise for d‐sugars or counterclockwise for l‐sugars. The conformational model distinguishes the relative axial and equatorial positions of the hydroxyl groups around the ring structure of pyranoses. d‐Glucose is the most stable of the hexoses because every hydroxyl group of the ring and the C‐6 primary alcohol group are in the equatorial position, which is energetically more favorable than other orientations. By convention, the α‐configuration of l‐arabinofuranose is in the “up” equatorial position.

The occurrence and frequency of monosaccharides in living organisms depend on the metabolism and complexity of those organisms. For example, the most common dietary monosaccharides are glucose and fructose, both produced during photosynthesis in plants, and galactose, which is found in milk. These most frequent monosaccharides occur in free form in the cytosol, where they also trigger and/or regulate metabolic processes, but the vast majority of carbohydrate units are found in larger compounds, ranging from oligosaccharides and glycosides to glycoconjugates and polysaccharides (for definitions of these compounds, see Table 2.4). The following sections describe the most common compounds, with a focus on those that are most relevant as tools in nanocomposites.
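The empirical formula (CH2O)n and the numerical prefixes introduced in this section can be illustrated with a minimal Python sketch. The atomic masses are rounded standard values and, like the function names, are assumptions of this example rather than data from the chapter. The sketch applies only to sugars that follow the simple formula; deoxysugars such as fucose and acidic sugars such as sialic acid deviate from it.

```python
# Rounded standard atomic masses in g/mol (an assumption of this sketch).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

# Numerical prefixes named in the text and their carbon counts.
PREFIX_TO_CARBONS = {"triose": 3, "pentose": 5, "hexose": 6}


def sugar_formula(n: int) -> dict:
    """Atom counts for a sugar with the empirical formula (CH2O)n."""
    return {"C": n, "H": 2 * n, "O": n}


def molar_mass(formula: dict) -> float:
    """Molar mass in g/mol for a dictionary of atom counts."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


for name, n in PREFIX_TO_CARBONS.items():
    f = sugar_formula(n)
    print(f"{name}: C{f['C']}H{f['H']}O{f['O']}, {molar_mass(f):.2f} g/mol")
```

For a hexose this gives the familiar composition C6H12O6, shared by glucose, fructose, galactose, and mannose, which differ only in configuration and in the position of the carbonyl group.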

2.3.3 Oligosaccharides

Besides a number of naturally occurring disaccharides, medium‐sized sugar chains ranging from three to a few tens of sugar units are less frequently found in nature in a free, diffusible form. In contrast, the carbohydrate chains found in protein glycosylation typically consist of 2–15 or more units, depending on the nature and phylogenetic position of the respective organism. The role of these complex “protein decorations” has long been underestimated, and we now know that the presence or absence of these glycosyl residues attached to proteins is crucial for their role in recognition and signaling processes and is the basis for their biological functions [9]. The initial, mostly structural view has thus gradually given way to one that considers glycans as bioactive signals sui generis, representing a unique language: the “sugar code” [10]. In the following two sections these intermediate‐sized oligosaccharides are treated separately: (i) the freely and naturally occurring disaccharides and (ii) the oligosaccharide decorations of glycoproteins.

2.3.3.1 Disaccharides

To synthesize and degrade complex glycan and polysaccharide structures, specific metabolic pathways operate in all living organisms. Through these pathways, the basic sugar units, such as glucose, galactose, fructose, or glucuronic acid, are converted into all combinations of carbohydrate structures necessary for the survival of a given organism by specific enzymes that epimerize, hydrolyze, transglycosylate, esterify, and sulfate, to mention just some of the many transformations performed in these reactions. Some disaccharide compounds, which often represent reaction intermediates of these metabolic pathways, have gained additional biological relevance due to their physicochemical or biochemical properties. There are two different types of disaccharides: reducing disaccharides, in which the two units are linked such that a free, reducing hemiacetal unit is still present at one end, and nonreducing disaccharides, in which the components are bound to each other through their anomeric centers so that neither monosaccharide has a free hemiacetal unit. Cellobiose and maltose are examples of reducing disaccharides. Sucrose and trehalose are examples of nonreducing disaccharides (Figure 2.16). Their intermediate position between monosaccharides and polysaccharides makes disaccharides powerful regulatory factors, and as such they are also involved in cellular signaling and used as energy‐storage molecules. In the cell, many of these disaccharides also act as osmolytes, present in the cytosol to control the water content and the cellular pressure. The most frequent naturally occurring disaccharides are listed in Table 2.4.

Table 2.4 List of most frequent mono‐ and disaccharides and definition of glycosyl‐containing compounds.

Some important monosaccharides
Name         Carbon class   Carbonyl type   Abbreviation
Glucose      Hexose         Aldose          Glc
Fructose     Hexose         Ketose          Fru
Galactose    Hexose         Aldose          Gal
Mannose      Hexose         Aldose          Man
Xylose       Pentose        Aldose          Xyl
Arabinose    Pentose        Aldose          Ara
Ribulose     Pentose        Ketose          Rub

Disaccharides   Unit 1      Unit 2      Bond
Sucrose         Glucose     Fructose    α(1 → 2)β
Lactose         Galactose   Glucose     β(1 → 4)
Trehalose       Glucose     Glucose     α(1 → 1)α
Maltose         Glucose     Glucose     α(1 → 4)
Cellobiose      Glucose     Glucose     β(1 → 4)

Compound          Definition
Oligosaccharide   Roughly from 2 to 10 carbohydrate units
Polysaccharide    Approximately hundreds of carbohydrate units
Glycoside         Molecule linking a carbohydrate to a noncarbohydrate moiety, usually a small organic molecule
Glycoconjugate    Complex saccharides: carbohydrates covalently linked to other chemical species
Glycolipid        Carbohydrate linked to lipids
Glycoprotein      Proteins that contain oligosaccharide chains
Proteoglycan      Special type of glycoprotein in which carbohydrate accounts for 95% of the mass of the compound

[Figure 2.16. Structures shown: trehalose (glucose and glucose, α 1,1), maltose (glucose and glucose, α 1,4), sucrose (glucose and fructofuranose, α 1,2′), and lactose (galactose and glucose, β 1,4).]

Figure 2.16 Conformational representations of four common disaccharides. The nature of the individual glycoside units forming each disaccharide is indicated, as well as the type of glycosidic bond that links the two units. When the sugar units are linked through both reducing ends, the relative conformation of both C‐1 carbons is indicated.

2.3.3.2 Protein Glycosylations

Glycosylation is the most common posttranslational modification of proteins [11], and it has been estimated that 70% of the human proteome is glycosylated [12]. Moreover, conjugation of sugars to proteins occurs throughout the entire phylogenetic spectrum. To our knowledge, the protein glycosylation patterns described to date involve 13 different monosaccharides, which can be connected to 8 different amino acids, forming at least 41 types of glycosidic linkages [13, 14]. Depending on the mass ratio of the carbohydrate fraction to the protein fraction, glycoproteins (large proteins decorated with oligosaccharides) are distinguished from proteoglycans (protein scaffolds carrying large carbohydrate chains). The diversity of glycan structures found in cells is not random but is carefully controlled by gene expression [15]. Glycans carried by proteins are classified according to the nature of the covalent linkage to the polypeptide: linked to an asparagine (N‐linked) or to a serine or threonine (O‐linked). N‐linked protein glycosylation is initiated during protein translation in the endoplasmic reticulum, whereas O‐linked glycosylation occurs after translation in the Golgi apparatus. In both cases the glycans are further modified and elaborated in the Golgi apparatus on their way to the cell surface [15]. Mature O‐linked oligosaccharides are often small, the most common having 2–6 sugar residues arranged as linear or short single‐branched oligosaccharides, whereas N‐linked glycans are larger, with 10–20 residues, mostly in highly branched structures with typically 3–5 branches (Figure 2.17). In addition, proteins can also carry very long (up to ~100 sugars) chains of repeating disaccharides linked to serine and threonine residues; such proteins are classified as proteoglycans. Their long anionic sugar chains, substituted with sulfate groups, are termed glycosaminoglycans (GAGs), the best known being heparan sulfate, hyaluronan, and chondroitin sulfate [17].

[Figure 2.17. Upper row, N‐glycosylation: Asn‐linked high‐mannose glycan and mono‐, bi‐, tri‐, and tetra‐antennary glycans. Lower panel: type 1 A/B/O blood group antigens (O antigen, A antigen, B antigen; β1,3 linkages). Symbol key: N‐acetylglucosamine, N‐acetylgalactosamine, sialic acid, fucose, mannose, galactose.]

Figure 2.17 Schematic representation of some selected important human N‐glycosylation and O‐glycosylation patterns. The complexity of branching patterns observed for N‐glycosylation is illustrated in the upper row, while the lower panel shows the major determinants of type 1 blood group O‐glycosylation. The figure was prepared using the software GlycoWorkbench 2 [16].
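The classification of protein‐bound glycans described above can be written down as a small lookup, sketched here in Python. The table covers only the three attachment residues named in the text, although the chapter notes that eight amino acids can be glycosylated, so it is a deliberately simplified scheme. The size ranges are the typical values quoted above, and all names in the code are hypothetical.

```python
# Attachment residue (three-letter code) -> linkage class, as named in the text.
LINKAGE_BY_RESIDUE = {"Asn": "N-linked", "Ser": "O-linked", "Thr": "O-linked"}

# Typical numbers of sugar residues quoted in the text for mature glycans.
TYPICAL_RESIDUES = {"O-linked": (2, 6), "N-linked": (10, 20)}


def glycan_linkage(residue: str) -> str:
    """Return the linkage class for an attachment residue."""
    if residue not in LINKAGE_BY_RESIDUE:
        raise ValueError(f"{residue!r} is not covered by this simplified scheme")
    return LINKAGE_BY_RESIDUE[residue]


def in_typical_range(linkage: str, n_sugars: int) -> bool:
    """Check whether a glycan size falls in the typical range for its class."""
    low, high = TYPICAL_RESIDUES[linkage]
    return low <= n_sugars <= high


print(glycan_linkage("Asn"), in_typical_range("N-linked", 14))
print(glycan_linkage("Thr"), in_typical_range("O-linked", 14))
```

The ranges are typical values only, so a glycan outside them is unusual rather than impossible; the long glycosaminoglycan chains of proteoglycans, for instance, are also attached to serine and threonine.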

2.3.4 Polysaccharides

In terms of biomass, polysaccharides are the most important carbohydrates, and they are also the compounds best suited to the construction of nanocomposites. In general, we speak of polysaccharides for chains containing more than 200 sugar units. These organic carbon–polymer structures are produced through major biological processes and constitute the primary source of energy for life. Thus, the “battle” for energy in nature takes place through the biosynthesis and deconstruction of these major energy‐containing compounds. Plants and algae convert solar energy and CO₂ into organic carbon through photosynthesis; this carbon can then be utilized by heterotrophic organisms as energy‐providing nutrition. These reservoirs of carbon and energy are found for the most part in cell wall polysaccharides [18], extracellular matrices, or, in certain cell lineages, storage polysaccharides [19] that differ from those found in cell walls. We can distinguish between linear and highly branched polysaccharides, neutral and charged ones, and crystalline, gel‐forming, and soluble polysaccharides. Some polysaccharides are made up of a single sugar as repeating unit, giving rise to linear homopolymers (or homoglycans), while others have a disaccharide repeating unit (heteropolymers or heteroglycans), and still others are made up of a regular or statistical distribution of three to four different monosaccharides along the polymer chain. Some polysaccharides are purely linear chains, naturally occurring without any branches, while others bear characteristic branching patterns that specifically involve a certain hydroxyl position of the linear backbone chain. In general (cellulose and chitin being exceptions), polysaccharides are not pure repetitions of one single pattern but so‐called hybrid structures with a more or less regular distribution of at least two different patterns.

Polysaccharides are tethered together with other components, such as proteins and aromatic substances (e.g., lignin or polyphenols), in a complex three‐dimensional arrangement within the cell wall. These biological structures can thus already be considered natural nanocomposites. Cell walls contribute to the functional specialization of cell types. Indeed, the physicochemical and biochemical properties of different polysaccharides are exploited in these natural arrangements, providing materials that perform differently in terms of stiffness, elasticity, shape, deformability, and reactivity. The plant cell wall is a dynamic compartment that needs to change throughout the life of the cell, which also leads to different types of cell walls: the new primary cell wall, the middle lamella forming the interface between primary walls of neighboring cells, and a secondary cell wall building complex structures uniquely suited to the cell’s function [18]. In a simplified description, the cell walls that surround all plant cells are fibrous composites, in which a microfibrillar backbone (Figure 2.18) is tethered together by cross‐linking glycans and embedded in a more gel‐like polysaccharide matrix (for illustrations of cell wall models, see Refs. [20, 21]).

Again, given the diversity and variability of polysaccharide structures and functions, this chapter cannot give a complete overview, and only the most common and abundant polysaccharides, with their occurrence and associated functions, are described in more detail in the following sections.
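The descriptive categories used above (linear or branched, neutral or charged, homoglycan or heteroglycan) can be expressed as a small record type. The Python sketch below is illustrative only: the class, its field names, and the residue labels are hypothetical. Classifying a chain as a heteroglycan whenever its backbone contains more than one kind of monosaccharide is a simplification of the definitions given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Polysaccharide:
    name: str
    backbone_units: tuple  # monosaccharide types found in the backbone
    branched: bool
    charged: bool

    @property
    def composition(self) -> str:
        """Homoglycan if the backbone has a single monosaccharide type."""
        if len(set(self.backbone_units)) == 1:
            return "homoglycan"
        return "heteroglycan"

    def describe(self) -> str:
        shape = "branched" if self.branched else "linear"
        charge = "charged" if self.charged else "neutral"
        return f"{self.name}: {shape}, {charge} {self.composition}"


# Two examples taken from this chapter; the residue labels are shorthand.
cellulose = Polysaccharide("cellulose", ("Glc",), branched=False, charged=False)
alginate = Polysaccharide("alginate", ("ManA", "GulA"), branched=False, charged=True)

print(cellulose.describe())  # cellulose: linear, neutral homoglycan
print(alginate.describe())   # alginate: linear, charged heteroglycan
```

A record of this kind captures only composition. It says nothing about the physical state (crystalline, gel‐forming, or soluble), which depends on chain length, sequence pattern, and environment.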

[Figure 2.18. Panel labels: Plant cells; Cell walls; Cellulose microfibrils in a plant cell wall; Microfibril; Cellulose molecules; β‐Glucose monomer.]

Figure 2.18 Schematic representation describing the presence of cellulose fibrils in the cell wall of green plants. The scheme displays the different scales, starting at the macroscopic level of a leaf and ending at the molecular level, illustrating the hydrogen‐bonding network of intra‐ and interchain interactions stabilizing the crystalline state of cellulose microfibrils.
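The molecular end of the length scale in Figure 2.18 can be checked with a one‐line estimate, sketched here in Python. The chapter states that a single cellulose chain of several thousand glucose units is about 2–3 µm long. The rise of roughly 0.52 nm per glucose residue used below is an assumption of this sketch (it is not given in the text), as are the function and constant names.

```python
# Assumption (not from the text): each glucose residue advances a fully
# extended (1->4)-beta-D-glucan chain by roughly 0.52 nm along the chain axis.
RISE_PER_GLUCOSE_NM = 0.52


def chain_length_um(n_units: int, rise_nm: float = RISE_PER_GLUCOSE_NM) -> float:
    """Approximate contour length of a glucan chain in micrometers."""
    return n_units * rise_nm / 1000.0


for n in (4000, 5000, 6000):
    print(f"{n} units -> {chain_length_um(n):.1f} um")
```

Under this assumption, chains of 4000 to 6000 units come out at roughly 2 to 3 µm, consistent with the order of magnitude given for individual glucan chains.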

2.3.4.1 Cellulose

In land plants, cellulose is the most abundant polysaccharide, accounting for 15–50% of the dry mass of cell walls. Cellulose chains are formed by (1 → 4)‐linked β‐d‐glucose (Glcp) units; that is, the equatorial hydroxyl group at the C‐1 position is linked equatorially to the C‐4 carbon of the preceding glucose unit. For steric reasons, successive sugars must be rotated by almost 180° relative to each other, and the iteration of this linkage produces a nearly linear molecule (Figure 2.18). Cellulose exists in the form of paracrystalline assemblies called microfibrils but can also be found in more amorphous assemblies. Microfibrils consist of several dozen (1 → 4)‐β‐d‐glucan chains hydrogen‐bonded to each other. In most land plants, each microfibril contains, on average, about 36 individual chains, while microfibrils of algae can form either large round cables or flattened ribbons of several hundred chains. Electron microscope images show that microfibrils of angiosperms are between 5 and 12 nm wide. Each single (1 → 4)‐β‐d‐glucan chain may be just several thousand units long (about 2–3 µm), but the individual chains begin and end at different places within the microfibril, so that a microfibril can reach lengths of hundreds of micrometers and contain thousands of individual glucan chains. This structure is analogous to a spool of thread that consists of thousands of individual cotton fibers, each about 2–3 cm long [18] (Figure 2.18).

2.3.4.2 Hemicelluloses

The term “hemicellulose” (meaning “half cellulose”) is often regarded as unsatisfactory, since it designates a diverse group of polymers that are by definition non‐cellulose polysaccharides but contain β‐(1 → 4)‐linked backbones. The polymers most frequently termed hemicelluloses are xyloglucans, xylans, mannans, glucomannans, and mixed‐linkage glucans (MLG), each described in a separate section. These polymers can hydrogen‐bond with cellulose, forming cross‐links between microfibrils and strengthening the cell wall structure.

2.3.4.2.1 Xyloglucan

Xyloglucan has a backbone consisting of β‐(1 → 4)‐linked glucopyranose (Glcp) units, most of which are branched with an α‐(1 → 6)‐linked xylopyranosyl (Xylp) residue in a regular pattern, giving the repeating core structure designated XXXG or XXGG (the nomenclature of the xyloglucan branching pattern was developed by Fry et al. [22]). Some of the xylosyl branches can be further substituted with β‐(1 → 2)‐linked galactopyranose (Galp), which can in turn carry α‐(1 → 2)‐linked l‐fucopyranosyl (Fucp) units [23] (Figure 2.19a). Xyloglucans are highly heterogeneous, and their structure varies between plant species and between different tissues of the same species. These polysaccharides are believed to play a key role in tethering and cross‐connecting adjacent cellulose microfibrils, providing strength to the cell wall, and therefore need to be hydrolyzed and reconnected during cell growth [24].

2.3.4.2.2 Xylan

Xylans are the most abundant hemicellulose polysaccharides in the cell walls of grasses. They form a structurally much more diverse group of polysaccharides than xyloglucans and do not share a unique common core structure. Xylan structures thus vary strongly among plant species and even among different tissues of the same species [25]; nevertheless, the major characteristic xylan structures (Figure 2.19b) found in land plants are:

● Glucuronoxylan (GX; backbone of β‐(1 → 4) Xylp units with single α‐d‐glucuronic acid or O‐2‐methylated α‐d‐glucuronic acid branches; in addition, the backbone xylose units are usually acetylated at O‐2 and/or O‐3)

● Glucuronoarabinoxylan (GAX; backbone of β‐(1 → 4) Xylp units with single α‐d‐glucuronic acid or O‐2‐methylated α‐d‐glucuronic acid and α‐l‐arabinofuranoside branches attached to the O‐2 and O‐3 positions of the xylose backbone units, respectively)

● Arabinoxylan (backbone of β‐(1 → 4) Xylp units with α‐l‐arabinofuranoside branches at positions O‐2 and O‐3 of the xylose backbone)
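The three xylan classes listed above share one backbone and differ in their branches, which makes them easy to hold in a small lookup structure. The Python sketch below is one possible encoding. The field names, the shorthand branch labels, and the key "AX" for arabinoxylan are assumptions of this example; the chapter gives abbreviations only for GX and GAX.

```python
# Field names, branch labels, and the "AX" key are illustrative assumptions.
XYLAN_TYPES = {
    "GX": {
        "name": "glucuronoxylan",
        "backbone": "beta-(1->4)-Xylp",
        "branches": ("GlcA", "Me-GlcA"),
        "note": "backbone usually acetylated at O-2 and/or O-3",
    },
    "GAX": {
        "name": "glucuronoarabinoxylan",
        "backbone": "beta-(1->4)-Xylp",
        "branches": ("GlcA", "Me-GlcA", "Araf"),
    },
    "AX": {
        "name": "arabinoxylan",
        "backbone": "beta-(1->4)-Xylp",
        "branches": ("Araf",),
    },
}


def xylans_with_branch(branch: str) -> list:
    """Return the sorted keys of the xylan classes that carry a given branch."""
    return sorted(
        key for key, entry in XYLAN_TYPES.items() if branch in entry["branches"]
    )


print(xylans_with_branch("Araf"))  # ['AX', 'GAX']
print(xylans_with_branch("GlcA"))  # ['GAX', 'GX']
```

Real xylans are far more heterogeneous than three fixed entries suggest, so such a table is best read as a summary of the idealized classes rather than a description of any particular sample.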

[Figure 2.19. Panels: (a) Xyloglucan, with the backbone units labeled X, X, X, G; (b) Xylan; (c) Pectin, with the domains HG, RG I, and RG II; (d) Chitin; (e) Alginate, with the residues labeled M and G.]

Figure 2.19 Different schematic representations to depict the structure, linkage, and branching of various polysaccharide chains. (a) Schematic representation of the chemical structure of xyloglucan; here the “ideal” sequence of the XXXG structure is represented; notably, the unbranched backbone of this polysaccharide is composed of β‐1 → 4‐linked d‐glucose units. (b) Schematic representation of the chemical structure of the simplest xylan polysaccharide chain, composed of β‐1 → 4‐linked d‐xylose units, which are thus considered a derivative of the β‐1 → 4‐d‐glucose backbone, lacking the C‐6/O‐6 “side chain.” (c) “All‐in‐one” schematic representation of pectin. The three major types of pectin domains, called homogalacturonan (HG), rhamnogalacturonan I (RG I), and rhamnogalacturonan II (RG II), are represented. It is important to note that the different structures shown here are intended only to illustrate some of the major domains found in most pectins rather than definitive structures. For details of the glycosyl units present in the different domains, see the description in the text section “Pectins.” (d) Schematic representation of chitin: again the backbone of this polysaccharide is composed of β‐1 → 4‐linked d‐glucose units, the difference from cellulose being the N‐acetylamino substitution at C‐2 (N‐acetylglucosamine units). (e) Schematic representation of alginate; this polysaccharide chain is composed of mannuronic acid (M) units and their C‐5 epimer, guluronic acid (G); specific patterns of consecutive M or G units are the basis of the physicochemical properties of the various alginates.

Like xyloglucans, xylans are strongly linked to the cellulose microfibrils through non‐covalent interactions, thus contributing to the hemicellulosic network.

2.3.4.2.3 Mannan or Glucomannan

These polysaccharides are often major cell wall components of primitive vascular plants and can also be found in several macro‐algae. As the names indicate, the basic backbone structure consists exclusively of β‐(1 → 4)‐linked mannose in mannan and of both β‐(1 → 4)‐linked glucose and mannose units in the heteropolymer glucomannan. Further variations of these basic structures arise from isolated α‐d‐galactose branches (at O‐6 positions) and/or O‐acetylation at the O‐2 and O‐3 positions.

2.3.4.2.4 Mixed‐Linkage Glucan (MLG)

This polysaccharide, sometimes simply called β‐glucan, contains mixed β‐(1 → 3)‐ and β‐(1 → 4)‐linked d‐glucose units and is thus formed from the same building blocks as cellulose. MLG is present in most vascular plants; it was initially thought to be restricted to the order Poales but has since been found in ancestral plants and green algae [26]. This polysaccharide is believed to act as a storage polysaccharide in grass endosperm and aleurone cell walls. The pattern and regularity of the β‐(1 → 3) linkages with respect to the more frequent β‐(1 → 4) linkages depend on the organism of origin; mixed‐linkage β‐glucans can also be found in fungi, where they are instead called lichenan.

2.3.4.3 Pectins

The term pectin designates a family of polysaccharides; pectin in fact consists of a mixture of heterogeneous, branched, and highly hydrated chains that are rich in d‐galacturonic acid units (GalpA) and thus, in general, also highly charged. The unbranched and charged regions interact with Ca²⁺ cations, forming what is known as the “eggbox structure” [27], in which the carboxyl groups of GalpA from two parallel chains are cross‐linked by ionic bonds. A minimum of nine consecutive unesterified GalpA units is necessary to form a stable quaternary structure [28], and these stretches are responsible for the gel‐forming character of pectins. As a consequence, pectins are extracted using Ca²⁺‐chelating agents, such as ammonium oxalate, EDTA, or EGTA. Their gelling properties are important for their biological functions: determining wall porosity, providing charged surfaces that modulate wall pH and ion balance, and regulating cell–cell adhesion at the middle lamella. Pectins are mostly found in the non‐wood‐forming parts of land plants.

Pectins are generally divided into three main pectic structures or domains, all containing GalpA to a greater or lesser extent: homogalacturonan (HG), rhamnogalacturonan I (RGI), and rhamnogalacturonan II (RGII) [29] (Figure 2.19c). HG is the most abundant pectin polymer (roughly 60% of pectins) and is formed by a linear, regular chain of α‐(1 → 4)‐linked d‐GalpA units that are partially methyl‐esterified at the C‐6 carboxyl position (which hinders the interaction with ions) and occasionally acetyl‐esterified at the O‐2 and/or O‐3 positions.

RGI comes next in terms of abundance (20–35% of pectins). The backbone of this polymeric chain is made up of a repeating disaccharide of galacturonic acid and rhamnopyranose, linked to give a (1 → 4)‐α‐d‐GalpA‐(1 → 2)‐α‐l‐Rhap unit, in which most GalpA residues are acetylated at O‐2 and O‐3. While HG is in general linear and unbranched, RGI has abundant side chains, mainly containing α‐l‐arabinofuranose (α‐l‐Araf) and/or β‐d‐Galp units, that are attached to O‐4 of the backbone Rhap unit. Because of these numerous branches, this domain is also known as the “hairy region” of pectin.

The least abundant pectin structure is RGII. Its backbone is equivalent to that of HG, formed by a regular chain of α‐(1 → 4)‐linked d‐GalpA units, some of which can be methylated. The difference from HG lies in characteristic, rather long, and further branched side chains that contain a number of repeated patterns but also some rare sugar units such as β‐d‐apiofuranosyl (Apif). This rare sugar residue is also responsible for dimerization through the formation of a borate 1 : 2 diol ester that cross‐links two monomeric units of RGII. The borate ester is formed between OH‐2 and OH‐3 of the 3′‐linked apiosyl residues. Overall, RGII contains 12 different glycosyl units connected by more than 20 different glycosidic linkages. Despite this high complexity, RGII structures are highly conserved across a broad spectrum of taxonomically diverse plants [30].

2.3.4.4 Chitin

Chitin is a regular polymer of (1 → 4)‐linked N‐acetyl‐β‐d‐glucosamine units (GlcNAc) that is found in the cell walls of fungi and in the exoskeleton of arthropods such as insects and crustaceans. The structure of chitin was determined as early as 1929 by Karrer and Hofmann [31] and is comparable with that of cellulose, forming crystalline nanofibrils or whiskers. Chitin is, however, a modified polysaccharide and may be described as cellulose with one hydroxyl group (at C‐2) on each monomer replaced by an acetylamino group. The partially deacetylated form of this polysaccharide, named chitosan, is protonated at slightly acidic or neutral pH and thus represents a rather rare form of positively charged polysaccharide. Because their building block GlcNAc (Figure 2.19d) also occurs in human polysaccharides, chitin and chitosan act as biological messengers in cells, stimulating many fundamental processes, such as defensive, symbiotic, and developmental cellular processes in plants, and probably also modulate human cell signals, activating the nervous, immune, cutaneous, and endocrine systems. For all these reasons, many applications in human health and food biotechnology have already been developed for this polysaccharide. Recently, it has become possible to produce pure chitin crystals industrially; these are named “chitin nanofibrils” for their needlelike shape and nanoscale average size (240 × 5 × 7 nm) [32]. Owing to their specific chemical and physical characteristics, chitin nanofibrils may have a range of industrial applications as a basis for nanomaterials.

2.3.4.5 Alginate

Alginate is one of the most important industrially exploited marine polysaccharides and is extracted from brown seaweeds [33]. The industrial applications of alginates are linked to their ability to retain water and consequently to their gelling, viscosifying, and stabilizing properties. Ongoing biotechnological applications are based either on specific biological effects of the alginate molecule itself or on its unique, gentle, and almost temperature‐independent sol/gel transition in the presence of multivalent cations (e.g., Ca2+). The structure of the alginate gelling regions is similar to the “eggbox” structure found in HG pectic gels and requires minimal stretches of guluronic acid residues. Being a family of unbranched binary copolymers, alginates consist of (1 → 4)‐linked β‐d‐mannuronic acid (M) and α‐l‐guluronic acid (G) residues of widely varying sequences (Figure 2.19e). Depending on the sequence, this polysaccharide is described as a true block copolymer composed of large homopolymeric regions of M or G, termed M‐ and G‐blocks, respectively, interspersed with regions of alternating structure (MG blocks).

2.3.4.6 Marine Galactans

Marine galactans is the generic designation of the polysaccharides that are the main components of the cell walls of red marine macroalgae (Rhodophyta). Depending on the taxonomic lineage, these galactans are called agars, which are much more neutral, in some algal species and carrageenans in others [34]. The members of this large family of so‐called hydrocolloids all consist of linear chains of galactose with alternating α‐(1 → 3) and β‐(1 → 4) linkages and are, most often, polyanionic polymers containing a high number of sulfate groups. The ideal disaccharide repeating units of the main industrially used algal galactans are substituted by zero (agarose), one (κ‐), two (ι‐), or three (λ‐carrageenan) sulfate groups. Due to the occurrence of 3,6‐anhydro bridges in the α‐linked galactose residues, agars and κ‐ and ι‐carrageenans form thermoreversible gels in aqueous solutions. This α‐linked dehydrated galactose unit is in the d‐configuration in carrageenans, while it is in the l‐configuration in agars. Different cations promote gelation in carrageenans: K+ is ideal for the formation of strong κ‐carrageenan gels, while Ca2+ is needed for the formation of ι‐carrageenan gels. On the other hand, the more neutral agarocolloids contain many different modifications, such as pyruvate, methyl, and acetyl ester substitutions, and sulfation is also observed in agars [35], although to a much lesser extent than in carrageenans.

2.3.4.7 Storage Polysaccharides: Starch, Glycogen, and Laminarin

Polysaccharides are also used in all living organisms for carbon storage, as an energy source in the case of starvation, or as an energy reservoir for growing embryos. In all cases, the main carbon‐storage polysaccharide is based on glucose residues, but its nature has been subject to major evolutionary events [36], and variations can therefore be seen along the major lineages of the phylogenetic tree of life. Thus, the major storage polysaccharide is starch in plants and glycogen in animals; both are made up of α‐(1 → 4)‐linked glucose backbone chains. Starch is the generic term for plant carbon‐storage polysaccharides, which are generally composed of two major structural components: amylose, formed by linear α‐(1 → 4)‐linked chains of hundreds of glucose units, and amylopectin, which has a regular branching pattern in which chains of 24–30 α‐(1 → 4)‐linked glucose units are branched to each other through α‐(1 → 6) linkages. Most starches contain between 72 and 82% amylopectin and between 18 and 33% amylose [37]. The animal carbon‐storage polysaccharide glycogen is somewhat similar to amylopectin but much more extensively branched. In contrast, the brown algal carbon‐storage polysaccharide, called laminarin, is a vacuolar β‐1,3‐glucan composed of regular β‐(1 → 3)‐linked glucan chains with occasional β‐1,6‐linked branches; these chains are rather short (20–30 glucose units per chain).
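To put the amylopectin branching pattern described above into numbers, the following minimal Python sketch estimates the share of α‐(1 → 6) linkages. It rests on a simplifying assumption that is not stated in the text: the polymer is large enough that each glucose unit contributes one glycosidic linkage, so that one branch per 24–30 units corresponds to a fraction of roughly 1/24 to 1/30. The function name branch_linkage_fraction is hypothetical and used for illustration only.

```python
def branch_linkage_fraction(units_per_branch: int) -> float:
    """Approximate fraction of glycosidic linkages that are alpha-(1->6).

    Assumes a large polymer in which every glucose unit contributes one
    linkage and one branch linkage occurs per `units_per_branch` units.
    """
    if units_per_branch < 2:
        raise ValueError("need at least two glucose units per branch")
    return 1.0 / units_per_branch


if __name__ == "__main__":
    # Branch spacing of amylopectin quoted in the text: 24-30 glucose units.
    for units in (24, 30):
        percent = 100 * branch_linkage_fraction(units)
        print(f"one branch per {units} units: about {percent:.1f}% alpha-(1->6) linkages")
```

Under this assumption, only a few percent of the linkages in amylopectin are branch points, which is consistent with its description as a mostly α‐(1 → 4)‐linked polymer.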

References

1 Laine, R. A. Glycobiology 1994, 4, 759–767.
2 Henrissat, B.; Davies, G. J. Plant Physiol. 2000, 124, 1515–1519.
3 Fushinobu, S.; Albes, V. D.; Coutinho, P. M. Curr. Opin. Struct. Biol. 2013, 23, 652–659.
4 Arnaud, J.; Audfray, A.; Imberty, A. Chem. Soc. Rev. 2013, 42, 4798–4813.
5 Henrissat, B. Biochem. J. 1991, 280, 309–316.
6 Cantarel, B. L.; Coutinho, P. M.; Rancurel, C.; Bernard, T.; Lombard, V.; Henrissat, B. Nucleic Acids Res. 2009, 37, D233–D238.
7 Stoddart, J. F. Stereochemistry of Carbohydrates. John Wiley & Sons, Inc., New York; 1971, pp. 95–97.
8 Cremer, D.; Pople, J. A. J. Am. Chem. Soc. 1975, 97, 1354–1358.
9 Winterburn, P. J.; Phelps, C. F. Nature 1972, 236, 147–151.
10 Gabius, H. J.; André, S.; Jiménez‐Barbero, J.; Romero, A.; Solis, D. Trends Biochem. Sci. 2011, 36, 298–313.
11 Opdenakker, G.; Rudd, P. M.; Ponting, C. P.; Dwek, R. A. FASEB J. 1993, 7, 1330–1337.
12 Apweiler, R.; Hermjakob, H.; Sharon, N. Biochim. Biophys. Acta 1999, 1473, 4–8.
13 Spiro, R. G. Glycobiology 2002, 12, 43R–56R.
14 Wilson, I. B. H.; Paschinger, H.; Rendic, D. Glycosylation of model and ‘lower’ organisms. In The Sugar Code. Fundamentals of Glycosciences (Gabius, H.‐J., ed.), Wiley‐VCH, Weinheim, Germany; 2009, pp. 139–154.
15 Schnaar, R. L. J. Allergy Clin. Immunol. 2015, 135, 609–615.
16 Damerell, D.; Ceroni, A.; Maass, K.; Ranzinger, R.; Dell, A.; Haslam, S. M. Biol. Chem. 2012, 393, 1357–1362.
17 Frey, H.; Schroeder, N.; Manon‐Jensen, T.; Iozzo, R. V.; Schaefer, L. FEBS J. 2013, 280, 2165–2179.
18 Carpita, N.; McCann, M. Chapter 2. In Biochemistry & Molecular Biology of Plants (Buchanan, B. B.; Gruissem, W.; Jones, R. L., eds), American Society of Plant Physiologists; John Wiley & Sons, Ltd, Chichester, UK; 2000, pp. 52–108.
19 Kotting, O.; Kossmann, J.; Zeeman, S. C.; Lloyd, J. R. Curr. Opin. Plant Biol. 2010, 13, 321–329.
20 Smith, L. G. Nat. Rev. Mol. Cell Biol. 2001, 2, 33–39.
21 Michel, G.; Tonon, T.; Scornet, D.; Cock, M. J.; Kloareg, B. New Phytol. 2010, 188, 82–97.
22 Fry, S. C.; York, W. S.; Albersheim, P.; Darvill, A.; Hayashi, T.; Joseleau, J. P.; Kato, Y.; Lorences, E. P.; Maclachlan, G. A.; McNeil, M.; Mort, A. J.; Reid, J. S. G.; Seitz, H. U.; Selvendran, R. R.; Voragen, A. G. J.; White, A. R. Physiol. Plant. 1993, 89, 1–3.
23 McNeil, M.; Darvill, A. G.; Fry, S. C.; Albersheim, P. Annu. Rev. Biochem. 1984, 53, 625–663.
24 Hayashi, T.; Kaida, R. Mol. Plant 2011, 4, 17–24.
25 Ebringerova, A.; Heinze, T. Macromol. Rapid Commun. 2000, 21, 542–556.
26 Sørensen, I.; Pettolino, F. A.; Wilson, S. M.; Doblin, M. S.; Johansen, B.; Bacic, A.; Willats, W. G. Plant J. 2008, 54, 510–521.
27 Ridley, B. L.; O’Neill, M. A.; Mohnen, D. Phytochemistry 2001, 57, 929–967.
28 Liners, F.; Thibault, J. F.; Van Cutsem, P. Plant Physiol. 1992, 99, 1099–1104.
29 Willats, W. G. T.; Knox, J. P.; Mikkelsen, J. D. Trends Food Sci. Technol. 2006, 17, 97–104.
30 Mohnen, D. Curr. Opin. Plant Biol. 2008, 11, 266–277.
31 Karrer, P.; Hofmann, A. Helv. Chim. Acta 1929, 12, 616–637.
32 Morganti, P.; Morganti, G.; Morganti, A. Nanotechnol. Sci. Appl. 2011, 4, 123–129.
33 Draget, K. I.; Smidsrod, O.; Skjak‐Braek, G. Alginates from algae. In Polysaccharides and Polyamides in the Food Industry: Properties Production and Patents I (Steinbüchel, A.; Rhee, S. K., eds), Wiley‐VCH, Weinheim, Germany; 2005, pp. 215–223.
34 Kloareg, B.; Quatrano, R. S. Oceanogr. Mar. Biol. Annu. Rev. 1988, 26, 259–315.
35 Lahaye, M.; Rochas, C. Hydrobiologia 1991, 221, 137–148.
36 Michel, G.; Tonon, T.; Scornet, D.; Cock, M. J.; Kloareg, B. New Phytol. 2010, 188, 67–81.
37 Buléon, A.; Colonna, P.; Planchot, V.; Ball, S. Int. J. Biol. Macromol. 1998, 23, 85–112.


2.4 Proteins: From Chemical Properties to Cellular Function: A Practical Review of Actin Dynamics

Stéphane Romero1,2,3,4 and François‐Xavier Campbell‐Valois5

1 Equipe Communication Intercellulaire et Infections Microbiennes, Centre de Recherche Interdisciplinaire en Biologie (CIRB), Collège de France, Paris, France
2 Institut National de la Santé et de la Recherche Médicale U1050, Paris, France
3 Centre National de la Recherche Scientifique UMR7241, Paris, France
4 MEMOLIFE Laboratory of Excellence and Paris Science Lettre, Paris, France
5 Département de Chimie et Sciences Biomoléculaires, Université d’Ottawa, Ottawa, Ontario, Canada

2.4.1 Introduction

Proteins are the most abundant component of living matter. They perform a wide variety of basic functions: they organize the cell, carry out metabolic and catabolic functions, build the cell architecture, replicate and interpret genetic material, and catalyze chemical reactions. Proteins are also mediators of physiological functions in multicellular organisms, where they allow communication between cells and cell adhesion and ensure the barrier function of tissues. In this chapter, we describe the molecular architecture of proteins and how their intrinsic physicochemical properties are used to perform cellular physiological functions. We focus in particular on the properties of the actin cytoskeleton as an example of a regulated biological function that underlies cell movement and architecture.

Bionanocomposites: Integrating Biological Processes for Bioinspired Nanotechnologies, First Edition. Edited by Carole Aimé and Thibaud Coradin. © 2017 John Wiley & Sons, Inc. Published 2017 by John Wiley & Sons, Inc.

2.4.2 Molecular Architecture of Proteins

Peptides, polypeptides, and proteins are polymers composed of amino acids, which are linked like pearls on a necklace through the formation of successive peptide bonds. Peptides and polypeptides are distinguished from proteins by their size, the latter being usually larger (>40–100 amino acids), although the distinction between peptides/polypeptides and small proteins can become blurry depending on authors and/or circumstances. In addition, the term peptide can also be used to designate short polypeptides with specific physiological functions (e.g., hormones). Proteins (and their smaller counterparts) are absolutely critical to all biological processes. The structural diversity of proteins allows them to accomplish a seemingly infinite breadth of functions that are largely inaccessible to other biopolymers such as nucleic acids or sugars. Proteins can be modified genetically or chemically to study their behavior. Enzymes are proteins endowed with catalytic activities, which can generate enantiomer‐specific products more efficiently than any other synthetic method. Natural and human‐designed enzymes are used in nanotechnologies for industrial, technological, and therapeutic purposes.

2.4.2.1 Amino Acids

Apart from the notable exception of glycine, the amino acids used to synthesize proteins are chiral molecules of l‐configuration composed of an alpha carbon (Cα) linked to a primary amine (─NH2), a carboxylic acid (─COOH), a hydrogen, and a variable side chain, which confers specific physicochemical properties on each amino acid (Table 2.5) [1]. Side chains of amino acids can be hydrophobic (aliphatic or aromatic) or hydrophilic (polar or charged). Charged side chains can be charged negatively (acidic) or positively (basic) at physiological pH. Hydrophobic amino acids (aliphatic: alanine, proline, valine, leucine, methionine, and isoleucine; aromatic: phenylalanine, tyrosine, and tryptophan) are usually found in the center of globular proteins, where they are shielded from water, while hydrophilic amino acids (polar: glycine, cysteine, serine, threonine, histidine, asparagine, and glutamine; acidic: aspartate and glutamate; basic: lysine and arginine) are usually found on the surface of globular proteins, where they can interact with water. By convention, an atom X within a linear side chain is named Xβ if it is bonded directly to the Cα, Xγ if it is one position further from the Cα, and so on. Amino acids in proteins can be posttranslationally modified passively by chemical degradation (e.g., oxidation of methionine or deamidation of asparagine), actively by other proteins to convey cellular signals (e.g., signal transduction) [2], or chemically by design for experimental or biotechnological purposes [3–7]. In principle, it is also possible to chemically or genetically synthesize proteins with unnatural amino acids [8, 9]. This line of research is currently at the heart of chemical biology efforts aiming to generate enzymes and even synthetic and augmented organisms for novel biotechnological, medical, and industrial applications [10].

2.4.2.2 Peptide Bond

The peptide bond is the chemical bond formed between two neighboring amino acids in a polypeptide chain. It is obtained through the nucleophilic attack of the nitrogen atom of the primary amine group of the nᵢ₊₁ amino acid on the carboxyl group of the nᵢ amino acid, leading to the liberation of one water molecule and the formation of a covalent bond between the resulting secondary amine and carbonyl groups [1]. The resonance between these functional groups confers a partial double‐bond character on the peptide bond. This renders the peptide bond more rigid and planar than a classical single bond and explains the existence of trans‐ and cis‐conformations of the peptide bond. For most amino acids, the trans‐conformer is highly favored over the cis‐conformer (~300 : 1) in both the unfolded and native states [1, 11]. However, the similarity between the Cα and the Cδ of proline sets it apart from other amino acids, making its trans‐ and cis‐conformers almost equally likely (3 : 1). Because it is slow, isomerization of prolines can delay protein folding, particularly if the cis‐conformer is observed in the native structure; enzymes named peptidyl‐prolyl isomerases can accelerate the isomerization of prolines [1, 11]. Although approximately 10 times more prolines than other amino acids adopt the cis‐conformation [11], switching from the cis‐ (unfolded state) to the trans‐conformer (native state) in non‐prolyl residues has nonetheless been linked to complex folding reactions [12]. To synthesize peptide bonds, living organisms use energy in the form of adenosine triphosphate (ATP). The peptide bond is extremely stable in aqueous solution, and organisms use enzymes (proteases) and energy in the form of ATP to degrade it [1].

Table 2.5  Name and properties of natural l‐amino acids. (Structure column: chemical drawings not reproduced.)

Name (three‐letter code, one‐letter code)    Properties of the side chain
Glycine (Gly, G)                             Nonpolar, small, flexible
Alanine (Ala, A)                             Nonpolar, aliphatic
Valine (Val, V)                              Nonpolar, aliphatic, β‐branched
Isoleucine (Ile, I)                          Nonpolar, aliphatic, β‐branched
Leucine (Leu, L)                             Nonpolar, aliphatic
Proline (Pro, P)                             Nonpolar, aliphatic, rigid, cis‐isomer of the peptide bond more frequently observed
Methionine (Met, M)                          Nonpolar, aliphatic, sulfur atom
Phenylalanine (Phe, F)                       Nonpolar, aromatic
Tyrosine (Tyr, Y)                            Nonpolar, aromatic, hydroxyl group
Tryptophan (Trp, W)                          Nonpolar, aromatic
Serine (Ser, S)                              Polar, hydroxyl
Threonine (Thr, T)                           Polar, hydroxyl, β‐branched
Cysteine (Cys, C)                            Polar, sulfur atom, two cysteines can form a disulfide bridge in oxidizing conditions
Asparagine (Asn, N)                          Polar, amide group
Glutamine (Gln, Q)                           Polar, amide group
Histidine (His, H)                           Polar, mildly basic
Lysine (Lys, K)                              Polar, basic
Arginine (Arg, R)                            Polar, basic
Aspartate or aspartic acid (Asp, D)          Polar, acidic
Glutamate or glutamic acid (Glu, E)          Polar, acidic

2.4.2.3 Primary Structure

Messenger ribonucleic acids (mRNAs) are the templates used by the ribosome during the process known as translation to generate proteins through the formation of a peptide bond between each nᵢ and nᵢ₊₁ amino acid pair in the sequence. Proteins are synthesized from the amino‐ to the carboxy‐terminus (i.e., the first amino acid in the chain has a free primary amine and the last amino acid in the chain has a free carboxylic acid, because these groups do not form peptide bonds in natural proteins, which are linear polymers). Therefore, most proteins are originally made up of a specific composition and sequence of the 20 amino acids allowed by the universal genetic code (Table 2.5). The sequence of amino acids specific to a given protein is its primary structure (Figure 2.20a). The term residue designates an amino acid at a given position within the primary structure. Residues are numbered from the amino (N)‐terminus to the carboxy (C)‐terminus [1]. In most cases, the primary structure encodes the native structure (i.e., the three‐dimensional (3D) or tertiary structure) and the function of a given protein. The processes through which newly synthesized or denatured proteins are converted, spontaneously or not, into native proteins are collectively called protein folding.
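The notions of primary structure and residue numbering can be illustrated with a short Python sketch. It takes the human ubiquitin sequence shown in Figure 2.20a, groups its residues by the side‐chain classes listed in Section 2.4.2.1, and reports residue positions counted from the N‐terminus. The names UBIQUITIN, SIDE_CHAIN_CLASSES, class_composition, and residue_positions are hypothetical helpers introduced for this example; the class grouping simply follows the text and is only one of several possible classifications.

```python
from collections import Counter

# Human ubiquitin primary structure as given in Figure 2.20a (N- to C-terminus).
UBIQUITIN = (
    "MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG"
)

# Side-chain classes following the grouping used in Section 2.4.2.1
# (one-letter codes from Table 2.5). This grouping is an assumption of the sketch.
SIDE_CHAIN_CLASSES = {
    "aliphatic": "APVLMI",
    "aromatic": "FYW",
    "polar": "GCSTHNQ",
    "acidic": "DE",
    "basic": "KR",
}

# Reverse lookup: one-letter code -> class name.
CLASS_OF = {aa: name for name, letters in SIDE_CHAIN_CLASSES.items() for aa in letters}


def class_composition(sequence: str) -> Counter:
    """Count the residues of each side-chain class in a primary structure."""
    return Counter(CLASS_OF[aa] for aa in sequence)


def residue_positions(sequence: str, one_letter_code: str) -> list[int]:
    """Return 1-based positions (numbered from the N-terminus) of a residue type."""
    return [i for i, aa in enumerate(sequence, start=1) if aa == one_letter_code]


if __name__ == "__main__":
    composition = class_composition(UBIQUITIN)
    for name in SIDE_CHAIN_CLASSES:
        share = 100 * composition[name] / len(UBIQUITIN)
        print(f"{name:<9} {composition[name]:>3} residues ({share:.0f}%)")
    print("Lysine positions:", residue_positions(UBIQUITIN, "K"))
```

A composition of this kind says nothing about where residues sit in the folded protein; it only summarizes the information contained in the primary structure.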

(a) MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG

(b)–(e) [Structure images not reproduced]

Figure 2.20  Protein structural properties. Primary, secondary, and tertiary structures are illustrated using ubiquitin, a small globular protein that includes many of the more complex traits of larger proteins. The ubiquitin fold is one of approximately 400 superfolds that are particularly frequently encountered in nature. (a) The primary structure of human ubiquitin, encoded by genetic information through the universal genetic code. (b) The secondary structure scheme of ubiquitin (generated with PDB ID 1UBI and PDBsum) is characterized by the angles ϕ and ψ adopted by every amino acid in the primary structure, resulting mainly in α‐helices (helix) and β‐strands (arrows). These elements are characterized by the formation of hydrogen bonds with specific patterns. (c) The organization of the secondary structure elements in space through the establishment of long‐range contacts, mainly between hydrophobic side chains, corresponds to the tertiary structure. The left panel represents the overall tertiary structure of ubiquitin. The middle panel is another representation of the tertiary structure showing the van der Waals radii of the backbone and side chains of hydrophobic (darker, in the core) and polar (lighter, at the surface) residues. Note the burial of hydrophobic residues in the protein core, while polar residues are exposed at the surface. The right panel reveals the hydrophobic residues, because the polar residues that mask them in the previous panel have been completely stripped from this image. The burial of hydrophobic residues in the center of the structure is a central characteristic of the native structure and folding reaction of globular proteins. (d and e) Two examples of quaternary structure described in the text: (d) β‐galactosidase (PDB ID 4TTG) forms an obligate homotetramer. In this case, tetramerization is essential to obtain the native structure and enzymatic activity. (e) The green fluorescent protein (GFP; PDB ID 1GFL) was shown to form a weak dimer in solution. This dimerization was abrogated by the insertion of charged amino acids at the interface of the monomers without affecting the tertiary structure and intrinsic fluorescence of GFP [13]. Tertiary and quaternary structure images in this figure were generated using the software Chimera [Source: Pettersen et al. [14]. Reproduced with permission of Wiley].

2.4.3 Protein Folding

2.4.3.1 Peptide and Protein: Secondary Structure

Protein backbones (aka Cα traces), formed by the invariable elements of amino acids (i.e., main chain amine, Cα, and carbonyl), adopt specific and stable conformations that depend on their primary structure. These conformations depend primarily on the angles phi (ϕ) and psi (ψ) of the chemical bonds formed by each Cα with its cognate secondary amine (>NH) and carbonyl (>C═O) groups, respectively. Due to the variable steric hindrance introduced by side chains, only a subset of ϕ and ψ combinations is tolerated in natural proteins composed of l‐amino acids. The nature of the side chain affects the angles that are tolerated by a given amino acid within a polypeptide chain. Naturally, glycine, with its minimalist side chain, can adopt a wider range of ϕ and ψ combinations than other amino acids. In contrast, proline and β‐branched amino acids (e.g., threonine, valine, and isoleucine) are more restricted in the range of conformations they can adopt. The different combinations of ϕ and ψ angles tolerated in proteins are depicted in what is known as the Ramachandran plot. The most frequent combinations of ϕ and ψ angles observed in proteins are concentrated in specific regions of the Ramachandran plot corresponding to the main secondary structure elements of proteins, that is, α‐helices and β‐sheets (Figure 2.20b). The formation of secondary structures optimizes hydrogen bond formation between the >NH and >C═O groups of protein backbones [15]. Hydrogen bond formation is important to expel water from the protein core, thereby providing a hydrophobic environment favoring protein folding and stabilization of the native structure. Amino acids located in α‐helices have a local bonding pattern in which the >NH group of the nᵢ₊₄ amino acid is paired with the >C═O group of the nᵢ amino acid. This induces a dipole moment in α‐helices, with the N‐terminus charged positively and the C‐terminus charged negatively [1]. This dipole and the hydrogen bond pattern require specific amino‐ and carboxy‐capping motifs at the extremities of α‐helices [15]. Amino acids located in β‐strands form nonlocal hydrogen bond patterns. β‐strands can form parallel β‐sheets (in which the β‐strands have the same amino–carboxy orientation) and antiparallel β‐sheets (in which the β‐strands have opposite amino–carboxy orientations). In contrast to α‐helices, N‐ and C‐capping and dipole moments are negligible in antiparallel β‐sheets and are still debated in parallel β‐sheets, although there are arguments supporting their existence in the latter case [16]. Amino acids have varying propensities for α‐helices and β‐strands [17]. These propensities can be used to predict the secondary structure from the primary structure [18]. For example, the side chains of β‐branched amino acids are better tolerated in β‐sheets than in α‐helices [17, 18]. In native proteins, the secondary structure can be organized in multiple ways, and distinct secondary structure elements are interconnected either by disordered regions exposed at the surface of proteins, such as loops, or by more structured connections known as β‐ or γ‐turns, which enable a complete reversal (180°) of the protein backbone orientation within a very short region. Many different subclasses of turns with subtle changes in their geometrical properties are observed in protein structures [19].

2.4.3.2 3D Folding: Tertiary Structure

The atomic coordinates of all residues of a protein constitute the tertiary structure (aka the native structure). Numerous types of non‐covalent interactions are formed between the side chains of amino acids. Long‐range interactions in particular play a dominant role in the process of protein folding and in the spatial organization of the secondary structure elements observed in the tertiary structure. Interactions between hydrophobic amino acid side chains are arguably the driving force of protein folding and of the stabilization of the tertiary structure. Critical to their role in this process is the hydrophobic effect [20]. Water molecules (H2O) present in vivo or in vitro (in the solution) form a dense hydrogen bond network. The exposure of hydrophobic side chains to the solvent constrains the organization of H2O in their vicinity and is highly unfavorable in terms of entropy. The hydrophobic effect is thus driven by entropy and consists of the shielding of hydrophobic side chains from contact with the H2O molecules present in the solution or in the cellular and tissue environment. The exclusion of hydrophobic side chains from the vicinity of H2O maximizes the entropy of the system by enabling H2O to organize more freely around protein molecules in solution, thereby optimizing its hydrogen bond network. By expelling H2O molecules from the protein core, the hydrophobic effect also affects hydrogen bond and secondary structure formation by making amino acid functional groups the only hydrogen bond acceptors and donors available in this location. Therefore, the hydrophobic effect partially offsets the reduction of entropy of the polypeptide chain as it passes from the denatured state (high conformational diversity) to the native state (low conformational diversity). To maximize entropy in the protein–solvent (H2O) system, hydrophobic side chains coalesce to form the hydrophobic core of globular proteins through the formation of specific long‐range contacts, which contribute an important enthalpy‐dependent stabilizing factor. These contacts are mainly stabilized by van der Waals interactions (London dispersion forces), while hydrophilic side chains remain at the surface, interacting with each other and with water molecules (Figure 2.20c). Salt bridges (ionic interactions) are also formed between acidic and basic side chains. Most of the time they are found at the surface of proteins, but they can also occur at sites that are completely buried in the protein core. In this case, their formation can be absolutely essential for the stabilization of the tertiary structure because of the destabilizing effect of unpaired charged groups in the hydrophobic core [21, 22]. In the case of secreted proteins, including toxins, venoms, hormones, and neuropeptides, disulfide bridges are often formed. Disulfide bridges are covalent bonds formed in oxidizing conditions (e.g., in the extracellular environment or in the secretory pathway of eukaryotes) between the ─SH groups of two cysteine residues. Once formed, they are extremely stable in nonreducing conditions and contribute strongly to the organization of the tertiary structure and to the stabilization of the proteins that form them. Numerous cofactors can also contribute to the stabilization of the tertiary structure, including metal ions (e.g., Fe2+, Zn2+, Mg2+), which are coordinated by amino acids. Classic examples are ferredoxin (Fe) and zinc finger (Zn)‐containing proteins. Metal ions and other cofactors often play both a structural and a functional role in the proteins that coordinate them.

Large proteins are often composed of independent folding units that are called structural domains and often carry out a specific function (e.g., enzymatic or binding).

2.4.3.3 Quaternary Structure

The structure of many proteins consists of more than one molecule or monomer of the primary structure. For proteins that require oligomerization of monomers to reach the native structure, it is necessary to introduce an additional level of structural organization, the quaternary structure. The quaternary structure is the spatial organization of the tertiary structures of the monomers relative to each other (Figure 2.20d and e). A famous example is β‐galactosidase (lacZ), which is a tetramer. Tetramerization is an integral part of the folding process of this enzyme, meaning that it supports stabilization of the tertiary structure within each monomer and is essential to enzymatic function [23]. In contrast, the green fluorescent protein (GFP) has been shown to form a weak dimer that is easily disrupted without affecting the tertiary structure and intrinsic fluorescence [13]. The term quaternary structure can also be extended to the structures of protein complexes consisting of multiple protein monomers adopting distinct tertiary structures. Protein structures are deposited in the Protein Data Bank (PDB) or worldwide PDB (wwPDB) and attributed a unique identifier (usually composed of four digits and/or letters). Proteins can adopt a fibrous or a globular shape [1]. Fibrous proteins, such as the collagen found in skin, often have a structural role. Globular proteins are more frequent and include most other proteins, such as enzymes and proteins implicated in signaling pathways. Some globular proteins, such as actin and tubulin, polymerize through non‐covalent interactions to form dynamic filaments, which have some properties in common with fibrous proteins. Proteins with highly divergent primary structures cannot by definition adopt the same tertiary structure because of the differences in their amino acid composition and, hence, atomic coordinates. However, a subset of proteins with low sequence similarity can display a similar topology, that is, a common spatial organization of their secondary structure elements. Protein structures with similar topology are grouped in databases such as the Structural Classification of Proteins (SCOP and SCOP2) and Class, Architecture, Topology, and Homologous superfamily (CATH) [24–26]. While not as quickly searchable as those two databases, the Dali server allows comparative analyses of protein sequence and structure using a PDB identifier as the query [27]. The key point to realize is that, despite sequence and functional diversity, proteins from multiple evolutionary origins have adopted the same overall topology. Some protein topologies are much more frequently observed than others. The 400 most frequent topologies encapsulate most of the universal protein structural diversity. The 10 most frequent are called superfolds and are found in approximately 25% of the protein structures currently known [28, 29].
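As a small illustration of the identifiers mentioned above, the following Python sketch checks whether a string has the form described in the text, that is, four digits and/or letters, using the identifiers quoted in Figure 2.20 (1UBI, 4TTG, 1GFL) as examples. The helper names looks_like_pdb_id and normalize_pdb_id are hypothetical, the pattern encodes only the rule stated in this chapter (actual archives may apply stricter or different rules), and converting to upper case is a convention chosen for this sketch. The check is purely syntactic and does not query any database.

```python
import re

# Assumption taken from the text: an identifier is four characters long and
# made of digits and/or letters. Real archives may impose stricter rules.
PDB_ID_PATTERN = re.compile(r"[0-9A-Za-z]{4}")


def looks_like_pdb_id(identifier: str) -> bool:
    """Return True if `identifier` has the four-character alphanumeric form."""
    return PDB_ID_PATTERN.fullmatch(identifier.strip()) is not None


def normalize_pdb_id(identifier: str) -> str:
    """Return the identifier in upper case, or raise ValueError if malformed."""
    if not looks_like_pdb_id(identifier):
        raise ValueError(f"not a four-character identifier: {identifier!r}")
    return identifier.strip().upper()


if __name__ == "__main__":
    # The first three strings are the identifiers quoted in Figure 2.20.
    for candidate in ("1UBI", "4ttg", "1GFL", "ubiquitin"):
        print(candidate, looks_like_pdb_id(candidate))
```

A check of this kind is useful mainly for catching typing errors before an identifier is passed as a query to a structure database or comparison server.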
One of the great mysteries of structural biology is how primary sequences with such diversity can produce similar topologies. Artificial degeneration of the primary structure and selection of correctly folded clones of the Ras‐binding domain (RBD) of Raf, a protein adopting the ubiquitin superfold, have revealed that stabilization of the native structure is conserved throughout natural evolution [30–32]. The formation of the tertiary and, when relevant, quaternary structures leads to the formation of clefts, loops, grooves, and surfaces with specific chemical and physical properties that enable protein binding and enzymatic functions. Residues that are non‐contiguous in the primary structure often contribute critical chemical moieties to catalytic sites. Therefore, it is not only the nature of the side chains that is important in the formation of an efficient catalytic site but also the specific chemical environment formed within the catalytic site in the native structure (e.g., absence of water, hydrogen bond pattern, local pKa) and the shape of the catalytic site, which together determine the nature of the chemical reaction it catalyzes and the substrate(s) it can accommodate with more or less constraint [1]. The distribution of hydrophobic, acidic, and basic side chains at the surface permits the formation of the dimerization interfaces essential to protein–protein interactions (PPIs). Many examples of the distribution of specific amino acids at protein–protein interfaces have been reported [33]. For example, the basic face of the RBD of Raf interacts with acidic residues on Ras, while the face of the RBD of Raf opposite to the dimerization interface is rather acidic [30]. In addition, the nonrandom distribution of specific residues at surface‐exposed positions in the native structure was demonstrated by the discovery of cooperative binding processes in PDZ domains, in which residues located at a distance from the binding interface cooperate with those constituting the ligand‐binding pocket [34].

2.4.3.4 Protein Folding and De Novo Design

Secondary, tertiary, and quaternary structures and the topology of proteins are determined mostly by X‐ray crystallography, but also by nuclear magnetic resonance (NMR) spectroscopy (for smaller proteins) and electron microscopy (for larger protein complexes), and are finally deposited in the PDB or wwPDB [35]. Considerable progress has also been made in the prediction of secondary and tertiary structures [36, 37]. However, the best algorithms are still a far cry from reliably replacing experimental methods to determine the tertiary structure of proteins. In parallel with structure prediction, the capacity to design proteins, either de novo or from an existing scaffold, that adopt a specific tertiary structure or carry out an enzymatic function has also improved spectacularly. The de novo design of tertiary structures unknown in nature has been reported [38]. The design of proteins with novel enzymatic functions has also been demonstrated [39, 40], although it is still very challenging to generate proteins with highly effective catalytic activity de novo (reviewed in Refs. [41, 42]). In the future, the capacity to routinely design structures and functions will probably become a reality. The design of novel enzymatic functions will allow the production of chemicals that are currently impossible to obtain using natural enzymes or traditional chemical processes. Progress toward these objectives will have a profound impact on biotechnology, nanotechnology, and industrial processes. While empirical methods based on known protein structures have dominated structure prediction and de novo design algorithms, other efforts have focused on the search for general laws governing protein folding, the process through which the polypeptide chain transits from the denatured to the native state. The assumption that such general laws exist stems from what is known as Levinthal's paradox.
In short, the time required for a polypeptide chain, even that of a small protein, to explore exhaustively all possible conformations of each of its amino acids would be astronomically long; therefore there must exist constraints that limit the conformational search. The hypothesis that the primary structure encodes the tertiary structure has led many scientists to study protein folding by measuring the impact of mutations on the folding and unfolding rates. Indeed, many small proteins possess the intrinsic capacity to fold or, more accurately, to refold following experimentally controlled denaturation. For refolding experiments, purified proteins are usually denatured using chaotropic agents (e.g., urea, guanidinium–HCl), but temperature or pH are also used. The folding and unfolding reactions are most often followed in a stopped‐flow apparatus through the change in fluorescence of a tryptophan residue embedded in the structure of the protein of interest. Fersht and colleagues pioneered the study of protein folding by introducing the protein engineering method (amounting to alanine‐scanning mutations or alternative mutations reducing the size of the side chain at a given position in the primary structure) and have proposed two alternative models using barnase and chymotrypsin inhibitor 2 (CI2). In the framework model [43–45], as in the case of barnase [46, 47], some secondary structures are formed before others and before the establishment of long‐range hydrophobic contacts. In the nucleation–condensation model [48, 49], as in the case of CI2 [50, 51], long‐range hydrophobic contacts are formed to a certain extent first and lead to the coalescence of the rest of the hydrophobic core and to secondary structure formation. Although data were consistent with the formation of a kinetic intermediate in the case of barnase and of many other proteins initially studied [45–47], CI2 displayed a simpler two‐state folding, in which only the unfolded and native states are significantly populated during folding (no intermediate states are formed). Protein engineering studies were subsequently carried out on many two‐state folding proteins because of the simplicity of the interpretation of their kinetic data [52].
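To make the kinetic simplicity of two‐state folding concrete, the following Python sketch treats folding as a single reversible step between the unfolded (U) and native (N) states. The rate constants k_f and k_u and their values are hypothetical and chosen only for illustration; they are not taken from the studies cited above. Under this assumption, the native fraction relaxes as a single exponential with an observed rate equal to k_f + k_u, which is the kind of trace a stopped‐flow fluorescence experiment would be expected to record for a two‐state folder.

```python
import math

def native_fraction(t, k_f, k_u, n0=0.0):
    """Fraction of native protein at time t for a two-state U <-> N scheme.

    k_f and k_u are hypothetical folding and unfolding rate constants (s^-1);
    n0 is the native fraction at t = 0 (0.0 for a fully denatured sample).
    """
    k_obs = k_f + k_u          # observed relaxation rate (s^-1)
    n_eq = k_f / k_obs         # native fraction at equilibrium
    return n_eq + (n0 - n_eq) * math.exp(-k_obs * t)

# Illustrative values only (assumed, not measured)
k_f, k_u = 50.0, 0.5           # s^-1
for t in (0.0, 0.01, 0.05, 0.2):
    print(f"t = {t:5.2f} s  native fraction = {native_fraction(t, k_f, k_u):.3f}")
```

A folding pathway with a populated intermediate would instead require at least two exponential phases, which is one reason why the kinetic data of two‐state folders are simpler to interpret.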
Analyses of the folding rates of many two‐state folders have also revealed a strong correlation between the average distance between contacting residues in the native structure and the protein‐folding rate [53, 54]. Although two‐state folding is often considered the most efficient and has been the focus of much interest in the field, it is clear that larger proteins fold through more complex folding pathways including many intermediates (Figure 2.21) [55]. Intermediates can be on‐pathway or off‐pathway. On‐pathway intermediates contribute positively to the folding rate by aiding the formation of the native state, while off‐pathway intermediates are dead ends on the folding pathway, which may or may not resolve by themselves, and contain a certain amount of nonnative non‐covalent bonds and/or conformations (e.g., a nonnative peptide bond isomer, particularly in the case of proline residues) that must be disrupted to allow the formation of the native state. The introduction of tryptophan probes at many locations in the primary structure of ubiquitin has revealed an intermediate that was originally undetectable in this protein, which had been thought to adopt two‐state folding [56]. Three proteins of the homeodomain family adopting the same topology have been shown to fold through different pathways corresponding to three different models (i.e., framework, nucleation–condensation, and a mix of the latter two models) [57]. Therefore, distinctions between the various protein folding models have become blurred.

[Figure 2.21, panel (a) labels: Denatured (unfolded); Intermediate on‐pathway; Intermediate off‐pathway; Native (folded); Rate‐limiting step (where the TS is formed). Panels (b) and (c): protein structure images.]

Figure 2.21  Protein folding and refolding: two‐state or three‐state folding (i.e., without or with the formation of stable intermediates). (a) Schematic representation of the folding pathway of proteins. During the folding reaction that ultimately leads to the native protein, denatured proteins may or may not form intermediates, which can be on‐ or off‐pathway (lighter lettering indicates that these steps are facultative). On‐pathway and off‐pathway intermediates are envisioned as forming mostly native‐like contacts and significant nonnative‐like contacts, respectively. As in other chemical reactions, the native state is reached through a rate‐limiting step that consists of the formation of a transition state (TS). The scheme below the folding pathway depicts the denatured, intermediate, and native states, where dark and light circles represent hydrophobic and polar residues, respectively. Note the absence of contacts in the denatured state, the formation of some contacts between hydrophobic residues in the intermediates, and the formation of many long‐range hydrophobic contacts in the native state. (b) Native structure of chymotrypsin inhibitor 2 (CI2), a protein that folds through a two‐state mechanism (PDB code 2CI2). (c) Native structure of barnase, a protein that folds through a three‐state mechanism (PDB code 1A2P). The structure images in this figure were generated using the software Chimera [Source: Pettersen et al. [14]. Reproduced with permission of Wiley].

This highlights the need for a unifying theory that reconciles the different models. In vitro studies of protein folding focus on self‐folding and refolding proteins, which are usually rather limited in size. Nevertheless, proteins synthesized inside living cells can be very large and are often composed of many independent folding units. Many enzymes in the cytoplasm and endoplasmic reticulum can contribute to the in vivo folding of such large proteins by unfolding off‐pathway intermediates (chaperones), facilitating isomerization of the peptide bond of proline residues (peptidyl prolyl isomerases), destroying noncanonical disulfide bonds (thiol disulfide oxidoreductases), or recognizing inappropriately glycosylated proteins [58, 59]. Prokaryotes usually have a faster translation rate than eukaryotes. For this reason, and maybe others related to differences in their translation and protein–chaperone machineries, prokaryotes have more difficulty correctly folding very large proteins composed of many independent folding units [60]. Another important difference between in vivo and in vitro protein folding is the degree of protein crowding, which is of course much higher in the cytoplasm than in a test tube. The effect of crowding on protein folding and function in vivo is seldom considered when studying protein folding, binding, and catalysis in vitro but has been discussed in depth elsewhere [61, 62].

2.4.4 Interacting Proteins for Cellular Functions

2.4.4.1  Protein Interactions

Proteins rarely act alone in a cell or in a living multicellular organism. An important feature of proteins is their ability to interact non‐covalently with other biomolecules, which can be lipids, sugars, or nucleic acids, and above all with other proteins. These interactions are necessary for the biological function of proteins acting within macromolecular complexes. In the case of the Arp2/3 complex, which is involved in actin assembly and composed of seven different subunits, PPIs between the different subunits are important for the stability of the complex, but differential conformational changes also regulate its activity by switching it from an inactive state to an active state (see Section 2.4.5 and [63] for review). Protein interactions with other biomolecules also account for the regulation of their function. Other specific interactions with lipids within membranes or with compartmentalized proteins ensure the specific subcellular localization of proteins. Non‐covalent interactions of proteins with other proteins, lipids, or carbohydrates result from the same weak bonds that rule protein folding, such as hydrogen bonds, hydrophobic interactions, electrostatic bonds, or van der Waals forces [1]. Like any reversible non‐covalent molecular interaction, a simple PPI is a bimolecular equilibrium reaction: protein A binds to protein B, leading to a complex AB (reaction R2.4.1¹):

\[ \mathrm{A} + \mathrm{B} \rightleftharpoons \mathrm{AB} \tag{R2.4.1} \]

In the forward reaction, one protein A and one protein B associate. This reaction involving two components is a second‐order reaction, with a rate defined by

\[ \frac{d[\mathrm{AB}]}{dt} = k_{+}\,[\mathrm{A}]\,[\mathrm{B}] \tag{2.4.1} \]

In the reverse reaction, one AB complex dissociates into proteins A and B, a first‐order reaction defined by

\[ \frac{d[\mathrm{A}]}{dt} = \frac{d[\mathrm{B}]}{dt} = k_{-}\,[\mathrm{AB}] \tag{2.4.2} \]

By definition, the rates of the forward and reverse reactions are equal when equilibrium is reached. This equilibrium state depends on the affinity of A for B, defined by the dissociation constant, KD, equal to

\[ K_{D} = \frac{k_{-}}{k_{+}} = \frac{[\mathrm{A}]_{\mathrm{eq}}\,[\mathrm{B}]_{\mathrm{eq}}}{[\mathrm{AB}]_{\mathrm{eq}}} \tag{2.4.3} \]

Thus, for a high‐affinity interaction (i.e., a low KD), the amount of A and B interacting to form the complex AB is high, and only a small amount of AB dissociates upon dilution [64]. This is particularly important for biological processes: high‐affinity interactions are likely to occur in vivo because the physiological concentration of the proteins involved is generally around or well above the KD. Conversely, some low‐affinity interactions in the micromolar range will never occur in vivo because the physiologically relevant concentration of these proteins is well below the KD, whereas a higher concentration of both, or even of only one, of the proteins involved (a concentration similar to or greater than the KD) will promote the formation of complexes. Because the range of association rate constants is relatively small, it is mainly the dissociation rate constant that determines the affinity of two proteins. Low‐affinity interactions in the micromolar range, such as the profilin–actin interaction (see Section 2.4.5), have a dissociation rate constant of around 1 s−1, rendering the interaction between the two partners unstable, with a half‐life of 0.7 s (t1/2 = ln 2/k−). On the other hand, high‐affinity interactions in the picomolar to nanomolar range, such as antigen–antibody interactions, are very long‐lived, with dissociation rate constants ranging from 0.001 to 0.000001 s−1 and half‐lives ranging from hundreds of seconds to days. This long‐lasting antigen–antibody interaction has allowed vertebrates to evolve an efficient adaptive immune system, in which circulating antibodies neutralize pathogens [65].

1  R stands for Reaction.

To understand the various cellular functions of all the genome‐encoded proteins, great emphasis has been put on establishing maps of PPIs. Establishing PPI networks is a big challenge: data have to be obtained from accurate high‐throughput methods that display enough sensitivity. Early attempts to map a binary interactome of the yeast model organism S. cerevisiae used the high‐throughput two‐hybrid method, in which a protein interaction reconstitutes a functional transcription factor that turns on a reporter gene [66, 67]. High‐throughput co‐affinity purification followed by mass spectrometry analysis was also used to generate “co‐complex” interactome maps [68, 69]. Using similar approaches, it has been estimated that the human protein interactome may involve from 130 000 to 650 000 PPIs [70, 71]. In addition, an important concern is to determine whether cofactors or additional binding partners are needed [72]. The main limitation of this approach for understanding the cellular function of any protein is the static view provided by PPI network data. Some interactions may occur only under specific conditions, with either temporal or spatial regulation. For instance, a particular cascade of interactions downstream of the epidermal growth factor (EGF) receptor, leading to transcriptional activation or adaptation of the cell to its environment, occurs only upon cell stimulation with the EGF hormone and its binding to the EGF receptor [73]. To circumvent this limitation, novel proteomic, genomic, and transcriptomic approaches have been used to build dynamic PPI networks obtained under various stimuli [74].
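The relations between rate constants, KD, and complex lifetime used earlier in this section can be checked numerically. The Python sketch below assumes a simple 1:1 interaction. The association rate constant (10^6 M−1·s−1) and the function names are illustrative assumptions, whereas the dissociation rate constants are those quoted in the text. The bound fraction is computed under the additional simplifying assumption that partner B is in large excess, so that its free concentration stays close to its total concentration.

```python
import math

def dissociation_constant(k_off, k_on):
    """K_D = k_- / k_+ (in M), for k_off in s^-1 and k_on in M^-1 s^-1."""
    return k_off / k_on

def half_life(k_off):
    """Half-life of the complex, t1/2 = ln 2 / k_- (in s)."""
    return math.log(2) / k_off

def fraction_bound(b_free, k_d):
    """Fraction of A found in the AB complex when free B is b_free.

    b_free and k_d must be expressed in the same concentration unit.
    """
    return b_free / (b_free + k_d)

K_ON = 1e6  # M^-1 s^-1, assumed association rate constant (illustrative)

for label, k_off in [("profilin-actin", 1.0),
                     ("antibody, fast limit", 1e-3),
                     ("antibody, slow limit", 1e-6)]:
    k_d = dissociation_constant(k_off, K_ON)
    print(f"{label:22s} K_D = {k_d:.0e} M, t1/2 = {half_life(k_off):.3g} s")

# Occupancy of A at partner concentrations below, at, and above K_D
for ratio in (0.1, 1.0, 10.0):
    print(f"[B]/K_D = {ratio:4.1f} -> fraction bound = {fraction_bound(ratio, 1.0):.2f}")
```

With the assumed association rate constant, the three dissociation rate constants correspond to micromolar, nanomolar, and picomolar KD values, and the occupancy loop illustrates why an interaction is largely absent when the partner concentration is well below the KD.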
2.4.4.2  Enzymatic Activity of Proteins

Enzymes are a particular class of proteins that catalyze (i.e., accelerate the rate of) chemical reactions. The enzyme converts a given molecule, called the substrate, into a different molecule, called the product. The enzymatic reaction normally proceeds in an aqueous solvent at a rate, temperature, and pH corresponding to its physiological context. For comparison, the complete hydrolysis of a protein under laboratory conditions requires on average boiling for about 24 h in a high concentration of HCl, whereas under physiological conditions degradation by proteases occurs within a few hours at 37°C and pH 7. Since enzymatic activity involves the binding of enzymes to their substrates, enzymes are specific for a particular reaction and very selective for their substrates, just as a particular key opens only a specific lock. Enzymatic activity plays a major role in cell metabolism, including catabolism, both to provide energy stored in ATP molecules and to synthesize amino acids, nucleotides, lipids, and sugars. Because of their high specificity in catalyzing reactions, enzymes are used in industrial applications that impose stringent constraints on catalysis, such as the stereospecificity of products, that is, the capacity to synthesize a specific enantiomer of the sought‐after product.

Michaelis and Menten observed that the rate of an enzymatic reaction was independent of the concentration of substrate when the amount of enzyme was extremely low compared to the substrate. At lower concentrations of the substrate, the rate became proportional to its concentration. They concluded that the kinetics of the enzymatic reaction could be explained by the formation of an enzyme–substrate complex (ES) that would allow the transformation of the substrate (S) into the product (P) through the following reaction, where E is the enzyme molecule:

\[ \mathrm{E} + \mathrm{S} \underset{k_{-1}}{\overset{k_{+1}}{\rightleftharpoons}} \mathrm{ES} \overset{k_{2}}{\longrightarrow} \mathrm{E} + \mathrm{P} \tag{R2.4.2} \]
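The two regimes observed by Michaelis and Menten can be reproduced by integrating the mass‐action rate equations of this reaction scheme. The Python sketch below is a minimal illustration under stated assumptions: the rate constants, the enzyme concentration, and the function name simulate_es are hypothetical, and a simple explicit Euler integration is used in place of a dedicated ODE solver. It reports the rate of product formation, k2·[ES], once the ES concentration has settled.

```python
def simulate_es(s0, e0, k_on, k_off, k_cat, t_end=0.1, dt=1e-5):
    """Integrate E + S <-> ES -> E + P with explicit Euler steps.

    Concentrations are in uM and time in s. Returns the product formation
    rate k_cat * [ES] at t_end, once [ES] has settled to its steady level.
    """
    s, es = s0, 0.0
    for _ in range(int(round(t_end / dt))):
        e = e0 - es                            # free enzyme (mass conservation)
        binding = k_on * e * s - k_off * es    # net flux of E + S <-> ES
        catalysis = k_cat * es                 # flux of ES -> E + P
        s -= binding * dt
        es += (binding - catalysis) * dt
    return k_cat * es

# Hypothetical rate constants, chosen for illustration only
K_ON, K_OFF, K_CAT = 10.0, 90.0, 10.0   # uM^-1 s^-1, s^-1, s^-1
E0 = 0.01                                # uM, enzyme far below substrate

for s0 in (0.1, 0.2, 10.0, 1000.0, 2000.0):
    rate = simulate_es(s0, E0, K_ON, K_OFF, K_CAT)
    print(f"[S]0 = {s0:7.1f} uM  rate = {rate:.5f} uM/s")
```

With these assumed values, doubling the substrate from 0.1 to 0.2 μM roughly doubles the rate, whereas doubling it from 1000 to 2000 μM leaves the rate almost unchanged and close to k2 multiplied by the total enzyme concentration.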

The idea of Michaelis and Menten was that the conversion of ES into P and E is slow compared to the binding of E to S, so that an equilibrium is established between E, S, and ES [75]. At saturating amounts of substrate, the ES concentration remains the same as the concentration of the substrate varies, and the rate of the reaction depends only on the enzyme concentration. At low substrate concentration, the enzyme is not saturated and the rate depends on the ES concentration, which is proportional to the substrate concentration. This theory has been generalized to reactions in which ES is not necessarily at equilibrium but at a steady state [76]. The rate of the reaction follows the equation:

\[ v_{i} = \frac{V_{\max}\,[\mathrm{S}]_{0}}{K_{m} + [\mathrm{S}]_{0}} \tag{2.4.4} \]

where vi is the initial velocity of the reaction, Vmax is the maximum rate of the reaction, and Km is the Michaelis constant, equal to Km = (k−1 + k2)/k+1. Therefore, Km is a measure of the ability of the substrate to interact with the enzyme. Because for most enzymes k2 is relatively small compared with k−1, the Km value is often close to the equilibrium dissociation constant (KD) for the formation of the ES complex. The reaction ES → E + P is a first‐order reaction whose rate constant, in s−1 units, reflects the frequency of catalysis, that is, the enzyme turnover. Thus, k2 is also called the catalytic constant (kcat), which is limited by the slowest step, that is, the conversion of ES into P and E. How enzymes catalyze chemical reactions within the cell, so as to make the kinetics of the reaction compatible with the physiological conditions of living organisms, has been analyzed. Enzymes reduce the activation energy required for the chemical reaction to occur through the stabilization of the transition state [77]. The energy level of the transition state is lowered through different mechanisms. The transition state can be stabilized by the properties of the active site, which reduce the number of degrees of freedom of the bound substrate [78]. Thus, a local effect on the orientation and concentration of substrates occurs within the active site [79]. A concerted acid–base catalysis can occur within the active site, induced by reactive groups provided by the side chains of specific residues of the enzyme [80]. Finally, some active sites can provide an alternative reaction pathway by temporarily reacting with the substrate to form a covalent intermediate with amino acid residues within the active site. In the case of the serine proteases, an ester intermediate is formed with residues within the active site to lower the energy of the transition state, and finding inhibitors that target and remain covalently bound to the active site is of particular interest for pharmacological research [81].

2.4.4.3  Molecular Motors

Molecular motors are a particular class of enzymes involved in many biological movements, including intracellular trafficking and muscle contraction. They catalyze the hydrolysis of the terminal (gamma) phosphate of the ATP molecule and convert this chemical energy into mechanical work [82]. Three types of motors coexist in the cell cytoplasm and move along cytoskeletal fibers (see next section): dyneins and kinesins, which move on microtubules, and myosins, which walk on actin filaments. In these three families, molecular motors work as monomers, dimers, trimers, or tetramers and are either highly processive or, on the contrary, perform just one step before dissociating from the filament. The general organization of molecular motor structures is conserved, with a head or motor domain that binds to filaments and hydrolyzes ATP, a central “flexible domain,” and a fibrillar domain allowing multimerization. Our understanding of mechanochemistry results from studies of the myosin and kinesin families and of G‐proteins, in which the structure surrounding the nucleotide‐binding site is very similar [83]. For all motor molecules, the conversion of ATP into ADP induces a small conformational change in a globular motor domain, which is amplified and converted into movement by additional structural domains [84]. The primary conformational rearrangement occurs in unbound motors and is induced in conserved structures flanking the ATP‐binding site by the departure of the inorganic phosphate (Pi) following cleavage of the gamma phosphate bond. This first level of amplification of ATP hydrolysis is coordinated with subsequent conformational changes in the globular head domain, coupling ATP hydrolysis and Pi release to the ability of the motor to attach to actin filaments or microtubules [85, 86]. At the third level of amplification, these conformational changes in the ATP‐binding site are transmitted to induce large displacements of the carboxy‐terminal domains. This large‐scale intramolecular displacement, mediated by conformational changes in the central “flexible domain,” is due to the rotation of a rigid stalk in myosin or the repositioning of a flexible element in kinesins [87]. Thus, each cycle of binding to and release from a cytoskeletal filament propels the motor forward in a single direction to a new binding site on the filament, with the ATPase activity controlling the power stroke cycle, and the working stroke (the distance through which the motor pulls the filament in a contraction cycle) controlled by the properties of the central “flexible domain” (see Refs. [88, 89] for review). As a result of these amplification steps, the displacement of molecular motors on cytoskeletal filaments applies forces in the range of 1–10 pN, allowing a single motor to move vesicles in the molecularly crowded cytoplasm or, through concerted motor activity, to induce muscle contraction.

2.4.5 Self‐Assembly and Auto‐Organization: Regulation of the Actin Cytoskeleton Assembly

Protein fibers involved in cell architecture, called cytoskeletal proteins, have the particular ability to self‐associate to form polymeric filaments. Self‐assembly of actin filaments (one of the three major cytoskeletal components in eukaryotic cells, along with microtubules and intermediate filaments) is required for cell migration, cell contraction and division, morphogenesis, and synaptic plasticity. These actin‐based motile processes, involving PPIs and enzymatic activities, are governed by the spatial and temporal control of actin filament assembly. In physiological conditions, actin filaments (F‐actin) coexist with monomeric actin (G‐actin) under steady‐state conditions. Seminal studies relying on fluorescence recovery after photobleaching (FRAP) analyses of injected fluorescently labeled G‐actin demonstrated that G‐actin polymerizes at the front and depolymerizes at the back of nascent F‐actin structures [90]. The origin and regulation of this polymerization flux, known as “treadmilling,” will be described in the next section. More recently, quantitative fluorescence speckle microscopy (qFSM) experiments showed that the overall flux of actin subunits is modulated locally, so that different turnover rates of filaments coexist in cells [91, 92]. To control the different rates of actin treadmilling at different sites within the same cell, a variety of regulatory proteins interact specifically with actin, constituting a large PPI network. Through modification of the association and dissociation parameters of G‐actin, F‐actin auto‐organizes to modulate the velocity of actin‐based movements and to adapt the density of the actin network. Because each polymerizing actin filament exerts a pushing force of a few pN [93, 94], the entire network exerts a propelling force when polymerizing against cellular membranes, or contractile forces in association with myosin activity, which allow cells to adhere firmly and build tissues in multicellular organisms.

2.4.5.1  Origin of the Actin Treadmilling

G‐actin is a globular protein of 375 amino acids, which adopts a tertiary structure characterized by antiparallel β‐sheets and numerous α‐helices and is divided into four subdomains. Subdomains 1 and 3 constitute the so‐called barbed extremity, and subdomains 2 and 4 the pointed extremity (Figure 2.22a). F‐actin is helical, with all actin subunits in the same orientation, thus exposing two subunits at the (+) end (barbed) and at the (−) end (pointed) of this structurally polarized filament (Figure 2.22b, [96]). Under physiological conditions, an ATP or ADP nucleotide, along with a Ca2+ or Mg2+ divalent metal cation, binds to the monomer within a deep cleft at the junction of the four subdomains. The cleft is formed by two flexible α‐helices allowing nucleotide exchange. When ATP is bound, hydrogen bonds between the gamma phosphate and glycine 158 and serine 14 maintain subdomains 1–2 and 3–4 in close contact [97]. Thus, the gamma phosphate differentiates the conformational states of the open ADP‐bound and closed ATP‐bound actin monomer (Figure 2.22a). Actin is an atypical ATP‐hydrolyzing enzyme (ATPase). Although the precise mechanism of ATP hydrolysis is still debated, as is the presence of a catalytic base, the reaction involves the stabilization of a catalytic water molecule by the glutamine 137 residue [95, 98, 99]. ATP hydrolysis by G‐actin is very slow, but ATPase activity is accelerated 40 000‐fold in F‐actin, with a rate constant of 0.3 s−1 for Mg‐ATP [100]. Electron microscopy images of negatively stained actin filaments suggest an opening of the actin cleft upon ATP hydrolysis, indicating that the spatial orientation of the Gln 137 residue is important for catalysis [95, 101]. However, whether the increased rate of ATP hydrolysis in the filament compared to monomeric actin is due to conformational changes of the nucleotide cleft is still under investigation.
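The scale of this acceleration is easier to appreciate when expressed as half‐times. The short Python sketch below derives the G‐actin rate constant implied by the two figures quoted above (0.3 s−1 in F‐actin and a 40 000‐fold acceleration) and converts both rate constants into half‐times. Treating hydrolysis as a single first‐order step is a simplifying assumption made here for illustration, and the variable names are not from the source.

```python
import math

K_HYD_F_ACTIN = 0.3        # s^-1, Mg-ATP hydrolysis in F-actin (from the text)
ACCELERATION = 40_000      # fold increase relative to G-actin (from the text)

# Rate constant implied for G-actin, assuming a single first-order step
k_hyd_g_actin = K_HYD_F_ACTIN / ACCELERATION

def half_time(k):
    """Half-time of a first-order process, ln 2 / k (in s)."""
    return math.log(2) / k

print(f"G-actin: k = {k_hyd_g_actin:.1e} s^-1, "
      f"half-time = {half_time(k_hyd_g_actin) / 3600:.1f} h")
print(f"F-actin: k = {K_HYD_F_ACTIN} s^-1, "
      f"half-time = {half_time(K_HYD_F_ACTIN):.1f} s")
```

Under this assumption, bound ATP would persist for about a day on a monomer but only for a few seconds once the subunit is incorporated into a filament.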
F‐actin is the filamentous form of actin, which is formed by the polymerization of G‐actin monomers, hence forming a facultative quaternary structure. F‐actin is a microfilament of approximately 7 nm in diameter that forms a right‐handed double helix with 13 subunits per turn and a pitch of 37 nm. The assembly of G‐actin monomers to form F‐actin relies on classical PPIs. In the filament, each subunit establishes lateral interactions with the n + 1 and n − 1 subunits and longitudinal interactions with the n + 2 and n − 2 subunits, requiring the neutralization of repulsive surface charges by Mg2+, Ca2+, or K+ ions [100]. At each of the two extremities of the filament, polymerization corresponds to the sum of the association and dissociation reactions of actin monomers with the barbed or pointed end, following the reaction A + F ⇆ F, where A corresponds to actin monomers and F corresponds to a filament end. Thus, the polymerization flux at each end is expressed as

\[ J(C) = k_{+}\,C\,[F] - k_{-}\,[F] = k_{+}\,[F]\,(C - C_{C}) \quad \text{with} \quad C_{C} = \frac{k_{-}}{k_{+}} \tag{2.4.5} \]

[Figure 2.22, panels (a) and (b), labels: subdomains 1, 2, 3, and 4; Barbed end; Pointed end.]

Figure 2.22  Actin structure and dynamics. (a) Two X‐ray structures of the ATP–G‐actin monomer in the open (dark grey, 1hlu) or closed (light grey) state are superimposed. The ATP is in black. The transition between the open and closed states is defined by conformational changes leading to a shortening of the distance between subdomains 2 and 4. (b) Recent advances in cryo‐EM techniques provided a high‐resolution model of the actin filament in which the actin protomer is in the open state (PDB ID 3J8I) [95]. The barbed and pointed ends are indicated in the picture. (c) Scheme of the treadmilling of an actin filament and of the regulation of its rate by ADF/cofilin and profilin. The depolymerization (bold dashed arrow) and polymerization (dimmed dashed arrow) reactions at pointed ends are numbered (1). The polymerization (bold arrow) and depolymerization (dimmed arrow) reactions at barbed ends are numbered (2). The presence of profilin (P, profilin path) prevents actin assembly at pointed ends and thus drives the polymerization flux exclusively to barbed ends. ADF increases the rate of actin depolymerization 25‐fold (ADF path). In combination with profilin, which accelerates the rate of nucleotide exchange on depolymerized ADP–G‐actin 100‐fold, assembly onto barbed ends is accelerated 125‐fold. (d) Acceleration of the treadmilling rate by ADF and capping proteins. The dependence of the filament elongation rate (J) on the concentration of monomeric G‐actin [C] is represented. Css corresponds to the G‐actin concentration at which the rates of actin polymerization at barbed ends (positive J values, solid lines) and depolymerization (negative J values, dotted lines) are equal. In the left panel, ADF, by increasing the rate of actin depolymerization (negative J values, ADF‐marked dotted lines), increases Css (double arrow, shift of Css to a higher value), thus increasing the actin polymerization rate. By capping the majority of barbed ends, capping proteins reduce the overall rate of barbed end elongation to almost zero (bold solid line), establishing Css close to $C_{C}^{p}$. Thus, the rate of actin assembly on the remaining uncapped filaments is enhanced. The combination of ADF and capping proteins greatly increases Css and the rate of elongation of the remaining free barbed ends.

[Figure 2.22 (Continued), panel (c) labels: filament subunits marked T, D‐Pi, and D; ADF and profilin (P) paths with ATP and ADP; rate labels 12 μM−1·s−1, 1.4 s−1, 24 s−1, 0.3 s−1, 1.3 μM−1·s−1, and 0.2 s−1; acceleration factors ×25 and ×125; CCp = 0.6 μM and CCb = 0.1 μM. Panel (d): two plots of J against [C], titled “ADF effects on the treadmilling rate” (curves: barbed ends, pointed ends, pointed ends + ADF; Css and Css(+ADF) marked) and “CP effects on the treadmilling rate” (curves: free barbed ends, 95% capped barbed ends, pointed ends; Css and Css = CCp marked).]

where C and [F] are the concentrations of G‐actin and F‐actin, respectively. The critical concentration (CC) is the concentration below which G‐actin is not able to assemble at the end of the filament. Thus, when the G‐actin reservoir decreases and reaches the CC of the pointed or barbed end, an equilibrium state is reached and actin polymerization stops at this end. The association of ATP–G‐actin is much faster at the barbed end than at the pointed end (i.e., $k_{+}^{b} \gg k_{+}^{p}$) and is limited only by the rate of monomer diffusion [102]. The rate of ATP hydrolysis is about two orders of magnitude faster than the release of the inorganic phosphate (Pi) from actin subunits [100]. Consequently, newly assembled actin subunits at barbed ends, at the very tip of the filaments, are mainly associated with ATP and ADP–Pi, while the older subunits are associated with ADP (Figure 2.22c). This thermodynamic polarization of F‐actin has major consequences for the dynamics of actin filaments, because it leads to a difference in the CC at the two ends of filaments, with $C_{C}^{b}$ = 0.1 μM at barbed ends and $C_{C}^{p}$ = 0.6 μM at pointed ends. Thus, the polymerization flux is the sum of the polymerization fluxes at each extremity:

\[ J(C) = J(C)^{b} + J(C)^{p} = k_{+}^{b}\,[F]\,(C - C_{C}^{b}) + k_{+}^{p}\,[F]\,(C - C_{C}^{p}) \tag{2.4.6} \]

Equilibrium cannot be reached because only ADP‐actin subunits depolymerize from pointed ends, while ATP–G‐actin assembles at barbed ends because the nucleotide interaction parameters favor ATP–G‐actin over ADP–G‐actin [103, 104]. A steady state is reached instead, at which the net polymerization flux is null by definition, that is, $J_{SS} = J(C_{SS})^b + J(C_{SS})^p = 0$. Thus, the polymerization fluxes at each extremity are not null but are equal in magnitude and opposite in sign:

$$J(C_{SS})^b = k^b \cdot F \cdot (C_{SS} - C_C^b) = -J(C_{SS})^p = -k^p \cdot F \cdot (C_{SS} - C_C^p) \tag{2.4.7}$$

where the steady‐state concentration of actin monomers, CSS, lies between CCb and CCp. Thus, the polymerization flux is positive at barbed ends and negative at pointed ends, with an equal magnitude at steady state (Figure 2.22d). The actin filament keeps a constant length, but its subunits are constantly renewed by the assembly of ATP–G‐actin at barbed ends at a rate that compensates for the rate of ADP‐actin disassembly at pointed ends. Hence, ATP hydrolysis within actin filaments confers their treadmilling property, which is at the origin of cellular motile processes based on actin polymerization. Indeed, under the steady‐state conditions of the living cell, F‐actin moves forward through actin assembly at barbed ends and keeps a constant length because actin disassembles at pointed ends. The treadmilling rate is limited by the rate of actin depolymerization at pointed ends, which is slower than the rate of actin assembly at barbed ends ($k^p \ll k^b$). Thus, any variation of actin depolymerization affects the rate of filament assembly at barbed ends. Actin polymerization is also irreversible because of the thermal dissipation due to the irreversible cleavage of ATP. The steady state is maintained as long as ATP remains in excess to replace ADP on depolymerized actin monomers.

2.4.5.2  Regulation of Actin Treadmilling
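Before the regulators are introduced, the steady state of Eq. (2.4.7) can be made concrete with a short numerical sketch. Setting the barbed‐ and pointed‐end fluxes equal and opposite gives CSS = (k^b·CCb + k^p·CCp)/(k^b + k^p), the filament concentration F cancelling out. The critical concentrations are those quoted in the text; the two association rate constants (12 and 1.3 μM⁻¹·s⁻¹) are illustrative values of the kind displayed in Figure 2.22c and, like the function names, are assumptions of this sketch.

```python
# Minimal sketch of the treadmilling steady state (Eq. 2.4.7), per filament.
# Rate constants are illustrative assumptions, not fitted values.
K_B = 12.0   # barbed-end association rate constant, uM^-1 s^-1 (assumed)
K_P = 1.3    # pointed-end association rate constant, uM^-1 s^-1 (assumed)
CC_B = 0.1   # critical concentration at barbed ends, uM (from the text)
CC_P = 0.6   # critical concentration at pointed ends, uM (from the text)


def end_flux(k, c, cc):
    """Net flux at one filament end, in subunits per second: k * (C - CC)."""
    return k * (c - cc)


def steady_state_concentration(k_b, k_p, cc_b, cc_p):
    """Monomer concentration at which barbed and pointed fluxes cancel."""
    return (k_b * cc_b + k_p * cc_p) / (k_b + k_p)


c_ss = steady_state_concentration(K_B, K_P, CC_B, CC_P)
j_barbed = end_flux(K_B, c_ss, CC_B)    # positive: net assembly
j_pointed = end_flux(K_P, c_ss, CC_P)   # negative: net disassembly

print(f"C_SS = {c_ss:.3f} uM")
print(f"barbed-end flux  = {j_barbed:+.3f} subunits/s")
print(f"pointed-end flux = {j_pointed:+.3f} subunits/s")
```

With these assumed constants, CSS sits much closer to CCb than to CCp, which reflects the statement that treadmilling is limited by the slow pointed end.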

Actin‐based processes such as the elongation of actin filaments feeding the extension of lamellipodia during cell migration occur at a higher rate than the treadmilling rate of F‐actin measured in vitro. A set of proteins found in actin‐rich structures cooperates to accelerate actin treadmilling in vivo, hence generating F‐actin‐dependent movements at a rate consistent with cellular requirements. ADF, also called cofilin, binds to ADP–actin subunits in the filament, destabilizing its structure and accelerating its disassembly [105, 106]. As a result, partial actin depolymerization occurs at steady state, increasing CSS and the rate of actin polymerization by 25‐fold [107]. Profilin and some WH2‐domain‐containing proteins such as actobindin accelerate actin treadmilling by fourfold. Indeed, by binding the barbed face of actin monomers or the terminal subunit of filament barbed ends, they prevent pointed end assembly and promote actin assembly onto barbed ends of actin filaments. The combined effects of ADF and profilin or a WH2‐containing protein result in a 125‐fold increase of the treadmilling rate [108]. Capping proteins bind to barbed ends of actin filaments with high affinity, hence blocking barbed end assembly [109]. As a result, CSS increases to a value close to CCp. This increase in the monomeric actin concentration feeds faster growth of the remaining uncapped filaments. This process, known as funneling, has been experimentally observed and theoretically modeled (see Ref. [110] for review, and [111]). The combined effects on the treadmilling rate of ADF, barbed end promoting factors, and capping protein account for the rapid movements of actin structures observed in cells [108].

2.4.5.3  Arp2/3 and Formin‐Initiated Actin Assembly to Generate Mechanical Forces
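The funneling effect of capping protein described in Section 2.4.5.2 can be illustrated with the same flux balance as in Eq. (2.4.7). In this sketch, capping is modeled very simply as a fraction of barbed ends that cannot elongate, so the barbed‐end term of the balance is scaled by the free fraction while all pointed ends remain free. This simplification, the rate constants, and the function name are assumptions made for illustration only.

```python
# Toy model of funneling: capping a fraction of barbed ends raises C_SS
# toward CC_P and speeds up growth of the remaining free barbed ends.
K_B = 12.0   # barbed-end association rate constant, uM^-1 s^-1 (assumed)
K_P = 1.3    # pointed-end association rate constant, uM^-1 s^-1 (assumed)
CC_B = 0.1   # uM, from the text
CC_P = 0.6   # uM, from the text


def steady_state(free_fraction):
    """Return (C_SS in uM, growth rate of one free barbed end in subunits/s)
    when only `free_fraction` of the barbed ends are uncapped."""
    k_b_eff = K_B * free_fraction  # population-averaged barbed-end constant
    c_ss = (k_b_eff * CC_B + K_P * CC_P) / (k_b_eff + K_P)
    return c_ss, K_B * (c_ss - CC_B)


for free_fraction in (1.0, 0.5, 0.05):
    c_ss, rate = steady_state(free_fraction)
    print(f"free barbed ends {free_fraction:4.0%}: "
          f"C_SS = {c_ss:.2f} uM, growth per free end = {rate:.2f} subunits/s")
```

The trend, not the absolute numbers, is the point: as fewer barbed ends remain free, CSS approaches CCp and each free barbed end grows faster.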

Because capping protein blocks barbed end growth, cells need to renew the pool of uncapped filaments to keep the treadmilling rate constant at these sites. The production of a protrusive force by actin polymerization requires the rapid self‐assembly of actin monomers to be directed to specific sites. The RhoGTPase family, comprising Rho, Rac, and Cdc42, also known as small G‐proteins, controls the initiation of new actin filaments. RhoGTPases are signaling enzymes that hydrolyze GTP into GDP to regulate the actin cytoskeleton. Their GTP‐ or GDP‐binding state regulates their signaling activity, being functionally active


when bound to GTP and inactive when bound to GDP. Additional factors regulate their enzymatic activity to turn signaling cascades on or off. GTPase‐activating proteins (GAPs) inactivate RhoGTPases by accelerating the rate of GTP hydrolysis, whereas guanine nucleotide exchange factors (GEFs) activate them by accelerating the reloading of GTP in their nucleotide‐binding site [112]. To induce the polymerization of new filaments, RhoGTPases activate the so‐called actin nucleators, downstream of cell signaling (see Ref. [113] for review). The induced network requires sufficient rigidity to anchor the newly created filaments, whose growth develops a protrusive force. The Arp2/3 complex is a very stable and conserved assembly of seven proteins, including two actin‐related proteins, Arp2 and Arp3, that initiates the polymerization of new actin filaments [100, 110]. This ubiquitous complex is found in branched actin networks, for instance, at the leading edge of the extending lamellipodium during cell migration, and localizes in vitro and in vivo at the branch points of F‐actin [114]. Its binding to a mother filament induces the growth of a daughter filament at a 70° angle from the Arp2 and Arp3 subunits [115]. Thus, this mechanism of new barbed end generation is an autocatalytic process generating a dendritic actin network. The Arp2/3 complex, inactive in the basal state, is activated by cellular endogenous nucleation promoting factors (NPFs) that induce a conformational change bringing the Arp2 and Arp3 subunits closer together to mimic the structure of a new filament barbed end, onto which actin monomers polymerize [116]. The molecular details of NPF regulation by Rac and Cdc42 will not be discussed here, but as a result, Arp2/3 targeting and activation against the plasma membrane at specific sites induce a dense crisscrossed actin network that allows cell locomotion, propulsion of cytosolic vesicles, or intracellular motility of bacterial pathogens [117].
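The autocatalytic character of Arp2/3 branching can be sketched with a minimal rate model. It assumes that new barbed ends are created at a rate proportional to the number of free barbed ends already present, and that capping protein removes free barbed ends at a rate proportional to the same number, so that dN/dt = (k_branch − k_cap)·N. The model, its parameter values, and the function name are hypothetical and serve only to show the exponential behavior implied by the word "autocatalytic".

```python
import math


def free_barbed_ends(n0, k_branch, k_cap, t):
    """Number of free barbed ends at time t (s) for
    dN/dt = (k_branch - k_cap) * N, starting from n0 ends."""
    return n0 * math.exp((k_branch - k_cap) * t)


# Hypothetical rates (per second), chosen only for illustration.
scenarios = (
    ("branching dominates", 0.5, 0.2),
    ("balanced", 0.2, 0.2),
    ("capping dominates", 0.2, 0.5),
)
for label, k_branch, k_cap in scenarios:
    n = free_barbed_ends(n0=10, k_branch=k_branch, k_cap=k_cap, t=10.0)
    print(f"{label:20s}: N(10 s) = {n:.1f} free barbed ends")
```

In this picture, a steady density of growing ends requires branching and capping to balance, which is one way to read the need to continuously renew uncapped filaments.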
Proteins of the formin family, comprising 15 members, are dimeric proteins that nucleate actin filaments de novo and catalyze the rapid processive insertional assembly of F‐actin by remaining attached to the growing barbed ends [118–120]. The head‐to‐tail ring structure of their formin homology 2 (FH2) domain confers the nucleating and barbed end binding activities of formins by encircling and stabilizing an actin dimer that shares structural homology with a filament barbed end (see Ref. [121] for review). Their FH1 domain interacts with profilin, inducing a 10‐fold acceleration of profilin–G‐actin assembly on a formin‐bound barbed end, and allows a single formin molecule to remain attached to a barbed end during thousands of assembly cycles of profilin–G‐actin complexes [118, 120]. Formins are auto‐inhibited in the basal state because of intramolecular interactions between the DID domain and the C‐terminal DAD domain that mask the activity of the FH2 domain. The binding of RhoGTPases to their N‐terminal GTPase‐binding domain (GBD), next to the DID domain, induces conformational changes that release the DAD–DID interaction. This release unmasks the FH2 domain and allows formins to induce a network of parallel, linear actin filaments involved in the formation of filopodia or of the contractile ring that allows cell separation after mitosis [122].

2.4.5.4  Self‐Organization Properties and Force Generation Understood Using In Vitro Reconstituted Actin‐Based Nanomovements

Reconstitution approaches, designed to observe under the microscope and understand the complex concerted action of cytoskeletal proteins, started in the early 1980s with the first in vitro motility assay, which reconstituted the movement of beads coated with purified myosins moving on actin cables in an ATP‐dependent manner [123]. This motility assay was the starting point of more complex assays, which led to the determination of the biochemical and biophysical parameters controlling the cycle of several molecular motors, including through the use of atomic force microscopy or optical tweezers (reviewed in Ref. [124]). The in vitro reconstitution, in cellular extracts or with purified proteins, of the movement of the intracytoplasmic bacterial pathogens Listeria and Shigella, which is mediated by the formation of an actin comet tail, was a major breakthrough in the understanding of the collective behavior of proteins underlying cell migration [125, 126]. Bacterial proteins were used to activate Arp2/3 and initiate the polymerization of an actin network in a medium containing only ADF, profilin, and a capping protein to maintain a high treadmilling rate. This assay evolved into a purely synthetic biomimetic system in which site‐directed actin polymerization propels objects in the micrometer range, such as beads, vesicles, or lipid droplets coated with NPFs activating Arp2/3 (Figure 2.23a) [127–129]. This biomimetic system made it possible to understand how a few molecules act in concert to self‐organize and how the movement is controlled by the biochemical parameters of proteins regulating actin dynamics [130–133]. Thus, the biomechanical forces produced by the collective behavior of bionanocomposites have been characterized [132, 134, 135]. Reconstitution of the propulsion of formin‐coated beads by processive insertional polymerization made it possible to study how interactions at the barbed ends of F‐actin regulate formin activity and force production (Figure 2.23b) [118, 136, 137].

2.4.5.5  Applications in Bionanotechnologies

Designing functionalized surface micropatterns of proteins regulating actin dynamics was a step forward in the complexity of biomimetic systems and brought new insight into the control of the cytoskeleton architecture by the shape of micropatterns of actin regulators [138–140]. These experiments demonstrate that only a few physical and biochemical parameters rule the self‐organization of molecules acting in concert to reproduce cellular structures and behaviors. This approach has been used to build 3D networks of actin filaments with an extremely precise shape and orientation, defined by the micropatterns of actin nucleation factors and biochemical


Figure 2.23  In vitro biomimetic reconstitution of Arp2/3‐based or formin‐based motility. (a and b) Schemes of Arp2/3‐based (a) and formin‐based (b) motility assays reconstituted with purified proteins (upper panels; components shown: bead, NPF, Arp2/3, ADF, CP, profilin, G‐actin, and formin). At steady state, ADF/cofilin and profilin, together with capping proteins (CP), maintain a high treadmilling rate of F‐actin (growing filament). In this medium, beads coated with an Arp2/3 activator (NPF) or with formins nucleate a branched filament network (a) or a network of parallel filaments (b), respectively. By polymerizing against the bead surface, this actin comet tail generates a force that propels the beads at a constant rate. Lower panels: time‐lapse sequences of beads coated with the ActA protein of Listeria, which directly activates Arp2/3 (a, phase contrast), or with the FH1‐FH2 domain of the mDia1 formin (b, rhodamine‐labeled G‐actin), generating actin comet tails.

control of actin polymerization. These 3D‐controlled actin networks can be used as patterns for gold coating to build electrical nanowires with excellent electrical conductivity [141]. This very promising bionanocomposite‐based technology, enabled by progress in our understanding of protein properties and behavior, is potentially attractive for applications in industrial nanotechnologies.

2.4.6  Conclusion

Progress in our understanding of protein chemistry and structure is opening the door to approaches for tracking protein location and function and for generating proteins and bio‐nanocomposites with novel properties. We expect that many advances in protein structure and function prediction, protein design, biotechnology, and synthetic biology will stem from this progress in the coming years. Furthermore, interactions of proteins with other biomolecules are key to their functions. The non‐covalent binding of two proteins is a simple bimolecular reaction, determined by kinetic and equilibrium constants. Understanding the various functions of all the genome‐encoded proteins requires building maps of PPIs. Novel proteomic, genomic, and transcriptomic approaches have been used to build dynamic PPI networks obtained under various stimuli. Enzymes are proteins that catalyze chemical reactions by decreasing the activation energy required for the reaction to occur. Enzymes accelerate reactions that would otherwise not occur under physiological conditions, by interacting with substrates and stabilizing a conformer close to the transition state of the catalyzed reaction. The activity of a given enzyme is determined by the ability of the substrate to interact with the enzyme (Km) and by the enzyme turnover (kcat). Myosins are molecular motors that convert chemical energy into mechanical work. The energy released by hydrolysis of the gamma phosphate bond of ATP, catalyzed by an active site in the molecular motor, induces sequential short‐scale and large‐scale conformational changes of the entire protein, regulating its orientation and its binding to and detachment from F‐actin. Thus, successive cycles of ATP hydrolysis induce a displacement of the myosin molecule along F‐actin, which is at the origin of force production.
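The dependence of enzyme activity on Km and kcat recalled in this summary is commonly expressed through the Michaelis–Menten rate law, v = kcat·[E]·[S]/(Km + [S]). The short sketch below evaluates it at a few substrate concentrations. The parameter values are arbitrary and the function name is hypothetical; they are chosen only to show that the rate reaches half of its maximum when [S] equals Km.

```python
def michaelis_menten_rate(s, e_total, kcat, km):
    """Initial reaction rate v = kcat * [E] * [S] / (Km + [S])."""
    return kcat * e_total * s / (km + s)


# Arbitrary illustrative parameters (not taken from the text).
KCAT = 100.0     # turnover number, s^-1
KM = 50.0        # Michaelis constant, uM
E_TOTAL = 0.01   # total enzyme concentration, uM

v_max = KCAT * E_TOTAL  # rate at saturating substrate, uM/s
for s in (5.0, 50.0, 500.0):
    v = michaelis_menten_rate(s, E_TOTAL, KCAT, KM)
    print(f"[S] = {s:6.1f} uM -> v = {v:.3f} uM/s ({v / v_max:.0%} of Vmax)")
```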
Assembly of actin monomers into polymeric filaments is ruled by the physicochemical properties of subunit interactions in the filaments. ATP hydrolysis on each subunit following monomer assembly induces a thermodynamic polarization of the filament, leading to the treadmilling phenomenon at steady state: monomers assemble at one end of the filament and disassemble at the other end at the same rate. This important property of F‐actin underlies the pushing force induced by actin polymerization and allows dynamic processes involved in cell architecture and locomotion. Treadmilling is regulated by a whole set of proteins that modulate its rate, to generate movements in cells at an appropriate speed. These movements generated by actin polymerization can be reconstituted in vitro using pure proteins and will have implications for bio‐nanotechnologies.


References

1 Voet, D.; Voet, J. G. Biochemistry, 4th Edition (John Wiley & Sons, Inc., Hoboken, NJ, 2011).
2 Khoury, G. A.; Baliban, R. C.; Floudas, C. A. Sci. Rep. 2011, 1, 90.
3 Boutureira, O.; Bernardes, G. J. L. Chem. Rev. 2015, 115, 2174–2195.
4 Bruchez, M. P. Curr. Opin. Chem. Biol. 2015, 27, 18–23.
5 Gonçalves, M. S. T. Chem. Rev. 2009, 109, 190–212.
6 Griffin, B. A.; Adams, S. R.; Tsien, R. Y. Science 1998, 281, 269–272.
7 Takaoka, Y.; Ojida, A.; Hamachi, I. Angew. Chem. Int. Ed. 2013, 52, 4088–4106.
8 Young, T. S.; Schultz, P. G. J. Biol. Chem. 2010, 285, 11039–11044.
9 Zhang, W. H.; Otting, G.; Jackson, C. J. Curr. Opin. Struct. Biol. 2013, 23, 581–587.
10 Spicer, C. D.; Davis, B. G. Nat. Commun. 2014, 5, 4740.
11 Dugave, C. Cis‐trans Isomerization in Biochemistry (Wiley‐VCH, Weinheim, 2006).
12 Pappenberger, G.; Aygün, H.; Engels, J. W.; Reimer, U.; Fischer, G.; Kiefhaber, T. Nat. Struct. Biol. 2001, 8, 452–458.
13 Zacharias, D. A.; Violin, J. D.; Newton, A. C.; Tsien, R. Y. Science 2002, 296, 913–916.
14 Pettersen, E. F.; Goddard, T. D.; Huang, C. C.; Couch, G. S.; Greenblatt, D. M.; Meng, E. C.; Ferrin, T. E. J. Comput. Chem. 2004, 25, 1605–1612.
15 Aurora, R.; Rose, G. D. Protein Sci. 1998, 7, 21–38.
16 Farzadfard, F.; Gharaei, N.; Pezeshk, H.; Marashi, S.‐A. J. Struct. Biol. 2008, 161, 101–110.
17 Chou, P. Y.; Fasman, G. D. Biochemistry 1974, 13, 211–222.
18 Chou, P. Y.; Fasman, G. D. Biochemistry 1974, 13, 222–245.
19 Hutchinson, E. G.; Thornton, J. M. Protein Sci. 1994, 3, 2207–2216.
20 Tanford, C. Science 1978, 200, 1012–1018.
21 Oliveberg, M.; Fersht, A. R. Biochemistry 1996, 35, 6795–6805.
22 Tissot, A. C.; Vuilleumier, S.; Fersht, A. R. Biochemistry 1996, 35, 6786–6794.
23 Matthews, B. W. C. R. Biol. 2005, 328, 549–556.
24 Andreeva, A.; Howorth, D.; Chothia, C.; Kulesha, E.; Murzin, A. G. Nucleic Acids Res. 2014, 42, D310–D314.
25 Lo Conte, L.; Ailey, B.; Hubbard, T. J.; Brenner, S. E.; Murzin, A. G.; Chothia, C. Nucleic Acids Res. 2000, 28, 257–259.
26 Sillitoe, I.; Lewis, T. E.; Cuff, A.; Das, S.; Ashford, P.; Dawson, N. L.; Furnham, N.; Laskowski, R. A.; Lee, D.; Lees, J. G.; Lehtinen, S.; Studer, R. A.; Thornton, J.; Orengo, C. A. Nucleic Acids Res. 2015, 43, D376–D381.
27 Holm, L.; Rosenström, P. Nucleic Acids Res. 2010, 38, W545–W549.
28 Söding, J.; Lupas, A. N. Bioessays 2003, 25, 837–846.
29 Orengo, C. A.; Jones, D. T.; Thornton, J. M. Nature 1994, 372, 631–634.

30 Campbell‐Valois, F. X.; Tarassov, K.; Michnick, S. W. J. Mol. Biol. 2006, 362, 151–171.
31 Campbell‐Valois, F. X.; Michnick, S. W. J. Mol. Biol. 2007, 365, 1559–1577.
32 Campbell‐Valois, F. X.; Tarassov, K.; Michnick, S. Proc. Natl. Acad. Sci. U. S. A. 2005, 102, 14988–14993.
33 Shaul, Y.; Schreiber, G. Proteins 2005, 60, 341–352.
34 Lockless, S. W.; Ranganathan, R. Science 1999, 286, 295–299.
35 Berman, H. M.; Westbrook, J.; Feng, Z.; Gilliland, G.; Bhat, T. N.; Weissig, H.; Shindyalov, I. N.; Bourne, P. E. Nucleic Acids Res. 2000, 28, 235–242.
36 Pirovano, W.; Heringa, J. Methods Mol. Biol. 2010, 609, 327–348.
37 Kryshtafovych, A.; Fidelis, K.; Moult, J. Proteins 2014, 82 Suppl 2, 164–174.
38 Kuhlman, B.; Dantas, G.; Ireton, G. C.; Varani, G.; Stoddard, B. L.; Baker, D. Science 2003, 302, 1364–1368.
39 Siegel, J. B.; Zanghellini, A.; Lovick, H. M.; Kiss, G.; Lambert, A. R.; St.Clair, J. L.; Gallaher, J. L.; Hilvert, D.; Gelb, M. H.; Stoddard, B. L.; Houk, K. N.; Michael, F. E.; Baker, D. Science 2010, 329, 309–313.
40 Eiben, C. B.; Siegel, J. B.; Bale, J. B.; Cooper, S.; Khatib, F.; Shen, B. W.; Players, F.; Stoddard, B. L.; Popovic, Z.; Baker, D. Nat. Biotechnol. 2012, 30, 190–192.
41 Bolon, D. N.; Voigt, C. A.; Mayo, S. L. Curr. Opin. Chem. Biol. 2002, 6, 125–129.
42 Baker, D. Protein Sci. 2010, 19, 1817–1819.
43 Kim, P. S.; Baldwin, R. L. Annu. Rev. Biochem. 1982, 51, 459–489.
44 Baldwin, R. L.; Rose, G. D. Trends Biochem. Sci. 1999, 24, 26–33.
45 Baldwin, R. L.; Rose, G. D. Trends Biochem. Sci. 1999, 24, 77–83.
46 Matthews, J. M.; Fersht, A. R. Biochemistry 1995, 34, 6805–6814.
47 Fersht, A. R.; Itzhaki, L. S.; elMasry, N. F.; Matthews, J. M.; Otzen, D. E. Proc. Natl. Acad. Sci. U. S. A. 1994, 91, 10426–10429.
48 Fersht, A. R. Proc. Natl. Acad. Sci. U. S. A. 1995, 92, 10869–10873.
49 Abkevich, V. I.; Gutin, A. M.; Shakhnovich, E. I. Biochemistry 1994, 33, 10026–10036.
50 Otzen, D. E.; Itzhaki, L. S.; elMasry, N. F.; Jackson, S. E.; Fersht, A. R. Proc. Natl. Acad. Sci. U. S. A. 1994, 91, 10422–10425.
51 Itzhaki, L. S.; Otzen, D. E.; Fersht, A. R. J. Mol. Biol. 1995, 254, 260–288.
52 Nickson, A. A.; Clarke, J. Methods 2010, 52, 38–50.
53 Plaxco, K. W.; Simons, K. T.; Ruczinski, I.; Baker, D. Biochemistry 2000, 39, 11177–11183.
54 Plaxco, K. W.; Simons, K. T.; Baker, D. J. Mol. Biol. 1998, 277, 985–994.
55 Dill, K. A.; Chan, H. S. Nat. Struct. Biol. 1997, 4, 10–19.
56 Vallée‐Bélisle, A.; Michnick, S. W. J. Mol. Biol. 2007, 374, 791–805.
57 Gianni, S.; Guydosh, N. R.; Khan, F.; Caldas, T. D.; Mayor, U.; White, G. W. N.; DeMarco, M. L.; Daggett, V.; Fersht, A. R. Proc. Natl. Acad. Sci. U. S. A. 2003, 100, 13286–13291.
58 Saibil, H. Nat. Rev. Mol. Cell Biol. 2013, 14, 630–642.

59 Ellgaard, L.; Helenius, A. Nat. Rev. Mol. Cell Biol. 2003, 4, 181–191.
60 Netzer, W. J.; Hartl, F. U. Nature 1997, 388, 343–349.
61 Luby‐Phelps, K. Int. Rev. Cytol. 2000, 192, 189–221.
62 Ellis, R. J. Curr. Opin. Struct. Biol. 2001, 11, 114–119.
63 Goley, E. D.; Welch, M. D. Nat. Rev. Mol. Cell Biol. 2006, 7, 713–726.
64 Pollard, T. D. Mol. Biol. Cell 2010, 21, 4061–4067.
65 Sproul, T. W.; Cheng, P. C.; Dykstra, M. L.; Pierce, S. K. Int. Rev. Immunol. 2000, 19, 139–155.
66 Fromont‐Racine, M.; Rain, J. C.; Legrain, P. Nat. Genet. 1997, 16, 277–282.
67 Uetz, P.; Giot, L.; Cagney, G.; Mansfield, T. A.; Judson, R. S.; Knight, J. R.; Lockshon, D.; Narayan, V.; Srinivasan, M.; Pochart, P.; Qureshi‐Emili, A.; Li, Y.; Godwin, B.; Conover, D.; Kalbfleisch, T.; Vijayadamodar, G.; Yang, M.; Johnston, M.; Fields, S.; Rothberg, J. M. Nature 2000, 403, 623–627.
68 Gavin, A. C. et al., Nature 2002, 415, 141–147.
69 Krogan, N. J. et al., Nature 2006, 440, 637–643.
70 Stumpf, M. P. H.; Thorne, T.; de Silva, E.; Stewart, R.; An, H. J.; Lappe, M.; Wiuf, C. Proc. Natl. Acad. Sci. U. S. A. 2008, 105, 6959–6964.
71 Venkatesan, K. et al., Nat. Methods 2009, 6, 83–90.
72 Snider, J.; Kotlyar, M.; Saraon, P.; Yao, Z.; Jurisica, I.; Stagljar, I. Mol. Syst. Biol. 2015, 11, 848.
73 Nyati, M. K.; Morgan, M. A.; Feng, F. Y.; Lawrence, T. S. Nat. Rev. Cancer 2006, 6, 876–885.
74 Woodsmith, J.; Stelzl, U. Curr. Opin. Struct. Biol. 2014, 24, 34–44.
75 Michaelis, L.; Menten, M. L. Biochem. Z. 1913, 49, 333–369.
76 Briggs, G. E.; Haldane, J. B. Biochem. J. 1925, 19, 338–339.
77 Pauling, L. Nature 1948, 161, 707–709.
78 Truhlar, D. G. Arch. Biochem. Biophys. 2015, 582, 10–17.
79 Daniel, R. M.; Dunn, R. V.; Finney, J. L.; Smith, J. C. Annu. Rev. Biophys. Biomol. Struct. 2003, 32, 69–92.
80 Jencks, W. P. Cold Spring Harb. Symp. Quant. Biol. 1972, 36, 1–11.
81 Zhong, J.; Groutas, W. C. Curr. Top. Med. Chem. 2004, 4, 1203–1216.
82 Bustamante, C.; Chemla, Y. R.; Forde, N. R.; Izhaky, D. Annu. Rev. Biochem. 2004, 73, 705–748.
83 Vale, R. D. J. Cell Biol. 1996, 135, 291–302.
84 Schliwa, M.; Woehlke, G. Nature 2003, 422, 759–765.
85 Yun, M. Y.; Zhang, X. H.; Park, C. G.; Park, H. W.; Endow, S. A. EMBO J. 2001, 20, 2611–2618.
86 Murphy, C. T.; Rock, R. S.; Spudich, J. A. Nat. Cell Biol. 2001, 3, 311–315.
87 Vale, R. D.; Milligan, R. A. Science 2000, 288, 88–95.
88 Sweeney, H. L.; Houdusse, A. Annu. Rev. Biophys. 2010, 39, 539–557.
89 Marx, A.; Hoenger, A.; Mandelkow, E. Cell Motil. Cytoskeleton 2009, 66, 958–966.
90 Wang, Y. L. J. Cell Biol. 1985, 101, 597–602.

91 Vallotton, P.; Gupton, S. L.; Waterman‐Storer, C. M.; Danuser, G. Proc. Natl. Acad. Sci. U. S. A. 2004, 101, 9660–9665.
92 Ponti, A.; Machacek, M.; Gupton, S. L.; Waterman‐Storer, C. M.; Danuser, G. Science 2004, 305, 1782–1786.
93 Kovar, D. R.; Pollard, T. D. Proc. Natl. Acad. Sci. U. S. A. 2004, 101, 14725–14730.
94 Footer, M. J.; Kerssemakers, J. W.; Theriot, J. A.; Dogterom, M. Proc. Natl. Acad. Sci. U. S. A. 2007, 104, 2181–2186.
95 Galkin, V. E.; Orlova, A.; Vos, M. R.; Schroder, G. F.; Egelman, E. H. Structure 2015, 23, 173–182.
96 Holmes, K. C.; Popp, D.; Gebhard, W.; Kabsch, W. Nature 1990, 347, 44–49.
97 Sablin, E. P.; Dawson, J. F.; VanLoock, M. S.; Spudich, J. A.; Egelman, E.; Fletterick, R. J. Proc. Natl. Acad. Sci. U. S. A. 2002, 99, 10945–10947.
98 Saunders, M. G.; Voth, G. A. J. Mol. Biol. 2011, 413, 279–291.
99 Iwasa, M.; Maeda, K.; Narita, A.; Maeda, Y.; Oda, T. J. Biol. Chem. 2008, 283, 21045–21053.
100 Pollard, T. D. Annu. Rev. Biophys. Biomol. Struct. 2007, 36, 451–477.
101 Belmont, L. D.; Orlova, A.; Drubin, D. G.; Egelman, E. H. Proc. Natl. Acad. Sci. U. S. A. 1999, 96, 29–34.
102 Drenckhahn, D.; Pollard, T. D. J. Biol. Chem. 1986, 261, 12754–12758.
103 Kudryashov, D. S.; Reisler, E. Biopolymers 2013, 99, 245–256.
104 Kinosian, H. J.; Selden, L. A.; Gershman, L. C.; Estes, J. E. Biochemistry 2002, 41, 6734–6743.
105 McCullough, B. R.; Blanchoin, L.; Martiel, J. L.; De la Cruz, E. M. J. Mol. Biol. 2008, 381, 550–558.
106 Carlier, M.‐F.; Pantaloni, D. J. Mol. Biol. 1997, 269, 459–467.
107 Carlier, M.‐F.; Ressad, F.; Pantaloni, D. J. Biol. Chem. 1999, 274, 33827–33830.
108 Carlier, M.‐F.; Pernier, J.; Montaville, P.; Shekhar, S.; Kühn, S. Cell. Mol. Life Sci. 2015, 72, 3051–3067.
109 Schafer, D. A.; Cooper, J. A. Annu. Rev. Cell Dev. Biol. 1995, 11, 497–518.
110 Pantaloni, D.; Le Clainche, C.; Carlier, M.‐F. Science 2001, 292, 1502–1506.
111 Wang, R.; Carlsson, A. E. Phys. Biol. 2015, 12, 066008.
112 Sit, S. T.; Manser, E. J. Cell Sci. 2011, 124, 679–683.
113 Campellone, K. G.; Welch, M. D. Nat. Rev. Mol. Cell Biol. 2010, 11, 237–251.
114 Svitkina, T. M.; Borisy, G. G. J. Cell Biol. 1999, 145, 1009–1026.
115 Pollard, T. D.; Borisy, G. G. Cell 2003, 112, 453–465.
116 Welch, M. D.; Way, M. Cell Host Microbe 2013, 14, 242–255.
117 Swaney, K. F.; Li, R. Curr. Opin. Cell Biol. 2016, 42, 63–72.
118 Romero, S.; Le Clainche, C.; Didry, D.; Egile, C.; Pantaloni, D.; Carlier, M.‐F. Cell 2004, 119, 419–429.
119 Paul, A. S.; Pollard, T. D. Cell Motil. Cytoskeleton 2009, 66, 606–617.
120 Kovar, D. R.; Harris, E. S.; Mahaffy, R.; Higgs, H. N.; Pollard, T. D. Cell 2006, 124, 423–435.

121 Goode, B. L.; Eck, M. J. Annu. Rev. Biochem. 2007, 76, 593–627.
122 Mattila, P. K.; Lappalainen, P. Nat. Rev. Mol. Cell Biol. 2008, 9, 446–454.
123 Sheetz, M. P.; Spudich, J. A. Nature 1983, 303, 31–35.
124 Veigel, C.; Schmidt, C. F. Nat. Rev. Mol. Cell Biol. 2011, 12, 163–176.
125 Theriot, J. A.; Rosenblatt, J.; Portnoy, D. A.; Goldschmidt‐Clermont, P. J.; Mitchison, T. J. Cell 1994, 76, 505–517.
126 Loisel, T. P.; Boujemaa, R.; Pantaloni, D.; Carlier, M.‐F. Nature 1999, 401, 613–616.
127 Bernheim‐Groswasser, A.; Wiesner, S.; Golsteyn, R. M.; Carlier, M.‐F.; Sykes, C. Nature 2002, 417, 308–311.
128 Delatour, V.; Helfer, E.; Didry, D.; Lê, K. H.; Gaucher, J. F.; Carlier, M.‐F.; Romet‐Lemonne, G. Biophys. J. 2008, 94, 4890–4905.
129 Boukellal, H.; Campas, O.; Joanny, J.‐F.; Prost, J.; Sykes, C. Phys. Rev. E Stat. Nonlin. Soft Matter Phys. 2004, 69, 061906.
130 Wiesner, S.; Helfer, E.; Didry, D.; Ducouret, G.; Lafuma, F.; Carlier, M.‐F.; Pantaloni, D. J. Cell Biol. 2003, 160, 387–398.
131 Samarin, S.; Romero, S.; Kocks, C.; Didry, D.; Pantaloni, D.; Carlier, M.‐F. J. Cell Biol. 2003, 163, 131–142.
132 Bieling, P.; Li, T. D.; Weichsel, J.; McGorty, R.; Jreij, P.; Huang, B.; Fletcher, D. A.; Mullins, R. D. Cell 2016, 164, 115–127.
133 Akin, O.; Mullins, R. D. Cell 2008, 133, 841–851.
134 Trichet, L.; Campas, O.; Sykes, C.; Plastino, J. Biophys. J. 2007, 92, 1081–1089.
135 Marcy, Y.; Prost, J.; Carlier, M. F.; Sykes, C. Proc. Natl. Acad. Sci. U. S. A. 2004, 101, 5992–5997.
136 Helfer, E.; Nevalainen, E. M.; Naumanen, P.; Romero, S.; Didry, D.; Pantaloni, D.; Lappalainen, P.; Carlier, M.‐F. EMBO J. 2006, 25, 1184–1195.
137 Berro, J.; Michelot, A.; Blanchoin, L.; Kovar, D. R.; Martiel, J. L. Biophys. J. 2007, 92, 2546–2558.
138 Reymann, A. C.; Boujemaa‐Paterski, R.; Martiel, J.‐L.; Guérin, C.; Cao, W.; Chin, H. F.; De La Cruz, E. M.; Théry, M.; Blanchoin, L. Science 2012, 336, 1310–1314.
139 Ennomani, H.; Letort, G.; Guérin, C.; Martiel, J.‐L.; Cao, W.; Nédélec, F.; De La Cruz, E. M.; Théry, M.; Blanchoin, L. Curr. Biol. 2016, 26, 616–626.
140 Ciobanasu, C.; Faivre, B.; Le Clainche, C. Nat. Commun. 2014, 5, 3095.
141 Galland, R.; Leduc, P.; Guérin, C.; Peyrade, D.; Blanchoin, L.; Théry, M. Nat. Mater. 2013, 12, 416–421.


3 Functional Biomolecular Engineering


3.1 Nucleic Acid Engineering

Enora Prado1,*, Mónika Ádok‐Sipiczki2,* and Corinne Nardin3

1 Institute of Physics Rennes, UMR UR1‐CNRS 6251, Rennes, France
2 Department of Inorganic and Analytical Chemistry, University of Geneva, Geneva, Switzerland
3 Institut pluridisciplinaire de recherche sur l’environnement et les matériaux (IPREM), Equipe Physique Chimie des Polymères (EPCP), Université de Pau et des Pays de l’Adour (UPPA), Pau, France

3.1.1  Introduction

In addition to their role as the carrier of genetic information in nearly all living systems, nucleic acids have been recognized as a construction element for the assembly of various objects to achieve structural arrangements with nanoscale features [1]. In the past few years, increasing attention has been paid to the nanoscale hybridization of two different components, for example, in inorganic–inorganic, organic–inorganic, bioorganic, or bioinorganic nanocomposites, which results in synergistic effects. These materials thus show unusual physicochemical properties and very often behaviors that complement those of the two hybridized components.

3.1.2  How to Synthetically Produce Nucleic Acids?

3.1.2.1  The Chemical Approach

Oligonucleotides can now be routinely synthesized by automated solid‐phase synthesis based on phosphoramidite chemistry. This allows the synthesis of any desired sequence at scales ranging from 10 nmol to 1 µmol. As a consequence, oligonucleotides can be designed and synthesized so as to be complementary to any target sequence in a gene or a genome. Very interestingly, and as described in detail in the following, nucleic acids can also be chemically modified. The nuclease sensitivity of natural nucleic acids, their poor penetration across the cell membrane, and their unfavorable pharmacokinetic and pharmacodynamic properties limit the success of nucleic acid therapeutics [2]. A solution to this stability problem is the development of modified synthetic nucleic acids. During the last decades, several improvements were achieved by introducing structural modifications in the nucleic acid structure, and a large number of nucleic acids containing modified sugars, bases, and backbones have been developed and evaluated for their effects on intrinsic properties [3]. It is not possible to give an exhaustive list of the chemical modifications available. For example, among promising structural modifications, phosphorothioate or morpholino backbones instead of phosphodiester ones are now used in clinical trials. Other promising structural modifications include 2′‐OH modifications (e.g., 2′‐O‐methyl), locked nucleic acids (LNAs), peptide nucleic acids (PNAs), and hexitol nucleic acids (HNAs) (Figure 3.1) [5]. Another interesting kind of modification concerns the coupling of natural or modified nucleic acids to molecules, antibodies or their fragments, liposomal components, saccharides, hormones, proteins and peptides, toxins, fluorophores or photoprobes, inhibitors, enzymes, growth factors and vitamins, and natural or synthetic polymers [4, 6, 7]. Conjugation is advantageous over structural modification because it not only improves the existing nucleic acid properties but also endows nucleic acids with entirely new properties. Modified or conjugated nucleic acids are not limited to therapeutics but can also be used in diagnostics (e.g., DNA imaging and DNA microarrays) and nanotechnology [3, 8, 9].

* These authors contributed equally to the work.
Bionanocomposites: Integrating Biological Processes for Bioinspired Nanotechnologies, First Edition. Edited by Carole Aimé and Thibaud Coradin. © 2017 John Wiley & Sons, Inc. Published 2017 by John Wiley & Sons, Inc.

3.1.2.2  Polymerase Chain Reaction

The understanding of the replication and transcription–translation processes in the cell allowed the development of a major technique for the biotechnology field, the polymerase chain reaction (PCR). PCR is a technology used to amplify a single copy or a few copies of a piece of DNA/RNA across several orders of magnitude, generating thousands to millions of copies of a particular DNA/RNA sequence. In 1971, a paper in the Journal of Molecular Biology by Kleppe and coworkers first described a method using an in vitro enzymatic assay to replicate a short DNA template with primers [10]. However, this early manifestation of the basic PCR principle did not receive much attention, and the invention of PCR in 1983 is generally credited to Kary Mullis [11]. In 1993, Mullis was awarded the Nobel Prize in Chemistry, along with Michael Smith, for his work on PCR. It is now a common and often indispensable technique used in medical and biological research laboratories for a wide range of applications [12]. These include DNA cloning for sequencing [13]; diagnosis of hereditary diseases [14]; identification of genetic

Figure 3.1  Structures of natural and modified oligonucleotide backbones: (a) natural phosphodiester, (b) phosphorothioate, (c) morpholino, (d) locked nucleic acid (LNA), (e) peptide nucleic acid (PNA), and (f) hexitol nucleic acid (HNA). B = purine/pyrimidine bases [Source: Singh et al. [4]. Reproduced with permission of The Royal Society of Chemistry].

fingerprints (used in forensic sciences and paternity testing) [15]; and detection as well as diagnosis of diseases [16].

As the name indicates, PCR is a chain reaction based on a relatively simple principle: a DNA molecule is used to produce an exponentially increasing number of copies. This continuous replication is accomplished by polymerases (polDNA or polRNA, depending on the chemical nature of the first strand). As explained earlier, these enzymes are able to string together individual DNA building blocks to form long molecular strands. To achieve this task, polymerases require a supply of synthetic nucleotides, which are added to the PCR medium. They also need a small fragment of DNA, known as the primer, to which they attach the building blocks, and a longer DNA molecule to serve as a template for constructing the new strand. If these three ingredients are supplied, the enzymes will construct exact copies of the template.

The basic reaction chain uses repeated cycles, each of which consists of three steps. First, the reaction solution containing the DNA (or RNA) molecules to be copied, polymerases, primers, and nucleotides is heated. This causes the two complementary strands to separate, a process known as denaturing or melting. Then, lowering the temperature induces the primers to bind to the DNA (a process referred to as hybridization or annealing). The resulting bonds are stable only if the primer and the DNA segment are complementary. The polymerases then start to attach additional complementary nucleotides to these primers, thus strengthening the bonds between the primers and the DNA. Finally, the extension step occurs. The temperature is increased again to the ideal working temperature of the polymerases used, which add further nucleotides to the developing DNA strand. Simultaneously, any loose bonds formed between primers and DNA segments that are not fully complementary break (Figure 3.2).
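The cycling scheme above can be turned into a small numerical sketch. It assumes ideal conditions in which every template is copied once per cycle, and it adds a hypothetical `efficiency` parameter (not part of the original text) to show how incomplete copying would slow the growth. The function name `pcr_copies` is likewise only illustrative.

```python
def pcr_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Expected number of template copies after a given number of PCR cycles.

    efficiency is the assumed fraction of templates copied in each cycle:
    1.0 means perfect doubling, 0.0 means no amplification at all.
    """
    if initial_copies < 0 or cycles < 0:
        raise ValueError("initial_copies and cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    # Each cycle multiplies the copy number by (1 + efficiency).
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    for n in (10, 20, 30):
        ideal = pcr_copies(1, n)
        reduced = pcr_copies(1, n, efficiency=0.9)
        print(f"{n} cycles: {ideal:.3g} copies (ideal), {reduced:.3g} copies (90% efficiency)")
```

With perfect doubling, a single template gives 2³⁰, that is roughly 10⁹, copies after 30 cycles; any efficiency below 1 lowers this figure considerably.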
In each cycle, these three steps are repeated and, as a result, the number of copied DNA molecules doubles. After 30 cycles, about a billion (2³⁰ ≈ 10⁹) molecules are replicated from a single segment of double‐stranded DNA.

Since the discovery of PCR, many nucleic acid amplification technologies have been developed, for example, real‐time polymerase chain reaction (qPCR), which is used to amplify and simultaneously detect or quantify a targeted DNA molecule [17], and reverse transcription polymerase chain reaction (RT‐PCR), which allows the detection of RNA expression [17, 18]. RT‐PCR and qPCR are often confused, but they are separate and distinct techniques. While RT‐PCR is used to qualitatively detect gene expression through the creation of complementary DNA (cDNA) transcripts from RNA, qPCR is used to quantitatively measure the amplification of DNA using fluorescent probes [19]. In the clinical field, qPCR is used, for example, to quantify viral DNA during HIV‐1 infection [20]. Other examples of nucleic acid amplification techniques are digital polymerase chain reaction (dPCR) [14], rolling circle amplification (RCA) [21], strand displacement amplification (SDA) [22], and so on [23]. In addition to the comprehension of the general

Figure 3.2  Basic principle of PCR: 1, denaturing step; 2, hybridization step; 3, extension step. [Schematic: temperature plotted against time over N cycles; the reaction mixture contains DNA matrices, nucleotides, polymerase, and primers, and 1 copy is amplified to 10⁹ copies.]

duplication process, an important parameter for the development of these technologies is the choice of the enzymatic system, notably polDNA and polRNA. In general, three aspects of a polymerase should be considered: processivity, fidelity, and persistence. Processivity refers to the rate at which a polymerase makes the complementary copy of the template. Fidelity is the accuracy of the complementary copy being made; it is the most important of the three features. Finally, persistence, which refers to the stability of the enzyme at high temperature, is intimately linked to the other two attributes. Moreover, the enzymatic system needs to resist inhibition [24] and tolerate the incorporation of modified nucleotides [25].

DNA polymerase I (Pol I) from Escherichia coli was the first polymerase to be identified and characterized [26]. After this discovery, the search for other polymerases was extended to numerous organisms. Since it was first isolated, the DNA polymerase from Thermus aquaticus (Taq DNA polymerase) has become the standard reagent for PCR. Taq DNA polymerase is stable at 95 °C, which enabled the automation of the PCR process. The gene was cloned and used to produce the enzyme in non‐thermophilic host bacteria, so both native Taq, isolated from T. aquaticus, and cloned Taq, isolated from expression systems in other bacteria, are commercially available. In addition, a number of other thermally stable DNA polymerases, isolated from other


thermophilic species, have become available. Among these are enzymes from Pyrococcus furiosus (Pfu polymerase), Thermus thermophilus (Tth polymerase), Thermus flavus (Tfl polymerase), Thermococcus litoralis (Tli polymerase, also known as Vent polymerase), and Pyrococcus species GB‐D (Deep Vent polymerase) [27]. Each of these and other polymerases has a specific set of attributes and can be selected according to the amplification technique used.

3.1.2.3  Combinatorial Synthesis of Oligonucleotides and Gene Libraries: Aptamers

Knowledge of PCR processes has also allowed the development of another major technique in the biotechnology field, the Systematic Evolution of Ligands by EXponential enrichment (SELEX). The aim of this method, attributed to Tuerk and Gold in 1990, is to select a specific sequence of oligonucleotides (or peptides [28]) that binds specifically to a target molecule [29]. The selected oligonucleotide is named an aptamer, which refers to the Latin word aptus meaning attached, tied, or tethered. The SELEX process is a combinatorial technique for screening very large libraries of oligonucleotides by an iterative process of in vitro selection and amplification. The general principle of the SELEX technology has been discussed in detail in numerous reviews [6, 8, 30]. It is based on the incubation of an oligonucleotide library of random sequences (including 10¹⁴–10¹⁵ different oligonucleotides) with a target of interest. Oligonucleotides with an affinity toward the target are isolated from the tremendous number of species in the library, the unbound fraction being separated by techniques such as affinity chromatography, filter binding, or capillary electrophoresis [30c]. Then, the bound oligonucleotides are eluted and amplified by PCR to obtain an enriched library, which is used for the next selection/amplification cycle. The selection pressure rises with every SELEX round. After several cycles (around 10–15), the enriched library is cloned and sequenced. The individual sequences are then analyzed in order to identify the consensus motif, which corresponds to the minimal sequence required for target‐specific binding. The aptamers obtained fold into specific three‐dimensional (3D) structures to bind the target with dissociation constants in the pico‐ and nanomolar range [8] (Figure 3.3).
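The iterative selection and amplification described above can be illustrated with a deliberately simplified model. It is only a sketch under stated assumptions: the library is split into "binders" and "background" sequences, each selection step retains them with two hypothetical probabilities (`keep_binder` and `keep_background`, values chosen purely for illustration), and PCR amplification is assumed to be unbiased, so that it restores the pool size without changing the binder fraction.

```python
def enrich(fraction_binders: float, keep_binder: float, keep_background: float) -> float:
    """Binder fraction after one selection step followed by unbiased amplification."""
    kept_binders = fraction_binders * keep_binder
    kept_background = (1.0 - fraction_binders) * keep_background
    total = kept_binders + kept_background
    if total == 0.0:
        raise ValueError("nothing survives selection with these parameters")
    return kept_binders / total


def selex(fraction_binders: float, rounds: int,
          keep_binder: float = 0.5, keep_background: float = 0.005) -> list:
    """Binder fraction before the first round and after each subsequent round."""
    history = [fraction_binders]
    for _ in range(rounds):
        history.append(enrich(history[-1], keep_binder, keep_background))
    return history


if __name__ == "__main__":
    # Assumed starting point: one binder per 10^10 library members.
    for round_number, fraction in enumerate(selex(1e-10, 10)):
        print(f"round {round_number:2d}: binder fraction = {fraction:.3e}")
```

In this toy model the odds of picking a binder grow by the ratio `keep_binder / keep_background` in every round, which is one way to see why a handful of rounds can be enough to make an initially negligible population dominate the pool.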
It is important to note here that the randomized oligonucleotide library is generated by solid‐phase synthesis, in which a first nucleoside is covalently linked to a polymeric carrier. The chain is extended by another nucleoside, supplied in a mobile phase, which binds to the free 5′‐OH group of the immobilized nucleoside. A protecting group attached to the 5′‐OH of the incoming nucleoside prevents multiple binding events, so that the length of the aptamer can be controlled. Subsequently, the nonbinding components are washed from the carrier in the mobile phase, while the reaction product remains immobilized on the carrier. The protecting group is then removed from the bound nucleoside so that the next base can be added [32]. The SELEX method

Figure 3.3  General SELEX procedure [Source: Schütze et al. [31]. Reproduced with permission of PLoS ONE]. [Schematic labels: initial library, pool generation, target, binding, wash, elution, amplification, regeneration (cycle repeated 10–15 times), and evaluation by binding assay, diversity assay, and sequencing.]

is applicable not only to the selection of aptamers capable of binding target molecules but also to the selection of oligonucleotides with a particular enzymatic activity. In this case, the ability to catalyze the desired chemical reaction is used as the selection criterion [33].

Aptamers are often compared to antibodies,¹ since both classes of molecules function according to the key‐lock principle, making them suitable for the detection or deactivation of a target. Aptamers present a number of notable advantages in comparison to antibodies. Their preparation is relatively easy and inexpensive. Moreover, their immunogenicity is lower than that of antibodies, and several aptamers show better specificity, higher affinity to their target, and less nonspecific cross‐reactivity [34]. Furthermore, aptamers can bind to a wider range of structures such as proteins [35], cells [36], small molecules [22, 37], or toxic compounds [38]. Likewise, aptamers are heat and pH stable and more resistant to organic solvents, and they can be denatured and renatured multiple times without loss of activity [39]. Another interesting aspect is that aptamers can be chemically modified and thus easily coupled to a large number of molecules [40], such as siRNAs [41], fluorophores [42], radioisotopes [43], electrochemical [44] or Raman [45] reporters, and nanoparticles [46].

¹ An antibody (an immunoglobulin) is a Y‐shaped protein on the surface of B cells that is secreted into the blood or lymph in response to an antigenic stimulus, such as a bacterium, virus, parasite, or transplanted organ, and that neutralizes the antigen by binding specifically to it.

Currently, aptamers arouse great interest for the development of applications in various fields, such as medicine [47] (drug delivery [48], medical diagnostics [47, 49], therapeutics [50]), imaging [30], environment [44], food safety monitoring [51], molecule detection [35, 52], and so on [30, 53]. In the field of drug delivery, however, a disadvantage of aptamers is their short half‐life in blood, which may be due to their small size, which promotes renal clearance, and to nuclease degradation [54]. Several strategies have been devised to prevent this rapid degradation [55], such as conjugation with polyethylene glycol [56] or poly(2‐alkyl‐2‐oxazoline) [7], or modification with 2′‐fluoropyrimidine [39a] or 2′‐O‐methyl nucleosides [57], in order to increase in vivo activity.

Sensors are devices that respond to physical or chemical stimuli and produce detectable signals. A sensor thus contains at least two components: target recognition and signal transduction. Ideally, target recognition elements should have high affinity (low detection limit), high specificity (low interference), wide dynamic range, fast response time, long shelf life, and good generality for detecting a broad range of analytes with the same class of recognition elements. Given the characteristics of aptamers, it is easy to understand why the majority of nucleic acid sensors use an aptamer as the recognition element. This kind of sensor is named an aptasensor. A large variety of aptasensors has been developed with different transduction components (optical, colorimetric, electrochemical) and different principles of recognition [58]. Since it is not possible to give an exhaustive list of aptasensors, the application of the hybridization principle to nanotechnology is illustrated by some examples of aptasensors based on this process [59].
In the environmental field, heavy metal detection can be achieved using different kinds of aptasensors [60], such as Hg²⁺ sensing in combination with colorimetric detection [61].
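The link made above between high affinity and a low detection limit can be illustrated with a simple equilibrium calculation. The sketch assumes a 1:1 binding model in which the target is in large excess over the aptamer, so that the free target concentration is approximately the total concentration; neither this model nor the function name `fraction_bound` comes from the chapter.

```python
def fraction_bound(target_conc_molar: float, kd_molar: float) -> float:
    """Fraction of aptamer occupied by target at equilibrium (1:1 binding).

    Assumes the target is in excess, so its free concentration is taken
    to be equal to its total concentration.
    """
    if target_conc_molar < 0.0 or kd_molar <= 0.0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return target_conc_molar / (kd_molar + target_conc_molar)


if __name__ == "__main__":
    target = 1e-9  # 1 nM of analyte, an arbitrary illustrative value
    for kd in (1e-12, 1e-9, 1e-6):  # picomolar, nanomolar, micromolar binders
        print(f"Kd = {kd:.0e} M: {fraction_bound(target, kd):.1%} of the aptamer is bound")
```

Under these assumptions, half of the recognition elements are occupied when the analyte concentration equals the dissociation constant, so a lower dissociation constant shifts the useful response toward lower analyte concentrations.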

3.1.3 Secondary Structures in Nanotechnologies

3.1.3.1  Watson–Crick H‐Bonds

As described in Chapter 2, the main and best‐known secondary structure of DNA and RNA is the double helix. The principle of hybridization, combined (or not) with chemical modifications of nucleic acids, has allowed the development of many different applications based on nucleic acids [62], notably in the field of sensors [63].

3.1.3.1.1 Stem‐Loop

A stem‐loop occurs upon intramolecular base pairing between two regions of the same strand, usually complementary in nucleotide sequence when read in opposite directions, forming a double helix that ends in an apical unpaired loop. The principle of stem‐loop formation has already been used to develop different functional nanodevices [64]. One original example of these applications is the design of molecular motors, such as hybridization‐driven DNA walkers. In these systems, the DNA walker plays the role of a hybridization catalyst: it catalyzes the reaction between hairpin fuel strands and the track of the motor. The mechanistic details of this catalysis process are designed to produce unidirectional motion of the walker. As an example, the principle of the autonomous walker is shown in Figure 3.4 [66]. Initially, the walker W is attached to the track with its two feet. Unoccupied footholds T on the track form “inert” hairpin structures. The fuel hairpin molecules F can only be opened by occupied footholds. In this case, the protruding sequence on F acts as an external toehold for hybridization to T (Figure 3.4a). After hybridization of F with the leftmost foothold, the left foot of the walker is detached from the track. It can now invade the next foothold hairpin to the right (Figure 3.4b). After hybridization of W with T, the walker has effectively taken one step to the right. W thus acts as a hybridization catalyst for the reaction of fuel hairpins F with foothold molecules T (Figure 3.4c).

3.1.3.1.2  Kissing Complex

The principle of kissing complex formation has also allowed the selection of a specific DNA aptamer against the HIV‐1 trans‐activation responsive RNA element [67] and the development of a biosensor for the detection of small ligands [68].

3.1.3.2  Other Kinds of H‐Bonding

3.1.3.2.1 G‐Quartets

Various G‐quadruplex‐forming oligonucleotides functioning as aptamers have been investigated for targeting molecules of therapeutic importance. Compared with monoclonal antibodies and other targeting tools, G‐quadruplex aptamers offer several advantages such as non‐immunogenicity, heat stability, biostability, cost effectiveness, ease of chemical synthesis, enhanced cellular uptake efficiency, and flexibility for introducing chemical modifications. In this regard, the most extensive work has been performed on RNA G‐quadruplex‐based aptamers against the prion protein, the agent causing the transmissible neurodegenerative encephalopathy known as Creutzfeldt–Jakob disease [69]. Furthermore, numerous G‐quadruplex‐based RNA aptamers have been identified against targets involved in other diseases. An RNA G‐quadruplex has already been used as a thermoregulator to detect changes in temperature in E. coli. Both DNA and RNA G‐quadruplexes exhibit traits required for use in biosensors, molecular beacons, nanoswitches, nanowires, and nanodevices [64].
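As a brief illustration of what a G‐quadruplex‐forming sequence looks like, the sketch below scans a sequence for a widely used heuristic motif: four runs of at least three guanines separated by loops of one to seven nucleotides. This pattern is an assumption brought in for illustration and is not taken from the chapter; a match only flags a candidate, since actual folding depends on the sequence context and the solution conditions. The names `G4_PATTERN` and `find_putative_g4` are hypothetical.

```python
import re

# Heuristic motif: G-run, loop, G-run, loop, G-run, loop, G-run.
# Loops may contain any base (U is allowed so that RNA can be scanned too).
G4_PATTERN = re.compile(
    r"G{3,}[ACGTU]{1,7}G{3,}[ACGTU]{1,7}G{3,}[ACGTU]{1,7}G{3,}"
)


def find_putative_g4(sequence: str) -> list:
    """Return (start index, matched text) for non-overlapping candidate motifs."""
    seq = sequence.upper()
    return [(match.start(), match.group()) for match in G4_PATTERN.finditer(seq)]


if __name__ == "__main__":
    # Repeats of the human telomeric unit TTAGGG, a classic quadruplex former.
    telomeric = "TTAGGGTTAGGGTTAGGGTTAGGGTTA"
    print(find_putative_g4(telomeric))
    print(find_putative_g4("ATATATATATAT"))
```

Because the loops themselves may contain guanines, one region can often be parsed in several ways; the regular expression reports only one of them.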

Figure 3.4  Autonomous DNA walker by Yin et al. The walker W consists of a DNA duplex with two single‐stranded “feet.” (a) Initially, the walker is attached to the track. The fuel hairpin molecules F can only be opened by occupied footholds (protruding sequence). (b) After hybridization of F, the left foot of the walker is detached from the track. (c) After hybridization of W with T, the walker has effectively taken one step to the right. W acts as a hybridization catalyst [Source: Simmel [65]. Reproduced with permission of WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim].

Similar to G‐quartets, i‐motifs are four‐stranded DNA secondary structures that can be assembled from cytosine‐rich sequences. Stabilized in acidic conditions, they are composed of two parallel‐stranded DNA duplexes held together in an antiparallel orientation by intercalated cytosine–cytosine⁺ base pairs. By virtue of their pH‐dependent folding, i‐motif‐forming DNA sequences have been used extensively as pH switches in nanotechnology [70]. The possibility of forming higher order structures makes these motifs an excellent module for the design of nanodevices capable of performing linear or rotary movements. The nanomotor switches reversibly between the two conformations through alternating DNA hybridization and strand exchange reactions, which enables it to perform an inchworm‐like extending–shrinking motion [71]. Such DNA‐based switches are expected to find application as autonomous biosensors in biomedicine or as actuators for DNA‐based molecular motors and robots. Potential applications of quadruplex‐based nano‐architectures include, on the one hand, the use of more rigid DNA scaffolds for the precise positioning of macromolecules (at the nanometer scale) and, on the other hand, the controlled conformational transition from the folded quadruplex to the extended duplex form, which may be employed in the field of nanorobotics. Quadruplex structures comprise a wide class of well‐defined conformational states and structural polymorphs. Their conformation and stability can be tuned by varying the sequence and/or the length of the nucleic acid, by modifying solution conditions such as salt composition or pH, or by using small molecules that bind specifically to them [72].

3.1.3.2.2  Origami: Nano‐architecture on Surface

The recognition of DNA molecules by their complements can be used for more than the formation of a simple double helix. Genetic engineers recognized in the early 1970s that short single‐stranded overhangs protruding from the end of a double‐stranded helical DNA (termed “sticky ends”) could be used to direct the intermolecular association of different DNA molecules. Sticky ends provide the most readily programmable and predictable intermolecular interactions known, from the perspectives of both affinity and structure. A key feature of the double helix is that its axis is linear, not in the geometric sense of being a straight line but in the topological sense of being unbranched. The biological relevance of this fact is that only a linear complement to a DNA strand is well defined [73]. Branched molecules occur naturally in DNA metabolism, but only as ephemeral intermediates, such as the four‐arm Holliday junction or the disjoint three‐arm replication fork. These intermediates are unstable because they contain homologous sequence symmetry, which allows them to isomerize via branch migration. However, since the preparation of synthetic oligonucleotides was reported [74], it has been possible to eliminate this symmetry and produce DNA molecules with stable branch points.
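The programmability of sticky ends follows directly from Watson–Crick complementarity, which the following sketch makes explicit. It assumes that both overhangs are written 5′→3′ and contain only the four standard DNA bases; the function names are illustrative and not taken from the chapter.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def sticky_ends_cohesive(overhang_a: str, overhang_b: str) -> bool:
    """True if two single-stranded overhangs (both 5'->3') can pair fully."""
    return overhang_a.upper() == reverse_complement(overhang_b)


if __name__ == "__main__":
    print(sticky_ends_cohesive("AATT", "AATT"))  # self-complementary overhang
    print(sticky_ends_cohesive("AAGC", "GCTT"))  # designed complementary pair
    print(sticky_ends_cohesive("AAGC", "AAGC"))  # identical but not complementary
```

Choosing overhangs that are complementary to exactly one partner, and to nothing else in the mixture, is what lets a designer decide which duplexes are allowed to join.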


Moreover, if one combines sticky‐ended cohesion with branched DNA species, it is possible to produce various complex structures and topologies with relatively little effort [75]. The resulting superstructures, often termed DNA tiles, are used as building blocks for further assembly into discrete finite objects or infinite periodic lattices. Despite the many positive and promising results, certain disadvantages remain. The formation of large structures requires exact stoichiometric control and often purification of the constituent oligonucleotides and/or tiles, resulting in frequent errors and relatively long synthetic processes. Additionally, the complexity of the structures that can be produced with this strategy is limited to simple geometric shapes and the repetition of basic building blocks.

Another milestone in DNA structure design is the realization of the concept of DNA origami. Nucleic acid sequences can be designed so that the strands bend into well‐defined secondary and tertiary structures. The DNA origami technique folds a long single‐stranded DNA (the scaffold) into a desired shape with the help of hundreds of short oligonucleotides, called staple strands. Staples are designed to bind several, sometimes distant, regions of the scaffold, based on Watson–Crick complementarity and the formation of immobile four‐arm junctions, known as Holliday junctions, at strand exchange points that tie together neighboring helices. The scaffold strand is thus folded into an addressable shape that can display desired patterns on its surface.

Through the pioneering work of Seeman and later of Rothemund, the application of DNA as a building material for the construction of nanoscale structures became viable [76]. Their goal was to assemble DNA into 3D crystalline lattices to further scaffold biological macromolecules, nanodevices, or nano‐electronic components. Subsequently, substantial progress has been made in the field of DNA nanotechnology.
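The scaffold‐and‐staple description above implies a simple order‐of‐magnitude estimate for the number of staples needed. The sketch below assumes, purely for illustration, a 7249‐nucleotide scaffold (the length of the M13mp18 genome often used for this purpose) and 32‐nucleotide staples; neither value is given in the chapter, and real designs leave some scaffold regions unpaired and use staples of varying length.

```python
def staple_estimate(scaffold_length_nt: int, staple_length_nt: int) -> tuple:
    """Return (number of full-length staples, scaffold nucleotides left unpaired).

    Assumes every staple has the same length and pairs with the scaffold
    along its whole length, which is a simplification.
    """
    if scaffold_length_nt <= 0 or staple_length_nt <= 0:
        raise ValueError("lengths must be positive")
    return divmod(scaffold_length_nt, staple_length_nt)


if __name__ == "__main__":
    staples, leftover = staple_estimate(7249, 32)  # assumed illustrative values
    print(f"about {staples} staples, {leftover} scaffold nucleotides left over")
```

An estimate of this kind is consistent with the "hundreds of short oligonucleotides" mentioned above.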
With the introduction of the scaffold‐based origami method, DNA has been used to construct increasingly complex self‐assembled nanostructures in one, two, and three dimensions using various design strategies, as demonstrated in Figure 3.5 [78].

While the tile‐based assembly process is a useful tool to construct large and simple structures, the DNA origami method is more suitable for the production of smaller, more sophisticated, and well‐defined structures. If folded into a 2D shape, the size of the DNA origami is often on the order of 100 nm, a length scale accessible by lithographic methods. In addition, each of the staple strands is unique, meaning that one can modify individual staple strands to introduce different functional groups at specific locations on the DNA scaffold with up to 6 nm resolution. Several research groups have demonstrated the possibility of displaying nucleic acids on the origami surface. For example, when mRNA probes are attached to DNA origami structures, they can be used for the detection of gene expression at the single‐molecule level [79]. These findings open the door for using the origami technique for a wide variety of

Figure 3.5  DNA origami and its applications: arbitrary 2D nanostructures, nano‐sized breadboards for the arraying of nanomaterials, and 3D nanostructures such as hollow polyhedrons [Source: Kuzuya and Komiyama [77]. Reproduced with permission of The Royal Society of Chemistry]. [Schematic labels: scaffold + staple strands; 2D shapes; 2D nanoarrays; 3D nanostructures.]

diagnostic applications. Proteins, nanoparticles, and other functional elements have been specifically positioned into designed patterns on these structures. When modified with specific linker molecules, such patterns can function as templates for the assembly of crossed pairs of carbon nanotubes for potential use as field‐effect transistors [80]. They can also act as templates to study chemical reactions, help in the structural determination of proteins, and be used as platforms for genomic and drug delivery applications. More biological applications were summarized by Tintoré et al. [81]. Various 3D origami structures are now known, and selective encapsulation of a guest molecule, such as an enzyme or an inorganic nanomaterial, is expected to be feasible in the near future.

DNA can also recognize a variety of target molecules or ions in a sequence‐dependent fashion. Such recognition events usually result in a conformational change of the DNA motifs, originating from the inherent dynamics of nucleic acids. Multiple stimulated motions can be integrated to drive complex mechanical movements and logic computing processes through energetically favored motion paths. The target entities can be nucleic acid strands, metal ions, H⁺ or OH⁻, small molecules, or proteins. The diversity of machine configurations and target molecules leads to DNA nanostructures that perform versatile mechanical functions, including walkers, tweezers, gears, rotators, metronome operations, and actuators [82].


3.1.4 Conclusion

With the support of more than 100 references, we have described how the structure, properties, and higher order organization and function of nucleic acids can be diverted as a tool in nanobiotechnologies. This chapter starts with chemical approaches to structural modification, including 2′‐OH modifications (e.g., 2′‐O‐methyl), LNAs, PNAs, and HNAs, which appear promising for potential application as therapeutics. Then, as the most advanced examples of synthetic reactions, we describe in detail PCR and the in vitro selection of nucleic acid sequences named aptamers, which selectively recognize a specific target. Aptamers are currently being investigated to develop sensors, whereas in the nanotechnology field hybridization has been used to design an autonomous walker. With the introduction of the scaffold‐based origami method, DNA has been used to construct increasingly complex self‐assembled nanostructures in one, two, and three dimensions using various design strategies. One step further, the integration of DNA technology within bionanocomposites will allow the design of hierarchically structured and bifunctional self‐assembled materials that will be described later on in this book.

References

1 Singh, V.; Zharnikov, M.; Gulino, A.; Gupta, T. J. Mater. Chem. 2011, 21, 10602–10618.
2 Borna, H.; Imani, S.; Iman, M.; Jamalkandi, S. A. Expert Opin. Biol. Ther. 2015, 15, 269–285.
3 Bell, N. M.; Micklefield, J. ChemBioChem 2009, 10, 2691–2703.
4 (a) Zheng, J.; Yang, R.; Shi, M.; Wu, C.; Fang, X.; Li, Y.; Li, J.; Tan, W. Chem. Soc. Rev. 2015, 44, 3036–3055; (b) Cobo, I.; Li, M.; Sumerlin, B. S.; Perrier, S. Nat. Mater. 2015, 14, 143–159; (c) He, D.; Wagner, E. Macromol. Biosci. 2015, 15, 600–612; (d) Dong, Y. C.; Liu, D. S.; Yang, Z. Q. Methods 2014, 67, 116–122; (e) Singh, Y.; Murat, P.; Defrancq, E. Chem. Soc. Rev. 2010, 39, 2054–2070; (f) Neuberg, P.; Kichler, A. Nonviral Vectors for Gene Therapy Lipid‐ and Polymer‐Based Gene Transfer 2014, 88, 263–288.
5 (a) Shi, H. H.; Yang, F. P.; Li, W. J.; Zhao, W. W.; Nie, K. X.; Dong, B.; Liu, Z. C. Biosens. Bioelectron. 2015, 66, 481–489; (b) Barluenga, S.; Winssinger, N. Acc. Chem. Res. 2015, 48, 1319–1331; (c) Braasch, D. A.; Corey, D. R. Chem. Biol. 2001, 8, 1–7; (d) Petersen, M.; Wengel, J. Trends Biotechnol. 2003, 21, 74–81.
6 Zhang, Y.; Xing, S.‐G.; Wang, Z.; Kang, Q.‐H.; Ling, Y.; Yao, M.‐Y.; He, Y.‐P.; Jin, Y.; Chu, X.‐G. Prog. Biochem. Biophys. 2015, 42, 236–243.
7 Kedracki, D.; Maroni, P.; Schlaad, H.; Vebert‐Nardin, C. Adv. Funct. Mater. 2014, 24, 1133–1139.
8 Hernandez, L. I.; Machado, I.; Schaefer, T.; Hernandez, F. J. Curr. Top. Med. Chem. 2015, 15, 1066–1081.
9 (a) Krishnan, Y.; Simmel, F. C. Angew. Chem. Int. Ed. 2011, 50, 3124–3156; (b) Bhatia, D.; Sharma, S.; Krishnan, Y. Curr. Opin. Biotechnol. 2011, 22, 475–484; (c) Li, J.; Wang, X. Y.; Liang, X. G. Chem. Asian J. 2014, 9, 3344–3358; (d) Ng, D. Y. W.; Wu, Y.; Kuan, S. L.; Weil, T. Acc. Chem. Res. 2014, 47, 3471–3480; (e) Jamali, A. A.; Pourhassan‐Moghaddam, M.; Dolatabadi, J. E. N.; Omidi, Y. TrAC Trends Anal. Chem. 2014, 55, 24–42.
10 Kleppe, K.; Ohtsuka, E.; Kleppe, R.; Molineux, I.; Khorana, H. G. J. Mol. Biol. 1971, 56, 341–361.
11 (a) Bartlett, J. M. S.; Stirling, D. A short history of the polymerase chain reaction, Vol. 226, Humana Press, Totowa, 2003, pp. 3–6; (b) Mullis, K. B.; Erlich, H. A.; Arnheim, N.; Horn, G. T.; Saiki, R. K.; Scharf, S. J. Process for amplifying, detecting, and/or cloning nucleic acid sequences, Cetus Corporation, Emeryville, US Patent 4683195, July 28, 1987.
12 (a) Salazar, J. K.; Wang, Y.; Yu, S.; Wang, H.; Zhang, W. J. Microbiol. Methods 2015, 110, 18–26; (b) De Medici, D.; Kuchta, T.; Knutsson, R.; Angelov, A.; Auricchio, B.; Barbanera, M.; Diaz‐Amigo, C.; Fiore, A.; Kudirkiene, E.; Hohl, A.; Tomic, D. H.; Gotcheva, V.; Popping, B.; Prukner‐Radovcic, E.; Scaramaglia, S.; Siekel, P.; To, K. A.; Wagner, M. Food Anal. Methods 2015, 8, 255–271; (c) Rawlence, N. J.; Lowe, D. J.; Wood, J. R.; Young, J. M.; Churchman, G. J.; Huang, Y. T.; Cooper, A. J. Quaternary Sci. 2014, 29, 610–626.
13 Sharma, K.; Mishra, A. K.; Mehraj, V.; Duraisamy, G. S. Biotechnol. Genet. Eng. Rev. 2014, 30, 65–78.
14 Huggett, J. F.; Cowen, S.; Foy, C. A. Clin. Chem. 2015, 61, 79–88.
15 Perez‐Toralla, K.; Pekin, D.; Bartolo, J. F.; Garlan, F.; Nizard, P.; Laurent‐Puig, P.; Baret, J. C.; Taly, V. Med. Sci. 2015, 31, 84–92.
16 (a) Deng, H. M.; Gao, Z. Q. Anal. Chim. Acta 2015, 853, 30–45; (b) Li, D.; Liu, B.; Chen, M.; Guo, D.; Guo, X.; Liu, F.; Feng, L.; Wang, L. J. Microbiol. Methods 2010, 82, 71–77.
17 Logan, J.; Edwards, K. J.; Saunders, N. A. Real‐time PCR: current technology and applications, Horizon Scientific Press, Norfolk, 2009.
18 Sanders, R.; Mason, D. J.; Foy, C. A.; Huggett, J. F. Anal. Bioanal. Chem. 2014, 406, 6471–6483.
19 Evans, M. F. Diagn. Histopathol. 2009, 15, 344–356.
20 Alidjinou, E. K.; Bocket, L.; Hober, D. Pathol. Biol. 2015, 63, 53–59.
21 Wu, L. D.; Ma, C.; Zheng, X. X.; Liu, H. Y.; Yu, J. H. Biosens. Bioelectron. 2015, 68, 413–420.
22 (a) Mokhtarzadeh, A.; Dolatabadi, J. E. N.; Abnous, K.; de la Guardia, M.; Ramezani, M. Biosens. Bioelectron. 2015, 68, 95–106; (b) He, J.‐L.; Wu, Z.‐S.; Zhou, H.; Wang, H.‐Q.; Jiang, J.‐H.; Shen, G.‐L.; Yu, R.‐Q. Anal. Chem. 2010, 82, 1358–1364.

23 (a) Fakruddin, M.; Mannan, K. S. B.; Chowdhury, A.; Mazumdar, R. M.; Hossain, M. N.; Islam, S.; Chowdhury, M. A. J. Pharm. Bioallied Sci. 2013, 5, 245–252; (b) Csako, G. Clin. Chim. Acta 2006, 363, 6–31.
24 Gadkar, V. J.; Filion, M. Curr. Issues Mol. Biol. 2014, 16, 1–5.
25 Walsh, J. M.; Beuning, P. J. J. Nucleic Acids 2012, 2012, 17, ID 530963.
26 Lawyer, F. C.; Stoffel, S.; Saiki, R. K.; Myambo, K.; Drummond, R.; Gelfand, D. H. J. Biol. Chem. 1989, 264, 6427–6437.
27 Terpe, K. Appl. Microbiol. Biotechnol. 2013, 97, 10243–10254.
28 Reverdatto, S.; Burz, D. S.; Shekhtman, A. Curr. Top. Med. Chem. 2015, 15, 1082–1101.
29 Tuerk, C.; Gold, L. Science 1990, 249, 505–510.
30 (a) Dougherty, C. A.; Cai, W.; Hong, H. Curr. Top. Med. Chem. 2015, 15, 1138–1152; (b) Ozalp, V. C.; Kavruk, M.; Dilek, O.; Bayrac, A. T. Curr. Top. Med. Chem. 2015, 15, 1125–1137; (c) Peyrin, E. J. Sep. Sci. 2009, 32, 1531–1536; (d) Chauveau, F.; Pestourie, C.; Tavitian, B. Pathol. Biol. 2006, 54, 251–258.
31 Schütze, T.; Wilhelm, B.; Greiner, N.; Braun, H.; Peter, F.; Mörl, M.; Erdmann, V. A.; Lehrach, H.; Konthur, Z.; Menger, M.; Arndt, P. F.; Glökler, J. PLoS One 2011, 6, e29604.
32 Albericio, F. Solid‐phase synthesis: a practical guide, CRC Press, London, 2000.
33 Liu, J.; You, M.; Pu, Y.; Liu, H.; Ye, M.; Tan, W. Curr. Med. Chem. 2011, 18, 4117–4125.
34 Mayer, G. Angew. Chem. Int. Ed. 2009, 48, 2672–2689.
35 (a) Ma, H.; Liu, J.; Ali, M. M.; Mahmood, M. A. I.; Labanieh, L.; Lu, M.; Iqbal, S. M.; Zhang, Q.; Zhao, W.; Wan, Y. Chem. Soc. Rev. 2015, 44, 1240–1256; (b) Strehlitz, B.; Nikolaus, N.; Stoltenburg, R. Sensors 2008, 8, 4296; (c) Wang, Y.; Xu, D.; Chen, H.‐Y. Lab Chip 2012, 12, 3184–3189.
36 Liao, J.; Liu, B.; Liu, J.; Zhang, J.; Chen, K.; Liu, H. Expert Opin. Drug Deliv. 2015, 12, 493–506.
37 (a) Lee, J. H.; Yigit, M. V.; Mazumdar, D.; Lu, Y. Adv. Drug Deliv. Rev. 2010, 62, 592–605; (b) Feng, C.; Dai, S.; Wang, L. Biosens. Bioelectron. 2014, 59, 64–74.
38 Handy, S. M.; Yakes, B. J.; DeGrasse, J. A.; Campbell, K.; Elliott, C. T.; Kanyuck, K. M.; Degrasse, S. L. Toxicon 2013, 61, 30–37.
39 (a) Farokhzad, O. C.; Cheng, J.; Teply, B. A.; Sherifi, I.; Jon, S.; Kantoff, P. W.; Richie, J. P.; Langer, R. Proc. Natl. Acad. Sci. U. S. A. 2006, 103, 6315–6320; (b) Liss, M.; Petersen, B.; Wolf, H.; Prohaska, E. Anal. Chem. 2002, 74, 4488–4495.
40 Yang, D.; Hartman, M. R.; Derrien, T. L.; Hamada, S.; An, D.; Yancey, K. G.; Cheng, R.; Ma, M.; Luo, D. Acc. Chem. Res. 2014, 47, 1902–1911.
41 Takahashi, M.; Burnett, J. C.; Rossi, J. J. Adv. Exp. Med. Biol. 2015, 848, 211–234.
42 Sassolas, A.; Blum, L. J.; Leca‐Bouvier, B. D. Biosens. Bioelectron. 2011, 26, 3725–3736.
43 Rockey, W. M.; Huang, L.; Kloepping, K. C.; Baumhover, N. J.; Giangrande, P. H.; Schultz, M. K. Bioorg. Med. Chem. 2011, 19, 4080–4090.
44 Turdean, G. L. Int. J. Electrochem. 2011, 2011, ID 343125.
45 Zengin, A.; Tamer, U.; Caykara, T. J. Mater. Chem. B 2015, 3, 306–315.
46 Kanwar, J. R.; Roy, K.; Kanwar, R. K. Crit. Rev. Biochem. Mol. Biol. 2011, 46, 459–477.
47 (a) Ye, M.; Hu, J.; Peng, M.; Liu, J.; Liu, J.; Liu, H.; Zhao, X.; Tan, W. Int. J. Mol. Sci. 2012, 13, 3341–3353; (b) Toh, S. Y.; Citartan, M.; Gopinath, S. C. B.; Tang, T.‐H. Biosens. Bioelectron. 2015, 64, 392–403; (c) Sriramoju, B.; Kanwar, R.; Veedu, R. N.; Kanwar, J. R. Curr. Top. Med. Chem. 2015, 15, 1115–1124.
48 Zhou, J.; Rossi, J. J. Oligonucleotides 2011, 21, 1–10.
49 Kadioglu, O.; Malczyk, A. H.; Greten, H. J.; Efferth, T. Invest. New Drugs 2015, 33, 513–520.
50 Bruno, J. G. Molecules 2015, 20, 6866–6887.
51 Wu, J.; Zhu, Y.; Xue, F.; Mei, Z.; Yao, L.; Wang, X.; Zheng, L.; Liu, J.; Liu, G.; Peng, C.; Chen, W. Microchim. Acta 2014, 181, 479–491.
52 Collett, J. R.; Cho, E. J.; Ellington, A. D. Methods 2005, 37, 4–15.
53 (a) Sassolas, A.; Blum, L. J.; Leca‐Bouvier, B. D. Electroanalysis 2009, 21, 1237–1250; (b) Citartan, M.; Gopinath, S. C. B.; Tominaga, J.; Tan, S.‐C.; Tang, T.‐H. Biosens. Bioelectron. 2012, 34, 1–11; (c) You, M.; Chen, Y.; Peng, L.; Han, D.; Yin, B.; Ye, B.; Tan, W. Chem. Sci. 2011, 2, 1003–1010.
54 Famulok, M.; Hartig, J. S.; Mayer, G. Chem. Rev. 2007, 107, 3715–3743.
55 (a) McConnell, E. M.; Holahan, M. R.; DeRosa, M. C. Nucleic Acid Ther. 2014, 24, 388–404; (b) Duncan, R. J. Control. Release 2014, 190, 371–380; (c) Duncan, R. Curr. Opin. Biotechnol. 2011, 22, 492–501; (d) Zhou, J.; Rossi, J. J. Mol. Ther. Nucleic Acids 2014, 3, e169; (e) Reinemann, C.; Strehlitz, B. Swiss Med. Wkly. 2014, 144, w13908.
56 Kruspe, S.; Mittelberger, F.; Szameit, K.; Hahn, U. ChemMedChem 2014, 9, 1998–2011.
57 Burmeister, P. E.; Lewis, S. D.; Silva, R. F.; Preiss, J. R.; Horwitz, L. R.; Pendergrast, P. S.; McCauley, T. G.; Kurz, J. C.; Epstein, D. M.; Wilson, C.; Keefe, A. D. Chem. Biol. 2005, 12, 25–33.
58 (a) Liu, J.; Cao, Z.; Lu, Y. Chem. Rev. 2009, 109, 1948–1998; (b) Li, B.; Dong, S.; Wang, E. Chem. Asian J. 2010, 5, 1262–1272.
59 Kolpashchikov, D. M. Chem. Rev. 2010, 110, 4709–4723.
60 Li, B.; Du, Y.; Dong, S. Anal. Chim. Acta 2009, 644, 78–82.
61 (a) Knecht, M. R.; Sethi, M. Anal. Bioanal. Chem.
2009, 394, 33–46; (b) Xue, X.; Wang, F.; Liu, X. J. Am. Chem. Soc. 2008, 130, 3244–3245; (c) Du, J.; Jiang, L.; Shao, Q.; Liu, X.; Marks, R. S.; Ma, J.; Chen, X. Small 2013, 9, 1467–1481. 62 (a) Prado, E.; Daugey, N.; Plumet, S.; Servant, L.; Lecomte, S. Chem. Commun. 2011, 47, 7425–7427; (b) Novoa, A.; Winssinger, N. Beilstein J. Org. Chem. 2015, 11, 707–719; (c) Picarsic, J.; Reyes‐Mugica, M. Appl. Immunohistochem.

111

112

Nucleic Acid Engineering

63 64 65 66 67 68 69 70 71

72 73 74 75 76 77 78 79

80 81 82

Mol. Morphol. 2015, 23, 313–326; (d) Peterson, A. M.; Heemstra, J. M. Wiley Interdiscip. Rev. Nanomed. Nanobiotechnol. 2015, 7, 282–297. Famulok, M.; Mayer, G. Acc. Chem. Res. 2011, 44, 1349–1358. Krishnan, Y.; Simmel, F. C. Angew. Chem. Int. Ed. 2011, 50, 3124–3156. Simmel, F. C. ChemPhysChem 2009, 10, 2593–2597. Yin, P.; Choi, H. M. T.; Calvert, C. R.; Pierce, N. A. Nature 2008, 451, 318–322. Boiziau, C.; Dausse, E.; Yurchenko, L.; Toulmé, J.‐J. J. Biol. Chem. 1999, 274, 12730–12737. Durand, G.; Lisi, S.; Ravelet, C.; Dausse, E.; Peyrin, E.; Toulme, J. J. Angew. Chem. Int. Ed. 2014, 53, 6942–6945. Olsthoorn, R. C. Nucleic Acids Res. 2014, 42, 9327–9333. Halder, S.; Krishnan, Y. Nanoscale 2015, 7, 10008–10012. (a) Li, J. J.; Tan, W. Nano Lett. 2002, 2, 315–318; (b) Alberti, P.; Mergny, J. L. Proc. Natl. Acad. Sci. U. S. A. 2003, 100, 1569–1573; (c) Liu, D.; Balasubramanian, S. Angew. Chem. Int. Ed. 2003, 42, 5734–5736. Alberti, P.; Bourdoncle, A.; Sacca, B.; Lacroix, L.; Mergny, J. L. Org. Biomol. Chem. 2006, 4, 3383–3391. Seeman, N. C. Synlett 2000, 11, 1536–1548. Caruthers, M. H. Science 1985, 230, 281–285. Seeman, N. C. Biochemistry 2003, 42, 7259–7269. (a) Winfree, E.; Liu, F.; Wenzler, L. A.; Seeman, N. C. Nature 1998, 394, 539–544; (b) Rothemund, P. W. Nature 2006, 440, 297–302. Kuzuya, A.; Komiyama, M. Nanoscale 2010, 2, 309–321. Tørring, T.; Voigt, N. V.; Nangreave, J.; Yan, H.; Gothelf, K. V. Chem. Soc. Rev. 2011, 40, 5621–5928. (a) Ke, Y.; Nangreave, J.; Yan, H.; Lindsay, S.; Liu, Y. Chem. Commun. 2008, 5622–5624; (b) Subramanian, H. K.; Chakraborty, B.; Sha, R.; Seeman, N. C. Nano Lett. 2011, 11, 910–913. Maune, H. T.; Han, S. P.; Barish, R. D.; Bockrath, M.; Iii, W. A.; Rothemund, P. W.; Winfree, E. Nat. Nanotechnol. 2010, 5, 61–66. Tintoré, M.; Eritja, R.; Fabrega, C. Chembiochem 2014, 15, 1374–1390. Song, C.; Wang, Z. G.; Ding, B. Small 2013, 9, 2382–2392.


3.2 Protein Engineering

Agathe Urvoas, Marie Valerio‐Lepiniec and Philippe Minard

Institute for Integrative Biology of the Cell (I2BC), UMR 9198, Université Paris‐Sud, CNRS, CEA, Orsay, France

Nature offers an extremely large reservoir of proteins that can serve as functional elements or building blocks for man‐made technology. This has led to the development of chemical and biological methods for preparing small peptides and large proteins. Using oligonucleotide synthesis, PCR, and gene synthesis methods, it is now possible to assemble, at moderate cost, any desired DNA sequence encoding the desired peptide or protein. The scientific challenge in the protein field is therefore now to choose or design an amino acid sequence that will fold into the expected protein, with the proper and stable three‐dimensional (3D) structure required to give rise to the desired functional properties.

Bionanocomposites: Integrating Biological Processes for Bioinspired Nanotechnologies, First Edition. Edited by Carole Aimé and Thibaud Coradin. © 2017 John Wiley & Sons, Inc. Published 2017 by John Wiley & Sons, Inc.

3.2.1 Synthesis of Polypeptides: Chemical or Biological Approach?

Chemical synthesis is, by far, the dominant method for producing peptides or polypeptide chains with short sequences, typically 40 or fewer amino acid residues. Short peptides can be efficiently synthesized using conventional peptide synthesis [1], whereas most peptides are rapidly degraded, and inefficiently produced, in biological production systems. However, peptides are often too short to fold into a stable tertiary structure. Probably for this reason, most natural proteins have rather long sequences, typically from sixty up to several hundred residues. Unlike short peptides, these long polypeptide chains are inefficiently synthesized by chemical methods, whereas they are often efficiently produced as “recombinant proteins” using biotechnological approaches. Biological production offers additional advantages over chemical approaches: the cost of chemical synthesis increases with scale, and each new batch requires a complete new synthesis, whereas once an expression system is set up for biological expression, it is relatively easy to grow microorganisms to produce new batches of recombinant protein, as well as to produce modified forms of the protein using site‐directed mutagenesis.
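The difficulty of chemically synthesizing long chains can be illustrated with a simple calculation. The sketch below assumes, for illustration only, that every residue is added with the same independent coupling efficiency, so that the fraction of full‐length product falls geometrically with chain length. The function name and the 99% per‐step efficiency are hypothetical choices, not values taken from the text.

```python
def full_length_fraction(n_residues: int, coupling_efficiency: float) -> float:
    """Fraction of chains that reach full length in stepwise synthesis.

    Simplifying assumption: each of the (n_residues - 1) coupling steps
    succeeds independently with the same probability.
    """
    if n_residues < 1:
        raise ValueError("n_residues must be at least 1")
    if not 0.0 <= coupling_efficiency <= 1.0:
        raise ValueError("coupling_efficiency must lie between 0 and 1")
    return coupling_efficiency ** (n_residues - 1)


if __name__ == "__main__":
    # 0.99 is an assumed per-step efficiency chosen only for illustration.
    for length in (10, 40, 100, 300):
        fraction = full_length_fraction(length, 0.99)
        print(f"{length:4d} residues: {fraction:.1%} full-length product")
```

Under this simplified model, the loss compounds at every step, which is one way to see why chains of several hundred residues are better produced biologically.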

3.2.2 Proteins: From Natural to Artificial Sources

Many interesting or useful proteins are not naturally abundant and are difficult or impossible to purify from natural sources. These naturally non‐abundant proteins can only be isolated if an artificial production source can be established. Since the development of recombinant DNA technology in the 1970s, it has been possible to insert any protein‐coding DNA sequence under the control of “promoters” (expression‐promoting sequences) and to introduce the resulting recombinant genetic element into expressing host cells such as cultured microorganisms or animal cells. Production of proteins from these so‐called recombinant expression systems is easier and much more efficient than from natural organisms. Today, recombinant protein technology is essential in research, not only to produce useful amounts of low‐abundance proteins but also to produce modified or engineered proteins from mutated or synthetic genes, as well as to express “tagged” proteins for easy purification or detection.

3.2.2.1 How to Get the Coding Sequence of the Protein of Interest?

Over the past 20 years, hundreds of genomes have been fully sequenced, including those of most model organisms. Therefore, in most cases, the sequence of any protein of interest can be found in public databases, easily amplified by PCR, and inserted into a relevant expression plasmid. If a natural DNA sample to amplify is not available (e.g., for designed artificial proteins), it is possible to design a synthetic gene. Synthetic genes offer the additional benefit that their codons can be adapted to the codon usage preferences of the producing microorganism (e.g., Escherichia coli), which, in some cases, is important for efficient expression.

3.2.2.2 E. coli: A Cheap “Protein Factory” with a Diversified Tool Box

Although mammalian cells are often used to produce complex eukaryotic proteins for therapeutic applications, expression systems based on microorganisms are preferred for most research and technological applications, and among these E. coli is, by far, the most commonly used [2, 3] (Figure 3.6). This system has several advantages: bacteria grow quickly (protein is produced within a few hours) and give high yields of protein (1–100 mg/l in shake flasks), and production remains inexpensive. A variety of E. coli host strains with optimized properties for expression have been developed and are commercially available.

Figure 3.6 Overview of protein production. A typical procedure for expression of a recombinant protein in Escherichia coli. (a) The plasmid contains the protein coding sequence under the control of expression‐promoting sequences and is introduced into an expression E. coli strain (usually BL21 pLysS or M15) by chemical transformation or electroporation. Colonies are selected on culture medium containing an antibiotic. (b) Cells are grown at 37°C in rich culture medium containing an antibiotic. Protein expression is induced by the addition of an inducer (often IPTG), and the cells are further incubated for 2–4 h for protein production. (c) The cells are harvested and resuspended in a relevant buffer. They are subjected to freeze/thaw cycles and French press or sonication and/or treated with lysozyme and DNase to break the cell wall and the genomic DNA. After centrifugation, a clarified sample of the soluble bacterial fraction containing the tagged protein is obtained. (d) The initial purification step is often performed by immobilized metal affinity chromatography (IMAC; bind, wash, elute). For example, His‐tagged proteins are purified from the supernatant using nickel‐affinity chromatography. A second step using size exclusion chromatography (or ion exchange, if necessary) is often used as a final purification and conditioning step. (e) The production and purification procedure can be analyzed by electrophoresis on acrylamide gel (SDS‐PAGE) under denaturing conditions.

3.2.2.3 Common Expression Plasmids

The sequence to be expressed first needs to be inserted into an appropriate expression “vector,” and the resulting construct is then introduced into the expression host cell. Expression vectors used in E. coli are based on DNA molecules called plasmids, usually circular and of small size (50 kDa). Some proteins are efficiently produced in E. coli but do not fold into their proper 3D structure and instead form large amorphous, insoluble aggregates called “inclusion bodies.” Similarly, expression in E. coli is usually not optimal for the production of proteins that need posttranslational modifications. Finally, some proteins need to be stabilized by disulfide bonds, which result from oxidation of the thiol groups of cysteine residues. These disulfide bonds do not form efficiently in E. coli unless specific adaptations are introduced into the expression system.

3.2.2.5 Some Solutions Are Available to Solve these Expression Problems

● If the protein is toxic before induction, leaky transcription of DNA must be efficiently repressed using a promoter with strong regulation in engineered expression strains (e.g., BL21(DE3) pLysS for the pET vector family, M15 pREP4 with pQE vectors).

● Although the genetic code is almost universal, the usage frequency of synonymous codons is highly variable (e.g., the six codons coding for leucine are not equally used) and differs considerably from one organism to another. This bias in codon usage can lower protein expression, unless a specific strain (Rosetta™) or a synthetic gene with adapted codon usage is used.

● If the protein contains disulfide bonds, correct folding is difficult in standard E. coli strains because of the reducing environment of the cytoplasm. Folding of disulfide‐bond‐containing proteins may be possible in special host strains (e.g., the Origami™ strain). Alternatively, these proteins can be targeted to the periplasm of bacteria, where disulfide bonds form efficiently.
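The codon usage point in the list above can be made concrete with a short sketch. It builds the standard genetic code, counts which synonymous codons a coding sequence uses for a given amino acid, and rewrites a protein sequence with one codon per amino acid. The `PREFERRED_CODON` table and all function names are hypothetical and chosen only for illustration; an actual gene design would rely on a measured codon usage table for the chosen host.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Hypothetical "preferred codon" per amino acid, assumed for illustration only.
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}


def split_codons(dna: str) -> list:
    """Split a coding sequence into complete codons, ignoring trailing bases."""
    usable = len(dna) - len(dna) % 3
    return [dna[i:i + 3] for i in range(0, usable, 3)]


def translate(dna: str) -> str:
    """Translate a DNA coding sequence into one-letter amino acid codes."""
    return "".join(CODON_TABLE[codon] for codon in split_codons(dna))


def synonymous_usage(dna: str, amino_acid: str) -> dict:
    """Count how often each codon for one amino acid appears in a sequence."""
    counts = {}
    for codon in split_codons(dna):
        if CODON_TABLE[codon] == amino_acid:
            counts[codon] = counts.get(codon, 0) + 1
    return counts


def back_translate(protein: str) -> str:
    """Encode a protein using the single assumed preferred codon per residue."""
    return "".join(PREFERRED_CODON[aa] for aa in protein)


if __name__ == "__main__":
    natural = "ATGTTATTGCTTCTA"  # toy sequence using four different leucine codons
    print(translate(natural))
    print(synonymous_usage(natural, "L"))
    print(back_translate(translate(natural)))
```

The recoded sequence encodes the same protein as the input; only the choice among synonymous codons changes.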

One of the most common problems in bacterial expression systems is the formation of inclusion bodies made of incorrectly folded, aggregated, and inactive protein. Whether folding succeeds is highly protein dependent. The most efficient way to minimize the formation of inclusion bodies is to decrease the culture temperature (from 37 to 30 or 25°C), which reduces the improper hydrophobic associations that cause aggregation. For some proteins, this is not sufficient to prevent aggregation. In this case, the recombinant protein can also be co‐expressed with chaperones that help proteins fold correctly during synthesis.

3.2.3 Proteins: A Large Repertoire of Functional Objects

3.2.3.1 Looking for Natural Proteins with Desired Function

Genome sequencing has produced hundreds of millions of natural protein sequences from a range of highly diverse organisms. It is therefore often possible to look for proteins with specific functional properties by first exploring the biological information stored in public sequence databases. Similarity between sequences resulting from evolution can help to identify sequence features naturally associated with a biological property [5]. For example, proteins with high thermoresistance are commonly identified in thermophilic microorganisms. A range of natural model systems can be explored to understand how natural evolution has solved demanding adaptation problems, such as the mechanical properties of spider silk or the mechanical resistance of shell‐like composites. Although biology is a clear source of inspiration, natural evolution has its own constraints and may not produce the properties needed for a specific application. The question is then: is it possible to adapt existing proteins or even to create new ones with properties specifically designed for a given application?

3.2.3.2 From Protein Engineering to Protein Design

When gene‐modifying methods became available, early work mainly focused on testing hypotheses concerning the role of specific residues in proteins. Observations gathered from these early experiments led to a better appreciation of the effect of sequence changes on protein folding and stability. For example, it rapidly became apparent that amino acid substitutions able to redirect the folding pathway toward new structures were much less common than anticipated. Accumulated experiments finally led to a shift in representation [6]. Rather than as a “pathway,” protein folding is now often described as an “energy landscape,” a funnel‐shaped free energy surface on which many different routes can lead a polypeptide chain to a global energy minimum corresponding to the folded state. In most cases, the final structure does not depend upon a precise succession of folding steps. The main problem is therefore not really how sequence changes alter the pathway but rather how they affect the depth of this minimum.

3.2.3.2.1 Modified Proteins Are Often Destabilized

Natural proteins have usually not evolved to reach extreme stability and can easily be destabilized by sequence changes [7]. For example, replacing a single nonpolar side chain buried in the tertiary structure with a polar one is often sufficient to dramatically alter the stability of a protein. Systematic mutagenesis studies clearly confirm that sequence positions buried in the hydrophobic core are crucial for stability. Although usually more tolerant, some surface side chains, such as those involved in salt bridge networks or in hydrogen bonds with backbone amide groups, may also contribute significantly to protein stability. Importantly, the effects of several destabilizing sequence modifications are in most cases additive, and therefore extensive sequence changes introduced into natural proteins may easily result in complete destabilization. A destabilized protein is unable to find an alternative energy minimum and does not adopt an alternative tertiary structure. In practice, destabilized proteins are degraded by the proteolytic systems of the producing host or form amorphous aggregates and cannot be efficiently recovered.

3.2.3.2.2 Natural or Engineered Proteins: From Small Step to Giant Leap in Sequence Space

Protein engineering experiments collectively indicate that the most likely consequence of multiple sequence modifications introduced into a natural protein sequence is a severe destabilization of its 3D structure [7]. However, new functional proteins with tailor‐made nonnatural properties, such as new binding sites or new catalytic activities, are unlikely to be found in the immediate sequence vicinity of naturally existing proteins. Therefore, although a large move in sequence space remains a risky adventure, it is also an unavoidable route toward proteins with really new functions [8]. Proteins with highly modified or fully nonnatural sequences are now accessible by two different but complementary strategies: protein design and directed evolution.

3.2.3.2.3 Computational Protein Design

Protein design, the ability to define by computation a sequence that will fold into a given structure, is the ultimate goal of protein modification or engineering methods and the ultimate test of our real understanding of protein structure and folding. Early pioneering attempts [9, 10] occasionally reported successful designs, but the complexity of protein folding energetics and the combinatorial nature of sequence exploration make protein design a highly challenging problem. Spectacular progress has been achieved in this field only recently, with the development and continuous improvement of an efficient modeling suite named Rosetta by Baker and collaborators [11]. A basic assumption in Rosetta is that 3D protein structures result from the conformational preferences of small protein fragments, and that these preferences can be extracted from the conformations adopted by small fragments in known protein structures. An overall structure can then be reconstructed using Monte Carlo sampling of preferred local conformations and a simplified energy function. Solutions resulting from this coarse‐grained model are clustered and further refined using a more detailed energy function and a complete protein representation with side chains and backbone flexibility. Rosetta was initially developed as a structure prediction tool (given a sequence, what is its most stable conformation?) but was later extended into a very useful design tool (what is the best sequence for a given conformation?). Computational protein design remains highly challenging, and only very few labs currently master design methods. Recent progress in this field includes the successful design of nonnatural protein folds [12], the creation of specific interactions between protein pairs [13], the assembly of protein arrays or protein cages [14, 15], and the design and grafting of new functional catalytic sites in computationally designed enzymes [16].

3.2.3.2.4 Directed Evolution: A Diverse Repertoire Combined with a Selection Process

Directed evolution is a protein engineering approach that mimics natural evolutionary processes (i.e., Darwinian evolution) in the laboratory. Setting up a directed evolution approach involves four main steps: (i) the choice of the initial protein to be “evolved,” (ii) the introduction of diversity into its sequence, leading to a diverse library, (iii) an iterative selection or screening process to isolate, among the variants, those adapted to the target properties, and (iv) the characterization of the best variants. The first milestone is the construction of high‐quality libraries. Diversity is introduced at the DNA level, using degenerate oligonucleotide synthesis or error‐prone PCR. However, instead of randomly exploring a huge protein sequence space, in some situations specific positions within the sequence can be targeted for randomization. The second milestone is the selection process, which requires technological development. Phage display (PD), based on the non‐lytic filamentous phage M13, was the first method used to select peptides, proteins, and antibody fragments with specific binding properties for chosen targets. PD remains the most widely used technique to select for new binding properties because of its robustness and ease of use, as described in many practical manuals (Figure 3.8). Several alternative display methods have been developed to circumvent some of the limitations of phage technology (ribosome display, cell display, etc.). The common feature of these methods is that they maintain a physical linkage between the gene and the corresponding protein. As selection acts on the functional properties of the protein, it becomes possible to recover the gene that encodes this precise protein variant. After sequencing, the DNA fragment is easily subcloned, and the protein of interest can be produced in any appropriate recombinant format.
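The size of a library built with degenerate oligonucleotides can be estimated with a short calculation. The sketch below assumes the NNK randomization scheme (N = any base, K = G or T), a common choice that the text does not name explicitly; the function names are hypothetical. It enumerates the codons allowed by a degenerate pattern and reports the theoretical DNA and protein diversity for a given number of randomized positions.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Subset of IUPAC nucleotide codes needed here: N = any base, K = G or T.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT", "K": "GT"}


def expand_degenerate_codon(pattern: str) -> list:
    """List every concrete codon matched by a three-letter degenerate pattern."""
    return ["".join(bases) for bases in product(*(IUPAC[symbol] for symbol in pattern))]


def library_summary(pattern: str, n_positions: int) -> dict:
    """Theoretical diversity when n_positions codons are randomized with pattern."""
    codons = expand_degenerate_codon(pattern)
    encoded = [CODON_TABLE[codon] for codon in codons]
    amino_acids = {aa for aa in encoded if aa != "*"}
    stop_free_per_position = 1 - encoded.count("*") / len(codons)
    return {
        "dna_variants": len(codons) ** n_positions,
        "protein_variants": len(amino_acids) ** n_positions,
        "stop_free_fraction": stop_free_per_position ** n_positions,
    }


if __name__ == "__main__":
    # Six randomized positions is an arbitrary example value.
    for key, value in library_summary("NNK", 6).items():
        print(f"{key}: {value}")
```

Because the theoretical diversity grows exponentially with the number of randomized positions, targeting a few chosen positions keeps the library within a size that can actually be sampled.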
Figure 3.8 Creation of molecular recognition by artificial evolution of proteins. Phage display is commonly used to evolve new binding peptides or proteins, based on a principle that mimics a natural evolutionary process. The key concept is that predefined recognition properties do not need to be designed, as they can be extracted from a highly diverse population of molecules and can then be very efficiently amplified through their genetic information. The source of diversity is a synthetic library (a), a very large collection of partially randomized sequences. The target molecule (b) is used as an affinity trap to retain the very few molecules of the library that, by chance, have a target‐complementary molecular structure (c). Most molecules of the library do not bind the target and are eliminated by a washing step (d). The adapted binders, although initially at a non‐detectable concentration, are associated with a DNA sequence and can therefore be amplified very efficiently, either by microorganism growth or by PCR (e). One round of selection and amplification is usually not sufficient to isolate the binders from the library, but the process of selection is iterative (f), and several selection rounds lead to specific target‐binding sequences (g).

The main applications of directed protein evolution are enzyme engineering and the creation of new binding (or “recognition”) properties in proteins [17, 18]. A variety of engineered enzymes with improved characteristics, such as thermostability, substrate specificity, catalytic properties, regio‐/enantioselectivity, or stability in organic solvents, have been obtained [19]. Antibodies are the canonical class of proteins dedicated to specific recognition, as their natural function is precisely to recognize and bind their targets (or “antigens”) with high affinity. However, antibodies are large and complex heterotetrameric proteins stabilized by several disulfide bonds. As this complex structure is an obstacle to their production as recombinant proteins, a variety of antibody fragments and derivatives with reduced size and yet intact recognition properties have been engineered (Fab, scFv, diabody, single‐domain antibody from Camelidae (VHH)). Antibodies and antibody fragments obtained either by engineering monoclonal antibodies or by directed evolution are a major source of new therapeutic molecules [20]. The prospect of applications outside biology is, however, severely constrained by the poor biophysical properties (stability, folding, and expression efficiency) of antibodies and their derivatives. More recently, it has become possible to efficiently create new tailored binding sites in proteins unrelated to antibodies, using directed evolution and PD methods. These innovative protein scaffolds, based on highly stable and efficiently folded protein frameworks [21, 22], are endowed with specifically tailored recognition properties, which opens the route to a wide range of applications.
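The reason why several rounds of phage display selection are needed, and why a few rounds are usually enough, can be sketched with a toy enrichment model. The model assumes that in each round binders are recovered with one fixed probability, non‐binders with a much lower background probability, and that amplification restores the population without changing its composition. The function names and all numerical values are hypothetical and serve only as an illustration.

```python
def enrich(binder_fraction: float, binder_recovery: float,
           background_recovery: float) -> float:
    """Binder fraction after one round of selection followed by amplification."""
    if not 0.0 <= binder_fraction <= 1.0:
        raise ValueError("binder_fraction must lie between 0 and 1")
    if binder_recovery <= 0.0 or background_recovery <= 0.0:
        raise ValueError("recovery probabilities must be positive")
    kept_binders = binder_fraction * binder_recovery
    kept_background = (1.0 - binder_fraction) * background_recovery
    return kept_binders / (kept_binders + kept_background)


def rounds_to_reach(target_fraction: float, initial_fraction: float,
                    binder_recovery: float, background_recovery: float,
                    max_rounds: int = 20):
    """Number of rounds needed for binders to reach target_fraction, or None."""
    fraction = initial_fraction
    for round_number in range(1, max_rounds + 1):
        fraction = enrich(fraction, binder_recovery, background_recovery)
        if fraction >= target_fraction:
            return round_number
    return None


if __name__ == "__main__":
    # Assumed values: 1 binder per 10 million clones, 50% binder recovery,
    # 0.1% nonspecific background recovery.
    fraction = 1e-7
    for round_number in range(1, 5):
        fraction = enrich(fraction, 0.5, 1e-3)
        print(f"round {round_number}: binder fraction = {fraction:.3g}")
```

In this simplified picture, the per‐round enrichment is set by the ratio of the two recovery probabilities, which is why efficient washing matters as much as tight binding.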


3.2.3.3  Combining Chemistry with Biological Objects

A large variety of chemistry‐based strategies can be employed to label proteins with synthetic probes for applications in cell biology, biotechnology, and chemical sciences, as well as in emerging multidisciplinary fields like nanophysics, as recently reviewed [23–28].

3.2.3.3.1 Labeling Natural Amino Acids

The first strategy is the covalent coupling of natural amino acid side chains with synthetic probes (Figure 3.9a) [29]. The large number of nucleophilic functional groups provided by side chains on the surface of proteins allows a variety of chemical modifications [30]. The main challenge of labeling approaches based on chemical reactions is the development of reaction conditions mild enough to preserve protein structure and function: water as solvent, neutral pH, moderate temperatures, short reaction times, and so on. Cysteine is the most used residue for site‐specific modifications. Thiol groups of cysteine residues can undergo alkylation reactions with electrophilic ketones or maleimide derivatives under such mild conditions. Lysine side chains are also widely used for protein modification via their ε‐amino group. A large panel of reagents is available for the modification of primary amines, including activated esters (NHS), isocyanates, sulfonamides, and so on. As lysine residues are more prevalent than cysteine residues, lysine labeling is mostly used for nonspecific labeling, for example, to attach proteins to a solid surface or a polymer, or to label proteins with several probes. Other side chain modifications that take advantage of the reactivity of functional groups present in proteins have been described [23, 29, 30].

3.2.3.3.2 Bioorthogonal Labeling

The general goal of “bioorthogonal” labeling approaches is to introduce new “nonbiological” reactivities into biological macromolecules (Figure 3.9b) [31]. Natural proteins are made from the 20 amino acids with restricted list of chemical functions. Unusual chemical functions can be introduced into proteins by incorporation of nonnatural amino acids with unnatural chemical functions like ketone, aldehyde, alkyne, or azide groups. Bacterial strains grown in minimal media can incorporate some nonnatural amino acids, if appropriate precursors are provided as nutrient. Increased range of nonnatural amino acids can be incorporated using engineered aminoacyl‐tRNA synthetase. Following pioneering works by Bertozzi et al., the azide group is widely used in different bioorthogonal reactions. For example, the reduction of alkylazide can be catalyzed by a triarylphosphine (known as Staudinger reaction), alkyne–azide cycloaddition used in “click chemistry” with CuI as catalyst or more recently copper‐free reactions with strained cyclooctyne reagents [32, 33]. Various

123

124

Protein Engineering

(a) Direct labeling

SH

+

Lys

NH2

N H

Y

+

(b)

S

X

X: maleimide, haloacetyl, etc.

Cys

Y: activated ester, isocyanate, etc. N N

Orthogonal labeling

N

N2

Click chemistry

aa*

aa*

Cu′ catalysis

+

aa*: unnatural amino acid

N N N

Cu-free reaction

+

aa*

Inverse electron-demand Diels–Alder N

N

N

N

N

N

N N

5 min, 20°C

N

N

+

HN

HN

(c) Tag labeling +

Biotin ligase (BirA) Biotin

Enzyme-mediated labeling

AviTag (GLNDIFEAQKIEWHE)

Biotin Probe (nanoparticle, agarose bead, etc.)

+

Metal-mediated labeling

Metal-binding tag His6-Tag (HHHHHH)

(d)

Biotinylated AviTag (GLNDIFEAQKIEWHE)

Metal chelate (Ni2+)

Self-labeling proteins + Labeling enzyme + SpyCatcher HaloTag SNAP-Tag

(e)

Specific probe

Covalent bond

SpyTag HaloTag ligand Benzylguanine derivative

Enzyme-mediated assembly Sortase A

++

Nt LPXTG--CO2–

Ct H3N-G----

Nt

Transpeptide ligation

Ct

O --LPXT

N H

G--

Figure 3.9  Overview of protein bioconjugation strategies. In all these approaches, the protein of interest is represented as a sphere and synthetic probes used for labeling are represented as stars. (a) Direct labeling mainly involves side chain modification of proteins.

3.2.3 ­Proteins: A Large Repertoire of Functional Object

other new chemical reactions like “hydrazone ligation” (reaction between hydrazido or hydrazino groups and carbonyls), “oxime ligation” (reaction between an oxamine and a carbonyl), “Diels–Alder”‐triggered cycloaddition (coupling between strained alkenes and tetrazines) or photo‐induced reactions have been described. The development of these new labeling processes can be applied to purified biomolecules or directly to biomolecules within living cells [23, 31]. These different labeling approaches are optimized for use in mild ­conditions that preserve the structure of proteins (water as a solvent, low ­temperature, neutral pH, etc.). 3.2.3.3.3  Tag‐Mediated Labeling and Enzymatic Coupling

Tag‐mediated approaches are based on the introduction of a specific sequence (or tag) that is used as a target for chemical or enzymatic covalent modification [24, 34] (Figure 3.9c). Enzymatic approaches combine the advantage of highly specific and efficient labeling with mild and biocompatible conditions. As an example, a biotin moiety can be covalently bound to a specific lysine side chain within a small acceptor peptide sequence (biotin tag: 15 residues) by a bacterial enzyme (biotin ligase BirA). Biotin is widely used to immobilize labeled molecules or particles, thanks to its very high affinity for streptavidin (KD = 10⁻¹⁴ M). A variety of tag‐mediated labeling technologies were also developed with either self‐labeling peptide tags (e.g., tetracysteine tag, lanthanide‐binding tag, SpyTag, etc.) or entire proteins (e.g., SNAP‐tag, HaloTag, CLIP‐tag) that can react with a labeling compound added to the solution [23–25] (Figure 3.9d). For example, SNAP‐tag is a self‐labeling tag derived from an alkyltransferase involved in DNA repair processes, which transfers an alkyl group from a modified guanine to its own reactive cysteine residue. Various O6‐benzylguanine substrates functionalized with biotin or fluorescent dyes can be used to label different proteins [35]. SpyTag is a short peptide that spontaneously forms an isopeptide bond upon encountering its protein partner SpyCatcher [36].

Figure 3.9  (Continued) Cysteine or lysine side chains can be covalently labeled following specific reactions with chemicals coupled to synthetic probes. (b) Orthogonal labeling involves chemical modification of unnatural amino acids incorporated in the protein of interest by genetic and metabolic engineering of bacterial strains. These labeling reactions are highly specific, involving chemical functions absent in nature, as illustrated by examples of click chemistry and inverse electron‐demand Diels–Alder reactions. (c) Tag labeling involves the chemical modification of a small peptide tag added at an extremity of the protein sequence. A biotin moiety specifically added to the lysine residue of the AviTag™ by a biotin ligase (BirA) can further be used to interact with streptavidin. The hexahistidine tag is widely used, for example, for protein purification due to its high affinity for metal cations. (d) In self‐labeling systems, the protein of interest is produced in fusion with a labeling enzyme that can specifically interact with and covalently bind its target/substrate. The target can be labeled with any desired probe. Following this principle, a variety of enzyme/target pairs were discovered and engineered. (e) Specialized enzymes can catalyze covalent protein–protein assembly. As an example, Sortase A can ligate two proteins displaying, respectively, specific N‐ and C‐terminal sequences.


This tag‐labeling approach has already been successfully applied to engineer new biomaterials [37]. The “HaloTag system” is based on an engineered bacterial dehalogenase that can covalently bind to a series of functionalized chloroalkane derivatives. This enzyme can be genetically fused to any protein of interest. HaloTag technology has been used to link proteins with a variety of chemicals such as fluorescent dyes, or with magnetic or polymer particles [38].

3.2.3.3.4  Enzyme‐Mediated Ligation

Sortases are bacterial transpeptidases that catalyze chemoselective ligation between peptides and proteins (Figure 3.9e). The key requirement is the presence of a specific C‐terminal recognition sequence of five amino acids [39] on one fragment and an N‐terminal glycine on the second fragment. Recombinant sortase can easily be produced, purified, and used to create a covalent linkage between two polypeptides with appropriate N‐ and C‐terminal sequences [40].

3.2.3.3.5  Quality Control of Labeled Biomolecules

All these modern approaches hold great promise for bringing chemistry to bear on biological molecules. However, it is essential to check the effect of modifications on protein structure. Like sequence modifications, chemical modifications can destabilize proteins. Special attention must be paid to reaction conditions, which have to be as mild as possible to preserve the structure of proteins. Once the chemical reaction or modification has occurred, mass spectrometry is often the method of choice to characterize the modified object. Once the product is chemically characterized, biophysical techniques can help to check for the absence of protein aggregates (UV spectrophotometry, dynamic light scattering, size exclusion chromatography, etc.) and to verify that the overall 3D structure of the protein is preserved (circular dichroism, microcalorimetry, NMR, etc.).

References

1 Stawikowski, M.; Fields, G. B. Curr. Protoc. Protein Sci. 2012, Chapter 18, Unit 18 11.
2 Rosano, G. L.; Ceccarelli, E. A. Front. Microbiol. 2014, 5, 341.
3 Gopal, G. J.; Kumar, A. Protein J. 2013, 32, 419–425.
4 Waugh, D. S. Trends Biotechnol. 2005, 23, 316–320.
5 Wijma, H. J.; Floor, R. J.; Janssen, D. B. Curr. Opin. Struct. Biol. 2013, 23, 588–594.
6 Dill, K. A.; Chan, H. S. Nat. Struct. Biol. 1997, 4, 10–19.
7 Tokuriki, N.; Tawfik, D. S. Curr. Opin. Struct. Biol. 2009, 19, 596–604.
8 Urvoas, A.; Valerio‐Lepiniec, M.; Minard, P. Trends Biotechnol. 2012, 30, 512–520.
9 Regan, L.; DeGrado, W. F. Science 1988, 241, 976–978.
10 Goraj, K.; Renard, A.; Martial, J. A. Protein Eng. 1990, 3, 259–266.
11 Leaver‐Fay, A.; Tyka, M.; Lewis, S. M.; Lange, O. F.; Thompson, J.; Jacak, R.; Kaufman, K.; Renfrew, P. D.; Smith, C. A.; Sheffler, W.; Davis, I. W.; Cooper, S.; Treuille, A.; Mandell, D. J.; Richter, F.; Ban, Y. E.; Fleishman, S. J.; Corn, J. E.; Kim, D. E.; Lyskov, S.; Berrondo, M.; Mentzer, S.; Popović, Z.; Havranek, J. J.; Karanicolas, J.; Das, R.; Meiler, J.; Kortemme, T.; Gray, J. J.; Kuhlman, B.; Baker, D.; Bradley, P. Methods Enzymol. 2011, 487, 545–574.
12 Kuhlman, B.; Dantas, G.; Ireton, G. C.; Varani, G.; Stoddard, B. L.; Baker, D. Science 2003, 302, 1364–1368.
13 Schreiber, G.; Fleishman, S. J. Curr. Opin. Struct. Biol. 2013, 23, 903–910.
14 King, N. P.; Sheffler, W.; Sawaya, M. R.; Vollmar, B. S.; Sumida, J. P.; André, I.; Gonen, T.; Yeates, T. O.; Baker, D. Science 2012, 336, 1171–1174.
15 King, N. P.; Bale, J. B.; Sheffler, W.; McNamara, D. E.; Gonen, S.; Gonen, T.; Yeates, T. O.; Baker, D. Nature 2014, 510, 103–108.
16 Kiss, G.; Çelebi‐Ölçüm, N.; Moretti, R.; Baker, D.; Houk, K. N. Angew. Chem. Int. Ed. 2013, 52, 5700–5725.
17 Brustad, E. M.; Arnold, F. H. Curr. Opin. Chem. Biol. 2011, 15, 201–210.
18 Lane, M. D.; Seelig, B. Curr. Opin. Chem. Biol. 2014, 22, 129–136.
19 Denard, C. A.; Ren, H.; Zhao, H. Curr. Opin. Chem. Biol. 2015, 25, 55–64.
20 De Meyer, T.; Muyldermans, S.; Depicker, A. Trends Biotechnol. 2014, 32, 263–270.
21 Binz, H. K.; Amstutz, P.; Pluckthun, A. Nat. Biotechnol. 2005, 23, 1257–1268.
22 Boersma, Y. L.; Pluckthun, A. Curr. Opin. Biotechnol. 2011, 22, 849–857.
23 Takaoka, Y.; Ojida, A.; Hamachi, I. Angew. Chem. Int. Ed. 2013, 52, 4088–4106.
24 Walper, S. A.; Turner, K. B.; Medintz, I. L. Curr. Opin. Biotechnol. 2015, 34, 232–241.
25 Hinner, M. J.; Johnsson, K. Curr. Opin. Biotechnol. 2010, 21, 766–776.
26 Schumacher, D.; Hackenberger, C. P. Curr. Opin. Chem. Biol. 2014, 22, 62–69.
27 Cobo, I.; Li, M.; Sumerlin, B. S.; Perrier, S. Nat. Mater. 2015, 14, 143–159.
28 Kent, S. B.; Alewood, P. F. Curr. Opin. Chem. Biol. 2014, 22, viii–xi.
29 Basle, E.; Joubert, N.; Pucheault, M. Chem. Biol. 2010, 17, 213–227.
30 Shannon, D. A.; Weerapana, E. Curr. Opin. Chem. Biol. 2015, 24, 18–26.
31 Algar, W. R.; Prasuhn, D. E.; Stewart, M. H.; Jennings, T. L.; Blanco‐Canosa, J. B.; Dawson, P. E.; Medintz, I. L. Bioconjug. Chem. 2011, 22, 825–858.
32 Horisawa, K. Front. Physiol. 2014, 5, 457.
33 McKay, C. S.; Finn, M. G. Chem. Biol. 2014, 21, 1075–1101.
34 Veggiani, G.; Zakeri, B.; Howarth, M. Trends Biotechnol. 2014, 32, 506–512.
35 Hoehnel, S.; Lutolf, M. P. Bioconjug. Chem. 2015, 26, 1678–1686.
36 Reddington, S. C.; Howarth, M. Curr. Opin. Chem. Biol. 2015, 29, 94–99.
37 Chen, A. Y.; Deng, Z.; Billings, A. N.; Seker, U. O. S.; Lu, M. Y.; Citorik, R. J.; Zakeri, B.; Lu, T. K. Nat. Mater. 2014, 13, 515–523.
38 Urh, M.; Rosenberg, M. Curr. Chem. Genomics 2012, 6, 72–78.
39 Ritzefeld, M. Chemistry 2014, 20, 8516–8529.
40 Schmohl, L.; Schwarzer, D. Curr. Opin. Chem. Biol. 2014, 22, 122–128.


4 The Composite Approach


4.1 Inorganic Nanoparticles

Carole Aimé and Thibaud Coradin

Sorbonne Universités, UPMC Univ Paris 06, Collège de France, UMR CNRS 7574, Laboratoire de Chimie de la Matière Condensée de Paris, Paris, France

4.1.1 Introduction

The term “nanomaterials” encompasses a wide variety of systems that have in common one or several physical or chemical features with dimensions at the nanometer scale. A distinction is usually made between nanostructured materials, which are macroscale elements composed of nanoscale objects, and nanoparticles or colloids, which are individualized nano‐objects [1]. It is generally accepted that nanoparticles have sizes between 1 and 100 nm [2]. However, some controversy exists in defining the boundaries of the size range, especially in the small size domain where molecules or atomic clusters can be 2–3 nm in dimensions [3]. Hence, in practice, the limits of the “nanoworld” remain rather undefined and will be assumed here to apply to any object with a dimension below 1 µm.

A traditional frontier distinguishes organic from inorganic chemical systems. Organic chemistry mainly corresponds to carbon chemistry in its association with hydrogen, oxygen, nitrogen, sulfur, phosphorus, and halogens. All other compounds are considered inorganic. However, pure carbon compounds are usually classified as inorganic [4], whereas some metalloid elements (in particular, boron, silicon, phosphorus, and tin) [5] can be used to prepare molecules mostly considered as organic compounds. This reflects the fact that, for a long time, organic compounds were studied and used in a molecular, dispersed form, whereas inorganic compounds mainly corresponded to solid phases. The development of coordination chemistry, showing that metals can form molecules, and of polymer science, showing that organic molecules can form solids and exhibit physical properties similar to those of inorganic phases, certainly helped to blur this distinction.

Bionanocomposites: Integrating Biological Processes for Bioinspired Nanotechnologies, First Edition. Edited by Carole Aimé and Thibaud Coradin. © 2017 John Wiley & Sons, Inc. Published 2017 by John Wiley & Sons, Inc.


The emergence of nanoscience further unsettled traditional classifications. New compounds, new synthetic routes, and new properties have been discovered. An exhaustive presentation of these is beyond the scope of the present book. In this chapter we focus on information relevant to the field of bionanocomposites, presenting the main families of nanoparticles, their preparation by solution routes, and some of their useful properties.

4.1.2 Overview of Inorganic Nanoparticles

The classification of nanoparticles is currently an important challenge, mainly motivated by regulation issues. This classification can be based on their chemical or physical properties. For instance, the US Environmental Protection Agency identifies four main types of man‐made nanomaterials: carbon‐based materials (including carbon nanotubes, fullerene, graphene), metal‐based materials (including metals, metal oxides, and metal chalcogenides, i.e., quantum dots), dendrimers (i.e., star‐shaped organic macromolecules), and composites (based on the combination of several nanoparticles).¹ The EU commission has proposed a broader and more detailed classification distinguishing between metallic and nonmetallic metal‐containing nanoparticles, a remnant of the traditional distinction between metals and ceramics in bulk materials science [6]. However, it also places quantum dots in a distinct category, highlighting that the scaling down of metal chalcogenides introduces new properties specific to the nanoscale. Table 4.1 follows this classification and gathers the main types of what will be considered here as inorganic nanomaterials together with their main useful properties.

It is important to point out that all of these nanomaterials have been widely used for the design of bionanocomposites except for alumina and zirconia. One reason is that these two oxides are traditionally used in composite structures for their high hardness, a property that is not specifically targeted in bio‐based materials. In the case of alumina, the high toxicity of the aluminum ion has probably limited its use so far.

4.1.3 Synthesis of Inorganic Nanoparticles

4.1.3.1  Basic Principles

Traditionally, methodologies to synthesize materials at the nanoscale are classified in two categories: the top‐down and the bottom‐up approaches. The top‐down strategies start from larger objects that are converted into smaller

¹ https://www.epa.gov/chemical-research/research-nanomaterials


Table 4.1  Main inorganic nanoparticles and their specific properties.

Main category | Name/formula | Useful properties
Inorganic nonmetallic nanoparticles | Amorphous silica SiO2 | Wide size range; easy surface functionalization; controlled porosity; low cytotoxicity
Inorganic nonmetallic nanoparticles | Titanium dioxide TiO2 | Optical properties; electronic conductivity
Inorganic nonmetallic nanoparticles | Zinc oxide ZnO | Photodegradation; antibacterial properties
Inorganic nonmetallic nanoparticles | Aluminum oxide Al2O3 | Hardness
Inorganic nonmetallic nanoparticles | Zirconium oxide ZrO2 | Hardness
Inorganic nonmetallic nanoparticles | Iron oxides Fe2O3 (hematite), Fe3O4 (magnetite) | Magnetism; low cytotoxicity
Inorganic nonmetallic nanoparticles | Cerium dioxide CeO2 | Proton conductivity
Inorganic nonmetallic nanoparticles | Calcium carbonate CaCO3 | Hardness; low toxicity
Inorganic nonmetallic nanoparticles | Calcium phosphates | Bioactivity
Metals and metal alloys | Gold Au | Optical properties; electronic conductivity
Metals and metal alloys | Silver Ag | Catalytic properties; antibacterial properties
Metals and metal alloys | Others (Pt, Fe, Co, Ni, Au–Ag alloys) | Magnetic properties; catalytic properties
Carbon‐based nanomaterials | Fullerenes; carbon nanotubes; carbon black; graphene and graphene oxide | Electrical conductivity; shape anisotropy
Quantum dots | Metal sulfides ZnS, CdS | Optical properties
Nanoclays | Bentonite; kaolinite; montmorillonite; smectite | Intercalation; ion exchange; easy dispersion; shape anisotropy

Figure 4.1  Nanoparticle preparation via ball milling: (a) principle of ball milling [Source: Reprinted from Lu et al. [9] with permission] and SEM images of graphite powders (b) before and (c) after milling [Source: Xing et al. [10]; http://pubs.rsc.org/en/content/articlehtml/2013/nr/c3nr02328a. Used under CC BY 3.0 https://creativecommons.org/licenses/by/3.0/]. Panel (a) labels: rotation of the grinding bowl, movement of the supporting disc, centrifugal force, horizontal section. Scale bars in (b) and (c): 200 nm.

ones by fragmentation or etching [7, 8]. These strategies are mainly based on physical processes, such as mechanical forces or irradiation. For instance, the high‐energy ball milling process involves the grinding of a micrometric powder by the mechanical action of hard balls placed in a closed chamber under shaking (Figure 4.1) [9, 10]. However, this approach usually leads to a large size distribution and poor morphological control of the resulting nanoparticles. In contrast, lithography techniques, using light, ions, or electrons to locally degrade (or alternatively reinforce) a substrate following a predefined motif, allow for the creation of nanoscale patterns with a very high degree of precision. Yet these technologies are time and energy consuming and are usually not suited to the synthesis of nanoparticles that can be further transferred into solutions.

Bottom‐up approaches go the other way round and start from smaller objects (i.e., atoms, ions, or molecules) to build larger structures. The key challenge is therefore to ensure that the dimensions of these structures do not go beyond the nanoscale. The primary event involved in the formation of a solid consists of the binding of two objects, also called precursors, initially in a nonsolid phase (for instance, atoms in a gas or ions in solution). For the sake of clarity, let us consider that these two precursors A and B are different and form a solid of formula AB. A and B are moving in their medium and can come into contact. At this point, either they form a bond corresponding to their association in the solid state or they dissociate. If the former process sufficiently prevails over the latter, then a stable assembly of As and Bs is obtained, called a nucleus. This is called the nucleation phase. These nuclei then act as seeds for the growth of larger particles, either by continued binding of A and B on the nuclei surface or by association of nuclei. This growth phase stops when there are no longer free A and B in the medium or if the surface reactivity is decreased.

To understand better how these processes occur and how they can be controlled, thermodynamic considerations can be useful [11]. To simplify the presentation, we will consider a liquid phase reaction where A and B are dissolved in a solvent (solv). The equilibrium corresponding to the formation of the AB solid (s) can be written as

$$\mathrm{A_{solv}} + \mathrm{B_{solv}} \rightleftharpoons \mathrm{AB_{s}}$$

The equilibrium constant K of this reaction can be approximated as K = 1/([A][B]). Note that 1/K corresponds to the solubility product of ABs (termed Ks). The standard Gibbs free energy (or free enthalpy) of the equilibrium can be expressed as

$$\Delta G^0 = -RT \ln K = RT \ln K_s$$

For a given solution containing A and B at concentrations [A]0 and [B]0, the free enthalpy of the reaction is written as

$$\Delta G_r = -RT \ln \frac{[\mathrm{A}]_0 [\mathrm{B}]_0}{K_s} = -RT \ln S$$



where S is the supersaturation, indicating the excess of A and B compared with the minimum concentration required for solid formation. The reaction of solid formation is favored if ΔGr < 0, that is, if S > 1. Going back to the previous description of the nucleation reaction, this corresponds to the step where the formation of solid‐like A–B bonds begins to be favored over their dissociation. However, as mentioned earlier, the formation of a nucleus requires that a sufficient number of these A–B bonds are formed.

From an energy point of view, the formation of a particle should be considered at two levels: (i) in the volume of the particle, there is a global gain in free enthalpy (ΔGv) due to both the free enthalpy associated with the formation of the A–B bond and the change in entropy due to the transition from the dispersed to the condensed phase; (ii) on the surface, there is an energy cost associated with particle–solvent interactions called the interfacial energy (γ). The free enthalpy of the reaction can therefore be written as

$$\Delta G_r = 4\pi r^2 \gamma - \frac{4}{3}\pi r^3 \, \frac{\Delta G_v}{V_m}$$

where Vm is the molar volume of AB and r is the radius of the particle.
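As a rough numerical illustration of the expressions above, the following Python sketch computes the supersaturation S, the reaction free enthalpy −RT ln S, and the size‐dependent balance between the surface and volume terms. All numerical values (solubility product, concentrations, interfacial energy, molar volume) are arbitrary assumptions chosen only for illustration, and the identification ΔGv = RT ln S is an assumption made here to link the two expressions; the function names are hypothetical.

```python
import math

R = 8.314  # gas constant, J mol^-1 K^-1


def supersaturation(conc_a, conc_b, ks):
    """Return S = [A]0 [B]0 / Ks for the equilibrium A + B <=> AB(s)."""
    return conc_a * conc_b / ks


def reaction_free_enthalpy(s, temperature):
    """Return Delta G_r = -R T ln S in J/mol (negative when S > 1)."""
    return -R * temperature * math.log(s)


def particle_free_enthalpy(radius, gamma, molar_volume, delta_gv):
    """Return Delta G_r(r) in J for one spherical particle of radius r (m).

    gamma is the interfacial energy (J/m^2), molar_volume is Vm (m^3/mol),
    and delta_gv is the molar gain in free enthalpy (J/mol, taken positive).
    """
    surface_term = 4.0 * math.pi * radius**2 * gamma
    volume_term = (4.0 / 3.0) * math.pi * radius**3 * delta_gv / molar_volume
    return surface_term - volume_term


if __name__ == "__main__":
    # Hypothetical values, chosen only to illustrate orders of magnitude.
    T = 298.15                                  # temperature, K
    S = supersaturation(1e-4, 1e-4, 1e-10)      # concentrations in mol/L
    dG_v = R * T * math.log(S)                  # assumption: Delta G_v = R T ln S
    print(f"S = {S:.1f}, Delta G_r = {reaction_free_enthalpy(S, T):.0f} J/mol")
    for r_nm in (0.2, 0.4, 0.6, 0.8):
        dG = particle_free_enthalpy(r_nm * 1e-9, 0.1, 2.5e-5, dG_v)
        print(f"r = {r_nm:.1f} nm -> Delta G_r(r) = {dG:.2e} J")
```

With these assumed parameters the free enthalpy is positive for the smallest radii, where the surface term dominates, and becomes negative for larger ones, which is the behavior the critical radius captures.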


From this equation, a critical radius r* = 2γVm/ΔGv can be deduced, corresponding to the maximum in ΔGr, denoted ΔGr*. This radius r* (or rc) corresponds to the size of the nucleus (Figure 4.2a). The Gibbs–Thomson hypothesis states that it is possible to associate a critical supersaturation S* with the critical radius r*. S* represents the excess of A and B above AB solubility required to form the nucleus.
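The critical radius quoted above follows from setting the derivative of ΔGr with respect to r to zero. The short derivation below uses only the symbols already defined; taking ΔGv = RT ln S* is an assumption, made here because it is consistent with the Vm/RT and ln S* factors that appear in the critical free enthalpy.

$$\frac{d\Delta G_r}{dr} = 8\pi r \gamma - 4\pi r^2 \, \frac{\Delta G_v}{V_m} = 0 \quad\Longrightarrow\quad r^* = \frac{2\gamma V_m}{\Delta G_v}$$

$$\Delta G_v = RT \ln S^* \quad\Longrightarrow\quad r^* = \frac{2\gamma V_m}{RT \ln S^*}$$

$$\Delta G_r^* = \Delta G_r(r^*) = 4\pi r^{*2}\gamma - \frac{4}{3}\pi r^{*3}\,\frac{\Delta G_v}{V_m} = \frac{16\pi}{3}\,\frac{\gamma^3 V_m^2}{\Delta G_v^2}$$

Under this assumption, a higher critical supersaturation corresponds to a smaller critical radius and a lower barrier.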

Figure 4.2  Nucleation and growth of nanoparticles. (a) Nucleation occurs when the seed size exceeds a critical radius (rc) determined by the balance between bulk‐free energy and surface energy. (b) In La Mer’s model, if supersaturation is reached rapidly, multiple nuclei are formed before the growth process starts [Source: Polte [11]; http://pubs.rsc.org/is/content/articlehtml/2015/ce/c5ce01014d. Used under CC BY 3.0 https://creativecommons.org/licenses/by/3.0/]. Panel (a) labels: ΔG versus radius, surface energy, bulk‐free energy, ΔGc, rc. Panel (b) labels: concentration versus time, Cs (saturation), Cmin, Cmax (critical limiting supersaturation), rapid self‐nucleation, growth, stages I, II, and III.

The critical free enthalpy for nucleation can then be written as

$$\Delta G_r^* = \frac{16\pi}{3} \left( \frac{V_m}{RT} \right)^2 \gamma^3 \left( \frac{1}{\ln S^*} \right)^2$$

with R as the gas constant and T the temperature. Hence, nucleation is thermodynamically favored by low γ and high S*. In terms of kinetics, the Arrhenius law can be applied:

$$v = k_n \exp\left( -\frac{\Delta G_r^*}{RT} \right)$$

where v is the rate of nucleation and kn is a kinetic constant that is mainly related to the probability of the reagents to meet and interact, that is, diffusion‐related effects. From this equation, it can be deduced that nucleation will be faster when γ is small or S* is large.

Once some nuclei are formed, the growth process starts. The precise thermodynamic description of this process becomes tricky because the reactions no longer occur between precursors in solution but between free precursors and species present on the particle surface or even between two particle surfaces. This process therefore becomes closer to a solid–liquid heterogeneous reaction and strongly depends on the affinity of the surface for its surrounding medium, that is, the interfacial energy.

This theory, largely applied to precipitation and crystallization processes, was further reconsidered in the case of nanomaterials by La Mer, showing that if critical supersaturation is reached rapidly, many nucleation events occur simultaneously, depleting the solution of free precursors (Figure 4.2b). The growth step is therefore separated from and controlled by the nucleation, offering fruitful insights for the preparation of nanoparticles with well‐defined sizes.

Overall, these considerations allow for the identification of the key parameters one can play with to control the size of particles:

● Concentration: If conditions of high supersaturation are used, nucleation is highly favored. Therefore a large quantity of seeds will be rapidly formed in the medium, depleting it of free precursors. The growth will stop when these are no longer available.

● Surface reactivity: If unfavorable interactions exist between the particle surface and its surroundings, the nucleation phase will be slow, so that many free precursors remain available for further growth. Particle size must then be controlled by limiting the ability of these precursors to bind to the particle surface. As we will show in the following section, this can be achieved by modulating the intrinsic surface reactivity (for instance, its charge) or by capping the surface with organic molecules.

● Temperature and stirring: Higher temperature usually favors both nucleation and growth, and its influence on precursor reactivity may be a determining factor in controlling particle size. High stirring rates are important to achieve fast and homogeneous nucleation and growth, as they favor rapid diffusion of the reagents. They can also prevent the agglomeration of nuclei or particles.
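The influence of supersaturation and interfacial energy discussed above can be explored numerically. The Python sketch below evaluates the critical radius, the nucleation barrier, and the nucleation rate relative to kn for a few values of S. The parameter values are hypothetical, the function names are illustrative, and the relation ΔGv = RT ln S is an assumption consistent with the expression of the critical free enthalpy. Since the barrier computed this way is an energy per nucleus, it is divided by kB T = RT/NA in the exponential, which is equivalent to using RT with a molar barrier.

```python
import math

R = 8.314          # gas constant, J mol^-1 K^-1
N_A = 6.022e23     # Avogadro constant, mol^-1


def molar_driving_force(s, temperature):
    """Return R T ln S (J/mol); only meaningful for S > 1."""
    if s <= 1.0:
        raise ValueError("nucleation requires a supersaturation S > 1")
    return R * temperature * math.log(s)


def critical_radius(gamma, molar_volume, s, temperature):
    """Return r* = 2 gamma Vm / (R T ln S) in metres."""
    return 2.0 * gamma * molar_volume / molar_driving_force(s, temperature)


def nucleation_barrier(gamma, molar_volume, s, temperature):
    """Return (16 pi / 3) gamma^3 (Vm / (R T ln S))^2 in J per nucleus."""
    ratio = molar_volume / molar_driving_force(s, temperature)
    return (16.0 * math.pi / 3.0) * gamma**3 * ratio**2


def relative_nucleation_rate(gamma, molar_volume, s, temperature):
    """Return v / k_n = exp(-barrier / (k_B T)), with k_B T = R T / N_A."""
    barrier = nucleation_barrier(gamma, molar_volume, s, temperature)
    return math.exp(-barrier * N_A / (R * temperature))


if __name__ == "__main__":
    # Hypothetical parameters, for illustration only.
    gamma, v_m, T = 0.1, 2.5e-5, 298.15   # J/m^2, m^3/mol, K
    for S in (5, 10, 50, 100):
        r_star = critical_radius(gamma, v_m, S, T)
        rate = relative_nucleation_rate(gamma, v_m, S, T)
        print(f"S = {S:>3}: r* = {r_star * 1e9:.2f} nm, v/k_n = {rate:.2e}")
```

In this simplified picture, raising S or lowering γ shrinks the critical radius and increases the relative rate by many orders of magnitude, which is why rapid attainment of high supersaturation tends to produce many small nuclei.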

4.1.3.2  Nanoparticles from Solutions

As pointed out earlier, the very first steps of solid formation involve atomic, ionic, or molecular objects that interact with one another and collectively form a nucleus. Hence, in many instances, the strategies developed for nanoparticle synthesis depend on the nature and reactivity of these precursors. A first distinction can be made between gaseous and liquid precursors, but, for the sake of clarity, only the latter will be described in this section.

4.1.3.2.1  Ionic Solids

The simplest case to consider consists of ionic solids of formula CA where C is a cation and A is an anion. These solids can, in principle, be formed by mixing salts of C and A. However, the rapid addition of the C solution to the A solution above supersaturation leads to fast precipitation of CA without any control of the particle size. One possibility is to co‐inject the two ions into a common liquid medium in a time‐controlled manner. This medium can contain an additional reagent favoring precipitation (for instance, acids or bases to shift the pH of the reaction). However, in many cases, such kinetic control of the mixing conditions is not sufficient, and other strategies must be used, playing on the nature of the precursor, on the solvent properties, and/or on the presence of a surface capping agent.

This is nicely illustrated by the synthesis of CdSe quantum dots, as described by Landry et al. (Figure 4.3) [12]. In this work, the Cd²⁺ precursor solution contained oleic acid and the Se²⁻ solution was prepared from trioctylphosphine. These two additives are amphiphilic molecules, with their polar head having good affinity for cadmium and selenium, respectively. The reaction solvent was octadecene, an apolar solvent with low affinity for CdSe but high affinity for the hydrophobic chains of the surfactants. The two precursor solutions are added simultaneously under stirring and heating. The nucleation occurs rapidly due to supersaturation. This results in the formation of a micellar‐like shell of amphiphiles capping the surface of the particle. The particle is therefore confined, and its growth is limited not only by the fact that the surface is no longer accessible to free precursors but also because the micelle, whose size depends on the amphiphile concentration, constrains the space available for particle expansion.

Figure 4.3  Synthesis of CdSe quantum dots and the resulting TEM image [Source: Landry et al. [12]. Reproduced with permission of American Chemical Society]. Scheme labels: Cd(Ac)2, R–COOH, ODE; Se–TOP; [Cd–Se]i; surfactant: R–COOH, TOP, R–NH2*; ODE, R–NH2*, 165°C; CdSe QD. Scale bar: 20 nm.

Another way to take advantage of the confinement effect of amphiphilic assemblies is to use the emulsion method [13]. In this case, the cation and anion precursors are separately dissolved in the liquid core of micellar objects. When the two micellar solutions are mixed, they can exchange their internal content via fast coalescence/separation events, resulting in the confinement of both precursors within the same droplets, and therefore precipitation.

4.1.3.2.2  Metals

In the case of metals, the reaction leading to the formation of the solid metallic M⁰ phase is the reduction of a metal Mᶻ⁺ salt. Thus, the strategies described above can be applied with one precursor being this metal ion and the second one being a reducing agent, for instance, using emulsion routes (Figure 4.4) [14]. However, the seminal work by Turkevich et al. introduced the concept of using reducing molecules that can also act as capping agents, as illustrated by the formation of gold nanoparticles by mixing solutions of tetrachloroauric acid HAuCl4 with citric acid [15]. The idea of using additives that play multiple roles in the control of the nanoparticle size gave rise to another very popular method called the polyol process [16]. This methodology uses polyalcohol molecules, such as ethylene glycol, as solvent, reducing agent, and capping molecules.

A major limitation of these methods is that the surface oxidation of the nanoparticles is difficult to avoid during the synthetic process. This can be detrimental for their application, for instance, in catalysis. In this case, the thermal decomposition (also called pyrolysis) of organometallic precursors constitutes a particularly attractive and flexible method. The typical example of this approach is the use of iron pentacarbonyl Fe(CO)5 for the synthesis of Fe nanoparticles (Figure 4.5). Decomposition can be achieved by heating [17] or sonochemical activation [18], in the presence of polymers that can act as catalysts for the reaction.

Figure 4.4  The microemulsion route to metal nanoparticles [Source: Capek [14]. Reproduced with permission of Elsevier]. Scheme labels: microemulsion I (oil phase; aqueous phase containing a metal salt: FeCl3, FeCl2, CuCl2, etc.); microemulsion II (oil phase; aqueous phase containing a reducing agent: NH4OH, N2H4, NaBH4, etc.); mix microemulsion I and II; collision and coalescence of droplets; percolation; chemical reaction occurs; precipitate (metal or metal oxide).

4.1.3.2.3  Metal Oxides

The formation of metal oxides via solution routes, also known as the sol–gel process [19], follows a different pathway. In aqueous solutions, metal ions Mᶻ⁺ are solvated by H2O molecules. However, depending on their charge and size, they can induce the partial deprotonation of these water molecules, forming M─OH or even M─O⁻ bonds. These species can remain isolated, especially at low concentration, or interact with one another to form M─O─M bonds following a condensation reaction. Such a reaction can proceed in a way very similar to polymerization processes known in organic chemistry, by successive addition of M─O groups. However, in contrast to common organic monomers, which exhibit only one reactive function and therefore form 1D chains, metal ions are usually surrounded by four to six water molecules. Hence the “polymerization” reaction can extend in 3D, leading to particulate objects. It is important to point out that the condensation reaction highly depends on the pH of the medium, which defines the charge and therefore the reactivity of the metal–oxygen species. For instance, in extreme pH conditions, only positively charged (M─H3O⁺) or negatively charged (M─O⁻) species are present, which can hardly form M─O─M bonds due to mutual electrostatic repulsion. On this basis, adjusting parameters like pH or temperature or using complexing/capping agents allows

Figure 4.5  TEM images of Fe nanoparticles obtained from Fe(CO)‐oleylamine precursors at various temperatures and times: (a) 30°C for 1 min, (b) 30°C for 60 min, (c) 30°C for 180 min, (d) 70°C for 60 min, (e) 100°C for 60 min, and (f) 130°C for 60 min. All scale bars represent 20 nm [Source: Kura et al. [17]. Reproduced with permission of American Chemical Society].

for the controlled synthesis of nanoparticles for some important metal oxides such as iron oxides [20]. These processes are usually referred to as aqueous sol–gel routes and have been extensively covered by Jolivet et al. in a highly recommended reference book [21]. One of the most widespread applications of this approach is the preparation of iron oxide (magnetite and maghemite) nanoparticles using the alkalinization of aqueous mixtures of Fe²⁺ and Fe³⁺ salts. It was shown that the condensation reaction occurs between two neutral precursors, [Fe2(OH)4(H2O)8]⁰ and [Fe2(OH)6(H2O)6]⁰. The determining factors for the success of this synthesis are therefore not only the Fe²⁺/Fe³⁺ ratio and the pH of the reaction but also the ionic strength of the medium (Figure 4.6) [22].

However, in many cases, it is important to use precursors that do not react spontaneously but only when water is added. In this context, the most common sol–gel starting reagents are metal alkoxides of general formula M(OR)n, with R being an alkyl group such as CH3 or C2H5. Upon addition of water, the hydrolysis reaction converts M─OR into M─OH (or M─O⁻) groups that can undergo condensation. The pH of the added aqueous solution plays a major role in the kinetics of the hydrolysis reaction. Interested readers should refer to

Figure 4.6  (a) TEM images of magnetite particles precipitated in aqueous medium. (b) Influence of the pH of precipitation on the mean particle size at fixed Fe(II)/Fe(III) ratio and ionic strength [Source: Jolivet et al. [22]. Reproduced with permission of The Royal Society of Chemistry]. Panel (a) scale bar: 50 nm. Panel (b) axes: mean diameter D (nm, 2 to 12) versus pH (9 to 12).

the reference book on sol–gel chemistry by Brinker and Scherer for more details [23]. In some ways, the hydrolysis step of alkoxides can be compared with the reduction step of metal ions described in the previous section, as they both create the building elements of the solid network. Thus, similar strategies such as the emulsion route or the polyol process have been applied to the synthesis of metal oxide nanoparticles from alkoxides [16, 24]. Non‐hydrolytic (i.e., water‐free) processes have also been described [25].

Nevertheless, metal oxide chemistry exhibits some specificity that is very well illustrated by the Stöber process, the most popular method to obtain silica nanoparticles [26]. Tetraethoxysilane Si(OC2H5)4 (TEOS) is dissolved in ethanol at high concentration. An aqueous solution of ammonia, containing a large excess of water compared to TEOS, is then added under rapid stirring, leading to the rapid formation of nanoparticles that are left to age for a few hours. The size of the particles highly depends on the relative concentration of each reagent but, as a general trend, increases with TEOS and ammonia concentration at constant water content, while the effect of water strongly depends on the content of the two other reagents (Figure 4.7) [27]. From a mechanistic point of view, it is suggested that the large excess of water leads to the fast hydrolysis of TEOS, while the basic conditions initially favor the aggregation of nuclei and further deposition of free hydrolyzed TEOS monomer, according to the La Mer model described previously. Nevertheless, to fully understand the particle growth process, two other parameters must be taken into account. First, the isoelectric point of the particle surface tends to decrease with increasing particle size. Thus, whereas basic conditions first favor the aggregation of nuclei, larger particles bear a higher negative charge

Figure 4.7  Stöber method to silica nanoparticles. (a) Mean particle size of silica particles (DLS number mean) as a function of water and ammonium hydroxide concentration for 0.28 M TEOS. (b) Example DLS distributions with increasing ammonium hydroxide concentrations for 0.28 M TEOS and 6 M H2O [Source: Greasley et al. [27]. Reproduced with permission of Elsevier]. Panel (a) axes: particle size (nm, 0 to 1000) versus water concentration (M, 0 to 15), for 0.11, 0.28, 0.57, 0.85, and 1.13 M NH4OH. Panel (b) axes: number (%) versus size (nm, 0 to 1500), for the same NH4OH concentrations; the PDI values shown range from 0.002 to 0.035.

that not only prevents their aggregation but also slows down further monomer addition. Second, the presence of large amounts of EtOH results in an equilibrium between Si─OH and Si─OC2H5 groups on the particle surface, which affects the interfacial energy. Thus, in addition to its versatility, this method has the great advantage of allowing size control without the addition of any surfactant or other capping agent.
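The reagent concentrations quoted for the Stöber process can be converted into pipetting volumes with a short calculation. The sketch below is only illustrative. It assumes ideal volume additivity, approximate handbook values for molar masses and densities, and a hypothetical 28 wt% aqueous ammonia stock, none of which are taken from the text above. The function name `stober_volumes` is also invented for this example.

```python
# Approximate handbook values (assumptions, not taken from the chapter).
TEOS_MOLAR_MASS = 208.33       # g/mol
TEOS_DENSITY = 0.933           # g/mL
WATER_MOLAR_MASS = 18.015      # g/mol
WATER_DENSITY = 0.998          # g/mL
NH3_MOLAR_MASS = 17.031        # g/mol
# Hypothetical ammonia stock solution: 28 wt% NH3, density 0.90 g/mL.
AMMONIA_WT_FRACTION = 0.28
AMMONIA_STOCK_DENSITY = 0.90   # g/mL


def stober_volumes(total_ml, teos_molar, water_molar, ammonia_molar):
    """Return the volumes (mL) of TEOS, ammonia stock, extra water and ethanol.

    Concentrations are the target molarities in the final mixture.
    Volumes are assumed to be additive, which is only approximately true.
    """
    total_l = total_ml / 1000.0

    # TEOS: moles -> grams -> millilitres.
    teos_ml = teos_molar * total_l * TEOS_MOLAR_MASS / TEOS_DENSITY

    # Ammonia stock needed to supply the requested amount of NH3.
    nh3_g = ammonia_molar * total_l * NH3_MOLAR_MASS
    stock_g = nh3_g / AMMONIA_WT_FRACTION
    stock_ml = stock_g / AMMONIA_STOCK_DENSITY

    # The stock solution already brings water with it.
    water_from_stock_g = stock_g - nh3_g
    water_needed_g = water_molar * total_l * WATER_MOLAR_MASS
    extra_water_g = water_needed_g - water_from_stock_g
    if extra_water_g < 0:
        raise ValueError("ammonia stock alone brings more water than requested")
    water_ml = extra_water_g / WATER_DENSITY

    # Ethanol makes up the remaining volume.
    ethanol_ml = total_ml - teos_ml - stock_ml - water_ml
    if ethanol_ml < 0:
        raise ValueError("requested concentrations do not fit in the total volume")

    return {
        "teos_ml": teos_ml,
        "ammonia_stock_ml": stock_ml,
        "water_ml": water_ml,
        "ethanol_ml": ethanol_ml,
    }


if __name__ == "__main__":
    # Conditions in the range of Figure 4.7: 0.28 M TEOS, 6 M H2O, 0.57 M NH4OH.
    for name, volume in stober_volumes(100.0, 0.28, 6.0, 0.57).items():
        print(f"{name}: {volume:.2f} mL")
```

Because the ammonia stock carries most of its mass as water, the water it contributes is subtracted before the extra water is computed; at low target water content and high ammonia content the function raises an error rather than returning a negative volume.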


4.1.3.2.4  Morphological Control

In bulk materials, the morphology of the solid phase reflects their crystallographic structure. However, as pointed out earlier, for nanoparticles, the high surface‐to‐volume ratio decreases the thermodynamic contribution of the internal bonds. Moreover, the mild conditions involved in solution routes also contribute to the low crystallinity of nanoparticles. Therefore, on many occasions, as‐prepared nanoparticles have a spherical (i.e., isotropic) shape. While hydrothermal treatments improve the particles’ crystallinity, further morphological control can be achieved via a wide range of strategies [28]. The first approach relies on additives that bind specific crystalline planes, inhibiting their growth and therefore controlling the particle habit, as nicely demonstrated for ZnO nanocrystals (Figure 4.8) [29]. Another strategy relies on templating. In this case, a preformed object, either organic (soft) or inorganic (rigid), is used as a substrate for the deposition of the targeted phase, followed by the specific removal (by chemical dissolution or thermal degradation) of the template. One possibility is to use a porous support, such as a membrane, in which case the particle formation occurs within the available cavities [30]. Alternatively, the deposition can be achieved on the surface of preformed colloidal systems. In particular, biological systems have been widely used as templates to control the growth of anisotropic nanomaterials [31].

Figure 4.8  Shape control of ZnO nanocrystals through additives. TEM images and schematic crystal structure of particles obtained (a) in water, (b) with sodium dodecanoate, and (c) with dodecanoic acid [Source: Lizandara‐Pueyo et al. [29]. Reproduced with permission of The Royal Society of Chemistry].

Figure 4.9  The seed‐mediated method for gold nanorods preparation (scheme labels: CTAB, [AuCl4]–, NaBH4, seeds, ascorbic acid, Au(I)) [Source: Chen et al. [33]. Reproduced with permission of The Royal Society of Chemistry].

An interesting extension of these approaches is the so‐called seed‐mediated procedure, mainly developed for metallic nanoparticles [32]. Initially, spherical nanoparticles are prepared in the presence of a surfactant. Then a solution containing the metallic salt, the same surfactant, and a mild reducing agent is added. The slow deposition of the metal on the seed occurs simultaneously with the extension of the surfactant micellar assembly, which provides a confined space for the growth (Figure 4.9).

4.1.4 Some Specific Properties of Inorganic Nanoparticles

As pointed out earlier, inorganic nanoparticles differ from bulk phases in two respects: the small number of atoms forming the core structure and the large fraction of atoms present on the surface. From a physical perspective, the first outcome of these differences is the so‐called quantum confinement effect. This effect originates from a restriction of the space available for the displacement of internal charge carriers (electrons and holes), modifying the electronic structure of the solid, that is, the energy levels that can be occupied by electrons. As the light absorption/emission properties of a solid are directly related to these energy levels, variations in particle size will directly impact their optical properties. In semiconductors, the electron–hole distance in the bulk phase is in the order of a few nanometers so that their optical properties will vary with their size for dimensions below

Figure 4.10  The quantum confinement effect in semiconductors leads to modification of the density and energy of electronic levels in the valence band (VB) and conduction band (CB), changing the gap between these two bands (Eg) and therefore the optical properties [Source: de Mello Donega [34]. Reproduced with permission of The Royal Society of Chemistry].

circa 5 nm: this is the reason behind the optical properties of quantum dots (Figure 4.10) [34, 35].

The second important outcome of nanoscale confinement is surface plasmon resonance. Surface plasmons are collective electron motions formed at the interface between two media whose permittivities have opposite signs, for instance, a metal and air. A resonance phenomenon can be obtained when the interface is irradiated with an electromagnetic wave whose frequency is equal to that of the plasmon wave. For noble metals (Au, Ag) at the nanoscale, the surface plasmon becomes localized around the particle, and the electrons oscillate with a frequency that can be in the range of visible light, resulting in an optical absorption process [36]. The energy, shape, and intensity of such a surface plasmon resonance band are sensitive not only to particle size but also to the chemical nature of the metal and the shape of the particle (Figure 4.11) [37]. Moreover, as an interfacial phenomenon, it will vary if a molecule is adsorbed on the surface.

Magnetic properties are also very sensitive to size. The best‐known effect of scaling down magnetic materials to the nanoscale is related to the small number of spin carriers, making the particles less sensitive to an external magnetic field and giving rise to the superparamagnetic state.

More generally, the different properties exhibited by surface groups compared to core species that are both geometric (due to surface curvature effects)

Figure 4.11  Surface plasmon resonance in metal nanoparticles: effect of (a) Au particle size r (20, 40, and 60 nm), (b) Au particle aspect ratio R (1.0 to 3.4), and (c) Au1–xAgx alloy composition (xAg = 0.0 to 1.0) on the surface plasmon resonance (SPR) band wavelength for different refractive index ns of the external medium (1.3 to 1.7). Linear fits shown in panel (a): λext = 331.35 ns + 164.83, λext = 238.38 ns + 227.76, and λext = 153.92 ns + 301.74 [Source: Lee and El‐Sayed [37]. Reproduced with permission of American Chemical Society].


and chemical (due to the dissymmetry in the local environment between the particle internal structure and the external medium) can have a large influence on nanomaterial reactivity. The first parameter is clearly illustrated by silica nanoparticles, for which the number of reactive silanol groups and surface charge, both impacting on their possible functionalization as well as on their

Figure 4.12  Effect of nanoparticles (“nanocharges”) on the Young modulus E of polymer nanocomposites compared to that of the matrix alone Em, plotted as relative modulus E/Em against filler content (wt%), for (a) PA6 matrix (MWNT, organoclay, and silica fillers) and (b) PMMA matrix (SWNT, organoclay, and alumina fillers), illustrating high reinforcing effect at low concentration for multiwall carbon nanotubes (MWNT) and decreased modulus in the presence of alumina particles [Source: Tjong [40]. Reproduced with permission of Elsevier].

cytotoxicity, varies with particle size [38]. The second effect was claimed to contribute to the high catalytic performance of gold nanoparticles since Au atoms on the surface have a lower coordination state than the core metallic atoms, and therefore a more open coordination sphere [39]. The large surface‐to‐volume ratio of nanoparticles also increases their specific surface area, that is, the surface area per gram of material, and therefore theoretically enhances their adsorption capacity. Yet this is not always true in practice because the sorption process can affect the colloidal stability, leading to particle aggregation and subsequent loss of available surface. Similarly, it is often assumed that nanoscale fillers (“charges”) are more efficient than microscale ones in improving the mechanical properties of polymer composites because they have a larger surface of contact with the matrix. However, such a reinforcement also depends on particle shape, surface reactivity, and particle dispersion so that, ultimately, the introduction of nanoparticles may result in a decrease in stability of the composite material (Figure 4.12) [40].
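The link between particle size and specific surface area can be made concrete with a purely geometric estimate. For monodisperse, nonporous spheres of diameter d and density ρ, the area πd² divided by the mass ρπd³/6 gives S = 6/(ρd). The sketch below applies this relation; the density of 2.2 g/cm³ used for silica is an assumed typical value for amorphous silica, and real particles (porous, aggregated, or polydisperse) will deviate from the estimate.

```python
def specific_surface_area_m2_per_g(diameter_nm, density_g_cm3):
    """Geometric specific surface area S = 6 / (rho * d) of ideal spheres.

    diameter_nm    particle diameter in nanometres
    density_g_cm3  density of the solid in g/cm^3
    Returns the surface area in m^2 per gram.
    """
    if diameter_nm <= 0 or density_g_cm3 <= 0:
        raise ValueError("diameter and density must be positive")
    diameter_m = diameter_nm * 1e-9          # nm -> m
    density_g_m3 = density_g_cm3 * 1e6       # g/cm^3 -> g/m^3
    return 6.0 / (density_g_m3 * diameter_m)


if __name__ == "__main__":
    SILICA_DENSITY = 2.2  # g/cm^3, assumed value for amorphous silica
    for d in (10, 100, 1000):
        s = specific_surface_area_m2_per_g(d, SILICA_DENSITY)
        print(f"d = {d:5d} nm -> S = {s:7.1f} m^2/g")
```

The inverse dependence on diameter is the point of the example: reducing the particle size by a factor of 100 increases the geometric surface area per gram by the same factor, provided the particles remain dispersed.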

4.1.5 Concluding Remarks

This brief overview of inorganic nanoparticles aimed to demonstrate their diversity in chemical structures, the versatility of available synthetic methods, and the wide range of properties they can exhibit. It therefore constitutes a tool box of functional building blocks that can be selected for association with a given biological macromolecule to construct bionanocomposite materials. However, it must be pointed out that in a majority of cases, such an association cannot be obtained by simply mixing the two components but requires a fine control of their interactions that can be achieved by further chemical modification of the particle surface and/or of the biomacromolecule chemical functionality [41], as shown in the following chapters.

References

1 Cao, G.; Wang, Y. Nanostructures and Nanomaterials: Synthesis, Properties and Applications, Imperial College Press, London, 2004.
2 Commission Recommendation of 18 October 2011 on the definition of nanomaterials (2011/696/EU), The European Commission, 2011.
3 Vinslava, A.; Tasiopoulos, A. J.; Wernsdorfer, W.; Abboud, K. A.; Christou, G. Inorg. Chem. 2016, 55, 3419–3430.
4 York, A. P. E. J. Chem. Educ. 2004, 81, 673–676.
5 Thomas, S. E. Organic Synthesis: The Roles of Boron and Silicon, Oxford Chemistry Primers Series, Oxford University Press, Oxford, 1991.
6 Commission staff working paper: Types and uses of nanomaterials, including safety aspects, SWD, 2012 288 final.
7 Biswas, A.; Bayer, I. S.; Wang, T.; Dervishi, E.; Faupel, F. Adv. Colloid Interface Sci. 2012, 170, 2–27.
8 Yu, H. D.; Regulacio, M. D.; Ye, E.; Han, M. Y. Chem. Soc. Rev. 2013, 42, 6006–6018.
9 Lu, Y.; Guan, S.; Hao, L.; Yoshida, H. Coatings 2015, 5, 425–464.
10 Xing, T.; Sunarso, J.; Yang, W.; Yin, Y.; Glushenkov, A. M.; Li, L. H.; Howlett, P. C.; Chen, Y. Nanoscale 2013, 5, 7970–7976.
11 Polte, J. Cryst. Eng. Comm. 2015, 17, 6809–6830.
12 Landry, M. L.; Morrell, T. E.; Karagounis, T. K.; Hsia, C.‐H.; Wang, C.‐Y. J. Chem. Educ. 2014, 91, 274–279.
13 Lopez‐Quintela, M. A. Curr. Opin. Colloid Interface Sci. 2003, 8, 137–144.
14 Capek, I. Adv. Colloid Interface Sci. 2004, 110, 49–74.
15 Turkevich, J.; Stevenson, P. C.; Hillier, J. Discuss. Faraday Soc. 1951, 11, 55–75.
16 Dong, H.; Chen, Y.‐C.; Feldmann, C. Green Chem. 2015, 17, 4107–4132.
17 Kura, H.; Takahashi, M.; Ogawa, T. J. Phys. Chem. C 2010, 114, 5835–5838.
18 Suslick, K. S.; Fang, M.; Hyeon, T. J. Am. Chem. Soc. 1996, 118, 11960–11961.
19 Coradin, T. Encyclopedia of Inorganic and Bioinorganic Chemistry, John Wiley & Sons, Inc., Hoboken, 2013, pp. 1–12.
20 Laurent, S.; Forge, D.; Port, M.; Roch, A.; Robic, C.; Van der Elst, L.; Muller, R. N. Chem. Rev. 2008, 108, 2064–2110.
21 Jolivet, J. P. Metal Oxide Chemistry and Synthesis: From Solution to Solid State, John Wiley & Sons, Inc., New York, 2000.
22 Jolivet, J. P.; Chanéac, C.; Tronc, E. Chem. Commun. 2004, 481–487.
23 Brinker, C. J.; Scherer, G. W. Sol–Gel Science: The Physics and Chemistry of Sol–Gel Processing, Academic Press, Boston, 1990.
24 Feldmann, C.; Jungk, H.‐O. Angew. Chem. Int. Ed. 2001, 40, 359–362.
25 Niederberger, M. Acc. Chem. Res. 2007, 40, 793–800.
26 Stöber, W.; Fink, A.; Bohn, E. J. Colloid Interface Sci. 1968, 26, 62–69.
27 Greasley, S. L.; Page, S. J.; Sirovica, S.; Chen, S.; Martin, R. A.; Rivieiro, A.; Porter, A. E.; Jones, J. R. J. Colloid Interface Sci. 2016, 469, 213–223.
28 Xia, Y.; Xiong, Y.; Lim, B.; Skrabalak, S. E. Angew. Chem. Int. Ed. 2009, 48, 60–103.
29 Lizandara‐Pueyo, C.; Morant‐Minana, M. C.; Wessig, M.; Krumm, M.; Mecking, S.; Polarz, S. RSC Adv. 2012, 2, 5298–5306.
30 Shingubara, S. J. Nanoparticle Res. 2003, 5, 17–30.
31 Hall, S. R. Proc. R. Soc. A 2009, 465, 335–336.
32 Jana, N. R.; Gearheart, L.; Murphy, C. J. Adv. Mater. 2001, 13, 1389–1391.
33 Chen, H.; Shao, L.; Li, Q.; Wang, J. Chem. Soc. Rev. 2013, 42, 2679–2724.
34 de Mello Donega, C. Chem. Soc. Rev. 2011, 40, 1512–1546.
35 Bera, D.; Qian, L.; Tseng, T.‐K.; Holloway, P. H. Materials 2010, 3, 2260–2345.
36 Willets, K. A.; Van Duyne, R. P. Ann. Rev. Phys. Chem. 2007, 58, 267–297.
37 Lee, K.‐S.; El‐Sayed, M. A. J. Phys. Chem. B 2006, 110, 19220–19225.
38 Patwardhan, S. V.; Emami, F. S.; Berry, R.; Perry, C. C. J. Am. Chem. Soc. 2012, 134, 6244–6256.
39 Thompson, D. T. Nanotoday 2007, 2, 40–43.
40 Tjong, S. C. Mater. Sci. Eng. R 2006, 53, 73–197.
41 Voisin, H.; Aimé, C.; Coradin, T. Eur. J. Inorg. Chem. 2015, 27, 4463–4480.

4.2 Hybrid Particles: Conjugation of Biomolecules to Nanomaterials

Nikola Ž. Knežević1,*, Laurence Raehm2 and Jean‐Olivier Durand2

1 Faculty of Technology and Metallurgy, University of Belgrade, Belgrade, Serbia
2 Institut Charles Gerhardt Montpellier UMR‐5253 CNRS‐UM2‐ENSCM‐UM1cc, Montpellier, France

* Corresponding author: [email protected]

Bionanocomposites: Integrating Biological Processes for Bioinspired Nanotechnologies, First Edition. Edited by Carole Aimé and Thibaud Coradin. © 2017 John Wiley & Sons, Inc. Published 2017 by John Wiley & Sons, Inc.

4.2.1 General Considerations

Biofunctionalization of the surface of nanomaterials is usually performed by utilizing the reactivity of various functional groups that are typically present on the biomolecules. These moieties are frequently amine, carboxylic acid, hydroxyl, thiol, or carbonyl groups from peptides, nucleic acids, or smaller biomolecules. The biofunctionalization may also take advantage of their intrinsic charge. However, functional groups that are not typically found in biomolecules can also be introduced to the reactants to provide specific “bioorthogonal” linkage formation, for example, in “click” chemistry applications. An important aspect in devising the functionalization strategy is the preservation of the activity of the biomolecules upon their attachment to the surface, which implies that the active sites of the biomolecules should remain unaltered (non‐functionalized and not denatured) and available for the interaction with the desired ligands or receptors. These requirements often call for the inclusion of spacer linkers (e.g., polyethylene glycol (PEG)) between the biomolecule and the nanoparticle surface because additional noncovalent interactions between them, for example, hydrogen bonding or electrostatic interactions, may hinder the desired activity of biomolecules. Generally, there are two main strategies for surface functionalization with complex biomolecules and linkers. (i) The bonding between the biomolecule and the linker may be initially achieved in solution (homogeneous reaction), followed by the grafting of the corresponding conjugate to the surface of nanoparticles


(heterogeneous reaction). (ii) The complex nanoarchitecture may be built sequentially through heterogeneous reactions only, starting from the reaction between the nanoparticles and the linkers, followed by the attachment of biomolecules to the surface‐grafted linkers. Higher functionalization capacities are usually achieved with the first strategy, since homogeneous reactions typically allow higher reaction yields, and the low‐yielding heterogeneous surface grafting is performed as the final step in the reaction sequence. However, it is often beneficial to perform the heterogeneous reaction early in the reaction sequence because the surface‐bound moieties are easily purified from the leftover reactants and side products in the reaction mixture by centrifugation or filtration. This is especially helpful if multicomponent reactions are performed, with the application of various catalytic molecules. Nevertheless, if the catalytic molecules and side products of the homogeneous coupling process are not reactive with the nanomaterial’s surface, it is often possible to perform immediate in situ functionalization of nanomaterials with linker‐modified biomolecules after the homogeneous coupling reaction, with subsequent separation of the side products by washing the functionalized material.

4.2.2 Functionalization of Nanoparticle Surface

4.2.2.1  Functionalization of Hydroxylated Surfaces

Solid nanomaterials containing Si, Fe, Al, Ti, or other atoms that form hydroxylated surfaces (e.g., mesoporous silica nanoparticles (MSN), iron oxide, and titanium dioxide) can be functionalized with silicon‐based bifunctional reagents [1–5]. Commercially available alkoxysilanes containing different functional groups (thiol, amine, and halide) can be grafted on the surface of these nanoparticles. The grafting is typically performed in dry solvents at an elevated temperature in order to form the covalent linkage between the surface hydroxyls and the reagent’s Si atoms, which are made electrophilic by the presence of alkoxy substituents [6]. The functionalized surfaces thus obtained are available for further modification with biomolecules through some of the conjugation processes described in Section 4.2.3.

4.2.2.2  Functionalization of Hydride‐Containing Surfaces

A novel type of nanomaterial, porous silicon nanoparticles (pSiNPs), has been developed recently for biomedical applications. These particles contain mostly Si─Si bonds in the interior, while their surface contains Si─H and Si─OH bonds [7, 8]. Hence, in addition to silanization with alkoxysilanes, this type of material can be functionalized through hydrosilylation, performed in dry solvents at slightly elevated temperature, with terminal alkenes or alkynes containing additional functional groups for post‐functionalization with biomolecules [9].


4.2.2.3  Functionalization of Metal‐Containing Nanoparticles

Metal‐containing surfaces are typically functionalized through formation of coordination bonds with bifunctionalized ligands, where one of the functional groups contains donor atoms with high affinity for coordination to the metal atoms or ions at the nanoparticle surface, while the other functional group has weaker metal‐coordination ability and remains free for further modification by the methods described in Section 4.2.3. Hence, gold nanoparticles are easily functionalized by thiol‐containing ligands [10, 11], while iron oxide and up‐conversion nanoparticles may be functionalized through carboxylate coordination [12, 13]. Thiol‐containing ligands can also be used for effective functionalization of Zn‐ and Cd‐based quantum dots [14, 15], while phosphonate ligands have demonstrated higher affinity for the surface of lanthanide‐based up‐converting nanoparticles than carboxylate ligands [16]. Nanoparticles with metal surfaces can often be functionalized directly with biomolecules through coordination of donor atoms from the biomolecules to the metal surface, for example, coordination of nucleotides to nanomaterials containing various metal ions (Au(III), Ag(I), Ce(III), Gd(III), and Tb(III)) [17]. In addition, since the metal‐containing surfaces are typically highly charged, electrostatic interactions can be utilized for direct coating of the surface with biomolecules, for example, in the case of DNA‐functionalized up‐conversion nanoparticles [18].

4.2.2.4  Functionalization of Carbon‐Based Nanomaterials

Functionalization of carbon nanotubes is most commonly achieved by oxidation of their surface upon extensive sonication in a mixture of nitric and sulfuric acid [19]. This procedure yields carboxylate‐modified surfaces, which can be further tailored by the coupling reactions presented in Section 4.2.3. Additional surface‐modification reactions of carbon nanotubes include thermally activated addition reactions and electrochemical modifications [20]. Various adaptations of these thermal oxidation/addition techniques have been proposed for the functionalization of other carbon‐containing nanoparticulate materials, including fullerenes [21], nanodiamonds [22], carbon dots [23], and graphene oxide nanoparticles [24].

4.2.3 Linker‐Mediated Conjugation of Biomolecules to Nanoparticles

4.2.3.1  Conjugation through Carbodiimide Chemistry

A very popular strategy for the formation of peptide or ester bonds between nanomaterials and biomolecules is through the coupling of amine or alcohol moieties with carboxylic acid moieties [25–27]. Khorana’s early review

summarized the typical chemical characteristics of carbodiimides [28], observing that their stability toward polymerization and decomposition increases with branching of the alkyl substituents. Carbodiimides typically used for coupling reactions are 1‐ethyl‐3‐(3‐dimethylaminopropyl) carbodiimide hydrochloride (EDC, also abbreviated as EDAC), for reactions in aqueous solutions, and N,N′‐dicyclohexylcarbodiimide (DCC) (soluble in dichloromethane, acetonitrile, dimethylformamide, and tetrahydrofuran), which is applicable for nonaqueous coupling. However, the by‐product of DCC coupling, dicyclohexylurea, has poor solubility in most organic solvents. This makes DCC very suitable for homogeneous coupling, where the precipitated by‐product is easily separated, but direct coupling of biomolecules to the surface of nanomaterials with DCC is not advisable, since the insoluble urea would be difficult to separate from the particles. In this case, the biomolecule is first coupled to a linker in a homogeneous reaction, and the resulting conjugate is grafted to the surface after removal of the dicyclohexylurea [29]. Diisopropylcarbodiimide (DIC) was developed as an alternative to DCC for coupling in organic medium since both DIC and its side product after the activation of the carboxyl moiety (diisopropylurea) are soluble in organic solvents. Investigation of the kinetics revealed that the rate‐determining step of the coupling is the reaction between the acid and the carbodiimide to give the O‐acylisourea (Figure 4.13), with a higher rate constant at weakly acidic to neutral pH than in a basic environment [30]. However, the final product distribution and yield are controlled by further reaction steps, and formation of the coupling product is favored as the pH of the reaction environment increases.
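When planning an aqueous carbodiimide coupling of the kind described above, reagent quantities are commonly expressed as molar equivalents relative to the carboxyl groups to be activated. The sketch below converts such equivalents into masses. The molar masses are standard values for EDC hydrochloride and N‐hydroxysuccinimide (NHS), but the default equivalents and the function name `coupling_reagent_masses_mg` are hypothetical choices for illustration only and are not taken from the chapter or from any cited protocol.

```python
EDC_HCL_MOLAR_MASS = 191.70  # g/mol, EDC hydrochloride
NHS_MOLAR_MASS = 115.09      # g/mol, N-hydroxysuccinimide


def coupling_reagent_masses_mg(carboxyl_umol, edc_equiv=2.0, nhs_equiv=2.0):
    """Masses (mg) of EDC.HCl and NHS for a given amount of carboxyl groups.

    carboxyl_umol  amount of carboxylic acid groups in micromoles
    edc_equiv      molar equivalents of EDC.HCl (hypothetical default)
    nhs_equiv      molar equivalents of NHS (hypothetical default)
    """
    if carboxyl_umol < 0 or edc_equiv < 0 or nhs_equiv < 0:
        raise ValueError("amounts and equivalents must be non-negative")
    # micromoles * g/mol = micrograms; divide by 1000 to obtain milligrams.
    edc_mg = carboxyl_umol * edc_equiv * EDC_HCL_MOLAR_MASS / 1000.0
    nhs_mg = carboxyl_umol * nhs_equiv * NHS_MOLAR_MASS / 1000.0
    return {"edc_hcl_mg": edc_mg, "nhs_mg": nhs_mg}


if __name__ == "__main__":
    # Example: 10 micromoles of surface carboxyl groups (illustrative figure).
    for reagent, mass in coupling_reagent_masses_mg(10.0).items():
        print(f"{reagent}: {mass:.3f} mg")
```

In practice the appropriate excess depends on the substrate, the pH, and the competing hydrolysis of the activated intermediate, so the defaults should be replaced by values from a validated procedure.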
The presence of catalytic additives such as N‐hydroxybenzotriazole (HOBt) or N‐hydroxysuccinimide (NHS) further drives the reaction toward the formation of the desired amide or ester linkage, which is especially beneficial in the case of ester formation and heterogeneous coupling reactions. Formation of thioester linkages for biofunctionalization of nanomaterials is also possible through the carbodiimide‐mediated reaction of carboxylic acids and thiols. As a recent example, J. Shi and colleagues anchored TAT peptide on the surface of multifunctionalized MSN through EDC chemistry for efficient application in photodynamic therapy (Figure 4.14) [31].

4.2.3.2  Carbamate, Urea, and Thiourea Linkage

Urea and carbamate linkages can also be effectively applied for attachment of biomolecules to nanomaterials [32]. This can be achieved by reaction of amine or alcohol functional groups of biomolecules with an isocyanate‐functionalized nanomaterial surface, or as a homogeneous biomolecule‐isocyanate reaction, followed by surface grafting, as discussed in the Introduction. Utilization of phosgene‐based coupling reagents is also possible to yield the desired carbamate or urea linkages [33, 34], and 1,1′‐carbonyldiimidazole is a particularly

Figure 4.13  Mechanism of formation of amide linkage through EDC activation of carboxyl group and catalytic effect of HOBt.

Figure 4.14  Functionalization of mesoporous silica nanoparticles (MSN) with TAT peptide through EDC chemistry.

suitable phosgene analogue as it can form the carbamate linkage in a stepwise manner in mild conditions, which is preferable in functionalization of nanomaterials with biomolecules [35, 36]. Carbonyl diimidazole can also be utilized as activating agent for carboxylic acids to form amide, ester, and thioester


Figure 4.15  Functionalization of MSN with ε‐poly‐l‐lysine through isocyanate chemistry.

linkages. However, a general drawback of isocyanate coupling is that the reaction cannot be performed in an aqueous environment because isocyanates react with water. Hence, proteins and nucleic acids may be denatured during the process of linking to the nanomaterial surface. A more stable reagent for linkage formation in an aqueous environment is isothiocyanate, which reacts with amine groups to form a thiourea linkage [37]. The thiourea linkage can, however, be susceptible to hydrolysis in an acidic environment, particularly if the linkage is made through α‐amines of amino acids [38]. As a recent example, R. Martinez‐Manez and colleagues functionalized MSN with ε‐poly‐l‐lysine for drug delivery application in cancer cells (Figure 4.15) [39].

4.2.3.3  Schiff Base Linkage

Reaction of amines with ketone or aldehyde moieties is also a viable route for linking the biomolecule to the surface of nanoparticles. The imine thus formed is easily cleaved in a weakly acidic environment, which is in fact beneficial if the delivery of biomolecules is intended under the weakly acidic conditions of cancer tissue or inside acidic endosomal compartments. To obtain a more stable linkage, the imine can be reduced, which is typically performed with borohydride reagents [40]. H. Zhu and coworkers used the imine procedure with glutaraldehyde to anchor lipase on the surface of superparamagnetic Fe3O4 nanoparticles (Figure 4.16). The nanoparticles displayed good reusability and applicability [41]. Hydrazone and oxime linkages are more stable toward hydrolysis in physiological conditions than their imine analogues due to electron delocalization, which makes the carbon atom from the Schiff base moiety less susceptible to

Figure 4.16  Functionalization of superparamagnetic nanoparticles with lipase through glutaraldehyde and imine chemistry.

Figure 4.17  Functionalization of nanoparticles through the hydrazone linkage (scheme conditions: CHO‐CO‐peptide, acetate buffer pH 5.5, 1 h, 37°C).

nucleophilic attack [42]. In addition, since the first step in the hydrolysis reaction would be protonation of the nitrogen from the Schiff base moiety, oxime linkages have been shown to be more stable than hydrazone linkages owing to the lower basicity of the nitrogen bonded to the oxygen atom. O. Melnyk and coworkers prepared semicarbazide‐functionalized nanoparticles for the anchoring of peptides through a semicarbazone linkage for biochip applications (Figure 4.17) [43].

4.2.3.4  Multicomponent Linkage Formation

In addition to the already discussed carbonyl diimidazole‐mediated multicomponent ligation, the Mannich reaction was recently demonstrated to be a fruitful route for the formation of stable covalent linkages between biomolecules and nanoparticles [44]. The authors used formaldehyde to form the linkage between an amine group‐containing cyclic arginine–glycine–aspartic acid (RGD) peptide and the aromatic‐functionalized surface (Figure 4.18). The constructed nanoparticles were further used for targeted imaging of cancer.

Figure 4.18  Linking the cyclic RGD peptide c(RGDyK) to a surface of iron oxide nanoparticles through Mannich reaction with formaldehyde [Source: Xie et al. [44]. Reproduced with permission of American Chemical Society].


4.2.3.5  Biofunctionalization through Alkylation

Thiol moieties from the biomolecules can be efficiently attached to the surface of nanoparticles through alkylation with maleimide derivatives in neutral aqueous solutions at room temperature [45, 46]. This strategy has been recently applied for MSN‐based in vivo targeting and simultaneous positron emission tomography (PET) imaging of cancer [47]. The thiol–maleimide linkage was applied for attaching a PEG spacer to the nanoparticles, as well as for the subsequent attachment of the cancer‐targeting monoclonal antibody to the surface‐attached PEG. Thiol moieties from the biomolecules can also be linked to the surface through disulfide formation, by the disulfide exchange reaction [48]. Reagents containing pyridyl disulfide moieties are especially useful because they allow simultaneous monitoring of the reaction kinetics through UV absorption of the pyridine‐2‐thione side product [49]. However, the disulfide bond is not particularly stable as it can be cleaved through exchange with other thiols or through reduction inside cells with glutathione. Nevertheless, this property of the disulfide linkage has been exploited for the delivery of therapeutic molecules to the cells [50]. Another alkylation strategy, which was recently employed for biofunctionalization of nanoparticles, is the reaction of amine‐functionalized nanoparticles with a squarate‐functionalized galactose (Figure 4.19) in an ethanol/water mixture, in the presence of triethylamine at room temperature [51]. The authors successfully used the synthesized material for imaging and treatment of cancer through photodynamic therapy and drug delivery.
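The UV monitoring of disulfide exchange mentioned above rests on the Beer–Lambert law, A = ε·c·l: each exchanged pyridyl disulfide group releases one molecule of pyridine‐2‐thione, so the absorbance of the supernatant gives the amount of thiol attached. The sketch below assumes the commonly quoted molar absorptivity of about 8080 M⁻¹ cm⁻¹ at 343 nm for pyridine‐2‐thione; this value and the function names are not taken from the chapter and should be checked against the conditions actually used.

```python
# Commonly quoted molar absorptivity of pyridine-2-thione at 343 nm
# (assumption for this sketch; verify for the buffer and instrument used).
PYRIDINE_2_THIONE_EPSILON = 8080.0  # M^-1 cm^-1


def released_thione_molar(absorbance, path_cm=1.0,
                          epsilon=PYRIDINE_2_THIONE_EPSILON):
    """Concentration (mol/L) of released pyridine-2-thione, c = A / (eps * l)."""
    if path_cm <= 0 or epsilon <= 0:
        raise ValueError("path length and epsilon must be positive")
    return absorbance / (epsilon * path_cm)


def fraction_exchanged(absorbance, initial_disulfide_molar, path_cm=1.0,
                       epsilon=PYRIDINE_2_THIONE_EPSILON):
    """Fraction of pyridyl disulfide groups that have undergone exchange.

    Assumes one pyridine-2-thione is released per exchanged disulfide and
    that the absorbance has already been corrected for the blank.
    """
    if initial_disulfide_molar <= 0:
        raise ValueError("initial disulfide concentration must be positive")
    released = released_thione_molar(absorbance, path_cm, epsilon)
    return released / initial_disulfide_molar


if __name__ == "__main__":
    a343 = 0.404  # illustrative blank-corrected absorbance
    print(f"released thione: {released_thione_molar(a343):.2e} M")
    print(f"fraction exchanged: {fraction_exchanged(a343, 1.0e-4):.2f}")
```

Recording the absorbance at several time points and converting each reading in this way gives the kinetic trace of the exchange reaction.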

Figure 4.19  Attachment of a carbohydrate molecule to the surface of nanoparticles through alkylation of the amine‐functionalized surface with a squarate moiety [Source: Gary‐Bobo et al. [51]. Reproduced with permission of Elsevier].


4.2.3.6  Bioorthogonal Linkage Formation

Bioorthogonal chemistry involves reactions between functional groups that are not typically present in biological systems and that can therefore be performed in complex biological environments. The click reaction is a prime example of the applicability of bioorthogonal chemistry to the biofunctionalization of nanomaterials [52]. It involves the reaction between azide and terminal alkyne moieties. Although this reaction is possible at elevated temperatures even without a catalyst, a copper(I) catalyst allows it to be performed under milder conditions (room temperature, aqueous solvent) and to yield a single regioisomer. The copper‐catalyzed reaction gives the 1,4‐disubstituted regioisomer specifically and is typically performed in an aqueous environment in the presence of up to 2 mol % of copper sulfate and up to 10 mol % of sodium ascorbate for the in situ production of catalytic Cu(I) ions [53]. For example, the surface of gold nanoparticles can be effectively functionalized with lipase through click chemistry (Figure 4.20) [54]. Efficient copper‐free click reactions have also been developed, which involve the reaction between a strained cyclooctyne moiety and an azide. Since this reaction does not require the presence of cytotoxic copper ions, it has already been demonstrated to be applicable for linkage formation in in vitro and in vivo settings [55]. The Staudinger ligation is another example of bioorthogonal chemistry; it involves the reaction between an azide moiety and a phosphine reagent that usually bears a methyl ester group. Hermann Staudinger discovered the underlying reaction, the mild reduction of azide moieties to amines in the presence of phosphines, and this reaction was later developed into a bioorthogonal ligation by Saxon and Bertozzi [56]. A recent review article summarizes the application of this strategy to bioconjugation [57].

Figure 4.20  Conjugation of lipase to gold nanoparticles through click chemistry [Source: Brennan et al. [54]. Reproduced with permission of American Chemical Society].

4.2.3.7  Conjugation through Host–Guest Interactions

Functionalization of nanomaterial surfaces with certain biomolecules is possible through specific host–guest interactions between natural ligands and receptors. For example, the small biomolecule biotin (vitamin B7) is known to bind strongly to the large glycoprotein avidin (66–69 kDa) and to streptavidin (52.8 kDa). Avidin was thus used as a cap for entrapping molecules inside the mesopores of biotin‐functionalized MSN. Exposure of the nanoparticles to a proteolytic enzyme led to disintegration of the protein cap, which showcases opportunities for using intracellular triggers in drug delivery applications [58]. A plethora of biofunctionalization strategies can also be devised based on host–guest interactions with cyclodextrin moieties. For example, a recent report on lanthanide nanoparticles for targeted imaging of cancer combined several functionalization and conjugation strategies. The higher affinity of azide‐containing phosphonate ligands was first used for ligand exchange on oleic acid‐capped nanoparticle surfaces; the particles were subsequently conjugated with β‐cyclodextrin by click chemistry; and ultimately the strong host–guest interaction between β‐cyclodextrin and adamantane allowed functionalization of the nanoparticles with an adamantane‐functionalized RGD peptide (Figure 4.21) [59].

4.2.3.8  Linkage through Metal Coordination

Since biomolecules typically contain moieties with suitable donor atoms for coordination to transition metal ions, for example, amine, thiol, and histidine groups, this linkage chemistry offers plenty of opportunities for the biofunctionalization of nanoparticles [60]. As an example, C.‐Y. Mou and coworkers anchored TAT‐superoxide dismutase on MSN through a nickel complex and the histidine‐tag technique (Figure 4.22) [61]. Polyhistidine‐appended proteins were also successfully attached to different quantum dots by direct coordination to surface metal ions [62]. Similarly, thiol‐appended aptamers were effectively coordinated to gold nanoparticles [63].

Figure 4.21  Surface functionalization of lanthanide nanoparticles through click chemistry, followed by formation of highly stable host–guest interactions between the adamantane moiety and β‐cyclodextrin [Source: Ma et al. [59]. Reproduced with permission of American Chemical Society].

Figure 4.22  Functionalized MSN through coordination and His‐tag technique.

4.2.3.9  Ligation through Complementary Base Pairing

Because of the strength of multiple hydrogen bonds between complementary base pairs, nucleic acids can be used for the functionalization of nanomaterials and for the construction of complex nanoassemblies [64, 65]. For example, base pairing of complementary DNA strands was applied to entrap molecules inside MSN, and the material was shown to be applicable for adenosine‐5′‐triphosphate (ATP)‐responsive drug delivery [65]. When iron oxide nanoparticles served as the pore blockers, application of an alternating magnetic field to a suspension of the nanomaterial was shown to induce local heating, which was sufficient to cause decoupling of the DNA strands and release of the pore‐entrapped molecules [66]. This principle may be applicable to magnetic field‐targeted drug delivery.

4.2.3.10  Electrostatic Interactions

Electrostatic interactions may seem labile and inapplicable in biological settings because the various ions present in biological fluids may induce metathesis and therefore lead to detachment of nanoparticle‐coupled molecules. However, many examples from the literature have shown that the cooperative action of multiple charge interactions can in fact lead to efficient biofunctionalization of nanoparticles [67, 68]. An early report showed that positively charged dendrimers on the surface of nanoparticles successfully complexed negatively charged plasmid DNA, which was sufficient to achieve its transfection into various cell lines [69]. Optimization of this charge interaction principle has led to the development of various potent nanocarriers for gene delivery [70–72].

4.2.4 Conclusions

A plethora of chemical modification strategies can be used for the functionalization of nanomaterials with biomolecules. The capability of nanoparticle surfaces to bind biomolecules can typically be increased by pre‐functionalization of the surface with appropriate linkers. Another strategy is the modification of the biomolecules with moieties that exhibit high affinity for the intended nanoparticle surface. For instance, alkoxysilane moieties have high affinity for hydroxylated surfaces (silica, iron oxide, and titanium oxide), and terminal alkenes and alkynes are reactive toward hydride surfaces, while different mono‐ or polydentate ligands can be introduced as linkers to various metal‐containing surfaces in quantum dots and up‐converting nanoparticles. The coupling of these linkers to the biomolecules can be achieved through different reaction pathways, including carbodiimide and click chemistry, other covalent linkages (carbamate, urea, Schiff base, and metal coordination), and noncovalent coupling interactions (host–guest, electrostatic). The diversity of chemical coupling strategies, which continues to grow as new methodologies emerge, enables the design and building of increasingly complex biomolecule–nanoparticle composites and their application in different technological fields.

Acknowledgments

N. Z. Knezevic acknowledges the financial support of the Ministry of Education, Science and Technological Development of the Republic of Serbia (grant number III45019) and Chaire Total Chimie Balard.

References

1 Lu, J.; Liong, M.; Li, Z.; Zink, J. I.; Tamanoi, F. Small 2010, 6, 1794–1805.
2 Hung, C.‐W.; Holoman, T. R. P.; Kofinas, P.; Bentley, W. E. Biochem. Eng. J. 2008, 38, 164–170.

3 Knezevic, N. Z. RSC Adv. 2013, 3, 19388–19392.
4 Knežević, N. Ž.; Slowing, I. I.; Lin, V. S. Y. ChemPlusChem 2012, 77, 48–55.
5 Zhao, J.; Milanova, M.; Warmoeskerken, M. M. C. G.; Dutschk, V. Colloids Surf. A Physicochem. Eng. Asp. 2012, 413, 273–279.
6 Stein, A.; Melde, B. J.; Schroden, R. C. Adv. Mater. 2000, 12, 1403–1419.
7 Park, J.‐H.; Gu, L.; von Maltzahn, G.; Ruoslahti, E.; Bhatia, S. N.; Sailor, M. J. Nat. Mater. 2009, 8, 331–336.
8 Secret, E.; Maynadier, M.; e, A.; Chaix, A.; Bouffard, E.; Gary‐Bobo, M.; Marcotte, N.; Mongin, O.; El Cheikh, K.; Hugues, V.; Auffan, M.; Frochot, C.; Morère, A.; Maillard, P.; Blanchard‐Desce, M.; Sailor, M. J.; Garcia, M.; Durand, J.‐O.; Cunin, F. Adv. Mater. 2014, 26, 7643–7648.
9 Knezevic, N. Z.; Stojanovic, V.; Chaix, A.; Bouffard, E.; Cheikh, K. E.; Morere, A.; Maynadier, M.; Lemercier, G.; Garcia, M.; Gary‐Bobo, M.; Durand, J.‐O.; Cunin, F. J. Mater. Chem. B 2016, 4, 1337–1342.
10 Love, J. C.; Estroff, L. A.; Kriebel, J. K.; Nuzzo, R. G.; Whitesides, G. M. Chem. Rev. 2005, 105, 1103–1169.
11 Woehrle, G. H.; Brown, L. O.; Hutchison, J. E. J. Am. Chem. Soc. 2005, 127, 2172–2183.
12 Thomas, L. A.; Dekker, L.; Kallumadil, M.; Southern, P.; Wilson, M.; Nair, S. P.; Pankhurst, Q. A.; Parkin, I. P. J. Mater. Chem. 2009, 19, 6529–6535.
13 Han, G.‐M.; Li, H.; Huang, X.‐X.; Kong, D.‐M. Talanta 2016, 147, 207–212.
14 Mei, B. C.; Susumu, K.; Medintz, I. L.; Mattoussi, H. Nat. Protoc. 2009, 4, 412–423.
15 Knezevic, N. Z.; Lin, V. S. Y. Nanoscale 2013, 5, 1544–1551.

16 Boyer, J.‐C.; Carling, C.‐J.; Chua, S. Y.; Wilson, D.; Johnsen, B.; Baillie, D.; Branda, N. R. Chemistry 2012, 18, 3122–3126.
17 Wang, F.; Liu, B.; Huang, P.‐J. J.; Liu, J. Anal. Chem. 2013, 85, 12144–12151.
18 Huang, L.‐J.; Yu, e; Chu, X. Analyst 2015, 140, 4987–4990.
19 Kim, B.; Sigmund, W. M. Langmuir 2004, 20, 8239–8242.
20 Balasubramanian, K.; Burghard, M. Small 2005, 1, 180–192.
21 Wang, G.‐W.; Lu, Y.‐M.; Chen, Z.‐X. Org. Lett. 2009, 11, 1507–1510.
22 Chang, I. P.; Hwang, K. C.; Ho, J.‐a. A.; Lin, C.‐C.; Hwu, e; Horng, J.‐C. Langmuir 2010, 26, 3685–3689.
23 Hu, S.; Ding, Y.; Chang, Q.; Yang, J.; Lin, K. Appl. Surf. Sci. 2015, 355, 774–777.
24 Zhao, X.; Liu, L.; Li, X.; Zeng, J.; Jia, X.; Liu, P. Langmuir 2014, 30, 10419–10429.
25 Lale, S. V.; Aswathy, R. G.; Aravind, A.; Kumar, D. S.; Koul, V. Biomacromolecules 2014, 15, 1737–1752.
26 Fang, C.; Bhattarai, N.; Sun, C.; Zhang, M. Small 2009, 5, 1637–1641.
27 Milgroom, A.; Intrator, M.; Madhavan, K.; Mazzaro, L.; Shandas, R.; Liu, B.; Park, D. Colloids Surf. B Biointerfaces 2014, 116, 652–657.
28 Khorana, H. G. Chem. Rev. 1953, 53, 145–166.
29 Knezevic, N. Z.; Mrdanovic, J.; Borisev, I.; Milenkovic, S.; Janackovic, D.; Cunin, F.; Djordjevic, A. RSC Adv. 2016, 6, 7061–7065.
30 Chan, L. C.; Cox, B. G. J. Org. Chem. 2007, 72, 8863–8869.
31 Pan, L.; Liu, J.; Shi, J. Adv. Funct. Mater. 2014, 24, 7318–7327.
32 Kang, H.‐J.; e, E. J.; Park, H.‐D. Appl. Surf. Sci. 2015, 324, 198–204.
33 Devdutt, C. Curr. Org. Chem. 2011, 15, 1593–1624.
34 Babad, H.; Zeiler, A. G. Chem. Rev. 1973, 73, 75–91.
35 D’Addona, D.; Bochet, C. G. Tetrahedron Lett. 2001, 42, 5227–5229.
36 Knezevic, N. Z.; Trewyn, B. G.; Lin, V. S. Y. Chemistry 2011, 17, 3338–3342.
37 Dallagnol, J. C. C.; Ducatti, D. R. B.; Barreira, S. M. W.; Noseda, M. D.; Duarte, M. E. R.; Gonçalves, A. G. Dyes Pigment 2014, 107, 69–80.
38 Lang, L.; Ma, Y.; Kiesewetter, D. O.; Chen, X. Mol. Pharm. 2014, 11, 3867–3874.
39 Mondragon, L.; Mas, N.; Ferragud, V.; de la Torre, C.; Agostini, A.; Martinez‐Manez, R.; Sancenon, F.; Amoros, P.; Perez‐Paya, E.; e, M. Chemistry 2014, 20, 5271–5281.
40 Abdel‐Magid, A. F.; Carson, K. G.; Harris, B. D.; Maryanoff, C. A.; Shah, R. D. J. Org. Chem. 1996, 61, 3849–3862.
41 Zhu, W.; Li, Y.; Zeng, F.; Yin, H.; Wang, L.; Zhu, H. RSC Adv. 2015, 5, 23039–23045.
42 Kalia, J.; Raines, R. T. Angew. Chem. Int. Ed. 2008, 47, 7523–7526.
43 Carion, O.; Souplet, V.; Olivier, C.; Maillet, C.; Meclard, N.; El‐Mahdi, O.; Durand, J.‐O.; Melnyk, O. ChemBioChem 2007, 8, 315–322.
44 Xie, J.; Chen, K.; Lee, H.‐Y.; Xu, C.; Hsu, A. R.; Peng, S.; Chen, X.; Sun, S. J. Am. Chem. Soc. 2008, 130, 7542–7543.

45 Tsai, C.‐P.; Chen, C.‐Y.; Hung, Y.; Chang, F.‐H.; Mou, C.‐Y. J. Mater. Chem. 2009, 19, 5737–5743.
46 Hu, C.‐M. J.; Kaushal, S.; Cao, H. S. T.; Aryal, S.; Sartor, M.; Esener, S.; Bouvet, M.; Zhang, L. Mol. Pharm. 2010, 7, 914–920.
47 Chen, F.; Hong, H.; Zhang, Y.; Valdovinos, H. F.; Shi, S.; Kwon, G. S.; Theuer, C. P.; Barnhart, T. E.; Cai, W. ACS Nano 2013, 7, 9027–9039.
48 King, T. P.; Li, Y.; Kochoumian, L. Biochemistry 1978, 17, 1499–1506.
49 Lai, C.‐Y.; Trewyn, B. G.; Jeftinija, D. M.; Jeftinija, K.; Xu, S.; Jeftinija, S.; Lin, V. S. Y. J. Am. Chem. Soc. 2003, 125, 4451–4459.
50 Yang, D.; Chen, W.; Hu, J. J. Phys. Chem. B 2014, 118, 12311–12317.
51 Gary‐Bobo, M.; Hocine, O.; Brevet, D.; Maynadier, M.; Raehm, L.; Richeter, S.; Charasson, V.; Loock, B.; Morere, A.; Maillard, P.; Garcia, M.; Durand, J.‐O. Int. J. Pharm. 2012, 423, 509–515.
52 McKay, C. S.; Finn, M. G. Chem. Biol. 2014, 21, 1075–1101.
53 Rostovtsev, V. V.; Green, L. G.; Fokin, V. V.; Sharpless, K. B. Angew. Chem. Int. Ed. 2002, 41, 2596–2599.
54 Brennan, J. L.; Hatzakis, N. S.; Tshikhudo, T. R.; Razumas, V.; Patkar, S.; Vind, J.; Svendsen, A.; Nolte, R. J. M.; Rowan, A. E.; Brust, M. Bioconjug. Chem. 2006, 17, 1373–1375.
55 Baskin, J. M.; Prescher, J. A.; Laughlin, S. T.; Agard, N. J.; Chang, P. V.; Miller, I. A.; Lo, A.; Codelli, J. A.; Bertozzi, C. R. Proc. Natl. Acad. Sci. U.S.A. 2007, 104, 16793–16797.
56 Saxon, E.; Bertozzi, C. R. Science 2000, 287, 2007–2010.
57 van Berkel, S. S.; van Eldijk, M. B.; van Hest, J. C. Angew. Chem. Int. Ed. 2011, 50, 8806–8827.
58 Schlossbauer, A.; Kecht, J.; Bein, T. Angew. Chem. Int. Ed. 2009, 48, 3092–3095.
59 Ma, C.; Bian, T.; Yang, S.; Liu, C.; Zhang, T.; Yang, J.; Li, Y.; Li, J.; Yang, R.; Tan, W. Anal. Chem. 2014, 86, 6508–6515.
60 You, C.; Piehler, J. Anal. Bioanal. Chem. 2014, 406, 3345–3357.
61 Chen, Y.‐P.; Chen, C.‐T.; Hung, Y.; Chou, C.‐M.; Liu, T.‐P.; Liang, M.‐R.; Chen, C.‐T.; Mou, C.‐Y. J. Am. Chem. Soc. 2013, 135, 1516–1523.
62 Aldeek, F.; Safi, M.; Zhan, N.; Palui, G.; Mattoussi, H. ACS Nano 2013, 7, 10197–10210.
63 Huang, C.‐C.; Huang, Y.‐F.; Cao, Z.; Tan, W.; Chang, H.‐T. Anal. Chem. 2005, 77, 5735–5741.
64 Prabhu, V. M.; Hudson, S. D. Nat. Mater. 2009, 8, 365–366.
65 He, X.; Zhao, Y.; He, D.; Wang, K.; Xu, F.; Tang, J. Langmuir 2012, 28, 12909–12915.
66 Ruiz‐Hernandez, E.; Baeza, A.; Vallet‐Regi, M. ACS Nano 2011, 5, 1259–1266.
67 Calatayud, M. P.; Sanz, B.; Raffa, V.; Riggio, C.; Ibarra, M. R.; Goya, G. F. Biomaterials 2014, 35, 6389–6399.
68 Ho, K.‐C.; Tsai, P.‐J.; Lin, Y.‐S.; Chen, Y.‐C. Anal. Chem. 2004, 76, 7162–7168.

69 Radu, D. R.; Lai, C.‐Y.; Jeftinija, K.; Rowe, E. W.; Jeftinija, S.; Lin, V. S. Y. J. Am. Chem. Soc. 2004, 126, 13216–13217.
70 Brevet, D.; Hocine, O.; Delalande, A.; Raehm, L.; Charnay, C.; Midoux, P.; Durand, J.‐O.; Pichon, C. Int. J. Pharm. 2014, 471, 197–205.
71 Meng, H.; Mai, e; Zhang, H.; Xue, M.; Xia, T.; Lin, S.; Wang, X.; Zhao, Y.; Ji, Z.; Zink, J. I.; Nel, A. E. ACS Nano 2013, 7, 994–1005.
72 Chen, A. M.; Zhang, M.; Wei, D.; Stueber, D.; Taratula, O.; Minko, T.; He, H. Small 2009, 5, 2673–2677.

4.3 Biocomposites from Nanoparticles: From 1D to 3D Assemblies

Carole Aimé and Thibaud Coradin

Sorbonne Universités, UPMC Univ Paris 06, Collège de France, UMR CNRS 7574, Laboratoire de Chimie de la Matière Condensée de Paris, Paris, France

4.3.1 General Considerations

Biomolecules themselves represent nanoscale materials with encoded structural and functional information [1]. Their precise coupling with inorganic particles provides new materials that combine the unique properties of the particles with the recognition abilities and functions of biomolecules. These bionanocomposites are highly tunable hybrid objects with a broad range of applications. In particular, applications in biological and biomedical fields are targeted because of the biological activity and intrinsic biocompatibility of the biological component, together with its molecular recognition abilities, which can be harnessed for the specific targeting of a biological entity such as an enzyme or a cell. In terms of biomolecules, proteins have been widely investigated for the engineering of materials, owing to the diversity of their structures and functions and to the diversity and modularity of the molecular recognition properties that govern their interactions with a biological environment. Nucleic acids have also been of utmost interest because of their specific recognition properties. Far fewer examples can be found with carbohydrates, as seen in the following text. Finally, lipids have been investigated mostly from the point of view of solid‐supported lipid bilayers, which are model systems combining mechanical and chemical stability with physiological relevance [2, 3].

Biofunctionalized nanoparticles can be prepared with two distinct purposes. First, they can be designed for use as individual one‐dimensional (1D) nano‐objects. In this situation, the biomolecules are selected so as to interact specifically with an external component. In a second approach, the bioconjugated particles act as building blocks for the bottom‐up construction of two‐ and three‐dimensional (2D and 3D) nanostructured systems. This requires that the biological moieties possess self‐assembling properties that are responsible for well‐defined interparticle interactions and drive the formation of composite networks. The following sections describe the main principles driving the elaboration of such bionanocomposites and provide key examples of their properties and applications.

Bionanocomposites: Integrating Biological Processes for Bioinspired Nanotechnologies, First Edition. Edited by Carole Aimé and Thibaud Coradin. © 2017 John Wiley & Sons, Inc. Published 2017 by John Wiley & Sons, Inc.

4.3.2 One‐Dimensional Bionanocomposites

Nanoparticles are extensively used in biomedical applications because of their electronic, optical, and magnetic properties, which result from their nanometer size and their chemical composition. The variety of core materials available (e.g., metal, metal oxide, and semiconductor), combined with the diversity of biomolecules in terms of structure and function, makes 1D bionanocomposites excellent platforms for the development of biological and biomedical tools. Recent applications of biomolecule–nanoparticle nanocomposites can be found in drug and gene delivery, biological sensing, and imaging of live cells and tissues [4].

The first, conceptually simple approach takes advantage of the ability of many biomolecules to recognize and bind other biological components in an efficient and specific manner to achieve nanoparticle targeting of cells or tissues. This strategy is useful not only for therapeutic purposes (drug delivery, hyperthermia) but also for imaging of in vitro and in vivo biological specimens using techniques such as optical imaging and magnetic resonance imaging (MRI) [5–9]. A key step in this approach is to define the best‐suited targeting moiety and to ensure that its association with the nanoparticles is not detrimental to its recognition capability. For instance, the RGD peptide is well known to have a high affinity for the adhesion molecule integrin αvβ3. However, it was found that, once deposited on the surface of iron oxide nanoparticles, its cyclic form c(RGDyK) was the most efficient at promoting cellular uptake and specifically targeted activated endothelial cells and various tumor cells expressing integrin αvβ3. These c(RGDyK)‐coated particles were found to be stable for months in aqueous medium and preserved their ability to accumulate preferentially in tumor cells in vivo when administered intravenously, allowing for their tracking by MRI [10].
Combining targeting, imaging, and therapeutic properties in a single nano‐object requires assembling the different functional elements on the particle surface in a very careful manner, that is, in a well‐defined sequence of steps dictated by the chemical constraints imposed by each molecule. This allows for the design of highly complex multifunctional nanoparticles, such as iron oxide nanoparticles combining trimodality imaging and therapeutic abilities (Figure 4.23) [11]. In this example, dopamine is first used to modify the surface of the iron oxide particles, allowing their dispersion in polar solvents. Next, the nanoconjugates are coated with human serum albumin (HSA), which not only acts as a drug reservoir but also exhibits some cell‐targeting properties. After labeling with 64Cu‐DOTA and Cy5.5, these particles showed multiple imaging abilities, combining positron emission tomography (PET), near‐infrared fluorescence (NIRF), and MRI.

Figure 4.23  (a) Schematic illustration of the multifunctional HSA‐iron oxide nanoparticles. The particles were incubated with dopamine, after which they became moderately hydrophilic and could be doped into HSA matrices before drug loading; (b) TEM of the HSA‐functionalized iron oxide nanoparticles in water; (c) representative images of a mouse injected with HSA‐functionalized iron oxide nanoparticles, 18 h post injection: (i) in vivo NIRF, (ii) in vivo PET, and (iii) MRI images [Source: Xie et al. [11]. Reproduced with permission of Elsevier].

One interesting consequence of intermolecular recognition and binding events in biomolecular systems is that they can be accompanied by a large change in conformation or even lead to specific cleavage. This property can be used to design capped porous particles exhibiting on‐demand drug delivery ability. The concept has been applied in particular to mesoporous silica nanoparticles (MSNs) [12]. Once the pores are loaded with the drug, they are capped (i.e., closed) with a bioresponsive moiety. In the presence of the stimulating biomolecule, the cap undergoes a conformational switch or is detached, leaving the pores open and allowing drug release [13]. As a proof of concept, proteins were grafted onto MSNs to act as capping agents: in the presence of a protease, enzymatic cleavage of the protein cap opens the gate, delivering the entrapped drug [14]. More specific approaches have been developed based on the ability of an enzyme to recognize a given sequence, which typically bears a cleavage site. In other words, peptides bearing a recognition sequence to be cleaved by a specific enzyme are grafted onto the surface of MSNs. A recent example describes MSNs specifically designed to be selectively opened in the presence of caspase 3 (C3) [15]. The C3 cleavage site on proteins is highly conserved and selective and consists of four amino acids. This peptide sequence was grafted onto MSNs loaded with a drug to be delivered specifically in apoptotic cells upon cleavage of the peptide and opening of the pores (Figure 4.24a).

Figure 4.24  (a1) Schematic representation of the gated MSNs capped with the peptide containing the C3 cleavage site; (a2) TEM image of peptide‐functionalized MSNs showing the typical porosity associated with the mesoporous matrix [Source: de la Torre et al. [15]. Reproduced with permission of WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim]. (b1) Representation of the positively charged amino‐modified MSNs capped with a single‐stranded oligonucleotide. The delivery of the entrapped guest is selectively accomplished in the presence of the complementary oligonucleotide; (b2) TEM image of the loaded MSNs [Source: Climent et al. [16]. Reproduced with permission of WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim].

On the other hand, proteins can be used to address a cargo in a highly specific fashion. Typically, antibodies have been conjugated to particles (MSNs) to target a given type of cell in vivo. Alternatively, DNA was used to cap drug‐loaded MSNs. Removal of the DNA, leading to opening of the pores and subsequent release of the drug, could be achieved with a DNase enzyme, which hydrolyzes DNA into small fragments [17], or with a complementary DNA strand for competitive binding (Figure 4.24b). In the latter case, the complementary strand could be free in solution [16] or carried by a particle bearing complementary strands of nucleic acids [18]. Another strategy consists in capping the MSN pores with DNA aptamers. Aptamers are specific nucleic acids that possess high affinity and selectivity toward a given target [19, 20]. Upon binding of their target, most aptamers undergo a conformational switch that results in the opening of the MSN pores [21].

A last possibility is to use the properties of inorganic nanoparticles to detect recognition events between biomolecular systems, allowing for the design of biosensors. Many nanoparticle‐based biosensors incorporate nucleic acids because of the strict complementarity rules driving their interactions. As inorganic counterparts, semiconductor nanocrystals and gold nanoparticles have been widely used as reporters of the recognition event because of their unique optical properties. A simple approach relies on the bioconjugation of antibodies to gold nanoparticles together with Raman‐active dyes. Upon contact with arrays of proteins, the binding of these nanoprobes can be detected using surface‐enhanced Raman scattering (SERS) [22], allowing for fast optical screening of protein–protein interactions [23]. Another method, fluorescence resonance energy transfer (FRET), is particularly well suited to the development of biosensors. FRET is a non‐radiative process whereby an excited‐state donor D (usually a fluorophore) transfers energy to a proximal ground‐state acceptor A through long‐range dipole–dipole interactions [24]. It is very appealing for bioanalysis because of its intrinsic sensitivity to nanoscale changes in the D/A separation distance: energy transfer usually occurs over distances comparable with the dimensions of most biological macromolecules (ca. 10–100 Å). Gold nanoparticles have been used successfully in FRET applications for the engineering of DNA‐based molecular beacons. In practice, these setups are made up of nanoparticles bearing a sensing DNA–fluorophore conjugate. The binding of the target molecule (typically the DNA sequence of interest, complementary to the sensing DNA) results in a conformational change, which restores or quenches the fluorescence of the reporter attached to the sensing DNA. The devices engineered in this way are able to detect specific DNA sequences [25] and single mismatches [26] (Figure 4.25a). Alternatively, DNA cleavage processes can be monitored through the fluorescence changes associated with the cleavage of the DNA–fluorophore sequence by nuclease enzymes [28].
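The distance sensitivity of FRET is often summarized by the Förster expression for the transfer efficiency, E = 1/(1 + (r/R0)^6), where r is the donor–acceptor separation and R0 is the Förster distance at which E is 50%. The sketch below is only illustrative: the value R0 = 5 nm and the function name are assumptions rather than data from the cited works, and quenching by gold nanoparticles may follow a different distance dependence than this point‐dipole model.

```python
def fret_efficiency(r_nm, r0_nm=5.0):
    """Forster transfer efficiency E = 1 / (1 + (r / R0)**6).

    r_nm  : donor-acceptor separation in nm
    r0_nm : Forster distance in nm (assumed value, pair dependent)
    """
    if r_nm < 0 or r0_nm <= 0:
        raise ValueError("r must be non-negative and R0 positive")
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)


# Efficiency drops steeply once the separation exceeds R0.
for r in (2.5, 5.0, 10.0):
    print(f"r = {r:4.1f} nm  ->  E = {fret_efficiency(r):.3f}")
```

With these assumed numbers, halving the separation from R0 raises the efficiency to 64/65, while doubling it lowers the efficiency to 1/65, which is why a conformational change of a few nanometers is enough to switch the fluorescence of the reporter.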
Enzymes constitute another family of biomolecules widely used in biosensor design, mainly based on the detection of the products of their biocatalytic activity. However, the conjugation of enzymes to nanoparticles is a major challenge and often leads to their inactivation because of poor control over the protein conformation upon adsorption on the particle surface. This drawback can be turned into an advantage, as it potentially provides the enzyme in two different states: active free enzyme and inactive enzyme–nanoparticle complexes. This was nicely shown using gold nanoparticles electrostatically bound to the enzyme β‐galactosidase (β‐gal). The adsorption of the enzyme was found to inhibit its activity. However, in the presence of proteins [27] or bacteria [29] that have high affinity for the gold nanoparticles, β‐gal is released and its activity is restored. Therefore, optical monitoring of the enzymatic reaction using a fluorogenic substrate provides an optical readout of the gold–analyte binding event (Figure 4.25b). Yet it is worth noting that this approach has poor specificity in terms of detected analytes, since it relies on simple electrostatic interactions with positively charged gold nanoparticles.

Figure 4.25  Operating principles of biomolecule–nanoparticle hybrid probes. (a) DNA‐based sensor: in the closed state, single‐stranded DNAs adopt a constrained conformation on the gold particle, resulting in fluorophore quenching by the nanoparticle. Upon target binding, the DNA conformation opens, the fluorophore is separated from the particle surface by about 10 nm because of the structural rigidity of the hybridized double‐stranded DNA, and fluorescence is restored [Source: Maxwell et al. [25]. Reproduced with permission of American Chemical Society].

4.3.3 Two‐Dimensional Organization of Nanoparticles

The organization of semiconductor or metal nanoparticles into 2D superlattices can provide unique physical and optical properties that differ from those of isolated nanoparticles or bulk materials and can find applications in the manufacture of nanoscale integrated circuits. However, it has remained a significant challenge to generate patterned arrays with long‐range positional order. To this end, biomolecular recognition can be used to accurately position bioconjugated nanoscale objects. Indeed, the ability of biomolecules to bind targets in a tunable, multivalent, and reversible manner through highly specific noncovalent interactions provides the guiding principles for the programmed assembly of nanometer‐scale biocomposites [30]. Since the discovery and implementation of finite DNA structures, known as DNA origami [31], DNA‐based self‐assembled arrays have been used to program the assembly of gold and semiconductor nanocrystals into 2D arrays [32, 33]. This first requires the assembly of the DNA scaffold on substrates patterned by microcontact printing or lithographic processes. Then nanoparticles functionalized with complementary DNA sequences are added, and their in situ hybridization directs the assembly of highly ordered 2D nanoparticle arrays that reproduce the features of the predefined nanopatterns. Since these pioneering works, control over the dimensions and geometry of the templating nanostructure has led to assemblies of nanomaterials into a variety of superstructures, including highly parallel arrays of nanoscale materials over very large areas (Figure 4.26a and b) [34, 35, 37–39]. Finally, this strategy has proven to be highly modular, allowing different types of nanoparticles to be combined in a controlled manner by conjugating different and specific nucleic acids to each type of particle (Figure 4.26c) [36].
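The hybridization step that directs such assemblies rests on Watson–Crick complementarity between the strand on the scaffold and the strand on the particle. As a minimal sketch, assuming short hypothetical sequences and ignoring thermodynamics, partial mismatches, and secondary structure, the following Python snippet checks whether two strands (both written 5′→3′) can pair along their full length. The function names and sequences are illustrative and are not taken from the cited works.

```python
# Watson-Crick pairing partners for the four DNA bases.
_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    try:
        return "".join(_PAIR[base] for base in reversed(seq.upper()))
    except KeyError as err:
        raise ValueError(f"not a DNA base: {err.args[0]!r}") from None


def fully_complementary(strand_a, strand_b):
    """True if two 5'->3' strands can hybridize along their full length."""
    return reverse_complement(strand_a) == strand_b.upper()


# Hypothetical strands: one on the patterned linker, one on the nanoparticle.
scaffold_strand = "ATGCGTAC"
particle_strand = "GTACGCAT"
print(fully_complementary(scaffold_strand, particle_strand))

# A homopolymer pair of the kind shown in Figure 4.26a (polyadenine/thymine).
print(fully_complementary("AAAAAA", "TTTTTT"))
```

Assigning a distinct, non‐cross‐reacting sequence pair to each particle type is what makes the modular combination of several kinds of nanoparticles possible in this simplified picture.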

Figure 4.25 (Continued) (b) β‐Galactosidase‐based sensor: (b1) electrostatic interactions with the particle inhibit the enzymatic activity; (b2) β‐galactosidase is displaced from the particle upon binding of the analytes, restoring its catalytic activity and resulting in an amplified signal for spectroscopic detection [Source: Miranda et al. [27]. Reproduced with permission of American Chemical Society].

Figure 4.26 (a) Thymine‐conjugated 10 nm gold nanocrystals annealed to approximately 50 nm polyadenine patterned lines on silicon [Source: Reprinted from Noh et al. [34] with permission © 2009, American Chemical Society]. (b) SEM photo of the self‐assembly of gold nanoparticles onto six‐dot line nanopatterns [Source: Lalander et al. [35]. Reproduced with permission of American Chemical Society]. (c1) Schematic representation of DNA linkers on substrate, hybridizing with DNA‐modified gold nanocrystals, and (c2) SEM image of the resulting nanocrystal films [Source: Noh et al. [36]. Reproduced with permission of WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim].

4.3.4 Three‐Dimensional Organization of Particles

To achieve 3D biocomposite nanostructures, it should in principle be possible to follow the same strategy as described earlier for 2D systems, that is, preparation of a DNA scaffold that can be used as a template for the assembly of nanoparticles through hybridization. However, this approach faces two important challenges. First, DNA‐based 3D scaffolds are difficult to obtain above the micron scale. Second, nanoparticles bearing the complementary DNA sequence tend to accumulate on the scaffold surface rather than penetrate into the interior of the template.

A better option is to use the biofunctionalized nanoparticles themselves as building blocks for the formation of a 3D network. Pioneering works by Alivisatos [40] and Mirkin [41] focused on programmed assembly in solution. They prepared two populations of gold nanoparticles bearing complementary single‐stranded DNA sequences. Upon mixing, hybridization leads to particle aggregation. Since then, DNA and DNA origami scaffolds [31] have been widely used to tune nanoparticle assemblies in terms of reversibility, interparticle spacing, and periodicity [30], for a wide variety of particle shapes [42, 43] and chemistries (gold [44–48], silver [49], single‐wall carbon nanotubes [50], and silica [51, 52]) (Figure 4.27). One step further, gold nanocrystal solids with a given orientation could be obtained [53–55]. Importantly, this strategy allows the independent adjustment of each of the relevant crystallographic parameters in solution, including particle size, periodicity, and interparticle distance.

Figure 4.27 (a) Schematic representation and TEM images of the molecular geometries obtained upon hybridization of DNA‐linked colloidal gold nanoparticles with a single complementary DNA oligonucleotide. In the schematic depictions, the solid lines represent double‐stranded DNA and the gray dotted circles the 2D plane. Scale bars (10 nm) are common for all images [Source: Kim et al. [44]. Reproduced with permission of WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim]. (b) Schematic representation of DNA‐hybridization‐based assembly of single‐wall carbon nanotubes (SWNT) (1) with each other and (2) with gold nanoparticles (AuNPs). (3 and 4) Atomic force microscopy images of SWNT–AuNP structures [Source: Chen et al. [50]. Reproduced with permission of American Chemical Society].
[Geometries labeled in panel (a): diatomic, linear, T‐shaped, square planar, square pyramidal, octahedral.]

The aggregation of gold nanoparticles resulting from self‐assembly can be detected through the modification of their surface plasmon resonance (SPR) band. In practice, a particularly useful output is the redshift and broadening of the plasmon band due to interparticle plasmon coupling, which can be exploited for the development of biosensors. Hence, the oligonucleotide‐mediated nanoparticle aggregation process has been extensively used for the development of simple and highly sensitive colorimetric biosensors for the detection of specific oligonucleotide sequences [56–58]. This approach has been broadened by the incorporation of aptamer sequences highly specific to a given target. One example concerns the integration of aptamers specific to adenosine and cocaine for their respective sensing [59]. The sensor features a particle functionalized with two different sequences of single‐stranded DNA, one for conjugation and another with the aptamer of interest. The presence of cocaine triggers nanoparticle dispersion. This is accompanied by a blue‐to‐red color change, allowing for spectroscopic sensing. This method has been extended to the detection of other biomolecular systems such as platelet‐derived growth factors (PDGFs) [60] and thrombin (Figure 4.28) [61]. Alternatively, silica particle‐aptamer biosensors have been developed for the detection of ATP [52]. Other biomolecules have been used to drive the aggregation and dispersion of gold nanoparticles with the aim of sensing given compounds.
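The colorimetric readout discussed above is often reduced to a single number by comparing the absorbance at a long wavelength with the absorbance near the plasmon peak of the dispersed particles. The Python sketch below illustrates the idea on toy data. Everything in it is an assumption made for illustration: the Gaussian band shape is not a physical model of the plasmon band, and the peak positions, widths, and the 520 nm and 620 nm read wavelengths are placeholder values rather than values taken from the cited studies.

```python
import math


def plasmon_band(wavelength_nm: float, peak_nm: float, width_nm: float) -> float:
    """Toy Gaussian stand-in for a plasmon absorbance band (not a physical model)."""
    return math.exp(-0.5 * ((wavelength_nm - peak_nm) / width_nm) ** 2)


def aggregation_ratio(spectrum, short_nm: float = 520.0, long_nm: float = 620.0) -> float:
    """Absorbance at `long_nm` divided by absorbance at `short_nm`.

    A redshifted, broadened band raises this ratio; dispersion lowers it.
    """
    return spectrum(long_nm) / spectrum(short_nm)


def dispersed(wavelength_nm: float) -> float:
    # Assumed band for well-dispersed particles (red solution).
    return plasmon_band(wavelength_nm, peak_nm=520.0, width_nm=40.0)


def aggregated(wavelength_nm: float) -> float:
    # Assumed redshifted and broadened band after aggregation (blue/purple solution).
    return plasmon_band(wavelength_nm, peak_nm=560.0, width_nm=80.0)


r_dispersed = aggregation_ratio(dispersed)
r_aggregated = aggregation_ratio(aggregated)

print(f"dispersed:  A620/A520 = {r_dispersed:.3f}")
print(f"aggregated: A620/A520 = {r_aggregated:.3f}")
```

In an assay of the kind described here, the ratio would be read from measured spectra and compared with a calibration curve; whether aggregation or dispersion signals the analyte depends on the sensor design.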

Figure 4.28 (a) Schematic representation and (b) photograph of the gold nanoparticles (AuNPs) colorimetric strategy for thrombin detection in the presence and absence of a target protein (from left to right: 83 nM thrombin, 83 nM BSA, and water) [Source: Wei et al. [61]. Reproduced with permission of The Royal Society of Chemistry].

Glyconanoparticles have been synthesized through the conjugation of lactose to gold nanoparticles for the colorimetric detection of the cholera toxin. In solution, the lactose‐stabilized nanoparticles are red. Cholera toxin binds to the lactose derivative and induces aggregation of the nanoparticles, so that the solution turns deep purple. This simple color change provides a selective means to detect and quantify the cholera toxin within 10 min [62].

Enzymatic activity has also been detected based on the aggregation/dispersion of gold nanoparticles. Such particle‐based setups make use of gold nanoparticles that are aggregated through a bridging peptide containing a cleavage site. Upon the action of proteases, the gold nanoparticles are dispersed, allowing the detection of protease activity [63, 64].

One approach based on the bottom‐up construction of 3D networks from protein self‐assembly is the association of actin with gold nanoparticles to obtain nano building blocks. These blocks then self‐assemble into gold‐modified actin filaments that become motor nanostructures once deposited on myosin‐functionalized glass surfaces [65]. Self‐assembling building blocks combining inorganic particles and biomolecules have also been obtained from the association of silica particles with collagen. Collagen could be electrostatically bound as a triple helix to silica prior to triggering its self‐assembly into large, well‐organized fibrils (Figure 4.29a) [68]. A surface‐mediated growth of the collagen fibrils was evidenced and attributed to the fact that, although the protein is initially confined on the particle, it preserves some mobility, allowing for its subsequent self‐assembly [66].

Finally, the ferritin protein has also been used to direct the organization of two types of nanoparticles (magnetic iron oxide and nonmagnetic gold nanoparticles) through electrostatic interactions (Figure 4.29b) [67]. This strategy has given rise to bionanocomposites with regular interparticle spacing, where the magnetic properties of ferritin have been integrated with those of magnetic nanoparticles to generate 3D assemblies featuring novel magnetic behavior.

Figure 4.29 (a) Collagen adsorption and fibrillogenesis from sulfonate‐modified silica particles, as shown by TEM [Source: Reprinted from Bancelin et al. [66] with permission © 2014, Royal Society of Chemistry]. (b) Magnetic nanoparticles assembled with ferritin via electrostatic interactions and corresponding TEM images [Source: Reprinted from Srivastava et al. [67] with permission © 2007, American Chemical Society].

4.3.5 Conclusion and Perspectives

Living systems are characterized by a full set of interacting rules that ensure the hierarchical structure and function of biomolecules. These rules can be harnessed to program the interactions of synthetic bionanocomposites with living systems or with each other. In other words, the conjugation of inorganic nanoscale objects with biomolecules opens the door to the engineering of biocompatible and tagged objects for specific targeting. In parallel, high‐throughput methods to synthesize well‐calibrated nano‐objects, such as metal or semiconductor particles, with varying size and shape in a reproducible manner are emerging, and the use of the resulting bionanocomposites holds great promise. In particular, their combination with biomolecules introduces new directions to the rapidly developing field of nanobiotechnology. Indeed, as illustrated here, nanobiocomposites show great potential for future nanomedicine, including drug delivery, sensor design, and bioimaging. However, today, this mostly concerns 1D nanocomposites. Still, recent substantial progress in the processing of 2D and 3D nanostructures based on biomolecular recognition is a good indicator that proper devices can be constructed. This highlights the scientific challenges ahead of us, which call for strongly interdisciplinary research.

References

1 Willner, I.; Willner, B. Nano Lett. 2010, 10, 3805–3815.
2 Rapuano, R.; Carmona‐Ribeiro, A. M. J. Colloid Interface Sci. 2000, 226, 299–307.
3 Voisin, H.; Aimé, C.; Coradin, T. Eur. J. Inorg. Chem. 2015, 2015(27), 4463–4480.
4 De, M.; Ghosh, P. S.; Rotello, V. M. Adv. Mat. 2008, 20, 4225–4241.
5 Chen, F.; Hong, H.; Zhang, Y.; Valdovinos, H. F.; Shi, S.; Kwon, G. S.; Theuer, C. P.; Barnhart, T. E.; Cai, W. ACS Nano 2013, 7, 9027–9039.
6 Gao, X.; Cui, Y.; Levenson, R. M.; Chung, L. W. K.; Nie, S. Nat. Biotechnol. 2004, 22, 969–976.
7 Akerman, M. E.; Chan, W. C. W.; Laakkonen, P.; Bhatia, S. N.; Ruoslahti, E. Proc. Natl. Acad. Sci. U.S.A. 2002, 99, 12617–12621.
8 Gary‐Bobo, M.; Hocine, O.; Brevet, D.; Maynadier, M.; Raehm, L.; Richeter, S.; Charasson, V.; Loock, B.; Morere, A.; Maillard, P.; Garcia, M.; Durand, J.‐O. Int. J. Pharm. 2012, 423, 509–515.
9 Daniel, M.‐C.; Astruc, D. Chem. Rev. 2004, 104, 293–346.
10 Xie, J.; Chen, K.; Lee, H.‐Y.; Xu, C.; Hsu, A. R.; Peng, S.; Chen, X.; Sun, S. J. Am. Chem. Soc. 2008, 130, 7542–7543.
11 Xie, J.; Chen, K.; Huang, J.; Lee, S.; Wang, J.; Gao, J.; Li, X.; Chen, X. Biomaterials 2010, 31, 3016–3022.
12 Valtchev, V.; Tosheva, L. Chem. Rev. 2013, 113, 6734–6760.
13 Coll, C.; Bernardos, A.; Martinez‐Manez, R.; Sancenon, F. Acc. Chem. Res. 2013, 46, 339–349.
14 Yang, X.; Pu, F.; Chen, C.; Ren, J.; Qu, X. Chem. Commun. 2012, 48, 11133–11135.
15 de la Torre, C.; Mondragon, L.; Coll, C.; García‐Fernández, A.; Sancenón, F.; Martinez‐Manez, R.; Amoros, P.; Perez‐Paya, E.; Orzaez, M. Chem. Eur. J. 2015, 21, 15506–15510.
16 Climent, E.; Martínez‐Manez, R.; Sancenon, F.; Marcos, M. D.; Soto, J.; Maquieira, A.; Amoros, P. Angew. Chem. Int. Ed. 2010, 49, 7281–7283.
17 Zhang, G.; Yang, M.; Cai, D.; Zheng, K.; Zhang, X.; Wu, L.; Wu, Z. ACS Appl. Mater. Interfaces 2014, 6, 8042–8047.
18 Ruiz‐Hernandez, E.; Baeza, A.; Vallet‐Regi, M. ACS Nano 2011, 5, 1259–1266.
19 Hermann, T.; Patel, D. J. Science 2000, 287, 820–825.
20 Mairal, T.; Özalp, V. C.; Lozano Sánchez, P.; Mir, M.; Katakis, I.; O’Sullivan, C. K. Anal. Bioanal. Chem. 2008, 390, 989–1007.
21 He, X.; Zhao, Y.; He, D.; Wang, K.; Xu, F.; Tang, J. Langmuir 2012, 28, 12909–12915.
22 Aroca, R. F.; Alvarez‐Puebla, R. A.; Pieczonka, N.; Sanchez‐Cortez, S.; Garcia‐Ramos, J. V. Adv. Colloid Interface Sci. 2005, 116, 45–61.
23 Cao, Y. C.; Jin, R.; Nam, J.‐M.; Thaxton, C. S.; Mirkin, C. A. J. Am. Chem. Soc. 2003, 125, 14676–14677.
24 Sapsford, K. E.; Berti, L.; Medintz, I. L. Angew. Chem. Int. Ed. 2006, 45, 4562–4588.
25 Maxwell, D. J.; Taylor, J. R.; Nie, S. J. Am. Chem. Soc. 2002, 124, 9606–9612.
26 Dubertret, B.; Calame, M.; Libchaber, A. J. Nat. Biotechnol. 2001, 19, 365–370.
27 Miranda, O. R.; Chen, H.‐T.; You, C.‐C.; Mortenson, D. E.; Yang, X.‐C.; Bunz, U. H. F.; Rotello, V. M. J. Am. Chem. Soc. 2010, 132, 5285–5289.
28 Ray, P. C.; Fortner, A.; Darbha, G. K. J. Phys. Chem. B 2006, 110, 20745–20748.
29 Miranda, O. R.; Li, X.; Garcia‐Gonzalez, L.; Zhu, Z.‐J.; Yan, B.; Bunz, U. H. F.; Rotello, V. M. J. Am. Chem. Soc. 2011, 133, 9650–9653.
30 Storhoff, J. J.; Mirkin, C. A. Chem. Rev. 1999, 99, 1849–1862.
31 Rothemund, P. W. K. Nature 2006, 440, 297–302.
32 Kannan, B.; Kulkarni, R. P.; Majumdar, A. Nano Lett. 2004, 4, 1521–1524.


33 Deng, Z.; Tian, Y.; Lee, S.‐H.; Ribbe, A. E.; Mao, C. Angew. Chem. Int. Ed. 2005, 44, 3582–3585.
34 Noh, H.; Hung, A. M.; Choi, C.; Lee, J. H.; Kim, J. Y.; Jin, S.; Cha, J. N. ACS Nano 2009, 3, 2376–2382.
35 Lalander, C. H.; Zheng, Y.; Dhuey, S.; Cabrini, S.; Bach, U. ACS Nano 2010, 4, 6153–6161.
36 Noh, H.; Hung, A. M.; Cha, J. N. Small 2011, 7, 3021–3025.
37 Zhang, J.; Liu, Y.; Ke, Y.; Yan, H. Nano Lett. 2006, 6, 248–251.
38 Noh, H.; Choi, C.; Hung, A. M.; Jin, S.; Cha, J. N. ACS Nano 2010, 4, 5076–5080.
39 Le, J. D.; Pinto, Y.; Seeman, N. C.; Musier‐Forsyth, K.; Taton, T. A.; Kiehl, R. A. Nano Lett. 2004, 4, 2343–2347.
40 Alivisatos, A. P.; Johnsson, K. P.; Peng, X.; Wilson, T. E.; Loweth, C. J.; Bruchez, M. P.; Schultz, P. G. Nature 1996, 382, 609–611.
41 Mirkin, C. A.; Letsinger, R. L.; Mucic, R. C.; Storhoff, J. J. Nature 1996, 382, 607–609.
42 Dujardin, E.; Mann, S.; Hsin, L.; Wang, C. R. C. Chem. Commun. 2001, 1264–1265.
43 Schreiber, R.; Do, J.; Roller, E. M.; Zhang, T.; Schüller, V. J.; Nickels, P. C.; Feldmann, J.; Liedl, T. Nat. Nanotechnol. 2014, 9, 74–78.
44 Kim, J. W.; Kim, J. H.; Deaton, R. Angew. Chem. Int. Ed. 2011, 50, 9185–9190.
45 Mirkin, C. A.; Mucic, R. C.; Storhoff, J. J.; Letsinger, R. L. J. Am. Chem. Soc. 1998, 120, 12674–12675.
46 Xu, X.; Rosi, N. L.; Wang, Y.; Huo, F.; Mirkin, C. A. J. Am. Chem. Soc. 2006, 128, 9286–9287.
47 Loweth, C. J.; Caldwell, W. B.; Peng, X.; Alivisatos, A. P.; Schultz, P. G. Angew. Chem. Int. Ed. 1999, 38, 1808–1812.
48 Maye, M. M.; Nykypanchuk, D.; Cuisinier, M.; van der Lelie, D.; Gang, O. Nat. Mater. 2009, 8, 388–391.
49 Pal, S.; Sharma, J.; Yan, H.; Liu, Y. Chem. Commun. 2009, 6059–6061.
50 Chen, Y.; Liu, H.; Ye, T.; Kim, J.; Mao, C. J. Am. Chem. Soc. 2007, 129, 8696–8697.
51 Wu, J.; Silvent, J.; Coradin, T.; Aimé, C. Langmuir 2012, 28, 2156–2165.
52 Wu, J.; Coradin, T.; Aimé, C. J. Mater. Chem. B 2013, 1, 5353–5359.
53 Nykypanchuk, D.; Maye, M. M.; van der Lelie, D.; Gang, O. Nature 2008, 451, 549–552.
54 Park, S. Y.; Lytton‐Jean, A. K. R.; Lee, B.; Weigand, S.; Schatz, G. C.; Mirkin, C. A. Nature 2008, 451, 553–556.
55 Macfarlane, R. J.; Lee, B.; Jones, M. R.; Harris, N.; Schatz, G. C.; Mirkin, C. Science 2011, 334, 204–208.
56 Reynolds, R. A.; Mirkin, C. A.; Letsinger, R. L. J. Am. Chem. Soc. 2000, 122, 3795–3796.
57 Storhoff, J. J.; Lucas, A. D.; Garimella, V.; Bao, Y. P.; Muller, U. R. Nat. Biotechnol. 2004, 22, 883–887.
58 Han, M. S.; Lytton‐Jean, A. K. R.; Mirkin, C. A. J. Am. Chem. Soc. 2006, 128, 4954–4955.
59 Liu, J. W.; Lu, Y. Angew. Chem. Int. Ed. 2006, 45, 90–94.
60 Huang, C.‐C.; Huang, Y.‐F.; Gao, Z.; Tan, W.; Chang, H.‐T. Anal. Chem. 2005, 77, 5735–5741.
61 Wei, H.; Li, B.‐L.; Li, J.; Wang, E.‐K.; Dong, S.‐J. Chem. Commun. 2007, 3735–3737.
62 Schofield, C. L.; Field, R. A.; Russell, D. A. Anal. Chem. 2007, 79, 1356–1361.
63 Guarise, C.; Pasquato, L.; De Filippis, V.; Scrimin, P. Proc. Natl. Acad. Sci. U.S.A. 2006, 103, 3978–3982.
64 Laromaine, A.; Koh, L.; Murugesan, M.; Ulijn, R. V.; Stevens, M. M. J. Am. Chem. Soc. 2007, 129, 4156–4157.
65 Patolsky, F.; Weizmann, Y.; Willner, I. Nat. Mater. 2004, 3, 692–695.
66 Bancelin, S.; Decencière, E.; Machairas, V.; Albert, C.; Coradin, T.; Schanne‐Klein, M. C.; Aimé, C. Soft Matter 2014, 10, 6651–6657.
67 Srivastava, S.; Samanta, B.; Jordan, B. J.; Hong, R.; Xiao, Q.; Tuominen, M. T.; Rotello, V. M. J. Am. Chem. Soc. 2007, 129, 11776–11780.
68 Aimé, C.; Mosser, G.; Pembouong, G.; Bouteiller, L.; Coradin, T. Nanoscale 2012, 4, 7127–7134.


5 Applications


5.1 Optical Properties

Cordt Zollfrank and Daniel Van Opdenbosch
Biogenic Polymers, Technische Universität München, Straubing, Germany

5.1.1 Introduction

The term optical properties of materials generally denotes the interaction of electromagnetic (EM) radiation with solid matter [1]. EM radiation can be regarded as transverse waves of oscillating electric and magnetic fields. They propagate in vacuum with a speed c_vac of approximately 3.0 × 10⁸ m/s, which is one of the fundamental constants of nature. The oscillations of these two fields are perpendicular to each other and to the direction of energy and wave propagation. The primary properties of an EM wave are its intensity (proportional to the amplitude squared), its propagation direction, its frequency (or, equivalently, wavelength) spectrum, and its polarization. The frequency ν and wavelength λ of the wave are related via the speed of light (c = λ∙ν). The term “spectrum” denotes the entirety of EM waves within a detected frequency range or space. Examples of EM radiation from the EM spectrum include radio waves (λ > 0.1 m), microwaves (