Synthetic Biology, Volume 2 978-1-78262-278-9, 1782622780, 978-1-78262-120-1, 978-1-78801-405-2


English Pages 186 [196] Year 2017



Author: Maxim Ryadnov

Table of contents:
Design and Applications of Synthetic Information Processing Circuits in Mammalian Cells
Self-assembly at the Multi-scale Level: Challenges and New Avenues for Inspired Synthetic Biology Modelling
Protein Scaffolds and Higher-order Complexes in Synthetic Biology
Design of Synthetic Symmetrical Proteins
Designer Proteins for Bottom-up Synthetic Biology
Synthetic Extracellular Matrix Approaches for the Treatment of Myocardial Infarction


Published on 23 November 2017 on http://pubs.rsc.org | doi:10.1039/9781782622789-FP001

Synthetic Biology Volume 2


A Specialist Periodical Report


Synthetic Biology Volume 2
A Review of Recent Literature

Editors
Maxim Ryadnov, National Physical Laboratory, Teddington, UK
Luc Brunsveld, Eindhoven University of Technology, The Netherlands
Hiroaki Suga, University of Tokyo, Japan

Authors
Noortje A. M. Bax, Eindhoven University of Technology, The Netherlands
Carlijn V. C. Bouten, Eindhoven University of Technology, The Netherlands
Luc Brunsveld, Eindhoven University of Technology, The Netherlands
Patricia Y. W. Dankers, Eindhoven University of Technology, The Netherlands
Tina Fink, National Institute of Chemistry, Slovenia
Franca Fraternali, King’s College London, UK
Tom F. A. de Greef, Eindhoven University of Technology, The Netherlands
Anniek den Hamer, Eindhoven University of Technology, The Netherlands
Roman Jerala, National Institute of Chemistry, Slovenia
Jan Lonzarić, National Institute of Chemistry, Slovenia
Chris D. Lorenz, King’s College London, UK
Irene Marzuoli, King’s College London, UK
Giuseppe Milano, Università di Salerno, Italy
B. Mylemans, KU Leuven, Belgium
H. Noguchi, KU Leuven, Belgium
Bas J. H. M. Rosier, Eindhoven University of Technology, The Netherlands
Maxim Ryadnov, National Physical Laboratory, Teddington, UK
Sergio Spaans, Eindhoven University of Technology, The Netherlands
J. R. H. Tame, Yokohama City University, Japan
A. R. D. Voet, KU Leuven, Belgium
J. Vrancken, KU Leuven, Belgium
S. Wouters, KU Leuven, Belgium


Print ISBN: 978-1-78262-120-1
PDF ISBN: 978-1-78262-278-9
EPUB ISBN: 978-1-78801-405-2
DOI: 10.1039/9781782622789
ISSN: 0140-0568

A catalogue record for this book is available from the British Library

© The Royal Society of Chemistry 2018

All rights reserved

Apart from fair dealing for the purposes of research for non-commercial purposes or for private study, criticism or review, as permitted under the Copyright, Designs and Patents Act 1988 and the Copyright and Related Rights Regulations 2003, this publication may not be reproduced, stored or transmitted, in any form or by any means, without the prior permission in writing of The Royal Society of Chemistry or the copyright owner, or in the case of reproduction in accordance with the terms of licences issued by the Copyright Licensing Agency in the UK, or in accordance with the terms of the licences issued by the appropriate Reproduction Rights Organization outside the UK. Enquiries concerning reproduction outside the terms stated here should be sent to The Royal Society of Chemistry at the address printed on this page.

Whilst this material has been produced with all due care, The Royal Society of Chemistry cannot be held responsible or liable for its accuracy and completeness, nor for any consequences arising from any errors or the use of the information contained in this publication. The publication of advertisements does not constitute any endorsement by The Royal Society of Chemistry or Authors of any products advertised. The views and opinions advanced by contributors do not necessarily reflect those of The Royal Society of Chemistry which shall not be liable for any resulting loss or damage arising as a result of reliance upon this material.

The Royal Society of Chemistry is a charity, registered in England and Wales, Number 207890, and a company incorporated in England by Royal Charter (Registered No. RC000524), registered office: Burlington House, Piccadilly, London W1J 0BA, UK, Telephone: +44 (0) 207 4378 6556.

For further information see our web site at www.rsc.org

Printed in the United Kingdom by CPI Group (UK) Ltd, Croydon, CR0 4YY, UK

Preface


DOI: 10.1039/9781782622789-FP005

This is the second volume of the recently launched series of periodical reports that cover recent developments in synthetic biology. Synthetic biology is a relatively new research area which combines biology and engineering to design, build and test biological systems. The definition of synthetic biology has evolved into a more purposeful term of engineering biology: the synthesis of complex, biologically based (or inspired) systems, which display functions that may not exist in nature. Engineering is at the heart of synthetic biology, being applied at all levels of biological hierarchy from individual molecules to cells, tissues and organisms.

There is a rapidly growing body of literature for synthetic biology, with several specialist journals now available. Finding the most appropriate information in this field can be time-consuming. Therefore, this series aims to offer comprehensive reviews of recent literature in themed chapters. Each chapter strives to highlight the most recent findings in specific sub-areas and reviews research reports that were published over the last two to three years. Each chapter also revisits traditional concepts in the light of emerging discoveries, which differentiates this series from other publications: it keeps pace with progress without losing touch with foundations.

The volume starts with an overview of synthetic information processing circuits (Lonzarić, Fink and Jerala). Synthetic circuits are one of the fundamental topics in synthetic biology and also one of its most rapidly evolving concepts. The chapter takes a brave step towards highlighting the challenges of developing circuits in mammalian cells. Approaches to design logic circuits using genetic elements from bacteria, yeast and plants are discussed on a par with strategies used to re-wire signalling pathways that are endogenous to mammalian cells. Externally responsive systems, circuits combining transcription and translation and circuits with recombination are featured in detail, providing a critical focus on use in applications including personalised cell therapies.

A logical continuation of this topic is offered in the next chapter, which outlines new avenues in synthetic biology modelling (Milano, Marzuoli, Lorenz and Fraternali). Synthetic biology operates with biomolecular ensembles resulting from the assembly of different components in varied combinations. To be effective in routine use in industry, these ensembles will benefit from accurate computational models that can ultimately pave the way to digital bio-manufacturing. This forms the subject of the review, which surveys ways of modelling native and non-native assemblies by matching force-field parameterisations with the conformational flexibility of multi-partner subcellular and virus-like assemblies. Often referred to as bottom-up synthetic biology, this approach of using biomolecular assembly to construct discrete and functional higher-order structures is gaining momentum in formulating the tool box of molecule-specific scaffolds.

This stream of research is further detailed in the following chapter, which reviews protein scaffolds designed to control signalling or metabolic activity in living cells (den Hamer, Rosier, Brunsveld and de Greef). The chapter goes through a rich repertoire of multi-component protein complexes involved in signal transduction, which are complemented with mathematical models of scaffold assemblies and the signalling pathways these scaffolds regulate. Computational predictions of signal inhibition and amplification are described with reference to experimental evidence. The chapter concludes with examples of synthetic scaffolds for signalling networks and metabolic engineering which are of relevance to commercial applications.

Following the design of intracellular scaffold proteins, the fourth chapter outlines computational design rules for synthetic symmetrical proteins (Vrancken, Wouters, Mylemans, Noguchi, Tame and Voet). This report takes a somewhat different approach to synthetic biology design, starting from protein folding motifs (bottom) and progressing to geometrically defined proteins (up) assembled according to a desired symmetry. Consensus designs that rely on repeated motifs constitute the main theme of the chapter. Two types of case studies are given: repeat proteins delivering specific functions, and distinguishable protein folds based on the conventional classification, e.g. β-propeller, α/β-barrel and the like.

The progress in designing de novo proteins computationally is followed in the penultimate chapter, which gathers information about experimentally validated designs (Ryadnov). This chapter discusses how the current understanding of protein structure-activity relationships may be improved by creating particular biological functions with designer synthetic biomaterials ranging from artificial viruses to cell-supporting extracellular matrices. The latter is taken to the level of one of the most challenging therapeutic targets in the closing chapter of the volume (Spaans, Bax, Bouten and Dankers). This report reviews top-down and bottom-up approaches used to emulate the native extracellular matrix with synthetic analogues for the treatment of myocardial infarction. The impact of such matrices on tissue remodelling, cardiac performance and repair is discussed in line with therapeutic routes including injectable scaffolds, cell delivery and recruitment and the stimulation of endogenous repair.

The chapter completes the volume, compiled in the spirit of multiscale synthetic biology, starting from genetic elements and culminating with applications at the organismic level of biological organisation. Each themed chapter is structured around current trends in the reviewed area, providing the authors’ outlook on future perspectives, either as a separate section or incorporated in the text. All chapters are written by leading researchers in their subject areas, allowing for a broad appeal to researchers in academia and industry.

Maxim Ryadnov, Luc Brunsveld and Hiroaki Suga


CONTENTS

Cover
Front cover image courtesy of Santanu Ray and Emiliana De Santis. This image shows a colour-converted atomic force micrograph of self-assembled protein matrices. These artificial nano-to-microscopic structures instruct the development of live cells into new tissues for applications in regenerative medicine.

Preface

Design and applications of synthetic information processing circuits in mammalian cells
Jan Lonzarić, Tina Fink and Roman Jerala
1 Introduction
2 Input signals and inducible systems
3 Designed synthetic circuits
4 Therapeutic applications of synthetic biology
5 Conclusions and lookout
Acknowledgements
References

Self-assembly at the multi-scale level: challenges and new avenues for inspired synthetic biology modelling
Giuseppe Milano, Irene Marzuoli, Chris D. Lorenz and Franca Fraternali
1 Introduction
2 Computational methods in self-assembly
3 Force-field parameterization
4 Applications
5 MD-SCF models: applications in self assembly
6 Conclusions and future perspectives
Acknowledgements
References

Protein scaffolds and higher-order complexes in synthetic biology
Anniek den Hamer, Bas J. H. M. Rosier, Luc Brunsveld and Tom F. A. de Greef
1 Introduction
2 Scaffold proteins
3 Higher-order assemblies
4 Mathematical models and simulations
5 Synthetic scaffolds in engineered signalling networks
6 Synthetic scaffolds in metabolic engineering
7 Conclusions
References

Design of synthetic symmetrical proteins
J. Vrancken, S. Wouters, B. Mylemans, H. Noguchi, J. R. H. Tame and A. R. D. Voet
1 Introduction
2 Symmetrical repeat proteins
3 Symmetrical globular proteins
4 Future perspectives
Acknowledgements
References

Designer proteins for bottom-up synthetic biology
Maxim G. Ryadnov
1 Introduction
2 Designer proteins as mimetics of native assemblies
3 Designer protein bricks
4 Current trends: functionally applicable designs
5 Future perspectives
References

Synthetic extracellular matrix approaches for the treatment of myocardial infarction
Sergio Spaans, Noortje A. M. Bax, Carlijn V. C. Bouten and Patricia Y. W. Dankers
1 Introduction
2 Healthy versus adverse remodeled cardiac niche
3 Synthetic ECM approaches for the treatment of MI
4 Discussion and future perspectives
References


Design and applications of synthetic information processing circuits in mammalian cells
Jan Lonzarić, Tina Fink and Roman Jerala*
DOI: 10.1039/9781782622789-00001

Mammalian cells represent both a challenge and an opportunity for synthetic biology. The toolbox of regulators implemented in mammalian cells includes small molecule- or physical stimulus-responsive elements from bacteria, yeast and plants, and designed nucleic acid binding proteins such as TALEs and CRISPR/Cas. This enables engineering of mammalian cells to sense versatile external or endogenous signals and process them via designed logic circuits, or rewiring of endogenous signaling pathways for therapeutic responses, as demonstrated in disease models of diabetes, inflammation and cancer. Synthetic biology tools are also applied to improve the safety and efficiency of engineered therapeutic cells, such as those used for cancer immunotherapy or stem cells.

1 Introduction

Synthetic biology, a discipline merging life sciences with engineering, has since its inception proved to be a valuable tool in research as well as in industrial and medical applications. Wholly synthetic genomes are no longer fiction,1,2 and it is now becoming possible not only to replace an organism’s genetic material with synthetic DNA, but also to design the function of that synthetic DNA and produce genetic circuits that either respond to combinations of external or internal signals in therapeutic applications3,4 or provide valuable insights into the complex relationships between DNA, RNA and proteins in the functional cell.5,6 The principal idea behind synthetic biology is in fact a simple one: to define functionally separate basic parts that behave in a predictable way and can be joined and recombined as modules into higher-order systems, which perform much more complex functions but in turn also act predictably. Basic modules for gene editing, transcriptional control, translational control and functional control of proteins were first developed in prokaryotes and, drawing a parallel to electronic systems, transistors,7 Boolean logic gates,8 oscillators,9 switches,10 band-pass filters11 and many other regulatory modules and devices were assembled. Many of these devices have since been transferred to, or independently reinvented for, eukaryotic cells.

In this review we focus on synthetic biology in mammalian cells. We present an overview of different mechanisms of gene editing and transcription regulation (Fig. 1A), either through direct control of transcription factors or through rewired signaling pathways. We provide a window into the construction and function of modules operating at the level of RNA and translation regulation (Fig. 1B), and describe attempts to construct faster responsive elements through posttranslational control of proteins and their functions (Fig. 1C). Finally, we describe examples of interesting and complex devices that represent an important advance towards potentially useful therapeutic applications of mammalian synthetic biology.

Fig. 1 Synthetic biology toolbox. Synthetic circuits are regulated on the levels of DNA, RNA and/or proteins. (A) Genes are edited directly or epigenetically regulated with transcription factors. (B) Translation is regulated with RNA interference, RNA-binding proteins (RBPs), aptamers and ribozymes, which can be a part of the mRNA transcript or supplied externally. (C) Proteins are regulated by functional reconstitution, phosphorylation/dephosphorylation, proteolysis or degradation/stabilization.

National Institute of Chemistry, Hajdrihova ulica 19, 1000 Ljubljana, Slovenija. E-mail: [email protected]

2 Input signals and inducible systems

The first step in designing a controllable synthetic biological circuit is the selection of appropriate input signals. Often, proof-of-principle circuits are designed so that the input signal is provided by the presence or absence of certain constitutively active elements.12–14 On the other hand, circuits can be designed to respond conditionally to a wide range of externally supplied or intracellular signals, specific to a disease state, physiological conditions or cell type. The latter include the presence of certain DNA elements,15 short RNAs,16 or endogenous proteins,17–19 and circuits can respond to signals present locally or systemically in an organism, such as metabolites,20–23 neurotransmitters24 or cytokines.25,26 Development of physiologically relevant synthetic sensors is fundamental to the development of therapeutic circuits, and we will cover some examples of these input systems along with examples of therapeutic use in a later section of this review. In this opening section, however, we will focus on the introduction of synthetic input systems designed to be orthogonal to mammalian cell biology.

2.1 Allosterically controlled transcription factors

For therapeutic applications of biological devices as well as for investigation purposes we often want to design systems that respond only to selected input signals without sensing or interfering with the state of the chassis cell or organism in unpredictable ways. This allows us to exert control over the target cells, either to perturb the investigated system or to safely initiate or terminate production of therapeutic effectors. The earliest developed and best characterized orthogonal sensing systems are based on small chemical inducers. In genetic circuits, the inducers can exert their effect directly on transcription factors (TFs) by allosterically affecting their binding affinity for target DNA.27 Many such transcription factors are derived from bacteria28 and modified to function in mammalian cells by fusion with repression domains (such as the Krüppel associated box, KRAB29) or activation domains (such as the virally derived VP16 domain30 or the human p65 domain31). Additionally, appropriate binding sites for those transcription factors have to be inserted into the promoter region of the gene of interest.

The most widely used example of a bacterial transcription factor adapted for mammalian use, and one that superbly illustrates the principles of modular design, is the tet system. The TetR protein, originally found in bacteria, was first fused to the VP16 domain to generate a ligand-dependent activator that drives expression from a minimal promoter with tetO binding sites.32 TetR binds to tetO only in the absence of its ligand, doxycycline; therefore the TetR-VP16 fusion (tTA) creates the so-called tet-OFF system that activates transcription in the absence of doxycycline (Fig. 2A). If only transient expression is desired, the tet-OFF system places an inconvenient burden on both the researcher and the cell in having to maintain and tolerate a constantly high concentration of doxycycline when no protein is being produced. To circumvent this and activate the target gene only in the presence of doxycycline, two other design options are available. The first is the fusion of TetR to KRAB in combination with a constitutively active tetO-containing promoter, which can be described as derepression in the presence of the ligand and was termed the TET-dependent transsilencer tTS33 (Fig. 2B). The second is the reverse tet transcription activator rtTA, which binds to the target DNA only in the presence of the ligand and thereby activates transcription with the help of the VP16 fusion domain34 (Fig. 2C). These options differ in kinetic properties and leakage, which are therefore the key factors to consider when designing circuits containing the tet systems.

Fig. 2 Transduction of information through transcription factors. Many prokaryotic transcription factors are allosterically regulated by small molecule binding. (A) The tet transactivator (tTA) is composed of the TetR DNA-binding domain (DBD) and the VP16 activation domain (AD), which activates transcription in absence of ligand. (B) The tet transsilencer (tTS) is composed of the TetR DBD and a KRAB repression domain (RD), which silences expression in the absence of ligand. (C) The reverse tet transactivator (rtTA) is composed of the reverse tet DBD and the AD VP16, which activates transcription in presence of ligand. (D) A rewired G-protein coupled receptor (GPCR) pathway signals through the native secondary messenger cAMP to activate a native transcription factor, which binds to an ectopically inserted operon. (E) A synthetic GPCR pathway is established by fusion of the GPCR with an ectopic transcription factor. Interaction between activated receptor and arrestin reconstitutes activity of a protease, which releases the transcription factor from the membrane, allowing it to activate transcription in the nucleus.

Importantly, a whole range of bacterial transcription factors responding to different small molecule ligands has been described. For most of them, however, mutants that bind to target DNA either in the absence or in the presence of ligand are not available, so fusion with either VP16 or KRAB is the preferred strategy in harvesting bacterial transcription factors for the design of OFF and ON systems in mammalian cells, respectively.27 Many other small molecule-responsive transcriptional regulators have been harvested from nature, mainly from bacteria. Examples include acetaldehyde-responsive elements, which enable cell stimulation through volatile components,35 a urate-responsive system, in which adding a transporter to the natural sensor improved responsiveness,23 and the sensing of the flavonoid phloretin, an apple metabolite, by a regulator from Pseudomonas putida.36
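The ON/OFF behaviour of the three tet variants described in this section can be summarised as a small truth table. The sketch below is only an idealised Boolean abstraction written for illustration: the class and function names are hypothetical, and it ignores the kinetic differences and leakage that the text identifies as the key design factors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TetRegulator:
    """Idealised tet regulator (hypothetical abstraction, not a real API)."""
    name: str
    binds_with_dox: bool  # True: binds tetO only when doxycycline is present
    effector: str         # "activator" (VP16 fusion) or "repressor" (KRAB fusion)


def target_expressed(reg: TetRegulator, dox_present: bool) -> bool:
    """Return True if the target gene is expressed in this Boolean model."""
    bound = (dox_present == reg.binds_with_dox)
    if reg.effector == "activator":
        # Minimal promoter: silent unless the activator is bound.
        return bound
    # Constitutively active promoter: expressed unless the repressor is bound.
    return not bound


tTA = TetRegulator("tTA", binds_with_dox=False, effector="activator")
tTS = TetRegulator("tTS", binds_with_dox=False, effector="repressor")
rtTA = TetRegulator("rtTA", binds_with_dox=True, effector="activator")

if __name__ == "__main__":
    for reg in (tTA, tTS, rtTA):
        for dox in (False, True):
            state = "ON" if target_expressed(reg, dox) else "OFF"
            print(f"{reg.name:5s} dox={dox!s:5s} -> {state}")
```

In this abstraction tTA is the only variant that is ON without doxycycline, while tTS and rtTA reach the same doxycycline-ON logic by different routes (loss of repression versus gain of activation).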

2.2 Rewiring endogenous pathways

Another strategy for controlling synthetic biological circuits is to couple them to endogenous signaling networks. The receptor type that lends itself well to this strategy is the G-protein coupled receptor (GPCR), also known as the serpentine receptor for its seven membrane-passing helical domains.37 The greatest merit of this receptor family is the number of its members and the range of stimuli they can respond to. GPCRs are the largest and most diverse group of membrane receptors in eukaryotes and can bind neurotransmitters, hormones, pheromones, odors and various other synthetic or biological molecules. Some of them even respond to light, which requires binding of light-sensitive cofactors.38 Upon ligand-induced activation, the GPCR acts as a guanine nucleotide exchange factor for the coupled GTPase, which initiates downstream signaling, for example by activating adenylyl cyclase.39 Ausländer et al. have designed synthetic mammalian sensors by coupling this native GPCR signaling pathway to transcriptional control of transgenes by placing them under a cAMP-responsive promoter38 (Fig. 2D). Unfortunately, the downside of this type of signaling in the design of synthetic circuits is that GPCR signaling networks are not mutually orthogonal, resulting in crosstalk between signaling through various GPCR pathways if they are executed in a single cell. Müller et al. have been able to circumvent this by using designed cell consortia that convert signals from GPCRs into orthogonal soluble mediators that can be combined in the downstream processing cells.35

The activated GPCR is eventually phosphorylated in what is usually a negative feedback loop or desensitization mechanism. The phosphorylated GPCR is recognized and bound by arrestins, which prevent it from further binding to G-proteins and facilitate its internalization, with eventual dephosphorylation and recycling to the membrane. It is this binding of the activated GPCR to arrestin that can be utilized for the design of synthetic signaling pathways. Fusion of a transcription factor to the GPCR through a protease-specific cleavage site, together with fusion of the cognate protease to arrestin, results in a system in which GPCR activation leads to cleavage of the TF from the membrane, its translocation into the nucleus and activation of transcription of target genes (Fig. 2E). This transcriptional activity persists well after the signal from the GPCR is deactivated or its ligand is no longer present, which, depending on the specific circuit topology, can be considered either an advantage or a disadvantage.40 Additionally, while arrestins are believed to bind fairly exclusively to phosphorylated GPCRs, synthetic signaling as described above has been shown to be relatively leaky. A split protease strategy, with one protease fragment attached to arrestin and the other fragment attached to the GPCR together with the cleavage site and TF, was designed to compensate for this problem. Additionally, cleavage site modifications and arrestin truncations were employed with varying efficiency, depending on the cell type and the exact GPCR used.41

Following the principle of complete orthogonality, an effort was made to transplant bacterial two-component systems, which are also able to sense a wide range of inputs, into mammalian cells. These function through a histidine-aspartate phosphorelay between a receptor histidine kinase and a response regulator transcription factor. While Hansen et al. have been able to show the effectiveness of the phosphorelay, inducible activation of two-component systems has not yet been achieved in mammalian cells, and it remains unclear whether the constitutive activity of the transplanted signaling circuit was a result of some unpredicted process in the chassis cell or of the presence of activators in mammalian cell media.42
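The persistence of the output after ligand removal follows from the irreversibility of the cleavage step: once the transcription factor has been released from the membrane, withdrawing the ligand only stops further release. A minimal toy simulation can illustrate this. Everything in it is an assumption made for illustration (the function name, the discrete time steps, and the rate constants `cleavage_rate` and `tf_decay`); it is not a model taken from the cited work.

```python
def simulate_tf_release(ligand_profile, cleavage_rate=0.2, tf_decay=0.02,
                        membrane_pool=1.0):
    """Toy discrete-time model of protease-mediated TF release.

    ligand_profile: iterable of booleans, one per time step (ligand present?).
    cleavage_rate:  hypothetical fraction of tethered TF cleaved per step
                    while the receptor is active.
    tf_decay:       hypothetical fraction of free TF lost per step.
    Returns the amount of free (nuclear) TF after each step.
    """
    tethered = membrane_pool
    free_tf = 0.0
    trace = []
    for ligand_on in ligand_profile:
        # Cleavage happens only while the receptor is active and is irreversible.
        released = cleavage_rate * tethered if ligand_on else 0.0
        tethered -= released
        # Free TF grows by the released amount and slowly decays.
        free_tf += released - tf_decay * free_tf
        trace.append(free_tf)
    return trace


if __name__ == "__main__":
    profile = [True] * 10 + [False] * 10  # ligand for 10 steps, then washout
    for step, level in enumerate(simulate_tf_release(profile)):
        print(step, "ligand" if profile[step] else "washout", round(level, 3))
```

With these arbitrary parameters the free TF level declines only slowly during washout, which mirrors the qualitative statement above; real kinetics depend on the receptor, protease and cell type.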

Synthetic Biology, 2018, 2, 1–34

Published on 23 November 2017 on http://pubs.rsc.org | doi:10.1039/9781782622789-00001

2.3 Chemically induced dimerization

Chemically inducible dimerization (CID) offers a powerful tool to control and manipulate target proteins in living cells. A typical CID system consists of two different protein domains which dimerize upon ligand binding. The most widely used CID today is the FKBP–FRB system, based on the human FKBP12 and mTOR proteins which heterodimerize upon binding of rapamycin or its analogs.43 This system has been used to generate rapamycin inducible translocation44 and activation of transcription factors,45 inteins,46,47 phosphatases48 and kinases49 and a number of reviews describing its development and applications are available.50–52

In order to simultaneously control and combine multiple processes, however, orthogonal CIDs are required. Recently two new CID systems were developed using the plant hormones abscisic acid and gibberellin.53,54 Liang et al. modified proteins of the plant abscisic acid signaling pathway (PYL1 and ABI1) in order to chemically induce proximity of intracellular proteins in mammalian cells,53 and the gibberellin-induced CID was developed by optimizing the GID1 and GAI heterodimerizing proteins. Additionally, to effectively control the dimerization of GID1 and GAI in mammalian cells, chemical modifications of gibberellin were required. At physiological pH, the carboxylic group of gibberellin is negatively charged, decreasing its efficiency of internalization across the cellular membrane. Esterification improved membrane permeability while retaining the full heterodimerization potential of gibberellin due to processing by intracellular esterases.54 Interestingly, abscisic acid required no such modification in spite of the fact that it is also acidic. The rapamycin-, abscisic acid- and gibberellin-induced systems are completely orthogonal and thus well suited for control of separate signaling events.55 Gibberellin and rapamycin were used to independently induce protein translocation to mitochondria and the plasma membrane respectively,54 while abscisic acid and rapamycin were used to translocate proteins to the nucleus or the plasma membrane respectively.53 Finally, abscisic acid and gibberellin were combined in the design of an inducible Cas9, behaving either as an OR or an AND gate with respect to the two inputs.55
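The orthogonality of the three dimerizer systems and the two-input behavior of the inducible Cas9 can be summarized as a truth-table sketch. The Python below is only a Boolean abstraction: the dictionary layout and the function names (`formed_dimers`, `cas9_or`, `cas9_and`) are illustrative assumptions, and dose response, kinetics and leakiness are ignored.

```python
# Ligand -> heterodimerizing protein pair, as named in the text.
CID_SYSTEMS = {
    "rapamycin": ("FKBP", "FRB"),
    "abscisic acid": ("PYL1", "ABI1"),
    "gibberellin": ("GID1", "GAI"),
}

def formed_dimers(ligands):
    """Return the protein pairs brought together by the ligands present.

    Orthogonality is modeled by letting each ligand act only on its own pair.
    """
    return {CID_SYSTEMS[name] for name in ligands if name in CID_SYSTEMS}

def cas9_or(aba, ga):
    """Idealized OR gate: Cas9 output is on if either hormone is present."""
    return aba or ga

def cas9_and(aba, ga):
    """Idealized AND gate: Cas9 output is on only if both hormones are present."""
    return aba and ga

# Print the truth table for both gate configurations.
for aba in (False, True):
    for ga in (False, True):
        print(f"ABA={aba!s:5} GA={ga!s:5} OR={cas9_or(aba, ga)!s:5} AND={cas9_and(aba, ga)}")
```

In this abstraction, adding rapamycin changes neither gate output, which is what orthogonality of the three systems amounts to at the Boolean level.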

2.4 Responding to light

Perhaps the most spectacular example of inducible systems for mammalian cells was introduced by light sensitive receptors. Light regulated activation of neurons represents an extremely powerful tool to study and engineer the nervous system.44 The advantage of light in synthetic circuits is the precision with which stimulation can be controlled both temporally and spatially. Although stimulation by light in living tissue remains a challenge due to the tissue’s opacity, light exhibits fewer side effects than chemical stimulation. The Arabidopsis light-sensitive cryptochrome 2 (CRY2) and its dimerization partner CIB1 or their truncated mutants have been established as the most useful heterodimerization domains due to the absence of a requirement for exogenous chromophores and the fast kinetics of protein association upon light stimulation and dissociation in the dark.56 This fast and reversible system was thus used to generate light-inducible transcription factors in mammalian cells coupled to the yeast GAL4 DNA-binding domain56 as well as the designable TALE57 and the programmable CRISPR/Cas9 DNA-binding domains.58 The possibilities for synthetic circuit design are further expanded by a light inducible Cre recombinase.56

In order to generate orthogonal signaling circuits using only light as an input, one must pay attention to the wavelengths of light that induce any particular system. The CRY-CIB system is induced by irradiation with blue light, and two other systems induced by blue light exist, based either on the FKF1-GIGANTEA interaction or on the LOV domain’s inducible conformational shift. FKF1-GIGANTEA is a heterodimerization system that differs from the CRY-CIB system most importantly in its reaction kinetics and reversibility. While CRY-CIB quickly dissociates to inactive monomers in the dark, FKF1 and GIGANTEA remain associated for at least 1.5 hours, increasing their activity even after short stimulation, but making the system unsuitable for precise temporal control.59 The LOV domain on the other hand utilizes a completely different mechanism of activation. Upon blue light stimulation it undergoes a conformational change, exposing its C-terminal Jα helix.60 This allowed the engineering of tunable light-inducible dimerization tags (TULIPs): peptide tags buried inside the LOV protein while in the dark state, but exposed after the unfolding of the Jα domain, enabling interaction with a PDZ domain. Point mutations in the LOV domain were required to increase the dynamic range of the system by reducing the background Jα undocking, while point mutations in the PDZ domain provided a wide range of association affinities.61 Interestingly, Müller et al. used the PDZ domain with a low affinity to their advantage in designing a circuit with three orthogonal light inputs.62

The phytochrome (Phy-PIF) system, one of the other two systems used in the circuit with three light inputs, responds to red light. Two other important features distinguish this system from the ones described above: (1) it requires the presence of a chromophore not available in mammalian cells, which therefore must be supplied exogenously or biosynthesized, and (2) once activated by red light, it does not passively return to an inactive state, rather it can be actively recovered through illumination with far red wavelengths. The latter has in many cases proven to be an advantage as it effectively creates a stable switch and allows even more precise control over the system’s activation.63

Another approach to engineering light sensitive systems originates from the plant stress response to ultraviolet light. Absorption of UV-B wavelengths allows the use of this system orthogonally to the blue and red light-inducible systems or fluorescent proteins without crosstalk. The system utilizes the unique properties of the UVR8 protein, which forms photolabile homodimers. Induction with UV-B light causes disruption of salt-bridges in the dimer structure and thus leads to formation of protein monomers.64 The most interesting application of the UVR8 light sensitive system was the development of a light-triggered protein secretion system in mammalian cells in which the cargo was fused to several copies of UVR8. Association of these proteins into clusters prevented their transport in membrane vesicles, resulting in release of the protein only upon UV irradiation and dissociation of the clusters.65

Although the ability to respond to light has evolved across kingdoms, all of the light responsive systems described here thus far were, not surprisingly, derived from plants. The choice of an evolutionarily distant source organism almost assures that the systems will be orthogonal to the mammalian chassis. However, a native mammalian receptor-based light-inducible system does also exist, involving chimeric GPCR proteins sensitive to light, named optoXR. Similarly to the GPCR-based systems described above, optoXR was designed to control receptor initiated signaling pathways. In the chimeric protein, intracellular loops of the light-sensitive GPCR rhodopsin were replaced with intracellular domains of other GPCRs, retaining the opsin activation and transduction of the light signal, but diversifying the output by the use of any one of the large family of GPCRs as an effector. To date, two optoXRs that selectively recruit distinct signaling pathways have been characterized, one based on the human α1a-adrenergic receptor and the other based on the hamster β2-adrenergic receptor. Optical stimulation resulted in activation of phospholipase C (production of IP3 and DAG) or adenylyl cyclase (production of cAMP), respectively, leading to activation of downstream signaling.66
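The wavelength bookkeeping described in this section can be captured in a small lookup table. The Python sketch below records only the activation band stated in the text for each system; the key names and the helper `orthogonal_by_wavelength` are hypothetical, and real spectral overlap, kinetics and chromophore requirements are not modeled.

```python
from itertools import combinations

# Activation band of each light-inducible system, as stated in the text.
ACTIVATION_BAND = {
    "CRY2-CIB1": "blue",
    "FKF1-GIGANTEA": "blue",
    "LOV-PDZ (TULIP)": "blue",
    "Phy-PIF": "red",      # reverted by far-red light, needs an exogenous chromophore
    "UVR8": "UV-B",
}

def orthogonal_by_wavelength(systems):
    """True if no two of the chosen systems are activated by the same band."""
    return all(
        ACTIVATION_BAND[a] != ACTIVATION_BAND[b]
        for a, b in combinations(systems, 2)
    )

# One blue, one red and one UV-B system can be addressed independently.
print(orthogonal_by_wavelength(["LOV-PDZ (TULIP)", "Phy-PIF", "UVR8"]))
# Two blue-light systems cannot be separated by wavelength alone.
print(orthogonal_by_wavelength(["CRY2-CIB1", "FKF1-GIGANTEA"]))
```

A check of this kind is a first filter only; as noted above, systems sharing a band may still differ usefully in kinetics and reversibility.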


3 Designed synthetic circuits

While input signals form the initial stage for control of biological circuits, the circuits themselves are defined by the interaction between the biological molecules that transduce and convert the signal. These molecules can range from DNA and RNA to proteins in either closed loop or open loop topologies. In other words, the interplay between transcription, translation and protein interactions/modifications is what determines the behavior and function of a biological circuit. Input signals can be applied to control any of these levels. In the following sections we review the most commonly used methods to control transcription, translation and protein modification in synthetic circuits, before highlighting some of the most interesting circuit topologies.

3.1 Designed transcription factors

The highest number of synthetic biological circuits in prokaryotes as well as eukaryotes has been developed with transcriptional regulation as the principal underlying mechanism. Since proteins are usually introduced into foreign cells and organisms by means of genetic modification, control of their transcription provides an easily accessible way of circuit design. While the abundance of bacterial transcription factors provides a toolbox of regulatory elements, the development of designable targeted DNA binding domains, such as zinc fingers, transcription activator-like effectors (TALEs) and the Cas9 protein of the bacterial clustered regularly interspaced short palindromic repeats (CRISPR) system, provides a highly important advance. The design of targeted DNA binding domains provides proteins that have very similar properties yet can be designed to recognize almost any DNA sequence, which means that any endogenous sequence in the complex mammalian genome can be targeted. Zinc fingers were the first developed designable modular DNA binding domains.
One zinc finger repeat recognizes and binds to a DNA base-pair triplet and longer sequences can be targeted by zinc finger fusions, called polydactyl zinc fingers. However, cooperativity and target-sequence overlap between tandem fusions of zinc finger domains present limitations to zinc finger modularity and complicate the design of arbitrary DNA sequence targeting domains, so several methods to simplify zinc finger synthesis were developed.67 An important breakthrough came with the elucidation of the DNA recognition code of transcription activator-like effectors (TALEs). These proteins were derived from plant pathogens68 and have a more straightforward DNA-recognition code, with the recognition of each nucleotide mapping to two residues in a 34-residue repeat.69,70 Stringing these repeats together creates DNA binding modules that typically recognize 18 base pair long sequences. Fast TALE assembly platforms have been developed that enabled construction of large numbers of TALEs,71–76 including TALEs targeting all human gene coding regions.77 This highly efficient platform was however overshadowed by the CRISPR/Cas system, where the recognition is based on the complementarity between the target site and a guide RNA (gRNA). CRISPR/Cas proteins therefore simplify the construction of large arrays of targeting designs because they do not need to be re-designed for each individual target. Providing several sgRNAs to a cell expressing Cas9 allows easy multiplexing of different targets.78

The area of research most important to a mammalian circuit designer concerning these DNA binding domains is the set of modifications that allow them to function as transcription factors in mammalian cells. First of all, the native Cas9 protein is a nuclease, so in order to function as an epigenetic effector rather than a genome editing tool, it had to be modified with two mutations knocking out its catalytic activity, resulting in the so-called dCas9.79 All of the designed DNA binding proteins can then be used as transcription repressors simply by virtue of sterically blocking transcription (a mechanism termed roadblock) if they are targeted to a region closely downstream of a promoter.79–82 Alternatively, similarly to the bacterial transcription factors described above, they can be fused to activation and repression domains. The virally derived VP16 domain was improved by identification of the minimal activation region and then stringing together multiple copies of this region, resulting in more potent activation domains termed VP64 (ref. 83) or VP192. This was further improved by the addition of the p65 domain derived from the human nuclear factor κB and another virally derived transcription activation domain, Rta, resulting in a much more potent, but also significantly larger, tripartite activation domain VPR.84 Tanenbaum et al. developed another interesting approach that avoids increasing the size of the protein. They fused only short epitope peptides to the Cas9 molecule and then supplied it with separately encoded VP64 fused to scFv antibodies binding to the epitope tag. In this way they were able to direct a large number of activation domains to the same DNA locus and increase transcription activation.85 VP64 or other effector domains can also be recruited to the gRNA-Cas9 complex via RNA-binding proteins (RBPs).86 The advantage of this method is that by selecting orthogonal RBPs with either activation or repression domains, and different copy numbers of RBP binding sites in each gRNA, one could selectively upregulate or downregulate a number of different genes in parallel with only one type of Cas9 molecule.87

The KRAB domain, which is also abundantly present in endogenous human transcription factors and functions by directing chromatin remodeling, remains the most widely used repression domain in synthetic mammalian circuits. However, it has to be mentioned that the KRAB domain silences gene expression in a wide region spanning tens of kilobases around the target site.88 Fusions of designed DNA binding domains with methylation and demethylation domains have also been used to influence gene expression by targeted epigenetic modification.89–94

Another important property that was introduced into the designed DNA binding domains is responsiveness to external stimuli. This was accomplished with the CIDs described above. Small molecule-inducible as well as light inducible zinc fingers,95 TALEs57,96,97 and Cas9 (refs. 55, 58, 98 and 99) have been described on the basis of induced heterodimerization of the DBD with an activation or repression domain (Fig. 3A). Interestingly, both the TALE and the Cas9 DBDs themselves have been split and reconstructed,


Fig. 3 Designed inducible transcription factors. Designed DNA-binding domains (DBD), such as TALEs are not naturally inducible, but can be modified by fusion with other ligand inducible domains. (A) The DBD and an activation domain (AD) are expressed as separate proteins, fused to dimerization domains, which reconstitute the transcription factor activity upon ligand binding. (B) The TALE domain is locked into a circular conformation through intein splicing and therefore unable to bind to target DNA. Proteolytic cleavage of the lock reconstitutes transcription factor binding and activity. (C) A fusion of tandem steroid ligand binding domains (LBD) prevents translocation of the transcription factor into the nucleus. Upon ligand binding, LBDs dimerize intramolecularly and reconstitute transcription factor activity.

Cas9 inducibly with the FKBP-FRB system98 and TALEs constitutively via inteins.13 Additionally, TALEs have been made inducible through intein- or FKBP-mediated circularization. In this case, the superhelical nature of the TALE interaction with DNA was exploited by locking the TALE into a circular conformation, thus preventing it from winding around the DNA (Fig. 3B). The ability to bind the DNA was restored and transcription of target genes upregulated only upon ligand- or protease-induced linearization of the TALE.100 On the other hand, a mechanism of control that is unique to the Cas9 molecule is the dependence of the nuclease function on the length of the gRNA. Kiani et al. have demonstrated that a catalytically active Cas9-VPR fusion can be used to either activate transcription from its target DNA when directed by a short gRNA or to cut the target and thus knock out target genes when directed by a longer gRNA.101

In contrast to these engineered systems, several members of the nuclear hormone receptor family function as native mammalian ligand dependent transcriptional regulators. Their mechanism of action involves activation through binding of the ligand (a hormone or its synthetic analog) and a subsequent intramolecular homodimerization, resulting in nuclear transport and the binding of target DNA to activate transcription102 (Fig. 3C). Ligand binding domains (LBDs) of the nuclear receptors can be fused to selected DNA-binding domains, such as zinc fingers,95 TALEs97 and CRISPR/Cas9 (ref. 103), to create targeted ligand dependent transcription factors. The human nuclear receptors for estrogen (ER), progesterone (PR) and the retinoid X receptor (RXR), as well as the Drosophila ecdysone receptor (EcR), are most widely used to generate ligand dependent artificial transcriptional regulators.95,97,103 Because their natural ligands may possess biological activity and stimulate unwanted gene networks, modified ER and PR LBDs were designed that bind synthetic antagonists (4-OHT and RU486) which do not activate endogenous pathways.104,105

3.2 Genetic circuits with transcription regulation

While many of the above mechanisms represent standalone sensors and simple devices, their true value for synthetic biology is realized only upon consideration of connectivity or composability. Assembly of synthetic circuits such as switches, oscillators or complex logic gates requires layering and feedback loops, imposing the requirement that the output of one information processing module serves as the input of another. This is most readily realized with genetic circuits, where the modules are transcription units and the target of the first transcriptional regulation is itself a transcription factor controlling the next regulatory unit. Layering transcription repressors in this way leads to an interesting, albeit relatively simple system in which an odd number of repressors always leads to repression of the final target while an even number of layered repressors leads to its transcriptional activation (Fig. 4A–C). Due to the inherent level of noise in biological systems and the imperfect on/off ratio, such layering will eventually lead to a loss of signal, but a low number of layered repressors still offers some interesting circuit topologies. Most strikingly, one can arrange a target gene to be acted upon by two regulation cascades, one consisting of a single repression step and the other of a double inversion step, thus seemingly resulting in both activation and repression of the target gene upon stimulation. However, if these repressors are tuned such that the direct repressor has a lower affinity for binding, the two pathways will respond to different concentrations of inducers, resulting in a band-pass filter106 (Fig. 4D).
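The parity rule for layered repressors and the band-pass behavior can be illustrated with idealized Hill functions. The following Python sketch is not taken from the cited work: the dissociation constants, the Hill coefficient and the assumption that the two regulatory branches combine multiplicatively at the target promoter are all illustrative choices, and every intermediate output is normalized to the range 0 to 1.

```python
def repress(x, K, n=2):
    """Hill-type repression: fraction of maximal expression at repressor level x."""
    return 1.0 / (1.0 + (x / K) ** n)

def cascade(x, depth, K=0.1, n=2):
    """Pass the input through `depth` layered repressors."""
    signal = x
    for _ in range(depth):
        signal = repress(signal, K, n)
    return signal

def band_pass(x, K_direct=10.0, K_cascade=0.1, n=2):
    """Target gene under direct low-affinity repression (large K) and a
    high-affinity double inversion (small K); branches assumed to multiply."""
    return repress(x, K_direct, n) * cascade(x, 2, K_cascade, n)

for x in (0.01, 1.0, 100.0):
    print(
        f"input={x:<6g} 1 layer={cascade(x, 1):.3f} 2 layers={cascade(x, 2):.3f} "
        f"3 layers={cascade(x, 3):.3f} band-pass={band_pass(x):.3f}"
    )
```

With these assumed parameters the one- and three-layer cascades fall as the input rises, the two-layer cascade rises, and the band-pass output is high only at the intermediate input.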

Fig. 4 Inverters and band-pass filters. (A) Schematic representation and logic notation of an inverter. The graph shows the output of an inverter in relation to the input signal when the affinity of the repression is high (solid line) or low (dashed line). Low affinity of the input repressor results in a response shifted to higher input concentrations. (B) Schematic representation of a double inverter. In logic notation, a double inverter equals a buffer gate. The graph shows the output of a double inverter buffer gate in relation to the input signal. (C) Schematic representation of higher order repressor layering. An even number of repressors results in a buffer gate, an odd number of repressors results in inversion of the input signal. (D) Schematic representation of a band-pass filter with a graph showing direct repression with low affinity (dashed line), double repression (dotted line) and band-pass filter behavior (solid line). (E) An example of a band-pass filter with secreted alkaline phosphatase (SEAP) as output and E-KRAB and Pip as repressors. A positive feedback loop is included with the tTA to control the circuit with doxycycline (dox). In schematic representations of the circuits, nodes (transcription units or proteins) are represented as circles or boxes, lines with a dash at the end represent repression, arrows represent activation.


Allowing the genetic outputs to be controlled by more than one input results in assembly of logic gates. Kramer et al. designed OR, NOR, AND, NAND, implication and nonimplication gates by combining ON (buffer gates) and OFF (NOT gates) systems with small molecule inputs.107 Using only repressors, NOR gates can be constructed (Fig. 5A), and since these are functionally complete, by layering of just NOR gates all other Boolean operations can be constructed, for example NIMPLY gates (Fig. 5B) and OR gates (Fig. 5C). TALEs again represent a good transcription factor to use in such circuits, since (1) a large number of TALEs with high specificity can be designed, allowing construction of many layered or orthogonally parallel gates, (2) orthogonal TAL-KRAB effectors are likely to exert a similar degree of repression, resulting in high predictability of circuits, which is not necessarily true of different types of bacterial transcription factors and (3) TAL effectors are proteins which themselves can be controlled through transcriptional regulation, in contrast to the CRISPR-based repressors that require RNA for their targeting. Transcription of short RNA is performed by the RNA polymerase III, which requires the U6 promoter and is more challenging to regulate than mRNA transcription. For this reason, development of logic gates based on CRISPR lags behind other applications of CRISPR/Cas9. Methods to layer

Fig. 5 Logic gates based on NOR gate layering. (A) A schematic representation of a NOR gate circuit, the truth table for NOR and an example of a NOR gate based on TAL repressors. (B) A schematic representation of a B AND NOT A gate (B NIMPLY A), the truth table for the B NIMPLY A gate and an example of a B NIMPLY A gate based on TAL repressors. (C) A schematic representation of an OR gate circuit, the truth table for OR and an example of an OR gate based on TAL repressors. In schematic representations of the circuits, nodes (transcription units or proteins) are represented as circles or boxes, lines with a dash at the end represent repression. Inputs are labeled as A and B.
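The functional completeness of NOR gates noted for repressor-only circuits can be checked directly in Boolean algebra. The Python sketch below composes NOT, OR, AND and B NIMPLY A from a single `NOR` function; these are standard textbook constructions, and whether the TAL repressor circuits of Fig. 5 use exactly this wiring is an assumption.

```python
from itertools import product

def NOR(a, b):
    """Output is on only when neither repressor input is present."""
    return not (a or b)

def NOT(a):
    # A NOR gate with both inputs tied together acts as an inverter.
    return NOR(a, a)

def OR(a, b):
    # Inverting a NOR gate gives OR.
    return NOT(NOR(a, b))

def AND(a, b):
    # NOR of the two inverted inputs gives AND.
    return NOR(NOT(a), NOT(b))

def B_NIMPLY_A(a, b):
    # B AND NOT A: NOR of A with inverted B.
    return NOR(a, NOT(b))

print("A B | NOR OR AND B_NIMPLY_A")
for a, b in product((False, True), repeat=2):
    print(int(a), int(b), "|", int(NOR(a, b)), int(OR(a, b)),
          int(AND(a, b)), int(B_NIMPLY_A(a, b)))
```

Each extra layer in such a composition corresponds to one more repressor stage, so the signal loss from layering discussed earlier limits how deep a practical circuit can be.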


CRISPR-based genetic circuits include insertion of gRNA into introns of controlled target mRNA genes or design of Cas9-regulated polymerase III promoters, but both of these proved to be relatively noisy and no higher order gates were constructed.108 An approach using ribozymes to express gRNA as a longer mRNA and then autocatalytically remove the unnecessary 5′ and 3′ sequences along with the protective caps seems to offer more promising results, but has not yet been peer reviewed.109 While the CRISPR system exceeds any other platform with respect to the ease of construction of large libraries and multiplexing, TALEs demonstrated at least comparable activation and repression potency and are probably more efficient in construction of complex multilayered circuits.81

Arranging transcription repressors in such a way that they act upon each other in a closed loop results in even more complex behavior, such as amplification, switching or oscillation. Interestingly, it is again the number of nodes that determines this behavior: an odd number of nodes results in oscillation and an even number results in multiple steady states.110,111 For the bistable switches it has been predicted112 and demonstrated113 that in addition to mutual repression, nonlinearity is required and is often provided in the form of cooperativity. Since most bacterial transcription factors bind to their target DNA as dimers in a cooperative manner, the fulfillment of this requirement is already built into the system and bistable mutual repression switches based on bacterial transcription factors in mammalian cells have been described114 (Fig. 6A). On the other hand, TALEs bind to DNA as monomers115,116 and therefore provide no cooperativity, so in order to construct a bistable switch with designable transcription factors the addition of a feedback loop based on competitive binding of repressors and activators was required113 (Fig. 6B). Modularity of transcription factor design with VP16 and KRAB domains, but utilizing the same DBDs, greatly facilitated this effort. Another example of a genetic switch with a positive feedback was designed to be inducible in only one direction, thus providing cellular memory for encounters with external inputs or physiologically relevant signals, such as hypoxia or DNA damage117 (Fig. 6C).

Although any odd number of repressors arranged in a closed loop theoretically results in oscillation, a three node repressor based oscillator (repressilator, Fig. 6D) has only been demonstrated in bacteria9 and to date no higher order repressilators have been constructed successfully. However, there are two other interesting oscillating circuit architectures described in mammalian cells. Swinburne et al. described a single-node architecture with TetR inhibiting expression of its own gene (Fig. 6E), where in order to increase the likelihood of oscillations additional RNA and protein destabilization were included, although it has not been shown that these features are indeed required for oscillation.118 Tigges et al. developed a two-node oscillator with TetR-VP16 (tTA) driving its own expression and the expression of another transcription activator (PIT) driving the expression of antisense mRNA to knock down tTA (Fig. 6F). They showed that oscillatory behavior depended crucially on gene copy number ratio, while the absolute


Fig. 6 Genetic switches and oscillators. (A) A schematic representation of a mutual repressor switch and an example of a mutual repressor switch with Pip-KRAB and E-KRAB as repressors. Secreted alkaline phosphatase (SEAP) is fused to one of the repressors to provide a measurable output. The circuit is controlled by small molecule inputs pristinamycin (PI) and erythromycin (EM). (B) A schematic representation of a mutual repressor switch with positive feedback and an example of a mutual repressor switch with TALEs. YFP and BFP are fused to repressors to provide measurable output of each switch state. The circuit is controlled by small molecule inputs pristinamycin (PI) and erythromycin (EM). (C) A schematic representation and an example of a unidirectional positive feedback switch. The switch is activated by hypoxia. RFP is fused to the sensor to provide a measure of activation, while YFP fused to the autoregulated activator provides a measure of switch stability (memory). (D) A schematic representation of a repressilator. (E) A schematic representation and an example of a single node oscillator with delayed negative feedback. Fusion of YFP to the repressor provides a measurable output. (F) A schematic representation of an oscillator with positive feedback where a second node provides both a delayed negative feedback and regulates the transcription of YFP as an output. In schematic representations of the circuits, nodes (transcription units or proteins) are represented as circles or boxes, lines with a dash at the end represent repression, arrows represent activation.

quantity of the transfected genes influenced the amplitude and period of oscillations.119 They later explored a different approach using small interfering RNA (siRNA) instead of antisense mRNA, which resulted in a slower oscillator and relieved the strict requirement for fine balancing of gene copy number.120
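The dependence of closed-loop behavior on the number of repressor nodes can be explored with a minimal dimensionless model. The Python sketch below is not any of the cited circuits: it assumes a protein-only ring in which node i is repressed by node i-1 with a Hill function, integrated by a simple Euler scheme, and the parameters `alpha` and `hill` are arbitrary illustrative values chosen to provide strong, cooperative repression.

```python
def simulate_ring(n_nodes, x0, alpha=10.0, hill=4, dt=0.01, t_end=100.0):
    """Euler integration of dx_i/dt = alpha / (1 + x_{i-1}**hill) - x_i.

    Node i is repressed by node i-1; index -1 wraps around to close the loop.
    """
    x = list(x0)
    trace = [tuple(x)]
    for _ in range(int(t_end / dt)):
        dx = [alpha / (1.0 + x[i - 1] ** hill) - x[i] for i in range(n_nodes)]
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
        trace.append(tuple(x))
    return trace

def late_amplitude(trace, node=0):
    """Peak-to-trough range of one node over the second half of the run."""
    tail = [state[node] for state in trace[len(trace) // 2:]]
    return max(tail) - min(tail)

toggle = simulate_ring(2, (1.0, 2.0))       # even ring: mutual repression
ring3 = simulate_ring(3, (1.0, 2.0, 3.0))   # odd ring: repressilator-like

print("2 nodes, final state:", tuple(round(v, 3) for v in toggle[-1]))
print("2 nodes, late amplitude:", late_amplitude(toggle))
print("3 nodes, late amplitude:", late_amplitude(ring3))
```

Under these assumptions the two-node ring settles into the state favored by its initial condition, while the three-node ring keeps cycling; lowering `hill` toward 1 removes both behaviors, in line with the requirement for nonlinearity noted above.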

3.3 Circuits with translation regulation

One level downstream of genetic circuits, RNA circuits retain the simplicity of sequence targeting design by complementary base pairing, but allow fine-tuning of circuits with additional types of control due to the ability of RNA to form catalytically active tertiary structures. The synthetic biology of RNA in general has been reviewed in the first volume of this book,121 so we will only focus on the mechanisms of RNA that are particularly useful to circuit design. These include aptamers, ribozymes, RNA binding proteins, and RNA interference.

Aptamers are RNA ligand-response elements, able to bind to external signals ranging from small molecules to large biochemical polymers. Aptamers can be designed by the evolutionary method SELEX122 to bind to selected targets (for a recent review see Darmostuk et al.123), but the most well characterized and widely used aptamer remains the theophylline binding aptamer124 due to its high affinity, selectivity and robust performance in live mammalian cells. The other key property of aptamers is the fact that they can be incorporated into other RNA molecules, making these responsive to external stimuli. For example, a secondary structure rearrangement upon ligand binding can cause a translation start site in mRNA to become obstructed or interfere with ribosomal scanning and thus prevent translation.125 Interestingly, translational activation is not usually achieved through ligand binding to 5′ UTR aptamers but instead relies on ribozymes.

Ribozymes in the context of RNA circuit design refer to short RNA sequences with the ability to autocatalytically excise themselves from a longer RNA molecule. They can be used to generate short RNAs from longer transcription cassettes, for example to facilitate mRNA degradation by removing the 5′ cap or the 3′ poly(A) sequence from the transcribed mRNA. Importantly, ribozymes can cut either at their 5′ or their 3′ terminus. These functions are performed for example by the hepatitis delta virus ribozyme and the hammerhead ribozymes, respectively, and the use of both types of ribozymes enables the removal of both end modifications.109 Ribozymes gain further functionality when aptamers are included in their sequence, making them ligand-dependent. In this way, translation of target proteins can be inhibited by ligand-induced ribozyme (aptazyme) cleavage126,127 (Fig. 7A) or initiated by ligand-induced aptazyme deactivation128 (Fig. 7B).

Just like transcription factors bind DNA, RNA-binding proteins (RBPs) can bind RNA in a sequence specific manner. Most often, the bacteriophage MS2 coat protein129,130 and the archaeal L7Ae protein131,132 are used to bind target RNA, even though their target sequence cannot be selectively designed. These two proteins have been used to inhibit translation when bound in the 5′ UTR133 as well as to control mRNA splicing19 (Fig. 7C). A strategy to convert the RBP OFF-switch into an ON-switch also exists and is based on the inclusion of a 5′ RBP binding site, a premature stop codon and an internal ribosome entry site (IRES) before the coding sequence for the protein of interest. In this way the mRNA is degraded through nonsense-mediated mRNA decay in the absence of an RBP, but in its presence, the stop codons are obstructed and the protein is translated from the IRES134 (Fig. 7D). TetR can also be used to bind to RNA, introducing ligand responsiveness and allowing interesting interplay between the DNA and RNA levels.135 Parallel to the modular structure and designable targeting of TALEs to DNA, designable targeting of RNA can be achieved with the use of PUF proteins, which are also composed of modular repeats, each binding to a specific RNA base in a predictable manner,136–138 and can be fused to repression and activation domains to either downregulate or upregulate protein expression.139 Additionally, PUFs can be used as


Fig. 7 Mechanisms of translation regulation. (A) A ribozyme included in the 3′ untranslated region of mRNA will, upon excision, mark the mRNA for degradation. The ribozyme can be a ligand-dependent aptazyme, allowing inducible degradation of mRNA. If ligand binding enables self-cleavage of the ribozyme, this results in an OFF switch. (B) An ON switch is obtained when the ribozyme cleavage is inhibited by ligand binding, allowing stabilization of mRNA and translation. (C) Aptamers or proteins binding in the intron regions of mRNA influence splicing. In the depicted example, a protein consisting of three exons is spliced so that only two exons are translated in the presence of an RNA-binding protein. (D) Inclusion of a premature stop codon marks the mRNA for degradation. An aptamer or an RNA-binding protein in the 3′ untranslated region can repress the translation of the nonsense sequence, preventing degradation and thus functioning as an on switch for expression of a protein coded downstream of an internal ribosome entry site (IRES).

designable splicing factors140 and even as designable RNA endonucleases to parallel TAL endonucleases.141 Finally, RNA interference, a naturally prominent mechanism of RNA regulation, can also be adapted to synthetic biologic circuits. Antisense RNA strategies have largely been replaced with short RNA interference, with the caveat that the latter relies on DICER processing and the RISC complex, which are only present in eukaryotes.142 Endogenous microRNAs (miRNAs) can be used for circuit regulation, allowing the circuit to recognize the host cell type, for example for identification and selective targeting of cancer cells.16 Artificial miRNAs and short hairpin RNAs (shRNAs) can be used as synthetic circuit inputs, especially when combined with other strategies, such as aptamers and aptazymes, to make them ligand responsive.17,18,143,144

3.4 Circuits combining transcription and translation

Significant improvements of genetic circuits can be obtained through the interplay of transcription and translation regulation. Using translation regulation of reporters with RNA-binding proteins which were themselves under the transcriptional regulation of inducible transcription factors, Ausländer et al. designed NIMPLY logic gates, where one small-molecule input signal was used to repress transcription of a reporter with an MS2 or L7Ae binding site and another input signal was used to repress transcription of the RBP (Fig. 8A). In the presence of only the RBP-repressing input, the reporter was transcribed and translated, but if both inputs were absent, the transcription of the RBP prevented translation of the reporter, while the presence of both inputs repressed the reporter directly. Wiring two NIMPLY gates such that each input activated transcription of the reporter in one gate and of the RBP in the other gate resulted in an XOR gate, which allowed for the assembly of a half-adder and a half-subtractor.145 A seminal work by Deans et al. demonstrated that tight, tunable and reversible control of transgene expression can be obtained through coupling of two transcription repressors (TetR and LacI) with shRNA interference. A circuit was designed in which the gene of interest was repressed by LacI and at the same time protein expression from any leaky transcription was knocked down by shRNA. The LacI repressor also controlled the expression of a TetR gene, which, when transcribed, inhibited transcription of shRNA (Fig. 8B). In this way induction with IPTG relieved both the repression and the knockdown and allowed efficient expression of the protein of interest, while uninduced cells produced no detectable

Fig. 8 Genetic circuits based on transcription and translation regulation. (A) A schematic representation of an XOR logic gate, operated by erythromycin (EM) and phloretin (Ph), and the truth table for XOR. Two states of the logic gate are shown: on the left, the circuit is shown in the presence of phloretin, and on the right, the circuit is shown in the presence of erythromycin. Solid lines represent active repression, dotted lines represent absence of repression due to the input signals. Repressed nodes in the circuit are shown as white boxes and absent inputs are shown as white hexagons. (B) An inducible expression circuit with increased safety. The output EGFP is controlled through a direct repression and a triple repression mechanism, of which one step is implemented through shRNA. The circuit is activated by the addition of IPTG.

quantities of target protein, making this system safe for expression of even highly toxic proteins.146 A similarly safe system was designed by Greber et al., who cloned a tagged gene of interest under a promoter controlled by a small-molecule inducible transcription factor. An intronic siRNA against the tag was inserted into the gene coding for the transcription factor. In this way, the translation of the protein of interest was constitutively knocked down, which resulted in tighter regulation, although at the cost of reduced maximum expression. Importantly, a toggle switch with mutual repression supplemented with the siRNA control mechanism exhibited a much higher dynamic range than without siRNA.147

Different cell types, and particularly cancer cells, have different expression profiles of endogenous miRNA, but to differentiate between them with high certainty, it is important to sense not only a single miRNA with high expression but also those that are absent in a particular cell type. As miRNAs act as repressors of translation by facilitating mRNA degradation, they can be considered inverters (NOT gates). Combining a single and a double inversion circuit with miRNA inputs, Xie et al. designed a cell classifier. In this circuit the miRNAs expressed at low levels in target cells were designed to inhibit translation of the reporter gene directly and miRNAs expressed at high levels inhibited expression of a LacI repressor, which in turn also controlled the gene of interest (Fig. 9A). Interestingly, miRNAs in the single inversion step can be designed to act on the same transcript (NOR gate), while the double repression miRNAs must each act on its own LacI transcript to result in an AND gate (Fig. 9B), otherwise an OR gate would be obtained (Fig. 9C). The system was used to design a complex gate integrating information from a large number of endogenous miRNAs (Fig. 9D) and further improved by placing LacI under the control of an activator, whose translation was repressed by the same highly expressed miRNAs. Additionally, an exogenous miRNA was encoded along with LacI to enhance repression. Replacing the reporter gene with a suicide gene, resulting in selective destruction of a selected cell type, has a high therapeutic potential.16

Strikingly, the existence of self-replicating RNA molecules and subgenomic promoters that allow transcription of RNAs from RNAs can even make RNA circuits completely independent of the genetic encoding at the level of DNA. Synthetic logic circuits, sensors and switches can thus be operated entirely by RNA and proteins encoded in it. Wroblewska et al. re-designed the cell classifier by encoding it on an alphaviral RNA replicon and replacing the LacI repressor with an L7Ae protein binding in the 5′ untranslated region. They also designed a double and triple inversion circuit and a mutual repressor switch using the MS2-CNOT7 fusion protein, which also acts as an inhibitor of translation when bound to the 3′ untranslated region of mRNA.148 These types of circuits are safer in comparison to DNA-encoded circuits, because they pose no risk of genomic integration.
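The gate logic described in this section can be summarized as a small Boolean sketch. It is only an abstraction of the wiring described in the text, not a model of the published constructs: inputs are treated as simply present or absent, and all function names are hypothetical.

```python
def nimply(a, b):
    """A NIMPLY B: output is ON only when input a is present and input b is absent."""
    return a and not b


def xor_gate(a, b):
    """Two NIMPLY gates with swapped input roles driving the same reporter."""
    return nimply(a, b) or nimply(b, a)


def half_adder(a, b):
    """Return (sum, carry) for the one-bit addition a + b."""
    return xor_gate(a, b), a and b


def half_subtractor(a, b):
    """Return (difference, borrow) for the one-bit subtraction a - b."""
    return xor_gate(a, b), nimply(b, a)


def classifier(high_markers, low_markers):
    """miRNA classifier: ON only if all high markers are present and all low markers absent."""
    # Double inversion: each high-marker miRNA silences its own LacI transcript,
    # so some LacI persists whenever at least one high marker is missing.
    laci_present = not all(high_markers)
    # Single inversion: any low-marker miRNA knocks down the output transcript directly.
    output_knocked_down = any(low_markers)
    return not laci_present and not output_knocked_down


if __name__ == "__main__":
    for a in (False, True):
        for b in (False, True):
            print(a, b, nimply(a, b), xor_gate(a, b), half_adder(a, b))
```

In this abstraction the requirement that each high-marker miRNA acts on its own LacI transcript appears as `all(high_markers)`; letting them act on a single shared transcript would correspond to `any(high_markers)`, the OR behavior mentioned above.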

3.5 Circuits with posttranslational regulation

While most engineered genetic circuits are based on transcriptional and translational regulation, there are advantages to regulation on the

Fig. 9 Logic gates based on miRNA operated inverters. (A) A schematic representation of a B AND NOT A gate, the truth table for B AND NOT A and an example of a B AND NOT A gate operated with miR-21 as the doubly inverted input and miR-141, miR-142(3p) and miR-146a as single inverted inputs. (B) A schematic representation of an AND gate, the truth table for AND and an example of an AND gate operated with miR-21 and miR-17-30a. The miRNA inputs act on the LacI repressor as well as on the rtTA activator of LacI. The circuit functions in the presence of doxycycline. (C) A schematic representation of an OR gate, the truth table for OR and an example of an OR gate operated with miR-17 and miR-30a. (D) A multi-input gate with inputs A and B integrated as AND and input C repressing the output directly, so it is integrated as AND NOT. The truth table shows only the state of the circuit with the positive output.

protein level. Most importantly, regulation of proteins through transcription or translation creates a delay, but some natural circuits are able to respond faster by using signaling pathways based on protein interactions or their modifications. In some applications, a fast response is crucial, especially for therapeutic purposes, e.g. release of hormones, such as insulin, or neurotransmitters, where the response is required within minutes or seconds.

3.5.1 Phosphorylation. Fast signal processing in cells is mainly performed by the phosphorylation/dephosphorylation cascades catalyzed by protein kinases and phosphatases. Phosphorylation/dephosphorylation cascades are rapid and reversible; however, the design of selective small molecules for pharmacological perturbation of native protein kinases or phosphatases remains a challenge. One option for the design of synthetic phosphoregulation pathways is through small-molecule-controlled kinases and phosphatases.149 Camacho-Soto et al. prepared split tyrosine kinases (TK)49 and split tyrosine phosphatases (TP)48 as two functionally inactive fragments appended to CIDs (rapamycin, abscisic acid and gibberellin inducible dimerization domains). This allowed their control with external inputs, but did not enable assembly of a complete signaling pathway orthogonal to endogenous signaling. In fact, few attempts at the design of fully synthetic phosphorylation cascades have been made, as kinase specificity selective enough to maintain complex orthogonal signaling pathways has proven difficult to engineer. Ryu and Park took advantage of the three-tiered mitogen-activated protein kinase (MAPK) cascade to design modular protein-protein interaction signaling pathways in mammalian cells.150 In MAPK signaling three consecutive kinases act upon one another and the flow of information is often directed by a scaffold protein, but functional domains of the scaffold often overlap.151 In the engineered MAPK system, the kinases of the yeast mating pathway were transferred to a mammalian chassis and the native scaffold protein Ste5 was replaced with well-characterized and modular protein interaction domains, such as PDZ and MTD.150 This represents a step toward rewiring of MAPK pathways, but the complexity of the MAPK interactions is not yet understood well enough to allow generation of completely orthogonal circuits with designed inputs and outputs.
Phosphoregulation has been used to design some interesting biologic devices in combination with transcriptional regulation. If multiple targets for a transcription factor are present, binding to any of the targets will sequester the transcription factor and reduce its availability to bind to others. This creates a load that interferes with the circuit operation. Even though synthetic biology aims to create modular designs in which each module acts predictably upon modules downstream of it, such retroactive loads disrupt this predictability and complicate circuit design. Mishra et al. addressed this retroactivity by combining transcriptional control with phosphoregulation cascades. A load driver was engineered in which a quasi-steady state is established faster than in the transcription regulation system, thus allowing the device to respond more predictably to input changes.152

3.5.2 Proteolysis. Like phosphorylation/dephosphorylation, proteolysis is fast, but it is irreversible. However, to our benefit, the principles of protease target recognition and cleavage are well understood and orthogonal protease-proteolytic target pairs are easier to engineer. The tobacco etch virus protease (TEV) is the best characterized and most widely used protease in synthetic biology. A split version

of TEV protease with inducible reconstitution was engineered and offers an attractive method to regulate the protease activity.153 In addition to the widely used TEV protease, other proteases with orthogonal cleavage sites154–157 could be used to provide a tool for modulating multiple fast signaling processes in mammalian cells. Proteases have been used for selective degradation of target proteins, for example by exposing cryptic degrons, or, conversely, to induce target binding or catalytic activity by removal of inhibition domains.156 Stein and Alexandrov developed an autoinhibited hepatitis C virus protease, which could be activated by the removal of the autoinhibition peptide by the TVMV protease,157 while Shekhawat et al. developed autoinhibited coiled-coil pairs, in principle useful for induction of dimerization through proteolytic removal of the inhibitory peptide, and demonstrated their use on the in vitro example of a split luciferase reporter.158 On the other hand, conditional protein depletion can be achieved through the use of N-terminal degrons. These protein domains induce protein degradation based on the N-end rule through recognition of the N-terminal degron residues by ubiquitin ligase, which mediates polyubiquitination, marking the protein for proteasomal degradation. The earliest described N-terminal degron was a peptide derived from the yeast temperature sensitive dihydrofolate reductase (DHFR).159 In synthetic circuits, degrons have been designed to be regulated, for example, by proteolytic cleavage, and a genetic circuit using potyviral proteases to control degrons was implemented in bacteria, but not in mammalian cells.156 A small-molecule inducible system for rapid protein depletion was developed based on the plant auxin-dependent degradation pathway. Auxins bind to the TIR1 protein and induce interaction between TIR1 and the auxin inducible degron (AID).
TIR1 recruits the multi-protein E3 ubiquitin ligase complex SCF, therefore interaction between TIR1 and the AID degron leads to ubiquitination and protein depletion. Because the TIR1-SCF interaction is based on highly conserved protein domains but TIR1 is only present in plants, the auxin-inducible degron (AID) enables orthogonal, reversible and tunable protein degradation in a range of eukaryotic cells, from yeast to mammals.160

3.6 Circuits with recombination

While RNA circuits can be independent of genetic encoding and posttranslational regulation is fast and often reversible, recombinases enable a completely different approach to biologic circuit design as they result in permanent genetic modifications. By their mechanism of action, recombinases are divided into two groups: tyrosine-mediated site-specific recombinases, which can mediate gene expression by excision of a transcription terminator,161 and serine-mediated site-specific integrases, which can influence gene expression by inversion of the gene of interest.162 The tyrosine recombinases Cre/loxP and Flp/FRT and serine recombinases, including the PhiC31 and Bxb1 integrases, are fully orthogonal. Additionally, inducible split versions of both Flp and Cre recombinases are available.

Recombination-based cellular memory and counters were first developed in bacteria, but here we would like to point out some unintuitive applications in mammalian cells. Lapique et al. used recombinases to create a time delay circuit. They realized that upon transfection of an inducible target gene along with its repressor for an inverter or double inverter gate, a pulse of high expression will occur before sufficient repressor accumulates to inhibit transcription of the target gene. Only after that pulse does the expression of the target gene become predictably inducible by derepression. To avoid this pulse, the target gene was placed between recombination sites in a reverse orientation relative to its promoter and cotransfected with both a repressor and a recombinase. In the time that the recombinase is expressed and inverts the target gene, enough repressor accumulates to prevent the initial pulse. They showed that such a circuit can be used to safely express a toxic gene in response to specific cell type inputs, such as miRNA, thus improving the cell-classification circuit described previously by the same group.163

Site-specific recombinases can act as activators or repressors of transcription within the same cell, integrating signals on a single transcriptional layer. However, implementation of complex functions requires design of heterospecific recognition sites, which are orthogonal with respect to each other, but can be processed by the same recombinase. The most complex circuit utilizing orthogonal recombinases and heterospecific recombination sites to date is a logic circuit designed by Weinberg et al. (Fig. 10). While logic gates are usually designed ad hoc with multiple promoters or copies of the output gene as required, they decided on a more general approach in which a number of genetic "addresses" are designed, corresponding to each possible combination of input signals (four addresses for a two-input system, eight for a three-input system and so on).
Each address can code for zero, one or more output genes, and which address gets transcribed is determined by the action of recombinases, inducible by input signals. This ensures that the basic coding architecture for all logic gates will be the same, but does not yet represent a general system, as the outputs inside the addresses again need to be recoded ad hoc for each specific logic gate. They therefore took it one step further and encoded a transcriptionally inactive output into each address, each flanked by a pair of recombination sites orthogonal to all other recombination sites in the system. In this way it was demonstrated that all possible two-input Boolean logic gates can be encoded by the same genetic construct, from which one can select any one gate by the action of four recombinases acting on the addresses to repair the output gene at that address. The final output is then produced by the two inputs inducing another two orthogonal recombinases selecting which address will get transcribed. In its final implementation this system thus requires a large number of orthogonal recombinases, some of them with the requirement for heterospecific target sites, but the existence of such an orthogonal set has also been demonstrated.164
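The address-based architecture can be illustrated with a short Boolean sketch. It assumes an idealized system in which each of the four selector recombinases fully repairs the output at exactly one address and the two input recombinases always route transcription to the address matching the input combination. The names `program_gate`, `evaluate` and `truth_table` are hypothetical and not taken from the cited work.

```python
from itertools import product

# One address per combination of the two inputs (A, B).
INPUT_STATES = list(product([False, True], repeat=2))


def program_gate(selectors):
    """Return the set of addresses whose output was repaired.

    `selectors` holds four booleans, one per hypothetical selector
    recombinase, stating whether that recombinase was applied.
    """
    return {state for state, applied in zip(INPUT_STATES, selectors) if applied}


def evaluate(active_addresses, a, b):
    """The input recombinases pick one address; output is ON if it was repaired."""
    return (a, b) in active_addresses


def truth_table(selectors):
    """Output of the programmed construct for every input combination."""
    active = program_gate(selectors)
    return tuple(evaluate(active, a, b) for a, b in INPUT_STATES)


if __name__ == "__main__":
    # Every choice of selector recombinases yields a distinct truth table,
    # so one construct covers all two-input Boolean functions.
    tables = {truth_table(s) for s in product([False, True], repeat=4)}
    print(len(tables))
    print("AND:", truth_table((False, False, False, True)))
    print("XOR:", truth_table((False, True, True, False)))
```

Because each address corresponds to exactly one row of the truth table, choosing which outputs to repair is equivalent to writing the truth table directly, which is why a single construct can be programmed into any two-input gate.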

Fig. 10 A complex logic gate implemented with recombinases. A general coding sequence for two-input functions contains four inactive transcription units, termed addresses. In stage 1, the target sites of logic function selecting recombinases are shown as grey (colored) triangles and the other recombination sites are shown as white shapes. Application of logic function selecting recombinases results in either inversion of the reverse coding sequence or removal of a terminator from upstream of a coding sequence in the selected addresses. In stage two, input recombinases are applied and their target sites are shown in grey (color), while other recombination sites are shown in white. Recombinase B has two heterospecific target pairs. The truth table shows the coding sequence and the output protein after recombination with each combination of input recombinases.

4 Therapeutic applications of synthetic biology

Engineered synthetic circuits in mammalian cells provide powerful approaches to sense and control levels of crucial disease-relevant metabolites, and they offer promising strategies for cell-based therapeutic applications as the additional level of control provides higher accuracy, efficiency and safety. In this final section, we will review some of the proof-of-concept studies of treatment of human diseases with biologic networks. Gene therapy based on engineering cells from patients has been the topic of research for more than a decade; however, the simple introduction of a selected gene into engineered cells has been surpassed by the design of autonomous therapeutic devices able to sense a

combination of different signals, process this information and produce the desired therapeutic effectors (Fig. 11). In addition to engineered patient cells, encapsulated (e.g. in alginate beads) universal therapeutic cells have been tested in several different animal studies to regulate hormones in diabetic patients, treat obesity,22 metabolic syndrome,165 and hypertension24 as well as diseases such as psoriasis,25 gout23 and colitis.26 Unless treated in time, the chronically deregulated blood glucose levels of diabetes mellitus, a complex and progressive metabolic disorder, can initiate downstream pathologic cascades. Since glucose levels in the body are in constant flux, a designer sensor-effector cellular device for glycemic control and insulin delivery represents an important state-of-the-art synthetic biology strategy for treatment of diabetes. Extrapancreatic cells were reprogrammed to sense physiological blood glucose levels by mimicking the β-cell glucose sensing cascade through glycolysis-mediated calcium entry with voltage-gated calcium channels (CaV1.3). The calcium channel CaV1.3 enables β-cells as well as engineered cells to profile physiological blood glucose levels in a precise and reversible

Fig. 11 Therapeutic applications of synthetic biologic devices. The designed biologic circuits can either be inserted in universal cells and applied in encapsulation or they can be inserted into patients' own cells, which are then returned into the body. (A) An example application of a universal cellular device is provided by a circuit responding to inflammation. The circuit is fine-tuned by an amplifier, a thresholder and regulated by an inducible off switch. (B) An example of personalized cell therapy is provided by modified T-cells that contain a synthetic Notch receptor (synNotch) responding to CD19, which activates a chimeric antigen receptor (CAR) responding to mesothelin (meso). In this way, T-cells are activated only in the presence of both signals.

manner. Ca2+ influx as a result of opening of CaV1.3 channels due to nonphysiological levels of glucose promotes intracellular calmodulin-calcineurin signaling and nuclear translocation of calcium-responsive transcription factors to induce gene expression, resulting in insulin secretion. Coupling of CaV1.3-based glucose sensing to insulin production and secretion resulted in β-cell mimetic designer cells that provide closed-loop glycemic control.166

Another interesting cellular device employing therapeutic gene circuits was developed for sensing and suppressing inflammation. Inflammation, a complex biological response involving the innate immune system, is usually elicited by pathogens, injury or damaged tissue and serves to eliminate pathogens and restore homeostasis. However, aberrant inflammatory signaling is harmful for the organism and may lead to chronic diseases, such as rheumatoid arthritis and multiple sclerosis, due to amplified signals through positive feedback loops.167 Fast and precise suppression of inflammation is therefore a key challenge for successful therapy of inflammatory diseases. Smole et al. developed an anti-inflammatory cell device based on engineered encapsulated mammalian cells able to detect inflammatory mediators at physiologically relevant levels and suppress inflammation by production of anti-inflammatory proteins. Furthermore, to enable a fine-tuned and precise response, the device additionally contains a signal amplifier with a positive feedback thresholder (Fig. 11A). This module is designed so that activation of the sensor results in transcription of a positively autoregulated transcription factor, but when the signal is low, a competitive DNA-binding domain prevents activation of the amplifier.
To prevent long-term systemic inhibition of beneficial inflammatory signaling, an externally inducible switch-off system is included and represents a major improvement of this therapeutic cellular device compared to already established gene therapy.26

It is worth noting that synthetic biologic sense-and-respond circuits have been proposed not only for use in therapies, but also for veterinary practice to facilitate livestock breeding. Artificial insemination is standard practice in cattle breeding; however, reproduction in mammals is narrowly time-regulated through hormone-controlled female ovulation, and sperm delivery must match this fertility window.168 Ovulation in mammals is triggered by the release of luteinizing hormone (LH) from the pituitary gland, and its binding to the LH receptor (LHR) activates GPCR signaling, resulting in release of an oocyte. To address the challenges of sperm delivery timing in cattle artificial insemination, a synthetic biology device which enables coordination of ovulation and sperm delivery was engineered and tailored to dairy cows. The device is composed of sensor cells and spermatozoa encapsulated in cellulose sulphate capsules and implanted into a cow's uterus. Sensor cells expressing LHR monitor levels of LH and upon detection of oestrus produce cellulase, which degrades the capsules and enables release of sperm. The artificial insemination device was validated in cows; fine tuning of the device could facilitate its use in other species.169

Probably the greatest success of therapeutic application of synthetic biology is represented by CAR T-cell-based cancer immunotherapy.

Recently, clinical implementation of two important advances in cancer immunotherapy was reported: engineered T-cell receptors (TCRs)170 and T-cells with chimeric antigen receptors (CARs).171 In their latest generation, CARs consist of an engineered extracellular recognition domain, which is most often a single-chain antibody against a selected cancer antigen, the cytoplasmic activation domain CD3ζ and costimulatory domains like CD28 and 4-1BB. The activation and costimulatory domains trigger T-cell activation and expansion that enables retargeting of T-cells towards cancer elimination.172–174 Despite broad applicability, however, CAR T-cell therapy displays several drawbacks, including potential severe side effects and a limited number of absolutely tumor-specific antigens. Several additional synthetic biology tools have thus been recently developed to precisely regulate CAR T-cell proliferation to prevent side effects such as cytokine storms.171 Small-molecule control175 and engineering of dual-antigen sensing to create AND logic gates176 with improved specificity (Fig. 11B), as well as kill switches177 to deplete the therapeutic cells when they are no longer needed, were implemented. In currently ongoing clinical trials of CAR T-cell cancer immunotherapy, targeting of cancer cells is based on the B-cell antigen CD19, which is present on both healthy and malignant B cells.178 To bypass the problem of cross-reactivity to healthy cells, a strategy with dual-antigen sensing using two independent CARs in the same cell was developed. A CAR specific for the B-cell marker CD19 is used to mediate CD3ζ activation, while a CAR specific for the prostate stem cell antigen (PSCA) is used for costimulation through the CD28 and 4-1BB signaling domains. The implementation of an AND gate should enhance on-target activity by binding of both CARs at the same time to nearby target cells and result in specific activation of T-cells.
However, the anti-CD19 CAR was able to activate T-cells without costimulatory signals from the second PSCA-specific CAR. By minimization of the T-cell activation through introduction of CARs with diminished activity and by switching from the combination of CD19 and PSCA antigens to PSCA and PSMA, more specific T-cell activation was achieved.179 An even more robust dual-receptor AND gate was created using CARs in combination with synthetic Notch receptor signaling (synNotch). The extracellular domain of synNotch was engineered to target CD19, but unlike with CARs, binding does not trigger T-cell activation. Upon antigen binding, the intracellular part of the synNotch receptor is cleaved, which results in release of a transcription factor that drives inducible expression of a CAR specific for the second antigen. The synNotch-gated CAR expression displayed a highly specific response to multiple antigens and precise therapeutic discrimination in vivo. A high level of modularity of synNotch receptor components provides flexibility in engineering cells with customized sensing and response. Thus, the synNotch-CAR dual receptor facilitates immune recognition of a wider range of tumors.176 Additionally, the safety of cell-based therapy, which includes stem cells and cancer immunotherapy, is an important issue. Several kill switches have been developed to terminate the introduced cells upon application of a signal (e.g. doxycycline) in order to prevent excessive activation of

engineered T-cells or development of cancer from stem cells.180 However, many of the proposed solutions are quite sensitive to escape mutations, which can inactivate the kill switch, and such mutated cells would preferentially survive and multiply in the organism as a result of selective pressure. Safety switches that bypass this mutation sensitivity based on incorporation of unnatural amino acids have been implemented in bacterial cells181,182 and it is likely that a similar approach could function in mammalian cells and in in vivo applications in patients.
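The two-step logic of the synNotch-gated CAR and the role of a kill switch can be summarized in a minimal Boolean sketch. It assumes idealized, leak-free receptors and treats each antigen and the kill signal as simply present or absent. The antigen pairing follows Fig. 11B, and the function name and parameters are hypothetical.

```python
def t_cell_activated(cd19_present, mesothelin_present, kill_signal=False):
    """Idealized synNotch-gated CAR T-cell with an externally triggered kill switch."""
    # A triggered kill switch removes the therapeutic cell regardless of antigens.
    if kill_signal:
        return False
    # Step 1: synNotch binds CD19, is cleaved and releases a transcription
    # factor that drives expression of the CAR; binding alone does not activate.
    car_expressed = cd19_present
    # Step 2: the induced CAR activates the T-cell only when its own antigen is bound.
    return car_expressed and mesothelin_present


if __name__ == "__main__":
    for cd19 in (False, True):
        for meso in (False, True):
            print(cd19, meso, t_cell_activated(cd19, meso))
```

The sequential structure is what distinguishes this design from two parallel CARs: the second receptor does not exist until the first antigen has been sensed, so in this idealization neither antigen alone can produce activation.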

5 Conclusions and outlook

Medical applications represent some of the most exciting areas of the development of synthetic biology. A wide array of different tools to regulate cellular response and process combinations of inputs has been developed and their function tested in mammalian cells and often also in animal models. Most of the tools that work in bacteria could also be implemented in mammalian cells. The most important remaining challenge of tool development in mammalian cell biology is the construction of fast response circuits with response times of seconds to minutes, which are required in many medically relevant conditions. It is also worth noting that, apart from cancer immunotherapy, no other therapeutic applications of synthetic biology have yet been used in the clinic. Based on the available tools, robust and precise regulation could soon be translated into clinical trials and practice. While cell therapy represents a therapeutic platform different from the existing pharmacological pillars composed of small molecules and biological drugs, major pharmaceutical companies are actively exploring and developing cell-based therapy. However, personalized cell therapy is bound to remain expensive due to the costs of individualized cell culturing. Introduction of encapsulated universal cells represents a tempting alternative as it could provide a more cost-effective and well-validated therapeutic approach, but novel delivery agents need to be developed. It is likely that we will see clinical applications of synthetic biology within the next decade.

Acknowledgements

This work was supported by funding from the Slovenian Research Agency, grant numbers P4-0176, J3-6791 and J1-6740.

References

1 D. G. Gibson, J. I. Glass, C. Lartigue, V. N. Noskov, R.-Y. Chuang, M. A. Algire, G. A. Benders, M. G. Montague, L. Ma, M. M. Moodie, C. Merryman, S. Vashee, R. Krishnakumar, N. Assad-Garcia, C. Andrews-Pfannkoch, E. A. Denisova, L. Young, Z.-Q. Qi, T. H. Segall-Shapiro, C. H. Calvey, P. P. Parmar, C. A. Hutchison, H. O. Smith and J. C. Venter, Science, 2010, 329, 52.
2 S. M. Richardson, L. A. Mitchell, G. Stracquadanio, K. Yang, J. S. Dymond, J. E. DiCarlo, D. Lee, C. L. V. Huang, S. Chandrasegaran, Y. Cai, J. D. Boeke and J. S. Bader, Science, 2017, 355, 1040.

Synthetic Biology, 2018, 2, 1–34 | 27

Published on 23 November 2017 on http://pubs.rsc.org | doi:10.1039/9781782622789-00001

3 Z. Kis, H. S. Pereira, T. Homma, R. M. Pedrigi and R. Krams, J. R. Soc., Interface, 2015, 12, 20141000.
4 M. Xie and M. Fussenegger, Biotechnol. J., 2015, 10, 1005.
5 A. Dobrin, P. Saxena and M. Fussenegger, Integr. Biol., 2016, 8, 409.
6 F. Lienert, J. J. Lohmueller, A. Garg and P. A. Silver, Nat. Rev. Mol. Cell Biol., 2014, 15, 95.
7 J. Bonnet, P. Yin, M. E. Ortiz, S. Pakpoom and D. Endy, Science, 2013, 340, 599.
8 R. Silva-Rocha and V. de Lorenzo, FEBS Lett., 2008, 582, 1237.
9 M. B. Elowitz and S. Leibler, Nature, 2000, 403, 335.
10 T. S. Gardner, C. R. Cantor and J. J. Collins, Nature, 2000, 403, 339.
11 T. Sohka, R. A. Heins, R. M. Phelan, J. M. Greisler, C. A. Townsend and M. Ostermeier, Proc. Natl. Acad. Sci. U. S. A., 2009, 106, 10135.
12 J. J. Lohmueller, T. Z. Armel and P. A. Silver, Nucleic Acids Res., 2012, 40, 5180.
13 F. Lienert, J. P. Torella, J. H. Chen, M. Norsworthy, R. R. Richardson and P. A. Silver, Nucleic Acids Res., 2013, 41, 9967.
14 R. Gaber, T. Lebar, A. Majerle, B. Šter, A. Dobnikar, M. Benčina and R. Jerala, Nat. Chem. Biol., 2014, 10, 203.
15 S. Slomovic and J. J. Collins, Nat. Methods, 2015, 12, 1085.
16 Z. Xie, L. Wroblewska, L. Prochazka, R. Weiss and Y. Benenson, Science, 2011, 333, 1307.
17 C. L. Beisel, Y. Y. Chen, S. J. Culler, K. G. Hoff and C. D. Smolke, Nucleic Acids Res., 2011, 39, 2981.
18 S. Kashida, T. Inoue and H. Saito, Nucleic Acids Res., 2012, 40, 9369.
19 S. J. Culler, K. G. Hoff and C. D. Smolke, Science, 2010, 330, 1251.
20 D. Ausländer, S. Ausländer, G. Charpin-El Hamri, F. Sedlmayer, M. Müller, O. Frey, A. Hierlemann, J. Stelling and M. Fussenegger, Mol. Cell, 2014, 55, 397.
21 H. Ye, M. D.-E. Baba, R.-W. Peng and M. Fussenegger, Science, 2011, 332, 1565.
22 K. Rössger, G. Charpin-El-Hamri and M. Fussenegger, Nat. Commun., 2013, 4, 1.
23 C. Kemmer, M. Gitzinger, M. Daoud-El Baba, V. Djonov, J. Stelling and M. Fussenegger, Nat. Biotechnol., 2010, 28, 355.
24 K. Rössger, G. C. Hamri and M. Fussenegger, Proc. Natl. Acad. Sci. U. S. A., 2013, 110, 18150.
25 L. Schukur, B. Geering, G. Charpin-El Hamri and M. Fussenegger, Sci. Transl. Med., 2015, 7, 318ra201.
26 A. Smole, D. Lainšček, U. Bezeljak, S. Horvat and R. Jerala, Mol. Ther., 2017, 25, 102.
27 W. Weber and M. Fussenegger, Handb. Exp. Pharmacol., 2007, 178, 73.
28 B. C. Stanton, A. A. K. Nielsen, A. Tamsir, K. Clancy, T. Peterson and C. A. Voigt, Nat. Chem. Biol., 2013, 10, 99.
29 J. F. Margolin, J. R. Friedman, W. K. Meyer, H. Vissing, H. J. Thiesen and F. J. Rauscher, Proc. Natl. Acad. Sci. U. S. A., 1994, 91, 4509.
30 I. Sadowski, J. Ma, S. Triezenberg and M. Ptashne, Nature, 1988, 335, 563.
31 M. L. Schmitz and P. A. Baeuerle, EMBO J., 1991, 10, 3805.
32 M. Gossen and H. Bujard, Proc. Natl. Acad. Sci. U. S. A., 1992, 89, 5547.
33 U. Deuschle, W. K. Meyer and H. J. Thiesen, Mol. Cell. Biol., 1995, 15, 1907.
34 M. Gossen, S. Freundlieb, G. Bender, G. Müller, W. Hillen and H. Bujard, Science, 1995, 268, 1766.
35 M. Müller, S. Ausländer, A. Spinnler, D. Ausländer, J. Sikorski, M. Folcher and M. Fussenegger, Nat. Chem. Biol., 2017, 13, 309.
36 M. Gitzinger, C. Kemmer, M. D. El-Baba, W. Weber and M. Fussenegger, Proc. Natl. Acad. Sci. U. S. A., 2009, 106, 10638.
37 W. E. Miller and R. J. Lefkowitz, Curr. Opin. Cell Biol., 2001, 13, 139.
38 D. Ausländer, B. Eggerschwiler, C. Kemmer, B. Geering, S. Ausländer and M. Fussenegger, Nat. Commun., 2014, 5, 4408.
39 L. M. Luttrell, in Transmembrane Signaling Protocols, ed. H. Ali and B. Haribabu, Humana Press, New Jersey, 2nd edn, 2006, vol. 332, pp. 3–49.
40 G. Barnea, W. Strapps, G. Herrada, Y. Berman, J. Ong, B. Kloss, R. Axel and K. J. Lee, Proc. Natl. Acad. Sci. U. S. A., 2008, 105, 64.
41 M. S. Djannatian, S. Galinski, T. M. Fischer and M. J. Rossner, Anal. Biochem., 2011, 412, 141.
42 J. Hansen, E. Mailand, K. K. Swaminathan, J. Schreiber, B. Angelici and Y. Benenson, Proc. Natl. Acad. Sci. U. S. A., 2014, 111, 15705.
43 L. A. Banaszynski, C. W. Liu and T. J. Wandless, J. Am. Chem. Soc., 2005, 127, 4715.
44 L. C. Toru Komatsu, I. Kukelyansky, J. Michael McCaffery, T. Ueno and T. I. Varela, Nat. Methods, 2010, 7, 206.
45 M. G. Victor, M. Rivera, T. Clackson, S. Natesan, R. Pollock, J. F. Amara, T. Keenan, S. R. Magari, T. Phillips, N. L. Courage, F. Cerasoli Jr and D. A. Holt, Nat. Genet., 1996, 14, 353.
46 H. D. Mootz, E. S. Blum, A. B. Tyszkiewicz and T. W. Muir, J. Am. Chem. Soc., 2003, 125, 10561.
47 H. D. Mootz and T. W. Muir, J. Am. Chem. Soc., 2002, 124, 9044.
48 K. Camacho-Soto, J. Castillo-Montoya, B. Tye, L. O. Ogunleye and I. Ghosh, J. Am. Chem. Soc., 2014, 136, 17078.
49 K. Camacho-Soto, J. Castillo-Montoya, B. Tye and I. Ghosh, J. Am. Chem. Soc., 2014, 136, 3995.
50 R. Derose, T. Miyamoto and T. Inoue, Pflugers Arch. Eur. J. Physiol., 2013, 465, 409.
51 A. Fegan, B. White, J. C. T. Carlson and C. R. Wagner, Chem. Rev., 2010, 110, 3315.
52 M. Putyrski and C. Schultz, FEBS Lett., 2012, 586, 2097.
53 F.-S. Liang, W. Q. Ho and G. R. Crabtree, Sci. Signalling, 2011, 4, 18.
54 T. Miyamoto, R. DeRose, A. Suarez, T. Ueno, M. Chen, T. Sun, M. J. Wolfgang, C. Mukherjee, D. J. Meyers and T. Inoue, Nat. Chem. Biol., 2012, 8, 465.
55 Y. Gao, X. Xiong, S. Wong, E. J. Charles, W. A. Lim and L. S. Qi, Nat. Methods, 2016, 13, 1043.
56 M. J. Kennedy, R. M. Hughes, L. A. Peteya, J. W. Schwartz, M. D. Ehlers and C. L. Tucker, Nat. Methods, 2010, 7, 973.
57 S. Konermann, M. D. Brigham, A. E. Trevino, P. D. Hsu, M. Heidenreich, L. Cong, R. J. Platt, D. A. Scott, G. M. Church and F. Zhang, Nature, 2013, 500, 472.
58 L. R. Polstein and C. A. Gersbach, Nat. Chem. Biol., 2015, 11, 198.
59 M. Yazawa, A. M. Sadaghiani, B. Hsueh and R. E. Dolmetsch, Nat. Biotechnol., 2009, 27, 941.
60 S. M. Harper, L. C. Neil and K. H. Gardner, Science, 2003, 301, 1541.
61 D. Strickland, Y. Lin, E. Wagner, C. M. Hope, J. Zayner, C. Antoniou, T. R. Sosnick, E. L. Weiss and M. Glotzer, Nat. Methods, 2012, 9, 379.
62 K. Müller, R. Engesser, S. Schulz, T. Steinberg, P. Tomakidi, C. C. Weber, R. Ulm, J. Timmer, M. D. Zurbriggen and W. Weber, Nucleic Acids Res., 2013, 41, e124.
63 A. Levskaya, O. D. Weiner, W. A. Lim and C. A. Voigt, Nature, 2009, 461, 997.
64 L. Rizzini, J.-J. Favory, C. Cloix, D. Faggionato, A. O’Hara, E. Kaiserli, R. Baumeister, E. Schafer, F. Nagy, G. I. Jenkins and R. Ulm, Science, 2011, 332, 103.
65 D. Chen, E. S. Gibson and M. J. Kennedy, J. Cell Biol., 2013, 201, 631.
66 R. D. Airan, K. R. Thompson, L. E. Fenno, H. Bernstein and K. Deisseroth, Nature, 2009, 458, 1025.
67 T. Sera, Adv. Drug Delivery Rev., 2009, 61, 513.
68 S. Kay, S. Hahn, E. Marois, G. Hause and U. Bonas, Science, 2007, 318, 648.
69 J. Boch, H. Scholze, S. Schornack, A. Landgraf, S. Hahn, S. Kay, T. Lahaye, A. Nickstadt and U. Bonas, Science, 2009, 326, 1509.
70 M. J. Moscou and A. J. Bogdanove, Science, 2009, 326, 1501.
71 T. Cermak, E. L. Doyle, M. Christian, L. Wang, Y. Zhang, C. Schmidt, J. A. Baller, N. V. Somia, A. J. Bogdanove and D. F. Voytas, Nucleic Acids Res., 2011, 39, e82.
72 D. Reyon, S. Q. Tsai, C. Khayter, J. A. Foden, J. D. Sander and J. K. Joung, Nat. Biotechnol., 2012, 30, 460.
73 S. Wang, W. Li, S. Wang and B. Hu, J. Genet. Genomics, 2014, 41, 339.
74 S. Gogolok, C. Garcia-Diaz and S. M. Pollard, Sci. Rep., 2016, 6, 33209.
75 J. Zhao, W. Sun, J. Liang, J. Jiang and Z. Wu, Mol. Cells, 2016, 39, 687.
76 A. C. Ma, M. S. McNulty, T. L. Poshusta, J. M. Campbell, G. Martínez-Gálvez, D. P. Argue, H. B. Lee, M. D. Urban, C. E. Bullard, P. R. Blackburn, T. K. Man, K. J. Clark and S. C. Ekker, Hum. Gene Ther., 2016, 27, 451.
77 Y. Kim, J. Kweon, A. Kim, J. K. Chon, J. Y. Yoo, H. J. Kim, S. Kim, C. Lee, E. Jeong, E. Chung, D. Kim, M. S. Lee, E. M. Go, H. J. Song, H. Kim, N. Cho, D. Bang, S. Kim and J.-S. Kim, Nat. Biotechnol., 2013, 31, 251.
78 L. Cong, F. A. Ran, D. Cox, S. Lin, R. Barretto, N. Habib, P. Hsu, X. Wu, W. Jiang, L. A. Marraffini and F. Zhang, Science, 2013, 339, 819.
79 L. S. Qi, M. H. Larson, L. A. Gilbert, J. A. Doudna, J. S. Weissman, A. P. Arkin and W. A. Lim, Cell, 2013, 152, 1173.
80 Y. Li, Y. Jiang, H. Chen, W. Liao, Z. Li, R. Weiss and Z. Xie, Nat. Chem. Biol., 2015, 11, 207.
81 T. Lebar and R. Jerala, ACS Synth. Biol., 2016, 5, 1050.
82 M. H. Larson, L. A. Gilbert, X. Wang, W. A. Lim, J. S. Weissman and L. S. Qi, Nat. Protoc., 2013, 8, 2180.
83 R. R. Beerli, D. J. Segal, B. Dreier and C. F. Barbas, Proc. Natl. Acad. Sci. U. S. A., 1998, 95, 14628.
84 A. Chavez, J. Scheiman, S. Vora, B. W. Pruitt, M. Tuttle, E. P. R. Iyer, S. Lin, S. Kiani, C. D. Guzman, D. J. Wiegand, D. Ter-Ovanesyan, J. L. Braff, N. Davidsohn, B. E. Housden, N. Perrimon, R. Weiss, J. Aach, J. J. Collins and G. M. Church, Nat. Methods, 2015, 12, 326.
85 M. E. Tanenbaum, L. A. Gilbert, L. S. Qi, J. S. Weissman and R. D. Vale, Cell, 2014, 159, 635.
86 S. Konermann, M. D. Brigham, A. E. Trevino, J. Joung, O. O. Abudayyeh, C. Barcena, P. D. Hsu, N. Habib, J. S. Gootenberg, H. Nishimasu, O. Nureki and F. Zhang, Nature, 2014, 517, 583.
87 J. G. Zalatan, M. E. Lee, R. Almeida, L. A. Gilbert, E. H. Whitehead, M. La Russa, J. C. Tsai, J. S. Weissman, J. E. Dueber, L. S. Qi and W. A. Lim, Cell, 2015, 160, 339.
88 A. C. Groner, S. Meylan, A. Ciuffi, N. Zangger, G. Ambrosini, N. Dénervaud, P. Bucher and D. Trono, PLoS Genet., 2010, 6, e1000869.
89 M. L. Maeder, J. F. Angstman, M. E. Richardson, S. J. Linder, V. M. Cascio, S. Q. Tsai, Q. H. Ho, J. D. Sander, D. Reyon, B. E. Bernstein, J. F. Costello, M. F. Wilkinson and J. K. Joung, Nat. Biotechnol., 2013, 31, 1137.
90 H. Chen, H. G. Kazemier, M. L. de Groote, M. H. J. Ruiters, G.-L. Xu and M. G. Rots, Nucleic Acids Res., 2014, 42, 1563.
91 A. Vojta, P. Dobrinić, V. Tadić, L. Bočkor, P. Korać, B. Julg, M. Klasić and V. Zoldoš, Nucleic Acids Res., 2016, 44, 5615.
92 S. R. Choudhury, Y. Cui, K. Lubecka, B. Stefanska and J. Irudayaraj, Oncotarget, 2016, 7, 46545.
93 X. Xu, Y. Tao, X. Gao, L. Zhang, X. Li, W. Zou, K. Ruan, F. Wang, G. Xu and R. Hu, Cell Discovery, 2016, 2, 16009.
94 M. Okada, M. Kanamori, K. Someya, H. Nakatsukasa and A. Yoshimura, Epigenet. Chromatin, 2017, 10, 24.
95 R. R. Beerli, U. Schopfer, B. Dreier and C. F. Barbas, J. Biol. Chem., 2000, 275, 32617.
96 Y. Li, R. Moore, M. Guinn and L. Bleris, Sci. Rep., 2012, 2, 897.
97 A. C. Mercer, T. Gaj, S. J. Sirk, B. M. Lamb and C. F. Barbas, ACS Synth. Biol., 2014, 3, 723.
98 B. Zetsche, S. E. Volz and F. Zhang, Nat. Biotechnol., 2015, 33, 139.
99 J. K. Nuñez, L. B. Harrington and J. A. Doudna, ACS Chem. Biol., 2016, 11, acschembio.5b01019.
100 J. Lonzarić, T. Lebar, A. Majerle, M. Manček-Keber and R. Jerala, Nucleic Acids Res., 2015, 44, 1471.
101 S. Kiani, A. Chavez, M. Tuttle, R. N. Hall, R. Chari, D. Ter-Ovanesyan, J. Qian, B. W. Pruitt, J. Beal, S. Vora, J. Buchthal, E. J. K. Kowal, M. R. Ebrahimkhani, J. J. Collins, R. Weiss and G. Church, Nat. Methods, 2015, 12, 1051.
102 R. Evans, Science, 1988, 240, 889.
103 K. M. Davis, V. Pattanayak, D. B. Thompson, J. A. Zuris and D. R. Liu, Nat. Chem. Biol., 2015, 11, 316.
104 E. Vegeto, G. F. Allan, W. T. Schrader, M. J. Tsai, D. P. McDonnell and B. W. O’Malley, Cell, 1992, 69, 703.
105 T. D. Littlewood, D. C. Hancock, P. S. Danielianl and M. G. Parker, Methods, 1995, 23, 1686.
106 D. Greber and M. Fussenegger, Nucleic Acids Res., 2010, 38, e174.
107 B. P. Kramer, C. Fischer and M. Fussenegger, Biotechnol. Bioeng., 2004, 87, 478.
108 S. Kiani, J. Beal, M. R. Ebrahimikhani, J. Huh, R. N. Hall, Z. Xie, Y. Li and R. Weiss, Nat. Methods, 2014, 11, 723.
109 M. W. Gander, J. D. Vrana, W. E. J. Voje and J. M. Carothers, BioRxiv.
110 H. Smith, J. Math. Biol., 1987, 25, 169.
111 S. Müller, J. Hofbauer, L. Endler, C. Flamm, S. Widder and P. Schuster, J. Math. Biol., 2006, 53, 905.
112 S. Widder, J. Macía and R. Solé, PLoS One, 2009, 4, e5399.
113 T. Lebar, U. Bezeljak, A. Golob, M. Jerala, L. Kadunc, B. Pirš, M. Stražar, D. Vucko, U. Zunpancic, M. Benčina, V. Forstnerič, R. Gaber, J. Lonzarić, A. Majerle, A. Oblak, A. Smole and R. Jerala, Nat. Commun., 2014, 5, 5007.
114 B. P. Kramer, A. U. Viretta, M. Daoud-El-Baba, D. Aubel, W. Weber and M. Fussenegger, Nat. Biotechnol., 2004, 22, 867.
115 D. Deng, C. Yan, X. Pan, M. Mahfouz, J. Wang, J.-K. Zhu, Y. Shi and N. Yan, Science, 2012, 335, 720–723.
116 A. N.-S. Mak, P. Bradley, R. A. Cernadas, A. J. Bogdanove and B. L. Stoddard, Science, 2012, 335, 716–719.
117 D. R. Burrill, M. C. Inniss, P. M. Boyle and P. A. Silver, Genes Dev., 2012, 26, 1486.
118 I. A. Swinburne, D. G. Miguez, D. Landgraf and P. A. Silver, Genes Dev., 2008, 22, 2342.
119 M. Tigges, T. T. Marquez-Lago, J. Stelling and M. Fussenegger, Nature, 2009, 457, 309.
120 M. Tigges, N. Dénervaud, D. Greber, J. Stelling and M. Fussenegger, Nucleic Acids Res., 2010, 38, 2702.
121 A. Filipovska and O. Rackham, in Synthetic Biology, ed. M. Ryadnov, L. Brunsveld and H. Suga, Royal Society of Chemistry, Cambridge, 2014, pp. 106–125.
122 C. Tuerk and L. Gold, Science, 1990, 249, 505.
123 M. Darmostuk, S. Rimpelova, H. Gbelcova and T. Ruml, Biotechnol. Adv., 2014, 33, 1141.
124 R. D. Jenison, S. C. Gill, A. Pardi and B. Polisky, Science, 1994, 263, 1425.
125 G. Werstuck, Science, 1998, 282, 296.
126 S. Ausländer, P. Ketzer and J. S. Hartig, Mol. Biosyst., 2010, 6, 807.
127 Y. Nomura, D. Kumar and Y. Yokobayashi, Chem. Commun., 2012, 48, 7215.
128 L. Yen, J. Svendsen, J.-S. Lee, J. T. Gray, M. Magnier, T. Baba, R. J. D’Amato and R. C. Mulligan, Nature, 2004, 431, 471.
129 D. S. Peabody, EMBO J., 1993, 12, 595.
130 A. Bernardi and P. F. Spahr, Proc. Natl. Acad. Sci. U. S. A., 1972, 69, 3033.
131 T. S. Rozhdestvensky, T. H. Tang, I. V. Tchirkova, J. Brosius, J. P. Bachellerie and A. Hüttenhofer, Nucleic Acids Res., 2003, 31, 869.
132 K. T. Gagnon, X. Zhang, G. Qu, S. Biswas, J. Suryadi, B. A. Brown and E. S. Maxwell, RNA, 2010, 16, 79.
133 R. Stripecke, C. C. Oliveira, J. E. McCarthy and M. W. Hentze, Mol. Cell. Biol., 1994, 14, 5898.
134 K. Endo, K. Hayashi, T. Inoue and H. Saito, Nat. Commun., 2013, 4, 1.
135 D. Ausländer, M. Wieland, S. Ausländer, M. Tigges and M. Fussenegger, Nucleic Acids Res., 2011, 39, e155.
136 X. Wang, J. McLachlan, P. D. Zamore and T. M. T. Hall, Cell, 2002, 110, 501.
137 T. M. T. Hall, Nat. Struct. Mol. Biol., 2014, 21, 653.
138 Z. T. Campbell, C. T. Valley and M. Wickens, Nat. Struct. Mol. Biol., 2014, 21, 732.
139 J. Cao, M. Arha, C. Sudrik, D. V. Schaffer and R. S. Kane, Angew. Chem., Int. Ed., 2014, 53, 4900.
140 Y. Wang, C.-G. Cheong, T. M. Tanaka Hall and Z. Wang, Nat. Methods, 2009, 6, 825.
141 R. Choudhury, Y. S. Tsai, D. Dominguez, Y. Wang and Z. Wang, Nat. Commun., 2012, 3, 1147.
142 Y. Deng, C. C. Wang, K. W. Choy, Q. Du, J. Chen, Q. Wang, L. Li, T. K. H. Chung and T. Tang, Gene, 2014, 538, 217.
143 C. An, V. B. Trinh and Y. Yokobayashi, RNA, 2006, 12, 710.
144 D. Kumar, C. I. An and Y. Yokobayashi, J. Am. Chem. Soc., 2009, 131, 13906.
145 S. Ausländer, D. Ausländer, M. Müller, M. Wieland and M. Fussenegger, Nature, 2012, 487, 123.
146 T. L. Deans, C. R. Cantor and J. J. Collins, Cell, 2007, 130, 363.
147 D. Greber, M. D. El-Baba and M. Fussenegger, Nucleic Acids Res., 2008, 36, e101.
148 L. Wroblewska, T. Kitada, K. Endo, V. Siciliano, B. Stillo, H. Saito and R. Weiss, Nat. Biotechnol., 2015, 33, 839.
149 K. Camacho-Soto, J. Castillo-Montoya, B. Tye, L. O. Ogunleye and I. Ghosh, J. Am. Chem. Soc., 2014, 136, 17078.
150 J. Ryu and S.-H. Park, Sci. Signalling, 2015, 8, ra66.
151 S.-H. Park, A. Zarrinpar and W. A. Lim, Science, 2003, 299, 1061.
152 D. Mishra, P. M. Rivera, A. Lin, D. Del Vecchio and R. Weiss, Nat. Biotechnol., 2014, 32, 1268.
153 M. C. Wehr, R. Laage, U. Bolz, T. M. Fischer, S. Grünewald, S. Scheek, A. Bach, K.-A. Nave and M. J. Rossner, Nat. Methods, 2006, 3, 985.
154 P. Sun, B. P. Austin, J. Tözsér and D. S. Waugh, Protein Sci., 2010, 19, 2240.
155 N. Zheng, J. de Pérez, Z. Zhang, E. Domínguez, J. A. Garcia and Q. Xie, Protein Expression Purif., 2008, 57, 153.
156 J. Fernandez-Rodriguez and C. A. Voigt, Nucleic Acids Res., 2016, 44, 6493.
157 V. Stein and K. Alexandrov, Proc. Natl. Acad. Sci. U. S. A., 2014, 111, 15934.
158 S. S. Shekhawat, J. R. Porter, A. Sriprasad and I. Ghosh, J. Am. Chem. Soc., 2009, 131, 15284.
159 R. J. Dohmen, P. Wu, A. Varshavsky, A. Gold and S. Altman, Science, 2016, 263, 1273.
160 K. Nishimura, T. Fukagawa, H. Takisawa, T. Kakimoto and M. Kanemaki, Nat. Methods, 2009, 6, 917.
161 D. Esposito and J. Scocca, Nucleic Acids Res., 1997, 251348, 3605.
162 M. C. M. Smith, W. R. A. Brown, A. R. McEwan and P. A. Rowley, Biochem. Soc. Trans., 2010, 38, 388.
163 N. Lapique and Y. Benenson, Nat. Chem. Biol., 2014, 10, 1020.
164 B. H. Weinberg, N. T. H. Pham, L. D. Caraballo, T. Lozanoski, A. Engel, S. Bhatia and W. W. Wong, Nat. Biotechnol., 2017, 35, 453.
165 H. Ye, G. Charpin-El Hamri, K. Zwicky, M. Christen, M. Folcher and M. Fussenegger, Proc. Natl. Acad. Sci. U. S. A., 2013, 110, 141.
166 M. Xie, H. Ye, H. Wang, G. C. Hamri, C. Lormeau, P. Saxena and M. Fussenegger, Science, 2005, 109, 2.
167 H. El-Gabalawy, L. C. Guenther and C. N. Bernstein, J. Rheumatol., 2010, 37, 2.
168 R. L. Robker, D. L. Russell, S. Yoshioka, S. C. Sharma, J. P. Lydon, B. W. O. Malley, L. L. Espey and J. S. Richards, Steroids, 2000, 65, 559.
169 C. Kemmer, D. A. Fluri, U. Witschi, A. Passeraub, A. Gutzwiller and M. Fussenegger, J. Controlled Release, 2011, 150, 23.
170 D. M. Barrett, N. Singh, D. L. Porter, S. A. Grupp, C. H. June, A. Chew, D. Ph, B. Hauck, J. F. Wright, M. C. Milone, A. Manuscript, D. M. Barrett, N. Singh, D. L. Porter, S. A. Grupp, C. H. June, A. Chew, D. Ph, B. Hauck, J. F. Wright, M. C. Milone, D. M. Barrett, N. Singh, D. L. Porter, S. A. Grupp and C. H. June, Annu. Rev. Med., 2014, 368, 333.
171 J. H. Esensten, J. A. Bluestone and W. A. Lim, Annu. Rev. Pathol.: Mech. Dis., 2017, 12, 305.
172 D. L. Porter, B. L. Levine, M. Kalos, A. Bagg and C. H. June, N. Engl. J. Med., 2011, 365, 725.
173 R. Brentjens, M. L. Davila, I. Riviere, J. Park, L. G. Cowell, S. Bartido, J. Stefanski, C. Taylor, O. Borquez-ojeda, J. Qu, T. Wasielewska, Q. He, I. V. Rijo, C. Hedvat, R. Kobos, K. Curran, P. Steinherz, J. Jurcic, T. Rosenblat, P. Maslak, M. Frattini and M. Sadelain, Sci. Transl. Med., 2013, 5(177), 177ra38, DOI: 10.1126/scitranslmed.3005930.CD19-targeted.
174 S. A. Grupp, M. Kalos, D. Barrett, R. Aplenc, D. L. Porter, S. R. Rheingold, D. T. Teachey, A. Chew, B. Hauck, J. F. Wright, M. C. Milone, B. L. Levine and C. H. June, N. Engl. J. Med., 2013, 368, 1509.
175 C.-Y. Wu, K. T. Roybal, E. M. Puchner, J. Onuffer and W. A. Lim, Science, 2015, 350, aab4077.
176 K. T. Roybal, L. J. Rupp, L. Morsut, W. J. Walker, K. A. McNally, J. S. Park and W. A. Lim, Cell, 2016, 164, 770.
177 A. Di Stasi, S.-K. Tey, G. Dotti, Y. Fujita, A. Kennedy-Nasser, C. Martinez, K. Straathof, E. Liu, A. G. Durett, B. Grilley, H. Liu, C. R. Cruz, B. Savoldo, A. P. Gee, J. Schindler, R. A. Krance, H. E. Heslop, D. M. Spencer, C. M. Rooney and M. K. Brenner, N. Engl. J. Med., 2011, 365, 1673.
178 R. J. Brentjens, I. Rivière, J. H. Park, M. L. Davila, X. Wang, J. Stefanski, C. Taylor, R. Yeh, S. Bartido, O. Borquez-Ojeda, M. Olszewska, Y. Bernal, H. Pegram, M. Przybylowski, D. Hollyman, Y. Usachenko, D. Pirraglia, J. Hosey, E. Santos, E. Halton, P. Maslak, D. Scheinberg, J. Jurcic, M. Heaney, G. Heller, M. Frattini and M. Sadelain, Blood, 2011, 118, 4817.
179 C. C. Kloss, M. Condomines, M. Cartellieri, M. Bachmann and M. Sadelain, Nat. Biotechnol., 2012, 31, 71.
180 G. Itakura, S. Kawabata, M. Ando, Y. Nishiyama, K. Sugai, M. Ozaki, T. Iida, T. Ookubo, K. Kojima, R. Kashiwagi, K. Yasutake, H. Nakauchi, H. Miyoshi, N. Nagoshi, J. Kohyama, A. Iwanami, M. Matsumoto, M. Nakamura and H. Okano, Stem Cell Rep., 2017, 8, 673.
181 D. J. Mandell, M. J. Lajoie, M. T. Mee, R. Takeuchi, G. Kuznetsov, J. E. Norville, C. J. Gregg, B. L. Stoddard and G. M. Church, Nature, 2015, 518, 55.
182 A. J. Rovner, A. D. Haimovich, S. R. Katz, Z. Li, M. W. Grome, B. M. Gassaway, M. Amiram, J. R. Patel, R. R. Gallagher, J. Rinehart and F. J. Isaacs, Nature, 2015, 518, 89.

Published on 23 November 2017 on http://pubs.rsc.org | doi:10.1039/9781782622789-00035

Self-assembly at the multi-scale level: challenges and new avenues for inspired synthetic biology modelling

Giuseppe Milano,a Irene Marzuoli,b Chris D. Lorenz*c and Franca Fraternali*b

DOI: 10.1039/9781782622789-00035

We describe here progress in the field of molecular simulations of large macromolecular assemblies of different nature, ranging from protein complexes to protein–lipids, virus-like particles and interfaces with inorganic and organic substrates. We selected systems that are particularly interesting for the field of Synthetic Biology in the sense that they embed crucial chemical and biological functions that can inspire the design of simplified assemblies for application in nanotechnology, cell biology and biomedicine. These systems pose multiple challenges for simulations because they require an accurate physical description of the variables that govern the molecular interactions occurring in the self-assembly process. In particular, the need for more sophisticated and flexible force fields for the study of complex anisotropic systems undergoing conformational changes during assembly is discussed. We highlight the importance of learning from accurate first-principle atomistic simulations in the development of multiscale and mixed approaches for a unified theoretical approach to such a vast repertoire of applications. In the future, such flexible and efficient multi-tasking simulation approaches will allow the routine study of large systems and of their crucial functional mechanisms, to be exploited in Synthetic Biology.

1 Introduction

Natural living systems are able, under specific conditions, to spontaneously achieve exceptionally ordered assemblies that are functionally effective and strongly resilient. Protein assembly into functional complexes is one of the most common examples of such spontaneous recognition, which is essential to all biological processes. During evolution, biopolymers have developed to adopt distinctive and well-defined functional states accessible under physiological conditions (folded states). These are in constant interaction within the cell with other stably folded proteins and give rise to transient or long-lived homomeric or heteromeric complexes that carry out essential molecular functions.1–8 The flourishing area of synthetic biology takes inspiration from these well-designed and efficient principles that evolution has embedded in natural systems to synthetically recreate complex systems tailored to display specific functions or purposes.9

a Dipartimento di Chimica e Biologia and NANOMATES, Research Centre for NANOMAterials and nanoTEchnologies at Università di Salerno, Via Giovanni Paolo II, 132, 84084 Fisciano (SA), Italy
b Randall Division of Cell and Molecular Biophysics, King’s College London, London, UK. E-mail: [email protected]
c Department of Physics, King’s College London, London WC2R 2LS, UK. E-mail: [email protected]

Synthetic Biology, 2018, 2, 35–64 | 35
© The Royal Society of Chemistry 2018


On the other hand, nature is not always “perfect”: in some circumstances, even a small modification of the environment and/or the protein sequence can lead to partially unfolded (misfolded) states that expose “reactive” patches to the environment and therefore result in self-assembled fibrillar aggregates, designated as amyloid fibrils, that are detrimental to the cell.10,11 These self-assembled structures are stabilised by non-native contacts that generate peculiar interactions with adjacent protein segments, leading to unwanted protein aggregates. The main challenge posed by studying these protein aggregates is understanding the underlying principles that lead to efficiently ordered structures and therefore developing a set of “design rules” that would allow us to devise a protein-based system able to discern between wanted and unwanted self-recognition. The major hurdle in this objective is that the same general physicochemical rules hold for the self-recognition in protein complexes and the self-association of protein aggregates. In fact, although it is quite clear that these aberrant associations are not directly encoded into the protein sequence, as is the case for native complexes, the ability to form stable non-native contacts appears to be somehow an integral property of the peptide chains which form amyloid fibrils.11–13 Therefore, while the last half-century has been spent trying to understand the ab initio rules governing protein folded states,14–16 the next big challenge is understanding the subtleties that differentiate between the pathways that are followed to form stable native and non-native contacts.
However, the chemical and structural heterogeneity of protein surfaces renders the control and manipulation of their self-assembly extremely challenging, although in some cases amenable to design.17,18 A simpler example of spontaneous assembly phenomena in biology is represented by phospholipids, which naturally mix and organise in water due to their physicochemical characteristics. This is also a case in which subtle changes to the chemistry of the polar and apolar components of the lipid chains can dramatically change the properties of the assembled material. Lately, novel approaches are being developed to describe the interactions between aggregating peptides and other types of surfaces: biological (e.g. lipid membranes) and biomimetic.19–22 The surface can promote the nucleation and aggregation of peptides and proteins, contributing to the efficient formation of toxic pores.10,23 All of the aforementioned spontaneous assembly processes will at some point occur within the cellular environment, and a precise landscape of the timescale, stability and competition of all these possible assemblies is very difficult to extract. This represents the grand challenge in drawing a quantitative picture of the phenomena playing a role in the functioning and malfunctioning of the cell in health and disease. But discerning the first principles that are at play in these phenomena allows us to synthetically devise bioinspired systems that can reproduce their macroscopic behaviour. Taking inspiration from these natural assemblies, one can attempt to design and fabricate novel nanostructures at the interactions between biomolecules and non-biological interfaces. Such structures have been found to be useful in a wide range of applications from the biotechnological field to material design.24–35 Alternatively, non-biological components such as carbon nanotubes (CNT) have been successfully used for gene delivery.36–38

Molecular dynamics (MD) simulations have been used to provide significant insights into the driving forces of protein assembly and to detail the mechanisms playing a role in the interactions of proteins with biological, biomimetic and inorganic interfaces. This has led to progress in the methods and in the parametrization used to simulate the adsorption and self-assembly of biological molecules at the interface with non-biological materials. Two general approaches are used when applying molecular dynamics simulations to the aggregation of large macromolecular assemblies: a bottom-up approach, in which atomistic simulations are used as a stepping stone to parametrise coarse-grain approaches capable of sampling the time and length scales necessary for the self-assembly process;3,16,39 and a top-down approach, in which molecular dynamics tools are used to refine the details of the interactions which play a significant role in the stability of the self-assembled system of interest.14,15,17 We will give examples of both these approaches. We will review the current state of the art in computational methods and the use of molecular simulations to study assembly phenomena common in nature. While this is a very large field of research, we will focus on the progress that has been made to date in applying methodologies to the study of biological and biomimetic assemblies, and in doing so we will highlight the need for an accurate description of the physicochemical interactions that drive the assembly phenomena.
In particular, we will discuss a number of cases in which assembly phenomena are studied computationally, from database screening of crystallised structures to understand and design stable assemblies, to physically-based simulations of molecular assemblies in solution and at the interface with lipids and solid substrates.

2 Computational methods in self-assembly

Traditional MD simulations allow for the atomistic description of molecular systems on the basis of simple, easily derivable functionals that embed the essential physics when describing the forces acting on the atoms and their velocities. These approaches are applicable when the number of atoms (<10⁶ atoms) and the desired time scale (<10⁻⁶ seconds) are limited. Unfortunately, many of the interesting biological assemblies performing crucial functions in the cell, like ribosomes, viruses and mitochondria, and synthetically designed systems like nanocapsules for drug delivery,40 by far exceed the number of atoms and timescales accessible to traditional MD. Even for simple molecules such as surfactants, the self-assembly process from randomly distributed mixtures of molecules to micellisation can only be simulated with all-atom representations if the systems consist of artificially dilute surfactant solutions (often well above the critical micelle concentration)41–44 or by using elevated temperatures, which enhance the diffusion of molecules such that the self-assembly can be observed on the timescales accessible to simulations. Each of these approaches has inherent limitations, as the size and shape of the resultant micelles can be quite sensitive to concentration and temperature. For example, when using an elevated temperature to enhance diffusion, it can take time scales larger than those accessible with atomistic MD simulations to thermalise the self-assembled micelle at experimentally relevant temperatures. As a result of these limitations, the use of pre-assembled aggregates is a typical shortcut in the study of structural and interfacial properties of self-assembled micelles with atomistic models; however, this means that these simulations require the aggregation number of the micelle as an input parameter. The results of this approach have been shown to be in good agreement with experiment;45,46 however, even this approach has its drawbacks, as demonstrated by Nevidimov et al.,47 who have systematically shown that only random mixtures provide a low dependence of the final results on initial conditions. Within the past 20 years, a number of approaches have been developed so that MD simulations can be used to study the large number of particles and the large time scales required to simulate the self-assembly process. A significant focus has been placed on the development of coarse-grained (CG) approaches which can still capture the chemical specificity that is required to study the large variety of structures resulting from very minor modifications of the physicochemical properties of the assembling molecules. In this context, coarse-grain refers to the representation of several atoms with a single bead, which is then parametrised such that it has properties which are comparable to those of the atoms that it is replacing.
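The mapping step behind this definition of coarse-graining can be sketched in a few lines. The following is a minimal illustration only, assuming a centre-of-mass mapping; the names `Atom`, `bead_from_atoms` and `coarse_grain`, the four-atom example and the index groups are hypothetical and do not come from any specific CG model cited here. Parametrising the bead interactions is a separate step that this sketch does not cover.

```python
from dataclasses import dataclass


@dataclass
class Atom:
    mass: float       # atomic mass units
    position: tuple   # (x, y, z) in nm


def bead_from_atoms(atoms):
    """Collapse a group of atoms into one bead placed at their centre of mass."""
    total_mass = sum(a.mass for a in atoms)
    com = tuple(
        sum(a.mass * a.position[k] for a in atoms) / total_mass
        for k in range(3)
    )
    return Atom(mass=total_mass, position=com)


def coarse_grain(atoms, mapping):
    """Apply a mapping given as a list of atom-index groups, one group per bead."""
    return [bead_from_atoms([atoms[i] for i in group]) for group in mapping]


# Hypothetical example: four equal-mass atoms on a line, mapped onto two beads.
atoms = [
    Atom(12.0, (0.00, 0.0, 0.0)),
    Atom(12.0, (0.15, 0.0, 0.0)),
    Atom(12.0, (0.30, 0.0, 0.0)),
    Atom(12.0, (0.45, 0.0, 0.0)),
]
beads = coarse_grain(atoms, [[0, 1], [2, 3]])
```

Each bead carries the summed mass of its atoms, so total mass is conserved while the number of interaction sites is halved in this example.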
Several examples of coarse-grain models used to simulate various materials have been reported in the scientific literature.41,42,48–55 The resulting CG models can then be used to simulate systems with larger length scales and longer time scales than are accessible from atomistic simulations. Therefore, MD simulations of coarse-grained models allow for the simulation of spontaneous self-assembly processes of large systems.41,56–58 These methods have been particularly successful in the simulation of micelle assemblies with realistic outcomes. In recent years, CG models have also been used to study the assembly of different types of lipids with carbon nanotubes (CNTs)59 and with membrane proteins.56

Several approaches are used to parametrise these CG models, including deductive multiscale analysis (DMA),39,60 inverse Monte Carlo,61 Boltzmann inversion62 and elastic network models.63,64 In particular, DMA methods using the N-atom Liouville equation have recently been shown to be accurate and efficient for protein assembly.39 This method is of interest because it is one of the few true multiscale approaches in the field of proteins and peptides. For more information about these types of CG models we refer the reader to the various recent review articles that have been written on the topic.65–67 In this chapter, we will focus in more detail on a coarse-graining approach recently developed by one of the authors, the Hybrid Particle Continuum Approach.
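As a minimal sketch of the bead mapping described above (several atoms represented by a single bead), the snippet below places each bead at the centre of mass of the atoms it replaces and assigns it their total mass. The function names, the four-atom fragment and the mapping are illustrative assumptions and do not correspond to any specific published CG model; the reparametrisation of the bead interactions, which is the hard part of building a CG force field, is not shown.

```python
from typing import Dict, List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def centre_of_mass(positions: Sequence[Vec3], masses: Sequence[float]) -> Vec3:
    """Mass-weighted average position of a group of atoms."""
    total = sum(masses)
    x = sum(m * p[0] for p, m in zip(positions, masses)) / total
    y = sum(m * p[1] for p, m in zip(positions, masses)) / total
    z = sum(m * p[2] for p, m in zip(positions, masses)) / total
    return (x, y, z)


def coarse_grain(
    positions: Sequence[Vec3],
    masses: Sequence[float],
    mapping: Dict[str, List[int]],
) -> Dict[str, Tuple[Vec3, float]]:
    """Replace each listed group of atoms by one bead (position, mass)."""
    beads = {}
    for bead_name, atom_ids in mapping.items():
        group_pos = [positions[i] for i in atom_ids]
        group_mass = [masses[i] for i in atom_ids]
        beads[bead_name] = (centre_of_mass(group_pos, group_mass), sum(group_mass))
    return beads


# Hypothetical four-atom fragment mapped onto two beads (arbitrary units).
atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
masses = [12.0, 12.0, 12.0, 16.0]
beads = coarse_grain(atoms, masses, {"B1": [0, 1], "B2": [2, 3]})
```

Here bead B1 sits midway between its two equal-mass atoms, while B2 is shifted towards the heavier atom.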

2.1 Hybrid particle continuum approach

CG models based on particles, like MARTINI, are still computationally expensive when compared to approaches based on a continuum representation.68 In particular, models using field representations have been shown to be successful for modelling soft matter systems. The self-consistent field (SCF) theory describes the model systems by using density fields; the main feature of SCF theory is the description of mutual interactions between molecular segments by an interaction between these segments and static external fields.69 Several applications to proteins,70 block copolymers71–74 and colloidal particles75,76 have shown that the SCF theory is a useful and powerful approach in a wide range of cases. A strong limitation of these studies is the restriction to very simple molecular architectures, typically linear Gaussian chains.

More recently, hybrid approaches combining particles and density fields have been introduced. This class of methods is becoming popular in the framework of SCF models because it places no limitation on the treatment of complex molecular architectures and/or intramolecular interactions.76–81 The single chain in mean field (SCMF) approach introduced recently,62 in which a density field is calculated by projecting particle positions over a grid and kept static for a number of Monte Carlo steps, has been successfully applied to block copolymer systems and homopolymers.76,81,82 Later on, this approach was extended to MD simulations. In particular, an implementation of MD suitable for the treatment of atomistic force fields and/or specific CG models, combined with an SCF description of non-bonded interactions (MD-SCF), has been introduced and later validated extensively for different molecular models.
The main advantage of this hybrid particle-continuum MD-SCF approach is that the evaluation of the non-bonded forces, which is the most expensive part of MD simulations, is replaced by the evaluation of an external potential dependent on the local density. Moreover, this approach allows a very efficient parallelization and can be extended to length and time scales much larger than those of models based on pairwise forces.83 According to the formulation of SCF theory, a many-body problem such as the motion of a molecular system is reduced to the problem of deriving the partition function of a single molecule in an external potential V(r). Therefore, the non-bonded forces between atoms are replaced by a suitable expression of V(r) and its spatial derivatives. A particle is regarded as interacting with the surrounding molecules through a density field, rather than through direct interactions among pairs. The functional form of the density-dependent interaction potential W needs to be assumed, and one common form is reported below, where each species is specified by the index K:

$$W[\{\phi_K(\mathbf{r})\}] = \int d\mathbf{r} \left( \frac{k_B T}{2} \sum_{K,K'} \chi_{KK'}\, \phi_K(\mathbf{r})\, \phi_{K'}(\mathbf{r}) + \frac{1}{2\kappa} \left( \sum_K \phi_K(\mathbf{r}) - \phi_0 \right)^2 \right) \qquad (1)$$

$\phi_K(\mathbf{r})$ is the number density of species K at position r and $\chi_{KK'}$ are the mean field parameters for the interaction of a particle of type K with the density field generated by particles of type K'. It can be shown using the so-called saddle point approximation that the external potential is given by:

$$V_K(\mathbf{r}) = \frac{\delta W[\{\phi_K(\mathbf{r})\}]}{\delta \phi_K(\mathbf{r})} = k_B T \sum_{K'} \chi_{KK'}\, \phi_{K'}(\mathbf{r}) + \frac{1}{\kappa} \left( \sum_K \phi_K(\mathbf{r}) - \phi_0 \right) \qquad (2)$$

The connection between the particle and field models in the proposed hybrid MD-SCF scheme is obtained by a projection of particle positions onto a density field. This projection is obtained through a mesh-based approach, which provides the density derivatives required to calculate the forces needed for propagation of MD algorithms. The details of the implementation of this approach and a complete derivation of eqn (2) are reported elsewhere.79 Details concerning the OCCAM code used to perform the MD-SCF simulations are reported in a recent publication.83
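The particle-to-field step described above can be illustrated with a deliberately simplified, one-dimensional sketch: particle positions are projected onto a periodic mesh by linear assignment, the external potential of eqn (2) is evaluated on the mesh, and forces follow from a finite-difference derivative. This is not the OCCAM implementation; the function names, the 1D geometry, the linear assignment scheme and all numerical values are assumptions made for illustration. The symbols `kappa` and `phi0` are read here as a compressibility-like parameter and a reference total density, which is also an assumption since the text above does not define them.

```python
from typing import Dict, List, Sequence, Tuple


def project_density(positions: Sequence[float], box: float, n_cells: int) -> List[float]:
    """Linear assignment of particles to a periodic 1D mesh (number density)."""
    h = box / n_cells
    phi = [0.0] * n_cells
    for x in positions:
        s = (x % box) / h          # position in units of the mesh spacing
        i = int(s)
        frac = s - i               # fractional distance to the next mesh point
        phi[i % n_cells] += (1.0 - frac) / h
        phi[(i + 1) % n_cells] += frac / h
    return phi


def external_potential(
    phi: Dict[str, List[float]],
    chi: Dict[Tuple[str, str], float],
    kT: float,
    kappa: float,
    phi0: float,
) -> Dict[str, List[float]]:
    """V_K = kT * sum_K' chi_KK' phi_K' + (sum_K phi_K - phi0) / kappa, per mesh point."""
    species = list(phi)
    n_cells = len(phi[species[0]])
    potential = {}
    for K in species:
        values = []
        for i in range(n_cells):
            total = sum(phi[Kp][i] for Kp in species)
            mixing = sum(chi[(K, Kp)] * phi[Kp][i] for Kp in species)
            values.append(kT * mixing + (total - phi0) / kappa)
        potential[K] = values
    return potential


def mesh_force(V: Sequence[float], box: float) -> List[float]:
    """Force -dV/dx on each mesh point by periodic central differences."""
    n = len(V)
    h = box / n
    return [-(V[(i + 1) % n] - V[i - 1]) / (2.0 * h) for i in range(n)]


# Hypothetical two-species example: two A particles, no B particles.
phi = {"A": project_density([0.25, 0.75], box=1.0, n_cells=2), "B": [0.0, 0.0]}
chi = {("A", "A"): 0.0, ("A", "B"): 1.0, ("B", "A"): 1.0, ("B", "B"): 0.0}
V = external_potential(phi, chi, kT=1.0, kappa=0.5, phi0=2.0)
```

In this toy case the density is uniform and equal to `phi0`, so an A particle feels no potential while a hypothetical B particle would feel a uniform repulsion from the A field and hence no force.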

3 Force-field parameterization

As previously mentioned, one of the major problems in simulating the assembly of large, non-homogeneous complex systems is the timescale needed to observe the diffusion of the system's components into configurations that effectively lead to stable assemblies. One can use interfacial supports, such as purposely selected inorganic and/or organic surfaces, to facilitate the assembly. In these cases, the accurate description of interfacial chemistry for use in molecular simulations is quite challenging because of the non-homogeneity and anisotropy of the chemical environment of the studied species. Historically, force fields have been developed and tested for specific classes of molecules (e.g. proteins, organic molecules, polymers) and rarely for mixed systems. For example, when modelling the adsorption of peptides to non-biological interfaces, the biggest challenge is in accurately describing the interactions between the interface and the peptide in solution. Over the past 10 years, the shortcomings of using standard atomistic and coarse-grain force fields for modelling the interaction between biological molecules and non-biological interfaces have been identified. As a result of these observations, novel methods to develop general descriptions of the interactions of peptides and non-biological interfaces have started to be developed. The interfacial force field (IFF)84–86 is one example of these novel approaches. It recognises that there are two distinct phases in these systems, one being the liquid phase, including the aqueous solution and the peptide, and the other being the solid phase, comprising the molecules which make up the non-biological interface. This approach then assigns one set of non-bonded force field parameters (i.e. epsilon, sigma and partial charge) for describing the interactions between molecules within the same phase and another set of non-bonded force field parameters for describing the interaction between the two phases.
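The idea of keeping one parameter set for interactions within a phase and another for interactions across the phase boundary can be sketched as a simple lookup. The snippet below is only an illustration of that bookkeeping, not the published IFF implementation: the class names, the numerical values and the use of Lorentz–Berthelot mixing (geometric mean of epsilon, arithmetic mean of sigma) are all assumptions, and partial charges are omitted for brevity.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class LJ:
    epsilon: float  # well depth (arbitrary energy units)
    sigma: float    # distance at which the potential crosses zero


@dataclass(frozen=True)
class AtomType:
    phase: str      # "liquid" or "solid"
    intra: LJ       # parameters used against atoms of the same phase
    cross: LJ       # parameters used against atoms of the other phase


def pair_parameters(a: AtomType, b: AtomType) -> LJ:
    """Pick the intra-phase or cross-phase set, then mix (assumed mixing rule)."""
    if a.phase == b.phase:
        pa, pb = a.intra, b.intra
    else:
        pa, pb = a.cross, b.cross
    return LJ(epsilon=sqrt(pa.epsilon * pb.epsilon), sigma=0.5 * (pa.sigma + pb.sigma))


def lennard_jones(r: float, p: LJ) -> float:
    """Standard 12-6 Lennard-Jones energy at separation r."""
    sr6 = (p.sigma / r) ** 6
    return 4.0 * p.epsilon * (sr6 * sr6 - sr6)


# Hypothetical atom types; none of these numbers come from a real force field.
water_o = AtomType("liquid", intra=LJ(0.65, 0.32), cross=LJ(0.40, 0.30))
surface = AtomType("solid", intra=LJ(22.0, 0.26), cross=LJ(0.90, 0.34))

same_phase = pair_parameters(water_o, water_o)   # uses the intra-phase set only
interface = pair_parameters(water_o, surface)    # uses the cross-phase set only
```

Because the intra-phase entries are never touched when the cross-phase entries are tuned, the bulk behaviour of each phase is left unchanged, which is the point made in the text.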
In doing so, the IFF does not affect the description of the interactions within a given phase, which has generally already been optimised during the development of the force field that is being used to model it, and then determines optimised parameters that accurately describe the energetics of the binding of the peptides to the interface. The IFF uses a non-polarisable fixed-charge model to optimise the non-bonded force field parameters used to describe a wide range of non-biological interfaces. The parameters are targeted such that they reproduce a number of properties of the non-biological materials, including crystal lattice parameters, water contact angle, solid–liquid interfacial tension, hydration energy and the adsorption energy of a reference set of peptides.87

In addition to the development of general approaches that can be used to determine force field parameters for a range of materials, the past decade has seen a variety of material-specific force fields developed in order to accurately model the specific properties of a given interface that make it challenging to model its interaction with peptides. While discussing every force field that has been generated to address these systems is beyond the scope of this chapter, in the following paragraphs we will attempt to highlight some of the more recent developments. Readers interested in a more detailed discussion of these force fields are referred to the numerous recent review articles that have been published on the various developments in the area.88–92

Metal surfaces and their applications, including bio-sensing93 and nanomedicine,94 have been the focus of a significant amount of work in developing force field parameters that accurately describe the adsorption of peptides.
The most commonly studied metal to date has been gold, for which numerous groups have used different methods to determine Lennard-Jones parameters that accurately describe the adsorption of peptides.95–97 More recently, the GolP family of force fields98–101 has provided a way to accurately and efficiently model the polarisation of the gold atoms in the presence of the charges in the solute, which plays a significant role in the adsorption of peptides to gold. The GolP force field uses Lennard-Jones interactions on the actual gold atoms, together with a model of the virtual dipole of the atoms in which one end of a rigid rod is constrained to a real gold atom and the other end of the rod is allowed to change its orientation in order to screen the electrostatic field exerted by the solute. This approach has now also been applied to develop a model for silver (the AgP force field),102 and the development of optimised Lennard-Jones parameters for a variety of other face-centred cubic metals has also been reported.96

Another class of materials which has been the focus of research into accurate descriptions of the interactions between peptides and interfaces is represented by oxides, as they too have been found to be useful in a wide range of biological and technological applications. The biggest challenge for classical molecular dynamics simulations of these systems is that by nature these interfaces are reactive with an aqueous environment. As a result, one needs either to employ a reactive force field to describe these interfaces, which is quite computationally intensive, or to use a method that produces a realistic interface of the material without reactions being modelled within the simulation. One of the most common oxides studied by simulation is titanium oxide, where different functional forms have been used to model the non-bonded interactions within these systems, ranging from a many-body Finnis–Sinclair type of interaction for the surface coupled to a Buckingham short-range repulsive interaction103 to a standard Lennard-Jones interaction.104–106 The reactive force field ReaxFF107 has been used to model the adsorption of glycine108 and cysteine residues58 to the interface of titania; however, the nature of these reactive force fields currently makes them too computationally intensive to simulate the adsorption of larger peptides. Another commonly simulated oxide interface is that of silicon oxide, which has seen the recent development of several force fields to model the interactions of biological molecules with the α-quartz crystal109 and amorphous silica interfaces,87,110,111 where methods have been devised to model charged interfaces87,111 in order to reproduce the ionisation of the interface that results from the silica being in contact with solutions of differing pH. More recently, Snyder et al. have applied their Interfacial Force Field approach to generate force field terms for the interaction of peptides with both α-quartz and amorphous silica interfaces.86

Like oxide materials, the complex nature of mineral substrates, which play a key role in many processes including biomineralisation, makes modelling the structure of the mineral interface and the solution phase containing the peptide that is adsorbed to the surface challenging.
The CLAYFF force field is a general force field that has been developed to model the interface of a hydrated multicomponent mineral using mostly the common Lennard-Jones and Coulomb descriptions of non-bonded interactions.112 A general approach has been used to develop bio-compatible force fields to describe these systems, which is quite similar in philosophy to the Interfacial Force Field, in that it uses existing force fields to model the various components of the system (e.g. ions, water, peptide, mineral) and derives new non-bonded parameters only for the interactions between different species.113 This method has been applied to study organic molecules and peptides adsorbing to a range of mineral interfaces.114–117 More recently, atomic force microscopy (AFM) measurements have been used in combination with MD simulations in order to validate a force field which combines the GROMOS force field and CLAYFF to model the adsorption of peptides to mica.118 As one of the main minerals found in teeth and bone, hydroxyapatite (HAP) has been of particular focus when modelling its interaction with biological molecules. Initially, HAP was modelled using a force field119 whose potential energy contained functional forms that are not commonly found in biomolecular force fields. Since then, this force field has been adapted to use the functional forms commonly found in biomolecular force fields and then used to study the adsorption of a variety of peptides to the interface of HAP.120–124

Due to their extraordinary physical, electronic and thermal properties, the various forms of carbon nanostructures (e.g. graphene, carbon nanotubes (CNTs), fullerenes) have attracted a lot of interest for their potential applications within the biological sciences. As a result, a significant amount of work has been done in modelling biomolecules interacting with these various nanostructures. In studying different methods of describing these interactions, simulations using the AMOEBA polarisable force field125,126 of peptides adsorbing to the interfaces of CNTs,127,128 graphite129 and graphene130 have shown that surface polarisation results in a strong interaction between the carbon interface and aromatic rings in the side chains of the peptides. While these studies have shown that the effect of polarisation is important, the application of the AMOEBA force field (or any other similar polarisable force field) is very computationally intensive for studying the adsorption of proteins to these interfaces. Therefore, simplified models of the polarisation have recently been investigated, including the application of a model similar to that found in GolP and AgP, where a rigid rod model is used to include polarisability in the model of graphene, to study the free energy of adsorption of amino acids.131 Simulations of peptides adsorbing to graphene with non-polarisable force fields have suggested that dewetting effects may play a significant role in the binding of peptides to graphene interfaces.132

The surfaces of all of these solid substrates can be modified via functionalisation with a variety of molecules, most commonly in the form of self-assembled monolayers (SAMs). A majority of the commonly used biomolecular force fields have parameters to describe the molecules which make up the self-assembled monolayers (or at least methods by which to generate them), and numerous studies have been conducted with these force fields which show results consistent with experimental observations.
However, it has been shown that even when using parameter sets from the same force field families there can be an overestimation of the binding strength of hydrophobic peptides and an underestimation of the binding of negatively charged peptides.133–136 These observations have led to the application and extension of the IFF approach to these systems as well.137,138

While atomistic simulations can be used to study the initial adsorption of peptides to interfaces, several phenomena that play important roles in the actual systems targeted by these simulations, including the competition between protein–protein and protein–interface interactions, require longer time scales and larger length scales to be understood. Therefore, coarse-grain approaches have started to gain attention; within this chapter we will review only some of the methods that have been applied to MD simulations. As is the case for all-atom simulations, several studies have been done with force fields that were initially parametrised to describe the peptide and the surface but not necessarily the interaction between the two, including the application of the MARTINI force field to study the adsorption of proteins to surfaces139 and nanoparticles19 and the application of a Go-like model to study the formation of a ubiquitin corona on gold nanoparticles.140 Meanwhile, models have been specifically developed to simulate proteins near interfaces, in which the proteins are kept as rigid bodies141 or are modelled as flexible.142

Another approach to coarse-graining of the system is the application of implicit solvent models as opposed to the explicit representation of the solvent.143,144 The behaviour of interfacial water near these types of non-biological interfaces presents a unique challenge which is not usually accounted for by the implicit models traditionally used for molecules in solution. Therefore, material-specific implicit solvent models, which accurately characterise the interfacial water at a specific material's surface, are beginning to be developed, with one example being the ProMetCS model, which was developed for protein–gold(111) interfaces.145 In this model the distortion of the hydration shells around the protein–surface interface is determined by an additional analytical function which is parametrised to reproduce the potential of mean force for a probe atom as determined from MD simulations with explicit water.

4 Applications

4.1 Crucial assemblies in nature: what can we learn from them

We will discuss here some of the assemblies found in nature that are constituted by proteins and nucleic acids. They represent functionally active systems that can be used as ideally complementary building blocks to generate stable assemblies, and can therefore be studied to extrapolate first principles of molecular recognition. Natively assembled protein structures can be homomeric, i.e. formed by repeated copies of the same chain/unit; heteromeric complexes are instead composed of multiple distinct protein subunits, mostly encoded by different genes. Recent work from the Thornton and Teichmann groups1,2 has elucidated the enormous diversity of protein interaction modes, mainly made possible by structural biology techniques. Of course, the more “dynamic” and transient amongst these interaction modes are less easy to determine and characterise, and therefore computational methods can help in understanding the general principles that differentiate permanent stable assembly from transient association.3,4,146 The accumulated information on the recognition principles used in these naturally occurring assemblies (e.g. buried hydrophobic surface, hydrogen bonds at the interface, salt bridges) has allowed for the extraction of general rules for the characterization of binding interfaces that can be exploited in synthetic design approaches,147,148 machine learning149 and docking procedures150 to predict unknown binding modes not yet characterised by 3D experiments.
The quest for knowledge of structurally undetermined binding modes is stimulated by the accessibility of modern DNA sequencing techniques that allow for the sequence determination of an enormous number of proteins,151 for which a systematic and detailed structure determination is impossible.152 Therefore, computational approaches can help in filling the gap in this sequence-structure knowledge of the protein interaction space.5,7 Docking can be useful in generating possible bimolecular assemblies with simplified energy functions that contain a reduced set of parameters describing some of the main terms contributing to the association. These computational procedures do not directly use molecular simulations, but they may use selected force-field functionals in their energy refinement procedure.153,154

Once some of the general principles used in native binding sites are known, one could attempt, through synthetic biology approaches, to design effective non-native binding modes. In their work to design and then crystallise mutants of protein dimers which associate through non-native binding sites, Schulz et al.147 have shown that designability of side-chain mobility can be crucial in discriminating stable designed binding modes from unstable ones. A different perspective, using simulations to depict protein–protein interaction dynamics in the crowded cellular environment, has been pioneered by the group of Elcock, who used Brownian dynamics to simulate the cytoplasm of a bacterial cell and extracted thermodynamic properties of protein association.55,155 In this last approach a simplified description of crucial cellular complexes is used to model the interactions occurring in the cellular environment. All these approaches can generate knowledge that can be directed to the design of more stable complexes and/or to the fine-tuning of protein interactions.147

A challenging parameter that is difficult to quantify is the role of the flexibility of residues at the interface. Recently, we performed a large-scale survey of the intrinsic dynamics of protein interfaces using a dataset of 251 proteins classified according to their number of partners.156 Two main classes were defined in this study: mono-partner and multi-partner proteins. The goal was to demonstrate that, overall, multi-partner proteins (promiscuous binders) use flexibility as one of their binding recognition modes.
Conformational ensembles were generated using a variety of computational methods, including MD, Gaussian Network Models and the distance geometry method tCONCOORD.157 A set of multi-partner proteins was identified that showed a higher correlation between flexibility and binding profiles and a higher number of interface-correlated collective motions (Fig. 1). Moreover, statistically significant differences in the flexibility of mono- and multi-partner residues were found. In particular, while mono-partner residues showed a clear reduction in flexibility with increasing amino acid evolutionary conservation, this effect was less pronounced for multi-partner residues. The higher flexibility of conserved multi-partner residues could thus be interpreted as a compromise between the tendency observed in other works to reduce the binding entropic penalty of conserved residues and the necessity to adapt the protein interface to different environments. This finding could be a key element in the prediction of potentially promiscuous interface residues.156

As discussed before, it is interesting to aim at characterising structurally undetermined binding modes by exploiting native assembly rules. A different case is given by protein assemblies that result from aggregation. We will discuss some cases in the following section. Most of the work in this field is performed with all-atom representations, as the dramatic changes in conformation that lead to aggregating species are difficult to reproduce with current CG force field parametrizations.

Fig. 1 A set of multi-partner proteins was identified that show a higher correlation between flexibility and binding profiles. Statistically significant differences in the flexibility of mono- and multi-partner residues were found (left panel). In the right panel, the example of a ubiquitin-like protein (Neddylin) is shown with the binding profile extracted from PROSITE162 and the Cα RMS profile superimposed.163
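A comparison between a per-residue flexibility profile and a binding profile, of the kind summarised in Fig. 1, can be sketched as follows. This is not the analysis pipeline of the cited survey: the function names and the toy ensemble are hypothetical, the frames are assumed to be already superimposed on a common reference, flexibility is measured as the root-mean-square fluctuation (RMSF) of one Cα position per residue, and the Pearson coefficient is chosen here simply as one common measure of correlation.

```python
from math import sqrt
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def rmsf(ensemble: Sequence[Sequence[Vec3]]) -> List[float]:
    """Per-residue RMSF; ensemble[frame][residue] is a pre-aligned position."""
    n_frames = len(ensemble)
    n_res = len(ensemble[0])
    profile = []
    for r in range(n_res):
        mean = [sum(frame[r][k] for frame in ensemble) / n_frames for k in range(3)]
        msd = sum(
            sum((frame[r][k] - mean[k]) ** 2 for k in range(3)) for frame in ensemble
        ) / n_frames
        profile.append(sqrt(msd))
    return profile


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Toy ensemble: two frames, three residues (arbitrary length units).
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)],
]
flexibility = rmsf(frames)
binding_profile = [0.0, 1.0, 2.0]   # hypothetical per-residue binding score
correlation = pearson(flexibility, binding_profile)
```

In this contrived example the residue that moves most is also the one with the highest binding score, so the two profiles are perfectly correlated.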

4.2 Proteins non-native assembly: some paradigmatic cases

Protein aggregation is the biggest enemy of the protein chemist and of the molecular biologist. In recent years it has become clear that many neurodegenerative disorders are triggered by the presence in neural tissue of aberrant protein aggregates, making protein aggregation a threat to mankind. More than 20 protein aggregation-related diseases have been identified, including Parkinson's disease, Alzheimer's disease and Type II diabetes.158,159 In addition, even proteins not linked to any known disease can aggregate, and actually most proteins do, provided that the native fold is destabilized and the solution is sufficiently concentrated.13,160 In these cases, protein assembly and aggregation therefore represent an alternative pathway to folding, rather than necessarily an aberrant process. In fact, in many organisms (including bacteria, fungi, invertebrates and humans), protein aggregates serve a functional role.161 To date, a number of primary sequences and three-dimensional structures of proteins capable of amyloid fibril formation have been characterised.164–166 This work has stimulated a number of simulation papers that have tested the efficacy of MD simulations in predicting assembly behaviour.166–172

We will focus on the special case of prion proteins (PrP), which are self-replicating infective protein conformations capable of aggregating into amyloid fibrils. Due to their unusual ability to stably adopt multiple conformations, at least one of which is self-perpetuating, prions represent a very interesting challenge for in silico sampling methods. Molecular dynamics simulations have been extensively applied to prion proteins and different constructs of them to characterise their unfolding at a molecular level.
The effects of pH, temperature and solvent environment on the stability and dynamics of the human C-terminal domain of PrP (PrPC) have been investigated.143,173,174 Several groups, including authors of this chapter, have investigated the importance of sub-secondary structural elements in the nucleation of oligomers, which are the first stage of assembly leading to fibrils. We have shown by atomistic molecular simulations, with validation from cross-linking experiments, that early-stage oligomer formation does not involve the native β-sheets present in the PrPC folded domain, as previously suggested,175 and is limited to the two helices present in the structure of PrPC (the H2H3 domain).174 MD simulations of the H2H3 domain revealed a complete conversion from an all-helical structure to a stable β-rich intermediate, providing a possible candidate structure for the initiation of the oligomerisation process (Fig. 2). The role played in the aggregation process by the solvent in which the proteins are immersed is of significant interest as well. To that end, solvent entropy maps and density distributions have been introduced163,167,176 to correlate high-entropy water spots with underprotected regions of the protein surface, where the protein backbone hydrogen bonds are exposed to the solvent. The hydration map of this stable β-rich intermediate of the H2H3 subdomain of the human PrPC showed two distinct faces, one with localised hydrophobic patches, which could be responsible for the oligomer formation.

Fig. 2 Left: cartoon representation of the tetramer model based on interactions between residues 204–214 (spacefill-transparent sphere). Right: proposed model of the assembly of 9 tetrameric units of H2H3.174 Reprinted with permission from N. Chakroun, A. Fornili, S. Prigent, J. Kleinjung, C. A. Dreiss, H. Rezaei and F. Fraternali, J. Chem. Theory and Comp., 2013, 9, 2455. Copyright (2013) American Chemical Society.

These observations have important consequences for the assembly mechanism of the early oligomeric intermediates. Simulating the assembly process of prion proteins or their segments via MD is a much more complex challenge. The main reason for this is that it is still debated which segments are really responsible for oligomer assembly and/or fibril formation. For detailed in silico assembly studies, researchers have focused on a small prion peptide of sequence GNNQQNY from the Yeast Prion Sup-35, as this has been crystallised similarly to other amyloid-forming peptides and shows a typical cross-β spine arrangement.164 These studies showed that simulations can be used to study the assembly process and characterise the process of self-organization, and that this can be used as a model in predicting the assembly of other related sequences.169 As mentioned before, the atomistic description is often necessary for studies requiring the characterization of complex conformational changes; therefore, investigations in this field have been made possible by the use of a combination of sampling techniques (MC, Replica Exchange Molecular Dynamics (REMD)) and multiscale approaches (coarse-grained, semi-coarse-grained, atomistic) to investigate the assembly.168,169,177

4.3 Virus-like particles

A further field of interest for synthetic biology is the study of viral structures. A virus capsid is the protein capsule enclosing the viral genome: understanding the interaction between the components forming the structure and how they assemble is of twofold interest. First, it helps in designing new drugs which change the geometry of the virus in order to destabilise its life cycle. Second, it guides the design of virus-like peptidic particles (VLPs), recently synthesised in vivo to be used for drug or vaccine delivery.

For the purpose of drug testing and design, all-atom simulations are needed to capture the effect of the compound on the target capsid. The viral structure is statically assembled from known subunits using geometrical rules and experimental data. MD simulations are performed on such a structure with the drug compound placed initially near the active site suspected to be targeted. Schulten et al. successfully simulated two drug-bound virus capsids (HAP1 on the HBV capsid and PF74 on HIV-1), finding that the drug does not change the subunit geometry but promotes a conformational change of the whole 3-D quaternary structure of the capsule, sufficient to inhibit the viral action.178

The second and more innovative field, the design of VLPs, has not yet benefitted from the application of simulation techniques. The task is challenging due to the difficulty of simulating large systems for the time needed to observe the self-assembly process from single synthetically designed subunits. Various coarse-graining techniques have therefore been tested to tackle the problem. Voth et al.179 used an ultra-coarse-grained approach based on a heterogeneous Elastic Network Model. The peptidic subunit is described as a series of beads placed at the positions of the alpha-carbons of the original chain, connected by springs with different elastic constants (fitted to the results of all-atom MD simulations). Using this approach, they studied how the partial assembly and disassembly of HIV-1 capsids were affected by the concentration of monomers in solution and by their conformation.
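The bead-and-spring picture of a heterogeneous elastic network can be illustrated with a short energy function. The sketch below assumes a plain harmonic form, E = Σ ½ k_ij (d_ij − d⁰_ij)², with one spring constant per bead pair; this functional form, the function name and all numbers are assumptions for illustration and are not taken from the cited HIV-1 capsid model, where the constants are fitted to all-atom MD data.

```python
from math import dist
from typing import Dict, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def enm_energy(
    coords: Sequence[Vec3],
    reference: Sequence[Vec3],
    springs: Dict[Tuple[int, int], float],
) -> float:
    """Harmonic elastic-network energy with a separate constant per spring.

    Each spring (i, j) is at rest when beads i and j are at their
    reference separation, so the reference structure has zero energy.
    """
    energy = 0.0
    for (i, j), k in springs.items():
        d = dist(coords[i], coords[j])
        d0 = dist(reference[i], reference[j])
        energy += 0.5 * k * (d - d0) ** 2
    return energy


# Three hypothetical C-alpha beads on a line, 3.8 length units apart.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
# Stiff springs between neighbours, a softer one between the chain ends.
springs = {(0, 1): 10.0, (1, 2): 5.0, (0, 2): 1.0}

# Displace the last bead by one length unit along the chain.
stretched = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (8.6, 0.0, 0.0)]
e_ref = enm_energy(reference, reference, springs)
e_stretched = enm_energy(stretched, reference, springs)
```

Moving the last bead stretches two springs by one unit each, so the energy is 0.5 × 5 + 0.5 × 1 = 3 in the arbitrary units used here. `math.dist` requires Python 3.8 or later.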
These simulations showed, for example, that only a narrow window of subunit concentration produces fully mature capsids: outside this window, flatter structures unable to close into capsules were preferred, in agreement with the structures observed in electron cryotomograms (Fig. 3). It is possible to retain some atomistic detail implicitly within a Multiscale Factorization approach, which does not assume any a priori scale separation but applies a stationarity hypothesis to the CG particle momenta. This speeds up the simulations without the need to compute expensive parameters such as diffusion coefficients and thermally averaged forces. After the variables describing the system are separated into slow and fast ones, short all-atom MD simulations are performed at fixed values of the slow (CG) variables, which are then updated over a larger timestep using information from the short MD runs. This approach was tested on human papillomavirus (HPV) 16 and P22, showing that it can predict the initial stages of subunit assembly180 (Fig. 4). The final challenge is the simulation of capsids on a membrane, to test either toxicity for human cells or antimicrobial action on microbial cells. Experiments have shown some VLP to be valuable antimicrobial agents,40 and insight into the mechanism promoting such action is desirable for future developments. Past works focused on the action of single peptides on the membrane,181 small assemblies of a few peptidic chains (like the fusion sequence of influenza virus HA182) and micelles (like Pluronic micelles as drug nanocarriers183; Fig. 7). Virus-like nanocapsules are, however, around ten times bigger than lipid micelles, and only the development of specific coarse-graining methods will allow their accurate simulation.184 Future perspectives include the improvement of the techniques mentioned above as well as the use of the MD-SCF approach described previously to simulate VLP systems. A reliable tool for simulating the assembly would open the door to a systematic search for better building blocks to obtain customised geometries for drug and vaccine delivery.

Fig. 3 Subunit concentrations explored in the self-assembly process of the HIV-1 capsid. A narrow range (0.75–1) generates a single lattice region, concentrations of 0.25–0.75 produce trimers of dimers not assembled in a lattice, and higher ones produce the nucleation of multiple lattice regions. Bottom row: examples of final structures; lamellar regions are circled by ovals. Scale bar 20 nm. In grey (not to scale): electron cryotomography of lamellar lattice regions for the intermediate concentration. Reprinted under Creative Commons licence CC-BY (http://creativecommons.org/licenses/by/4.0/) from J. M. A. Grime, J. F. Dama, B. K. Ganser-Pornillos, C. L. Woodward, G. Jensen, M. Yeager and G. A. Voth, Coarse-grained simulation reveals key features of HIV-1 capsid self-assembly, Nat. Commun., 2016, 7, 11568, https://doi.org/10.1038/ncomms11568. Copyright © 2016, Rights Managed by Nature Publishing Group.
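The slow/fast splitting used in multiscale schemes of this kind can be illustrated with a deliberately simple toy model. The sketch below is not the Multiscale Factorization algorithm of ref. 180: it is a generic illustration, under stated assumptions, in which a short stochastic burst of a fast variable is run at a frozen value of the slow variable, and the burst average is then used to advance the slow variable over a much larger timestep. The model equations, the function names (`fast_burst`, `multiscale_run`) and all parameter values are hypothetical.

```python
import numpy as np

def fast_burst(x_slow, y0, n_steps, dt, eps, noise, rng):
    """Short fine-grained run at a frozen slow variable.

    Toy fast dynamics: dy = -(y - x_slow)/eps dt + noise dW.
    Returns the burst average of y and the final value of y.
    """
    y = y0
    total = 0.0
    for _ in range(n_steps):
        y += -(y - x_slow) / eps * dt + noise * np.sqrt(dt) * rng.standard_normal()
        total += y
    return total / n_steps, y

def multiscale_run(x0, n_macro, big_dt, rng,
                   burst_steps=2000, dt=1e-4, eps=1e-2, noise=0.1):
    """Advance the slow variable with large steps informed by short bursts.

    Toy slow dynamics: dx/dt = -<y>, where <y> is estimated from each burst.
    """
    x, y = x0, x0
    traj = [x]
    for _ in range(n_macro):
        mean_y, y = fast_burst(x, y, burst_steps, dt, eps, noise, rng)
        x += big_dt * (-mean_y)        # large step using the burst average
        traj.append(x)
    return np.array(traj)
```

In this toy model the fast variable relaxes around the slow one, so the slow variable decays approximately as exp(-t); the large step `big_dt` is several hundred times the fine step `dt`, which is where the saving comes from.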

5 MD-SCF models: applications in self assembly

The choice to develop a hybrid MD-SCF model was motivated by the need for a description of the system that retains chemical specificity while giving access to large time and length scales. The coarse-graining model proposed by Marrink et al. was deemed suitable for this purpose.185 In the same spirit, the MD-SCF model was first tested on the structural properties of lipid bilayers.186 In particular, partial electron density profiles and bilayer thicknesses calculated for different phospholipids were shown to compare well with those calculated from reference MD simulations and with available experimental data.187,188

Fig. 4 Snapshots of the initial and final configurations of (a) HPV 16 VLP and (b) P22 VLP undergoing thermal fluctuations. The first changes slightly during the simulation, implying stability, while the second does not. Reprinted from Vaccine, 33, A. A. Mansour, Y. V. Sereda, J. Yang and P. J. Ortoleva, Prospective on multiscale simulation of virus-like particles: Application to computer-aided vaccine design, 5890–5896, Copyright (2015), with permission from Elsevier.

Another important point in developing coarse-grained models is the state dependence. More specifically, a coarse-grained Hamiltonian incorporates a dependence of the effective parameters on the temperature and composition of the system. This dependence is not known a priori and needs to be investigated for every CG model. For the CG models reviewed here, the equivalents of the mesoscopic coefficients used to describe the intermolecular interactions are the w and k parameters in eqn (1) and (2). This aspect is important because phospholipids display a rich variety of phase structures in water at different concentrations. Besides bilayers, they can form non-lamellar phases, including the hexagonal and cubic phases, as well as micellar phases in dilute aqueous solution. Hexagonal phases are typically columnar aggregates and can be composed of either normal or reverse tubes. Cubic phases are made of curved bilayers or micelles. Depending on the water content, micelles can change their aggregation form from oil-in-water to reverse ("water-in-oil") micelles. Test simulations have confirmed the ability of the MD-SCF model to predict the appearance of the non-lamellar phases at the correct lipid concentration. Additionally, the MD-SCF models are able to correctly describe the different morphologies189 (shown in Fig. 5). Fig. 5A shows the spontaneous assembly of different aggregates of DPPC in water, and Fig. 5B the hexagonal packing of tubular aggregates at larger DPPC concentration, in agreement with experiments.

Fig. 5 (A) Self-assembled structures obtained from MD-SCF simulations of DPPC with different water contents. Reverse micelles (I), lipid bilayers (II), bicelles (III) and micelles (IV) are obtained on going from low to high water content. (B) Formation of reverse micelles in the hexagonal phase. Reprinted from G. Milano, T. Kawakatsu and A. De Nicola, A hybrid particle–field molecular dynamics approach: a route toward efficient coarse-grained models for biomembranes, Physical Biology, 10(4), 45007, 2013, https://doi.org/10.1088/1478-3975/10/4/045007. © IOP Publishing. Reproduced with permission. All rights reserved.

5.1 Nanostructures and nanomaterials in contact with lipid bilayers and liposomes

Using MD-SCF, both the formation of CNT bundles and their insertion and rearrangement inside lipid bilayers can be described and analysed in detail by MD simulations on the microsecond scale.190 These simulations have shown that during the insertion process lipid molecules coat the surfaces of the CNT bundles. Moreover, the simulations revealed that when CNT bundles insert perpendicular to the bilayer plane, adsorption of lipids on the bundle surfaces promotes transient poration. This result suggests the formation of solvent-rich pockets as a preliminary stage of membrane disruption by CNT bundles. It is worth noting that during the formation of the pore (Fig. 6B) phospholipids are present in all three possible assemblies: bilayer, normal micelles (coating the CNT surface) and reverse micelles (around the water pore). These results highlight the importance of using CG models with the correct concentration dependence even when simulating such simple systems.

Fig. 6 Insertion process of a CNT bundle formed of 20 nm long tubes. (A) Snapshot of the insertion process. (B) Detail of a snapshot at 145 ns of the MD-SCF simulation showing the formation of a pore filled with water molecules (bead representation).190 Reprinted from Chemical Physics Letters, 595–596, E. Sarukhanyan, A. De Nicola, D. Roccatano, T. Kawakatsu and G. Milano, Spontaneous insertion of carbon nanotube bundles inside biomembranes: A hybrid particle-field coarse-grained molecular dynamics study, 156–166. Copyright (2014), with permission from Elsevier.

Thanks to the enhanced speed of MD-SCF simulations, it has been possible to study the interaction of assemblies of Pluronics chains (L64 Pluronics) as spherical micelles in water and in the presence of a DPPC

lipid bilayer.183 The polymer chains self-assemble into stable micelles in the absence of the lipid membrane, but the micelles were found to be unstable in the presence of the bilayer. The mechanism revealed by MD-SCF simulations on the timescale of 10 μs involved the release of individual polymer chains from the micelle into the bilayer, which led to destabilization and dissolution of the micelle. The interaction of the L64 polymer chains with the lipid bilayer is able to shift the chemical equilibrium, subtracting polymer chains from the assembly and bringing the micelle below its critical micelle concentration. Interestingly, incorporation of model drugs into the polymer micelles shifted the equilibrium toward a stable micellar state and had a strong effect on the final size of the assembly. In particular, hydrophobic drugs stabilize the micelle, while more hydrophilic drugs have no effect on micelle stability.183 The main results of this study are summarized in Fig. 7.

Fig. 7 (A) Pluronics L64 micelle in water. (B) L64 micelle in contact with a DPPC lipid bilayer: two systems having a hydrophobic molecule inside the micelle core (hydrophobicity = 1, stable micelle) and an intermediate one (hydrophobicity = 0.5, unstable micelle) show different behaviour. (C) Equilibrium size (radius of gyration Rg) of the L64 micelle in contact with DPPC as a function of hydrophobicity. Hydrophobicity is 1 if the interaction parameters of the drug molecules toward other species are the same as those of the carbon tails of phospholipids and 0 if they are the same as those of water molecules. Intermediate values are linear combinations of the two limiting cases. Reproduced from ref. 183 with permission from the PCCP Owner Societies.

Liposomes and their synthetic versions (polymersomes) are ubiquitous objects of interest for synthetic biology. Cell-like liposomes produced by microfluidic devices can be constructed as functional units in which the lipid composition is designed to give desired biophysical or chemical properties.191 Hybrid simulations allow efficient modelling of complete liposomes in molecular detail. In particular, a model combining Brownian dynamics (BD) and dynamic density functional theory (DDFT), representing the solvent as a continuum field, has been reported.192 This model makes it easy to describe constituents for which explicit chain conformations and/or different compositions are important, while fields represent the abundance of the aqueous solvent. This mixed representation adds molecular detail to the simulations and, at the same time, allows for increased efficiency. A typical system simulated with this hybrid model is depicted in Fig. 8.

Fig. 8 Upper panel: snapshots of dissipative particle dynamics (fully particle based, left) and the hybrid method (right), in which all solvent beads are represented by a single solvent field. Shaded field areas indicate concentration values, and the field is transparent below a threshold concentration. Lower panel: starting structure for a punctured vesicle simulation and structure obtained after 1.5 × 10⁶ time steps with the hybrid model, showing lipid rearrangement towards vesicle closure. The simulated vesicle has a diameter of ~90 nm. Reproduced from ref. 192 with permission from The Royal Society of Chemistry.

The possibility of having models with effective beads that represent a small number of atoms and yet reach large scales is interesting because such models can be used in a multiscale scheme to obtain very detailed descriptions of the simulated systems through reverse-mapping techniques.193–196 From the coarse-grained description one can go back to an atomistic representation and vice versa. These reverse-mapping approaches have been successfully applied to several soft matter systems to convert coarse-grained representations into atomistic ones,62,197–200 including, in the context of self-assembly, micelles formed by lysophospholipids and by Triton X-100 (TX-100), a surfactant widely used in biology.201,202 A hybrid particle-field CG model has been validated over a wide range of concentrations using the MD-SCF simulation technique. In particular, large-scale simulations show a critical micelle concentration, a shape transition in the isotropic micellar phases and the existence of a hexagonally ordered phase in the ranges reported in the experimental literature. The fine resolution of the CG model allows one to obtain, by reverse-mapping techniques, well-relaxed atomistic models of the micellar assemblies and of the hexagonal phase. The atomistic models obtained for the micellar assemblies show good agreement with experimental pair distance distribution functions and hydrodynamic measurements. Some of the results for these assemblies and a scheme describing the reverse-mapping approach used are summarized in Fig. 9.

Fig. 9 (A) Experimental (line) and calculated (symbols) pair distribution functions P(r) for a TX-100 micelle in an aqueous solution containing 0.01 M NaCl. (B) Snapshot of the micelle structure with density isosurfaces, enclosing regions where the number density is larger than 0.01 ions per nm³, representing the distributions of Na⁺ and Cl⁻ ions. (C) Strategy adopted for reverse mapping. SET 1 corresponds to atomistic configurations obtained from the TX-100 single-chain trajectories. SET 2 is a given coarse-grained (CG) configuration to be back-mapped. SET 3 is the back-mapped atomistic configuration. The best guess for a back-mapped configuration is obtained using rotations that maximize atom superposition.202 Reprinted with permission from A. De Nicola, T. Kawakatsu, C. Rosano, M. Celino, M. Rocco and G. Milano, J. Chem. Theory Comput., 2015, 11(10), 4959–4971. Copyright (2015) American Chemical Society.
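The superposition step at the heart of such reverse-mapping schemes can be sketched as follows. This is a minimal illustration under stated assumptions and not the procedure of ref. 202: each candidate is taken to be an N x 3 array of mapping-site positions derived from an atomistic conformer, the target is the corresponding set of CG bead positions, and the optimal rigid rotation is found with the standard Kabsch (SVD-based) algorithm. The function names `kabsch_fit` and `pick_best_candidate` are hypothetical.

```python
import numpy as np

def kabsch_fit(mobile, target):
    """Rigidly superimpose `mobile` (N x 3) onto `target` (N x 3).

    Returns the fitted coordinates and the RMSD after superposition.
    """
    mobile = np.asarray(mobile, dtype=float)
    target = np.asarray(target, dtype=float)
    mc, tc = mobile.mean(axis=0), target.mean(axis=0)
    P, Q = mobile - mc, target - tc               # centred coordinates
    U, _, Vt = np.linalg.svd(P.T @ Q)             # SVD of the covariance matrix
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T       # proper rotation (no reflection)
    fitted = P @ R.T + tc
    rmsd = np.sqrt(np.mean(np.sum((fitted - target) ** 2, axis=1)))
    return fitted, rmsd

def pick_best_candidate(candidates, cg_beads):
    """Return (index, fitted coordinates, rmsd) of the best-superimposing candidate."""
    fits = [kabsch_fit(c, cg_beads) for c in candidates]
    best = min(range(len(fits)), key=lambda i: fits[i][1])
    return best, fits[best][0], fits[best][1]
```

In a full back-mapping workflow the rotation and translation found for the mapping sites would then be applied to all atoms of the selected conformer, followed by relaxation of the atomistic model.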

6 Conclusions and future perspectives

We have described here some of the challenges in the computational description of crucial functional assemblies highlighted by recent analyses of large-scale folded and misfolded protein assemblies occurring in nature.1,2 These fascinating macromolecular structures are starting to be amenable to simulation, and some of the detailed mechanisms of their function that are inaccessible to experiments may now be uncovered by molecular dynamics and related techniques. Much still needs to be learned from the available structures collected by X-ray spectroscopy and other emerging high-resolution methods,203 and the combination of structural bioinformatics and synthetic biology techniques can help in mining these catalogues to extract the first principles that play a key role in recognition and self-assembly. With the success of machine learning methods and the spread of deep learning technologies,204 we foresee further exploitation of these data for novel approaches in synthetic biology. Future force-field development may well make use of these technologies. We have highlighted progress in coarse-grained methods, which reduce the number of degrees of freedom whilst maintaining reproducibility and physical behaviour. Some widely used methods, such as Martini, still lack flexibility in describing conformational changes, and new methods are therefore being developed to address these issues. Recent developments of this force field205 are revisiting the use of a softer repulsive LJ 12-6 potential for the non-bonded interactions.206 More development is needed to account for conformational changes in proteins; so far, backbone pseudo-dihedral potentials that reproduce the backbone conformations of atomistic simulations have been attempted.182 Still, many of these approaches require preliminary atomistic simulations as test cases and cannot be transferred to other peptide sequences.
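For readers unfamiliar with the functional form, the sketch below evaluates the standard Lennard-Jones 12-6 potential as a special case of the more general Mie n-m potential. The text does not specify how the repulsion is softened in the work cited, so the comparison with a lower repulsive exponent (9-6) is only a generic illustration of what a "softer" repulsive wall means; the function name `mie_potential` and the reduced units (epsilon = sigma = 1) are assumptions.

```python
def mie_potential(r, epsilon=1.0, sigma=1.0, n=12, m=6):
    """Mie n-m pair potential; n=12, m=6 gives the Lennard-Jones 12-6 form.

    The prefactor is chosen so that the well depth is always -epsilon.
    """
    c = (n / (n - m)) * (n / m) ** (m / (n - m))   # equals 4 for the 12-6 case
    return c * epsilon * ((sigma / r) ** n - (sigma / r) ** m)
```

Both forms cross zero at r = sigma and share the same well depth, but at short distances the 9-6 wall rises more slowly than the 12-6 one, which is the sense in which it is softer.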


It is important to stress that all phenomena observed at the nanoscale are intrinsically complex because the energies involved are of the same order of magnitude as the thermal energy. Their behaviour therefore needs to be described quantitatively by approaches that accurately capture both the enthalpic and entropic factors at play in the microscopic phenomena modelled in synthetic biology. This is exactly an area in which carefully designed molecular simulations can make an important contribution by screening the properties that are critical to nanoscale phenomena. We have highlighted the importance of multiscale simulations, and many of the efforts in the next generation of force-field development and computational approaches will be devoted to these challenging physico-chemical systems. Methods such as the hybrid particle-continuum approach can use atomistic or coarse-grained representations and embed the calculated non-bonded interactions with the help of SCF theory. More development is needed in this area to address conformational changes, charged species and the different solvents used in biomolecular simulations. We have also focused on a novel area of interest in the study of assembly phenomena: the treatment of biomolecules at the interface with metal nanoparticles and/or carbon nanotubes. For these systems we point out that all-atom and classical MD simulations are limited in their sampling of the crucial events that determine the macroscopic behaviour of the assemblies. The time scales accessible even to the largest supercomputing facilities are insufficient to sample these events accurately. Nature-inspired systems will stimulate specifically designed systems that will, in turn, challenge the development of force fields and methodological approaches that can reliably describe their properties for use in synthetic biology applications. Eventually, this will revolutionise the use of simulations and synthetic biology approaches in our understanding of nature.
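The hybrid particle-continuum idea mentioned above, in which pairwise non-bonded interactions are replaced by the interaction of each particle with density fields, can be sketched in a few lines. The functional form used here (a Flory-Huggins-like mixing term plus a compressibility term) is a form commonly used in the hybrid particle-field literature and is assumed for illustration; it is not copied from eqn (1) and (2) of this chapter, and the names `density_fields`, `external_potential`, `chi` and `kappa` are hypothetical.

```python
import numpy as np

def density_fields(positions, types, n_types, box, n_cells):
    """Nearest-grid-point densities per species on a cubic periodic grid.

    Densities are normalised by the mean total number density, so a
    homogeneous system has a total field equal to 1 everywhere.
    """
    positions = np.asarray(positions, dtype=float) % box
    types = np.asarray(types, dtype=int)
    idx = np.floor(positions / box * n_cells).astype(int) % n_cells
    phi = np.zeros((n_types, n_cells, n_cells, n_cells))
    np.add.at(phi, (types, idx[:, 0], idx[:, 1], idx[:, 2]), 1.0)
    cell_volume = (box / n_cells) ** 3
    mean_density = len(positions) / box ** 3
    return phi / cell_volume / mean_density

def external_potential(phi, chi, kappa, kT=1.0):
    """Assumed form: V_K = kT * sum_K' chi[K, K'] * phi_K' + (sum_K' phi_K' - 1) / kappa."""
    total = phi.sum(axis=0)
    mixing = kT * np.einsum("kl,lxyz->kxyz", np.asarray(chi, dtype=float), phi)
    return mixing + (total - 1.0) / kappa
```

In a simulation, the force on a particle of species K would follow from the spatial gradient of `external_potential(...)[K]` at its position, and the fields would be refreshed every few time steps rather than at every step.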

Acknowledgements

FF thanks the BBSRC for a grant (no. BB/G017190/1); IM thanks the EPSRC for support through the CANES DTC.

References

1 J. A. Marsh and S. A. Teichmann, Ann. Rev. Biochem., 2015, 84, 551.
2 S. E. Ahnert, J. A. Marsh, H. Hernández, C. V. Robinson and S. A. Teichmann, Science, 2015, 350, 2245.
3 I. M. A. Nooren and J. M. Thornton, EMBO J., 2003, 22, 3486.
4 I. M. A. Nooren and J. M. Thornton, J. Mol. Biol., 2003, 325, 991.
5 R. Mosca, A. Céol and P. Aloy, Nat. Methods, 2013, 10, 47.
6 R. Mosca, T. Pons, A. Céol, A. Valencia and P. Aloy, Curr. Opin. Struct. Biol., 2013, 23, 929.
7 J. Negroni, R. Mosca and P. Aloy, Structure, 2014, 22, 1356.
8 A. Stein, R. Mosca and P. Aloy, Curr. Opin. Struct. Biol., 2011, 21, 200.


9 Y. T. Lai, N. P. King and T. O. Yeates, Trends Cell Biol., 2012, 22, 653.
10 G. Invernizzi, E. Papaleo, R. Sabate and S. Ventura, Int. J. Biochem. Cell Biol., 2012, 44, 1541.
11 F. Chiti and C. M. Dobson, Nat. Chem. Biol., 2009, 5, 15.
12 F. Chiti, P. Webster, N. Taddei, A. Clark, M. Stefani, G. Ramponi and C. M. Dobson, Proc. Natl. Acad. Sci. U. S. A., 1999, 96, 3590.
13 M. Bucciantini, E. Giannoni, F. Chiti, F. Baroni, L. Formigli, J. Zurdo, N. Taddei, G. Ramponi, C. M. Dobson and M. Stefani, Nature, 2002, 416, 507.
14 Y. Lin, N. Koga, R. Tatsumi-Koga, G. Liu, A. F. Clouser, G. T. Montelione and D. Baker, Proc. Natl. Acad. Sci. U. S. A., 2015, 112, 5478.
15 H. Park, F. DiMaio and D. Baker, Structure, 2015, 23, 1123.
16 O. Alvizo and S. L. Mayo, Proc. Natl. Acad. Sci. U. S. A., 2008, 105, 12242.
17 J. B. Bale, S. Gonen, Y. Liu, W. Sheffler, D. Ellis, C. Thomas, D. Cascio, T. O. Yeates, T. Gonen, N. P. King and D. Baker, Science, 2016, 353, 389.
18 Y. Mou, P. Huang, F. Hsu, S. Huang and S. L. Mayo, Proc. Natl. Acad. Sci. U. S. A., 2015, 112, 10714.
19 A. Hung and I. Yarovsky, Biochemistry, 2011, 50, 1492.
20 K. Ley, A. Christofferson, M. Penna, D. Winkler, S. Maclaughlin and I. Yarovsky, Front. Mol. Biosci., 2015, 2, 64.
21 M. Penna, K. Ley, S. Maclaughlin and I. Yarovsky, Faraday Discuss., 2016, 191, 435.
22 L. Zhao, S. Chiu, J. Benoit, L. Chew and Y. Mu, J. Phys. Chem. B, 2011, 115, 12247.
23 R. Friedman, Biochem. J., 2011, 438, 415.
24 J. Choi, J. Kim, S. Kim and J. Min, Biochip J., 2009, 3, 157.
25 Y. Hu, A. Das, M. H. Hecht and G. Scoles, Langmuir, 2005, 21, 9103.
26 G. M. Manecka, J. Labrash, O. Rouxel, P. Dubot, J. Lalevée, S. A. Andaloussi, E. Renard, V. Langlois and D. L. Versace, ACS Sustain. Chem. Eng., 2014, 2, 996.
27 S. Y. Park, A. K. R. Lytton-Jean, B. Lee, S. Weigand, G. C. Schatz and C. A. Mirkin, Nature, 2008, 451, 553.
28 C. Sanchez, P. Belleville, M. Popall and L. Nicole, Chem. Soc. Rev., 2011, 40, 696.
29 S. V. Dorozhkin, Acta Biomater., 2010, 6, 715.
30 S. A. Mackowiak, A. Schmidt, V. Weiss, C. Argyo, C. von Schirnding, T. Bein and C. Bräuchle, Nano Lett., 2013, 13, 2576.
31 C. Tamerler and M. Sarikaya, Acta Biomater., 2007, 3, 289.
32 R. J. Macfarlane, B. Lee, M. R. Jones, N. Harris, G. C. Schatz and C. A. Mirkin, Science, 2011, 334, 204.
33 E. Gazit, NanoBioTechnology, Humana Press, Totowa, NJ, 2008, ch. V, p. 385.
34 E. Kasotakis and A. Mitraki, Methods Mol. Biol., 2013, 996, 195.
35 J. B. Matson and S. I. Stupp, Chem. Commun., 2012, 48, 26.
36 T. Aida, E. W. Meijer and S. I. Stupp, Science, 2012, 335, 813.
37 R. Rošic, P. Kocbek, J. Pelipenko, J. Kristl and S. Baumgartner, Acta Pharm., 2013, 63, 295.
38 M. Schmidt Am Busch, A. Lopes, D. Mignon and T. Simonson, J. Comput. Chem., 2008, 29, 1092.
39 A. Abi Mansour and P. J. Ortoleva, J. Chem. Theory Comput., 2016, 12, 1965.
40 V. Castelletto, E. de Santis, H. Alkassem, B. Lamarre, J. E. Noble, S. Ray, A. Bella, J. R. Burns, B. W. Hoogenboom and M. G. Ryadnov, Chem. Sci., 2016, 7, 1707.
41 S. J. Marrink, D. P. Tieleman and A. E. Mark, J. Phys. Chem. B, 2000, 104, 12165.
42 N. Hassan, J. M. Ruso and Á. Piñeiro, Langmuir, 2011, 27, 9719.

43 M. Jorge, Langmuir, 2008, 24, 5714.
44 M. Sammalkorpi, S. Sanders, A. Z. Panagiotopoulos, M. Karttunen and M. Haataja, J. Phys. Chem. B, 2011, 115, 1403.
45 D. T. Allen, Y. Saaka, M. J. Lawrence and C. D. Lorenz, J. Phys. Chem. B, 2014, 118, 13192.
46 C. D. Lorenz, C. Hsieh, C. A. Dreiss and M. J. Lawrence, Langmuir, 2011, 27, 546.
47 A. V. Nevidimov and V. F. Razumov, Mol. Phys., 2009, 107, 2169.
48 P. S. C. Preté, C. C. Domingues, N. C. Meirelles, S. V. P. Malheiros, F. M. Goñi, E. de Paula and S. Schreier, Biochim. Biophys. Acta, 2011, 1808, 164.
49 A. Serrà, E. Gómez, G. Calderó, J. Esquena, C. Solans and E. Vallés, Phys. Chem. Chem. Phys., 2013, 15, 14653.
50 Y. Wang, X. Wang, M. Antonietti and Y. Zhang, ChemSusChem, 2010, 3, 435.
51 K. Beyer, J. Colloid Interface Sci., 1982, 86, 73.
52 A. V. Ahir, P. G. Petrov and E. M. Terentjev, Langmuir, 2002, 18, 9140.
53 A. Derecskei-Kovacs, B. Derecskei and Z. A. Schelly, J. Mol. Graphics Modell., 1998, 16, 206.
54 D. Yordanova, I. Smirnova and S. Jakobtorweihen, J. Chem. Theory Comput., 2015, 11, 2329.
55 C. T. Andrews and A. H. Elcock, J. Chem. Theory Comput., 2014, 10, 5178.
56 P. J. Bond and M. S. P. Sansom, J. Am. Chem. Soc., 2006, 128, 2697.
57 M. Velinova, D. Sengupta, A. V. Tadjer and S. J. Marrink, Langmuir, 2011, 27, 14071.
58 Z. Li, P. Wang, B. Liu, Y. Wang, J. Zhang, Y. Yan and Y. Ma, Soft Matter, 2014, 10, 8758.
59 E. J. Wallace and M. S. P. Sansom, Nanotechnology, 2009, 20, 045101.
60 A. Abi Mansour and P. J. Ortoleva, J. Chem. Theory Comput., 2014, 10, 518.
61 T. Murtola, E. Falck, M. Karttunen and I. Vattulainen, J. Chem. Phys., 2007, 126, 075101.
62 F. Müller-Plathe, Soft Materials, 2002, 1, 1.
63 I. Bahar, A. R. Atilgan and B. Erman, Folding Design, 1997, 2, 173.
64 T. Haliloglu, I. Bahar and B. Erman, Phys. Rev. Lett., 1997, 79, 3090.
65 E. N. Brodskaya, Colloid J., 2012, 74, 154.
66 M. Kang, D. Lam, D. E. Discher and S. M. Loverde, Computational Pharmaceutics, John Wiley & Sons, Ltd, Chichester, UK, 2015, ch. 5, p. 53.
67 D. T. Allen and C. D. Lorenz, J. Self-Assembly Mol. Electron., 2015, 3, 1.
68 L. Monticelli, S. K. Kandasamy, X. Periole, R. G. Larson, D. P. Tieleman and S. J. Marrink, J. Chem. Theory Comput., 2008, 4, 819.
69 T. Kawakatsu, Statistical Physics of Polymers: An Introduction, Springer, Berlin, 2004.
70 E. Dickinson, V. J. Pinfield, D. S. Horne and F. A. M. Leermakers, J. Chem. Soc., Faraday Transact., 1997, 93, 1785.
71 M. W. Matsen and M. Schick, Phys. Rev. Lett., 1994, 72, 2660.
72 F. Drolet and G. H. Fredrickson, Phys. Rev. Lett., 1999, 83, 4317.
73 G. H. Fredrickson, V. Ganesan and F. Drolet, Macromolecules, 2002, 35, 16.
74 Y. Lauw, F. A. M. Leermakers and M. A. Cohen Stuart, J. Phys. Chem. B, 2006, 110, 465.
75 J. Roan and T. Kawakatsu, J. Chem. Phys., 2002, 116, 7295.
76 K. C. Daoulas and M. Müller, J. Chem. Phys., 2006, 125, 184904.
77 G. Milano, T. Kawakatsu and A. De Nicola, Phys. Biol., 2013, 10, 045007.
78 G. Milano and T. Kawakatsu, J. Chem. Phys., 2010, 133, 214102.
79 G. Milano and T. Kawakatsu, J. Chem. Phys., 2009, 130, 214106.

80 S. Qi, H. Behringer and F. Schmid, New J. Phys., 2013, 15, 125009.
81 K. C. Daoulas, M. Müller, J. J. de Pablo, P. F. Nealey and G. D. Smith, Soft Matter, 2006, 2, 573.
82 K. C. Daoulas, A. Cavallo, R. Shenhar and M. Müller, Soft Matter, 2009, 5, 4499.
83 Y. Zhao, A. De Nicola, T. Kawakatsu and G. Milano, J. Comput. Chem., 2012, 33, 868.
84 P. K. Biswas, N. A. Vellore, J. A. Yancey, T. G. Kucukkal, G. Collier, B. R. Brooks, S. J. Stuart and R. A. Latour, J. Comput. Chem., 2012, 33, DOI: 10.1002/jcc.22979.
85 T. M. Abramyan, J. A. Snyder, J. A. Yancey, A. Thyparambil, Y. Wei, S. J. Stuart and R. A. Latour, Biointerphases, 2015, 10, 021002.
86 J. A. Snyder, T. Abramyan, J. A. Yancey, A. A. Thyparambil, Y. Wei, S. J. Stuart and R. A. Latour, Biointerphases, 2012, 7, 56.
87 H. Heinz, T. Lin, R. K. Mishra and F. S. Emami, Langmuir, 2013, 29, 1754.
88 H. Heinz and H. Ramezani-Dakhel, Chem. Soc. Rev., 2016, 45, 412.
89 R. A. Latour, Colloids Surf., B, 2014, 124, 25.
90 D. Costa, L. Savio and C.-M. Pradier, J. Phys. Chem. B, 2016, 120, 7039.
91 P. Charchar, A. J. Christofferson, N. Todorova and I. Yarovsky, Small, 2016, 12, 2394.
92 M. Ozboyaci, D. B. Kokh, S. Corni and R. C. Wade, Q. Rev. Biophys., 2016, 49, e4.
93 R. Baron, B. Willner and I. Willner, Chem. Commun., 2007, 43, 323.
94 M. Chakraborty, S. Jain and V. Rani, Appl. Biochem. Biotechnol., 2011, 165, 1178.
95 L. M. Ghiringhelli, B. Hess, N. F. A. van der Vegt and L. Delle Site, J. Am. Chem. Soc., 2008, 130, 13460.
96 H. Heinz, R. A. Vaia, B. L. Farmer and R. R. Naik, J. Phys. Chem. C, 2008, 112, 17281.
97 A. Verde, J. M. Acres and J. K. Maranas, Biomacromolecules, 2009, 10, 2118.
98 F. Iori and S. Corni, J. Comput. Chem., 2008, 29, 1656.
99 F. Iori, R. Di Felice, E. Molinari and S. Corni, J. Comput. Chem., 2009, 30, 1465.
100 L. B. Wright, P. M. Rodger, S. Corni and T. R. Walsh, J. Chem. Theory Comput., 2013, 9, 1616.
101 L. B. Wright, P. M. Rodger, T. R. Walsh and S. Corni, J. Phys. Chem. C, 2013, 117, 24292.
102 Z. E. Hughes, L. B. Wright and T. R. Walsh, Langmuir, 2013, 29, 13217.
103 J. Schneider and L. C. Ciacchi, Surf. Sci., 2010, 604, 1105.
104 J. Schneider and L. C. Ciacchi, J. Chem. Theory Comput., 2011, 7, 473.
105 S. Koppen and W. Langel, Langmuir, 2010, 26, 15248.
106 W. Friedrichs and W. Langel, Biointerphases, 2014, 9, 031006.
107 A. C. T. van Duin, S. Dasgupta, F. Lorant and W. A. Goddard, J. Phys. Chem. A, 2001, 105, 9396.
108 C. Li, S. Monti and V. Carravetta, J. Phys. Chem. C, 2012, 116, 18318.
109 P. E. M. Lopes, V. Murashov, M. Tazi, E. Demchuk and A. D. Mackerell, J. Phys. Chem. B, 2006, 110, 2782.
110 E. R. Cruz-Chu, A. Aksimentiev and K. Schulten, J. Phys. Chem. B, 2006, 110, 21497.
111 C. D. Lorenz, P. S. Crozier, J. A. Anderson and A. Travesset, J. Phys. Chem. C, 2008, 112, 10222.

112 R. T. Cygan, J. Liang and A. G. Kalinichev, J. Phys. Chem. C, 2004, 112, 10222.
113 C. L. Freeman, J. H. Harding, D. Cooke, J. A. Elliott, J. S. Lardge and D. M. Duffy, J. Phys. Chem. B, 2007, 108, 1255.
114 C. L. Freeman, J. H. Harding, D. Quigley and P. M. Rodger, Angew. Chem. Int. Ed., 2010, 49, 5135.
115 C. L. Freeman, J. H. Harding, D. Quigley and P. M. Rodger, J. Phys. Chem. C, 2011, 115, 8175.
116 D. R. Katti, P. Ghosh, S. Schmidt and K. S. Katti, Biomacromolecules, 2005, 6, 3276.
117 D. R. Katti, S. R. Schmidt, P. Ghosh and K. S. Katti, Clays Clay Minerals, 2005, 53, 171.
118 A. Gladytz, T. John, T. Gladytz, R. Hassert, M. Pagel, H. J. Risselada, S. Naumov, A. G. Beck-Sickinger, B. Abel, A. Matsumoto, Y. Miyahara, R. Hassert, A. G. Beck-Sickinger, R. Hassert, M. Pagel, Z. Ming, T. Häupl and B. Abel, Phys. Chem. Chem. Phys., 2016, 18, 23516.
119 S. Hauptmann, H. Dufner, J. Brickmann, S. M. Kast and R. S. Berry, Phys. Chem. Chem. Phys., 2003, 5, 635.
120 R. Bhowmik, K. S. Katti and D. Katti, Polymer, 2007, 48, 664.
121 X. Dong, Q. Wang, T. Wu and H. Pan, Biophys. J., 2007, 93, 750.
122 J. Shen, T. Wu, Q. Wang and H. Pan, Biomaterials, 2008, 29, 513.
123 Z. Xu, Y. Yang, Z. Wang, D. Mkhonto, C. Shang, Z. Liu, Q. Cui and N. Sahai, J. Comput. Chem., 2014, 35, 70.
124 W. Zhao, Z. Xu, Q. Cui and N. Sahai, Langmuir, 2016, 32, 7009.
125 P. Ren and J. W. Ponder, J. Comput. Chem., 2002, 23, 1497.
126 P. Ren and J. W. Ponder, J. Phys. Chem. B, 2003, 107, 5933.
127 S. D. Tomásio and T. R. Walsh, Mol. Phys., 2007, 105, 221.
128 S. M. Tomasio and T. R. Walsh, J. Phys. Chem. C, 2009, 113, 8778.
129 T. R. Walsh, Mol. Phys., 2008, 106, 1613.
130 B. Akdim, R. Pachter, S. S. Kim, R. R. Naik, T. R. Walsh, S. Trohalaki, G. Hong, Z. Kuang and B. L. Farmer, ACS Appl. Mater. Interfaces, 2013, 5, 7470.
131 Z. E. Hughes, S. M. Tomásio and T. R. Walsh, Nanoscale, 2014, 6, 5438.
132 A. N. Camden, S. A. Barr and R. J. Berry, J. Phys. Chem. B, 2013, 117, 10691.
133 Y. Sun and R. A. Latour, J. Comput. Chem., 2006, 27, 1908.
134 Y. Sun, B. N. Dominy and R. A. Latour, J. Comput. Chem., 2007, 28, 1883.
135 N. A. Vellore, J. A. Yancey, G. Collier, R. A. Latour and S. J. Stuart, Langmuir, 2010, 26, 7396.
136 G. Collier, N. A. Vellore, J. A. Yancey, S. J. Stuart and R. A. Latour, Biointerphases, 2012, 7, 24.
137 Y. Wei and R. A. Latour, Langmuir, 2010, 26, 18852.
138 Y. Wei and R. A. Latour, Langmuir, 2009, 25, 5637.
139 J. Liang, G. Fieg, F. J. Keil and S. Jakobtorweihen, Ind. Eng. Chem. Res., 2012, 51, 16049.
140 F. Tavanti, A. Pedone and M. C. Menziani, New J. Chem., 2015, 39, 2474.
141 H. Lopez and V. Lobaskin, J. Chem. Phys., 2015, 143, 243138.
142 X. Wu and G. Narsimhan, Biochim. Biophys. Acta, 2008, 1784, 1694.
143 J. Kleinjung and F. Fraternali, J. Chem. Theory Comput., 2012, 8, 3977.
144 J. Kleinjung and F. Fraternali, Curr. Opin. Struct. Biol., 2014, 25, 126.
145 D. B. Kokh, S. Corni, P. J. Winn, M. Hoefling, K. E. Gottschalk and R. C. Wade, J. Chem. Theory Comput., 2010, 6, 1753.
146 J. Mintseris and Z. Weng, Proc. Natl. Acad. Sci. U. S. A., 2005, 102, 10930.
147 D. Grueninger, N. Treiber, M. O. P. Ziegler, J. W. A. Koetter, M. Schulze and G. E. Schulz, Science, 2008, 319, 206.

View Online

148 149 150

Published on 23 November 2017 on http://pubs.rsc.org | doi:10.1039/9781782622789-00035

151 152 153 154 155 156 157 158 159 160 161 162 163 164

165

166 167 168 169 170 171 172

173 174 175 176 177 178

G. E. Schulz, Prog. Mol. Biol. Transl. Sci., 2011, 103, 187. T. Hamp and B. Rost, Bioinformatics, 2015, 31, 1521. N. Tuncbag, A. Gursoy, R. Nussinov and O. Keskin, Nat. Protocols, 2011, 6, 1341. J. J. McManus, P. Charbonneau, E. Zaccarelli and N. Asherie, Curr. Opin. Colloid Interface Sci., 2016, 22, 23. T. Schwede, Structure, 2013, 21, 1531. A. Lopes, S. Sacquin-Mora, V. Dimitrova, E. Laine, Y. Ponty and A. Carbone, PLoS Comput. Biol., 2013, 9, e1003369. J. P. G. L. M. Rodrigues and A. M. J. J. Bonvin, FEBS J., 2014, 281, 1988. T. Frembgen-Kesner and A. H. Elcock, Biophys. Rev., 2013, 5, 109. A. Fornili, A. Pandini, H. Lu and F. Fraternali, J. Chem. Theory Comput., 2013, 9, 5127. D. Seeliger and B. L. De Groot, J. Comput. Chem., 2009, 30, 1160. D. J. Selkoe, Nature, 2003, 426, 900. F. Bemporad, G. Calloni, S. Campioni, G. Plakoutsi, N. Taddei and F. Chiti, Acc Chem. Res., 2006, 39, 620. ¨ndrich, V. Forge, K. Buder, M. Kittler, C. M. Dobson and S. Diekmann, M. Fa Proc. Natl. Acad. Sci. U. S. A., 2003, 100, 15463. D. M. Fowler, A. V. Koulov, W. E. Balch and J. W. Kelly, Trends Biochem. Sci., 2007, 32, 217. C. J. A. Sigrist, L. Cerutti, E. de Castro, P. S. Langendijk-Genevaux, V. Bulliard, A. Bairoch and N. Hulo, Nucleic Acids Res., 2010, 38, D161. A. Fornili, F. Autore, N. Chakroun, P. Martinez and F. Fraternali, Methods Mol. Biol., 2012, 819, 375. M. R. Sawaya, S. Sambashivan, R. Nelson, M. I. Ivanova, S. A. Sievers, M. I. Apostol, M. J. Thompson, M. Balbirnie, J. J. W. Wiltzius, H. T. McFarlane, A. Ø. Madsen, C. Riekel and D. Eisenberg, Nature, 2007, 447, 453. C. Liu, M. Zhao, L. Jiang, P.-N. Cheng, J. Park, M. R. Sawaya, A. Pensalfini, D. Gou, A. J. Berk, C. G. Glabe, J. Nowick and D. Eisenberg, Proc. Natl. Acad. Sci. U. S. A., 2012, 109, 20913. R. Tycko, Nature, 2016, 537, 492. A. De Simone, G. G. Dodson, C. S. Verma, A. Zagari and F. Fraternali, Proc. Natl. Acad. Sci. U. S. A., 2005, 102, 7535. G. Colombo, P. Soto and E. 
Gazit, Trends Biotechnol., 2007, 22, 211. J. Nasica-Labouze, M. Meli, P. Derreumaux, G. Colombo and N. Mousseau, PLoS Comput. Biol., 2011, 7, e1002051. R. Pellarin, P. Schuetz, E. Guarnera and A. Caflisch, J. Am. Chem. Soc., 2010, 132, 14960. R. Friedman and A. Caflisch, J. Mol. Biol., 2011, 414, 303. R. P. Watson, M. T. Christen, C. Ewald, F. Bumbak, C. Reichen, ¨ntert, A. Caflisch, A. Plu ¨ckthun and M. Mihajlovic, E. Schmidt, P. Gu O. Zerbe, Structure, 2014, 22, 985. G. Rossetti, S. Bongarzone and P. Carloni, Curr. Top. Med. Chem., 2013, 13, 2419. N. Chakroun, A. Fornili, S. Prigent, J. Kleinjung, C. A. Dreiss, H. Rezaei and F. Fraternali, J. Chem. Theory Comput., 2013, 9, 2455. N. Chakroun, S. Prigent, C. A. Dreiss, S. Noinville, C. Chapuis, F. Fraternali and H. Rezaei, FASEB J., 2010, 24, 3222. F. Sterpone, M. Ceccarelli and M. Marchi, J. Mol. Biol., 2001, 311, 409. E. Moroni, G. Scarabelli and G. Colombo, Front. Biosci., 2009, 14, 523. J. R. Perilla, J. A. Hadden, B. C. Goh, C. G. Mayne and K. Schulten, J. Phys. Chem. Lett., 2016, 7, 1836. Synthetic Biology, 2018, 2, 35–64 | 63

View Online

179

Published on 23 November 2017 on http://pubs.rsc.org | doi:10.1039/9781782622789-00035

180 181

182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204

205 206

J. M. A. Grime, J. F. Dama, B. K. Ganser-Pornillos, C. L. Woodward, G. J. Jensen, M. Yeager and G. A. Voth, Nat. Commun., 2016, 7, 11568. A. A. Mansour, Y. V. Sereda, J. Yang and P. J. Ortoleva, Vaccine, 2015, 33, 5890. Y. Zhang, R. Bartz, G. Grigoryan, M. Bryant, J. Aaronson, S. Beck, N. Innocent, L. Klein, W. Procopio, T. Tucker, V. Jadhav, D. M. Tellers and W. F. DeGrado, ACS Chem. Biol., 2015, 10, 1082. F. Collu, E. Spiga, C. D. Lorenz and F. Fraternali, Front. Mol. Biosci., 2015, 2, 66. A. De Nicola, S. Hezaveh, Y. Zhao, T. Kawakatsu, D. Roccatano and G. Milano, Phys. Chem. Chem. Phys., 2014, 16, 5093. Y. Zhu, Z. Lu, G. Milano, A. Shi and Z. Sun, Phys. Chem. Chem. Phys., 2016, 18, 9799. S. J. Marrink, A. H. de Vries and A. E. Mark, J. Phys. Chem. B, 2004, 108, 750. A. De Nicola, Y. Zhao, T. Kawakatsu, D. Roccatano and G. Milano, J. Chem. Theory Comput., 2011, 7, 2947. J. F. Nagle, R. Zhang, S. Tristram-Nagle, W. Sun, H. I. Petrache and R. M. Suter, Biophys. J., 1996, 70, 1419. ´, N. Kuc ˇerka, M. Kiselev, S. P. Yaradaikin and P. Balgavy´, M. Dubnicˇkova ´, Biochim. Biophys. Acta, 2001, 1512, 40. D. Uhrkovaa A. De Nicola, Y. Zhao, T. Kawakatsu, D. Roccatano and G. Milano, Theor. Chem. Acc., 2012, 131, 1167. E. Sarukhanyan, A. De Nicola, D. Roccatano, T. Kawakatsu and G. Milano, Chem. Phys. Lett., 2014, 595, 156. T. Osaki, K. Kamiya and S. Takeuchi, Synthetic Biol., 2014, 1, 275, ch. 10. G. J. A. Sevink, M. Charlaganov and J. G. E. M. Fraaije, Soft Matter, 2013, 9, 2816. C. Peter and K. Kremer, Soft Matter, 2009, 5, 4357. C. Peter and K. Kremer, Faraday Discuss., 2010, 144, 9. ¨p, K. Kremer, O. Hahn, J. Batoulis and T. Bu ¨rger, Acta Polym., 1998, W. Tscho 49, 75. ¨p, K. Kremer, J. Batoulis, T. Bu ¨rger and O. Hahn, Acta Polym., 1998, W. Tscho 49, 61. ¨ller-Plathe and G. Milano, J. Phys. Chem. G. Santangelo, A. Di Matteo, F. Mu B, 2007, 111, 2765. ¨ller-Plathe and T. Spyriouni, C. Tzoumanekas, D. Theodorou, F. Mu G. Milano, Macromolecules, 2007, 40, 3876. A. 
Brasiello, L. Russo, C. Siettos, G. Milano and S. Crescitelli, Comput. Aided Chem. Eng., 2010, vol. 28, 625. Q. Shi and G. A. Voth, Biophys. J., 2005, 89, 2385. ´. Pin ˜eiro, Soft P. Brocos, P. Mendoza-Espinosa, R. Castillo, J. Mas-Oliva and A Matter, 2012, 8, 9005. A. De Nicola, T. Kawakatsu, C. Rosano, M. Celino, M. Rocco and G. Milano, J. Chem. Theory Comput., 2015, 11, 4959. M. Carroni and H. R. Saibil, Methods, 2016, 95, 78. A. Graves, G. Wayne, M. Reynolds, T. Harley, I. Danihelka, A. Grabska´ska, S. G. Colmenarejo, E. Grefenstette, T. Ramalho, J. Agapiou, Barwin A. P. Badia, K. M. Hermann, Y. Zwols, G. Ostrovski, A. Cain, H. King, C. Summerfield, P. Blunsom, K. Kavukcuoglu and D. Hassabis, Nature, 2016, 538, 471. S. J. Marrink and D. P. Tieleman, Chem. Soc. Rev., 2013, 42, 6801. L. Heinzerling, R. Klein and M. Rarey, J. Comput. Chem., 2012, 33, 2554.

64 | Synthetic Biology, 2018, 2, 35–64

Protein scaffolds and higher-order complexes in synthetic biology†
Published on 23 November 2017 on http://pubs.rsc.org | doi:10.1039/9781782622789-00065

Anniek den Hamer,‡ Bas J. H. M. Rosier,‡ Luc Brunsveld* and Tom F. A. de Greef*
DOI: 10.1039/9781782622789-00065

Interactions between proteins control molecular functions such as signalling or metabolic activity. Assembly of proteins via scaffold proteins or in higher-order complexes is a key regulatory mechanism. Understanding and functionally applying this concept requires the construction, study, and utilization of synthetic scaffolds. This chapter first describes protein scaffolding in the context of its natural function as well as the underlying mechanistic origins via mathematical models and simulations. This is then funnelled into examples of synthetic biology approaches to engineer new scaffolds and their usage as regulators of signalling networks and metabolic engineering.

1 Introduction

Cell signalling is controlled by protein complexes that are able to bind several interaction partners. These protein complexes lead to spatiotemporal control over signal transduction by influencing subcellular localization, altering the properties of protein interaction partners and allowing proximity-induced activation (reviewed by Reményi et al.,1 Lim et al.2 and Shaw et al.3). We classify these protein complexes into two categories, i.e. scaffold proteins and higher-order complexes. Scaffold proteins are defined as proteins that bind and colocalize two or more interaction partners of a catalytic pathway,4 while higher-order complexes are oligomeric assemblies of receptor and signalling proteins, consisting of multiple proteins arranged in an open-ended fashion (Fig. 1).5 Scaffold proteins usually lack intrinsic enzymatic activity and are characterized by modular protein domains.4,6 They function as mediators, wiring together biochemical input and output responses via linear pathways, pathway branching or feedback mechanisms. In addition, they can also be targets for external regulation.2

Recently it has been discovered that signalling proteins can also assemble into structured, higher-order signalling machines, i.e. signalosomes (Fig. 1B).5 Structural studies of these complexes show that higher-order oligomerization is common in immune signalling cascades but presumably is also important in other signalling pathways.5

In this chapter we will describe a selection of scaffold and higher-order complex proteins and discuss their function in the living cell. Scaffold proteins can tune the selectivity, sensitivity, specificity and robustness of biochemical signals in the cell, while at the same time the presence of these elements can lead to bistable and biphasic dynamics. To explain the mechanistic origins of these effects we will review a number of important theoretical studies, highlighting mathematical models and simulations that reveal how protein scaffolds influence the spatiotemporal characteristics of signal transduction. Finally, we will conclude with examples from synthetic biology which have revealed how engineered scaffolds can be used as regulators of the input and output of a signalling network by means of a bottom-up approach. This leads to further understanding of cell signalling and the development of applications in the fields of metabolic engineering and cell-based therapeutics.

Fig. 1 Protein complexes involved in signal transduction. (A) Scaffold proteins wiring pathway input and output by binding two or more interaction partners of a signalling pathway. (B) Higher-order complexes formed by oligomerization of receptor and signalling proteins. Two potential amplification mechanisms are displayed: incorporation of unliganded receptors and excess recruitment of signalling proteins. Figure adapted from ref. 5.

Laboratory of Chemical Biology, Department of Biomedical Engineering and Institute of Complex Molecular Systems, PO Box 513, 5600 MB Eindhoven, the Netherlands. E-mail: [email protected]; [email protected]
† This work is supported by the Netherlands Organization for Scientific Research via Gravity Program 024.001.035.
‡ These authors contributed equally to this work.
Synthetic Biology, 2018, 2, 65–96
© The Royal Society of Chemistry 2018

2 Scaffold proteins

Scaffold proteins are essential components in signal transduction. In this section we will discuss well-known scaffolds and highlight their functional mechanisms. Depending on the function of the protein, most scaffolds comprise multiple modular domains which are coupled interchangeably.6,7 Due to their modular nature, these proteins generally play a role in multiple pathways. Initially, scaffold proteins were assumed to act as passive elements whose only function was to tether signalling components, thereby enforcing proximity-induced activity. However, recent results have shown that scaffolds themselves are controlled by conformational fine-tuning, resulting in additional feedback loops and more complex regulatory behaviour. Consequently, scaffolds rely on, amongst others, post-translational modifications such as phosphorylation to assist in the recognition of interaction partners and shape their output response.

In 1988, Mayer et al. discovered Crk as one of the first scaffold proteins involved in signalling.8 Protein–protein interactions with Crk are constituted via the Src homology 2 and 3 (SH2, SH3) domains.9 These are the most prominent modular domains found in signalling, as SH2 is able to bind tyrosine-phosphorylated partners serving as input, while the SH3 domain recruits proline-rich motifs, mediating the output of the signalling pathway (Fig. 2A). The diversity of regulatory and catalytic domains (reviewed by Pan et al.6), such as PDZ and WW, enables evolutionary mixing and matching, resulting in the localization, coupling and activation of signalling complexes with a wide range of input–output connections.4 This is also the case for proteins of the Crk family, as they bind numerous interaction partners involved in cellular transformation, cytoskeletal changes and phagocytosis (reviewed by Birge et al.9).

Fig. 2 Scaffold proteins making use of different modular domains. (A) The Crk protein couples tyrosine-phosphorylated proteins via an SH2 domain to proline-rich binding partners of the SH3 domain. Figure adapted from ref. 9. (B) Immune scaffold proteins MAVS, STING and TRIF are activated by RNA, DNA and bacterial lipopolysaccharide (LPS), respectively, inducing phosphorylation by the kinases IKK and TBK1. Subsequently the phosphorylated scaffolds can recruit IRF3, resulting in phosphorylation of IRF3 by TBK1. Phosphorylated IRF3 dissociates from the scaffold and dimerizes, which allows it to enter the nucleus to induce IFNs. Figure adapted from ref. 12. (C) Scaffold protein 14-3-3 connects metabolism with apoptosis, as 14-3-3 inhibits initiator caspase-2 through phosphorylation-dependent binding. Upon decreasing NADPH levels, 14-3-3 is released from caspase-2, allowing dephosphorylation and activation of the apoptosis pathway by PIDDosome formation. Figure adapted from ref. 23.

Phosphorylation and other post-translational modifications of scaffold proteins or their binding partners increase their overall regulatory capacity (reviewed by Pawson et al.10). This is demonstrated by Nck, a scaffold protein consisting of one SH2 and three SH3 domains.11 Oxidative stress results in tyrosine phosphorylation of platelet-endothelial cell adhesion molecule-1 (PECAM-1), stimulating its binding to Nck.12 This leads to activation of the serine/threonine p21-activated kinase 2 (PAK2) by Nck-dependent membrane targeting. Subsequently, PAK2 induces the NF-κB pathway, driving a proinflammatory response.12 Relying on a similar mode of action are the innate immune scaffold proteins MAVS, STING and TRIF (Fig. 2B).13 Upon recognition of viral RNA, cytosolic DNA and bacterial lipopolysaccharides, respectively, by membrane receptors, the scaffold proteins recruit the kinases IKK and TBK1, which in turn phosphorylate the scaffolds. In this way, interferon regulatory factor 3 (IRF3) is recruited and hence also phosphorylated by IKK and TBK1. Subsequently, the phosphorylated IRF3 dissociates from the scaffold and dimerizes, enabling entry to the nucleus to induce the host immune response by producing type I interferons (IFNs).13

14-3-3 proteins are phosphorylation-dependent scaffold proteins containing a different type of modular domain.14 Seven isoforms of 14-3-3 are found in humans and these exist primarily as homo- and heterodimers.15 More than 200 binding partners are known,16 which primarily bind to an amphipathic binding groove via three phosphorylated consensus motifs: the internal motifs RSX(pS/pT)XP (mode I) and RX(Y/F)X(pS/pT)XP (mode II)17 and the C-terminal motif p(S/T)X1–2-COOH (mode III).18 As scaffold proteins they have several functions,19 namely changing the conformation of the binding partner,20 acting as an interaction platform for two other proteins,21 or protecting the target protein against proteolytic degradation or dephosphorylation by physical occlusion.22 For example, in metabolic regulation, increasing levels of NADPH inhibit caspase-2 through phosphorylation-mediated 14-3-3 binding (Fig. 2C).23 Upon a decrease of NADPH levels, 14-3-3 is released from caspase-2, exposing it to dephosphorylation. Dephosphorylated caspase-2 is subsequently activated by induced-proximity oligomerization via its respective scaffold proteins, RIP-associated protein with a death domain (RAIDD) and the p53-induced protein with a death domain (PIDD), forming the PIDDosome. The 14-3-3 scaffold protein is therefore an important link between metabolic regulation and the cell death pathway when cells respond to stress.

Scaffold proteins do not function in isolation but colocalize specific interaction partners, resulting in macromolecular complexes that facilitate the interaction with other proteins.6 The protein complexes involved in proper calcium signalling related to the immune response are an illustration of this (Fig. 3A). Upon T-cell receptor (TCR) stimulation, Zap70 is activated and mediates phosphorylation of the scaffold proteins Slp76 and membrane-bound LAT.2 This external stimulation regulates the association of other scaffold proteins colocalized at the membrane, such as Nck and the closely related Grb2 and Gads.24 The result of this phosphorylation-induced complex formation is the activation of downstream pathways via phospholipase C-γ (PLCγ), also bound to LAT via an SH2 domain, resulting in release of intracellular calcium stores and actin reorganization, essential for T-cell activation.25 The large scaffold protein AHNAK1 is an additional scaffold involved in TCR signalling. AHNAK1 is thought to help calcium channels localize at the plasma membrane, regulating the influx of calcium for the accurate nuclear translocation of the transcription factor NFAT, which is essential for T-cell activation.26,27
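The three 14-3-3 consensus motifs given earlier in this section can be written as simple sequence patterns. The following is a minimal sketch in Python, not a validated predictor: the function name, the dictionary names and the toy sequence are hypothetical, and because a plain amino acid sequence carries no phosphorylation state, every S/T hit is only a candidate site that would still need to be phosphorylated to bind 14-3-3.

```python
import re

# Candidate 14-3-3 binding motifs over one-letter amino acid codes.
# "." plays the role of X (any residue); [ST] marks the residue that must be
# phosphorylated (pS/pT) for binding. Lookaheads allow overlapping hits.
MOTIFS = {
    "mode I": re.compile(r"(?=(RS.[ST].P))"),        # RSX(pS/pT)XP
    "mode II": re.compile(r"(?=(R.[YF].[ST].P))"),   # RX(Y/F)X(pS/pT)XP
    "mode III": re.compile(r"(?=([ST].{1,2}$))"),    # p(S/T)X1-2-COOH (C-terminus)
}

# Position of the candidate phosphosite inside each matched motif (0-based).
SITE_OFFSET = {"mode I": 3, "mode II": 4, "mode III": 0}


def find_candidate_sites(sequence):
    """Return (mode, 1-based phosphosite position, matched motif) tuples."""
    seq = sequence.upper()
    hits = []
    for mode, pattern in MOTIFS.items():
        for match in pattern.finditer(seq):
            site = match.start() + SITE_OFFSET[mode] + 1
            hits.append((mode, site, match.group(1)))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical toy sequence containing one motif of each mode.
toy_sequence = "MKRSASAPLLRAYASEPGGTV"
for mode, site, motif in find_candidate_sites(toy_sequence):
    print(f"{mode}: candidate phosphosite at residue {site} in {motif}")
```

Such pattern matching only narrows down candidates; whether a site is actually phosphorylated and accessible has to come from experimental data.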


Fig. 3 Simplifications of macromolecular complexes involved in signal transduction based on colocalization of interaction partners. (A) T-cell activation induces tyrosine kinase Zap70 to phosphorylate scaffold proteins Slp76 and membrane-bound LAT. This leads to recruitment of several scaffold proteins such as Gads via their SH2 domains. Complex formation leads to activation of PLCγ, which initiates downstream processes essential for T-cell activation. Figure adapted from ref. 2. (B) The Ste5 scaffold involved in the mating response in yeast is recruited to the membrane upon binding of pheromone α-factor to the GPCR Ste2. Consequently, Ste11 is recruited to the scaffold and phosphorylated by Ste20, initiating the MAPK pathway by successively phosphorylating Ste7 and Fus3. Activated Fus3 is able to phosphorylate the Ste5 scaffold, leading to inhibition of the pathway. Figure adapted from ref. 2. (C) Upon growth factor stimulation, the KSR scaffold in mammalian cells is localized at the membrane, where subsequent Raf-MEK-Erk phosphorylation leads to proliferation. Activated Erk is able to inhibit the KSR scaffold in a similar manner as Fus3 does for the Ste5 scaffold. Figure adapted from ref. 2.

Scaffold proteins are able to coordinate positive and negative feedback loops, enabling a dynamic response. This is exemplified by the best-studied scaffold in cell signalling, the Ste5 scaffold in yeast, which is involved in the mating response (Fig. 3B).28,29 Stimulation of the mating response in yeast is initiated by binding of pheromone α-factor to the GPCR Ste2, leading to activation of the G protein and subsequent recruitment of scaffold Ste5 to the membrane. Upon localization at the membrane, MAPKKK Ste11 is recruited, and phosphorylation of Ste11 by the colocalized activator Ste20 initiates the cascade. The successive phosphorylation and activation of the MAPKK Ste7 and MAPK Fus3 in the cascade results in mating output. Activated Fus3 is responsible for one of the regulatory feedback loops, as its activation leads to phosphorylation of four sites on the Ste5 scaffold, resulting in inhibition of the pathway, which might contribute to the switch-like behaviour.

The kinase suppressor of RAS (KSR) scaffold involved in proliferation seems to be the mammalian equivalent of the Ste5 scaffold (Fig. 3C). Although they are both engaged in the mitogen-activated protein kinase (MAPK) pathway, these proteins do not show any sequence similarity.30 The dynamic response of KSR is shaped in a similar way to that of the Ste5 scaffold. Inactive KSR is bound to 14-3-3 in a phosphorylation-dependent manner, localizing it in the cytoplasm (reviewed by Shaw et al.27 and Kolch et al.31). Upon mitogen stimulation, nucleotide exchange factors are recruited and KSR is subsequently dephosphorylated by protein phosphatase-2A (PP2A). This results in the release of 14-3-3 and translocation of KSR to the cell membrane, where it interacts with activated RAS and facilitates the phosphorylation of MAPK and extracellular signal-regulated kinase (ERK) by Raf. In turn, MAPK/ERK induce phosphorylation of nuclear and non-nuclear substrates required for mitosis. The feedback mechanism is regulated by activated ERK, which blocks Raf binding to KSR via phosphorylation. This attenuates MEK activation and thereby decreases pathway output.

Taken together, scaffold proteins mainly function through the following mechanisms. Enforced proximity is used to enhance the local concentration of the signalling components. Combinatorial use of modular domains, on the other hand, enables signalling components to participate in multiple pathways. Finally, scaffold proteins can induce conformational changes in their interaction partners, or are themselves targets for conformational change, so that they can activate or inhibit signalling.

3 Higher-order assemblies

Structural studies investigating intracellular immune signalling have shown the formation of higher-order complexes, i.e. signalosomes. Wu has recently classified three archetypical categories of signalosomes, i.e. helical assemblies, filamentous amyloid complexes and a two-dimensional lattice structure.5 It is expected that these types of assemblies are not limited to immune signalling but are an essential part of other signalling pathways.5 Given their size, signalosomes are proposed to facilitate spatial compartmentalization, substantially enhancing the local concentration without the need for membranes.5 In addition, it has been suggested that the unique growth mechanisms of higher-order assemblies can actually play a role in modulating transduction of biochemical signals.5

The proteins responsible for signalosome formation can be characterized by their domains. Helical assemblies discovered so far are generally part of the death domain (DD) fold superfamily, which comprises proteins incorporating a DD, death effector domain (DED), caspase recruitment domain (CARD) or pyrin domain (PYD). These domains commonly appear to function as mediators for oligomeric platform assembly and, additionally, recruitment of downstream effectors.32,33 The PIDDosome is the signalling complex required for caspase-2 activation in metabolic regulation and in response to DNA damage. It is composed of PIDD, RAIDD and caspase-2, where PIDD and RAIDD interact via their DD domains, while caspase-2 subsequently interacts with RAIDD via their CARD domains after its release from 14-3-3. Crystal structures revealed a helical core complex with a PIDD to RAIDD ratio of five to seven (Fig. 4A).32 The formation of the helical core complex supports the recruitment of seven caspase-2 proteins, enabling proximity-induced dimerization and subsequent activation of caspase-2, initiating the caspase cascade that results in cell death.

Investigation of the death-inducing signalling complex (DISC) involved in caspase-8-mediated apoptosis revealed many similarities to the PIDDosome.34 DISC is composed of the death receptor Fas, the scaffold protein FADD and caspase-8, wherein Fas and FADD interact via their DDs and caspase-8 subsequently interacts with FADD via their DEDs. Crystal structures revealed five to seven Fas DDs compared to five FADD DDs (Fig. 4A). This suggests a model in which the membrane-bound ligand FasL, consisting of two FasL trimers, triggers the intracellular recruitment of six Fas DDs, inducing the helical complex assembly, which leads to the extrinsic activation of the apoptosis pathway.

Fig. 4 Higher-order assemblies in immune signal transduction. (A) Top, crystal structures of the death domains from the proteins involved in formation of the helical assemblies of the PIDDosome, DISC and Myddosome complex. Bottom, planar schematics of the complexes, forming layered oligomers. Reprinted from Curr. Opin. Struct. Biol., 22, R. Ferrao and H. Wu, Helical assembly in the death domain (DD) superfamily, 241–247, Copyright 2012 with permission from Elsevier. (B) Schematic of an amyloid assembly, composed of filaments (such as β-strands) creating longer fibrils. (C) Graphic representation of a modelled two-dimensional lattice structure formed by dimerization and trimerization of TRAF6. Figure adapted from ref. 38.

Another recent example of a higher-order signalling complex is the Myddosome, a helical complex assembled at the cell membrane where it mediates signals in the Toll-like receptor (TLR) and interleukin-1 receptor (IL-1R) pathway.35 Upon bacterial lipopolysaccharide recognition by the TLR4 membrane receptor, scaffold protein MyD88 is recruited to the membrane, where it subsequently recruits IL-1R-associated kinases (IRAK), which bind via their DDs. Crystal structures indicate a helical assembly comparable to the PIDDosome and DISC, consisting of six MyD88 DDs, four IRAK4 DDs and four IRAK DDs (Fig. 4A), which initiates kinase activation and TLR/IL-1R signalling.35

The filamentous amyloid structure involved in programmed necrosis is formed by RIP1 and RIP3 kinases and relies on the RIP homotypic interaction motifs (RHIMs) of both proteins.36 RIP1 also contains a DD which facilitates the binding to the TNF receptor signalling complex required for NF-κB activation. In addition, the DD of RIP1 enables the assembly of the ripoptosome during apoptosis.37 The necrosis pathway requires caspase inhibition and RIP1 and RIP3 kinase activity to form the RIP1/RIP3 necrosome. Structural studies have shown classical characteristics of amyloid fibrils, as the RIP1/RIP3 complex forms fibrous protein aggregates composed of short β-strands which stack in longer β-sheets (Fig. 4B).36 RIP1 and RIP3 were able to form homodimeric fibrils and experiments suggested a similar structural architecture; however, the fibrils were shorter and less regular in size compared to the heterodimeric complex. Moreover, the heterodimeric complex was found to be extremely stable. Mutations in the RHIMs weakened filament formation but also indicated essential residues which are crucial for cluster formation, kinase activation and programmed necrosis.36

The two-dimensional lattice structure found in NF-κB activation is composed of tumor necrosis factor receptor-associated factor 6 (TRAF6) and the ubiquitin-conjugating enzyme (E2) Ubc13.38 TRAF6 is a ubiquitin ligase that mediates polyubiquitination via a Lys63 linkage, a non-degradative form of polyubiquitination which acts as an aggregation platform. For this ubiquitination, a heterodimeric complex of Ubc13 and Uev1A is required. Structural studies revealed that TRAF6 dimerizes through its N-terminal RING domains, despite a trimeric symmetry of its C-terminal domain.38 This dimerization enables TRAF6 to interact with Ubc13 via its RING domain, and additionally through one of its zinc fingers. These interactions result in a rigid, elongated structure of TRAF6 and Ubc13 which is able to recruit the downstream proteins required for NF-κB activation (Fig. 4C).38

Although structural studies have yielded critical insight into the formation of higher-order assemblies found in immune signalling, extensive studies are needed to examine their precise spatiotemporal influence on signal transduction. Moreover, it is expected that these open-ended complexes can also be found in other signalling pathways. How the functional mechanisms of scaffold proteins are related to higher-order complexes needs to be investigated in future studies.


4 Mathematical models and simulations

An important goal of the field of synthetic biology is the formulation of a set of universal design rules and network motifs to understand and design complex biomolecular networks. This has led to the development of a theoretical framework that incorporates the behaviour of biological components as modular molecular algorithms to achieve higher-order regulatory functions.39 The ubiquitous nature of scaffolds involved in cellular processes also hints at a universal set of rules applying to scaffold proteins and their function. However, the situation is complicated by the fact that scaffolds might have several functions besides simply bringing together functionally related enzymes in a cascade. Indeed, not only passive tethering plays a role, but scaffold catalysis, subcellular localization, protection from or recruitment of inhibitors and feedback mechanisms have also been found to contribute to scaffold function in experimental studies.40,41 As a result, theoretical models and computer simulations can be used as powerful tools to quantitatively describe the effects of scaffolding and to support experimental observations.

In this section we will highlight key theoretical and computational studies that increased the understanding of the important role that scaffolds play in cellular processes. First, we will focus on models of signalling pathways that involve cascades of phosphorylation reactions and assess the effect of a scaffold on the stability, robustness, specificity and signal response profile of the pathway. Secondly, we will discuss the role of scaffolds in metabolic pathways that involve the processing of small molecules and look at the effect of nanoscale organization of enzymes on the specificity and efficiency of the cascade.
4.1 Models of signalling pathways In cellular signalling, scaffolds are thought to have a diverse array of functions, including the prevention of crosstalk, signal amplification, feedback and pathway robustness.40 In the previous section a number of scaffolds and their role in signalling pathways have been described. The most well-known is the Ste5 scaffold protein of the yeast MAPK pathway and therefore, mathematical and computational studies often use the MAPK pathway as a model system. In short, the pathway consists of three kinases MAPKKK, MAPKK and MAPK that sequentially phosphorylate each other and phosphatases that deactivate the kinases. Importantly, Ste5 is essential in the activation of the pathway and colocalizes the three kinases, after which the last kinase activates other targets downstream of the pathway (Fig. 3B). While most of the mathematical and computational work has been done on the MAPK pathway, the general concepts and principles can be translated to other signalling pathways and other organisms in a relatively straightforward fashion. 4.1.1 Combinatorial inhibition. In 1997, Bray and Lay were one of the first to address the effect of spatial organization of multi-enzyme complexes using computational methods and showed an inhibitory Synthetic Biology, 2018, 2, 65–96 | 73

Published on 23 November 2017 on http://pubs.rsc.org | doi:10.1039/9781782622789-00065


effect on complex formation when one of the components is present in excess.44 This effect is related to the so-called prozone effect, which is the working principle behind the immunological precipitin assay, used to quantitatively determine antibody concentrations. In the assay, an antibody solution is mixed with antigen solutions of increasing concentration. When antigen and antibody are present at optimal concentration, a macroscopic precipitate forms consisting of large, ordered, three-dimensional complexes. As a result, upon increase of the antigen concentration the amount of precipitate also increases, but then reaches a maximum and sharply declines when the antigen concentration is increased even further. In the latter regime antigen molecules completely surround the antibody and as a result, small soluble complexes dominate over large, ordered structures that precipitate. For scaffolded enzymatic cascades a similar effect occurs, usually termed combinatorial inhibition,45 which can be understood fairly intuitively. Consider a signalling pathway that becomes activated upon co-localization of multiple enzymes on a scaffold protein. When the scaffold concentration is low, most of the enzymes are not bound to the scaffold and activation of the pathway is low. At the optimal stoichiometry, most binding sites on the scaffold are occupied and therefore the overall activity is high. However, when the scaffold concentration increases further, the fractional binding site occupancy decreases again, which strongly reduces the throughput of the pathway (Fig. 5A). In a key study, Levchenko and colleagues used the MAPK pathway as a model system to investigate the effect of combinatorial inhibition on signalling pathways.43 A system of ordinary differential equations (ODE) derived from Michaelis-Menten kinetics was used to explicitly describe the interactions between kinases, phosphatases and a two-member or three-member scaffold. 
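The intuitive picture of combinatorial inhibition given above can be illustrated with a minimal equilibrium sketch. It is not the ODE model of Levchenko and colleagues; it assumes a hypothetical scaffold with one independent, non-cooperative binding site per kinase, and the concentrations and dissociation constant are arbitrary illustrative values. Under these assumptions each site behaves as a simple 1:1 binding equilibrium, and the concentration of fully loaded scaffold is the scaffold concentration multiplied by the product of the site occupancies.

```python
import math


def bound(ligand_total, site_total, kd):
    """Equilibrium concentration of a 1:1 ligand-site complex (quadratic solution)."""
    b = ligand_total + site_total + kd
    return (b - math.sqrt(b * b - 4.0 * ligand_total * site_total)) / 2.0


def full_complex(scaffold_total, kinase_totals, kd):
    """Concentration of scaffold carrying every kinase, assuming independent sites."""
    if scaffold_total == 0.0:
        return 0.0
    occupancy = 1.0
    for kinase_total in kinase_totals:
        # Fraction of scaffold molecules whose site for this kinase is occupied.
        occupancy *= bound(kinase_total, scaffold_total, kd) / scaffold_total
    return scaffold_total * occupancy


# Hypothetical values in arbitrary concentration units (assumptions, not from the source).
kinases = [1.0, 1.0]
kd = 0.01
scaffold_levels = [0.01, 0.1, 1.0, 10.0, 100.0]
profile = [full_complex(s, kinases, kd) for s in scaffold_levels]

for s, c in zip(scaffold_levels, profile):
    print(f"scaffold = {s:7.2f}  fully loaded complex = {c:.4f}")
```

With these illustrative numbers the amount of fully loaded scaffold is largest when the scaffold concentration is comparable to the kinase concentration and falls off on either side, which is the bell-shaped dependence described in the text.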
Moreover, it was assumed that the scaffold is not catalytically active and that scaffold binding is a non-cooperative process. By numerically solving the ODE model using biologically relevant parameters, the authors show that an optimal scaffold concentration indeed exists, of the same order of magnitude as the kinase concentration, at which the amplitude of the output signal is maximal.43 Importantly, the computational results are found to be in agreement with recent experimental work, in which careful measurements of the protein levels of Ste5 and other MAPK pathway components in yeast cells revealed similar trends in pathway activation (Fig. 5B and C).41,42

ODE models are useful to describe scaffolds in signalling pathways, but they require the definition of specific mechanistic details. As a result they are complex and elaborate, and meaningful results can only be obtained through numerical approximations. Alternatively, minimalist models can be used that are relatively easy to interpret and that can generate helpful analytical expressions, providing a general theoretical understanding of scaffold function. For scaffold-mediated signalling, such models have been developed using mass-action kinetics to describe the binding between scaffold and ligand proteins.40,42,46,47 These models assume that a scaffold binds a set of distinct ligand proteins under dynamic equilibrium and that the binding of ligands occurs


Fig. 5 (A) Schematic overview of combinatorial inhibition involving a three-member scaffold protein and kinases. (B) Experimental evidence of combinatorial inhibition in yeast cells with varying Ste5 expression levels. Activation of Ste5-mediated MAP kinases Fus3 and Kss1 was determined by measuring the concentration of phosphorylated kinase using quantitative immunoblot. Downstream transcription was measured with a fluorescent protein reporter. Reprinted from S. A. Chapman and A. R. Asthagiri, Mol. Syst. Biol., 2009, 5, 313, http://dx.doi.org/10.1039/10.1038/msb.2009.73. Copyright © 2009 EMBO and Nature Publishing Group with permission under Creative Commons licence CC-BY-3.0 (https://creativecommons.org/licenses/by/3.0/). (C) Predicted behaviour of the Ste5-mediated MAPK pathway using an ODE model42 (dashed line) and a minimalist model43 (solid line). The concentration of phosphorylated MAPK is plotted as a function of the scaffold concentration. Reprinted from Math. Biosci., 232, 2, J. Yang and W. S. Hlavacek, Scaffold-mediated nucleation of protein signaling complexes: Elementary principles, 164–173, Copyright 2011 with permission from Elsevier.

independently and in a non-cooperative manner. This simple description allows for the formulation of an analytical expression that relates the fraction of active scaffold-ligand complexes to the initial concentrations of the components and their interaction strength. Furthermore, the expression predicts the existence of a unique scaffold concentration for which the concentration of active scaffold-ligand complex is maximal and therefore complements the results from numerical studies (Fig. 5C). Interestingly, when cooperativity between ligand binding events is included in the model, the optimal scaffold concentration does not change, while the maximal concentration of fully active scaffold-ligand complex increases.42 This is in disagreement with the numerical results of Levchenko et al.,43 but can be explained by the fact that the minimalist model assumes steady-state conditions and therefore does not take into


account kinetic effects. While these models are relatively simple and involve a number of simplifications, they can provide important insights into the behaviour of scaffold-mediated signalling by relating the behaviour to critical parameters such as protein-protein interaction strengths, concentrations and reaction rates.

4.1.2 Signal amplification. From the previous studies it is clear that the concentration of scaffold in the cell can have a large effect on the amplitude of the output signal of a pathway. At optimal scaffold concentrations, the input signal is strongly amplified leading to a significant increase in pathway activation, while at other concentrations the output is limited. To further investigate the effect of scaffolds on signal amplification, Thomson and co-workers compared careful experimental measurements in yeast with a quantitative rule-based model of the scaffold-mediated MAPK cascade.48 Interestingly, the measured concentration of Ste5 scaffold is the lowest of all pathway components in wildtype cells (~480 molecules/cell), while activation of the MAPK pathway using pheromone stimulation does not lead to an increase of Ste5. Furthermore, when upregulation of the Ste5 scaffold is induced, pathway activity at saturating pheromone levels increases as expected; however, the basal activity of the pathway without pheromone activation is also increased. This surprising result was confirmed by the computational model, which used a number of parameter sets derived from experimental literature and in all cases revealed a tradeoff between system output and the dynamic range of the pathway. Indeed, at maximal system output the difference between steady-state levels and full activation of the MAPK cascade is quite small.
The authors therefore suggest that yeast cells keep the Ste5 scaffold concentration at a low level to maintain a steady basal expression of pathway components, while allowing a large system response upon pheromone stimulation.48

While the computational work of the previous study confirms experimental observations of signal amplification in the MAPK pathway regulated by Ste5 scaffold levels, other experiments suggest that scaffolds can also attenuate or reduce signal transduction upon stimulation. To investigate this, Locasale et al. carried out kinetic Monte Carlo simulations of the MAPK pathway in a three-dimensional lattice box with 1 μm dimension.49 By choosing this type of simulation over ODE models that assume a well-stirred reaction environment, spatial effects on scaffold-ligand interactions are taken into consideration. Importantly, the authors investigate the effect of phosphatase activity on pathway activation and consider two situations. First, in the case of high phosphatase activity MAPK kinases are rapidly deactivated, making the cascade difficult to activate. Upon strong stimulation of the pathway, the interaction of the kinases with the scaffold leads to a sharp increase in active kinases and a significant amplification of the signal (Fig. 6A). In contrast, when phosphatase activity is low, the pathway is easy to activate and even a weak stimulation leads to a strongly amplified response. Strikingly, the model now predicts attenuation of signal transduction in the presence of scaffold (Fig. 6A). These results reveal an interesting aspect of scaffold


Fig. 6 (A) Computational predictions of signal amplification in the MAPK pathway in the absence (low scaffold-kinase binding affinity) and presence (high scaffold-kinase binding affinity) of scaffold. Two situations are considered: activation by a strong stimulus and high phosphatase activity (left) leads to amplification of signal output in the presence of the scaffold. At low phosphatase activity (right), a weak stimulus leads to strong amplification in the absence of scaffold, whereas the signal is attenuated in the presence of the scaffold. For each simulation a constrained case (circles) where scaffold-bound kinases cannot activate downstream kinases and an unconstrained case (triangles) where scaffold-bound kinases can activate other kinases was considered.49 Reprinted from J. W. Locasale, A. S. Shaw and A. K. Chakraborty, Proc. Natl. Acad. Sci. U. S. A., 2007, 104, 13307–13312. Copyright (2007) National Academy of Sciences, U.S.A. (B) Definitions of pathway specificity SA and fidelity FA of a scaffolded pathway A in the presence of an unscaffolded pathway B with a shared kinase X that can lead to unwanted cross-talk between the pathways. (C) Examples of input-output response profiles at different Hill coefficients.

function in signalling pathways: scaffolds are able to maintain steady levels of pathway activation by amplifying strong signals in an inhibitory environment, while limiting signal amplification in situations where the pathway is relatively easy to activate.

4.1.3 Pathway specificity. Signal transduction pathways are often part of larger, interconnected networks and as a result, components are shared between different pathways leading to unwanted cross-talk. Additionally, pathways such as the MAPK cascade are known to be activated by a large number of signals and therefore evoke several distinct cellular responses through the same reaction cascade. Scaffold proteins are thought to play a crucial role in reducing cross-talk between pathways and maintaining specificity. To investigate this within a general


theoretical framework, Bardwell, Komarova and co-workers introduce the concepts of specificity and fidelity and show that for simple pathway architectures the analytical expressions of these quantities can reveal important information on the behaviour of the networks.50,51 Consider two pathways A and B that interact with each other in an undesirable fashion. For example, kinases from pathway A can lack selectivity and therefore also activate components from pathway B. Alternatively, pathway A can share components with pathway B, leading to activation of pathway A either through input from A or from B. The specificity of pathway A is now defined as the ratio between the authentic output of A and the spurious output of A, i.e. the activation of pathway B in response to input from A. The fidelity of pathway A is defined as the ratio between its output in response to input from A and output in response to input from B (Fig. 6B). For basic architectures of interconnected kinase-phosphatase pathways it is possible to obtain simple analytical expressions for both specificity and fidelity, which are shown to depend on system properties such as substrate selectivity, input signal strength and phosphatase activity. Importantly, the authors show that high specificity and fidelity cannot be achieved by simply tuning these network parameters. Instead, key components or pathways in the network need to be insulated through higher-order mechanisms such as compartmentalization or scaffolding.50 To show this, the authors again consider two pathways A and B that share a kinase, but now allow activation of pathway A exclusively on the scaffold. In this sense the function of the scaffold is formally equivalent to the compartmentalization of pathway A. 
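The verbal definitions of specificity and fidelity given above can be written compactly. The notation is an assumption introduced here for illustration: O with subscript "X | Y" denotes the output of pathway X in response to input from pathway Y, and S_A and F_A follow the symbols used in the caption of Fig. 6B.

```latex
% Assumed notation: O_{X \mid Y} = output of pathway X in response to input from pathway Y
\begin{align}
  S_A &= \frac{O_{A \mid A}}{O_{B \mid A}}
      && \text{(authentic output of A over spurious activation of B, both for input from A)} \\
  F_A &= \frac{O_{A \mid A}}{O_{A \mid B}}
      && \text{(output of A for input from A over output of A for input from B)}
\end{align}
```

In this form, both quantities exceed 1 when pathway A responds mainly to its own input and activates mainly its own output, and larger values indicate better insulation from pathway B.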
As a result, both the specificity and the fidelity of the scaffolded pathway are significantly enhanced, under the constraint that deactivation of kinases by phosphatases is relatively fast compared to the binding and release of the kinases from the scaffold. Moreover, the specificity of the unscaffolded pathway can be improved by selective activation of the scaffold. This experimentally observed mechanism implies that a scaffold such as Ste5 only exists in an open, or active, conformation during authentic signalling, leading to a reduction of unwanted output of the unscaffolded pathway in response to cross-reactivity of the scaffolded pathway.50 These results highlight the importance of scaffolds as necessary insulating components to increase specificity in the large, interconnected networks of signal transduction. Because specificity and fidelity are introduced as simple ratios of pathway input and output signals that can be straightforwardly measured in an experimental setting, they can be useful in quantifying and supporting experimental observations and improving the general understanding of signal transduction networks.

4.1.4 Ultrasensitivity. Signal transduction pathways are involved in a large number of cellular responses, which require an appropriate reaction to internal or external stimuli. Some pathways convert an external input directly into an intracellular signal, leading to the classical input–output response profile described by Michaelis–Menten kinetics (Fig. 6C). Other processes require the conversion of a continuous,


graded input stimulus into an unambiguous switch-like output, indicated by a steep, sigmoidal response profile. Consequently, a network or pathway behaves very sensitively in a narrow concentration regime, where a small increment in the input signal leads to a large all-or-nothing change in output signal. This ultrasensitivity can be caused by a number of mechanisms, such as cooperativity, positive feedback loops and multisite activation,52 and the extent to which a system behaves ultrasensitively is usually described by the Hill coefficient. This coefficient is determined by fitting the curve that describes the amplitude of the output signal as a function of the input to a Hill equation and indicates the steepness of the response compared to the standard Michaelis–Menten curve. For example, binding of multiple ligands to an enzyme in a strictly independent way results in a Hill coefficient of 1, while cooperative binding of oxygen to hemoglobin is an ultrasensitive process with a Hill coefficient of 2.8 (Fig. 6C).

To show that scaffold proteins can be used to realize ultrasensitive behaviour, Dueber et al. engineered modular protein switches that produce either a graded or ultrasensitive output depending on the size of an intramolecular scaffold, i.e. the number of SH3 domains.53 The protein switch consists of a catalytically active N-WASP domain, which is inhibited intramolecularly by an SH3 interaction module. The input of the switch is an SH3-binding peptide, which releases the interaction module and activates the catalytic N-WASP domain, producing the output signal. With a simple equilibrium model, the authors show that the degree of ultrasensitivity of the input-output response can be tuned by varying the number of SH3 interaction modules and that the response profile is largely unaffected by the individual interaction strength of the SH3 domains.
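The role of the Hill coefficient described above can be made concrete with a short sketch. It evaluates the Hill equation and estimates an effective Hill coefficient from the inputs that give 10% and 90% of the maximal output, using the standard relation n = ln(81)/ln(EC90/EC10). This shortcut is an alternative to the curve fitting mentioned in the text, not the procedure used in the cited studies, and the half-maximal input `k_half` is a hypothetical value.

```python
import math


def hill(x, k_half, n):
    """Fractional output of a Hill-type response for input x."""
    return x ** n / (k_half ** n + x ** n)


def input_for_fraction(fraction, k_half, n):
    """Input level at which the Hill response reaches the given fraction of maximum."""
    return k_half * (fraction / (1.0 - fraction)) ** (1.0 / n)


def effective_hill_coefficient(ec10, ec90):
    """Steepness estimated from the inputs giving 10% and 90% of maximal output."""
    return math.log(81.0) / math.log(ec90 / ec10)


k_half = 1.0  # hypothetical half-maximal input, arbitrary units
for n in (1.0, 2.8):  # Hill coefficients quoted in the text
    ec10 = input_for_fraction(0.1, k_half, n)
    ec90 = input_for_fraction(0.9, k_half, n)
    print(f"n = {n}: EC90/EC10 = {ec90 / ec10:.1f}, "
          f"recovered n = {effective_hill_coefficient(ec10, ec90):.2f}")
```

For a Hill coefficient of 1 the input must change 81-fold to move the output from 10% to 90% of its maximum; higher coefficients compress this range, which is what makes the response switch-like.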
Importantly, the model predicts that the activation of the switch becomes increasingly difficult as the number of SH3 modules, and therefore the degree of ultrasensitivity, increases. Experimental work confirmed the predicted properties of the switches, with Hill coefficients ranging from 1.0 for a single SH3 domain to 3.9 for five SH3 domains. Moreover, the experiments support the computationally predicted tradeoff between ultrasensitivity and switch activation.53

In yeast and other eukaryotic organisms, the MAPK pathway and associated scaffolds such as Ste5 and KSR are involved in a number of important switch-like decisions, including proliferation, mating and differentiation, where gene expression needs to be regulated in an ultrasensitive way in response to continuous, graded levels of external stimuli. Indeed, early work by Huang and Ferrell indicates the possibility of ultrasensitivity in a linear cascade of kinase-phosphatase reactions.54 Numerical solutions of an ODE model describing the three-member MAPK cascade without scaffolds indicated a robust ultrasensitive response of kinase activity with a Hill coefficient of nearly 5 over a range of reaction parameters corresponding to experimentally observed values. In contrast, Levchenko et al. investigated the role of scaffolds in MAPK cascades using ODE models and found that the presence of a scaffold changes the switch-like behaviour of kinase cascades upon pathway stimulation to a more graded, linear response.43


These theoretical works suggest that an ultrasensitive response can only be generated when a signalling pathway is not scaffolded. However, multiple experimental studies indicate that a switch-like response is possible in scaffolded pathways and that scaffolds are sometimes even required for ultrasensitive behaviour. For example, Hao and co-workers subjected yeast cells to a gradient of pheromone levels and found that activation of the Ste5-scaffolded Fus3 kinase through the MAPK cascade is slow and ultrasensitive, while the unscaffolded activation of Kss1 kinase via the same pathway is graded.55 The authors use computational models to demonstrate that scaffold binding slows kinase activation and therefore delays the response until a switch-like ultrasensitive response is observed. Alternatively, Takahashi and Pryciak argue that the MAPK cascade is inherently ultrasensitive and that additional insulating mechanisms are necessary to convert the switch-like behaviour into a more graded response.56 One of those mechanisms is the recruitment of scaffold-bound kinases to the cell membrane via binding of Ste20 (Fig. 3B). Indeed, when membrane localization of the MAPK pathway was prevented in yeast cells, activation of Fus3 remained ultrasensitive even in the presence of scaffold Ste5. In a different study, Malleshaiah and colleagues also find an ultrasensitive response of the scaffold-mediated MAPK pathway and ascribe it to a competition between the kinase Fus3 and a regulatory phosphatase Ptc1 for four phosphorylation sites on the Ste5 scaffold.57 Only when all four sites are dephosphorylated by Ptc1 does Fus3 become active by releasing from Ste5. Partly due to the competition between Ptc1 and Fus3, the release becomes ultrasensitive with a high Hill coefficient of approximately 6. An ODE model suggests a two-stage binding of Fus3 kinase (and Ptc1) to Ste5, in which Fus3 first binds to a docking site and then is able to reach the phosphorylation sites.
As a result, Ste5 phosphorylation occurs at locally saturated enzyme levels and zero-order ultrasensitivity contributes to the switch-like activation of Fus3.52,57

In summary, kinase-phosphatase pathways can exhibit ultrasensitive or switch-like behaviour, even in the presence of a scaffold protein. This conclusion is consistent with a thorough theoretical study by Thalhauser and Komarova, who revisited the ODE model of the yeast MAPK cascade by including membrane localization of the Ste5 scaffold.58 Strikingly, the model is able to capture all experimentally observed behaviours by careful consideration of the strength of the protein-protein interactions in the pathway: a stronger binding of kinases to the scaffold reduces the ultrasensitivity of the response, whereas a higher extent of selective activation of the scaffold by membrane localization favors a more graded response. These results reiterate the fact that scaffolds can have multiple mechanisms of dynamically altering and shaping the response dynamics of the cell and that these mechanisms are not necessarily mutually exclusive. The mathematical and computational models discussed here help to distinguish and separate the distinct functions of scaffolds in signal transduction pathways. Because of the complexity of these networks, models always require simplifications and assumptions, and therefore care must be taken to correctly interpret the results and compare them to experimental observations.


4.2 Models of metabolic pathways

Whereas signalling pathways involve a network of enzymes that activate each other to evoke a downstream cellular response, metabolic pathways are characterized by a cascade of enzymes that catalyze the sequential reaction of small molecule metabolites. Here too, spatial co-localization of functionally related enzymes provides a means of increasing specificity, efficiency and throughput (reviewed extensively for both natural and synthetic systems59–61). In some metabolic networks this spatial organization is achieved by large multifunctional polypeptides that carry several enzymatic domains that each catalyze a distinct step in the cascade, while others utilize actual physical tunnels through which substrates can travel from one enzyme to the other. Because intermediate metabolites can be transferred directly and efficiently between enzymes, these pathways inherently reduce unwanted cross-talk and increase the overall efficiency of the cascade. Mathematical models and simulations have been developed to quantitatively assess the effect of co-localization on enzymatic cascades. For example, Brownian dynamics simulations were used to investigate simple two-step reaction cascades catalyzed by two enzymes that are placed within close proximity.62,63 Intuitively, the simulations indicate that the probability of reaction for the overall cascade increases more than threefold when the distance between active sites is small (~1 nm), because the direct diffusive transfer of intermediates between enzymes can occur rapidly. Alternatively, scaffold proteins are employed to bring metabolic enzymes in close proximity, and experimental observations demonstrate that the processing of intermediates in scaffolded cascades is significantly accelerated (Fig. 7A).59 However, the direct transfer of an intermediate from one enzyme to the other becomes increasingly less likely when the active sites are not in immediate vicinity of each other (i.e. <2 nm) and therefore the effect of substrate channelling is strongly reduced. Indeed, numerical reaction-diffusion models show that the effect of direct diffusive

Fig. 7 (A) Schematic overview of substrate channelling on a scaffold. The probability of direct diffusive transfer of intermediates (I) from one enzyme to the next is enhanced on a scaffold, increasing the formation of product (P) from substrate (S). Diffusion away from the scaffold reduces the effectiveness and can lead to competing sequestering reactions. (B) Schematic overview of co-clustering of functionally related enzymes into large multienzyme complexes. Cascade reaction rate is increased by the availability of multiple enzyme targets in the direct environment. Non-specific scaffold-substrate interactions can further increase the activity of the cascade.


substrate transfer is relatively small, only giving rise to a short boost in activity in the initial stage of the reaction.64 Relatively quickly, however, the substrate accumulates in the surrounding environment and the initial benefit of spatial localization is lost. An analytical treatment reveals that the characteristic timescale of this temporary boost is on the order of seconds for a typical scaffolded cascade with an interenzyme distance of 10 nm. Interestingly, this timescale is found to be independent of the catalytic properties of the enzymes, and instead depends only on the diffusion coefficient of the substrate and the volume of the reaction container. This theoretical work indicates that the direct transfer of intermediates by diffusion from one enzyme to the next only has a minor effect in scaffolded metabolic cascades.

Three other mechanisms that can enhance or accelerate cascade activity were investigated using numerical models. First, the presence of a sequestering reaction that inactivates the intermediate metabolite can have an influence on the reaction cascade, similar to the action of phosphatases in signalling pathways. Indeed, in metabolic pathways intermediates are often highly reactive species that are prone to oxidation or other cross-reactions. Idan and Hess included a sequestering reaction in a reaction-diffusion model of a simple scaffolded two-reaction cascade and show that a sharp concentration gradient of the intermediate is formed directly surrounding the first enzyme.64 This gradient indicates that the intermediate is quickly sequestered in the surrounding environment, leading to an increase in the contribution of direct substrate transfer to the second enzyme and a permanent boost to cascade activity. Another possibility is the presence of a non-specific interaction between the metabolite and the scaffold.
Depending on the chemical properties of the metabolite, electrostatic or van der Waals interactions can lead to a small affinity of the intermediate to the scaffold surface. As a result, the local concentration of the intermediate on the scaffold increases, which leads to an enhancement of cascade activity. In the reaction-diffusion model of Idan and Hess64 this was taken into account by introducing a semipermeable barrier around the cascade through which diffusion of the intermediate substrate is reduced. Indeed, even a small binding affinity between intermediate and scaffold (<2 kT, smaller than known interactions, e.g. between small molecules and DNA64) results in a persistent increase in cascade output for several hours. Additionally, the presence of the scaffold partially prevents the diffusion of the intermediate away from the enzymes, increasing the probability of efficient substrate transfer.63

Finally, the effect of precise nanoscale geometry and stoichiometry on cascade activity was investigated. Motivated by previous studies that indicated only a minor effect of direct substrate transfer at typical interenzyme distances62 and in vivo observations that suggest that metabolic enzymes are often found in unorganized clusters,59 the group of Wingreen developed a quantitative spatiotemporal mathematical model describing a two-step reaction cascade.65 The model is based on reaction-diffusion equations that describe enzyme and substrate concentrations using continuous densities. The enzyme distribution for optimal pathway efficiency was found to be a co-cluster of a large number


of both enzymes, in which the precise arrangement of the enzymes is not important (Fig. 7B). Co-localization of enzymes in aggregates of 0.26 μm diameter increases the efficiency of the two-member pathway nearly 6-fold, while a model of a three-member cascade displayed a 110-fold increase. The computational model was complemented by exciting in vivo experiments, in which the direction of substrate flux at a metabolic branching point was established through the co-clustering of the appropriate upstream and downstream enzymes of one of the branches. These results confirm the observation that the interenzyme distance on scaffolds is not the critical parameter in enhancing metabolic activity, but instead the availability of multiple enzyme targets in co-clusters of a large number of enzymes can strongly increase the transfer of intermediates between enzymes.63,65

5 Synthetic scaffolds in engineered signalling networks

Engineered protein scaffolds are important tools in synthetic biology, both for gaining a fundamental understanding of the underlying molecular processes and for exploring the potential of artificial signal transduction pathways. Wendell Lim reviewed the opportunities and applications for synthetic scaffolds in the design of engineered cell signalling networks.66 The reviewed exploratory work reveals synthetic scaffolds that are functional and their relation to engineered constructs that are not.66 This understanding can lead to applications where synthetic scaffold proteins and higher-order complexes show precisely controlled behaviour, which could be beneficial in therapeutics, diagnostics and biotechnology. Early efforts in redirecting signalling pathways revealed that recombination of adaptor protein domains made it possible to rewire pathways.67–69 First, we will discuss the two leading studies in recombination of interaction domains, followed by a review of Ste5-derived scaffolds. The majority of the synthetic scaffolds engineered until now are based on this well-studied scaffold and this selection will demonstrate the potential of engineered scaffolds in synthetic signalling. In addition, synthetic scaffolds based on phosphorylation will be discussed, as phosphorylation is essential in signal transduction. Finally, synthetic scaffolds applied in metabolic engineering will be highlighted to show their potential in biotechnology. To our knowledge, synthetic higher-order protein assemblies applicable to synthetic signal transduction have not yet been published and remain a goal for the future.
5.1 Pioneers in synthetic scaffolding

The group of Pawson was one of the first to investigate interchangeability of domains from two different pathways to create new synthetic scaffolds.67 The authors coupled the SH2 domains from adaptor proteins Grb2 and ShcA, involved in growth and survival, to the DED of FADD, involved in programmed cell death.67 The chimeric adaptor was able to redirect mitogenic or RTK input signals to an output of induced caspase activation, resulting in cell death. A study in oncogenic cells showed that elevated levels of RTK activity actuated the chimeric


adaptor leading to selective apoptosis. These results were the first to show the prospect of engineered scaffolds to control signalling pathways based on interchangeability of interaction domains. Moreover it showed that responses to external signals could be altered in such a way that they could be therapeutically beneficial.67 Other leading work in domain interchangeability led to a synthetic switch that could control actin polymerization.68 Combination of the VCA output domain of the regulator protein N-WASP (neuronal WiskottAldrich syndrome protein) with the autoinhibitory input domain PDZ resulted in a switch that was repressed under basal conditions. Competitive binding of an external PDZ ligand activated the switch while precise gating behaviour could be tuned by the affinity of the autoinhibitory interaction.68 For a two-input switch system a combinatorial library was designed using two geometrically different output domains, variation of interdomain linker lengths and intramolecular ligands with diverse affinities. This led to 34 switches with diverse behaviours, showing antagonistic behaviour, negative input control and positive input control of AND and OR gates. The small library revealed that subtle changes in switch design could lead to significant changes in gating behaviour.68 This study confirmed that use of modular input and output domains permit interchangeability, allowing unique regulatory connections between otherwise unrelated proteins.
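The statement that gating can be tuned through the affinity of the autoinhibitory interaction can be illustrated with a minimal equilibrium sketch. The three-state model below (closed, open, open with ligand bound) is an assumption made here for illustration and is not the model used in the cited study; the function and parameter names (`fraction_active`, `k_closed`, `kd_ligand`) are hypothetical.

```python
def fraction_active(ligand, kd_ligand, k_closed):
    """Fraction of switch molecules in the open (active) state.

    Assumed three-state equilibrium (illustrative only):
      closed, autoinhibited   statistical weight k_closed
      open, ligand-free       statistical weight 1
      open, ligand-bound      statistical weight ligand / kd_ligand
    The external ligand is assumed to bind only the open state.
    """
    bound = ligand / kd_ligand
    return (1.0 + bound) / (1.0 + bound + k_closed)


def half_activation_ligand(kd_ligand, k_closed):
    """Ligand concentration at which half of the switch molecules are open."""
    # Solving (1 + x) / (1 + x + k_closed) = 0.5 gives x = k_closed - 1,
    # where x = ligand / kd_ligand.
    if k_closed <= 1.0:
        return 0.0  # at least half of the molecules are open without ligand
    return kd_ligand * (k_closed - 1.0)


if __name__ == "__main__":
    # A stronger autoinhibitory interaction (larger k_closed) lowers basal
    # activity and shifts the activation threshold to higher ligand levels.
    for k_closed in (10.0, 100.0, 1000.0):
        basal = fraction_active(0.0, kd_ligand=1.0, k_closed=k_closed)
        threshold = half_activation_ligand(kd_ligand=1.0, k_closed=k_closed)
        print(k_closed, basal, threshold)
```

In this sketch a tighter intramolecular interaction gives both stronger basal repression and a higher activation threshold, which is one simple way in which affinity could set gating behaviour.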

5.2 Engineered scaffolds based on Ste5

Lim and co-workers also intensively investigated the Ste5 scaffold from the yeast MAPK pathway. The authors showed that simple recruitment of pathway components is sufficient for signal transduction and tested their hypothesis that primitive tethering could generate new pathways by recombining interaction domains, resulting in a diverter scaffold.69 This diverter scaffold was created from Ste5, responsible for the mating response, fused to Pbs2, involved in the osmolarity pathway. The shared interaction partner Ste11 acted as the diverter node, resulting in a new input-output linkage of α-factor input and osmoresponse output (Fig. 8A).69 This engineered diverter scaffold revealed the possibility of systematic manipulation and redirection of signalling pathways.69

Subsequent research showed that engineered circuits based on a synthetic Ste5 scaffold could lead to ultrasensitivity, tunable adaptation and accelerated or delayed responses.70 For this, the endogenous mating pathway was coupled to synthetic feedback loops by fusing a leucine zipper to the C-terminus of the Ste5 scaffold, leading to an altered mating response. Heterodimerization of the leucine zipper with a complementary zipper, fused to a modulator protein, allows regulation of output. Either a positive regulator (Ste50) or a negative regulator (Msg5) was coupled to the complementary zipper to allow recruitment, displaying robust yet opposite effects on the pathway output.70 In addition, unrecruited Ste50 and Msg5 showed considerably smaller effects when expressed at the same level, stressing the necessity of recruitment to the scaffold.70 Moreover, putting the modulator Msg5 under control of a


Fig. 8 (A) The diverter scaffold developed by Park et al.69 linking α-factor input from the Ste5 scaffold to the osmoresponse of the Pbs2 scaffold, based on the diverter node Ste11. Mutations in the modular binding sites led to variants showing the minimal requirements for pathway activity. Diverter activity was measured using an α-factor disc assay, in which growth around the disc indicates a functioning diverter. From S.-H. Park, A. Zarrinpar and W. A. Lim, Rewiring MAP Kinase Pathways Using Alternative Scaffold Assembly Mechanisms, Science, 2003, 299, 5609, 1061–1064. Reprinted with permission from AAAS. (B) An ultrasensitive switch based on the Ste5 scaffold was developed by Bashor et al.70 Leucine zippers allow recruitment of negative and positive modulators. The combined use of a negative modulator and a positive-feedback loop results in displacement of the negative modulator by inducible expression of a high-affinity positive modulator, resulting in ultrasensitivity. From C. J. Bashor, N. C. Helman, S. Yan and W. A. Lim, Using Engineered Scaffold Interactions to Reshape MAP Kinase Pathway Signaling Dynamics, Science, 2008, 319, 1539–1543. Reprinted with permission from AAAS.

promoter generated a negative feedback loop in which the strength of the feedback could be tuned by modifying the affinity of the leucine zippers or by regulating the expression level of the negative modulator.70 In addition, more complex negative feedback circuits were designed by adding competitors, resulting in pulse-like activation, accelerated or delayed response times and ultrasensitive dose-response behaviour (Fig. 8B).70 The recombination of modular domains from the yeast mating pathway was further investigated by recombination of 11 proteins


involved in mating response. This led to a library of 66 chimeric constructs.71 The constructs were transformed into yeast and the phenotypic output of domain recombination was compared to the endogenous network, full gene duplication, domain duplication and domain co-expression. The recombined constructs caused altered dynamic behaviour, either inhibiting or activating the mating pathway. Moreover, as a result of recombination some strains mated more efficiently than wild type. Additionally, the effect of domain shuffling was studied in more detail for proteins of the mating pathway that are shared with other pathways. This confirmed that new regulation or localization of the catalytic domains alters signalling behaviour and cellular phenotype.71 Once more, this emphasizes that recombination enables the engineering of new networks with desired functions.

Recruitment of modulator proteins to a synthetic scaffold can be achieved in various ways. Wei et al. used a strategy similar to that of Lim and co-workers, making use of leucine zippers to recruit effector proteins produced by bacterial pathogens.72 The authors investigated the effect of OspF and YopH on the MAPK pathway in yeast and human primary T-cells.72 Under normal conditions the introduced OspF causes growth inhibition in yeast; however, introducing leucine zippers to OspF and to either the Ste5 or the Pbs2 scaffold triggered selective recruitment of the inhibitory protein, causing irreversible inactivation of the MAPKs.72 Additionally, a synthetic OspF feedback loop displayed frequency-dependent input filtering, altering the osmoresponse. Human T-cells were used to investigate the rewiring capability of bacterial effectors for therapeutic applications. In T-cell therapy, adverse side effects such as over-activation or off-target activation might be controlled by tuning the specificity, timing and amplitude of T-cell function using synthetic modulators.
OspF and YopH both inhibit TCR signalling by distinct mechanisms and were used to design circuits that limit the amplitude of responses or act as a pause switch in T-cells. A library of negative feedback loops involving OspF and YopH reduced the maximal response amplitude of T-cell activation; in addition, the amplitude of the response could be tuned by changing feedback promoter strength and effector stability.72 Moreover, a pause switch was explored that enables transient and reversible disabling of T-cells. A doxycycline-inducible promoter facilitated external control over effector expression. Experiments in Jurkat T-cells and CD4+ T-cells showed doxycycline-induced effector expression, leading to TCR inhibition, from which the cells recovered after doxycycline removal.72 These results show the potential of engineered scaffolds in therapeutic applications.

The signal output of the MAPK pathway could also be controlled using a different approach. Galloway et al. employed orthogonal diverters to activate or attenuate pathway output.73 The network diverter consisted of an RNA-based transducer, promoters that act as modulators controlling the level and mode of expression, and pathway regulators. In this research the authors employed the negative regulator Msg5 alongside the positive regulator Ste4, which should result in contrasting cell fate responses. Pathway activity was measured by fluorescence under the


control of a transcriptional reporter, while cell fate was measured using halo assays. Positive and negative diverters were developed with and without feedback, enabling either pathway activation in the absence of pheromone or a decrease of pathway activity in the presence of pheromone. The engineered dual diverter module, containing both the positive and the negative diverter, showed antagonistic behaviour, in which basal expression from each diverter is sufficient to antagonize the opposite diverter. The influence of additional positive and negative modules on pathway output was modelled by varying the module strength. This analysis revealed that incorporating the added modules enhances the function of the respective diverter while having a minimal effect on the opposing diverter.73 Different strategies were employed to minimize pathway antagonism in the engineered dual diverter system. These efforts led to the construction of an amplification diverter, which overcomes the effect of the negative diverter, and an attenuation diverter, which has minimal impact on pathway activation. Finally, combining the engineered diverters enabled routing of cells into distinct fates and additionally revealed pathway sensitivity. The molecular diverters developed in this research therefore demonstrate the ability to control cell fate spatiotemporally.

In 2015, Peisajovich and co-workers created a library of Ste5-derived scaffolds, following a strategy comparable to that explored by Lim and co-workers, by shuffling the interaction domains of nine proteins involved in the MAPK pathway.74 The library of synthetic scaffolds was created by directed evolution, resulting in 3375 scaffold variants. These were transformed into a mutated yeast strain lacking the native Ste5 scaffold. A mating-responsive promoter controlling the expression of GFP allowed scaffolds capable of rescuing the MAPK pathway to be selected by FACS.
Sequencing of the selected scaffolds showed that activity is maintained after rearrangement of the interaction domains; however, not all possible rearrangements lead to active scaffold variants (Fig. 9A). Moreover, active variants could be selected even though they were composed of domains belonging to other mating pathways (Fig. 9A). Full mating response, involving changes in cell morphology and fusion of two cells, was verified for all active synthetic scaffolds. In addition, the catalytic vWA domain was required for the release of Fus3 from Ste5, although the position of the domain in the scaffold was found not to be important. Nonetheless, time-course measurements revealed that the kinetics of the mating response are affected by the scaffold's domain architecture. Dose-response curves, on the other hand, indicate that the ultrasensitivity of the pathway is robust to changes in the scaffold's domain architecture.74 Changes in domain order and composition are tolerated, suggesting that no defined geometry is required and that protein scaffolds in general can be used as a platform for signalling engineering.74

More recently, Ryu and Park investigated replacing the native Ste5 scaffold with a synthetic complex, allowing them to assess the minimal requirements for pathway reconstruction.75 The authors used multiple repeats of the PDZ domain (of PSD95) coupled to a membrane-targeting


Fig. 9 (A) Synthetic scaffolds composed of shuffled modules (left) derived from protein scaffolds involved in the MAPK pathway. The rearranged modules originate from N-terminal, C-terminal or internal domains of the scaffold proteins (center). Pathway activity was assessed by flow cytometry, measuring GFP expression under the control of a mating-responsive promoter (right). Synthetic scaffolds consisting of Ste5 domains show the highest pathway activation; moreover, the order of these domains can be changed. Scaffolds containing domains of proteins other than Ste5 show minimal activity upon addition of pheromone. Reprinted with permission from A. Lai, P. M. Sato and S. G. Peisajovich, ACS Synth. Biol., 2015, 4, 714–722. Copyright 2015 American Chemical Society. (B) Schematics of the synthetic scaffold engineered by Ryu and Park, based on PDZ domains to recruit the MAPK pathway kinases. Signalling output of the pathway was measured by Fus1-lacZ reporter induction; kinases lacking the TP-tag for PDZ recruitment are shown in grey. The synthetic pathway shows properties of a logic gate, as the signalling output shows the greatest response compared to background activity when all three kinases are present, whereas Ste11-Fus3 and Ste11-Ste7 also lead to a small increase in response. From J. Ryu and S. Park, Simple synthetic protein scaffolds can create adjustable artificial MAPK circuits in yeast and mammalian cells, Science Signal., 2015, 8, 1–11. Reprinted with permission from AAAS.

domain and investigated whether specific localization would be sufficient to activate the pathway (Fig. 9B). The corresponding pathway kinases Ste11, Ste7 and Fus3 were all fused to a C-terminal target peptide that binds with moderate affinity to the PDZ domain. The engineered construct was


able to recruit the tagged kinases; nevertheless, plasma membrane localization was required for upstream activation by Ste20. Surprisingly, diploid formation assays and flow cytometry-based assays indicated that two PDZ domains were already sufficient for a functional signalling network, suggesting that either kinase exchange or cross-activation between kinases bound to different scaffolds enabled the response. Subsequently, varying the presence of the involved kinases showed graded and all-or-none responses, which indicates that the synthetic pathway has properties of a logic gate (Fig. 9B). Moreover, the authors showed that the PDZ-based construct was also functional in mammalian 293T cells, where they used target-peptide-tagged Raf-1, MEK1 and ERK1 to create a functional ERK/MAPK pathway. This research therefore confirms that localization of interaction partners is sufficient for engineered pathway activation and logic gate response, and that exchange and cross-activation can contribute to pathway establishment with minimal components.

Because the Ste5 scaffold is so well characterized, it allows simple engineering of synthetic scaffolds, giving insight into modularity, domain order, domain composition and regulators that can influence pathway output. Ste5 therefore serves as a valuable example for future research.
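Several of the Ste5 studies above distinguish graded from ultrasensitive (switch-like) dose-response behaviour. A common way to quantify this distinction is the Hill function, sketched below. This is a generic textbook description and not the analysis used in any of the cited studies; the function names and the normalization of the response to the range 0 to 1 are choices made here for illustration.

```python
def hill_response(dose, ec50, n):
    """Normalized Hill dose-response: 0 at zero dose, approaching 1 at saturation."""
    return dose**n / (ec50**n + dose**n)


def dose_for_response(level, ec50, n):
    """Dose that gives the requested fractional response (0 < level < 1)."""
    return ec50 * (level / (1.0 - level)) ** (1.0 / n)


def ec90_over_ec10(n):
    """Fold change in dose needed to move from 10% to 90% of maximal response."""
    return dose_for_response(0.9, 1.0, n) / dose_for_response(0.1, 1.0, n)


if __name__ == "__main__":
    # A Hill coefficient of 1 describes a graded response; larger coefficients
    # compress the dose range over which the output switches on.
    for n in (1, 2, 4):
        print(n, ec90_over_ec10(n))
```

For a Hill coefficient of 1 an 81-fold change in input is needed to go from 10% to 90% output, whereas for a coefficient of 4 a 3-fold change suffices, which is why a high apparent Hill coefficient is read as ultrasensitivity.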

5.3 Phosphorylation-based engineered scaffolds

To show the potential of synthetic scaffolds in reprogramming cellular behaviour and the essential role of phosphorylation in signal transduction, Yeh et al. engineered guanine nucleotide exchange factors (GEFs) coupled to a non-native input.78 In this way two distinct pathways were linked in a fashion similar to some of the previously described studies. GEFs are activators of actin-regulating GTPases; in this research the activity of the GEFs was placed under the control of PKA. To this end, the catalytic domain of the GEF intersectin was coupled to a PKA-sensitive autoinhibitory domain, consisting of a PDZ domain and its target peptide, in which phosphorylation of the peptide disrupts the interaction. Coupling the PKA regulatory module to other GEFs resulted in seven constructs, all of which showed a degree of repression under basal conditions. Four of these constructs could be activated by PKA. Examination of the synthetic GEFs in cells revealed PKA-dependent morphological changes, showing that the otherwise unrelated pathways of endogenous PKA signalling and GTPase-mediated cell morphology can be linked (Fig. 10A).78 Moreover, the authors demonstrated that two synthetic GEFs can be linked in series to form a higher-order cascade, leading to reduced noise, amplification of the response and increased ultrasensitivity.

The interaction of the scaffold protein 14-3-3 with its interaction partners is phosphorylation-dependent, as explained earlier. Ottmann and co-workers studied the capability of 14-3-3 to act as a chemically induced dimerization (CID) scaffold that could be made reversible using the small natural product fusicoccin (FC).79 The authors fused the proteins of interest to the C-terminal 52 amino acids (CT52) of the plant plasma membrane pump H+-ATPase (PMA). The interaction between CT52 and 14-3-3 can be stabilized by the small molecule FC and the system was


made orthogonal, preventing interactions with endogenous proteins. The reversibility of the system was demonstrated using 14-3-3 and CT52, both tagged with a fluorescent protein, together with nuclear localization sequences. Cell experiments indicated that upon addition of the small molecule FC, the CT52-tagged fluorescent protein moved out of the nucleus and associated with 14-3-3, which was mainly present in the cytoplasm. Washing with medium removed FC and demonstrated the reversibility of the translocation.79 Physiologically relevant experiments further showed that the protein of interest, transcription factor NF-κB conjugated to CT52, was translocated to the nucleus upon addition of FC,

Fig. 10 (A) Schematic representation of an engineered guanine nucleotide exchange factor (GEF), in which PKA phosphorylation of the scaffold leads to filopodia formation. Forskolin-dependent filopodia formation of cells was observed, while cells lacking the GEF scaffold showed minimal background activity. GEF1* acts as a control, as this autoinhibited scaffold cannot be activated by PKA. Reprinted by permission from Macmillan Publishers Ltd: Nature. B. J. Yeh, R. J. Rutigliano, A. Deb, D. Bar-Sagi and W. A. Lim, 2007, 447, 596–600, copyright 2007. (B) The two-component system developed by Whitaker et al.76 shows scaffold-dependent control of phosphotransfer (y-axis, CpxR activation). Addition of a control scaffold that is unable to colocalize the phosphotransfer components resulted in overall low activity. Reprinted from W. R. Whitaker, S. A. Davis, A. P. Arkin and J. E. Dueber, Proc. Natl. Acad. Sci., 2012, 109, 18090–18095. (C) Metabolic scaffolding of a three-enzyme system shows a 77-fold increase in mevalonate production when an engineered scaffold is used that consists of one GBD domain and two SH3 and PDZ domains.77 Reprinted by permission from Macmillan Publishers Ltd: Nature Biotechnology. J. E. Dueber, G. C. Wu, G. R. Malmirchegini, T. S. Moon, C. J. Petzold, A. V. Ullal, K. L. J. Prather and J. D. Keasling, 2009, 27, 753–759, Copyright 2009.


leading to transcription of specific target genes involved in the immune and inflammatory response.

Whitaker et al. focused on phosphate transfer and investigated the redirection of phosphate transfer in prokaryotes by making use of the two-component system (TCS).76 A TCS generally consists of a histidine kinase, which functions as a ‘receptor’ and consequently autophosphorylates, and a response regulator, which is the target for phosphate transfer. The authors hypothesized that the kinetic preferences of TCSs might be partly ascribed to specific binding, meaning that an increased local concentration of synthetic modularized parts could control signalling between non-associated components.76 They made use of the natural histidine kinase Taz, which displays weak natural cross-talk to noncognate response regulators. To increase the local concentration of the TCS components, Taz was fused to the SH3 domain of Crk and a scaffold was created from an SH3 ligand fused to a leucine zipper. In addition, the response regulator was fused to the complementary leucine zipper. Upon addition of the scaffold, amplification of specific noncognate phosphotransfer was observed. Moreover, biphasic behaviour was observed when varying the scaffold concentration, indicating that scaffold-dependent signalling depended on the ratio of scaffold to target protein concentration (Fig. 10B).76 Robustness of the system was assessed by including autoinhibition in the Taz kinase through the addition of an intramolecular SH3 ligand, leading to competition with the SH3 ligand-leucine zipper scaffold. It was shown that autoinhibition decreased the sensitivity to histidine kinase concentration when no scaffold was present. This research shows that modular and independently tunable parts facilitate the engineering of synthetic signalling for two-component systems and offer a promising engineering platform for other prokaryotic pathways.
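The biphasic dependence on scaffold concentration described above can be rationalized with a deliberately simplified titration model. The sketch below assumes a scaffold with two independent sites, one per partner, and binding so tight that every available partner is bound; these assumptions, together with the names `complete_complexes`, `kinase` and `regulator`, are introduced here for illustration and do not reproduce the quantitative analysis of the cited work.

```python
def complete_complexes(scaffold, kinase, regulator):
    """Concentration of scaffolds carrying both partners (tight-binding limit)."""
    if scaffold <= 0.0:
        return 0.0
    # Each partner fills its own site. Once the scaffold outnumbers a partner,
    # only a fraction partner / scaffold of those sites can be occupied.
    occupied_kinase_site = min(1.0, kinase / scaffold)
    occupied_regulator_site = min(1.0, regulator / scaffold)
    # Sites are assumed to be independent, so joint occupancy is the product.
    return scaffold * occupied_kinase_site * occupied_regulator_site


if __name__ == "__main__":
    # Titrate the scaffold against fixed, equal partner concentrations.
    for scaffold in (0.25, 0.5, 1.0, 2.0, 4.0):
        print(scaffold, complete_complexes(scaffold, kinase=1.0, regulator=1.0))
```

In this limit the number of complete complexes rises with scaffold concentration until the scaffold matches its partners and then falls, because excess scaffold separates the partners onto different molecules. The optimum therefore depends on the ratio of scaffold to target proteins rather than on the scaffold concentration alone.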
Hobert and Schepartz developed a synthetic scaffold protein that could redirect tyrosine phosphorylation.80 The authors made use of miniature proteins, which are small, well-folded protein domains. In their research they redirected the Src family kinase Hck to phosphorylate hDM2.80 hDM2 is normally a poor substrate of Hck and negatively regulates the tumor suppressor p53. The engineered scaffold consisted of two miniature proteins. The first, YY2, contains an SH3-binding domain that allows interaction with Hck and disrupts the intramolecular SH3 domain interaction of Hck, resulting in an increase in kinase activity. The second miniature protein, 3.3, contains a binding site that allows association with hDM2 and simultaneously inhibits its interaction with p53, leading to transcription of p53-dependent genes. Fusing the miniature proteins YY2 and 3.3 allows scaffolded phosphorylation of hDM2. Ternary complex formation was demonstrated by a bell-shaped dependence of activity on the scaffold concentration.80 Like the three previously described studies, this research shows that phosphorylation is an essential part of signal transduction and that the ability to control it via synthetic scaffolds opens up a multitude of opportunities for synthetic signalling.

Phosphorylation is an essential part of signal transduction as it induces or inhibits protein–protein interactions. Gaining control over


this post-translational modification using synthetic protein scaffolds opens up new possibilities to alter the function and activity of phosphorylation-dependent proteins, leading to new pathways.


6 Synthetic scaffolds in metabolic engineering

The use of synthetic scaffolds has also been extended to metabolic pathways. Metabolic engineering is used as an alternative to chemical synthesis; however, low yields, unstable metabolic intermediates and undesirable side reactions are some of the challenges in this field.59 Scaffolding of metabolic pathways is an alternative strategy, orthogonal to the conventional approaches of transcriptional control of enzyme expression and optimization of enzymes through directed evolution. Scaffolds can enhance substrate channelling, which prevents the loss of intermediates through diffusion or competing pathways and decreases transfer times, which is beneficial for unstable intermediates. However, as explained in Section 4.2, substrate channelling has a minimal effect on production rate, whereas additional effects such as kinetic matching, substrate-scaffold interactions and spatial effects can make a larger contribution.81,82 Next, we discuss two examples of synthetic scaffolds that increase production rates in metabolic engineering.

Conrado et al. reviewed naturally occurring metabolic channels and engineered multifunctional enzyme systems.59 The fusion proteins and post-translational assemblies discussed there still have drawbacks; the authors therefore envision modular domains that sequentially recruit enzymes to form a pathway with high productivity.59 Following this recommendation, Dueber and co-workers built a synthetic protein scaffold to increase the effective concentration of the components required in a metabolic pathway.77 Because sequestration and covalent tethering of intermediates are challenging to implement, they focused on increasing the effective concentrations of intermediates through the programmable formation of enzyme complexes.
As a model system they used a pathway that produces mevalonate via three enzymes, namely acetoacetyl-CoA thiolase (AtoB), hydroxymethylglutaryl-CoA synthase (HMGS) and hydroxymethylglutaryl-CoA reductase (HMGR). As a first experiment, HMGR was fused to an SH3 domain while HMGS was fused to a varying number of SH3 ligands. Complex formation was observed for this engineered system, and the relative stoichiometry of the two enzymes could be controlled by varying the number of HMGS-bound ligands. More complex scaffolds were built using the GBD, PDZ and SH3 domains to recruit AtoB, HMGS and HMGR, respectively, via fused short peptide ligands. The domains were connected via flexible glycine-serine linkers and the number of domain repeats was varied to control the ratio of the individual enzymes. The improvement in production compared to an unscaffolded pathway depended on the number of domain repeats and their orientation. A biphasic effect was observed, in which scaffolds with a high number of domain repeats led to optimal production at low scaffold


expression levels, while a low number of repeats in the scaffold led to optimal production at high scaffold expression levels.77 The optimal scaffold led to a 77-fold increase in mevalonate production (Fig. 10C). To prove the programmability and generality of the system, the scaffold was also applied to the synthetic pathway for glucaric acid, which required three enzymes from three different source organisms. For this pathway, too, production was improved compared to a non-scaffolded control, in this case leading to 200% higher product formation.77 This research shows that clustering enzymes using domain repeats can lead to significant increases in production rate. Strikingly, a defined stoichiometry is important in this case, which contradicts the theoretical models discussed earlier (Section 4.2).64,65

You and Zhang made a synthetic metabolon in a fashion similar to that described above.83 The authors also used a three-enzyme complex as a model, incorporating the enzymes triosephosphate isomerase (TIM), aldolase (ALD) and fructose-1,6-bisphosphatase (FBP) from the glycolysis and gluconeogenesis pathways. They used three different kinds of cohesin linked together to act as the scaffold, and the recruited metabolic enzymes were coupled to the corresponding types of dockerin. A cellulose-binding module was fused to the scaffold, allowing its immobilization on cellulosic material. Complex formation was demonstrated and it was suggested that the number of cohesins and their orientation could be used for optimization. In addition, a non-immobilized scaffold was developed, which was tested for substrate channelling together with the immobilized form. In comparison to the non-scaffolded control, production rates were enhanced 38 and 48 times for the non-immobilized and immobilized scaffold, respectively.
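The improvements quoted above are reported in mixed units (fold increases and a percentage increase), which makes them easy to misread. The short sketch below places them on a common fold-change scale, reading "200% higher" literally as three times the control value. The helper names and dictionary labels are hypothetical, and the values refer to different pathways and assays, so they indicate scale only and are not a like-for-like ranking.

```python
def percent_higher_to_fold(percent_higher):
    """Convert 'X% higher than control' into a fold change relative to control."""
    return 1.0 + percent_higher / 100.0


def fold_to_percent_higher(fold):
    """Convert a fold change relative to control into 'X% higher than control'."""
    return (fold - 1.0) * 100.0


# Improvements quoted in the text, expressed as fold change over the
# unscaffolded control. The keys are illustrative labels only.
reported_fold = {
    "mevalonate, optimal scaffold": 77.0,
    "glucaric acid": percent_higher_to_fold(200.0),
    "metabolon, non-immobilized": 38.0,
    "metabolon, immobilized": 48.0,
}

if __name__ == "__main__":
    for label, fold in reported_fold.items():
        print(label, fold, fold_to_percent_higher(fold))
```

On this scale the glucaric acid result corresponds to a 3-fold improvement, considerably more modest than the 77-fold increase reported for mevalonate.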
The higher reaction rate of the immobilized form compared to the non-immobilized form is speculated to be attributable to shorter enzyme-enzyme distances. Overall, the authors attribute the higher production rate to substrate channelling; however, non-specific interactions between metabolite and scaffold, or enzyme clustering, might also contribute to the observed higher reaction rate (Section 4.2).

Applications of synthetic scaffolds seem promising for metabolic engineering, as the use of modular domains allows straightforward engineering of scaffolds that vary in stoichiometry and configuration, which in turn allows rapid optimization and easy comparison of production efficiencies. Moreover, the discussed principles and designs are generalizable to other pathways. However, further research is required to unravel the underlying mechanisms and to optimize the production rates of the modular systems described above to meet the demand for products in therapeutics and biofuels.

7 Conclusions

The examples of synthetically engineered scaffolds show the enormous potential of synthetic biology in the fields of signal transduction and metabolic engineering, both in terms of fundamental understanding and in terms of applications. Firstly, they show that the modularity of the


scaffolds is crucial for functioning and optimization. Synthetic biology approaches provide easy recombination of functional parts, which facilitates insight into important aspects such as domain ratios and orientations for optimal pathway output. Also, the modularity of this concept enables the scaffolds to be used in a multitude of signalling pathways, based on logical design parameters. The theoretical models developed on the basis of the natural systems can now be used as a predictive framework to dictate the molecular characteristics of the synthetic scaffolds and their interaction partners. Finally, scaffolds that can connect otherwise unrelated or non-optimal input and output responses are crucial for research and applications in metabolic engineering and catalysis, and for progress towards synthetic cells applicable in therapeutics, diagnostics and biotechnology.

References

1 A. Zeke, M. Lukács, W. A. Lim and A. Reményi, Trends Cell Biol., 2009, 19, 364–374.
2 M. C. Good, J. G. Zalatan and W. A. Lim, Science, 2011, 332, 680–686.
3 W. R. Burack and A. S. Shaw, Curr. Opin. Cell Biol., 2000, 12, 211–216.
4 R. P. Bhattacharyya, A. Reményi, B. J. Yeh and W. A. Lim, Annu. Rev. Biochem., 2006, 75, 655–680.
5 H. Wu, Cell, 2013, 153, 287–292.
6 C. Q. Pan, M. Sudol, M. Sheetz and B. C. Low, Cell. Signal., 2012, 24, 2143–2165.
7 J. D. Scott and T. Pawson, Science, 2009, 326, 1220–1224.
8 B. J. Mayer, M. Hamaguchi and H. Hanafusa, Nature, 1988, 332, 272–275.
9 R. B. Birge, C. Kalodimos, F. Inagaki and S. Tanaka, Cell Commun. Signal., 2009, 7, 13.
10 T. Pawson and J. D. Scott, Science, 1997, 278, 2075–2080.
11 L. Buday, L. Wunderlich and P. Tamás, Cell. Signal., 2002, 14, 723–731.
12 J. Chen, I. L. Leskov, A. Yurdagul Jr., B. Thiel, C. G. Kevil, K. Y. Stokes and A. W. Orr, Science Signal., 2015, 8, 1–10.
13 S. Liu, X. Cai, J. Wu, Q. Cong, X. Chen, T. Li, F. Du, J. Ren, Y. Wu, N. Grishin and Z. J. Chen, Science, 2015, 347, aaa26301-13.
14 A. Aitken, Semin. Cancer Biol., 2006, 16, 162–172.
15 X. Yang, W. H. Lee, F. Sobott, E. Papagrigoriou, C. V Robinson, J. G. Grossmann, M. Sundström, D. A. Doyle and J. M. Elkins, Proc. Natl. Acad. Sci. U. S. A., 2006, 103, 17237–17242.
16 C. Johnson, S. Crowther, M. J. Stafford, D. G. Campbell, R. Toth and C. MacKintosh, Biochem. J., 2010, 427, 69–78.
17 M. B. Yaffe, K. Rittinger, S. Volinia, P. R. Caron, A. Aitken, H. Leffers, S. J. Gamblin, S. J. Smerdon and L. C. Cantley, Cell, 1997, 91, 961–971.
18 B. Coblitz, M. Wu, S. Shikano and M. Li, FEBS Lett., 2006, 580, 1531–1535.
19 D. T. Obsil, D. K. Hentges, T. Obsil and V. Obsilova, Semin. Cell Dev. Biol., 2011, 22, 663–672.
20 T. Obsil, R. Ghirlando, D. C. Klein, S. Ganguly and F. Dyda, Cell, 2001, 105, 257–267.
21 C. Ottmann, S. Marco, N. Jaspert, C. Marcon, N. Schauer, M. Weyand, C. Vandermeeren, G. Duby, M. Boutry, A. Wittinghofer, J.-L. Rigaud and C. Oecking, Mol. Cell, 2007, 25, 427–440.
22 L. K. Nutt, M. R. Buchakjian, E. Gan, R. Darbandi, S. Y. Yoon, J. Q. Wu, Y. J. Miyamoto, J. A. Gibbon, J. L. Andersen, C. D. Freel, W. Tang, C. He, M. Kurokawa, Y. Wang, S. S. Margolis, R. A. Fissore and S. Kornbluth, Dev. Cell, 2009, 16, 856–866.
23 M. R. Buchakjian and S. Kornbluth, Nat. Rev. Mol. Cell Biol., 2010, 11, 715–727.
24 M. Lettau, J. Pieper and O. Janssen, Cell Commun. Signal., 2009, 7, 1.
25 P. L. Schwartzberg, L. D. Finkelstein and J. A. Readinger, Nat. Rev. Immunol., 2005, 5, 284–295.
26 D. Matza, A. Badou, M. K. Jha, T. Willinger, A. Antov, S. Sanjabi, K. S. Kobayashi, V. T. Marchesi and R. A. Flavell, Proc. Natl. Acad. Sci. U. S. A., 2009, 106, 9785–9790.
27 A. S. Shaw and E. L. Filbert, Nat. Rev. Immunol., 2009, 9, 47–56.
28 M. Good, G. Tang, J. Singleton, A. Reményi and W. A. Lim, Cell, 2009, 136, 1085–1097.
29 J. G. Zalatan, S. M. Coyle, S. Rajan, S. S. Sidhu and W. A. Lim, Science, 2012, 337, 1218–1222.
30 F. Witzel, L. Maddison and N. Blüthgen, Front. Physiol., 2012, 3, 1–14.
31 W. Kolch, Nat. Rev. Mol. Cell Biol., 2005, 6, 827–837.
32 H. H. Park, E. Logette, S. Raunser, S. Cuenin, T. Walz, J. Tschopp and H. Wu, Cell, 2007, 128, 533–546.
33 R. Ferrao and H. Wu, Curr. Opin. Struct. Biol., 2012, 22, 241–247.
34 L. Wang, J. K. Yang, V. Kabaleeswaran, A. J. Rice, A. C. Cruz, A. Y. Park, Q. Yin, E. Damko, S. B. Jang, S. Raunser, C. V Robinson, R. M. Siegel, T. Walz and H. Wu, Nat. Struct. Mol. Biol., 2010, 17, 1324–1329.
35 S.-C. Lin, Y.-C. Lo and H. Wu, Nature, 2010, 465, 885–890.
36 J. Li, T. McQuade, A. B. Siemer, J. Napetschnig, K. Moriwaki, Y. S. Hsiao, E. Damko, D. Moquin, T. Walz, A. McDermott, F. K. M. Chan and H. Wu, Cell, 2012, 150, 339–350.
37 T. Tenev, K. Bianchi, M. Darding, M. Broemer, C. Langlais, F. Wallberg, A. Zachariou, J. Lopez, M. MacFarlane, K. Cain and P. Meier, Mol. Cell, 2011, 43, 432–448.
38 Q. Yin, S.-C. Lin, B. Lamothe, M. Lu, Y.-C. Lo, G. Hura, L. Zheng, R. L. Rich, A. D. Campos, D. G. Myszka, M. J. Lenardo, B. G. Darnay and H. Wu, Nat. Struct. Mol. 
Biol., 2009, 16, 658–666. W. A. Lim, C. M. Lee and C. Tang, Mol. Cell, 2013, 49, 202–212. ¨thgen, Front. Physiol., 2012, 3, 1–14. F. Witzel, L. Maddison and N. Blu S. A. Chapman and A. R. Asthagiri, Mol. Syst. Biol., 2009, 5, 313. J. Yang and W. S. Hlavacek, Math. Biosci., 2011, 232, 164–173. A. Levchenko, J. Bruck and P. W. Sternberg, Proc. Natl. Acad. Sci. U. S. A., 2000, 97, 5818–5823. D. Bray and S. Lay, Proc. Natl. Acad. Sci. U. S. A., 1997, 94, 13493–13498. J. E. Ferrell, Sci. STKE, 2000, 2000, pe1. R. Heinrich, B. G. Neel and T. A. Rapoport, Mol. Cell, 2002, 9, 957–970. E. F. Douglass, C. J. Miller, G. Sparer, H. Shapiro and D. A. Spiegel, J. Am. Chem. Soc., 2013, 135, 6092–6099. T. M. Thomson, K. R. Benjamin, A. Bush, T. Love, D. Pincus, O. Resnekov, R. C. Yu, A. Gordon, A. Colman-Lerner, D. Endy and R. Brent, Proc. Natl. Acad. Sci., 2011, 108, 20265–20270. J. W. Locasale, A. S. Shaw and A. K. Chakraborty, Proc. Natl. Acad. Sci. U. S. A., 2007, 104, 13307–13312. L. Bardwell, X. Zou, Q. Nie and N. L. Komarova, Biophys. J., 2007, 92, 3425–3441. Synthetic Biology, 2018, 2, 65–96 | 95

View Online

51 52 53 54

Published on 23 November 2017 on http://pubs.rsc.org | doi:10.1039/9781782622789-00065

55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77

78 79 80 81 82 83

N. L. Komarova, X. Zou, Q. Nie and L. Bardwell, Mol. Syst. Biol., 2005, 1, 2005.0023. J. E. Ferrell and S. H. Ha, Trends Biochem. Sci., 2014, 39, 612–618. J. E. Dueber, E. A. Mirsky and W. A. Lim, Nat. Biotechnol., 2007, 25, 660–662. C. Y. Huang and J. E. Ferrell, Proc. Natl. Acad. Sci. U. S. A., 1996, 93, 10078–10083. N. Hao, S. Nayak, M. Behar, R. H. Shanks, M. J. Nagiec, B. Errede, J. Hasty, T. C. Elston and H. G. Dohlman, Mol. Cell, 2008, 30, 649–656. S. Takahashi and P. M. Pryciak, Curr. Biol., 2008, 18, 1184–1191. M. K. Malleshaiah, V. Shahrezaei, P. S. Swain and S. W. Michnick, Nature, 2010, 465, 101–105. C. J. Thalhauser and N. L. Komarova, PLoS One, 2010, 5, 1–14. R. J. Conrado, J. D. Varner and M. P. DeLisa, Curr. Opin. Biotechnol., 2008, 19, 492–499. O. Idan and H. Hess, Curr. Opin. Biotechnol., 2013, 24, 606–611. C. M. Agapakis, P. M. Boyle and P. A. Silver, Nat. Chem. Biol., 2012, 8, 527–535. P. Bauler, G. Huber, T. Leyh and J. A. McCammon, J. Phys. Chem. Lett., 2010, 1, 1332–1335. C. C. Roberts and C. A. Chang, J. Chem. Theory Comput., 2015, 11, 286–292. O. Idan and H. Hess, ACS Nano, 2013, 7, 8658–8665. M. Castellana, M. Z. Wilson, Y. Xu, P. Joshi, I. M. Cristea, J. D. Rabinowitz, Z. Gitai and N. S. Wingreen, Nat. Biotechnol., 2014, 32, 1011–1018. W. A. Lim, Nat. Rev. Mol. Cell Biol., 2010, 11, 393–403. P. L. Howard, M. C. Chia, S. Del Rizzo, F.-F. Liu and T. Pawson, Proc. Natl. Acad. Sci. U. S. A., 2003, 100, 11267–11272. J. E. Dueber, B. J. Yeh, K. Chak and W. A. Lim, Science, 2003, 301, 1904–1908. S.-H. Park, A. Zarrinpar and W. A. Lim, Science, 2003, 299, 1061–1064. C. J. Bashor, N. C. Helman, S. Yan and W. A. Lim, Science, 2008, 319, 1539–1543. S. G. Peisajovich, J. E. Garbarino, P. Wei and W. A. Lim, Science, 2010, 328, 368–372. P. Wei, W. W. Wong, J. S. Park, E. E. Corcoran, S. G. Peisajovich, J. J. Onuffer, A. Weiss and W. A. Lim, Nature, 2012, 488, 384–388. K. E. Galloway, E. Franco and C. D. Smolke, Science, 2013, 341, 1235005. A. Lai, P. 
M. Sato and S. G. Peisajovich, ACS Synth. Biol., 2015, 4, 714–722. J. Ryu and S. Park, Sci. Signal, 2015, 8, 1–11. W. R. Whitaker, S. A. Davis, A. P. Arkin and J. E. Dueber, Proc. Natl. Acad. Sci., 2012, 109, 18090–18095. J. E. Dueber, G. C. Wu, G. R. Malmirchegini, T. S. Moon, C. J. Petzold, A. V. Ullal, K. L. J. Prather and J. D. Keasling, Nat. Biotechnol., 2009, 27, 753–759. B. J. Yeh, R. J. Rutigliano, A. Deb, D. Bar-Sagi and W. A. Lim, Nature, 2007, 447, 596–600. M. Skwarczynska, M. Molzan and C. Ottmann, Proc. Natl. Acad. Sci. U. S. A., 2013, 110, E377–86. E. M. Hobert and A. Schepartz, J. Am. Chem. Soc., 2012, 134, 3976–3978. Y. Zhang, J. Ge and Z. Liu, ACS Catal., 2015, 5, 4503–4513. J. L. Lin and I. Wheeldon, ACS Catal., 2013, 3, 560–564. C. You and Y. H. P. Zhang, ACS Synth. Biol., 2013, 2, 102–110.

96 | Synthetic Biology, 2018, 2, 65–96

Design of synthetic symmetrical proteins

J. Vrancken,a,† S. Wouters,a,† B. Mylemans,a,† H. Noguchi,a J. R. H. Tameb and A. R. D. Voet*a

a KU Leuven, Laboratory for Biomolecular Modelling and Design, Department of Chemistry, Celestijnenlaan 200G, Leuven 3000, Belgium. E-mail: [email protected]
b Drug Design Laboratory, Yokohama City University, Suehiro 1-7-29, Tsurumi-ku, Yokohama 230-0045, Japan
† These authors contributed equally.

Published on 23 November 2017 on http://pubs.rsc.org | doi:10.1039/9781782622789-00097
Synthetic Biology, 2018, 2, 97–114 | © The Royal Society of Chemistry 2018

In recent years, there has been growing interest in the creation of synthetic proteins with perfect internal symmetry. Such structures can be achieved by analysing the consensus sequences of natural proteins or de novo by computational methods. Proteins with translational (polar) symmetry have found applications as artificial antibodies, while those with point (cyclic) symmetry have proved stimulating models of molecular evolution. Proteins with internal repeats can also serve as building blocks for the creation of larger symmetrical assemblies, such as cages, that may depend on and mutually organise other molecules. This has been demonstrated by the biomineralisation of the smallest nanocrystal in a designer protein complex.

1

Introduction

Throughout history, symmetry has played an important role in art, culture, and science alike. From the design of ancient Greek temples to the ornaments in Islamic mosques, symmetry has always been associated with order and beauty. Symmetry has also frequently played an important role in different scientific disciplines, ranging from the study of regular bodies and the development of the periodic table to modern theories of subatomic particles. Plato saw in symmetry the connection between beauty, truth and moral good. In Greek philosophy the world itself is symmetry, and the demands of symmetry were held to be the reason for all kinds of natural phenomena: 'it is so, because of symmetry'. Kepler's theory of planetary proportions determined by the five Platonic solids is a historical example of such thinking.1 Symmetry is no longer seen as an ultimate cause of natural effects in itself, but as a manifestation of other underlying phenomena, such as quantum or thermodynamic effects. Nevertheless, it remains a useful principle or guideline to follow in structural designs on any scale, and proteins are no exception.

1.1 Symmetry of the quaternary structure of proteins

The double-helical structure of DNA is a good example of symmetry at the biochemical level of Nature, and it came as a great surprise that the first protein structures showed a much lower degree of internal symmetry. Protein monomers can, however, assemble into multimeric structures that exhibit different kinds of symmetry to form rings, helices and closed capsids (Fig. 1). It is estimated that between 30% and 50% of all human proteins form homomers, with dimers being most common.2 A recent study by Swapna et al. indicated that almost all homomers form symmetrical complexes: only 11 asymmetric protein complexes were identified in a set of 213 homodimers.3 Heteromers are also abundant and generally symmetrical. Well-known examples of heteromeric complexes include haemoglobin, tubulin and viral capsids.

Several explanations have been suggested for the frequently observed symmetrical nature of protein oligomers.4–6 Symmetrical assemblies may be required for correct functionality, such as allosteric regulation. The allosteric model proposed by Monod, Wyman and Changeux attributes cooperativity to a symmetrical conformational change in the subunits induced by the binding of the substrate, increasing the binding affinity.7 Furthermore, formation of symmetric complexes may enhance the folding and stability of proteins by preventing unwanted aggregation. Proteins from thermophilic bacteria, for example, are sometimes found to oligomerise as a means of increasing their resistance to denaturation. In viruses, symmetry and pseudo-symmetry play an integral role in the formation of capsids, which are mainly icosahedral.8 Icosahedral symmetry allows a capsid built from a minimal number of protein types to enclose a maximal volume, which is important for viruses with a small genome.9

Fig. 1 The quaternary structures of protein oligomers show various symmetries, such as C4 (A: aquaporin, PDB: 3ZOJ) and D3 (B: ribulose-phosphate 3-epimerase, PDB: 5UMF). The haemoglobin hetero-tetramer may appear to be a D2 assembly, but since alpha- and beta-haemoglobin are not identical proteins, the overall quaternary structure exhibits C2 symmetry (C: human haemoglobin, PDB: 5ME2). Larger symmetric assemblies include ring-shaped proteins (D: TRAP, PDB: 3AQD), helical tube-forming assemblies (E: microtubule, PDB: 5SYF) and capsids (F: densovirus capsid, PDB: 3N7X). Proteins are depicted as cartoons. The microtubular assembly is represented as a molecular surface.

1.2 Symmetry of the tertiary structure of proteins

Despite the shock relating to the asymmetry of the early globin crystal structures, symmetry is also found in the tertiary structure of some natural monomers. It was first observed by the group of Blundell in the model of an acid protease.10 The protein is composed of two
near-identical units, joined into a single amino acid chain and related by an internal pseudo-2-fold axis. The internal structural symmetry of the protein backbone may be much better conserved than the sequence identity of the repeated domains, and so specialist tools and computational methods have been developed to analyse such proteins.5 Two of the more recent tools for this purpose are SymD and CE-Symm.11,12 While these programs use different computational approaches, their analyses of symmetry within protein monomers reach similar conclusions. SymD identified internal symmetry in 10–15% of the protein domains in the SCOP-based ASTRAL domain database, while CE-Symm picked out 18% of the structures in the SCOP database. Interestingly, just as in homo-multimers, 2-fold symmetry is the most prevalent. These pseudo-symmetrical protein structures are not limited to only a few protein families. Membrane proteins frequently display internal symmetry, most notably the β-barrels. Other folds that often occur with symmetry are the α/β-barrels, β-propellers, β-trefoils and the ferredoxin-like fold, which is the only one of these with 2-fold symmetry.

1.3 Evolutionary origin of internal symmetry in proteins

The presence of symmetry at the tertiary structure level can be explained by the interplay of evolutionary processes between genes and proteins. It is commonly accepted that evolution is driven in part by the duplication of genetic information, creating tandem copies within genomes. The evolution of proteins is considered to be modular: proteins are created by combining different smaller fragments through fusion of genetic information. The origin of natural pseudo-symmetrical monomeric proteins is then most easily explained by a combination of gene duplication and fusion events.13 According to this idea, extant examples of such proteins arose from a smaller ancestral gene fragment, which encoded a polypeptide that assembled into a homomeric multimer. Following duplication and fusion of the gene, the encoded polypeptide contains multiple domains, each representing an individual subunit of the ancestral homo-oligomer. Over time, however, the identical repeats were subject to evolutionary pressure and genetic drift, causing deletions, insertions and/or mutations that resulted in a pseudo-symmetric protein.14 This process can give rise to proteins with repeated architectural motifs showing either point symmetry (to make rings) or translational symmetry (to make rods and helices) (Fig. 2).

Fig. 2 The evolutionary theory of pseudo-symmetric proteins proposes that a single protein motif "A" may have evolved to self-assemble into a symmetric multimer of "B". Duplication of the genetic information followed by fusion of the genes may have resulted in a symmetric monomeric protein "BB", which later diversified under evolutionary pressure into "CD".
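As a toy illustration of the duplication, fusion and divergence route summarised in Fig. 2, the Python sketch below fuses two copies of a motif and then mutates the fused chain at random. The motif sequence, the number of mutations and the helper names (`fuse_duplicate`, `diverge`, `repeat_identity`) are assumptions made purely for illustration; real divergence also involves insertions and deletions, which this sketch ignores.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def fuse_duplicate(motif: str, copies: int = 2) -> str:
    """Tandem duplication followed by in-frame fusion: 'B' -> 'BB'."""
    return motif * copies

def diverge(sequence: str, n_mutations: int, rng: random.Random) -> str:
    """Apply point substitutions at distinct random positions ('BB' -> 'CD')."""
    residues = list(sequence)
    for pos in rng.sample(range(len(residues)), n_mutations):
        # Always substitute with a residue different from the current one.
        choices = [aa for aa in AMINO_ACIDS if aa != residues[pos]]
        residues[pos] = rng.choice(choices)
    return "".join(residues)

def repeat_identity(sequence: str, copies: int = 2) -> float:
    """Fraction of positions at which all repeats carry the same residue."""
    length = len(sequence) // copies
    repeats = [sequence[i * length:(i + 1) * length] for i in range(copies)]
    same = sum(1 for column in zip(*repeats) if len(set(column)) == 1)
    return same / length

rng = random.Random(0)
motif = "ADLKEVGRTWPSNQ"          # arbitrary, hypothetical motif
fused = fuse_duplicate(motif)     # perfectly symmetric at the sequence level
diverged = diverge(fused, n_mutations=5, rng=rng)

print(repeat_identity(fused))     # 1.0
print(repeat_identity(diverged))  # below 1.0: pseudo-symmetric sequence
```

Sequence identity between the repeats falls quickly under such a process, which is consistent with the observation in Section 1.2 that backbone symmetry tends to be better preserved than sequence symmetry.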


Monomeric proteins showing either type of symmetry can be designed, as outlined in the following sections. The scope of this review is, however, limited to monomeric proteins, and recent successes in the design of multimeric symmetric assemblies are not discussed.


2

Symmetrical repeat proteins

The hypothesis of protein evolution by gene duplication and fusion readily explains how elongated symmetrical proteins could arise in which the length is controlled by the number of tandem repeats. Well-known examples of such proteins include the armadillo repeat, ankyrin repeat and leucine-rich repeat (LRR) proteins. These architectures are often involved in protein–protein interactions, and they therefore include key regulatory factors for many cellular processes. While the structure of each domain remains essentially the same, specificity for a certain binding partner is achieved by the mutation of only a few residues at the binding interface. This makes these proteins attractive scaffolds for multiple reasons. The surface area can easily be varied by changing the number of tandem repeats. Since they are very effective at selective protein binding they could be used as artificial antibodies, especially as the specificity can be altered by mutating only a few residues. The sequences of these proteins can be divided into an epitope that is directly responsible for binding the target protein, and a structural framework that is responsible for stabilising the fold and presenting the epitope. The framework is the most conserved part during evolution and best reflects the consensus sequence. The N- and C-terminal regions are often less well conserved, and serve as symmetry-breaking caps that avoid solvent exposure of the hydrophobic core and induce and stabilize the folding of the protein.15

Consensus design has been the most widely successful method for generating novel proteins built from repeated motifs. In the case of elongated repeat proteins, the terminal repeats are not included in building a consensus sequence as they have special features related to their function. In the case of a protein whose domains are arranged in a ring, each domain is structurally equivalent and may be used. A sequence alignment of all the internal repeats from the protein family of interest is performed. The conserved residues are then examined in the context of a 3D model, and may be manually altered to remove cysteine residues or to avoid potential steric hindrance. Features of the original cap structures may be utilized for the stabilization of the new consensus repeat, or novel cap structures can be derived. Typically, when new cap structures are introduced they are based on a comparison of related cap structures, or are manually derived from the consensus sequence while applying mutations to increase the stability.16 The residues responsible for the protein–protein interactions will not be conserved and are not considered at this stage. They can be introduced later to create proteins with a desired binding site.

An alternative method that has also proven successful in the design of symmetrical proteins is to rely on computational protein design. A computational search strategy is employed to identify the optimal amino acid sequence that may fold into a given protein backbone. First the idealized backbone conformation of the repeat protein needs to be designed, either de novo or with consideration of some natural protein template. Assignment of side-chains to the bare main-chain may also be carried out de novo without any prior information, but if the backbone is derived from a naturally occurring repeat protein, then its sequence and any homologues can also guide this step. The quality of the model derived by decorating the backbone with the chosen sequence is evaluated using a scoring function or force field. The Rosetta protein modelling package is the pre-eminent software used for such scoring, and has resulted in many successfully designed proteins.17,18 In 2015, Parmeggiani and colleagues demonstrated the design of several repeat proteins by computation using Rosetta, and examples of their work are discussed below.19 Success in creating new proteins with architectural repeats has not been uniform, however; some classes of motif have been targeted more frequently than others, and these are discussed below (Fig. 3).

Fig. 3 Symmetrical versions of natural repeat proteins have been designed (top panel) in which the central motif consists of identical tandem repeats (bottom panel). Depicted here are the synthetic ANK (PDB: 1SVX), ARM (PDB: 5AEI), TPR (PDB: 5FZQ), HEAT (PDB: 3LTM) and LRR (PDB: 4PSJ) proteins.
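The consensus step described in this section can be sketched in a few lines of Python. This is a minimal illustration that assumes the internal repeats have already been aligned to equal length; the toy repeat sequences, the cap sequences and the function names are invented for the example and are not taken from any published design.

```python
from collections import Counter

def consensus(aligned_repeats: list[str]) -> str:
    """Most frequent residue in each alignment column (gaps '-' are ignored)."""
    if len({len(r) for r in aligned_repeats}) != 1:
        raise ValueError("repeats must be aligned to the same length")
    result = []
    for column in zip(*aligned_repeats):
        counts = Counter(res for res in column if res != "-")
        # most_common(1) resolves ties by first occurrence in the column.
        result.append(counts.most_common(1)[0][0] if counts else "-")
    return "".join(result)

def build_repeat_protein(n_cap: str, repeat: str, n_repeats: int, c_cap: str) -> str:
    """N-terminal cap + identical internal repeats + C-terminal cap."""
    return n_cap + repeat * n_repeats + c_cap

# Toy alignment of internal repeats; terminal repeats are deliberately left out.
internal_repeats = [
    "DLGKTPLH",
    "DNGRTALH",
    "DLGKTPLH",
    "ELGKSPLH",
]
unit = consensus(internal_repeats)
print(unit)  # DLGKTPLH

# Hypothetical two-residue caps, used only to show how a construct is assembled.
designed = build_repeat_protein("MS", unit, 3, "GW")
print(designed, len(designed))
```

In practice the consensus is only a starting point: as noted above, the conserved residues are inspected in a 3D model and may be altered by hand, and binding residues are introduced afterwards.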

2.1 Ankyrin repeat proteins

One of the most common types of protein built from repeating motifs is the ankyrin (ANK) repeat protein family, found in all three superkingdoms. One ANK repeat typically consists of 33 amino acid residues that fold into two antiparallel α-helices connected by loops. The number of repeats varies, but proteins with up to 34 repeats in one domain have been reported. Ankyrin repeats have served as a starting point to create a novel class of protein-binding designer proteins named Designed Ankyrin Repeat Proteins (DARPins). DARPins were developed from a combinatorial library of ankyrin repeat consensus sequences. Following their initial design, they have been developed into antibody-like proteins. For a thorough overview of the DARPins, we refer the reader to a recent review.20 While the initial designs did not contain exact repeats, consensus ANK proteins have also been designed to study protein folding. Several ANK proteins made from exactly repeated motifs were recently designed to atomic-level accuracy by Parmeggiani and coworkers using the computational method described above.19

2.2 HEAT repeat proteins

Another common protein family built from repeated motifs is that of the HEAT proteins. The name is derived from the four proteins in which this repeat motif was first discovered, namely Huntingtin protein, Elongation factor 3, protein phosphatase 2A, and TOR1. One repeat of this motif consists of about 40 residues and folds into a hairpin of two anti-parallel α-helices. The succession of multiple HEAT repeats gives rise to an overall solenoid shape. Natural HEAT proteins regulate cytoplasmic and nuclear transport, among other pathways.21,22 In 2010, Urvoas et al. created a perfectly symmetrical HEAT protein based on the consensus sequence of the HEAT repeats found in a thermophilic microorganism. These designer proteins are called αReps, and they are currently under investigation as a potential scaffold for artificial antibodies.23,24 Several HEAT repeat proteins were also successfully created by computational methods, and their structures experimentally verified.19

2.3 Armadillo repeat proteins

Armadillo repeat (ARM) proteins are commonly found in eukaryotes and are named after the armadillo gene from Drosophila melanogaster. ARM proteins typically consist of 4 to 12 tandem repeats, and each repeat contains around 42 residues. The ARM repeat motif consists of three compact α-helices that are stacked on top of each other, and concatenation of the repeats generates a right-handed superhelix. ARM proteins bind their partner by wrapping around the peptide, with each repeating unit able to bind two consecutive residues of the bound peptide. In 2008, the group of Plückthun developed the first symmetric ARM proteins utilizing a consensus design method, followed by computational steps to optimize the internal sequence.16 The goal was also to develop novel protein scaffolds that can be rationally redesigned to bind to a given amino acid sequence. Next-generation proteins have been developed to improve the stability and folding.25 Following several cycles of rational protein engineering to improve the stability and affinity, they recently reported a designed symmetric ARM protein with picomolar affinity for a peptide consisting of five lysine-arginine repeats.26 The computational strategy described above also yielded multiple symmetrical designer ARM proteins.19

2.4 Tetratricopeptide repeat proteins

The tetratricopeptide repeat (TPR) motif contains 34 residues and can be repeated up to 16 times. Each motif folds into two anti-parallel α-helices connected by a small loop, and the mutual packing of the neighbouring repeats results in a helical structure. TPR proteins are most commonly found to contain only three repeats, and like other proteins with internal symmetry they are also found to regulate pathways by means of protein–protein interactions.27 They have received significant attention from the group of Regan, which has engineered a protein interaction scaffold from consensus sequences.28,29 These artificial proteins then served as a platform to engineer a derivative protein that binds specifically to Hsp70. Other symmetrical TPR designs were made to form gels by binding to peptide linkers, demonstrating the use of symmetric repeat proteins as building blocks for larger networks.30,31 Parmeggiani et al. also succeeded in the de novo creation of symmetrical TPR proteins.16

2.5 Leucine rich repeat proteins

Unlike the α-helical repeat proteins discussed above, the leucine rich repeat (LRR) proteins consist of several tandem repeats of a strand-turn-helix motif, which fold into horseshoe-shaped proteins with a parallel β-sheet formed at the concave face. The protein family is named after the high incidence of leucine and isoleucine residues, which form a continuous hydrophobic core and hold the secondary structure elements together tightly. The α-helix, however, is less conserved than other parts of the protein, and the curvature of the horseshoe is also highly variable. The porcine ribonuclease inhibitor is one example of a very highly symmetrical LRR protein, and it has therefore served as a template for protein engineering experiments to create novel protein scaffolds using a strategy similar to that used with the DARPins.32 The group of André used this template to create novel LRRs with predefined geometry utilizing Rosetta-based computational protein design. Interestingly, using electron microscopy they observed that two of their designed proteins self-assemble into a ring-shaped structure. Several LRRs were also designed by Parmeggiani et al.19 Following this work, the computational design of LRR-based proteins with controlled curvature was also reported by Park et al.
by combining backbones derived from LRR proteins with different curvature.33

2.6 Other repeat proteins

There are many more naturally occurring repeat proteins for which, to the best of our knowledge, no artificial structures have yet been designed, but which may serve as interesting scaffolds (Fig. 4). The Transcription Activator-Like Effector (TALE) repeat proteins consist of a motif of about 34 residues that can be repeated up to 34 times. TALE repeats are able to recognise specific bases within double-stranded DNA or an RNA-DNA hybrid. Each repeat consists of one short and one long α-helix, forming an α-helical hairpin, and each repeat recognises one DNA base. It is possible to redesign the repeating units to recognise a specific DNA sequence, as demonstrated by Mahfouz et al.34 Very recently these proteins were utilized to create protein-DNA hybrid nanostructures, and further research based on these proteins may create even more elaborate mixed complexes.35

Pentapeptide repeat proteins, on the other hand, which occur in both prokaryotes and eukaryotes, were discovered only recently and not much is known about their function. It has been observed that some of these proteins mimic the shape and polarity of DNA and competitively inhibit DNA-modifying enzymes such as DNA gyrase.36 The repeats fold into a right-handed β-helix with a square cross-section. The repeating motif appears to be formed of 5 residues.37

Fig. 4 The TAL effector (left, PDB: 4OSH) and pentapeptide repeat (right, PDB: 3N90) proteins show the potential to be engineered into symmetrical repeat proteins, but as yet no such proteins have been successfully engineered. The overall structure is depicted in the upper panel, with a single repeat below. One full turn of the pentapeptide helix consists of 4 repeats of 5 amino acids.

2.7 Ice binding repeat proteins

Most of the anti-freeze, ice-binding or thermal hysteresis proteins contain translational repeats with a unique ability to bind ice crystals, blocking them from growing.38 These ice-binding proteins are β-helical repeat proteins with a cylindrical shape and a highly organized flat surface that interacts with crystalline water. These proteins are therefore able to depress the freezing temperature of water very strongly. Different types of ice binding proteins with different architectures have emerged independently in different organisms (Fig. 5). While the tandem repeat is not highly conserved, the ice binding interface typically consists of polar amino acids (such as threonine) that mimic one face of an ice crystal. Leinala et al. observed that proteins with more repeats have a greater effect.39 This leads to the idea that novel proteins could be easily engineered with an exact repeating motif, and that by varying the number of repeats the thermal hysteresis temperature may be controlled to a desired degree.

Fig. 5 Ice binding proteins often exhibit a highly symmetric tertiary structure, typically a β-helix. This is required for the precise placement of the ice binding residues that hydrogen bond to the highly symmetrically arranged water molecules in ice crystals. Different highly symmetrical proteins have emerged during evolution in different organisms. The overall structures are depicted as cartoons at the top, and a view along the translational axis of a single repeat is given at the bottom: Tenebrio molitor, PDB: 1EZG; Lolium perenne, PDB: 3ULT; Choristoneura fumiferana, PDB: 1M8N; Marinomonas primoryensis, PDB: 3P4G; Rhagium inquisitor, PDB: 4DT5; Typhula ishikariensis, PDB: 3VN3. They present themselves as highly interesting protein scaffolds for the engineering of symmetric proteins.

2.8 De novo design of symmetrical proteins with elongated repeat proteins

Above we have discussed naturally-occurring repeat proteins and artificial symmetrical derivatives, but there is also great interest in the creation of repeat architectures not found in nature. With no existing structural template or sequence data, these novel designs can only be achieved by purely computational methods. Following their work on the design of repeat proteins with and without predefined curvature, based on natural repeat motifs, the Baker group developed a computational workflow using Rosetta to sample possible repeat structures built from a single motif consisting of two α-helices. In an exhaustive search, α-helical fragments of different lengths were combined with naturally occurring helix-loop-helix fragments to create a backbone, which was then further refined by additional cycles of fragment fitting. This resulted in a multitude of symmetrical backbone conformations with different degrees of bending, and either left- or right-handed twisting. Only a few of these geometries have been observed before in natural proteins. Next, Rosetta was utilized to design amino acid sequences consistent with these backbones, and the models were further validated by performing Rosetta-based ab initio folding simulations to ensure that each designed sequence corresponded to a unique fold at the energy minimum. A total of 83 different well-scoring designs were experimentally tested, of which 44 were successful. For several, the experimental crystal structure validated the design to atomic-level accuracy (Fig. 6).40
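The folding-simulation filter described above can be pictured as a check for a funnel-shaped energy landscape, in which the lowest-energy predicted structures lie close to the design model. The Python sketch below is one simple way to express such a test. It is not the Rosetta protocol; the thresholds, the data layout (RMSD to the design model paired with an energy) and the toy values are all assumptions made for illustration.

```python
def is_funnelled(decoys: list[tuple[float, float]],
                 rmsd_cutoff: float = 2.0,
                 energy_gap: float = 5.0) -> bool:
    """decoys: (rmsd_to_design, energy) pairs from folding simulations.

    Passes when the best near-design decoy is lower in energy than every
    far-from-design decoy by at least `energy_gap`.
    """
    near = [energy for rmsd, energy in decoys if rmsd <= rmsd_cutoff]
    far = [energy for rmsd, energy in decoys if rmsd > rmsd_cutoff]
    if not near:
        return False   # nothing folded close to the design model
    if not far:
        return True    # every decoy is close to the design model
    return min(far) - min(near) >= energy_gap

# Made-up decoy sets: (RMSD in angstroms, energy in arbitrary units).
good_design = [(0.8, -210.0), (1.2, -205.0), (6.5, -180.0), (9.1, -172.0)]
ambiguous_design = [(1.0, -190.0), (7.4, -189.0), (8.0, -193.0)]

print(is_funnelled(good_design))       # True
print(is_funnelled(ambiguous_design))  # False
```

A sequence that fails a test of this kind has a competing low-energy fold, which is the situation the validation step is intended to exclude before experimental testing.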

Fig. 6 Using computational protein design, several repeat proteins (top) have been designed de novo by thoroughly sampling the different backbone conformations that two α-helices (bottom) and a linker chain can adopt (from left to right: PDB: 5CWB, 5CWC, 5CWH, 5CWI).

3

Symmetrical globular proteins

The evolutionary process of duplication and fusion may also lead to structures that do not have polar symmetry, and which instead form closed rings with cyclic or dihedral symmetry. A protein that self-assembles into such a multimeric complex may evolve into a single polypeptide whose individual domains recapitulate the ancestral structure. From a protein engineering perspective, closed structures appear harder to design than the repeat proteins discussed above since both ends need to meet with exact geometry, but they also obviate the need for end-capping by specialised C- and N-terminal domains. Unfolding of a ring protein is a highly cooperative process, since it cannot denature from one end, which makes such proteins inherently more stable and more resistant to geometric imperfections in the design.41 A number of ring-shaped proteins have been designed using exact domain repeats. Most of those arose from studies of evolutionary pathways or tests of novel protein designs, and were not made with a future application in mind. However, since the rotational symmetry of symmetric globular proteins could facilitate assembly of larger complexes and novel nano-structures, they may find uses as protein building blocks for the assembly of macro-sized complexes with a variety of biotechnological applications (Fig. 7).42

3.1 Symmetric α/β-barrel proteins
The α/β-barrel protein, also known as a TIM-barrel after triosephosphate isomerase (TIM),43 is recognized as one of the most common protein folds, found in at least 15 distinct enzyme families with surprisingly low sequence identity and the ability to catalyse completely different reactions.44,45 This fold is characterized by an eight-fold repeat of (βα) units, forming a toroid topology with an eight-fold internal structural symmetry. The β-strands comprise the inner wall of the toroid, and are connected through long right-handed loops, which include the α-helices found on the outside of the protein. There is no central channel running through the protein; the protein core is tightly packed, containing mostly bulky hydrophobic amino acids, especially within the closed β-sheet. The α/β-barrel fold has long been a focus for studies of protein evolution, folding and design.


Fig. 7 α/β-barrels, β-trefoils and β-propellers are naturally occurring pseudo-symmetrical proteins. TIM (PDB: 5CSR), FGF1 (PDB: 1BAR) and PknD (PDB: 1RWL) are representative structures with apparent 8-, 3- or 6-fold symmetry, respectively. The sTIM-11 structure (PDB: 5BVL) is the first 4-fold symmetrical α/β-barrel protein. For the β-trefoil, both Symfoil (PDB: 3O49) and Threefoil (PDB: 3PG0) have been engineered. Pizza6 (PDB: 3WW9) was the first symmetric β-propeller protein.

One of the first attempts ever to design a symmetrical globular protein was the creation of Octarellin in 1990 by the group of Martial.46 Initial designs based on consensus sequences were perfectly symmetric and appeared to be correctly folded, but the structure of the protein could never be experimentally validated. Over the next 20 years several variants were created.47 The last one was Octarellin VI, which was computationally designed using Rosetta.48 Crystallography, however, revealed that this protein adopted a Rossmann-like fold rather than the intended α/β-barrel.49 The first successful symmetric TIM-barrel protein was 2-fold rather than 8-fold symmetric, and derived from fragmentation of natural proteins. In 2000, the group of Wilmanns identified a highly symmetrical α/β-barrel that could be split in half and reassembled into a dimeric version. This finding was regarded as convincing evidence for the hypothesis that internal protein symmetry arises from gene duplication and fusion.50 Another group used the insights gained from natural proteins to create a new protein scaffold combining structurally similar parts from the α/β-barrel and flavodoxin-like fold in repeat proteins.51 Since their initial result differed significantly from the intended α/β-barrel fold (having an extra β-strand in its core), they turned to computationally guided mutagenesis. This allowed them to produce a stable monomer and experimentally validate the predicted structure, proving the viability of a fragment-based approach to rational protein design.52 At the same time, an artificial α/β-barrel protein was designed from two identical half-barrels.53 These fragment-based recombination designs, however, are only two-fold symmetrical, although there are eight structural repeats. It was initially believed that the fold originated from an 8-fold duplication of this repeat, giving rise to an 8-fold pseudo-symmetrical protein. Following the discovery that the α-β repeat was evolutionarily conserved as a double repeat, and the demonstration that the hydrogen bonding pattern of the β-strand core of the fold could not be successfully established in an 8-fold symmetric protein, a 4-fold symmetrical protein consisting of four identical double α/β repeats was designed de novo by the Baker group using Rosetta. This required drafting geometric constraints to construct an initial protein backbone with 4-fold symmetry, which was then subjected to iterative cycles of unbiased side chain placement and all-atom energy minimization. Top-ranking solutions were kept and set the bar for sequences designed in further design cycles. The chosen sequence was manually adapted to fit certain needs (such as incorporation of tryptophan to set the spacing between helices). Protein crystallography validated the atomic accuracy of the design.54
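The distinction drawn above between 2-fold, 4-fold and 8-fold symmetric barrels can be illustrated with a small sketch. It assumes that each of the eight (βα) units is represented by an arbitrary label; the labels are placeholders rather than real sequences, and the function name `symmetry_order` is hypothetical. Under that assumption, the internal symmetry order is simply the largest number of identical consecutive blocks into which the list of units divides.

```python
def symmetry_order(units):
    """Largest n for which `units` consists of n identical consecutive blocks."""
    total = len(units)
    if total == 0:
        raise ValueError("at least one unit is required")
    for n in range(total, 0, -1):      # try the highest symmetry first
        if total % n != 0:
            continue                   # n must divide the number of units
        block = total // n             # number of units in one repeated block
        if all(units[i] == units[i % block] for i in range(total)):
            return n
    return 1  # not reached in practice: n = 1 always matches


# Placeholder labels for the eight (beta-alpha) units of a barrel (illustrative only).
designs = {
    "eight identical units": ["ba"] * 8,
    "four identical double repeats": ["ba1", "ba2"] * 4,
    "two identical half-barrels": ["u1", "u2", "u3", "u4"] * 2,
    "eight diverged units": [f"u{i}" for i in range(8)],
}

for name, units in designs.items():
    print(f"{name}: {symmetry_order(units)}-fold")
```

In this toy picture the half-barrel recombination designs come out as 2-fold and the design built from four double repeats as 4-fold, even though all of them contain eight structural repeats.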

3.2 Symmetric β-trefoil proteins
The β-trefoil is yet another common globular fold, adopted by at least 14 protein families.55 This protein harbours three repeat motifs, each composed of four β-strands. Each of these uses two β-strands to form a β-barrel, while the remaining strands describe a β-hairpin triplet that caps one end of the barrel.56 The Blaber group devised a “top-down symmetric deconstruction (SD)” approach to create a symmetric trefoil protein.57 In short, they mutated and dissected an existing protein with a β-trefoil fold (fibroblast growth factor-1) and determined a 42-residue peptide that was then used to assemble a novel, foldable and thermostable β-trefoil fold, named Symfoil.6 Using their Symfoil protein, the Blaber group tested their hypothesis that purely symmetric proteins could survive major replication errors (such as fusion, duplication and truncation) because of recurring key folding nuclei. Structural rearrangement would then likely leave such a folding nucleus intact, so that symmetry in the primary structure can conserve foldability.58 Ancestral proteins with a symmetrical primary structure would be more likely to survive these genetic events and evolve into the contemporary pseudo-symmetric protein folds. They observed that a protein consisting of only 2 instead of 3 of the tandem repeats would form a trimeric complex with two β-trefoil domains, in which one of the three subunits adopts a domain-swapped conformation to provide one subdomain in each of the connected trefoils. In parallel with the development of the Symfoil protein, Broom and colleagues designed a completely three-fold symmetric β-trefoil sequence that folded into a highly thermostable protein named Threefoil.59 Their approach was radically different from the method adopted by the Blaber group, utilizing consensus design with a limited set of close homologues and Rosetta-based energy scoring. Various experimental and computational analyses showed that the Threefoil protein exhibits high resistance to thermal and chemical denaturation, despite its small size and lack of disulphide bonds. Again, the successful construction of the Threefoil and Symfoil proteins reinforced the emerging paradigm of a modular structural element repetition approach in protein design.
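The difference between a purely symmetric primary structure and a pseudo-symmetric one can be made concrete by comparing the tandem repeats of a sequence position by position. The sketch below is only an illustration: the function names are hypothetical, the repeats are assumed to be of equal length and already aligned, and the 10-letter sequences are arbitrary placeholders, not the 42-residue Symfoil repeat or any real protein.

```python
from itertools import combinations


def split_repeats(sequence, n_repeats):
    """Cut a sequence into n equal-length consecutive repeats."""
    if not sequence or n_repeats <= 0 or len(sequence) % n_repeats != 0:
        raise ValueError("sequence length must be a non-zero multiple of n_repeats")
    size = len(sequence) // n_repeats
    return [sequence[i * size:(i + 1) * size] for i in range(n_repeats)]


def percent_identity(a, b):
    """Percentage of aligned positions that carry the same residue."""
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def repeat_identities(sequence, n_repeats=3):
    """Pairwise identities between all repeats, in the order (1,2), (1,3), (2,3)."""
    repeats = split_repeats(sequence, n_repeats)
    return [percent_identity(a, b) for a, b in combinations(repeats, 2)]


# Arbitrary placeholder motif (not a real trefoil repeat).
motif = "ACDEFGHIKL"
symmetric = motif * 3                                         # exact three-fold repeat
pseudo_symmetric = "ACDEFGHIKL" + "ACDEFGHIKV" + "ACDQFGHIKL"  # diverged repeats

print(repeat_identities(symmetric))         # [100.0, 100.0, 100.0]
print(repeat_identities(pseudo_symmetric))  # [90.0, 90.0, 80.0]
```

A designed protein built from exact repeats scores 100% for every pair, whereas natural pseudo-symmetric folds would be expected to score lower because their repeats have diverged.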

3.3 Symmetrical β-propeller proteins
β-propellers are toroid proteins consisting of 4 to 10 repeats of four-stranded antiparallel β-sheets, often called ‘blades’. These blades are circularly arranged around a central pore.60 The propeller forms a stable scaffold often used in Nature to assemble protein complexes through multiple protein–protein interactions at the different surfaces. One interesting property, observed in most β-propellers, is that their sequence repeats do not directly coincide with the separate structural blades, but are shifted by one β-strand. The N- and C-terminal residues therefore hold two adjacent blades together, closing the ring in a “Velcro-like” fashion.61 A subset of propeller proteins is called the “WD” family because these sequences include conserved tryptophan and aspartate residues. In 2006, a consensus design method was used to construct various propeller scaffolds using structure-based sequence alignment. An idealized WD repeat was designed, guided by structural restraints, and artificial concatemeric genes with up to 10 copies of this idealized repeat were constructed. Although proteins with 4 to 10 WD repeats could be expressed in E. coli, they were unstable and quickly degraded.62 In another study, Yadid et al. chose the highly symmetrical five-bladed β-propeller tachylectin-2 as a model for the generation of novel repeat propellers from identical repeats.63 The proteins created from identical tandem repeats proved to be highly unstable and prone to aggregation. The soluble fraction, however, retained some lectin function similar to the original tachylectin-2.64 Diversification of the sequence resulted in a more stable protein, leading to the conclusion that some degree of asymmetry is essential for stability.
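The one-strand offset between sequence repeats and structural blades described above can be sketched by numbering the β-strands in sequence order and assigning them to blades. This is a toy model under stated assumptions: the function name `blade_layout` is hypothetical, every blade is taken to have exactly four strands, and the direction of the shift (the first strand joining the last blade) is chosen for illustration; real propellers differ in how many terminal strands take part in the closure.

```python
def blade_layout(n_blades, strands_per_blade=4, shift=1):
    """Map strand indices (numbered in sequence order) onto blades.

    shift = 0: every sequence repeat coincides with one blade.
    shift = 1: blades are offset by one strand relative to the repeats.
    """
    layout = {blade: [] for blade in range(n_blades)}
    for strand in range(n_blades * strands_per_blade):
        # Floor division and modulo wrap the first strand around to the last blade.
        blade = ((strand - shift) // strands_per_blade) % n_blades
        layout[blade].append(strand)
    return layout


n = 6  # a six-bladed propeller, 24 strands in total
for shift in (0, 1):
    last_blade = blade_layout(n, shift=shift)[n - 1]
    print(f"shift={shift}: last blade contains strands {last_blade}")
```

With no shift the last blade is built from the final four strands only. With a shift of one strand it combines the very first strand of the chain with the last three, so the two termini are held together within a single blade, which is the Velcro-like closure.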
Later experiments in which a double repeat was utilized resulted in a pentameric protein consisting of two 5-bladed β-propeller domains.65 One of the subunits is forced to adopt a domain-swapped configuration and provide a single blade to both propeller domains. This structural plasticity, as previously also observed in trefoils by the group of Blaber, is hypothesized to be an intermediate step in the evolution of globular repeat proteins with an odd number of sequence repeats.6 Building on the idea that duplication and fusion of ancestral fragments yielded the monomeric predecessors of extant proteins, a novel computational approach succeeded in designing the first experimentally validated perfectly symmetrical β-propeller protein, Pizza6.66 Sequence analysis of the six blades of an existing pseudo-symmetrical β-propeller was used to build a library of putative ancestral sequences. These sequences were then evaluated by mapping them onto a symmetrical backbone and calculating the overall energy using RosettaDesign.67 The best-scoring sequences proved to yield highly stable proteins with the predicted structure, which was validated by X-ray crystallography. The six-bladed protein also possesses three-fold and two-fold symmetry, so that polypeptides comprising two (Pizza2) or three (Pizza3) blades are also able to fold directly into trimeric and dimeric structures, respectively. This observation agrees with the duplication and fusion theory for globular pseudo-symmetric proteins. However, the Pizza proteins are also capable of assembly into various larger complexes corresponding to the lowest common multiple (LCM) of the number of repeats and six, showing increased potential for the assembly of nano-structures.42 A Pizza8 protein, for example, with 8 identical repeats on each polypeptide, can assemble into a trimeric complex with 24 repeats forming four propellers. Subsequently, a trimeric variant of the Pizza protein was engineered for metal binding, and when it was incubated with cadmium chloride, sets of two trimers were found to sandwich a 19-atom nanocrystal centred on the three-fold symmetry axis of the complex.68 In 2016, the Tawfik group also demonstrated a putative evolutionary trajectory for the tachylectin-2 protein via ancestral reconstruction of stable, foldable and biochemically active precursor sequences. Rather than computationally selecting the optimal sequence, they relied on exhaustive experimental testing and directed evolution to identify a symmetrical tachylectin-2 based β-propeller.69 In their paper, they described how monomeric proteins were first assembled via tandem duplications of motifs comprising functional oligomeric ancestors and later diverged via simpler mutations, ultimately balancing monomer stability versus foldability.

3.4 De novo design of globular symmetric proteins
In 2015, Doyle et al. succeeded in creating the first globular symmetrical proteins for which there is no natural protein template.
In this experiment, they built a series of closed α-solenoid repeat architectures, relying on the Rosetta protein design package, and experimentally validated their designs.41 They first folded a peptide backbone consisting of a double α-helix according to the desired inter-repeat geometry (defined by a specific rise and curvature of the tandem repeats comprising the closed assembly). From these backbone models, they inferred the appropriate amino acid sequence from a list of computationally designed sequences. Via clustering they could identify recurring topologies, which were subjected to more elaborate sampling. For every backbone, the best-scoring sequences were experimentally tested, and crystal structures confirmed the accuracy of the design to atomic level. These were the first globular proteins with 3- to 12-fold internal symmetry to be designed without relying on any naturally occurring sequence or structure (Fig. 8). While these proteins do not yet have any application, they surely represent a great step forward in the de novo design of novel protein building blocks with tailored shapes and sizes.
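Returning briefly to the Pizza assemblies of Section 3.3, the lowest-common-multiple rule quoted there lends itself to a short worked example. The sketch below is a minimal illustration with a hypothetical function name. The rule is stated in the text only for the Pizza proteins; applying it to the two-repeat trefoil and the double-repeat tachylectin-2 constructs is an extrapolation made here, although it reproduces the trimer and pentamer reported above.

```python
from math import gcd


def assembly_stoichiometry(repeats_per_chain, repeats_per_domain):
    """Smallest closed assembly in which whole chains fill whole domains."""
    if repeats_per_chain <= 0 or repeats_per_domain <= 0:
        raise ValueError("repeat counts must be positive")
    # Lowest common multiple of the two repeat counts.
    total = repeats_per_chain * repeats_per_domain // gcd(repeats_per_chain, repeats_per_domain)
    return {
        "total_repeats": total,
        "chains": total // repeats_per_chain,
        "domains": total // repeats_per_domain,
    }


cases = {
    "Pizza2 (2 blades, 6-bladed propeller)": (2, 6),
    "Pizza3 (3 blades, 6-bladed propeller)": (3, 6),
    "Pizza8 (8 blades, 6-bladed propeller)": (8, 6),
    "two-repeat trefoil construct (3-repeat fold)": (2, 3),
    "double-repeat tachylectin-2 construct (5-bladed propeller)": (2, 5),
}

for name, (per_chain, per_domain) in cases.items():
    print(name, assembly_stoichiometry(per_chain, per_domain))
```

For Pizza8 this gives 24 repeats shared between three chains and four propellers, matching the trimeric complex described in the text.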

4 Future perspectives
As shown above, a number of groups have now developed symmetrical proteins, with either polar or circular symmetry, inspired by naturally occurring templates or using purely computational methods de novo.


Fig. 8 A set of rotationally symmetric ring-shaped proteins has been computationally designed de novo by sampling different backbone conformations of two α-helices and a linker chain, with the restriction that they form a closed architecture. The resulting proteins exhibit 3-, 6-, 9- and 12-fold rotational symmetry (cartoon depiction; from left to right, PDB: 4YY2, 4YXX, 4YXZ, 5BYO).

While the majority of these designs have been made from a proof-of-principle point of view, to investigate protein evolution and folding, there is a huge untapped potential for the development of these novel proteins for different medical and technological purposes. Symmetrical proteins still serve as a fantastic tool to study protein folding, and additional designs will without any doubt improve our understanding of protein structure and stability, which in turn will help create novel proteins with desired functionalities. As indicated by the DARPins and similar proteins, symmetrical proteins may also be interesting starting points for the design of artificial antibody-like proteins. With more variety in shape and size, larger protein complexes may be targeted or engineered. It may even prove possible to design proteins that can target a specific antigen by matching the symmetry of the binding protein to its target. Many protein complexes of medical interest are homo-oligomers, and designer proteins may prove a useful route to artificial means of locating or inhibiting them in vivo. Possible interaction partners are not limited to proteins, however, but include small molecules and metal ions. Designer proteins have already been shown to be able to biotemplate nanocrystals70 and there is now huge interest in proteins as templates of metal nanoclusters. The symmetrical self-assembling complexes made by artificial proteins with repeated structural modules may prove to be ideal building blocks for larger complexes with predefined symmetries such as crystal-like frameworks or capsids.42,71,72 Such self-assembled materials may be amenable to further functionalisation for use in storage, catalysis, biotemplating or drug release. In combination with metal oxides they may be able to form zeolite-like catalysts and molecular sieves.
As synthetic biology moves from feasibility studies towards applications-based research and development, artificial symmetrical proteins appear to have a bright future, limited by little more than our imagination.

Acknowledgements
JV acknowledges the FWO for an FWO-SB PhD fellowship. ARDV acknowledges the KUL and the FWO for Odysseus and project funding. JRHT thanks OpenEye Scientific Software for financial support.

References

1 G. R. Darvas, Symmetry: Cultural-Historical and Ontological Aspects of Science-arts Relations: the Natural and Man-made World in an Interdisciplinary Approach, Birkhäuser, Basel; Boston, 2007.
2 E. D. Levy and S. Teichmann, Prog. Mol. Biol. Transl. Sci., 2013, 117, 25–51.
3 L. S. Swapna, K. Srikeerthana and N. Srinivasan, PloS One, 2012, 7, e36688.
4 D. S. Goodsell and A. J. Olson, Annu. Rev. Biophys. Biomol. Struct., 2000, 29, 105–153.
5 S. Balaji, Curr. Opin. Struct. Biol., 2015, 32, 156–166.
6 J. Lee and M. Blaber, Proc. Natl. Acad. Sci. U. S. A., 2011, 108, 126–130.
7 J. Monod, J. Wyman and J. P. Changeux, J. Mol. Biol., 1965, 12, 88–118.
8 R. Zandi, D. Reguera, R. F. Bruinsma, W. M. Gelbart and J. Rudnick, Proc. Natl. Acad. Sci. U. S. A., 2004, 101, 15556–15560.
9 D. L. Caspar and A. Klug, Cold Spring Harbor Symp. Quant. Biol., 1962, 27, 1–24.
10 J. Tang, M. N. James, I. N. Hsu, J. A. Jenkins and T. L. Blundell, Nature, 1978, 271, 618–621.
11 C. Kim, J. Basner and B. Lee, BMC Bioinf., 2010, 11, 303.
12 D. Myers-Turnbull, S. E. Bliven, P. W. Rose, Z. K. Aziz, P. Youkharibache, P. E. Bourne and A. Prlic, J. Mol. Biol., 2014, 426, 2255–2268.
13 A. L. Hughes, Proc. Natl. Acad. Sci. U. S. A., 2005, 102, 8791–8792.
14 F. Emmert-Streib, PloS One, 2012, 7, e35531.
15 L. K. Mosavi, T. J. Cammett, D. C. Desrosiers and Z. Y. Peng, Protein Sci. Publ. Protein Soc., 2004, 13, 1435–1448.
16 F. Parmeggiani, R. Pellarin, A. P. Larsen, G. Varadamsetty, M. T. Stumpp, O. Zerbe, A. Caflisch and A. Pluckthun, J. Mol. Biol., 2008, 376, 1282–1304.
17 K. T. Simons, R. Bonneau, I. Ruczinski and D. Baker, Proteins, 1999, Suppl 3, 171–176.
18 K. W. Kaufmann, G. H. Lemmon, S. L. Deluca, J. H. Sheehan and J. Meiler, Biochemistry, 2010, 49, 2987–2998.
19 F. Parmeggiani, P. S. Huang, S. Vorobiev, R. Xiao, K. Park, S. Caprari, M. Su, J. Seetharaman, L. Mao, H. Janjua, G. T. Montelione, J. Hunt and D. Baker, J. Mol. Biol., 2015, 427, 563–575.
20 A. Pluckthun, Annu. Rev. Pharmacol. Toxicol., 2015, 55, 489–511.
21 C. Kappel, U. Zachariae, N. Dolker and H. Grubmuller, Biophys. J., 2010, 99, 1596–1603.
22 W. Li, L. C. Serpell, W. J. Carter, D. C. Rubinsztein and J. A. Huntington, J. Biol. Chem., 2006, 281, 15916–15922.
23 A. Urvoas, A. Guellouz, M. Valerio-Lepiniec, M. Graille, D. Durand, D. C. Desravines, H. van Tilbeurgh, M. Desmadril and P. Minard, J. Mol. Biol., 2010, 404, 307–327.
24 M. Valerio-Lepiniec, A. Urvoas, A. Chevrel, A. Guellouz, Y. Ferrandez, A. Mesneau, I. L. de la Sierra-Gallay, M. Aumont-Nicaise, M. Desmadril, H. van Tilbeurgh and P. Minard, Biochem. Soc. Trans., 2015, 43, 819–824.
25 C. Madhurantakam, G. Varadamsetty, M. G. Grutter, A. Pluckthun and P. R. Mittl, Protein Sci. Publ. Protein Soc., 2012, 21, 1015–1028.
26 S. Hansen, D. Tremmel, C. Madhurantakam, C. Reichen, P. R. Mittl and A. Pluckthun, J. Am. Chem. Soc., 2016, 138, 3526–3532.
27 E. Petters, D. Krowarsch and J. Otlewski, Acta Biochim. Pol., 2013, 60, 585–590.
28 E. R. Main, S. E. Jackson and L. Regan, Curr. Opin. Struct. Biol., 2003, 13, 482–489.
29 E. R. Main, Y. Xiong, M. J. Cocco, L. D’Andrea and L. Regan, Structure, 2003, 11, 497–508.

30 T. Z. Grove, J. Forster, G. Pimienta, E. Dufresne and L. Regan, Biopolymers, 2012, 97, 508–517.
31 T. Z. Grove and L. Regan, Curr. Opin. Struct. Biol., 2012, 22, 451–456.
32 M. T. Stumpp, P. Forrer, H. K. Binz and A. Pluckthun, J. Mol. Biol., 2003, 332, 471–487.
33 K. Park, B. W. Shen, F. Parmeggiani, P. S. Huang, B. L. Stoddard and D. Baker, Nat. Struct. Mol. Biol., 2015, 22, 167–174.
34 M. M. Mahfouz, L. Li, M. Shamimuzzaman, A. Wibowo, X. Fang and J. K. Zhu, Proc. Natl. Acad. Sci. U. S. A., 2011, 108, 2623–2628.
35 F. Praetorius and H. Dietz, Science, 2017, 355, 6331.
36 M. W. Vetting, S. S. Hegde, J. E. Fajardo, A. Fiser, S. L. Roderick, H. E. Takiff and J. S. Blanchard, Biochemistry, 2006, 45, 1–10.
37 S. Shah and J. G. Heddle, Appl. Microbiol. Biotechnol., 2014, 98, 9545–9560.
38 P. L. Davies, Trends Biochem. Sci., 2014, 39, 548–555.
39 E. K. Leinala, P. L. Davies, D. Doucet, M. G. Tyshenko, V. K. Walker and Z. Jia, J. Biol. Chem., 2002, 277, 33349–33352.
40 T. J. Brunette, F. Parmeggiani, P. S. Huang, G. Bhabha, D. C. Ekiert, S. E. Tsutakawa, G. L. Hura, J. A. Tainer and D. Baker, Nature, 2015, 528, 580–584.
41 L. Doyle, J. Hallinan, J. Bolduc, F. Parmeggiani, D. Baker, B. L. Stoddard and P. Bradley, Nature, 2015, 528, 585–588.
42 J. C. Sinclair, K. M. Davies, C. Venien-Bryan and M. E. Noble, Nat. Nanotechnol., 2011, 6, 558–562.
43 J. R. Knowles, Nature, 1991, 350, 121–124.
44 N. Nagano, E. G. Hutchinson and J. M. Thornton, Protein Sci. Publ. Protein Soc., 1999, 8, 2072–2084.
45 H. Hegyi and M. Gerstein, J. Mol. Biol., 1999, 288, 147–164.
46 K. Goraj, A. Renard and J. A. Martial, Protein Eng., 1990, 3, 259–266.
47 M. Beauregard, K. Goraj, V. Goffin, K. Heremans, E. Goormaghtigh, J. M. Ruysschaert and J. A. Martial, Protein Eng., 1991, 4, 745–749.
48 M. Figueroa, N. Oliveira, A. Lejeune, K. W. Kaufmann, B. M. Dorr, A. Matagne, J. A. Martial, J. Meiler and C. Van de Weerdt, PloS One, 2013, 8, e71858.
49 M. Figueroa, M. Sleutel, M. Vandevenne, G. Parvizi, S. Attout, O. Jacquin, J. Vandenameele, A. W. Fischer, C. Damblon, E. Goormaghtigh, M. Valerio-Lepiniec, A. Urvoas, D. Durand, E. Pardon, J. Steyaert, P. Minard, D. Maes, J. Meiler, A. Matagne, J. A. Martial and C. Van de Weerdt, J. Struct. Biol., 2016, 195, 19–30.
50 D. Lang, R. Thoma, M. Henn-Sax, R. Sterner and M. Wilmanns, Science, 2000, 289, 1546–1550.
51 T. A. Bharat, S. Eisenbeis, K. Zeth and B. Hocker, Proc. Natl. Acad. Sci. U. S. A., 2008, 105, 9942–9947.
52 S. Eisenbeis, W. Proffitt, M. Coles, V. Truffault, S. Shanmugaratnam, J. Meiler and B. Hocker, J. Am. Chem. Soc., 2012, 134, 4019–4022.
53 B. Hocker, A. Lochner, T. Seitz, J. Claren and R. Sterner, Biochemistry, 2009, 48, 1145–1147.
54 P. S. Huang, K. Feldmeier, F. Parmeggiani, D. A. Fernandez Velasco, B. Hocker and D. Baker, Nat. Chem. Biol., 2016, 12, 29–34.
55 R. D. Finn, J. Mistry, J. Tate, P. Coggill, A. Heger, J. E. Pollington, O. L. Gavin, P. Gunasekaran, G. Ceric, K. Forslund, L. Holm, E. L. Sonnhammer, S. R. Eddy and A. Bateman, Nucleic Acids Res., 2010, 38, D211–222.
56 C. P. Ponting and R. B. Russell, J. Mol. Biol., 2000, 302, 1041–1047.
57 M. Blaber and J. Lee, Curr. Opin. Struct. Biol., 2012, 22, 442–450.

58 L. M. Longo, J. Lee, C. A. Tenorio and M. Blaber, Structure, 2013, 21, 2042–2050.
59 A. Broom, A. C. Doxey, Y. D. Lobsanov, L. G. Berthin, D. R. Rose, P. L. Howell, B. J. McConkey and E. M. Meiering, Structure, 2012, 20, 161–171.
60 C. K. Chen, N. L. Chan and A. H. Wang, Trends Biochem. Sci., 2011, 36, 553–561.
61 V. Fulop and D. T. Jones, Curr. Opin. Struct. Biol., 1999, 9, 715–721.
62 M. Nikkhah, Z. Jawad-Alami, M. Demydchuk, D. Ribbons and M. Paoli, Biomol. Eng., 2006, 23, 185–194.
63 I. Yadid and D. S. Tawfik, J. Mol. Biol., 2007, 365, 10–17.
64 I. Yadid and D. S. Tawfik, Protein Eng., Des. Sel., 2011, 24, 185–195.
65 I. Yadid, N. Kirshenbaum, M. Sharon, O. Dym and D. S. Tawfik, Proc. Natl. Acad. Sci. U. S. A., 2010, 107, 7287–7292.
66 A. R. D. Voet, A. Ito, M. Hirohama, S. Matsuoka, N. Tochio, T. Kigawa, M. Yoshida and K. Y. J. Zhang, Med. Chem. Commun., 2014, 5, 783–786.
67 A. R. Voet, D. Simoncini, J. R. Tame and K. Y. Zhang, Methods Mol. Biol., 2017, 1529, 309–322.
68 A. R. Voet, H. Noguchi, C. Addy, K. Y. Zhang and J. R. Tame, Angew. Chem., 2015, 54, 9857–9860.
69 R. G. Smock, I. Yadid, O. Dym, J. Clarke and D. S. Tawfik, Cell, 2016, 164, 476–486.
70 A. R. Voet and J. R. Tame, Curr. Opin. Biotechnol., 2017, 46, 14–19.
71 J. E. Padilla, C. Colovos and T. O. Yeates, Proc. Natl. Acad. Sci. U. S. A., 2001, 98, 2217–2221.
72 Y. T. Lai, D. Cascio and T. O. Yeates, Science, 2012, 336, 1129.

Designer proteins for bottom-up synthetic biology
Maxim G. Ryadnov
Published on 23 November 2017 on http://pubs.rsc.org | doi:10.1039/9781782622789-00115

DOI: 10.1039/9781782622789-00115

This chapter discusses recent developments in sub- and extra-cellular biological systems assembled from designer polypeptides. Artificial and semi-designed extracellular matrices and viruses are described from the perspective of bottom-up synthetic biology and protein-inspired fabrication. The main emphasis is placed on protein self-assembly as a versatile tool of engineering biology, aiming to create biological parts and devices that do not necessarily originate in nature. This intimately serves the fundamentals of synthetic biology, while addressing the main problem in biomolecular design, namely our incomplete understanding of how structure relates to function. Basic design principles are explained to introduce the need for designing synthetic biologics, with future perspectives given in the light of the commercialisation of synthetic biology. The chapter reviews research findings published over the last few years up to the time of its submission, with reference given to background information, which covers an unlimited timeframe, citing literature sourced from different databases including Web of Science, RCSB Protein Data Bank and PubMed.

1 Introduction

The complexity of a given biological system only matters when the desired function is delivered. This requirement permits a level of flexibility in biological hierarchy, which can be engineered artificially; that is, from scratch. An ideal strategy is to programme function in constituent structural units, for which polypeptides provide ideal candidates. Considerable progress made in engineering materials using polypeptides offers construction principles for structurally and functionally complex architectures.1 Naturally occurring proteins are a major source of novel ideas and designs,2 rendering synthetic designs mimetics of existing native forms. These are based on relatively well understood structure-activity relationships. Although simplified, these systems can deliver comparable or even better outcomes,3 conforming to the mission of synthetic biology, which applies engineering principles to the building of new biological parts and systems at the primary component level:4,5 “the design and engineering of biologically-based parts, novel devices and systems – as well as the redesign of natural biological systems”.6 This accepted definition prompts the term engineering biology. Engineering is at the heart of synthetic biology developments, which require reproducible rules and structurally defined components or biological parts that can be re-used and re-purposed to support different functions.7 Engineering biology is stereotypically associated with genetic engineering, for which a rich repertoire of genetic components is established, with a terminology proposed and available as an open source, including the Synthetic Biology Open Language (SBOL).8

National Physical Laboratory, Teddington TW11 0LW, UK. E-mail: [email protected]
Synthetic Biology, 2018, 2, 115–154 | © The Royal Society of Chemistry 2018


The language is proving instrumental for engineering metabolic pathways, chassis and genetic circuits. The purpose of these is to enable, scale in and scale up the manufacturing of desired products, be these drugs, agrochemicals, biofuels, protective composites or virtually anything else that can be produced. Genetic synthetic biology therefore relies on the products themselves being pre-established and pre-validated. In other words, these products are known and their properties are predicted. Innovation is hence largely driven by the need to design manufacturing devices – chassis, circuits, pathways – rather than new materials and bioparts that can provide novel functions. The latter is the realm of bottom-up synthetic biology, to which proteins make a major contribution. While nucleic acids are primarily responsible for storing and programming biological information, it is proteins that translate information into function. Proteins support diverse functions ranging from structural and regulatory to chemical transformation and molecular recognition. It is, however, deemed impossible to cover all aspects in one chapter. The objective of this chapter is to stimulate the understanding of protein-based synthetic biology. This is a relatively independent part of the area, which strives to engineer biological parts with post-genetic functions. Some examples may include the design of novel protein folds that are able to selectively bind oxygen9 or to be activated by an artificial ligand,10 thus acting as synthetic hemoglobins and G-protein coupled receptors, respectively. The drawback of these designs is that they are conserved, autonomously folded proteins, each assigned to a specific function. Although the same design rules can be applied to new designs, re-purposing the same designs for other functions does not appear to be straightforward. In this regard, there are two main strategies used for protein-based synthetic biology.

2 Designer proteins as mimetics of native assemblies

Peptide self-assembly uses conserved molecular recognition events that are programmed in primary or secondary structure elements such as coiled coils and aromatics-based p–p stacking. Once a given element is understood and experimentally confirmed it is exploited for synthetic self-assembly following one of the two outlined directions. However, in most cases found structural patterns are utilised in the system of their origin, as in a naturally occuring ensemble. To provide exhaustive structure-function relationships for a native polypeptide sequence allows for the description of self-assemby pathways, which remains a major challenge for synthetic designs. Because of this, most of modern selfassembly approaches incorporate elementary parts of rational design, those that are based on proven principles, and iterative designs that develop from and empirically extend the same principles. 2.1 Protein folding elements as elementary construction parts One direction focuses on approaches that can help to define a suitable rationale enabling a better control over polypeptide structure, which would be free from the constraints of natural selection, and able to tackle 116 | Synthetic Biology, 2018, 2, 115–154

Published on 23 November 2017 on http://pubs.rsc.org | doi:10.1039/9781782622789-00115

Fig. 1 A schematic representation of protein complexity. The solid arrows indicate the continuity of folding hierarchy from the bottom (primary structure) up (fully assembled and functional structures). The dashed arrow denotes a “by-pass” from an elementary folding motif, e.g. an α-helix (circle), to resulting assemblies functionally matching the most complex protein structures.

folding hierarchy at all conceivable levels, inevitably leading to finding effective solutions for new designs and applications (Fig. 1).11 Another direction bypasses this problem by making use of known, elementary folding elements as bio-bricks for the construction of increasingly complex assemblies.12 This approach tends to neglect sequence-to-function designs in two ways. Firstly, it adapts known protein folds into design algorithms to construct new assembly forms and materials;13 and, secondly, it successfully replaces protein design with peptide design to explore a far smaller sequence space for the delivery of similarly complex systems (Fig. 1).1 The main factor that discriminates between peptide and protein designs is the extent to which these strategies and relationships can be applied. It may be fairly straightforward to predict the sequence of a simple folding motif, i.e. an α-helix or β-sheet, but predicting sequence-folding pathways for a functional protein comprising several different motifs, each of which can be unique, is an altogether different proposition. Granted that the ultimate goal of synthetic biology is the synthesis of biologically inspired and operational systems, the complexity of such systems only matters when the desired function is achieved.14 Irrespective of what the target object or function is, it is engineered artificially, possibly from scratch, and is rationally programmed in building blocks. Nucleic acids and polypeptides can serve as such elementary units, while peptide self-assembly finds use as a natural and most efficient process to afford functional synthetic biologics.15 Sub-cellular structures (virus-like particles) and extracellular systems (fibrillar matrices) are arguably the most complex biological structures that assemble from several fully folded proteins.1 Nevertheless,

functional mimetics of these and similar systems are being attempted by peptide self-assembly.16,17 An increasing interest in peptide self-assembly stems from the fact that proteins allocate specific fragments of their sequences, i.e. polypeptides, to specialist functions. The folding and oligomerisation of these shorter sequences are relatively easy to assign; design rules for peptides exist and are based on well characterised sequence stretches in native proteins. Despite being “simpler”, peptide designs have a number of synthetic advantages over proteins. Firstly, peptide synthesis up to 50-mers is nowadays routine and affordable. The turnaround time from the design stage to a pure material is merely a few days. Secondly, peptides are more amenable to iterative designs with a smaller number of alternative sequences. Thirdly, it remains empirically more straightforward to apply the principles of negative (away from undesired structures) and positive (towards desired structures) design to the selection of individual folding motifs than to fully folded proteins. All in all, this informs the scope of this chapter as an overview of recent and exciting synthetic designs derived from artificial polypeptides that are inspired by naturally occurring protein structures and assemblies. As exemplars, the two most abundant protein morphologies – filamentous extracellular matrices and spherical virus-like assemblies – serve as the objects of choice for this discussion.

3 Designer protein bricks

Peptide self-assembly operates with secondary structure elements such as coiled coils and supramolecular patterns such as di-phenylalanine π–π stacking.18 In most cases, however, the structural patterns found are utilised as specified in the system of their origin, in a naturally occurring ensemble. This is of no surprise as the synthetic is bound to follow the native. For instance, the rules that guide filamentous assemblies are adapted for the self-assembly of fibrous extracellular matrices to support cell growth and 3D cell culture, whilst virus-like protein cages or capsules are assessed for their feasibility as gene delivery vectors, synthetic vaccines and controlled release systems.16–18 Furthermore, major efforts in cell therapy including regenerative medicine and gene therapy are no longer considered without peptide self-assembly as a fabrication method. Although robust and fully reproducible rules for the synthesis of fully functional systems are still lacking, examples reported to date encourage more trial-and-error attempts which feed aspirations to decipher the code of peptide self-assembly. Because our understanding of sequence-assembly relationships remains incomplete, modern self-assembly approaches incorporate elements of rational engineering (those that are based on proven principles) and iterative designs (those that develop from and empirically extend the same principles).

3.1 From structural to functional forms

An ideal synthetic form meets a specialist application. This applies to naturally occurring forms that might support roles different from

specialist functions, e.g. purely decorative. Popular self-assembly strategies start with selecting a form or function, but also strive to formulate a toolkit of amenable building blocks. Often, not only the form but also the way it is constructed is emulated. For peptides this depends on a clear understanding of the links between primary structure, which can also incorporate non-peptidic elements,18 folding type (α-helix, β-strand) and the assembly pathway (filamentous, spheroidal, etc.).1,17 Therefore, even when we claim that a peptidic bio-part is designed de novo, it still uses protein-inspired links. In practical terms, de novo designs follow a strategy of emulating native folding motifs. Proteins geometrically self-arrange using these motifs. Few types of common elements and arrangements exist. These are either sequential, that is, one polypeptide sequence encodes all elements within one arrangement, or oligomeric, that is, different polypeptide chains assemble into a nano-to-microscopic material (partly, this is the reason for a subtle, if any, border between nanomaterial designs and synthetic biology designs). Similar elements can be found in different proteins carrying out different functions, but all elements relate sequential and spatial organisations. This forms a basic design framework. A designed protein dodecahedron can provide an example (Fig. 2).18 Geometrically defined protein structures are common in biology for gene and molecular packaging and transport. These are biologically effective and structurally reproducible and attract a growing interest for re-design. 
The designed dodecahedron is stable and reversible at elevated temperatures and high salt concentrations, which renders it a customizable material for applications ranging from drug delivery to vaccine designs.19 A pre-biotic selection and replication of homochiral precursor monomers in a pool of chiral peptides can serve as another, earlier example demonstrating the application of folding motifs as a tool of synthetic biology.20 This tool uses the ability of peptides to form helical bundles that allow a stereochemical “editing” and consequently the discrimination of single stereochemical mutations in individual helices. The example suggests an elementary autocatalytic process for error correction and possibly the origin of life.

3.2 Protein folding elements

Secondary motifs are typically confined to four main categories – helix, sheet, turn and loop – and more specific structures such as polyproline helices whose use is restricted to a selected few, i.e. collagen-based, designs (Fig. 3). Turn- and loop-like structures are different from more extended helices and β-strands in that their conformational preferences depend on the sequence length between the two end residues. Turns are very short motifs that can comprise two (γ-turns), three (β-turns) or four (α-turns) peptide bonds. Loops are longer, but a rationale for their preference over turns is unclear. Proteins use both loops and turns for spatial exposure of short signal sequences. Backbone closure at the end residues in these structures is the main factor for function. Free or isolated turns and loops

Fig. 2 A synthetic protein dodecahedron. (a) Cryo-EM micrograph showing homogeneous dodecahedral particles. (b) Computational model (left) and class average along the 5-fold axis (right).19 Reprinted by permission from Macmillan Publishers Ltd: Nature (Y. Hsia, J. B. Bale, and S. Gonen et al., 2016, 535, 136–139), copyright (2016).

rarely retain native functions because they need to be constrained to form stable conformations, a constraint provided by the protein scaffolds that incorporate the motifs. Therefore, turns outside their original protein context are stabilised by covalently fixing a key intra-turn hydrogen bond. The caveat is not only to design a stable structure, but also to differentiate between single, inverse and multiple turns within the same backbone. The role of turns in protein structure and function is of course broader and includes their intrinsic ability to reverse polypeptide chains, which determines the protein preference for globularity over linearity, and the subdivision of turn structures into several types linked to different globular shapes. Although turns help design super-secondary structures such as hairpins, both autonomous and oligomerising, the main interest in turns is due to them being a rich source of potential pharmacophores.21

Unlike turns and loops that play auxiliary roles in supramolecular designs, helices and sheets are construction blocks. To assemble, these must support at least two different types of long-range interactions. Any protein, or indeed any biopolymer, is subject to the hydrophobic effect that drives the formation of hydrophobic interfaces. The peptide bond is highly polar and is disfavoured in a hydrophobic core, which forces a peptide backbone to structure. However, in long enough sequences the peptide bond is also regularly repeated, which leads to ordering mainly thanks to a persistent network of hydrogen bonds that neutralises the backbone polar groups. Electrostatic interactions and van der Waals forces are also important in making contacts with side chain polar groups. Secondary structures thus become local structures as opposed to the global structures of fully folded proteins. Nonetheless, it is the amino-acid sequence that specifies both the secondary structure and the type of hydrogen bonding. For this reason, in designing folding motifs the main emphasis is on sequence patterns with reproducible geometries that are characteristic of a particular folding element, i.e. α-helix or β-strand (Fig. 3). Canonical α-helices and β-strands have 3.6 and 2 residues per turn, respectively. These are elemental repeats in contiguous segments that are stabilised by backbone hydrogen bonding. In helices this bonding is intra-backbone and is maintained in each i, i + 4 amino-acid pair. This allows a helix to extend indefinitely in length, from under ten amino acids to hundreds. In contrast, a β-strand, having fewer residues per turn, is of a more extended conformation which cannot be stabilised by hydrogen bonds within its backbone and requires inter-backbone hydrogen bonding. β-strands therefore tend to extend laterally by pairing with other β-strands into β-sheets, and need not be longer than ten residues. 
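The i, i + 4 rule and the residues-per-turn figures quoted above lend themselves to a short illustration. The following is a minimal sketch in Python rather than a structural model: the function names are hypothetical, residues are numbered from 1, and only the two repeat values given in the text (3.6 and 2 residues per turn) are used.

```python
# Residues per turn quoted in the text for the two canonical motifs.
RESIDUES_PER_TURN = {"alpha-helix": 3.6, "beta-strand": 2.0}


def helix_hbond_pairs(n_residues, offset=4):
    """List the (i, i + offset) backbone pairs of an n-residue helix (1-based).

    With offset=4 this enumerates the i, i + 4 pairs described in the text.
    """
    return [(i, i + offset) for i in range(1, n_residues - offset + 1)]


def number_of_turns(n_residues, motif):
    """Number of turns spanned by n residues of the given motif."""
    return n_residues / RESIDUES_PER_TURN[motif]


if __name__ == "__main__":
    # A ten-residue helix already supports six intra-backbone pairs.
    print(helix_hbond_pairs(10))
    # Eighteen helical residues span about five turns.
    print(number_of_turns(18, "alpha-helix"))
```

Each added residue contributes one further i, i + 4 pair, which is one way to read the statement that a helix can extend indefinitely, whereas an isolated β-strand has no equivalent intra-backbone pairing and depends on partner strands.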
By following these basic principles it is possible to prescribe a particular structure. Sequences with alternating hydrophobic (H) and polar (P) residues form β-strands, (HP)n. The same patterns can afford turns with the inclusion of glycine or proline residues at turn points, as these residues are known to be efficient breakers of extended secondary structures. The exception is polyproline helices, which, as the name suggests, are built of repeating proline patterns such as the glycine-proline-proline triads in collagen triple helices. Hydrophobic residues at i + 4 positions support a helix-promoting pattern (HPPPHPPP)n, which is further stabilised by polar or small (alanine) residues between the hydrophobic residues (Fig. 3). This inclusion of polar and hydrophobic residues allows peptide motifs to accommodate hydrophobic interactions by segregating the residues into two distinct faces, hydrophobic and polar, thus rendering these building blocks amphipathic. Once a pattern is chosen it can be filled with amino-acid residues specified according to their conformational preferences. For example, glutamines, lysines and leucines favour helices, whereas tryptophans, threonines and valines strongly prefer β-structures. Such basic rules underpin the first design principles of protein structure-function relationships, but are far from complete enough to enable prediction approaches and algorithms which would make de novo design a routine operation.
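The binary patterning described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a design algorithm: the set of residues treated as hydrophobic is a crude, hypothetical choice (alanine is left out because the text treats it as a small spacer residue), and the helper names are invented for this example.

```python
# ASSUMPTION: a crude binary alphabet. Which residues count as hydrophobic
# is an illustrative choice, not a classification given in the chapter.
HYDROPHOBIC = set("VILMFWY")


def to_hp(sequence):
    """Translate a one-letter amino-acid sequence into an H/P string."""
    return "".join("H" if aa in HYDROPHOBIC else "P" for aa in sequence.upper())


def fill_pattern(unit, repeats, h_residue, p_residue):
    """Fill a repeated H/P unit with one hydrophobic and one polar residue."""
    pattern = unit * repeats
    return "".join(h_residue if symbol == "H" else p_residue for symbol in pattern)


def matches_repeat(sequence, unit):
    """Check whether a sequence follows the given H/P repeat unit throughout."""
    hp = to_hp(sequence)
    expected = (unit * (len(hp) // len(unit) + 1))[: len(hp)]
    return hp == expected


if __name__ == "__main__":
    strand = fill_pattern("HP", 4, "V", "T")        # beta-strand pattern (HP)n
    helix = fill_pattern("HPPPHPPP", 2, "L", "K")   # helix pattern (HPPPHPPP)n
    print(strand, to_hp(strand))
    print(helix, to_hp(helix))
```

Valine and threonine are used for the strand and leucine and lysine for the helix because the text lists them among residues that prefer β-structures and helices, respectively; a real design would mix residues rather than repeat a single pair.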

Fig. 3 Peptide folding motifs and their basic sequence patterns. (a) α-helix, β-strand and polyproline (collagen) helix (2ZTA, 1ICO and 1CGD PDB entries rendered with PyMol) and (b) their hydrogen-bond patterns and oligomerisation states – coiled coils and β-sheets (PDB entries 1IJ3 and 1JY4 rendered with PyMol). Dotted lines indicate hydrogen bonds. (c) Turn structures classified according to the number of peptide bonds between terminal residues locked in an intra-turn hydrogen bond.

3.3 Optimizing self-assembling motifs

The design of α-helix- and β-strand-based building blocks comes down to the conversion of folding motifs into self-assembling motifs. Design rules are similar except that the principal emphasis is on propagating oligomerisation. This is distinguished from low oligomerisation into, for example, β-pleated structures or helical bundles, which are non-propagating. Unlike autonomous (folding) and oligomerising (folding and oligomerisation) sequences, self-assembling motifs undergo all three stages, folding, oligomerisation and propagation, but are meant to become functional only at the final stage. Oligomerising sequences can be viewed as the simplest self-assembling motifs. Different approaches

used to design self-assembling motifs include those whose assembly modes are fully programmed in linear sequences and those that make use of auxiliary topological constraints.1,16,17,22 Self-assembling motifs produce a rather limited kit of nanoscale morphologies, many of which are proposed as materials for various uses. Designed self-assembled structures are often similar to naturally occurring analogues associated with nanostructured accumulations or deposits that derive from abnormal folding or misfolding and in vivo can develop into lesions, such as senile plaques, which can lead to undesired amyloidogenic conditions. In this context, self-assembling motifs can instigate pathogenic responses and one should take care to avoid adverse templating of de novo design.23 This is particularly topical for β-structure designs, which have a tendency for aggregation. Designing a discrete and stable β-structure remains one of the major design challenges. Helical self-assembling designs, although abundant in biology, including intermediate filaments, transmembrane pores and viral core shells,24–26 are difficult to control. Other approaches using low molecular weight amphiphiles or aromatic peptides can also give β-structured materials of comparable characteristics and properties.18 Most recent efforts have been focused on emulating naturally occurring collagen assembly. Collagen assembly forms the ECM and by association promotes cell growth and tissue development. Synthesis remains a major obstacle for collagen designs – a limitation imposed by the conserved

Fig. 4 Synthetic collagen. (a) Axial lysine-aspartate electrostatic interactions in the collagen triple helix. (Upper) amino acid residues in each vertical plane are approximately in the same cross section of the triple helix. O denotes 4-hydroxyproline. (Lower) model of an axial charge pair in a triple helix. (b) Nanoscale collagen fibers assembled from the peptides.29 Reprinted with permission from B. Sarkar, L. E. O’Leary and J. D. Hartgerink, J. Am. Chem. Soc., 2014, 136, 14417–14424. Copyright (2014) American Chemical Society.

sequence of glycine, proline and hydroxyproline residues, (Gly-Pro-HyPro)n. Varied compositions of this sequence pattern have been introduced in synthetic collagens to enable the assembly of collagen heterotrimers, which in some instances produce materials with properties reminiscent of native collagen materials.27,28 A promising strategy being explored is the use of oppositely charged domains in short collagen-like peptides, which can be achieved by introducing cationic (lysine) and anionic (aspartate) amino-acid residues into collagen amino-acid triplets.29 The design generates staggered triple helices held together by inter-chain electrostatic interactions, which ensures the formation of contiguous electrostatic networks between the domains, leading to extensive collagen fibrillar structures (Fig. 4). The sequence space for collagen helices is rather limited. In comparison, α-helical motifs provide a richer repertoire of candidate self-assembling motifs. In designing, or rather re-designing, these motifs the same strategy of arranging inter-chain contacts is applied. Helical bundles constitute a typical example. In these motifs individual amphipathic helices arrange around a shared hydrophobic interface with polar amino acids exposed to

Fig. 5 Helical folding motifs. (a) Helical bundles, structures (upper) and corresponding up-and-down topologies (lower) of different natural examples (2IJK, 1F4I, 2MHR, 1KRQ and 1EQ1 PDB entries rendered with PyMol). Cylinders indicate helices, N and C are amino- and carboxy-termini, respectively. (b) Helical-wheel representations and structures of leucine zipper domains: dimeric transcriptional activator GCN4 and trimeric cyclic nucleotide-gated channel (2ZTA and 3SWF PDB entries rendered with PyMol). abcdefg repeats are configured into helical wheels with 3.5 residues per turn. Electrostatic e–g interactions and hydrophobic a–a and d–d pairs are indicated by double-headed arrows.

water. The number of helices and sub-oligomers defines the type and topology of a given bundle. The lowest oligomers, two-helix bundles, tend to be unstable and dimerize into four-helix bundles, whereas three-, four- and five-helix bundles are common and stable (Fig. 5a).30 Helical bundles share many similarities with another folding motif, the coiled coil,31 which can also be defined as a bundle of unconnected helices, as opposed to helix bundles that are normally single-stranded structures. Because the assembly of coiled coils is independent of loops and turns, these structures are more readily amenable to design (Fig. 5b).32
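The helical wheels in Fig. 5b place the abcdefg repeat at 3.5 residues per turn, whereas a canonical α-helix has 3.6 residues per turn (Section 3.2). The arithmetic behind this mismatch can be sketched as follows. This is a minimal, hypothetical Python illustration that assumes an ideal helix with exactly 100° of rotation per residue and numbers residue positions from 0; it is not taken from the cited work.

```python
RESIDUES_PER_TURN = 3.6                       # canonical alpha-helix (Section 3.2)
DEG_PER_RESIDUE = 360.0 / RESIDUES_PER_TURN   # 100 degrees per residue


def wheel_angle(index):
    """Angular position (degrees) of the residue at a 0-based index."""
    return (index * DEG_PER_RESIDUE) % 360.0


def drift_per_repeat(repeat_length):
    """Signed angular offset left after one repeat, relative to whole turns.

    A negative value means equivalent positions fall progressively behind
    on the wheel; a positive value means they run ahead.
    """
    total = repeat_length * DEG_PER_RESIDUE
    return total - 360.0 * round(total / 360.0)


def seam_angles(position, n_repeats, repeat_length=7):
    """Wheel angles of one heptad position (0 = a, 3 = d) over several repeats."""
    return [
        round(wheel_angle(position + k * repeat_length), 1)
        for k in range(n_repeats)
    ]


if __name__ == "__main__":
    print(drift_per_repeat(7))   # seven residues fall 20 degrees short of two turns
    print(seam_angles(0, 3))     # position a over three heptads
    print(seam_angles(3, 3))     # position d over three heptads
```

Under these assumptions the a and d positions step back by 20° with every heptad, so the hydrophobic stripe winds slowly around the helix instead of running parallel to its axis.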

4 Current trends: functionally applicable designs

4.1 Pre-assembled or propagating oligomers

To be constrained into a specific bundle, coiled-coil helices interdigitate about a hydrophobic interface, which predominantly determines a particular coiled-coil architecture by packing in a “knobs-into-holes” manner.33 Canonical patterns dominate and are built from heptad repeats in which hydrophobic side chains alternate three and four residues apart, (HPPHPPP)n.31–33 Normally denoted abcdefg, with a and d being hydrophobic, this 3,4 hydrophobic pattern distinguishes coiled coils from another type of packing observed in globular proteins, known as “ridges-into-grooves”.34 Canonical 3,4 patterns can be extended to non-canonical patterns. For example, these include decads and undecads that show 3,4,3 (abcdefghij) and 3,4,4 (abcdefghijk) combinations of hydrophobic repeats, respectively.33 The rationale for the 3,4 patterns is, on the one hand, to create a contiguous hydrophobic seam on each helix, which would guarantee stable and discrete super-helix associations, and, on the other hand, to maximise burial of the seams via their precise matching in bundles. Because the average spacing of hydrophobic residues along the sequence (3.5 residues) falls short of one complete turn of an α-helix (3.6 residues), such a seam adopts a left-handed twist with respect to the right-handed helix, which allows the association of helices in the bundle with left-handed helix-crossing angles (Fig. 5b).35 The exact number of helices and their type (hetero- or homotypic) in the bundle are defined by the amino acid preferences of hydrophobic and electrostatic pairs. For example, a classical leucine zipper motif has leucines in d sites, which direct the assembly of parallel dimers. Higher oligomers of helix bundles and coiled coils also exist. These can be viewed as low self-assembled motifs. Six-helix bundles present a notable example, and are used by enveloped viruses (e.g. retroviruses, paramyxoviruses) to enable viral fusion with host cells.36 These motifs are of primary interest for developing anti-viral therapies by blocking tertiary contacts in the bundle (Fig. 6). Promising fusion inhibitors have been reported, including a fragment of the HIV gp41 protein, also known as Fuzeon or enfuvirtide, proposed as an anti-HIV drug.37 Stand-alone bundles are limited to pentamers, and some globular proteins are known to contain hexamers. Although information on these and larger oligomers is still scarce, the known bundles provide inspiration for new designs, with the emergence of engineered pentamers,38 hexamers39 and heptamers,40

Fig. 6 Native and de novo helical oligomers. (a) A stand-alone coiled-coil pentamer (1VDF PDB entry rendered with PyMol). (b) A designed coiled-coil oligomer, (left) a ribbon diagram of its 2.2 Å X-ray crystal structure; (right) a molecular surface representation of the assembled bundle with a central channel.39 Reprinted by permission from Macmillan Publishers Ltd: Nat. Chem. Biol., (N. R. Zaccai, B. Chi and A. R. Thomson et al., 2011, 7, 935–941), copyright (2011).

all of which can be used as building blocks in synthetic biology construction.

4.2 Directing longitudinal assembly

β-structure designs, which are intrinsically prone to undesirable aggregation, also provide ample opportunities for developing rational strategies to control self-assembly. Early evidence came from observations of amyloid-like properties of designed β-hairpin structures, which led to the conclusion that folded β-sheets can polymerise through edge-by-edge interactions and exhibit helical modes of assembly.41 Similar tendencies were found for single β-strands, linear peptide amphiphiles and two-strand β-hairpins, all of which were shown to form fibres with regular nanoscale paracrystalline features.18 In the case of hairpins, particular turn arrangements can be used to tune the assembly. By choosing between DPro and LPro in a proline dipeptide motif it is possible to tune the parallel or anti-parallel orientation of the β-strands, resulting in different fibrillar morphologies. Turns are often seen to promote externally triggered functions including hydrogelation, and their effect can also be extended to thermally reversible and anti-cancer hydrogels by making the β-strands in the hairpin exchangeable.42 Interestingly, the use of a racemic mixture of mirror-image β-hairpins provides hydrogels that are four times more rigid than gels formed by either peptide alone. It appears that enantiomeric peptides co-assemble in an alternating fashion along the fibril long axis, forming an extended heterochiral pleat-like β-sheet, a type of structure that was predicted by Pauling and Corey back in 1953 (Fig. 7).43 Similarly, β-pleated aggregates and α-helical coiled coils are being used in attempts to tailor hairpin architectures that furnish capsule- or cage-like assemblies.44 These nanostructures make use of symmetric reversible assemblies of self-oligomerising monomers that can be borrowed from known virus-derived folds or designed using structural analogues found

Fig. 7 Directed aggregation of β-hairpins by fine-tuning turns. A schematic (a) and electron micrographs (b) for the formation of enantiomeric peptides that take two assembly pathways – self-sorting (solid-coloured fibrils) and co-assembly resulting in mixed fibrils (intermittently coloured fibrils). (b) Co-assembly of enantiomers as assessed by nanoparticle labeling using transmission electron microscopy (left), with arrows indicating examples of bound nanoparticles of two different sizes used (scale bar is 200 nm); and fluorescence quenching (right) of fibrils (black diamonds) formed from one-type hairpins (right, in a), from two types pre-formed and mixed (white squares) and from the two types premixed to co-assemble (black squares).43 Reprinted with permission from K. Nagy-Smith, P. J. Beltramo, E. Moore, R. Tycko, E. M. Furst and J. P. Schneider, ACS Central Science, 2017, 3(6), 586–597, http://dx.doi.org/10.1021/acscentsci.7b00115. Copyright (2017) American Chemical Society. For further permissions related to this figure contact the ACS.

in capsid proteins. Although none of these is strictly de novo, the designs hold promise as generic building blocks provided that they can reproducibly assemble into predictable (directed) assemblies (Fig. 8). In addition, such structures are starting to find use in gene and drug delivery, which suggests that spherical nanostructures can avoid environmentally triggered transitions to non-controlled aggregation phases. Self-assembled encapsulants of nucleic acids have intrinsically great potential for synthetic biology and are particularly conducive to those applications where the use of nano-to-microscale particle carriers is beneficial.1

Fig. 8 Directed assembly of a virus-like topology (a) into synthetic virions shown in (b) an electron micrograph and (c) 3D rendering (upper) in comparison with the HIV-1 core assembly (lower).44 Reprinted with permission from J. E. Noble, E. De Santis, J. Ravi, B. Lamarre, V. Castelletto, J. Mantell, S. Ray and M. G. Ryadnov, J. Am. Chem. Soc., 138, 2016, 12202–12210. Copyright (2016) American Chemical Society.

The outlined designs are examples of rational, predicted developments. However, the resulting morphologies are often serendipitous. The reason for this is that peptide self-assembly intrinsically lacks control. Therefore, a crucial step in any design is to exclude self-assembling forms that are alternative to the desired or target morphology. Largely, this is the basis of peptide self-assembly.44 For the same reason, inspiration remains the main source of guidance, although it restricts, or rather preselects, the possible synthetic forms. For example, it is difficult to steer synthetic helical assemblies away from the common and abundant products of native systems, namely fibrils, rods, tubes and particles. It is not surprising therefore that a synthetic material can be traced back to a naturally occurring analogue. Designing extracellular matrices and virus-like particles is no exception. All these are complex systems; they do assemble from propagating folding motifs, though not necessarily following one assembly pathway. Even a seemingly simple filamentous assembly depends on at least two events – longitudinal and lateral association of building blocks.45,46 The exact factors that inter-relate different assembly processes have yet to be shown, which leaves most designs with empirical rather than purely rational approaches. This is where inspiration, or more precisely an opportunity to borrow, takes centre stage not only in choosing design

routes, but also in choosing potential applications and uses. There is hardly an object more daring for synthetic biology to produce than the synthetic versions of fully functional extracellular matrices or viruses. These two forms are as important from both pressure perspectives as they are unique in representing the two most fundamental forms of life at the macromolecular level – filamentous and spherical assemblies. The subsequent sections will highlight mainstream strategies in advancing these two areas of designer synthetic biology. The coverage is not meant to be exclusive or exhaustive, and will serve as a comprehensive, guiding sample to the exciting realm of bottom-up synthetic biology.

4.2.1 Synthetic bio-construction of extracellular matrices.

Synthetic extracellular matrices have become requisite for tissue engineering applications. An obvious question is: why extracellular and why synthetic? The very process of tissue regeneration may shed light on this. Regeneration is often presented as a saving solution to many health problems. This is so even though regeneration is not evolutionarily advantageous and remains characteristic of very few organisms.47 Some plants can fully re-grow from individual cells, invertebrates can be reborn from body parts and amphibians may regenerate lost limbs. But none of these is common for humans, even though certain tissues such as skin and blood undergo complete cellular renewal. Regardless of these differences, the purpose of regeneration remains to maintain the integrity of tissues and organs. This so-called “maintenance regeneration”48 relies on stem cells maintaining contact with their environment, which can trigger cell differentiation at any time. Injury-induced regeneration is a less common process. It is conditional and requires much faster regeneration turnover, depending on the ability of differentiated cells to regress back into stem cells. This is dedifferentiation. 
And again it is plants and invertebrates that make efficient use of the process for tissue repair.49 Human cells no longer have this ability. Instead, tissue regeneration is achieved through fibrosis – a process initiated by an inflammatory response leading to profuse fibrous tissue which remodels and matures into scars. Fibrosis can also heal extensive wounds that are beyond the capacity of regeneration-competent tissues.50 The process is not a regeneration mechanism since it does not support the restoration of a native tissue. Regardless, regeneration still relies on an excessive production of the extracellular matrix (ECM) – a cell-supporting material for tissue growth. Solely for this reason, synthetic ECM analogues are as popular as ever in the search for effective solutions to tissue regeneration.1 Peptide self-assembly offers a particularly attractive strategy as it can mimic native ECMs from the bottom up with relatively simple sequence designs.18 More importantly, however, it mimics the very process of ECM construction. The ECM is a collagenous fibrous material which gives tensile strength to most tissues and defines their shape and form. The matrix is a mesh-like structure supporting cell adhesion and proliferation, and acting as a storage depot of growth factors and cues for cellular homeostasis and durotaxis.

Published on 23 November 2017 on http://pubs.rsc.org | doi:10.1039/9781782622789-00115


The ECM is a living part of a tissue, which can be made highly specialized: enhanced porosity of the ECM enables molecular filtering in kidneys, while regularly tilted narrow lattices support the convex curvature of the cornea, affording optical transparency. Connective-tissue stromal ECMs that endow cartilage with elasticity and bone with fracture resistance require extensive extracellular connectors that are assembled from collagen types I and II. In basement membranes the enrichment of adhesion proteins compensates for lesser amounts of the assemblies and growth factors, while entactin and amorphous collagen IV provide boundaries between different tissue types and the controlled masking of enzymatically targeted cryptic sites. Nonetheless, any given matrix is built up from individual polypeptide modules into a highly repetitive modular morphology.51 The ECM selectively communicates with cells at the molecular level by multiplying signals that pass between messenger (matrix) molecules and cells. Extensive matrix surfaces transmit multiple signals via cell receptors to intracellular pathways, triggering specific cellular responses which support various biological functions ranging from cell adhesion to vascularization and organ morphogenesis.51 Collectively, these characteristics combine into a property of multiplying extra- and intracellular interactions, which constitutes a major objective for synthetic matrix designs. But from what should the most appropriate building blocks be chosen? Collagen is the main structural component of the ECM. It also accounts for a quarter of our total protein content, and the conserved role it plays in ontogenesis is consistent with the nature of the collagen folding motif, which is based on a tripeptide Gly-Xaa-Yaa, where Xaa and Yaa can be any amino acids but are often, and usually together, proline (Pro) and hydroxyproline (Hyp) (Fig. 4). However, chemical synthesis of proline- and glycine-rich sequences does not permit collagen-size building blocks.
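The Gly-Xaa-Yaa repeat described above lends itself to a simple sequence check. The following is a minimal sketch only, assuming one-letter amino-acid codes in which "O" stands for hydroxyproline (a common but not universal convention); the function names `is_collagen_like` and `triplets` are hypothetical and introduced purely for illustration.

```python
import re

# Assumption: one-letter amino-acid codes, with "O" used for hydroxyproline.
# A collagen-like stretch is modelled here as one or more Gly-Xaa-Yaa triplets.
TRIPLET_REPEAT = re.compile(r"(?:G[A-Z]{2})+")

def is_collagen_like(sequence):
    """Return True if the whole sequence consists of Gly-Xaa-Yaa triplets."""
    return TRIPLET_REPEAT.fullmatch(sequence.upper()) is not None

def triplets(sequence):
    """Split a sequence into consecutive tripeptide units."""
    return [sequence[i:i + 3] for i in range(0, len(sequence), 3)]

print(is_collagen_like("GPOGPOGPO"))  # three Gly-Pro-Hyp triplets
print(is_collagen_like("GPOAPOGPO"))  # glycine missing from the second triplet
print(triplets("GPOGPOGPO"))
```

Such a check captures only the primary-sequence pattern; it says nothing about whether a given peptide actually folds into a triple helix.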
Therefore, specific assembly modes are being devised for peptide collagen mimetics, including the outlined approaches (Fig. 5). Extended triple helices, which in native collagen fibres are generated by long type I collagen polypeptide chains, can be assembled using overlapping shorter peptides. By combining staggered peptide packing with forced inter-peptide interactions within triple helices and between the helices, heterotrimeric self-complementary short collagen-like sequences assemble into collagen-like fibres. The main advantages of synthetic collagens are their purity, more straightforward and scalable synthesis, and better control over final morphologies and physicochemical properties (Fig. 9).52 The collagen motif is repeated in native collagen sequences to make up polypeptide chains of 300 nm in length. These chains interdigitate into nanometer-thick bundles that are often termed tropocollagens, which are used as the building blocks of the ECM. Different collagen types (types I–V, IX and XI) are directly involved in ECM formation. Their structures exhibit a characteristic surface pattern of light and dark striations repeating every 67 nm, which is usually referred to as a D-period.53,54 This D-period is a length unit for tropocollagens, which are about 4.5 D-periods each and which spontaneously assemble into five-stranded


Fig. 9 Ribbon models of canonical and non-canonical collagen triple helices. Re-designed salt-bridges between the peptide chains of the triple helix promote a longer non-canonical offset resulting in longer and stickier collagen subunits that access new synthetic matrix-like morphologies.52 Reprinted with permission from A. Jalan, K. A. Jochim and J. Hartgerink, J. Am. Chem. Soc., 2014, 136, 7535–7538. Copyright (2014) American Chemical Society.

proto-fibrils of about 90 D-periods (~7 microns) by staggering side-by-side in parallel (Fig. 10a). Such proto-fibrils provide seed structures that associate into microscopic fibres and are composed of type I and II collagens, which are also the collagen types that cells produce in response to injury. Thus, collagen fibres are periodic paracrystals with a cylindrical form which ensures a minimal surface-area-to-volume ratio.

4.2.2 Nanoscale ordering – where synthetic meets native. Other filamentous assemblies, including ECM fibrin and intermediate filaments, are also highly periodic paracrystals.55–57 This suggests that protein filaments share the same generic principles irrespective of biological function and origin. Indeed, two generic processes are typical of filamentous assemblies. Firstly, building blocks in periodic fibres point in the same direction, i.e. the assembly is directional or polar.46 Secondly, maturation is achieved through lateral associations of early fibrils, or proto-fibrils. Thus, the ratio of polar fibrils can determine the form, composition and size of a resulting matrix as well as fibre branching. The latter is universal to filamentous assemblies and may constitute a third generic process; however, the primary role of this process is to shape fibres into the three-dimensional structure of the ECM.58 As an exemplar, fibrin matrices assemble from periodic half-staggered protofibrils that laterally wrap each other into clots of branched fibres (Fig. 10b).59 Branching points in these matrices are randomly separated at various distances, with the band patterns of diverging fibres being aligned. This suggests that individual fibres randomly split during the polymerization of building blocks. Alternatively, in other, more elastic fibres building blocks (e.g.
fibrillin) polymerize independently in a stepwise manner and merge at a final stage giving rise to cross-linked networks.60 In most matrices, however, fibres exhibit striated surface patterns similar to those of collagen (Fig. 10). Such banding patterns are characteristic of paracrystalline materials ordered at the nanoscale. In addition to collagen and fibrin fibres, lamin-derived intermediate


Fig. 10 Native extracellular matrices. (a) Electron micrographs of collagen fibrils with the characteristic pattern of light and dark stripes. The high-res inset shows the 8D-long region of polarity transition.53 Reprinted from Methods, 45, T. Starborg, Y. Liu, R. S. Meadows, K. E. Kadler and D. F. Holmes, Electron microscopy in cell-matrix research, 53–64, Copyright (2008), with permission from Elsevier. (b) Electron micrographs of fibrin clots (upper) and branching fibres (lower). The branched fibres show a characteristic band pattern of 22–23 nm.59 Reprinted from Biophys. Chem., 112, J. W. Weisel, The mechanical properties of fibrin for basic scientists and clinicians, 267–276, Copyright (2004), with permission from Elsevier. (c) Electron micrographs of naturally occurring lamin-derived intermediate filaments.61 Reprinted from J. Mol. Biol., 325, A. Karabinos, The single nuclear lamin of caenorhabditis elegans forms in vitro stable intermediate filaments and paracrystals with a reduced axial periodicity, 241–247, Copyright (2003), with permission from Elsevier.

filaments or microtubule-associated fibres show similar periodic banding patterns (Fig. 10c).61,62 Thus, the various forms of the ECM are built up from anisotropic fibrillar structures exhibiting a near-crystalline order. Synthetic fibres


have similar characteristics even though peptide building blocks are considerably smaller than those of native systems. Designed short fibres tend to show narrow banding patterns. For example, triple collagen helix protomers assemble into distinctive D-periodic fibres. These protomers contain three distinct domains, in which a (Pro-Hyp-Gly) domain forms a core flanked by two oppositely charged domains containing cationic arginine and anionic glutamate residues. The charged domains enable electrostatic interactions producing axially staggered triple helices, in which the central, neutral domain maintains the thermodynamically favoured network of hydrogen bonds supporting the assembly of collagen-like fibrils. In these fibres the axial D-period is formed by 63 residues, equating to a band of 18 nm. This is in good agreement with the 67 nm periodicity for native tropocollagens comprising over a thousand residues. Such a correlation between the peptide length and the D-periods suggests that a minimum nucleating unit of the assembly should be a lateral oligomer of protomers that stagger via regions of varied packing and alternating charge density.63 As with native non-collagen fibres, banding patterns are characteristic of synthetic non-collagen sequences. For example, coiled-coil longitudinal staggers reminiscent of those supporting the propagation of intermediate filaments were designed to assemble into rigid nano-ordered fibres with analogous striation patterns.64,65 Individual peptide blocks aligned longitudinally into infinite helical strands showed a 4-nm periodicity corresponding to one folded peptide.
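The correlation between peptide length and D-period quoted above can be made explicit with a short calculation. This is an illustrative sketch rather than an analysis taken from the cited work: the three numerical values come from the text, whereas the assumption that the native and synthetic triple helices share the same axial rise per residue is introduced here only for the purpose of the estimate.

```python
# Numbers quoted in the text above.
SYNTHETIC_PERIOD_NM = 18.0      # band length of the designed fibres
SYNTHETIC_PERIOD_RESIDUES = 63  # residues spanning one synthetic D-period
NATIVE_PERIOD_NM = 67.0         # native collagen D-period

def axial_rise_nm(period_nm, residues):
    """Axial length contributed by one residue, in nm."""
    return period_nm / residues

rise = axial_rise_nm(SYNTHETIC_PERIOD_NM, SYNTHETIC_PERIOD_RESIDUES)

# Assumption (not stated in the text): the native triple helix has the
# same axial rise per residue as the synthetic one.
native_residues_per_period = NATIVE_PERIOD_NM / rise

print(f"rise per residue: {rise:.3f} nm")
print(f"residues per native D-period: {native_residues_per_period:.1f}")
```

Under this assumption the synthetic band corresponds to just under 0.29 nm per residue, and a native 67 nm D-period would span roughly 230 residues, that is, a fraction of a tropocollagen chain of over a thousand residues.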
However, as candidates for ECM mimetics, these designs are too rigid to readily accommodate the morphological changes that are inevitable in dynamic cellular environments, and cannot support contiguous cellular recruitment across length scales larger than tens or hundreds of nanometres.66 In this regard, native extracellular matrices allow significant orthogonality in assembly, thereby generating multi-scale fibrillar networks and meshes.67 Achieving nanoscale spacings that closely match the span of one folded monomeric block is encouraging, as this implies that peptide blocks remain in register regardless of their sizes. Indeed, extended peptide sequences give larger periodicity.65 With such precision this mode of assembly has the potential for tuning fibre morphology, which could enable the rational programming of three-dimensional matrices. This can indeed be observed. For example, it proved possible not only to engineer fibre morphology,68 but also to generate polygonal fibrillar networks derivatised with oriented and surface-adjustable bio-functional elements (Fig. 11a).68–70 In this approach, the resulting fibre morphology can be reasonably predicted using an empirical algorithm for designing specialist peptides. These peptides are complementary to the main, fibre-forming building blocks, or standards. This allows them to co-assemble and direct the fibre assembly. Resulting changes in the shape of individual fibres derive from the kinetically controlled concentrations of the specialists in a growing fibre, which nucleate distinguishable morphologies (Fig. 11). However, the topology of specialists has to be distinct from that of standards. Therefore, the specialist sequences are based on those of standards as


Fig. 11 Tuning the morphology of self-assembling matrices. (a) Electron micrographs of various fibre morphologies assembled from a longitudinal fibre assembly (centre) with the help of specialist building blocks that caused (clockwise) fibre kinking, segmenting, networking, hyper-branching and branching combined with segmenting.70 Reprinted with permission from D. N. Woolfson and M. G. Ryadnov, MaP peptides: programming the self-assembly of peptide-based mesoscopic matrices, J. Am. Chem. Soc., 2005, 127, 12407–12415. Copyright (2005) American Chemical Society. (b) Electron micrographs of engineered matrices with nanoscale mesh sizes.72 Reprinted with permission from M. G. Ryadnov, A. Bella, S. Timson and D. N. Woolfson, J. Am. Chem. Soc., 2009, 131, 13240–13241. Copyright (2009) American Chemical Society.

topologically re-arranged, orthogonal, constructs. Specifically, the two oppositely charged fractions of standards can be separated to give four unique units that can then be re-coupled in different combinations to give a variety of head-to-head or tail-to-tail constructs. This introduces discontinuities into the assembly generating different fibre morphologies (Fig. 11), but also allows for structural manipulations of the obtained architectures. For instance, it is possible to adjust distances between kinks or vary branching density by subtle changes in the constructs.71 Although it is possible to render synthetic assemblies orthogonal with


the help of co-assembling specialist blocks, resulting morphologies span similar nanometer dimensions within which functional cell support may not be efficient (Fig. 11b).72

4.2.3 Expanding to the micro-scale – matching cellular length scales. Topology-based engineering can be applied to enable matrix formation rather than influence it. One effective format is the use of three-dimensional domain swapping, a molecular mechanism that proteins use to exchange identical structural domains between different folded monomers.73 Such monomers intertwine, but maintain intramolecular interactions. A filamentous assembly can be achieved using relatively simple peptides designed to have exchangeable β- or α-strands. With or without an external stimulus (elevated temperature, high salt) such a structure can switch from a random coil to a folded structure (β-sheet or α-helical coiled coil). Each strand is available for oligomerisation with the strand of another copy of the same peptide, with the formation of low oligomers that propagate into fibrous microscopic net-like matrices. Because these matrices are biocompatible and stimuli-responsive, they find use as scaffolds to support cell proliferation. The dimensional features of most similar synthetic materials are limited to the nanoscale, which may not be satisfactory for supporting interactions with live cells, the sizes of which exceed several microns. Therefore, the main emphasis in these designs is placed on a means to enable sub-millimetre matrix architectures that can more effectively support adhesion, growth and proliferation of mammalian cells. A solution to this problem was found in arbitrary self-assembly of low oligomeric units, as opposed to topology designs (Fig. 11), and was proposed to support multi-directional or promiscuous modes of assembly (Fig.
12a).74–76 In these assemblies, peptide blocks have two complementary coiled-coil domains that oligomerize by forming domain pairs with partners of other blocks such that interactions occur between different peptides and not within the same peptide (Fig. 12b). This is ensured by conjugating the domains via two short linkers and cyclizing them antiparallel to each other. The linkers provide sufficient spacing only for outward interactions of the antiparallel domains, thus resulting in a bi-faceted anisotropic block, which propagates laterally and infinitely through interfacial interactions of the two domains (Fig. 12a,b). The resulting matrices proved to be efficient in supporting the adhesion and proliferation of different human cells at rates comparable with those for collagen type I matrices (Fig. 12c).76 Furthermore, this design allows for several biological properties through the incorporation of bioactive amino-acid motifs and patterns, including cell adhesion motifs75 and antimicrobial stretches which, when multiplied in the resulting matrix, promote strong biofilm-resistant effects.76 Displaying biologically relevant functionalities is a critical property for biomimetic extracellular matrices. Selective biomolecular recruitment proves to be a popular approach, and focuses on the supramolecular decoration of fibrillar surfaces with biologically active molecules.69 In


Fig. 12 Assembling arbitrary matrices. (a) Optical micrographs of assembled microscopic matrices. (b) Arbitrary assembly of a cyclised peptide domain into a microscopic net-like matrix shown schematically (left) and in an optical micrograph (right).76 (c) Fluorescence micrographs of human osteoblasts (left) and fibroblasts (right) seeded on the matrix. Depth scale is 1 mm. (a and c) Reproduced from ref. 74 with permission from The Royal Society of Chemistry. (b) Reprinted with permission from N. Faruqui, A. Bella, J. Ravi, S. Ray, B. Lamarre and M. G. Ryadnov, Differentially instructive extracellular protein micro-nets, J. Am. Chem. Soc., 2014, 136, 7889–7898. Copyright (2014) American Chemical Society.

these designs engineered fibres are assembled from peptides modified with small-molecule ligands or peptide antigens. These then act as baits for capturing partner proteins from solution on the matrix surfaces. The


interactions can also be directly seeded in building blocks as shown in several other designs based on non-helical fibrillar systems. For example, a peptide amphiphile was designed to assemble into fibres whose hydrophobic core was formed by clustered alkyl chains, whereas the solvent-exposed surface of the fibres was formed by cell adhesion motifs at nearly van der Waals packing densities.77 This scaffold, which shared physico-chemical properties of the native ECM including nanoscale dimensions, fibrous morphology and cell adhesion, was successfully tested for the encapsulation of neural progenitor cells. Specifically, similar amphiphile designs with incorporated neurite-sprouting laminin epitopes assembled into dense gelled 3D fibrillar networks that were capable of embedding neurite cells.78 Cell-adhesion epitopes on the networks were at densities considerably higher than those typical for the native ECM, which was found to be sufficient in promoting cell signalling. Furthermore, a library of different molecules based on the same amphiphile structure but carrying different functional moieties can be used to assemble into multiply decorated fibrils with broadly distributed and statistically spaced motifs of different types. Such an amphiphile assembly was also shown to occur concomitantly with cell growth at extremely low concentrations, while being able to ensure mechanical and cell-adhesion support sufficient for directing cell migration, differentiation and growth. 
However, the concept of matrix decoration has been applied mostly in the context of integrin-binding motifs promoting cell adhesion.79–81 Alternative “physical” modification strategies can be demonstrated in the use of collagen mimetic peptides, designed as multimers of collagen triads, which exhibited strong affinity to type I collagen under controlled thermal conditions.82 These mimetics, which are intrinsically prone to form collagen-type triple helices, were shown to successfully bind to native collagen fibres by associating with their thermally disentangled domains. These peptides may find use in mimicking non-fibrous FACIT collagens, which could function as individual decorating units for the ECM, and by combining site-specific unfolding and intervention, matrix bio-functionalization can be extended into other applications.83 A notable example of biomimetic designs from unrelated sources is the use of phenylalanyl-phenylalanine (FF), which was identified as a core sequence involved in the formation of amyloid fibrils. FF can self-assemble into nanofibrous structures alone and as a part of longer peptides.84 More detailed research has revealed that derivatives based on this dipeptide motif are also prone to form nanostructures ranging from nanotubes to cages, while the same dipeptides derivatised with an N-(fluorenyl-9-methoxycarbonyl) (Fmoc) group can form gel scaffolds for the proliferation of chondrocytes in both two- and three-dimensional cell cultures.85,86 Needle-like fibrillar structures derive from π–π stacking interactions of aromatic phenylalanine rings. Most recently, it has been shown that phenylalanine residues can be incorporated into a heptad-long peptide that provides a minimal-length motif to allow for the modular design of surfactants.84 The peptides assemble in an orthogonal manner (Fig. 13a), a geometric arrangement which allows


Fig. 13 (a) Helical wheel representations of two asymmetric units (upper) and their packing in a perpendicular manner stabilised by hydrophobic and stacking interactions of Phe residues. (b) Cryo-electron micrographs of cylindrical micelle-like nanofibres formed at different concentrations.84 Reprinted under a Creative Commons CC-BY licence (http://creativecommons.org/licenses/by/4.0/) from S. Mondal et al., A minimal length rigid helical peptide motif allows rational design of modular surfactants, Nat. Commun., 2017, 8, 14018, http://dx.doi.org/10.1038/ncomms14018. Copyright © 2017, Rights Managed by Nature Publishing Group.

the peptides to accommodate π–π stacking of phenyl rings and stabilise the formation of low oligomers propagating into long fibrillar structures (Fig. 13b).


The designs perform a dual function of emulsifiers and thickeners and exhibit the highest emulsion stability reported to date, which makes them attractive for engineering biologically relevant emulsifiers.84 A greater morphological and functional diversification can be achieved using a greater variety and number of co-assembling building blocks. This provides several advantages, including better control over the main, bulk material form, which is seeded when different blocks are mixed, and the possibility of using linear and non-linear building blocks simultaneously to promote non-linear (orthotropic) assemblies. Individual fibres are examples of linear systems but, as many synthetic designs show, can readily undergo conversion into matrices whose dimensions are not limited to particular space or length scales. Fibres themselves can find different applications where their discrete morphology is requisite, in which case matrix formation would become an alternative assembly that has to be avoided. Other synthetic morphologies are orthotropic by definition and cannot be afforded by linear assemblies. Linear sequences are therefore designed to assemble orthogonally. A natural form that provides inspiration for this type of assembly is the architecture of viruses, protein shells and cages.

4.3 Synthetic artificial viruses

Mastering synthetic viruses is appealing from two main perspectives: their fascinating structure and exceptional functional effectiveness.87 Other biomaterials of the same morphology can be viewed as analogues of the virus structure; molecular storage and transport cages of clathrin and ferritin complexes as well as primitive bacterial organelles are structurally similar, but the functions they support (chemical conversions, mineral storage) are narrowly specialist in comparison. Arguably, it is the function of viruses that attracts the designer's attention. The potential of controllable and efficient gene delivery goes beyond one application or therapeutic involvement, representing a fundamental challenge, and consequently a fundamental reward, for biomedicine and other, not necessarily medical, applications.88 Synthetic viruses are promising as they would exhibit unique physicochemical properties making them advantageous over other materials. An exemplar property is the high surface-area-to-volume ratio of viral particles, which can multiply the surface exposure of an antigen to generate robust immune responses and enhance adsorption selectivity through chemical tailoring. Apart from obvious gene-delivery properties, synthetic viruses can give rise to powerful technology platforms allowing for the encapsulation and conversion of specific analytes at desired concentrations and their efficient traffic into cells and tissues.87,88 However, all these properties strongly depend on other specific requirements. These are dominated by monodispersity, agglomeration, stability and structural reproducibility, which are particularly important for the validation of virus-like materials in dynamic and mechanically aggressive environments. One route to address these and other potential issues is to nanostructure rather than merely nanosize these materials. In


other words, in order to afford reproducible, robust and stimuli-responsive materials, one has to develop materials that are structurally defined at the nanoscale. Viral particles are excellent examples. Many are well characterised with structures resolved at the atomistic level of detail, and all are highly reproducible assemblies.

4.3.1 Symmetry-driven capsular assembly. Viruses are little more than protein cages encapsulating a genetic cargo. They are monodisperse, stimuli-responsive and capable of self-assembly with or without their cargo. However, a number of undesired properties limit the systematic use of virus-based gene-delivery agents to in vitro experiments. Therefore, synthetic virus-like structures that can function as viruses, but lack their shortcomings, attract continuing interest. Two main approaches that hold the strongest promise can be identified. One is the persistent search for naturally occurring peptide motifs that can form capsule-like assemblies. These motifs do not necessarily derive from related assemblies and can even have unexpected origins. For instance, an elegant design is the modification of transmembrane domains of membrane proteins.89 These domains are hydrophobic sequences that, without the support of their membrane lipid environment, fold into compact and remarkably uniform spherical nanoparticles. The peptide sequences originate from a native protein and, as a consequence, the formed nanoparticles possess innate biological activity: these were shown to inhibit tumour metastasis associated with the protein and additionally, due to their ability to assemble into hollow cages, to encapsulate hydrophobic drugs for intracellular delivery. Thus, a dual biological function was demonstrated.89 The other approach looks to re-purpose known building blocks for cage-like assembly. For example, in one of the first designs of this type two protein domains are conjugated to support two different oligomerisation states (dimer and trimer).
The resulting chimera gives a cage-like tetrahedron assembly of 12 copies with an edge length of 12 nm.90 Later designs inspired by the concept diversified cage-like structures, allowing each cage to have a defined number of subunits of distinct architectures. These designs offer structural mimetics of artificial capsids whose exterior can be further modified with regular structural motifs. For example, a synthetic mimetic of tomato bushy stunt virus was shown to reversibly recruit coiled-coil domains on its surfaces, emulating a spiky morphology typical of some native viruses (Fig. 14a).91 All these approaches are based on linear sequences adopting predefined folding states. Alternative approaches use self-assembling non-linear structures. For instance, the co-assembly of linear coiled-coil sequences with three-arm dendrimer-like constructs, to set up C3-symmetry, generates multiple cavities of 5 nm in diameter.92 The cavities are confined to individual polynanocages resembling porous materials, which serve as nanoreactors by supporting a conversion of ionic silver into colloidal silver with diameters precisely matching the diameters of the cavities.92 Adopting this rationale in the co-assembly of linear coiled-coil dimers and trimers cemented by cysteine bridges yields


Fig. 14 Synthetic virus designs. (a) Schematic of an artificial viral capsid decorated with coiled-coil moieties providing rigid spikes. The self-assembling sequence derives from a naturally occurring β-annulus peptide fragment of tomato bushy stunt virus. (b) A C3 β-strand triskelion that self-assembles into hollow capsules, modelled schematically (upper) and shown in an atomic force micrograph (lower). Reproduced from ref. 91 and 97 with permission from The Royal Society of Chemistry.

cage-like nanoparticles that tend to fuse into larger aggregates,93 likely due to interfacial gelation effects characteristic of microgel structures.94 Unlike most synthetic designs to date, which are averaged in size and can often aggregate and precipitate, viruses exist in discrete, monodisperse forms of uniform sizes. What sets size limitations in viruses is their genetic cargo. Depending on it, viruses can vary from a few nanometers for satelliviruses to several hundreds of nanometers for megavirus.95 Despite the apparent diversity in size and function, all viruses are


symmetry-driven architectures. Thus, a desired virus-like capsule can be engineered to encapsulate, assemble or wrap around a size-defined material. Native encapsulating nanoparticles and the principles of their assembly are sufficiently described in the literature to aid in the design of biomimetic nanoparticles. However, the current pace with which artificial self-assemblies are being introduced remains to be improved. Perhaps the main reasons for this are the lack of robust structure prediction methods, which could provide generic rules for de novo construction, and the traditional notion of the entry efficiency of nanoparticles into host cells and tissues, which is being taken as an overriding design criterion. Structural and physicochemical properties representative of native architectures have been of much lesser concern, which has created a clear gap for the predictable design of virus-like assemblies. Encouragingly, existing evidence points to technologically promising biomimetic systems designed as geometric or symmetry-driven cages. In addition to those described above, de novo short sequences can be produced to build regular polyhedra.96 In this design, cages assemble from multiple copies of one asymmetric block, which also comprises two different oligomerization domains. The number of copies is defined by the least common multiple of the oligomerization states of the domains, which when superimposed onto the edges of a tripyramidal arrangement and linked together should assemble into a sphere. The domains are helical coiled-coil motifs that are arranged into pentamers and trimers along the five- and three-fold symmetry axes of a polyhedron. Thus, fifteen block copies can form an “even unit”, with several units assembling around a nanoparticle shape. Yet, structural polymorphism for these assemblies was also apparent. Irregularities in the geometry of the units and in their packing lead to sizes ranging from 15 to 45 nm.
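The counting rule behind the “even unit” can be illustrated with a few lines of arithmetic. This is a minimal sketch of the least-common-multiple rule as stated above, not the design procedure of the cited work; the function names `lcm` and `copies_per_even_unit` are hypothetical.

```python
from math import gcd

def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)

def copies_per_even_unit(oligomer_states):
    """Smallest number of block copies that fills every oligomer completely.

    oligomer_states lists the oligomerization state of each domain in the
    block, e.g. [5, 3] for one pentamer-forming and one trimer-forming domain.
    """
    copies = 1
    for state in oligomer_states:
        copies = lcm(copies, state)
    return copies

copies = copies_per_even_unit([5, 3])
print(copies)       # 15 block copies, as quoted in the text
print(copies // 5)  # number of complete pentamers formed by those copies
print(copies // 3)  # number of complete trimers formed by those copies
```

With pentamer- and trimer-forming domains the rule returns fifteen copies, which together complete three pentamers and five trimers with no domain left unpaired.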
Monodispersity in size is a necessary constraint ensuring the encapsulation of viral genes. For gene delivery applications, and in particular when these concern overcoming the tight junctions of the blood–brain barrier, small, monodisperse capsules are also necessary. Often, however, function defines form. Structurally plastic virus-like agents can find use in areas requiring dual functions, one of which is not typical of native viruses. With this in mind, plastic peptide capsules were designed as a structure–function platform for antimicrobial viruses.97 These capsules benefit from re-engineering a moderately antimicrobial peptide stretch of the breast milk protein lactoferrin into a self-assembling triskelion. Because the virus architecture adopts an n-fold rotational symmetry, where n is usually 3 or 5 or both, a triskel conjugate of the resulting sequence, RRWTWE, gives a self-assembling motif with a trilateral symmetry reminiscent of native virus-like subunits (Fig. 14b). Similar to viruses, these capsules self-assemble from individual β-strand subunits and effectively promoted gene delivery and even targeted gene silencing in live human cells. Unlike viruses, however, the capsules are not limited to hosting specific cargo and are structurally plastic. Also unlike viruses, which are not antimicrobial, these capsules were strongly antimicrobial. Specifically, their relative polydispersity of 0.02–1 µm allows them to attack bacterial cells on contact. With most bacteria having 0.2–2 µm

Published on 23 November 2017 on http://pubs.rsc.org | doi:10.1039/9781782622789-00115


dimensions, such capsules match the task well. Furthermore, the capsules caused distinctive pore-like lesions in membranes within the first minutes, which suggested a mechanism whereby building blocks penetrate bacterial lipid bilayers at one-strand depths as capsules land on membrane surfaces, followed by the conversion of the capsules into pores. Rigid viral capsids cannot readily adjust to encapsulate different cargo or more of the same cargo. Structurally plastic capsules are free of such constraints and can be used to deliver a range of cargo while providing antimicrobial protection. Thus, the concept holds promise as a structural platform for engineering biologically differential nanomaterials and adds to the growing synthetic biology toolkit, demonstrating the versatility of synthetic designs.

4.3.2 An ultimate inspiration. The highlighted and other designs98,99 may gradually conquer the notion of a synthetic virus, though with careful steps. An ultimate target is set by the virus architecture itself. All viruses, old and newly discovered, are now routinely categorised into geometry-defined groups that can be used as sketch templates for an intended synthetic structure.100 The concept of quasi-equivalent positions sets the basis for capsid designs. It offers an approach whereby protein subunits are packed around the surface of a sphere with a cubic symmetry (to provide a minimum free energy structure), may be positioned in similar, although not necessarily identical, environments (quasi-equivalent) with respect to each other, and are clustered into closed supramolecular networks forming hollow shells, or capsids (Fig. 15).101 Two empirical, hence useful, consequences of the concept are (1) the assembly is hierarchical; that is, protein subunits can be asymmetric and can cluster into larger subunits (capsomeres), which are intermediate to capsids, and (2) basic geometric parameters can be used to predict and direct capsid assemblies. Indeed, genetically unrelated viruses from different taxonomic groups can be described as proteinaceous icosahedra. Examples include human (rhinoviruses or adenoviruses), plant (rice yellow mottle or cowpea mosaic viruses), animal (porcine parvovirus), algal (phycodnavirus) and bacterial (tailless phages) assemblies, all of which are icosahedra. An icosahedron consists of 20 flat faces of equilateral triangles arranged around a sphere with 5:3:2 symmetry. Each face is subdivided into smaller triangles, or facets, the number of which is given by the triangulation number T (Fig. 15). T can be found graphically from an equilateral triangular net or using a simple equation: T = f²P, where f is any positive integer and P = h² + hk + k², with h and k being non-negative integers with no common factor. For instance, T = 1 describes an icosahedral structure with 20 triangular faces. Each face carries three identical protein subunits, giving a total of 60 identical proteins. A T = 4 virus will have each face comprising four smaller triangles, giving 240 protein blocks in total. These blocks can be organized as pentameric (12 at vertices of the faces) and hexameric (30 at the vertices of facets) capsomeres to render the arrangement of


Fig. 15 Virus architecture. (a) Triangulation numbers on a hexagonal p6 net. (0,0) indicates the origin of a five-fold vertex and the placement of a pentamer. The T number is calculated from (h,k) and denotes the number of smaller triangles, arranged around local quasi-six folds, in one triangular icosahedron facet. (b) An equilateral triangular net relating icosahedral and quasi-equivalent symmetries. The grey triangle with bold lines defines one face of the T = 3 surface lattice (h = 1, k = 1). A net of 20 identical faces, with each planar hexamer composed of six triangles (left). Convex pentamers are formed when one of the triangles is removed by connecting the two free edges of one pentagon with subsequent folding into a T = 3 icosahedron (right). (c) Examples of hexagonal nets for avibirnavirus (left) and sobemovirus (right), with sobemovirus folded into a capsid (d). Reproduced from VIPER (http://viperdb.scripps.edu) and M. Carrillo-Tripp, C. M. Shepherd and I. A. Borelli et al., VIPERdb2: an enhanced and web API enabled relational database for structural virology, Nucleic Acids Res., 2009, 37, D436–D442.100

60 pentameric and 180 hexameric subunits quasi-equivalent. Similarly, most T = 3 viruses will have 180 non-identical subunits (3 proteins multiplied by 60 triangles). Thus, for all T > 1 viruses, proteins will always have different numbers of neighbors and none in any given pair will be in a strictly equivalent environment (Fig. 15).101 Based on these conventions, it is plausible that lower triangulation numbers may help minimize the number of different polypeptide sequences and subunits in the final assembly. In other words, designing


Fig. 16 siRNA-complexing helical hairpins. (a) A schematic of a helical locker with its sequence configured into a helical net (left) and a hairpin structure shown as two helical cylinders locked via a hydrophobic interface. (b) Fluorescence micrographs of human cells expressing GFP (lighter tone) after incubation with siRNA, bare and complexed with lockers that delivered siRNA to knock down the production of GFP.104 Reprinted under a Creative Commons CC-BY licence (http://creativecommons.org/licenses/by/4.0/) from C. P. Guyader, B. Lamarre, E. De Santis, J. E. Noble, N. K. Slater and M. G. Ryadnov, Autonomously folded α-helical lockers promote RNAi, Sci. Rep., 2016, 6, 35012, http://dx.doi.org/10.1038/srep35012. Copyright © 2016, Rights Managed by Nature Publishing Group.

a T = 1 virus is deemed more straightforward than designing a T = 100 virus. Correlations also exist between T numbers and the number of proteins in the capsid, with larger capsids having higher T numbers.100 Nevertheless, an example of a synthetic capsid has yet to emerge. So has a synthetic virus that would have all the attributes of native viruses by being structurally well defined, infectious and self-replicating. It is hoped that such a virus will appear before long and that it will perform strictly assigned functions under self-maintained control. The importance of following the geometric principles of the virus architecture lies in the fact that symmetry-driven designs encapsulate rather


Fig. 17 Top-down and bottom-up synthetic biology strategies towards designing an artificial cell. In the top-down approach, the genome of a living organism is replaced with a synthetic analogue in order to reduce complexity and minimize the molecular content to maintain self-replication. In the bottom-up approach, artificial cells are assembled from non-living, artificial components that can reconstitute and replicate the key and most important properties of naturally occurring cells.106 Reprinted from Mater. Today, 19, 9, C. Xu, S. Hu, and X. Chen, Artificial cells: from basic science to applications, 516–532. Copyright (2016), with permission from Elsevier.

than complex nucleic acids. This proves to have considerable impact on genetic delivery and expression. RNA interference (RNAi) can serve as an example. RNAi is a recognized research tool for synthetic biology and therapeutic applications. The translation of the technology into an applied and commercial capability is, however, hampered by poorly understood relationships between RNA delivery and gene suppression, which invariably points to the lack of effective transfection reagents.102 Given that synthetic biology seeks to find analogues to native systems, non-viral transfection vectors and artificial viruses are attracting greater interest.103 Recently, it was shown that small interfering RNA (siRNA) can be delivered to human cells using de novo designed α-helical hairpins that cannot assemble but can effectively condense siRNA into relatively monodisperse nanoparticles. Several hairpin designs were shown to form nanoparticles that effectively translocate to the cytoplasm of treated

cells.104 However, only those that lock interfacing helices prior to binding to siRNA enable RNAi, whereas those that complex and lock siRNA in the nanoparticles cannot release it inside the cells, thereby preventing RNAi (Fig. 16).104 This finding is striking as it suggests that viruses encapsulate genes in shells to mitigate such problems. Indeed, similar helical designs that did not complex siRNA but encapsulated it by assembling around it in the capsid-like manner of native viruses showed more appreciable RNAi.44
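The triangulation-number relation quoted in section 4.3.2 lends itself to a short worked example. The following is a minimal sketch, assuming the standard Caspar and Klug counting rules for an idealised icosahedral capsid (60T subunits, 12 pentamers and 10(T − 1) hexamers); the function names are illustrative and are not taken from VIPERdb or any other library.

```python
from math import gcd


def triangulation_number(h: int, k: int) -> int:
    """Return T = h^2 + h*k + k^2 for lattice indices (h, k)."""
    if h < 0 or k < 0 or (h == 0 and k == 0):
        raise ValueError("h and k must be non-negative and not both zero")
    return h * h + h * k + k * k


def factorise(h: int, k: int) -> tuple[int, int]:
    """Split T into (f, P) so that T = f^2 * P, with P built from coprime indices."""
    f = gcd(h, k)
    return f, triangulation_number(h // f, k // f)


def capsid_counts(t: int) -> dict[str, int]:
    """Subunit and capsomere counts for an idealised icosahedral capsid (assumed rules)."""
    return {"subunits": 60 * t, "pentamers": 12, "hexamers": 10 * (t - 1)}


def allowed_t(max_index: int) -> list[int]:
    """All T values reachable with 0 <= k <= h <= max_index, h >= 1."""
    values = {
        triangulation_number(h, k)
        for h in range(1, max_index + 1)
        for k in range(h + 1)
    }
    return sorted(values)


if __name__ == "__main__":
    for h, k in [(1, 0), (1, 1), (2, 0), (2, 1)]:
        t = triangulation_number(h, k)
        print((h, k), factorise(h, k), capsid_counts(t))
    print(allowed_t(3))
```

Under these assumptions, (h, k) = (2, 0) gives f = 2 and P = 1, hence T = 4 with 240 subunits arranged as 12 pentamers and 30 hexamers, in line with the numbers given in the text. Real capsids can deviate from the idealised counts, so the sketch is a bookkeeping aid rather than a structural model.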

5 Future perspectives

Synthetic matrices and viruses that deliver specialist functions at will excite considerable interest in research communities.105 These designs conform to the same goal of replicating life using artificial means, contributing to one of the main concepts in synthetic biology, which anticipates the creation of a fully functional artificial cell (Fig. 17).106 To function, the latter has to be assembled first. Two broad approaches are the re-engineering of existing naturally occurring cells (top-down designs) and the reconstitution of a living cell from the bottom up using synthetic constituents (bottom-up designs). In this light, an artificial virus can be viewed as an example of a synthetic subcellular biologic. With very few exceptions, synthetic designs are merely structural mimetics, and unless they act and function like native viruses, these mimetics will remain reconstituted scaffolds of native forms. Engineering biology from scratch, however, offers considerable advantages. First, the bottom-up approach gives the designer substantial control over the physical and biological properties of the assembled structures, which are not limited to those normally observed in native systems. Second, purely artificial designs hold considerable potential for commercialization and consequently for accelerating industrial growth, improving healthcare and reducing manufacturing costs. This is the main reason why synthetic biologics are increasingly attractive to different industry sectors that are prepared to invest in synthetic biology developments to scale up more effective technologies at a reduced cost. This indicates that current developments are sufficiently advanced to estimate the return on investment. An ultimate adopter, however, resides in the clinic, where far more stringent criteria of product reliability and validation are applied. This is alien territory for synthetic biology, with much at stake.
Indeed, the restoration of damaged or ageing tissues is the challenge of the highest priority for global healthcare. For example, the cost to the UK health system of managing chronic wounds alone is conservatively estimated at £3 bn per year, which is around 3% of the total out-turn expenditure on health for the same period.107 The clinic requires materials and devices that can stimulate tissue restoration, ideally with no specialist training. Industry looks to develop and commercialize such materials at the lowest possible cost. Traditional barriers to commercialization in the regenerative medicine area remain the high costs of processing and managing personalized tissue treatments and batch-to-batch variations


of commercial biomaterials. Taken together, these prompt the need for cheaper, more efficient materials that can be generated under better control and whose performance can be assessed directly by cellular and tissue responses to designed synthetic biologics. The situation for virus-like synthetic biologics is no different. Molecular therapy has reached a critical point where quantitative control over macromolecular transfer is necessary for any further progress: current macromolecular drugs that modulate genetic reactions overcome the problems of stability, excretion and uptake by phagocytes, release from endosomes and entry into the nucleus. The major remaining barrier is untraceable transfer and uptake by the target cells as a function of the structural inconsistency of delivery vectors.108 To a large extent, these issues are attributed to the lack of developed metrology, which could otherwise address the poor understanding of factors influencing the uncertainty and reproducibility of biological function.109 Concerns over this will remain (e.g. there are no ISO standards for gene vector materials) unless such a capability is established first. Establishing it will facilitate the timely market entry of new technologies and pave the way for future molecular therapy standards. Yet standards serve innovation, which must continue, and the impact of which on industry is integral to that on healthcare and the clinic. For example, synthetic viruses can address and ultimately cure genetic disorders that comprise >15 000 different diseases. The most vulnerable age group is children, with over 30% of all infant deaths due to single-gene disorders that can be cured by a single macromolecular drug if delivered into the cell. Some 10–20% of all pediatric hospital admissions (average for the developed world) are for children with genetic disorders, whereas the mortality rate for boys with Duchenne muscular dystrophy (DMD), which can be treated only genetically, remains 100%.
This is where advancements in protein-based synthetic biology will have the greatest impact. An ultimate goal of synthetic biology is to apply engineering principles to promote the industrialisation of biology. The global market for synthetic biology is substantial, with conservative estimates, depending on the forecast, of $50B by 2025.110 A major challenge in engineering biology is the requirement for quantitative measurements to enable the predictable and reproducible design, testing and implementation of new biological systems for specific industrial applications. An effective solution to this problem is to bring together physical, engineering and biological capabilities into one measurement continuum spanning the functional and physical scales of synthetic biology. A critical aspect of this is the implementation of physical standards and accurate metrology across the biological continuum, starting with the functional biomolecular assemblies described herein. This will facilitate the industrialisation of biological design and the establishment of new bio-based manufacturing processes. Since engineering is at the heart of industry, and systemic design, which is based on the principles of modularisation, characterisation and standardisation, is at the heart of engineering, the principles of synthetic biology are transforming rapidly into a powerful engine of advanced manufacturing. The motivation for the sustainable


development of synthetic biology is driven by the impact it is poised to have on the economy, which has now been recognised on both sides of the Atlantic. It is now clear that synthetic biology will be a key driver in the growth of world-leading economies.

References

1 N. Kobayashi and R. Arai, Design and construction of self-assembling supramolecular protein complexes using artificial and fusion proteins as nanoscale building blocks, Curr. Opin. Biotechnol., 2017, 46, 57–65.
2 A. Ljubetič, H. Gradišar and R. Jerala, Advances in design of protein folds and assemblies, Curr. Opin. Chem. Biol., 2017, 40, 65–71.
3 M. K. Pastuszka and J. A. MacKay, Engineering structure and function using thermoresponsive biopolymers, WIREs Nanomed. Nanobiotechnol., 2016, 8, 123–138.
4 R. Sarpeshkar, Analog synthetic biology, Philos. Trans. A Math Phys. Eng. Sci., 2014, 372, 20130110.
5 I. Amit et al., Voices of biotech, Nat. Biotechnol., 2016, 34, 270–275.
6 http://www.rcuk.ac.uk/documents/publications/syntheticbiologyroadmap-pdf/.
7 D. Endy, Foundations for engineering biology, Nature, 2005, 438, 449–453.
8 M. Galdzicki et al., The Synthetic Biology Open Language (SBOL) provides a community standard for communicating designs in synthetic biology, Nat. Biotechnol., 2014, 32, 545–550.
9 R. L. Koder et al., Design and engineering of an O2 transport protein, Nature, 2009, 458, 305–309.
10 B. N. Armbruster, X. Li, M. H. Pausch, S. Herlitze and B. L. Roth, Evolving the lock to fit the key to create a family of G protein-coupled receptors potently activated by an inert ligand, Proc. Natl. Acad. Sci. U. S. A., 2007, 104, 5163–5168.
11 S. W. Englander and L. Mayne, The nature of protein folding pathways, Proc. Natl. Acad. Sci. U. S. A., 2014, 111, 15873–15880.
12 M. Heim, L. Römer and T. Scheibel, Hierarchical structures made of proteins. The complex architecture of spider webs and their constituent silk proteins, Chem. Soc. Rev., 2010, 39, 156–164.
13 H. Park, F. DiMaio and D. Baker, CASP11 refinement experiments with ROSETTA, Proteins, 2015, DOI: 10.1002/prot.24862.
14 Y. H. Wang, K. Y. Wei and C. D. Smolke, Synthetic biology: advancing the design of diverse genetic systems, Annu. Rev. Chem. Biomol. Eng., 2013, 4, 69–102.
15 L. D. van Vliet, P. Y. Colin and F. Hollfelder, Bioinspired genotype-phenotype linkages: mimicking cellular compartmentalization for the engineering of functional proteins, Interface Focus, 2015, 5, 20150035.
16 J. Kopeček and J. Yang, Smart self-assembled hybrid hydrogel biomaterials, Angew. Chem., Int. Ed. Engl., 2012, 51, 7396–7417.
17 N. Huebsch and D. J. Mooney, Inspiration and application in the evolution of biomaterials, Nature, 2009, 462, 426–432.
18 T. Aida, E. W. Meijer and S. I. Stupp, Functional supramolecular polymers, Science, 2012, 335, 813–817.
19 Y. Hsia, J. B. Bale and S. Gonen et al., Design of a hyperstable 60-subunit protein dodecahedron, Nature, 2016, 535, 136–139.
20 A. Saghatelian, Y. Yokobayashi, K. Soltani and M. R. Ghadiri, A chiroselective peptide replicator, Nature, 2001, 409, 797–801.

21 C. Mas-Moruno, F. Rechenmacher and H. Kessler, Cilengitide: the first anti-angiogenic small molecule drug candidate design, synthesis and clinical evaluation, Anticancer Agents Med. Chem., 2010, 10, 753–768.
22 M. G. Ryadnov and D. N. Woolfson, Engineering the morphology of a self-assembling protein fibre, Nat. Mater., 2003, 2, 329–332.
23 M. Jucker and L. C. Walker, Pathogenic protein seeding in Alzheimer disease and other neurodegenerative disorders, Ann. Neurol., 2011, 70, 532–540.
24 P. Natarajan, G. C. Lander, C. M. Shepherd, V. S. Reddy, C. L. Brooks and J. E. Johnson, Exploring icosahedral virus structures with VIPER, Nat. Rev. Microbiol., 2005, 3, 809–817.
25 A. M. Roseman, O. Borschukova, J. A. Berriman, S. A. Wynne, P. Pumpens and R. A. Crowther, Structures of hepatitis B virus cores presenting a model epitope and their complexes with antibodies, J. Mol. Biol., 2012, 423, 63–78.
26 P. D. Rakowska et al., Nanoscale imaging reveals laterally expanding antimicrobial pores in lipid bilayers, Proc. Natl. Acad. Sci. U. S. A., 2013, 110, 8918–8923.
27 M. D. Shoulders and R. T. Raines, Collagen structure and stability, Annu. Rev. Biochem., 2009, 78, 929–958.
28 J. A. Fallas, L. E. O’Leary and J. D. Hartgerink, Synthetic collagen mimics: self-assembly of homotrimers, heterotrimers and higher order structures, Chem. Soc. Rev., 2010, 39, 3510–3527.
29 B. Sarkar, L. E. O’Leary and J. D. Hartgerink, Self-Assembly of Fiber-Forming Collagen Mimetic Peptides Controlled by Triple-Helical Nucleation, J. Am. Chem. Soc., 2014, 136, 14417–14424.
30 R. B. Hill, D. P. Raleigh, A. Lombardi and W. F. DeGrado, De novo design of helical bundles as models for understanding protein folding and function, Acc. Chem. Res., 2000, 33, 745–754.
31 S. Köster, D. A. Weitz, R. D. Goldman, U. Aebi and H. Herrmann, Intermediate filament mechanics in vitro and in the cell: from coiled coils to filaments, fibers and networks, Curr. Opin. Cell Biol., 2015, 32, 82–91.
32 D. N. Woolfson, G. J. Bartlett, M. Bruning and A. R. Thomson, New currency for old rope: from coiled-coil assemblies to α-helical barrels, Curr. Opin. Struct. Biol., 2012, 22, 432–441.
33 A. N. Lupas and M. Gruber, The structure of alpha-helical coiled-coils, Adv. Protein Chem., 2005, 70, 37–78.
34 C. Chothia, M. Levitt and D. Richardson, Helix-to-helix packing in proteins, J. Mol. Biol., 1981, 145, 215–250.
35 F. H. C. Crick, The packing of alpha-helices: simple coiled-coils, Acta Crystallogr., 1953, 6, 689–697.
36 F. Naider and J. Anglister, Peptides in the treatment of AIDS, Curr. Opin. Struct. Biol., 2009, 19, 473–482.
37 M. J. Root, M. S. Kay and P. S. Kim, Protein design of an HIV entry inhibitor, Science, 2001, 291, 884–888.
38 J. Liu, W. Yong, Y. Deng, N. R. Kallenbach and M. Lu, Atomic structure of a tryptophan-zipper pentamer, Proc. Natl. Acad. Sci. U. S. A., 2004, 101, 16156–16161.
39 N. R. Zaccai, B. Chi and A. R. Thomson et al., A de novo peptide hexamer with a mutable channel, Nat. Chem. Biol., 2011, 7, 935–941.
40 J. Liu, Q. Zheng and Y. Deng et al., A seven-helix coiled coil, Proc. Natl. Acad. Sci. U. S. A., 2006, 103, 15457–15462.
41 H. Inouye, J. E. Bond, S. P. Deverin, A. Lim, C. E. Costello and D. A. Kirschner, Molecular organisation of amyloid protofilament-like assembly of beta-bellin 15D: helical array of beta-sandwiches, Biophys. J., 2002, 83, 1716–1727.
42 C. Sinthuvanich, A. S. Veiga, K. Gupta, D. Gaspar, R. Blumenthal and J. P. Schneider, Anticancer β-hairpin peptides: membrane-induced folding triggers activity, J. Am. Chem. Soc., 2012, 134, 6210–6217.
43 K. Nagy-Smith, P. J. Beltramo, E. Moore, R. Tycko, E. M. Furst and J. P. Schneider, Molecular, Local, and Network-Level Basis for the Enhanced Stiffness of Hydrogel Networks Formed from Co-assembled Racemic Peptides: Predictions from Pauling and Corey, ACS Cent. Sci., 2017, 3, 586–597.
44 J. E. Noble, E. De Santis, J. Ravi, B. Lamarre, V. Castelletto, J. Mantell, S. Ray and M. G. Ryadnov, A De Novo Virus-Like Topology for Synthetic Virions, J. Am. Chem. Soc., 2016, 138, 12202–12210.
45 E. De Santis, N. Faruqui, J. Noble and M. G. Ryadnov, Exploitable length correlations in peptide nanofibres, Nanoscale, 2014, 6, 11425–11430.
46 A. Bella, M. Shaw, S. Ray and M. G. Ryadnov, Filming protein fibrillogenesis in real time, Sci. Rep., 2014, 4, 7529, DOI: 10.1038/srep07529.
47 R. J. Goss, A History of Regeneration Research, ed. C. E. Dinsmore, Cambridge University Press, Cambridge, 1991, pp. 7–23.
48 A. S. Alvarado and P. A. Tsonis, Bridging the regeneration gap: genetic insights from diverse animal models, Nat. Rev. Genet., 2006, 7, 873–884.
49 P. A. Tsonis, Regenerative biology: the emerging field of tissue repair and restoration, Differentiation, 2002, 70, 397–409.
50 G. C. Gurtner, S. Werner, Y. Barrandon and M. T. Longaker, Wound repair and regeneration, Nature, 2008, 453, 314–321.
51 D. E. Discher, D. J. Mooney and P. W. Zandstra, Growth factors, matrices, and forces combine and control stem cells, Science, 2009, 324, 1673–1677.
52 A. Jalan, K. A. Jochim and J. D. Hartgerink, Rational design of a non-canonical “sticky-ended” collagen triple helix, J. Am. Chem. Soc., 2014, 136, 7535–7538.
53 T. Starborg, Y. Liu, R. S. Meadows, K. E. Kadler and D. F. Holmes, Electron microscopy in cell-matrix research, Methods, 2008, 45, 53–64.
54 H. K. Graham et al., Identification of collagen fibril fusion during vertebrate tendon morphogenesis. The process relies on unipolar fibrils and is regulated by collagen-proteoglycan interaction, J. Mol. Biol., 2000, 295, 891–902.
55 C. Lépinoux-Chambaud and J. Eyer, Review on intermediate filaments of the nervous system and their pathological alterations, Histochem. Cell Biol., 2013, 140, 13–22.
56 B. T. Helfand, M. G. Mendez and S. N. Murthy et al., Vimentin organization modulates the formation of lamellipodia, Mol. Biol. Cell, 2011, 22, 1274–1289.
57 M. K. Gardner, A. J. Hunt, H. V. Goodson and D. J. Odde, Microtubule assembly dynamics: new insights at the nanoscale, Curr. Opin. Cell Biol., 2008, 20, 64–70.
58 J. W. Weisel and R. I. Litvinov, Mechanisms of fibrin polymerization and clinical implications, Blood, 2013, 121, 1712–1719.
59 J. W. Weisel, The mechanical properties of fibrin for basic scientists and clinicians, Biophys. Chem., 2004, 112, 267–276.
60 M. J. Sherratt, C. Baldock, J. L. Haston, D. F. Holmes, C. J. Jones, C. A. Shuttleworth, T. J. Wess and C. M. Kielty, Fibrillin microfibrils are stiff reinforcing fibres in compliant tissues, J. Mol. Biol., 2003, 332, 183–193.
61 A. Karabinos et al., The single nuclear lamin of Caenorhabditis elegans forms in vitro stable intermediate filaments and paracrystals with a reduced axial periodicity, J. Mol. Biol., 2003, 325, 241–247.

62 K.-F. Lechtreck, Analysis of striated fiber formation by recombinant SF-assemblin in vitro, J. Mol. Biol., 1998, 276, 423–438.
63 S. Rele, Y. Song and R. P. Apkarian et al., D-Periodic Collagen-Mimetic Microfibers, J. Am. Chem. Soc., 2007, 129, 14780–14787.
64 T. H. Sharp et al., Cryo-transmission electron microscopy structure of a gigadalton peptide fiber of de novo design, Proc. Natl. Acad. Sci. U. S. A., 2012, 109, 13266–13271.
65 D. Papapostolou, A. M. Smith, E. D. Atkins, S. J. Oliver, M. G. Ryadnov, L. C. Serpell and D. N. Woolfson, Engineering nanoscale order into a designed protein fiber, Proc. Natl. Acad. Sci. U. S. A., 2007, 104, 10853–10858.
66 M. D. Mager, V. LaPointe and M. M. Stevens, Exploring and exploiting chemistry at the cell surface, Nat. Chem., 2011, 3, 582–589.
67 A. E. Brown, R. I. Litvinov, D. E. Discher, P. K. Purohit and J. W. Weisel, Multiscale mechanics of fibrin polymer: gel stretching with protein unfolding and loss of water, Science, 2009, 325, 741–744.
68 M. G. Ryadnov and D. N. Woolfson, Engineering the morphology of a self-assembling protein fibre, Nat. Mater., 2003, 2, 329–332.
69 M. G. Ryadnov and D. N. Woolfson, Fiber recruiting peptides: noncovalent decoration of an engineered protein scaffold, J. Am. Chem. Soc., 2004, 126, 7454–7455.
70 M. G. Ryadnov and D. N. Woolfson, MaP peptides: programming the self-assembly of peptide-based mesoscopic matrices, J. Am. Chem. Soc., 2005, 127, 12407–12415.
71 D. N. Woolfson and M. G. Ryadnov, Peptide-based fibrous biomaterials: Some things old, new and borrowed, Curr. Opin. Chem. Biol., 2006, 10, 559–567.
72 M. G. Ryadnov, A. Bella, S. Timson and D. N. Woolfson, Modular design of peptide fibrillar nano- to microstructures, J. Am. Chem. Soc., 2009, 131, 13240–13241.
73 N. L. Ogihara, G. Ghirlanda and J. W. Bryson et al., Design of three-dimensional domain-swapped dimers and fibrous oligomers, Proc. Natl. Acad. Sci. U. S. A., 2001, 98, 1404–1409.
74 L. Zajiczek, M. Shaw, N. Faruqui, A. Bella, V. M. Pawar, M. A. Srinivasan and M. G. Ryadnov, Nano-mechanical single-cell sensing of cell-matrix contacts, Nanoscale, 2016, 8, 18105–18112.
75 A. Bella, S. Ray, M. Shaw and M. G. Ryadnov, Arbitrary self-assembly of peptide extracellular microscopic matrices, Angew. Chem., Int. Ed., 2012, 51, 428–431.
76 N. Faruqui, A. Bella, J. Ravi, S. Ray, B. Lamarre and M. G. Ryadnov, Differentially instructive extracellular protein micro-nets, J. Am. Chem. Soc., 2014, 136, 7889–7898.
77 J. D. Hartgerink, E. Beniash and S. I. Stupp, Peptide-amphiphile nanofibers: a versatile scaffold for the preparation of self-assembling materials, Proc. Natl. Acad. Sci. U. S. A., 2002, 99, 5133–5138.
78 G. A. Silva, C. Czeisler and K. L. Niece et al., Selective differentiation of neural progenitor cells by high-epitope density nanofibers, Science, 2004, 303, 1352–1355.
79 T. Scheibel, R. Parthasarathy and G. Sawicki et al., Conducting nanowires built by controlled self-assembly of amyloid fibers and selective metal deposition, Proc. Natl. Acad. Sci. U. S. A., 2003, 100, 4527–4532.
80 M. O. Guler, L. Hsu and S. Soukasene et al., Presentation of RGDS epitopes on self-assembled nanofibers of branched peptide amphiphiles, Biomacromolecules, 2006, 7, 1855–1863.

81 Y. Ruff, T. Moyer, C. J. Newcomb, B. Demeler and S. I. Stupp, Precision templating with DNA of a virus-like particle with peptide nanostructures, J. Am. Chem. Soc., 2013, 135, 6211–6219.
82 Y. Li and S. M. Yu, Targeting and mimicking collagens via triple helical peptide assembly, Curr. Opin. Chem. Biol., 2013, 17, 968–975.
83 Y. Li, C. A. Foss and D. D. Summerfield et al., Targeting collagen strands by photo-triggered triple-helix hybridization, Proc. Natl. Acad. Sci. U. S. A., 2012, 109, 14767–14772.
84 S. Mondal, M. Varenik, D. N. Bloch, Y. Atsmon-Raz, G. Jacoby, L. Adler-Abramovich, L. J. Shimon, R. Beck, Y. Miller, O. Regev and E. Gazit, A minimal length rigid helical peptide motif allows rational design of modular surfactants, Nat. Commun., 2017, 8, 14018.
85 Y. Zhang, H. Gu, Z. Yang and B. Xu, Supramolecular hydrogels respond to ligand-receptor interaction, J. Am. Chem. Soc., 2003, 125, 13680–13681.
86 M. Zhou, A. M. Smith and A. K. Das et al., Self-assembled peptide-based hydrogels as scaffolds for anchorage-dependent cells, Biomaterials, 2009, 30, 2523–2530.
87 E. Mastrobattista, M. A. van der Aa, W. E. Hennink and D. J. Crommelin, Artificial viruses: a nanotechnological approach to gene delivery, Nat. Rev. Drug Discov., 2006, 5, 115–121.
88 B. Lamarre and M. G. Ryadnov, Self-assembling viral mimetics: one long journey with short steps, Macromol. Biosci., 2011, 11, 503–513.
89 S. G. Tarasov, V. Gaponenko and O. M. Howard et al., Structural plasticity of a transmembrane peptide allows self-assembly into biologically active nanoparticles, Proc. Natl. Acad. Sci. U. S. A., 2011, 108, 9798–9803.
90 J. E. Padilla, C. Colovos and T. O. Yeates, Nanohedra: using symmetry to design self assembling protein cages, layers, crystals, and filaments, Proc. Natl. Acad. Sci. U. S. A., 2001, 98, 2217–2221.
91 S. Fujita and K. Matsuura, Self-assembled artificial viral capsids bearing coiled coils at the surface, Org. Biomol. Chem., 2017, 15, 5070–5077.
92 M. G. Ryadnov, A self-assembling peptide polynanoreactor, Angew. Chem., Int. Ed., 2007, 46, 969–972.
93 J. M. Fletcher, R. L. Harniman and F. R. Barnes et al., Self-assembling cages from coiled-coil peptide modules, Science, 2013, 340, 595–599.
94 A. Luchini, D. H. Geho and B. Bishop et al., Smart hydrogel particles: biomarker harvesting: one-step affinity purification, size exclusion, and protection against degradation, Nano Lett., 2008, 8, 350–361.
95 D. Raoult and P. Forterre, Redefining viruses: lessons from Mimivirus, Nat. Rev. Microbiol., 2008, 6, 315–319.
96 S. Raman, G. Machaidze, A. Lustig, A. Aebi and P. Burkhard, Structure-based design of peptides that self-assemble into regular polyhedral nanoparticles, Nanomedicine, 2006, 2, 95–102.
97 V. Castelletto, E. De Santis and H. Alkassem et al., Structurally plastic peptide capsules for synthetic antimicrobial viruses, Chem. Sci., 2016, 7, 1707–1711.
98 A. Hernandez-Garcia, D. J. Kraft and A. F. J. Janssen et al., Design and self-assembly of simple coat proteins for artificial viruses, Nat. Nanotechnol., 2014, 9, 698–702.
99 R. Ni and Y. Chau, Structural Mimics of Viruses Through Peptide/DNA Co-Assembly, J. Am. Chem. Soc., 2014, 136, 17902–17905.
100 M. Carrillo-Tripp, C. M. Shepherd and I. A. Borelli et al., VIPERdb2: an enhanced and web API enabled relational database for structural virology, Nucleic Acids Res., 2009, 37, D436–D442.

View Online

101 102

Published on 23 November 2017 on http://pubs.rsc.org | doi:10.1039/9781782622789-00115

103 104

105 106 107 108 109

110

D. L. Caspar and A. Klug, Physical Principles in the Construction of Regular Viruses, Cold Spring Harb. Symp. Quant. Biol., 1962, 27, 1–24. A. de Fougerolles, H. P. Vornlocher, J. Maraganore and J. Lieberman, Interfering with disease: a progress report on siRNA-based therapeutics, Nat. Rev. Drug. Discov., 2007, 6, 443–453. R. Kanasty, J. R. Dorkin, A. Vegas and D. Anderson, Delivery materials for siRNA therapeutics, Nat. Mater., 2013, 12, 967–977. C. P. Guyader, B. Lamarre, E. De Santis, J. E. Noble, N. K. Slater and M. G. Ryadnov, Autonomously folded a-helical lockers promote RNAi, Sci. Rep., 2016, 6, 35012. E. De Santis and M. G. Ryadnov, Peptide self-assembly for nanomaterials: the old new kid on the block, Chem. Soc. Rev., 2015, 44, 8288–8300. C. Xu, S. Hu and X. Chen, Artificial cells: from basic science to applications, Mater. Today, 2016, 19, 516–532. J. Posnett and P. J. Franks, The burden of chronic wounds in the UK, Nursing Times, 2008, 104, 44–45. EMEA Guidance on the Quality, preclinical and clinical aspects of gene transfer medicinal products, EMEA/273974/05. A. L. Plant, L. E. Locascio, W. E. May and P. D. Gallagher, Improved reproducibility by assuring confidence in measurements in biomedical research, Nat. Methods, 2014, 11, 895–898. https://connect.innovateuk.org/web/synthetic-biology-special-interest-group/ 2016-uk-synbio-strategic-plan.

154 | Synthetic Biology, 2018, 2, 115–154

Synthetic extracellular matrix approaches for the treatment of myocardial infarction

Published on 23 November 2017 on http://pubs.rsc.org | doi:10.1039/9781782622789-00155

Sergio Spaans (a,b), Noortje A. M. Bax (a,b), Carlijn V. C. Bouten (a,b) and Patricia Y. W. Dankers* (a,b,c)

DOI: 10.1039/9781782622789-00155

Injectable biomaterials that mimic one or more functions of the natural extracellular matrix (ECM) are used in therapies that aim to improve cardiac repair and induce regeneration of the heart. In this chapter we discuss recent trends in synthetic ECM approaches that use injectable biomaterials to repair and regenerate the heart, and relate these to the design criteria for truly mimicking (parts of) the natural ECM in a synthetic way. To explain the design criteria, we first discuss the healthy and the adversely remodeled cardiac niche, in which the ECM and soluble factors play important roles, followed by the remodeling processes and cardiac performance post-myocardial infarction. Finally, we review various synthetic ECM approaches for the treatment of the infarcted heart, i.e. (i) hydrogels that support the myocardial matrix via mechanical and/or bioactive cues, (ii) hydrogel materials that release bioactive factors, (iii) materials that deliver cells, and (iv) hydrogel materials that induce cell recruitment.

1 Introduction

The high mortality from heart failure calls for new therapies for patients who have suffered a myocardial infarction (MI). An MI occurs when one of the coronary arteries, the blood vessels that supply the heart, becomes occluded, which locally deprives the heart of blood and oxygen. The resulting hypoxic environment causes massive death of cardiomyocytes, the contractile cells of the heart. Subsequently, a cascade of cellular processes and remodeling events takes place that ends in scar formation to compensate for the loss of cardiomyocytes. This inevitably leads to a drastic decrease in contractile function of the left ventricle and, eventually, heart failure.1 Current therapies that aim to prevent MI from developing into total heart failure include drug, stem cell, gene and biomaterial-based therapies. When these therapies do not yield the required effect, the only remaining options are heart transplantation or the application of a left ventricular assist device (LVAD). Unfortunately, the number of donors is limited, and LVADs are used to bridge the gap towards heart transplantation or even as "destination" therapy. Therefore, novel therapies are required that restore the loss of cardiomyocytes and support fast repair and ultimately regeneration of the heart after MI. Stem cells have the potential to regenerate cardiac tissue and provide a long-term solution. Cardiac stem cell therapy has previously been studied by injecting stem cells into the infarcted region.2 Only modest and temporary improvements were observed.3 The limited improvement is due to very low stem cell engraftment and retention following injection.4 One way to address the low retention is to combine stem cells with biomaterials/synthetic extracellular matrices (ECM) to improve engraftment. Designing biomaterials that mimic the functionality of ECM components is known as a synthetic ECM approach. The ideal synthetic ECM should be able to recapitulate native cardiac tissue. Different biomaterial approaches have been pursued for the treatment of myocardial infarction, e.g. the application of cardiac tissue patches, the development of microtissues, and injectable hydrogels.5–7 Both the cardiac tissue patch and the microtissue approaches are based on the production of cardiac tissue in vitro and the implantation of these tissue-engineered constructs on or in the damaged heart.8 Importantly, injectable hydrogels allow for more variation in the regenerative therapy options, because they can function as passive support of the structurally weak heart tissue.9 Additionally, ECM-mimicking components can be incorporated to induce a certain degree of exogenous or endogenous repair of the cardiac tissue. In this chapter, we aim to show recent trends in synthetic ECM approaches to repair and regenerate the heart after MI. To this end, different options for treating post-MI damage with injectable hydrogels are reviewed, with a focus on the various ECM components that are mimicked or incorporated to improve cardiac function.

a Institute for Complex Molecular Systems, Eindhoven University of Technology, PO Box 513, 5600 MB Eindhoven, The Netherlands. E-mail: [email protected]
b Department of Biomedical Engineering, Soft Tissue Engineering and Mechanobiology, Eindhoven University of Technology, PO Box 513, 5600 MB Eindhoven, The Netherlands
c Department of Biomedical Engineering, Laboratory of Chemical Biology, Eindhoven University of Technology, PO Box 513, 5600 MB Eindhoven, The Netherlands

Synthetic Biology, 2018, 2, 155–185. © The Royal Society of Chemistry 2018
We propose that regeneration of tissues can be stimulated and controlled using either bottom-up or top-down synthetic ECM approaches (Fig. 1). We define bottom-up ECM engineering as building a synthetic or (partly) natural ECM from single components that are either synthesized or extracted from natural sources. These components are proposed to assemble into large functional ECM aggregates and structures. This approach allows full control over the composition of the resulting ECM. Top-down ECM engineering uses the capacity of cells to produce and deposit ECM components in the extracellular environment. In this case the ECM components are secreted by the cells, leading to a natural microenvironment. This approach might benefit from materials that are designed to stimulate ECM secretion and, possibly, to capture the secreted ECM.

2 Healthy versus adverse remodeled cardiac niche

The heart contracts rhythmically, and with each contraction blood is pumped through the body. The heart wall consists of multiple layers. The largest and most important part of the heart wall is the myocardium, the cardiac muscle tissue. Facing the inside of the heart, in direct contact with blood, is the endocardium. Facing the outside of the heart is the epicardium, which functions as a protective layer and a lubricant during rhythmic contraction.10 The volume of the myocardium is mainly taken up by a contractile cell type, the cardiomyocyte (CM). CM make up approximately 90% of the total volume of a healthy human heart, but account for only 30–40% of the total cell number.11 These CM form a three-dimensional (3D) anisotropic network through specific cell–cell interactions, called intercalated discs, which feature a high concentration of gap junctions, desmosomes and tight junctions. The most abundant cell type in a normal human heart is the fibroblast, which represents 60–70% of the total cell number.11 However, the total volume of fibroblasts is much smaller than that of CM. Fibroblasts are located throughout the myocardium and are mainly responsible for maintenance of the extracellular matrix (ECM).11 Next to CM and fibroblasts, the myocardium contains a vascular network consisting of endothelial cells, smooth muscle cells and pericytes.11 Recent evidence shows that endogenous cardiac stem cells are found throughout the myocardium and can be activated after an MI.12 Cardiomyocyte progenitor cells (CMPC) are one such type of endogenous stem cell, present in both fetal and adult hearts.13 Since CMPC reside in the heart and can also be isolated from atrial appendages or biopsies, they seem ideal candidates for stem cell therapy.14

Fig. 1 Schematic representation of synthetic ECM approaches for treating MI. Top-down ECM engineering is based on the capacity of cells to produce ECM components and deposit them in their extracellular environment. Bottom-up ECM engineering is defined as building a synthetic or (partly) natural ECM from single components, which are made synthetically or extracted from natural sources.

2.1 Healthy cardiac ECM

The ECM is a large supramolecular network of cell-secreted components that surrounds all cells in tissue (Fig. 2). Healthy ECM typically accounts for 6.5% of total heart tissue.15 It is a highly organized and dynamic network that plays a crucial role in cell–matrix interactions. The ECM also provides mechanical support, regulates cell functions such as proliferation, adhesion and migration, and helps in cell–cell signaling by creating a pathway to guide signals. The ECM is also a reservoir of biochemical compounds,16 and there is a wide range of interactions between cells and their directly surrounding ECM.17 To illustrate this, the ECM can be divided into three categories, i.e. connective tissue, basement membrane and pericellular matrix (Fig. 2).

Fig. 2 Schematic representation of the ECM components present in healthy cardiac tissue.

The connective tissue is an anisotropic network in which cells are embedded. The stiffness of human myocardium ranges from 0.02 to 0.5 MPa and depends strongly on the composition of the connective tissue.18 The elastic modulus of healthy myocardium is typically 10–20 kPa during diastole and 200–500 kPa during systole.18,19 Connective tissue primarily consists of collagen types I, III and V, and elastin (Fig. 2, Connective tissue). Collagens are predominantly produced by the fibroblasts that surround CM.20 CMPC have also been shown to produce these types of collagen,21 which in turn contribute to CMPC differentiation into cardiomyocytes.22 Collagen types I and III are assembled from procollagen, which consists of three α-chains that assemble into a triple helix. Each chain has a specific repeating unit, glycine-X-Y, where X is usually proline and Y is usually hydroxyproline. Hydroxyproline is known to stabilize the triple helix through interchain hydrogen bonding. Procollagen is secreted by the cell and aggregates to form collagen fibrils and, finally, collagen fibers in the extracellular space.
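The glycine-X-Y repeat described above can be checked on a peptide sequence with a few lines of Python. This is a minimal sketch for illustration only: the function names, the one-letter encoding (with `O` standing for hydroxyproline, a common but non-standard convention) and the example peptide are assumptions and are not taken from the chapter.

```python
def gly_xy_fraction(sequence: str) -> float:
    """Return the fraction of complete in-frame triplets that start with glycine (G)."""
    seq = sequence.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing incomplete triplet
    triplets = [seq[i:i + 3] for i in range(0, usable, 3)]
    if not triplets:
        return 0.0
    return sum(t[0] == "G" for t in triplets) / len(triplets)


def count_gpo(sequence: str) -> int:
    """Count in-frame Gly-Pro-Hyp triplets, with 'O' denoting hydroxyproline."""
    seq = sequence.upper()
    return sum(seq[i:i + 3] == "GPO" for i in range(0, len(seq) - 2, 3))


# Hypothetical collagen-like peptide made of four Gly-X-Y triplets.
peptide = "GPOGPOGAOGPK"
print(gly_xy_fraction(peptide))  # 1.0
print(count_gpo(peptide))        # 2
```

A fraction of 1.0 means that every triplet in the chosen reading frame starts with glycine; real collagen chains also contain non-helical ends, which such a simple check does not account for.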
Collagen type I is the most abundant protein in myocardial tissue and is responsible for conducting contractile forces between cardiomyocytes throughout the connective tissue of the myocardium.23 Collagen type V is a less abundant fibrillar protein, but it has been implicated in the initiation of collagen type I fibril assembly.24 These fibrillar proteins provide the tissue with passive mechanical strength during rhythmic contraction. Elastin is a fibrillar protein that is also found in connective tissue (Fig. 2, Connective tissue). Synthesis of elastin starts with tropoelastin, which self-assembles and is cross-linked in microfibrils to form the final elastin fibril. Elastin gives the tissue resilience and thus allows it to return to its original shape after stretching.25 Less abundant components found in connective tissue include tenascin-C, decorin and vitronectin (Fig. 2, Connective tissue). Tenascin-C is a 239 kDa glycoprotein that is transiently expressed during cardiac development and under pathological conditions. It is closely related to cardiomyocyte differentiation during development, but enhances the inflammatory response following MI.26 Decorin is a small leucine-rich proteoglycan that contains binding sites for most fibrillar collagens and can sequester growth factors on its dermatan/chondroitin sulfate chain.27 Vitronectin is a glycoprotein found in connective tissue that is known to promote cell adhesion and is involved in fibronectin reorganization in the ECM.28

The basement membrane, also known as the basal lamina, is a thin and dense sheet of matrix molecules that connects the cell to the connective tissue layer. Its thickness typically ranges between 40 and 120 nm, and it consists of proteins and glycoproteins such as collagen types IV and VI, laminin-211, nidogen-1, fibronectin and perlecan-a1 (Fig. 2, Basement membrane). The basement membrane provides optimal structural and mechanical cross-talk between the cells and the tissue. Non-fibrillar collagen type IV is produced by cardiomyocytes and has more interruptions in its triple helix than fibrillar collagens such as collagen types I and III, which allows the formation of bent fibers and, ultimately, a dense network.29 Collagen type IV also provides strength to the basement membrane. The most abundant component of the basal lamina is laminin, which is largely responsible for the sheet structure. Laminin is a cross-shaped protein composed of three different peptide chains, i.e. α, β and γ. The longest arm of the cross (~80 nm) consists of the α-peptide chain intercoiled with the β- and γ-peptide chains (~35–50 nm).30 Different combinations of isoforms exist, depending on the tissue. For cardiac tissue the most common combination of isoforms is α2, β1 and γ1. Furthermore, laminin contains binding sites for different types of integrin, nidogen and collagen. Nidogen-1, also known as entactin, is a sulfated glycoprotein that is composed of two protein domains. It has been shown to promote cell adhesion and binds directly to laminin, collagen type IV, fibronectin and fibrinogen. It has also been shown to play a role in hemostasis and wound healing.31 Fibronectin is a large 262 kDa glycoprotein that consists of two identical protein domains connected via a disulfide bond. Each protein domain contains separate smaller domains that bind to cells, collagen and heparin. The cell-binding domain contains the well-known RGD peptide sequence, which connects to the internal cytoskeleton via the transmembrane protein integrin. Another interesting property of fibronectin is that it only assembles and polymerizes at the cell surface.
This is due to stretching of the fibronectin protein as a consequence of the integrin–fibronectin linkage, which exposes binding sites for other fibronectin molecules and leads to fibril formation at the basement membrane.25 Perlecan-a1, a proteoglycan that is rich in heparan sulfate, is also found in the basement membrane.32,33

The pericellular matrix, also known as the glycocalyx, is the layer directly surrounding each cell (Fig. 2, Pericellular matrix). Its thickness varies between 10 and 1000 nm, depending on the cell and tissue type. The cardiac pericellular matrix consists of glycosaminoglycans (GAG) and proteoglycans (PG), which form a dense and hydrated gel in which the cells and other ECM components are embedded.34 PG consist of core protein sequences to which the GAG are covalently attached. GAG are highly charged, which attracts ions and increases the osmotic pressure. This mechanism increases the water content of the tissue, thereby creating a natural gel that is very resistant to compressive forces. GAG are unbranched, linear polysaccharides in which the specific repeating disaccharide unit determines the type. Well-studied GAG are hyaluronic acid, heparan sulfate, dermatan sulfate and chondroitin sulfate. Hyaluronic acid (HA) is a simple, unsulfated GAG composed of repeating units of N-acetylglucosamine and glucuronic acid (Fig. 2, Pericellular Matrix).35 HA is generally not cross-linked to a protein core; it is an important space filler during development, resists compressive forces and is upregulated during wound healing and post-MI.36,37 Heparan sulfate is also composed of N-acetylglucosamine and glucuronic acid, but carries sulfate groups on its sugar units. Dermatan sulfate is composed of N-acetylgalactosamine and iduronic acid (with some glucuronic acid), and chondroitin sulfate is composed of N-acetylgalactosamine and glucuronic acid. PG play an important role in the transport of soluble factors to the cells, either by enhancing diffusion through the creation of more pores or by inhibiting signaling capacity through immobilization. Syndecan-2, -3 and -4 are transmembrane PG that have been detected in myocardial tissue and play a role in adhesion and signal transduction.38 Finally, versican is a 370 kDa PG found in the pericellular matrix of cardiac tissue (Fig. 2, Pericellular Matrix).39–41 Versican binds supramolecularly to oligosaccharides in HA chains via its G1 protein domain.42 This interaction is further stabilized by a hyaluronic acid and proteoglycan binding link protein (HAPLN).43 In cardiogenesis, HAPLN3 has been shown to be crucial for maintenance of the HA matrix during cardiac tissue development (Fig. 2, Pericellular Matrix).44

2.2 Soluble cardiac factors

Next to structural and non-structural ECM components, the cardiac ECM also contains soluble factors, which are primarily growth factors. Growth factors (GF) are secreted by cells and regulate important cellular behavior such as proliferation, differentiation and ECM production. The released GF can have pro-angiogenic, anti-inflammatory, anti-apoptotic or chemoattractant effects.45 Interestingly, growth factor activity is also modulated by binding to heparin found in GAG and PG in the ECM.46 Important growth factors associated with angiogenesis are vascular endothelial growth factor (VEGF) and basic fibroblast growth factor (FGF-2).47 VEGF stimulates proliferation, differentiation and survival of endothelial cells. Additionally, it has been shown to increase the production of nitric oxide, which acts as a vasodilator.48 FGF-2 is an important factor in the formation and organization of vessels, as it stimulates smooth muscle cell and endothelial cell growth. Growth factors that stimulate survival and prevent apoptosis of cardiomyocytes include insulin-like growth factor (IGF-1) and hepatocyte growth factor (HGF).49 Both IGF-1 and HGF have a high affinity for heparin, which indicates that their activity is also modulated by the ECM. Other growth factors, which have less affinity for heparin, are transforming growth factor β (TGF-β), platelet-derived growth factor (PDGF) and angiotensin II.46 Angiotensin II is responsible for vascular formation and the activation of other growth factors.50 Lastly, stromal cell-derived factor 1-α (SDF1-α) is a potent chemoattractant that has been shown to attract hematopoietic stem cells, mesenchymal stem cells and cardiac stem cells to the local tissue.51 Stem cell recruitment through SDF1-α stimulation could be crucial for complete regeneration of the myocardium following an MI.


2.3 Adverse post-MI remodeling

The heart is an organ with limited self-regeneration capacity.25 Following an MI, complex cellular and molecular healing processes take place that do not lead to myocardial tissue regeneration; this is known as adverse post-MI remodeling.52 Healthy cardiomyocytes are replaced by disorganized fibrotic tissue. Fibrotic or scar tissue differs from healthy ECM in composition, structure and organization (Table 1). During the first 6 hours after occlusion of one of the coronary arteries, cardiomyocytes start to die through apoptosis or necrosis (Fig. 3, phase 1). Cardiomyocyte death is caused by the sudden ischemic and toxic environment. Necrotic cells upregulate the complement cascade, producing various kinds of chemokines and cytokines. This initiates recruitment of platelets, neutrophils and mononuclear cells after 12–48 hours (Fig. 3, phase 2). These mononuclear cells differentiate into macrophages and phagocytize dead cardiomyocytes. In phase 2, MMPs are highly upregulated by myofibroblasts, followed by degradation of connective tissue for optimal infiltration of inflammatory cells. This leaves the myocardium structurally weak and results in a decrease in mechanical properties. This continues for 2–3 days until granulation tissue starts to form and the number of myofibroblasts and inflammatory cells increases, accompanied by the formation of new blood vessels (Fig. 3, phase 3). Additionally, hyaluronic acid expression and production are significantly increased during the first 3 days post-MI, which contributes to optimal cellular infiltration (Table 1, Pericellular matrix).16 Following phase 3, infiltrated myofibroblasts synthesize new ECM proteins, primarily fibronectin, collagen IV and laminin, followed by high concentrations of collagen types I and III (Table 1, Connective tissue and Basement membrane).16,53,54 Finally, 10–18 days post-MI, scar tissue is formed (Fig. 3, phase 4). Collagen types I and III make up 60% of scar tissue and are therefore the main cause of the increase in stiffness observed from 3 days post-MI onwards. Also, collagens found in scar tissue adopt a more anisotropic structure compared to healthy ECM. This is due to the limited time available for proper collagen fiber maturation.52 Lastly, myofibroblasts remain in the scar tissue and continuously contract and remodel the ECM, which stabilizes and strengthens the scar and prevents leakage of blood from the ventricles.

Table 1 Overview of the main cardiac extracellular matrix components, their changes in expression/production and their contribution to the mechanical and biochemical properties of post-MI tissue (refs 16, 53–55).

ECM component | Regulation/composition post-MI | Contribution to tissue properties post-MI
Connective tissue | |
Collagen I | Scar tissue is composed of 30% at day 7 and 60% at day 21 | Increases the stiffness of final scar tissue
Collagen III | Scar tissue is composed of 30% at day 7 and 60% at day 21 (initially faster than collagen I) | Increases the stiffness of final scar tissue
Elastin | No upregulation | Decreases the tissue resilience
Basement membrane | |
Collagen IV | Increased expression in peripheral zone at day 3 and maximal expression at day 7–11 | Contributes to cellular migration and basement membrane stability
Fibronectin | Large increase in expression and production from day 4 up to day 35 | Provides a scaffold for other ECM components
Laminin | Increased expression in peripheral zone at day 3 and maximal expression at day 7–11 | Contributes to cellular migration and basement membrane stability
Pericellular matrix | |
Hyaluronic acid | Upregulated, with maximal expression and production at day 3 | Causes tissue to swell up with water for optimal cell infiltration

Fig. 3 Overview of the four stages of the cardiac healing process that takes place after MI. (Phase 1) Directly after MI the cardiomyocytes that do not obtain enough oxygen start to die through apoptosis or necrosis. (Phase 2) Consequently, necrotic cells start to produce chemokines and cytokines that attract neutrophils and monocytes from the blood. During this phase the ECM is degraded by upregulated MMPs, which help neutrophils to migrate deep into the tissue. (Phase 3) Two to three days post-MI granulation tissue is formed. In this phase fibroblasts differentiate into myofibroblasts or endothelial cells. The myofibroblasts produce the extra ECM, which consists primarily of collagen III. The endothelial cells form new blood vessels. (Phase 4) During the final phase the number of cells decreases, except for the number of myofibroblasts, which increases to maintain ECM remodeling.

2.4 Cardiac performance post-MI

In the clinic, cardiac output can be monitored macroscopically to assess whether an MI has taken place. Cardiac output is assessed via the ejection fraction (EF), end-diastolic volume (EDV) or end-systolic volume (ESV). As a result of the ventricular remodeling described above, the ventricular wall becomes thinner and the final cardiac output is drastically reduced. This is usually characterized by a decrease in EF and an increase in EDV and ESV. Changes in these values are then associated with the adverse molecular and cellular processes that have taken place post-MI. Holmes et al. clearly illustrate the different ways in which ventricular remodeling can negatively influence the function of the ventricle:1,56

1. The ventricular wall can rupture as a result of increased degradation of the connective tissue.
2. A large amount of energy is dissipated during systole. This is due to the soft tissue that is formed early after an MI as a result of connective tissue degradation.
3. The efficiency of diastolic filling is decreased, because scar tissue formation increases the stiffness of the infarcted area.
4. The ventricular wall stress is increased as a result of thinning of the wall due to adverse remodeling of the ECM. This is accompanied by dilation of the cavity, which further increases the wall stress that the remaining healthy myocardium must bear to reach the same systolic pressure.
5. Coupling of the healthy myocardium to the infarcted myocardium results in a decrease in deformation. This is the result of the anisotropic mechanical properties of the infarcted myocardium.
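The relation between EF, EDV and ESV, and the effect of wall thinning and cavity dilation on wall stress (point 4), can be made concrete with a short Python sketch. The EF formula is the standard definition, EF = (EDV − ESV)/EDV. The wall stress estimate uses the law of Laplace for a thin-walled sphere, which is a simplifying assumption introduced here and is not a model taken from the chapter. All function names and numerical values are hypothetical and serve illustration only.

```python
def ejection_fraction(edv_ml: float, esv_ml: float) -> float:
    """EF = (EDV - ESV) / EDV, returned as a fraction between 0 and 1."""
    if edv_ml <= 0 or not 0 <= esv_ml <= edv_ml:
        raise ValueError("require EDV > 0 and 0 <= ESV <= EDV")
    return (edv_ml - esv_ml) / edv_ml


def laplace_wall_stress(pressure_kpa: float, radius_mm: float, thickness_mm: float) -> float:
    """Thin-walled sphere estimate (assumption): stress = P * r / (2 * h), in kPa."""
    if thickness_mm <= 0:
        raise ValueError("wall thickness must be positive")
    return pressure_kpa * radius_mm / (2.0 * thickness_mm)


# Illustrative volumes in mL (not patient data, not from the chapter).
healthy_ef = ejection_fraction(edv_ml=120.0, esv_ml=50.0)
post_mi_ef = ejection_fraction(edv_ml=180.0, esv_ml=120.0)
print(round(healthy_ef, 2), round(post_mi_ef, 2))  # 0.58 0.33

# Same cavity pressure, but a dilated cavity and a thinner wall raise the stress.
print(laplace_wall_stress(16.0, 25.0, 10.0))  # 20.0
print(laplace_wall_stress(16.0, 32.0, 6.0))   # about 42.7
```

In this toy example both EDV and ESV increase while EF drops, and the thinner, more dilated ventricle carries roughly twice the wall stress at the same pressure, in line with the qualitative picture given in the list above.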


In summary, the composition and organization of healthy cardiac ECM are important for cardiac function. Under pathological conditions the composition of the ECM changes, followed by adverse molecular changes in cardiac tissue. During this remodeling phase the interplay between mechanical properties and contractile function remains important. With this knowledge, the outcome of therapies based on different injectable hydrogel approaches can be better understood. In the following sections, recent trends in synthetic biological approaches to repair and regenerate the heart after MI are reviewed.

3 Synthetic ECM approaches for the treatment of MI

Hydrogel biomaterials resemble natural ECM in their high water content and are therefore adequate candidates for synthetic ECM approaches and post-MI therapy. Hydrogels consist of polymers that are cross-linked via physical or chemical interactions. Physical cross-links can be, e.g., entangled chains, hydrogen bonds or hydrophobic interactions, whereas chemical cross-links are primarily covalent bonds between polymer chains.57 Hydrogels are injectable when they start as a liquid and transform into a solid cross-linked hydrogel upon injection into the tissue, a behavior also known as a sol-to-gel transition. Injectable hydrogels can be delivered to the myocardium via three routes, i.e. transendocardial, intracoronary and epicardial (Fig. 4A and B). In this chapter, different injectable hydrogels are reviewed as therapeutic synthetic ECM for the treatment of damaged infarcted myocardium (Fig. 4). Injectable hydrogels can be used as matrix support for the damaged myocardium, mimicking the structural, mechanical or bioactive properties of healthy ECM (Fig. 4C). Furthermore, hydrogels can be loaded with paracrine signaling molecules that are found in healthy ECM and stimulate endogenous repair (Fig. 4D). We also consider injectable hydrogels that encapsulate stem cells for local delivery to the infarcted myocardium in order to stimulate in situ cardiac regeneration (Fig. 4E). Finally, injectable hydrogels can be loaded with chemoattractants that recruit endogenous stem cells to stimulate in situ cardiac regeneration (Fig. 4F).

3.1 Matrix supporting hydrogels to repair and regenerate the myocardium

Matrix supporting injectable hydrogels are classified according to which ECM components they mimic, i.e. the mechanical niche or the bioactive niche. Injectable hydrogels mimicking the mechanical niche of the myocardium typically mimic the passive mechanical properties of the fibrillar collagens found in natural ECM.
Consequently, the bioactive niche is mimicked by injection of naturally derived hydrogels. These hydrogels contain bioactive sequences that stimulate cellular responses such as, e.g. proliferation, migration and differentiation. First, injectable hydrogels that are used as matrix support for the damaged myocardium, mimicking the structural and mechanical niche of healthy ECM, are discussed. For successful injection of hydrogels in the myocardium, Synthetic Biology, 2018, 2, 155–185 | 165

Published on 23 November 2017 on http://pubs.rsc.org | doi:10.1039/9781782622789-00155

View Online

Fig. 4 Schematic representation of the use of different injectable hydrogels as therapeutic synthetic ECM, which are applied either via a Top-down or Bottom-up ECM engineering approach. (A) Myocardial infarction in the left ventricular myocardium, which is caused by occlusion of the left anterior descending coronary artery. (B) Transendocardial injection of a hydrogel to stimulate regeneration of cardiac tissue post-MI. The biomaterial/hydrogel is applied with the goal to serve as (C) matrix support, (D) a depot for the presentation and release of bioactive factors, (E) cell delivery vehicle and (F) scaffold to recruit cells by presenting chemokines via a Top-down ECM engineering approach.

hydrogels should have mechanical properties similar to those of the myocardium (0.02–0.5 MPa), biodegradability to promote cellular infiltration, and biocompatibility to prevent adverse immune responses. Dobner et al. injected a 20 kDa 8-arm poly(ethylene glycol) (PEG) hydrogel with an elastic modulus similar to that of cardiac tissue in diastole,58 which served as a passive matrix support post-MI (Fig. 5A1).59 PEG hydrogels were also modified with matrix metalloproteinase (MMP)-1, -14 and -9 degradable peptide sequences (Fig. 5A1).60 These peptide sequences can be cleaved locally by MMP enzymes, which makes the hydrogel degradable.61 Degradation of the injected PEG hydrogel and increased cellular migration into the hydrogel were observed after 28 days (Fig. 5A2). This work postulates that left ventricular remodeling post-MI results from an increase in wall stress, and that injection of a synthetic MMP-degradable PEG hydrogel can prevent adverse remodeling. Accordingly, pathological remodeling was limited in the first 4 weeks post-MI, although this did not prevent dilation of the heart at a later stage (Fig. 5A3). This shows that simply injecting a bio-inert matrix support is not sufficient for cardiac repair at a later stage.62
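The postulate that elevated wall stress drives remodeling can be made concrete with a rough calculation. The sketch below assumes Laplace's law for a thin-walled sphere, stress = pressure × radius / (2 × wall thickness), which is a textbook simplification and not a model taken from the cited studies. The function name `wall_stress` and all numerical values are hypothetical and were chosen only to show the trend that a thicker wall, as obtained by bulking the infarct with a hydrogel, lowers the estimated stress.

```python
def wall_stress(pressure_kpa: float, radius_mm: float, wall_thickness_mm: float) -> float:
    """Estimate wall stress (kPa) with Laplace's law for a thin-walled sphere.

    Assumption: the ventricle is approximated as a thin-walled sphere,
    so stress = pressure * radius / (2 * wall thickness).
    """
    if wall_thickness_mm <= 0:
        raise ValueError("wall thickness must be positive")
    return pressure_kpa * radius_mm / (2.0 * wall_thickness_mm)


# Hypothetical illustrative values, not taken from the chapter.
thinned = wall_stress(pressure_kpa=2.0, radius_mm=25.0, wall_thickness_mm=5.0)
bulked = wall_stress(pressure_kpa=2.0, radius_mm=25.0, wall_thickness_mm=8.0)

print(f"thinned infarct wall: {thinned:.2f} kPa")
print(f"hydrogel-bulked wall: {bulked:.2f} kPa")
```

Under these assumptions, increasing the wall thickness from 5 to 8 mm at constant pressure and radius lowers the estimated stress from 5.0 to about 3.1 kPa. Real ventricles are neither spherical nor thin-walled, so the numbers only illustrate the direction of the effect.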


Furthermore, a synthetic hydrogel used to prevent adverse ventricular remodeling post-MI is based on copolymers of N-isopropylacrylamide (NIPAAm), acrylic acid (AAc) and hydroxyethyl methacrylate-poly(trimethylene carbonate) (HEMAPTMC) (Fig. 5B1).63 With a feed ratio of 86/4/10, this copolymer formed a hydrogel at pH = 7.4 and 37 °C, demonstrating thermoresponsive gelation. The elastic modulus of p(NIPAAm-co-AAc-co-HEMAPTMC) is 20% that of healthy myocardium in diastole. Degradation occurs over a period of 5 months via hydrolysis of the PTMC residue. Preservation of left ventricular wall thickness and end-diastolic area was observed up to 8 weeks, which makes this injectable hydrogel a promising tool for preventing adverse ventricular remodeling post-MI (Fig. 5B2).

Next, methacrylated hyaluronic acid (MeHA) is a chemically modified biopolymer that yields a passive injectable hydrogel with tunable mechanical properties (Fig. 5C1).64 MeHA hydrogels with elastic moduli of 8 and 43 kPa were injected in an ovine MI model. Injection of both hydrogels resulted in an increase in left ventricular wall thickness. However, injection of the 43 kPa MeHA hydrogel resulted in significantly smaller infarct areas and better cardiac function than both the control group and the 8 kPa MeHA hydrogel (Fig. 5C2). This study demonstrates the importance of matching the mechanical properties of the hydrogel to those of the myocardial tissue.

Another promising injectable hydrogel therapy in the pre-clinical phase is a supramolecular hydrogel based on alginate. Alginate is a natural, negatively charged linear polysaccharide that originates from seaweed.65 It consists of repeating units of 1,4-linked β-D-mannuronic acid and α-L-guluronic acid (Fig. 5D1). Hydrogel formation is driven by cross-links formed with bivalent calcium ions. Leor et al. demonstrated the percutaneous injection of these calcium cross-linked alginate hydrogels in a pig MI model 4 days post-MI.66 After 60 days, the alginate hydrogel was completely degraded and the left ventricular wall thickness was preserved compared to the control group (Fig. 5D2).

Naturally derived hydrogels contain bioactive sequences found in natural ECM, which are important for eliciting specific cellular responses. These cellular responses include recruitment of cells, induction of vascular formation, prevention of cell death and induction of immune responses that favor regeneration.67 However, naturally derived hydrogels usually lack the mechanical properties required to support the damaged myocardium post-MI. Hallmark research by Huang et al. showed that injection of collagen type I, fibrin and matrigel into the infarcted zone in a rat model 1 week post-MI increased neovascularization (Fig. 6A2).68 Collagen type I is the most abundant fibrillar and structural ECM component found in myocardial tissue (Fig. 6A1). Collagen type I can be made injectable by first solubilizing it; upon injection at 37 °C it transforms into a hydrogel.69 Fibrin is a natural hydrogel that is produced by mixing two components, fibrinogen and thrombin, and by tuning their ratio the mechanical properties of the final hydrogel can be carefully tailored.70,71 Finally, matrigel is a mixture of basement membrane components derived from Engelbreth–Holm–Swarm tumors in mice.72,73 These basement membrane components are primarily laminin, collagen IV and nidogen. Matrigel is stored as a liquid and transforms into a hydrogel at 37 °C, similar to collagen. Collagen type I, fibrin and matrigel show relatively low elastic moduli (< 200 Pa) compared to synthetic hydrogels.70,74 Nevertheless, Huang et al. showed high infiltration and contraction of myofibroblasts after injection of all three hydrogels compared to PBS, which was highest for collagen type I. Cellular infiltration is beneficial for the turnover of functional ECM; however, this may also lead to


non-functional scar tissue formation. In an earlier study by Huang et al., a fibrin scaffold preserved left ventricular wall thickness when injected 1 week post-MI.75

One strategy to optimally mimic the composition of cardiac ECM is the use of decellularized ECM (dECM). Decellularization involves removal of all cellular components via a chemical, physical or enzymatic method. If the decellularization is successful, a hydrogel is produced that is composed of tissue-specific ECM components (Fig. 6B1). This naturally based hydrogel shows promise as an injectable biomaterial therapy and is therefore in pre-clinical trials, progressing towards a first-in-human trial.76–78 To illustrate this success, infarcted porcine models were percutaneously injected with porcine-derived dECM two weeks post-MI. The tissues were analyzed 3 months after injection. Interestingly, better preservation of EF, EDV and ESV was seen in pigs injected with dECM than in the PBS control (Fig. 6B2). A significant reduction of infarct size was also observed, which indicates that this is a promising platform for injectable hydrogel therapy.

3.2 Mimicking soluble paracrine signaling molecules for endogenous repair
In the previous section, the importance of the mechanical integrity and the intrinsic bioactivity of injectable hydrogels was discussed. In this section, the described injectable hydrogels contain bioactive proteins

Fig. 5 Synthetic hydrogels used as mechanical matrix support. (A1) Chemical structure of 20 kDa 8-arm poly(ethylene glycol), which can be covalently cross-linked with dithiothreitol (DTT) or MMP-degradable peptide sequences. (A2) Histology images showing the cellular infiltration and enzymatically degradable PEG 30 min and 28 days post-MI.61 Reprinted from Biomaterials, 33, K. Kadner, S. Dobner, T. Franz, D. Bezuidenhout, M. S. Sirry and P. Zilla et al., The beneficial effects of deferred delivery on the efficiency of hydrogel therapy post myocardial infarction, 2060–2066, Copyright (2016), with permission from Elsevier. (A3) Graph showing the change in end-diastolic diameter (EDD) after 2, 4 and 13 weeks post-MI of injected PEG and PBS.59 Reprinted from J. Card. Fail., 15(7), S. Dobner, D. Bezuidenhout, P. Govender, P. Zilla and N. Davies, Synthetic Non-degradable Polyethylene Glycol Hydrogel Retards Adverse Post-infarct Left Ventricular Remodeling, 629–636, Copyright (2009), with permission from Elsevier. (B1) Chemical structure of the pH- and temperature-dependent p(NIPAAm-co-AAc-co-HEMAPTMC) hydrogel. (B2) Histology images showing the preservation of ventricular wall thickness 8 weeks post-MI after injection of p(NIPAAm-co-AAc-co-HEMAPTMC) (right) compared to PBS (left).63 Reprinted from Biomaterials, 30(26), K. L. Fujimoto, Z. Ma, D. M. Nelson, R. Hashizume, J. Guan and K. Tobita et al., Synthesis, characterization and therapeutic efficacy of a biodegradable, thermoresponsive hydrogel designed for application in chronic infarcted myocardium, 4357–4368, Copyright (2009), with permission from Elsevier. (C1) Chemical structure of methacrylated hyaluronic acid hydrogel. (C2) Graph showing the regional differences in wall thickness, 8 weeks post-MI, of infarcted myocardium, MeHA High treatment, MeHA Low treatment and healthy conditions.64 Reprinted from J. L. Ifkovits, E. Tous, M. Minakawa, M. Morita, J. D. Robb, K. J. Koomalsingh, J. H. Gorman III, R. C. Gorman and J. A. Burdick, Injectable hydrogel properties influence infarct expansion and extent of postinfarction left ventricular remodeling in an ovine model, Proc. Natl. Acad. Sci. U. S. A., 2010, 107(25), 11507–11512. (D1) Chemical structure of alginate and bivalent cross-link formation with calcium ions. (D2) Images of the morphology of heart sections 60 days post-MI, in which alginate or saline was injected.66 Reprinted from J. Am. Coll. Cardiol., 54, J. Leor, S. Tuvia, V. Guetta, F. Manczur, D. Castel and U. Willenz et al., Intracoronary Injection of In Situ Forming Alginate Hydrogel Reverses Left Ventricular Remodeling After Myocardial Infarction in Swine, 1014–1023, Copyright (2009), with permission from Elsevier.


Fig. 6 Injectable hydrogels used as bioactive matrix support. (A1) Schematic representation of collagen fiber structure and formation. Each collagen α-chain consists of repeating Glycine-X-Y units, where X and Y are usually proline and hydroxyproline, respectively. (A2) CD31 staining showing neovascularization for the PBS control, collagen type I (Col), matrigel (Mat) and fibrin (Fib). Reprinted from N. F. Huang, J. Yu, R. Sievers, S. Li and R. J. Lee, Tissue Eng., 2005, 11, 1860–1866. The publisher for this copyrighted material is Mary Ann Liebert, Inc. publishers. (B1) Schematic representation of decellularized cardiac ECM (dECM). (B2) Figures showing the change in EF, EDV and ESV after 3 months for injected dECM compared to PBS. Reprinted from J. Am. Coll. Cardiol., 59, J. M. Singelyn, P. Sundaramurthy, T. D. Johnson, P. J. Schup-Magoffin, D. P. Hu and D. M. Faulk et al., Catheter-deliverable Hydrogel Derived From Decellularized Ventricular Extracellular Matrix Increases Endogenous Cardiomyocytes and Preserves Cardiac Function Post-Myocardial Infarction, 751–763, Copyright (2012), with permission from Elsevier.

that are released and presented to cells. These proteins stimulate endogenous cells to repair damaged cardiac tissue and improve tissue remodeling post-MI. Among the therapeutic proteins used to treat and repair damaged cardiac tissue post-MI, growth factors (GF) have shown the most promise due to their high potency.79 Difficulties that accompany encapsulating therapeutic molecules in a hydrogel include the limited time during which these molecules are bioactive, their high rates of diffusion and the method of presenting the bioactive compounds.80 Therefore, hydrogel properties must be carefully tailored to each specific application.


Fig. 7 Injectable hydrogels used for local paracrine signaling. (A1) Chemical structure of pH- and temperature sensitive hydrogel based on p(NIPAAm-co-PAA-co-BA). (A2) Graph showing the fractional shortening of the left ventricle as function of the time post-injection of PBS, p(NIPAAm-co-PAA-co-BA) alone, bFGF alone and p(NIPAAm-co-PAA-co-BA) with bFGF.81 Reprinted from Biomaterials, 32, J. C. Garbern, E. Minami, P. S. Stayton and C. E. Murry, Delivery of basic fibroblast growth factor with a pH-responsive, injectable hydrogel to improve angiogenesis in infarcted myocardium, 2407–2416, Copyright (2011), with permission from Elsevier. (B1) Chemical structure of naturally-based alginate hydrogel. (B2) Graph showing the cumulative and sequential release profile of VEGFA165 and PDGF-BB from alginate hydrogels.84 Reprinted from X. Hao, E. Silva, A. Mansson-Broberg, K.-H. Grinnemo, A. J. Siddigui and G. Dellgren et al., Angiogenic effects of sequential release of VEGF-A165 and PDGF-BB with alginate hydrogels after myocardial infarction, Cardiovasc Res., 2007, 75(1), 178–185, by permission of Oxford University Press. (B3) Graph showing the cumulative and sequential release profile of IGF-1 (top line) and HGF (bottom line) from alginate hydrogels.85 Reprinted from Biomaterials, 32, E. Ruvinov, J. Leor and S. Cohen, The promotion of myocardial repair by the sequential delivery of IGF-1 and HGF from an injectable alginate biomaterial in a model of acute myocardial infarction, 565–578, Copyright (2011), with permission from Elsevier. (C1) Chemical structure of ureido-pyrimidinone poly(ethylene glycol) (UPyPEG), showing the hydrophilic PEG polymer and the UPy-alkyl-urea functionalities. (C2) Graph showing the cumulative release profile of IGF-1 (bottom line) and HGF (top line) from UPyPEG hydrogels.86 Reproduced from M. M. C. Bastings, S. Koudstaal, R. E. Kieltyka, Y. Nakano, A. C. H. Pape and D. A. M. 
Feyen et al., A Fast pH-Switchable and Self-Healing Supramolecular Hydrogel Carrier for Guided, Local Catheter Injection in the Infarcted Myocardium, Adv Healthcare Mater., 2014, 3, 70, John Wiley and Sons. Copyright © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.


Growth factors are found in natural ECM and control cellular function by stimulating signaling pathways. An important issue that needs to be addressed is the formation of healthy and stable vessels for optimal blood perfusion post-MI, which is crucial for the regeneration of cardiac tissue. Garbern et al. encapsulated basic fibroblast growth factor (bFGF) in a pH- and temperature-responsive injectable hydrogel.81 This hydrogel is based on a random copolymer, poly(N-isopropylacrylamide-co-propylacrylic acid-co-butyl acrylate) (p[NIPAAm-co-PAA-co-BA]) (Fig. 7A1). The NIPAAm component contributes the phase change that follows an increase in temperature from 25 °C to 37 °C. Additionally, incorporating carboxylic acids in the polymer makes the hydrogel pH-sensitive. A hydrogel forms instantaneously upon injection of the liquid polymer into the ischemic, acidic infarct (pH 6.8). As the pH rises towards 7.4, the polymer dissolves and bFGF is released from the hydrogel. Injection of the bFGF-containing hydrogel 20 min after induction of ischemia in a rat resulted in increased microvessel formation and improved fractional shortening after 28 days (Fig. 7A2).

Another growth factor that is widely studied is hepatocyte growth factor (HGF). HGF has pro-angiogenic, anti-fibrotic and cardioprotective effects.82 Sonnenberg et al. encapsulated an HGF fragment (HGF-f) in a hydrogel based on decellularized ECM (dECM).83 dECM contains a high concentration of GAG and should therefore interact with HGF-f. Prolonged delivery of HGF-f was observed when it was injected with dECM hydrogel compared to HGF-f alone. Consequently, HGF-f encapsulated in the hydrogel significantly increased blood vessel formation. A decrease in fibrosis was apparent, but not significant compared to hydrogel alone and HGF-f alone.

Alginate is a natural supramolecular hydrogel that mimics GAG found in natural ECM (Fig. 7B1). Hao et al. mixed high molecular weight alginate (250 kDa) with low molecular weight alginate (0.5 kDa) and incorporated vascular endothelial growth factor-A165 (VEGF-A165) and platelet-derived growth factor-BB (PDGF-BB) (Fig. 7B2).84 Hydrogel formation is initiated by addition of calcium ions. The protein-loaded alginate hydrogel was injected 7 days post-MI and rat hearts were analyzed 28 days after injection. The sequential delivery of VEGF followed by PDGF to the surrounding tissue resulted in increased angiogenesis and vessel maturation. Tuning the molecular weight of the polymers that form the hydrogel resulted in different degradation kinetics and release profiles, so alginate hydrogels make it possible to tune the dose and time frame of growth factor delivery. Ruvinov et al. used a similar approach with different growth factors and a different alginate hydrogel composition.85 Insulin-like growth factor-1 (IGF-1) and HGF were mixed in a single sodium alginate hydrogel containing two alginate molecular weights, i.e. 100 kDa and 30–50 kDa. Alginate precursors were sulphated to give a final weight ratio of 1 : 9 (sulphated : unmodified alginate). Alginate hydrogels loaded with IGF-1 and HGF were injected 6 days post-MI and rat hearts were analyzed 28 days after injection. The release profile revealed a sequential release, with IGF-1 released initially and HGF released more slowly and continuously (Fig. 7B3). This might be interesting for initial


stimulation of pro-survival pathways with IGF-1, followed by stimulation of pro-angiogenic and anti-fibrotic pathways with HGF.

Supramolecular hydrogels based on directed and reversible noncovalent interactions show promise as injectable biomaterials and GF delivery vehicles, owing to the high degree of control over cross-link formation. A recently developed supramolecular hydrogel is based on four-fold hydrogen bonding ureido-pyrimidinone (UPy) units coupled to a PEG polymer via alkyl and urea spacers (Fig. 7C1).86 The UPy-modified PEG polymers form supramolecular fibers in aqueous environments, which results in a transient network that forms a hydrogel.87 Additionally, this hydrogel is pH-sensitive, which enables a sol-to-gel transition, and it can thus be used as an injectable biomaterial and GF delivery vehicle (Fig. 7C2). IGF-1 and HGF were encapsulated in the UPy-PEG hydrogels to enable local catheter injection and repair of the infarcted myocardium. GF-loaded UPy-PEG hydrogels were injected in pig hearts 28 days post-MI and the hearts were analyzed 28 days after injection. Interestingly, clusters of viable cardiomyocytes were observed for UPy-PEG containing IGF-1/HGF, in contrast to UPy-PEG alone and IGF-1/HGF alone. This was accompanied by a significant decrease in collagen content for the UPy-PEG containing IGF-1/HGF. Additionally, UPy-PEG showed gradual release of the encapsulated GF over 7 days, together with improved cell viability and decreased fibrosis formation.
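The sequential release profiles described in this section can be illustrated with a simple kinetic sketch. It assumes that each growth factor is released with first-order kinetics, i.e. cumulative fraction released = 1 − exp(−k·t), which is a common simplification and not the model used in the cited studies. The rate constants and the labels "fast" and "slow" are hypothetical and do not correspond to measured values for any of the growth factors discussed.

```python
import math


def cumulative_release(t_days: float, k_per_day: float) -> float:
    """Fraction of loaded factor released by time t, assuming first-order kinetics."""
    if t_days < 0 or k_per_day < 0:
        raise ValueError("time and rate constant must be non-negative")
    return 1.0 - math.exp(-k_per_day * t_days)


def half_release_time(k_per_day: float) -> float:
    """Time (days) at which half of the loaded factor has been released."""
    if k_per_day <= 0:
        raise ValueError("rate constant must be positive")
    return math.log(2.0) / k_per_day


# Hypothetical rate constants for a fast and a slow factor (per day).
K_FAST = 1.0
K_SLOW = 0.2

for day in (0, 1, 3, 7):
    fast = cumulative_release(day, K_FAST)
    slow = cumulative_release(day, K_SLOW)
    print(f"day {day}: fast {fast:.2f}, slow {slow:.2f}")
```

In this sketch the fast factor is largely released within the first days while the slow factor is still being delivered at day 7, which is the qualitative pattern of a sequential release. Changing the two rate constants mimics how polymer composition could shift the release window.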

3.3 Cell delivery strategies
Another synthetic ECM approach is the delivery of stem cells, which secrete paracrine signaling factors that contribute to cardiac repair. Injection of healthy stem cells into a hostile, pathological environment limits the survival of the cells and thus the therapy outcome. Therefore, stem cells should be combined with injectable hydrogels to improve retention and survival and to support regeneration.6,79 The first example is a temperature-sensitive hydrogel based on chitosan, in which β-glycerol phosphate (β-GP) was mixed in to increase the solubility of chitosan at neutral pH (Fig. 8A1).88 Increasing the temperature of chitosan with β-GP to 37 °C resulted in hydrogel formation. Lu et al. encapsulated mouse embryonic stem cells (mESCs) in β-GP-modified chitosan hydrogels and injected them in rats 1 week post-MI. Cellular coverage was significantly higher when mESCs were encapsulated in chitosan hydrogels (17.48% and 12.93%) than in the PBS control (9.91% and 6.41%) at 24 hours (p < 0.01) and 4 weeks (p < 0.01) after injection, respectively. Consequently, EF improved when mESCs were encapsulated in chitosan hydrogels compared to the control groups (Fig. 8A2).

Another hydrogel, based on a combination of chitosan, collagen type I and an angiopoietin-1 derived peptide sequence, QHREDGS, was used as a cell carrier (Fig. 8B1).89 QHREDGS was covalently attached to amine residues on chitosan using 1-ethyl-3-(3-dimethylaminopropyl) carbodiimide HCl (EDC) chemistry.90 QHREDGS sequences promote cellular attachment and survival. Upon induction of a MI in a rat model,


chitosan-collagen hydrogels modified with QHREDGS were injected and hearts were analyzed 14 days post-MI.91 An improvement in EF, possibly due to a significant increase in functional cardiomyocytes, was observed compared to the untreated control (Fig. 8B2). Although an increase in cardiac function is observed when MI-model mice or rats are treated with stem cells inside an injectable hydrogel, the thickness and scale of their myocardium differ greatly from those of humans.8 Therefore, larger animal


models should be used, since the size and structure of their myocardium are similar to those of humans. Chen et al. injected pig MI models with HA hydrogels encapsulating bone marrow-derived cells (BMC) (Fig. 8C1).92 The combination of HA with BMC enhanced cardiac function significantly compared to hydrogel alone and BMC alone. Differentiation of BMC towards endothelial and smooth muscle cells was also observed (Fig. 8C2).

Supramolecular hydrogels based on mixing two synthetic components, α-cyclodextrin and poly(ethylene glycol)-b-polycaprolactone-(dodecanedioic acid)-polycaprolactone-poly(ethylene glycol) (α-CD/MPEG-PCL-MPEG), can be used for cellular transplantation in the infarcted myocardium (Fig. 8D1).93,94 MPEG-PCL-MPEG forms a complex in the cavity of α-CD via hydrophobic interactions and finally forms micelle structures.95 Wang et al. encapsulated BMCs in α-CD/MPEG-PCL-MPEG and injected the construct in the infarcted myocardium of rabbits 7 days post-MI. Four weeks after injection, higher retention of BMCs was seen when they were injected with the supramolecular hydrogel than when cells were injected alone (Fig. 8D2). EF and EDD of the left ventricle were also preserved when hydrogel and BMCs were injected together, and infarct size decreased significantly compared to control groups. These results demonstrate the importance of a supporting ECM mimic to increase cellular retention and survival. Therefore, the cellular contribution to cardiac repair can be improved by sustaining the delivery of bioactive compounds and functional ECM components.

Fig. 8 Injectable hydrogels used as cell delivery vehicles. (A1) Schematic representation of the interaction of chitosan with β-GP at physiological pH. (A2) Graph showing the EF of the left ventricle when injected with PBS, chitosan alone, mESCs alone and chitosan with mESCs.88 Reproduced from W.-N. Lu, S.-H. Lü, H.-B. Wang, D.-X. Li, C.-M. Duan and Z.-Q. Liu et al., Functional Improvement of Infarcted Heart by Co-injection of Embryonic Stem Cells with Temperature-responsive Chitosan Hydrogel, Tissue Eng. A., 2009, 15, 1437–1447. The publisher for this copyrighted material is Mary Ann Liebert, Inc. publishers. (B1) Chemical structure of QHREDGS-modified chitosan/collagen hydrogel using 1-ethyl-3-(3-dimethylaminopropyl) carbodiimide HCl chemistry. (B2) Graph showing the EF of the left ventricle when injected with PBS, chitosan-collagen hydrogel alone and chitosan-collagen hydrogel with the QHREDGS peptide.91 Reprinted from L. Reis, L. L. Y. Chiu, J. Wu, N. Feric, C. Laschinger and A. Momen, et al., Hydrogels with Integrin-binding Angiopoietin-1-derived Peptide, QHREDGS, for Treatment of Acute Myocardial Infarction, Circ.: Heart Failure, 2015, 8(2), 333–341. Copyright 2015 with permission from Wolters Kluwers Health, Inc. (C1) Chemical structure of HA. (C2) Histology images showing endothelial cell formation (upper two images) and smooth muscle cells (lower two images) of infarcted tissue with BMCs alone or HA with BMCs. Scale bar = 200 μm.92 Reprinted from C. H. Chen, M. Y. Chang, S.-S. Wang, and P. C. H. Hsieh, Injection of autologous bone marrow cells in hyaluronan hydrogel improves cardiac performance after infarction in pigs, AJP Hear Circ Physiol., 2014, 306, H1078–H1086 with permission from The American Physiological Society. (D1) Chemical structure of the supramolecular building blocks α-cyclodextrin (α-CD) and poly(ethylene glycol)-b-polycaprolactone-(dodecanedioic acid)-polycaprolactone-poly(ethylene glycol) (MPEG-PCL-MPEG). (D2) Immunofluorescent images showing increased cellular entrapment when BMCs were injected with α-CD/MPEG-PCL-MPEG hydrogel compared to BMCs alone, scale bar = 45 μm.93 Reprinted from Acta Biomaterialia, 5(8), T. Wang, X.-J. Jiang, Q.-Z. Tang, X.-Y. Li, T. Lin and D. Q. Wu et al., Bone marrow stem cells implantation with alpha-cyclodextrin/MPEG-PCL-MPEG hydrogel improves cardiac function after myocardial infarction, 2939–2944, Copyright (2009), with permission from Elsevier.
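The cellular coverage percentages reported in this section for mESCs in chitosan hydrogels can be expressed as fold changes relative to the control. The short calculation below uses only the four values quoted in the text; the variable and function names are illustrative.

```python
# Cellular coverage (%) quoted in the text: (mESCs in chitosan hydrogel, PBS control).
coverage_percent = {
    "24 hours": (17.48, 9.91),
    "4 weeks": (12.93, 6.41),
}


def fold_change(treated: float, control: float) -> float:
    """Ratio of treated to control coverage."""
    if control <= 0:
        raise ValueError("control value must be positive")
    return treated / control


fold = {
    time_point: fold_change(treated, control)
    for time_point, (treated, control) in coverage_percent.items()
}

for time_point, value in fold.items():
    print(f"{time_point}: {value:.2f}-fold coverage relative to control")
```

This corresponds to roughly 1.8-fold higher coverage at 24 hours and 2.0-fold higher coverage at 4 weeks, so the relative benefit of the hydrogel carrier is at least maintained while absolute coverage declines in both groups.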


3.4 Cell recruitment strategies
A relatively new strategy to stimulate in situ repair of damaged myocardium is the recruitment of endogenous stem/progenitor cells using chemokines. In the previous examples, injected hydrogels were infiltrated by different endogenous cells. However, this infiltration was non-selective and usually consisted of circulating white blood cells. In this section, injectable hydrogels are discussed that mimic natural ECM and contain crucial soluble ECM factors that allow for selective recruitment of stem cells. A widely known chemokine is stromal cell-derived factor-1 alpha (SDF-1α). This chemokine binds selectively to the receptor CXCR4 and is an important regulator of the recruitment of leukocytes and BMCs from the blood stream.45 A self-assembling oligopeptide, RAD16, was modified with a protease-resistant SDF-1α chemokine (S4V) to allow for local delivery in the myocardium (Fig. 9A1).96 RAD16 self-assembles via hydrophobic interactions and beta-sheet


formation. Segers et al. observed increased infiltration of stem cells expressing SDF-1α receptors and increased vascularization 4 weeks after injection compared to control groups (Fig. 9A2). Additionally, significant improvement in left ventricular function was observed. Later, Purcell et al. used recombinant SDF-1α in combination with photo-cross-linkable HA (Fig. 9B1).36 The negatively charged backbone of HA interacts with CD44 and binds SDF-1α with a binding affinity of 36 ± 5 μM. CD44 is a cell membrane receptor that is involved in cellular motility and is expressed by leukocytes, BMC, endothelial cells and myofibroblasts. Injection of photo-cross-linkable HA hydrogels with SDF-1α into the infarcted myocardium resulted in a sustained release of SDF-1α and an approximately 8.5-fold increase in recruitment of BMCs at the injected region compared to SDF-1α alone (Fig. 9B2). Song et al. adopted the same synergistic combination of HA with SDF-1α, but additionally conjugated Ac-SDKP, a peptide derivative of thymosin β4 (Fig. 9C1).97 Injection of all three components in a chronic MI rat model significantly improved regeneration of the damaged myocardium, as shown by increased left ventricular function, increased angiogenesis and decreased infarct size 4 weeks post-MI and respective injection (Fig. 9C2).

Another chemokine is CCL5, which is known to inhibit neutrophil infiltration and improve left ventricular function.98 Projahn et al. modified star-shaped poly(ethylene oxide-stat-propylene oxide) (sP(EO-stat-PO)) with thiols to create a fast degradable hydrogel (FDH) (24 hours) and a slow degradable hydrogel (SDH) (4 weeks) (Fig. 9D1). FDH was loaded

Fig. 9 Injectable hydrogels based on recruitment of endogenous stem cells. (A1) Schematic representation of incorporation of SDF-1α in RAD16 self-assembling peptides.96 (A2) Immunofluorescent images showing capillaries (isolectin), nuclei (DAPI) and α-smooth muscle actin (no scale bar).96 (A1 and A2) Reproduced from V. F. M. Segers, T. Tokunou, L. J. Higgins, C. MacGillivray, J. Gannon and R. T. Lee, Local Delivery of Protease-resistant Stromal Cell Derived Factor-1 for Stem Cell Recruitment After Myocardial Infarction, Circulation, 2007, 116(15), 1683–1692 with permission from Wolters Kluwers Health, Inc. (B1) Chemical structure of methacrylated HA in which recombinant SDF-1α was encapsulated. (B2) Immunofluorescent images showing infiltration of BMCs in HA alone and HA with recombinant SDF-1α (scale bar = 500 μm).36 Reprinted from Biomaterials, 33, B. P. Purcell, J. Elser, A. Mu, K. B. Margulies and J. Burdick, Synergistic effects of SDF-1 alpha chemokine and hyaluronic acid release from degradable hydrogels on directing bone marrow derived cell homing to the myocardium, 7849–7857. Copyright (2012), with permission from Elsevier. (C1) Schematic overview of acrylated HA hydrogel modified with MMP-sensitive peptides and Ac-SDKP via a Michael-type addition. (C2) Masson's trichrome staining of left ventricular tissue injected with HA alone, HA encapsulated with SDF-1 alone, SDKP alone and both SDF-1 and SDKP. Magnification upper = 2× and magnification lower = 20×.97 Reprinted from Biomaterials, 35(8), M. Song, H. Jang, J. Lee, J. H. Kim, S. H. Kim, K. Sun, et al., Regeneration of chronic myocardial infarction by injectable hydrogels containing stem cell homing factor SDF-1 and angiogenic peptide Ac-SDKP, 2436–2445. Copyright (2014), with permission from Elsevier. (D1) Chemical structure of fast degradable hydrogels (Met-CCL5, 24 hours) and slow degradable hydrogels (CXCL12, 4 weeks) using star-shaped sP(EO-stat-PO) (* = mixture of D, L). (D2) Representative CD31 staining showing neovascularization following injection of PBS, FDH and SDH alone, Met-CCL5-FDH alone, CXCL12-SDH alone and both Met-CCL5-FDH and CXCL12-SDH (scale bar = 50 μm).98 Reprinted from D. Projahn, S. Simsekyilmaz, S. Singh, I. Kanzler, B. K. Kramp and M. Langer et al., Controlled intramyocardial release of engineered chemokines by biodegradable hydrogels as a treatment approach of myocardial infarction, J. Cell. Mol. Med., 2014, 18, 790–800. © 2014 The Authors. Journal of Cellular and Molecular Medicine published by John Wiley & Sons Ltd and Foundation for Cellular and Molecular Medicine.


with Met-CCL5 and SDH was loaded with CXCL12 (SDF-1α); the hydrogels showed in vivo degradation times of 24 hours and 4 weeks, respectively. Both hydrogels were pre-mixed and injected in a mouse MI model. Due to the fast release of Met-CCL5, infiltration of neutrophils and inflammation were significantly reduced during the first hours post-MI. In addition, the slow release of CXCL12 from the hydrogel significantly increased blood vessel formation compared to control groups (Fig. 9D2).98
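The staggered degradation of the two pre-mixed hydrogels can be sketched as follows. The code assumes, purely for illustration, that each gel erodes linearly to zero over its reported in vivo degradation time (24 hours for FDH, 4 weeks for SDH). The actual degradation kinetics are not given in the text, and the function name `remaining_fraction` is hypothetical.

```python
FDH_DEGRADATION_HOURS = 24.0           # fast degradable hydrogel, from the text
SDH_DEGRADATION_HOURS = 4 * 7 * 24.0   # slow degradable hydrogel, 4 weeks in hours


def remaining_fraction(t_hours: float, degradation_time_hours: float) -> float:
    """Fraction of gel left at time t, assuming linear erosion (an assumption)."""
    if t_hours < 0:
        raise ValueError("time must be non-negative")
    if degradation_time_hours <= 0:
        raise ValueError("degradation time must be positive")
    return max(0.0, 1.0 - t_hours / degradation_time_hours)


for t in (6, 24, 168, 672):
    fdh = remaining_fraction(t, FDH_DEGRADATION_HOURS)
    sdh = remaining_fraction(t, SDH_DEGRADATION_HOURS)
    print(f"t = {t:>3} h: FDH remaining {fdh:.2f}, SDH remaining {sdh:.2f}")
```

Under this assumption the fast gel is gone after one day, when more than 96% of the slow gel is still present, which is the separation in time that allows an early anti-inflammatory signal to be followed by a prolonged pro-angiogenic one.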

4 Discussion and future perspectives

As previously stated, the ideal synthetic ECM should be able to recapitulate native cardiac tissue. Native cardiac tissue contains key ECM components that control and influence cellular behavior. Injectable hydrogels can be designed to mimic these key myocardial ECM components, thereby stimulating cellular processes and enhancing regeneration of cardiac tissue post-MI. The predominant focus of this chapter is on understanding the design criteria for injectable hydrogels that mimic ECM components. These design criteria can be met by mimicking either the bioactive niche or the mechanical or physical properties of native ECM. In previous examples, injection of a mechanically supporting hydrogel preserved important cardiac functions post-MI, primarily because it decreases ventricular wall stress and reduces damage to surrounding cardiomyocytes.56 Importantly, the injected biomaterial should allow for cellular infiltration in order to achieve long-term (3 months) improvement. These injectable hydrogels typically mimic connective tissue components, in particular collagens, which are found in native cardiac tissue. Next, synthetic ECM approaches based on mimicking the bioactive niche were discussed. Paracrine signaling is an important cell–cell interaction process that is part of the healing process that occurs post-MI. For example, during the third healing phase, granulation tissue is formed in which high amounts of ECM components, e.g. collagens and GAG, are produced by myofibroblasts. In particular, GF are produced and sequestered in the ECM, where they stimulate important cellular pathways. Negatively charged polysaccharides such as alginate or HA can bind and sequester GF and represent an interesting platform to mimic and enhance paracrine signaling post-MI. Alternatively, a size-dependent release of encapsulated GF using UPy-hydrogels could be used to gain more control over the release profile.
An important aspect of an injectable hydrogel is its biocompatibility and the degree to which it can prevent a high occurrence of cell death when injected into the hostile infarcted environment. Both the naturally derived and the synthetic hydrogels discussed were able to meet these requirements. However, difficulty arises when delivered cells should contribute to regeneration by proliferation and differentiation. The improvement in cardiac function obtained with previously injected hydrogels is considered to be primarily the result of a high degree of paracrine signaling due to increased cellular retention. Using HA in combination with BMC did result in a certain degree of differentiation; however, this could be due to the GF-sequestering nature of HA.
Next, synthetic ECM approaches were discussed that are based on designing a cell-free injectable hydrogel able to enhance the recruitment of circulating stem cells. These hydrogels mimic specific cell–cell interactions that enhance the effect of cellular processes on the therapy outcome. The functionality that is primarily used is based on SDF-1α. This molecule is able to recruit circulating white blood cells and myofibroblasts from the blood and surrounding tissue, respectively, which makes it an attractive candidate to improve tissue formation. Interestingly, the same synergistic effect seen for HA with BMC was observed by Purcell et al. using SDF-1α instead of cells.36 During synthesis of HA, growing chains are pushed out of the cell membrane, forming the pericellular matrix (Fig. 1, Pericellular matrix).37 The chemoattractant property of HA can be rationalized by the interaction of GF and SDF-1α with its negatively charged versican backbone and by the interaction with CD44 receptors found on infiltrating cells.41 This clearly illustrates how hydrogels can be designed to interact with GF and enhance cellular processes that favor regeneration of cardiac tissue.
In conclusion, we propose that the most promising therapy option is the recruitment of endogenous stem cells. Especially in human studies, recruiting autologous stem cells reduces the need for exogenous stem cell donors. Providing the ideal microenvironment for stem cells to proliferate and differentiate is also important. Additionally, a missing factor in the previously discussed synthetic ECM approaches is the production and formation of functional ECM. The ECM is responsible for the sequestering of cell–cell signaling molecules and the presentation of adhesion motifs that promote migration, and it provides a structure for cells to proliferate and differentiate.
Therefore, synthetic ECM approaches should be based on mimicking both the mechanical properties and the bioactive niche of natural ECM. Attempts to design injectable hydrogels that mimic properties of ECM components have shown promise in limiting adverse remodeling of myocardial tissue, although they have not been sufficient to achieve complete regeneration. Injectable hydrogels should therefore support the formation of complete functional ECM. We propose that this can be realized by incorporating cues that stimulate ECM production by endogenous or exogenous stem cells and that capture the produced ECM. In other tissues, enhancement of ECM production by chondrogenic cells was previously validated using tetradecylthioacetic acid (TTA).99 Here, chondrogenic cells stimulated with TTA showed increased production of collagen type II and aggrecan, which are cartilage-specific ECM components. Interesting work by Prewitz et al. on mesenchymal stem cells (MSC) utilized a covalently bound FN surface for studying decellularized and MSC-secreted ECM.100 These findings show the importance of studying cell-secreted ECM in order to generate more native-like substrates. We propose that applying cues that capture cell-secreted ECM in an injectable hydrogel could be the missing link to guide functional tissue formation and result in complete regeneration.

References

1 J. W. Holmes, T. K. Borg and J. W. Covell, Structure and mechanics of healing myocardial infarcts, Annu. Rev. Biomed. Eng., 2005, 7, 223–253.
2 J. J. H. Chong, X. Yang, C. W. Don, E. Minami, Y.-W. Liu and J. J. Weyers, et al., Human embryonic-stem-cell-derived cardiomyocytes regenerate nonhuman primate hearts, Nature, 2014, 510(7504), 273–277.
3 F. Sharif, J. Bartunek and M. Vanderheyden, Adult Stem Cells in the Treatment of Acute Myocardial Infarction, Catheter. Cardiovasc. Interventions, 2011, 77, 72–83.
4 J. V. Terrovitis, R. R. Smith and E. Marbán, Assessment and optimization of cell engraftment after transplantation into the heart, Circ. Res., 2010, 106(3), 479–494.
5 J. Radhakrishnan, U. M. Krishnan and S. Sethuraman, Hydrogel based injectable scaffolds for cardiac tissue regeneration, Biotechnol. Adv., 2014, 32(2), 449–461.
6 L. A. Reis, L. L. Y. Chiu, N. Feric, L. Fu and M. Radisic, Biomaterials in myocardial tissue engineering, J. Tissue Eng. Regen. Med., 2016, 10(1), 11–28.
7 M. van Marion, N. Bax, A. van Spreeuwel, D. van der Schaft and C. Bouten, Material-Based Engineering Strategies for Cardiac Regeneration, Curr. Pharm. Des., 2014, 20(12), 2057–2068.
8 Y. Dai and A. Foley, Tissue Engineering Approaches to Heart Repair, Crit. Rev. Biomed. Eng., 2014, 42(3–4), 213–227.
9 A. A. Rane and K. L. Christman, Biomaterials for the treatment of myocardial infarction: A 5-year update, J. Am. Coll. Cardiol., 2011, 58(25), 2615–2629.
10 A. L. Mescher, Junqueira's Basic Histology, Text and Atlas, The McGraw-Hill Companies, USA, 2010.
11 K. E. Porter and N. A. Turner, Cardiac fibroblasts: At the heart of myocardial remodeling, Pharmacol. Ther., 2009, 123, 255–278.
12 A. P. Beltrami, L. Barlucchi, D. Torella, M. Baker, F. Limana and S. Chimenti, et al., Adult Cardiac Stem Cells Are Multipotent and Support Myocardial Regeneration, Cell, 2003, 114(6), 763–776.
13 P. van Vliet, A. M. Smits, T. P. de Boer, T. H. Korfage, C. H. G. Metz and M. Roccio, et al., Foetal and adult cardiomyocyte progenitor cells have different developmental potential, J. Cell. Mol. Med., 2010, 14(4), 861–870.
14 A. M. Smits, L. W. van Laake, K. den Ouden, C. Schreurs, K. Szuhai and C. J. van Echteld, et al., Human cardiomyocyte progenitor cell transplantation preserves long-term function of the infarcted mouse myocardium, Cardiovasc. Res., 2009, 83(3), 527–535.
15 M. A. Rossi, Connective tissue skeleton in the normal left ventricle and in hypertensive left ventricular hypertrophy and chronic chagasic myocarditis, Med. Sci. Monit., 2001, 7, 820–832.
16 C. Jourdan-Lesaux, J. Zhang and M. L. Lindsey, Extracellular matrix roles during cardiac repair, Life Sci., 2010, 87, 391–400.
17 M. Votteler, P. J. Kluger, H. Walles and K. Schenke-Layland, Stem Cell Microenvironments – Unveiling the Secret of How Stem Cell Fate is Defined, Macromol. Biosci., 2010, 10, 1302–1315.
18 M. Tallawi, R. Rai, A. R. Boccaccini and K. Aifantis, Effect of substrate mechanics on cardiomyocyte maturation and growth, Tissue Eng., Part B, 2014, 21(1), 157–165.
19 J. R. Venugopal, M. P. Prabhakaran, S. Mukherjee, R. Ravichandran, K. Dan and S. Ramakrishna, Biomaterial strategies for alleviation of myocardial infarction, J. R. Soc., Interface, 2012, 9(66), 1–19.
20 P. Camelliti, T. K. Borg and P. Kohl, Structural and functional characterisation of cardiac fibroblasts, Cardiovasc. Res., 2005, 65, 40–51.
21 N. A. M. Bax, M. H. van Marion, B. Shah, M.-J. Goumans, C. V. C. Bouten and D. W. J. van der Schaft, Matrix production and remodeling capacity of cardiomyocyte progenitor cells during in vitro differentiation, J. Mol. Cell. Cardiol., 2012, 53, 497–508.
22 H. Sato, M. Takahashi, H. Ise, A. Yamada, S. I. Hirose and Y. I. Tagawa, et al., Collagen synthesis is required for ascorbic acid-enhanced differentiation of mouse embryonic stem cells into cardiomyocytes, Biochem. Biophys. Res. Commun., 2006, 342(1), 107–112.
23 J. Glowacki and S. Mizuno, Collagen scaffolds for tissue engineering, Biopolymers, 2008, 89(5), 338–344.
24 R. J. Wenstrup, J. B. Florer, E. W. Brunskill, S. M. Bell, I. Chervoneva and D. E. Birk, Type V collagen controls the initiation of collagen fibril assembly, J. Biol. Chem., 2004, 279(51), 53331–53337.
25 Y. Matsui, J. Morimoto and T. Uede, Role of matricellular proteins in cardiac tissue remodeling after myocardial infarction, World J. Biol. Chem., 2010, 1(5), 69–80.
26 K. Imanaka-Yoshida, Tenascin-C in Cardiovascular Tissue Remodeling, Circ. J., 2012, 76(11), 2513–2520.
27 K. V. T. Engebretsen, A. Waehre, J. L. Bjørnstad, B. Skrbic, I. Sjaastad and D. Behmen, et al., Decorin, lumican, and their GAG chain-synthesizing enzymes are regulated in myocardial remodeling and reverse remodeling in the mouse, J. Appl. Physiol., 2013, 114(8), 988–997.
28 C. González-García, M. Cantini, D. Moratal, G. Altankov and M. Salmerón-Sánchez, Vitronectin alters fibronectin organization at the cell-material interface, Colloids Surf., B, 2013, 111, 618–625.
29 F. G. Spinale, Myocardial Matrix Remodeling and Matrix Metalloproteinases: Influence on Cardiac Form and Function, Physiol. Rev., 2007, 87, 1285–1342.
30 E. Hohenester and P. D. Yurchenco, Laminins in basement membrane assembly, Cell Adhes. Migr., 2013, 7(1), 56–63.
31 A. E. Chung, L.-J. Dong, C. Wu and M. E. Durkin, Biological functions of entactin, Kidney Int., 1993, 43(1), 13–19.
32 M. C. Farach-Carson and D. D. Carson, Perlecan – a multifunctional extracellular proteoglycan scaffold, Glycobiology, 2007, 17(9), 897–905.
33 M. Nakahama, T. Murakami, S. Kusachi, I. Naito, K. Takeda and H. Ohnishi, et al., Expression of perlecan proteoglycan in the infarct zone of mouse myocardial infarction, J. Mol. Cell. Cardiol., 2000, 32(6), 1087–1100.
34 M. Rienks, A. P. Papageorgiou, N. G. Frangogiannis and S. Heymans, Myocardial extracellular matrix: An ever-changing and diverse entity, Circ. Res., 2014, 114(5), 872–888.
35 B. Alberts, A. Johnson, J. Lewis, M. Raff, K. Roberts and P. Walter, Molecular Biology of the Cell, Taylor & Francis Inc, 2007.
36 B. P. Purcell, J. A. Elser, A. Mu, K. B. Margulies and J. Burdick, Synergistic effects of SDF-1 alpha chemokine and hyaluronic acid release from degradable hydrogels on directing bone marrow derived cell homing to the myocardium, Biomaterials, 2012, 33(31), 7849–7857.
37 S. P. Evanko, M. I. Tammi, R. H. Tammi and T. N. Wight, Hyaluronan-dependent pericellular matrix, Adv. Drug Delivery Rev., 2007, 59(13), 1351–1365.
38 Z. Kassiri and R. Khokha, Myocardial extra-cellular matrix and its regulation by metalloproteinases and their inhibitors, Thromb. Haemostasis, 2005, 93, 212–219.
39 T. N. Wight, Versican: A versatile extracellular matrix proteoglycan in cell biology, Curr. Opin. Cell Biol., 2002, 14(5), 617–623.
40 M. K. B. Zanin, J. Bundy, H. Ernst, A. Wessels, S. J. Conway and S. Hoffman, Distinct Spatial and Temporal Distributions of Aggrecan and Versican in the Embryonic Chick Heart, Anat. Rec., 1999, 256, 366–380.
41 S. P. Evanko, S. Potter-Perigo, P. Y. Johnson and T. N. Wight, Organization of hyaluronan and versican in the extracellular matrix of human fibroblasts treated with the viral mimetic poly I:C, J. Histochem. Cytochem., 2009, 57(11), 1041–1060.
42 Y. Murasawa, K. Watanabe, M. Yoneda, M. Zako, K. Kimata and L. Y. Sakai, et al., Homotypic versican G1 domain interactions enhance hyaluronan incorporation into fibrillin microfibrils, J. Biol. Chem., 2013, 288(40), 29170–29181.
43 A. P. Spicer, A. Joo and R. A. Bowling, A hyaluronan binding link protein gene family whose members are physically linked adjacent to chondroitin sulfate proteoglycan core protein genes: the missing links, J. Biol. Chem., 2003, 278(23), 21083–21091.
44 Y. Ito, S. Seno, H. Nakamura, A. Fukui and M. Asashima, XHAPLN3 plays a key role in cardiogenesis by maintaining the hyaluronan matrix around heart anlage, Dev. Biol., 2008, 319(1), 34–45.
45 E. Tous, B. Purcell, J. L. Ifkovits and J. A. Burdick, Injectable acellular hydrogels for cardiac repair, J. Cardiovasc. Transl. Res., 2011, 4(5), 528–542.
46 S. Corda, J. L. Samuel and L. Rappaport, Extracellular matrix and growth factors during heart growth, Heart Failure Rev., 2000, 5(2), 119–130.
47 H. K. Awada, M. P. Hwang and Y. Wang, Towards comprehensive cardiac repair and regeneration after myocardial infarction: Aspects to consider and proteins to deliver, Biomaterials, 2016, 82, 94–112.
48 P. Carmeliet and R. K. Jain, Molecular mechanisms and clinical applications of angiogenesis, Nature, 2011, 473(7347), 298–307.
49 J. P. G. Sluijter, G. Condorelli, S. M. Davidson, F. B. Engel, P. Ferdinandy and D. J. Hausenloy, et al., Novel therapeutic strategies for cardioprotection, Pharmacol. Ther., 2014, 144(1), 60–70.
50 B. Yang, D. Li, M. I. Phillips, P. Mehta and J. L. Mehta, Myocardial angiotensin II receptor expression and ischemia-reperfusion injury, Vasc. Med., 1998, 3(2), 121–130.
51 D. I. Bromage, S. M. Davidson and D. M. Yellon, Stromal derived factor 1α: A chemokine that delivers a two-pronged defence of the myocardium, Pharmacol. Ther., 2014, 143(3), 305–315.
52 W. M. Blankesteijn, E. Creemers, E. Lutgens, J. P. Cleutjens, M. J. Daemen and J. F. Smits, Dynamics of cardiac wound healing following myocardial infarction: Observations in genetically altered mice, Acta Physiol. Scand., 2001, 173(1), 75–82.
53 M. M. Ulrich, M. Janssena, M. J. Daemen, L. Rappaport, J. L. Samuel and F. Contard, et al., Increased expression of fibronectin isoforms after myocardial infarction in rats, J. Mol. Cell. Cardiol., 1997, 29(9), 2533–2543.
54 P. E. Shamhart and J. G. Meszaros, Non-fibrillar collagens: Key mediators of post-infarction cardiac remodeling?, J. Mol. Cell. Cardiol., 2010, 48(3), 530–537.
55 M. Lichtenauer, M. Mildner, A. Baumgartner, M. Hasun, G. Werba and L. Beer, et al., Intravenous and intramyocardial injection of apoptotic white blood cell suspensions prevents ventricular remodelling by increasing elastin expression in cardiac scar tissue after myocardial infarction, Basic Res. Cardiol., 2011, 106(4), 645–655.
56 K. B. Gupta, M. B. Ratcliffe, M. A. Fallert, L. H. Edmunds and D. K. Bogen, Changes in passive mechanical stiffness of myocardial tissue with aneurysm formation, Circulation, 1994, 89(5), 2315–2326.
57 J. Zhu and R. E. Marchant, Design properties of hydrogel tissue-engineering scaffolds, Expert Rev. Med. Devices, 2011, 8(5), 607–626.
58 T. J. Sanborn, P. B. Messersmith and A. E. Barron, In situ crosslinking of a biomimetic peptide-PEG hydrogel via thermally triggered activation of factor XIII, Biomaterials, 2002, 23(13), 2703–2710.
59 S. Dobner, D. Bezuidenhout, P. Govender, P. Zilla and N. Davies, A Synthetic Non-degradable Polyethylene Glycol Hydrogel Retards Adverse Post-infarct Left Ventricular Remodeling, J. Card. Failure, 2009, 15(7), 629–636.
60 M. Bracher, D. Bezuidenhout, M. P. Lutolf, T. Franz, M. Sun and P. Zilla, et al., Cell specific ingrowth hydrogels, Biomaterials, 2013, 34(28), 6797–6803.
61 K. Kadner, S. Dobner, T. Franz, D. Bezuidenhout, M. S. Sirry and P. Zilla, et al., The beneficial effects of deferred delivery on the efficiency of hydrogel therapy post myocardial infarction, Biomaterials, 2012, 33(7), 2060–2066.
62 A. A. Rane, J. S. Chuang, A. Shah, D. P. Hu, N. D. Dalton and Y. Gu, et al., Increased infarct wall thickness by a bio-inert material is insufficient to prevent negative left ventricular remodeling after myocardial infarction, PLoS One, 2011, 6, 6.
63 K. L. Fujimoto, Z. Ma, D. M. Nelson, R. Hashizume, J. Guan and K. Tobita, et al., Synthesis, characterization and therapeutic efficacy of a biodegradable, thermoresponsive hydrogel designed for application in chronic infarcted myocardium, Biomaterials, 2009, 30(26), 4357–4368.
64 J. L. Ifkovits, E. Tous, M. Minakawa, M. Morita, J. D. Robb and K. J. Koomalsingh, et al., Injectable hydrogel properties influence infarct expansion and extent of postinfarction left ventricular remodeling in an ovine model, Proc. Natl. Acad. Sci. U. S. A., 2010, 107(25), 11507–11512.
65 J. A. Rowley, G. Madlambayan and D. J. Mooney, Alginate hydrogels as synthetic extracellular matrix materials, Biomaterials, 1999, 20(1), 45–53.
66 J. Leor, S. Tuvia, V. Guetta, F. Manczur, D. Castel and U. Willenz, et al., Intracoronary Injection of In Situ Forming Alginate Hydrogel Reverses Left Ventricular Remodeling After Myocardial Infarction in Swine, J. Am. Coll. Cardiol., 2009, 54(11), 1014–1023.
67 Z. Ye, Y. Zhou, H. Cai and W. Tan, Myocardial regeneration: Roles of stem cells and hydrogels, Adv. Drug Delivery Rev., 2011, 63(8), 688–697.
68 N. F. Huang, J. Yu, R. Sievers, S. Li and R. J. Lee, Injectable Biopolymers Enhance Angiogenesis after Myocardial Infarction, Tissue Eng., 2005, 11(11–12), 1860–1866.
69 D. G. Wallace and J. Rosenblatt, Collagen gel systems for sustained delivery and tissue engineering, Adv. Drug Delivery Rev., 2003, 55(12), 1631–1649.
70 J. W. Weisel, The mechanical properties of fibrin for basic scientists and clinicians, Biophys. Chem., 2004, 112(2–3), 267–276.
71 A. Mol, M. I. Van Lieshout, C. G. Dam-De Veen, S. Neuenschwander, S. P. Hoerstrup and F. P. T. Baaijens, et al., Fibrin as a cell carrier in cardiovascular tissue engineering applications, Biomaterials, 2005, 26(16), 3113–3121.
72 C. S. Hughes, L. M. Postovit and G. A. Lajoie, Matrigel: a complex protein mixture required for optimal growth of cell culture, Proteomics, 2010, 10, 1886–1890.
73 H. K. Kleinman and G. R. Martin, Matrigel: Basement membrane matrix with biological activity, Semin. Cancer Biol., 2005, 15, 378–386.
74 A. P. G. Castro, P. Laity, M. Shariatzadeh, C. Wittkowske, C. Holland and D. Lacroix, Combined numerical and experimental biomechanical characterization of soft collagen hydrogel substrate, J. Mater. Sci.: Mater. Med., 2016, 27(4), 79.
75 K. L. Christman, H. H. Fok, R. E. Sievers, Q. Fang and R. J. Lee, Fibrin Glue Alone and Skeletal Myoblasts in a Fibrin Scaffold Preserve Cardiac Function after Myocardial Infarction, Tissue Eng., 2004, 10(3/4), 403–409.
76 R. M. Wang and K. L. Christman, Decellularized myocardial matrix hydrogels: In basic research and preclinical studies, Adv. Drug Delivery Rev., 2015.
77 S. B. Seif-Naraghi, J. M. Singelyn, M. A. Salvatore, K. G. Osborn, J. J. Wang and U. Sampat, et al., Safety and Efficacy of an Injectable Extracellular Matrix Hydrogel for Treating Myocardial Infarction, Sci. Transl. Med., 2013, 5(173), 173ra25.
78 J. M. Singelyn, P. Sundaramurthy, T. D. Johnson, P. J. Schup-Magoffin, D. P. Hu and D. M. Faulk, et al., Catheter-Deliverable Hydrogel Derived From Decellularized Ventricular Extracellular Matrix Increases Endogenous Cardiomyocytes and Preserves Cardiac Function Post-Myocardial Infarction, J. Am. Coll. Cardiol., 2012, 59(8), 751–763.
79 C. L. Hastings, E. T. Roche, E. Ruiz-Hernandez, K. Schenke-Layland, C. J. Walsh and G. P. Duffy, Drug and cell delivery for cardiac regeneration, Adv. Drug Delivery Rev., 2014, 84, 85–106.
80 S. E. Epstein, S. Fuchs, Y. F. Zhou, R. Baffour and R. Kornowski, Therapeutic interventions for enhancing collateral development by administration of growth factors: Basic principles, early results and potential hazards, Cardiovasc. Res., 2001, 532–542.
81 J. C. Garbern, E. Minami, P. S. Stayton and C. E. Murry, Delivery of basic fibroblast growth factor with a pH-responsive, injectable hydrogel to improve angiogenesis in infarcted myocardium, Biomaterials, 2011, 32(9), 2407–2416.
82 T. Nakamura, K. Sakai, T. Nakamura and K. Matsumoto, Hepatocyte growth factor twenty years on: Much more than a growth factor, J. Gastroenterol. Hepatol., 2011, 26, 188–202.
83 S. B. Sonnenberg, A. A. Rane, C. J. Liu, N. Rao, G. Agmon and S. Suarez, et al., Delivery of an engineered HGF fragment in an extracellular matrix-derived hydrogel prevents negative LV remodeling post-myocardial infarction, Biomaterials, 2015, 45, 56–63.
84 X. Hao, E. Silva, A. Mansson-Broberg, K.-H. Grinnemo, A. J. Siddigui and G. Dellgren, et al., Angiogenic effects of sequential release of VEGF-A165 and PDGF-BB with alginate hydrogels after myocardial infarction, Cardiovasc. Res., 2007, 75(1), 178–185.
85 E. Ruvinov, J. Leor and S. Cohen, The promotion of myocardial repair by the sequential delivery of IGF-1 and HGF from an injectable alginate biomaterial in a model of acute myocardial infarction, Biomaterials, 2011, 32(2), 565–578.
86 M. M. C. Bastings, S. Koudstaal, R. E. Kieltyka, Y. Nakano, A. C. H. Pape and D. A. M. Feyen, et al., A Fast pH-Switchable and Self-Healing Supramolecular Hydrogel Carrier for Guided, Local Catheter Injection in the Infarcted Myocardium, Adv. Healthcare Mater., 2014, 3(1), 70–78.
87 P. Y. W. Dankers, T. M. Hermans, T. W. Baughman, Y. Kamikawa, R. E. Kieltyka and M. M. C. Bastings, et al., Hierarchical Formation of Supramolecular Transient Networks in Water: A Modular Injectable Delivery System, Adv. Mater., 2012, 24(20), 2703–2709.
88 W.-N. Lu, S.-H. Lü, H.-B. Wang, D.-X. Li, C.-M. Duan and Z.-Q. Liu, et al., Functional Improvement of Infarcted Heart by Co-Injection of Embryonic Stem Cells with Temperature-Responsive Chitosan Hydrogel, Tissue Eng., Part A, 2009, 15(6), 1437–1447.
89 L. A. Reis, L. L. Y. Chiu, Y. Liang, K. Hyunh, A. Momen and M. Radisic, A peptide-modified chitosan-collagen hydrogel for cardiac cell culture and delivery, Acta Biomater., 2012, 8(3), 1022–1036.
90 F. Rask, S. M. Dallabrida, N. S. Ismail, Z. Amoozgar, Y. Yeo and M. A. Rupnick, et al., Photocrosslinkable chitosan modified with angiopoietin-1 peptide, QHREDGS, promotes survival of neonatal rat heart cells, J. Biomed. Mater. Res., Part A, 2010, 95(1), 105–117.
91 L. A. Reis, L. L. Y. Chiu, J. Wu, N. Feric, C. Laschinger and A. Momen, et al., Hydrogels With Integrin-Binding Angiopoietin-1-Derived Peptide, QHREDGS, for Treatment of Acute Myocardial Infarction, Circ.: Heart Failure, 2015, 8(2), 333–341.
92 C.-H. Chen, M.-Y. Chang, S.-S. Wang and P. C. H. Hsieh, Injection of autologous bone marrow cells in hyaluronan hydrogel improves cardiac performance after infarction in pigs, Am. J. Physiol.: Heart Circ. Physiol., 2014, 306(7), H1078–H1086.
93 T. Wang, X.-J. Jiang, Q.-Z. Tang, X.-Y. Li, T. Lin and D.-Q. Wu, et al., Bone marrow stem cells implantation with alpha-cyclodextrin/MPEG-PCL-MPEG hydrogel improves cardiac function after myocardial infarction, Acta Biomater., 2009, 5(8), 2939–2944.
94 J. Zhou and H. Ritter, Cyclodextrin functionalized polymers as drug delivery systems, Polym. Chem., 2010, 1(10), 1552.
95 D. Wu, T. Wang, B. Lu, X. Xu, X. Jiang and X. Zhang, et al., Fabrication of Supramolecular Hydrogels for Drug Delivery and Stem Cell Encapsulation, Langmuir, 2008, 24(19), 10306–10312.
96 V. F. M. Segers, T. Tokunou, L. J. Higgins, C. MacGillivray, J. Gannon and R. T. Lee, Local Delivery of Protease-Resistant Stromal Cell Derived Factor-1 for Stem Cell Recruitment After Myocardial Infarction, Circulation, 2007, 116(15), 1683–1692.
97 M. Song, H. Jang, J. Lee, J. H. Kim, S. H. Kim and K. Sun, et al., Regeneration of chronic myocardial infarction by injectable hydrogels containing stem cell homing factor SDF-1 and angiogenic peptide Ac-SDKP, Biomaterials, 2014, 35(8), 2436–2445.
98 D. Projahn, S. Simsekyilmaz, S. Singh, I. Kanzler, B. K. Kramp and M. Langer, et al., Controlled intramyocardial release of engineered chemokines by biodegradable hydrogels as a treatment approach of myocardial infarction, J. Cell. Mol. Med., 2014, 18(5), 790–800.
99 B. Q. Le, H. Fernandes, C. V. C. Bouten, M. Karperien, C. van Blitterswijk and J. de Boer, High-Throughput Screening Assay for the Identification of Compounds Enhancing Collagenous Extracellular Matrix Production by ATDC5 Cells, Tissue Eng., Part C, 2015, 21(7), 726–736.
100 M. C. Prewitz, F. P. Seib, M. von Bonin, J. Friedrichs, A. Stißel and C. Niehage, et al., Tightly anchored tissue-mimetic matrices as instructive stem cell microenvironments, Nat. Methods, 2013, 10(8), 788–794.
