Development of Selective DNA-Interacting Ligands: Understanding the Function of Non-canonical DNA Structures [1st ed.] 9789811577154, 9789811577161

This book addresses the development of both DNA-sequence-selective and DNA-form-selective ligands, with the aim of creat


English Pages XIII, 111 [120] Year 2020

Table of contents :
Front Matter ....Pages i-xiii
Introduction (Sefan Asamitsu)....Pages 1-44
Sequence-Specific DNA Alkylation and Transcriptional Inhibition by Long-Chain Hairpin Pyrrole–Imidazole Polyamide–Chlorambucil Conjugates Targeting CAG/CTG Trinucleotide Repeats (Sefan Asamitsu)....Pages 45-67
Ligand-Mediated G-Quadruplex Induction in a Double-Stranded DNA Context by Cyclic Imidazole/Lysine Polyamide (Sefan Asamitsu)....Pages 69-83
Simultaneous Binding of Hybrid Molecules Constructed with Dual DNA-Binding Components to a G-Quadruplex and Its Proximal Duplex (Sefan Asamitsu)....Pages 85-109
Back Matter ....Pages 111-111


Springer Theses Recognizing Outstanding Ph.D. Research

Sefan Asamitsu

Development of Selective DNA-Interacting Ligands Understanding the Function of Non-canonical DNA Structures


Aims and Scope The series “Springer Theses” brings together a selection of the very best Ph.D. theses from around the world and across the physical sciences. Nominated and endorsed by two recognized specialists, each published volume has been selected for its scientific excellence and the high impact of its contents for the pertinent field of research. For greater accessibility to non-specialists, the published versions include an extended introduction, as well as a foreword by the student’s supervisor explaining the special relevance of the work for the field. As a whole, the series will provide a valuable resource both for newcomers to the research fields described, and for other scientists seeking detailed background information on special questions. Finally, it provides an accredited documentation of the valuable contributions made by today’s younger generation of scientists.

Theses are accepted into the series by invited nomination only and must fulfill all of the following criteria:
• They must be written in good English.
• The topic should fall within the confines of Chemistry, Physics, Earth Sciences, Engineering and related interdisciplinary fields such as Materials, Nanoscience, Chemical Engineering, Complex Systems and Biophysics.
• The work reported in the thesis must represent a significant scientific advance.
• If the thesis includes previously published material, permission to reproduce this must be gained from the respective copyright holder.
• They must have been examined and passed during the 12 months prior to nomination.
• Each thesis should include a foreword by the supervisor outlining the significance of its content.
• The theses should have a clearly defined structure including an introduction accessible to scientists not expert in that particular field.

More information about this series at http://www.springer.com/series/8790

Sefan Asamitsu

Development of Selective DNA-Interacting Ligands Understanding the Function of Non-canonical DNA Structures Doctoral Thesis accepted by Kyoto University, Kyoto, Japan


Author Dr. Sefan Asamitsu Kumamoto University Kumamoto, Japan

Supervisor Prof. Hiroshi Sugiyama Kyoto University Kyoto, Japan

ISSN 2190-5053 ISSN 2190-5061 (electronic) Springer Theses ISBN 978-981-15-7715-4 ISBN 978-981-15-7716-1 (eBook) https://doi.org/10.1007/978-981-15-7716-1 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2020 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

Supervisor’s Foreword

In the past two decades, non-canonical DNA structures, that is, structures other than the canonical B-form duplex, have been shown to have profound implications for various biological, neurological, and pharmacological events. In parallel, ligands that interact with non-canonical DNA structures have been intensively developed with the aim of creating potential molecular probes and therapeutic agents for human diseases. In the present thesis, Dr. Sefan Asamitsu has made a great effort to develop DNA-sequence-selective and DNA-form-selective ligands, built on a DNA-binding pyrrole–imidazole polyamide (PIP) scaffold, toward understanding the function of non-canonical DNA structures. In Chap. 1, he systematically and exhaustively summarizes the biological roles of non-canonical DNA structures formed by trinucleotide repeat sequences (hairpin form) and guanine-rich sequences (G-quadruplex form), both of which are implicated in the pathology of neurodegenerative diseases and/or human cancers. He also describes recent research on synthetic ligands that interact with these DNA forms and modulate biological events, focusing particularly on ligands with clear preferences for particular DNA forms, their molecular design, and their pharmaceutical applications. Chapter 2 addresses the development of a sequence-specific DNA-binding PIP that targets trinucleotide repeat sequences and selectively represses the pathogenic RNA transcripts. Expansion mutations of trinucleotide repeat sequences are associated with hereditary disorders, including Huntington’s disease and myotonic dystrophy type 1. The study described in this chapter provides a proof of concept that designer synthetic ligands can target pathogenic trinucleotide repeat DNA with high specificity and may offer a new therapeutic strategy for hereditary disorders. Chapter 3 describes a new G-quadruplex-binding molecule based on the PIP scaffold, termed cyclic imidazole/lysine polyamide (cIKP).
cIKP was successfully designed to have affinity for G-quadruplex structures while lacking affinity for the normal DNA duplex. It is worth mentioning that this study is the first report that a PIP scaffold can be tuned to acquire specificity for a non-canonical DNA structure, and cIKP may be further developed as a molecular probe of G-quadruplex structures in living cells. Building on the studies described in Chaps. 2 and 3, he took on the challenge of creating a synthetic ligand that can selectively target designated G-quadruplexes at specific genome locations, which no one had accomplished before. Chapter 4 addresses this issue and advocates a new ligand design strategy, in which a G-quadruplex-specific molecule (cIKP) is covalently conjugated with a duplex-specific molecule (PIP) to create PIP-cIKP hybrids that can read out the specific duplex sequence adjacent to a designated quadruplex in the genome. A series of systematic and detailed binding assays described in Chap. 4 demonstrates that simultaneous recognition of a G-quadruplex and its proximal duplex by PIP-cIKP hybrids may provide a new strategy for designing ligands capable of targeting a large variety of designated quadruplexes at specific genome locations. The contributions of this thesis bear on DNA chemical biology and on medical and pharmaceutical research across fields, and give us a deeper understanding of nucleic acid structures and their targeting ligands for various applications. Further, this thesis is so systematic and informative that even non-specialists can follow it, while it maintains a high standard of scientific content.

Kyoto, Japan, June 2020

Prof. Hiroshi Sugiyama

Parts of this thesis have been published in the following journal articles:
1. Asamitsu S, Obata S, Yu Z, Bando T, Sugiyama H (2019) Recent progress of targeted G-quadruplex-preferred ligands toward cancer therapy. Molecules 24:429.
2. Asamitsu S, Bando T, Sugiyama H (2019) Ligand Design to Acquire Specificity to Intended G-Quadruplex Structures. Chemistry–A European Journal 25:417–430.
3. Asamitsu S, Obata S, Phan AT, Hashiya K, Bando T, Sugiyama H (2018) Simultaneous Binding of Hybrid Molecules Constructed with Dual DNA-Binding Components to a G-Quadruplex and Its Proximal Duplex. Chemistry–A European Journal 24:4428–4435.
4. Asamitsu S, Li Y, Bando T, Sugiyama H (2016) Ligand-Mediated G-Quadruplex Induction in a Double-Stranded DNA Context by Cyclic Imidazole/Lysine Polyamide. ChemBioChem 17:1317–1322.
5. Asamitsu S, Kawamoto Y, Hashiya F, Hashiya K, …, Bando T, Sugiyama H (2014) Sequence-specific DNA alkylation and transcriptional inhibition by long-chain hairpin pyrrole–imidazole polyamide–chlorambucil conjugates targeting CAG/CTG trinucleotide repeats. Bioorganic & Medicinal Chemistry 22:4646–4657.


Acknowledgements

Firstly, I would like to express my sincere gratitude to my supervisor, Prof. Hiroshi Sugiyama, for his continuous support of my study and research, and for his patience, motivation, insightful suggestions, immense knowledge, and constant encouragement. I could not have imagined having such a wonderful supervisor for my study. I would like to express my special gratitude to Dr. Toshikazu Bando for his insightful comments, discussions, and carefully considered feedback. I would also like to thank Dr. Soyoung Park, Dr. Masayuki Endo, and Dr. Namasivayam Ganesh Pandian for their kind support of my study. I am grateful to Yasuko Niimi and Takako Futamata for their kindness and help during my research period. I would like to thank Kaori Hashiya for her help with the solid-phase synthesis of polyamides and Shinsuke Sato for his technical support with cell experiments. I would like to express my thanks to Dr. Yue Li, Dr. Seiichiro Kizaki, Makoto Yamamoto, Dr. Yusuke Kawamoto, Fumitaka Hashiya, and Shunsuke Obata for their contributions to my work. I would also like to express my gratitude to Prof. Anh Tuân Phan for kindly accepting me into his lab as a visiting scholar and for his valuable comments and suggestions. I am grateful to all current and previous lab members for their support during my research period. I am grateful for the financial support of a JSPS Research Fellowship for Young Scientists. Finally, I am profoundly thankful to my parents for their constant love, encouragement, and support.


Contents

1 Introduction
1.1 DNA-Binding Molecules
1.1.1 Naturally Occurring Molecules
1.1.2 Pyrrole–Imidazole Polyamides
1.1.3 Trinucleotide Repeat-Targeting Molecules
1.2 G-Quadruplex
1.2.1 Structures and Biological Significance of G-Quadruplexes
1.2.2 G-Quadruplexes and Cancers
1.2.3 G-Quadruplex-Interacting Ligands
1.2.4 Addressing the Specificity of Ligands to Particular G-Quadruplexes
1.3 Conclusion and Future Prospects
References

2 Sequence-Specific DNA Alkylation and Transcriptional Inhibition by Long-Chain Hairpin Pyrrole–Imidazole Polyamide–Chlorambucil Conjugates Targeting CAG/CTG Trinucleotide Repeats
2.1 Introduction
2.2 Results and Discussion
2.2.1 Molecular Design and Synthesis
2.2.2 DNA-Alkylating Activity of Conjugates 1–4
2.2.3 The Influence of Conjugate 4 Over Transcription
2.2.4 Binding Properties of the Parent Polyamides to Target Hairpin DNA
2.3 Conclusion
2.4 Materials and Methods
2.4.1 General
2.4.2 Syntheses of Compound 5–10
2.4.3 Syntheses of Parent PI Polyamides and Their Chlorambucil Conjugates
2.4.4 Preparation of Plasmid Containing (CAG/CTG)12 Repeat Sequences
2.4.5 Preparation of 5′-Texas Red-Labeled DNA Fragments and High-Resolution Gel Electrophoresis
2.4.6 In Vitro Transcription Assays
2.4.7 Quantitative SPR-Binding Assays
References

3 Ligand-Mediated G-Quadruplex Induction in a Double-Stranded DNA Context by Cyclic Imidazole/Lysine Polyamide
3.1 Introduction
3.2 Results and Discussion
3.2.1 Molecular Design and Synthesis
3.2.2 SPR-Binding Assays
3.2.3 CD Spectra Analysis
3.2.4 Induction of G4 Formation in DsDNA Context
3.3 Conclusion
3.4 Materials and Methods
3.4.1 General
3.4.2 Synthesis of cIKP (1)
3.4.3 SPR-Binding Experiments
3.4.4 CD Spectra Measurements
3.4.5 Native Gel Electrophoresis Analysis
References

4 Simultaneous Binding of Hybrid Molecules Constructed with Dual DNA-Binding Components to a G-Quadruplex and Its Proximal Duplex
4.1 Introduction
4.2 Results and Discussion
4.2.1 DNA Substrate and Molecular Design
4.2.2 Characterization of Binding Properties to Individual Segments
4.2.3 Design and Synthesis of Hybrid Molecules
4.2.4 Recognition of Quadruplex by Hybrid Molecules
4.2.5 Dual Recognition of a Quadruplex/Duplex by a Single Hybrid Compound
4.3 Conclusion
4.4 Materials and Methods
4.4.1 General
4.4.2 Synthesis of HPIP1–4 and Hybrids 1–3
4.4.3 NMR Spectroscopy
4.4.4 UV-Melting Assays
4.4.5 Circular Dichroism (CD) Titration Assays
4.4.6 Thiazole Orange (TO) Displacement Assays
4.4.7 Docking Analysis
4.4.8 CD Melting Assays
4.4.9 Fluorescence Resonance Energy Transfer (FRET) Melting Assays
4.4.10 SPR-Binding Assays
References

Curriculum Vitae

Chapter 1

Introduction

Abstract Deoxyribonucleic acid (DNA) is a biomacromolecule that carries the genetic information of living organisms. DNA contains four bases, adenine (A), thymine (T), cytosine (C), and guanine (G), and exists stably in the nucleus as a double helix formed through Watson–Crick base pairs. Over the past two decades, the secondary structures of DNA have been shown to have profound implications for various biological, neurological, and pharmacological events. In addition, intensive studies have been performed to create synthetic ligands that interact with the secondary structures of DNA and affect a specific transcriptional process, with the aim of developing potential molecular probes and therapeutic agents for human diseases. This chapter summarizes the biological significance of DNA structures beyond the Watson–Crick structure and the ligands that interact with them.

Keywords Deoxyribonucleic acid (DNA) · Non-canonical DNA · DNA-binding ligands · Human diseases

1.1 DNA-Binding Molecules

1.1.1 Naturally Occurring Molecules

Deoxyribonucleic acid (DNA) is a biomacromolecule that carries the genetic information of living organisms. DNA contains four bases, adenine (A), thymine (T), cytosine (C), and guanine (G), and exists stably in the nucleus as a double helix formed through Watson–Crick base pairs (Fig. 1.1). While the primary genetic information, the DNA base sequence, is virtually identical in all cells of an individual organism, the expression pattern of genes is diverse and depends on the organ, tissue type, and cell lineage, and varies even at the level of a single cell. Thanks to extensive research that has continued since researchers first asked what DNA is all about, we now know that the diversity of gene expression is precisely governed by specific protein–DNA interactions, protein–protein interactions, and their cooperative combinations, which are primarily based on the DNA base sequence, structure, and modification pattern.

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2020 S. Asamitsu, Development of Selective DNA-Interacting Ligands, Springer Theses, https://doi.org/10.1007/978-981-15-7716-1_1


Fig. 1.1 B-form double helical DNA and Watson–Crick base pairs

DNA-binding molecules can influence such tightly governed gene expression systems through direct interaction with their target DNA. In the early studies of DNA-binding molecules, biophysical chemists found that some antibiotics, such as chromomycin, actinomycin D, netropsin, distamycin A, and the calicheamicin oligosaccharide, have sequence-specific DNA-binding properties, and the DNA–drug complexes were elucidated at the atomic level by NMR and X-ray crystallography (Fig. 1.2). Chromomycin was shown to bind preferentially to GC-rich sequences of duplex DNA and to inhibit RNA synthesis. The binding mode of the drug complexed with a 5′-AAGGCCTT-3′ duplex DNA was solved at the atomic level by NMR analysis (Fig. 1.2a) [1]. Two chromomycin molecules face each other side by side in an antiparallel orientation, and the dimer, with one magnesium ion at its center, molds adaptively into the minor groove to generate a symmetrical complex. Interestingly, a later study revealed that chromomycin can also bind adaptively to an unusual DNA form adopted by a CCG trinucleotide repeat sequence, in which consecutive cytosine residues are ejected [2]. Similarly, actinomycin D shows a binding preference for GC-rich sequences and an antitumor activity that arises from interference with DNA replication and RNA transcription (Fig. 1.2b) [3]. Actinomycin D also has a high binding affinity for the G•G or T•T mismatch-containing hairpin structures formed by CGG or CTG trinucleotide-repeat DNA, respectively [4, 5]. In contrast, netropsin, distamycin A, and N-methylpyrrole-based oligopeptides are classical crescent-shaped minor-groove binders that predominantly recognize AT-rich sequences and enhance the thermal stability of the duplex while minimally perturbing the local structure of the DNA (Fig. 1.2c, d) [6–11]. Initially, solution and crystal structures of the 1:1 complexes were elucidated by NMR and X-ray crystallography [6, 7]. Subsequent atomic-level studies of the complexes of distamycin A with duplex DNAs (5′-CGCAAATTGGC-3′, 5′-CGCAAATTTGCG-3′, and 5′-GTATATAC-3′) showed that distamycin A binds to the duplex in an antiparallel 2:1 mode, in which two distamycin molecules assemble side by side in an antiparallel orientation in the minor groove [8–10]. The sequence-preferential affinity for duplex DNA relies on hydrogen bonds formed at specific sites of the thymine or adenine bases; that is, the amide protons of distamycin A lie in close proximity to the N3 position of adenine or the O2 position of thymine [8–10].

Fig. 1.2 Naturally occurring products that bind to DNAs in a sequence-dependent manner
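The sequence features discussed above can be inspected with a few lines of code. The following is only an illustrative sketch, not part of the original studies: it takes the four duplex sequences quoted in this section, finds the longest uninterrupted A/T tract in each, and checks whether the strand is self-complementary. The function names are hypothetical, and an A/T tract is used here merely as a rough proxy for an "AT-rich" site; it says nothing about actual binding affinity.

```python
import re

# Watson-Crick complement table for DNA (A<->T, C<->G).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a 5'->3' DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_self_complementary(seq: str) -> bool:
    """True if the strand pairs with a second copy of itself."""
    return seq.upper() == reverse_complement(seq)


def longest_at_tract(seq: str) -> str:
    """Longest uninterrupted run of A/T bases (first one if tied)."""
    tracts = re.findall(r"[AT]+", seq.upper())
    return max(tracts, key=len, default="")


# Sequences quoted in the text (written 5'->3').
SEQUENCES = ["AAGGCCTT", "CGCAAATTGGC", "CGCAAATTTGCG", "GTATATAC"]

if __name__ == "__main__":
    for s in SEQUENCES:
        print(s, longest_at_tract(s), is_self_complementary(s))
```

In this toy view, the chromomycin-bound sequence has a GC core flanked by short A/T ends, whereas the distamycin-bound sequences each contain an A/T tract of five or six bases, in line with the qualitative preferences described above.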

1.1.2 Pyrrole–Imidazole Polyamides

The unique binding preferences of the naturally occurring molecules introduced here, as evidenced by NMR and X-ray diffraction analyses, suggested that finely designed synthetic molecules should be able to target a wider repertoire of DNA sequences. Notably, Dervan and colleagues developed distamycin analogs as programmable DNA-binding molecules, named pyrrole–imidazole polyamides (PIPs) [11–13]. PIPs are composed of N-methylpyrrole (P) and N-methylimidazole (I) units linked through amide bonds and can distinguish A-T or T-A base pairs from G-C base pairs with specificity and affinity comparable to those of natural transcription factors (Fig. 1.3) [8, 9, 11, 14]. The mechanism of base-pair discrimination by PIPs was established through a series of atomic-level structural analyses by the groups of Wemmer and Dervan. In the antiparallel binding mode, in which two I-P-P oligoamide molecules are inserted side by side into the minor groove of DNA, the binding orientation of the dimeric oligoamides on the duplex DNA was defined as N → C with respect to the 5′ → 3′ direction of the proximal DNA strand [11]. Moreover, the electron-donating nitrogen at the N3 position of the imidazole ring was shown to form a hydrogen bond with a hydrogen of the exocyclic amine of a guanine residue [11, 14]. This participation of the imidazole moiety in hydrogen bonding confers selectivity for G-C pairs over A-T or T-A pairs. Later structural studies confirmed that other forms of PIPs, including hairpin [12] and cyclic types [13], can also discern A-T or T-A from G-C base pairs, relying on the preferred hydrogen bond formed when an imidazole moiety pairs with a pyrrole on the opposite strand (Fig. 1.3) [15–17].

The rule of base-pair discrimination by PIPs can be summarized as follows: PIPs recognize a C-G base pair through the antiparallel pairing of a pyrrole (P)/imidazole (I) pair, and an A-T or a T-A base pair through the antiparallel pairing of a P/P pair (Fig. 1.4). β-Alanine (β) [18] and γ-aminobutyric acid (GABA), which are alternative constituents of PIPs that relax the overall structure and promote adaptive folding into the hairpin form, respectively, also prefer an A-T or a T-A base pair.

A crucial advance in PIP synthesis was reported in 1996: a method for the solid-phase synthesis (SPS) of PIPs using Boc chemistry, together with the gram-scale synthesis of monomer units without chromatographic purification [19].
As a result, the time required to synthesize one polyamide was shortened from months to days. Subsequent progress in the solid-phase approach using Fmoc chemistry has been reported, with optimized yields and purities [20]. Nowadays, the Fmoc SPS approach is widely adopted for rapid and facile PIP synthesis.

Fig. 1.3 Chemical structure of pyrrole–imidazole polyamide (PIP) and the crystal structure of the complex of a PIP and a duplex DNA (PDB code: 3omj)

Fig. 1.4 Base-pair discrimination by a P-P or I-P pair of PIPs

The sequence-specific binding of PIPs to predetermined sequences permits transcriptional regulation of intended genes, and there have been numerous successful examples to date [21]. Gottesfeld, Dervan, and colleagues first reported a prominent work in which a PIP designed to target the TFIIIA binding site (5′-ATGACT-3′) interfered with 5S RNA gene expression in kidney cells [22]. Since then, the Dervan group has been a leading group in the field of targeted gene regulation by PIPs and continues to offer invaluable insight into PIP design for enhanced specificity and affinity, methodologies for targeting a specific gene of interest, the molecular mechanism of action of PIPs, and in vivo applications (see Dervan’s group website http://dervan.caltech.edu/). A conjugation strategy offers promise for more diverse, site-specific control of gene transcription, in which a functional domain connected to a PIP domain exerts its function at a specific locus in a tunable manner. For instance, a PIP connected to an alkylating agent allows sequence-specific alkylation [23]. Notably, occupation and alkylation of the template strand in the coding region of a gene can arrest the progression of RNA polymerase [24, 25]. Our group reported a PIP-seco-CBI conjugate that directly targets mutant DNA in the KRAS driver oncogene. This molecule selectively alkylated the oncogenic codon 12 mutant DNA in the coding region, repressed KRAS expression, and blocked the downstream RAS signaling pathway (Fig. 1.5) [26]. Consequently, it caused strand cleavage and tumor growth suppression in a xenograft mouse model [26].
Another important oncogenic mutation in the KRAS gene, the codon 13 mutation, was also found to be targetable by a finely designed PIP-seco-CBI conjugate [27].

Fig. 1.5 A PIP-indole-seco-CBI conjugate to target the oncogenic mutant codon 12 in KRAS gene

A PIP domain connected to a chromatin modulator confines the action of the modulator, which would otherwise act across a vast extent of the chromosomes, to the vicinity of the binding sites of the PIP domain, resulting in site-specific epigenetic alteration of gene expression. In previous studies by our group, suberoylanilide hydroxamic acid (SAHA), an inhibitor of histone deacetylases (HDACs), was covalently connected to PIPs targeting different sequences to construct an in-house SAHA-PIP conjugate library (Fig. 1.6a) [28, 29]. The SAHA module tethered to a PIP is expected to loosen the neighboring chromatin by site-specifically inhibiting the deacetylation of histone proteins, thereby triggering transcriptional activation of specific genes. A small screen of the thirty-eight compounds, using a DNA microarray to examine global gene expression profiles, demonstrated that the respective SAHA-PIP conjugates displayed distinct transcription activation patterns in mouse and human fibroblasts, primarily owing to the different DNA recognition by the PIP domains [29]. Functional analysis of the DNA microarray data [29] and subsequent studies [30–33] revealed that each SAHA-PIP conjugate activated a distinct gene network: SAHA-PIP K, A, G, I, X, and L were identified to transcriptionally activate different panels of genes important for gametogenesis [30], pancreatic cells [29], cardiac cells [29], pluripotency [31], retinal cells [32], and neural development [33], respectively (Fig. 1.6b). Analogously, N-(4-chloro-3-(trifluoromethyl)phenyl)-2-ethoxybenzamide (CTB), a known histone acetyltransferase (HAT) activator, exhibits a similar effect when conjugated to a PIP. The PIP I sequence connected to CTB, named CTB I, induced a panel of genes similar to that induced by SAHA I (Fig. 1.7) [34]. These results indicate that the sequence-specific DNA recognition provided by the PIP domain governs the site of action of the chromatin-modulating agents.
The use of a bromodomain inhibitor as the conjugation partner of PIPs was also found to be a suitable option for targeted transcriptional activation (Fig. 1.8a). Ansari and colleagues reported that the expression of frataxin (FXN), which is silenced by repressive, expanded GAA microsatellite repeats (>120 repeats) in Friedreich’s ataxia (FRDA), was selectively activated by treatment with a GAA-repeat-targeting PIP tethered to a bromodomain inhibitor (JQ1), named Syn-TEF1 (Fig. 1.8b). Targeted recruitment of an elongation factor (P-TEFb), concomitant with the assembly of BRD4 across the GAA repeats, restored FXN expression in FRDA patient-derived cells harboring a broad range of repeat expansions [35]. Similarly, a CBP30 derivative, another class of bromodomain inhibitor, was successfully used to design a sequence-specific transcriptional activation tool, named Bi-PIP. The Bi-PIP targets the P300/CBP family of coactivator proteins and causes P300-dependent histone acetylation at specific loci (Fig. 1.8c) [36]. A well-designed in vitro ChIP assay demonstrated that the Bi-PIP efficiently and selectively induced acetylation of histone H3 in a dose- and sequence-dependent manner. The mechanism of action of the Bi-PIP demonstrated by the in vitro assays was also biologically validated: the most upregulated protein-coding gene contained multiple putative binding sites for the PIP domain of the Bi-PIP in its gene body and promoter [36].

Fig. 1.6 SAHA-PIP conjugates that activate a particular gene network

Fig. 1.7 CTB-PIP conjugate (I) that exhibits a similar gene regulation profile compared to the SAHA-PIP conjugate (I) as indicated by a heatmap obtained from DNA microarray analysis

Fig. 1.8 Bromodomain inhibitor-PIP conjugates; a Bromodomain inhibitors, b Syn-TEF1, c Bi-PIP

1.1.3 Trinucleotide Repeat-Targeting Molecules

In the human genome, a vast number of microsatellite repeat sequences are present, and some of them have profound implications in biological and neuropathological contexts [37, 38]. In this section, we highlight trinucleotide microsatellite sequences and their implications in hereditary diseases from the viewpoint of DNA conformations and of the molecules that target them as potential therapeutic agents and diagnostic tools. Expansion mutations are often seen in the human genome, and elongation beyond a defined threshold causes hereditary disorders, including trinucleotide repeat diseases (Fig. 1.9) [39, 40]. Simple trinucleotide repeats, as exemplified by (CNG)n (where N represents any nucleotide) or (GAA)n, tend to expand during DNA replication, repair, and recombination, processes that are likely mediated by the peculiar DNA conformations formed by the repeat sequences. This tendency to expand increases with the length of the repeats and thus causes a decreased age of onset and increased severity in subsequent generations. Mutant RNAs and proteins derived from the expanded repeat DNA regions are directly associated with the pathogenesis of each hereditary disorder [41]. For example, the expansion of repetitive CAG trinucleotides (>36 repeats) within the first exon of the Huntingtin (HTT) gene causes Huntington’s disease [42]. The polyglutamine (PolyQ) tracts derived from the expanded repeats within the full-length HTT protein and the toxic long hairpin RNA structures formed by the repetitive region of HTT transcripts are the primary pathogenic species [43–45]. In myotonic dystrophy type 1 (DM1), transcripts harboring excessive CUG repeats, which are transcribed from CTG repeat sequences situated in the 3′ UTR of the DMPK gene, form aggregated RNA foci within the nucleus by assembling multiple RNA-binding proteins [46, 47].
Those recruited proteins, such as splicing regulators, can no longer function properly, ultimately leading to the pathogenesis of DM1. The CGG repeats

Fig. 1.9 Genomic locations and expansion thresholds of triplet repeats associated with trinucleotide repeat diseases. Repeat expansion exceeding the defined thresholds is the origin of pathogenesis. UTR, untranslated region; SCA, spinocerebellar ataxia

located in the 5′-UTR of the FMR1 gene are associated with Fragile X syndrome (FXS, >200 repeats) and FXTAS (55–200 repeats). While the pathogenesis of FXTAS is thought to be attributable to the ability of RNAs with extended CGG repeats to form multiple higher-order structures [48], the longer CGG repeats in FXS cause epigenetic repression of gene transcription through DNA methylation within the CGG repeats and the 5′-UTR region [49]. The repetitive GAA runs observed in Friedreich’s ataxia (FRDA) act similarly: repressive chromatin across these repetitive genomic regions hinders initiation and elongation by RNA polymerase, thus silencing gene expression [50, 51]. Recently, repeat-associated non-ATG translation (RAN translation) has also been found to be a pathogenic source in (CAG)n, (CGG)n, and other repeat-associated disorders [52, 53]. The common principle underlying these trinucleotide repeat diseases is the unusual structural features of the nucleic acids (Fig. 1.10). In particular, excessive repeat sequences are known to adopt stable hairpin structures [37, 41]. CAG, CTG, and CGG repeat DNAs all adopt a stem-loop structure containing 5′-CXG-3′/3′-GXC-5′ motifs with a single mismatched base pair (Fig. 1.10a, c). As discussed earlier, actinomycin D was found to have considerable affinity for a 5′-CTG-3′/3′-GTC-5′ or 5′-CGG-3′/3′-GGC-5′ mismatched motif, as elucidated by X-ray crystallographic analysis (Fig. 1.11) [4, 5]. Notably, administration of actinomycin D to DM1 (CTG repeat) patient-derived cells and a mouse model led to transcriptional repression, reduced the toxic RNA foci, and rescued the pathogenic splicing defects [54]. Nakatani and colleagues reported a rationally designed synthetic small molecule,

Fig. 1.10 Non-canonical DNA structures formed by expandable repeats; a hairpin formed by CNG repeats, b G-quadruplex (G4) formed by GGGGCC or CGG repeats, c slipped hairpin formed by CTG/CAG repeats, d triple helix formed by GAA/CTT repeats

Fig. 1.11 (Left) crystal structure of a 2:1 actinomycin-d(ATGCTGCAT)2 complex (PDB code: 1mnv); (Right) crystal structure of a 2:1 actinomycin-d(ATGCGGCAT)2 complex (PDB code: 4hiv)

in which naphthyridine and azaquinolone residues are connected through an appropriate hinge (named NA), which can target a 5′-CAG-3′/3′-GAC-5′ motif [55]. NMR structural analysis revealed specific hydrogen-bond-driven interactions between the ligand and the motif, with the mismatched adenine residues flipped out (Fig. 1.12) [55]. Based on this molecular scaffold, ligands targeting T•T and G•G mismatched motifs were also developed, and some of them were shown to arrest the progression of DNA polymerase through ligand-mediated stabilization of those repetitive motifs [56]. Zimmerman and colleagues have created a series of designer molecules that dually target d(CTG•GTC)n and r(CUG•GUC)n and are capable of selectively inhibiting transcription, cleaving the transcripts, and reducing the toxic RNA foci within nuclei (Fig. 1.13) [57]. Chenoweth and colleagues also reported a new class of synthetic molecules targeting the three-way junctions formed by CAG•CTG repeats (Fig. 1.14) [58]. These spatially defined molecules, built around a triptycene core, likely fit into the cavity of the junction, and the three cationic moieties (represented as “R” in Fig. 1.14) would contribute to increased binding affinity by interacting with the anionic DNA backbone. Interestingly, these molecules can modulate the local structure of the DNA, a property that

Fig. 1.12 a Solution structure of a 2:1 NA-d(CTAACAGAATG•CATTCAGTTAG)2 complex (PDB code: 1x26). b Intermolecular hydrogen bonds between NA and guanine or adenine bases

is potentially applicable as a chemical tool for influencing the repeat-driven expansion/contraction and transcription processes.
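
As a rough illustration of the threshold idea described in this section, the following Python sketch counts the longest uninterrupted run of a trinucleotide unit and compares it with the repeat numbers quoted in the text (>36 CAG repeats for HTT; 55–200 and >200 CGG repeats for FMR1). The function names and the toy sequence are hypothetical and are introduced only for illustration; actual thresholds are locus-specific (Fig. 1.9), and interrupted repeats are not handled.

```python
import re

# Repeat numbers quoted in the text; used here only as illustrative cut-offs.
HTT_CAG_THRESHOLD = 36


def longest_repeat_run(seq: str, unit: str) -> int:
    """Return the largest number of back-to-back copies of `unit` in `seq`."""
    seq, unit = seq.upper(), unit.upper()
    # Each match is one uninterrupted run of the unit, e.g. "CAGCAGCAG".
    runs = re.findall(f"(?:{re.escape(unit)})+", seq)
    return max((len(run) // len(unit) for run in runs), default=0)


def classify_cgg(n_repeats: int) -> str:
    """Hypothetical helper: map an FMR1 CGG repeat count to the ranges in the text."""
    if n_repeats > 200:
        return "FXS range (>200)"
    if 55 <= n_repeats <= 200:
        return "FXTAS range (55-200)"
    return "below the ranges quoted"


# Toy sequence (an assumption, not a real HTT exon): 40 CAG units with flanks.
toy = "AA" + "CAG" * 40 + "TT"
n_cag = longest_repeat_run(toy, "CAG")
exceeds_htt_threshold = n_cag > HTT_CAG_THRESHOLD
```

For the toy sequence, `n_cag` is 40 and `exceeds_htt_threshold` is `True`. A regular expression is used because it returns each uninterrupted run directly; a fuller treatment would also need to consider the complementary strand.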

1.2 G-Quadruplex

1.2.1 Structures and Biological Significance of G-Quadruplexes

As discussed in the preceding sections, the formation of alternative DNA conformations other than B-form DNA in cellular dynamics is now widely accepted [37, 59, 60]. Besides the repeat-associated unusual structures, the G-quadruplex (G4) structure is considered to be another important form of nucleic acids [61, 62]. It consists of several stacked G-tetrad layers, each comprising four coplanar guanines linked through Hoogsteen hydrogen bonding, and folds stably under physiological conditions in the presence of monovalent metal cations (such as Na+ and K+) (Fig. 1.15). G4s can be formed by guanine-rich sequences, and the motif 5′-G≥3N1–7G≥3N1–7G≥3N1–7G≥3-3′ (four runs of at least three guanines separated by loops of one to seven nucleotides) has been proposed as a consensus sequence capable of forming intramolecular G4s [63, 64], although several exceptions have been reported to date [65–67]. Extensive physical characterization of the G4 structure by UV and CD spectroscopy

Fig. 1.13 Designer molecules that dually target d(CTG/GTC)n and the derived r(CUG/GUC)n transcripts; (left) compound 5; (center) compound 6; (right) compound 9 in Ref. [57]

Fig. 1.14 Three-way junction formed by CAG and CTG repeat DNAs and triptycene molecules that bind to the cavity of the junction

revealed that the structure has extremely high thermal stability when the G-tracts are separated by only one or two nucleotides (Tm = approximately 70–90 °C) [68]. The formation of such G4s in the genome is involved in biological events related to human diseases. For example, G4 formation in the promoters of genes, most notably

Fig. 1.15 a Structure and schematic illustration of a G-tetrad. b Schematic illustrations of typical intramolecular G-quadruplex (G4) structures: (left) crystal structure of parallel-type telomere G4 (PDB code: 1kf1); (center) solution structure of antiparallel-type telomere G4 (PDB code: 143d); (right) solution structure of hybrid-type telomere G4 (PDB code: 2gku)

of genes involved in cellular proliferation and related to cancer, controls the expression of downstream genes by interfering with transcription factor binding (Fig. 1.16a) [69]. Similarly, G4 formation blocks the progression of DNA polymerase and sometimes causes severe DNA damage, such as strand breakage (Fig. 1.16b) [70–73]. Alternatively, G4s can function positively as markers for chromatin remodeling in G-rich regions to recruit histone H3.3 variants, and as origins of replication at certain loci (Fig. 1.16c) [74]. These mechanisms rely on the involvement of G4-binding proteins. Recently, it was reported that G4s affect epigenetic alterations. The formation of a single G4 at some loci stalls replication temporarily; this time lag causes an irreversible change in the histone pattern with each round of duplication (Fig. 1.16d) [75, 76]. Furthermore, the excessive formation of repeated G4 structures accounts for some hereditary disorders [74]. These vital roles of G4s in transcription and replication are supported biophysically by the observation that G4 structures are markedly stable in spatially confined environments such as the cavities inside processing DNA/RNA polymerases [77]. The various biological functions of G4s may derive from their sequence, stability, location in the genome, local environment, and combinations of these factors; however, this issue has not been fully elucidated. Historically, the G4 structures observed in the telomere tandem repeat (GGGTTA)n region at the ends of chromosomes initially attracted a great deal of attention, as a single-stranded context offers a greater likelihood of G4 formation. The groups of Neidle and Hurley first identified a telomere G4 DNA-interactive molecule as a telomerase inhibitor, using a series of 2,6-diamidoanthraquinone derivatives [78].
The discovery of telomestatin, a naturally occurring macrocyclic compound that exhibits telomerase-inhibiting activity by binding to telomeric G4 structures, suggested the existence of G4s in vivo [79, 80]. The creation of extremely high affinity

Fig. 1.16 Large variety of biological functions of G4 DNA. a Interference with transcription factor binding or nucleosome occupancy by G4s can regulate downstream gene expression. b The arrest of DNA polymerase progression occasionally induces strand breakage. c A series of actions of G4-binding proteins leads to the recruitment of histone H3.3 variants to correctly reconstitute the structure of the nucleosome. G4s play an important role in initiating replication. d Temporary stalling of the duplication process at G4s results in irreversible changes in histone modification patterns. Recycled and newly recruited histones are colored green and red, respectively

and specific antibodies to G4s has enabled the visualization of G4s by immunofluorescence, which demonstrated the existence of G4s not only in the single-stranded telomere region but also in duplex regions in nuclei [81, 82]. Regarding G4s in duplex regions, structural analysis initially focused on the one observed in the nuclease hypersensitive element (NHE) III1 located in the promoter region of the c-myc oncogene [83], the biological significance of which Hurley demonstrated by showing that it was associated with the control of gene expression [69]. At a relatively early stage of G4 studies, other biologically important genes, such as hTERT [84], c-kit [85], KRAS [86, 87], BCL2 [88, 89], and VEGF [90], were also identified as genes in which G4 formation is involved in transcriptional regulation. In parallel, many high-resolution G4 structures were elucidated at the atomic level using nuclear magnetic resonance (NMR) and X-ray crystallography, opening a new avenue for the rational design of G4 ligands [91–96]. Recently, G4 ChIP-seq analysis revealed that approximately 10,000 G4 structures actually form in the human genome [97]. In addition, the growing number of reports on G4-interacting proteins and their functions supports the biological relevance of G4s [98–102]. For instance, some of the RNA helicases recognize G4 structures and exert their unfolding activity so

that DNAs are correctly replicated by DNA polymerases [101]. When such helicases were mutated so as to be devoid of unfolding ability, replication forks stalled in the genome at folded G4s, which resulted in genome instability. These mechanisms have been shown in some cases to be associated with genetic diseases [101, 102].
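
The consensus motif given earlier in this section, four runs of at least three guanines separated by loops of one to seven nucleotides, can be translated into a simple pattern search. The sketch below is a minimal illustration in Python under stated assumptions: the names `PQS_PATTERN` and `find_pqs` are hypothetical, only the given strand is scanned, and a pattern match indicates a putative G4-forming sequence rather than a structure that actually folds (the text notes that exceptions to the consensus exist).

```python
import re

# Regular-expression form of 5'-G>=3 N1-7 G>=3 N1-7 G>=3 N1-7 G>=3-3'.
# Loop positions may be any of A, C, G, or T.
PQS_PATTERN = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)


def find_pqs(seq: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched sequence) for non-overlapping hits."""
    return [(m.start(), m.group()) for m in PQS_PATTERN.finditer(seq.upper())]


# Four copies of the human telomeric repeat unit (TTAGGG) as a toy input.
telomeric = "TTAGGG" * 4
hits = find_pqs(telomeric)
```

For this toy input, `hits` contains a single match, `(3, "GGGTTAGGGTTAGGGTTAGGG")`. To search the complementary strand as well, the same function could be applied to the reverse complement of the input.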

1.2.2 G-Quadruplexes and Cancers

As mentioned earlier, the formation of G4 structures was first proposed for human telomeric DNA because of its characteristic guanine-rich sequence (TTAGGG)n and the single-stranded context of the human telomere. Abundant evidence has accumulated during the past two decades that G4 structures do form in the telomere region and have an important role in telomere-end processing in cells [81, 103]. More importantly, stabilization of telomere G4s and blockage of telomerase activity by small molecules, exemplified by telomestatin, is a new strategy for antitumor therapy [79, 80]. G4-forming sequences observed in the promoters of cancer-related genes have also received a great deal of attention as potential biomedical targets for antitumor therapy [104]. Generally, targeting the promoter region rather than the expressed protein has several advantages, including a lower likelihood of point mutations and of the development of drug resistance. Quarfloxin, a G4-interacting ligand, completed Phase II trials as a candidate therapeutic agent against several tumors, including neuroendocrine tumors, carcinoid tumors, and lymphomas [105]. Quarfloxin disrupts the G4–nucleolin complexes of ribosomal DNA in the nucleolus, which in turn redistributes nucleolin into the nucleoplasm, where it binds specifically to a G4 in the promoter region of the c-myc proto-oncogene to inhibit its expression. Although Phase III trials of Quarfloxin are currently not proceeding because of high albumin binding, several tumor-related genes have been identified as genes in which G4 formation is involved in transcriptional regulation, showing that G4s are potential molecular targets for cancer therapy. In this section, we discuss the telomere and G4-driven oncogenes in detail in terms of their direct targetability by synthetic ligands.

1.2.2.1

Telomere

A telomere is a structure at the ends of a chromosome, in which a repeated microsatellite sequence and its specifically interacting components (called the shelterin complex) protect the DNA from DNA repair mechanisms [106]. Human telomeric DNA comprises a single microsatellite repeat sequence, (GGGTTA)n, with a 3′ overhang at its terminus (200 ± 75 nucleotides). In normal somatic cells, the telomere sequence gradually shortens with DNA replication, which limits cell growth and proliferation, because the expression of telomerase is almost entirely silenced. Telomerase is a reverse transcriptase enzyme that adds a repeated DNA

sequence to the 3′ end of telomeres. It consists of a catalytic subunit, hTERT, and TR/TERC, an RNA template that is used by hTERT during the elongation of telomeres. Although TR/TERC is expressed globally (regardless of cell type), hTERT is silenced in somatic cells and is reactivated in nearly 90% of human cancers. Aberrant telomerase activity disturbs the balance of the normal telomere maintenance mechanisms, contributing to the acquisition of immortality. Hence, inhibition of telomerase has long been considered a potential therapeutic strategy for human cancers, and several telomerase inhibitors have entered preclinical or clinical trials. However, no clinically important benefits of these drugs have been reported to date. Recently, quadruplex-binding telomerase inhibitors have been considered as an alternative strategy for treating telomerase-positive cancers, as they exhibit high antitumor activity while minimally affecting normal somatic cells in vivo. As mentioned before, 2,6-diamidoanthraquinone derivatives and telomestatin were the first compounds found to inhibit telomerase through binding to telomere G4s. Similarly, RHPS4 was shown to induce telomere dysfunction by disturbing the integrity of the shelterin complex in mammalian cancer cells [107]. Later studies suggested that a large repertoire of alternative higher-order structures derived from the canonical telomere G4 can be adopted in the 3′ overhang region [65, 108–110]. Those structures and their specific motifs may be exploited to gain specificity for telomere G4s.

1.2.2.2

c-myc

c-myc encodes a multifunctional transcription factor that acts as a transcriptional activator of some genes involved in cell proliferation while acting as a transcriptional repressor of other genes involved in growth arrest [111, 112]. A broad variety of c-myc-responsive genes act in concert in important cellular functions, such as cell proliferation, metabolic transformation, and metastatic capacity [113]. In tumor cells, MYC protein function is almost always activated, primarily through upstream oncogenic pathways. As overexpression of the MYC protein is observed in various human malignancies (particularly in 80% of solid tumors), downregulation of the gene may be an effective approach to cancer therapy. However, MYC is generally considered to be an undruggable target at the protein level because of its short half-life and unstructured nature [104]. The c-myc promoter region contains the nuclease hypersensitive element (NHE) III1, which is located −142 to −115 base pairs upstream of the P1 promoter (Fig. 1.17a). There is one putative G4-forming sequence (PQS) in this element, which is capable of forming a nonduplex species, possibly accompanied by local unwinding or melting of the duplex structure under the influence of negative supercoiling stress (Fig. 1.17a) [114–116]. Structural dynamics in this region have also been considered a possible key mechanism that largely governs c-myc transcription in certain carcinomas, and the formation of a G4 is likely to act as a downregulator (Fig. 1.17b). Hence, G4-interacting ligands may contribute to the suppression of downstream c-myc gene expression through ligand-mediated G4 stabilization [117, 118].

Fig. 1.17 a c-myc promoter has one putative G4-forming sequence (PQS). b Solution structure of G4 from a NHE III1 region in the vicinity of the P1 promoter (PDB code: 1xav)

In this context, the c-myc targeting G4-interacting ligands have been studied during the past two decades with an aim toward drug applications for antitumor therapy.

1.2.2.3

VEGF

Tumor progression and metastasis render tumors more mature and malignant than undeveloped neoplasms, eventually resulting in deterioration and immortality. Vascular endothelial growth factor (VEGF) proteins, including VEGFA, VEGFB, VEGFC, VEGFD, VEGFE, and PlGF, are overexpressed in tumor cells and are responsible for induced neovascularization. The expression of human VEGF, which is frequently elevated in many types of cancer, is regulated mainly at the transcriptional level [119, 120]. In a reporter assay system using several cancer cell lines, VEGF expression was regulated primarily by a sequence from −85 to −50 relative to the transcription initiation site that contains five G-tracts of at least three consecutive guanines and is likely to adopt the G4 form of DNA [121, 122]. VEGF is an attractive target molecule for malignant tumor therapy, and antibody drugs targeting it have been approved for the treatment of solid tumors [123, 124]. Interestingly, the VEGF gene has a promoter region in which the G4-forming sequences are located (Fig. 1.18).

Fig. 1.18 a VEGF promoter has one PQS located close to the transcription start site (TSS) and hormone response element (HRE) that regulate the transcription. b Solution structure of G4 from the vicinity of the promoter (PDB code: 2m27)

The sequences are also consensus binding sequences for transcription factors such as Egr1 and Sp1, suggesting that the dynamic equilibrium of DNA forms in this region also affects gene regulation [93, 121]. Initially, TMPyP4 and telomestatin were shown, through their interaction with G4 oligonucleotides, to unwind the duplex DNA oligomer into single-stranded DNA and stabilize the G4 structure [90], and Se2SAP, a global G4-interacting ligand, efficiently suppressed VEGF expression in two adenocarcinoma cell lines (HEC1A and MDA-MB-231) [125]. These data raise the possibility that the transcriptional regulation of VEGF can be controlled by ligand-mediated G4 stabilization, which could lead to the application of G4-interacting ligands in cancer therapy. Similarly, a perylene monoimide derivative, PM2, was found to downregulate VEGF, likely by direct interaction with the G4 structure [126]. A quindoline derivative, SYUIQ-FM05, also demonstrated strong interactions with a VEGF G4 and exhibited potential antiangiogenic and antitumor activities [127]. Following these successful reports, several VEGF G4-preferring ligands have been developed through small-scale screening using docking and/or spectroscopic approaches [128, 129]. The biological activities of these ligands have not yet been examined, and further study is awaited.

1.2.2.4

BCL2

BCL2 (B-cell lymphoma 2) is an apoptosis-related gene whose protein product resides on the cytoplasmic face of the mitochondrial outer membrane and acts to restrict the movement of apoptosis-inducing proteins by controlling mitochondrial membrane permeability [130]. Overexpression of the BCL2 protein is associated with aberrant carcinoma growth in various human diseases, particularly solid tumors such as lymphomas, non-small cell lung cancer, myeloma, and melanoma, and BCL2 has been recognized as a target for cancer therapy for the past three decades [131]. Several approaches have been taken to downregulate BCL2 expression in cancer cells for cancer therapy, including small molecules that disrupt protein–protein interactions [132], antisense oligonucleotides [133], and peptidomimetics [134]. Overexpression of BCL2 has also been indicated to be a principal element of chemoresistance, particularly in lymphocytic cancers [135, 136]. For instance, transfection of BCL2 into A549 cells induced resistance to the apoptotic effect triggered by the triazine derivative 12459, a G4-interacting ligand that inhibits telomerase activity [137]. As another approach, the guanine-rich aptamer AS1411, which folds stably into a G4 structure, interferes with the binding of nucleolin to the AU-rich element of BCL2 mRNA, thereby destabilizing the mRNA and promoting its degradation by RNases, eventually inducing apoptosis [138]. This approach is reminiscent of the involvement of G4 formation in gene expression. Amplification and translocation of BCL2 have been shown to be equally common mechanisms that cause its overexpression in human cancer cells [139]. The human BCL2 gene includes the P1 and P2 promoters and has multiple transcription start sites.
The major transcriptional regulation is driven less by the TATA box in the P2 promoter, whereas the P1 promoter, which is situated 1386–1423 nucleotides upstream of the translation start site, has been largely implicated in the control of BCL2 transcription (Fig. 1.19a) [140]. A GC-rich element exists 1490–1451 nucleotides upstream of the P1 promoter, where multiple transcription factors have been implicated in BCL2 gene expression, including Sp1 [140], WT1 [141], E2F [142], and NGF [143]. A regulatory effect of G4 formation in this region was suggested by luciferase reporter assays, in which mutation or deletion of this region resulted in an increase in promoter activity in B lymphocytes (DHL-4) [141] and human promyelocytic leukemia (HL-60) cells [144]. More recently, Onel, Yang, and coworkers demonstrated by a luciferase reporter assay using the BCL2 promoter and mutated sequences that the formation of another G4, situated immediately upstream of the P1 promoter, attenuated the promoter activity (Fig. 1.19) [89]. Based on these reports, stabilizing the G4s formed in the regulatory element with ligands and thereby attenuating the promoter activity has also been studied as an approach to cancer therapy, similar to small-molecule targeting of the c-myc G4. In addition to the G4s, the i-motif, another form of DNA that forms in cytosine-rich sequences, is involved in transcriptional regulation: the binding of hnRNP LL to the i-motif structure likely activates BCL2 gene expression [145]. Moreover, an i-motif-interacting molecule, IM-48, was found to modulate BCL2 gene expression by affecting the dynamic equilibrium between the i-motif and the flexible hairpin

Fig. 1.19 a BCL2 promoter has two G4-forming elements that were shown to attenuate the BCL2 promoter activity. b Solution structure of G4 from the vicinity of the P1 promoter (PDB code: 2f8u)

form [145], opening a new avenue to modulating BCL2 gene expression more precisely. Targeting such non-canonical DNA structures formed in the regulatory element of the promoter may be an effective way to act specifically on a particular gene to combat tumors.

1.2.2.5

c-kit

The c-kit proto-oncogene encodes a receptor tyrosine kinase that is bridged and activated by the binding of dimerized stem cell factor (SCF) and in turn stimulates proliferation, differentiation, and survival in hemopoietic precursor cells [146–148]. Malfunctions of the KIT protein caused by overexpression or mutations have been associated with several diseases, including gastrointestinal stromal tumors (GIST), mastocytosis, and acute myelogenous leukemia (AML). Although the kinase

Fig. 1.20 a c-kit promoter has two PQSs, where several transcription factors are likely involved. b Solution structure of G4 from the proximal PQS in the vicinity of the promoter (PDB code: 2o3m)

inhibitor imatinib (Glivec) has been successfully developed as an FDA-approved drug for GIST, long-term exposure often causes secondary mutations in exon 13, 14, or 17, which encode the tyrosine kinase domains [149]. Notably, drug resistance derived from mutations in exon 17 has been found to severely attenuate the therapeutic effect of imatinib [150]. An approach that fundamentally suppresses c-kit expression would therefore be highly desirable. The human c-kit promoter is devoid of both a TATA box and CCAAT boxes [151, 152]. Instead, the region within 200 nucleotides upstream of the TSS is highly rich in GC content, and several transcription factors are implicated there (Fig. 1.20a). Two well-defined G4 structures have been resolved, and their three-dimensional structural dynamics have been shown to be involved in the regulation of c-kit transcription, accelerating the development of c-kit G4-preferring ligands (Fig. 1.20b) [153–156]. The modulation of such structural dynamics by small molecules is effective for suppressing gene expression and inducing apoptosis.

1.2.2.6

hTERT

hTERT (human telomerase reverse transcriptase, TERT), which encodes the catalytic subunit of telomerase, has attracted considerable attention as a compelling biomedical target

particularly for cancers, since elevated TERT expression is observed in ~90% of human cancer cells, whereas it is silenced in most normal cells [157, 158]. Aberrantly expressed TERT accelerates telomerase activity and thereby maintains telomere length abnormally [159]. Beyond its canonical role in the maintenance of telomere length, TERT has been considered to suppress BCL2-dependent apoptosis [160], to regulate chromatin state [161, 162] and DNA damage responses [163], and to promote MYC- and Wnt-driven cellular proliferation [163, 164]. Mutations identified in >70% of melanomas partially account for the elevated level of TERT expression [165]. Recent studies demonstrated that C to T mutations in the sense strand (G to A mutations in the antisense strand) of the TERT promoter strongly activate transcription by creating a new consensus sequence for the binding of ETS/TCF (E-twenty six/ternary complex factor) [166]. Patients whose tumors express elevated levels of TERT exhibit worse overall survival rates than those whose tumors express relatively low levels [167]. These observations clearly indicate that targeting the TERT promoter on the basis of these mutations might have a great impact on therapeutics for a wide range of tumors.

1.2.2.7

KRAS

The RAS gene family, including HRAS, NRAS, and KRAS, was first discovered in human tumors as driver oncogenes and has long been recognized as an important therapeutic target. Mutation of the KRAS gene is one of the most common oncogenic driver mutations in pancreatic, colorectal, and lung cancers and plays a role in the acquisition and enhancement of drug resistance [168, 169]. Hence, direct targeting of active KRAS by small molecules has been considered a compelling strategy to combat KRAS-mutant tumors, yet it has so far been unsuccessful. Recently, our group developed a novel approach that directly targets the mutant DNA using an alkylating pyrrole–imidazole polyamide (PIP), which is capable of selectively alkylating oncogenic codon 12 mutant DNA, causing strand cleavage and consequent suppression of tumor growth in a mouse tumor xenograft model [26]. G4-mediated promoter targeting has also been reported. The NHE in the KRAS proximal promoter is highly abundant in G-rich sequences, and several transcription factors interact with a G4 structure formed in this region [86, 87, 170–172]. A polypurine G-rich element located approximately −300 to −100 nucleotides upstream of the exon 0/intron 1 boundary in the murine or human genome is likely to be a component of the promoter activity and the PQS [86, 87, 170–174]. Importantly, pyrene-modified oligonucleotides devised to adopt a more stable form of the KRAS G4 were able to attract the transcription factors essential for transcription and to exhibit strong antiproliferative activity in pancreatic cancer cells through a G4-decoy effect [175].

1.2.2.8

c-myb

c-myb is highly expressed at an early stage of the differentiation of hematopoietic cells, and its expression gradually decreases toward the end of differentiation [176]. It encodes a transcription factor that plays a critical role in the proliferation, differentiation, and survival of hematopoietic progenitor cells. c-myb was identified following the discovery of the v-myb oncogene in avian myeloblastosis virus and E26 [177]. This gene is also recognized as a proto-oncogene, high expression of which promotes the development of hematologic cancers and adenocarcinomas through a mechanism based on its canonical proliferative property [178–182]. The regulation of c-myb expression at the transcriptional level relies on multiple activating and repressing transcription factors in a cell-type-dependent fashion [183–188]. Notably, a region in the promoter with three (GGA)4 triplet repeats beginning 17 nucleotides downstream of the transcription initiation site on the antisense strand was implicated in the promoter activity through the formation of very thermally stable higher-order parallel G4 structures [189–191]. Partial deletion of the (GGA)4 triplet repeats, which prevents formation of the dimerized G4, enhances the promoter activity, suggesting that the G4 structures formed by the three (GGA)4 triplet repeats together function as a negative regulator of c-myb promoter activity [189]. Additionally, the MAZ protein may bind to the c-myb G4 structure and negatively regulate the promoter activity. Recently, the group led by Yuan performed a reporter assay to examine extensively how folded and unfolded G4s in the c-myb promoter affect gene expression [192].
In this system, four PQSs in the c-myb promoter were selected as potential G4-forming elements, and the involvement of the respective G4s in promoter activity was measured using a set of promoter-containing plasmids in which the respective PQSs were mutated so that they could not form G4s (Fig. 1.21a). The promoter activity of the PQS1-mutated plasmid was markedly reduced, whereas the PQS1, 2-, and 3-mutated plasmid exhibited no significant changes in promoter activity. These data strongly implied that transcriptional regulation at these G-rich sequences was mediated to a considerable extent by formation of the G4 structure on the PQS1 element, where the binding of a transcriptional suppressor was likely impeded. The newly discovered c-myb G4-interacting ligand topotecan specifically decreased the transcription level of the wild-type plasmid without affecting that of the PQS1-mutated plasmid (Fig. 1.21b). This downregulating effect was confirmed for endogenous c-myb gene expression at the protein level. Turning more specifically to human disease, the c-myb proto-oncogene has been identified as a target in glioma stem cells (GSCs) for glioblastoma multiforme (GBM) therapy; its expression was considerably elevated in GBM tissues relative to normal tissues [193]. Interestingly, telomestatin, a global G4-interacting ligand, impairs the maintenance of the GSC state through an apoptotic pathway, largely by reducing c-myb expression in vitro and in vivo. Although the direct interplay between telomestatin and the c-myb promoter G4s has not been examined, these observations raise the possibility that direct targeting of c-myb G4 DNA might be a compelling therapeutic approach to GBM treatment.

Fig. 1.21 a The c-myb promoter has multiple PQSs. b Chemical structure of topotecan, which efficiently represses MYB protein expression
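The notion of a PQS used above can be made concrete with a short motif search. The following is a minimal Python sketch, assuming the widely used folding rule G3+N1–7G3+N1–7G3+N1–7G3+ (four runs of three or more guanines separated by loops of one to seven nucleotides). The criteria actually used to select the four c-myb PQSs are not stated here, and the function name and test sequence are hypothetical. Note that (GGA)4-type repeats contain no GGG run and are therefore missed by this simple rule.

```python
import re

# Assumed folding rule: four runs of >= 3 guanines separated by loops of
# 1-7 nucleotides. The lookahead reports overlapping candidates as well.
PQS_PATTERN = re.compile(r"(?=((?:G{3,}[ACGT]{1,7}){3}G{3,}))")


def find_pqs(sequence):
    """Return (start, motif) pairs for each PQS candidate on the given strand."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in PQS_PATTERN.finditer(seq)]


# Hypothetical test sequence: four human telomeric repeats.
telomere = "TTAGGGTTAGGGTTAGGGTTAGGG"
hits = find_pqs(telomere)

# A (GGA)4 repeat has no GGG run, so the canonical rule does not flag it.
gga_hits = find_pqs("GGAGGAGGAGGA")

for start, motif in hits:
    print(start, motif)
```

Only the strand supplied is scanned; searching the complementary strand would require passing its reverse complement as well.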

1.2.2.9 Others (PDGFR-β, PDGF-A, STAT3, FGFR2)

Other G4s formed in putative regulatory elements in the promoters of cancer-related genes have been reported and proposed as targets for G4-interacting ligands, including those in the promoters of the genes for PDGFR-β [194], PDGF-A [195], STAT3 [196], and FGFR2 [197]. For instance, GSA11129, which interacts with a G4 in the PDGFR-β gene promoter and shifts the equilibrium toward a G4 species, was shown to reduce the transcription level and to inhibit PDGF-β-driven cell proliferation and migration [194]. The G-rich element of the proximal promoter of the PDGF-A gene also forms a stable G4 structure even in a duplex context, and TMPyP4 reduced the basal promoter activity of PDGF-A, suggesting that targeting the PDGF-A G4 with a ligand specific for this G4 may be feasible as a therapy for gliomas, sarcomas, and astrocytomas [195, 198–202].

1.2.3 G-Quadruplex-Interacting Ligands

Studies of G4s from a wide range of perspectives have led researchers to believe that G4s can form in guanine-rich regions of the human genome and that they are biologically and pharmaceutically important. In this context, numerous researchers have made tremendous efforts to obtain highly active G4 ligands, and some of these ligands have achieved great success as drug candidates in vivo [203]. However, these drugs are still only midway toward approval for clinical use. One conceivable obstacle to the clinical application of G4-interacting molecules is selectivity, although global or multiple G4 targeting approaches may be effective in some cases [204–207]. As mentioned earlier, approximately 10,000 G4 structures exist in human chromatin [97]. A growing number of G4-driven genes have also been reported, which underlines the importance of expanding the variety of G4-interacting ligands with differential binding profiles [208, 209]. However, poor ligand designability, which originates from the topological similarity of the skeletons of diverse G4s, has remained a bottleneck for gaining specificity toward individual G4s. Very recently, researchers have entered a new phase in the development of next-generation G4-interacting ligands, in which selectivity for a particular target G4 is taken into account. This not only leads to highly active antitumor and bioactive molecules with minimized side effects, but also creates chemical biology tools for the detailed investigation of the functions of individual G4s in the genome [209]. In the next section, we address recent progress on G4-interacting molecules that can discriminate particular G4 structures from others.

1.2.4 Addressing the Specificity of Ligands to Particular G-Quadruplexes

1.2.4.1 Global G-Quadruplex-Selective Ligands

Because G4-interacting molecules were initially developed from duplex DNA-binding molecules, researchers first endeavored to develop G4 ligands with clear selectivity for G4 structures over duplex DNA [210, 211]. As discussed before, 2,6-diamidoanthraquinone derivatives, which are telomere G4-interacting molecules, were first found to act as telomerase inhibitors by the groups of Neidle and Hurley. The cationic porphyrin TMPyP4 was also identified as a G4 binder, whose planar skeleton and cationic character are favorable for G4 binding [212]. Moreover, several commercially available G4 ligands, such as BRACO19 [213], pyridostatin [214], Phen-DC3 [215], L2H2-6OTD [216], and L1H1-7OTD [217], which have negligible binding affinities for duplex DNA, are indispensable for biochemical, biophysical, and chemical biology studies on G4s.

1.2.4.2 Flat-Shaped Compounds that Were Originally Developed in Different Fields

Flat-shaped compounds that were originally developed in different fields are often rediscovered as G4 ligands because of their planar geometry and availability. Some of these compounds possess an inherent preference for the topologies of certain G4s. For example, NMM IX prefers to bind to a hybrid or parallel topology [218–220], whereas crystal violet can discern an antiparallel topology (Fig. 1.22a, b) [221].

Fig. 1.22 DNA G4 ligands with a preference toward particular topologies or G4s. a, b Studies in the field of G4s shed light on NMM IX and crystal violet as topology-preferred ligands. c–m Synthetic ligands likely to interact with loops and grooves that offer distinct environments as scaffolds for specific molecular recognition

1.2.4.3 Loops and Grooves that Offer Distinct Environments for Specific Molecular Recognition

In the past three decades, a series of intensive studies using NMR techniques and X-ray crystallography to elucidate a library of G4 structures at the atomic level has facilitated and rationalized the design of G4 ligands that discriminate between different G4s. One approach to gaining specificity among the many types of G4s without reducing binding affinity is to exploit the loops and grooves, which offer distinct environments for specific molecular recognition. For instance, the core G-tetrad layers of three types of telomere G4 structures are surrounded by differently positioned loops (Fig. 1.15b). Based on this principle, several successful attempts were made to achieve a preference for a particular G4 over other quadruplexes. CPT2 can visually discern antiparallel G4s from parallel ones by fitting into the shape of the groove (Fig. 1.22c) [222]. ThT-HE, a thioflavin T analogue modified by the addition of a hydroxyethyl group at the N3 position of the benzothiazole ring, exhibits a clear preference for a c-myc parallel G4 relative to other parallel structures (c-kit DNA, c-src DNA, and NRAS RNA) in a sodium-dominant buffer, as shown by fluorescence detection (Fig. 1.22d) [223]. Acridine–peptide conjugates were developed to discriminate between distinct G4s. Two peptide sequences (with substituents at different sites to make distinct contacts with the loops and grooves) were attached to an acridine core moiety that targets the planar surface of a G-tetrad. SPR-binding assays showed that compounds 10, 14, 19, and 21 could distinguish specific G4s (Fig. 1.22e) with very high binding affinities (KD = 4–25 × 10⁻⁹ M) [224]. Molecular modeling suggested that the spatial allowance of the rectangular acridine moiety upon occupying the wider square shape of a G-tetrad would facilitate the correct positioning of the substituents and their distinct interactions with the loops and grooves.
GQR was identified among several BODIPY derivatives as a light-up probe with a preference for a particular parallel G4 (93del, 5′-GGGGTGGGAGGAGGGT-3′, over c-myc and c-kit parallel G4s) (Fig. 1.22f) [225]. NDI 3 was developed as a ligand with specificity for a c-kit G4, in which a planar naphthalenediimide core was functionalized with two lysines bearing Boc-protected side chains (Fig. 1.22g) [226]. The preference possibly relies on specific contacts with the loops or grooves. Phen-Et, a phenanthroline-bisbenzimidazole carboxamide molecule, shows a preference for c-myc and c-kit parallel quadruplexes over any topology of telomere G4s (parallel, antiparallel, hybrid, or higher-order topologies), albeit with a moderate binding affinity (KD ~ 1.6 × 10⁻⁵ M) (Fig. 1.22h) [153]. Computer-aided modeling studies underscored the significance of the optimal projection of the N,N-dimethylaminoethyl side chains at the N position of the benzimidazole moiety for recognizing the propeller loops of promoter G4s. A guanosine moiety can be used for specific recognition when attached to a dansyl moiety to yield DDG, in which two azide-labeled dinucleosides are linked across a dansyl dialkyneamide through click chemistry (Fig. 1.22i) [227]. This conjugate is capable of specifically recognizing a c-myc parallel G4 over a c-kit parallel one. TOxaPy, a crescent-shaped molecule made up of alternating pyrimidine and oxazole rings, binds preferentially to a telomere G4 with antiparallel topology over one with parallel topology, with a high binding affinity (KD = 2 × 10⁻⁷ M) (Fig. 1.22j) [228]. Specific groove binding of TOxaPy to the antiparallel topology has been predicted by a docking analysis, but this has not been confirmed experimentally. It is worth describing the last three molecules, BTC-f [229], TH3 [230], and IZCZ-3 [231], because they have been shown to reduce off-target effects in biological experiments (Fig. 1.22k–m). Their in vitro preferences are therefore supported by biological evidence.

Fig. 1.23 Template-guided component assembly and linkage through click chemistry gave the highly effective ligand, pyridostatin-based adduct 10, which is a more potent TRF1 competitor than PDS
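To give a feel for what the dissociation constants quoted in Sect. 1.2.4.3 imply, the sketch below converts KD into the fractional occupancy of a G4 under a simple 1:1 binding model, θ = [L]/([L] + KD). This is an illustration only: the 1 µM free-ligand concentration is a hypothetical choice, the model ignores ligand depletion and multiple binding sites, and the KD values were measured against different G4s by different methods, so the comparison is qualitative.

```python
def fraction_bound(ligand_conc_molar, kd_molar):
    """Fractional occupancy of a single site under a 1:1 binding model."""
    return ligand_conc_molar / (ligand_conc_molar + kd_molar)


# Dissociation constants quoted in the text (mol/L).
KD_VALUES = {
    "acridine-peptide conjugates (lower bound)": 4e-9,
    "acridine-peptide conjugates (upper bound)": 25e-9,
    "TOxaPy": 2e-7,
    "Phen-Et": 1.6e-5,
}

# Hypothetical free-ligand concentration chosen for illustration: 1 uM.
LIGAND_CONC = 1e-6

occupancy = {
    name: fraction_bound(LIGAND_CONC, kd) for name, kd in KD_VALUES.items()
}

for name, theta in occupancy.items():
    print(f"{name}: {theta:.3f}")
```

Under these assumptions the nanomolar binders are nearly saturated at 1 µM, TOxaPy occupies roughly five sixths of its sites, and Phen-Et occupies only a few percent, which is what "moderate binding affinity" means in practice.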

1.2.4.4 Template-Guided Component Assembly and Linkage Through Click Chemistry

Template-guided component assembly and linkage through click chemistry give highly effective ligands, usually resulting in a gain of specificity. The ligand is constructed in situ on the basis of the close proximity and appropriate direction of the functionalized components upon specific binding to a certain target. Using a telomere G4 as a template, pyridostatin (PDS)-based adduct 10 was found to be a more potent TRF1 competitor in cellular experiments than PDS, which is a general G4 binder; however, the binding affinity was slightly reduced (Fig. 1.23) [232]. Using the same strategy, an RNA G4-targeting ligand, carboxyPDS, was identified (to be discussed in a later section). This system is clearly applicable to other types of G4 targets.

1.2.4.5 Targeting of Non-canonical Higher-Order G-Quadruplex Structures

Recently, non-canonical higher-order G4 structures have attracted attention because they allow specific and precisely controlled targeting of designated G4s: the motifs unique to those structures can be exploited to gain specificity. Hurley and co-workers developed a small molecule, GTC365, that is specific to the higher-order G4 structure observed in the hTERT promoter through dual-motif targeting of a mismatched duplex stem loop and its proximal G4 (Fig. 1.24a) [233]. In parallel, Phan’s group extensively studied duplex stem-loop-containing G4 motifs using both bioinformatics and biophysical approaches [67, 234–236]. These G4/duplex motifs serve as a dual binding site that has been shown by NMR spectroscopy to be simultaneously targetable with two distinct molecules (netropsin and Phen-DC3 or other G4 ligands) (Fig. 1.24b) [237]. Although this non-linked dimeric system is at a primitive stage for the specific targeting of duplex-containing G4 motifs, careful design and linkage of readout molecules such as PIPs may allow the creation of highly specific hybrid molecules in the future.

Fig. 1.24 a GTC365 is highly specific to the hTERT G4 containing a mismatched hairpin stem loop, which is recognized by a guanidine moiety. b Coaddition of netropsin and Phen-DC3 allows simultaneous recognition of the duplex and G4 segments, respectively, of duplex stem-loop-containing G4 motifs. N-methylpyrrole is highlighted in blue

1.2.4.6 Cell-Based Screening of G-Quadruplex Ligands

Cell-based screening of G4 ligands overcomes the discrepancy between in vitro and cellular outcomes that often arises in in vitro-based ligand discovery. Moreover, it can be applied to the discovery of potentially highly specific ligands. Luciferase reporter assays performed in 96-well plates using a human gastric carcinoma cell line (HGC-27) led to the discovery of two benzo[a]phenoxazine (BPO) derivatives as potent c-kit G4 ligands (Fig. 1.25a) [238]. Subsequent RT–qPCR and SPR-binding analyses confirmed that these two molecules acted as suppressors of the endogenous c-kit gene in HGC-27 cells, probably through binding to c-kit promoter G4s. Similarly, two quinazolone derivatives were identified that could downregulate c-kit expression at the protein level (Fig. 1.25b) [239]. Most recently, a striking PDS analog named PDC12 was discovered using a unique cell-based screening approach. It induces G4-dependent transcriptional reprogramming by stabilizing a single G4 located at the BU-1 locus (Fig. 1.25c) [240]. Interestingly, the local transcriptional reprogramming by PDC12 occurs in two stages, i.e., the loss of H3K4me3 and DNA cytosine methylation. It is noteworthy that the changes in the histone pattern were irreversible, even after the compound was removed. This was the first example of a small molecule that induces epigenetically heritable effects by targeting DNA secondary structures; thus, it represents a potentially new approach to epigenetic therapy.

Fig. 1.25 Hit compounds from cell-based screening. a, b The c-kit-targeting ligands that were identified. The parts of the chemical structures that differ are highlighted in purple for clarity. c PDC12 was identified as the most potent candidate with the ability to induce G4-dependent transcriptional reprogramming by stabilizing a single G4 located at the BU-1 locus

1.2.4.7 Specific Targeting of Telomere G-Quadruplexes

As mentioned above, the human telomere region comprises a single microsatellite repeat sequence, (GGGTTA)n, with a 3′ overhang at its terminus (200 ± 75 nucleotides). A large variety of alternative higher-order structures derived from the canonical telomere G4 have been proposed, and the specific motifs in those structures can be exploited to gain specificity for telomere G4s using unique methodologies (Fig. 1.26a) [86–88]. Dimeric G4 ligands target dimeric G4s: tandemly aligned G4 ligands permit favorable discrimination of a dimeric G4 from a monomeric one. The dinickel salophen dimer [241], berberine dimer [242], and telomestatin derivative tetramer [243] are successful examples of such ligands (Fig. 1.26b). In contrast, the chiral helical supramolecule Ni-M exhibits a binding preference for dimers over monomers, with a 200-fold selectivity, probably because two consecutive G4s offer a preferred binding site (Fig. 1.26b) [244]. Conversely, the other enantiomer, Ni-P, is capable of specifically converting a monomeric antiparallel form to a monomeric hybrid form [245]. It is also interesting that, more recently, Ni-M was shown to exhibit binding affinity to a left-handed Z-G4 in an enantioselective manner [246]. The junction pocket between two G4 units also serves for specific recognition. It is possible that Helicene M1 enantioselectively recognizes the helicity of the junction cavity to some extent (Fig. 1.26b) [247]. IZNP1 was shown by molecular modeling to be correctly positioned in that junction and to exhibit a reduced binding affinity to TERRA multimeric RNA G4s [248]. Notably, this molecule caused telomeric DNA damage and telomere dysfunction, without affecting several well-studied oncogenes that have monomeric G4s in their promoters. DATPE specifically detects a dimeric G4 after insertion into the junction pocket (Fig. 1.26b) [249]. Furthermore, binding of m-TMPipEOPP to the side faces of dimeric G4s conferred a preference for multimeric G4s over the monomeric form under molecularly crowded conditions (Fig. 1.26b) [250]. TzPyBDo is able to selectively discern dimeric G4s, with a negligible binding affinity to monomers (Fig. 1.26b) [251]. The junction cavity located between the duplex and G4 is also an attractive target with respect to specificity, because it provides an exceptionally targetable pocket. For example, the potential G4–duplex interface formed by telomeric repeats may be a unique target for molecular binding and specific interference with telomere-related functions. A docking analysis suggested that BSU6039 can be accommodated in the cavity by forming several hydrogen bonds (Fig. 1.26b) [110]. A dihydropyrimidin-4-one derivative was identified as a G-triplex ligand using virtual screening, but it also exhibited binding affinity to G4 structures (this will be discussed in a later section) [252]. A long-loop DNA sequence in a monomeric G4 was also amenable to molecular recognition by hybridization of the complementary strand, which was demonstrated by an atomic-level NMR analysis [65].

Fig. 1.26 Specific telomere G4 targeting by ligands. a Telomere G-stretch sequences potentially adopt non-orthodox G4s that offer specific binding motifs. b Several telomere G4-preferred binders based on the specific-motif recognition
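The 200-fold selectivity quoted above for Ni-M can be placed on a free-energy scale. The following is a rough worked estimate, assuming that the selectivity reflects a ratio of equilibrium dissociation constants measured at about 298 K (the text does not state how it was determined):

```latex
\[
\Delta\Delta G
  = -RT \ln\!\left(\frac{K_{D}^{\mathrm{monomer}}}{K_{D}^{\mathrm{dimer}}}\right)
  = -RT \ln 200
  \approx -(0.592\ \mathrm{kcal\,mol^{-1}})(5.30)
  \approx -3.1\ \mathrm{kcal\,mol^{-1}}
\]
```

Under this assumption, a 200-fold preference corresponds to a difference of about 3 kcal/mol in binding free energy in favor of the dimeric G4.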

1.2.4.8 RNA G-Quadruplex-Interactive Molecules

It seems that RNA is more likely to form a G4 structure because of its single-stranded state, flexibility, susceptibility to modifications, and individual distinct movements inside cells. In fact, several RNA G4s that exhibit specialized functions have been reported [253–256]. In terms of RNA G4 ligand design, the acquisition of specificity for RNA G4s over DNA G4s is generally difficult because of the structural similarity between the two and the unavailability of defined structures at the atomic level. However, some methodologies have been developed recently. CarboxyPDS was successfully identified in a small screening based on template-guided component assembly and linkage through click chemistry, as mentioned before (Fig. 1.27a) [232]. This ligand highly stabilized a TERRA RNA G4 (ΔTm = 20.7 °C), and the stabilization was not affected by the addition of up to 100 equivalents of a telomere DNA G4 competitor. It is worth mentioning that carboxyPDS has been successfully used for the selective stabilization of endogenous RNA G4s in cells [257]. RGB-1 was identified as a highly specific RNA G4 ligand in a chemical screening that relied on high-throughput assessment of the reverse-transcribed products of a G4-containing RNA template in the presence of chemicals (Fig. 1.27b) [258]. RGB-1 was demonstrated to cause RNA G4-mediated suppression of NRAS mRNA translation in breast cancer cells. Cell-based screening of G4 ligands, as mentioned previously, was also effective in this case. QUMA-1, which selectively stained RNA in HeLa cells, was identified as a hit during an enzyme-digestion-based screening of an in-house compound library (Fig. 1.27c) [259]. The hit compound exhibited the desired properties (and more), allowing the visualization of RNA G4 dynamics in live cells. ISCH-nras1 was uniquely developed for the detection of a particular RNA G4 (NRAS), based on the hybridization of a DNA molecule, connected to a quadruplex-triggered fluorescent probe, to a tail RNA sequence adjacent to the G-rich sequence (Fig. 1.27d) [260].

Fig. 1.27 a–d RNA G4-targeting ligands. d It is noteworthy that ISCH-nras1 was able to selectively target and detect an NRAS RNA G4 in a cellular context
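The stabilization reported for carboxyPDS is expressed as a shift in melting temperature. As a hedged illustration of how such a shift is typically read from melting curves, the Python sketch below estimates Tm as the temperature at which the folded fraction crosses 0.5 and takes the difference between curves with and without ligand. It assumes that the quoted 20.7 °C is a ligand-induced increase in Tm; the curves are synthetic, generated from a two-state sigmoid, and the absolute Tm values (55.0 and 75.7 °C), the transition width, and the function names are invented for illustration.

```python
import math


def folded_fraction(temp_c, tm_c, width_c=3.0):
    """Two-state sigmoid: fraction of folded G4 at a given temperature."""
    return 1.0 / (1.0 + math.exp((temp_c - tm_c) / width_c))


def estimate_tm(temps_c, fractions):
    """Temperature where the folded fraction crosses 0.5 (linear interpolation)."""
    points = list(zip(temps_c, fractions))
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if f0 >= 0.5 >= f1 and f0 != f1:
            return t0 + (f0 - 0.5) * (t1 - t0) / (f0 - f1)
    raise ValueError("melting transition not covered by the data")


# Temperature grid from 20 to 94 degrees C in 2-degree steps.
temps = [20 + 2 * i for i in range(38)]

# Synthetic curves; both Tm values are invented, not measured data.
free = [folded_fraction(t, 55.0) for t in temps]
bound = [folded_fraction(t, 75.7) for t in temps]

delta_tm = estimate_tm(temps, bound) - estimate_tm(temps, free)
print(f"delta Tm = {delta_tm:.1f} C")
```

In a competition experiment of the kind described above, the same estimate would be repeated in the presence of increasing equivalents of competitor; an unchanged ΔTm indicates that the competitor does not draw the ligand away from its target.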

1.2.4.9 Specific Localization for the Selective Targeting of Particular G-Quadruplexes; the Mitochondrial G-Quadruplex

Very recent studies showed that putative G4-forming sequences are present in mitochondrial DNA, and the formation of G4s there is thought to have biological functions [261]. Furthermore, the finding that mitochondrial transcription factor A (TFAM) displays binding affinity for G4 structures has evoked great interest [262]. Hence, the mitochondrial G4 is of increasing importance, and identifying ways to target it specifically is crucial for understanding its detailed biological functions. ZnPc1 was shown to localize in mitochondria, and photodynamic treatment (PDT) using this molecule induced the production of reactive oxygen species, collapse of the mitochondrial membrane potential, and chromatin condensation, eventually leading to apoptosis (Fig. 1.28a) [263]. TP2Py was also shown to colocalize strongly with mitochondria and to be a potent chemotherapeutic/radiotherapeutic agent for cancer (Fig. 1.28b) [264]. The planar geometry of the core of these two ligands suggests that they may bind G4s, although the original articles did not mention this. Other mitochondria-targeting planar compounds may also be involved in such G4 recognition [265, 266]. Their detailed mechanisms of action inside mitochondria await analysis.

Fig. 1.28 Representative mitochondria-localized compounds that are likely to bind to G4 structures; a ZnPc1, b TP2Py

Fig. 1.29 a–c G-triplex-targeting ligands. b A platform for their evaluation constructed by DNA origami

1.2.4.10 Alternative Nucleic Acid Form as a Biomedical Target, G-Triplex

The G-triplex was initially regarded as a transient DNA form and a possible intermediate in the G4 folding process. A growing body of literature suggests that such a structure forms stably under physiological conditions [109, 267–269]. Along with its potential biological significance, small molecules targeting G-triplexes are attracting increasing attention. Acridone–PNA conjugates achieve dual-site targeting through a planar acridone moiety appended with a Gly-GGG-Lys PNA sequence (Fig. 1.29a) [270]. The PNA moiety contributes one guanine to each of the three G-tetrads, forming a hybrid PNA + DNA G4. This ligand is thought to prefer a G-rich sequence in a single-stranded context over a prefolded G4; thus, it might be especially useful for targeting G-triplex structures during such cellular dynamics. A dihydropyrimidin-4-one derivative was identified from the Mcule chemical database by simple docking programs as a ligand for both G-triplex and G4 structures (Fig. 1.29b) [252]. Our group has devised a nanoplatform constructed by DNA origami for studying G4 intermediates such as the G-triplex and G-hairpin and found that PDC, a well-known G4-interacting ligand, unexpectedly recognized the G-triplex and G-hairpin structures (Fig. 1.29c) [271]. Considering this, the ability to recognize G4 intermediates might be an essential component of high binding affinity, selectivity, or the ability to induce G4 structures from stable duplex or single-stranded DNA. The platform demonstrates its power to reveal previously unrecognized G4-binding properties of a ligand.

1.3 Conclusion and Future Prospects

I have discussed expandable trinucleotide repeats and G-quadruplexes (G4s) as molecular targets of synthetic ligands for the creation of potential drugs or chemical tools. From the therapeutic viewpoint, the G4 has only relatively recently been considered a potential biomedical target, particularly for the therapy of tumors or neurological diseases, and a considerable body of evidence has accumulated that G4-interacting drugs exhibit good antitumor activities. However, success has so far been limited. Like protein-targeting drugs, G4-interacting drugs have displayed low selectivity for the targeted G4 structure, mainly because of the similar skeletons of the different G4 forms prevalent in the genome. In this chapter, I have introduced G4-interacting ligands that were devised to gain selectivity for a particular G4 structure. The selectivity issue remains incompletely solved, but solving it would substantially impact cancer therapy. In addition, the G4-driven oncogenes introduced here are known to be well correlated in most cases and to influence tumorigenesis, tumor growth, and malignant transition in concert [160, 163, 183, 272, 273]. Although this relationship is not fully elucidated, combinatorial approaches may be a good option for further therapeutic advancement [273]. Collectively, the non-canonical DNA conformations described above have profound implications for various biological, neurological, and pharmacological events, primarily in relation to human diseases. The subsequent chapters present my Ph.D. study on the development of DNA-sequence-selective and DNA-form-selective ligands aimed at elucidating the function of non-canonical DNA structures relevant to human diseases.

References

1. Gao XL, Mirau P, Patel DJ (1992) J Mol Biol 223:259–279
2. Tseng WH, Chang CK, Wu PC, Hu NJ, Lee GH, Tzeng CC, Neidle S, Hou MH (2017) Angew Chem Int Ed Engl 56:8761–8765
3. Kamitori S, Takusagawa F (1992) J Mol Biol 225:445–456
4. Lo YS, Tseng WH, Chuang CY, Hou MH (2013) Nucl Acids Res 41:4284–4294
5. Hou MH, Robinson H, Gao YG, Wang AH (2002) Nucl Acids Res 30:4910–4917
6. Kopka ML, Yoon C, Goodsell D, Pjura P, Dickerson RE (1985) J Mol Biol 183:553–563
7. Coll M, Frederick CA, Wang AH, Rich A (1987) Proc Natl Acad Sci USA 84:8385–8389
8. Pelton JG, Wemmer DE (1989) Proc Natl Acad Sci USA 86:5723–5727
9. Pelton JG, Wemmer DE (1990) J Biomol Struct Dyn 8:81–97
10. Mitra SN, Wahl MC, Sundaralingam M (1999) Acta Crystallogr D Biol Crystallogr 55:602–609
11. Mrksich M, Wade WS, Dwyer TJ, Geierstanger BH, Wemmer DE, Dervan PB (1992) Proc Natl Acad Sci USA 89:7586–7590
12. Mrksich M, Parks ME, Dervan PB (1994) J Am Chem Soc 116:7983–7988
13. Cho J, Parks ME, Dervan PB (1995) Proc Natl Acad Sci USA 92:10389–10392
14. Kielkopf CL, Baird EE, Dervan PB, Rees DC (1998) Nat Struct Biol 5:104–109
15. deClairac RPL, Geierstanger BH, Mrksich M, Dervan PB, Wemmer DE (1997) J Am Chem Soc 119:7909–7916

16. Zhang Q, Dwyer TJ, Tsui V, Case DA, Cho J, Dervan PB, Wemmer DE (2004) J Am Chem Soc 126:7958–7966
17. Chenoweth DM, Dervan PB (2009) Proc Natl Acad Sci USA 106:13175–13179
18. Turner JM, Swalley SE, Baird EE, Dervan PB (1998) J Am Chem Soc 120:6219–6226
19. Baird EE, Dervan PB (1996) J Am Chem Soc 118:6141–6146
20. Wurtz NR, Turner JM, Baird EE, Dervan PB (2001) Org Lett 3:1201–1203
21. Dervan PB, Edelson BS (2003) Curr Opin Struct Biol 13:284–299
22. Gottesfeld JM, Neely L, Trauger JW, Baird EE, Dervan PB (1997) Nature 387:202–205
23. Bando T, Sugiyama H (2006) Acc Chem Res 39:935–944
24. Oyoshi T, Kawakami W, Narita A, Bando T, Sugiyama H (2003) J Am Chem Soc 125:4752–4754
25. Shinohara K, Sasaki S, Minoshima M, Bando T, Sugiyama H (2006) Nucleic Acids Res 34:1189–1195
26. Hiraoka K, Inoue T, Taylor RD, Watanabe T, Koshikawa N, Yoda H, Shinohara K, Takatori A, Sugimoto H, Maru Y, Denda T, Fujiwara K, Balmain A, Ozaki T, Bando T, Sugiyama H, Nagase H (2015) Nat Commun 6:6706
27. Taylor RD, Asamitsu S, Takenaka T, Yamamoto M, Hashiya K, Kawamoto Y, Bando T, Nagase H, Sugiyama H (2014) Chemistry 20:1310–1317
28. Pandian GN, Nakano Y, Sato S, Morinaga H, Bando T, Nagase H, Sugiyama H (2012) Sci Rep 2:544
29. Pandian GN, Taniguchi J, Junetha S, Sato S, Han L, Saha A, AnandhaKumar C, Bando T, Nagase H, Vaijayanthi T, Taylor RD, Sugiyama H (2014) Sci Rep 4:3843
30. Han L, Pandian GN, Junetha S, Sato S, Anandhakumar C, Taniguchi J, Saha A, Bando T, Nagase H, Sugiyama H (2013) Angew Chem Int Ed Engl 52:13410–13413
31. Pandian GN, Sato S, Anandhakumar C, Taniguchi J, Takashima K, Syed J, Han L, Saha A, Bando T, Nagase H, Sugiyama H (2014) ACS Chem Biol 9:2729–2736
32. Syed J, Chandran A, Pandian GN, Taniguchi J, Sato S, Hashiya K, Kashiwazaki G, Bando T, Sugiyama H (2015) ChemBioChem 16:1497–1501
33. Wei Y, Pandian GN, Zou T, Taniguchi J, Sato S, Kashiwazaki G, Vaijayanthi T, Hidaka T, Bando T, Sugiyama H (2016) Chem Open 5:517–521
34. Han L, Pandian GN, Chandran A, Sato S, Taniguchi J, Kashiwazaki G, Sawatani Y, Hashiya K, Bando T, Xu Y, Qian X, Sugiyama H (2015) Angew Chem Int Ed Engl 54:8700–8703
35. Erwin GS, Grieshop MP, Ali A, Qi J, Lawlor M, Kumar D, Ahmad I, McNally A, Teider N, Worringer K, Sivasankaran R, Syed DN, Eguchi A, Ashraf M, Jeffery J, Xu MS, Park PMC, Mukhtar H, Srivastava AK, Faruq M, Bradner JE, Ansari AZ (2017) Science 358:1617–1621
36. Taniguchi J, Feng Y, Pandian GN, Hashiya F, Hidaka T, Hashiya K, Park S, Bando T, Ito S, Sugiyama H (2018) J Am Chem Soc 140:7108–7115
37. Mirkin SM (2007) Nature 447:932–940
38. Li YC, Korol AB, Fahima T, Beiles A, Nevo E (2002) Mol Ecol 11:2453–2465
39. Orr HT, Zoghbi HY (2007) Annu Rev Neurosci 30:575–621
40. Das Bhowmik A, Rangaswamaiah S, Srinivas G, Dalal AB (2015) Eur J Med Genet 58:160–167
41. Krzyzosiak WJ, Sobczak K, Wojciechowska M, Fiszer A, Mykowska A, Kozlowski P (2012) Nucleic Acids Res 40:11–26
42. Walker FO (2007) Lancet 369:218–228
43. Ross CA, Wood JD, Schilling G, Peters MF, Nucifora FC Jr, Cooper JK, Sharp AH, Margolis RL, Borchelt DR (1999) Philos Trans R Soc Lond B Biol Sci 354:1005–1011
44. Ranum LP, Cooper TA (2006) Annu Rev Neurosci 29:259–277
45. Peel AL, Rao RV, Cottrell BA, Hayden MR, Ellerby LM, Bredesen DE (2001) Hum Mol Genet 10:1531–1538
46. Timchenko LT, Miller JW, Timchenko NA, DeVore DR, Datar KV, Lin L, Roberts R, Caskey CT, Swanson MS (1996) Nucl Acids Res 24:4407–4414
47. Miller JW, Urbinati CR, Teng-Umnuay P, Stenberg MG, Byrne BJ, Thornton CA, Swanson MS (2000) EMBO J 19:4439–4448

48. Berman RF, Buijsen RA, Usdin K, Pintado E, Kooy F, Pretto D, Pessah IN, Nelson DL, Zalewski Z, Charlet-Bergeurand N, Willemsen R, Hukema RK (2014) J Neurodev Disord 6:25
49. Sutcliffe JS, Nelson DL, Zhang F, Pieretti M, Caskey CT, Saxe D, Warren ST (1992) Hum Mol Genet 1:397–400
50. Kumari D, Biacsi RE, Usdin K (2011) J Biol Chem 286:4209–4215
51. Kim E, Napierala M, Dent SY (2011) Nucl Acids Res 39:8366–8377
52. Zu T, Gibbens B, Doty NS, Gomes-Pereira M, Huguet A, Stone MD, Margolis J, Peterson M, Markowski TW, Ingram MA, Nan Z, Forster C, Low WC, Schoser B, Somia NV, Clark HB, Schmechel S, Bitterman PB, Gourdon G, Swanson MS, Moseley M, Ranum LP (2011) Proc Natl Acad Sci USA 108:260–265
53. Pearson CE (2011) PLoS Genet 7:e1002018
54. Siboni RB, Nakamori M, Wagner SD, Struck AJ, Coonrod LA, Harriott SA, Cass DM, Tanner MK, Berglund JA (2015) Cell Rep 13:2386–2394
55. Nakatani K, Hagihara S, Goto Y, Kobori A, Hagihara M, Hayashi G, Kyo M, Nomura M, Mishima M, Kojima C (2005) Nat Chem Biol 1:39–43
56. Matsumoto J, Li J, Dohno C, Nakatani K (2016) Bioorg Med Chem Lett 26:3761–3764
57. Nguyen L, Luu LM, Peng S, Serrano JF, Chan HY, Zimmerman SC (2015) J Am Chem Soc 137:14180–14189
58. Barros SA, Chenoweth DM (2015) Chem Sci 6:4752–4755
59. Bacolla A, Wells RD (2009) Mol Carcinog 48:273–285
60. Phan AT, Kuryavyi V, Patel DJ (2006) Curr Opin Struct Biol 16:288–298
61. Gellert M, Lipsett MN, Davies DR (1962) Proc Natl Acad Sci USA 48:2013–2018
62. Sen D, Gilbert W (1988) Nature 334:364–366
63. Huppert JL, Balasubramanian S (2007) Nucl Acids Res 35:406–413
64. Huppert JL, Balasubramanian S (2005) Nucl Acids Res 33:2908–2916
65. Yue DJ, Lim KW, Phan AT (2011) J Am Chem Soc 133:11462–11465
66. Mukundan VT, Phan AT (2013) J Am Chem Soc 135:5017–5028
67. Lim KW, Phan AT (2013) Angew Chem Int Ed Engl 52:8566–8569
68. Bugaut A, Balasubramanian S (2008) Biochemistry 47:689–697
69. Siddiqui-Jain A, Grand CL, Bearss DJ, Hurley LH (2002) Proc Natl Acad Sci USA 99:11593–11598
70. Paeschke K, Bochman ML, Garcia PD, Cejka P, Friedman KL, Kowalczykowski SC, Zakian VA (2013) Nature 497:458–462
71. Paeschke K, Capra JA, Zakian VA (2011) Cell 145:678–691
72. Lopes J, Piazza A, Bermejo R, Kriegsman B, Colosio A, Teulade-Fichou MP, Foiani M, Nicolas A (2011) EMBO J 30:4033–4046
73. Rodriguez R, Miller KM, Forment JV, Bradshaw CR, Nikan M, Britton S, Oelschlaegel T, Xhemalce B, Balasubramanian S, Jackson SP (2012) Nat Chem Biol 8:301–310
74. Law MJ, Lower KM, Voon HP, Hughes JR, Garrick D, Viprakasit V, Mitson M, De Gobbi M, Marra M, Morris A, Abbott A, Wilder SP, Taylor S, Santos GM, Cross J, Ayyub H, Jones S, Ragoussis J, Rhodes D, Dunham I, Higgs DR, Gibbons RJ (2010) Cell 143:367–378
75. Sarkies P, Reams C, Simpson LJ, Sale JE (2010) Mol Cell 40:703–713
76. Schiavone D, Jozwiakowski SK, Romanello M, Guilbaud G, Guilliam TA, Bailey LJ, Sale JE, Doherty AJ (2016) Mol Cell 61:161–169
77. Shrestha P, Jonchhe S, Emura T, Hidaka K, Endo M, Sugiyama H, Mao H (2017) Nat Nanotechnol 12:582–588
78. Sun D, Thompson B, Cathers BE, Salazar M, Kerwin SM, Trent JO, Jenkins TC, Neidle S, Hurley LH (1997) J Med Chem 40:2113–2116
79. Shin-ya K, Wierzba K, Matsuo K, Ohtani T, Yamada Y, Furihata K, Hayakawa Y, Seto H (2001) J Am Chem Soc 123:1262–1263
80. Kim MY, Vankayalapati H, Shin-Ya K, Wierzba K, Hurley LH (2002) J Am Chem Soc 124:2098–2099
81. Biffi G, Tannahill D, McCafferty J, Balasubramanian S (2013) Nat Chem 5:182–186

References

39

82. Schaffitzel C, Berger I, Postberg J, Hanes J, Lipps HJ, Pluckthun A (2001) Proc Natl Acad Sci USA 98:8572–8577 83. Simonsson T, Pecinka P, Kubista M (1998) Nucl Acids Res 26:1167–1172 84. Palumbo SL, Ebbinghaus SW, Hurley LH (2009) J Am Chem Soc 131:10878–10891 85. Rankin S, Reszka AP, Huppert J, Zloh M, Parkinson GN, Todd AK, Ladame S, Balasubramanian S, Neidle S (2005) J Am Chem Soc 127:10584–10589 86. Cogoi S, Quadrifoglio F, Xodo LE (2004) Biochemistry 43:2512–2523 87. Morgan RK, Batra H, Gaerig VC, Hockings J, Brooks TA (2016) Biochim Biophys Acta 1859:235–245 88. Dexheimer TS, Sun D, Hurley LH (2006) J Am Chem Soc 128:5404–5415 89. Onel B, Carver M, Wu G, Timonina D, Kalarn S, Larriva M, Yang D (2016) J Am Chem Soc 138:2563–2570 90. Sun D, Guo K, Rusche JJ, Hurley LH (2005) Nucl Acids Res 33:6070–6080 91. Patel DJ, Phan AT, Kuryavyi V (2007) Nucl Acids Res 35:7429–7455 92. Neidle S (2009) Curr Opin Struct Biol 19:239–250 93. Parkinson GN, Lee MP, Neidle S (2002) Nature 417:876–880 94. Wang Y, Patel DJ (1993) Structure 1:263–282 95. Luu KN, Phan AT, Kuryavyi V, Lacroix L, Patel DJ (2006) J Am Chem Soc 128:9963–9970 96. Laughlan G, Murchie AI, Norman DG, Moore MH, Moody PC, Lilley DM, Luisi B (1994) Science 265:520–524 97. Hansel-Hertsch R, Beraldi D, Lensing SV, Marsico G, Zyner K, Parry A, Di Antonio M, Pike J, Kimura H, Narita M, Tannahill D, Balasubramanian S (2016) Nat Genet 48:1267–1272 98. Brazda V, Haronikova L, Liao JC, Fojta M (2014) Int J Mol Sci 15:17493–17517 99. Mishra SK, Tawani A, Mishra A, Kumar A (2016) Sci Rep 6:38144 100. Rhodes D, Lipps HJ (2015) Nucl Acids Res 43:8627–8637 101. Brosh RM Jr (2013) Nat Rev Cancer 13:542–558 102. Clynes D, Jelinska C, Xella B, Ayyub H, Taylor S, Mitson M, Bachrati CZ, Higgs DR, Gibbons RJ (2014) PLoS ONE 9:e92915 103. Lipps HJ, Rhodes D (2009) Trends Cell Biol 19:414–422 104. Balasubramanian S, Hurley LH, Neidle S (2011) Nat Rev Drug Discov 10:261–275 105. 
Drygin D, Siddiqui-Jain A, O’Brien S, Schwaebe M, Lin A, Bliesath J, Ho CB, Proffitt C, Trent K, Whitten JP, Lim JK, Von Hoff D, Anderes K, Rice WG (2009) Cancer Res 69:7653–7661 106. de Lange T (2005) Genes Dev 19:2100–2110 107. Salvati E, Leonetti C, Rizzo A, Scarsella M, Mottolese M, Galati R, Sperduti I, Stevens MF, D’Incalci M, Blasco M, Chiorino G, Bauwens S, Horard B, Gilson E, Stoppacciaro A, Zupi G, Biroccio A (2007) J Clin Invest 117:3236–3247 108. Abraham Punnoose J, Cui Y, Koirala D, Yangyuoru PM, Ghimire C, Shrestha P, Mao H (2014) J Am Chem Soc 136:18062–18069 109. Limongelli V, De Tito S, Cerofolini L, Fragai M, Pagano B, Trotta R, Cosconati S, Marinelli L, Novellino E, Bertini I, Randazzo A, Luchinat C, Parrinello M (2013) Angew Chem Int Ed Engl 52:2269–2273 110. Krauss IR, Ramaswamy S, Neidle S, Haider S, Parkinson GN (2016) J Am Chem Soc 138:1226–1233 111. Adhikary S, Eilers M (2005) Nat Rev Mol Cell Biol 6:635–645 112. Dang CV (2012) Cell 149:22–35 113. Miller DM, Thomas SD, Islam A, Muench D, Sedoris K (2012) Clin Cancer Res 18:5546–5553 114. Brooks TA, Hurley LH (2010) Genes Cancer 1:641–649 115. Brooks TA, Hurley LH (2009) Nat Rev Cancer 9:849–861 116. Gonzalez V, Hurley LH (2010) Annu Rev Pharmacol Toxicol 50:111–129 117. Brown RV, Danford FL, Gokhale V, Hurley LH, Brooks TA (2011) J Biol Chem 286:41018– 41027 118. Boddupally PV, Hahn S, Beman C, De B, Brooks TA, Gokhale V, Hurley LH (2012) J Med Chem 55:6076–6086 119. Martiny-Baron G, Marme D (1995) Curr Opin Biotechnol 6:675–680

40

1 Introduction

120. Bikfalvi A, Bicknell R (2002) Trends Pharmacol Sci 23:576–582 121. Finkenzeller G, Sparacio A, Technau A, Marme D, Siemeister G (1997) Oncogene 15:669–676 122. Shi Q, Le X, Abbruzzese JL, Peng Z, Qian CN, Tang H, Xiong Q, Wang B, Li XC, Xie K (2001) Cancer Res 61:4143–4154 123. Willett CG, Boucher Y, di Tomaso E, Duda DG, Munn LL, Tong RT, Chung DC, Sahani DV, Kalva SP, Kozin SV, Mino M, Cohen KS, Scadden DT, Hartford AC, Fischman AJ, Clark JW, Ryan DP, Zhu AX, Blaszkowsky LS, Chen HX, Shellito PC, Lauwers GY, Jain RK (2004) Nat Med 10:145–147 124. Ferrara N, Hillan KJ, Gerber HP, Novotny W (2004) Nat Rev Drug Discov 3:391–400 125. Sun D, Liu WJ, Guo K, Rusche JJ, Ebbinghaus S, Gokhale V, Hurley LH (2008) Mol Cancer Ther 7:880–889 126. Taka T, Joonlasak K, Huang L, Randall Lee T, Chang SW, Tuntiwechapikul W (2012) Bioorg Med Chem Lett 22:518–522 127. Wu Y, Zan LP, Wang XD, Lu YJ, Ou TM, Lin J, Huang ZS, Gu LQ (2014) Biochim Biophys Acta 1840:2970–2977 128. Bhattacharjee S, Chakraborty S, Chorell E, Sengupta PK, Bhowmik S (2018) Int J Biol Macromol 118:629–639 129. Bhattacharjee S, Sengupta PK, Bhowmik S (2017) Rsc Adv 7:37230–37240 130. Adams JM, Cory S (1998) Science 281:1322–1326 131. Radha G, Raghavan SC (2017) Biochim Biophys Acta Rev Cancer 1868:309–314 132. Vidler LR, Filippakopoulos P, Fedorov O, Picaud S, Martin S, Tomsett M, Woodward H, Brown N, Knapp S, Hoelder S (2013) J Med Chem 56:8073–8088 133. Klasa RJ, Gillum AM, Klem RE, Frankel SR (2002) Antisense Nucl Acid Drug Dev 12:193– 213 134. Tzung SP, Kim KM, Basanez G, Giedt CD, Simon J, Zimmerberg J, Zhang KY, Hockenbery DM (2001) Nat Cell Biol 3:183–191 135. Schmitt CA, Rosenthal CT, Lowe SW (2000) Nat Med 6:1029–1035 136. Amundson SA, Myers TG, Scudiero D, Kitada S, Reed JC, Fornace AJ Jr (2000) Cancer Res 60:6101–6110 137. Douarre C, Gomez D, Morjani H, Zahm JM, O’Donohue MF, Eddabra L, Mailliet P, Riou JF, Trentesaux C (2005) Nucl Acids Res 33:2192–2203 138. 
Soundararajan S, Chen W, Spicer EK, Courtenay-Luck N, Fernandes DJ (2008) Cancer Res 68:2358–2365 139. Rantanen S, Monni O, Joensuu H, Franssila K, Knuutila S (2001) Leuk Lymphoma 42:1089– 1098 140. Seto M, Jaeger U, Hockett RD, Graninger W, Bennett S, Goldman P, Korsmeyer SJ (1988) EMBO J 7:123–131 141. Heckman C, Mochon E, Arcinas M, Boxer LM (1997) J Biol Chem 272:19609–19614 142. Gomez-Manzano C, Mitlianga P, Fueyo J, Lee HY, Hu M, Spurgers KB, Glass TL, Koul D, Liu TJ, McDonnell TJ, Yung WK (2001) Cancer Res 61:6693–6697 143. Liu YZ, Boxer LM, Latchman DS (1999) Nucl Acids Res 27:2086–2090 144. Wang XD, Ou TM, Lu YJ, Li Z, Xu Z, Xi C, Tan JH, Huang SL, An LK, Li D, Gu LQ, Huang ZS (2010) J Med Chem 53:4390–4398 145. Kang HJ, Kendrick S, Hecht SM, Hurley LH (2014) J Am Chem Soc 136:4172–4185 146. Ashman LK (1999) Int J Biochem Cell Biol 31:1037–1051 147. Edling CE, Hallberg B (2007) Int J Biochem Cell Biol 39:1995–1998 148. Shaw TJ, Keszthelyi EJ, Tonary AM, Cada M, Vanderhyden BC (2002) Exp Cell Res 273:95– 106 149. Wardelmann E, Merkelbach-Bruse S, Pauls K, Thomas N, Schildhaus HU, Heinicke T, Speidel N, Pietsch T, Buettner R, Pink D, Reichardt P, Hohenberger P (2006) Clin Cancer Res 12:1743– 1749 150. Spitaleri G, Biffi R, Barberis M, Fumagalli C, Toffalorio F, Catania C, Noberasco C, Lazzari C, de Marinis F, De Pas T (2015) Onco Targets Ther 8:1997–2003 151. Yamamoto K, Tojo A, Aoki N, Shibuya M (1993) Jpn J Cancer Res 84:1136–1144

References

41

152. Yasuda H, Galli SJ, Geissler EN (1993) Biochem Biophys Res Commun 191:893–901 153. Dhamodharan V, Harikrishna S, Bhasikuttan AC, Pradeepkumar PI (2015) ACS Chem Biol 10:821–833 154. Diveshkumar KV, Sakrikar S, Rosu F, Harikrishna S, Gabelica V, Pradeepkumar PI (2016) Biochemistry 55:3571–3585 155. Zorzan E, Da Ros S, Musetti C, Shahidian LZ, Coelho NF, Bonsembiante F, Letard S, Gelain ME, Palumbo M, Dubreuil P, Giantin M, Sissi C, Dacasto M (2016) Oncotarget 7:21658– 21675 156. Gluszynska A, Juskowiak B, Kuta-Siejkowska M, Hoffmann M, Haider S (2018) Molecules 23 157. Artandi SE, DePinho RA (2010) Carcinogenesis 31:9–18 158. Jafri MA, Ansari SA, Alqahtani MH, Shay JW (2016) Genome Med 8:69 159. Blasco MA (2005) Nat Rev Genet 6:611–622 160. Del Bufalo D, Rizzo A, Trisciuoglio D, Cardinali G, Torrisi MR, Zangemeister-Wittke U, Zupi G, Biroccio A (2005) Cell Death Differ 12:1429–1438 161. Masutomi K, Possemato R, Wong JM, Currier JL, Tothova Z, Manola JB, Ganesan S, Lansdorp PM, Collins K, Hahn WC (2005) Proc Natl Acad Sci USA 102:8222–8227 162. Shin KH, Kang MK, Dicterow E, Kameta A, Baluda MA, Park NH (2004) Clin Cancer Res 10:2551–2560 163. Koh CM, Khattar E, Leow SC, Liu CY, Muller J, Ang WX, Li Y, Franzoso G, Li S, Guccione E, Tergaonkar V (2015) J Clin Invest 125:2109–2122 164. Choi J, Southworth LK, Sarin KY, Venteicher AS, Ma W, Chang W, Cheung P, Jun S, Artandi MK, Shah N, Kim SK, Artandi SE (2008) PLoS Genet 4:e10 165. Horn S, Figl A, Rachakonda PS, Fischer C, Sucker A, Gast A, Kadel S, Moll I, Nagore E, Hemminki K, Schadendorf D, Kumar R (2013) Science 339:959–961 166. Huang FW, Hodis E, Xu MJ, Kryukov GV, Chin L, Garraway LA (2013) Science 339:957–959 167. 
Killela PJ, Reitman ZJ, Jiao Y, Bettegowda C, Agrawal N, Diaz LA Jr, Friedman AH, Friedman H, Gallia GL, Giovanella BC, Grollman AP, He TC, He Y, Hruban RH, Jallo GI, Mandahl N, Meeker AK, Mertens F, Netto GJ, Rasheed BA, Riggins GJ, Rosenquist TA, Schiffman M, Shih Ie M, Theodorescu D, Torbenson MS, Velculescu VE, Wang TL, Wentzensen N, Wood LD, Zhang M, McLendon RE, Bigner DD, Kinzler KW, Vogelstein B, Papadopoulos N, Yan H (2013) Proc Natl Acad Sci USA 110:6021–6026 168. Krens LL, Baas JM, Gelderblom H, Guchelaar HJ (2010) Drug Discov Today 15:502–516 169. Cox AD, Fesik SW, Kimmelman AC, Luo J, Der CJ (2014) Nat Rev Drug Discov 13:828–851 170. Cogoi S, Paramasivam M, Membrino A, Yokoyama KK, Xodo LE (2010) J Biol Chem 285:22003–22016 171. Amato J, Madanayake TW, Iaccarino N, Novellino E, Randazzo A, Hurley LH, Pagano B (2018) Chem Commun (Camb) 54:9442–9445 172. Cogoi S, Paramasivam M, Spolaore B, Xodo LE (2008) Nucl Acids Res 36:3765–3780 173. Cogoi S, Xodo LE (2006) Nucl Acids Res 34:2536–2549 174. Kaiser CE, Van Ert NA, Agrawal P, Chawla R, Yang D, Hurley LH (2017) J Am Chem Soc 139:8522–8536 175. Cogoi S, Paramasivam M, Filichev V, Geci I, Pedersen EB, Xodo LE (2009) J Med Chem 52:564–568 176. Ramsay RG, Gonda TJ (2008) Nat Rev Cancer 8:523–534 177. Klempnauer KH, Gonda TJ, Bishop JM (1982) Cell 31:453–463 178. Greig KT, Carotta S, Nutt SL (2008) Semin Immunol 20:247–256 179. Zuber J, Rappaport AR, Luo W, Wang E, Chen C, Vaseva AV, Shi J, Weissmueller S, Fellmann C, Taylor MJ, Weissenboeck M, Graeber TG, Kogan SC, Vakoc CR, Lowe SW (2011) Genes Dev 25:1628–1640 180. Biroccio A, Benassi B, D’Agnano I, D’Angelo C, Buglioni S, Mottolese M, Ricciotti A, Citro G, Cosimelli M, Ramsay RG, Calabretta B, Zupi G (2001) Am J Pathol 158:1289–1299 181. Miao RY, Drabsch Y, Cross RS, Cheasley D, Carpinteri S, Pereira L, Malaterre J, Gonda TJ, Anderson RL, Ramsay RG (2011) Cancer Res 71:7029–7037

42

1 Introduction

182. Persson M, Andren Y, Moskaluk CA, Frierson HF Jr, Cooke SL, Futreal PA, Kling T, Nelander S, Nordkvist A, Persson F, Stenman G (2012) Genes Chromosomes Cancer 51:805–817 183. Guerra J, Withers DA, Boxer LM (1995) Blood 86:1873–1880 184. McCann S, Sullivan J, Guerra J, Arcinas M, Boxer LM (1995) J Biol Chem 270:23785–23789 185. Perrotti D, Melotti P, Skorski T, Casella I, Peschle C, Calabretta B (1995) Mol Cell Biol 15:6075–6087 186. Bellon T, Perrotti D, Calabretta B (1997) Blood 90:1828–1839 187. Sullivan J, Feeley B, Guerra J, Boxer LM (1997) J Biol Chem 272:1943–1949 188. Nicolaides NC, Correa I, Casadevall C, Travali S, Soprano KJ, Calabretta B (1992) J Biol Chem 267:19665–19672 189. Palumbo SL, Memmott RM, Uribe DJ, Krotova-Khan Y, Hurley LH, Ebbinghaus SW (2008) Nucl Acids Res 36:1755–1769 190. Matsugami A, Okuizumi T, Uesugi S, Katahira M (2003) J Biol Chem 278:28147–28153 191. Matsugami A, Ouhashi K, Kanagawa M, Liu H, Kanagawa S, Uesugi S, Katahira M (2001) J Mol Biol 313:255–269 192. Li F, Zhou J, Xu M, Yuan G (2018) Int J Biol Macromol 107:1474–1479 193. Miyazaki T, Pan Y, Joshi K, Purohit D, Hu B, Demir H, Mazumder S, Okabe S, Yamori T, Viapiano M, Shin-ya K, Seimiya H, Nakano I (2012) Clin Cancer Res 18:1268–1280 194. Brown RV, Wang T, Chappeta VR, Wu G, Onel B, Chawla R, Quijada H, Camp SM, Chiang ET, Lassiter QR, Lee C, Phanse S, Turnidge MA, Zhao P, Garcia JGN, Gokhale V, Yang D, Hurley LH (2017) J Am Chem Soc 139:7456–7475 195. Qin Y, Rezler EM, Gokhale V, Sun D, Hurley LH (2007) Nucl Acids Res 35:7698–7713 196. Lin S, Li S, Chen Z, He X, Zhang Y, Xu X, Xu M, Yuan G (2011) Bioorg Med Chem Lett 21:5987–5991 197. Zhang L, Tan W, Zhou J, Xu M, Yuan G (2017) Biochim Biophys Acta Gen Subj 1861:884–891 198. Yu J, Ustach C, Kim HR (2003) J Biochem Mol Biol 36:49–59 199. Westermark B, Heldin CH, Nister M (1995) Glia 15:257–263 200. Sulzbacher I, Birner P, Trieb K, Traxler M, Lang S, Chott A (2003) Mod Pathol 16:66–71 201. 
Afrakhte M, Nister M, Ostman A, Westermark B, Paulsson Y (1996) Int J Cancer 68:802–809 202. Guha A, Dashner K, Black PM, Wagner JA, Stiles CD (1995) Int J Cancer 60:168–173 203. Che T, Wang YQ, Huang ZL, Tan JH, Huang ZS, Chen SB (2018) Molecules 23 204. Xu H, Di Antonio M, McKinney S, Mathew V, Ho B, O’Neil NJ, Santos ND, Silvester J, Wei V, Garcia J, Kabeer F, Lai D, Soriano P, Banath J, Chiu DS, Yap D, Le DD, Ye FB, Zhang A, Thu K, Soong J, Lin SC, Tsai AH, Osako T, Algara T, Saunders DN, Wong J, Xian J, Bally MB, Brenton JD, Brown GW, Shah SP, Cescon D, Mak TW, Caldas C, Stirling PC, Hieter P, Balasubramanian S, Aparicio S (2017) Nat Commun 8:14432 205. Nakamura T, Okabe S, Yoshida H, Iida K, Ma Y, Sasaki S, Yamori T, Shin-Ya K, Nakano I, Nagasawa K, Seimiya H (2017) Sci Rep 7:3605 206. Marchetti C, Zyner KG, Ohnmacht SA, Robson M, Haider SM, Morton JP, Marsico G, Vo T, Laughlin-Toth S, Ahmed AA, Di Vita G, Pazitna I, Gunaratnam M, Besser RJ, Andrade ACG, Diocou S, Pike JA, Tannahill D, Pedley RB, Evans TRJ, Wilson WD, Balasubramanian S, Neidle S (2018) J Med Chem 61:2500–2517 207. Muoio D, Berardinelli F, Leone S, Coluzzi E, di Masi A, Doria F, Freccero M, Sgura A, Folini M, Antoccia A (2018) FEBS J 285:3769–3785 208. Brooks TA, Kendrick S, Hurley L (2010) FEBS J 277:3459–3469 209. Asamitsu S, Bando T, Sugiyama H (2019) Chemistry 25:417–430 210. Monchaud D, Teulade-Fichou MP (2008) Org Biomol Chem 6:627–636 211. Li Q, Xiang JF, Yang QF, Sun HX, Guan AJ, Tang YL (2013) Nucl Acids Res 41:D1115– D1123 212. Anantha NV, Azam M, Sheardy RD (1998) Biochemistry 37:2709–2714 213. Burger AM, Dai F, Schultes CM, Reszka AP, Moore MJ, Double JA, Neidle S (2005) Cancer Res 65:1489–1496 214. Rodriguez R, Muller S, Yeoman JA, Trentesaux C, Riou JF, Balasubramanian S (2008) J Am Chem Soc 130:15758–15759

References

43

215. De Cian A, Delemos E, Mergny JL, Teulade-Fichou MP, Monchaud D (2007) J Am Chem Soc 129:1856–1857 216. Tera M, Ishizuka H, Takagi M, Suganuma M, Shin-ya K, Nagasawa K (2008) Angew Chem Int Ed Engl 47:5557–5560 217. Tera M, Iida K, Ishizuka H, Takagi M, Suganuma M, Doi T, Shin-ya K, Nagasawa K (2009) Chembiochem 10, 431–435 218. Ren J, Chaires JB (1999) Biochemistry 38:16067–16075 219. Arthanari H, Basu S, Kawano TL, Bolton PH (1998) Nucl Acids Res 26:3724–3728 220. Nicoludis JM, Barrett SP, Mergny JL, Yatsunyk LA (2012) Nucl Acids Res 40:5432–5447 221. Kong DM, Ma YE, Guo JH, Yang W, Shen HX (2009) Anal Chem 81:2678–2684 222. Lai H, Xiao Y, Yan S, Tian F, Zhong C, Liu Y, Weng X, Zhou X (2014) Analyst 139:1834–1838 223. Kataoka Y, Fujita H, Kasahara Y, Yoshihara T, Tobita S, Kuwahara M (2014) Anal Chem 86:12078–12084 224. Redman JE, Granadino-Roldan JM, Schouten JA, Ladame S, Reszka AP, Neidle S, Balasubramanian S (2009) Org Biomol Chem 7:76–84 225. Zhang L, Er JC, Ghosh KK, Chung WJ, Yoo J, Xu W, Zhao W, Phan AT, Chang YT (2014) Sci Rep 4:3776 226. Rasadean DM, Sheng B, Dash J, Pantos GD (2017) Chemistry 23:8491–8499 227. Kumar YP, Bhowmik S, Das RN, Bessi I, Paladhi S, Ghosh R, Schwalbe H, Dash J (2013) Chemistry 19:11502–11506 228. Hamon F, Largy E, Guedin-Beaurepaire A, Rouchon-Dagois M, Sidibe A, Monchaud D, Mergny JL, Riou JF, Nguyen CH, Teulade-Fichou MP (2011) Angew Chem Int Ed Engl 50:8745–8749 229. Panda D, Debnath M, Mandal S, Bessi I, Schwalbe H, Dash J (2015) Sci Rep 5:13183 230. Dutta D, Debnath M, Muller D, Paul R, Das T, Bessi I, Schwalbe H, Dash J (2018) Nucl Acids Res 46:5355–5365 231. Hu MH, Wang YQ, Yu ZY, Hu LN, Ou TM, Chen SB, Huang ZS, Tan JH (2018) J Med Chem 61:2447–2459 232. Di Antonio M, Biffi G, Mariani A, Raiber EA, Rodriguez R, Balasubramanian S (2012) Angew Chem Int Ed Engl 51:11073–11078 233. 
Kang HJ, Cui Y, Yin H, Scheid A, Hendricks WPD, Schmidt J, Sekulic A, Kong D, Trent JM, Gokhale V, Mao H, Hurley LH (2016) J Am Chem Soc 138:13673–13692 234. Lim KW, Khong ZJ, Phan AT (2014) Biochemistry 53:247–257 235. Lim KW, Nguyen TQ, Phan AT (2014) J Am Chem Soc 136:17969–17973 236. Lim KW, Jenjaroenpun P, Low ZJ, Khong ZJ, Ng YS, Kuznetsov VA, Phan AT (2015) Nucl Acids Res 43:5630–5646 237. Nguyen TQN, Lim KW, Phan AT (2017) Sci Rep 7:11969 238. McLuckie KI, Waller ZA, Sanders DA, Alves D, Rodriguez R, Dash J, McKenzie GJ, Venkitaraman AR, Balasubramanian S (2011) J Am Chem Soc 133:2658–2663 239. Wang X, Zhou CX, Yan JW, Hou JQ, Chen SB, Ou TM, Gu LQ, Huang ZS, Tan JH (2013) ACS Med Chem Lett 4:909–914 240. Guilbaud G, Murat P, Recolin B, Campbell BC, Maiter A, Sale JE, Balasubramanian S (2017) Nat Chem 9:1110–1117 241. Zhou CQ, Liao TC, Li ZQ, Gonzalez-Garcia J, Reynolds M, Zou M, Vilar R (2017) Chemistry 23:4713–4722 242. Zhou CQ, Yang JW, Dong C, Wang YM, Sun B, Chen JX, Xu YS, Chen WH (2016) Org Biomol Chem 14:191–197 243. Abraham Punnoose J, Ma Y, Li Y, Sakuma M, Mandal S, Nagasawa K, Mao H (2017) J Am Chem Soc 139:7476–7484 244. Zhao C, Wu L, Ren J, Xu Y, Qu X (2013) J Am Chem Soc 135:18786–18789 245. Yu H, Wang X, Fu M, Ren J, Qu X (2008) Nucl Acids Res 36:5695–5703 246. Zhao A, Zhao C, Ren J, Qu X (2016) Chem Commun (Camb) 52:1365–1368 247. Shinohara K, Sannohe Y, Kaieda S, Tanaka K, Osuga H, Tahara H, Xu Y, Kawase T, Bando T, Sugiyama H (2010) J Am Chem Soc 132:3778–3782

44

1 Introduction

248. Hu MH, Chen SB, Wang B, Ou TM, Gu LQ, Tan JH, Huang ZS (2017) Nucl Acids Res 45:1606–1618 249. Zhang Q, Liu YC, Kong DM, Guo DS (2015) Chemistry 21:13253–13260 250. Huang XX, Zhu LN, Wu B, Huo YF, Duan NN, Kong DM (2014) Nucl Acids Res 42:8719– 8731 251. Bag SS, Pradhan MK, Talukdar S (2017) Org Biomol Chem 15:10145–10150 252. Amato J, Pagano A, Cosconati S, Amendola G, Fotticchia I, Iaccarino N, Marinello J, De Magis A, Capranico G, Novellino E, Pagano B, Randazzo A (2017) Biochim Biophys Acta Gen Subj 1861:1271–1280 253. Cammas A, Millevoi S (2017) Nucl Acids Res 45:1584–1595 254. Fay MM, Lyons SM, Ivanov P (2017) J Mol Biol 429:2127–2147 255. Song J, Perreault JP, Topisirovic I, Richard S (2016) Translation (Austin) 4:e1244031 256. Agarwala P, Pandey S, Maiti S (2015) Org Biomol Chem 13:5570–5585 257. Biffi G, Di Antonio M, Tannahill D, Balasubramanian S (2014) Nat Chem 6:75–80 258. Katsuda Y, Sato S, Asano L, Morimura Y, Furuta T, Sugiyama H, Hagihara M, Uesugi M (2016) J Am Chem Soc 138:9037–9040 259. Chen XC, Chen SB, Dai J, Yuan JH, Ou TM, Huang ZS, Tan JH (2018) Angew Chem Int Ed Engl 57:4702–4706 260. Chen SB, Hu MH, Liu GC, Wang J, Ou TM, Gu LQ, Huang ZS, Tan JH (2016) J Am Chem Soc 138:10382–10385 261. Bharti SK, Sommers JA, Zhou J, Kaplan DL, Spelbrink JN, Mergny JL, Brosh RM Jr (2014) J Biol Chem 289:29975–29993 262. Lyonnais S, Tarres-Sole A, Rubio-Cosials A, Cuppari A, Brito R, Jaumot J, Gargallo R, Vilaseca M, Silva C, Granzhan A, Teulade-Fichou MP, Eritja R, Sola M (2017) Sci Rep 7:43992 263. Ge YL, Weng XC, Tian T, Ding F, Huang R, Yuan LB, Wu J, Wang TL, Guo P, Zhou X (2013) Rsc Adv 3:12839–12846 264. Chennoufi R, Bougherara H, Gagey-Eilstein N, Dumat B, Henry E, Subra F, Bury-Mone S, Mahuteau-Betzer F, Tauc P, Teulade-Fichou MP, Deprez E (2016) Sci Rep 6:21458 265. Bolze F, Jenni S, Sour A, Heitz V (2017) Chem Commun (Camb) 53:12857–12877 266. 
Zielonka J, Joseph J, Sikora A, Hardy M, Ouari O, Vasquez-Vivar J, Cheng G, Lopez M, Kalyanaraman B (2017) Chem Rev 117:10043–10120 267. Rajendran A, Endo M, Hidaka K, Sugiyama H (2014) Angew Chem Int Ed Engl 53:4107–4112 268. Cerofolini L, Amato J, Giachetti A, Limongelli V, Novellino E, Parrinello M, Fragai M, Randazzo A, Luchinat C (2014) Nucl Acids Res 42:13393–13404 269. Jiang HX, Cui Y, Zhao T, Fu HW, Koirala D, Punnoose JA, Kong DM, Mao H (2015) Sci Rep 5:9255 270. Paul A, Sengupta P, Krishnan Y, Ladame S (2008) Chemistry 14:8682–8689 271. Rajendran A, Endo M, Hidaka K, Teulade-Fichou MP, Mergny JL, Sugiyama H (2015) Chem Commun (Camb) 51:9181–9184 272. Wu KJ, Grandori C, Amacker M, Simon-Vermot N, Polack A, Lingner J, Dalla-Favera R (1999) Nat Genet 21:220–224 273. Zaanan A, Okamoto K, Kawakami H, Khazaie K, Huang S, Sinicrope FA (2015) J Biol Chem 290:23838–23849

Chapter 2

Sequence-Specific DNA Alkylation and Transcriptional Inhibition by Long-Chain Hairpin Pyrrole–Imidazole Polyamide–Chlorambucil Conjugates Targeting CAG/CTG Trinucleotide Repeats

Abstract By introducing novel building blocks into solid-phase peptide synthesis, we readily synthesized long-chain hairpin pyrrole–imidazole (PI) polyamide–chlorambucil (Chl) conjugates targeted to CAG/CTG repeat sequences, which are associated with trinucleotide repeat disorders. High-resolution denaturing polyacrylamide sequencing gel analysis revealed sequence-specific alkylation by the PI polyamide–Chl conjugates at the N3 of adenines or guanines in CAG/CTG repeats, with 11 bp recognition. Cell-free transcription assays showed that the specific alkylation inhibited the progression of RNA polymerase at the alkylated sites. Quantitative SPR-binding assays showed nanomolar binding affinities of the parent PI polyamide domains for CAG/CTG repeat sequences. These results suggest that long-chain PI polyamide–Chl conjugates targeted to CAG/CTG repeats might be developed further into chemical tools or drugs for trinucleotide repeat disorders.

Keywords Pyrrole–imidazole · Polyamide · CAG/CTG repeat sequence · Sequence-specific DNA · Transcriptional inhibition

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2020 S. Asamitsu, Development of Selective DNA-Interacting Ligands, Springer Theses, https://doi.org/10.1007/978-981-15-7716-1_2

2.1 Introduction

Pyrrole–imidazole polyamides (PI polyamides) are synthetic ligands that bind to the minor groove of duplex DNA with high affinity and sequence specificity. They contain N-methylpyrrole (P) and N-methylimidazole (I) and recognize each of the four Watson–Crick base pairs uniquely [1–3]. Base-pair discrimination by PI polyamides relies on the recognition of a G/C base pair by an antiparallel I/P pair, whereas a P/P pair recognizes an A/T or a T/A base pair. A β-alanine/β-alanine (β/β) pair reads an A/T or a T/A base pair in the same way as a P/P pair does. Recently, the introduction of PI polyamides into DNA-alkylating agents has attracted attention because of their sequence-specific alkylation at target sites [4].

Trinucleotide repeat sequences exist in some regions of genomic DNA, and the expansion of these repeat sequences often causes neurological disorders [5, 6]. For example, the expansion of CAG trinucleotide repeats within the first exon of the Huntingtin (HTT) gene (from 36 to >100 repeats) yields a full-length HTT protein harboring expanded polyglutamine tracts, which disrupts neuronal protein functions, ultimately causing Huntington’s disease [7–9]. In myotonic dystrophy type 1 (DM1) and Huntington’s disease-like 2, which are associated with CTG repeat sequences located in the 3′ UTR of genes, transcriptional products containing excess CUG repeats accumulate within the nucleus by forming heterogeneous hairpin structures containing 1 bp mismatches, which mistakenly recruit RNA-binding proteins such as splicing regulators. The bound RNA-binding proteins can no longer function properly, eventually leading to the pathogenesis of each disease [10–13]. Surprisingly, these trinucleotide repeat sequences tend to expand as a result of the dynamics of hairpin structure folding during DNA replication, repair, and recombination. The expansion leads to a decreased age of onset and increased severity of the diseases in successive generations [5]. Several mechanisms have been proposed to explain this characteristic inheritance; however, the actual mechanism remains unknown.

From a therapeutic standpoint, antisense oligonucleotides that target the internal loop motifs located in the hairpin structure have been reported to be promising [13]. Furthermore, Disney and coworkers reported that covalent adduct formation between small molecules and expanded rCUG transcripts in the hairpin structure improved DM1-associated pre-mRNA splicing defects [14]. In addition, Nakatani and coworkers developed a small molecule that binds a hairpin d(CAG/CAG) region and flips cytidine nucleotides out [15]. Our group has developed PI polyamides bearing fluorescent dyes that detect CAG repeat sequences as a potential diagnostic tool [16, 17].
Against this background, we hypothesized that transcriptional suppression across CAG/CTG repeat DNAs via a covalent linking strategy using PI polyamides may prevent the production of mutated proteins and RNA transcripts [18]. Considering that DNA-damaging treatments, such as alkylating agents, UV irradiation, and oxidation, were reported to decrease the repeat length [19, 20], the development of alkylator-connecting PI polyamides targeted to CAG/CTG repeat DNAs might be an effective approach both for reducing the abnormal products and for contracting the repeat length while minimally damaging other sequences. Our group has long worked on the creation of alkylator-connecting PI polyamides [21–24]. In the present study, we designed and synthesized PI polyamides conjugated to a DNA alkylator, chlorambucil (Chl), which are targeted to CAG/CTG repeat sequences. Their length of sequence recognition was successfully extended by introducing novel building blocks into Fmoc solid-phase peptide synthesis (SPPS) [25]. The activities of the compounds obtained were characterized in terms of DNA sequence-specific alkylation, transcriptional inhibition, and the dynamics of the interaction with the target sequence.
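The pairing rules summarized at the start of this introduction can be written down as a small lookup table. The following Python sketch is illustrative only: the names `PAIRING_RULES` and `recognizes` are hypothetical, the P/I entry (reading C/G) is assumed as the mirror image of the I/P rule and is not stated explicitly in the text, and the ring assignment for one CAG unit is an example rather than the actual ring sequence of conjugates 1–4.

```python
# Pairing rules quoted in the introduction, written as
# (ring facing the top strand, ring facing the bottom strand) -> base pairs read.
# "b" stands for beta-alanine. All names here are hypothetical.
PAIRING_RULES = {
    ("I", "P"): {"G/C"},
    ("P", "I"): {"C/G"},          # assumption: mirror image of the I/P rule
    ("P", "P"): {"A/T", "T/A"},
    ("b", "b"): {"A/T", "T/A"},   # beta/beta reads like P/P
}


def recognizes(ring_pairs, base_pairs):
    """Return True if every ring pair can read the base pair at its position."""
    if len(ring_pairs) != len(base_pairs):
        return False
    return all(
        bp in PAIRING_RULES.get(rp, set())
        for rp, bp in zip(ring_pairs, base_pairs)
    )


# One CAG unit read 5'->3' on the top strand: C/G, A/T, G/C.
cag_unit = ["C/G", "A/T", "G/C"]
ring_pairs = [("P", "I"), ("P", "P"), ("I", "P")]

print(recognizes(ring_pairs, cag_unit))               # True
print(recognizes(ring_pairs, ["C/G", "G/C", "G/C"]))  # False: P/P cannot read G/C
```

Because P/P and β/β both accept A/T and T/A, a table of this kind cannot distinguish those two base pairs; the sketch only reflects the rules as quoted above.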


2.2 Results and Discussion

2.2.1 Molecular Design and Synthesis

We rationally designed several types of PI polyamide–Chl conjugates (1–4) to target CAG/CTG repeat sequences with different lengths or structures (Fig. 2.1) [26, 27].

Fig. 2.1 a Chemical structures of PI polyamide–chlorambucil (Chl) conjugates (1–4) targeting the CAG/CTG repeat sequences. b Schematic representation of the binding mode of conjugates 1–4 to the CAG/CTG repeat region via a dimer (conjugate 1) or hairpin (conjugates 2–4) conformation


Chl is a nitrogen mustard alkylating agent that belongs to the group of anticancer drugs and has been widely used as a first-line treatment for chronic lymphocytic leukemia; its efficacy has been confirmed at the clinical level [28, 29]. The potency of hairpin PI polyamide–Chl conjugates has been investigated in terms of chemical and biological activity [30, 31]. Hartley and coworkers previously demonstrated preferential alkylation at the N3 of guanine or adenine in 5′-TTTTGG-3′ or 5′-TTTTGA-3′ sequences, respectively, by a tripyrrole conjugated with BAM [32], which is another nitrogen mustard. In addition, considering that Chl is more cytotoxic than BAM in the human colonic adenocarcinoma LS174T and leukemic K562 cell lines [33], we chose Chl for designing alkylating PI polyamides.

Regarding the detailed molecular design, because the flexibility of Chl within the structure may allow it to reach the opposite strand [34], β-alanine and 3,3′-diamino-N-methyldipropylamine (DMDPA; a relatively long linker) were added as an additional linker moiety, with the aim of enhancing the alkylating activity. We also designed the PI polyamides symmetrically so that they can bind in two orientations, which increases the number of possible alkylating sites [30]. β-Alanine residues were incorporated at appropriate positions within the PI polyamide domain, which allows for optimal H-bond donor/acceptor interactions by offering structural flexibility [35–38].

Regarding the synthesis of long-chain hairpin PI polyamides, only monomeric and/or dimeric units have been used so far in Fmoc SPPS to produce relatively long hairpin polyamides [39]. In this case, yields were much lower because of the increased number of reaction steps. In particular, in Fmoc SPPS it was difficult to couple a P after the introduction of a GABA (γ-turn), which serves to fold polyamides into the hairpin form that allows high affinity and specificity [24, 40]. To overcome these difficulties, we decided to synthesize FmocHN-P-γ-CO2H (5), Boc-D-Dab(FmocHN-P)-CO2H (6), and FmocHN-P-β-I-CO2H (10) as building blocks for Fmoc SPPS (Scheme 2.1) [41, 42]. All of them were washed and dried to give powders that could be subjected to Fmoc SPPS without further purification. Consequently, the introduction of the new building blocks into Fmoc SPPS reduced the number of reaction steps from 17 to 8, and reduced the failure-sequence products, in the synthesis of polyamide 13 (Scheme 2.2). Aiming at better binding, we designed and synthesized polyamide 15 with a chiral substitution of the γ-turn by an amino group [2, 43]. For comparison, we also prepared the shorter polyamides 11 (linear type) and 12 (hairpin type).

All the products obtained from Fmoc SPPS were cleaved with DMDPA to produce the amino PI polyamides (polyamides 11–14). Conjugates 1–3 were obtained via PyBOP coupling with Chl. Conjugate 4 was prepared by PyBOP coupling with Chl and subsequent deprotection of the Boc group with TFA. The diamino PI polyamide 15 was also prepared for SPR-binding assays. Conjugates 1–4 were purified by reversed-phase HPLC and confirmed by ESI-TOF-MS.
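The reduction from 17 to 8 steps for polyamide 13 can be reproduced by simple counting from the coupling order listed in Scheme 2.2. The Python sketch below is a hedged illustration. It assumes that building block 10 (FmocHN-P-β-I-CO2H) delivers three residues per coupling, that blocks 5 and 6 each deliver two (a P plus the turn unit), and that a monomer-only route would need one coupling per residue. The dictionary and function names are hypothetical.

```python
# Residues delivered by one coupling of each unit (assumed from the unit names).
RESIDUES_PER_UNIT = {
    "P": 1,    # FmocHN-P-CO2H
    "I": 1,    # FmocHN-I-CO2H
    "5": 2,    # FmocHN-P-gamma-CO2H: P + gamma-turn
    "6": 2,    # Boc-D-Dab(FmocHN-P)-CO2H: P + substituted turn
    "10": 3,   # FmocHN-P-beta-I-CO2H: P + beta-alanine + I
}

# Coupling order for each polyamide, copied from Scheme 2.2.
ROUTES = {
    "11": ["P", "10", "I"],
    "12": ["P", "10", "I", "5", "10", "I"],
    "13": ["P", "10", "10", "I", "5", "10", "10", "I"],
    "14": ["P", "10", "10", "I", "6", "10", "10", "I"],
}


def coupling_steps(route):
    """Number of couplings when the building blocks are used as single units."""
    return len(route)


def monomer_only_steps(route):
    """Number of couplings if every residue were introduced one at a time."""
    return sum(RESIDUES_PER_UNIT[unit] for unit in route)


for name, route in ROUTES.items():
    print(name, coupling_steps(route), monomer_only_steps(route))
```

For polyamide 13 this counting gives 8 couplings with building blocks against 17 with monomers only, in agreement with the figures quoted above. The values printed for the other polyamides follow from the same assumptions and are not reported in the text.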


Scheme 2.1 Reagents and conditions: (i) HCTU, DIEA, DMF, and then a solution of GABA in DIEA and DMF, rt, 3 h, 94%; (ii) PyBOP, DIEA, DMF, and then a solution of Boc-D-Dab-OH in DMF, rt, 3 h, quant; (iii) BocHN-β-alanine, HCTU, DIEA, DMF, rt, 1.5 h, 84%; (iv) TFA, H2O, rt, then NaOH, MeOH, H2O, 40 °C, 2 h; (v) a solution of FmocHN-P-CO2H and HCTU in DIEA and DMF, rt, 6 h, 73%

2.2.2 DNA-Alkylating Activity of Conjugates 1–4 First, the DNA-alkylating activities of conjugates 1–3 were evaluated using highresolution polyacrylamide gel electrophoresis (HR-PAGE), to investigate the effects of the structure or recognition length of PI polyamides on specific alkylation within CAG/CTG repeat sequences [44, 45]. 5 -Texas Red-labeled 214 bp DNA fragments including a (CAG/CTG)12 repeat sequence were prepared by transformation into pGEM-T Easy Vectors, and subsequently by PCR amplification and purification (Fig. 2.2) [46]. Alkylation was carried out at 23 °C for 18 h, followed by quenching by the addition of calf thymus DNA. The samples were heated at 95 °C under neutral conditions for 10 min. Under these conditions, all the N3 of purines at the alkylated sites in the DNA fragment produced cleavage bands quantitatively on the gel [47, 48]. The HR-PAGE results of the DNA fragments treated with conjugates 1–3 are shown


2 Sequence-Specific DNA Alkylation and Transcriptional Inhibition …

Scheme 2.2 Reagents and conditions: (i) Fmoc solid-phase synthesis of 11: FmocHN-P-CO2H, 10, FmocHN-I-CO2H (in this order, three steps in total), 20% piperidine, HCTU, DIEA, DMF, NMP in each step; (ii) DMDPA; (iii) Chl, PyBOP, DIEA, DMF, rt; (iv) Fmoc solid-phase synthesis of 12: FmocHN-P-CO2H, 10, FmocHN-I-CO2H, 5, 10, FmocHN-I-CO2H (in this order, six steps in total), 20% piperidine, HCTU, DIEA, DMF, NMP in each step; Fmoc solid-phase synthesis of 13 (14): FmocHN-P-CO2H, 10, 10, FmocHN-I-CO2H, 5 (6), 10, 10, FmocHN-I-CO2H (in this order, eight steps in total), 20% piperidine, HCTU, DIEA, DMF, NMP in each step; (v) TFA–DCM, rt; (vi) Chl, HCTU, DIEA, DMF, then TFA–DCM, 0 °C

in Fig. 2.3. Regarding the alkylation of the (CAG)12 repeat sequence (Fig. 2.3a), conjugates 1 and 2 did not alkylate sites within the CAG repeat, whereas conjugate 1 alkylated site a, which can be explained by recognition in an expanded state with a 1 bp mismatch rather than by dimer recognition [49]. Even at tenfold higher concentrations, conjugate 1 did not specifically alkylate the CAG repeat sequence (data not shown). At concentrations of around 150 nM, conjugate 2 alkylated sites within the CAG repeat, albeit with much lower specificity than conjugate 3 (data not shown). In contrast, conjugate 3 selectively alkylated sites in the target region with higher alkylating activity and sequence specificity at nanomolar concentrations (lanes 2–5), apart from alkylation at site b, which has a 3 bp mismatch. For the (CTG)12 repeat sequence (Fig. 2.3b), we observed specific alkylation by both conjugates 2 and 3, apart from alkylation at sites c and d (each with a 3 bp mismatch) by conjugates 2 and 3, respectively, and no alkylation by conjugate 1. These results indicated that 11 bp recognition through


Fig. 2.2 Sequences of the 5′-Texas Red-labeled DNA fragments used in the sequencing gel analysis. The two sequences are the same, but opposite strands are Texas Red labeled. The sequence in bold represents an insert (5′-(CAG)12CA-3′/5′-G(CTG)12A-3′) that was ligated into the pGEM-T Easy Vector. The DNA fragment (a) or (b) was used for the detection of the cleavage bands on the gel that are produced as a result of the DNA alkylation of one strand containing a (CAG)12 or (CTG)12 repeat sequence, respectively

hairpin conformation (conjugate 3) is required for sequence-specific alkylation in both strands including CAG/CTG repeat sequences. Having established that conjugate 3 is the most appropriate for targeting CAG/CTG repeat sequences, we compared the alkylating activities of conjugates 3 and 4 toward CAG/CTG repeat DNAs in detail to examine the influence of the chiral amino substitution in the γ-turn. Specific alkylation by conjugates 3 and 4 was observed within the CAG/CTG repeat region (sites 1–13 for CAG repeats and sites 1′–15′ for CTG repeats), apart from the non-specific alkylation (sites b and d) also seen in Fig. 2.3 (Fig. 2.4). For the CTG repeat sequence (Fig. 2.4b), both conjugates alkylated the N3 position of the guanines in the CTG trinucleotides, particularly from the third CTG (site 6′) to the fifth CTG trinucleotide (site 8′), numbered from the 5′ side. These sites are among those that can be alkylated by the conjugates through two binding modes (sites 6′–11′), which accounts for their relatively higher cleavage intensities (Fig. 2.4c). In total, we observed 15 cleavage bands resulting from N3 alkylation of adenine or guanine, which corresponded to the possible alkylating sites within the target region (Fig. 2.4b, c). Similarly, specific alkylation by conjugates 3 and 4 within the CAG repeat produced 14 cleavage bands. As with the case for


Fig. 2.3 Thermally induced strand cleavage of 214 bp DNA fragments derived from 5′-Texas Red-labeled strands including a a (CAG)12 or b (CTG)12 repeat (8 nM) by conjugates 1, 2, and 3 at 23 °C for 18 h. These two DNA strands are complementary. Lane 1: DNA control; lanes 2–5: 2.5, 5, 10, and 20 nM of 3; lanes 6–9: 2.5, 5, 10, and 20 nM of 2; lanes 10–13: 2.5, 5, 10, and 20 nM of 1. c Schematic representation of the recognition and alkylation models of these conjugates. The arrows indicate the alkylating sites that were inferred from the sequencing gel analysis. Note that alkylation does not occur at the site indicated by *, as this band is also observed in lane 1 (control) [45]

the CTG repeats, the cleavage bands with relatively higher intensities (sites 4–9) mostly correspond to the sites that can be alkylated by the conjugates through two binding modes (sites 4–10). We also observed that these cleavage bands were slightly broader than those obtained for the CTG repeat, suggesting that both the N3 of adenines and the adjacent guanines in the CAG trinucleotides were alkylated by conjugates 3 and 4 (Fig. 2.4a, c). Taken together, although no effect of the chiral substitution on DNA-alkylating activity was observed in this experimental system, the rationally designed long-chain PI polyamide–Chl conjugates (conjugates 3 and 4) have higher alkylating ability and selectivity toward CAG/CTG repeat DNAs than the shorter conjugates (conjugates 1 and 2), through the two binding orientations.


Fig. 2.4 Thermally induced strand cleavage of 214 bp DNA fragments derived from 5′-Texas Red-labeled strands including a a (CAG)12 or b (CTG)12 repeat (8 nM) by conjugates 3 and 4 at 23 °C for 18 h. These two DNA strands are complementary. Lane 1: DNA control; lanes 2–5: 2.5, 5, 10, and 20 nM of 4; lanes 6–9: 2.5, 5, 10, and 20 nM of 3. c Schematic representation of the recognition and alkylation models of these conjugates with two binding modes. The arrows indicate the alkylating sites within the target region that were inferred from the sequencing gel analysis. The underlined sites are those at which the conjugates can alkylate through two binding modes. Note that alkylation does not occur at the site indicated by *, as this band is also observed in lane 1 (control) [45]

2.2.3 The Influence of Conjugate 4 Over Transcription

To examine the influence of alkylation by a PI polyamide–Chl conjugate on transcription, we performed a cell-free transcription assay using T7 RNA polymerase and unlabeled DNA templates including a (CAG/CTG)12 repeat (sequences A and


B, as shown in Fig. 2.5a) with or without conjugate 4 and its parent polyamide 15 as a control. Sequence B was prepared by inserting a (CAG/CTG)12 repeat (the same sequence as that used for HR-PAGE), while sequence A was prepared by inserting the (CAG/CTG)12 repeat in the inverted orientation (Figs. 2.5a and 2.6). The transcripts were separated by 10% denaturing PAGE [50] and stained with SYBR Gold. Only in lanes 2–4 of Fig. 2.5b, c did truncated RNAs (predicted length: approximately 60–101 nt) appear as ladder-like bands in a dose-dependent manner, indicating that conjugate 4 alkylated the CAG/CTG repeat region and arrested the progression of RNA polymerase at the alkylating sites. Furthermore, we observed that the number of the ladder-like

Fig. 2.5 Transcribed DNA sequences and transcription products. a Sequence A has a template strand that contains a (CAG)12 repeat and sequence B has one that contains a (CTG)12 repeat. The numbers indicate the position of bases in the sequence from the beginning of transcription. Bold domains indicate the inserted (CAG/CTG)12 repeat sequence. The arrows indicate the bases that are probably alkylated. b and c Effects of conjugate 4 and its parent polyamide 15 on transcription; b sequence A and c sequence B. Lane 1: without any polyamides; lanes 2–4: 10, 20, and 30 nM of 4; lane 5: ssRNA markers; lanes 6–8: 10, 20, and 30 nM of 15


Fig. 2.6 Sequences of the non-labeled DNA fragments used in in-vitro transcription assays. The two sequences in bold (the insert) are the same but each was ligated into the pGEM-T Easy Vector in the reverse direction. B is the same sequence as DNA fragments a and b depicted in Fig. 2.2

bands obtained from the transcription of sequence A was greater than that obtained from sequence B. This observation indicates that conjugate 4 alkylated both adenines and guanines in the (CAG)12 repeat, which is consistent with the results of the HR-PAGE analysis (Fig. 2.4a). In contrast, the parent polyamide 15 did not inhibit transcription, as evidenced by the near absence of retarded bands in lanes 6–8. This result is also consistent with the report that PI polyamides not conjugated with alkylating reagents do not inhibit transcription by T7 RNA polymerase in the absence of histones [51].


2.2.4 Binding Properties of the Parent Polyamides to Target Hairpin DNA

To obtain additional information about the interaction of PI polyamides with the CAG/CTG repeat region, we next performed quantitative SPR-binding assays to evaluate the binding affinities and kinetic parameters of the four parent polyamides 11–13 and 15 for a CAG/CTG repeat DNA (Fig. 2.7a) [52, 53]. A solution of each polyamide in HBS buffer with 0.1% DMSO was passed over a sensor chip on which the designed biotinylated DNA was immobilized, and the interaction was recorded as a sensorgram (Fig. 2.7b–e). The values of ka, kd, and KD obtained by curve fitting are shown in Table 2.1. Polyamide 12 had about fourfold higher binding affinity than polyamide 13, mainly owing to the difference in the rates of association (ka). One possible explanation for this observation is that it is more difficult for polyamide 13 to fit into the DNA minor groove in the closed state (hairpin form), because

Fig. 2.7 SPR assays to evaluate the binding affinity of the parent polyamides 11–13 and 15 to a hairpin DNA oligomer containing CAG/CTG repeats, which was immobilized on a sensor chip. a The biotinylated DNA oligomer used in this study. b–e SPR response curves of the interaction of the parent polyamides at various concentrations with the hairpin DNA immobilized on a sensor chip. From the top, concentration: b 3200, 1600, 800, 400, and 200 nM; c 500, 250, 125, 62.5, and 31.25 nM; d 1000, 500, 250, 125, and 62.5 nM; e 200, 100, 50, 25, and 12.5 nM, respectively


Table 2.1 Values of affinities and kinetic parameters obtained by curve fitting for each sensorgram

PI polyamide | ka (M⁻¹ s⁻¹) | kd (s⁻¹) | KD (M) | χ² (d)
11 (a) | ka1 = 8.8 × 10³; ka2 = 4.4 × 10² | kd1 = 5.4 × 10⁻³; kd2 = 5.7 × 10⁻⁶ | KD1 = 6.1 × 10⁻⁷; KD2 = 1.3 × 10⁻⁸ | 2.3
12 (b) | 1.2 × 10⁵ | 1.6 × 10⁻³ | 1.4 × 10⁻⁸ | 7.5
13 (c) | 2.3 × 10⁴ | 1.2 × 10⁻³ | 5.3 × 10⁻⁸ | 4.1
15 (c) | 8.5 × 10⁴ | 9.5 × 10⁻⁴ | 1.1 × 10⁻⁸ | 3.8

(a) Determined by fitting to a 2:1 binding model with mass transfer
(b) Determined by fitting to a 1:1 binding model with drifting baseline
(c) Determined by fitting to a 1:1 binding model with mass transfer
(d) The closeness of fit is described by the statistical value χ²

of its flexibility resulting from the long chain. Another possible explanation for the stronger binding by polyamide 12 is that, although this polyamide interacts with the DNA through the 1:1 binding mode, it has two matching positions in the target region. This would facilitate the penetration of the polyamide, which is reflected in the ka value. Concerning the rates of dissociation (kd), we observed that extending the recognition length results in only a small decrease in the kd values. This indicates that the increased number of hydrogen bonds had little effect on the tendency of the polyamide to remain bound to the target region in the DNA minor groove, even with the introduction of the cationic amino substituent (polyamide 15). In contrast, comparison of the binding affinities of polyamides 13 and 15 revealed that polyamide 15 had about fivefold higher binding affinity than polyamide 13, with a clear difference in the ka values. This suggests that polyamide 15 accesses the DNA more readily because the cationic amino group is positioned in the DNA minor groove and provides better electrostatic attraction, while the forward hairpin folding of polyamide 15 is retained [43]. Because polyamide 11 was designed to bind to the target region as a dimer, its sensorgram was fitted to a 2:1 stoichiometry binding model that represents the dynamics of the formation of complexes in which two polyamides are bound side by side at the same site [26, 27, 52]. As seen in Table 2.1, the value of KD2 (1.3 × 10⁻⁸ M) is lower than that of KD1 (6.1 × 10⁻⁷ M), indicating cooperative binding to the target region. For comparison with the other KD values, which were obtained by fitting to the 1:1 binding model, the value (KD1 · KD2)^(1/2) was used, which is the value adjusted to a per-bound-molecule basis [52]. The value for polyamide 11 (8.9 × 10⁻⁸ M) was the highest of those in this experiment, which corresponded to the lower alkylating activity and specificity observed in the HR-PAGE.
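The relationship between the fitted rate constants and the dissociation constants can be checked with a short sketch. It assumes KD = kd/ka for each binding step and uses the geometric mean of the two stepwise constants for polyamide 11, as described above; the variable and function names are chosen here for illustration only. Because the rate constants in Table 2.1 are rounded, the recomputed values may differ slightly from the tabulated KD values.

```python
import math

# Rate constants from Table 2.1 as (ka in M^-1 s^-1, kd in s^-1).
ONE_TO_ONE = {
    "12": (1.2e5, 1.6e-3),
    "13": (2.3e4, 1.2e-3),
    "15": (8.5e4, 9.5e-4),
}


def dissociation_constant(ka, kd):
    """Equilibrium dissociation constant KD = kd / ka (in M)."""
    return kd / ka


K_D = {name: dissociation_constant(ka, kd) for name, (ka, kd) in ONE_TO_ONE.items()}

# Polyamide 11 (2:1 model): combine the two stepwise constants as a
# geometric mean to obtain a per-bound-molecule value.
kd1_11 = dissociation_constant(8.8e3, 5.4e-3)
kd2_11 = dissociation_constant(4.4e2, 5.7e-6)
K_D["11"] = math.sqrt(kd1_11 * kd2_11)

for name in sorted(K_D):
    print(f"polyamide {name}: KD ~ {K_D[name]:.1e} M")
print(f"KD(13) / KD(12) ~ {K_D['13'] / K_D['12']:.1f}")
print(f"KD(13) / KD(15) ~ {K_D['13'] / K_D['15']:.1f}")
```

The two printed ratios correspond to the "about fourfold" and "about fivefold" comparisons made in the text.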


2.3 Conclusion

We rationally designed and readily synthesized PI polyamide–Chl conjugates of different lengths and structures targeting CAG/CTG repeat sequences. In particular, by introducing novel building blocks synthesized in the solution phase into Fmoc SPPS, we successfully prepared long-chain hairpin PI polyamides that would otherwise be difficult to synthesize. The HR-PAGE analysis revealed that the long-chain hairpin PI polyamide–Chl conjugates showed sequence-specific alkylation at the N3 of adenines and/or guanines in CAG/CTG repeat sequences. In vitro transcription assays demonstrated that the sequence-specific alkylation by the long-chain PI polyamide–Chl conjugate arrested the progression of RNA polymerase at the alkylating sites. The chiral substitution on the γ-turn with an amino group enhanced the binding affinity as measured by SPR-binding assays. Collectively, these results suggest that long-chain PI polyamide–Chl conjugates targeting CAG/CTG repeat DNAs could be developed into potential drugs for trinucleotide repeat diseases or into chemical biology tools for probing expanded CAG/CTG repeat sequences. Moreover, as reported previously [41], the concept of using building blocks that correspond to a repeated motif to provide relatively long polyamides is applicable not only to this case but also to targeting other repeated sequences such as CGG and GAA repeats.

2.4 Materials and Methods

2.4.1 General

1H NMR spectra were recorded on a JEOL JNM ECA-600 spectrometer (600 MHz for 1H), with chemical shifts reported in parts per million relative to residual solvent and coupling constants in hertz. The following abbreviations were applied to spin multiplicity: s (singlet), br s (broad singlet), d (doublet), t (triplet), q (quartet), and m (multiplet). HPLC analysis was performed on a Jasco Engineering PU-2080 plus series system using a 150 × 4.6 mm X-Terra MS C18 reversed-phase column in 0.1% TFA in water with acetonitrile as the eluent at a flow rate of 1.0 mL/min and a linear gradient elution of 0–100% acetonitrile in 20 or 40 min with detection at 254 nm. Collected fractions were analyzed by ESI-TOF-MS (Bruker). Reversed-phase flash column chromatography was performed on a CombiFlash Rf (Teledyne Isco, Inc.) using a 4.3 g reversed-phase flash column (C18 RediSep Rf) in 0.1% TFA in water with acetonitrile as the eluent at a flow rate of 18.0 mL/min and a linear gradient elution of 0–35% acetonitrile in 5–40 min with detection at 254 nm. HPLC purification was performed with a Jasco Engineering UV2075 HPLC UV/VIS detector and a PU-2080 plus series system using a 150 × 4.6 mm CHEMCOBOND 5-ODS-H reversed-phase column in 0.1% TFA in water with acetonitrile as the


eluent at a flow rate of 1.0 mL/min and a linear gradient elution of 30–75% acetonitrile in 30 min with detection at 254 nm. Polyacrylamide gel electrophoresis was performed on a HITACHI SQ5500-S DNA sequencer. Long Ranger gel solution (50%) was purchased from FMC Bioproducts. Calf thymus DNA (10 mg/mL, 1 μL) was purchased from Invitrogen, and the Thermo Sequence core sequencing kit was from GE Healthcare. In vitro transcription assays were carried out using the in vitro Transcription T7 Kit (Takara Bio Inc.), and the transcripts were analyzed on 10% TBE-Urea Gels containing 7 M urea (Invitrogen). SPR assays were performed with a Biacore X system (GE Healthcare), and data were processed using BIAevaluation, version 4.1. Chlorambucil (Chl) was purchased from Sigma, and BocHN-β-alanine, 3,3′-diamino-N-methyldipropylamine (DMDPA), and 10% Pd/C were from Aldrich. Fmoc-β-Wang resin (0.55 mmol/g) and O-(1H-6-chlorobenzotriazol-1-yl)-1,1,3,3-tetramethyluronium hexafluorophosphate (HCTU) were purchased from Peptide International. FmocHN-P-CO2H, FmocHN-I-CO2H, O2N-I-COCl3, N,N-dimethylformamide (DMF), 1-methyl-2-pyrrolidone (NMP), and piperidine were purchased from Wako, and Boc-D-Dab-OH was from Watanabe Chemical Ind., Ltd. PyBOP was purchased from Novabiochem. Diisopropylethylamine (DIEA) and 4-amino-n-butyric acid (GABA) were purchased from Nacalai Tesque, Inc. Trifluoroacetic acid (TFA) was purchased from Kanto Chemical Co., Inc. Dichloromethane (DCM) was purchased from Sasaki Chemical Co., Ltd. The other reagents and solvents were purchased from standard suppliers and used without further purification.
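As a small aid for reading the retention times quoted later, the linear gradients described in this section can be written as a function of time. This is only a sketch: the function name is hypothetical, and holding the composition constant outside the ramp window is an assumption, since the text does not describe hold steps or system dwell volume.

```python
def percent_acetonitrile(t_min, start_pct, end_pct, t_start=0.0, t_end=30.0):
    """Nominal acetonitrile percentage at time t_min for a linear gradient.

    Outside the ramp window the composition is held at the nearest end
    value (an assumption made for this sketch).
    """
    if t_min <= t_start:
        return start_pct
    if t_min >= t_end:
        return end_pct
    fraction = (t_min - t_start) / (t_end - t_start)
    return start_pct + fraction * (end_pct - start_pct)


# HPLC purification: 30-75% acetonitrile over 30 min, sampled at 15 min.
print(percent_acetonitrile(15, 30, 75, 0, 30))

# Flash chromatography: 0-35% acetonitrile between 5 and 40 min,
# sampled at 23 min.
print(percent_acetonitrile(23, 0, 35, 5, 40))
```

The value returned is the programmed composition at the pump, not necessarily the composition at the column outlet at that moment.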

2.4.2 Syntheses of Compounds 5–10

FmocHN-P-γ-CO2H (compound 5). Compound 5 was synthesized from FmocHN-P-CO2H. To DMF were added FmocHN-P-CO2H (1.50 g, 4.1 mmol), HCTU (1.71 g, 4.1 mmol), and DIEA (1.44 mL, 8.3 mmol), and the mixture was stirred at room temperature for 15 min. The reaction mixture was added slowly to a solution of GABA (858 mg, 8.3 mmol) in 1.4 mL DIEA and 10 mL DMF, and the resulting mixture was stirred at room temperature for 3 h. After concentration, the oily residue was dissolved in the minimum amount of DCM and washed with diethyl ether. After the supernatant was removed, a large amount of water was added to the residue to induce precipitation. The precipitate was collected by filtration and dried in vacuo to give 5 as a brown powder (1.73 g, 94%). The compound was used in the solid-phase synthesis without further purification. 1H NMR (600 MHz, DMSO-d6) δ 9.42 (s, 1H, NH), 8.31 (br s, 1H, NH), 7.90 (d, J = 7.1 Hz, 2H, CH), 7.73 (d, J = 7.5 Hz, 2H, CH), 7.42 (t, J = 7.5 Hz, 2H, CH), 7.34 (t, J = 7.6 Hz, 2H, CH), 6.85 (s, 1H, CH), 6.65 (s, 1H, CH), 4.40 (d, J = 6.8 Hz, 2H, CH2), 4.27 (t, J = 6.8 Hz, 1H, CH), 3.76 (s, 3H, NCH3), 3.14 (m, 2H, CH2), 2.16 (br s, 2H, CH2), 1.67 (m, 2H, CH2). ESI-MS m/z calcd for C25H25N3O5 [M + H]+ 448.1872, found 448.1869.
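The quantities reported for compound 5 can be cross-checked with a minimal bookkeeping sketch. The product formula is the one given in the ESI-MS data; the formula C21H18N2O4 for the FmocHN-P-CO2H starting material and the reading of the 1.50 g (4.1 mmol) charge as that starting material are assumptions made here, as are all names in the code.

```python
# Average atomic masses (g/mol), rounded; adequate for bench-scale checks.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def molar_mass(composition):
    """Molar mass in g/mol from an element -> count mapping."""
    return sum(ATOMIC_MASS[element] * n for element, n in composition.items())


# Assumed formula of the starting material FmocHN-P-CO2H (not in the text).
M_START = molar_mass({"C": 21, "H": 18, "N": 2, "O": 4})
# Formula of compound 5 as given in the ESI-MS data.
M_PRODUCT = molar_mass({"C": 25, "H": 25, "N": 3, "O": 5})

mmol_start = 1.50 / M_START * 1000      # 1.50 g charged
mmol_product = 1.73 / M_PRODUCT * 1000  # 1.73 g isolated
yield_pct = 100 * mmol_product / mmol_start

print(f"starting material: {mmol_start:.2f} mmol")
print(f"compound 5:        {mmol_product:.2f} mmol")
print(f"yield:             {yield_pct:.0f}%")
```

With these assumptions the calculated yield falls within about one percentage point of the reported 94%; the small difference depends on whether the rounded 4.1 mmol or the unrounded amount is used as the basis.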


Boc-D-Dab(FmocHN-P)-CO2H (compound 6). Compound 6 was synthesized from FmocHN-P-CO2H. To 5 mL of DMF were added FmocHN-P-CO2H (300 mg, 0.83 mmol), PyBOP (431 mg, 0.83 mmol), and DIEA (0.40 mL), and the mixture was stirred at room temperature for 1 h. To the reaction mixture were added Boc-D-Dab-OH (190 mg, 0.87 mmol) and 2 mL of DMF, and the mixture was stirred at room temperature for 3 h. After concentration, the oily residue was dissolved in the minimum amount of ethyl acetate and washed with diethyl ether. After the supernatant was removed, a large amount of water was added to the residue to induce precipitation. The precipitate was collected by filtration and dried in vacuo to give 6 as a brown powder (478 mg, quant). The compound was used in the solid-phase synthesis without further purification. 1H NMR (600 MHz, CDCl3) δ 7.77 (d, 2H, J = 7.6 Hz, CH), 7.61 (d, 2H, J = 6.9 Hz, CH), 7.40 (t, 2H, J = 7.2 Hz, CH), 7.31 (t, 2H, J = 7.2 Hz, CH), 6.85 (s, 1H), 6.63 (s, 1H), 6.54 (s, 1H), 5.57 (d, 1H, J = 6.2 Hz, NH), 4.48 (d, 2H, J = 6.2 Hz, CH2), 4.34 (d, 2H, J = 6.9 Hz), 4.23 (t, 1H, J = 6.2 Hz, CH), 3.83 (s, 3H, NCH3), 3.76 (m, 1H, CH), 3.68 (m, 1H, CH), 3.17 (m, 1H, CH), 2.02 (m, 1H, CH), 1.95 (m, 1H, CH), 1.43 (s, 9H, CH3). ESI-MS m/z calcd for C30H34N4O7 [M + H]+ 563.2506, found 563.2491.

BocHN-β-I-CO2Me (compound 8). Compound 8 was synthesized from O2N-I-COCl3. Compound 7 was prepared from O2N-I-COCl3 in two steps using essentially the same method as reported previously [54]. To DMF (12 mL) were added compound 7 (1.37 g, 8.83 mmol), BocHN-β-alanine (1.68 g, 8.86 mmol), HCTU (3.85 g, 9.30 mmol), and DIEA (3.10 mL, 17.7 mmol), and the solution was stirred at room temperature for 1.5 h. After concentration, a large amount of water was added to the oily residue to induce precipitation. The precipitate was collected by filtration and dried in vacuo to afford 8 as a white powder (2.44 g, 7.46 mmol, 84%). The compound was used in the next reaction step without further purification.
1H NMR (600 MHz, CDCl3) δ 8.45 (s, 1H, NH), 7.67 (d, J = 1.4 Hz, 1H, CH), 7.49 (s, 1H, NH), 4.00 (s, 3H, OCH3), 3.95 (s, 3H, NCH3), 3.49 (q, J = 6.2 Hz, 2H, CH2), 2.63 (t, J = 6.22 Hz, 2H, CH2), 1.42 (s, 9H, CH3).

FmocHN-P-β-I-CO2H (compound 10). The Boc group of compound 8 (1.23 g, 3.75 mmol) was removed with TFA (2.4 mL), which was then evaporated. To a solution of the crude product in H2O/MeOH (1:1, 20 mL in total) was added NaOH (1.26 g), and the solution was stirred at 40 °C for 2 h. After MeOH was removed from the reaction mixture on an evaporator, the mixture was neutralized with 6 N aqueous HCl and concentrated in vacuo to afford crude, gelatinous 9. To a solution of this crude 9 in DMF (5 mL) was added a solution of FmocHN-P-CO2H (816 mg, 2.25 mmol), HCTU (932 mg, 2.25 mmol), and DIEA (1.3 mL) in DMF (10 mL) that had been stirred at room temperature for 1 h, and the reaction mixture was stirred at room temperature for 5 h. After the salt was removed from the reaction mixture by filtration, a large amount of water was added to the oily residue obtained to induce precipitation. The crude solid collected by filtration was dissolved in a minimum amount of DCM and reprecipitated by adding diethyl ether. The pellet was collected by centrifugation and dried in vacuo to afford 10 as a pale brown powder (1.07 g, 1.93 mmol, 73%). The compound was used in the solid-phase synthesis without further purification. 1H NMR (600 MHz, DMSO-d6) δ 9.50 (s, 1H, NH), 9.40 (s, 1H, NH), 8.01 (s, 1H, NH), 7.90 (d, J = 7.6 Hz, 2H, CH), 7.72 (d, J = 7.5 Hz, 2H, CH), 7.42 (t, J = 7.6 Hz, 2H, CH), 7.33 (t, J = 7.6 Hz, 2H, CH), 7.20 (s, 1H, CH), 6.86 (s, 2H, CH), 4.41 (d, J = 6.2 Hz, 2H, CH2), 4.26 (t, J = 6.8 Hz, 1H, CH), 3.95 (s, 3H, NCH3), 3.77 (s, 3H, NCH3), 3.48 (m, 2H, CH2), 2.52 (t, J = 2.0 Hz, 2H, CH2). ESI-MS m/z calcd for C29H28N6O6 [M + H]+ 557.2149, found 557.2137.
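The calculated m/z values quoted for the building blocks can be reproduced from the molecular formulas with a short sketch. It sums monoisotopic masses and adds one hydrogen-atom mass per charge without subtracting the electron mass; this convention is an assumption that appears to match the calculated values for compounds 5, 6, and 10, and the function names are hypothetical.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes.
MONO = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
    "Cl": 34.9688527,
}


def parse_formula(formula):
    """Turn a formula such as 'C29H28N6O6' into an element -> count dict."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + int(number or 1)
    return counts


def mz_protonated(formula, charge=1):
    """m/z of [M + nH]n+ for a neutral formula M.

    Adds n hydrogen-atom masses and divides by n; the electron mass is
    not subtracted (assumed convention for this sketch).
    """
    neutral = sum(MONO[el] * n for el, n in parse_formula(formula).items())
    return (neutral + charge * MONO["H"]) / charge


for label, formula in [("5", "C25H25N3O5"), ("6", "C30H34N4O7"), ("10", "C29H28N6O6")]:
    print(f"compound {label}: [M + H]+ = {mz_protonated(formula):.4f}")
```

The same function accepts `charge=2` or `charge=3` for the multiply protonated ions reported for the polyamides, provided the formula passed in is that of the neutral molecule.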

2.4.3 Syntheses of Parent PI Polyamides and Their Chlorambucil Conjugates

General Procedures of Fmoc Solid-Phase Peptide Synthesis. Each polyamide was synthesized on a PSSM-8 (Shimadzu) computer-assisted operation system on a 0.03 mmol scale using Fmoc chemistry. The Fmoc building block (0.20 mmol) for each step was dissolved in NMP on the synthesis line. The synthetic procedure for all PI polyamides was as follows: deblocking twice for 4 min with 20% piperidine/DMF (0.6 mL), activating for 2 min with HCTU (88 mg, 0.21 mmol) in NMP (1 mL) and 10% DIEA/DMF (0.4 mL), coupling for 60 min, and washing with DMF. All coupling reactions were carried out with a single coupling cycle. The building blocks used in this study were FmocHN-P-CO2H (77 mg), FmocHN-I-CO2H (77 mg), and compound 10 (70 mg) for the synthesis of polyamide 11; FmocHN-P-CO2H (77 mg), FmocHN-I-CO2H (77 mg), compound 5 (70 mg), and compound 10 (70 mg) for the synthesis of polyamides 12 and 13; and FmocHN-P-CO2H (77 mg), FmocHN-I-CO2H (77 mg), compound 6 (70 mg), and compound 10 (70 mg) for the synthesis of polyamide 14. At the final capping step, the samples were washed with 20% acetic acid in DMF (1 mL). All lines were purged during solution transfers, and the resin was agitated by bubbling N2 gas. After the completion of the synthesis, the resin was washed with DMF (2 mL) and methanol (2 mL) and then dried in a desiccator at room temperature in vacuo.

AcIP-β-IP-β-(CH2)3-NCH3-(CH2)3-NH2 (compound 11). Using 43 mg of Fmoc-β-Wang resin (0.55 mmol/g) and the novel building block 10, compound 11 was synthesized stepwise by the Fmoc solid-phase protocol described above. The resulting AcIP-β-IP-β-Wang resin was cleaved with 200 μL of DMDPA for 3 h at 55 °C, and the solvent was evaporated. The residue was dissolved in the minimum amount of DCM and washed with diethyl ether to yield 15.1 mg of a white solid, which was used in the next coupling reaction without further purification. For SPR assays, the crude compound was purified by reversed-phase flash column chromatography (water with 0.1% TFA containing 0–35% CH3CN over a linear gradient in 5–40 min). The peak around 23 min was collected and lyophilized to produce 11 (6.6 mg, 7.9 μmol, 23%) as a yellow solid. 1H NMR (600 MHz, DMSO-d6) δ 10.28 (s, 1H, NH) 10.23 (s, 1H, NH) 9.93 (s, 1H, NH) 9.90 (s, 1H, NH) 8.09–8.05 (m, 3H, NH) 7.45 (s, 1H, CH) 7.41 (s, 1H, CH) 7.22 (d, 1H, J = 1.4 Hz,


CH) 7.20 (s, 1H, CH) 6.93 (s, 2H, CH) 3.94 (s, 1H, NCH3) 3.93 (s, 1H, NCH3) 3.81 (s, 1H, NCH3) 3.80 (s, 1H, NCH3) 3.42 (m, 6H, CH2) 2.87 (br, 3H, CH3) 2.77–2.73 (m, 6H, CH2) 2.59 (t, J = 6.8 Hz, 2H, CH2) 2.35 (t, 2H, J = 7.5 Hz, CH2) 2.02 (s, 3H, CH3) 1.77 (m, 4H, CH2). ESI-MS m/z calcd for C37H53N15O7 [M + 2H]2+ 410.7205, found 410.7201.

AcIP-β-IP-β-(CH2)3-NCH3-(CH2)3-NH-Chl (conjugate 1). To a solution of the crude compound 11 (2.0 mg) in DMF (200 μL) were added PyBOP (5.1 mg, 10 μmol), DIEA (1.7 μL, 10 μmol), and chlorambucil (3.1 mg, 10 μmol). The reaction mixture was incubated overnight at room temperature. Evaporation of the solvent gave a yellow crude product, which was washed with ether and DCM and then purified by HPLC using a CHEMCOBOND 5-ODS-H column (water with 0.1% TFA/MeCN, 30–75% linear gradient, 0–30 min, detection at 254 nm) to produce 1 (1 mg, 0.91 μmol) as a pale yellow powder. ESI-MS m/z calcd for C51H72Cl2N16O8 [M + 2H]2+ 553.2548, found 553.2527.

AcIP-β-IP-γ-IP-β-IP-β-(CH2)3-NCH3-(CH2)3-NH2 (compound 12). Using 45 mg of Fmoc-β-Wang resin (0.55 mmol/g) and the two novel building blocks 5 and 10, compound 12 was synthesized stepwise by the Fmoc solid-phase protocol described above. A subsequent procedure similar to that used for the preparation of compound 11 provided compound 12 (3.6 mg, 2.5 μmol, 10%) as a yellow powder. 1H NMR (600 MHz, DMSO-d6) δ 10.29 (s, 1H, NH) 10.27 (s, 1H, NH) 10.25 (s, 1H, NH) 10.23 (s, 1H, NH) 9.96–9.92 (m, 4H, NH) 8.09–8.06 (m, 5H, NH) 7.45 (s, 2H, CH) 7.43 (s, 1H, CH) 7.40 (s, 1H, CH) 7.22 (s, 1H, CH) 7.21 (s, 1H, CH) 7.20 (s, 2H, CH) 6.95 (s, 1H, CH) 6.93 (s, 3H, CH) 3.94–3.93 (m, 12H, NCH3) 3.81 (s, 6H, NCH3) 3.80 (s, 6H, NCH3) 3.42 (m, 8H, CH2) 3.19 (t, J = 6.2 Hz, 2H, CH2) 2.86 (s, 3H, NCH3) 2.75–2.73 (m, 6H, CH2) 2.59 (t, J = 7.6 Hz, 6H, CH2) 2.38 (m, 1H, CH2) 2.01 (s, 3H, CH3) 1.78–1.76 (m, 6H, CH2). ESI-MS m/z calcd for C66H87N27O13 [M + 2H]2+ 733.8567, found 733.8549.

AcIP-β-IP-γ-IP-β-IP-β-(CH2)3-NCH3-(CH2)3-NH-Chl (conjugate 2). To a solution of the crude compound 12 (1.9 mg) in DMF (200 μL) were added PyBOP (5.9 mg, 12 μmol), DIEA (1.9 μL, 11 μmol), and chlorambucil (3.3 mg, 11 μmol). A subsequent procedure similar to that used for the preparation of conjugate 1 provided 2 (0.5 mg, 0.28 μmol) as a pale yellow powder. ESI-MS m/z calcd for C80H104Cl2N28O14 [M + 2H]2+ 876.3910, found 876.3892.

AcIP-β-IP-β-IP-γ-IP-β-IP-β-IP-β-(CH2)3-NCH3-(CH2)3-NH2 (compound 13). Using 50 mg of Fmoc-β-Wang resin (0.60 mmol/g) and the two novel building blocks 5 and 10, compound 13 was synthesized stepwise by the Fmoc solid-phase protocol described above. A subsequent procedure similar to that used for the preparation of compound 11 provided compound 13 (4.9 mg, 2.4 μmol, 8%) as a yellow solid. 1H NMR (600 MHz, DMSO-d6) δ 10.29 (s, 3H, NH) 10.26 (s, 1H, NH) 10.25 (s, 1H, NH) 10.23 (s, 1H, NH) 9.94 (s, 1H, NH) 9.93 (s, 2H, NH) 9.92 (s, 2H, NH) 9.91 (s, 1H, NH) 8.07 (m, 7H, NH) 7.44 (s, 3H, CH) 7.43 (s, 1H, CH) 7.41 (s, 2H, CH) 7.21–7.19 (m, 6H, CH) 6.95–6.93 (m, 6H, CH) 3.94 (s, 12H, NCH3) 3.93 (s, 6H, NCH3) 3.81 (s, 12H, NCH3) 3.80 (s, 6H, NCH3) 3.41 (m, 14H, CH2)


2.77–2.72 (m, 6H, CH2) 2.58 (t, J = 6.8 Hz, 10H, CH2) 2.37–2.33 (m, 2H, CH2) 2.08 (s, 3H, CH3) 2.02 (s, 3H, CH3) 1.79–1.75 (m, 6H, CH2). ESI-MS m/z calcd for C94H119N39O19 [M + 3H]3+ 700.3260, found 700.3239.

AcIP-β-IP-β-IP-γ-IP-β-IP-β-IP-β-(CH2)3-NCH3-(CH2)3-NH-Chl (conjugate 3). To a solution of the crude compound 13 (2.1 mg) in DMF (200 μL) were added PyBOP (3.9 mg, 7.6 μmol), DIEA (1.3 μL, 7.5 μmol), and chlorambucil (2.3 mg, 7.6 μmol). A subsequent procedure similar to that used for the preparation of conjugate 1 provided 3 (0.6 mg, 0.25 μmol) as a pale yellow powder. ESI-MS m/z calcd for C108H136Cl2N40O20 [M + 3H]3+ 795.3489, found 795.3477.

Boc-D-Dab(AcIP-β-IP-β-IP)-IP-β-IP-β-IP-β-(CH2)3-NCH3-(CH2)3-NH2 (compound 14). Using 44 mg of Fmoc-β-Wang resin (0.60 mmol/g) and the two novel building blocks 6 and 10, compound 14 was synthesized stepwise by the Fmoc solid-phase protocol described above. The resulting Boc-D-Dab(AcImPy-β-ImPy-β-ImPy)-ImPy-β-ImPy-β-ImPy-β-Wang resin was cleaved with 200 μL of DMDPA and 100 μL of DMF for 3 h at 45 °C. The solvent was evaporated, and the residue was dissolved in the minimum amount of DCM and washed with diethyl ether to yield 18.5 mg of a white solid, which was used in the next reactions without further purification.

AcIP-β-IP-β-IP-D-Dab-IP-β-IP-β-IP-β-(CH2)3-NCH3-(CH2)3-NH2 (compound 15). To a solution of the crude compound 14 (16.3 mg) in DCM (600 μL) was added TFA (400 μL), and the mixture was shaken at room temperature for 30 min. The mixture was concentrated in vacuo, and the residue was dissolved in the minimum amount of DCM and washed with diethyl ether to afford an oily residue. The residue was purified on a reversed-phase flash column as described above to obtain 15 (3.5 mg, 1.7 μmol) as a yellow solid. ESI-MS m/z calcd for C94H120N40O19 [M + 3H]3+ 705.3296, found 705.3255.

AcIP-β-IP-β-IP-D-Dab-IP-β-IP-β-IP-β-(CH2)3-NCH3-(CH2)3-NH-Chl (conjugate 4). To a solution of the crude compound 14 (16.3 mg) in DMF (1 mL) were added PyBOP (41.8 mg, 80 μmol), DIEA (15 μL), and chlorambucil (22.6 mg, 74 μmol). The reaction mixture was mixed for 9 h at room temperature. Evaporation of the solvent gave a yellow residue, which was dissolved in the minimum amount of DCM and washed with diethyl ether. It was purified by reversed-phase flash column chromatography as described above, and the Boc group was then removed with TFA–DCM (1:1, 2.4 mL in total) to give an oily residue. The residue was dissolved in the minimum amount of DCM, washed with diethyl ether, and purified by HPLC as described above to produce 4 (0.7 mg, 0.29 μmol) as a pale yellow powder. ESI-MS m/z calcd for C108H137Cl2N41O20 [M + 3H]3+ 800.3525, found 800.3488.
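The percentage yields quoted for polyamides 12 and 13 are consistent with a calculation based on the nominal resin loading, which can be sketched as follows. Whether the yields were actually computed this way is an assumption, and the function names are hypothetical.

```python
def resin_scale_umol(resin_mg, loading_mmol_per_g):
    """Theoretical scale in µmol (mg multiplied by mmol/g gives µmol)."""
    return resin_mg * loading_mmol_per_g


def overall_yield_pct(product_umol, resin_mg, loading_mmol_per_g):
    """Isolated yield relative to the nominal resin loading."""
    return 100 * product_umol / resin_scale_umol(resin_mg, loading_mmol_per_g)


# Polyamide 12: 2.5 µmol isolated from 45 mg of resin at 0.55 mmol/g.
yield_12 = overall_yield_pct(2.5, 45, 0.55)
# Polyamide 13: 2.4 µmol isolated from 50 mg of resin at 0.60 mmol/g.
yield_13 = overall_yield_pct(2.4, 50, 0.60)

print(f"polyamide 12: {yield_12:.0f}%")
print(f"polyamide 13: {yield_13:.0f}%")
```

The same helper can be applied to the other resin-bound syntheses in this section by substituting the resin mass, loading, and isolated amount.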


2.4.4 Preparation of Plasmid Containing (CAG/CTG)12 Repeat Sequences

All DNA fragments and primers for cloning or DNA amplification were purchased from Sigma. The 38 bp DNA fragments (5′-(CAG)12CA-3′ and 5′-G(CTG)12A-3′) were annealed and then ligated into pGEM-T Easy Vectors (Promega). Escherichia coli JM109 competent cells (Toyobo) were transformed and cultured on LB plates with 50 μg/mL ampicillin overnight at 37 °C. Transformed colonies were identified by direct colony PCR using 1× GoTaq Green Master Mix (Promega) and 300 nM of a primer set (T7: 5′-TAATA CGACT CACTA TAGG-3′, SP6: 5′-ATTTA GGTGA CACTA TAGAA TAC-3′) by a standard PCR method. An appropriate colony was selected and cultured in liquid LB medium with ampicillin at 37 °C overnight. Plasmids were extracted using the GenElute Plasmid Miniprep Kit (Sigma-Aldrich), and their sequences were confirmed on a 3130xL Genetic Analyzer (Applied Biosystems).
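How the two insert oligonucleotides pair on annealing can be checked with a few lines of code. The sketch below writes out the two sequences given above and tests whether everything except the last 3′ base of each strand is mutually complementary; the variable and function names are hypothetical.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' -> 3'."""
    return seq.translate(COMPLEMENT)[::-1]


top = "CAG" * 12 + "CA"          # 5'-(CAG)12CA-3'
bottom = "G" + "CTG" * 12 + "A"  # 5'-G(CTG)12A-3'

# Drop the 3'-terminal base of each strand and compare the remainder.
paired = reverse_complement(top[:-1]) == bottom[:-1]
overhangs = (top[-1], bottom[-1])

print(len(top), len(bottom), paired, overhangs)
```

If the comparison holds, the annealed duplex has a 37 bp paired region flanked by a single unpaired base at each 3′ end, which is the geometry expected for ligation into a T-overhang vector such as pGEM-T Easy.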

2.4.5 Preparation of 5′-Texas Red-Labeled DNA Fragments and High-Resolution Gel Electrophoresis

The 5′-Texas Red-labeled DNA fragments containing the CAG/CTG repeat sequence were prepared by a standard PCR method using a primer set of 5′-Texas Red-labeled T7 and sp6 promoter primer or T7 promoter primer and 5′-Texas Red-labeled sp6 promoter primer. The PCR products were purified by GenElute PCR cleanup kit (Sigma-Aldrich). The 5′-Texas Red-labeled DNA fragments were alkylated by alkylating polyamides with specified concentrations in 5 mM sodium phosphate buffer (pH 7.0) containing 10% DMF at 23 °C for 18 h. The reaction was quenched by the addition of calf thymus DNA (10 mg/mL, 1 μL), and the mixtures were heated at 95 °C for 10 min to cleave the DNA strands at the alkylating sites. The DNA was recovered by vacuum centrifugation. The pellet was dissolved in 7 μL of loading dye (formamide with New fuchsin), heated at 95 °C for 25 min, and then immediately chilled on ice. A 1.8 μL aliquot was subjected to electrophoresis on a 6% denaturing polyacrylamide gel using a HITACHI SQ5500-E DNA Sequencer.

2.4.6 In Vitro Transcription Assays

DNA fragments including (CAG/CTG)12 repeats for RNA transcription were prepared by a standard PCR method using the T7/sp6 promoter primer sets. PCR products were purified by QIAquick Gel Extraction Kit (Qiagen). The obtained DNA fragments (8 nM) were incubated with various concentrations of 4 and 15 in a 5 mM sodium phosphate buffer (pH 7.0) containing 10% DMF at room temperature for 18 h. The alkylation reagents were quenched by the addition of DNA oligomer (200 μM, 2 μL). The reaction mixtures were purified by illustra MicroSpin G-25 Columns (GE Healthcare) and then lyophilized for 3 h. The DNA fragments were transcribed with in vitro Transcription T7 Kit (Takara Bio Inc.). After transcription, the remaining DNA fragments were digested by RNase-free DNase I (10 Units) for 30 min at 37 °C, and the obtained RNAs were purified by RNeasy Mini Kit (Qiagen). Transcription products were analyzed by PAGE at 180 V for 80 min using 10% TBE-Urea Gels containing 7 M urea (Invitrogen) and detected by SYBR Gold (Applied Biosystems). Before loading, samples were dissolved in TBE-Urea Sample Buffer (Invitrogen), heated for 5 min at 95 °C, and then immediately cooled on ice for a few minutes. Low Range ssRNA Ladder (New England Biolabs) was used as an RNA marker. The bands were photographed and analyzed with Fluoro Image Analyzer FLA-3000GF (Fujifilm).

2.4.7 Quantitative SPR-Binding Assays

SPR experiments were performed on a Biacore X instrument. A biotinylated hairpin DNA was purchased from Sigma-Aldrich. A streptavidin-functionalized SA sensor chip was purchased from Biacore. The biotinylated hairpin DNA was immobilized onto the chip to obtain the desired immobilization level (approximately 1200 RU rise). SPR assays were carried out using degassed and filtered HBS buffer (10 mM HEPES pH 7.4, 150 mM NaCl, 3 mM EDTA, and 0.005% Surfactant P20) with 0.1% DMSO at 25 °C. A series of sample solutions with various concentrations were prepared in the buffer with 0.1% DMSO and injected at a flow rate of 20 μL/min. To measure the binding affinity and kinetic parameters, data processing was performed with an appropriate fitting model using the BIAevaluation 4.1 program. The predefined models (1:1 binding model with mass transfer or with drifting baseline) were used for fitting the sensorgrams of polyamides 12, 13, and 15 because they gave better fits. For polyamide 11, to represent the dynamics of forming complexes with two polyamides bound side by side in the same region, the sensorgram was fitted with the following model (Eqs. 2.1a–2.1d):

$$A + B \;\underset{k_{d1}}{\overset{k_{a1}}{\rightleftharpoons}}\; AB, \qquad A + AB \;\underset{k_{d2}}{\overset{k_{a2}}{\rightleftharpoons}}\; A_2B$$

(A: polyamide; B: DNA; AB, A2B: polyamide–DNA complex)

$$\frac{dA}{dt} = k_t \cdot (\mathrm{Conc\ A}) - (k_{a1} \cdot A \cdot B - k_{d1} \cdot AB) - (k_{a2} \cdot A \cdot AB - k_{d2} \cdot A_2B) \tag{2.1a}$$

$$\frac{dB}{dt} = -(k_{a1} \cdot A \cdot B - k_{d1} \cdot AB) \tag{2.1b}$$

$$\frac{dAB}{dt} = (k_{a1} \cdot A \cdot B - k_{d1} \cdot AB) - (k_{a2} \cdot A \cdot AB - k_{d2} \cdot A_2B) \tag{2.1c}$$

$$\frac{dA_2B}{dt} = (k_{a2} \cdot A \cdot AB - k_{d2} \cdot A_2B) \tag{2.1d}$$

A and Conc A correspond to the polyamide close to the surface, which is capable of binding to the DNA ligand, and the bulk concentration of the polyamide, respectively. ka1 and ka2 represent the rates of association of the first and second polyamide with the ligand, respectively; kd1 and kd2 represent the rates of dissociation of the polyamide from a 1:1 complex and a 2:1 complex, respectively. kt is the empirical constant for mass transfer.
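The behavior of the two-step binding model in Eqs. 2.1a–2.1d can be explored by numerical integration. The following is a minimal sketch in Python using a fixed-step fourth-order Runge–Kutta scheme; it is not the BIAevaluation implementation. The function names `rates` and `simulate`, the rate constants, the bulk concentration, and the time span are hypothetical values in arbitrary consistent units, chosen only for illustration, and the mass-transfer term is coded as printed in Eq. 2.1a.

```python
def rates(state, conc, ka1, kd1, ka2, kd2, kt):
    """Right-hand sides of Eqs. 2.1a-2.1d for state = (A, B, AB, A2B)."""
    A, B, AB, A2B = state
    step1 = ka1 * A * B - kd1 * AB     # net rate of A + B -> AB
    step2 = ka2 * A * AB - kd2 * A2B   # net rate of A + AB -> A2B
    dA = kt * conc - step1 - step2     # Eq. 2.1a, mass-transfer term as printed
    dB = -step1                        # Eq. 2.1b
    dAB = step1 - step2                # Eq. 2.1c
    dA2B = step2                       # Eq. 2.1d
    return (dA, dB, dAB, dA2B)


def simulate(conc, params, b_total=1.0, t_end=20.0, dt=0.01):
    """Integrate the model from t = 0 to t_end and return the final state."""
    state = (0.0, b_total, 0.0, 0.0)   # no polyamide at the surface at t = 0
    n_steps = int(round(t_end / dt))
    for _ in range(n_steps):
        k1 = rates(state, conc, **params)
        k2 = rates(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)), conc, **params)
        k3 = rates(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)), conc, **params)
        k4 = rates(tuple(s + dt * k for s, k in zip(state, k3)), conc, **params)
        state = tuple(
            s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
    return state


# Hypothetical parameters (arbitrary units), for illustration only.
params = {"ka1": 1.0, "kd1": 0.1, "ka2": 0.5, "kd2": 0.2, "kt": 1.0}
A, B, AB, A2B = simulate(conc=0.5, params=params)
print(f"B = {B:.4f}, AB = {AB:.4f}, A2B = {A2B:.4f}")
print(f"total DNA = {B + AB + A2B:.6f}")
```

Because Eqs. 2.1b–2.1d sum to zero, the total amount of DNA (B + AB + A2B) stays constant during the integration, which gives a simple sanity check on any implementation of the model.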

References 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14. 15. 16. 17. 18. 19. 20. 21. 22. 23. 24. 25.

Dervan PB (2001) Bioorg Med Chem 9:2215–2235 Dervan PB, Edelson BS (2003) Curr Opin Struct Biol 13:284–299 Blackledge MS, Melander C (2013) Bioorg Med Chem 21:6101–6114 Bando T, Sugiyama H (2006) Acc Chem Res 39:935–944 Mirkin SM (2007) Nature 447:932–940 Cummings CJ, Zoghbi HY (2000) Hum Mol Gen 9:909–916 Kremer B, Goldberg P, Andrew SE, Theilmann J, Telenius H, Zeisler J, Squitieri F, Lin B, Bassett A, Almqvist E, Bird TD, Hayden MRN (1994) N Engl J Med 330:1401–1406 Reddy PH, Williams M, Charles V, Garrett L, Pike-Buchanan L, Whetsell WO Jr, Miller G, Tagle DA (1998) Nat Genet 20:198–202 Lin X, Cummings CJ, Zoghbi HY (1999) Neuron 24:499–502 Philips AV, Timchenko LT, Cooper TA (1998) Science 280:737–741 Mankodi A, Logigian E, Callahan L, McClain C, White R, Henderson R, Krym M, Thornton C (2000) Science 289:1769–1773 Orengo JP, Chambon P, Metzger D, Mosier DR, Snipes GJ, Cooper TA (2008) Proc Natl Acad Sci USA 105:2646–2651 Wheeler TM, Sobczak K, Lueck JD, Osborne RJ, Lin X, Dirksen RT, Thornton CA (2009) Science 325:336–339 Guan L, Disney MD (2013) Angew Chem Int Ed 52:10010–10013 Nakatani K, Hagihara S, Goto Y, Kobori A, Hagihara M, Hayashi G, Kyo M, Nomura M, Mishima M, Kojima C (2005) Nat Chem Biol 1:39–43 Bando T, Fujimoto J, Minoshima M, Shinohara K, Sasaki S, Kashiwazaki G, Mizumura M, Sugiyama H (2007) Bioorg Med Chem 15:6937–6942 Fujimoto J, Bando T, Minoshima M, Uchida S, Iwasaki M, Shinohara K, Sugiyama H (2008) Bioorg Med Chem 16:5899–5907 Ferguson LR, Liu AP, Denny WA, Cullinane C, Talarico T, Phillips DR (2000) Chem Biol Interact 126:15–31 Hashem VI, Sinden RR (2002) Mutat Res 508:107–119 Hashem VI, Pytlos MJ, Klysik EA, Tsuji K, Khajav M, Ashizawa T, Sinden RR (2004) Nucl Acids Res 32:6334–6346 Takahashi R, Bando T, Sugiyama H (2003) Bioorg Med Chem 11:2503–2509 Sasaki S, Bando T, Minoshima M, Shimizu T, Shinohara K, Takaoka T, Sugiyama H (2006) J Am Chem Soc 128:12162–12168 Yamamoto M, Bando T, Kawamoto Y, Taylor RD, Hashiya K, Sugiyama H (2014) 
Bioconjugate Chem 25:552–559 Taylor RD, Asamitsu S, Takenaka T, Yamamoto M, Hashiya K, Kawamoto Y, Bando T, Nagase H, Sugiyama H (2013) Chem Eur J 20:1310–1317 Wurtz NR, Turner JM, Baird EE, Dervan PB (2001) Org Lett 3:1201–1203

References 26. 27. 28. 29. 30. 31. 32. 33. 34. 35. 36. 37. 38. 39. 40. 41. 42. 43. 44. 45. 46. 47. 48. 49. 50. 51. 52. 53. 54.

67

Pelton JG, Wemmer DE (1989) Proc Natl Acad Sci USA 86:5723–5727 Mrksich M, Parks ME, Dervan PB (1994) J Am Chem Soc 116:7983–7988 Faguet GB (1994) J Clin Oncol 12:1974–1990 CLL Trialists’ Collaborative Group (1999) J Natl Cancer Inst 91:861–868 Wurtz NR, Dervan PB (2000) Chem Biol 7:153–161 Wang YD, Dziegielewski J, Wurtz NR, Dziegielewska B, Dervan PB, Beerman TA (2003) Nucleic Acids Res 31:1208–1215 Wyatt MD, Lee M, Garbhas BJ, Souhami RL, Hartley JA (1995) Biochem. 34:13034–13041 Sunters A, Springer CJ, Bagshawe KD, Souhami RL, Hartley JA (1992) Biochem Pharmacol 44:59–64 Minoshima M, Bando T, Shinohara K, Kashiwazaki G, Nishijima S, Sugiyama H (2010) Bioorg Med Chem 18:1236–1243 Trauger JW, Baird EE, Mrksich M, Dervan PB (1996) J Am Chem Soc 118:6160–6166 Turner JM, Swalley SE, Baird EE, Dervan PB (1998) J Am Chem Soc 120:6219–6226 Wang CCC, Ellervik U, Dervan PB (2001) Bioorg Med Chem 9:653–657 Bando T, Minoshima M, Kashiwazaki G, Shinohara K, Sasaki S, Fujimoto J, Ohtsuki A, Murakami M, Nakazono S, Sugiyama H (2008) Bioorg Med Chem 16:2286–2291 Minoshima M, Bando T, Sasaki S, Fujimoto J, Sugiyama H (2008) Nucl Acids Res 36:2889– 2894 Pilch DS, Poklar N, Gelfand CA, Law SM, Breslauer KJ, Baird EE, Dervan PB (1996) Proc Natl Acad Sci USA 93:8306–8311 Kawamoto Y, Bando T, Kamada F, Li Y, Hashiya K, Maeshima K, Sugiyama H (2013) J Am Chem Soc 135:16468–16477 Wetzler M, Wemmer D (2010) Org Lett 12:3488–3490 Herman DM, Baird EE, Dervan PB (1998) J Am Chem Soc 120:1382–1391 Bando T, Narita A, Saito I, Sugiyama H (2002) Chem Eur J 8:4781–4790 Sasaki S, Bando T, Minoshima M, Shinohara K, Sugiyama H (2008) Chem Eur J 14:864–870 Shinohara K, Sasaki S, Minoshima M, Bando T, Sugiyama H (2006) Nucl Acids Res 34:1189– 1195 Sugiyama H, Fujiwara T, Ura A, Tashiro T, Yamamoto K, Kawanishi S, Saito I (1994) Chem Res Toxicol 7:673–683 Tao Z, Fujiwara T, Saito I, Sugiyama H (1998) Angew Chem Int Ed 38:650–653 Minoshima M, Bando T, Sasaki S, Shinohara K, Shimizu T, Fujimoto J, 
Sugiyama H (2007) J Am Chem Soc 129:5384–5390 Oyoshi T, Kawakami M, Narita A, Bando T, Sugiyama H (2003) J Am Chem Soc 125:4752– 4754 Gottesfeld JM, Belitsky JM, Melander C, Dervan PB, Luger K (2002) J Mol Biol 321:249–263 Lacy ER, Le NM, Price CA, Lee M, Wilson WD (2002) J Am Chem Soc 124:2153–2163 Henry JA, Le NM, Nguyen B, Howard CM, Bailey SL, Horick SM, Buchmueller KL, Kotecha M, Hochhauser D, Hartley JA, Wilson WD, Lee M (2004) Biochem 43:12249–12257 Baird EE, Dervan PB (1996) J Am Chem Soc 118:6141–6146

Chapter 3

Ligand-Mediated G-Quadruplex Induction in a Double-Stranded DNA Context by Cyclic Imidazole/Lysine Polyamide

Abstract

G-quadruplex (G4) DNA is often observed as a DNA secondary structure in guanine-rich sequences and is thought to be relevant to pharmacological and biological events. Therefore, G4 ligands have attracted great attention as potential anticancer therapies or in molecular probe applications. Here, we designed cyclic imidazole/lysine polyamide (cIKP) as a new class of G4 ligand. It was readily synthesized without time-consuming column chromatography. cIKP selectively recognized particular G4 structures with low nanomolar affinity. Moreover, cIKP exhibited the ability to induce G4 formation of the promoter of G4-containing DNA in the context of stable double-stranded DNA (dsDNA) under molecular crowding conditions. This cIKP might be applicable as a molecular probe for the detection of potential G4-forming sequences in dsDNA.

Keywords

Cyclic polyamide · G4 induction · G-quadruplex · Heterocycles · Molecular crowding condition

3.1 Introduction

G-quadruplex (G4) DNA is a four-stranded nucleic acid structure in which each G-quartet comprises four planar guanines that stabilize each other by Hoogsteen hydrogen bonding. G4 structures can form in G-rich sequences, and monovalent metal cations (K+ or Na+) enhance the stability of such structures [1]. It has been shown that G4-forming sequences are widely present in the human genome, notably in the promoters and 5′-UTRs of genes that are relevant to several human diseases, as well as in telomere sequences [2]. Hence, G4 structures have attracted great attention regarding their biological significance, including gene regulation [3a,b], epigenetic regulation [3c], and genome stability [3d], as well as for therapeutic use [3e–g].

Several synthetic molecules have been reported as G4 ligands, such as flat aromatic, macrocyclic, and crescent-shaped compounds [4]. A naturally occurring macrocycle, telomestatin, exhibits telomerase-inhibition activity by binding to telomeric G4 structures, and this has accelerated attempts to study synthetic macrocycles for targeting these structures, with the aim of developing anticancer therapies or molecular probes [3d, 5]. For example, furan-based cyclic oligopeptides were shown to be highly selective for G4 structures, and some cyclic oligopeptides (24-membered rings) suppressed mRNA expression of the c-myc oncogene, the promoter of which contains G4-forming sequences [6]. A cationic telomestatin derivative showed a strong binding affinity for telomeric G4 over other G4 structures [7]. Several other macrocyclic compounds have also been reported as G4 ligands, and, in some cases, selectivity for the G4 topology was successfully achieved [3b, 8]. Therefore, such macrocyclic structures are attractive scaffolds to achieve both binding affinity and selectivity for particular G4 structures.

For years, our group has studied a synthetic DNA minor-groove binder, “pyrrole–imidazole polyamide” (PIP). PIP was developed by Dervan and colleagues as a programmable DNA-binding molecule that discriminates between A/T and G/C base pairings with high affinity and specificity [9]. In our PIP studies, we have reported that consecutive imidazole rings held together by amide bonds adopt a highly planar conformation, as predicted by density functional theory [10]. Against this background, in the present study, we created cyclic imidazole/lysine polyamide, cIKP (1), as a new class of G4 ligand. It was readily synthesized and enabled selective recognition of particular G4s with higher binding affinity compared to TMPyP4, a well-studied G4-interactive compound (Scheme 3.1). cIKP also induced G4 formation at promoter G4 DNAs in the context of stable double-stranded (ds) DNA under molecular crowding conditions.

Scheme 3.1 Chemical structures of cIKP (1) (a) and TMPyP4 (b)

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2020. S. Asamitsu, Development of Selective DNA-Interacting Ligands, Springer Theses, https://doi.org/10.1007/978-981-15-7716-1_3

3.2 Results and Discussion

3.2.1 Molecular Design and Synthesis

Based on our previous investigation, the imidazole moiety was selected as a key component for the molecular planarity of the structure, although other heteroaromatic rings are widely used to construct the planar backbones in most other macrocyclic G4 ligands [4a, 11]. The lysine residue was adopted as another core component, mainly because its positively charged side chains can electrostatically interact with the negatively charged phosphate backbone of G4s [12]. Although the incorporation of two lysine moieties seems likely to sacrifice the high planarity of the molecule, this relatively flexible compound can mold onto the surface of the G-quartet [12]. cIKP (1) was synthesized over nine steps, without time-consuming chromatographic purification other than for the final product (Scheme 3.2). Moreover, whereas the ring-closing step in the construction of a macrocyclic backbone generally takes two or three days and proceeds in low yield [6–8b], dimerization of the linear imidazole/lysine polyamide gave a better yield in a shorter reaction time. This may be attributed to internal H-bond formation between the N of imidazole and the H of the adjacent amide [10]. Three commonly used condensation agents (PyBOP, HCTU, and DPPA) were effective in this dimerization reaction, with comparable yields achieved within 4 h (see the Materials and Methods section).

Scheme 3.2 Synthetic route of cIKP (1) in solution phase. Compound 2 was prepared from NO2–I–COCCl3. Reagents and conditions: (i) H2, 10 wt% Pd/C, AcOEt, MeOH, rt. (ii) BocNH-Lys(Cbz)-OH, HCTU, DIEA, DMF, rt, 64% over two steps. (iii) NaOH, MeOH, H2O, 45 °C. (iv) TFA, DCM, rt. (v) FDPP, DIEA, DMF, rt, 52% over three steps. (vi) TfOH, TFA, rt, 98%

3.2.2 SPR-Binding Assays

To characterize the binding of 1 to G4s and dsDNA, we performed an SPR-binding assay (Fig. 3.1) [13]. We selected five DNAs harboring G4-forming sequences located in a human telomere and in the promoter regions of c-myc, c-kit-1, c-kit-2, and BCL2 (Table 3.1) [6, 14]. The sensorgram for c-myc was better fitted by a single-site model, whereas those for telomeric/c-kit-1/c-kit-2/BCL2 G4s were better fitted by two-site models, thus suggesting 1:1 and 2:1 stoichiometry for binding, respectively. Both fitting models include equations that reflect the mass transfer limitation effect. The association rate (ka), dissociation rate (kd), and dissociation constant (KD) for the interactions are shown in Table 3.2. For telomeric/c-kit-1/c-kit-2/BCL2 G4s, 1 exhibited a preference for one site over another. The kinetic parameters and dissociation constants for the weaker binding sites are listed in Table 3.3. Compound 1 showed higher affinity for c-myc (6.2 nM), c-kit-1 (7.4 nM), c-kit-2 (3.8 nM), and BCL2 (17 nM) G4s than for telomeric G4 (90 nM), with a 24-fold difference in affinity between c-kit-2 and telomeric G4s. Furthermore, a low response of 1 for duplex DNA (both GC-rich and AT-rich) was observed, even at higher concentrations, thus indicating significantly high selectivity toward G4 structures (Fig. 3.2). Compared with the KD values for TMPyP4 (Scheme 3.1), measured for some G4s by SPR [8h], 1 exhibited approximately twofold and five- to tenfold higher binding affinities for telomeric G4s and c-myc/c-kit-1 G4s, respectively. Overall, 1 showed high selectivity and nanomolar affinity toward G4 structures compared with dsDNA and exhibited modest selectivity toward particular G4s.

Fig. 3.1 SPR-binding assays to evaluate the binding properties of cIKP (1). Each SPR sensorgram is fitted with an appropriate fitting model (black line). a SPR sensorgrams for interactions with telomere G4. b SPR sensorgrams for interactions with c-myc G4. c SPR sensorgrams for interactions with c-kit-1 G4. d SPR sensorgrams for interactions with c-kit-2 G4. e SPR sensorgrams for interactions with BCL2 G4

3.2.3 CD Spectra Analysis

We performed a CD analysis to investigate the binding behavior of 1 to the c-myc, c-kit-1, c-kit-2, and BCL2 G4 structures (Table 3.1). CD titration of 1 at several concentrations revealed that it could recognize and bind to G4 structures and did not cause conformational changes upon binding (Fig. 3.3). Next, the ability of 1 to induce the formation of G4 structures was evaluated in the absence of K+ (Fig. 3.4) [15]. With c-myc, c-kit-2, and BCL2 ssDNA, no significant differences were observed compared with the spectra in the presence of K+, thus suggesting that 1 induces the same G4 structures (Fig. 3.4a, c, d) [16]. However, 1 exhibited a different behavior with c-kit-1 ssDNA (e.g., the amplitude of the positive peak at ~295 nm; Fig. 3.4b). This suggested that 1 drove c-kit-1 ssDNA to form hybrid or antiparallel/parallel mixed G4s instead of the parallel structure [16].


Table 3.1 DNA oligomers used in this study

DNA              Sequence (5′–3′)
telomere         biotin-TTTAGGGTTAGGGTTAGGGTTAGGG
c-Myc            biotin-TTTTGGGGAGGGTGGGGAGGGTGGGGAAGG
c-kit 1          biotin-TTTAGGGAGGGCGCTGGGAGGAGGG
c-kit 2          biotin-TTTCGGGCGGGCGCGAGGGAGGGG
BCL2             biotin-TTTGGGCGCGGGAGGAAGGGGGCGGG
GC-rich dsDNA    biotin-GCCGCGCGCGCTTATTTTAAGCGCGCGCGGC
AT-rich dsDNA    biotin-GCCATATATATTTATTTTTAAATATATATGGC
c-Myc ssDNA      TGGGGAGGGTGGGGAGGGTGGGGAAGG
c-kit 1 ssDNA    AGGGAGGGCGCTGGGAGGAGGG
c-kit 2 ssDNA    CGGGCGGGCGCGAGGGAGGGT
BCL2 ssDNA       GGGCGCGGGAGGAATTGGGCGGG
c-Myc dsDNA      GCGGTTCTGAACTCGATAT GGGTGGGTAGGGTGGG ATTAGTGCTAGCTACGCG
                 CGCGTAGCTAGCACTAAT CCCACCCTACCCACCC ATATCGAGTTCAGAACCGC
                 CGCGTAGCTAGCACTAAT TTTTTTTTTTTTTTTT ATATCGAGTTCAGAACCGC
c-kit 1 dsDNA    GCGGTTCTGAACTCGATAT AGGGAGGGCGCTGGGAGGAGGG ATTAGTGCTAGCTACGCG
                 CGCGTAGCTAGCACTAAT CCCTCCTCCCAGCGCCCTCCCT ATATCGAGTTCAGAACCGC
                 CGCGTAGCTAGCACTAAT TTTTTTTTTTTTTTTTTTTTTT ATATCGAGTTCAGAACCGC
c-kit 2 dsDNA    GCGGTTCTGAACTCGATAT CGGGCGGGCGCGAGGGAGGGG ATTAGTGCTAGCTACGCG
                 CGCGTAGCTAGCACTAAT CCCCTCCCTCGCGCCCGCCCG ATATCGAGTTCAGAACCGC
                 CGCGTAGCTAGCACTAAT TTTTTTTTTTTTTTTTTTTTT ATATCGAGTTCAGAACCGC
BCL2 dsDNA       GCGGTTCTGAACTCGATAT GGGCGCGGGAGGAAGGGGGCGGG ATTAGTGCTAGCTACGCG
                 CGCGTAGCTAGCACTAAT CCCGCCCCCTTCCTCCCGCGCCC ATATCGAGTTCAGAACCGC
                 CGCGTAGCTAGCACTAAT TTTTTTTTTTTTTTTTTTTTTTT ATATCGAGTTCAGAACCGC

For each dsDNA entry, the three lines give the G-rich strand, its complementary strand, and the strand in which the complementary sequence of the G4-forming region is replaced by poly-T (positive control).
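Whether an oligomer contains a canonical G4-forming motif can be screened with a simple pattern search. The sketch below assumes the widely used heuristic of four runs of at least three guanines separated by loops of one to seven nucleotides; this heuristic and the helper `has_g4_motif` are illustrative assumptions and are not a method used in this chapter. The sequences are copied from Table 3.1 without the biotin label.

```python
import re

# Heuristic G4 motif: four runs of >= 3 guanines separated by loops of 1-7 nt.
G4_MOTIF = re.compile(r"G{3,}(?:[ACGT]{1,7}G{3,}){3}")


def has_g4_motif(seq):
    """Return True if the sequence contains at least one heuristic G4 motif."""
    return G4_MOTIF.search(seq) is not None


# Sequences from Table 3.1 (biotin label omitted).
oligos = {
    "telomere": "TTTAGGGTTAGGGTTAGGGTTAGGG",
    "c-kit 1 ssDNA": "AGGGAGGGCGCTGGGAGGAGGG",
    "GC-rich dsDNA": "GCCGCGCGCGCTTATTTTAAGCGCGCGCGGC",
    "AT-rich dsDNA": "GCCATATATATTTATTTTTAAATATATATGGC",
}

for name, seq in oligos.items():
    print(name, has_g4_motif(seq))
```

A pattern of this kind only flags canonical motifs; sequences with bulges or longer loops would need a more permissive pattern.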

Collectively, these results showed that 1 had the ability to form G4 structures from ssDNA.

Table 3.2 Values of the association rates (ka) and dissociation rates (kd) obtained from curve fittings of the sensorgrams, and dissociation constants (KD)

             cIKP (1)
DNA          ka [M⁻¹ s⁻¹]       kd [s⁻¹]           KD [nM]
Telomereᵃ    7.5 × 10⁵          6.7 × 10⁻²         90
c-Mycᵇ       2.0 × 10⁶          1.3 × 10⁻²         6.2
c-kit-1ᵃ     2.2 × 10⁶          1.6 × 10⁻²         7.4
c-kit-2ᵃ     3.3 × 10⁶          1.2 × 10⁻²         3.8
BCL2ᵃ        2.4 × 10⁶          4.1 × 10⁻²         17
dsDNA        Little response    Little response    Little response

ᵃ Determined by fitting with a modified heterogeneous ligand-binding model (two-site binding model)
ᵇ Determined by fitting with a 1:1 binding model with mass transfer
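The dissociation constants in Table 3.2 can be cross-checked against the rate constants through the relation K_D = k_d / k_a, and the selectivity quoted in the text follows from the ratio of two K_D values. The snippet below is a minimal sketch using three rows of the table; the dictionary and function names are hypothetical. Small differences from the reported K_D values are to be expected because the rate constants are given to two significant figures.

```python
# (ka [M^-1 s^-1], kd [s^-1], reported KD [nM]) for the stronger binding site,
# taken from the Telomere, c-kit-1, and c-kit-2 rows of Table 3.2.
table_3_2 = {
    "telomere": (7.5e5, 6.7e-2, 90.0),
    "c-kit-1": (2.2e6, 1.6e-2, 7.4),
    "c-kit-2": (3.3e6, 1.2e-2, 3.8),
}


def kd_over_ka_nM(ka, kd):
    """Equilibrium dissociation constant K_D = k_d / k_a, converted to nM."""
    return kd / ka * 1e9


for name, (ka, kd, reported) in table_3_2.items():
    print(f"{name}: KD = {kd_over_ka_nM(ka, kd):.1f} nM (reported {reported} nM)")

# Selectivity between c-kit-2 and telomeric G4s from the reported KD values.
fold = table_3_2["telomere"][2] / table_3_2["c-kit-2"][2]
print(f"telomere / c-kit-2: {fold:.1f}-fold")
```

The ratio of the reported K_D values for the telomeric and c-kit-2 G4s (90 nM and 3.8 nM) corresponds to the roughly 24-fold difference stated in the text.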

Table 3.3 Values of the association rates (ka) and dissociation rates (kd) obtained from curve fittings of the sensorgrams, and dissociation constants (KD) for the weaker binding sites

             cIKP (1)
DNA          ka [M⁻¹ s⁻¹]    kd [s⁻¹]       KD [nM]
Telomereᵃ    1.7 × 10⁴       2.3 × 10⁻³     131
c-kit-1ᵃ     3.7 × 10⁴       9.8 × 10⁻³     262
c-kit-2ᵃ     3.8 × 10⁴       1.1 × 10⁻²     291
BCL2ᵃ        4.4 × 10⁴       2.4 × 10⁻³     53

ᵃ Determined by fitting with a modified heterogeneous ligand-binding model (two-site binding model)

Fig. 3.2 SPR-binding assays to evaluate the binding properties of cIKP (1). (a, b) SPR sensorgrams for interactions with duplex DNA including GC- or AT-rich sequences, respectively

3.2.4 Induction of G4 Formation in a dsDNA Context

In living cells, genomic DNA usually exists as a double helix. We therefore examined the binding property of 1 for four G4s in promoter regions in the presence of their complementary strands. Previous pioneering work with gel electrophoresis revealed the induction of G4 formation by a small molecule with the core G4-forming sequence and its complementary strand [17]. To better reflect and understand the G4 induction effect of 1 on genomic DNA, we employed DNA that consisted of a single G4-forming sequence with two flanking duplex regions, along with its complementary DNA, and conducted the experiments under molecular crowding conditions [18]. The DNA and complementary DNA were annealed in 40% (w/v) PEG 200 solution and incubated at 37 °C for 30 min in the presence or absence of 1, then subjected to native gel electrophoresis. As positive controls, poly-T sequences (in which the sequence complementary to the G4-forming one was replaced) were used to identify the band of the G4-containing DNA on the gel (Table 3.1). G4-containing DNAs migrated behind duplex DNAs of the same sequence [18]. The migration behavior of DNAs upon the addition of 1 is shown in Fig. 3.5. With c-kit-1, c-kit-2, and BCL2 dsDNAs, G4s were minimally formed in the absence of 1; however, the migrated bands indicated G4 formation after incubation with 1, with a mobility on the gel that corresponded to that of the positive control (Fig. 3.5c–e). Note that the two bands for G4 c-kit-1 and BCL2 can be attributed to several potential G4 conformations for each sequence [19]. In contrast, TMPyP4 did not exhibit G4 induction under the same conditions in PEG solution (Fig. 3.6). Given that G4 induction did not occur in the absence of PEG solution (Fig. 3.7), 1 has the ability to induce G4 formation from stable dsDNA under molecular crowding conditions.

For c-myc dsDNA, G4 formed in the absence of 1 and, therefore, it was difficult to observe the induction effect of 1 in this dsDNA context (Fig. 3.5b). To overcome this problem, we performed the same experiment in the absence of KCl. Interestingly, even in the absence of KCl, induction of G4 formation by 1 was observed, indicating the ability of 1 to induce G4 structures in a stable dsDNA context (Fig. 3.8). However, the dose dependency (Fig. 3.8) was not significant over the measured concentration ranges. We therefore performed additional experiments at lower concentrations of 1 (Fig. 3.9); nevertheless, the dose dependency was again not significant. We are unable to provide a definitive reason for this, but we offer two possible explanations. Firstly, a ligand might remain on the G4 structure that it induced and is therefore not recruited for the next G4 induction. Secondly, once G4 structures are generated, ligands might bind non-specifically to G4s at other, weaker binding sites, or additional ligands might aggregate G4–ligand complexes; these changes would be observed as greater SPR responses at higher concentrations. These factors might suppress the ability to induce G4 formation in the presence of increased concentrations of G4s.

Fig. 3.3 CD spectra of c-Myc (a), c-kit 1 (b), c-kit 2 (c), and BCL2 (d) ssDNA (5 µM) after the addition of 1 (5–15 µM) in 50 mM K+ solution. A moderate positive peak at 262 nm and a negative peak at 240 nm are characteristic of parallel conformations (a, b, c), and positive peaks at around 264 and 290 nm and a negative peak at around 240 nm are characteristic of mixed parallel/antiparallel conformations (d)

Fig. 3.4 CD spectra of c-Myc (a), c-kit 1 (b), c-kit 2 (c), and BCL2 (d) ssDNA (5 µM) in the presence of 1 (5–15 µM) in the absence of K+

Fig. 3.5 Induction of G4 formation by 1 in the presence of KCl, as assessed by native gel electrophoresis. a Schematic representation of G4 induction by the ligand. b c-myc dsDNA, c c-kit-1 dsDNA, d c-kit-2 dsDNA, and e BCL2 dsDNA. Lane 1 is a positive control for G4 formation bands. Lane 4 shows dsDNA formation bands. Note that the two bands showing G4 formation for c-kit-1 and BCL2 are attributable to several potential G4 conformations on each sequence [19]

Fig. 3.6 No induction of G4 formation by TMPyP4 (Scheme 3.1b), as assessed by native gel electrophoresis in PEG solution; a c-Myc dsDNA, b c-kit 1 dsDNA, c c-kit 2 dsDNA, d BCL2 dsDNA. Lane 1 is a positive control (Posi. Ctrl) for G4 formation bands. Lane 2 shows dsDNA bands as negative controls


Fig. 3.7 No induction of G4 formation by 1 in the absence of PEG, as assessed by native gel electrophoresis; a c-Myc dsDNA, b c-kit 1 dsDNA, c c-kit 2 dsDNA, d BCL2 dsDNA. Lane 1 is a positive control (Posi. Ctrl) for G4 formation bands. Lane 2 shows dsDNA bands as negative controls. Concentrations of 1 are 1, 4, and 16 µM for lanes 4, 5, and 6, respectively

Fig. 3.8 Induction of G4 formation on c-Myc dsDNA by cIKP (1) in the absence of K+ in PEG solution, as assessed by native gel electrophoresis. Lane 1 is a positive control (Posi. Ctrl) for G4 formation bands. Lane 3 shows dsDNA bands as negative controls. Concentrations of 1 are 1, 2, 4, and 8 µM for lanes 4, 5, 6, and 7, respectively

3.3 Conclusion

We designed and synthesized cyclic imidazole/lysine polyamide, cIKP (1), as a new class of G4 ligand based on the high planarity of consecutive imidazole scaffolds, as hinted at by structural studies of PIP. cIKP had high selectivity and affinity toward G4 structures over dsDNA, and modest selectivity for particular G4 structures (24-fold). Interestingly, cIKP displayed the ability to induce G4 formation even in the presence of the complementary DNA. Recent studies have reported several new motifs that form stable intramolecular G4s in vitro, such as motifs with bulges, long loops, and (4n-1) guanines [20]. Given the high affinity and induction ability of cIKP toward G4s, we anticipate that it is a promising molecular probe for the detection of such new motifs or of relatively unstable G4 structures that have not been discovered in genome-wide analysis.

Fig. 3.9 Induction effect of G4 formation on c-Myc dsDNA at the lower concentration ranges of cIKP (1) in the absence of K+ in PEG solution, as assessed by native gel electrophoresis. Lane 1 is a positive control (Posi. Ctrl) for G4 formation bands. Lane 3 shows dsDNA bands as negative controls. Concentrations of 1 are 62.5, 125, 250, 500, and 1000 nM for lanes 4, 5, 6, 7, and 8, respectively

3.4 Materials and Methods

3.4.1 General

1H NMR spectra were recorded on a JNM ECA-600 spectrometer (600 MHz for 1H, 150 MHz for 13C; JEOL), with chemical shifts reported in ppm relative to residual solvent and coupling constants in hertz. Regular column chromatography was performed on Silica Gel 60 (70–230 mesh; Merck Millipore). Analytical HPLC was performed on a PU-2080 Plus series system (Jasco) with an XTerra MS C18 reversed-phase column (150 × 4.6 mm), with TFA (0.1% in water) and acetonitrile as eluent (flow rate 1.0 µL/min) and linear gradient elution: 0–100% acetonitrile over 20 or 40 min (detection at 254 nm). Collected fractions were analyzed by ESI-TOF-MS (Bruker). Reversed-phase flash column chromatography was performed on CombiFlash Rf (Teledyne Isco, Lincoln, NE) with a 4.3 g C18 RediSep Rf reversed-phase flash column with TFA (0.1% in water) and acetonitrile as the eluent (flow rate 18.0 mL/min) with linear gradient elution: 0–35% acetonitrile over 5–40 min (detection at 254 nm).


Pd/C (10 wt%) and pentafluorophenyl diphenylphosphinate (FDPP) were purchased from Sigma-Aldrich. O-(1H-6-Chlorobenzotriazol-1-yl)-1,1,3,3-tetramethyluronium hexafluorophosphate (HCTU) was purchased from Peptides International (Louisville, KY). Diphenylphosphoryl azide (DPPA), O2N–I–COCCl3, trifluoromethanesulfonic acid (TfOH), and N,N-dimethylformamide (DMF) were purchased from Wako (Osaka, Japan); (benzotriazol-1-yloxy)tripyrrolidinophosphonium hexafluorophosphate (PyBOP) was purchased from Novabiochem/Merck Millipore; BocHN-Lys(Cbz)-CO2H was purchased from Watanabe Chemical Industries (Hiroshima, Japan); diisopropylethylamine (DIEA) was purchased from Nacalai Tesque (Kyoto, Japan); trifluoroacetic acid (TFA) was purchased from Kanto Chemical Co. (Tokyo, Japan); dichloromethane (CH2Cl2) was purchased from Sasaki Chemical Co. (Kyoto, Japan). Other reagents and solvents were purchased from standard suppliers and used without further purification.

3.4.2 Synthesis of cIKP (1) Synthesis of compound 3: Pd/C (10 wt%; 90 mg) was added to a solution of compound 2 (1.01 g, 3.28 mmol; synthesized in three steps from NO2 –I–COCCl3 in overall 83% yield) [21] in AcOEt (40 mL) and MeOH (20 mL). The solution was stirred at RT under H2 (P = 0.21 MPa) for 2 h. After filtration of Pd/C, the filtrate was concentrated on an evaporator to give the amine compound (803 mg, 2.88 mmol) as a light brown powder, which was used without further purification. The amine compound (121 mg, 0.43 mmol), BocHN–Lys(Cbz)–CO2 H (182 mg, 0.49 mmol), and HCTU (198 mg, 0.48 mmol) were dissolved in dry DMF (8 mL), then DIEA (83 mL, 0.48 mmol) was added. The solution was stirred at RT for 2.5 h, then the solvent was evaporated, and the mixture was washed with water to produce the crude oil, which was used without further purification. For spectral data, it was purified by column chromatography (silica gel, hexane/AcOEt 1:10) to give 3 as a white powder (203 mg, 73% yield). 1 H NMR (600 MHz, [D6 ]DMSO): d = 10.32 (s, 1H), 9.80 (s, 1H), 7.70 (s, 1H), 7.50 (s, 1H), 7.35–7.30 (m, 5H) 7.23 (t, J = 5.6 Hz, 1H), 6.97 (d, J = 7.6 Hz, 1H), 4.99 (s, 2H), 4.10 (dd, J = 13.7, 8.9 Hz, 1H), 3.96 (s, 3H) 3.95 (s, 3H), 3.82 (s, 3H), 2.97 (t, J = 17 Hz, 2H), 1.56 (m, 2H), 1.38 (s, 9H), 1.31 (m, 2H), 1.24 ppm (m, 2H); 13C NMR ([D6]DMSO): d = 158.8, 156.1, 155.7, 155.4, 137.3, 136.2, 133.0, 131.3, 128.3, 127.7, 115.4, 114.5, 99.5, 78.0, 65.1, 59.7, 54.2, 51.8, 35.6, 35.0, 31.5, 29.1, 28.2, 22.8 ppm; ESI-MS m/z calcd for C30H40N8O8: 641.3042 [M + H]+ found: 641.3112. Synthesis of Compound 5: Solid NaOH (460 mg) was added to 3 (203 mg, 0.32 mmol) in MeOH (6 mL) and H2 O (6 mL), then the mixture was heated at 458C for 1 h. After evaporation of MeOH, the solution was acidified with HCl (6 N) to produce a white precipitate. This was collected by filtration and dried in vacuo to give the carboxylic acid (159 mg) as a white powder. 
TFA (1.5 mL) was added to a solution of the carboxylic acid (32.6 mg) in CH2Cl2 (1.5 mL). The solution was stirred at RT for 1 h, then the solvent was evaporated, and the mixture was dried in vacuo to give the TFA salt of 4 as a light brown solid. The salt of 4 and FDPP (79 mg, 0.21 mmol) were dissolved in dry DMF (8.7 mL), and DIEA (54 µL, 0.31 mmol) was added to the mixture. The solution was stirred at RT for 4 h, and then the solvent was evaporated. The crude product was dissolved in a minimum amount of CH2Cl2 and recrystallized with Et2O to give 5 as a white powder (27.4 mg, 52% for three steps). Three other condensation agents (PyBOP, HCTU, and DPPA) were also used for this dimerization reaction and gave comparable yields (40, 36, and 32%, respectively). 1H NMR (600 MHz, [D6]DMSO): δ = 10.19 (s, 2H), 9.41 (s, 2H), 8.82 (d, J = 9.7 Hz, 2H), 7.52 (s, 2H), 7.46 (s, 2H), 7.36–7.29 (m, 10H), 7.25 (t, J = 5.7 Hz, 2H), 4.99 (s, 4H), 4.59 (dd, J = 15.8, 9.6 Hz, 2H), 4.00 (s, 6H), 3.94 (s, 6H), 2.99 (t, J = 13.1 Hz, 4H), 1.85 (m, 2H), 1.75 (m, 2H), 1.49–1.42 (m, 4H), 1.38–1.31 ppm (m, 4H); 13C NMR ([D6]DMSO): δ = 168.8, 157.9, 156.1, 154.8, 137.3, 135.6, 134.7, 134.3, 133.2, 128.3, 127.7, 114.9, 113.0, 65.1, 53.7, 34.9, 32.2, 28.9, 22.8 ppm; ESI-MS m/z calcd for C48H56N16O10: 1017.4438 [M + H]+; found: 1017.4467.

Synthesis of compound 1 (cIKP): TfOH (80 µL) was slowly added to 5 (11.6 mg, 11.4 µmol) in TFA (800 µL). The solution was stirred at RT for 5 min, then poured into cold Et2O (5 mL) to produce a light brown precipitate, which was collected by centrifugation. The pellet was purified by reversed-phase flash chromatography to give 1 as a white powder (10.9 mg, 98% yield). 1H NMR (600 MHz, [D6]DMSO): δ = 10.27 (s, 2H), 9.42 (s, 2H), 8.86 (d, J = 9.6 Hz, 2H), 7.72 (br s, 4H), 7.53 (s, 2H), 7.46 (s, 2H), 4.61 (dd, J = 15.8, 9.6 Hz, 2H), 4.01 (s, 6H), 3.95 (s, 6H), 2.80 (t, J = 13.1 Hz, 4H), 1.87 (m, 2H), 1.76 (m, 2H), 1.64–1.55 (m, 4H), 1.45–1.35 ppm (m, 4H); 13C NMR ([D6]DMSO): δ = 169.3, 158.5, 155.3, 136.1, 135.2, 134.8, 133.7, 115.5, 113.6, 54.0, 39.2, 35.4, 32.6, 27.1, 23.0 ppm; ESI-MS m/z calcd for C32H44N16O6: 749.3702 [M + H]+; found: 749.3704.
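The calculated ESI-MS values quoted for compounds 3, 5, and 1 can be cross-checked with a short calculation. The following is a minimal sketch, not part of the original procedure: it sums standard monoisotopic masses for a CHNO formula and adds a proton to obtain [M + H]+. The function names are illustrative assumptions. For compound 5 the sketch uses C48H56N16O10, the composition that reproduces the quoted calculated value of 1017.4438.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.9949146196,
}
PROTON = 1.00727646688  # mass of H+ (hydrogen atom minus one electron)


def mh_plus(formula: str) -> float:
    """Monoisotopic m/z of [M + H]+ for a CHNO formula such as 'C30H40N8O8'."""
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONOISOTOPIC[element] * (int(count) if count else 1)
    return mass + PROTON


def ppm_error(found: float, calcd: float) -> float:
    """Deviation of the observed m/z from the calculated one, in ppm."""
    return (found - calcd) / calcd * 1e6


# (compound label, formula, m/z found) as reported in the text.
DATA = [
    ("3", "C30H40N8O8", 641.3112),
    ("5", "C48H56N16O10", 1017.4467),
    ("1", "C32H44N16O6", 749.3704),
]

for label, formula, found in DATA:
    calcd = mh_plus(formula)
    print(f"compound {label}: calcd {calcd:.4f}, found {found:.4f}, "
          f"error {ppm_error(found, calcd):+.1f} ppm")
```

The same helper can be reused for any intermediate whose formula contains only C, H, N, and O; other elements would need to be added to the mass table.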

3.4.3 SPR-Binding Experiments

SPR experiments were performed on a Biacore X instrument (GE Healthcare) as previously reported, with some modifications. 5′-Biotinylated DNAs (Table 3.1) were purchased from Japan Bio Services Co. Ltd. (Saitama, Japan). Biotinylated DNA was immobilized on a streptavidin-functionalized SA sensor chip. SPR measurements were carried out in degassed, filtered HBS buffer (HEPES (10 mM, pH 7.4), NaCl (150 mM), EDTA (3 mM), Surfactant P20 (0.005%)) with DMSO (0.1%) and KCl (100 mM) at 25 °C. Sample solutions over a wide range of concentrations were prepared in the buffer with DMSO (0.1%) and KCl (100 mM), and injected at a flow rate of 20 µL/min. After each cycle, the sample remaining on the DNA was removed with NaOH (50 mM)/NaCl (1 M) until the sensorgram baseline was restored. The optimized concentration range was selected for subsequent quantitative analysis. The sensorgrams were fitted with a 1:1 (single-site) or modified heterogeneous ligand (two-site) binding model with mass transfer by using BIAevaluation 4.1 (GE Healthcare) to obtain ka, kd, and KD. Both fitting models include terms that account for the mass-transfer limitation. The best-fitted curves are shown in Fig. 3.1.
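To make the relationship between ka, kd, and KD concrete, the following minimal sketch evaluates the textbook 1:1 binding model in closed form. It is an illustration only: it omits the mass-transfer term used in the fits described above, it is not the BIAevaluation implementation, and all function names and numerical values are hypothetical.

```python
import math


def kd_from_rates(ka: float, kd: float) -> float:
    """Equilibrium dissociation constant K_D = k_d / k_a, in M."""
    return kd / ka


def association_response(t: float, conc: float, ka: float, kd: float,
                         rmax: float) -> float:
    """Response after t seconds of injection at analyte concentration conc (M).

    Closed-form solution of dR/dt = ka*conc*(rmax - R) - kd*R with R(0) = 0.
    """
    k_obs = ka * conc + kd                # observed rate constant (1/s)
    r_eq = rmax * ka * conc / k_obs       # plateau response at this concentration
    return r_eq * (1.0 - math.exp(-k_obs * t))


def dissociation_response(t: float, r0: float, kd: float) -> float:
    """Response t seconds after the injection ends, starting from r0."""
    return r0 * math.exp(-kd * t)


# Hypothetical rate constants chosen only to illustrate the arithmetic.
KA = 1.0e5    # 1/(M s)
KD_RATE = 1.0e-3  # 1/s
print(f"K_D = {kd_from_rates(KA, KD_RATE):.1e} M")
for seconds in (30, 120, 600):
    r = association_response(seconds, 1.0e-8, KA, KD_RATE, rmax=100.0)
    print(f"t = {seconds:4d} s, response = {r:.1f} RU")
```

When the analyte concentration equals KD, the plateau response is half of the maximum response, which is a convenient sanity check on any fitted parameter set.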

3.4.4 CD Spectra Measurements

DNA samples were prepared in Tris-HCl buffer (10 mM, pH 7.5). Annealing was performed by heating at 95 °C for 5 min and gradually cooling to RT in the presence or absence of 1. For CD titration assays, 1 (various concentrations) was added to the annealed DNA (5 µM) in the buffer, then incubated for 30 min. CD spectra were measured from 340 to 220 nm (0.5 nm steps) on a J-805LST spectrometer (JASCO) in a 1 cm quartz cuvette (Figs. 3.3 and 3.4).

3.4.5 Native Gel Electrophoresis Analysis

DNA samples were prepared in Tris-HCl buffer (10 mM, pH 7.5) containing PEG 200 (40%, w/v) with or without KCl (50 mM). The DNA samples were heated at 95 °C for 5 min, then gradually cooled to RT, before 1 was added to the resulting DNA solution (200 nM) and incubated for 30 min at 37 °C. The samples were loaded on a 12% polyacrylamide gel containing KCl (50 mM) and electrophoresed (4 °C, 100 V, 90 min) in 1× TBE buffer containing KCl (50 mM). The gel was stained with SYBR Gold for 15 min and then imaged on an FLA-3000 Fluorescent Image Analyzer (FujiFilm, Tokyo, Japan).

References

1. (a) Gellert M, Lipsett MN, Davies DR (1962) Proc Natl Acad Sci USA 48:2013–2018; (b) Sen D, Gilbert W (1988) Nature 334:364–366; (c) da Silva MW (2007) Chem Eur J 13:9738–9745; (d) Sannohe Y, Sugiyama H (2010) Curr Protoc Nucleic Acid Chem 17.2:1–17; (e) Patel DJ, Phan AT, Kuryavyi V (2007) Nucleic Acids Res 35:7429–7455
2. (a) Huppert JL, Balasubramanian S (2005) Nucleic Acids Res 33:2908–2916; (b) Huppert JL, Balasubramanian S (2007) Nucleic Acids Res 35:406–413; (c) Zahler AM, Williamson JR, Cech TR, Prescott DM (1991) Nature 350:718–720; (d) Lam EYN, Beraldi D, Tannahill D, Balasubramanian S (2013) Nat Commun 4:1796–1811; (e) Müller S, Kumari S, Rodriguez R, Balasubramanian S (2010) Nat Chem 2:1095–1098
3. (a) Rodriguez R, Miller KM, Forment JV, Bradshaw CR, Nikan M, Britton S, Oelschlaegel T, Xhemalce B, Balasubramanian S, Jackson SP (2012) Nat Chem Biol 8:301–310; (b) Grand CL, Han H, Muñoz RM, Weitman S, Von Hoff DD, Hurley LH, Bearss DJ (2002) Mol Cancer Ther 1:565–573; (c) Law MJ, Lower KM, Voon HPJ, Hughes JR, Garrick D, Viprakasit V, Mitson M, De Gobbi M, Marra M, Morris A, Abbott A, Wilder SP, Taylor S, Santos GM, Cross J, Ayyub H, Jones S, Ragoussis J, Rhodes D, Dunham I et al. (2010) Cell 143:367–378; (d) Kim M-Y, Vankayalapati H, Shin-ya K, Wierzba K, Hurley LH (2002) J Am Chem Soc 124:2098–2099; (e) Qin Y, Hurley LH (2008) Biochimie 90:1149–1171; (f) Brooks TA, Hurley LH (2010) Genes Cancer 1:641–649; (g) Balasubramanian S, Hurley LH, Neidle S (2011) Nat Rev Drug Discovery 10:261–275
4. (a) Monchaud D, Teulade-Fichou M-P (2008) Org Biomol Chem 6:627–636; (b) Luedtke NW (2009) Chimia 63:134–139
5. (a) Shin-ya K, Wierzba K, Matsuo K-i, Ohtani T, Yamada Y, Furihata K, Hayakawa Y, Seto H (2001) J Am Chem Soc 123:1262–1263; (b) Doi T, Yoshida M, Shin-ya K, Takahashi T (2006) Org Lett 8:4165–4167
6. Agarwal T, Roy S, Chakraborty TK, Maiti S (2010) Biochemistry 49:8388–8397
7. Tera M, Ishizuka H, Takagi M, Suganuma M, Shin-ya K, Nagasawa K (2008) Angew Chem Int Ed 47:5557–5560
8. (a) Gonçalves DPN, Rodriguez R, Balasubramanian S, Sanders JKM (2006) Chem Commun 4685–4687; (b) Shirude PS, Gillies ER, Ladame S, Godde F, Shin-ya K, Huc I, Balasubramanian S (2007) J Am Chem Soc 129:11890–11891; (c) Shinohara K-i, Sannohe Y, Kaieda S, Tanaka K-i, Osuga H, Tahara H, Xu Y, Kawase T, Bando T, Sugiyama H (2010) J Am Chem Soc 132:3778–3782; (d) Jantos K, Rodriguez R, Ladame S, Shirude PS, Balasubramanian S (2006) J Am Chem Soc 128:13662–13663; (e) Zhang Q, Cui X, Lin S, Zhou J, Yuan G (2012) Org Lett 14:6126–6129; (f) Nicoludis JM, Barrett SP, Mergny J-L, Yatsunyk LA (2012) Nucleic Acids Res 40:5432–5447; (g) Kong D-M, Ma Y-E, Guo J-H, Yang W, Shen H-X (2009) Anal Chem 81:2678–2684; (h) Arora A, Maiti S (2008) J Phys Chem B 112:8151–8159; (i) Minhas GS, Pilch DS, Kerrigan JE, LaVoie EJ, Rice JE (2006) Bioorg Med Chem Lett 16:3891–3895
9. Dervan PB, Edelson BS (2003) Curr Opin Struct Biol 13:284–299
10. Han Y-W, Matsumoto T, Yokota H, Kashiwazaki G, Morinaga H, Hashiya K, Bando T, Harada Y, Sugiyama H (2012) Nucleic Acids Res 40:11510–11517
11. Cocco MJ, Hanakahi LA, Huber MD, Maizels N (2003) Nucleic Acids Res 31:2944–2951
12. Chung WJ, Heddi B, Tera M, Iida K, Nagasawa K, Phan AT (2013) J Am Chem Soc 135:13495–13501
13. (a) Lacy ER, Le NM, Price CA, Lee M, Wilson WD (2002) J Am Chem Soc 124:2153–2163; (b) Asamitsu S, Kawamoto Y, Hashiya F, Hashiya K, Yamamoto M, Kizaki S, Bando T, Sugiyama H (2014) Bioorg Med Chem 22:4646–4657
14. (a) Seenisamy J, Bashyam S, Gokhale V, Vankayalapati H, Sun D, Siddiqui-Jain A, Streiner N, Shin-ya K, White E, Wilson WD, Hurley LH (2005) J Am Chem Soc 127:2944–2959; (b) Rezler EM, Seenisamy J, Bashyam S, Kim M-Y, White E, Wilson WD, Hurley LH (2005) J Am Chem Soc 127:9439–9447; (c) Bejugam M, Sewitz S, Shirude PS, Rodriguez R, Shahid R, Balasubramanian S (2007) J Am Chem Soc 129:12926–12927; (d) Wang X-D, Ou T-M, Lu Y-J, Li Z, Xu Z, Xi C, Tan J-H, Huang S-L, An L-K, Li D, Gu L-Q, Huang Z-S (2010) J Med Chem 53:4390–4398
15. Rodriguez R, Pantos GD, Gonçalves DPN, Sanders JKM, Balasubramanian S (2007) Angew Chem Int Ed 46:5405–5407
16. Karsisiotis AI, Hessari NM, Novellino E, Spada GP, Randazzo A, da Silva MW (2011) Angew Chem Int Ed 50:10645–10648
17. Rangan A, Fedoroff OY, Hurley LH (2001) J Biol Chem 276:4640–4646
18. (a) Zheng K-w, Chen Z, Hao Y-h, Tan Z (2010) Nucleic Acids Res 38:327–338; (b) Zhang C, Liu H-h, Zheng K-w, Hao Y-h, Tan Z (2013) Nucleic Acids Res 41:7144–7152
19. (a) Rankin S, Reszka AP, Huppert J, Zloh M, Parkinson GN, Todd AK, Ladame S, Balasubramanian S, Neidle S (2005) J Am Chem Soc 127:10584–10589; (b) Phan AT, Kuryavyi V, Burge S, Neidle S, Patel DJ (2007) J Am Chem Soc 129:4386–4392; (c) Dexheimer TS, Sun D, Hurley LH (2006) J Am Chem Soc 128:5404–5415
20. (a) Chambers VS, Marsico G, Boutell JM, Di Antonio M, Smith GP, Balasubramanian S (2015) Nat Biotechnol 33:877–881; (b) Mukundan VT, Phan AT (2013) J Am Chem Soc 135:5017–5028; (c) Heddi B, Martin-Pintado N, Serimbetov Z, Kari TMA, Phan AT (2016) Nucleic Acids Res 44:910–916
21. Xiao J, Yuan G, Huang W (2000) J Org Chem 65:5506–5513

Chapter 4

Simultaneous Binding of Hybrid Molecules Constructed with Dual DNA-Binding Components to a G-Quadruplex and Its Proximal Duplex

Abstract A G-quadruplex (quadruplex) is a nucleic acid secondary structure adopted by guanine-rich sequences and is considered to be relevant to various pharmacological and biological contexts. Although many researchers have endeavored to discover and develop quadruplex-interactive molecules, poor ligand designability originating from the topological similarity of the skeletons of diverse quadruplexes has remained a bottleneck for gaining specificity for individual quadruplexes. This work reports on hybrid molecules constructed with dual DNA-binding components, a cyclic imidazole/lysine polyamide (cIKP) and a hairpin pyrrole/imidazole polyamide (hPIP), with the aim of targeting specific quadruplexes by reading out the local duplex DNA sequence adjacent to designated quadruplexes in the genome. By means of circular dichroism (CD), fluorescence resonance energy transfer (FRET), surface plasmon resonance (SPR), and NMR techniques, we showed that the hybrid molecules recognize the respective segments dually and simultaneously, and that appropriately linked binding components act synergistically to give higher binding affinity and modest sequence specificity. Monitoring the quadruplex and duplex imino protons of the quadruplex/duplex motif titrated with hybrid molecules revealed distinct binding features of the hybrid molecules toward the respective segments upon simultaneous recognition. The systematic and detailed binding assays described here show that simultaneous recognition of a quadruplex and its proximal duplex by hybrid molecules constructed with dual DNA-binding components may provide a new strategy for ligand design, enabling targeting of a large variety of designated quadruplexes at specific genome locations.
Keywords Dual DNA-binding components · G-quadruplexes · Quadruplex/duplex motif · Sequence selectivity · Simultaneous recognition

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2020. S. Asamitsu, Development of Selective DNA-Interacting Ligands, Springer Theses, https://doi.org/10.1007/978-981-15-7716-1_4

4.1 Introduction

Beyond canonical B-form DNAs, DNAs harboring characteristic sequences have been proven to adopt non-canonical, higher-order structures, such as triplex [1], G-quadruplex (quadruplex) [2], G-hairpin [3], G-triplex [4], i-motif [5], and mismatched hairpin DNAs occurring in triplet repeat expansions [6]. Specific motifs of the constituents of DNA have also attracted a great deal of attention because they are thought to be focused recognition sites for particular biomolecules or ligands [7]. Recently, quadruplex/duplex composite structures have been considered as targets of molecular recognition in chemical biology research because of their cooperative function and/or high thermal stability under physiological conditions. For example, an arginine–glycine-rich RGG peptide from the fragile X mental retardation protein (FMRP) specifically recognizes an RNA duplex–quadruplex junction, a finding that provided structural insight into the relevant diseases [8]. Parkinson and coworkers recently revealed, by X-ray crystallography, the complex 3′ interface formed between parallel quadruplexes and a duplex DNA segment constructed from the human telomeric repeat sequence [9]. They asserted that such a junction has the potential to be a unique target for molecular binding and interference with telomere-related functions. Furthermore, duplex stem-loop-containing quadruplex motifs have been extensively studied using both bioinformatic and biophysical approaches [10]. Those quadruplex/duplex motifs offer a dual-binding segment that was shown to be simultaneously targetable by two distinct molecules [10e]. For the past two decades, numerous researchers have been working on the discovery and development of quadruplex-interactive molecules [11]. Some successful attempts were made to achieve selectivity toward particular quadruplexes over other quadruplexes, mainly on the basis of the topology of quadruplex structures, namely parallel, antiparallel, or hybrid types [12].
However, despite continuous efforts, poor ligand designability originating from the topological similarity of the skeleton of diverse quadruplexes has remained a bottleneck for gaining specificity toward the individual quadruplexes classified into the same topological group. This is an obstacle to the progress of the detailed investigation of features and functions of individuals in a genome-wide context. Furthermore, evidence emerging from several recent studies that the parallel structures of quadruplex were more likely to be adopted in living cells and molecularly crowded conditions [13] has allowed us to reconsider the conventional ligand design of quadruplexes and devise a new strategy to acquire genome-wide specificity. For years, our group has studied a synthetic DNA minor-groove binder, “pyrrole–imidazole polyamide” (PIP). PIP was developed by Dervan and colleagues as a programmable DNA-binding molecule capable of targeting a large repertoire of DNA sequences [14]. Specifically, hairpin-type PIPs (hPIPs) are now widely used to exert specific gene regulatory effects in vivo and in vitro [15]. Recently, we developed a new class of G-quadruplex ligand, cyclic imidazole/lysine polyamide (cIKP), which specifically binds to G-quadruplex structures with low nanomolar affinity [16]. By combining the features of the dual DNA-binding components, it is conceivable that the readout of the local duplex sequence adjacent to a quadruplex should provide an alternative strategy toward the selective targeting of a designated quadruplex in the genome. Huang and Tan et al. have recently reported a successful attempt to uniquely visualize a particular RNA quadruplex based on the hybridization of a tail RNA sequence adjacent to a G-rich sequence by a DNA molecule connected to a


Fig. 4.1 Schematic representation of the selective targeting of a quadruplex and its proximal duplex in the nuclease hypersensitive element (NHE) III1 upstream of the c-myc P1 promoter by hybrid molecules constructed with a hairpin pyrrole–imidazole polyamide (hPIP) and a cyclic imidazole/lysine polyamide (cIKP)

quadruplex-triggered fluorescent probe [17]. In the present proof-of-concept study, we designed and synthesized several hPIP–cIKP conjugates as the first demonstration of hybrid molecules that allow for highly specific recognition of an intended quadruplex (Fig. 4.1). A series of biophysical assays were employed to characterize binding features of the dual targeting via the hybrid molecules using a set of the model quadruplex and its proximal duplex DNA substrate.

4.2 Results and Discussion

4.2.1 DNA Substrate and Molecular Design

As a model DNA substrate, we chose a quadruplex and its adjacent duplex sequence observed in the nuclease hypersensitive element (NHE) III1 of the c-myc gene. The NHE III1 region is located –142 to –115 base pairs upstream of the P1 promoter and is capable of forming nonduplex species, possibly accompanied by local unwinding or melting of the duplex structure under the influence of negative supercoiling stress (Fig. 4.1) [18]. Structural dynamics in this region have also been considered a possible key mechanism controlling c-myc transcription, and the formation of a quadruplex is likely to act as a downregulator. To simplify the biophysical binding assays in the present study, we employed artificial DNA substrates devoid of the sequence complementary to the c-myc quadruplex sequence (S1 and S2, Fig. 4.2). Simultaneous formation of both a quadruplex and a duplex on S1 and S2 was confirmed by 1D 1H-NMR. Note that G-to-T modifications at the quadruplex–duplex interface of S2 were made to facilitate the formation of a quadruplex while avoiding alternative hybridization of the shorter strand with the quadruplex-forming sequence of the longer strand. The 1H-NMR data were supported by the CD spectrum of S1, which closely matched the sum of the spectra of the individual duplex and quadruplex segments (Fig. 4.3) [10b].

We selected cIKP as the quadruplex ligand because it has been proven to have high selectivity toward quadruplex structures over double-stranded DNAs (dsDNAs) (Fig. 4.4a) [16]. Either lysine side chain can serve as the attachment point for a linker, which is advantageous because hybrid molecules can be designed and synthesized without further structural modification of cIKP. We next designed hPIPs targeted to the duplex segment of the c-myc sequence (Fig. 4.4b, c, d). (4 + 4)-Component hPIPs, which are the best studied and characterized, were selected as the first design of the local duplex sequence readers in the present study. Because hPIP binding to the two target sequences, 5′-WWGGW-3′ and 5′-WGGWG-3′, had not been fully characterized, we first designed several matching hPIPs for them. Lee et al. previously reported (3 + 3)-component hPIPs that were successfully targeted to a 5′-TGGT-3′ site, an inverted CCAAT box 2 in the topoisomerase IIα promoter

Fig. 4.2 Sequences and 1D 1H-NMR of S1 and S2


Fig. 4.3 CD spectrum of S1 (purple) and the sum of CD spectra of the hairpin duplex DNA and a c-myc quadruplex (blue) in 10 mM Tris-HCl (pH 7.5) and 4 mM KCl buffer

Fig. 4.4 Each component that was characterized toward design of the hybrid molecule in the present study. Chemical structures of a cIKP, b hPIP1, c hPIP2, and d hPIP3. The target sequences by hPIPs are shown in blue. The blue circles, red filled circles, and black diamonds represent N-methylpyrrole, N-methylimidazole, and β-alanine residues, respectively. The curved line connecting the sides of two circles represents a GABA turn. The plus sign of hPIP1 represents a dimethylaminopropylamine (Dp) residue


[19]. Inspired by this report, we designed a (4 + 4) hPIP for the 5′-WWGGW-3′ sequence by extending that design with a single Py–Py pair (hPIP1). In addition to an eight-ring hairpin polyamide (hPIP2), a β-alanine-containing hPIP (hPIP3) was designed, in which the pyrrole positioned between two imidazole residues is replaced with a flexible β-alanine to enhance the affinity for the 5′-WGGWG-3′ sequence. A methyl amide tail was selected as the C-terminal portion of hPIP2 and hPIP3 for comparable recognition of a GC base pair [20].

4.2.2 Characterization of Binding Properties to Individual Segments

We first assessed the binding properties of the parent hPIP candidates of hybrid molecules to the target duplex sequence. Three compounds, hPIP1, hPIP2, and hPIP3 (Fig. 4.4), were synthesized using automated Fmoc solid-phase synthesis as previously reported [21]. Comparison of the melting temperatures (Tm) of the duplex in the presence or absence of polyamide clearly showed that hPIP3 had the highest stabilization effect on the target duplex, as reflected in the largest positive change in Tm (ΔTm) upon the addition of hPIP3 (ΔTm = 12.5 °C) compared with hPIP1 and hPIP2 (ΔTm = 1.5 and 8.1 °C, respectively) (Fig. 4.5a). Binding stoichiometry was then examined by monitoring circular dichroism (CD) spectral changes upon titration with each compound (Fig. 4.5). Each titration generated substantial positive CD signals at around 330 nm, which are indicative of the binding of hPIPs to the minor groove of DNA [22]. Plots of the maximum molar ellipticity of the emerging signals against the equivalents of ligand to DNA showed that hPIP1 and hPIP3 probably recognized the target sequence in a 1:1 binding mode. Conversely, the binding stoichiometry of hPIP2 was estimated to be 4:1 or more, suggesting multiple binding modes to the target. On the basis of these results, we adopted hPIP1 and hPIP3 as the duplex-binding components of the hybrid molecules because the ill-defined binding modes of hPIP2 would be detrimental to the selective targeting of the c-myc quadruplex. We next characterized the potential binding site of cIKP, which was proven to be a quadruplex-specific binder. Thiazole orange (TO) displacement assays and docking analysis indicated that cIKP probably binds to a G-tetrad through end-stacking and that the two lysine side chains might both be placed in the grooves of the quadruplex (Figs. 4.6 and 4.7).
Using UV-melting assays, we confirmed that cIKP did not bind to a duplex DNA, in agreement with SPR data described in our previous report [16].
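The ΔTm comparison above rests on extracting a melting temperature from each denaturation curve. The following is a minimal sketch of one common way to do this, taking Tm as the temperature at which the first derivative of the signal is largest. It is an assumption about the analysis, not the procedure documented by the author, and the curves are synthetic two-state sigmoids with a hypothetical baseline Tm; only the 12.5 °C shift is taken from the text.

```python
import math


def two_state_curve(temps, tm, width=3.0):
    """Synthetic fraction-unfolded curve for an idealized two-state transition."""
    return [1.0 / (1.0 + math.exp(-(t - tm) / width)) for t in temps]


def melting_temperature(temps, signal):
    """Estimate Tm as the midpoint of the interval with the steepest slope."""
    best_slope, best_mid = 0.0, None
    for i in range(len(temps) - 1):
        slope = abs((signal[i + 1] - signal[i]) / (temps[i + 1] - temps[i]))
        if slope > best_slope:
            best_slope = slope
            best_mid = 0.5 * (temps[i] + temps[i + 1])
    return best_mid


def delta_tm(temps, signal_free, signal_bound):
    """Change in Tm caused by the ligand (bound minus free)."""
    return melting_temperature(temps, signal_bound) - melting_temperature(temps, signal_free)


# Temperatures from 20 to 95 degrees C in 0.5 degree steps.
temps = [20.0 + 0.5 * i for i in range(151)]
free = two_state_curve(temps, tm=55.25)           # hypothetical duplex alone
bound = two_state_curve(temps, tm=55.25 + 12.5)   # shifted by the reported hPIP3 value
print(f"Tm (free)  = {melting_temperature(temps, free):.2f} C")
print(f"Tm (bound) = {melting_temperature(temps, bound):.2f} C")
print(f"delta Tm   = {delta_tm(temps, free, bound):.2f} C")
```

Real absorbance data are noisy, so in practice the curve is usually smoothed or fitted before differentiation; the sketch skips that step for brevity.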


Fig. 4.5 Characterization of hPIP1–3 binding to the target sequences of a duplex segment. a Changes in duplex melting temperature (ΔTm) in the presence of compounds detected by absorbance at 260 nm. Error bars represent the standard deviation of three independent trials. CD spectra of hairpin duplex DNA (5′-GAAGGTGCGTTTTCGCACCTTC-3′) titrated with b hPIP1, c hPIP2, d hPIP3, and plots of maximum molar ellipticity of the emerging signals against the equivalents of each hPIP to DNA from which the binding stoichiometry can be estimated
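The stoichiometry plots described in this caption are mole-ratio plots, in which the signal rises with added ligand and then levels off. As a minimal sketch of how a break point could be read from such a plot, the code below fits a straight line to the initial rise, averages the plateau, and reports where the two intersect. The data are synthetic and the choice of which points belong to the rise and to the plateau is a hypothetical input, not something taken from the figure.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line through the points; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def stoichiometry_breakpoint(equiv, signal, n_rise, n_plateau):
    """Equivalents of ligand at which the initial rise meets the plateau.

    n_rise    -- number of leading points treated as the linear rise
    n_plateau -- number of trailing points averaged as the plateau
    """
    slope, intercept = fit_line(equiv[:n_rise], signal[:n_rise])
    plateau = sum(signal[-n_plateau:]) / n_plateau
    return (plateau - intercept) / slope


# Synthetic titration that saturates at one equivalent of ligand.
equiv = [0.0, 0.25, 0.5, 0.75, 1.0, 1.5, 2.0, 3.0]
signal = [8.0 * min(e, 1.0) for e in equiv]
print(stoichiometry_breakpoint(equiv, signal, n_rise=4, n_plateau=3))
```

A break point near 1 corresponds to 1:1 binding, whereas a curve that keeps rising past several equivalents, as reported for hPIP2, gives no clean intersection.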

4.2.3 Design and Synthesis of Hybrid Molecules

On the basis of the structural insights into the substrate and the binding propensity toward each segment, we designed and synthesized hybrid molecules constructed with hPIP1 or hPIP3 and cIKP, connected through linkers of different lengths, aiming for selective targeting of the c-myc quadruplex (hybrids 1–4, Fig. 4.8). Synthetic schemes for hybrids 1–4 are shown in Schemes 4.1 and 4.2. Hairpin polyamides that had been synthesized on a β-alanine Wang resin by Fmoc solid-phase synthesis [21] were cleaved with 2,2′-(ethylenedioxy)bis(ethylamine) or 4,7,10-trioxa-1,13-tridecanediamine to afford polyamides 11 and 12, respectively. These polyamides were next functionalized with N-succinimidyl iodoacetate to produce iodoacetate-containing polyamides. cIKP, which had been synthesized in solution phase according to our reported method [16], was coupled with the iodoacetate moiety to afford hybrids 1 and 2. For hybrids 3 and 4, hPIP3 was coupled with the carboxylic acid moiety of Boc-protected linkers using HCTU/DIEA in DMF, then the Boc groups were removed with TFA/DCM. The amine-containing polyamides were functionalized with N-succinimidyl iodoacetate


Fig. 4.6 Thiazole orange (TO) fluorescence spectra upon the titration with a cIKP and b TMPyP4 in the presence of c-myc G-quadruplex. These spectra for cIKP displayed a similar transition behavior to those for TMPyP4

Fig. 4.7 Lowest-energy model of cIKP/c-myc G-quadruplex complexes by docking analysis; a Overview. b Interaction site. Imidazole rings and a G-tetrad are shown in red and purple, respectively


Fig. 4.8 Designed hybrid molecules (hybrids 1−4). a Hybrids 1 and 2 were constructed with hPIP1 and cIKP connected through a shorter and longer linker, respectively. b Hybrids 3 and 4 were constructed with hPIP3 and cIKP connected through a shorter and longer linker, respectively. The detailed synthetic methods are shown in Schemes 4.1 and 4.2

to produce iodoacetate-containing polyamides. cIKP was then conjugated to these polyamides, in the same way, to afford hybrids 3 and 4.

4.2.4 Recognition of Quadruplex by Hybrid Molecules

To investigate whether the hybrid molecules retain a stabilization effect on the c-myc quadruplex comparable to that of the quadruplex-binding moiety, cIKP, we


Scheme 4.1 Synthetic schemes of hybrids 1 and 2

Scheme 4.2 Synthetic schemes of hybrids 3 and 4


measured thermal denaturing profiles of the quadruplex segment using S1 in the presence or absence of hybrids 1–4. Comparison of the CD spectra of S1 and the duplex segment suggested that molar ellipticities recorded at 263–268 nm could be used to observe quadruplex denaturation without interference from the duplex segment (Fig. 4.9a) [10b]. After examining the temperature dependence of the CD spectra of the hairpin duplex segment (Fig. 4.9b), we found that the molar ellipticity at 267 nm did not change over the range of 20–94 °C, and this wavelength was therefore selected to measure the parallel quadruplex melting profiles of S1 (Fig. 4.10). In the absence of compound, the molar ellipticity at 267 nm decreased almost linearly as the temperature increased, and the melting curve showed no transition clear enough to yield a melting temperature. Upon the addition of cIKP, a slower linear decrease with increasing temperature was observed. Although we could not obtain a

Fig. 4.9 a CD spectra of S1 (purple) and the hairpin duplex DNA (green). Gray region represents a range of 263–268 nm, which are not likely to be influenced by the presence of the duplex segment. b Transition of molar ellipticities of the hairpin duplex DNA recorded at 263–268 nm with an increase in temperature

Fig. 4.10 Quadruplex denaturing profiles of S1 in the presence or absence of the compounds detected by absorption at 267 nm: a DMSO (blue filled triangles), cIKP (orange filled squares), hybrid 1 (gray diamonds), hybrid 2 (yellow circles); b DMSO (blue filled triangles), cIKP (orange filled squares), hybrid 3 (gray diamonds), and hybrid 4 (yellow circles)


clear transition profile, we found that 68% of the quadruplex species remained even at 90 °C (average over 86–94 °C) relative to the state at 20 °C. To allow a comparison between the stabilization effects of cIKP and hybrids 1–4 on the quadruplex segment of S1, we decided to use the ratio of quadruplex species at 90 °C instead of determining melting temperatures. Quadruplex melting profiles of S1 in the presence of hybrids 1–4 are shown in Fig. 4.10. The ratios of quadruplex species at 90 °C were calculated to be 51%, 56%, 60%, and 40% for hybrids 1, 2, 3, and 4, respectively. The finding that hybrids 1–4 exhibited a stabilization effect comparable to that of cIKP indicated that the cIKP moiety included in hybrids 1–4 was able to recognize the quadruplex segment of S1 as cIKP itself did.
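The ratio used above in place of a melting temperature can be sketched as a short calculation: average the 267 nm signal over 86–94 °C and divide by the signal at 20 °C. The code below is an illustrative assumption about how such a ratio could be computed, with a synthetic linear profile constructed so that the result equals the 68% reported for cIKP. The function name and the data are hypothetical.

```python
def fraction_remaining(temps, signal, ref_temp=20.0, window=(86.0, 94.0)):
    """Mean signal inside the high-temperature window divided by the reference signal."""
    # Use the measured point closest to the reference temperature.
    ref_index = min(range(len(temps)), key=lambda i: abs(temps[i] - ref_temp))
    high = [s for t, s in zip(temps, signal) if window[0] <= t <= window[1]]
    if not high:
        raise ValueError("no data points inside the high-temperature window")
    return (sum(high) / len(high)) / signal[ref_index]


# Synthetic, linearly decreasing profile sampled every 2 degrees C from 20 to 94.
temps = [float(t) for t in range(20, 96, 2)]
signal = [100.0 - (32.0 / 70.0) * (t - 20.0) for t in temps]
print(f"{fraction_remaining(temps, signal):.2f}")
```

Averaging over a window rather than reading a single point reduces the influence of noise at high temperature, which is presumably why the text quotes the 86–94 °C average.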

4.2.5 Dual Recognition of a Quadruplex/Duplex by a Single Hybrid Compound

To confirm proper recognition of the duplex segment by the hybrid molecules, we performed a CD titration assay using S1 in the same manner as for the hPIPs. For hybrids 1–3, a 1:1 binding stoichiometry was estimated from plots of the CD signal at around 330 nm against the molar ratio of hybrid to DNA (Fig. 4.11a–d). By contrast, hybrid 4 appeared to bind with a stoichiometry greater than 1:1, suggesting additional binding modes. To gain further insight into the binding behavior of the conjugates toward the duplex region within the complex structure, we devised and performed FRET melting assays in the presence or absence of compounds (Fig. 4.11e) [23]. The short strand and the long strand of S2 were labeled at the 5′-end with fluorescein (FAM) and at the 3′-end with tetramethylrhodamine (TAMRA), respectively. Monitoring FAM fluorescence with increasing temperature reports the melting temperature of the duplex within the complex structure. The denaturing profiles of the duplex region of the dual-labeled S2 are shown in Fig. 4.12. The duplex melting temperature of the dual-labeled S2 in the absence of compounds was 25.6 °C. The increases in Tm in the presence of hybrids 1–3 (ΔTm = 10.9, 11.5, and 19.8 °C, respectively) were larger than those observed when hPIP1/cIKP (the corresponding components of hybrids 1 and 2) or hPIP3/cIKP (the corresponding components of hybrid 3) were added to the system (ΔTm = 2.4 and 16.2 °C, respectively) (Fig. 4.11f). Considering that the quadruplex structure persisted at these temperatures, these results indicate that binding of the cIKP moiety to the quadruplex segment acts synergistically to increase the thermal stabilization of the duplex region of S2. Conversely, the melting profile in the presence of hybrid 4 showed a two-stage transition.
This transition may result from the presence of the duplex species that is unrecognized by hybrid 4, which was reflected in a more than 1:1 binding stoichiometry on a CD titration assay (Fig. 4.11d). Competitive assays were conducted to test the specificity. The presence of an excess amount of telomere quadruplex DNA or matched hairpin duplex DNA (20 equivalents) as competitors resulted in minimal attenuation of T m values (Figs. 4.11g


Fig. 4.11 Dual and simultaneous recognition of both quadruplex and duplex segments by a single hybrid compound. a–d CD spectra of S1 titrated with hybrids 1−4, and plots of maximum molar ellipticity of the emerging signals against the equivalents of each hybrid to DNA from which the binding stoichiometry can be estimated. e–g FRET melting-point experiment performed using a dual-labeled S2 with monitoring FAM fluorescence; e schematic representation of the devised duplex FRET melting system. f ΔTm values of duplex segments of S2 with cIKP + hPIP1, cIKP + hPIP3, and hybrids 1−4 relative to DMSO. g Competitive FRET melting-point assays with hybrids 1−3 and telomere sequence or matched duplex DNA as competitors. Note that the melting-point temperature with hybrid 4 was not obtained because the denaturation profile showed a two-stage transition

and 4.13), indicating a high specificity toward the c-myc quadruplex. The remarkable preference for S2 over the matched hairpin duplex DNA can be explained by binding of the cIKP moiety to the quadruplex segment offered by S2. We also confirmed that, while maintaining quadruplex binding, the hPIP moieties of hybrids 1–3 recognize the duplex sequence properly, as indicated by a much higher stabilization of the duplex segment of S2 than that of a fully mismatched quadruplex/duplex substrate (Fig. 4.14). We next employed SPR-binding assays to further understand the quantitative binding features of the hybrid molecules toward the quadruplex/duplex motifs [16, 24]. We prepared three sensor chips onto which 5′-biotin-labeled target complexes containing matched, 1 bp mismatched, or 2 bp mismatched sequences in the duplex region (S3, S4, and S5, respectively) had been immobilized, so as to test the sequence specificity of the hPIP moieties of the hybrid molecules. Association/dissociation dynamics of


Fig. 4.12 a Thermal denaturing profiles of a dual fluorescence-labeled S2 with monitoring FAM fluorescence in the presence of cIKP + hPIP1, hybrid 1, or hybrid 2. b Thermal denaturing profiles of the dual fluorescence-labeled S2 with monitoring FAM fluorescence in the presence of cIKP + hPIP3 or hybrid 3

Fig. 4.13 a–c Thermal denaturing profiles of a dual fluorescence-labeled S2 with monitoring FAM fluorescence in the presence of hybrids 1−3 with 1, 5, and 20 equivalents of the telomere DNA as a competitor. d–f Thermal denaturing profiles of the dual fluorescence-labeled S2 with monitoring FAM fluorescence in the presence of hybrids 1−3 with 1, 5, and 20 equivalents of the hairpin duplex DNA as a competitor

the interaction of optimized concentrations of hybrids 1–4 with DNAs were visualized as sensorgrams (Fig. 4.15). The dissociation constants (KD) of the compounds for S3, S4, or S5 were determined from the best-fitting curves, corresponding to the lower χ² values (Table 4.1). Both hybrids 3 and 4 showed higher binding affinities toward S3 (match) than toward S4 (1 bp mismatch), whereas their binding affinities toward S5 (2 bp mismatch) were almost the same as those for S4. A similar tendency was observed for hybrids 1 and 2, showing the best binding


Fig. 4.14 a Sequence of a fully mismatched quadruplex/duplex substrate labeled with a TAMRA and a FAM. b Thermal denaturing profiles of the fully mismatched substrate with monitoring FAM fluorescence in the presence of hybrids 1−3. c Comparison between Tm values of duplex segments of S2 and a fully mismatched substrate with hybrids 1−3

affinity toward S3, followed by S4 and S5. The small but significant differences in binding affinity between the match and mismatch sequences suggest that these hybrid molecules bind and correctly read the predefined DNA sequences according to the conventional binding rules advocated by Dervan and coworkers. Hybrids 2 and 3 displayed higher binding affinities than that of cIKP for the c-myc quadruplex (KD = 6.2 nM) [16], likely because the additional enthalpy gained from proper recognition of the dual binding sites outweighs the entropic penalty incurred by covalent linkage of the hPIP moieties. Finally, to capture the distinct binding features of the two types of hybrid (hybrids 1 and 2 versus hybrids 3 and 4) upon dual and simultaneous recognition of the quadruplex/duplex motif, the imino protons of the quadruplex and duplex segments of S1 titrated with hybrid 2 or 3 were monitored by 1D ¹H-NMR spectroscopy (Fig. 4.16). Both the duplex and quadruplex imino proton regions were considerably perturbed by the addition of the compounds. Titration with substoichiometric amounts of hybrid 3 generated two sets of resonances, one appearing and one disappearing, in the duplex and quadruplex imino proton regions; these derive from the complexed and free DNA, respectively, and indicate slow exchange of each component with its respective segment. The addition of hybrid 2 also produced slow-exchange behavior in the duplex imino proton region, whereas it broadened the original quadruplex imino proton resonances, suggestive of rapid dissociation from the quadruplex segment. Given that the exchanged resonances in the quadruplex and duplex imino proton regions were both


Fig. 4.15 a Biotinylated DNA oligomers that were immobilized onto streptavidin-functionalized SA sensor chips. Nucleotides in italics represent mutations that were made to prepare the 1 or 2 bp mismatched sequences for the hPIP moieties. b SPR sensorgrams for the interaction of the hybrids with the match, 1 bp mismatch, and 2 bp mismatch substrates (S3, S4, and S5). Black lines represent the best fits obtained using a modified heterogeneous binding model. Concentrations are as follows: (i) 250 (ii) 125 (iii) 62.5 (iv) 31.25 nM for the hybrid 1/S5, hybrid 2/S3, hybrid 2/S4, hybrid 2/S5, hybrid 4/S4, and hybrid 4/S5 sensorgrams; (i) 187.5 (ii) 93.8 (iii) 46.9 (iv) 23.4 nM for the hybrid 1/S4 and hybrid 3/S3 sensorgrams; (i) 125 (ii) 62.5 (iii) 31.25 (iv) 15.63 nM for the hybrid 1/S3, hybrid 3/S4, hybrid 3/S5, and hybrid 4/S3 sensorgrams

saturated after the addition of 1.0 and 1.5 equivalents of hybrid 3, respectively, hybrid 3 appears to form a better-defined complex, as intended.
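The affinity comparison above can be placed on an energy scale by converting each dissociation constant to a standard binding free energy, ΔG° = RT ln(KD / 1 M). The following is a minimal sketch, assuming T = 298.15 K (the SPR measurements were run at 25 °C) and a 1 M standard state; the function name is illustrative and not from the original work, and the cIKP value refers to the c-myc quadruplex alone [16] whereas the hybrid 3 value refers to S3, so the difference is only indicative.

```python
import math

R = 8.314462618  # gas constant, J mol^-1 K^-1

def binding_free_energy_kj(kd_molar, temp_k=298.15):
    """Standard binding free energy (kJ/mol) from a dissociation constant in M.

    Uses dG = R*T*ln(K_D / 1 M); a smaller K_D gives a more negative value.
    """
    return R * temp_k * math.log(kd_molar) / 1000.0

dg_cikp = binding_free_energy_kj(6.2e-9)  # cIKP with the c-myc quadruplex [16]
dg_hyb3 = binding_free_energy_kj(3.7e-9)  # hybrid 3 with S3 (Table 4.1)

print(f"cIKP:     {dg_cikp:.1f} kJ/mol")
print(f"hybrid 3: {dg_hyb3:.1f} kJ/mol")
print(f"difference: {dg_hyb3 - dg_cikp:.2f} kJ/mol")
```

On this scale the gain of hybrid 3 over cIKP is on the order of 1 kJ/mol, which is consistent with the text's picture of a favorable enthalpic contribution that is largely offset by an entropic cost of linkage.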

4.3 Conclusion

We designed, synthesized, and characterized the hybrid molecules constructed with dual DNA-binding components as the first demonstration of the feasibility of specific


Table 4.1 Dissociation constants (KD / 10⁻⁹ M) obtained by best fitting with a model for heterogeneous binding sites, including equations that reflect mass transfer limitation effects, for the interaction of hybrids 1–4 with the match (S3), 1 bp mismatch (S4), and 2 bp mismatch (S5) substrates

            S3 (match)     S4 (1 bp mismatch)     S5 (2 bp mismatch)
Hybrid 1    14 (± 0.7)     18 (± 5.7)             54 (± 1.0)
Hybrid 2    4.2 (± 0.2)    11 (± 3.3)             19 (± 2.1)
Hybrid 3    3.7 (± 0.7)    17 (± 7.8)             13 (± 1.4)
Hybrid 4    6.6 (± 1.8)    19 (± 1.4)             14 (± 4.2)

The standard deviation of two independent experiments is indicated in parentheses
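The sequence specificity discussed in the text can be read off Table 4.1 as the ratio KD(mismatch) / KD(match). The sketch below simply tabulates these ratios from the mean values in the table; the dictionary layout and function name are illustrative, and the ratios ignore the reported standard deviations, so they should be taken as rough indicators only.

```python
# Mean K_D values in nM, copied from Table 4.1
KD_NM = {
    "hybrid 1": {"S3": 14.0, "S4": 18.0, "S5": 54.0},
    "hybrid 2": {"S3": 4.2, "S4": 11.0, "S5": 19.0},
    "hybrid 3": {"S3": 3.7, "S4": 17.0, "S5": 13.0},
    "hybrid 4": {"S3": 6.6, "S4": 19.0, "S5": 14.0},
}

def fold_selectivity(kd):
    """Return K_D(mismatch) / K_D(match) for the 1 bp and 2 bp mismatch substrates."""
    return {"S4/S3": kd["S4"] / kd["S3"], "S5/S3": kd["S5"] / kd["S3"]}

for name, kd in KD_NM.items():
    ratios = fold_selectivity(kd)
    print(name, {key: round(value, 1) for key, value in ratios.items()})
```

All ratios are greater than 1 and below about 5, which matches the description of the specificities as modest.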

Fig. 4.16 1D ¹H-NMR spectra of both duplex and quadruplex imino proton regions of S1 titrated with a hybrid 2 and b hybrid 3. The black and blue asterisks represent appearing and diminishing/broadening resonances, respectively

quadruplex targeting at the genomic level. Our strategy for gaining specificity was to exploit the local duplex DNA sequence adjacent to a selected quadruplex, which offers diverse sequence information that can be distinguished by readout molecules (hPIP) covalently linked to a quadruplex ligand (cIKP). CD/FRET melting, quantitative SPR binding, and CD/NMR titration assays revealed that these hybrid molecules simultaneously and synergistically recognized both quadruplex and duplex segments with high binding affinities and modest sequence specificities. Among them, hybrid 3 forms a better-defined complex, as indicated by the slow-exchange behavior of both quadruplex and duplex imino proton resonances. Detailed structural determination and the biological activities of these complex-specific hybrid molecules merit further investigation. The molecular design concept described in the present study may in theory be applicable to selective targeting of a broad variety of designated quadruplexes in the genome, such as the G-rich sequences in the promoters of the c-KIT [25], BCL2 [26], and KRAS [27] genes, and specific duplex stem-loop-containing


quadruplex motifs as observed in the human genome [7e, 10d], by careful design and linkage of the readout molecules.

4.4 Materials and Methods

4.4.1 General

Concentrations of all the polyamides and hybrids were determined by weighing the compounds (2–5 mg) on an analytical microbalance (Mettler Toledo International Inc.). The materials, which were assumed to exist as TFA salts, were dissolved in the exact volume of DMSO (99.5% purity, Nacalai Tesque, Inc.) required to yield a 20 mM master solution and stored at –30 °C [28]. Concentration series for each assay were prepared by serial dilution with single-channel pipettes (NEXTY). After solid-phase peptide synthesis, all of the reactions were tracked by analytical high-performance liquid chromatography (HPLC) on a PU-2089 plus series system (JASCO) using a COSMOSIL 150 × 4.6 mm 5C18-MS-II packed column (Nacalai Tesque, Inc.) in 0.1% TFA in water with acetonitrile as the eluent at a flow rate of 1.0 mL/min and a linear gradient elution of 0–75% acetonitrile in 30 min with detection at 254 nm. Collected fractions were analyzed by MALDI-TOF-MS (Bruker). All reported mass values were calibrated with a standard sample (m/z: 1521.9719 or m/z: 2121.9335). HPLC purification of compounds was carried out using a CHEMCOBOND 150 × 4.6 mm 5-ODS-H column (Chemco Plus Scientific Co., Ltd.) in 0.1% TFA in water with acetonitrile as the eluent at a flow rate of 1.0 mL/min and a linear gradient elution of ca. 25–50% acetonitrile in 20 min with detection at 254 nm. Oligonucleotides were purchased from Integrated DNA Technologies, Inc. or Sigma-Aldrich Co. LLC. Strand concentrations were quantified from the molar extinction coefficients at 260 nm by UV-Vis spectrum measurement at 95 °C on a V-650 spectrophotometer (JASCO).
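The two quantification steps described above are simple unit conversions, sketched below. This is a minimal illustration under stated assumptions: the molecular weight passed in must be that of the salt form actually weighed, and the numerical inputs in the usage lines (3.0 mg, 1500 g/mol, A260 = 0.5, ε = 200,000 M⁻¹ cm⁻¹) are hypothetical values, not data from this work. Function names are illustrative.

```python
def dmso_volume_ul(mass_mg, mol_weight_g_per_mol, target_mm=20.0):
    """Volume of DMSO (µL) needed to dissolve mass_mg of compound at target_mm (mM)."""
    micromol = mass_mg / mol_weight_g_per_mol * 1000.0  # mg / (g/mol) = mmol, then µmol
    return micromol / target_mm * 1000.0                # µmol / (µmol/mL) = mL, then µL

def strand_conc_um(a260, epsilon_per_m_cm, path_cm=1.0):
    """Strand concentration (µM) from the Beer-Lambert law, c = A / (epsilon * l)."""
    return a260 / (epsilon_per_m_cm * path_cm) * 1e6

# Hypothetical inputs, for illustration only
print(dmso_volume_ul(3.0, 1500.0))    # volume for a 20 mM master solution
print(strand_conc_um(0.5, 200000.0))  # concentration in µM
```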

4.4.2 Synthesis of hPIP1–3 and Hybrids 1–4

General procedures of Fmoc solid-phase synthesis of polyamide [29]

Synthesis of each polyamide was performed on a PSSM-8 (Shimadzu) computer-assisted operation system on a 0.02–0.04 mmol scale. The Fmoc building block for each step was dissolved in NMP on the synthetic line. The iterative polyamide assembly was as follows: deblocking twice for 4 min with 20% piperidine/NMP (0.6 mL), activating for 2 min with HCTU (88 mg, 0.21 mmol) in NMP (1 mL) and 10% v/v DIEA/NMP (0.4 mL), coupling for 60 min, and washing with DMF (0.6 mL × 5). All coupling steps were carried out with a single-coupling cycle. The


Fmoc building blocks used in this study were FmocHN–P–CO2H (77 mg), FmocHN–I–CO2H (77 mg), FmocHN–PI–CO2H (70 mg), Fmoc-β–CO2H (66 mg), Fmoc-γ–CO2H (69 mg), and Fmoc-D-Dab–CO2H (93 mg). In the final capping step, the samples were washed with 20% acetic acid in NMP (1 mL). All lines were purged during solution transfers, and the resin was agitated by bubbling with N2 gas. After completion of the assembly, the resin was washed with DMF (2 mL) and methanol (2 mL), then dried in a desiccator at room temperature in vacuo.

Synthesis of parent polyamides (hPIP1–3)

hPIP1. On 82 mg of Fmoc-P-Oxime resin (0.39 mmol/g), AcPPII-γ-PPPP-resin was synthesized by the iterative reaction described above using Fmoc-P-CO2H, Fmoc-I-CO2H, Fmoc-PI-CO2H, and Fmoc-γ-CO2H. The resulting resin was cleaved with N,N-dimethyl propanediamine (750 μL) for 3 h at 55 °C, then a portion of the material released from the resin was poured directly into diethyl ether (35 mL) for precipitation. A portion of the precipitates (14.1 mg) was purified by reverse-phase HPLC to give hPIP1 (3.2 mg, 2.65 μmol, 27% yield from resin loading). Analytical HPLC: tR = 16.7 min. MALDI-TOF-MS m/z calcd for C57H70N21O10+ [M + H]+ 1208.561 found 1208.570; [M + Na]+ 1230.543 found 1230.532; [M + K]+ 1246.517 found 1246.532.

hPIP2. On 82 mg of Fmoc-PI-CLEAR acid resin (0.45 mmol/g), AcPPPP-D-Dab-IIPI-resin was synthesized by the iterative reaction described above using Fmoc-P-CO2H, Fmoc-I-CO2H, and Fmoc-D-Dab-CO2H. The resulting resin was cleaved with 2 M MeNH2 in THF (750 μL) for 3 h at 55 °C, then a portion of the material released from the resin was poured directly into diethyl ether (35 mL) for precipitation. The dried precipitates were dissolved in TFA/DCM (1.3/3.5 mL), and the solution was stirred at room temperature for 20 min to remove the Boc group. The resulting polyamide was purified by reverse-phase HPLC to give hPIP2 (5.6 mg, 4.42 μmol, 12% yield from resin loading). Analytical HPLC: tR = 16.0 min. MALDI-TOF-MS m/z calcd for C52H61N22O10+ [M + H]+ 1153.494 found 1153.544; [M + Na]+ 1175.476 found 1175.504; [M + K]+ 1191.449 found 1191.495.

hPIP3. On 84 mg of Fmoc-PI-CLEAR acid resin (0.24 mmol/g), AcPPPP-D-Dab-IIβI-resin was synthesized by the iterative reaction described above using Fmoc-P-CO2H, Fmoc-I-CO2H, Fmoc-β-CO2H, and Fmoc-D-Dab-CO2H. A subsequent procedure similar to that used for the synthesis of hPIP2 afforded hPIP3 (17.6 mg, 14.5 μmol, 73% yield from resin loading). Analytical HPLC: tR = 15.3 min. MALDI-TOF-MS m/z calcd for C49H60N21O10+ [M + H]+ 1102.483 found 1102.465; [M + Na]+ 1124.465 found 1124.528; [M + K]+ 1141.246 found 1140.465.

Synthesis of hybrid molecules (hybrids 1–4)

Hybrids 1 and 2. On 80 mg of Fmoc-β-Ala-Wang resin (0.55 mmol/g), AcPPII-γ-PPPPβ-resin was synthesized by the iterative reaction described above using Fmoc-P-CO2H, Fmoc-I-CO2H, Fmoc-PI-CO2H, and Fmoc-γ-CO2H. The resulting resin was cleaved with 2,2′-(ethylenedioxy)bis(ethylamine) (750 μL) for 3 h at 55 °C, then a portion of the material released from the resin was poured directly into diethyl ether (35 mL) for precipitation. The precipitates were purified by reverse-phase flash chromatography to give polyamide 11 (7.9 mg, 5.49 μmol, 12% yield from resin loading). Analytical HPLC: tR = 16.4 min. MALDI-TOF-MS m/z calcd for


C61H77N22O13+ [M + H]+ 1325.604 found 1325.502; [M + Na]+ 1347.585 found 1347.470; [M + K]+ 1363.559 found 1363.531. In the same way, polyamide 12 was obtained using 4,7,10-trioxa-1,13-tridecanediamine instead of 2,2′-(ethylenedioxy)bis(ethylamine) (13.8 mg, 9.13 μmol, 22% yield from resin loading). Analytical HPLC: tR = 16.7 min. MALDI-TOF-MS m/z calcd for C65H85N22O14+ [M + H]+ 1397.661 found 1397.641; [M + Na]+ 1419.643 found 1419.673; [M + K]+ 1435.617 found 1435.681.

To a solution of polyamide 11 (7.9 mg, 5.49 μmol) in dry DMF (500 μL) and DIEA (4 μL, 22.9 μmol) was added N-succinimidyl iodoacetate (1.9 mg, 6.71 μmol). The mixture was stirred at room temperature for 1 h, and then the solvent was evaporated. The resulting residue was dissolved in a minimum amount of DCM/MeOH and poured into diethyl ether (9 mL) to produce off-white precipitates, which were used for the next step without further purification. Cyclic imidazole/lysine polyamide (cIKP, 10.1 mg, 10.3 μmol), which had been synthesized by our previously reported method, was dissolved in dry DMF (280 μL) and DIEA (7.2 μL, 41.3 μmol). After stirring for 10 min, a solution of the crude polyamide in dry DMF (120 μL) was added, and the mixture was stirred at 40 °C for 6 h. Evaporation afforded the crude title compound, which was purified by reverse-phase HPLC to give hybrid 1 (3.7 mg, 1.58 μmol, 29% yield over two steps). Analytical HPLC: tR = 17.3 min. MALDI-TOF-MS m/z calcd for C95H121N38O20+ [M + H]+ 2114.965 found 2114.930; [M + Na]+ 2136.947 found 2136.989. In the same way, hybrid 2 was obtained from polyamide 12 (3.5 mg, 1.45 μmol, 30% yield over two steps). Analytical HPLC: tR = 17.8 min. MALDI-TOF-MS m/z calcd for C99H129N38O21+ [M + H]+ 2186.0189 found 2186.221; [M + Na]+ 2208.001 found 2208.267; [M + K]+ 2223.9748 found 2224.049.

Hybrids 3 and 4. On 81 mg of Fmoc-I-CLEAR acid resin (0.24 mmol/g), AcPPPP-γ-IIβI-resin was synthesized by the iterative reaction described above using Fmoc-P-CO2H, Fmoc-I-CO2H, Fmoc-β-CO2H, and Fmoc-γ-CO2H. Treatment with 2 M MeNH2 in THF and TFA/DCM and subsequent HPLC purification afforded hPIP3 as described above. To a solution of hPIP3 (16.7 mg, 13.7 μmol) in dry DMF (1 mL) was added a solution of t-Boc-N-amido-PEG3-acid (9.0 mg, 28.0 μmol) and HCTU (12.4 mg, 30.0 μmol) in dry DMF (300 μL) and DIEA (10 μL, 57.4 μmol), which had been stirred at room temperature for 10 min. The mixture was stirred at room temperature for 4.5 h, and then the solvent was evaporated. The resulting residue was dissolved in a minimum amount of DCM/MeOH and poured into diethyl ether (7 mL) to produce off-white precipitates, which were used for the next step without further purification. The dried precipitates were dissolved in TFA/DCM (1.3/3.5 mL), and the solution was stirred at room temperature for 20 min to remove the Boc group. After evaporation and the diethyl ether precipitation described above, the crude amine-containing polyamide was dissolved in dry DMF (1.5 mL) and DIEA (10 μL, 57.4 μmol). To the mixture was added N-succinimidyl iodoacetate (21.9 mg, 77.4 μmol), and the mixture was stirred at room temperature for 30 min. Evaporation afforded the crude iodoacetate-containing polyamide, which was purified by reverse-phase HPLC to give the pure polyamide 13 (10.0 mg, 6.79 μmol, 50% yield over three steps). Analytical HPLC: tR


= 17.5 min. MALDI-TOF-MS m/z calcd for C60H78IN22O15+ [M + H]+ 1473.5056 found 1473.445; [M + Na]+ 1495.488 found 1495.376; [M + K]+ 1511.462 found 1511.396. In the same way, polyamide 14 was obtained using N-Boc-N′-succinyl-4,7,10-trioxa-1,13-tridecanediamine instead of t-Boc-N-amido-PEG3-acid (9.3 mg, 5.91 μmol, 47% yield over three steps). Analytical HPLC: tR = 17.8 min. MALDI-TOF-MS m/z calcd for C65H87IN23O16+ [M + H]+ 1572.574 found 1572.414; [M + Na]+ 1594.556 found 1594.564; [M + K]+ 1610.530 found 1610.547.

cIKP (7.7 mg, 7.88 μmol) was dissolved in dry DMF (1 mL) and DIEA (3.0 μL, 17.2 μmol). After stirring for 10 min, the iodoacetate-containing polyamide (3.8 mg, 2.58 μmol) was added, and the mixture was stirred at 40 °C for 6 h. Evaporation afforded the crude title compound, which was purified by reverse-phase HPLC to give hybrid 3 (3.3 mg, 1.42 μmol, 55% yield). Analytical HPLC: tR = 17.8 min. MALDI-TOF-MS m/z calcd for C92H121N38O21+ [M + H]+ 2093.956 found 2093.944; [M + Na]+ 2116.942 found 2116.805; [M + K]+ 2131.912 found 2132.837. In the same way, hybrid 4 was obtained (3.1 mg, 1.28 μmol, 47% yield). Analytical HPLC: tR = 17.8 min. MALDI-TOF-MS m/z calcd for C92H121N38O21+ [M + H]+ 2193.0247 found 2192.845; [M + Na]+ 2216.010 found 2215.904; [M + K]+ 2230.981 found 2230.870.
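Two of the bookkeeping quantities reported in this section can be recomputed from the stated inputs: the yield relative to nominal resin loading, and the calculated m/z of the sodium and potassium adducts given the calculated [M + H]+ value. The sketch below is a minimal illustration; the function names are hypothetical, the atomic masses are standard monoisotopic values rounded to six decimals, and the example inputs are the hPIP1 and hPIP2 figures quoted above.

```python
# Monoisotopic atomic masses (u); the electron mass cancels when shifting between adducts
H, NA, K = 1.007825, 22.989770, 38.963707

def adducts_from_mh(mh):
    """Calculated m/z of [M+Na]+ and [M+K]+ from the calculated [M+H]+ value."""
    return {"[M+Na]+": mh - H + NA, "[M+K]+": mh - H + K}

def yield_from_loading(product_umol, resin_mg, loading_mmol_per_g):
    """Percent yield relative to the nominal resin loading."""
    resin_umol = resin_mg * loading_mmol_per_g  # mg * mmol/g = µmol
    return 100.0 * product_umol / resin_umol

# hPIP1: calculated [M+H]+ 1208.561
print({k: round(v, 3) for k, v in adducts_from_mh(1208.561).items()})

# hPIP2: 4.42 µmol obtained from 82 mg of resin at 0.45 mmol/g
print(round(yield_from_loading(4.42, 82.0, 0.45)))
```

The same two helpers can be applied to any entry in this section as a quick consistency check of the calculated adduct values and reported yields.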

4.4.3 NMR Spectroscopy

Oligonucleotides (0.1 mM) were prepared in a solution containing 70 mM KCl and 20 mM potassium phosphate (pH 7.0). 1D spectra were recorded on a Bruker 600 MHz or 800 MHz spectrometer at 25 °C and processed with TopSpin™ software. For titration assays, aliquots of a 20 mM solution of the compound in DMSO-d6 were added stepwise to the above-mentioned solution prior to each measurement.

4.4.4 UV-Melting Assays

UV-melting temperature assays were carried out on a V-650 spectrophotometer (JASCO) equipped with a thermo-controlled PAC-743R cell changer (JASCO) and a refrigerated and heating circulator F25-ED (Julabo). Hairpin duplex DNA (5′-GAAGGTGCGTTTTCGCACCTTC-3′) was quantified as described above. DNA samples (2.5 μM, 100 μL) were prepared in 10 mM Tris-HCl (pH 7.5) and 4 mM KCl buffer. Prior to analysis, samples were heated to 95 °C and cooled down to 20 °C for 1 h. Temperature scans were performed in the presence or absence of compounds (3.75 μM) by monitoring continuously from 20 to 95 °C at 260 nm at a rate of 1 °C/min. The melting temperatures (Tm) were determined as the maximum of the first derivative of the denaturing profile.
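The first-derivative definition of Tm used above can be sketched as follows. This is a minimal illustration, not the instrument software's procedure: it assumes NumPy, takes a plain finite-difference derivative without any smoothing (real traces usually need smoothing first), and is demonstrated on a synthetic sigmoidal curve whose midpoint (62 °C) is an arbitrary choice.

```python
import numpy as np

def tm_first_derivative(temps, absorbance):
    """Return the temperature at which dA/dT is largest."""
    temps = np.asarray(temps, dtype=float)
    absorbance = np.asarray(absorbance, dtype=float)
    dadt = np.gradient(absorbance, temps)  # finite-difference derivative dA/dT
    return float(temps[np.argmax(dadt)])

# Synthetic two-state melting curve (illustrative only), sampled every 1 °C
temps = np.arange(20.0, 96.0, 1.0)
curve = 0.50 + 0.10 / (1.0 + np.exp(-(temps - 62.0) / 3.0))

print(tm_first_derivative(temps, curve))
```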


4.4.5 Circular Dichroism (CD) Titration Assays

The hairpin duplex DNA (5′-GAAGGTGCGTTTTCGCACCTTC-3′) or S1 (Fig. 4.2) was quantified as described above. The DNA samples (5 μM, 500 μL) for CD titration were prepared in 10 mM Tris-HCl (pH 7.5) and 4 mM KCl. Aliquots of the master solution of compounds (1 mM in DMSO) were added stepwise and incubated for at least 3 min to reach equilibrium. CD spectra were recorded at 25 °C over the range of 270–400 or 220–400 nm using a JASCO J-805LST spectrometer in a quartz cuvette with a 1-cm path length. Maximum molar ellipticity signals at around 330 nm were plotted against the mole ratio of compound to DNA oligomer to estimate the binding stoichiometry.
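The text does not state how the stoichiometry was read from the mole-ratio plot. One common approach, sketched here purely as an assumption, is to fit one straight line to the rising part and another to the plateau and take the mole ratio at which they intersect. The function name and the synthetic data (an ideal 1:1 titration) are illustrative; NumPy is assumed.

```python
import numpy as np

def breakpoint_stoichiometry(ratio, signal):
    """Estimate the break point of a mole-ratio plot by a two-segment linear fit.

    Every split with at least two points per segment is tried; the split with the
    smallest total squared error is kept, and the intersection of the two fitted
    lines is returned.
    """
    ratio = np.asarray(ratio, dtype=float)
    signal = np.asarray(signal, dtype=float)
    best = None
    for k in range(2, len(ratio) - 1):
        left = np.polyfit(ratio[:k], signal[:k], 1)
        right = np.polyfit(ratio[k:], signal[k:], 1)
        sse = (np.sum((np.polyval(left, ratio[:k]) - signal[:k]) ** 2)
               + np.sum((np.polyval(right, ratio[k:]) - signal[k:]) ** 2))
        if best is None or sse < best[0]:
            best = (sse, left, right)
    _, (m1, b1), (m2, b2) = best
    return float((b2 - b1) / (m1 - m2))

# Synthetic ideal titration: signal rises linearly and saturates at 1 equivalent
ratio = np.arange(0.0, 3.25, 0.25)
signal = 4.0 * np.minimum(ratio, 1.0)

print(breakpoint_stoichiometry(ratio, signal))
```

With noisy data the estimate depends on how many points fall on each side of the break, so it is best read together with the raw plot.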

4.4.6 Thiazole Orange (TO) Displacement Assays

TO displacement assays were conducted according to the literature [30]. In brief, DNA samples (0.25 μM, 100 μL) were prepared in 10 mM Tris-HCl (pH 7.5) and 100 mM KCl buffer. The samples were heated at 95 °C for 5 min and slowly cooled to room temperature for annealing, then incubated with TO (0.5 μM) for at least 20 min. Master solutions of the compounds (cIKP or TMPyP4) were then added stepwise to the resulting samples, which were incubated for at least 6 min after each addition to reach equilibrium. Fluorescence spectra upon selective excitation at 501 nm were measured in medium sensitivity mode at 25 °C on a JASCO FP-6300 spectrofluorometer. TO and TMPyP4 were purchased from AAT Bioquest, Inc. and Aldrich, respectively.

4.4.7 Docking Analysis

The structure of the receptor for docking was taken from the solution structure of the c-myc G-quadruplex bound to the bisquinolinium compound Phen-DC3 in K+ solution (PDB code: 2MGN) [31] and prepared with the Discovery Studio 4.5 software package (MSI, San Diego, CA). The structure of cIKP was first energy-minimized by density functional theory with a 6-311+G* polarization basis set using Spartan software [32]. The docking process was performed with a grid-based molecular dynamics (MD) docking algorithm, CDOCKER (CHARMm-based DOCKER) [33]. The initial ligand placements were defined in the receptor with an input site sphere of radius 10.6 Å at the reference position of Phen-DC3 (surface of the G-tetrad). In the docking procedure, 100 random conformations of the ligand were generated by equilibration and minimization steps using high-temperature MD. For these conformations, 100 random orientations were produced by translating the center of the ligand to a defined placement and carrying out various randomized rotations. Each orientation was subjected to simulated annealing MD with default parameters. Before


the final production of the top 10 low-energy receptor/ligand complexes, the ligand in the receptor was minimized using a non-softened potential.

4.4.8 CD Melting Assays

DNA samples (5 μM, 100 μL) for CD melting measurements were prepared in 10 mM Tris-HCl (pH 7.5) and 4 mM KCl buffer. Temperature scans were performed in the presence or absence of compounds (5 μM) by monitoring continuously from 20 to 94 °C at 267 nm at a rate of 0.5 °C/min in a 1-cm quartz cuvette using a JASCO J-805LST spectrometer and a refrigerated and heating circulator Julabo F25-ME.

4.4.9 Fluorescence Resonance Energy Transfer (FRET) Melting Assays

DNA samples (1 μM, 25 μL) for FRET melting measurements were prepared in 10 mM Tris-HCl (pH 7.5) and 10 mM KCl buffer on a 96-well plate. The shorter strand and the longer strand of S2 (Fig. 4.2) or of a fully mismatched substrate (Fig. 4.14) were labeled at the 5′-end with fluorescein (FAM) and at the 3′-end with tetramethylrhodamine (TAMRA), respectively. The annealing procedure comprised two steps, which were controlled by a ProFlex™ PCR System thermal cycler. First, the longer strand containing a quadruplex-forming sequence was heated at 90 °C for 5 min and cooled down to 55 °C for 1 h to form the quadruplex. Subsequently, an equal amount of the shorter strand was added to the resulting solution, which was then cooled down to 4 °C for 1 h to form the duplex at the 3′ end. Temperature scans were performed in the presence or absence of compounds (1.5 μM) by monitoring FAM fluorescence continuously from 15 to 73 °C at a rate of 1 °C/min using a 7300 Real-Time PCR System. For competitive assays, folded telomere sequence (5′-AGGGTTAGGGTTAGGGTTAGGG-3′) or the hairpin duplex DNA was mixed and preincubated with the compound solution at 4 °C, and the annealed DNA solution was then added to the mixture. The final concentrations of labeled DNA and compounds were the same as those for FRET melting assays without competitors. Tm was determined as the temperature at half of the maximum signal increase. Standard deviations were calculated from at least three independent experiments.
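The half-maximum definition of Tm used for the FRET melting curves can be sketched as below. This is a minimal illustration under stated assumptions: the signal is taken to rise monotonically with temperature (FAM fluorescence increasing on duplex melting), the curve is normalized between its minimum and maximum, and the half-way temperature is found by linear interpolation between neighboring points. NumPy is assumed; the function name and the synthetic curve (midpoint 48.5 °C) are illustrative.

```python
import numpy as np

def tm_half_max(temps, fluorescence):
    """Temperature at which the signal reaches half of its total increase."""
    temps = np.asarray(temps, dtype=float)
    f = np.asarray(fluorescence, dtype=float)
    norm = (f - f.min()) / (f.max() - f.min())  # scale the curve to the range 0..1
    idx = int(np.argmax(norm >= 0.5))           # first point at or above half-maximum
    if idx == 0:
        return float(temps[0])
    t0, t1 = temps[idx - 1], temps[idx]
    y0, y1 = norm[idx - 1], norm[idx]
    # Linear interpolation between the two points that bracket 0.5
    return float(t0 + (0.5 - y0) * (t1 - t0) / (y1 - y0))

# Synthetic melting curve (illustrative only), sampled every 1 °C from 15 to 73 °C
temps = np.arange(15.0, 74.0, 1.0)
signal = 1.0 / (1.0 + np.exp(-(temps - 48.5) / 2.5))

print(round(tm_half_max(temps, signal), 2))
```

A curve with a two-stage transition, as noted for hybrid 4, has no single well-defined half-maximum, so a function like this would return a value that should not be interpreted as a Tm.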


4.4.10 SPR-Binding Assays

SPR experiments were performed on a Biacore X instrument (GE Healthcare) according to previous reports with some modifications. Biotinylated DNA was immobilized on streptavidin-functionalized SA sensor chips to obtain the desired immobilization level (approximately 600–800 RU). SPR measurements were carried out using degassed and filtered HBS buffer (10 mM HEPES pH 7.4, 150 mM NaCl, 3 mM EDTA, and 0.005% Surfactant P20) with 0.2% DMSO and 100 mM KCl at 25 °C. A series of sample solutions with a wide range of concentrations was prepared in the buffer with 0.2% DMSO and 100 mM KCl and injected at a flow rate of 30 μL/min. After each cycle, the sample remaining on the DNA was removed with 50 mM NaOH/1 M NaCl buffer and/or 10 mM glycine (pH 2.0) until the baseline of the sensorgrams was restored. The optimized concentration range was then selected for the subsequent quantitative analysis. These procedures were conducted twice independently. The resulting sensorgrams were fitted with a model for heterogeneous binding sites, including equations that reflect mass transfer limitation effects, to give the best fits using the BIAevaluation 4.1 program.
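As a rough orientation for choosing an injection series, the sketch below generates a two-fold dilution series like those listed in the caption of Fig. 4.15 and estimates the equilibrium fractional occupancy of the immobilized DNA at each concentration. It deliberately uses a simple 1:1 Langmuir expression, θ = C / (C + KD), which is an assumption made for illustration and is not the heterogeneous binding model with mass transfer terms that was used for the actual fits. Function names are hypothetical.

```python
def twofold_series(top_nm, n=4):
    """Two-fold dilution series starting at top_nm (nM)."""
    return [top_nm / 2 ** i for i in range(n)]

def fractional_occupancy(conc_nm, kd_nm):
    """Equilibrium occupancy for a simple 1:1 binding site: C / (C + K_D)."""
    return conc_nm / (conc_nm + kd_nm)

# Series with a 125 nM top concentration, and the mean K_D of hybrid 3 for S3 (3.7 nM)
for conc in twofold_series(125.0):
    print(conc, round(fractional_occupancy(conc, 3.7), 3))
```

Under this simplified picture every injection in the series lies well above KD, so most of the information about affinity comes from the kinetic shape of the sensorgrams rather than from the plateau heights.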

References

1. (a) Felsenfeld G, Davies DR, Rich A (1957) J Am Chem Soc 79:2023–2024; (b) Plum GE, Park YW, Singleton SF, Dervan PB, Breslauer KJ (1990) Proc Natl Acad Sci USA 87:9436–9440
2. Gellert M, Lipsett MN, Davies DR (1962) Proc Natl Acad Sci USA 48:2013–2018
3. (a) Rajendran A, Endo M, Hidaka K, Sugiyama H (2014) Angew Chem Int Ed 53:4107–4112; (b) Gajarský M, Živković ML, Stadlbauer P, Pagano B, Fiala R, Amato J, Tomáška L, Šponer J, Plavec J, Trantírek L (2017) J Am Chem Soc 139:3591–3594
4. Limongelli V, Tito SD, Cerofolini L, Fragai M, Pagano B, Trotta R, Cosconati S, Marinelli L, Novellino E, Bertini I, Randazzo A, Luchinat C, Parrinello M (2013) Angew Chem Int Ed 52:2269–2273
5. Guéron M, Leroy JL (2000) Curr Opin Struct Biol 10:326–331
6. Mirkin SM (2007) Nature 447:932–940
7. (a) Kang H-J, Kendrick S, Hecht SM, Hurley LH (2014) J Am Chem Soc 136:4172–4185; (b) Barros SA, Chenoweth DM (2014) Angew Chem Int Ed 53:13746–13750; (c) Barros SA, Chenoweth DM (2015) Chem Sci 6:4752–4755; (d) Huang H, Suslov NB, Li N-S, Shelke SA, Evans ME, Koldobskaya Y, Rice PA, Piccirilli JA (2014) Nat Chem Biol 10:686–691; (e) Kang H-J, Cui Y, Yin H, Scheid A, Hendricks WPD, Schmidt J, Sekulic A, Kong D, Trent JM, Gokhale V, Mao H, Hurley LH (2016) J Am Chem Soc 138:13673–13692
8. Phan AT, Kuryavyi V, Darnel JC, Serganov A, Majumdar A, Ilin S, Raslin T, Polonskaia A, Chen C, Clain D, Darnell RB, Patel DJ (2012) Nat Struct Mol Biol 18:796–804
9. Krauss IR, Ramaswamy S, Neidle S, Haider S, Parkinson GN (2016) J Am Chem Soc 138:1226–1233
10. (a) Lim KW, Phan AT (2013) Angew Chem Int Ed 52:8566–8569; (b) Lim KW, Khong ZJ, Phan AT (2014) Biochemistry 53:247–257; (c) Lim KW, Nguyen TQN, Phan AT (2014) J Am Chem Soc 136:17969–17973; (d) Lim KW, Jenjaroenpun P, Low ZJ, Khong ZJ, Ng YS, Kuznetsov VA, Phan AT (2015) Nucleic Acids Res 43:5630–5646; (e) Nguyen TQN, Lim KW, Phan AT (2017) Sci Rep 7:11969


11. (a) Monchaud D, Teulade-Fichou M-P (2008) Org Biomol Chem 6:627–636; (b) Luedtke NW (2009) Chimia 63:134–139; (c) Li Q, Xiang J-F, Yang Q-F, Sun H-X, Guan A-J, Tang Y-L (2013) Nucleic Acids Res 41:D1115–D1123
12. (a) Redman JE, Granadino-Roldán JM, Schouten JA, Ladame S, Reszka AP, Neidle S, Balasubramanian S (2009) Org Biomol Chem 7:76–84; (b) Nicoludis JM, Barrett SP, Mergny J-L, Yatsunyk LA (2012) Nucleic Acids Res 40:5432–5447; (c) Kong D-M, Ma Y-E, Guo J-H, Yang W, Shen H-Xi (2009) Anal Chem 81:2678–2684
13. (a) Liu H-Y, Zhao Q, Zhang T-P, Wu Y, Xiong Y-X, Wang S-K, Ge Y-L, He J-H, Lv P, Ou T-M, Tan J-H, Li D, Gu L-Q, Ren J, Zhao Y, Huang Z-S (2016) Cell Chem Biol 23:1261–1270; (b) Xue Y, Kan Z-y, Wang Q, Yao Y, Liu J, Hao Y-h, Tan Z (2007) J Am Chem Soc 129:11185–11191; (c) Yu H, Gu X, Nakano S, Miyoshi D, Sugimoto N (2012) J Am Chem Soc 134:20060–20069; (d) Buscaglia R, Miller MC, Dean WL, Gray RD, Lane AN, Trent JO, Chaires JB (2013) Nucleic Acids Res 41:7934–7946; (e) Heddi B, Phan AT (2011) J Am Chem Soc 133:9824–9833
14. (a) Geierstanger BH, Mrksich M, Dervan PB, Wemmer DE (1994) Science 266:646–650; (b) White S, Szewczyk JW, Turner JM, Baird EE, Dervan PB (1998) Nature 391:468–471; (c) Dervan PB, Edelson BS (2003) Curr Opin Struct Biol 13:284–299
15. (a) Blackledge MS, Melander C (2013) Bioorg Med Chem 21:6101–6114; (b) Hiraoka K, Inoue T, Taylor RD, Watanabe T, Koshikawa N, Yoda H, Shinohara K-i, Takatori A, Sugimoto H, Maru Y, Denda T, Fujiwara K, Balmain A, Ozaki T, Bando T, Sugiyama H, Nagase H (2015) Nat Commun 6:6706; (c) Matsuda H, Fukuda N, Ueno T, Tahira Y, Ayame H, Zhang W, Bando T, Sugiyama H, Saito S, Matsumoto K, Mugishima H, Serie K (2006) J Am Soc Nephrol 17:422–432; (d) Pandian GN, Taniguchi J, Junetha S, Sato S, Han L, Saha A, AnandhaKumar C, Bando T, Nagase H, Vaijayanthi T, Taylor RD, Sugiyama H (2014) Sci Rep 4:3843
16. Asamitsu S, Li Y, Bando T, Sugiyama H (2016) ChemBioChem 17:1317–1322
17. Chen S-B, Hu M-H, Liu G-C, Wang J, Ou T-M, Gu L-Q, Huang Z-S, Tan J-H (2016) J Am Chem Soc 138:10382–10385
18. (a) Balasubramanian S, Hurley LH, Neidle S (2011) Nat Rev Drug Discovery 10:261–275; (b) Brooks TA, Hurley LH (2010) Genes Cancer 1:641–649; (c) Brooks TA, Hurley LH (2009) Nat Rev Cancer 9:849–861; (d) González V, Hurley LH (2010) Annu Rev Pharmacol Toxicol 50:111–129
19. Henry JA, Le NM, Nguyen B, Howard CM, Bailey SL, Horick SM, Buchmueller KL, Kotecha M, Hochhauser D, Hartley JA, Wilson WD, Lee M (2004) Biochemistry 43:12249–12257
20. Belitsky JM, Nguyen DH, Wurtz NR, Dervan PB (2002) Bioorg Med Chem 10:2767–2774
21. (a) Wurtz NR, Turner JM, Baird EE, Dervan PB (2001) Org Lett 3:1201–1203; (b) Kawamoto Y, Sasaki A, Hashiya K, Ide S, Bando T, Maeshima K, Sugiyama H (2015) Chem Sci 6:2307–2312
22. Pilch DS, Poklar N, Gelfand CA, Law SM, Breslauer KJ, Baird EE, Dervan PB (1996) Proc Natl Acad Sci USA 93:8306–8311
23. Howell WM, Jobs M, Brookes AJ (2002) Genome Res 12:401–407
24. Lacy ER, Le NM, Price CA, Lee M, Wilson WD (2002) J Am Chem Soc 124:2153–2163
25. Rankin S, Reszka AP, Huppert J, Zloh M, Parkinson GN, Todd AK, Ladame S, Balasubramanian S, Neidle S (2005) J Am Chem Soc 127:10584–10589
26. (a) Hurley LH, Wheelhouse RT, Sun D, Kerwin SM, Salazar M, Fedoroff OY, Han FX, Han H, Izbicka E, Von Hoff DD (2000) Pharmacol Ther 85:141–158; (b) Dexheimer TS, Sun D, Hurley LH (2006) J Am Chem Soc 128:5404–5415
27. Cogoi S, Quadrifoglio F, Xodo LE (2004) Biochemistry 43:2512–2523
28. Chenoweth DM, Harki DA, Dervan PB (2009) J Am Chem Soc 131:7175–7181
29. (a) Wurtz NR, Turner JM, Baird EE, Dervan PB (2001) Org Lett 3:1201–1203; (b) Kawamoto Y, Sasaki A, Hashiya K, Ide S, Bando T, Maeshima K, Sugiyama H (2015) Chem Sci 6:2307–2312
30. Monchaud D, Allain C, Teulade-Fichou M-P (2006) Bioorg Med Chem Lett 16:4842–4845
31. Chung WJ, Heddi B, Hamon F, Teulade-Fichou M-P, Phan AT (2014) Angew Chem Int Ed 53:999–1002
32. Shao Y et al (2006) Phys Chem Chem Phys 8:3172–3191
33. Wu G et al (2003) J Comput Chem 24:1549–1562

Curriculum Vitae

Dr. Sefan Asamitsu obtained his B.Sc. degree from Kyoto University in 2014. He also obtained his M.Sc. degree in 2016 and his Ph.D. degree in 2019 from the Department of Chemistry at Kyoto University in the group led by Prof. Hiroshi Sugiyama. His Ph.D. thesis focuses on the development of highly selective and bioactive DNA-interactive ligands with the aim of creating potential molecular probes and therapeutic agents.

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2020 S. Asamitsu, Development of Selective DNA-Interacting Ligands, Springer Theses, https://doi.org/10.1007/978-981-15-7716-1
