G-Quadruplex Nucleic Acids: Methods and Protocols [1st ed. 2019] 978-1-4939-9665-0, 978-1-4939-9666-7

This volume covers the structures, properties, and functions of G-quadruplexes in a wide range of biological disciplines


English Pages XIV, 437 [438] Year 2019


Table of contents:
Front Matter ....Pages i-xiv
G-Quadruplex DNA and RNA (Danzhou Yang)....Pages 1-24
CD Study of the G-Quadruplex Conformation (Iva Kejnovská, Daniel Renčiuk, Jan Palacký, Michaela Vorlíčková)....Pages 25-44
Revealing the Energetics of Ligand-Quadruplex Interactions Using Isothermal Titration Calorimetry (Andrea Funke, Klaus Weisz)....Pages 45-61
Biosensor-Surface Plasmon Resonance: Label-Free Method for Investigation of Small Molecule-Quadruplex Nucleic Acid Interactions (Ananya Paul, Caterina Musetti, Rupesh Nanjunda, W. David Wilson)....Pages 63-85
Putting a New Spin of G-Quadruplex Structure and Binding by Analytical Ultracentrifugation (William L. Dean, Robert D. Gray, Lynn DeLeeuw, Robert C. Monsen, Jonathan B. Chaires)....Pages 87-103
Mass Spectroscopic Study of G-Quadruplex (Huihui Li)....Pages 105-116
G-Quadruplex Stability from DSC Measurements (San Hadži, Matjaž Bončina, Jurij Lah)....Pages 117-130
X-Ray Crystallographic Studies of G-Quadruplex Structures (Gary N. Parkinson, Gavin W. Collie)....Pages 131-155
NMR Studies of G-Quadruplex Structures and G-Quadruplex-Interactive Compounds (Clement Lin, Jonathan Dickerhoff, Danzhou Yang)....Pages 157-176
Using Molecular Dynamics Free Energy Simulation to Compute Binding Affinities of DNA G-Quadruplex Ligands (Nanjie Deng)....Pages 177-199
Electrophoretic Mobility Shift Assay and Dimethyl Sulfate Footprinting for Characterization of G-Quadruplexes and G-Quadruplex-Protein Complexes (Buket Onel, Guanhui Wu, Daekyu Sun, Clement Lin, Danzhou Yang)....Pages 201-222
A DNA Polymerase Stop Assay for Characterization of G-Quadruplex Formation and Identification of G-Quadruplex-Interactive Compounds (Guanhui Wu, Haiyong Han)....Pages 223-231
Chromatin Immunoprecipitation Assay to Analyze the Effect of G-Quadruplex Interactive Agents on the Binding of RNA Polymerase II and Transcription Factors to a Target Promoter Region (Daekyu Sun)....Pages 233-242
Characterization of Co-transcriptional Formation of G-Quadruplexes in Double-Stranded DNA (Ke-wei Zheng, Jia-yu Zhang, Zheng Tan)....Pages 243-255
Quantitative Analysis of Stall of Replicating DNA Polymerase by G-Quadruplex Formation (Shuntaro Takahashi, Naoki Sugimoto)....Pages 257-274
Single-Molecule Investigations of G-Quadruplex (Shankar Mandal, Mohammed Enamul Hoque, Hanbin Mao)....Pages 275-298
Direct Observation of the Formation and Dissociation of Double-Stranded DNA Containing G-Quadruplex/i-Motif Sequences in the DNA Origami Frame Using High-Speed AFM (Masayuki Endo, Xiwen Xing, Hiroshi Sugiyama)....Pages 299-308
G-Quadruplex and Protein Binding by Single-Molecule FRET Microscopy (Chun-Ying Lee, Christina McNerney, Sua Myong)....Pages 309-322
High-Throughput Screening of G-Quadruplex Ligands by FRET Assay (Kaibo Wang, Daniel P. Flaherty, Lan Chen, Danzhou Yang)....Pages 323-331
Targeting G-Quadruplexes with PNA Oligomers (Bruce A. Armitage)....Pages 333-345
Primer-Modified G-Quadruplex-Au Nanoparticles for Colorimetric Assay of Human Telomerase Activity and Initial Screening of Telomerase Inhibitors (Fang Pu, Jinsong Ren, Xiaogang Qu)....Pages 347-356
Heme•G-Quadruplex DNAzymes: Conditions for Maximizing Their Peroxidase Activity (Nisreen Shumayrikh, Dipankar Sen)....Pages 357-368
In Vivo Chemical Probing for G-Quadruplex Formation (Fedor Kouzine, Damian Wojtowicz, Arito Yamane, Rafael Casellas, Teresa M. Przytycka, David L. Levens)....Pages 369-382
G-Quadruplex Visualization in Cells via Antibody and Fluorescence Probe (Matteo Nadai, Sara N. Richter)....Pages 383-395
In Cell NMR Spectroscopy: Investigation of G-Quadruplex Structures Inside Living Xenopus laevis Oocytes (Michaela Krafcikova, Robert Hänsel-Hertsch, Lukas Trantirek, Silvie Foldynova-Trantirkova)....Pages 397-405
19F NMR Spectroscopy for the Analysis of DNA G-Quadruplex Structures Using 19F-Labeled Nucleobase (Takumi Ishizuka, Hong-Liang Bao, Yan Xu)....Pages 407-433
Back Matter ....Pages 435-437


Methods in Molecular Biology 2035

Danzhou Yang and Clement Lin, Editors

G-Quadruplex Nucleic Acids: Methods and Protocols

Methods in Molecular Biology

Series Editor: John M. Walker, School of Life and Medical Sciences, University of Hertfordshire, Hatfield, Hertfordshire, UK

For further volumes: http://www.springer.com/series/7651

For over 35 years, biological scientists have come to rely on the research protocols and methodologies in the critically acclaimed Methods in Molecular Biology series. The series was the first to introduce the step-by-step protocols approach that has become the standard in all biomedical protocol publishing. Each protocol is provided in readily reproducible step-by-step fashion, opening with an introductory overview, followed by a list of the materials and reagents needed to complete the experiment and then a detailed procedure that is supported by a helpful notes section offering tips and tricks of the trade as well as troubleshooting advice. These hallmark features were introduced by series editor Dr. John Walker and constitute the key ingredient in each and every volume of the Methods in Molecular Biology series. Tested and trusted, comprehensive and reliable, all protocols from the series are indexed in PubMed.

Edited by

Danzhou Yang
Department of Medicinal Chemistry and Molecular Pharmacology, College of Pharmacy, Purdue University, West Lafayette, IN, USA; Purdue Center for Cancer Research, West Lafayette, IN, USA; Purdue Institute for Drug Discovery, West Lafayette, IN, USA

Clement Lin
Medicinal Chemistry and Molecular Pharmacology, College of Pharmacy, Purdue University, West Lafayette, IN, USA

ISSN 1064-3745
ISSN 1940-6029 (electronic)
Methods in Molecular Biology
ISBN 978-1-4939-9665-0
ISBN 978-1-4939-9666-7 (eBook)
https://doi.org/10.1007/978-1-4939-9666-7

© Springer Science+Business Media, LLC, part of Springer Nature 2019

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Humana imprint is published by the registered company Springer Science+Business Media, LLC, part of Springer Nature. The registered company address is: 233 Spring Street, New York, NY 10013, U.S.A.

Preface

G-quadruplexes are noncanonical four-stranded nucleic acid structures that are formed in guanine-rich DNA and RNA sequences. G-quadruplexes have emerged as one of the most exciting nucleic acid secondary structures. G-quadruplexes can readily form under physiologically relevant conditions and are globularly folded nucleic acid secondary structures. G-quadruplexes have been found to form in specific genomic guanine-rich sequences with functional significance, such as human telomeres, oncogene-promoter regions, replication initiation sites, and 5′- and 3′-untranslated regions (UTRs) of mRNA, and are involved in a number of critical cellular processes, including gene transcription, replication, translation, and genomic stability. G-quadruplexes are also found in genomes beyond those of humans. As such, G-quadruplexes have emerged as a new class of molecular targets for drug development. In addition, there is considerable interest in the use of G-quadruplexes for biomaterials, biosensors, and biocatalysts. There is therefore great interest in the structures, properties, and functions of G-quadruplexes in a wide range of biological disciplines as well as in therapeutic intervention and biomaterial application.

G-Quadruplex Nucleic Acids: Methods and Protocols has been produced as a tool for researchers interested in G-quadruplex nucleic acids, whether in fundamental biology or in more applied drug discovery or biomaterial development work. This first edition covers a wide range of essential and novel experimental tools and methods for studying G-quadruplexes. These methods originate from fields including biophysics, structural biology, computational biology, biochemistry, and molecular and cell biology. The protocols are presented in a step-by-step fashion, allowing them to be readily repeated and applied by both experienced and new researchers. We believe this work will be found both informative and useful.

West Lafayette, IN, USA
Danzhou Yang
Clement Lin


Contributors

BRUCE A. ARMITAGE • Department of Chemistry and Center for Nucleic Acids Science and Technology, Carnegie Mellon University, Pittsburgh, PA, USA
HONG-LIANG BAO • Division of Chemistry, Department of Medical Sciences, Faculty of Medicine, University of Miyazaki, Miyazaki, Japan
MATJAŽ BONČINA • Faculty of Chemistry and Chemical Technology, University of Ljubljana, Ljubljana, Slovenia
RAFAEL CASELLAS • Laboratory of Pathology, National Cancer Institute, National Institutes of Health (USA), Bethesda, MD, USA
JONATHAN B. CHAIRES • James Graham Brown Cancer Center, University of Louisville, Louisville, KY, USA
LAN CHEN • Purdue Institute for Drug Discovery, West Lafayette, IN, USA
GAVIN W. COLLIE • UCL School of Pharmacy, University College London, London, UK; Discovery Sciences, R&D, AstraZeneca, Cambridge, UK
WILLIAM L. DEAN • James Graham Brown Cancer Center, University of Louisville, Louisville, KY, USA
LYNN DELEEUW • James Graham Brown Cancer Center, University of Louisville, Louisville, KY, USA
NANJIE DENG • Department of Chemistry and Physical Sciences, Pace University, New York, NY, USA
JONATHAN DICKERHOFF • Medicinal Chemistry and Molecular Pharmacology, College of Pharmacy, Purdue University, West Lafayette, IN, USA
MASAYUKI ENDO • Department of Chemistry, Graduate School of Science, Kyoto University, Kyoto, Japan; Institute for Integrated Cell-Material Sciences, Kyoto University, Kyoto, Japan
DANIEL P. FLAHERTY • Department of Medicinal Chemistry and Molecular Pharmacology, College of Pharmacy, Purdue University, West Lafayette, IN, USA; Purdue Center for Cancer Research, West Lafayette, IN, USA; Purdue Institute for Drug Discovery, West Lafayette, IN, USA
SILVIE FOLDYNOVA-TRANTIRKOVA • CEITEC-Central European Institute of Technology, Masaryk University, Brno, Czech Republic; Institute of Biophysics, Academy of Sciences of the Czech Republic, Brno, Czech Republic
ANDREA FUNKE • Institute of Biochemistry, University of Greifswald, Greifswald, Germany
ROBERT D. GRAY • James Graham Brown Cancer Center, University of Louisville, Louisville, KY, USA
SAN HADŽI • Faculty of Chemistry and Chemical Technology, University of Ljubljana, Ljubljana, Slovenia
HAIYONG HAN • Molecular Medicine Division, Translational Genomics Research Institute, Phoenix, AZ, USA
ROBERT HÄNSEL-HERTSCH • Cancer Research UK Cambridge Institute, Li Ka Shing Centre, University of Cambridge, Cambridge, UK
MOHAMMED ENAMUL HOQUE • Department of Chemistry and Biochemistry, Kent State University, Kent, OH, USA
TAKUMI ISHIZUKA • Division of Chemistry, Department of Medical Sciences, Faculty of Medicine, University of Miyazaki, Miyazaki, Japan
IVA KEJNOVSKÁ • The Czech Academy of Sciences, Institute of Biophysics, Brno, Czech Republic
FEDOR KOUZINE • Laboratory of Pathology, National Cancer Institute, National Institutes of Health (USA), Bethesda, MD, USA
MICHAELA KRAFCIKOVA • CEITEC-Central European Institute of Technology, Masaryk University, Brno, Czech Republic; Institute of Biophysics, Academy of Sciences of the Czech Republic, Brno, Czech Republic
JURIJ LAH • Faculty of Chemistry and Chemical Technology, University of Ljubljana, Ljubljana, Slovenia
CHUN-YING LEE • Department of Biophysics, Johns Hopkins University, Baltimore, MD, USA
DAVID L. LEVENS • Laboratory of Pathology, National Cancer Institute, National Institutes of Health (USA), Bethesda, MD, USA
HUIHUI LI • National and Local Joint Engineering Research Center of Biomedical Functional Materials, Jiangsu Collaborative Innovation Center of Biomedical Functional Materials, School of Chemistry and Materials Science, Nanjing Normal University, Nanjing, People’s Republic of China
CLEMENT LIN • Medicinal Chemistry and Molecular Pharmacology, College of Pharmacy, Purdue University, West Lafayette, IN, USA
SHANKAR MANDAL • Department of Chemistry and Biochemistry, Kent State University, Kent, OH, USA
HANBIN MAO • Department of Chemistry and Biochemistry, Kent State University, Kent, OH, USA
CHRISTINA MCNERNEY • Department of Biology, Johns Hopkins University, Baltimore, MD, USA
ROBERT C. MONSEN • James Graham Brown Cancer Center, University of Louisville, Louisville, KY, USA
CATERINA MUSETTI • Department of Chemistry, Center for Diagnostics and Therapeutics, Georgia State University, Atlanta, GA, USA; Department of Screening, Profiling and Mechanistic Biology, Platform Technology and Science, Glaxo Smith Kline, Collegeville, PA, USA
SUA MYONG • Department of Biophysics, Johns Hopkins University, Baltimore, MD, USA
MATTEO NADAI • Department of Molecular Medicine, University of Padua, Padua, Italy
RUPESH NANJUNDA • Department of Chemistry, Center for Diagnostics and Therapeutics, Georgia State University, Atlanta, GA, USA; Janssen Research and Development, Spring House, PA, USA
BUKET ONEL • Medicinal Chemistry and Molecular Pharmacology, Purdue University, College of Pharmacy, West Lafayette, IN, USA
JAN PALACKÝ • The Czech Academy of Sciences, Institute of Biophysics, Brno, Czech Republic
GARY N. PARKINSON • UCL School of Pharmacy, University College London, London, UK
ANANYA PAUL • Department of Chemistry, Center for Diagnostics and Therapeutics, Georgia State University, Atlanta, GA, USA
TERESA M. PRZYTYCKA • Laboratory of Pathology, National Cancer Institute, National Institutes of Health (USA), Bethesda, MD, USA
FANG PU • Laboratory of Chemical Biology and State Key Laboratory of Rare Earth Resource Utilization, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, Jilin, China
XIAOGANG QU • Laboratory of Chemical Biology and State Key Laboratory of Rare Earth Resource Utilization, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, Jilin, China
JINSONG REN • Laboratory of Chemical Biology and State Key Laboratory of Rare Earth Resource Utilization, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, Jilin, China
DANIEL RENČIUK • The Czech Academy of Sciences, Institute of Biophysics, Brno, Czech Republic
SARA N. RICHTER • Department of Molecular Medicine, University of Padua, Padua, Italy
DIPANKAR SEN • Department of Chemistry, Simon Fraser University, Burnaby, BC, Canada; Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC, Canada
NISREEN SHUMAYRIKH • Department of Chemistry, Simon Fraser University, Burnaby, BC, Canada
NAOKI SUGIMOTO • Frontier Institute for Biomolecular Engineering Research (FIBER), Konan University, Kobe, Japan; Graduate School of Frontiers of Innovative Research in Science and Technology (FIRST), Konan University, Kobe, Japan
HIROSHI SUGIYAMA • Department of Chemistry, Graduate School of Science, Kyoto University, Kyoto, Japan; Institute for Integrated Cell-Material Sciences, Kyoto University, Kyoto, Japan
DAEKYU SUN • University of Arizona, College of Pharmacy, Tucson, AZ, USA; BIO5 Institute, Tucson, AZ, USA; Arizona Cancer Center, Tucson, AZ, USA
SHUNTARO TAKAHASHI • Frontier Institute for Biomolecular Engineering Research (FIBER), Konan University, Kobe, Japan
ZHENG TAN • State Key Laboratory of Membrane Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, People’s Republic of China
LUKAS TRANTIREK • CEITEC-Central European Institute of Technology, Masaryk University, Brno, Czech Republic; Institute of Biophysics, Academy of Sciences of the Czech Republic, Brno, Czech Republic
MICHAELA VORLÍČKOVÁ • The Czech Academy of Sciences, Institute of Biophysics, Brno, Czech Republic
KAIBO WANG • Department of Medicinal Chemistry and Molecular Pharmacology, College of Pharmacy, Purdue University, West Lafayette, IN, USA
KLAUS WEISZ • Institute of Biochemistry, University of Greifswald, Greifswald, Germany
W. DAVID WILSON • Department of Chemistry, Center for Diagnostics and Therapeutics, Georgia State University, Atlanta, GA, USA
DAMIAN WOJTOWICZ • Laboratory of Pathology, National Cancer Institute, National Institutes of Health (USA), Bethesda, MD, USA
GUANHUI WU • Department of Medicinal Chemistry and Molecular Pharmacology, Purdue University, College of Pharmacy, West Lafayette, IN, USA
XIWEN XING • Department of Chemistry, Graduate School of Science, Kyoto University, Kyoto, Japan; Department of Biotechnology, Key Laboratory of Virology of Guangzhou College of Life Science and Technology, Jinan University, Guangzhou, P. R. China
YAN XU • Division of Chemistry, Department of Medical Sciences, Faculty of Medicine, University of Miyazaki, Miyazaki, Japan
ARITO YAMANE • Laboratory of Pathology, National Cancer Institute, National Institutes of Health (USA), Bethesda, MD, USA
DANZHOU YANG • Department of Medicinal Chemistry and Molecular Pharmacology, College of Pharmacy, Purdue University, West Lafayette, IN, USA; Purdue Center for Cancer Research, West Lafayette, IN, USA; Purdue Institute for Drug Discovery, West Lafayette, IN, USA
JIA-YU ZHANG • State Key Laboratory of Membrane Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, People’s Republic of China
KE-WEI ZHENG • State Key Laboratory of Membrane Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, People’s Republic of China

Chapter 1

G-Quadruplex DNA and RNA

Danzhou Yang

Abstract

G-quadruplexes (G4s) have become one of the most exciting nucleic acid secondary structures. A noncanonical, four-stranded structure formed in guanine-rich DNA and RNA sequences, G-quadruplexes can readily form under physiologically relevant conditions and are globularly folded structures. DNA is widely recognized as a double-helical structure essential in genetic information storage. However, only ~3% of the human genome is expressed in protein; RNA and DNA may form noncanonical secondary structures that are functionally important. G-quadruplexes are one such example and have gained considerable attention for their formation and regulatory roles in biologically significant regions, such as human telomeres, oncogene-promoter regions, replication initiation sites, and 5′- and 3′-untranslated regions (UTRs) of mRNA. They have been shown to be a regulatory motif in a number of critical cellular processes, including gene transcription, translation, replication, and genomic stability. G-quadruplexes are also found in nonhuman genomes, particularly those of human pathogens. Therefore, G-quadruplexes have emerged as a new class of molecular targets for drug development. In addition, there is considerable interest in the use of G-quadruplexes for biomaterials, biosensors, and biocatalysts. The First International Meeting on Quadruplex DNA was held in 2007, and the G-quadruplex field has been growing dramatically over the last decade. The methods used to study G-quadruplexes have been essential to the rapid progress in our understanding of this exciting nucleic acid secondary structure.

Key words G-quadruplexes, DNA, RNA, Oncogene promoters, Human telomeres, UTR, DNA damage, Transcription, Replication, Translation, Drug target, Cancer, Human diseases

Danzhou Yang and Clement Lin (eds.), G-Quadruplex Nucleic Acids: Methods and Protocols, Methods in Molecular Biology, vol. 2035, https://doi.org/10.1007/978-1-4939-9666-7_1, © Springer Science+Business Media, LLC, part of Springer Nature 2019

1 G-Quadruplex Nucleic Acids

G-quadruplexes (G4s) are noncanonical four-stranded nucleic acid structures formed in guanine-rich DNA and RNA sequences (Fig. 1). They have emerged as one of the most exciting nucleic acid secondary structures. DNA is widely recognized as a double-helical structure essential in genetic information storage. Results from the ENCODE project [1] indicate that only ~3% of the human genome is expressed in protein and that RNA and DNA may form noncanonical secondary structures that are functionally important. G-quadruplexes are one such example and have gained considerable attention for their formation and regulatory roles in biologically significant regions. G-quadruplexes are found to be involved in a number of critical cellular processes, including gene transcription, translation, DNA replication, and genomic stability. G-quadruplexes can readily form under physiologically relevant conditions and are globularly folded structures. Many proteins have been identified to interact with G-quadruplex DNA or RNA, including G-quadruplex-stabilizing or destabilizing/unfolding proteins (see reviews: [2–6]). As such, G-quadruplexes have emerged as a new class of molecular targets for drug development. In addition, there is considerable interest in the use of G-quadruplexes for biomaterials [7, 8], biosensors [9, 10], and biocatalysts [11].

Fig. 1 (a) Schematic illustration of a G-tetrad, four guanine bases arranged in a square plane with Hoogsteen hydrogen bonding. Monovalent cations (K+ or Na+, shown as blue spheres) are required to stabilize G-quadruplexes by coordinating with the O6 atoms of the adjacent G-tetrad planes. (b) A schematic intermolecular (tetrameric) G-quadruplex with three G-tetrads. (c) Examples of intramolecular G-quadruplexes with different folding structures and loop conformations. The experimentally determined molecular structures are shown as examples for parallel, hybrid, and basket G-quadruplexes. (d) Example NMR molecular structures of ligand complexes with the c-MYC promoter G-quadruplex and the human telomeric G-quadruplex

First observed in 1910 [12], the G-tetrad structure was not determined until 1962 [13]. The core structure of a G-quadruplex consists of stacked guanine-tetrads (G-tetrads), each a square planar platform of four guanine bases held together by Hoogsteen hydrogen bonds (Fig. 1a). G-quadruplex structures require cations, particularly K+ or Na+, to stabilize stacked G-tetrads by coordinating with tetrad-guanine O6 atoms [14–16]. The tetrad-guanines can adopt an anti or syn glycosidic conformation; tetrad-guanines from G-strands with the same direction, i.e., parallel strands, adopt the same glycosidic conformation, whereas those from G-strands with opposite directions, i.e., antiparallel strands, adopt different glycosidic conformations [17].

G-quadruplexes can be intramolecular (monomeric) or intermolecular (multimeric), formed from one or from more than one nucleic acid molecule, respectively. Tetramolecular G-quadruplexes (Fig. 1b) are usually parallel-stranded, with tetrad-guanines adopting the anti glycosidic conformation. Most biologically relevant G-quadruplexes are intramolecular, with three-tetrad cores being the most common (Fig. 1c). In contrast to tetramolecular structures, intramolecular G-quadruplexes form quickly and exhibit great conformational diversity in folding topology, loop conformation, and capping structures.

Based on G-strand directionality, a G-quadruplex can be parallel, with all four G-strands in the same direction; hybrid/mixed, with both parallel and antiparallel strands; or antiparallel, with all adjacent G-strands antiparallel to each other. G-strands in intramolecular G-quadruplexes are connected by different types of loops: propeller loops connect parallel strands, lateral loops connect adjacent antiparallel strands, and diagonal loops connect antiparallel strands across the G-tetrad core. Not only can different sequences adopt distinct topologies, but a given sequence can also fold into different conformations, as in the case of human telomeric DNA, or form multiple structures, as in the case of human gene promoter sequences [18]. While a number of principles of G-quadruplex folding have been recognized, a G-quadruplex conformation is difficult to predict and requires experimental structure determination.
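The loop and glycosidic rules stated above can be written down compactly. The following Python sketch is illustrative only: the +1/-1 encoding of strand direction, the `adjacent` flag, and all function names are assumptions introduced here, not notation from the chapter, and the sketch encodes only the stated rules rather than predicting how a sequence folds.

```python
from typing import List, Sequence

def _check_direction(direction: int) -> None:
    # Hypothetical encoding: +1 and -1 stand for the two strand orientations.
    if direction not in (1, -1):
        raise ValueError("direction must be +1 or -1")

def loop_type(dir_a: int, dir_b: int, adjacent: bool) -> str:
    """Name the loop joining two consecutive G-strands.

    Parallel strands are joined by a propeller loop (the `adjacent` flag is
    ignored in that case); antiparallel strands are joined by a lateral loop
    when they sit next to each other and by a diagonal loop when they sit
    across the G-tetrad core.
    """
    _check_direction(dir_a)
    _check_direction(dir_b)
    if dir_a == dir_b:
        return "propeller"
    return "lateral" if adjacent else "diagonal"

def partner_glycosidic(conformation: str, dir_a: int, dir_b: int) -> str:
    """Expected conformation of a tetrad-guanine on strand b, given strand a.

    Parallel strands share the same glycosidic conformation; antiparallel
    strands carry different ones.
    """
    if conformation not in ("anti", "syn"):
        raise ValueError("conformation must be 'anti' or 'syn'")
    _check_direction(dir_a)
    _check_direction(dir_b)
    if dir_a == dir_b:
        return conformation
    return "syn" if conformation == "anti" else "anti"

def loops_for(directions: Sequence[int], adjacency: Sequence[bool]) -> List[str]:
    """Loop types for four strands listed in 5'-to-3' sequence order."""
    if len(directions) != 4 or len(adjacency) != 3:
        raise ValueError("expected four strand directions and three adjacency flags")
    return [loop_type(directions[i], directions[i + 1], adjacency[i]) for i in range(3)]

# A basket-like arrangement: alternating strands, the middle loop crossing the core.
basket_loops = loops_for([1, -1, 1, -1], [True, False, True])
# An all-parallel arrangement.
parallel_loops = loops_for([1, 1, 1, 1], [True, True, True])
```

With these inputs, `basket_loops` is lateral, diagonal, lateral and `parallel_loops` is three propeller loops, which matches the loop definitions in the paragraph above.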

2 G-Quadruplex Occurrence and Functions

G-quadruplexes have been found to form in specific human guanine-rich sequences with functional significance, such as telomeres, oncogene-promoter regions, and the 5′- and 3′-untranslated regions (UTRs) of mRNA, as well as in nonhuman genomes.

2.1 G-Quadruplexes in Telomeres

The first biologically relevant G-quadruplex was observed in telomeric DNA. Telomeres are specific DNA–protein complexes at the ends of linear chromosomes that protect against gene erosion from cell divisions, chromosomal nonhomologous end-joining, and nuclease attack [19–21]. Telomeric G-quadruplexes were first reported as novel intramolecular structures containing guanine–guanine base pairs in single-stranded telomeric sequences of several organisms [22], and as guanine tetrads between hairpin loops of the Tetrahymena telomeric DNA [23]. The importance of monovalent cations in the stabilization of G-quadruplex structures was revealed by Williamson, Raghuraman, and Cech in their monovalent cation-induced square-planar G-quartet model, based on the Oxytricha and Tetrahymena telomeric DNA [14]. Human telomeres consist of tandem repeats of the hexanucleotide d(TTAGGG)n, 5–10 kb in length, which terminate in a single-stranded 3′-overhang of 35–600 bases [24]. Telomeres of cancer cells do not shorten upon replication, mainly due to the activation of a reverse transcriptase, telomerase, that extends the telomeric sequence at the chromosome ends [25]. Telomerase is activated in 80–85% of human cancer cells to maintain telomere length and the malignant phenotype [25–27]. G-quadruplex formation can inhibit the activity of telomerase [28], making it an attractive target for cancer therapeutic intervention. In addition to formation at the telomere end, which most likely involves intramolecular G-quadruplex structures, intermolecular G-quadruplex formation may also be involved in the T-loop invasion complex [29, 30]. Recently, the telomeric repeat-containing RNA (TERRA) G-quadruplex was identified and found to inhibit telomerase [31, 32]. 
Human telomeric DNA is structurally polymorphic and may adopt different intramolecular G-quadruplex conformations, including two equilibrating hybrid-type structures [33–39] and a 2-tetrad structure [40–42] in K+ solution, a parallel structure in the crystalline form in the presence of K+ [43], and a basket-type structure in Na+ solution [44]. The hybrid-type structures can effectively form packed multimers at the telomere ends [33, 45]. Although different human telomeric G-quadruplexes appear to have small energy differences relative to each other, interconversion between them is kinetically slow, indicating the presence of one or more high-energy intermediates [33, 41, 46–48]. The structure polymorphism appears to be an intrinsic property of the highly conserved telomeric sequence in higher eukaryotes, particularly the TTA loop sequence [49]. On the other hand, telomeric repeat-containing RNA was shown to adopt parallel-stranded G-quadruplexes [50–52].

2.2 G-Quadruplexes in Gene Promoters

More recently, DNA G-quadruplexes were found to form in gene promoter regions and to function as transcriptional regulators [53, 54]; this has been the most active area of G-quadruplex DNA research in the past decade. The first experiments that suggested the existence of unusual forms of DNA associated with runs of guanines in gene promoters were reported in 1982 for the chicken β-globin gene, based upon the nuclease hypersensitivity of promoter elements [55–57]. Since then, the occurrence of these elements in human gene promoters has been reported, including in those of insulin [58], c-MYC [54, 59], VEGF [60, 61], HIF-1α [62], BCL-2 [63–65], MtCK [66], K-RAS [67, 68], c-KIT [69, 70], RET [71], PDGF-A [72], c-MYB [73], hTERT [74], and PDGFR-β [75, 76], in addition to mouse α7 integrin [77]. The potential occurrence of DNA G-quadruplexes has been identified in the promoter regions of human genes involved in growth and proliferation [53, 78, 79]; these genes all contain G-rich/C-rich tracts in the proximal regions of their promoters and are mostly TATA-less. In addition, the potential for quadruplex formation is higher within oncogenes than within tumor suppressor genes [80]. Computational analyses showed significant enrichment of G-quadruplex-forming sequences in the promoter regions of human genes near transcription start sites (TSS) [81]. The driving force for the formation of promoter G-quadruplexes appears to be the transcription-induced dynamic negative superhelicity [82–85]. The c-MYC gene promoter is the most extensively studied system for the promoter G-quadruplex [54, 86]. A highly conserved G-rich nuclease hypersensitivity element III1 in the proximal region of the c-MYC promoter controls 80–90% of the transcriptional activity regardless of whether the P1 or P2 promoter is used [87–90]. This element in the c-MYC promoter is highly dynamic in its conformation [91] and can form G-quadruplex structures, which function as a transcriptional silencer [54, 59]. 
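
As a rough illustration of how sequence-based prediction of G-quadruplex-forming sites can work, the sketch below scans one DNA strand for a putative quadruplex-forming pattern. The motif used here (four runs of at least three guanines separated by loops of 1–7 nt) is a commonly used heuristic and is an assumption of this example; it is not necessarily the algorithm used in [81]. The helper name `find_pqs` is hypothetical.

```python
import re

# Assumed heuristic motif: G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+
G_TRACT = r"G{3,}"
LOOP = r"[ACGT]{1,7}"
PQS_PATTERN = re.compile(LOOP.join([G_TRACT] * 4))


def find_pqs(sequence: str):
    """Return (start index, matched text) for each non-overlapping hit.

    Only the strand given is scanned; G-rich sites on the complementary
    strand (C-rich here) would need a separate pass.
    """
    seq = sequence.upper()
    return [(m.start(), m.group(0)) for m in PQS_PATTERN.finditer(seq)]


if __name__ == "__main__":
    # Four human telomeric repeats, d(TTAGGG)4, as a toy input.
    for start, hit in find_pqs("TTAGGG" * 4):
        print(start, hit)
```

For the four telomeric repeats this reports a single candidate that starts at index 3. A pattern match of this kind only flags a potential site: it says nothing about which topology forms or whether the structure forms in vivo.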
In contrast to the tandem repeats of the telomeric sequence, the promoter G-quadruplex-forming sequences are each unique in the number and length of their G-tracts and intervening bases. The promoter G-rich sequences often contain more than four G-tracts with unequal numbers of guanines and can form multiple G-quadruplexes by using different combinations of G-tracts, or different loop isomers by using different guanines within one G-tract [18]. Parallel structures are common among the promoter G-quadruplexes, usually with a three-tetrad core. Structural studies showed that each promoter G-quadruplex adopts unique capping and loop structures determined by its specific sequence, such as c-MYC [92–95], BCL-2 [63, 65, 96, 97], KRAS [98], c-KIT [99–101], VEGF [102], and PDGFR-β [103, 104]. A notable feature of the promoter G-quadruplexes is the prevalence of the G3NG3 motif, a robust parallel-stranded structural motif with a 1-nt propeller loop. This motif was first observed in the major G-quadruplex structure formed in the c-MYC promoter, which showed that the 1-nt propeller loop conformation is highly favored [94]. By having two such motifs, parallel promoter G-quadruplexes can have a long and variable middle loop [65, 97, 105]. In addition, parallel G-quadruplexes exist in variant forms, such as with a broken strand [99, 103], an end insertion [104], or even an additional hairpin-loop conformation [65, 74]. Furthermore, certain promoter sequences can form multiple G-quadruplexes in one overlapping region or in separate regions. For example, the BCL-2 proximal promoter contains two G-quadruplex-forming regions that are separated by 13 nt (Pu39 and P1G4), with two competing G4s, i.e., a hybrid structure [63, 96] and a parallel structure [97], formed in Pu39, and two equilibrating parallel G4s formed in P1G4 [65]. Similar phenomena were observed in the promoters of KRAS [67, 68, 106–108], c-KIT [69, 70, 99–101], PDGFR-β [75, 76, 103, 104], and hTERT [74, 109–111]. The variations in promoter G-quadruplexes give rise to different overall structural properties that could be specifically recognized by proteins or small-molecule ligands for transcriptional regulation. Moreover, inherent polymorphism and equilibrium between different conformations may provide an additional layer of transcriptional modulation.

2.3 G-Quadruplexes in Other Regions of Genome and in RNA

G-quadruplexes have been found in other regions of the human genome, such as immunoglobulin class switch regions [112–114], ribosomal DNA [115], mitochondrial DNA [116–119], replication initiation regions [120], and the LINE-1 retrotransposon [121–123], and as DNA:RNA hybrid G-quadruplexes in transcription [124], as well as in the extended repeat sequences in neurodegenerative diseases at both DNA and RNA levels, such as the (CGG)n repeat in the 5′-UTR of the FMR1 gene in Fragile X syndrome (FXS) [125–127] and the hexanucleotide repeat expansion (HRE) (GGGGCC)n in C9orf72 in amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD) [128]. In addition, G-quadruplexes have been found to form in RNA. G-quadruplexes formed in 5′-UTRs have been shown to inhibit translation [129, 130], as in NRAS [131], Zic-1 [132], TRF2 [133], and Yin Yang 1 [134], whereas those formed in internal ribosomal entry sites (IRESs) can initiate cap-independent translation, as in VEGF [135]. In addition, G-quadruplexes are found in 3′-UTRs [136–138] as well as in RNA introns, where they regulate alternative splicing, as in TP53 [139] and Bcl-XL [140].

Notably, DNA G-quadruplex structures have recently been shown to be involved in genomic instability and DNA damage [141–143].

2.4 G-Quadruplexes in Nonhuman Genomes

G-quadruplexes have been identified in nonhuman genomes. As in the human genome, these G-quadruplexes predominantly occur in regions of regulatory importance. In particular, G-quadruplexes are found in the genomes of human pathogens [144]. Many examples of G-quadruplexes were identified in viruses [145, 146], including human immunodeficiency virus (HIV) [147–150], multiple species of herpes virus [151–155], human papillomavirus (HPV) [156], hepatitis C [157], Zika [158], and Ebola [159], and a G-quadruplex-binding protein was found in severe acute respiratory syndrome (SARS) coronavirus [160]. In bacteria, G-quadruplexes are found in Escherichia coli [161], Neisseria gonorrhoeae [162], Neisseria meningitidis [163], Mycobacterium tuberculosis [164], and Deinococcus radiodurans [165]. G-quadruplexes were also found in ciliates [14], malaria parasites [166, 167], and yeasts [168, 169]. Notably, helicases that resolve G-quadruplex structures, such as the RecQ [170, 171] and Pif1 [142] families, were found in both nonhuman and human systems. Most recently, the presence of G-quadruplexes in plant genomes has also emerged [172].

3 G-Quadruplex Detection In Vivo

There has been significant progress in the detection of G-quadruplex structures in vivo [173]. The first direct evidence of the in vivo existence of G-quadruplexes was established by using a G-quadruplex-specific single chain variable fragment (scFv) of an antibody to detect G-quadruplexes formed at telomeres in macronuclei of the ciliate Stylonychia lemnae; their formation was shown to be cell cycle-dependent [174] and controlled by phosphorylation of the telomere end-binding proteins TEBPα and TEBPβ [175]. More recently, using the G-quadruplex-specific antibodies BG4 (scFv) [176] and 1H6 (monoclonal antibody) [177], G-quadruplex structures were visualized in human cells at both telomeric and non-telomeric sites on chromosomes, and the number of G4 loci increased after exposure of live cells to G-quadruplex ligands [176] or in the absence of FANCJ, a G-quadruplex DNA-specific helicase [177]. Using BG4 to map endogenous G-quadruplex structures by G4 ChIP-seq in human cells, ~10,000 endogenous G-quadruplex structures were detected in immortalized precancerous HaCaT cells, 10 times more than in normal human NHEK cells [178]. G-quadruplex structures were found to be enriched in nucleosome-depleted regulatory regions, including the promoters (such as c-MYC) and 5′-UTRs of highly transcribed genes. The detected G-quadruplexes in cells account for less than 1% of the genomic G4-sites identified by G4-seq [179] or predicted by G4 algorithms [81], suggesting that the in vivo formation of G-quadruplexes is highly context-dependent. In addition, endogenous potential G4 sites were detected in live human cells by chemical footprinting combined with high-throughput sequencing and were found to be enriched in regions involved in chromatin reorganization and gene transcription [180]. G-quadruplex formation in vivo was also detected by small-molecule probes, such as the radiolabeled G-quadruplex-ligand 360A [181] and the fluorescent G-quadruplex-ligands BMVC [182, 183] and DAOTA-M2 [184].

4 G-Quadruplex-Interactive Small Molecules

Recognition of the biological significance of G-quadruplexes has promoted research and development of G-quadruplex-interactive small molecule ligands (G4-ligands). The identification of genomic G-quadruplex structures in regions of functional importance, such as human telomeres and oncogene promoters, has created the opportunity to selectively target these globular DNA structures for cancer-specific drug development [17, 185–190]. The therapeutic possibilities of targeting telomeric G-quadruplexes to inhibit telomerase were first reported in 1997 [191] and have been actively pursued [185–188]. G-quadruplex-ligands were also shown to inhibit the alternative lengthening of telomeres (ALT) pathway, which maintains telomere stability in a telomerase-independent manner in ~15% of cancer cells [192–195]. The discovery that the perylene derivative PIPER inhibits Sgs1 helicase-mediated G-quadruplex unfolding suggested the existence of a broader mechanism for G-quadruplex-ligands [196]. In 2002, a small molecule that stabilizes the G-quadruplex formed in the c-MYC promoter was shown to inhibit c-MYC expression, suggesting a therapeutic opportunity in targeting promoter G-quadruplexes for transcriptional modulation [54, 197]. Different groups of compounds, such as quindolines and ellipticines, were reported to suppress c-MYC transcription by stabilization of the c-MYC promoter G-quadruplex [198–201]. Subsequently, transcriptional repression of other oncogenes, such as c-KIT [202], BCL-2 [203], and KRAS [204–206], was shown for compounds stabilizing promoter G-quadruplexes. More recently, G-quadruplex-stabilizing compounds were shown to cause DNA damage and genomic instability and to exhibit a synergistic effect with inhibitors or deficiency of DNA-repair mechanisms [207–211]. 
Specifically, G-quadruplex-stabilizing compounds were shown to induce selective lethality in BRCA-deficient cancers by targeting the inherent DNA double-strand break (DSB) repair deficiency [212, 213]. A G-quadruplex targeting drug, Quarfloxin (CX-3543) [115], based on the fluoroquinolone compounds developed by Laurence Hurley [214, 215], showed excellent in vivo activity in various solid tumors and reached Phase II clinical trials. Its second-generation compound, CX-5461, is currently in clinical trials for BRCA1/2 deficient tumors (Canadian trial, NCT02719977) [213]. Diverse families of other small molecule compounds that interact with G-quadruplexes were developed and studied. For example, TMPyP4, a tetra-(N-methyl-4-pyridyl)-porphyrin, is a structure-based designed compound that exhibits significant selectivity for quadruplex DNA over duplex DNA and inhibits telomerase and ALT [216, 217]. Its positional isomer TMPyP2 is a poor G-quadruplex-interactive compound and can be used as a negative control for TMPyP4 [218]. Later studies revealed that TMPyP4 interacts with the c-MYC promoter G-quadruplex and downregulates c-MYC [54, 197]. TMPyP4 and TMPyP2 are among the most widely used molecules in G-quadruplex research. Telomestatin, a natural product isolated from Streptomyces anulatus 3533-SV4, is a highly potent inhibitor of telomerase [219] and is active against human cancers by stabilizing the telomeric G-quadruplex and inhibiting telomere-protein binding [220–223]. BRACO19 is a trisubstituted acridine rationally designed to directly target the telomeric G-quadruplex [224]; it was shown to inhibit telomerase and induce telomere uncapping in human cancer cells [195] and to have high in vivo activity against different cancer xenograft models [225, 226]. 12459 is a triazine G4-ligand that exhibits anti-telomerase activity, and its action also appears to involve BCL-2 and hTERT splicing [192, 227, 228]. The closely related pyridine dicarboxamide derivatives 360A/307A and bisquinolinium compounds Phen-DC(3)/Phen-DC(6) are highly selective G-quadruplex ligands; they were shown to be active against both the telomeric and c-MYC G-quadruplexes and to inhibit c-MYC gene transcription in tumor cells, as well as to bind to a G-quadruplex formed in the 5′-UTR of TRF2 mRNA to repress translation [133, 229–232]. 
Phen-DC was also shown to trigger genetic instability in Saccharomyces cerevisiae [207]. PDS is a pyridostatin compound that shows potent G-quadruplex stabilization and has been widely used to study G-quadruplex functions and G-quadruplex-induced DNA damage [176, 210–212, 233]. In addition, G-quadruplex DNA itself was shown to have potential as a cancer therapeutic. AS1411 (Antisoma, London, UK) is an unmodified 26-nt G-quadruplex-forming oligonucleotide that has been in Phase II trials for treatment of renal cancer and acute myeloid leukemia [234]. G-quadruplex-interactive compounds have contributed immensely to understanding G-quadruplex functions and their potential as therapeutic targets. Different G-quadruplex-ligands show various levels of selectivity, both for G-quadruplex structures over other forms of DNA and among different G-quadruplexes, and this selectivity is likely to be related to their biological activity. Conventional and in silico screening methods as well as structure-based rational drug design were actively pursued in the development of G-quadruplex-targeting small molecules. A common feature among the G-quadruplex-ligands is the presence of a fused ring system that is capable of stacking with the terminal G-tetrads. In addition, a crescent-shaped asymmetric pharmacophore that can recruit a DNA base, and cationic side chain substituents that have the propensity to interact with G-quadruplex grooves, can give rise to specific interactions (Fig. 1d) [235, 236]. Structural data on G-quadruplex-ligand complexes have played an important role in the understanding of small molecule recognition of G-quadruplexes and the design of G-quadruplex-ligands [18, 237]. These data include a handful of NMR solution structures of intramolecular G-quadruplex-ligand complexes, including c-MYC G-quadruplex-ligand complexes [235, 238–240] and telomeric G-quadruplex-ligand complexes [236, 241–243], and X-ray crystallographic structures of intramolecular and intermolecular telomeric G-quadruplex-ligand complexes [43, 237, 244–250].

5 Methods to Study G-Quadruplexes

A wide variety of experimental tools and methods have been utilized or developed for studying G-quadruplex DNA and RNA. These methods play a pivotal role in enabling researchers to gain an understanding of G-quadruplex structures, properties, and functions. The methods commonly used for studying G-quadruplexes include biophysical, biochemical, molecular biology, and cellular methods, as described in this book. Biophysical methods are widely used to study physical properties of G-quadruplexes, such as structure, stability, and binding interactions with ligands and proteins. Circular dichroism (CD) is widely used to study G-quadruplex conformations and stability. Isothermal titration calorimetry (ITC) can directly measure binding enthalpies and provide thermodynamic characterization of G-quadruplex-ligand interactions. Biosensor-surface plasmon resonance (SPR) is a quantitative approach for the study of small molecule and protein ligand-quadruplex nucleic acid interactions in real time. Analytical ultracentrifugation (AUC) can be used to characterize G-quadruplex formation and to monitor ligand binding. Mass spectroscopy can also be used to characterize G-quadruplex structures and ligand binding. Differential scanning calorimetry (DSC) can be used to obtain thermodynamic and sometimes kinetic parameters of G-quadruplexes. X-ray crystallography and solution NMR spectroscopy provide structural information on G-quadruplexes and ligand complexes, while molecular dynamics simulation can also be used to study G-quadruplex structures and small molecule binding.
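
To make the thermodynamic characterization mentioned for ITC concrete, the following sketch applies the standard relations ΔG = −RT ln Ka and ΔG = ΔH − TΔS to turn an association constant and a binding enthalpy into a free energy and an entropic term. The function name and the input values are hypothetical and chosen only for illustration; they are not data from this chapter, and real ITC analysis involves fitting a binding model to the titration curve first.

```python
import math

R = 8.314462618  # gas constant in J/(mol*K)


def binding_thermodynamics(ka_per_molar: float,
                           dh_kj_per_mol: float,
                           temp_k: float = 298.15):
    """Return (dG, -T*dS) in kJ/mol.

    ka_per_molar  : association constant Ka in 1/M (must be > 0)
    dh_kj_per_mol : binding enthalpy dH in kJ/mol
    temp_k        : absolute temperature in K
    """
    if ka_per_molar <= 0 or temp_k <= 0:
        raise ValueError("Ka and temperature must be positive")
    dg = -R * temp_k * math.log(ka_per_molar) / 1000.0  # dG = -RT ln Ka
    minus_tds = dg - dh_kj_per_mol                      # from dG = dH - T*dS
    return dg, minus_tds


if __name__ == "__main__":
    # Hypothetical ligand: Ka = 1e6 1/M, dH = -20 kJ/mol, 25 degrees C.
    dg, minus_tds = binding_thermodynamics(1e6, -20.0)
    print(f"dG = {dg:.1f} kJ/mol, -T*dS = {minus_tds:.1f} kJ/mol")
```

With these illustrative inputs the free energy comes out at roughly −34 kJ/mol, so the remaining −14 kJ/mol or so would be attributed to the entropic term.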

Biochemical and molecular biology methods are used to study G-quadruplex formation, functions, and protein interactions. Electrophoretic mobility shift assay (EMSA), dimethyl sulfate (DMS) footprinting, and DNA polymerase stop (Pol-stop) assay are widely used to study G-quadruplex formation, protein complexes, and ligand interactions. Chromatin immunoprecipitation (ChIP) assays are used to probe protein interactions with G-quadruplex-forming DNA sequences. A combination of biochemical and biophysical methods can be used to monitor co-transcriptional formation of G-quadruplexes (transcription assay) and to quantitatively analyze the effects of G-quadruplex formation on DNA replication (replication assay). Single-molecule methods such as optical and magnetic tweezers, atomic-force microscopy (AFM), and single-molecule fluorescence resonance energy transfer (FRET) microscopy can be used to investigate G-quadruplex conformations, ligand interactions, and protein interactions. In addition, methods are used to discover and develop G-quadruplex-targeting molecules, such as FRET-based high-throughput screening of small molecule ligands, and peptide nucleic acid (PNA) oligomers that are designed to bind to G-quadruplexes. G-quadruplexes are also used in nanoparticle-based assays and as biocatalysts such as G-quadruplex DNAzymes. More recent and exciting developments include in-cell methods to study G-quadruplex formation in vivo, such as in vivo chemical footprinting, G-quadruplex detection and visualization, and in-cell NMR. Chemical probing for G-quadruplex formation inside living cells combined with high-throughput sequencing can provide a snapshot of the DNA conformation over the whole genome in vivo. G4-specific antibodies and fluorescence probes are used to detect and visualize G-quadruplexes in cells. 
NMR spectroscopy is used to study G-quadruplex structures inside living Xenopus laevis oocytes, while 19F NMR can be used to study G-quadruplex conformation in vitro and in living cells. In conclusion, it is our hope that the protocols described herein will be found both informative and useful.

Acknowledgments This research was supported by the National Institutes of Health, R01CA122952, 1R01 GM083117, R01CA177585, and P30CA023168 (Purdue Center for Cancer Research). References 1. Consortium EP (2012) An integrated encyclopedia of DNA elements in the human genome. Nature 489(7414):57–74. https:// doi.org/10.1038/nature11247

2. Fry M (2007) Tetraplex DNA and its interacting proteins. Front Biosci 12:4336–4351 3. Oganesian L, Bryan TM (2007) Physiological relevance of telomeric G-quadruplex

12

Danzhou Yang

formation: a potential drug target. BioEssays 29(2):155–165. https://doi.org/10.1002/ bies.20523 4. Mendoza O, Bourdoncle A, Boule JB, Brosh RM Jr, Mergny JL (2016) G-quadruplexes and helicases. Nucleic Acids Res 44 (5):1989–2006. https://doi.org/10.1093/ nar/gkw079 5. Bra´zda V, Ha´ronı´kova´ L, Liao JCC, Fojta M (2014) DNA and RNA Quadruplex-binding proteins. Int J Molecul Sci 15(10):17493 6. McRae EKS, Booy EP, Padilla-Meier GP, McKenna SA (2017) On characterizing the interactions between proteins and guanine quadruplex structures of nucleic acids. J Nucleic Acids 2017:9675348. https://doi. org/10.1155/2017/9675348 7. Calzolari A, Di Felice R, Molinari E, Garbesi A (2002) G-quartet biomolecular nanowires. Appl Phys Lett 80(18):3331–3333. https:// doi.org/10.1063/1.1476700 8. Yatsunyk LA, Mendoza O, Mergny JL (2014) “Nano-oddities”: unusual nucleic acid assemblies for DNA-based nanostructures and nanodevices. Acc Chem Res 47 (6):1836–1844. https://doi.org/10.1021/ ar500063x 9. Platella C, Riccardi C, Montesarchio D, Roviello GN, Musumeci D (2017) Gquadruplex-based aptamers against protein targets in therapy and diagnostics. Biochim Biophys Acta Gen Subj 1861(5. Pt B):1429–1447. https://doi.org/10.1016/j. bbagen.2016.11.027 10. Ma DL, Wu C, Dong ZZ, Tam WS, Wong SW, Yang C, Li G, Leung CH (2017) The development of G-quadruplex-based assays for the detection of small molecules and toxic substances. Chem Asian J 12 (15):1851–1860. https://doi.org/10.1002/ asia.201700533 11. Rioz-Martinez A, Roelfes G (2015) DNA-based hybrid catalysis. Curr Opin Chem Biol 25:80–87. https://doi.org/10. 1016/j.cbpa.2014.12.033 12. Bang I (1910) Untersuchungen u¨ber die Guanyls€aure. Biochem Z 26:293–311 13. Gellert M, Lipsett MN, Davies DR (1962) Helix formation by guanylic acid. Proc Natl Acad Sci U S A 48:2013–2018 14. Williamson JR, Raghuraman MK, Cech TR (1989) Monovalent cation-induced structure of telomeric DNA: the G-quartet model. Cell 59(5):871–880 15. 
Sen D, Gilbert W (1990) A sodiumpotassium switch in the formation of fourstranded G4-DNA. Nature 344

(6265):410–414. https://doi.org/10.1038/ 344410a0 16. Hud NV, Smith FW, Anet FAL, Feigon J (1996) The selectivity for K+ versus Na+ in DNA quadruplexes is dominated by relative free energies of hydration: a thermodynamic analysis by 1H NMR. Biochemistry 35 (48):15383–15390 17. Yang D, Okamoto K (2010) Structural insights into G-quadruplexes: towards new anticancer drugs. Future Med Chem 2 (4):619–646 18. Chen Y, Yang DZ (2012) Sequence, stability, and structure of G-quadruplexes and their interactions with drugs. Curr Protoc Nucl Acid Chem 50:17.15.11–17.15.17 19. Blackburn EH (2000) Telomere states and cell fates. Nature 408(6808):53–56. https:// doi.org/10.1038/35040500 20. van Steensel B, Smogorzewska A, de Lange T (1998) TRF2 protects human telomeres from end-to-end fusions. Cell 92(3):401–413 21. Hackett JA, Feldser DM, Greider CW (2001) Telomere dysfunction increases mutation rate and genomic instability. Cell 106(3):275–286 22. Henderson E, Hardin CC, Walk SK, Tinoco I, Blackburn EH (1987) Telomeric DNA oligonucleotides form novel intramolecular structures containing guanine-guanine base pairs. Cell 51(6):899–908 23. Sundquist WI, Klug A (1989) Telomeric DNA dimerizes by formation of guanine tetrads between hairpin loops. Nature 342 (6251):825–829. https://doi.org/10.1038/ 342825a0 24. Moyzis RK, Buckingham JM, Cram LS, Dani M, Deaven LL, Jones MD, Meyne J, Ratliff RL, Wu JR (1988) A highly conserved repetitive DNA sequence, (TTAGGG)n, present at the telomeres of human chromosomes. Proc Natl Acad Sci U S A 85(18):6622–6626 25. Greider CW, Blackburn EH (1985) Identification of a specific telomere terminal transferase activity in Tetrahymena extracts. Cell 43 (2. Pt 1):405–413 26. Kim NW, Piatyszek MA, Prowse KR, Harley CB, West MD, Ho PL, Coviello GM, Wright WE, Weinrich SL, Shay JW (1994) Specific association of human telomerase activity with immortal cells and cancer. Science 266 (5193):2011–2015 27. Hanahan D, Weinberg RA (2000) The hallmarks of cancer. 
Cell 100(1):57–70 28. Zahler AM, Williamson JR, Cech TR, Prescott DM (1991) Inhibition of telomerase by G-quartet DNA structures. Nature 350

G-Quadruplex DNA and RNA (6320):718–720. https://doi.org/10.1038/ 350718a0 29. Griffith JD, Comeau L, Rosenfield S, Stansel RM, Bianchi A, Moss H, de Lange T (1999) Mammalian telomeres end in a large duplex loop. Cell 97(4):503–514 30. Stansel RM, de Lange T, Griffith JD (2001) T-loop assembly in vitro involves binding of TRF2 near the 30 telomeric overhang. EMBO J 20(19):5532–5540 31. Azzalin CM, Reichenbach P, Khoriauli L, Giulotto E, Lingner J (2007) Telomeric repeat–containing RNA and RNA surveillance factors at mammalian chromosome ends. Science 318(5851):798–801 32. Schoeftner S, Blasco MA (2008) Developmentally regulated transcription of mammalian telomeres by DNA-dependent RNA polymerase II. Nat Cell Biol 10(2):228–236. https://doi.org/10.1038/ncb1685 33. Ambrus A, Chen D, Dai J, Bialis T, Jones RA, Yang D (2006) Human telomeric sequence forms a hybrid-type intramolecular G-quadruplex structure with mixed parallel/ antiparallel strands in potassium solution. Nucleic Acids Res 34(9):2723–2735. https://doi.org/10.1093/nar/gkl348 34. Dai J, Punchihewa C, Ambrus A, Chen D, Jones RA, Yang DZ (2007) Structure of the intramolecular human telomeric G-quadruplex in potassium solution: a novel adenine triple formation. Nucleic Acids Res 35(7):2440–2450 35. Dai J, Carver M, Punchihewa C, Jones RA, Yang DZ (2007) Structure of the Hybrid2 type intramolecular human telomeric G-quadruplex in K+ solution: insights into structure polymorphism of the human telomeric sequence. Nucleic Acids Res 35 (15):4927–4940 36. Xu Y, Noguchi Y, Sugiyama H (2006) The new models of the human telomere d[AGGG (TTAGGG)3] in K+ solution. Bioorg Med Chem 14(16):5584–5591 37. Luu KN, Phan AT, Kuryavyi V, Lacroix L, Patel DJ (2006) Structure of the human telomere in K+ solution: an intramolecular (3+1) G-quadruplex scaffold. J Am Chem Soc 128 (30):9963–9970 38. Phan AT, Luu KN, Patel DJ (2006) Different loop arrangements of intramolecular human telomeric (3+1) G-quadruplexes in K+ solution. 
Nucleic Acids Res 34(19):5715–5719 39. Phan AT, Kuryavyi V, Luu KN, Patel DJ (2007) Structure of two intramolecular G-quadruplexes formed by natural human telomere sequences in K+ solution. Nucleic Acids Res 35(19):6517–6525

13

40. Lim KW, Amrane S, Bouaziz S, Xu WX, Mu YG, Patel DJ, Luu KN, Phan AT (2009) Structure of the human telomere in K+ solution: a stable basket-type G-quadruplex with only two G-tetrad layers. J Am Chem Soc 131 (12):4301–4309 41. Zhang Z, Dai J, Veliath E, Jones RA, Yang DZ (2010) Structure of a two-G-tetrad intramolecular G-quadruplex formed by a variant human telomeric sequence in K+ solution: insights into the interconversion of human telomeric G-quadruplex structures. Nucleic Acids Res 38(3):1009–1021 42. Hansel R, Loehr F, Trantirek L, Doetsch V (2013) High-resolution insights into G-overhang architecture. J Am Chem Soc 135(7):2816–2824 43. Parkinson GN, Lee MPH, Neidle S (2002) Crystal structure of parallel quadruplexes from human telomeric DNA. Nature 417 (6891):876–880 44. Wang Y, Patel DJ (1993) Solution structure of the human telomeric repeat d[AG3 (T2AG3)3] G-tetraplex. Structure 1 (4):263–282 45. Petraccone L, Spink C, Trent JO, Garbett NC, Mekmaysy CS, Giancola C, Chaires JB (2011) Structure and stability of higher-order human telomeric quadruplexes. J Am Chem Soc 133(51):20951–20961. https://doi. org/10.1021/ja209192a 46. Gray RD, Chaires JB (2008) Kinetics and mechanism of K+- and Na+induced folding of models of human telomeric DNA into G-quadruplex structures. Nucleic Acids Res 36(12):4191–4203 47. Rajendran A, Endo M, Hidaka K, Sugiyama H (2014) Direct and single-molecule visualization of the solution-state structures of G-hairpin and G-triplex intermediates. Angew Chem Int Ed Engl 53 (16):4107–4112. https://doi.org/10.1002/ anie.201308903 48. Mashimo T, Yagi H, Sannohe Y, Rajendran A, Sugiyama H (2010) Folding pathways of human telomeric type-1 and type-2G-quadruplex structures. J Am Chem Soc 132(42):14910–14918. https://doi.org/10. 1021/ja105806u 49. Dai J, Carver M, Yang D (2008) Polymorphism of human telomeric quadruplex structures. Biochimie 90(8):1172–1183. https:// doi.org/10.1016/j.biochi.2008.02.026 50. 
Xu Y, Kaminaga K, Komiyama M (2008) G-quadruplex formation by human telomeric repeats-containing RNA in Na+ solution. J Am Chem Soc 130(33):11179–11184. https://doi.org/10.1021/ja8031532

14

Danzhou Yang

51. Collie GW, Haider SM, Neidle S, Parkinson GN (2010) A crystallographic and modelling study of a human telomeric RNA (TERRA) quadruplex. Nucleic Acids Res 38(16):5569–5580. https://doi.org/10.1093/nar/gkq259
52. Martadinata H, Phan AT (2013) Structure of human telomeric RNA (TERRA): stacking of two G-quadruplex blocks in K(+) solution. Biochemistry 52(13):2176–2183. https://doi.org/10.1021/bi301606u
53. Qin Y, Hurley LH (2008) Structures, folding patterns, and functions of intramolecular DNA G-quadruplexes found in eukaryotic promoter regions. Biochimie 90(8):1149–1171. https://doi.org/10.1016/j.biochi.2008.02.020
54. Siddiqui-Jain A, Grand CL, Bearss DJ, Hurley LH (2002) Direct evidence for a G-quadruplex in a promoter region and its targeting with a small molecule to repress c-MYC transcription. Proc Natl Acad Sci U S A 99(18):11593–11598. https://doi.org/10.1073/pnas.182256799
55. Larsen A, Weintraub H (1982) An altered DNA conformation detected by S1 nuclease occurs at specific regions in active chick globin chromatin. Cell 29(2):609–622
56. Wood WI, Felsenfeld G (1982) Chromatin structure of the chicken beta-globin gene region - sensitivity to Dnase-I, micrococcal nuclease, and Dnase-II. J Biol Chem 257(13):7730–7736
57. Woodford KJ, Howell RM, Usdin K (1994) A novel K+-dependent DNA-synthesis arrest site in a commonly occurring sequence motif in eukaryotes. J Biol Chem 269(43):27029–27035
58. Hammond-Kosack MC, Dobrinski B, Lurz R, Docherty K, Kilpatrick MW (1992) The human insulin gene linked polymorphic region exhibits an altered DNA structure. Nucleic Acids Res 20(2):231–236
59. Simonsson T, Pecinka P, Kubista M (1998) DNA tetraplex formation in the control region of c-myc. Nucleic Acids Res 26(5):1167–1172
60. Sun D, Guo K, Rusche JJ, Hurley LH (2005) Facilitation of a structural transition in the polypurine/polypyrimidine tract within the proximal promoter region of the human VEGF gene by the presence of potassium and G-quadruplex-interactive agents. Nucleic Acids Res 33(18):6070–6080. https://doi.org/10.1093/nar/gki917
61. Guo K, Gokhale V, Hurley LH, Sun D (2008) Intramolecularly folded G-quadruplex and i-motif structures in the proximal promoter of the vascular endothelial growth factor gene. Nucleic Acids Res 36(14):4598–4608
62. De Armond R, Wood S, Sun D, Hurley LH, Ebbinghaus SW (2005) Evidence for the presence of a guanine quadruplex forming region within a polypurine tract of the hypoxia inducible factor 1alpha promoter. Biochemistry 44(49):16341–16350. https://doi.org/10.1021/bi051618u
63. Dai J, Dexheimer TS, Chen D, Carver M, Ambrus A, Jones RA, Yang DZ (2006) An intramolecular G-quadruplex structure with mixed parallel/antiparallel G-strands formed in the human BCL-2 promoter region in solution. J Am Chem Soc 128(4):1096–1098
64. Dexheimer TS, Sun D, Hurley LH (2006) Deconvoluting the structural and drug-recognition complexity of the G-quadruplex-forming region upstream of the bcl-2 P1 promoter. J Am Chem Soc 128(16):5404–5415
65. Onel B, Carver M, Wu G, Timonina D, Kalarn S, Larriva M, Yang D (2016) A new G-quadruplex with hairpin loop immediately upstream of the human BCL2 P1 promoter modulates transcription. J Am Chem Soc 138(8):2563–2570
66. Yafe A, Etzioni S, Weisman-Shomer P, Fry M (2005) Formation and properties of hairpin and tetraplex structures of guanine-rich regulatory sequences of muscle-specific genes. Nucleic Acids Res 33(9):2887–2900
67. Cogoi S, Paramasivam M, Spolaore B, Xodo LE (2008) Structural polymorphism within a regulatory element of the human KRAS promoter: formation of G4-DNA recognized by nuclear proteins. Nucleic Acids Res 36(11):3765–3780
68. Cogoi S, Xodo LE (2006) G-quadruplex formation within the promoter of the KRAS proto-oncogene and its effect on transcription. Nucleic Acids Res 34(9):2536–2549. https://doi.org/10.1093/nar/gkl286
69. Rankin S, Reszka AP, Huppert J, Zloh M, Parkinson GN, Todd AK, Ladame S, Balasubramanian S, Neidle S (2005) Putative DNA quadruplex formation within the human c-kit oncogene. J Am Chem Soc 127(30):10584–10589
70. Fernando H, Reszka AP, Huppert J, Ladame S, Rankin S, Venkitaraman AR, Neidle S, Balasubramanian S (2006) A conserved quadruplex motif located in a transcription activation site of the human c-kit oncogene. Biochemistry 45(25):7854–7860. https://doi.org/10.1021/bi0601510
71. Guo K, Pourpak A, Beetz-Rogers K, Gokhale V, Sun D, Hurley LH (2007) Formation of pseudosymmetrical G-quadruplex and i-motif structures in the proximal promoter region of the RET oncogene. J Am Chem Soc 129(33):10220–10228
72. Qin Y, Rezler EM, Gokhale V, Sun D, Hurley LH (2007) Characterization of the G-quadruplexes in the duplex nuclease hypersensitive element of the PDGF-A promoter and modulation of PDGF-A promoter activity by TMPyP4. Nucleic Acids Res 35(22):7698–7713
73. Palumbo SL, Memmott RM, Uribe DJ, Krotova-Khan Y, Hurley LH, Ebbinghaus SW (2008) A novel G-quadruplex-forming GGA repeat region in the c-myb promoter is a critical regulator of promoter activity. Nucleic Acids Res 36(6):1755–1769
74. Palumbo SL, Ebbinghaus SW, Hurley LH (2009) Formation of a unique end-to-end stacked pair of G-quadruplexes in the hTERT core promoter with implications for inhibition of telomerase by G-quadruplex-interactive ligands. J Am Chem Soc 131(31):10878–10891
75. Qin Y, Fortin JS, Tye D, Gleason-Guzman M, Brooks TA, Hurley LH (2010) Molecular cloning of the human platelet-derived growth factor receptor beta (PDGFR-beta) promoter and drug targeting of the G-quadruplex-forming region to repress PDGFR-beta expression. Biochemistry 49(19):4208–4219. https://doi.org/10.1021/bi100330w
76. Brown RV, Wang T, Chappeta VR, Wu G, Onel B, Chawla R, Quijada H, Camp SM, Chiang ET, Lassiter QR (2017) The consequences of overlapping G-quadruplexes and i-motifs in the platelet-derived growth factor receptor β core promoter nuclease hypersensitive element can explain the unexpected effects of mutations and provide opportunities for selective targeting of both structures by small molecules to downregulate gene expression. J Am Chem Soc 139(22):7456–7475
77. Etzioni S, Yafe A, Khateb S, Weisman-Shomer P, Bengal E, Fry M (2005) Homodimeric MyoD preferentially binds tetraplex structures of regulatory sequences of muscle-specific genes. J Biol Chem 280(29):26805–26812
78. Rustighi A, Tessari MA, Vascotto F, Sgarra R, Giancotti V, Manfioletti G (2002) A polypyrimidine/polypurine tract within the Hmga2 minimal promoter: a common feature of many growth-related genes. Biochemistry 41(4):1229–1240
79. Hurley LH, Siddiqui-Jain A (2005) Developing therapeutics to target oncogenes. Genet Eng News 25(1):26

80. Eddy J, Maizels N (2006) Gene function correlates with potential for G4 DNA formation in the human genome. Nucleic Acids Res 34(14):3887–3896
81. Huppert JL, Balasubramanian S (2007) G-quadruplexes in promoters throughout the human genome. Nucleic Acids Res 35(2):406–413
82. Kouzine F, Sanford S, Elisha-Feil Z, Levens D (2008) The functional response of upstream DNA to dynamic supercoiling in vivo. Nat Struct Mol Biol 15(2):146–154
83. Kouzine F, Levens D (2007) Supercoil-driven DNA structures regulate genetic transactions. Front Biosci 12:4409–4423
84. Sun D, Hurley LH (2009) The importance of negative superhelicity in inducing the formation of G-quadruplex and i-motif structures in the c-Myc promoter: implications for drug targeting and control of gene expression. J Medicin Chem 52(9):2863–2874
85. Zheng KW, He YD, Liu HH, Li XM, Hao YH, Tan Z (2017) Superhelicity constrains a localized and R-loop-dependent formation of G-quadruplexes at the upstream region of transcription. ACS Chem Biol 12(10):2609–2618. https://doi.org/10.1021/acschembio.7b00435
86. Brooks TA, Hurley LH (2009) The role of supercoiling in transcriptional control of MYC and its importance in molecular therapeutics. Nat Rev Cancer 9(12):849–861. https://doi.org/10.1038/nrc2733
87. Michelotti EF, Tomonaga T, Krutzsch H, Levens D (1995) Cellular nucleic acid binding protein regulates the CT element of the human c-myc protooncogene. J Biol Chem 270(16):9494–9499
88. Sakatsume O, Tsutsui H, Wang Y, Gao H, Tang X, Yamauchi T, Murata T, Itakura K, Yokoyama KK (1996) Binding of THZif-1, a MAZ-like zinc finger protein to the nuclease-hypersensitive element in the promoter region of the c-MYC protooncogene. J Biol Chem 271(49):31322–31333
89. Tomonaga T, Levens D (1996) Activating transcription from single stranded DNA. Proc Natl Acad Sci U S A 93(12):5830–5835
90. Berberich SJ, Postel EH (1995) PuF/NM23-H2/NDPK-B transactivates a human c-myc promoter-CAT gene via a functional nuclease hypersensitive element. Oncogene 10(12):2343–2347
91. Michelotti GA, Michelotti EF, Pullner A, Duncan RC, Eick D, Levens D (1996) Multiple single-stranded cis elements are associated with activated chromatin of the human c-myc gene in vivo. Molecul Cell Biol 16(6):2656–2669
92. Seenisamy J, Rezler EM, Powell TJ, Tye D, Gokhale V, Joshi CS, Siddiqui-Jain A, Hurley LH (2004) The dynamic character of the G-quadruplex element in the c-MYC promoter and modification by TMPyP4. J Am Chem Soc 126(28):8702–8709. https://doi.org/10.1021/ja040022b
93. Phan AT, Modi YS, Patel DJ (2004) Propeller-type parallel-stranded G-quadruplexes in the human c-myc promoter. J Am Chem Soc 126(28):8710–8716. https://doi.org/10.1021/ja048805k
94. Ambrus A, Chen D, Dai J, Jones RA, Yang D (2005) Solution structure of the biologically relevant G-quadruplex element in the human c-MYC promoter. Implications for G-quadruplex stabilization. Biochemistry 44(6):2048–2058. https://doi.org/10.1021/bi048242p
95. Mathad RI, Hatzakis E, Dai J, Yang DZ (2011) C-MYC promoter G-quadruplex formed at the 5′-end of NHE III1 element: insights into biological relevance and parallel-stranded G-quadruplex stability. Nucleic Acids Res 39(20):9023–9033
96. Dai J, Chen D, Jones RA, Hurley LH, Yang DZ (2006) NMR solution structure of the major G-quadruplex structure formed in the human BCL2 promoter region. Nucleic Acids Res 34(18):5133–5144
97. Agrawal P, Lin C, Mathad RI, Carver M, Yang D (2014) The major G-quadruplex formed in the human BCL-2 proximal promoter adopts a parallel structure with a 13-nt loop in K+ solution. J Am Chem Soc 136(5):1750–1753. https://doi.org/10.1021/ja4118945
98. Kerkour A, Marquevielle J, Ivashchenko S, Yatsunyk LA, Mergny JL, Salgado GF (2017) High-resolution three-dimensional NMR structure of the KRAS proto-oncogene promoter reveals key features of a G-quadruplex involved in transcriptional regulation. J Biol Chem 292(19):8082–8091. https://doi.org/10.1074/jbc.M117.781906
99. Phan AT, Kuryavyi V, Burge S, Neidle S, Patel DJ (2007) Structure of an unprecedented G-quadruplex scaffold in the human c-kit promoter. J Am Chem Soc 129(14):4386–4392
100. Hsu ST, Varnai P, Bugaut A, Reszka AP, Neidle S, Balasubramanian S (2009) A G-rich sequence within the c-kit oncogene promoter forms a parallel G-quadruplex having asymmetric G-tetrad dynamics. J Am Chem Soc 131(37):13399–13409. https://doi.org/10.1021/ja904007p
101. Kuryavyi V, Phan AT, Patel DJ (2010) Solution structures of all parallel-stranded monomeric and dimeric G-quadruplex scaffolds of the human c-kit2 promoter. Nucleic Acids Res 38(19):6757–6773
102. Agrawal P, Hatzakis E, Guo K, Carver M, Yang D (2013) Solution structure of the major G-quadruplex formed in the human VEGF promoter in K+: insights into loop interactions of the parallel G-quadruplexes. Nucleic Acids Res 41(22):10584–10592. https://doi.org/10.1093/nar/gkt784
103. Chen Y, Agrawal P, Brown RV, Hatzakis E, Hurley L, Yang D (2012) The major G-quadruplex formed in the human platelet-derived growth factor receptor beta promoter adopts a novel broken-strand structure in K+ solution. J Am Chem Soc 134(32):13220–13223
104. Onel B, Carver M, Agrawal P, Hurley LH, Yang D (2018) The 3′-end region of the human PDGFR-beta core promoter nuclease hypersensitive element forms a mixture of two unique end-insertion G-quadruplexes. Biochim Biophys Acta Gen Subj 1862(4):846–854. https://doi.org/10.1016/j.bbagen.2017.12.011
105. Guédin A, Gros J, Alberti P, Mergny J-L (2010) How long is too long? Effects of loop size on G-quadruplex stability. Nucleic Acids Res 38(21):7858–7868. https://doi.org/10.1093/nar/gkq639
106. Cogoi S, Paramasivam M, Filichev V, Geci I, Pedersen EB, Xodo LE (2009) Identification of a new G-quadruplex motif in the KRAS promoter and design of pyrene-modified G4-decoys with antiproliferative activity in pancreatic cancer cells. J Medicin Chem 52(2):564–568
107. Paramasivam M, Cogoi S, Xodo LE (2011) Primer extension reactions as a tool to uncover folding motifs within complex G-rich sequences: analysis of the human KRAS NHE. Chem Commun 47(17):4965–4967
108. Kaiser CE, Van Ert NA, Agrawal P, Chawla R, Yang D, Hurley LH (2017) Insight into the complexity of the i-motif and G-quadruplex DNA structures formed in the KRAS promoter and subsequent drug-induced gene repression. J Am Chem Soc 139(25):8522–8536
109. Lim KW, Lacroix L, Yue DJ, Lim JK, Lim JM, Phan AT (2010) Coexistence of two distinct G-quadruplex conformations in the hTERT promoter. J Am Chem Soc 132(35):12331–12342. https://doi.org/10.1021/ja101252n
110. Chaires JB, Trent JO, Gray RD, Dean WL, Buscaglia R, Thomas SD, Miller DM (2014) An improved model for the hTERT promoter quadruplex. PLoS One 9(12):e115580. https://doi.org/10.1371/journal.pone.0115580
111. Kang HJ, Cui Y, Yin H, Scheid A, Hendricks WP, Schmidt J, Sekulic A, Kong D, Trent JM, Gokhale V, Mao H, Hurley LH (2016) A pharmacological chaperone molecule induces cancer cell death by restoring tertiary DNA structures in mutant hTERT promoters. J Am Chem Soc 138(41):13673–13692. https://doi.org/10.1021/jacs.6b07598
112. Sen D, Gilbert W (1988) Formation of parallel four-stranded complexes by guanine-rich motifs in DNA and its implications for meiosis. Nature 334(6180):364–366
113. Duquette ML, Handa P, Vincent JA, Taylor AF, Maizels N (2004) Intracellular transcription of G-rich DNAs induces formation of G-loops, novel structures containing G4 DNA. Genes Dev 18(13):1618–1629
114. Larson ED, Duquette ML, Cummings WJ, Streiff RJ, Maizels N (2005) MutS[alpha] binds to and promotes synapsis of transcriptionally activated immunoglobulin switch regions. Curr Biol 15(5):470–474
115. Drygin D, Siddiqui-Jain A, O’Brien S, Schwaebe M, Lin A, Bliesath J, Ho CB, Proffitt C, Trent K, Whitten JP, Lim JK, Von Hoff D, Anderes K, Rice WG (2009) Anticancer activity of CX-3543: a direct inhibitor of rRNA biogenesis. Cancer Res 69(19):7653–7661. https://doi.org/10.1158/0008-5472.CAN-09-1304
116. Wanrooij PH, Uhler JP, Simonsson T, Falkenberg M, Gustafsson CM (2010) G-quadruplex structures in RNA stimulate mitochondrial transcription termination and primer formation. Proc Natl Acad Sci U S A 107(37):16072–16077. https://doi.org/10.1073/pnas.1006026107
117. Wanrooij PH, Uhler JP, Shi Y, Westerlund F, Falkenberg M, Gustafsson CM (2012) A hybrid G-quadruplex structure formed between RNA and DNA explains the extraordinary stability of the mitochondrial R-loop. Nucleic Acids Res 40(20):10334–10344. https://doi.org/10.1093/nar/gks802
118. Bharti SK, Sommers JA, Zhou J, Kaplan DL, Spelbrink JN, Mergny JL, Brosh RM Jr (2014) DNA sequences proximal to human mitochondrial DNA deletion breakpoints prevalent in human disease form G-quadruplexes, a class of DNA structures inefficiently unwound by the mitochondrial replicative twinkle helicase. J Biol Chem 289(43):29975–29993. https://doi.org/10.1074/jbc.M114.567073
119. Huang WC, Tseng TY, Chen YT, Chang CC, Wang ZF, Wang CL, Hsu TN, Li PT, Chen CT, Lin JJ, Lou PJ, Chang TC (2015) Direct evidence of mitochondrial G-quadruplex DNA by using fluorescent anti-cancer agents. Nucleic Acids Res 43(21):10102–10113. https://doi.org/10.1093/nar/gkv1061
120. Besnard E, Babled A, Lapasset L, Milhavet O, Parrinello H, Dantec C, Marin JM, Lemaitre JM (2012) Unraveling cell type-specific and reprogrammable human replication origin signatures associated with G-quadruplex consensus motifs. Nat Struct Mol Biol 19(8):837–844. https://doi.org/10.1038/nsmb.2339
121. Howell R, Usdin K (1997) The ability to form intrastrand tetraplexes is an evolutionarily conserved feature of the 3′ end of L1 retrotransposons. Molecul Biol Evol 14(2):144–155. https://doi.org/10.1093/oxfordjournals.molbev.a025747
122. Lexa M, Steflova P, Martinek T, Vorlickova M, Vyskot B, Kejnovsky E (2014) Guanine quadruplexes are formed by specific regions of human transposable elements. BMC Genomics 15:1032. https://doi.org/10.1186/1471-2164-15-1032
123. Sahakyan AB, Murat P, Mayer C, Balasubramanian S (2017) G-quadruplex structures within the 3′ UTR of LINE-1 elements stimulate retrotransposition. Nat Struct Mol Biol 24(3):243–247. https://doi.org/10.1038/nsmb.3367
124. Zhang JY, Zheng KW, Xiao S, Hao YH, Tan Z (2014) Mechanism and manipulation of DNA:RNA hybrid G-quadruplex formation in transcription of G-rich DNA. J Am Chem Soc 136(4):1381–1390. https://doi.org/10.1021/ja4085572
125. Fry M, Loeb LA (1994) The fragile-X syndrome D(Cgg)(N) nucleotide repeats form a stable tetrahelical structure. Proc Natl Acad Sci U S A 91(11):4950–4954
126. Usdin K, Woodford KJ (1995) Cgg repeats associated with DNA instability and chromosome fragility form structures that block DNA-synthesis in-vitro. Nucleic Acids Res 23(20):4202–4209. https://doi.org/10.1093/nar/23.20.4202
127. Brown V, Jin P, Ceman S, Darnell JC, O’Donnell WT, Tenenbaum SA, Jin X, Feng Y, Wilkinson KD, Keene JD, Darnell RB, Warren ST (2001) Microarray identification of FMRP-associated brain mRNAs and altered mRNA translational profiles in fragile X syndrome. Cell 107(4):477–487
128. Haeusler AR, Donnelly CJ, Periz G, Simko EA, Shaw PG, Kim MS, Maragakis NJ, Troncoso JC, Pandey A, Sattler R, Rothstein JD, Wang J (2014) C9orf72 nucleotide repeat structures initiate molecular cascades of disease. Nature 507(7491):195–200. https://doi.org/10.1038/nature13124
129. Bugaut A, Balasubramanian S (2012) 5′-UTR RNA G-quadruplexes: translation regulation and targeting. Nucleic Acids Res 40(11):4727–4741. https://doi.org/10.1093/nar/gks068
130. Beaudoin J-D, Perreault J-P (2010) 5′-UTR G-quadruplex structures acting as translational repressors. Nucleic Acids Res 38(20):7022–7036
131. Kumari S, Bugaut A, Huppert JL, Balasubramanian S (2007) An RNA G-quadruplex in the 5′ UTR of the NRAS proto-oncogene modulates translation. Nat Chem Biol 3(4):218–221
132. Arora A, Dutkiewicz M, Scaria V, Hariharan M, Maiti S, Kurreck J (2008) Inhibition of translation in living eukaryotic cells by an RNA G-quadruplex motif. RNA 14(7):1290–1296. https://doi.org/10.1261/rna.1001708
133. Gomez D, Guédin A, Mergny JL, Salles B, Riou JF, Teulade-Fichou MP, Calsou P (2010) A G-quadruplex structure within the 5′-UTR of TRF2 mRNA represses translation in human cells. Nucleic Acids Res 38(20):7187–7198. https://doi.org/10.1093/nar/gkq563
134. Huang W, Smaldino PJ, Zhang Q, Miller LD, Cao P, Stadelman K, Wan M, Giri B, Lei M, Nagamine Y (2011) Yin Yang 1 contains G-quadruplex structures in its promoter and 5′-UTR and its expression is modulated by G4 resolvase 1. Nucleic Acids Res 40(3):1033–1049
135. Morris MJ, Negishi Y, Pazsint C, Schonhoft JD, Basu S (2010) An RNA G-quadruplex is essential for cap-independent translation initiation in human VEGF IRES. J Am Chem Soc 132(50):17831–17839. https://doi.org/10.1021/ja106287x
136. Christiansen J, Kofod M, Nielsen FC (1994) A guanosine quadruplex and two stable hairpins flank a major cleavage site in insulin-like growth factor II mRNA. Nucleic Acids Res 22(25):5709–5716
137. Wieland M, Hartig JS (2007) RNA quadruplex-based modulation of gene expression. Chem Biol 14(7):757–763

138. Beaudoin JD, Perreault JP (2013) Exploring mRNA 3′-UTR G-quadruplexes: evidence of roles in both alternative polyadenylation and mRNA shortening. Nucleic Acids Res 41(11):5898–5911. https://doi.org/10.1093/nar/gkt265
139. Marcel V, Tran PL, Sagne C, Martel-Planche G, Vaslin L, Teulade-Fichou MP, Hall J, Mergny JL, Hainaut P, Van Dyck E (2011) G-quadruplex structures in TP53 intron 3: role in alternative splicing and in production of p53 mRNA isoforms. Carcinogenesis 32(3):271–278. https://doi.org/10.1093/carcin/bgq253
140. Weldon C, Dacanay JG, Gokhale V, Boddupally PVL, Behm-Ansmant I, Burley GA, Branlant C, Hurley LH, Dominguez C, Eperon IC (2018) Specific G-quadruplex ligands modulate the alternative splicing of Bcl-X. Nucleic Acids Res 46(2):886–896. https://doi.org/10.1093/nar/gkx1122
141. Ribeyre C, Lopes J, Boule JB, Piazza A, Guedin A, Zakian VA, Mergny JL, Nicolas A (2009) The yeast Pif1 helicase prevents genomic instability caused by G-quadruplex-forming CEB1 sequences in vivo. PLoS Genet 5(5):e1000475. https://doi.org/10.1371/journal.pgen.1000475
142. Paeschke K, Bochman ML, Garcia PD, Cejka P, Friedman KL, Kowalczykowski SC, Zakian VA (2013) Pif1 family helicases suppress genome instability at G-quadruplex motifs. Nature 497(7450):458–462. https://doi.org/10.1038/nature12149
143. Zhao J, Bacolla A, Wang G, Vasquez KM (2010) Non-B DNA structure-induced genetic instability and evolution. Cell Mol Life Sci 67(1):43–62. https://doi.org/10.1007/s00018-009-0131-2
144. Harris LM, Merrick CJ (2015) G-quadruplexes in pathogens: a common route to virulence control? PLoS Pathog 11(2):e1004562. https://doi.org/10.1371/journal.ppat.1004562
145. Ruggiero E, Richter SN (2018) G-quadruplexes and G-quadruplex ligands: targets and tools in antiviral therapy. Nucleic Acids Res 46(7):3270–3283. https://doi.org/10.1093/nar/gky187
146. Metifiot M, Amrane S, Litvak S, Andreola ML (2014) G-quadruplexes in viruses: function and potential therapeutic applications. Nucleic Acids Res 42(20):12352–12366. https://doi.org/10.1093/nar/gku999
147. Sundquist WI, Heaphy S (1993) Evidence for interstrand quadruplex formation in the dimerization of human immunodeficiency virus 1 genomic RNA. Proc Natl Acad Sci U S A 90(8):3393–3397
148. Perrone R, Nadai M, Frasson I, Poe JA, Butovskaya E, Smithgall TE, Palumbo M, Palu G, Richter SN (2013) A dynamic G-quadruplex region regulates the HIV-1 long terminal repeat promoter. J Med Chem 56(16):6521–6530. https://doi.org/10.1021/jm400914r
149. Perrone R, Nadai M, Poe JA, Frasson I, Palumbo M, Palu G, Smithgall TE, Richter SN (2013) Formation of a unique cluster of G-quadruplex structures in the HIV-1 Nef coding region: implications for antiviral activity. PLoS One 8(8):e73121. https://doi.org/10.1371/journal.pone.0073121
150. Piekna-Przybylska D, Sullivan MA, Sharma G, Bambara RA (2014) U3 region in the HIV-1 genome adopts a G-quadruplex structure in its RNA and DNA sequence. Biochemistry 53(16):2581–2593. https://doi.org/10.1021/bi4016692
151. Norseen J, Johnson FB, Lieberman PM (2009) Role for G-quadruplex RNA binding by Epstein-Barr virus nuclear antigen 1 in DNA replication and metaphase chromosome attachment. J Virol 83(20):10336–10346. https://doi.org/10.1128/JVI.00747-09
152. Artusi S, Nadai M, Perrone R, Biasolo MA, Palu G, Flamand L, Calistri A, Richter SN (2015) The herpes simplex virus-1 genome contains multiple clusters of repeated G-quadruplex: implications for the antiviral activity of a G-quadruplex ligand. Antivir Res 118:123–131. https://doi.org/10.1016/j.antiviral.2015.03.016
153. Biswas B, Kandpal M, Jauhari UK, Vivekanandan P (2016) Genome-wide analysis of G-quadruplexes in herpesvirus genomes. BMC Genomics 17(1):949. https://doi.org/10.1186/s12864-016-3282-1
154. Madireddy A, Purushothaman P, Loosbroock CP, Robertson ES, Schildkraut CL, Verma SC (2016) G-quadruplex-interacting compounds alter latent DNA replication and episomal persistence of KSHV. Nucleic Acids Res 44(8):3675–3694. https://doi.org/10.1093/nar/gkw038
155. Gilbert-Girard S, Gravel A, Artusi S, Richter SN, Wallaschek N, Kaufer BB, Flamand L (2017) Stabilization of telomere G-quadruplexes interferes with human herpesvirus 6A chromosomal integration. J Virol 91(14):00402–00417. https://doi.org/10.1128/JVI.00402-17
156. Tluckova K, Marusic M, Tothova P, Bauer L, Sket P, Plavec J, Viglasky V (2013) Human papillomavirus G-quadruplexes. Biochemistry 52(41):7207–7216. https://doi.org/10.1021/bi400897g
157. Wang SR, Min YQ, Wang JQ, Liu CX, Fu BS, Wu F, Wu LY, Qiao ZX, Song YY, Xu GH, Wu ZG, Huang G, Peng NF, Huang R, Mao WX, Peng S, Chen YQ, Zhu Y, Tian T, Zhang XL, Zhou X (2016) A highly conserved G-rich consensus sequence in hepatitis C virus core gene represents a new anti-hepatitis C target. Sci Adv 2(4):e1501535. https://doi.org/10.1126/sciadv.1501535
158. Fleming AM, Ding Y, Alenko A, Burrows CJ (2016) Zika virus genomic RNA possesses conserved G-quadruplexes characteristic of the Flaviviridae family. ACS Infect Dis 2(10):674–681. https://doi.org/10.1021/acsinfecdis.6b00109
159. Krafcikova P, Demkovicova E, Viglasky V (2017) Ebola virus derived G-quadruplexes: thiazole orange interaction. BBA-Gen Subjects 1861(5):1321–1328. https://doi.org/10.1016/j.bbagen.2016.12.009
160. Kusov Y, Tan J, Alvarez E, Enjuanes L, Hilgenfeld R (2015) A G-quadruplex-binding macrodomain within the "SARS-unique domain" is essential for the activity of the SARS-coronavirus replication-transcription complex. Virology 484:313–322. https://doi.org/10.1016/j.virol.2015.06.016
161. Rawal P, Kummarasetti VBR, Ravindran J, Kumar N, Halder K, Sharma R, Mukerji M, Das SK, Chowdhury S (2006) Genome-wide prediction of G4 DNA as regulatory motifs: role in Escherichia coli global regulation. Genome Res 16(5):644–655
162. Cahoon LA, Seifert HS (2009) An alternative DNA structure is necessary for pilin antigenic variation in Neisseria gonorrhoeae. Science 325(5941):764–767. https://doi.org/10.1126/science.1175653
163. Wormann ME, Horien CL, Bennett JS, Jolley KA, Maiden MCJ, Tang CM, Aho EL, Exley RM (2014) Sequence, distribution and chromosomal context of class I and class II pilin genes of Neisseria meningitidis identified in whole genome sequences. BMC Genomics 15(1):253. https://doi.org/10.1186/1471-2164-15-253
164. Thakur RS, Desingu A, Basavaraju S, Subramanya S, Rao DN, Nagaraju G (2014) Mycobacterium tuberculosis DinG is a structure-specific helicase that unwinds G4 DNA: implications for targeting G4 DNA as a novel therapeutic approach. J Biol Chem 289(36):25112–25136. https://doi.org/10.1074/jbc.M114.563569

165. Beaume N, Pathak R, Yadav VK, Kota S, Misra HS, Gautam HK, Chowdhury S (2013) Genome-wide study predicts promoter-G4 DNA motifs regulate selective functions in bacteria: radioresistance of D. radiodurans involves G4 DNA-mediated regulation. Nucleic Acids Res 41(1):76–89. https://doi.org/10.1093/nar/gks1071
166. Smargiasso N, Gabelica V, Damblon C, Rosu F, De Pauw E, Teulade-Fichou MP, Rowe JA, Claessens A (2009) Putative DNA G-quadruplex formation within the promoters of Plasmodium falciparum var genes. BMC Genomics 10(1):362. https://doi.org/10.1186/1471-2164-10-362
167. Stanton A, Harris LM, Graham G, Merrick CJ (2016) Recombination events among virulence genes in malaria parasites are associated with G-quadruplex-forming DNA motifs. BMC Genomics 17(1):859. https://doi.org/10.1186/s12864-016-3183-3
168. Hershman SG, Chen Q, Lee JY, Kozak ML, Yue P, Wang LS, Johnson FB (2008) Genomic distribution and functional analyses of potential G-quadruplex-forming sequences in Saccharomyces cerevisiae. Nucleic Acids Res 36(1):144–156. https://doi.org/10.1093/nar/gkm986
169. Smith JS, Chen Q, Yatsunyk LA, Nicoludis JM, Garcia MS, Kranaster R, Balasubramanian S, Monchaud D, Teulade-Fichou M-P, Abramowitz L (2011) Rudimentary G-quadruplex–based telomere capping in Saccharomyces cerevisiae. Nat Struct Mol Biol 18(4):478
170. Hickson ID (2003) RecQ helicases: caretakers of the genome. Nat Rev Cancer 3(3):169–178. https://doi.org/10.1038/nrc1012
171. Lee JY, Kozak M, Martin JD, Pennock E, Johnson FB (2007) Evidence that a RecQ helicase slows senescence by resolving recombining telomeres. PLoS Biol 5(6):e160
172. Yadav V, Hemansi KN, Tuteja N, Yadav P (2017) G quadruplex in plants: a ubiquitous regulatory element and its biological relevance. Front Plant Sci 8(1163):1163. https://doi.org/10.3389/fpls.2017.01163
173. Hansel-Hertsch R, Di Antonio M, Balasubramanian S (2017) DNA G-quadruplexes in the human genome: detection, functions and therapeutic potential. Nat Rev Mol Cell Biol 18(5):279–284. https://doi.org/10.1038/nrm.2017.3
174. Schaffitzel C, Berger I, Postberg J, Hanes J, Lipps HJ, Pluckthun A (2001) In vitro generated antibodies specific for telomeric guanine-quadruplex DNA react with Stylonychia lemnae macronuclei. Proc Natl Acad Sci U S A 98(15):8572–8577. https://doi.org/10.1073/pnas.141229498
175. Paeschke K, Simonsson T, Postberg J, Rhodes D, Lipps HJ (2005) Telomere end-binding proteins control the formation of G-quadruplex DNA structures in vivo. Nat Struct Mol Biol 12(10):847–854
176. Biffi G, Tannahill D, McCafferty J, Balasubramanian S (2013) Quantitative visualization of DNA G-quadruplex structures in human cells. Nat Chem 5(3):182–186. https://doi.org/10.1038/nchem.1548
177. Henderson A, Wu Y, Huang YC, Chavez EA, Platt J, Johnson FB, Brosh RM Jr, Sen D, Lansdorp PM (2014) Detection of G-quadruplex DNA in mammalian cells. Nucleic Acids Res 42(2):860–869. https://doi.org/10.1093/nar/gkt957
178. Hansel-Hertsch R, Beraldi D, Lensing SV, Marsico G, Zyner K, Parry A, Di Antonio M, Pike J, Kimura H, Narita M, Tannahill D, Balasubramanian S (2016) G-quadruplex structures mark human regulatory chromatin. Nat Genet 48(10):1267–1272. https://doi.org/10.1038/ng.3662
179. Chambers VS, Marsico G, Boutell JM, Di Antonio M, Smith GP, Balasubramanian S (2015) High-throughput sequencing of DNA G-quadruplex structures in the human genome. Nat Biotechnol 33(8):877–881. https://doi.org/10.1038/nbt.3295
180. Kouzine F, Wojtowicz D, Baranello L, Yamane A, Nelson S, Resch W, Kieffer-Kwon KR, Benham CJ, Casellas R, Przytycka TM, Levens D (2017) Permanganate/S1 nuclease footprinting reveals non-B DNA structures with regulatory potential across a mammalian genome. Cell Syst 4(3):344–356 e347. https://doi.org/10.1016/j.cels.2017.01.013
181. Granotier C, Pennarun G, Riou L, Hoffschir F, Gauthier LR, De Cian A, Gomez D, Mandine E, Riou JF, Mergny JL, Mailliet P, Dutrillaux B, Boussin FD (2005) Preferential binding of a G-quadruplex ligand to human chromosome ends. Nucleic Acids Res 33(13):4182–4190
182. Chang CC, Kuo IC, Lin JJ, Lu YC, Chen CT, Back HT, Lou PJ, Chang TC (2004) A novel carbazole derivative, BMVC: a potential antitumor agent and fluorescence marker of cancer cells. Chem Biodivers 1(9):1377–1384

183. Kang CC, Chang CC, Chang TC, Liao LJ, Lou PJ, Xie W, Yeung ES (2007) A handheld device for potential point-of-care screening of cancer. Analyst 132(8):745–749. https://doi.org/10.1039/b617733f
184. Shivalingam A, Izquierdo MA, Le Marois A, Vysniauskas A, Suhling K, Kuimova MK, Vilar R (2015) The interactions between a small molecule and G-quadruplexes are visualized by fluorescence lifetime imaging microscopy. Nat Commun 6:8178. https://doi.org/10.1038/ncomms9178
185. Hurley LH, Wheelhouse RT, Sun D, Kerwin SM, Salazar M, Fedoroff OY, Han FX, Han H, Izbicka E, Von Hoff DD (2000) G-quadruplexes as targets for drug design. Pharmacol Therapeut 85(3):141–158
186. Hurley LH (2002) DNA and its associated processes as targets for cancer therapy. Nat Rev Cancer 2(3):188–200
187. Mergny JL, Helene C (1998) G-quadruplex DNA: a target for drug design. Nat Med 4(12):1366–1367
188. Neidle S, Parkinson G (2002) Telomere maintenance as a target for anticancer drug discovery. Nat Rev Drug Discov 1(5):383–393
189. Balasubramanian S, Hurley LH, Neidle S (2011) Targeting G-quadruplexes in gene promoters: a novel anticancer strategy? Nat Rev Drug Discov 10(4):261–275
190. Neidle S (2017) Quadruplex nucleic acids as targets for anticancer therapeutics. Nature Reviews Chemistry 1:0041
191. Sun D, Thompson B, Cathers BE, Salazar M, Kerwin SM, Trent JO, Jenkins TC, Neidle S, Hurley LH (1997) Inhibition of human telomerase by a G-quadruplex-interactive compound. J Med Chem 40(14):2113–2116. https://doi.org/10.1021/jm970199z
192. Riou JF, Guittat L, Mailliet P, Laoui A, Renou E, Petitgenet O, Megnin-Chanet F, Helene C, Mergny JL (2002) Cell senescence and telomere shortening induced by a new series of specific G-quadruplex DNA ligands. Proc Natl Acad Sci U S A 99(5):2672–2677
193. Gowan SM, Heald R, Stevens MF, Kelland LR (2001) Potent inhibition of telomerase by small-molecule pentacyclic acridines capable of interacting with G-quadruplexes. Mol Pharmacol 60(5):981–988
194. Harrison RJ, Reszka AP, Haider SM, Romagnoli B, Morrell J, Read MA, Gowan SM, Incles CM, Kelland LR, Neidle S (2004) Evaluation of by disubstituted acridone derivatives as telomerase inhibitors: the importance of G-quadruplex binding. Bioorg Med Chem Lett 14(23):5845–5849
195. Incles CM, Schultes CM, Kempski H, Koehler H, Kelland LR, Neidle S (2004) A G-quadruplex telomere targeting agent produces p16-associated senescence and chromosomal fusions in human prostate cancer cells. Molecul Cancer Therapeut 3(10):1201–1206
196. Han HY, Bennett RJ, Hurley LH (2000) Inhibition of unwinding of G-quadruplex structures by Sgs1 helicase in the presence of N,N′-bis[2-(1-piperidino)ethyl]-3,4,9,10-perylenetetracarboxylic diimide, a G-quadruplex-interactive ligand. Biochemistry 39(31):9311–9316
197. Grand CL, Han H, Munoz RM, Weitman S, Von Hoff DD, Hurley LH, Bearss DJ (2002) The cationic porphyrin TMPyP4 downregulates c-MYC and human telomerase reverse transcriptase expression and inhibits tumor growth in vivo. Molecul Cancer Therapeut 1(8):565–573
198. Ou TM, Lu YJ, Zhang C, Huang ZS, Wang XD, Tan JH, Chen Y, Ma DL, Wong KY, Tang JC, Chan AS, Gu LQ (2007) Stabilization of G-quadruplex DNA and downregulation of oncogene c-myc by quindoline derivatives. J Med Chem 50(7):1465–1474. https://doi.org/10.1021/jm0610088
199. Liu J-N, Deng R, Guo J-F, Zhou J-M, Feng G-K, Huang Z-S, Gu L-Q, Zeng Y-X, Zhu X-F (2007) Inhibition of myc promoter and telomerase activity and induction of delayed apoptosis by SYUIQ-5, a novel G-quadruplex interactive agent in leukemia cells. Leukemia 21(6):1300–1302
200. Kang HJ, Park HJ (2009) Novel molecular mechanism for actinomycin D activity as an oncogenic promoter G-quadruplex binder. Biochemistry 48(31):7392–7398. https://doi.org/10.1021/bi9006836
201. Brown RV, Danford FL, Gokhale V, Hurley LH, Brooks TA (2011) Demonstration that drug-targeted downregulation of MYC in non-Hodgkins lymphoma is directly mediated through the promoter G-quadruplex. J Biol Chem 286(47):41018–41027
202. McLuckie KI, Waller ZA, Sanders DA, Alves D, Rodriguez R, Dash J, McKenzie GJ, Venkitaraman AR, Balasubramanian S (2011) G-quadruplex-binding benzo[a]phenoxazines down-regulate c-KIT expression in human gastric carcinoma cells. J Am Chem Soc 133(8):2658–2663. https://doi.org/10.1021/ja109474c
203. Wang XD, Ou TM, Lu YJ, Li Z, Xu Z, Xi C, Tan JH, Huang SL, An LK, Li D, Gu LQ, Huang ZS (2010) Turning off transcription of the bcl-2 gene by stabilizing the bcl-2 promoter quadruplex with quindoline derivatives. J Med Chem 53(11):4390–4398. https://doi.org/10.1021/jm100445e
204. Lavrado J, Borralho PM, Ohnmacht SA, Castro RE, Rodrigues CM, Moreira R, dos Santos DJ, Neidle S, Paulo A (2013) Synthesis, G-quadruplex stabilisation, docking studies, and effect on cancer cells of indolo[3,2-b]quinolines with one, two, or three basic side chains. ChemMedChem 8(10):1648–1661. https://doi.org/10.1002/cmdc.201300288
205. Lavrado J, Brito H, Borralho PM, Ohnmacht SA, Kim NS, Leitao C, Pisco S, Gunaratnam M, Rodrigues CM, Moreira R, Neidle S, Paulo A (2015) KRAS oncogene repression in colon cancer cell lines by G-quadruplex binding indolo[3,2-c]quinolines. Sci Rep 5:9696. https://doi.org/10.1038/srep09696
206. Ohnmacht SA, Marchetti C, Gunaratnam M, Besser RJ, Haider SM, Di Vita G, Lowe HL, Mellinas-Gomez M, Diocou S, Robson M, Sponer J, Islam B, Pedley RB, Hartley JA, Neidle S (2015) A G-quadruplex-binding compound showing anti-tumour activity in an in vivo model for pancreatic cancer. Sci Rep 5:11385. https://doi.org/10.1038/srep11385
207. Piazza A, Boule JB, Lopes J, Mingo K, Largy E, Teulade-Fichou MP, Nicolas A (2010) Genetic instability triggered by G-quadruplex interacting Phen-DC compounds in Saccharomyces cerevisiae. Nucleic Acids Res 38(13):4337–4348. https://doi.org/10.1093/nar/gkq136
208. Salvati E, Scarsella M, Porru M, Rizzo A, Iachettini S, Tentori L, Graziani G, D’Incalci M, Stevens MFG, Orlandi A, Passeri D, Gilson E, Zupi G, Leonetti C, Biroccio A (2010) PARP1 is activated at telomeres upon G4 stabilization: possible target for telomere-based therapy. Oncogene 29(47):6280–6293. https://doi.org/10.1038/onc.2010.344
209. 
Aggarwal M, Sommers JA, Shoemaker RH, Brosh RM Jr (2011) Inhibition of helicase activity by a small molecule impairs Werner syndrome helicase (WRN) function in the cellular response to DNA damage or replication stress. Proc Natl Acad Sci U S A 108 (4):1525–1530. https://doi.org/10.1073/ pnas.1006423108 210. Rodriguez R, Miller KM, Forment JV, Bradshaw CR, Nikan M, Britton S, Oelschlaegel T, Xhemalce B, Balasubramanian S, Jackson SP (2012) Small-molecule-induced DNA

damage identifies alternative DNA structures in human genes. Nat Chem Biol 8 (3):301–310. https://doi.org/10.1038/ nchembio.780 211. McLuckie KIE, Di Antonio M, Zecchini H, Xian J, Caldas C, Krippendorff BF, Tannahill D, Lowe C, Balasubramanian S (2013) G-quadruplex DNA as a molecular target for induced synthetic lethality in cancer cells. J Am Chem Soc 135(26):9640–9643. https://doi.org/10.1021/ja404868t 212. Zimmer J, Tacconi EMC, Folio C, Badie S, Porru M, Klare K, Tumiati M, Markkanen E, Halder S, Ryan A, Jackson SP, Ramadan K, Kuznetsov SG, Biroccio A, Sale JE, Tarsounas M (2016) Targeting BRCA1 and BRCA2 deficiencies with G-quadruplex-interacting compounds. Mol Cell 61(3):449–460. https://doi.org/10.1016/j.molcel.2015.12. 004 213. Xu H, Di Antonio M, McKinney S, Mathew V, Ho B, O’Neil NJ, Dos Santos N, Silvester J, Wei V, Garcia J (2017) CX-5461 is a DNA G-quadruplex stabilizer with selective lethality in BRCA1/2 deficient tumours. Nat Commun 8:14432 214. Duan W, Rangan A, Vankayalapati H, Kim MY, Zeng Q, Sun D, Han H, Fedoroff OY, Nishioka D, Rha SY, Izbicka E, Von Hoff DD, Hurley LH (2001) Design and synthesis of fluoroquinophenoxazines that interact with human telomeric G-quadruplexes and their biological effects. Mol Cancer Ther 1 (2):103–120 215. Kim M-Y, Duan W, Gleason-Guzman M, Hurley LH, (2003) Design, synthesis, and biological evaluation of a series of fluoroquinoanthroxazines with contrasting dual mechanisms of action against topoisomerase II and G-Quadruplexes. J Med Chem 46 (4):571–583. https://doi.org/10.1021/ jm0203377 216. Wheelhouse RT, Sun D, Han H, Han FX, Hurley LH (1998) Cationic porphyrins as telomerase inhibitors: the interaction of tetra-(N-methyl-4-pyridyl)porphine with quadruplex DNA. J Am Chem Soc 120 (13):3261–3262 217. 
Kim MY, Gleason-Guzman M, Izbicka E, Nishioka D, Hurley LH (2003) The different biological effects of telomestatin and TMPyP4 can be attributed to their selectivity for interaction with intramolecular or intermolecular G-quadruplex structures. Cancer Res 63(12):3247–3256 218. Han FXG, Wheelhouse RT, Hurley LH (1999) Interactions of TMPyP4 and TMPyP2 with quadruplex DNA. Structural

G-Quadruplex DNA and RNA basis for the differential effects on telomerase inhibition. J Am Chem Soc 121 (15):3561–3570 219. Shin-ya K, Wierzba K, Matsuo K, Ohtani T, Yamada Y, Furihata K, Hayakawa Y, Seto H (2001) Telomestatin, a novel telomerase inhibitor from Streptomyces anulatus. J Am Chem Soc 123(6):1262–1263 220. Kim MY, Vankayalapati H, Shin-Ya K, Wierzba K, Hurley LH (2002) Telomestatin, a potent telomerase inhibitor that interacts quite specifically with the human telomeric intramolecular G-quadruplex. J Am Chem Soc 124(10):2098–2099 221. Tauchi T, Shin-ya K, Sashida G, Sumi M, Okabe S, Ohyashiki JH, Ohyashiki K (2006) Telomerase inhibition with a novel Gquadruplex-interactive agent, telomestatin: in vitro and in vivo studies in acute leukemia. Oncogene 25(42):5719–5725 222. Gomez D, O’Donohue MF, Wenner T, Douarre C, Macadre J, Koebel P, GiraudPanis MJ, Kaplan H, Kolkes A, Shin-ya K, Riou JF (2006) The G-quadruplex ligand telomestatin inhibits POT1 binding to telomeric sequences in vitro and induces GFP-POT1 dissociation from telomeres in human cells. Cancer Res 66(14):6908–6912 223. Tahara H, Shin-Ya K, Seimiya H, Yamada H, Tsuruo T, Ide T (2006) G-Quadruplex stabilization by telomestatin induces TRF2 protein dissociation from telomeres and anaphase bridge formation accompanied by loss of the 30 telomeric overhang in cancer cells. Oncogene 25(13):1955–1966 224. Read M, Harrison RJ, Romagnoli B, Tanious FA, Gowan SH, Reszka AP, Wilson WD, Kelland LR, Neidle S (2001) Structure-based design of selective and potent G-quadruplexmediated telomerase inhibitors. Proc Natl Acad Sci U S A 98(9):4844–4849 225. Kelland LR (2005) Overcoming the immortality of tumour cells by telomere and telomerase based cancer therapeutics - current status and future prospects. Eur J Cancer 41 (7):971–979 226. 
Burger AM, Dai FP, Schultes CM, Reszka AP, Moore MJ, Double JA, Neidle S (2005) The G-quadruplex-interactive molecule BRACO19 inhibits tumor growth, consistent with telomere targeting and interference with telomerase function. Cancer Res 65 (4):1489–1496 227. Douarre C, Gomez D, Morjani H, Zahm JM, O’Donohue MF, Eddabra L, Mailliet P, Riou JF, Trentesaux C (2005) Overexpression of Bcl-2 is associated with apoptotic resistance to the G-quadruplex ligand 12459 but is not

23

sufficient to confer resistance to long-term senescence. Nucleic Acids Res 33 (7):2192–2203 228. Gomez D, Lemarteleur T, Lacroix L, Mailliet P, Mergny J-L, Riou J-F (2004) Telomerase downregulation induced by the G-quadruplex ligand 12459 in A549 cells is mediated by hTERT RNA alternative splicing. Nucleic Acids Res 32(1):371–379 229. Lemarteleur T, Gomez D, Paterski R, Mandine E, Mailliet P, Riou J-F (2004) Stabilization of the c-myc gene promoter quadruplex by specific ligands’ inhibitors of telomerase. Biochem Biophys Res Commun 323(3):802–808 230. Pennarun G, Granotier C, Gauthier LR, Gomez D, Boussin FD (2005) Apoptosis related to telomere instability and cell cycle alterations in human glioma cells treated by new highly selective G-quadruplex ligands. Oncogene 24(18):2917–2928 231. De Cian A, Mergny JL (2007) Quadruplex ligands may act as molecular chaperones for tetramolecular quadruplex formation. Nucleic Acids Res 35(8):2483–2493 232. De Cian A, Delemos E, Mergny JL, TeuladeFichou MP, Monchaud D (2007) Highly efficient G-quadruplex recognition by bisquinolinium compounds. J Am Chem Soc 129 (7):1856–1857. https://doi.org/10.1021/ ja067352b 233. Rodriguez MC, Sonyang Z (2008) BRCT domains: phosphopeptide binding and signaling modules. Front Biosci 13:5905–5915 234. Bates PJ, Laber DA, Miller DM, Thomas SD, Trent JO (2009) Discovery and development of the G-rich oligonucleotide AS1411 AS a novel treatment for cancer. Exp Mol Pathol 86(3):151–164. https://doi.org/10.1016/j. yexmp.2009.01.004 235. Dai J, Carver M, Hurley LH, Yang D (2011) Solution structure of a 2:1 quindoline-c-MYC G-quadruplex: insights into G-quadruplexinteractive small molecule drug design. J Am Chem Soc 133(44):17673–17680. https:// doi.org/10.1021/ja205646q 236. Lin C, Wu G, Wang K, Onel B, Sakai S, Shao Y, Yang D (2018) Molecular recognition of the hybrid-2 human telomeric G-quadruplex by epiberberine: insights into conversion of telomeric g-quadruplex structures. 
Angew Chem Int Ed 57 (34):10888–10893 237. Neidle S (2009) The structures of quadruplex nucleic acids and their drug complexes. Curr Opin Struct Biol 19(3):239–250

24

Danzhou Yang

238. Phan AT, Kuryavyi V, Gaw HY, Patel DJ (2005) Small-molecule interaction with a five-guanine-tract G-quadruplex structure from the human MYC promoter. Nat Chem Biol 1(3):167–173 239. Chung WJ, Heddi B, Hamon F, TeuladeFichou MP, Phan AT (2014) Solution structure of a G-quadruplex bound to the bisquinolinium compound Phen-DC(3). Angew Chem Int Ed Engl 53(4):999–1002. https://doi.org/10.1002/anie.201308063 240. Kotar A, Wang B, Shivalingam A, GonzalezGarcia J, Vilar R, Plavec J (2016) NMR structure of a triangulenium-based long-lived fluorescence probe bound to a G-quadruplex. Angew Chem Int Ed Engl 55 (40):12508–12511. https://doi.org/10. 1002/anie.201606877 241. Chung WJ, Heddi B, Tera M, Iida K, Nagasawa K, Phan AT (2013) Solution structure of an intramolecular (3 + 1) human telomeric G-quadruplex bound to a telomestatin derivative. J Am Chem Soc 135 (36):13495–13501. https://doi.org/10. 1021/ja405843r 242. Wirmer-Bartoschek J, Bendel LE, Jonker HR, Gru¨n JT, Papi F, Bazzicalupi C, Messori L, Gratteri P, Schwalbe H (2017) Solution NMR structure of a ligand/hybrid-2-Gquadruplex complex reveals rearrangements that affect ligand binding. Angew Chem 129 (25):7208–7212 243. Liu W, Zhong Y-F, Liu L-Y, Shen C-T, Zeng W, Wang F, Yang D, Mao Z-W (2018) Solution structures of multiple G-quadruplex complexes induced by a platinum (II)-based

tripod reveal dynamic binding. Nat Commun 9(1):3496 244. Clark GR, Pytel PD, Squire CJ, Neidle S (2003) Structure of the first parallel DNA quadruplex-drug complex. J Am Chem Soc 125(14):4066–4067 245. Haider SM, Parkinson GN, Neidle S (2003) Structure of a G-quadruplex-ligand complex. J Molecul Biol 326(1):117–125 246. Parkinson GN, Ghosh R, Neidle S (2007) Structural basis for binding of porphyrin to human telomeres. Biochemistry 46 (9):2390–2397 247. Parkinson GN, Cuenca F, Neidle S (2008) Topology conservation and loop flexibility in quadruplex-drug recognition: crystal structures of inter- and intramolecular telomeric DNA quadruplex-drug complexes. J Molecul Biol 381(5):1145–1156 248. Campbell NH, Parkinson GN, Reszka AP, Neidle S (2008) Structural basis of DNA quadruplex recognition by an acridine drug. J Am Chem Soc 130(21):6722–6724 249. Collie GW, Promontorio R, Hampel SM, Micco M, Neidle S, Parkinson GN (2012) Structural basis for telomeric G-quadruplex targeting by naphthalene diimide ligands. J Am Chem Soc 134(5):2723–2731. https:// doi.org/10.1021/ja2102423 250. Bazzicalupi C, Ferraroni M, Bilia AR, Scheggi F, Gratteri P (2013) The crystal structure of human telomeric DNA complexed with berberine: an interesting case of stacked ligand to G-tetrad ratio higher than 1:1. Nucleic Acids Res 41(1):632–638. https://doi.org/10.1093/nar/gks1001

Chapter 2

CD Study of the G-Quadruplex Conformation

Iva Kejnovská, Daniel Renčiuk, Jan Palacký, and Michaela Vorlíčková

Abstract

Circular dichroism (CD) spectroscopy is one of the most frequently used methods for studying guanine quadruplexes and, more generally, the conformational properties of nucleic acids. The reason is its high sensitivity to even slight changes in the mutual orientation of the absorbing bases of DNA. CD can reveal the formation of particular structural DNA arrangements and can be used to search for the conditions stabilizing the structures, to follow the transitions between various structural states, to explore the kinetics of their appearance, to determine thermodynamic parameters, and also to detect the formation of higher-order structures. CD spectroscopy is an important complementary technique to NMR spectroscopy and X-ray diffraction in quadruplex studies due to its sensitivity, easy manipulation of studied samples, and relatively low cost. In this chapter, we present a protocol for the use of CD spectroscopy in the study of guanine quadruplexes, together with practical advice and cautions about various difficulties, particularly those of interpretation.

Key words Guanine quadruplex, Circular dichroism spectroscopy, Interpretation of CD spectra, Guanine-rich genomic sequences, DNA conformational transitions

Danzhou Yang and Clement Lin (eds.), G-Quadruplex Nucleic Acids: Methods and Protocols, Methods in Molecular Biology, vol. 2035, https://doi.org/10.1007/978-1-4939-9666-7_2, © Springer Science+Business Media, LLC, part of Springer Nature 2019

1 Introduction

Circular dichroism is a phenomenon that occurs when linearly polarized light passes through an absorbing, optically active medium. Optically active materials (i.e., substances containing chiral molecules) rotate the plane of linearly polarized light because its left and right circularly polarized components experience different refractive indices, thus giving rise to optical rotation. If the chiral substance also absorbs light, the two components are absorbed to different extents, which changes the originally linearly polarized light into elliptically polarized light. The difference in the absorption of left and right circularly polarized light is called circular dichroism (CD). It is expressed as Δε = εL − εR, in units of M⁻¹ cm⁻¹, where εL and εR are the molar absorption coefficients of the two light components. Another quantity that measures this phenomenon is the ellipticity, Θ. The ellipticity corresponds to the angle whose tangent is the ratio between the minor and major axes of the resulting elliptically polarized light. Molar ellipticity is expressed in units of deg cm² dmol⁻¹.

There are two conditions for the emergence of a CD signal: the sample of interest must be chiral and must absorb light. In the case of nucleic acids (NA), chirality is provided by the sugar, deoxyribose or ribose, which contains the asymmetric carbon C1′, and absorption is provided by the nucleic acid bases. For structural studies of nucleic acids, CD spectra are measured in the ultraviolet region (wavelengths from 200 to 330 nm), which corresponds to electronic transitions. Consequently, even mononucleosides give a CD signal. However, the main source of nucleic acid chirality is their secondary structure, i.e., the folding of the absorbing bases into asymmetric helical arrangements. CD thus reflects even slight changes in the mutual orientations of the absorbing units.

Due to its unique sensitivity to the conformation of nucleic acids, CD spectroscopy has become a frequently used method in the field, and it has participated, very often as a pioneering method, in all basic findings on DNA conformational flexibility [1]. Depending on its primary structure, DNA can adopt various arrangements that are distinctly different from the classical Watson–Crick model. These noncanonical structures may differ in base pairing, in the sense of the helix winding, and in the number and mutual orientation of oligonucleotide chains in a molecule. The individual structures provide characteristic CD spectra. CD spectroscopy was, e.g., the first method to discover the left-handed form of Z-DNA [2], several years before its existence was demonstrated in the crystal [3].

Nowadays, CD is frequently used in the study of G-quadruplexes, which are a hot topic of current molecular biology, in particular with regard to their proven biological relevance [4]. There exist a number of different G-quadruplex topologies [5]. Although many types of quadruplexes have been reported, there are two basic types of CD spectra, which have been associated with the relative orientation of the strands: parallel (all strands have the same 5′ to 3′ orientation and all glycosidic bonds are in the anti conformation) or antiparallel, formed by two folded-back strands (where guanines adopt syn and anti glycosidic conformations along each strand). Folding of the sugar-phosphate backbone usually determines the orientation of the guanine tetrads; however, irrespective of the relative orientation of the strands, the polarity of the stacked quartets is the determining factor for the type of G-quadruplex CD spectrum. The different spectral features result from the mutual orientation of the guanine electronic transitions (at 250 and 280 nm), and thus from the different stacking orientation between adjacent G-tetrads (head-to-tail in parallel quadruplexes, and head-to-head or tail-to-tail in antiparallel ones) [6].
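For orientation, the two measures of CD defined at the start of the Introduction, Δε and ellipticity, can be interconverted. The relations below are the standard small-ellipticity approximation and are added here only as a reference sketch. The symbols ΔA (difference in absorbance of the two circularly polarized components), c (molar concentration), and l (path length in cm) are introduced for this purpose and do not appear in the original text.

```latex
\begin{align}
\Delta A &= A_L - A_R = \Delta\varepsilon \, c \, l, \\
\Theta\,[\mathrm{deg}] &\approx \frac{\ln 10}{4}\,\frac{180}{\pi}\,\Delta A \approx 32.98\,\Delta A, \\
[\Theta]\,[\mathrm{deg\,cm^{2}\,dmol^{-1}}] &= \frac{100\,\Theta}{c\,l} \approx 3298\,\Delta\varepsilon .
\end{align}
```

Under this approximation, a reading in millidegrees is converted to Δε by dividing by roughly 32,980 and by the product of concentration and path length.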

CD of G-Quadruplexes


Fig. 1 CD spectra of guanine quadruplexes. Black dashed spectra were taken in 1 mM sodium phosphate buffer, pH 7. (a) Time-dependent formation of a parallel-stranded quadruplex of d(G4) in the presence of 16 mM K+. (b) Na+-induced formation of an antiparallel bimolecular quadruplex of d(G4T4G4). (c) K+-induced formation of a (3 + 1) hybrid quadruplex structure of d(G3(TTAG3)4). The chemical structure of a G-tetrad and schematic sketches of the respective quadruplex structures are shown as insets. Unless stated otherwise, Δε values in this and other figures are expressed relative to the nucleoside concentration

CD can clearly distinguish the main quadruplex types (Fig. 1) [7], as well as some less distinctive variations. CD spectroscopy was also the first method to predict the hybrid (3 + 1) quadruplex formed by the human telomere sequence [8], which was confirmed by NMR a year later [9, 10]. Interestingly, we observed a parallel-type CD spectrum for a sequence derived from the c-myc gene promoter that was shown to form an intramolecular quadruplex [11], although at that time only multimolecular parallel quadruplexes had been reported. The existence of intramolecular parallel quadruplexes was likewise confirmed later [12, 13].

2 Materials

1. Oligonucleotides: various sequences, Sigma Aldrich, 50 nmol scale, desalted (DST) purification (see Note 1).
2. CD spectrometer: Jasco J-815 or Chirascan Plus with Peltier cell holder (see Note 2).
3. Quartz cells: transparent 1 cm rectangular cell with cap (Hellma, 110-QS) (see Note 3).


4. Buffer for increasing ionic strength: 0.1 M sodium or potassium phosphate buffer, pH 7: mix 50 mL of 0.2 M Na2HPO4 or K2HPO4 with 50 mL of 0.2 M NaH2PO4 or KH2PO4 and bring to a final volume of 200 mL (see Note 4).
5. Britton-Robinson buffer for pH dependences: mix a solution of 0.04 M acids (H3BO3, H3PO4, CH3COOH) and adjust the pH with 0.2 M NaOH or KOH (see Notes 5 and 6).
6. 3 M KCl: weigh 22.4 g of KCl and dissolve in 100 mL of water.
7. 3 M NaCl: weigh 17.5 g of NaCl and dissolve in 100 mL of water.
8. Milli-Q water is used for the preparation of all solutions.
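The weights in items 6 and 7 follow from mass = molarity × volume × molar mass. A minimal Python sketch of that arithmetic is given below. The function name and the molar masses (74.55 g/mol for KCl, 58.44 g/mol for NaCl) are supplied here for illustration and are not part of the original protocol.

```python
def grams_for_stock(molarity_mol_per_l, volume_ml, molar_mass_g_per_mol):
    """Mass of solute (g) needed for a stock of the given molarity and volume."""
    return molarity_mol_per_l * (volume_ml / 1000.0) * molar_mass_g_per_mol


# Standard molar masses, assumed for this example (not stated in the protocol).
MOLAR_MASS = {"KCl": 74.55, "NaCl": 58.44}

kcl_g = grams_for_stock(3.0, 100.0, MOLAR_MASS["KCl"])
nacl_g = grams_for_stock(3.0, 100.0, MOLAR_MASS["NaCl"])

print(f"3 M KCl, 100 mL: {kcl_g:.1f} g")
print(f"3 M NaCl, 100 mL: {nacl_g:.1f} g")
```

Rounded to one decimal place, the two results agree with the 22.4 g and 17.5 g given in the list; the same helper could be reused for other stock concentrations or volumes.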

3 Methods

In this procedure, we follow the simple induction of a G-quadruplex by potassium ions. The measurement of CD spectra itself (Subheadings 3.1–3.3) is generally a simple and straightforward procedure; however, each step and each setting offers high variability, and the result strongly depends on proper settings. We therefore describe each such variable in more detail in the Notes section. The most demanding and most often incorrectly performed step, however, is the interpretation of CD spectra (Subheading 3.4).

3.1 Measurement of CD Spectra

1. Set the measurement parameters (Jasco J-815: 200 to 330 nm, step 0.5 nm, continuous mode, scanning speed 200 nm min⁻¹, 4 accumulations, temperature 23 °C; or Chirascan Plus: 200 to 330 nm, step 0.5 nm, time-per-point 0.5 s, temperature 23 °C) (see Note 7). These settings are used for all subsequent scans; any changes are indicated.
2. OPTIONAL: When using a CD spectrometer able to measure absorbance simultaneously with CD (Chirascan Plus), measure a background scan with an empty cell holder. With the Jasco J-815, the absorbance is calculated from the PMT voltage and no such scan is necessary.
3. Pipette 1390 μL of 1 mM Na-phosphate buffer into a 1 cm cell (see Note 8).
4. Measure the buffer in the cell; this is the baseline.
5. Add 10 μL of oligonucleotide stock solution (10 mM nucleoside, ~100 OD/mL) to the cell and mix gently (invert the cell several times to homogenize the oligonucleotide solution).
6. OPTIONAL: Measure the CD spectrum before denaturation; this spectrum indicates how the oligonucleotide was folded in the stock solution.


7. Heat-denature the oligonucleotide in the cell by heating to 90 °C for 5 min directly in the dichrograph, in a UV spectrophotometer, or in another heating device (see Note 9).
8. Measure the absorption spectrum or the absorbance at 260 nm at 90 °C and calculate the concentration (see Note 10).
9. OPTIONAL: Measure the CD spectrum of the denatured sample at 90 °C simultaneously with the absorption spectrum; this spectrum should reflect only the primary sequence of the oligonucleotide (see Note 11).
10. Cool the cell down to 23 °C and measure the CD spectrum after denaturation; this should be the initial spectrum before any induction of the quadruplex occurs.
11. Add 35.6 μL of 0.4 M potassium phosphate buffer (i.e., 10 mM final) and measure the CD spectrum.
12. Add 49.5 μL of 3 M potassium chloride (i.e., 100 mM KCl final; the total K+ concentration is 115 mM including the potassium contained in the phosphate buffer) and measure the CD spectrum.
13. OPTIONAL: To obtain an equilibrated structure, it is appropriate, and sometimes necessary, to thermally denature (5 min at 90 °C) preexisting nucleic acid structures and then let the sample cool slowly to room temperature over a few hours so that the resulting structure has enough time to form. This procedure is called annealing. In some cases, annealing may support the formation of thermodynamically more stable but kinetically restrained multimolecular species. It is advisable to check the sequences by both CD and electrophoresis before and after annealing.

3.2 More Specific CD Measurements

1. CD of oligonucleotide concentration dependences: It is generally accepted that constancy of the amplitudes of CD spectra over a range of DNA concentrations is one of the proofs of an intramolecular structure. The concentration dependence must be measured over a wide range of concentrations, which is facilitated by using a set of cells with various path lengths. In this case, knowledge of the precise concentration in each cell is necessary. We usually determine the concentration from a simultaneously measured absorbance spectrum, which is better than relying on the precision of dilution.
2. Thermal denaturation followed by CD: In some cases, melting followed by CD, instead of absorbance, can give additional information about possible conformational transitions (Fig. 2). In the example shown, CD spectroscopy reveals that increasing temperature causes isomerization between two different quadruplex conformations prior to denaturation of the oligonucleotide, whereas UV absorption does not reflect the change. Both the CD and absorption melting procedures for quadruplexes were described in detail by Olsen and Marky [14] in another protocol.

Fig. 2 CD spectra of d(G3T3G3CTG3T3G3) in 20 mM lithium cacodylate buffer, pH 7, with 100 mM potassium chloride, measured at different temperatures. Dependences: Melting curves expressed as oligonucleotide folded fraction obtained by following absorbance at 297 nm (green), CD at 260 nm (blue) and CD at 290 nm (red)

3. Measurement in high volumes of crowding/dehydrating agents: Formation of guanine quadruplexes can also be induced by some crowding or dehydrating agents (polyethylene glycol, ethanol), usually at concentrations above 40%. The first problem is the ratio between the minimal sample volume in the cell necessary for measurement and the total cell volume, which is around 0.4 for common 1 × 1 cm rectangular cells, but usually less for other cell types. This allows addition of pure PEG200 or ethanol only up to around 60%. To reach higher concentrations, it is necessary to start with some low concentration of the respective agent already present in the initial (minimal) volume and to check, in a separate measurement, the change induced by this low concentration. We also recommend using a higher NA concentration than optimal because of the subsequent high dilution. The second problem is the aggregation of nucleic acids caused by high concentrations of crowding/dehydrating agents or by their combination with some salts. Aggregation can be identified by non-zero CD and absorbance at wavelengths above 310 nm.
4. Measurements of protein-DNA/RNA interactions: The protein is added to the quadruplex or other structures in molar strand ratios [15]. Buffers used for protein studies often absorb strongly at short wavelengths, where proteins give their dominant peaks. We advise using cells with a short path length for such measurements (to minimize the buffer absorption).

3.3 Processing of CD Spectra

The analysis software of each CD spectrometer offers full functionality to process the spectra; however, we prefer to perform the analysis numerically in third-party software (MS Excel, SigmaPlot, Origin, etc.). Working with such software simplifies the processing of larger numbers of spectra, as well as correcting or modifying the calculations.

1. Export the data into a text format (txt, csv, ...) and import them into your software.
2. Numerically subtract the baseline scan from each spectrum. Subtract the baseline from the UV absorption spectra as well if they are measured simultaneously with the CD spectra.
3. Express the spectra in molar units according to the Beer–Lambert law using the respective cell path length and oligonucleotide concentration (see Note 12).
4. OPTIONAL: Smooth the spectra. We usually apply a Savitzky–Golay smoothing algorithm with a 15-point window. The influence of smoothing, especially at spectral peaks, needs to be checked with care.
5. OPTIONAL: Subtract the CD spectra of the primary sequences (see Note 11).
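Steps 2 to 4 above can be sketched in a few lines of plain Python, as one possible way of doing the numerical processing outside the instrument software. Several things here are assumptions rather than part of the protocol: the instrument is taken to export ellipticity in millidegrees, the Savitzky–Golay polynomial is taken to be quadratic/cubic (the text specifies only the 15-point window), edge points are left unsmoothed, and all function names and the example readings are hypothetical.

```python
# Sketch of Subheading 3.3, steps 2-4, using only the Python standard library.
# Assumed: ellipticity exported in millidegrees; quadratic/cubic
# Savitzky-Golay fit; edge points copied unchanged.

MDEG_PER_ABS_UNIT = 32982.0  # theta[mdeg] ~= 32982 * (A_L - A_R)


def subtract_baseline(sample, baseline):
    """Step 2: point-by-point subtraction of the buffer scan."""
    return [s - b for s, b in zip(sample, baseline)]


def mdeg_to_delta_epsilon(theta_mdeg, conc_molar, path_cm):
    """Step 3: convert ellipticity (mdeg) to delta-epsilon (M^-1 cm^-1)."""
    scale = MDEG_PER_ABS_UNIT * conc_molar * path_cm
    return [t / scale for t in theta_mdeg]


def savgol_weights(half_width):
    """Closed-form Savitzky-Golay smoothing weights for a quadratic/cubic fit."""
    m = half_width
    norm = (2 * m + 3) * (2 * m + 1) * (2 * m - 1)
    return [3.0 * (3 * m * m + 3 * m - 1 - 5 * i * i) / norm
            for i in range(-m, m + 1)]


def savgol_smooth(values, window=15):
    """Step 4: smooth with an odd window; the first and last points stay raw."""
    m = window // 2
    weights = savgol_weights(m)
    out = list(values)
    for k in range(m, len(values) - m):
        out[k] = sum(w * values[k + j - m] for j, w in enumerate(weights))
    return out


def diluted_conc(stock_molar, stock_ul, total_ul):
    """Concentration after stock_ul of stock ends up in total_ul of sample."""
    return stock_molar * stock_ul / total_ul


# Example with the volumes of Subheading 3.1: 10 uL of 10 mM (nucleoside)
# stock added to 1390 uL of buffer, measured in a 1 cm cell.
conc = diluted_conc(10e-3, 10.0, 1400.0)
raw = [1.0, 2.5, 4.0, 2.5, 1.0]    # made-up sample readings, mdeg
base = [0.5, 0.5, 0.5, 0.5, 0.5]   # made-up baseline readings, mdeg
delta_eps = mdeg_to_delta_epsilon(subtract_baseline(raw, base), conc, 1.0)
```

Because later additions of buffer and salt dilute the sample, the same `diluted_conc` helper would be called with the updated total volume before converting each subsequent spectrum. As the protocol cautions, the smoothed trace should be compared with the raw one, especially at the peaks.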

3.4 Interpretation of CD Spectra

1. Parallel G-quadruplexes: Parallel quadruplexes have all four chains oriented in the same direction from the 50 to the 30 end and all guanosine glycosidic angles in the anti geometry. Their CD spectra are characterized by a huge positive band at 260 nm, a relatively shallow negative band at 240 nm, and another large positive band at 210 nm (Fig. 1a). Tetramolecular parallel quadruplexes are commonly formed by short G-rich oligonucleotides. In case the sequences are not terminated by non-G nucleotides, these oligonucleotides tend to associate into higher structures, either by terminal tetrad stacking [16] or by interlocking more than four strands into so-called G-wires [17]. The CD spectra of such associates do not differ distinctly from the ones of regular tetramolecular quadruplex. Only heavy aggregates can be recognized (by a non-zero CD signal at wavelengths higher than 300 nm) due to light scattering. Long G-rich sequences or sequences with more G-tracts interrupted by short non-G regions might fold, depending on the number of particular G-tracts, into bi- or mono-molecular quadruplexes characterized by the presence of propeller loops that allow the chain to proceed in a parallel fashion along the

32

Iva Kejnovska´ et al.

Fig. 3 (a) CD spectrum of the intramolecular parallel-stranded quadruplex of the G-rich c-myc promoter fragment d(TGGGGAGGGTGGGGAGGGTGGGGAAGG). Spectra were measured in 1 mM sodium phosphate buffer, pH 7 (black dashed line) and in 150 mM K+ (red solid line). (b) CD spectra of (red) d(G8), (pink) d (G4TG4), (violet) d(G4T2G4), (cyan) d(G4T3G4) and (blue) d(G4T4G4) measured in 10 mM sodium phosphate buffer, pH 7 and 150 mM NaCl 7 days after the addition of NaCl. (c) The bimolecular parallel-stranded RNA quadruplex of r(G4U4G4). Spectra were measured in 1 mM sodium phosphate buffer (black dashed line), 15 mM (red dashed) and 100 mM NaCl (red solid line)

quadruplex core. CD spectroscopy cannot simply distinguish the molecularity of parallel quadruplex. Generally, various parallel quadruplexes give CD spectra of very similar shape only sometimes differing in the intensity of the dominant parallel peak. The most studied biologically relevant example of parallel quadruplex is the sequence from the c-myc oncogene promoter, which forms parallel quadruplex in both potassium and sodium cations (Fig. 3a). Generally, increasing number of non-G nucleosides inserted between guanine blocks stabilizes formation of antiparallel quadruplex arrangement (Fig. 3b). Guanine-rich RNA sequences usually form parallel quadruplexes, and even for oligonucleotides that are in DNA form purely antiparallel (Fig. 3c). Recently it was reported that an

CD of G-Quadruplexes

33

Fig. 4 (a) CD spectra of parallel quadruplex of the fragment of the c-myc promoter d(TGGGGAGGGTGGGGAGGGTGGGGAAGG) in 150 mM K+ (red solid line) and the A-form of r(GCGAAGC) measured in 3 mM sodium phosphate buffer and 3 mM MgCl2 (violet line). (b) CD spectra of indicated sequences in 10 mM potassium phosphate, pH 7 and 0.1 M KCl: (red line) d(G4); (red dashes) d(CG4); (red dash-dots) d(C2G4) sequences forming parallel quadruplex; (black line) d(C3G4) and (black dash-dots) d(C4G4) as examples of duplexes. CD spectra in b are related to molar guanine concentration

intramolecular antiparallel RNA G-quadruplex is formed by a native human telomere RNA sequence [18]. CD spectrum of A-form of DNA or RNA duplex is characteristic with a high positive peak at 260 nm, similar to that of parallel G-quadruplex (Fig. 4a), which leads to wrong interpretations of CD spectra. However, the spectrum of A-form is usually slightly shifted to longer wavelengths, and, mainly, the short wavelength region of A-form spectra has a pronounced negative band at 210 nm, with the opposite amplitude to the positive band of parallel quadruplex. It is thus important to record CD spectra from 200 nm and to compare the whole spectrum, not just the 260 nm band. The confusion could particularly arise when studying RNA sequences or with duplex-quadruplex competition. Duplexes consisting of G and C blocks provide anomalous CD spectra with a dominant positive band at 260 nm and a short wavelength positive band, analogous to the CD spectrum

34

Iva Kejnovska´ et al.

of parallel quadruplex (Fig. 4b) (self-complementary duplexes beginning with 50 cytosine blocks usually provide in addition a negative band around 285 nm). This type of spectrum reflects some A-like duplex features of these sequences [19, 20]. It is a source of frequent erroneous interpretations based only on the intensive 260 nm band. Hence, one has to be aware of these anomalous CD spectra when studying these sequences. The similarity of CD spectra may indicate similar stacking interactions between guanine blocks within quadruplex and duplex [21]. In the case of interpretation indecisiveness between duplex and quadruplex, a comparison of thermal stabilities in potassium, sodium, and lithium ions, which are similar for duplexes but very different for quadruplexes, may help. 2. Antiparallel G-quadruplexes: as a rule antiparallel quadruplexes have two chains oriented in the same direction and the other two in the opposite direction combining various types of loops (edgewise and diagonal loops giving rise to basket and chair structures) [22]. Guanines have the syn and the anti glycosidic orientations. The CD spectrum of an antiparallel quadruplex (for an example, see Fig. 1b for the Oxytricha telomeric sequence G4T4G4) has a typical positive band at 295 nm and two other smaller ones at 240 and 210 nm, and a negative band at 260 nm. While G4T4G4 dodecamer forms a bimolecular quadruplex (G4T4)3G4 folds intramolecularly; interestingly, the CD spectra of both quadruplex structures are nearly identical. CD does not recognize the difference in molecularity. Other example of sequences adopting antiparallel quadruplex is the human telomere sequence (G3TTA)3G3, stabilized by sodium ions (Fig. 5a). This sequence and its sequence analogues are able to be transformed into parallel quadruplexes in aqueous ethanol (Fig. 5b) or PEG solutions [23, 24], or at high K+ or DNA concentrations [25]. 
Another sequence forming an antiparallel quadruplex is the thrombin-binding aptamer (TBA) G2T2G2TGTG2T2G2. This quadruplex is unusual in that it contains only two tetrads. It is better stabilized in potassium-containing solutions than in sodium-containing solutions.

3. Hybrid forms: G-quadruplexes with three chains oriented in one direction and the remaining one in the opposite direction are hybrids between parallel and antiparallel quadruplexes. They are formed by a combination of edgewise and propeller loops. Their CD spectra are characterized by a positive band at 290 nm with a shoulder or band around 260 nm, a negative band at 240 nm, and another positive band, present with all quadruplexes, at 210 nm (Fig. 1c). This hybrid (3 + 1) quadruplex arrangement is formed, e.g., by the sequence (T2G4)4 of Tetrahymena in


Fig. 5 (a) CD spectra of the human telomere sequence (G3TTA)3G3 in 1 mM sodium phosphate buffer, pH 7 (black dashed line) and in 150 mM Na+ (blue solid line). (b) CD spectra of (G3TTA)3G3 in 57% ethanol (blue long dashes); KCl was added to the oligonucleotide in 57% ethanol to 1 mM (cyan dashed line) and 2 mM concentration (cyan long dashed line) measured immediately and after 5 days (red dash-dot line). Red line corresponds to parallel quadruplex measured after thermal denaturation and cooling to room temperature

sodium ions [26] or by a fragment from the human BCL-2 promoter region [27]. The human telomeric sequence (G3TTA)4G3, containing five G3 blocks [8], also forms the hybrid quadruplex structure, in which an extra G3TTA repeat forms the propeller-type loop [10]. The hybrid structure is also stabilized by different overhanging sequences appended to the basic human telomere motif (G3TTA)3G3: for instance, TA(G3TTA)3G3 and A3(G3TTA)3G3A2 form hybrid-1 (with propeller folding of the first TTA loop), and the sequence TA(G3TTA)3G3TT forms hybrid-2 (with propeller folding of the third TTA loop) in the presence of potassium ions. These two hybrid structures differ in loop topology, but they have the same tetrad constitution and the same CD spectra [28].
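The band positions quoted in this section can be collected into a small lookup, shown here as a minimal sketch rather than a validated classifier. The names `BAND_SIGNATURES`, `signal_near`, and `candidate_forms`, the 5 nm tolerance, and the `min_abs` noise threshold are illustrative assumptions, and the example spectrum is hypothetical. Only the band signs stated in the text are encoded, so G/C-block duplexes, which mimic the parallel quadruplex, cannot be separated this way.

```python
# Band signs (+1 positive, -1 negative) at the wavelengths (nm) quoted in the text.
BAND_SIGNATURES = {
    "parallel quadruplex": {210: +1, 260: +1},
    "A-form duplex": {210: -1, 260: +1},
    "antiparallel quadruplex": {210: +1, 240: +1, 260: -1, 295: +1},
    "hybrid (3+1) quadruplex": {210: +1, 240: -1, 260: +1, 290: +1},
}


def signal_near(spectrum, wavelength, tolerance_nm=5.0):
    """Return the CD value measured closest to `wavelength`.

    `spectrum` maps wavelength (nm) to CD signal. None is returned when no
    measured point lies within `tolerance_nm` (an assumed tolerance).
    """
    if not spectrum:
        return None
    nearest = min(spectrum, key=lambda wl: abs(wl - wavelength))
    if abs(nearest - wavelength) > tolerance_nm:
        return None
    return spectrum[nearest]


def candidate_forms(spectrum, min_abs=1.0):
    """List every form whose expected band signs all agree with the spectrum.

    Signals weaker than `min_abs` (an assumed noise threshold, in the units
    of the spectrum) are treated as "no band" and fail the comparison.
    """
    candidates = []
    for form, bands in BAND_SIGNATURES.items():
        matches = True
        for wavelength, sign in bands.items():
            value = signal_near(spectrum, wavelength)
            if value is None or abs(value) < min_abs or (value > 0) != (sign > 0):
                matches = False
                break
        if matches:
            candidates.append(form)
    return candidates


# Hypothetical spectrum (mdeg) with the antiparallel pattern described above.
example = {210: 6.0, 240: 3.0, 260: -8.0, 295: 15.0}
print(candidate_forms(example))
```

A result with several candidates, or none, is a prompt to inspect the whole spectrum and to compare thermal stabilities in different cations, as the text recommends.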


3.5 Suitable Complementary Methods

1. Molecularity (PAGE): Native polyacrylamide gel electrophoresis (PAGE) complements CD spectroscopy with information about the number of chains in a G-quadruplex. The PAGE must be run under the same experimental conditions as those used for recording the CD spectra. The same buffer must be used for the preparation of samples, the PAGE gel, and the electrophoretic buffer. Ideally, the samples measured by CD are loaded directly on native PAGE; however, this is possible only with some combinations of DNA concentration and staining method (for example, CD measurement in a 0.1 cm cell and a gel stained with Stains-All (Sigma)). In addition to the commercial ladder, it is recommended to use as a migration marker the appropriate heteroduplex, which is a mixture of the studied sequence with its complementary strand. It is not correct to use thymine ladders because thymine oligonucleotides migrate much more slowly than expected from their lengths [29]. This has been the source of many misinterpretations in the literature. According to our observations, parallel and antiparallel quadruplexes can be distinguished by gel migration: e.g., the 9-mer G3TTAG3 can isomerize between antiparallel and parallel quadruplex in 150 mM KCl. The gel revealed two separately migrating bimolecular species. The antiparallel quadruplex migrated more quickly than the parallel one (Fig. 6) [8]. We arrived at the same conclusion with parallel and antiparallel duplexes [30].

2. Study of G-quadruplexes using UV absorption spectra: In most dichrographs, CD spectra are recorded together with absorption spectra. The formation of quadruplexes is accompanied by hyperchromicity in the long-wavelength region around 296 nm and, conversely, their thermal denaturation can be monitored at this wavelength. Depending on the magnitude of the hyper/hypochromicity, it is possible to differentiate between parallel and antiparallel quadruplexes. Although parallel quadruplexes are generally more stable than antiparallel ones, they exhibit a lower hyperchromic effect upon formation (Fig. 7). We recommend plotting thermal melting as actual changes in ε [M−1 cm−1] in addition to the commonly used curves normalized between 1 and 0. This provides additional information: for example, the extent of denaturation or quadruplex formation may be limited (e.g., because of damage), or the hypo/hyperchromic effect may be affected by flanking bases that are not part of the quadruplex.
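The difference between melting curves plotted in ε units and curves normalized between 1 and 0 can be illustrated with a short sketch. The function names and the absorbance readings below are hypothetical and chosen only for illustration: two curves with different amplitudes become indistinguishable after normalization, while the ε scale keeps the size of the hyperchromic effect.

```python
def absorbance_to_epsilon(absorbances, conc_molar, path_cm):
    """Convert absorbance readings to molar absorption coefficients [M^-1 cm^-1]
    using the Beer-Lambert law, epsilon = A / (c * l)."""
    return [a / (conc_molar * path_cm) for a in absorbances]


def normalize_1_to_0(values):
    """Scale a melting curve so that its maximum is 1 and its minimum is 0."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        raise ValueError("flat curve cannot be normalized")
    return [(v - lowest) / (highest - lowest) for v in values]


# Hypothetical absorbances at 297 nm for rising temperature (1e-4 M, 1 cm cell).
curve_a = [0.30, 0.25, 0.20]
curve_b = [0.20, 0.175, 0.15]

eps_a = absorbance_to_epsilon(curve_a, 1e-4, 1.0)
eps_b = absorbance_to_epsilon(curve_b, 1e-4, 1.0)

# The change in epsilon differs by a factor of two between the two samples ...
print(eps_a[0] - eps_a[-1], eps_b[0] - eps_b[-1])
# ... but the normalized curves have the same shape.
print(normalize_1_to_0(eps_a), normalize_1_to_0(eps_b))
```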

Fig. 6 CD spectra of G3TTAG3 in 10 mM potassium phosphate buffer, pH 7, and 150 mM KCl measured immediately (blue line) and after 1 day (blue dashed line), 3 days (red dashed line), and 10 days (red line). Native PAGE in 150 mM KCl, pH 7.2, at 0 °C. The samples of G3TTAG3 were loaded immediately (lane 1) and after 2 days of incubation (lane 2). The heteroduplexes serve as markers

Fig. 7 Melting curves observed based on UV absorption at 297 nm of parallel (red line) d(AATG3TG2T3G3TG3TAA) and antiparallel/hybrid quadruplex (blue line) d(AATG2TG3T3G3TG3TAA). Insert: CD spectra of the measured quadruplexes in 10 mM K-phosphate buffer, pH 7, and 110 mM KCl. Thermal difference spectra (TDS) calculated as the difference of denatured and native states

4 Notes

1. Commercially synthesized, lyophilized oligonucleotides are dissolved in Milli-Q water or, better, in a low-salt solution (1 mM sodium/potassium phosphate buffer, pH 7, with 0.3 mM EDTA) to give stock solutions of about 100 OD/mL, i.e., ~10 mM in nucleosides. Addition of EDTA in millimolar concentrations prevents DNA structural changes caused by traces of divalent cations. Oligonucleotide stock solutions are stored at −20 °C. As a rule, oligonucleotides with desalt purification (usually Sephadex-based gel filtration) are sufficient for circular dichroism measurements. For measurements at higher DNA concentrations, centrifugal filters with the smallest pore size (e.g., Amicon Ultra 3 kDa) can be used to transfer the sample into a defined ionic environment. HPLC or PAGE purification is used when higher sample homogeneity is required, or for oligonucleotides with modified bases as recommended by the provider. The length homogeneity of samples is checked by denaturing PAGE.

2. There are several manufacturers of dichrographs in the world, some offering several models. We currently use the Jasco J-815 (Jasco Inc., Japan) and the Chirascan Plus (Applied Photophysics, UK). The devices differ significantly in several technical and operational parameters, including the type of detector, which influences the S/N ratio, the wavelength range, and the absorbance range; the amount of nitrogen used for purging; the time necessary to start the device (minutes to hours); and the modes of measurement. In general, spectra are taken by step scanning (e.g., 0.5 nm step with 0.5 s time-per-point), but some dichrographs (namely Jasco) can also operate in continuous mode (e.g., 100 nm/min); this mode usually gives better-looking raw spectra, but some scientists may consider it a variant of smoothing.
The instrument must be calibrated, e.g., by (1S)-(+)-10-camphorsulfonic acid (1 g/L gives, in a 0.1 cm cell, characteristic peaks of −69.5 mdeg at 192.5 nm and 33.5 mdeg at 291 nm) or by the non-hygroscopic (1S)-(+)-10-camphorsulfonic acid ammonium salt (a 0.06% aqueous solution measured in a 1 cm cell gives a CD signal of 190.4 mdeg at 291 nm).

3. To measure CD spectra of nucleic acids, which absorb in the ultraviolet spectral region, quartz cells have to be used. Rectangular quartz cells are available in a wide choice of path lengths ranging from 5 to 0.001 cm, which allows DNA to be measured at various concentrations. The most commonly used 1 cm path length cells are made in variants with a narrowed inner space (0.2 or 0.4 cm instead of 1 cm), which enables


the amount of DNA sample to be saved. If the cell side walls are made from transparent material, it is necessary to check and properly adjust the path of the light beam so that it does not interfere with the cell walls. Otherwise, a metallic mask has to be used to allow only part of the light beam to pass. Note that in this case the reduced light input leads to reduced spectra quality. Some cells have a raised bottom, which allows a further decrease in sample consumption. Otherwise, in cases where the inner sample chamber almost reaches the bottom of the cell, the cell can be placed on top of a home-made metal prism so that the beam passes only through the sample. Generally, unless a higher DNA concentration is needed, we prefer to use 1 cm path length cells. In experiments where strongly absorbing compounds are present, the unwanted absorption of buffers can be reduced by using cells with very short path lengths (and an increased concentration of the measured samples). Microcells with a 1 cm optical path are used for measuring small sample volumes (e.g., 100 μL), such as PCR products or samples for studying the interaction of DNA with proteins.

4. Sodium or potassium phosphate buffer, prepared by mixing NaH2PO4 with Na2HPO4 (or KH2PO4 with K2HPO4) in a ratio that gives the desired pH (1:1 gives a pH around 6.9 and provides a well-defined ionic strength), is used most frequently. Phosphate buffer is very suitable for CD measurements owing to its low absorption.

5. Britton-Robinson buffer (pH range from 2 to 12) is used in studies involving changes in pH beyond the phosphate buffer optimum. The amount of added alkali ions, which increases the ionic strength, must be accounted for.

6. Other suitable buffers: Tris–HCl buffer is commonly used for pH 7–9 in most molecular biology techniques. Note that its pH strongly depends on temperature. Sodium or lithium cacodylate buffers are used for the pH region 5–7.4, but they are highly toxic.
Acetate buffer is also suitable for measurements at acidic pH (3–6). Buffers or other components used in molecular-biological experiments (HEPES, Triton, DMSO) often absorb strongly below 250 nm, which hinders CD measurements.

7. Before measuring CD spectra, one has to set the wavelength range (usually 190–330 nm for narrow path cells and 210–330 nm for 1 cm cells), the speed of data acquisition (depending on the magnitude of the CD signal of the structure: lower CD signals need a slower speed, which gives better spectra but takes more time), the number of consecutive spectra


averaged together to enhance the signal-to-noise ratio (increasing the number of accumulations provides smoother curves than a corresponding reduction of the speed), and the temperature of the cell compartment. By setting these parameters, we can optimize the smoothness and signal-to-noise ratio of the measured spectrum. The absorbance corresponding to the optimal signal-to-noise ratio is ~0.8, and it should be adjusted at the selected wavelength (around 260 nm for nucleic acids) by an appropriate combination of cell path length and sample concentration.

8. It is useful to determine the minimum measurable volume in the cell in order to get correct spectra, to save DNA samples, and to leave space for the addition of, e.g., ethanol or other agents. A sample solution is gradually added to a cell until two consecutively recorded spectra are identical. If the beam passes only through a portion of the sample and partly through the air above the sample surface, the signal is smaller than that of the full cell.

9. It is good practice to start with an unstructured sequence and to keep the induction of the desired quadruplex under control. Thus, prior to CD measurement, all oligonucleotides in cells should be subjected to thermal denaturation for 5 min at 90 °C at low ionic strength (1 mM sodium phosphate buffer, pH 7, with 0.3 mM EDTA) to disrupt any secondary structures (especially those consisting of associated chains) formed at the high DNA concentration of the stock solution. Some G-rich oligonucleotides are difficult to dilute because they form heavy aggregates that cannot be removed by heating or further dilution. In that case the following procedure helps: alkalization by adding 10–100 mM LiOH to give pH 11–12 for 10 min, followed by neutralization with diluted HCl.
The denaturation can be combined with a measurement of absorption at 260 nm to determine the sample concentration, either in an absorption spectrophotometer or, better, directly in the dichrograph (the observed CD spectrum gives some idea about the contribution of the chromophores, i.e., of the unstructured primary sequence).

10. Oligonucleotide DNA/RNA concentration is determined based on the Beer–Lambert law using the molar absorption coefficient, which can be calculated from Eq. (1) given by Gray et al. [31]. This equation takes into account the contributions of the individual mononucleosides and nearest-neighbor dinucleotides in the given oligonucleotide. Molar absorption coefficients of the individual monomers and dimers are given in Table 1.


Table 1 Molar absorption coefficients of deoxyribo/ribo-mononucleotides and dinucleotides [31]

Monomer   DNA ε 260 nm [M−1 cm−1]   Monomer   RNA ε 260 nm [M−1 cm−1]
Ap        15,340                    Up        10,210
Cp        7600
Gp        12,160
Tp        8700

Dimer     DNA ε 260 nm [M−1 cm−1]   Dimer     RNA ε 260 nm [M−1 cm−1] (a)
ApA       13,650                    ApU       12,140
ApC       10,670                    CpU       8370
ApG       12,790                    GpU       10,960
ApT       11,420                    UpA       12,520
CpA       10,670                    UpC       8900
CpC       7520                      UpG       10,400
CpG       9390                      UpU       10,110
CpT       7660
GpA       12,920
GpC       9190
GpG       11,430
GpT       10,220
TpA       11,780
TpC       8150
TpG       9700
TpT       8610

(a) Other RNA dimers have the same values as DNA

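\[ \varepsilon(\mathrm{ApTpCpG}) = \tfrac{1}{4}\left[\,2\varepsilon(\mathrm{ApT}) + 2\varepsilon(\mathrm{TpC}) + 2\varepsilon(\mathrm{CpG}) - \varepsilon(\mathrm{Tp}) - \varepsilon(\mathrm{Cp})\,\right] = 10{,}405\ \mathrm{M^{-1}\,cm^{-1}} \tag{1} \]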

The DNA concentration is then calculated from the Beer–Lambert law using the calculated ε, the cell path length, and the absorbance of the oligonucleotide measured at 260 nm and 90 °C. (Remember that any further addition of solvent or other substances into the cell changes the oligonucleotide concentration, which must be recalculated according to the volume increase.)

11. Unstructured oligonucleotides provide CD spectra corresponding to their primary sequence. This spectrum contributes to the spectrum of the oligonucleotide quadruplex. When quadruplexes of two rather similar sequences that differ, for example, in loop nucleotide composition are compared (Fig. 8), it might be beneficial to subtract the spectra of their


Fig. 8 CD spectra of ATTT(T2AG3)4 (red) and AGXG(T2AG3)4 (black). X stands for an apurinic site. (a) CD spectra measured in 10 mM potassium phosphate buffer, pH 7, with 100 mM KCl. (b) CD spectra measured in 1 mM sodium phosphate buffer, pH 7, the difference of these spectra (green), and the difference of the calculated theoretical spectra (blue). (c) CD spectra from the left panel after subtraction of the corresponding low-salt spectra in the middle panel

primary sequence; as a result, the originally different spectra may become similar [32]. The spectrum of the primary sequence can either be calculated using the nearest-neighbor model according to Eq. (1) [33] or, more easily, be taken as the spectrum of the unstructured oligonucleotide measured in low-salt solution. These two types of "primary sequence CD spectra" might differ. Beware that some quadruplexes, especially parallel ones with three or more tetrads, may already be formed in low-salt conditions.

12. Measured spectra are usually obtained in mdeg units. To express them in Δε [M−1 cm−1], we use the relation Δε = θ/(32,980 · c · l), where θ is the measured ellipticity [mdeg], c is the molar concentration [M], and l is the optical path of the cell [cm]. Usually Δε or θ is related to the concentration of strands (molecules), which allows spectra of sequences of different lengths to be compared (for example, a short G4-forming sequence compared with the same sequence missing some nucleotide or flanked by a short sequence). On the other hand, in some cases (especially with repetitive sequences), expression per base or per repeat is beneficial so that the same amount of sample is compared: structural changes within one unit can be compared with those of multiples of the unit.
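The arithmetic of Notes 10 and 12 can be sketched in a few lines. This is a minimal illustration, not the authors' software: the function names are hypothetical, the coefficients are copied from Table 1, and the nearest-neighbor sum is written per nucleotide so that it reproduces the ApTpCpG example of Eq. (1).

```python
# Molar absorption coefficients at 260 nm [M^-1 cm^-1], copied from Table 1.
MONOMER_EPS = {"A": 15340, "C": 7600, "G": 12160, "T": 8700, "U": 10210}
DIMER_EPS = {
    "AA": 13650, "AC": 10670, "AG": 12790, "AT": 11420,
    "CA": 10670, "CC": 7520, "CG": 9390, "CT": 7660,
    "GA": 12920, "GC": 9190, "GG": 11430, "GT": 10220,
    "TA": 11780, "TC": 8150, "TG": 9700, "TT": 8610,
    # RNA dimers containing U; other RNA dimers share the DNA values.
    "AU": 12140, "CU": 8370, "GU": 10960,
    "UA": 12520, "UC": 8900, "UG": 10400, "UU": 10110,
}


def epsilon_per_nucleotide(sequence):
    """Nearest-neighbor estimate of epsilon(260 nm) per nucleotide.

    Twice the sum over all adjacent dimers, minus the sum over internal
    monomers, divided by the number of nucleotides (the form of Eq. 1).
    """
    seq = sequence.upper()
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two nucleotides")
    dimer_sum = sum(DIMER_EPS[seq[i:i + 2]] for i in range(len(seq) - 1))
    internal_sum = sum(MONOMER_EPS[base] for base in seq[1:-1])
    return (2 * dimer_sum - internal_sum) / len(seq)


def nucleotide_concentration(a260, epsilon, path_cm):
    """Beer-Lambert law: molar nucleotide concentration from absorbance."""
    return a260 / (epsilon * path_cm)


def diluted_concentration(conc, initial_volume, added_volume):
    """Recalculate the concentration after adding solvent or reagent
    (both volumes in the same unit)."""
    return conc * initial_volume / (initial_volume + added_volume)


def delta_epsilon(theta_mdeg, conc_molar, path_cm):
    """Convert ellipticity in mdeg to delta-epsilon [M^-1 cm^-1] (Note 12)."""
    return theta_mdeg / (32980 * conc_molar * path_cm)


eps = epsilon_per_nucleotide("ATCG")            # 10405.0, as in Eq. (1)
conc = nucleotide_concentration(0.8, eps, 1.0)  # hypothetical A260 of 0.8, 1 cm cell
print(eps, conc, delta_epsilon(5.0, conc, 1.0))
```

To relate Δε to strands rather than nucleotides, the nucleotide concentration would be divided by the sequence length before the conversion.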


Acknowledgments

This work was supported by the Czech Science Foundation (grant No. 17-12075S) and by the project SYMBIT reg. number CZ.02.1.01/0.0/0.0/15_003/0000477 financed by the ERDF.

References

1. Vorlickova M, Kejnovska I, Bednarova K, Renciuk D, Kypr J (2012) Circular dichroism spectroscopy of DNA: from duplexes to quadruplexes. Chirality 24(9):691–698. https://doi.org/10.1002/chir.22064
2. Pohl FM, Jovin TM (1972) Salt-induced cooperative conformational change of a synthetic DNA: equilibrium and kinetic studies with poly(dG-dC). J Mol Biol 67(3):375–396
3. Wang AHJ, Quigley GJ, Kolpak FJ, Crawford JL, van Boom JH, van der Marel G, Rich A (1979) Molecular structure of a left-handed double helical DNA fragment at atomic resolution. Nature 282(5740):680–686
4. Biffi G, Tannahill D, McCafferty J, Balasubramanian S (2013) Quantitative visualization of DNA G-quadruplex structures in human cells. Nat Chem 5(3):182–186. https://doi.org/10.1038/nchem.1548
5. Neidle S, Balasubramanian S (2006) Quadruplex nucleic acids. Royal Society of Chemistry, Cambridge, London
6. Masiero S, Trotta R, Pieraccini S, De Tito S, Perone R, Randazzo A, Spada GP (2010) A non-empirical chromophoric interpretation of CD spectra of DNA G-quadruplex structures. Org Biomol Chem 8(12):2653–2872
7. Vorlickova M, Kejnovska I, Sagi J, Renciuk D, Bednarova K, Motlova J, Kypr J (2012) Circular dichroism and guanine quadruplexes. Methods 57(1):64–75. https://doi.org/10.1016/j.ymeth.2012.03.011
8. Vorlickova M, Chladkova J, Kejnovska I, Fialova M, Kypr J (2005) Guanine tetraplex topology of human telomere DNA is governed by the number of (TTAGGG) repeats. Nucleic Acids Res 33(18):5851–5860. https://doi.org/10.1093/nar/gki898
9. Ambrus A, Chen D, Dai JX, Bialis T, Jones RA, Yang DZ (2006) Human telomeric sequence forms a hybrid-type intramolecular G-quadruplex structure with mixed parallel/antiparallel strands in potassium solution. Nucleic Acids Res 34(9):2723–2735. https://doi.org/10.1093/nar/gkl348
10. Luu KN, Phan AT, Kuryavyi V, Lacroix L, Patel DJ (2006) Structure of the human telomere in K+ solution: an intramolecular (3+1) G-quadruplex scaffold. J Am Chem Soc 128(30):9963–9970. https://doi.org/10.1021/ja062791w
11. Simonsson T, Pecinka P, Kubista M (1998) DNA tetraplex formation in the control region of c-myc. Nucleic Acids Res 26(5):1167–1172
12. Parkinson GN, Lee MPH, Neidle S (2002) Crystal structure of parallel quadruplexes from human telomeric DNA. Nature 417(6891):876–880
13. Phan AT, Modi YS, Patel DJ (2004) Propeller-type parallel-stranded G-quadruplexes in the human c-myc promoter. J Am Chem Soc 126(28):8710–8716. https://doi.org/10.1021/ja048805k
14. Olsen CM, Marky LA (2010) Monitoring the temperature unfolding of G-quadruplexes by UV and circular dichroism spectroscopies and calorimetry techniques. In: Baumann P (ed) G-quadruplex DNA, vol 608. Humana Press, New York, NY
15. Hároníková L, Coufal J, Kejnovská I, Jagelská EB, Fojta M, Dvořáková P, Muller P, Vojtesek B, Brázda V (2016) IFI16 preferentially binds to DNA with quadruplex structure and enhances DNA quadruplex formation. PLoS One 11(6):e0157156. https://doi.org/10.1371/journal.pone.0157156
16. Do NQ, Lim KW, Teo MH, Heddi B, Phan AT (2011) Stacking of G-quadruplexes: NMR structure of a G-rich oligonucleotide with potential anti-HIV and anticancer activity. Nucleic Acids Res 39(21):9448–9457
17. Poon K, Macgregor RB (2000) Formation and structural determinants of multi-stranded guanine-rich DNA complexes. Biophys Chem 84(3):205–216
18. Xiao C-D, Shibata T, Yamamoto Y, Xu Y (2018) An intramolecular antiparallel G-quadruplex formed by human telomere RNA. Chem Commun 54(32):3944–3946. https://doi.org/10.1039/C8CC01427B
19. Stefl R, Trantirek L, Vorlickova M, Koca J, Sklenar V, Kypr J (2001) A-like guanine-guanine stacking in the aqueous DNA duplex of d(GGGGCCCC). J Mol Biol 307(2):513–524. https://doi.org/10.1006/jmbi.2001.4484
20. Trantirek L, Stefl R, Vorlickova M, Koca J, Sklenar V, Kypr J (2000) An A-type double helix of DNA having B-type puckering of the deoxyribose rings. J Mol Biol 297(4):907–922
21. Kypr J, Fialova M, Chladkova J, Tumova M, Vorlickova M (2001) Conserved guanine-guanine stacking in tetraplex and duplex DNA. Eur Biophys J 30(7):555–558
22. Simonsson T (2001) G-quadruplex DNA structures - variations on a theme. Biol Chem 382(4):621–628
23. Renciuk D, Kejnovska I, Skolakova P, Bednarova K, Motlova J, Vorlickova M (2009) Arrangements of human telomere DNA quadruplex in physiologically relevant K+ solutions. Nucleic Acids Res 37(19):6625–6634. https://doi.org/10.1093/nar/gkp701
24. Vorlickova M, Bednarova K, Kypr J (2006) Ethanol is a better inducer of DNA guanine tetraplexes than potassium cations. Biopolymers 82(3):253–260
25. Palacky J, Vorlickova M, Kejnovska I, Mojzes P (2013) Polymorphism of human telomeric quadruplex structure controlled by DNA concentration: a Raman study. Nucleic Acids Res 41(2):1005–1016. https://doi.org/10.1093/nar/gks1135
26. Wang Y, Patel DJ (1994) Solution structure of the Tetrahymena telomeric repeat d(T(2)G(4))(4) G-tetraplex. Structure 2(12):1141–1156
27. Dai J, Dexheimer TS, Chen D, Carver M, Ambrus A, Jones RA, Yang D (2006) An intramolecular G-quadruplex structure with mixed parallel/antiparallel G-strands formed in the human BCL-2 promoter region in solution. J Am Chem Soc 128(4):1096–1098. https://doi.org/10.1021/ja055636a
28. Kejnovská I, Bednářová K, Renčiuk D, Dvořáková Z, Školáková P, Trantírek L, Fiala R, Vorlíčková M, Sagi J (2017) Clustered abasic lesions profoundly change the structure and stability of human telomeric G-quadruplexes. Nucleic Acids Res 45(8):4294–4305. https://doi.org/10.1093/nar/gkx191
29. Kejnovska I, Kypr J, Vorlickova M (2007) Oligo(dT) is not a correct native PAGE marker for single-stranded DNA. Biochem Biophys Res Commun 353(3):776–779. https://doi.org/10.1016/j.bbrc.2006.12.093
30. Kejnovska I, Tumova M, Vorlickova M (2001) (CGA)4: parallel, anti-parallel, right-handed and left-handed homoduplexes of a trinucleotide repeat DNA. Biochim Biophys Acta 1527:73–80
31. Gray DM, Hung SH, Johnson KH (1995) Absorption and circular dichroism spectroscopy of nucleic acid duplexes and triplexes. Methods Enzymol 246:19–34
32. Dvořáková Z, Vorlíčková M, Renčiuk D (2017) Spectroscopic insights into quadruplexes of five-repeat telomere DNA sequences upon G-block damage. Biochim Biophys Acta 1861(11, Part A):2750–2757. https://doi.org/10.1016/j.bbagen.2017.07.019
33. Cantor CR, Warshaw MM, Shapiro H (1970) Oligonucleotide interactions. III. Circular dichroism studies of the conformation of deoxyoligonucleotides. Biopolymers 9(9):1059–1077. https://doi.org/10.1002/bip.1970.360090909

Chapter 3

Revealing the Energetics of Ligand-Quadruplex Interactions Using Isothermal Titration Calorimetry

Andrea Funke and Klaus Weisz

Abstract

The thermodynamic characterization of G4-ligand interactions has proven to be a powerful adjunct to structural information in the rational design and optimization of potent G-quadruplex ligands for use in therapeutics, diagnostics, or other technological applications. Isothermal titration calorimetry (ITC) can resolve energetic contributions to complex formation and constitutes the only available experimental method to directly measure binding enthalpies. A general protocol for using ITC in studies on quadruplex-ligand interactions with details on the experimental setup, data analysis, and potential pitfalls is presented. The methodologies used are illustrated with results obtained from the targeting of a parallel DNA G-quadruplex with a G4-binding indoloquinoline derivative.

Key words Isothermal titration calorimetry, Thermodynamics, Drug-DNA interaction, G-Quadruplex, Indoloquinoline

Danzhou Yang and Clement Lin (eds.), G-Quadruplex Nucleic Acids: Methods and Protocols, Methods in Molecular Biology, vol. 2035, https://doi.org/10.1007/978-1-4939-9666-7_3, © Springer Science+Business Media, LLC, part of Springer Nature 2019

1 Introduction

Over the past years significant effort has been devoted to the search for G-quadruplex (G4) targeting ligands. The plethora of studies on G4-ligand interactions mostly derives from the realization that G4-forming DNA tracts are widely distributed in the human genome and frequently occur in regulatory genomic regions associated with uncontrolled cell proliferation and oncogenesis [1]. Consequently, these tetra-stranded nucleic acid structures have been identified as attractive drug targets for novel anticancer strategies [2–4]. On the other hand, various DNAzymes and nucleic acid aptamers that are based on the G-quadruplex scaffold rely on specific intermolecular interactions with other binding partners for their ever-increasing use in bio- and nanotechnological applications [5–7].

For a rational drug design, the binding event should ideally be described in terms of both the resulting complex structure and the thermodynamics of complex formation. Whereas the three-dimensional structure of a complex may reveal specific interactions at the interface between components, structural details hardly provide insight into the contribution of the molecular forces that drive the association. Studying the thermodynamics of a molecular interaction involves quantification of energy changes when going from free to complexed species. In contrast to spectroscopy-based methodologies, isothermal titration calorimeters (Fig. 1) allow for a very accurate direct measurement of the heat associated with complex formation without any chemical modification or immobilization of the interacting components [8, 9]. As an additional benefit, molar enthalpies of association can directly be derived without resorting to van't Hoff-based analyses and their inherent limitations. Thus, with the availability of high-sensitivity calorimeters, ITC is expected to become increasingly important for the future development of G-quadruplex ligands with optimized binding characteristics.

Because isothermal titration calorimetry provides not only the binding enthalpy ΔH° but also the Gibbs free energy ΔG° and the entropy ΔS° in one experiment, a complete thermodynamic profile for an association process can be obtained. This offers valuable information on factors that govern the binding event, such as specific intermolecular interactions or hydrophobic and electrostatic contributions with their different impact on affinity and specificity. To not only give instructions on the general experimental setup but also illustrate the use of isothermal titration calorimetry in studies on quadruplex-ligand interactions, the outcome of ITC experiments is exemplified by the binding of a phenyl-indoloquinoline with a cationic side chain to a well-characterized parallel G-quadruplex derived from the MYC promoter sequence (Fig. 2). The indoloquinoline derivative has previously been suggested to bind at multiple sites, primarily through stacking on the quadruplex outer tetrads, and may thus serve as a typical representative of a G4-binding low molecular weight ligand [10, 11].

Fig. 1 (a) Schematic diagram of an ITC instrument. Sample and reference cell in an adiabatic jacket are kept at the same temperature by providing a constant thermal power to the reference cell and a feedback-controlled power to the sample cell. When aliquots of the ligand in the syringe are injected into the receptor solution in the sample cell, the heat released or consumed upon exothermic or endothermic binding results in a temperature change of the sample that is compensated by power adjustments of the feedback heater to restore identical temperatures in the cells. (b) The additional compensating thermal power recorded as a function of time results in a differential power signal due to binding events after each injection. As binding sites become saturated, peaks become smaller and eventually only heats of dilution contribute to the recorded signal

Fig. 2 Indoloquinoline PIQ-4m (left) and parallel MYC quadruplex (PDB ID: 1XAV, right)

1.1 ITC Analysis and Basic Thermodynamic Relationships

Integration of the power output gives the additional heat ΔQ released or consumed upon ligand binding after each injection. Because ΔQ equals enthalpic changes in the sample, a complete titration experiment will give a thermogram that can be analyzed by a nonlinear least squares fit to yield the molar binding enthalpy ΔH°, the association constant Ka, and the binding stoichiometry n (Fig. 3a). Standard thermodynamic relationships can then be used to also calculate ΔG° and ΔS° at the given temperature T

\[ \Delta G^\circ = -RT \ln K_a \tag{1} \]

\[ \Delta S^\circ = -\frac{\Delta G^\circ}{T} + \frac{\Delta H^\circ}{T} \tag{2} \]

where R is the universal gas constant.

In a model-free approach using the so-called excess-site method, ΔH° can conveniently be determined without resorting


Fig. 3 (a) The binding curve for a ligand with moderately strong affinity to a G4 receptor possessing equivalent and independent binding sites is characterized by a sigmoidal binding isotherm. The differential heat plotted against the ligand-to-G4 molar ratio is determined by peak integration of the power output after each injection, normalization by the number of moles of added ligand, and correction for the heats of dilution. Molar binding enthalpies, association constants, and stoichiometries are related to the height of the curve, the width of the transition, and the molar ratio at the inflection point, respectively. (b) Complete binding of injected ligand in a corresponding excess-site ITC titration directly yields ΔH°, which may additionally be averaged over the titration steps

to a curve fit with multiple adjustable parameters. Here, aliquots of the ligand are titrated to a large excess of receptor, ensuring complete binding of the ligand in the titration steps. As a result, the differential heat directly equals the molar binding enthalpy if corrected for the heats of dilution (Fig. 3b).

The thermodynamic profile obtained for the association not only reflects direct interactions of the two binding partners, as may be revealed by high-resolution structures, but is also affected by coupled equilibria such as the release or uptake of water molecules, counterions, or protons. If protonation/deprotonation upon complex formation is anticipated, this can be assessed in detail by performing titrations at different pH values and by using buffers with different heats of ionization [12, 13]. Likewise, the release of counterions M+ upon binding of a cationic ligand to the polyanionic G4 can be approached by multiple ITC measurements at different salt concentrations. Contributions ΔGpe to the Gibbs free energy from these coupled polyelectrolyte effects can be determined according to

\[ \Delta G_{pe} = n_{ion}\, RT \ln [\mathrm{M^+}] \tag{3} \]

with n_ion being the number of displaced counterions as determined by the slope of a plot of log Ka versus log[M+] [14, 15]. Heat capacity changes ΔCp° for complex formation can be obtained from the dependence of ΔH° on temperature

\[ \left( \frac{\partial \Delta H^\circ}{\partial T} \right)_p = \Delta C_p^\circ \tag{4} \]

Assuming ΔCp° to be mostly governed by hydrophobic effects associated with the release or uptake of water molecules in biomolecular interactions, contributions ΔG°hyd from hydrophobic transfer of the free ligand to the G4 binding site can be calculated from ΔCp° by the semi-empirical relationship [16–18]

\[ \Delta G^\circ_{hyd} = 80 \cdot \Delta C_p^\circ \tag{5} \]
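The relationships of Eqs. (1) to (5) can be chained in a short script. What follows is a minimal sketch with hypothetical input values (Ka, ΔH°, salt series, temperature series); the function names are illustrative, and taking n_ion as the negative slope of log Ka versus log[M+] is an assumed sign convention for the case where the affinity falls with increasing salt.

```python
import math

R = 8.314462618  # universal gas constant [J mol^-1 K^-1]


def gibbs_free_energy(ka, temperature_k):
    """Eq. (1): standard free energy [J/mol] from the association constant."""
    return -R * temperature_k * math.log(ka)


def entropy(delta_h, delta_g, temperature_k):
    """Eq. (2): standard entropy [J mol^-1 K^-1] from enthalpy and free energy."""
    return (delta_h - delta_g) / temperature_k


def linear_slope(xs, ys):
    """Least-squares slope of ys versus xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def polyelectrolyte_free_energy(n_ion, salt_molar, temperature_k):
    """Eq. (3): polyelectrolyte contribution [J/mol] at a given [M+]."""
    return n_ion * R * temperature_k * math.log(salt_molar)


def hydrophobic_free_energy(delta_cp):
    """Eq. (5): semi-empirical hydrophobic contribution from delta-Cp.
    The factor 80 carries units of K, so J mol^-1 K^-1 in gives J/mol out."""
    return 80.0 * delta_cp


T = 298.15                                   # K
dG = gibbs_free_energy(1.0e6, T)             # hypothetical Ka of 1e6 M^-1
dS = entropy(-40000.0, dG, T)                # hypothetical delta-H of -40 kJ/mol

# Hypothetical salt dependence: log Ka measured at 10, 31.6 and 100 mM M+.
# Assumed convention: n_ion is the negative of the (negative) slope.
n_ion = -linear_slope([-2.0, -1.5, -1.0], [7.0, 6.5, 6.0])
dG_pe = polyelectrolyte_free_energy(n_ion, 0.1, T)

# Eq. (4): delta-Cp as the slope of hypothetical delta-H values versus T.
dCp = linear_slope([288.15, 298.15, 308.15], [-30000.0, -32000.0, -34000.0])
dG_hyd = hydrophobic_free_energy(dCp)

print(dG, dS, n_ion, dG_pe, dCp, dG_hyd)
```

All energies are in J/mol; dividing by 1000 gives the kJ/mol values usually reported.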

Because changes in heat capacity can also be estimated from changes in water-accessible surface areas, ΔCp° can be used to directly correlate thermodynamic and structural information.

1.2 Critical Parameters for the Setup of ITC Experiments

As outlined above, fitting of the ITC binding isotherm based on an appropriate binding model will yield ΔH  , Ka, and the binding stoichiometry n. However, the accuracy of fitted variables critically depends on the shape of the ITC curve that itself is governed by the dimensionless Wiseman parameter c ¼ n Ka M0 with M0 being the total concentration of the receptor molecules (Fig. 4) [19]. The receptor concentration should ideally be selected for the Wiseman parameter to be in a range 50  c  500. If c is too high, nearly complete binding of added titrant until binding site saturation will lead to a rectangular curve with height ΔH  . Whereas stoichiometry and binding enthalpy are obviously well defined, there are not enough data points during the transition to yield an accurate association constant. If c is too low, ligands only bind to a small extent and the ITC curve follows a slightly curved or nearly horizontal path where saturation is never attained and deconvolution of the isotherm fails to yield reliable results. Both of these extremes should therefore be avoided but require an initial estimate of complex stoichiometry and affinity for the system under study (see Note 1).

Materials Use ultrapure water (resistivity of 18.2 MΩ-cm at 25  C obtained by purifying deionized water with a Millipore water purification system) and HPLC grade reagents. All prepared solutions can be stored at room temperature unless indicated otherwise.

2.1 Materials Required for Preparing the ITC Instrument

1. Water.
2. Methanol.
3. Cleaning detergent: 10% (v/v) Decon 90 in water (see Note 2).

2.2 Materials Required for Sample Preparation

1. ITC buffer (appropriate buffer should be selected according to the particular ligand-G4 interaction): 20 mM K+ phosphate (K2HPO4/KH2PO4, pH 7.0), 100 mM KCl, 5% (v/v) DMSO (see Note 3). Dissolve K+ phosphate and KCl in approximately four-fifths of the total water. Add DMSO and adjust the pH of the solution with KOH or HCl. After filling up with water to the final volume, filter the solution with a 0.2 μm filter.
2. HPLC-purified and lyophilized quadruplex-forming oligonucleotide. A less expensive purification by gel filtration is mostly acceptable if the standard protocol involves precipitation of the nucleic acid prior to use as described in Subheading 3.2, step 1.
3. For the DNA precipitation: 3 M K+ acetate solution in water, pH 6.0 (the cation of the acetate salt, mostly K+ or Na+, should match the cation of the buffer solution); absolute ethanol; 85% (v/v) ethanol in water.
4. Dry ligand.
5. UV/vis spectrophotometer (for the determination of concentrations).
6. Thermal block (for DNA annealing).
7. ITC equipment with a ThermoVac.

Fig. 4 Calculated ITC binding isotherms with different values of the Wiseman parameter c

3 Methods

The following protocol aims at providing general instructions applicable to ITC experiments on any ligand-quadruplex association. However, details on the instrumental setup and on experimental parameters are guided by studies on the interaction between the indoloquinoline ligand PIQ-4m and the MYC quadruplex (Fig. 2) using a MicroCal PEAQ-ITC (Malvern Panalytical, Malvern, UK). It is therefore recommended to become familiar with the operation of the particular ITC system by studying the corresponding manual before performing the first measurements. Hints for choosing experimental parameters are provided where appropriate.

3.1 Preparing the ITC Instrument

1. Start the instrument and open the measurement application (see Note 4).
2. Clean both the sample and reference cell by rinsing with one volume of cleaning detergent and at least two volumes of water. A clean system is essential for accurate measurements (see Note 5). Remove any liquid from the cells after cleaning.
3. Fill the reference cell with water (see Notes 6 and 7). Degas water in the ThermoVac for 10 min to avoid formation of air bubbles during the experiments (see Note 8). Draw water into the filling syringe (see Notes 9 and 10) and remove any air bubbles that may have been introduced into the syringe (see Note 11). Insert the syringe into the cell until you gently touch the bottom, raise it about 1 mm, and slowly inject the water up to three quarters of the total cell volume while keeping the syringe in this position. Subsequently, fill the cell with three abrupt spurts until you notice spillover of the solution in the reservoir (see Note 12). Finally, remove excess liquid by inserting the tip of the filling syringe at the ledge formed by the transition of the cell stem into the cell reservoir and sucking off the liquid.

3.2 Preparation of the DNA Sample

1. DNA purification by ethanol precipitation. Dissolve the oligonucleotide in 200 μL of water and add 20 μL of the acetate solution. Precipitate the DNA by adding 900 μL of absolute ethanol. After keeping the mixture in a refrigerator overnight (see Note 13), centrifuge the sample at 14,000 rpm (18,000 × g) for 20 min at 4 °C. Decant the resulting supernatant and wash the remaining pellet with 200 μL of 85% ethanol, followed by an additional centrifugation step. After removal of excess liquid, dry the DNA by lyophilization (see Note 14).
2. Determination of the DNA concentration (see Notes 15 and 16). Dissolve the dry G4-forming oligonucleotide in a defined small volume of water. Prepare a diluted sample (see Note 17), heat it to 80 °C for 5 min to ensure complete DNA unfolding, and measure the absorption at 260 nm. Finally, calculate the amount of DNA and lyophilize the stock solution of the quadruplex again.
3. Preparation of the G4 buffer solution for the ITC measurements (see Note 18). By adding ITC buffer to the nucleic acid, a stock solution of the quadruplex is prepared and further diluted to the appropriate concentration for the ITC experiments (e.g., 20 μM). This DNA solution needs to be annealed. Heat the diluted sample in a pre-warmed thermal block to 85 °C for 6 min. Switch off the heating block to allow the solution to slowly cool down to room temperature. Until use, store the G4 sample at 4 °C.

3.3 Preparation of the Ligand Solution

1. Weigh an appropriate amount of the dry ligand and dissolve it in the ITC buffer (see Note 18).
2. Dilute the ligand stock solution to the desired concentration for the ITC experiments (e.g., 800 μM, see Notes 1 and 19).
3. To avoid erroneous binding parameters, it is essential that the concentration of the ligand solution is determined as accurately as possible (see Note 20). Depending on the ligand, absorption measurements in ITC buffer at room temperature are conveniently used for the determination of ligand concentration (see Note 17). Be careful to use ligand concentrations low enough to exclude aggregation effects.
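The absorbance-based concentration determinations used for both the DNA and the ligand can be sketched with the Beer-Lambert law. The short Python example below is only illustrative: the extinction coefficient, path length, and dilution factor are hypothetical values that must be replaced by those of the actual sample, and the accepted absorbance window of 0.1 to 1 follows Note 17.

```python
def concentration_from_absorbance(absorbance, epsilon, path_cm=1.0,
                                  dilution_factor=1.0):
    """Return the stock concentration in mol/L from a measured absorbance.

    Beer-Lambert law: c = A / (epsilon * l), multiplied by the dilution
    factor to convert the diluted sample back to the stock solution.
    """
    if not 0.1 <= absorbance <= 1.0:
        # Readings outside this window are considered unreliable (Note 17).
        raise ValueError("absorbance should lie between 0.1 and 1")
    return absorbance / (epsilon * path_cm) * dilution_factor


# Hypothetical example: A260 = 0.456 for a 100-fold diluted oligonucleotide
# with an assumed extinction coefficient of 228,000 L/(mol cm).
stock_conc = concentration_from_absorbance(0.456, 228_000,
                                           dilution_factor=100)
print(f"stock concentration: {stock_conc * 1e6:.0f} uM")
```

Raising an error outside the 0.1 to 1 window is a design choice of this sketch; in routine use one would simply prepare a different dilution and measure again.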

3.4 Loading the ITC

1. To save time, pre-temper the cells of the ITC device to the temperature used for subsequent measurements (see Note 21).
2. Degas an appropriate amount of the DNA solution (for filling) and a larger amount of buffer (for rinsing) with the ThermoVac for 10 min. Both solutions are also pre-tempered to the target temperature during the degassing step.
3. Rinse the sample cell three times with buffer solution (see Notes 9 and 22) and remove all residual liquid.
4. Using the filling syringe, fill the sample cell with the degassed quadruplex solution in a bubble-free manner according to the instructions given in Subheading 3.1, step 3.
5. Load the injection syringe of the ITC device with the ligand solution according to the instructions of the manufacturer. Purge and refill the syringe three times using the corresponding button in the measurement application to ensure the absence of any air bubbles. Gently wipe off any droplets of ligand solution on the needle's surface with a paper towel but never touch the tip of the needle.

3.5 Instrument Settings and Start of Measurement

1. Set the total number of injections (e.g., 26). This parameter will depend on the concentrations and injection volumes used and should be selected to yield a full titration curve for extracting Ka and n (see Note 23).
2. Set the appropriate temperature for the measurement (see Note 24).
3. Set the reference power; a good starting point is 4 μcal/s (see Note 25).
4. Set the initial delay to 120 s to establish a stable baseline before the first injection.
5. Enter the syringe and cell concentration (in M) into the appropriate input fields.
6. Set the stirring speed (e.g., 750 rpm, see Note 26).
7. Set the feedback mode (e.g., low, see Note 27).
8. Set the injection volume of the first titration step (e.g., 0.4 μL, see Note 28) and of the following steps (e.g., 1.5 μL). The duration time is automatically adjusted to the injection volume.
9. Set the spacing between injections (e.g., 240 s, see Note 29).
10. After setting the running parameters, carefully insert the injection syringe into the sample cell. Hold the pipette in a vertical position and make sure that the sensitive syringe does not hit any objects and sits properly in the cell. A click sound should be heard. If not, slightly push the syringe further down.
11. Start the experiment (see Note 30).
12. Repeat Subheadings 3.4 and 3.5 for two control experiments by replacing (a) the quadruplex solution in the sample cell and (b) the ligand solution in the injection syringe with pure buffer. Because dilution heats associated with the injection of small volumes of buffer into the quadruplex solution are mostly negligible, the second control experiment will be dispensable in most cases.
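Whether a chosen combination of concentrations, injection volumes, and injection number covers a full titration curve can be checked beforehand by computing the ligand-to-G4 molar ratio after each injection. The Python sketch below uses the example settings from this protocol (20 μM G4, 800 μM ligand, one 0.4 μL injection followed by 1.5 μL injections, 26 injections in total). The cell volume of 200 μL is an assumption that should be replaced by the value for the instrument in use, and the displacement of cell solution by injected titrant is neglected for simplicity.

```python
def molar_ratios(cell_conc, syringe_conc, cell_volume_ul,
                 first_ul, later_ul, n_injections):
    """Ligand-to-receptor molar ratio after each injection.

    Simplification: the cell volume is treated as constant and the
    receptor is not diluted or displaced by the injected titrant.
    """
    ratios = []
    injected_ul = 0.0
    for i in range(n_injections):
        injected_ul += first_ul if i == 0 else later_ul
        ratios.append(syringe_conc * injected_ul
                      / (cell_conc * cell_volume_ul))
    return ratios


ratios = molar_ratios(cell_conc=20e-6, syringe_conc=800e-6,
                      cell_volume_ul=200.0,  # assumed cell volume
                      first_ul=0.4, later_ul=1.5, n_injections=26)
print(f"ratio after first injection: {ratios[0]:.2f}")
print(f"ratio after last injection:  {ratios[-1]:.2f}")
```

Under these assumptions the titration ends at a molar ratio of about 7.6, and the total injected volume is 37.9 μL, which must not exceed the capacity of the injection syringe.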

3.6 Data Analysis

Depending on the particular application used for processing the ITC data, individual steps of the analysis may differ slightly. The steps described below therefore represent a general procedure of data analysis.

1. Adjust the baseline. Most processing programs automatically generate a baseline for the raw ITC data. However, results should be carefully checked and corrected if necessary (see Note 31).
2. Integrate peak areas and normalize the obtained heats to the amount of added titrant (ligand). In most cases, this is done automatically by the analysis application.
3. Subtract the heats of dilution. After integration of the differential power and normalization, the dilution heats from blank titrations (described in Subheading 3.5, step 12) are subtracted from the data of the quadruplex-ligand interaction (see Note 32).
4. Fit the binding isotherm. Analyze the normalized and dilution-corrected data through a nonlinear least squares fit employing an appropriate binding model (Fig. 5). The latter is mostly based on either equivalent binding sites (one-site model) or two different but independent binding sites (two-site model, see Notes 33 and 34).
5. Construct a thermodynamic profile from the extracted binding parameters. Using the fitted parameters Ka and ΔH° (see Note 35), ΔG° and ΔS° are calculated according to Eqs. (1) and (2) in Subheading 1.1. A plot of the enthalpic and entropic contributions to the Gibbs free energy identifies the major driving forces for ligand binding to the quadruplex (Fig. 5). Even more detailed information on the ligand-quadruplex association can be obtained by additional temperature- and salt-dependent ITC measurements, allowing for the calculation of hydrophobic and polyelectrolyte terms of the Gibbs free energy according to Eqs. (3)–(5) in Subheading 1.1 (Fig. 6).

Fig. 5 ITC thermogram (left) of the PIQ-4m ligand titrated to the MYC quadruplex at 40 °C. The upper and lower panels show the heat burst for every injection step and the binding isotherm with the integrated, normalized, and dilution-corrected heat plotted against the ligand-to-G4 molar ratio, respectively. The red line represents the best fit based on a two-site model with high-affinity and low-affinity binding sites (Ka1 = 2.0 × 10⁶ M⁻¹, n1 = 2.5, ΔH°1 = −6.3 kcal/mol, Ka2 = 4.8 × 10³ M⁻¹, n2 = 8.0, ΔH°2 = −2.9 kcal/mol). A complete thermodynamic profile with enthalpic and entropic contributions obtained from best-fit values is plotted to the right for the high-affinity binding

Fig. 6 (a) Plotting ΔH° as a function of temperature T yields the molar heat capacity ΔCp°. (b) The number of cations nion released upon ligand binding can be determined from salt-dependent measurements if the logarithm of Ka is plotted against the logarithm of the cation concentration. (c) Results from temperature- and salt-dependent experiments reveal hydrophobic (ΔG°hyd) and polyelectrolyte (ΔG°pe) contributions to ΔG° as shown for the PIQ-4m-MYC interaction
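The construction of a thermodynamic profile in step 5 can be sketched in a few lines of Python. The example assumes that Eqs. (1) and (2) of Subheading 1.1 are the standard relations ΔG° = −RT ln Ka and ΔG° = ΔH° − TΔS°. The input values are those quoted for the high-affinity site in the caption of Fig. 5, with the binding enthalpy taken as exothermic; this sign is an assumption of the sketch.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)


def thermodynamic_profile(ka, delta_h, temp_k):
    """Return (dG, -T*dS, dS) from a fitted Ka and binding enthalpy.

    dG = -R*T*ln(Ka) and dG = dH - T*dS (assumed forms of Eqs. 1 and 2).
    Energies in kcal/mol, dS in kcal/(mol K).
    """
    delta_g = -R_KCAL * temp_k * math.log(ka)
    minus_t_delta_s = delta_g - delta_h
    delta_s = (delta_h - delta_g) / temp_k
    return delta_g, minus_t_delta_s, delta_s


# High-affinity site at 40 degC; the negative sign of dH is assumed.
dg, minus_tds, ds = thermodynamic_profile(ka=2.0e6, delta_h=-6.3,
                                          temp_k=313.15)
print(f"dG   = {dg:.2f} kcal/mol")
print(f"dH   = -6.30 kcal/mol")
print(f"-TdS = {minus_tds:.2f} kcal/mol")
```

With these inputs both the enthalpic and the entropic term come out favorable, with the enthalpic term making the larger contribution to ΔG°.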

4 Notes

1. Generally, due to the detection limit of the calorimeter, a heat exchange of about 2.5 μcal should follow the first full injection. Values [...] 90 °C in the ITC buffer used.
16. Errors in DNA concentration directly affect the determined binding stoichiometry n.
17. The absorption of the final solution should be between 0.1 and 1 to ensure an accurate determination of the concentration.
18. Dialysis of the DNA solution against the respective buffer followed by preparing a ligand solution in the dialysate is generally recommended in order to avoid a buffer mismatch between titrant and titrate. Differences in buffer composition may cause substantial heats of dilution, considerably affecting measurements. However, due to its prior purification by precipitation and its high melting temperature, which excludes a final determination of concentration in the buffer solution, dialysis of high-melting quadruplexes like MYC may be omitted.
19. In the case of protonatable ligands, the pH of the solution should be carefully controlled, since even small differences of only 0.05 in the pH of the samples may cause mismatches and significant heat effects. These can be much larger than heat effects caused by minor differences in salt concentration.
20. Errors in ligand concentration translate into errors of all parameters determined by ITC measurements, i.e., n, ΔH°, and Ka.
21. In the case of the VP-ITC instrument, pre-tempering is strongly suggested because its two-jacket system needs a longer time for thermal equilibration than the one-jacket system of the iTC200 or PEAQ-ITC. At temperatures below 25 °C it is also recommended to choose a temperature for pre-equilibration 2–3 °C below the actual temperature used for the measurement.
22. Rinsing with buffer instead of the G4 solution will save costs but may result in deviations of up to 2% in the sample concentration.
23. For the determination of ΔH° with an excess-site method, no full titration curve is acquired and only a limited number of injections (e.g., 13) is used. To ensure complete ligand binding for these titration steps, the G4 in the sample cell should be in large excess, typically requiring higher concentrations of the receptor in the sample cell (e.g., 100 μM) and lower concentrations of the titrant in the syringe (e.g., 200 μM with injection volumes of 3 μL).
24. In temperature-dependent ITC measurements, e.g., for a determination of ΔCp°, this parameter must be varied. It should be noted that for temperatures above 50 °C the baseline generally shows a higher level of noise.
25. The reference power depends on the particular interaction studied. If ligand binding is highly exothermic, a high value should be selected (approximately 10 μcal/s for the PEAQ-ITC). In contrast, low values (such as 0.5 to 1 μcal/s) are chosen for strongly endothermic reactions. If exchanged heats are unknown or only moderate, an intermediate value such as 5 μcal/s (PEAQ-ITC) should be used.
26. The heat associated with stirring is determined and eliminated in the equilibration step of the ITC device during the experiment. The stirring speed should be neither too slow nor too fast, in order to ensure good mixing of the components without producing too much heat in addition to the heat of binding.
27. This parameter affects both the sensitivity and the response time. The "high" mode will provide a fast return of the signal to the adjusted reference power (recommended for large signals). However, sensitivity is lowered, resulting in a noisier baseline. On the other hand, a "low" or "none" feedback mode enhances sensitivity but can lead to broadened signals.
28. Due to diffusion effects following the insertion of the injection syringe into the sample cell, the first titration step is not used for the analysis. To minimize consumption of ligand solution, lower volumes are normally applied for the initial injection.
29. Depending on the amplitude of the heat burst and the kinetics of association, the signal may take more or less time to return to baseline. Accordingly, the time between injections should be adapted to the investigated system. Generally, it is better to choose a longer spacing time than a spacing that is too short for reaching baseline, which results in accumulating errors for the following titration steps.

30. After having started the measurement, the quality of the baseline should be checked. If the baseline differs by more than 1 μcal/s from the set reference power at the start of measurement, the instrument may be contaminated or there might be a problem with the sample in the cell, such as aggregation effects due to stirring. Irregular interfering signals in the thermogram could be caused by small air bubbles.
31. When using the analysis software provided with the PEAQ-ITC, make sure that the corresponding fit model is selected before starting the baseline correction. Otherwise, a subsequent change of the model leads to the loss of previous corrections.
32. For the analysis of data obtained by an excess-site ITC experiment, no adaptation is needed. Here, ΔH° is directly given by the normalized and dilution-corrected heat for each titration step, also allowing for averaging over several injections to minimize statistical errors.
33. The total heat Q which develops for an association process in an ITC experiment is given for a single set of identical sites by

$$Q = n\,\alpha\,M_0\,\Delta H^\circ\,V_0 \tag{6}$$

with α and V0 being the fractional saturation of binding sites and the cell volume, respectively. The free ligand concentration L can be expressed by the difference of total ligand concentration L0 and bound ligand concentration

$$L = L_0 - n\,\alpha\,M_0 \tag{7}$$

Substituting L into

$$K_a = \frac{\alpha}{(1-\alpha)\,L} \tag{8}$$

and solving the quadratic equation for α gives

$$Q = \frac{n M_0 \Delta H^\circ V_0}{2}\left[1 + \frac{L_0}{nM_0} + \frac{1}{nK_aM_0} - \sqrt{\left(1 + \frac{L_0}{nM_0} + \frac{1}{nK_aM_0}\right)^2 - \frac{4L_0}{nM_0}}\,\right] \tag{9}$$

Fitting ITC data involves the calculation of the change in heat content from the (i − 1)th to the ith injection

$$\Delta Q(i) = Q(i) - Q(i-1) \tag{10}$$

Likewise, for two sets of independent sites

$$Q = M_0 V_0 \left(n_1\alpha_1\Delta H^\circ_1 + n_2\alpha_2\Delta H^\circ_2\right) \tag{11}$$

$$L = L_0 - M_0\left(n_1\alpha_1 + n_2\alpha_2\right) \tag{12}$$

$$K_{a1} = \frac{\alpha_1}{(1-\alpha_1)\,L} \tag{13}$$

$$K_{a2} = \frac{\alpha_2}{(1-\alpha_2)\,L} \tag{14}$$

Here, solving Eqs. (13) and (14) for α1 and α2 and substituting into (12) yields a cubic equation for L. It can be solved either in closed form or numerically if parameters n1, n2, Ka1, and Ka2 are assigned. Parameters α1 and α2 may finally be calculated from (13) and (14).
34. While most ITC instruments offer processing software that includes a two-site model, algorithms with more complex models such as a three-site model have also been developed and successfully applied [20]. However, due to increasing interdependencies of variables, extreme caution should be exercised when interpreting best-fit values obtained with too many free-floating parameters (without additional restrictions there are nine fit parameters for a three-site model). Also, parameters for weak binding are mostly ill-defined because of small heat signals and the lack of a clearly defined endpoint.
35. In general, binding enthalpies as determined by an excess-site method are preferred over binding enthalpies extracted by curve fitting. More significant systematic differences in ΔH° between the two methodologies may indicate either solubility problems/aggregation effects or binding to non-equivalent binding sites with similar but different affinities.
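The single-site expressions of Note 33 and the Wiseman parameter of Subheading 1.2 can be combined in a short simulation, which may help when choosing concentrations before an experiment. The Python sketch below evaluates Eq. (9) for a series of total ligand concentrations and takes differences according to Eq. (10). All parameter values are hypothetical, and the sketch neglects the dilution and displacement of cell contents by injected titrant, which commercial analysis software typically corrects for.

```python
import math


def total_heat(l0, m0, n, ka, dh, v0):
    """Eq. (9): cumulative heat Q for one set of identical sites.

    l0, m0 in mol/L, ka in L/mol, dh in kcal/mol, v0 in L; Q in kcal.
    """
    a = 1.0 + l0 / (n * m0) + 1.0 / (n * ka * m0)
    root = math.sqrt(a * a - 4.0 * l0 / (n * m0))
    return 0.5 * n * m0 * dh * v0 * (a - root)


def injection_heats(l0_series, m0, n, ka, dh, v0):
    """Eq. (10): heat of each injection, dQ(i) = Q(i) - Q(i-1), Q(0) = 0."""
    heats = []
    previous = 0.0
    for l0 in l0_series:
        q = total_heat(l0, m0, n, ka, dh, v0)
        heats.append(q - previous)
        previous = q
    return heats


def wiseman_c(n, ka, m0):
    """Wiseman parameter c = n * Ka * M0."""
    return n * ka * m0


# Hypothetical system: 20 uM receptor, one site, Ka = 5e6 L/mol.
M0, N, KA, DH, V0 = 20e-6, 1.0, 5.0e6, -6.0, 200e-6
l0_series = [i * 2e-6 for i in range(1, 21)]  # 2 to 40 uM total ligand
heats = injection_heats(l0_series, M0, N, KA, DH, V0)

print(f"c = {wiseman_c(N, KA, M0):.0f}")
print(f"first injection: {heats[0] * 1e9:.2f} ucal")
print(f"last injection:  {heats[-1] * 1e9:.3f} ucal")
```

Dividing each ΔQ(i) by the amount of ligand added in that step gives the normalized heats that are plotted in a binding isotherm. Repeating the calculation with different values of KA shows how the curve shape changes with c.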

Acknowledgments

This work was supported by the Deutsche Forschungsgemeinschaft (INST 292/138-1).

References

1. Hänsel-Hertsch R, Di Antonio M, Balasubramanian S (2017) DNA G-quadruplexes in the human genome: detection, functions and therapeutic potential. Nat Rev Mol Cell Biol 18:279–284
2. Collie GW, Parkinson GN (2011) The application of DNA and RNA G-quadruplexes to therapeutic medicines. Chem Soc Rev 40:5867–5892
3. Zhang S, Wu Y, Zhang W (2014) G-Quadruplex structures and their interaction diversity with ligands. ChemMedChem 9:899–911
4. Neidle S (2017) Quadruplex nucleic acids as targets for anticancer therapeutics. Nat Rev Chem 1:0041
5. Neo JL, Kamaladasan K, Uttamchandani M (2012) G-Quadruplex based probes for visual detection and sensing. Curr Pharm Des 18:2048–2057
6. Wang LX, Xiang JF, Tang YL (2015) Novel DNA catalysts based on G-quadruplex for organic synthesis. Adv Synth Catal 357:13–20
7. Wang F, Liu X, Willner I (2015) DNA switches: from principles to applications. Angew Chem Int Ed 54:1098–1129
8. Ladbury JE, Chowdhry BZ (1996) Sensing the heat: the application of isothermal titration calorimetry to thermodynamic studies of biomolecular interactions. Chem Biol 3:791–801
9. Haq I, Chowdhry BZ, Jenkins TC (2001) Calorimetric techniques in the study of high-order DNA-drug interactions. Methods Enzymol 340:109–149
10. Funke A, Dickerhoff J, Weisz K (2016) Towards the development of structure-selective G-quadruplex binding indolo[3,2-b]quinolines. Chem Eur J 22:3170–3181
11. Funke A, Weisz K (2017) Comprehensive thermodynamic profiling for the binding of a G-quadruplex selective indoloquinoline. J Phys Chem B 121:5735–5743
12. Baker BM, Murphy KP (1996) Evaluation of linked protonation effects in protein binding reactions using isothermal titration calorimetry. Biophys J 71:2049–2055
13. Nguyen B, Stanek J, Wilson WD (2006) Binding-linked protonation of a DNA minor-groove agent. Biophys J 90:1319–1328
14. Manning GS (1978) The molecular theory of polyelectrolyte solutions with applications to the electrostatic properties of polynucleotides. Q Rev Biophys 11:179–246
15. Record MT Jr, Anderson CF, Lohman TM (1978) Thermodynamic analysis of ion effects on the binding and conformational equilibria of proteins and nucleic acids: the roles of ion association or release, screening, and ion effects on water activity. Q Rev Biophys 11:103–178
16. Baldwin RL (1986) Temperature dependence of the hydrophobic interaction in protein folding. Proc Natl Acad Sci U S A 83:8069–8072
17. Ha JH, Spolar RS, Record MT Jr (1989) Role of the hydrophobic effect in stability of site-specific protein-DNA complexes. J Mol Biol 209:801–816
18. Spolar RS, Record MT Jr (1994) Coupling of local folding to site-specific binding of proteins to DNA. Science 263:777–784
19. Wiseman T, Williston S, Brandts JF, Lin LN (1989) Rapid measurement of binding constants and heats of binding using a new titration calorimeter. Anal Biochem 179:131–137
20. Le VH, Buscaglia R, Chaires JB, Lewis EA (2013) Modeling complex equilibria in isothermal titration calorimetry experiments: thermodynamic parameters estimation for a three-binding-site model. Anal Biochem 434:233–241

Chapter 4

Biosensor-Surface Plasmon Resonance: Label-Free Method for Investigation of Small Molecule-Quadruplex Nucleic Acid Interactions

Ananya Paul, Caterina Musetti, Rupesh Nanjunda, and W. David Wilson

Abstract

Biosensor-surface plasmon resonance (SPR) technology is now well established as a quantitative approach for the study of nucleic acid interactions in real time, without the need for labeling any components of the interaction. The method provides real-time equilibrium and kinetic characterization for quadruplex DNA interactions and requires small amounts of materials and no external probe. A detailed protocol for quadruplex-DNA interaction analyses with a variety of binding molecules using biosensor-SPR methods is presented. Explanations of the SPR method with basic fundamentals for use and analysis of results are described with recommendations on the preparation of the SPR instrument, sensor chips, and samples. Details of experimental design, quantitative and qualitative data analyses, and presentation are described. Some specific examples of small molecule-DNA quadruplex interactions are presented with results evaluated by both kinetic and steady-state SPR methods.

Key words Small molecule-nucleic acid interactions, Kinetics, Steady-state analysis, Mass transfer, Biosensor, Surface plasmon resonance

Danzhou Yang and Clement Lin (eds.), G-Quadruplex Nucleic Acids: Methods and Protocols, Methods in Molecular Biology, vol. 2035, https://doi.org/10.1007/978-1-4939-9666-7_4, © Springer Science+Business Media, LLC, part of Springer Nature 2019

1 Introduction

Noncanonical DNA structures, formed by conformational rearrangements of genome regions bearing specific base sequences, are a novel mechanism of gene regulation. Four-stranded G-rich helical structures, known as G-quadruplexes (G4), are among the most actively investigated noncanonical DNA arrangements [1–3]. Discovered in 1910 thanks to a curious phenomenon of guanosine gel formation [4], G-quadruplexes are now one of the main structural elements of the genome. Computational predictions using the pattern match d(G₃₊N₁₋₇G₃₊N₁₋₇G₃₊N₁₋₇G₃₊), where N is any nucleotide base, have identified 375,000 putative quadruplex sequences in the human genome [5, 6]. These noncanonical DNA arrangements are usually found at the ends of human chromosomes (telomeric G-quadruplexes) [7, 8] as well as at promoter regions of many oncogenes, where there is a high population of guanine-rich sequences [9–12].

In physiological conditions, guanine bases can associate through a network of Hoogsteen hydrogen bonds to form planar arrays known as G-tetrads (Fig. 1). The stacking of G-tetrads then leads to the formation of more complex arrangements, G-quadruplex structures, which are stabilized by the presence of monovalent cations (mainly Na+ or K+, Fig. 1) [13–15]. Interestingly, at telomeres, where there is an equilibrium between the single-stranded repeat of the TTAGGG sequence and its G-quadruplex-folded conformation, G-quadruplex structures are an attractive target for therapeutic intervention. In fact, as G-quadruplex-folded telomeres cannot be recognized by the telomerase enzyme, the stabilization of such arrangements by small molecules can lead to the indirect inhibition of the enzyme activity [16, 17]. This is an attractive approach for the development of new selective anticancer agents, as the enzyme is expressed in 85–90% of tumor cells [18–20] while its activity is low, or even absent, in somatic cells [19]. Additionally, as previously mentioned, G-quadruplex structures have also been identified in transcriptional regulatory regions of genes and oncogenes, where they can play a role in expression control mechanisms [21, 22]. Transcriptional repression of oncogenes through small molecule-driven stabilization of these structures could also be seen as an emerging anticancer strategy.

Fig. 1 Schematic outline of (a) a G-quadruplex tetrad and (b) folding topologies of DNA G-quadruplexes in the presence of monovalent cations. Topologies shown in (b): parallel propeller type in K+ solution; antiparallel basket type in Na+ solution; mixed parallel-antiparallel hybrid-1 in K+ solution; mixed parallel-antiparallel hybrid-2 in K+ solution

The thermodynamic signature for the binding of several small molecules to DNA has been extensively investigated, as it can help to better understand the features driving the molecular recognition process [23, 24]. In contrast to the results reported for double-stranded DNA, G-quadruplex binders showed different binding behavior [25, 26]. A rationale for such differences rests on the polymorphic structure of G-quadruplexes [27–29], which is extremely sensitive to solution composition and G-quadruplex sequence [30–32]. G-quadruplex DNA can assume a number of conformational topologies depending on sequence, length, nature of the monovalent cation, presence of loops, crowding agents, and other environmental factors [29, 33, 34]. An example is the intramolecular telomeric sequence hTel22, for which a hybrid-type folding appears to be predominant in potassium-containing solutions, but the coexistence of different conformations in mutual equilibrium has been extensively demonstrated [29, 35–38]. NMR studies reveal that two related telomere sequences in the same experimental conditions assume well-defined but different hybrid forms. Similarly, the 19mer c-myc promoter sequence (Fig. 1) can assume a unique parallel conformation [39, 40], while the full-length G-rich sequence of the c-myc promoter can assume a number of different G-quadruplex-folded conformations [41]. The broad structural complexity of G-quadruplexes represents a multiplicity of different targets for small molecule binding. For this reason, a detailed characterization of the quantitative behavior of organic small molecules that can selectively interact with G-quadruplexes over duplex DNAs is a key quest [42–46].
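The pattern match quoted in the Introduction for predicting putative quadruplex sequences translates directly into a regular expression. The following Python sketch is a minimal illustration and not the implementation used in the cited genome-wide studies [5, 6]. It scans only the given strand, reports non-overlapping matches, and uses a 22-nt telomeric repeat as test input, assumed here to correspond to hTel22.

```python
import re

# Four runs of at least three guanines separated by loops of 1-7 bases.
G4_MOTIF = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)


def find_putative_quadruplexes(sequence):
    """Return (start, end, match) tuples for non-overlapping motif hits.

    Positions are 0-based and end-exclusive; only the given strand is
    searched, so a C-rich complementary strand is not detected.
    """
    seq = sequence.upper()
    return [(m.start(), m.end(), m.group()) for m in G4_MOTIF.finditer(seq)]


telomeric = "AGGGTTAGGGTTAGGGTTAGGG"  # assumed hTel22 sequence
hits = find_putative_quadruplexes(telomeric)
for start, end, match in hits:
    print(start, end, match)
```

For genome-scale searches, the reverse complement would also have to be scanned, and overlapping candidate motifs would need separate handling.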
Nonselective, off-target binding results not only in a significant reduction of the available compound but also in unwanted toxicity [47, 48]. Therefore, quantitative binding studies can demonstrate enhanced selectivity of specific ligands for a quadruplex sequence and are of fundamental importance in the development of new therapeutic agents. To establish a basic quantitative characterization of ligand-G4 DNA interactions, it is essential to determine a set of thermodynamic parameters [49]. In the best case scenario, the binding affinity (the equilibrium constant, K, and Gibbs energy of binding, ΔG), stoichiometry (n, the number of compounds bound to the biopolymer), cooperative effects in binding, and binding kinetics (the rate constants, k, that define the dynamics of the interaction) should be determined. These fundamental parameters are the keys to a molecular understanding of ligand-target interactions and how they translate into a specific phenotype. In order to define such parameters, an accurate method of determining the concentration of the bound and unbound species is required. Such information should be determined as a function of concentration at equilibrium for accurate K and n, and as a function of time for k determination. For each system then, the question becomes how to accurately determine the necessary concentrations as a function of time and reactant concentration, solution conditions, temperature, etc.

Biosensor surface plasmon resonance (SPR) is a label-free method that is operational down to very low concentrations [49–51] and can provide all of the desired information. SPR responds to the refractive index or mass changes at the biospecific sensor surface upon complex formation [52]. Since the SPR signal responds directly to the amount of compound bound in real time, as opposed to the indirect signals at equilibrium that are obtained for many physical measurements, it provides a very powerful method to study biomolecular interaction thermodynamics and kinetics. The use of the SPR signal and direct mass response to monitor biomolecular reactions also overcomes many difficulties with labeling or characterizing the diverse properties of biomolecules [49–52].

To illustrate the use of the biosensor-SPR method in G-quadruplex DNA-small molecule interactions, a cationic compound with a highly curved shape and four linked furan rings, DB1464 (Fig. 3a), was selected. DB1464 was tested for selective G-quadruplex DNA recognition based on the idea that planar and highly curved molecules do not have the proper shape to fit into the minor groove of the double helix [42, 45, 53]. This planar, aromatic, curved shape, however, can effectively stack on the terminal G-tetrads of quadruplexes [54–57]. These features are an advantage in terms of selectivity for a potential G-quadruplex-binding approach, which is reinforced by the fact that G-quadruplex structures are characterized by the presence of external planar tetrads available for π-π stacking interactions with ligands.
Interaction of this compound with distinct DNA arrangements was evaluated using two sequences known to fold into a G-quadruplex conformation: the human telomere sequence hTel22 and cMyc19, a 19mer sequence from the promoter of the proto-oncogene c-Myc. Additionally, a double-stranded hairpin sequence, containing AATT-binding sites, was used to compare the selectivity of the compound between G-quadruplex and duplex DNA.

1.1 Basic Principles of Biosensor-SPR Methods

The results of a biosensor-SPR experiment are typically presented as a series of sensorgrams, in which the SPR-binding signal (response units or RU in Biacore and some other instruments) is shown as a function of time (Fig. 2). With a common dextran sensor surface, immobilization can be achieved by covalent attachment of a protein or DNA, or by capture of a nucleic acid strand with biotin linked to either the 5′ or 3′ terminus (see Note 1 for more detail about immobilization). The terminal attachment of biotin, through a flexible linker (Scheme 1), leaves the nucleic acid-binding sites open for complex

Quadruplex DNA Interactions by Biosensor-SPR


Fig. 2 SPR sensorgram and its components described in steps. (1) Running buffer is injected to stabilize the baseline of the DNA surface; (2) association: ligands are injected over the immobilized DNA, and there is a rise in RU as they bind to the immobilized DNA; (3) bound and unbound ligands in equilibrium at the steady state; (4) dissociation: on injection of running buffer to remove the samples and determine the dissociation constant; (5) injection of regeneration buffer to remove any remaining ligand on the chip; and (6) followed by the running buffer flow to stabilize the baseline for the next ligand injection

Scheme 1 Chemical structure of the biotin derivative (biotin and spacer labeled) at 5′-DNA used during immobilization

formation. A range of other sensor chip surfaces and immobilization chemistries are also available, and it is generally possible to find an appropriate surface for any biological interaction application (Table 1). In this work and most of our other work, 5′-biotin-labeled DNA sequences (Scheme 1) such as the hTel22 and c-Myc G-quadruplex-folded sequences and the hairpin-folded duplex control sequence (Fig. 3) were used. In principle, 3′-biotin attachment should work in cases where such attachment is an advantage. A reference baseline is initially established by buffer flow; a ligand solution is then injected over the surface, and the ligand


Table 1 Example of different types of chips (available from GE Healthcare Inc.)

Type | Features | Use
C1 | Carboxymethylated, matrix-free surface for covalent immobilization | Need to avoid dextran on the surface for multivalent or very large macromolecules
CM3 | Similar properties to sensor chip CM5, suited to large interaction partners and exploratory assay conditions | The interaction partner in solution is very large and exploratory assay conditions
CM7 | Similar properties to sensor chip CM5, but for fragment and low molecular weight molecule samples with three times higher capacity | Suitable for work with small molecules and fragment-based screening when achieving the required immobilization level is challenging
SA | Carboxymethylated dextran pre-immobilized with streptavidin for immobilization of biotinylated interaction partners | High binding capacity, reproducibility and chemical resistance give excellent performance over a broad range of application for RNAs and DNAs
HPA | Flat hydrophobic surface consisting of long-chain alkanethiol molecules is attached directly to the gold film. It facilitates the adsorption of lipid monolayers for analysis of interactions involving lipid components | Model membrane systems
L1 | Lipophilic groups are covalently attached to carboxymethylated dextran, making the surface suitable for direct attachment of lipid membrane vesicles such as liposomes | High-capacity capture of vesicles and liposomes while maintaining the lipid bilayer
NTA | Carboxymethylated dextran pre-immobilized with nitrilotriacetic acid (NTA). Histidine-tagged molecules are immobilized via Ni²⁺/NTA chelation | Capture and immobilization of histidine-tagged molecules via metal chelation
Protein L | Carboxymethylated dextran matrix pre-immobilized with a recombinant protein L for antibodies and antibody fragments containing kappa light chain subtypes (1, 3 and 4) without interfering with its antigen-binding site | Oriented capture of antibody fragments

binding to DNA is monitored by changes in the SPR signal. With sufficient time, a steady-state plateau, where association and dissociation of the ligand are occurring at an equal rate, is established. Finally, buffer flow (without ligand) is reinitiated and the dissociation of the complex is monitored as a function of time (Fig. 2). The above steps are repeated with a series of ligand concentrations and the resulting sensorgrams are fitted to an appropriate binding model as described below.

Fig. 3 Chemical structure of DB1464 and 5′-biotin-labeled DNA sequences. Gs are taking part in formation of G-tetrads (the compound was supplied by Professor David W. Boykin, GSU)

hTel G4 (22mer): 5′-Biotin-AGGGTTAGGGTTAGGGTTAGGG-3′
c-Myc (19mer): 5′-Biotin-AGGGTGGGGAGGGTGGGGA-3′
AATT-Hairpin: 5′-Biotin-CCAATTCG paired with 3′-GGTTAAGC, the two stem strands joined by a loop of C and T residues

For a ligand (L) binding to a DNA sequence and forming a single complex (C), the interaction is described by the following equation:

$$L + \mathrm{DNA} \underset{k_d}{\overset{k_a}{\rightleftharpoons}} C \tag{1}$$

and the equilibrium binding affinity for this interaction is:

$$K_A = \frac{[C]}{[L][\mathrm{DNA}]} = \frac{k_a}{k_d} = \frac{1}{K_D} \tag{2}$$

where [L] is the concentration of the injected ligand, [DNA] is the concentration of the immobilized DNA not bound to the ligand (free DNA concentration), and [C] is the concentration of the complex; KA is the equilibrium binding constant, ka is the association rate constant, and kd is the dissociation rate constant. For association:

$$\frac{d[C]}{dt} = k_a [L][\mathrm{DNA}] \tag{3}$$

and for dissociation:

$$\frac{d[C]}{dt} = -k_d [C] \tag{4}$$
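As an illustration of Eqs. 1–4, the following minimal sketch simulates an idealized sensorgram for a single injection. It is not the instrument's fitting routine. It assumes that the response R (in RU) is proportional to [C], that free sites are proportional to (RUmax − R), and that the ligand concentration in the flow is constant; during the injection it uses the net rate ka·C·(RUmax − R) − kd·R, i.e., the forward term of Eq. 3 together with the reverse term of Eq. 4. The rate constants, RUmax, and the function names are hypothetical values chosen only for illustration.

```python
import math

def association(t, conc, ka, kd, rmax):
    """Response (RU) at time t (s) after injection start for 1:1 binding."""
    kobs = ka * conc + kd              # observed rate constant, 1/s
    req = rmax * ka * conc / kobs      # steady-state plateau, RU
    return req * (1.0 - math.exp(-kobs * t))

def dissociation(t, r0, kd):
    """Response (RU) at time t (s) after buffer flow resumes, starting from r0."""
    return r0 * math.exp(-kd * t)

def sensorgram(conc, ka, kd, rmax, t_assoc, t_dissoc, step=1.0):
    """Return (time, RU) points for one injection followed by buffer flow."""
    points = []
    for i in range(int(t_assoc / step) + 1):
        t = i * step
        points.append((t, association(t, conc, ka, kd, rmax)))
    r_end = points[-1][1]              # response when the injection stops
    for i in range(1, int(t_dissoc / step) + 1):
        t = i * step
        points.append((t_assoc + t, dissociation(t, r_end, kd)))
    return points

# Hypothetical parameters: ka in 1/(M s), kd in 1/s, RMAX in RU
KA_RATE, KD_RATE, RMAX = 1.0e5, 1.0e-2, 40.0
curve = sensorgram(conc=1.0e-7, ka=KA_RATE, kd=KD_RATE, rmax=RMAX,
                   t_assoc=600.0, t_dissoc=600.0)
plateau = max(ru for _, ru in curve)
print(f"K_A = ka/kd = {KA_RATE / KD_RATE:.2e} 1/M, plateau = {plateau:.1f} RU")
```

With these assumed values the injected concentration equals KD = kd/ka, so the plateau approaches half of RUmax, which is the behavior Eq. 2 predicts for a half-saturated surface.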

Both the association and dissociation phases of the sensorgram can be simultaneously fit to a desired binding model with several sensorgrams at different ligand concentrations using a global fitting routine [58, 59]. Global fitting allows the most robust determination of kinetic constants (ka or kd) and the calculation of


equilibrium constants, KA or KD, from the ratio of kinetic constants (Eq. 2). If a steady-state plateau is obtained, the SPR response in the plateau region can be used with the following model to obtain the equilibrium constant:

$$r = \frac{RU_{obs}}{RU_{max}} = \frac{K_A C_{free}}{1 + K_A C_{free}} \tag{5}$$

Note that in Eq. 5 the limit of r as Cfree → +∞ is 1, which assumes a 1:1 binding model. r represents the moles of bound ligand per mole of total DNA, and Cfree is the free ligand concentration in equilibrium with the complex. RUobs is the observed (experimental) response in the plateau region and RUmax is the predicted maximum response for a monomer ligand binding to a DNA site. RUmax can be calculated or determined experimentally as the RU at saturation of the DNA-binding sites. In Eq. 5, KA, r, and RUmax are determined by fitting RUobs versus Cfree. The ratio of the observed steady-state response (RUobs) to the RU at saturation (RUmax) yields the binding stoichiometry. If the complex dissociates slowly, the surface can be regenerated before complete dissociation occurs, using a solution that causes rapid dissociation of the ligand without irreversible damage to the immobilized DNA [59, 60]. For example, a solution at low or high pH (about pH 2.5 or pH 10) can unfold DNA and cause the ligand to dissociate completely. Additional injections of the running buffer (around neutral pH) allow the immobilized DNA to refold and establish a stable baseline. This cycle is repeated with a series of additional ligand concentrations. With a series of sensorgrams generated over a broad range of concentrations, both the kinetic and equilibrium constants can be determined as discussed above.

1.2 Critical Factors for Ligand-DNA Interaction Evaluation by Biosensor-SPR Methods

1.2.1 Concentration Range and Binding Affinity KD

For accurate determination of equilibrium constants by any method, the selected set of experimental concentrations must provide both free and bound concentrations of reactants. In the biosensor-SPR method with DNA immobilized to the surface, the ligand concentrations should be below and above KD so that a range of bound fraction of ligand to DNA is obtained. The initial ligand concentrations have less binding to the DNA-binding sites, but as the concentration of ligand injected is increased, the fraction of sites bound on DNA increases and approaches the saturation level. The sensorgrams will have very low binding response at lower ligand concentrations and will approach saturation with higher response at higher ligand concentrations. In this way, a series of sensorgrams with broad ligand concentrations will enable accurate determination of equilibrium constants. If the range of ligand concentrations used is too low or too high, accurate estimation of on-rates and binding constants is not possible. Some preliminary


testing is recommended when the approximate ligand-target KD is unknown, in order to establish an appropriate working range of ligand concentrations.

1.2.2 Mass Transport in Association and Rebinding in Dissociation

For the ligand to bind to the DNA (or any target) immobilized on the sensor surface, the sample injected over the flow cell must be transported from the bulk solution to the immobilized target surface, a phenomenon known as mass transport. This is a diffusion-controlled process, and the transport rate can directly influence the observed binding kinetics if transport is slower than the binding reaction. A key requirement for accurate determination of kinetic constants by the SPR method is that the amount of free ligand in the matrix must quickly equilibrate with the flow solution. The equilibration is assisted by using high flow rates. If the association reaction is much faster than mass transport, the observed binding will be limited by the mass transport process. Conversely, if the transport rate is faster than the association rate, the observed binding will represent the true interaction kinetics [52, 61]. Therefore, the mass transport rate is a critical factor that must be considered in biosensor experimental design and in evaluating kinetic constants from biosensor-SPR methods. Overall, for kinetic measurements, it is generally recommended to use low surface densities of the immobilized DNA and high ligand flow rates to minimize the limitations on binding rates by mass transport processes. In addition, the dissociation phase can be set up for several hours or even longer with Biacore SPR, which allows at least 50% of the bound ligand to dissociate so that a reliable kinetic fit can be performed even with very slow dissociation. In summary, in biosensor-SPR evaluation of ligand-DNA interactions, ligand concentrations, mass transfer, and rebinding have to be evaluated carefully. The incorporation of optimally designed flow cells in the instrument, together with optimized experimental protocols and sensor chips, has qualified biosensor-SPR as an excellent method for quantitative analysis of ligand-DNA interactions, especially for strong binding systems.
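To judge how long a dissociation phase would need to run, the single-exponential decay implied by Eq. 4 can be inverted for the time at which a chosen fraction of the complex has dissociated. The sketch below assumes simple 1:1 dissociation with no rebinding and no mass transport limitation; the kd values and the function name are hypothetical.

```python
import math

def time_to_dissociate(fraction, kd):
    """Time (s) for `fraction` of the complex to dissociate, from R(t) = R0*exp(-kd*t)."""
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if kd <= 0.0:
        raise ValueError("kd must be positive")
    return -math.log(1.0 - fraction) / kd

# Hypothetical dissociation rate constants (1/s), from fast to very slow
for kd in (1.0e-2, 1.0e-3, 1.0e-4):
    t_half = time_to_dissociate(0.5, kd)   # equals ln(2)/kd
    print(f"kd = {kd:.0e} 1/s: 50% dissociated after {t_half / 60:.1f} min")
```

Under these assumptions, each tenfold decrease in kd lengthens the required dissociation phase tenfold, which is why very slow complexes call for the long dissociation times mentioned above.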

2 Materials

2.1 Instrumentation

Biacore is a system for real-time label-free biomolecular interactions analysis using surface plasmon resonance technology. A fourchannel Biacore instrument, typically a T200 (GE Healthcare Inc.), is recommended for most research studies and it has the best sensitivity of current commercial instruments. Biacore T200 and 2000/3000 instruments use sensor chips with four channels such that three DNAs can be immobilized with one flow cell left blank as a control for bulk refractive index subtraction. With a sensor surface that has covalently attached streptavidin, a nucleic


acid strand with biotin linked to either the 5′ or 3′ terminus can be captured to create the biospecific surface. The specifications of the instrument are given on the web (https://www.biacore.com/lifesciences/index.html). The materials and procedures presented here are recommended for Biacore instrumentation, but similar reagents and methods are used in other instruments.

2.2 Required Materials for Biacore General Instrument Cleaning and Checking

1. Maintenance chip with a glass flow cell surface (available from GE Healthcare Inc.).
2. 0.5% sodium dodecyl sulfate (SDS, Biacore desorb solution 1).
3. 50 mM glycine pH 9.5 (Biacore desorb solution 2) (see Note 2).
4. 1% (v/v) acetic acid solution.
5. 0.2 M sodium bicarbonate solution.
6. 6 M guanidine hydrochloride solution.
7. 10 mM HCl solution (see Note 3).
8. HBS–N buffer: 10 mM HEPES pH 7.4, 150 mM NaCl (user prepared or available from GE Healthcare Inc.).
9. BiaTest Solution: 15% (w/w) sucrose in HBS-EP buffer.

2.3 Required Materials for Immobilization of G-Quadruplex DNA on Chip Surface

1. CM5 sensor chip that has been at room temperature for at least 30 min prior to use (sensor chips are available from GE Healthcare Inc.) (see Note 1).
2. HBS–EP buffer (running buffer): 10 mM HEPES pH 7.4, 150 mM NaCl, 3 mM EDTA, 0.005% v/v polysorbate 20 (user prepared or available from GE Healthcare Inc.).
3. Thoroughly filter and degas all solutions. It should be emphasized that the internal microfluidics flow system of the instrument can be damaged by particulate matter present in any solution.
4. 100 mM N-hydroxysuccinimide (NHS) freshly prepared in water.
5. 400 mM N-ethyl-N′-(dimethylaminopropyl) carbodiimide (EDC) freshly prepared in water.
6. 10 mM acetate buffer pH 4.5 (immobilization buffer).
7. 200–400 mg/mL streptavidin prepared in immobilization buffer.
8. 1 M ethanolamine hydrochloride in water pH 8.5 (deactivation solution).
9. Activation buffer (1 M NaCl, 50 mM NaOH).
10. Biotin-labeled nucleic acid solutions (~25 nM of a single strand or hairpin DNA dissolved in HBS-EP buffer).


Table 2 Biacore instrument commands

Biacore control software command | Function
Desorb | Removes adsorbed materials from the flow system
Sanitize | Disinfects the flow system
Superclean | Washes the flow system and denatures proteins to increase their solubility
Prime | Flushes the flow system with running buffer
Dock | Docks the sensor chip into the instrument
Undock | Undocks the sensor chip from the instrument
Manual run | Allows a run to be controlled interactively
Sample injection | Injects sample

2.4 Sensor Chip Preparation for G-Quadruplex DNA Immobilization

2.4.1 Preparation of Streptavidin Surface on CM5 Chip

1. Dock the CM5 chip and prime with running buffer (see Note 1). Start a manual run at a flow rate of 5 μL/min to establish the stable baseline required to make a streptavidin-functionalized chip. “Dock” and “Prime” are Biacore software commands that instruct the instrument to carry out specific operations. The commands and operations are listed in Table 2.
2. Mix 75 μL of NHS and 75 μL of EDC to activate the carboxymethyl surface to reactive esters. Note: mix these solutions just prior to injection to get good activation of the surface.
3. Inject the NHS/EDC mixture for 10 min (50 μL) to obtain an optimum amount of reactive esters.
4. Using “Manual Inject” at a flow rate of 5 μL/min, make several injections of streptavidin, prepared in immobilization buffer, over all flow cells. Track the number of RUs immobilized, which is available as a real-time readout, to obtain the desired RU level. This is typically 2500–3000 RUs for a CM5 chip after the injection has been stopped.
5. To deactivate the remaining esters, inject 1 M ethanolamine hydrochloride over all flow cells at a flow rate of 5 μL/min for 10 min.
6. A few primes are necessary to obtain a stable baseline.
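The injected volumes quoted in these steps follow from the flow rate and the contact time. As a worked check for the NHS/EDC injection, using the 5 μL/min flow rate of the manual run and writing F for the flow rate and t for the contact time (symbols introduced here for illustration only):

$$V_{inj} = F \times t = 5\ \mu\mathrm{L/min} \times 10\ \mathrm{min} = 50\ \mu\mathrm{L}$$

The 150 μL of NHS/EDC mixture prepared beforehand therefore exceeds the volume that is actually injected.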

2.4.2 G-Quadruplex DNA Immobilization on a Streptavidin Chip

1. A streptavidin-coated sensor chip (chip prepared as outlined above or SA pre-derivatized sensor chips from GE) that has been at room temperature for at least 30 min is required.


2. Biotin-labeled nucleic acid solutions (~25 nM DNA dissolved in HBS-EP buffer, the running buffer) are also required.
3. Dock the streptavidin-coated chip and, using the “Manual run” command, start a sensorgram at a flow rate of 25 μL/min to check the baseline.
4. Inject activation buffer (1 M NaCl, 50 mM NaOH) for 3 min (75 μL), five to seven times, to remove unbound streptavidin from the sensor chip.
5. To ensure surface stability, prime with running buffer a few times as necessary.
6. Allow buffer to flow for at least 10 min (or until the baseline is stable) before immobilizing the nucleic acids.
7. Start a new sensorgram at a flow rate of 1 μL/min in the desired flow cell under “flow path” (e.g., flow cell 2, fc2) to immobilize the nucleic acid. Generally, flow cell 1 (fc1) is used as a control and is left blank for subtraction. It is often desirable to immobilize different nucleic acids on the remaining two flow cells (fc3 and fc4).
8. Wait for the baseline to stabilize (which usually takes a few minutes). Using “Manual Inject,” load the injection loop with ~100 μL of a 25 nM nucleic acid solution and inject over the flow cell. Track the number of RUs immobilized and stop the injection after the desired level is reached (typically ~200 RU for 20–30 base-pair DNA for kinetics experiments, to minimize mass transport effects).
9. At the end of the injection and after the baseline is stabilized, determine the RUs of the immobilized nucleic acid by using the reference line option. The amount of nucleic acid immobilized is required to determine the theoretical moles of ligand-binding sites for the current flow cell (see Note 4).
10. Repeat steps 7 and 8 to immobilize other DNAs on flow cells fc3 and fc4 separately.
11. After successful DNA immobilization in all flow cells (except fc1), replace the immobilization buffer with the experimental buffer and prime the system several times.
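The immobilized RU level recorded in step 9 can be turned into a predicted maximum ligand response. The sketch below uses the commonly applied mass-proportionality estimate, RUmax = RU(DNA) × (MW ligand / MW DNA) × sites; this relation is not written out in this chapter and is stated here as an assumption. The refractive index increment ratio is assumed to be 1 unless supplied, and the molecular weights in the example are hypothetical placeholders, not values for DB1464 or hTel22.

```python
def theoretical_rumax(ru_dna, mw_ligand, mw_dna, sites=1, rii_ratio=1.0):
    """Predicted saturation response (RU) for `sites` ligands bound per DNA.

    ru_dna     immobilized DNA level read from the sensorgram (RU)
    mw_ligand  ligand molecular weight (Da)
    mw_dna     immobilized DNA molecular weight, including biotin (Da)
    rii_ratio  ligand/DNA refractive index increment ratio (assumed 1.0)
    """
    if ru_dna <= 0 or mw_ligand <= 0 or mw_dna <= 0:
        raise ValueError("RU level and molecular weights must be positive")
    return ru_dna * (mw_ligand / mw_dna) * rii_ratio * sites

# Hypothetical example: 200 RU of a ~7.4 kDa oligonucleotide, 450 Da ligand
rumax_1to1 = theoretical_rumax(ru_dna=200.0, mw_ligand=450.0, mw_dna=7400.0)
rumax_2to1 = theoretical_rumax(200.0, 450.0, 7400.0, sites=2)
print(f"1:1 RUmax = {rumax_1to1:.1f} RU, 2:1 RUmax = {rumax_2to1:.1f} RU")
```

Comparing an observed plateau with this estimate gives a first indication of the binding stoichiometry, in the spirit of Note 4.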
2.5 Flow Solutions: Buffers and Samples

2.5.1 General Buffers (See Note 5)

1. (a) 10 mM HEPES pH 7.4, 150 mM NaCl, 3 mM EDTA (HBS–EP buffer) (GE Healthcare Inc.) (see Note 6).
   (b) 10 mM MES [2-(N-morpholino)ethanesulfonic acid] pH 6.25, 100 mM NaCl, 1 mM EDTA, and polysorbate 20, P20 (MES10 buffer).
   (c) 10 mM CCL [cacodylic acid] pH 6.25, 100 mM NaCl, 1 mM EDTA, and polysorbate 20, P20 (CCL10 buffer).
   (d) 10 mM Tris [tris(hydroxymethyl)aminomethane] pH 7.4, 100 mM NaCl, 1 mM EDTA, and polysorbate 20, P20 (Tris10 buffer).


2. The sample solution must be prepared in the same buffer used to establish the baseline (running buffer) (see Note 7).
3. The sample concentration to be used depends on the magnitude of the binding constant (KA). With a single binding site, for example, concentrations at least 10 times above and below 1/KA should be used (i.e., a 100-fold difference between the lowest and highest concentrations). A larger concentration range above and below 1/KA will yield a more reliable binding curve. For binding constants of 10⁷–10⁹ M⁻¹, as observed with many nucleic acid/small molecule complexes, small molecule concentrations from 0.01 nM to 5 μM in the flow solution allow accurate determination of binding constants. Injecting samples from low to high concentration is useful as it prevents data artifacts from ligand adsorption (see Note 8).
4. Possible problems at high sample concentrations: poor sensorgrams, nonspecific binding, or ligand aggregation may be observed. In this case only the lower concentrations can be used for quantification.

2.5.2 Regeneration Solution (See Note 9)

1. Good regeneration conditions achieve complete removal of bound ligand from the chip surface without degrading the immobilized target. Commonly used regeneration solutions are listed in Table 3. In general, milder conditions are initially used, while more stringent conditions are applied only as needed. Regeneration solutions for different samples are available on the Biacore website. In our studies of organic small molecules, a high salt concentration solution or a stronger 10 mM Glycine/HCl (pH 2.5) solution is typically used as an efficient regeneration agent to remove small molecules from the DNA-immobilized sensor chip surface.
2. Inject 10–20 μL of regeneration solution twice consecutively at high flow rates to assure efficient regeneration.
3. After injection of the regeneration solution, three 1-min injections of running buffer are recommended to wash off the remaining regeneration solution.
4. At the end of each cycle, a 5 min running buffer flow is also recommended to ensure that the chip surface is re-equilibrated for binding (i.e., the dextran matrix is re-equilibrated with running buffer) and that the baseline is stabilized before the following sample injection.

Table 3 Regeneration solutions

Interaction strength | Acidic | Basic | Hydrophobic | Ionic
Weak | pH > 2.5: 10 mM Glycine/HCl, HCl, formic acid | pH < 9: 10 mM HEPES/NaOH | pH < 9: 50% ethylene glycol | 1 M NaCl
Intermediate | pH 2–2.5: 10 mM Glycine/HCl, formic acid, HCl, H3PO4 | pH 9–10: 10 mM Glycine/NaOH, NaOH | pH 9–10: 50% ethylene glycol | 2 M MgCl2
Strong | pH < 2: 10 mM Glycine/HCl, HCl, formic acid, H3PO4 | pH > 10: NaOH | pH > 10: 25–50% ethylene glycol | 4 M MgCl2, 6 M guanidine hydrochloride

3 Method

The Biacore software, supplied with the instrument, allows users to write a method or to use a software wizard to set up experiments. Several important factors, such as flow rate, association and dissociation times, injection order, and surface regeneration, must be considered while setting up an experiment. A simple method used to collect small molecule binding results on G-quadruplex nucleic acid surfaces is shown below. The structure of the compound (DB1464, prepared in the laboratory of Professor D. W. Boykin at GSU [42]) and the biotin-labeled DNA sequences (hTel22, c-Myc19, AATT-Hairpin) used in this experiment are shown in Fig. 3.
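Before a method is entered in the instrument software, the factors listed above can be written down as a simple injection plan. The sketch below is only a planning aid and does not use or imitate the Biacore method language. It takes the flow rate (50 μL/min), contact time (10 min), dissociation time (600 s), and concentration range (about 1 nM to 1 μM) from Subheading 3.1; the two-fold dilution factor, the number of buffer blanks, and all class and function names are hypothetical choices.

```python
from dataclasses import dataclass

@dataclass
class Cycle:
    label: str
    conc_molar: float       # 0.0 marks a buffer (blank) injection
    flow_ul_min: float
    contact_s: float
    dissociation_s: float

def dilution_series(top_molar, factor, n_points):
    """Serial dilution starting from top_molar, returned from low to high."""
    return sorted(top_molar / factor ** i for i in range(n_points))

def build_cycles(concs, n_blanks=3, flow=50.0, contact=600.0, dissociation=600.0):
    """Buffer blanks first, then samples ordered from low to high concentration."""
    cycles = [Cycle(f"buffer {i + 1}", 0.0, flow, contact, dissociation)
              for i in range(n_blanks)]
    for c in sorted(concs):
        cycles.append(Cycle(f"{c * 1e9:g} nM", c, flow, contact, dissociation))
    return cycles

# Eleven two-fold dilutions from 1 uM reach roughly 1 nM
concs = dilution_series(top_molar=1.0e-6, factor=2.0, n_points=11)
plan = build_cycles(concs)
for cyc in plan:
    volume_ul = cyc.flow_ul_min * cyc.contact_s / 60.0
    print(f"{cyc.label:>12}: {volume_ul:.0f} uL injected, "
          f"{cyc.dissociation_s:.0f} s dissociation")
```

Listing the cycles this way makes it easy to confirm the low-to-high injection order and the sample volume needed per concentration before any compound is diluted.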

3.1 Data Collection and Processing

1. A four-channel Biacore instrument, typically a T200 (GE Healthcare Inc.), is used in this study.
2. Three 5′-biotin-labeled DNAs are immobilized, each in a distinct flow cell of an SA chip, as described in Subheading 2.4.2. Approximately the same amount of each DNA oligomer is immobilized on the surface of these flow cells so that stoichiometry differences can be read directly from the sensorgram saturation levels.
3. 10 mM Tris–HCl buffer (10 mM Tris–HCl, 50 mM KCl, 1 mM EDTA, 0.005% v/v polysorbate 20, pH 7.5) is used as a running buffer.
4. 10 mM Glycine/HCl (pH 2.5) is used as regeneration solution because DB1464 is a strong binder.


5. Serial dilutions (concentration range from 1 nM to 1 μM) of the DB1464 compound are prepared using the running buffer as the diluent to minimize changes in the refractive index caused by buffer components. The flow rate is set to 50 μL/min (see Note 10).
6. A waiting period of 5 min prior to each sample injection cycle is recommended to allow baseline stabilization, which is essential for accurate small molecule binding analysis. Several buffer samples are injected at the beginning of each experiment to evaluate whether the instrument is performing within specifications. Buffer injections also serve as controls for data processing.
7. Inject 500 μL (10 min) of each compound concentration and set 600 s as the dissociation time (these values vary with different compounds and kinetics) (see Note 11). Inject samples from low to high concentration to eliminate data artifacts from ligand carryover or contamination of the instrument flow system (see Note 12).
8. At the end of the dissociation phase, inject two short pulses (typically 30–60 s) of 10 μL Glycine/HCl (10 mM, pH 2.5). Three 1-min injections of running buffer are then recommended to wash off the remaining regeneration solution, and a 5 min flow of running buffer is also set to ensure that the chip surface is re-equilibrated for binding (see Subheading 2.5.2, item 3).
9. When the experiment is completed, open the raw data containing the sensorgrams in the BIAevaluation software for data processing (see Note 13). First, zero the sensorgrams on the y-axis (RU) to allow proper comparison of the responses of each flow cell. Generally, the average of a stable time region of the sensorgram, prior to sample injection, should be selected and set to zero for each sensorgram. Then, zero the x-axis (time) to align the beginnings of the injections with respect to each other [52].
10. Subtract the control flow cell (fc1) sensorgram from the reaction flow cell sensorgrams (i.e., fc2–fc1, fc3–fc1, and fc4–fc1). This removes any bulk shift contribution to the change in RUs.
11. Subtract a buffer injection (the injection with a ligand concentration of zero) or, better, an average of several buffer injections from the compound injections (different concentrations) on the same reaction flow cell (see Note 14). This is known as double subtraction and removes any flow cell specific baseline irregularities [52, 62]. At this point, the data should be of optimum quality and ready for analysis as described below.
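The processing sequence in items 9–11 (y-axis zeroing, reference flow cell subtraction, and buffer subtraction) can be sketched in a few lines. This is a minimal illustration on lists of RU values, not the BIAevaluation implementation. It assumes that all traces are sampled at the same time points and are already aligned on the time axis; the function names and the short example traces are hypothetical.

```python
def zero_baseline(trace, n_baseline):
    """Subtract the mean of the first n_baseline points (pre-injection region)."""
    offset = sum(trace[:n_baseline]) / n_baseline
    return [ru - offset for ru in trace]

def subtract(a, b):
    """Point-by-point difference of two equally sampled traces."""
    if len(a) != len(b):
        raise ValueError("traces must have the same length")
    return [x - y for x, y in zip(a, b)]

def average(traces):
    """Point-by-point mean of several equally sampled traces."""
    return [sum(col) / len(col) for col in zip(*traces)]

def double_reference(sample_fc, sample_ref, blanks_fc, blanks_ref, n_baseline):
    """Reference-subtract (fc - fc1), then subtract the averaged buffer injection."""
    def referenced(fc, ref):
        return subtract(zero_baseline(fc, n_baseline),
                        zero_baseline(ref, n_baseline))
    sample = referenced(sample_fc, sample_ref)
    blank = average([referenced(f, r) for f, r in zip(blanks_fc, blanks_ref)])
    return subtract(sample, blank)

# Hypothetical five-point traces (RU); the first two points are baseline
sample_fc2 = [100.0, 100.0, 112.0, 125.0, 125.0]   # DNA surface, compound injected
sample_fc1 = [50.0, 50.0, 55.0, 55.0, 55.0]        # blank surface, bulk shift only
blank_fc2 = [[100.0, 100.0, 106.0, 106.0, 106.0]]  # DNA surface, buffer injected
blank_fc1 = [[50.0, 50.0, 55.0, 55.0, 55.0]]       # blank surface, buffer injected
corrected = double_reference(sample_fc2, sample_fc1, blank_fc2, blank_fc1, 2)
print(corrected)
```

In this toy example the 5 RU bulk shift is removed by the fc1 subtraction and the remaining 1 RU surface-specific offset is removed by the buffer subtraction.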

3.2 Data Analysis

1. After the data are processed as described, kinetic and/or steady-state analysis is performed. Both kinetic and steady-state fitting can be done in the Biacore software or in other available software packages (such as Scrubber-2,

Fig. 4 Respective SPR sensorgrams (RU versus time in seconds; panels DB1464-HTel, DB1464-c-Myc, and DB1464-AATT) for the interaction of DB1464 with the hTel22, c-Myc, and AATT hairpin duplex. Concentrations of DB1464 from bottom to top are 0–1 μM

http://www.biologic.com.au). As shown in Fig. 4, DB1464 binding reaches a steady-state plateau during the injection period so that a steady-state analysis can be used to determine the equilibrium constant. In this experiment the binding rate is not limited by mass transfer, and the association and dissociation rate constants can also be determined. The average of the data over a selected time period in the steady-state region of each sensorgram can be obtained, converted to r = RU/RUmax, and plotted as a function of compound concentration in the flow solution (see Note 15).
2. To obtain the affinity constants, the data were fitted to the following interaction model using Kaleidagraph for nonlinear least-squares optimization of the binding parameters:

$$r = \frac{K_1 C_{free} + 2 K_1 K_2 C_{free}^2}{1 + K_1 C_{free} + K_1 K_2 C_{free}^2} \tag{6}$$

where K1 and K2 are equilibrium constants for two types of binding site and Cfree is the concentration of the compound in equilibrium with the complex, which is fixed by the concentration in the flow solution. For a single dominant binding site model, K2 is equal to zero. Errors in fitting results are less than 10%. As described above, the binding stoichiometry can also be obtained directly by comparing the maximum response with the predicted response per compound.
3. In this example, equal amounts of the G-quadruplex forming sequences hTel22 and c-Myc19 and of the control AATT-hairpin duplex were immobilized, so the difference in maximum responses among the sets of sensorgrams was immediately observed. The differences in kinetic constants, binding constants, stoichiometry, and cooperativity for DB1464 binding to the hTel22 and c-Myc19 G-quadruplex structures and the

Fig. 5 Comparison of the SPR binding affinity of DB1464 with the G-quadruplex sequences hTel22 (squares) and c-myc (circles) and the AATT hairpin duplex (triangles) DNA sequences. RU values from the steady-state region of SPR sensorgrams are plotted against the unbound compound concentration, Cfree (flow solution). The lines are the best-fit values using appropriate binding models

Table 4 Binding constants obtained by fitting the curves using two-sites binding model

Compound | hTel22 KA | c-Myc19 KA | AATT-Hairpin duplex KA
DB1464 | 1.1 × 10⁷ M⁻¹ | 2.5 × 10⁷ M⁻¹, 3.2 × 10⁵ M⁻¹ | 8.0 × 10⁴ M⁻¹

AATT hairpin duplex can be obtained as illustrated in Fig. 5. Under these experimental conditions, DB1464 binds with a 1:1 ratio to hTel22 and AATT-Hairpin, while for c-Myc19 a secondary, weaker binding site has also been observed.
4. The steady-state binding responses are fit using a one-site binding equation for hTel22 and AATT-hairpin and a two-site binding model for c-Myc19 (Fig. 5 and Table 4). A comparison of the binding affinities of DB1464 for the three targets used in the experiment indicates a strong preference for the G-quadruplex DNA conformation, with primary binding constants around 10⁷ M⁻¹ for both the human telomere and c-Myc sequences, while a very weak interaction (>100-fold less) with duplex DNA is observed under the same conditions (Figs. 4 and 5, Table 4). DB1464 shows the highest affinity for c-Myc19 (2.5 × 10⁷ M⁻¹), and the interaction with this target occurs with apparently slower rates of association and dissociation.


5. Kinetic analysis using global fitting of SPR data places a great demand on obtaining high-quality data. Experimental design, analysis, and optimization of kinetic studies have been described in detail elsewhere [52]. In general, low surface densities of the immobilized target and high ligand flow rates should be used to minimize the effects of mass transfer. Several criteria must be satisfied when considering whether a global kinetic fit is acceptable [52]:
   (a) within experimental limits, the RUmax is the same as the predicted value or the value from the steady-state results for one binding molecule;
   (b) the rate constants are within the range of small molecules;
   (c) the mass transport constant kt is in the 10⁷ range;
   (d) (ka × RUmax/kt) < 5;
   (e) the half-life t1/2 from the dissociation phase of the sensorgram is close to the half-life calculated using the fitted value (t1/2 = ln 2/kd), suggesting that the mass transport effect is minimized;
   (f) the residuals are within the instrumental noise and there are no systematic deviations;
   (g) a low chi-squared value is obtained at convergence.
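The steady-state fit described in items 1–4 can be illustrated with a small least-squares sketch based on Eq. 6. It is not the Kaleidagraph procedure used by the authors. To stay within the Python standard library, it fits only the single-site case (K2 = 0), computes RUmax by linear least squares for each trial K1, and searches log10(K1) by golden-section search. The data are synthetic and noise-free, generated from assumed values (K1 of the order reported in Table 4 and a hypothetical RUmax); all function names are illustrative.

```python
import math

def occupancy(conc, k1, k2=0.0):
    """Eq. 6: moles of ligand bound per mole of DNA (k2 = 0 gives one site)."""
    num = k1 * conc + 2.0 * k1 * k2 * conc ** 2
    den = 1.0 + k1 * conc + k1 * k2 * conc ** 2
    return num / den

def best_rumax(concs, rus, k1):
    """Linear least-squares RUmax for a fixed k1 (single-site model)."""
    f = [occupancy(c, k1) for c in concs]
    return sum(y * fi for y, fi in zip(rus, f)) / sum(fi * fi for fi in f)

def sse(concs, rus, log_k1):
    """Sum of squared residuals at a trial log10(K1), with RUmax optimized."""
    k1 = 10.0 ** log_k1
    rumax = best_rumax(concs, rus, k1)
    return sum((y - rumax * occupancy(c, k1)) ** 2 for c, y in zip(concs, rus))

def fit_single_site(concs, rus, lo=4.0, hi=10.0, tol=1e-8):
    """Golden-section search over log10(K1); returns (K1, RUmax)."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c, d = b - g * (b - a), a + g * (b - a)
        if sse(concs, rus, c) < sse(concs, rus, d):
            b = d
        else:
            a = c
    k1 = 10.0 ** ((a + b) / 2.0)
    return k1, best_rumax(concs, rus, k1)

# Synthetic, noise-free plateau responses generated from assumed values
TRUE_K1, TRUE_RUMAX = 1.1e7, 20.0
concs = [1e-9 * 2 ** i for i in range(11)]          # about 1 nM to 1 uM
rus = [TRUE_RUMAX * occupancy(c, TRUE_K1) for c in concs]
k1_fit, rumax_fit = fit_single_site(concs, rus)
print(f"K1 = {k1_fit:.2e} 1/M, RUmax = {rumax_fit:.1f} RU")
```

With real data, a general nonlinear least-squares routine that also varies K2 would be the natural extension for two-site systems such as the c-Myc19 case described above, and the fitted RUmax can be compared with the predicted value as in criterion (a).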

4 Notes

1. The choice of sensor chip depends on the nature and demands of the application. For general purposes, a Biacore CM5 sensor chip, which carries a hydrophilic matrix of carboxymethylated (CM) dextran covalently attached to the gold surface, can be used. It has a high surface capacity for immobilizing a wide range of ligands, from proteins to nucleic acids and carbohydrates. For protein-DNA interaction investigations, the Biacore CM4 sensor chip is another good choice because it is similar to sensor chip CM5 but has a lower degree of carboxymethylation (~30% of that of the CM5 chip) and charge, which helps to reduce nonspecific binding of highly positively charged molecules, such as proteins, to the surface. The streptavidin-coated sensor chip has a surface carrying a dextran matrix to which streptavidin has been covalently attached. Streptavidin has a very high binding affinity for biotin (KD ≈ 10⁻¹⁵ M), so the surface provides a high capture of biotinylated ligands. The streptavidin-coated chip is particularly suited for nucleic acid immobilization since biotin coupling of oligonucleotides at the terminal or internal positions is a well-established procedure. For some other specialized applications, a range of other sensor chip surfaces and immobilization chemistries are also available (Table 1).
2. Maintenance chips are available from GE Healthcare Inc. “Desorb” is a Biacore software command that instructs the instrument to remove adsorbed ligands from the flow system.

Quadruplex DNA Interactions by Biosensor-SPR

81

A detailed list of commands and operations is shown in Table 2. Make sure that the analysis and sample compartment temperatures are not below 20 C, since SDS in Desorb solution 1 will precipitate at low temperature. 3. After running the regular Desorb for the additional extensive cleaning, additional super clean method may be used. 4. The amount of DNA to immobilize on the sensor chip depends on the relative molecular weight of the target DNA and of the ligand and on the sensitivity of the biosensor system. Since the SPR response is directly proportional to the mass concentration of material on the surface, the theoretical ligand-binding capacity for a 1:1 interaction of a given surface is relative to the amount of DNA immobilized. 5. The selection of experimental buffer depends on the nature of the ligand and DNA sequence. Salt concentration can be adjusted based on the experimental requirement. With the increase of ionic strength, the binding affinity of positively charged ligands for the negatively charged nucleic acid typically decreases due to charge shielding effects. 6. The amount P20 to be used depends on the system, the instrument and the sensor chip, typically concentrations between 0.05% and 0.005% are used. Other detergents are also used in some cases but for most studies of quadruplexes on a Biacore T200 or X100, P20 at 0.05% is best. 7. If the ligand requires the presence of a small amount of organic solvent (e.g., 50 μL/min) are used for kinetic experiments to minimize mass transport effects. 11. A sufficient association phase with a plateau region is needed for steady-state analysis. For the most accurate fitting of the dissociation phase, it is good practice to allow sufficient time for the compound to achieve at least 80% dissociation from the complex. 12. Many organic small molecules are easily adsorbed nonspecifically to the tubing of the injection microfluidics and are slowly released over the course of the experiment. 
Increasing surfactant concentration might reduce adsorbing to the tubing. 13. Other software programs such as Scrubber 2, CLAMP and GeneData are available for processing Biacore data. The results can also be exported and presented in graphing software such as KaleidaGraph for PC. Although it is useful to experiment with different software packages, BIAevaluation is sufficient for most routine analyses of sensorgram data. For the Biacore T200 user, data processing can be performed automatically using the Biacore T200 evaluation software, which is much more convenient for new users. For processing of Biacore data for large libraries of small molecules, GeneData is a preferred choice. 14. These two data processing steps are referred to as “double referencing.” Typically, multiple buffer injections are performed and averaged before subtraction. In double referencing, plots are made for each flow cell separately overlaying the control flow cell- corrected sensorgrams from the buffer and all sample injections. The buffer sensorgram is then subtracted from the sample sensorgrams. “Double referencing” removes the systematic drifts and shifts in baseline and is helpful to minimize offset artifacts and also to correct the bulk shift that results from slight differences in injection buffer and running buffer. 15. In some cases, at lower concentrations, where the response does not reach the steady-state, the equilibrium responses can be obtained from kinetic fits of the sensorgrams utilizing the known RUmax from the higher concentration sensorgrams. This extrapolation method works well with sensorgrams where the observed response is at least 50% of the equilibrium RU. In conclusion, Biosensor-SPR analysis of nucleic acid targets, either G-quadruplex or dsDNA, or RNA, offers a powerful method of obtaining thermodynamic and kinetics values.
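The double referencing described in Note 14 amounts to two subtractions. The following is a minimal sketch on short, made-up response traces; the function names and data are hypothetical and do not correspond to any Biacore software routine.

```python
def subtract(a, b):
    """Point-by-point difference of two equally sampled traces."""
    return [x - y for x, y in zip(a, b)]

def mean_trace(traces):
    """Point-by-point average of several equally sampled traces."""
    n = len(traces)
    return [sum(values) / n for values in zip(*traces)]

def double_reference(sample_fc, sample_ref, buffer_fc_runs, buffer_ref_runs):
    # First reference: subtract the control flow cell from the DNA flow cell
    sample_corr = subtract(sample_fc, sample_ref)
    buffer_corr = [subtract(b, r) for b, r in zip(buffer_fc_runs, buffer_ref_runs)]
    # Second reference: average the buffer injections and subtract them
    blank = mean_trace(buffer_corr)
    return subtract(sample_corr, blank)

# Made-up responses (RU) at three time points
sample_fc = [10.0, 25.0, 30.0]                        # DNA flow cell, ligand injection
sample_ref = [8.0, 9.0, 9.0]                          # control flow cell, ligand injection
buffer_fc_runs = [[1.0, 2.0, 2.0], [1.0, 2.0, 4.0]]   # DNA flow cell, buffer injections
buffer_ref_runs = [[0.0, 1.0, 1.0], [0.0, 1.0, 1.0]]  # control flow cell, buffer injections

corrected = double_reference(sample_fc, sample_ref, buffer_fc_runs, buffer_ref_runs)
print(corrected)
```

Averaging the buffer injections before subtraction, as suggested in the note, keeps the noise of a single blank run from being carried into every sample sensorgram.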


Acknowledgments

The work was supported by National Institutes of Health (NIH) Grant GM111749 (W.D.W.).

References

1. Hurley LH (2000) G-quadruplex DNA: a potential target for anti-cancer drug design. Trends Pharmacol Sci 21:136–142
2. Burge S, Parkinson GN, Hazel P, Todd AK, Neidle S (2006) Quadruplex DNA: sequence, topology and structure. Nucleic Acids Res 34:5402–5415
3. Huppert JL (2010) Structure, location and interactions of G-quadruplexes. FEBS J 277:3452–3458
4. Gellert M, Lipsett MN, Davies DR (1962) Helix formation by guanylic acid. Proc Natl Acad Sci U S A 48:2013–2018
5. Kudlicki AS (2016) G-Quadruplexes involving both strands of genomic DNA are highly abundant and colocalize with functional sites in the human genome. PLoS One 11:e0146174. https://doi.org/10.1371/journal.pone.0146174
6. Todd AK, Johnston M, Neidle S (2005) Highly prevalent putative quadruplex sequence motifs in human DNA. Nucleic Acids Res 33:2901–2907
7. Neidle S (2010) Human telomeric G-quadruplex: the current status of telomeric G-quadruplexes as therapeutic targets in human cancer. FEBS J 277:1118–1125
8. Kar A, Jones N, Arat O, Fishel R, Griffith JD (2018) Long repeating (TTAGGG)n single stranded DNA self-condenses into compact beaded filaments stabilized by G-quadruplex formation. J Biol Chem 293(24):9473–9485. https://doi.org/10.1074/jbc.RA118.002158
9. Balasubramanian S, Hurley LH, Neidle S (2011) Targeting G-quadruplexes in gene promoters: a novel anticancer strategy? Nat Rev Drug Discov 10:261–275
10. Agrawal P, Lin C, Mathad RI, Carver M, Yang D (2014) The major G-Quadruplex formed in the human BCL-2 proximal promoter adopts a parallel structure with a 13-nt loop in K+ solution. J Am Chem Soc 136:1750–1753
11. Rigo R, Palumbo M, Sissi C (2017) G-quadruplexes in human promoters: a challenge for therapeutic applications. Biochim Biophys Acta 1861:1399–1413

12. Hänsel-Hertsch R, Antonio MD, Balasubramanian S (2017) DNA G-quadruplexes in the human genome: detection, functions and therapeutic potential. Nat Rev Mol Cell Biol 18:279–284
13. Fujii T, Podbevšek P, Plavec J, Sugimoto N (2017) Effects of metal ions and cosolutes on G-quadruplex topology. J Inorg Biochem 166:190–198
14. Marchand A, Gabelica V (2016) Folding and misfolding pathways of G-quadruplex DNA. Nucleic Acids Res 44:10999–11012
15. Kim HM, Evans HM, Dubins DN, Chalikian TV (2015) Effects of salt on the stability of a G-Quadruplex from the human c-MYC promoter. Biochemistry 54:3420–3430
16. Islam MK, Jackson PJM, Rahman KM, Thurston DE (2016) Recent advances in targeting the telomeric G-quadruplex DNA sequence with small molecules as a strategy for anticancer therapies. Future Med Chem 8:1259–1290
17. Neidle S (2016) Quadruplex nucleic acids as novel therapeutic targets. J Med Chem 59:5987–6011
18. Cech TR (2000) Life at the end of the chromosome: telomeres and telomerase. Angew Chem Int Ed 39:34–43
19. Blackburn EH, Greider CW, Szostak JW (2006) Telomeres and telomerase: the path from maize, Tetrahymena and yeast to human cancer and aging. Nat Med 12:1133–1138
20. Neidle S (2017) Quadruplex nucleic acids as targets for anticancer therapeutics. Nat Rev Chem 1:1–10
21. Balasubramanian S, Neidle S (2009) G-quadruplex nucleic acids as therapeutic targets. Curr Opin Chem Biol 13:345–353
22. Kaiser CE, Ert NAV, Agrawal P, Chawla R, Yang D, Hurley LH (2017) Insight into the complexity of the i-motif and G-quadruplex DNA structures formed in the KRAS promoter and subsequent drug induced gene repression. J Am Chem Soc 139:8522–8536
23. Mazur S, Tanious FA, Ding D, Kumar A, Boykin DW, Simpson IJ, Neidle S, Wilson WD (2000) A thermodynamic and structural analysis of DNA minor-groove complex formation. J Mol Biol 300:321–337
24. Chaires JB (2006) A thermodynamic signature for drug-DNA binding mode. Arch Biochem Biophys 453:26–31
25. Pagano B, Giancola C (2007) Energetics of quadruplex-drug recognition in anticancer therapy. Curr Cancer Drug Targets 7:520–540
26. Pagano B, Mattia CA, Giancola C (2009) Applications of isothermal titration calorimetry in biophysical studies of G-quadruplexes. Int J Mol Sci 10:2935–2957
27. Antonacci C, Chaires JB, Sheardy RD (2007) Biophysical characterization of the human telomeric (TTAGGG)4 repeat in a potassium solution. Biochemistry 46:4654–4660
28. Wang Y, Patel DJ (1993) Solution structure of the human telomeric repeat d[AG3(T2AG3)3] G-tetraplex. Structure 1:263–282
29. Ambrus A, Chen D, Dai J, Bialis T, Jones RA, Yang D (2006) Human telomeric sequence forms a hybrid-type intramolecular G-quadruplex structure with mixed parallel/antiparallel strands in potassium solution. Nucleic Acids Res 34:2723–2735
30. Miyoshi D, Nakao A, Sugimoto N (2003) Structural transition from antiparallel to parallel G-quadruplex of d(G4T4G4) induced by Ca2+. Nucleic Acids Res 31:1156–1163
31. Heddi B, Phan A (2011) Structure of human telomeric DNA in crowded solution. J Am Chem Soc 133:9824–9833
32. Zimmerman SB, Minton AP (1993) Macromolecular crowding: biochemical, biophysical, and physiological consequences. Annu Rev Biophys Biomol Struct 22:27–65
33. Hazel P, Huppert J, Balasubramanian S, Neidle S (2004) Loop-length-dependent folding of G-quadruplexes. J Am Chem Soc 126:16405–16415
34. Rachwal PA, Brown T, Fox KR (2007) Effect of G-tract length on the topology and stability of intramolecular DNA quadruplexes. Biochemistry 46:3036–3044
35. Miyoshi D, Nakao A, Sugimoto N (2002) Molecular crowding regulates the structural switch of the DNA G-quadruplex. Biochemistry 41:15017–15024
36. Phan AT, Luu KN, Patel DJ (2006) Different loop arrangements of intramolecular human telomeric (3+1) G-quadruplexes in K+ solution. Nucleic Acids Res 34:5715–5719
37. Gray RD, Trent JO, Chaires JB (2014) Folding and unfolding pathways of the human telomeric G-Quadruplex. J Mol Biol 426:1629–1650

38. Wilson WD, Paul A (2014) Kinetics and structures on the molecular path to the quadruplex form of the human telomere. J Mol Biol 426:1625–1628
39. Ambrus A, Chen D, Dai J, Jones RA, Yang D (2005) Solution structure of the biologically relevant G-quadruplex element in the human c-MYC promoter. Implications for G-quadruplex stabilization. Biochemistry 44:2048–2058
40. Funke A, Karg B, Dickerhoff J, Balke D, Meller S, Weisz K (2018) Ligand-induced dimerization of a truncated parallel MYC G-quadruplex. ChemBioChem 19:505–512
41. Yang D, Hurley LH (2006) Structure of the biologically relevant G-quadruplex in the c-MYC promoter. Nucleosides Nucleotides Nucleic Acids 25:951–968
42. Ohnmacht SA, Varavipour E, Nanjunda R, Pazitna I, Di Vita G, Gunaratnam M, Kumar A, Ismail MA, Boykin DW, Wilson WD, Neidle S (2014) Discovery of new G-quadruplex binding chemotypes. Chem Commun 50:960–963
43. Nanjunda R, Owens EA, Mickelson L, Dost TL, Stroeva EM, Huynh HT, Germann MW, Henary MM, Wilson WD (2013) Selective G-quadruplex DNA recognition by a new class of designed cyanines. Molecules 18:13588–13607
44. Yang H, Zhong HJ, Leung KH, Chan DS, Ma VP, Fu WC, Nanjunda R, Wilson WD, Ma DL, Leung CH (2013) Structure-based design of flavone derivatives as c-myc oncogene downregulators. Eur J Pharm Sci 48:130–141
45. Duarte AR, Cadoni E, Ressurreicao AS, Moreira R, Paulo A (2018) Design of modular G-quadruplex ligands. ChemMedChem 13:869–893
46. Salgado GF, Cazenave C, Kerkour A, Mergny J-L (2015) G-quadruplex DNA and ligand interaction in living cells using NMR spectroscopy. Chem Sci 6:3314–3320
47. Le DD, Antonio MD, Chan LKM, Balasubramanian S (2015) G-quadruplex ligands exhibit differential G-tetrad selectivity. Chem Commun 51:8048–8050
48. Dhamodharan V, Harikrishna S, Bhasikuttan AC, Pradeepkumar PI (2015) Topology specific stabilization of promoter over telomeric G-quadruplex DNAs by bisbenzimidazole carboxamide derivatives. ACS Chem Biol 10:821–833
49. Wilson WD (2002) Analyzing biomolecular interactions. Science 295:2103–2105

50. Homola J (2008) Surface plasmon resonance sensors for detection of chemical and biological species. Chem Rev 108:462–493
51. Rich RL, Myszka DG (2000) Advances in surface plasmon resonance biosensor analysis. Curr Opin Biotechnol 11:54–61
52. Nanjunda R, Munde M, Liu Y, Wilson WD (2011) Real-time monitoring of nucleic acid interactions with biosensor-surface plasmon resonance. In: Wanunu M, Tor Y (eds) Methods for studying nucleic acid/drug interactions. CRC Press, Boca Raton, pp 91–122
53. Lombardo CM, Welsh SJ, Strauss SJ, Dale AG, Todd AK, Nanjunda R, Wilson WD, Neidle S (2012) A novel series of G-quadruplex ligands with selectivity for HIF-expressing osteosarcoma and renal cancer cell lines. Bioorg Med Chem Lett 22:5984–5988
54. Jain AK, Paul A, Maji B, Muniyappa K, Bhattacharya S (2012) Dimeric 1,3-Phenylene-bis(piperazinyl benzimidazole)s: synthesis and structure-activity investigations on their binding with human telomeric G-Quadruplex DNA and telomerase inhibition properties. J Med Chem 55:2981–2993
55. Koirala D, Dhakal S, Ashbridge B, Sannohe Y, Rodriguez R, Sugiyama H, Balasubramanian S, Mao H (2011) A single-molecule platform for investigation of interactions between G-quadruplexes and small molecule ligands. Nat Chem 3:782–787
56. Pilch DS, Barbieri CM, Rzuczek SG, LaVoie EJ, Rice JE (2008) Targeting human telomeric G-quadruplex DNA with oxazole-containing macrocyclic compound. Biochimie 90:1233–1249
57. White EW, Tanious F, Mohamed A, Ismail MA, Reszka AP, Neidle S, Boykin DW, Wilson WD (2007) Structure-specific recognition of quadruplex DNA by organic cations: influence of shape, substituents and charge. Biophys Chem 126:140–153
58. Myszka DG (2000) Kinetic, equilibrium, and thermodynamic analysis of macromolecular interactions with BIACORE. Methods Enzymol 323:325–340
59. Nguyen B, Tanious FA, Wilson WD (2007) Biosensor-surface plasmon resonance: quantitative analysis of small molecule-nucleic acid interactions. Methods 42:150–161
60. Tanious FA, Nguyen B, Wilson WD (2008) Biosensor-surface plasmon resonance methods for quantitative analysis of biomolecular interactions. In: Correia JJ, Detrich HW (eds) Methods Cell Biol 84:53–77
61. Karlsson R (1999) Affinity analysis of nonsteady-state data obtained under mass transport limited conditions using BIAcore technology. J Mol Recognit 12:285–292
62. Myszka DG (1999) Improving biosensor analysis. J Mol Recognit 12:279–284

Chapter 5

Putting a New Spin of G-Quadruplex Structure and Binding by Analytical Ultracentrifugation

William L. Dean, Robert D. Gray, Lynn DeLeeuw, Robert C. Monsen, and Jonathan B. Chaires

Abstract

Analytical ultracentrifugation is a powerful biophysical tool that provides information about G-quadruplex structure, stability, and binding reactivity. This chapter provides a simplified explanation of the method, along with examples of how it can be used to characterize G4 formation and to monitor small-molecule binding.

Key words Analytical ultracentrifugation, G-quadruplex, Molecular weight, Ligand binding, Stoichiometry, Sample homogeneity

1

Introduction

1.1 Analytical Ultracentrifugation

Analytical ultracentrifugation (AUC) is underappreciated by the G-quadruplex (G4) community. AUC is a venerable biophysical technique that has a (nearly) 100 year history. Theodor Svedberg invented the analytical ultracentrifuge in 1925, and won the Nobel Prize in Chemistry the next year for his research on colloids and proteins using his invention. AUC has since been widely used as a fundamental technique for the determination of macromolecular structure, reaction stoichiometry and ligand affinity [1–4]. AUC is based on first-principle physical theory, and can be used to determine absolute molecular weights of molecules, along with their hydrodynamic shapes. Our laboratory has found AUC useful for a variety of G4 structural studies [5–13]. The intent of this chapter is to provide a simplified overview of AUC and then to show its utility for characterizing G4 structure and binding. Figure 1 shows a schematic of the most basic AUC experiment, sedimentation velocity (SV). A sample is placed in one sector of the centerpiece within a sealed cell assembly with quartz windows (Fig. 1a). A reference solution is placed in the second sector. The

Danzhou Yang and Clement Lin (eds.), G-Quadruplex Nucleic Acids: Methods and Protocols, Methods in Molecular Biology, vol. 2035, https://doi.org/10.1007/978-1-4939-9666-7_5, © Springer Science+Business Media, LLC, part of Springer Nature 2019


Fig. 1 Schematic representation of a sedimentation velocity experiment. The panel on the left represents a double sector centrifuge cell with a sedimenting sample in the upper sector and the reference buffer in the lower sector. The right panel shows four absorbance scans taken at four different times during sedimentation

cell assembly is then placed in a rotor and spun in an ultracentrifuge to produce a high centrifugal field. The centrifugal field is sufficient to cause the sedimentation of molecules within the sample cell in the direction of the field, toward the bottom of the sector. The AUC instrument uses an optical system, typically absorbance, to scan the sample cell to monitor the concentration of molecules at each radial position in the cell, and to record their movement as a function of time at constant centrifugal force. As molecules move within the sample cell, a boundary is formed that changes with time. Figure 1b shows a schematic of an SV experiment, with intermittent scans showing the migration of a boundary toward the bottom of the sample cell. As the boundary migrates to the bottom it also broadens because of diffusion of the sedimenting molecule. Such primary data contain sufficient information to extract the structural properties of the sedimenting molecules. As any number of basic textbooks (see Note 1) derive and show, the sedimentation coefficient (s) is defined as the velocity (v) of the moving boundary (determined at the midpoint) divided by the centrifugal field strength (ω²r), that is, the square of the rotor speed expressed in radians per second times the radial distance from the center of rotation:

$$ s = \frac{v}{\omega^{2} r} = \frac{M\,(1 - \bar{v}\rho)}{N f} $$

The sedimentation coefficient is then equal to the product of the molecular weight of the molecule (M) and the buoyancy term $(1 - \bar{v}\rho)$ divided by the product of Avogadro's number (N) and the frictional coefficient (f) (see Note 2). In the buoyancy term, $\bar{v}$ is the partial specific volume and ρ is the density of the solvent. The sedimentation coefficient is thus determined by the mass of the molecule and its shape. A useful extension of this basic equation is the Svedberg equation:

$$ \frac{s}{D} = \frac{M\,(1 - \bar{v}\rho)}{R T} $$

where D is the diffusion coefficient of the molecule (see Note 3), R the gas constant (8.314 J/(mol·K)), and T the absolute temperature. The molecular weight thus can be obtained from experimentally measured sedimentation and diffusion coefficients. AUC offers some unique advantages over other methods for determination of G4 molecular weight. First, AUC measurements are done in solutions of defined and invariant composition, and sedimentation is sensitive to not only the mass of molecules but also their hydrodynamic shape. Second, in methods such as mass spectrometry and analytical size exclusion chromatography, the equilibrium of the initial sample solution is necessarily perturbed, perhaps leading to some strand dissociation in multi-stranded G4 structures or small-molecule ligand dissociation. In contrast, in AUC measurements the moving G4 boundary remains in equilibrium as it moves through the cell since it is not diluted or bound to a solid medium, thus minimizing perturbation of multi-stranded structures or of bound ligand.

1.2 Use of AUC to Quantify Ligand Binding

Since the seminal paper by Steinberg and Schachman that described the use of the AUC absorbance optical system for studies of small-molecule binding to proteins [14], there have been myriad examples where AUC has been used to determine binding stoichiometry and dissociation constants for protein binding interactions. However, AUC is highly underutilized for studying interactions between small molecules and DNA, even though it was first used in 1954 (!) to show that methyl green binds to calf thymus DNA [15]. There are few published examples utilizing AUC to study small-molecule interactions with DNA [16–18] and even fewer assessing the interaction of small molecules with G-quadruplex-containing sequences [18, 19].

2

Materials

1. Oligonucleotides can be purchased from a suitable supplier. We typically use oligos from IDT, Coralville, IA, or Eurofins Genomics, Louisville, KY.
2. Stock solutions are prepared by dissolving the lyophilized, desalted oligos in MQ-H2O to a final concentration of ~1 mM.
3. Working samples are prepared by diluting the stock DNA to ~3–5 μM in the appropriate buffer. We use either phosphate or TBAP (10 mM tetrabutylammonium phosphate, 1 mM EDTA, pH 7.0) buffers, usually containing at least 100 mM KCl to promote quadruplex folding and to prevent nonideal behavior in the AUC due to low ionic strength.
4. The annealing protocol used can play an important role in quadruplex formation (see Note 4) [18]. Unless otherwise noted, samples are annealed by heating for 5–10 min in a 1-L boiling water bath followed by slow (12–24 h) cooling to room temperature.
5. In binding measurements the small-molecule ligands are dissolved in the same buffer. If the pure ligand is dissolved in DMSO, then a maximum of 1% DMSO can be tolerated in the ligand stock solution.
6. Our AUC is a Beckman Coulter ProteomeLab XL-A analytical ultracentrifuge (Beckman Coulter Inc., Brea, CA). We use either a four-hole (An-60) or an eight-hole (An-50) rotor with standard two-sector cells.
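The dilution in item 3 is a simple C1V1 = C2V2 calculation. A minimal sketch follows; the final volume of 500 μL is a hypothetical choice, and the function name is ours.

```python
def stock_volume_ul(stock_mM, target_uM, final_volume_ul):
    """Volume of stock (uL) needed to reach target_uM in final_volume_ul (C1*V1 = C2*V2)."""
    stock_uM = stock_mM * 1000.0
    return target_uM * final_volume_ul / stock_uM

# ~1 mM stock diluted to 5 uM in a hypothetical 500 uL working sample
dna_ul = stock_volume_ul(stock_mM=1.0, target_uM=5.0, final_volume_ul=500.0)
buffer_ul = 500.0 - dna_ul

print(f"{dna_ul:.1f} uL stock + {buffer_ul:.1f} uL buffer")
```

Because the stock concentration is only approximate (~1 mM), the final DNA concentration is best confirmed from the absorbance of the working sample.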

3

Methods

3.1 Analytical Ultracentrifugation

We have presented detailed protocols for the analysis of AUC data obtained for G4 structures in a previous volume of this series [13] and elsewhere [6], and a simple step-by-step protocol is available at http://www.analyticalultracentrifugation.com/sedphat/experimental_protocols.htm. Our specific AUC protocol is presented below (see Note 5 regarding safety procedures and rotor treatment):

1. 430 μL of sample is placed in the sample sector and 450 μL of buffer in the reference sector of a standard two-sector cell used for sedimentation velocity.
2. A speed of 40,000 rpm (129,000 × g) and a wavelength of 260 nm are appropriate for most applications with G-quadruplex-containing DNA oligonucleotides (see Note 6).
3. Buffers should contain at least 0.1 M salt to minimize nonideal behavior (see Note 7).
4. To obtain s20,w values, molecular weights, and frictional ratios, the density and viscosity of the buffer must be known. For simple buffers such as PBS, these parameters can be calculated using the program Sednterp (free software: http://jphilo.mailway.com/download.htm).
5. Temperature can be maintained between 4 and 40 °C in the AUC, but since DNA is quite stable, the standard temperature of 20 °C is preferable.
6. After centrifugation sufficient for 100 scans (usually 4–8 h), the data from the AUC run are stored by the instrument software package in a folder labeled with the date of the experiment, with subfolders labeled with the time of the run on that date.
7. The data are analyzed with SEDFIT software (www.analyticalultracentrifugation.com) using the continuous c(s) distribution model. Using the “load new data” tab, the appropriate experiment is loaded and then the cell number is specified. The scans are chosen by highlighting them, and the density and viscosity of the buffer and the partial specific volume (0.55 mL/g for G4s) (see Note 8) are entered in the “parameters” tab.
8. After running and fitting the data, the program provides a figure showing c(s) (concentration dependence of the sedimenting DNA) versus sedimentation coefficient. Use of “Control-m” then gives the s20,w, frictional coefficient, frictional ratio (an indicator of the deviation from a perfect sphere, where the larger the number, the more elongated the sedimenting species), and Stokes radius for each peak.
9. The analysis provides the starting concentration of each c(s) versus s peak so that the concentration of the molecule in the peak can be directly calculated from the extinction coefficient at that wavelength.

3.2 Illustrative Results of AUC Analysis of G4 Structures

Figure 2 shows an example of experimental data obtained by sedimentation velocity of the “hybrid” G4 structure formed by the human telomere repeat sequence Tel22, 5′-AGGG(TTAGGG)3. With SEDFIT (see step 7 above in Subheading 3.1) the data can be analyzed in sophisticated ways to obtain s and D coefficients simultaneously, thereby providing both molecular weight estimates and shape information from a single sedimentation experiment. Figure 2b shows the results of such an analysis, presented as a c(s) plot in which the best-fit distribution of sedimentation coefficients derived from the data is shown. The results show that this G4 sample is homogeneous with a sedimentation coefficient of 1.9 S.
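The way s and D combine into a molecular weight and a frictional coefficient follows directly from the two equations in Subheading 1.1. The following is a minimal sketch in CGS units, not the SEDFIT algorithm; the diffusion coefficient and solvent density are hypothetical values chosen for illustration, and the partial specific volume of 0.55 from Subheading 3.1 is taken here in mL/g.

```python
R_GAS = 8.314e7        # gas constant in erg/(mol K), i.e., 8.314 J/(mol K) in CGS units
AVOGADRO = 6.022e23    # 1/mol

def buoyancy(vbar, rho):
    """Buoyancy term (1 - vbar * rho); vbar in mL/g, rho in g/mL."""
    return 1.0 - vbar * rho

def molecular_weight(s, D, vbar, rho, T):
    """Svedberg equation solved for M (g/mol): M = s R T / (D (1 - vbar rho))."""
    return s * R_GAS * T / (D * buoyancy(vbar, rho))

def frictional_coefficient(M, s, vbar, rho):
    """Frictional coefficient f (g/s) from s = M (1 - vbar rho) / (N f)."""
    return M * buoyancy(vbar, rho) / (AVOGADRO * s)

s = 1.9e-13    # 1.9 S expressed in seconds (value reported for Tel22)
D = 1.5e-6     # cm^2/s, hypothetical diffusion coefficient
vbar = 0.55    # mL/g, partial specific volume used for G4s
rho = 1.0      # g/mL, hypothetical solvent density
T = 293.15     # K (20 degrees C)

M = molecular_weight(s, D, vbar, rho, T)
f = frictional_coefficient(M, s, vbar, rho)

print(f"M = {M:.0f} g/mol, f = {f:.3e} g/s")
```

With these illustrative inputs the molecular weight comes out near 6.9 × 10^3 g/mol, the order expected for a 22-nt oligonucleotide; in practice the buffer density and the fitted D would replace the placeholder values.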

3.2.1 Application of SV to Demonstrate G4 Structural Formation

Figure 3 shows how SV can be used to demonstrate folding of an oligonucleotide into a compact structure. The telomeric sequence d[TTGGG(TTAGGG)3A] has been shown to fold into a compact three tetrad hybrid structure termed 2GKU in the presence of K+ [20]. We used AUC to show the different sedimentation behaviors of the unfolded sequence in LiCl, the unfolded sequence at pH 11.5, the folded 2GKU sequence in K+, an unstructured sequence of similar molecular weight, dT24, and finally the duplex formed with the complementary sequence to 2GKU. The differences in sedimentation behavior reflect differences in structure, not molecular weight, except for the case of the duplex. As indicated by the equation for the sedimentation coefficient above, an increase in the frictional coefficient due to unfolding will decrease the


Fig. 2 Sedimentation velocity experiment showing results from a solution containing 5 μM Tel22 in 0.2 M KCl. Panel A shows the raw data from 20 of the 100 scans taken over a 6 h time period at 40,000 rpm (129,000 × g) and 20 °C. Panel B shows SEDFIT analysis using the continuous c(s) distribution model. The results show that this G4 sample is homogeneous with a sedimentation coefficient of 1.9 S


Fig. 3 AUC and CD analysis of 2GKU under various buffer conditions known to result in the formation of different structures. Panel (a) shows CD spectra of 2GKU in KCl (black), 2GKU in LiCl (blue), 2GKU duplex (red), dT24 (green), and 2GKU at pH 11.5 (purple). The buffer was 10 mM sodium phosphate with added 185 mM KCl or LiCl, pH 7.0 or 11.5. Panel (b) shows SEDFIT analysis of sedimentation velocity results using the continuous c(s) distribution model of the same solutions using the same color scheme as in (a). The s20,w values for each structure are: 2GKU in KCl, 2.00; 2GKU in LiCl, 1.66; 2GKU duplex, 2.63; dT24, 1.41; 2GKU at pH 11.5, 1.60


observed sedimentation coefficient as shown for 2GKU in Li+, at pH 11.5 and for the dT24 sequence. The fact that 2GKU in Li+ has a larger sedimentation coefficient than the sequence at pH 11.5 or dT24 suggests that some folding has been retained in Li+. The sedimentation coefficient of the duplex is increased because the molecular weight nearly doubles, although its more elongated shape increases f and therefore partially offsets the increase in s. The structures of 2GKU under the buffer conditions used for AUC were analyzed by circular dichroism as shown in Fig. 3a. The conclusions with respect to oligonucleotide folding are consistent with the CD spectra presented in Fig. 3a. For example, 2GKU in 185 mM KCl is fully folded into a hybrid-1 structure, while 2GKU in 180 mM LiCl gives a CD spectrum that indicates partial folding, probably a mixture of folded hybrid and unfolded forms. This is consistent with the known inability of Li+ to promote full G4 folding at these concentrations, but is inconsistent with the prevalent notion that Li+ is completely unable to promote G4 folding. As expected, 2GKU in 185 mM KCl, pH 11.5, is unfolded as indicated by CD. The GC-rich 2GKU duplex at neutral pH gives a CD spectrum characteristic of A-type DNA [21] while the CD spectrum at pH 11.5 reflects the inherent asymmetry of the nucleobase chromophore itself in a non-helical (random) backbone conformation [22].

3.2.2 Use of Circular Dichroism to Assess G4 Structure

Blank-corrected CD spectra were recorded in a 1-cm pathlength quartz cuvette at 25 °C with a Jasco J810 spectropolarimeter equipped with a Peltier temperature controller as described [21]. CD spectra were normalized using the equation Δε = θ/(32,980·l·c), where θ is the observed ellipticity in millidegrees, l is the pathlength in cm, and c is the molar strand concentration of DNA.
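The normalization above is a one-line conversion. A minimal sketch follows; the ellipticity and concentration are made-up values, and the function name is ours.

```python
def delta_epsilon(theta_mdeg, path_cm, conc_molar):
    """Molar circular dichroism: delta_epsilon = theta / (32,980 * l * c)."""
    return theta_mdeg / (32980.0 * path_cm * conc_molar)

# Hypothetical reading: 10 mdeg in a 1-cm cuvette at 5 uM strand concentration
value = delta_epsilon(theta_mdeg=10.0, path_cm=1.0, conc_molar=5.0e-6)

print(f"delta epsilon = {value:.1f} per M per cm")
```

Applying the function to every wavelength of a blank-corrected spectrum gives the normalized spectrum, which can then be compared between samples of different concentration.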

3.3 Detecting and Characterizing G4 Heterogeneity

A distinct and powerful advantage of SV is its ability to reveal and characterize heterogeneity. Figure 4 shows a case where a truncated form of the human telomere repeat sequence (5′-(TTAGGG)3) was studied. It was anticipated that this sequence would fold into a triple-helical structure that might represent an intermediate along the multistate G4 folding pathway. SV shows, however, that this is not the case. There are two structures present at sedimentation coefficients of approximately 1.25 and 2.0 S, with calculated molecular weights indicating a mixture of nearly equal amounts of monomer and dimer. The CD spectrum (Fig. 4 inset) is consistent with a mixture of conformations rather than that of a single structure. This result is expected based on the NMR studies of truncated Tel22 sequences [23].
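A quick plausibility check of the monomer/dimer assignment can be made from the two sedimentation coefficients alone. The sketch below assumes that both species have a similar shape and partial specific volume, in which case s scales approximately as M^(2/3); this scaling is our simplifying assumption and is not the analysis used to obtain the calculated molecular weights.

```python
def expected_s_ratio(mass_ratio):
    """Ratio of sedimentation coefficients for two species of similar shape (s ~ M^(2/3))."""
    return mass_ratio ** (2.0 / 3.0)

observed = 2.0 / 1.25              # ratio of the two peak positions reported above
expected = expected_s_ratio(2.0)   # dimer versus monomer

print(f"observed ratio {observed:.2f}, expected for a dimer {expected:.2f}")
```

The observed ratio of the two peaks is close to the value expected for a doubling of mass, which agrees with the assignment made from the fitted molecular weights.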


Fig. 4 SEDFIT analysis of 5′-(TTAGGG)3 (7 μM) in 10 mM MOPS, 30 mM KCl, pH 7.0, using the continuous c(s) distribution model. Inset: CD spectrum of 5′-(TTAGGG)3 (7 μM) in 10 mM MOPS, 30 mM KCl, pH 7.0

3.3.1 Application of SV to Show Ligand and Protein Binding to G4 Structures

The use of AUC to study small molecule-macromolecule interactions is based on first principles of thermodynamics. The determination of free and bound ligand using SV is actually quite straightforward if the ligand can be detected by absorbance, fluorescence, or interference optical systems. Macromolecule and ligand are mixed and allowed to come to equilibrium. The resulting solution is then centrifuged in the AUC. A virtue of AUC is that differential migration of the species allows the complex to be separated from the smaller reactants. Yet, since migration takes place in free solution, the equilibrium is not perturbed because the complex remains bathed in free ligand. After centrifugation for long enough to separate the faster sedimenting macromolecule complex from the much slower sedimenting ligand, the free concentration of the ligand can be determined directly from the portion of solution in the centrifuge cell that is free of macromolecule. Knowing the total amount of starting ligand, the amount of bound ligand is calculated by difference. An idealized example is shown in Fig. 5. This measurement provides the fractional saturation of the ligand at the total ligand concentration used in the experiment and a single point in a binding isotherm that can be used to determine the dissociation constant for the ligand. In addition, one can obtain the stoichiometry of binding if the macromolecule is saturated, and an indication of any conformational change in the macromolecule that has

Fig. 5 Idealized binding analysis experiment using sedimentation velocity with absorption optics. Panel (a) shows the results of a single scan at the wavelength where the receptor absorbs and at a time when the receptor has sedimented approximately 60% of the length of the solution column. Panel (b) shows the results at the same time at a wavelength where the ligand absorbs. Panel (c) shows analysis of the scan from panel (b) with concentrations of total, free and bound ligand clearly delineated

G4 Structure and Binding by AUC


occurred upon ligand binding based on changes in the sedimentation coefficient and frictional ratio. This methodology is analogous to equilibrium dialysis and to the classic Hummel and Dreyer gel filtration chromatography binding method [24]. A major advantage of AUC over these other methods is that it allows for the analysis of systems free in solution, without the possible complications of interactions with the matrix in gel filtration or with the membrane in dialysis. Furthermore, one also obtains structural information about the macromolecule in the presence of bound ligand. If there is overlap between the absorption spectra of the ligand and macromolecule, then a multi-wavelength approach can be used to determine the amounts of ligand and macromolecule in the region of the cell containing both ligand and macromolecule [25–29]. For a two-component system such as a single macromolecule and ligand, only two wavelengths are required to determine the contribution of macromolecule and ligand to each position in the centrifuge cell. New methodology has been developed for the most recent iteration of the analytical ultracentrifuge (Beckman Coulter Optima AUC) in which a multi-wavelength scan is acquired at each position in the cell during centrifugation [26]. A general methodology is outlined below.

3.3.2 Steps to Determine Binding Using Absorbance Optics in the Proteome Lab XL-A

1. The concentration of ligand should be sufficient to give an absorbance of 0.2–1.2.
2. If there is overlap between the absorbance of the ligand and the macromolecule, the experiment must be carried out at two wavelengths, preferably at the absorption maxima of the ligand and the macromolecule. The total absorbance of the two components at each wavelength should be in the range of 0.2–1.2.
3. Three cells are required for each binding experiment: one containing ligand only at the same concentration as used in the binding experiment, one with macromolecule alone at the same concentration used in the mixture, and one with a mixture of the two components.
4. The absorbance of the small molecule at its starting concentration, together with the extinction coefficient, is used to determine the free ligand concentration; knowing the total small molecule concentration then allows the bound ligand to be calculated by difference.
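Step 2 rests on the fact that, for two absorbing components, absorbances at two wavelengths are enough to solve for both concentrations. The following is a minimal Python sketch of that linear solve, assuming Beer-Lambert additivity. The function name, the extinction coefficients, and the absorbance readings are hypothetical placeholders chosen for illustration; they are not values from this chapter.

```python
import numpy as np

PATHLENGTH_CM = 1.2  # centrifuge cell pathlength stated in the text


def two_component_concentrations(a1, a2, eps_lig, eps_mac,
                                 pathlength_cm=PATHLENGTH_CM):
    """Solve Beer-Lambert at two wavelengths for two absorbing species.

    a1, a2  : absorbance at wavelength 1 and wavelength 2 (same radial position)
    eps_lig : (eps at wavelength 1, eps at wavelength 2) for the ligand, M^-1 cm^-1
    eps_mac : the same pair for the macromolecule, M^-1 cm^-1
    Returns (ligand concentration, macromolecule concentration) in M.
    """
    # Each row is one wavelength: A = l * (eps_lig * c_lig + eps_mac * c_mac)
    coeffs = pathlength_cm * np.array([[eps_lig[0], eps_mac[0]],
                                       [eps_lig[1], eps_mac[1]]], dtype=float)
    c_lig, c_mac = np.linalg.solve(coeffs, np.array([a1, a2], dtype=float))
    return float(c_lig), float(c_mac)


# HYPOTHETICAL readings and extinction coefficients, for illustration only.
c_lig, c_mac = two_component_concentrations(
    a1=0.2424, a2=0.54,
    eps_lig=(20000.0, 5000.0),
    eps_mac=(1000.0, 200000.0),
)
```

The solve is only well conditioned when the two spectra differ appreciably at the chosen wavelengths, which is one reason to work near the respective absorption maxima.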

3.3.3 Analysis

With reference to the schematic in Fig. 5, or to the actual data scans in Fig. 6, the free ligand concentration (Lf) is calculated from the absorbance (Afree) of the first, slowest migrating, boundary:

$L_f = \dfrac{A_{free}}{\varepsilon_f \, l}$


William L. Dean et al.

Fig. 6 Sedimentation velocity analysis of binding of Sunitinib to Tel48. Sunitinib (30 μM) was mixed with Tel48 (3 μM) for 30 min and then analyzed by sedimentation velocity at 418 nm, the absorbance maximum of Sunitinib. Tel48 does not absorb at that wavelength. Panel (a) shows binding analysis as described in Fig. 5c for a single scan where Tel48 has sedimented 60% of the length of the solution column. Panel (b) shows sedfit analysis of the binding experiment using the continuous c(s) distribution model. The area under the curve at the sedimentation coefficient of Tel48 represents bound Sunitinib and is exactly equal to the absorbance of bound Sunitinib determined in (a), 0.28 absorbance units


where εf is the molar extinction coefficient of the ligand and l is the pathlength of the centrifuge cell (1.2 cm). (Afree can also be obtained from c(s) plots by integration to obtain the peak area.) Since the total ligand concentration (LT) is known from the preparation of the solution, the amount of bound ligand (Lb) may be calculated as the difference between the total and free ligand concentrations: Lb = LT − Lf. (Alternatively, Lb could also be obtained directly from the absorbance of the boundary of the complex (Abound) in Figs. 5 and 6 if the extinction coefficient of the bound ligand (εb) is known and if the absorbance of the ligand is well separated from the absorbance of the macromolecule.) Once these quantities are known, the fraction of ligand bound (θ) is easily calculated as θ = Lb/LT. The apparent binding stoichiometry n (moles bound ligand/mole of macromolecule) can be calculated as n = Lb/M, where M is the total concentration of macromolecule receptor.

3.3.4 Illustrative Results

An example of a binding experiment in which interaction between a chemotherapeutic drug Sunitinib and hTel48 (TTAGGG)8 occurs is shown in Fig. 6. At 418 nm only the drug absorbs, so the analysis is not complicated by the presence of Tel48 and looks similar to the idealized example in Fig. 5a. The concentration of free Sunitinib can be calculated directly from the absorbance at 418 nm after SEDFIT analysis in the continuous c(s) distribution model, in which the area under the c(s) versus s peak is equal to the absorbance of that species (Fig. 6b). Note that the absorbance for free Sunitinib from a single scan in Fig. 6a is equal to the absorbance in the integrated peak in Fig. 6b. In this experiment the starting concentration of Sunitinib was 30 μM and Tel48 was 3 μM, so at this tenfold excess of drug, the apparent stoichiometry is 6.3 moles of drug/mole Tel48. Table 1 summarizes the results for a number of small molecules found to interact with different G4 structures, including an antiparallel hybrid form (Tel22), a structure containing contiguous antiparallel G4 structures (Tel48) and a parallel, “propeller” G4 structure (1XAV). The reliable determination of the binding stoichiometries shown in Table 1 provides a firm foundation for the analysis of binding isotherms obtained by independent spectrophotometric titrations or by isothermal calorimetry.
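The arithmetic of Subheading 3.3.3 can be sketched in a few lines of Python. The total concentrations (30 μM Sunitinib, 3 μM Tel48) and the resulting stoichiometry of 6.3 come from the example above, which together imply Lb = 18.9 μM and Lf = 11.1 μM. The extinction coefficient and the free-boundary absorbance used below are hypothetical values chosen only to reproduce that Lf; the function names are likewise ours.

```python
PATHLENGTH_CM = 1.2  # centrifuge cell pathlength stated in the text


def free_ligand_conc(a_free, eps_free, pathlength_cm=PATHLENGTH_CM):
    """L_f = A_free / (eps_f * l), returned in mol/L."""
    return a_free / (eps_free * pathlength_cm)


def binding_summary(l_total, l_free, m_total):
    """Bound ligand, fraction of ligand bound (theta), apparent stoichiometry n."""
    l_bound = l_total - l_free          # L_b = L_T - L_f
    return {
        "L_bound": l_bound,
        "theta": l_bound / l_total,     # theta = L_b / L_T
        "n": l_bound / m_total,         # n = L_b / M
    }


# HYPOTHETICAL extinction coefficient and absorbance (not from the chapter),
# chosen so that L_f = 11.1 uM as implied by the reported stoichiometry.
EPS_FREE_HYPOTHETICAL = 10_000.0  # M^-1 cm^-1
l_free = free_ligand_conc(a_free=0.1332, eps_free=EPS_FREE_HYPOTHETICAL)

# Total concentrations taken from the text: 30 uM drug, 3 uM Tel48.
result = binding_summary(l_total=30e-6, l_free=l_free, m_total=3e-6)
```

Because Lb is obtained by difference, any error in Lf propagates directly into n, so the free boundary should be read where the plateau is flat.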

3.4 Final Comments

Our laboratory now regards AUC as an essential criterion for the demonstration of G4 formation in nucleic acid sequences of interest. AUC shows definitively whether or not a given sequence has folded into a compact structure and then provides the molecular weight of the structure to unambiguously determine its


Table 1 Summary of small molecule binding AUC measurements in K+ buffers

Compound          DNA     Stoichiometry(a)   Wavelength (nm)
Doxorubicin       Tel48   12                 470
Berberine         Tel22   1                  345
Berberine         Tel48   2                  345
Thiazole orange   Tel22   3                  478
Thiazole orange   1XAV    3                  478
NMM               Tel22   0                  380
NMM               1XAV    1                  380
Quindoline        Tel22   4                  340
Quindoline        1XAV    4                  340

(a) Rounded to nearest whole number. Ratio of small molecule:DNA for all experiments was a minimum of 10

molecularity. AUC provides a clear characterization of the homogeneity of the folded structure, and clearly identifies the presence and exact amounts of any unfolded species or unwanted aggregates. AUC clearly indicates if additional sample purification is needed. Once AUC has established that a sequence has folded into a homogeneous structure, additional methods like circular dichroism can be used with confidence to characterize the topological features of the folded G4 structure [30]. Our confidence in AUC as a biophysical tool based on fundamental physical principles is such that we have adopted the motto “AUC non mentior” (AUC does not lie).

4 Notes

1. For example see https://en.wikipedia.org/wiki/Ultracentrifuge; https://www.beckman.com/resources/techniques-and-methods/analytical-ultracentrifugation; K.E. van Holde, W.C. Johnson, P.S. Ho, Principles of Physical Biochemistry, 2nd ed., Chapter 5, Prentice-Hall, Upper Saddle River, NJ, 2006; D.M. Freifelder, Physical Biochemistry: Applications to Biochemistry and Molecular Biology, 2nd ed., San Francisco, W.H. Freeman; D. Sheehan, Physical Biochemistry: Principles and Applications, 2nd ed., Hoboken, NJ, John Wiley & Sons; C.S. Smith, 1988, Estimation of sedimentation coefficients and frictional ratios of globular proteins, Biochem Education 16 (2), 104–106.


2. Frictional coefficient f is a measure of resistance to movement of a molecule through solvent and is affected by size, shape, and viscosity. The frictional ratio f/f0 is the frictional coefficient of the sedimenting particle compared to a hypothetical spherical molecule of the same volume. A value of 1.1 for f/f0 indicates a nearly spherical shape while a value of greater than 1.5 indicates a significantly elongated molecule. The Stokes radius, Rs, is obtained from Stokes law: f = 6πηRs.
3. The diffusion coefficient, D, is determined from the spreading of the c(s) versus s peak in SEDFIT.
4. To minimize molecular heterogeneity due to the formation of mis-folded or aggregated structures and to ensure reproducible folding, it is crucial to follow the slow annealing protocol with well-defined buffers as outlined. See ref. 18 for examples of such protocols.
5. Ultracentrifuge maintenance and hazards.
(a) Wash cells with 1% SDS and ethanol to ensure hydrophobic ligands are removed. Pre-cool rotors to save on temperature equilibration time.
(b) Never use metal needles on the AUC cells.
(c) Keep track of cell # and usage so any unusual behavior can be assigned to specific cells.
(d) Emphasize safety precautions (rotor care, rotor inspection); see: https://www.americanlaboratory.com/914-Application-Notes/1326-Centrifuge-Safety-and-Security/.
6. 40,000 rpm (129,000 × g) is chosen as a rotor speed because it is significantly slower than the maximum rpm of the An 50 rotor. Slower speeds could be chosen but require longer centrifugation times.
7. Use of ionic strength 99% purity, Merck).
7. Sodium chloride (>99% purity, Merck).
8. Polyethylene glycol (PEG 3000, Merck).
9. EDTA (>99% purity, Merck).


San Hadži et al.

10. High-purity water (MilliQ).
11. UV/Visible Spectrophotometer (Cary 100 BIO; Varian Inc.).
12. Differential Scanning Calorimeter (Nano-DSC; TA Instruments, New Castle, DE, USA).
13. CD spectrophotometer (62A DS; Aviv Biomedical, Lakewood, NJ, USA).

3 Methods

The methods described below outline sample and buffer preparation (Subheading 3.1), differential scanning calorimetry (DSC) (Subheading 3.2), thermodynamic analysis of DSC data (Subheading 3.3), phase diagrams (Subheading 3.4), thermodynamic driving forces (Subheading 3.5) and kinetic analysis (Subheading 3.6).

3.1 Sample and Buffer Preparation

1. Preparation of cacodylic buffer (see Note 1) solution with different concentrations of K+ or Na+ ions and PEG: Cacodylic acid and EDTA were dissolved in high-purity water. KOH (NaOH) was added to this acid solution to reach pH = 6.9. Then, KCl or NaCl was added to obtain the desired concentration of K+ (Na+) ions. Solutions with PEG (see Note 2) were prepared by dissolving a known amount of PEG in the buffer solution.
2. Preparation of DNA sample solution: DNA oligonucleotide 5′-AGGGTTAGGGTTAGGGTTAGGG-3′ (Tel22) was first dissolved in water and then extensively dialyzed (see Note 3) against the cacodylic buffer (3 exchanges of buffer solution in 12 h). The dialyzed Tel22 solution was first heated up to 95 °C in an outer thermostat for 5 min to make sure that all Tel22 transforms into the unfolded form, and then cooled down to 5 °C at the cooling rate of 0.05 °C min⁻¹, allowing the Tel22 to adopt G4 structure(s).
3. DNA concentration determination: Tel22 concentration was determined spectrophotometrically at 25 °C from melting curves monitored at 260 nm. The melting curves are concentration independent in the 0.003–0.3 mM range, suggesting that the observed transitions are monomolecular. The melting curves were used to obtain the absorbance of the unfolded form at 25 °C (the linear high-temperature parts were extrapolated to 25 °C). For the extinction coefficient of the Tel22 unfolded form at 25 °C we used the value 228,500 M⁻¹ cm⁻¹ estimated from the nearest-neighbor data of Cantor et al. [13].
4. Verification of the Tel22 folded topology: To make sure that Tel22 is folded into the expected structural type (H = hybrid (K+), A = antiparallel (Na+), P = parallel (K+, high PEG

G4 Transitions Analyzed by DSC

[Fig. 2 graphic: Tel22 structures labeled A, H, and P, with CD spectra plotted as [Θ] / 10^6 deg M⁻¹ cm⁻¹ versus λ / nm (220–320 nm)]

Fig. 2 Model system: Tel22 and its G-quadruplex structures formed in solutions containing Na+ ions (A = antiparallel, PDB: 143D), K+ ions (H = hybrid-1, PDB: 2HY9, 2JSM; and hybrid-2, PDB: 2JPZ, 2JSK), and K+ ions and PEG (P = parallel, PDB: 1KF1) showing characteristic peaks in CD spectra

concentrations)) we measured the CD spectra, which show the appropriate characteristic peaks (Fig. 2). CD spectra of 20 μM oligonucleotide solutions were recorded at 25 °C in the wavelength range between 215 and 320 nm with a signal averaging time of 10 s and 5 nm bandwidth. The measured ellipticity, Θ, was normalized to 1 M DNA and 1 cm path length to obtain the molar ellipticity, [Θ].

3.2 Differential Scanning Calorimetry (DSC)

Cyclic DSC experiments were performed at the DNA concentration of 0.3 mM in the temperature range between 1 and 120 °C (see Note 4), at the heating/cooling rates (r) of 1.0 and 2.0 °C/min (see Subheading 3.6). The consecutive scans performed at different r were identical, suggesting that the (un)folding of Tel22 may be considered a reversible process. The corresponding baseline (buffer vs. buffer) thermograms (see Note 5) were subtracted from the heating and cooling thermograms, and the obtained differences were normalized to 1 mole of DNA (single strand) to obtain the partial molar heat capacity of DNA, CP,2, as a function of temperature. DSC experiments were performed using a Nano DSC instrument (TA Instruments, New Castle, DE, USA). For further analysis the obtained CP,2 is more conveniently expressed as the excess heat capacity, ΔCP = CP,2 − CP,int, where CP,int represents the intrinsic heat capacity of DNA (see Note 6) that was, in the measured temperature interval, approximated by a second-order polynomial of T, fitted to the low-temperature (folded form) and high-temperature (unfolded form) parts of the experimental CP,2.
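The baseline step described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' program: the function names and the two cut-off temperatures that delimit the folded and unfolded regions are hypothetical and must be chosen by inspecting each thermogram. The second function integrates the excess heat capacity over temperature by the trapezoid rule.

```python
import numpy as np


def excess_heat_capacity(temp, cp2, t_low_end, t_high_start):
    """Fit a second-order polynomial to the pre- and post-transition regions
    of CP,2 and subtract it.

    temp         : temperatures (K), increasing
    cp2          : partial molar heat capacity CP,2 at each temperature
    t_low_end    : upper limit of the folded (low-temperature) region
    t_high_start : lower limit of the unfolded (high-temperature) region
    Returns (delta_cp, cp_int) evaluated on the full temperature grid.
    """
    temp = np.asarray(temp, dtype=float)
    cp2 = np.asarray(cp2, dtype=float)
    # Only points outside the transition region inform the baseline.
    baseline_mask = (temp <= t_low_end) | (temp >= t_high_start)
    coeffs = np.polyfit(temp[baseline_mask], cp2[baseline_mask], deg=2)
    cp_int = np.polyval(coeffs, temp)
    return cp2 - cp_int, cp_int


def transition_enthalpy(temp, delta_cp):
    """Trapezoid-rule integral of delta_cp over temperature."""
    temp = np.asarray(temp, dtype=float)
    delta_cp = np.asarray(delta_cp, dtype=float)
    return float(np.sum(0.5 * (delta_cp[1:] + delta_cp[:-1]) * np.diff(temp)))
```

The result is sensitive to where the folded and unfolded regions are taken to end and begin, so it is worth repeating the fit with slightly shifted limits to see how much the integrated enthalpy moves.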


3.3 Thermodynamic Analysis of DSC Data

The CP,int obtained in this way was subtracted from CP,2 to obtain the excess heat capacity ΔCP (see Note 7). Integration of ΔCP versus T thermograms gives the model-independent value for the enthalpy of G4 unfolding (see Note 8). The excess heat capacity, ΔCP, can also be calculated based on the model mechanism (Fig. 3d). The resulting model function can be expressed as [4]:

$\Delta C_P = \sum_i \Delta H_i \,(\partial \alpha_i / \partial T)$   (1)

where ΔHi represents standard enthalpy changes corresponding to transition of Tel22 from the unfolded (U) state to state i (H, P, A, IH, IP, IA) and αi is the fraction of Tel22 in state i. Transitions predicted by the model mechanism (Fig. 3d) can be described

[Fig. 3 graphic: panel (a) 25, 70, and 200 mM K+; panel (b) 100 and 200 mM Na+; panel (c) [PEG] = 17, 50, 100, 130, and 170 mM; axes ΔCP / kcal mol⁻¹ K⁻¹ versus T / K; panel (d) model mechanism linking U, IH, IA, IP, H, A, and P]

Fig. 3 DSC thermograms (symbols) measured in the presence of K+ ions (a), Na+ ions (b), and K+ ions and PEG (c). Model mechanism (d) translates into the corresponding model function (Eq. 1, lines) that successfully describes all the measured thermograms simultaneously


thermodynamically in the following way. Each (reversible) transition step is described in terms of the corresponding changes of three standard thermodynamic parameters that are independent of K+, Na+, and PEG concentration. These are the standard Gibbs free energy (ΔGi(T)) and standard enthalpy (ΔHi(T)), which are temperature-dependent and are thus determined at the reference temperature T0 = 298.15 K. The third parameter, the standard heat capacity ΔCp,i, is assumed to be temperature independent. These three parameters define the standard free energy and enthalpy of transition at any T through the Gibbs-Helmholtz relation $[\partial(\Delta G_i(T)/T)/\partial T]_p = -\Delta H_i(T)/T^2$ and Kirchhoff's law $[\partial \Delta H_i(T)/\partial T]_p = \Delta C_{p,i}$. For our system it is convenient to describe each conformational transition in terms of the apparent standard ΔGi(T, K, Na, PEG), which depends on T and on the K+, Na+, and PEG concentrations. Its relation with the standard thermodynamic ΔGi(T) is given by the thermodynamic master equation

$\Delta G_i(T, \mathrm{K}, \mathrm{Na}, \mathrm{PEG}) = \Delta G_i(T) + f_i(T, \mathrm{K}, \mathrm{Na}, \mathrm{PEG})$   (2)

where fi(T, K, Na, PEG) is a contribution that describes how K+, Na+, and PEG affect the ith transition, and is defined in terms of the corresponding parameters (e.g., number of exchanged K+ and/or Na+ and/or PEG molecules, see ref. 4). Parameters ΔGi(T0), ΔHi(T0), and ΔCp,i and the parameters describing the influence of K+, Na+, and PEG define the apparent standard ΔGi(T, K, Na, PEG) for each U → i step (i = U, H, P, A, IH, IP, IA), and with that the corresponding equilibrium constants $K_i(T, \mathrm{K}, \mathrm{Na}, \mathrm{PEG}) = \exp(-\Delta G_i(T, \mathrm{K}, \mathrm{Na}, \mathrm{PEG})/RT)$. The apparent equilibrium constant for each step bears the information on the equilibrium fractions αi(T, K, Na, PEG) through $K_i(T, \mathrm{K}, \mathrm{Na}, \mathrm{PEG}) = \alpha_i(T, \mathrm{K}, \mathrm{Na}, \mathrm{PEG})/\alpha_U(T, \mathrm{K}, \mathrm{Na}, \mathrm{PEG})$, which can be calculated using the concentration balance ($\sum_i \alpha_i(T, \mathrm{K}, \mathrm{Na}, \mathrm{PEG}) = 1$). In other words, at each combination of independent experimental variables T, [K+], [Na+], and [PEG] a system of equations (involving $K_i = \alpha_i/\alpha_U$ and $\sum_i \alpha_i = 1$ for i = U, H, P, A, IH, IP, IA) has to be solved for αi(T, K, Na, PEG) to give a value of the corresponding model function (Eq. 1). Global fitting of the model function to the experimental DSC thermograms measured at various K+, Na+, and PEG concentrations (Fig. 3), based on the nonlinear Levenberg-Marquardt χ² regression procedure, was used to obtain the best-fit values of the model parameters. These values enable us to predict the species fractions αi(T, K, Na, PEG) in the solution at any T, [K+], [Na+], and [PEG]. To perform model analysis (fitting) we write our own programs in Fortran, C++, or Python which enable us to translate both simple and very complex model mechanisms into the corresponding species fractions and model functions describing different experimental signals. Such an approach fully supports our research freedom but unfortunately


generates programs that are not user-friendly and can be used by others only after extensive instruction from us. In the case of simple mechanisms (two-state and/or monomolecular transitions) the model function (Eq. 1) may be expressed explicitly in terms of adjustable parameters (see Note 9). To perform model analysis (fitting), such a model function may be easily implemented in a user-friendly program platform (Excel, Origin, KaleidaGraph, ...).

3.4 Phase Diagrams

Fig. 4 Phase diagrams

The calculated fractions (see Eq. 2 and the corresponding text) can be presented in terms of phase diagrams (e.g., temperature versus concentration, [K+] versus [Na+]) that were constructed by assigning areas in the phase space to the most populated species in that area (Fig. 4). Phase curves (borders) and triple points represent states in which two (curves) or three (triple points) most populated species are equally populated [4]. These diagrams indicate the possible macroscopic pathways of G4 folding/unfolding and interconversions. As an example, we show here two phase diagrams. Figure 4a shows that Tel22 thermal unfolding at low [PEG] follows mainly the H → IH → U pathway, but at high [PEG] the P → IP → U pathway is dominant. On the other hand, increasing [PEG] at 25 °C induces the H → P conformational change. The [Na+] versus [K+] diagram (Fig. 4b) shows that populated intermediates in the absence of PEG appear mainly at [Na+] and/or [K+] between 1 and 10 mM. Recent thermal unfolding studies on the same and similar human telomere DNA sequences [14, 15] suggest that the phase populated by intermediates may, at specific conditions (absence of PEG and Na+, presence of K+), consist of a higher number of intermediate species than predicted by the mechanism presented in Fig. 3d. Despite this difference, which can be attributed to the different sensitivities of biophysical methods used to monitor the


unfolding pathway, the predicted T range at which phase I is stable (Fig. 4a) is the same as the one estimated before for several human telomere DNA sequence variants [15]. The presented phase diagrams show that Tel22 adopts an ensemble of structures that can be easily perturbed by changes in solution conditions (temperature, salts, cosolutes, cosolvents, ligands). Taken together, we demonstrate how phase diagrams can be obtained from the DSC data analysis and advocate the use of phase diagrams for better visualization of the possible macroscopic pathways of G4 conformational transitions.

3.5 Thermodynamic Driving Forces

Next, we describe how Tel22 folding and interconversion of species (Fig. 3d) at the specific solution conditions (T = 25 °C, [K+] = [Na+] = 25 mM and [PEG] = 0 mM) can be interpreted in terms of the more fundamental driving forces. The apparent ΔGi(T, K, Na) of folding can be dissected into its enthalpy and entropy components (Fig. 5a), showing that each stage of the Tel22 folding to H, A, or P is characterized by extensive enthalpy-entropy compensation. It should be emphasized that our DSC data and the global analysis provide a very reasonable estimate of the heat capacity change (ΔCP,i), which is considered a fingerprint for the changes in hydration. As such, ΔCP,i can be related to the changes in solvent accessible surface areas, and may be translated into the Gibbs free energy contribution (ΔGi,hyd) that at T ≈ 25 °C reflects mainly the entropy of (de)hydration of hydrophobic groups [16]. Figure 5b shows that most of the ΔGi,hyd change upon folding occurs during the formation of stable intermediates, suggesting that hydrophobic dehydration is the primary driving force of these processes. By contrast, specific ion binding, base stacking, and H-bonding appear to be the main driving forces for the I → G4 folding steps. Taken together, the global model analysis of DSC data provides information on the ΔCP,i (see Note 7) which strongly suggests that hydrophobic dehydration contributes significantly to G4 folding and, in the case of Tel22, drives the initial steps of folding. In addition, thermodynamic cycles may be used to estimate the thermodynamic parameters accompanying the interconversion of folded structural forms (Fig. 5c). These show that the P → H conversion is strongly favored and driven by a large favorable entropic term which is due mainly to the dehydration of hydrophobic surface area. This interpretation is in accordance with the P and H structural features which indicate that the P form has more solvent exposed (hydrophobic) surface area than the H form [17]. On the other hand, reducing the exposure of hydrophobic groups by addition of PEG makes the H → P transition thermodynamically more favorable (see ref. 4 and Note 7). Figure 5c further shows that the A → H conversion is energetically favorable, but only about 1 kcal separates these two forms. In this case the free

[Fig. 5 graphic: panel (a) ΔGi, ΔHi/10, and TΔSi/10 (kcal mol⁻¹) for folding of U to H, A, and P; panel (b) ΔGi / kcal mol⁻¹ versus ΔGi,hyd / kcal mol⁻¹ for IH, IA, IP, H, A, and P; panel (c) ΔGi, ΔHi, and TΔSi (kcal mol⁻¹) for interconversions among P, A, and H]

Fig. 5 Thermodynamic profiles of Tel22 folding and interconversion (T = 25 °C, [K+] = [Na+] = 25 mM, [PEG] = 0 mM). (a) Quadruplex folding: Free energy, enthalpy, and entropy contribution values; (b) Free energy of the folding into intermediate and folded forms as a function of the “dehydration” contribution estimated as ΔGhyd,i = ΔCP,i·80 K [16]. (c) Quadruplex interconversion: Free energy, enthalpy, and entropy contribution values

energy changes result from equal enthalpy and entropy contributions. It appears that the H form is more populated than the A form mainly due to the favorable ion (de)hydration effect (Na+ hydration, K+ dehydration), which is in agreement with the generally accepted interpretation of Na+ and K+ influence on the quadruplex stability [18, 19]. Taken together, the global model analysis of DSC data provides information on the driving forces of G4 folding/unfolding and interconversion that can be related to the structural changes of G4 accompanying these transitions.

3.6 Kinetically Limited Folding/Unfolding

When DSC thermograms depend on the heating/cooling rate (r) they cannot be analyzed in terms of the equilibrium processes as described above. Such rate dependence, or hysteresis observed between signals measured upon heating and cooling [20], suggests that the


folding/unfolding of G4 involves transition steps that are kinetically limited (e.g., at a high heating rate the transition step occurs at a higher temperature because it does not have enough time to occur at the same temperature as at a low heating rate). In such cases we have shown previously that DSC is a method of choice for characterization of a G4 folding/unfolding process involving one or several kinetically limited transitions [11, 12]. Namely, the observed rate-dependent thermograms can be described by a kinetic mechanism that defines time (t) and temperature (T) dependencies for the species fractions at given solution conditions. Based on the proposed model mechanism each ∂αi/∂T = (1/r)(∂αi/∂t) derivative can be defined in terms of the species fractions by the chemical reaction rate law. The reaction rate law defines, through a set of rate constants, how fractions of species at a temperature T change with time (∂αi/∂t). The temperature dependence of the (∂αi/∂t) derivative is accounted for by the temperature dependence of the rate constants through the Arrhenius equation (frequency factor and activation energy define the rate constant for each transition step at any T). The system of differential equations needs to be solved to obtain αi(T, t) and calculate the value of the model function (Eq. 1) at any T and t. The model function is fitted globally to the experimental ΔCP versus T thermograms measured at different r to obtain the corresponding kinetic parameters and ΔHi values (see refs. [11, 12]).
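To make the procedure concrete, here is a minimal Python sketch for the simplest possible case, a single kinetically limited step F ⇌ U with Arrhenius rate constants, integrated along a heating scan with SciPy. It is not the mechanism or code of refs. [11, 12]: the two-state mechanism, all parameter values, and the function names are hypothetical, and the Arrhenius law is written relative to a reference temperature, which is equivalent to using a frequency factor A = k_ref·exp(Ea/(R·T_ref)).

```python
import numpy as np
from scipy.integrate import solve_ivp

R = 8.314462618  # gas constant, J mol^-1 K^-1


def arrhenius(temp, k_ref, e_act, t_ref=330.0):
    """Arrhenius rate constant (s^-1) expressed relative to t_ref (K)."""
    return k_ref * np.exp(-e_act / R * (1.0 / temp - 1.0 / t_ref))


def simulate_heating_scan(rate, temps, delta_h_fold,
                          kf_ref, ea_fold, ku_ref, ea_unfold):
    """Integrate d(alpha_F)/dT = (1/r) d(alpha_F)/dt for F <=> U.

    rate  : heating rate r in K s^-1
    temps : increasing temperature grid (K)
    Returns (alpha_F, delta_cp), with delta_cp = delta_h_fold * d(alpha_F)/dT.
    """
    def dalpha_dT(temp, alpha):
        k_f = arrhenius(temp, kf_ref, ea_fold)
        k_u = arrhenius(temp, ku_ref, ea_unfold)
        dalpha_dt = k_f * (1.0 - alpha) - k_u * alpha  # first-order rate law
        return dalpha_dt / rate

    # Start from the equilibrium folded fraction at the lowest temperature.
    k_f0 = arrhenius(temps[0], kf_ref, ea_fold)
    k_u0 = arrhenius(temps[0], ku_ref, ea_unfold)
    alpha0 = k_f0 / (k_f0 + k_u0)

    sol = solve_ivp(dalpha_dT, (temps[0], temps[-1]), [alpha0],
                    t_eval=temps, method="LSODA", rtol=1e-8, atol=1e-10)
    alpha = sol.y[0]
    return alpha, delta_h_fold * dalpha_dT(temps, alpha)


# HYPOTHETICAL parameters (J mol^-1 and s^-1), for illustration only.
temps = np.linspace(280.0, 380.0, 1001)
params = dict(delta_h_fold=-250e3, kf_ref=1e-3, ea_fold=-50e3,
              ku_ref=1e-3, ea_unfold=200e3)
_, cp_slow = simulate_heating_scan(1.0 / 60.0, temps, **params)  # 1 K/min
_, cp_fast = simulate_heating_scan(2.0 / 60.0, temps, **params)  # 2 K/min
```

With these illustrative parameters the simulated peak moves to a higher temperature at the faster heating rate, which is the rate dependence described above. A global fit would wrap such a simulation in a least-squares routine over thermograms measured at several r.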

4 Notes

1. Buffers appropriate for DSC measurements (cacodylic, phosphate) should have low ionization enthalpy; therefore, the temperature dependence of pH of these buffer solutions can be neglected in the analysis of G4 transitions. In contrast, TRIS buffer is not recommended for DSC measurements due to high ionization enthalpy.
2. Our results reveal that the PEG induced transition from hybrid to parallel Tel22 form is not necessarily driven by crowding [21] but may be favorable due to the selective PEG binding to the parallel conformation [22, 23].
3. G4 structure and stability depend largely on the type and concentration of cations present in the solution. To control this, G4 sample solution should be dialyzed against buffer solutions with known concentration of specific cations.
4. In contrast to ordinary spectrophotometers, modern differential scanning calorimeters enable performance of experiments at increased pressure (typically about 4 atm), which prevents dilute aqueous solutions from boiling at temperatures higher than 100 °C.


5. Reference DSC cell should contain a buffer solution that has been equilibrated with the G4 solution (sample cell) during the dialysis. In the case of samples with high PEG concentrations, equal aliquots of concentrated PEG solution were added to the “sample” and the “reference” cell solutions.
6. At any temperature the intrinsic heat capacity of DNA, CP,int, represents partial molar heat capacity of a given composition of DNA species, while the excess heat capacity, ΔCP, represents the corresponding contribution to the heat capacity due to DNA species (un)folding and interconversion.
7. A second-order polynomial baseline interpolated between the low-temperature (folded G4) and high-temperature (unfolded DNA) parts of the DSC thermogram appears to be the simplest reasonable way to describe the intrinsic heat capacity of DNA, formally expressed as $C_{p,int} = \sum_i C_{p,i}\,\alpha_i$ (see ref. 7, supplementary information). However, subtracting the Cp,int from the measured CP,2 does not mean that no ΔCp,i effects should be considered in the model analysis of data through Eq. (1). Be aware that the ΔCp,i influences the temperature dependence of ΔHi and αi through Kirchhoff's law and the Gibbs-Helmholtz relation.
8. Model independent enthalpy of the DNA unfolding (ΔHU) corresponding to the transition from the initial [low-temperature (TG4) folded] state to the final [high-temperature (TU) unfolded] state can be obtained as an integral of the ΔCp versus T thermogram over the (TG4, TU) interval. In the case of a non-two-state transition the direct comparison of the ΔHU to the model-dependent ΔHU is practically possible only when ΔCp,i effects are negligibly small. Then the ΔHi's do not depend significantly on T, thus ΔHU may be considered as a sum of ΔHi contributions characterizing each step in the model predicted unfolding pathway G4 → i ... → U. However, it should be emphasized that for any (two-state or non-two-state) transition good agreement between the model function (Eq. 1) and experimental thermogram guarantees that their integrals (enthalpies) also agree well. Therefore, when we observe a good fit at the “ΔCp level” no additional check of agreement between “model-independent” and “model-dependent” enthalpies is needed to test appropriateness of the proposed model mechanism.
9. To describe a thermally induced reversible monomolecular two-state DNA folding/unfolding transition at constant pressure and constant cosolvent and cosolute concentrations, Eq. (1) (i = F) transforms into: ΔCP = ΔHF(∂αF/∂T), where ΔHF is the standard enthalpy of folding and αF is the fraction of DNA in the folded state. αF can be expressed in terms of the corresponding


apparent equilibrium constant of folding, KF, as αF = KF/(1 + KF). Temperature dependence of KF is defined through its relation with the apparent standard Gibbs free energy of folding [KF = exp(−ΔGF/(RT))] and the Gibbs-Helmholtz relation:

$\Delta G_F = \Delta G_{F(T_0)} \dfrac{T}{T_0} + \Delta H_{F(T_0)} \left(1 - \dfrac{T}{T_0}\right) + \Delta C_{p,F} \left[T - T_0 - T \ln\dfrac{T}{T_0}\right]$

Thus, $(\partial \alpha_F/\partial T) = \Delta H_F \,\alpha_F (1 - \alpha_F)/(RT^2)$. Taking into account Kirchhoff's law $[\Delta H_F = \Delta H_{F(T_0)} + \Delta C_{p,F}(T - T_0)]$ the model function can be expressed explicitly in terms of adjustable parameters:

$\Delta C_P(T) = \dfrac{\left[\Delta H_{F(T_0)} + \Delta C_{p,F}(T - T_0)\right]^2}{RT^2} \cdot \dfrac{\exp(-X)}{\left[1 + \exp(-X)\right]^2}$, with $X = \dfrac{1}{RT}\left[\Delta G_{F(T_0)} \dfrac{T}{T_0} + \Delta H_{F(T_0)} \left(1 - \dfrac{T}{T_0}\right) + \Delta C_{p,F} \left(T - T_0 - T \ln\dfrac{T}{T_0}\right)\right]$

This model function is fitted to a single ΔCP versus T curve measured by DSC using adjustable parameters ΔGF(T0), ΔHF(T0), and ΔCp,F. Additional statistical analysis is needed to verify the reliability of the best-fit parameter values (see ref. 10).
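As an illustration of how the two-state model function of Note 9 might be coded, the following Python sketch implements αF = KF/(1 + KF), the Gibbs-Helmholtz and Kirchhoff relations, and ΔCP = ΔHF² αF(1 − αF)/(RT²), and fits it with SciPy's curve_fit (Levenberg-Marquardt for unconstrained problems). Units of kJ mol⁻¹ are our choice for numerical convenience; the function names and any starting values passed to the fit are hypothetical, and this is not the authors' program.

```python
import numpy as np
from scipy.optimize import curve_fit

R = 8.314462618e-3  # gas constant, kJ mol^-1 K^-1
T_REF = 298.15      # reference temperature T0 (K), as in the text


def delta_g_fold(temp, dg_ref, dh_ref, dcp):
    """Apparent standard Gibbs free energy of folding at temp (kJ mol^-1)."""
    return (dg_ref * temp / T_REF
            + dh_ref * (1.0 - temp / T_REF)
            + dcp * (temp - T_REF - temp * np.log(temp / T_REF)))


def two_state_delta_cp(temp, dg_ref, dh_ref, dcp):
    """Excess heat capacity of a monomolecular two-state transition.

    dg_ref, dh_ref : folding free energy and enthalpy at T_REF (kJ mol^-1)
    dcp            : heat capacity change of folding (kJ mol^-1 K^-1)
    """
    dh = dh_ref + dcp * (temp - T_REF)                       # Kirchhoff's law
    k_fold = np.exp(-delta_g_fold(temp, dg_ref, dh_ref, dcp) / (R * temp))
    alpha = k_fold / (1.0 + k_fold)                          # folded fraction
    return dh**2 * alpha * (1.0 - alpha) / (R * temp**2)


def fit_two_state(temp, delta_cp, p0):
    """Least-squares fit of the model function to one thermogram.

    p0 : starting guesses (dg_ref, dh_ref, dcp); a poor guess may not converge.
    Returns best-fit parameters and their covariance matrix.
    """
    popt, pcov = curve_fit(two_state_delta_cp, temp, delta_cp, p0=p0)
    return popt, pcov
```

The covariance matrix returned by the fit gives only a first indication of parameter uncertainty and correlation; it does not replace the additional statistical analysis mentioned above.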

Acknowledgments

This work was supported by grant P1-0201 of the Slovenian Research Agency. We thank the editor Prof. Danzhou Yang as well as Prof. Jonathan B. Chaires for insightful comments and critical reading of the chapter.

References

1. Neidle S, Balasubramanian S (2006) Quadruplex nucleic acids. RSC Publication, Cambridge
2. Rhodes D, Lipps HJ (2015) G-quadruplexes and their regulatory roles in biology. Nucleic Acids Res 43:8627–8637
3. Biffi G, Tannahill D, McCafferty J, Balasubramanian S (2013) Quantitative visualization of DNA G-quadruplex structures in human cells. Nat Chem 5:182–186
4. Bončina M, Vesnaver G, Chaires JB, Lah J (2016) Unraveling the thermodynamics of the folding and interconversion of human telomere G-quadruplexes. Angew Chem Int Ed 55:10340–10344
5. Bončina M, Podlipnik Č, Piantanida I, Eilmes J, Teulade-Fichou MP, Vesnaver G, Lah J (2015) Thermodynamic fingerprints of ligand binding to human telomeric G-quadruplexes. Nucleic Acids Res 43:10376–10386
6. Bončina M, Hamon F, Islam B, Teulade-Fichou MP, Vesnaver G, Haider SM, Lah J (2015) Dominant driving forces in human telomere quadruplex binding-induced structural alterations. Biophys J 108:2903–2911
7. Bončina M, Lah J, Prislan I, Vesnaver G (2012) Energetic basis of human telomeric DNA folding into G-quadruplex structures. J Am Chem Soc 134:9657–9663
8. Marky LA, Breslauer KJ (1987) Calculating thermodynamic data for transitions of any molecularity from equilibrium curves. Biopolymers 26:1601–1620
9. Privalov PL, Potekhin SA (1986) Thermodynamic effects of mutations on the denaturation of T4 lysozyme. Methods Enzymol 131:4–51
10. Drobnak I, Vesnaver G, Lah J (2010) Model-based thermodynamic analysis of reversible unfolding processes. J Phys Chem B 114:8713–8722
11. Prislan I, Lah J, Vesnaver G (2008) Diverse polymorphism of G-quadruplexes as a kinetic phenomenon. J Am Chem Soc 130:14161–14169
12. Prislan I, Lah J, Milanič M, Vesnaver G (2011) Kinetically governed polymorphism of d(G4T4G3) quadruplexes in K+ solutions. Nucleic Acids Res 39:1933–1942
13. Cantor CR, Warshow MM, Shapiro H (1970) Oligonucleotide interactions. III. Circular dichroism studies of the conformation of deoxyoligonucleotides. Biopolymers 9:1059–1077
14. Gray RD, Buscaglia R, Chaires JB (2012) Populated intermediates in the thermal unfolding of the human telomeric quadruplex. J Am Chem Soc 134:16834–16844
15. Buscaglia R, Gray RD, Chaires JB (2013) Thermodynamic characterization of human telomere quadruplex unfolding. Biopolymers 99:1006–1018
16. Baldwin RL (1986) Temperature dependence of the hydrophobic interaction in protein folding. Proc Natl Acad Sci U S A 83:8069–8072
17. Miller MC, Buscaglia R, Chaires JB, Lane AN, Trent JO (2010) Hydration is a major determinant of the G-quadruplex stability and conformation of the human telomere 3′ sequence of d(AG3(TTAG3)3). J Am Chem Soc 132:17105–17107
18. Hud NV, Smith FW, Anet FA, Feigon J (1996) The selectivity for K+ versus Na+ in DNA quadruplexes is dominated by relative free energies of hydration: a thermodynamic analysis by 1H NMR. Biochemistry 35:15383–15390
19. Largy E, Mergny JL, Gabelica V (2016) Role of alkali metal ions in G-quadruplex nucleic acid structure and stability. Met Ions Life Sci 16:203–258
20. Mergny JL, Lacroix L (2003) Analysis of thermal melting curves. Oligonucleotides 13:515–537
21. Miyoshi D, Nakao A, Sugimoto N (2002) Molecular crowding regulates the structural switch of the DNA G-quadruplex. Biochemistry 41:15017–15024
22. Buscaglia R, Miller MC, Dean WL, Gray RD, Lane AN, Trent JO, Chaires JB (2013) Polyethylene glycol binding alters human telomere G-quadruplex structure by conformational selection. Nucleic Acids Res 41:7934–7946
23. Trajkovski M, Endoh T, Tateishi-Karimata H, Ohyama T, Tanaka S, Plavec J, Sugimoto N (2018) Pursuing origins of (poly)ethylene glycol-induced G-quadruplex structural modulations. Nucleic Acids Res 46:4301–4315

Chapter 8

X-Ray Crystallographic Studies of G-Quadruplex Structures

Gary N. Parkinson and Gavin W. Collie

Abstract

The application of X-ray crystallographic methods toward a structural understanding of G-quadruplex (G4) motifs at atomic-level resolution can provide researchers with exciting opportunities to explore new structural arrangements of putative G4-forming sequences and investigate their recognition by small-molecule compounds. The crowded and ordered crystalline environment requires the self-assembly of stable G4 motifs, allowing for an understanding of their inter- and intramolecular interactions in a packed environment, revealing thermodynamically stable topologies. Additionally, crystallographic data derived from these experiments in the form of electron density provide valuable opportunities to visualize various solvent molecules associated with G4s along with the geometries of the metal ions within the central channel, elements critical to understanding G4 stability and topology. With the advent of affordable, commercially sourced and purified synthetic DNA and RNA molecules suitable for immediate crystallization trials, combined with the availability of specialized and validated crystallization screens, researchers can now undertake in-house crystallization trials without the need for local expertise. When this is combined with access to modern synchrotron platforms that offer complete automation of the data collection process (from the receipt of crystals to delivery of merged and scaled data for the visualization of electron density), the application of X-ray crystallographic techniques is opened to nonspecialist researchers. In this chapter we aim to provide a simple how-to guide to enable the reader to undertake crystallographic experiments involving G4s, encompassing the design of oligonucleotide sequences, fundamentals of the crystallization process, and modern strategies used in setting up successful crystallization trials. We will also describe data collection strategies, phasing, electron density visualization, and model building. We will draw on our own experiences in the laboratory and hopefully build an appreciation of the utility of X-ray crystallographic approaches to investigating G4s.

Key words G4, G-quadruplex, Crystallization, Macromolecular crystallography, X-ray diffraction, Data collection, Structure solution, Native SAD MAD phasing, DNA, RNA

1 Introduction

An understanding of molecular structures at atomic resolution has principally relied on the application of either X-ray crystallography or Nuclear Magnetic Resonance (NMR) methods, with crystallographic methods requiring the samples to be constrained in a solid state within an ordered crystalline lattice (Fig. 1). Unfortunately, the generation of suitable crystals containing ordered

Danzhou Yang and Clement Lin (eds.), G-Quadruplex Nucleic Acids: Methods and Protocols, Methods in Molecular Biology, vol. 2035, https://doi.org/10.1007/978-1-4939-9666-7_8, © Springer Science+Business Media, LLC, part of Springer Nature 2019


Fig. 1 Path to structure determination by single-crystal X-ray diffraction methods. (a) (i) Crystal mounted in nylon loop in cryostream and exposed to X-rays. (ii) Resulting diffraction pattern with solvent ring around 3.5 Å. (iii) Schematic drawing of final coordinates modeled into resulting electron density maps. (b) Pathway to crystallographic structure determination

G-quadruplexes (G4s) remains the main obstacle to the successful application of X-ray crystallography as a tool for the determination of G4 structures. However, when high-quality crystals are formed and diffract to better than 5 Å resolution, then, with access to high-flux tuneable synchrotron X-ray sources, fast computers, and automated software, the successful determination of the nucleotide structures should be almost guaranteed, certainly for target structures with molecular weights around 7–15 kDa (Figs. 1 and 2) [1, 2]. The requirement for the generation of crystals imposes further experimental constraints on the oligonucleotides, such as chemical purity, homogeneity of sequence and topology, and a limited selection of solvents, along with constrained packing interactions for the DNA/RNA molecules. However, if you are able to generate crystals and the structure is “solved,” this provides a significant opportunity to identify any ordered scattering component, whose characterization relies only on correct interpretation of its electron density and its context relative to surrounding components. The methodology also allows for the visualization of ordered solvent directly, and critically for


Fig. 2 Examples of G4 structures determined by single-crystal X-ray diffraction methods (PDB-ID codes given). Purple spheres represent Na+ ions (a, b) or K+ ions (c–f) and green spheres represent Ca2+ ions (a, b) or Mg2+ ions (d)

quadruplex structures, the metal ions in the central channels, the waters, or any other associated elements essential to G4 formation. Furthermore, the associations of any additional components added to the crystallization buffer, such as small-molecule ligands, can be visualized and their interactions with the quadruplex target identified without bias from predetermined expectations. For example, even subtle differences between equivalent DNA and RNA quadruplexes can be determined, providing high-resolution structural data to aid in therapeutic drug design aimed toward G-quadruplex stabilization [3]. Herein we discuss methods applied in our laboratory for the successful generation of crystals for G4 structural investigations, including the role of annealing, choice of metal ions, and ionic strength, which all play a role in successful crystallization. After crystallization we discuss material preparation for data collection, the diffraction equipment used (Fig. 3), and data collection strategies. With the diffraction data in hand, the structure must still be solved; that is, accurate phases need to be assigned to structure factors to enable the calculation of electron density maps. However,


Fig. 3 Experimental setup for diffraction experiments. (a) Schematic of machine components, alignments and image showing crystal mounted on our in-house sealed-tube X-ray diffractometer (Oxford Diffraction, now Rigaku Oxford Diffraction) with Titan CCD detector. (b) Schematic showing X-ray source, geometry and X-ray paths and scattering

structure solution continues to pose a major challenge as a consequence of the low solvent content resulting from close associations between the molecules. Additionally, structure solution using molecular replacement techniques often fails due to the repetitive patterns of stacked DNA/RNA, while fully automated MAD methods, implemented in many structure solution packages, lack suitable pipelines for automated DNA/RNA building. These and related challenges will be highlighted along with the tools to help resolve some of these difficulties.

2 Materials

All buffer solutions can be stored at room temperature. Use purified deionized water at 18 MΩ·cm at 25 °C and analytical grade reagents. Filter all stock buffer solutions with a 0.22 μm filter before use.


2.1 Buffers to Anneal RNA


1. 1 M KCl: Weigh 3.72 g of KCl and transfer into a 50 mL Falcon tube. Add about 20 mL of water and mix until dissolved, then make volume up to 50 mL with water. Filter solution (see Note 1).
2. 1 M Potassium cacodylate, pH 6.5: Weigh 8.80 g of potassium cacodylate salt and transfer into a 50 mL Falcon tube. Add about 30 mL of water and mix until dissolved. Mix and adjust pH with 1 M KOH. Make volume up to 50 mL with water. Filter solution (see Note 1).

2.2 Selection and Preparation of RNA for Annealing

1. Source oligonucleotide materials: Order a 1.0 μmol preparation of the RNA sequence r(UBrAGGGUUAGGGU) from a supplier (see Notes 2 and 3).
2. Hydrate RNA: Use all 150 nmol of RNA of the desired sequence from the 1.0 μmol preparation (see Note 4). Dissolve the lyophilized RNA in situ to a final single-stranded concentration of 2 mM, in this case by adding about 75 μL of RNase-free water to the tube (see Note 5).
3. 200 μL thin-walled PCR tube.
4. Heated water bath.
5. Parafilm.
6. Mix RNA and buffers ready for annealing: Combine 1 M stock salt solution with the 1 M buffering solution (pH 6.5) and 2 mM hydrated RNA solution in a thin-walled PCR tube to make a final concentration of 50 mM salt, 20 mM buffer, and 1.5 mM RNA in a final volume of 50 μL and seal with Parafilm (see Note 6).
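The hydration volume in item 2 and the annealing mix in item 6 follow from simple dilution arithmetic (1 mM equals 1 nmol/μL, and C1·V1 = C2·V2). The sketch below is one way to check the numbers; the function names are illustrative and not part of the protocol.

```python
def hydration_volume_ul(amount_nmol, target_mM):
    """Water volume (µL) needed to dissolve amount_nmol at target_mM (1 mM = 1 nmol/µL)."""
    return amount_nmol / target_mM


def annealing_mix(final_volume_ul, components):
    """Volumes (µL) of each stock for the annealing mix.

    components maps name -> (stock_mM, final_mM); water makes up the remainder.
    """
    volumes = {name: final / stock * final_volume_ul
               for name, (stock, final) in components.items()}
    water = final_volume_ul - sum(volumes.values())
    if water < 0:
        raise ValueError("stocks are too dilute for the requested final volume")
    volumes["water"] = water
    return volumes


if __name__ == "__main__":
    print(hydration_volume_ul(150, 2.0))  # 150 nmol dissolved to 2 mM
    mix = annealing_mix(50.0, {
        "RNA": (2.0, 1.5),
        "KCl": (1000.0, 50.0),
        "K-cacodylate": (1000.0, 20.0),
    })
    for name, volume in mix.items():
        print(f"{name}: {volume:.1f} µL")
```

With the concentrations given in items 2 and 6, this reproduces the volumes listed in Table 1.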

2.3 Preparation of Crystallization Reagents

1. 2-Methyl-2,4-pentanediol (MPD): 100%.
2. 5 M NaCl: Weigh 14.6 g of NaCl and transfer into a 50 mL Falcon tube. Add about 40 mL of water and mix until dissolved, then make volume up to 50 mL with water. Filter solution (see Note 1).
3. 1 M Sodium cacodylate, pH 6.5: Weigh 8.00 g of sodium cacodylate salt and transfer into a 50 mL Falcon tube. Add about 30 mL of water and mix until dissolved. Mix and adjust pH with 1 M NaOH. Make volume up to 50 mL with water. Filter solution (see Note 1).
4. 0.5 M spermine tetrahydrochloride: Weigh 1.74 g of spermine tetrahydrochloride and transfer into a 10 mL Falcon tube. Add about 8 mL of water and mix until dissolved, then make volume up to 10 mL with water. Filter solution (see Note 1).
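The masses quoted for the stock solutions in Subheadings 2.1 and 2.3 can be checked with mass = molarity × volume × molar mass. The sketch below assumes molar masses for the anhydrous salts (these values are our assumption, not stated in the protocol); hydrated forms need the formula weight printed on the supplier's bottle instead.

```python
def mass_to_weigh_g(molarity_M, volume_mL, molar_mass_g_per_mol):
    """Mass of solid (g) needed for volume_mL of a solution at molarity_M."""
    return molarity_M * (volume_mL / 1000.0) * molar_mass_g_per_mol


# (molarity in M, final volume in mL, assumed anhydrous molar mass in g/mol)
STOCKS = {
    "KCl": (1.0, 50, 74.55),
    "potassium cacodylate": (1.0, 50, 176.09),
    "NaCl": (5.0, 50, 58.44),
    "sodium cacodylate": (1.0, 50, 159.98),
    "spermine tetrahydrochloride": (0.5, 10, 348.18),
}

if __name__ == "__main__":
    for name, (molarity, volume, molar_mass) in STOCKS.items():
        grams = mass_to_weigh_g(molarity, volume, molar_mass)
        print(f"{name}: {grams:.2f} g")
```

Under these assumptions the computed masses agree with the quoted values to within rounding.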


2.4 Crystallization Trays, Pipettes, and Storage

1. Standard 24-well cell culture/crystallization tray, greased, with lid (see Note 7).
2. Pre-siliconized glass cover slides: 22 mm diameter.
3. Marker pen.
4. Low vibration incubator with fixed temperature control: Sanyo MIR-253 incubator or similar.
5. Pipettes and tips: 1–10 μL and 100–1000 μL with filter (see Note 8).

2.5 Mounting, Data Collection Equipment, and Software

1. Loops to mount crystals: Nylon fiber (or other synthetic polymer) loops attached to steel pins of a fixed length (22 mm), which are attached to a standardized magnetic base (see Note 9).
2. Stereo microscope (see Note 10).
3. X-ray source and a diffractometer designed to collect single-crystal data (see Note 11).
4. Cryosystems unit mounted on the diffractometer for in situ freezing of crystals: Oxford Cryosystems 700 series cryostream cooler (Fig. 3) (see Note 12).
5. Data processing: CrysAlisPRO [4] (see Note 13).
6. Structure solution and refinement software: The CCP4 package [5] (see Note 14) includes the majority of the programmes currently used in X-ray crystallography, covering data reduction, phasing, refinement, validation, and analysis.
7. Model building: COOT [6] and RCrane [7] (see Note 14).

3 Methods

Example Protocol: Crystallization of Human Telomeric RNA and Data Collection.

The following protocol outlines the steps required to produce high-quality crystals of a G-quadruplex formed from the human telomeric RNA sequence, r(UBrAGGGUUAGGGU) (see Note 15), using the hanging-drop crystallization technique. This crystallization protocol does not require specialist equipment or reagents, and should yield well-diffracting crystals within 1–3 days (see Note 16).

1. Dissolve the lyophilized RNA in RNase-free water to a final single-stranded concentration of 2 mM.
2. Salts and buffers. Before crystallization trials, suitable salts and buffer reagents need to be added to the RNA. It is important to include a relatively high concentration of potassium ions at this stage. Details of the required buffer concentrations and volumes are shown in Table 1.


Table 1 RNA G4 sample preparation volumes

| Reagent | Stock | Final [mM] | Volume to measure [μL] |
|---|---|---|---|
| RNA | 2 mM | 1.5 | 37.5 |
| Potassium chloride | 1 M | 50 | 2.5 |
| Potassium cacodylate (pH 6.5) | 1 M | 20 | 1 |
| H2O | | | 9 |
| Total volume | | | 50 |

3. Following the addition of salts and buffer, the RNA sample needs to be annealed (see Note 17). Heat the sample from step 2 in a standard metal heat block at 90 °C for 3 min. Then switch off the heat block and allow it to cool to room temperature overnight. The annealing step can also be performed in a heated water bath, but it is important to ensure the cooling step is not too rapid; a cooling rate of approximately 0.1–0.2 °C/min is required.
4. Preparation of crystallization reagents. Crystallization reagents typically consist of four components: a precipitant (to drive vapor diffusion), salts (to provide a suitable ionic environment for stable crystal growth), a buffering agent (to maintain the pH), and often an additive (a substance, usually included at a low concentration, that has a specific favorable effect on crystallization) (see Note 18). For this experiment, the precipitant will be 2-methyl-2,4-pentanediol (MPD), the salt will be sodium chloride, the buffer will be sodium cacodylate (pH 6.5), and the additive will be spermine tetrahydrochloride. Six different crystallization reagents will be prepared, differing only in the MPD concentration. Crystallization reagent concentrations and volumes are shown in Table 2 (see Note 19).
5. Crystallization tray setup: We will use the hanging drop crystallization method, which requires the standard 24-well cell culture/crystallization trays with greased well-rims and siliconized cover slides (Figs. 4a and 5) (see Note 20).
6. First, pipette 1 mL of reagent 1 into the first well of the 24-well tray. Next, pipette 1 μL of the annealed RNA sample onto a cover slide. Then take 1 μL of the crystallization reagent from the well and mix it with the RNA drop (see Note 21).
7. Invert the cover slide and seal over the first well of the crystallization tray. This results in a 2 μL hanging drop with a diffusion gradient of 50% relative to the well.
8. Repeat steps 6 and 7 for the remaining five crystallization reagents (Fig. 5).


Table 2 Crystallization reagent volumes

| Reagent | Stock | Final | Reagent 1 (10% MPD) | Reagent 2 (12% MPD) | Reagent 3 (14% MPD) | Reagent 4 (16% MPD) | Reagent 5 (18% MPD) | Reagent 6 (20% MPD) |
|---|---|---|---|---|---|---|---|---|
| MPD | 100% | 10–20% | 100 μL | 120 μL | 140 μL | 160 μL | 180 μL | 200 μL |
| Sodium chloride | 1 M | 150 mM | 150 μL | 150 μL | 150 μL | 150 μL | 150 μL | 150 μL |
| Sodium cacodylate (pH 6.5) | 1 M | 50 mM | 50 μL | 50 μL | 50 μL | 50 μL | 50 μL | 50 μL |
| Spermine | 1 M | 5 mM | 5 μL | 5 μL | 5 μL | 5 μL | 5 μL | 5 μL |
| H2O | | | 695 μL | 675 μL | 655 μL | 635 μL | 615 μL | 595 μL |
| Total volume | | | 1 mL | 1 mL | 1 mL | 1 mL | 1 mL | 1 mL |
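The rows of Table 2 can be regenerated for any MPD series with the sketch below. The defaults use the stock concentrations listed in Table 2 (1 M NaCl, 1 M sodium cacodylate, 1 M spermine); Subheading 2.3 describes 5 M NaCl and 0.5 M spermine stocks, so the stock arguments should be set to whatever was actually prepared. The function name and its parameters are illustrative.

```python
def reagent_volumes_ul(mpd_percent, total_ul=1000.0, mpd_stock_percent=100.0,
                       nacl_stock_mM=1000.0, caco_stock_mM=1000.0,
                       spermine_stock_mM=1000.0):
    """Stock volumes (µL) for one crystallization reagent of total_ul.

    Final concentrations are fixed at 150 mM NaCl, 50 mM sodium cacodylate
    and 5 mM spermine; only the MPD percentage varies.
    """
    volumes = {
        "MPD": mpd_percent / mpd_stock_percent * total_ul,
        "NaCl": 150.0 / nacl_stock_mM * total_ul,
        "Na cacodylate": 50.0 / caco_stock_mM * total_ul,
        "spermine": 5.0 / spermine_stock_mM * total_ul,
    }
    volumes["water"] = total_ul - sum(volumes.values())
    return volumes


if __name__ == "__main__":
    for mpd in (10, 12, 14, 16, 18, 20):
        row = reagent_volumes_ul(mpd)
        print(mpd, {name: round(volume, 1) for name, volume in row.items()})
```

For example, calling `reagent_volumes_ul(10, nacl_stock_mM=5000.0, spermine_stock_mM=500.0)` gives the volumes needed when the stocks from Subheading 2.3 are used directly.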

Fig. 4 Example of hanging drop crystallization experimental setup and resulting crystals. (a) 1 μL of the annealed RNA quadruplex mixed with 1 μL of a crystallization reagent solution, composed of 15% MPD, 150 mM NaCl, 50 mM NaCaco (pH 6.5), and 5 mM spermine. The hanging drop is sealed and equilibrated against a well containing undiluted crystallization reagent. (b) The resulting 2 μL hanging drop observed under a polarizing microscope revealing RNA G4 containing crystals after about 2 weeks


Fig. 5 Schematic of a 24-well “Linbro-style” vapor diffusion crystal growth plate. A typical “screening” setup is shown, where an increase in MPD well concentration is used (left to right) as a gradient to identify solubility/precipitation ranges, search for nucleation and ultimately suitable crystal growth conditions. Rows B-D can be used to vary other factors such as salt concentrations or additives

9. Store the tray in an incubator maintained at 12 °C (see Note 22). Crystals should appear within 3 days and can be used directly for X-ray diffraction experiments without the need for additional cryoprotection. Crystals grown in these conditions are shown in Fig. 4b, and an image of the diffraction pattern collected on an in-house sealed-tube X-ray source is shown in Fig. 6.
10. Identify a suitable crystal for mounting on the diffractometer and exposing to the cold stream (see Note 23).
11. Pick up the selected crystal from the crystallization drop using a cryo-loop (Fig. 7c) and flash-cool by transferring the crystal and pin assembly onto a magnetic mount, exposing the loop and crystal to the nitrogen gas cryostream set at 100 K (Fig. 3a) (see Note 24).
12. Collect preliminary data on an X-ray diffraction system (see Note 25) to determine unit cell dimensions and crystal orientation (see Note 26) and assess quality of diffraction (Fig. 3b) (see Note 27).
13. Define the most appropriate and efficient data collection strategy to collect all available data to the available maximum resolution (see Note 28) and conduct data collection.


Fig. 6 X-ray diffraction image collected on an in-house sealed-tube X-ray source for an RNA crystal shown in Fig. 4b. A 1° scan width with a 5 min exposure measured on a Titan CCD detector (Oxford Diffraction, now Rigaku Oxford Diffraction)

14. After the collection is complete, confirm the space group and process the data (see Note 29).
15. Determine the initial phases to “solve the structure” by molecular replacement (see Note 30) or, for a novel structure, use isomorphous replacement or anomalous dispersion methods (see Note 31).
16. Refine the structure (see Note 32).
17. Validate the structure (see Note 33).
18. Optional: Inclusion of ligands in the crystallization process to generate ligand/nucleic acid complexes (see Note 34).
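When planning the data collection strategy in step 13, it can help to estimate the resolution recorded at a given position on the detector. The sketch below applies Bragg's law, d = λ / (2 sin θ), with the scattering angle 2θ taken from the crystal-to-detector distance and the radial distance of a spot from the direct beam. It assumes a flat detector perpendicular to the beam; the wavelength and distances in the example are hypothetical, and the function name is illustrative.

```python
import math


def resolution_at_radius(wavelength_A, distance_mm, radius_mm):
    """Resolution d (Å) for a spot radius_mm from the beam centre on a flat
    detector placed distance_mm from the crystal, perpendicular to the beam."""
    two_theta = math.atan2(radius_mm, distance_mm)  # scattering angle 2θ
    return wavelength_A / (2.0 * math.sin(two_theta / 2.0))


if __name__ == "__main__":
    cu_k_alpha = 1.5418  # Å, typical sealed-tube copper source
    for radius in (20.0, 40.0, 60.0):
        d = resolution_at_radius(cu_k_alpha, 60.0, radius)
        print(f"radius {radius:.0f} mm -> {d:.2f} Å")
```

Moving the detector closer or collecting with a 2θ offset extends the accessible resolution; the offset case is not handled by this sketch.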

4 Notes

1. Prepare a 0.22 μm filter by washing off the protective glycerol bound to the filter membrane with about 5 mL of water. Then discard the first 1 mL of filtered buffer to reduce dilution of the buffer before filtering the remaining solution into a clean Falcon tube.
2. A 1.0 μmol preparation, purified by High Performance Liquid Chromatography (HPLC) and lyophilized to a dried white powder by a commercial supplier, usually provides sufficient material for several dozen crystallization experiments.


Fig. 7 Images of G4 crystals and diffraction. (a–c) Crystals of DNA G4-forming sequence AGGG(TTAGGG)3 bound by naphthalene diimide compounds BMSG-SH-3, BMSG-SH-4 [36], and MM41 [39], for (a–c), respectively. Crystals (a, b) are in crystallization drops; crystal (c) is mounted in a nylon loop prior to X-ray data collection. (d, e) Crystal of RNA G4-forming sequence UAGGGUUAGGGU bound by the acridine-based compound FD-121 [37] (d) and corresponding diffraction image (e)

3. An ideal heavy atom addition is bromine attached to uracil bases (5-Br-dU), which can be used as a mimic for thymine residues. In addition to the increased scattering, the bromine edge is close to the anomalous edge for selenium, an element routinely used by protein crystallographers enabling the use of synchrotron beam lines designed for automated protein data collections and structure solution. An alternative choice is iodine (5-I-dU), with an enhanced anomalous signal that is ideally suited for home CuKα sources thus enabling in-house SAD structure solution methods. However, caution is required, as crystallizations need to be conducted in the dark, and exposure to X-rays minimized during data collections due to the labile nature of the halogens. Successful recrystallization with these modifications is not guaranteed as the inclusion of modified bases within sensitive regions of the sequence can introduce unexpected bias in the preferred geometries of the


ribose ring, such as a preference for 3′-endo pucker for sugars and a shift of preference toward alternative conformations as seen for non-G4-forming sequences [8, 9].
4. Compared with proteins, sample preparation for both DNA and RNA crystallization experiments is significantly simpler. Typically, materials are generated by solid-phase synthesis (e.g., phosphoramidite-based methods), which can incorporate HPLC purification protocols to ensure consistency in sequence length and purity of the supplied solid materials. DNA and RNA sequences can also be synthesized commercially with options to incorporate special modified bases at any position in the sequence, useful to enhance stability, alter geometry, introduce anomalous scattering atoms, or introduce modifications suitable for quality control by mass spectrometry. We are fortunate that we can selectively introduce modified nucleotides to incorporate heavy atoms (HA) or anomalous scatterers during synthesis of the sequence, as this can aid structure solution, typically by isomorphous replacement (SIR, MIR) or anomalous dispersion (MAD and SAD) methods. Commercial vendors now routinely allow nucleotide substitutions, which permits a vast range of modified nucleotides to be incorporated within an oligonucleotide sequence. These include selenium as a replacement for oxygen, such as 2-methylselenonucleotides [10], as well as selenium being used to bridge phosphates (O4 and O5), additionally beneficial in the aid of structure solution using MAD methods [11]. Another useful nucleotide substitution is the sugar-modified 2′-O-4′-C-methylene-guanosine, also known as locked nucleic acid ((LNA)G). Here LNA has a preference for “anti” pucker conformations, and can therefore be used either to favor the “anti” position in parallel quadruplexes or to selectively disrupt native G-quadruplex conformations with a “syn” dependence [8].
(a) Finally, although not routinely adopted in our labs, it is nevertheless worth noting that the use of racemic nucleic acid mixtures (that is, solutions containing both L- and D-nucleic acid enantiomers) could aid crystal growth [12]. The principle here is that by adding the mirror-image (i.e., L-form) nucleic acid strand to a natural D-form oligonucleotide sample, an achiral mixture is produced, which gives the crystallization mixture access to all 230 space groups, as opposed to the mere 65 accessible to chiral systems. A racemic crystallization approach has been shown to be applicable to G4 sequences [12], and with L-form oligonucleotides readily available from commercial sources, this method may well be worth considering, particularly for recalcitrant systems.


5. Both DNA and RNA oligonucleotide sequences are readily soluble in water to millimolar concentrations, can be handled at room temperature, and are easily characterized by UV absorbance for accurate concentration determination.
6. Annealing of materials to help induce appropriate folding prior to crystallization experiments is routinely performed and highly recommended. Normally, DNA/RNA samples at 1–2 mM are heated to around 95 °C (to unfold any existing structure) in the presence of metal ions and a buffer at pH 6.5. The sample is then slowly cooled to room temperature (i.e., overnight) using a covered heating block to promote G-quadruplex formation. During annealing we keep the concentration of metal ions and buffering agents to a minimum, as these materials are transferred to the crystallization buffer with the sample. Experimentally, we typically keep the pH fixed at 6.5, and always avoid alkaline conditions to limit phosphodiester backbone cleavage.
7. The culture/crystallization trays can be purchased pre-greased (i.e., with well-rims covered with a thin layer of silicone grease) or can be manually greased using a 5 mL syringe filled with silicone grease. The grease ensures an airtight seal of the well when the glass coverslip is attached.
8. Before handling the RNA, it is recommended to ensure your working environment is RNase-free. This can be achieved relatively easily by wearing (and regularly changing) latex (or equivalent) laboratory gloves, using nuclease/RNase-free tips, reagents, and plasticware, and wiping all surfaces with a detergent-based anti-RNase spray (such as “RNase-Zap”).
9. Nylon loops can be obtained easily from a number of commercial suppliers, either as a premade assembly or assembled by gluing a pin with an attached nylon loop onto a standard magnetic base. The pin should be about 22 mm in length.
10. We recommend the use of polarizing stereo microscopes (50×) to assess crystalline quality early in the process of crystal growth optimization and for crystal manipulation. The appearance of straight extinction under cross-polarized light is highly correlated with good diffraction properties. Ultimately, however, the crystals will need to be assessed by X-ray diffraction.
11. A single-wavelength diffractometer suitable for routine collection of high-quality diffraction data, such as a Rigaku (Oxford Diffraction) XtaLAB SuperNova with 4-circle kappa goniometer geometry and a Titan CCD detector, combined with cryo-cooling (Oxford Cryostream 700). Control and data processing use the latest CrysAlisPro software.


12. Crystals are usually removed from the mother liquor (i.e., crystallization reagent) and either flash-cooled to 100 K for later analysis or frozen in a stream of cold nitrogen gas for direct X-ray data collection. The freezing of the crystals locks in the diffraction properties, prevents solvent loss, and helps reduce free radical damage to the biological material when exposed to the X-rays. Cryo-protection is a requirement to ensure the integrity of the crystal during the cooling process. Common cryo-protection agents include glycerol, ethylene glycol, and low-MW polyethylene glycols (PEGs), which work by controlling the balance between contraction and expansion during the cooling process by modifying (typically reducing) the water content of the crystal. Effective cryo-cooling also reduces the formation of ice crystals and their contribution to the total diffraction, which would otherwise mask the diffraction pattern of the underlying molecules of interest. The exact conditions for cryo-protection are often found empirically, which normally involves the gradual addition of agents into the mother liquor containing the crystal and testing until the diffraction pattern is restored. This step can be avoided by the introduction of cryo-agents during the crystallization process.
13. Data processing techniques and software for G4s are no different from those used for diffraction data collected from other types of macromolecular samples, with XDS [13], CrysAlis PRO [4], and mosflm [14] performing well.
14. The computational techniques and principles for G4 crystallography are no different from those used for any other type of macromolecular sample; however, the software is less extensive, typically lacking focus on nucleic acids. Automated molecular replacement pipelines, such as BALBES [15], are not calibrated for nucleic acid structure determinations and require manual model building interventions, although efforts have been made to incorporate assisted model building into COOT [6], using for instance RCrane [7].
15. The crystallization conditions indicated above are optimized specifically for this RNA sequence, including the bromine modification. “UBr” refers to a 5-bromo-uracil residue.
16. Crystallization experiments can be sensitive to apparently minor changes in conditions or technique. If crystals do not appear after 3 days, ensure all steps have been followed exactly, such as incubation temperature, salt concentrations, and hanging drop volumes. With the following modifications, the above protocol can be used to crystallize the equivalent DNA sequence (i.e., d[UBrAGGGUBrTAGGGT]): (1) reduce the


incubation temperature to 10  C and (2) omit spermine tetrahydrochloride from the crystallization reagents. 17. Annealing involves heating the RNA in order to disrupt preformed nonspecific secondary structures, followed by slowcooling to room temperature to allow stable G-quadruplex formation. It is routine to anneal samples prior to crystallizations; however, it is difficult to gauge the significance of chosen annealing protocol employed. The samples, at a fixed concentration, are heated to 90  C and allowed to cool over a specified period of time to room temperature. Changes in the conformation of the refolded oligonucleotides will have a direct impact on the outcome of a crystallization experiment, particularly as methods are empirically determined and cumulative in the sequence of the steps taken, such that once a protocol has been developed, any small alterations in the method normally results in failure. The crystallization of the c-kit quadruplex [16] proved to be an extreme example, as many months of patience were required to optimize every step of the process to finally generate crystals routinely and reproducibly. Indeed there is generally little incentive to refine a protocol that potentially contains many redundant steps, particularly as a single successful crystallization drop may generate sufficient materials for the all required diffraction experiments. It is important to note that the addition of any buffering agents carried over from the sample annealing in effect undermines the idea of a completely independent space matrix crystallization screen (see below). 18. The choice of precipitating agents appropriate for crystallization of DNA/RNA are quite varied; however they tend to fall into four main categories, these are MPD, PEGs and their variants, salts such as ammonium sulfate, and alcohols. 
PEG400 in particular has proven a successful PEG for G4 crystallization [1], although it should be noted that low molecular weight PEGs such as PEG400 have been shown, by CD and NMR, to induce parallel topologies [17] when used at high concentrations. This characteristic has been attributed to the ability of these PEGs to mimic molecular crowding conditions. Salts known to destabilize quadruplex formation, such as those containing lithium, frequently appear as hits during our screenings. These salts might aid crystallogenesis by reducing the number of alternative topologies in solution, thereby ensuring that the solutions are homogeneous in folded content. It is also possible that lithium promotes G4 crystallization by acting as an effective shield of the DNA phosphate charges. In terms of additives, we (and others) have found that by far the most successful at promoting G4 crystallization are polyamines such as spermine


and spermidine [1, 3, 18–20]. These are hypothesized to aid crystallogenesis by shielding the strong negative charges of nucleic acid backbone phosphates, and are well worth exploring as additives in prospective G4 crystallization experiments, at a concentration range of around 0.5–10 mM. Cobalt hexamine is another polyamine worth considering as an additive for G4 crystallization, although we have found it to give a high false-positive rate (i.e., salt crystal formation).
19. Techniques normally applied to crystallizing proteins are equally applicable to generating crystals of G-quadruplex molecules; these typically include vapor diffusion by sitting and hanging drop (Fig. 4a), dialysis, liquid-liquid diffusion, and under-oil diffusion. Crystallizations are simple experiments to set up: typically, pre-annealed DNA/RNA at a concentration of 0.6–1.5 mM is mixed with a well solution, and the drop is then sealed to allow equilibration to occur between the drop and the well solution. The well solution usually comprises three major components: buffer, precipitating agent, and salt. A fourth component, referred to as an “additive,” may also be included. Additives are usually low molecular weight compounds with specific properties, added at low-mM concentrations, that can prove decisive in G4 crystallization experiments. Additive screens are commercially available and are typically applied to initial hit conditions with the aim of improving crystal quality.
20. Crystallization can occur over hours to months, and within this timeframe new folding topologies may be adopted, perhaps significantly different from those induced during annealing. The main difficulties in designing these experiments therefore arise in determining the conditions appropriate for folding, self-association, and crystallization (Fig. 4b).
From a simple starting point the number of critical variables available to help bring the molecules together is large, resulting in the generation of a multifactorial crystallization space. Sampling such a landscape can require significant resources in materials and time. Two strategies have been developed to tackle this challenge: (1) sampling using a sparse matrix approach and (2) data-mining of crystallization databases to build a library of high-frequency hits for use as a set of pre-formulated trial conditions. Fortunately, varying pH for crystallizations of DNA and RNA is not usually a crucial factor for crystal growth, as the protonation of nucleotides is less sensitive to pH than proteins (except for the formation of i-motifs), thus removing one factor. To assist in DNA/RNA crystallizations we developed a new crystallization screen, HELIX [21], currently being marketed by Molecular Dimensions Ltd. This sparse matrix


screen was designed using data from 1450 reported crystallization screens, and includes 24 conditions designed specifically for quadruplex crystallization and 12 conditions optimized for i-motifs. Although the vast majority of commercially available sparse-matrix crystallization screens have not been optimized for G4 sequences (or nucleic acids in general), such screens have yielded some success [22] and are therefore worth considering, particularly if combined with high-throughput crystallization robotic systems. The addition of small molecule ligands to form quadruplex/ligand complexes adds a further layer of complexity to the search through crystallization space. Typically, the high concentrations needed for crystallization require a test ligand to be soluble at mM concentrations, which often necessitates the introduction of solvents such as DMSO into the well and drop solutions. These additional components can often disrupt the solution properties of the nucleic acids, which can alter previously identified crystallization conditions. However, examples of successful well-diffracting DNA and RNA G4-ligand co-crystals are shown in Fig. 7.
21. We have found that drop size has a large impact on the rate of equilibration, and that drop sizes between 0.8 and 1.6 μL are ideal for handling and crystal growth using vapor diffusion, typically with an oligonucleotide concentration range of 0.6–1.5 mM. The expansion in the number of variables often requires many thousands of independent crystallization experiments to be performed; however, the quantities of materials required are still manageable. Sampling thousands of conditions requires about 4.5 mg of DNA or RNA, equivalent to an NMR experiment.
22. Temperature is often a critical variable in our trials: subtle differences of only a few degrees are often the difference between success and failure.
Temperature therefore requires careful attention, and the optimum must be found empirically. We use a range of specialist vibration-free incubators for our crystallization trials, set at 4, 10, 12, 20, and 30 °C.
23. Crystals are typically harvested using fine loops made of nylon fiber (or another synthetic polymer) attached to steel pins of a fixed length (Figs. 1a and 7c), which are in turn attached to a standardized magnetized base. Such loops can be obtained easily from a number of commercial suppliers.
24. Many reported G4 crystallization conditions contain high concentrations of ammonium sulfate (for summaries see Campbell and Parkinson [1] and Campbell et al. [20]), which will always require the addition of a cryoprotectant for successful cryofreezing. We find the addition of about 25% glycerol suitable in


most cases. Crystallizations containing MPD or PEGs can be successfully cryofrozen if the final MPD concentration is above 25% or, for low molecular weight PEGs ranging from 400 to 3500 Da, the final concentration is between 18 and 25%. Typically, successful recrystallization at these high MPD and PEG concentrations negates the requirement for additional cryo-agents during the freezing process. However, any modification of the crystallization protocol requires a careful balancing of the molar concentration of the DNA/RNA in solution against the precipitant concentration in order to prevent rapid DNA/RNA precipitation. We often coat the crystal in Paratone-N oil before transferring it to the cold stream to prevent dehydration and salt formation on the surface of the crystal.
25. Lab-based X-ray sources such as sealed tubes and rotating anodes have proven highly suitable for collecting high-quality diffraction data from quadruplex crystals. As with protein crystallography, synchrotrons generally enable higher resolution data to be collected on smaller crystals, largely due to increased X-ray brilliance. In addition, some synchrotrons offer near-fully automated data collection (and processing) services [23], which are also highly suitable for nucleic acid crystals.
26. Follow a standard protocol based on the space group, crystal orientation, and diffraction quality.
27. Once the crystal is successfully frozen, it is recommended that a suitable number of whole reflections are collected on each frame during data collection, to ensure that sufficient equivalent reflections are available for scaling between individual frames. Unfortunately, for nucleic acid crystals this requires a very wide Δθ scan for each frame collected.
In contrast to protein crystallography, the intensity distribution of the scattered peaks is substantially wider for nucleic acid diffraction, leading to detector saturation for low-angle scattering peaks along with those peaks associated with the repeat base-stacking vectors of about 3.4 Å. This requires consideration when optimizing exposures for high-resolution data collection, although it is predominantly an issue for CCD detectors and less so for modern hybrid photon counters. Lattice dimensions reflect the maximum size of the macromolecules, so with DNA/RNA quadruplexes of around 7500 Da, the cell dimensions are substantially smaller than is typical for proteins. The smaller lattices result in fewer diffracted peaks with greater separation in reciprocal space, resulting in too few peaks during integration if the data slices are too narrow (i.e., 95%.
4. For drug binding studies, concentrated drug stock solutions (20–40 mM) are prepared in H2O or DMSO-d6 (see Note 25).
5. The DNA solution (0.1–0.5 mM) containing 25 mM phosphate and 95 mM potassium, pH 7.0, is checked for G-quadruplex formation by 1H NMR.
6. The DNA-drug complex is prepared by stepwise addition of small volumes of concentrated drug stock solution to the DNA sample to achieve DNA-drug ratios of 1:0.1, 1:0.2, 1:0.3, 1:0.4, 1:0.5, 1:1, 1:2, 1:3, and 1:4 (see Note 26). The complex sample is annealed if needed.
7. The change in chemical shifts of the DNA-drug complex at various drug equivalents is monitored by 1D 1H NMR spectra. The imino proton regions of tetrad guanines are well separated from non-exchangeable protons and thus can be used in


monitoring drug binding interactions and line-width changes in titration profiles.
8. In general, the effects of a drug on a G-quadruplex are readily interpreted by NMR titration methods. A drug that binds specifically to a particular DNA sequence with sufficient affinity will produce an NMR spectrum with well-resolved peaks. In a slow exchange binding regime, two sets of peaks, for the free DNA and the bound DNA, can both be observed at lower drug equivalence, e.g., 0.5, and signals from the free DNA disappear once all the binding sites are occupied. In a medium-to-fast exchange binding regime, only one set of peaks is observed for DNA and drug; in this case the complex structure cannot be determined, but some information on binding sites can be obtained. Compounds that do not bind tightly/specifically are readily discerned, as they do not lead to any shifts or cause spectral line broadening (see Note 27).
9. If a drug-DNA complex gives a well-resolved 1H NMR spectrum, the binding site(s) or mode(s) can be determined. For example, a compound that stacks on either end of the G-quadruplex will affect the chemical shifts of the guanine imino protons of the end G-tetrad with which the drug stacks, typically causing an upfield shift. For groove-binding drugs, the protons located in the groove region, such as sugar protons and guanine H8, rather than the imino protons, will be shifted upon drug binding. The changes in the base proton chemical shifts will also differ between binding modes.
10. Once the complex system is optimized, the detailed structure determination can be carried out by following the steps in Subheading 3.
11. For example, the binding of quindoline (EPI, Fig. 6a) with Myc 14/23 was studied by 1D 1H NMR titration experiments. The free Myc 14/23 (Fig. 6b) in 0.1 M K+ solution forms a major parallel G-quadruplex conformation, as indicated by well-resolved imino proton peaks (Fig. 6c) and as shown previously [26]. Upon addition of quindoline to the Myc 14/23 DNA solution, the imino proton peaks of the DNA first broaden at lower drug equivalence (0.5 N) and then become sharper at higher equivalence (Fig. 6c), indicating a medium exchange rate of quindoline binding to Myc 14/23 on the NMR timescale. The NMR titration data support a 2:1 binding stoichiometry of quindoline with Myc 14/23, as no further qualitative change is visible in the imino region at EPI equivalences higher than 2 (Fig. 6c). The observation of relatively well-resolved imino proton peaks at a 2:1 ratio


[Fig. 6 panel labels: (A) quindoline structure; (B) schematic of Myc 14/23 with residues G7 to G22 labeled from 5′ to 3′; (C) imino region, 12.0–10.5 ppm, for free Myc 14/23 DNA and after addition of +0.5, +1, +2, and +3 N QUI]

Fig. 6 (a) The chemical structure of quindoline. (b) Schematic drawing of the parallel-stranded Myc 14/23 G-quadruplex. (c) The 1D 1H NMR titration profiles of Myc 14/23 with the small molecule quindoline at different ratios. Conditions: 25 °C, 0.2 mM DNA, 25 mM phosphate, 95 mM potassium ion, pH 7.0

suggests a rather specific binding of quindoline. The upfield shifting of the DNA imino proton peaks indicates that quindoline binds Myc 14/23 with a stacking binding mode [27].
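The stepwise additions used in such a titration can be planned in advance. The following is a minimal sketch, not part of the published protocol: it assumes a 500 μL NMR sample (a hypothetical volume; the text does not state one), uses the 0.2 mM DNA concentration of Fig. 6 and a 40 mM drug stock from the stated 20–40 mM range, and the function name `titration_additions` is introduced here purely for illustration.

```python
def titration_additions(dna_mM, sample_uL, stock_mM, ratios):
    """Plan drug additions for an NMR titration.

    Returns one tuple per target drug:DNA ratio:
    (ratio, volume to add at this step in uL, cumulative volume in uL,
     added volume as a percentage of the total sample volume).
    """
    dna_nmol = dna_mM * sample_uL  # mM x uL = nmol of DNA in the tube
    rows = []
    previous_total_uL = 0.0
    for ratio in ratios:
        # nmol of drug needed / stock concentration (mM) = uL of stock
        total_uL = ratio * dna_nmol / stock_mM
        step_uL = total_uL - previous_total_uL
        added_percent = 100.0 * total_uL / (sample_uL + total_uL)
        rows.append((ratio, step_uL, total_uL, added_percent))
        previous_total_uL = total_uL
    return rows


if __name__ == "__main__":
    # Drug equivalents per DNA taken from the ratios listed in the protocol.
    equivalents = [0.1, 0.2, 0.3, 0.4, 0.5, 1, 2, 3, 4]
    # 500 uL sample volume is an assumption made for this example only.
    for ratio, step, total, pct in titration_additions(0.2, 500.0, 40.0, equivalents):
        print(f"1:{ratio:<4} add {step:5.2f} uL (total {total:5.2f} uL, {pct:4.2f}% of sample)")
```

Under these assumed values the full titration to a 1:4 ratio adds 10 μL of stock in total, which illustrates why a concentrated stock keeps dilution and solvent effects small.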

4

Notes 1. Phosphoramidites are stored at room temperature until dissolution in acetonitrile. 2. The buffer for NMR sample preparation is stored at room temperature. 3. Some G-quadruplex forming sequences are prone to aggregation under high salt conditions. In such cases, this buffer may be diluted. 4. We used potassium salt as it is considered to be the more physiologically relevant ion. Na+ can also be used if needed.


5. A low-concentration DNA sample is usually used for 1D experiments, while a high-concentration sample is needed for 2D NMR, particularly NOESY experiments. It needs to be confirmed that the DNA adopts the same conformation at all concentrations.
6. For DMSO stock solutions, deuterated DMSO-d6 is used to minimize the solvent peak in 1H NMR spectra.
7. DNA can also be synthesized at a larger scale, e.g., 15 μmol; however, the yield is anticipated to be lower for the larger scale synthesis.
8. The level of enrichment can vary depending on the resonances. An enrichment level lower than 6% can be used for detecting resonances with high intensity, while a higher level of enrichment is needed for detecting weaker resonances.
9. Cleavage of blocking groups from synthesized oligonucleotides with ammonium hydroxide can also be performed by incubation at 62 °C for 10 h.
10. Careful extraction of acetic acid-treated oligonucleotides with fresh ether is necessary to obtain high-quality final material. Cloudiness during the extraction is an indication of unsuccessful extraction. Addition of 1 mL of high-purity water increases the aqueous phase volume, leading to better phase separation and better recovery of DNA products.
11. Dialysis in glass beakers is necessary to prevent the leaching of chemicals that occurs while stirring in plastic beakers and produces contaminant peaks in NMR spectra.
12. Dialysis should be performed against at least 1000 volumes of exchange solvent in order to obtain efficient change of solution conditions.
13. Oligonucleotides longer than 20-mer can be dialyzed in tubing with a MWCO of 3000–3500; shorter oligonucleotides should be dialyzed in tubing with a MWCO of 1000.
14. Lyophilization is a critical step for DNA product quality. At all stages, materials should be hard-frozen prior to lyophilization and dried completely before removal from the lyophilizer.
Lyophilized oligonucleotides are stored under desiccation at −20 °C until dissolution.
15. It is also important to determine whether multiple conformations are present. From the number of imino peaks arising from a particular sequence, it is possible to determine whether multiple conformational species are present in solution and what their relative populations are.
16. Tel26 was found to form two well-defined G-quadruplex conformations when freshly dissolved in K+ solution, as indicated


by two separate sets of relatively sharp guanine imino peaks (Fig. 2c). One conformation (~40%) slowly converts to the other (~60%) overnight, and the complete conversion takes about a day. This observation led to the careful examination of the native 26-nt human telomeric sequence wtTel26, (TTAGGG)4TT (Fig. 2a).
17. However, this major conformation only accounts for ~70% of the total population, with minor conformations (about 30%) appearing as weak and broader resonances. Complete resonance assignment for wtTel26 is therefore more challenging, and a large number of spectra under different conditions are needed [21].
18. In some systems, it is possible to mistake the thymine H6-HMe cross-peak for the H6-H2′/H2″ cross-peaks in the NOESY spectra.
19. Site-specific substitution of adenines or guanines with inosines (dI) can also be used for the assignment of their respective base protons.
20. Setting the acquisition data points to 4096 × 512 is sufficient with folded spectra (we use a spectral width of 18 ppm × 9 ppm). For non-folded spectra, the resulting resolution may be poor without additional points (i.e., 4096 × 1024 points).
21. Sequential connectivity is often interrupted by various structural features of a G-quadruplex. Guanine H8 protons in syn bases exhibit reversed connectivity compared to standard B-DNA. Bases in strand-reversal loops may not show sequential NOE connectivity with guanines involved in G-tetrad formation, especially in the case of short (1–2 nt) strand-reversal loops.
22. It may be difficult to obtain accurate distance values from significantly overlapped peaks (i.e., 3 or more) for NOE-restrained structure calculation. In such cases, restraints from these peaks may be discarded or introduced during late structural refinement stages.
23. If the G-quadruplex sequence does not have cytosine bases, the thymine base protons Me–H6 (2.99 Å) can also be used as a reference.
24. In some cases, the wild-type DNA sequence that forms multiple conformations is preferred for examining specific drug binding. A drug that binds to one specific conformation may be able to shift the equilibrium of the multiple forms.
25. Highly concentrated drug stock solutions allow the use of small volume additions of drug solution to the DNA-drug complex, so that the solvent effect is negligible.


26. DNA-drug ratios up to 1:0.5 are important for drug binding studies: in a slow exchange binding regime, two sets of peaks, for the free DNA and the bound DNA, will be observed at a ratio of 1:0.5, while in a medium-to-fast exchange binding regime, only one set of peaks will be observed.
27. Line broadening may be observed for ligands in intermediate exchange. In such cases, peak resolution would recover upon saturation of the binding site with excess ligand.
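The exchange regimes referred to in Notes 26 and 27 are conventionally distinguished by comparing the exchange rate with the frequency difference between the free and bound resonances. The sketch below illustrates that general two-site criterion; it is not a procedure from this chapter. The function name `exchange_regime` and the factor-of-ten thresholds are illustrative assumptions, and real spectra should be judged by line shape rather than by a hard cutoff.

```python
import math


def exchange_regime(k_ex_per_s, delta_ppm, spectrometer_mhz, factor=10.0):
    """Classify two-site exchange on the NMR chemical shift timescale.

    k_ex_per_s       exchange rate constant (s^-1)
    delta_ppm        shift difference between free and bound peaks (ppm)
    spectrometer_mhz 1H Larmor frequency (MHz), so ppm x MHz = Hz
    factor           illustrative margin separating the regimes (assumption)
    """
    delta_omega = 2.0 * math.pi * abs(delta_ppm) * spectrometer_mhz  # rad/s
    if k_ex_per_s * factor <= delta_omega:
        return "slow"          # separate free and bound peaks
    if k_ex_per_s >= factor * delta_omega:
        return "fast"          # one population-averaged peak
    return "intermediate"      # broadened peaks


if __name__ == "__main__":
    # Hypothetical values: 0.2 ppm imino shift difference at 600 MHz.
    for rate in (10.0, 1000.0, 1e5):
        print(rate, exchange_regime(rate, 0.2, 600.0))
```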

Acknowledgments
This research was supported by the National Institutes of Health (R01CA122952 (DY), R01CA177585 (DY), and P30CA023168 (Purdue Center for Cancer Research)).

References
1. Yang DZ, Okamoto K (2010) Structural insights into G-quadruplexes: towards new anticancer drugs. Future Med Chem 2(4):619–646
2. Sen D, Gilbert W (1990) A sodium-potassium switch in the formation of four-stranded G4-DNA. Nature 344(6265):410–414
3. Hud NV, Plavec J (2006) The role of cations in determining quadruplex structure and stability. In: Neidle S (ed) Quadruplex nucleic acids. Royal Society of Chemistry, RSC Publishing, Cambridge, pp 100–130
4. Neidle S, Parkinson G (2002) Telomere maintenance as a target for anticancer drug discovery. Nat Rev Drug Discov 1(5):383–393
5. Punchihewa C, Yang DZ (2009) Therapeutic targets and drugs-G-quadruplex inhibitors. In: Hiyama K (ed) Telomeres and telomerase in cancer. Springer, NJ, USA, pp 251–280
6. Qin Y, Hurley LH (2008) Structures, folding patterns, and functions of intramolecular DNA G-quadruplexes found in eukaryotic promoter regions. Biochimie 90(8):1149–1171. https://doi.org/10.1016/j.biochi.2008.02.020
7. Onel B, Lin C, Yang D (2014) DNA G-quadruplex and its potential as anticancer drug target. Sci China Chem 57(12):1605–1614
8. Balasubramanian S, Hurley LH, Neidle S (2011) Targeting G-quadruplexes in gene promoters: a novel anticancer strategy? Nat Rev Drug Discov 10(4):261–275
9. Neidle S (2016) Quadruplex nucleic acids as novel therapeutic targets. J Med Chem 59(13):5987–6011

10. Chen Y, Yang DZ (2012) Sequence, stability, and structure of G-quadruplexes and their interactions with drugs. Curr Protoc Nucl Acid Chem 50:17.15.11–17.15.17
11. Sun DY, Thompson B, Cathers BE, Salazar M, Kerwin SM, Trent JO, Jenkins TC, Neidle S, Hurley LH (1997) Inhibition of human telomerase by a G-quadruplex-interactive compound. J Med Chem 40(14):2113–2116
12. Brooks TA, Hurley LH (2009) The role of supercoiling in transcriptional control of MYC and its importance in molecular therapeutics. Nat Rev Cancer 9(12):849–861
13. Wheelhouse RT, Han FX, Sun D, Hurley LH (1998) The interaction of telomerase inhibitory porphyrines with G-quadruplex DNA. Proc Am Assoc Cancer Res 39:430
14. Shin-ya K, Wierzba K, Matsuo K, Ohtani T, Yamada Y, Furihata K, Hayakawa Y, Seto H (2001) Telomestatin, a novel telomerase inhibitor from Streptomyces anulatus. J Am Chem Soc 123(6):1262–1263
15. Local A, Zhang H, Benbatoul KD, Folger P, Sheng X, Tsai C-Y, Howell SB, Rice WG (2018) APTO-253 stabilizes G-quadruplex DNA, inhibits MYC expression and induces DNA damage in acute myeloid leukemia cells. Mol Cancer Ther 17(6):1177–1186. https://doi.org/10.1158/1535-7163.MCT-17-1209
16. Gunaratnam M, Collie GW, Reszka AP, Todd AK, Parkinson GN, Neidle S (2018) A naphthalene diimide G-quadruplex ligand inhibits cell growth and down-regulates BCL-2 expression in an imatinib-resistant gastrointestinal cancer cell line. Bioorg Med Chem 26(11):2958–2964
17. Dai JX, Punchihewa C, Mistry P, Ooi AT, Yang DZ (2004) Novel DNA bis-intercalation by MLN944, a potent clinical bisphenazine anticancer drug. J Biol Chem 279(50):46096
18. Wang Y, Patel DJ (1993) Solution structure of the human telomeric repeat d[AG3(T2AG3)3] G-tetraplex. Structure 1(4):263–282
19. Parkinson GN, Lee MPH, Neidle S (2002) Crystal structure of parallel quadruplexes from human telomeric DNA. Nature 417(6891):876–880
20. Ambrus A, Chen D, Dai JX, Bialis T, Jones RA, Yang DZ (2006) Human telomeric sequence forms a hybrid-type intramolecular G-quadruplex structure with mixed parallel/antiparallel strands in potassium solution. Nucleic Acids Res 34(9):2723–2735
21. Dai JX, Punchihewa C, Ambrus A, Chen D, Jones RA, Yang DZ (2007) Structure of the intramolecular human telomeric G-quadruplex in potassium solution: a novel adenine triple formation. Nucleic Acids Res 35(7):2440–2450
22. Tu A (2000) Long-range imino proton-13C J-couplings and the through-bond correlation of imino and non-exchangeable protons in unlabeled DNA. J Biomol NMR 16(2):175–178

23. Greene KL, Wang Y, Live D (1995) Influence of the glycosidic torsion angle on 13C and 15N shifts in guanosine nucleotides: investigations of G-tetrad models with alternating syn and anti bases. J Biomol NMR 5(4):333–338
24. Goddard TD, Kneller DG (2004). University of California, San Francisco
25. Brünger AT (1993) Version 3.1: a system for X-ray crystallography and NMR. Yale University Press, New Haven, CT, USA
26. Luu KN, Phan AT, Kuryavyi V, Lacroix L, Patel DJ (2006) Structure of the human telomere in K+ solution: an intramolecular (3+1) G-quadruplex scaffold. J Am Chem Soc 128(30):9963–9970
27. Lazzeretti P (2000) Ring currents. Prog Nucl Magn Reson Spectrosc 36(1):1–88
28. Ambrus A, Chen D, Dai JX, Jones RA, Yang DZ (2005) Solution structure of the biologically relevant G-quadruplex element in the human c-MYC promoter. Implications for G-quadruplex stabilization. Biochemistry 44(6):2048–2058
29. Marušič M, Šket P, Bauer L, Viglasky V, Plavec J (2012) Solution-state structure of an intramolecular G-quadruplex with propeller, diagonal and edgewise loops. Nucleic Acids Res 40(14):6946–6956

Chapter 10
Using Molecular Dynamics Free Energy Simulation to Compute Binding Affinities of DNA G-Quadruplex Ligands
Nanjie Deng

Abstract
We provide a practical guide for using molecular dynamics simulation to compute the binding affinity of small molecules in complex with G-quadruplex DNA. Such calculations have a number of applications, such as rescoring docking results and validating docked poses, to inform the discovery of G-quadruplex binders with high affinity and selectivity. This chapter describes two binding free energy protocols: the double decoupling method (DDM) and the potential of mean force method (PMF). We illustrate the application of the two methods using a recent case study of the binding of quindoline with the c-MYC G-quadruplex DNA. For this system, the two methods yield absolute binding free energies within ~2 kcal/mol of the experimental value. We discuss the advantages and disadvantages of these binding free energy methods.

Key words Molecular dynamics simulation, Absolute binding free energy, Double decoupling method, Potential of mean force, c-MYC G-quadruplex, Quindoline

1

Introduction
The DNA G-quadruplex formed in the guanine-rich regions of human telomeres and gene promoters has been targeted for developing novel anticancer therapies [1–3]. G-quadruplex-interactive ligands often bind at the terminal G-tetrads but may also bind to the more flexible loops [4]. An important goal in structure-based drug discovery is to identify small molecules that have both high affinity and good selectivity for G-quadruplex over duplex DNA [5]. Computational methods such as docking have been used in virtual screening studies to discover potent G-quadruplex ligands with good affinity and selectivity [6]. However, the accuracy of docking can be limited by the relatively simple scoring functions employed, which lack adequate treatment of ligand and binding site desolvation, receptor reorganization, and entropic effects. As a result, virtual screening of a ligand library by docking can result in a high false-positive rate [7].

Danzhou Yang and Clement Lin (eds.), G-Quadruplex Nucleic Acids: Methods and Protocols, Methods in Molecular Biology, vol. 2035, https://doi.org/10.1007/978-1-4939-9666-7_10, © Springer Science+Business Media, LLC, part of Springer Nature 2019


Fig. 1 (a) The 2D structure of the quindoline derivative. (b) The NMR structure of the 2:1 quindoline-c-MYC G-quadruplex complex. The two quindoline molecules at the 5′-end and 3′-end are shown in yellow and green sticks, respectively. At the 3′-end the intermolecular hydrogen bond between the T23 O4 of the DNA and N1 of the quindoline is shown as a green dashed line. The two potassium ions are shown as purple dots. (c) The sequence of c-MYC G-quadruplex (Myc 14/23)

Binding free energy methods such as the double decoupling method (DDM) [8–10] and the potential of mean force method (PMF) [11] are based on statistical mechanics and molecular dynamics (MD) simulations in explicit solvent, which can in principle capture the desolvation, receptor reorganization, and entropic effects in binding. These more detailed methods can be employed as additional filters following docking to compute binding affinity more accurately and separate true binders from false positives. We have shown that the combination of docking and DDM can lead to significant improvement over docking alone for in silico ligand screening against an allosteric site on the HIV-1 protease [12]. More recently, we have applied both DDM and PMF to compute the absolute binding free energy for a G-quadruplex DNA target [13]. The parallel DNA G-quadruplex formed in the c-MYC gene promoter regulates c-MYC transcription. NMR revealed two drug binding sites located at the 5′ and 3′ termini of the c-MYC G-quadruplex. To determine the ligand binding site specificity in the c-MYC G-quadruplex, that is, which site is more favored in drug binding, we calculated the binding free energies for a quindoline derivative at each of the two binding sites in the G-quadruplex (Fig. 1). The calculated absolute binding free energies are in good agreement with the overall ligand binding affinity determined by SPR and suggest that the quindoline has a small preference for the 5′ site [13]. A DDM calculation of ligand-DNA


binding affinity involves two legs of decoupling simulations. (The term decoupling describes the process in which a ligand gradually loses its interactions with its surrounding molecules.) In one leg, the intermolecular interactions between a bound ligand and the binding site and solvent molecules are gradually turned off. In the other leg, the interactions between a free ligand in solution and the surrounding solvent molecules are gradually turned off. Such calculations can be computationally challenging for charged ligands because of the strong electrostatic interactions between the positively charged DNA-interactive ligand and the negatively charged DNA and the ionic solution. In contrast, in the PMF approach the ligand is extracted from the binding site along a geometric pathway, instead of being decoupled from the binding site. Below we describe the protocols for carrying out PMF and DDM calculations on a ligand-DNA complex, and discuss the relative advantages and disadvantages of the two methods in computing absolute binding free energy.

2

Methods
This practical guide is intended for computational chemists and biophysicists familiar with molecular dynamics simulations [14]. It assumes a basic working knowledge of one of the MD packages such as GROMACS [15–18], AMBER [19], NAMD [20], or CHARMM [21]. In this section, we give an overview of the concepts used to compute the absolute binding free energy from computer simulations. We then describe the procedure to set up the simulation system, and the protocols to perform the PMF and DDM calculations of absolute binding free energy. For each method, we provide a summary table listing the sequential tasks, the required software, and the estimated computation times.

2.1 Overview of Absolute Binding Free Energy Calculation

For the binding reaction $A + B \underset{k_{\mathrm{off}}}{\overset{k_{\mathrm{on}}}{\rightleftharpoons}} AB$, perhaps the most natural way to compute the binding affinity from simulation is to use unbiased MD to directly estimate the forward and backward rates $k_{\mathrm{on}}$ and $k_{\mathrm{off}}$ from the number of binding and unbinding events observed in the simulation within a given time, and use the detailed balance relation $k_{\mathrm{on}}[A][B] = k_{\mathrm{off}}[AB]$ to determine the equilibrium dissociation constant, i.e., $K_D \equiv \frac{[A][B]}{[AB]} = \frac{k_{\mathrm{off}}}{k_{\mathrm{on}}}$ [22]. However, except for very weak binders, the direct estimation of $k_{\mathrm{off}}$ or $k_{\mathrm{on}}$ is still beyond the accessible timescale of brute-force explicit solvent MD even with the fastest specialized computers [23]. Therefore, to compute the absolute binding free energy ($\Delta G^{\circ}_{\mathrm{bind}} = kT \ln K_D$), one solution is to insert intermediate states to connect the bound and unbound states along a thermodynamic path, instead of using brute-force MD simulation to directly connect the two end states of the


Fig. 2 Thermodynamic path used by the PMF method to connect the bound and unbound states for computing the absolute binding free energy

Fig. 3 The thermodynamic cycle used in DDM to compute absolute binding free energy

binding reaction. From thermodynamics, it is known that the Gibbs energy difference between two states equals the non-expansion work expended on converting the initial state to the final state in a reversible process [24]. Therefore, the free energy difference between the bound and unbound states, i.e., the binding free energy, can be obtained as the reversible work required to convert the bound state to the unbound state via successive intermediate states along the thermodynamic path. Figures 2 and 3 show how the two end states are connected by well-defined intermediate states along thermodynamic paths in the PMF and DDM methods, respectively. In both diagrams, the solid lines are the actual paths used by the corresponding method. In the PMF


Fig. 4 The coordinate frame that defines the orientation and the position of the ligand relative to the c-MYC G-quadruplex DNA. Atoms a, b, c and A, B, C are located on the DNA and ligand, respectively. The ligand position is defined in polar coordinates by the vector $\mathbf{r}_{aA}$ and two angles, θ: b-a-A and ϕ: c-b-a-A. The ligand orientation is defined by the three Euler angles, Θ: a-A-B; Φ: a-A-B-C; Ψ: b-a-A-B

approach, starting from the bound state, the first intermediate state A is a solvated receptor-ligand complex in which the ligand's orientation relative to the receptor is restricted by a set of angular restraints (see below). The conversion of the bound state to the state A requires turning on the angular restraints. The second intermediate state B contains an orientationally restrained ligand separated from the receptor. The conversion A → B is obtained by pulling the orientationally restrained ligand out of the binding pocket (Fig. 2). Lastly, the conversion of B to the unbound state involves turning off the angular restraints on the ligand in the state B. The overall thermodynamic path from the bound state to the unbound state is: bound → A → B → unbound, and the absolute binding free energy is the sum of the reversible work done on the system along this path. Note that none of these transformations modifies the original molecular interactions in the system. Such thermodynamic processes are therefore in principle physically realistic. By contrast, as described below, the conversions used in DDM between the different intermediate states can involve modifying the original molecular interactions.

Figure 3 shows the thermodynamic cycle used in DDM. Starting from the bound state, the first intermediate state A consists of a solvated complex in which the distance between the ligand and receptor centers is harmonically restrained to its equilibrium value in an unrestrained simulation. The state A is obtained by turning on the harmonic distance restraint starting from the bound state.


The second intermediate state B contains a harmonically restrained ligand whose intermolecular interactions with the receptor and solvent molecules are turned off (decoupling), resulting in a gas-phase ligand tethered to the binding site by the ligand-receptor distance restraint. The conversion A → B, which involves turning off the physical interactions, is nonphysical or alchemical, and can only be realized in a computer simulation. Next, the conversion B → C involves turning off the harmonic distance restraint on the decoupled ligand. Thus, the state C contains a gas-phase ligand that moves freely within the simulation box and does not interact with either the receptor or the solvent molecules. Lastly, starting from the state C, the free gas-phase ligand is re-coupled to the solvent phase by turning on the interactions between the ligand and the solvent molecules (re-coupling). This completes the thermodynamic path: bound → A → B → C → unbound. Again, the absolute binding free energy is obtained as the sum of the reversible work done on the system during the above conversion process.

2.2 Simulation System Setup

An MD-based binding free energy calculation requires a reasonably good initial structure of the ligand-DNA complex, which may come from several sources:

1. An experimental structure of the complex from NMR or X-ray crystallography.

2. In the absence of a high-resolution experimental structure of the ligand-DNA complex, docking can be used to generate an initial structure of the complex. Widely used docking programs such as AutoDock [25] and Glide [26, 27] can be used for this purpose. However, the binding site on the DNA can undergo significant conformational changes [13, 28] upon ligand binding, and most docking methods cannot capture such induced-fit effects. Therefore, to dock a new ligand, a holo form of the DNA structure containing a similar ligand can be used as a reasonable first guess.

3. An alternative and potentially attractive method to generate good initial binding poses is to run direct MD simulations in explicit solvent, starting from free ligands and the DNA in the same solvent box, and allow the ligands to find the correct binding mode naturally. Recent reports from Wu et al. [29] and Mu et al. [30] have demonstrated that the experimental binding modes of ligands in complex with human telomeric G-quadruplex DNA can be sampled in direct MD in explicit solvent. Compared with docking against a rigid receptor, the MD binding simulation has the potential to capture the ligand-induced DNA reorganization. The drawback of the direct binding simulation is that it is computationally slow: it may take ~100 ns to 1 μs of simulation time to observe the correct binding mode. Currently, directly


simulating such binding processes is feasible only when the binding occurs at a surface site.

The next task is to build the force field topology of the ligand-DNA complex. In GROMACS, the topology of the complex is generated by first creating the topologies for the ligand and DNA separately, and then merging the two topology files: see the GROMACS tutorial by Lemkul [15]. Before generating the ligand topology, it is important to assign the protonation states of all the ionizable sites. For example, for the quindoline molecule the pKa of the N1 atom is calculated to be 10.2 (Fig. 1) using the QM-based Jaguar pKa program. Therefore, the N1 site is protonated at physiological pH. Using a deprotonated N1 would cause the absolute binding free energy of quindoline to be underestimated by several kcal/mol. Some input G-quadruplex structures from the PDB are missing the important channel ions; these ions must be added manually to the DNA structure before generating the DNA G-quadruplex topology.

To model the G-quadruplex DNA in aqueous solution, we use either the AMBER parmbsc0 [31] or the parmbsc0/OL15 [32] parameter set. To generate the GROMACS topology of the G-quadruplex DNA using these force fields, LEaP (AmberTools) [19] is first used to generate an AMBER topology, which is subsequently converted to the GROMACS format using ACPYPE [33]. We have also experimented with the AMBER parmbsc1 parameter set on the quindoline-c-MYC G-quadruplex complex. Starting from the NMR structure of the complex, the ligand binding pockets become distorted within 20 ns of simulation. By contrast, the two ligand binding pockets remain stable in the MD simulations run with the parmbsc0 set. The small-molecule ligand is modeled using the Amber GAFF or GAFF2 force field [34]. The ligand partial charges are assigned using the AM1-BCC charge model [35].
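The protonation-state assignment for the quindoline N1 site described above can be checked with the Henderson-Hasselbalch relation. The following is a minimal sketch, not part of the original protocol; the value 7.4 for physiological pH and the function name are assumptions made for illustration.

```python
def protonated_fraction(pka: float, ph: float) -> float:
    """Fraction of a basic site that carries its proton at the given pH
    (Henderson-Hasselbalch relation for a single ionizable site)."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


PKA_N1 = 10.2            # pKa of quindoline N1 quoted in the text
PH_PHYSIOLOGICAL = 7.4   # assumed value for "physiological pH"

fraction = protonated_fraction(PKA_N1, PH_PHYSIOLOGICAL)
print(f"Protonated fraction of N1 at pH {PH_PHYSIOLOGICAL}: {fraction:.4f}")
```

With a pKa almost three units above the assumed pH, the site is protonated in well over 99% of the population, which is why the ligand topology is built with N1 protonated.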
Here the GROMACS topology of the ligand is created by using the antechamber (AmberTools) program to generate the AMBER topology and then using ACPYPE to write the ligand topology in the GROMACS format. The topology of the complex is generated by merging the two GROMACS topology files for the ligand and the DNA G-quadruplex.

The ligand-DNA complex is solvated in a cubic or truncated octahedral box of TIP3P water molecules [36]. The dimensions of the solvent box are set to ensure that the distance between the solute atoms and the nearest walls of the box is at least 10 Å. In GROMACS, the solvent box is generated using the editconf and genbox commands [15, 18]. After solvation, K+ or Na+ counterions are added to the solvent box using the GROMACS genion utility [18] to make the simulation system charge neutral. The Lennard-Jones parameters developed by Joung and Cheatham [37] can be used to model the metal ions. The electrostatic interactions are computed


using the particle-mesh Ewald (PME) method with a real-space cutoff of 10 Å and a grid spacing of 1.0 Å.

2.3 Binding Free Energy Calculation Using the PMF Method

As discussed in the Overview, the PMF approach [11, 13] involves three thermodynamic transformations (Fig. 2). In each of the transformations, the corresponding reversible work or free energy change is computed either by simulation or from an analytical expression. The sum of the free energy changes yields the absolute binding free energy $\Delta G^{\circ}_{\mathrm{bind}}$ [13]:

$$\Delta G^{\circ}_{\mathrm{bind}} = -\Delta G^{\mathrm{bound}}_{(\theta,\phi,\Theta,\Phi,\Psi)} - w(r^{*}) - k_{B}T \ln\left[\frac{\int_{\mathrm{bound}} e^{-w(r)/k_{B}T}\,dr}{\left(2\pi k_{B}T/k_{r}\right)^{1/2}}\right] + \Delta G^{\mathrm{bulk}}_{\mathrm{restr}} \tag{1}$$

The complete derivation of the PMF formula has been given in a previous report [13]. Below we describe the procedure for computing the various terms in Eq. (1).

1. From bound → A: when the ligand is bound, apply harmonic restraints ($U_{\Theta}$, $U_{\Phi}$, $U_{\Psi}$) on the three Euler angles (Θ, Φ, Ψ) to restrain the ligand orientation relative to the receptor. Then apply the harmonic restraints ($U_{\theta}$, $U_{\phi}$) on the polar angle θ and the azimuthal angle ϕ to restrain the ligand translational movement to the axis defined by the vector $\mathbf{r}_{aA}$: see Fig. 4, which shows the coordinate system in which the orientation and the position of the ligand relative to the receptor are defined. To define the above restraints, first choose three receptor atoms a, b, and c that are near the binding site, and three ligand atoms A, B, and C (Fig. 4). The Euler angle Θ is formed by atoms a-A-B, the dihedral angle Φ by atoms a-A-B-C, and the dihedral angle Ψ by atoms b-a-A-B. Similarly, the polar angle θ is formed by atoms b-a-A, and the azimuthal dihedral angle ϕ by atoms c-b-a-A. All the angular restraints have the form $U_{x} = \frac{1}{2}k_{x}\left(x - x_{\mathrm{eq}}\right)^{2}$, where $k_{x}$ is the force constant and $x_{\mathrm{eq}}$ is the equilibrium value of the angle (or dihedral angle) x in an unrestrained simulation of the bound complex. It should be noted that the axis defined by the vector $\mathbf{r}_{aA}$ determines the geometric pathway along which the ligand is extracted; as discussed in Note 1, this pathway needs to be free of major steric clashes with the receptor atoms during ligand extraction. The free energy change of applying these angular restraints in the bound state is denoted as $\Delta G^{\mathrm{bound}}_{(\theta,\phi,\Theta,\Phi,\Psi)}$, which is the first term in Eq. (1). It is the sum of two contributions, $\Delta G^{\mathrm{bound}}_{(\theta,\phi,\Theta,\Phi,\Psi)} = \Delta G^{\mathrm{bound}}_{(\Theta,\Phi,\Psi)} + \Delta G^{\mathrm{bound}}_{(\theta,\phi)}$, where $\Delta G^{\mathrm{bound}}_{(\Theta,\Phi,\Psi)}$ and $\Delta G^{\mathrm{bound}}_{(\theta,\phi)}$ are the free energies of turning on the harmonic restraints ($U_{\Theta}$, $U_{\Phi}$, $U_{\Psi}$) and ($U_{\theta}$, $U_{\phi}$), respectively. $\Delta G^{\mathrm{bound}}_{(\Theta,\Phi,\Psi)}$


and $\Delta G^{\mathrm{bound}}_{(\theta,\phi)}$ are computed using free energy perturbation (FEP) simulations [14], in which a parameter λ gradually increases the strength of the harmonic restraints over multiple simulation windows: λ = 0 corresponds to an unrestrained ligand, and λ = 1 to a fully restrained ligand. For the quindoline-c-MYC G-quadruplex system, 12 $\lambda_{\Theta,\Phi,\Psi}$ values (0.0, 0.01, 0.025, 0.05, 0.075, 0.1, 0.2, 0.35, 0.5, 0.65, 0.8, 1.0) are used in FEP simulations to switch on the restraints ($U_{\Theta}$, $U_{\Phi}$, $U_{\Psi}$) and compute $\Delta G^{\mathrm{bound}}_{(\Theta,\Phi,\Psi)}$, followed by 12 $\lambda_{\theta,\phi}$ values in another set of FEP simulations to switch on the restraints ($U_{\theta}$, $U_{\phi}$) and compute $\Delta G^{\mathrm{bound}}_{(\theta,\phi)}$. Note that throughout the FEP simulations of turning on the ($U_{\theta}$, $U_{\phi}$) restraints, the ($U_{\Theta}$, $U_{\Phi}$, $U_{\Psi}$) restraints are kept at their full strengths. The force constants used in the angular restraints are $k_{\theta} = k_{\phi} = k_{\Theta} = k_{\Phi} = k_{\Psi} = 1000$ kJ mol⁻¹ rad⁻². In GROMACS, the λ parameter is set up using the restraint-lambdas keyword. At each λ, the system is equilibrated for 100 ps, followed by a 4 ns production run. The resulting trajectories for all λ values are processed using the Bennett acceptance ratio (BAR, implemented in GROMACS) to compute $\Delta G^{\mathrm{bound}}_{(\Theta,\Phi,\Psi)}$ and $\Delta G^{\mathrm{bound}}_{(\theta,\phi)}$.

2. From A → B: reversibly pull the orientationally restrained ligand out of the binding pocket along the axis defined by $\mathbf{r}_{aA}$ until it reaches an arbitrary location $r^{*}$ in the bulk solution, and compute the reversible work w(r), i.e., the potential of mean force or PMF, as a function of the pulling distance $r \equiv |\mathbf{r}_{aA}|$. The corresponding free energy change for the step A → B equals

$$w(r^{*}) + k_{B}T \ln\left[\frac{\int_{\mathrm{bound}} e^{-w(r)/k_{B}T}\,dr}{\left(2\pi k_{B}T/k_{r}\right)^{1/2}}\right],$$

which are the second and third terms in Eq. (1). Here $w(r^{*})$ is the value of the PMF function w(r) at $r = r^{*}$, where $r^{*}$ is an arbitrary location of the ligand in the bulk solution. The computed binding free energy $\Delta G^{\circ}_{\mathrm{bind}}$ is insensitive to the precise choice of $r^{*}$, as long as the ligand at this location is sufficiently far away from the binding pocket and the PMF w(r) has become essentially flat: see Fig. 5. The PMF function w(r) can be computed using the umbrella sampling (US) method [38], in which a series of MD simulations are performed in the presence of varying biasing potentials in successive windows spanning from the bound state to the unbound state along the ligand-DNA distance vector $\mathbf{r}_{aA}$ (Fig. 4). See reference [38] for a detailed description of umbrella sampling. The biasing potential used in the i-th simulation window is $U_{r,i} = \frac{1}{2}k_{r}\left(r - r^{0}_{i}\right)^{2}$, where $r^{0}_{i}$ is the reference distance for the i-th window. For the 5′-end site of the quindoline-c-MYC G-quadruplex, 24 umbrella windows are used with $r^{0}_{i}$ = 11.0, 11.5, 12.0, 12.5, 13.0, 13.5, 14.0, 14.5, 15.0, 15.5, 16.0, 16.5, 17.0, 17.5, 18.0, 19.0, 20.0, 21.0, 22.0, 23.0, 24.0, 25.0, 26.0, 27.0 Å.


Fig. 5 Calculated PMF w(r), in kcal/mol, as a function of the distance r (Å) for extracting the ligand from the 5′ pocket of the quindoline-c-MYC G-quadruplex complex; the positions $r_{\mathrm{site}}$ and $r^{*}$ are marked on the curve. The representative conformations of the complex observed at intermediate r are also shown, with the quindoline molecule shown in yellow
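The umbrella-window layout used for the 5′-end site can be generated programmatically. The sketch below is illustrative only: it rebuilds the 24 reference distances listed in the protocol, evaluates the harmonic bias for the force constant of 1000 kJ mol⁻¹ nm⁻² used in this protocol, and estimates how wide the sampled distribution is in a window where the PMF is flat. The temperature of 298.15 K and all function names are assumptions, not values taken from the chapter.

```python
import math

K_R = 1000.0           # kJ mol^-1 nm^-2, umbrella force constant used in the protocol
KB = 0.0083144626      # kJ mol^-1 K^-1, Boltzmann constant per mole
TEMPERATURE = 298.15   # K, assumed simulation temperature


def window_centers_angstrom():
    """Reference distances r0_i in Angstrom: 0.5 A steps from 11.0 to 18.0,
    then 1.0 A steps from 19.0 to 27.0 (24 windows in total)."""
    fine = [11.0 + 0.5 * i for i in range(15)]
    coarse = [19.0 + 1.0 * i for i in range(9)]
    return fine + coarse


def bias_energy(r_angstrom, r0_angstrom, k_r=K_R):
    """Harmonic bias U = 1/2 k_r (r - r0)^2 in kJ/mol.
    Distances are given in Angstrom and converted to nm internally."""
    dr_nm = (r_angstrom - r0_angstrom) / 10.0
    return 0.5 * k_r * dr_nm ** 2


centers = window_centers_angstrom()

# Standard deviation of r sampled under the bias alone (flat PMF), in Angstrom.
sigma_angstrom = 10.0 * math.sqrt(KB * TEMPERATURE / K_R)

print(f"{len(centers)} windows from {centers[0]} to {centers[-1]} A")
print(f"bias 1 A away from a window center: {bias_energy(12.0, 11.0):.1f} kJ/mol")
print(f"width of sampling in a flat region: {sigma_angstrom:.2f} A")
```

Under these assumptions the sampling width in the flat region is about 0.5 Å, so the overlap between neighboring windows, especially those spaced 1 Å apart, is worth checking from the histograms before running WHAM.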

A single force constant $k_{r} = 1000$ kJ mol⁻¹ nm⁻² is used for the distance restraints in all the umbrella windows. For each umbrella window, a reasonable initial configuration under the corresponding biasing potential must first be generated. This is achieved by running a short umbrella sampling simulation (~1 ns long) in each window in sequence: for the i-th window, the starting structure is the last saved snapshot of the short simulation in the (i−1)-th window. Next, a 20 ns umbrella sampling simulation is run in each window, starting from the last structure of the short simulation in that window. The first 10 ns of the trajectory is treated as equilibration; the last 10 ns of sampling data is collected for the calculation of the PMF. The biased probability distributions along the distance r accumulated in these sampling windows are unbiased using the Weighted Histogram Analysis Method (WHAM) [39, 40] to yield the unbiased distribution and the PMF w(r). The WHAM program implemented by Grossfield [41] is used to calculate the PMF w(r). The statistical uncertainties are estimated by dividing the trajectory in each sampling window into two blocks; the difference in the results


obtained using the first half and second half of the trajectory gives an estimate of the error bar at each intermediate distance. Figure 5 shows the calculated PMF w(r) for extracting the ligand from the 5′-end pocket of the quindoline-c-MYC G-quadruplex complex. The free energy minimum in w(r) should coincide with the ligand-binding site distance found in the NMR structure of the complex. The PMF w(r) measures the free energy expended to pull the ligand out. As the ligand leaves the binding pocket, w(r) increases steadily until it levels off beyond a characteristic distance $r_{\mathrm{site}} \approx 20$ Å, which coincides with the loss of the stacking interaction between the quindoline molecule and the 5′-end G-tetrad. $r_{\mathrm{site}}$ represents the outer limit of the bound state.

The third term in Eq. (1), $k_{B}T \ln\left[\int_{\mathrm{bound}} e^{-w(r)/k_{B}T}\,dr \,/\, \left(2\pi k_{B}T/k_{r}\right)^{1/2}\right]$, has the following physical meaning: the numerator $\int_{\mathrm{bound}} e^{-w(r)/k_{B}T}\,dr$ in the logarithm measures the range of ligand translational fluctuation in the bound state, while the denominator $\left(2\pi k_{B}T/k_{r}\right)^{1/2}$ is the range of ligand fluctuation when it is harmonically restrained in bulk solution by the force constant $k_{r}$. The line integral $\int_{\mathrm{bound}} e^{-w(r)/k_{B}T}\,dr$ spans the range of r that corresponds to the bound state. Therefore the upper limit of the integration should equal $r_{\mathrm{site}}$, where the ligand-receptor interaction becomes vanishingly small, and the lower limit corresponds to the smallest r in the PMF (Fig. 5). The values of w(r) at a series of r printed by the WHAM program [41] are used to perform the numerical integration.

3. From B → unbound state: when the ligand reaches $r^{*}$ in the bulk solution, turn off all the restraints on the ligand orientation and translation. This allows the ligand to rotate freely in the bulk solution and occupy the standard volume $V_{0}$ that corresponds to the standard concentration, $V_{0} = 1/C^{\circ} = 1/(1\ \mathrm{M}) = 1661$ Å³. The corresponding free energy change equals $\Delta G^{\mathrm{bulk}}_{\mathrm{restr}}$, which is the last term in Eq. (1). $\Delta G^{\mathrm{bulk}}_{\mathrm{restr}}$ is calculated analytically as

$$\Delta G^{\mathrm{bulk}}_{\mathrm{restr}} = -k_{B}T \ln\left[\frac{C^{\circ}\, r^{*2} \sin\theta_{0} \sin\Theta_{0}\, \left(2\pi k_{B}T\right)^{3}}{8\pi^{2} \left(k_{r} k_{\theta} k_{\phi} k_{\Theta} k_{\Phi} k_{\Psi}\right)^{1/2}}\right].$$

Here $k_{\theta}$, $k_{\phi}$, $k_{\Theta}$, $k_{\Phi}$, and $k_{\Psi}$ are the harmonic force constants used to define the angular restraints ($U_{\theta}$, $U_{\phi}$, $U_{\Theta}$, $U_{\Phi}$, $U_{\Psi}$).

Table 1 shows the free energy components of $\Delta G^{\circ}_{\mathrm{bind}}$ computed according to Eq. (1) for the quindoline binding at the 5′-end of the c-MYC G-quadruplex. Here, the agreement between the calculated $\Delta G^{\circ}_{\mathrm{bind}}$ and the experimental value determined by SPR is excellent. The result shows that the absolute binding free energy cannot be simply equated with the value of the PMF $w(r^{*})$, and that other terms such as


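Table 1 Contributions to the $\Delta G^{\circ}_{\mathrm{bind}}$ of quindoline binding at the 5′-end of the c-MYC G-quadruplex DNA from the PMF approach, Eq. (1) (unit: kcal/mol)

| $\Delta G^{\mathrm{bound}}_{(\theta,\phi,\Theta,\Phi,\Psi)}$ | $w(r^{*})$ | $k_{B}T \ln\left[\int_{\mathrm{bound}} e^{-w(r)/k_{B}T}\,dr \,/\, (2\pi k_{B}T/k_{r})^{1/2}\right]$ | $\Delta G^{\mathrm{bulk}}_{\mathrm{restr}}$ | $\Delta G^{\circ}_{\mathrm{bind}}$ | $\Delta G^{\circ}_{\mathrm{bind}}$ (experiment) |
|---|---|---|---|---|---|
| 3.7 ± 0.06 | 15.1 ± 0.5 | 0.1 ± 0.05 | 10.1 | −8.8 ± 0.6 | −8.94 |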

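$\Delta G^{\mathrm{bound}}_{(\theta,\phi,\Theta,\Phi,\Psi)}$ and $\Delta G^{\mathrm{bulk}}_{\mathrm{restr}}$ are also important for the calculation of $\Delta G^{\circ}_{\mathrm{bind}}$. The result also shows that the term $k_{B}T \ln\left[\int_{\mathrm{bound}} e^{-w(r)/k_{B}T}\,dr \,/\, \left(2\pi k_{B}T/k_{r}\right)^{1/2}\right]$ makes a negligible contribution to $\Delta G^{\circ}_{\mathrm{bind}}$ and may be omitted.

2.4 Binding Free Energy Calculation Using DDM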

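Figure 3 shows the thermodynamic cycle and the intermediate states used in the DDM calculation [8, 9, 42, 43] of $\Delta G^{\circ}_{\mathrm{bind}}$, which can be written as the sum of several free energy terms [13]:

$$\Delta G^{\circ}_{\mathrm{bind}} = -\Delta G^{\mathrm{bound}}_{\mathrm{restr}} - \Delta G^{\mathrm{bound}}_{\mathrm{decoupl}} + \Delta G^{\mathrm{gas}}_{\mathrm{restr}} + \Delta G^{\mathrm{water}}_{\mathrm{decoupl}} \tag{2}$$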

Each term in Eq. (2) corresponds to a specific thermodynamic transformation in the cycle shown in Fig. 3:

1. From bound → A: this step corresponds to the first term in Eq. (2), $\Delta G^{\mathrm{bound}}_{\mathrm{restr}}$, the free energy of switching on the ligand-receptor harmonic distance restraint when the ligand is bound. The restraint has the form $U_{r} = \frac{1}{2}k_{r}\left(r - r_{\mathrm{eq}}\right)^{2}$, where $r_{\mathrm{eq}}$ is the equilibrium distance between a ligand atom A and a receptor atom a. This term is generally small and can be computed using the Zwanzig free energy perturbation formula [44], i.e.,

$$\Delta G^{\mathrm{bound}}_{\mathrm{restr}} = k_{B}T \ln \left\langle e^{U_{r}/k_{B}T} \right\rangle_{U_{r}} \tag{3}$$

where $\left\langle e^{U_{r}/k_{B}T} \right\rangle_{U_{r}}$ is the mean value of $e^{U_{r}/k_{B}T}$ recorded in a simulation of the ligand-receptor complex in the presence of the distance restraint $U_{r}$.

2. From A → B: this step corresponds to the second term in Eq. (2), $\Delta G^{\mathrm{bound}}_{\mathrm{decoupl}}$, the free energy of decoupling the distance-restrained ligand in the bound state from its surroundings, which involves turning off the intermolecular interactions (electrostatic and van der Waals interactions) involving the ligand (see Note 2). Throughout the decoupling process, the ligand is subject to the distance restraint $U_{r}$. The alchemical decoupling is done in a series of simulations controlled by the


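coupling parameter λ [14]. The electrostatic interaction is turned off first. For the quindoline-c-MYC G-quadruplex system, this is done using 11 $\lambda_{\mathrm{elec}}$ values: $\lambda_{\mathrm{elec}}$ = 0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0. ($\lambda_{\mathrm{elec}}$ = 0.0 corresponds to an electrostatically fully coupled ligand, and $\lambda_{\mathrm{elec}}$ = 1.0 to an electrostatically fully decoupled ligand.) Then, at $\lambda_{\mathrm{elec}}$ = 1.0, the van der Waals interaction is turned off using 17 $\lambda_{\mathrm{vdw}}$ values: $\lambda_{\mathrm{vdw}}$ = 0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.94, 0.985, 1.0. Thus, using thermodynamic integration (TI) [14], $\Delta G^{\mathrm{bound}}_{\mathrm{decoupl}}$ is computed as

$$\Delta G^{\mathrm{bound}}_{\mathrm{decoupl}} = \Delta G^{\mathrm{bound}}_{\mathrm{decoupl\text{-}elec}} + \Delta G^{\mathrm{bound}}_{\mathrm{decoupl\text{-}vdw}} \tag{4}$$

where

$$\Delta G^{\mathrm{bound}}_{\mathrm{decoupl\text{-}elec}} = \int_{\lambda_{\mathrm{elec}}=0}^{\lambda_{\mathrm{elec}}=1} \left\langle \frac{dU}{d\lambda_{\mathrm{elec}}} \right\rangle_{\lambda_{\mathrm{elec}}} d\lambda_{\mathrm{elec}}, \qquad \Delta G^{\mathrm{bound}}_{\mathrm{decoupl\text{-}vdw}} = \int_{\lambda_{\mathrm{vdw}}=0}^{\lambda_{\mathrm{vdw}}=1} \left\langle \frac{dU}{d\lambda_{\mathrm{vdw}}} \right\rangle_{\lambda_{\mathrm{vdw}}} d\lambda_{\mathrm{vdw}}$$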
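The integrals in Eq. (4) are evaluated numerically from the window averages of dU/dλ. The sketch below uses the trapezoidal rule on the non-uniform $\lambda_{\mathrm{vdw}}$ grid listed above; the choice of the trapezoidal rule and the synthetic, linear ⟨dU/dλ⟩ data are assumptions made for illustration (a real calculation would use the averages extracted from the simulation output of each window).

```python
# lambda_vdw grid used for the quindoline-c-MYC G-quadruplex system
LAMBDA_VDW = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.55, 0.6, 0.65,
              0.7, 0.75, 0.8, 0.85, 0.9, 0.94, 0.985, 1.0]


def thermodynamic_integration(lambdas, mean_du_dlambda):
    """Trapezoidal estimate of the integral of <dU/dlambda> over lambda.
    Works on non-uniform grids; both sequences must have the same length."""
    if len(lambdas) != len(mean_du_dlambda):
        raise ValueError("one <dU/dlambda> value is required per lambda window")
    total = 0.0
    for i in range(len(lambdas) - 1):
        width = lambdas[i + 1] - lambdas[i]
        total += 0.5 * width * (mean_du_dlambda[i] + mean_du_dlambda[i + 1])
    return total


# Hypothetical test data: <dU/dlambda> = 2*lambda, whose exact integral is 1.
synthetic_averages = [2.0 * lam for lam in LAMBDA_VDW]
estimate = thermodynamic_integration(LAMBDA_VDW, synthetic_averages)
print(f"TI estimate for the synthetic data: {estimate:.6f}")
```

The denser spacing of the grid near λ = 1 reflects where ⟨dU/dλ⟩ typically changes most rapidly; the trapezoidal rule is only as accurate as the grid is fine in those regions.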

In the GROMACS MD script, the ligand decoupling from the environment is invoked using the couple-moltype keyword [18]. It is important that a soft-core potential is used when turning off the van der Waals interactions, to avoid the problem of end-point instability [45]. In the GROMACS script, the soft-core potential is set up with the following parameters: sc-alpha = 0.5, sc-power = 1, and sc-sigma = 0.3. To achieve adequate convergence, the decoupling simulation at each $\lambda_{\mathrm{elec}}$ or $\lambda_{\mathrm{vdw}}$ is run for 20 ns. The first 10 ns of the trajectory is treated as equilibration and the last 10 ns is used to compute $\Delta G^{\mathrm{bound}}_{\mathrm{decoupl}}$. For a $\Delta G^{\circ}_{\mathrm{bind}}$ calculation, multiple (>3) independent DDM runs, with different starting velocities and slightly different starting coordinates, are carried out. The statistical error bars are estimated from the standard deviations of these independent sets of DDM simulations.

3. From B → C: this process corresponds to the third term in Eq. (2), $\Delta G^{\mathrm{gas}}_{\mathrm{restr}}$, the free energy of switching off the harmonic distance restraint $U_{r}$. Note that it is $U_{r}$ that keeps the decoupled ligand inside the receptor binding site. $\Delta G^{\mathrm{gas}}_{\mathrm{restr}}$ is evaluated analytically by

$$\Delta G^{\mathrm{gas}}_{\mathrm{restr}} = -k_{B}T \ln\left[\frac{1}{V_{0}}\left(\frac{2\pi k_{B}T}{k_{r}}\right)^{3/2}\right] \tag{5}$$

where kr is the force constant of the ligand-receptor distance restraint [9]. V0 is the standard volume.
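Equation (5) is simple enough to evaluate directly. The sketch below assumes the sign convention in which the term is positive whenever the volume sampled by the restrained ligand is smaller than $V_{0}$, which is how the 3.2 kcal/mol entry of Table 2 enters the total. The temperature of 298.15 K and the force constants passed to the function are assumptions for illustration; the chapter does not state the $k_{r}$ used for the DDM distance restraint.

```python
import math

GAS_CONSTANT = 1.987204259e-3   # kcal mol^-1 K^-1
STANDARD_VOLUME = 1660.54       # A^3 per molecule at 1 M (rounded to 1661 in the text)


def gas_restraint_free_energy(k_r, temperature=298.15):
    """Evaluate -kT ln[(1/V0) (2 pi kT / k_r)^(3/2)] in kcal/mol.

    k_r is the harmonic distance-restraint force constant in kcal mol^-1 A^-2
    and temperature is in K (298.15 K is an assumed default).
    """
    kt = GAS_CONSTANT * temperature
    restrained_volume = (2.0 * math.pi * kt / k_r) ** 1.5   # A^3
    return -kt * math.log(restrained_volume / STANDARD_VOLUME)


# Illustrative force constants (assumptions, not values from the chapter):
# 1.0 kcal mol^-1 A^-2, and 1000 kJ mol^-1 nm^-2 converted to kcal mol^-1 A^-2.
for k_r in (1.0, 1000.0 / 4.184 / 100.0):
    value = gas_restraint_free_energy(k_r)
    print(f"k_r = {k_r:.3f} kcal/mol/A^2 -> {value:.2f} kcal/mol")
```

A stiffer restraint confines the decoupled ligand to a smaller volume and therefore gives a larger value, so the term must always be computed with the same $k_{r}$ that was used in the decoupling simulations.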


4. From C → unbound state: the corresponding reversible work is given by the last term in Eq. (2), $\Delta G^{\mathrm{water}}_{\mathrm{decoupl}}$, the free energy of alchemically decoupling the free ligand from the solution, which yields the ligand solvation free energy. This term is computed in the same way as $\Delta G^{\mathrm{bound}}_{\mathrm{decoupl}}$, i.e.,

$$\Delta G^{\mathrm{water}}_{\mathrm{decoupl}} = \Delta G^{\mathrm{water}}_{\mathrm{decoupl\text{-}elec}} + \Delta G^{\mathrm{water}}_{\mathrm{decoupl\text{-}vdw}} \tag{6}$$

where $\Delta G^{\mathrm{water}}_{\mathrm{decoupl\text{-}elec}}$ and $\Delta G^{\mathrm{water}}_{\mathrm{decoupl\text{-}vdw}}$ are the free energies of turning off the electrostatic interactions and the van der Waals interactions between the ligand and the solvent environment, respectively. To compute these terms using alchemical decoupling simulations, a solvent box containing the free ligand needs to be set up in the same way as the solvated receptor-ligand system: see Simulation System Setup in Methods (Subheading 2.2). Here, since the solution contains only one ligand molecule and solvent, a smaller water box can be used. The ligand decoupling from solution converges more readily than the ligand decoupling from the complex.

Finally, for charged ligands the use of a finite periodic solvent box can affect the calculated electrostatic decoupling free energies $\Delta G^{\mathrm{bound}}_{\mathrm{decoupl\text{-}elec}}$ and $\Delta G^{\mathrm{water}}_{\mathrm{decoupl\text{-}elec}}$ [46]. To correct for this finite-size effect, a procedure developed by Rocklin and coworkers [46] can be used. Two corrections are computed, one for the ligand decoupling from the complex ($\Delta G^{\mathrm{bound}}_{\mathrm{decoupl\text{-}elec}}$) and the other for the decoupling from the solution ($\Delta G^{\mathrm{water}}_{\mathrm{decoupl\text{-}elec}}$). The total electrostatic correction for the finite-size effect is the sum of these two correction terms, denoted as $\Delta G^{\mathrm{finite\ size}}_{\mathrm{elec\ corr}}$. For the quindoline-c-MYC G-quadruplex complex, the magnitude of $\Delta G^{\mathrm{finite\ size}}_{\mathrm{elec\ corr}}$ is of the order of ~0.8 kcal/mol [13]: see Note 3. $\Delta G^{\mathrm{finite\ size}}_{\mathrm{elec\ corr}}$ includes the following contributions: periodicity-induced net-charge interactions, periodicity-induced net-charge undersolvation, discrete solvent effects, and residual integrated potential effects [46]. The calculation of the residual integrated potential requires three separate Poisson-Boltzmann (PB) calculations of the electrostatic potentials generated by different combinations of receptor and ligand charges: a complex of charged receptor and uncharged ligand in solution, a complex of uncharged receptor and charged ligand in solution, and a charged ligand in solution. These PB calculations can be carried out using the APBS program [47] or other PB programs. With the inclusion of the electrostatic finite-size correction $\Delta G^{\mathrm{finite\ size}}_{\mathrm{elec\ corr}}$, the final expression for the absolute binding free energy from DDM is written as


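Table 2 The DDM-calculated absolute binding free energy for the quindoline binding with the 5′-end of the c-MYC G-quadruplex (in kcal/mol)

| $\Delta G^{\mathrm{bound}}_{\mathrm{decoupl\text{-}elec}}$ | $\Delta G^{\mathrm{water}}_{\mathrm{decoupl\text{-}elec}}$ | $\Delta G^{\mathrm{bound}}_{\mathrm{decoupl\text{-}vdw}}$ | $\Delta G^{\mathrm{water}}_{\mathrm{decoupl\text{-}vdw}}$ | $\Delta G^{\mathrm{gas}}_{\mathrm{restr}}$ | $\Delta G^{\mathrm{finite\ size}}_{\mathrm{elec\ corr}}$ | $\Delta G^{\mathrm{bound}}_{\mathrm{restr}}$ | $\Delta G^{\circ}_{\mathrm{bind}}$ |
|---|---|---|---|---|---|---|---|
| 1.6 ± 1.2 | 2.8 ± 0.6 | 14.6 ± 1.4 | −0.09 ± 0.09 | 3.2 | −0.8 ± 0.1 | 0.07 ± 0.01 | −11.1 ± 1.3 |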



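$$\Delta G^{\circ}_{\mathrm{bind}} = -\Delta G^{\mathrm{bound}}_{\mathrm{restr}} - \Delta G^{\mathrm{bound}}_{\mathrm{decoupl\text{-}elec}} + \Delta G^{\mathrm{water}}_{\mathrm{decoupl\text{-}elec}} - \Delta G^{\mathrm{bound}}_{\mathrm{decoupl\text{-}vdw}} + \Delta G^{\mathrm{water}}_{\mathrm{decoupl\text{-}vdw}} + \Delta G^{\mathrm{finite\ size}}_{\mathrm{elec\ corr}} + \Delta G^{\mathrm{gas}}_{\mathrm{restr}} \tag{7}$$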

Table 2 shows the DDM results for the quindoline binding with the 5′-end of the c-MYC G-quadruplex. The calculated $\Delta G^{\circ}_{\mathrm{bind}}$ of −11.1 kcal/mol is in reasonable agreement with the experimental binding free energy of −8.94 kcal/mol determined from SPR [13].
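To put the calculated and experimental free energies on a more familiar scale, they can be converted to dissociation constants with the relation $\Delta G^{\circ}_{\mathrm{bind}} = kT \ln K_{D}$ from Subheading 2.1 (with $K_{D}$ expressed relative to the 1 M standard state). This is a minimal sketch; the temperature of 298.15 K is an assumption, since the temperature of the SPR measurement is not stated here.

```python
import math

GAS_CONSTANT = 1.987204259e-3   # kcal mol^-1 K^-1


def dissociation_constant(delta_g_bind, temperature=298.15):
    """K_D in mol/L from a standard binding free energy in kcal/mol,
    using Delta G = kT ln K_D with a 1 M standard state."""
    return math.exp(delta_g_bind / (GAS_CONSTANT * temperature))


# Values quoted in the text and tables (kcal/mol).
binding_free_energies = {
    "SPR experiment": -8.94,
    "PMF (Table 1)": -8.8,
    "DDM (Table 2)": -11.1,
}

for label, delta_g in binding_free_energies.items():
    k_d = dissociation_constant(delta_g)
    print(f"{label:15s} dG = {delta_g:6.2f} kcal/mol  K_D = {k_d:.2e} M")
```

Because $K_{D}$ depends exponentially on the free energy, a difference of about 1.4 kcal/mol at room temperature corresponds to roughly one order of magnitude in $K_{D}$, which is a useful yardstick when judging the agreement between the two methods and experiment.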

3 Notes

1. One prerequisite for an accurate calculation of $\Delta G^{\circ}_{\mathrm{bind}}$ using the PMF approach is that there exists a geometrical path along which the ligand can be extracted from the binding site without major steric clashes with the receptor: see the illustration in Fig. 6. A ligand-receptor steric clash during ligand extraction will artificially raise the PMF w(r) and lead to a significantly overestimated binding free energy [48]. In many cases, identifying such an obstruction-free pathway for ligand unbinding is not straightforward, and it can take some trial and error to find a low free energy pathway.

2. In the DDM calculation of the ligand decoupling from the complex and from the solution, in addition to turning off the ligand interactions with the surroundings, we also turn off all the intramolecular nonbonded interactions within the decoupled ligand, by setting the keyword couple-intramol = yes in the GROMACS script. This is because, for flexible ligands, retaining the intramolecular interactions in the decoupled ligand could cause sampling problems: the ligand can become kinetically trapped in more compact conformations because of the unscreened intramolecular electrostatic interactions. As a result, the errors in the $\Delta G^{\circ}_{\mathrm{bind}}$ calculated using a DDM setup without turning off the intramolecular nonbonded interactions in the decoupled ligand can be larger than those obtained with the intramolecular nonbonded interactions turned off [13].
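The decoupling settings mentioned in the DDM protocol and in Note 2 can be collected into one free-energy block per λ window. The sketch below generates such a block as text. It is an illustration under stated assumptions, not the authors' input files: the molecule type name LIG is hypothetical, merging the electrostatic and van der Waals stages into a single 27-state schedule is one possible layout, and the options free-energy, init-lambda-state, coul-lambdas, vdw-lambdas, couple-lambda0, and couple-lambda1 are standard GROMACS mdp options that the chapter does not mention explicitly. In GROMACS the intramolecular option is spelled couple-intramol.

```python
LAMBDA_ELEC = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
LAMBDA_VDW = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.55, 0.6, 0.65,
              0.7, 0.75, 0.8, 0.85, 0.9, 0.94, 0.985, 1.0]


def combined_schedule(elec, vdw):
    """Electrostatics are switched off first (vdW fully on); the vdW
    interactions are then switched off with the charges already off.
    The state (elec = 1, vdw = 0) is shared by both stages."""
    coul = list(elec) + [elec[-1]] * (len(vdw) - 1)
    vdw_full = [vdw[0]] * len(elec) + list(vdw[1:])
    return coul, vdw_full


def mdp_fragment(state, coul, vdw, moltype="LIG"):
    """Free-energy section of an mdp file for one lambda state.
    'LIG' is a hypothetical molecule type name for the ligand."""
    def fmt(values):
        return " ".join(f"{v:g}" for v in values)

    lines = [
        "free-energy       = yes",
        f"init-lambda-state = {state}",
        f"coul-lambdas      = {fmt(coul)}",
        f"vdw-lambdas       = {fmt(vdw)}",
        f"couple-moltype    = {moltype}",
        "couple-lambda0    = vdw-q",   # lambda = 0: ligand fully coupled
        "couple-lambda1    = none",    # lambda = 1: ligand fully decoupled
        "couple-intramol   = yes",     # also switch intramolecular nonbonded terms
        "sc-alpha          = 0.5",
        "sc-power          = 1",
        "sc-sigma          = 0.3",
    ]
    return "\n".join(lines)


coul_lambdas, vdw_lambdas = combined_schedule(LAMBDA_ELEC, LAMBDA_VDW)
print(mdp_fragment(0, coul_lambdas, vdw_lambdas))
```

One such fragment would be written per window, with init-lambda-state running from 0 to 26, and appended to the remaining run parameters for that window.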


Fig. 6 A cartoon diagram illustrates a possible scenario when using the PMF method to compute the absolute binding free energy

3. The results in Table 2 show that in the DDM calculation the magnitude of the electrostatic finite-size correction $\Delta G^{\mathrm{finite\ size}}_{\mathrm{elec\ corr}}$ is smaller than 1 kcal/mol. Moreover, for the quindoline molecule bound at the two binding sites in the c-MYC G-quadruplex (Fig. 1b) the values of this correction term are similar [13]. Therefore, if the main goal of the binding free energy calculation is to compare the binding affinities of similar ligands or of ligands binding at different sites, then it is possible to omit the $\Delta G^{\mathrm{finite\ size}}_{\mathrm{elec\ corr}}$ term in the DDM calculation of $\Delta G^{\circ}_{\mathrm{bind}}$.

4. The DDM calculation also allows the absolute binding free energy to be decomposed into physically meaningful terms to provide insights into the thermodynamic driving forces of ligand binding. The free energy decomposition can be obtained by rearranging Eq. (7) to write

$$\Delta G^{\circ}_{\mathrm{bind}} = -\Delta G^{\mathrm{bound}}_{\mathrm{restr}} + \Delta\Delta G_{\mathrm{elec}} + \Delta\Delta G_{\mathrm{vdw}} + \Delta G^{\mathrm{finite\ size}}_{\mathrm{elec\ corr}} + \Delta G^{\mathrm{gas}}_{\mathrm{restr}} \tag{8}$$

where

$$\Delta\Delta G_{\mathrm{elec}} = -\Delta G^{\mathrm{bound}}_{\mathrm{decoupl\text{-}elec}} + \Delta G^{\mathrm{water}}_{\mathrm{decoupl\text{-}elec}} \tag{9}$$

$$\Delta\Delta G_{\mathrm{vdw}} = -\Delta G^{\mathrm{bound}}_{\mathrm{decoupl\text{-}vdw}} + \Delta G^{\mathrm{water}}_{\mathrm{decoupl\text{-}vdw}} \tag{10}$$

Here, $\Delta\Delta G_{\mathrm{elec}}$ and $\Delta\Delta G_{\mathrm{vdw}}$ can be interpreted as the contributions of the effective electrostatic interactions and the nonpolar interactions to the total binding free energy, respectively. For the 2:1 quindoline-c-MYC G-quadruplex complex [49], the binding free energy decomposition is performed in order to understand the physical reasons for the difference in


Table 3 The DDM-calculated electrostatic and nonpolar contributions to the binding of quindoline at the two sites in the c-MYC G-quadruplex

| Binding site | $\Delta\Delta G_{\mathrm{elec}}$ | $\Delta\Delta G_{\mathrm{vdw}}$ | $\Delta G^{\circ}_{\mathrm{bind}}$ |
|---|---|---|---|
| 5′-end | 1.2 (0.8) | −14.7 (1.4) | −11.2 (1.3) |
| 3′-end | −0.5 (1.7) | −11.4 (1.1) | −9.6 (0.6) |

Note that the other terms in Eq. (8) are omitted

the binding affinities of quindoline at the two binding sites (Table 3). The electrostatic interaction $\Delta\Delta G_{\mathrm{elec}}$ favors the binding at the 3′-end by about 1.7 kcal/mol, which is consistent with the presence of the intermolecular hydrogen bond QuiN1H-T23O4 at the 3′ site (Fig. 1). However, this is more than offset by the larger difference in the nonpolar free energy contribution $\Delta\Delta G_{\mathrm{vdw}}$, which favors the binding at the 5′ site by about 3.3 kcal/mol. Structural analysis of the binding site geometry suggests that the more favorable $\Delta\Delta G_{\mathrm{vdw}}$ for the 5′ site correlates with the greater hydrophobic enclosure at this site [13, 49].

5. Lastly, we discuss the strengths and weaknesses of the DDM and PMF methods in computing $\Delta G^{\circ}_{\mathrm{bind}}$. The main advantage of DDM is that it does not rely on the existence of a geometrical pathway along which the ligand can be extracted from the binding pocket without steric clashes with the receptor. As a result, the setup of a DDM calculation can be more automatic, allowing it to be readily applied to a diverse set of ligands. The PMF method requires carefully identifying a suitable pulling pathway free of major steric obstructions from the receptor. This makes it difficult to apply the PMF method in an automatic way to a diverse set of ligands. For some ligand/binding pocket combinations, an obstruction-free pulling pathway may not exist. One advantage of the PMF method is that, when calculating the absolute binding free energy for charged ligands, it does not require computing the correction $\Delta G^{\mathrm{finite\ size}}_{\mathrm{elec\ corr}}$ to account for the electrostatic finite-size effect. The PMF method may also have an advantage in treating large and flexible ligands, for which the decoupling calculation in DDM can be prohibitively slow to converge.
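The site comparison discussed in Note 4 is plain arithmetic on the entries of Table 3 and can be reproduced as follows. This is a small illustrative sketch; the signs of the entries are taken as implied by the discussion in Note 4 (negative values are favorable), and the dictionary layout and function name are choices made here.

```python
# Entries of Table 3 in kcal/mol; negative values favor binding.
TABLE_3 = {
    "5'-end": {"elec": 1.2, "vdw": -14.7, "bind": -11.2},
    "3'-end": {"elec": -0.5, "vdw": -11.4, "bind": -9.6},
}


def site_difference(component):
    """Value at the 3'-end minus the value at the 5'-end.
    A negative result means the component favors the 3'-end site."""
    return TABLE_3["3'-end"][component] - TABLE_3["5'-end"][component]


for component in ("elec", "vdw", "bind"):
    print(f"{component:4s}: 3'-end minus 5'-end = {site_difference(component):+.1f} kcal/mol")

# Part of the total not accounted for by the two listed contributions,
# i.e. the terms of Eq. (8) that Table 3 omits.
for site, row in TABLE_3.items():
    remainder = row["bind"] - (row["elec"] + row["vdw"])
    print(f"{site}: remaining terms = {remainder:+.1f} kcal/mol")
```

The electrostatic term differs by 1.7 kcal/mol in favor of the 3′-end and the nonpolar term by 3.3 kcal/mol in favor of the 5′-end, while the omitted terms come to the same value at both sites, so the difference in affinity is carried by the two listed contributions.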


Table 4 The workflow used in the PMF approach

| Task | Required software | Computing time |
|---|---|---|
| Set up the simulation system: create the topology of the DNA-ligand complex; add metal ions and solvate the system in a water box | AMBER16, AMBER Tools, ACPYPE, GROMACS | 1–2 h |
| Equilibrate the system: energy minimization, followed by thermalization: 2 ns NVT with the heavy atoms restrained using a force constant of 1000 kJ/(mol·nm²); 2 ns NPT with the heavy atoms restrained using a force constant of 1000 kJ/(mol·nm²); 5 ns NPT with the heavy atoms restrained using a force constant of 100 kJ/(mol·nm²) | GROMACS | 12–24 h on a single multicore workstation, e.g., 20 CPUs |
| Set up angular restraints (UΘ, UΦ, UΨ) and (Uθ, Uϕ): identify three receptor atoms a, b, and c, and three ligand atoms A, B, and C, to define the coordinate system (Fig. 4). Generate the Euler angle (Θ, Φ, Ψ) restraints and the two restraints on θ and ϕ in the GROMACS topology file (see Methods, Subheading 2) | The Discovery Studio Visualizer from Biovia, http://accelrys.com/resource-center/downloads/freeware/index.html | 1 h |
| Turn on the (UΘ, UΦ, UΨ) restraints in FEP simulations using 12 λ values and compute the associated free energy change $\Delta G^{\text{bound}}_{(\Theta,\Phi,\Psi)}$ (see Methods, Subheading 2) | GROMACS | 3–4 h on a multiple-node Linux cluster with 96 CPUs; each λ window uses a minimum of 8 CPUs |
| Turn on the (Uθ, Uϕ) restraints in FEP simulations using 12 λ values and compute the associated free energy change $\Delta G^{\text{bound}}_{(\theta,\phi)}$ | GROMACS | 3–4 h on a multiple-node Linux cluster with 96 CPUs; each λ window uses a minimum of 8 CPUs |
| Set up sub-directories, each corresponding to an umbrella-sampling window. Run umbrella sampling in these directories to pull the bound ligand out of the binding pocket (see Methods, Subheading 2) | GROMACS | 2–3 days on a multiple-node Linux cluster; each window uses a minimum of 8 CPUs |
| Run the WHAM program to compute the PMF w(r) | WHAM [41] | A few minutes |
| Run MATLAB to calculate the integral $\int_{\text{bound}} e^{-w(r)/k_B T}\,dr$ | MATLAB | 5 min |
| Substitute the quantities calculated above, and the value of $\Delta G^{\text{bulk}}_{\text{restr}}$ (see Methods, Subheading 2), into Eq. (1) to estimate ΔGbind | N/A | N/A |
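The integration step of Table 4 does not depend on MATLAB. As a minimal sketch, the Boltzmann-weighted integral of the PMF over the bound region can be evaluated with the trapezoidal rule in plain Python. The function name, the synthetic harmonic PMF, the temperature of 300 K, and the choice of kcal/mol and nm as units are all assumptions made for illustration; a real calculation would read w(r) from the WHAM output and restrict r to the bound region defined in the Methods.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def boltzmann_integral(r, w, temperature=300.0):
    """Trapezoidal estimate of the integral of exp(-w(r)/kBT) dr.

    r: increasing list of distances (any length unit)
    w: PMF values at those distances, in kcal/mol
    The result carries the length unit of r.
    """
    if len(r) != len(w) or len(r) < 2:
        raise ValueError("r and w must have the same length (at least 2)")
    kt = K_B * temperature
    f = [math.exp(-wi / kt) for wi in w]
    return sum(0.5 * (f[i] + f[i + 1]) * (r[i + 1] - r[i])
               for i in range(len(r) - 1))


if __name__ == "__main__":
    # Synthetic example only: a harmonic well centred at 0.5 nm.
    r = [0.3 + 0.001 * i for i in range(401)]          # 0.3 to 0.7 nm
    w = [0.5 * 200.0 * (x - 0.5) ** 2 for x in r]      # kcal/mol
    print(f"integral over the bound region = {boltzmann_integral(r, w):.4f} nm")
```

The value returned here is only the one-dimensional integral named in the table; how it enters Eq. (1) should be taken from the Methods.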

Table 5 DDM workflow (a)

| Task | Required software | Estimated computing time and hardware |
|---|---|---|
| Set up the simulation system, in the same way as in the PMF workflow | AMBER16, AMBER Tools, ACPYPE, GROMACS | 1–2 h |
| Set up the ligand-receptor distance restraint: identify a ligand atom A and a receptor atom a, and generate the harmonic distance restraint on $r \equiv |\mathbf{r}_{aA}|$ | The Discovery Studio Visualizer from Biovia, http://accelrys.com/resource-center/downloads/freeware/index.html | A few minutes |
| Equilibrate the system in the presence of the distance restraint. Energy minimization followed by thermalization: 2 ns NVT with heavy atoms restrained using a force constant of 1000 kJ/(mol·nm²); 2 ns NPT with heavy atoms restrained using a force constant of 1000 kJ/(mol·nm²); 5 ns NPT with heavy atoms restrained using a force constant of 100 kJ/(mol·nm²) | GROMACS | 12–24 h on a single multicore workstation, e.g., 20 CPUs |
| Set up sub-directories, each corresponding to a λelec window. Run decoupling simulations in these directories to turn off the electrostatic interaction between the bound ligand and its surroundings and compute $\Delta G^{\text{bound}}_{\text{decoupl-elec}}$ (see Methods, Subheading 2) | GROMACS | 12–24 h on a multiple-node Linux cluster; each λ window uses a minimum of 8 CPUs |
| Go to the λelec = 0.0 sub-directory of the above decoupling run, calculate the mean value of $e^{U_r/k_B T}$ recorded in the decoupling simulation trajectory, and use Eq. (3) to compute $\Delta G^{\text{bound}}_{\text{restr}}$ (see Methods, Subheading 2) | GROMACS, Excel | A few minutes |
| Set up sub-directories, each corresponding to a λvdw window. Run decoupling simulations in these directories to turn off the van der Waals interactions between the bound ligand and its surroundings and compute $\Delta G^{\text{bound}}_{\text{decoupl-vdw}}$ (see Methods, Subheading 2) | GROMACS | 12–24 h on a multiple-node Linux cluster; each λ window uses a minimum of 8 CPUs |
| Solvate the ligand molecule in a solvent box and run decoupling simulations to compute $\Delta G^{\text{water}}_{\text{decoupl-elec}}$ and $\Delta G^{\text{water}}_{\text{decoupl-vdw}}$, in the same way as for $\Delta G^{\text{bound}}_{\text{decoupl-elec}}$ and $\Delta G^{\text{bound}}_{\text{decoupl-vdw}}$ | GROMACS | 12 h on a multiple-node Linux cluster; each λ window uses 8 CPUs |
| Substitute the quantities calculated above, and $\Delta G^{\text{gas}}_{\text{restr}}$ from Eq. (5), into Eq. (2) to estimate ΔGbind | N/A | N/A |

(a) Note that the procedure to estimate the electrostatic correction for the finite-size effect, $\Delta G^{\text{finite size}}_{\text{elec corr}}$, is omitted. The details for computing this term are described in reference [46]
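The exponential average in Table 5 is listed as an Excel step, but it is easy to script. The sketch below assumes the restraint energies U_r have already been extracted from the λelec = 0.0 trajectory into a list in kcal/mol (GROMACS reports kJ/mol, so a unit conversion may be needed) and that the temperature is 300 K. The function name is hypothetical. The second return value is k_BT ln⟨e^{U_r/k_BT}⟩, which under standard free energy perturbation is the free energy of imposing the restraint; whether Eq. (3) uses this sign convention is an assumption that should be checked against the Methods.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def restraint_free_energy(u_restr, temperature=300.0):
    """Return (<exp(U_r/kBT)>, kBT * ln<exp(U_r/kBT)>).

    u_restr: restraint energies in kcal/mol sampled with the restraint on.
    The average is evaluated with the log-sum-exp trick so that a few
    large energies do not overflow the exponential.
    """
    if not u_restr:
        raise ValueError("u_restr must contain at least one sample")
    kt = K_B * temperature
    x = [u / kt for u in u_restr]
    x_max = max(x)
    log_mean = x_max + math.log(sum(math.exp(v - x_max) for v in x) / len(x))
    return math.exp(log_mean), kt * log_mean


if __name__ == "__main__":
    # Made-up restraint energies, for illustration only.
    samples = [0.05, 0.30, 0.12, 0.80, 0.02, 0.45]
    mean_exp, dg = restraint_free_energy(samples)
    print(f"<exp(U_r/kBT)> = {mean_exp:.3f}, kBT ln<...> = {dg:.3f} kcal/mol")
```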

Acknowledgments

The author thanks Dr. Danzhou Yang, Dr. Piotr Cieplak, and Dr. Lauren Wickstrom for helpful discussions.

References

1. Siddiqui-Jain A, Grand CL, Bearss DJ, Hurley LH (2002) Direct evidence for a G-quadruplex in a promoter region and its targeting with a small molecule to repress c-MYC transcription. Proc Natl Acad Sci 99(18):11593–11598. https://doi.org/10.1073/pnas.182256799
2. Yang D, Okamoto K (2010) Structural insights into G-quadruplexes: towards new anticancer drugs. Future Med Chem 2(4):619–646. https://doi.org/10.4155/fmc.09.172
3. Neidle S (2016) Quadruplex nucleic acids as novel therapeutic targets. J Med Chem 59(13):5987–6011. https://doi.org/10.1021/acs.jmedchem.5b01835
4. Hou JQ, Chen SB, Tan JH, Luo HB, Li D, Gu LQ, Huang ZS (2012) New insights from molecular dynamic simulation studies of the multiple binding modes of a ligand with G-quadruplex DNA. J Comput Aided Mol Des 26(12):1355–1368. https://doi.org/10.1007/s10822-012-9619-1
5. Dixon IM, Lopez F, Tejera AM, Esteve JP, Blasco MA, Pratviel G, Meunier B (2007) A G-quadruplex ligand with 10000-fold selectivity over duplex DNA. J Am Chem Soc 129(6):1502–1503. https://doi.org/10.1021/ja065591t
6. Hou J-Q, Chen S-B, Zan L-P, Ou T-M, Tan J-H, Luyt LG, Huang Z-S (2015) Identification of a selective G-quadruplex DNA binder using a multistep virtual screening approach. Chem Commun 51(1):198–201. https://doi.org/10.1039/C4CC06951J
7. Ferreira RS, Simeonov A, Jadhav A, Eidam O, Mott BT, Keiser MJ, McKerrow JH, Maloney DJ, Irwin JJ, Shoichet BK (2010) Complementarity between a docking and a high-throughput screen in discovering new cruzain inhibitors. J Med Chem 53(13):4891–4905. https://doi.org/10.1021/jm100488w
8. Gilson M, Given J, Bush B, McCammon J (1997) The statistical-thermodynamic basis for computation of binding affinities: a critical review. Biophys J 72(3):1047–1069. https://doi.org/10.1016/S0006-3495(97)78756-3
9. Boresch S, Tettinger F, Leitgeb M, Karplus M (2003) Absolute binding free energies: a quantitative approach for their calculation. J Phys Chem B 107(35):9535–9551. https://doi.org/10.1021/jp0217839
10. Hamelberg D, McCammon JA (2004) Standard free energy of releasing a localized water molecule from the binding pockets of proteins: double-decoupling method. J Am Chem Soc 126(24):7683–7689. https://doi.org/10.1021/ja0377908
11. Woo HJ, Roux B (2005) Calculation of absolute protein-ligand binding free energy from computer simulations. Proc Natl Acad Sci 102(19):6825–6830. https://doi.org/10.1073/pnas.0409005102
12. Deng N, Forli S, He P, Perryman A, Wickstrom L, Vijayan SKV, Tiefenbrunn T, Stout CD, Gallicchio E, Olson AJ, Levy RM (2014) Distinguishing binders from false positives by free energy calculations: fragment screening against the flap site of HIV protease. J Phys Chem B 119(3):976–988. https://doi.org/10.1021/jp506376z
13. Deng N, Wickstrom L, Cieplak P, Lin C, Yang D (2017) Resolving the ligand-binding specificity in c-MYC G-quadruplex DNA: absolute binding free energy calculations and SPR experiment. J Phys Chem B 121(46):10484–10497. https://doi.org/10.1021/acs.jpcb.7b09406
14. Leach A (2001) Molecular modelling: principles and applications (2nd edition). Pearson, New York, NY
15. Lemkul J GROMACS Tutorial. http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin/gmx-tutorials/complex/index.html
16. Hess B, Kutzner C, van der Spoel D, Lindahl E (2008) GROMACS 4: algorithms for highly efficient, load-balanced, and scalable molecular simulation. J Chem Theory Comput 4(3):435–447. https://doi.org/10.1021/ct700301q
17. Pronk S, Pall S, Schulz R, Larsson P, Bjelkmar P, Apostolov R, Shirts MR, Smith JC, Kasson PM, van der Spoel D, Hess B, Lindahl E (2013) GROMACS 4.5: a high-throughput and highly parallel open source molecular simulation toolkit. Bioinformatics 29(7):845–854. https://doi.org/10.1093/bioinformatics/btt055
18. Abraham M, Hess B, van der Spoel D, Lindahl E (2016) GROMACS Reference Manual
19. Case DA, Cerutti DS, Cheatham TE III, Darden TA, Duke RE, Giese TJ, Gohlke H, Goetz AW, Greene D, Homeyer N, Izadi S, Kovalenko A, Lee TS, LeGrand S, Li P, Lin C, Liu J, Luchko T, Luo R, Mermelstein D, Merz KM, Monard G, Nguyen H, Omelyan I, Onufriev A, Pan F, Qi R, Roe DR, Roitberg A, Sagui C, Simmerling CL, Botello-Smith WM, Swails J, Walker RC, Wang J, Wolf RM, Wu X, Xiao L, York DM, Kollman PA (2017) AMBER 2017. University of California, San Francisco
20. Phillips JC, Braun R, Wang W, Gumbart J, Tajkhorshid E, Villa E, Chipot C, Skeel RD, Kale L, Schulten K (2005) Scalable molecular dynamics with NAMD. J Comput Chem 26:1781–1802. https://doi.org/10.1002/jcc.20289
21. Brooks BR, Mackerell AD Jr, Nilsson L, Petrella RJ, Roux B, Won Y, Archontis G, Bartels C, Boresch S, Caflisch A, Caves L, Cui Q, Dinner AR, Feig M, Fischer S, Gao J, Hodoscek M, Im W, Kuczera K, Lazaridis T, Ma J, Ovchinnikov V, Paci E, Pastor RW, Post CB, Pu JZ, Schaefer M, Tidor B, Venable RM, Woodcock HL, Wu X, Yang W, York DM, Karplus M (2009) CHARMM: the biomolecular simulation program. J Comput Chem 30:1545–1614
22. Votapka LW, Jagger BR, Heyneman AL, Amaro RE (2017) SEEKR: simulation enabled estimation of kinetic rates, a computational tool to estimate molecular kinetics and its application to trypsin-benzamidine binding. J Phys Chem B 121(15):3597–3606. https://doi.org/10.1021/acs.jpcb.6b09388
23. Shan Y, Kim ET, Eastwood MP, Dror RO, Seeliger MA, Shaw DE (2011) How does a drug molecule find its target binding site? J Am Chem Soc 133(24):9181–9183. https://doi.org/10.1021/ja202726y
24. Perrot P (1998) A to Z of thermodynamics. Oxford University Press, Oxford
25. Morris GM, Huey R, Lindstrom W, Sanner MF, Belew RK, Goodsell DS, Olson AJ (2009) AutoDock4 and AutoDockTools4: automated docking with selective receptor flexibility. J Comput Chem 30(16):2785–2791. https://doi.org/10.1002/jcc.21256
26. Friesner RA, Banks JL, Murphy RB, Halgren TA, Klicic JJ, Mainz DT, Repasky MP, Knoll EH, Shelley M, Perry JK, Shaw DE, Francis P, Shenkin PS (2004) Glide: a new approach for rapid, accurate docking and scoring. 1. Method and assessment of docking accuracy. J Med Chem 47(7):1739–1749. https://doi.org/10.1021/jm0306430
27. Halgren TA, Murphy RB, Friesner RA, Beard HS, Frye LL, Pollard WT, Banks JL (2004) Glide: a new approach for rapid, accurate docking and scoring. 2. Enrichment factors in database screening. J Med Chem 47(7):1750–1759. https://doi.org/10.1021/jm030644s
28. Lin C, Wu G, Wang K, Onel B, Sakai S, Shao Y, Yang D (2018) Molecular recognition of the hybrid-2 human telomeric G-quadruplex by epiberberine: insights into conversion of telomeric G-quadruplex structures. Angew Chem Int Ed Eng. https://doi.org/10.1002/anie.201804667
29. Luo D, Mu Y (2015) All-atomic simulations on human telomeric G-quadruplex DNA binding with thioflavin T. J Phys Chem B 119(15):4955–4967. https://doi.org/10.1021/acs.jpcb.5b01107
30. Mulholland K, Wu C (2016) Binding of telomestatin to a telomeric G-quadruplex DNA probed by all-atom molecular dynamics simulations with explicit solvent. J Chem Inf Model 56(10):2093–2102. https://doi.org/10.1021/acs.jcim.6b00473
31. Pérez A, Marchán I, Svozil D, Sponer J, Cheatham TE, Laughton CA, Orozco M (2007) Refinement of the AMBER force field for nucleic acids: improving the description of α/γ conformers. Biophys J 92(11):3817–3829. https://doi.org/10.1529/biophysj.106.097782
32. Zgarbova M, Sponer J, Otyepka M, Cheatham TE 3rd, Galindo-Murillo R, Jurecka P (2015) Refinement of the sugar-phosphate backbone torsion beta for AMBER force fields improves the description of Z- and B-DNA. J Chem Theory Comput 11(12):5723–5736. https://doi.org/10.1021/acs.jctc.5b00716
33. ACPYPE. https://code.google.com/archive/p/acpype/
34. Wang J, Wolf RM, Caldwell JW, Kollman PA, Case DA (2004) Development and testing of a general amber force field. J Comput Chem 25(9):1157–1174. https://doi.org/10.1002/jcc.20035
35. Jakalian A, Bush BL, Jack DB, Bayly CI (2000) Fast, efficient generation of high-quality atomic charges. AM1-BCC model: I. Method. J Comput Chem 21(2):132–146. https://doi.org/10.1002/(SICI)1096-987X(20000130)21:2<132::AID-JCC5>3.0.CO;2-P
36. Jorgensen WL, Chandrasekhar J, Madura JD, Impey RW, Klein ML (1983) Comparison of simple potential functions for simulating liquid water. J Chem Phys 79(2):926. https://doi.org/10.1063/1.445869
37. Joung IS, Cheatham TE (2008) Determination of alkali and halide monovalent ion parameters for use in explicitly solvated biomolecular simulations. J Phys Chem B 112(30):9020–9041. https://doi.org/10.1021/jp8001614
38. Kästner J (2011) Umbrella sampling. Wiley Interdiscip Rev: Comput Mol Sci 1(6):932–942. https://doi.org/10.1002/wcms.66
39. Gallicchio E, Andrec M, Felts AK, Levy RM (2005) Temperature weighted histogram analysis method, replica exchange, and transition paths. J Phys Chem B 109(14):6722–6731. https://doi.org/10.1021/jp045294f
40. Kumar S, Rosenberg JM, Bouzida D, Swendsen RH, Kollman PA (1992) THE weighted histogram analysis method for free-energy calculations on biomolecules. I. The method. J Comput Chem 13(8):1011–1021. https://doi.org/10.1002/jcc.540130812
41. Grossfield A WHAM. http://membrane.urmc.rochester.edu/sites/default/files/wham/doc.pdf
42. Deng Y, Roux B (2006) Calculation of standard binding free energies: aromatic molecules in the T4 lysozyme L99A mutant. J Chem Theory Comput 2(5):1255–1273. https://doi.org/10.1021/ct060037v
43. Deng N, Zhang P, Cieplak P, Lai L (2011) Elucidating the energetics of entropically driven protein–ligand association: calculations of absolute binding free energy and entropy. J Phys Chem B 115(41):11902–11910. https://doi.org/10.1021/jp204047b
44. Zwanzig RW (1954) High-temperature equation of state by a perturbation method. I. Nonpolar gases. J Chem Phys 22(8):1420–1426. https://doi.org/10.1063/1.1740409
45. Beutler TC, Mark AE, van Schaik RC, Gerber PR, van Gunsteren WF (1994) Avoiding singularities and numerical instabilities in free energy calculations based on molecular simulations. Chem Phys Lett 222(6):529–539. https://doi.org/10.1016/0009-2614(94)00397-1
46. Rocklin GJ, Mobley DL, Dill KA, Hünenberger PH (2013) Calculating the binding free energies of charged species based on explicit-solvent simulations employing lattice-sum methods: an accurate correction scheme for electrostatic finite-size effects. J Chem Phys 139(18):184103. https://doi.org/10.1063/1.4826261
47. Baker NA, Sept D, Joseph S, Holst MJ, McCammon JA (2001) Electrostatics of nanosystems: application to microtubules and the ribosome. Proc Natl Acad Sci 98(18):10037–10041. https://doi.org/10.1073/pnas.181342398
48. Deng N, Cui D, Zhang BW, Xia J, Cruz J, Levy R (2018) Comparing alchemical and physical pathway methods for computing the absolute binding free energy of charged ligands. Phys Chem Chem Phys 20(25):17081–17092. https://doi.org/10.1039/c8cp01524d
49. Dai J, Carver M, Hurley LH, Yang D (2011) Solution structure of a 2:1 quindoline–c-MYC G-quadruplex: insights into G-quadruplex-interactive small molecule drug design. J Am Chem Soc 133(44):17673–17680. https://doi.org/10.1021/ja205646q

Chapter 11

Electrophoretic Mobility Shift Assay and Dimethyl Sulfate Footprinting for Characterization of G-Quadruplexes and G-Quadruplex-Protein Complexes

Buket Onel, Guanhui Wu, Daekyu Sun, Clement Lin, and Danzhou Yang

Abstract

DNA G-quadruplexes are globular nucleic acid secondary structures which occur throughout the human genome under physiological conditions. There is accumulating evidence supporting G-quadruplex involvement in a number of important aspects of genome function, including transcription, replication, and genomic stability, and that protein and enzyme recognition of G-quadruplexes may represent a key event in regulating physiological or pathological pathways. Two important techniques for studying G-quadruplexes and their protein interactions are the electrophoretic mobility shift assay (EMSA) and the dimethyl sulfate (DMS) footprinting assay. EMSA, one of the most sensitive and robust methods for studying DNA-protein interactions, can be used to determine the binding parameters and relative affinities of a protein for the G-quadruplex. DMS footprinting is a powerful assay for the initial characterization of G-quadruplexes, which can be used to deduce the guanine bases involved in the formation of G-tetrads under physiological salt conditions. In G-quadruplex-protein complexes, DMS footprinting can also reveal important information on protein contacts and regional changes in the DNA G-quadruplex upon protein binding. In this chapter, we provide a detailed protocol for the EMSA and DMS footprinting assays for characterization of G-quadruplexes and G-quadruplex-protein complexes. Expected outcomes and references to extensions of the method are also discussed.

Key words Electrophoretic mobility shift assay (EMSA), Dimethyl sulfate (DMS) footprinting, DNA, G-quadruplex, Protein, Electrophoresis

Danzhou Yang and Clement Lin (eds.), G-Quadruplex Nucleic Acids: Methods and Protocols, Methods in Molecular Biology, vol. 2035, https://doi.org/10.1007/978-1-4939-9666-7_11, © Springer Science+Business Media, LLC, part of Springer Nature 2019

1 Introduction

G-quadruplexes are formed in single-stranded guanine-rich nucleic acid sequences and are assembled by Hoogsteen hydrogen bonding of four guanines arranged within a planar tetrad, further stabilized by monovalent cations such as K+ and Na+ [1]. G-quadruplex formation has been observed in synthetic oligonucleotide sequences derived from the human genome, such as telomeres and gene promoter regions, and more recently there have been a significant number of advances in G-quadruplex detection that support the existence of G-quadruplex structures in the genome of human cells, notably in gene regulatory regions and hot spots of genomically unstable regions of human chromosomes [2]. The association of G-quadruplexes with telomere biology, transcription regulation, and genomic instability strongly suggests the existence of proteins that modulate G-quadruplex conformation or use G-quadruplexes as a platform for protein-protein interactions. The biological relevance of G-quadruplexes is linked to several G-quadruplex-interacting proteins: shelterin complex proteins are known to function in telomere homeostasis [3], and other proteins are involved in G-quadruplex unfolding, such as helicases [4], or in G-quadruplex stabilization, such as nucleolin [5]. In this chapter, we provide a detailed protocol for the electrophoretic mobility shift assay (EMSA) and the dimethyl sulfate (DMS) footprinting assay for investigating G-quadruplexes and protein-G-quadruplex interactions. Expected outcomes and references to extensions of the method are also discussed.

The established EMSA experiment has been applied to investigate many G-quadruplex-protein complexes [6–10]. EMSA was originally described in 1981 by Fried and Crothers [11] and Garner and Revzin [12] and became a widely used, robust method for obtaining crucial information on protein-nucleic acid interactions [6, 13, 14]. The underlying principle of the assay is that molecules of different size and charge show different electrophoretic mobilities when resolved on a native polyacrylamide or agarose gel [15]; the complex formed between nucleic acid and protein generally migrates more slowly than the free nucleic acid, and DNA-protein complexes with lifetimes that exceed the duration of the electrophoresis can be detected as distinct bands (Fig. 1).

Fig. 1 The principle of EMSA. Binding of protein to DNA or RNA increases the size of the species and slows the mobility of the complex in the native polyacrylamide gel

This technique offers several advantages. The assay is highly sensitive, especially when radioisotope-labeled nucleic acids are used, allowing it to be performed with nanomolar concentrations of protein and nucleic acid [16]. If high sensitivity is not required, probes labeled with covalent or non-covalent fluorophores [17, 18] or with biotin [19] have also been used successfully. Although the assay is often used for qualitative purposes, under appropriate conditions apparent equilibrium constants for binding reactions can be obtained [20]. Another major advantage of the method is that the assay works well with both purified proteins and crude cell extracts [15]. Moreover, nucleic acids of various sizes and structures (single-stranded, double-stranded, quadruplex, triplex, hairpin, circular DNA) are compatible with the assay and are commonly used [16].

The DNA structure within the DNA-protein complex can be further characterized using the DMS footprinting method. DMS footprinting is a powerful biochemical method because it measures the relative reactivities of individual nucleotides within a DNA strand toward DMS, which can provide essential information on the structural characteristics of the DNA [21]. DMS is one of the oldest probes for footprinting; it was first introduced for DNA mapping in 1977 [22], allowing detection of methylation by DMS at N7 of guanine (N7-methylguanine) and N3 of adenine (N3-methyladenine) [21, 22]. The glycosidic bond of a methylated purine becomes unstable and breaks easily under alkaline conditions at high temperature (>90 °C), which results in cleavage of the sugar-phosphate backbone [22, 23] (Fig. 2). Because guanine bases are methylated fivefold faster than adenine bases, breakage at guanine residues results in darker bands than for adenine when 32P end-labeled DNA fragments generated by DMS are resolved on a polyacrylamide gel [22, 24].

Fig. 2 The proposed mechanism for DMS methylation and subsequent cleavage reaction

DMS footprinting has various applications. It can provide essential information on the existence of specific structural elements such as secondary structures. Moreover, it can be used to confirm high-resolution structures or to identify regions that do not match predictions from the structure, which may suggest that the region is dynamic or exists in multiple conformations [25–30]. DMS footprinting can also give significant information about complexes in the presence and absence of bound small-molecule ligands and proteins, allowing identification of the regions of the structure that change upon ligand binding [31, 32].

DMS footprinting experiments are particularly useful and routinely used for characterization of G-quadruplexes and elucidation of the DNA structure within a DNA-protein complex. G-quadruplexes are formed by the association of four guanine bases through Hoogsteen hydrogen bonding, which is stabilized by monovalent cations such as K+ and Na+ [1, 33–35]. In contrast to guanine residues in single- and double-stranded DNA, a guanine N7 in a G-tetrad is involved in Hoogsteen hydrogen bonding and is thereby protected from methylation in the DMS footprinting experiment and from subsequent cleavage under alkaline conditions, thus enabling the identification of guanines involved in G-tetrad formation (Fig. 2) [23, 25, 29, 34, 36]. In addition to the initial characterization of G-quadruplexes, the DMS footprinting technique can also be used to investigate G-quadruplex-protein interactions. Further protection can be expected from direct contacts between protein and G-quadruplex as well as from local changes within the G-quadruplex induced by protein binding [6–10].
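As an illustration of the quantitative use of EMSA mentioned above, the fraction of DNA shifted into the complex band can be related to an apparent dissociation constant. The sketch below is not part of the protocol: it assumes a single-site binding model with the protein in large excess over the labeled DNA (so that free protein is approximately total protein), and the function names and example numbers are hypothetical.

```python
def fraction_bound(protein_nm, kd_nm):
    """Fraction of DNA in complex for single-site binding, protein in excess."""
    if protein_nm < 0 or kd_nm <= 0:
        raise ValueError("concentrations must be non-negative and Kd positive")
    return protein_nm / (kd_nm + protein_nm)


def apparent_kd(protein_nm, theta):
    """Apparent Kd (nM) from one measured bound fraction theta (0 < theta < 1)."""
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie strictly between 0 and 1")
    return protein_nm * (1.0 - theta) / theta


if __name__ == "__main__":
    # Hypothetical titration: bound fraction expected for an assumed Kd of 200 nM.
    for p in (50, 100, 200, 400, 800):
        print(f"[protein] = {p:4d} nM -> fraction bound = {fraction_bound(p, 200.0):.2f}")
```

In practice the bound fraction would come from band intensities across a protein titration, and a fit over all lanes is preferable to a single-point estimate.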

2 Materials

Prepare all solutions using filtered Milli-Q water. Unless stated otherwise, the reagents should be molecular biology grade or higher.

2.1 Denaturing-PAGE

1. TBE electrophoresis buffer (5×): 0.45 M Tris–HCl (pH 8.0), 0.45 M boric acid, 10 mM EDTA.
2. Urea (Thermo Fisher Scientific, cat. no. BP169-212).
3. 40% Acrylamide/Bis Solution 29:1 (Bio-Rad, cat. no. 1610146) (see Note 1).
4. Ammonium persulfate (APS) (Bio-Rad, cat. no. 1610700) (see Note 2).
5. N,N,N′,N′-Tetramethylethylenediamine (TEMED) (Fisher Scientific, cat. no. 15524010) (see Note 3).
6. Alkaline gel loading dye (1×): 80% (by volume) formamide, 10 mM NaOH, 0.005% Bromophenol Blue (w/v).
7. Formamide (99.5%) (Thermo Fisher Scientific, cat. no. BP-228100).
8. Bromophenol Blue (Sigma Chemicals, cat. no. B5525).
9. Sodium Hydroxide (White Pellets) (Thermo Fisher Scientific, cat. no. BP350-212) (see Note 4).

2.2 Ethanol Precipitation of DNA

1. Sodium Acetate (3 M, pH = 7) (Sigma Aldrich, cat. no. S2404).
2. Ethanol, Absolute (200 Proof) (Fisher Scientific, cat. no. BP2818500).
3. Corning™ Costar™ Centrifugal Devices: Spin-X™ 0.45 μm CA (Thermo Fisher Scientific, cat. no. 07200388).

2.3 Preparation of 32P-End-Labeled Oligonucleotides

1. 10× Polynucleotide Kinase (PNK) buffer (700 mM Tris–HCl pH = 7.6, 100 mM MgCl2, 50 mM DTT) (New England Biolabs, cat. no. B0201S).
2. Adenosine 5′-triphosphate, [γ-32P] ([γ-32P]ATP), triethylammonium salt, 3000 Ci/mmol, 10 mCi/mL, EasyTide (PerkinElmer, cat. no. BLU502A250UC) (see Note 5).
3. T4 Polynucleotide Kinase (PNK), 10,000 U/mL (New England Biolabs, cat. no. M0201S).
4. Micro Bio-Spin™ P6 Gel Columns, Tris Buffer (Bio-Rad, cat. no. 7326201).
5. UltraPure™ DNase/RNase-Free Distilled Water (Thermo Fisher Scientific, cat. no. 10977015).

2.4 Protein-DNA Complex Preparation

1. Binding Buffer: 20 mM Tris–HCl, pH = 7.6, 200 mM KCl.
2. Glycerol (Molecular Biology) (Thermo Fisher Scientific, cat. no. BP229-1).
3. Bovine Serum Albumin (BSA) (Thermo Fisher Scientific, cat. no. BP9704-100).
4. Dithiothreitol (DTT) (GoldBio, cat. no. DTT50).

2.5 Native-PAGE Preparation

1. TBE Electrophoresis Buffer (5×): 0.45 M Tris–HCl (pH 8.0), 0.45 M boric acid, 10 mM EDTA. Filter using a 0.22 μm filter. Store at 25 °C.
2. 40% Acrylamide/Bis Solution 29:1 (Bio-Rad, cat. no. 1610146) (see Note 1).
3. Ammonium Persulfate (APS) (Bio-Rad, cat. no. 1610700) (see Note 2).
4. N,N,N′,N′-Tetramethylethylenediamine (TEMED) (Fisher Scientific, cat. no. 15524010) (see Note 3).
5. DNA loading buffer (10×): 50% glycerol by volume, 0.005% bromophenol blue (w/v). Store at −20 °C.
6. XAR-2 Film (Individually Wrapped) (VWR, cat. no. IB1651579).
7. Plastic food wrap.

2.6 Methylation and Cleavage Reaction

1. 50 μg/mL calf thymus DNA (Thermo Fisher Scientific, cat. no. 15633019).
2. Dimethyl Sulfate solution (Sigma-Aldrich, cat. no. D186309).
3. Ethanol, Absolute (200 Proof) (Fisher Scientific, cat. no. BP2818500).
4. 2-Mercaptoethanol (Sigma-Aldrich, cat. no. M6250).
5. Glycerol (Molecular Biology) (Thermo Fisher Scientific, cat. no. BP229-1).
6. Piperidine, ReagentPlus, 99% (Sigma-Aldrich, cat. no. 104094).
7. Whatman® cellulose chromatography papers (Sigma-Aldrich, cat. no. Z270822).

3 Methods

3.1 Preparation of 32P-End-Labeled G-Quadruplex Samples

3.1.1 Purification of a Desired Full-Length Oligonucleotide Using a Denaturing PAGE

1. Prepare a gel cassette for a 20 cm × 16 cm × 1.6 mm gel and make 60 mL of denaturing 12% polyacrylamide gel solution by mixing 12 mL TBE buffer (5×), 18 mL of 40% acrylamide/bis-acrylamide (29:1), and 30 g urea, then adding water to 60 mL (see Note 6).
2. After adding 350 μL freshly prepared 10% APS and 20 μL TEMED to the urea gel solution, fill the gel cassette with this solution using a propipetter and promptly insert the 1.6 mm comb. Let the gel polymerize for about 30 min to 1 h (see Note 6).
3. After the gel has polymerized, remove the comb gently and wash the wells with 1× TBE buffer using a Pasteur pipette.
4. Place the gel cassette into the electrophoresis apparatus (Labrepco Model V15-17) and clamp it down firmly. Pour 1× TBE buffer into the top and bottom chambers (see Note 7).
5. Pre-run the gel at a constant 200 V for 20–30 min.
6. Combine the DNA sample with alkaline gel loading dye and heat the sample at 95 °C for 5 min prior to loading into the wells (see Note 8).
7. While the samples are heating, rinse out the wells using a Pasteur pipette to remove the urea completely; otherwise the urea will prevent the sample from entering the gel efficiently. Load the samples into the gel.
8. Run the gel at a constant 200 V for about 2.5 h, until the bromophenol blue dye has migrated about halfway down.
9. Carefully pry the glass plates apart so that the gel remains attached to one plate.
10. Place the plastic-wrapped gel over a UV fluorescent silica-coated TLC plate and illuminate the gel from above with a shortwave UV illuminator (see Note 9).
11. Cut the desired DNA bands out with a razor blade. Chop the gel slice for each band into pieces and transfer them to a 1.5 or 2 mL microfuge tube. Crush the gel into very small pieces, almost like a powder, using a pipette tip, then add sufficient elution buffer containing 50 mM sodium acetate with 2 mM EDTA (pH 7) to cover the gel pieces. Rotate the tubes overnight at 4 °C. The next day, remove the gel pieces using Corning Costar centrifuge filters, following the manufacturer's instructions. For a second round of recovery, add more elution buffer to the gel, rotate for another 2 h, and filter again, combining the filtrates.
12. Add 4 volumes of absolute (200-proof) ethanol to the eluted sample. Mix well and store at −20 °C overnight or at −80 °C for ~2 h. Centrifuge the sample for ~20–30 min at 12,000 × g at 4 °C. Carefully remove the ethanol, and gently rinse the recovered DNA pellet once with 250 μL of ice-cold 75% ethanol. Remove the ethanol and leave the tubes open at room temperature to air-dry (see Note 10).
13. Resuspend the DNA in Milli-Q water and determine the concentration spectrophotometrically (see Note 11).

3.1.2 Preparation of 32P-End-Labeled Oligonucleotides

1. To end-label the DNA, mix 2 μL 10 μM oligonucleotide, 2 μL 10× polynucleotide kinase buffer, 3 μL [γ-32P]ATP, 12 μL Milli-Q water, and 1 μL PNK (polynucleotide kinase) (see Notes 5, 12–15).
2. Mix the samples gently, spin down the tube contents, and incubate at 37 °C for 30–40 min.
3. Run each sample through a Micro Bio-Spin column to remove any unreacted [γ-32P]ATP, following the manufacturer's recommendations: centrifuge the column at 1000 × g for 2 min to remove the buffer, then load the complete reaction mixture onto the center of the column and elute by centrifugation at 1000 × g for 4 min (see Notes 16 and 17).

3.1.3 G-Quadruplex Folding of 32P-Labeled Oligonucleotides

1. Set up the G-quadruplex folding reactions for a final target concentration of 50 nM for each oligonucleotide in 20 mM Tris–HCl (pH = 7.4) buffer, with and without 140 mM KCl (the DNA prepared in the absence of KCl will serve as a control) (see Note 18).
2. Anneal the DNA by incubating the samples at >90 °C for 5 min, then cool slowly to room temperature in the heating block (see Notes 17 and 19).
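The volumes quoted in Subheading 3.1 follow from the dilution relation C1·V1 = C2·V2. The sketch below checks the 12% denaturing gel recipe and the labeling reaction, and shows how a 50 nM folding reaction might be set up. The function names are hypothetical, and the folding example assumes complete recovery of the oligonucleotide from the spin column and an arbitrary 100 μL final volume, neither of which is specified in the protocol.

```python
def stock_volume(c_stock, c_final, v_final):
    """Volume of stock needed so that c_stock * V = c_final * v_final."""
    if c_stock <= 0 or c_final > c_stock:
        raise ValueError("stock must be positive and at least as concentrated as the target")
    return c_final * v_final / c_stock


def final_concentration(c_stock, v_stock, v_total):
    """Concentration after diluting v_stock of stock into a total volume v_total."""
    if v_total <= 0 or v_stock > v_total:
        raise ValueError("total volume must be positive and not smaller than the stock volume")
    return c_stock * v_stock / v_total


if __name__ == "__main__":
    # Denaturing gel (step 1 of 3.1.1): 60 mL of 12% acrylamide in 1x TBE.
    print("40% acrylamide stock (mL):", stock_volume(40, 12, 60))
    print("5x TBE stock (mL):", stock_volume(5, 1, 60))
    # Labeling reaction (3.1.2): 2 uL of 10 uM oligonucleotide in 20 uL total.
    labeled_um = final_concentration(10, 2, 20)
    print("oligonucleotide in labeling reaction (uM):", labeled_um)
    # Folding at 50 nM in an assumed 100 uL, assuming full column recovery.
    print("labeled DNA to add (uL):", stock_volume(labeled_um * 1000, 50, 100))
```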

3.2 EMSA Assay

3.2.1 Complex Preparation

1. Prepare binding buffer on ice, containing 20 mM Tris–HCl (pH 7.4), 200 mM KCl, 2 mM EDTA, 0.15 mg/mL BSA, and 2 mM DTT (see Notes 20 and 21).
2. Prepare the protein at the desired concentration (706 nM) in binding buffer (see Notes 22 and 23).
3. Place 1 μL of the 10 nM folded G-quadruplex sample in a 1.5 mL microfuge tube in a shielding box on ice. After the DNA sample has chilled on ice for 5–10 min, add 8.5 μL of the 706 nM protein solution prepared in binding buffer. Equilibrate the binding reaction mixture for 1 h in the shielding box on ice (Table 1; see Note 24).
4. Place 1 μL of the 10 nM folded G-quadruplex sample in a 1.5 mL microfuge tube and add 8.5 μL of binding buffer. This sample will be used as a control (Table 1).
5. Add glycerol to 2% to each binding reaction mixture prior to loading on the gel (see Note 25).

Table 1 Composition of binding reactions for the assay shown in Fig. 3

Components | a Volume (μL) | b Volume (μL)
10 nM 32P-DNA G4 in 20 mM Tris–HCl (pH 7.6), 100 mM KCl | 1 | 1
706 nM protein in 20 mM Tris–HCl (pH 7.4), 200 mM KCl, 2 mM EDTA, 0.15 mg/mL BSA, and 2 mM DTT | – | 8.5
Binding buffer: 20 mM Tris–HCl (pH 7.4), 200 mM KCl, 2 mM EDTA, 0.15 mg/mL BSA, and 2 mM DTT | 8.5 | –
40% glycerol | 0.5 | 0.5
Vtotal | 10 | 10
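As a quick consistency check on Table 1, the final concentrations in each 10 μL reaction follow from the stock concentration times the volume fraction. The short sketch below uses only the volumes and stock concentrations listed in the table; the dictionary layout and function name are illustrative choices, not part of the protocol.

```python
# Final concentrations in the 10 uL binding reactions of Table 1.
# Each entry: component -> (stock concentration, volume in uL).
# Units: DNA and protein in nM, glycerol in % (v/v).

REACTIONS = {
    "a (free G4 control)": {
        "DNA": (10.0, 1.0),
        "protein": (706.0, 0.0),   # replaced by 8.5 uL of binding buffer
        "glycerol": (40.0, 0.5),
    },
    "b (G4 + protein)": {
        "DNA": (10.0, 1.0),
        "protein": (706.0, 8.5),
        "glycerol": (40.0, 0.5),
    },
}

TOTAL_VOLUME_UL = 10.0


def final_concentrations(components, total_volume_ul=TOTAL_VOLUME_UL):
    """Dilute each stock by its volume fraction of the total reaction."""
    return {name: stock * vol / total_volume_ul for name, (stock, vol) in components.items()}


if __name__ == "__main__":
    for lane, components in REACTIONS.items():
        print(lane, final_concentrations(components))
```

With these inputs the DNA ends up at 1 nM, which agrees with the 1 × 10⁻⁹ M quoted in the caption of Fig. 3, the protein at about 600 nM in reaction b, and glycerol at 2%.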

3.2.2 Preparation of the Native PAGE

1. Clean the large and small gel plate pairs of the 20 cm × 16 cm × 1.6 mm gel cast, the spacers, and the comb with methanol and 70% ethanol using Kimwipes to remove any debris that might interfere with gel polymerization.
2. Prepare the polymerization mixture by mixing 6 mL of TBE buffer (5×) and 6 mL of 40% acrylamide/bisacrylamide (29:1) and adding water to 60 mL. The final concentration of acrylamide is 4% (see Notes 26 and 27).
3. Thoroughly mix in 400 μL of freshly made 10% APS and 25 μL of TEMED right before you are ready to cast the gel (see Note 28).
4. Pour the polymerizing mixture between the glass plates slowly, avoiding bubbles (see Note 29).
5. Gently insert the 1.6 mm, 10-well comb and clamp it to the plate from the top. Let the gel polymerize for about 30 min to 1 h.
6. After the gel has polymerized, remove the comb gently and wash the wells with 0.5× TBE buffer using a Pasteur pipette.
7. Place the gel plate into the electrophoresis apparatus and clamp it down firmly. Pour chilled 0.5× TBE buffer into the top and bottom chambers (see Notes 26 and 30).
8. Pre-run the gel at 20 V constant voltage for 20–30 min at 4 °C (see Notes 20 and 31).
9. Rinse out the wells using a Pasteur pipette.
10. Load 5 μL samples into the wells of the gel. Load dye only (the same amount as in the sample wells) into the near-end wells of the gel so that the gel runs evenly (see Note 32).
11. Run the gel for 3 min at 150 V, followed by 1 h at a constant 80 V at 4 °C (see Note 33).
12. End the electrophoresis, take out the spacers first, and then pry from the side to remove the gel from the plates.
13. Put a sheet of Whatman 1 filter paper on top of the gel and lift the end closest to you with the gel attached to it.
14. Lay the paper, gel side up, on the radiation shield pad and cover the gel with plastic wrap. Cut off extra material (see Note 34).
15. Transfer the wrapped gel to the gel dryer (Bio-Rad, Model 583), sandwiching it between two sheets of filter paper (see Note 35).
16. Make sure the water in the vacuum control reservoir is between min and max (closer to max) and start the vacuum. Check that the entire area is under vacuum by running your fingers along the groove around the edges. Dry the gel for 1 h.
17. Take the dried gel out and put it in a PhosphorImager cassette. Expose overnight or over the weekend.
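The gel recipe in step 2 is again a C1·V1 = C2·V2 calculation. The sketch below reproduces it for a 60 mL gel from 40% acrylamide/bisacrylamide and 5× TBE stocks; the function name and the returned dictionary are illustrative assumptions, and APS and TEMED are left out because they are added in microliter amounts just before casting.

```python
# Stock volumes for a polyacrylamide gel solution of a given final percentage.
# Stocks assumed as in the protocol: 40% acrylamide/bisacrylamide and 5x TBE.

def gel_recipe(acrylamide_percent, tbe_x, total_ml=60.0,
               acrylamide_stock_percent=40.0, tbe_stock_x=5.0):
    """Return mL of acrylamide stock, TBE stock, and water for the gel solution."""
    acrylamide_ml = acrylamide_percent * total_ml / acrylamide_stock_percent
    tbe_ml = tbe_x * total_ml / tbe_stock_x
    water_ml = total_ml - acrylamide_ml - tbe_ml
    if water_ml < 0:
        raise ValueError("stocks are too dilute for the requested final concentrations")
    return {"acrylamide_ml": acrylamide_ml, "tbe_ml": tbe_ml, "water_ml": water_ml}


if __name__ == "__main__":
    # 4% native gel in 0.5x TBE, as used for the EMSA.
    print(gel_recipe(acrylamide_percent=4.0, tbe_x=0.5))
```

For the 4% gel in 0.5× TBE this gives 6 mL of each stock and 48 mL of water, matching step 2. The same function can be called with 8% and 1× TBE for a gel of the kind used for native purification in the footprinting assay; for urea-containing denaturing gels the water volume is only a nominal upper bound, because the urea occupies part of the final volume.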


Fig. 3 The AS1411 G-quadruplex binding to human nucleolin protein. Samples contained 1 × 10⁻⁹ M DNA (5′-GGTGGTGGTGGTTGTGGTGGTGGTGG-3′); free 32P-labeled AS1411 G-quadruplex (a) and AS1411 G4/nucleolin complex (b) resolved on a 4% w/v polyacrylamide gel cast and run in 0.5× TBE buffer. The binding buffer contained 20 mM Tris (pH 7.6 at 4 °C), 200 mM KCl, 2 mM dithiothreitol, and 0.15 mg/mL bovine serum albumin

18. Detect the signals using a Storm PhosphorImager (Fig. 3).
19. Quantify the bands using ImageQuant 5.1 software from Amersham Biosciences.

3.3 DMS Footprinting Assay

3.3.1 Complex Preparation

1. Mix the folded and unfolded G-quadruplex samples from step 2 of Subheading 3.1.3 with the protein in a shielding box placed on ice (see Note 20). To ensure that the DNA binds quantitatively to the protein, the protein should be present at a concentration well above the Kd and in excess over the DNA.
2. Incubate the reaction mixture in the shielding box on ice to allow complete complex formation between the protein and the G-quadruplex (see Note 24).
3. Include two control reactions without protein, one for the folded and one for the unfolded G-quadruplex.

3.3.2 DMS Methylation and Native Gel Purification of Methylated DNAs and Protein-DNA Complexes

1. Methylate the DNA and DNA-protein complex samples, prepared in the presence and absence of KCl, by adding DMS to 0.5% together with 1 μg of calf thymus DNA, and incubate the methylation mixture for 7 min at room temperature (see Note 36).
2. Include a total of four control reactions without DMS: folded and unfolded G-quadruplex samples, each with and without protein (see Note 36, Fig. 4c).
3. Add β-mercaptoethanol to 0.07% and glycerol to 2% to the experimental samples as well as to the non-DMS control samples (see Note 37).
4. Prepare a gel cassette for a 20 cm × 16 cm × 1.6 mm gel, and make 60 mL of 8% nondenaturing polyacrylamide gel solution by mixing 12 mL of TBE buffer (5×), 12 mL of 40% acrylamide/bisacrylamide (19:1), and 36 mL of water (see Notes 6, 26, and 27).
5. Add 350 μL of freshly prepared 10% APS and 20 μL of TEMED to the gel solution, pour the gel, and insert the 1.6 mm comb. Let the gel polymerize for about 30 min to 1 h (see Notes 6 and 29).


Fig. 4 (a) General schematic diagram of a DMS footprinting experiment for G-quadruplexes and protein-G-quadruplex complexes. DNA bases in the given sequence are color coded: red = guanine, blue = thymine, green = adenine. (b) Separation of species on a native PAGE gel. Binding of protein to the DNA increases the size of the species and slows the mobility of the complexes in the native polyacrylamide gel. (c) Representative data for DMS chemical footprinting of a G-quadruplex sequence in the presence of 140 mM K+ (lane 7), 0 mM K+ (lane 3), protein and 140 mM K+ (lane 8), and protein and 0 mM K+ (lane 4). Lanes 1, 2, 5, and 6 serve as control experiments in the absence of DMS

6. Remove the comb and rinse out the wells with 1× TBE buffer.
7. Place the gel cassette into the electrophoresis apparatus. Pour 1× TBE buffer into the top and bottom chambers (see Notes 7 and 30).
8. Load the samples onto the gel and run at 200 V for DNA samples and at 100 V for protein-DNA complexes. The run should take 2–3 h (see Notes 21, 33, and 38).
9. Remove the back plate and place plastic wrap over the gel. Use about 4 min of exposure time for the Kodak XAR-2 film (see Note 39).
10. Locate the DNA and complex bands within the gel by autoradiography.
11. Cut the desired band out of the gel using a razor blade and transfer each sample into a 1.5-mL microfuge tube. Follow steps 11 and 12 in Subheading 3.1 for recovery of the DNA (see Note 40).
12. Dissolve the pellet in 50 μL of 10% piperidine in 10 mM Tris–HCl (pH 7.6) buffer and heat the samples at 90 °C for 16 min to cleave the methylated DNA (see Notes 41 and 42).
13. Spin the samples down and then transfer them to the Speedvac for 1 h for complete drying (see Note 42).
14. Add 50 μL of water to each sample pellet with vigorous vortexing, then return the samples to the Speedvac for 30–60 min to dry them completely again. Repeat the wash by resuspending each sample in 50 μL of water, vortexing very well, and drying again for 30–90 min until completely dry.


3.3.3 Separation of Cleavage Products on Denaturing PAGE

1. Set up a gel cassette for a 30 cm × 30 cm × 0.4 mm gel and prepare 60 mL of denaturing 16% polyacrylamide gel solution by mixing 12 mL of TBE buffer (5×), 24 mL of 40% acrylamide/bis-acrylamide (29:1), and 30 g of urea, then adding water to 60 mL total volume (see Notes 6 and 43).
2. Add 350 μL of 10% APS and 20 μL of TEMED to the gel solution, then fill the gel cassette slowly with this solution using a propipetter, to minimize the formation of bubbles, and promptly insert the 0.4 mm comb (see Note 6).
3. Allow the gel to polymerize for 30 min to 1 h.
4. Pour 1× TBE buffer into the lower and upper chambers of the electrophoresis apparatus (see Note 7).
5. Remove the comb and rinse out the wells with 1× TBE buffer (see Note 44).
6. Pre-run the gel at 1600 V, 45 mA, 300 W for 30 min. With the Owl Separations apparatus S4S, the temperature should rise to about 40–50 °C.
7. Rinse out the wells using a Pasteur pipette to remove the urea completely.
8. Resuspend the piperidine-treated DNA sample in alkaline gel loading dye to a final concentration of 30,000 cpm/10 μL. Heat the samples at 95 °C for 5 min and place them directly on ice prior to loading into wells.
9. Load 10,000 cpm/3 μL of sample into the wells and run the gel at 1800 V with a maximal current of 35 mA (see Note 45).
10. After the desired resolution is obtained, end the electrophoresis run. Working as quickly as possible, remove the spacers first, then pry from the sides to break the gel from the plates.
11. Immediately, before the gel has time to cool, put a sheet of Whatman® cellulose chromatography paper on top of the gel and, moving quickly, roll the paper back at a corner, encouraging the edge of the gel to follow the paper. When enough of the gel has been lifted off the plate onto the paper, remove the remainder of the gel from the plate by carefully rolling the paper the rest of the way off the plate, avoiding tears.
12. Cover the gel with plastic wrap. Cut off extra material and dry the gel under vacuum in a gel dryer at 80 °C for 1 h (see Note 35).
13. Place the dried gel in a PhosphorImager cassette. Expose to a PhosphorImager screen overnight or over the weekend.
14. Detect the signals using a Storm PhosphorImager.
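Steps 8 and 9 fix the radioactivity per volume (30,000 cpm/10 μL, that is 3,000 cpm/μL), so the volume of loading dye follows from the counts recovered for each sample. A minimal sketch is given below; the 150,000 cpm input is a made-up example value, and the function names are hypothetical.

```python
# Resuspension volume for a target of 30,000 cpm per 10 uL, and cpm per lane.

TARGET_CPM_PER_UL = 30_000 / 10   # 3,000 cpm/uL, from step 8


def resuspension_volume_ul(total_cpm, target_cpm_per_ul=TARGET_CPM_PER_UL):
    """Volume of loading dye (uL) that brings the sample to the target cpm/uL."""
    if total_cpm <= 0:
        raise ValueError("total_cpm must be positive")
    return total_cpm / target_cpm_per_ul


def cpm_loaded(load_volume_ul, cpm_per_ul=TARGET_CPM_PER_UL):
    """Counts loaded per lane for a given loading volume."""
    return load_volume_ul * cpm_per_ul


if __name__ == "__main__":
    total = 150_000  # hypothetical counts measured for one piperidine-treated sample
    print("resuspend in", resuspension_volume_ul(total), "uL")
    print("a 3 uL load carries", cpm_loaded(3.0), "cpm")
```

At 3,000 cpm/μL a 3 μL load carries 9,000 cpm, which is consistent with the roughly 10,000 cpm per lane given in step 9. Equal counts per lane make band intensities easier to compare across lanes.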

3.3.4 Analysis and Anticipated Results

To draw accurate conclusions from the results, include all controls in the assay and treat all reactions identically in the presence and absence of protein, KCl, and DMS. For the unfolded G-quadruplex, discrete bands should be observed for the fragments produced by cleavage at each guanine position (Fig. 4c, lane 3). The top band of the gel, representing the full-length product, can be measured as a reference point (Fig. 4c). The reference point should be more intense in the absence of DMS than in the reactions that include DMS. When the DNA is folded, the N7 of a guanine in a G-tetrad is involved in Hoogsteen hydrogen bonding and is protected from methylation and subsequent cleavage (Fig. 4c, lane 7). Hence, the reference point is expected to be more intense in the folded reaction than in the unfolded reaction. Similarly, protein binding creates additional contacts and protects nearby residues from DMS modification relative to the unbound control reaction (Fig. 4c, lanes 7 and 8). Note also that folding and/or protein binding may produce over-cleaved products, because the additional contacts can leave some regions of the DNA more exposed to solution or constrain the structure in an orientation that is more favorable to the DMS reaction. Footprinting assays are mostly analyzed qualitatively. However, ImageQuant software enables manual boxing of individual bands and provides side-by-side analysis of the bands of treated and untreated samples. The data can be normalized to make the interpretation more accurate, by either of two methods: dividing the intensity of each band by the average of all measured bands in a given lane, or dividing the intensity of each band by the intensity of the reference point.
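The two normalization schemes described above can be sketched in a few lines. The intensities below are invented numbers used only to show the arithmetic; in practice they would be the boxed band volumes exported from the quantification software, and the function names are hypothetical.

```python
# Two ways to normalize band intensities within one lane of a footprinting gel.

def normalize_to_lane_mean(band_intensities):
    """Divide each band by the average of all measured bands in the lane."""
    mean = sum(band_intensities) / len(band_intensities)
    return [value / mean for value in band_intensities]


def normalize_to_reference(band_intensities, reference_intensity):
    """Divide each band by the reference point (the full-length top band)."""
    if reference_intensity <= 0:
        raise ValueError("reference intensity must be positive")
    return [value / reference_intensity for value in band_intensities]


if __name__ == "__main__":
    # Hypothetical cleavage-band intensities for one lane, plus its top band.
    bands = [2.0, 4.0, 6.0]
    top_band = 8.0
    print(normalize_to_lane_mean(bands))            # [0.5, 1.0, 1.5]
    print(normalize_to_reference(bands, top_band))  # [0.25, 0.5, 0.75]
```

After normalization, the same guanine position can be compared between lanes (for example unfolded versus folded, or free versus protein-bound); a lower normalized value at a position indicates protection at that guanine.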

4 Notes

1. Acrylamide and bis-acrylamide are neurotoxic. Wear personal protective equipment to avoid exposure. Store refrigerated at 4 °C and protect from light.
2. Ammonium persulfate acts as a catalyst for the copolymerization of acrylamide and bis-acrylamide gels. The powder should be stored in a desiccator at room temperature. It decays slowly in solution, so it is better to prepare a fresh solution on the day it is used.
3. TEMED is an essential catalyst for polyacrylamide gel polymerization. It should be stored tightly sealed at 4 °C.
4. Special care is required to prepare a solution of sodium hydroxide. Solid pellets should be handled carefully. The solution must be prepared in a ventilated fume hood to provide containment. Dissolution of NaOH pellets in water is highly exothermic; solutions of sodium hydroxide should be prepared by slowly adding the sodium hydroxide pellets to water while avoiding excess heat accumulation. An ice bath can be used to chill the solution after each addition.
5. Exposure to β-radiation and secondary X-rays from 32P is hazardous. Safe handling of this isotope is essential. Exercise maximum precautions and use shielding to prevent exposure to or spilling of 32P. Radioactive waste from the columns should be checked using a Geiger counter and entered in the radiation records. Radioactive waste should be disposed of properly.
6. Clean the glass plates (gel casting apparatus) using ethanol/methanol and wipe dry. The plates should be clamped together most of the way. Before adding APS and TEMED to the solution, mix the solution thoroughly and allow it to settle for a minute or so. Add APS and TEMED right before casting the gel, mix the solution very gently by swirling, avoiding any air bubbles, and carefully pour the gel solution between the plates. Avoid bubble formation in the gel while pouring. Gentle tapping on the plate with a fist can help direct the flow of the gel solution to avoid trapping bubbles.
7. Make sure to cover the bottom end of the plate with 1× TBE buffer and add 1× buffer to the top of the gel apparatus until it covers the top of the gel.
8. DMS footprinting oligonucleotides can be designed with the addition of T7 bases at both ends of the sequence to increase footprinting resolution at the ends and to provide an extended DNA environment for the G-quadruplex-forming sequence. However, it is essential to confirm that the T flanking ends do not influence the G-quadruplex structure. The quality of the G-quadruplex can easily be checked prior to the experiment by performing a simple 1D NMR experiment and observing the chemical shift region between 10.5 and 12 ppm, which is characteristic of the imino protons of guanines involved in the G-tetrad and connected by Hoogsteen hydrogen bonding [1]. Purified and desalted custom DNA/RNA oligonucleotide probes (20–80 bp) can be purchased directly from commercial suppliers. We routinely synthesize and purify DNA oligonucleotides in our laboratory using β-cyanoethylphosphoramidite solid-phase chemistry (Applied Biosystems Expedite 8909) as described previously [37]. In brief, the synthesized oligonucleotides are eluted from the column with a 50%:50% mixture of 40% methylamine:ammonia, heated for 10 min at 65 °C, purified on reverse-phase Micropure II columns (BioSearch Technologies), and subjected to sequential dialysis against 10 mM NaOH, water, 150 mM NaCl, and water before lyophilization.


9. The UV radiation used in transilluminators is harmful to both the skin and the eyes. Use appropriate PPE for the hazard: UV face shield and goggles. Never view the UV lamp directly. Keep the exposure time to a minimum because UV can damage DNA. To decrease the exposure time, mark the locations of the DNA bands on the plastic wrap under UV light with the silica plate, then turn off the UV before cutting the bands.
10. Remove the ethanol while avoiding contact with the pellet so that the maximum possible amount of salt is removed.
11. DNA oligonucleotides can easily be quantified by UV/Vis spectroscopy at 260 nm using their calculated extinction coefficients.
12. 5′-end radiolabeling of nucleic acid probes is commonly used and provides a high level of sensitivity. T4 polynucleotide kinase (PNK) catalyzes the transfer of the gamma phosphate (32P) of ATP to the 5′-hydroxyl terminus of DNA or RNA. Ammonium ions are strong inhibitors of PNK enzymes; thus, G-quadruplex-forming nucleic acids should not be dissolved or precipitated in ammonia-containing buffers before PNK treatment. We have also observed low labeling efficiency in our experiments if the nucleic acid is dissolved in phosphate buffer before PNK treatment.
13. Alternatively, probes labeled with covalent or non-covalent fluorophores [17, 18] or with biotin [19, 38] can be used.
14. If working with RNA, it is essential to ensure that the working environment is RNase-free. Wearing latex laboratory gloves (changed frequently) and using nuclease/RNase-free tips, reagents, and microfuge tubes are necessary common practices to prevent RNase contamination.
15. For a lower quantity of labeled DNA, all materials can be scaled down proportionately.
16. All recovered samples should be well over 500,000 cpm if fresh 32P-ATP is used.
17. Radiolabeled nucleic acid samples can be kept at 4 °C for 2 weeks in a radioactive-storage refrigerator. In our experience, storage for longer than 2 weeks leads to degradation of the nucleic acids.
18. The cation type influences G-quadruplex formation. For instance, G-quadruplexes formed in human telomeres are structurally polymorphic, with two equilibrating hybrid-type structures in K+ solution [39, 40] and a basket-type structure in Na+ solution [41]. Thus, the cation in the G-quadruplex folding buffer and binding buffer should be chosen correctly for the intended G-quadruplex system.


19. Most of the time, annealing of nucleic acids helps induce appropriate G-quadruplex formation. Typically, the samples are heated to 90 °C and cooled slowly to room temperature on the heating block. If the samples are to be annealed, alkaline conditions in the buffer should be avoided to prevent phosphodiester backbone cleavage. Whether annealing is required can be determined empirically, and the quality of folding can be checked by a 1D NMR experiment.
20. Special care and precautions are required when protein is mixed with the DNA. If the addition results in any cloudiness, the binding reaction should be optimized so that the protein stays in solution. Conditions that give complete binding of the DNA G-quadruplex to the protein can be established in EMSA experiments prior to performing DMS footprinting reactions. The binding buffer composition should be optimized for the specific protein-DNA interaction and should be at physiological conditions for good buffering capacity as well as biological relevance. Various buffers can be used in the binding reactions, such as Tris-based, HEPES, Bis–Tris, and phosphate buffers [16]. Adjusting the pH of the binding and electrophoresis buffers is a common practice for manipulating the mobilities of protein-DNA complexes [42]. Protein charge can significantly affect the resolution of the protein-DNA complex. Proteins that are neutral or positively charged display reduced mobility on the native gel, and negatively charged proteins may have electrophoretic mobility comparable to that of free DNA. Mono- and divalent salts can be added to the binding reaction to stabilize complexes [43]. Since the samples will be subjected to electrophoresis, the conductivity of the samples should not be excessive. The monovalent salt concentration should be in the range 1 mM < [M+X−] < 300 mM to match the conductivity of the sample to that of the electrophoresis buffer [20]. Carriers such as BSA (bovine serum albumin) [44, 45] and poly(dI-dC) [46] can be added to abolish any nonspecific interactions within the complex. Nonionic detergents in binding reactions have also been reported to increase the solubility of the protein and the stability of the complexes [47]. If a cell or nuclear extract is used instead of purified protein, protease, nuclease, and phosphatase inhibitors can be included in the binding reaction [48]. DTT or 2-Mercaptoethanol (200 DNA frames were counted).

3.3 Preparation of the GQ/i-Motif Double-Stranded DNA in the DNA Frame

The preassembled dsDNAs containing a GQ-forming DNA strand were incorporated into the DNA frame at a lower annealing temperature (Fig. 2). The sample was purified by gel filtration, and the assembled structures were confirmed by AFM imaging.

1. Assemble the i-motif bridging strand (bottom strand), containing the i-motif strand, the toehold-protecting strand, and two connecting scaffold strands (connection c, d), by annealing from 85 °C to 15 °C at a rate of 1.0 °C/min using a thermal cycler.
2. Assemble the GQ bridging strand (top strand), containing the GQ strand and two connecting scaffold strands (connection a, b), by annealing from 85 °C to 15 °C at a rate of 1.0 °C/min using a thermal cycler.
3. Assemble the 50 nM GQ bridging strand (5 equiv.) into the DNA frame (connection a, b; bottom side) by heating at 40 °C for 10 min and then cooling to 15 °C at a rate of 0.5 °C/min using a thermal cycler.
4. Assemble the 20 nM i-motif bridging strand (2 equiv.) into the DNA frame (connection c, d; top side) by heating at 30 °C for 6 min and then cooling to 12 °C at a rate of 0.4 °C/min using a thermal cycler.
5. Purify the sample by gel filtration chromatography.
6. To prepare the GQ/i-motif dsDNA, remove the toehold-protecting sequence on the bottom side by adding the release strand (10 equiv.), which is fully complementary to the toehold-protecting sequence; incubate at 25 °C for 15 min and then cool directly to 4 °C.
7. Desalt the sample twice with a centrifugal filter at rt for 10 min using buffer containing 10 mM Tris–HCl (pH 7.6), 1 mM EDTA, and 12.5 mM MgCl2 to remove K+, and incubate the sample at rt for 1 h.
8. Add 28 μL of the buffer for AFM observation to the desalted sample (2 μL) and incubate at rt for 1 h to form the duplex of the GQ-containing strand and the i-motif-containing strand.

Observation of G-quadruplex Using DNA Origami and AFM


9. Observe the assembled structures in all steps using HS-AFM as described in Subheading 3.4.
10. Calculate the yield of dsDNA formation and separation by manually counting the connected dsDNAs (X-shape) and separated dsDNAs (parallel shape) in the AFM images (>200 DNA frames were counted).

3.4 High-Speed AFM Imaging of the Behavior of the GQ-Forming DNA Strands in the DNA Frame

AFM images were obtained via HS-AFM (Nano Live Vision, RIBM, Tsukuba, Japan) (see Note 3) using a silicon nitride cantilever (Olympus BL-AC10EGS) (see Note 4). Samples for AFM imaging were prepared and the HS-AFM was operated as follows.

1. Attach the mica disc onto the glass scaffold.
2. Cleave the mica disc to obtain a fresh surface.
3. Dilute the DNA nanostructure sample to ~5 nM by adding observation buffer.
4. Place ~2 μL of the sample solution onto the mica surface for 5 min.
5. Rinse the surface with imaging buffer (~10 μL) three times to remove unbound molecules.
6. Place the cantilever onto the cantilever holder.
7. Fill the liquid cell with ~120 μL of observation buffer by using the procedure described in Subheading 2.4, item 3.
8. Place the mica plate with the glass scaffold onto the scanner stage.
9. Set up the scanner over the liquid cell.
10. Align the laser focusing position and the photodetector position.
11. Adjust the resonant frequency of the cantilever.
12. Execute the approach until the software automatically stops the motor.
13. Adjust the set point voltage until the sample is clearly imaged.
14. Image the samples in the observation buffer at a scanning rate of 0.2–1.0 frame/s (fps).

4 Notes

1. The design of DNA origami structures is currently carried out using caDNAno software (http://cadnano.org/) [14].
2. We prepared the DNA strands using the sequences reported previously [10].
3. Instrumental details are available at the homepage of RIBM (http://www.ribm.co.jp).


Masayuki Endo et al.

4. For HS-AFM imaging, small cantilevers are used. Small cantilevers (9 μm long, 2 μm wide, and 130 nm thick; BL-AC10EGS, Olympus, Tokyo, Japan) made of silicon nitride with a spring constant 0.1–0.2 N/m, and a resonant frequency of 400–1000 kHz in water are commercially available from Olympus.

Acknowledgments

This work was supported by a Grant-in-Aid for Scientific Research on Innovative Areas "Molecular Robotics" (Grant Number 24104002) of MEXT, and JSPS KAKENHI (Grant Numbers 18KK0139, 16K14033). Financial support from the Uehara Memorial Foundation and the Nakatani Foundation to M.E. is acknowledged.

References

1. Endo M, Sugiyama H (2014) Single-molecule imaging of dynamic motions of biomolecules in DNA origami nanostructures using high-speed atomic force microscopy. Acc Chem Res 47(6):1645–1653
2. Huppert JL, Balasubramanian S (2007) G-quadruplexes in promoters throughout the human genome. Nucleic Acids Res 35(2):406–413
3. Siddiqui-Jain A, Grand CL, Bearss DJ, Hurley LH (2002) Direct evidence for a G-quadruplex in a promoter region and its targeting with a small molecule to repress c-MYC transcription. Proc Natl Acad Sci U S A 99(18):11593–11598
4. Gehring K, Leroy JL, Gueron M (1993) A tetrameric DNA structure with protonated cytosine.cytosine base pairs. Nature 363(6429):561–565
5. Xu Y, Sugiyama H (2006) Formation of the G-quadruplex and i-motif structures in retinoblastoma susceptibility genes (Rb). Nucleic Acids Res 34(3):949–954
6. Sun D, Hurley LH (2009) The importance of negative superhelicity in inducing the formation of G-quadruplex and i-motif structures in the c-Myc promoter: implications for drug targeting and control of gene expression. J Med Chem 52(9):2863–2874
7. Dhakal S et al (2010) Coexistence of an ILPR i-motif and a partially folded structure with comparable mechanical stability revealed at the single-molecule level. J Am Chem Soc 132(26):8991–8997
8. Dhakal S et al (2012) G-quadruplex and i-motif are mutually exclusive in ILPR double-stranded DNA. Biophys J 102(11):2575–2584
9. Miyoshi D, Matsumura S, Nakano S, Sugimoto N (2004) Duplex dissociation of telomere DNAs induced by molecular crowding. J Am Chem Soc 126(1):165–169
10. Endo M et al (2015) Single-molecule manipulation of the duplex formation and dissociation at the G-Quadruplex/i-Motif Site in the DNA nanostructure. ACS Nano 9(10):9922–9929
11. Rothemund PW (2006) Folding DNA to create nanoscale shapes and patterns. Nature 440(7082):297–302
12. Endo M, Katsuda Y, Hidaka K, Sugiyama H (2010) Regulation of DNA methylation using different tensions of double strands constructed in a defined DNA nanostructure. J Am Chem Soc 132(5):1592–1597
13. Endo M, Katsuda Y, Hidaka K, Sugiyama H (2010) A versatile DNA nanochip for direct analysis of DNA base-excision repair. Angew Chem Int Ed 49(49):9412–9416
14. Douglas SM et al (2009) Rapid prototyping of 3D DNA-origami shapes with caDNAno. Nucleic Acids Res 37(15):5001–5006

Chapter 18

G-Quadruplex and Protein Binding by Single-Molecule FRET Microscopy

Chun-Ying Lee, Christina McNerney, and Sua Myong

Abstract

G-quadruplex (G4) is a non-canonical nucleic acid structure that arises from the stacking of planar G-tetrads, stabilized by monovalent cations. G4-forming sequences exist throughout the genome, and G4 structures have been shown to be involved in many processes, including DNA replication and gene expression. Single-molecule total internal reflection fluorescence (TIRF) microscopy has been employed to study G4 structure formation and protein binding interactions. Here, we describe methods by which we tested the folding and unfolding of G-quadruplex structures and studied the dynamics of their interaction with POT1 protein. The methods presented here can be applied to study other putative G4 sequences and potential binding partners.

Key words G-quadruplex, G4, Single-molecule, FRET, Telomere, POT1

Danzhou Yang and Clement Lin (eds.), G-Quadruplex Nucleic Acids: Methods and Protocols, Methods in Molecular Biology, vol. 2035, https://doi.org/10.1007/978-1-4939-9666-7_18, © Springer Science+Business Media, LLC, part of Springer Nature 2019

1 Introduction

G-quadruplex (G4) is a non-canonical nucleic acid structure that can form from a guanine-rich single strand. The core structure of G4 comprises stacks of guanine (G)-tetrad planes formed by Hoogsteen hydrogen bonds and stabilized by monovalent cations, especially potassium [1]. Putative G4-forming sequences are unevenly distributed throughout the human genome [2, 3]. G4 structures have been reported to be involved in DNA replication, gene regulation, genome instability, and human diseases [4–7]. One well-characterized G-quadruplex forms in the human telomeric 3′ overhang, a 50–200 nucleotide region containing the repeat sequence (TTAGGG)n. Four repeats of this sequence can fold into a G-quadruplex structure in the presence of monovalent cations, such as sodium or potassium [8, 9]. Additional repeat sequences can associate with the G4, but extra repeats can destabilize G4 in terms of thermostability and enthalpy [10]. The G4 structure can also influence the accessibility of the telomeric DNA to proteins such as telomerase and helicase [11]. However, how the telomeric overhang affects G-quadruplex formation and the accessibility of protein binding is difficult to monitor by traditional biochemical methods, especially when the process is dynamic.

Single-molecule methods offer the advantage of reporting structural dynamics at the molecular level. Total internal reflection fluorescence (TIRF) microscopy is a popular single-molecule detection method that yields reduced background noise and enables collecting data from hundreds of molecules in one measurement [12, 13]. In our system, DNA samples are labeled with two dyes, Cy3 and Cy5, within the FRET-sensitive distance. FRET is a distance-dependent energy transfer process used to monitor the interaction between the two dyes, donor and acceptor, which report on the structural dynamics of the labeled molecules. For example, when two dyes are placed on one molecule, such as at two positions within the same DNA, the FRET change induced by protein binding reports on how the protein changes the conformation of the DNA strand between the labeled positions.

We have previously reported single-molecule fluorescence studies on the G4 folding of the telomeric overhang and its interaction with POT1 [14, 15]. POT1 is a component of the telomere binding protein complex termed shelterin and specifically associates with ten nucleotides of the telomeric repeat sequence (TTAGGGTTAG). POT1 protects the telomeric overhang by preventing DNA repair machinery from associating with nondamaged telomere DNA [16]. Binding of POT1 to telomeric G4 can unfold the structure. Such binding and unfolding of the target G4 can be probed and quantified by smFRET. Here, we describe detailed protocols for smFRET studies on telomeric G-quadruplexes and demonstrate how to interpret protein–DNA interactions observed with smFRET.
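The distance dependence of FRET mentioned above is usually written as E = 1 / (1 + (r/R0)^6), and in smFRET experiments an apparent efficiency is commonly computed from the donor and acceptor intensities as E_app = I_A / (I_D + I_A). The sketch below illustrates both relations. The Förster radius of 5.4 nm for the Cy3–Cy5 pair is an assumed, commonly quoted approximate value rather than a number from this chapter, and the apparent efficiency shown here ignores leakage and detection-efficiency corrections.

```python
# Sketch of the standard FRET relations; R0 is an assumed value, not from the text.

R0_NM = 5.4  # assumed approximate Forster radius for Cy3-Cy5


def fret_efficiency(distance_nm, r0_nm=R0_NM):
    """Theoretical FRET efficiency at a given donor-acceptor distance."""
    if distance_nm < 0:
        raise ValueError("distance must be non-negative")
    return 1.0 / (1.0 + (distance_nm / r0_nm) ** 6)


def apparent_fret(donor_intensity, acceptor_intensity):
    """Uncorrected apparent FRET from donor (Cy3) and acceptor (Cy5) intensities."""
    total = donor_intensity + acceptor_intensity
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return acceptor_intensity / total


if __name__ == "__main__":
    for r in (3.0, 5.4, 8.0):
        print(f"r = {r} nm -> E = {fret_efficiency(r):.2f}")
    print("E_app =", apparent_fret(donor_intensity=300.0, acceptor_intensity=700.0))
```

Because of the sixth-power dependence, the efficiency changes steeply around r = R0, which is why compaction of the overhang on folding and extension on protein binding are expected to show up as distinct FRET levels.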

2

Materials

2.1 Total Internal Reflection Fluorescence (TIRF) Microscope 2.2

DNA Constructs

See references 12 and 13 for details.

1. GQ strand (G2): 50 -TGGCGACGGCAGCGAGGCTTAGGG TTAGGG/30 Cy3/. 2. GQ strand (G3): 50 -TGGCGACGGCAGCGAGGCTTAGGG TTAGGGTTAGGG/30 Cy3/. 3. GQ strand (G4): 50 -TGGCGACGGCAGCGAGGCTTAGGG TTAGGGTTAGGGTTAGGG/30 Cy3/.


4. GQ strand (G5): 5′-TGGCGACGGCAGCGAGGCTTAGGGTTAGGGTTAGGGTTAGGGTTAGGG/3′ Cy3/.
5. GQ strand (G6): 5′-TGGCGACGGCAGCGAGGCTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGG/3′ Cy3/.
6. GQ strand (G7): 5′-TGGCGACGGCAGCGAGGCTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGG/3′ Cy3/.
7. Biotin primer: /5′ Cy5/GCCTCGCTGCCGTCGCCA/3′ Biotin/.
8. Annealing buffer: 20 mM Tris–HCl pH 7.5, 50 mM NaCl.
9. Heating block.

2.3 Sample Chamber Preparation

1. Rectangular cover slips (24 × 40 mm, No. 1½).
2. Quartz microscope slides, 1″ × 3″, 1 mm thick.
3. Epoxy, 5 min epoxy (Devcon).
4. Double-sided tape.
5. Neutravidin, ImmunoPure NeutrAvidin protein.

2.4 Imaging Buffer

1. Glucose oxidase.
2. Glucose.
3. 6-Hydroxy-2,5,7,8-tetramethylchromane-2-carboxylic acid (Trolox).
4. Catalase.
5. KCl.

2.5 Flowing System

1. Tubing, WEICO ETT-28 (inner diameter = 0.015″, outer wall = 0.016″).
2. Needle: BD (26 ga, 3/8″).
3. 1-mL syringe.
4. 10-μL, 20–200-μL pipette tips.

2.6 Protein

1. POT1 protein. The purification protocol was previously published in [17].

3 Methods

3.1 DNA Annealing

Single-molecule FRET constructs were prepared by annealing two single strands, each labeled with either Cy3 or Cy5 dye. One end of the Cy5-labeled strand was also biotinylated to immobilize the DNA on the slide surface (Fig. 1a).

1. DNA sequences were purchased from a vendor and labeled with either Cy3 or Cy5 dye. The DNA oligos were dissolved in T50 (50 mM Tris–HCl pH 7.5) to a final concentration of 100 μM. The stock DNA samples were stored at −20 °C (see Note 1).
2. The GQ strands were annealed with the biotinylated, Cy5-containing primer at a molar ratio of 1:1.5 in annealing buffer (see Note 2).
3. The mixtures were incubated in a preheated 95 °C heat block for 5 min and slowly cooled to room temperature (see Notes 3 and 4).
4. The annealed constructs were stored at −20 °C (see Note 5).

Fig. 1 Examples of DNA construct, slide, and image. (a) Single-molecule FRET constructs are immobilized on a PEG slide through the biotin–neutravidin interaction. (b) Assembled channels. Each channel is separated by double-sided tape and sealed by epoxy. (c) Image captured by the camera. Each dot indicates one molecule. (d) Additional pipette tips are used to build a flowing system

3.2 Slide Preparation and Sample Immobilization

1. To perform single-molecule experiments, the quartz microscope slide and coverslip were cleaned and chemically coated with biotin-PEG to create a passivated surface. This prevents nonspecific adsorption of sample to the surface (see Note 6). A practical procedure has been described previously [14].
2. The PEG-coated surfaces of the slide and coverslip must face each other. The coverslip was taped to the slide with one layer (about 100 μm, see Note 7) of double-sided tape to create a passivated channel. For multiple channels, each channel was separated by one layer of tape (see Note 8).


3. Overhanging extra tape was cut to fit the slide, and all borders of the coverslip/slide surface were sealed by epoxy (Fig. 1b).
4. To immobilize the DNA substrate, the PEG-coated slide was incubated with neutravidin solution (0.05 mg/mL, see Note 9) for 1 min, and then rinsed with 100 μL of T50 buffer (see Note 10).
5. DNA samples were diluted to 25 pM and added to the channel. After a 5-min incubation, the channel was rinsed with 100 μL of T50 (see Note 11).

3.3 Imaging

In general, single-molecule FRET measurements provide real-time traces of FRET efficiency, which report on the dynamics of FRET states. In addition, the trace data can be used to generate a FRET histogram, which shows the overall FRET distribution under a given equilibrium condition. smFRET traces and histograms therefore answer different questions. Here is a procedure to test for the formation of the G-quadruplex structure and to obtain smFRET data.

1. A prism-type TIRF microscope was used in this protocol. A detailed protocol for the TIRF setup has been published previously [12].
2. After DNA immobilization and prior to any sample excitation, the image buffer was mixed with the oxygen scavenging system (1 mg/mL glucose oxidase, 0.8% v/v glucose, ~10 mM Trolox, and 0.03 mg/mL catalase) to reduce photobleaching events and added into the channel (see Note 12).
3. The donor dye (Cy3) was excited by a 532 nm solid-state laser, and emitted fluorescence (from both Cy3 and Cy5) was collected with an EMCCD camera by using a custom C++ program.
4. Data was collected over time as a movie. Depending on the purpose of the experiment, the movie can be short (20–40 frames at a 100 ms time interval, see Note 13) or long (lasting several minutes). Collecting 15–20 short movies, each in a different imaging field, was usually required to generate one FRET histogram (see Note 14).
5. The recorded movie data was processed by IDL software to identify each individual fluorescent spot and correlate the donor and acceptor channels, as described in detail in [12] (Fig. 1c).
6. The output file was sorted by signal spots (molecules) and the intensity was plotted as a function of time to generate a single-molecule signal trace.
7. To generate a histogram, FRET was first calculated, with correction for leakage, using this equation: $\mathrm{FRET} = \frac{\mathrm{Cy5} - \mathrm{Leakage}}{\mathrm{Cy3} + \mathrm{Cy5}}$ (see Note 15). The average FRET efficiencies were then calculated from the initial 10–20 frames of each short movie. This FRET efficiency data was plotted as a histogram using homemade MATLAB code.
8. Since G4 is stabilized by potassium ions, the formation of G4 can be confirmed by titrating increasing potassium concentrations into the system. The image buffer was prepared with 0, 0.1, 2, 10, or 100 mM potassium. FRET histograms were generated for each concentration by repeating steps 1–6 (Fig. 2a, b).

Fig. 2 FRET histograms show G-quadruplex folding. (a) Scheme of single-molecule FRET constructs. Formation of G-quadruplex can be induced by addition of potassium ions. (b) FRET histograms showing G-quadruplex folding after titration of K+. Increasing K+ concentration enhances the folded population. Three major peaks observed here are at FRET efficiencies of 0.8, 0.5, and 0.2, representing folded, unfolded, and donor-only constructs. (c) FRET histograms of constructs with increasing 3′ overhang lengths. Red line indicates overall shape of fitting; black line indicates the Gaussian distribution fit to each peak. (d) Time trace examples of each overhang construct. This figure was reproduced from published work (Helen Hwang, 2014) with permission from Elsevier [14]


9. In order to observe the dynamics of FRET efficiency, a 3-min-long movie was recorded (see Note 16). Traces generated from long movies can be used to confirm if FRET efficiency over time is stable or fluctuating (Fig. 2c, d).
10. FRET histograms of other DNA constructs with longer overhangs (G5, G6, or G7) can also be studied in a fully folding buffer condition (100 mM KCl) by repeating the above steps.

3.4 Protein Binding

Unlike imaging DNA alone, introducing proteins into this DNA system presents an additional challenge: protein binding to DNA may be a pseudo-irreversible process (stable binding). Binding events can therefore only be studied directly after the protein is flowed into the channel. Once the reaction reaches equilibrium, the binding process can no longer be observed. For this reason, a flow system and real-time imaging were implemented to capture the initial binding event. Here is the procedure to construct a flowing system and use it to test POT1 protein binding to telomeric G-quadruplex sequences.

1. To adapt the slide for a flowing system, a short segment of a p10 tip and a thin piece of tubing were used to connect one side of the channel to a syringe. The syringe could be controlled manually or by a syringe pump. A wide portion of a p200 tip was attached to the other side of the channel with epoxy to create a buffer well (Fig. 1d) (see Note 17).
2. POT1 protein was mixed with image buffer to a final concentration of 100 nM and loaded into the buffer well.
3. A long movie recording was initiated. Then, the buffer was drawn through by pulling the syringe manually or with a syringe pump (see Notes 18 and 19). This way, the movie records both pre-flow and post-flow states for later comparison.
4. POT1 protein was incubated for 5–10 min (including the time to complete step 3, i.e., to record a long movie) to reach an equilibrium state. Then, 15–20 short movies were taken to generate a FRET histogram of equilibrium protein binding. FRET histograms can also be collected at several time points following the addition of protein, and the changes in FRET patterns would then indicate a change in equilibrium state due to protein binding.
5. This method (steps 3 and 4) was also used to study other DNA constructs (G2, G3, G5, G6, and G7).
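The histogram calculation used for the equilibrium measurements (Subheading 3.3, step 7) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' MATLAB or IDL code: the function names are hypothetical, and the leakage term is modeled, as an assumption, as a fixed fraction of the Cy3 intensity appearing in the Cy5 channel. The actual calibration is described in ref. 12.

```python
from statistics import mean


def corrected_fret(donor, acceptor, leak_fraction=0.1):
    """Leakage-corrected FRET for one frame: (Cy5 - leakage) / (Cy3 + Cy5).

    Assumption: leakage is modeled as a fixed fraction of the donor (Cy3)
    intensity that appears in the acceptor (Cy5) channel.
    """
    total = donor + acceptor
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return (acceptor - leak_fraction * donor) / total


def mean_initial_fret(donor_trace, acceptor_trace, n_frames=10, leak_fraction=0.1):
    """Average FRET over the first n_frames of one molecule's short movie."""
    frames = list(zip(donor_trace, acceptor_trace))[:n_frames]
    return mean(corrected_fret(d, a, leak_fraction) for d, a in frames)


def fret_histogram(values, n_bins=50, low=0.0, high=1.0):
    """Normalized histogram (fraction of molecules per bin) over [low, high]."""
    counts = [0] * n_bins
    width = (high - low) / n_bins
    for value in values:
        if low <= value <= high:
            index = min(int((value - low) / width), n_bins - 1)
            counts[index] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * n_bins
```

In use, `mean_initial_fret` would be called once per molecule across all 15–20 short movies, and the resulting values pooled into `fret_histogram` for the pre-flow and post-flow conditions separately.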

3.5 Data Analysis and Interpretation

1. All code is part of a homemade smFRET tool package, which is available to download at the Center for the Physics of Living Cells (http://cplc.illinois.edu/software/, Biophysics Department, University of Illinois at Urbana-Champaign).
2. Analysis of the FRET histograms was used to define whether the G-quadruplex was folded or unfolded. In our system, KCl titration revealed two identifiable states: the unfolded structure has low FRET efficiency, but the donor and acceptor fluorophores come into close proximity upon folding and exhibit a high FRET efficiency (Fig. 2a, b).
3. Because overhang length has been reported to affect G-quadruplex stability [10], we also tested constructs with 5, 6, and 7 repeats of TTAGGG (G5, G6, and G7) in the presence of 100 mM KCl (Fig. 2c). The FRET histograms show that while G4 has a single sharp peak, the other constructs have multiple peaks or a broad peak, suggesting that 3′ overhangs exceeding the length of G4 can result in multiple states. Histograms were fitted with multiple Gaussian distributions in order to distinguish FRET states.
4. The multiple FRET states observed in the histograms may be due to one of two situations: either more than two FRET species exist at the same time, or the FRET transitions are too fast to be resolved at this time interval. To resolve this, long-movie single-molecule FRET traces were used to determine which FRET states are present (Fig. 2d). Analysis of FRET efficiency shows that G4 has a flat FRET trace over time, indicating a lack of dynamics within this structure. G5 and G6 show FRET transitions within one trace, which result in multiple FRET peaks. G7 shows a FRET signal that fluctuates within one trace, resulting in a broad FRET histogram (see Note 13).
5. POT1 binding can unfold G4 structures, which is seen as a decrease in FRET efficiency. To determine the fraction of POT1-bound molecules, the peaks within each histogram were fit with Gaussian distributions. The bound and unbound states were defined by comparing the post-flow and pre-flow FRET histograms. The fraction in the bound state was determined by calculating the area under the curve. A shift in the peak indicates a change in the binding ratio, because protein binding increases the distance between the donor and acceptor fluorophores (Fig. 3a, b).
6. Further, the effect of overhang length on POT1 binding can be examined. Figure 3c shows histograms of the FRET efficiency observed with G4, G5, G6, and G7 constructs before and after POT1 binding. After fitting each peak to a Gaussian distribution and determining the area under the bound peak, the binding ratios are 70%, 78%, 70%, and 69%, respectively, which are not significantly different among the constructs. This suggests that although overhang length affects G4 folding and stability, POT1 binding is independent of GQ folding and overhang length.

Fig. 3 FRET histograms show POT1 binding. (a) Scheme of POT1 binding on DNA. POT1 binding unfolds the G4 structure and reduces FRET efficiency. (b, c) FRET patterns change after POT1 binding. Gray depicts FRET efficiency distributions prior to POT1 binding, and blue shows the distributions after POT1 binding. The binding ratio, calculated from the area beneath each curve, is displayed to the bottom right of each histogram. This figure was reproduced from published work (Helen Hwang, 2014) with permission from Elsevier [14]

7. POT1 binding to the DNA sequence TTAGGGTTAG reduces FRET by unfolding the G-quadruplex structure. POT1 contains two DNA-binding domains, OB1 and OB2 (Fig. 4a). To determine the stoichiometric effects of POT1 binding, the number of POT1 binding sites was reduced from two in G4 to 1.5 and one in G3 and G2, respectively. Here, the real-time binding process can be observed in the long movie (Fig. 4b). Four transitions occur upon POT1 binding to the G4 construct. Because the G4 sequence contains two POT1 binding sites, and two domains (OB1 and OB2) bind to each binding site, we interpret these four transitions as four steps of binding. This was further confirmed by testing G2 and G3: these constructs have 1 and 1.5 POT1 binding sites, and the number of transition steps becomes 2 and 3, respectively.
8. The FRET transitions within all the traces can be plotted as transition density plots (TDP) (Fig. 4c). A TDP is a 2D cumulative cluster map. The y- and x-axes give the FRET value before and after the transition, respectively. The color map indicates the density of the transition events. If the TDP is symmetric about the diagonal, the transition is reversible. Here, the TDP contains four clusters running from upper right to lower left and no mirror-image clusters across the diagonal, indicating that there are four irreversible transitions from high FRET to low FRET. This suggests that POT1 binding to G4 occurs in four steps and is irreversible. The TDP is built by using the software HaMMy, which is based on hidden Markov modeling (available for download at http://ha.med.jhmi.edu/resources/#1464200861600-0fad9996-bfd4) [18].

Fig. 4 Dissecting POT1 binding site from smFRET time trace. (a) Scheme of POT1 binding. POT1 consists of three domains, OB1, OB2, and CT. Both OB1 and OB2 bind to telomere sequence TTAGGGTTAG. (b) Representative smFRET time trace of each construct. Time zero is the time the protein was flowed into the channel. Each red arrow indicates a FRET transition, which is correlated to the number of binding sites. (c) Transition density plots (TDP) of each construct. The x-axis and y-axis are the FRET state after and before transition. This figure was reproduced from published work (Helen Hwang, 2012) with permission from Elsevier [15]
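The idea behind a transition density plot can be illustrated with a minimal Python sketch. It is not HaMMy and performs no hidden Markov modeling: it assumes the traces have already been idealized into discrete FRET states (for example by HaMMy), and all function names, the 0.05 step threshold, and the 10-bin grid are hypothetical choices for illustration.

```python
from collections import Counter


def transitions(idealized_trace, min_step=0.05):
    """Return (before, after) FRET pairs wherever the idealized state changes."""
    pairs = []
    for before, after in zip(idealized_trace, idealized_trace[1:]):
        if abs(after - before) >= min_step:
            pairs.append((before, after))
    return pairs


def to_bin(fret, n_bins):
    """Map a FRET value in [0, 1] to a bin index, clamping out-of-range values."""
    return min(max(int(fret * n_bins), 0), n_bins - 1)


def transition_density(traces, n_bins=10):
    """Count transitions on an n_bins x n_bins grid.

    Keys are (after_bin, before_bin), i.e. (x, y) in the TDP convention
    used in Fig. 4c.
    """
    counts = Counter()
    for trace in traces:
        for before, after in transitions(trace):
            counts[(to_bin(after, n_bins), to_bin(before, n_bins))] += 1
    return counts


def is_one_directional(counts):
    """True if every transition goes from higher to lower FRET (below the diagonal only)."""
    return all(after_bin < before_bin for after_bin, before_bin in counts)
```

A stepwise trace such as `[0.8, 0.8, 0.65, 0.65, 0.5, 0.35, 0.35, 0.2]` contributes four transitions, all from higher to lower FRET, which is the pattern interpreted above as four irreversible binding steps. A reversible process would also populate the mirrored bins across the diagonal.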

4 Notes

1. We recommend purchasing DNA with HPLC purification to improve DNA sample purity. If the DNA will be labeled in the lab, a suitable buffer other than Tris may be required for dissolving the DNA, depending on the labeling method.


2. Excess biotin primer is used to maximize the density of fluorescent G4-containing DNA. Here, the G4-containing DNA is labeled with Cy3 dye, which is the source of fluorescence when it is excited by the 532 nm laser.
3. Depending on the experiment, the annealing buffer may be adjusted to contain different salt ions and concentrations. For example, G4 is known to be stabilized by potassium ions, so when studying G4 folding, one should avoid potassium ions while annealing the DNA.
4. The cooling process is critical for G4 formation. Fast cooling may yield kinetically stable products, such as a hairpin or double helix. Slower cooling will better stabilize G4 formation.
5. Repeated freezing and thawing of DNA samples will induce G4 conformational changes as well as fluorophore degradation; therefore, we recommend using small, single-use aliquots of DNA samples.
6. A clean and passivated surface becomes hydrophobic, and so it can be differentiated from a nontreated surface by testing the behavior of a drop of water on the surface. Also, the passivated surface should not be physically touched, to avoid damaging it.
7. The tape must be sufficiently wide (~2 mm) to prevent channel-to-channel leakage, but narrow enough to allow the solution to flow through the length of the channel.
8. Bubbles inside the tape will cause leakage and contaminate the neighboring channels. In order to prevent bubbles, the tape should be laid flat and carefully pressed down from one side. Once the tape is placed on the slide, it should not be removed, to prevent damage to the passivated surface.
9. The dilution factor of neutravidin depends on slide quality. The lower the density of biotin-PEG, the higher the concentration of neutravidin should be. In our case, the slide contains 3% biotin-PEG, and a 50-fold dilution is sufficient to immobilize 1000 molecules in one image.
10. The solution can be added by directly pipetting into the channel and removing extra buffer with a Kimwipe. Another option is to use a flow setup system, which is described in Subheading 3.4.
11. Generally, 25–100 pM DNA is low enough to yield 300–400 molecules in an imaging field of view of 25 × 75 μm². Densities higher than this may cause signal overlap because of the diffraction limit (Abbe's law). However, many factors impact immobilization, and so we recommend starting with a low concentration and increasing it until 300–400 molecules/image is reached.


12. 10 mM Trolox was prepared by dissolving 25 mg in 10 mL ddH2O with 10 M NaOH (about 10 μL, adjusting to pH 8). After inverting the tube to mix the powder, the solution was wrapped in aluminum foil and rotated in the cold room overnight until fully dissolved. The next day, the solution was filtered using a 0.22-μm membrane and aliquoted (1 mL per tube). The Trolox solution can be stored for 2–3 weeks at 4 °C and longer at −80 °C.
13. Taking short movies (containing only 20–40 frames) will prevent photobleaching. However, the time interval over which these frames are taken can vary. For example, if the FRET dynamics are faster than 100 ms, one broad peak will be observed at the 100 ms scale. In that case, a shorter time interval can capture and distinguish the different FRET states.
14. The number of short movies taken depends on molecule density. Reliable FRET histograms must be built from a sufficient number of effective molecules (approximately 5000–6000 molecules). If DNA labeling efficiency is low, the ratio of donor- or acceptor-only molecules becomes higher, so more movies must be taken to obtain a representative FRET histogram. To ensure that areas are only imaged once (to reduce photobleaching), we recommend snaking the microscope stage across and down the length of the channel.
15. Signal leakage comes from the instrument setup, usually the dichroic mirror. A perfect dichroic mirror would completely separate light at a certain wavelength; in practice, however, part of the signal leaks through the mirror (i.e., Cy3 emission into the red channel and Cy5 emission into the green channel). Because our system uses Cy3 and Cy5 intensities to calculate FRET efficiency, this leakage causes incorrect calculation of FRET efficiency. The calibration is described in ref. 12.
16. In our oxygen scavenging system conditions, about 50% of both dyes photobleached after 3 min (1801 frames at 100 ms resolution). Another way to prevent fast photobleaching is to decrease the laser power. However, lower laser power will cause lower fluorescence emission. In this case, lengthening the frame time can increase the average signal intensity to compensate for the decreased fluorescence. Ensuring that the lasers are properly focused will help limit the laser power needed.
17. To prevent buffer leakage by atmospheric pressure, the flowing system must be isolated from outside airflow. Therefore, the gap between the tubing and tip should be epoxied and tested for air leakage by flowing water through before attaching the flowing system to the slide; the volume of water pulled through should enter the syringe immediately. Any leakage will cause bubbles in the tubing and slow down the flow speed.
18. Do not pull all of the buffer into the channel, because the balance of flow pressure will draw more solution than the expected volume. If 50 μL of solution is loaded into the well, then the flowing volume should be less than 40 μL. Before attaching the p10 tip onto the slide channel, a small amount of T50 buffer can be pulled through the tubing first to fill the volume of the tubing itself.
19. The pulling action may perturb the sample stage or microscope, causing imaging drift. Be careful while pulling the syringe.
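Several of the quantities quoted in these notes and in Subheading 3.2 can be checked with simple arithmetic. The Python sketch below is illustrative only: the helper names are hypothetical, and the Trolox molar mass (250.29 g/mol, for C14H18O4) is a value supplied here rather than one stated in the chapter.

```python
TROLOX_MW_G_PER_MOL = 250.29  # molar mass of Trolox (C14H18O4); not stated in the text


def millimolar(mass_mg, volume_ml, molar_mass_g_per_mol):
    """Concentration in mM for a mass (mg) dissolved in a volume (mL)."""
    return mass_mg / molar_mass_g_per_mol / volume_ml * 1000.0


def frame_count(duration_s, interval_s):
    """Number of frames in a movie, counting the frame at time zero."""
    return round(duration_s / interval_s) + 1


def dilution_factor(stock_molar, target_molar):
    """Fold dilution needed to go from the stock to the target concentration."""
    return stock_molar / target_molar


if __name__ == "__main__":
    # Note 12: 25 mg Trolox in 10 mL, expected to be close to 10 mM.
    print(f"Trolox: {millimolar(25, 10, TROLOX_MW_G_PER_MOL):.2f} mM")
    # Note 16: a 3-min movie at 100 ms per frame.
    print(f"Frames: {frame_count(180, 0.1)}")
    # Subheading 3.2, step 5: 100 uM stock diluted to 25 pM.
    print(f"Dilution: {dilution_factor(100e-6, 25e-12):.0f}-fold")
```

The roughly four-million-fold dilution from the 100 μM stock to 25 pM is far too large for a single pipetting step, so in practice it would be reached through a few serial dilutions.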

Acknowledgments

Most of the data was taken and analyzed by our alumna, Helen Hwang. We thank the members of the Sua Myong and Taekjip Ha Laboratory for their input.

References

1. Burge S, Parkinson GN, Hazel P, Todd AK, Neidle S (2006) Quadruplex DNA: sequence, topology and structure. Nucleic Acids Res 34(19):5402–5415
2. Maizels N, Gray LT (2013) The G4 genome. PLoS Genet 9(4):e1003468
3. Chambers VS, Marsico G, Boutell JM, Antonio MD, Smith GP, Balasubramanian S (2015) High-throughput sequencing of DNA G-quadruplex structures in the human genome. Nat Biotechnol 33(8):877–881
4. Valton AL, Prioleau MN (2016) G-quadruplexes in DNA replication: a problem or a necessity? Trends Genet 32(11):697–706
5. Rhodes D, Lipps H (2015) G-quadruplexes and their regulatory roles in biology. Nucleic Acids Res 43(18):8627–8637
6. Koole W, van Schendel R, Karambelas AE, van Heteren JT, Okihara KL, Tijsterman M (2014) A polymerase theta-dependent repair pathway suppresses extensive genomic instability at endogenous G4 DNA sites. Nat Commun 5:3216
7. Maizels N (2015) G4-associated human diseases. EMBO Rep 16(8):910–922
8. Raghuraman MK, Cech TR (1990) Effect of monovalent cation-induced telomeric DNA structure on the binding of Oxytricha telomeric protein. Nucleic Acids Res 18(15):4543–4552
9. Hardin C, Henderson E, Watson T, Prosser JK (1991) Monovalent cation induced structural transitions in telomeric DNAs: G-DNA folding intermediates. Biochemistry 30(18):4460–4472
10. Viglasky V, Bauer L, Tluckova K, Javorsky P (2010) Evaluation of human telomeric G-quadruplexes: the influence of overhanging sequences on quadruplex stability and folding. J Nucleic Acids 2010:1–8
11. Wang Q, Liu J, Chen Z et al (2011) G-quadruplex formation at the 3′ end of telomere DNA inhibits its extension by telomerase, polymerase and unwinding by helicase. Nucleic Acids Res 39(14):6229–6237
12. Joo C, Ha T (2008) Single molecule FRET with total internal reflection microscopy. In: Selvin P, Ha T (eds) Single molecule techniques: a laboratory manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY. ISBN 978-087969775-4, 507 pp
13. Roy R, Hohng S, Ha T (2008) A practical guide to single-molecule FRET. Nat Methods 5(6):507–516
14. Hwang H, Kreig A, Calvert J et al (2014) Telomeric overhang length determines structural dynamics and accessibility to telomerase and ALT-associated proteins. Structure 22(6):842–853
15. Hwang H, Buncher N, Opresko PL, Myong S (2012) POT1-TPP1 regulates telomeric overhang structural dynamics. Structure 20(11):1872–1880
16. Denchi EL, de Lange T (2007) Protection of telomeres through independent control of ATM and ATR by TRF2 and POT1. Nature 448(7157):1068–1071
17. Sowd G, Lei M, Opresko PL (2008) Mechanism and substrate specificity of telomeric protein POT1 stimulation of the Werner syndrome helicase. Nucleic Acids Res 36(13):4242–4256
18. McKinney SA, Joo C, Ha T (2006) Analysis of single-molecule FRET trajectories using hidden Markov modeling. Biophys J 91:1941–1951

Chapter 19

High-Throughput Screening of G-Quadruplex Ligands by FRET Assay

Kaibo Wang, Daniel P. Flaherty, Lan Chen, and Danzhou Yang

Abstract

Fluorescence resonance energy transfer (FRET) is a distance-dependent process by which energy is transferred from an excited donor fluorophore to an acceptor molecule when the donor and acceptor are in close proximity to each other. Depending on the assay design, FRET can provide a real-time measurement of structural integrity and dynamics of biomacromolecules in solution and is particularly suitable for studying G-quadruplex (G4) nucleic acids and their ligand interactions. FRET-based assays are ideally suited for high-throughput screening (HTS) methodology because they are simple, sensitive, and easily automated. G4s are stable nucleic acid structures involved in important regulatory roles in gene replication, transcription, and genomic instability. Four-stranded G4s are promising drug targets as these non-canonical structures are enriched in oncogene promoters, 5′ UTRs, and telomeres, and have been linked to regulation of gene expression in cancer and other diseases. Although molecules that bind to G4s, with subsequent influence on gene expression, have been well documented, the identification of new chemical scaffolds that potently and selectively bind to G4s and control specific gene expression is still much less common. Here, we describe a detailed protocol of a FRET-based HTS methodology to identify novel G4 ligands.

Key words FRET, G-quadruplex, Drug target, Promoter, Telomere, Ligand

Danzhou Yang and Clement Lin (eds.), G-Quadruplex Nucleic Acids: Methods and Protocols, Methods in Molecular Biology, vol. 2035, https://doi.org/10.1007/978-1-4939-9666-7_19, © Springer Science+Business Media, LLC, part of Springer Nature 2019

1 Introduction

Four-stranded G-quadruplexes (G4s) are a family of nucleic acid secondary structures consisting of stacked G-tetrad planes stabilized by Hoogsteen hydrogen bonds and monovalent cations such as Na+ and K+ [1, 2]. There is growing evidence indicating that G4-forming sequences are concentrated at biologically relevant regions and play important regulatory roles in gene replication, transcription, and genomic instability [3, 4]. More recently, G4s have been visualized both in DNA and RNA of human cells, and further marked in human regulatory chromatin [5–7]. Such structures have been implicated in the regulation of genes, some of which are necessary for disease pathogenesis, leading to increased attention as potential novel targets for anticancer and anti-HIV agents [8–11]. To date, molecules that bind to G4s and stabilize the G4 secondary structure have been shown to significantly influence transcription and translation of the G4-associated genes [8–14]. The results illustrate the utility of such molecules for disease therapy and emphasize the importance of continued discovery of new chemical scaffolds that potently and selectively bind specific G4s [2, 4, 12–15].

Fluorescence resonance energy transfer (FRET) is a dipole–dipole coupling process by which the excited-state energy of a fluorescent donor molecule is non-radiatively transferred to an acceptor molecule [16, 17]. As this effect is distance dependent, generally limited to 10–80 Å, the placement of donor and acceptor fluorophores must be optimized to maximize the FRET signal. However, once this step has been performed, the assay provides a powerful tool in biomedical research and drug discovery to monitor the integrity and dynamics of the target system in real time [17, 18].

FRET is especially useful for tracking G4 folding and unfolding processes, which can be used to discover G4-interactive small-molecule ligands. For these experiments, a G4-forming nucleic acid is labeled at the 5′ and 3′ ends with a donor and an acceptor fluorophore, with the requirement that the fluorescence emission spectrum of the donor probe overlaps the excitation spectrum of the acceptor probe. Common FRET pairs include a 6-carboxyfluorescein (FAM) donor and the Black Hole Quencher 1 (BHQ1) or 6-carboxytetramethylrhodamine (TAMRA) as acceptors. Alternative FRET pairs that display desired spectroscopic properties also include FAM-Cy3, Cy3-Cy5, and FAM-rhodamine dyes. The fluorescence intensity of the probe depends on the distance from the fluorophore to the quencher (or to the FRET acceptor), and this distance is a function of the G-quadruplex folding/unfolding process. In solution, without the G4-favorable cations K+ or Na+, the G4-forming nucleic acid mainly exists as a single strand, but the population of the folded G4 can dramatically increase upon addition of G4-interactive small molecules (Fig. 1) or K+ cations. Therefore, in the context of the FAM-BHQ1 pair, the ability of small molecules to induce G4 formation will be reflected by a decrease in the fluorescence intensity of FAM relative to the negative control group (probe-only group). This FRET intensity-based assay is a rapid and convenient method suitable for high-throughput screening. Herein, we describe a detailed protocol for high-throughput screening, by FRET assay, of chemical libraries for G4 ligands that are capable of inducing G4 formation.


Fig. 1 Principle of G4-interactive compounds screening by FRET assay

2 Materials

1. Oligonucleotide: 5′-BHQ1-TGGGGAGGGTGGGGAGGGTGGGGAAGGTT-FAM-3′ (named Bpu28F) (see Note 1).
2. 384-well optical bottom black plates.
3. 50 mL conical sterile polypropylene centrifuge tubes.
4. Precise compound transfer equipment (Echo/Access Workstation, see Note 2).
5. Liquid dispenser and strip washer (MultiFlo, see Note 3).
6. Plate reader (Synergy Neo2, see Note 4).
7. Centrifuges.
8. pH meter.
9. Diversity-oriented small molecules library (see Note 5). Known drugs, bioactive and natural products library (see Note 6).
10. 100 mM Tris-acetate buffer, pH 7.0.
11. 2 M potassium chloride (KCl) solution.
12. MilliQ water was used for the preparation of all solutions.

3 Methods

3.1 Sample Preparation and Method Validation

1. Set the parameters of fluorescence intensity measurement (Synergy Neo2 plate reader; Excitation: 490 nm, with a 10 nm bandwidth; Emission: 520 nm, with a 10 nm bandwidth; Gain: 80; Light source: xenon flash; Data point: 10; Temperature: 25 °C; see Note 7). These settings were used for all consecutive scans.


Kaibo Wang et al.

2. Before the large-scale screening, a small-scale test of DNA probe quality and instrument conditions was carried out each day to maintain the robustness of the assay. Steps 3–7 describe the small-scale test.
3. The dual-labeled Bpu28F probe (0.04 μmol) was dissolved in water to a stock concentration of 100 μM (0.1 mL), which was then diluted in 100 mM Tris-acetate buffer, pH 7.0, to a working concentration of 1 μM in a 10 mL final volume. The 1 μM probe solution was allowed to stand for 1 h at room temperature to reach equilibrium.
4. Potassium cation, which stabilizes the G4 structure, served as the positive control. A stock of 2 M KCl was diluted in water to a working concentration of 200 mM.
5. In a black 384-well plate, 10 μL of 1 μM probe solution was added to columns 1 and 2 by MultiFlo. Column 1 (negative control group) was diluted with 10 μL MilliQ water (final well probe concentration/volume: 0.5 μM in 20 μL of 50 mM Tris-acetate buffer, pH 7). Column 2 (positive control group) was diluted with 10 μL of 200 mM KCl (final buffer components: 50 mM Tris-acetate, pH 7, 100 mM KCl in a 20 μL volume). The plate was centrifuged and incubated for 30 min at room temperature to reach equilibrium.
6. The fluorescence intensity was measured with the Synergy Neo2 plate reader using the parameters listed in step 1.
7. Data analysis was done in Excel by calculating the mean fluorescence and standard deviation (S.D.) for the positive and negative controls. To illustrate, in this case the negative control group had a mean fluorescence value of 873.8 ± 15.2, and the positive control (100 mM KCl) group had a value of 553.1 ± 9.0. The fluorescence intensity of the negative control group was normalized to 100%, giving a relative fluorescence intensity of 63.3% for the positive control group.
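The control statistics in step 7 can also be scripted rather than computed in Excel. The following is a minimal sketch, assuming the raw well readings are available as plain lists. The function names are hypothetical, the use of the sample standard deviation is an assumption (the protocol does not say which S.D. formula was used), and the Z′-factor is computed from its standard definition, 1 − 3(SD_neg + SD_pos)/|mean_neg − mean_pos|, which the protocol does not spell out.

```python
import statistics


def summarize_controls(negative, positive):
    """Summarize probe-only (negative) and KCl (positive) control wells."""
    neg_mean = statistics.mean(negative)
    pos_mean = statistics.mean(positive)
    neg_sd = statistics.stdev(negative)  # sample S.D. (assumption)
    pos_sd = statistics.stdev(positive)
    return {
        "neg_mean": neg_mean,
        "neg_sd": neg_sd,
        "pos_mean": pos_mean,
        "pos_sd": pos_sd,
        # The negative control is normalized to 100%.
        "pos_relative_pct": 100.0 * pos_mean / neg_mean,
        # Standard Z'-factor definition (not given in the protocol).
        "z_prime": 1.0 - 3.0 * (neg_sd + pos_sd) / abs(neg_mean - pos_mean),
    }


def controls_acceptable(pos_relative_pct, low=61.0, high=65.0):
    """Check the positive control against the 61-65% acceptance window."""
    return low <= pos_relative_pct <= high


if __name__ == "__main__":
    # Made-up readings, for illustration only.
    neg = [870.0, 880.0, 860.0, 890.0]
    pos = [550.0, 560.0, 545.0, 555.0]
    summary = summarize_controls(neg, pos)
    print(summary)
    print("controls acceptable:", controls_acceptable(summary["pos_relative_pct"]))
```

The default window in `controls_acceptable` mirrors the 61–65% range quoted in this subheading; adjust it if a different probe or buffer is used.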
From our experience, positive control values ranging from 61% to 65% are acceptable given the day-to-day variation of reagents (Z′ = 0.7). If the small-scale results are acceptable, a large-scale screen of chemical libraries may then be performed using the same reagents and instrument settings. We selected a screening deck of 50,000 compounds from different libraries.

3.2 High-Throughput Screening of 50,000 Compounds from Different Chemical Libraries

1. We selected a total of 50,000 compounds from two major chemical libraries (the diversity-oriented small molecule library and the known drugs, bioactive and natural products library). The stock concentration of compounds in the chemical libraries was 1 mM in DMSO.
2. Add 9.8 μL MilliQ water to columns 3–22 of the black 384-well plates by MultiFlo. Add 10 μL MilliQ water and 10 μL 200 mM KCl solution into columns 1–2 and 23–24 for the negative and positive control groups, respectively (Fig. 2, see Notes 8 and 9).
3. Using the Access Workstation integrated with an Echo system (Labcyte), transfer 0.2 μL of each library compound to columns 3–22 of the pre-prepared black 384-well plates for a total solution volume of 10 μL. Columns 1–2 and 23–24 are compound-free wells that serve as internal references for each plate.
4. Centrifuge the plates and incubate for 10 min at room temperature, then measure the fluorescence intensity on the Synergy Neo2 plate reader. The collected data serve as a compound-only control (see Note 10).
5. Add 10 μL of 1 μM Bpu28F probe in 100 mM Tris-acetate buffer, pH 7, to every well of the black 384-well plates using MultiFlo, then incubate for 30 min at room temperature to reach equilibrium. Final well concentrations/volumes: (1) Columns 1–2 (negative control group), 0.5 μM Bpu28F in 50 mM Tris-acetate, pH 7; (2) Columns 3–22 (sample group), 10 μM small molecule sample, 0.5 μM Bpu28F in 50 mM Tris-acetate, pH 7, 1% v/v DMSO (see Note 11); (3) Columns 23–24 (positive control group), 0.5 μM Bpu28F in 50 mM Tris-acetate, pH 7, 100 mM KCl.
6. Measure the fluorescence intensity using the Synergy Neo2 plate reader.
7. Data analysis was done in Excel. Average the fluorescence intensity values of the negative control group (columns 1–2) and the positive control group (columns 23–24) and calculate the standard deviations. Normalize the negative control to 100% fluorescence signal. Compare the fluorescence intensity of each sample well (columns 3–22) to the normalized negative control value to obtain the relative fluorescence intensity, as a percentage, for each molecule tested. Hits were defined as molecules that gave relative fluorescence intensity values of less than 70% compared to the negative control. Using this criterion, we identified 329 hits (see Notes 12 and 13). A representative result from one plate using this high-throughput screening method is shown in Fig. 3.
8. Repeat the FRET experiment for these 329 hit compounds, in triplicate, to exclude technical errors.

Fig. 2 Plate design for the FRET experiment to screen G4-interactive compounds. The first two columns, 1–2, did not contain any compound or KCl, and served as negative controls. Columns 3–22 had a different compound in each well. The last two columns, 23–24, had 100 mM KCl and served as positive controls. All wells contained 0.5 μM Bpu28F probe
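The plate layout of Fig. 2 and the final well concentrations listed in step 5 follow from simple dilution arithmetic. As a quick sanity check, here is a small sketch that recomputes them from the dispensed volumes. The function and constant names are hypothetical, and the only inputs are the volumes and stock concentrations stated in the steps above.

```python
FINAL_VOLUME_UL = 20.0  # 10 uL pre-dispensed liquid + 10 uL probe solution
ROWS_PER_PLATE = 16     # a 384-well plate has 16 rows x 24 columns


def final_concentration(stock, added_volume_ul, final_volume_ul=FINAL_VOLUME_UL):
    """Concentration after dilution, in the same unit as `stock`."""
    return stock * added_volume_ul / final_volume_ul


def well_group(column):
    """Map a 1-based plate column number to its role in the Fig. 2 layout."""
    if column in (1, 2):
        return "negative control"
    if 3 <= column <= 22:
        return "sample"
    if column in (23, 24):
        return "positive control"
    raise ValueError(f"column {column} is outside a 24-column plate")


def compounds_per_plate():
    """Number of sample wells, i.e., compounds screened per plate."""
    sample_columns = sum(1 for c in range(1, 25) if well_group(c) == "sample")
    return sample_columns * ROWS_PER_PLATE


if __name__ == "__main__":
    print("probe (uM):", final_concentration(1.0, 10.0))
    print("Tris-acetate (mM):", final_concentration(100.0, 10.0))
    print("compound (uM):", final_concentration(1000.0, 0.2))  # 1 mM stock
    print("DMSO (% v/v):", final_concentration(100.0, 0.2))    # neat DMSO
    print("KCl (mM):", final_concentration(200.0, 10.0))
    print("compounds per plate:", compounds_per_plate())
```

Running a check like this before a large screen makes it easy to confirm that a change in transfer volume or stock concentration still gives the intended 10 μM compound and 1% v/v DMSO.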

Fig. 3 Representative result from one 384-well plate using the high-throughput screening method. Each plate contains 320 compounds. Wells in the first and last two columns (left and right ends in the figure) are negative and positive controls, respectively. There are 5 hits in this plate that are marked by red circles
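The per-plate analysis of step 7, which yields results like those in Fig. 3, can be sketched in a few lines. This is an illustrative outline, not the authors' Excel workflow: the data structure (a dictionary keyed by row letter and column number) and all function names are assumptions, and the autofluorescence cutoff is a hypothetical user-chosen value, since the protocol does not state how the compound-only reading from step 4 was applied.

```python
from statistics import mean

ROWS = "ABCDEFGHIJKLMNOP"  # 16 rows of a 384-well plate


def relative_intensities(plate):
    """Express sample wells (columns 3-22) as % of the negative-control mean.

    `plate` maps (row_letter, column_number) to a fluorescence reading.
    """
    neg_mean = mean(v for (_, col), v in plate.items() if col in (1, 2))
    return {
        well: 100.0 * value / neg_mean
        for well, value in plate.items()
        if 3 <= well[1] <= 22
    }


def call_hits(plate, threshold_pct=70.0):
    """Return sample wells whose relative intensity is below the threshold."""
    rel = relative_intensities(plate)
    return sorted(well for well, pct in rel.items() if pct < threshold_pct)


def flag_autofluorescent(compound_only, cutoff):
    """List wells whose compound-only reading exceeds a user-chosen cutoff."""
    return sorted(well for well, v in compound_only.items() if v > cutoff)


if __name__ == "__main__":
    # Synthetic plate, for illustration only.
    plate = {(r, c): 1000.0 for r in ROWS for c in (1, 2)}
    plate.update({(r, c): 900.0 for r in ROWS for c in range(3, 23)})
    plate.update({(r, c): 630.0 for r in ROWS for c in (23, 24)})
    plate[("C", 5)] = 650.0  # one well that behaves like a hit
    print(call_hits(plate))
```

Wells returned by `flag_autofluorescent` are candidates for the interference described in Note 10 and are best inspected by hand rather than corrected automatically.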


4


Notes

1. We synthesized the dual-labeled Bpu28F DNA oligonucleotide, but labeled DNAs are also available from commercial sources. FAM is a common FRET donor. Black Hole Quencher-1 (BHQ-1) is classified as a dark quencher (a non-fluorescent chromophore) and is frequently used as the quencher moiety in a variety of FRET DNA detection probes. The advantages of using BHQ1 in a FRET probe are (1) low background fluorescence (and thus a better signal-to-noise ratio) and (2) ease of synthesis for FRET probes with a dark quencher (because dark quenchers are resistant to degradation during the oligo deprotection step) [19].
2. Echo uses acoustic energy to precisely transfer droplets (2.5 nL per droplet) of liquid from any well of a source plate to any well of a destination plate in a contact-free manner. The Access Workstation integrates Echo with a series of other devices, such as a plate hotel, a microplate centrifuge, a plate peeler, sealer and a washer/dispenser, so that a complete cycle of preparing an assay plate can be done on one workstation with one controlling program. For our screen, we used the Access Workstation to transfer library compounds to our assay plates.
3. MultiFlo (Biotek) is a liquid dispenser and strip washer that can dispense up to three different reagents without changing dispensing cassettes, which avoids switching reagents and saves the reagent otherwise consumed by priming.
4. Synergy Neo2 HTS multimode microplate reader (Biotek).
5. The Chemical Genomics Facility at the Purdue Institute for Drug Discovery holds a total collection of about 430,000 diversity-oriented small molecules drawn from the ChemDiv and ChemBridge libraries. The compounds in these libraries are structurally diverse and are characterized by "drug-like" properties and good ADME profiles.
6. This collection includes LOPAC 1280 (known drugs and bioactives from Sigma), Spectrum 2400 (known drugs and natural products from MicroSource), 1000 pure natural products extracted from plants and microorganisms, and 5000 semi-synthetic compounds that were synthesized based on the scaffolds of natural products but with tractable chemistry.
7. If the parameters for a FRET pair are not well established, perform excitation and emission wavelength scans to obtain the optimal excitation and emission wavelengths for that pair.
8. We recommend using two separate dispensing cassettes to dispense water and KCl. KCl is relatively high in concentration and hard to clean completely from the transfer consumables, and contamination by KCl will dramatically affect the G4 folding process and can cause the experiments to fail.
9. One can handle as many plates as desired in one day; however, we recommend 30 plates per day.
10. The compound-only fluorescence intensity control is needed because some compounds have intrinsic fluorescence that overlaps with the spectrum of the probe. In this case, the fluorescence detected at 520 nm can be contaminated by the contribution from the test molecule, leading to a false negative.
11. We determined that 1% v/v DMSO did not affect our FRET system, so for convenience we did not include DMSO control groups in our experiment.
12. One can define the hit criteria according to the goals of the screen. In our case, we used 70% fluorescence intensity compared to the negative controls as the hit criterion. This criterion is roughly 5–10% above the relative fluorescence of the 100 mM K+ positive control group.
13. Besides technical artifacts, there are some cases that are not suitable for the FRET assay described here: (1) As mentioned in Note 10, the intrinsic absorption or fluorescence of test ligands may dramatically affect the FRET screening results. (2) The ligand molecule may interact with FAM rather than the G4 structure. In this case, a decrease in the fluorescence intensity of FAM reflects an interaction with the fluorescent dye, not the G4 structure, generating false-positive ligands. One needs to confirm the G4 interaction with a non-labeled oligonucleotide, using fluorescence-independent techniques such as CD, gel electrophoresis, ITC, SPR, and NMR. (3) The small molecule may interact with BHQ1, which can hinder the identification of G4-interactive compounds because the BHQ1 acceptor loses its function. (4) One may encounter intermediate cases, in which binding to the labeled oligonucleotide differs from binding to the unlabeled oligonucleotide: the presence of fluorescent tags may affect the accessibility of the ligand. Regardless, all hit compounds should be subjected to rigorous secondary assays that are unrelated to the primary assay for hit validation. It is also suggested to screen all hits with cheminformatics filters to triage any known problematic scaffolds such as pan-assay interference compounds (PAINS) or known aggregators.


Acknowledgments

This research was supported by the National Institutes of Health (R01CA177585 (DY), and P30CA023168 (Purdue Center for Cancer Research)). We thank Dr. Clement Lin, Dr. Buket Onel, and Dr. Jonathan Dickerhoff for helpful discussion and proofreading the manuscript.

References

1. Chen Y, Yang D (2012) Sequence, stability, and structure of G-quadruplexes and their interactions with drugs. Curr Protoc Nucleic Acid Chem. Chapter 17: Unit 17.5. https://doi.org/10.1002/0471142700.nc1705s50
2. Neidle S (2017) Quadruplex nucleic acids as targets for anticancer therapeutics. Nat Rev Chem 1(5):0041
3. Bochman ML, Paeschke K, Zakian VA (2012) DNA secondary structures: stability and function of G-quadruplex structures. Nat Rev Genet 13(11):770
4. Hänsel-Hertsch R, Di Antonio M, Balasubramanian S (2017) DNA G-quadruplexes in the human genome: detection, functions and therapeutic potential. Nat Rev Mol Cell Biol 18(5):279
5. Hänsel-Hertsch R, Beraldi D, Lensing SV, Marsico G, Zyner K, Parry A, Di Antonio M, Pike J, Kimura H, Narita M (2016) G-quadruplex structures mark human regulatory chromatin. Nat Genet 48(10):1267
6. Biffi G, Tannahill D, McCafferty J, Balasubramanian S (2013) Quantitative visualization of DNA G-quadruplex structures in human cells. Nat Chem 5(3):182
7. Biffi G, Di Antonio M, Tannahill D, Balasubramanian S (2014) Visualization and selective chemical targeting of RNA G-quadruplex structures in the cytoplasm of human cells. Nat Chem 6(1):75–80
8. Neidle S (2016) Quadruplex nucleic acids as novel therapeutic targets. J Med Chem 59(13):5987–6011
9. Balasubramanian S, Hurley LH, Neidle S (2011) Targeting G-quadruplexes in gene promoters: a novel anticancer strategy? Nat Rev Drug Discov 10(4):261
10. Wang K-B, Li D-H, Hu P, Wang W-J, Lin C, Wang J, Lin B, Bai J, Pei Y-H, Jing Y-K (2016) A series of β-carboline alkaloids from the seeds of Peganum harmala show G-quadruplex interactions. Org Lett 18(14):3398–3401
11. Amrane S, Kerkour A, Bedrat A, Vialet B, Andreola M-L, Mergny J-L (2014) Topology of a DNA G-quadruplex structure formed in the HIV-1 promoter: a potential target for anti-HIV drug development. J Am Chem Soc 136(14):5249–5252
12. Felsenstein KM, Saunders LB, Simmons JK, Leon E, Calabrese DR, Zhang S, Michalowski A, Gareiss P, Mock BA, Schneekloth JS Jr (2015) Small molecule microarrays enable the identification of a selective, quadruplex-binding inhibitor of MYC expression. ACS Chem Biol 11(1):139–148
13. Kang H-J, Cui Y, Yin H, Scheid A, Hendricks WP, Schmidt J, Sekulic A, Kong D, Trent JM, Gokhale V (2016) A pharmacological chaperone molecule induces cancer cell death by restoring tertiary DNA structures in mutant hTERT promoters. J Am Chem Soc 138(41):13673–13692
14. Qin H, Zhao C, Sun Y, Ren J, Qu X (2017) Metallo-supramolecular complexes enantioselectively eradicate cancer stem cells in vivo. J Am Chem Soc 139(45):16201–16209
15. Xu H, Di Antonio M, McKinney S, Mathew V, Ho B, O'Neil NJ, Dos Santos N, Silvester J, Wei V, Garcia J (2017) CX-5461 is a DNA G-quadruplex stabilizer with selective lethality in BRCA1/2 deficient tumours. Nat Commun 8:14432
16. Förster T (1959) 10th Spiers Memorial Lecture. Transfer mechanisms of electronic excitation. Discuss Faraday Soc 27:7–17
17. Masuko M, Ohuchi S, Sode K, Ohtani H, Shimadzu A (2000) Fluorescence resonance energy transfer from pyrene to perylene labels for nucleic acid hybridization assays under homogeneous solution conditions. Nucleic Acids Res 28(8):e34
18. Juskowiak B, Takenaka S (2006) Fluorescence resonance energy transfer in the studies of guanine quadruplexes. In: Fluorescent energy transfer nucleic acid probes. Springer, pp 311–341
19. Yeung AT, Holloway BP, Adams PS, Shipley GL (2004) Evaluation of dual-labeled fluorescent DNA probe purity versus performance in real-time PCR. BioTechniques 36(2):266–275

Chapter 20

Targeting G-Quadruplexes with PNA Oligomers

Bruce A. Armitage

Abstract

The growing interest in G-quadruplex (G4) structure and function is motivating intense efforts to develop G4-binding ligands. This chapter describes the design and testing of peptide nucleic acid (PNA) oligomers, which can bind to G4 DNA or RNA in two distinct ways, leading to formation of heteroduplexes or heteroquadruplexes. Guidelines for designing G4-targeting PNAs and step-by-step protocols for characterizing their binding through biophysical or biochemical methods are provided.

Key words G-quadruplex, PNA, γPNA, Hybridization, UV melting, Thermal difference spectroscopy, Circular dichroism, Luciferase reporter assay

1

Introduction

With each passing year, our understanding of the complexity of genome structure and the mechanisms by which gene expression is regulated continues to grow. The guanine quadruplex (G4) [1] is a non-duplex DNA and RNA structure that is believed to play important regulatory roles at all stages of gene expression, including transcription, splicing, mRNA export and localization, translation and miRNA maturation [2–4]. The increased scrutiny of G4 biology is motivating intense efforts to design and synthesize molecules capable of binding G-quadruplexes and modulating their function.

Molecular recognition of G4 structures can be approached from either a shape- or sequence-based perspective (Fig. 1). Small molecules and proteins (e.g., antibodies) recognize the three-dimensional shape of a quadruplex, finding potential elements of complementarity in the planar tetrad surfaces, the concave grooves and the loops that connect adjacent corners of the structure. The challenge of shape-based recognition, particularly via small molecules, is generalization: once a small molecule has been identified that can bind a particular quadruplex with high affinity and selectivity, translating that result to a different quadruplex is not necessarily straightforward [5].

Danzhou Yang and Clement Lin (eds.), G-Quadruplex Nucleic Acids: Methods and Protocols, Methods in Molecular Biology, vol. 2035, https://doi.org/10.1007/978-1-4939-9666-7_20, © Springer Science+Business Media, LLC, part of Springer Nature 2019


Fig. 1 Recognition modes for G-quadruplexes. Small molecules bind via shape-based recognition, whereas oligonucleotides bind via sequence-based recognition and can form either complementary heteroduplexes or homologous heteroquadruplexes

In contrast, sequence-based recognition requires that the G4 structure be disrupted in order to "read" the H-bonding groups on the individual nucleotides. There are actually two sequence-based approaches to recognizing G4s: complementary oligonucleotides can bind via standard Watson-Crick base pairing to form heteroduplexes, whereas homologous oligonucleotides can form heteroquadruplexes via G-tetrad formation. In principle, any synthetic oligonucleotide can be designed to recognize G4s by either complementary or homologous hybridization. This chapter highlights our work with peptide nucleic acid (PNA) oligomers, which can successfully invade folded G4 structures via both binding modes.

The presence of a stable quadruplex fold in the target nucleic acid imposes kinetic and thermodynamic barriers to hybridization by synthetic oligonucleotides. The lack of a negative charge on the PNA backbone and the documented high affinity of PNA [6, 7] and, even more so, its second-generation analogue γPNA [8, 9] (Chart 1), allow oligomers based on these backbones to successfully target G4s to form stable heteroduplex and heteroquadruplex structures. In fact, both complementary and homologous PNAs have been shown to invade stable DNA and RNA G4s with low nanomolar KD values and show potent inhibition of G4-dependent biochemical activity, e.g., DNA polymerase-mediated primer extension [10] and mRNA translation [11]. Factors to consider when designing a PNA (or other synthetic oligonucleotide) to target a quadruplex are described in Subheading 3.1.

2

Materials

2.1

PNA Oligomers

The original patent covering PNA oligomers has expired, so custom PNAs can be obtained from various suppliers, particularly peptide synthesis companies. PNA amino acid monomers are also commercially available, allowing researchers with even modest organic synthesis capabilities to make their own PNA oligomers in-house by standard solid-phase peptide synthesis protocols [12, 13]. γPNAs are more difficult to obtain. There is only one commercial supplier of custom γPNA oligomers as of this writing: Panagene, Inc. and its affiliates. The supplier provides various options in terms of the side chain that extends from the gamma carbon, primarily allowing the user to introduce charges (positive or negative) or nonionic, water-solubilizing groups such as mini-polyethylene glycol (miniPEG). While the advantages of γPNA include enhanced affinity (due to a right-handed helical pre-organized structure imposed by the chiral backbone [8]) and improved water solubility for charged or miniPEG substituents [9], custom oligomers can be quite expensive (>$500 per oligomer).

2.2

Buffer Selection

A variety of buffers can be used for these experiments. The only truly important factor to keep in mind is the concentration of potassium (usually introduced as KCl) in the buffer. Potassium cations strongly stabilize G4 structures, whereas lithium cations are destabilizing [14]. Thus, maintaining the buffer pH and ionic strength while varying the cation between potassium and lithium allows one to tune the stability of the G4 target. While high lithium concentrations are not physiologically relevant, experiments using lithium as a replacement for potassium can provide compelling support for the presence and importance of a quadruplex on a biological process such as transcription or translation.

2.3

Instrumentation

The methods described in the next section can be performed using instruments and equipment that are commonly available in labs or departments where biophysical and biochemical experiments are routine. Relatively inexpensive UV–vis spectrometers equipped with a Peltier thermoelectrically controlled cell-holder (for temperature regulation) can be used for UV melting curves and thermal difference spectroscopy. A number of manufacturers make such instruments; our work has always been done with Cary (formerly Varian) instruments that hold up to six samples at a time. Similarly, fluorescence spectrometers and plate readers that are widely available are appropriate for measuring G-quadruplex recognition by labeled probes or for fluorescence/luminescence gene expression reporter assays. We typically use Tecan plate readers for luciferase assays and Cary fluorimeters for spectroscopic measurements.

Circular dichroism (CD) spectropolarimeters are more expensive than UV–vis or fluorescence spectrometers and so are more likely to be available as part of a shared instrumentation facility. Nevertheless, modern instruments such as those manufactured by JASCO are sufficiently sensitive to measure CD spectra for G-quadruplexes and their complexes with other molecules at low micromolar concentrations. These instruments are also capable of measuring CD melting curves, in which the CD signal at a single wavelength is monitored as a function of temperature, providing a complementary technique to UV melting curves.

3

Methods

This section describes factors to consider in designing a G4-targeting PNA and gives a brief description of the methods typically used to characterize the hybridization.

3.1

Design

PNAs can be designed to be either complementary or homologous to their G4 targets. In both cases, the PNA can be expected to bind with low nanomolar affinity.

For complementary hybridization, we recently demonstrated that a γPNA targeting the first two G-tracts and five nucleotides adjacent to the G4 sequence successfully invaded an RNA quadruplex [11]. In principle, it should even be possible to target only the first G-tract of a G4 sequence and still disrupt the structure since each of the tetrads would be destabilized. The binding site of the PNA can be shifted throughout the G4 without significantly compromising affinity. The advantage of targeting flanking regions is that they provide opportunities for improved selectivity in hybridization since these regions are not conserved among different quadruplexes within a genome or transcriptome, whereas the G4 sequence motif itself (e.g., GGG-Xn-GGG-Xn-GGG-Xn-GGG, where Xn refers to loop nucleotides) has a high degree of similarity among various quadruplexes. In addition, targeting the flanking region can accelerate hybridization to a quadruplex in cases where these nucleotides are more accessible to the probe than those in the folded G4 structure. As with any antisense experiment, the length and sequence of the PNA should be varied and optimized, but one can expect sub-micromolar KD and IC50 values for probe lengths >10 residues.

For homologous hybridization, we typically use two G-tracts separated by an abasic linker, e.g., H-GGG-Ln-GGG-LysNH2, where H = unsubstituted amino terminus, LysNH2 = C-terminal lysine amide, L = 8-amino-3,6-dioxa-octanoic acid and n = 1–3 [15]. (The abasic linker reduces off-target hybridization to complementary sites.) PNAs based on this design can invade G4 DNA and RNA to yield 2:1 heteroquadruplexes. In some cases, hybridization via this mode is considerably faster than for a complementary PNA of similar affinity.

We have also compared PNA and γPNA for heteroquadruplex formation. Incorporation of a methyl group at the gamma carbon led to similar affinity for a DNA G4 target based on the MYC gene promoter sequence, regardless of whether the configuration at the gamma carbon was D or L [16]. This was interesting because the two configurations lead to opposite helicity: D = right-handed, L = left-handed. Whereas left-handed γPNA does not bind to complementary DNA [17], it does bind to homologous DNA to form a heteroquadruplex. Thus, left-handed γPNA should provide additional benefits for minimizing off-target hybridization to complementary DNA or RNA, particularly when combined with an abasic linker between the two Gn tracts.

3.2

Characterization

A range of biophysical and biochemical methods can be used to assess hybridization to G4 DNA and RNA targets. A few of the most common methods are described here.

3.2.1 UV Melting Analysis

This method relies on changes in UV absorbance at particular wavelengths as a function of temperature. For example, dissociation (i.e., "melting") of the two strands of a Watson-Crick duplex is readily observed by monitoring the UV absorbance at 260 nm while heating the sample. When the duplex melts, a noticeable increase in absorbance (i.e., hyperchromic effect) is typically observed. In contrast, melting of a quadruplex is monitored at 295 nm, where a decrease in absorbance (i.e., hypochromic effect) is observed [18]. Thus, if a quadruplex is successfully invaded by a complementary PNA to form a PNA-DNA heteroduplex, the hypochromic quadruplex melting transition at 295 nm will be replaced by a hyperchromic duplex melting transition at 260 nm. In contrast, invasion by a homologous PNA to form a heteroquadruplex does not eliminate the hypochromic transition at 295 nm, but it does usually shift it to higher temperature, since PNA-DNA and PNA-RNA heteroquadruplexes are more stable than the isolated DNA or RNA G4 targets.

UV Melting Procedure

1. DNA or RNA oligonucleotides capable of forming G4 structures are purchased from commercial suppliers.
2. Samples are typically prepared in the buffer of choice, e.g., 10 mM Tris–HCl, 100 mM KCl, 0.1 mM EDTA (pH = 7.0). Strand concentrations of 1–10 μM are commonly used for these experiments. Typical samples are made to 800–1000 μL volume in 1 cm quartz cuvettes (available from Starna Cells and other suppliers).
3. Mix thoroughly by inverting the capped cuvette several times.
4. Place the sample in the instrument and equilibrate at 90 °C for at least 5 min.
5. Cool the sample to 15 °C at a rate of 1 °C/min while collecting UV absorbance data at periodic intervals (e.g., every 0.5 or 1.0 °C), then repeat the procedure while heating the sample back to high temperature (see Note 1).

3.2.2 Thermal Difference Spectroscopy and Circular Dichroism Spectropolarimetry

These methods provide additional confirmation of PNA strand invasion and the nature of the hybrid that is formed. Conveniently, TDS and CD spectra can be measured on the same samples used for UV melting analysis.

Thermal difference spectroscopy (TDS) involves measuring the UV absorbance spectrum of a sample at high (e.g., 90 °C) and low (e.g., 20 °C) temperatures, then subtracting the low temperature spectrum from the high temperature spectrum [19]. G-quadruplexes exhibit a diagnostic negative peak at 295 nm that disappears upon hybridization of a complementary PNA. (Hybridization of a homologous PNA does not eliminate this feature since the resulting hybrid is also a quadruplex.)

TDS Procedure

1. Prepare a solution of the desired strand concentration in the buffer of choice. Typical samples are made to 800–1000 μL volume in 1 cm quartz cuvettes.
2. Mix thoroughly by inverting the capped cuvette several times.
3. Place the sample in the instrument and equilibrate at 90 °C for at least 5 min.
4. Record the UV spectrum from 200 to 320 nm.
5. Cool the sample to 20 °C at a rate of 1 °C/min and equilibrate at 20 °C for at least 5 min (see Note 2).
6. Record the UV spectrum from 200 to 320 nm.
7. Subtract the 20 °C spectrum from the 90 °C spectrum using a standard graphics package (e.g., Origin or Excel) to calculate the thermal difference spectrum.
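Both the TDS subtraction in step 7 and a first look at the melting curves from Subheading 3.2.1 reduce to simple list arithmetic. The sketch below is illustrative only. The function names are hypothetical, the spectra are assumed to share one wavelength grid, and locating the steepest point of the curve is one common way to estimate a transition midpoint, not a method prescribed in this chapter. The `two_state_curve` helper only generates synthetic test data.

```python
import math


def thermal_difference_spectrum(wavelengths_nm, abs_high_temp, abs_low_temp):
    """Subtract the low-temperature spectrum from the high-temperature one."""
    if not (len(wavelengths_nm) == len(abs_high_temp) == len(abs_low_temp)):
        raise ValueError("spectra must share one wavelength grid")
    return [
        (wl, high - low)
        for wl, high, low in zip(wavelengths_nm, abs_high_temp, abs_low_temp)
    ]


def steepest_point(temperatures_c, absorbances):
    """Temperature at which |dA/dT| is largest (central differences)."""
    best_temp, best_slope = None, -1.0
    for i in range(1, len(temperatures_c) - 1):
        slope = abs(
            (absorbances[i + 1] - absorbances[i - 1])
            / (temperatures_c[i + 1] - temperatures_c[i - 1])
        )
        if slope > best_slope:
            best_temp, best_slope = temperatures_c[i], slope
    return best_temp


def two_state_curve(temperatures_c, tm, width, a_folded, a_unfolded):
    """Synthetic sigmoidal melting curve, used here only as test data."""
    curve = []
    for t in temperatures_c:
        unfolded = 1.0 / (1.0 + math.exp(-(t - tm) / width))
        curve.append(a_folded + (a_unfolded - a_folded) * unfolded)
    return curve


if __name__ == "__main__":
    temps = [15.0 + i for i in range(76)]  # 15-90 C in 1 C steps
    a295 = two_state_curve(temps, tm=60.0, width=3.0, a_folded=0.5, a_unfolded=0.4)
    print("steepest point (C):", steepest_point(temps, a295))
```

For a quadruplex monitored at 295 nm, the difference at that wavelength should come out negative, matching the diagnostic feature described above. Real melting data are noisier than the synthetic curve, so smoothing or curve fitting may be needed before the steepest point is meaningful.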

CD

Meanwhile, circular dichroism (CD) spectropolarimetry measures the difference in the amount of right- and left-handed circularly polarized light that a sample absorbs. Chiral species such as nucleic acids and their complexes exhibit nonzero CD spectra [20]. The CD spectrum of a G4 differs with the morphology of the quadruplex; morphologies are distinguished by the relative orientations of the four G-tracts and can be classified as parallel, antiparallel, or hybrid [21]. If a complementary PNA invades the quadruplex, the CD spectrum of the resulting heteroduplex is usually significantly different from that of the original quadruplex. On the other hand, if a homologous PNA forms a heteroquadruplex, the resulting CD spectrum normally resembles that of a parallel G4 rather than either antiparallel or hybrid [15, 22, 23]. Thus, if the G4 DNA or RNA is already in a parallel structure, the CD spectrum of the PNA-DNA or PNA-RNA heteroquadruplex can be very similar.

CD Procedure

1. Prepare a solution of the desired strand concentration in the buffer of choice. Typical samples are made to 800–1000 μL volume in 1 cm quartz cuvettes.
2. Mix thoroughly by inverting the capped cuvette several times.
3. Place the sample in the instrument and equilibrate at 90 °C for at least 5 min.
4. Cool the sample to 20 °C (or the desired temperature) at a rate of 1 °C/min and equilibrate at the final temperature for at least 5 min (see Note 2).
5. Record the CD spectrum with the desired parameters; we typically use scan rates of 50 or 100 nm/min, an integration time of 1 s, and at least six accumulations, i.e., individual scans that are collected and averaged.

More sophisticated biophysical characterization of the kinetics and thermodynamics of hybridization can be obtained using surface plasmon resonance [15, 24, 25] and calorimetry [26] experiments. These methods are described elsewhere and require access to specialized instrumentation.

3.2.3 Biochemical Assays

The ultimate goal for a G4-targeting compound is to alter whatever function the quadruplex might be performing. For example, if a quadruplex is involved in regulating mRNA translation, then successful binding to the quadruplex would be expected to alter the level of protein resulting from translation of the mRNA. Alternatively, a G4 might interfere with a DNA polymerase enzyme’s ability to replicate a particular region of the genome. Hybridization of a PNA to the G4 can modulate the level of polymerase readthrough of the G4 site. Incorporating PNA into in vitro biochemical assays such as primer extension or mRNA translation requires little modification of standard methods. For example, luciferase reporter assays are commonly used to assess the function of G4s in gene expression. For an in vitro translation assay, luciferase mRNA is produced from a plasmid by transcription and purified by gel electrophoresis. The mRNA is then incubated in a cell lysate such as commercially available rabbit reticulocyte lysate supplemented with amino acid mixtures, thus providing all the factors necessary to support translation. A chemiluminescent substrate for the luciferase enzyme is then added and protein expression levels are measured based on luminescence signals. We typically incorporate PNA (or γPNA) oligomers into the workflow after purification of the transcribed mRNA. A range of

340

Bruce A. Armitage 100

75

% RLU

No G3 50

4G3

25

IC50 = 15 nM

0 0

40

80

120

160

200

[γ5'] (nM)

Fig. 2 Dose–response curves for a complementary γPNA targeted to a quadruplex inserted into the 50 -untranslated region of a luciferase reporter mRNA (filled squares) or a control mRNA lacking the quadruplex target (open squares). %RLU ¼ % relative light units, where no γPNA corresponds to 100% RLU

concentrations can be used in order to determine a dose–response curve, from which the 50% inhibitory concentration (IC50) can be identified (Fig. 2). The PNA and RNA are incubated at a desired temperature and time (e.g., 1 h at 37 °C), then added to the cell lysate to initiate translation. Results are compared to a control sample containing only RNA in order to assess the effect of the PNA.

Luciferase Reporter Assay Procedure

1. The DNA template containing the target sequence upstream of the firefly luciferase gene is amplified by polymerase chain reaction (PCR) using Taq DNA polymerase (New England Biolabs) and two sequence-specific primers.
2. The DNA amplicon is purified using a GeneJET PCR purification kit (ThermoFisher) according to the manufacturer’s instructions. The integrity and length of each PCR product are verified by electrophoresis on a 1% agarose gel.
3. The purified DNA template is transcribed in vitro using a cocktail consisting of T7 RNA polymerase (100 units) and a mixture of the ribonucleoside triphosphates (500 μM of each NTP) in a total reaction volume of 100 μL at 37 °C for 2 h.
4. The resulting mRNA transcript is purified using the GeneJET RNA clean-up Micro Kit (Thermo Scientific) according to the manufacturer’s instructions.
5. The mRNA concentration is estimated by UV spectroscopy, where the extinction coefficient of each mRNA transcript is assumed to be the sum of the molar absorptivities of each nucleobase, and the recorded absorbance (at 260 nm) is assumed to be unaffected by any (undetermined) secondary folding of the transcripts.
6. A translation reaction is typically performed by incubating the purified transcript in a mixture containing nuclease-treated rabbit reticulocyte lysate (70% v/v) (Promega), 10 μM amino acid mixtures minus leucine, 10 μM amino acid mixtures minus methionine and 20 units RNasin ribonuclease inhibitor (50 μL total reaction volume) at 30 °C for 1.5 h.
7. Experiments to examine the effect of a PNA or γPNA on luciferase production are performed by preincubating the mRNA with requisite amino acids and increasing concentrations of the PNA or γPNA in a buffer containing 79 mM KCl and 7.9 mM Tris–HCl (pH = 7.4) to a total volume of 19.5 μL at 37 °C for 1 h. Each subsequent translation reaction is started by adding 30.5 μL of the rabbit reticulocyte lysate to the preincubated mixture, and the final mixture is incubated at 30 °C for 1.5 h (see Note 4).
8. Luciferase levels are measured by incubating 10 μL of the translation products with 50 μL of a reagent cocktail (D-luciferin, Mg2+, and ATP) (Promega). Luciferase activity is estimated as relative light units (RLU) on a Tecan Infinite M1000 Plate Spectrometer, relative to a sample lacking PNA or γPNA.

3.2.4 Cellular Experiments

A more stringent test of a G4-targeting molecule is its ability to modulate G4 function in cell culture. There are already numerous methods for assessing G4 function in cells, particularly involving high-throughput sequencing, reporter gene assays, and immunoblots. A bigger concern is cell uptake. Unlike small molecules, where decades of work have led to robust empirical guidelines for achieving cell permeability, delivery of synthetic oligonucleotides into the cytoplasm and/or nucleus of a living cell is a challenge. Cationic lipid formulations such as lipofectamine are not directly useful for uncharged molecules such as PNAs, although the Corey Lab demonstrated the use of a carrier DNA, which is partially complementary to the PNA and can help transfect it into cells using cationic lipids [27]. While the nonionic peptide-like backbone of PNAs makes it difficult to use electrostatic association with transfection vehicles, it does allow facile incorporation of another cell uptake modality, namely cell-penetrating peptides (CPPs), such as the arginine-rich TAT peptide. Since both PNA and CPPs are based on amide bonds, PNA-CPP conjugates can be made by continuous synthesis, unlike other oligonucleotides where the CPP is conjugated in a separate step, often leading to low yields. A wide variety of CPPs have been shown to promote cell uptake and function of PNAs in regulating gene expression [28].


Recent work in the Glazer Lab has shown that PNAs and γPNAs can be formulated in poly(lactic-co-glycolic acid) (PLGA) nanoparticles, which effectively deliver their cargo into cells [29]. In one particularly exciting case, PLGA nanoparticles delivered a γPNA designed to invade genomic DNA as well as synthetic DNA [30]. Strand invasion of the γPNA induced gene editing of the targeted site, resulting in “repair” of a mutation that causes the blood disorder β-thalassemia in a mouse model.

4 Outlook

Sequence-based targeting approaches are attractive due to the potential to achieve high affinity and selectivity. PNAs and γPNAs exhibit high affinity hybridization to folded G-quadruplex DNA and RNA structures using either complementary or homologous sequences. Nevertheless, we anticipate that selectivity by either approach will be poor. For complementary hybridization, this is a long-standing issue. As the affinity of an oligonucleotide increases, its tolerance of mismatches under physiological conditions also increases. However, by using relatively short oligonucleotides, where mismatches can be discriminated, it is likely that there will be many perfect-match sequences present in other regions of the genome or transcriptome [31]. Fine tuning of complementary γPNA design is at an early stage and will require significant effort to define optimal guidelines. Meanwhile, homologous hybridization to form PNA-containing heteroquadruplexes also occurs with high affinity, but this binding mode is unlikely to be sufficiently selective to avoid off-target effects in a complex biological system. While binding to complementary targets can be effectively minimized using left-handed γPNA monomers and abasic loops, binding to other G-rich sequences to form heteroquadruplexes cannot be avoided. The future of sequence-based targeting of G4 sequences likely lies in the direction of chimeric recognition. In particular, it is possible that chimeric PNAs that hybridize through both complementary and homologous binding modes (Fig. 3), as shown previously for DNA [32] and 2′-O-methyl RNA oligonucleotides [32, 33], will provide maximum affinity and selectivity. Optimizing the affinities of duplex- and quadruplex-forming domains such that they only bind their target when both are present should all but eliminate off-target hybridization. Such an approach will need to consider the opposite orientational preferences for PNA hybridization: heteroduplex formation prefers to align the PNA N-terminus with the DNA/RNA 3′-terminus, whereas heteroquadruplex formation prefers the opposite alignment, i.e., PNA N-terminus with DNA/RNA 5′-terminus. Nevertheless, efforts in this direction are warranted given the growing interest in G4 function and the implications of quadruplexes for human disease.


Fig. 3 Chimeric recognition of a G-quadruplex involves simultaneous hybridization of complementary and homologous domains to form adjacent heteroduplex and heteroquadruplex structures

5 Notes

1. Comparing the heating and cooling curves obtained from UV or CD melting reveals whether there is any hysteresis in the melting/hybridization process, which might reflect a more complex reaction that requires structural reorganization of one or both strands prior to hybridization. If hysteresis is observed, melting curves should be acquired again but using a slower temperature ramp (e.g., 0.2 °C/min) to prevent slow folding/hybridization kinetics from underestimating transition temperature during cooling. Thermodynamic parameters for hybridization can also be extracted from the shape of the melting curve and/or the concentration-dependence of the transition [34], although this will not be described in more detail here.
2. The procedures given above describe sample annealing directly in the instrument (UV–vis or CD). As an alternative, samples can be annealed using 1 mL microcentrifuge tubes and a heating block set to 90 °C. After incubating for at least 5 min, the heating can be simply turned off, allowing the block to cool passively to room temperature. Tubes should be briefly centrifuged (on a small benchtop microfuge) to collect any condensed water from the upper part of the tube before transferring samples to cuvettes.
3. For luciferase assays, it is advisable to perform an experiment in which the RNA and PNA are heated to 90 °C then cooled to 37 °C over a period of at least an hour. We have found significant differences in the level of PNA inhibition of translation based on whether or not a high temperature annealing step is done, presumably due to kinetic barriers to hybridization. In one example, the IC50 value of a complementary γPNA targeted to an RNA G4 was improved four-fold after annealing, an effect we attributed to disruption of secondary structure that hindered hybridization at lower temperature [11]. While cells or animals obviously cannot be treated in this way, screening PNAs for both their kinetic as well as thermodynamic properties gives the best possible chance of identifying biologically useful molecules.
4. Although we have not made use of them, common biochemical methods such as electrophoretic mobility shift analysis, footprinting and SHAPE should be readily applicable to determination of binding sites as well as kinetic and equilibrium constants for PNAs and γPNAs when targeting longer DNA or RNA G4s. Preincubating the DNA or RNA with the PNA for a set period of time and temperature prior to treating with the nuclease reagent and subsequent polyacrylamide gel analysis would identify not only the specific nucleotides to which the PNA is hybridized but also the temperature, time and concentration dependence.
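The IC50 values mentioned in Note 3 and in Fig. 2 are read from dose–response data in which luminescence is expressed as % RLU relative to a sample lacking PNA. The chapter does not state how the curve is fitted, so the following Python sketch is only one possible approach: it assumes a simple Hill-type inhibition model and a brute-force least-squares grid search. The function names (`normalize_to_control`, `hill_response`, `fit_ic50`) and the grid ranges are hypothetical choices made for illustration, not part of the published procedure.

```python
def normalize_to_control(rlu_values, rlu_no_pna):
    """Express raw luminescence readings as % RLU of the no-PNA control."""
    return [100.0 * value / rlu_no_pna for value in rlu_values]


def hill_response(conc_nm, ic50_nm, hill=1.0):
    """Assumed model: % signal remaining at a given PNA concentration (nM)."""
    if conc_nm <= 0:
        return 100.0
    return 100.0 / (1.0 + (conc_nm / ic50_nm) ** hill)


def fit_ic50(concs_nm, percent_rlu, ic50_grid=None, hill_grid=None):
    """Grid-search least-squares fit; returns (IC50 in nM, Hill slope)."""
    if ic50_grid is None:
        # Log-spaced IC50 candidates from 0.1 nM to 10 uM (illustrative range)
        ic50_grid = [10 ** (-1 + 5 * i / 500) for i in range(501)]
    if hill_grid is None:
        # Hill slopes from 0.5 to 3.0 in steps of 0.05 (illustrative range)
        hill_grid = [0.5 + 0.05 * i for i in range(51)]
    best = None
    for ic50 in ic50_grid:
        for hill in hill_grid:
            sse = sum(
                (hill_response(c, ic50, hill) - y) ** 2
                for c, y in zip(concs_nm, percent_rlu)
            )
            if best is None or sse < best[0]:
                best = (sse, ic50, hill)
    return best[1], best[2]
```

A typical use would be to pass the PNA concentrations and the normalized % RLU values from one titration to `fit_ic50`, and to repeat the fit for annealed and non-annealed samples when comparing them as described in Note 3. A dedicated nonlinear regression routine would normally replace the grid search; the grid is used here only to keep the sketch dependency-free.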

Acknowledgments

Our work in this area has been supported by the National Institutes of Health (R01 GM58547) and the David Scaife Family Charitable Foundation (141RA01).

References

1. Gellert M, Lipsett MN, Davies DR (1962) Helix formation by guanylic acid. Proc Natl Acad Sci U S A 48:2013–2018
2. Hänsel-Hertsch R, Di Antonio M, Balasubramanian S (2017) DNA G-quadruplexes in the human genome: Detection, functions and therapeutic potential. Nat Rev Mol Cell Biol 18:279–284
3. Cammas A, Millevoi S (2017) RNA G-quadruplexes: Emerging mechanisms in disease. Nucleic Acids Res 45:1584–1595
4. Fay MM, Lyons SM, Ivanov P (2017) RNA G-quadruplexes in biology: principles and molecular mechanisms. J Mol Biol 429:2127–2147
5. Neidle S (2016) Quadruplex nucleic acids as novel therapeutic targets. J Med Chem 59:5987–6011
6. Nielsen PE, Egholm M, Berg RH, Buchardt O (1991) Sequence-selective recognition of DNA by strand displacement with a thymine-substituted polyamide. Science 254:1498–1500
7. Egholm M, Buchardt O, Christensen L, Behrens C, Freier SM, Driver DA, Berg RH, Kim SK, Nordén B, Nielsen PE (1993) PNA hybridizes to complementary oligonucleotides obeying the Watson-Crick hydrogen-bonding rules. Nature 365:566–568
8. Dragulescu-Andrasi A, Rapireddy S, Frezza BM, Gayathri C, Gil RR, Ly DH (2006) A simple gamma-backbone modification preorganizes peptide nucleic acid into a helical structure. J Am Chem Soc 128:10258–10267
9. Sahu B, Sacui I, Rapireddy S, Zanotti KJ, Bahal R, Armitage BA, Ly DH (2011) Synthesis and characterization of conformationally preorganized, (R)-diethylene glycol-containing γ-peptide nucleic acids with superior hybridization properties and water solubility. J Org Chem 76:5614–5627
10. Murphy CT, Gupta A, Armitage BA, Opresko PL (2014) Hybridization of G-quadruplex-forming peptide nucleic acids to guanine-rich DNA templates inhibits DNA polymerase η extension. Biochemistry 53:5315–5322
11. Oyaghire SN, Cherubim CJ, Telmer CA, Martinez JA, Bruchez MP, Armitage BA (2016) RNA G-quadruplex invasion and translation inhibition by antisense γPNA oligomers. Biochemistry 55:1977–1988
12. Christensen L, Fitzpatrick R, Gildea B, Petersen KH, Hansen HF, Koch T, Egholm M, Buchardt O, Nielsen PE, Coull J et al (1995) Solid-phase synthesis of peptide nucleic acids. J Pept Sci 3:175–183
13. Koch T (1999) In: Nielsen PE, Egholm M (eds) Peptide nucleic acids. Horizon Scientific Press, Norfolk, UK, pp 21–37
14. Williamson JR (1994) G-quartet structures in telomeric DNA. Annu Rev Biophys Biomol Struct 23:703–730
15. Roy S, Tanious FA, Wilson WD, Ly DH, Armitage BA (2007) High-affinity homologous peptide nucleic acid probes for targeting a quadruplex-forming sequence from a MYC promoter element. Biochemistry 46:10433–10443
16. Lusvarghi S, Murphy CT, Roy S, Tanious FA, Sacui I, Wilson WD, Ly DH, Armitage BA (2009) Loop and backbone modifications of PNA improve G quadruplex binding selectivity. J Am Chem Soc 131:18415–18424
17. Sacui I, Hsieh W-C, Manna A, Sahu B, Ly DH (2015) Gamma peptide nucleic acids: As orthogonal nucleic acid recognition codes for organizing molecular self-assembly. J Am Chem Soc 137:8603–8610
18. Mergny J-L, Phan A-T, Lacroix L (1998) Following G-quartet formation by UV-spectroscopy. FEBS Lett 435:74–78
19. Mergny J-L, Li J, Lacroix L, Amrane S, Chaires JB (2005) Thermal difference spectra: A specific signature for nucleic acid structures. Nucleic Acids Res 33:e138
20. Rodger A, Nordén B (1997) Circular dichroism and linear dichroism. Oxford University Press, Oxford
21. Vorlíčková M, Kejnovská I, Sagi J, Renčiuk D, Bednářová K, Motlová J, Kypr J (2012) Circular dichroism and guanine quadruplexes. Methods 57:64–75
22. Datta B, Schmitt C, Armitage BA (2003) Formation of a PNA2-DNA2 hybrid quadruplex. J Am Chem Soc 125:4111–4118


23. Marin VL, Armitage BA (2005) RNA guanine quadruplex invasion by complementary and homologous PNA probes. J Am Chem Soc 127:8032–8033
24. Gupta A, Lee L-L, Tanious F, Wilson WD, Ly DH, Armitage BA (2013) Strand invasion of DNA quadruplexes by PNA: Comparison of homologous and complementary hybridization. Chembiochem 14:1476–1484
25. Roy S, Zanotti KJ, Murphy CT, Tanious FA, Wilson WD, Ly DH, Armitage BA (2011) Kinetic discrimination in recognition of DNA quadruplex targets by guanine-rich heteroquadruplex-forming PNA probes. Chem Commun 47:8524–8526
26. Ratilainen T, Nordén B (2002) In: Nielsen PE (ed) Peptide nucleic acids. Methods and protocols. Humana Press, Totowa, NJ, pp 59–88
27. Hamilton SE, Simmons CG, Kathiriya IS, Corey DR (1999) Cellular delivery of peptide nucleic acids and inhibition of human telomerase. Chem Biol 6:343–351
28. Koppelhus U, Nielsen PE (2003) Cellular delivery of peptide nucleic acid (PNA). Adv Drug Deliv Rev 55:267–280
29. Bahal R, McNeer NA, Ly DH, Saltzman WM, Glazer PM (2013) Nanoparticle for delivery of antisense γPNA oligomers targeting CCR5. Artificial DNA: PNA & XNA 4:49–57
30. Bahal R, McNeer NA, Quijano E, Liu Y, Sulkowski P, Turchick A, Lu Y-C, Bhunia DC, Manna A, Greiner DL et al (2016) In vivo correction of anaemia in β-thalassemic mice by γPNA-mediated gene editing with nanoparticle delivery. Nat Commun 7:13304
31. Demidov VV, Frank-Kamenetskii MD (2004) Two sides of the coin: Affinity and specificity of nucleic acid interactions. Trends Biochem Sci 29:62–71
32. Hagihara M, Yamauchi L, Seo A, Yoneda K, Senda M, Nakatani K (2010) Antisense-induced guanine quadruplexes inhibit reverse transcription by HIV-1 reverse transcriptase. J Am Chem Soc 132:11171–11178
33. Bhattacharyya D, Nguyen K, Basu S (2014) Rationally induced RNA:DNA G-quadruplex structures elicit an anticancer effect by inhibiting endogenous eIF-4E expression. Biochemistry 53:5461–5470
34. Marky LA, Breslauer KJ (1987) Calculating thermodynamic data for transitions of any molecularity from equilibrium melting curves. Biopolymers 26:1601–1620

Chapter 21

Primer-Modified G-Quadruplex-Au Nanoparticles for Colorimetric Assay of Human Telomerase Activity and Initial Screening of Telomerase Inhibitors

Fang Pu, Jinsong Ren, and Xiaogang Qu

Abstract

G-quadruplexes formed by the 3′-overhang of guanine-rich human telomeric DNA at the ends of chromosomes have important implications for inhibiting telomerase activity. Telomerase catalyzes the elongation of telomeres by adding the telomeric repeat sequence TTAGGG onto the ends of chromosomes. Since telomerase is over-expressed in 80–90% of all known human tumors, the enzyme can be recognized as a biomarker for cancer diagnosis and as a therapeutic target. Thus, the sensitive detection of telomerase activity is essential to cancer diagnosis and therapy, and to the screening of anticancer drugs. Gold nanoparticles (AuNPs) have been widely applied as colorimetric probes owing to their unique size- and distance-dependent optical properties. Human telomerase activity can be visualized by using primer-modified Au nanoparticles. The extremely high extinction coefficients of AuNPs offer high sensitivity. Here, we describe a protocol for the preparation of primer-modified Au nanoparticles for colorimetric assay of human telomerase activity and initial screening of telomerase inhibitors.

Key words G-quadruplex, Telomerase, Gold nanoparticles, Extraction, Extension, Aggregation

1 Introduction

Guanine-rich DNA sequences can form G-quadruplexes through π-π stacking of G-tetrads, in which four guanines are arranged in a plane by Hoogsteen hydrogen bonds with the assistance of a specific metal cation [1]. Telomeres are located at the ends of chromosomes and protect the chromosome ends from deterioration and fusion [2]. Human telomeres play a significant role in genome stability, cancer, and aging. Human telomeric DNA consists of a duplex region composed of TTAGGG hexamer repeats and an extended 3′-overhang of the G-rich single strand [3, 4]. The single-stranded G-rich overhang containing four contiguous TTAGGG repeats can fold into a G-quadruplex structure in the presence of Na+ or K+ ions [5].

Danzhou Yang and Clement Lin (eds.), G-Quadruplex Nucleic Acids: Methods and Protocols, Methods in Molecular Biology, vol. 2035, https://doi.org/10.1007/978-1-4939-9666-7_21, © Springer Science+Business Media, LLC, part of Springer Nature 2019


Fang Pu et al.

Telomerase, a cellular reverse transcriptase, is composed of an integral telomerase RNA and a catalytic subunit [6]. It catalyzes the addition of multiple telomeric repeats (TTAGGG) onto the ends of chromosomes using the telomerase RNA as a template. This activity may lead to genetic abnormalities. Telomerase is activated in 80–90% of human tumors, whereas its activity is low or undetectable in most normal somatic cells. Therefore, telomerase is recognized not only as a biomarker for cancer diagnosis but also as a specific target for cancer therapy [7].

Owing to the significance of telomerase in the biomedical field, it is necessary to develop a reliable and sensitive analytical method for monitoring telomerase activity [8]. The classic telomeric repeat amplification protocol (TRAP) assay based on polymerase chain reaction (PCR) is the most widely used method [9, 10]. However, it suffers from drawbacks including a time-consuming and sophisticated experimental procedure and harsh conditions. Moreover, the assay is not easily quantitated and is subject to PCR-related artifacts [11]. Some PCR-free methods for measuring telomerase activity have also been developed [12–14]. However, shortcomings such as low sensitivity, complicated operation, and the requirement for elaborate equipment or expensive fluorescent labels remain to be solved.

Telomeric DNA that folds into G-quadruplexes cannot be elongated by telomerase, thus inhibiting telomerase activity. Small molecules that stabilize the G-quadruplex structure formed by human telomeric DNA have been considered efficient telomerase inhibitors [15]. Therefore, their design, synthesis, and screening provide a viable strategy toward anticancer drug development.

Gold nanoparticles (AuNPs) have been extensively used in analytical methods since they possess strong size- and distance-dependent optical properties and extremely high extinction coefficients [16]. AuNP-based assays present color changes in response to analytes, which can be observed by the naked eye. A visual method for measuring telomerase activity and for initial screening of telomerase inhibitors was developed (Fig. 1) [17]. In the method, 5′-thiol-functionalized telomerase substrate oligonucleotides (TS primer) are immobilized on the surface of AuNPs through robust Au-S bonds and then elongated by telomerase extracted from cancer cells. The elongated DNA primer can fold into G-quadruplex structures and protect the AuNPs from aggregation at a defined salt concentration. In the absence of telomerase, the TS primers on the surface of AuNPs are not extended, leading to salt-induced aggregation of AuNPs. In the presence of G-quadruplex ligands, telomeric DNA can form a G-quadruplex structure, which cannot be elongated by telomerase. The AuNPs without the elongated primer will aggregate. The detection of telomerase activity and screening of telomerase inhibitors can be achieved by observing

G-Quadruplex-AuNPs for Telomerase Assay


Fig. 1 Scheme of telomerase activity assay and inhibitor screening using primer-modified Au nanoparticles

the color change of AuNPs corresponding to their states of aggregation or dispersion or by using an absorbance spectrometer. In this protocol, we provide step-by-step procedures that allow researchers to determine human telomerase activity. The entire protocol takes ~4 days. First, we describe the synthesis of AuNPs and their subsequent modification with primer, which takes about 2 days. Then we describe telomerase extraction from cells, which takes half a day. We evaluate human telomerase activity by measuring the absorption spectra of AuNPs after the telomerase extension reaction using a common UV/visible spectrometer, which takes ~1 day. Throughout the procedures, sophisticated instruments and highly technical experimental operations are avoided. The protocol will benefit the development of assays of human telomerase activity and the screening of inhibitors.

2 Materials

Use ultrapure water (18.2 MΩ; Millipore Co., USA) in all experiments and to prepare all buffers.

2.1 Synthesis of AuNPs

1. Hydrogen tetrachloroaurate(III) (HAuCl4·3H2O, 99.99%) solution (50 mM) (Alfa Aesar). Store in a refrigerator at 4 °C.
2. Sodium citrate dihydrate (Na3C6H5O7·2H2O, 99%) solution (38.8 mM) (Alfa Aesar).
3. Aqua regia, a mixture of HNO3 and HCl with a molar ratio of 1:3. Concentrated HCl is about 35% and concentrated HNO3 is about 65%, so the volume ratio is usually 4 parts concentrated HCl to 1 part concentrated HNO3 (see Note 1). Keep the solution in a cool location.

2.2 Preparation of DNA-Functionalized AuNPs

1. DNA sequences and modifications (Sangon Biotechnology Inc., Shanghai, P. R. China): 5′-thiol-functionalized telomerase substrate oligonucleotides (TS primer): 5′-HS(CH2)6TTTTTTTTTTAATCCGTCGAGCAGAGTT-3′.
2. Phosphate buffered saline (PBS): 0.2 M, pH 7.0. Mix 39 mL of 31.21 g/L NaH2PO4·2H2O and 61 mL of 35.61 g/L Na2HPO4·2H2O. Store the solution at 4 °C.
3. NaCl solution: 2 M.
4. Tris–HCl buffer: 10 mM tris(hydroxymethyl)aminomethane, 10 mM NaCl, pH 7.5.

2.3 Telomerase Extraction

1. Diethyl pyrocarbonate (DEPC) solution: 0.1% v/v. Add 1 mL of DEPC to 1000 mL of double distilled water. Stir the solution until the droplets of DEPC are mixed evenly with the water (see Note 2). Keep overnight. Seal and store in a 1000 mL volumetric flask.
2. DEPC treated water: Treat 0.1% DEPC solution at 121 °C for 25 min. Seal and store (see Note 3). Use DEPC treated water for all solutions used in telomerase extraction and extension steps.
3. Ice-cold PBS: Fully dissolve 4 g of NaCl, 0.1 g of KCl, 0.1 g of KH2PO4 and 1.445 g of Na2HPO4·12H2O (or 1.3 g Na2HPO4·7H2O) in 500 mL of DEPC treated water in a 500 mL volumetric flask. Before use, place the volumetric flask in an ice bath for at least 2 h.
4. CHAPS lysis buffer: 10 mM Tris–HCl, pH 7.5, 1 mM MgCl2, 1 mM ethylene glycol tetraacetic acid (EGTA), 0.1 mM phenylmethylsulfonyl fluoride (PMSF), 5 mM β-mercaptoethanol, 0.5% (v/v) 3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonic acid (CHAPS), 10% glycerol, 0.5 M NaCl. Store the solution in a refrigerator at 4 °C. Before use, place the flask in an ice bath for at least 2 h.

2.4 Telomerase Extension

1. TRAP buffer: 20 mM Tris–HCl, pH 8.3, 1.5 mM MgCl2, 63 mM KCl, 0.005% Tween 20, 1 mM EGTA, 0.1 mg/mL bovine serum albumin (BSA). Store the solution in a refrigerator at 4 °C.
2. G-quadruplex ligands [Ni2L3]Cl4-P (L = C25H20N4) and tetra-(N-methyl-4-pyridyl)porphyrin (TMPyP4) and other chemicals are analytical grade reagents purchased from Sigma-Aldrich and used without further purification.

2.5 Colorimetric Detection of Telomerase Activity and Telomerase Inhibitors Screening

1. MgCl2 solution: 800 mM.
2. NaCl solution: 3 M.

3 Methods

3.1 Synthesis of AuNPs with a Diameter of 13 nm

1. Soak all glassware in freshly prepared aqua regia (3:1 HCl:HNO3) for 2 h. Rinse it extensively with doubly distilled H2O and then dry it in an oven.
2. Add 49 mL of ultrapure water and 1 mL of 50 mM HAuCl4 aqueous solution to a 100 mL round-bottomed flask. The final concentration of the HAuCl4 solution is 1 mM.
3. Connect a condenser to the neck of the round-bottomed flask. Bring the HAuCl4 solution to a boil in a heating mantle while stirring with a magnetic stir bar.
4. Add 5 mL of 38.8 mM sodium citrate rapidly when the HAuCl4 solution begins to reflux (see Note 4).
5. After appearance of a deep red color, reflux the mixture for an additional 15 min under vigorous stirring.
6. Cool the resulting solution to room temperature with continued stirring.
7. Determine the concentration of the AuNP solution using the absorbance value at 520 nm with an extinction coefficient of 2.7 × 10⁸ M⁻¹ cm⁻¹. The concentration of the resulting AuNP solution is 12 nM.
8. Characterize the morphology and size of the AuNPs using transmission electron microscopy (TEM). The nanoparticles should be spheres with a diameter of 13 nm and have good dispersity.
9. Store the prepared AuNPs in a refrigerator at 4 °C in the dark.
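Steps 2 and 7 above rest on two small calculations: the dilution of the HAuCl4 stock and the Beer–Lambert estimate of the AuNP concentration. The Python sketch below illustrates both under stated assumptions: a 1 cm path length by default, and an optional `dilution_factor` for samples diluted before measurement (the protocol does not say whether the sample is diluted). The function names are hypothetical and only the extinction coefficient is taken from step 7.

```python
# Extinction coefficient of 13 nm AuNPs at 520 nm, as given in step 7
EPSILON_AUNP_520 = 2.7e8  # M^-1 cm^-1


def diluted_concentration(stock_conc, stock_volume, total_volume):
    """C2 = C1 * V1 / V2; the result has the same units as stock_conc."""
    if total_volume <= 0:
        raise ValueError("total_volume must be positive")
    return stock_conc * stock_volume / total_volume


def aunp_concentration_nm(a520, path_cm=1.0, dilution_factor=1.0,
                          epsilon=EPSILON_AUNP_520):
    """Beer-Lambert estimate of AuNP concentration in nM.

    a520 is the absorbance at 520 nm of the measured (possibly diluted)
    sample; dilution_factor scales the result back to the undiluted stock.
    """
    if path_cm <= 0 or epsilon <= 0:
        raise ValueError("path_cm and epsilon must be positive")
    molar = a520 * dilution_factor / (epsilon * path_cm)
    return molar * 1e9  # convert M to nM
```

For example, `diluted_concentration(50, 1, 50)` reproduces the 1 mM HAuCl4 of step 2 (50 mM stock, 1 mL in 50 mL total), and an absorbance of about 3.24 in a 1 cm cuvette corresponds to the 12 nM quoted in step 7.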

3.2 Preparation of DNA-Functionalized AuNPs

1. Mix 100 μL of deprotected thiol-oligonucleotides and 850 μL of AuNPs solution for 16 h at room temperature under gentle stirring. The final concentration of oligonucleotides is 2–3 mM.
2. Add 50 μL of concentrated PBS (0.2 M, pH 7) to the colloid solution with hand shaking and keep for 2 h. The final concentration of PBS is 10 mM.
3. Add 26 μL of 2 M NaCl dropwise and stir the mixture. The final concentration of NaCl is 50 mM. Keep the solution for 6–8 h.
4. Add 27 μL of 2 M NaCl to bring the NaCl concentration to 100 mM. Keep the solution for 6–8 h.
5. Add a further 58 μL of 2 M NaCl to bring the NaCl concentration to 200 mM. Keep the solution for another 6–8 h.
6. Add 58 μL of 2 M NaCl. The final concentration of NaCl is 300 mM.
7. Transfer the mixture into microcentrifuge tubes.
8. To remove excess thiol-DNA, purify the DNA-functionalized AuNPs by centrifuging for 45 min at 30,230 × g. The AuNPs collect at the bottom of the tubes.
9. Remove the supernatant, and add Tris–HCl buffer (10 mM Tris–HCl, 100 mM NaCl, pH 7.5). Disperse the precipitate using a pipette.
10. Centrifuge for 45 min at 30,230 × g and then remove the supernatant.
11. Add Tris–HCl buffer and redisperse the precipitate. Centrifuge for 45 min at 30,230 × g and then remove the supernatant.
12. Disperse the DNA-functionalized AuNPs in 100 μL of 10 mM PBS.
13. Store the suspension at 4 °C until use.
14. Calculate the DNA/AuNPs ratio. Measure the absorbance spectra of the AuNPs before and after DNA modification and then normalize the spectra to an absorbance of 1 at 519 nm. Calculate the molar ratio based on the known extinction coefficients of DNA strands at 260 nm and AuNPs at 519 nm. An average loading of 50 primers per AuNP can be calculated.

3.3 Cell Culture and Telomerase Extraction (see Note 5)

1. Culture various cell lines (HeLa cervical cancer, MCF-7 breast cancer, K562 myelogenous leukemia, HepG2 hepatocellular carcinoma, HEK-293T transformed human embryonic kidney cell line, NIH-3T3 transformed mouse embryonic fibroblast cell line) in Dulbecco’s modified Eagle’s medium (DMEM) supplemented with 10% fetal calf serum. Place the cells in an incubator at 37 °C in a humidified atmosphere of 5% CO2 in air.
2. Soak the pipette tips and Eppendorf (EP) tubes in DEPC solution (0.1% v/v) away from light overnight (see Note 6). After removing the DEPC solution from the pipette tips and EP tubes, pack and wrap them. Sterilize them in a high-pressure sterilizing pot at 121 °C for 30 min and then dry them in a clean oven at 37 °C for 48 h.
3. Harvest cells that are in the exponential phase of growth into DNase-, RNase-free 1.5-mL microfuge tubes.
4. Centrifuge the microfuge tubes at 3000 × g for 10 min at 2–8 °C. Carefully remove the supernatant with a pipette.
5. Dispense 1 × 10⁶ cells in a 1.5-mL EP tube and wash with ice-cold PBS (see Note 7).
6. Centrifuge at 3000 × g for 10 min at 2–8 °C. Remove the supernatant with a pipette.
7. Resuspend the cells in 100 μL of ice-cold CHAPS lysis buffer and incubate on ice for 30 min (see Note 8).
8. Centrifuge the lysate for 20 min at 25,760 × g at 4 °C (see Note 9).
9. Carefully transfer the supernatant to a fresh 1.5-mL EP tube without disturbing the cellular debris (see Note 10).
10. Transfer the crude telomerase extract to a −80 °C freezer (see Note 11).

3.4 Telomerase Extension Reaction

1. Add 5 μL of the telomerase extract to 45 μL of RNAsecure-pretreated extension solution containing 1× TRAP buffer, 1 mM dATP, dTTP, dCTP, and dGTP, and 3 nM TS-modified AuNPs with gentle shaking by hand.
2. Incubate the solution in a water bath at 30 °C for 60 min (see Note 12).
3. For negative control experiments, heat-treat the telomerase extract in a water bath at 95 °C for 10 min (see Note 13). Add 5 μL of the heat-treated telomerase extract to 45 μL of solution containing 1× TRAP buffer, 1 mM dATP, dTTP, dCTP, and dGTP, and 3 nM TS-modified AuNPs. Incubate the mixture in a water bath at 30 °C for 60 min.

3.5 Colorimetric Detection of Telomerase Activity (see Note 14)

1. After the telomerase extension reaction, add 20 μL of 800 mM MgCl2 and 3 M NaCl to the resulting system.
2. 10 min later, dilute the solution with H2O to 180 μL.
3. Measure the absorption spectra of the reacted solution in the range 400–750 nm at room temperature (Fig. 2).

3.6 Telomerase Inhibitors Screening

1. Mix DNA binders and telomerase extract.
2. Add the mixture to the extension solution containing 1× TRAP buffer, 1 mM dATP, dTTP, dCTP, and dGTP, and 3 nM TS-modified AuNPs.
3. Perform the follow-up experiment according to the steps in Subheadings 3.4 and 3.5 (Fig. 3).
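The readout used in Subheadings 3.5 and 3.6 is the ratio of absorbance at 520 and 650 nm (A520/A650), which Fig. 2 plots against telomerase concentration with a linear region at low cell numbers. The Python sketch below shows one way this ratio could be extracted from a recorded 400–750 nm spectrum and how a straight line could be fitted to the linear region. It assumes the spectrum is available as (wavelength, absorbance) pairs and uses linear interpolation between measured points; the function names and the data layout are hypothetical, and the chapter does not specify the fitting procedure.

```python
def absorbance_at(spectrum, wavelength_nm):
    """Linearly interpolate absorbance from (wavelength_nm, absorbance) pairs."""
    points = sorted(spectrum)
    for (w0, a0), (w1, a1) in zip(points, points[1:]):
        if w0 <= wavelength_nm <= w1:
            if w1 == w0:
                return a0
            fraction = (wavelength_nm - w0) / (w1 - w0)
            return a0 + fraction * (a1 - a0)
    raise ValueError("wavelength outside the measured range")


def a520_a650_ratio(spectrum):
    """Dispersion indicator: higher values mean less aggregated AuNPs."""
    return absorbance_at(spectrum, 520.0) / absorbance_at(spectrum, 650.0)


def linear_fit(x, y):
    """Ordinary least-squares line through (x, y); returns (slope, intercept)."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need at least two paired points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x
```

In use, `a520_a650_ratio` would be applied to each spectrum recorded after salt addition, and `linear_fit` to the ratios obtained for a dilution series of telomerase extract. For inhibitor screening, the same ratio can be compared between reactions with and without ligand; how the values are normalized in Fig. 3 is not described in the text, so any normalization added here would be an assumption.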


Fig. 2 Colorimetric assay of human telomerase activity using primer-modified Au nanoparticles. (a) Photograph of color changes and (b) UV–vis spectra of AuNPs after the telomere extension reaction with the addition of salts. A–D in panel (a) represent the telomerase equivalent to 0, 1, 10, and 100 HeLa cells μL⁻¹, respectively. (c) The ratio of absorbance intensity at 520 and 650 nm (A520/A650) of AuNPs at 10 min after adding salt versus the telomerase concentration. (Inset) The linear region from 0 to 8 cells μL⁻¹

4 Notes

1. Aqua regia should be freshly prepared in a large container. Slowly add 1 volume of concentrated HNO3 to 4 volumes of concentrated HCl, stirring constantly with a glass rod. Do not add HCl to HNO3. A fuming red or yellow liquid is obtained. The preparation should be performed in a fume hood. Aqua regia is highly corrosive and must be handled with care. Wear goggles, a mask, and gloves when using aqua regia.
2. DEPC is toxic and highly volatile. Prepare DEPC solution in a fume hood. Wear a mask and gloves.
3. Aliquot DEPC treated water in small amounts before use to avoid contamination.
4. The addition of sodium citrate should be fast with vigorous stirring.
5. Attention must be paid to preventing RNase contamination. Always wear gloves and a mask during the telomerase extraction and extension steps.

G-Quadruplex-AuNPs for Telomerase Assay


Fig. 3 Inhibition assay of telomerase activity by the G-quadruplex ligands [Ni2L3]Cl4-P and TMPyP4 using the AuNP-TS colorimetric method. (a) The relative telomerase activity corresponding to 20 HeLa cells μL−1 in the presence of various concentrations of G-quadruplex ligands. The ordinate is represented by A520/A650 values and normalized. (b) Photograph of color changes of AuNPs in the presence of telomerase corresponding to 20 HeLa cells μL−1 and G-quadruplex ligands with increasing concentrations. In panel (b) the labels are as follows. A: the reaction without G-quadruplex ligand; B–E: the reaction with 1, 2, 5, and 10 μM [Ni2L3]Cl4-P; F–I: the reaction with 1, 2, 5, and 10 μM TMPyP4

6. Wear a mask and gloves when treating pipette tips and EP tubes with DEPC solution. Ensure that both the pipette tips and EP tubes are fully immersed in the DEPC solution.
7. Keep the PBS on ice until ready for use.
8. It is better to insert the EP tubes into the ice to ensure low-temperature operation.
9. The centrifugation time should be more than 20 min to better precipitate cell debris.
10. Keep the operation time as short as possible.
11. Aliquot the telomerase extract in small amounts to avoid repeated freezing and thawing.
12. The EP tubes should be capped and sealed with parafilm to avoid evaporation of the solution.


13. The EP tubes should be capped and sealed with parafilm to prevent the lid from being forced open by steam generated during heating.
14. The detection of telomerase activity is carried out at room temperature.

Acknowledgments

This work was supported by NSFC (21431007, 21533008, 21673223, 21820102009, and 21871249), and Chinese Academy of Sciences (CAS QYZDJ-SSW-SLH052).

Chapter 22

Heme·G-Quadruplex DNAzymes: Conditions for Maximizing Their Peroxidase Activity

Nisreen Shumayrikh and Dipankar Sen

Abstract

Catalytic DNAs (DNAzymes) with peroxidase-like activity have great potential in bioanalytical chemistry [1], owing to numerous advantages that DNA enzymes offer over conventional protein enzymes, including structural simplicity, low cost, thermal stability, and straightforward handling and preparation. Maximizing the efficiency of the peroxidase activity of such DNAzymes is a subject in need of review. In this chapter, we discuss the optimal experimental conditions for the peroxidase activity of these DNAzymes and describe general procedures for their utilization.

Key words Guanine quadruplex, DNAzyme, Heme, Hemin, Peroxidase activity, ABTS, Amplex Red, HRP

1 Introduction

Peroxidation is the reaction catalyzed by a large class of hemoproteins in nature, in which one-electron oxidation of various organic and inorganic substrates is enabled by peroxides, usually hydrogen peroxide [2]. In the late 1990s, the Sen Laboratory discovered that guanine-rich DNAs and RNAs that fold to form G-quadruplexes (Fig. 1), whether derived by in vitro selection (SELEX) from random sequence libraries or taken from natural genomic sequences, bind tightly to ferric heme [Fe(III)-heme], or hemin [3, 4]. Shortly after, Travascio et al. reported that such G-rich oligonucleotides are able to utilize hemin as a cofactor, in a fashion similar to natural peroxidases such as HRP, to catalyze one-electron (1 e−) peroxidation reactions [3, 5, 6]. Since the initial discoveries, many research groups have intensively investigated the catalytic properties of heme·G-quadruplex DNAzymes. Optimization of the peroxidase activity is one of the subjects that has occupied much investigative energy. Optimization includes consideration of the nature of the buffer, pH, salts, and concentrations, the nature of the oxidizing agent, different

Danzhou Yang and Clement Lin (eds.), G-Quadruplex Nucleic Acids: Methods and Protocols, Methods in Molecular Biology, vol. 2035, https://doi.org/10.1007/978-1-4939-9666-7_22, © Springer Science+Business Media, LLC, part of Springer Nature 2019


Fig. 1 (a) The structure of heme. (b) The structure of G-quadruplex. (c) A parallel folded G-quadruplex in which each blue parallelepiped represents a guanine base

substrates, as well as the effect of the topology of the DNA or RNA G-quadruplex. In this introductory section, we discuss the effect of all these factors on the peroxidase reaction as catalyzed by heme·G-quadruplex DNAzymes; this is followed by a description of the methods for studying hemin binding and for measuring the peroxidase activity of particular heme·G-quadruplex DNAzymes.

1.1 pH Dependence and Buffer Effects

Travascio et al. have systematically examined the effects of pH and of different buffer components in the reaction solution on the efficiency of peroxidation of a standard chromogenic substrate, ABTS [2,2′-azino-bis(3-ethylbenzothiazoline-6-sulfonic acid)] [3, 5–7]. The comparison in these studies is between the catalyzed reaction (in the presence of hemin·G-quadruplex) and the uncatalyzed reaction (hemin by itself, or hemin in the presence of nonbinding DNA). A key observation was a large difference in the pKa values characterizing the “alkaline transition” of the two systems: the peroxidase activity of hemin complexed to an early hemin-binding G-quadruplex, PS2.M, was linked to a pKa of 8.6, while uncomplexed, disaggregated hemin showed a corresponding pKa of 3.4–4.0 [3, 6]. In a nutshell, the heme·G-quadruplex complex showed a high level of peroxidase activity in the 6.0–9.0 pH range, whereas uncomplexed hemin showed poor activity at any pH > 4.0. The disparity between these “alkaline transition” pKa values is a signature of the superior catalysis shown by the complex at or close to neutral pH. Travascio et al.


showed further that both the catalyzed and background reactions were accelerated by the presence of nitrogenous buffers such as Tris, HEPES-ammonium, and collidine [3, 7]. The hypothesis proposed was that the nitrogenous buffer played the characteristic acid-base catalytic roles that “distal” amino acid side-chains typically play within the active sites of oxidative hemoprotein enzymes in activating hydrogen peroxide [8]. In support of this hypothesis, Li et al. [9] found that adenines positioned at the 3′ ends of hemin-binding intramolecular G-quadruplexes remarkably enhance the complex’s peroxidase activity, playing a role analogous to that of the nitrogenous buffers. Shangguan et al. investigated the effect of flanking sequences as a part of the G-quadruplex structure on peroxidase activity and reported that sequences containing d(CCC) flanking both ends of the G-quadruplex core not only show enhanced DNAzyme activity but also tolerate changes in pH well, making them more suitable for applications that involve widely ranging pH conditions [10]. Furthermore, Cheng et al. [11] investigated the effect of loop transposition, in which an adenine residue is placed at multiple locations in the loops of an intramolecular G-quadruplex. They found an approximately sixfold enhancement in the ABTS reaction rate as the adenine residue is moved closer to the 5′-side loop. The rate enhancement in this case is attributed to an increase in the hemin binding affinity of the G-quadruplex.

1.2 Cation Dependence and G4 Topology

The choice of cation significantly affects the overall stability of the final folded quadruplex. Potassium ions, in particular, have the ideal size and charge to fit between G-tetrads effectively. Thus, the ideal choice for generating a stable hemin/G-quadruplex complex is to include K+ ions in solution. Sintim and Nakayama broadly investigated the effect of different cations on the peroxidase activity of various heme·G-quadruplex DNAzymes [12]. They found that ABTS peroxidation depends strongly on the identity of the cations present in the reaction; it is optimal in the presence of ammonium ions. In fact, ammonium ions are able to play two roles: the acid-base catalysis referred to above, and specific stabilization of the folded G-quadruplex structures themselves. The ideal situation is always to measure the peroxidase activity under hemin saturation conditions. A binding experiment, as described in Subheading 3, should be conducted first in order to calculate the dissociation constant (Kd) that characterizes the affinity of hemin for a specific G-quadruplex (Fig. 2). Indeed, the hemin binding affinity of G-quadruplexes varies significantly, depending on the folding topology of a given G-quadruplex. Several studies [13–16] have found that hemin preferentially stacks on the terminal G-quartets of a parallel-stranded G-quadruplex


Fig. 2 (a) UV–Vis spectra of 0.5 μM heme titrated with 0–4 μM CatG4 in binding buffer (40 mM HEPES-NH4OH pH 8.0, 20 mM KCl, 1% DMF, 0.05% Triton X-100) at 25 °C. (b) Plots of calculated bound heme, based on the change in absorbance at 404 nm, against DNA concentration to generate the binding isotherm, and the dissociation equilibrium constants (Kd) derived from it

(Fig. 1c). In carrying out such measurements, a researcher should be mindful of the identity and concentration of cations being used for folding the G-quadruplex structure, given that different cations under different conditions can influence the overall G-quadruplex topology [17–23].

1.3 Oxidizing Agent

The next key factor to be considered is the oxidizing agent. Hydrogen peroxide is the oxidant most commonly used for heme·G-quadruplex DNAzymes. Rojas et al. have examined the relative effectiveness of oxidants other than H2O2 for heme·G-quadruplex DNAzyme activation [24]. These authors found that bulkier oxidants such as t-butyl hydroperoxide and cumene hydroperoxide were also effective at activating the heme·G-quadruplex DNAzyme [24], and so could potentially be used in place of hydrogen peroxide. However, the concentration of these stronger oxidants needs to be controlled, since they are also effective at activating hemin alone, and they damage the hemin/G-quadruplex complex more rapidly than hydrogen peroxide does.

1.4 Substrates

It is well established by now that heme·G-quadruplex DNAzymes can oxidize a broad range of substrates, including useful chromogenic [3, 12] and fluorogenic [25–27] ones. Rojas et al. [24] made the interesting observation that certain phenolic substrates, including L- and D-tyrosine, N-acetyl-L-tyrosine, and hydroxycinnamic acid, are outstanding substrates for heme·G-quadruplex


Fig. 3 The peroxidation reactions by heme/G4 DNAzyme of chromogenic ABTS (a) and fluorogenic Amplex Red (b) substrates

DNAzymes, which oxidize them even more efficiently than horseradish peroxidase does. Figure 3 shows schematics for the oxidation of a chromogenic substrate (ABTS) and a fluorogenic substrate (Amplex Red) by heme·G-quadruplex DNAzymes. In comparing these two kinds of substrates, non-fluorescent molecules such as Amplex Red and 2′,7′-dichlorodihydrofluorescein, which are efficiently oxidized by heme·G-quadruplexes to fluorescent products, offer a dynamic range far higher than chromogenic substrates do. In this regard, Nakayama and Sintim [26] reported further that 2′,7′-dichlorodihydrofluorescein diacetate is a superior reducing substrate for the fluorometric detection of bioanalytes using heme·G-quadruplex DNAzymes, not only because of its superior signal-to-noise characteristics but also because of the intrinsic brightness of the product fluorophore, which enables detection of low nanomolar concentrations of hydrogen peroxide. This is comparable or even superior to the detection limit that can be achieved with more expensive dyes, such as Amplex Red. Based on the above points, we describe in the following sections recommended general procedures for performing hemin binding experiments and the peroxidase reaction.

2 Materials

Prepare all solutions using ultrapure water (prepared by purifying deionized water to obtain a resistivity of 18 MΩ·cm at 25 °C) and analytical grade reagents. Prepare and store all reagents at room temperature (unless indicated otherwise).

1. A G-quadruplex-forming DNA oligonucleotide [such as “CatG4”: 5′-TGG GTA GGG CGG GTT GGG AAA-3′ (see Note 1)] is purified using standard desalting and gel purification methods.
2. The DNA pellet is dissolved in TE buffer: 10 mM Tris, pH 7.5, 0.1 mM EDTA.
3. Polyacrylamide gel electrophoresis is carried out in 12% denaturing gels containing 19:1 acrylamide/bisacrylamide and 7.5 M urea.
4. TBE running buffer: 50 mM Tris/Borate/EDTA.
5. Denaturing gel-loading solution: 95% formamide and 0.25% (w/v) each of Xylene Cyanol and Bromophenol Blue.
6. Hemin is purchased from Frontier Scientific (Logan, UT, USA). Hemin is dissolved in DMF (N,N-dimethylformamide).
7. ABTS [2,2′-azino-bis(3-ethylbenzothiazoline-6-sulfonic acid)] and all other chemicals are purchased from Sigma-Aldrich.
8. Binding/peroxidase buffer: 40 mM HEPES-NH4OH, pH 8.0, 20 mM KCl, 0.05% Triton X-100, 1% dimethylformamide (DMF) (see Note 2).
9. UV/Vis spectra and data are obtained using a Cary 100 UV/Vis spectrophotometer and a 500 μL quartz cuvette.
10. Data analysis and curve fitting are carried out using either GraphPad Prism 7 or OriginLab 9 software.

3 Methods

All procedures should be carried out at 20–22 °C unless otherwise specified.

3.1 DNA Oligonucleotide Size Purification by Polyacrylamide Gel Electrophoresis

1. Prepare a 2 nmol/μL solution of the DNA oligonucleotide by dissolving it in TE buffer in a microfuge/Eppendorf tube.
2. Mix 40 μL of DNA solution (~80 nmol) with a denaturing gel-loading solution.
3. Vortex and centrifuge the solution, then heat to 95 °C and cool to room temperature.


4. Fill the gel apparatus with TBE running buffer, load the denatured DNA solution into the gel, and run the gel at constant power (~10–15 W).
5. Identify the DNA band in the gel using a standard UV shadowing method: the gel is covered with transparent cling-film and placed on a fluorescent thin layer chromatography plate. The DNA band is visualized as a dark patch in the gel upon shining light from a hand-held UV lamp. The visualized DNA band is cut out of the gel using a clean, sharp razor blade.
6. Elute the DNA from the gel fragment by mincing the fragment and soaking it in an excess volume of TE buffer overnight at 4 °C.
7. Vortex and centrifuge, then recover the TE buffer containing the DNA away from the gel fragments, filtering if necessary through a 0.2-μm filter.
8. Add an equal volume of 2-butanol, mix, and centrifuge briefly. Remove the upper, 2-butanol layer (the DNA should be present in the lower, aqueous layer).
9. Repeat step 8 until the volume of the DNA solution is reduced to ~300 μL.
10. Add 10 μL of 3 M sodium acetate, pH 5.2. Mix by vortexing.
11. Add 775 μL of anhydrous ethanol. Mix and place the solution in powdered dry ice for 5–10 min.
12. Centrifuge for 30 min, then remove the supernatant.
13. Wash the DNA pellet with ice-cold 70% ethanol, centrifuge for 5 min, and carefully remove the ethanol without dislodging the DNA pellet.
14. Repeat step 13, as required.
15. Air-dry the DNA pellet, and dissolve it in TE buffer.

3.2 Preparation of the Binding and Peroxidase Reaction Buffer

The binding and the peroxidase buffer compositions are the same. The buffer is prepared as a 2× stock solution as follows:

1. Prepare a 1 M HEPES solution and adjust its pH to 8.0 with ammonium hydroxide.
2. In a 50 mL Falcon tube, add 4 mL of the 1 M HEPES-NH4OH, pH 8.0, stock solution.
3. Add 0.5 mL of 4 M KCl.
4. Add 5 mL of 1% (w/v) Triton X-100 solution and 1 mL of dimethylformamide (DMF).
5. Add 39.5 mL of deionized distilled H2O.
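The recipe above can be checked with the dilution relation C1·V1 = C2·V2. The following is a minimal sketch, not part of the original protocol; the function name and dictionary layout are illustrative assumptions, and pure DMF is treated as a 100% (v/v) stock.

```python
def diluted(stock_conc, volume_added_ml, total_volume_ml):
    """Concentration after dilution, from C1*V1 = C2*V2."""
    return stock_conc * volume_added_ml / total_volume_ml

TOTAL_ML = 50.0  # final volume of the 2x stock in the 50 mL tube

# component -> (stock concentration, volume added in mL), as listed in steps 2-4
components = {
    "HEPES-NH4OH (mM)": (1000.0, 4.0),
    "KCl (mM)": (4000.0, 0.5),
    "Triton X-100 (% w/v)": (1.0, 5.0),
    "DMF (% v/v)": (100.0, 1.0),  # assumption: neat DMF counted as 100% v/v
}

two_x = {name: diluted(c, v, TOTAL_ML) for name, (c, v) in components.items()}
one_x = {name: conc / 2.0 for name, conc in two_x.items()}
water_ml = TOTAL_ML - sum(v for _, v in components.values())

for name in components:
    print(f"{name}: 2x = {two_x[name]:g}, 1x = {one_x[name]:g}")
print(f"Water needed to reach {TOTAL_ML:g} mL: {water_ml:g} mL")
```

Under these assumptions the 1× values reproduce the binding/peroxidase buffer composition given in Materials, and the water volume matches step 5.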

3.3 UV–Vis Hemin Binding Assay

The recommended final concentration of hemin in this assay is no less than 0.5 μM (see Note 3).

1. Prepare a hemin stock solution at 10 mM in DMF. Make a dilution of the stock to 5 mM in 100 μL, then aliquot 10 μL into each of ten dark Eppendorf/microfuge tubes, and store them at −20 °C until needed (see Note 4).
2. Prepare 100 μL of each stock concentration of the G-quadruplex-forming oligonucleotide in TE buffer: 5, 10, 15, 20, 25, 30, 37.5, 50, 62.5, 75, 87.5, 125 μM, and perhaps higher concentrations of DNA (see Note 5).
3. Add 3.5 mL of 2× binding buffer (prepared as described in Subheading 3.2) to a 15 mL Falcon tube.
4. Add 140 μL of 25 μM hemin to the solution and mix by vortexing.
5. Divide the mixture so that 260 μL is aliquoted into each of 12 Eppendorf tubes.
6. To each solution, add 20 μL of the DNA solution to reach the various desired DNA concentrations. Include one sample with no DNA added as a control; for this sample, add an equivalent volume of TE buffer instead of DNA solution.
7. Leave the mixture for at least 20 min at room temperature to ensure proper folding of the G-quadruplex structure and to allow binding between the hemin and the G-quadruplex to reach equilibrium.
8. Add 220 μL of ddH2O so that the total volume of the solution mixture is 500 μL (see Note 6).
9. Transfer the content of each Eppendorf tube into a UV–Vis quartz cuvette and record the absorption spectrum from 200 to 800 nm (see Note 7).
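As a worked check of the volumes above, the final concentrations in each 500 μL sample can be computed by successive dilution. This is a minimal sketch under the volumes stated in steps 3–8; the helper name and variable names are illustrative assumptions.

```python
def dilute(conc, volume_added_ul, total_volume_ul):
    """Concentration after dilution, from C1*V1 = C2*V2."""
    return conc * volume_added_ul / total_volume_ul

# Master mix: 3.5 mL of 2x buffer plus 140 uL of 25 uM hemin (steps 3-4)
master_ul = 3500.0 + 140.0
hemin_master_uM = dilute(25.0, 140.0, master_ul)
buffer_master_x = dilute(2.0, 3500.0, master_ul)

# Each sample: 260 uL master mix + 20 uL DNA + 220 uL water (steps 5-8)
FINAL_UL = 260.0 + 20.0 + 220.0
hemin_final_uM = dilute(hemin_master_uM, 260.0, FINAL_UL)
buffer_final_x = dilute(buffer_master_x, 260.0, FINAL_UL)

dna_stocks_uM = [5, 10, 15, 20, 25, 30, 37.5, 50, 62.5, 75, 87.5, 125]
dna_final_uM = [dilute(s, 20.0, FINAL_UL) for s in dna_stocks_uM]

print(f"hemin: {hemin_final_uM:.3f} uM, buffer: {buffer_final_x:.3f}x")
print("DNA series (uM):", [round(c, 2) for c in dna_final_uM])
```

With these volumes the hemin comes out at the recommended 0.5 μM in 1× buffer, and each DNA stock is diluted 25-fold in the cuvette.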

3.4 Calculation of Dissociation Equilibrium Constant

1. Use the absorbance data from the above titration experiment to plot saturation binding curves, plotting absorbance changes in the hemin Soret band (404 nm) as a function of DNA concentration.
2. Calculate dissociation equilibrium constants (Kd) by fitting the binding isotherm using nonlinear regression (OriginLab 9) with the following equation described by Wang et al. (see ref. 28 for the derivation of the equation):

$$[\mathrm{DNA}]_0 = K_d\,\frac{A - A_0}{A_\infty - A} + [P_0]\,\frac{A - A_0}{A_\infty - A_0}$$

3. The parameters of Wang’s equation are as follows: $[\mathrm{DNA}]_0$ is the initial concentration of DNA and $[P_0]$ is the initial concentration of monomeric hemin (the concentration of hemin was calculated using ε398 = 80,000 M−1 cm−1). $A_\infty$ indicates the maximum hemin absorbance and $A_0$ the initial absorbance in the absence of DNA.
4. Enter the equation manually into the software and provide values for the known fixed parameters $A_0$, $A_\infty$, and $[P_0]$ prior to starting the analysis.

3.5 Peroxidase Activity Measurement Using ABTS as a Substrate

1. Add 250 μL of the peroxidation buffer described in Subheading 3.2 to a 1.6 mL Eppendorf tube.
2. Prepare 50 μM of CatG4 oligonucleotide in TE buffer. Add 20 μL of the G-quadruplex-forming oligonucleotide to the reaction buffer.
3. Allow the DNA to fold by leaving the sample for 10 min at room temperature.
4. Add 10 μL of 5 μM hemin solution in DMF to the mixture.
5. Wait for at least 20 min to allow for hemin/G-quadruplex binding.
6. Add 50 μL of 50 μM ABTS solution in ddH2O (see Note 8). Add 30 μL of ddH2O so that the total volume now is 400 μL. Mix the solution and centrifuge for 1 min.
7. Transfer the whole mixture to a 500 μL quartz cuvette.
8. Add 100 μL of 5 mM H2O2 to initiate the reaction, then gently mix the solution, taking care not to generate bubbles while mixing.
9. Monitor the change in absorbance at 414 nm.
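Returning to the Kd calculation of Subheading 3.4: with the absorbance limits and the hemin concentration held fixed, as step 4 there prescribes, Wang's equation is linear in Kd, so a closed-form least-squares estimate is possible. The sketch below is an illustrative simplification and not the OriginLab procedure used in the chapter; it minimizes the error in the total DNA concentration rather than in absorbance, and all function names and numerical values are hypothetical.

```python
def wang_dna_total(A, Kd, A0, Ainf, P0):
    """Total DNA concentration predicted by Wang's equation for absorbance A."""
    return Kd * (A - A0) / (Ainf - A) + P0 * (A - A0) / (Ainf - A0)

def fit_kd(dna_total, absorbance, A0, Ainf, P0):
    """Least-squares Kd with A0, Ainf, and P0 fixed (model is linear in Kd).

    Points with A equal to Ainf must be excluded, since (Ainf - A) is a divisor.
    """
    num = 0.0
    den = 0.0
    for d, A in zip(dna_total, absorbance):
        x = (A - A0) / (Ainf - A)             # multiplies Kd in the equation
        y = d - P0 * (A - A0) / (Ainf - A0)   # DNA not accounted for by bound hemin
        num += x * y
        den += x * x
    return num / den

# Synthetic, noise-free data (hypothetical values, concentrations in uM)
A0, AINF, P0, KD_TRUE = 0.050, 0.100, 0.5, 0.3
absorbances = [0.060, 0.070, 0.080, 0.090]
dna_totals = [wang_dna_total(A, KD_TRUE, A0, AINF, P0) for A in absorbances]

kd_fit = fit_kd(dna_totals, absorbances, A0, AINF, P0)
print(f"Fitted Kd = {kd_fit:.3f} uM")
```

For real, noisy titrations a nonlinear fit in the absorbance domain, as described in Subheading 3.4, weights the data differently and may be preferable; the closed form is mainly useful as a quick cross-check of the software result.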

4 Notes

1. CatG4 (“Catalytic Guanine quadruplex”) is a variant of the PS2.M sequence (5′-GTG GGT AGG GCG GGT TGG-3′) that emerged from SELEX [4, 29, 30]. Both oligonucleotides fold into parallel G4 structures, but CatG4 has three additional adenines at the 3′-side, offering both greater stability and higher peroxidase rates when complexed with hemin [31]. Yet more sophisticated active sites can be built around the bound hemin, and such an approach can be used to diversify and improve the catalytic activities of heme·G-quadruplex DNAzymes. For example, Willner and colleagues [32] showed that a heme·G-quadruplex DNAzyme conjugated to an aptamer (such as a dopamine-binding DNA aptamer, DBA) at either its 3′- or 5′-end exhibited more efficient oxidative catalysis of dopamine than the wild-type heme·G-quadruplex DNAzyme.
2. The nonionic detergent Triton X-100 and DMF are included in the binding/reaction buffer to prevent hemin aggregation and to increase hemin’s solubility in aqueous buffers [3]. The final concentration of Triton X-100 should be in the range of 0.03–0.05% (w/v). We observed a decline in the peroxidase activity above 0.05% [3]. Very high Triton X-100 concentrations could begin to compete for and remove bound hemin from the DNA-bound complex, so it is crucial not to exceed the maximum concentration of 0.05% (w/v). By contrast, the heme·G-quadruplex DNAzyme operates well in certain mixtures of aqueous/organic solvents, and this enables the investigator to include organic solvents in the reaction buffer to enhance the solubility of both hemin and hydrophobic substrates. We find that the oxidative properties of a heme·G-quadruplex DNAzyme are notably enhanced in aqueous solutions containing 20–30% (v/v) methanol or formamide. However, aqueous dimethylformamide solutions containing >1% DMF largely inhibit heme·G-quadruplex DNAzyme activity [33]. Methanol is an optimal “green solvent” for oxidizing poorly water-soluble, industrially relevant compounds. We determined that a 60% (v/v) methanol-water mixture gave a strongly optimized yield of the dibenzothiophene sulfoxide (DBTO) oxidation product of petroleum-derived dibenzothiophene [33].
3. Hemin concentrations lower than 0.5 μM are not optimal for obtaining reliable and reproducible UV–Vis absorption data. Also, we always use a fixed hemin concentration and titrate the DNA to avoid any complications that might arise from hemin insolubility.
4. Hemin is both light- and heat-sensitive; therefore, we always recommend preparing multiple aliquots of the original stock, storing them at −20 °C, and using one at a time to avoid freeze-thaw cycles that could change the hemin concentration.
5. Prepare enough DNA solution for repeating the binding experiment, so as to generate at least three experimental replicates. If the G-rich DNA is stored at −20 °C, there is generally a chance of partial formation of secondary structures on thawing. Therefore, to ensure proper folding of the DNA into the required G-quadruplex, it is recommended to heat the DNA to 90 °C and then allow it to cool down to room temperature gradually. It is best to use the folded DNA promptly at this point.
6. The total volume of the binding or peroxidation solutions is not restricted to 500 μL, as described above for this particular assay. A researcher may reduce the volume to 100 μL to save expensive materials. We do not recommend a volume of less than 100 μL because of the presence of Triton X-100 in solution, as bubbles might complicate the UV–Vis measurement, especially with small-sized cuvettes.
7. Prior to measurement, start up the UV–Vis spectrophotometer and let it stabilize for at least 30 min. It is also important to set the baseline on the spectrophotometer prior to recording a spectrum. A cuvette containing an equal volume of the binding buffer can be used to establish the baseline.
8. For peroxidase activity using ABTS as the reducing substrate, we recommend a high concentration of ABTS (5 mM), since ABTS•+ is subject to disproportionation. To increase the sensitivity of the assay, we recommend using fluorogenic substrates such as Amplex Red and 2′,7′-dichlorodihydrofluorescein, which can be used at significantly lower concentrations [26, 27].
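Because the ABTS radical cation is not stable over long times (Note 8), the kinetic trace recorded at 414 nm in Subheading 3.5 is usually reduced to an initial rate taken from its early, linear portion. The sketch below shows one way to do this with a least-squares slope and the Beer-Lambert law. The extinction coefficient of 36,000 M−1 cm−1 for ABTS•+ at 414 nm is a commonly cited literature value and an assumption here, not a figure given in this chapter; the path length, function name, and data are likewise hypothetical.

```python
def initial_rate_uM_per_s(times_s, absorbances, epsilon_M_cm, path_cm=1.0):
    """Initial rate in uM/s from the least-squares slope of A414 versus time."""
    n = len(times_s)
    t_mean = sum(times_s) / n
    a_mean = sum(absorbances) / n
    sxy = sum((t - t_mean) * (a - a_mean) for t, a in zip(times_s, absorbances))
    sxx = sum((t - t_mean) ** 2 for t in times_s)
    slope_per_s = sxy / sxx  # absorbance units per second
    # Beer-Lambert: dC/dt = (dA/dt) / (epsilon * path); 1e6 converts M to uM
    return slope_per_s / (epsilon_M_cm * path_cm) * 1e6

# Assumption: literature value for ABTS radical cation at 414 nm; verify before use
EPSILON_ABTS_RADICAL = 36000.0

# Synthetic early-time trace (hypothetical): time in s, absorbance at 414 nm
times = [0.0, 10.0, 20.0, 30.0]
trace = [0.000, 0.036, 0.072, 0.108]

rate = initial_rate_uM_per_s(times, trace, EPSILON_ABTS_RADICAL)
print(f"Initial rate: {rate:.3f} uM/s")
```

Restricting the fit to the first few points, before curvature sets in, is a design choice that keeps substrate depletion and radical disproportionation from biasing the slope; the path length should be set to that of the cuvette actually used.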

Acknowledgments

We acknowledge grant funding from the Natural Sciences and Engineering Research Council of Canada (NSERC).

References

1. Kosman J, Juskowiak B (2011) Peroxidase-mimicking DNAzymes for biosensing applications: a review. Anal Chim Acta 707(1-2):7–17. https://doi.org/10.1016/j.aca.2011.08.050
2. Veitch NC (2004) Horseradish peroxidase: a modern view of a classic enzyme. Phytochemistry 65(3):249–259
3. Travascio P, Li Y, Sen D (1998) DNA-enhanced peroxidase activity of a DNA aptamer-hemin complex. Chem Biol 5(9):505–517
4. Li Y, Geyer CR, Sen D (1996) Recognition of anionic porphyrins by DNA aptamers. Biochemistry 35(21):6911–6922. https://doi.org/10.1021/bi960038h
5. Travascio P, Bennet AJ, Wang DY, Sen D (1999) A ribozyme and a catalytic DNA with peroxidase activity: active sites versus cofactor-binding sites. Chem Biol 6(11):779–787
6. Travascio P, Witting PK, Mauk AG, Sen D (2001) The peroxidase activity of a hemin-DNA oligonucleotide complex: free radical damage to specific guanine bases of the DNA. J Am Chem Soc 123(7):1337–1348
7. Travascio P, Sen D, Bennet AJ (2006) DNA and RNA enzymes with peroxidase activity. An investigation into the mechanism of action. Can J Chem 84(4):613–619. https://doi.org/10.1139/v06-057
8. Furtmuller PG, Zederbauer M, Jantschko W, Helm J, Bogner M, Jakopitsch C, Obinger C (2006) Active site structure and catalytic mechanisms of human peroxidases. Arch Biochem Biophys 445(2):199–213. https://doi.org/10.1016/j.abb.2005.09.017
9. Li W, Li Y, Liu Z, Lin B, Yi H, Xu F, Nie Z, Yao S (2016) Insight into G-quadruplex-hemin DNAzyme/RNAzyme: adjacent adenine as the intramolecular species for remarkable enhancement of enzymatic activity. Nucleic Acids Res 44(15):7373–7384. https://doi.org/10.1093/nar/gkw634
10. Chang T, Gong H, Ding P, Liu X, Li W, Bing T, Cao Z, Shangguan D (2016) Activity enhancement of G-quadruplex/hemin DNAzyme by flanking d(CCC). Chem Eur J 22(12):4015–4021. https://doi.org/10.1002/chem.201504797
11. Cheng M, Zhou J, Jia G, Ai X, Mergny J-L, Li C (2017) Relations between the loop transposition of DNA G-quadruplex and the catalytic function of DNAzyme. Biochim Biophys Acta Gen Subj 1861(8):1913–1920. https://doi.org/10.1016/j.bbagen.2017.05.016
12. Nakayama S, Sintim HO (2012) Investigating the interactions between cations, peroxidation substrates and G-quadruplex topology in DNAzyme peroxidation reactions using statistical testing. Anal Chim Acta 747:1–6. https://doi.org/10.1016/j.aca.2012.08.008
13. Saito K, Tai H, Hemmi H, Kobayashi N, Yamamoto Y (2012) Interaction between the heme and a G-quartet in a heme-DNA complex. Inorg Chem 51(15):8168–8176. https://doi.org/10.1021/ic3005739
14. Shibata T, Nakayama Y, Katahira Y, Tai H, Moritaka Y, Nakano Y, Yamamoto Y (2017) Characterization of the interaction between heme and a parallel G-quadruplex DNA formed from d(TTGAGG). Biochim Biophys Acta 1861(5 Pt B):1264–1270. https://doi.org/10.1016/j.bbagen.2016.11.005
15. Cheng X, Liu X, Bing T, Cao Z, Shangguan D (2009) General peroxidase activity of G-quadruplex-hemin complexes and its application in ligand screening. Biochemistry 48(33):7817–7823. https://doi.org/10.1021/bi9006786
16. Kong DM, Yang W, Wu J, Li CX, Shen HX (2010) Structure-function study of peroxidase-like G-quadruplex-hemin complexes. Analyst 135(2):321–326. https://doi.org/10.1039/b920293e
17. Phan AT, Patel DJ (2003) Two-repeat human telomeric d(TAGGGTTAGGGT) sequence forms interconverting parallel and antiparallel G-quadruplexes in solution: distinct topologies, thermodynamic properties, and folding/unfolding kinetics. J Am Chem Soc 125(49):15021–15027. https://doi.org/10.1021/ja037616j
18. Smith FW, Lau FW, Feigon J (1994) d(G3T4G3) forms an asymmetric diagonally looped dimeric quadruplex with guanosine 5′-syn-syn-anti and 5′-syn-anti-anti N-glycosidic conformations. Proc Natl Acad Sci U S A 91(22):10546–10550
19. Haider SM, Parkinson GN, Neidle S (2003) Structure of a G-quadruplex-ligand complex. J Mol Biol 326(1):117–125
20. Risitano A, Fox KR (2003) Stability of intramolecular DNA quadruplexes: comparison with DNA duplexes. Biochemistry 42(21):6507–6513. https://doi.org/10.1021/bi026997v
21. Sen D, Gilbert W (1990) A sodium-potassium switch in the formation of four-stranded G4-DNA. Nature 344(6265):410–414. https://doi.org/10.1038/344410a0
22. Venczel EA, Sen D (1993) Parallel and antiparallel G-DNA structures from a complex telomeric sequence. Biochemistry 32(24):6220–6228
23. Dai J, Carver M, Yang D (2008) Polymorphism of human telomeric quadruplex structures. Biochimie 90(8):1172–1183. https://doi.org/10.1016/j.biochi.2008.02.026
24. Rojas AM, Gonzalez PA, Antipov E, Klibanov AM (2007) Specificity of a DNA-based (DNAzyme) peroxidative biocatalyst. Biotechnol Lett 29(2):227–232. https://doi.org/10.1007/s10529-006-9228-y
25. Grigg JC, Shumayrikh N, Sen D (2014) G-quadruplex structures formed by expanded hexanucleotide repeat RNA and DNA from the neurodegenerative disease-linked C9orf72 gene efficiently sequester and activate heme. PLoS One 9(9):e106449. https://doi.org/10.1371/journal.pone.0106449
26. Nakayama S, Sintim HO (2010) Biomolecule detection with peroxidase-mimicking DNAzymes; expanding detection modality with fluorogenic compounds. Mol BioSyst 6(1):95–97. https://doi.org/10.1039/b916228c
27. Golub E, Freeman R, Willner I (2011) A hemin/G-quadruplex acts as an NADH oxidase and NADH peroxidase mimicking DNAzyme. Angew Chem Int Ed Engl 50(49):11710–11714
28. Wang Y, Hamasaki K, Rando RR (1997) Specificity of aminoglycoside binding to RNA constructs derived from the 16S rRNA decoding region and the HIV-RRE activator region. Biochemistry 36(4):768–779. https://doi.org/10.1021/bi962095g
29. Li Y, Sen D (1996) A catalytic DNA for porphyrin metallation. Nat Struct Biol 3(9):743–747
30. Li Y, Sen D (1997) Toward an efficient DNAzyme. Biochemistry 36(18):5589–5599. https://doi.org/10.1021/bi962694n
31. Poon LC-H (2011) Structural and catalytic properties of DNA/RNA-heme complexes. (Thesis) M.Sc. Simon Fraser University
32. Golub E, Albada HB, Liao WC, Biniuri Y, Willner I (2016) Nucleoapzymes: Hemin/G-quadruplex DNAzyme-aptamer binding site conjugates with superior enzyme-like catalytic functions. J Am Chem Soc 138(1):164–172. https://doi.org/10.1021/jacs.5b09457
33. Canale TD, Sen D (2016) Hemin-utilizing G-quadruplex DNAzymes are strongly active in organic co-solvents. Biochim Biophys Acta. https://doi.org/10.1016/j.bbagen.2016.11.019

Chapter 23

In Vivo Chemical Probing for G-Quadruplex Formation

Fedor Kouzine, Damian Wojtowicz, Arito Yamane, Rafael Casellas, Teresa M. Przytycka, and David L. Levens

Abstract

While DNA inside cells is predominantly a canonical right-handed double helix, guanine-rich DNAs have the potential to fold into four-stranded structures that contain stacks of G-quartets (G4 DNA quadruplexes). Genome sequencing has revealed that G4 sequences tend to localize at gene control regions, especially in the promoters of oncogenes. A growing body of evidence indicates that G4 DNA quadruplexes might have important regulatory roles in genome function, highlighting the need for techniques to detect genome-wide folding of DNA into this structure. In vivo treatment of cells with potassium permanganate oxidizes nucleotides in the single-stranded DNA regions that accompany G4 DNA quadruplex formation, providing an excellent probe for the conformational state of DNA inside living cells. Here, we describe a permanganate-based methodology to detect G4 DNA quadruplexes genome-wide. Combined with high-throughput sequencing, this methodology provides a snapshot of DNA conformation over the whole genome in vivo.

Key words Non-B DNA, DNA quadruplex, G4 DNA, Potassium permanganate, Chromatin, High-throughput genomics

Danzhou Yang and Clement Lin (eds.), G-Quadruplex Nucleic Acids: Methods and Protocols, Methods in Molecular Biology, vol. 2035, https://doi.org/10.1007/978-1-4939-9666-7_23, © Springer Science+Business Media, LLC, part of Springer Nature 2019

1 Introduction

Rather than being a static helix, DNA possesses structural variability. Hydrogen bonding between nucleobases of the complementary DNA strands keeps DNA in the double-stranded right-handed helix, the classical B-DNA form. However, DNA elements with special nucleotide sequence patterns have the potential for structural transitions into other DNA forms, the non-B-DNA structures. These structures have been extensively characterized in biophysical studies using DNA oligonucleotides or plasmid DNA under various solution conditions. While non-B-DNA structures provide enormous potential for autoregulation of genome function [1], the extent and even the existence of such unusual structures inside living cells is still a matter of some debate. The study of the interplay between DNA conformation and genome biology has been hindered by experimental difficulties associated with detecting non-B-DNA structures and assessing their regulation in vivo, especially in eukaryotic cells [2, 3]. In this chapter, we describe our method to map non-B DNA genome-wide in living cells, with a focus on G4 DNA quadruplexes. Other groups are currently adopting and adapting this approach, allowing researchers to uncover the role of DNA conformational dynamics in genome function.

G4 DNA quadruplexes are four-stranded DNA structures built from stacked planar sets of four mutually Hoogsteen H-bonded guanine bases. Biophysical studies on synthetic oligonucleotide sequences derived from eukaryotic genomes indicate a broad variety of G4 structures depending on strand orientation, the size of the single-stranded DNA loops, and solution conditions, such as the identity and concentration of the prevailing cation [4]. These studies have enabled the search for sequences with quadruplex-forming potential in genomes [5]. It was shown that DNA elements with a predisposition for quadruplex formation (G4 DNA) often reside within regulatory regions, including a significant enrichment of G4 DNA in the promoters of oncogenes [6]. Chemical biology studies have provided crucial insight into G-quadruplex-binding ligands that exhibit pronounced anticancer activity in vivo, the ability to reprogram transcription, and locus-specific changes in epigenetic information [7, 8]. The potential importance of G4 DNA quadruplex formation within the genomes of living cells, and the extensive literature correlating pharmacologic disturbance of quadruplex stability with perturbation of cellular programs, stimulated experiments to search for the presence of these structures in vivo [5].

High-affinity antibodies that recognize G4 DNA quadruplexes have been used to visualize these structures by immunostaining in a range of cells [9, 10]. The high stability of the G4 DNA quadruplex enabled immunoprecipitation of immunoreactive structures from isolated and fragmented genomic DNA [11]. Deep sequencing of the selected DNA fragments yielded a signal that correlated with predicted G4 DNA sequences. Recently, a G4 ChIP–seq protocol was developed that employs an antibody specific to the G4 DNA quadruplex to map the genome-wide location of the structures in the chromatin of formaldehyde-fixed cells [12]. However, it should be noted that antibody-based approaches might influence and bias the formation of G-quadruplex structures: cell fixation, antibody binding, and enzymatic, chemical, and mechanical factors during genomic DNA isolation might all influence the apparent versus real pattern of G4 DNA quadruplexes in the genome [1, 2]. Also, considering the polymorphic nature of G4 and the absence of a structure of an anti-G4 antibody complexed with nucleic acid, it remains to be established that the determinants of antibody recognition are universally present and accessible on all formed G4 structures and absent from other structures [4]. Therefore, it was important to develop orthogonal and less chromatin-disruptive approaches to map G4 DNA structures in vivo and to further consolidate previous findings.

A powerful approach to detect non-B DNA conformation genome-wide at nearly nucleotide resolution relies on the ability of potassium permanganate (KMnO4) to oxidize unpaired nucleotides in single-stranded DNA (ssDNA) segments, which are a pervasive characteristic of non-B DNA structures, G4 DNA included. Although permanganate-based assays have been used for several decades to probe DNA conformation in selected regions of the genome [13–15], we have combined this method with enzymatic footprinting and next-generation sequencing to provide a global view of DNA structure in vivo [16, 17]. Potassium permanganate is a small molecule that easily penetrates cellular membranes and modifies ssDNA in living cells by oxidizing unpaired pyrimidine bases. Cells treated with permanganate for a very short period remain fully viable after treatment [16]. Thus, marking single-stranded DNA by this chemical modification reflects the unbiased in vivo pattern of single-stranded DNA. In purified genomic DNA, regions with oxidized bases are susceptible to cleavage by the single-strand-specific S1 nuclease. After nuclease digestion, double-stranded breaks are produced at the sites of DNA chemical modification. These breaks are labeled with biotinylated nucleotides in a DNA tailing reaction with terminal deoxynucleotidyl transferase (TdT). After DNA sonication, biotinylated DNA fragments are selected with streptavidin beads, converted into an Illumina library, and sequenced on the high-throughput Illumina platform; this is the outline of the method that we call ssDNA-seq [16, 17]. Overlapping the sequencing signal with computationally predicted G4 DNA motifs in the genome delivers a high-resolution map of G4 DNA structures formed in vivo (Fig. 1).
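The computationally predicted G4 DNA motifs used for this overlap follow the definition given in Subheading 3.9, step 1: four runs of at least three guanines separated by gaps of 1–7 bases, searched on both strands, with overlapping hits on the same strand merged. A minimal Python sketch of such a scan is given here for orientation. It is not QuadParser itself: the regular expression, the function names `find_g4_motifs` and `merge_intervals`, and the 0-based half-open coordinates are illustrative assumptions, and a real analysis would read chromosomes from a FASTA file rather than a short string.

```python
import re

# Illustrative patterns (an assumption, not QuadParser's own code):
# four runs of >=3 G separated by gaps of 1-7 bases. A G4 on the reverse
# strand appears as the equivalent C-run pattern on the reference strand.
# The lookahead lets overlapping candidates be reported from every start.
GAP = "[ACGTN]{1,7}"
PLUS = re.compile(rf"(?=(G{{3,}}(?:{GAP}G{{3,}}){{3}}))")
MINUS = re.compile(rf"(?=(C{{3,}}(?:{GAP}C{{3,}}){{3}}))")


def merge_intervals(intervals):
    """Merge overlapping half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start < merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(iv) for iv in merged]


def find_g4_motifs(seq):
    """Return merged motif intervals per strand for one sequence."""
    seq = seq.upper()
    hits = {}
    for strand, pattern in (("+", PLUS), ("-", MINUS)):
        raw = [(m.start(1), m.end(1)) for m in pattern.finditer(seq)]
        hits[strand] = merge_intervals(raw)
    return hits


if __name__ == "__main__":
    print(find_g4_motifs("TTGGGAGGGTGGGAGGGTT"))
```

Merging is done per strand, so a G-rich motif and a C-rich motif that overlap on the reference are kept as separate entries.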

Fig. 1 G4 DNA structures mapping workflow: from top, counterclockwise direction. Single-stranded DNA in G4 DNA structures is stabilized in live cells by treatment with KMnO4 (described in Subheadings 3.2 and 3.3). Cordycepin and Terminal Transferase treatment of purified genomic DNA is used to block preexisting DNA double-stranded breaks (described in Subheading 3.4). Chemically modified DNA is digested with single-strand-specific S1 nuclease (described in Subheading 3.5). DNA ends exposed by nuclease treatment are biotinylated (described in Subheading 3.6) and, following sonication, streptavidin selected. Biotin is removed from the selected DNA fragments (described in Subheading 3.7). DNA fragments surrounding G4 DNA structures in vivo are deep sequenced (described in Subheading 3.8). Computational analysis allows genome-wide identification of G4 DNA structures (described in Subheading 3.9)

2 Materials

2.1 Reagents

1. Proteinase K (solution, 20 mg/mL).
2. Phenol:Chloroform:Isoamyl Alcohol 25:24:1, Tris (pH 8.0) saturated.
3. Ethanol, 100%.
4. Ammonium acetate, 7.5 M.
5. RNase, DNase-free (solution, 500 μg/mL).
6. SDS, 10%.
7. EDTA, 0.5 M.
8. Glycogen, 2 mg/mL stock solution.
9. UltraPure Agarose (Invitrogen).
10. Terminal Transferase (NEB).
11. Nuclease S1 (Thermo Scientific).
12. Taq DNA Polymerase with ThermoPol Buffer (NEB).
13. HiFi HotStart ReadyMix (KAPA Biosystems).
14. Potassium permanganate, KMnO4.
15. Cordycepin 5′-triphosphate sodium salt (Sigma Aldrich).
16. Biotin-16-dUTP, 1 mM (Roche).
17. 100 mM dNTPs, PCR grade (NEB).
18. 10 mM ATP.
19. SYBR Green Nucleic Acid Gel Stain.
20. MassRuler DNA Ladder Mix (Thermo Fisher Scientific).
21. Illumina adapter (Adaptor oligo mix).
22. Illumina genomic SE primers (Fw: 5′-aat gat acg gcg acc acc gag atc tac act ctt tcc cta cac gac gct ctt ccg atc t-3′; Rv: 5′-caa gca gaa gac ggc ata cga gct ctt ccg atc t-3′).
23. T4 DNA ligase (NEB).

2.2 Buffers

1. TE buffer: 10 mM Tris–HCl, 1 mM EDTA (pH 8.0).
2. Low Salt buffer: 15 mM Tris–HCl (pH 7.5), 60 mM KCl, 15 mM NaCl, 5 mM MgCl2, 0.5 mM EGTA, 300 mM sucrose.
3. Stop Solution: 1% SDS, 50 mM EDTA, 700 mM β-mercaptoethanol.
4. Elution buffer: 10 mM Tris–HCl (pH 7.5), 1 mM EDTA, 1 M NaCl, 2 M β-mercaptoethanol.
5. TAE buffer: 40 mM Tris (pH 7.6), 20 mM acetic acid, 1 mM EDTA.
6. 10× Terminal Transferase (TdT) reaction buffer: 500 mM potassium acetate, 200 mM Tris-acetate, 100 mM magnesium acetate, pH 7.9.
7. 5× Nuclease S1 buffer: 200 mM sodium acetate (pH 4.5), 1.5 M NaCl, 10 mM ZnSO4.
8. 10× (2.5 mM) solution of CoCl2.
9. PBS: 137 mM NaCl, 2.7 mM KCl, 10 mM Na2HPO4, 2 mM KH2PO4, without calcium and magnesium, pH 7.4.

2.3 Equipment

1. Gel electrophoresis apparatus.
2. Thermomixer.
3. Ultrasonic sonicator (Bioruptor, Diagenode).
4. Spectrophotometer, NanoDrop (ND-1000).
5. Rocker, tube wheel rotator, water bath, aspirator, and thermomixer.
6. Access to Genome Analyzer IIX (Illumina).

2.4 Kits

1. QIAquick PCR Purification Kit (Qiagen).
2. Dynabeads kilobaseBINDER Kit (Thermo Fisher Scientific).
3. Invitrogen Quant-iT PicoGreen dsDNA Reagent (Thermo Fisher Scientific).
4. DNA END-Repair kit (Epicentre Biotechnologies).
5. MinElute Gel Extraction Kit (Qiagen).
6. MinElute Reaction Cleanup Kit (Qiagen).

Fig. 2 Representative examples of gel electrophoresis analysis of genomic DNA after the S1 nuclease digest described in Subheading 3.5. Cells were treated (+KMnO4) or untreated (−KMnO4) with potassium permanganate. Purified genomic DNA was digested with increasing concentrations of S1 nuclease and visualized with SYBR Green after gel electrophoresis. M: DNA mass ladder (MassRuler DNA Ladder Mix). The numbers on the left refer to the molecular weight of the DNA marker in lane M. (a) Optimal S1 nuclease reactivity of the genomic DNA from cells treated with potassium permanganate. (b) Overexposing the cells to potassium permanganate results in excessive chemical modification of the genomic DNA, as evident from the high S1 nuclease reactivity. (c) Purified genomic DNA shows signs of degradation, as apparent from the nucleosome-like DNA ladder. Note that genomic DNA from cells not exposed to potassium permanganate is highly resistant to S1 nuclease

3 Methods

3.1 Preparing Stock Solutions

1. Dissolve KMnO4 in water with constant shaking for at least 1 h (see Note 1). The optimal concentration of potassium permanganate differs slightly between cell types. In pilot experiments, we routinely test 80 mM, 90 mM, and 100 mM stock solutions of permanganate to achieve the optimal level of DNA chemical modification (Fig. 2).
2. Prepare Stop Solution and Elution buffer on the day of the experiment.
3. Prewarm PBS buffer, Low Salt buffer, KMnO4 stock solution, and Molecular Biology Grade Water to 37 °C in the water bath.
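As a worked example for step 1, the mass of KMnO4 to weigh out follows directly from the target concentration and the batch volume. The short sketch below assumes the standard molar mass of KMnO4 (158.03 g/mol, not stated in the protocol) and a 10 mL batch; the batch volume and the function name are illustrative choices only.

```python
KMNO4_MOLAR_MASS = 158.03  # g/mol; standard value, not given in the protocol


def kmno4_mass_mg(conc_mM, volume_mL):
    """Mass (mg) of KMnO4 needed for volume_mL of a conc_mM stock."""
    # mmol/L * L = mmol; mmol * g/mol = mg
    return conc_mM * (volume_mL / 1000.0) * KMNO4_MOLAR_MASS


if __name__ == "__main__":
    # The three pilot concentrations named in step 1, for an assumed 10 mL batch
    for conc in (80, 90, 100):
        print(f"{conc} mM in 10 mL: {kmno4_mass_mg(conc, 10):.1f} mg")
```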

3.2 Preparing Genomic DNA from the Cells Treated with Permanganate (Suspension Cells)

1. Start with 60 × 10⁶ cells (see Note 2). Count the cells and transfer the suspension to a centrifuge tube. Recover the cell pellet by centrifugation at 500 × g for 5 min at room temperature (RT). Remove the supernatant by aspiration and gently resuspend the cells in 40 mL of PBS buffer (without CaCl2 and MgCl2). Pellet the cells by centrifugation at 500 × g for 5 min at RT. Remove the supernatant by aspiration and gently resuspend the cells in 3 mL of Low Salt buffer (see Note 3). Split the cells between two 50 mL tubes, with 1.5 mL of cell suspension in each tube.
2. A typical experiment requires two conditions: #I, cells not exposed to permanganate; #II, cells exposed to permanganate. Write these numbers on the tubes and keep the cell suspension at 37 °C in a water bath. Add 0.5 mL of H2O to tube #I and 0.5 mL of 100 mM KMnO4 solution to tube #II. Incubate for 80 s. Add 2 mL of Stop Solution to each tube. Mix carefully until the cellular lysate is clear (see Note 4). Add DNase-free RNase (40 μg/mL), mix, and incubate for 1 h at 37 °C (see Note 5). Then add Proteinase K (300 μg/mL) and incubate at 42 °C overnight.
3. Add an equal volume of Phenol–Chloroform to the cellular lysates. Mix the two phases by slowly inverting the tubes for a few minutes (see Note 4), and separate the two phases by centrifugation at 3500 × g for 10 min at RT. Transfer the aqueous phase to a new 50 mL Falcon tube.
4. Repeat Subheading 3.2, step 3. Add 0.5 volume of 7.5 M ammonium acetate and 2 volumes of ethanol stored at RT to the aqueous phases. Swirl the tubes to mix the solutions thoroughly and precipitate the DNA by centrifugation at 3500 × g for 10 min. Remove the ethanol solution using an aspirator. Wash the DNA precipitates with 70% ethanol. Remove as much ethanol as possible and air-dry the pellet for 15 min. Add 1 mL of TE (pH 8.0) to each tube. Place the tubes on a tube wheel rotator and incubate overnight at RT with gentle agitation until the DNA is completely dissolved (see Note 4).
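The volumes in step 2 fix the working concentrations in the tube. The sketch below works them out, assuming that the quoted 40 μg/mL RNase and 300 μg/mL Proteinase K are final concentrations and that the stocks are those listed in Subheading 2.1 (500 μg/mL and 20 mg/mL); the helper names are illustrative. It also reproduces the 0.5% SDS mentioned in Note 5, which follows from diluting the 1% SDS Stop Solution 1:1.

```python
def diluted(stock_conc, stock_vol, total_vol):
    """Concentration after stock_vol of stock ends up in total_vol."""
    return stock_conc * stock_vol / total_vol


def spike_volume(stock_conc, target_conc, start_vol):
    """Volume of stock to add so the mixture reaches target_conc.

    Solves stock_conc * v = target_conc * (start_vol + v) for v;
    concentrations and volumes must use consistent units.
    """
    return target_conc * start_vol / (stock_conc - target_conc)


cells_mL, kmno4_mL, stop_mL = 1.5, 0.5, 2.0

kmno4_final_mM = diluted(100, kmno4_mL, cells_mL + kmno4_mL)  # 25.0 mM
lysate_mL = cells_mL + kmno4_mL + stop_mL                     # 4.0 mL
sds_percent = diluted(1.0, stop_mL, lysate_mL)                # 0.5 %

# Assumption: 40 and 300 ug/mL are final concentrations in the lysate
rnase_mL = spike_volume(500, 40, lysate_mL)
protk_mL = spike_volume(20_000, 300, lysate_mL + rnase_mL)

if __name__ == "__main__":
    print(f"KMnO4 during the 80 s exposure: {kmno4_final_mM} mM")
    print(f"SDS after Stop Solution: {sds_percent} %")
    print(f"RNase stock to add: {rnase_mL * 1000:.0f} uL")
    print(f"Proteinase K stock to add: {protk_mL * 1000:.0f} uL")
```

With a different KMnO4 stock from the pilot series (80 or 90 mM), only the first argument of the first `diluted` call changes.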

3.3 Preparing Genomic DNA from the Cells Treated with Permanganate (Adherent Cells)

1. Start with an appropriate number of cell culture flasks to obtain the required 60 × 10⁶ cells. The steps below are for a standard 175 cm² flask. Label half of the flasks #I (cells not exposed to permanganate) and the other half #II (cells exposed to permanganate). Remove the media and wash the cell layer with 50 mL of prewarmed PBS buffer (without CaCl2 and MgCl2). Add 6 mL of prewarmed Low Salt buffer.
2. Immediately add 2 mL of H2O to the flasks labeled #I and 2 mL of KMnO4 stock solution to the flasks labeled #II. Incubate for 80 s. Add 8 mL of Stop Solution. Mix carefully by rocking the flask until the cellular lysate is clear. Add DNase-free RNase (40 μg/mL). Remove the DNA solution from the flask and incubate for 1 h at 37 °C. Then add Proteinase K (300 μg/mL) and incubate at 42 °C overnight (see Note 6).
3. Add an equal volume of Phenol–Chloroform to the cellular lysates. Mix the two phases by slowly inverting the tubes for a few minutes (see Note 4), and separate the two phases by centrifugation at 3500 × g for 10 min at RT. Transfer the aqueous phase to a new 50 mL Falcon tube.
4. Repeat Subheading 3.3, step 3. Add 0.5 volume of 7.5 M ammonium acetate and 2 volumes of ethanol stored at RT to the aqueous phases. Swirl the tubes to mix the solutions thoroughly and precipitate the DNA by centrifugation at 3500 × g for 10 min. Remove the ethanol solution using an aspirator. Wash the DNA precipitates with 70% ethanol. Remove as much ethanol as possible and air-dry the pellet for 15 min. Add 1 mL of TE (pH 8.0) to each tube. Place the tubes on a tube wheel rotator and incubate overnight at RT with gentle agitation until the DNA is completely dissolved (see Note 4).

3.4 Blocking Unspecific Breaks in the Genomic DNA

1. Set up the reaction that blocks DNA double-stranded breaks by mixing 1 mL of DNA solution with 300 μL of 10× TdT buffer, 300 μL of CoCl2 solution, 80 μL of 4 mM cordycepin 5′-triphosphate, 1000 U of Terminal Transferase, and H2O to a final volume of 3 mL. Incubate for 1 h at 37 °C.
2. Extract with Phenol–Chloroform and precipitate with ethanol and 2.5 M ammonium acetate (see Subheading 3.3, step 4). Repeat the ethanol precipitation with 2.5 M ammonium acetate (see Note 7). Dissolve the DNA in 0.5 mL of TE buffer.

3.5 Converting the Sites of DNA Chemical Modification into DNA Breaks

1. Aliquot the DNA solution into three tubes (50 μL in each). Set up the S1 nuclease digest of genomic DNA by mixing 50 μL of DNA solution with 60 μL of 5× S1 nuclease buffer and H2O to a final volume of 300 μL. Mark the tubes accordingly and add 50 U, 100 U, and 200 U of S1 nuclease, respectively. Incubate the reactions at 37 °C for 20 min (see Note 8).
2. Stop the reaction by extracting the mixtures with 300 μL of Phenol–Chloroform. Add 20 μg of glycogen from a stock solution and 30 μL of 3 M sodium acetate to the aqueous phase. Mix and add 2 volumes of ethanol stored at +4 °C. Swirl the tubes to mix the solutions thoroughly and precipitate the DNA by centrifugation at 15,000 × g for 10 min. Remove the ethanol solution using an aspirator. Wash the DNA precipitates with 70% ethanol. Remove as much ethanol as possible and air-dry the pellet for 15 min. Add 30 μL of TE (pH 8.0) to each tube. Place the tubes on a shaker for 1 h to dissolve the DNA.

3. Run 2 μL of DNA solution from each reaction on a 0.6% agarose gel. Stain the gel with SYBR Green according to the manufacturer's recommendation and check the size of the DNA in each reaction on a UV transilluminator (see Fig. 2 for illustrative examples).
4. For the subsequent biotin labeling of the DNA breaks, choose the S1 nuclease concentration that produced DNA fragments with an average size of 2–10 kilobases (from the cells treated with KMnO4); compare Fig. 2a, b. Genomic DNA should not show any trace of degradation (Fig. 2c). Scale up this condition for the remaining 350 μL of DNA solution (see Subheading 3.5, step 1) and perform the S1 digestion of the genomic DNA (see Subheading 3.5, steps 2 and 3).

3.6 DNA End-Labeling with TdT Tailing Reaction

1. Set up the DNA tailing reaction by mixing 200 μL of DNA solution (samples #I and #II) with 30 μL of 10× TdT buffer, 30 μL of CoCl2 solution, 1 μL of 100 mM dCTP, 1 μL of 100 mM dATP, 4000 U of Terminal Transferase, and H2O to a final volume of 300 μL. Incubate the reactions for 5 min at 37 °C.
2. Add 20 μL of 1 mM Biotin-16-dUTP and incubate at 37 °C for 30 min. Stop the reaction by adding EDTA to a final concentration of 20 μM, followed by extraction with Phenol–Chloroform and ethanol precipitation with ammonium acetate (see Subheading 3.5, step 3). Dissolve the DNA pellets in 300 μL of TE buffer and repeat the ethanol precipitation with ammonium acetate (see Note 9). Dissolve the DNA pellets in 300 μL of TE buffer.
3. Sonicate the biotinylated DNA to generate 200–400 bp DNA fragments. Check the DNA fragment sizes by running 2 μL of DNA solution from each tube on a 1% agarose gel. In our procedure, sonication was performed with an ultrasonic sonicator (Bioruptor, Diagenode) at medium power, pulsing 20 times for 30 s with 30 s of cooling in an ice bath between pulses.
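Returning briefly to Subheading 3.5, step 4: the chosen pilot S1 condition has to be scaled from 50 μL to the remaining 350 μL of DNA before the tailing reaction above can be set up. One way to keep every component in proportion is to multiply the pilot recipe by the ratio of DNA volumes, as in the sketch below. The dictionary layout, the function name, and the choice of 100 U as the selected pilot condition are illustrative assumptions, not part of the protocol.

```python
def scale_reaction(pilot, pilot_dna_uL, prep_dna_uL):
    """Scale every component of a pilot reaction by the DNA volume ratio."""
    factor = prep_dna_uL / pilot_dna_uL
    return {name: amount * factor for name, amount in pilot.items()}


# Pilot digest from Subheading 3.5, step 1; 100 U is a hypothetical choice
# for the condition that gave 2-10 kb fragments on the gel.
pilot = {
    "DNA_uL": 50,
    "S1_buffer_5x_uL": 60,
    "final_volume_uL": 300,
    "S1_units": 100,
}

prep = scale_reaction(pilot, pilot_dna_uL=50, prep_dna_uL=350)

if __name__ == "__main__":
    for name, amount in prep.items():
        print(f"{name}: {amount:g}")
    # Sanity check: the 5x buffer should still end up at 1x
    print("buffer strength:", 5 * prep["S1_buffer_5x_uL"] / prep["final_volume_uL"])
```

Because the nuclease units scale with the DNA, the enzyme-to-DNA ratio that was validated on the pilot gel is preserved in the preparative digest.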

3.7 Capturing of Biotinylated DNA by Streptavidin-Coated Beads

1. Transfer 200 μL of thoroughly resuspended beads (2 mg) to an Eppendorf tube. Use the magnet to separate the beads from the supernatant. Add 300 μL of binding buffer (provided with the beads) to the beads. Mix for 5 min. Place the tubes on the magnet and remove the supernatant. Resuspend the beads in 300 μL of binding buffer.
2. Add 300 μL of the solution containing the biotinylated DNA fragments (see Subheading 3.6, step 3) to the beads. Incubate the tubes for 3 h at room temperature on a tube wheel rotator. Take a 10 μL aliquot of the unbound DNA solution and purify it with a QIAquick PCR Purification Kit, eluting the DNA into 30 μL of TE buffer (see Note 10). Save 2 μL of DNA to use in Subheading 3.7, step 6.
3. Wash the beads/DNA complex four times in 600 μL of Washing solution (provided with the beads) and once in 600 μL of TE buffer. For each wash, add Washing solution to the beads and incubate the suspension at 50 °C for 5 min with agitation on the thermomixer. Use the magnet to separate the beads from the washing buffer. Add new Washing solution or TE buffer and repeat.
4. To disrupt the biotin-streptavidin complexes, add 100 μL of Elution buffer to the washed beads and incubate at 75 °C for 1.5 h with agitation on the thermomixer. Use the magnet to separate the supernatant from the beads. Keep the supernatant. Add 100 μL of new Elution buffer to the beads, incubate at 75 °C for 1.5 h, and use the magnet to separate the supernatant from the beads. Pool the supernatants. Purify the DNA fragments with a QIAquick PCR Purification Kit, eluting the DNA into 30 μL of TE buffer. Save 2 μL of DNA to use in Subheading 3.7, step 6.
5. To remove the biotinylated tails from the DNA, set up an S1 nuclease digest by mixing 28 μL of DNA solution (samples #I, #II, and unbound DNA from Subheading 3.7, step 2) with 20 μL of 5× Nuclease S1 buffer, 50 U of S1 nuclease, and H2O to a final volume of 100 μL. Incubate the reactions for 15 min at 37 °C. Purify the DNA with a QIAquick PCR Purification Kit, eluting the DNA into 30 μL of TE buffer. Take a 5 μL aliquot of DNA to use in Subheading 3.7, step 6.
6. Quantify the recovered DNA. Run the DNA samples set aside in Subheading 3.7, steps 2, 4, and 5 on a 1% agarose gel. Stain the DNA in the gel with SYBR Green in TAE buffer (see Note 11 and Fig. 3).
7. Repeat the experiment to generate the required number of biological replicates.
8. The DNA samples recovered from the biotin-streptavidin selection are ready for high-throughput sequencing.

3.8 Template Preparation for Sequencing Analysis

1. Quantify the DNA with PicoGreen dsDNA Reagent according to the manufacturer's protocol to determine the amount of adaptor to use in the later reaction.
2. To generate blunt-ended DNA, incubate the DNA for 45 min at room temperature in a 25 μL reaction with a mixture of End-Repair buffer, 0.25 mM of each dNTP, 1 mM ATP, and 1 μL of DNA End-Repair Enzyme mix. Purify the DNA with a MinElute Reaction Cleanup Kit.

Fig. 3 Representative example of gel electrophoresis analysis of DNA fragments described in Subheading 3.7. From left to right: first panel, DNA unbound to the beads; second panel, DNA bound to the beads; third panel, DNA mass ladder (MassRuler DNA Ladder Mix); fourth panel, DNA unbound to the beads treated with S1 nuclease; fifth panel, DNA bound to the beads treated with S1 nuclease. DNA fragments were derived from the genomic DNA of cells treated with potassium permanganate (+) or from cells not treated with potassium permanganate (−). The numbers on the left refer to the molecular weight of the DNA marker

3. In a 50 μL reaction, treat the blunt-ended DNA with 15 units of Taq DNA polymerase for 40 min at 70 °C in the presence of 0.2 mM dATP to generate a protruding 3′ A base used for adaptor ligation (see Note 12). Purify the DNA with a MinElute Reaction Cleanup Kit.
4. In a 20 μL reaction, ligate the Illumina genomic adapter to the ends of the DNA fragments by incubating the DNA with Adaptor oligo mix and 1500 units of T4 DNA ligase in T4 DNA ligase buffer at room temperature for 30 min. Use 1 μL of 1/20 diluted Adaptor oligo mix per 10 ng of DNA quantified in Subheading 3.8, step 1. Purify the DNA with a MinElute Reaction Cleanup Kit.
5. Amplify the DNA for 18 cycles with HiFi HotStart ReadyMix and the Illumina genomic SE primers using the following program: precycle incubation at 98 °C for 3 min; cycles of 98 °C for 30 s, 65 °C for 30 s, and 72 °C for 30 s; postcycle incubation at 72 °C for 3 min.
6. Run the PCR product on a 2% agarose gel and excise the gel slice around the 200–300 bp region. Try not to include adaptor dimers, which run at around 140 bp.
7. Purify the DNA from the gel using the MinElute Gel Extraction Kit. Elute the amplicon with 10 μL of 10 mM Tris, pH 8.5.
8. The purified DNA is used directly for cluster generation and sequencing analysis on the Illumina Genome Analyzer following the manufacturer's protocols.
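The adaptor rule in step 4 and the cycling program in step 5 can be written down as data, which makes it easy to check the adaptor volume or the programmed run time before starting. The sketch below is only a bookkeeping aid: the dictionary layout and function names are illustrative assumptions, and ramp times between temperatures are ignored.

```python
# Cycling program from step 5, as (temperature in deg C, hold time in s)
PCR_PROGRAM = {
    "initial": (98, 180),
    "cycles": 18,
    "cycle_steps": [(98, 30), (65, 30), (72, 30)],
    "final": (72, 180),
}


def total_hold_seconds(program):
    """Sum of all programmed holds, ignoring ramping between steps."""
    per_cycle = sum(seconds for _temp, seconds in program["cycle_steps"])
    return (program["initial"][1]
            + program["cycles"] * per_cycle
            + program["final"][1])


def adaptor_volume_uL(dna_ng):
    """Volume of 1/20-diluted Adaptor oligo mix: 1 uL per 10 ng of DNA."""
    return dna_ng / 10.0


if __name__ == "__main__":
    print(f"programmed hold time: {total_hold_seconds(PCR_PROGRAM) / 60:g} min")
    # 25 ng is a hypothetical PicoGreen result used only for illustration
    print(f"adaptor for 25 ng DNA: {adaptor_volume_uL(25):g} uL")
```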

3.9 Identification of G4 DNA Quadruplex Structures

1. Find all occurrences of G4 sequence motifs on both strands of a reference genome using QuadParser [18], requiring at least three guanine bases in each of four guanine runs and a gap of 1–7 bases between runs. Merge overlapping quadruplex motifs on the same strand into a single motif. Precomputed genomic locations of G4 motifs in the mouse (mm9) and human (hg19) genomes can be found at https://www.ncbi.nlm.nih.gov/CBBresearch/Przytycka/index.cgi#nonbdna.
2. Process the raw sequencing data using the built-in Illumina Real-Time Analysis (RTA) software, which provides base calls and associated quality scores.
3. Perform quality control checks (e.g., FastQC) to identify any problems with the data before further analysis.
4. Align the sequencing reads to the reference genome using a sequence aligner, for example Bowtie2 [19] or BWA [20].
5. For each genomic occurrence of a G4 sequence motif, count the number of nonredundant reads overlapping two windows of length 500 bp and 1000 bp centered on the motif using the htseq-count script [21]. The 500 bp and 1000 bp windows are referred to as the signal window and the local window, respectively.
6. For each G4 motif, compute a p value for the observed number of reads in the signal window relative to the local window using the binomial distribution. To find a reasonable p value cutoff for G4 structures, use a permutation test: randomly shuffle the read locations within the local windows and compute p values for the number of randomized reads found in the signal windows within the local windows.
7. G4 motifs with a p value below the cutoff (corresponding to a false discovery rate of 5% computed from the randomized data) can be considered regions forming a G4 quadruplex structure (see Note 13).
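Steps 5–7 can be illustrated with a small, self-contained sketch. It is not the authors' pipeline and does not use htseq-count: each read is reduced to a single coordinate, the null probability of a local-window read falling in the signal window is taken as 500/1000, and the cutoff is chosen as the largest observed p value at which the ratio of randomized to observed calls stays at or below 5%. All function names, the single-coordinate simplification, and the one-randomization-per-motif design are assumptions made for illustration.

```python
import math
import random

SIGNAL, LOCAL = 500, 1000  # window lengths in bp, from step 5


def in_window(pos, center, length):
    """True if pos lies in the half-open window of `length` centered on `center`."""
    start = center - length // 2
    return start <= pos < start + length


def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def window_counts(read_positions, center):
    """Reads in the signal window and in the local window around one motif."""
    local = [r for r in read_positions if in_window(r, center, LOCAL)]
    signal = [r for r in local if in_window(r, center, SIGNAL)]
    return len(signal), len(local)


def motif_p_value(read_positions, center):
    """Binomial p value for enrichment of local reads in the signal window."""
    k, n = window_counts(read_positions, center)
    return binom_sf(k, n, SIGNAL / LOCAL) if n else 1.0


def shuffled_p_value(read_positions, center, rng):
    """Null p value: the same number of local reads placed uniformly at random."""
    _, n = window_counts(read_positions, center)
    start = center - LOCAL // 2
    fake = [rng.randrange(start, start + LOCAL) for _ in range(n)]
    return motif_p_value(fake, center)


def fdr_cutoff(observed, null, fdr=0.05):
    """Largest observed p value c with (#null <= c) / (#observed <= c) <= fdr.

    Assumes one randomized p value per motif, so the two lists are comparable.
    """
    best = None
    for c in sorted(set(observed)):
        n_obs = sum(p <= c for p in observed)
        n_null = sum(p <= c for p in null)
        if n_null / n_obs <= fdr:
            best = c
    return best


if __name__ == "__main__":
    rng = random.Random(0)
    reads = [1000] * 10 + [600, 1400]      # toy data around a motif at 1000
    print(motif_p_value(reads, 1000))
    print(shuffled_p_value(reads, 1000, rng))
```

Motifs whose observed p value is at or below the value returned by `fdr_cutoff` would then be called, which matches the direction of the test in step 7.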

4 Notes

1. Keep the potassium permanganate solution protected from light. Do not use DEPC-treated water. Do not keep the stock solution for an extended period; it should be made on the day of the experiment.
2. This number of cells (60 × 10⁶) is required to control all intermediate steps of the procedure and to visualize DNA on the agarose gel. We recommend following all the checkpoints for the first experiments. With experience, one can scale down to as few as 10 × 10⁶ cells.
3. This buffer makes the cells swell and take up the KMnO4.
4. To minimize DNA breakage, which might increase the background signal after sequencing, cellular lysates and genomic DNA should be handled gently. After the addition of phenol–chloroform, mix the phases by slowly inverting the tubes for a few minutes. The aqueous phase should be removed from the organic phase with a 25-mL pipette, drawing up the liquid very slowly. Do not vortex the samples and do not resuspend the DNA pellet by pipetting.
5. RNase is inhibited by 0.5% SDS, but at the high concentration used here it is still able to digest most of the RNA in the cellular lysates.
6. A large amount of RNase is required for experiments with adherent cells. To economize on this enzyme, genomic DNA can first be purified without RNase treatment as described in Subheading 3.3, step 2. The DNA solution is then treated with 5 μg/mL of RNase at 37 °C for 1 h. The DNA is purified as described in Subheading 3.3, steps 3 and 4.
7. The double precipitation with ammonium acetate is necessary to remove free cordycepin 5′-triphosphate, which might interfere with the following steps.
8. A pilot S1 nuclease titration is recommended to estimate the efficiency of DNA chemical modification and to choose the optimal nuclease concentration for the preparative DNA digest (see Subheading 3.5, step 4). See also Fig. 2.
9. The double precipitation with ammonium acetate is necessary to remove free Biotin-16-dUTP, which might inhibit the interaction between biotinylated DNA and streptavidin-coated beads.
10. These samples are used in Subheading 3.7, step 6.
11. This step is performed to check (1) the quality of the final DNA samples, (2) the efficiency of the removal of biotinylated tails from the DNA fragments, and (3) the low yield of DNA recovered from cells not treated with potassium permanganate, which verifies the efficiency of DNA chemical modification in vivo.
12. To avoid freeze–thaw cycles, we aliquot 10 mM dATP into small volumes and store them at −80 °C for single use.
13. It should be emphasized that the transcription bubble formed on DNA by RNA polymerase II may influence the identification of G4 structures by increasing false-positive calls. Thus, it is recommended to remove from the analysis G4 motifs that overlap with binding sites of RNA polymerase II, when available [16].
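The filter recommended in Note 13 amounts to an interval-overlap test between called G4 motifs and RNA polymerase II binding sites. A minimal sketch follows; the tuple layout `(chrom, start, end)`, the half-open coordinates, and the function names are hypothetical, and for genome-scale data a sorted or indexed structure would replace the simple per-chromosome scan used here.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open intervals [a_start, a_end) and [b_start, b_end) intersect."""
    return a_start < b_end and b_start < a_end


def drop_pol2_overlaps(motifs, pol2_sites):
    """Keep motifs (chrom, start, end) that overlap no Pol II site on the same chromosome."""
    by_chrom = {}
    for chrom, start, end in pol2_sites:
        by_chrom.setdefault(chrom, []).append((start, end))
    return [
        (chrom, start, end)
        for chrom, start, end in motifs
        if not any(overlaps(start, end, s, e) for s, e in by_chrom.get(chrom, []))
    ]


if __name__ == "__main__":
    # Toy coordinates, for illustration only
    motifs = [("chr1", 100, 130), ("chr1", 500, 520), ("chr2", 100, 130)]
    pol2 = [("chr1", 120, 400)]
    print(drop_pol2_overlaps(motifs, pol2))
```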

References

1. Zaytseva O, Quinn LM (2018) DNA conformation regulates gene expression: the MYC promoter and beyond. Bioessays 40:e1700235. https://doi.org/10.1002/bies.201700235
2. Kouzine F, Levens D (2007) Supercoil-driven DNA structures regulate genetic transactions. Front Biosci 12:4409–4423
3. van Holde K, Zlatanova J (1994) Unusual DNA structures, chromatin and transcription. Bioessays 16(1):59–68. https://doi.org/10.1002/bies.950160110
4. Qin Y, Hurley LH (2008) Structures, folding patterns, and functions of intramolecular DNA G-quadruplexes found in eukaryotic promoter regions. Biochimie 90(8):1149–1171. https://doi.org/10.1016/j.biochi.2008.02.020
5. Murat P, Balasubramanian S (2014) Existence and consequences of G-quadruplex structures in DNA. Curr Opin Genet Dev 25:22–29. https://doi.org/10.1016/j.gde.2013.10.012
6. Brooks TA, Kendrick S, Hurley L (2010) Making sense of G-quadruplex and i-motif functions in oncogene promoters. FEBS J 277(17):3459–3469. https://doi.org/10.1111/j.1742-4658.2010.07759.x
7. Guilbaud G, Murat P, Recolin B, Campbell BC, Maiter A, Sale JE, Balasubramanian S (2017) Local epigenetic reprogramming induced by G-quadruplex ligands. Nat Chem 9(11):1110–1117. https://doi.org/10.1038/nchem.2828
8. Kendrick S, Muranyi A, Gokhale V, Hurley LH, Rimsza LM (2017) Simultaneous drug targeting of the promoter MYC G-quadruplex and BCL2 i-motif in diffuse large B-cell lymphoma delays tumor growth. J Med Chem 60(15):6587–6597. https://doi.org/10.1021/acs.jmedchem.7b00298
9. Schaffitzel C, Berger I, Postberg J, Hanes J, Lipps HJ, Pluckthun A (2001) In vitro generated antibodies specific for telomeric guanine-quadruplex DNA react with Stylonychia lemnae macronuclei. Proc Natl Acad Sci U S A 98(15):8572–8577. https://doi.org/10.1073/pnas.141229498
10. Biffi G, Tannahill D, McCafferty J, Balasubramanian S (2013) Quantitative visualization of DNA G-quadruplex structures in human cells. Nat Chem 5(3):182–186. https://doi.org/10.1038/nchem.1548
11. Lam EY, Beraldi D, Tannahill D, Balasubramanian S (2013) G-quadruplex structures are stable and detectable in human genomic DNA. Nat Commun 4:1796. https://doi.org/10.1038/ncomms2792
12. Hansel-Hertsch R, Beraldi D, Lensing SV, Marsico G, Zyner K, Parry A, Di Antonio M, Pike J, Kimura H, Narita M, Tannahill D, Balasubramanian S (2016) G-quadruplex structures mark human regulatory chromatin. Nat Genet 48(10):1267–1272. https://doi.org/10.1038/ng.3662
13. Bui CT, Rees K, Cotton RG (2003) Permanganate oxidation reactions of DNA: perspective in biological studies. Nucleosides Nucleotides Nucleic Acids 22(9):1835–1855. https://doi.org/10.1081/NCN-120023276
14. Giardina C, Perez-Riba M, Lis JT (1992) Promoter melting and TFIID complexes on drosophila genes in vivo. Genes Dev 6(11):2190–2200
15. Johnston BH, Rich A (1985) Chemical probes of DNA conformation: detection of Z-DNA at nucleotide resolution. Cell 42(3):713–724
16. Kouzine F, Wojtowicz D, Baranello L, Yamane A, Nelson S, Resch W, Kieffer-Kwon KR, Benham CJ, Casellas R, Przytycka TM, Levens D (2017) Permanganate/S1 nuclease footprinting reveals non-B DNA structures with regulatory potential across a mammalian genome. Cell Syst 4(3):344–356 e347. https://doi.org/10.1016/j.cels.2017.01.013
17. Kouzine F, Wojtowicz D, Yamane A, Resch W, Kieffer-Kwon KR, Bandle R, Nelson S, Nakahashi H, Awasthi P, Feigenbaum L, Menoni H, Hoeijmakers J, Vermeulen W, Ge H, Przytycka TM, Levens D, Casellas R (2013) Global regulation of promoter melting in naive lymphocytes. Cell 153(5):988–999. https://doi.org/10.1016/j.cell.2013.04.033
18. Huppert JL, Balasubramanian S (2005) Prevalence of quadruplexes in the human genome. Nucleic Acids Res 33(9):2908–2916. https://doi.org/10.1093/nar/gki609
19. Langmead B, Salzberg SL (2012) Fast gapped-read alignment with bowtie 2. Nat Methods 9(4):357–359. https://doi.org/10.1038/nmeth.1923
20. Li H, Durbin R (2009) Fast and accurate short read alignment with burrows-wheeler transform. Bioinformatics 25(14):1754–1760. https://doi.org/10.1093/bioinformatics/btp324
21. Anders S, Pyl PT, Huber W (2015) HTSeq - a python framework to work with high-throughput sequencing data. Bioinformatics 31(2):166–169. https://doi.org/10.1093/bioinformatics/btu638

Chapter 24

G-Quadruplex Visualization in Cells via Antibody and Fluorescence Probe

Matteo Nadai and Sara N. Richter

Abstract

G-quadruplexes (G4s) are noncanonical nucleic acid structures involved in key regulatory and pathological roles in eukaryotes, prokaryotes, and viruses: the development of specific antibodies and fluorescent probes represents an invaluable tool to understand their biological relevance. Here we present three protocols for the visualization of G4s in cells, both uninfected and HSV-1 infected, using a specific antibody and a fluorescent G4 ligand, and for assessing the effect of the fluorescent ligand on a G4-binding protein, nucleolin, upon binding of the ligand to the nucleic acid structure.

Key words G-quadruplex-specific antibodies, G-quadruplex ligands, Fluorescence probe, Immunofluorescence staining, Confocal microscopy, Nucleoli, HSV-1

1 Introduction

G-quadruplexes (G4s) are unique, noncanonical nucleic acid structures adopted by guanine-rich sequences. The building block of these structures is the so-called guanine quartet (G-quartet): two or more G-quartets, stacking on each other, form the G-quadruplex. From a structural point of view, G4s are highly polymorphic: their topology can be classified as parallel, antiparallel, or hybrid based on strand orientation, and the multiple orientations adopted by the nucleotide linkers between guanine tracts (loops) further increase G4 diversity. G4s are involved in key regulatory and pathological roles in eukaryotes [1–5], prokaryotes, and viruses [6–10]: given their biological significance, many efforts have been devoted to the development of specific and selective G4-stabilizing molecules [11–14], as well as of probes that change their fluorescence behavior upon G4 binding [15–17]. Both antibodies and fluorescence probes that specifically recognize G4 structures represent invaluable tools to visualize G4s in cells and to understand their biological relevance. Recently, two antibodies recognizing G4s

Danzhou Yang and Clement Lin (eds.), G-Quadruplex Nucleic Acids: Methods and Protocols, Methods in Molecular Biology, vol. 2035, https://doi.org/10.1007/978-1-4939-9666-7_24, © Springer Science+Business Media, LLC, part of Springer Nature 2019


have been developed: BG4 [18] and 1H6 [19]. BG4 is a single-chain variable fragment antibody generated by phage display, employing a library of different single-chain antibody clones and selecting the best G4 binder, while 1H6 is a monoclonal antibody produced by immunizing mice with stable G4 DNA structures. Both antibodies have been used to detect G4s in cells [20–22]; in our studies the monoclonal antibody 1H6 was used. Many G4-specific fluorescent probes have been developed in recent years [23–25], but only a few of them can be used in both fixed and live cells, because of their cellular and subcellular permeability. The core-extended NDI (c-exNDI) is a potent G4 binder with antiviral and anticancer activity [9]. Given its light-up properties upon G4 binding and its very fast cellular and nuclear entry, c-exNDI was used to visualize G4s, in combination with the 1H6 anti-G4 antibody [23], both in uninfected and in HSV-1-infected cells.

2 Materials

All solutions and materials used for cell culturing must be sterile.

2.1 Cell Culture and Virus

1. Cell line of interest (to be chosen depending on the purpose of the experiment: in our case HEK293T and Vero).
2. Cell culture medium, Dulbecco’s Modified Eagle Medium (DMEM): NaCl 110.34 mM, NaHCO3 44.05 mM, D-Glucose 25.00 mM, KCl 5.33 mM, L-Glutamine 3.97 mM, Fe(NO3)3 2.47 mM, CaCl2 1.80 mM, NaH2PO4 0.92 mM, MgSO4 0.81 mM, L-Valine 0.80 mM, L-Isoleucine 0.80 mM, L-Leucine 0.80 mM, L-Lysine 0.80 mM, L-Threonine 0.80 mM, L-Phenylalanine 0.40 mM, L-Serine 0.40 mM, Glycine 0.40 mM, L-Tyrosine 0.40 mM, L-Arginine 0.40 mM, L-Cystine 0.20 mM, L-Methionine 0.20 mM, L-Histidine 0.20 mM, L-Tryptophan 0.08 mM, i-Inositol 0.04 mM, Phenol Red 0.04 mM, Niacinamide 0.03 mM, Choline 0.03 mM, Pyridoxine 0.02 mM, Thiamine 0.01 mM (Thermo Fisher Scientific).
3. Fetal Bovine Serum (FBS) (Thermo Fisher Scientific).
4. Trypsin-EDTA 0.05% (Thermo Fisher Scientific).
5. Dulbecco’s phosphate-buffered saline (DPBS), pH 7.4: 137.9 mM NaCl, 2.7 mM KCl, 8.1 mM Na2HPO4, 1.5 mM KH2PO4 (Thermo Fisher Scientific); optional: poly-D-lysine (see Note 1).
6. HSV-1 wt, strain F (see Note 2).
7. Six-well plates for cell culture, microscope slides, and coverslips (alternatively: chamber slides for cell culture).
8. c-exNDI (or any other fluorescent compound reported to bind G4) (see Note 3).


9. Fixative: 2% (w/v) paraformaldehyde (PFA) in 1× DPBS.
10. Humidified 37 °C, 5% CO2 incubator.

2.2 Immunofluorescence and Confocal Microscopy

1. Permeabilizing solution: 0.5% (v/v) Tween-20 in DPBS (see Note 4).
2. Washing solution after permeabilization (PBST): 0.1% (v/v) Tween-20 in DPBS.
3. 40 μg/mL RNaseA (Invitrogen) or 200 units DNase I (Invitrogen).
4. Blocking agent: BlockAid (Invitrogen) or any other suitable reagent.
5. Anti-G4 antibody: 1H6 [19] (see Note 5).
6. Anti-nucleolin C23 antibody (H-250) (Santa Cruz Biotechnology).
7. Anti-fibrillarin antibody (38F3) (Abcam).
8. Appropriate fluorescent secondary antibody: Alexa 488 anti-mouse IgG antibody, Alexa 488 anti-rabbit IgG antibody, and Alexa 546 anti-mouse IgG antibody (see Note 6).
9. Fluorescent DNA dye: DRAQ5® (Cell Signaling Technology).
10. Antifade mounting medium: Glycergel Mounting Medium (Dako-Agilent) or ProLong™ Gold Antifade Mountant (Thermo Fisher Scientific).
11. Nail polish.
12. Confocal microscopes: Leica TCS SP2 and Nikon A1Rsi + Laser Scanning.

3 Methods

3.1 Cell Culture

1. Grow cells in appropriate medium supplemented with Fetal Bovine Serum (FBS) at 37 °C in a 5% CO2 humidified atmosphere. HEK293T cells were used for fluorescent probe experiments and Vero cells for HSV-1 experiments.
2. Determine compound cytotoxicity (MTT assay or any other cell proliferation assay, according to manufacturer’s instructions).
3. Harvest cells using Trypsin-EDTA and seed them onto glass coverslips in a six-well plate (see Note 7).
4. Allow cells to attach and grow overnight.

3.2 G4 Visualization via 1H6 Antibody and c-exNDI Fluorescent Probe

All the following steps have to be carried out under dim light (see Note 8).

1. Dilute compound in cell culture medium and treat cells. Compound concentrations must be nontoxic, and exposure times must be chosen depending on cell permeability to the compound. For c-exNDI in HEK293T cells, we used 1 μM compound for 2.5–30 min at 37 °C in the incubator.
2. Remove cell culture medium and wash cells with 1× DPBS at least three times to remove cell medium and compound residuals (see Note 9).
3. Fix cells with 2% PFA for 20 min at RT in the dark.
4. Remove PFA and wash cells with 1× DPBS at least five times to remove PFA residuals (see Note 10).
5. Permeabilize cells with 500 μL permeabilizing solution for 15 min on a rocker.
6. Remove permeabilizing solution and wash slides three times with PBST.
7. Treat slides with 40 μg/mL RNaseA for 30 min at 37 °C on a rocker (see Note 11).
8. Incubate with blocking agent (BlockAid) for 1 h at 37 °C, placing slides face-down in a humidified chamber (see Note 12). Use tweezers and a needle to pick the slides from the plate and place them in the humidified chamber.
9. Put slides back in the six-well plate, wash them three times with PBST, and incubate with 1 μg/mL anti-G4 antibody 1H6 for 2 h at RT in a humidified chamber.
10. Put slides back in the six-well plate, wash them three times with PBST, and incubate with 1:250 Alexa 488 anti-mouse IgG antibody for 1 h at 37 °C in a humidified chamber.
11. Put slides back in the six-well plate and wash them three times with PBST.
12. Dip the slides twice in distilled water to remove salts.
13. Place a drop of mounting medium on the microscope slide, and put the coverslip face-down on the mounting medium. Carefully press the coverslip over the slide and remove the excess liquid with absorbent paper.
14. Use nail polish to seal the edge of the coverslip, and let it dry (see Note 13).
15. Proceed with confocal microscopy. We used 488 nm excitation wavelength and 500–530 nm emission range for G4 visualization, and 543 nm excitation wavelength and 609–617 nm emission range for c-exNDI visualization (see Note 14).

The fluorescent probe c-exNDI enters the cell and localizes in the cell nucleus, with peaks in subnuclear compartments corresponding to nucleoli. Moreover, it shows good colocalization with the anti-G4 antibody 1H6 (Fig. 1).
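The working concentrations used in the steps above (1 μM c-exNDI, 1 μg/mL 1H6, 1:250 secondary antibody) all follow from C1·V1 = C2·V2. The following is a minimal Python sketch of that arithmetic. The stock concentrations (1 mM compound, 1 mg/mL antibody) and the 2 mL medium volume are illustrative assumptions and are not given in the protocol; the 30 μL drop volume is the approximate value from Note 12.

```python
def stock_volume(final_conc, final_volume, stock_conc):
    """Volume of stock needed (same unit as final_volume), from C1*V1 = C2*V2.

    final_conc and stock_conc must be expressed in the same concentration unit.
    """
    if stock_conc <= 0 or final_conc < 0 or final_volume <= 0:
        raise ValueError("concentrations and volumes must be positive")
    if final_conc > stock_conc:
        raise ValueError("final concentration cannot exceed the stock concentration")
    return final_conc * final_volume / stock_conc


def ratio_dilution_volume(ratio, final_volume):
    """Volume of reagent for a 1:ratio dilution (e.g. ratio=250 for 1:250)."""
    if ratio < 1:
        raise ValueError("ratio must be >= 1")
    return final_volume / ratio


# Hypothetical stocks: 1 mM (= 1000 uM) c-exNDI and 1 mg/mL (= 1000 ug/mL) 1H6.
compound_ul = stock_volume(1.0, 2000.0, 1000.0)   # 1 uM in 2000 uL of medium
antibody_ul = stock_volume(1.0, 30.0, 1000.0)     # 1 ug/mL in a 30 uL drop
secondary_ul = ratio_dilution_volume(250, 30.0)   # 1:250 in a 30 uL drop
```

Sub-microlitre volumes such as those computed for the antibody drops are usually easier to handle by preparing a larger master mix and dispensing 30 μL from it.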


Fig. 1 Colocalization of c-exNDI and G4s by confocal microscopy. Cells were incubated with c-exNDI (red signal, left panel) and with the anti-G4 antibody 1H6 (green signal, middle panel). The image on the right (merge) shows c-exNDI (red) and G4 (green) overlapping

3.3 Effects of Fluorescent Probe upon G4 Binding

Since the fluorescent probe c-exNDI is able to bind G4s, we investigated its effect on nucleolin (NCL), a G4-binding protein mainly localized in the nucleolus [26]. Treatment with quarfloxin (QFX), a potent G4 ligand, was previously reported to displace NCL from nucleoli and relocalize it to the nucleoplasm, without affecting the distribution of fibrillarin, a component of nucleolar snRNPs [12]. The following protocol can be used to compare the effect of c-exNDI and QFX on nucleolin and fibrillarin distribution.

Proceed from step 4 of Subheading 3.1. All the following steps have to be carried out under dim light (see Note 8).

1. Dilute QFX in cell culture medium to reach a final concentration in the range 1–5 μM, treat cells, and place them for 2 h at 37 °C in the incubator.
2. Dilute c-exNDI in cell culture medium to reach a final concentration in the range 1–5 μM, treat cells, and place them for 30 min at 37 °C in the incubator.
3. Remove cell culture medium and wash cells with 1× DPBS at least three times to remove cell medium and compound residuals (see Note 9).
4. Fix cells with 2% PFA for 20 min at RT in the dark.
5. Remove PFA and wash cells with 1× DPBS at least five times to remove PFA residuals (see Note 10).
6. Permeabilize cells with 500 μL permeabilizing solution for 15 min on a rocker.
7. Remove permeabilizing solution and wash slides three times with PBST.
8. Treat slides with 40 μg/mL RNaseA for 30 min at 37 °C on a rocker (see Note 11).
9. Incubate with blocking agent (BlockAid) for 1 h at 37 °C, placing slides face-down in a humidified chamber (see Note 12). Use tweezers and a needle to pick the slides from the plate and place them in the humidified chamber.
10. Put slides back in the six-well plate, wash them three times with PBST, and incubate with 1:500 anti-nucleolin C23 antibody or with 1:500 anti-fibrillarin antibody for 1 h at 37 °C in a humidified chamber.
11. Put slides back in the six-well plate, wash them three times with PBST, and incubate with 1:250 Alexa 488 anti-mouse IgG antibody or with 1:250 Alexa 488 anti-rabbit IgG antibody for 1 h at 37 °C in a humidified chamber.
12. Put slides back in the six-well plate and wash them three times with PBST.
13. Dip the slides twice in distilled water to remove salts.
14. Place a drop of mounting medium on the microscope slide, and put the coverslip face-down on the mounting medium. Carefully press the coverslip over the slide and remove the excess liquid with absorbent paper.
15. Use nail polish to seal the edge of the coverslip, and let it dry (see Note 13).
16. Proceed with confocal microscopy, using a 488 nm excitation wavelength and 500–530 nm emission range for nucleolin or fibrillarin visualization, and a 543 nm excitation wavelength and 609–617 nm emission range for c-exNDI visualization.

The comparative nucleolin displacement induced by c-exNDI and QFX confirms not only the specific localization of c-exNDI at nucleoli, but also its binding to nucleolar G4s (Fig. 2).

3.4 G4 Visualization in HSV-1 Infected Cells via 1H6 Antibody

The herpes simplex virus-1 (HSV-1) genome has a very high GC content (68%), which peaks at 84.7% GC in simple sequence repeats (SSRs): recently, our research group provided evidence for the presence of very stable G4-forming regions located in the HSV-1 inverted repeats [8]. Given the extraordinary extent of G4-forming regions in the HSV-1 genome, it is possible to visualize G4s in eukaryotic cells infected with HSV-1 [27]. HSV-1-infected cells are highly enriched in G4s: in particular, the amount of G4s depends on the amount of virus (multiplicity of infection, MOI) and on the stage of the viral cycle, with the signal being more intense around the time of viral DNA replication.

Fig. 2 Cellular localization and targeting of c-exNDI. (a) Nucleolar localization of c-exNDI. Cells treated with c-exNDI (red signal, panel a) were incubated with an anti-fibrillarin antibody (green signal, panel b). Colocalization is shown in panel c. (b) c-exNDI-mediated displacement of the G4 binding protein nucleolin from the nucleoli. Cells were treated with increasing concentrations of c-exNDI (panels a–f) or quarfloxin (QFX) (panels a’–f’). Nucleolin (NCL) and fibrillarin behavior upon treatment with c-exNDI or QFX was visualized by staining the cells with anti-nucleolin (panels a–c and a’–c’) and anti-fibrillarin (panels d–f and d’–f’) antibodies


Proceed from step 4 of Subheading 3.1.

1. Infect Vero cells at MOI 2.5 and 5 in serum-free medium for 1 h at 37 °C in the incubator (see Note 15).
2. Remove serum-free medium and replace it with complete medium.
3. After 6–8 h, remove medium and wash with 1× DPBS.
4. Fix cells with 2% PFA for 20 min at RT in the dark.
5. Remove PFA and wash cells with 1× DPBS at least five times to remove PFA residuals (see Note 10).
6. Permeabilize cells with 500 μL permeabilizing solution for 15 min on a rocker.
7. Remove permeabilizing solution and wash slides three times with PBST.
8. Incubate with blocking agent (BlockAid) for 1 h at 37 °C, placing slides face-down in a humidified chamber (see Note 12). Use tweezers and a needle to pick the slides from the plate and place them in the humidified chamber.
9. Put slides back in the six-well plate, wash them three times with PBST, and incubate with 1 μg/mL anti-G4 antibody 1H6 for 2 h at RT in a humidified chamber.
10. Put slides back in the six-well plate, wash them three times with PBST, and incubate with 1:500 Alexa 546 anti-mouse IgG antibody for 1 h at 37 °C in a humidified chamber.
11. Put slides back in the six-well plate and wash them three times with PBST.
12. Incubate with 1:200 FITC-conjugated anti-HSV-1 ICP8 at room temperature for 1 h.
13. Put slides back in the six-well plate and wash them three times with PBST.
14. Stain nuclei with far-red fluorescent DNA dye (DRAQ5®, 1:1000) for 5 min at room temperature.
15. Dip the slides twice in distilled water to remove salts.
16. Place a drop of mounting medium on the microscope slide, and put the coverslip face-down on the mounting medium. Carefully press the coverslip over the slide and remove the excess liquid with absorbent paper.
17. Use nail polish to seal the edge of the coverslip, and let it dry (see Note 13).
18. Proceed with confocal microscopy. We used 488 nm excitation wavelength and 496–519 nm emission range for ICP8 visualization, 546 nm excitation wavelength and 556–573 nm emission range for G4 visualization, and 646 nm excitation wavelength and 681–697 nm emission range for nuclei visualization.
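The inoculum for step 1 follows from the definition of MOI as plaque-forming units (PFU) per cell. A minimal Python sketch of that calculation is given below; the cell number per well and the stock titer are hypothetical placeholders and must be replaced with the values measured for the actual culture and virus stock (see Note 2).

```python
def virus_stock_volume_ul(moi, n_cells, titer_pfu_per_ml):
    """Microlitres of virus stock that deliver the requested MOI (PFU per cell)."""
    if moi <= 0 or n_cells <= 0 or titer_pfu_per_ml <= 0:
        raise ValueError("all inputs must be positive")
    pfu_needed = moi * n_cells                       # total PFU for the well
    return pfu_needed / titer_pfu_per_ml * 1000.0    # mL -> uL


# Hypothetical values, for illustration only (not from the protocol).
N_CELLS = 5e5   # cells per well at the time of infection
TITER = 1e8     # PFU/mL of the titrated stock

# Volume of stock to dilute in serum-free medium for each MOI used in step 1.
volumes = {moi: virus_stock_volume_ul(moi, N_CELLS, TITER) for moi in (2.5, 5)}
```

Under these assumed values the stock volumes scale linearly with MOI; the computed volume is then brought up to the infection volume with serum-free medium (see Note 15).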


Fig. 3 Colocalization of G4s and the viral protein ICP8 by 3D confocal microscopy. ICP8 is a marker for HSV-1 replication compartments (RCs). Cells were infected with wt HSV-1 (strain F), MOI 5. At 8 h p.i. cells were stained with the anti-G4 (1H6) and anti-ICP8-FITC antibodies. Blue, red, and green indicate DNA, G4s, and ICP8-dependent viral RCs, respectively. The images on the right (merge) show G4 (red) and ICP8 (green) overlapping as a yellow/orange signal

Confocal microscopy colocalization analysis (Fig. 3) shows an almost complete overlap between G4s induced during the viral infection and replication compartments (RCs), where ICP8, an essential component of the HSV-1 DNA replication machinery implicated in the assembly of viral pre-replication compartments and RCs, is localized. This evidence supports the formation of viral G4s during viral replication.

3.5 Analysis of Microscopy Images

Various open-source software packages can be used for the analysis of the acquired images, for example, ImageJ (https://imagej.nih.gov/ij/).

1. Save images for the different channels as TIFF files.
2. Load images for the different channels separately in ImageJ and merge them.
3. Using the ImageJ Plot Profile tool, draw an ideal line across the cell and obtain the 2D intensity profile (Fig. 4), or use the JACoP colocalization plugin [28] to obtain the overlap coefficient. For more information, see the ImageJ tutorial (https://imagej.nih.gov/ij/docs/examples/index.html).
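For readers who prefer scripting, the two quantities obtained in step 3 can be sketched in Python with NumPy. This is not the ImageJ or JACoP implementation; it is a minimal illustration of a nearest-neighbour line profile, Pearson's coefficient, and the overlap coefficient r = Σ(R·G) / sqrt(ΣR² · ΣG²). The function names are our own, and the two channels are synthetic arrays standing in for images that would normally be loaded from the saved TIFF files.

```python
import numpy as np


def line_profile(img, start, end, n_points=100):
    """Nearest-neighbour intensity profile along a straight line.

    start and end are (row, col) pixel coordinates.
    """
    rows = np.linspace(start[0], end[0], n_points)
    cols = np.linspace(start[1], end[1], n_points)
    r = np.clip(np.rint(rows).astype(int), 0, img.shape[0] - 1)
    c = np.clip(np.rint(cols).astype(int), 0, img.shape[1] - 1)
    return img[r, c]


def pearson_coefficient(ch1, ch2):
    """Pearson correlation of pixel intensities (mean-subtracted)."""
    a = ch1.astype(float).ravel()
    b = ch2.astype(float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))


def overlap_coefficient(ch1, ch2):
    """Overlap coefficient: like Pearson, but without mean subtraction."""
    a = ch1.astype(float).ravel()
    b = ch2.astype(float).ravel()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))


# Synthetic two-channel example: one bright spot shared by both channels.
rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:64, 0:64]
spot = np.exp(-((yy - 32) ** 2 + (xx - 32) ** 2) / (2 * 6.0 ** 2))
red = 200 * spot + rng.uniform(0, 5, spot.shape)     # stands in for c-exNDI
green = 150 * spot + rng.uniform(0, 5, spot.shape)   # stands in for 1H6

red_profile = line_profile(red, (32, 0), (32, 63))
green_profile = line_profile(green, (32, 0), (32, 63))
r_pearson = pearson_coefficient(red, green)
r_overlap = overlap_coefficient(red, green)
```

Both coefficients are sensitive to background and to saturated pixels, so they should be computed on unsaturated images (see Note 14), ideally restricted to a region of interest such as the nucleus.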

Fig. 4 Colocalization of c-exNDI and G4s by confocal microscopy. Intensity profiles of c-exNDI (red) and G4s (green) obtained using ImageJ software, along an ideal straight line (white) crossing the nucleus of a representative cell (right inset). Intensity profiles refer to Fig. 1

4 Notes

1. The amount of fetal bovine serum (FBS) used to supplement cell culture medium depends on the cell line. Typically, it ranges from 5% to 10%, but check the cell line specifications. Poly-D-lysine promotes the adhesion of cells to the culture vessel; it should be used only if cells tend to detach easily from the culture vessel.
2. HSV-1 strain F was a kind gift from Bernard Roizman, University of Chicago, IL, USA. Other HSV-1 strains can be chosen, depending on the purpose of the experiment. Particular care must be taken when choosing the virus strain and host cell line. Produce virus stock and titrate it according to virological protocols.
3. c-exNDI was synthesized by Prof. Freccero’s group [23, 29]. Any other fluorescent compound reported to bind G4 and to enter the cell nucleus can be used. Attention should be given to the fluorescence emission spectrum of the compound.
4. The choice of the permeabilizing agent is particularly critical. If permeabilization is too strong, anti-G4 antibody recognition could be lost. We obtained our best results using 0.5% Tween-20.
5. 1H6 antibody, specific for G4 DNA, was kindly provided by P. Lansdorp, European Research Institute for the Biology of Ageing, University of Groningen, the Netherlands.
6. The fluorescent secondary antibody has to be chosen taking into consideration the source of the primary anti-G4 antibody and the fluorescence emission properties of the G4-binding compound. Any cross talk between antibody and compound has to be avoided.
7. Coverslips can be sterilized by dipping them in ethanol. Subsequently let them dry and wash with DPBS. The number of cells to be seeded depends on cell morphology and doubling time. Optimal confluency is around 70% on the day of cell fixation.
8. Dim light is required to avoid bleaching of c-exNDI and of the fluorescent secondary antibody.
9. To visualize c-exNDI staining in live cells, proceed to fluorescence or confocal microscopy: place a drop of DPBS onto a microscope slide, put the coverslip face-down on the drop, and image cells.
10. After fixation, cells can be kept at 4 °C in the dark.
11. Treatment with RNaseA is used to digest RNA and visualize DNA G4s. If you wish to visualize RNA G4s, treat slides with 200 units DNase I for 30 min at 37 °C on a rocker.
12. An easy way to make a humidified chamber is to use a 15 cm petri dish with water-soaked filter paper and a parafilm layer (Fig. 5). Place a drop of reagent (about 30 μL) on the parafilm layer, and place the coverslip face-down on the drop.
13. Fixed, mounted, and sealed slides can be stored at 4 °C in the dark.
14. Follow the lab/facility procedure for confocal microscopy acquisition. In particular, be careful not to saturate the fluorescence signal. We preferred to perform single laser scanning instead of sequential scanning, to avoid any undesired and unspecific fluorescence signal due to laser-fluorophore cross talk.
15. Use serum-free medium to dilute viral stock and to infect cells, and complete medium to grow and maintain cells.
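Notes 6 and 14 ask that cross talk between channels be avoided. One quick, partial check is to confirm that the detection windows chosen for the fluorophores imaged together do not overlap. The Python sketch below does this for the emission ranges quoted in Subheadings 3.2 and 3.4; the helper names and channel labels are our own. Non-overlapping windows are a necessary condition only: emission tails and cross-excitation by another laser still have to be checked on single-stained controls.

```python
from itertools import combinations


def windows_overlap(a, b):
    """True if two closed (low, high) wavelength windows in nm share any range."""
    return a[0] <= b[1] and b[0] <= a[1]


def overlapping_pairs(windows):
    """Names of all channel pairs whose detection windows overlap."""
    return [
        (name1, name2)
        for (name1, w1), (name2, w2) in combinations(windows.items(), 2)
        if windows_overlap(w1, w2)
    ]


# Detection windows (nm) quoted in Subheadings 3.2 and 3.4.
probe_panel = {
    "Alexa 488 (G4, 1H6)": (500, 530),
    "c-exNDI": (609, 617),
}
hsv_panel = {
    "FITC (ICP8)": (496, 519),
    "Alexa 546 (G4, 1H6)": (556, 573),
    "DRAQ5 (DNA)": (681, 697),
}

probe_conflicts = overlapping_pairs(probe_panel)
hsv_conflicts = overlapping_pairs(hsv_panel)
```

An empty list for a panel means that no two detection windows in that panel share a wavelength range.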

Fig. 5 Humidified chamber made using a petri dish


References

1. Folini M, Venturini L, Cimino-Reale G, Zaffaroni N (2011) Telomeres as targets for anticancer therapies. Expert Opin Ther Targets 15(5):579–593. https://doi.org/10.1517/14728222.2011.556621
2. Holder IT, Hartig JS (2014) A matter of location: influence of G-quadruplexes on Escherichia coli gene expression. Chem Biol 21(11):1511–1521. https://doi.org/10.1016/j.chembiol.2014.09.014
3. Maizels N (2015) G4-associated human diseases. EMBO Rep 16(8):910–922
4. Rhodes D, Lipps HJ (2015) G-quadruplexes and their regulatory roles in biology. Nucleic Acids Res 43(18):8627–8637. https://doi.org/10.1093/nar/gkv862
5. Ou TM, Lu YJ, Tan JH, Huang ZS, Wong KY, Gu LQ (2008) G-quadruplexes: targets in anticancer drug design. ChemMedChem 3(5):690–713. https://doi.org/10.1002/cmdc.200700300
6. Perrone R, Nadai M, Poe JA, Frasson I, Palumbo M, Palu G, Smithgall TE, Richter SN (2013) Formation of a unique cluster of G-quadruplex structures in the HIV-1 nef coding region: implications for antiviral activity. PLoS One 8(8):e73121. https://doi.org/10.1371/journal.pone.0073121
7. Perrone R, Nadai M, Frasson I, Poe JA, Butovskaya E, Smithgall TE, Palumbo M, Palu G, Richter SN (2013) A dynamic G-quadruplex region regulates the HIV-1 long terminal repeat promoter. J Med Chem 56(16):6521–6530. https://doi.org/10.1021/jm400914r
8. Artusi S, Nadai M, Perrone R, Biasolo MA, Palu G, Flamand L, Calistri A, Richter SN (2015) The herpes simplex virus-1 genome contains multiple clusters of repeated G-quadruplex: implications for the antiviral activity of a G-quadruplex ligand. Antiviral Res 118:123–131. https://doi.org/10.1016/j.antiviral.2015.03.016
9. Perrone R, Artusi S, Butovskaya E, Nadai M, Pannecouque C, Richter SN (2015) G-quadruplexes in the human immunodeficiency virus-1 and herpes simplex virus-1: new targets for antiviral activity by small molecules. IFMBE Proc 46:207–210. https://doi.org/10.1007/978-3-319-11776-8_50
10. Amrane S, Kerkour A, Bedrat A, Vialet B, Andreola ML, Mergny JL (2014) Topology of a DNA G-quadruplex structure formed in the HIV-1 promoter: a potential target for anti-HIV drug development. J Am Chem Soc 136(14):5249–5252. https://doi.org/10.1021/ja501500c
11. Gowan SM, Harrison JR, Patterson L, Valenti M, Read MA, Neidle S, Kelland LR (2002) A G-quadruplex-interactive potent small-molecule inhibitor of telomerase exhibiting in vitro and in vivo antitumor activity. Mol Pharmacol 61(5):1154–1162. https://doi.org/10.1124/mol.61.5.1154
12. Drygin D, Siddiqui-Jain A, O’Brien S, Schwaebe M, Lin A, Bliesath J, Ho CB, Proffitt C, Trent K, Whitten JP, Lim JKC, Von Hoff D, Anderes K, Rice WG (2009) Anticancer activity of CX-3543: a direct inhibitor of rRNA biogenesis. Cancer Res 69(19):7653–7661. https://doi.org/10.1158/0008-5472.CAN-09-1304
13. De Cian A, DeLemos E, Mergny JL, Teulade-Fichou MP, Monchaud D (2007) Highly efficient G-quadruplex recognition by bisquinolinium compounds. J Am Chem Soc 129(7):1856. https://doi.org/10.1021/ja067352b
14. Rodriguez R, Muller S, Yeoman JA, Trentesaux C, Riou JF, Balasubramanian S (2008) A novel small molecule that alters Shelterin integrity and triggers a DNA-damage response at telomeres. J Am Chem Soc 130(47):15758. https://doi.org/10.1021/ja805615w
15. Largy E, Granzhan A, Hamon F, Verga D, Teulade-Fichou MP (2013) Visualizing the quadruplex: from fluorescent ligands to light-up probes. Top Curr Chem 330:111–177. https://doi.org/10.1007/128_2012_346
16. Vummidi BR, Alzeer J, Luedtke NW (2013) Fluorescent probes for G-quadruplex structures. Chembiochem 14(5):540–558. https://doi.org/10.1002/cbic.201200612
17. Beauvineau C, Guetta C, Teulade-Fichou MP, Mahuteau-Betzer F (2017) PhenDV, a turn-off fluorescent quadruplex DNA probe for improving the sensitivity of drug screening assays. Org Biomol Chem 15(34):7117–7121. https://doi.org/10.1039/c7ob01705g
18. Biffi G, Tannahill D, McCafferty J, Balasubramanian S (2013) Quantitative visualization of DNA G-quadruplex structures in human cells. Nat Chem 5(3):182–186. https://doi.org/10.1038/Nchem.1548
19. Henderson A, Wu YL, Huang YC, Chavez EA, Platt J, Johnson FB, Brosh RM, Sen D, Lansdorp PM (2014) Detection of G-quadruplex DNA in mammalian cells. Nucleic Acids Res 42(2):860–869. https://doi.org/10.1093/nar/gkt957
20. Biffi G, Tannahill D, Miller J, Howat WJ, Balasubramanian S (2014) Elevated levels of G-quadruplex formation in human stomach and liver cancer tissues. PLoS One 9(7):e102711. https://doi.org/10.1371/journal.pone.0102711
21. Hoffmann RF, Moshkin YM, Mouton S, Grzeschik NA, Kalicharan RD, Kuipers J, Wolters AHG, Nishida K, Romashchenko AV, Postberg J, Lipps H, Berezikov E, Sibon OCM, Giepmans BNG, Lansdorp PM (2016) Guanine quadruplex structures localize to heterochromatin. Nucleic Acids Res 44(1):152–163. https://doi.org/10.1093/nar/gkv900
22. Yangyuoru PM, Di Antonio M, Ghimire C, Biffi G, Balasubramanian S, Mao HB (2015) Dual binding of an antibody and a small molecule increases the stability of TERRA G-quadruplex. Angew Chem Int Ed 54(3):910–913. https://doi.org/10.1002/anie.201408113
23. Doria F, Nadai M, Zuffo M, Perrone R, Freccero M, Richter SN (2017) A red-NIR fluorescent dye detecting nuclear DNA G-quadruplexes: in vitro analysis and cell imaging. Chem Commun 53(14):2268–2271. https://doi.org/10.1039/c6cc08492c
24. Laguerre A, Wong JMY, Monchaud D (2016) Direct visualization of both DNA and RNA quadruplexes in human cells via an uncommon spectroscopic method. Sci Rep 6:32141. https://doi.org/10.1038/Srep32141
25. Carvalho J, Pereira E, Marquevielle J, Campello MPC, Mergny JL, Paulo A, Salgado GF, Queiroz JA, Cruz C (2018) Fluorescent light-up acridine orange derivatives bind and stabilize KRAS-22RT G-quadruplex. Biochimie 144:144–152. https://doi.org/10.1016/j.biochi.2017.11.004
26. Bugler B, Caizergues-Ferrer M, Bouche G, Bourbon H, Amalric F (1982) Detection and localization of a class of proteins immunologically related to a 100 KDa nucleolar protein. Eur J Biochem 128(2–3):475–480. https://doi.org/10.1111/j.1432-1033.1982.tb06989.x
27. Artusi S, Perrone R, Lago S, Raffa P, Di Iorio E, Palu G, Richter SN (2016) Visualization of DNA G-quadruplexes in herpes simplex virus 1-infected cells. Nucleic Acids Res 44(21):10343–10353. https://doi.org/10.1093/nar/gkw968
28. Bolte S, Cordelieres FP (2006) A guided tour into subcellular colocalization analysis in light microscopy. J Microsc-Oxford 224:213–232. https://doi.org/10.1111/j.1365-2818.2006.01706.x
29. Perrone R, Doria F, Butovskaya E, Frasson I, Botti S, Scalabrin M, Lago S, Grande V, Nadai M, Freccero M, Richter SN (2015) Synthesis, binding and antiviral properties of potent core-extended naphthalene Diimides targeting the HIV-1 long terminal repeat promoter G-quadruplexes. J Med Chem 58(24):9639–9652. https://doi.org/10.1021/acs.jmedchem.5b01283

Chapter 25

In Cell NMR Spectroscopy: Investigation of G-Quadruplex Structures Inside Living Xenopus laevis Oocytes

Michaela Krafcikova, Robert Hänsel-Hertsch, Lukas Trantirek, and Silvie Foldynova-Trantirkova

Abstract

G-quadruplexes are inherently polymorphic nucleic acid structures. Their folding topology depends on the nucleic acid primary sequence and on physical–chemical environmental factors. Hence, it remains unclear if a G-quadruplex topology determined in the test tube (in vitro) will also form in vivo. Characterization of G-quadruplexes in their native environment has been proposed as an efficient strategy to tackle this issue. So far, characterization of G-quadruplex structures in living cells has relied exclusively on the use of Xenopus laevis oocytes as a eukaryotic cell model system. Here, we describe the protocol for the preparation of X. laevis oocytes for studies of G-quadruplexes as well as other nucleic acid motifs under native conditions using in-cell NMR spectroscopy.

Key words In-cell NMR, Xenopus laevis oocyte, DNA, RNA, G-quadruplex

1 Introduction

In principle, in-cell NMR spectroscopy can be considered a direct analog of conventional solution NMR spectroscopy, with the nature of the sample being the single, but key, difference. While the conventional NMR sample comprises the target molecule (a G-quadruplex, for example) dissolved in a low-complexity buffered solution, the in-cell sample consists of the target “dissolved” in the interior of living cells. However, the “dissolution” of the target in the interior of a cell, without compromising its viability and metabolic status, at a concentration amenable to notoriously insensitive NMR detection represents a major technical challenge. As first demonstrated for proteins [1, 2] and later also for nucleic acids [3], this challenge can be resolved by microinjecting bio-macromolecules into large (~1 mm in diameter) oocytes from the African clawed frog (Xenopus laevis) (Fig. 1). When the microinjected material is labeled, e.g., with non-native nuclei such as 19F [4] or enriched in isotopes natively occurring at low abundance such as 13C

Danzhou Yang and Clement Lin (eds.), G-Quadruplex Nucleic Acids: Methods and Protocols, Methods in Molecular Biology, vol. 2035, https://doi.org/10.1007/978-1-4939-9666-7_25, © Springer Science+Business Media, LLC, part of Springer Nature 2019


Fig. 1 Top: Schematic representation of in-cell NMR experimental setup. Bottom: Comparison of the imino regions of 1D 1H NMR spectra of the G-quadruplex forming telomeric DNA, d(G3(TTAG3)3T), acquired in buffer (in vitro), in X. laevis oocytes (in-cell), and in cleared cellular lysate (ex vivo). Note that in-cell NMR spectrum has notably lower resolution compared to its in vitro counterpart. Additionally, note the evident difference between spectral fingerprint in buffer and in cleared lysate. Figure was adapted with permission from ref. [3]

and/or 15N [3, 5], the NMR signals specific to the introduced target of interest can be readily observed even in the complex environment of living cells. Unfortunately, in-cell NMR spectra of nucleic acids exhibit lower resolution than NMR spectra acquired under simplified conditions in vitro [3, 6, 7] (Fig. 1). The resolution of in-cell NMR spectra generally does not permit their use for de novo structure determination. Although the low spectral resolution notably limits the applicability of in-cell NMR to inherently polymorphic G-quadruplex structures, the resolution and information content of in-cell NMR spectra have been shown to be still sufficient to address a number of biologically relevant questions. For example, in-cell NMR was successfully used to address the roles of low-molecular-weight compounds (metabolites) [3] as well as of native intracellular molecular crowding in promoting G-quadruplex polymorphism [8]. In-cell NMR was also used to assess the role of the intracellular environment in modulating interactions between a DNA G-quadruplex and a G-quadruplex-stabilizing ligand (drug-like molecule) [5]. Most recently, an in-cell NMR study in the X. laevis

In-Cell NMR of G-Quadruplexes in Xenopus Oocytes


oocyte model was used to assess the formation of higher-order G-quadruplex structures (consisting of stacked G-quadruplex subunits) under native conditions [5]. Herein, we provide a step-by-step protocol for the preparation of X. laevis oocyte samples for in-cell NMR studies of nucleic acids. While the procedure might appear, at first sight, rather similar to that used for the preparation of in-cell NMR samples of proteins, notable differences exist in the preparation procedure and in sample handling during the actual in-cell NMR experiment. With the exception of the spin-labeling requirement, essentially identical procedures and considerations also hold true for the preparation of in-cell samples of nucleic acids for electron paramagnetic resonance (EPR) spectroscopy [9, 10]. Notably, two distinct procedures for the delivery of exogenous nucleic acids into human cells for in-cell NMR studies had been reported at the time this chapter was prepared [11, 12].

2 Materials

2.1 Equipment

Laboratory equipment for standard work: buffer preparation (common laboratory plastic, pipettes, pH meter, magnetic stirrer), NA precipitation with n-butanol and NA annealing (centrifuge, heat block). Equipment important for oocyte manipulation: preparative microscope with cold light, in-house made or commercially available pre-pulled microinjection needles (World Precision Instruments, USA/Tritech Research, USA), pneumatic oocyte injection devices (Harvard Instruments, USA). Alternatively, automated robotic micro-injectors can be employed (Roboinject™, Robocytes™). Equipment important for NMR measurement: Shigemi™ tube, solution NMR spectrometer (500 MHz or higher).

2.2 DNA

DNA/RNA oligonucleotides (unlabeled, isotopically labeled and/or covalently modified).

2.3 Stocks and Solutions

1. Injection buffer: 25 mM HEPES (pH = 7.5), 50 mM NaCl, and 1 mM DTT.
2. Intraoocyte buffer (buffer mimicking the oocyte salt environment): 25 mM HEPES (pH = 7.5), 10.5 mM NaCl, 110 mM KCl, 130 nM CaCl2, 1 mM MgCl2, 10% D2O.
3. Ori-Ca2+ buffer: 5 mM HEPES (pH = 7.6), 110 mM NaCl, 5 mM KCl, 2 mM CaCl2, and 1 mM MgCl2.
4. Progesterone (Sigma-Aldrich).
5. Ficoll® (Fluka).
6. Injection-ready X. laevis oocytes (Xenocyte™ or Eco Cyte Bioscience).
7. n-Butanol.
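As a worked example of the arithmetic behind the stock recipes above, the following sketch converts the injection buffer composition into masses to weigh out for a chosen batch volume. The molecular weights (HEPES free acid, anhydrous NaCl, DTT) and the 100 mL batch size are assumptions that are not given in this chapter and should be checked against the reagents actually used; the function names are illustrative only.

```python
# Sketch: masses needed for the injection buffer (25 mM HEPES, 50 mM NaCl, 1 mM DTT).
# Molecular weights (g/mol) are assumed values for HEPES free acid, NaCl, and DTT.
MOLECULAR_WEIGHT = {"HEPES": 238.30, "NaCl": 58.44, "DTT": 154.25}

# Composition of the injection buffer as listed in Subheading 2.3 (mM).
INJECTION_BUFFER_MM = {"HEPES": 25.0, "NaCl": 50.0, "DTT": 1.0}


def grams_needed(conc_mM, mw_g_per_mol, volume_mL):
    """Mass (g) = concentration (mol/L) x volume (L) x molecular weight (g/mol)."""
    return (conc_mM / 1000.0) * (volume_mL / 1000.0) * mw_g_per_mol


def recipe(concentrations_mM, volume_mL):
    """Return a {component: grams} mapping for the requested batch volume."""
    return {
        name: grams_needed(conc, MOLECULAR_WEIGHT[name], volume_mL)
        for name, conc in concentrations_mM.items()
    }


# Hypothetical 100 mL batch; the pH still has to be adjusted to 7.5 afterwards.
for name, grams in recipe(INJECTION_BUFFER_MM, 100.0).items():
    print(f"{name}: {grams * 1000:.1f} mg per 100 mL")
```

The same helper can be reused for the other buffers by adding their components and molecular weights to the two dictionaries.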

3 Methods

3.1 NA Sample Preparation

It is highly recommended to perform butanol precipitation of oligonucleotides prepared by chemical synthesis (see Note 1).
1. Dissolve the lyophilized NA in 200–500 μL of H2O and incubate for 15 min at 37 °C.
2. Add 15 mL of n-butanol and vortex vigorously for 10 min at room temperature.
3. Spin down the precipitated NA by centrifugation at 30,000 × g for 45 min at room temperature.
4. Let the precipitated NA pellet dry at room temperature (overnight).
5. Dissolve the precipitated NA in injection buffer and heat the NA sample to 95 °C for 10 min.
6. Let the sample cool down to room temperature.

3.2 Oocyte Maintenance and Sample Injection

For one in-cell NMR sample, ~200 X. laevis oocytes (see Note 2) need to be injected either manually or with a robotic device (see Note 3). Specifics of the preparation of in-cell samples for studies of NA-ligand interactions in vivo are discussed in Note 4.
1. Inject 50 nL of the 1.2–5 mM NA stock solution in injection buffer into each oocyte (see Note 5). The concentration of the injected NA depends on the presence or absence of an isotopic label in the NA (see Note 6) and on the toxicity of the NA (see Note 7).
2. After injection, transfer the oocytes to a Petri dish, wash them thoroughly with an excess volume of Ori-Ca2+ buffer, and allow them to recover for 1 h at 18 °C (see Note 8).
3. Gently transfer the injected oocytes into a Shigemi™ tube (without the plunger) prefilled with 1 mL of Ori-Ca2+ buffer containing 10% D2O (see Note 9).
4. Start the NMR experiment (see Subheading 3.5).
5. Check the viability of the cells by visually inspecting the oocytes under a light microscope (see Note 10).
6. To assess the NA degradation status, perform the procedure in Subheading 3.6 (see Note 11).
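The relationship between the stock concentration and the resulting intracellular concentration can be sketched as a simple dilution. The oocyte volume of about 1 μL used below is an assumption that is not stated in this chapter, and the function names are illustrative only; uneven distribution of the NA inside the oocyte and leakage are ignored.

```python
# Sketch: dilution of a 50 nL injection into a single oocyte.
# ASSUMPTION: the oocyte volume (~1 uL) is not given in the chapter.
ASSUMED_OOCYTE_VOLUME_UL = 1.0


def intracellular_conc_uM(stock_mM, inject_nL=50.0, oocyte_uL=ASSUMED_OOCYTE_VOLUME_UL):
    """Estimated intracellular NA concentration (uM) after one injection."""
    inject_uL = inject_nL / 1000.0
    return stock_mM * 1000.0 * inject_uL / (oocyte_uL + inject_uL)


def stock_for_target_mM(target_uM, inject_nL=50.0, oocyte_uL=ASSUMED_OOCYTE_VOLUME_UL):
    """Stock concentration (mM) needed to reach a target intracellular concentration."""
    inject_uL = inject_nL / 1000.0
    return target_uM * (oocyte_uL + inject_uL) / inject_uL / 1000.0


# Stock range quoted in the protocol (1.2-5 mM).
for stock in (1.2, 5.0):
    print(f"{stock} mM stock -> ~{intracellular_conc_uM(stock):.0f} uM in the oocyte")
```

Under this assumption, the 1.2–5 mM stock range maps onto intracellular concentrations of the same order as those quoted in Note 6; pilot toxicity experiments (see Note 7) remain necessary.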

3.3 “Crude” Lysate Preparation

The NMR investigation can also be performed in a so-called crude cellular extract/homogenate or in a cleared lysate (see Subheading 3.4) from X. laevis oocytes (see Note 12) (Fig. 1). The lysates can be prepared either from injected oocytes or from intact oocytes. In the latter case, the NA is mixed into the lysate prior to the NMR measurement.
1. Transfer the oocytes into a Petri dish prefilled with Ori-Ca2+ buffer and cool the dish on ice for 15 min.
2. Transfer the oocytes into another Petri dish prefilled with 10 mL of ice-cold intracellular buffer for 10 min.
3. Transfer the oocytes into an Eppendorf tube and adjust the final volume of the intracellular buffer to 500 μL.
4. Crush the oocytes mechanically (e.g., using the glass plunger from the NMR Shigemi™ tube), centrifuge the sample at 20,000 × g for 20 min, and collect the supernatant to obtain the crude lysate.
5. (a) Transfer the crude lysate from injected oocytes to an NMR tube and initiate the NMR measurement. (b) Mix the NA into the crude lysate from intact oocytes. Transfer the mixture to an NMR tube and initiate the NMR measurement.

3.4 “Cleared” Lysate Preparation

1. Prepare the crude lysate (see Subheading 3.3) and heat it to 95 °C for 10 min to thermally precipitate protein-based intracellular components.
2. Centrifuge the sample for 10 min at 20,000 × g.
3. Collect the supernatant to obtain the cleared lysate (250 μL of cleared lysate is used for one NMR measurement).
4. (a) Transfer the cleared lysate from injected oocytes to an NMR tube and initiate the NMR measurement. (b) Mix the NA into the cleared lysate from intact oocytes. Transfer the mixture to an NMR tube and initiate the NMR measurement.

3.5 In-Cell NMR Spectroscopy

1. Set the temperature in the NMR spectrometer to TE = 18 °C.
2. Place the NMR tube into a high-field NMR spectrometer (500 MHz or higher).
3. Lock, tune, and shim the NMR instrument according to the manufacturer's instructions.
4. Initiate the NMR experiment of choice (see Note 13).
5. Collect ~350–500 μL of the buffer covering the oocytes.
6. Use the collected buffer to acquire a so-called leakage control spectrum, using a setup identical to that in Subheading 3.5, steps 1–4 (see Note 14).

3.6 NA Sample Recovery

1. Process the injected oocytes to a cleared lysate (see Subheading 3.4).
2. Add 20 mL of n-butanol to the cleared lysate and follow Subheading 3.1 to recover the NA (the recovery yield is usually around 70%).
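To judge whether enough material will be recovered for the degradation check, the expected amount of NA can be estimated from the numbers given in Subheadings 3.2 and 3.6. The sketch below is illustrative only: it assumes that every oocyte receives the full 50 nL, treats the approximate 70% recovery yield as exact, and uses a hypothetical redissolving volume of 250 μL; the function names are not part of the protocol.

```python
# Sketch: total NA injected into one in-cell sample and the amount expected after recovery.

def injected_nmol(stock_mM, n_oocytes=200, inject_nL=50.0):
    """Total NA injected across the sample; nL x mM gives pmol, /1000 gives nmol."""
    return n_oocytes * inject_nL * stock_mM / 1000.0


def recovered_nmol(stock_mM, recovery_yield=0.70, n_oocytes=200, inject_nL=50.0):
    """Expected NA amount after butanol recovery (yield of ~70% taken from Subheading 3.6)."""
    return injected_nmol(stock_mM, n_oocytes, inject_nL) * recovery_yield


def conc_after_redissolving_uM(amount_nmol, volume_uL):
    """nmol per uL equals mM, so multiply by 1000 to obtain uM."""
    return amount_nmol / volume_uL * 1000.0


# Hypothetical example: 3 mM stock, recovered NA redissolved in 250 uL of injection buffer.
amount = recovered_nmol(3.0)
print(f"recovered ~{amount:.0f} nmol, ~{conc_after_redissolving_uM(amount, 250.0):.0f} uM in 250 uL")
```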

4 Notes

1. Samples prepared by chemical synthesis are toxic to the oocytes (probably due to traces of low-molecular-weight impurities). One cycle of conventional butanol precipitation is usually sufficient to remove most of the toxic contaminants from an NA sample. NAs prepared enzymatically can usually be injected directly.
2. Xenopus laevis oocytes are sensitive to temperature. Therefore, experiments involving manipulation of living oocytes should be carried out at 18 °C to keep the cells alive.
3. While manual injection of roughly 200 oocytes takes about 3 h, robotic injection takes about 5–10 min.
4. For the investigation of ligand–G-quadruplex interactions in X. laevis oocytes using in-cell NMR, two different ways of sample preparation can be employed [5]. In the first, the complex between the ligand and the G-quadruplex NA is preformed in vitro and subsequently injected into X. laevis oocytes (following Subheading 3.2). Alternatively, complex formation between the G-quadruplex and the ligand is induced in vivo. In this case, the oocytes are first injected with the NA target alone (see Subheading 3.2). In the next step, the injected oocytes are incubated with the ligand (incubation times and ligand concentrations depend on the nature of the ligand and its toxicity; for details see [5]).
5. A detailed step-by-step protocol for X. laevis oocyte injection for the purpose of in-cell NMR was described by Thongwichian and Selenko [13].
6. Microinjection from a concentrated stock allows relatively precise control of the intracellular concentration of the oligonucleotide in the oocytes. The choice of the final intracellular concentration is thus primarily governed by NA toxicity (see also Note 7), the type of isotopic/covalent NA labeling, and (last but not least) the price of the labeled oligonucleotide. For nontoxic, 13C/15N isotopically labeled or 19F covalently labeled NA fragments, intracellular concentrations of ~150 μM are typically used. Concentrations of 250 μM and higher are typically employed for in-cell NMR spectroscopy of unlabeled oligonucleotides [3, 8].
7. Some NA fragments, particularly G-quadruplex DNA, are toxic to the oocytes at concentrations typically used for in-cell NMR studies [3]. Lowering the intracellular NA concentration is the only straightforward solution to this issue. Pilot experiments aimed at estimating the NA concentration that is nontoxic to oocytes should be performed prior to the actual in-cell NMR sample preparation.
8. To improve the quality of an in-cell NMR sample, individual oocytes can be microscopically inspected 1 h after injection. For example, an irreversibly damaged oocyte can display pigmentation alterations within its dark hemisphere, resulting in white speckle formation. Furthermore, addition of progesterone, which triggers germinal vesicle breakdown in the dark hemisphere after injection, can be used to confirm oocyte viability. The visual assessment of oocytes after microinjection allows one to directly discard all damaged oocytes prior to the actual in-cell NMR measurement. Any severe disruption of the intracellular environment leads to increased cell mortality, which is usually visible within 30 min post-injection.
9. For long-term in-cell NMR experiments, it is recommended to keep oocytes in buffers containing inert co-solutes such as Ficoll®. 10% Ficoll helps to reduce the mechanical stress on the oocytes when they are packed in the NMR tube [14].
10. The integrity/viability of oocytes after acquisition of in-cell NMR spectra can be checked by treating the injected oocytes with 1 μM progesterone. Progesterone triggers synchronized maturation of the oocyte to the egg state. Usually ~10 injected oocytes are checked to see whether they can complete the maturation process.
11. An issue of NA in-cell NMR is that NAs are degraded with time once inside X. laevis oocytes. The stability of NAs inside cells depends on their primary sequence and folding topology. In general, the time window for investigation of NA under in vivo conditions is typically less than 6 h. Of note, G-quadruplex structures are more resistant to nuclease-induced degradation than other NA structures, which allows in-cell NMR measurements for more than 19 h [5]. Degradation of less stable NAs can be effectively diminished by chemical modifications such as replacement of a phosphate group in the backbone by a thiophosphate moiety (RNA/DNA) or by methylation of the 2′-OH (RNA). To make sure that NA degradation does not bias interpretation of the in-cell NMR spectrum, the NA is recovered from the injected oocytes after acquisition of in-cell NMR spectra (see Subheading 3.6). Comparison of the 1D 1H NMR spectrum of the recovered NA (in injection buffer) with that of the original stock solution employed for injection then allows assessment of the extent of NA degradation in the intracellular space.
12. In contrast to crude cellular extracts from mammalian cells, extracts from oocytes can be readily obtained in a virtually undiluted form. These extracts reflect most of the oocyte intracellular biology [15]. They are frequently used as alternative cell-free systems for ex vivo analysis of cellular processes, since they enable even better control of the concentration and improved spectral quality owing to their higher homogeneity compared to oocytes [16, 17].
13. The type of in-cell NMR experiment is dictated by the NA labeling scheme employed. Injection of nonlabeled NA implies acquisition of a 1D 1H in-cell NMR spectrum. On the other hand, working with 13C/15N-labeled samples allows a number of diverse isotopically filtered 1D and 2D NMR experiments to be performed [3, 5]. Although the setup of an in-cell NMR experiment is essentially identical to that of a conventional NMR experiment, there is one critical difference: the available time window. A typical time window for an in-cell NMR experiment in X. laevis oocytes is ~6 h. This time limitation arises from a combination of several factors: (1) the life span of the oocytes in the NMR tube, (2) the rate of degradation of the investigated NA fragment by nucleases present in the intracellular space, and (3) the rate of accidental leakage of NA from oocytes via post-injection incisions. As shown by Sakai et al. [18], properly treated oocytes are viable for a period of at least 20 h post-injection. The rate of degradation varies greatly with the sequence and secondary structure of the investigated oligonucleotide (see Note 11). Leakage of NA from the incision can occur on a much shorter time scale than both NA and oocyte life spans. However, the results of Hansel et al. [3] showed that if the buffer covering the oocytes is replaced every 5 h during NMR measurements, the amount of NA leaked from the oocytes does not exceed 5%. Such an amount does not interfere with the in-cell NMR measurements.
14. Assessment of leakage is a mandatory control for any in-cell NMR experiment. In the ideal case, there are no signals from the NA in the buffer surrounding the oocytes.
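The time constraints described in Note 13 can be turned into a simple acquisition plan. The following sketch is only an illustration: the 5 h buffer-replacement interval and the ~6 h and 19 h measurement windows come from the Notes, whereas the per-scan duration is a hypothetical parameter that depends on the pulse sequence and recycle delay actually used.

```python
# Sketch: planning buffer changes and scan counts for a time-limited in-cell NMR run.

def buffer_change_times_h(total_h, interval_h=5.0):
    """Time points (h) at which the buffer covering the oocytes is replaced.

    A change is scheduled every `interval_h` hours strictly before the end of the run.
    """
    times = []
    t = interval_h
    while t < total_h:
        times.append(t)
        t += interval_h
    return times


def max_scans(window_h, seconds_per_scan):
    """Largest number of whole scans that fits into the available time window."""
    return int(window_h * 3600 // seconds_per_scan)


# Typical ~6 h window versus a long (19 h) G-quadruplex measurement.
print(buffer_change_times_h(6))   # one buffer change at 5 h
print(buffer_change_times_h(19))  # changes at 5, 10, and 15 h
# Hypothetical 1.5 s per scan (acquisition plus recycle delay).
print(max_scans(6, 1.5))
```

Time spent replacing the buffer and re-shimming is not accounted for here and shortens the usable acquisition time accordingly.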

Acknowledgments

This work was supported by grants from the Czech Science Foundation (17-12075S), European Regional Development Fund (SYMBIT: CZ.02.1.01/0.0/0.0/15_003/0000477), Horizon 2020 Program of the EU (iNEXT: grant agreement 653706), and from the MEYS CR (CEITEC 2020 LQ1601).

References

1. Selenko P, Serber Z, Gadea B, Ruderman J, Wagner G (2006) Quantitative NMR analysis of the protein G B1 domain in Xenopus laevis egg extracts and intact oocytes. Proc Natl Acad Sci U S A 103:11904–11909
2. Serber Z, Selenko P, Hansel R, Reckel S, Lohr F, Ferrell JE Jr, Wagner G, Dotsch V (2006) Investigating macromolecules inside cultured and injected cells by in-cell NMR spectroscopy. Nat Protoc 1:2701–2709
3. Hansel R, Foldynova-Trantirkova S, Lohr F, Buck J, Bongartz E, Bamberg E, Schwalbe H, Dotsch V, Trantirek L (2009) Evaluation of parameters critical for observing nucleic acids inside living Xenopus laevis oocytes by in-cell NMR spectroscopy. J Am Chem Soc 131:15761–15768
4. Bao HL, Ishizuka T, Sakamoto T, Fujimoto K, Uechi T, Kenmochi N, Xu Y (2017) Characterization of human telomere RNA G-quadruplex structures in vitro and in living cells using 19F NMR spectroscopy. Nucleic Acids Res 45:5501–5511
5. Salgado GF, Cazenave C, Kerkour A, Mergny JL (2015) G-quadruplex DNA and ligand interaction in living cells using NMR spectroscopy. Chem Sci 6:3314–3320
6. Hansel R, Foldynova-Trantirkova S, Dotsch V, Trantirek L (2013) Investigation of quadruplex structure under physiological conditions using in-cell NMR. Top Curr Chem 330:47–65
7. Hansel R, Luh LM, Corbeski I, Trantirek L, Dotsch V (2014) In-cell NMR and EPR spectroscopy of biomacromolecules. Angew Chem Int Ed Engl 53:10300–10314
8. Hansel R, Lohr F, Foldynova-Trantirkova S, Bamberg E, Trantirek L, Dotsch V (2011) The parallel G-quadruplex structure of vertebrate telomeric repeat sequences is not the preferred folding topology under physiological conditions. Nucleic Acids Res 39:5768–5775
9. Azarkh M, Okle O, Singh V, Seemann IT, Hartig JS, Dietrich DR, Drescher M (2011) Long-range distance determination in a DNA model system inside Xenopus laevis oocytes by in-cell spin-label EPR. Chembiochem 12:1992–1995
10. Krstic I, Hansel R, Romainczyk O, Engels JW, Dotsch V, Prisner TF (2011) Long-range distance measurements on nucleic acids in cells by pulsed EPR spectroscopy. Angew Chem Int Ed Engl 50:5070–5074
11. Yamaoki Y, Kiyoishi A, Miyake M, Kano F, Murata M, Nagata T, Katahira M (2018) The first successful observation of in-cell NMR signals of DNA and RNA in living human cells. Phys Chem Chem Phys 20:2982–2985
12. Dzatko S, Krafcikova M, Hansel-Hertsch R, Fessl T, Fiala R, Loja T, Krafcik D, Mergny JL, Foldynova-Trantirkova S, Trantirek L (2018) Evaluation of the stability of DNA i-motifs in the nuclei of living mammalian cells. Angew Chem Int Ed Engl 57:2165–2169
13. Thongwichian R, Selenko P (2012) In-cell NMR in Xenopus laevis oocytes. Methods Mol Biol 895:33–41
14. Bodart JF, Wieruszeski JM, Amniai L, Leroy A, Landrieu I, Rousseau-Lescuyer A, Vilain JP, Lippens G (2008) NMR observation of Tau in Xenopus oocytes. J Magn Reson 192:252–257
15. Selenko P, Wagner G (2007) Looking into live cells with in-cell NMR spectroscopy. J Struct Biol 158:244–253
16. Crane RF, Ruderman JV (2006) Using Xenopus oocyte extracts to study signal transduction. Methods Mol Biol 322:435–443
17. Murray AW (1991) Cell cycle extracts. Methods Cell Biol 36:581–605
18. Sakai T, Tochio H, Tenno T, Ito Y, Kokubo T, Hiroaki H, Shirakawa M (2006) In-cell NMR spectroscopy of proteins inside Xenopus laevis oocytes. J Biomol NMR 36:179–188

Chapter 26

19F NMR Spectroscopy for the Analysis of DNA G-Quadruplex Structures Using 19F-Labeled Nucleobase

Takumi Ishizuka, Hong-Liang Bao, and Yan Xu

Abstract

G-quadruplex structures have been suggested to be biologically important in processes such as transcription and translation, gene expression and regulation in human cancer cells, and regulation of telomere length. Investigation of G-quadruplex structures associated with biological events is therefore essential to understanding the functions of these molecules. We developed 19F-labeled nucleobases and introduced them into DNA sequences for 19F NMR spectroscopic analysis. Here, we present the 19F NMR methodology used in our research group for the study of G-quadruplex structures in vitro and in living cells.

Key words G-quadruplex structures, Human telomeres, Aptamer, 19F NMR spectroscopy

1 Introduction

1.1 G-Quadruplex Structures

G-quadruplexes are four-stranded nucleic acid secondary structures formed in specific G-rich sequences [1, 2]. G-quadruplex structures have attracted attention because of their important roles in biological events such as gene regulation [3, 4], telomere length regulation and protection [5–9], transcription [10–13], and DNA replication [14–19], suggesting that G-quadruplex structures are promising molecular targets for therapeutics and diagnostics [4, 20–30]. Previous studies have shown that human telomeric DNA forms different G-quadruplex topologies. A 22-nucleotide (nt) DNA with the sequence 5′-A(GGGTTA)3GGG-3′ can form an antiparallel-stranded basket-type G-quadruplex in sodium ion solution [31] and a parallel-stranded propeller-type G-quadruplex in crystals containing potassium ions [32]. In potassium solution, the sequence forms a (3 + 1) hybrid G-quadruplex [33–35] and also adopts different topologies [36, 37]. Direct observation of long telomeric-overhang DNA by atomic force microscopy (AFM) revealed that telomeric-overhang DNA

Danzhou Yang and Clement Lin (eds.), G-Quadruplex Nucleic Acids: Methods and Protocols, Methods in Molecular Biology, vol. 2035, https://doi.org/10.1007/978-1-4939-9666-7_26, © Springer Science+Business Media, LLC, part of Springer Nature 2019


Takumi Ishizuka et al.

forms a higher-order DNA structure containing consecutive G-quadruplexes [6]. Additionally, an intramolecular G-quadruplex structure is also formed by the thrombin-binding aptamer (TBA), a 15-nt DNA with the sequence 5′-GGTTGGTGTGGTTGG-3′, which binds specifically to human α-thrombin and exhibits anticoagulant properties [38, 39]. TBA has been widely used as a model G-quadruplex structure [40–43]; it forms an antiparallel G-quadruplex with a chair-type conformation, as reported by NMR and X-ray structural studies [44–46]. In addition to intramolecular G-quadruplexes, dimeric and tetrameric intermolecular G-quadruplexes have been reported to form from various DNA sequences. For instance, it has been suggested that a 12-nt human telomeric DNA with the sequence 5′-(TAGGGT)2-3′ forms parallel- and antiparallel-stranded intermolecular G-quadruplexes in potassium solution [47]. A 16-nt human telomeric sequence 5′-GGGT(TAGGGT)2-3′ can also assemble with the 6-nt human telomeric sequence 5′-TAGGGT-3′ in potassium solution to form an intermolecular (3 + 1) hybrid G-quadruplex topology, which has three strands oriented in one direction and one strand oriented in the opposite direction [5, 26, 48–50].

1.2 19F NMR for Studying the G-Quadruplex Structure

19F NMR spectroscopy has recently been used as a powerful tool for the analysis of biomolecule conformations [51–61]. The 19F nucleus is easily incorporated at desired sites in nucleic acids and provides highly sensitive, low-background signals in biological samples. Because fluorine is essentially absent from cells, in-cell 19F NMR spectra are free of background signals. Therefore, 19F NMR spectroscopy is an ideal tool for studying G-quadruplex structures in living cells [62–64]. An additional advantage of 19F NMR spectroscopy is that the 19F nucleus, which has 100% natural abundance (13C is 1.1% and 15N is 0.37%), offers a chemical shift dispersion that is more than 100-fold larger than that of 1H. Furthermore, its broader chemical shift range makes the 19F nucleus quite sensitive to the local environment, even in living cells [54, 57, 65]. Because 19F NMR signals are strongly dependent on the structural environment of the 19F label, it should be possible to distinguish different structures with the same sequence, such as single strands and G-quadruplexes, by their corresponding resonances (Fig. 1). We have recently demonstrated that the sensitivity and simplicity of the 19F NMR approach can be used to directly observe DNA and RNA G-quadruplexes in vitro and in living cells [62–64, 66–68]. In this chapter, we describe detailed procedures for preparing 19F-labeled DNA, the evaluation of 19F-labeled DNA G-quadruplexes in vitro and in living Xenopus laevis oocytes by 19F NMR spectroscopy, and the quantitative characterization of the thermodynamic properties of the G-quadruplexes.
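The sensitivity argument above can be made concrete with a short calculation. The sketch below estimates the resonance frequency of each nucleus on a given spectrometer and its receptivity relative to 1H (for spin-1/2 nuclei at constant field, approximately the cube of the gyromagnetic ratio times the natural abundance). The frequency ratios are assumed values taken from the IUPAC 2001 recommendations rather than from this chapter, the natural abundances are those quoted in the text, and the function names are illustrative.

```python
# Sketch: Larmor frequencies and relative receptivities of 13C, 15N, and 19F versus 1H.
# ASSUMPTION: frequency ratios (percent of the 1H frequency) follow the IUPAC 2001 values.
XI_PERCENT = {"1H": 100.0, "13C": 25.145020, "15N": 10.136767, "19F": 94.094011}

# Natural abundances as quoted in the text (1H treated as fully abundant).
ABUNDANCE = {"1H": 1.0, "13C": 0.011, "15N": 0.0037, "19F": 1.0}


def larmor_MHz(nucleus, proton_MHz):
    """Resonance frequency of `nucleus` on a spectrometer with the given 1H frequency."""
    return proton_MHz * XI_PERCENT[nucleus] / 100.0


def receptivity_vs_1H(nucleus):
    """Relative receptivity for spin-1/2 nuclei: (gamma_X / gamma_H)**3 x abundance."""
    ratio = XI_PERCENT[nucleus] / 100.0
    return ratio ** 3 * ABUNDANCE[nucleus]


for nucleus in ("13C", "15N", "19F"):
    print(
        f"{nucleus}: {larmor_MHz(nucleus, 400.0):.1f} MHz on a 400 MHz instrument, "
        f"receptivity vs 1H = {receptivity_vs_1H(nucleus):.2e}"
    )
```

On a 400 MHz instrument this gives roughly 376 MHz for 19F and 100 MHz for 13C, which agrees with the spectrometer frequencies reported with the characterization data in Subheading 3.1.1.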

Studying DNA G-Quadruplex Structures by 19F NMR


Fig. 1 Conceptual design of the method. The present study is based on the concept that 19F NMR signals are strongly dependent on the structural environment of the 19F label. 19F resonances with different chemical shifts are expected for the single strand and the G-quadruplex

This approach has several advantages over existing techniques. First, it is relatively easy to prepare 19F-labeled DNA molecules by introducing a 19F group into DNA sequences. Second, the absence of any natural fluorine background signal in DNA and cells results in a simple and clear 19F NMR spectrum that does not suffer from the high background signals seen in 1H NMR. Finally, the simplicity and sensitivity of 19F NMR can be used to easily distinguish different DNA G-quadruplex conformations under various conditions, even in living cells. We initially incorporated the nucleosides 2′-fluoro (2′-F) uridine and 2′-F-guanosine into G-quadruplexes. However, unclear signals were observed in the 19F NMR spectra (data not shown). It is believed that a stabilized G-quadruplex causes an overall reduction in the tumbling rate of the molecule and, consequently, a loss of 19F NMR sensitivity. In order to obtain sharper and higher-quality 19F NMR spectra, we introduced the 3,5-bis(trifluoromethyl)phenyl group into the DNA nucleobase, because the six equivalent 19F atoms in this moiety afford a high 19F NMR signal intensity.

2 Materials

1. 2′-Deoxyguanosine.
2. N-Bromosuccinimide.
3. Tetrakis(triphenylphosphine)palladium(0).
4. Copper(I) iodide.
5. Amberlite® IRA-67 free base.
6. 1-Ethynyl-3,5-bis(trifluoromethyl)benzene.
7. Celite.
8. Trimethylchlorosilane.
9. Isobutyric anhydride.
10. 28% Ammonia in water.
11. 4,4′-Dimethoxytrityl chloride.
12. Triethylamine.
13. 4-(Dimethylamino)pyridine.
14. Sodium bicarbonate.
15. Sodium sulfate.
16. N,N-Diisopropylethylamine.
17. 2-Cyanoethyl-N,N-diisopropylchlorophosphoramidite.
18. 5-Iodo-2′-deoxyuridine.
19. Silica gel for column chromatography.
20. 3-(Trifluoromethoxy)benzyl alcohol.
21. Phosphoramidites, resins, and other reagents (see Note 1).
22. Methylamine solution (40 wt % in water).
23. Ammonium formate.
24. Xenopus laevis oocytes (Kitagawa Scientific General Research Institute) (see Note 2).
25. Acetonitrile.
26. Dichloromethane.
27. Methanol.
28. Acetone.
29. N,N-Dimethylformamide.
30. Pyridine.
31. n-Hexane.
32. Ethyl acetate.
33. Chloroform.
34. Deuterated dimethyl sulfoxide (DMSO-d6).
35. Deuterated chloroform (CDCl3).
36. NAP10 column (GE Healthcare).

3 Methods

3.1 Synthesis of Phosphoramidite Bearing 19F-Labeled Nucleobases

3.1.1 Synthesis of 8FdG Phosphoramidite

As shown in Fig. 2a, the 19F-labeled nucleobase (8FdG phosphoramidite) can be prepared in five steps (13% total yield) starting from commercially available 2′-deoxyguanosine.
1. Transfer 188 mL of acetonitrile and 63 mL of water to a 500 mL round-bottom flask equipped with a stirring bar. To the solution, add 2′-deoxyguanosine (5.0 g, 18.7 mmol).
2. To the suspension, add N-bromosuccinimide (5.34 g, 30.0 mmol) in five portions over 15 min. Stir the reaction mixture at room temperature.
3. Monitor the progress of the bromination reaction by thin layer chromatography (TLC) using dichloromethane/methanol (4:1) as the mobile phase (Rf for 1: 0.43) (see Note 3).
4. After 1 h (completion of the reaction), filter the reaction mixture using a Buchner funnel and wash the residue with 100 mL of acetone.
5. Remove the solvent under reduced pressure using a belt-drive vacuum pump to yield 1 as a slightly orange solid (4.8 g, 74% yield).
6. Dissolve 5–10 mg of isolated 1 in deuterated dimethyl sulfoxide (DMSO-d6) and confirm the chemical structure by 1H NMR and high-resolution mass spectrometry (HRMS) (see Note 4). 1H NMR (400 MHz, DMSO-d6) δ 10.79 (s, 1H), 6.49 (s, 2H), 6.16 (dd, J = 7.2, 7.2 Hz, 1H), 5.25 (br s, 1H), 4.85 (br s, 1H), 4.39 (m, 1H), 3.80 (dt, J = 8.4, 2.8 Hz, 1H), 3.62 (dd, J = 11.6, 5.2 Hz, 1H), 3.50 (dd, J = 11.6, 6.0 Hz, 1H), 3.16 (m, 1H), 2.10 (ddd, J = 2.8, 6.8, 13.2 Hz, 1H). HRMS (ESI) for C10H11O4N5Br [M−H]−: Calcd. 344.0000; Found. 343.9981 (see Note 5).
7. For the cross-coupling reaction that introduces the 19F group (1 → 2 in Fig. 2a), transfer the purified 1 (2.0 g, 5.78 mmol), tetrakis(triphenylphosphine)palladium(0) (668 mg, 0.58 mmol), copper(I) iodide (220 mg, 1.16 mmol), Amberlite® IRA-67 free base (5 g), and 25 mL of dry N,N-dimethylformamide to a 100 mL round-bottom flask equipped with a stirring bar. Stir the solution at room temperature for 5 min under an argon atmosphere.
8. To the solution, add 1-ethynyl-3,5-bis(trifluoromethyl)benzene (2.05 mL, 11.56 mmol). Stir the solution at 60 °C in an oil bath under an argon atmosphere.
9. Monitor the progress of the cross-coupling reaction by TLC using dichloromethane/methanol (6:1) as the mobile phase (Rf for 2: 0.35).

Fig. 2 Synthesis of phosphoramidite bearing 19F-labeled nucleobases. (a) Synthesis of 2′-deoxyuridine phosphoramidite. (b) Synthesis of 2′-deoxyguanosine phosphoramidite

10. Upon completion of the reaction (after 20 h), cool to room temperature and filter the reaction mixture through Celite (20 g).
11. Wash the residue with 100 mL of methanol and remove the solvent under reduced pressure in a rotary evaporator.
12. Purify 2 by flash column chromatography using dichloromethane:methanol (95:5 → 4:1) as the solvent system (see Note 6).
13. Identify the fractions containing 2 by TLC, combine them, and remove the solvent under reduced pressure in a rotary evaporator to yield 2 as a yellow solid (1.87 g, 65% yield).
14. Dissolve 5–10 mg of isolated 2 in DMSO-d6 and confirm the chemical structure by 1H, 13C, 19F NMR, and HRMS. 1H NMR (400 MHz, DMSO-d6) δ 10.90 (s, 1H), 8.39 (s, 2H), 8.25 (s, 1H), 6.63 (br s, 2H), 6.39 (t, J = 7.2 Hz, 1H), 5.26 (d, J = 4.8 Hz, 1H), 4.83 (t, J = 6.0 Hz, 1H), 4.46 (m, 1H), 3.81 (m, 1H), 3.63 (m, 1H), 3.52 (m, 1H), 3.04 (m, 1H), 2.22 (ddd, J = 3.6, 6.8, 13.2 Hz, 1H). 13C NMR (100 MHz, DMSO-d6) δ 155.96, 153.97, 150.87, 132.16, 131.06, 130.73, 128.06, 123.34, 117.92, 87.70, 83.53, 82.49, 70.66, 61.83, 37.60. 19F NMR (376 MHz, DMSO-d6) δ −61.42. HRMS (ESI) for C20H14O4N5F6 [M−H]−: Calcd. 502.0944; Found. 502.0956. Fluorescence properties: λex = 386 nm, λem = 461 nm. Molar absorption coefficients (ε) in MeOH: λmax = 340 nm (ε = 13.8 × 10^3), 260 nm (ε = 8.9 × 10^3).
15. For protection with the isobutyryl group (2 → 3 in Fig. 2a), dissolve the purified 2 (1.8 g, 3.58 mmol) in 50 mL of dry pyridine and remove the solvent under reduced pressure using a rotary evaporator and a belt-drive vacuum pump. Repeat this step three times to dry 2 thoroughly.
16. Dissolve the dried 2 in 20 mL of dry pyridine under an argon atmosphere. To the solution, add trimethylchlorosilane (2.3 mL, 17.9 mmol). After 30 min, add isobutyric anhydride (3.0 mL, 17.9 mmol). Stir the reaction mixture at room temperature for 5 h.
17. Cool the reaction mixture using an ice bath and add 20 mL of water. Stir the reaction mixture for 5 min.
18. To the reaction mixture, add 20 mL of 28% ammonia solution (28% NH3 in H2O) and stir the reaction mixture for 15 min.
19. Monitor the progress of the reaction by TLC using dichloromethane/methanol (10:1) as the mobile phase (Rf for 3: 0.38).
20. Remove the solvent under reduced pressure in a rotary evaporator and add 100 mL of methanol to the residue.


21. Filter the precipitate (product) using a Buchner funnel and purify the residue from the filtrate by flash column chromatography using dichloromethane:methanol (100:0 → 85:15) as the solvent system.
22. Identify the fractions containing 3 by TLC, combine them, and remove the solvent under reduced pressure in a rotary evaporator to yield 3 as a yellow solid (1.16 g, 56% yield).
23. Dissolve 5–10 mg of isolated 3 in DMSO-d6 and confirm the chemical structure by 1H, 13C, 19F NMR, and HRMS. 1H NMR (400 MHz, DMSO-d6) δ 12.22 (s, 1H), 11.63 (s, 1H), 8.43 (s, 2H), 8.28 (s, 1H), 6.48 (t, J = 6.8 Hz, 1H), 5.30 (d, J = 4.8 Hz, 1H), 4.77 (t, J = 6.0 Hz, 1H), 4.51 (m, 1H), 3.84-3.81 (m, 1H), 3.62 (m, 1H), 3.51 (m, 1H), 3.12 (m, 1H), 2.81 (sep, J = 6.8 Hz, 1H), 2.23 (ddd, J = 3.6, 6.8, 13.2 Hz, 1H), 1.15 (d, J = 6.8 Hz, 3H), 1.15 (d, J = 6.8 Hz, 3H). 13C NMR (100 MHz, DMSO-d6) δ 180.19, 154.11, 148.63, 148.50, 132.44, 131.44, 131.11, 130.77, 130.44, 130.34, 126.79, 124.08, 123.39, 122.99, 121.36, 121.10, 118.65, 90.29, 87.66, 83.59, 81.80, 70.41, 61.65, 37.46, 34.71, 18.79, 18.74. 19F NMR (376 MHz, DMSO-d6) δ −61.44. HRMS (ESI) for C24H20O5N5F6 [M−H]−: Calcd. 572.1363; Found. 572.1371.
24. For protection with the 4,4′-dimethoxytrityl (DMTr) group (3 → 4 in Fig. 2a), dissolve the purified 3 (177 mg, 0.31 mmol) in 10 mL of dry pyridine and remove the solvent under reduced pressure using a rotary evaporator and a belt-drive vacuum pump. Repeat this step three times to dry 3 thoroughly.
25. Dissolve the dried 3 in 5 mL of dry pyridine under an argon atmosphere. To the solution, add 4,4′-dimethoxytrityl chloride (157 mg, 0.46 mmol), triethylamine (39 μL, 0.28 mmol), and 4-(dimethylamino)pyridine (1.5 mg, 0.012 mmol). Stir the reaction mixture at room temperature for 12 h.
26. Monitor the progress of the protection reaction by TLC using dichloromethane/methanol (10:1) as the mobile phase (Rf for 4: 0.46).
27. Upon completion of the reaction, dilute the reaction mixture with 50 mL of dichloromethane and add 30 mL of aqueous 5% sodium bicarbonate solution.
28. Extract the solution with 2 × 50 mL of dichloromethane. Combine the organic layers and dry them over sodium sulfate (5 g). Remove the organic solvent under reduced pressure in a rotary evaporator to obtain 4 as a crude product.
29. Purify the residue by flash column chromatography using dichloromethane:methanol (100:0 → 9:1) as the solvent system (see Note 7).


30. Identify the fractions containing 4 by TLC, combine them, and remove the solvent under reduced pressure in a rotary evaporator to yield 4 as a yellow solid (188 mg, 70% yield).
31. Dissolve 5–10 mg of isolated 4 in DMSO-d6 and confirm the chemical structure by 1H, 13C, and 19F NMR and HRMS. 1H NMR (400 MHz, DMSO-d6) δ 12.16 (s, 1H), 11.40 (s, 1H), 8.33 (s, 2H), 8.29 (s, 1H), 7.26–7.23 (m, 2H), 7.12–7.10 (m, 7H), 6.72–6.61 (m, 5H), 4.59–4.53 (m, 1H), 4.05–4.01 (m, 1H), 3.69 (s, 3H), 3.68 (s, 3H), 3.45 (t, J = 9.6 Hz, 1H), 3.12–3.03 (m, 2H), 2.76 (sep, J = 6.8 Hz, 1H), 2.44–2.37 (m, 1H), 1.14 (d, J = 6.8 Hz, 3H), 1.13 (d, J = 6.8 Hz, 3H). 13C NMR (100 MHz, DMSO-d6) δ 180.04, 157.79, 157.72, 154.08, 148.27, 148.07, 144.68, 135.49, 135.38, 132.33, 131.04, 130.71, 130.59, 129.57, 129.44, 127.61, 127.30, 126.29, 124.06, 123.34, 122.90, 121.41, 121.35, 112.63, 112.55, 90.52, 86.61, 85.01, 84.32, 81.73, 70.71, 64.67, 54.79, 54.74, 38.11, 34.67, 18.85, 18.58. 19F NMR (376 MHz, DMSO-d6) δ −61.39. HRMS (ESI) for C45H38O7N5F6 [M−H]−: Calcd. 874.2670; Found 874.2666.
32. For the final step (4 → 5 in Fig. 2a), dissolve the purified 4 (530 mg, 0.61 mmol) in 30 mL of dry dichloromethane/acetonitrile (1:1) and remove the solvent under reduced pressure using a rotary evaporator and a belt-drive vacuum pump. Repeat this step three times to dry 4 thoroughly.
33. Dissolve the dried 4 in 10 mL of dry dichloromethane/acetonitrile (1:1) under argon atmosphere. To the solution, add N,N-diisopropylethylamine (422 μL, 2.42 mmol) and 2-cyanoethyl-N,N-diisopropylchlorophosphoramidite (405 μL, 1.82 mmol). Stir the reaction mixture at room temperature for 2 h.
34. Monitor the progress of the reaction by TLC using n-hexane:ethyl acetate (1:3) as the mobile phase (Rf for 5: 0.79 and 0.73).
35. Upon completion of the reaction, dilute the reaction mixture with 100 mL of dichloromethane and add 100 mL of aqueous 5% sodium bicarbonate solution.
36. Extract the solution with 2 × 50 mL of dichloromethane. Combine the organic layers and dry them over sodium sulfate (5 g). Remove the organic solvent under reduced pressure in a rotary evaporator to obtain 5 as a crude product.
37. Purify the residue by flash column chromatography using n-hexane:ethyl acetate (3:2 → 2:3) as the solvent system (see Note 8).


38. Identify the fractions containing 5 by TLC, combine them, and remove the solvent under reduced pressure in a rotary evaporator to yield 5 as a solid (435 mg, 67% yield).
39. Dissolve 5–10 mg of isolated 5 in deuterated chloroform (CDCl3) and confirm the chemical structure by 19F and 31P NMR and HRMS. 19F NMR (376 MHz, CDCl3) δ −63.11. 31P NMR (161 MHz, CDCl3) δ 148.33, 147.84. HRMS (ESI) for C54H55O8N7F6P [M−H]−: Calcd. 1074.3748; Found 1074.3756.

3.1.2 Synthesis of 5FdU Phosphoramidite

As shown in Fig. 2b, the 19F-labeled 5FdU phosphoramidite can be prepared in three steps (55% total yield) starting from commercially available 5-iodo-2′-deoxyuridine.

1. Dissolve 5-iodo-2′-deoxyuridine (1.75 g, 4.94 mmol) in 15 mL of dry N,N-dimethylformamide and add tetrakis(triphenylphosphine)palladium(0) (571 mg, 0.49 mmol), copper(I) iodide (188 mg, 0.99 mmol), and N,N-diisopropylethylamine (1.72 mL, 9.9 mmol). Stir the solution at room temperature for 5 min under argon atmosphere.
2. To the solution, add 1-ethynyl-3,5-bis(trifluoromethyl)benzene (1.48 mL, 8.40 mmol). Stir the solution at room temperature under argon atmosphere.
3. Monitor the progress of the cross-coupling reaction by TLC using dichloromethane/methanol (4:1) as the mobile phase (Rf for 6: 0.73).
4. Upon completion of the reaction (after 12 h), filter the reaction mixture through Celite (20 g). Wash the residue with 50 mL of methanol.
5. Remove the solvent under reduced pressure in a rotary evaporator and add 100 mL of dichloromethane/methanol (1:1) solution to the residue.
6. Filter the precipitate (product) using a Buchner funnel. Repeat steps 5 and 6 four times to obtain 6.
7. Remove the residual solvent under reduced pressure with a belt-drive vacuum pump to yield 6 as a white solid (1.6 g, 70% yield).
8. Dissolve 5–10 mg of isolated 6 in DMSO-d6 and confirm the chemical structure by 1H, 13C, and 19F NMR and HRMS. 1H NMR (400 MHz, DMSO-d6) δ 11.78 (s, 1H), 8.53 (s, 1H), 8.14 (s, 1H), 8.13 (s, 2H), 6.12 (t, J = 6.4 Hz, 1H), 5.28 (d, J = 4.4 Hz, 1H), 5.20 (t, J = 4.8 Hz, 1H), 4.27 (m, 1H), 3.83 (q, J = 3.4 Hz, 1H), 3.68 (ddd, J = 3.5, 4.9, 11.9 Hz, 1H), 3.60 (ddd, J = 3.5, 4.9, 11.9 Hz, 1H), 2.19 (dd, J = 5.4, 6.2 Hz, 1H). 13C NMR (100 MHz, DMSO-d6) δ 161.20, 149.36, 145.38, 131.46, 131.06, 130.73, 130.40, 126.96, 125.10, 124.24, 121.89, 121.53, 96.97, 88.80, 87.62, 86.54, 85.08, 69.73, 60.67. 19F NMR (376 MHz, DMSO-d6) δ −61.52. HRMS (ESI) for C19H13O5N2F6 [M−H]−: Calcd. 463.0723; Found 463.0738.
9. For the second step (6 → 7 in Fig. 2b), dissolve the purified 6 (1.6 g, 3.5 mmol) in 30 mL of dry pyridine and remove the solvent under reduced pressure using a rotary evaporator and a belt-drive vacuum pump. Repeat this step three times to dry 6 thoroughly.
10. Dissolve the dried 6 in 20 mL of dry pyridine under argon atmosphere. To the solution, add 4,4′-dimethoxytrityl chloride (1.75 g, 5.2 mmol), triethylamine (432 μL, 3.1 mmol), and 4-(dimethylamino)pyridine (16.8 mg, 0.14 mmol). Stir the reaction mixture at room temperature for 2 h.
11. Monitor the progress of the protection reaction by TLC using dichloromethane/methanol (10:1) as the mobile phase (Rf for 7: 0.48).
12. Upon completion of the reaction, remove the solvent under reduced pressure in a rotary evaporator and add 100 mL of dichloromethane and 30 mL of aqueous 5% sodium bicarbonate solution to the residue.
13. Extract the solution with 2 × 50 mL of dichloromethane. Combine the organic layers and dry them over sodium sulfate (5 g). Remove the organic solvent under reduced pressure in a rotary evaporator to obtain 7 as a crude product.
14. Purify the residue by flash column chromatography using dichloromethane:methanol (100:0 → 93:7) as the solvent system.
15. Identify the fractions containing 7 by TLC, combine them, and remove the solvent under reduced pressure in a rotary evaporator to yield 7 as a solid (2.4 g, 90% yield).
16. Dissolve 5–10 mg of isolated 7 in DMSO-d6 and confirm the chemical structure by 1H NMR and HRMS. 1H NMR (400 MHz, DMSO-d6) δ 11.85 (s, 1H), 8.35 (s, 1H), 8.06 (s, 1H), 7.55 (s, 2H), 7.41–7.10 (m, 9H), 6.83–6.78 (m, 4H), 6.16 (t, J = 6.0 Hz, 1H), 5.37 (d, J = 4.4 Hz, 1H), 4.37–4.30 (m, 1H), 3.96 (q, J = 3.4 Hz, 1H), 3.62 (s, 6H), 3.24 (m, 2H), 2.39–2.26 (m, 2H). HRMS (ESI) for C40H31O7N2F6 [M−H]−: Calcd. 765.2030; Found 765.2048.
17. For the final step (7 → 8 in Fig. 2b), dissolve the purified 7 (1.15 g, 1.5 mmol) in 20 mL of dry acetonitrile and remove the solvent under reduced pressure using a rotary evaporator and a belt-drive vacuum pump. Repeat this step three times to dry 7 thoroughly.
18. Dissolve the dried 7 in 5 mL of dry dichloromethane under argon atmosphere. To the solution, add N,N-diisopropylethylamine (1.05 mL, 6.0 mmol) and 2-cyanoethyl-N,N-diisopropylchlorophosphoramidite (1.0 mL, 4.5 mmol). Stir the reaction mixture at room temperature for 4 h.
19. Monitor the progress of the reaction by TLC using n-hexane:ethyl acetate (1:2) as the mobile phase (Rf for 8: 0.74 and 0.66).
20. Upon completion of the reaction, dilute the reaction mixture with 150 mL of dichloromethane and add 150 mL of aqueous 5% sodium bicarbonate solution.
21. Extract the solution with 2 × 50 mL of dichloromethane. Combine the organic layers and dry them over sodium sulfate (5 g). Remove the organic solvent under reduced pressure in a rotary evaporator to obtain 8 as a crude product.
22. Purify the residue by flash column chromatography using n-hexane:ethyl acetate (75:25 → 55:45) as the solvent system.
23. Identify the fractions containing 8 by TLC, combine them, and remove the solvent under reduced pressure in a rotary evaporator to yield 8 as a solid (1.26 g, 87% yield).
24. Dissolve 5–10 mg of isolated 8 in CDCl3 and confirm the chemical structure by 31P NMR and HRMS. 31P NMR (161 MHz, CDCl3) δ 149.03, 148.58. HRMS (ESI) for C49H48O8N4F6P [M−H]−: Calcd. 965.3108; Found 965.3116.

3.2 Synthesis and Purification of 19F-Labeled Oligonucleotides

1. Dissolve the 19F-labeled phosphoramidite in dry acetonitrile to a final concentration of 0.2 M.
2. Use a DNA/RNA synthesizer (Nihon Techno Service Co., Ltd.) to synthesize the 19F-labeled DNA on controlled pore glass (CPG) supports (1 μmol) with general parameters, except for an extended coupling time (20 min) for the 19F-labeled phosphoramidite.
3. Cleave the synthesized 19F-labeled oligonucleotides from the CPG support with 1 mL of ammonium hydroxide/methylamine (AMA) solution for 20 min at room temperature and wash the supports with 0.5 mL of AMA solution (see Note 9).
4. Incubate the solution (1.5 mL) from step 3 for 10 min at 65 °C to remove the protecting groups from the 19F-labeled oligonucleotides.
5. Evaporate the solution and add 300 μL of water for HPLC purification.
6. Purify the 19F-labeled oligonucleotides by HPLC. Elute with ammonium formate containing 5–20% (vol/vol) acetonitrile with a linear gradient at a flow rate of 1.0 mL/min for 50 min. Collect the fractions containing 19F-labeled oligonucleotides.
7. Lyophilize the purified sample and add 1 mL of deionized water. Apply to a NAP10 column (GE Healthcare) for desalting.
8. Confirm the identity and purity of the 19F-labeled oligonucleotides by MALDI-TOF MS (Bruker) in negative mode. Use dT5 ([M−H]−: 1458.012), dT8 ([M−H]−: 2370.603), and dT17 ([M−H]−: 5108.376) as external standards. Typical MALDI-TOF MS results (m/z) are as follows: (1) 8FdG-modified oligonucleotides: 5′-TA(8FdG)GGT-3′, calculated 2082.4, found 2081.5; 5′-TA(8FdG)GGTTAGGGT-3′, calculated 3991.6, found 3991.3; 5′-(8FdG)GTTGGTGTGGTTGG-3′, calculated 4961.2, found 4961.6. (2) 5FdU-modified oligonucleotides: 5′-GG(5FdU)TGGTGTGGTTGG-3′, calculated 4947.2, found 4947.3.

3.3 19F NMR Sample Preparation

3.3.1 In Vitro Sample

Prepare the samples of 19F-labeled and natural 12-mer DNA for 1H NMR and 19F NMR experiments as follows: 0.1 mM DNA strand concentration containing 10% (vol/vol) D2O in the presence of 100 mM KCl and 20 mM potassium phosphate buffer (pH 7.0) in a 0.15 mL volume (see Note 10).
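The composition above can be turned into pipetting volumes with the usual C1V1 = C2V2 dilution relation. The following is a minimal sketch only: the stock concentrations (1 mM DNA, 1 M KCl, 200 mM potassium phosphate, neat D2O) are assumptions chosen for illustration and are not specified in this protocol; the target concentrations and the 0.15 mL final volume are taken from the text.

```python
# Sketch: stock volumes for the 0.15 mL in vitro NMR sample (C1*V1 = C2*V2).
# Stock concentrations are HYPOTHETICAL; targets and final volume are from the text.
FINAL_VOLUME_UL = 150.0

# name: (stock concentration, target concentration), same unit within each pair
components = {
    "DNA (mM)": (1.0, 0.1),                    # assumed 1 mM strand stock
    "KCl (mM)": (1000.0, 100.0),               # assumed 1 M stock
    "K-phosphate pH 7.0 (mM)": (200.0, 20.0),  # assumed 200 mM stock
    "D2O (% vol/vol)": (100.0, 10.0),          # neat D2O
}


def stock_volumes(parts, final_volume_ul):
    """Return the volume (uL) of each stock plus the water needed to fill up."""
    volumes = {
        name: final_volume_ul * target / stock
        for name, (stock, target) in parts.items()
    }
    water = final_volume_ul - sum(volumes.values())
    if water < 0:
        raise ValueError("stocks are too dilute for the requested final volume")
    volumes["water"] = water
    return volumes


volumes = stock_volumes(components, FINAL_VOLUME_UL)
dna_nmol = 0.1 * FINAL_VOLUME_UL  # mM x uL = nmol of DNA strands in the tube

for name, vol in volumes.items():
    print(f"{name:26s} {vol:6.1f} uL")
print(f"DNA strands in sample: {dna_nmol:.0f} nmol")
```

With these assumed stocks each component contributes 15 μL and water makes up the remaining 90 μL; other stock concentrations only change the numbers passed in `components`.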

3.3.2 In-Cell Sample Preparation

Prepare the DNA sample for in-cell 19F NMR as follows: 5 mM strand concentration of 19F-labeled 6-mer DNA and 5 mM strand concentration of natural 16-mer DNA containing 10% (vol/vol) D2O in the presence of 100 mM KCl and 20 mM potassium phosphate buffer (pH 7.0) in a volume of 50 μL. Incubate the mixture at 4 °C overnight.

3.4 19F NMR Measurements

In this study, 1D 19F NMR spectra are measured on a Bruker 400 MHz spectrometer equipped with a 5 mm PABBO probe, with the observation channel tuned to 19F (376.05 MHz) (see Note 11). Trifluoroacetic acid (CF3COOH, −75.66 ppm) is used as an external standard for calibration. 1D 19F spectra are acquired with a one-pulse program (90° pulse with a duration of 15 μs). The following experimental parameters are used for 19F NMR measurements: spectral width: 89.3 kHz, acquisition time: 0.73 s, relaxation delay: 1 s, and number of scans: 1024 or 2048.
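A few derived quantities follow directly from the acquisition parameters listed above and can help when planning instrument time. The sketch below computes the spectral width in ppm, the digital resolution implied by the acquisition time, and an approximate experiment duration. It assumes that each scan costs only the acquisition time plus the relaxation delay, so pulse lengths, dummy scans, and instrument overhead are ignored; the function names are illustrative.

```python
# Sketch: quantities derived from the 19F acquisition parameters in the text.
# Assumption: time per scan = acquisition time + relaxation delay (overheads ignored).
SPECTRAL_WIDTH_HZ = 89_300.0  # 89.3 kHz
OBS_FREQ_MHZ = 376.05         # 19F observation frequency
ACQ_TIME_S = 0.73
RELAX_DELAY_S = 1.0


def spectral_width_ppm(sw_hz, freq_mhz):
    """Spectral width in ppm (Hz divided by the observation frequency in MHz)."""
    return sw_hz / freq_mhz


def experiment_minutes(n_scans, acq_time_s, relax_delay_s):
    """Approximate experiment duration in minutes."""
    return n_scans * (acq_time_s + relax_delay_s) / 60.0


sw_ppm = spectral_width_ppm(SPECTRAL_WIDTH_HZ, OBS_FREQ_MHZ)
resolution_hz = 1.0 / ACQ_TIME_S  # digital resolution before zero filling
minutes_1024 = experiment_minutes(1024, ACQ_TIME_S, RELAX_DELAY_S)
minutes_2048 = experiment_minutes(2048, ACQ_TIME_S, RELAX_DELAY_S)

print(f"spectral width: {sw_ppm:.1f} ppm")
print(f"digital resolution: {resolution_hz:.2f} Hz")
print(f"1024 scans: {minutes_1024:.1f} min; 2048 scans: {minutes_2048:.1f} min")
```

Under these assumptions a 1024-scan spectrum takes roughly half an hour and a 2048-scan spectrum roughly one hour, which is worth keeping in mind for the temperature series and for in-cell samples.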

3.5 CD Sample Preparation and Measurements

1. Prepare the natural and 19F-labeled samples for the CD experiment as follows: 10 μM DNA strand concentration in the presence of 100 mM KCl and 20 mM potassium phosphate buffer (pH 7.0) in a 0.3 mL volume.
2. Anneal the samples by heating to 95 °C for 5 min, then gradually cooling to room temperature.
3. Obtain the CD spectra using a JASCO J-820 CD spectrophotometer. Record the spectra using a 1 cm path length cell. Obtain the melting curves by monitoring at 290 nm.

3.6 Fluorescence Sample Preparation and Measurements

1. Prepare ODN1 (10 μM) and ODN3 (0–20 μM) in 100 mM KCl and 10 mM Tris–HCl buffer (pH 7.0) in a 0.3 mL volume.
2. Anneal the samples by heating to 95 °C for 5 min, then gradually cooling to room temperature.
3. Obtain the fluorescence spectra using a JASCO FP-8200 spectrofluorometer. Record the spectra using a 1 cm path length cell. Excite the probes used in this study at 386 nm. For each sample, accumulate at least two spectral scans over a wavelength range from 400 to 700 nm. For 8FdG, record the emission spectra at an excitation wavelength of 386 nm and the excitation spectra at an emission wavelength of 461 nm.

3.7 In Vitro 19F NMR Spectroscopy for Studying G-Quadruplex Structure

3.7.1 Analysis of Aptamer DNA G-Quadruplex by 19F NMR

In this section, we show that 19F NMR is a powerful method for characterizing the thrombin-binding aptamer (TBA) DNA G-quadruplex, which is widely used as a model structure for studying G-quadruplex aptamers (Fig. 3a). Each thymidine in the TBA sequence is replaced in turn with a 19F-labeled thymidine analog (5FdU), and the conformational transition detected by 19F NMR is evaluated. We successfully observed the structural change between the G-quadruplex and the unstructured single strand by 19F NMR spectroscopy. On the basis of the 19F NMR signals, the thermodynamic parameters of these aptamer DNA G-quadruplexes were also determined. Furthermore, we show that the 19F NMR method can be used to observe the complex formed by the TBA G-quadruplex and thrombin.

Fig. 3 (a) Schematic cartoon of the NMR structure of the G-quadruplex aptamer. (b) Sequences of the 19F-labeled oligonucleotides synthesized

Fig. 4 CD spectra of all 19F-labeled oligonucleotides. The CD spectrum of native TBA (WT) is indicated by a dashed line. Condition: [DNA] = 5 μM, [KCl] = 100 mM, [Tris–HCl (pH 7.0)] = 10 mM

1. To study the G-quadruplex structure by 19F NMR spectroscopy, the 19F-labeled thymidine derivative (5FdU) was incorporated at each of the thymine positions in the TBA sequence, one at a time, using phosphoramidite chemistry (Fig. 3b) (see Subheading 3.2).
2. We first investigated the conformation of the 19F-labeled oligonucleotides by CD spectroscopy. In the presence of 100 mM potassium ions, the CD spectra of TBA oligonucleotides with and without 19F labeling showed positive and negative peaks at 295 and 265 nm, respectively (Fig. 4). The CD spectra are consistent with an antiparallel chair-like G-quadruplex structure, in accord with previous reports [42, 43], indicating that incorporation of the 19F-labeled nucleoside in the loop regions does not affect the folding of the G-quadruplex.
3. To test the influence of 19F labeling on TBA sequences, we next evaluated the conformational behavior of all 19F-labeled oligonucleotides (T3, T4, T7, T9, T12, and T13) by 19F NMR spectroscopy. A typical result (for T12) is shown in Fig. 5. In the absence of potassium ions, a single peak at −62.88 ppm was observed for the single-stranded T12, whereas in the presence of 100 mM potassium ions, a new peak at −62.57 ppm appeared, and the original peak of the unstructured single strand (−62.78 ppm) disappeared completely. Upon heating to 70 °C, the G-quadruplex peak disappeared and only the single-strand peak (−62.88 ppm) was detected in the 19F NMR spectrum.
4. To further confirm that the signal at −62.57 ppm was from the G-quadruplex, we performed a 1H NMR experiment to observe the imino protons that arise from G-quadruplex formation. As shown in Fig. 5, eight peaks assignable to the TBA G-quadruplex were observed at 11.5–12.5 ppm, corresponding to the single G-quadruplex peak in the 19F NMR spectrum. At 70 °C, no imino peak was observed in the 1H NMR spectrum.
5. We next performed a temperature-dependent experiment to confirm the assignment of the two signals in the 19F NMR spectra (Fig. 6a). When the temperature was increased, the intensity of the signals for the G-quadruplex decreased (red circles for T3, T4, T7, T9, T12, and T13), whereas that of the signals for the single-stranded DNA increased (blue circles). This result indicates that all of the G-quadruplexes unfold to a single strand at high temperatures.
6. Using the 19F NMR peak areas at various temperatures, we characterized the melting process. The melting curves estimated from the 19F NMR peak areas are shown in Fig. 6b.
7. The thermodynamics of the melting process were quantified from the resonances of the different conformations (Table 1).
8. Using 19F NMR spectroscopy, we further investigated the interaction of the G-quadruplex aptamer with thrombin. When thrombin was added to the G-quadruplex aptamer sample, a new signal was detected in the 19F NMR spectrum (Fig. 7). The signal was assigned to the complex between the G-quadruplex aptamer and thrombin, demonstrating that 19F NMR spectroscopy is a useful tool for studying binding events between G-quadruplex aptamers and protein targets.

Fig. 5 19F NMR spectra of 19F-labeled oligonucleotide T12 in the presence or absence of KCl. Condition: [DNA] = 100 μM, [KCl] = 100 mM, [Tris–HCl buffer (pH 7.0)] = 10 mM. 1H imino proton NMR of T12 corresponding to 19F NMR at 25 or 70 °C
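The route from 19F peak areas to thermodynamic parameters (steps 6 and 7) can be sketched as a van't Hoff analysis. The example below is illustrative only and rests on several assumptions not stated in the protocol: a two-state, unimolecular folded/unfolded equilibrium; a folded fraction taken as the G-quadruplex peak area divided by the sum of both peak areas; and parameters expressed for the unfolding direction (positive ΔH and ΔS). The "data" are synthetic values generated from the T12 entries of Table 1 so that the fit can be checked against a known answer.

```python
import numpy as np

R = 8.314  # gas constant, J/(mol K)


def folded_fraction(temps_k, dh_j, ds_j):
    """Two-state folded fraction for an unfolding equilibrium with dH, dS."""
    k_unfold = np.exp(-dh_j / (R * temps_k) + ds_j / R)
    return 1.0 / (1.0 + k_unfold)


def vant_hoff_fit(temps_k, frac_folded):
    """Fit ln K = -dH/(R T) + dS/R; return (dH in J/mol, dS in J/(mol K))."""
    k_unfold = (1.0 - frac_folded) / frac_folded
    slope, intercept = np.polyfit(1.0 / temps_k, np.log(k_unfold), 1)
    return -slope * R, intercept * R


# Synthetic "peak-area" data: in practice frac_folded would be
# area(G-quadruplex peak) / (area(G-quadruplex peak) + area(single-strand peak)).
temps_k = np.arange(30, 65, 5) + 273.15           # 30 to 60 degrees C
frac = folded_fraction(temps_k, 174.4e3, 548.2)   # T12 magnitudes from Table 1

dh_fit, ds_fit = vant_hoff_fit(temps_k, frac)
tm_k = dh_fit / ds_fit  # temperature at which K = 1 for a unimolecular transition

print(f"dH = {dh_fit / 1000:.1f} kJ/mol, dS = {ds_fit:.1f} J/(mol K)")
print(f"Tm = {tm_k - 273.15:.1f} degrees C")
```

With real spectra, only temperatures at which both peaks can be integrated reliably should enter the fit, because the logarithm of K becomes very sensitive to integration error when one population is close to zero.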

Fig. 6 (a) 19F NMR spectra of 19F-labeled oligonucleotides at different temperatures. Blue and red spots indicate the single strand and the G-quadruplex, respectively. Temperatures are indicated on the right. (b) Melting profiles derived from the relative peak areas of the 19F signals at different temperatures


Table 1 Thermodynamic parameters for G-quadruplex conformational transition

Sample   ΔH (kJ/mol)   ΔS (J/mol K)   ΔG298K (kJ/mol)
T3       166.7         524.4          10.4
T4       168.9         533.4          9.9
T7       163.7         525.3          7.1
T9       192.6         596.7          14.7
T12      174.4         548.2          11.0
T13      175.1         552.1          10.5

The thermodynamic parameters were determined from van't Hoff plots. The experimental errors for the enthalpy (ΔH) and entropy (ΔS) changes were ±5 kJ/mol and ±10 J/mol K, respectively
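The three columns of Table 1 are linked by ΔG = ΔH − TΔS, so the tabulated free energies can be checked against the enthalpy and entropy values. The short sketch below does this at T = 298 K. The numbers are used exactly as printed, which assumes that ΔH, ΔS, and ΔG all refer to the same direction of the transition and therefore share the same sign.

```python
# Sketch: consistency check of Table 1 using dG = dH - T*dS at 298 K.
# Values are taken as printed; all three quantities are assumed to share one sign.
TABLE_1 = {
    # sample: (dH in kJ/mol, dS in J/(mol K), reported dG at 298 K in kJ/mol)
    "T3": (166.7, 524.4, 10.4),
    "T4": (168.9, 533.4, 9.9),
    "T7": (163.7, 525.3, 7.1),
    "T9": (192.6, 596.7, 14.7),
    "T12": (174.4, 548.2, 11.0),
    "T13": (175.1, 552.1, 10.5),
}


def gibbs_kj(dh_kj, ds_j, temp_k=298.0):
    """Free energy change in kJ/mol from dH (kJ/mol) and dS (J/(mol K))."""
    return dh_kj - temp_k * ds_j / 1000.0


recomputed = {name: gibbs_kj(dh, ds) for name, (dh, ds, _) in TABLE_1.items()}
max_dev = max(abs(recomputed[name] - TABLE_1[name][2]) for name in TABLE_1)

for name, value in recomputed.items():
    print(f"{name:4s} recomputed {value:5.2f}  reported {TABLE_1[name][2]:5.1f}")
print(f"largest deviation: {max_dev:.2f} kJ/mol")
```

The recomputed values agree with the reported ones to within rounding, well inside the stated experimental errors.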

Fig. 7 Discrimination of G-quadruplex structure and G-quadruplex-protein complex. 19F NMR spectra of 19F-labeled TBA oligonucleotide (T9) with 0 mM KCl (a), with 100 mM KCl (b), and after addition of thrombin (c). Condition: [DNA] = 100 μM, [KCl] = 0 or 100 mM, [Tris–HCl buffer (pH 7.0)] = 10 mM, [thrombin] = 10 μM

3.7.2 Analysis of the (3 + 1) DNA G-Quadruplex by 19F NMR

We describe a procedure for studying the human telomeric (3 + 1) DNA G-quadruplex structure using a multifunctional guanine derivative, 8FdG, which serves as a G-quadruplex stabilizer, a fluorescent probe for detecting G-quadruplex formation, and a 19F sensor for observing the G-quadruplex. We demonstrated for the first time that 8FdG provides these three functions simultaneously in a single G-quadruplex DNA. The results shown in this section suggest that 8FdG can be broadly used for studying G-quadruplex structure and serves as a powerful tool for examining the molecular basis of G-quadruplex formation.

Fig. 8 Chemical structure of guanine derivative 8FdG (a) and sequences used in this study (b). 8-Substituted 8FdG shows a predominant syn conformation and induces fluorescence in the G-quadruplex; it also acts as a 19F sensor for monitoring the G-quadruplex

Fig. 9 CD spectra (a) and CD melting curves (b) of intermolecular G-quadruplexes formed by the 8FdG-substituted ODN1 and ODN3, and by the natural sequences ODN2 and ODN3. (c) Fluorescence spectra of ODN1 (10 μM) and ODN3 (0–20 μM) in 100 mM KCl and 10 mM Tris–HCl buffer (pH 7.0) at 25 °C with a mole fraction variation (λex = 386 nm). Inset: fluorescence image of ODN1 only (1:0) and ODN1/ODN3 (1:1) hybrid G-quadruplex after illumination with a UV lamp (365 nm). (d) Titration plot of the fluorescence monitored at 450 nm for the (3 + 1) hybrid G-quadruplex

1. 8FdG, which bears a 3,5-bis(trifluoromethyl)benzene group at the 8-position of guanine, provides three kinds of functions (Fig. 8a). First, 8FdG stabilizes the DNA G-quadruplex structure, because modification at the 8-position of guanine induces the syn conformation in the G-quadruplex structure. Second, modification at the 8-position of guanine via an acetylene linker changes the fluorescence properties by extending the π-conjugation, giving a fluorescent nucleobase that can be applied to the observation of the G-quadruplex. Lastly, 8FdG can be employed as a sensitive 19F probe for studying the DNA G-quadruplex structure.
2. To test the potency of the 19F-labeled guanosine derivative (8FdG) in the G-quadruplex structure, the labeled oligonucleotides containing 8FdG were synthesized using phosphoramidite chemistry (Fig. 8b) (see Subheading 3.2).
3. The effect of 8FdG on the stability of the intermolecular (3 + 1) human telomeric G-quadruplex formed by the 8FdG-modified single-repeat (6-nt, ODN1) and three-repeat (16-nt, ODN3) human telomeric sequences was examined by CD measurement (Fig. 9a). According to the melting curves (Fig. 9b), 8FdG substitution of a dG that is expected to adopt the syn conformation increases the thermal stability of the G-quadruplex in comparison with the G-quadruplex formed by the natural sequences (ODN2 and ODN3), demonstrating that 8FdG is a G-quadruplex stabilizer. Thermal stabilization by 8FdG substitution was also observed in an intermolecular G-quadruplex of human telomere DNA (ODN4 vs. ODN5) and an intramolecular G-quadruplex of a TBA (ODN6 vs. ODN7) (for detailed information, see Ref. 63).
4. To investigate 8FdG as a potential fluorescent probe for observing the G-quadruplex structure, we next performed fluorescence measurements to follow the fluorescence change when ODN1 and ODN3 formed the G-quadruplex in the presence of 100 mM potassium ion (Fig. 9c) (see Note 12). The intensity of the emission was monitored at 450 nm (λex = 386 nm) during G-quadruplex formation with increasing ODN3 concentration. A clear fluorescence derived directly from G-quadruplex formation is visible to the naked eye (inset in Fig. 9c). As shown in the Job plot of the fluorescence emission monitored at 450 nm (Fig. 9d), a clear inflection point around 50% indicates a 1:1 stoichiometry for the formation of interstrand G-quadruplexes by ODN1 and ODN3.
5. Finally, we investigated whether 8FdG can be used as a sensor for studying the G-quadruplex conformation by 19F NMR. Because 19F NMR signals depend strongly on the structural environment of the 19F label, it should be possible to distinguish different DNA structures of the same sequence by their corresponding resonances (Fig. 10a). We performed concentration- and temperature-dependent experiments to investigate the structural behavior of ODN1 and ODN3 in the formation of the DNA G-quadruplex by 19F NMR. As shown in Fig. 10b, c, we observed a single signal (−62.71 ppm; indicated by a red circle), which was derived from the G-quadruplex formed by ODN1 and ODN3. When the concentration of ODN3 was decreased or the temperature was increased, a new signal at −62.83 ppm corresponding to the unfolded single strand appeared. The results suggest that the transition between the G-quadruplex and the single strand can be monitored by 19F NMR. A conformational transition was also detected by 19F NMR for the intermolecular G-quadruplex structure of human telomere DNA (ODN4) and the intramolecular G-quadruplex structure of a TBA (ODN6) (Fig. 10d).
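The stoichiometry argument in step 4 rests on locating the inflection point of the fluorescence titration. One simple way to estimate such a breakpoint is to fit two straight lines to the titration and take their intersection, as sketched below. This is not the analysis used by the authors, and the intensity values are synthetic, idealized data (a linear rise that saturates at a 1:1 ratio) invented purely to illustrate the procedure; only the ODN1 concentration (10 μM) and the ODN3 range (0–20 μM) come from the text.

```python
import numpy as np


def titration_breakpoint(x, y, min_points=3):
    """Estimate a breakpoint as the intersection of the best two-line fit."""
    best_resid, best_x = None, None
    for split in range(min_points, len(x) - min_points + 1):
        left = np.polyfit(x[:split], y[:split], 1)    # slope, intercept
        right = np.polyfit(x[split:], y[split:], 1)
        if abs(left[0] - right[0]) < 1e-12:
            continue  # parallel lines have no intersection
        resid = (np.sum((np.polyval(left, x[:split]) - y[:split]) ** 2)
                 + np.sum((np.polyval(right, x[split:]) - y[split:]) ** 2))
        if best_resid is None or resid < best_resid:
            best_resid = resid
            best_x = (right[1] - left[1]) / (left[0] - right[0])
    return best_x


odn1_um = 10.0
odn3_um = np.arange(0.0, 22.0, 2.0)               # 0, 2, ..., 20 uM
intensity = 10.0 * np.minimum(odn3_um, odn1_um)   # SYNTHETIC, idealized signal

x_break = titration_breakpoint(odn3_um, intensity)
mole_fraction = x_break / (x_break + odn1_um)     # mole fraction of ODN3 at the break

print(f"breakpoint at [ODN3] = {x_break:.1f} uM "
      f"(ODN3 mole fraction {mole_fraction:.2f})")
```

For the idealized data the lines intersect at 10 μM ODN3, that is, at a mole fraction of 0.5, which corresponds to a 1:1 complex. Real titration curves are rounded near the equivalence point, so the intersection should be read as an estimate rather than an exact value.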
3.8 In-Cell 19F NMR Spectroscopy for Studying G-Quadruplex Structure

1. First, prepare the DNA sample for in-cell 19F NMR as described in Subheading 3.3.2.
2. Prepare an injection needle (see Note 13).
3. Manually inject 50 nL of the prepared DNA sample into each Xenopus laevis oocyte with an IM-30 Electric Microinjector. Inject about 150 oocytes (Fig. 11a).
4. After injection, remove any obviously damaged cells. Transfer the remaining oocytes to a disposable dish and wash carefully with oocyte stocking buffer.
5. Transfer the washed oocytes to a Shigemi tube and maintain them in oocyte stocking buffer containing 10% (vol/vol) D2O.
6. Measure the in-cell 19F NMR spectra. Figure 11b shows a comparison of the 19F NMR spectra of ODN1 and ODN3 in vitro (top and middle panels) and upon oocyte injection (bottom panel). ODN1 alone gave the single-strand signal, whereas the mixture of ODN1 and ODN3 gave the G-quadruplex signal. Only one signal was observed in the spectrum in the bottom panel, and its chemical shift was identical to that observed for the corresponding G-quadruplex in the in vitro 19F NMR spectrum (middle panel). This result reveals that ODN1 and ODN3 formed intermolecular G-quadruplex structures inside living cells.

Fig. 10 (a) Concept for the detection of different DNA structures by a 19F label. Two 19F resonances of different chemical shifts are expected for the single-stranded DNA and the G-quadruplex. The purple box represents 8FdG. (b) 19F NMR spectra of ODN1 and ODN3 at different concentrations. ODN1 (100 μM) was mixed with ODN3 (0, 33, 66, and 100 μM). The mole fraction of ODN1 and ODN3 is shown on the right. (c) 19F NMR spectra at different temperatures. [DNA] = 100 μM, [Tris–HCl buffer (pH 7.0)] = 10 mM, [KCl] = 100 mM. (d) 19F signal change of ODN4 or ODN6 in the absence and presence of KCl. Red and blue circles indicate the G-quadruplex and the unstructured single strand, respectively

Fig. 11 (a) Schematic overview of in-cell 19F NMR experiments. For in-cell 19F NMR applications in Xenopus oocytes, the DNA sample was injected into the oocytes. Comparison with the position of the reference in vitro spectrum provides a reliable determination of intracellular DNA conformation. (b) Comparison of 19F NMR spectra of the in vitro DNA samples (top and middle) and in Xenopus oocytes (bottom)

4 Notes

1. dA-, dG-, and dT-CE phosphoramidites and dG- and dT-CPGs were used for DNA synthesis. All phosphoramidites and CPGs were purchased from Glen Research.
2. Experiments involving live Xenopus oocytes must conform to appropriate national and institutional regulations. The procedures in this protocol were approved by the Institutional Animal Care and Use Committee of Miyazaki University.
3. Thin layer chromatography was performed using TLC Silica gel 60 F254 (Merck). Compounds were visualized by staining with a potassium permanganate solution.
4. The sample for HRMS should be dissolved in LC-MS grade methanol. High-resolution mass spectra (HRMS) and electrospray ionization mass spectra (ESI-MS) were recorded on a Thermo Scientific Q Exactive instrument.
5. 1H, 13C, 19F, and 31P NMR spectra were recorded on a Bruker (AV-400 M) magnetic resonance spectrometer. DMSO-d6 and CDCl3 were used as the solvents. 1H chemical shifts (δ) are reported in parts per million (ppm) referenced to the residual protonated solvent peak (DMSO-d6, δ = 2.50; CDCl3, δ = 7.26). Coupling constants (J) are given in Hz and are correct to within 0.5 Hz. Signal patterns are indicated as br, broad; s, singlet; d, doublet; t, triplet; m, multiplet.
6. Purification of products was also performed on a middle pressure liquid chromatography (MPLC) system (EPCLC-AI580S, Yamazen Corporation) equipped with a silica gel column (Hi-Flash Column, Yamazen Corporation).
7. For 4, n-hexane:ethyl acetate (1:5) can also be used as the solvent system (Rf for 4: 0.46). To purify 4 with this solvent pair, use n-hexane:ethyl acetate (3:2 → 1:4) as the solvent system.
8. Purification of 5 and 8 can also be performed using a recycling preparative HPLC system (LC-9201, Japan Analytical Industry) with JAIGEL-1H and 2H columns and CHCl3 as the eluent running at 3.0 mL/min.
9. Ammonium hydroxide/methylamine (AMA) solution is a 1:1 mixture of ammonium hydroxide solution (28% in water) and methylamine solution (40 wt% in water). The solution should be made up fresh before use.
10. A Shigemi 5-mm symmetrical NMR microtube was used for in vitro and in-cell 19F NMR.
11. Because sensitivity is a limiting factor, 19F cryoprobes are recommended but are not absolutely necessary. Similarly, higher magnetic fields (400 MHz or above) are recommended.
12. The addition of ODN3 results in strong fluorescence, whereas a control oligonucleotide sequence, 5′-GAGT(TAGAGT)2-3′, that cannot form a G-quadruplex does not induce an increase in fluorescence intensity.
13. The injection volume should be around 50 nL. Following the equation V = 4/3πR3, an individual drop of this volume has a radius (R) of about 0.23 mm. Prepare a needle with an aperture that can dispense such a drop.
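The drop-size calculation in Note 13 can be reproduced numerically. The sketch below applies V = 4/3πR3 to a 50 nL drop and also estimates the amount of each strand delivered per oocyte at the 5 mM strand concentration given in Subheading 3.3.2. It treats the drop as an ideal sphere, which is a simplification; the function names are illustrative.

```python
import math


def sphere_radius_mm(volume_nl):
    """Radius (mm) of a sphere of the given volume, from V = 4/3 * pi * R^3."""
    volume_mm3 = volume_nl * 1e-3  # 1 uL = 1 mm^3, so 1 nL = 1e-3 mm^3
    return (3.0 * volume_mm3 / (4.0 * math.pi)) ** (1.0 / 3.0)


def injected_pmol(volume_nl, conc_mm):
    """Amount delivered per injection in pmol (nL x mmol/L = pmol)."""
    return volume_nl * conc_mm


radius_mm = sphere_radius_mm(50.0)
diameter_mm = 2.0 * radius_mm
pmol_per_oocyte = injected_pmol(50.0, 5.0)

print(f"radius = {radius_mm:.3f} mm, diameter = {diameter_mm:.3f} mm")
print(f"each strand delivered per oocyte: {pmol_per_oocyte:.0f} pmol")
```

For 50 nL the equation gives a radius of about 0.23 mm (diameter about 0.46 mm), and at 5 mM each injection delivers about 250 pmol of each strand.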

Acknowledgments

This work was supported by JSPS KAKENHI (26288083, 17H03091, 16K17938). Support from the Takeda Science Foundation and the Nakatani Foundation Scholarship is also acknowledged.

References

1. Xu Y (2011) Chemistry in human telomere biology: structure, function and targeting of telomere DNA/RNA. Chem Soc Rev 40:2719–2740
2. Hansel-Hertsch R, Di Antonio M, Balasubramanian S (2017) DNA G-quadruplexes in the human genome: detection, functions and therapeutic potential. Nat Rev Mol Cell Biol 18:279–284
3. Hirashima K, Seimiya H (2015) Telomeric repeat-containing RNA/G-quadruplex-forming sequences cause genome-wide alteration of gene expression in human cancer cells in vivo. Nucleic Acids Res 43:2022–2032
4. Rhodes D, Lipps HJ (2015) G-quadruplexes and their regulatory roles in biology. Nucleic Acids Res 43:8627–8637
5. Xu Y, Sato H, Sannohe Y, Shinohara K, Sugiyama H (2008) Stable lariat formation based on a G-quadruplex scaffold. J Am Chem Soc 130:16470–16471
6. Xu Y, Ishizuka T, Kurabayashi K, Komiyama M (2009) Consecutive formation of G-quadruplexes in human telomeric-overhang DNA: a protective capping structure for telomere ends. Angew Chem Int Ed 48:7833–7836
7. Xu Y, Ishizuka T, Yang J, Ito K, Katada H, Komiyama M, Hayashi T (2012) Oligonucleotide models of telomeric DNA and RNA form a hybrid G-quadruplex structure as a potential component of telomeres. J Biol Chem 287:41787–41796
8. Takahama K, Takada A, Tada S, Shimizu M, Sayama K, Kurokawa R, Oyoshi T (2013) Regulation of telomere length by G-quadruplex telomere DNA- and TERRA-binding protein TLS/FUS. Chem Biol 20:341–350

9. Wang C, Zhao L, Lu S (2015) Role of TERRA in the regulation of telomere length. Int J Biol Sci 11:316–323
10. Simonsson T, Pecinka P, Kubista M (1998) DNA tetraplex formation in the control region of c-myc. Nucleic Acids Res 26:1167–1172
11. Siddiqui-Jain A, Grand CL, Bearss DJ, Hurley LH (2002) Direct evidence for a G-quadruplex in a promoter region and its targeting with a small molecule to repress c-MYC transcription. Proc Natl Acad Sci U S A 99:11593–11598
12. Hershman SG, Chen Q, Lee JY, Kozak ML, Yue P, Wang LS, Johnson FB (2008) Genomic distribution and functional analyses of potential G-quadruplex-forming sequences in Saccharomyces cerevisiae. Nucleic Acids Res 36:144–156
13. Ito K, Go S, Komiyama M, Xu Y (2011) Inhibition of translation by small RNA-stabilized mRNA structures in human cells. J Am Chem Soc 133:19153–19159
14. Xu Y, Sugiyama H (2006) Formation of the G-quadruplex and i-motif structures in retinoblastoma susceptibility genes (Rb). Nucleic Acids Res 34:949–954
15. Paeschke K, Capra JA, Zakian VA (2011) DNA replication through G-quadruplex motifs is promoted by the Saccharomyces cerevisiae Pif1 DNA helicase. Cell 145:678–691
16. Besnard E, Babled A, Lapasset L, Milhavet O, Parrinello H, Dantec C, Marin JM, Lemaitre JM (2012) Unraveling cell type-specific and reprogrammable human replication origin signatures associated with G-quadruplex consensus motifs. Nat Struct Mol Biol 19:837–844
17. Vannier JB, Sandhu S, Petalcorin MI, Wu X, Nabi Z, Ding H, Boulton SJ (2013) RTEL1 is a replisome-associated helicase that promotes telomere and genome-wide replication. Science 342:239–242
18. Valton AL, Hassan-Zadeh V, Lema I, Boggetto N, Alberti P, Saintome C, Riou JF, Prioleau MN (2014) G4 motifs affect origin positioning and efficiency in two vertebrate replicators. EMBO J 33:732–746
19. Valton AL, Prioleau MN (2016) G-quadruplexes in DNA replication: a problem or a necessity? Trends Genet 32:697–706
20. Hurley LH (2002) DNA and its associated processes as targets for cancer therapy. Nat Rev Cancer 2:188–200
21. Neidle S, Parkinson G (2002) Telomere maintenance as a target for anticancer drug discovery. Nat Rev Drug Discov 1:383–393
22. De Cian A, Lacroix L, Douarre C, Temime-Smaali N, Trentesaux C, Riou JF, Mergny JL (2008) Targeting telomeres and telomerase. Biochimie 90:131–155
23. Balasubramanian S, Neidle S (2009) G-quadruplex nucleic acids as therapeutic targets. Curr Opin Chem Biol 13:345–353
24. Xu Y, Suzuki Y, Lonnberg T, Komiyama M (2009) Human telomeric DNA sequence-specific cleaving by G-quadruplex formation. J Am Chem Soc 131:2871–2874
25. Shinohara K, Sannohe Y, Kaieda S, Tanaka K, Osuga H, Tahara H, Xu Y, Kawase T, Bando T, Sugiyama H (2010) A chiral wedge molecule inhibits telomerase activity. J Am Chem Soc 132:3778–3782
26. Xu Y, Ito K, Suzuki Y, Komiyama M (2010) A 6-mer photocontrolled oligonucleotide as an effective telomerase inhibitor. J Am Chem Soc 132:631–637
27. Collie GW, Parkinson GN (2011) The application of DNA and RNA G-quadruplexes to therapeutic medicines. Chem Soc Rev 40:5867–5892
28. Zhao C, Wu L, Ren J, Xu Y, Qu X (2013) Targeting human telomeric higher-order DNA: dimeric G-quadruplex units serve as preferred binding site. J Am Chem Soc 135:18786–18789
29. Lin C, Yang D (2017) Human telomeric G-quadruplex structures and G-quadruplex-interactive compounds. Methods Mol Biol 1587:171–196
30. Neidle S (2017) Quadruplex nucleic acids as targets for anticancer therapeutics. Nat Rev Chem 1:0041
31. Wang Y, Patel DJ (1993) Solution structure of the human telomeric repeat d[AG3(T2AG3)3] G-tetraplex. Structure 1:263–282
32. Parkinson GN, Lee MP, Neidle S (2002) Crystal structure of parallel quadruplexes from human telomeric DNA. Nature 417:876–880
33. Ambrus A, Chen D, Dai J, Bialis T, Jones RA, Yang D (2006) Human telomeric sequence forms a hybrid-type intramolecular G-quadruplex structure with mixed parallel/antiparallel strands in potassium solution. Nucleic Acids Res 34:2723–2735
34. Luu KN, Phan AT, Kuryavyi V, Lacroix L, Patel DJ (2006) Structure of the human telomere in K+ solution: an intramolecular (3 + 1) G-quadruplex scaffold. J Am Chem Soc 128:9963–9970
35. 
Xu Y, Noguchi Y, Sugiyama H (2006) The new models of the human telomere d[AGGG (TTAGGG)3] in K+ solution. Bioorg Med Chem 14:5584–5591 36. Phan AT, Luu KN, Patel DJ (2006) Different loop arrangements of intramolecular human

Takumi Ishizuka et al.

telomeric (3 + 1) G-quadruplexes in K+ solution. Nucleic Acids Res 34:5715–5719
37. Phan AT, Kuryavyi V, Luu KN, Patel DJ (2007) Structure of two intramolecular G-quadruplexes formed by natural human telomere sequences in K+ solution. Nucleic Acids Res 35:6517–6525
38. Avino A, Fabrega C, Tintore M, Eritja R (2012) Thrombin binding aptamer, more than a simple aptamer: chemically modified derivatives and biomedical applications. Curr Pharm Des 18:2036–2047
39. Deng B, Lin Y, Wang C, Li F, Wang Z, Zhang H, Li XF, Le XC (2014) Aptamer binding assays for proteins: the thrombin example – a review. Anal Chim Acta 837:1–15
40. Esposito V, Scuotto M, Capuozzo A, Santamaria R, Varra M, Mayol L, Virgilio A, Galeone A (2014) A straightforward modification in the thrombin binding aptamer improving the stability, affinity to thrombin and nuclease resistance. Org Biomol Chem 12:8840–8843
41. Virgilio A, Petraccone L, Scuotto M, Vellecco V, Bucci M, Mayol L, Varra M, Esposito V, Galeone A (2014) 5-Hydroxymethyl-2′-deoxyuridine residues in the thrombin binding aptamer: investigating anticoagulant activity by making a tiny chemical modification. Chembiochem 15:2427–2434
42. Scuotto M, Rivieccio E, Varone A, Corda D, Bucci M, Vellecco V, Cirino G, Virgilio A, Esposito V, Galeone A, Borbone N, Varra M, Mayol L (2015) Site specific replacements of a single loop nucleoside with a dibenzyl linker may switch the activity of TBA from anticoagulant to antiproliferative. Nucleic Acids Res 43:7702–7716
43. Virgilio A, Petraccone L, Vellecco V, Bucci M, Varra M, Irace C, Santamaria R, Pepe A, Mayol L, Esposito V, Galeone A (2015) Site-specific replacement of the thymine methyl group by fluorine in thrombin binding aptamer significantly improves structural stability and anticoagulant activity. Nucleic Acids Res 43:10602–10611
44. Macaya RF, Schultze P, Smith FW, Roe JA, Feigon J (1993) Thrombin-binding DNA aptamer forms a unimolecular quadruplex structure in solution. Proc Natl Acad Sci U S A 90:3745–3749
45. Kelly JA, Feigon J, Yeates TO (1996) Reconciliation of the X-ray and NMR structures of the thrombin-binding aptamer d(GGTTGGTGTGGTTGG). J Mol Biol 256:417–422
46. Russo Krauss I, Merlino A, Giancola C, Randazzo A, Mazzarella L, Sica F (2011) Thrombin-aptamer recognition: a revealed ambiguity. Nucleic Acids Res 39:7858–7867
47. Phan AT, Patel DJ (2003) Two-repeat human telomeric d(TAGGGTTAGGGT) sequence forms interconverting parallel and antiparallel G-quadruplexes in solution: distinct topologies, thermodynamic properties, and folding/unfolding kinetics. J Am Chem Soc 125:15021–15027
48. Xu Y, Suzuki Y, Komiyama M (2009) Click chemistry for the identification of G-quadruplex structures: discovery of a DNA-RNA G-quadruplex. Angew Chem Int Ed 48:3281–3284
49. Xu Y, Suzuki Y, Ishizuka T, Xiao CD, Liu X, Hayashi T, Komiyama M (2014) Finding a human telomere DNA-RNA hybrid G-quadruplex formed by human telomeric 6-mer RNA and 16-mer DNA using click chemistry: a protective structure for telomere end. Bioorg Med Chem 22:4419–4421
50. Zhang N, Phan AT, Patel DJ (2005) (3 + 1) assembly of three human telomeric repeats into an asymmetric dimeric G-quadruplex. J Am Chem Soc 127:17277–17285
51. Zhou L, Rajabzadeh M, Traficante DD, Cho BP (1997) Conformational heterogeneity of arylamine-modified DNA: 19F NMR evidence. J Am Chem Soc 119:5384–5389
52. Hammann C, Norman DG, Lilley DM (2001) Dissection of the ion-induced folding of the hammerhead ribozyme using 19F NMR. Proc Natl Acad Sci U S A 98:5503–5508
53. Barhate NB, Barhate RN, Cekan P, Drobny G, Sigurdsson ST (2008) A nonafluoro nucleoside as a sensitive 19F NMR probe of nucleic acid conformation. Org Lett 10:2745–2747
54. Graber D, Moroder H, Micura R (2008) 19F NMR spectroscopy for the analysis of RNA secondary structure populations. J Am Chem Soc 130:17230–17231
55. Kiviniemi A, Virta P (2010) Characterization of RNA invasion by 19F NMR spectroscopy. J Am Chem Soc 132:8560–8562
56. Sakamoto T, Hayakawa H, Fujimoto K (2011) Development of a potassium ion sensor for 19F magnetic resonance chemical shift imaging based on fluorine-labeled thrombin aptamer. Chem Lett 40:720–721
57. Fauster K, Kreutz C, Micura R (2012) 2′-SCF3 uridine – a powerful label for probing structure and function of RNA by 19F NMR spectroscopy. Angew Chem Int Ed 51:13080–13084
58. Lombès T, Moumné R, Larue V, Prost E, Catala M, Lecourt T, Dardel F, Micouin L,

Tisné C (2012) Investigation of RNA-ligand interactions by 19F NMR spectroscopy using fluorinated probes. Angew Chem Int Ed 51:9530–9534
59. Chen H, Viel S, Ziarelli F, Peng L (2013) 19F NMR: a valuable tool for studying biological events. Chem Soc Rev 42:7971–7982
60. Tanabe K, Tsuda T, Ito T, Nishimoto S (2013) Probing DNA mismatched and bulged structures by using 19F NMR spectroscopy and oligodeoxynucleotides with an 19F-labeled nucleobase. Chem A Eur J 19:15133–15140
61. Zhao C, Devany M, Greenbaum NL (2014) Measurement of chemical exchange between RNA conformers by 19F NMR. Biochem Biophys Res Commun 453:692–695
62. Bao HL, Ishizuka T, Sakamoto T, Fujimoto K, Uechi T, Kenmochi N, Xu Y (2017) Characterization of human telomere RNA G-quadruplex structures in vitro and in living cells using 19F NMR spectroscopy. Nucleic Acids Res 45:5501–5511
63. Ishizuka T, Zhao PY, Bao HL, Xu Y (2017) A multi-functional guanine derivative for studying the DNA G-quadruplex structure. Analyst 142:4083–4088
64. Bao HL, Xu Y (2018) Investigation of higher-order RNA G-quadruplex structures in vitro and in living cells by 19F NMR spectroscopy. Nat Protoc 13:652–665
65. Ye Y, Liu X, Xu G, Liu M, Li C (2015) Direct observation of Ca2+-induced calmodulin conformational transitions in intact Xenopus laevis oocytes by 19F NMR spectroscopy. Angew Chem Int Ed 54:5328–5330
66. Bao HL, Ishizuka T, Iwanami A, Oyoshi T, Xu Y (2017) A simple and sensitive 19F NMR approach for studying the interaction of RNA G-quadruplex with ligand molecule and protein. Chem Select 2:4170–4175
67. Ishizuka T, Yamashita A, Asada Y, Xu Y (2017) Studying DNA G-quadruplex aptamer by 19F NMR. ACS Omega 2:8843–8848
68. Bao HL, Xu Y (2019) Hybrid-type and two-tetrad antiparallel telomere DNA G-quadruplex structures in living human cells. Nucleic Acids Res 47:4940–4947

INDEX A Absolute binding free energy ............................. 178–183, 187, 190–193 Aggregation .......................................................30, 52, 54, 59, 60, 75, 152, 171, 272, 348, 349, 365 Amplex Red .......................................................... 361, 367 Analytical ultracentrifugation (AUC) ............. 10, 87–101 Aptamer ...........................................................34, 45, 153, 290, 365, 408, 420–422 Atomic-force microscopy (AFM) ......................... 11, 276, 283, 299–308, 407 2,2′-Azino-bis(3-ethylbenzothiazoline-6-sulfonic acid) (ABTS) ............................................. 358, 359, 361, 362, 365, 367

B Biosensors ............................................................. 3, 63–79

C Cancer...................................................................... 4, 8, 9, 108, 117, 157, 158, 224, 347, 348, 352 Cations.............................................................2–4, 32, 38, 50, 55, 64, 65, 106–108, 112, 113, 115, 117, 127, 157, 201, 204, 215, 228, 230, 275, 277, 287, 309, 323, 324, 326, 335, 347, 359–360, 370 CD spectroscopy ...............................................26, 27, 29, 32, 224, 262, 271, 272, 303, 421 Chromatin ............................................................... 8, 233, 238, 241, 323, 370, 371 Chromatin immunoprecipitation (ChIP) assays ...................................................11, 233–242 Circular dichroism (CD) .................................. 10, 25–42, 93–95, 99, 120, 121, 145, 224, 258–260, 262, 263, 269, 271, 272, 276, 303, 330, 336, 338–339, 343, 419, 421, 426 c-MYC G-quadruplex ......................................... 178, 181, 187, 191–193, 280 Confocal microscopy .......................... 385–388, 390–393 Crystallization ..................................................... 133, 138, 140, 142, 144–147, 152

D Data collection .........................................................76–77, 133, 136, 137, 139, 141, 144, 148, 150

Deoxyribonucleic acid (DNA) ........................... 1, 11, 26, 45, 63, 88, 106, 117, 132, 158, 177, 202, 223, 233, 243, 258, 275, 299, 323, 333, 347, 357, 369, 370, 398, 407 Differential scanning calorimetry (DSC)...................... 10, 117–129, 276 Dimethyl sulfate (DMS) ...............................11, 201–220, 244–246, 248, 249, 252, 254, 276, 289 DNA conformational transitions......................... 420, 427 DNA damage................................................................. 7–9 DNA origami................................................289, 299–308 DNA polymerase ...........................................11, 223–230, 235, 236, 244, 246, 251, 257–273, 277, 334, 339, 340, 372, 379 DNA quadruplexes .............................................. 370, 380 DNA replication .........................................................2, 11, 223, 224, 309, 388, 391, 407 DNAzymes .............................................. 11, 45, 357–367 Double decoupling method (DDM) ................. 178–181, 188–193, 195, 196 Drug targets ........................................................... 45, 159

E Electrophoresis ...................................................... 29, 202, 204, 206, 209, 211, 216–219, 225–227, 229, 237, 240, 242, 244–248, 250–254, 261, 266, 273, 301, 330, 339, 340, 362, 363, 372, 374, 379 Electrophoretic mobility shift assay (EMSA) ...................................... 11, 201–220, 276 Electrospray ionization (ESI) ............................. 105–111, 113, 115, 411, 413–418, 429 Extension ............................................................... 89, 202, 224, 228, 229, 240, 250–252, 276, 277, 280, 334, 339, 349, 350, 353, 354, 388 Extraction ................................................... 152, 173, 183, 189, 239, 249, 349, 350, 352–354, 372, 377, 379

F Fluorescence probe .............................383–393, 424, 427 Folding ........................................2, 3, 26, 35, 64, 65, 90, 91, 94, 101, 106, 115, 117–119, 121, 123, 125–128, 143, 146, 158, 160, 162, 163, 165, 166, 169, 170, 208, 213, 215, 252, 280, 283–286, 288–290, 292, 294, 310, 315, 316, 319, 324, 330, 341, 343, 359, 364, 366, 403, 421

Danzhou Yang and Clement Lin (eds.), G-Quadruplex Nucleic Acids: Methods and Protocols, Methods in Molecular Biology, vol. 2035, https://doi.org/10.1007/978-1-4939-9666-7, © Springer Science+Business Media, LLC, part of Springer Nature 2019

Footprinting ...............................................................8, 11, 201–220, 244–246, 248, 252–253, 276, 289, 344, 371 Förster resonance energy transfer (FRET) .......... 11, 224, 276–278, 292, 309–321, 323–330

G γPNA .......................................... 334–337, 339–342, 344 Gel electrophoresis..................................... 237, 240, 244, 245, 247, 248, 251–254, 330, 339, 362, 363, 372, 374, 379 Gold nanoparticles (AuNPs) ............................... 347–356 Guanine-quadruplexes (G4s) ....................................1, 42, 45, 63, 87, 117, 132, 165, 223, 240, 257, 268, 309, 323, 333, 365, 370, 383 detection .................................................................. 201 DNA ...............................................65, 263, 264, 269, 272, 336, 337, 339, 370–372, 380, 384, 392 interactive agents............................................ 233–242 interactive compounds............................................... 9, 157–175, 223–230, 234, 325, 327, 330 ligands ............................................................ 7–10, 45, 46, 108, 113, 147, 152, 177–193, 252, 257, 292, 294, 323–330, 355, 387 specific antibodies .................................................7, 11 structures ....................................................... 3–11, 64, 66, 78, 87–101, 106, 110, 113–115, 121, 131–153, 157–175, 183, 204, 214, 225, 234, 243, 248, 275–277, 279, 280, 282, 284, 286–289, 292, 294, 309, 313, 317, 347, 348, 359, 360, 364, 370, 397–404, 407–430

H Heme .................................................................... 357–367 Hemin.................................................. 357–362, 364–366 Herpes simplex virus-1 (HSV-1).................................384, 385, 388, 390, 391 High-speed atomic force microscopy (HS-AFM) ................................................ 299–308 High-throughput genomics ..................... 8, 11, 341, 378 HRP ............................................................................... 357 Human telomeres.........................................................4, 8, 27, 33–35, 66, 79, 91, 94, 123, 177, 215, 259, 264, 276, 285–287, 289, 294, 347, 407, 426, 427 Hybridization .................... 290, 334, 336–339, 342–344

I Immunofluorescence staining ...................................... 385 i-motif .................................................................. 146, 147, 258, 259, 264, 265, 268, 269, 271, 272, 299–308 In-cell NMR .......................................... 11, 397–404, 428 Indoloquinoline ........................................................47, 51 Instability ........................................................................... 8

Interpretation of CD spectra ............................. 28, 30–35 Isothermal titration calorimetry (ITC)......................... 10, 45–60, 276, 330

K Kinetic analysis ...................................... 80, 120, 266–268 Kinetics ..............................................................10, 58, 65, 66, 69–71, 74, 77, 78, 80, 82, 118, 119, 127, 159, 170, 267, 277, 280, 282–289, 292, 294, 334, 339, 343, 344

L Ligand binding..................................................10, 47, 54, 55, 58, 67, 70, 74, 81, 88, 97, 113, 153, 178, 180, 183, 187, 192, 204, 290, 301 Ligands ................................................................ 6, 45, 65, 87, 110, 125, 133, 158, 177, 204, 223, 244, 257, 287, 301, 324, 348, 370, 387, 398 Luciferase reporter assay ............................................... 340

M Macromolecular crystallography ......................... 144, 153 Magnetic tweezers........................ 11, 280–282, 285, 286 Magneto-optical tweezers........................... 276, 282, 288 Mass spectrometry (MS).............. 89, 105–115, 142, 419 Mass transfer................................................ 71, 78, 80, 81 Molecular crowding .....................................................145, 257, 258, 262, 265, 270, 300, 398 Molecular dynamics (MD) simulations ........10, 177–193 Molecular weight .................................................... 47, 68, 81, 87–91, 94, 99, 101, 112, 113, 132, 145, 146, 148, 152, 269, 270, 272, 374, 379, 398

N Native SAD/MAD phasing.......................................... 149 19F NMR spectroscopy....................................... 407–430 Non-B DNA........................................277, 282, 369, 371 Noncovalent interaction ............................. 105, 106, 110 NSC 82892 .......................................................... 234, 235 Nuclear magnetic resonance (NMR) spectroscopy................................................. 10, 11, 159–161, 163, 170, 397–404, 407–430 Nucleoli ................................................................ 386–389

O Oncogene promoters ............................4, 8, 32, 107, 108 Optical tweezers ..................................277–286, 288, 289

P Peptide nucleic acid (PNA) ...........................11, 333–342 Peroxidase activity ............................... 357–359, 365–367 Phase diagrams ........................... 118–120, 123, 124, 267

POT1 ................................. 278, 310, 311, 314, 316–318 Potassium permanganate (KMnO4) ...........................371, 372, 374, 375, 377, 379–381, 429 Potential of mean force (PMF) .......................... 178–180, 183, 185–189, 192–195 Primer extension ................................................. 224, 228, 229, 250–252, 334, 339 Promoter .......................................................... 3, 27, 47, 64, 107, 170, 177, 201, 233, 244, 276, 300, 337, 370 Protein .......................................................... 1, 30, 66, 87, 106, 141, 159, 202, 223, 233, 272, 278, 309, 333, 387, 397, 422

Q Quindoline ......................................................8, 100, 171, 172, 178, 183, 185–188, 191–193

R Replication.................................................... 4, 6, 11, 224, 243, 257–260, 265–269, 272, 273, 276, 277, 279, 284, 286, 288, 294, 323, 388, 391, 407 Ribonucleic acid (RNA) ............................................1, 30, 132, 214, 234, 243, 275, 323, 333, 348, 357, 381, 393, 399, 408

S Sample homogeneity....................................................... 38 Single-molecule ................................................... 275–294, 300, 304, 309–321 single-molecule fluorescence resonance energy transfer (smFRET) .................................. 276–279, 284, 285, 309–321 single-molecule methods ......................................... 11, 276–284, 294, 310 single-molecule observation .......................... 301, 304 Small molecule-nucleic acid interactions .......... 10, 63–79 Stability .................................................................. 2, 8, 10, 106, 113, 117, 118, 120–129, 142, 216, 217, 257, 258, 267, 271–273, 277, 280, 283, 284, 287–290, 292, 316, 335, 347, 359, 365, 370, 403, 425, 426 Stabilization .......................................................4, 8, 9, 64, 65, 77, 133, 202, 223, 290, 359, 426

Steady state analysis...................................................77, 82 Stoichiometry .......................................................... 47, 48, 54, 57, 65, 70, 76, 78, 87, 88, 95, 99, 100, 108, 110, 112, 113, 170, 171, 269, 427 Structure solution ............. 133, 136, 141, 142, 149–151 Surface plasmon resonance (SPR)................................. 10, 63–79, 178, 187, 191, 330, 339

T Telomerase............................................................... 4, 8, 9, 64, 158, 276, 289, 309, 347–356 Telomere............................................................ 4, 7–9, 27, 33, 35, 65, 91, 125, 202, 259, 264, 276, 278, 283–285, 292, 310, 347, 354, 407 Thermal difference spectroscopy (TDS)....................... 37, 335, 338–339 Thermodynamics............................................... 10, 45–49, 54, 65, 66, 82, 95, 106, 114, 117–120, 122, 123, 125, 126, 180–183, 188, 189, 192, 263, 265, 272, 283–285, 287–290, 292, 334, 339, 343, 344, 408, 420, 422, 424 Transcription ........................................................... 2, 5, 6, 8, 9, 11, 178, 202, 223, 233–245, 247, 250, 252, 253, 257, 276, 277, 280, 282, 284, 286, 288, 291, 294, 300, 301, 323, 333, 335, 339, 381, 407 Translation............................................................... 2, 6, 9, 187, 257, 324, 333–335, 339–341, 344

U 5′- and 3′-Untranslated region (UTR) ....................4, 6, 9 UV melting.......................................................... 258–260, 262–265, 272, 335–338

V Vascular endothelial growth factor (VEGF) ...............5, 6, 234–236, 240

X Xenopus laevis oocytes ..................................11, 397–404, 408, 410, 425 X-ray diffraction .................................................. 132, 133, 139, 140, 143, 149