Nucleic Acid Detection and Structural Investigations: Methods and Protocols [1st ed. 2020] 978-1-0716-0137-2, 978-1-0716-0138-9

This book volume provides in depth coverage of the nucleic acid field and aims to represent a broad diversity of the met


English Pages XI, 270 [272] Year 2020




Author / Uploaded
Kira Astakhova
Syeda Atia Bukhari

Table of contents :
Front Matter ....Pages i-xi
Front Matter ....Pages 1-1
Optomagnetic Detection of Rolling Circle Amplification Products (Gabriel Antonio S. Minero, Valentina Cangiano, Jeppe Fock, Francesca Garbarino, Mikkel F. Hansen)....Pages 3-15
Subtyping of Swine Influenza Using a High-Throughput Real-Time PCR Platform and a Single Microfluidics Device (Helene Larsen, Nicole B. Goecke, Charlotte K. Hjulsager)....Pages 17-25
RT-qPCR Detection of Low-Copy HIV RNA with Yin-Yang Probes (Dmitry E. Kireev, Valentina M. Farzan, German A. Shipulin, Vladimir A. Korshun, Timofei S. Zatsepin)....Pages 27-35
Solid-Phase Hybridization Assay for Detection of Mutated Cancer DNA by Fluorescence (Maria Taskova, Kira Astakhova)....Pages 37-44
5′-Monopyrene and 5′-Bispyrene 2′-O-methyl RNA Probes for Detection of RNA Mismatches (D. S. Novopashina, O. A. Semikolenova, A. G. Venyaminova)....Pages 45-56
Combined Assay for Detecting Autoantibodies to Nucleic Acids and Apolipoprotein H in Patients with Systemic Lupus Erythematosus (Sangita Khatri, Elizabeth D. Mellins, Kathryn S. Torok, Syeda Atia Bukhari, Kira Astakhova)....Pages 57-71
Electrochemical Detection of dsDNA-Specific Antibodies (Pablo Fagúndez, Gustavo Brañas, Justo Laíz, Juan Pablo Tosar)....Pages 73-83
Front Matter ....Pages 85-85
Sequence-Defined DNA Amphiphiles for Drug Delivery: Synthesis and Self-Assembly (Michael D. Dore, Hanadi F. Sleiman)....Pages 87-100
DNA-Mediated Liposome Fusion Observed by Fluorescence Spectrometry (Philipp M. G. Löffler, Oliver Ries, Stefan Vogel)....Pages 101-118
Advanced Fluorescence Imaging to Distinguish Between Intracellular Fractions of Antisense Oligonucleotides (M. Leontien van der Bent, Derick G. Wansink, Roland Brock)....Pages 119-138
Methodology for Subcellular Fractionation and MicroRNA Examination of Mitochondria, Mitochondria Associated ER Membrane (MAM), ER, and Cytosol from Human Brain (Paresh Prajapati, Wang-Xia Wang, Peter T. Nelson, Joe E. Springer)....Pages 139-154
Front Matter ....Pages 155-155
Skeletal Muscle Injury by Electroporation: A Model to Study Degeneration/Regeneration Pathways in Muscle (Camila F. Almeida, Mariz Vainzof)....Pages 157-169
Isolation and Characterization of Muscle-Derived Stem Cells from Dystrophic Mouse Models (Paula C. G. Onofre-Oliveira, Mariz Vainzof)....Pages 171-180
Universal Library Preparation Protocol for Efficient High-Throughput Sequencing of Double-Stranded RNA Viruses (Anna S. Dolgova, Marina V. Safonova, Vladimir G. Dedkov)....Pages 181-188
Quantitation of Molecular Pathway Activation Using RNA Sequencing Data (Nicolas Borisov, Maxim Sorokin, Andrew Garazha, Anton Buzdin)....Pages 189-206
Molecular Pathway Analysis of Mutation Data for Biomarkers Discovery and Scoring of Target Cancer Drugs (Marianna Zolotovskaia, Maxim Sorokin, Andrew Garazha, Nikolay Borisov, Anton Buzdin)....Pages 207-234
Oncobox Method for Scoring Efficiencies of Anticancer Drugs Based on Gene Expression Data (Victor Tkachev, Maxim Sorokin, Andrew Garazha, Nicolas Borisov, Anton Buzdin)....Pages 235-255
Global Characterization of Circulating Nucleic Acids (Marina Dunaeva, Ger J. M. Pruijn)....Pages 257-268
Back Matter ....Pages 269-270


Methods in Molecular Biology 2063

Kira Astakhova Syeda Atia Bukhari Editors

Nucleic Acid Detection and Structural Investigations Methods and Protocols

METHODS IN MOLECULAR BIOLOGY

Series Editor John M. Walker School of Life and Medical Sciences University of Hertfordshire Hatfield, Hertfordshire, UK

For further volumes: http://www.springer.com/series/7651

For over 35 years, biological scientists have come to rely on the research protocols and methodologies in the critically acclaimed Methods in Molecular Biology series. The series was the first to introduce the step-by-step protocols approach that has become the standard in all biomedical protocol publishing. Each protocol is provided in readily-reproducible step-by-step fashion, opening with an introductory overview, a list of the materials and reagents needed to complete the experiment, and followed by a detailed procedure that is supported with a helpful notes section offering tips and tricks of the trade as well as troubleshooting advice. These hallmark features were introduced by series editor Dr. John Walker and constitute the key ingredient in each and every volume of the Methods in Molecular Biology series. Tested and trusted, comprehensive and reliable, all protocols from the series are indexed in PubMed.

Nucleic Acid Detection and Structural Investigations Methods and Protocols

Edited by

Kira Astakhova and Syeda Atia Bukhari Department of Chemistry, Technical University of Denmark, Kongens Lyngby, Denmark

Editors Kira Astakhova Department of Chemistry Technical University of Denmark Kongens Lyngby, Denmark

Syeda Atia Bukhari Department of Chemistry Technical University of Denmark Kongens Lyngby, Denmark

ISSN 1064-3745 ISSN 1940-6029 (electronic)
Methods in Molecular Biology
ISBN 978-1-0716-0137-2 ISBN 978-1-0716-0138-9 (eBook)
https://doi.org/10.1007/978-1-0716-0138-9

© Springer Science+Business Media, LLC, part of Springer Nature 2020

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Humana imprint is published by the registered company Springer Science+Business Media, LLC, part of Springer Nature. The registered company address is: 233 Spring Street, New York, NY 10013, U.S.A.

Preface

Nucleic acids are the key biomolecules in living organisms and thus represent the origin of life. Current nucleic acid analyses can be divided into two major groups: technically demanding, biological analyses and simpler, point-of-care assays. Both methodologies contribute to our growing understanding of nucleic acid regulation and function that opens up exciting opportunities for the treatment of human diseases. However, the delivery of oligonucleotides is a major challenge for therapeutic applications that still needs to be overcome.

From a chemical standpoint, a deoxyribonucleic acid (DNA) is a macromolecule built of four nucleotide units abbreviated as A, T, C, and G. Driven by hydrogen bonding, stacking interactions, and hydrophobic effects, the nucleotides that form a DNA chain specifically recognize and bind to a complementary DNA sequence following Watson-Crick base pairing rules, i.e., A binds T and C binds G. Similarly, for ribonucleic acid (RNA), a sequence of monomolecular units rA, rU, rC, and rG is organized in the chain, which can recognize and bind its complementary DNA or RNA. An oligonucleotide probe is a short, synthetic DNA/RNA strand, typically less than 20 nucleotides in length, which can be synthesized chemically and bind to a desired region within a DNA or RNA biopolymer. Oligonucleotides are designed using Watson-Crick rules, along with considerations for optimal target affinity and specificity. Attaching a dye to the oligonucleotide probe creates a detection probe, whereas the delivery of single- or double-stranded DNA or RNA probes can serve as a therapy if the target sequence is affected in the cell.

This book surveys the most recent developments in oligonucleotide probe technology for basic research and human disease applications. The collection covers a broad range of topics, with the main themes of point-of-care nucleic acid diagnostics, prognostic DNA and RNA markers, delivery of gene therapeutics to cells, and basic biology of nucleic acid interactions. In particular, we describe the utility of novel nanotechnologies for targeted detection and delivery of DNA (Löffler et al.; Dore et al.), advanced genotyping probes (Taskova et al.; Kireev et al.; Novopashina et al.), oligonucleotides as antigens in autoimmune antibody targeting (Khatri et al.; Fagúndez et al.), and applying RNA sequencing in biomarker discovery and drug response monitoring (Borisov et al.; Zolotovskaia et al.; Tkachev et al.).

Ensuring that oligonucleotide therapeutic candidates effectively work inside cells has been debated over decades. Prajapati et al. and van der Bent et al. describe how fluorescence imaging can approach this task with previously unachievable precision.

Regeneration and stem-cell-based therapies have generated growing attention over the past years. Almeida et al. and Onofre-Oliveira et al. present their exciting studies in this area, in chapters dedicated to isolation of muscle-derived stem cells and studies of regeneration pathways by electroporation that causes muscle injury. Both works are performed in a suitable mouse model, but can be adapted further; they provide new ideas and methodologies for work on the relevant problems of human disease management.

This book has a wide coverage of the field and enough depth to be of practical use to professionals. Thus, it aims to represent a broad diversity of the methodologies and practical coverage of a wide range of nucleic-acid-related topics within the fields of molecular biology and biomedicine. The presented methodologies combine cutting-edge innovation with sound theory and practical applications in life sciences. Clearly, not all oligonucleotide applications are covered in this book. However, we hope that these topics will attract the attention of a broad readership and act as a source for useful practical information as well as an inspiration to explore modern oligonucleotide technology in a different context.

Preferably, the reader of this book should have a background in biology and be familiar with the basic properties of oligonucleotides. Nevertheless, this book could be used by a wider readership, including graduate students, researchers, instructors, and practitioners in molecular biology and biomedicine. Methodologies within the nucleic acids field develop at an extreme speed, and it should be noted that the procedures described in the book might rapidly evolve as well. It is therefore encouraged to stay current and use the presented methodologies as an updated version for 2019, with necessary revisions to be done approximately every 7 years.

This book is organized into three parts: In Vitro Detection, Nanotechnology and Imaging, and Biomedical Applications and Big Data. The 18 chapters were carefully selected to provide a broad scope of methods by leading researchers in each field. We aimed to select chapters with minimal overlap between individual methods, but with enough coverage of the four major themes. We hope that the reader will find the resulting material useful and inspiring for taking nucleic acid technologies to the next level.

Kongens Lyngby, Denmark
Kira Astakhova
Syeda Atia Bukhari

Contents

Preface
Contributors

PART I IN VITRO DETECTION

1 Optomagnetic Detection of Rolling Circle Amplification Products
  Gabriel Antonio S. Minero, Valentina Cangiano, Jeppe Fock, Francesca Garbarino, and Mikkel F. Hansen
2 Subtyping of Swine Influenza Using a High-Throughput Real-Time PCR Platform and a Single Microfluidics Device
  Helene Larsen, Nicole B. Goecke, and Charlotte K. Hjulsager
3 RT-qPCR Detection of Low-Copy HIV RNA with Yin-Yang Probes
  Dmitry E. Kireev, Valentina M. Farzan, German A. Shipulin, Vladimir A. Korshun, and Timofei S. Zatsepin
4 Solid-Phase Hybridization Assay for Detection of Mutated Cancer DNA by Fluorescence
  Maria Taskova and Kira Astakhova
5 5′-Monopyrene and 5′-Bispyrene 2′-O-methyl RNA Probes for Detection of RNA Mismatches
  D. S. Novopashina, O. A. Semikolenova, and A. G. Venyaminova
6 Combined Assay for Detecting Autoantibodies to Nucleic Acids and Apolipoprotein H in Patients with Systemic Lupus Erythematosus
  Sangita Khatri, Elizabeth D. Mellins, Kathryn S. Torok, Syeda Atia Bukhari, and Kira Astakhova
7 Electrochemical Detection of dsDNA-Specific Antibodies
  Pablo Fagúndez, Gustavo Brañas, Justo Laíz, and Juan Pablo Tosar

PART II NANOTECHNOLOGY AND IMAGING

8 Sequence-Defined DNA Amphiphiles for Drug Delivery: Synthesis and Self-Assembly
  Michael D. Dore and Hanadi F. Sleiman
9 DNA-Mediated Liposome Fusion Observed by Fluorescence Spectrometry
  Philipp M. G. Löffler, Oliver Ries, and Stefan Vogel
10 Advanced Fluorescence Imaging to Distinguish Between Intracellular Fractions of Antisense Oligonucleotides
  M. Leontien van der Bent, Derick G. Wansink, and Roland Brock
11 Methodology for Subcellular Fractionation and MicroRNA Examination of Mitochondria, Mitochondria Associated ER Membrane (MAM), ER, and Cytosol from Human Brain
  Paresh Prajapati, Wang-Xia Wang, Peter T. Nelson, and Joe E. Springer

PART III BIOMEDICAL APPLICATIONS AND BIG DATA

12 Skeletal Muscle Injury by Electroporation: A Model to Study Degeneration/Regeneration Pathways in Muscle
  Camila F. Almeida and Mariz Vainzof
13 Isolation and Characterization of Muscle-Derived Stem Cells from Dystrophic Mouse Models
  Paula C. G. Onofre-Oliveira and Mariz Vainzof
14 Universal Library Preparation Protocol for Efficient High-Throughput Sequencing of Double-Stranded RNA Viruses
  Anna S. Dolgova, Marina V. Safonova, and Vladimir G. Dedkov
15 Quantitation of Molecular Pathway Activation Using RNA Sequencing Data
  Nicolas Borisov, Maxim Sorokin, Andrew Garazha, and Anton Buzdin
16 Molecular Pathway Analysis of Mutation Data for Biomarkers Discovery and Scoring of Target Cancer Drugs
  Marianna Zolotovskaia, Maxim Sorokin, Andrew Garazha, Nikolay Borisov, and Anton Buzdin
17 Oncobox Method for Scoring Efficiencies of Anticancer Drugs Based on Gene Expression Data
  Victor Tkachev, Maxim Sorokin, Andrew Garazha, Nicolas Borisov, and Anton Buzdin
18 Global Characterization of Circulating Nucleic Acids
  Marina Dunaeva and Ger J. M. Pruijn

Index

Contributors

CAMILA F. ALMEIDA • Laboratory of Muscle Proteins and Comparative Histopathology, Human Genome and Stem Cell Research Center, Biosciences Institute, University of São Paulo, São Paulo, Brazil
KIRA ASTAKHOVA • Department of Chemistry, Technical University of Denmark, Kongens Lyngby, Denmark
NICOLAS BORISOV • Laboratory of Clinical Bioinformatics, I.M. Sechenov First Moscow State Medical University, Moscow, Russia; Omicsway Corp., Walnut, CA, USA
NIKOLAY BORISOV • Omicsway Corp., Walnut, CA, USA; Laboratory of Clinical Bioinformatics, I.M. Sechenov First Moscow State Medical University, Moscow, Russia
GUSTAVO BRAÑAS • Analytical Biochemistry Unit, Nuclear Research Center, Faculty of Science, Universidad de la República, Montevideo, Uruguay
ROLAND BROCK • Department of Biochemistry, Radboud Institute for Molecular Life Sciences (RIMLS), Radboud University Medical Center, Nijmegen, The Netherlands
SYEDA ATIA BUKHARI • Department of Chemistry, Technical University of Denmark, Kongens Lyngby, Denmark
ANTON BUZDIN • Laboratory of Clinical Bioinformatics, I.M. Sechenov First Moscow State Medical University, Moscow, Russia; Omicsway Corp., Walnut, CA, USA; Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Moscow, Russia
VALENTINA CANGIANO • Department of Health Technology, DTU Health Tech, Technical University of Denmark, Kongens Lyngby, Denmark
VLADIMIR G. DEDKOV • Saint-Petersburg Pasteur Institute, Center, Federal Service on Consumers’ Rights Protection and Human Well-Being Surveillance, Saint-Petersburg, Russia; Martsinovsky Institute of Medical Parasitology, Tropical and Vector Borne Diseases, Sechenov First Moscow State Medical University, Moscow, Russia
ANNA S. DOLGOVA • Saint-Petersburg Pasteur Institute, Center, Federal Service on Consumers’ Rights Protection and Human Well-Being Surveillance, Saint-Petersburg, Russia
MICHAEL D. DORE • Department of Chemistry, McGill University, Montreal, QC, Canada
MARINA DUNAEVA • Department of Biomolecular Chemistry, Institute for Molecules and Materials, Radboud University Nijmegen, Nijmegen, The Netherlands
PABLO FAGÚNDEZ • Analytical Biochemistry Unit, Nuclear Research Center, Faculty of Science, Universidad de la República, Montevideo, Uruguay
VALENTINA M. FARZAN • Skolkovo Institute of Science and Technology, Moscow, Russia
JEPPE FOCK • Department of Health Technology, DTU Health Tech, Technical University of Denmark, Kongens Lyngby, Denmark; BluSense Diagnostics Aps, Copenhagen, Denmark
ANDREW GARAZHA • Omicsway Corp., Walnut, CA, USA
FRANCESCA GARBARINO • Department of Health Technology, DTU Health Tech, Technical University of Denmark, Kongens Lyngby, Denmark
NICOLE B. GOECKE • National Veterinary Institute, Technical University of Denmark, Kongens Lyngby, Denmark
MIKKEL F. HANSEN • Department of Health Technology, DTU Health Tech, Technical University of Denmark, Kongens Lyngby, Denmark
CHARLOTTE K. HJULSAGER • Statens Serum Institute, Copenhagen, Denmark
SANGITA KHATRI • Department of Chemistry, Technical University of Denmark, Kongens Lyngby, Denmark
DMITRY E. KIREEV • Central Research Institute of Epidemiology, Moscow, Russia
VLADIMIR A. KORSHUN • Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Moscow, Russia; Gause Institute of New Antibiotics, Moscow, Russia; Department of Biology and Biotechnology, National Research University Higher School of Economics, Moscow, Russia
JUSTO LAÍZ • Analytical Biochemistry Unit, Nuclear Research Center, Faculty of Science, Universidad de la República, Montevideo, Uruguay
HELENE LARSEN • National Veterinary Institute, Technical University of Denmark, Kongens Lyngby, Denmark
PHILIPP M. G. LÖFFLER • Department of Physics, Chemistry and Pharmacy, University of Southern Denmark, Odense M, Denmark
ELIZABETH D. MELLINS • Program in Immunology, Department of Pediatrics, Stanford University School of Medicine, Stanford, CA, USA
GABRIEL ANTONIO S. MINERO • Department of Health Technology, DTU Health Tech, Technical University of Denmark, Kongens Lyngby, Denmark
PETER T. NELSON • Sanders-Brown Center on Aging, University of Kentucky, Lexington, KY, USA; Department of Pathology and Laboratory Medicine, University of Kentucky, Lexington, KY, USA
D. S. NOVOPASHINA • Institute of Chemical Biology and Fundamental Medicine SB RAS, Novosibirsk, Russia; Novosibirsk State University, Novosibirsk, Russia
PAULA C. G. ONOFRE-OLIVEIRA • Laboratory of Muscle Proteins and Comparative Histopathology, Human Genome and Stem Cell Research Center, Biosciences Institute, University of São Paulo, São Paulo, Brazil
PARESH PRAJAPATI • Spinal Cord and Brain Injury Research Center, University of Kentucky, Lexington, KY, USA; Department of Neuroscience, University of Kentucky, Lexington, KY, USA
GER J. M. PRUIJN • Department of Biomolecular Chemistry, Institute for Molecules and Materials, Radboud University Nijmegen, Nijmegen, The Netherlands
OLIVER RIES • Department of Physics, Chemistry and Pharmacy, University of Southern Denmark, Odense M, Denmark
MARINA V. SAFONOVA • Plague Control Center, Federal Service on Consumers’ Rights Protection and Human Well-Being Surveillance, Moscow, Russia
O. A. SEMIKOLENOVA • Novosibirsk State University, Novosibirsk, Russia
GERMAN A. SHIPULIN • Federal State Budgetary Institution “Center for Strategic Planning and Management of Biomedical Health Risks” of the Ministry of Health, Moscow, Russia
HANADI F. SLEIMAN • Department of Chemistry, McGill University, Montreal, QC, Canada
MAXIM SOROKIN • Laboratory of Clinical Bioinformatics, I.M. Sechenov First Moscow State Medical University, Moscow, Russia; Omicsway Corp., Walnut, CA, USA; Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Moscow, Russia
JOE E. SPRINGER • Spinal Cord and Brain Injury Research Center, University of Kentucky, Lexington, KY, USA; Department of Neuroscience, University of Kentucky, Lexington, KY, USA
MARIA TASKOVA • Department of Chemistry, Technical University of Denmark, Kongens Lyngby, Denmark
VICTOR TKACHEV • Omicsway Corp., Walnut, CA, USA
KATHRYN S. TOROK • Division of Rheumatology, Department of Pediatrics, Children’s Hospital of Pittsburgh, University of Pittsburgh, Pittsburgh, PA, USA
JUAN PABLO TOSAR • Analytical Biochemistry Unit, Nuclear Research Center, Faculty of Science, Universidad de la República, Montevideo, Uruguay; Functional Genomics Laboratory, Institut Pasteur de Montevideo, Montevideo, Uruguay
MARIZ VAINZOF • Laboratory of Muscle Proteins and Comparative Histopathology, Human Genome and Stem Cell Research Center, Biosciences Institute, University of São Paulo, São Paulo, Brazil
M. LEONTIEN VAN DER BENT • Department of Biochemistry, Radboud Institute for Molecular Life Sciences (RIMLS), Radboud University Medical Center, Nijmegen, The Netherlands; Department of Cell Biology, Radboud Institute for Molecular Life Sciences (RIMLS), Radboud University Medical Center, Nijmegen, The Netherlands
A. G. VENYAMINOVA • Institute of Chemical Biology and Fundamental Medicine SB RAS, Novosibirsk, Russia
STEFAN VOGEL • Department of Physics, Chemistry and Pharmacy, University of Southern Denmark, Odense M, Denmark
WANG-XIA WANG • Spinal Cord and Brain Injury Research Center, University of Kentucky, Lexington, KY, USA; Sanders-Brown Center on Aging, University of Kentucky, Lexington, KY, USA; Department of Pathology and Laboratory Medicine, University of Kentucky, Lexington, KY, USA
DERICK G. WANSINK • Department of Cell Biology, Radboud Institute for Molecular Life Sciences (RIMLS), Radboud University Medical Center, Nijmegen, The Netherlands
TIMOFEI S. ZATSEPIN • Skolkovo Institute of Science and Technology, Moscow, Russia; Department of Chemistry, Lomonosov Moscow State University, Moscow, Russia
MARIANNA ZOLOTOVSKAIA • Omicsway Corp., Walnut, CA, USA; Department of Oncology, Hematology and Radiotherapy of Pediatric Faculty, Pirogov Russian National Research Medical University, Moscow, Russia

Part I In Vitro Detection

Chapter 1

Optomagnetic Detection of Rolling Circle Amplification Products

Gabriel Antonio S. Minero, Valentina Cangiano, Jeppe Fock, Francesca Garbarino, and Mikkel F. Hansen

Abstract

Rolling circle amplification (RCA) of a synthetic nucleic acid target is detected using magnetic nanoparticles (MNPs) combined with an optomagnetic (OM) readout. Two RCA assays are developed with on-chip detection of rolling circle products (RCPs) either at end-point where MNPs are mixed with the sample after completion of RCA or in real time where MNPs are mixed with the sample during RCA. The plastic chip acts as a cuvette, which is positioned in a setup integrated with temperature control and simultaneous detection of four parallel DNA hybridization reactions between functionalized MNPs and products of DNA amplification. The OM technique probes the small-angle rotation of MNPs bearing oligonucleotide probes complementary to the repeated nucleotide sequence of the RCPs. This rotation is restricted when MNPs bind to RCPs, which can be observed as a turn-off of the signal from MNPs that are free to rotate. The amount of MNPs bound to RCPs is found to increase in response to the amplification time as well as in response to the synthetic DNA target concentration (2–40 pM dynamic range). We report OM real-time results obtained with MNPs present during RCA and compare to relevant end-point OM results for RCPs generated for different RCA times. The real-time approach avoids opening of tubes post-RCA and thus reduces risk of lab contamination with amplification products without compromising the sensitivity and dynamic range of the assay.

Key words Biosensor, Isothermal amplification, RCA, Magnetic nanoparticles, Real-time detection

Kira Astakhova and Syeda Atia Bukhari (eds.), Nucleic Acid Detection and Structural Investigations: Methods and Protocols, Methods in Molecular Biology, vol. 2063, https://doi.org/10.1007/978-1-0716-0138-9_1, © Springer Science+Business Media, LLC, part of Springer Nature 2020

1 Introduction

In molecular diagnostics, nucleic acid biomarkers have made it possible to detect an increasing number of infectious diseases at an early stage to the benefit of patients [1]. Compared to PCR, isothermal amplification methods are simpler to implement in an out-of-lab setting and are thus promising for patient-near diagnostic applications. Among these techniques is rolling circle amplification (RCA) [2]. In RCA, a circular template is formed by annealing and ligation of a padlock probe (PLP) onto the target. The 3′-hydroxyl and 5′-phosphate termini of the PLP have to be in close proximity for the ligation to take place. Therefore, the ligation efficiency is highly sensitive to a single base mutation at the binding sites near the ends of the PLP, and the PLP can be designed to detect such mutations with 100% specificity [3, 4]. On successfully circularized PLPs, the 3′-end of the DNA target can then be continuously extended by phi29 polymerase, which unzips the double-stranded product and displaces the newly synthesized single-stranded copy to create a long single-stranded DNA concatemer containing repeated copies of the DNA sequence complementary to the PLP. Typically, 1000 copies of the approximately 100 nt long PLP are produced in 1 h.

RCA can be carried out in a standard format, as presented here, where the amount of product grows linearly with time [5, 6], or in more advanced formats using circle-to-circle amplification (C2CA) or hyperbranched RCA (h-RCA) [7–9], where the amount of product grows closer to exponentially with time. The amplification strategies with improved yield of products require either the use of magnetic microbeads, which may interfere with enzymatic amplification in C2CA [8, 10], or extreme precautions when handled in labs to avoid contamination, as observed in other exponential isothermal amplification methods, such as Loop-mediated Amplification (LAMP) [11].

To the best of our knowledge, only a few reports exist on the integration of RCA with real-time detection. For example, Maier et al. used an impedimetric sensor and found a limit of detection (LOD) of 420 pM after 45 min of RCA [12]; Seichi et al. used an ethidium ion-selective electrode and obtained an LOD of 10 nM after 2 h of RCA [13]; and Yang et al. adapted RCA for protein detection via aptamers and found an LOD of 400 pM after 90 min of RCA [14].
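The linear growth of standard RCA described above can be illustrated with a short calculation. This is only a sketch: it assumes a constant polymerase rate taken from the typical figure quoted in the text (about 1000 copies of a roughly 100 nt PLP per hour), and the function names are hypothetical, not part of any published protocol.

```python
# Minimal sketch of linear RCA product growth, assuming a constant rate.
# The default rate and PLP length are the "typical" values quoted in the text;
# real yields depend on enzyme, temperature, and template.

def linear_rca_copies(minutes, copies_per_hour=1000.0):
    """Copies of the PLP complement per circularized PLP after `minutes`."""
    return copies_per_hour * minutes / 60.0


def concatemer_length_nt(minutes, plp_length_nt=100, copies_per_hour=1000.0):
    """Approximate length (nt) of one single-stranded concatemer."""
    return linear_rca_copies(minutes, copies_per_hour) * plp_length_nt


if __name__ == "__main__":
    for t in (15, 30, 60):
        print(t, "min:", linear_rca_copies(t), "copies,",
              concatemer_length_nt(t), "nt")
```

Under these assumptions, halving the RCA time halves the concatemer length, which is why the amount of MNP binding is expected to scale with amplification time.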
End-point detection, using for example confocal microscopy, a mobile-phone based microscope or a microarray readout format, has generally demonstrated good performance with LODs in the fM/pM range after RCA times >1 h [15–17]. However, these strategies require additional labeling steps and post-RCA opening of the reaction tube to perform the detection, which creates a significant risk of cross-contamination between experiments. Herein, we detect the RCPs using an optomagnetic detection technique, which is described in detail in references [18, 19]. This method probes the modulation of light transmitted through a suspension of magnetic nanoparticles in response to an oscillating magnetic field, B(t) ¼ B0 sin (2πft), applied along the light path. MNPs that are nonspherical and have a linked magnetic moment and optical anisotropy produce a second harmonic modulation of the intensity of transmitted light, which reflects their ability to rotate in response to the applied field. This ability depends on their hydrodynamic size such that small MNPs are able to follow the excitation field up to higher frequencies than large MNPs. Spectra of the real part of the second harmonic signal, reflecting


the sin(4πft)-component of the signal, show a peak at a frequency that is inversely proportional to the hydrodynamic volume of the MNPs. When MNPs bind to RCPs, their hydrodynamic size increases significantly and spectral features shift to substantially lower frequencies. The presence of RCPs can therefore be detected as a turn-off of the signal from free MNPs at high frequencies. Using this approach, a typical LOD measured at end-point after an RCA time of about 30 min is about 10 pM [20, 21].

The objective of this study was to develop, present, and characterize two RCA assays with on-chip optomagnetic (OM) readout: (1) end-point detection of DNA products carried out by mixing MNPs with RCPs after RCA, in comparison to previous reports [20, 21], and (2) integrated amplification and real-time detection with MNPs present during RCA. The latter approach does not require opening of the tube or chip after RCA and is therefore less prone to contamination between experiments. The assays were carried out using a synthetic target DNA for type B influenza virus [10]. We demonstrate that the two RCA assay formats produce comparable results and that down to at least 2 pM of synthetic DNA target can be detected using a real-time optomagnetic readout format.
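The inverse scaling of the characteristic frequency with hydrodynamic volume can be illustrated with a minimal sketch. It assumes the standard Brownian relaxation expression, f_B = k_B·T / (6π·η·V_h), for a sphere of hydrodynamic diameter d_h; the exact relation between the optomagnetic peak position and f_B is treated in refs. [18, 19] and is not reproduced here. The function name, the diameters, and the viscosity value are illustrative assumptions, not values from this chapter.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K

def brownian_frequency(d_h_m, temp_k, viscosity_pa_s):
    """Brownian relaxation frequency f_B = k_B*T / (6*pi*eta*V_h) of a sphere.

    d_h_m          hydrodynamic diameter in metres
    temp_k         absolute temperature in kelvin
    viscosity_pa_s dynamic viscosity of the liquid in Pa*s
    """
    v_h = math.pi * d_h_m ** 3 / 6.0  # hydrodynamic volume of a sphere
    return K_B * temp_k / (6.0 * math.pi * viscosity_pa_s * v_h)

# Illustrative inputs (assumed): 38 degC and a water-like viscosity of 0.7 mPa*s.
TEMP_K = 311.15
ETA = 0.7e-3

f_small = brownian_frequency(100e-9, TEMP_K, ETA)  # free particle
f_large = brownian_frequency(200e-9, TEMP_K, ETA)  # particle with doubled diameter

# Doubling the diameter multiplies the volume by 8, so the frequency drops 8-fold.
ratio = f_small / f_large
print(f"frequency ratio small/large: {ratio:.1f}")
```

The only point of the sketch is the scaling: any binding event that increases the hydrodynamic volume moves the spectral feature to a proportionally lower frequency.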

2 Materials

2.1 Setup

Our optomagnetic setup consisted of four identical copies of a single-chip setup such that four experiments could run in parallel (Fig. 1). Each single-chip setup measured the modulation of monochromatic light transmitted through the chip containing an MNP suspension in response to an oscillating magnetic field applied along the light direction [18, 19]. In addition to structural mechanical parts, the complete four-chip setup consisted of:

1. Four light emitting diodes (one for each chip) (λ = 405 nm).
2. Four photodetectors (one for each chip) (PDA-36A-EC, Thorlabs).
3. A data acquisition unit (USB-6341 DAQ, National Instruments) used to generate a sinusoidal output voltage and to detect the voltages from the four photodetectors.
4. A custom-built voltage-to-current amplifier used to drive all electromagnetic coils in series.
5. Eight electromagnetic coils (two for each chip).
6. Eight custom-made resistive heater elements (2.5 Ω) defined by photolithography on a 1.5 mm thick aluminum printed circuit board. Two were used for each chip to form a bottom heater–chip–top heater sandwich.


Gabriel Antonio S. Minero et al.

Fig. 1 Setup for the optomagnetic studies where four experiments can be run on-chip simultaneously. (a) Schematic of a single setup unit with a plastic chip placed inside the setup and the top heater in the open position. The LED emits monochromatic light, which is guided to the side of a chip using a plastic rod. The light transmitted through the chip is collected and guided to a photodetector using a second plastic rod. The plastic chip is sandwiched between two resistive heaters with a temperature measured by a platinum thermometer. The magnetic field on the chip is provided by an electromagnet placed on either side of the chip. (b) Photograph of entire system with four setups. (c) Close-up photograph of a single setup unit. A chip is placed in the setup with the top heater in the open position

7. Four Pt100 thermometers (one for each chip) to monitor the temperature of the bottom heater.
8. One four-channel Stanford Research Systems PTC10 temperature controller equipped with four PTC430 50 W output cards.


Fig. 2 Photograph of assembled chip. A ruler shows the scale in cm. The outline of the chambers and channels is cut in the center PSA-2 mm PMMA-PSA stack and the through-holes are cut in the 0.5 mm PMMA lid

2.2 Chips

1. Polymethylmethacrylate (PMMA) plates of thicknesses 0.5 and 2 mm.
2. Pressure sensitive adhesive (PSA) foil (ARcare 90106, Adhesive Research, Limerick, Ireland), 0.15 mm thick.
3. To fabricate chips (Fig. 2), PSA is laminated on both sides of the 2 mm thick PMMA plate and the channel and chamber structure is cut through the three-layer laminate using an Epilog Mini 18 CO2 laser cutter. Inlet holes for the chips are laser cut through a 0.5 mm thick PMMA plate to form the lid layer. The protective foils on the three-layer laminate are removed, and the stack of the 0.5 mm lid, the PSA-2 mm PMMA-PSA laminate, and the 0.5 mm thick bottom is assembled and left in a press at a pressure of about 1.8 MPa for 30 s to form the complete bonded stack. Subsequently, the individual chips are cut from the stack using the laser cutter.
4. Each chip has the following dimensions: thickness ca. 3 mm, width 13 mm, length 25 mm. It contains a loading chamber with two inlets (two to ensure pressure equilibration) and a measurement chamber (Fig. 2).
5. The measurement chamber of the chip containing the sample has a square geometry (side length 5 mm) and a volume of approximately 75 μL.
6. A chip is loaded for a measurement by first loading the inlet chamber using a pipette and then manually applying a quick acceleration to shift the liquid to the measurement chamber.

2.3 Buffers and Reagents

1. 20 mg/mL BSA.
2. Ligation buffer (10×): 200 mM Tris–HCl, 250 mM KCl, 100 mM MgCl2, 5 mM NAD, 0.1% Triton X-100, pH 8.3.
3. 5 U/μL Ampligase.
4. 10 mM dNTP mix.
5. RCA buffer (10×): 330 mM Tris–CH3COOH, 100 mM Mg(CH3COO)2, 660 mM K-CH3COOH, 1.0% Tween 20, 10 mM DTT, pH 7.9.
6. 10 U/μL Phi29 polymerase.
7. 10 U/μL R. AluI restriction endonuclease.
8. Binding buffer (1×): 8 mM Tris, 4 mM EDTA, 0.1% Tween 20, and 0.8 M NaCl, pH 8 (adjusted using HCl).
9. Detection buffer (1×): 20 mM Tris–HCl, 140 mM NaCl, 5 mM KCl, 50 mM EDTA, 0.1% BSA, 0.01% Tween 20, pH 8 (adjusted using HCl).
10. Streptavidin coated 100 nm MNPs (10 mg/mL) (BNF starch streptavidin, Micromod). According to the manufacturer, the 10 mg/mL stock MNP concentration corresponds to $6 \times 10^{12}$ MNPs/mL or 10 nM.
11. Two tubes with 60 μL stock solutions of the target DNA-PLP hybrids with target concentrations of 2 nM and 200 pM and a threefold excess of PLPs are prepared in the ligation buffer (1×) with 0.2 mg/mL BSA and 250 U/mL Ampligase.

2.4 DNA Oligonucleotides

The DNA sequence design for padlock probe (PLP) recognition and rolling circle amplification of influenza synthetic target DNA was adapted from Neumann et al. [10], and the sequences are given in Table 1. The detection oligonucleotide (DO) is also a primer for restriction by the R. AluI restriction endonuclease: its sequence contains the AluI recognition site (AGCT).

Table 1 DNA sequences of the synthetic target, the padlock probe (PLP), and the detection oligonucleotide (DO), given in the 5′ → 3′ direction

Name      DNA sequence (5′ to 3′)      Modification
Target    AGACCTGTTACATCTGGGTGCTTTCCTATAATGCACGACAGAACAAAAATTAGACAGCTGCCCAACCTTCTCCGAGGATAC      –
PLP       GGGCAGCTGTCTAATTTTTGAGTCGGAAGTACTACTCTCTGTGTATGCAGCTCCTCAGTAATAGTGTCTTACGTATCCTCGGAGAAGGTT      5′-phosphate
DO        GTGTATGCAGCTCCTCAGTA      3′-biotin

The target sequence matching the arms of the PLP is underlined. The sequence on the PLP identical to that of the DO is marked in bold.
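The relationships between the three sequences in Table 1 can be checked with a short script. This is a minimal sketch: the arm lengths (20 nt at the 5′-end and 18 nt at the 3′-end of the PLP) are not stated in the chapter but were inferred here by comparing the PLP ends with the target, and the helper names are illustrative.

```python
# Sequences from Table 1, written 5' -> 3'.
TARGET = ("AGACCTGTTACATCTGGGTGCTTTCCTATAA"
          "TGCACGACAGAACAAAAATTAGACAGCTGCCC"
          "AACCTTCTCCGAGGATAC")
PLP = ("GGGCAGCTGTCTAATTTTTGAGTCGGAAGTACTACTCTCTGTGTA"
       "TGCAGCTCCTCAGTAATAGTGTCTTACGTATCCTCGGAGAAGGTT")
DO = "GTGTATGCAGCTCCTCAGTA"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]

# Assumed arm lengths, inferred from the sequences (not given in the text).
ARM_5P = 20
ARM_3P = 18

# After ligation the 3'-arm is joined to the 5'-arm; read 5' -> 3' across the nick.
junction = PLP[-ARM_3P:] + PLP[:ARM_5P]

# The target region bound by both arms is the reverse complement of that junction.
binding_site = reverse_complement(junction)

print("binding site found in target:", binding_site in TARGET)
print("DO sequence found in PLP:", DO in PLP)
print("AluI site (AGCT) found in DO:", "AGCT" in DO)
```

Because the DO is identical to a stretch of the PLP, it is complementary to every repeat of the RCA product, which is the complement of the PLP.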

3 Methods

3.1 MNP Functionalization

1. A volume of 5 μL of MNPs (10 mg/mL) is mixed with 4 μL of the biotinylated DOs (1 μM) and 41 μL binding buffer (1×), see Note 1, and incubated at room temperature for 30 min.
2. The DO-conjugated MNPs (DO-MNPs) are separated from unbound DO in a strong magnetic field gradient and washed three times in 50 μL binding buffer (1×).
3. The DO-MNPs are finally resuspended in 50 μL detection buffer (1×) to a nominal concentration of 1 mg/mL.
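The amounts in the steps above fix the [DO]/[MNP] ratio quoted in Note 1. The following minimal sketch reproduces that arithmetic, using the manufacturer's conversion of 10 mg/mL to 10 nM from Subheading 2.3; the function and variable names are illustrative.

```python
def amount_fmol(volume_ul, conc_nm):
    """Amount of substance in fmol: microlitres x nmol/L = femtomoles."""
    return volume_ul * conc_nm

# Step 1: 5 uL MNP stock (10 mg/mL, stated to correspond to 10 nM)
#         and 4 uL biotinylated DO (1 uM = 1000 nM).
mnp_fmol = amount_fmol(5, 10)
do_fmol = amount_fmol(4, 1000)
do_per_mnp = do_fmol / mnp_fmol

# Step 3: the same particle mass resuspended in 50 uL.
mnp_mass_ug = 5 * 10           # uL x mg/mL = micrograms
nominal_mg_per_ml = mnp_mass_ug / 50

print(f"[DO]/[MNP] = {do_per_mnp:.0f}")
print(f"nominal DO-MNP concentration = {nominal_mg_per_ml:.1f} mg/mL")
```

The nominal concentration ignores any particle loss during the magnetic washing steps.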

3.2 Padlock Probe Ligation

1. A wide range of target concentrations is covered by using the two stock solutions of target DNA-PLP hybrids (2 nM and 200 pM).
2. Sequence-specific annealing and PLP ligation are performed simultaneously in tubes by placing them in a block thermostat at 55 °C for 20 min, see Note 2 and schematics in Figs. 3a-1 and 4a-1.

3.3 RCA in Tube

1. 50 μL of each of the obtained target-PLP stocks is mixed with 50 μL of RCA mixture containing 1.12 μL dNTP (10 mM), 10 μL BSA (2 mg/mL), 10 μL RCA buffer (10×), 25.3 μL milliQ water, and 3.6 μL phi29 polymerase (10 U/μL).
2. A No Template Control (NTC) is prepared with 50 μL of milliQ water instead of DNA.
3. Tubes with the solutions are placed in a block thermostat at 38 °C for 45 min, see schematic in Fig. 3a-2.
4. Subsequently, tubes are temporarily placed on ice while the temperature of the block thermostat reaches 65 °C, after which the tubes are moved to the block thermostat to terminate amplification by heat inactivation of the phi29 polymerase for 10 min.
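The composition of the RCA mixture in step 1 can be sanity-checked with a few lines of arithmetic. This is a minimal sketch assuming a nominal 100 μL reaction; it reports only what the RCA mixture contributes (the target-PLP stock carries its own BSA and ligation buffer), and the names used are illustrative.

```python
# Volumes (uL) of the RCA mixture components from step 1.
RCA_MIX_UL = {
    "dNTP (10 mM)": 1.12,
    "BSA (2 mg/mL)": 10.0,
    "RCA buffer (10x)": 10.0,
    "milliQ water": 25.3,
    "phi29 polymerase (10 U/uL)": 3.6,
}
REACTION_UL = 100.0  # nominal volume: 50 uL target-PLP stock + 50 uL RCA mixture

def final_conc(stock_conc, added_ul, total_ul=REACTION_UL):
    """Concentration after dilution, in the same unit as stock_conc."""
    return stock_conc * added_ul / total_ul

mix_volume = sum(RCA_MIX_UL.values())              # should be close to 50 uL
dntp_mm = final_conc(10.0, 1.12)                   # mM
buffer_x = final_conc(10.0, 10.0)                  # fold concentration
phi29_u_per_ul = final_conc(10.0, 3.6)             # U/uL
target_nm = [final_conc(c, 50.0) for c in (2.0, 0.2)]  # from the 2 nM and 200 pM stocks

print(f"RCA mixture volume: {mix_volume:.2f} uL")
print(f"dNTP: {dntp_mm:.3f} mM, buffer: {buffer_x:.0f}x, phi29: {phi29_u_per_ul:.2f} U/uL")
print("target-PLP after mixing (nM):", [round(c, 3) for c in target_nm])
```

The 1:1 mixing halves the target-PLP concentration, so the two stocks enter amplification at nominally 1 nM and 100 pM.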

3.4 End-Point Optomagnetic Detection on a Chip

1. Samples (100 μL) are prepared by diluting a volume of RCPs from one of the RCP stocks in detection buffer (1×) to a volume of 95 μL, chosen to give target-PLP concentrations c of 200, 40, 30, 20, 10, 4, 2, and 0 pM in the final 100 μL volume. As the final step, 5 μL of DO-MNPs (1 mg/mL) are added to a final MNP concentration of 0.05 mg/mL.
2. The samples are immediately loaded into chips that are sealed using PCR tape and mounted in the OM setups preheated to the detection temperature of 54 °C (see Note 3).
3. Thirty sequential OM measurements are recorded for $B_0 = 1$ mT and 41 logarithmically spaced frequencies f between 1 and 2800 Hz (45 s per spectrum, total time 23 min), see schematic in Fig. 3a-3 and example of results in Fig. 3b.
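The measurement schedule in step 3 can be written out explicitly. The sketch below generates 41 logarithmically spaced frequencies and the total acquisition time; it assumes that both end frequencies (1 and 2800 Hz) are included in the sweep, which the text does not state, and the function names are illustrative.

```python
def log_spaced(f_min, f_max, n):
    """Return n frequencies spaced evenly on a log scale, end points included."""
    ratio = (f_max / f_min) ** (1.0 / (n - 1))
    return [f_min * ratio ** i for i in range(n)]

def total_minutes(n_spectra, seconds_per_spectrum=45.0):
    """Total acquisition time for a series of sequential spectra."""
    return n_spectra * seconds_per_spectrum / 60.0

freqs = log_spaced(1.0, 2800.0, 41)
end_point_min = total_minutes(30)    # end-point detection, Subheading 3.4
real_time_min = total_minutes(100)   # real-time detection, Subheading 3.5

print(f"{len(freqs)} frequencies from {freqs[0]:.1f} to {freqs[-1]:.1f} Hz")
print(f"end-point series: {end_point_min} min, real-time series: {real_time_min} min")
```

Thirty spectra of 45 s give 22.5 min, which the protocol rounds to 23 min; one hundred spectra give the 75 min used for real-time detection.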


Fig. 3 End-point detection of rolling circle amplification products. (a) Schematic illustration of (1) simultaneous PLP annealing and ligation, (2) RCA, and (3) post-RCA on-chip optomagnetic detection. (b) OM spectra (30 spectra, 45 s each) recorded during binding of DO-MNPs to RCPs (c = 40 pM, 45 min RCA time). The dashed lines indicate the frequency range used for the calculation of the average signal from free DO-MNPs at time t, $\langle V_2'/V_0 \rangle_t$. (c) Time evolution of the fraction of bound MNPs, $B_\mathrm{MNP} = 1 - \langle V_2'/V_0 \rangle_t / \langle V_2'/V_0 \rangle_0$, for the indicated DNA target concentrations, c. (d) Values of $B_\mathrm{MNP}$ after detection for 23 min for the indicated target concentrations and RCA times (20, 30, and 45 min). Error bars indicate the standard deviation (SD) obtained from three repeated experiments. The horizontal lines represent the signal cutoff given by the NTC signal plus 3SD. The signal for 2 pM target concentration is above the detection threshold for all three investigated RCA times
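The quantities named in the caption above (band-averaged signal, bound fraction, and NTC + 3SD cutoff) can be sketched as follows. The 40 to 1000 Hz averaging band is the one given in Subheading 3.6; the use of the sample standard deviation and all numerical inputs in the demonstration are assumptions made for illustration only and are not measured data.

```python
from statistics import mean, stdev

def band_average(freqs_hz, values, f_lo=40.0, f_hi=1000.0):
    """Average of V2'/V0 over the band dominated by free DO-MNPs."""
    selected = [v for f, v in zip(freqs_hz, values) if f_lo <= f <= f_hi]
    return mean(selected)

def bound_fraction(avg_t, avg_0):
    """B_MNP = 1 - <V2'/V0>_t / <V2'/V0>_0."""
    return 1.0 - avg_t / avg_0

def cutoff(ntc_values):
    """Detection threshold: mean of NTC replicates plus three sample SDs."""
    return mean(ntc_values) + 3.0 * stdev(ntc_values)

# Hypothetical numbers, for demonstration only.
demo_freqs = [10.0, 50.0, 500.0, 2000.0]
demo_t0 = [0.001, -0.010, -0.030, 0.002]   # first spectrum
demo_t = [0.004, -0.005, -0.015, 0.001]    # later spectrum, fewer free MNPs

avg_0 = band_average(demo_freqs, demo_t0)
avg_t = band_average(demo_freqs, demo_t)
b_mnp = bound_fraction(avg_t, avg_0)
threshold = cutoff([0.00, 0.02, 0.04])     # hypothetical NTC replicates

print(f"B_MNP = {b_mnp:.2f}, cutoff = {threshold:.2f}, detected: {b_mnp > threshold}")
```

Because the free-particle peak is negative, both band averages are negative and their ratio is positive, so the bound fraction rises from 0 toward 1 as the free-particle signal disappears.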

3.5 RCA On-Chip with Real-Time Optomagnetic Detection

1. 50 μL of each of the obtained target-PLP stocks is mixed with 50 μL of RCA mixture containing 1.12 μL dNTP (10 mM), 10 μL BSA (2 mg/mL), 10 μL RCA buffer (10×), 5 μL DO-MNPs (1 mg/mL), 20.3 μL milliQ water, and 3.6 μL phi29 polymerase (10 U/μL).
2. The samples are immediately loaded into chips that are sealed using PCR tape and mounted in the OM setup preheated to 38 °C.
3. One hundred sequential OM measurements are recorded for $B_0 = 1$ mT and 41 logarithmically spaced frequencies f between 1 and 2800 Hz (45 s per spectrum, total time 75 min), see schematic in Fig. 4a-2 and example of results in Fig. 4b.
4. If necessary, the DO-RCP hybrids formed by specific binding of DO-MNPs to the RCPs can be digested by addition of 1 μL of R. AluI restriction endonuclease (10 U/μL) to release the MNPs from the RCPs.

Fig. 4 Real-time detection of rolling circle amplification products. (a) Schematic illustration of (1) PLP annealing and ligation and (2) on-chip RCA with simultaneous optomagnetic detection. (b) OM spectra recorded after the RCA/detection times indicated in the legend for c = 40 pM. The dashed lines indicate the frequency range used for the calculation of the amount of free MNPs. The arrow indicates the progression of the time series. (c) Values of $B_\mathrm{MNP}$ after amplification/detection for 20, 30, 45, and 75 min for the indicated target concentrations, c. The horizontal lines represent cutoff values corresponding to $B_\mathrm{MNP}$ of the NTC samples after 20, 30, 45, and 75 min, respectively, plus 3SD. The inset presents the time evolution of $B_\mathrm{MNP}$ for the indicated concentrations. The signal for 2 pM target concentration is at or above the detection threshold for all investigated RCA/detection times

3.6 Analysis of the OM Spectra

1. The data collection software saves the complex harmonics (from the 0th to the 12th harmonic) of the response with respect to the sinusoidal magnetic field excitation. These are found using a built-in Fast Fourier Transform (FFT) algorithm. At low $B_0$, the photodetector voltage can be written as [18]

$$V(t) = V_0 + 2V_2 \sin^2(2\pi f t - \phi) = V_0 + V_2 + V_2' \sin(4\pi f t) + V_2'' \cos(4\pi f t),$$

where we have used trigonometric identities and defined the real and imaginary parts of the second harmonic optomagnetic signal as $V_2' = -V_2 \sin(2\phi)$ and $V_2'' = -V_2 \cos(2\phi)$, respectively. In these expressions, $V_2$ ($\ll V_0$) is the amplitude of the oscillating signal and $\phi$ is the phase lag of the response of the particles with respect to the magnetic field oscillation. The particle signal shows up in the second harmonic response because the light extinction (or shadow) of a particle depends on the magnitude but not the sign of the magnetic field. Higher even harmonics of the signal come into play at higher values of $B_0$ [19]. In the measurements presented below, we consider the real part of the second harmonic signal, $V_2'$, which gives the $\sin(4\pi f t)$ component of the signal. To compensate for possible variations in the intensity of incoming light from chip to chip and between setups, we normalize the second harmonic signal by the 0th harmonic signal $V_0$, which approximately gives the constant component of the photodetector signal.
2. The $V_2'/V_0$ spectra reveal a negative peak at 200 Hz (at 54 °C) or 130 Hz (at 38 °C) from free DO-MNPs, and complex features below 10 Hz dominated by a positive peak from DO-MNPs bound to RCPs (Figs. 3b and 4b).
3. In a spectrum obtained at time t, we let $\langle V_2'/V_0 \rangle_t$ denote the average value of $V_2'/V_0$ for frequencies in the range between 40 and 1000 Hz. This frequency range is dominated by the signal from free DO-MNPs, which decreases over time as more and more DO-MNPs bind to the RCPs (Figs. 3b and 4b).
4. We define the fraction of free MNPs at time t as $F_\mathrm{MNP} = \langle V_2'/V_0 \rangle_t / \langle V_2'/V_0 \rangle_0$, where time t = 0 denotes the first spectrum measured after stabilization of the temperature. Correspondingly, we define the fraction of bound MNPs as $B_\mathrm{MNP} = 1 - F_\mathrm{MNP}$.
5. For each target concentration, the fraction of bound MNPs, $B_\mathrm{MNP}$, is plotted as a function of time (Fig. 3c, inset in Fig. 4c) and the signal is taken as the value of $B_\mathrm{MNP}$ after a given detection time.
6. The dose-response curve for end-point detection of RCPs (Subheadings 3.3 and 3.4) is obtained for RCA times of 20, 30, or 45 min and an OM detection time of 20 min (Fig. 3d).
7. The dose-response analysis for real-time detection of RCPs during RCA (Subheading 3.5) is obtained for the amplification (and detection) times of 20, 30, 45, and 75 min (Fig. 4c).


Fig. 5 Real-time monitoring of the release of MNPs bound to RCPs by digestion. (a) Experimental setup: RCPs (40 pM) were first generated during 75 min of real-time RCA, resulting in clustered MNPs, and subsequently cleaved using R. AluI restriction endonuclease added to the same chip. The digestion was initiated by mounting the chip at 35 °C. (b) OM spectra (30 spectra, 45 s each) recorded during digestion. The arrow indicates the time progression

8. To demonstrate that the changes in the spectra are due to specific binding of the DO-MNPs, we study a representative sample formed for real-time detection of RCPs (c = 40 pM, 75 min RCA and detection time), add 1 μL of R. AluI restriction enzyme, and monitor the signal vs. time (Fig. 5). The restriction enzyme cleaves the DO sites on RCPs and leads to release of the MNPs from the RCPs. Within about 8 min at 35 °C, the initial optomagnetic spectrum from free MNPs is recovered (Fig. 5).
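The second harmonic decomposition described in step 1 of this subheading can be checked numerically. The sketch below builds a synthetic photodetector signal of the stated form and recovers the sin(4πft) and cos(4πft) components by direct projection over one excitation period. This is a stand-in for the acquisition software's FFT, which is not described in detail in the chapter, and the signal parameters are arbitrary illustrative values.

```python
import math

def second_harmonic(signal, times, f):
    """Return (constant part, sin(4*pi*f*t) part, cos(4*pi*f*t) part).

    The samples must cover a whole number of excitation periods uniformly.
    """
    n = len(signal)
    dc = sum(signal) / n
    v2_re = 2.0 / n * sum(v * math.sin(4 * math.pi * f * t) for v, t in zip(signal, times))
    v2_im = 2.0 / n * sum(v * math.cos(4 * math.pi * f * t) for v, t in zip(signal, times))
    return dc, v2_re, v2_im

# Arbitrary illustrative parameters (not measured values).
F_HZ, V0, V2, PHI = 100.0, 1.0, 0.01, 0.3
N = 2000
times = [k / (N * F_HZ) for k in range(N)]  # one full period of the excitation
signal = [V0 + 2 * V2 * math.sin(2 * math.pi * F_HZ * t - PHI) ** 2 for t in times]

dc, v2_re, v2_im = second_harmonic(signal, times, F_HZ)

print(f"constant part: {dc:.4f} (V0 + V2 = {V0 + V2:.4f})")
print(f"V2' = {v2_re:.6f}, expected {-V2 * math.sin(2 * PHI):.6f}")
print(f"V2'' = {v2_im:.6f}, expected {-V2 * math.cos(2 * PHI):.6f}")
```

The recovered constant part equals V0 + V2, which is why normalizing by the 0th harmonic only approximately normalizes by V0, and the negative sign of V2′ for a small phase lag matches the negative free-particle peak seen in the spectra.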

4 Notes

1. The probe density of DOs per MNP can be adjusted by adjusting the ratio between the concentrations of the DO probes and MNPs [22]. In the present study, we used a [DO]/[MNP] ratio of 80.
2. The choice of the ligation temperature is dictated by the PLP sequence, in particular the melting points of dsDNA hybrids between the ligating arms of PLPs and the complementary domain in the target DNA. It should be high enough to minimize unspecific annealing of the PLP and low enough to ensure that PLP arms can anneal on the target. As a rule of thumb, it is chosen to be about 5 °C below the lowest melting point of the hybrids of the PLP arms and the target. In the present system, the ligation temperature was set to 55 °C following a previous report [10].
3. The detection temperature is chosen to be above the melting temperature of unspecific annealing/self-annealing hybrids of the RCPs but below the melting temperature of the DO-RCP hybrids. Possible self-annealing hybrids can be evaluated using the OligoAnalyzer tool (IDT). Usually, the detection temperature is chosen to be at least 5 °C below the melting temperature of the DO-RCP hybrids, and the length of the DO sequence should be chosen to target available (not self-annealed) domains of the RCPs and to have a DO-RCP melting temperature well above the melting temperatures of self-annealing sites.

References

1. Cheung-Hoi Yu A, Vatcher G, Yue X, Dong Y, Hua Li M, K Tam PH, Tsang PY, Wong AK, Hui MH, Yang B, Tang H, Lau L-T (2012) Nucleic acid-based diagnostics for infectious diseases in public health affairs. Front Med 6:173–176
2. Ali MM, Li F, Zhang Z, Zhang K, Kang DK, Ankrum JA, Le XC, Zhao W (2014) Rolling circle amplification: a versatile tool for chemical biology, materials science and medicine. Chem Soc Rev 43:3324–3341
3. Pavankumar AR, Engström A, Liu J, Herthnek D, Nilsson M (2016) Proficient detection of multi-drug-resistant Mycobacterium tuberculosis by padlock probes and lateral flow nucleic acid biosensors. Anal Chem 88:4277–4284
4. Nilsson M, Gullberg M, Dahl F, Szuhai K, Raap AK (2002) Real-time monitoring of rolling-circle amplification using a modified molecular beacon design. Nucleic Acids Res 30:e66
5. Zhang S, Wu Z, Shen G, Yu R (2009) A label-free strategy for SNP detection with high fidelity and sensitivity based on ligation-rolling circle amplification and intercalating of methylene blue. Biosens Bioelectron 24:3201–3207
6. Strömberg M, Zardán Gómez de la Torre T, Göransson J, Gunnarsson K, Nilsson M, Strømme M, Svedlindh P (2008) Microscopic mechanisms influencing the volume amplified magnetic nanobead detection assay. Biosens Bioelectron 24:696–703
7. Yasui T, Ogawa K, Kaji N, Nilsson M, Ajiri T, Tokeshi M, Horiike Y, Baba Y (2016) Label-free detection of real-time DNA amplification using a nanofluidic diffraction grating. Sci Rep 6:31642
8. Hernández-Neuta I, Pereiro I, Ahlford A, Ferraro D, Zhang Q, Viovy JL, Descroix S, Nilsson M (2018) Microfluidic magnetic fluidized bed for DNA analysis in continuous flow mode. Biosens Bioelectron 102:531–539
9. Sun J, de Hoog S (2012) Hyperbranching rolling circle amplification, an improved protocol for discriminating between closely related fungal species. In: Fungal diagnostics. Humana, Totowa, NJ, pp 167–175
10. Neumann F, Hernández-Neuta I, Grabbe M, Madaboosi N, Albert J, Nilsson M (2018) Padlock probe assay for detection and subtyping of seasonal influenza. Clin Chem 64:1–9
11. Minero GAS, Nogueira C, Rizzi G, Tian B, Fock J, Donolato M, Strömberg M, Hansen MF (2017) Sequence-specific validation of LAMP amplicons in real-time optomagnetic detection of Dengue serotype 2 synthetic DNA. Analyst 142:3441–3450
12. Maier T, Kainz K, Barišić I, Hainberger R (2015) An impedimetric sensor for real-time detection of antibiotic resistance genes employing rolling circle amplification. Int J Electrochem Sci 10:2026–2034
13. Seichi A, Kokuka N, Kashima Y, Tabata M, Goda T, Matsumoto A, Iwasawa N, Citterio D, Miyahara Y, Suzuki K (2016) Real-time monitoring and detection of primer generation-rolling circle amplification of DNA using an ethidium ion-selective electrode. Anal Sci 32:505–510
14. Yang L, Fung CW, Eun Jeong Cho A, Ellington AD (2007) Real-time rolling circle amplification for protein detection. Anal Chem 79:3320–3329
15. Jarvius J, Melin J, Göransson J, Stenberg J, Fredriksson S, Gonzalez-Rey C, Bertilsson S, Nilsson M (2006) Digital quantification using amplified single-molecule detection. Nat Methods 3:725–727
16. Kühnemund M, Wei Q, Darai E, Wang Y, Hernández-Neuta I, Yang Z, Tseng D, Ahlford A, Mathot L, Sjöblom T, Ozcan A, Nilsson M (2017) Targeted DNA sequencing and in situ mutation analysis using mobile phone microscopy. Nat Commun 8:13913
17. Nallur G, Luo C, Fang L, Cooley S, Dave V, Lambert J, Kukanskis K, Kingsmore S, Lasken R, Schweitzer B (2001) Signal amplification by rolling circle amplification on DNA microarrays. Nucleic Acids Res 29:e118
18. Fock J, Jonasson C, Johansson C, Hansen MF (2017) Characterization of fine particles using optomagnetic measurements. Phys Chem Chem Phys 19:8802–8814
19. Fock J, Balceris C, Costo R, Zeng L, Ludwig F, Hansen MF (2017) Field-dependent dynamic responses from dilute magnetic nanoparticle dispersions. Nanoscale 10:2052–2066
20. Bejhed RS, de la Torre TZG, Donolato M, Hansen MF, Svedlindh P, Strömberg M (2015) Turn-on optomagnetic bacterial DNA sequence detection using volume-amplified magnetic nanobeads. Biosens Bioelectron 66:405–411
21. Donolato M, Antunes P, de la Torre TZG, Te HE, Chen CH, Burger R, Rizzi G, Bosco FG, Strømme M, Boisen A, Hansen MF (2015) Quantification of rolling circle amplified DNA using magnetic nanobeads and a Blu-ray optical pick-up unit. Biosens Bioelectron 67:649–655
22. Minero GAS, Fock J, McCaskill JS, Hansen MF (2017) Optomagnetic detection of DNA triplex nanoswitches. Analyst 142:582–585

Chapter 2

Subtyping of Swine Influenza Using a High-Throughput Real-Time PCR Platform and a Single Microfluidics Device

Helene Larsen, Nicole B. Goecke, and Charlotte K. Hjulsager

Abstract

Reverse transcription real-time PCR (RT-qPCR) is one of several techniques used to determine the presence and level of infectious veterinary pathogens in diagnostic laboratories. Here we describe how automation of PCR reactions using integrated fluidic circuits (IFCs), an IFC controller MX, and a Biomark HD instrument allows for the testing of 48 field samples with swine influenza for up to 48 different subtypes simultaneously in nanoliter volumes.

Key words Dynamic array (DA), Real-time PCR, Microfluidics, Integrated fluidic circuits (IFCs), Biomark HD, Preamplification, Influenza A virus, Subtyping

1 Introduction

Polymerase chain reaction (PCR) is a method widely used in molecular biology to make thousands to millions of copies of a specific DNA segment [1]. The end PCR product subsequently allows for various applications such as DNA cloning for sequencing or gene expression and functional analysis, medical diagnostics, forensic analysis of DNA, and detection of the presence and level of specific pathogens for the diagnosis of infectious diseases in humans and animals. The most commonly used PCR platforms in diagnostic laboratories today allow for the analysis of 96–384 samples in one run, creating a maximum of 384 data points. The reactions run on these platforms are costly, as each reaction requires the use of microliter volumes of expensive master mix and large volumes of precious sample. The Biomark HD PCR system (Fluidigm, South San Francisco, USA) is a high-throughput microfluidics PCR system that can generate up to 9216 data points in one run using just nanoliters of master mix and sample. The system comprises the Biomark HD PCR instrument, a range of Dynamic Array IFCs for gene expression (Fig. 1), which are designed to run

Kira Astakhova and Syeda Atia Bukhari (eds.), Nucleic Acid Detection and Structural Investigations: Methods and Protocols, Methods in Molecular Biology, vol. 2063, https://doi.org/10.1007/978-1-0716-0138-9_2, © Springer Science+Business Media, LLC, part of Springer Nature 2020


Helene Larsen et al.

Fig. 1 The 48.48 Dynamic Array IFC for genotyping which enables 2304 reactions using 48 samples. (a) A loading map and two check valves are shown into which the control line fluid should be injected. (b) A closeup of the microfluidic chambers where the mixing of sample and assay is controlled by nanoflex valves about 1/10 the width of a human hair

192.24 (samples.assays), 48.48, and 96.96 formats, and an IFC controller that primes the IFC and prepares the PCR reactions in the microfluidic chambers. The Biomark HD system is the only multipurpose real-time PCR system that performs genotyping, gene expression profiling, quantitative real-time digital PCR (qdPCR), and single-cell analysis [2–4].

Swine influenza virus is a cause of respiratory disease in pigs worldwide. The influenza A virus genome consists of eight RNA segments, which code for different viral proteins necessary for the function and multiplication of the virus. Influenza A viruses are classified into subtypes based on the antigenic characteristics of the virally encoded surface proteins hemagglutinin (HA) and neuraminidase (NA). Currently 18 different HA and 11 different NA


subtypes are known [5, 6]. In pigs only the H1, H3, N1, and N2 variants are endemic, and H1N1, H1N2, and H3N2 swine influenza viruses are common causes of respiratory disease in pigs worldwide [7]. Influenza A virus subtypes can be further divided into lineages; for example, there are different variants of the H1 subtype [8]. Information about the origin of the additional six influenza virus gene segments is also valuable for tracing of influenza viruses and evaluation of zoonotic potential. Determination of subtype, lineage, and identity of internal gene segments provides valuable information for choosing the correct control strategy, including choice of vaccine and transmission tracing. Determining the variant of each of the eight gene segments requires the use of many primer–probe sets (assays) run separately, due to the wide diversity of the influenza virus gene segments. Here we describe the use of the Biomark HD PCR system (Fluidigm) for the subtyping of swine influenza viruses from cDNA [9]. This platform offers the versatility needed for complete subtype characterization of influenza viruses, due to the capacity for simultaneous running of many assays.
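The data point counts quoted in this Introduction follow directly from the IFC format names, which are written as samples.assays. A minimal sketch of that arithmetic, with an illustrative function name:

```python
def reactions_per_run(ifc_format):
    """Number of reaction chambers for a format written as 'samples.assays'."""
    samples, assays = (int(part) for part in ifc_format.split("."))
    return samples * assays

# Formats named in the text.
counts = {fmt: reactions_per_run(fmt) for fmt in ("192.24", "48.48", "96.96")}

for fmt, n in counts.items():
    print(f"{fmt}: {n} reactions per run")
```

The 48.48 format used in this chapter gives the 2304 reactions mentioned in Fig. 1, and the 96.96 format gives the maximum of 9216 data points, compared with at most 384 on a conventional plate-based platform.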

2 Materials

2.1 RNA Extraction

1. RNeasy Mini Kit (QIAGEN, Copenhagen, Denmark).
2. β-Mercaptoethanol.
3. TissueLyser II (QIAGEN).
4. RNase-free water.
5. Buffer RLT containing β-mercaptoethanol (500 μl β-mercaptoethanol per 50 ml Buffer RLT).

2.2 Primer Mix for cDNA Synthesis and Preamplification

1. 100 μM PCR primers (Eurofins Genomics, Ebersberg, Germany).
2. Low EDTA TE-buffer (TE buffer (1×) pH 8.0 low EDTA).

2.3 cDNA Synthesis and Preamplification

1. 10 μM random hexamer.
2. 200 nM primer mix (prepared as described in Subheading 3.2).
3. 5× QIAGEN One-Step RT-PCR buffer (One-Step RT-PCR kit, QIAGEN).
4. 10 mM nucleotide dNTP mix.
5. 25 mM MgCl2.
6. Enzyme mix (One-Step RT-PCR kit, QIAGEN).
7. RNase-free water.
8. RNA sample.
9. A thermocycler.


2.4 Preparation of the 48.48 Dynamic Array (DA) IFC and qPCR

1. 2× TaqMan Gene Expression Master Mix (Applied Biosystems, Foster City, USA).
2. 20× Sample Loading Reagent (Fluidigm).
3. 30 μM specific probe (from Eurofins Genomics).
4. 100 μM forward primer.
5. 100 μM reverse primer.
6. 2× Assay Loading Reagent (Fluidigm).
7. Preamplified cDNA (prepared as described in Subheading 3.3).
8. Biomark 48.48 DA IFC (Fluidigm).
9. Control line fluid (included in the DA IFC package).
10. IFC controller MX (Fluidigm).
11. Biomark HD instrument (Fluidigm).
12. Fluidigm Real-Time PCR Analysis Software (Fluidigm).

3 Methods

Carry out all procedures at room temperature, unless otherwise stated. See also [9] for support of the methodology. RNA should be stored at −80 °C. An overview of the procedure is outlined in Fig. 2. Be observant of workflow to avoid contamination with PCR products (see Note 1).

3.1 RNA Extraction

1. Mix 200 μl clinical specimen (e.g., nasal swab in phosphate buffered saline (PBS); see Note 2) with 400 μl Buffer RLT containing β-mercaptoethanol.
2. Extract RNA from the 600 μl sample mixture with the RNeasy Mini Kit following the instructions from the manufacturer, elute in 60 μl RNase-free water, and store at −80 °C until use.

3.2 Primer Mix for cDNA Synthesis and Preamplification

1. Make a 20 μM primer solution of each PCR primer pair (5 μl forward primer (100 μM) + 5 μl reverse primer (100 μM) + 15 μl low EDTA TE-buffer (TE buffer (1×) pH 8.0 low EDTA)).
2. Combine 5 μl of each primer solution (20 μM) and add low EDTA TE-buffer to a final volume of 500 μl, giving 200 nM of each primer (see Note 3).
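The two dilution steps above can be verified with a short calculation. This is a minimal sketch; the number of primer pairs (48) is a hypothetical example chosen only to illustrate how much buffer is needed to reach 500 μl, and the function names are illustrative.

```python
def diluted_conc(stock_conc, stock_ul, total_ul):
    """Concentration after dilution, in the same unit as stock_conc."""
    return stock_conc * stock_ul / total_ul

# Step 1: 5 ul forward + 5 ul reverse (100 uM each) + 15 ul low EDTA TE-buffer.
pair_volume_ul = 5 + 5 + 15
each_primer_um = diluted_conc(100, 5, pair_volume_ul)

# Step 2: 5 ul of each 20 uM primer solution in a final volume of 500 ul.
pooled_nm = diluted_conc(each_primer_um * 1000, 5, 500)

def te_buffer_to_add(n_primer_pairs, final_ul=500, per_pair_ul=5):
    """Volume of TE-buffer needed to bring the pooled primers to the final volume."""
    return final_ul - n_primer_pairs * per_pair_ul

te_for_48_pairs_ul = te_buffer_to_add(48)  # hypothetical panel of 48 primer pairs

print(f"each primer in the pair solution: {each_primer_um:.0f} uM")
print(f"each primer in the pooled mix: {pooled_nm:.0f} nM")
print(f"TE-buffer to add for 48 pairs: {te_for_48_pairs_ul} ul")
```

The result, 200 nM of each primer, matches the primer mix concentration listed in Subheading 2.3.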

3.3 cDNA Synthesis and Preamplification in One Step

1. Mix 1.5 μl of 10 μM random hexamer (see Note 4), 0.75 μl primer mix from Subheading 3.2, 5 μl 5× QIAGEN One-Step RT-PCR buffer, 1 μl 10 mM nucleotide dNTP mix, 1.25 μl 25 mM MgCl2, 1 μl QIAGEN enzyme mix, 3 μl sample, and RNase-free water to a final volume of 25 μl.


Fig. 2 Outline of the workflow. First, total RNA is extracted from swine clinical specimens. cDNA synthesis and preamplification are performed in a one-tube step. Then sample mixes (preamplified cDNA, master mix, and sample loading reagent) and assay mixes (primer–probe pairs and assay loading reagent) are prepared and applied to the IFC in the sample and assay inlets, respectively. Sample mix and assay mix are combined in the microfluidic chambers by loading of the IFC in the IFC controller MX. The chip is then transferred to the Biomark HD, where the real-time PCR is performed. Results are displayed as a heat map, representing Cq values for the individual real-time PCR reactions in each chamber

2. Perform the cDNA synthesis and preamplification on a thermocycler at 50 °C for 30 min, then inactivate the enzyme at 95 °C for 15 min, followed by 24 cycles of 94 °C for 10 s, 54 °C for 30 s, and 72 °C for 10 s (see Note 5).
3. Store the preamplified cDNA at −20 °C.

3.4 Preparation of the 48.48 DA IFC and qPCR

1. Prepare sample mix by mixing 3 μl 2× TaqMan Gene Expression Master Mix, 0.3 μl 20× sample loading reagent, and 2.7 μl preamplified cDNA in a 96-well plate.
2. Prepare primer–probe mix for each assay by mixing 10 μl forward primer (100 μM), 10 μl reverse primer (100 μM), and 10 μl TaqMan probe (30 μM). Prepare assay mix using 3 μl of each primer–probe mix with 3 μl 2× assay loading reagent in a 96-well plate (see Note 6).
3. Vortex the two 96-well plates with the sample and assay mix. Spin the 96-well plates briefly.
4. Actuate both check valves on the IFC (Fig. 1) using the syringe with control line fluid (with the shipping cap in place). Then remove the shipping cap and inject the control line fluid into both check valves, using one syringe per valve and slowly depressing the plunger fully.
5. Prime the IFC in the MX controller (10 min) (see Note 7) and use within 1 h of priming. The priming step allows for subsequent sample and assay loading.
6. Load 4.9 μl sample mix in each sample inlet on the right side of the IFC, followed by 4.9 μl of assay mix in each assay inlet on the left side of the IFC, using a multichannel pipette (see Fig. 1 for loading map). Ensure there are no air bubbles present in the inlets, as these potentially disrupt all the PCR reactions downstream of the inlet (see Note 8).
7. Place the IFC back in the MX controller for loading and mixing of the samples with assays in the microfluidics chambers (55 min).
8. Place the IFC in the Biomark HD PCR instrument and use the following cycling conditions: 15 min at 95 °C, followed by 40 cycles at 94 °C for 10 s, 54 °C for 30 s, and 72 °C for 10 s (total time 1.5 h) (see Note 9).

Fig. 3 An example of a heat map showing the specificity of the qPCR assays included on the SIV 48.48DA by testing six virus isolates with known subtype (based on full genome sequencing). At the top: The qPCR assays in two different primer/probe concentrations (indicated by the numbers 2 or 4). To the left: The virus isolates and a no-template control (NTC). Each square corresponds to a single real-time PCR reaction. The Cq value for each reaction is indicated by a colour; the corresponding colour scale is presented in the legend on the right. A black square is interpreted as a negative result

3.5 Data Processing

Data acquired on the Biomark HD PCR instrument is analyzed with the Fluidigm Real-Time PCR Analysis Software. The PCR results are depicted as individual data points on a heat map (Fig. 3). The positive results are colored and the related Cq values and amplification curves can be seen by clicking the colored squares on the heat map (see Note 10). From the Fluidigm Real-Time PCR Analysis Software the results (Cq values) can be exported to Excel.
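Once the Cq values have been exported, the heat-map layout (samples as rows, assays as columns) can be rebuilt outside the instrument software. The following is a minimal Python sketch, not part of the published protocol. It assumes a hypothetical CSV layout with the columns Sample, Assay, and Cq, in which an empty Cq field stands for a reaction without amplification (a black square in Fig. 3); the actual export layout of the analysis software may differ and would need to be adapted. The function names are illustrative.

```python
import csv
import io


def parse_export(text):
    """Read a hypothetical CSV export with columns Sample, Assay, Cq.

    An empty Cq field is read as None, meaning no amplification.
    """
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        cq = float(rec["Cq"]) if rec["Cq"].strip() else None
        rows.append((rec["Sample"], rec["Assay"], cq))
    return rows


def cq_matrix(rows):
    """Arrange (sample, assay, Cq) records as a sample-by-assay table."""
    samples = sorted({sample for sample, _, _ in rows})
    assays = sorted({assay for _, assay, _ in rows})
    table = {s: {a: None for a in assays} for s in samples}
    for sample, assay, cq in rows:
        table[sample][assay] = cq
    return samples, assays, table


# Illustrative records only, not measured data.
example_export = (
    "Sample,Assay,Cq\n"
    "isolate1,H1,18.2\n"
    "isolate1,N2,\n"
    "NTC,H1,\n"
    "NTC,N2,\n"
)
samples, assays, table = cq_matrix(parse_export(example_export))
```

Each cell of `table` then holds either a Cq value (positive reaction) or None (negative reaction), which mirrors the colored and black squares of the heat map.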

Notes

1. To avoid contamination with PCR products, it is advisable to perform the different steps of the procedure (Fig. 2) in dedicated rooms according to the presence of amplified PCR products, and to impose a unidirectional flow of staff (including cleaning staff and craftsmen) accordingly. RNA extraction should be performed in a separate room with no entry of people or objects from post-PCR rooms. Primers and probes should be prepared in a clean room with no entry of samples, post-PCR objects, or staff, and with a change of lab coat and shoes, preferably as the first task of the work day. cDNA/preamplification mixes should also be prepared in the clean room. Addition of the RNA to the cDNA/preamplification mixes should be performed in a separate clean room, preferably in a UV bench. The remaining steps are all to be considered post-PCR procedures because of the preamplification step, and should be performed in post-PCR rooms. It is preferable, however, to perform the different post-PCR steps in individual rooms, separating the PCR machine for the cDNA/preamplification, the controller and Biomark HD, and the area for pipetting sample and assay mixes onto the chip, as the contamination risk is greater when open tubes containing amplified PCR products are handled.
2. If the clinical specimen is lung tissue, prepare a homogenate consisting of 70 mg lung tissue in 1400 μl RLT buffer containing β-mercaptoethanol (1:100) by homogenization for 3 min at 30 Hz on a TissueLyser II, then centrifuge for 3 min at 12,000 × g. Use 600 μl of the homogenate supernatant for RNA extraction with the RNeasy Mini Kit. The remaining supernatant can be stored at −20 °C for later extraction.
3. The primers used in the primer mix for the cDNA synthesis and preamplification step are the same primers that are used for the real-time PCR run (see Subheading 3.4, step 2). The panel of primers can easily be modified by adding and removing primer pairs, provided that the primers have been optimized for the PCR conditions of the given PCR setup. The Fluidigm preamplification protocol has been validated for multiplex preamplification of up to 96 targets.
4. We have later shown for other target genes that random hexamers can be omitted from the cDNA synthesis reaction and simply be replaced with an equivalent concentration of primer mix. This even resulted in slightly lower Cq values. This should, however, be tested for individual RNA targets.
5. If your starting material is DNA rather than RNA, perform the preamplification using 5 μl 2× TaqMan PreAmp Master (Applied Biosystems), 2.5 μl primer mix (see Subheading 3.2), and 2.5 μl DNA sample. Place in a thermocycler and run the following program: 95 °C for 10 min followed by 14 cycles of 95 °C for 15 s and 60 °C for 4 min, then pause at 4 °C.
6. Instead of using expensive TaqMan probes for optimizing PCR reactions and testing the efficiency of primer pairs on the IFCs, one can start by testing the primers with EvaGreen chemistry (similar to using SYBR Green), omitting any probes. The TaqMan probes can subsequently be ordered and tested. Initial testing of individual primer–probe pairs (assays) can be performed on a Rotor-Gene Q (QIAGEN) real-time PCR machine to avoid using costly IFCs. In our experience, there is in general a good correlation between the performance of assays on the Rotor-Gene Q and Biomark HD PCR platforms. This is likely to be the case for other standard real-time PCR platforms as well.
7. Each type of IFC has its own controller. The IFC controller MX is designed to load the 48.48 DA IFC, the IFC controller HX is designed to load the 96.96 DA IFC, and the IFC controller RX is designed to load the 192.24 DA IFC.
8. If bubbles have been introduced in the inlets when the IFC is set up manually with a multichannel pipette, use a needle to pop them. This is rather time consuming, which is why we have recently acquired a liquid handling robot, Biomek 4000 (Ramcon, Birkerød, Denmark), which has been programmed to load IFCs and does so without introducing bubbles.
9. In each run on the Biomark HD, it is good practice to include positive samples representing each of the assays (primer–probe pairs) on the chip. To spare sample slots for diagnostic samples, positive control samples can be pooled into a single mixed control sample. To test for contamination of the cDNA/preamplification and PCR steps, a cDNA/preamplification negative control and an NTC are run in each setup.
10. One thing to consider when using the Biomark HD high-throughput qPCR platform is the risk of false negative results, which can occur if the sample is very strongly positive. The additional cDNA synthesis and preamplification will result in an even more positive sample, which will appear as light yellow in the heat map (the Cq value will be around 3), and the amplification curve can be questionable. In these cases, the sample should be diluted and retested.
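The check described in Note 10 can be applied to exported Cq values before results are reported. The sketch below is only an illustration: the cutoff of Cq 5.0 is an assumption chosen to sit slightly above the value of about 3 mentioned in the note, and both the function name and the data layout (a mapping from sample name to Cq, with None for no amplification) are hypothetical. A suitable cutoff would have to be set from one's own runs.

```python
def flag_for_dilution(cq_by_sample, low_cq=5.0):
    """Return the samples whose Cq is at or below `low_cq`.

    Such very low Cq values suggest an over-concentrated sample that should be
    diluted and retested. `low_cq` is an assumed cutoff, not a published one.
    Samples without a Cq (None) are left to the normal negative-result handling.
    """
    return sorted(
        sample
        for sample, cq in cq_by_sample.items()
        if cq is not None and cq <= low_cq
    )


# Illustrative values only, not measured data.
example_run = {"sample_A": 3.1, "sample_B": 21.4, "sample_C": None}
to_retest = flag_for_dilution(example_run)
```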

Acknowledgments

This work was supported by the Danish Pig Levy Fund and the Technical University of Denmark.


References

1. Saiki R, Gelfand D, Stoffel S et al (1988) Primer-directed enzymatic amplification of DNA with a thermostable DNA polymerase. Science 239:487–491. https://doi.org/10.1126/science.2448875
2. Liu J, Hansen C, Quake SR (2003) Solving the "world-to-chip" interface problem with a microfluidic matrix. Anal Chem 75:4718–4723. https://doi.org/10.1021/ac0346407
3. Spurgeon SL, Jones RC, Ramakrishnan R (2008) High throughput gene expression measurement with real time PCR in a microfluidic dynamic array. PLoS One 3:e1662. https://doi.org/10.1371/journal.pone.0001662
4. Skovgaard K, Cirera S, Vasby D et al (2013) Expression of innate immune genes, proteins and microRNAs in lung tissue of pigs infected experimentally with influenza virus (H1N2). Innate Immun 19:531–544. https://doi.org/10.1177/1753425912473668
5. Cheung TKW, Poon LLM (2007) Biology of influenza A virus. Ann N Y Acad Sci 1102:1–25. https://doi.org/10.1196/annals.1408.001
6. Wu Y, Wu Y, Tefsen B et al (2014) Bat-derived influenza-like viruses H17N10 and H18N11. Trends Microbiol 22:183–191. https://doi.org/10.1016/j.tim.2014.01.010
7. Vincent A, Awada L, Brown I et al (2014) Review of influenza A virus in swine worldwide: a call for increased surveillance and research. Zoonoses Public Health 61:4–17. https://doi.org/10.1111/zph.12049
8. Watson SJ, Langat P, Reid SM et al (2015) Molecular epidemiology and evolution of influenza viruses circulating within European swine between 2009 and 2013. J Virol 89:9920–9931. https://doi.org/10.1128/JVI.00840-15
9. Goecke NB, Krog JS, Hjulsager CK et al (2018) Subtyping of swine influenza viruses using a high-throughput real-time PCR platform. Front Cell Infect Microbiol 8:165. https://doi.org/10.3389/fcimb.2018.00165

Chapter 3

RT-qPCR Detection of Low-Copy HIV RNA with Yin-Yang Probes

Dmitry E. Kireev, Valentina M. Farzan, German A. Shipulin, Vladimir A. Korshun, and Timofei S. Zatsepin

Abstract

Accurate monitoring of low levels of viral load (the number of viral particles per milliliter of plasma) in HIV-infected patients is important for evaluating the progress of antiretroviral therapy. The general approach for detection of low-copy HIV RNA is reverse transcription combined with quantitative real-time PCR based on fluorescence detection. The selection of primers and the structure of fluorogenic oligonucleotide probes are crucial for the sensitivity and accuracy of the assay. In this chapter, we report the RT-qPCR protocol for detection of low-copy HIV RNA using double-stranded Yin-Yang DNA probes containing identical fluorescent dyes on each strand of the probe. Dye residues attached to the 3′-end of an oligonucleotide and the 5′-end of the complementary oligonucleotide form a self-quenched aggregate in a Yin-Yang duplex probe, and light up in fluorescence upon probe strand displacement by the target sequence amplified in the course of PCR. Among several fluorescent dyes tested (R6G, ROX, Cy5), the ROX-labeled Yin-Yang probes showed a better fluorescence increase and lower Ct values. All the homo Yin-Yang probes were superior to the corresponding dye–quencher probes and allowed reliable detection of 10–10,000 copies of HIV RNA per mL.

Key words Molecular beacon, RT-qPCR, HIV-1, Diagnostics, Yin-Yang probes

1

Introduction

Real-time or quantitative PCR (qPCR) based on fluorescence detection by specific oligonucleotide probes [1] is an automated analysis without any postreaction manipulations. Its speed, simplicity, and convenience have made qPCR a truly universal tool for analysis of genetic materials and the technique of choice for clinical diagnostics, particularly for monitoring of viral infections [2]. There are many types of fluorogenic oligonucleotide probes for real-time qPCR [3]. Yin-Yang probes (displacing probes or double-stranded probes) [4, 5] are composed of two labeled complementary oligonucleotides of different lengths. In common Yin-Yang probes the longer strand is labeled with a fluorophore and the shorter strand is labeled with a nonfluorescent quencher. Under detection conditions, the probes are double-stranded and fluorescence is quenched due to the close proximity of the dye to the quencher. The target DNA displaces the shorter strand, which leads to an increase in fluorescence. These probes have a number of advantages over the more popular molecular beacons and TaqMan probes. Design and synthesis of Yin-Yang probes are easier, and they are much more tolerant to mutations because of the greater length of the oligonucleotide part. The last point is crucial for HIV detection, as the virus is well known for its high mutation rate. Recently, we developed Yin-Yang probes carrying two identical fluorescent dyes [6] instead of a dye–quencher pair. The synthesis of oligonucleotides modified with fluorescent dyes is based on a high-throughput solid-phase "click" technique [7]. The protocol includes solid-phase oligonucleotide synthesis of Yin-Yang probes, purification and characterization by LC-ESI-MS, in-house kit production, HIV-1 RNA purification, and quantification.

Kira Astakhova and Syeda Atia Bukhari (eds.), Nucleic Acid Detection and Structural Investigations: Methods and Protocols, Methods in Molecular Biology, vol. 2063, https://doi.org/10.1007/978-1-0716-0138-9_3, © Springer Science+Business Media, LLC, part of Springer Nature 2020

2

Materials

2.1 Oligonucleotide Synthesis

2.1.1 Standard Reagents for Oligonucleotide Synthesis

1. Protected 2′-deoxynucleoside phosphoramidites for mild deprotection (T, dC(Ac), dA(Bz), dG(DMF)).
2. Deprotection solution (3% trichloroacetic acid in DCM).
3. Oxidizing solution (0.02 M iodine in THF–pyridine–water (80:10:10, v/v/v)).
4. Capping solutions (Cap A: 10% acetic anhydride, 16% N-methylimidazole in THF; Cap B: 10% pyridine in THF).
5. Acetonitrile (DNA synthesis grade).
6. Solution of activator (0.25 M 5-ethylthiotetrazole in acetonitrile).
7. 3′-Phosphate solid support (CPR I).
8. TAMRA cocktail (tert-butyl amine–methanol–water, 1:1:2, v/v/v).
9. 5′-Alkyne phosphoramidite.
10. ROX azide.
11. Alkyne-CPG.
12. Dimethylacetamide (DMAA).
13. CuI·P(OEt)3 catalyst.

2.1.2 Standard Reagents for RP-HPLC, IE-HPLC, and PAGE Oligonucleotide Purification, LC-ESI-MS, and Gel-Filtration Characterization

1. Tris–borate buffer (50 mM Tris, 50 mM boric acid, 1 mM EDTA, pH 8.3).
2. Tris–borate buffer (5 mM Tris, 5 mM boric acid, pH 8.3).
3. 80% formamide.
4. Acrylamide.
5. Bis-acrylamide.
6. Tris.
7. Boric acid.
8. Urea.
9. Ammonium acetate.
10. Disodium EDTA.
11. Ammonium acetate.
12. Sodium perchlorate.
13. Tris hydrochloride.
14. HPLC-grade diisopropylamine.
15. MS-grade 1,1,1,3,3,3-hexafluoroisopropanol.
16. MS- and HPLC-grade water.
17. MS- and HPLC gradient grade acetonitrile.

2.2 Clinical Sample

Human plasma (ACD-A and EDTA) specimens may be used with the assay.

2.3 QIAamp Viral RNA Mini Kit

QIAamp Viral RNA Mini Kit 50 (Qiagen, cat #52904) should be used for the extraction of plasma samples with HIV RNA concentration higher than 1000 copies per mL. Extraction volume is 0.1 mL.

2.4 Abbott Sample Preparation Reagents

Abbott Sample Preparation Reagents (Abbott, cat #04J70–24) should be used for the extraction of plasma samples with low HIV RNA concentration. Extraction volume should be 1 mL.

2.5 Reagents for RT-qPCR

1. KAPA2G Fast HotStart PCR Kit (Kapa Biosystems, cat #KK5530), including buffer, polymerase, dNTP mix, and MgCl2.
2. M-MLV Reverse Transcriptase (cat #M170A, Promega).
3. Nuclease-Free Water (ThermoFisher Scientific, cat #AM9932).
4. Primers and probes. Primers can be purchased from any commercial supplier; synthesis of Yin-Yang probes is described below.
5. Positive control samples, made by dilution of the HIV-1 third WHO International Standard 10/152 with HIV-negative plasma to final concentrations of 100, 1000, and 10,000 copies of HIV-1 RNA per mL.
6. Negative control samples: HIV-negative plasma.

3

Methods

3.1 Solid Phase Oligonucleotide Synthesis

1. Assemble the 5′-alkyne and 3′-alkyne oligonucleotides, alkyne–ACA gCA gTA CAA ATg gCA gTA TTC ATT CAC AAT TTT AAA AgA A–3′-phosphate (YY1) and TgC CAT TTg TAC TgC TgT–alkyne (YY2), on a DNA synthesizer (e.g., ABI 3400 or MM-12) at 1 μmol scale by the phosphoramidite method according to the manufacturer's recommendations in DMT-off mode (see Note 1), using 5′-alkyne phosphoramidite together with 3′-phosphate CPG (YY1) and 3′-alkyne CPG (YY2).
2. Solid-phase ROX derivatization can be carried out in automated or manual mode. In automated mode, install a 10 mM solution of the ROX azide in dimethylacetamide and 100 mM CuI·P(OEt)3 in separate phosphoramidite bottles, and deliver the azide and catalyst solutions (50 μL each) to the column with simultaneous mixing.
3. After coupling for 1 h, the procedure should be repeated in triplicate, followed by thorough washing with acetonitrile (6 × 100 μL).
4. In manual (off-synthesizer) mode, carry out the procedure using syringes under an argon atmosphere to protect the catalyst from oxidation and to exclude DNA damage by activated oxygen. After completion of the synthesis, dry the column in vacuo and transfer the CPG with the protected oligonucleotide into a screw-capped tube (1.5 mL).
5. Add 500 μL of TAMRA cocktail to the protected ROX-modified oligonucleotides immobilized on CPG and perform deprotection at 55 °C overnight under gentle shaking. Cool the solution for 20 min at −20 °C, filter off the solids, wash them thoroughly with water (2 × 200 μL), and evaporate the combined washings to dryness on a SpeedVac vacuum concentrator.

3.2 Purification of Dye-Labeled Oligonucleotides by Denaturing Polyacrylamide Gel Electrophoresis (PAGE) Followed by LC-MS Analysis and Final RP-HPLC Purification

3.2.1 PAGE Purification

The denaturing gel electrophoresis of oligonucleotides is performed in 15% PAGE containing 7 M urea in Tris–borate buffer (50 mM Tris, 50 mM boric acid, 1 mM EDTA, pH 8.3).

1. Prepare a large vertical gel chamber (gel size 200 × 200 × 1.5 mm) and prerun the gel for 3 h at 200 V to remove salts. Dissolve the oligonucleotide in 500 μL of 80% formamide, denature the sample by heating at 90 °C for 3 min, then rapidly cool on ice.
2. Flush the wells of the gel with Tris–borate buffer to remove precipitated urea and load the solution of the oligonucleotide in 80% formamide onto the gel (50 μL per well). Run PAGE at constant voltage (500 V) and stop electrophoresis when the marker dyes have migrated an appropriate distance. For 25- to 45-mer primers a running time of 3 h is sufficient.
3. The slab gel can be visualized under a VL-6 UV lamp (Vilber Lourmat) using a fluorescent TLC plate (TLC Silica gel 60G F254 or analog) as a substrate. Cut out the dark red fluorescent bands with a scalpel. Oligonucleotides are recovered from the gel by electroelution with Elutrap (Whatman) for 3 h in Tris–borate buffer (5 mM Tris, 5 mM boric acid, pH 8.3).

3.2.2 LC-ESI-MS QC

1. Check the molecular mass of the product by running LC-MS. Monoisotopic molecular mass for YY1 = 14197.9, for YY2 = 6357.9. If you use alkyne/ROX derivatives other than those described in this protocol, you need to recalculate the masses.
2. LC-ESI-MS analysis of oligonucleotides can be performed using an Agilent 1260-Bruker Maxis Impact system as described earlier [6]. The HPLC is equipped with a 2.1 × 50 mm Jupiter C18 column (5 μm, Phenomenex); buffer A: 10 mM diisopropylamine, 15 mM 1,1,1,3,3,3-hexafluoroisopropanol; buffer B: 10 mM diisopropylamine, 15 mM 1,1,1,3,3,3-hexafluoroisopropanol, 80% acetonitrile. Salts are washed out with buffer A (4 CV) followed by a step of 100% B (2 CV) at a flow rate of 0.3 mL/min; temperature 45 °C. The MS analysis of oligonucleotides is carried out in negative mode (capillary voltage 3500 V, dry temp 160 °C). Raw spectra are deconvoluted by the maximum entropy method using Bruker software.
3. Compare the experimental data with the calculated molecular mass. If the difference for the main peak exceeds 5 Da, either the oligonucleotide should be discarded and the synthesis repeated, or the mass spectrometer should be recalibrated.
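The acceptance rule in step 3 is a simple tolerance check and can be written down explicitly. The following Python sketch uses the calculated masses and the 5 Da limit given above; the function name and the observed values in the example are hypothetical and serve only as an illustration.

```python
# Calculated monoisotopic masses (Da) from step 1 of this subheading.
CALCULATED_MASS_DA = {"YY1": 14197.9, "YY2": 6357.9}

# Maximum accepted deviation of the main deconvoluted peak (Da), from step 3.
TOLERANCE_DA = 5.0


def mass_within_tolerance(name, observed_mass_da):
    """Return True if the observed main-peak mass deviates by at most 5 Da."""
    deviation = abs(observed_mass_da - CALCULATED_MASS_DA[name])
    return deviation <= TOLERANCE_DA


# Illustrative observed masses only, not measured data.
yy1_ok = mass_within_tolerance("YY1", 14199.0)  # deviation of about 1.1 Da
yy2_ok = mass_within_tolerance("YY2", 6364.0)   # deviation of about 6.1 Da
```

A failed check does not by itself say whether the product or the instrument calibration is at fault, which is why step 3 lists both resynthesis and recalibration as possible actions.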

3.2.3 Reverse-Phase HPLC Purification and Analysis of Oligonucleotides

1. The HPLC purification of oligonucleotides is carried out on a 4.6 × 250 mm Jupiter C18 column (5 μm, Phenomenex); buffer A: 50 mM ammonium acetate in acetonitrile–water (5/95, v/v); buffer B: 30 mM ammonium acetate in acetonitrile–water (80/20, v/v); a gradient of buffer B: 0 → 15% (1 CV), 15 → 50% (10 CV); a flow rate of 1 mL/min; temperature 45 °C. Pool the appropriate fractions, evaporate to ~100 μL, and add 1 mL of ethanol (molecular biology grade) to precipitate the oligonucleotide and to remove most of the ammonium acetate.
2. Cool the mixture in a freezer at −20 °C for 1 h, and isolate the oligonucleotide by centrifugation for 5 min at 10,000 × g, followed by final drying to remove traces of ethanol.
3. Dissolve the product in water, and quantify the oligonucleotide by measuring the UV absorbance at 260 nm; for YY1 (1 OD260 = 1.71 nmol), for YY2 (1 OD260 = 3.91 nmol). Store the solution frozen at −20 °C.
4. Check the purity of the product by LC-ESI-MS as described above and by RP-HPLC. The RP-HPLC analysis of oligonucleotides is carried out on a 4.6 × 250 mm Jupiter C18 column (5 μm, Phenomenex); a linear gradient of buffer B: 0 → 100% (8 CV); a flow rate of 1 mL/min; temperature 45 °C.

3.3 Preparation of Yin-Yang Probes, Their Purification by Nondenaturing IE-HPLC and Analysis by Gel Filtration

Yin-Yang probes should be annealed from single-stranded YY1 and YY2 oligonucleotides by heating followed by slow cooling, purified by nondenaturing ion-exchange HPLC, and further characterized by nondenaturing ion-exchange HPLC and gel filtration.

1. Dissolve YY1 (50 nmol) and YY2 (60 nmol) oligonucleotides in 30 mM Tris–HCl (pH 7.0) buffer at room temperature, then gently heat the mixture to 95 °C (for 5 min) and cool it down to room temperature over 3 h.
2. Nondenaturing IE-HPLC purification of Yin-Yang probes is carried out on a 7.5 × 75 mm TSK-gel SuperQ-5PW column (10 μm, TOSOH); buffer A: 20 mM Tris–HCl (pH 7.0); buffer B: 20 mM Tris–HCl (pH 7.0), 800 mM sodium perchlorate; a gradient of buffer B: 0% (1 CV), 0 → 80% (10 CV); a flow rate of 1 mL/min; room temperature. Pool the appropriate fractions, evaporate to ~100 μL, and add 1 mL of acetone (molecular biology grade) to precipitate the oligonucleotides and to remove most of the sodium perchlorate.
3. Cool the mixture in a freezer at −20 °C for 1 h, and isolate the oligonucleotides by centrifugation for 5 min at 10,000 × g.
4. Dissolve the colored solid in water (100 μL, molecular biology grade) and add 1 mL of ethanol (molecular biology grade) to precipitate oligonucleotides and to remove most of ammonium acetate.
5. Cool the mixture in a freezer at −20 °C for 1 h, and isolate the primer by centrifugation for 5 min at 10,000 × g, followed by final drying to remove traces of ethanol.
6. Dissolve the solid in water (100 μL, molecular biology grade) and quantify the Yin-Yang probe by measuring the UV absorbance at 260 nm (1 OD260 = 1.14 nmol). Adjust the solution to a final Yin-Yang probe concentration of 100 μM using 10× buffer (100 mM Tris–HCl buffer, pH 7.0) and molecular biology grade water.
7. Heat the solution to 95 °C (for 5 min) and cool it down to room temperature over 3 h (see Note 2).

8. Analyze Yin-Yang probe duplex purity by gel filtration using a Superdex 75 10/300 GL column (GE Healthcare) in 50 mM Tris–HCl (pH 7.0).

3.4 Analysis of Clinical Samples

3.4.1 RNA Extraction (See Note 4)

1. Separate plasma from whole blood (5 mL) by centrifugation at 800–1600 × g for 20 min at room temperature (see Note 5). Transfer the plasma to sterile polypropylene tubes after centrifugation.
2. For RNA extraction from plasma samples with an RNA concentration higher than 1000 copies per mL, use the QIAamp Viral RNA Mini Kit and follow the QIAamp® Viral RNA Mini Handbook [8]. The extraction protocol is described on pp. 27–30 of the handbook (see Note 3). Use 70 μL of Buffer AVE and a single elution for the last step of extraction (see Notes 6 and 7).
3. For RNA extraction from plasma samples with low HIV RNA concentration, use Abbott Sample Preparation Reagents according to the manufacturer's protocol. The extraction protocol is described on pp. 3–4 of the package insert (see Notes 6 and 7).

3.4.2 Sample Preparation for Reverse Transcription and Amplification (See Notes 4 and 7)

1. Prepare the master mix for RT-qPCR (see Note 8). Add the following components: 20 μL KAPA2G Buffer A (5×), 2 μL KAPA dNTP Mix (10 mM each), 2 μL KAPA MgCl2 (25 mM), 0.4 μL KAPA2G Fast HotStart DNA Polymerase (5 U/μL), 4 μL M-MLV Reverse Transcriptase, 0.2 μM forward primer, 0.2 μM reverse primer, 0.15 μM Yin-Yang probe, and nuclease-free water to a final master mix volume of 50 μL per reaction (see Note 6). Mix gently by pipetting or vortexing. Transfer 50 μL of the prepared master mix to the PCR tubes.
2. Add 50 μL of the RNA solution obtained from clinical samples to the prepared tubes using tips with an aerosol barrier and mix carefully by pipetting.
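Scaling the master mix to a batch of reactions, including the one-reaction margin recommended in Note 8, is simple multiplication. The sketch below, in Python, is an illustration rather than part of the protocol. It scales only the components whose per-reaction volumes are stated in step 1; the primer, probe, and water volumes depend on the stock concentrations, which are not given here, so they are lumped together as the remainder of the 50 μL. All names in the code are illustrative.

```python
# Per-reaction volumes (uL) stated in step 1 of this subheading.
PER_REACTION_UL = {
    "KAPA2G Buffer A (5x)": 20.0,
    "KAPA dNTP Mix": 2.0,
    "KAPA MgCl2": 2.0,
    "KAPA2G Fast HotStart DNA Polymerase": 0.4,
    "M-MLV Reverse Transcriptase": 4.0,
}
MASTER_MIX_UL = 50.0  # final master mix volume per reaction

# Primer, probe and water volumes are not specified individually in the text,
# so they are reported together as the remaining volume.
REMAINDER_KEY = "primers + probe + nuclease-free water"


def scale_master_mix(n_samples, extra_reactions=1):
    """Scale the master mix for n_samples plus a pipetting margin."""
    n = n_samples + extra_reactions
    scaled = {name: round(vol * n, 2) for name, vol in PER_REACTION_UL.items()}
    remainder = MASTER_MIX_UL - sum(PER_REACTION_UL.values())
    scaled[REMAINDER_KEY] = round(remainder * n, 2)
    return scaled


# Nine samples are prepared as ten reactions, as in Note 8.
mix_for_nine = scale_master_mix(9)
```

For nine samples this gives 200 μL of buffer, 4 μL of polymerase, and 216 μL left for primers, probe, and water, for a total of 500 μL.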

3.4.3 RT-qPCR Analysis

1. Program the qPCR machine with the following amplification program: 59 °C for 30 min; then 4 cycles of 95 °C for 40 s and 46 °C for 30 s; then 6 cycles of 92 °C for 30 s and 60 °C for 30 s; then cycling at 92 °C for 30 s, 56 °C for 20 s (plus 2 s at each subsequent cycle), and 35 °C for 40 s (with fluorescence detection in the Yellow/Orange/Red channel, depending on the dye used). Insert the tubes into the machine and start the run.
2. First, analyze the "raw" data. On the ABI 7500 instrument, go to "Analysis," "Amplification Plot" and set Plot Type to "ΔRn vs Cycle" and Graph Type to "Linear." On the Rotor-Gene instrument, open the Raw Channel window for the appropriate dye and click "Auto-Scale." Positive samples should give sigmoidal fluorescence curves, whereas negative samples should show no increase or only a slight slope of fluorescence increase (see Note 9).
3. On the ABI 7500 instrument, the Ct (cycle threshold) results can be taken from the "View Well Table" table and exported using the export function: "File," "Export ...," "Start export." On the Rotor-Gene instrument, analyze the fluorescence data under "Analysis" and choose the appropriate fluorescence channel. The threshold cycles are presented in the resulting table. To export the data, right-click on the results table and select "Export to Excel."
4. Build a calibration curve of Ct (averaged over three repeats) against the logarithm of concentration for the positive control samples of HIV-1 RNA and fit it with a linear equation. To quantify HIV-1 RNA in a sample, average the Ct over three technical repeats and calculate the concentration using the equation obtained for the positive control samples.
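The calibration in step 4 amounts to fitting a straight line, Ct = slope × log10(concentration) + intercept, to the positive controls and inverting it for each sample. The following Python sketch shows one way to do this with an ordinary least-squares fit. It is an illustration under stated assumptions: the Ct values in the example are invented for demonstration, the function names are hypothetical, and a spreadsheet trendline would serve equally well.

```python
import math


def fit_calibration(standards):
    """Fit Ct = slope * log10(copies per mL) + intercept by least squares.

    `standards` maps the concentration of each positive control (copies/mL)
    to its Ct averaged over three repeats.
    """
    xs = [math.log10(conc) for conc in standards]
    ys = list(standards.values())
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_per_ml(ct_repeats, slope, intercept):
    """Average the technical repeats and invert the calibration line."""
    mean_ct = sum(ct_repeats) / len(ct_repeats)
    return 10 ** ((mean_ct - intercept) / slope)


# Illustrative Ct values only, not measured data.
example_standards = {100: 34.0, 1000: 30.6, 10000: 27.2}
slope, intercept = fit_calibration(example_standards)
estimate = copies_per_ml([30.5, 30.6, 30.7], slope, intercept)
```

With these example numbers the fitted slope is about −3.4 cycles per tenfold change in concentration, and a sample with a mean Ct of 30.6 is read back as about 1000 copies per mL.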

4

Notes

1. Alkyne phosphoramidite should be used as a 0.05 M solution in acetonitrile (DNA synthesis grade), and the coupling time should be increased to 10 min.
2. Store the solution of Yin-Yang probe without freezing to maintain it as a duplex. If you need long-term storage at −20 or −80 °C, repeat the steps in Subheading 3.3, steps 4–8, after thawing and before further use.
3. The extraction protocol contains step 10, which is recommended by the manufacturer. It is very important to perform this step to maximize the sensitivity of the assay. Residual buffer AW2 has a detrimental effect on samples with a low concentration of viral RNA.
4. This stage should be carried out in class II or III laminar flow hoods or biosafety cabinets using appropriate personal protection: lab coat, gloves, and glasses.
5. Plasma separation should be done within 6 h after blood collection.
6. It is recommended to carry out RT-qPCR within 30 min after RNA extraction. If this is not possible, samples with purified RNA should be frozen immediately at −20 or −80 °C.
7. For each plasma sample, purify RNA in three technical repeats. In addition, negative (×3) and three positive (100, 1000, and 10,000 copies of HIV-1 RNA per mL) samples (×3 each) should be used to verify the RT-qPCR analysis.
8. The master mix should be prepared with a margin. For example, if you need to perform RT-qPCR on nine samples, prepare a mixture for ten reactions.
9. If you observe only a negligible increase of fluorescence for the positive control samples, the system is not working, and you should check the enzymes, probes, and buffer components for integrity. If you observe a significant sigmoidal increase for the negative control samples, the reaction is contaminated with HIV-1 RNA or amplicons from previous runs. Decontaminate your workplace and repeat the protocol from Subheading 3.4.1.

References

1. Heid CA, Stevens J, Livak KJ, Williams PM (1996) Real time quantitative PCR. Genome Res 6:986–994
2. Gullett JC, Nolte FS (2015) Quantitative nucleic acid amplification methods for viral infections. Clin Chem 61:72–78
3. Navarro E, Serrano-Heras G, Castaño MJ, Solera J (2015) Real-time PCR detection chemistry. Clin Chim Acta 439:231–250
4. Li Q, Luan G, Guo Q, Liang J (2002) A new class of homogeneous nucleic acid probes based on specific displacement hybridization. Nucleic Acids Res 30:e5
5. Cheng J, Zhang Y, Li Q (2004) Real-time PCR genotyping using displacing probes. Nucleic Acids Res 32:e61
6. Farzan VM, Kvach MV, Aparin IO et al (2019) Novel homo Yin-Yang probes improve sensitivity in RT-qPCR detection of low copy HIV RNA. Talanta 194:226–232
7. Farzan VM, Ulashchik EA, Martynenko-Makaev YV et al (2017) Automated solid-phase click synthesis of oligonucleotide conjugates: from small molecules to diverse N-acetylgalactosamine clusters. Bioconjug Chem 28:2599–2607
8. QIAamp® viral RNA mini handbook. https://www.qiagen.com/us/resources/resourcedetail?id=c80685c0-4103-49ea-aa72-8989420e3018&lang=en
9. Abbott RealTime HIV-1 Package Insert

Chapter 4

Solid-Phase Hybridization Assay for Detection of Mutated Cancer DNA by Fluorescence

Maria Taskova and Kira Astakhova

Abstract

We report a straightforward protocol for the detection of mutated DNA extracted from cancer cells. The assay combines a stepwise solid-phase hybridization with a readout by fluorescence emission. We detect a single-nucleotide polymorphism in two human oncogenes, BRAF and EGFR, and reach a limit of detection of 300 pM by conventional fluorometry. The protocol described herein may be used as a foundation for the development of automated, optimized assays capable of detecting mutant DNA and RNA in vitro and in cells.

Key words DNA oncogene, Fluorescent oligonucleotides, Hybridization assay, Fluorometry

1

Introduction

Innovative tools for diagnostics at the molecular level are in high demand [1–3]. It is hypothesized that early detection of mutations in the human genome will prevent many untreatable diseases. Fluorescent oligonucleotides are widely explored as a platform for screening genomic DNA and intricate biological processes [4, 5]. Their development and implementation have had a huge impact on diagnostic methods, including next-generation sequencing and real-time polymerase chain reaction (PCR). Despite the undeniable importance of these techniques, there is an immense need for routine, simple, and inexpensive detection of genomic mutations [5]. For this purpose, many oligonucleotides covalently conjugated with fluorescent dyes have been developed [6–10]. The aim is to develop a fluorescent oligonucleotide capable of distinguishing a single-nucleotide polymorphism in a particular gene and of reporting it efficiently by a change in fluorescence. Among these methods are exciton fluorescent probes [11], molecular beacons [12], guanine quenching probes [13], and probes containing fluorescent nucleobase analogues [14].

Kira Astakhova and Syeda Atia Bukhari (eds.), Nucleic Acid Detection and Structural Investigations: Methods and Protocols, Methods in Molecular Biology, vol. 2063, https://doi.org/10.1007/978-1-0716-0138-9_4, © Springer Science+Business Media, LLC, part of Springer Nature 2020


Fig. 1 Illustration of the detection probes and perylene structures; B-Perylene is a DNA sequence with two internal perylene dyes. E-Perylene is a DNA sequence that contains two perylene dyes covalently attached at the 5′ terminus. It contains three locked nucleic acid (LNA) monomers

Following the need for advanced solutions, and in addition to the described methods, we describe a combined solid-phase hybridization–fluorescence assay for reliable detection of mutated cancer DNA. First, we designed biotin-capturing probes and fluorescent detection probes. The capturing probes contain biotin on the 50 termini capable to attach to streptavidin-coated magnetic beads and further to hybridize with the oncogene of interest. The detection probes covalently conjugated with perylene dye (represented in Fig. 1) hybridize strongly to the target oncogene and weaker to the mismatched wild-type target. After removal of the detection oligonucleotides binding off-target or the wild type target by denaturation, the remaining fluorescence intensity at the emission maximum results from the binding exclusively by the mutant, i.e. full matched, target. From a practical standpoint, we observe that the magnetic beads may change the physical appearance under frequent heating (80–90 C). This could be a challenge in the final step when the fluorescent detection probe is dehybridized and detached from the beads. To investigate this upshot, we also explore the assay specificity using chemical denaturation (with urea and dimethyl sulfoxide (DMSO)) methods [15]. Based on the results (Figs. 2 and 3), both probes show good target specificity, that is, significant difference in fluorescence intensity at emission maximum for the mutant target and the wild-type control. Moreover, both probes, the B-Perylene and the E-perylene display similar trend in fluorescence intensity for three different denaturation methods. For both probes, the denaturation using urea in the last step give the highest fluorescent intensity and the highest target specificity. With the assay we reach the limit of detection (LOD) of 300 pM for single nucleotide polymorphism in DNA extracted from cell line. Taking into account the simplicity of the design

Solid-Phase Hybridization Assay for Detection of Mutated Cancer DNA. . .

Fig. 2 Fluorescence intensity values for B-Perylene using thermal denaturation (TD) or chemical denaturation (urea or DMSO) in the final dehybridization step

Fig. 3 Presentation of the fluorescence intensity values for E-Perylene using thermal denaturation (TD) or chemical denaturation (urea or DMSO) in the final dehybridization step

and the resulting metrics, our assay is comparable and even superior to many reported methods for fluorescence-based detection of nucleic acids [12, 16, 17].

Maria Taskova and Kira Astakhova

2 Materials

Prepare all solutions using ultrapure Milli-Q water. Purchase molecular-grade reagents and use them without additional purification. Perform the entire procedure, including the washing steps, at room temperature unless specified otherwise.

2.1 Solid-Phase Hybridization DNA Fluorescent Assay

1. Phosphate buffered saline (1X PBS): Prepare 1X PBS using phosphate buffered saline tablets. Dissolve one tablet in 200 mL Milli-Q water to obtain 0.137 M NaCl, 0.0027 M KCl, and 0.01 M phosphate buffer (1X PBS). Adjust the pH to 7.4 at room temperature.
2. Phosphate buffered saline (2X PBS): Prepare 2X PBS using phosphate buffered saline tablets. Dissolve two tablets in 200 mL Milli-Q water to obtain 0.274 M NaCl, 0.0054 M KCl, and 0.02 M phosphate buffer (2X PBS). Adjust the pH to 7.4 at room temperature.
3. Urea: Make 50 mL of 2 M urea by dissolving 6.006 g urea in 35 mL Milli-Q water. When the solid urea has dissolved, adjust the volume to 50 mL with Milli-Q water.
4. PCR tubes: Use sterile, low-binding 0.2 mL PCR tubes.
5. Magnetic rack for separation: Use a magnetic rack suitable for 0.2 mL PCR tubes.
6. Beads: Use magnetic beads coated with streptavidin, for example, 2.8 μm beads at 10 mg/mL. Store the beads at 4 °C.
7. Target DNA from cancer cell lines: Use purified genomic DNA from cancer cell lines and from a healthy cell line as a control. For example, use A101D, a human skin malignant melanoma cell line (mutated BRAF), and NCI-H1975, a human non-small cell lung carcinoma cell line (mutant EGFR). As a negative control, use DNA extracted from the healthy cell line HMC-1.
8. Biotin-capturing probes: Synthesize the probes on an automated DNA synthesizer, followed by purification and characterization, or purchase them from a commercial supplier (Table 1).
9. Fluorescent oligonucleotide detection probes: Use enriched (double-labeled) fluorescent oligonucleotide probes (Table 1). Alternatively, synthesize the B- and E-oligonucleotide probes on an automated DNA synthesizer. Use a standard protocol in DMT-ON mode at a scale of 1.0 μmol. Incorporate the LNA monomers and the bis-alkyne nucleic acid scaffold (the synthesis of which will be published elsewhere). Conjugate B-Perylene and E-Perylene by copper-catalyzed azide–alkyne cycloaddition, reacting the double alkyne-modified oligonucleotides (B- and E-oligonucleotide) with commercially purchased azide-perylene dye [18].


Table 1 Sequences of capturing and fluorescent detection probes for BRAF V600E and EGFR L858R used in the assay

BRAF V600E
Biotin-capturing probe: 5′-/Biosg/GAA AAT ACT ATA GTT GAG ACC TTC AAT GAC TTT CTA GTA ACT CAG CAG CAT CTC AGG GCC AAA AAT TTA ATC AGT GGA AAA ATA GCC TCA ATT CTT ACC ATC CAC AAA ATG GAT CCA GAC
B-Perylene: 5′-GATTTC/2Perylene/CTGTAGCT

EGFR L858R
Biotin-capturing probe: 5′-/Biosg/CTG GAG AGC ATC CTC CCC TGC ATG TGT TAA ACA ATA CAG CTA GTG GGA AGG CAG CCT GGT CCC TGG TGT CAG GAA AAT GCT GGC TGA CCT AAA GCC ACC TCC TTA CTT TGC CTC CTT CTG
E-Perylene: 5′-/2Perylene/GCA GTT TGG C + C + C + GCC CAA AAT

Fig. 4 Illustration of the solid-phase hybridization assay

10. Incubation program: For programmed incubation, use a PCR machine.
11. Fluorescence measurements: For fluorescence quantification, use a suitable microplate reader to measure fluorescence intensity at the emission maximum or over a suitable fluorescence range (440–488 nm).
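The mass of urea given in item 3 follows from molarity × volume × molar mass. A minimal sketch of that arithmetic is shown here; the function name is illustrative, and the molar mass of urea (60.06 g/mol) is a standard value that is not stated in the protocol.

```python
def mass_for_molar_solution(molarity_mol_per_l, volume_ml, molar_mass_g_per_mol):
    """Grams of solute needed to reach the requested molarity in the final volume."""
    if volume_ml <= 0:
        raise ValueError("volume must be positive")
    return molarity_mol_per_l * (volume_ml / 1000.0) * molar_mass_g_per_mol


UREA_MOLAR_MASS = 60.06  # g/mol, standard value (assumption: not given in the text)

# 50 mL of 2 M urea, as in item 3 of the Materials list.
urea_g = mass_for_molar_solution(2.0, 50.0, UREA_MOLAR_MASS)
print(f"Weigh {urea_g:.3f} g urea, dissolve, then adjust to 50 mL")
```

The same helper can be reused if a different final volume of 2 M urea is needed.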

3 Methods

We give details for performing the assay by thermal and chemical denaturation below. A graphical illustration of the assay is shown in Fig. 4.

3.1 Solid-Phase Hybridization, Temperature Denaturation (TD)

1. Equilibrate the magnetic beads at room temperature for 1 h prior to the assay. Vortex the magnetic bead suspension until it is homogeneous and transfer 50 μL to a 0.2 mL PCR tube (see Note 1). Using a magnetic rack, remove the supernatant from the PCR tube and wash the beads with 150 μL 1X PBS. Repeat the wash two more times. Resuspend the beads in 25 μL 1X PBS.
2. Add 10 μL (500 nM) biotin-capturing probe (Table 1). Vortex well. Incubate for 30 min at room temperature on a shaker. Wash out the unbound biotin-capturing probe twice with 150 μL 1X PBS using the magnetic rack.
3. Add 4.8 μL (10 ng/μL) of DNA extract from the cancer cell line (BRAF or EGFR) or from the healthy cell line (HMC-1, negative control), and adjust the final volume to 35 μL with 2X PBS. Incubate the solution using the following PCR program: 10 min at 85 °C, 2 h at 65 °C, and cooling to room temperature over 20 min. Wash away the unhybridized DNA with 150 μL 1X PBS using the magnetic rack (see Note 2). Repeat the wash two more times.
4. Add 10 μL (50 nM) fluorescent detection probe (B-Perylene for BRAF or E-Perylene for EGFR; Table 1) (see Note 3). Adjust the volume to 35 μL with 2X PBS. Incubate the solution using the following PCR program: 10 min at 85 °C, 40 min at 60 °C, and 10 min at 40 °C. Immediately remove the unhybridized fluorescent detection probe by washing (using the magnetic rack) with 150 μL 1X PBS that has been kept at 45 °C for at least 2 h before washing. Repeat the warm wash three more times with 150 μL 1X PBS at 45 °C.
5. Resuspend the beads in 30 μL 2X PBS. Incubate for 10 min at 92 °C to dehybridize the fluorescent detection probes from the target DNA, then immediately separate the supernatant from the beads using the magnetic rack.
6. Transfer 10 μL of the supernatant to the microplate and measure the fluorescence intensity at the emission maximum or over the suitable wavelength range (see Note 4).

3.2 Solid-Phase Hybridization, Chemical Denaturation

3.2.1 Denaturation with Urea

1. Perform steps 1–4 of the temperature denaturation assay (Subheading 3.1; Notes 1–3).
2. Resuspend the beads in 20 μL 2X PBS. Add 10 μL 2 M urea and keep the mixture at room temperature for 5 min. Separate the supernatant from the beads using the magnetic rack.
3. Transfer 10 μL of the supernatant to the microplate and measure the fluorescence intensity at the emission maximum or over the suitable wavelength range (see Note 4).

3.2.2 Denaturation with DMSO

1. Perform steps 1–4 of the temperature denaturation assay (Subheading 3.1; Notes 1–3).
2. Resuspend the beads in 20 μL 2X PBS. Add 30 μL DMSO and keep the mixture at room temperature for 5 min. Separate the supernatant from the beads using the magnetic rack.
3. Transfer 10 μL of the supernatant to the microplate and measure the fluorescence intensity at the emission maximum or over the suitable wavelength range (see Note 4).
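The working concentrations in the hybridization steps of Subheading 3.1 follow from the dilution rule C1·V1 = C2·V2. The sketch below works through the stated volumes; the function names are illustrative, and it assumes that the 10 μL of capturing probe in step 2 is added to the 25 μL bead suspension from step 1, giving 35 μL in total.

```python
def final_concentration(stock_conc, stock_volume_ul, final_volume_ul):
    """Dilution rule C1*V1 = C2*V2, solved for C2 (same units as stock_conc)."""
    if final_volume_ul <= 0:
        raise ValueError("final volume must be positive")
    return stock_conc * stock_volume_ul / final_volume_ul


def amount_pmol(conc_nm, volume_ul):
    """Amount of substance: nM x uL gives fmol, so divide by 1000 for pmol."""
    return conc_nm * volume_ul / 1000.0


# Step 2: 10 uL of 500 nM capturing probe added to beads in 25 uL (assumed 35 uL total).
capture_nm = final_concentration(500.0, 10.0, 25.0 + 10.0)
# Step 4: 10 uL of 50 nM detection probe, volume adjusted to 35 uL.
detection_nm = final_concentration(50.0, 10.0, 35.0)
# Step 3: 4.8 uL of genomic DNA extract at 10 ng/uL.
dna_ng = 4.8 * 10.0

print(f"capturing probe: {capture_nm:.1f} nM, {amount_pmol(500.0, 10.0):.1f} pmol")
print(f"detection probe: {detection_nm:.1f} nM, {amount_pmol(50.0, 10.0):.1f} pmol")
print(f"genomic DNA per reaction: {dna_ng:.0f} ng")
```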

4 Notes

1. When pipetting the magnetic beads from the original package, make sure that the suspension is homogeneous, so that an equal amount of beads is taken with each 50 μL.
2. During the washing step, place the PCR tubes on the magnetic rack and wait 10 s before removing the supernatant. This allows all magnetic beads to collect at the magnet. Also, always point the micropipette tip away from the beads, so that you do not touch or lose them while withdrawing the supernatant.
3. Always keep the fluorescent probes protected from light. After step 4, always protect the PCR tubes from light using aluminum foil.
4. Apply a volume correction when comparing the results of the three methods, because the final volumes differ: 30 μL for thermal denaturation, 30 μL for urea (20 μL 2X PBS plus 10 μL urea), and 50 μL for DMSO (20 μL 2X PBS plus 30 μL DMSO).
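The protocol does not spell out how the volume correction in Note 4 should be made. One possible sketch is given below, under the assumption that the fluorescence signal is proportional to probe concentration, so that a reading can be rescaled by the ratio of the final volume to a reference volume. The final volumes are taken from Subheadings 3.1 and 3.2; the function name and the choice of 30 μL as reference are illustrative, and solvent effects of urea or DMSO on the dye are not accounted for.

```python
# Final volume (uL) in which the released detection probe is dissolved.
FINAL_VOLUME_UL = {
    "TD": 30.0,    # 30 uL 2X PBS
    "urea": 30.0,  # 20 uL 2X PBS + 10 uL 2 M urea
    "DMSO": 50.0,  # 20 uL 2X PBS + 30 uL DMSO
}


def volume_corrected(intensity, method, reference_volume_ul=30.0):
    """Rescale a reading to what it would be if the probe were in the reference volume.

    Assumes signal is linear in concentration (hypothetical simplification).
    """
    if method not in FINAL_VOLUME_UL:
        raise KeyError(f"unknown denaturation method: {method}")
    return intensity * FINAL_VOLUME_UL[method] / reference_volume_ul


# Placeholder reading of 100 arbitrary units for each method (not measured data).
for method in FINAL_VOLUME_UL:
    print(method, round(volume_corrected(100.0, method), 1))
```

With this convention, thermal and urea readings are unchanged and only the more dilute DMSO samples are scaled up.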

Acknowledgments

The work is supported by Villum Foundation Young Investigator Programme, award number 13152.

References

1. Østergaard ME, Cheguru P, Papasani MR, Hill RA, Hrdlicka PJ (2010) Glowing locked nucleic acids: brightly fluorescent probes for detection of nucleic acids in cells. J Am Chem Soc 40:14221–14228
2. Okamoto A (2011) ECHO probes: a concept of fluorescence control for practical nucleic acid sensing. Chem Soc Rev 12:5815–5828
3. Su X, Xiao X, Zhang C, Zhao M (2012) Nucleic acid fluorescent probes for biological sensing. Appl Spectrosc 11:1249–1261
4. Krasheninina OA, Novopashina DS, Apartsin EK, Venyaminova AG (2017) Recent advances in nucleic acid targeting probes and supramolecular constructs based on pyrene-modified oligonucleotides. Molecules 12:2108–2056
5. Hwang GT (2018) Single-labeled oligonucleotides showing fluorescence changes upon hybridization with target nucleic acids. Molecules 1:124–143
6. Wang G, Bobkov GV, Mikhailov SN et al (2009) Detection of RNA hybridization by pyrene-labeled probes. Chembiochem 7:1175–1185
7. Umemoto T, Hrdlicka PJ, Babu BR, Wengel J (2007) Sensitive SNP dual-probe assays based on pyrene-functionalized 2′-amino-LNA: lessons to be learned. Chembiochem 18:2240–2248
8. Astakhova IV, Korshun VA, Jahn K, Kjems J, Wengel J (2008) Perylene attached to 2′-amino-LNA: synthesis, incorporation into oligonucleotides, and remarkable fluorescence properties in vitro and in cell culture. Bioconjug Chem 10:1995–2007
9. Svanvik N, Westman G, Wang D, Kubista M (2000) Light-up probes: thiazole orange-conjugated peptide nucleic acid for detection of target nucleic acid in homogeneous solution. Anal Biochem 1:26–35
10. Karlsen KK, Pasternak A, Jensen TB, Wengel J (2012) Pyrene-modified unlocked nucleic acids: synthesis, thermodynamic studies, and fluorescent properties. Chembiochem 4:590–601
11. Sugizaki K, Okamoto A (2010) ECHO-LNA conjugates: hybridization-sensitive fluorescence and its application to fluorescent detection of various RNA strands. Bioconjug Chem 12:2276–2281
12. Kim Y, Sohn D, Tan W (2008) Molecular beacon in biomedical detection and clinical diagnosis. Int J Clin Exp Pathol 2:105–116
13. Vaughn CP, Elenitoba-Johnson KSJ (2003) Hybridization-induced dequenching of fluorescein-labeled oligonucleotides: a novel strategy for PCR detection and genotyping. Am J Pathol 1:29–35
14. Xie Y, Maxson T, Tor Y (2010) Fluorescent nucleoside analogue displays enhanced emission upon pairing with guanine. Org Biomol Chem 22:5053–5055
15. Wang X, Lim HJ, Son A (2014) Characterization of denaturation and renaturation of DNA for DNA hybridization. Environ Health Toxicol 9:7–15
16. Astakhova IK, Samokhina E, Babu BR, Wengel J (2012) Novel (phenylethynyl)pyrene-LNA constructs for fluorescence SNP sensing in polymorphic nucleic acid targets. Chembiochem 10:1509–1519
17. Okholm A, Kjems J, Astakhova K (2014) Fluorescence detection of natural RNA using rationally designed “clickable” oligonucleotide probes. RSC Adv 86:45653–45656
18. Taskova M, Barducci MC, Astakhova K (2017) Environmentally sensitive molecular probes reveal mutations and epigenetic 5-methyl cytosine in human oncogenes. Org Biomol Chem 27:5680–5684

Chapter 5

5′-Monopyrene and 5′-Bispyrene 2′-O-methyl RNA Probes for Detection of RNA Mismatches

D. S. Novopashina, O. A. Semikolenova, and A. G. Venyaminova

Abstract

Progress in the synthesis of novel fluorescent oligonucleotides has provided effective instruments for nucleic acid detection. Pyrene-conjugated oligonucleotides have demonstrated their effectiveness as fluorescent hybridization probes. Here we describe the synthesis, isolation, and analysis of 5′-monopyrene and 5′-bispyrene conjugates of oligo(2′-O-methylribonucleotides) and their application as probes for fluorescent detection of mismatches in RNA targets.

Key words Pyrene conjugates, Oligo(2′-O-methylribonucleotides), Excimeric and monomeric fluorescence, RNA, Mismatch detection

1 Introduction

In recent years, fluorescent hybridization probes have received immense interest as tools for studying the localization, structure, point mutations, and functions of nucleic acid (NA) targets, for instance, DNAs, mRNAs, long ncRNAs, rRNAs, and small regulatory RNAs (see, for example, [1–9]). Pyrene-modified oligonucleotides have gained much attention as fluorescent hybridization probes [2, 7]. Among the fluorescent probes based on pyrene-labeled oligonucleotides, the excimer-forming [10–20] and exciplex-forming [21, 22] probes possess a number of remarkable advantages that make them promising research and diagnostic tools. They display strong hybridization-induced changes and a high degree of selectivity to the type of NA target. The large hybridization-induced Stokes shift in the fluorescence spectrum (up to 130 nm) provides a specific excimer or exciplex signal that can be easily distinguished even in the presence of an excess of unbound probes. High stability to photobleaching and a long fluorescence lifetime (>40 ns) make it possible to use the pyrene-labeled probes in time-resolved assays [23]. The probes are very sensitive to the changes of local

Kira Astakhova and Syeda Atia Bukhari (eds.), Nucleic Acid Detection and Structural Investigations: Methods and Protocols, Methods in Molecular Biology, vol. 2063, https://doi.org/10.1007/978-1-0716-0138-9_5, © Springer Science+Business Media, LLC, part of Springer Nature 2020


D. S. Novopashina et al.

environment of the pyrene fluorophore, which are reflected in changes in the spectral properties of the probes [11, 21, 24]. Despite the large number of fluorescent probes already proposed for NA detection, there is continuously growing interest in developing new, improved probes with well-defined properties that can be modified flexibly. In each case, to control the hybridization properties, specificity, and target-induced output signal of the probes, it is necessary to design them taking into account the structural features of the NA target, the fluorescent label, and the resulting hybrid duplex. The synthesis of 5′-monopyrene- and 5′-bispyrene-modified oligo(2′-O-methylribonucleotides) was described earlier, and these conjugates were successfully used to prepare probes for RNA target detection (see [25, 26] and references therein). Here we propose to use 5′-monopyrene- and 5′-bispyrene-modified oligo(2′-O-methylribonucleotide) probes for efficient detection of mismatches in RNA targets.

2 Materials

Prepare all solutions using ultrapure water (obtained by purifying deionized water to a resistivity of 18 MΩ·cm at 25 °C) and analytical grade reagents. Prepare and store all oligonucleotides at −20 °C.

2.1 Pyrene Conjugate Synthesis

1. Solution of the 5′-phosphate of oligo(2′-O-methylribonucleotide) (5′-p-GmAmCmAmGmUmAmGmAmUmUmGmUmAmUmAmGm) in water (20 OU260 per 100 μL).
2. 8% (w/v) solution of hexadecyltrimethylammonium bromide (CTAB) in water.
3. Triphenylphosphine (PPh3).
4. 2,2′-Dipyridyl disulfide ((PyS)2).
5. 4-N,N-Dimethylaminopyridine (DMAP).
6. 1-Pyrenemethylamine hydrochloride.
7. Dimethyl sulfoxide (DMSO), anhydrous grade, no more than 0.001% water.
8. 2% (w/v) solution of NaClO4 in acetone.
9. Diethyl ether.
10. Acetone.

2.2 Pyrene Conjugate Isolation and Analysis

1. Loading buffer: 0.01% (w/v) solution of bromophenol blue and xylene cyanol in 7 M urea.
2. PAAG electrophoresis running buffer (TBE buffer): 0.089 M Tris–H3BO3, 0.001 M Na2EDTA, pH 8.3.

Detection of RNA Mismatches by 2′-OMeRNA Probes

3. 15% solution of acrylamide/N,N′-methylene bisacrylamide (19:1) containing 7 M urea in TBE buffer.
4. Ammonium persulfate: 10% (w/v) solution in water. Store at 4 °C.
5. N,N,N′,N′-Tetramethylethylenediamine: Store at 4 °C.
6. 0.3 M NaClO4 solution in water.
7. Sep Pak C18 Cartridges (Waters, USA).
8. 50% (v/v) solution of acetonitrile in water.
9. Fluorescent screen covered with polyethylene or polypropylene film, for example, a Kieselgel F254 thin-layer chromatography plate from Merck.
10. 0.05% (w/v) solution of “Stains-all” dye in a water–formamide (1:1) mixture.
11. 0.02 M triethylammonium acetate buffer, pH 7.0.
12. Cacodylate buffer: 0.1 M NaCl, 10 mM sodium cacodylate, pH 7.4, and 1 mM Na2EDTA.

2.3 Preparation of Samples for Absorbance Spectra and Fluorescence Registration

1. 10× buffer: 1 M NaCl, 100 mM sodium cacodylate, pH 7.4, and 10 mM Na2EDTA. Store the buffer at −20 °C.
2. 10 μM solutions of the 5′-mono- and 5′-bis-pyrene conjugates (5′-Pyr-p-GmAmCmAmGmUmAmGmAmUmUmGmUmAmUmAmGm and 5′-(Pyr)2-p-GmAmCmAmGmUmAmGmAmUmUmGmUmAmUmAmGm) and of the RNA targets (5′-CUAUACAAUCUACUGUCAAAC, 5′-CUAUACAAUCUACUGUUAAAC, 5′-CUAUACAAUCUACUGUCUUUC, and 5′-CUAUACAAUCUACUGUAUUUC). Store the solutions at −20 °C.

3 Methods

The convenient chemical method for conjugating pyrene to the 5′-phosphate of an oligo(2′-O-methylribonucleotide) is based on selective activation of the terminal phosphate of the oligonucleotide with an oxidation/reduction pair of reagents (triphenylphosphine/2,2′-dipyridyl disulfide) in the presence of the basic catalyst N,N-dimethylaminopyridine, followed by attachment of one or two pyrenemethylamine residues (Fig. 1) [25, 26]. Different conditions for the reaction of the activated oligonucleotide derivative with pyrenylmethylamine are used to prepare the 5′-monopyrene and 5′-bispyrene derivatives of oligo(2′-O-methylribonucleotides). To obtain the 5′-monopyrene conjugate, the excess of activation reagents must be removed and the reaction with pyrenylmethylamine must be carried out in a water/organic (H2O/DMSO) medium. In the case of 5′-bispyrene conjugate synthesis, the reaction must

Fig. 1 Schemes of the synthesis of the 5′-monopyrene (a) and 5′-bis-pyrene (b) conjugates of oligo(2′-O-methylribonucleotides). In both routes the 5′-phosphate of the oligonucleotide (Oligo) is activated with PPh3/(PyS)2/DMAP in anhydrous DMSO (DMSOabs). (a) (1) NaClO4/acetone; (2) PyrCH2NH2·HCl, H2O:TEA:DMSO (1:1:5). (b) PyrCH2NH2·HCl, TEA:DMSOabs (1:5). Pyr = pyrene residue; Oligo = oligo(2′-O-methylribonucleotide)

be conducted in anhydrous organic medium (DMSOabs) using an increased excess of pyrenylmethylamine. Because the synthesis is done on a microscale and no sophisticated chemical equipment is necessary, all steps of the synthesis can be carried out in 1.5 mL Eppendorf vials. To confirm attachment of pyrene residues to oligo(2′-O-methylribonucleotides), a denaturing gel electrophoresis system or an HPLC system equipped with a C18 reversed-phase column is necessary. The products may be visualized by ultraviolet (UV) shadowing (as a violet band migrating more slowly than the band of the initial 5′-phosphate of oligo(2′-O-methylribonucleotide) when the gel slab is placed on a fluorescent screen and irradiated with a UV lamp), and the product can then be easily isolated from the gel by the “crush and soak” method with subsequent desalting on a C18 cartridge. Absorbance spectra of the pyrene conjugates are registered using a standard spectrophotometer, for example, an Eppendorf BioSpectrometer. Mass spectra of the conjugates are obtained by ESI or MALDI-TOF mass spectrometry using standard equipment. A standard instrument for fluorescence measurements is used to study the fluorescent properties of the pyrene conjugates.

3.1 Precipitation of Oligo(2′-O-methylribonucleotide) in CTAB Salt

1. This procedure converts the oligonucleotide (usually insoluble in organic solvents) into a CTAB salt that is soluble in polar aprotic organic solvents, such as DMSO.
2. Estimate the quantity of CTAB necessary for neutralization of all phosphate groups in the oligonucleotide (see Note 1).
3. Add the estimated volume of 8% (w/v) CTAB solution in water to the solution of the 5′-phosphorylated oligonucleotide, and vortex the tube to obtain a white coagulated pellet.
4. Centrifuge the tube for 2 min at 15,000 × g to sediment the pellet.
5. Add an additional 1 μL of 8% CTAB solution and shake gently. If any new coagulation appears, centrifuge the tube and add a new portion (1 μL) of CTAB.
6. Repeat the procedure until no more precipitate forms after addition of a new portion (see Note 1). Then remove the supernatant and dry the oligonucleotide under vacuum over P2O5.

3.2 Monopyrene Conjugate Synthesis

1. Prepare a solution of 2 mg of pyrenemethylamine hydrochloride in 20 μL of anhydrous DMSO. Add 5 μL of TEA to convert the amino group into the free base form.
2. Dissolve 5 mg of DMAP in 50 μL of anhydrous DMSO, add the solution to the dry CTAB salt of the 5′-phosphorylated oligonucleotide (15–20 OU260), and agitate until dissolved. Add the solution of 6.6 mg (30 μmol) of dipyridyl disulfide in 25 μL of anhydrous DMSO and the solution of 7.9 mg (30 μmol) of triphenylphosphine in 25 μL of anhydrous DMSO (dissolution of the chemicals can be accelerated by heating at 50–70 °C). Agitate the mixture thoroughly, and incubate for 15 min at room temperature.
3. Add 10 vol of 2% sodium perchlorate solution in acetone and immediately centrifuge for 2 min at 15,000 × g. Remove the acetone supernatant and quickly rinse the pellet with acetone, but do not dry it.
4. Immediately after removal of the acetone, dissolve the pellet in 5 μL of water and add the solution of pyrenemethylamine hydrochloride. Shake the reaction mixture at 37 °C for 2 h.
5. To remove the excess of pyrenemethylamine, precipitate the conjugate with 2% sodium perchlorate solution in acetone/diethyl ether (1:1) and wash the pellet with acetone (see Note 2).
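The reagent masses in the steps above can be cross-checked against the stated molar amounts by converting milligrams to micromoles. The sketch below does this conversion; the molar masses are standard values that are not given in the protocol, and the dictionary keys are illustrative labels.

```python
# Molar masses in g/mol (standard values; assumptions not stated in the text).
MOLAR_MASS_G_PER_MOL = {
    "DMAP": 122.17,
    "(PyS)2": 220.31,          # 2,2'-dipyridyl disulfide
    "PPh3": 262.29,            # triphenylphosphine
    "PyrCH2NH2.HCl": 267.75,   # 1-pyrenemethylamine hydrochloride
}


def micromoles(mass_mg, molar_mass_g_per_mol):
    """mg divided by g/mol gives mmol; multiply by 1000 for micromoles."""
    return mass_mg / molar_mass_g_per_mol * 1000.0


# Masses used in the monopyrene synthesis (Subheading 3.2).
reagents_mg = {"DMAP": 5.0, "(PyS)2": 6.6, "PPh3": 7.9, "PyrCH2NH2.HCl": 2.0}
amounts = {
    name: micromoles(mg, MOLAR_MASS_G_PER_MOL[name])
    for name, mg in reagents_mg.items()
}

for name, umol in amounts.items():
    print(f"{name}: {umol:.1f} umol")
```

The two activating reagents come out close to the 30 μmol quoted in step 2, which suggests the listed masses and molar amounts are consistent.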

3.3 Bis-pyrene Conjugate Synthesis

1. Prepare a solution of 6 mg of pyrenemethylamine hydrochloride in 20 μL of anhydrous DMSO. Add 5 μL of TEA to convert the amino group into the free base form.
2. Dissolve 5 mg of DMAP in 50 μL of anhydrous DMSO, add the solution to the dry CTAB salt of the 5′-phosphorylated oligonucleotide (15–20 OU260), and agitate until dissolved. Add the solution of 6.6 mg (30 μmol) of dipyridyl disulfide in 25 μL of anhydrous DMSO and the solution of 7.9 mg (30 μmol) of triphenylphosphine in 25 μL of anhydrous DMSO (dissolution of the chemicals can be accelerated by heating at 50–70 °C). Agitate the mixture thoroughly, and incubate for 15 min at room temperature. Add the solution of pyrenemethylamine hydrochloride and shake the reaction mixture at 37 °C for 2–16 h (see Note 3).
3. To remove the excess of pyrenemethylamine and phosphate-activating reagents, precipitate the conjugate with 2% sodium perchlorate solution in acetone/diethyl ether (1:1) and wash the pellet with acetone (see Note 2).

3.4 Analysis and Purification of the Conjugates

Several methods can be used for detection, characterization, and purification of the conjugates. In this chapter we do not go into the details of each method, because that is not the aim of this publication.

1. Apply 0.05 OU260 of the reaction mixture in the loading buffer to a well of a 15% denaturing polyacrylamide gel slab, and run electrophoresis until the bromophenol blue has passed more than two-thirds of the slab.
2. Formation of the product can be analyzed by staining the gel with the solution of “Stains-all” dye. The products are visualized as violet or blue bands that are retarded relative to the control, the initial 5′-phosphate of oligo(2′-O-methylribonucleotide) (Fig. 2). The degree of conversion of the oligonucleotide was about 60–85% for the monopyrene conjugate and about 80–95% for the bispyrene conjugate (see Note 4).
3. Apply 15–20 OU260 of the reaction mixture in the loading buffer to a 3 cm wide well of a 15% denaturing polyacrylamide gel slab with a thickness of 1 mm.

Fig. 2 Analysis of the reaction mixtures by PAGE. (1) Reaction mixture obtained upon 5′-bispyrene conjugate synthesis (5′-(Pyr)2-p-GmAmCmAmGmUmAmGmAmUmUmGmUmAmUmAmGm); (2) reaction mixture obtained upon 5′-monopyrene conjugate synthesis (5′-Pyr-p-GmAmCmAmGmUmAmGmAmUmUmGmUmAmUmAmGm); (3) initial 5′-phosphate of oligo(2′-O-methylribonucleotide) (5′-p-GmAmCmAmGmUmAmGmAmUmUmGmUmAmUmAmGm). Conditions: 15% denaturing PAAG (7 M urea, acrylamide/N,N′-methylene bisacrylamide 19:1) in TBE buffer. Gel stained with “Stains-all”


4. The products can be detected by UV shadowing. On a UV-fluorescent screen irradiated at 265 nm, the products are visible as bands showing violet fluorescence.
5. For purification, excise the product band and extract the product by soaking the gel slice in 0.3 M NaClO4. Then isolate the conjugate by desalting on a Sep Pak C18 cartridge, followed by precipitation with 2% sodium perchlorate solution in acetone. The yields after isolation were about 50% for the 5′-monopyrene conjugate and 65% for the 5′-bis-pyrene conjugate relative to the 5′-phosphate of oligo(2′-O-methylribonucleotide).
6. Owing to the different hydrophobicity of the pyrene conjugates compared with the initial 5′-phosphate of oligo(2′-O-methylribonucleotide), the products may be separated by HPLC on a reversed-phase column (Fig. 3).
7. Reversed-phase HPLC (RP-HPLC) analysis of the oligonucleotides and their conjugates is performed on an Alphachrome high-performance liquid chromatograph with a ProntoSil-120-5-C18 AQ (75 × 2.0 mm, 5.0 μm) column, applying a gradient elution from 0% to 50% (20 min) of acetonitrile in 0.02 M triethylammonium acetate buffer, pH 7.0 at a

Fig. 3 HPLC analysis of the initial 5′-phosphate of the oligonucleotide 5′-p-GmAmCmAmGmUmAmGmAmUmUmGmUmAmUmAmGm (curve 1) and of the conjugates isolated by preparative gel electrophoresis: 5′-monopyrene conjugate 5′-Pyr-p-GmAmCmAmGmUmAmGmAmUmUmGmUmAmUmAmGm (curve 2) and 5′-bis-pyrene conjugate 5′-(Pyr)2-p-GmAmCmAmGmUmAmGmAmUmUmGmUmAmUmAmGm (curve 3)


Fig. 4 UV-Vis absorbance spectra of the 5′-monopyrene (mono) and 5′-bis-pyrene (bis) conjugates of oligo(2′-O-methylribonucleotide) (5′-p-GmAmCmAmGmUmAmGmAmUmUmGmUmAmUmAmGm). Spectra were registered at T = 25 °C in cacodylate buffer (pH 7.4). The concentration of the conjugates was 1 μM

flow rate of 100 μL per min, with detection at 260 nm and, for the pyrene conjugates, at 345 nm.
8. Record the electronic absorbance spectra of the conjugates on a UV-visible spectrophotometer in the wavelength interval between 220 and 400 nm in a 10 mm quartz cuvette at T = 25 °C in buffer containing 0.1 M NaCl, 10 mM sodium cacodylate, pH 7.4, and 1 mM Na2EDTA. The concentration of the conjugates was 1 μM (Fig. 4). Two absorption maxima will be observed: one at 260 nm, corresponding to the sum of oligonucleotide and pyrene absorbance, and a second at 345 nm, corresponding to the absorbance of the pyrene residues [27]. The molar extinction coefficients for pyrene are 24,000 and 37,000 M⁻¹ cm⁻¹ at 260 and 345 nm, respectively. The molar ratio between the oligonucleotide and pyrene parts must be about 1:1 for the 5′-monopyrene conjugate and 1:2 for the 5′-bis-pyrene conjugate.

3.5 Fluorescent Properties of the Conjugates and Their Duplexes with RNA Targets

Fluorescence emission spectra of the individual pyrene conjugates and of their duplexes with RNA targets are recorded at 25 °C in buffer containing 0.1 M NaCl, 10 mM sodium cacodylate, pH 7.4, and 1 mM Na2EDTA. The concentrations of the pyrene conjugate or of the duplex components are 0.2 μM. Spectra are registered in quartz cuvettes with an optical path length of 4 mm at a voltage of 600 V on a Cary Eclipse Fluorescence Spectrophotometer and thermostating

Fig. 5 Fluorescence spectra of the 5′-monopyrene (5′-Pyr-p-GmAmCmAmGmUmAmGmAmUmUmGmUmAmUmAmGm) (a, b) and 5′-bis-pyrene (5′-(Pyr)2-p-GmAmCmAmGmUmAmGmAmUmUmGmUmAmUmAmGm) (c, d) conjugates (1) and of their duplexes with fully matched RNA (2) and mismatched RNA targets (3). (a, c) RNA targets: match RNA 5′-CUAUACAAUCUACUGUCAAAC, mismatch RNA 5′-CUAUACAAUCUACUGUUAAAC; (b, d) RNA targets: match RNA 5′-CUAUACAAUCUACUGUCUUUC, mismatch RNA 5′-CUAUACAAUCUACUGUAUUUC. Spectra were registered at T = 25 °C in cacodylate buffer (pH 7.4). The concentrations of the conjugates and RNA targets were 0.2 μM. Axes: fluorescence (r.u.) versus wavelength (350–600 nm)

with a Thermostatic circulator 2219 Multitemp II. The probes were first annealed at 90 °C and cooled to room temperature over 45 min. The emission spectra are recorded in the range of 360–600 nm at an excitation wavelength of 345 nm. A difference in monomeric and excimeric fluorescence between the spectra of the duplexes with two RNA targets indicates the presence of a mismatch in one of the targets. The monomeric and excimeric fluorescence of fully matched duplexes is much more pronounced than that of duplexes containing the mismatch (Fig. 5a, c). When a poly(U) tract is close to the pyrene groups, the monomeric and excimeric fluorescence are efficiently quenched and no difference between match and mismatch RNA can be revealed (Fig. 5b, d). We conclude that 5′-monopyrene and 5′-bispyrene conjugates of oligo(2′-O-methylribonucleotides) can be used as efficient probes for mismatch discrimination in RNA targets. It must be noted that a poly(U) tract strongly impairs mismatch detection because of the strong quenching effect of uridine on pyrene fluorescence.
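The pyrene-to-oligonucleotide ratio check described in Subheading 3.4, step 8, can be sketched numerically as follows. This is a minimal illustration that assumes simple additivity of absorbances (Beer–Lambert law), no oligonucleotide absorbance at 345 nm, and a user-supplied extinction coefficient for the oligonucleotide at 260 nm; that coefficient is not given in the chapter, and the value used in the example is a hypothetical placeholder, as are the absorbance readings.

```python
EPS_PYR_260 = 24_000.0  # M^-1 cm^-1, pyrene at 260 nm (from the text)
EPS_PYR_345 = 37_000.0  # M^-1 cm^-1, pyrene at 345 nm (from the text)


def pyrene_per_oligo(a260, a345, eps_oligo_260, path_cm=1.0):
    """Estimate the number of pyrene residues per oligonucleotide.

    A345 is attributed to pyrene only; the pyrene contribution is then
    subtracted from A260 to obtain the oligonucleotide concentration.
    """
    c_pyrene = a345 / (EPS_PYR_345 * path_cm)
    a260_oligo = a260 - EPS_PYR_260 * c_pyrene * path_cm
    if a260_oligo <= 0:
        raise ValueError("A260 is fully accounted for by pyrene; check the inputs")
    c_oligo = a260_oligo / (eps_oligo_260 * path_cm)
    return c_pyrene / c_oligo


# Hypothetical example: eps_oligo_260 and both absorbances are placeholders.
EPS_OLIGO_260_ASSUMED = 180_000.0
ratio = pyrene_per_oligo(a260=0.228, a345=0.074, eps_oligo_260=EPS_OLIGO_260_ASSUMED)
print(f"pyrene residues per oligonucleotide: {ratio:.2f}")
```

A result near 1 would be expected for the 5′-monopyrene conjugate and near 2 for the 5′-bis-pyrene conjugate.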

4 Notes

1. The oligonucleotide is titrated with CTAB because it is very important to add a molar quantity of CTAB equivalent to the molar quantity of oligonucleotide phosphates (for example, 18 molecules of CTAB per molecule of phosphorylated 17-mer oligonucleotide). Otherwise, the CTAB salt of the oligonucleotide will dissolve in the excess of CTAB solution.
2. To avoid loss of the oligonucleotide conjugates upon precipitation of the reaction mixture after synthesis, we changed the standard conditions by adding diethyl ether to the 2% solution of sodium perchlorate in acetone. Incubating the reaction mixture with this precipitation solution at −20 °C for 20–60 min further increases the quantity of precipitated conjugate.
3. In the case of terminal phosphate activation, direct addition of the amine without removal of the activating agents can lead to attachment of two pyrene residues to the same terminal phosphate of the oligonucleotide [28, 29]. The application of the Mukaiyama reaction to the synthesis of oligonucleotide conjugates is described in detail in [28, 30].
4. In the case of monopyrene conjugate synthesis, the degree of conversion is lower than for the bispyrene conjugate because of the formation of a side product, a triphenylphosphine-containing conjugate [26].
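The starting estimate of the CTAB volume mentioned in Note 1 can be sketched as below. The figure of 18 CTAB molecules per 17-mer and the 8% (w/v) stock come from the text; the molar mass of CTAB (364.45 g/mol) is a standard value, and the extinction coefficient used to convert optical units to nanomoles is a hypothetical placeholder that must be replaced with the value for the actual oligonucleotide. The result is only a first estimate, to be followed by the 1 μL titration described in Subheading 3.1.

```python
CTAB_MOLAR_MASS = 364.45  # g/mol, standard value (assumption: not in the text)
CTAB_PERCENT_W_V = 8.0    # % (w/v) stock solution, from the Materials list


def oligo_nmol(ou260, eps260_per_m_cm):
    """Convert optical units (A260 x mL, 1 cm path) to nanomoles of oligonucleotide."""
    return ou260 / eps260_per_m_cm * 1e6


def ctab_volume_ul(ou260, eps260_per_m_cm, phosphates_per_oligo=18):
    """Volume of CTAB stock carrying one CTAB molecule per phosphate."""
    # 8% (w/v) = 80 g/L; mol/L equals umol/uL, so multiply by 1000 for nmol/uL.
    ctab_nmol_per_ul = CTAB_PERCENT_W_V * 10.0 / CTAB_MOLAR_MASS * 1000.0
    return oligo_nmol(ou260, eps260_per_m_cm) * phosphates_per_oligo / ctab_nmol_per_ul


# 20 OU260 of the 17-mer; the extinction coefficient is a hypothetical placeholder.
EPS260_ASSUMED = 180_000.0  # M^-1 cm^-1
volume = ctab_volume_ul(20.0, EPS260_ASSUMED)
print(f"estimated 8% CTAB to add: {volume:.1f} uL")
```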

Acknowledgments
The work was partially supported by Russian State funded budget project # АААА-А17-117020210021-7 to ICBFM SB RAS.

References
1. Boutorine AS, Novopashina DS, Krasheninina OA, Nozeret K, Venyaminova AG (2013) Fluorescent probes for nucleic acid visualization in fixed and live cells. Molecules 18:15357–15397
2. Krasheninina OA, Novopashina DS, Apartsin EK, Venyaminova AG (2017) Recent advances in nucleic acid targeting probes and supramolecular constructs based on pyrene-modified oligonucleotides. Molecules 22:E2108
3. Furlan I, Domljanovic I, Uhd J, Astakhova K (2019) Improving the design of synthetic oligonucleotide probes by fluorescence melting assay. Chembiochem 20:587–594
4. Taskova M, Barducci MC, Astakhova K (2017) Environmentally sensitive molecular probes reveal mutations and epigenetic 5-methyl cytosine in human oncogenes. Org Biomol Chem 15:5680–5684
5. Junager NP, Kongsted J, Astakhova K (2016) Revealing nucleic acid mutations using Förster resonance energy transfer-based probes. Sensors (Basel) 16:E1173
6. Armitage BA (2011) Imaging of RNA in live cells. Curr Opin Chem Biol 15:806–812
7. Østergaard ME, Hrdlicka PJ (2011) Pyrene-functionalized oligonucleotides and locked nucleic acids (LNAs): tools for fundamental research, diagnostics, and nanotechnology. Chem Soc Rev 40:5771–5788
8. Guo J, Ju J, Turro NJ (2012) Fluorescent hybridization probes for nucleic acid detection. Anal Bioanal Chem 402:3115–3125
9. Kolpashchikov DM (2010) Binary probes for nucleic acid analysis. Chem Rev 110:4709–4723
10. Ebata K, Masuko M, Ohtani H, Kashiwasake-Jibu M (1995) Nucleic acid hybridization accompanied with excimer formation from two pyrene-labeled probes. Photochem Photobiol 62:836–839
11. Masuko M, Ohtani H, Ebata K, Shimadzu A (1998) Optimization of excimer-forming two-probe nucleic acid hybridization method with pyrene as a fluorophore. Nucleic Acids Res 26:5409–5416
12. Paris PL, Langenhan JM, Kool ET (1998) Probing DNA sequences in solution with a monomer-excimer fluorescence color change. Nucleic Acids Res 26:3789–3793
13. Mahara A, Iwase R, Sakamoto T, Yamana K, Yamaoka T, Murakami A (2002) Bispyrene-conjugated 2′-O-methyloligonucleotide as a highly specific RNA-recognition probe. Angew Chem Int Ed 41:3648–3650
14. Umemoto T, Hrdlicka PJ, Babu BR, Wengel J (2007) Sensitive SNP dual-probe assays based on pyrene-functionalized 2′-amino-LNA: lessons to be learned. Chembiochem 8:2240–2248
15. Krasheninina OA, Novopashina DS, Venyaminova AG (2011) Oligo(2′-O-methylribonucleotides) containing insertions of 2′-bispyrenylmethylphosphorodiamidate nucleoside derivatives as prospective fluorescent probes for RNA detection. Russ J Bioorg Chem 37:244–248
16. Huang J, Wu Y, Chen Y, Zhu Z, Yang X, Yang CJ, Wang K, Tan W (2011) Pyrene-excimer probes based on the hybridization chain reaction for the detection of nucleic acids in complex biological fluids. Angew Chem Int Ed 50:401–404
17. Waki R, Yamayoshi A, Kobori A, Murakami A (2011) Development of a system to sensitively and specifically visualize c-fos mRNA in living cells using bispyrene-modified RNA probes. Chem Commun 47:4204–4206
18. Krasheninina OA, Novopashina DS, Lomzov AA, Venyaminova AG (2014) 2′-Bispyrene-modified 2′-O-methyl RNA probes as useful tools for the detection of RNA: synthesis, fluorescent properties, and duplex stability. Chembiochem 15:1939–1946
19. Conlon P, Yang CJ, Wu Y, Chen Y, Martinez K, Kim Y, Stevens N, Marti A, Jockusch S, Turro NJ, Tan W (2008) Pyrene excimer signaling molecular beacons for probing nucleic acids. J Am Chem Soc 130:336–342
20. Karlsen KK, Okholm A, Kjems J, Wengel J (2013) A quencher-free molecular beacon design based on pyrene excimer fluorescence using pyrene-labeled UNA (unlocked nucleic acid). Bioorg Med Chem 21:6186–6190
21. Bichenkova EV, Savage HE, Sardarian AR, Douglas KT (2005) Target-assembled tandem oligonucleotide systems based on exciplexes for detecting DNA mismatches and single nucleotide polymorphisms. Biochem Biophys Res Commun 332:956–964
22. Kumar TS, Myznikova A, Samokhina E, Astakhova IK (2013) Rapid genotyping using pyrene-perylene locked nucleic acid complexes. Artif DNA PNA XNA 4:58–68
23. Marti AA, Li X, Jockusch S, Li Z, Raveendra B, Kalachikov S, Russo JJ, Morozova I, Puthanveettil SV, Ju J, Turro NJ (2006) Pyrene binary probes for unambiguous detection of mRNA using time-resolved fluorescence spectroscopy. Nucleic Acids Res 34:3161–3168
24. Astakhova IK, Samokhina E, Babu BR, Wengel J (2012) Novel (phenylethynyl)pyrene-LNA constructs for fluorescence SNP sensing in polymorphic nucleic acid targets. Chembiochem 13:1509–1519
25. Krasheninina OA, Fishman VS, Novopashina DS, Venyaminova AG (2017) 5′-Bispyrene molecular beacons for RNA detection. Russ J Bioorg Chem 43:259–269
26. Novopashina DS, Totskaia OS, Kholodar’ SA, Meshchaninova MI, Ven’iaminova AG (2008) Oligo(2′-O-methylribonucleotides) and their derivatives: III. 5′-mono- and 5′-bispyrenyl derivatives of oligo(2′-O-methylribonucleotides) and their 3′-modified analogues: synthesis and properties. Russ J Bioorg Chem 34:602–612
27. Dobrikov MI, Gaidamakov SA, Koshkin AA, Gainutdinov TI, Luk’anchuk NP, Shishkin GV, Vlassov VV (1997) Sensitized photomodification by binary systems. I. Synthesis of oligonucleotide reagents, and the effect of their structure on the efficacy of target modification. Russ J Bioorg Chem 23:171–178
28. Grimm GN, Boutorine AS, Hélène C (2000) Rapid routes of synthesis of oligonucleotide conjugates from nonprotected oligonucleotides and ligands possessing different nucleophilic or electrophilic functional groups. Nucleosides Nucleotides Nucleic Acids 19:1943–1965
29. Kostenko E, Dobrikov M, Pyshnyi D, Petyuk V, Komarova N, Vlassov V, Zenkova M (2001) 5′-Bis-pyrenylated oligonucleotides displaying excimer fluorescence provide sensitive probes of RNA sequence and structure. Nucleic Acids Res 29:3611–3620
30. Sinyakov AN, Ryabinin VA, Grimm GN, Boutorine AS (2001) Stabilization of DNA triple helices using conjugates of oligonucleotides and synthetic ligands. Mol Biol 35:251–260

Chapter 6

Combined Assay for Detecting Autoantibodies to Nucleic Acids and Apolipoprotein H in Patients with Systemic Lupus Erythematosus

Sangita Khatri, Elizabeth D. Mellins, Kathryn S. Torok, Syeda Atia Bukhari, and Kira Astakhova

Abstract
The complicated clinical picture and biomolecular pattern of human autoimmune diseases (ADs) make knowledge on their etiology still fragmentary. The diagnostic approaches for ADs require improvement both for clinical and research effort to progress. Synthetic biomolecular antigens find growing applications for diagnosis and investigation of ADs. The main goal of this work is to detect interaction between synthetic antigens and autoantibodies in systemic lupus erythematosus within a combined, high-throughput assay. A panel of synthetic antigens has been prepared from DNA, RNA, locked nucleic acids and apolipoprotein H. The binding of synthetic antigens to autoantibodies has been confirmed in sera samples from those with active systemic lupus erythematosus (SLE) by indirect enzyme-linked immunosorbent assay. Our study provides an efficient methodology for combined autoantibody profiling in SLE.

Key words Autoimmune diseases, Antigens, Autoantibodies, Systemic lupus erythematosus, ELISA

Abbreviations
AD     Autoimmune disease
ApoH   Apolipoprotein H/beta II glycoprotein
DMSO   Dimethyl sulfoxide
DTT    Dithiothreitol
ELISA  Enzyme-linked immunosorbent assay
GSH    Glutathione
H2O2   Hydrogen peroxide
HRP    Horseradish peroxidase
IgE    Immunoglobulin E
IgG    Immunoglobulin G
IgM    Immunoglobulin M
LNA    Locked nucleic acid
LS     Localized scleroderma

Kira Astakhova and Syeda Atia Bukhari (eds.), Nucleic Acid Detection and Structural Investigations: Methods and Protocols, Methods in Molecular Biology, vol. 2063, https://doi.org/10.1007/978-1-0716-0138-9_6, © Springer Science+Business Media, LLC, part of Springer Nature 2020


MOPS   3-(N-Morpholino)propanesulfonic acid
pNPP   para-Nitrophenylphosphate
SDS    Sodium dodecyl sulfate
SLE    Systemic lupus erythematosus
TMB    Tetramethylbenzidine
TRX-1  Thioredoxin

1

Introduction
Autoimmune diseases (ADs) are a group of more than 80 chronic and heterogeneous conditions. ADs develop when the function of the immune system to detect, deflect, and destroy pathogens becomes dysregulated and attacks host organs, tissues, and cells [1–3]. Given the substantial complexity of ADs, understanding of their etiology and pathogenesis remains fragmentary [4]. Approximately 5–8% of the world population is affected by ADs, and most of those affected are women [2, 5]. The diagnosis of these diseases remains difficult, and standardized treatment regimens are lacking for most ADs because of disease heterogeneity, the lack of biomarkers that can predict which patient will benefit from which drug and, for some ADs, insufficient knowledge of effective drug targets. Therefore, a significant proportion of AD patients live with chronic illness and unmet treatment needs. There is a great demand for robust techniques for early diagnosis of ADs, when intervention with available treatments is most likely to have an impact.

Disease-specific biomarkers, currently measured on a number of biomolecular platforms, can inform the prognosis and management of ADs [6, 7]. Different diagnostic approaches are available for the most prevalent ADs; however, these are not applicable to all ADs, which has challenged researchers to search for novel biomarkers and diagnostic approaches [8]. In some ADs, specific antibodies to self-molecules (autoantibodies) have been shown to correlate with specific clinical subsets [7]. The targets of autoantibodies include all classes of biomolecules, including proteins, DNA, and RNA. Biomolecular chemistry approaches can generate biomolecules of all classes with a well-defined chemical structure and high purity. The use of synthetic biomolecules to reveal patterns of autoantibodies in disease states and healthy states has high diagnostic importance [6, 9–11].

Specifically, systemic lupus erythematosus (SLE) is an AD with a complex biomolecular pattern, which requires diagnosis by combining a physical examination with the measurement of multiple biomarkers, including autoantibodies [5]. Enzyme-linked immunosorbent assay (ELISA) detects the binding of an antigen by an antibody and is straightforward to design and perform.


Moreover, ELISA has the advantages of high sensitivity and applicability to a broad range of molecules [12, 13]. The interaction of antigens with enzyme-linked antibodies, followed by the addition of the enzyme’s substrate, gives a detectable color change, which is quantified by measuring the absorbance at a specific wavelength. ELISA assays are broadly applied in SLE diagnostics. However, most of these assays use antigen extracts from natural sources, which often leads to moderate to low reproducibility and clinical specificity [5]. Herein, the diagnostic strategy includes the rational design and synthesis of a panel of synthetic antigens, including proteins and nucleic acids (DNA, RNA, and locked nucleic acid, LNA), for specific detection of SLE-associated antibodies [11]. For the first time, a series of chemically diverse antigens is applied on the same ELISA plate. Given the robustness of ELISA and the chemical uniformity and diversity of the novel antigens, the reported methodology has potential as a new multiparameter diagnostic approach for human ADs such as SLE.

2

General Considerations and Materials
Prepare all solutions and buffers using ultrapure autoclaved water and molecular biology grade reagents. Prepare and store all buffers and reagents according to the supplier’s guidelines. It is recommended to use sterile tubes and tips. Universal precautions, such as cleaning the working space, ELISA plates, and pipettes with 70% v/v EtOH in water, should be strictly adhered to because this method involves the manipulation of RNA.

2.1 Buffers for SDS–Polyacrylamide Gel

1. 1X SDS running buffer: Add 25 mL of 20X 3-(N-morpholino)propanesulfonic acid (MOPS) buffer to 475 mL of deionized autoclaved water.
2. 2X Sample buffer: Prepare a solution of 125 mM Tris–HCl (pH 6.8), 20% glycerol, 4% sodium dodecyl sulfate (SDS), and 0.1% bromophenol blue. Make aliquots and store at +4 °C for immediate use and at −20 °C for prolonged storage (see Note 1).
3. 1X Fixation buffer: Add 500 mL of ultrapure grade 95% (v/v) methanol to 400 mL of autoclaved water. Add 100 mL of analytical grade acetic acid and adjust the volume to 1000 mL.
4. 1X Coomassie protein stain: Dissolve 1 g of Coomassie Brilliant Blue in 500 mL of methanol and 100 mL of acetic acid, and adjust the total volume to 1000 mL with autoclaved water.
5. 1X Destaining buffer: Add 400 mL of ultrapure grade 95% (v/v) methanol to 100 mL of analytical grade acetic acid and adjust the volume to 1 L with autoclaved water.
6. HEPES buffer: Dilute commercially available 1 M HEPES stock to the desired concentration of 20 mM. Adjust the pH to 7.4 by adding an aqueous solution of NaOH.
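The dilutions in this list all follow the relation C1·V1 = C2·V2. The short Python sketch below is only an illustration of that arithmetic; the function name `stock_volume` and the 100 mL HEPES batch size are assumptions made for the example and are not part of the protocol.

```python
def stock_volume(stock_conc, final_conc, final_volume):
    """Return the stock volume needed for a dilution (C1*V1 = C2*V2).

    Concentrations must share one unit (e.g., both "X" or both mM);
    the result has the same unit as final_volume.
    """
    if stock_conc <= 0 or final_conc <= 0 or final_volume <= 0:
        raise ValueError("concentrations and volume must be positive")
    if final_conc > stock_conc:
        raise ValueError("cannot dilute to a higher concentration")
    return final_conc * final_volume / stock_conc


# 1X SDS running buffer from 20X MOPS stock, 500 mL final volume
mops_ml = stock_volume(20, 1, 500)
water_ml = 500 - mops_ml
print(f"20X MOPS: {mops_ml} mL, water: {water_ml} mL")

# 20 mM HEPES from 1 M (1000 mM) stock; the 100 mL batch is a hypothetical size
hepes_ml = stock_volume(1000, 20, 100)
print(f"1 M HEPES: {hepes_ml} mL made up to 100 mL")
```

For the running buffer this reproduces the 25 mL of 20X stock plus 475 mL of water given in item 1.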

3

Methods
Below we give a stepwise protocol for ELISA testing of SLE patient samples on a panel of 11 disease-related antigens. The assay uses nine nucleic acid antigens and two protein antigens. Five DNA and three RNA antigens consist of natural nucleotides, whereas one DNA antigen contains a chemical modification, locked nucleic acid (LNA). LNA is known to improve the binding affinity in DNA/DNA, DNA/RNA, and nucleic acid–protein complexes in a sequence-dependent way [14]. The antigen sequences and their background are given in Table 1.
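Because the double-stranded antigens in Table 1 are listed as two strands written 5′ to 3′, it can be useful to check strand complementarity before ordering or annealing. The following Python sketch is an illustrative helper, not part of the published protocol; the function names are hypothetical, and the example uses the LS_R1 strand pair as printed in Table 1.

```python
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(seq, rna=False):
    """Return the reverse complement of a 5'->3' sequence (spaces ignored)."""
    pairs = RNA_PAIRS if rna else DNA_PAIRS
    bases = seq.replace(" ", "").upper()
    return "".join(pairs[base] for base in reversed(bases))


def is_full_duplex(strand_a, strand_b, rna=False):
    """True if strand_b is the exact reverse complement of strand_a."""
    return reverse_complement(strand_a, rna) == strand_b.replace(" ", "").upper()


# LS_R1 strands as listed in Table 1 (both written 5'->3')
LS_R1_SENSE = "UGA AGA CCU CAC AGU AAA AAU AGG UGA UUU"
LS_R1_ANTISENSE = "AAA UCA CCU AUU UUU ACU GUG AGG UCU UCA"

print(is_full_duplex(LS_R1_SENSE, LS_R1_ANTISENSE, rna=True))
```

The check is strict: strands of unequal length or with any mismatch return False, so it only flags whether a pair forms a fully complementary blunt-ended duplex.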

3.1 Buffers for Enzyme-Linked Immunosorbent Assay (ELISA)

1. Washing buffer/phosphate-buffered saline (1X PBS) (see Note 2).
2. Blocking solution/PTB buffer: Dissolve 20 g of commercially available bovine serum albumin (BSA) in 1 L of 1X PBS. Add 50 μL of biology grade Tween 20. Store the remaining BSA according to the supplier’s guidelines and the diluted BSA solution at +4 °C. Discard the diluted solution if haze is noted in the stored buffer.
3. Diluent: Prepare the dilution solution by dissolving 2 g of BSA in 1 L of 1X PBS. Add 50 μL of biology grade Tween 20. The diluent must be freshly prepared.
4. Tetramethylbenzidine (TMB) substrate: The choice of substrate depends on the enzyme conjugated to the antihuman immunoglobulins. para-Nitrophenylphosphate (pNPP, pH 9.8) and tetramethylbenzidine are widely used. Both substrates are commercially available and can also be easily prepared in the lab from standard reagents. Dissolve 3 mg of analytical grade TMB in 5 mL of biological grade dimethyl sulfoxide (DMSO) and adjust the volume to 50 mL with 0.1 M acetate buffer. Add 3 μL of hydrogen peroxide (H2O2) to the adjusted TMB buffer and store at +4 °C. Store all the remaining reagents according to the supplier’s recommendation (see Note 3).
5. Stopping solution: Different stopping solutions are used for different conjugated antisera and substrates. Usually, sodium hydroxide (NaOH) and sulfuric acid (H2SO4) are used for pNPP and TMB substrate, respectively. Both are commercially available but can be easily prepared in the lab. For TMB substrate, dilute analytical grade sulfuric acid (H2SO4) to a concentration of 1 M and store according to the supplier’s recommendation (see Note 4).

Table 1 List of DNA, RNA, LNA, and protein antigensᵃ

1. D5
Sequence, 5′→3′: d(TCC TCT CTT TCT CTT TCT CTT TCC TCT CTT TCT CTT CTT CTT TCC TCT CTT TCC CTT TCT CTT):d(AAG AGA AAG AGA AAG AGA GGA AAG AGA AAG AGA AAG AGA GGA AAG AAG AGA AA)
Comments: D5 is a double stranded TC rich DNA previously used in SLE studies [21]

2. HUV
Sequence, 5′→3′: d(CGA GGA TGA CCG CTC AGC CGC CGC TGC ACC ACC GCC ACC ACC CGT ACG CCC TGT TCG GGA):d(TCC CGA ACA GGG CGT ACG GGT GGT GGC GGT GGT GCA GCG GCG GCT GAG CGG TCA TCC TCG)
Comments: HUV is a cytomegalovirus specific sequence known from systemic sclerosis [22]

3. LS_ag1
Sequence, 5′→3′: r(UCC UCU CUU UCU CUU CUC UUU CCU CUC UUU):r(AAA GAG AGG AAA GAG AAG AGA AAG AGA GGA)
Comments: Double stranded RNA of analogous sequence to ds D5 [16]

4. L3D
Sequence, 5′→3′: d(ATLG ATA CALG ACA TALG AGTCA):d(TGC ACT CTA TGT CTG TAT CAT)
Comments: L3D is a mixmer of LNA and DNA previously used in LS studies [16]

5. EBF3_r
Sequence, 5′→3′: r(UGA AAA CUG CUG GCG GCG GCC GCA GCU CCC GGC CGA AAG CGT UUC CUC GAG CAG CGG CGC):r(GCG CCG CUG CUC GAG GAA ACG CUU UCG GCC GGG AGC UGC GGC CGC CGC CAG CAG UUU UCA)
Comments: EBF3_r codes for early B cell factor 3, which is associated with a variety of ADs [17]

6. ssD4
Sequence, 5′→3′: d(CAT GAA GAC CTC ACA GTA AAA ATA GGT GAT TTT GGT CTA GCT ACA GTG AAA TCT CGA T)
Comments: ssD4 is a single stranded DNA sequence previously used in LS studies [21]

7. ssLS_R
Sequence, 5′→3′: r(UCC UCU CUU UCU CUU CUC UUU CCU CUC UUU)
Comments: LS_R is a single stranded UC rich RNA corresponding to the TC rich D5 [16]

8. D4
Sequence, 5′→3′: d(CAT GAA GAC CTC ACA GTA AAA ATA GGT GAT TTT GGT CTA GCT ACA GTG AAA TCT CGA T):d(ACT CCA TCG AGA TTT CAC TGT AGC TAG ACC AAA ATC ACC TAT TTT TAC TGT GAG GTC T)
Comments: D4 is the corresponding double stranded DNA used for SLE and LS studies [16, 21]

9. LS_R1
Sequence, 5′→3′: r(UGA AGA CCU CAC AGU AAA AAU AGG UGA UUU):r(AAA UCA CCU AUU UUU ACU GUG AGG UCU UCA)
Comments: LS_R1 is double stranded RNA with mixed RNA nucleotides corresponding to D4 [16]

10. ApoH
Sequence, N→C: MISPVLILFS SFLCHVAIAG RTCPKPDDLP FSTVVPLKTF YEPGEEITYS CKPGYVSRGG MRKFICPLTG LWPINTLKCT PRVCPFAGIL ENGAVRYTTF EYPNTISFSC NTGFYLNGAD SAKCTEEGKW SPELPVCAPI ICPPPSIPTF ATLRVYKPSA GNNSLYRDTA VFECLPQHAM FGNDTITCTT HGNWTKLPEC REVKCPFPSR PDNGFVNYPA KPTLYYKDKA TFGCHDGYSL DGPEEIECTK LGNWSAMPSC KASCKVPVKK ATVVYQGERV KIQEKFKNGM LHGDKVSFFC KNKEKKCSYT EDAQCIDGTI EVPKCFKEHS SLAFWKTDAS DVKPC
Comments: Anti-apolipoprotein H (anti-ApoH) antibodies are strongly associated with lupus and sclerosis and are used as diagnostic markers [18]

11. Reduced ApoH
Sequence, N→C: Reduced disulfide bond between Cys288 and Cys326 in domain V of the protein sequence
Comments: Wang et al. [19] demonstrated the role of reduced apolipoprotein in inhibiting the TGF-β1-p38 MAPK pathway leading to nephropathy

LS localized scleroderma, SLE systemic lupus erythematosus
ᵃ LNA is indicated with upper case L

3.2 Antigens

Table 1 lists the DNA, RNA, LNA, and protein antigens used in this study [15–20]. A stepwise procedure for preparing the antigens and coating ELISA plates is given below.

1. Double-stranded DNA and RNA antigens for a single ELISA plate: Anneal the single-stranded DNA/RNA with its complementary strand in a 1:1 molar ratio in 10X PBS buffer for 10 min at 92 °C for DNA and 85 °C for RNA. Adjust the volume to 0.868 mL with 1X PBS to obtain a final antigen concentration of 3.5 μg/mL. Store the individual DNA/RNA strands at −20 °C prior to annealing (see Note 5).
2. Single-stranded DNA/RNA antigens and protein antigen for a single ELISA plate: Mix 3.04 μg of single-stranded DNA/RNA/protein in 1X PBS to obtain a final concentration of 3.5 μg/mL. It is recommended to make aliquots of DNA, RNA, and protein after receiving them from suppliers and to store them at −20 °C.
3. Beta II glycoprotein (ApoH): Reduce ApoH with reduced thioredoxin (TRX-1) as described in the previously reported method [15]. Reduce TRX-1 by incubating 5 μM human recombinant thioredoxin 1 with 25 μM dithiothreitol (DTT) at 37 °C for 1 h. Add 0.2 μM ApoH to the reaction mixture and incubate at 37 °C for 1 h. Quench the reaction mixture with 200 μM glutathione (GSH) for 10 min at 37 °C. Perform all reactions in 20 mM HEPES buffer at pH 7.4. Confirm the reduction of ApoH by SDS-PAGE under nonreducing conditions. Mix 3.04 μg of reduced protein in 1X PBS to obtain a final concentration of 3.5 μg/mL. Store at −20 °C for up to a month and at −80 °C for a prolonged period (see Note 6).
4. Secondary antibodies (conjugated antisera): Four different reagent grade horseradish peroxidase (HRP) antihuman antiserum conjugates, IgG (Fc specific), IgM (μ-chain specific), IgA, and IgE (ε-chain specific), developed in goat or rabbit, are used as secondary antibodies and are commercially available. Prepare the HRP antibody solution by diluting the commercially available antihuman antiserum conjugates in the previously prepared diluent (2 g BSA, 50 μL Tween 20, 1 L 1X PBS). It is recommended to follow the supplier’s guide for diluting the antisera, which mostly ranges from 1:250 to 1:20,000. The dilution should be prepared on the day of conjugation (see Note 7).

3.3 Equipment

1. Thermomixer with a maximum speed of 3000 rpm (1000 × g), mixing frequency of 300–3000 rpm (10–1000 × g), and temperature accuracy of max. ±0.5 °C at 20–45 °C, used for simultaneous incubation and mixing of samples at variable temperature.
2. Centrifugal filter device: A fast ultrafiltration device that allows easy recovery of the concentrate from crude sample is recommended for purifying the samples. These filter devices are commercially available and should be stored according to the manufacturer’s recommendations, usually at room temperature.
3. Centrifuge rotor: The centrifuge should have a speed range of 100 to 15,300 rpm, with a microcontroller to control speed, time, temperature, and gravitational field.
4. Spectrophotometer.
5. Mini-Cell electrophoresis system. It is recommended to use the running buffer, gel cassette, and electrophoresis system from the same supplier.
6. Maxisorp plates: Sterile, 96-well, high-binding Maxisorp plates are recommended for the ELISA assay. These are commercially available and should be stored according to the supplier’s guidelines, usually at room temperature.
7. Precision pipettes and disposable pipette tips: Single-channel pipettes with volume ranges of 0–3, 0–20, 20–200, and 100–1000 μL, as well as multichannel (8 and 12 channels) pipettes with a volume range of 0–300 μL, are needed. The corresponding sterile tips can be obtained from a commercial supplier.
8. Incubator: An incubator or oven with a temperature controller can be used for incubation at varying temperatures.
9. ELISA plate reader: The optical density of the samples in 96-well plates is measured with a plate reader at the appropriate wavelength (405 nm for ALP/pNPP assays and 450 nm for HRP/TMB assays).

3.4 Purification of Reduced Apolipoprotein H

Perform the purification at room temperature in a centrifuge. Check the clearance before spinning the filter device to avoid any damage during centrifugation. Rinse the filter device with ultrapure water before use to avoid interference in the analysis from trace amounts of glycerin in the device. Do not let the membrane dry after rinsing; it is recommended to leave fluid on the membrane until the device is used.

1. Add up to 0.5 mL of sample to a suitable filter device with a molecular weight cutoff of 30 kDa and place it in the centrifuge, counterbalanced with a similar device. It is recommended to use a filter device with a molecular weight cutoff of about half the molecular weight of the protein.
2. Spin in a fixed-angle rotor at 5000 × g maximum for approximately 5 min. Follow the supplier’s guidelines for the typical spin time. Check the sample volume and, if needed, add sample buffer, resuspend, and spin for another 5 min.
3. Recover the purified and concentrated sample with a pipette and sterile tip, using a side-to-side sweeping motion to ensure total recovery. Remove the purified sample immediately after centrifugation for optimal recovery.
4. Pipette an aliquot of the recovered sample onto the lower measurement pedestal of the spectrophotometer and close the sampling arm.
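The spin in step 2 is specified as a relative centrifugal force (× g), whereas many centrifuges are set in rpm. A minimal Python sketch of the standard conversion, RCF = 1.118 × 10⁻⁵ × r × rpm² with r in cm, is given below. The 8 cm rotor radius is a hypothetical value used only for illustration; the actual radius must be taken from the rotor documentation.

```python
import math

RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in rpm


def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (x g) at a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf, radius_cm):
    """Rotor speed (rpm) needed to reach a given relative centrifugal force."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


# Hypothetical rotor radius; replace with the value for the rotor in use
ROTOR_RADIUS_CM = 8.0

rpm_needed = rpm_from_rcf(5000, ROTOR_RADIUS_CM)
print(f"5000 x g at r = {ROTOR_RADIUS_CM} cm needs about {rpm_needed:.0f} rpm")
```

Because RCF scales with the square of the speed, small differences in rotor radius change the required rpm noticeably, which is why the radius should not be guessed.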

3.5 4–12% SDS–Polyacrylamide Gel Electrophoresis

Perform all experimental procedures for SDS–polyacrylamide gel electrophoresis at room temperature unless otherwise specified for particular samples. It is recommended to follow the supplier’s guidelines for setting up the apparatus, handling gels and buffers, and storing all reagents.

1. Prepare samples for running on polyacrylamide gels by adding 2 μL of SDS-PAGE nonreducing sample buffer (125 mM Tris–HCl, pH 6.8, 20% glycerol, 4% SDS, and 0.1% bromophenol blue) to 5 μg of reduced or commercial protein sample. Mix the samples slowly by pipetting up and down; avoid vortexing protein samples.
2. Fit the gel cassette into the electrophoresis system and remove the comb. Fill the inner chamber with running buffer (electrophoresis buffer) and keep filling until the buffer overflows and reaches the required level in the outer chamber (see Note 8).
3. Load the prepared samples into the wells, with 5 μL of molecular weight marker in the first lane for reference. Check for buffer leaks and close the electrophoresis system with its lid. Connect the system to the power supply, set the voltage (approx. 100 V), and run for 2 h or until the lowest band of the protein ladder reaches the foot line of the gel plate (see Note 9).
4. Remove the gel from the electrophoresis cell and carefully place it in a Petri dish containing fixation buffer (500 mL MeOH, 100 mL AcOH, 400 mL H2O) for 2 h, followed by staining with Coomassie protein stain (1 g Coomassie Brilliant Blue, 500 mL MeOH, 100 mL AcOH, 400 mL H2O) for at least 2 h. After staining, destain the gel with destaining buffer (400 mL MeOH, 100 mL AcOH, 500 mL H2O). Observe the clear bands in the transparent gel and take a picture.

3.6 Enzyme-Linked Immunosorbent Assay (ELISA)

The assay should be carried out in an ultraclean, DNase- and RNase-free working space (bleached or washed with ethanol). It is recommended to read and follow the biohazard guidelines before starting, because this protocol involves handling clinical human samples as well as hazardous chemicals. Multichannel pipettes with DNase- and RNase-free, low-retention sterile tips should be used. A blank must be included in every plate. It is recommended to prepare fresh buffer for this assay and to cover the ELISA plate during incubation.

1. Plate coating: Coat Maxisorp 96-well plates with antigens at a concentration of 3.5 μg/mL in 1X PBS (100 μL/well) using a multichannel pipette and incubate in the fridge at +4 °C overnight (12–18 h). It is recommended to leave the last lane of each plate as a blank without antigen. Use the coated plate within 48 h (see Note 10).
2. Plate blocking: Remove the plates from the fridge and discard the remaining solution. Allow the plates to reach room temperature. Wash the plates two times with washing buffer (1 L 1X PBS, 50 μL Tween 20; 300 μL/well). Block the plates with 1X PTB buffer (20 g BSA, 50 μL Tween 20, 1 L 1X PBS; 100 μL/well) and incubate some plates at room temperature and some at 37 °C for 1 h to determine the effect of temperature on binding (see Note 11).
3. Incubation with serum: Wash the plates two times with the washing buffer and incubate the respective plates with plasma diluted 1:100 (1 μL plasma in 100 μL diluent [2 g BSA, 50 μL Tween 20, 1 L 1X PBS]; 100 μL/well) for 1.5 h at the desired incubation temperature. Follow the supplier’s guidelines for diluting plasma. Prepare the sera box a day ahead and store at +4 °C.
4. Incubation with HRP-conjugated secondary antibody: Remove the serum solution from the plates and wash three times with the same washing buffer as before. Incubate with the secondary antibody (HRP-Ab; 100 μL/well), diluted 1:20,000 in the previous diluent or according to the supplier’s guidelines, for 1.5 h at the respective incubation temperature.
5. Incubation with substrate: Repeat the washing step as before and incubate with freshly prepared TMB solution (3 mg TMB, 5 mL DMSO diluted to 50 mL with 0.1 M acetate buffer, pH 5.4–5.7, 3 μL H2O2; 100 μL/well) for 20–30 min. Avoid light exposure during color development by covering the plate with opaque foil. Observe the color change in the wells from transparent to light blue.
6. Stop the reaction by adding 1 M H2SO4 (50 μL/well) using a multichannel pipette. Read the plates in an ELISA plate reader at 450 nm for HRP/TMB substrate or 405 nm for ALP/pNPP substrate. Subtract the OD values of the blank wells from the mean values to remove background interference during data analysis. High blank values indicate high background interference, and the data may not be reliable. Run statistical analyses including box plots, one-way ANOVA, R2 values, and coefficient of variation.

Using the aforementioned protocol, we performed the ELISA assay (IgG, IgM, and IgE) with multiple antigens to test a cohort of matched healthy samples (n = 11) and patients with SLE (n = 11). All the biomolecules used in this study are strongly associated with SLE and localized scleroderma [11–18].
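The blank subtraction, coefficient of variation, and one-way ANOVA mentioned in step 6 can be sketched in a few lines of Python. This is an illustrative outline only: the function names and the OD readings are hypothetical, and only the F statistic is computed here, since obtaining the p value additionally requires the F distribution from a statistics package.

```python
from statistics import mean, stdev


def blank_corrected(od_values, blank_values):
    """Subtract the mean OD of the blank wells from each sample OD."""
    background = mean(blank_values)
    return [od - background for od in od_values]


def coefficient_of_variation(values):
    """Coefficient of variation in percent (sample standard deviation / mean)."""
    return 100.0 * stdev(values) / mean(values)


def one_way_anova_f(*groups):
    """F statistic of a one-way ANOVA across two or more groups."""
    all_values = [v for group in groups for v in group]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical OD450 readings for one antigen (not data from this study)
blank = [0.05, 0.06, 0.05]
sle = blank_corrected([0.42, 0.38, 0.45, 0.40], blank)
healthy = blank_corrected([0.15, 0.18, 0.12, 0.16], blank)

print("CV of SLE wells (%):", round(coefficient_of_variation(sle), 1))
print("F statistic:", round(one_way_anova_f(sle, healthy), 1))
```

A large F statistic indicates that the between-group variance dominates the within-group variance; the corresponding p value should be taken from the statistics software used for the full analysis.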
The sera samples were provided by Associate Professor Kathryn Torok, Pittsburgh Children Hospital, USA (healthy sera), and by Professor Elizabeth Mellins, Stanford University, USA (SLE sera), and were collected under protocols approved by the local Institutional Review Boards. The median age was 34 years for the patients (SLE) and 35.6 years for the healthy controls (HC), with 77% females in both groups. No personal data of the patients and healthy donors have been disclosed. Sera were stored at −20 °C for 3 months after withdrawal, prior to the assay. ELISA data were analyzed in R by one-way ANOVA.

[Fig. 1 image: box plots, panels a–g, each comparing the SLE, Healthy, and Blank groups]

Fig. 1 Box plot for relative absorbance values across different antigens, IgG study. (a) ssD4 synthetic antigen, p value = 0.001657. (b) D4 synthetic antigen, p value = 0.01485. (c) L3D synthetic antigen, p value = 1.933e−05. (d) HUV synthetic antigen, p value = 0.0001322. (e) ssLS_R synthetic antigen, p value = 9.829e−10. (f) EBF3_R synthetic antigen, p value = 1.353e−06. (g) Reduced ApoH synthetic antigen, p value = 7.212e−08

The assay combines DNA, RNA, LNA, and protein antigens in the same microwell plate. The reported incubation temperatures for nucleic acid and protein antigens are 37 °C and room temperature, respectively. Therefore, binding at different incubation temperatures was analyzed prior to the assay. Incubation at room temperature reduced the binding three-fold compared to incubation at 37 °C (data not shown). We therefore proceeded with testing HC and SLE sera using incubation at 37 °C. The data obtained by ELISA are illustrated in the box plots (Figs. 1, 2, and 3). First, the blank and healthy control show a significant difference in binding levels compared to SLE (p < 0.05; one-way ANOVA). Seven antigens (ssD4, D4, L3D, HUV, ssLS_R, EBF3_r, and reduced ApoH) showed a significant difference between SLE and healthy samples for IgG, as shown in the box plot (Fig. 1) with a p value

550 nm.

DNA-Mediated Liposome Fusion


17. A set of complementary LiNAs or other fusogenic agents. See Ries et al. [14].

3 Methods

3.1 Stock Solutions

1. 10× HEPES-buffered saline (HBS). Prepare 1 l of a 10× buffer stock solution: 100 mM HEPES·NaOH, 1 M NaCl, pH 7.0. Weigh 58.44 g (1 mol) NaCl and 23.83 g (0.1 mol) N-2-hydroxyethylpiperazine-N′-2-ethanesulfonic acid (HEPES) and dissolve in 500 ml H2O. Adjust the pH with a NaOH solution (>1 M), noting the volume added. Bring to a final volume of 1 l. Filter through an HPLC-grade filter. Store at 4 °C.
2. HBS working buffer. Dilute 50 ml of the 10× HBS stock to 500 ml in a volumetric flask to obtain the working concentration. Filter the buffer through a 0.2 μm (or smaller) pore filter in 50 ml portions before use.
3. Sulforhodamine B (SRB) solution. 20 mM SRB in 50 ml HBS. Dissolve 1 mmol (559 mg) SRB in 5 ml of the 10× HBS stock solution and 40 ml of water in a beaker. Adjust the pH to 7.0 using NaOH (see Note 1). Bring to 50 ml using H2O.
4. DOPC. 10 mM in 20 ml CHCl3. Weigh 0.2 mmol (147 mg), dissolve, and store at −21 °C for up to 6 months.
5. DOPE. 10 mM in 10 ml CHCl3. Weigh 0.1 mmol (74 mg), dissolve, and store at −21 °C for up to 6 months.
6. Cholesterol. 10 mM in 10 ml CHCl3. Weigh 0.1 mmol (39 mg), dissolve, and store at −20 °C for up to 6 months.
7. DPPE-NBD. 1 mM in MeOH. Weigh 1–2 μmol (0.87–1.74 mg), calculate the MeOH volume (between 1 and 2 ml) needed to attain a 1 mM concentration, and dissolve.
8. Triton X-100 solution. Prepare a 1% w/v (15.5 mM) solution by dissolving 0.5 g in water in a 50 ml volumetric flask. Store at room temperature.
9. LiNA stock solutions. Dissolve RP-HPLC purified oligonucleotides in 50% acetonitrile/H2O and store at 4 °C.
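For item 7, the solvent volume has to be calculated from the mass actually weighed. A minimal Python sketch of that calculation follows; the function name is hypothetical, and the molar mass of 870 g/mol is an assumption inferred from the 0.87 mg per μmol quoted in the list, so it should be replaced by the value on the supplier’s certificate.

```python
def solvent_volume_ml(mass_mg, molar_mass_g_per_mol, target_mM):
    """Volume of solvent (ml) that gives target_mM for a weighed mass.

    mg / (g/mol) gives mmol; multiplying by 1000 gives micromol.
    Since 1 mM equals 1 micromol/ml, micromol / mM gives ml.
    """
    micromol = mass_mg / molar_mass_g_per_mol * 1000.0
    return micromol / target_mM


# Assumed molar mass, inferred from 0.87 mg per micromol in item 7
DPPE_NBD_MOLAR_MASS = 870.0

for weighed_mg in (0.87, 1.30, 1.74):
    volume = solvent_volume_ml(weighed_mg, DPPE_NBD_MOLAR_MASS, target_mM=1.0)
    print(f"{weighed_mg:.2f} mg -> {volume:.2f} ml MeOH for 1 mM")
```

The same function applies to the other stock solutions, for example to adjust the chloroform volume when the weighed lipid mass deviates from the nominal amount.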

3.2

Liposomes

Liposomes are prepared by mixing lipid stock solutions to achieve a 2:1:1 molar ratio of DOPC, DOPE, and cholesterol, respectively. Lipid films (a pair of which will yield liposome stock solutions for at least 32 experiments at 250 μl total volume) are prepared in parallel and stored at −21 °C for up to 14 days. Freshly formed liposomes are sized using a standard extrusion technique, and their size distribution is measured using NTA or DLS.

Philipp M. G. Löffler et al.

3.2.1 Determine Concentrations of DOPC and DOPE Stock Solutions

Due to the hygroscopic nature of the lipids, even freshly opened containers of lyophilized phospholipids may contain additional water, leading to an overestimation of lipid concentration. A corrected concentration is found by performing a standard phosphate assay [46].
1. Prepare standards for the calibration curve in duplicate (2 × 7 samples: 200 μl water (0 nmol), 20 μl (13 nmol), 50 μl (32.5 nmol), 100 μl (65 nmol), 150 μl (97.5 nmol), and 200 μl (130 nmol) of the phosphate standard) in 10 ml glass vials.
2. Prepare sample solutions for each phospholipid stock in duplicate: an appropriate volume of sample to reach a target of approx. 100 nmol phospholipid, in 10 ml glass vials.
3. Dry all solutions at 80 °C on a heater plate.
4. Add 70% perchloric acid (0.65 ml) to the residues.
5. Incubate the solutions at 190 °C for 1.5 h.
6. Cool down to room temperature and add water (3.3 ml), 2.5% (w/v) aqueous ammonium molybdate solution (0.5 ml), and 10% (w/v) aqueous sodium ascorbate solution (0.5 ml).
7. Heat to 100 °C for 10 min.
8. Measure the absorbance of the standards and the samples at a wavelength of 800 nm.
9. Use the absorbance of the standards to prepare a calibration curve with a linear fit and calculate the accurate stock solution concentrations.
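The linear calibration in step 9 can be sketched in Python as follows. This is a minimal illustration, not part of the published protocol: the absorbance readings, the 10 μl sample volume, and the function names are hypothetical, and the calculation assumes one phosphate per phospholipid molecule.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def stock_concentration_mM(absorbance, slope, intercept, sample_volume_ul):
    """Phospholipid concentration of the stock (mM) from one assay vial.

    Assumes one phosphate per phospholipid, so nmol phosphate = nmol lipid.
    """
    nmol = (absorbance - intercept) / slope  # nmol phosphate in the vial
    return nmol / sample_volume_ul           # nmol/ul is numerically equal to mM


# Phosphate amounts of the standards (nmol), taken from step 1.
standards_nmol = [0, 13, 32.5, 65, 97.5, 130]
# Hypothetical mean absorbances at 800 nm (illustrative values only).
standards_abs = [0.00, 0.10, 0.25, 0.50, 0.75, 1.00]

slope, intercept = linear_fit(standards_nmol, standards_abs)

# Hypothetical sample: 10 ul of a nominally 10 mM stock gave A800 = 0.70.
conc = stock_concentration_mM(0.70, slope, intercept, sample_volume_ul=10)
print(f"corrected stock concentration: {conc:.2f} mM")
```

In practice the duplicate readings would be averaged (or fitted together) before the concentration is calculated.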

3.2.2 Lipid Film Preparation

To achieve a homogeneous mixture of DOPC, DOPE, and cholesterol, calculate appropriate aliquots based on the stock solution concentrations measured in Subheading 3.2.1 (4 μmol total lipid, 2:1:1 molar ratio, respectively).
1. Prepare two round-bottomed receiving vessels per batch of experiments (see Note 2).
2. Add 2 μmol DOPC, 1 μmol DOPE, and 1 μmol cholesterol from their stock solutions. If applicable, add 10 μl (0.01 μmol, 0.25 mol%) NBD-DPPE solution for lipid labeling.
3. Mix and then evaporate the chloroform under a stream of N2 or argon.
4. Place under vacuum overnight to remove trace chloroform.
5. Store at −21 °C for up to 14 days.
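The aliquot calculation for step 2 is a simple amount-over-concentration division. The sketch below illustrates it; the corrected stock concentrations and the function name are hypothetical placeholders for the values obtained in Subheading 3.2.1.

```python
def aliquot_volume_ul(amount_umol, stock_conc_mM):
    """Volume of stock (ul) that contains the requested amount (umol).

    mM equals umol/ml, so amount / concentration is in ml; convert to ul.
    """
    return amount_umol / stock_conc_mM * 1000.0


# Target amounts per lipid film (umol), from step 2.
targets_umol = {"DOPC": 2.0, "DOPE": 1.0, "cholesterol": 1.0}
# Hypothetical corrected stock concentrations (mM).
stocks_mM = {"DOPC": 9.1, "DOPE": 9.4, "cholesterol": 10.0}

volumes_ul = {
    lipid: aliquot_volume_ul(targets_umol[lipid], stocks_mM[lipid])
    for lipid in targets_umol
}
for lipid, volume in volumes_ul.items():
    print(f"{lipid}: {volume:.1f} ul")
```

With the nominal 10 mM concentrations this reduces to 200 μl DOPC and 100 μl each of DOPE and cholesterol.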

DNA-Mediated Liposome Fusion

3.2.3 Liposome Preparation and Extrusion

Liposomes are formed by rehydration of the lipid films in HBS or SRB solution to a total lipid concentration of 20 mM.
1. Add 200 μl HBS or SRB solution to the lipid film. The resulting populations are denoted in curly brackets as “{buffer}” and “{SRB}”, respectively.
2. Vortex the suspension at intervals until all lipid material has desorbed from the inner surface of the vial.
3. To homogenize further, place the suspension in a bath sonicator for 10 min at 50 °C (this reduces backpressure during extrusion).
4. Meanwhile, assemble the hand extruder using a 100 nm pore-size polycarbonate membrane (19 mm Ø) between two drain disks. Pass the suspension 21× through the membrane and collect it in a fresh vial. Observe that the suspension changes from strongly turbid to almost opalescent. If this does not occur, re-extrude with a fresh membrane. For comparable results, use liposomes within 36 h after rehydration (in our hands, a 25–30% lower fusion yield was observed with one-week-old liposomes than with freshly extruded ones).

3.2.4 Size Exclusion of SRB-Filled Liposomes

1. Wash the prepacked spin columns three times with 200 μl HBS by spinning at 750 × g for 1 min. Inspect the column bed: if it is free of cracks and the material is mostly dry but still in contact with the sides of the column vessel, proceed. Otherwise, resuspend the resin in 200 μl HBS, vortex, and restart the washing procedure.
2. Apply 50 μl of extruded {SRB} (20 mM total lipid) drop-by-drop directly onto the column material, avoiding contact with the walls of the column vessel.
3. Transfer the column to a clean receiving vessel (1.5 ml microtube) and spin at 750 × g for 2 min to collect the purified liposomes. Inspect the column to make sure that no deep purple color appears on the column frit, which rules out SRB bleed-through. A faint pink band near the frit is normal and pertains to the remaining liposomes retained on the column.
4. Apply 50 μl HBS and repeat step 3 to elute more liposomes. In the absence of bleed-through, combine the two liposome fractions.
5. Determine the average relative lipid concentration after the spin column (see Note 3). Use either another Bartlett assay or liposomes labeled with 0.25 mol% DPPE-NBD.

3.3 Liposome Characterization Using NTA

1. Dilute the extruded {buffer} liposomes and the extruded and purified {SRB} liposomes to prepare samples at 2 μM total lipid concentration using freshly filtered HBS (0.22 μm filter) over a few steps (e.g., 20 mM → 200 μM → 20 μM → 2 μM). In a flow-cell microscope setup at 200× magnification and a field-of-view of 80 × 80 μm, this concentration should correspond to approx. 100 trackable particles.
2. Acquire at least 5000 valid tracks using the NanoSight system and NanoSight NTA software (ver. 2.0 or higher); typically record 5 × 30 s at 25 frames/s (shutter 387) with the camera gain set to 280 (see Note 4).

3.4 Content Mixing Experiments

3.4.1 Preparation of LiNA Solutions (1 μM)

For optimal fusion outcomes, an exact 1:1 stoichiometry between the LiNAs engrafted onto the liposome populations is crucial. To ensure this, prepare working solutions of LiNAs in HBS under spectrophotometric control:
1. Using the theoretical extinction coefficient of the oligodeoxynucleotide at 260 nm (ε260) [47], calculate the absorbance at 1 μM and 1 cm path length (Atarget) and prepare the solution accordingly, directly in a cuvette (4 × 10 mm chamber, 1 cm light path, 1.5 ml).
2. Provided that the LiNA stock concentration is known, use the Lambert–Beer law to calculate the amount of LiNA stock solution (Vstock) needed for 1000 μl at 1 μM.
3. To a clean cuvette, add 500 μl 2× HBS and (500 μl − Vstock) of milliQ H2O.
4. Zero the instrument.
5. Add Vstock of the desired LiNA, mix thoroughly, and record the absorbance (Aobs).
6. Adjust to 1.00 ± 0.03 μM using calculated aliquots:
(a) To increase the concentration, add LiNA stock and an equal volume of 2× HBS (Vstock = V2×HBS = Vtotal × Atarget/Aobs); the dilution by the small added volumes (Vstock ≪ Vtotal) is not taken into account in the formula.
(b) To dilute, add 1× HBS (V1×HBS = Vtotal × Aobs/Atarget).
7. If necessary, iterate step 6.
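The spectrophotometric adjustment can be sketched as below. This is an illustration under stated assumptions rather than the authors' calculation: the adjustment volumes are derived here from a simple mass balance assuming that absorbance is proportional to concentration (Lambert–Beer), not taken from the formulas as printed in step 6, and the extinction coefficient, stock concentration, and function names are hypothetical.

```python
def target_absorbance(epsilon_260, conc_uM=1.0, path_cm=1.0):
    """Lambert-Beer: A = epsilon (1/(M cm)) * c (M) * l (cm)."""
    return epsilon_260 * conc_uM * 1e-6 * path_cm


def stock_volume_ul(stock_conc_uM, total_ul=1000.0, target_uM=1.0):
    """Volume of LiNA stock needed for total_ul at target_uM."""
    return total_ul * target_uM / stock_conc_uM


def adjustment(a_obs, a_target, total_ul, stock_conc_uM, target_uM=1.0):
    """Return (action, volume in ul) that brings the solution to target_uM.

    Assumption: the observed concentration is target_uM * a_obs / a_target.
    """
    obs_uM = target_uM * a_obs / a_target
    if obs_uM > target_uM:
        # Diluting with v ul of 1x HBS: obs * V / (V + v) = target.
        return "add 1x HBS", total_ul * (obs_uM / target_uM - 1.0)
    if obs_uM < target_uM:
        # Adding v ul stock plus v ul 2x HBS:
        # (obs * V + stock * v) / (V + 2 v) = target.
        v = total_ul * (target_uM - obs_uM) / (stock_conc_uM - 2.0 * target_uM)
        return "add LiNA stock and an equal volume of 2x HBS", v
    return "none", 0.0


a_target = target_absorbance(epsilon_260=170_000)  # hypothetical epsilon260
v_stock = stock_volume_ul(stock_conc_uM=100.0)     # hypothetical 100 uM stock
action, volume = adjustment(0.187, a_target, 1000.0, 100.0)
print(f"A_target = {a_target:.3f}, V_stock = {v_stock:.1f} ul")
print(f"{action}: {volume:.1f} ul")
```

Because each addition changes the total volume slightly, the absorbance should be re-measured after every adjustment, as step 7 indicates.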

3.4.2 LiNA Engraftment on Liposomes

Fusion experiments are carried out in a semi-micro cuvette without stirring in a total volume of 250 μl. For a single round of fusion, four different LiNA-engrafted liposome populations are needed, {SRB}α, {SRB}β, {buffer}α, and {buffer}β, where α and β are complementary LiNA strands. Efficient fusion has been obtained with molar ratios between lipids and LiNA (lipid–LiNA ratio) between 400:1 and 2000:1. A calculation example is given in Table 1, based on the lipid concentration used in our previous papers [35]. The samples are designed to allow each {SRB} liposome to fuse with up to three {buffer} liposomes (see Note 5).


Table 1 Overview of functionalized liposome populations to prepare

#  Liposomes  LiNA  Lipid conc. (μM)  LiNA conc. (μM)  Lipid/LiNA ratio  # per liposome (a)  # of experiments  Amount needed (μl)
1  {SRB}      α     92                0.205            449:1             195                 4                 500
2  {SRB}      β     92                0.205            449:1             195                 4                 500
3  {SRB}      –     92                –                –                 –                   4                 500
4  {buffer}   α     276               0.205            1346:1            65                  2                 250
5  {buffer}   β     276               0.205            1346:1            65                  2                 250
6  {buffer}   –     276               –                –                 –                   2                 250

(a) Estimated based on a 100 nm diameter and an average headgroup area of 0.681 nm² (based on 1:1 DOPC/DOPE, [48])
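The ratios in Table 1 and the corresponding pipetting volumes can be checked with a short calculation. The sketch below is illustrative only: the liposome stock concentration after the spin column (10 mM) is a hypothetical value, and making up the remaining volume with HBS is an assumption of this example.

```python
def lipid_to_lina_ratio(lipid_uM, lina_uM):
    """Molar lipid/LiNA ratio of a functionalized liposome population."""
    return lipid_uM / lina_uM


def mixing_volumes_ul(final_ul, lipid_uM, lina_uM,
                      liposome_stock_uM, lina_stock_uM=1.0):
    """Volumes (ul) of liposome stock, LiNA solution, and HBS.

    Assumption: HBS makes up the remaining volume.
    """
    v_liposomes = final_ul * lipid_uM / liposome_stock_uM
    v_lina = final_ul * lina_uM / lina_stock_uM
    v_hbs = final_ul - v_liposomes - v_lina
    return v_liposomes, v_lina, v_hbs


# Rows 1 and 4 of Table 1.
ratio_srb = lipid_to_lina_ratio(92, 0.205)
ratio_buffer = lipid_to_lina_ratio(276, 0.205)
print(f"{{SRB}}: {ratio_srb:.0f}:1, {{buffer}}: {ratio_buffer:.0f}:1")

# Hypothetical liposome stock of 10 mM (10,000 uM) total lipid.
v_liposomes, v_lina, v_hbs = mixing_volumes_ul(
    final_ul=500, lipid_uM=92, lina_uM=0.205, liposome_stock_uM=10_000
)
print(f"liposomes {v_liposomes:.1f} ul, LiNA {v_lina:.1f} ul, HBS {v_hbs:.1f} ul")
```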

Freshly mix calculated aliquots of the liposome suspensions and the 1 μM LiNA solutions, vortex, incubate at room temperature for 15 min, and use within 30 min thereafter.

3.4.3 Measurement of Content Mixing Incl. Leakage and Leakage Only

With the above set of samples, the following experiments can be run, including controls in which liposomes with noncomplementary LiNAs are paired or in which LiNAs are absent (Table 2). Start the measurements as follows, each time adding 125 μl of each liposome sample directly to the cuvette.
1. Set up the measurement (λEx 545 nm, emission filter 550–1100 nm, λEm 583 nm, data interval 15 s, temperature controller set at a desired temperature between 37 and 50 °C).
2. Add the {SRB} population.
3. Start the measurement.
4. Immediately before a measurement point, add the {buffer} population to the cuvette and mix swiftly.
5. Record data for the next 30 min.
6. Lyse the liposomes in 0.1% w/v Triton X-100 by adding 25 μl of a 1% w/v stock solution; mix well.
7. Record data until the signal stabilizes (see Note 6).

3.4.4 Data Treatment and Calculation of CM

Once the fluorescence intensity at any time (It) is expressed relative to the starting fluorescence intensity, I0 (the first measurement after addition of the second population), one can directly compare CM + L and L traces.
1. Generate It/I0 vs. t graphs by dividing all data points by the corresponding I0 value and align the time axes for the CM + L and L traces, as well as for the controls. See Fig. 3.


Table 2 Content mixing experiments and final lipid concentrations

Cuvette  Type                         Liposomes              Total lipid conc. (μM)
1        CM + L                       {SRB}α + {buffer}β     46 + 138
2        CM + L                       {SRB}β + {buffer}α     46 + 138
3        L                            {SRB}α + {SRB}β        46 + 46
4        L                            {SRB}β + {SRB}α        46 + 46
5        CM + L (Noncomplementary)    {SRB}α + {buffer}α     46 + 138
6        CM + L (Noncomplementary)    {SRB}β + {buffer}β     46 + 138
7 + 8    CM incl. L (No LiNA)         {SRB} + {buffer}       46 + 138
9 + 10   L (No LiNA)                  {SRB} + {SRB}          46 + 46

Concentrations of LiNAs are 0.058 μM for each strand (α and β), unless Noncomplementary (0.105 μM α or β) or No LiNA is indicated

2. Subtract the leakage data from the CM + L data to obtain a graph that shows the fluorescence increase corresponding to CM only (see Note 7).
3. Compare the resulting traces, that is, CM, L, and the control experiments. For an explanation of the control experiments, see Note 8.

3.5 Calibration

Fig. 3 Illustration of the content mixing experiments and controls. The control “L no LiNA” was omitted as it typically overlays with the CM + L curve. Sample data are for the XN P3 anchored LiNA-pair at 50 °C (see Fig. 1)

Moving toward applications of liposome fusion, for example, to initiate chemical reactions within the fused compartment, it may be helpful to gauge the fusion yield between the different populations. To this end, we can prepare samples with different [SRB] in their interior that correspond to different stages of fusion. Most obviously, we can produce samples that correspond to a situation where all {SRB} liposomes have fused with one {buffer} liposome, diluting [SRB] by a factor of two; expressed as the fraction of the original [SRB], f[SRB] = 0.5. With such samples we can generate a measure of “100% fusion yield” or “one round of fusion.” Further, we can emulate a situation where all {SRB} liposomes have fused with another liposome, and another, and so forth, producing dilution factors of 3, 4, or more, respectively. Note, however, that the incremental effect of each additional fusion round becomes smaller each time, reducing assay sensitivity. To obtain a calibration curve, it may be helpful to produce samples with intermediate concentrations as well (e.g., 40% fusion, giving an average f[SRB] = 0.8). The total lipid concentration for these samples should be in the range used during the fusion experiments and should increase with the inverse of f[SRB] to reflect the added material. This way of calibrating neglects the influence of inhomogeneously distributed SRB concentrations within the liposome population, the diameter increase upon fusion, and scattering effects of {buffer} liposomes in the fusion experiment, but should reproduce the fluorescence increase as it is measured in bulk. As leakage will generate extra fluorescence, it should be minimized by measuring samples immediately after size exclusion and dilution. Functionalization with ssLiNA further mitigates this issue (see Note 8, Noncomplementary control). Consider the following table of calibration samples.
1. Produce SRB concentrations of 16, 10, 6.7, and 5 mM by diluting the SRB stock solution with HBS.
2. Resuspend lipid films in duplicate for each SRB concentration listed in Table 3 and follow steps in Subheadings 3.1, steps 3 through 5 to produce liposomes encapsulating SRB.
3. Produce the samples listed in Table 3 in 250 μl HBS.
4. Measure the SRB fluorescence intensity for each concentration (Ix, x = 5, 6.7, 10, 16, and 20 mM) at the temperatures of interest.
5. Add 0.1% w/v Triton X-100, mix, and record the fluorescence.


Table 3 SRB calibration samples

[SRB] in interior (mM)  f[SRB]  [lipid]total (μM)  “Rounds of fusion” (a)  Fusion yield
20                      1       46                 0                       0%
16                      0.8     58                 0.4                     40%
10                      0.5     92                 1                       100%
6.7                     0.33    138                2                       200%
5                       0.25    184                3                       300%

(a) Denotes to which stage of fusion the calibration sample corresponds, that is, how many {buffer} liposomes “have fused” with each {SRB} liposome on average

6. Normalize the values by their fluorescence after lysis.
7. For each temperature, divide the normalized values I5 through I16 by I20 (Ix/I20 values).
8. Plot Ix/I20 against f[SRB] and perform a linear regression. Discard outliers.
9. Noting that Ix/I20 corresponds to It/I0 from the leakage-corrected CM data, use the linear regression to estimate content mixing yields.
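Steps 6 through 9 can be sketched as follows. The intensities are invented for illustration (chosen to lie exactly on a line), the function names are hypothetical, and the inversion simply reads f[SRB] off the fitted line; how a given f[SRB] is translated into a fusion yield follows Table 3 and is not encoded here.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def calibration_ratios(before_lysis, after_lysis, reference_mM=20):
    """Normalize each intensity by its post-lysis value (step 6),
    then divide by the normalized 20 mM value (step 7)."""
    normalized = {c: before_lysis[c] / after_lysis[c] for c in before_lysis}
    return {c: normalized[c] / normalized[reference_mM] for c in normalized}


def f_srb_from_ratio(ratio, slope, intercept):
    """Invert the calibration line: f[SRB] for a measured It/I0 (step 9)."""
    return (ratio - intercept) / slope


# Hypothetical intensities keyed by interior SRB concentration (mM).
before_lysis = {20: 100.0, 16: 140.0, 10: 200.0, 6.7: 233.0, 5: 250.0}
after_lysis = {c: 1000.0 for c in before_lysis}

ratios = calibration_ratios(before_lysis, after_lysis)
f_values = [c / 20 for c in ratios]  # f[SRB] = [SRB] / 20 mM
slope, intercept = linear_fit(f_values, [ratios[c] for c in ratios])

# Hypothetical leakage-corrected It/I0 from a content mixing experiment.
f_est = f_srb_from_ratio(1.6, slope, intercept)
print(f"slope {slope:.2f}, intercept {intercept:.2f}, f[SRB] = {f_est:.2f}")
```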

4 Notes

1. When adjusting the pH of the SRB solution, keep measurements short and immediately flush the pH electrode with plenty of H2O after measuring to avoid contamination of the inner electrolyte with SRB.
2. Lipids dried on the inside of container edges are often difficult to resuspend completely.
3. Fluorimetry to determine the liposome concentration (clipid) after size exclusion (relative to before). Liposome samples labeled with 0.25 mol% NBD-DPPE, taken before and after the spin column, were prepared in triplicate (10 μl in 100 μl HBS containing 0.1% Triton X-100). Fluorescence intensities (I) were measured in a microtiter plate from above (excitation wavelength (λex.), 460 nm; monitored emission wavelength (λem.), 535 nm; average of five readings) and clipid was calculated as follows:

$$c_{\mathrm{lipid}} = \frac{I_{\mathrm{purified}} - I_{\mathrm{blank}}}{I_{\mathrm{stock}} - I_{\mathrm{blank}}}\,[\mathrm{stock}]$$

4. Camera gain settings are instrument dependent. Analysis settings: standard detection threshold (10), automatic blur, min. number of steps 10, automatic maximum jump distance (min. particle size in nm). The diameter of each particle is calculated using the Stokes–Einstein equation based on the mean square displacement extracted from all steps of the particle's track. Thus, the software generates a histogram of particle diameters for the population, the numerical mean of which is taken as the average diameter of the liposomes. The software reports the standard deviation (SD) of the mean diameter as a measure of polydispersity, where an SD ≤ 50 nm is acceptable. In DLS experiments the polydispersity is typically underestimated; for example, a “good” polydispersity index of 0.2 for a 100 nm average diameter translates into an SD of 20 nm.
5. The liposome concentration should be low to prevent extensive scattering of the excitation and emission light, but high enough to achieve a good signal-to-noise ratio during measurement. Tune the detector voltage such that lysis of the leakage experiment mixture with a detergent gives approx. 80% detector saturation.
6. Liposome lysis using 0.1% Triton X-100 gives a fluorescence intensity for maximal dilution of the SRB, Imax (i.e., as if each {SRB} liposome had fused a few thousand times). Many previous studies report the fluorescence increase relative to Imax. This is, however, impractical because SRB fluorescence changes nonlinearly with temperature and with concentration. Thus, it is not meaningful to compare CM incl. L with L data, or data recorded at different temperatures, on this basis. In practical terms, evaluating the fluorescence increase relative to Imax overestimates content mixing at higher temperatures (Imax drops more severely, due to increased molecular anisotropy, than the fluorescence in the quenched state) and underestimates leakage. Nonetheless, the Imax value makes it possible to compare dye concentrations between batches of experiments and clearly identifies leakage samples.
7. Each curve is normalized to its I0, which levels out fluorescence differences between batches. However, to maximize reproducibility, we strongly recommend that this data treatment is done only within the same batch and, if a multicell holder allows it, that CM + L and L data are run in parallel.
8. In the absence of LiNA we measure both CM + L and L, to check for liposome integrity under the conditions in use as well as for the absence of spontaneous CM. In this control one expects It/I0 to be similar for both curves (L ~ CM + L), as no CM is expected to occur. Further, It/I0 should stay

1 standard deviation, and were thus marked as Cy5-AON positive, are indicated by asterisks. We found that AON uptake by cells is heterogeneous, leading to measurable nuclear localization in about 35% of the cells

3.2.2 To Assess Nuclear Localization (Fig. 3)

1. Prior to imaging, wash the cells for 5 min with PBS containing 5 μg/mL Hoechst 33342 to stain the nuclei, then apply fresh medium (see Notes 19 and 20).
2. Image sequentially in all channels to avoid any bleed-through; make z-stacks to image the entire nuclear volume (see Note 21).
3. For data analysis in FIJI:
• Take the maximum intensity projection of all acquired z-slices.
• Make nuclear masks by automatic thresholding on the Hoechst channel (see Note 22).
• Use the watershed function to divide nuclei that are lying against each other.

Advanced Fluorescence Imaging to Determine Intracellular AON Fate

• Use the analyze particles function to count and select the nuclei, applying a size threshold to exclude background signal (see Note 23), and add them to the ROI manager.
• Apply the ROI selection to the Cy5 channel and measure all ROIs, yielding at least the mean gray value and standard deviation for each nucleus.
• Calculate the number of Cy5-positive nuclei by comparing the background-subtracted mean with the standard deviation (see Note 24).

3.3 FLIM and FCS on Solutions and Cell Lysates

3.3.1 Calibration

Calibrate the system using solutions of free Cy5, Cy5-labeled AONs, and mixtures of the two (e.g., 10:90, 25:75, 50:50, 75:25, and 90:10, Fig. 4a).
1. Dilute the Cy5 and Cy5-AONs to a concentration below 100 nM (see Note 25).
2. Adjust the correction ring of the microscope objective to match the thickness of the coverslip or culture vessel bottom or to avoid refractive index mismatching (see Notes 26 and 27).
3. Choose a measurement point in the solution and run FCS Test to set up the laser intensity (see Notes 28 and 29).
4. Calibrate the confocal volume using the dye-only solution, which has a known concentration and diffusion constant.
5. Perform FCS measurements of the various solutions. Longer measurements and/or a larger number of repetitions will lead to a more accurate fit (see Note 30). In dye solutions, when bleaching and/or the presence of aggregates are not a concern, we preferably measure for at least 10 min.
6. Fit the autocorrelation functions using an appropriate model. For Cy5, we use an extended 3D diffusion model with two triplet times (see Note 31), where one triplet time represents the light-driven cis–trans isomerization [15]. Determine the diffusion time of the free dye and the AON-bound dye separately; globally fit the triplet times, as these are expected to be the same for free and AON-bound Cy5.
7. Fit the lifetime data that have been recorded simultaneously (this is an option for the Leica FALCON; other instrument configurations may require separate measurements). We generally use a one-component n-exponential reconvolution model for the pure solutions. These values are then used as fixed input in a two-component model for the mixtures.


M. Leontien van der Bent et al.

Fig. 4 Lifetime measurements on various samples. (a) Fast FLIM histograms from Cy5, Cy5-AONs and various mixtures thereof in solution (shades of blue to red; left y-axis), PF14 polyplexes (gray; left y-axis) and cell

3.3.2 Measurements on Polyplex Solutions

The lifetimes that were determined in the pure solutions can be used to determine the degree of quenching in polyplexes (Fig. 4a).
1. Form polyplexes as described above.
2. Dilute the polyplexes in milliQ water so that the Cy5-AON concentration is below 100 nM.
3. Record lifetime-resolved fluorescence as described for the pure dye solutions.
4. Fit the fluorescence lifetime decays using a three-component n-exponential reconvolution or tail fit model. Fix the two longer lifetime components according to the lifetimes found for free and AON-bound Cy5. If repeated measurements were done, use global fitting of the shortest component between replicate measurements. Determine the amplitude-weighted mean lifetime.
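For reference, the amplitude-weighted mean lifetime in step 4 is the sum of the component lifetimes weighted by their fitted amplitudes, divided by the sum of the amplitudes. A minimal sketch follows; the amplitudes and lifetimes are hypothetical and only illustrate the arithmetic, since the fitting software normally reports this value directly.

```python
def amplitude_weighted_lifetime(amplitudes, lifetimes_ns):
    """Amplitude-weighted mean lifetime: sum(a_i * tau_i) / sum(a_i)."""
    if len(amplitudes) != len(lifetimes_ns):
        raise ValueError("need one amplitude per lifetime component")
    total = sum(amplitudes)
    return sum(a * t for a, t in zip(amplitudes, lifetimes_ns)) / total


# Hypothetical three-component fit: quenched, free Cy5, AON-bound Cy5.
amplitudes = [0.5, 0.3, 0.2]
lifetimes_ns = [0.4, 1.0, 1.8]

tau_mean = amplitude_weighted_lifetime(amplitudes, lifetimes_ns)
print(f"amplitude-weighted mean lifetime: {tau_mean:.2f} ns")
```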

3.3.3 Measurements on Cell Lysates

To determine the degree of cleavage of the Cy5 from the Cy5-AON inside cells, perform FCS measurements on cell lysates (Fig. 4a, b; see Note 32).
1. Incubate cells with polyplexes as described above.
2. After 24 h, prepare cell lysates:
(a) Wash the cells twice with HKR buffer containing 100 μg/mL heparin to dissociate and remove any extracellular polyplexes. After this, trypsinize the cells. When all cells have detached, add two volumes of HKR buffer and transfer the suspension to microcentrifuge tubes.
(b) Spin down the cells at 1000 × g for 5 min at 4 °C; wash the pellet once with HKR buffer and spin down again.

Fig. 4 (continued) lysates from PF14 polyplex-treated myoblasts (gray dashed line; right y-axis). Free Cy5 and Cy5-labeled AONs can be distinguished based on the fluorescence lifetime. The shorter lifetime observed for PF14 polyplexes indicates fluorescence quenching. The lifetime observed in PF14-treated cell lysates indicates that the majority of the signal originated from intact Cy5-labeled AONs. (b) Relative diffusion amplitudes as determined by FCS are skewed to Cy5-AONs in mixtures of Cy5 and Cy5-AONs. Relative triplet fractions show a roughly linear relationship, which can be used to infer the amount of free Cy5 in cell lysates. Again, these results show that only a small fraction of the Cy5 in PF14-treated cell lysates is free, whereas the majority is still bound to AONs. (c) Fast FLIM image of immortalized human myoblasts treated with polyplexes for 24 h, then washed and imaged using a Leica TCS SP8 FALCON system. The fitted lifetime components and the images corresponding to the individual components are shown to the right. The shortest lifetime can be explained by fluorescence quenching in aggregates and endo/lysosomes. The intermediate lifetime corresponds to the fluorescence lifetime of free Cy5, whereas the longest lifetime corresponds to the lifetime of Cy5-AONs


(c) Lyse the cells in 250 μL HKR buffer containing 0.1% (v/v) Triton X-100 for 1 h, with repeated flicking to ensure complete homogenization. Then transfer 200 μL of these lysates to an 8-well ibiTreat μ-Slide.
3. Perform the FCS measurements as described for the pure dye solutions in section.
4. Fit the autocorrelation curves using an extended 3D diffusion model with two diffusion components and two triplet times. Both diffusion times and both triplet times should be fixed to those derived from the measurements on pure Cy5 and Cy5-AON solutions.
5. Fit the fluorescence lifetime decays as described for the polyplex solutions.

3.4 Live Cell FLIM and FCS

1. To perform live cell FLIM and FCS, incubate cells with polyplexes as described above. Use a live cell nuclear counterstain such as Hoechst 33342 to visualize the nuclei.
2. To perform FLIM imaging (Fig. 4c, see Note 33), adjust the laser intensity such that the number of photons detected per laser pulse is around 1 for an SMD HyD detector, or around 0.5 for regular imaging HyD detectors (see Note 34). Then record FLIM images at a low scanning speed (we generally use 100 Hz and 864 × 864 pixels at 2× zoom, pixel size 106.8 nm², for high spatial resolution) and acquire a number of frames that leads to a high enough photon count for analysis (we generally accumulate 20 frames for live cell FLIM; see Note 35). After this, fit the lifetime decay curves using an n-exponential reconvolution model with the number of components that best describes the data. In our hands, a three-component fit is usually sufficient. As the environment of the fluorophore in the various subcellular compartments differs from the aqueous solution that was used for calibration, the lifetime components should be fitted in these images instead of using the values obtained in the calibration.
3. To perform live cell FCS (Fig. 5), adjust the laser intensity to reach approximately 1/3 of the maximum counts per molecule (see Note 36). Then select points of interest, for example in the nucleus, in the cytosol, and in an extracellular region (see Note 37). Measure for 5–10 s per point of interest, repeating several times (we generally use five cycles). Take images before the FCS measurements and after each cycle. Perform bleaching correction and spark filtering to eliminate fluctuations from, for example, aggregates. Fit the autocorrelation functions as described previously for the cell lysates (see Note 38). By linking the concentrations determined in the FCS measurements with the fluorescence intensity in the same regions prior to FCS, one can infer the approximate linear relationship between those values, and thus calculate the subcellular concentrations for a large number of cells in the same field of view, without having to perform FCS in every single cell.

Fig. 5 Live cell FCS. (a) Images of immortalized human myoblasts treated with polyplexes for 24 h, then washed once with PBS containing Hoechst 33342, followed by live cell CLSM and FCS on a Leica TCS SP8 confocal laser scanning microscope with FCS functionality. Images were acquired after each cycle of FCS measurements. The point of interest in the nucleus is indicated by the cross hair. Note that bleaching occurred throughout the nucleus, and that focal drift caused additional intensity fluctuations between time points. The homogeneous bleaching indicates that Cy5-AONs can freely diffuse throughout the nucleus. (b) Autocorrelation curve (red) from the first FCS measurement in the nucleus of the cell shown in (a). The fitted data and residuals are shown in gray. The diffusion amplitude (horizontal black dashed line) can be used to calculate the number of molecules, which is the inverse of the y-intercept. With a known effective volume, this can then be used to calculate the concentration of the fluorescently labeled molecules. The diffusion time (τD), which is influenced by the microenvironment and molecular interactions, can also be derived from the autocorrelation curve, as indicated by the vertical black dashed line
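The conversion from FCS amplitude to concentration, and the intensity-based extrapolation described in step 3, can be sketched as follows. This is a simplified illustration: it takes the number of molecules as the inverse of the diffusion amplitude, as the caption of Fig. 5 describes, and the amplitudes, the effective volume of 0.5 fl, the image intensities, and the function names are all hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def concentration_nM(diffusion_amplitude, effective_volume_fl):
    """Concentration (nM) from the FCS diffusion amplitude.

    N = 1 / amplitude molecules in the effective volume (1 fl = 1e-15 l).
    """
    n_molecules = 1.0 / diffusion_amplitude
    volume_l = effective_volume_fl * 1e-15
    return n_molecules / (AVOGADRO * volume_l) * 1e9


def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical FCS points: image intensity before FCS and fitted amplitude.
intensities = [20.0, 40.0, 80.0]
amplitudes = [0.1, 0.05, 0.025]
concentrations = [concentration_nM(a, effective_volume_fl=0.5) for a in amplitudes]

# Approximate linear relationship between intensity and concentration.
slope, intercept = linear_fit(intensities, concentrations)

# Estimate the concentration in a region where no FCS was performed.
estimate_nM = slope * 60.0 + intercept
print(f"{concentrations[1]:.1f} nM at intensity 40; estimate at 60: {estimate_nM:.1f} nM")
```

Such an extrapolation is only as good as the linearity of the intensity–concentration relationship in the regions measured, so the FCS points should span the intensity range of interest.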

4 Notes

1. For polyplex formation, phosphate buffers have a strong adverse effect in our hands. We therefore dissolve both peptides and AONs in milliQ water without further adjustment of pH.
2. The choice of fluorophore will depend on the cost and commercial availability, as well as other factors (pH sensitivity, size, effect on subcellular localization, brightness, phototoxicity of excitation light, etc.) [14]. Advantages of Cy5 are that it is relatively cheap and easy to couple to AONs, and that the far-red excitation light has a lower chance of damaging the cells. A disadvantage is that the phosphodiester bond between the AON and the dye is not very stable (although in terms of AON fate, this property provides information about potential shielding of AONs from degradation). In order to obtain information about the molecular environment, the fluorophore should have different lifetimes in different environments, and it is helpful if information on such environment-dependent lifetime changes is available in the literature. The fact that Cy5 shows photo-inducible cis–trans isomerization and also FRET between these isomers [15] can be seen as both an advantage and a disadvantage: the mix of molecular species that is thus present leads to a complicated spectral profile with multiple lifetime components as well as multiple dark states, but it also allows one to distinguish between free fluorophore and AON-coupled fluorophore based on the fluorescence lifetime and triplet/dark state fractions.
3. For use in cell culture, we recommend using Cy5-AONs that are purified by HPLC, followed by a Na+ salt exchange.
4. To reduce binding of the peptide to the tube walls, we preferably use Protein LoBind tubes or 0.2 mL PCR tubes for peptide solutions and to form polyplexes.
5. To keep the pH constant, it is advisable to use HEPES-buffered culture medium. Furthermore, if repeated imaging is done, phototoxicity may be reduced by using medium containing little or no phenol red.
6. It is important to use temperature control at the microscope, especially for long-term imaging, either through use of a temperature chamber or a heated stage.


7. Preferably form polyplexes on the day of the experiment; if necessary, they can be formed one day in advance and stored in the fridge until use.
8. Lyophilized peptides may contain counter ions and residual water, which may account for as much as 70% of the weight [19]. Therefore, it is important to measure the peptide concentration. PF14 contains a tyrosine residue, so the peptide concentration can be derived from the extinction in spectrophotometric measurements using a molar extinction coefficient of 1200 M⁻¹ cm⁻¹.
9. Polyplexes cannot be formed at very high concentrations because of aggregation. In our experience, it is possible to form small nanoparticles up to at least 90 μM peptide and 10 μM AON. Polyplex formation at higher concentrations will likely have to be optimized first, for example by including a stabilizing agent such as gelatin and/or sonicating the polyplex solution.
10. To obtain a small and monodisperse population of polyplexes, it is important to pipette the peptide and AONs together at the same time and at fairly high speed, so that rapid mixing occurs. Larger volumes (hundreds of microliters) are easier to mix quickly and uniformly. Polyplex formation can also be done using microfluidic devices [20], provided the mixing is fast enough.
11. After stabilization, polyplex size can be determined using dynamic light scattering (DLS) [21].
12. Dilution of polyplexes in serum-containing medium can lead to aggregation. To minimize this effect, and to avoid high local concentrations of polyplexes in the culture vessel, we pre-dilute polyplexes in one volume of serum-free medium and add this to the well, which already contains an equal volume of medium with twice the final concentration of serum. Generally, we perform polyplex incubations in the presence of 10% serum.
13. For time-lapse imaging over extended periods, it is advisable to use a dry objective rather than a water or oil immersion objective, to avoid smearing and evaporation of the immersion fluid during the experiment, which would result in loss of focus.
14. It is not advisable to use a blue-fluorescent dye such as Hoechst for these extended periods of time, due to phototoxicity of the UV excitation.
15. When imaging the first hours of uptake of fluorescently labeled AONs, most of the fluorescence will at first derive from the culture medium that contains the AONs. Therefore, it can be challenging to find the correct focal plane; at this point, cells
can be identified as dark objects close to the coverslip. The use of a cellular stain such as calcein-AM can also aid in locating the cells in a spectrally different channel. Furthermore, such a stain is useful for quantification of the intracellular fluorescence signal by image processing through generation of cell masks for image segmentation.
16. After 1–2 h, intracellular fluorescence will exceed the signal in the medium. Therefore, it is important to set the laser intensity and the gain of the detector at a level where the extracellular fluorescence signal is well below saturation, but not so low that the early stages of uptake cannot be seen. As a starting point, a signal of 5–10% of the maximum grey value can be used (between 12 and 25 for 8-bit acquisition).
17. Allowing the temperature of the entire system to stabilize prior to starting the experiment will aid in maintaining the correct focal plane. If present, adaptive focus control of the microscope can also be used to this end. To be sure that the cells do not drift out of focus during the experiment, we often acquire a few extra z-slices below and above the cells.
18. The frequency of imaging that is necessary will depend on the research question. However, it is important to bear in mind that imaging at a high frequency will increase the chance of photobleaching, which is something we have noticed for example when imaging every ten seconds. If a high temporal resolution is required, it is advisable to use very low laser power in combination with a more sensitive detector, such as a HyD detector.
19. For live cell imaging of nuclei together with Cy5-labeled AONs, we use the cell-permeable nuclear stain Hoechst 33342. It is important to check whether your microscope has a suitable excitation source and emission filters to image this dye, for example a 405 nm diode laser.
20. We generally assess nuclear AON localization after 24-h incubation with polyplexes, but longer incubations or repeated imaging at multiple later time points after washing are also possible.
21. As live cells are imaged, the length of microscopy measurements is limited by the movement of the cells as well as phototoxicity of the imaging procedure. There is a trade-off between the number of detected photons and phototoxicity in terms of laser intensity and duration and frequency of scanning. Furthermore, long pixel dwell times or frequent imaging (for example once a minute for several hours) at too high an intensity can lead to bleaching of fluorescence.
22. If necessary, subtract background signal using the rolling ball algorithm and/or apply a median filter prior to thresholding.
The Huang algorithm (one of the standard functions implemented in FIJI) usually performs well for thresholding nuclei.
23. For our cells, a nuclear size threshold of 50 to 500 μm² (or 50 μm² to infinity) generally works well.
24. To calculate the background-subtracted mean, it is essential to include cells that have not been treated with Cy5-labeled AONs. A high mean and low standard deviation indicate homogeneous fluorescence throughout the nucleus, whereas a high mean with a high standard deviation is indicative of high fluorescence signals from cytoplasmic structures that overlay the nucleus, for example vesicles or mitochondria, or from extracellular aggregates. It might be necessary to optimize the parameters according to the signal-to-noise ratio in the experiment, but we generally count as positive those nuclei that have a background-subtracted mean that is larger than the standard deviation.
25. The optimal fluorophore concentration for FCS measurements is between 0.5 and 10 nM, but measurements at concentrations up to 100 nM are possible. We generally use concentrations of around 20 nM in solutions.
26. To get the smallest possible point spread function, it is essential to use a high NA objective and to match the immersion fluid with the objective. We used a 63× 1.2 NA water immersion objective.
27. In our case, the objective has a correction ring for coverslip thickness. The correction ring is adjusted by imaging reflection in xzy mode, using AOBS reflection, no Notch filter and overlapping excitation and emission wavelengths (for example 488 nm). For the Leica microscopes, adjustment of these settings is incorporated in the LAS X software workflow. The optimal correction ring setting is found by locating the reflection from the coverslip to the sample (the lower band on the screen, as the image is inverted) and turning the correction ring until the reflection band is as narrow as possible.
28. To identify the optimal laser intensity for FCS, increase the laser intensity stepwise. Doubling the intensity should lead to approximately twice the count rate and counts per molecule, but this will reach a plateau due to saturation of the fluorophores, and also bleaching if slowly mobile molecules are present. Increase to the maximum counts per molecule, then adjust the intensity such that 1/3–1/2 of the maximum counts per molecule is reached. For intracellular measurements, at the beginning of a measurement at a given position, the count rate may drop rapidly due to bleaching of immobile molecules.
29. Do not change the laser intensity between measurements or samples as the counts-per-molecule also provide a source of
information on environment-dependent changes in quantum yield. Also maintain the measurement point at approximately the same distance from the coverslip between measurements. We generally measure at 10 μm from the coverslip.
30. As a rule of thumb, the minimal measurement time should be approximately 100× the diffusional autocorrelation time.
31. For Cy5, we use an extended 3D diffusion model with two triplet states to account for the additional dark state that is caused by the light-driven cis–trans isomerization, in which the cis-isomer has red-shifted absorption and emission spectra [15, 22].
32. In our hands, the calibration using the various ratios of Cy5 to Cy5-AON showed that the relative amplitudes for either component assigned when fitting the autocorrelation curves are skewed toward the longer diffusional component, which means that too high a fraction of intact Cy5-AON was derived. Therefore, it was not possible to detect small fractions of free dye using the diffusion time. However, the relative triplet fractions show a linear correlation to the fraction of free versus AON-bound dye and can therefore be used to calculate the fraction of free versus AON-bound Cy5. This can be explained by the reduced conformational freedom when Cy5 is bound to the AON, leading to a lower fraction of cis-isomers in this solution [15], which we described as a triplet component. The fluorescence lifetime can also be used to determine the fraction of free Cy5 and is based on the same physical phenomenon.
33. It is possible to make z-slices while recording FLIM images. However, due to the duration of acquisition of such images, live cells are likely to move during the measurements, which will result in mismatch of the individual slices.
34. If fluorescence bleaching is an issue, especially in dim samples, it might be necessary to set a laser intensity which leads to collection of less than 1 photon/pulse. Pixel binning can be applied to increase the photon counts, but this will reduce the spatial resolution of the FLIM image. For maximum spatial resolution, we generally apply a zoom factor and the number of pixels that yields a pixel size corresponding to roughly half the maximum xy-resolution with the used objective.
35. As a rule of thumb, one should aim for peak heights of the overall decay curve of at least 100 photons for a one-component fit; 1000 photons for a two-component fit; and 10,000 photons for a three-component fit. In live cell FLIM, movement of the cells and phototoxicity may require per-pixel count rates lower than these optimum values.
36. In our experience, bleaching occurs extremely rapidly during live cell FCS and cannot be completely avoided due to the presence of immobile molecules. We have generally performed ten-second measurements at 10% laser intensity of the 80 MHz pulsed white light laser with about 1 mW output per line.
37. Due to enrichment of Cy5-labeled AONs in cells after uptake, the concentration within the cell or within the nucleus can be much higher than 100 nM, and is therefore not ideal for FCS measurements. It is therefore advisable not to choose the brightest cells.
38. For the FALCON system, fluorescence lifetime decay curves can also be derived from the photon-traces recorded for FCS measurements, but the number of photons detected during FCS is generally lower than preferred for FLIM. Additionally, these measurements record signals only in the identified confocal volumes, and therefore do not have the spatial resolution offered by FLIM. The lifetime information can be used to unmix different signals, such as autofluorescence and specific fluorescence, through fluorescence lifetime correlation spectroscopy (FLCS) [23].
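
Two of the notes above reduce to simple arithmetic: Note 8 (peptide concentration from a spectrophotometric reading) and Note 16 (a starting signal of 5–10% of the maximum grey value). The following is a minimal sketch of both calculations, not part of the original protocol. The function names, the 1 cm default path length, and the choice to round grey values down are illustrative assumptions; the extinction coefficient of 1200 M⁻¹ cm⁻¹ and the 5–10% window are taken from the notes.

```python
def peptide_concentration_uM(absorbance, epsilon_M_cm=1200.0, path_cm=1.0):
    """Beer-Lambert law, c = A / (epsilon * l), returned in micromolar.

    epsilon_M_cm defaults to the value quoted in Note 8; path_cm = 1.0 is an
    assumed cuvette path length and must match the instrument actually used.
    """
    if epsilon_M_cm <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    return absorbance / (epsilon_M_cm * path_cm) * 1e6


def background_gray_window(bit_depth, low_pct=5, high_pct=10):
    """Grey-value window covering low_pct..high_pct of the maximum grey value.

    Values are rounded down (an assumption), which reproduces the 12-25 range
    quoted in Note 16 for 8-bit acquisition.
    """
    max_gray = 2 ** bit_depth - 1
    return (max_gray * low_pct) // 100, (max_gray * high_pct) // 100


# Hypothetical reading: an absorbance of 0.12 in a 1 cm cuvette.
print(peptide_concentration_uM(0.12))   # concentration in micromolar
print(background_gray_window(8))        # window for 8-bit images
print(background_gray_window(12))       # window for 12-bit images
```

The same window function can be reused when acquisition is switched to 12- or 16-bit, so that the extracellular signal stays at a comparable fraction of the detector range.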

Acknowledgments

We thank Leica Microsystems for the use of the Leica TCS SP8 FALCON system, and are especially grateful to Johanna Berndt and Frank Hecht for support with the application and data analysis. This work was supported by the Prinses Beatrix Spierfonds (grant number W.OR14-19).

References

1. Uhlmann E, Peyman A (1990) Antisense oligonucleotides: a new therapeutic principle. Chem Rev 90:543–584
2. Aartsma-Rus A (2017) FDA approval of nusinersen for spinal muscular atrophy makes 2016 the year of splice modulating oligonucleotides. Nucleic Acid Ther 27:67–69
3. Sardone V, Zhou H, Muntoni F, Ferlini A, Falzarano M (2017) Antisense oligonucleotide-based therapy for neuromuscular disease. Molecules 22:563
4. Godfrey C, Desviat LR, Smedsrød B, Piétri-Rouxel F, Denti MA, Disterer P et al (2017) Delivery is key: lessons learnt from developing splice-switching antisense therapies. EMBO Mol Med 9:545–557
5. Pack DW, Hoffman AS, Pun S, Stayton PS (2005) Design and development of polymers for gene delivery. Nat Rev Drug Discov 4:581–593
6. Boisguérin P, Deshayes S, Gait MJ, O’Donovan L, Godfrey C, Betts C et al (2015) Delivery of therapeutic oligonucleotides with cell penetrating peptides. Adv Drug Deliv Rev 87:52–67
7. Kurrikoff K, Gestin M, Langel Ü (2016) Recent in vivo advances in cell-penetrating peptide-assisted drug delivery. Expert Opin Drug Deliv 13:373–387
8. McClorey G, Banerjee S (2018) Cell-penetrating peptides to enhance delivery of oligonucleotide-based therapeutics. Biomedicine 6:51
9. Ezzat K, EL Andaloussi S, Zaghloul EM, Lehto T, Lindberg S, Moreno PMD et al (2011) PepFect 14, a novel cell-penetrating peptide for oligonucleotide delivery in solution and as solid formulation. Nucleic Acids Res 39:5284–5298
10. Bayguinov PO, Oakley DM, Shih CC, Geanon DJ, Joens MS, Fitzpatrick JAJ (2018) Modern laser scanning confocal microscopy. Curr Protoc Cytom 85:1–17
11. Oida T, Sako Y, Kusumi A (1993) Fluorescence lifetime imaging microscopy (flimscopy). Methodology development and application to studies of endosome fusion in single cells. Biophys J 64:676–685
12. Tinnefeld P, Buschmann V, Herten D, Han K, Sauer M (2000) Confocal fluorescence lifetime imaging microscopy (FLIM) at the single molecule level. Single Mol 1:215–223
13. Chang CW, Sud D, Mycek MA (2007) Fluorescence lifetime imaging microscopy. Methods Cell Biol 81:495–524
14. Schwille P, Haustein E (2001) Fluorescence correlation spectroscopy. An introduction to its concepts and applications. https://pages.jh.edu/~iic/resources/ewExternalFiles/FCSSchwille.pdf
15. Widengren J, Schwille P (2000) Characterization of photoinduced isomerization and back-isomerization of the cyanine dye cy5 by fluorescence correlation spectroscopy. J Phys Chem A 104:6416–6428
16. Buschmann V, Weston KD, Sauer M (2003) Spectroscopic study and evaluation of red-absorbing fluorescent dyes. Bioconjug Chem 14:195–204
17. Brock R, Vamosi G, Vereb G, Jovin TM (1999) Rapid characterization of green fluorescent protein fusion proteins on the molecular and cellular level by fluorescence correlation microscopy. Proc Natl Acad Sci 96:10123–10128
18. Schindelin J, Arganda-Carreras I, Frise E, Kaynig V, Longair M, Pietzsch T et al (2012) Fiji: an open-source platform for biological-image analysis. Nat Methods 9:676–682
19. Turner A, Radburn-Smith K, Mushtaq A, Tan L (2011) Storage and handling guidelines for custom peptides. Curr Protoc Protein Sci 18:12
20. Damiati S, Kompella UB, Damiati SA, Kodzius R (2018) Microfluidic devices for drug delivery systems and drug screening. Genes (Basel) 9:E103
21. Stetefeld J, McKenna SA, Patel TR (2016) Dynamic light scattering: a practical guide and applications in biomedical sciences. Biophys Rev 8:409–427
22. Huang Z, Ji D, Wang S, Xia A, Koberling F, Patting M et al (2006) Spectral identification of specific photophysics of Cy5 by means of ensemble and single molecule measurements. J Phys Chem A 110:45–50
23. Gregor I, Enderlein J (2007) Time-resolved methods in biophysics. 3. Fluorescence lifetime correlation spectroscopy. Photochem Photobiol Sci 6:13–18

Chapter 11

Methodology for Subcellular Fractionation and MicroRNA Examination of Mitochondria, Mitochondria Associated ER Membrane (MAM), ER, and Cytosol from Human Brain

Paresh Prajapati, Wang-Xia Wang, Peter T. Nelson, and Joe E. Springer

Abstract

Eukaryotic cell organelles exert unique functions individually but also interact with each other for essential cellular functions. This physical interface between the organelles serves as an important platform for biomolecule trafficking and signaling. Mitochondria are membrane-bound organelles and form a dynamic contact with other organelles. The interactions and communication between mitochondria and endoplasmic reticulum (ER) are facilitated by an ER specific domain, named mitochondria associated ER membrane (MAM). Due to its unique location, the MAM is a “hotspot” for important cell signaling and biochemical processes including calcium homeostasis, lipid synthesis/exchange, inflammasome and autophagosome formation, and mitochondria fission/fusion. Although techniques are available for isolation of organelle fractions including MAM, most utilize animal tissues and cell lines. Here we describe a protocol that is tailored to the isolation of highly purified MAM, mitochondria, ER, and cytosol from human brain. In addition, we include a protocol for the isolation of total RNA and subsequent analysis of microRNAs from these highly purified organelle fractions. Finally, we include a panel of protein markers that are useful for validating the enrichment and purity of each subcellular fraction.

Key words Mitochondria associated ER membrane (MAM), Subcellular fractionation, Neurodegenerative diseases, Human brain, MicroRNA, RT-qPCR

Paresh Prajapati and Wang-Xia Wang contributed equally to this work. Kira Astakhova and Syeda Atia Bukhari (eds.), Nucleic Acid Detection and Structural Investigations: Methods and Protocols, Methods in Molecular Biology, vol. 2063, https://doi.org/10.1007/978-1-0716-0138-9_11, © Springer Science+Business Media, LLC, part of Springer Nature 2020

1 Introduction

Subcellular fractionation is a process of separation and purification of cellular compartments using techniques based on size, density, shape, and surface charge [1]. The isolation of the subcellular constituents provides a valuable tool for studying localization, processing, trafficking, and signaling of biomolecules and is especially useful for studying functional contacts between organelles. One such specialized inter-organelle contact point is the mitochondria-associated endoplasmic reticulum (ER) membrane (MAM), which is a specific domain that functions as a physical conduit between the ER and mitochondria. MAM facilitates a number of crucial cellular activities including Ca2+ signaling, phospholipid exchange, inflammasome and autophagosome formation, and mitochondrial morphology and redox status [2–4]. Given this unique structural microdomain and functional properties, interest in MAM biology has broadened over the past decade to include examination of various tissue and cell types encompassing several pathological conditions. Procedures for the isolation of MAM from animal tissues and cells have been established and documented in several excellent reports [5, 6]. Here, we provide a protocol that is specifically tailored to isolate highly purified MAM, mitochondria, ER, and cytosol from human and rat brain tissues. Moreover, we describe a protocol for RNA isolation and microRNA (miRNA) analysis from these purified subcellular fractions. We reported previously that several inflammatory-related miRNAs are enriched in rat hippocampal mitochondria [7, 8] and now report that the MAM is also a subcellular localization site for miRNA, implicating a role for MAM in miRNA trafficking. Moreover, we found that several miRNAs, including miR-146a and miR-107, were detected in the MAM fraction, and both miRNAs have been implicated in neurodegenerative diseases [9, 10], suggesting a potential cross talk between miRNA and MAM. While MAM dysfunction has been implicated in multiple forms of neurodegenerative disease including Alzheimer’s disease, Parkinson’s disease, and amyotrophic lateral sclerosis [11–13], the role of MAM in miRNA biology associated with neurodegenerative disease states remains elusive. To study this more closely, we adapted and developed a subcellular fractionation protocol that has been employed previously in our lab for mitochondria isolation.
This protocol details the fractionation of mitochondria, MAM, ER, and cytosol and the isolation and analysis of miRNAs from these fractions. Based on our own practice, we provide important remarks at several critical steps. We also include a panel of protein markers for examining the enrichment and purity of these subcellular fractions. It is our opinion that this protocol would be useful for examining miRNA distribution patterns in human and experimental models of neurodegeneration, and the protocol can be easily modified to accommodate the fractionation of other tissues.

2 Materials

2.1 Brain Tissues

1. Human brain tissue: human brain specimens should be obtained at a short postmortem interval (PMI) of less than 6 h (see Note 1). Human brain specimens were obtained from the University of Kentucky Alzheimer’s Disease Center (UK-ADC). The procedures and protocols related to procurement of the human brain specimens were approved by the University of Kentucky Institutional Review Board.
2. Rat brain tissue: use rat brain tissue as fresh as possible.

2.2 Subcellular Isolation

2.2.1 Equipment and Supplies

1. Optima XE-90 Ultracentrifuge (Beckman Coulter).
2. Tabletop refrigerated centrifuge with speed up to 20,000 × g.
3. SW 41 Ti rotor (swinging bucket, 41,000 rpm, 288,000 × g).
4. SW 55 Ti rotor (swinging bucket, 55,000 rpm, 285,000 × g).
5. Teflon glass homogenizer.
6. Dounce glass tissue grinder with large clearance “A” pestle.
7. Thinwall, Ultra-Clear™ tubes, 5 ml, 13 × 51 mm, for SW55 Ti rotor.
8. Thinwall, Ultra-Clear™ tubes, 13.2 ml, 14 × 89 mm, for SW41 Ti rotor.
9. 1.7 ml and 5.0 ml microcentrifuge tubes.

2.2.2 Chemicals and Buffer Preparation

All solutions should be prepared using deionized, nuclease-free water and buffers made fresh before beginning the subcellular fractionation procedure. The pH of the buffers (pH: 7.4) should be adjusted at 4 °C. Keep the buffer on ice (see Note 2).

1. Homogenization Buffer (HB): 30 mM Tris–HCl pH 7.4, 225 mM mannitol, 75 mM sucrose, 0.5 mM EGTA, protease inhibitor and 0.5% BSA (see Note 3).
2. Isolation Buffer A (IB-A): 30 mM Tris–HCl pH 7.4, 225 mM mannitol, 75 mM sucrose, and 0.25% BSA (see Note 3).
3. Isolation Buffer B (IB-B): 30 mM Tris–HCl pH 7.4, 225 mM mannitol, and 75 mM sucrose (see Note 3).
4. Mitochondrial Resuspending Buffer (MRB): 5 mM HEPES, pH 7.4, 250 mM mannitol, and 0.5 mM EGTA (v/v) (see Note 3).
5. Percoll medium (PM): 25 mM HEPES, pH 7.4, 225 mM mannitol, 1 mM EGTA, 30% Percoll (v/v) (see Note 4).
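
Because the buffers are made fresh each time, it can help to script the weighing list. The following is a minimal sketch, not part of the original protocol, that converts the millimolar concentrations given for HB into grams for a chosen volume. The molar masses are standard values for Tris base, D-mannitol, sucrose, EGTA (free acid) and HEPES (free acid), and they are assumptions about which reagent form is on the shelf; the sketch also assumes that Tris–HCl is prepared from Tris base titrated with HCl and that "% BSA" means weight per volume. Check the reagent labels before relying on the numbers.

```python
# Molar masses in g/mol. ASSUMPTION: free-base / free-acid, anhydrous forms.
MOLAR_MASS = {
    "Tris": 121.14,
    "mannitol": 182.17,
    "sucrose": 342.30,
    "EGTA": 380.35,
    "HEPES": 238.30,
}

# Homogenization Buffer (HB) as listed above, concentrations in mM.
HB_RECIPE_MM = {"Tris": 30.0, "mannitol": 225.0, "sucrose": 75.0, "EGTA": 0.5}


def grams_needed(recipe_mM, volume_ml):
    """Return {component: grams} for the given final volume in ml."""
    litres = volume_ml / 1000.0
    return {
        name: (mM / 1000.0) * litres * MOLAR_MASS[name]
        for name, mM in recipe_mM.items()
    }


def bsa_grams(percent_w_v, volume_ml):
    """Grams of BSA for a weight/volume percentage (assumed w/v)."""
    return percent_w_v / 100.0 * volume_ml


# Example: weighing list for 100 ml of HB with 0.5% BSA.
for component, grams in grams_needed(HB_RECIPE_MM, 100).items():
    print(f"{component}: {grams:.4f} g")
print(f"BSA: {bsa_grams(0.5, 100):.2f} g")
```

The other buffers can be handled by passing a different recipe dictionary; Percoll is specified as a volume fraction and is therefore not covered by this mass calculation.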

2.3 Protein Isolation and Western Blot Analysis

2.3.1 Equipment

1. Tabletop refrigerated centrifuge.
2. Spectrophotometer for protein concentration determination.
3. SDS-PAGE and Western blotting assembly.

2.3.2 Reagents

1. RIPA buffer.
2. Protease inhibitors.
3. Western blotting reagents.
4. Primary and secondary antibodies (see Table 1).

Table 1 Enrichment of protein markers in each fraction

Protein marker | Pure cytosol | Pure mitochondria | MAM | ER | Antibody source | Dilution | Detection
PDZD8 | | | +++++ | ++ | Gift (a) | 1:500 | MAM (b)
FACL4 | | | ++++ | +++++ | Abgent (ap14406A) | 1:1000 | MAM/ER (c)
CALNEXIN | | | +++++ | +++++ | Abcam (ab22595) | 1:5000 | MAM/ER (d)
IP3R | | | + | +++++ | Santa Cruz (sc-377518) | 1:500 | MAM/ER (e)
GRP75 | | +++++ | ++++ | ++ | Santa Cruz (sc-133137) | 1:500 | Mito/MAM (f)
NDUFA9 | | +++++ | + | | ThermoFisher Scientific (459100) | 1:5000 | Mito (g)
VDAC1 | | +++++ | ++ | | Santa Cruz (sc-390996) | 1:500 | Mito/MAM (h)
TUBULIN | +++++ | | ++ | +++ | Abcam (ab6046) | 1:5000 | Cyto/MAM/ER (i)
Pro-CASPASE3 | +++++ | | ++ | ++ | CST (9662) | 1:1000 | Cyto/MAM/ER (j)
HSP90 | +++++ | | +++ | ++++ | Santa Cruz (sc-7947) | 1:500 | Cyto/MAM/ER (k)

Very high: +++++, high: ++++, medium: +++, low: ++, very low: +, not detected: –
(a) The PDZD8 antibody was a generous gift from Dr. Joseph Sodroski [14]
(b) PDZD8 (PDZ domain-containing protein 8) is a key protein tethering ER and mitochondria and regulates Ca2+ dynamics in neurons. PDZD8 is predominantly enriched in MAM fractions and can be detected in ER
(c) FACL4 (Fatty acid-CoA ligase 4 family protein) is a lipid metabolism enzyme, enriched in ER and MAM
(d) Calnexin is an ER molecular chaperone. It can be detected in both ER and MAM fractions
(e) IP3R (trisphosphate receptor) is an ER Ca2+ channel protein. IP3R is detected mostly in ER fractions
(f) GRP75 (glucose-regulated protein 75) is a member of the mitochondrial HSP70 family and plays a key role in regulating Ca2+ transfer between ER and mitochondria. GRP75 can be detected in both mitochondria and MAM fractions
(g) NDUFA9 (NADH dehydrogenase [ubiquinone] 1 alpha subcomplex subunit 9) is a subunit of mitochondrial complex I located in the mitochondrial inner membrane. NDUFA9 is detected predominantly in mitochondria fractions
(h) VDAC1 (voltage-dependent anion channel 1) plays an important role in exchange of ions and metabolites and is thought to be in direct contact with MAM. VDAC is located in the mitochondrial outer membrane and is often detected in both mitochondria and MAM
(i) Tubulin is the major component of the eukaryotic cytoskeleton and is widely used as a cytosolic protein marker. However, given its role as a structural protein, low levels can be found in MAM and ER fractions
(j) Pro-caspase 3 can be detected in both cytosol and MAM
(k) HSP90 is a molecular chaperone expressed most abundantly in the cytoplasm and can also be detected in MAM and ER

2.4 RNA Isolation and miRNA Analysis

2.4.1 Equipment

1. Tabletop refrigerated centrifuge.
2. NanoDrop or equivalent equipment for quantifying total RNA levels following extraction.
3. QuantStudio 7 Flex Real-Time PCR System or equivalent equipment.
4. Biometra TAdvanced Thermocycler or equivalent with a fast ramping rate of 5 °C/s.

2.4.2 Reagents

1. TRIzol LS.
2. 20 μg/μl glycogen.
3. RNasin® Plus RNase Inhibitor or similar products.
4. TaqMan™ MicroRNA Reverse Transcription Kit (see Note 5).
5. Megaplex™ RT Primers.
6. TaqMan Custom microRNA Low Density Array Card.
7. TaqMan™ MicroRNA Assay with RT/PCR primers.
8. 2× TaqMan® Universal Master Mix II, No AmpErase® UNG.
9. 2× TaqMan PreAmp Master Mix.
10. 25 mM MgCl2.
11. Custom RT primer pool.

3 Methods

3.1 Tissue Preparation

1. Human brain tissue: immerse tissue immediately in ice-cold IB-B with protease inhibitors and keep the tube containing the specimen on ice. Remove meninges and visible blood vessels, and separate gray matter from most of the white matter using a razor blade (see Note 6).
2. Rat brain tissue: separate cortex from hippocampus and cerebellum and remove meninges and visible blood vessels (see Note 7).

3.2 Homogenization and Initial Fractionations

1. Transfer 0.5 g of human brain tissue or one hemisphere of rat cortex to a 10-ml Teflon glass homogenizer.
2. Homogenize tissue in 3 ml HB using a Teflon pestle for 8–10 strokes (see Note 8).
3. Transfer homogenate to a 5 ml Eppendorf centrifuge tube.
4. Centrifuge homogenate at 630 × g for 5 min at 4 °C.
5. Collect supernatant in a new 5 ml tube. Resuspend pellet in 3 ml HB and repeat steps 2–4.
6. Collect and combine supernatant from previous centrifugation (the volume of the two combined supernatants should be around 5 ml). Discard the pellet containing unbroken cell debris and nuclei.
7. Centrifuge the combined supernatant again at 630 × g for 5 min at 4 °C and transfer supernatant to a new tube.
8. Repeat centrifugation step as in step 7 and collect supernatant to a new tube. Save 350 μl supernatant into a 1.5 ml Eppendorf tube as nuclei-free lysate (Lysate).
9. Centrifuge the remaining nuclei-free lysate at 6300 × g for 10 min at 4 °C.
10. Save the supernatant as crude cytosol in a new 5 ml Eppendorf tube and keep on ice for further fractionation in Subheading 3.2.4. (This crude cytosol contains ER and various multivesicular particles such as lysosomes, microsomes and other cytosolic granules.) The pellet contains crude mitochondria and MAM; proceed to Subheading 3.2.1 (Mitochondria and MAM Separation) for further fractionation.

3.2.1 Mitochondria and MAM Separation

1. Add 2 ml ice-cold IB-A buffer to the crude mitochondria–MAM pellet (from step 10 of Subheading 3.2 above) and gently transfer the pellet to a 7-ml glass Dounce tissue grinder using a 1.0 ml pipette tip that has been cut to obtain a larger opening. Apply 3–5 strokes using a loose, large clearance “A” pestle to resuspend the mitochondria–MAM pellet (see Note 9).
2. Transfer the mitochondria–MAM resuspension to a fresh 5 ml Eppendorf tube and top the tube with 3 ml IB-A buffer.
3. Centrifuge the tube at 6300 × g for 8 min at 4 °C. Discard the supernatant after centrifugation.
4. Repeat steps 1–3 using IB-B and gently transfer the pellet to the homogenizer.
5. Repeat step 1 using 2 ml of MRB and keep resuspension on ice. (Important step: make sure the mitochondria–MAM pellet is well resuspended; if visible particles are seen, apply additional strokes.)
6. Load 8 ml of PM into a 13.2-ml ultracentrifuge tube (see Note 10). Carefully layer the crude mitochondria–MAM suspended in MRB from step 5 over the PM solution without disturbing the PM layer (see Note 11).
7. Gently top the mitochondria–MAM layer with an additional 1.5 ml MRB (Fig. 1). (Be sure to balance the tubes with MRB.)
8. Centrifuge at 95,000 × g for 45 min at 4 °C using an SW41 rotor (or similar rotor that is compatible with the tubes) in an ultracentrifuge. At the end of the centrifugation, two clear bands will be seen. The upper band is a wider, white-to-light-gray band that contains a mixture of MAM and fragmented mitochondria, and may also contain some larger multivesicular bodies (MVBs). The lower band, close to the bottom of the Percoll gradient, contains intact mitochondria (see Fig. 1).

Fig. 1 Schematic flowchart of subcellular fractionation procedure

3.2.2 MAM Isolation

1. Use a 1-ml pipette tip to collect the upper layer from step 8 in Subheading 3.2.1 (see Fig. 1) and dilute the collected solution 10 times with MRB in a 13.2-ml ultracentrifuge tube (see Note 12).
2. Centrifuge the tube at 7200 × g for 10 min at 4 °C.
3. Transfer the supernatant (containing MAM fraction) into a new 13.2-ml ultracentrifuge tube and centrifuge at 100,000 × g for 60 min at 4 °C. After centrifugation, the purified MAM fraction can be seen as a floating sheet at the bottom of the tube.
4. Use a 1-ml pipette tip to carefully collect the MAM sheets and transfer to a 1.5-ml Eppendorf tube and top with MRB.
5. Centrifuge at top speed in a benchtop refrigerated centrifuge for 10 min at 4 °C.
6. Resuspend MAM pellet (MAM) in 200 μl RIPA lysis buffer (see Note 13).
7. Use 100 μl of MAM lysate for RNA isolation and the remaining 100 μl for protein analysis.
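
The centrifugation steps are specified as relative centrifugal force (× g), whereas many instruments are set in rpm. The following is a minimal sketch of the conversion, using the standard relation RCF = 1.118 × 10⁻⁵ × r(cm) × rpm². It back-calculates an effective radius from the rotor rating listed in Subheading 2.2.1 (SW 41 Ti: 41,000 rpm, 288,000 × g) and then derives the speed for the 95,000 × g and 100,000 × g spins. The function names are illustrative, and whether a rating refers to the maximum or the average radius is an assumption that should be checked against the rotor manual.

```python
import math

# Standard conversion constant: RCF = 1.118e-5 * radius_cm * rpm**2
RCF_CONSTANT = 1.118e-5


def radius_cm_from_rating(rated_rpm, rated_rcf):
    """Effective radius (cm) implied by a rotor's rated rpm and x g."""
    return rated_rcf / (RCF_CONSTANT * rated_rpm ** 2)


def rpm_for_rcf(target_rcf, radius_cm):
    """Rotor speed (rpm) needed to reach target_rcf at the given radius."""
    return math.sqrt(target_rcf / (RCF_CONSTANT * radius_cm))


# Rating taken from the equipment list: SW 41 Ti, 41,000 rpm, 288,000 x g.
sw41_radius_cm = radius_cm_from_rating(41_000, 288_000)

rpm_95k = rpm_for_rcf(95_000, sw41_radius_cm)    # Percoll gradient spin
rpm_100k = rpm_for_rcf(100_000, sw41_radius_cm)  # MAM sheet spin

print(f"effective radius: {sw41_radius_cm:.2f} cm")
print(f"95,000 x g  -> {rpm_95k:,.0f} rpm")
print(f"100,000 x g -> {rpm_100k:,.0f} rpm")
```

Since RCF scales with the square of the speed, the same result follows from rated_rpm × sqrt(target_rcf / rated_rcf), which is a convenient check at the bench.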


3.2.3 Pure Mitochondria Isolation

1. Collect the lower mitochondria band from step 8 in Subheading 3.2.1 (see Fig. 1) and dilute the collected solution ten times with MRB in a 13.2-ml ultracentrifuge tube (see Note 12).
2. Centrifuge tube at 10,000 × g for 10 min at 4 °C.
3. Discard supernatant and resuspend the pellet containing purified mitochondria (pMito) in 200 μl RIPA lysis buffer (see Note 13).
4. Use 100 μl of lysed mitochondria for RNA isolation and the remaining 100 μl for protein analysis.

3.2.4 ER Isolation

1. Centrifuge crude cytosolic supernatant (from step 10 in Subheading 3.2) at 20,000 × g for 30 min at 4 °C in a benchtop refrigerated centrifuge.
2. Transfer supernatant into a new 5-ml ultracentrifuge tube and centrifuge at 100,000 × g for 60 min at 4 °C. This centrifugation step separates ER from cytosol.
3. Transfer supernatant into a new tube; this is the pure cytosolic fraction (pCyto). The resulting purified ER pellet (ER) is gently rinsed in-tube 3 times with MRB and then resuspended in 200 μl RIPA lysis buffer (see Note 13).
4. Collect 275 μl pure cytosol for RNA isolation and 100 μl for protein analysis.
5. Collect 100 μl of ER lysate for RNA isolation and 100 μl for protein analysis (see Note 13).

3.3 Protein Isolation, Quantification, and Western Blot Analysis

1. Purified organelle fractions lysed in RIPA buffer (or other buffer solution) are incubated on ice for 20 min before being subjected to centrifugation at 10,000 × g for 10 min at 4 °C.
2. Collect supernatants from each fraction.
3. Determine protein concentrations using a BCA kit or other methodology of choice.
4. Add Laemmli sample buffer containing reducing agent (e.g., β-mercaptoethanol or DTT) and boil the samples for 5 min. Aliquot and store the samples at −20 °C if not processed immediately.
5. To determine the enrichment/purity of each fraction, run Western blot analysis using antibodies against organelle selective protein markers suggested in Table 1 (Fig. 2a, b).

MicroRNA Examination and Fractionation of Subcellular Organelles. . .

147

Fig. 2 Western blot analysis of subcellular marker proteins in each organelle fraction. Several conventionally recognized organelle marker proteins were used to determine the purity of the various subcellular fractions. A highly purified fraction is characterized by (1) enrichment of the corresponding organelle marker in that fraction and (2) the relative absence of markers associated with other organelles. The Western blots presented here represent a typical distribution of marker proteins in human brain cortical tissue (a) and rat brain cortical tissue (b) 3.4 RNA Isolation and miRNA Analysis (See Note 14)

1. Add additional RIPA buffer to aliquots of purified MAM, mitochondria, ER, and cytosol to a final volume of 275 μl in 1.5-ml Eppendorf tube.

3.4.1 RNA Isolation

2. Add 750 μl TRIzol™ LS to the tube. 3. Cap the tube, mix the contents vigorously for 15 s and incubate for 5 min at room temperature. 4. After a quick spin (see Note 15), add 0.2 ml of chloroform, cap tube, and mix vigorously for 15 s followed by incubation for 5 min at room temperature. 5. Spin the tube at 10,000 g for 30 s at room temperature. 6. Transfer 750 μl of TRIzol-sample mixture from the upper phase to a fresh tube avoiding the interface (see Note 16). 7. Centrifuge the tube at 12,000 g for 15 min at 4 C. 8. Collect 450 μl of the colorless upper aqueous phase into a new tube containing 2 μl glycogen (20 μg/μl) (see Note 17). 9. Vortex to mix followed by a quick spin, then add 450 μl isopropanol, mix well and leave overnight at 20 C (see Note 18). 10. Centrifuge the tube at 12,000 g for 15 min at 4 C to pellet the precipitated RNA.


Paresh Prajapati et al.

11. Carefully remove the supernatant without disturbing the RNA pellet.
12. Add 1 ml of 75% ethanol to rinse the RNA pellet.
13. Centrifuge the tube at 10,000 × g for 5 min at 4 °C.
14. Decant the supernatant and give the tube a quick spin.
15. Remove any remaining supernatant using a 200-μl pipette tip.
16. Briefly air-dry the RNA pellet (see Note 19) and dissolve it in 20–50 μl nuclease-free water (see Note 20) containing RNasin (0.5 U/μl). Keep the RNA on ice for several minutes, then vortex briefly and spin briefly to obtain a homogeneous RNA solution.
17. Determine the RNA concentration using a NanoDrop Spectrophotometer (NanoDrop Technologies, Inc.) or any other method of choice.
18. Aliquot and store RNA samples at −80 °C or proceed to the next step.
19. Perform single-tube (Subheading 3.5) or TaqMan® Low Density Array (Subheading 3.6) reverse transcription (RT) for miRNAs using the TaqMan® MicroRNA Reverse Transcription Kit.

3.5 Single Tube RT

1. Normalize total RNA concentration to 10 ng/μl for all samples.
2. Calculate and prepare the RT master mixture for a 15-μl total reaction volume per sample (12 μl of reaction mixture + 3 μl RNA):

Component                                        Volume per reaction (μl)
Nuclease-free water                              6.16
10× reverse transcription buffer                 1.50
RNase inhibitor, 20 U/μl                         0.19
100 mM dNTPs (with dTTP)                         0.15
5× RT primer                                     3.00
MultiScribe™ reverse transcriptase, 50 U/μl      1.00

3. Aliquot 12 μl of reaction mixture to PCR reaction tubes or a PCR plate.
4. Add 3 μl RNA to the reaction mixture.
5. Mix the contents gently and briefly spin the tube to bring the solution down to the bottom of the tube.
6. Incubate the tube on ice for 5 min.


7. Perform the RT reaction in a thermocycler using the following program:
(a) Hold 30 min at 16 °C.
(b) Hold 30 min at 42 °C.
(c) Hold 5 min at 85 °C.
(d) Hold at 4 °C or proceed to preamplification.

3.6 TaqMan® Low Density Array RT Using Either Megaplex™ RT Primer Pools or Custom RT Primer Pools (See Note 21)

1. Normalize total RNA concentration to 10 ng/μl for all samples to be analyzed.
2. Calculate and prepare the RT master mixture for a 7.5-μl total reaction volume per sample (4.5 μl of reaction mixture + 3 μl RNA):

Component                                              Volume per reaction (μl)
Nuclease-free water                                    0.20
10× reverse transcription buffer                       0.80
RNase inhibitor, 20 U/μl                               0.10
100 mM dNTPs (with dTTP)                               0.20
MgCl2 (25 mM)                                          0.90
Megaplex™ RT primers (10×) or custom RT primer pool    0.80
MultiScribe™ reverse transcriptase, 50 U/μl            1.50

3. Aliquot 4.5 μl of reaction mixture to PCR reaction tubes or a PCR plate.
4. Add 3 μl RNA to the reaction mixture.
5. Gently mix the contents and spin the tube to bring the solution down to the bottom of the tube.
6. Incubate the tube on ice for 5 min.
7. Perform the RT reaction in a thermocycler using the following program:
(a) Hold 2 min at 16 °C.
(b) Hold 2 min at 42 °C.
(c) Hold 1 s at 50 °C.
– Run 40 cycles of steps a–c.
(d) Hold 5 min at 85 °C.
(e) Hold at 4 °C or proceed to preamplification.
8. Add 7.5 μl of nuclease-free water to 7.5 μl of RT product (1:2 dilution) for preamplification; alternatively, the sample can be diluted further for qPCR.
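The master-mixture calculation in step 2 can be sketched in a few lines of Python. The per-reaction volumes are those of the pooled-primer RT table above; the function name, the sample count, and the 10% pipetting overage are illustrative assumptions and are not part of the protocol.

```python
from decimal import Decimal

# Per-reaction volumes (μl) taken from the pooled-primer RT table (Subheading 3.6).
RT_POOL_MIX = {
    "Nuclease-free water": Decimal("0.20"),
    "10x reverse transcription buffer": Decimal("0.80"),
    "RNase inhibitor, 20 U/ul": Decimal("0.10"),
    "100 mM dNTPs (with dTTP)": Decimal("0.20"),
    "MgCl2 (25 mM)": Decimal("0.90"),
    "RT primer pool (10x)": Decimal("0.80"),
    "MultiScribe reverse transcriptase, 50 U/ul": Decimal("1.50"),
}


def scale_master_mix(per_reaction, n_samples, overage=Decimal("0.10")):
    """Scale per-reaction volumes to n_samples plus a fractional overage.

    The 10% default overage is an assumption made for this sketch.
    """
    factor = Decimal(n_samples) * (1 + overage)
    return {
        name: (volume * factor).quantize(Decimal("0.01"))
        for name, volume in per_reaction.items()
    }


# Sanity check: the components should add up to the 4.5 μl aliquoted per sample.
per_reaction_total = sum(RT_POOL_MIX.values())

# Hypothetical batch of eight samples.
batch = scale_master_mix(RT_POOL_MIX, n_samples=8)
batch_total = sum(batch.values())

for name, volume in batch.items():
    print(f"{name:45s} {volume:>6} ul")
print(f"{'Total':45s} {batch_total:>6} ul")
```

The same function can be reused for the single-tube RT, PreAmp, and qPCR tables by swapping in their per-reaction volumes.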


3.7 Preamplification

1. Prepare the preamplification master mixture for a 10-μl total reaction volume per sample (8 μl of reaction mixture + 2 μl RT product):

Component                          Volume per reaction (μl)
Nuclease-free water                1.50
2× TaqMan PreAmp master mix        5.00
PreAmp primers (see Note 21)       1.50

2. Aliquot 8 μl of reaction mixture to PCR reaction tubes or a PCR plate.
3. Add 2 μl of the 1:2 diluted RT product to the reaction mixture.
4. Mix the contents and briefly spin the tube to bring the solution down to the bottom of the tube.
5. Perform the PreAmp reaction in a thermocycler using the following program:
(a) Hold 10 min at 95 °C.
(b) Hold 2 min at 55 °C.
(c) Hold 2 min at 72 °C.
(d) Hold 15 s at 95 °C.
(e) Hold 4 min at 60 °C.
– Run 12 cycles of steps d and e.
(f) Hold 10 min at 99.9 °C.
(g) Hold at 4 °C or proceed to qPCR.
6. Add 30 μl of 0.1× TE buffer to 10 μl of PreAmp product (1:4 dilution) for array cards. For single tube assays, dilute the PreAmp product at least a further 10-fold (final dilution for single tube assays is 1:40 or greater).

3.8 Real Time PCR

3.8.1 Single Tube TaqMan® Assay

1. Prepare the PCR master mixture for a 10-μl total reaction volume per sample (7 μl of reaction mixture + 3 μl diluted PreAmp product or diluted RT product):

Component                                                  Volume per reaction (μl)
Nuclease-free water                                        1.50
2× TaqMan® Universal Master Mix II (No AmpErase® UNG)      5.00
20× TaqMan® MicroRNA Assays                                0.50

2. Aliquot 7 μl to a PCR reaction plate or tubes.
3. Add 3 μl of the 1:40 diluted PreAmp product or diluted RT product to each designated well or tube.


Fig. 3 RT-qPCR detection of miRNAs in subcellular fractions. TaqMan® microRNA assay with preamplification is a highly sensitive method for detection of miRNAs from a relatively small quantity of starting total RNA (e.g., from mitochondria and MAM). The amplification plots presented in this figure demonstrate robust detection of miR-146a and miR-107 using 30 ng of total RNA isolated from purified mitochondria, MAM, ER, and cytosol fractions obtained from a 0.5 g human brain tissue sample

4. Seal the plate with MicroAmp Optical Adhesive Film or cap the tubes.
5. Spin the plate or tubes briefly.
6. Perform the qPCR reaction in a QuantStudio 7 Flex Real-Time PCR System or equivalent equipment using the following program (see Fig. 3 for a representative amplification plot):
(a) Hold 10 min at 95 °C.
(b) Hold 15 s at 95 °C.
(c) Hold 1 min at 60 °C.
– Run 40 cycles of steps b and c.
(d) Hold at 4 °C.

3.8.2 TaqMan Custom MicroRNA Low-Density Array

1. Prepare the PCR master mixture for a 100-μl total reaction volume per sample (90 μl of reaction mixture + 10 μl diluted PreAmp product):

Component                                                  Volume per reaction (μl)
Nuclease-free water                                        40
2× TaqMan® Universal Master Mix II (No AmpErase® UNG)      50

2. Aliquot 90 μl of reaction mixture to each tube.
3. Add 10 μl of the 1:4 diluted PreAmp product to each designated tube.


4. Cap the tube, mix by inverting several times, and briefly spin the tube.
5. Pipet 95 μl of the sample/reaction mixture into the port of the array card.
6. Centrifuge the array card and seal the plate.
7. Perform the qPCR reaction in a QuantStudio 7 Flex Real-Time PCR System using the following program:
(a) Hold 2 min at 50 °C.
(b) Hold 10 min at 95 °C.
(c) Hold 15 s at 95 °C.
(d) Hold 1 min at 60 °C.
– Run 40 cycles of steps c and d.
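The dilution ratios quoted in Subheadings 3.6 and 3.7 can be checked with exact arithmetic. The sketch below reads "1:2" and "1:4" as one part sample in two or four parts total, which matches the volumes given in the text; the helper name is hypothetical.

```python
from fractions import Fraction


def dilution(sample_ul, diluent_ul):
    """Fraction of the final volume that is sample (1/2 means a 1:2 dilution)."""
    sample = Fraction(sample_ul)
    return sample / (sample + Fraction(diluent_ul))


# RT product: 7.5 μl product + 7.5 μl nuclease-free water (Subheading 3.6, step 8).
rt_dilution = dilution("7.5", "7.5")

# PreAmp product for array cards: 10 μl product + 30 μl 0.1x TE (Subheading 3.7, step 6).
preamp_array_dilution = dilution("10", "30")

# Single tube assays: at least a further 10-fold dilution of the 1:4 stock.
preamp_single_tube_dilution = preamp_array_dilution * Fraction(1, 10)

print(rt_dilution, preamp_array_dilution, preamp_single_tube_dilution)
```

Because the single-tube figure is a minimum ("1:40 or greater"), a larger second dilution step can be substituted for `Fraction(1, 10)` without changing the rest of the calculation.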

4 Notes

1. We observed that the fractionation yield is reduced when PMI is more than 8 h.
2. Prepare stock solutions of 0.5 M EGTA pH 7.4, 1 M Tris–HCl pH 7.4, and 1 M HEPES pH 7.4 ahead of time and store them at 4 °C.
3. This buffer should be prepared fresh before each experiment. The pH should be adjusted at 4 °C. Keep the buffer on ice.
4. Make PM immediately before the Percoll gradient step. Mix well before dispensing into the ultracentrifuge tubes to avoid forming regions with uneven PM concentrations.
5. We use the TaqMan™ microRNA detection system, but any other analysis platform can be used once total RNA is isolated.
6. Gray matter of human neocortex was used in this protocol.
7. Subcellular fractionation of rat cortex requires one hemisphere, while fractionation of rat hippocampus requires hippocampi from both hemispheres.
8. Pre-cool all glassware, including the homogenizer and pestle.
9. To avoid shearing, cut the end off a 1.0 ml pipette tip (~0.5 cm from the tip) before using it to transfer the mitochondria–MAM pellet into the homogenizer. Thoroughly resuspend the pellet using a gentle motion with a loose-fitting glass homogenizer/pestle.
10. Mix PM well before transferring it into the ultracentrifuge tube to avoid forming regions with uneven concentrations.
11. Avoid any movements that could disrupt the layers.
12. A total of 1–1.2 ml may be collected.


13. Use of other buffers depends on the type of experiments and analysis to be conducted. For example, use buffers containing a less stringent detergent for co-immunoprecipitation experiments.
14. Any RNA isolation method may be employed as long as good RNA quantity and quality are obtained. Likewise, other platforms can be used for miRNA analysis.
15. When using TRIzol and chloroform, it is very important to minimize the carry-over of these organic solvents, which will interfere with downstream analysis. A quick spin can bring down residual solution that may be on the cap and/or edges of the tube.
16. This 750 μl volume will contain most of the aqueous phase with some of the lower pink TRIzol solution.
17. Due to the small quantity of total RNA from purified mitochondria and MAM, it is essential to use a nucleic acid carrier (such as glycogen or bacteriophage MS2 RNA) to obtain a better total RNA yield. We use glycogen because it significantly enhances visibility of the RNA pellet.
18. In our hands, miRNAs are better precipitated with an overnight incubation.
19. The pellet usually dries within a couple of minutes, and care should be taken not to over-dry it.
20. In our experience, a volume of 20 μl is sufficient to dissolve pure mitochondria and MAM RNA pellets, and 50 μl is sufficient for the ER RNA pellet obtained from 0.5 g of brain tissue.
21. Multiple miRNA RT and PreAmp primers can be pooled in a single RT reaction for a more efficient miRNA analysis procedure. For the protocol using custom pooled primers, see: https://www.thermofisher.com/document-connect/document-connect.html?url=https://assets.thermofisher.com/TFS-Assets/LSG/manuals/cms_094060.pdf.

Acknowledgments

This work is supported by a grant from the Kentucky Spinal Cord and Head Injury Research Trust. The authors would like to thank Dr. Joseph Sodroski for the generous gift of the PDZD8 antibody.

References

1. Graham JM (2015) Fractionation of subcellular organelles. Curr Protoc Cell Biol 69:3.1.1–3.1.22
2. Janikiewicz J, Szymanski J, Malinska D et al (2018) Mitochondria-associated membranes in aging and senescence: structure, function, and dynamics. Cell Death Dis 9:332
3. Marchi S, Bittremieux M, Missiroli S et al (2017) Endoplasmic reticulum-mitochondria communication through Ca(2+) signaling: the importance of mitochondria-associated membranes (MAMs). Adv Exp Med Biol 997:49–67
4. Van Vliet AR, Verfaillie T, Agostinis P (2014) New functions of mitochondria associated membranes in cellular signaling. Biochim Biophys Acta 1843:2253–2262
5. Wieckowski MR, Giorgi C, Lebiedzinska M et al (2009) Isolation of mitochondria-associated membranes and mitochondria from animal tissues and cells. Nat Protoc 4:1582–1590
6. Williamson CD, Wong DS, Bozidis P et al (2015) Isolation of endoplasmic reticulum, mitochondria, and mitochondria-associated membrane and detergent resistant membrane fractions from transfected cells and from human cytomegalovirus-infected primary fibroblasts. Curr Protoc Cell Biol 68(3):21–33
7. Wang WX, Springer JE (2015) Role of mitochondria in regulating microRNA activity and its relevance to the central nervous system. Neural Regen Res 10:1026–1028
8. Wang WX, Visavadiya NP, Pandya JD et al (2015) Mitochondria-associated microRNAs in rat hippocampus following traumatic brain injury. Exp Neurol 265:84–93
9. Liu NK, Xu XM (2011) MicroRNA in central nervous system trauma and degenerative disorders. Physiol Genomics 43:571–580
10. Nelson PT, Wang WX, Rajeev BW (2008) MicroRNAs (miRNAs) in neurodegenerative diseases. Brain Pathol 18:130–138
11. Paillusson S, Stoica R, Gomez-Suaga P et al (2016) There’s something wrong with my MAM; the ER-mitochondria axis and neurodegenerative diseases. Trends Neurosci 39:146–157
12. Rodriguez-Arribas M, Yakhine-Diop SMS, Pedro JMB et al (2017) Mitochondria-associated membranes (MAMs): overview and its role in Parkinson’s disease. Mol Neurobiol 54:6287–6303
13. Vance JE (2014) MAM (mitochondria-associated membranes) in mammalian cells: lipids and beyond. Biochim Biophys Acta 1841:595–609
14. Zhang S, Sodroski J (2015) Efficient human immunodeficiency virus (HIV-1) infection of cells lacking PDZD8. Virology 481:73–78

Part III Biomedical Applications and Big Data

Chapter 12

Skeletal Muscle Injury by Electroporation: A Model to Study Degeneration/Regeneration Pathways in Muscle

Camila F. Almeida and Mariz Vainzof

Abstract

Skeletal muscle has a remarkable capacity to regenerate after injury, mainly due to a reservoir of precursor cells named satellite cells (SCs), which are responsible for postnatal growth and for the response to lesions caused by exercise or disease. Upon injury, the regenerative response includes SC exit from quiescence, activation, proliferation, and fusion to repair or form new myofibers. This process is accompanied by inflammation, with infiltration of immune cells, primarily macrophages. Every phase of regeneration is highly regulated and orchestrated by many molecules and signaling pathways. Elucidating the players and mechanisms involved in muscle degeneration and regeneration is of extreme importance, especially for therapeutic strategies for muscle diseases. Here we propose a model of muscle injury induced by electroporation, which is an efficient method to induce muscle damage in order to follow the steps involved in degeneration and regeneration. Three days after electroporation, the muscle shows prominent signs of degeneration, such as areas of necrosis and infiltration of macrophages, followed by regeneration, observed by the presence of centrally nucleated myofibers. After 5 days the regeneration is very active, with small dMyHC-positive fibers. Fifteen days later, we observe a general regeneration of the muscle, with fibers showing increased diameter after 60 days. This methodology is an easy and simple alternative for inducing muscle lesions. It can be employed to study alterations in gene expression and the process of satellite cell recruitment, both in healthy animals and in dystrophic/myopathic animal models of muscular dystrophy.

Key words Skeletal muscle, Lesion, Regeneration, Electroporation

Kira Astakhova and Syeda Atia Bukhari (eds.), Nucleic Acid Detection and Structural Investigations: Methods and Protocols, Methods in Molecular Biology, vol. 2063, https://doi.org/10.1007/978-1-0716-0138-9_12, © Springer Science+Business Media, LLC, part of Springer Nature 2020

1 Introduction

After an injury, healthy muscle is able to repair itself and grow new fibers. However, in the presence of mutations in a range of genes encoding muscle proteins, the process of regeneration is not completely efficient, and its constant activation leads to exhaustion of this ability. Understanding all the steps involved in muscle degeneration and regeneration is of major importance for a better comprehension of neuromuscular diseases. For this, protocols to induce muscle injury are very useful for tracking all the steps and players of muscle regeneration. The most popular lesion models include the use of myotoxins, chemicals, and physical methods. Although all these methods are highly efficient at provoking muscle degeneration, they can have different effects on how the tissue recovers. Thus, the choice of method should take these variables into account, depending on the questions to be answered.

Tissue electroporation is a method for delivering DNA and other molecules to tissues that is broadly applied because it makes the plasma membrane more porous [1]. The technique consists of the application of voltage pulses that induce transient or permanent wounds in the plasma membrane. Typically, the transmembrane potential difference of a cell is 50–70 mV, and its maintenance is necessary to preserve the membrane’s integrity. The larger the membrane, the higher the magnitude of the transmembrane potential generated by an external electric field. Thus, electrical fields as small as 60 V/cm can damage the membrane of skeletal muscle cells [2]. When the electrical field is applied, the dielectric strength of the membrane is exceeded, which causes an increase in conductance due to pore formation [3]. During electroporation there is also an increase in local temperature and conformational changes in proteins [2].

Some publications have shown that 3 days after electroporation for gene transfer, the muscle presents prominent signs of degeneration, such as areas of necrosis and infiltration of macrophages, followed by regeneration, observed by the presence of centrally nucleated myofibers. After 15 days, it is possible to observe a general regeneration of the muscle, but some chronic lesions can still be identified [4, 5]. Based on these observations, we propose the use of an electroporation protocol as a fast and efficient method to induce muscle damage in order to follow the steps involved in regeneration. It emerges as a simple alternative to other injury methodologies, with the benefit that the lesion can be adjusted by modifying the parameters employed. This opens the possibility of studying alterations in gene expression and the process of satellite cell recruitment, both in healthy animals and in dystrophic/myopathic animal models of muscular dystrophy.
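The size dependence mentioned above is often summarized by the steady-state Schwan relation for an idealized spherical cell in a uniform field. This relation is a textbook approximation and is not taken from this chapter; skeletal muscle fibers are elongated rather than spherical, so the result is only an order-of-magnitude illustration, and the radius used in the worked line is an assumed value.

```latex
% Steady-state Schwan relation for a spherical cell of radius r in a uniform field E;
% \theta is the angle between the field direction and the point on the membrane.
\Delta V_m = 1.5\, r\, E \cos\theta

% Illustrative numbers: assumed r = 50\,\mu\mathrm{m}, E = 60\,\mathrm{V/cm} = 6000\,\mathrm{V/m}, \theta = 0
\Delta V_m = 1.5 \times \left(50 \times 10^{-6}\,\mathrm{m}\right) \times \left(6000\,\mathrm{V/m}\right) = 0.45\,\mathrm{V}
```

Under these assumptions the induced potential is several times the 50–70 mV resting value quoted above, which is consistent with the statement that modest field strengths can compromise large cells.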

2 Materials

2.1 Electroporation Induced Injury

1. 1 mL disposable syringes.
2. Anesthetics: acepromazine, ketamine, and xylazine.
3. Disposable razor blades.
4. Conductive gel.

Electroporation Induced Muscle Injury


5. Pulse generator ECM830, Electro Square Porator, Harvard Apparatus.
6. Electrode pads: Tweezertrode Kit, 7 mm diameter, stainless steel, 45-0165, Harvard Apparatus.
7. Disposable gloves.

2.2 Skeletal Muscle Dissection and Freezing

1. Disposable gloves.
2. Surgical instruments: forceps, dissecting scissors, iris scissors, iris forceps.
3. Liquid nitrogen.
4. Tissue-Tek O.C.T. compound.
5. Talc (cryoprotector).
6. Corks.
7. Cryogenic vial.

2.3 Skeletal Muscle Sectioning and Pulverization

1. Microtome-Cryostat.
2. L-polylysine solution (1 mg/mL).
3. Microscope slides and coverslips.
4. Methylene blue solution.
5. Mortar and pestle.
6. Dry ice.
7. Liquid nitrogen.

2.4 Histological Analysis

1. Hematoxylin solution.
2. Eosin solution.
3. 70, 90, 100% ethanol.
4. 50–50% ethanol–xylol.
5. Acetic acid.
6. Mounting medium.
7. Bouin’s solution: 50 mL of formaldehyde, 20 mL of acetic acid, and 430 mL of picric acid. Store at room temperature.
8. 0.2% Sirius red solution: 0.2 g of Sirius red and 100 mL of picric acid.
9. Nail polish.

2.5 Immunofluorescence

1. Hydrophobic PAP pen.
2. Paraformaldehyde 4% solution.
3. 1× PBS.
4. Primary antibodies against laminin (1/50 dilution; Abcam, Ab80580) and developmental myosin heavy chain (dMyHC) (1/30 dilution; Vector Laboratories, VPM664).


Table 1 Sequence of primers

Gene     Forward                  Reverse                  Amplicon (bp)
Col1a2   GATGGTCACCCTGGAAAACC     CACGAGCACCCTGTGGTCC      86
Myf5     CTGTCTGGTCCCGAAGAAC      GACGTGATCCGATCCACAATG    130
Myod     TACAGTGGCGACTCAGATGC     TAGTAGGCGGTGTCGTAGCC     116
Myog     CTGCACTCCCTTACGTCCAT     CCCAGCCTGACAGACAATCT     103
Pax7     GAGTTCGATTAGCCGAGTGC     GTGTTTGGCTTTCTTCTCGC     100
Tbp      TGCACAGGAGCCAAGAGTGAA    CACATCACAGCTCCCCACCA     132
Tgfb     CCCCACTGATACGCCTGAGT     AGCCCTGTATTCCGTCTCCTT    68

5. Anti-rat FITC and anti-mouse Cy3 secondary antibodies.
6. DAPI diluted in antifade mounting medium.
7. Nail polish.

2.6 Gene Expression Analysis

1. TRIzol reagent.
2. Spectrophotometer.
3. DNase.
4. SuperScript® VILO™ MasterMix.
5. Primer pairs for target genes (Table 1).
6. PowerUp SYBR Green Master Mix.
7. Real-time PCR system.

3 Methods

3.1 Animal Handling and Electroporation

It is recommended to work in accordance with your institution’s guidelines for animal research, respecting animal well-being (see Note 1).

1. Anesthetize animals by intraperitoneal injection of a cocktail of acepromazine, ketamine, and xylazine (3, 80, and 10 mg/kg of body mass, respectively) and place the animal in a dorsoventral position.
2. Remove the hair from both calves and apply a drop of conductive gel to the mouse’s skin (Fig. 1).
3. Set up the equipment: eight electrical pulses, intensity of 100 V, duration of 20 ms per pulse, and a 0.5 s interval between pulses (see Note 2).
4. Position the electrode pads perpendicular to the orientation of the muscle fibers (Fig. 1).


Fig. 1 Muscle electroporation and dissection. (a) Removal of hairs from the calf. (b) Electrode positioning. (c) After specified times, the animals are sacrificed, the skin is removed and the calcaneal tendon is pinched. (d, e) Insert a scissor under the calcaneal tendon and open it, detaching the muscle from leg and finally cut the muscle at the opposite insertion. (f) Isolated calf muscle

5. Hold the mouse’s leg tightly by the ankle to prevent the electrodes from slipping.
6. Press the start button and wait until all eight pulses have been delivered.
7. Return the animals to the cage and observe them as they recover from anesthesia.

3.2 Animal Euthanasia and Muscle Dissection

Animals can be euthanized at different time points after electroporation. Here we present data from 3, 5, 10, 15, 21, 30, and 60 days after the lesion.

1. Anesthetize the mouse and sacrifice it by cervical dislocation.
2. Rinse the limbs with 70% ethanol. Pinch the skin with forceps, make an incision with scissors, and pull the whole skin up from the leg to expose the muscles.
3. With pointed forceps, pinch the calcaneal tendon, insert scissors under it, and detach the whole calf muscle group by opening the scissors. Cut the tendon, hold it with forceps, and cut the muscles at the opposite insertion to release them.

3.3 Muscle Freezing, Pulverization and Cryosectioning

One muscle is prepared for sectioning and the other for biochemical studies.

1. In a small container with talc, place a slice of cork and put a generous drop of O.C.T. compound on it. Place the muscle perpendicular to the cork, immersing it in the O.C.T. compound. Cover everything with talc to prevent the O.C.T. compound from seeping out.
2. Grab the cork with forceps, keep it in a horizontal position, and quickly immerse it in liquid nitrogen. Leave the sample in liquid nitrogen for at least 1 min. The specimens can be stored in liquid nitrogen or in a −80 °C freezer until sectioning. For storage, pack the specimens in small, properly labeled plastic bags (see Note 3).
3. Put the second muscle directly into a cryogenic vial. Submerge the vial in liquid nitrogen and leave it there for at least 2 min. Store it in liquid nitrogen or in a −80 °C freezer until pulverization.
4. Coat microscope slides with polylysine. Apply 10 μL of polylysine solution to a slide and use another slide to spread the solution over the surface of both slides until it dries. Mark the coated side with a pencil.
5. Cool the cryostat chamber to −20 °C and put the specimen disc, the blade, and the specimen inside it for temperature equilibration.
6. Put a drop of water on the specimen disc to “glue” the cork with the muscle to it, and wait until it freezes. Then mount the specimen disc on the specimen head and align it with the blade holder.
7. Adjust the section thickness to 6 μm and cut the tissue. Transfer the muscle sections to the coated side of the slides. To check that the muscle is properly positioned, stain the sections with methylene blue for a few seconds, rinse in water, and observe under a light microscope. The muscle must be in a transverse orientation, with fibers appearing round. If necessary, adjust the positioning of the specimen disc until the correct orientation is found. Then proceed with cutting. Four to six sections suffice for each slide.
8. Cover the part of each slide that carries muscle sections with a piece of plastic film and store the slides in a −80 °C freezer (see Note 4).
9. To pulverize muscles, fill an ice bucket with dry ice and place the pestle and mortar in the middle of the ice to cool them down. Carefully pour liquid nitrogen into the mortar. Wait a few seconds, put the piece of muscle in the mortar, and crush it with the pestle until it is completely ground. Add more liquid nitrogen as it evaporates.
10. With a spatula, transfer the powder to 1.5 mL microcentrifuge tubes. If desired, the sample can be divided into aliquots. Store samples in a −80 °C freezer.


3.4 Histological Colorations

3.4.1 H&E

1. Remove slides from −80 °C and let them equilibrate to room temperature for 30 min.
2. Place the slides in a staining trough and pour in enough hematoxylin solution to submerge the muscle sections. Stain for 8 min and collect the hematoxylin solution (see Note 5).
3. Put the trough under a tap and let water flow slowly for 10 min or until all the colorant is washed out.
4. Pour eosin solution into the trough and stain for 2 min.
5. Put the trough under a tap and let water flow slowly for 10 min or until all the colorant is washed out.
6. Quickly dip the slides in acetic acid to fix the coloration.
7. Proceed to dehydration. Immerse in 70% ethanol for 3 min, then in 90% ethanol for 3 min, and finally in 100% ethanol. Immediately dip the slides into ethanol–xylol solution for 3 min and then in xylol for another 3 min (see Note 6).
8. To mount the slides, apply a drop of mounting medium to the slide and cover with a coverslip (see Note 7).
9. Seal the slides by applying nail polish around the coverslips and let them dry overnight.
10. Visualize slides under a bright-field microscope. After 3 days, we observe intense degeneration with infiltration of inflammatory cells. At 5 days, mononuclear cells are still infiltrating, but many new fibers can be observed. At 10 days post-lesion, the tissue is cleared of cellular infiltrates and we observe centrally located nuclei. Centronucleated fibers are present even 60 days after the lesion (Fig. 2).

Fig. 2 Hematoxylin and eosin staining. Muscle histology at different time points after injury. At 3 days, we observe the infiltration of inflammatory cells and fiber degeneration. After 5 days, there is still cell infiltrates and new fibers start to be formed. After 10 days, centronucleated fibers appear and they are still observable 60 days post-injury

3.5 Sirius Red

1. Remove slides from −80 °C and let them equilibrate to room temperature for 30 min.
2. Place the slides in a staining trough and pour in Bouin’s solution to fix the muscle sections for 20 min (see Note 8).
3. Wash the slides with tap water as many times as necessary, until the water is no longer yellow.
4. Filter the Sirius red solution before every use. Incubate slides in 0.2% Sirius red solution for 1 h. Wash out all the solution with running tap water (see Notes 8 and 9).
5. Dehydrate in 90% ethanol for 2 min, then in 100% ethanol for 2 min. Immerse slides in ethanol–xylol mixture for 2 min and then in xylol for another 2 min.
6. Mount the slides with mounting medium and coverslips. Seal the slides with nail polish.
7. Acquire images with a bright-field microscope coupled to a camera, using a 10× objective. We recommend capturing at least three fields from the section of each mouse.
8. Quantify Sirius red staining in ImageJ software. Load the file in ImageJ and convert the RGB image to grayscale. Work with the green channel image, which has the best contrast for thresholding. Highlight the collagen (in black) by moving the threshold bar, keeping an eye on the original image so that the black area corresponds to the red staining. Once satisfied, measure the area and calculate the percentage of black relative to the total area of the section. No appreciable fibrosis is observed 60 days post-injury (Fig. 3).

Fig. 3 Sirius red staining. Quantification of percentage of collagen deposition. Data presented as mean ± SD, n = 6 per group. Mann–Whitney test, ns nonsignificant

3.6 Immunofluorescence Staining

1. Remove slides from −80 °C and let them equilibrate to room temperature for 30 min.
2. With a hydrophobic pen, circle the sections to create a barrier.
3. Add 4% PFA to the sections, using the volume necessary to cover all of them. Fix for 15 min under a fume hood.
4. Rinse off the PFA with PBS, three times.
5. Dilute the antibodies in PBS: laminin antibody at 1/50 dilution and dMyHC antibody at 1/30 dilution. Usually a volume of 50 μL suffices to cover all the sections. To help spread the solution evenly, cut a small square of plastic and carefully place it over the sections.
6. Incubate overnight in a wet chamber at 4 °C.
7. Wash three times with PBS for 5 min each.
8. Incubate with secondary antibodies diluted 1/100 in PBS for 1 h at room temperature in a dark humid chamber. Use a volume of 150 μL per slide.
9. Wash three times with PBS for 5 min each. Remove all the PBS by absorbing it with tissue paper.
10. Apply 10 μL of mounting medium and gently place a coverslip. Turn the slide over onto tissue paper to remove excess mounting medium and any bubbles.
11. Let the slides dry for 30 min and seal with nail polish.
12. Keep slides at 4 °C in the dark. Acquire images with a fluorescence microscope, using a 20× objective. Take representative images of the whole sections (Fig. 4).
13. Count the percentage of dMyHC-positive fibers relative to all fibers in the section. dMyHC+ fibers peak 5 days after the lesion (Fig. 4).
14. To measure fiber diameter, open only the green channel images in ImageJ software and convert them to 8-bit. Set the scale if necessary (to μm). Adjust the threshold to highlight white contours. Analyze particles using the following parameters: size 200–50,000, circularity 0.00–1.00, show outlines, display results. Measure Feret’s diameter. If necessary, exclude measurements from fibers that are only partially within the frame. We found that after 60 days, the regenerated fibers had larger diameters than those of noninjured tissue (Fig. 5).

Fig. 4 Regenerating fibers are identified by dMyHC positive staining (in red)

Fig. 5 Feret’s diameter. Fiber diameter is slightly increased after lesion induced by electroporation. Data presented as mean ± SD. Mann–Whitney test, ∗∗∗p < 0.001

3.7 RNA Extraction

1. Get muscle powder from freezer and keep them in dry ice. 2. Add 500 μL of TRIzol reagent. From this point, samples can be manipulated at room temperature. Homogenize samples with TRIzol using a small plastic pestle, pressuring it against the bottom of the tube. Add more 500 μL of TRIzol and gently mix with the pestle, taking care to not let TRIzol leak from the tube. 3. Let homogenates at room temperature for 5 min. 4. Add 200 μL of chloroform and vigorously shake by hand for 15 s. The color must change from a transparent pink to a milky rose. 5. Incubate for 2 min at room temperature. 6. Centrifuge samples at 12,000 g for 15 min at 4 C. Three phases will form: in the bottom a pink one, in the middle a white one and at the top an aqueous phase. 7. Remove the aqueous phase to a new 1.5 mL tube, avoiding aspirating the middle phase. 8. Add 500 μL of isopropanol and pipette up and down several times to mix well. 9. Incubate for 10 min at room temperature. 10. Centrifuge at 12,000 g for 10 min at 4 C. Depending on the initial amount of muscle, often a visible pellet forms at the bottom. 11. Carefully remove the supernatant and wash pellet with 1 mL of 75% ethanol. Vortex briefly. 12. Centrifuge at 7500 g for 5 min at 4 C. Discard supernatant.

Electroporation Induced Muscle Injury

167

13. Air-dry the pellet for 5–10 min (see Note 10). 14. Resuspend RNA in RNase-free water by pipetting up and down several times. 15. Incubate for 10 min at 60 C. 16. Measure RNA yield with a spectrophotometer. 17. Store RNA at 80 C. 3.8 cDNA Synthesis and Quantitative RealTime PCR

1. To cDNA synthesis, use 1 μg of RNA per 20 μL reaction. 2. Add 4 μL of SuperScript® VILO™ MasterMix and RNase-free water to 20 μL. 3. Mix well and incubate at 65 C for 10 min, then at 42 C for 60 min and finally at 85 C for 5 min. 4. Store cDNA at 20 C. 5. To qPCR, dilute cDNA 1/10 in RNase-free water. In 96-well plates, apply 10 μL of PowerUp SYBR Green Master Mix, 2 μL of diluted cDNA and 300 nM of forward and reverse primers, and RNase-free water to 20 μL of reaction. Make triplicates for each sample and pair of primers (see Notes 11 and 12). 6. Perform PCR in a real-time PCR thermocycler using cycling conditions following manufacturer’s recommendations. 7. Calculate fold-change values according to 2ΔΔCT method [6]. All genes are expected to increase their expression, as a reflect of satellite cells activity and tissue remodeling. (Fig. 6). As tissue

Fig. 6 Gene expression. Genes related to the myogenic program are activated upon injury, with higher expression 3 and 5 days after electroporation, decreasing as the muscle regenerates. Data are presented as mean ± SD, n = 6 per group. Mann–Whitney test, ∗p < 0.05, ∗∗p < 0.01


Camila F. Almeida and Mariz Vainzof

recovers, their expression decreases; however, Pax7 and Myf5 expression notably remains elevated even after 60 days (Fig. 6a, b).
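The fold-change calculation in step 7 can be sketched as follows. This is a minimal illustration of the 2^(−ΔΔCT) arithmetic, not the authors' analysis script; the reference gene, the CT values, and the function name are hypothetical placeholders.

```python
from statistics import mean

def fold_change(ct_target_injured, ct_ref_injured, ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCT) method from replicate CT values."""
    # dCT: target gene normalized to the reference gene within each group
    d_ct_injured = mean(ct_target_injured) - mean(ct_ref_injured)
    d_ct_control = mean(ct_target_control) - mean(ct_ref_control)
    # ddCT: injured group relative to the control group
    dd_ct = d_ct_injured - d_ct_control
    return 2 ** (-dd_ct)

# Hypothetical triplicate CT values (illustrative only, not data from this chapter)
target_injured = [24.0, 24.5, 23.5]
ref_injured = [18.0, 18.0, 18.0]
target_control = [26.0, 26.0, 26.0]
ref_control = [18.0, 18.0, 18.0]

fc = fold_change(target_injured, ref_injured, target_control, ref_control)
print(f"Fold change: {fc:.2f}")
```

A value above 1 indicates higher expression in the injured muscle than in the control; the method assumes comparable amplification efficiency for the target and reference primers.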

4 Notes

1. A minimum of five age- and sex-matched animals is suggested for meaningful statistical analysis.
2. The safety of electroporation is determined by the magnitude of the electric field, which must remain below the threshold for irreversible damage; this can be controlled by choosing an adequate pulse amplitude and electrode configuration [7]. Adjusting these parameters is encouraged if different degrees of lesion are desired.
3. Once the muscles are frozen, it is important to keep them frozen to avoid thawing and the formation of artifacts.
4. Make as many slides as necessary for all histological stains and immunofluorescence analyses. If more slides are needed in the future, the specimen can be returned to −80 °C and cut again. Make sure that the specimen does not thaw when moved from the freezer to the microtome and vice versa. Always transport specimens on dry ice.
5. Hematoxylin solution can be reused many times. The staining time can be reduced or increased if staining is too strong or too weak, respectively.
6. Xylol must be handled under a fume hood.
7. To remove any bubbles and excess mounting medium, turn the slide face down and gently press it against a piece of tissue paper.
8. Bouin's and Sirius red solutions can be reused several times.
9. Do not let the red stain penetrate the muscle fibers. To avoid this, check the slides after 20 min of incubation and then every 10 min. Stop the incubation before 1 h if you observe that the fibers are turning red.
10. If necessary, aspirate the ethanol with a micropipette.
11. Use a SYBR Green reagent compatible with your equipment.
12. PowerUp SYBR Green Master Mix requires 1–10 ng of cDNA per reaction. In this protocol, we use 10 ng.
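The 10 ng figure in Note 12 follows from the volumes given in Subheading 3.8, as the short check below shows. It is only a sketch: it treats the cDNA amount as equal to the RNA input, and the 10 μM primer stock concentration is an assumption that the protocol does not specify.

```python
def cdna_per_well_ng(rna_ug=1.0, rt_volume_ul=20.0, dilution_factor=10.0, volume_per_well_ul=2.0):
    """Nominal cDNA mass per qPCR well, assuming cDNA mass equals RNA input."""
    conc_ng_per_ul = rna_ug * 1000.0 / rt_volume_ul   # 1 ug in 20 uL -> 50 ng/uL
    diluted_ng_per_ul = conc_ng_per_ul / dilution_factor
    return diluted_ng_per_ul * volume_per_well_ul

def primer_volume_ul(final_nm=300.0, reaction_ul=20.0, stock_um=10.0):
    """Volume of one primer stock needed per reaction (stock_um is an assumed value)."""
    return final_nm * reaction_ul / (stock_um * 1000.0)

cdna_ng = cdna_per_well_ng()
primer_ul = primer_volume_ul()
# Water = total volume - master mix - cDNA - two primers
water_ul = 20.0 - 10.0 - 2.0 - 2 * primer_ul

print(f"cDNA per well: {cdna_ng:.1f} ng")
print(f"Each primer: {primer_ul:.2f} uL, water: {water_ul:.2f} uL")
```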

Acknowledgments

This work was made possible thanks to financial support from Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP), CNPq, and CAPES.


References

1. Aihara H, Miyazaki JI (1998) Gene transfer into muscle by electroporation in vivo. Nat Biotechnol 16:867–870
2. Lee RC (2005) Cell injury by electric forces. Ann N Y Acad Sci 1066:85–91
3. Belete H, Godin L, Stroetz R, Hubmayr R (2010) Experimental models to study cell wounding and repair. Cell Physiol Biochem 25:71–80
4. Roche JA, Ford-Speelman DL, Ru LW et al (2011) Physiological and histological changes in skeletal muscle following in vivo gene transfer by electroporation. Am J Physiol Cell Physiol 301:C1239–C1250
5. Baligand C, Jouvion G, Schakman O et al (2012) Multiparametric functional nuclear magnetic resonance imaging shows alterations associated with plasmid electrotransfer in mouse skeletal muscle. J Gene Med 14:598–608
6. Schmittgen TD, Livak KJ (2008) Analyzing real-time PCR data by the comparative CT method. Nat Protoc 3:1101–1108
7. Čorović S, Mir LM, Miklavčič D (2012) In vivo muscle electroporation threshold determination: realistic numerical models and in vivo experiments. J Membr Biol 245:509–520

Chapter 13

Isolation and Characterization of Muscle-Derived Stem Cells from Dystrophic Mouse Models

Paula C. G. Onofre-Oliveira and Mariz Vainzof

Abstract

The study of the population of muscle satellite cells (SC) is important for understanding muscle regeneration and its involvement in the different dystrophic processes. We studied two dystrophic mouse models, Largemyd and Lama2dy-2J/J, that show an intense and very similar pattern of muscle degeneration but differ in the expression of genes involved in the regeneration cascade. They are, therefore, interesting models for studying possible differences in the mechanism of activation and action of satellite cells in dystrophic muscle. The main objectives of this chapter are to describe the isolation and characterization of SC populations and to evaluate the presence of myogenic and pluripotent stem cell markers in normal and dystrophic muscles.

Key words Satellite cells, Muscular dystrophies, Animal models, Antibodies, Flow cytometry

1 Introduction

Satellite cells (SC) are a subpopulation of partially undifferentiated muscle cells located at the periphery of mature myotubes, between the sarcolemma and the basal lamina of muscle fibers [1]. These cells give muscle tissue a great capacity to respond to growth and injury. When they are not stimulated, they remain in a quiescent, nonproliferative state [2]. However, in response to a stimulus such as injury or increased workload, these cells become active, proliferate, and begin to express myogenic markers, differentiating into myoblasts [3]. The control of satellite cell proliferation and differentiation has been an important field of research, especially because elucidating this process can lead to the identification of therapeutic targets for stimulating muscle regeneration. In this respect, the identification and quantification of specific subpopulations of SC, quiescent and/or active, can allow their isolation and better characterization [4–7].

Kira Astakhova and Syeda Atia Bukhari (eds.), Nucleic Acid Detection and Structural Investigations: Methods and Protocols, Methods in Molecular Biology, vol. 2063, https://doi.org/10.1007/978-1-0716-0138-9_13, © Springer Science+Business Media, LLC, part of Springer Nature 2020


2 Materials

2.1 Mice

All animals were maintained in our own animal facility. All dystrophic animals were genotyped prior to the experiments. Mice were kept in cages with water and food ad libitum, in rooms with controlled temperature and illumination, until the moment of euthanasia. All experiments were approved by the Ethics Committee of the Institute of Biosciences (protocol #097/2009).

Strains:
1. B6.WK-Lama2dy-2J/J (stock 000524), Jackson Laboratory (www.jax.org). Mice homozygous for the dystrophia-muscularis spontaneous mutations (Lama2dy and Lama2dy-2J) are characterized by progressive weakness and paralysis beginning at about 3 1/2 weeks of age. Skeletal muscle shows degenerative changes with proliferation of sarcolemmal nuclei, an increase in the amount of interstitial tissue, and size variation among individual muscle fibers. Myelination in the peripheral nervous system is delayed.
2. Largemyd (stock 000226), Jackson Laboratory (www.jax.org). The myodystrophy mutation, Largemyd, arose spontaneously at the Jackson Laboratory. Largemyd mice are a model of congenital muscular dystrophy type 1D (MDC1D; also called human α-dystroglycan glycosylation-deficient muscular dystrophy). Homozygotes exhibit a progressive myopathy, abnormal posture, thoracic kyphosis, calcium deposits in muscle, loss of Schwann cells and myelin, central nervous system defects, and reduced growth.
3. C57Black6 (normal mouse).

2.2 Cell Culture

1. Dulbecco's modified Eagle medium (DMEM) or DMEM: Nutrient Mixture F12 (DMEM/F12), with 4 mM L-glutamine, 1.5 g/L sodium bicarbonate, 4.5 g/L glucose, 1.0 mM sodium pyruvate, 1% pen-strep, and nonessential amino acids (NEAA; Invitrogen, Thermo Fisher).
2. Matrigel (BD), diluted 1:500 in 5 mL of DMEM.
3. Digestion solution 1: 15 mL DMEM medium, 0.03 g collagenase type II, 4% penicillin–streptomycin.
4. Digestion solution 2: 15 mL DMEM medium, 0.04 g collagenase type II, 2 mL trypsin, 2 mL DNase (1 mg/mL), 1% penicillin–streptomycin.
5. Inactivation solution: 15 mL supplemented DMEM medium (1% penicillin–streptomycin, 1% L-glutamine, 1% NEAA, 20% fetal bovine serum).
6. Markers used: CD29-Alexa 700 or -PE (Biolegend), CD133-PE (Biolegend), CD73-Alexa647 (Biolegend), Sca1-APC

Isolation and Characterization of Muscle Derived Stem Cells


(Abcam), CXCR4-Alexa647 (Biolegend), CD44-PE-Cy5 (BD), CD13-PE (BD), CD44-PE (Abcam), CD31-PE (Abcam), CD45-PerCP-Cy5.5 (BD).
7. Commercially available C2C12 cells, cultured in DMEM with 10% fetal bovine serum.
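For reference, the nominal working concentrations of the two digestion solutions can be estimated from the amounts listed above. The sketch below assumes that volumes are additive and ignores the small volume contributed by penicillin–streptomycin; the function name is illustrative.

```python
def mg_per_ml(mass_g, total_volume_ml):
    """Convert a mass in grams dissolved in a total volume (mL) to mg/mL."""
    return mass_g * 1000.0 / total_volume_ml

# Digestion solution 1: 0.03 g collagenase type II in 15 mL DMEM
sol1_collagenase = mg_per_ml(0.03, 15.0)

# Digestion solution 2: 15 mL DMEM + 2 mL trypsin + 2 mL DNase (assumed additive volumes)
sol2_volume_ml = 15.0 + 2.0 + 2.0
sol2_collagenase = mg_per_ml(0.04, sol2_volume_ml)
sol2_dnase = 2.0 * 1.0 / sol2_volume_ml   # 2 mL of a 1 mg/mL stock

print(f"Solution 1 collagenase: {sol1_collagenase:.2f} mg/mL")
print(f"Solution 2 collagenase: {sol2_collagenase:.2f} mg/mL")
print(f"Solution 2 DNase: {sol2_dnase:.3f} mg/mL")
```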

3 Methods

3.1 Mice

Animals were euthanized, and the muscles were dissected and collected in sterile DPBS (Gibco, Thermo Fisher) supplemented with 4% pen-strep (Gibco, Thermo Fisher). The collected muscles were washed twice with the same solution, and all visible connective tissue, veins, and adipose tissue were removed using sterile surgical instruments. The muscles are then ready for the isolation protocol.

3.2 Cell Culture

The method chosen for isolating satellite cells was enzymatic digestion followed by preplating, which purifies the cell population on the basis of differences in adhesion time.

1. Mince the clean muscles with a sterile scalpel until the pieces do not exceed 1 mm³ (see Note 1).
2. Place the material in digestion solution 1. Incubate the fragments at 37 °C, 80 rpm, for 30 min (see Note 2).
3. Centrifuge the digested product (300 × g, 4 °C, 10 min) and place it in digestion solution 2. Incubate the material in this solution for 20 min at 37 °C, 80 rpm.
4. Add inactivation solution and centrifuge the material at 300 × g, 4 °C, for 10 min.
5. Discard the supernatant and resuspend all the pelleted material in supplemented DMEM medium. Centrifuge again (300 × g, 4 °C, 5 min).
6. After centrifugation, resuspend the material again in supplemented DMEM and mechanically dissociate the cells from the fibers by gently passing the material several times through a disposable Pasteur pipette.
7. At this stage, the digestion product can be purified either by passing all the material through a 100 or 60 μm filter (Steriflip, Millipore) or by waiting a few minutes for the larger, undigested material to settle at the bottom of the centrifuge tube.
8. Perform preplating at this stage. The results of each isolation step can be seen in Fig. 1.
9. Keep all cells in a humidified incubator at 37 °C with 5% CO2 and at low cell density to inhibit contact-induced differentiation.


Fig. 1 Satellite cells isolated by the digestion method. (a) Product of the first digestion with collagenase II. (b) Material still in suspension after the solution was left to settle for a few minutes, composed of remnants of muscle fibers and individualized satellite cells. (c) Colony of satellite cells 2 days after seeding. Magnification: 50×

3.3 Preplating Selection

According to Li and coworkers, the different cell types adhere to the plate in the following order: in the first 24 h, the cells that adhere are mainly myofibroblasts (which adhere rapidly, within the first 2 h), fibroblasts, and myoblasts (PP1). Satellite cells adhere slowly (PP2; their morphology can be seen in Fig. 2), and finally stem cells derived from muscle tissue adhere after PP6, that is, after six 24-h platings [8]. Each supernatant from the previous passage was left to adhere for 24 h. After this time, the medium containing the slow-adhering cells was transferred to another flask, and the previous flask was refilled with fresh supplemented medium. This process was repeated until PP6.
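The serial transfers described above can be laid out as a simple timetable. The sketch below is one reading of the procedure, in which preplate k collects the cells that adhere during the k-th 24-h plating; the function name and the tuple layout are illustrative only.

```python
def preplate_schedule(n_preplates=6, hours_per_plating=24):
    """Return (label, start_h, end_h) for each successive plating."""
    schedule = []
    for k in range(1, n_preplates + 1):
        start_h = (k - 1) * hours_per_plating   # supernatant seeded in a new flask
        end_h = k * hours_per_plating           # supernatant transferred onward
        schedule.append((f"PP{k}", start_h, end_h))
    return schedule

schedule = preplate_schedule()
for label, start_h, end_h in schedule:
    print(f"{label}: cells adhering between {start_h} h and {end_h} h")
```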

3.4 Cytometry Studies

The main objective of this step was to compare: (a) The phenotype of normal extracted satellite cells with that of C2C12 immortalized myoblasts, mesenchymal stem cells, and muscle-derived stem cells isolated with the same technique.


Fig. 2 Satellite cells isolated by the digestion method followed by preplating. (a) PP1; (b) PP2; (c) PP3; (d) PP6, named muscle-derived stem cells. Magnification: 50×

(b) The phenotype of normal isolated satellite cells with that of dystrophic satellite cells from two different mouse models of muscular dystrophy.

To phenotype the cells according to the membrane markers they express, we performed fluorescence flow cytometry assays (see Note 3). All cytometry studies were performed at least in technical duplicate, and the results are presented as mean fluorescence intensity values. Five trials were performed for each sample using myogenic markers. We performed the cytometry on samples from two different individuals for each of the isolated cell populations. For these assays, a portion of the expanded cells was counted, centrifuged, and resuspended in a small volume of DPBS. Then, 100 μL of the cell suspension was transferred to 1.5 mL tubes. Ideally, the cells should be at a final concentration close to 2 × 10^7 cells/mL.

1. Add the fluorochrome-labeled antibodies of interest to this suspension.


Fig. 3 Example of the acquisition data and pattern of some surface marker expression in the cell lines studied

2. Incubate the cells with the antibodies for 1 h at 4 °C.
3. Wash the samples with PBS and read them on a flow cytometer (BD FACSAria II).
4. After at least 5000 events have been counted, the instrument displays the results as graphs and reports the percentage of labeled cells (Fig. 3).

3.5 Characteristics Related to Myogenesis

The comparison of markers in C2C12 cells with those in the normal and dystrophic strains, including PP1 and PP2 cells, is shown in Fig. 4. In normal muscle, PP1 and PP2 showed a very similar pattern. The same was observed for the Largemyd strain, with no differences between PP1 and PP2. In the Lama2dy-2J/J strain, on the other hand, many of the markers used differed in expression between PP1 and PP2, suggesting a more heterogeneous initial cell population in this strain. All other cell populations showed phenotypic similarities to C2C12 myoblasts (Fig. 4).

3.6 Muscle Derived Stem Cells: MDSC/PP6

Extending the preplating technique over a longer time can lead to the isolation of cells with a lower capacity for adhesion, which have been classified as stem cells resident in the muscle, the so-called muscle-derived stem cells. These cells are more undifferentiated than satellite cells and are described as pluripotent and self-renewing [9]. Our results indicate that the cell population isolated at PP6 has a phenotype very similar to that of mesenchymal stem cells (Fig. 5).


Fig. 4 Expression of markers in PP1 and PP2 cells derived from each mouse strain, including C2C12. Lama2dy-2J/J PP1 cells expressed several markers differently from PP2 cells

3.7 Conclusions

The phenotypic characterization of the isolated cell populations showed the following:

1. In normal muscle:
(a) The cell populations that adhere faster (PP1) and slower (PP2) show phenotypic patterns that are similar to each other and close to a myogenic profile.
(b) The preplating technique, when carried through to PP6, isolated a population of cells with a mixed marker pattern.


Fig. 5 Expression of markers in PP1, PP2, and PP6 cells derived from normal mice, in comparison with bone marrow mesenchymal stem cells

2. In dystrophic muscle:
(a) In Lama2dy-2J/J, the initial population PP1 is more heterogeneous.
(b) In Largemyd, there was no difference in the marker pattern between the PP1 and PP2 populations, suggesting the selection of a more homogeneous initial population.
(c) In Largemyd, both populations, PP1 and PP2, expressed at least one characteristic marker of mesenchymal stem cells, suggesting greater immaturity.

4 Notes

1. Extraction of satellite cells. Although the niche of satellite cells in the muscle is well known, isolating these cells is not trivial, and several methodologies have emerged for removing satellite cells from their position adjacent to the muscle fiber. The protocol presented here is useful for comparing the populations of several dystrophic strains and a normal control, since it can be applied to a large number of samples. In addition, this method is particularly suitable when the objective is to obtain a representative pool of the various satellite cell subpopulations present in these animal models.
2. Digestion method. Using enzymes to digest muscle tissue makes it possible to isolate a larger number of satellite cells and populate the culture flask in a much shorter period of time. Thus, this method yields a greater number of cells in a much shorter period,


since it does not depend on the mobility and migration capacity of the cells [10]. On the other hand, contamination of the culture by other cell types is much greater with the digestion technique [8]. Especially in the dystrophic strains, where connective and adipose tissue infiltrate the muscle, these tissues will also be digested and may contribute cells that are able to adhere to the flask, such as fibroblasts, endothelial cells, and mesenchymal cells, among others. A culture purification method is therefore necessary, and we use preplating for all our samples, as described by Li [8].
3. Phenotypic characterization and identification of subpopulations. The characterization of satellite cells by cell surface markers is also not trivial. Several markers can be used to identify such cells at various stages of differentiation, such as PAX3, PAX7, MYOD, and myogenin. Satellite cells can be identified in their physiological position with anti-PAX7 antibodies in different species, such as mouse [11, 12], rat [13], chicken [14], and human [15]. However, all these satellite cell-specific antibodies recognize intracellular targets and require fixation and permeabilization of the cells before they can be used to identify subpopulations by selective cytometry. Thus, even if the subpopulations of interest are identified, the cells will no longer be viable for culture and analysis. We therefore sought to define the phenotypic profile of the cells with several other markers. Some of them are used to identify mesenchymal cells [16, 17], such as CD29+, CD44+, CD73+, CD90+, CD105+, and SCA1. At the same time, some markers are known to be negative in mesenchymal cells because they are specific to other cell types: CD45, which marks hematopoietic cells, and CD31, which marks endothelial cells. The CD13 marker is indicative of cell senescence [16].

Acknowledgments

This work was made possible thanks to financial support from Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP), CNPq, and CAPES.

References

1. Morgan JE, Partridge TA (2003) Muscle satellite cells. Int J Biochem Cell Biol 35:1151–1156
2. Chen JC, Goldhamer DJ (2003) Skeletal muscle stem cells. Reprod Biol Endocrinol 1:101
3. Hawke TJ, Garry DJ (2001) Myogenic satellite cells: physiology to molecular biology. J Appl Physiol 91:534–551
4. Kallestad KM, McLoon LK (2010) Defining the heterogeneity of skeletal muscle-derived side and main population cells isolated immediately ex vivo. J Cell Physiol 222:676–684
5. Scott IC, Tomlinson W, Walding A et al (2013) Large-scale isolation of human skeletal muscle satellite cells from post-mortem tissue and development of quantitative assays to evaluate modulators of myogenesis. J Cachexia Sarcopenia Muscle 4:157–169
6. Tajika Y, Takahashi M, Hino M et al (2010) VAMP2 marks quiescent satellite cells and myotubes, but not activated myoblasts. Acta Histochem Cytochem 43:107–114
7. Yajima H, Motohashi N, Ono Y et al (2010) Six family genes control the proliferation and differentiation of muscle satellite cells. Exp Cell Res 316:2932–2944
8. Li Y, Pan H, Huard J (2010) Isolating stem cells from soft musculoskeletal tissues. J Vis Exp 41:pii:2011
9. Jankowski RJ, Deasy BM, Huard J (2002) Muscle-derived stem cells. Gene Ther 9:642–647
10. Shefer G, Van de Mark DP, Richardson JB, Yablonka-Reuveni Z (2006) Satellite-cell pool size does matter: defining the myogenic potency of aging skeletal muscle. Dev Biol 294:50–66
11. Seale P, Sabourin LA, Girgis-Gabardo A et al (2000) Pax7 is required for the specification of myogenic satellite cells. Cell 102:777–786
12. Zammit PS, Golding JP, Nagata Y et al (2004) Muscle satellite cells adopt divergent fates: a mechanism for self-renewal? J Cell Biol 166:347–357
13. Shefer G, Rauner G, Yablonka-Reuveni Z, Benayahu D (2010) Reduced satellite cell numbers and myogenic capacity in aging can be alleviated by endurance exercise. PLoS One 5:e13307
14. Halevy O, Piestun Y, Allouh MZ et al (2004) Pattern of Pax7 expression during myogenesis in the posthatch chicken establishes a model for satellite cell differentiation and renewal. Dev Dyn 231:489–502
15. Lindström M, Thornell LE (2009) New multiple labelling method for improved satellite cell identification in human muscle: application to a cohort of power-lifters and sedentary men. Histochem Cell Biol 132:141–157
16. Dominici M, Le Blanc K, Mueller I et al (2006) Minimal criteria for defining multipotent mesenchymal stromal cells: the International Society for Cellular Therapy position statement. Cytotherapy 8:315–317
17. Patki S, Kadam S, Chandra V, Bhonde R (2010) Human breast milk is a rich source of multipotent mesenchymal stem cells. Hum Cell 23:35–40

Chapter 14

Universal Library Preparation Protocol for Efficient High-Throughput Sequencing of Double-Stranded RNA Viruses

Anna S. Dolgova, Marina V. Safonova, and Vladimir G. Dedkov

Abstract

This chapter reports a library preparation protocol for efficient high-throughput sequencing of double-stranded RNA viruses. The protocol consists of four main steps, viz., enzyme treatment, precipitation using lithium chloride, full-length amplification of cDNAs, and tailing adapters for high-throughput sequencing. This protocol will be useful for all double-stranded RNA viruses and for all of the high-throughput sequencing platforms.

Key words Efficient sequencing protocols, dsRNA viruses, Enzyme treatment, Precipitation by LiCl, FL amplification

1 Introduction

The double-stranded RNA viruses (dsRNA viruses) are a heterogeneous group that currently includes eight viral families: Birnaviridae (4 genera including 16 species), Chrysoviridae (one genus including 9 species), Cystoviridae (one genus including a single species), Partitiviridae (5 genera including 45 species; 15 additional species are unassigned to a genus), Picobirnaviridae (one genus including a single species), Quadriviridae (one genus including a single species), Reoviridae (15 genera including 135 species; 55 additional species are unassigned to a genus), and Totiviridae (5 genera including 28 species) [1]. Among them are viruses of medical significance (Human rotavirus A, Kemerovo virus, etc.) [2, 3] as well as veterinary significance (African horse sickness viruses, bluetongue viruses, etc.) [4, 5]. In general, however, dsRNA viruses have a wide range of hosts, including vertebrates and invertebrates, algae, archaea, protozoa, fungi, plants, and bacteria [1]. Thus, dsRNA viruses could be of great interest to researchers from diverse fields of the biological sciences.

Kira Astakhova and Syeda Atia Bukhari (eds.), Nucleic Acid Detection and Structural Investigations: Methods and Protocols, Methods in Molecular Biology, vol. 2063, https://doi.org/10.1007/978-1-0716-0138-9_14, © Springer Science+Business Media, LLC, part of Springer Nature 2020


Anna S. Dolgova et al.

A distinctive feature of dsRNA virus genome organization is a segmented genome (from 2 to 12 segments of double-stranded RNA, with a total length of about 20 kbp). These peculiarities lead to certain technical difficulties in sequencing the full-length genome. Herein, we present an enhanced library preparation protocol for efficient high-throughput sequencing of double-stranded RNA viruses. This protocol is a further development of the approach described by Maan et al. in 2007 [6]. It eliminates contaminating DNA and single-stranded RNA and could therefore increase viral genome coverage during high-throughput sequencing and reduce the sequencing cost per sample. We successfully used this protocol for complete genome sequencing of Kemerovo virus (genus Orbivirus) on a MiSeq instrument (Illumina, Inc., USA). However, this approach could be convenient for all double-stranded RNA viruses and for all of the high-throughput sequencing platforms.

2 Materials

2.1 Buffers and Solutions

1. STE buffer (sodium chloride–Tris–EDTA buffer): 10 mM Tris, 100 mM NaCl, 1 mM EDTA, pH 8.0. Dissolve the following chemicals in 800 mL of distilled water: 5.84 g NaCl, 1.21 g Tris base, and 0.37 g Na2EDTA·2H2O. Adjust the pH to 8.0 with HCl. Bring the final volume to 1 L with distilled water. Store at 15–25 °C.
2. 10% (w/v) SDS (sodium dodecyl sulfate): To prepare 100 mL of 10% SDS solution, weigh out 10 g of SDS into a 250-mL conical flask or beaker. Add 80 mL of deionized/Milli-Q water and mix (see Notes 1 and 2). Heat to 68 °C, if necessary. Adjust the volume to 100 mL with deionized/Milli-Q water. Mix thoroughly. Optionally, filter the solution to remove any undissolved material (see Note 3). Store at RT for several months (see Note 4).
3. Phenol–chloroform (1:1): Mix phenol (see Note 5) and chloroform in a 1:1 ratio and use directly for nucleic acid isolation (see Note 6). CAUTION: phenol is a highly corrosive substance that causes severe skin burns. WEAR GLOVES and BE EXTREMELY CAUTIOUS; work in a hood if possible (see Note 7).
4. 5 M ammonium acetate: Dissolve 385 g of ammonium acetate in 800 mL of H2O. Bring the final volume to 1 L. Filter-sterilize. Store at 4–8 °C.
5. 75% ethanol: Take 75 mL of 96% ethanol and add water to a final volume of 96 mL. Store at RT.

Universal Preparation Protocol for Sequencing of dsRNA Viruses


6. 50× TAE electrophoresis buffer: Add 242 g of Tris base and 18.61 g of Na2EDTA·2H2O to approximately 700 mL of DDI H2O and stir until the Tris and EDTA are dissolved. Add 57.1 mL of acetic acid and adjust the volume to 1 L. Store at 15–25 °C.
7. Preparation of 1× TAE working solution: Take 20 mL of 50× TAE and add 980 mL of deionized/Milli-Q water. The final 1× TAE solution contains 40 mM Tris, 20 mM acetate, and 1 mM EDTA and typically has a pH around 8.6 (no need to adjust). Casting a standard agarose gel: Weigh out 1 g of agarose for a 1% gel (1.5 g for a 1.5% gel). Mix the agarose powder with 100 mL of 1× TAE buffer in a microwavable flask. Heat for 1–3 min until the agarose is completely dissolved (see Note 8). CAUTION: HOT! Be careful while stirring, as eruptive boiling can occur. Let the agarose solution cool to about 50 °C (or until you can hold the flask). Add ethidium bromide (EtBr) to a final concentration of approximately 0.2–0.5 μg/mL (usually about 2–3 μL of lab stock solution per 100 mL of gel). CAUTION: EtBr is a known mutagen. Wear a lab coat, eye protection, and gloves when working with this chemical. Pour the agarose into a gel tray with the well comb in place.
8. 0.5 M EDTA (pH 8.0): Adjust the pH to 8.0 with 10 M NaOH. Add water to 1 L and filter-sterilize (see Note 9). Divide into aliquots and sterilize by autoclaving. Store at 4 °C. To obtain 25 mM EDTA, mix 5 μL of 0.5 M EDTA with 95 μL of RNase-free water.
9. 50% (w/v) PEG 4000 (polyethylene glycol 4000) solution: Dissolve 250 g of polyethylene glycol in 300 mL of ddH2O by stirring. Add ddH2O to a final volume of 500 mL. Autoclave to sterilize. Store at 4 °C.
10. LiCl (lithium chloride): Dissolve 3.4 g of LiCl in 100 mL of ddH2O. Autoclave to sterilize. Store at RT.
11. DMSO (dimethyl sulfoxide): Use as supplied. Store at RT.

2.2 Synthetic Primers

1. Reo-Sp-Loop adapter: 5′-p-GACCTCTGAGGATTCTAAAC_Sp9_TCCAGTTTAGAATCC-3′.
2. pReo primer: 5′-p-GTTTAGAATCCTCAGAGGTC-3′.

Enzymes, enzyme buffers, etc.: DNase I, DNase I reaction buffer, T4 RNA ligase, 10× T4 RNA ligase reaction buffer, AMV reverse transcriptase, AMV reverse transcriptase buffer, dNTP mix, PCR buffer, thermostable DNA polymerase, RNase-free water.
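The relationship between the two oligonucleotides can be checked directly from the sequences above. The sketch below confirms that the pReo primer is the reverse complement of the 5′ arm of the adapter and measures how much of the 3′ arm could pair back onto the 5′ arm; treating that pairing as a hairpin stem is an interpretation on our part, and the variable names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

# Sequences as listed in Subheading 2.2 (the Sp9 spacer separates the two arms)
ADAPTER_5_ARM = "GACCTCTGAGGATTCTAAAC"
ADAPTER_3_ARM = "TCCAGTTTAGAATCC"
PREO_PRIMER = "GTTTAGAATCCTCAGAGGTC"

def stem_length(five_arm, three_arm):
    """Longest 3'-terminal stretch of five_arm that can pair with three_arm."""
    rc = reverse_complement(three_arm)
    for n in range(min(len(five_arm), len(rc)), 0, -1):
        if five_arm.endswith(rc[:n]):
            return n
    return 0

primer_matches = reverse_complement(ADAPTER_5_ARM) == PREO_PRIMER
stem = stem_length(ADAPTER_5_ARM, ADAPTER_3_ARM)
print(f"pReo is reverse complement of the 5' arm: {primer_matches}")
print(f"Potential stem between the two adapter arms: {stem} nt")
```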

2.3 Kits

1. PCR Purification Kit.
2. DNA Library Preparation Kit.

2.4 Equipment

1. Thermoshaker.
2. Thermal cycler.
3. Any automated capillary electrophoresis unit for quality control of biomolecule samples.
4. Refrigerated microcentrifuge.
5. Refrigerated high-performance centrifuge.
6. Vortex mixer, agarose gel casting system.
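The recipes in Subheading 2.1 can be checked with simple molarity arithmetic. The sketch below uses standard molar masses, which are our assumption rather than values given in the protocol, and treats the stated volumes as final volumes. Note that 3.4 g of LiCl in 100 mL works out to roughly 0.8 M, whereas step 7 of Subheading 3.1 uses 8 M LiCl, which would correspond to about 33.9 g per 100 mL; the weighed amount may need to be checked against the intended stock concentration.

```python
# Standard molar masses in g/mol (assumed values, not stated in the protocol)
MOLAR_MASS = {
    "NaCl": 58.44,
    "Tris base": 121.14,
    "Na2EDTA.2H2O": 372.24,
    "ammonium acetate": 77.08,
    "LiCl": 42.39,
}

def molarity(mass_g, compound, final_volume_l):
    """Molar concentration of a solute dissolved to a given final volume."""
    return mass_g / MOLAR_MASS[compound] / final_volume_l

def dilute(stock_conc, stock_volume, total_volume):
    """Concentration after dilution (C1 * V1 = C2 * V2), in the units of stock_conc."""
    return stock_conc * stock_volume / total_volume

recipes = {
    "STE NaCl (M)": molarity(5.84, "NaCl", 1.0),
    "STE Tris (M)": molarity(1.21, "Tris base", 1.0),
    "STE EDTA (M)": molarity(0.37, "Na2EDTA.2H2O", 1.0),
    "Ammonium acetate (M)": molarity(385.0, "ammonium acetate", 1.0),
    "50x TAE Tris (M)": molarity(242.0, "Tris base", 1.0),
    "LiCl as written (M)": molarity(3.4, "LiCl", 0.1),
    "Working EDTA (mM)": dilute(500.0, 5.0, 100.0),
}

for name, value in recipes.items():
    print(f"{name}: {value:.4g}")
```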

3 Methods

Carry out all procedures at room temperature unless otherwise specified. Always work in an RNase-free environment: clean pipettes and bench surfaces with an RNase decontamination reagent, and keep potential sources of RNase contamination away from the work area. The principal scheme of cDNA preparation and amplification is shown in Fig. 1.

Fig. 1 Principal scheme of the cDNA preparation and amplification: purified dsRNA; ligation of the anchor primer; dsRNA purification; dsRNA denaturation (with DMSO); cDNA first-strand synthesis and purification; completion of dsDNA; PCR amplification (with primer complementary to the adapter)

3.1 RNA Extraction

1. Remove cell debris from the cell culture (50 mL) (see Note 10) by centrifugation at 18,000 × g and 4 °C for 30 min. Spin the supernatant at 112,500 × g for 4 h at 4 °C. Remove the medium and resuspend the pellet of virus particles in 600 μL of STE buffer. Add 67 μL of 10% (w/v) SDS to a final concentration of 1%.
2. Transfer the nucleic acid sample to a polypropylene tube and add an equal volume of phenol–chloroform (1:1). Mix gently for 2–3 min and centrifuge at 20,000 × g for 8 min at 4 °C. Transfer the aqueous phase to a fresh tube with a pipette. Add an equal volume of phenol–chloroform, mix, and centrifuge again. Transfer the aqueous phase to a fresh tube.
3. Measure the volume of the aqueous phase. Add three volumes of 96% ethanol and 5 M ammonium acetate to a final concentration of 1 M. Shake the resulting mixture thoroughly and incubate for 1 h at −80 °C. Precipitate the RNA by centrifugation at 20,000 × g for 15 min at 4 °C (see Note 11).
4. Remove the supernatant completely. Add 500 μL of 75% ethanol and mix by vortexing to wash the precipitate. Dry the precipitate in a desiccator under vacuum (ethanol can also be removed after centrifugation at no more than 7500 × g for 5 min at 2–8 °C) and dissolve it in 50 μL of RNase-free water.
5. Evaluate the quality of the RNA obtained by electrophoresis in a 1% agarose gel containing ethidium bromide. Mix 5 μL of RNA with 1 μL of 6× Loading Dye. Load a molecular weight marker (ladder) and the samples into the wells of the gel. Run the gel at 80 V for 1.5 h (Fig. 2). The bands must correspond to the genomic segments of the virus.
6. Mix 9 μL of the RNA dissolved in water with 2 U of DNase I and 1.2 μL of 10× DNase I reaction buffer. Incubate at 37 °C for 30 min. Inactivate DNase by the addition of 1 μL

Fig. 2 Agarose gel electrophoresis of dsRNA extracted from the virus particles. The virus particles were derived from infected baby hamster kidney cell culture (BHK-21). Lanes 1 and 4: GeneRuler DNA Ladder 100–10,000 bp (Thermo Fisher Scientific, USA); lanes 2 and 3: Kemerovo virus strain 106 and strain k10


25 mM EDTA and subsequent incubation of the mixture for 10 min at 65 °C. Purify the product using any PCR Purification Kit according to the manufacturer's protocol. Elute the purified RNA in 10 μL of RNase-free water.
7. Mix 5 μL of DNase I-treated purified RNA and 20 μL of 8 M LiCl. Adjust the volume of the resulting reaction mixture to 80 μL by adding RNase-free water. Incubate on ice for 16 h.
8. Centrifuge the reaction mixture for 30 min at 12,000 × g. Purify the supernatant containing dsRNA using any PCR Purification Kit according to the manufacturer's protocol. Elute the purified RNA in 10 μL of RNase-free water.

3.2 Preparation of Viral cDNA Libraries

1. Mix 10 μL of purified dsRNA, 25 pmol of the Reo-Sp-Loop adapter, 2 μL of 10× T4 RNA ligase reaction buffer, 4 μL of 50% PEG 4000, and 10 U of T4 RNA ligase. Adjust the reaction volume to 20 μL using RNase-free water. Incubate at 25 °C in a thermoshaker at 400 rpm. To purify the ligation product, use any PCR Purification Kit according to the manufacturer’s protocol. Elute the purified RNA in 10 μL of RNase-free water.
2. Transfer 5 μL of dsRNA to a fresh PCR tube and add 3 μL of DMSO. Incubate at 90 °C for 2 min in a thermostat to denature the product (see Note 12).
3. Remove the samples from the thermostat and immediately place them on ice. Add 12 μL of a reaction mixture consisting of 1× AMV reverse transcriptase buffer, 0.5 mM dNTP mix, and 20 U of AMV reverse transcriptase, adjusted to 12 μL with RNase-free water.
4. Place the reaction mixture in a thermoshaker and incubate at 580 rpm for 30 min at 37 °C and then for another 30 min at 42 °C (see Note 13). To purify the reverse transcription product, use any PCR Purification Kit according to the manufacturer’s protocol. Elute the first cDNA strand in 10 μL of RNase-free water (see Note 14).
5. Prepare a 25-μL reaction mixture containing 2 μL of cDNA, 10 pmol of pReo primer, 0.2 mM dNTPs, 1× PCR buffer, and 1 U of thermostable DNA polymerase; adjust to the required volume with RNase-free water. Perform PCR with the following thermal cycling parameters: 95 °C, 3 min; (95 °C, 20 s; 72 °C, 15 min) for 2 cycles; then (95 °C, 20 s; 50 °C, 20 s; 72 °C, 1 min) for 45 cycles; and 72 °C, 3 min for one cycle. To purify the PCR product, use any PCR Purification Kit according to the manufacturer’s protocol. Elute the PCR product in 15 μL of RNase-free water.
6. Evaluate the quality of the library by electrophoresis in a 1.5% agarose gel containing ethidium bromide (Fig. 3).

Universal Preparation Protocol for Sequencing of dsRNA Viruses

Fig. 3 Agarose gel electrophoresis of cDNA libraries of Kemerovo virus. 1— Strain 106, 2—Strain k10, 3—DNA Ladder (100, 200, 300, 400, 500, and 800 bp)

7. Prepare DNA libraries for high-throughput sequencing using the DNA Library Preparation Kit (appropriate for your DNA sequencer) according to the manufacturer’s protocol (see Note 15). The quality of the final libraries can be assessed using a capillary electrophoresis system. Evaluate the distribution of fragments in length and the representation of each type of fragments in the library pool. Samples are ready for high-throughput sequencing.
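Some of the numbers in the steps above are easy to check with a few lines of code. The following is a minimal sketch, not part of the published protocol: it adds up the programmed hold times of the PCR in step 5 and applies the usual dilution relation C1·V1 = C2·V2 to the LiCl and SDS additions of Subheading 3.1. The function and variable names are hypothetical, and ramping time between temperatures is ignored.

```python
# Thermal program from step 5, written as (repeats, [(temperature_C, seconds), ...]).
PCR_PROGRAM = [
    (1, [(95, 180)]),                        # initial denaturation, 3 min
    (2, [(95, 20), (72, 900)]),              # 2 cycles with a 15-min extension
    (45, [(95, 20), (50, 20), (72, 60)]),    # 45 amplification cycles
    (1, [(72, 180)]),                        # final extension, 3 min
]


def total_hold_seconds(program):
    """Sum of programmed hold times; ramping between temperatures is not included."""
    return sum(repeats * sum(seconds for _, seconds in steps)
               for repeats, steps in program)


def final_concentration(stock_conc, stock_volume, final_volume):
    """Dilution relation C1*V1 = C2*V2 solved for C2 (volumes in the same unit)."""
    return stock_conc * stock_volume / final_volume


hold_minutes = total_hold_seconds(PCR_PROGRAM) / 60
licl_molar = final_concentration(8.0, 20.0, 80.0)             # 20 uL of 8 M LiCl in 80 uL
sds_percent = final_concentration(10.0, 67.0, 600.0 + 67.0)   # 67 uL of 10% SDS added to 600 uL

print(f"PCR hold time: {hold_minutes:.1f} min")
print(f"LiCl final concentration: {licl_molar:.1f} M")
print(f"SDS final concentration: {sds_percent:.2f}%")
```

Under these assumptions the PCR needs a little under 2 h of block time, and the LiCl step works at a final concentration of 2 M.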

4 Notes

1. Do not dissolve in 100 mL of deionized/Milli-Q water. In most cases, the solution volume increases when a large amount of solute is dissolved in a solvent.
2. Wear a face mask to avoid inhaling SDS dust. Avoid frothing.
3. Do not autoclave the 10% SDS solution; filter-sterilize it using a 0.45 μm filter.
4. Do not store the solution at 4 °C. SDS precipitates at temperatures below 15 °C. If there is any precipitate, redissolve it by warming the solution (68 °C for 10 min).
5. Be careful to determine which layer is the phenol. The density of pure phenol (unlike phenol–chloroform) is almost 1.0. Small changes in the density of the water layer (e.g., excess salt) can lead to layer inversion.
6. To store aliquots, add 10 mL of 50 mM Tris–HCl (pH 8) to 10–20 mL of the phenol–chloroform mix and freeze at −20 °C.

7. A solution of PEG 400 is recommended for first aid. Phenol is both a systemic and a locally toxic agent. Visit a medical center if required.
8. Do not boil the solution for too long, as some of the water will evaporate and the final buffer composition of the gel will change. It is a good idea to microwave for 30–45 s, stop and swirl, and then continue toward a boil. Keep an eye on it, as the solution tends to boil over.
9. Begin titrating before the sample is completely dissolved. EDTA, even in the disodium salt form, is difficult to dissolve at this concentration unless the pH is increased to between 7 and 8.
10. The cultivation conditions of the virus-infected cells must be modified in accordance with the recommendations of the manufacturer.
11. The RNA precipitate, often invisible before centrifugation, forms a gel-like pellet on the side and bottom of the tube.
12. If the thermal cycler has no heated lid, mix the solution under a layer of mineral oil.
13. Reaction efficiency can be enhanced by an additional 20 U of AMV reverse transcriptase added at this stage.
14. From here on you can use ddH2O.
15. We used the Nextera XT DNA Library Preparation Kit (Illumina, Inc., USA). The quality of the final libraries was assessed using a 2100 Electrophoresis BioAnalyzer (Agilent Technologies, Inc., USA), and the MiSeq platform (Illumina, Inc., USA) was used for high-throughput sequencing.

References

1. Taxonomy V (2017) The classification and nomenclature of viruses. The online (10th) report of the ICTV. Arch Virol 51(1–2):141–149
2. Aly M, Al Khairy A, Al Johani S, Balkhy H (2015) Unusual rotavirus genotypes among children with acute diarrhea in Saudi Arabia. BMC Infect Dis 15(1):192. https://doi.org/10.1186/s12879-015-0923-y
3. Shi J, Hu Z, Deng F, Shen S (2018) Tick-borne viruses. Virol Sin 33(1):21–43. https://doi.org/10.1007/s12250-018-0019-0
4. Diarra M, Fall M, Fall AG, Diop A, Lancelot R, Seck MT, ... Guis H (2018) Spatial distribution modelling of Culicoides (Diptera: Ceratopogonidae) biting midges, potential vectors of African horse sickness and bluetongue viruses in Senegal. Parasit Vectors 11(1):341. https://doi.org/10.1186/s13071-018-2920-7
5. Bournez L, Cavalerie L, Sailleau C, Breard E, Zanella G, de Almeida RS, ... Hendrikx P (2018) Estimation of French cattle herd immunity against bluetongue serotype 8 at the time of its re-emergence in 2015. BMC Vet Res 14(1):65. https://doi.org/10.1186/s12917-018-1388-1
6. Maan S, Rao S, Maan NS, Anthony SJ, Attoui H, Samuel AR, Mertens PPC (2007) Rapid cDNA synthesis and sequencing techniques for the genetic study of bluetongue and other dsRNA viruses. J Virol Methods 143(2):132–139. https://doi.org/10.1016/j.jviromet.2007.02.016

Chapter 15

Quantitation of Molecular Pathway Activation Using RNA Sequencing Data

Nicolas Borisov, Maxim Sorokin, Andrew Garazha, and Anton Buzdin

Abstract

Intracellular molecular pathways (IMPs) control all major events in the living cell. IMPs are considered hotspots in biomedical sciences, and thousands of IMPs have been discovered for humans and model organisms. Knowledge of IMP activation is essential for understanding biological functions and differences between biological objects at the molecular level. Here we describe the Oncobox system for accurate quantitative scoring of the activities of up to several thousand molecular pathways based on high-throughput molecular data. Although initially designed for gene expression and mainly RNA sequencing data, Oncobox is now also applicable to quantitative proteomics, microRNA, and transcription factor binding site mapping data. The Oncobox system includes modules for gene expression data harmonization, aggregation, and comparison and a recursive algorithm for automatic annotation of molecular pathways. The universal rationale of Oncobox enables scoring of signaling, metabolic, cytoskeleton, immunity, DNA repair, and other pathways in a multitude of biological objects. The Oncobox system can be helpful to all those working in the fields of genetics, biochemistry, interactomics, and big data analytics in molecular biomedicine.

Key words Systems biology, Bioinformatics, Intracellular molecular pathways, Gene expression, Transcriptomics, Proteomics, Epigenetics, Micro-RNA, miR, Cancer, Biomarkers, Machine learning, Big data analytics

Abbreviations

IMP: Intracellular molecular pathway
miR: MicroRNA
PAL: Pathway activation level, calculated using RNA or protein expression data
RNAseq: RNA sequencing
TFBS: Transcription factor binding site

Kira Astakhova and Syeda Atia Bukhari (eds.), Nucleic Acid Detection and Structural Investigations: Methods and Protocols, Methods in Molecular Biology, vol. 2063, https://doi.org/10.1007/978-1-0716-0138-9_15, © Springer Science+Business Media, LLC, part of Springer Nature 2020

Nicolas Borisov et al.

1 Introduction

Intracellular molecular pathways (IMPs) consolidate gene products involved in common molecular processes. IMPs are involved in all major events in normal, pathological, and developing cells. The best known are metabolic, signaling, DNA repair, and cytoskeleton pathways [1]. Metabolic pathways connect biochemical reactions and the corresponding enzymes in biologically significant networks. Signaling pathways consolidate gene products involved in signal transduction. The transduced signals are very diverse. They can be initiated by ligand–receptor binding events on the cell surface or inside the cell; by changes in the concentration of inorganic ions, specific proteins, or metabolites; by physical impacts such as high or low temperature and ionizing radiation; and by altered protein–protein interactions, e.g., within high-order molecular complexes [2–4]. The signal is initially sensed and then transduced to downstream pathway nodes until it reaches one or several effector molecules that execute the response. The intermediate steps in molecular signaling are equally varied and can involve biochemical protein modifications or cofactor binding, a switch of molecular localization (e.g., transfer of a protein molecule from the cell membrane to the cytoplasm or nucleus), assembly or disaggregation of protein complexes, and release of ions and low-molecular-mass metabolites [5–10]. The final outputs can also be extremely diverse, even for a single signaling pathway. They include changes in gene expression, altered permeability of biological membranes to ions and metabolites, biochemical modification of proteins such as phosphorylation, assembly or disruption of molecular complexes, and activation or inhibition of other molecular pathways. 
Interestingly, at least one of the outputs of a signaling pathway most frequently affects a cytoskeleton remodeling pathway, which suggests their tight interconnection in a living cell [11, 12]. DNA repair also works in concert with the other molecular pathways. For example, until DNA lesions are repaired, the cell normally cannot proceed through the cell cycle, a process that requires the involvement of both signaling and cytoskeleton pathways [13–15]. The cytoskeleton reorganization and DNA repair pathways are organized similarly to the signaling pathways and partly overlap with them. The major distinction is their functional roles, which deal, respectively, with remodeling of the cytoskeleton and repair of damaged DNA. Each individual node in a pathway can be formed not by a single gene product but by a group of gene products. Such a group can consist of homologous protein families with common functions or of the various protein subunits needed to form a specific molecular complex [16, 17]. Pathways, therefore, may have

Molecular Pathway Quantification by RNA-seq

tens or hundreds of nodes that collectively include far more gene products [18–21]. For several decades, molecular pathways have remained at the forefront of biomedical research [5–10]. Hundreds of thousands of molecular interactions and thousands of molecular pathways have been discovered and catalogued in specialized databases. Such databases may accumulate information on pathway architecture and any associated functional features [16–22]. Experimental and bioinformatic approaches have shown that a multitude of molecular pathways may be affected under the conditions investigated, e.g., during aging [23, 24], malignant transformation of tumors [25–27], and progression of other chronic human diseases [28, 29]. However, quantitatively measuring the extent of pathway activation is technically challenging [30]. Recent approaches use various OMICS data for this task, such as gene expression profiles obtained with RNA sequencing (RNAseq), microarray hybridization, or high-throughput proteomic techniques [30]. Transcriptomics is still the most robust and feasible method of interrogating gene expression, and RNAseq is currently considered the gold standard in transcriptomics [31, 32]. In this chapter, we focus on quantitation of molecular pathway activation using RNA sequencing data and provide a detailed insight into the Oncobox system for pathway scoring. In various experimental models, IMP activation can be measured through extensive screening of protein phosphorylation, by interrogating switches of transcription factor binding sites, by using specific gene expression or protein aggregation biosensors, by measuring physiological outputs, and by many other methods [33, 34]. 
Mass spectrometry-based quantitative proteomics methods are developing quickly, but so far they unfortunately cannot simultaneously meet the requirements of high-throughput analysis, reasonable cost, high performance and reproducibility, and the ability to work with minute amounts of biomaterial [35]. Transcriptomic data-driven analyses, however, outperform those approaches in simplicity and in the availability of biomaterials. To complete the analysis, only biomaterials containing intact or partly degraded RNA are needed [36]. The next condition is access to high-throughput transcriptomics; this infrastructure has become broadly distributed during the last decade, with constantly increasing availability and decreasing costs [37, 38]. These methods primarily include RNAseq-, NanoString-, and microarray-based techniques [35]. Although these instruments are limited to the RNA level of analysis, where one can routinely determine expression levels for all known genes [39], the data-processing analytic approaches do not have such limitations and can be switched, when possible, to quantitative proteomics and other types of high-throughput data as well [5]. Several methods

have been recently published to analyze mRNA, protein, and even microRNA expression data at the level of intracellular molecular pathways (IMPs) [40]. For example, Khatri et al. [41] classified those methods into three major groups: over-representation analysis (ORA), Functional Class Scoring (FCS), and Pathway Topology (PT)-based approaches. ORA-based methods test whether a pathway is significantly enriched with differentially expressed genes [42]. These methods have many limitations because they ignore all non-differentially expressed genes and do not take into account many gene-specific characteristics that influence the nature of interactions between the pathway members. FCS-based approaches partially tackle these limitations by calculating fold change-based scores for each gene and then combining them into a single pathway enrichment score [43]. PT-based analysis also takes into account the topological characteristics of each given pathway, assigning additional weights to the enclosed genes [44]. In addition, a set of differential variability methods has been developed [45]. Differential variability analysis determines a group of genes with a significant change in the variance of gene expression between case and control groups [46]. In 2014, we published a new approach for pathway analysis based on kinetic models that use the “low-level” approach of the mass action law; it performs both quantitative and qualitative enrichment analysis of signaling pathways. For each sample under investigation, it performs a case–control pairwise comparison and calculates the Pathway Activation Level (PAL; also termed Pathway Activation Strength), a value that serves as an accurate qualitative measure of pathway activation [47]. This approach does not simply identify whether a molecular pathway is affected but also determines whether the pathway is significantly upregulated or downregulated compared to the reference sample(s) and returns a quantitative measure of this deregulation. 
The PAL-based method, in its initial form termed OncoFinder [47], then evolved to include analysis of pathway topologies (iPANDA) [48] and, finally, to provide for each gene product a differential weighting coefficient that depends on its algorithmically curated function in a pathway (Oncobox) [49]. The same rationale was also employed for pathway activity assessment based on microRNA (miR) expression data. The related technique, termed MiRImpact, links total miR expression profiles with their estimated effects on the regulation of molecular pathways by calculating the respective PAL scores [49]. Furthermore, other epigenetic data, such as frequencies of transcription factor binding sites, can also serve as the input for measuring the depth of transcriptional regulation of a molecular pathway using another PAL-related method [50]. Finally, this approach can also be extended to DNA mutation data to assess molecular pathways with respect to mutation burden [51, 52].

Fig. 1 Correlation between transcriptomic data obtained for the same representative renal carcinoma specimen using the Illumina HT12 (ordinate) and CustomArray (abscissa) microarray platforms. The panels represent: (a) correlation between the oligonucleotide expression tags; (b) correlations at the level of individual genes; (c) correlation at the level of molecular pathways (PAL values)

Of note, the PAL approach and its daughter methods mentioned above have also demonstrated an ability to strongly reduce the level of experimental noise in the input data, whether microarray- or NGS-based [53, 54]. This noise, frequently referred to as platform bias and batch effect, is introduced by the experimental systems used to profile gene expression, including the equipment and sets of reagents. Reduced noise was detected for experimental gene expression data obtained using the Affymetrix HG U133 Plus 2.0, Illumina Genome Analyzer, Illumina HT12 bead array, and Agilent 1M array platforms, and also with the Orbitrap Velos and XL mass spectrometer platforms [53]. At both the transcriptomic and proteomic levels, the PAL-based analytic approach returned more stable results than the expression of individual genes (Fig. 1) [53]. A comparative investigation was recently made to test the performance of different pathway measuring techniques in one study [53]. Five popular approaches were compared, namely PAL-based OncoFinder [47], TAPPA (Topology analysis of pathway phenotype association) [55], Topology-Based Score (TBScore) [56], Pathway-Express (PE) [57], and SPIA (Signal pathway impact analysis) [58]. The test set comprised several biosamples profiled using different experimental platforms. Performance was tested in two ways. First, the ability of a method to improve correlations between data obtained using different platforms (initial gene expression versus pathway activation characteristics) was assessed. Second, the methods were checked for their ability to retain the biological features of the biosample types, i.e., whether results for the same biosample from different platforms are closer to each other than results for different biosamples obtained on the same platform. In both tests, the PAL-based approach outperformed the competitor methods (Table 1) [53]. We, therefore, further concentrate on the method

Table 1 Comparison of pathway scoring methods using functional and statistical tests

Columns: Method; Data aggregation effect (a); Distance distribution within each sample type (b); Quality of clustering (c)

PAL/OncoFinder: ++, +++, +++
TAPPA: +++, ++
TBScore:
Pathway-Express: +++
SPIA: +++

(a) Data aggregation effect indicates the benefit of operating at the molecular pathway level compared to operating at the level of individual gene expression; this effect was measured for biosamples with mRNA expression independently profiled using at least two different experimental platforms.
(b) Distance distribution within each sample type indicates the distribution of Euclidean distances between the vectors of pathway activation metrics for three different sample types normalized on the fourth sample type. Each sample type in multiple replicates was profiled in parallel using different experimental platforms. For each sample taken separately, a unimodal distribution indicated a lack of significant difference between within-platform and cross-platform distances. A bimodal or multimodal distribution meant that the cross-platform PAS distance was essentially higher than the within-platform distance.
(c) Following mRNA profiling using different experimental platforms and assessment of pathway activation as above, the profiles were then clustered according to their pathway signatures. Here the ability of pathway scoring methods to support sample type-based rather than platform type-based clustering was investigated.

Oncobox, which is the most recent PAL-based approach for transcriptomic data. To explain the phenomenon of improved correlations, a mathematical model was created to simulate error acquisition in raw gene expression and in PAL-based approaches. In agreement with the experimental data, in this model PAL methods produced significantly more stable results. The same model also predicts that this noise-reducing data aggregation effect grows with the number of members in a pathway and becomes significant when a pathway has at least 30 gene products [53].
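The aggregation effect can be illustrated with a toy simulation. The sketch below is not the model published in [53]; it is a deliberately simple illustration under stated assumptions: every gene in the pathway has a true log CNR of 0 and an ARR of 1, and each measured log CNR carries independent Gaussian noise with the same standard deviation. Under these assumptions the pathway score is the mean of the per-gene values, so its spread is expected to shrink roughly as 1/√n with the number of gene products n. All names are hypothetical.

```python
import random
import statistics


def simulated_pal(n_genes, noise_sd, rng):
    """Pathway score for a pathway with true log CNR = 0 and ARR = 1 for every gene.

    With these assumptions the score reduces to the mean of the per-gene noise.
    """
    noisy_log_cnr = [rng.gauss(0.0, noise_sd) for _ in range(n_genes)]
    return sum(noisy_log_cnr) / n_genes


def pal_spread(n_genes, noise_sd=1.0, trials=2000, seed=0):
    """Standard deviation of the simulated pathway score over repeated trials."""
    rng = random.Random(seed)
    scores = [simulated_pal(n_genes, noise_sd, rng) for _ in range(trials)]
    return statistics.pstdev(scores)


for n in (1, 5, 30, 100):
    print(n, round(pal_spread(n), 3), "expected about", round(1 / n ** 0.5, 3))
```

In this toy setting a 30-gene pathway score is several times less noisy than a single gene, which is the qualitative behavior described above; real expression noise is neither independent nor identically distributed, so the numbers should not be read as predictions.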

2 Materials

Molecular pathways used in the current version of the Oncobox system were published in [51]. The code for the Shambhala harmonizer tool was written as a further modification and upgrade of the R package CONOR [59]. The whole code was arranged as the R package HARMONY. This package, as well as a code example for the Shambhala application, is deposited at GitHub, https://github.com/oncobox-admin/harmony.

3 Methods

For all PAL-family methods, including Oncobox, pathway scoring is based on the algorithm for PAL calculation. The basic algorithm for molecular pathway activation analysis rests on the following major principles. First, the molecular interaction graph of each pathway is assumed to take the form of two parallel chains of events, one leading to activation and the other to inhibition of the molecular pathway. Second, the expression of all gene products participating in a pathway with “activator” roles is assumed to be lower when the pathway is inhibited, and vice versa. This principle is based on published data showing that deeply unsaturated states of each of the signal-transducing proteins in an individual molecular pathway are congruent with low pathway activity states [5, 9]. The algorithm requires gene expression data and information on protein–protein interactions in the pathway under investigation in order to functionally annotate each gene product in the pathway, i.e., to assign its activator or repressor role [47]. Another important aspect is the identification of control samples needed to normalize gene expression during PAL calculation. Positive PAL values indicate increased activity of a molecular pathway compared to the group of normal samples, negative values indicate pathway downregulation, and a zero PAL score characterizes pathways that are unaffected between the case and the normal samples [47]. It is important to note that although the basic algorithm includes a notion of gene expression (i.e., relative mRNA and protein content in the usual interpretation), other measured molecular characteristics can be reduced to it:
– MicroRNA expression (affecting target gene expression via specific inhibition of target mRNAs).
– Transcription factor binding (regulating gene expression at the transcription level).

The Oncobox system treats all gene products participating in a molecular pathway as having potentially equal possibilities to cause activation or inhibition of this pathway. For calculating molecular pathway activation levels, the Oncobox system utilizes the following basic formula:

$$\mathrm{PAL}_p = \frac{\sum_n \mathrm{NII}_{np} \cdot \mathrm{ARR}_{np} \cdot \log \mathrm{CNR}_n}{\sum_n \mathrm{NII}_{np} \cdot \left|\mathrm{ARR}_{np}\right|},$$

where PALp is the activation level of molecular pathway p; CNRn (case-to-normal ratio) is the ratio of the concentration of the product of protein-encoding gene n in the test sample to that in the norm (average value in the control group); log is the natural logarithm (ln); NIInp is the index of gene

Fig. 2 Organization of Oncobox system for PAL calculation using mRNA, microRNA, quantitative proteomics and TFBS data

product n assignment to the pathway p, which equals 1 for gene products included in the pathway and 0 for gene products not included in it; the discrete value ARRnp (activator/repressor role) is stored in the molecular pathway database and is defined for a gene n in the pathway p as follows:

$$\mathrm{ARR}_{np} = \begin{cases} -1, & \text{protein } n \text{ is a signal repressor in pathway } p \\ -0.5, & \text{protein } n \text{ is more likely a signal repressor in pathway } p \\ 0, & \text{the role of protein } n \text{ in pathway } p \text{ is either ambivalent or neutral} \\ 0.5, & \text{protein } n \text{ is more likely a signal activator in pathway } p \\ 1, & \text{protein } n \text{ is a signal activator in pathway } p \end{cases}$$

The molecular pathway activation level is normalized to the number of gene participants of the pathway with known functional roles, represented by the |ARRnp| term. Depending on the type of molecular data available for the biosample under investigation, the log CNRn parameter, i.e., the logarithm of the ratio of the expression of gene n in the test sample to that in the control sample or control group of samples, is calculated in different ways as a part of the basic algorithm. The following subsections describe the options for calculating log CNRn for the different types of available molecular data (Fig. 2).
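The basic formula can be written down in a few lines of code. The following is a minimal sketch, not the Oncobox implementation: the function name, the dictionary-based inputs, and the gene names are hypothetical. Pathway membership (NII) is represented implicitly, since only genes listed in the ARR dictionary belong to the pathway. Two further choices are assumptions of this sketch rather than statements from the text: pathway genes without an expression measurement are skipped, and a pathway whose members all have ARR = 0 gets a score of 0.

```python
import math


def pathway_activation_level(cnr, arr):
    """PAL for one pathway.

    cnr: dict mapping gene -> case-to-normal ratio (must be > 0).
    arr: dict mapping pathway gene -> ARR value in {-1, -0.5, 0, 0.5, 1};
         genes absent from this dict are treated as NII = 0.
    """
    numerator = 0.0
    denominator = 0.0
    for gene, role in arr.items():
        if gene not in cnr:
            continue  # assumption: genes without a measurement are skipped
        numerator += role * math.log(cnr[gene])  # natural logarithm
        denominator += abs(role)
    if denominator == 0:
        return 0.0  # assumption: no gene with a known role, so no score
    return numerator / denominator


# Hypothetical four-gene pathway; GENE_E is measured but not a pathway member.
arr = {"GENE_A": 1, "GENE_B": -1, "GENE_C": 0.5, "GENE_D": 0}
cnr = {"GENE_A": 2.0, "GENE_B": 0.5, "GENE_C": 1.0, "GENE_D": 4.0, "GENE_E": 8.0}

pal = pathway_activation_level(cnr, arr)
print(round(pal, 4))
```

Here the activator is upregulated twofold and the repressor is downregulated twofold, so both contribute ln 2 to the numerator and the score is positive, i.e., the pathway is scored as activated relative to the norm.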

3.1 Log CNRn for mRNA Expression Data

Common normalization is applied to the test samples together with the groups of relevant normal samples for mRNA profiles prior to PAL calculation. The publicly available high-throughput mRNA profiles include results for more than two million samples obtained in more than 70,000 experiments [60]. These data were obtained using different experimental platforms and reagents and are, therefore, hardly comparable [61]. To reach a satisfactory level of data homogeneity for the expression profiles, the

Fig. 3 Bringing the gene expression profile to universal form using Shambhala method. Expression profiles (gene distributions by expression levels) before and after harmonization using Shambhala algorithm are shown (panels a–d and e–h, respectively). All the expression profiles were obtained for the same biosample (Stratagene Universal Human Reference RNA; UHRR Catalog #740000) using various experimental platforms: Illumina HiSeq 2000 (a, e), Illumina HumanHT-12 V4.0 expression beadchip (b, f), Affymetrix Human Gene 2.0 ST Array (c, g) and Affymetrix GeneChip PrimeView Human Gene Expression Array (d, h)

Oncobox system utilizes Shambhala, an original method of gene expression harmonization that is applicable for standardization of results obtained using different experimental platforms and protocols [62] (Fig. 3). Unlike previous harmonization methods such as DWD (distance-weighted discrimination) [63], XPN (cross-platform normalization) [64], and PLIDA (platform-independent latent Dirichlet allocation) [65], which can use only data generated by at most two experimental platforms, Shambhala can harmonize any number of transcriptomic datasets obtained using any number of experimental platforms by deeply restructuring every expression profile and bringing it to a universal, comparable form [62]. Following Shambhala harmonization, direct analysis of gene expression levels is performed. The harmonized (if needed) expression profiles are next normalized. For microarray gene expression data, the quantile normalization method may be applied [66], whereas for RNAseq data, another group of pre-normalization techniques may be recommended, such as the DESeq or DESeq2 methods [67]. Molecular pathway activation is calculated according to the main PAL formula above, where log CNR is the natural logarithm of the ratio of the harmonized expression value of gene n in the test sample to the norm (arithmetic or geometric mean value for the control group); see the example in Fig. 4.
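The last step, turning normalized expression values into log CNR, can be sketched as follows. This is an illustration only; the function name and the example values are hypothetical, and harmonization and normalization are assumed to have been done already. The sketch also assumes strictly positive expression values; how zero counts are handled (for example, with a pseudocount) is not specified here.

```python
import math


def log_cnr(case_value, control_values, geometric=True):
    """Natural log of the case-to-normal ratio for one gene.

    case_value: normalized expression of the gene in the test sample (> 0).
    control_values: normalized expression of the same gene in the control group (> 0).
    geometric: use the geometric mean of the controls as the norm if True,
               otherwise the arithmetic mean.
    """
    if geometric:
        norm = math.exp(sum(math.log(v) for v in control_values) / len(control_values))
    else:
        norm = sum(control_values) / len(control_values)
    return math.log(case_value / norm)


controls = [1.0, 4.0]
print(round(log_cnr(8.0, controls), 4))                    # norm = geometric mean = 2.0
print(round(log_cnr(8.0, controls, geometric=False), 4))   # norm = arithmetic mean = 2.5
```

The two averaging options give different norms whenever the control values differ, so the choice should be kept the same for all genes and samples within one analysis.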

Fig. 4 Schematic representation of alterations in the molecular pathway “ATM Pathway (DNA repair)” after 4 weeks of incubation of the cell lines SKOV and NGP-127 with the anticancer targeted drugs Everolimus, Temsirolimus, Pazopanib, Sorafenib, and Sunitinib, respectively. The pathway is shown as an interacting network, where green arrows indicate activation and red arrows indicate inhibition. Pathway Activation Levels (PALs) were measured against time-matched controls of the corresponding cell lines not exposed to the drugs. PAL values are shown for each experiment. Color depths of the pathway nodes correspond to logarithms of the case-to-normal ratio (CNR) of expression for the gene products forming each node, where “normal” was the geometric average of the respective control samples

3.2 Log CNRn for MicroRNA Data

This method of data analysis is based on the use of a database of gene products that are molecular targets of individual microRNAs (miRs) [49, 68]. For every miR, the target database contains its unique identifier and a list of identifiers for all gene products that are molecular targets of this miR. The effect on the gene expression level is calculated on the assumption that miR molecules functionally inhibit their mRNA targets. An increased miR level, therefore, leads to decreased adjusted expression levels of the relevant target mRNAs, and vice versa. Each gene product may have several regulatory microRNAs, and each microRNA may have several gene targets [49]. Here, log CNRn is calculated according to the following formula:

$$\log \mathrm{CNR}_n = -\sum_{i=1}^{j} \log \mathrm{miCNR}_i \cdot \mathrm{miII}_{i,n},$$

where n is the gene product being analyzed, j is the total number of miRs under investigation, and i is the individual miR under investigation.
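The formula above can be sketched in code as follows. This is an illustration, not the MiRImpact implementation: the function name, the input dictionaries, and the miR and gene names are hypothetical. The involvement index miII (1 if the gene product is a target of the miR, 0 otherwise) is represented as membership in a per-miR target set, and miCNR is the case-to-normal ratio of the miR expression level.

```python
import math


def mir_log_cnr(gene, mir_cnr, mir_targets):
    """log CNR of a gene product inferred from miR data.

    mir_cnr: dict mapping miR id -> case-to-normal ratio of that miR (> 0).
    mir_targets: dict mapping miR id -> set of target gene ids (miII = 1 for members).
    """
    total = 0.0
    for mir, ratio in mir_cnr.items():
        if gene in mir_targets.get(mir, set()):  # miII = 1 only for targets
            total += math.log(ratio)
    return -total  # minus sign: a miR inhibits its target gene products


mir_cnr = {"miR-a": 4.0, "miR-b": 0.5, "miR-c": 10.0}
mir_targets = {
    "miR-a": {"GENE1"},
    "miR-b": {"GENE1", "GENE2"},
    "miR-c": {"GENE3"},
}

for gene in ("GENE1", "GENE2", "GENE3"):
    print(gene, round(mir_log_cnr(gene, mir_cnr, mir_targets), 4))
```

In this example GENE2 is targeted only by a miR whose level is halved, so its inferred log CNR is positive, while GENE3 is targeted by a strongly upregulated miR and gets a negative value.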

The Boolean miR involvement index (miIIi,n) indicates whether gene product n is a molecular target of miR i: miIIi,n equals 1 when gene product n is a molecular target of miR i and 0 otherwise. miCNRi is the ratio of the measured expression level of miR i in the test sample to its average value in the control group. The negative coefficient before the summation symbol reflects the inhibiting role of a miR for the corresponding target gene product.

3.3 Log CNRn for Quantitative Proteomic Data

To analyze pathway activation at the level of protein expression, the Oncobox system first uses the Shambhala method to harmonize the case and normal samples, as in the mRNA application. Molecular pathway activation is then calculated with log CNR taken as the natural logarithm of the ratio of the harmonized expression level of protein n in the test sample to the norm (average value for the control group) [53].

3.4 Log CNRn for Transcription Factor Binding Site Data

In this application, a consensus transcription start point is determined for each relevant gene, together with its neighborhood (e.g., a region of 5 kb around the transcription start point). The number of mapped transcription factor binding sites in this neighborhood is then identified. Next, the GRES (Gene Record Enrichment Score) is calculated for every gene:

$$\mathrm{GRES}_n = \frac{m \cdot \mathrm{TES}_n}{\sum_{i=1}^{m} \mathrm{TES}_i},$$

where GRESn is the GRES value for gene n; m is the total number of relevant genes for a given test sample; TESn is the number of mapped transcription factor binding sites (TFBS) in the neighborhood of gene n; i is the index corresponding to the gene identifier; and the sum of TESi over all m genes is the total number of mapped TFBS in the neighborhoods of all tested genes. For every gene, the GRES value ranks the TFBS saturation level. For example, GRES = 1 means the average saturation level among all genes; GRES > 1 means higher saturation; and GRES < 1, conversely, means that the gene is depleted in TFBS relative to the average for all genes. Finally, log CNRn is calculated according to the following formula:

$$\log \mathrm{CNR}_n = \log \mathrm{CNR}(\mathrm{GRES}_n),$$

where CNR(GRESn) is the ratio of GRES in the test sample to the average GRES in the control group for gene n [50]. For each of the analyzed data types (mRNA, protein, and microRNA expression profiles and transcription factor binding profiles), the Oncobox system uses specific modifications of the original algorithm (Fig. 2). For all the above methods, results of PAL

200

Nicolas Borisov et al.

scoring depend greatly on the functional annotation of pathway members reflected by ARR values in the main formula. Although it is possible to manually annotate a limited number of pathways, working with the high throughput database of molecular pathways in such a way would be unrealistic. Manual curation of each gene product function by looking at molecular pathway graphs is an apparent source of inevitable operational errors that restricts a wide use of such techniques to a relatively small number of pathways. To solve this problem by automating the pathway annotation, the following analytic algorithm was incorporated into the Oncobox system. 3.5 Automatic Annotation of Pathway Member Roles

For the Oncobox system, five types of functional roles, reflected by ARR values, are considered for the pathway members: pathway activator, repressor, rather activator, rather repressor, and gene product with an uncertain or inconsistent role. Automatic annotation of gene product functional roles and assignment of ARR coefficients in molecular pathways is implemented as follows.

The annotation algorithm is based on machine analysis of the protein–protein interaction graph of each particular pathway. This graph can be built manually or from any molecular pathway database, such as KEGG, BioCarta, and Reactome. Genes are placed on the graph nodes, and an edge (rib) between two nodes represents a protein–protein interaction between the corresponding gene products. Each edge is directed and carries a parameter indicating the type of interaction: activation or inhibition. For ARR coefficients to be assigned correctly, the graph must be connected; weak connectivity is sufficient. If the protein interaction graph of a specific molecular pathway meets these criteria, ARR coefficients can be assigned automatically to the gene products included in this pathway by the following recursive algorithm:

1. Initialization: the central node of the graph is identified. Two parameters, N and M, are calculated for each node V: N is the number of nodes that can be reached when moving from V, and M is the number of nodes from which V can be reached. The central node is the node V for which N + M is maximal. The value ARR = 1 is assigned to the central node, and ARR values are then recursively assigned to the other nodes starting from it.

2. Recursion R: for each node V, all nodes Pi connected to it by an edge Pi → V or V → Pi are found. Each edge can be traversed only once during the recursion; otherwise, the recursion could be endless if the graph contains cyclic interactions. If the edge has an “activation” parameter, a temporary value ARRtemp = 1 is assigned to the node Pi. If the edge has an “inhibition” parameter, a temporary value ARRtemp = −1 is assigned to the node Pi. If the node Pi was never encountered previously in the graph traversal, ARR = ARRtemp is assigned to Pi. If Pi was encountered previously and its previously assigned ARR equals ARRtemp, ARR = ARRtemp is retained for Pi. If Pi was encountered previously and its previously assigned ARR does not equal ARRtemp, ARR is assigned to Pi according to the conflict resolution rule.

Conflict resolution rule: if a gene with an already assigned ARR is encountered during the graph traversal and the above rules yield a contradictory ARR value for it, the conflict is resolved as follows: (a) if the signs of the two ARR coefficients differ, the resulting ARR = 0; (b) if the ARRs differ by 0.5 and one of them is positive, the resulting ARR = 0.5; (c) if the ARRs differ by 0.5 and one of them is negative, the resulting ARR = −0.5. The recursion R is then initiated for each node Pi with |ARR| equal to 1.

3. As a result, the algorithm assigns ARR values to the graph nodes. These coefficients can be used to calculate molecular pathway activation strengths according to the above formula.

Therefore, the gene products included in the molecular pathway database have assigned ARR values representing their functional significance in the given molecular pathway. Both published and user-defined molecular pathway catalogues can be used to generate the molecular pathway database. The published catalogues include collections such as BioCarta, KEGG, NCI, Reactome, and Pathway Central [30]. In the Oncobox system, 3125 molecular pathways were accumulated, which collectively cover more than 11,000 protein-coding human genes and include ~300,000 protein–protein interactions.
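The annotation procedure of Subheading 3.5 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Oncobox implementation: the function names (`reachable`, `resolve_conflict`, `annotate_arr`), the edge-list input format, and the toy graph are hypothetical; the inhibition value is taken as ARRtemp = −1 (the minus sign appears to have been lost in the printed text); ties for the central node are broken alphabetically; and conflicts not covered by rules (a) to (c) simply keep the previously assigned ARR.

```python
from collections import defaultdict


def reachable(start, adj):
    """Return the set of nodes reachable from `start`, excluding `start`."""
    seen, stack = set(), [start]
    while stack:
        for nxt in adj[stack.pop()]:
            if nxt != start and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def resolve_conflict(old, new):
    """Conflict rules (a)-(c); uncovered cases keep `old` (assumption)."""
    if old * new < 0:
        return 0.0                      # (a) opposite signs
    if abs(old - new) == 0.5:
        # (b) one value positive -> 0.5; (c) otherwise one is negative -> -0.5
        return 0.5 if max(old, new) > 0 else -0.5
    return old


def annotate_arr(edges):
    """edges: iterable of (source, target, kind), kind in
    {"activation", "inhibition"}. Returns (central_node, {node: ARR})."""
    edges = list(edges)
    fwd, rev, nodes = defaultdict(set), defaultdict(set), set()
    for src, dst, _ in edges:
        fwd[src].add(dst)
        rev[dst].add(src)
        nodes.update((src, dst))

    # Step 1: the central node maximizes N + M (ties broken alphabetically).
    central = max(
        sorted(nodes),
        key=lambda v: len(reachable(v, fwd)) + len(reachable(v, rev)),
    )
    arr = {central: 1.0}
    used = set()  # indices of edges already traversed (each edge used once)

    # Step 2: recursion R over edges incident to node v, in either direction.
    def recurse(v):
        to_visit = []
        for idx, (src, dst, kind) in enumerate(edges):
            if idx in used or v not in (src, dst):
                continue
            used.add(idx)
            p = dst if src == v else src
            temp = 1.0 if kind == "activation" else -1.0
            if p not in arr or arr[p] == temp:
                arr[p] = temp
            else:
                arr[p] = resolve_conflict(arr[p], temp)
            if abs(arr[p]) == 1.0:
                to_visit.append(p)
        for p in to_visit:
            recurse(p)

    recurse(central)
    return central, arr


# Hypothetical toy pathway: A -> B -> C -> D (activation), X -| D (inhibition).
toy_edges = [
    ("A", "B", "activation"),
    ("B", "C", "activation"),
    ("C", "D", "activation"),
    ("X", "D", "inhibition"),
]
central_node, arr_values = annotate_arr(toy_edges)
```

On this toy graph, node D has the largest N + M and becomes the central node; A, B, and C receive ARR = 1 through activation edges, and X receives ARR = −1 through its inhibition edge. Because the rules are applied literally, the sign of the node V itself is not propagated along the edge, which is a simplification that a production implementation may handle differently.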

4 Summary

Profiling the dynamics of intracellular pathways helps to understand the molecular processes linked with any condition of interest to a researcher. Oncobox and other PAL-based approaches have also been effective in finding numerous biomarkers for various biological processes, with applications including disease biomarkers, assessment of embryo development, aging, immunity, drug discovery and repurposing, and even molecular evolution during human speciation (summarized in Table 2). So far, PAL scoring methods have been applied only to mammalian species, most probably because of the specific composition of the molecular pathways used in the first releases of the software.

Table 2 Selected applications of pathway activation level/Oncobox methods according to published reports

# | Application | Reference
1 | Drug discovery using high-throughput gene expression data | [69, 70]
2 | Drug repurposing using high-throughput gene expression data | [71]
3 | Discovery of combinational target therapies based on individual gene expression profiles | [72, 73]
4 | Better quality molecular biomarkers for machine learning-assisted personalized prediction of drug responders and nonresponders | [74–76]
5 | Analysis of molecular effects of nutrients on human cell cultures | [77]
6 | Understanding molecular mechanisms of cytotoxicity of putative therapeutic agents | [78, 79]
7 | Identification of embryonic-fetal transition markers | [80]
8 | Interactome analysis of myeloid-derived suppressor cells in murine models of colon and breast cancer | [81]
9 | Analysis of death and survival mechanisms in mammalian immune cells | [82–84]
10 | Analysis of cell death and survival modalities in hepatocytes | [84]
11 | Studying tumor hypoxia during surgical operative temporary and permanent portal vein embolization on animal models | [85]
12 | Crosslinking cancer cell resistance against target therapies and sensitivity to radiation therapy | [86]
13 | Finding biomarkers of human cancer | [26, 27, 80, 87–89]
14 | Measuring mutation burden of molecular pathways in cancer | [51]
15 | Finding molecular biomarkers of asthma | [28]
16 | Discovery of suppression of miR processing machinery linked with early events of cytomegaloviral infection in human cells | [68]
17 | Profiling proteomes of tumor exosomes | [25]
18 | Finding geroprotector (antiaging) molecules based on known molecular targets of chemicals and gene expression profiles of young, ageing and senescent stem cells | [23]
19 | Ranking rates of molecular pathways evolution based on TFBS mapping within transposable elements | [50]
20 | Measuring impact of total miR profiles on molecular pathways | [49]

However, incorporation of new pathways specific to other groups of organisms will clearly expand applicability of Oncobox pathway scoring and related methods to novel species, not necessarily vertebrates or even eukaryotes.


Acknowledgments

This study was supported by the Oncobox research program in oncology and by Russian Science Foundation grant 18-15-00061.

References

1. Blagosklonny MV (2013) MTOR-driven quasi-programmed aging as a disposable soma theory: blind watchmaker vs. intelligent designer. Cell Cycle 12:1842–1847
2. Hanahan D, Weinberg RA (2000) The hallmarks of cancer. Cell 100:57–70
3. Hanahan D, Weinberg RA (2011) Hallmarks of cancer: the next generation. Cell 144:646–674
4. Sonnenschein C, Soto AM (2013) The aging of the 2000 and 2011 hallmarks of cancer reviews: a critique. J Biosci 38:651–663
5. Aliper AM, Korzinkin MB, Kuzmina NB, Zenin AA, Venkova LS, Smirnov PY et al (2017) Mathematical justification of expression-based pathway activation scoring (PAS). Methods Mol Biol 1613:31–51
6. Borisov N, Aksamitiene E, Kiyatkin A, Legewie S, Berkhout J, Maiwald T et al (2009) Systems-level interactions between insulin-EGF networks amplify mitogenic signaling. Mol Syst Biol 5:256
7. Kholodenko BN, Demin OV, Moehren G, Hoek JB (1999) Quantification of short term signaling by the epidermal growth factor receptor. J Biol Chem 274:30169–30181
8. Kiyatkin A, Aksamitiene E, Markevich NI, Borisov NM, Hoek JB, Kholodenko BN (2006) Scaffolding protein Grb2-associated binder 1 sustains epidermal growth factor-induced mitogenic and survival signaling by multiple positive feedback loops. J Biol Chem 281:19925–19938
9. Kuzmina NB, Borisov NM (2011) Handling complex rule-based models of mitogenic cell signaling (on the example of ERK activation upon EGF stimulation). Int Proc Chem Biol Environ Eng 5:76–82
10. Marshall CJ (1995) Specificity of receptor tyrosine kinase signaling: transient versus sustained extracellular signal-regulated kinase activation. Cell 80:179–185
11. Disanza A, Frittoli E, Palamidessi A, Scita G (2009) Endocytosis and spatial restriction of cell signaling. Mol Oncol 3:280–296
12. Filteau M, Diss G, Torres-Quiroz F, Dubé AK, Schraffl A, Bachmann VA et al (2015) Systematic identification of signal integration by protein kinase A. Proc Natl Acad Sci 112:4501–4506
13. Branzei D, Foiani M (2008) Regulation of DNA repair throughout the cell cycle. Nat Rev Mol Cell Biol 9:297–308
14. Malumbres M, Barbacid M (2009) Cell cycle, CDKs and cancer: a changing paradigm. Nat Rev Cancer 9:153–166
15. Vermeulen K, Van Bockstaele DR, Berneman ZN (2003) The cell cycle: a review of regulation, deregulation and therapeutic targets in cancer. Cell Prolif 36:131–149
16. UniProt Consortium (2011) Ongoing and future developments at the Universal Protein Resource. Nucleic Acids Res 39:D214–D219
17. Mathivanan S, Periaswamy B, Gandhi TKB, Kandasamy K, Suresh S, Mohmood R et al (2006) An evaluation of human protein-protein interaction data in the public domain. BMC Bioinformatics 7(Suppl 5):S19
18. Croft D, Mundo AF, Haw R, Milacic M, Weiser J, Wu G et al (2014) The reactome pathway knowledgebase. Nucleic Acids Res 42:D472–D477
19. Elkon R, Vesterman R, Amit N, Ulitsky I, Zohar I, Weisz M et al (2008) SPIKE—a database, visualization and analysis tool of cellular signaling pathways. BMC Bioinformatics 9:110
20. Nakaya A, Katayama T, Itoh M, Hiranuka K, Kawashima S, Moriya Y et al (2013) KEGG OC: a large-scale automatic construction of taxonomy-based ortholog clusters. Nucleic Acids Res 41:D353–D357
21. Nikitin A, Egorov S, Daraselia N, Mazo I (2003) Pathway studio—the analysis and navigation of molecular networks. Bioinformatics 19:2155–2157
22. Bauer-Mehren A, Furlong LI, Sanz F (2009) Pathway databases and tools for their exploitation: benefits, current limitations and challenges. Mol Syst Biol 5:290
23. Aliper A, Belikov AV, Garazha A, Jellen L, Artemov A, Suntsova M et al (2016) In search for geroprotectors: in silico screening and in vitro validation of signalome-level mimetics of young healthy state. Aging (Albany NY) 8:2127–2152


24. Aliper AM, Csoka AB, Buzdin A, Jetka T, Roumiantsev S, Moskalev A et al (2015) Signaling pathway activation drift during aging: Hutchinson-Gilford Progeria syndrome fibroblasts are comparable to normal middle-age and old-age cells. Aging (Albany NY) 7:26–37
25. Shtam T, Naryzhny S, Samsonov R, Karasik D, Mizgirev I, Kopylov A et al (2019) Plasma exosomes stimulate breast cancer metastasis through surface interactions and activation of FAK signaling. Breast Cancer Res Treat 174:129–141
26. Petrov I, Suntsova M, Ilnitskaya E, Roumiantsev S, Sorokin M, Garazha A et al (2017) Gene expression and molecular pathway activation signatures of MYCN-amplified neuroblastomas. Oncotarget 8:83768–83780
27. Petrov I, Suntsova M, Mutorova O, Sorokin M, Garazha A, Ilnitskaya E et al (2016) Molecular pathway activation features of pediatric acute myeloid leukemia (AML) and acute lymphoblast leukemia (ALL) cells. Aging (Albany NY) 8:2936–2947
28. Alexandrova E, Nassa G, Corleone G, Buzdin A, Aliper AM, Terekhanova N et al (2016) Large-scale profiling of signalling pathways reveals an asthma specific signature in bronchial smooth muscle cells. Oncotarget 7:25150–25161
29. Makarev E, Izumchenko E, Aihara F, Wysocki PT, Zhu Q, Buzdin A et al (2016) Common pathway signature in lung and liver fibrosis. Cell Cycle 15:1667–1673
30. Buzdin A, Sorokin M, Garazha A, Sekacheva M, Kim E, Zhukov N et al (2018) Molecular pathway activation – new type of biomarkers for tumor morphology and personalized selection of target drugs. Semin Cancer Biol 53:110–124
31. da Silveira WA, Hazard ES, Chung D, Hardiman G (2019) Molecular profiling of RNA tumors using high-throughput RNA sequencing: from raw data to systems level analyses. Methods Mol Biol 1908:185–204
32. Crow M, Gillis J (2019) Single cell RNA-sequencing: replicability of cell types. Curr Opin Neurobiol 56:69–77
33. Otto GM, Brar GA (2018) Seq-ing answers: uncovering the unexpected in global gene regulation. Curr Genet 64:1183–1188
34. Yang KC, Sathiyaseelan P, Ho C, Gorski SM (2018) Evolution of tools and methods for monitoring autophagic flux in mammalian cells. Biochem Soc Trans 46:97–110
35. Zhang P, Lehmann BD, Shyr Y, Guo Y (2017) The utilization of formalin fixed-paraffin-embedded specimens in high throughput genomic studies. Int J Genomics 2017:1–9
36. Gaffney EF, Riegman PH, Grizzle WE, Watson PH (2018) Factors that drive the increasing use of FFPE tissue in basic and translational cancer research. Biotechnol Histochem 93:373–386
37. Peters DG, Yatsenko SA, Surti U, Rajkovic A (2015) Recent advances of genomic testing in perinatal medicine. Semin Perinatol 39:44–54
38. Odriozola L, Corrales FJ (2015) Discovery of nutritional biomarkers: future directions based on omics technologies. Int J Food Sci Nutr 66(Suppl 1):S31–S40
39. Vivar JC, Pemu P, McPherson R, Ghosh S (2013) Redundancy control in pathway databases (ReCiPa): an application for improving gene-set enrichment analysis in Omics studies and “Big data” biology. OMICS 17:414–422
40. Buzdin AA, Prassolov V, Zhavoronkov AA, Borisov NM (2017) Bioinformatics meets biomedicine: OncoFinder, a quantitative approach for interrogating molecular pathways using gene expression data. In: Tatarinova TV, Nikolsky Y (eds) Biological networks and pathway analysis. Springer, New York, NY, pp 53–83
41. Khatri P, Sirota M, Butte AJ (2012) Ten years of pathway analysis: current approaches and outstanding challenges. PLoS Comput Biol 8:e1002375
42. Khatri P, Drăghici S (2005) Ontological analysis of gene expression data: current tools, limitations, and open problems. Bioinformatics 21:3587–3595
43. Tian L, Greenberg SA, Kong SW, Altschuler J, Kohane IS, Park PJ (2005) Discovering statistically significant pathways in expression profiling studies. Proc Natl Acad Sci U S A 102:13544–13549
44. Mitrea C, Taghavi Z, Bokanizad B, Hanoudi S, Tagett R, Donato M et al (2013) Methods and approaches in the topology-based analysis of biological pathways. Front Physiol 4:278
45. Afsari B, Geman D, Fertig EJ (2014) Learning dysregulated pathways in cancers from differential variability analysis. Cancer Informat 13:61
46. Zhang J, Li J, Deng H-W (2009) Identifying gene interaction enrichment for gene expression data. PLoS One 4:e8064
47. Buzdin AA, Zhavoronkov AA, Korzinkin MB, Venkova LS, Zenin AA, Smirnov PY et al (2014) Oncofinder, a new method for the analysis of intracellular signaling pathway activation using transcriptomic data. Front Genet 5:55

48. Ozerov IV, Lezhnina KV, Izumchenko E, Artemov AV, Medintsev S, Vanhaelen Q et al (2016) In silico pathway activation network decomposition analysis (iPANDA) as a method for biomarker development. Nat Commun 7:13427
49. Artcibasova AV, Korzinkin MB, Sorokin MI, Shegay PV, Zhavoronkov AA, Gaifullin N et al (2016) MiRImpact, a new bioinformatic method using complete microRNA expression profiles to assess their overall influence on the activity of intracellular molecular pathways. Cell Cycle 15:689–698
50. Nikitin D, Penzar D, Garazha A, Sorokin M, Tkachev V, Borisov N et al (2018) Profiling of human molecular pathways affected by retrotransposons at the level of regulation by transcription factor proteins. Front Immunol 9:30
51. Zolotovskaia MA, Sorokin MI, Roumiantsev SA, Borisov NM, Buzdin AA (2018) Pathway instability is an effective new mutation-based type of cancer biomarkers. Front Oncol 8:658
52. Zolotovskaia M, Sorokin M, Garazha A, Borisov N, Buzdin A (2020) Molecular pathway analysis of mutation data for biomarkers discovery and scoring of target cancer drugs. In: Astakhova K, Bukhari SA (eds) Nucleic acid detection and structural investigations. Methods and protocols. Springer, New York
53. Borisov N, Suntsova M, Sorokin M, Garazha A, Kovalchuk O, Aliper A et al (2017) Data aggregation at the level of molecular pathways improves stability of experimental transcriptomic and proteomic data. Cell Cycle 16:1810–1823
54. Buzdin AA, Zhavoronkov AA, Korzinkin MB, Roumiantsev SA, Aliper AM, Venkova LS et al (2014) The OncoFinder algorithm for minimizing the errors introduced by the high-throughput methods of transcriptome analysis. Front Mol Biosci 1:8
55. Gao S, Wang X (2007) TAPPA: topological analysis of pathway phenotype association. Bioinformatics 23:3100–3102
56. Ibrahim MA-H, Jassim S, Cawthorne MA, Langlands K (2012) A topology-based score for pathway enrichment. J Comput Biol 19:563–573
57. Draghici S, Khatri P, Tarca AL, Amin K, Done A, Voichita C et al (2007) A systems biology approach for pathway level analysis. Genome Res 17:1537–1545
58. Tarca AL, Draghici S, Khatri P, Hassan SS, Mittal P, Kim J-S et al (2009) A novel signaling pathway impact analysis. Bioinformatics 25:75–82
59. Rudy J, Valafar F (2011) Empirical comparison of cross-platform normalization methods for gene expression data. BMC Bioinformatics 12:467
60. Cancer Genome Atlas Research Network (2008) Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature 455:1061–1068
61. Demetrashvili N, Kron K, Pethe V, Bapat B, Briollais L (2010) How to deal with batch effect in sequential microarray experiments? Mol Inform 29:387–393
62. Borisov N, Shabalina I, Tkachev V, Sorokin M, Garazha A, Pulin A et al (2019) Shambhala: a platform-agnostic data harmonizer for gene expression data. BMC Bioinformatics 20:66
63. Huang H, Lu X, Liu Y, Haaland P, Marron JS (2012) R/DWD: distance-weighted discrimination for classification, visualization and batch adjustment. Bioinformatics 28:1182–1183
64. Shabalin AA, Tjelmeland H, Fan C, Perou CM, Nobel AB (2008) Merging two gene-expression studies via cross-platform normalization. Bioinformatics 24:1154–1160
65. Deshwar AG, Morris Q (2014) PLIDA: cross-platform gene expression normalization using perturbed topic models. Bioinformatics 30:956–961
66. Bolstad BM, Irizarry RA, Astrand M, Speed TP (2003) A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics 19:185–193
67. Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15:550
68. Buzdin AA, Artcibasova AV, Fedorova NF, Suntsova MV, Garazha AV, Sorokin MI et al (2016) Early stage of cytomegalovirus infection suppresses host microRNA expression regulation in human fibroblasts. Cell Cycle 15:3378–3389
69. Kadurin A, Nikolenko S, Khrabrov K, Aliper A, Zhavoronkov A (2017) druGAN: an advanced generative adversarial autoencoder model for de novo generation of new molecules with desired molecular properties in silico. Mol Pharm 14:3098–3104
70. Spirin P, Lebedev T, Orlova N, Morozov A, Poymenova N, Dmitriev SE et al (2017) Synergistic suppression of t(8;21)-positive leukemia cell growth by combining oridonin and MAPK1/ERK2 inhibitors. Oncotarget 8:56991–57002


71. Vanhaelen Q, Mamoshina P, Aliper AM, Artemov A, Lezhnina K, Ozerov I et al (2017) Design of efficient computational workflows for in silico drug repurposing. Drug Discov Today 22:210–222
72. Sorokin M, Kholodenko R, Suntsova M, Malakhova G, Garazha A, Kholodenko I et al (2018) Oncobox bioinformatical platform for selecting potentially effective combinations of target cancer drugs using high-throughput gene expression data. Cancers (Basel) 10:E365
73. Spirin PV, Lebedev TD, Orlova NN, Gornostaeva AS, Prokofjeva MM, Nikitenko NA et al (2014) Silencing AML1-ETO gene expression leads to simultaneous activation of both pro-apoptotic and proliferation signaling. Leukemia 28:2222–2228
74. Borisov N, Tkachev V, Suntsova M, Kovalchuk O, Zhavoronkov A, Muchnik I et al (2018) A method of gene expression data transfer from cell lines to cancer patients for machine-learning prediction of drug efficiency. Cell Cycle 17:486–491
75. Borisov N, Tkachev V, Muchnik I, Buzdin A (2017) Individual drug treatment prediction in oncology based on machine learning using cell culture gene expression data. ACM Press, New York, NY, pp 1–6
76. Borisov N, Tkachev V, Buzdin A, Muchnik I (2018) Prediction of drug efficiency by transferring gene expression data from cell lines to cancer patients. In: Rozonoer L, Mirkin B, Muchnik I (eds) Braverman readings in machine learning. Key ideas from inception to current state. Springer, Cham, pp 201–212
77. Vishniakova KS, Babizhaev MA, Aliper AM, Buzdin AA, Kudriavtseva AV, Egorov EE (2014) Stimulation of proliferation by carnosine: cellular and transcriptome approaches. Mol Biol (Mosk) 48:824–833
78. Emelianova AA, Kuzmin DV, Panteleev PV, Sorokin M, Buzdin AA, Ovchinnikova TV (2018) Anticancer activity of the goat antimicrobial peptide ChMAP-28. Front Pharmacol 9:1501
79. Marggraf MB, Panteleev PV, Emelianova AA, Sorokin MI, Bolosov IA, Buzdin AA et al (2018) Cytotoxic potential of the novel horseshoe crab peptide polyphemusin III. Mar Drugs 16:E466
80. West MD, Labat I, Sternberg H, Larocca D, Nasonkin I, Chapman KB et al (2018) Use of deep neural network ensembles to identify embryonic-fetal transition markers: repression of COX7A1 in embryonic and cancer cells. Oncotarget 9:7796–7811
81. Aliper AM, Frieden-Korovkina VP, Buzdin A, Roumiantsev SA, Zhavoronkov A (2014) Interactome analysis of myeloid-derived suppressor cells in murine models of colon and breast cancer. Oncotarget 5:11345–11353
82. Sarhan J, Liu BC, Muendlein HI, Weindel CG, Smirnova I, Tang AY et al (2019) Constitutive interferon signaling maintains critical threshold of MLKL expression to license necroptosis. Cell Death Differ 26:332–347
83. Larkin B, Ilyukha V, Sorokin M, Buzdin A, Vannier E, Poltorak A (2017) Cutting Edge: activation of STING in T cells induces type I IFN responses and cell death. J Immunol 199:397–402
84. Ram DR, Ilyukha V, Volkova T, Buzdin A, Tai A, Smirnova I et al (2016) Balance between short and long isoforms of cFLIP regulates Fas-mediated apoptosis in vivo. Proc Natl Acad Sci U S A 113:1606–1611
85. Wirsching A, Melloul E, Lezhnina K, Buzdin AA, Ogunshola OO, Borger P et al (2017) Temporary portal vein embolization is as efficient as permanent portal vein embolization in mice. Surgery 162:68–81
86. Sorokin M, Kholodenko R, Grekhova A, Suntsova M, Pustovalova M, Vorobyeva N et al (2018) Acquired resistance to tyrosine kinase inhibitors may be linked with the decreased sensitivity to X-ray irradiation. Oncotarget 9:5111–5124
87. de Klerk E, Fokkema IFAC, Thiadens KAMH, Goeman JJ, Palmblad M, den Dunnen JT et al (2015) Assessing the translational landscape of myogenic differentiation by ribosome profiling. Nucleic Acids Res 43:4408–4428
88. Jovčevska I, Zupanec N, Urlep Ž, Vranič A, Matos B, Stokin CL et al (2017) Differentially expressed proteins in glioblastoma multiforme identified with a nanobody-based anti-proteome approach and confirmed by OncoFinder as possible tumor-class predictive biomarker candidates. Oncotarget 8:44141–44158
89. Shepelin D, Korzinkin M, Vanyushina A, Aliper A, Borisov N, Vasilov R et al (2016) Molecular pathway activation features linked with transition from normal skin to primary and metastatic melanomas in human. Oncotarget 7:656–670

Chapter 16

Molecular Pathway Analysis of Mutation Data for Biomarkers Discovery and Scoring of Target Cancer Drugs

Marianna Zolotovskaia, Maxim Sorokin, Andrew Garazha, Nikolay Borisov, and Anton Buzdin

Abstract

DNA mutations govern cancer development. Cancer mutation profiles vary dramatically among individuals. In some cases, they may serve as predictors of disease progression and response to therapies. However, the biomarker potential of cancer mutations can be enhanced dramatically (by several orders of magnitude) by applying a molecular pathway-based approach. We developed the Oncobox system for calculating pathway instability (PI) values for molecular pathways; these values are aggregated mutation frequencies of the pathway members, normalized to gene lengths and to the number of genes in the pathway. PI scores can be effective biomarkers in different types of comparisons, for example, as cancer type biomarkers and as predictors of tumor response to target therapies. The latter option is implemented using mutation drug score (MDS) values, which algorithmically rank the capacity of drugs to interfere with the mutated molecular pathways. Here, we describe the mathematical basis and algorithms for the calculation, validation, and implementation of PI and MDS values. An example analysis is provided encompassing 5956 human tumor mutation profiles of 15 cancer types from The Cancer Genome Atlas (TCGA) project, which together comprise 2,316,670 mutations in 19,872 genes and 1748 molecular pathways, thus enabling the ranking of 128 clinically approved target drugs. Our results show that the Oncobox PI and MDS approaches are highly useful for basic and applied aspects of molecular oncology and pharmacology research.

Key words Cancer, DNA mutation, Molecular pathways, Pathway instability, Biomarker, Target drugs, Tyrosine kinase inhibitors, Nibs, Mabs

Abbreviations

CDS length    Coding DNA sequence length
COSMIC        Catalogue of Somatic Mutations in Cancer
FDA           Food and Drug Administration
ICGC          International Cancer Genome Consortium
MDS           Mutational Drug Scores
MR            Mutation rate
NIH           The National Institutes of Health
nMR           Normalized mutation rate
PI            Pathway instability
ROC AUC       Receiver operator characteristics area under the curve
TC            Target conversion
TCGA          The Cancer Genome Atlas

Kira Astakhova and Syeda Atia Bukhari (eds.), Nucleic Acid Detection and Structural Investigations: Methods and Protocols, Methods in Molecular Biology, vol. 2063, https://doi.org/10.1007/978-1-0716-0138-9_16, © Springer Science+Business Media, LLC, part of Springer Nature 2020

1 Introduction

Cancer is characterized by frequent accumulation of genetic mutations [1]. Mutations driving cancer development vary dramatically among individual cancers [2]. For example, in the high-throughput project The Cancer Genome Atlas (TCGA), very high molecular heterogeneity has been documented, not only between different cancer types but also among individual tumors of the same type [3–6]. This made it possible to advance the understanding of the molecular mechanisms of tumorigenesis by profiling the landscapes of pathological somatic mutations. Many of the alterations revealed appeared promising for molecular cancer diagnostics to improve and personalize treatment regimens [7, 8].

For several decades, chemotherapy has remained a key treatment for many cancers, often with impressive success rates [9, 10]. However, most advanced cancers remain incurable and/or unresponsive to standard chemotherapy approaches, developing resistance to treatment and relapsing [11, 12]. A new generation of drugs has been developed that specifically target functional tumor marker molecules. These medicines, termed target drugs, have one or a few specific molecular targets in a cell [13–16]. They offer greater selectivity and generally lower side toxicity than conventional chemotherapy [17]. Structurally, they can be either low molecular mass inhibitor molecules or monoclonal antibodies [18]. The repertoire of their molecular targets is constantly growing and now includes various tyrosine kinases [19], vascular endothelial growth factor [20], immune checkpoint molecules [21], poly(ADP-ribose) polymerase [22], mTOR [23], hormone receptors [24], proteasomal components [25], ganglioside GD2 [26], and cancer-specific fusion proteins [27]. Target drugs are highly beneficial for many cancers (e.g., trastuzumab (an anti-HER2 monoclonal antibody) and related medications for patients with metastatic HER2-positive breast cancer [28, 29], and immune checkpoint inhibitors and anti-BRAF target drugs in melanoma [30, 31]). The efficiencies of target drugs vary from patient to patient [32], and the results of clinical trials clearly show that drugs considered inefficient for an overall cohort of a given cancer type may be beneficial for a small fraction of the patients [33].


There are currently more than 200 different anticancer target drugs approved in different countries, and this number grows every year [34, 35]. However, predictive molecular diagnostic tests are available for only a minor fraction of drugs, in a minor fraction of cancer types [35–37]. This frequently makes the clinician’s decision on drug prescription a difficult task. It is therefore of great importance to identify robust predictive biomarkers of target drug efficacy for as many cancer-drug combinations as possible.

Identification of informative and robust genetic markers of cancer is one of the priority tasks of contemporary biomedicine. Several cancer-specific mutations and gene fusions are already used in molecular diagnostics, but the problem of finding new relevant and informative cancer markers with higher sensitivity and specificity is largely unsolved [8, 38]. Further accumulation of cancer type- and condition-specific biomarkers can be a key to more effective, personalized treatment [39]. Traditionally, the focus has been on the roles of individual genes [40, 41]. However, this approach cannot always address tumor development in a comprehensive way, most probably because genes function as the nodes of molecular pathways, where the roles of individual genes are highly interconnected and frequently interchangeable [42].

Previously, several approaches for measuring molecular pathway activities were proposed for expression data at the mRNA, protein, and microRNA levels [35, 43–47]. The analysis of molecular pathways at the expression level was successfully applied in cancer investigations [48–50]. The extent of pathway activation, called the pathway activation level (PAL), is a cumulative value aggregating the relative expression levels of the corresponding gene products in relation to their functions in a pathway [51]. For most cancer types, PAL biomarkers were more robust than those based on individual genes [52, 53]. This feature is fundamentally linked with the property of combining individual expression levels, which decreases experimental errors [45]. PAL biomarkers were further generated for a plethora of normal and pathological conditions, including cancer response to treatments [16, 54–57]. However, this approach had not been used for mutation data.

We recently proposed a new type of molecular biomarker based on the impact of DNA mutations on molecular pathways [58]. We introduced a quantitative metric termed pathway instability (PI), which is proportional to the relative number of mutated genes in a pathway. Using high-throughput gene mutation profiles for 5956 patients with 15 different cancer types from the TCGA collection, we screened 2,316,670 mutations in 19,872 genes and 1748 molecular pathways. The robustness of the mutation-based molecular pathway approach dramatically exceeded that of individual gene biomarkers. We could generate a list of 660 novel robust cancer type-specific pathway mutation biomarkers.


Marianna Zolotovskaia et al.

To assess the capacity of pathway-based mutation biomarkers to predict the efficiencies of target drugs, we next proposed and tested ten alternative pathway-based Oncobox Drug Scoring algorithms utilizing mutation data. These algorithms were applied to 3800 published mutation profiles representing eight cancer types and then validated using published clinical trials data. We showed that several mutation-based Drug Scoring methods were effective for predicting overall drug responses, as evidenced by statistically significant correlations between the Drug Score ratings of individual drugs and their therapeutic success, reflected by the completed phases of clinical trials for the respective cancer types. We also used the best Drug Scoring algorithm to simulate all known protein-coding genes as potential drug targets. We found that many of the proteins predicted by the Oncobox algorithm were highly congruent with the molecular targets already exploited by real anticancer drugs.

2 Materials

2.1 Initial Mutation Data

The human mutation dataset was obtained from the Catalogue Of Somatic Mutations In Cancer (COSMIC) [59]. COSMIC aggregates and annotates mutation data from various sources and provides lists of verified somatic mutations. The data were downloaded from the COSMIC website, version 76. The complete dataset includes 6,651,236 somatic mutation records for 20,528 genes in 19,434 tumor samples of 37 primary localizations.

2.2 Algorithm Validation Dataset

For validation of the drug scoring algorithms, we extracted mutation data only for the primary localizations that contained at least 100 samples indexed in COSMIC and originally taken from The Cancer Genome Atlas (TCGA) project [59, 60]. The TCGA mutation profiles were selected because they represented the largest collection of uniformly treated biosamples profiled using the same deep sequencing platforms [60]. For the Pathway Instability algorithm validation, we analyzed a total of 5956 genetic profiles corresponding to fifteen primary tumor localizations: breast, central nervous system, cervical, endometrium, ovaries, prostate, kidney, urinary tract, liver, hematopoietic and lymphoid tissue, stomach, large intestine, lung, thyroid, and skin. The database accession numbers of the samples used were published in [58]. For the Drug Scoring algorithm validation, we took 3800 individual mutation profiles of eight cancer types: central nervous system, kidney, large intestine (including cecum, colon, and rectum), liver, lung, ovary, stomach, and thyroid gland (Table 1). To obtain a mutation profile for each tumor, the COSMIC data were processed with the script Cosmic v76 processing, written in R (version 3.4.3) and available through GitLab at https://gitlab.com/White_Knight/cosmic76_processing/tree/master (Accessed 22 Oct 2018). The processed data are fully available in the previous publication [61].

Table 1 The structure of the MDS validation dataset

| Cancer type | Number of samples | Disease abbreviation |
|---|---|---|
| Central nervous system | 657 | Gliomas, GL |
| Kidney | 601 | Kidney cancer, KC |
| Large intestine | 620 | Colorectal cancer, CRC |
| Liver | 188 | Hepatic cancer, HC |
| Lung | 569 | Non-small cell lung cancer, NSCLC |
| Ovary | 474 | Ovarian cancer, OVC |
| Stomach | 288 | Stomach cancer, STC |
| Thyroid | 403 | Thyroid cancer, THC |

2.3 Molecular Targets Interrogation Dataset

The full COSMIC dataset was used to investigate the effectiveness of potential target drugs. The samples related to cell cultures or tumor xenografts were excluded to standardize the analysis. We also recommend excluding records having the following marks in the “Sample source” field: organoid culture, short-term culture, cell line, xenograft. In total, the final dataset contained 6,027,881 mutation records for 18,273 tumor samples of 35 primary localizations. To obtain mutation rates for all genes, the COSMIC data were processed using the R script available through GitLab at https://gitlab.com/White_Knight/cosmic76_processing/tree/master (Accessed 22 Oct 2018). The processed data are fully available in the previous publication [58].

2.4 Clinical Trials Data

Clinical trials data were extracted from the National Institutes of Health website ClinicalTrials.gov, available at https://clinicaltrials.gov/ (Accessed 25 Jul 2017), and from the US Food and Drug Administration home page, available at https://www.fda.gov/ (Accessed 25 Jul 2017). The web data were curated manually. The processed clinical trials data used for the correlation studies are fully available in the previous publication [61].

2.5 Molecular Pathways Data

The gene content data for 3125 human molecular pathways used to calculate mutation drug scores were extracted from Reactome [62], NCI Pathway Interaction Database [63], Kyoto Encyclopedia of Genes and Genomes [64], HumanCyc [65], Biocarta [66], and Qiagen (https://www.qiagen.com/us/shop/genes-and-pathways/pathway-central/; accessed 19 Sep 2018). For drug score calculation, we used only the 1752 pathways that include ten or more gene products, because a poor theoretical data aggregation effect was previously reported for smaller pathways [45]. The information about the molecular specificities of 128 target anticancer drugs was obtained from the DrugBank [34] and ConnectivityMap [67] databases.

2.6 Data Presentation

The results were visualized using the ggplot2 package [68].

3 Methods

3.1 Pathway Instability (PI) Scoring

Pathway instability (PI) scoring for a molecular pathway depends on the mutation frequencies in the genes involved. To assess the mutation burden of individual genes, we introduced the Mutation rate (MR) value, calculated according to the formula:

$$\mathrm{MR}_n = \frac{N_{\mathrm{mut}}(n, g)}{N_{\mathrm{samples}}(g)},$$

where $\mathrm{MR}_n$ is the Mutation rate of a gene n; $N_{\mathrm{mut}}(n, g)$ is the total number of mutations identified for a gene n in a group of samples g; and $N_{\mathrm{samples}}(g)$ is the number of samples in a group g. The MR values strongly positively correlate with the gene coding DNA sequence (CDS) lengths (Spearman correlation obtained in a model experiment 0.798, p < 2e-16; Fig. 1a [58]), for example, because longer genes are more likely to accumulate mutations. To avoid this bias, a length-normalized value termed Normalized mutation rate (nMR) was introduced, expressed by the formula:

$$\mathrm{nMR}_n = \frac{1000 \cdot \mathrm{MR}_n}{\mathrm{Length}_{\mathrm{CDS}}(n)},$$

where $\mathrm{nMR}_n$ is the Normalized mutation rate of a gene n; $\mathrm{MR}_n$ is the Mutation rate of a gene n; and $\mathrm{Length}_{\mathrm{CDS}}(n)$ is the length of the CDS of a gene n in nucleotides. nMR did not correlate with the CDS size of the respective genes (Fig. 1b; Spearman correlation 0.151) [58]. A Pathway instability (PI) score can next be calculated for every pathway to estimate its relative enrichment in mutations. PI is expressed by the formula:

$$\mathrm{PI}_p = \frac{\sum_n \mathrm{nMR}_n \cdot \mathrm{PG}_{p,n}}{N_p},$$

where $\mathrm{PI}_p$ is the Pathway instability score for a pathway p; $\mathrm{nMR}_n$ is the Normalized mutation rate of gene n; $\mathrm{PG}_{p,n}$ is the pathway-gene indicator, which equals 1 if gene n is included in pathway p and 0 otherwise; and $N_p$ is the total number of gene products included in pathway p.

Fig. 1 Correlations with gene coding DNA sequence (CDS) lengths. (a) The correlation of Mutation rates (MR) and gene CDS lengths calculated for 5956 samples from fifteen tumor localizations. (b) Correlation of Normalized mutation rates (nMR) and gene CDS lengths calculated for the same biosamples

3.2 PI Analysis of Cancer Type-Specific Mutation Signatures
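As a concrete illustration of the MR, nMR, and PI definitions given in Subheading 3.1, the calculation can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' R pipeline: the function names, the toy mutation records, and the CDS lengths are all hypothetical and serve only to show the arithmetic.

```python
from collections import Counter


def mutation_rates(mutations, n_samples):
    """MR_n: mutation records per gene divided by the number of samples in the group."""
    counts = Counter(gene for _sample, gene in mutations)
    return {gene: count / n_samples for gene, count in counts.items()}


def normalized_mutation_rates(mr, cds_length):
    """nMR_n = 1000 * MR_n / Length_CDS(n), with the CDS length in nucleotides."""
    return {gene: 1000.0 * rate / cds_length[gene] for gene, rate in mr.items()}


def pathway_instability(nmr, pathway_genes):
    """PI_p: sum of nMR over the pathway members divided by the pathway size N_p."""
    return sum(nmr.get(gene, 0.0) for gene in pathway_genes) / len(pathway_genes)


# Hypothetical toy input: (sample, gene) mutation records and made-up CDS lengths.
mutations = [("s1", "TP53"), ("s1", "TTN"), ("s2", "TP53"),
             ("s2", "TTN"), ("s2", "TTN"), ("s3", "KRAS")]
cds_length = {"TP53": 1000, "TTN": 100000, "KRAS": 500}

mr = mutation_rates(mutations, n_samples=3)
nmr = normalized_mutation_rates(mr, cds_length)
# EGFR has no mutation records here, so it contributes 0 but still counts in N_p.
pi_score = pathway_instability(nmr, ["TP53", "KRAS", "EGFR"])
print(mr, nmr, pi_score)
```

In this toy example the long gene (TTN) has the highest MR but the lowest nMR, which is the length bias that the normalization is meant to remove.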

In total, PI scores for 1748 molecular pathways were calculated in 5956 tumor samples representing fifteen primary localizations: breast, central nervous system, cervical, endometrium, ovaries, prostate, kidney, urinary tract, liver, hematopoietic and lymphoid tissue, stomach, large intestine, lung, thyroid, and skin (data shown in [58]). Each tumor sample was characterized by nMR values for the individual genes and by PI values for the molecular pathways. As previously shown by principal component analysis (PCA), the complete sets of 19,872 nMR biomarkers and 1748 PI biomarkers could not distinguish between the above tumor localization types [58]. However, many molecular pathways had characteristic PI scores that were clearly distinctive of the different tumor types, as shown by high area under the ROC curve (AUC) values. The AUC value is a universal characteristic of biomarker robustness that depends on the sensitivity and specificity of the biomarker [69]. It varies in the range 0.5–1 and positively correlates with the quality of a biomarker. The AUC discrimination threshold is typically 0.7 or 0.75. Parameters with a greater AUC are considered good-quality biomarkers, and vice versa [70]. The ROC AUC test was performed in two ways: (1) by comparing every separately taken tumor type (localization) versus all other tumors, and (2) for all possible pairwise comparisons among the tumor types. In parallel, the same AUC tests were also performed for the nMR value of every gene in every sample [58]. The data analysis pipeline is schematized in Fig. 2. In this way, we could compare the biomarker potential of individual gene mutations (nMR) with that of the aggregated pathway-based mutation characteristics (PI). Our analysis revealed a dramatic advantage of the pathway-based (PI) over the gene-based (nMR) approach in finding good-quality biomarkers in all types of comparisons made. For example, in the analyses where one tumor localization was compared against the 14 others, the total number of good-quality (AUC > 0.75) biomarkers was 660 for the pathways (PI), compared to only six for the individual genes (nMR). Similarly, for the pairwise comparisons we identified a total of 32,594 good-quality PI biomarkers versus only 226 nMR biomarkers (Fig. 2). Considering that the initial number of potential pathway biomarkers (1748) was one order of magnitude lower than the number of gene biomarkers (19,872), this further strengthens the advantage of the PI-based approach.

3.2.1 Cancer Type-Specific PI Biomarkers
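The “one versus all” ROC AUC test used to select cancer type-specific biomarkers can be sketched as follows. This is an illustrative sketch only: the AUC is computed through its rank-based (Mann-Whitney) interpretation, the PI values and function names are hypothetical, and folding the AUC into the 0.5–1 range (so that both “high” and “low” biomarkers are scored) is an assumption made here for consistency with the range stated in the text.

```python
def roc_auc(positives, negatives):
    """AUC as the probability that a positive scores above a negative (ties count 0.5)."""
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def one_vs_rest_auc(scores_by_type, target):
    """Compare one cancer type against all others pooled together.

    Returns the folded AUC (0.5 to 1) and whether the score is 'high' or 'low'
    in the target type relative to the rest.
    """
    pos = scores_by_type[target]
    neg = [s for t, vals in scores_by_type.items() if t != target for s in vals]
    auc = roc_auc(pos, neg)
    return max(auc, 1.0 - auc), ("high" if auc >= 0.5 else "low")


# Hypothetical PI values of a single pathway in samples of three cancer types.
pi_by_type = {"CRC": [0.9, 0.8, 0.7], "THCA": [0.2, 0.3], "KIRC": [0.4, 0.75]}

crc_auc, crc_direction = one_vs_rest_auc(pi_by_type, "CRC")
thca_auc, thca_direction = one_vs_rest_auc(pi_by_type, "THCA")
is_good_biomarker = crc_auc > 0.75  # threshold used in the chapter
print(crc_auc, crc_direction, thca_auc, thca_direction, is_good_biomarker)
```

The same function applies unchanged to gene-level nMR values, which is how the pathway and gene approaches can be compared on an equal footing.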

The six cancer type-specific gene mutation biomarkers were APC for colorectal cancer, PTEN for endometrial cancer, BRAF for thyroid cancer, and MUC16, DNAH5, and TTN for cutaneous melanoma. These genes had already been linked with the respective cancer types in previous reports [71–74], but the overall number of six biomarkers may seem small given that they were obtained for fifteen comparisons (Fig. 2). In contrast, the pathway approach returned as many as 660 reliable biomarkers representing 428 pathways. Different localizations had very different numbers of marker pathways (Fig. 3a). Gene and pathway biomarkers were found for four and eight tumor localizations, respectively. For the first time, a database of pathway mutation biomarkers was reported for eight of the localizations investigated: colorectal, kidney, non-small cell lung, prostate, and thyroid cancers, hematological malignancies, cutaneous melanoma, and uterine corpus endometrial carcinoma [58]. Despite their large number (Fig. 3), the biomarkers were applicable to only eight of the 15 localizations investigated. Of those, colorectal cancer and endometrial carcinoma had the maximum numbers of PI biomarkers. The characteristic PI scores could be either higher or lower than the average values for all cancer types, thus resulting in “high” or “low” biomarkers (Fig. 3a). We next identified molecular pathways that were frequently mutated in all cancer types under investigation. To this end, 1145 pathways having AUC less than 0.7 in all tumor types were selected and intersected with the list of the top 10% of pathways sorted according to their average PI values. This two-step selection procedure made it possible to identify highly mutated pathways (top 10% average PI score) that carry a similar mutation load in the different cancer types (AUC < 0.7 in all cancer types). The final list of the 18 molecular pathways most frequently mutated in all cancer types is shown in Table 2. On the other hand, we also looked for the pathways that were most informative as biomarkers (AUC > 0.75) for the maximum number of cancer types. The top 25 most informative biomarker pathways are shown in Table 3.

Fig. 2 Pipeline of bioinformatic quality check for pathway- and gene-based mutation biomarkers

Fig. 3 (a) Numbers of mutation marker genes and molecular pathways in “one versus all” cancer type comparisons. Cancers are abbreviated as follows: BRCA, breast invasive carcinoma; LGG, brain lower grade glioma; GBM, glioblastoma multiforme; CESC, cervical squamous cell carcinoma and endocervical adenocarcinoma; UCEC, uterine corpus endometrial carcinoma; LAML, acute myeloid leukemia; KIRP, kidney renal papillary cell carcinoma; KIRC, kidney renal clear cell carcinoma; COADREAD, colorectal cancer; LICA, liver cancer; LIHC, liver hepatocellular carcinoma; LUAD, lung adenocarcinoma; LUSC, lung squamous cell carcinoma; OV, ovarian serous cystadenocarcinoma; PRAD, prostate adenocarcinoma; SKCM, skin cutaneous melanoma; STAD, stomach adenocarcinoma; THCA, thyroid carcinoma; BLCA, bladder urothelial carcinoma. (b) AUC distributions of pathways and genes. The AUC cut-off level for high-quality biomarkers is 0.75. AUC values were calculated for “one versus others” comparisons

3.2.2 Pairwise Comparison PI Biomarkers

The number of high-quality ROC AUC biomarkers found in pairwise comparisons is characteristic of tumor mutational landscapes and their relative similarities [58]. For example, a small number of specific biomarkers suggests little difference between the mutation profiles of two tumor types. In contrast, a high number of biomarkers means highly distinct mutation profiles. Based on the numbers of high-quality biomarkers, a distance matrix can be created and a clustering dendrogram can be built for the different tumor localizations. The distance matrix was built separately for the gene (nMR) and the pathway (PI) mutation biomarkers (Fig. 4a). The number of biomarkers that distinguish between two cancer types varies greatly depending on the localizations compared (Fig. 4a). The number of PI biomarkers was roughly two orders of magnitude higher than the number of nMR biomarkers (Fig. 4a) [58]. There was also an overall correlation between these numbers (correlation 0.69, p-value = 4e-16). For example, all the comparisons that had no good PI biomarkers also had no good nMR biomarkers (Fig. 4a). The numbers of biomarkers per cancer type differed by up to 784 for PI and by only up to 5 for nMR. The pathway-based approach, therefore, can be regarded as beneficial and much more informative than the gene-based mutation analysis.
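The pairwise comparison of Subheading 3.2.2, in which the number of good-quality biomarkers between two cancer types serves as a distance, can be sketched as below. This is a hedged illustration rather than the published implementation: the data layout, the function names, and the toy PI values are hypothetical, and the AUC is folded into the 0.5–1 range as an assumption.

```python
from itertools import combinations


def roc_auc(positives, negatives):
    """Rank-based AUC: share of (positive, negative) pairs won, ties count 0.5."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def pairwise_biomarker_counts(scores, threshold=0.75):
    """Count good-quality biomarkers for every pair of cancer types.

    scores[cancer][feature] is a list of per-sample values (PI or nMR).
    Returns {(cancer_a, cancer_b): number of features with folded AUC > threshold}.
    """
    counts = {}
    for a, b in combinations(sorted(scores), 2):
        n_good = 0
        for feature in scores[a]:
            auc = roc_auc(scores[a][feature], scores[b][feature])
            if max(auc, 1.0 - auc) > threshold:
                n_good += 1
        counts[(a, b)] = n_good
    return counts


# Hypothetical PI values for two pathways (p1, p2) in three cancer types.
toy_scores = {
    "A": {"p1": [1, 2, 3], "p2": [5, 5, 6]},
    "B": {"p1": [1, 2, 3], "p2": [1, 2, 2]},
    "C": {"p1": [7, 8, 9], "p2": [5, 6, 5]},
}
distance_counts = pairwise_biomarker_counts(toy_scores)
print(distance_counts)
```

A pair with a count of zero corresponds to two localizations whose mutation profiles cannot be told apart by any single feature, whereas a large count indicates highly distinct profiles.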

Table 2 Intersection of top 10% molecular pathways by average PI and molecular pathways with AUC < 0.7 for all cancer types

| # | Pathway ID (according to the source pathway database) | PI | Reference |
|---|---|---|---|
| 1 | NCI Aurora A signaling pathway (protein catabolic process) | 0.19 | http://apps.pathwaycommons.org/view?uri=http%3A%2F%2Fpathwaycommons.org%2Fpc2%2FPathway_83968ff327912d4d5a0ee5f31d27adf9 |
| 2 | Biocarta double stranded RNA induced gene expression Main Pathway | 0.19 | http://amp.pharm.mssm.edu/Harmonizome/gene_set/double+stranded+rna+induced+gene+expression/Biocarta+Pathways |
| 3 | Biocarta role of BRCA1 BRCA2 and ATR in cancer susceptibility Pathway (DNA replication termination) | 0.18 | http://amp.pharm.mssm.edu/Harmonizome/gene_set/role+of+brca1+brca2+and+atr+in+cancer+susceptibility/Biocarta+Pathways |
| 4 | Biocarta p53 signaling Main Pathway | 0.18 | http://amp.pharm.mssm.edu/Harmonizome/gene_set/p53+signaling+pathway/Biocarta+Pathways |
| 5 | NGF pathway apoptosis | 0.17 | https://www.qiagen.com/us/shop/genes-and-pathways/pathway-details/?pwid=320 |
| 6 | BRCA1 pathway mismatch repair | 0.16 | https://www.qiagen.com/us/shop/genes-and-pathways/pathway-details/?pwid=68 |
| 7 | ATM pathway G2-mitosis progression | 0.16 | https://www.qiagen.com/us/shop/genes-and-pathways/pathway-details/?pwid=46 |
| 8 | ATM pathway G2 M checkpoint arrest | 0.16 | https://www.qiagen.com/us/shop/genes-and-pathways/pathway-details/?pwid=46 |
| 9 | Biocarta tumor suppressor ARF inhibits ribosomal biogenesis main pathway | 0.16 | http://software.broadinstitute.org/gsea/msigdb/cards/BIOCARTA_ARF_PATHWAY |
| 10 | Biocarta ATM signaling main pathway | 0.14 | http://software.broadinstitute.org/gsea/msigdb/geneset_page.jsp?geneSetName=BIOCARTA_ATM_PATHWAY |
| 11 | NCI Hypoxic and oxygen homeostasis regulation of HIF1 alpha main pathway | 0.12 | http://www.pathwaycommons.org/pc/record2.do?id=517145 |
| 12 | HIF1 alpha pathway NOS Pathway | 0.11 | https://www.qiagen.com/us/shop/genes-and-pathways/pathway-details/?pwid=223 |
| 13 | HIF1 alpha pathway VEGF pathway | 0.09 | https://www.qiagen.com/us/shop/genes-and-pathways/pathway-details/?pwid=223 |
| 14 | HIF1 alpha pathway gene expression via JUN CREB3 | 0.09 | https://www.qiagen.com/us/shop/genes-and-pathways/pathway-details/?pwid=223 |
| 15 | Lipoxins influence on cell growth and proliferation | 0.05 | http://pathwaymaps.com/maps/2690 |
| 16 | NCI validated transcriptional targets of TAp63 isoforms pathway (pathway degradation of TP63) | 0.03 | http://www.pathwaycommons.org/pc/record2.do?id=517011 |
| 17 | NCI validated transcriptional targets of TAp63 isoforms pathway (metastasis) | 0.02 | http://www.pathwaycommons.org/pc/record2.do?id=517011 |
| 18 | D-Myoinositol 1,4,5-trisphosphate biosynthesis | 0.02 | https://metacyc.org/META/NEWIMAGE?type=PATHWAY&object=PWY-6351 |

3.2.3 Mutation Biomarker-Based Clustering of Cancers

Using the numbers of biomarkers with significant AUC as a proximity metric, clustering dendrograms were built for the fifteen cancer types under investigation. For clustering, we used Ward’s criterion and the ward.D2 algorithm [75]. The dendrograms generated for the nMR- and PI-specific metrics differed considerably. The nMR-based tree had a lower number of major clades (three vs. four for the PI data tree) and more degenerate distances between the cancers (Fig. 4b, c). These features most likely reflected the significantly lower numbers of nMR biomarkers compared to PI, which argues in favor of pathway- rather than gene-specific clustering based on mutation data. It should be noted that positions on the clades of the dendrograms were not linked with the anatomical proximities of the respective localizations in the human body [58].
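To make the clustering step more tangible, the sketch below performs agglomerative clustering on a small dissimilarity matrix using the Lance-Williams update for Ward's criterion on squared dissimilarities, reporting the square root as the merge height. This is an assumption-laden illustration in the spirit of the ward.D2 variant, not a reimplementation of the R routine used by the authors; the labels, the dissimilarities, and the function name are hypothetical.

```python
import math


def ward_linkage(labels, dist):
    """Agglomerative clustering with the Ward (Lance-Williams) update.

    labels: list of item names; dist[(a, b)]: dissimilarity for each pair with a < b.
    Returns a list of merges as (left_members, right_members, height).
    """
    clusters = {i: (name,) for i, name in enumerate(labels)}
    d2 = {}  # squared dissimilarities keyed by (smaller_id, larger_id)
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            a, b = sorted((labels[i], labels[j]))
            d2[(i, j)] = dist[(a, b)] ** 2
    merges = []
    next_id = len(labels)
    while len(clusters) > 1:
        i, j = min(d2, key=d2.get)  # closest pair of clusters
        dij = d2[(i, j)]
        ni, nj = len(clusters[i]), len(clusters[j])
        merges.append((clusters[i], clusters[j], math.sqrt(dij)))
        updated = {}
        for k in clusters:
            if k in (i, j):
                continue
            nk = len(clusters[k])
            dik = d2[tuple(sorted((i, k)))]
            djk = d2[tuple(sorted((j, k)))]
            # Lance-Williams formula for Ward's minimum variance criterion.
            updated[(k, next_id)] = ((ni + nk) * dik + (nj + nk) * djk - nk * dij) / (ni + nj + nk)
        d2 = {pair: v for pair, v in d2.items() if i not in pair and j not in pair}
        d2.update(updated)
        clusters[next_id] = clusters.pop(i) + clusters.pop(j)
        next_id += 1
    return merges


# Hypothetical biomarker counts used as dissimilarities between four cancer types.
toy_dist = {("A", "B"): 1, ("C", "D"): 2,
            ("A", "C"): 10, ("A", "D"): 10, ("B", "C"): 10, ("B", "D"): 10}
merges = ward_linkage(["A", "B", "C", "D"], toy_dist)
for left, right, height in merges:
    print(left, right, round(height, 3))
```

The list of merges with their heights is the information a dendrogram displays; in the toy input the two similar pairs are joined first and the two resulting clades are joined last.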

3.3 Algorithms of PI-Based Drug Scoring

The Oncobox Mutation Drug Scoring (MDS) method quantifies the mutation enrichment of the molecular pathways that contain molecular targets of a drug under investigation [61]. The scoring algorithms are based on the assumption that the greater the pathway instability (PI) of the respective pathways, the higher the expected drug efficiency. To link PI scores and estimated drug efficiencies, a basic formula for the calculation of the Mutation Drug Score (MDS), Eq. (1), was proposed.

Table 3 Top molecular pathways sorted by the number of cancer types where the PI score serves as a good biomarker distinguishing from the other fourteen localizations (AUC > 0.75)

| # | Pathway ID (according to the source pathway database) | Cancers | Reference |
|---|---|---|---|
| 1 | KEGG pathways in cancer main pathway | 6 | https://www.genome.jp/kegg-bin/show_pathway?hsa0 [52, 53] |
| 2 | AKT signaling pathway | 5 | https://www.qiagen.com/us/shop/genes-and-pathways/pathway-details/?pwid=23 |
| 3 | cAMP pathway | 5 | https://www.qiagen.com/br/shop/genes-and-pathways/pathway-details/?pwid=76 |
| 4 | ILK signaling pathway | 5 | https://www.qiagen.com/dk/shop/genes-and-pathways/pathway-details/?pwid=246 |
| 5 | ILK signaling pathway cytoskeletal adhesion complexes | 5 | https://www.qiagen.com/dk/shop/genes-and-pathways/pathway-details/?pwid=246 |
| 6 | KEGG neuroactive ligand receptor interaction main pathway | 5 | https://www.genome.jp/kegg-bin/show_pathway?map=hsa04080&show_description=show |
| 7 | PTEN pathway adhesion or migration | 5 | https://www.qiagen.com/us/shop/genes-and-pathways/pathway-details/?pwid=375 |
| 8 | PTEN pathway angiogenesis and tumorigenesis | 5 | https://www.qiagen.com/us/shop/genes-and-pathways/pathway-details/?pwid=375 |
| 9 | PTEN pathway Ca2+ signaling | 5 | https://www.qiagen.com/us/shop/genes-and-pathways/pathway-details/?pwid=375 |
| 10 | ERK signaling pathway | 4 | https://www.qiagen.com/fr/shop/genes-and-pathways/pathway-details/?pwid=162 |
| 11 | ILK signaling pathway epithelial mesenchymal transition tubulointerstitial fibrosis | 4 | https://www.qiagen.com/dk/shop/genes-and-pathways/pathway-details/?pwid=246 |
| 12 | ILK signaling pathway migration vasculogenesis | 4 | https://www.qiagen.com/dk/shop/genes-and-pathways/pathway-details/?pwid=246 |
| 13 | KEGG ECM receptor interaction main pathway | 4 | https://www.genome.jp/kegg-bin/show_pathway?hsa04512 |
| 14 | KEGG HTLV I infection main pathway | 4 | https://www.genome.jp/kegg-bin/show_pathway?hsa05166 |
| 15 | KEGG MAPK signaling main pathway | 4 | https://www.genome.jp/kegg/pathway/hsa/hsa04010.html |
| 16 | KEGG Olfactory transduction main pathway | 4 | https://www.genome.jp/kegg-bin/show_pathway?map=hsa04740&show_description=show |
| 17 | KEGG Protein digestion and absorption main pathway | 4 | https://www.genome.jp/kegg-bin/show_pathway?map=hsa04974&show_description=show |
| 18 | KEGG sphingolipid signaling main pathway | 4 | https://www.genome.jp/kegg-bin/show_pathway?map=hsa04071&show_description=show |
| 19 | MAPK signaling pathway | 4 | https://www.qiagen.com/us/shop/genes-and-pathways/pathway-details/?pwid=282 |
| 20 | MTOR pathway | 4 | https://www.qiagen.com/us/shop/genes-and-pathways/pathway-details/?pwid=304 |
| 21 | NCI beta1 integrin cell surface interactions main pathway | 4 | http://www.pathwaycommons.org/pc/record2.do?id=517095 |
| 22 | p38 signaling pathway | 4 | https://www.qiagen.com/mx/shop/genes-and-pathways/pathway-details/?pwid=337 |
| 23 | PAK pathway | 4 | https://www.qiagen.com/us/shop/genes-and-pathways/pathway-details/?pwid=342 |
| 24 | PTEN pathway | 4 | https://www.qiagen.com/us/shop/genes-and-pathways/pathway-details/?pwid=375 |
| 25 | Ras pathway | 4 | https://www.qiagen.com/no/shop/genes-and-pathways/pathway-details/?pwid=383 |

Fig. 4 (a) Distance matrix of high-quality (AUC > 0.75) biomarkers for pairwise comparisons between different cancer localizations. Cancer type abbreviations are as above. The lower triangle shows the numbers of good biomarkers for pathway-based data (PI); the upper triangle shows them for individual gene-based mutation data (nMR). The intersection of two cancer indications shows the number of effective biomarkers for the respective comparison. (b) Cluster dendrogram built for the fifteen cancer types based on mutation biomarker (nMR) data. The number of biomarkers was used as the distance metric. (c) Cluster dendrogram built for the fifteen cancer types based on mutation biomarker (PI) data. The number of biomarkers was used as the distance metric

$$\mathrm{MDS}_d = \sum_n \mathrm{DTI}_{d,n} \sum_p \mathrm{PG}_{n,p}\,\mathrm{PI}_p, \tag{1}$$

where d is the drug name; n is the gene name; p is the pathway name; $\mathrm{MDS}_d$ is the MDS for drug d; $\mathrm{DTI}_{d,n}$ is the drug target index for drug d and gene n; $\mathrm{PI}_p$ is the Pathway Instability of pathway p; and $\mathrm{PG}_{n,p}$ is the pathway-gene indicator for gene n and pathway p. As in the previous formula for PI calculation, the pathway-gene indicator equals 1 if gene n is included in pathway p and 0 otherwise. In turn, the Boolean flag drug target index, $\mathrm{DTI}_{d,n}$, formalizes whether gene n is a molecular target of drug d:

$$\mathrm{DTI}_{d,n} = \begin{cases} 1, & \text{drug } d \text{ has target gene } n, \\ 0, & \text{drug } d \text{ does not have target gene } n. \end{cases}$$

To complete the DTI database, we used the data about the molecular specificities of 128 target drugs extracted from the DrugBank [34] and Connectivity Map [67] databases. The above basic formula for MDS calculation was modified to generate several alternative methods of drug scoring.

– Pathway size-normalized. Since molecular pathways include considerably different numbers of genes, varying from dozens to hundreds, this modification of the MDS calculation method (Eq. 1) normalizes each PI term by the number of genes in the respective pathway:

$$\mathrm{MDS\_N}_d = \sum_n \mathrm{DTI}_{d,n} \sum_p \mathrm{PG}_{n,p}\,\mathrm{PI}_p / k_p, \tag{2}$$

where $k_p$ is the number of genes in pathway p.

– Single count-normalized. The impact of each gene participating in a pathway targeted by drug d is counted only once:

$$\mathrm{MDS\_gene}_d = \sum_n \mathrm{nMR}_n\,\mathrm{GII}_{d,n}, \tag{3}$$

where $\mathrm{GII}_{d,n}$ is the Boolean flag gene involvement index:

$$\mathrm{GII}_{d,n} = \begin{cases} 1, & \text{gene } n \text{ participates in at least one pathway targeted by drug } d, \\ 0, & \text{gene } n \text{ does not participate in pathways targeted by drug } d. \end{cases}$$

– Number of pathways-normalized. MDS for drug d is normalized by the number of its targeted molecular pathways:

$$\mathrm{MDS\_m}_d = \mathrm{MDS}_d / m_d, \tag{4}$$

where $m_d$ is the number of pathways targeted by drug d.

– Number of pathways-normalized MDS_N. MDS_N is additionally normalized by the number of pathways targeted by drug d ($m_d$):

$$\mathrm{MDS\_N\_m}_d = \mathrm{MDS\_N}_d / m_d. \tag{5}$$

– Number of target genes-normalized. MDS is additionally normalized by the number of target genes for drug d ($b_d$):

$$\mathrm{MDS\_b}_d = \mathrm{MDS}_d / b_d. \tag{6}$$

– Number of target genes-normalized MDS_N. MDS_N is normalized by the number of target genes for drug d ($b_d$):

$$\mathrm{MDS\_N\_b}_d = \mathrm{MDS\_N}_d / b_d. \tag{7}$$

– Number of target genes-normalized MDS_gene. MDS_gene is normalized by the number of target genes for drug d ($b_d$):

$$\mathrm{MDS\_gene\_b}_d = \mathrm{MDS\_gene}_d / b_d. \tag{8}$$

– Target genes dependent only. MDS2 is calculated considering only the mutation frequencies of target genes:

$$\mathrm{MDS2}_d = \sum_n \mathrm{DTI}_{d,n}\,\mathrm{nMR}_n \sum_p \mathrm{PG}_{n,p}. \tag{9}$$

– Single count-normalized, target genes dependent only. MDS2_gene is calculated considering each target gene for drug d only once:

$$\mathrm{MDS2\_gene}_d = \sum_n \mathrm{DTI}_{d,n}\,\mathrm{nMR}_n\,\mathrm{GII}_{d,n}. \tag{10}$$
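The scoring variants listed above can be sketched for a single drug as follows. This is a minimal sketch based on one reading of Eqs. (1)–(10): a pathway is treated as targeted by a drug when it contains at least one of the drug's target genes, scores divided by a zero count are set to 0, and all names and toy values are hypothetical. It is not the Oncobox implementation.

```python
def mutation_drug_scores(targets, pathways, pi, nmr):
    """Compute the MDS variants for one drug.

    targets:  set of target genes of the drug (DTI = 1 for these genes)
    pathways: {pathway name: set of member genes} (PG = 1 for members)
    pi:       {pathway name: PI score}
    nmr:      {gene: normalized mutation rate}
    """
    mds = mds_n = mds2 = 0.0
    targeted = set()
    for gene in targets:
        for name, genes in pathways.items():
            if gene in genes:
                mds += pi[name]                  # Eq. 1
                mds_n += pi[name] / len(genes)   # Eq. 2, k_p = pathway size
                mds2 += nmr.get(gene, 0.0)       # Eq. 9
                targeted.add(name)
    # Genes with GII = 1: members of at least one targeted pathway.
    involved = set().union(*(pathways[name] for name in targeted))
    mds_gene = sum(nmr.get(g, 0.0) for g in involved)                    # Eq. 3
    mds2_gene = sum(nmr.get(g, 0.0) for g in targets if g in involved)   # Eq. 10
    m_d, b_d = len(targeted), len(targets)

    def per(value, count):
        # Assumption: a score normalized by a zero count is reported as 0.
        return value / count if count else 0.0

    return {
        "MDS": mds, "MDS_N": mds_n, "MDS_gene": mds_gene,
        "MDS_m": per(mds, m_d), "MDS_N_m": per(mds_n, m_d),
        "MDS_b": per(mds, b_d), "MDS_N_b": per(mds_n, b_d),
        "MDS_gene_b": per(mds_gene, b_d),
        "MDS2": mds2, "MDS2_gene": mds2_gene,
    }


# Hypothetical toy pathways, PI scores, and nMR values.
toy_pathways = {"P1": {"A", "B"}, "P2": {"A", "C", "D"}, "P3": {"E", "F"}}
toy_pi = {"P1": 0.2, "P2": 0.3, "P3": 0.5}
toy_nmr = {"A": 0.1, "B": 0.2, "C": 0.3, "D": 0.0, "E": 0.4, "F": 0.0}

scores = mutation_drug_scores({"A"}, toy_pathways, toy_pi, toy_nmr)
print(scores)
```

In the toy example the drug targets gene A, which belongs to pathways P1 and P2, so the basic MDS is the sum of their PI scores, while pathway P3 does not contribute to any variant.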

3.4 Validation of MDS Calculation Methods on Clinical Trials Data

Table 4 Clinical status of a drug according to the highest clinical trials phase passed

| Phase of clinical trials | Clinical status |
|---|---|
| Phase I ongoing | 0.1 |
| Phase I/II ongoing (phase I completed) | 0.2 |
| Phase II ongoing | 0.3 |
| Phase II completed | 0.4 |
| Phase III ongoing | 0.7 |
| Phase III completed | 0.85 |
| Phase IV (drug approved and marketed) | 1 |

Different versions of MDS were calculated according to formulae (1–10) for 128 anticancer target drugs and 3800 individual samples from eight cancer localizations: large intestine (including cecum, colon, and rectum), lung, kidney, stomach, ovarian, central nervous system, liver, and thyroid (Table 1). To measure the completion of clinical investigations for a drug, we introduced a metric termed Clinical Status. These values are congruent with the apparent efficiencies of drugs for the given cancer types. The same drug most frequently had different clinical statuses for different cancer types [61]. The Clinical Status varied in the range from 0 to 1, in proportion to the top phase of clinical trials passed by a drug for a given cancer type. The Clinical Status grows incrementally with the completion of clinical trial phases 1–4; the later phases have a greater specific weight because they allow a more accurate determination of the clinical efficacy of a drug (Table 4). The complete Clinical Status information for 128 drugs was published in [61]. The major limitation of this approach is that only the drugs that had already been clinically investigated for the respective tumor type can be ranked in this way. To investigate the capacity of the different versions of MDS to predict drug clinical efficiency, the correlations of MDS values with the Clinical Status of drugs were investigated. To calculate the correlations, we first took all cancer mutation profiles together without separation by cancer type (Fig. 5). The correlations were calculated, and the distributions of Spearman correlation coefficients were then compared. Overall, markedly better correlations were seen for the MDS and MDS_N types of drug scoring (Fig. 5). In the cancer type-specific distributions (Fig. 6), both MDS and MDS_N scores positively correlated with the clinical efficiencies of drugs in all the cancers investigated, thus confirming their high quality. Among those, MDS showed the best overall functional characteristics and was, therefore, used in further analyses.
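The validation step for a single tumor sample can be sketched as below: each drug is assigned a Clinical Status from Table 4, and the Spearman correlation between the Clinical Status and the MDS values of the drugs is calculated. The Clinical Status values are taken from Table 4; the drug names, their phases, and their MDS values are hypothetical, and the Spearman coefficient is computed here as the Pearson correlation of average ranks.

```python
CLINICAL_STATUS = {  # values from Table 4
    "Phase I ongoing": 0.1,
    "Phase I/II ongoing (phase I completed)": 0.2,
    "Phase II ongoing": 0.3,
    "Phase II completed": 0.4,
    "Phase III ongoing": 0.7,
    "Phase III completed": 0.85,
    "Phase IV (drug approved and marketed)": 1.0,
}


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical drugs: (top clinical trials phase for this cancer type, MDS in one sample).
toy_drugs = {
    "drug_A": ("Phase IV (drug approved and marketed)", 5.0),
    "drug_B": ("Phase II completed", 3.0),
    "drug_C": ("Phase I ongoing", 1.0),
    "drug_D": ("Phase III ongoing", 2.0),
}
status = [CLINICAL_STATUS[phase] for phase, _ in toy_drugs.values()]
mds_values = [mds for _, mds in toy_drugs.values()]
rho = spearman(status, mds_values)
print(round(rho, 3))
```

Repeating this calculation for every tumor sample yields the distribution of correlation coefficients that is compared between the scoring variants.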

Fig. 5 Correlation between Clinical Status and MDS rank for 10 types of drug scoring in the combined group of samples from eight cancer types. (a) Distributions of Spearman correlation coefficients between Clinical Status and MDS rank for 128 target drugs in 3800 tumor samples. The MDS rank of a drug was calculated as the position of the individual drug in the rating of all drugs under investigation (from top to low). Violin plots distributed along the X-axis represent the ten types of drug scoring. The Y-axis reflects density distributions of correlations between Clinical Status and MDS ranks. Boxes indicate the second and third quartiles of the distribution; black dots indicate outliers. (b) Distributions of p-values for the correlation coefficients between Clinical Status and MDS rank for 128 target drugs in the same tumor samples. The horizontal threshold line corresponds to p = 0.05

3.5 Application of MDS for Identification of Possible Target Genes

The MDS algorithm was tested for its capacity to identify potentially valuable drug targets. To this end, a situation was modeled where each gene specifically corresponds to only one drug. Those simulated, or virtual drugs, also were specific each to only one gene product. Using the database of 1752 molecular pathways, MDS were calculated for 8736 virtual drugs specific to the same number of genes included in these pathways. For this analysis, 18,273 full-

226

Marianna Zolotovskaia et al.

Fig. 6 Correlation between Clinical Status and MDS rank for two best types of drug scoring in eight cancer types taken separately. (a) Distributions of Spearman correlation coefficients for 128 target drugs. MDS rank was calculated as the drug’s rating of all drugs under investigation (from top to low). Drug scoring methods are shown in horizontal lines, and cancer types are given vertically. Each violin plot along X-axis represents a particular cancer type. Y-axis reflects density distributions of correlation coefficients between Clinical Status and MDS ranks. Boxes indicate the second and third quartiles of distribution, black dots indicate outliers. (b) Distributions of p-value for the correlation coefficients between Clinical Status and MDS rank for 128 target drugs in the same tumor types. The horizontal threshold line corresponds to p ¼ 0.05

exome tumor mutation profiles were used. Top 30 predicted molecular targets with highest MDS values are listed on Table 5 along with the real cancer drugs specific for these molecular targets. The complete MDS calculation data were published in [61]. All the virtual drugs were next ranked according to their MDS values and assessed if the same molecular targets are specific to any of the existing cancer drugs (Fig. 7). To this end, an auxiliary value termed Target Conversion (TC) was introduced that reflects the

Molecular Pathway Analysis of Mutation Data

227

Table 5 Top 30 molecular targets sorted by MDS and clinically approved drugs using these molecular targets

Potential molecular target   MDS      Existing relevant drugs
PIK3CA                       387.11   Idelalisib
PIK3R1                       371.31
MAPK1                        354.75
MAPK3                        343.81
HRAS                         343.66
PIK3CB                       313.02   Idelalisib
AKT1                         305.54   Perifosine
PIK3R2                       302.74
PIK3CD                       293.15   Idelalisib
KRAS                         291.42
PIK3R3                       290.07
MAP2K1                       288.80   Binimetinib, Cobimetinib, Selumetinib, Trametinib
NRAS                         287.90
PIK3R5                       279.34
RAF1                         271.72   Dabrafenib, Regorafenib, Sorafenib
MAPK8                        267.73
MAP2K2                       257.33   Binimetinib, Cobimetinib, Selumetinib, Trametinib
TP53                         255.89
GRB2                         254.36
SOS1                         243.39
RAC1                         239.32
MAPK9                        233.01
EGFR                         232.80   Afatinib, Brigatinib, Cetuximab, Erlotinib, Flavopiridol, Foretinib, Gefitinib, Lapatinib, Masitinib, Nimotuzumab, Osimertinib, Panitumumab, Vandetanib, Necitumumab
MAPK14                       224.08
MAPK10                       222.51
EGF                          214.20
RELA                         212.43
PRKCA                        211.99
NFKB1                        211.63   Thalidomide
AKT2                         205.38   Perifosine


percentage share of known molecular targets among predicted molecular targets:

$$\mathrm{TC} = \frac{\text{Number of known molecular targets}}{\text{Number of predicted molecular targets}} \times 100\%$$
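The TC definition above, and its evaluation per decile of MDS-ranked targets, can be sketched in Python. This is a minimal illustration rather than the authors' code: the function names are hypothetical, and the toy input uses only four Table 5 genes with their MDS values, two of which (PIK3CA and AKT1) have a listed drug.

```python
def target_conversion(predicted_targets, known_targets):
    """TC: percentage of predicted targets that are targets of existing drugs."""
    predicted = set(predicted_targets)
    if not predicted:
        raise ValueError("at least one predicted target is required")
    known = predicted & set(known_targets)
    return 100.0 * len(known) / len(predicted)


def tc_by_bin(mds_by_gene, known_targets, n_bins=10):
    """Sort genes by ascending MDS, split them into n_bins nearly equal
    bins (deciles when n_bins=10), and return TC for each non-empty bin."""
    ranked = sorted(mds_by_gene, key=mds_by_gene.get)
    n = len(ranked)
    bins = [ranked[i * n // n_bins:(i + 1) * n // n_bins] for i in range(n_bins)]
    return [target_conversion(b, known_targets) for b in bins if b]


# Toy subset of Table 5 (illustrative only, not the full 8736-gene list).
toy_mds = {"PIK3CA": 387.11, "PIK3R1": 371.31, "MAPK1": 354.75, "AKT1": 305.54}
toy_known = {"PIK3CA", "AKT1"}

overall_tc = target_conversion(toy_mds, toy_known)
tc_halves = tc_by_bin(toy_mds, toy_known, n_bins=2)
```

With the full gene list and `n_bins=10`, the returned list would correspond to the per-decile TC values plotted in Fig. 7a.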

For the overall (complete) list of potential molecular targets, TC was 2.17%. However, TC increased steadily when the potential molecular targets were sorted in ascending order of MDS value (Fig. 7a, shown for deciles of the potential targets). For the decile of molecular targets having the

Fig. 7 MDS ranking and occurrence of molecular targets in approved cancer drugs. (a) Deciles of potential molecular targets are sorted in ascending order according to MDS value. TC values for every decile are shown on vertical axes. (b) Distribution of MDS values among the potential molecular drug targets. Color scale on the graph indicates densities of clinically approved cancer drugs using the same molecular targets


highest MDS values, the greatest TC value was observed, exceeding 10%. Molecular targets with the highest MDS were clearly enriched in targets of existing clinically approved drugs (Fig. 7a). In addition, target genes with higher MDS are covered by a greater number of approved drugs per target, as many drugs have common molecular specificities (Fig. 7b) [61]. The mutation enrichment of a pathway may characterize its overall involvement in malignant transformation. According to the present concept of drug scoring, the maximum efficiency of a drug can be obtained by acting on the most strongly affected molecular pathways [61].

4 Summary

Bioinformatic approaches based on measuring molecular pathway activation have been efficient in finding biomarkers using high-throughput proteomic [45], mRNA [47, 76], and epigenetic data [46, 77]. Here, we propose applying the molecular pathway scoring approach to mutation data. The idea of aggregating mutation data has already been reflected in previous techniques. For example, the bioinformatics tool BioBin overcomes high data sparsity by combining mutations into bins at the level of molecular pathways, protein families, evolutionarily conserved regions, and regulatory regions [78, 79]. The same problem is addressed by Network regularized sparse nonnegative TRI matrix factorization for PATHway identification, which uses known molecular pathways and a network of gene interactions [80]. Unlike previous techniques, the Oncobox mutation pathway instability approach focuses on generating a universal metric that objectively reflects the mutation burden of a molecular pathway. Previous approaches evaluated the mutation load on the basis of the presence or absence of a mutation and did not consider the number of pathway participants or the length of their coding DNA sequences, which produced significant bias in the output results. In the Oncobox PI calculation approach, this bias is removed through normalization on the gene lengths and on the numbers of pathway participants. PI calculation does not require detailed annotation of the effects of individual mutations on pathway activities, because only a minor fraction of the identified mutations has been experimentally characterized in terms of its impact on protein and pathway functioning. However, further accumulation of such data on a high-throughput basis will make it possible to improve the pathway instability calculation by adding specific coefficients reflecting the effect of every individual mutation on the respective protein and pathway functions.
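The two normalization steps mentioned above (on coding sequence length and on the number of pathway participants) can be illustrated with a short Python sketch. It is an assumption-laden illustration, not the chapter's exact PI formula, which is defined in the Methods: the function names, the per-sample averaging, and the toy gene identifiers are all hypothetical.

```python
import math


def normalized_mutation_rate(n_mutations, n_samples, cds_length):
    """Mutations per sample per coding nucleotide for one gene
    (hypothetical helper; removes the bias toward long genes)."""
    return n_mutations / n_samples / cds_length


def pathway_burden(pathway_genes, rate_by_gene):
    """Mean length-normalized mutation rate over all pathway members
    (removes the bias toward large pathways). Genes without observed
    mutations contribute a rate of zero."""
    if not pathway_genes:
        raise ValueError("pathway must contain at least one gene")
    rates = [rate_by_gene.get(gene, 0.0) for gene in pathway_genes]
    return sum(rates) / len(rates)


# Toy example: GENE_A has 10 mutations across 5 samples and a 1000-nt CDS;
# GENE_B has no mutations in the same cohort.
toy_rates = {"GENE_A": normalized_mutation_rate(10, 5, 1000)}
toy_burden = pathway_burden(["GENE_A", "GENE_B"], toy_rates)
```

Dividing by the pathway size means that a two-gene pathway and a two-hundred-gene pathway are scored on the same scale, which is the point of the second normalization step.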
This method can be routinely used for comparing sets of human exome or complete genome data. To this end, for every sample, mutations should be identified for the genes participating


in the molecular pathways under investigation. PI scores are then calculated, showing the relative mutation burden of each pathway in every biosample. These findings can be valuable per se for better understanding the individual mechanisms of carcinogenesis. Furthermore, a ROC AUC test can then be applied to the PI data to identify reliable biomarkers distinguishing the sample groups under comparison. All these procedures can be done using publicly available bioinformatic tools, and the gene composition of the molecular pathways required for PI calculation is available in the relevant databases [61–66, 81]. PI mutation data can be used as an additional criterion for differential diagnostics in oncology. We describe here ten different versions of molecular pathway-based mutation drug scoring. At least two versions provided output data positively correlating with the clinical trials data for 128 drugs in all tested tumor types. We hope that the pathway-based mutation drug scoring approach can help clinical oncologists implement personalized selection of target drugs based on the individual patient’s tumor-specific high-throughput mutation profile. Moreover, evidence was provided that the same approach can also be applied to identify potentially efficient molecular targets in experimental oncology [61]. In this application, we considered the integral MDS for all cancer types. However, in further applications the same approach can be used for any specific tumor type or subtype to identify the targets that are most promising for a given disease. This could be valuable, for example, for drug repurposing among different tumor types and for more effective identification of patient cohorts in clinical trials.
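The ROC AUC step mentioned above can be sketched in plain Python. This is a minimal illustration under stated assumptions: the function name and the PI values are hypothetical, and AUC is computed through its rank interpretation (the probability that a randomly chosen sample from one group has a higher PI than a randomly chosen sample from the other, with ties counted as one half).

```python
def roc_auc(scores_group_a, scores_group_b):
    """Area under the ROC curve for separating group A (expected higher
    scores) from group B, via pairwise comparison of all score pairs."""
    if not scores_group_a or not scores_group_b:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for a in scores_group_a:
        for b in scores_group_b:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(scores_group_a) * len(scores_group_b))


# Hypothetical PI scores of one pathway in two sample groups.
pi_group_a = [0.9, 0.8, 0.4]
pi_group_b = [0.5, 0.3, 0.2]
auc = roc_auc(pi_group_a, pi_group_b)
```

In this toy case eight of the nine pairs are ordered correctly, so the AUC is 8/9. A value near 0.5 would indicate that the pathway's PI does not discriminate the two groups.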
The present mutation drug scoring approach quantitates the molecular pathway instability caused by accumulation of mutations and ranks drugs according to a simple rationale: the higher the mutation burden of a pathway, the greater the expected efficiency of a drug targeting this pathway. We hope that the Oncobox pathway instability (PI) calculation and the mutation drug scoring (MDS) technique will be of interest to those working in the fields of functional genomics and molecular medicine.

Acknowledgment Funding: This study was supported by the Oncobox research program in digital oncology, by the Russian Science Foundation grant no. 18-15-00061, by Amazon and Microsoft Azure grants for cloud-based computational facilities.


References

1. Sieber O, Heinimann K, Tomlinson I (2005) Genomic stability and tumorigenesis. Semin Cancer Biol 15:61–66
2. Vogelstein B, Papadopoulos N, Velculescu VE, Zhou S, Diaz LA, Kinzler KW (2013) Cancer genome landscapes. Science 339:1546–1558
3. Kandoth C, McLellan MD, Vandin F, Ye K, Niu B, Lu C et al (2013) Mutational landscape and significance across 12 major cancer types. Nature 502:333–339
4. Bignell GR, Greenman CD, Davies H, Butler AP, Edkins S, Andrews JM et al (2010) Signatures of mutation and selection in the cancer genome. Nature 463:893–898
5. Campbell PJ, Yachida S, Mudie LJ, Stephens PJ, Pleasance ED, Stebbings LA et al (2010) The patterns and dynamics of genomic instability in metastatic pancreatic cancer. Nature 467:1109–1113
6. International Cancer Genome Consortium, Hudson TJ, Anderson W, Artez A, Barker AD, Bell C et al (2010) International network of cancer genome projects. Nature 464:993–998
7. Rafiq S, Khan S, Tapper W, Collins A, Upstill-Goddard R, Gerty S et al (2014) A genome wide meta-analysis study for identification of common variation associated with breast cancer prognosis. PLoS One 9:e101488
8. Mitra AP, Lerner SP (2015) Potential role for targeted therapy in muscle-invasive bladder cancer: lessons from the cancer genome atlas and beyond. Urol Clin North Am 42:201–215
9. Hanna N, Einhorn LH (2014) Testicular cancer: a reflection on 50 years of discovery. J Clin Oncol 32:3085–3092
10. Oldenburg A, Hohmann J, Foert E, Skrok J, Hoffmann C, Frericks B et al (2005) Detection of hepatic metastases with low MI real time contrast enhanced sonography and SonoVue®. Eur J Ultrasound 26:277–284
11. Vasey PA (2003) Resistance to chemotherapy in advanced ovarian cancer: mechanisms and current strategies. Br J Cancer 89(Suppl 3):S23–S28
12. Housman G, Byler S, Heerboth S, Lapinska K, Longacre M, Snyder N et al (2014) Drug resistance in cancer: an overview. Cancers (Basel) 6:1769–1792
13. Sawyers C (2004) Targeted cancer therapy. Nature 432:294–297
14. Druker BJ, Sawyers CL, Kantarjian H, Resta DJ, Reese SF, Ford JM et al (2001) Activity of a specific inhibitor of the BCR-ABL tyrosine kinase in the blast crisis of chronic myeloid leukemia and acute lymphoblastic leukemia with the Philadelphia chromosome. N Engl J Med 344:1038–1042
15. Druker BJ, Talpaz M, Resta DJ, Peng B, Buchdunger E, Ford JM et al (2001) Efficacy and safety of a specific inhibitor of the BCR-ABL tyrosine kinase in chronic myeloid leukemia. N Engl J Med 344:1031–1037
16. Spirin P, Lebedev T, Orlova N, Morozov A, Poymenova N, Dmitriev SE et al (2017) Synergistic suppression of t(8;21)-positive leukemia cell growth by combining oridonin and MAPK1/ERK2 inhibitors. Oncotarget 8:56991–57002
17. Joo WD, Visintin I, Mor G (2013) Targeted cancer therapy: are the days of systemic chemotherapy numbered? Maturitas 76:308–314
18. Padma VV (2015) An overview of targeted cancer therapy. Biomedicine 5:19
19. Baselga J (2006) Targeting tyrosine kinases in cancer: the second wave. Science 312:1175–1178
20. Rini BI (2009) Vascular endothelial growth factor-targeted therapy in metastatic renal cell carcinoma. Cancer 115:2306–2312
21. Azoury SC, Straughan DM, Shukla V (2015) Immune checkpoint inhibitors for cancer therapy: clinical efficacy and safety. Curr Cancer Drug Targets 15:452–462
22. Anders CK, Winer EP, Ford JM, Dent R, Silver DP, Sledge GW et al (2010) Poly(ADP-ribose) polymerase inhibition: “targeted” therapy for triple-negative breast cancer. Clin Cancer Res 16:4702–4710
23. Xie J, Wang X, Proud CG (2016) mTOR inhibitors in cancer therapy. F1000Res 5:F1000 Faculty Rev-2078
24. Ko YJ, Balk SP (2004) Targeting steroid hormone receptor pathways in the treatment of hormone dependent cancers. Curr Pharm Biotechnol 5:459–470
25. Kisselev AF, van der Linden WA, Overkleeft HS (2012) Proteasome inhibitors: an expanding army attacking a unique target. Chem Biol 19:99–115
26. Suzuki M, Cheung N-KV (2015) Disialoganglioside GD2 as a therapeutic target for human diseases. Expert Opin Ther Targets 19:349–362
27. Giles FJ, Cortes JE, Kantarjian HM (2005) Targeting the kinase activity of the BCR-ABL fusion protein in patients with chronic myeloid leukemia. Curr Mol Med 5:615–623
28. Nahta R, Esteva FJ (2007) Trastuzumab: triumphs and tribulations. Oncogene 26:3637–3643


29. Hudis CA (2007) Trastuzumab – mechanism of action and use in clinical practice. N Engl J Med 357:39–51
30. Chapman PB, Hauschild A, Robert C, Haanen JB, Ascierto P, Larkin J et al (2011) Improved survival with vemurafenib in melanoma with BRAF V600E mutation. N Engl J Med 364:2507–2516
31. Prieto PA, Yang JC, Sherry RM, Hughes MS, Kammula US, White DE et al (2012) CTLA-4 blockade with ipilimumab: long-term follow-up of 177 patients with metastatic melanoma. Clin Cancer Res 18:2039–2047
32. Ma Q, Lu AYH (2011) Pharmacogenetics, pharmacogenomics, and individualized medicine. Pharmacol Rev 63:437–459
33. Zappa C, Mousa SA (2016) Non-small cell lung cancer: current treatment and future advances. Transl Lung Cancer Res 5:288–300
34. Law V, Knox C, Djoumbou Y, Jewison T, Guo AC, Liu Y et al (2014) DrugBank 4.0: shedding new light on drug metabolism. Nucleic Acids Res 42:D1091–D1097
35. Buzdin A, Sorokin M, Garazha A, Sekacheva M, Kim E, Zhukov N et al (2018) Molecular pathway activation – new type of biomarkers for tumor morphology and personalized selection of target drugs. Semin Cancer Biol 53:110–124
36. Hornberger J, Cosler LE, Lyman GH (2005) Economic analysis of targeting chemotherapy using a 21-gene RT-PCR assay in lymph-node-negative, estrogen-receptor-positive, early-stage breast cancer. Am J Manag Care 11:313–324
37. Le Tourneau C, Paoletti X, Servant N, Bièche I, Gentien D, Rio Frio T et al (2014) Randomised proof-of-concept phase II trial comparing targeted therapy based on tumour molecular profiling vs conventional therapy in patients with refractory cancer: results of the feasibility part of the SHIVA trial. Br J Cancer 111:17–24
38. Røsland GV, Engelsen AST (2015) Novel points of attack for targeted cancer therapy. Basic Clin Pharmacol Toxicol 116:9–18
39. Duffy MJ (2017) Clinical use of tumor biomarkers: an overview. Klin Biochem Metab 25:157–161
40. Sowter HM, Ashworth A (2005) BRCA1 and BRCA2 as ovarian cancer susceptibility genes. Carcinogenesis 26:1651–1656
41. Thériault C, Pinard M, Comamala M, Migneault M, Beaudin J, Matte I et al (2011) MUC16 (CA125) regulates epithelial ovarian cancer cell growth, tumorigenesis and metastasis. Gynecol Oncol 121:434–443
42. Zhang Q, Burdette JE, Wang J-P (2014) Integrative network analysis of TCGA data for ovarian cancer. BMC Syst Biol 8:1338
43. Buzdin AA, Zhavoronkov AA, Korzinkin MB, Venkova LS, Zenin AA, Smirnov PY et al (2014) Oncofinder, a new method for the analysis of intracellular signaling pathway activation using transcriptomic data. Front Genet 5:55
44. Ozerov IV, Lezhnina KV, Izumchenko E, Artemov AV, Medintsev S, Vanhaelen Q et al (2016) In silico pathway activation network decomposition analysis (iPANDA) as a method for biomarker development. Nat Commun 7:13427
45. Borisov N, Suntsova M, Sorokin M, Garazha A, Kovalchuk O, Aliper A et al (2017) Data aggregation at the level of molecular pathways improves stability of experimental transcriptomic and proteomic data. Cell Cycle 16:1810–1823
46. Artcibasova AV, Korzinkin MB, Sorokin MI, Shegay PV, Zhavoronkov AA, Gaifullin N et al (2016) MiRImpact, a new bioinformatic method using complete microRNA expression profiles to assess their overall influence on the activity of intracellular molecular pathways. Cell Cycle 15:689–698
47. Aliper AM, Korzinkin MB, Kuzmina NB, Zenin AA, Venkova LS, Smirnov PY et al (2017) Mathematical justification of expression-based pathway activation scoring (PAS). Methods Mol Biol 1613:31–51
48. Chong M-L, Loh M, Thakkar B, Pang B, Iacopetta B, Soong R (2014) Phosphatidylinositol-3-kinase pathway aberrations in gastric and colorectal cancer: meta-analysis, co-occurrence and ethnic variation. Int J Cancer 134:1232–1238
49. Li H, Zeng J, Shen K (2014) PI3K/AKT/mTOR signaling pathway as a therapeutic target for ovarian cancer. Arch Gynecol Obstet 290:1067–1078
50. Toren P, Zoubeidi A (2014) Targeting the PI3K/Akt pathway in prostate cancer: challenges and opportunities (Review). Int J Oncol 45:1793–1801
51. Borisov N, Sorokin M, Garazha AV, Buzdin A (2019) Quantitation of molecular pathway activation using RNA sequencing data. Methods Mol Biol (in press)
52. Borisov NM, Terekhanova NV, Aliper AM, Venkova LS, Smirnov PY, Roumiantsev S et al (2014) Signaling pathways activation profiles make better markers of cancer than expression of individual genes. Oncotarget 5:10198–10205

53. Buzdin AA, Zhavoronkov AA, Korzinkin MB, Roumiantsev SA, Aliper AM, Venkova LS et al (2014) The OncoFinder algorithm for minimizing the errors introduced by the high-throughput methods of transcriptome analysis. Front Mol Biosci 1:8
54. Wirsching A, Melloul E, Lezhnina K, Buzdin AA, Ogunshola OO, Borger P et al (2017) Temporary portal vein embolization is as efficient as permanent portal vein embolization in mice. Surgery 162:68–81
55. Kurz S, Thieme R, Amberg R, Groth M, Jahnke H-G, Pieroh P et al (2017) The antitumorigenic activity of A2M-A lesson from the naked mole-rat. PLoS One 12:e0189514
56. Petrov I, Suntsova M, Ilnitskaya E, Roumiantsev S, Sorokin M, Garazha A et al (2017) Gene expression and molecular pathway activation signatures of MYCN-amplified neuroblastomas. Oncotarget 8:83768–83780
57. Sorokin M, Kholodenko R, Grekhova A, Suntsova M, Pustovalova M, Vorobyeva N et al (2018) Acquired resistance to tyrosine kinase inhibitors may be linked with the decreased sensitivity to X-ray irradiation. Oncotarget 9:5111–5124
58. Zolotovskaia MA, Sorokin MI, Roumiantsev SA, Borisov NM, Buzdin AA (2018) Pathway instability is an effective new mutation-based type of cancer biomarkers. Front Oncol 8:658
59. Forbes SA, Beare D, Boutselakis H, Bamford S, Bindal N, Tate J et al (2017) COSMIC: somatic cancer genetics at high-resolution. Nucleic Acids Res 45:D777–D783
60. Tomczak K, Czerwińska P, Wiznerowicz M (2015) The Cancer Genome Atlas (TCGA): an immeasurable source of knowledge. Contemp Oncol (Pozn) 19:A68–A77
61. Zolotovskaia MA, Sorokin MI, Emelianova AA, Borisov NM, Kuzmin DV, Borger P et al (2019) Pathway based analysis of mutation data is efficient for scoring target cancer drugs. Front Pharmacol 10:1
62. Croft D, Mundo AF, Haw R, Milacic M, Weiser J, Wu G et al (2014) The reactome pathway knowledgebase. Nucleic Acids Res 42:D472–D477
63. Schaefer CF, Anthony K, Krupa S, Buchoff J, Day M, Hannay T et al (2009) PID: the pathway interaction database. Nucleic Acids Res 37:D674–D679
64. Kanehisa M, Goto S (2000) KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res 28:27–30
65. Romero P, Wagg J, Green ML, Kaiser D, Krummenacker M, Karp PD (2004) Computational prediction of human metabolic pathways from the complete human genome. Genome Biol 6:R2
66. Nishimura D (2001) BioCarta. Biotech Software Internet Rep 2:117–120
67. Lamb J, Crawford ED, Peck D, Modell JW, Blat IC, Wrobel MJ et al (2006) The connectivity map: using gene-expression signatures to connect small molecules, genes, and disease. Science 313:1929–1935
68. Wickham H (2009) Ggplot2: elegant graphics for data analysis. Springer, New York, NY
69. Green DM, Swets JA et al (1966) Signal detection theory and psychophysics. Wiley, New York, NY
70. Boyd JC (1997) Mathematical tools for demonstrating the clinical usefulness of biochemical markers. Scand J Clin Lab Invest Suppl 227:46–63
71. Fodde R (2002) The APC gene in colorectal cancer. Eur J Cancer 38:867–871
72. Risinger JI, Hayes K, Maxwell GL, Carney ME, Dodge RK, Barrett JC et al (1998) PTEN mutation in endometrial cancers is associated with favorable clinical and pathologic characteristics. Clin Cancer Res 4:3005–3010
73. Cohen Y, Xing M, Mambo E, Guo Z, Wu G, Trink B et al (2003) BRAF mutation in papillary thyroid carcinoma. J Natl Cancer Inst 95:625–627
74. Lawrence MS, Stojanov P, Polak P, Kryukov GV, Cibulskis K, Sivachenko A et al (2013) Mutational heterogeneity in cancer and the search for new cancer-associated genes. Nature 499:214–218
75. Murtagh F, Legendre P (2014) Ward’s hierarchical agglomerative clustering method: which algorithms implement Ward’s criterion? J Classif 31:274–295
76. Zhu Q, Izumchenko E, Aliper AM, Makarev E, Paz K, Buzdin AA et al (2015) Pathway activation strength is a novel independent prognostic biomarker for cetuximab sensitivity in colorectal cancer patients. Hum Genome Variat 2:15009
77. Nikitin D, Penzar D, Garazha A, Sorokin M, Tkachev V, Borisov N et al (2018) Profiling of human molecular pathways affected by retrotransposons at the level of regulation by transcription factor proteins. Front Immunol 9:30
78. Moore CB, Wallace JR, Frase AT, Pendergrass SA, Ritchie MD (2013) BioBin: a bioinformatics tool for automating the binning of rare variants using publicly available biological knowledge. BMC Med Genomics 6(Suppl 2):S6
79. Kim D, Li R, Dudek SM, Wallace JR, Ritchie MD (2015) Binning somatic mutations based on biological knowledge for predicting survival: an application in renal cell carcinoma. Pacific symposium biocomputing, pp. 96–107
80. Park S, Kim S-J, Yu D, Peña-Llopis S, Gao J, Park JS et al (2016) An integrative somatic mutation analysis to identify pathways linked with survival outcomes across 19 cancer types. Bioinformatics 32:1643–1651
81. QIAGEN. Sample to insight. https://www.qiagen.com/us/shop/genes-and-pathways/pathway-central/. Accessed 19 Sep 2018

Chapter 17

Oncobox Method for Scoring Efficiencies of Anticancer Drugs Based on Gene Expression Data

Victor Tkachev, Maxim Sorokin, Andrew Garazha, Nicolas Borisov, and Anton Buzdin

Abstract

We describe here the Oncobox method for scoring efficiencies of anticancer target drugs (ATDs) using high-throughput gene expression data. The rationale, design, and validation of the method are given along with examples of its practical applications in biomedicine. The method is based on the analysis of intracellular molecular pathway activation and on measuring the expression of molecular target genes for every ATD under consideration. Using the Oncobox method requires a collection of normal (control) expression profiles and annotated databases of molecular pathways and drug target genes. Both microarray and RNA sequencing profiles are acceptable, although the latter type of data prevails in the most recent applications of this technique.

Key words Systems biology, Bioinformatics, Intracellular molecular pathways, Gene expression, Transcriptomics, Proteomics, Tumor biomarkers, Anticancer target drugs, Response to cancer therapy

Abbreviations

ATD   Anticancer target drug
IMP   Intracellular molecular pathway
NGS   Next generation sequencing
PAL   Pathway activation level, calculated using mRNA or protein expression data

1 Introduction

For several decades, chemotherapy has remained a key treatment for many cancers, often with impressive success rates. For example, cisplatin-based regimens for testicular cancer turned almost complete mortality into ~90–95% disease-specific survival [1, 2]. However, many cancer types remain incurable and/or unresponsive to standard chemotherapy approaches. Moreover,

Kira Astakhova and Syeda Atia Bukhari (eds.), Nucleic Acid Detection and Structural Investigations: Methods and Protocols, Methods in Molecular Biology, vol. 2063, https://doi.org/10.1007/978-1-0716-0138-9_17, © Springer Science+Business Media, LLC, part of Springer Nature 2020


Victor Tkachev et al.

chemotherapy frequently causes severe side effects, which dramatically decrease the patient’s quality of life [3, 4]. The chemical compounds included in standard chemotherapy cocktails may have numerous, sometimes poorly characterized molecular targets in both pathological and normal proliferating cells. This makes it difficult to predict the activity of drugs in an individual patient and to assess their overall clinical benefit given the severe side effects [5]. Two decades ago, a new generation of drugs termed Anticancer Target Drugs (ATDs) emerged that specifically target one or a few types of molecules in a cell [6–9]. Target drugs are either humanized monoclonal antibodies (termed Mabs) or low molecular weight inhibitor molecules (e.g., specific kinase inhibitors, termed Nibs) [10]. ATDs have different mechanisms of action and are effective for different cohorts of patients. To date, around two hundred ATDs have been approved for the global pharmaceutical market (e.g., see www.drugbank.ca). For several cancer types, the emergence of ATDs was highly beneficial. For example, trastuzumab (an anti-HER2 monoclonal antibody) and several other new anti-HER2 medications at least doubled median survival time in patients with metastatic HER2-positive breast cancer and improved 5-year survival in early-stage disease to ~90–95% [11, 12]. This reversed what had been the worst prognosis among all breast cancer subtypes [13]. Compared to the previous therapies, ATDs are more expensive, but they often demonstrate increased efficiency and cause fewer side effects. However, many individual cases and even entire cancer types remain poorly responsive to targeted therapies [14]. Importantly, the results of clinical trials clearly show that in many cases drugs considered inefficient for a cancer type overall provide significant benefit for a small fraction of the patients. 
For example, the anti-EGFR drugs gefitinib and erlotinib showed no overall benefit in large randomized trials on patients with non-small cell lung cancer. However, ~10–15% of the patients responded to the treatment and showed longer survival. Further studies revealed that the therapeutic success was associated with activating mutations in the EGFR gene [15]. It is, therefore, of great importance to identify robust predictive markers of ATD efficacy for as many cancer-drug combinations as possible. Several molecular tests have been designed to identify individualized cancer treatments [16, 17]. These tests utilize data on single somatic mutations or on the expression of one or a few genes [5]. Nevertheless, such predictors are applicable to a relatively small proportion of patients because they cover only a minor fraction of ATDs in a minor fraction of cancers. Thus, the problem of efficient ATD selection remains largely unsolved. More universal methods are, therefore, needed to rank the maximum number of drugs for a patient. The recent development of sequencing and microarray technologies provided an instrument for a new type of analysis of the

Scoring Efficiencies of Anticancer Drugs by Oncobox


histologically homogeneous tumors. When selecting therapy, information about the histomorphological type of a tumor needs to be reinforced with tumor genetics data. This approach is becoming more and more common in modern clinical practice and has proven advantages [18]. In 2017, the US Food and Drug Administration (FDA) for the first time approved a tumor genetic marker, rather than tumor localization or morphological type, as the primary indication for prescribing an ATD, Keytruda (pembrolizumab) [19]. This case suggests that personalized oncology may become a new major standard of care in the future. The relevant task, therefore, is to develop a new generation of biomedical platforms enabling smart selection of the most efficient therapy and a search for prognostic tumor markers for an individual patient. Wide implementation of such techniques will hopefully result in decreased mortality from oncological diseases. Currently, only a small number of diagnostic platforms utilize specific types of large-scale genetic data for recommending cancer treatments. For example, the CARIS Molecular Intelligence system is based on the analysis of a limited spectrum of mutations with previously demonstrated clinical significance, and on immunohistochemical profiling for detection of several cancer biomarker proteins [20–23]. However, this system does not consider high-throughput gene expression data. We describe here the Oncobox method of predicting ATD efficacy based on high-throughput gene expression profiles in the patient’s tumor. This method utilizes quantitative interrogation of the activities of intracellular molecular pathways (IMPs). IMPs consolidate gene products involved in the collective execution of certain molecular functions. IMPs drive all major events in the living cell and include signaling, metabolic, DNA repair, and cytoskeleton organization pathways [24]. 
Recently, a new family of methods has been proposed for quantitative analysis of IMP activation based on large-scale molecular data [5, 25]. Measuring IMP activation requires data on the concentrations of the gene products involved in the pathway. Oncobox, the latest IMP profiling approach, makes it possible to analyze mRNA, protein, and microRNA expression data, DNA mutation profiles, and distributions of transcription factor binding sites (described in detail elsewhere [25, 26]). This approach also demonstrated a remarkable ability to reduce the level of experimental noise in the input data during the analysis. This noise is frequently introduced by the experimental equipment, protocols, and reagents used [27, 28]. IMP activation levels (Pathway Activation Levels, PALs) were more stable biomarkers than individual gene expression levels [27]. The Oncobox approach and related methods were effective in finding biomarkers for a plethora of biological processes, including many applications in cancer biology [5, 26]. These methods were


also useful for finding biomarkers of tumor response to drug treatments [29–33]. Overall, the pathway-based methods for predicting drug responses can be classified into two major groups: (1) those using PAL signatures as statistical coincidence biomarkers and (2) those using PAL values in the context of knowledge of drug-specific molecular mechanisms [5]. The Oncobox approach for ATD scoring belongs to the second group. This method combines profiling of intracellular molecular pathways (IMPs) with probing the expression levels of drug target molecules to personalize the selection of ATDs.

2 Materials

2.1 Molecular Pathways

Molecular pathways used in the current version of the Oncobox system were published previously in [34]. The gene content data for 3125 human molecular pathways used to calculate mutation drug scores were extracted from Reactome [35], NCI Pathway Interaction Database [36], Kyoto Encyclopedia of Genes and Genomes [37], HumanCyc [38], BioCarta [39], and Qiagen (https://www.qiagen.com/us/shop/genes-and-pathways/pathway-central/; accessed 19 Sep 2018). For drug score calculation, we used only the 1752 pathways that include ten or more gene products, because a poor theoretical data aggregation effect was previously reported for smaller pathways [27]. All calculations of Pathway Activation Level (PAL) values were performed according to [25].
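The pathway size filter described above can be expressed in a few lines of Python. The sketch is illustrative only: the dictionary layout, the function name, and the pathway and gene identifiers are hypothetical and do not reflect the actual Oncobox database format.

```python
MIN_GENE_PRODUCTS = 10  # threshold stated in the text: ten or more gene products


def filter_pathways(pathways, min_size=MIN_GENE_PRODUCTS):
    """Keep only pathways with at least min_size distinct gene products.

    pathways: dict mapping a pathway name to an iterable of gene symbols.
    """
    return {
        name: sorted(set(genes))
        for name, genes in pathways.items()
        if len(set(genes)) >= min_size
    }


# Hypothetical input: one pathway of 12 genes and one of 3 genes.
toy_pathways = {
    "large_pathway": [f"GENE{i}" for i in range(12)],
    "small_pathway": ["GENE1", "GENE2", "GENE3"],
}
kept = filter_pathways(toy_pathways)
```

Counting distinct symbols (via `set`) is a design choice of this sketch so that a gene listed twice in a pathway file is not counted twice toward the threshold.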

2.2 Expression Profile Harmonization

The code for the Shambhala harmonizer tool [40] was written as a further modification and upgrade of the R package CONOR [41]. The whole code was arranged as the R package HARMONY. This package, as well as a code example for Shambhala application, is deposited at GitHub, https://github.com/oncobox-admin/harmony.

2.3 Clinical Trials Data

Clinical trials data were extracted from the National Institutes of Health web site ClinicalTrials.gov, available at https://clinicaltrials.gov/ (accessed 25 Jul 2017), and from the US Food and Drug Administration home page, available at https://www.fda.gov/ (accessed 25 Jul 2017). They were processed by manual curation of web data. The processed clinical trials data used for the correlation studies are fully available in the previous publication [42].

2.4 Drug Targets Data

The information about molecular specificities of 128 target anticancer drugs was obtained from DrugBank [43] and ConnectivityMap [44] databases.

2.5 Data Presentation

The results were visualized using the ggplot2 package [45].

3 Methods

3.1 Molecular Mechanism-Based Pathway Activation Biomarkers

This group of methods employs knowledge of specific molecular mechanisms for predicting drug responses. It can be applied only to drugs with known molecular specificities and established mechanism(s) of action. However, it has the advantage of ranking all possible target drugs using a single high-throughput tumor gene expression profile. To this end, a Drug Score (DS) value is introduced to measure the effectiveness of each target drug in a patient. DS calculation is based on the rationale that, to be effective, a drug should compensate for the pathological changes in IMPs associated with cancer progression; at the same time, the specific molecular target(s) of the drug must be expressed at an average or high level. To assess the IMP activation profile, the Pathway Activation Level (PAL) values can be calculated for the pathological tissues of a patient [25]. In the case of oncological disease this can be, for example, fresh biopsy tissue or a formalin-fixed, paraffin-embedded (FFPE) tumor tissue block. In contrast to previous approaches for drug scoring [29], the Oncobox system interrogates the effective concentrations of the molecular targets of the drugs under consideration. When required, it may also include a module for gene expression data harmonization using the Shambhala method (described in detail in [40]) when combining expression profiles of test samples and of normal (control) samples. Data harmonization means bringing expression data into a uniform, comparable form. It is necessary when joining the patient’s test sample data with the relevant control sample(s), including those obtained using different experimental platforms. In such a case, the Shambhala method should be applied. When the patient’s and control sample data are obtained using the same experimental platform, the data are most frequently considered harmonized without using additional algorithms. 
Furthermore, in the Oncobox system the role of each gene product in a pathway is defined algorithmically during molecular pathway activation analysis [25]. In previous methods, by contrast, these roles were determined by manual curation of the molecular pathway graphs. Manual curation is an apparent source of operational errors and restricts broad use of such techniques, because it is difficult to process hundreds or thousands of molecular pathways by hand with consistent quality when each includes tens or hundreds of gene products forming numerous functional nodes [25]. Finally, the previous methods poorly distinguished between the different classes of target drugs and their modes of action when calculating drug scores [29].

Victor Tkachev et al.

3.2 Oncobox Drug Scoring Method

As initial data, Oncobox uses high-throughput gene expression data obtained from tumor tissue samples of individual patients and from control healthy tissue samples. Fresh tissue, FFPE blocks, or otherwise preserved tissue samples are used as the starting biomaterials. It is preferable to analyze a sufficiently homogeneous area with the cancer phenotype. To this end, the tissue with the proliferative phenotype can be additionally cleared or isolated from surrounding tissues using methods known to specialists. The primary input data for the Oncobox platform can be of different types: (1) at the mRNA level, high-throughput gene expression profiling by microarray hybridization, deep sequencing, or real-time reverse transcription PCR; (2) at the protein level, high-throughput quantitative proteomics data.

1. The first technical objective is profiling of PAL scores for the IMPs. The measured intracellular pathways include signalling, DNA repair, metabolic, cytoskeleton rearrangement, and other molecular pathways. This objective is divided into several sub-objectives:
(a) Development of a molecular pathway database and assignment of pathway-based functions to the enclosed gene products.
(b) Development of algorithms for molecular pathway activation analysis using experimental data. The Oncobox platform quantitates PAL values for the relevant IMPs according to [25].
2. The second technical objective is personalized prediction of the clinical efficiency of drugs for individual patients. It is divided into the following sub-objectives:
(a) Development of a molecular target database for target drugs.
(b) Development of algorithms for personalized prediction of clinical efficiencies of drugs based on molecular pathway activation data and other molecular statistical data.

The drug score, termed the Balanced Efficiency Score (BES), is calculated as follows.
BES is calculated using a single algorithm that sums two basic terms: the Drug Efficiency Score for Molecular Pathways, DESMP (reflecting the contribution of molecular pathways), and the Drug Efficiency Score for Target Genes, DESTG (reflecting the contribution of individual molecular target genes). The weight coefficients for DESMP and DESTG range from 0 to 1.5 and differ between drug classes (Table 1). The drugs are classified into functional groups according to their known mode of action and molecular specificity.


IMP analysis. The quantitative analysis of IMPs is the first stage of the Oncobox operation pipeline. To this end, the Oncobox system uses an algorithm for molecular pathway analysis described in detail in [25]. The analysis is performed using the molecular pathway database with automatically annotated functional roles of the individual gene products that participate in each pathway. For the Oncobox system, five types of functional roles are introduced for gene products: pathway activator, repressor, rather activator, rather repressor, and gene product with an uncertain or inconsistent role. Each functional role corresponds to a specific activator/repressor role (ARR) coefficient needed to calculate PAL. Automatic annotation of the functional roles of gene products from the molecular pathway database is described in [25]. The gene products included in the molecular pathway database therefore have assigned ARR values representing their functional significance in the given molecular pathway. Both published and user-defined molecular pathway catalogues can be used to generate the molecular pathway database. The published catalogues include collections such as BioCarta, KEGG, NCI, Reactome, and Pathway Central [25]. To be integrated into the Oncobox system, each molecular pathway database should include the following information:

• Unique identifiers for all genes involved in the curated IMP.

• ARR of each relevant gene product in every curated IMP: role of activator, repressor, neutral role, or roles of interim activator or repressor.

The basic algorithm for molecular pathway activation analysis is described in detail in [42]. Here we will only mention the basic Oncobox system formula for calculating the pathway activation level of an IMP:

$$\mathrm{PAL}_p = \frac{\sum_n \mathrm{NII}_{np} \cdot \mathrm{ARR}_{np} \cdot \ln \mathrm{CNR}_n}{\sum_n \left| \mathrm{ARR}_{np} \right|},$$

where PAL_p is the activation level of molecular pathway p; CNR_n (case-to-normal ratio) is the ratio of the concentration of the product of protein-encoding gene n in the test sample to that in the norm (average value in the control group); ln is the natural logarithm; NII_np is the index of assignment of gene product n to pathway p, equal to 1 for gene products included in the pathway and to 0 for gene products not included in it; the discrete value ARR_np (activator/repressor role) is deposited into the molecular pathway database and determined for gene n in pathway p as follows:

$$\mathrm{ARR}_{np} = \begin{cases} -1, & \text{protein } n \text{ is a signal repressor in pathway } p \\ -0.5, & \text{protein } n \text{ is rather a signal repressor in pathway } p \\ 0, & \text{unclear repressor or activator role in pathway } p \\ 0.5, & \text{protein } n \text{ is rather a signal activator in pathway } p \\ 1, & \text{protein } n \text{ is a signal activator in pathway } p \end{cases}$$

3.3 Balanced Efficiency Score (BES) Calculation for Target Drugs
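The PAL formula given above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Oncobox implementation: the function names, the role labels used as dictionary keys, and the decision to skip pathway members that have no measurement are all hypothetical choices of this sketch. Only the ARR values and the formula itself are taken from the text.

```python
import math

# ARR coefficients for the five functional roles (values from the ARR definition).
# The role labels themselves are hypothetical names chosen for this sketch.
ARR_BY_ROLE = {
    "repressor": -1.0,
    "rather_repressor": -0.5,
    "unclear": 0.0,
    "rather_activator": 0.5,
    "activator": 1.0,
}


def case_to_normal_ratio(case_value, normal_values):
    """CNR_n: expression in the test sample divided by the mean of the controls."""
    return case_value / (sum(normal_values) / len(normal_values))


def pathway_activation_level(cnr, pathway_roles):
    """PAL_p = sum_n NII_np * ARR_np * ln(CNR_n) / sum_n |ARR_np|.

    cnr: dict mapping gene -> CNR_n.
    pathway_roles: dict mapping gene -> role label, for members of pathway p only.
    Genes that are not keys of pathway_roles have NII_np = 0 and never enter the sums.
    Assumption of this sketch: members without a measurement are skipped entirely.
    """
    numerator = 0.0
    denominator = 0.0
    for gene, role in pathway_roles.items():
        if gene not in cnr:
            continue
        arr = ARR_BY_ROLE[role]
        numerator += arr * math.log(cnr[gene])
        denominator += abs(arr)
    # A pathway with no informative members gets PAL = 0 (also an assumption).
    return numerator / denominator if denominator else 0.0
```

With this sign convention, an upregulated activator and a downregulated repressor both push PAL in the positive direction, while gene products with an unclear role (ARR = 0) contribute nothing to either sum.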

In the current version of the Oncobox system (January 2019), the term “target drug” is limited to the 16 classes of drugs listed in Table 1. Drug classes 8, 9, 10, 14, and 15 in Table 1 are immunoglobulins (antibody-based), while all other drugs in Table 1 are low-molecular-mass chemical compounds (small molecules). Information from drug manufacturers, as well as scientific publications in specialized journals, can be used to create the database of molecular targets. In the Oncobox system, the database includes the following information for each drug:

1. Unique identifier of the drug.
2. Unique identifiers of the gene products that are molecular targets of this drug.
3. Drug type by mode of action (according to Table 1).

The Oncobox platform uses the BES parameter as a measure of efficiency for each target drug. The data on molecular pathway activity in a test sample and the data on expression levels of the gene products targeted by a given drug are used simultaneously for the BES calculation:

$$\mathrm{BES}_d = a \cdot \mathrm{DESMP}_d + b \cdot \mathrm{DESTG}_d,$$

where d is the target drug under investigation; a and b are weight coefficients ranging from 0 to 1.5 depending on the type of target drug d (Table 1); DESMP_d (Drug Efficiency Score for Molecular Pathways) is the efficiency index of drug d calculated from the activity levels of the molecular pathways containing molecular targets of drug d; DESTG_d (Drug Efficiency Score for Target Genes) is the efficiency index of drug d calculated from the expression levels of the individual gene products that are molecular targets of the drug.

To calculate DESMP, the following formula is used:


Table 1 Specific weight coefficients a and b for 16 classes of ATDs

No. | Type | a | b | ATD class mechanism
1 | Nibs | 0.5 | 0.5 | Low-molecular weight tyrosine kinase inhibitors
2 | Nibs* | 0.5 | 0.5 | Nibs being active only in case of specific diagnostic mutations
3 | Hormones | 0.5 | 0.5 | Binding with hormone receptors
4 | Anti-hormones | 0.5 | 0.5 | Reducing the level of hormone production or sensitivity to hormones
5 | Retinoids | 0.5 | 0.5 | Binding with retinoic acid receptors
6 | Rapalogs | 0.5 | 0.5 | Rapamycin analogues; blocking the MTOR signaling
7 | Mibs | 0.5 | 0.5 | Proteasome blocking agents
8 | VEGF blocking agents | 0 | 1 | Antibodies inhibiting VEGF molecules in the blood flow
9 | Mabs | 0 | 1 | Monoclonal antibodies binding with proteins on cell surface
10 | Killermabs | 0 | 1.5 | Antibodies covalently linked to small molecules (toxins), killing cells when directly binding with them
11 | Tubulin blocking agents | 0 | 1 | Blocking the microtubule homeostasis in proliferating cells
12 | HDAC inhibitors | 0 | 1 | Inhibition of histone deacetylases
13 | Alkylating agents | 0 | 1 | Genotoxic alkylation of DNA in proliferating cells
14 | Immunotherapeutic drugs, type 1 | 0.5 | 0.5 | Monoclonal antibodies blocking immunosuppression by binding with T-cell surface receptors
15 | Immunotherapeutic drugs, type 2 | 0.5 | 0.5 | Monoclonal antibodies blocking immunosuppression by binding with ligands of T-cell receptors
16 | PARP blocking agents | 0.5 | 0.5 | Inhibition of poly-ADP ribose polymerase and blocking DNA repair

$$\mathrm{DESMP}_d = \sum_t \mathrm{DTI}_{d,t} \sum_p \mathrm{PAL}_p \cdot \mathrm{AMCF}_p \cdot \mathrm{NII}_{t,p},$$

where d is the unique identifier of the target drug; t is the unique identifier of a gene product that is a target of drug d; p is the unique identifier of a signaling pathway; PAL_p is the activation strength of molecular pathway p; the discrete value AMCF (activation-to-mitosis conversion factor) is determined as follows:

$$\mathrm{AMCF}_p = \begin{cases} 1, & \text{activation of pathway } p \text{ facilitates cell survival, growth and division} \\ 0, & \text{no data, or conflicting data, on whether activation of pathway } p \text{ facilitates cell survival, growth and division} \\ -1, & \text{activation of pathway } p \text{ prevents cell survival, growth and division} \end{cases}$$

The discrete value DTI (drug-target index) is defined as follows:

$$\mathrm{DTI}_{d,t} = \begin{cases} 0, & \text{drug } d \text{ does not affect gene product } t \\ 1, & \text{drug } d \text{ affects gene product } t \end{cases}$$

The discrete value NII (node involvement index) is defined as follows:

$$\mathrm{NII}_{t,p} = \begin{cases} 0, & \text{there is no gene product } t \text{ in pathway } p \\ 1, & \text{there is gene product } t \text{ in pathway } p \end{cases}$$

To calculate DESTG, the following formula is used:

$$\mathrm{DESTG}_d = \sum_t \mathrm{DTI}_{d,t} \sum_p \ln \mathrm{CNR}_t \cdot \mathrm{ARR}_{t,p} \cdot \mathrm{AMCF}_p \cdot \mathrm{NII}_{t,p},$$

where d is the unique identifier of the target drug; t is the unique identifier of a gene product that is a molecular target of drug d; p is the unique identifier of a signalling pathway; CNR_t (case-to-normal ratio) is the ratio of the expression level of protein-coding gene t in the test sample to the norm (averaged expression level for a control group); ln is the natural logarithm; DTI_{d,t}, AMCF_p, and NII_{t,p} are defined as above; the discrete value ARR_{t,p} (activator/repressor role) is defined for gene product t in pathway p as follows and deposited into the molecular pathway database:

$$\mathrm{ARR}_{t,p} = \begin{cases} -1, & \text{gene product } t \text{ is a repressor of pathway } p \\ -0.5, & \text{gene product } t \text{ is rather a repressor of pathway } p \\ 0, & \text{activator/repressor role of gene product } t \text{ in pathway } p \text{ is unclear or unknown} \\ 0.5, & \text{gene product } t \text{ is rather an activator of pathway } p \\ 1, & \text{gene product } t \text{ is an activator of pathway } p \end{cases}$$

To calculate the Balanced Efficiency Score (BES) for drug d, weight coefficients a and b are used, which differ depending on the drug type. Values of the coefficients are given in Table 1.
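The DESMP, DESTG, and BES definitions above can be illustrated with a short Python sketch. It is a hedged example, not the Oncobox code: all function names, the dictionary-based data layout, and the class labels used as keys are assumptions made for illustration. The weight pairs shown are a subset of the classes copied from Table 1. Pathway membership is encoded by the presence of a gene product in a pathway's dictionary (NII = 1), and the set of drug targets encodes DTI = 1.

```python
import math

# (a, b) weights for a few drug classes, copied from Table 1.
# The dictionary keys are hypothetical labels for this sketch.
WEIGHTS = {
    "nib": (0.5, 0.5),
    "mab": (0.0, 1.0),
    "killermab": (0.0, 1.5),
}


def des_mp(targets, pathways, pal, amcf):
    """DESMP_d = sum_t DTI_dt * sum_p PAL_p * AMCF_p * NII_tp.

    targets: set of gene products affected by drug d (DTI = 1 for these).
    pathways: dict pathway -> dict gene product -> ARR (membership means NII = 1).
    pal, amcf: dicts pathway -> PAL_p and pathway -> AMCF_p.
    """
    return sum(
        pal[p] * amcf[p]
        for t in targets
        for p, members in pathways.items()
        if t in members
    )


def des_tg(targets, pathways, cnr, amcf):
    """DESTG_d = sum_t DTI_dt * sum_p ln(CNR_t) * ARR_tp * AMCF_p * NII_tp."""
    return sum(
        math.log(cnr[t]) * members[t] * amcf[p]
        for t in targets
        for p, members in pathways.items()
        if t in members
    )


def balanced_efficiency_score(drug_class, targets, pathways, pal, cnr, amcf):
    """BES_d = a * DESMP_d + b * DESTG_d, with (a, b) chosen by drug class."""
    a, b = WEIGHTS[drug_class]
    return a * des_mp(targets, pathways, pal, amcf) + b * des_tg(targets, pathways, cnr, amcf)


def recommend(bes_by_drug):
    """Drugs with BES > 0, sorted from the highest to the lowest score."""
    positive = [(drug, score) for drug, score in bes_by_drug.items() if score > 0]
    return sorted(positive, key=lambda item: item[1], reverse=True)
```

Because a = 0 for several antibody and cytotoxic classes, their BES in this sketch depends only on the expression of the direct targets, which mirrors the reasoning given for Table 1.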


For low-molecular-weight tyrosine kinase inhibitors (nibs), both weight coefficients are equal to 0.5, representing equal significance of target molecular pathway activation and target gene expression levels in the pathological tissue sample tested. This reflects the capability of nibs to block their molecular targets, thus inhibiting their activities, and to modulate cell signaling via the related molecular pathways. For hormones, both weight coefficients are equal to 0.5 because they activate, rather than inhibit, their molecular targets and act accordingly on their target molecular pathways. For antihormones, both coefficients are again equal to 0.5 because of their inhibitory effect on their molecular targets, hormone products, and the respective molecular pathways. For retinoids, both coefficients are equal to 0.5 because these drugs bind retinoic acid receptors and activate a number of dependent molecular pathways. For rapalogs (rapamycin analogs), both coefficients are equal to 0.5 because they exert their inhibitory effect by directly binding their molecular targets and act accordingly on the relevant molecular pathways. For mibs (proteasome inhibitors), both coefficients are equal to 0.5 because these drugs exert their inhibitory effect when binding their molecular targets and act accordingly on the relevant molecular pathways and proteasome signaling. For VEGF blocking agents, a = 0 and b = 1 because these drugs directly block VEGF molecules in the blood flow without binding molecular targets inside the cell or on the cell surface and, therefore, do not directly affect intracellular signaling.
For monoclonal antibodies that bind their molecular targets on the cell surface (mabs), a = 0 and b = 1, because their main mode of action is activation of an immune cytotoxic response against cells carrying bound mab molecules on their surface and does not involve strong modulation of signaling through molecular pathways. Killermabs are antibodies against molecular targets on the cell surface that are chemically linked to cytotoxic agents. When binding their targets on the cell surface, killermabs kill these cells, a therapeutic mechanism not related to intracellular molecular pathway activation. For them, a = 0 and b = 1.5; the increased coefficient b represents the inherently high cytotoxic activity of these drugs. For drugs blocking de novo tubulin polymerization, a = 0 and b = 1; this represents the indefinite function of many of the pathways targeted by these drugs in cell survival and proliferation, as well as their direct inhibitory effect on their molecular targets. The same coefficients are set for histone deacetylase inhibitors for the same reasons concerning their mechanism of action. For DNA-alkylating agents, a = 0 and b = 1, reflecting the indefinite functions of the majority of targeted pathways in cell survival and proliferation, as well as the direct inhibitory effect of these drugs on the DNA repair proteins that target alkylated DNA (reflected by the coefficient b = 1). For immunotherapeutic drugs, both coefficients are equal to 0.5 because their effect depends on the availability of both direct molecular targets and molecular pathway activation profiles related to tumor infiltration with lymphocytes. Similarly, poly-ADP ribose polymerase blocking drugs inhibit DNA repair and depend both on the availability of direct molecular targets and on the activities of the targeted molecular pathways. This is reflected by both coefficients a and b being equal to 0.5.

The Oncobox system makes it possible to rank the efficiencies of anticancer medicinal products belonging to 16 different classes (Table 1). Medicinal products are classified according to their known modes of action and molecular specificities. The Balanced Efficiency Score (BES) is then calculated in different ways for the different classes of anticancer drugs (Table 1). Finally, according to the BES values, a personalized rating of target anticancer medicinal products is built for the test biosample, for example one taken from an oncological patient; medicinal products with a positive BES value (BES > 0) can be recommended.

3.4 Examples

The practical applications of the Oncobox system can be illustrated by the following examples.

Example 1: Calculation of an oncological drug activity rating for an individual cancer patient based on mRNA expression data. A rating of potentially efficient target drugs was built for a 72-year-old female patient with histologically distinctive, moderately differentiated intrahepatic cholangiocarcinoma (Fig. 1). The patient was diagnosed in October 2015 with the following symptoms: moderate body weight loss, pain in the right hypochondrium, loss of appetite, and asthenia, with a Karnofsky index of 70%. Magnetic resonance imaging (MRI) confirmed the diagnosis. The tumor was not surgically excised because of the advanced stage, several intrahepatic masses, and lung metastasis. Initially, the patient received treatment that was considered the best clinical practice: four courses of chemotherapy (two courses of gemcitabine combined with capecitabine, then two courses of gemcitabine combined with cisplatin), until May 2016. The treatment was ineffective: the tumor size increased according to MRI, and additional metastatic tumors appeared in the left and right lobes and spread to the bile duct and gall bladder. The Karnofsky index decreased to 60%. Because the patient did not respond to treatment, an extended molecular analysis of the tumor was performed using the Oncobox system to identify alternative treatment options [46]. First, total RNA was isolated from the tumor sample (FFPE tissue block) and used to measure expression levels of 2163 genes with CustomArray Inc. (USA) equipment using the microarray hybridization method as described in [27, 46]. These 2163 genes participate in major IMPs associated with cancer and also act as molecular targets of anticancer drugs. Liver samples without pathological characteristics were taken from healthy donors and used as normal tissue controls. Using the Oncobox algorithm, the rating of target drugs was formed according to the BES values obtained (Table 2). In accordance with the Oncobox test results, in May 2016 the patient received the ATD tyrosine kinase inhibitor Sorafenib. In October 2016, MRI detected moderate tumor development corresponding to stable disease according to the RECIST classification. Furthermore, following treatment with Sorafenib, the pain in the right hypochondrium was eliminated. MRI in January 2017 detected tumor progression and additional nodes in the right lung. Therefore, the period until tumor progression was about 6 months. The following adverse effects were registered: reddening, edema, and pain in the palms of the hands and the soles of the feet. The physician decided to change the treatment and prescribe Pazopanib, another tyrosine kinase inhibitor ATD recommended according to the Oncobox test results. Treatment with Pazopanib started in January 2017. The control MRI in June 2017 showed moderate tumor development. The change in treatment resolved the adverse effects of Sorafenib and generally improved the patient's quality of life. In October 2017, the patient was alive and physically active, with a Karnofsky index of ~100% [46].

Fig. 1 Hematoxylin and eosin staining of FFPE slice of tumor biopsy indicates moderately differentiated intrahepatic cholangiocarcinoma


Table 2 Rating of the most efficient predicted target drugs for the patient with cholangiocarcinoma according to Oncobox test results

Position in ATD rating | ATD | Balanced Efficiency Score (BES)
1 | Regorafenib | 128.5
2 | Sorafenib | 82.7
3 | Sunitinib | 79.2
4 | Pazopanib | 69.1
5 | Axitinib | 55.1
6 | Vandetanib | 33.6
7 | Cabozantinib | 25.4
8 | Imatinib | 21.4
9 | Aflibercept | 20.8
10 | Dasatinib | 15.5

Example 2: Comparison of different drug scoring methods: Oncobox versus previously published approaches. The Oncobox BES values were compared with the drug scores (referred to as DS1) calculated for the ATDs using the previously published method [29, 47–50]. Target drug efficiency was determined from mRNA gene expression profiles of cancer patients from the open TCGA (The Cancer Genome Atlas) database, and the calculated efficiency scores for different drugs were compared with their known clinical statuses. The TCGA database (https://cancergenome.nih.gov/) includes multiple gene expression profiles of different cancer tissues. Using the Oncobox system, BES values were calculated for patients with 11 cancer types (Table 3). For the same group of gene expression profiles, DS1 values were also calculated according to the alternative method published in [29]. In total, 128 ATDs were ranked using the alternative approaches. Ratings of the top 10% of drugs were then built by BES or DS1 values, and the resulting lists were compared between cancer types by Jaccard index values. The graph summarizing the pairwise comparison of all cancer types for DS1 is shown in Fig. 2. According to DS1, all drugs in the top 10% of the rating were identical for the different cancer types, which have very different clinically approved treatments. This clearly suggests that DS1 is hardly applicable to personalized prescription of ATDs to oncological patients, as most patients received identical predictions of ATD efficiencies.
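The comparison of top 10% drug lists by Jaccard index can be sketched as follows. This is an illustrative example only: the function names are hypothetical, and the text does not specify how per-patient scores were aggregated into one list per cancer type, so the sketch simply assumes that one score per drug is already available for each cancer type.

```python
from itertools import combinations


def top_fraction(scores, fraction=0.10):
    """Set of top-scoring drugs; scores maps drug -> score (BES or DS1)."""
    k = max(1, round(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


def jaccard_index(a, b):
    """|A ∩ B| / |A ∪ B|; two empty sets give 1.0 (a convention of this sketch)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pairwise_jaccard(scores_by_cancer, fraction=0.10):
    """Jaccard index of the top-fraction drug lists for every pair of cancer types."""
    tops = {cancer: top_fraction(s, fraction) for cancer, s in scores_by_cancer.items()}
    return {
        (c1, c2): jaccard_index(tops[c1], tops[c2])
        for c1, c2 in combinations(tops, 2)
    }
```

A value of 1 for every pair corresponds to the situation described for DS1, where the top lists were identical across cancer types; lower values indicate ratings that differ between cancer types.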


Table 3 Statistics of the analyzed transcriptomic profiles from the TCGA database by cancer type

Tumor type | Number of transcriptomic profiles
Brain tumors | 703
Breast cancer | 1102
Cervical cancers | 306
Colorectal cancer | 382
Esophageal cancer | 185
Kidney cancer | 891
Liver cancer | 374
Lung cancer | 1019
Ovarian cancer | 309
Prostate cancer | 498
Stomach cancer | 415

In contrast, in the BES ranking of ATDs the top 10% of drugs varied significantly between the different cancer types (Fig. 3), reflecting the apparently different clinical efficiencies of ATDs in different cancer types. According to this test, the BES rating therefore met the criterion of personalized ATD prescription. It was further investigated whether the alternative ATD scorings matched the clinical statuses of these drugs. To this end, information about the clinical trial stages of the 128 ATDs for cervical carcinoma was extracted from the clinical trials database available at clinicaltrials.gov. Every ATD received a clinical efficiency coefficient varying from 0 to 1 according to the following rule: 1, the drug is clinically accepted for cervical carcinoma; 0.85, phase 3 of clinical trials completed; 0.7, phase 3 of clinical trials in progress; 0.4, phase 2 of clinical trials completed; 0.3, phase 1 of clinical trials completed; and 0, no clinical trials data available. Based on these metrics, it is possible to calculate the extent to which the personalized rating matches the clinical status of the drugs for each patient's tumor tissue. When the top of the rating includes drugs that have advanced through the clinical trials pipeline or have even been recommended for the particular cancer type, and the bottom of the rating has a lower proportion of such drugs, the rating reflects the real success rates of the ATDs. To formalize this principle, the following formula was used:

$$A = \frac{1}{n_{\mathrm{drugs}}} \sum_{i=1}^{n_{\mathrm{drugs}}} E_i \cdot \mathrm{rank}(DS_i) - 0.5 \sum_{i=1}^{n_{\mathrm{drugs}}} E_i,$$

where E_i is the clinical status coefficient of medicinal product i; DS is the efficiency score of a target drug (e.g., BES or DS1); n_drugs is the number of ATDs (in this example, 128). The resulting index A, called the Anubis coefficient, assesses whether the calculated drug efficiency scores match the clinical statuses of the drugs. Figure 4 shows the dependence of the BES rating of a drug on its clinical status for a patient with cervical carcinoma. The drugs that passed through later stages of clinical trials were more abundant in the top-half positions of the rating. The Anubis coefficient calculated for this case was 14.83. The Anubis coefficients were next calculated for all patients with cervical carcinoma using the BES and DS1 metrics; the density functions are shown in Fig. 5. The results demonstrated that the Anubis coefficients were significantly higher for BES than for DS1. This suggests that BES matches the clinical status of drugs better than DS1.

Fig. 2 Jaccard index plot comparison of top 10% lists of drugs appearing in the DS1 ratings for eleven investigated cancer types. Color depth indicates value of Jaccard index, see color scale

Fig. 3 Jaccard index plot comparison of top 10% lists of drugs appearing in the Oncobox BES ratings for eleven investigated cancer types. Color depth indicates value of Jaccard index, see color scale
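The Anubis coefficient described above can be sketched in Python. This is a minimal illustration under stated assumptions: the function name and the status labels are hypothetical, and the direction of rank() is not spelled out in the text, so the sketch assumes that the lowest score receives rank 1 and the highest score receives rank n_drugs. Under that assumption a rating that places clinically advanced drugs at the top yields a positive coefficient, and a random rating gives a value near zero. The clinical status coefficients are those listed in the text.

```python
# Clinical status coefficients E_i as listed in the text.
# The dictionary keys are hypothetical labels for this sketch.
CLINICAL_STATUS = {
    "approved": 1.0,
    "phase3_completed": 0.85,
    "phase3_in_progress": 0.7,
    "phase2_completed": 0.4,
    "phase1_completed": 0.3,
    "no_data": 0.0,
}


def anubis_coefficient(scores, status):
    """A = (1 / n_drugs) * sum_i E_i * rank(DS_i) - 0.5 * sum_i E_i.

    scores: dict drug -> efficiency score (e.g., BES or DS1).
    status: dict drug -> key of CLINICAL_STATUS.
    Assumption: rank 1 is the lowest score, rank n_drugs the highest;
    ties keep their input order because sorted() is stable.
    """
    n_drugs = len(scores)
    ordered = sorted(scores, key=scores.get)  # ascending by score
    rank = {drug: position + 1 for position, drug in enumerate(ordered)}
    e = {drug: CLINICAL_STATUS[status[drug]] for drug in scores}
    weighted_rank_sum = sum(e[drug] * rank[drug] for drug in scores)
    return weighted_rank_sum / n_drugs - 0.5 * sum(e.values())
```

Calling the function once per patient with BES values and once with DS1 values gives two distributions of coefficients that can be compared, as done for the cervical carcinoma profiles.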

4 Summary

The capacity of the Oncobox drug scoring algorithm was tested using both published and original gene expression data obtained using RNA sequencing platforms (Illumina HiSeq, MiSeq, Ion Torrent/Proton) or microarrays (Illumina HT-12v4, Affymetrix HG-U133_Plus_2, and others) [5]. The method was efficient in discriminating responders from nonresponders to treatment in both retrospective and prospective investigations [5].


Fig. 4 Comparison of the BES-based Oncobox target drugs rating and clinical status of the same drugs according to the clinicaltrials.gov database (August 2017). The Oncobox drug scoring was based on high throughput transcriptomic profile of a patient with cervical carcinoma extracted from TCGA database. Green color labels drugs appearing in the top half of the rating, red color—drugs in the bottom half of the rating

Fig. 5 Density plot of Anubis coefficients calculated by BES (upper panel) or DS1a (lower panel) for 306 patients with cervical carcinoma. Transparent histograms show densities of Anubis coefficients for randomized clinical statuses of drugs

The clinical benefit of using the Oncobox method was reported in some recent oncological case studies [46, 51]. Several clinical trials are ongoing to identify which groups of patients can benefit most from Oncobox tests (https://clinicaltrials.gov/ct2/show/NCT03521245?term=oncobox&rank=1; https://clinicaltrials.gov/ct2/show/NCT03724097?term=oncobox&rank=2). So far, there have been only a few first attempts at translating these approaches to clinical oncology. Theoretical considerations suggest that the shorter the time between taking the tumor biopsy and molecular testing, the better the results should be. The same applies to the number of lines of therapy given in the interval between biopsy and molecular testing. Several successive lines of treatment can dramatically change the molecular landscape of the tumor due to either its heterogeneity or purifying selection of resistant clones [52, 53]. Addressing the high heterogeneity of certain tumors with adequate technical solutions may in the future dramatically increase the effectiveness of cancer therapy. Further laboratory and clinical studies are needed to investigate the applicability and success rate of gene expression and molecular pathway-based approaches for personalizing treatments for different human cancers [5]. Finally, all clinical applications of the Oncobox method so far have dealt with gene expression measured at the mRNA level. As it develops further, quantitative proteomics may one day replace transcriptomics as the major source of gene expression data. We believe that the Oncobox method will be effective for proteomics data as well, given the recent success in measuring molecular pathway activation using quantitative proteomics profiles [42].

Acknowledgments This study was supported by the Russian Science Foundation grant 18-15-00061. References 1. Hanna N, Einhorn LH (2014) Testicular cancer: a reflection on 50 years of discovery. J Clin Oncol 32:3085–3092 2. Oldenburg J, Aparicio J, Beyer J, CohnCedermark G, Cullen M, Gilligan T et al (2015) Personalizing, not patronizing: the case for patient autonomy by unbiased presentation of management options in stage I testicular cancer. Ann Oncol 26:833–838 3. Ahles TA, Saykin AJ, Furstenberg CT, Cole B, Mott LA, Titus-Ernstoff L et al (2005) Quality of life of long-term survivors of breast cancer and lymphoma treated with standard-dose chemotherapy or local therapy. J Clin Oncol 23:4399–4405 4. Kayl AE, Meyers CA (2006) Side-effects of chemotherapy and quality of life in ovarian

and breast cancer patients. Curr Opin Obstet Gynecol 18:24–28 5. Buzdin A, Sorokin M, Garazha A, Sekacheva M, Kim E, Zhukov N et al (2018) Molecular pathway activation – new type of biomarkers for tumor morphology and personalized selection of target drugs. Semin Cancer Biol 53:110–124 6. Druker BJ, Sawyers CL, Kantarjian H, Resta DJ, Reese SF, Ford JM et al (2001) Activity of a specific inhibitor of the BCR-ABL tyrosine kinase in the blast crisis of chronic myeloid leukemia and acute lymphoblastic leukemia with the Philadelphia chromosome. N Engl J Med 344:1038–1042 7. Druker BJ, Talpaz M, Resta DJ, Peng B, Buchdunger E, Ford JM et al (2001) Efficacy

254

Victor Tkachev et al.

and safety of a specific inhibitor of the BCR-ABL tyrosine kinase in chronic myeloid leukemia. N Engl J Med 344:1031–1037 8. Spirin P, Lebedev T, Orlova N, Morozov A, Poymenova N, Dmitriev SE et al (2017) Synergistic suppression of t(8;21)-positive leukemia cell growth by combining oridonin and MAPK1/ERK2 inhibitors. Oncotarget 8:56991–57002 9. Sjo¨stro¨m J (2002) Predictive factors for response to chemotherapy in advanced breast cancer. Acta Oncol 41:334–345 10. Aggarwal S (2010) Targeted cancer therapies. Nat Rev Drug Discov 9:427–428 11. Hudis CA (2007) Trastuzumab – mechanism of action and use in clinical practice. N Engl J Med 357:39–51 12. Nahta R, Esteva FJ (2007) Trastuzumab: triumphs and tribulations. Oncogene 26:3637–3643 13. Onitilo AA, Engel JM, Greenlee RT, Mukesh BN (2009) Breast cancer subtypes based on ER/PR and Her2 expression: comparison of clinicopathologic features and survival. Clin Med Res 7:4–13 14. Institute for Quality and Efficiency in Health Care (2014) Curation vs. palliation: an attempt to clarify terms. Institute for Quality and Efficiency in Health Care (IQWiG), Cologne 15. Gridelli C, De Marinis F, Di Maio M, Cortinovis D, Cappuzzo F, Mok T (2011) Gefitinib as first-line treatment for patients with advanced non-small-cell lung cancer with activating epidermal growth factor receptor mutation: Review of the evidence. Lung Cancer 71:249–257 16. Hornberger J, Cosler LE, Lyman GH (2005) Economic analysis of targeting chemotherapy using a 21-gene RT-PCR assay in lymph-nodenegative, estrogen-receptor-positive, earlystage breast cancer. Am J Manag Care 11:313–324 17. Le Tourneau C, Paoletti X, Servant N, Bie`che I, Gentien D, Rio Frio T et al (2014) Randomised proof-of-concept phase II trial comparing targeted therapy based on tumour molecular profiling vs conventional therapy in patients with refractory cancer: results of the feasibility part of the SHIVA trial. Br J Cancer 111:17–24 18. 
Martel CL, Lara PN (2003) Renal cell carcinoma: current status and future directions. Crit Rev Oncol Hematol 45:177–190 19. Poole RM (2014) Pembrolizumab: first global approval. Drugs 74:1973–1981 20. Russell K, Shunyakov L, Dicke KA, Maney T, Voss A (2014) A practical approach to aid

physician interpretation of clinically actionable predictive biomarker results in a multi-platform tumor profiling service. Front Pharmacol 5:76 21. Green DE, Jayakrishnan TT, Hwang M, Pappas SG, Gamblin TC, Turaga KK (2014) Immunohistochemistry - microarray analysis of patients with peritoneal metastases of appendiceal or colorectal origin. Front Surg 1:50 22. Popovtzer A, Sarfaty M, Limon D, Marshack G, Perlow E, Dvir A et al (2015) Metastatic salivary gland tumors: a singlecenter study demonstrating the feasibility and potential clinical benefit of molecular-profilingguided therapy. Biomed Res Int 2015:614845 23. Vigneswaran J, Tan Y-HC, Murgu SD, Won BM, Patton KA, Villaflor VM et al (2016) Comprehensive genetic testing identifies targetable genomic alterations in most patients with non-small cell lung cancer, specifically adenocarcinoma, single institute investigation. Oncotarget 7:18876–18886 24. Blagosklonny MV (2013) MTOR-driven quasi-programmed aging as a disposable soma theory: blind watchmaker vs. intelligent designer. Cell Cycle 12:1842–1847 25. Borisov N, Sorokin M, Garazha AV, Buzdin A (2019) Quantitation of molecular pathway activation using RNA sequencing data. In: Walker J (ed) Methods Molecular Biology. Springer, Heidelberg 26. Zolotovskaia M, Sorokin M, Garazha A, Borisov N, Buzdin A (2019) Molecular pathway analysis of mutation data for biomarkers discovery and scoring of target cancer drugs. In: Walker J (ed) Methods Molecular Biology. Springer, Heidelberg 27. Borisov N, Suntsova M, Sorokin M, Garazha A, Kovalchuk O, Aliper A et al (2017) Data aggregation at the level of molecular pathways improves stability of experimental transcriptomic and proteomic data. Cell Cycle 16:1810–1823 28. Buzdin AA, Zhavoronkov AA, Korzinkin MB, Roumiantsev SA, Aliper AM, Venkova LS et al (2014) The OncoFinder algorithm for minimizing the errors introduced by the highthroughput methods of transcriptome analysis. Front Mol Biosci 1:8 29. 
Artemov A, Aliper A, Korzinkin M, Lezhnina K, Jellen L, Zhukov N et al (2015) A method for predicting target drug efficiency in cancer based on the analysis of signaling pathway activation. Oncotarget 6:29347–29356 30. Zhu Q, Izumchenko E, Aliper AM, Makarev E, Paz K, Buzdin AA et al (2015) Pathway activation strength is a novel independent prognostic

biomarker for cetuximab sensitivity in colorectal cancer patients. Hum Genome Var 2:15009 31. Venkova L, Aliper A, Suntsova M, Kholodenko R, Shepelin D, Borisov N et al (2015) Combinatorial high-throughput experimental and bioinformatic approach identifies molecular pathways linked with the sensitivity to anticancer target drugs. Oncotarget 6:27227–27238 32. Buzdin AA, Prassolov V, Zhavoronkov AA, Borisov NM (2017) Bioinformatics meets biomedicine: OncoFinder, a quantitative approach for interrogating molecular pathways using gene expression data. Methods Mol Biol 1613:53–83 33. Buzdin A, Sorokin M, Glusker A, Garazha A, Poddubskaya E, Shirokorad V et al (2017) Activation of intracellular signaling pathways as a new type of biomarkers for selection of target anticancer drugs. J Clin Oncol 35:e23142–e23142 34. Zolotovskaia MA, Sorokin MI, Roumiantsev SA, Borisov NM, Buzdin AA (2018) Pathway instability is an effective new mutation-based type of cancer biomarkers. Front Oncol 8:658 35. Croft D, Mundo AF, Haw R, Milacic M, Weiser J, Wu G et al (2014) The Reactome pathway knowledgebase. Nucleic Acids Res 42:D472–D477 36. Schaefer CF, Anthony K, Krupa S, Buchoff J, Day M, Hannay T et al (2009) PID: the pathway interaction database. Nucleic Acids Res 37:D674–D679 37. Kanehisa M, Goto S (2000) KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res 28:27–30 38. Romero P, Wagg J, Green ML, Kaiser D, Krummenacker M, Karp PD (2005) Computational prediction of human metabolic pathways from the complete human genome. Genome Biol 6:R2 39. Nishimura D (2001) BioCarta. Biotech Software Internet Rep 2:117–120 40. Borisov N, Shabalina I, Tkachev V, Sorokin M, Garazha A, Pulin A et al (2019) Shambhala: a platform-agnostic data harmonizer for gene expression data. BMC Bioinformatics 20:66 41. Rudy J, Valafar F (2011) Empirical comparison of cross-platform normalization methods for gene expression data.
BMC Bioinformatics 12:467 42. Zolotovskaia MA, Sorokin MI, Emelianova AA, Borisov NM, Kuzmin DV, Borger P et al (2019) Pathway based analysis of mutation data is efficient for scoring target cancer drugs. Front Pharmacol 10:1


43. Law V, Knox C, Djoumbou Y, Jewison T, Guo AC, Liu Y et al (2014) DrugBank 4.0: shedding new light on drug metabolism. Nucleic Acids Res 42:D1091–D1097 44. Lamb J (2006) The connectivity map: using gene-expression signatures to connect small molecules, genes, and disease. Science 313:1929–1935 45. Wilkinson L (2011) ggplot2: elegant graphics for data analysis by WICKHAM, H. Biometrics 67:678–679 46. Poddubskaya EV, Baranova MP, Allina DO, Smirnov PY, Albert EA, Kirilchev AP et al (2018) Personalized prescription of tyrosine kinase inhibitors in unresectable metastatic cholangiocarcinoma. Exp Hematol Oncol 7:21 47. Borisov N, Buzdin AA, Zavoronkovs A, Aliper AM, Allina D, Kovalchuk O et al (2017) System, method, and software for improved drug efficacy and safety in a patient. US Patent US20170193176A1 48. Borisov N, Tkachev V, Suntsova M, Kovalchuk O, Zhavoronkov A, Muchnik I et al (2018) A method of gene expression data transfer from cell lines to cancer patients for machine-learning prediction of drug efficiency. Cell Cycle 17:486–491 49. Borisov N, Tkachev V, Buzdin A, Muchnik I (2018) Prediction of drug efficiency by transferring gene expression data from cell lines to cancer patients. In: Rozonoer L, Mirkin B, Muchnik I (eds) Braverman readings in machine learning. Key ideas from inception to current state. Springer, Cham, pp 201–212 50. Borisov N, Tkachev V, Muchnik I, Buzdin A (2017) Individual drug treatment prediction in oncology based on machine learning using cell culture gene expression data. ACM Press, New York, NY, pp 1–6 51. Poddubskaya E, Baranova M, Allina D, Sekacheva M, Makovskaia L, Kamashev D et al (2019) Personalized prescription of imatinib in recurrent granulosa cell tumor of the ovary: case report. Cold Spring Harb Mol Case Stud. https://doi.org/10.1101/mcs. a003434 52. Snyder V, Reed-Newman TC, Arnold L, Thomas SM, Anant S (2018) Cancer stem cell metabolism and potential therapeutic targets. Front Oncol 8:203 53. 
Zhang L, Zhang H, Ai H, Hu H, Li S, Zhao J et al (2018) Applications of machine learning methods in drug toxicity prediction. Curr Top Med Chem 18:987–997

Chapter 18

Global Characterization of Circulating Nucleic Acids

Marina Dunaeva and Ger J. M. Pruijn

Abstract

Circulating nucleic acids (CNAs) include genomic and mitochondrial DNA fragments, small RNAs, and bacterial and viral DNA/RNA. Different mechanisms, such as apoptosis, necrosis, and active CNA release from cells, have been proposed to result in nucleic acids in the circulation. Application of next generation sequencing technology has demonstrated that CNAs contain specific mutations, indels, microsatellite alterations, and epigenetic changes (DNA methylation) associated with various diseases. Their clinical implications have been demonstrated for diseases such as cancer, stroke, trauma, myocardial infarction, autoimmune disorders, and pregnancy-associated complications. Thus, CNAs in blood represent an attractive family of molecules that can serve as biomarkers, and the analysis of CNAs can be an alternative to immunohistochemical analyses of conventional biopsies. The methods described in this chapter provide details for circulating DNA and small RNA isolation, CNA(-derived cDNA) library preparation, and sequencing data analysis.

Key words DNA, Small RNA, Isolation, Sequencing, Data analysis, Biomarker

1 Introduction

It has been known for a long time that circulating nucleic acids (CNAs) such as cell-free DNA (cfDNA), cfRNA, and cell-free small RNA are present in human body fluids such as serum, plasma, saliva, milk, and urine. Various mechanisms for the release of CNAs into the circulation have been proposed. They include (programmed) cell death processes, like apoptosis and necrosis, and active release from cells. Circulating cfDNA (ccfDNA) comprises double-stranded DNA fragments in a complex with histones (nucleosomes). Histones protect ccfDNA in biofluids from nuclease digestion. The most common size of these complexes is between 160 and 170 bp. It has recently been demonstrated that the ccfDNA size distribution can indicate its source and distinguish tumor- and non-tumor-derived DNA fragments [1]. ccfDNA has been considered an intercellular messenger that is capable of traveling within an organism or

Kira Astakhova and Syeda Atia Bukhari (eds.), Nucleic Acid Detection and Structural Investigations: Methods and Protocols, Methods in Molecular Biology, vol. 2063, https://doi.org/10.1007/978-1-0716-0138-9_18, © Springer Science+Business Media, LLC, part of Springer Nature 2020


between organisms [2]. For example, genomic DNA was detected in extracellular vesicles released from prostasomes, prostate-derived organelles that occur in human seminal plasma [3]. ccfDNA levels in plasma/serum of healthy subjects are low (1–100 ng/mL). ccfDNA and genomic DNA show no differences in repeat and gene representation in healthy subjects, although ccfDNA has a higher proportion of short interspersed nuclear element sequences (SINEs) and a lower proportion of long interspersed nuclear elements (LINEs) [4]. Different pathological conditions, such as cancer or transplant rejection, are associated with changes in ccfDNA concentration and composition. Deep sequencing analysis of cDNA from human tumor tissues demonstrated the presence of tumor-shed DNA in the circulation and showed genomic concordance (gene mutations, microsatellite instability and methylation patterns) with cellular DNA from primary tumor cells or metastases. The presence of donor-derived ccfDNA in patients with organ transplantation has also been shown and the increased levels of this ccfDNA could be an indication of organ rejection. In addition to these pathological examples, the analysis of maternal ccfDNA in peripheral blood showed the presence of fetal shed DNA. This fetal genetic material could be used in prenatal screening for chromosomal aneuploidies. Currently, this approach is applied to detect certain fetal genetic defects such as Down syndrome. Thus, differences in genetic profiles of ccfDNA from contributing tissues may allow the identification of biomarkers for cancer development and progression, organ integrity or fetus pathology. Acute or chronic conditions such as stroke, cardiovascular diseases and autoimmune diseases are also associated with elevated levels of ccfDNA. Most probably, damaged tissues are the main sources of elevated ccfDNA levels. However, it is difficult to determine the contributing cells due to the lack of genetic differences. 
Nevertheless, some studies revealed that ccfDNA profiles can be used as potential biomarkers for these diseases. ccfDNA motifs observed in serum distinguished patients with multiple sclerosis from healthy controls [5]. Patients with systemic lupus erythematosus (SLE) showed plasma DNA aberrations, including differences in genomic representation and a significant elevation in the proportion of short DNA fragments [6]. Furthermore, Snyder et al. used a nucleosome footprinting strategy (based on ccfDNA fragmentation patterns) to demonstrate that short ccfDNA fragments harbor footprints of transcription factors and that these footprints can be used to infer the cell types that contribute to ccfDNA release [7]. This alternative “genotype-independent” strategy can broaden the clinical applications of ccfDNA in liquid biopsies. The use of circulating small RNAs as disease biomarkers is another “genotype-independent” approach. Circulating small RNAs include miRNAs, piRNAs, tRNA fragments (tRFs), Y RNAs,


small nucleolar RNAs (snoRNAs), small nuclear RNAs (snRNAs), and long noncoding RNAs (lncRNAs). They can all be found in circulating RNA–protein complexes, in association with Argonaute protein or high-density lipoprotein, or incorporated into exosomes or microparticles. miRNAs and tRFs are the best-characterized small RNAs. miRNAs are 19–23 nucleotides (nts) long and regulate cellular gene expression posttranscriptionally via RNA–RNA antisense binding. The multiple tRF types include 5′-tRFs, i-tRFs, 3′-tRFs, 5′-tRNA halves (5′-tRHs), and 3′-tRNA halves (3′-tRHs) [8]. They differ in the cleavage position of the mature or precursor tRNA transcript, and their length ranges from 18 to 36 nts. tRFs are constitutive components of some cell types and are upregulated during specific developmental stages, proliferation, stress, or viral infection. snoRNAs are another class of well-known small RNAs. They are involved in ribosomal RNA (rRNA) maturation [9]. Y RNAs are involved in the regulation of RNA stability, cellular responses to stress, and the initiation of chromosomal DNA replication [10].

Circulating nucleic acids have potential as surrogate markers for the development, progression, and recurrence of many diseases, since they reflect tissue damage or represent tumor-related nucleic acids. Real-time quantitative polymerase chain reaction (qPCR) is a widely used technique to quantify circulating cell-free nucleic acids in biofluids. However, next generation sequencing (NGS), a high-throughput DNA sequencing technology, has introduced new options to characterize genomic profiles and quantify CNAs and is currently used to detect molecular abnormalities in cancer patients. Here we provide details for CNA purification, library preparation, next generation sequencing, and data analysis for circulating nucleic acids in human serum/plasma.

2 Materials

All solutions have to be prepared with nuclease-free water. All purification steps are performed at room temperature.

2.1 Circulating Cell-Free DNA

2.1.1 Isolation of Circulating Cell-Free DNA

1. Serum/plasma samples.
2. QIAamp DNA Blood Mini kit (Qiagen).

2.1.2 ccfDNA Library Preparation

1. GenomePlex Complete Whole Genome Amplification (WGA) kit (WGA2, Sigma-Aldrich).
2. GenElute PCR Clean-up kit (Sigma-Aldrich).
3. QuantiFluor® dsDNA (Promega).
4. 5× TBE running buffer (total volume 1000 mL; 54 g of Tris base, 27.5 g of boric acid, 20 mL of 0.5 M EDTA, pH 8.0; adjust to pH 8.3 with HCl).
5. 10 mg/mL ethidium bromide (EtBr).
6. 0.8% agarose TBE gel (0.8 g agarose, 100 mL 1× TBE, 5 μL EtBr).
7. DNA gel loading dye (Thermo Fisher Scientific).
8. Gel electrophoresis equipment.
9. Power supply.
10. PCR markers (NEB).
11. Lambda/HindIII markers (Thermo Fisher Scientific).
12. TruSeq SBS kit v3-HS (FC-401-3001, 200 cycles, Illumina; SBS, sequencing by synthesis).

2.2 Circulating Cell-Free Small RNA

All reagents have to be RNase-free. RNA samples should be stored at −80 °C. The use of DNA/RNA LoBind microcentrifuge tubes (Eppendorf) diminishes adsorption of oligonucleotides and RNA molecules.

2.2.1 Small RNA Isolation

1. miRCURY RNA isolation kit-Biofluids (Exiqon).
2. >99.5% 2-Propanol.
3. RNase-free water.
4. Quant-iT RiboGreen RNA reagent (Thermo Fisher Scientific).
5. 96-well microtiter plates (Costar).

2.2.2 Small RNA Library Preparation

1. TruSeq Small RNA Library Prep Kit (Index 1–12, RS-200-0012).
2. 200,000 U/mL T4 RNA Ligase 2, Deletion Mutant (New England Biolabs).
3. 200 U/μL SuperScript II Reverse Transcriptase kit (Invitrogen).
4. 5× First Strand Buffer (250 mM Tris–HCl, pH 8.3; 375 mM KCl; 15 mM MgCl2; Invitrogen).
5. 100 mM DTT.

2.2.3 Gel Purification of the Amplified cDNA Libraries

1. 40% acrylamide–bisacrylamide solution (19:1, SERVA).
2. 5× TBE running buffer (total volume 1000 mL; 54 g of Tris base, 27.5 g of boric acid, 20 mL of 0.5 M EDTA, pH 8.0; adjust to pH 8.3 with HCl).
3. Ammonium persulfate: 10% solution in water.
4. N,N,N′,N′-tetramethylethylenediamine (TEMED).
5. 10 mg/mL ultra-pure ethidium bromide.
6. Gel loading dye Blue (6×, NEB).
7. High Resolution Ladder (Illumina).
8. 5 μm filter tubes (IST Engineering Inc).
9. Gel breaker tubes (IST Engineering Inc).
10. Razor blade.
11. 3 M sodium acetate, pH 5.2 (Thermo Scientific).
12. 10 mM Tris–HCl, pH 8.5.
13. 100% ethanol.

3 Methods

3.1 Circulating Cell-Free DNA

3.1.1 ccfDNA Isolation

Collect serum/plasma samples and store them at −80 °C prior to further processing. Thaw frozen serum/plasma samples at 4 °C and remove cell debris by brief centrifugation at 3000 × g for 5 min. Extract ccfDNA from 200 μL of serum/plasma using the QIAamp DNA Blood Mini kit (Qiagen) according to the manufacturer’s manual and elute with 50 μL of water (see Note 1).

3.1.2 ccfDNA Library Preparation for Next Generation Sequencing and Sequencing Analysis

Since human body fluids contain relatively small amounts of ccfDNA, an amplification step is required to increase the amount of DNA.

1. Use the GenomePlex Complete Whole Genome Amplification (WGA) kit to amplify isolated ccfDNA according to the manufacturer’s instructions (see Note 2). Omit the shearing step (see Note 3).
2. Use the GenElute PCR Clean-up kit to isolate the amplified DNA. Elute the DNA with 40 μL of water.
3. Check the quality of the amplified ccfDNA by loading 5–10% of the amplification reaction mixture on a 0.8% agarose gel. The DNA size should range from 100 to 1000 bp with a mean size of approximately 400 bp (Fig. 1).
4. Quantify the DNA samples, for example, using the QuantiFluor® dsDNA kit according to the manufacturer’s manual.
5. Construct sequencing libraries using 1 μg of amplified ccfDNA with the TruSeq SBS kit v3-HS (200 cycles), following the manufacturer’s guidelines.
6. Pool equimolar amounts of each library. Check the size distribution on the Agilent Technologies 2100 Bioanalyzer using the Agilent High Sensitivity DNA kit (Agilent Technologies) according to the manufacturer’s instructions.
7. Sequence the samples on an Illumina HiSeq2000, generating paired-end reads of 2 × 100 nucleotides.
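The equimolar pooling in step 6 rests on converting a mass concentration into a molar one. The following is a minimal sketch of that arithmetic, assuming the commonly used approximation of about 660 g/mol per base pair of double-stranded DNA. The function names and the example values are hypothetical illustrations and are not taken from the kit manuals.

```python
# Hypothetical helpers for equimolar pooling; not part of any kit manual.
AVG_BP_MASS = 660.0  # g/mol per base pair of dsDNA (common approximation)


def molarity_nM(conc_ng_per_ul, mean_size_bp):
    """Convert a dsDNA mass concentration (ng/uL) to molarity (nM)."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * mean_size_bp)


def pooling_volumes_ul(libraries, fmol_per_library):
    """Return the volume (uL) of each library that contains the same
    number of femtomoles.

    `libraries` maps a library name to (concentration in ng/uL,
    mean fragment size in bp).
    """
    volumes = {}
    for name, (conc, size_bp) in libraries.items():
        # 1 nM equals 1 fmol/uL, so fmol divided by nM gives uL
        volumes[name] = fmol_per_library / molarity_nM(conc, size_bp)
    return volumes


if __name__ == "__main__":
    # Illustrative concentrations and sizes only
    example = {"lib_A": (12.0, 400), "lib_B": (20.0, 450)}
    for name, vol in pooling_volumes_ul(example, fmol_per_library=50).items():
        print(f"{name}: {vol:.2f} uL")
```

The mean fragment size would typically come from the Bioanalyzer trace and the concentration from the fluorometric quantification in step 4.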


Fig. 1 Agarose gel analysis of amplified ccfDNA. The quality of the ccfDNA amplified with the WGA2 kit was analyzed by electrophoretic separation in a 0.8% agarose gel and staining with EtBr. As indicated in the kit manual, the DNA size ranges from 100 to 1000 bp, with a mean size around 400 bp. Lane 1, PCR markers (NEB); lane 2, Lambda/HindIII markers (Thermo Fisher Scientific); lane 3, amplified ccfDNA. The square bracket indicates the range of amplified ccfDNA

[Fig. 2 flowchart: Raw data (output: FASTQ files) → Quality control, detection of repetitive elements (output: Phred score, GC content, nucleotide distribution) → Alignment (output: SAM/BAM files) → Analysis of insert size distribution and genomic representation (output: disease characterization)]

Fig. 2 Workflow for ccfDNA sequencing data analysis. Pipeline for data analysis: raw reads are preprocessed (quality assessment, quality filtering, and adaptor trimming), followed by sequence read alignment against a reference genome

3.1.3 Bioinformatics Analyses

The pipeline for the analysis of ccfDNA sequencing is presented in Fig. 2. This pipeline includes quality control of raw data, preprocessing of the raw reads, sequence alignment/mapping, and alignment postprocessing.

Quality Control of Raw Data, Preprocessing of the Raw Reads

1. Use the Consensus Assessment of Sequence and Variation software package (CASAVA-1.8.2; http://gensoft.pasteur.fr/docs/casava/1.8.2/CASAVA_1_8_2_UG_15011196C.pdf; Illumina Inc.) to produce FASTQ files (see Note 4).
2. Conduct data quality control on raw FASTQ files using FastQC software (version 0.11.8; https://www.bioinformatics.babraham.ac.uk/projects/fastqc/). Analyze the following quality parameters: Phred score distribution per base and per sequence, GC content distribution (see Note 5), the nucleotide distribution, and duplication rate. Remove low-quality sequences before mapping.
3. Detect repetitive elements with the RepeatMasker software (http://www.repeatmasker.org).
4. To remove duplicate reads (reads originating from a single fragment of DNA) from a SAM or BAM file, use MarkDuplicates (Genome Analysis Toolkit, GATK; https://software.broadinstitute.org/gatk/documentation/tooldocs/4.0.4.0/picard_sam_markduplicates_MarkDuplicates.php).

Sequence Alignment/Mapping

Use Bowtie 2 (version 2.3.4.3; http://bowtie-bio.sourceforge.net/bowtie2/index.shtml) [11] to align the reads against the non-repeat-masked human reference genome GRCh38 (Genome Reference Consortium Human Build 38, hg38) with the following parameters: allow two mismatches; discard duplicates and reads that map to multiple locations. The output of this step is a file in Sequence Alignment Map (SAM) format that stores alignment data (see Note 6).

Alignment Postprocessing

1. Convert the SAM files into Binary Alignment Map (BAM) format using SAMtools [12]. This step is useful to save storage space (see Note 7).
2. Filter the BAM files based on, for example, mapping quality, read length, and the absence of mismatches.
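The filtering criteria named in the postprocessing step (mapping quality, read length, mismatches) can be illustrated on SAM text records. The sketch below uses only the Python standard library and relies on the standard SAM columns (FLAG, MAPQ, SEQ) and the standard optional `NM:i` edit-distance tag. The function name and all threshold values are hypothetical and should be adapted to the experiment; in practice, dedicated tools such as SAMtools would usually be applied to BAM files directly.

```python
def filter_sam(lines, min_mapq=30, min_len=20, max_len=1000, max_edit_distance=0):
    """Yield SAM header lines and the alignment lines that pass all filters.

    Thresholds are illustrative defaults, not values from the protocol.
    """
    for line in lines:
        if line.startswith("@"):
            # Header lines are passed through unchanged
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        flag, mapq, seq = int(fields[1]), int(fields[4]), fields[9]
        if flag & 0x4:
            # FLAG bit 0x4 marks an unmapped read
            continue
        if mapq < min_mapq:
            continue
        if not (min_len <= len(seq) <= max_len):
            continue
        # NM:i:<n> is the edit distance between the read and the reference
        nm = next((int(f[5:]) for f in fields[11:] if f.startswith("NM:i:")), None)
        if nm is None or nm > max_edit_distance:
            continue
        yield line


if __name__ == "__main__":
    record = "\t".join(
        ["read1", "0", "chr1", "100", "42", "30M", "*", "0", "0",
         "A" * 30, "I" * 30, "NM:i:0"]
    )
    for kept in filter_sam(["@HD\tVN:1.6", record]):
        print(kept)
```

Setting `max_edit_distance=0` keeps only reads without mismatches or indels; a larger value relaxes that criterion.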

Analysis of the ccfDNA Fragments

1. Subsequently, determine the insert size distribution. The insert size is the length of the sequence between the adapters. It can be extracted from a SAM file (see Note 8).
2. To analyze the distribution of ccfDNA along the genome, divide the entire genome into equal-size windows/bins (e.g., 1 Mb or 100 kb) [13]. Because of sequencing biases in GC-rich/GC-poor regions, correct the read counts of each window/bin for GC content using locally weighted scatterplot smoothing (LOESS model) with the LOESS R package [14]. Use z-score statistics to analyze whether the ccfDNA representation in each window differs from that of the reference (control) group.
3. Use GATK4 (https://software.broadinstitute.org/gatk/gatk4) and the dbSNP/1000 Genomes dataset (http://www.internationalgenome.org/category/dbsnp/) for SNP analysis.
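Steps 1 and 2 above can be sketched as follows. The example assumes paired-end SAM records, where the TLEN field (column 9) holds the signed fragment length, and it assumes that the per-bin values have already been GC-corrected and expressed as fractions of total reads. The LOESS correction itself is not shown. The function names are hypothetical, and the z-score is computed in its usual form: sample value minus control mean, divided by the control standard deviation.

```python
from collections import Counter
from statistics import mean, stdev


def insert_size_histogram(sam_lines):
    """Count fragment lengths from the TLEN field (column 9) of SAM records.

    Only positive TLEN values are used, so each read pair is counted once.
    """
    sizes = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        tlen = int(line.split("\t")[8])
        if tlen > 0:
            sizes[tlen] += 1
    return sizes


def bin_zscores(sample_fractions, control_fractions):
    """Return one z-score per genomic bin.

    `sample_fractions` lists the per-bin read fractions of the test sample.
    `control_fractions` is a list of such lists, one per control sample,
    in the same bin order. At least two controls with non-identical values
    per bin are needed, otherwise the standard deviation is undefined or zero.
    """
    zscores = []
    for i, value in enumerate(sample_fractions):
        reference = [control[i] for control in control_fractions]
        zscores.append((value - mean(reference)) / stdev(reference))
    return zscores
```

Bins with a large absolute z-score are candidates for over- or underrepresentation relative to the control group; the cutoff to apply is a study-specific choice.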


Data Visualization

Use the Integrative Genomics Viewer (http://software.broadinstitute.org/software/igv/home) as a visualization tool for data generated by the analyses described above [15] and Circos (http://circos.ca/intro/genomic_data/) to analyze similarities and differences between samples.

3.2 Circulating Cell-Free Small RNA

3.2.1 Small RNA Isolation

Use a 100 μL aliquot of each serum/plasma sample to extract small RNA with the miRCURY RNA isolation kit-Biofluids (Exiqon) according to the manufacturer’s manual (see Note 9). Elute the RNA with 25 μL of RNase-free water. Quantify the RNA yield with the Quant-iT RiboGreen RNA reagent using the low-range assay in a 200 μL total volume in a 96-well microtiter plate.

3.2.2 Small cfRNA Sequencing Library Preparation

Use the TruSeq Small RNA sample preparation kit (Illumina) to generate sequence libraries (https://support.illumina.com/content/dam/illumina-support/documents/documentation/chemistry_documentation/samplepreps_truseq/truseqsmallrna/truseq-small-rna-library-prep-kit-reference-guide-15004197-02.pdf; see Note 10). An example of amplified cDNA from small RNA is presented in Fig. 3.

3.2.3 Bioinformatics Analyses

The pipeline for the analysis of small RNA sequencing is presented in Fig. 4. This pipeline includes demultiplexing and generation of FASTQ files, adapter trimming and quality control of raw data, sequence alignment/mapping, and analysis of differences in small RNA levels and of putative miRNA targets.

1. Demultiplex all reads (see Note 11) and convert the raw Illumina reads to the FASTQ format for downstream analysis using CASAVA-1.8.2 software. Allow one mismatch in the index tags.
2. Trim the adapters from the raw reads with Cutadapt-1.3 software (https://cutadapt.readthedocs.io/en/stable/guide.html).
3. Apply FastQC and FASTX tools to check the read quality. Discard reads with a quality lower than Phred = 30, reads shorter than 18 nts (see Note 12), and reads with fewer than 5 counts (lower limit of detection). The FastQC output files also provide information about the presence of overrepresented sequences (k-mers), which usually represent adapters, poly(A) stretches, and contaminations. Trim the overrepresented sequences if they are adapters. The contaminating sequences will drop out during the alignment.
4. To identify human sequences in the libraries, map the reads to the human genome (hg38; UCSC Genome Browser) using the Bowtie 2 aligner, allowing up to two mismatches.
5. To annotate the sequences of small RNAs, use various reference databases such as miRBase (www.mirbase.org; release 21) for miRNAs, the genomic tRNA database GtRNAdb (http://gtrnadb.ucsc.edu) for tRNA-derived molecules, and databases containing hY RNA sequence information (http://rnacentral.org) for hY RNA-derived molecules. Allow up to two mismatches, although the error tolerance can be adjusted. Count each read only once even if it maps to multiple reference sequences (see Note 13).
6. Perform miRNA isoform analysis using isomiR detection software tools (see Note 14).

Fig. 3 Electrophoretic analysis of amplified cDNA from small RNA. Amplified cDNAs were separated by electrophoresis in a 6% nondenaturing polyacrylamide-TBE gel, followed by staining with EtBr. The positions of markers are depicted on the left side. Lane 1, High Resolution Ladder (HRL, Illumina); lane 2, serum small RNA; lane 3, HeLa cells.

Fig. 4 Pipeline to analyze small RNA sequencing data. These steps include raw data preprocessing, mapping to the human genome, small RNA identification, and analysis of differences in small RNA levels and of their putative targets

3.2.4 Detection of Differences in Circulating miRNA Levels Between Different Groups

To detect differences in the levels of circulating miRNAs between various groups and to perform multidimensional scaling (MDS) analysis, apply the R package edgeR in Bioconductor (www.bioconductor.org/packages/3.3/bioc/html/edgeR.html; version 3.12.0; see Note 15) [16]. Use the trimmed mean of M-values (TMM) method for normalization across samples. The output contains the MDS plot, a heat map of small RNA levels, p-values, and FDR values.
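The FDR values mentioned above are p-values adjusted for multiple testing. edgeR applies the Benjamini–Hochberg procedure by default when reporting them. The sketch below illustrates that adjustment in plain Python for readers who want to see how an FDR value relates to the raw p-values. It is not edgeR code, and the function name is a hypothetical choice.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-values decrease
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.005]  # illustrative p-values only
    for p, q in zip(raw, benjamini_hochberg(raw)):
        print(f"p = {p:.3f}  FDR = {q:.3f}")
```

A small RNA is then typically called differentially abundant when its FDR falls below a threshold chosen for the study.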

3.2.5 Mapping of Nonhuman Small RNA Sequences

Map nonhuman reads against the NCBI RefSeq collection of bacterial, viral, and fungal genomes (ftp://ftp.ncbi.nlm.nih.gov/genomes/refseq/).


3.2.6 miRNA Target Prediction

To identify a specific function of an miRNA, use an miRNA target prediction tool. A list of the different tools can be found in [17].

4 Notes

1. Several different kits for ccfDNA isolation have been developed. Their isolation efficiencies differ, and the fragment size should be considered an important factor when choosing an appropriate isolation kit.
2. To amplify DNA, techniques such as PCR-based amplification and multiple displacement amplification (MDA) have been developed. Keep in mind that both methods have strengths and weaknesses. GenomePlex is a whole genome amplification (WGA) method that generates a ~500-fold amplification.
3. Circulating cfDNA is highly fragmented, so additional treatment such as shearing prior to library construction is not necessary.
4. CASAVA is part of the software developed by Illumina for sequencing analysis.
5. It has been reported that GC-poor and shorter DNA molecules are preferentially amplified, leading to overrepresentation of these fragments in high-throughput sequencing data (GC bias), which affects the downstream analysis.
6. Many alignment tools have been developed for NGS analysis. The sequencing platform, the duration of the analysis, and the required memory should be considered when choosing an alignment tool.
7. Both SAM and BAM formats contain the same information, but a BAM file is the compressed binary version of a SAM file, which saves a significant amount of storage space.
8. Previous studies revealed that the molecular size distribution profiles of ccfDNA are altered in several clinical conditions such as pregnancy [18], malignancy [19], and systemic lupus erythematosus [13].
9. Different kits for the isolation of small RNA are available on the market. These kits use different methods for RNA extraction, such as organic extraction, spin column extraction, and magnetic bead extraction. All these methods have both advantages and disadvantages and affect RNA extraction yield and efficiency. Currently, no consensus exists as to which method and/or kit is the best.
10. The TruSeq Small RNA sample preparation kit (Illumina) provides a detailed protocol on how to prepare a small RNA library.


11. The process of demultiplexing includes splitting sequenced reads into groups according to their index tag, which corresponds to the identity of the samples.
12. Sequences shorter than 18 nts cannot be mapped to genomic loci with high confidence.
13. The Bowtie 2 aligner looks for as many alignments as possible with scores above the cutoff value for each read, and by default chooses one valid sequence alignment from the top hits (heuristic approach). It randomly selects the best alignment from those with equal scores. However, it is recommended to report all loci in case of multiple alignments with equal scores.
14. miRNA isomiRs are miRNA sequences that have variations in comparison with the canonical miRNA sequence [20]. Many different software tools are available for isomiR identification.
15. Alternatively, DESeq2 can be used to estimate the differences in small RNA levels. The conditions determining the choice for one of these two methods are described by Schurch et al. [21].

References

1. Mouliere F, Chandrananda D, Piskorz AM, Moore EK, Morris J, Ahlborn LB et al (2018) Enhanced detection of circulating tumor DNA by fragment size analysis. Sci Transl Med 10:466
2. Khier S, Lohan L (2018) Kinetics of circulating cell-free DNA for biomedical applications: critical appraisal of the literature. Future Sci OA 4:FSO295
3. Ronquist KG, Ronquist G, Carlsson L, Larsson A (2009) Human prostasomes contain chromosomal DNA. Prostate 69:737–743
4. Beck J, Urnovitz HB, Riggert J, Clerici M, Schütz E (2009) Profile of the circulating DNA in apparently healthy individuals. Clin Chem 55:730–738
5. Beck J, Urnovitz HB, Saresella M, Caputo D, Clerici M, Mitchell WM et al (2010) Serum DNA motifs predict disease and clinical status in multiple sclerosis. J Mol Diagn 12:312–319
6. Chan RW, Jiang P, Peng X, Tam LS, Liao GJ, Li EK et al (2014) Plasma DNA aberrations in systemic lupus erythematosus revealed by genomic and methylomic sequencing. Proc Natl Acad Sci U S A 111:E5302–E5311
7. Snyder MW, Kircher M, Hill AJ, Daza RM, Shendure J (2016) Cell-free DNA comprises an in vivo nucleosome footprint that informs its tissues-of-origin. Cell 164:57–68
8. Magee RG, Telonis AG, Loher P, Londin E, Rigoutsos I (2018) Profiles of miRNA isoforms and tRNA fragments in prostate cancer. Sci Rep 8:5314
9. Kiss T (2002) Small nucleolar RNAs: an abundant group of noncoding RNAs with diverse cellular functions. Cell 109:145–148
10. Kowalski MP, Krude T (2015) Functional roles of non-coding Y RNAs. Int J Biochem Cell Biol 66:20–29
11. Langmead B, Salzberg SL (2012) Fast gapped-read alignment with Bowtie 2. Nat Methods 9:357–359
12. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N et al (2009) The sequence alignment/map format and SAMtools. Bioinformatics 25:2078–2079
13. Chan RW, Jiang P, Peng X, Tam LS, Liao GJW et al (2014) Plasma DNA aberrations in systemic lupus erythematosus revealed by genomic and methylomic sequencing. Proc Natl Acad Sci U S A 111:E5302–E5311
14. Benjamini Y, Speed TP (2012) Summarizing and correcting the GC content bias in high-throughput sequencing. Nucleic Acids Res 40:e72
15. Robinson JT, Thorvaldsdóttir H, Winckler W, Guttman M, Lander ES, Getz G et al (2010) Integrative genomics viewer. Nat Biotechnol 29:24–26
16. Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26:139–140
17. Chen L, Heikkinen L, Wang C, Yang Y, Sun H, Wong G (2018) Trends in the development of miRNA bioinformatics tools. Brief Bioinform. https://doi.org/10.1093/bib/bby054
18. Chan KCA, Zhang J, Hui ABY, Wong N, Lau TK, Leung TN et al (2004) Size distributions of maternal and fetal DNA in maternal plasma. Clin Chem 50:88–92
19. Wang BG, Huang HY, Chen YC, Bristow RE, Kassauei K, Cheng CC et al (2003) Increased plasma DNA integrity in cancer patients. Cancer Res 63:3966–3968
20. Morin RD, O’Connor MD, Griffith M, Kuchenbauer F, Delaney A, Prabhu AL et al (2008) Application of massively parallel sequencing to microRNA profiling and discovery in human embryonic stem cells. Genome Res 18:610–621
21. Schurch NJ, Schofield P, Gierliński M, Cole C, Sherstnev A, Singh V et al (2016) How many biological replicates are needed in an RNA-seq experiment and which differential expression tool should you use. RNA 22:839–851

INDEX

A
ADs, see Autoimmune diseases (ADs)
AFM, see Atomic force microscopy (AFM)
Algorithm ..... 12, 134, 195–197, 199–201, 210, 218–223, 225, 239–241, 247, 251
Animal models ..... 158, 178, 202
Antibodies ..... v, 58, 59, 63, 66, 68, 70, 73–82, 141, 142, 146, 159, 165, 175, 176, 179, 208, 236, 245
Antibodies to double stranded DNA ..... 73
Anticancer drugs ..... 210, 212, 235–253
Antisense oligonucleotide (AON) ..... 119–137
Apolipoprotein H (ApoH) ..... 58–70
Atomic force microscopy (AFM) ..... 90, 92, 96
Autoantibodies ..... 58–70
Autoimmune diseases (ADs) ..... 58, 73, 75, 258

B
Biomark HD ..... 17, 19–24
Biosensors ..... 74, 75, 78, 191

C
Cancer DNA ..... 37–43
Circulating nucleic acids (CNAs) ..... 257–267
Cy5 ..... 122, 124, 125, 127, 129, 132, 136

D
Deoxyribonucleic acid (DNA) ..... v, 4, 17, 28, 37, 45, 58, 73, 87, 102, 119, 182, 190, 209, 237, 257
Detection probes ..... v, 38, 40–42
DNA-amphiphiles ..... 87–100
DNA nanostructure ..... 87, 88
DNA oncogene ..... 38
Double-stranded RNA viruses ..... 181–188
Dynamic array (DA) ..... 18, 20–22, 24
Dystrophic mouse models ..... 171–179

E
Electroporation ..... v, 157–168
Enzyme-linked immunosorbent assay (ELISA) ..... 58–60, 62, 63, 65–69, 74, 81
Excimer ..... 45, 53

F
Flow cytometry ..... 175
Fluorescence ..... v, 27, 34, 35, 37–43, 45, 47, 48, 51, 53, 74, 92, 101–116, 165, 175
Fluorescence imaging ..... 119–137
Fluorescent oligonucleotides ..... 37, 40

G
Gene expression ..... 17, 20, 21, 160, 167, 190–197, 202, 218, 237, 239, 240, 245, 248, 251, 253, 259

H
High-throughput ..... 17–24, 28, 181–188, 191, 196, 200, 202, 208, 209, 226, 230, 237, 239, 240, 252, 259, 266
High-throughput DNA sequencing ..... 259
HIV, type 1 ..... 28, 29, 34, 35

I
Imaging ..... v, vi, 89, 90, 92, 94–96, 100, 243
Influenza virus ..... 18, 19
Isothermal amplification ..... 3, 4

J
Joe, E. ..... 139–153

L
Library preparation ..... 181–188, 259–261, 264
Liposome ..... 101–116
Liposome fusion ..... 101–116
Low copy RNA ..... 27–35

M
Magnetic nanoparticles (MNPs) ..... 4, 5, 8–13
MicroRNA (miRNA) ..... 139–153, 192, 195–197, 199, 209, 237
miRNA, see MicroRNA
Mismatches ..... 38, 45–54, 136, 263, 265
Muscle derived stem cells ..... v, 171–179
Muscular dystrophies ..... 158, 172, 175
Mutations ..... 4, 28, 37, 45, 157, 208, 236, 258

Kira Astakhova and Syeda Atia Bukhari (eds.), Nucleic Acid Detection and Structural Investigations: Methods and Protocols, Methods in Molecular Biology, vol. 2063, https://doi.org/10.1007/978-1-0716-0138-9, © Springer Science+Business Media, LLC, part of Springer Nature 2020


O
Oligonucleotides ..... v, 8, 27, 37, 45, 88, 102, 119, 260
Oncobox ..... 191–196, 199–202, 210, 218, 226, 230, 235–253
Optomagnetic ..... 3–14

P
Perylene ..... 38
Phosphoramidites ..... 28, 30, 34, 88–93, 97, 98
Pre-amplification ..... 19–21, 23, 24, 150
Pyrene ..... 45–54

R
Rational design ..... 59
Real-time detection ..... 4, 5, 11–13
Real-time PCR (RT-qPCR) ..... 17–24, 27–35, 143, 150–151, 160, 167–168
Ribonucleic acid (RNA) ..... v, 18, 28, 45, 58, 119, 166–167, 181, 191, 218, 243, 257
RNA sequencing ..... v, 190–202, 251, 264, 265
Rolling circle amplification (RCA) ..... 3–14

S
Satellite cells (SCs) ..... 158, 167, 171, 173–175, 178, 179
Self-assembly ..... 87–100
Serological test ..... 73
Single microfluidics device ..... 17–24
Single-nucleotide polymorphism (SNP) ..... 37, 38
Skeletal muscle injury ..... 157–168
SLE, see Systemic lupus erythematosus (SLE)
Solid-phase hybridization ..... 37–43
Subcellular fractionation ..... 139–153
Subtyping of virus ..... 17–24
Swine influenza ..... 17–24
Synthetic oligonucleotide ..... 8, 119
Systemic lupus erythematosus (SLE) ..... 58–70, 73, 75, 258, 266

T
2’-O-methyl RNA ..... 45–54

Y
Yin-yang probes ..... 27–35