Principles of Cell Biology [3 ed.] 2019955373

English Pages [1748]

Table of contents:
Cover
Title Page
Copyright Page
Dedication Page
Brief Contents
Contents
The Fourteen Principles of Cell Biology
Preface
Acknowledgments
About the Cover
About the Authors
Chapter 1 Life Is a Team Sport
1.1 The Big Picture
1.2 Life Can Arise from Simple Ingredients
Nonliving Substances Combine to Form Life
Membrane Formation Requires Water
Code Biology Helps Explain the Diversity of Life
Evidence for the Possibility of Extraterrestrial Life
1.3 All Cells Are Built from the Same Common Molecular Building Blocks
The Study of Cellular Chemistry Begins with an Examination of the Carbon Atom
Complex Biomolecules Are Mostly Composed of Chemical Building Blocks Called Functional Groups
Lipids Are Carbon-Rich Polymers That Are Insoluble in Water
Sugars Are Simple Carbohydrates
Amino Acids Form Carbon-Rich Molecules That Contain an Amino Acid Group and a Carboxylic Acid Group
Nucleotides Are Complex Structures Containing a Sugar, a Phosphate Group, and a Base
1.4 Cells Must Cooperate to Succeed
Prokaryotes Are the Simplest Forms of Cells
Eukaryotes Are Complex Cells Capable of Forming Multicellular Organisms
Biofilms Support Prokaryotic and Eukaryotic Symbiosis
Macroorganismal Hosts Coevolve with Their Microbiomes to Create New Holobionts
1.5 Chapter Summary
Chapter Study Questions
Multiple-Choice Questions
References
Chapter 2 DNA Is the Instruction Book for Life
2.1 The Big Picture
2.2 All of the Information Necessary for Cells to Respond to Their External Environment Is Stored as DNA
A Cell’s DNA Is Inherited
DNA Must Be Read to Be Useful
2.3 DNA Is Carefully Packaged into Five Levels of Organization
DNA Is a Linear Polymer of Deoxyribonucleotides
Level 1: DNA Forms an Antiparallel Double Helix
Level 2: DNA Is Bound to a Protein/RNA Scaffold
Level 3: DNA Is Twisted to Form Fibers
Level 4: DNA Fibers Attach to a Protein-RNA Scaffold
Level 5: Chromatin Is Packaged into Highly Condensed Chromosomes
2.4 Cells Chemically Modify DNA and Its Scaffold to Control Packaging
Chemical Modifications at Level 1 and Level 2 Can Affect DNA Packing Across All Levels of DNA Organization
2.5 Chapter Summary
Chapter Study Questions
Multiple-Choice Questions
References
Chapter 3 Proteins Are the Engines of Evolution
3.1 The Big Picture
3.2 Amino Acids Form Linear Polymers
A Peptide Bond Joins Two Amino Acids Together
Definitions: Proteins Versus Polypeptides Versus Peptides Versus Subunits
3.3 Protein Structure Is Classified into Four Categories
Primary Structure Is Defined by the Linear Sequence of Amino Acids
Secondary Structure Is Defined by Regions of Repetitive, Predictable Organization in the Primary Structure
Tertiary Structure Is Defined by the Arrangement of the Secondary Structures in Three Dimensions
Quaternary Structure Is Defined by the Three-Dimensional Arrangement of Polypeptide Subunits in a Multimeric Protein
Five Classes of Chemical Bonds Stabilize Protein Structure
3.4 Changing Protein Shape and Protein Function
All Proteins Adopt at Least Two Different Shapes
Cells Chemically Modify Proteins to Control Their Shape and Function
Classification of Proteins
3.5 Where Do Proteins Go to Die?
Proteins in the Cytosol and Nucleus Are Broken Down in the Proteasome
Proteins in Organelles Are Digested in Lysosomes
Proteinases Digest Proteins in the Extracellular Space
3.6 Chapter Summary
Chapter Study Questions
Multiple-Choice Questions
References
Chapter 4 Membranes Are Complex Fluids That Define Compartments
4.1 The Big Picture
4.2 Phospholipids Are the Basic Building Blocks of Cellular Membranes
Phospholipids Contain Four Structural Elements
The Amphipathic Nature of Phospholipids Allows Them to Form Lipid Bilayers in Aqueous Solution
Phospholipid Bilayers Are Semipermeable Barriers
4.3 The Fluid-Mosaic Model Explains How Phospholipids and Proteins Interact Within a Cellular Membrane
Membrane Proteins Associate with Membranes in Three Different Ways
Cellular Membranes Are Both Fluid and Static
4.4 Cellular Membranes Maintain Chemical Disequilibrium Between Compartments
Protein Channels, Carriers, and Pumps Regulate the Transport of Most Small Molecules Across Membranes
4.5 The Smooth Endoplasmic Reticulum and Golgi Apparatus Build Most Eukaryotic Cellular Membrane Components
Glycerol and Fatty Acids Are Synthesized in the Cytosol
The Synthesis of Phosphoglycerides Begins at the Cytosolic Face of the SER Membrane
Additional Membrane Lipids Are Synthesized in the Endoplasmic Reticulum and Golgi Apparatus
Most Membrane Assembly Begins in the SER and Is Completed in the Target Organelle
4.6 Chapter Summary
Chapter Study Questions
Multiple-Choice Questions
References
Chapter 5 The Cytoskeleton Forms the Architectural Foundation for the Structural Complexity of Life
5.1 The Big Picture
5.2 The Cytoskeleton Is Represented by Three Functional Classes of Proteins
5.3 Intermediate Filaments Are the Strongest, Stablest Elements of the Cytoskeleton
Intermediate Filaments Are Formed from a Family of Related Proteins
The Primary Building Block of Intermediate Filaments Is a Filamentous Subunit
Intermediate Filament Subunits Form Coiled-Coil Dimers
Heterodimers Overlap to Form Filamentous Tetramers
Assembly of a Mature Intermediate Filament from Tetramers Occurs in Three Stages
Posttranslational Modifications Control the Shape of Intermediate Filaments
5.4 Microtubules Organize Movement Inside a Cell
Microtubule Assembly Begins at a Microtubule-Organizing Center
The Growth and Shrinkage of Microtubules Is Called Dynamic Instability
Microtubule-Associated Proteins Regulate the Stability and Function of Microtubules
Cilia and Flagella Are Specialized Microtubule-Based Structures Responsible for Motility in Some Cells
5.5 Actin Filaments Control the Movement of Cells
The Building Block of Actin Filaments Is the Actin Monomer
Actin Polymerization Occurs in Three Stages
Actin Filaments Have Structural Polarity
5.6 Seven Classes of Proteins Bind to Actin to Control Its Polymerization and Organization
Monomer-Binding Proteins Regulate Actin Polymerization
Nucleating Proteins Regulate Actin Polymerization
Capping, Depolymerizing, and Severing Proteins Affect the Length and Stability of Actin Filaments
Crosslinking Proteins Organize Actin Filaments into Bundles and Networks
Membrane Anchors and Cytoskeletal Linkers Bridge Actin Filaments to Other Structural Proteins Including Intermediate Filaments and Microtubules
Myosins Exert Force on Actin Filaments to Induce Cell Movement
Cell Migration Is a Complex, Dynamic Reorganization of an Entire Cell
5.7 Eukaryotic Cytoskeletal Proteins Arose from Prokaryotic Ancestors
5.8 Chapter Summary
Chapter Study Questions
Multiple-Choice Questions
References
Chapter 6 The Rise of Multicellularity
6.1 The Big Picture
6.2 Multicellularity Is an Evolutionary Response to Selective Pressure
6.3 The Extracellular Matrix Is a Complex Network of Molecules That Fills the Spaces Between Cells in a Multicellular Organism
Glycoproteins Form Filamentous Networks Between Cells
Proteoglycans Provide Hydration to Tissues
Matricellular Proteins Are Nonadhesive Proteins That Regulate the Functions of Extracellular Matrix Proteins
The Basal Lamina Is a Specialized Extracellular Matrix
Most Integrins Are Receptors for Extracellular Matrix Proteins
6.4 Cells Adhere to One Another via Specialized Proteins and Junctional Complexes
Tight Junctions Form Selectively Permeable Barriers Between Cells
Adherens Junctions Link Adjacent Cells
Desmosomes Are Intermediate Filament-Based Cell Adhesion Complexes
Gap Junctions Allow Direct Transfer of Molecules Between Adjacent Cells
Calcium-Dependent Cadherins Mediate Adhesion Between Cells
Calcium-Independent NCAMs Mediate Adhesion Between Neural Cells
Selectins Control Adhesion of Circulating Immune Cells
6.5 Chapter Summary
Chapter Study Questions
Multiple-Choice Questions
References
Chapter 7 The Nucleus Is the Brain of a Cell
7.1 The Big Picture
7.2 The Nucleus Carefully Protects a Eukaryotic Cell’s DNA
The Nuclear Envelope Is a Double-Membrane Structure
Nuclear Pore Complexes Regulate Molecular Traffic into and Out of the Nucleus
The Interior of the Nucleus Is Highly Organized and Contains Many Subcompartments
7.3 DNA Replication Is a Complex, Tightly Regulated Process
DNA Polymerases Are Enzymes That Replicate DNA
DNA Replication Is Semidiscontinuous
Cells Have Two Main DNA Repair Mechanisms
Excision Systems Remove One Strand of Damaged DNA and Replace It
7.4 Mitosis Separates Replicated Chromosomes
Mitosis Is Divided into Stages
Prophase Prepares the Cell for Division
Chromosomes Attach to the Mitotic Spindle During Prometaphase
Arrival of the Chromosomes at the Spindle Equator Signals the Beginning of Metaphase
Separation of Chromatids at the Metaphase Plate Occurs During Anaphase
The Structural Rearrangements That Occur in Prophase Begin to Reverse During Telophase
Cytokinesis Completes Mitosis by Partitioning the Cytoplasm to Form Two New Daughter Cells
7.5 Chapter Summary
Chapter Study Questions
Multiple-Choice Questions
Additional Reading
Chapter 8 RNA Links the Information in DNA to Actions Performed by Proteins
8.1 The Big Picture
8.2 Transcription Converts the DNA Genetic Code into RNA
RNA Polymerases Transcribe Genes in a “Bubble” of Single-Stranded DNA
Transcription Occurs in Three Stages
In Eukaryotes, Messenger RNAs Undergo Processing Prior to Leaving the Nucleus
8.3 Proteins Are Synthesized by Ribosomes Using an mRNA Template
Translation Occurs in Three Stages
8.4 At Least Five Different Mechanisms Are Required for Proper Targeting of Proteins in a Eukaryotic Cell
Signal Sequences Code for Proper Targeting of Proteins
The Nuclear Import/Export System Regulates Traffic of Macromolecules Through Nuclear Pores
Proteins Targeted to the Peroxisome Contain Peroxisomal Targeting Signals (PTS)
Secreted Proteins and Proteins Targeted to the Endomembrane System Contain an Endoplasmic Reticulum Signal Sequence
Integration of Transmembrane Proteins Requires Specific Amino Acid Sequences
8.5 Chapter Summary
Chapter Study Questions
Multiple-Choice Questions
References
Chapter 9 The Endomembrane System Serves as the Cellular Import/Export Machinery for Most Macromolecules
9.1 The Big Picture
9.2 The Endomembrane System Is a Network of Organelles in Eukaryotic Cells
The Endomembrane System Controls Molecular Transport into and out of a Cell
Vesicles Shuttle Material Between Organelles in the Endomembrane System
9.3 Exocytosis Begins in the Endoplasmic Reticulum
Newly Synthesized Proteins Begin Posttranslational Modification as ER-Resident Proteins Help Them Fold Properly
COPII-Coated Vesicles Shuttle Proteins from the ER to the Golgi Apparatus
Resident ER Proteins Are Retrieved from the Golgi Apparatus
9.4 The Golgi Apparatus Modifies and Sorts Proteins in the Exocytic Pathway
The Golgi Apparatus Is Subdivided into Cis, Medial, and Trans Cisternae
The Trans-Golgi Network Sorts Proteins Exiting the Golgi Apparatus
9.5 Exocytosis Ends at the Plasma Membrane
Cells Use Two Mechanisms for Controlling the Final Steps of Exocytosis
9.6 Endocytosis Begins at the Plasma Membrane
Clathrin Stabilizes the Formation of Endocytic Vesicles
9.7 The Endosome Sorts Proteins in the Endocytic Pathway
The Endosome Is Subdivided into Early and Late Compartments
Proton Pump Proteins Play a Central Role in the Sorting and Activation of Endosomal Contents
9.8 Endocytosis Ends at the Lysosome
Endogenous Proteins Destined for the Lysosome Are Tagged and Sorted by the Golgi Apparatus
Digested Material Is Transported into the Cytosol
Lysosomes Can Also Degrade Some Resident Organelles
Peroxisomes Defy Classification
9.9 Chapter Summary
Chapter Study Questions
Multiple-Choice Questions
References
Chapter 10 Chemical Bonds and Ion Gradients Are Cellular Fuel
10.1 The Big Picture
10.2 Cells Store Energy in Many Forms
The Laws of Thermodynamics Define the Rules for Energy Transfer
Fats and Polysaccharides Are Examples of Long-Term Energy Storage in Cells
High-Energy Electrons and Ion Gradients Are Examples of Short-Term Potential Energy in Cells
Nucleotide Triphosphates Store Energy for Immediate Use
Cells Couple Energetically Favorable and Unfavorable Reactions
The Amount of Potential Energy Stored in an Ion Gradient Can Be Expressed as an Electrical Potential
10.3 Storage of Light Energy Occurs in the Chloroplast
Chloroplasts Have Three Membrane-Bound Compartments
Chloroplasts Convert Sunlight into the First Forms of Cellular Energy
10.4 Cells Use a Combination of Channel, Carrier, and Pump Proteins to Transport Small Molecules Across Membranes
The Na+/K+ ATPase Maintains the Resting Potential Across the Plasma Membrane
In the Vertebrate Gut, a Leaky K+ Channel, an Na+/Glucose Symporter, and a Passive Glucose Carrier Work Together to Move Glucose from the Gut Lumen to the Bloodstream
10.5 The First Phase of Glucose Metabolism Occurs in the Cytosol
Why a Stepwise Method of Metabolizing Glucose Is Necessary
The Ten Chemical Reactions in Glycolysis Convert a Glucose Molecule into Two Three-Carbon Compounds, Two NADH Molecules, and Two ATP Molecules
Pyruvate Is Not an Endpoint in Glucose Metabolism
10.6 Aerobic Respiration Results in the Complete Oxidation of Glucose
Aerobic Respiration Occurs in Four Stages
10.7 Chapter Summary
Chapter Study Questions
Multiple-Choice Questions
References
Chapter 11 Signaling Networks Are the Nervous System of a Cell
11.1 The Big Picture
11.2 Signaling Molecules Form Communication Networks
Signaling Networks Are Composed of Signals, Receptors, Signaling Proteins, and Second Messenger Molecules
11.3 Cell-Signaling Molecules Transmit Information Between Cells
Intercellular Signals Are Secreted into the Extracellular Space
Six Classes of Receptors Are Sufficient to Detect a Vast Array of Environmental Stimuli
11.4 Intracellular Signaling Proteins Propagate Signals Within a Cell
G Proteins Are Two Classes of Molecular Switches
Protein Kinases Phosphorylate Downstream Signaling Proteins
Lipid Kinases Phosphorylate Phospholipids
Ion Channels Release Bursts of Ions
Calcium Fluxes Control Calcium-Binding Proteins
Adenylyl Cyclases Form Cyclic AMP
Adaptors Facilitate Binding of Multiple Signaling Proteins
Mutations in Signaling Networks Are Common in Cancer Cells
11.5 A Brief Look at Some Common Signaling Pathways
Protein Tyrosine Kinase Signaling Pathways Control Cell Growth and Migration
Heterotrimeric G Protein Signaling Pathways Regulate a Great Variety of Cellular Behaviors
Phospholipid Kinase Pathways Work in Cooperation with Protein Kinase and G Protein Pathways
Steroid Hormones Control Long-Term Cell Behavior by Altering Gene Expression
11.6 Chapter Summary
Chapter Study Questions
Multiple-Choice Questions
References
Chapter 12 Protein Complexes Are Cellular Decision-Making Devices
12.1 The Big Picture
12.2 Many Signaling Proteins Enter the Nucleus
Nuclear Receptors Translocate from the Cytosol to the Nucleus During Signaling
Notch Is a Transmembrane Scaffold Receptor That Enters the Nucleus
G Protein Coupled Receptors and GPCR Fragments Signal in the Nucleus
Heterotrimeric G Proteins Target Many Cellular Compartments, Including the Nucleus
Several Elements of Phosphatidylinositol Signaling Pathways Are Present in the Nucleus
Receptor Protein Tyrosine Kinases Signal in the Nucleus
Some Protein Kinases Phosphorylate Nuclear Proteins
PTEN Is a Nuclear Phosphatase
An ATP-Binding Calcium Ion Channel Is Present in the Plasma Membrane and Nuclear Envelope in Some Neurons
An Adenylyl Cyclase Is Present in the Nucleus
12.3 Effector Proteins in the Nucleus Are Grouped into Three Classes
Cohesins and Condensins Help Control the Packaging State of Chromatin
Histone Modifiers Control the Structure of Nucleosomes
Transcription Factors Promote the Expression of Genes
Epigenetic Mechanisms Alter Gene Expression Without Modifying DNA Sequences
12.4 Signal Transduction Pathways and Gene-Expression Programs Form Feedback Loops
12.5 Chapter Summary
Chapter Study Questions
Multiple-Choice Questions
References
Chapter 13 Progression Through the Cell Cycle Is the Most Vulnerable Period in a Cell’s Life
13.1 The Big Picture
13.2 New Cells Arise from Parental Cells That Complete the Cell Cycle
The Cell Cycle Is Divided into Five Phases
The G1/S Checkpoint Is the Point of No Return
The G2/M Checkpoint Is the Trigger for Large-Scale Rearrangement of Cellular Architecture
Activation of Cyclin-CDK Complexes Begins in G1 Phase
DNA Replication Occurs in S Phase
G2 Phase Prepares Cells for Mitosis
Mitosis and Cytokinesis Occur in M Phase
13.3 Multicellular Organisms Contain a Cell Self-Destruct Program That Keeps Them Healthy
Two Different Types of Cellular Death: Necrosis and Apoptosis
13.4 Chapter Summary
Chapter Study Questions
Multiple-Choice Questions
References
Chapter 14 Human Activity Is Triggering a Paradigm Shift in Evolution
14.1 The Big, Big Picture: A Review of Chapters 1–13
14.2 The Neuromuscular System Is an Emerging Target for Human Intervention and Artificial Selection
Neurons Transmit Signals via Action Potentials
Muscle Cells Are Effectors of Nerve Signals
Skeletal Muscle Cells Are Multinucleated, Highly Specialized Cells
Amyotrophic Lateral Sclerosis Describes a Range of Neuromuscular Diseases
14.3 Gene Editing Is a Revolutionary Advance in Artificial Selection
CRISPR/Cas9 Is a Promising, Efficient Additive to Traditional Anti-HIV Treatments
14.4 Gametogenesis, Fertilization, and Embryogenesis Form a Complex Developmental Program That Is Subject to Human Intervention
Meiosis Creates Gametes, Which Are the Two Essential Precursors of a Diploid Life
14.5 Chapter Summary
Chapter Study Questions
Multiple-Choice Questions
References
Glossary
Answers
Index

THIRD EDITION

PRINCIPLES OF Cell Biology

George Plopper, PhD
Senior Lead Associate/Regulatory Scientist
Booz Allen Hamilton
Rockville, MD

Diana Bebek Ivankovic, PhD
Director of the Center for Cancer Research
Anderson University
Anderson, SC

JONES & BARTLETT LEARNING

World Headquarters Jones & Bartlett Learning 5 Wall Street Burlington, MA 01803 978-443-5000 [email protected] www.jblearning.com Jones & Bartlett Learning books and products are available through most bookstores and online booksellers. To contact Jones & Bartlett Learning directly, call 800-832-0034, fax 978-443-8000, or visit our website, www.jblearning.com. Substantial discounts on bulk quantities of Jones & Bartlett Learning publications are available to corporations, professional associations, and other qualified organizations. For details and specific discount information, contact the special sales department at Jones & Bartlett Learning via the above contact information or send an email to [email protected].

Copyright © 2021 by Jones & Bartlett Learning, LLC, an Ascend Learning Company All rights reserved. No part of the material protected by this copyright may be reproduced or utilized in any form, electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system, without written permission from the copyright owner. The content, statements, views, and opinions herein are the sole expression of the respective authors and not that of Jones & Bartlett Learning, LLC. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not constitute or imply its endorsement or recommendation by Jones & Bartlett Learning, LLC and such reference shall not be used for advertising or product endorsement purposes. All trademarks displayed are the trademarks of the parties noted herein. Principles of Cell Biology, Third Edition is an independent publication and has not been authorized, sponsored, or otherwise approved by the owners of the trademarks or service marks referenced in this product. There may be images in this book that feature models; these models do not necessarily endorse, represent, or participate in the activities represented in the images. Any screenshots in this product are for educational and instructive purposes only. Any individuals and scenarios featured in the case studies throughout this product may be real or fictitious, but are used for instructional purposes only. 14998-2

Production Credits
VP, Product Management: Amanda Martin
Director of Product Management: Laura Pagluica
Product Specialist: Audrey Schwinn
Product Assistant: Melissa Duffy
Product Coordinator: Paula-Yean Gregory
Marketing Manager: Suzy Balk
Senior Project Specialist: Dan Stone
Project Specialist: Kelly Sylvester
Digital Project Specialist: Angela Dooley
VP, Manufacturing and Inventory Control: Therese Connell
Manufacturing and Inventory Control Supervisor: Amy Bacus
Composition: codeMantra U.S. LLC
Cover Design: Michael O’Donnell
Text Design: Michael O’Donnell
Senior Media Development Editor: Troy Liston
Rights Specialist: John Rusk
Cover Image (Title Page): Courtesy of Rhonda Reigers Powell
Printing and Binding: LSC Communications

Library of Congress Cataloging-in-Publication Data
Library of Congress Cataloging-in-Publication Data unavailable at time of printing.
LCCN: 2019955373

6048
Printed in the United States of America
23 22 21 20 19 10 9 8 7 6 5 4 3 2 1

To my family, colleagues, and former students for their continued support and encouragement. – Dr. George Plopper

To my husband, Miren, my parents Lori and Ante Bebek, and our children, Sven, Andre, Nina, and Laura, who are the source of my greatest inspiration and happiness. – Dr. Diana Bebek Ivankovic

Courtesy of Rhonda Reigers Powell.

Brief Contents
The Fourteen Principles of Cell Biology
Preface
Acknowledgments
About the Cover
About the Authors
Chapter 1 Life Is a Team Sport
Chapter 2 DNA Is the Instruction Book for Life
Chapter 3 Proteins Are the Engines of Evolution
Chapter 4 Membranes Are Complex Fluids That Define Compartments
Chapter 5 The Cytoskeleton Forms the Architectural Foundation for the Structural Complexity of Life
Chapter 6 The Rise of Multicellularity
Chapter 7 The Nucleus Is the Brain of a Cell
Chapter 8 RNA Links the Information in DNA to Actions Performed by Proteins
Chapter 9 The Endomembrane System Serves as the Cellular Import/Export Machinery for Most Macromolecules
Chapter 10 Chemical Bonds and Ion Gradients Are Cellular Fuel
Chapter 11 Signaling Networks Are the Nervous System of a Cell
Chapter 12 Protein Complexes Are Cellular Decision-Making Devices
Chapter 13 Progression Through the Cell Cycle Is the Most Vulnerable Period in a Cell’s Life
Chapter 14 Human Activity Is Triggering a Paradigm Shift in Evolution
Glossary
Answers
Index

Contents
The Fourteen Principles of Cell Biology
Preface
Acknowledgments
About the Cover
About the Authors

Chapter 1 Life Is a Team Sport
1.1 The Big Picture
BOX 1-1 TIP
1.2 Life Can Arise from Simple Ingredients
Nonliving Substances Combine to Form Life
BOX 1-2 FAQ: Are Viruses Alive?
BOX 1-3 TIP: Make Judicious Use of the Internet
Membrane Formation Requires Water
BOX 1-4 TIP: Quick Review of Darwinian Evolution
Code Biology Helps Explain the Diversity of Life
BOX 1-5 TIP: The World of Codes
BOX 1-6 The “Cell as a Busy City” Analogy
BOX 1-7 FAQ: Are Biological Codes the Product of “Intelligent Design”?
Evidence for the Possibility of Extraterrestrial Life
1.3 All Cells Are Built from the Same Common Molecular Building Blocks
BOX 1-8 TIP: Overcoming the Jargon Barrier
BOX 1-9 FAQ: How Much Chemistry Do I Need to Know to Understand the Subjects in this Book?
The Study of Cellular Chemistry Begins with an Examination of the Carbon Atom
BOX 1-10 Silicon as a Potential Substitute for Carbon in Living Organisms
Complex Biomolecules Are Mostly Composed of Chemical Building Blocks Called Functional Groups
BOX 1-11 TIP: Understanding Molecular Structure Diagrams
Lipids Are Carbon-Rich Polymers That Are Insoluble in Water
Sugars Are Simple Carbohydrates
BOX 1-12 Case Study: Why Can Adult Humans Drink Milk?
Amino Acids Form Carbon-Rich Molecules That Contain an Amino Acid Group and a Carboxylic Acid Group
Nucleotides Are Complex Structures Containing a Sugar, a Phosphate Group, and a Base
1.4 Cells Must Cooperate to Succeed
Prokaryotes Are the Simplest Forms of Cells
Eukaryotes Are Complex Cells Capable of Forming Multicellular Organisms
BOX 1-13 FAQ: Is Uncertainty in the Mechanism of Evolution Evidence That it Is Not True?
Biofilms Support Prokaryotic and Eukaryotic Symbiosis
BOX 1-14 Artificial Eukaryote-like Cells on the Horizon
Macroorganismal Hosts Coevolve with Their Microbiomes to Create New Holobionts
BOX 1-15 Applied Cell Biology: Healing the Human Holobiont with Prokaryotic Transplantation
1.5 Chapter Summary
Chapter Study Questions
Multiple-Choice Questions
References

Chapter 2 DNA Is the Instruction Book for Life
2.1 The Big Picture
2.2 All of the Information Necessary for Cells to Respond to Their External Environment Is Stored as DNA
BOX 2-1 TIP
BOX 2-2 The Library of Congress Analogy, Part 1
A Cell’s DNA Is Inherited
DNA Must Be Read to Be Useful
BOX 2-3 The Library of Congress Analogy, Part 2
BOX 2-4 FAQ: Is Cancer Inherited or Not?
2.3 DNA Is Carefully Packaged into Five Levels of Organization
DNA Is a Linear Polymer of Deoxyribonucleotides
BOX 2-5 The Library of Congress Analogy, Part 3
BOX 2-6 TIP: Structure Before Jargon
BOX 2-7 TIP
BOX 2-8 TIP
BOX 2-9 TIP
BOX 2-10 TIP
BOX 2-11 TIP: Chemistry Nomenclature
BOX 2-12 The Holiday Lights Analogy
BOX 2-13 DNA Sequencing Technologies
BOX 2-14 The Library of Congress Analogy, Part 4
Level 1: DNA Forms an Antiparallel Double Helix
Level 2: DNA Is Bound to a Protein/RNA Scaffold
Level 3: DNA Is Twisted to Form Fibers
BOX 2-15 FAQ: What is the Difference Between a Nucleosome and a Chromatosome?
Level 4: DNA Fibers Attach to a Protein-RNA Scaffold
Level 5: Chromatin Is Packaged into Highly Condensed Chromosomes
BOX 2-16 Case Study: Duchenne Muscular Dystrophy in a Female Patient
2.4 Cells Chemically Modify DNA and Its Scaffold to Control Packaging
Chemical Modifications at Level 1 and Level 2 Can Affect DNA Packing Across All Levels of DNA Organization
2.5 Chapter Summary
Chapter Study Questions
Multiple-Choice Questions
References

Chapter 3 Proteins Are the Engines of Evolution
3.1 The Big Picture
3.2 Amino Acids Form Linear Polymers
BOX 3-1 TIP
A Peptide Bond Joins Two Amino Acids Together
Definitions: Proteins versus Polypeptides versus Peptides versus Subunits
BOX 3-2 TIP: Exploring Protein Nomenclature
BOX 3-3 FAQ: Why Are There so Many Different Ways to Draw a Protein?
BOX 3-4 FAQ: How is Protein Binding Defined?
BOX 3-5 TIP: Anthropomorphism in Analogies
BOX 3-6 TIP: Understanding Protein Names
3.3 Protein Structure Is Classified into Four Categories
Primary Structure Is Defined by the Linear Sequence of Amino Acids
Secondary Structure Is Defined by Regions of Repetitive, Predictable Organization in the Primary Structure
BOX 3-7 Many “Disordered” Random Coils in Proteins Are Functional
Tertiary Structure Is Defined by the Arrangement of the Secondary Structures in Three Dimensions
BOX 3-8 The Personal Trainer Analogy
BOX 3-9 TIP: Beware of Overinterpreting Definitions of “Domain” in Protein Structure
Quaternary Structure Is Defined by the Three-Dimensional Arrangement of Polypeptide Subunits in a Multimeric Protein
Five Classes of Chemical Bonds Stabilize Protein Structure
3.4 Changing Protein Shape and Protein Function
BOX 3-10 The “Cell as a Busy City” Analogy, Revisited
All Proteins Adopt at Least Two Different Shapes
Cells Chemically Modify Proteins to Control Their Shape and Function
BOX 3-11 TIP
BOX 3-12 TIP
Classification of Proteins
BOX 3-13 The Advent of -Omics in Biotechnology
BOX 3-14 The “Cell as a Busy City” Analogy, Continued
BOX 3-15 Practical Protein Technology: The Home Pregnancy Test
BOX 3-16 FAQ: What is the Function of Fluorescent Proteins?
3.5 Where Do Proteins Go to Die?
Proteins in the Cytosol and Nucleus Are Broken Down in the Proteasome
Proteins in Organelles Are Digested in Lysosomes
BOX 3-17 The Incinerator Analogy
Proteinases Digest Proteins in the Extracellular Space
BOX 3-18 FAQ: Do Proteins Evolve?
BOX 3-19 Case Study: Flu Vaccines and Antiviral Drug Designs
3.6 Chapter Summary
Chapter Study Questions
Multiple-Choice Questions
References

Chapter 4 Membranes Are Complex Fluids That Define Compartments
4.1 The Big Picture
4.2 Phospholipids Are the Basic Building Blocks of Cellular Membranes
Phospholipids Contain Four Structural Elements
BOX 4-1 TIP: Beware of Falling Into the Jargon Trap
BOX 4-2 TIP: Drawing Phospholipids
The Amphipathic Nature of Phospholipids Allows Them to Form Lipid Bilayers in Aqueous Solution
BOX 4-3 The Soap Bubble Demonstration
Phospholipid Bilayers Are Semipermeable Barriers
4.3 The Fluid-Mosaic Model Explains How Phospholipids and Proteins Interact Within a Cellular Membrane
BOX 4-4 TIP
BOX 4-5 The Edible Membrane Exercise
BOX 4-6 The Pool Analogy
Membrane Proteins Associate with Membranes in Three Different Ways
BOX 4-7 TIP
Cellular Membranes Are Both Fluid and Static
BOX 4-8 The Hearty Goldfish
BOX 4-9 The Dance Floor Analogy
4.4 Cellular Membranes Maintain Chemical Disequilibrium Between Compartments
Protein Channels, Carriers, and Pumps Regulate the Transport of Most Small Molecules Across Membranes
BOX 4-10 The Folded Hands Analogy
BOX 4-11 More Name Games
BOX 4-12 FAQ: What is the Difference Between Active, Passive, and Coupled Transporters?
BOX 4-13 Case Study: Maintaining a Balanced Disequilibrium
4.5 The Smooth Endoplasmic Reticulum and Golgi Apparatus Build Most Eukaryotic Cellular Membrane Components
Glycerol and Fatty Acids Are Synthesized in the Cytosol
The Synthesis of Phosphoglycerides Begins at the Cytosolic Face of the SER Membrane
Additional Membrane Lipids Are Synthesized in the Endoplasmic Reticulum and Golgi Apparatus
Most Membrane Assembly Begins in the SER and Is Completed in the Target Organelle
BOX 4-14 FAQ: What Is a “Protein Family”?
BOX 4-15 Technology at Work: Artificial Membranes
BOX 4-16 TIP: Exploring Cell Membranes Online
4.6 Chapter Summary
Chapter Study Questions
Multiple-Choice Questions
References

Chapter 5 The Cytoskeleton Forms the Architectural Foundation for the Structural Complexity of Life
5.1 The Big Picture
5.2 The Cytoskeleton Is Represented by Three Functional Classes of Proteins
5.3 Intermediate Filaments Are the Strongest, Stablest Elements of the Cytoskeleton
Intermediate Filaments Are Formed from a Family of Related Proteins
BOX 5-1 The Steel Cable/Highrise Building Analogy
The Primary Building Block of Intermediate Filaments Is a Filamentous Subunit
Intermediate Filament Subunits Form Coiled-Coil Dimers
BOX 5-2 TIP: Revisiting the Protein/Subunit Issue
Heterodimers Overlap to Form Filamentous Tetramers
BOX 5-3 The Tetris Analogy
Assembly of a Mature Intermediate Filament from Tetramers Occurs in Three Stages
Posttranslational Modifications Control the Shape of Intermediate Filaments
5.4 Microtubules Organize Movement Inside a Cell
BOX 5-4 The Highway Network Analogy
Microtubule Assembly Begins at a Microtubule-Organizing Center
BOX 5-5 TIP: Centrosomes versus MTOCs
The Growth and Shrinkage of Microtubules Is Called Dynamic Instability
BOX 5-6 The Fishing Analogy
BOX 5-7 Case Study: Microtubules are Therapeutic Targets for Cancer Patients
Microtubule-Associated Proteins Regulate the Stability and Function of Microtubules
BOX 5-8 TIP: Definitions of the Word Polarity
Cilia and Flagella Are Specialized Microtubule-Based Structures Responsible for Motility in Some Cells
BOX 5-9 FAQ: Are Prokaryotic Cilia and Flagella the Same as Those in Eukaryotes?
5.5 Actin Filaments Control the Movement of Cells
The Building Block of Actin Filaments Is the Actin Monomer
Actin Polymerization Occurs in Three Stages
BOX 5-10 The Clasping Hands Analogy
Actin Filaments Have Structural Polarity
5.6 Seven Classes of Proteins Bind to Actin to Control Its Polymerization and Organization
Monomer-Binding Proteins Regulate Actin Polymerization
Nucleating Proteins Regulate Actin Polymerization
Capping, Depolymerizing, and Severing Proteins Affect the Length and Stability of Actin Filaments
Crosslinking Proteins Organize Actin Filaments into Bundles and Networks
Membrane Anchors and Cytoskeletal Linkers Bridge Actin Filaments to Other Structural Proteins Including Intermediate Filaments and Microtubules
Myosins Exert Force on Actin Filaments to Induce Cell Movement
Cell Migration Is a Complex, Dynamic Reorganization of an Entire Cell
BOX 5-11 Clarifying the “Foot” versus “Hand” Comparisons
BOX 5-12 Applied Cell Biology: Blood Pressure Medications
5.7 Eukaryotic Cytoskeletal Proteins Arose from Prokaryotic Ancestors
BOX 5-13 Exploring the Cytoskeleton Online
5.8 Chapter Summary
Chapter Study Questions
Multiple-Choice Questions
References

Chapter 6 The Rise of Multicellularity
6.1 The Big Picture
6.2 Multicellularity Is an Evolutionary Response to Selective Pressure
6.3 The Extracellular Matrix Is a Complex Network of Molecules That Fills the Spaces Between Cells in a Multicellular Organism
Glycoproteins Form Filamentous Networks Between Cells
BOX 6-1 The Assembly-Line Analogy
BOX 6-2 TIP
Proteoglycans Provide Hydration to Tissues
BOX 6-3 Applied Cell Biology: Engineered ECM
BOX 6-4 Cell Walls Are Evolutionary “Shields” Against Environmental Threats
Matricellular Proteins Are Nonadhesive Proteins That Regulate the Functions of Extracellular Matrix Proteins
The Basal Lamina Is a Specialized Extracellular Matrix
Most Integrins Are Receptors for Extracellular Matrix Proteins
BOX 6-5 TIP
BOX 6-6 Case Study: Can Gene Therapy Cure Epidermolysis Bullosa?
6.4 Cells Adhere to One Another via Specialized Proteins and Junctional Complexes
Tight Junctions Form Selectively Permeable Barriers Between Cells
Adherens Junctions Link Adjacent Cells
Desmosomes Are Intermediate Filament-Based Cell Adhesion Complexes
Gap Junctions Allow Direct Transfer of Molecules Between Adjacent Cells
BOX 6-7 TIP
Calcium-Dependent Cadherins Mediate Adhesion Between Cells
Calcium-Independent NCAMs Mediate Adhesion Between Neural Cells
Selectins Control Adhesion of Circulating Immune Cells
BOX 6-8 The Kayak Analogy
6.5 Chapter Summary
BOX 6-9 Explore the ECM and Cell Junctions Online
Chapter Study Questions
Multiple-Choice Questions
References

Chapter 7 The Nucleus Is the Brain of a Cell
7.1 The Big Picture
BOX 7-1 Analogies of the Nucleus, Extended
7.2 The Nucleus Carefully Protects a Eukaryotic Cell’s DNA
The Nuclear Envelope Is a Double-Membrane Structure
BOX 7-2 TIP: Lamins versus Laminins versus Lamina
Nuclear Pore Complexes Regulate Molecular Traffic into and Out of the Nucleus
The Interior of the Nucleus Is Highly Organized and Contains Many Subcompartments
BOX 7-3 Cell Biology Depends on Microscopy
BOX 7-4 TIP: More Subunit Issues
7.3 DNA Replication Is a Complex, Tightly Regulated Process
DNA Polymerases Are Enzymes That Replicate DNA
BOX 7-5 TIP
DNA Replication Is Semidiscontinuous
BOX 7-6 FAQ: Is Animal Research Ethical?
BOX 7-7 The Library of Congress Analogy, Revisited
Cells Have Two Main DNA Repair Mechanisms
Excision Systems Remove One Strand of Damaged DNA and Replace It
BOX 7-8 Case Study: Xeroderma Pigmentosum
7.4 Mitosis Separates Replicated Chromosomes
Mitosis Is Divided into Stages
BOX 7-9 The Centromere
Prophase Prepares the Cell for Division
Chromosomes Attach to the Mitotic Spindle During Prometaphase
Arrival of the Chromosomes at the Spindle Equator Signals the Beginning of Metaphase
Separation of Chromatids at the Metaphase Plate Occurs During Anaphase
The Structural Rearrangements That Occur in Prophase Begin to Reverse During Telophase
Cytokinesis Completes Mitosis by Partitioning the Cytoplasm to Form Two New Daughter Cells
BOX 7-10 Explore More Online
7.5 Chapter Summary
Chapter Study Questions
Multiple-Choice Questions
Additional Reading

Chapter 8 RNA Links the Information in DNA to Actions Performed by Proteins
8.1 The Big Picture
8.2 Transcription Converts the DNA Genetic Code into RNA
BOX 8-1 TIP: Transcription versus Translation
RNA Polymerases Transcribe Genes in a “Bubble” of Single-Stranded DNA
Transcription Occurs in Three Stages
BOX 8-2 TIP
BOX 8-3 The Rope Analogy
In Eukaryotes, Messenger RNAs Undergo Processing Prior to Leaving the Nucleus
BOX 8-4 Gene Splicing in Medicine
8.3 Proteins Are Synthesized by Ribosomes Using an mRNA Template
Translation Occurs in Three Stages
BOX 8-5 A (True) Cell Biology Joke
BOX 8-6 Applied Cell Biology: Next-Generation Sequencing and Pharmacogenomics
8.4 At Least Five Different Mechanisms Are Required for Proper Targeting of Proteins in a Eukaryotic Cell
Signal Sequences Code for Proper Targeting of Proteins
BOX 8-7 The “Cell As a Busy City” Analogy, Revisited
BOX 8-8 The Ticket Analogy
The Nuclear Import/Export System Regulates Traffic of Macromolecules Through Nuclear Pores
BOX 8-9 Case Study: Nuclear Export as a Cancer Therapy Target
Proteins Targeted to the Peroxisome Contain Peroxisomal Targeting Signals (PTS)
BOX 8-10 TIP: Monoubiquitin versus Polyubiquitin
Secreted Proteins and Proteins Targeted to the Endomembrane System Contain an Endoplasmic Reticulum Signal Sequence
BOX 8-11 GTP is a Multipurpose Molecule
Integration of Transmembrane Proteins Requires Specific Amino Acid Sequences
BOX 8-12 Explore More Online
8.5 Chapter Summary
Chapter Study Questions
Multiple-Choice Questions
References

Chapter 9 The Endomembrane System Serves as the Cellular Import/Export Machinery for Most Macromolecules
9.1 The Big Picture
9.2 The Endomembrane System Is a Network of Organelles in Eukaryotic Cells
The Endomembrane System Controls Molecular Transport into and out of a Cell
Vesicles Shuttle Material Between Organelles in the Endomembrane System
9.3 Exocytosis Begins in the Endoplasmic Reticulum
Newly Synthesized Proteins Begin Posttranslational Modification as ER-Resident Proteins Help Them Fold Properly
COPII-Coated Vesicles Shuttle Proteins from the ER to the Golgi Apparatus
Resident ER Proteins Are Retrieved from the Golgi Apparatus
9.4 The Golgi Apparatus Modifies and Sorts Proteins in the Exocytic Pathway
The Golgi Apparatus Is Subdivided into Cis, Medial, and Trans Cisternae
BOX 9-1 The Balloon Analogy
BOX 9-2 The Car Engine Analogy
The Trans-Golgi Network Sorts Proteins Exiting the Golgi Apparatus
9.5 Exocytosis Ends at the Plasma Membrane
Cells Use Two Mechanisms for Controlling the Final Steps of Exocytosis
9.6 Endocytosis Begins at the Plasma Membrane
Clathrin Stabilizes the Formation of Endocytic Vesicles
9.7 The Endosome Sorts Proteins in the Endocytic Pathway
The Endosome Is Subdivided into Early and Late Compartments
Proton Pump Proteins Play a Central Role in the Sorting and Activation of Endosomal Contents
9.8 Endocytosis Ends at the Lysosome
Endogenous Proteins Destined for the Lysosome Are Tagged and Sorted by the Golgi Apparatus
BOX 9-3 TIP
Digested Material Is Transported into the Cytosol
Lysosomes Can Also Degrade Some Resident Organelles
Peroxisomes Defy Classification
BOX 9-4 Applied Cell Biology: The Power of Antibodies
BOX 9-5 Case Study: Orphan Diseases
BOX 9-6 Explore Further
9.9 Chapter Summary
Chapter Study Questions
Multiple-Choice Questions
References

Chapter 10 Chemical Bonds and Ion Gradients Are Cellular Fuel
10.1 The Big Picture
BOX 10-1 TIP
10.2 Cells Store Energy in Many Forms
The Laws of Thermodynamics Define the Rules for Energy Transfer
BOX 10-2 FAQ: What About Deep-Sea Vents?
Fats and Polysaccharides Are Examples of Long-Term Energy Storage in Cells
High-Energy Electrons and Ion Gradients Are Examples of Short-Term Potential Energy in Cells
BOX 10-3 TIP
Nucleotide Triphosphates Store Energy for Immediate Use
Cells Couple Energetically Favorable and Unfavorable Reactions
BOX 10-4 The Falling Water Analogy
The Amount of Potential Energy Stored in an Ion Gradient Can Be Expressed as an Electrical Potential
10.3 Storage of Light Energy Occurs in the Chloroplast
Chloroplasts Have Three Membrane-Bound Compartments
Chloroplasts Convert Sunlight into the First Forms of Cellular Energy
10.4 Cells Use a Combination of Channel, Carrier, and Pump Proteins to Transport Small Molecules Across Membranes
The Na+/K+ ATPase Maintains the Resting Potential Across the Plasma Membrane
In the Vertebrate Gut, a Leaky K+ Channel, an Na+/Glucose Symporter, and a Passive Glucose Carrier Work Together to Move Glucose from the Gut Lumen to the Bloodstream
10.5 The First Phase of Glucose Metabolism Occurs in the Cytosol
Why a Stepwise Method of Metabolizing Glucose Is Necessary
The Ten Chemical Reactions in Glycolysis Convert a Glucose Molecule into Two Three-Carbon Compounds, Two NADH Molecules, and Two ATP Molecules
Pyruvate Is Not an Endpoint in Glucose Metabolism
10.6 Aerobic Respiration Results in the Complete Oxidation of Glucose
Aerobic Respiration Occurs in Four Stages
BOX 10-5 TIP
BOX 10-6 TIP
BOX 10-7 Applied Cell Biology: Mobile Sensors of Metabolic Function
BOX 10-8 Case Study: Manipulating the Mitochondrial Genome
BOX 10-9 Explore Online
10.7 Chapter Summary
Chapter Study Questions
Multiple-Choice Questions
References

Chapter 11 Signaling Networks Are the Nervous System of a Cell
11.1 The Big Picture
BOX 11-1 Analogies Galore
11.2 Signaling Molecules Form Communication Networks
Signaling Networks Are Composed of Signals, Receptors, Signaling Proteins, and Second Messenger Molecules
BOX 11-2 FAQ: What’s the Difference Between a Protein that Functions as a Signal and a Signaling Protein?
BOX 11-3 Memorization Warning
11.3 Cell-Signaling Molecules Transmit Information Between Cells
Intercellular Signals Are Secreted into the Extracellular Space
Six Classes of Receptors Are Sufficient to Detect a Vast Array of Environmental Stimuli
11.4 Intracellular Signaling Proteins Propagate Signals Within a Cell
G Proteins Are Two Classes of Molecular Switches
BOX 11-4 FAQ: How Is the Phosphate Placed Back on the GDP?
Protein Kinases Phosphorylate Downstream Signaling Proteins
Lipid Kinases Phosphorylate Phospholipids
Ion Channels Release Bursts of Ions
Calcium Fluxes Control Calcium-Binding Proteins
Adenylyl Cyclases Form Cyclic AMP
Adaptors Facilitate Binding of Multiple Signaling Proteins
Mutations in Signaling Networks Are Common in Cancer Cells
11.5 A Brief Look at Some Common Signaling Pathways
BOX 11-5 TIP
Protein Tyrosine Kinase Signaling Pathways Control Cell Growth and Migration
BOX 11-6 TIP: Capitalization and Italicization of Names in Signal Transduction
Heterotrimeric G Protein Signaling Pathways Regulate a Great Variety of Cellular Behaviors
Phospholipid Kinase Pathways Work in Cooperation with Protein Kinase and G Protein Pathways
Steroid Hormones Control Long-Term Cell Behavior by Altering Gene Expression
BOX 11-7 Applied Cell Biology: Studying Protein Activity
BOX 11-8 Explore More Online
BOX 11-9 Case Study: Combating the Spread of Cancer Cells via Signal Transduction
11.6 Chapter Summary
Chapter Study Questions
Multiple-Choice Questions
References

Chapter 12 Protein Complexes Are Cellular Decision-Making Devices
12.1 The Big Picture
BOX 12-1 TIP
12.2 Many Signaling Proteins Enter the Nucleus
Nuclear Receptors Translocate from the Cytosol to the Nucleus During Signaling
Notch Is a Transmembrane Scaffold Receptor That Enters the Nucleus
G Protein Coupled Receptors and GPCR Fragments Signal in the Nucleus
Heterotrimeric G Proteins Target Many Cellular Compartments, Including the Nucleus
Several Elements of Phosphatidylinositol Signaling Pathways Are Present in the Nucleus
Receptor Protein Tyrosine Kinases Signal in the Nucleus
Some Protein Kinases Phosphorylate Nuclear Proteins
PTEN Is a Nuclear Phosphatase
An ATP-Binding Calcium Ion Channel Is Present in the Plasma Membrane and Nuclear Envelope in Some Neurons
An Adenylyl Cyclase Is Present in the Nucleus
12.3 Effector Proteins in the Nucleus Are Grouped into Three Classes
Cohesins and Condensins Help Control the Packaging State of Chromatin
Histone Modifiers Control the Structure of Nucleosomes
Transcription Factors Promote the Expression of Genes
Epigenetic Mechanisms Alter Gene Expression Without Modifying DNA Sequences
12.4 Signal Transduction Pathways and Gene-Expression Programs Form Feedback Loops
BOX 12-2 Case Study: Stem Cells and the Amazing Regenerating Liver
BOX 12-3 Applied Cell Biology: Are Personalized Tissue and Organ Implants in Our Future?
BOX 12-4 Explore More Online
12.5 Chapter Summary
Chapter Study Questions
Multiple-Choice Questions
References

Chapter 13 Progression Through the Cell Cycle Is the Most Vulnerable Period in a Cell’s Life
13.1 The Big Picture
13.2 New Cells Arise from Parental Cells That Complete the Cell Cycle
BOX 13-1 FAQ: Haven’t Scientists Already Created Artificial Life?
The Cell Cycle Is Divided into Five Phases
The G1/S Checkpoint Is the Point of No Return
BOX 13-2 TIP
The G2/M Checkpoint Is the Trigger for Large-Scale Rearrangement of Cellular Architecture
BOX 13-3 TIP
Activation of Cyclin-CDK Complexes Begins in G1 Phase
BOX 13-4 TIP: Overcoming Complicated Signaling Diagrams
DNA Replication Occurs in S Phase
G2 Phase Prepares Cells for Mitosis
Mitosis and Cytokinesis Occur in M Phase
BOX 13-5 TIP
13.3 Multicellular Organisms Contain a Cell Self-Destruct Program That Keeps Them Healthy
Two Different Types of Cellular Death: Necrosis and Apoptosis
BOX 13-6 FAQ: How is Apoptosis Pronounced in English?
BOX 13-7 Case Study: Progeria
BOX 13-8 Applied Cell Biology: Flow Cytometry and Cell Sorting
BOX 13-9 Explore More Online
13.4 Chapter Summary
Chapter Study Questions
Multiple-Choice Questions
References

Chapter 14 Human Activity Is Triggering a Paradigm Shift in Evolution
14.1 The Big, Big Picture: A Review of Chapters 1–13
BOX 14-1 TIP
14.2 The Neuromuscular System Is an Emerging Target for Human Intervention and Artificial Selection
Neurons Transmit Signals via Action Potentials
Muscle Cells Are Effectors of Nerve Signals
Skeletal Muscle Cells Are Multinucleated, Highly Specialized Cells
Amyotrophic Lateral Sclerosis Describes a Range of Neuromuscular Diseases
BOX 14-2 Beating the Odds
BOX 14-3 TIP
BOX 14-4 Case Study: Brain Plasticity and Schizencephaly
14.3 Gene Editing Is a Revolutionary Advance in Artificial Selection
CRISPR/Cas9 Is a Promising, Efficient Additive to Traditional Anti-HIV Treatments
BOX 14-5 Artificial Selection of Humans Raises Serious Ethical Concerns
14.4 Gametogenesis, Fertilization, and Embryogenesis Form a Complex Developmental Program That Is Subject to Human Intervention
BOX 14-6 Medical Ethics Disclaimer
Meiosis Creates Gametes, Which Are the Two Essential Precursors of a Diploid Life
BOX 14-7 TIP: Autosomes versus Sex Chromosomes
BOX 14-8 Artificial Selection at Work: Human Birth Control
BOX 14-9 Applied Cell Biology: In Vitro Fertilization Artificially Selects and Supports the Creation of Life
14.5 Chapter Summary
Chapter Study Questions
Multiple-Choice Questions
References
Glossary
Answers
Index

Courtesy of Rhonda Reigers Powell.

The Fourteen Principles of Cell Biology
Principle 1. Life is a Team Sport
Principle 2. DNA is the Instruction Book for Life
Principle 3. Proteins are the Engines of Evolution
Principle 4. Membranes are Complex Fluids that Define Compartments
Principle 5. The Cytoskeleton Forms the Architectural Foundation for the Structural Complexity of Life
Principle 6. The Rise of Multicellularity Was a Watershed Moment in Evolution
Principle 7. The Nucleus is the Brain of a Cell
Principle 8. RNA Links the Information in DNA to Actions Performed by Proteins
Principle 9. The Endomembrane System Serves as the Cellular Import/Export Machinery for Most Macromolecules
Principle 10. Chemical Bonds and Ion Gradients are Cellular Fuel
Principle 11. Signaling Networks are the Nervous System of a Cell
Principle 12. Protein Complexes are Cellular Decision-Making Devices
Principle 13. Progression through the Cell Cycle is the Most Vulnerable Period in a Cell’s Life
Principle 14. Human Activity is Triggering a Paradigm Shift in Evolution


Preface

“Nothing in education is so astonishing as the amount of ignorance it accumulates in the form of inert facts.” —Henry Adams, The Education of Henry Adams

As the rate of discovering facts in the sciences continues to escalate, both students and instructors must confront the age-old problem of deciding what material matters most, especially at the introductory level. In our experience, even the most enthusiastic students have great difficulty distinguishing essential facts in cell biology from more technical details. Given the heavy emphasis on memorization in most K–12 programs, many introductory-level college students resort to storing as much information as possible in short-term memory, only to discover later that they’ve missed the underlying concepts that give these facts their significance. In the past 25 years of teaching, we have spent as much time helping students navigate a conceptual path through the dense web of facts in cell biology as we have explaining the meaning of those facts.

The purpose of this book is to help students build a conceptual framework for cell biology that will persist long after their coursework is complete.

▶ Our Approach to Learning Cell Biology

The field of introductory cell biology enjoys a wealth of well-written texts by outstanding authors. Why add yet another introductory text? This book is needed for two important reasons. First, it overtly focuses on some of the underlying principles that illustrate both how cells function and how we study them. While many textbooks reference “principles” in their fields, few specifically identify these principles or explore them in detail. In contrast, this book identifies 14 specific principles of cell biology (see page xiv) and devotes a separate chapter to illustrating each one of them. As a result:

■ We intentionally shift away from the traditional focus on technical details and toward a more integrative view of cellular activity that can be tailored to suit students with a broad range of backgrounds.
■ Instructors have great freedom to organize technical subjects as they see fit while permitting students to build their own conceptual view of how cells solve problems. In short, because every cellular activity discussed in the text is tied directly to an overlying principle, these activities can be arranged and taught in many different combinations, at varying depths of detail, without losing focus on the Big Picture.
■ Students develop a framework for evaluating facts as they encounter them, and this invites them to critically evaluate information as they learn. The principles in this book are not intended to be treated as laws and are thus always subject to criticism and review.
■ Instructors can capitalize on this organizing style to seamlessly merge supplemental material with the text as the field changes or to emphasize specific subjects in a topics course.
■ Professionals in the field can use these principles as starting points for identifying additional principles in cell biology and in other related fields, for comparing these principles in other fields of biology, and for developing a more integrated curriculum across multiple scales of biological organization. For example, mapping several courses to specific principles such as those identified in this text could assist in curriculum development and assessment at a department or program administrative level.

The second important distinguishing feature of this book is its informal, narrative writing style. We adopted this style to make even the most complex concepts accessible to students new to a scientific field, including stripping away some of the technical complexity that many introductory students find intimidating. Each chapter thus reflects our own lectures in introductory cell biology, in both style and content. Specifically, this includes:

■ Liberal use of analogies that have proven effective over many years of teaching
■ Boxes throughout each chapter including studying tips, clarifications of apparent contradictions, explanations of naming schemes, FAQs, etc.
■ Gradual introduction of jargon, after the concepts have been established, thereby de-emphasizing memorization of names
■ Novel artwork reflecting drawing exercises the authors include in their own lectures

▶ Audience

Principles of Cell Biology is written for introductory cell biology courses having an emphasis on eukaryotic cells, especially humans and other mammals. It is geared toward students in general biology, molecular biology, physiology, nursing, dental hygiene, and bioengineering. The book also provides a firm foundation for advanced programs in biological sciences, medicine, dentistry, and bioengineering.

▶ Organization

The book opens with four chapters (1–4) that introduce the fundamental molecular building blocks of all cells: sugars, proteins, nucleic acids, and lipids. Chapter 1 also firmly establishes the role of natural selection as a driving force in the evolution of life at the level of single cells. The remaining 10 chapters focus on illustrating how cells use these building blocks to perform their essential functions. We do not assume students must read the chapters in order; instructors should be able to arrange the chapters in any sequence with minimal impact on topic continuity. Some chapters can be clustered into broader themes. Chapters 5 and 6 focus on the cytoskeleton and extracellular matrix, respectively, to explain how cells establish, maintain, and modify their shapes. Chapters 7–9 focus on DNA replication, transcription, translation, protein sorting, and the endomembrane system to illustrate the theme of information transfer from DNA to proteins. Chapters 11–13 use signal transduction as a unifying theme to illustrate the relationships among signaling pathways, control of gene expression, and cell growth/apoptosis. Finally, Chapter 14 revisits the evolution theme in Chapter 1 but updates it to include artificial selection by humans as the next paradigm shift in evolution. This allows students to review the technology topics presented in Chapters 1–13 in the context of creating new forms of life.

▶ What’s New

First and foremost, we added a new author, Dr. Diana Bebek Ivankovic. Dr. Ivankovic has over 25 years of teaching and research experience in cell biology, cancer, genetics, and microbiology as well as years of experience as a textbook editor, reviewer, and contributor. She is the source and inspiration of the new pedagogic elements in the third edition. We received over 250 comments/suggestions on the second edition from students, instructors, and subject matter experts, adjudicated each, and incorporated most of them into the third edition. For example, we added five new elements to every chapter:

■ First, we addressed reviewers’ requests for more emphasis on cell biology techniques by adding “Applied Cell Biology” callout boxes that discuss technology in the context of each chapter’s principle.
■ Second, we added a “Case Study” to each chapter to illustrate how the concepts of the chapter apply to modern challenges in biology; these focus primarily on human health and disease, reflecting our emphasis on eukaryotic organisms. The case studies also include study questions, with suggested answers provided in the back matter of the text.
■ Third, we embraced the wealth of content available on the World Wide Web by providing sample search terms and inviting students to explore the chapter topics online.
■ Fourth, for the eBook version, we provide online access to one video lesson, recorded by Dr. Ivankovic, for each chapter; this serves to reinforce the informal narrative style of the text and give students a glimpse into how we teach these topics ourselves.
■ Fifth, because of the positive feedback we received for the first two editions of this text, we added cell biology principles to Chapters 1–4 and revised four of the existing principles to fully embrace this organizational concept for all 14 chapters.

We also added Concept Check questions, with suggested answers, to every major section of each chapter, and we increased the number of multiple-choice questions at the end of every chapter to 10.

In addition to the new elements, we made the following changes in the third edition, by chapter:

■ Chapter 1. We rewrote nearly all of Chapter 1 to move it away from the traditional role of previewing content in other chapters and toward a more substantive coverage of evolution at the cellular level. Our goal for Chapter 1 is to firmly establish the theme of evolution from the outset so students can view how the molecular differences between cells and organisms arose through natural selection. This chapter also establishes the themes of teamwork and biological codes, which we feel are critical to understanding the molecular complexity of most cellular functions.
■ Chapter 2. We revised approximately 20% of the content. We added a section describing the variety of RNAs cells produce and imported some of the DNA modification material from Chapter 12 to consolidate the discussion of chromosome remodeling. This provides the background to support our case study on X chromosome silencing. We also corrected some errors in chemical nomenclature.
■ Chapter 3. We made modest updates to the existing text (e.g., a slight expansion of our discussion of protein binding kinetics) and focused most of our revisions on the new elements (callout boxes on ‘omics technologies, home pregnancy test kits, and flu vaccine development).
■ Chapter 4. At the request of many reviewers, we moved sections on membrane transport from Chapter 10 to Chapter 4. We also expanded our discussion of sphingolipids and added a visual analogy/exercise (building a membrane “sandwich”) to reinforce the complex composition of biological membranes.

■ Chapter 5. We added a section on membrane anchors and cytoskeletal linkers to expand the classes of actin binding proteins to five.
■ Chapter 6. We added emphasis on the role of extracellular matrix and cell-cell junction proteins in the evolution of multicellular organisms and added a new class of extracellular matrix molecules: matricellular proteins. This new topic provided an opportunity to reinforce the theme that signaling molecules can be embedded in the extracellular space.
■ Chapter 7. Approximately 15% of this chapter has been updated. We revised the principle for this chapter to expand our focus to the role of the nucleus as an organizer and protector of nucleic acids. The principle for the second edition, that DNA integrity is the top priority for cells, is embedded in the new principle. We used this shift in focus to more closely examine how DNA is organized and imported content from Chapter 2 to explain the physical basis of chromosome organization. We also expanded our discussion of DNA replication and the organization of the replication fork.
■ Chapter 8. We updated the principle for this chapter to further emphasize the connections between DNA, RNA, and proteins as forms of cellular information and to illustrate how biological codes are essential to proper collaboration between molecular teams. We also pick up the discussion of RNAs from Chapter 2 and address the multiple ways that cells move different RNA types into and out of the nucleus. The revised chapter also includes an updated description of peroxisomal membrane transport and protein degradation in the endoplasmic reticulum. Finally, we expand the topic of targeted mRNA transport in the cytosol.
■ Chapter 9. The section on the structure and function of the Golgi apparatus was extensively updated to reflect the impact of genomic sequencing on the field. We also added a new section, entitled Peroxisomes Defy Classification, to embrace the growing evidence that the peroxisome is a descendant of both the endomembrane system and mitochondria. This section also provides an opportunity to review the sometimes tortuous path researchers must take to answer such a seemingly simple question as, “Where did peroxisomes come from?”
■ Chapter 10. We moved the membrane transport section to Chapter 4 and expanded the discussion of glucose metabolism to include gluconeogenesis and anabolic pathways.
■ Chapter 11. We made minor changes to clarify the content.
■ Chapter 12. We updated approximately 20% of the content, including the sections covering G protein coupled receptor-, phosphatidylinositol-, and protein kinase-mediated signaling in the nucleus and the section on epigenetic modification of DNA.
■ Chapter 13. We expanded the details of growth factor-mediated signaling and control of the cell cycle, repair of double-stranded DNA breaks, and cytokinesis.
■ Chapter 14. We changed the principle for this chapter to reflect our increased emphasis on cellular evolution and the impact of human activity on all organisms. We retained most of the content on neuromuscular systems and introduced a new section on meiosis and embryonic development. We also added a section on gene editing to capture its central importance to all aspects of biology, from biomedical research to designer organisms.

▶ The Student Experience

Each chapter contains pedagogical features to assist instructors and facilitate student comprehension. These include:

Principles of Cell Biology, Third Edition has been thoroughly redesigned to better identify and illuminate the 14 Principles of Cell Biology (page xiv). This breakdown into 14 easy-to-understand principles provides students with the formula for understanding how cells function as well as how we study them.

Key Concepts—A list of Key Concepts at the start of each section provides a framework of the core cell biology concepts for students as they read through the section. Students should refer back to the Key Concepts as they progress through the section and eventually to the Concept Check question. Chapter Outlines—Along with The Big Picture, a Chapter Outline is included at the start of every chapter. This list succinctly identifies the sections readers will encounter as they progress through a chapter. It provides students with a clear understanding

of the overarching topics that pertain to the cell biology principle under discussion. The Big Picture—Written in a conversational tone, The Big Picture opens every chapter and identifies and explains the objectives of the chapter. Students should carefully read it to guide their focus before diving into the chapter. As they progress through the chapter, they should think critically about these objectives to ensure they fully understand the cell biology principle being covered.

Concept Checks—Each section opens with a list of key concepts to help guide readers’ focus and concludes with a Concept Check question. Answers are provided in the back of the book so students can instantly check their comprehension of section material.

Box Tips—Helpful boxes throughout each chapter include study tips, clarifications of apparent contradictions, explanations of

naming schemes, FAQs, and more. These boxes clarify and complement section and chapter material as well as offer additional interesting cell biology information. Chapter Summaries—To ensure readers thoroughly grasp the important concepts, each chapter concludes with a comprehensive Chapter Summary to provide students with a clear understanding of the cell biology principle under discussion. Students can review the summary before diving into the chapter to direct their study and can also use it as a study tool to prepare for course lectures and exams. Chapter Study Questions—New, thought-provoking, end-ofchapter questions, including open-ended and multiple choice, integrate material across chapter sections and assess students’ retention of core cellular concepts. The answers are at the back of the book, providing students with a complete learning and study solution.

Case Studies – A NEW Case Study box has been added to each chapter taking a closer look at each principle. Each Case Study also includes study questions.

Animations—To visually assist readers in understanding key cellular processes, Jones & Bartlett Learning has developed interactive cellular biology animations. These engaging animations bring fascinating cell biology phenomena to life! Each interactive animation guides students through cellular processes and gauges students’ understanding with exercises and assessment questions.

▶ Teaching Tools

The following Teaching Tools are available to assist qualified instructors with preparations for their course.

The Image Bank in PowerPoint format provides the illustrations and photographs from the text to which Jones & Bartlett Learning holds the copyright or has permission to reproduce digitally. You can quickly and easily copy individual images into your existing lecture slides.

The Slides in PowerPoint format, developed by the author, provide lecture notes and images for each chapter of Principles of Cell Biology, Third Edition. Instructors with the Microsoft PowerPoint software can customize the outlines, art, and order of presentation.

An extensive Test Bank is also available.

A list of Projects and Assignments details fun and engaging classroom and homework activities relevant to each chapter.

Courtesy of Rhonda Reigers Powell.

Acknowledgments

Just as cell biology is a collaborative science, creating and publishing this textbook is the product of our extensive collaboration with the outstanding editorial and production team at Jones & Bartlett Learning, artists, our professional colleagues, and the external reviewers. We offer our deepest thanks to everyone who helped take this project from its very humble beginning to the finished product. This list includes, but is not limited to, the following individuals: Laura Pagluica and Melissa Duffy. Special thanks are extended to Elizabeth Morales, who developed the art program. Special thanks to Dr. Jeffrey Pommerville, who let us use his “To the Student” section for this book. Thanks are also extended to Dr. Heidi Moawad of John Carroll University for her assistance in revising the PowerPoint Slides and Instructor’s Manual.

The authors and publisher would like to thank the following individuals for serving as reviewers in the preparation of this Third Edition:

Dr. Ellen L. Batchelder, Unity College
Christine Bezotte, Elmira College
Erick Bourassa, Mississippi College
Claudette P. Davis, George Mason University
Bradley Creamer, Missouri State College
Robert S Dotson, Tulane University
Kristiann Dougherty, Palm Beach Atlantic University
Scott Gehler, Augustana College
Kyle Miller Hesed, Hesston College
Ron Harris, Marymount California University
Vladimir Jurukovski, Suffolk County Community College
Fran Norflus, Clayton State University
Malaney O’Connell, McMurry University
Lalitha Ramamoorthy, Marian University
Dr. Sederick C. Rice, University of Arkansas at Pine Bluff
Robert Seiser, Roosevelt University
Thomas E. Smith, Ave Maria University
Beth VanWinkle, Rochester Institute of Technology

Thanks are also extended to those who have previously reviewed Principles of Cell Biology:

Soohyoun Ahn, Arkansas State University
Jennifer L. Bath, Concordia College
Wade Bell, Virginia Military Institute
Brad Bryan, Worcester State College
Ken Cao, University of Maryland College Park
Carol Castaneda, Indiana University Northeast
Helen Cronenberger, University of Texas at San Antonio
Linda DeVeaux, Idaho State University
Lara Dowland, Mount Wachusett Community College
Erastus C. Dudley, Huntingdon College
Janet Duerr, Ohio University
Scott Erdman, Syracuse University
David E. Featherstone, University of Illinois at Chicago
Ellen France, Georgia College & State University
Sheldon R. Gordon, Oakland University
Barbara Frank, Idaho State University
Daniel E. Frigo, University of Houston
Anne Grippo, Arkansas State University
Jeff Hadwiger, Oklahoma State University
Shelley Hepworth, Carleton University
Dr. Sarah Higbie, Saint Joseph College
Stanley Hillyard, University of Nevada at Las Vegas
Barry Hoopengardner, Central Connecticut State University
Dr. Benjamin Johnson, Hardin-Simmons University
Martin Kapper, Central Connecticut State University
Erik Larson, Illinois State University
Julia Lee, St. Joseph’s University
Ken Lerea, New York Medical College
Holger Lill, VU University, Amsterdam
Charles H. Mallery, University of Miami
Paolo Martini, Brandeis University
Bob Morris, Wheaton College
Jim Mulrooney, Central Connecticut State University
Rolf Prade, Oklahoma State University
Hongmin Qin, Texas A&M University
Mark Running, University of Louisville
Eric Sikorski, University of Tampa

I wish to thank my former students and fellow cell biologists, who simultaneously ask the utmost of me and provide much of the inspiration and imaginative ideas to spur me on. I salute you and look forward to hearing your ideas for making this text as effective and enjoyable as possible.

George Plopper

I would like to thank my dearest friends Dr. Dorota Abramovitch and Dr. Anne Claire Edwards for their help with organic chemistry and medicine. I would also like to thank Alyssa Edwards and Haley Bennett for their support with spec sheets, as well as my three top students, Isaac Daffron, Michael Stevens, and Walter (Kyle) Myers, for their assistance with edits, updates, the glossary, and answers to chapters’ questions. I am deeply grateful to Dr. David DeHart and Mr. Zachary Perdun for sharing with me their knowledge in molecular biology. I am grateful to Dr. Andrew Mount and Ms. Rhonda Powell from Clemson University for supplying us with images for the book. Lastly, I’d like to thank my sons Andre and Sven for sharing information about Stephen Hawking’s points of view, space, and the latest updates in medicine.

Diana Bebek Ivankovic


About the Cover

MOVAS cells, an immortalized vascular smooth muscle cell line, were stained using primary antibodies to ATP-binding cassette, subfamily G, member 1 (ABCG1) (green) and α-actin (red), followed by incubation with secondary Alexa Fluor antibodies. Cells were counterstained with DAPI (blue) to highlight the nuclei. Samples were imaged using a Leica SP8X confocal microscope (Leica Microsystems), housed in the Clemson Light Imaging Facility (CLIF, Clemson University Division of Research, Clemson, South Carolina). Sample preparation and imaging were conducted by Ms. Rhonda Reigers Powell, CLIF Manager. Cells and antibodies were provided by Dr. Alexis Stamatikos (Assistant Professor; Department of Food, Nutrition, and Packaging Sciences; Clemson University). Additional reagents and technical advice were provided by Dr. Terri F. Bruce, Academic Program Director, CLIF.


About the Authors

George Plopper is a Senior Lead Scientist at the management, information technology, and strategic consulting firm Booz Allen Hamilton in the Washington, D.C. area. He received his Bachelor of Arts degree in Biology from the University of California, San Diego and his Ph.D. in Cell and Developmental Biology from Harvard University in Cambridge, Massachusetts. Prior to joining Booz Allen Hamilton, Dr. Plopper was a professor of biological sciences at Rensselaer Polytechnic Institute in Troy, New York.

During his 20-year career in academia he received four teaching awards and was named an Education Fellow by the National Academy of Sciences.

Diana Bebek Ivankovic is the Director of the Center for Cancer Research (CCR) and Professor at Anderson University in Anderson, SC. She received an International Baccalaureate degree from the United World College of the Adriatic in Duino, Italy, and a bachelor’s degree in biology from Lander University, Greenwood, SC. She has a master’s degree in zoology and a doctorate degree in microbiology from Clemson University, in Clemson, SC. Dr. Ivankovic taught biology at Clemson University from 1989 to 2004 before coming to Anderson University. She is a cancer survivor, the founder of the Anderson University CCR, and has been teaching biology, microbiology, cell biology, and genetics since 2004. Dr. Ivankovic has received several teaching and leadership awards.

▶ A Note to the Students

To the Student—Study Smart

Your success in cell biology—or any college or university course—will depend on your ability to study effectively and efficiently. Therefore, this textbook was designed with you, the student, in mind. The text’s organization will help you improve your learning and understanding and, ultimately, your grades. The learning design illustrated below reflects this organization. Study it carefully, and, if you adopt the flow of study shown, you should be a big step ahead in your preparation and understanding of cell biology—and for that matter any subject you are taking.

When we were undergraduate students, we hardly ever read the “To the Student” section (if indeed one existed) in our textbooks because the section rarely contained any information of importance. This one does, so please read on. In college, we were mediocre students until our junior year. Why? Mainly because we did not know how to study properly, and, more importantly, we did not know how to read a textbook effectively. Our textbooks were filled with highlighted sentences without any plan on how we would use this “emphasized” information. In fact, most textbooks assume you know how to read a textbook properly. We didn’t and you might not, either.

Reading a textbook is difficult if you are not properly prepared. So that you can take advantage of what we learned as students and have learned from instructing thousands of students, we have worked hard to make this text user friendly with a reading style that is not threatening or complicated. Still, there is a substantial amount of information to learn and understand, so having the appropriate reading and comprehension skills is critical. Therefore, we encourage you to spend 20 minutes reading this section, as we are going to give you several tips and suggestions for acquiring those skills. Let us show you how to be an active reader.

Be a Prepared Reader

Before you jump into reading a section of a chapter in this text, prepare yourself by finding the place and time and having the tools for study.

Place. Where are you right now as you read these lines? Are you in a quiet library or at home? If at home, are there any distractions, such as loud music, a blaring television, or screaming kids? Is the lighting adequate to read? Are you sitting at a desk or lounging on the living room sofa? Get where we are going? When you read for an educational purpose—that is, to learn and understand something—you need to maximize the environment for reading. Yes, it should be comfortable, but not to the point that you will doze off.

Time. All of us have different times during the day when we best perform a certain skill, be it exercising or reading. The last thing you want to do is read when you are tired or simply not “in tune” for the job that needs to be done. You cannot learn and understand the information if you fall asleep or lack a positive attitude. We have kept all of the chapters in this text about the same length so you can estimate the time necessary for each and plan your reading accordingly. If you have done your preliminary survey of the chapter or chapter section, you can determine about how much time you will need. If 40 minutes is needed to read—and comprehend (see below)—a section of a chapter, find the place and time that will give you 40 minutes of uninterrupted study. Brain research suggests that most people’s brains cannot spend more than 45 minutes in concentrated, technical reading. Therefore, we have avoided lengthy presentations and instead have focused on smaller sections, each with its own heading. These should accommodate shorter reading periods.

Reading Tools. Lastly, as you read this, what study tools do you have at your side? Do you have a highlighter or pen for emphasizing or underlining important words or phrases? Notice the text has wide margins that allow you to make notes or to indicate something that needs further clarification. Do you have a pencil or pen handy to make these notes? Or, if you do not want to “deface” the text, make your notes in a notebook. Lastly, some students find having a ruler is useful to prevent their eyes from wandering on the page and to read each line without distraction.

Be an Explorer Before You Read

When you sit down to read a section of a chapter, do some preliminary exploring. Look at the section head and subheadings to get an idea of what is discussed. Preview any diagrams, photographs, tables, graphs, or other visuals used. They give you a better idea of what is going to occur. We have used a good deal of space in the text for these features, so use them to your advantage. They will help you learn the written information and comprehend its meaning. Do not try to understand all the visuals, but try to generate a mental “big picture” of what is to come. Familiarize yourself with any symbols or technical jargon that might be used in the visuals.

The end of each chapter contains a Summary for that chapter. It is a good idea to read the summary before delving into the chapter, even though it is at the end. That way you will have a framework for the chapter before filling in the nitty-gritty information.

Be a Detective as You Read

Reading a section of a textbook is not the same as reading a novel. With a textbook, you need to uncover the important information (the terms and concepts) in the forest of words on the page. So, the first thing to do is read the complete paragraph. When you have determined the main ideas, highlight or underline them. However, we have seen students highlighting the entire paragraph in yellow, including every a, the, and and. This is an example of highlighting before knowing what is important. So, we have helped you out somewhat. Important terms and concepts are in boldface followed by the definition (or the definition might be in the glossary). So only highlight or underline with a pen essential ideas and key phrases—not complete sentences, if possible.

What if a paragraph or section has no boldfaced words? How do you find what is important here? From an English course, you may know that often the most important information is mentioned first in the paragraph. If it is followed by one or more examples, then you can backtrack and know what was important in the paragraph. In addition, we have added section “speed bumps” (called Concept Checks) to let you test your learning and understanding before getting too far ahead in the material. These checks also are clues to what was important in the section you just read.

Be a Repetitious Student

Brain research has shown that each individual can only hold so much information in short-term memory. If you try to hold more, then something else needs to be removed—sort of like a cell phone that has run out of storage. So that you do not lose any of this important information, you need to transfer it to long-term memory—to the cloud, if you will. In reading and studying, this means retaining the term or concept; so, write it out in your notebook using your own words. Memorizing a term does not mean you have learned the term or understood the concept. By writing it out in your own words, you are forced to think and actively interact with the information. This repetition reinforces your learning.

Be a Patient Student

In textbooks, you cannot read at the speed that you read your email or a magazine story. There are unfamiliar details to be learned and understood—and this requires being a patient, slower reader. Identifying the important information from a textbook chapter requires you to slow down your reading speed. Speed-reading is of no value here. It may help to go back and re-read sections as your general understanding of the topic improves. We use many cross-references in this book, and suggest you take the time to look up the referenced material in other chapters.

Know the What, Why, and How

Have you ever read something only to say, “I have no idea what I read!”? As we’ve already mentioned, reading a cell biology text is not the same as reading Sports Illustrated or People magazine. In these entertainment magazines, you read passively for leisure or perhaps amusement. In Principles of Cell Biology, you must read actively for learning and understanding—that is, for comprehension. This can quickly lead to boredom unless you engage your brain as you read—that is, be an active reader. Do this by knowing the what, why, and how of your reading.

What is the general topic or idea being discussed? This often is easy to determine because the section heading might tell you. If not, then it will appear in the first sentence or beginning part of the paragraph.

Why is this information important? If we have done our job, the text section will tell you why it is important or the examples provided will drive the importance home. These surrounding clues further explain why the main idea was important.

How do we “mine” the information presented? This was discussed under “Be a Detective as You Read.”

Have a Debriefing Strategy

After reading the material, be ready to debrief. Verbally summarize what you have learned. This will start moving the short-term information into the long-term memory storage—that is, retention. Any notes you made concerning confusing material should be discussed as soon as possible with your instructor. A lot of cell biology is represented visually, so allow time to draw out diagrams. Again, repetition makes for easier learning and better retention.

In many professions, such as sports, music, or the theater, the name of the game is practice, practice, practice. The hints and suggestions we have given you form a skill that requires practice to perfect and use efficiently. Be patient: things will not happen overnight, but perseverance and willingness will pay off with practice. You might also check with your college or university academic (or learning) resource center. These folks will have more ways to help you to read a textbook better and to study well overall.

Send Us a Note In closing, we would like to invite you to write us and let us know what is good about this textbook so we can build on it and what may need improvement so we can revise it. Send your correspondence to Dr. Diana Ivankovic, 316 Boulevard, #1153, Anderson University, Anderson, SC 29621. Feel free to email us at: [email protected]. We wish you great success in your cell biology course. Welcome! —Drs. Plopper and Ivankovic Website: https://www.andersonuniversity.edu/artssciences/faculty/diana-ivankovic

CHAPTER 1

Life Is a Team Sport
The Evolution of Cellular Diversity

CHAPTER OUTLINE
1.1 The Big Picture
1.2 Life Can Arise from Simple Ingredients
1.3 All Cells Are Built from the Same Common Molecular Building Blocks
1.4 Cells Must Cooperate to Succeed
1.5 Chapter Summary

▶ 1.1 The Big Picture

To make good use of the study strategy outlined in the To the Student section of this book, let’s start by exploring the overall organization of the entire text, and this chapter in particular. Each one of the fourteen chapters in this book addresses a principle of cell biology. While it is not essential to read the chapters in order, each builds on the material covered in earlier chapters. You will often be referred to other chapters to learn more about specific concepts.

In this chapter, we will examine how life appeared and evolved on Earth. We will address a question that currently strikes at the heart of all biological disciplines: did life originate from scratch on this planet, or was it “seeded” by extraterrestrial matter that crashed onto the Earth’s surface? If the latter proves to be true, then life may exist throughout the universe and be subject to the same type of evolution that gave rise to the tremendous diversity of life forms on Earth.

One of the most important lessons we have learned is that organisms, and the molecules that comprise them, never function in isolation. Even the simplest single-celled organisms live in a complex chemical environment, and their existence depends on their ability to cooperate with other organisms they encounter. The same is true for the molecules inside cells, too: virtually every cellular function depends on groups of different molecules interacting in specialized ways to help cells feed, reproduce, secrete waste, and defend themselves against attack. In everyday terms, we can sum this up as cell biology principle 1: life is a team sport (see Box 1-1). Every cellular function addressed in this book requires groups of molecules, with each member playing its part in a sometimes dizzyingly complex dance. Focusing on what molecules comprise each function’s team, what their roles are, and the sequence of their activities will help provide a framework for keeping the details straight.

CELL BIOLOGY PRINCIPLE 1

Life is a team sport.

Photo courtesy of Andrew S. Mount, PhD; Jonathan Stewart; Nichole Hickman; Michael Groce; Okeanos Research Lab, Clemson University.

BOX 1-1 TIP Our goal in each chapter of the book is for students to understand one principle of cell biology at a level that is suitable for their background and interests. This means that depending on the degree of detail they are seeking, it may be unnecessary to read an entire chapter, or they may have to seek more advanced texts. It is important to ask the course instructor what level of sophistication is appropriate for his or her course.

Please keep an open mind while reading this chapter, and recall that cell biology, like all scientific fields, is not a study of absolutes. Rather, we evaluate the results from properly controlled experiments and decide whether those results support a proposed explanation for what we know about how organisms live and adapt to an ever-changing environment. Rarely do we declare a given explanation to be unequivocally true or false, and frequently, our explanations must be revised as new data are collected. Sometimes we find new team members, change their roles, or reorganize the sequence in which they act.

An excellent example of an unexpected outcome is the life history of a famous scientist, the British physicist Stephen Hawking. When he was diagnosed with the deadly motor neuron deterioration disease amyotrophic lateral sclerosis (ALS) at age 21, most of his physicians did not expect him to live more than a few years. Instead, he led a long and productive life, dying at age 76. Years after his initial diagnosis, physicians learned that the name ALS applies to several highly related diseases and that Dr. Hawking suffered from a slowly progressing form of ALS. As more information became available, Dr. Hawking’s diagnosis and medical care were updated to best reflect his medical condition. In First Principle terms, the roster of molecules and cells responsible for his condition changed as doctors learned more about his disease.

This chapter will follow a journey from the beginning of life on Earth to the evolution of prokaryotic cells and the transition from simple prokaryotes to the development of more complex eukaryotes. The chapter concludes with a discussion of the importance of holobionts—teams of eukaryotic and prokaryotic cells that communicate with each other in intriguing, mind-boggling ways. Our own human bodies are excellent examples of such cooperation and teamwork in action.

Chapter 1 is organized into three major sections:

■ The first section, Life Can Arise from Simple Ingredients, serves as an introduction to some of the essential concepts that form the foundation for supporting life. This includes the definition of life we will use for this book, a description of the conditions needed for the formation of the first cells on Earth, a review of evolution by natural selection, an introduction to the concept of code biology, and a brief discussion of the possibility of life elsewhere in the universe.

■ The second section, All Cells Are Built from the Same Common Molecular Building Blocks, introduces some of the basic chemistry that governs how molecules in living organisms interact. This section also lists some of the most common chemical structures found in biomolecules. These form a critical part of the structure–function relationship, another essential concept in biology that we will refer to often in this book. This section also presents our closest examination of sugars. This information will be put to immediate use in Chapter 2, where we examine how the sugars are used to synthesize nucleotides, another class of cellular building blocks. Pay particular attention to how sugars are assembled into polymers, as we will be referring to these polymers in several chapters.

■ The third section, Cells Must Cooperate to Succeed, explains how prokaryotes were the first organisms formed on Earth and shows how important molecular and cellular teamwork is for their continued survival. This section also addresses how eukaryotes evolved from teams of prokaryotic organisms and why this teamwork allows for the tremendous diversity of eukaryotic life forms on Earth. A defining feature of eukaryotes is their internal, membrane-bound compartments called organelles. It is important to keep in mind that these organelles are multitaskers, so most will appear in several chapters. Many students are tempted to map an organelle to one or two specific cellular functions (e.g., nucleus = DNA replication), when in fact every organelle can perform many different functions at the same time. Thus, when we discuss deoxyribonucleic acid (DNA) packing (Chapter 2), DNA replication (Chapter 7), nuclear pore transport (Chapter 8), and regulation of gene expression (Chapter 12), remember that all of these events occur simultaneously in the nucleus and that they often influence one another. It is important to keep an eye on this big picture of a cell as we explore each of the fourteen principles in this book. This section concludes with the introduction of closely cooperating teams of eukaryotic and prokaryotic cells called holobionts, which make up a surprisingly large proportion of the living beings we encounter in everyday life, including humans.

We will review each of these topics, apart from sugars, in much greater detail in the subsequent chapters of the book. While most of the following chapters are somewhat shorter, they will contain a much higher density of critical details. Tackling these details will provide students an opportunity to apply the study skills they will develop in this chapter.

Welcome to the realm of cell biology. Let’s get started!

▶ 1.2 Life Can Arise from Simple Ingredients

Key Concepts
■ Living beings are composed of some of the most common atoms and molecules on Earth.
■ Liquid water is essential for the maintenance of all known living beings.
■ Biological codes help explain how simple molecules combine to form living beings.
■ Chemical and physical evidence supports the possibility that extraterrestrial life could exist.

For most cell biologists, the definition of the word life is fairly straightforward: it is a chemical system capable of Darwinian evolution. Objects that are alive must, therefore, be able to both generate nearly exact copies of themselves and self-correct (i.e., restore themselves to a defined state by repairing damage). In simpler terms, we can say that all living beings replicate (create offspring) and correct their genetic errors. In the hierarchy of biological complexity, individual molecules, such as proteins or DNA, are not considered to be alive, even if they exist inside a living organism, because they lack one or both of these abilities. Although many of these molecules (e.g., enzymes) can catalyze chemical reactions, no single chemical reaction can both replicate and repair these molecules. From these observations, it is clear that life is a trait possessed only by groups of molecules that work together.

Biologists believe that the earliest biological molecules were simple molecules capable of self-replication. Once these molecules were enclosed in a selective barrier, called a membrane, they were capable of forming teams that cooperated to maintain a fairly constant internal environment, even when conditions outside the membrane varied greatly. Molecular teams that developed the additional ability to repair or replace damaged team members were then able to generate nearly exact replicas of themselves, including the membrane. This membrane-enclosed team of molecules is now called a cell. The membrane is often referred to as the plasma membrane or cell membrane, and it encloses a compartment called the cytoplasm. Because a cell is the simplest living structure, it is also referred to as the fundamental unit of life (see Box 1-2). All living beings are composed of cells. The simplest cells are called prokaryotes, to distinguish them from more complex cells known as eukaryotes.

BOX 1-2 FAQ: ARE VIRUSES ALIVE?

One issue that continues to be debated among biologists is whether viruses, which must infect a living cell to replicate, are living organisms. A common argument supporting the living organism definition is that viruses are capable of accurately replicating themselves by following the same types of instructions (the genetic code of DNA and ribonucleic acid [RNA]) that all other living organisms use. Proponents of the living definition also point out that viral dependence on a cell for replication is not unlike the dependence of individual cells in a multicellular organism on the whole organism (e.g., for a skin cell to divide, the whole animal must be alive). Those who disagree with this point of view argue that viruses cannot self-correct; if a virus particle is damaged, it will never repair the damage—it can only attempt to create new virus particles. Much of this debate stems from the fact that there is no universally accepted definition of life. Regardless, it is clear that viruses contribute genetic information to living organisms and that they appeared very early in the evolution of life.

Nonliving Substances Combine to Form Life

Considerable experimental evidence suggests that the very first cell on Earth arose through abiotic chemical reactions. This is hardly surprising: before life began, everything on Earth was, by definition, abiotic (lacking life). But how did this remarkable transformation occur? The most compelling and plausible research suggests that the beginning of life depended on the compounds hydrogen sulfide and hydrogen cyanide (see Box 1-3). The chemical reduction of hydrogen cyanide by hydrogen sulfide, catalyzed by ultraviolet light, allowed carbon atoms to form covalent bonds with each other. This process, in turn, formed multicarbon complexes that bound to other atoms and rearranged themselves until they formed the molecules that ensure storage and transmission of biological information (ribonucleotides), compartmentalization of chemical reactions (lipids), and catalysis of the chemical reactions necessary to keep cells alive (ribonucleotides and amino acids). (We will explore these molecules in more detail in Chapters 2–4.) These events likely occurred independently and then combined, perhaps millions of times. Once this primitive collection of molecules cooperated enough to replicate and self-correct as an integrated system, life began.

BOX 1-3 TIP: MAKE JUDICIOUS USE OF THE INTERNET

We all use internet searches in our everyday lives to gain information, solve problems, and communicate with one another. Scientists are no different: a tremendous wealth of information about cell and molecular biology is available for free on the Web. This book assumes the reader has ready access to online information and uses it often. But just as in daily life, not all information is reliable. One of the key steps in becoming a professional scientist is learning to discriminate between sound, evidence-based sources of information and the rest. Some of the best biological information is collected by the U.S. National Center for Biotechnology Information, and it is free to access. Other sources are professional societies, including:

■ American Association for the Advancement of Science
■ Federation of American Societies for Experimental Biology
■ American Society for Cell Biology
■ Federation of European Biochemical Societies
■ Association of Academies and Societies of Sciences in Asia
■ Network of African Science Academies
■ iBiology

Because these organizations are intended primarily for professional scientists, the “learning curve” for making sense out of their communications can at times be rather steep for beginners. However, most scientific societies commit considerable effort to public outreach, often through short videos. Virtually every topic discussed in this book has many videos dedicated to it, and many of them are excellent. For example, search for “hydrogen sulfide, hydrogen cyanide, origin of life videos” to learn more about the role these molecules played when early life began. Check with your instructor for guidance on what information to access for your course.

Membrane Formation Requires Water

Lipid formation is likely the most important of these abiotic events because many lipids naturally form compartments in aqueous solutions. This permits several successive chemical reactions to occur within the same compartment, an essential requirement for the emergence of life on Earth. Scientists describe these lipid-bound compartments as “natural test tubes” where cellular experiments are carried out. Within these primitive membranes, early cells developed the skills necessary to survive and replicate. All organisms have inherited these critical capabilities. An important lesson we have learned is that even the most modern organisms rely on the same concepts to keep themselves alive. Simply put, life is still impossible without liquid water, almost 3.5 billion years after the first cells appeared on Earth. Why water?

Water Is the Most Common Compound Found in Cells

Water is so common in biology that we often take it for granted. All living cells, including the driest plant seeds and fungal spores, contain water. In most cells, water is the most abundant molecule.

To understand its true significance, we need to emphasize the five traits that make it special (FIGURE 1-1):

FIGURE 1-1 Five unusual traits of water.
■ First, water is the only molecule of its size that exists as a liquid at room temperature and normal atmospheric pressure (compounds of a similar mass, such as methane and ammonia, are gases under these conditions).
■ Second, water is a very polar molecule. Specifically, the high electronegativity of oxygen causes the hydrogen electron involved in the covalent bond to spend the majority of its time circling the nucleus of the oxygen atom. This creates an imbalanced bond. Because the electron orbits mostly around the oxygen, the oxygen acquires a partial negative charge (typically represented by the symbol δ–), while each hydrogen atom becomes partially positive (represented as δ+). The imbalance in charge is what holds water molecules together so closely and helps explain why it is a liquid at room temperature: the δ– of the oxygen atom attracts the δ+ of the hydrogen atoms in nearby water molecules, and vice versa. This phenomenon, known as hydrogen bonding, occurs in other molecules as well.
■ Third, the liquid phase of water has a higher density than its solid phase in standard conditions (i.e., ice floats in water). This trait, too, can be explained by the ubiquitous hydrogen bonding that occurs in liquid water, as these bonds pack water molecules more closely together than the regular, repeating arrangement found in crystals such as ice. If liquid water forms its most common solid (known as ice) in cells, the increased volume of the solid water can tear membranes apart and rupture cells.
■ Fourth, the extensive hydrogen bonding in water allows it to absorb a great deal of heat before it changes temperature: 1 calorie of heat is required to heat 1 mL of water by 1˚C. The technical term for this is specific heat, and the value for water is much higher than for most other liquids. In practical terms, this means that water is a good insulator against any heat generated by chemical reactions in a cell.
■ Fifth, the high polarity of water molecules also means that it takes a relatively large amount of heat to vaporize liquid water; this is called the heat of vaporization. Some multicellular organisms take advantage of this property by using water as a coolant (e.g., sweat) or as a means of transporting molecules (e.g., transpiration in plants).
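The fourth and fifth traits can be put into rough numbers. The short Python sketch below is an illustration added here, not part of the figure: the function names are hypothetical, the specific heat of 1 calorie per gram per degree follows the value quoted above (treating 1 mL of water as 1 g), and the heat of vaporization of roughly 540 calories per gram near the boiling point is a commonly cited approximate value that is assumed for this example.

```python
# Illustrative sketch only; names and the 540 cal/g figure are assumptions
# added for this example, not values taken from the chapter.

SPECIFIC_HEAT_WATER = 1.0      # cal per gram per degree C (1 mL of water ~ 1 g)
HEAT_OF_VAPORIZATION = 540.0   # cal per gram near 100 degrees C (approximate)


def heat_to_warm(mass_g, delta_t_c, specific_heat=SPECIFIC_HEAT_WATER):
    """Heat (cal) needed to change a sample's temperature: q = m * c * dT."""
    return mass_g * specific_heat * delta_t_c


def heat_to_vaporize(mass_g, heat_of_vaporization=HEAT_OF_VAPORIZATION):
    """Heat (cal) needed to turn liquid at its boiling point into vapor."""
    return mass_g * heat_of_vaporization


if __name__ == "__main__":
    # Warming 1 g of water by 1 degree C, the case quoted in the text.
    print(heat_to_warm(1.0, 1.0))
    # Vaporizing that same gram takes far more heat than warming it.
    print(heat_to_vaporize(1.0))
```

Comparing the two results suggests why evaporating even a small amount of sweat can carry away a large amount of heat.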

The Chemical Properties of Water Impact Nearly All Molecular Interactions in Cells

Because water is polar, every other molecule that interacts with it can be classified as compatible (hydrophilic), incompatible (hydrophobic), or both (amphipathic). Hydrophilic molecules include ions (H+, Na+, Cl−, etc.) as well as other polar molecules (e.g., sugars, ammonia); the charge imbalance in these compounds attracts the polar atoms in water. Hydrophobic molecules are nonpolar (and contain no charged atoms); therefore, they do not attract water. Amphipathic molecules have both hydrophilic and hydrophobic regions; in liquid water, they spontaneously arrange themselves in specific ways to shield their hydrophobic regions from water molecules, while the hydrophilic regions maximize their contact with water. Lipids are among the most common amphipathic molecules in cells. This need to adopt a specific configuration in water makes lipids extremely valuable to cells. As we will see in our discussion of biological codes later in this chapter, to remain alive, cells use membranes to separate their contents from the environment. This helps explain why lipids are the major components of all biological membranes. The logic of this is shown in FIGURE 1-2.

FIGURE 1-2 Three reasons why lipids form membranes in cells.
■ First, hydrophobic and amphipathic molecules spontaneously cluster together in water. This is why lipids spontaneously assemble into a bilayer when submerged in water: the fatty acid tails in lipids are hydrophobic and thus cluster together. In short, cells have to expend little to no energy to organize lipids into a membrane.
■ Second, this spontaneous assembly permits membranes to automatically reseal if they are punctured (remember that self-repair is a critical feature of all cells).
■ Third, a membrane composed of lipids repels most hydrophilic molecules, thereby creating an effective barrier to hydrophilic molecules between the two sides of that membrane. It is primarily for this reason that cells can create their own internal environment.

The simple concept of spontaneous arrangement with water applies to all amphipathic molecules, even those from extraterrestrial sources. Scientists were amazed to find amphipathic compounds, among other biologically important molecules, embedded in an extraterrestrial rock known as the Murchison meteorite, which fell in Australia in 1969. These compounds were the right size to form biological membranes, and when extracted and placed in water, they self-assembled into compartments that behaved like simple biological membranes. How might they have formed? Scientists discovered that in the presence of high heat, carbon monoxide and hydrogen can combine to form compounds with up to sixteen carbons. These uncomplicated reactions could have occurred both in the volcanic conditions of the early Earth and during meteorite formation. This discovery led scientists to the conclusion that during the Eoarchean era (3.6–4.0 billion years ago) the “first membrane-forming amphiphiles” on Earth were formed from “geochemical sources, including extra-terrestrial material” (Fiore and Strazewski 2016). Chemists hypothesize that repeated cycles of boiling and cooling of water concentrated these amphipathic molecules on the surface of bodies of water (oceans, lakes, lagoons, etc.), on beaches, or at the edge of hydrothermal vents. (In 2018, scientists discovered evidence of a large lake of liquid water approximately one mile below the surface of Mars, raising the possibility that similar reactions could take place there as well.) As time passed, the amphipathic compounds formed water-filled compartments that concentrated essential molecular building blocks for life, increasing their interactions. Some of these compartments eventually merged to form the first protocells, similar to how oil droplets collect in a glass of water. Scientists believe this is the first instance of evolutionary selection (see Box 1-4) because every time a new membranous vessel formed, different internal combinations of biologically important molecules were encapsulated within it. Only those vessels that had encapsulated the winning combination of molecules were able to move on to new levels of complexity as living cells. The current name for this winning organism is LUCA, the last universal common ancestor of all organisms.

BOX 1-4 TIP: QUICK REVIEW OF DARWINIAN EVOLUTION

Darwinian evolution relies on three conditions:

1. Organisms undergoing evolution must possess enough structural and functional variation in the population to ensure they do not all respond the same way to changes in their environment. Note that the populations of the amphiphiles in Earth’s primitive environment were sufficiently different to meet this condition.

2. Organisms must be able to pass their variable traits to their descendants; in other words, these traits must be inheritable. Because primitive amphipathic lipids self-assemble in aqueous environments, their structural and functional traits are shared by all structures that contain them, including their descendants. Note that structures that fail to make copies of themselves fail to meet a fundamental requirement for being alive, so only self-replicating protocells were eligible to evolve into living organisms.

3. Some inherited traits must confer a selective advantage on subsets of organisms within the population; over time, these traits become more common in surviving members of the population. The makeup of the population must change over time in response to environmental changes. If all variants respond identically to changes in the environment, there is no selective pressure, the population remains unchanged, and evolution by natural selection does not occur.
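The three conditions can be explored with a very small numerical model. The Python sketch below is an illustration under stated assumptions and is not a description of how protocells actually evolved: it tracks just two variants, assumes that copies always resemble their parent (condition 2), and uses hypothetical function names and made-up fitness values.

```python
# Illustrative sketch only: a deterministic two-variant selection model.
# The function names and fitness values are hypothetical.

def next_generation(freq_a, fitness_a, fitness_b):
    """Return the frequency of variant A after one round of replication.

    Each variant leaves copies in proportion to its frequency multiplied by
    its relative fitness; copies are assumed to resemble their parent.
    """
    freq_b = 1.0 - freq_a
    mean_fitness = freq_a * fitness_a + freq_b * fitness_b
    return freq_a * fitness_a / mean_fitness


def simulate(freq_a, fitness_a, fitness_b, generations):
    """Track the frequency of variant A over a number of generations."""
    history = [freq_a]
    for _ in range(generations):
        freq_a = next_generation(freq_a, fitness_a, fitness_b)
        history.append(freq_a)
    return history


if __name__ == "__main__":
    # Condition 3 met: variant A replicates slightly better than variant B.
    print(simulate(0.01, 1.1, 1.0, 100)[-1])
    # Condition 3 not met: both variants replicate equally well.
    print(simulate(0.01, 1.0, 1.0, 100)[-1])
```

In the first run the rare variant with a small advantage comes to dominate the population; in the second, where the variants respond identically, its frequency stays where it started, which mirrors the last sentence of condition 3.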

Other experiments demonstrated that under very specific conditions, layers of lipids will stack on each other, forming the characteristic “lipid bilayer” that all biological membranes now have; this must have conferred a significant advantage over single-layer compartments because all living organisms inherited this trait. Amazingly, the entire sequence of events that gave rise to all of life on Earth started with a simple chemical principle: amphipathic molecules self-assemble in the presence of liquid water to form stable structures. Life would not exist without the assembly-forcing potential of water.

Code Biology Helps Explain the Diversity of Life

Once the first cells formed on Earth, how did they grow and divide to give rise to populations capable of undergoing Darwinian evolution? There is no known physical evidence to explain how this occurred, so to address this question, we need to propose possible explanations and look for evidence to support them. One group of explanations focuses on the importance of transmitting information from one generation of cells to the next, using codes. Think for a moment how we transmit information in our everyday lives (see Box 1-5). For communication to be effective, we must understand and follow some fundamental rules: both the sender and receiver must share a common language, be it literal (e.g., words) or figurative (e.g., a smile). Furthermore, each must be equipped with the necessary structures to generate a signal and translate it into some action.

BOX 1-5 TIP: THE WORLD OF CODES To better familiarize yourself with the importance of coded communication at the cellular level, consider how critical codes are in everyday life. This book relies on the code of written language. For example, where you sit in your classroom is determined in part by the code built into the arrangement of the seats, all in order to optimize communication and learning. As another example, our conversations are filled with codes that communicate our thoughts and even our emotions to others; some codes (e.g., basic vocabulary in languages) are “general purpose,” while others (e.g., cell biology codes) are highly specialized to address very specific ideas. Next, consider what is required to make a code effective. The sender and the recipient must possess the physical ability to generate/encode and receive/decode the information. To make effective use of language codes, we rely on our senses and motor skills to read and write.

The same is true for cells and the molecules that comprise them: they must have the ability to create and receive coded information. At the cellular level, the first bits of information they exchanged were likely instructions for replication and basic metabolism. As cells grew more complex and sophisticated, the amount and complexity of the information they could exchange increased as well (see Box 1-6).

BOX 1-6 THE “CELL AS A BUSY CITY” ANALOGY Because most students new to cell biology are unfamiliar with cells, they could try thinking of a cell as something similar to what is already familiar: a very busy city. Consider how to describe the concept of a city to someone who has never been in (or even seen) one. It’s a daunting task. Hovering over any large, busy city in a helicopter and looking down at it, we might start by describing the general function(s) of the largest structures we can see (factories, office buildings, schools, roads, etc.) without delving into how they all work together to keep the city functioning. Likewise, we might describe the general concept of “people” in our initial description of a city, but we certainly would not want to start describing every single person in that city. The purpose of this chapter, then, is to (1) introduce the building blocks of the “big buildings” (proteins, membranes, organelles, and other structures in cells), (2) explain how “small villages” evolved into larger cities, and (3) emphasize the importance of cooperation and teamwork that allows groups of cities to prosper in a larger community (e.g., a state/province/region). The other chapters in this book will examine how the structures within these cities function. We’ll return to this analogy in other chapters, and as we add more detail to it, we can bring this city/cell to life, with the goal of arriving at a clear picture of how cells function at the molecular level.

Biologists use a shorthand term, code biology, to describe the biological basis for this type of information transfer (Barbieri 2018). Code biology simply proposes that biological codes are sets of rules that allow independent biological entities (e.g., molecules or cells) to exchange information. Cells pass these codes down from generation to generation. Perhaps the best-known example of a biological code is the genetic code, which was discovered in the 1960s. The genetic code enables a molecule of RNA to communicate with the machinery required to convert the information stored in its genetic sequence into a polymer of amino acids, and some of these polymers became what we now call proteins. (We will explore how these processes happen in Chapter 8.) Proteins developed their own codes for communicating with sugars, lipids, DNA, RNA, and other proteins to build even more complex structures, and some of these developed their own codes. Over the course of billions of years, these codes have multiplied, always under the selective pressure of Darwinian evolution, to give rise to such complex networks as the human brain and the codes it developed for writing and reading languages. By examining how common a specific code is in living organisms, biologists can estimate when it appeared. Biologists now believe that our oldest common ancestor must have developed the genetic code and a code for controlling basic metabolism (see Chapter 10) before the other codes appeared. The basis for this belief is the fact that all known organisms use them to convert genetic information into proteins and convert chemical energy into the actions necessary to self-replicate and repair. How these codes arose is still a hotly debated topic among biologists. The earliest machinery that interpreted these codes was likely composed of RNA molecules that could catalyze chemical reactions. Some of these RNA-based catalysts, called ribozymes, persist in modern organisms and still play a critical role in building proteins, though the mechanisms for translating genetic information into proteins have become faster and more efficient.

All organisms share the following codes, and all of them require teams of molecules to translate the codes into actions (see Box 1-5):
■ Storing genetic information. Cells contain instructions for manufacturing most of the biological molecules necessary to stay alive. These instructions are stored in the form of a simple molecular polymer called DNA.
■ Maintaining the internal environment. Living organisms must capture and store energy, and this is accomplished by forming and maintaining chemical disequilibrium with the external environment. To remain alive, cells must continually adjust their internal activities to maintain a consistent environment that differs from conditions outside the cell membrane. The captured energy is used to maintain chemical conditions that favor replication and self-repair; as the energy is used up, heat is released as a byproduct.
■ Sensing the external environment. It is essential that cells be “aware” of changes in the external environment that may impact their internal environment (e.g., changes in temperature, acidity, nutrient levels, osmotic pressure). Cells contain sensors for relevant environmental conditions such as these and ignore the rest.
■ Controlling the flow of molecules into and out of the cell. Cells communicate with their external environment mainly through the selective transport of molecules (e.g., cells import nutrients and export metabolic waste). Controlling this molecular traffic also helps cells maintain chemical disequilibrium and sense their surroundings. A state of balanced internal conditions maintained by living organisms is called homeostasis.
■ Catalyzing chemical reactions. In order to maintain a consistent internal environment, cells must be able to control the chemical reactions taking place within them. Molecules called enzymes play an important role in regulating these reactions.
■ Generating useful energy forms. To catalyze most chemical reactions and do any form of work, cells must expend energy. Many molecules in cells are devoted to capturing energy from outside of the cell (e.g., sunlight and “food”) and converting it into a small number of energy forms that cells can use directly. A well-known example of a useful energy form is adenosine triphosphate (ATP).
■ Synthesizing biological molecules. A considerable amount of the energy captured by cells is used to construct new biological molecules inside cells. These molecules may serve to replace damaged molecules, permit new functions in the cell, or generate sufficient copies of a molecule for the cell to replicate. To generate a nearly exact copy of itself, a cell must ensure that all information stored in its DNA is present in the newly created daughter cell. Cells possess molecular teams responsible for accurately replicating DNA and properly segregating it during cell division.
■ Regulating information flow. Much like teams of people who communicate with one another to accomplish a complex task, molecular teams in a cell communicate with one another as well. Some molecules specialize in transferring information from one team to another.

The appearance of new codes, which biologists call codepoiesis, helped drive the explosion of diversity in living organisms. For example, scientists now propose that three specific signal-processing codes are responsible for the creation of the three biological domains (Archaea, Bacteria, and Eukarya) from the LUCA (see FIGURE 1-3). Two of the three domains lost the ability to develop new codes, but the domain Eukarya continued introducing new codes and thus was able to generate considerably more variation in its populations than Archaea and Bacteria. This, in turn, helped drive the evolution of eukaryotes and contributed to the tremendous variety of eukaryotic life forms we find on Earth (see Box 1-7).
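The idea that a biological code is a shared set of rules can be made concrete with the genetic code mentioned earlier in this section. The Python sketch below is a simplified illustration, not a model of the cellular machinery: it lists only six of the 64 codons of the standard genetic code, reads from the first base rather than searching for a start signal, and uses a hypothetical function name.

```python
# Illustrative sketch only: a small excerpt of the standard genetic code.
# Only six of the 64 codons are listed; the function name is hypothetical.

CODON_TABLE = {
    "AUG": "Met",   # methionine; also the usual start signal
    "UUU": "Phe",   # phenylalanine
    "GGC": "Gly",   # glycine
    "AAA": "Lys",   # lysine
    "UGG": "Trp",   # tryptophan
    "UAA": "STOP",  # one of the three stop codons
}


def translate(rna):
    """Read an RNA string three bases at a time and return the amino acids.

    Reading ends at a stop codon. A codon missing from the excerpt above
    raises KeyError: sender and receiver no longer share a rule for it.
    """
    peptide = []
    usable_length = len(rna) - len(rna) % 3  # ignore an incomplete last codon
    for i in range(0, usable_length, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "STOP":
            break
        peptide.append(amino_acid)
    return peptide


if __name__ == "__main__":
    print(translate("AUGUUUGGCAAAUGGUAA"))
```

The lookup table plays the role of the shared rules, and the loop plays the role of the machinery that reads them; neither is useful without the other, which is the point made about senders and receivers of coded information.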

FIGURE 1-3 The three domains of organisms are all derived from the last universal common ancestor (LUCA) but have distinct structural properties. This is a phylogenetic tree showing the likely sequence of evolution leading from the LUCA (bottom) to eukaryotes (top right).

BOX 1-7 FAQ: ARE BIOLOGICAL CODES THE PRODUCT OF “INTELLIGENT DESIGN”? No. Keep in mind that there is no experimental evidence to support the concept that coding rules are designed, have “intent,” or reflect any logical reasoning; biological codes arise spontaneously and are subject to evolutionary change.

Evidence for the Possibility of Extraterrestrial Life

The question as to whether life exists outside of Earth is likely as old as humanity itself. But only recently have cell biologists been able to weigh in, armed with experimental evidence. So far, there is absolutely no credible, verifiable evidence that life exists anywhere outside of Earth. Instead, the argument now focuses on whether it is even possible for life to exist elsewhere. This debate gave rise to a new branch of biology in the twentieth century called astrobiology. Based on the evidence outlined so far in this chapter, the possibility of extraterrestrial life rests on some very simple requirements: (1) a source of carbon; (2) liquid water; (3) an energy source capable of catalyzing polymerization of carbon; and (4) biological codes that permit these carbon-based polymers to self-replicate, repair, and capture energy from the environment. Evidence regarding the possibility of extraterrestrial life is now accumulating. Recent advances in technologies that can find planets outside our solar system have also allowed physicists to look for the telltale signs of potential life. Carbon and water are abundant throughout the universe and have been detected on planets, moons, meteors, and smaller structures called astrophysical ices. A 2017 study replicated the conditions in astrophysical ices and exposed them to electrons that are produced when matter is hit by high-energy radiation in space, including ultraviolet light, cosmic rays, and other charged particles. In that study, these electrons supplied enough energy to catalyze the formation of carbon polymers, the basic building blocks of life (Esmaili et al. 2017). Other studies confirm that some meteorites striking the Earth’s surface contain some of the simple molecular building blocks necessary to support life. There is, of course, no evidence for biological codes in extraterrestrial matter, though the fact that organic compounds found in the Murchison meteorite can self-assemble in water leaves open the possibility that the codes necessary for life may exist elsewhere in the universe.

Together, these findings suggest that because life originated from simple molecular building blocks that exist throughout the universe, life may have originated on Earth only after the necessary biological codes appeared and allowed these molecules to exchange information. Alternatively, it is possible that because these codes arise spontaneously, life exists anywhere these codes function, including extraterrestrial sites. The field of astrobiology is dedicated to solving this age-old riddle. (To learn more about this field, refer to https://astrobiology.nasa.gov.)

Concept Check 1 In addition to carbon compounds, what are the other three essential ingredients for life to form? As a hint, consider what is necessary to make and break bonds between carbon atoms, what is necessary to compartmentalize these chemical reactions, and what enables groups of molecules to communicate.

▶ 1.3 All Cells Are Built from the Same Common Molecular Building Blocks Key Concepts ■ The structure–function relationship is a powerful tool for understanding cellular organization. ■ The study of cellular chemistry begins with an examination of the carbon atom. ■ Lipids, sugars, amino acids, and nucleic acids are the most common biomolecules in cells. ■ Sugars are simple carbohydrates. ■ Amino acids form carbon-rich molecules that contain an amino group and a carboxylic acid group.

Much of the remainder of this book explores the rosters of molecular team members within a cell, their roles, and how they accomplish their tasks. Along the way, we will be guided by a compelling paradigm in biology called the structure–function relationship. The central tenet of the structure–function relationship is that the function of any biological entity (ranging from an individual molecule to a vast ecosystem) can be determined by understanding its structure. From this idea, we can infer that as structural variation increases, so too does the range of functions. We will use the structure–function relationship to understand how molecules in cells function, but to do so, we must have a good understanding of the chemical principles that govern molecular structure (see Boxes 1-8 and 1-9). We can learn a great deal by simply examining the most common members of cellular teams and how they are made.

BOX 1-8 TIP: OVERCOMING THE JARGON BARRIER A good portion of this chapter discusses the chemical principles that govern cell structure and function and introduces many specialized words. The fields of biology and chemistry have their own jargon, and these words often make it difficult for students to see the connections between basic concepts. Here is a tip: students should be able to explain the relationships discussed here in their own words before they start memorizing the jargon. When the underlying concept is unclear, it is often difficult to recall specific words, and the sheer number of technical terms biologists use will overwhelm the short-term memory capacity of even the best students. Concept before vocabulary is a good habit to adopt in this area.

BOX 1-9 FAQ: HOW MUCH CHEMISTRY DO I NEED TO KNOW TO UNDERSTAND THE SUBJECTS IN THIS BOOK? This chapter contains a relatively short summary of a broad range of chemical principles, including topics that are often covered in organic chemistry courses. This approach is designed to help students who have organic chemistry experience to apply this knowledge to the subjects in cell biology, but this does not mean students must master organic chemistry to learn from this book. In fact, one can explore a great deal of cell biology with an understanding of chemistry at the level covered in most introductory college courses in general chemistry. If students have any concerns about the level of chemistry expected in their course, they should check with their instructor.

The Study of Cellular Chemistry Begins with an Examination of the Carbon Atom

Carbon is an especially important element in cells, for three reasons:

1. Aside from water, carbon-containing compounds are the most abundant molecules in cells; of all known molecules, those containing carbon vastly outnumber all of the rest, combined.

2. These compounds exist in a dizzying array of variations.

3. Carbon atoms can attach to one another more readily than the atoms of any other element, giving rise to molecules of tremendous size (some contain several thousand carbons) and structural complexity.

No wonder, then, that organisms are often referred to as carbon-based life forms.

Carbon Forms Characteristic Bonds with Hydrogen, Oxygen, Nitrogen, and Other Carbons

The number of carbon-containing compounds is so vast that they are classified into groups (or families) according to their structure. We will take a closer look at some of these groups later in this chapter, but first, we need to recognize an important concept: that despite their tremendous complexity, carbon-containing compounds are typically constructed from a small number of basic chemical shapes. In cells, carbon atoms are typically covalently bound to only four kinds of atoms: hydrogen, oxygen, nitrogen, and other carbons. Most carbon-based compounds in cells are built with these simple structures. For those interested in the details, let’s examine these carbon building blocks more carefully. (As discussed in Box 1-9, readers are encouraged to check with their instructor if this material is new; some courses may not require this level of detail.) Because each block contains at least one carbon, we have to understand how covalent bonds with carbon are formed. The carbon atom contains six electrons, arranged in three orbital configurations, as shown in FIGURE 1-4. Two electrons are present in the 1s orbital (the innermost shell), filling it. The other four are located in the second (valence, or outermost) shell: in a single (unbound) carbon atom, two are paired in the 2s orbital, and the other two are unpaired and occupy two of the three 2p orbitals, as shown in Figure 1-4a. In chemistry, the octet rule states that nonmetallic atoms tend to gain, lose, or share electrons until they are surrounded by eight valence electrons. Because carbon has only four electrons in its valence shell, it “needs” four additional electrons. It fills the remaining positions in its valence shell by forming four covalent bonds with other atoms. This results in a rearrangement of the valence shell, as shown in Figure 1-4b: one electron in the 2s orbital is “borrowed” by the 2p orbitals, resulting in the formation of four equivalent orbitals called sp3 hybrids.
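The electron bookkeeping behind the octet rule can be checked with a few lines of Python. This is a minimal sketch added for illustration, with hypothetical names; it covers only the four atoms named in this section and treats hydrogen as needing two electrons rather than eight, because its only shell holds two.

```python
# Illustrative sketch only; the dictionary and function names are hypothetical.

# Valence electrons and valence-shell capacity for the four atoms that
# make up most biological molecules.
VALENCE_ELECTRONS = {"H": 1, "C": 4, "N": 5, "O": 6}
SHELL_CAPACITY = {"H": 2, "C": 8, "N": 8, "O": 8}


def bonds_needed(element):
    """Number of shared electron pairs needed to fill the valence shell."""
    return SHELL_CAPACITY[element] - VALENCE_ELECTRONS[element]


if __name__ == "__main__":
    for element in ("H", "C", "N", "O"):
        print(element, bonds_needed(element))
```

Carbon comes out needing four shared pairs, more than hydrogen, nitrogen, or oxygen, which is consistent with its role as the backbone of large biological molecules.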

FIGURE 1-4 (a) Model of a single carbon atom. (b) Model of a carbon atom bound to four other atoms.

These four bonds adopt characteristic shapes for each configuration, as seen in FIGURE 1-5. When carbon binds to four other atoms, these four bonds are arranged in a tetrahedral configuration, with bond angles of 109.5°. When carbon binds to three other atoms, one of these atoms is attached by a double bond, which forces the other two atoms into a trigonal, planar configuration. When carbon binds to two other atoms, the three atoms adopt a linear arrangement. Carbon can be connected to two atoms by a pair of double bonds (e.g., carbon dioxide) or by a triple bond and a single bond (e.g., hydrogen cyanide). Carbon does not form four covalent bonds with a single atom.
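The three arrangements described above can be summarized in a small lookup, and the 109.5° figure can be recovered from geometry. The Python sketch below is an illustration with hypothetical names; the 120° and 180° values are the idealized angles for trigonal planar and linear arrangements, which the text does not state and which real molecules only approximate.

```python
import math

# Illustrative sketch only; names are hypothetical.
# Idealized arrangement of the atoms bonded to a single carbon.
GEOMETRY = {
    4: "tetrahedral",
    3: "trigonal planar",
    2: "linear",
}


def ideal_bond_angle(bonded_atoms):
    """Idealized angle (degrees) between bonds for 2, 3, or 4 bonded atoms."""
    if bonded_atoms == 4:
        # Angle between lines drawn from the center of a regular
        # tetrahedron to any two of its corners: arccos(-1/3).
        return math.degrees(math.acos(-1.0 / 3.0))
    if bonded_atoms == 3:
        return 120.0
    if bonded_atoms == 2:
        return 180.0
    raise ValueError("this sketch covers carbon bonded to 2, 3, or 4 atoms")


if __name__ == "__main__":
    for n in (4, 3, 2):
        print(n, GEOMETRY[n], round(ideal_bond_angle(n), 1))
```

In each case the bonded atoms sit as far apart from one another as the geometry allows, which is one way to remember the three shapes.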

FIGURE 1-5 The orientation of covalent bonds formed by carbon.

Complex Biomolecules Are Mostly Composed of Chemical Building Blocks Called Functional Groups

Because carbon always binds to at least two other atoms, these atoms and their associated binding partners can be easily combined to create a wide variety of structures. The field of organic chemistry, which is devoted to the study of carbon-containing compounds in organisms, subdivides them according to their chemical structure (see Box 1-11). The different classes are called functional groups. TABLE 1-1 lists some of the more common functional groups we will encounter throughout this book. Note that while not all functional groups contain carbon, those that do are by far the most abundant in cells.

BOX 1-10 SILICON AS A POTENTIAL SUBSTITUTE FOR CARBON IN LIVING ORGANISMS

Like carbon, silicon has four valence electrons in its outer orbital shell. Silicon is also the second most abundant element in Earth’s crust, raising the question of whether it could serve as a “stand-in” for carbon in organic compounds. In 2016, a group of researchers mutated a bacterium collected from a submarine hot spring in Iceland to replace a carbon with a silicon atom in a single protein, called cytochrome c. This resulted in a fifteenfold increase in this protein’s activity. While no carbon–silicon bonds have been found in nature, the results of this experiment suggest that such an arrangement is possible and may even be advantageous for some organisms.

Modified from Kan, S.B.J. et al. Directed evolution of cytochrome c for carbon–silicon bond formation: Bringing silicon to life. Science, 2016:354 (6315), 1048–1051 DOI: 10.1126/science.aah6219.

BOX 1-11 TIP: UNDERSTANDING MOLECULAR STRUCTURE DIAGRAMS Most of the diagrams that biologists and chemists use to depict molecular structures are shorthand simplifications of the real three-dimensional structures. For example, a carbon atom’s four bonds are often drawn in an up–down/left–right orientation, even though we know that the bonds are never at 90° angles to one another. In other cases, the bonds aren’t even drawn: A methyl group is often abbreviated as –CH3. In other instances, even the letters are missing; in most drawings of large organic molecules such as lipids, simple lines are used to represent bonds between carbons, and the carbon–hydrogen bonds aren’t included at all. Thus, a zigzag line can represent a series of alkyl (–CH2–) groups. There are mixtures of many different versions of chemical structure shorthand in cell biology, but don’t let it be confusing. An alkyl group is always an alkyl group, no matter how it is drawn.

TABLE 1-1 Common functional groups found in biological molecules. In this abbreviated version of the chemical structure, the bond angles for most atoms are ignored, and the atoms are usually arranged at right angles.

Lipids Are Carbon-Rich Polymers That Are Insoluble in Water When carbon forms covalent bonds with oxygen or nitrogen, these bonds are classified as polar (oxygen and nitrogen are more electronegative than carbon). By contrast, carbon–carbon bonds are nonpolar, and covalent bonds between carbon and hydrogen have so little polarity that molecules consisting only of carbon and hydrogen do not attract water. These compounds, often called hydrocarbons, are hydrophobic and do not dissolve in water. Lipids are a class of hydrocarbons commonly found in cells that include components of membranes; the presence of oxygen in lipids makes them slightly hydrophilic. FIGURE 1-6 shows the generalized structures of some common cellular lipids (see Box 1-11). Most lipids are insoluble in the aqueous environment of the cytosol and cluster into structured aggregates in cells, including membranes. Many lipids in cells are attached to hydrophilic functional groups such as phosphate or hydroxyl groups, which confer some degree of water solubility.

FIGURE 1-6 Common types of lipids in cells. Common abbreviations of organic structures are shown.

Common modified lipids include the following:
■ Phospholipids. Phospholipids are by far the most common form of modified lipids in cells. They constitute most of the mass of cellular membranes, as we shall see in Chapter 4.
■ Sterols. Cholesterol is an essential component of animal cellular membranes. Because it is mostly hydrophobic, cholesterol is most commonly found in the middle (hydrophobic) zone of membranes, where it interacts with phospholipids to change the mechanical properties of the membrane. In some animals, derivatives of cholesterol are used as hormones that circulate in the body to permit communication between even very distant cells.
■ Triglycerides, commonly known as fats or oils. Triglycerides serve as an important form of energy storage in animals. Because they are mostly insoluble in water, they form distinct droplets that are, in some cells, easy to identify with a microscope. Often, triglycerides are transported through the circulatory system, bound to proteins to form a lipoprotein.

Some lipids are permanently attached to proteins, where they play an important role in targeting these proteins to cell membranes; these structures are sometimes referred to as lipid tails, and they serve as a form of anchor by interacting with the hydrophobic region of the membrane to keep the protein in place. Chapter 4 addresses lipid tails in membrane proteins.

Sugars Are Simple Carbohydrates Many molecules in cells are composed entirely of carbon, hydrogen, and oxygen, and often, these atoms are present in a ratio of CnH2nOn. Chemists who first characterized these compounds speculated that they might be arranged such that the carbon atoms are attached to water molecules, and so they referred to them as carbohydrates (we now know that this is not the arrangement of these compounds, but the name stuck). Carbohydrates typically form rings or linear strands of linked carbon atoms; the oxygen and hydrogen atoms are present in the functional groups (with names such as hydroxyls, carbonyls, aldehydes, alkanes, etc.) formed by the carbons. Sugars, illustrated in FIGURE 1-7, are common carbohydrates found in cells. All sugars contain at least three carbons. (For those interested in the technical terms organic chemists use to describe them, one carbon forms a carbonyl group, which exists as either an aldehyde or a ketone, and the rest of the carbons are attached to hydroxyl groups.) The most important sugars in cellular metabolism contain between three and seven carbons.
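The CnH2nOn ratio described above can be checked with a few lines of arithmetic. The following Python sketch is only an illustration: the function names are hypothetical, and the parser assumes simple molecular formulas without parentheses or charges. It tests whether a formula contains only carbon, hydrogen, and oxygen in a 1:2:1 ratio.

```python
import re


def atom_counts(formula: str) -> dict[str, int]:
    """Count atoms in a simple molecular formula such as 'C6H12O6'."""
    counts: dict[str, int] = {}
    # Each match is an element symbol followed by an optional count.
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def fits_carbohydrate_ratio(formula: str) -> bool:
    """Return True when the formula has only C, H, and O in a 1:2:1 ratio."""
    counts = atom_counts(formula)
    if set(counts) != {"C", "H", "O"}:
        return False
    n = counts["C"]
    return counts["H"] == 2 * n and counts["O"] == n


for name, formula in [("glucose", "C6H12O6"),
                      ("ribose", "C5H10O5"),
                      ("sucrose", "C12H22O11")]:
    print(name, formula, fits_carbohydrate_ratio(formula))
```

Glucose and ribose fit the ratio exactly. Sucrose does not, because one water molecule is lost when its two monosaccharides are joined, which is one reason the text says the atoms are "often" rather than always present in this ratio.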

FIGURE 1-7 Common monosaccharides in cells. The carbons are numbered by convention as shown.

The Common Sugars Glucose and Ribose Serve Several Different Functions in Cells For many students, mentioning sugar immediately brings to mind its important role as a source of metabolic energy. But sugars are far more than food. For example, ribose is a five-carbon sugar found in all cells. Ribose and its derivative, known as deoxyribose, form the backbone of the nucleic acids RNA and DNA, respectively. We will discuss nucleic acids in more detail in Chapters 2, 7, and 12. Glucose is a six-carbon sugar that serves as the building block for complex molecules such as starch, cellulose (a major component of the cell wall in plants), and chitin (found in the exoskeleton of arthropods and in fungal cell walls).

Many Sugars Exist as Disaccharides in Cells Glucose and ribose are examples of the simplest form of a sugar, called a monosaccharide. In cells, sugars are commonly found in linked pairs called disaccharides. For example, the common table sugar, sucrose, consists of glucose and another six-carbon sugar, fructose. Lactose and maltose are other common disaccharides. Two important properties of disaccharides help determine their function in cells. The first is simply the type of monosaccharide they contain, and the second is how they are connected, as illustrated in FIGURE 1-8. When monosaccharides are joined together, they are linked by a glycosidic bond (a form of ether bond specific to sugars) between the carbon atoms on each sugar. These bonds are not all the same. All disaccharides are formed between sugars in a ring form. The two sides of the ring are never identical because each carbon in the ring has two different atoms (–H, –OH, or –CHO) projecting outward. This means that the bond formed between two carbons oriented in the same direction is different from the bond that joins two carbons facing in opposite directions, even if the same carbons are involved. Organic chemists have developed a shorthand way of defining these bonds by assigning each carbon a number, starting with the carbon at the end of the linear molecule nearest the carbonyl group. Glycosidic bonds are named according to the number of carbons forming the bond and the relative stereochemical orientation of the two sugars, as either α (same) or β (different). Notice, for example, the α and β orientations of glucose in Figure 1-8 and the resulting disaccharides formed by α1,4 and β1,4 bonds between the two glucose molecules. (For those not steeped in this level of chemistry, the most important point to take away from this discussion is that biologists pay considerable attention to exactly how sugars are linked together and use the shorthand nomenclature developed by chemists to name the bonds.)
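To make the naming convention concrete, the short Python sketch below splits a bond name such as α1,4 into its orientation and the two carbon numbers. The class and function names are hypothetical and exist only for this illustration; the sketch handles only names written in the compact style used in this chapter.

```python
import re
from typing import NamedTuple


class GlycosidicBond(NamedTuple):
    orientation: str    # "alpha" or "beta"
    first_carbon: int   # numbered carbon on the first sugar
    second_carbon: int  # numbered carbon on the second sugar


def parse_bond(name: str) -> GlycosidicBond:
    """Parse a bond name written like 'α1,4' or 'β1,6'."""
    match = re.fullmatch(r"([αβ])(\d+),(\d+)", name)
    if match is None:
        raise ValueError(f"not a recognized bond name: {name!r}")
    orientation = {"α": "alpha", "β": "beta"}[match.group(1)]
    return GlycosidicBond(orientation, int(match.group(2)), int(match.group(3)))


a = parse_bond("α1,4")
b = parse_bond("β1,4")
# Same carbons, different orientation: the two bonds are not the same.
print(a, b, a == b)
```

The comparison in the last line mirrors the point made in the text: two bonds that join the same numbered carbons still differ if their orientations differ.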

FIGURE 1-8 α and β glycosidic bonds in common disaccharides.

A seemingly minor point, such as the orientation of a single atom, has profound effects for cells. All of the chemical reactions necessary to make and break bonds between sugars are catalyzed by enzymes, which are proteins that contain a binding site where these chemical reactions take place. Each enzyme has its specific binding site; this means that an enzyme that catalyzes the formation of an α1,4 glycosidic bond cannot also catalyze the formation of a β1,6 bond or a β1,4 bond because these bonds are shaped differently. Likewise, the enzymes that degrade a specific bond cannot degrade the others. Practically speaking, this difference determines which disaccharides cells can make and degrade, based on the types of enzymes they contain. This fact also explains why humans can digest many α1,4 bonds (e.g., in maltose and starch) but not most β1,4 bonds (e.g., in chitin and cellobiose): human cells contain few enzymes capable of binding and breaking β1,4 glycosidic bonds. The enzyme lactase, which breaks the β1,4 bond in the disaccharide lactose, is expressed only in infancy in most mammals (see Box 1-12).

BOX 1-12 CASE STUDY: WHY CAN ADULT HUMANS DRINK MILK? A defining characteristic of mammals is the production of milk to feed their offspring. The main sugar in milk is lactose, which is a disaccharide composed of glucose and galactose joined by a β1,4 glycosidic bond. Before the offspring can metabolize lactose, the glycosidic bond must be broken by an enzyme called lactase. All newborn mammals express this enzyme. Soon after the offspring are weaned, lactase begins to disappear from the gut. Throughout mammalian evolution, normal healthy adult mammals did not express lactase and did not drink milk. Today, roughly two-thirds of adult humans do not produce lactase. These individuals are sometimes referred to as “lactose intolerant.” An important step in human evolution occurred roughly 8,000–10,000 years ago, when humans began domesticating animals that produce milk. Around 7,000 years ago, a small set of genetic mutations appeared in populations across the Earth that allowed the lactase gene to continue expression past infancy. This also made fresh milk from sheep, cattle, horses, goats, camels, and other domesticated mammals available as a rich source of nutrients. Those who inherited one of these genetic variants can drink milk throughout adulthood. This is sometimes referred to as “lactase persistence.”

Study Questions Use the internet to find out more about lactose tolerance, human evolution, and dairy production to answer the following questions: 1. The most common symptoms of lactose intolerance are stomach pain, bloating, gas, and diarrhea. How would you explain these symptoms? 2. Nearly all adult humans, including those who do not express lactase, can eat most cheeses. If cheese is a milk product, why doesn’t it make lactose-intolerant individuals ill? 3. Those with lactose intolerance can drink plant-based alternatives (e.g., soy milk, almond milk) with no complications. Why is this? 4. Commercial “lactose-free” milk products are now available that contain cow milk but do not trigger the symptoms of lactose intolerance. How is this possible?

Oligosaccharides and Polysaccharides: The Storage, Structural, and Signaling Components of Cells Just as cells can build disaccharides with monosaccharides, they can also combine multiple sugars to form larger polymers. Many of these polymers are composed of repeating disaccharide units, and some are assembled into complex branching structures. There are two different types of these complexes: oligosaccharides and polysaccharides. The term oligosaccharide (oligo- means “a few”) is typically used to describe the sugars attached to cellular proteins and lipids; they play an important role in determining the shape and function of the molecules they are attached to. (See the section As Proteins Enter the ER Lumen, They May Be Posttranslationally Modified in Chapter 8.) Polysaccharides (poly- means “many”) are tremendously large complexes of sugar that lie outside of cells in the extracellular space. These polysaccharides play a number of different roles in organisms, including long-term storage of food sugars (e.g., starch or glycogen) and as a reinforcing material (e.g., cellulose) in plant cell walls. In many animals, both oligosaccharides and polysaccharides are important components of the extracellular matrix, a dense network of molecules that provides structural support to tissues; we’ll examine the extracellular matrix in Chapter 6 (see the section Proteoglycans Provide Hydration to Tissues). These complexes can also contribute to cellular signaling networks, as we will see in later chapters, especially Chapter 11. FIGURE 1-9 illustrates some important features of oligo- and polysaccharides.

FIGURE 1-9 Oligosaccharides and polysaccharides play many important roles in cells. Most oligosaccharides and polysaccharides are located on the cell surface and in the extracellular space.

Amino Acids Form Carbon-Rich Molecules That Contain an Amino Acid Group and a Carboxylic Acid Group Amino acids are the building blocks of proteins. Each amino acid contains a central carbon atom, called an α-carbon, attached to four different molecular structures. Two of these structures are functional groups (an amino group and a carboxylic acid group, hence the term amino acid), and the third is a simple hydrogen atom. The fourth structure, often called an amino acid side chain (or “R” group), differs in each different amino acid, as shown in FIGURE 1-10a.

FIGURE 1-10 (a) General structure of an amino acid. Each unique amino acid has a different modification at the site designated R, which is called a side chain. (b) High-resolution 3D representation of a protein composed of a folded, linear sequence of amino acids linked by peptide bonds. (b) Courtesy of Brian O’Rourke, Albert Einstein College of Medicine.

Proteins are composed of long, linear sequences of amino acids. These sequences are created by forming covalent bonds (here called peptide bonds) between the carboxylic acid group of one amino acid and the amino group of another. Depending on the number of amino acids linked together, the names of the polymers differ; two and three amino acids linked together are called dipeptide and tripeptide, respectively, while the term oligopeptide refers to a small number of amino acids. Polypeptide typically refers to polymers of ten or more amino acids (some polypeptides contain over 1,500 amino acids). When polypeptides fold into a stable configuration that is useful to cells, they are called proteins. Cell biologists represent proteins in a variety of ways, ranging from the simple lines and blobs in Figure 1-9 to exquisitely detailed 3D models such as that shown in Figure 1-10b, depending on how much structural information they want to examine. A more in-depth discussion of the naming schemes for polypeptides, proteins, and protein subunits is provided in Chapter 3.
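The length-based naming scheme above can be summarized as a simple lookup. The Python sketch below is a hypothetical helper written for this illustration. The text does not give an exact range for oligopeptides, so treating chains of 4 to 9 amino acids as oligopeptides is an assumption made here.

```python
def peptide_class(length: int) -> str:
    """Name a chain of amino acids by the number of residues it contains."""
    if length < 2:
        raise ValueError("a peptide contains at least two amino acids")
    if length == 2:
        return "dipeptide"
    if length == 3:
        return "tripeptide"
    if length < 10:
        # Assumption: 'a small number' is taken to mean 4 to 9 residues.
        return "oligopeptide"
    return "polypeptide"


for n in (2, 3, 6, 10, 1500):
    print(n, peptide_class(n))
```

Note that the function names a chain by length only. Whether a polypeptide counts as a protein depends on its folding into a stable, useful configuration, which cannot be decided from length alone.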

Chemical Modifications of Amino Acids Help Control Protein Function The structure–function relationship predicts that any condition that alters the shape of a protein also impacts its function. Changes in a protein’s environment (temperature, pH, ionic strength, etc.) have global effects on protein folding and function that are sometimes difficult to control. Cells also use targeted chemical modifications of individual amino acids to slightly alter protein shape; these modifications are carried out by other proteins, and so they are comparatively easy to control. Examples of these modifications include the addition of phosphate, methyl, and acetyl groups to the side chains of amino acids. These modifications are easily reversible, which allows cells to fine-tune the shape of individual proteins with great precision. Other modifications are permanent and essential for protein function. Two examples of permanent modifications are the addition of oligosaccharides to amino acid side chains and the creation of covalent bonds between amino acid side chains. The chemical basis of protein structure and protein modifications are discussed in greater detail in Chapters 3, 9, and 11.

Nucleotides Are Complex Structures Containing a Sugar, a Phosphate Group, and a Base Nucleotides are the building blocks for DNA and RNA, the genetic material of cells. The nucleotides that compose RNA contain a five-carbon sugar called ribose, and those found in DNA contain a modified ribose called deoxyribose, as shown in FIGURE 1-11. The sugar portion of nucleotides is always in a ring conformation. A second, nitrogen-containing ring compound called a base is covalently linked to the sugar, and the nucleotide is named according to the type of base it contains. When bases are attached to sugars, such as in nucleotides, the naming scheme for the carbons in the sugars changes by adding a “prime” to the number (e.g., 1′, 2′, 3′). One, two, or three phosphates may be attached to the 5′ carbon, and the base is attached to the 1′ carbon. Cells join a single phosphate attached to the 5′ carbon of one nucleotide to the 3′ carbon of another to create polymers of nucleotides, called nucleic acids. These polymers can contain over 100 million nucleotides, and this sequence of nucleotides is how all organisms store genetic information. We will examine the structure of DNA and RNA more closely in Chapters 2 and 7.

FIGURE 1-11 Nucleotides are composed of a sugar, a base, and one to three phosphate groups. Note that primes are added to the carbon numbers of the sugar in a nucleotide.

Concept Check 2 Imagine for a moment that life is discovered on another planet. Much to the surprise of scientists, it is neither carbon nor water-based. But in many ways, it resembles terrestrial life forms, including the use of equivalents of lipids, sugars, amino acids, and nucleotides. What critical chemical properties would you expect these alien molecules to have?

▶ 1.4 Cells Must Cooperate to Succeed

Key Concepts
■ Prokaryotes are the simplest, most abundant forms of life on Earth.
■ Biofilms support prokaryotes and eukaryotes in a symbiotic network.
■ Eukaryotes are complex cells capable of forming multicellular organisms.
■ Holobionts are multicellular life forms composed of eukaryotes and prokaryotes.

In this section, we will survey the three domains of organisms and highlight the distinguishing characteristics of each at the cellular level. As we follow the progression of complexity of life on Earth, the next step above biomolecules is single-celled organisms. The simplest and oldest types of these organisms are called prokaryotes. They comprise the Bacteria and Archaea domains shown in Figure 1-3. We can anticipate that (based on cell biology principle 1) they do not function in complete isolation, despite being described as single-celled organisms. They must interact with their environment, including other organisms, throughout their lifetime. When they function as individuals, biologists call this their planktonic or “free-living” state. In some situations, pairs of prokaryotes exist in a “dependent type of living” arrangement with each other called symbiosis. Each organism that participates in this relationship is called a symbiont. Symbiotic relationships can benefit both organisms (mutualism), benefit one without harming the other (commensalism), or benefit one while harming the other (parasitism). Symbiosis is not limited to pairs of cells. Enormous populations of cells composed of at least two different species obey the same rules as smaller groups, provided they have an interdependent relationship.

Biologists believe this ability to form symbiotic relationships gave rise to the domain of eukaryotes approximately 1.2 billion years ago when some archaea internalized bacteria without killing them. This likely happened many, many times during the ~2 billion years after life began, resulting in numerous different types of cellular “teams.” Just as the different types of protocells competed to give rise to the first true unicellular organisms, these cellular teams were subject to natural selection, and only the “best” teams survived long enough to reproduce this symbiotic relationship in their offspring. The winners of this contest were capable of assigning subtasks to these internalized partners, making them more competitive than other teams in the face of natural selection. This permanent state of cooperation persists to this day in all eukaryotes, including humans.

While eukaryotes are less abundant than prokaryotes, they contain a great deal more structural diversity. Biologists believe this is true because eukaryotes can assign a subset of tasks to the specialized compartments, called organelles, inside them. These compartments are the product of winning teamwork in the earliest eukaryotes. This “divide-and-conquer” strategy gave eukaryotes the ability to specialize more than prokaryotes, which in turn encouraged even more teamwork between different types of eukaryotic cells. If we continue the principle of teamwork one step further, it makes sense that stable teams of single-celled eukaryotes would also be competitive in some environments if they could survive and reproduce as a single unit. This is how multicellular organisms such as humans arose. Note, however, that forming ever more complex teams is not always a winning strategy: the preponderance of prokaryotic organisms today is clear evidence that remaining a single-celled, relatively unspecialized organism has been a very effective strategy for billions of years. Each form of life on Earth today represents the product of evolution by natural selection, and evolution does not favor complexity for its own sake. If remaining a single-celled organism is the most successful way to survive and reproduce in a given environment, natural selection will favor it.

This section concludes with a discussion of some of the most amazing forms of teamwork in biology: holobionts. These organisms consist of a permanent cooperative relationship between eukaryotes and prokaryotes. Human beings and the multitude of prokaryotes that must reside within them to keep them healthy (called the microbiome) are an excellent example.

Prokaryotes Are the Simplest Forms of Cells Three features are unique to prokaryotes (see FIGURE 1-12):

FIGURE 1-12 Prokaryotes are the simplest cell types. (a) An electron micrograph of a bacterial cell, which is a prokaryote. (b) Common structures found in prokaryotes. (c) The capsule in prokaryotes is connected to the cell wall, the S layer, or the outer membrane. An electron micrograph of the capsule of a strain of bacteria called Bacillus anthracis. (a) Photo courtesy of Jonathan King, Massachusetts Institute of Technology; (c) Reproduced from J. Bacteriol., 1998, vol. 180, pp. 52–58, DOI and reproduced with permission from American Society for Microbiology. Photo courtesy of Agnés Fouet, Pasteur Institute.

■ Prokaryotes have only one membrane: the plasma membrane. Because all of their internal chemical reactions take place in a single compartment, the cytosol, the degree of specialization that these cells can achieve is limited. Some prokaryotes have evolved elaborate modifications of the plasma membrane, such as stacks of membrane folds that provide them with some degree of compartmentalization. Likewise, the cytosol of prokaryotes appears heterogeneous when viewed with an electron microscope (Figure 1-12a), suggesting that it may be partially organized.
■ All prokaryotic organisms are unicellular. Prokaryotes do not assemble into multicellular organisms, although some cluster together to form enormous symbiotic structures called biofilms.
■ Prokaryotes do not divide by mitosis. Most genetic information in prokaryotic cells is contained in a single circular DNA molecule called a chromosome, while most eukaryotic DNA is contained in several linear chromosomes. Mitosis, which is an elaborate mechanism designed to ensure the proper segregation of chromosomes during cell division, evolved in eukaryotic organisms only. Prokaryotes divide by binary fission instead of by mitosis.

Bacteria diverged from LUCA before Archaea and Eukarya. Archaea are classified as distinct prokaryotes because they differ in four important ways from bacteria. First, the enzyme complex they use to synthesize RNA is more similar to the corresponding eukaryote complex than the bacterial version. Second, the protein and RNA components of ribosomes in Archaea are more like those found in eukaryotes. Third, the cell membranes of Archaea contain components not found in either bacteria or eukaryotes. Fourth, the components of the cell walls in Archaea are also quite different from their counterparts in bacteria.

Despite their relatively simple structure, prokaryotes can occupy some of the harshest environments on Earth. These include extreme heat and cold, tremendous atmospheric pressure, little or no atmospheric oxygen, and pH values ranging from 2 to 12. This is likely because modern prokaryotes are the direct descendants of the Earth’s earliest cells, which evolved in environmental conditions very different from those existing today. Due in part to their high degree of adaptability, prokaryotes are also by far the most abundant organisms on Earth.

Prokaryotic Cells Are Protected by a Cell Wall Most prokaryotes contain an additional layer of protection outside the plasma membrane. The general term for this structure is a peptidoglycan cell wall, and the portion of it that is directly connected to the outer membrane of gram-negative bacteria, or the S layer of gram-positive bacteria, is often called the capsule (Figure 1-12, panels b and c). The cell wall is composed mainly of sugar molecules linked together to form a thick mesh. Aside from protecting against physical trauma, the cell wall also retains water to help ensure that the cell is properly hydrated while protecting it from bursting.

Cellular Teamwork in Action: Most Prokaryotes Are More Successful as Symbionts than as Planktonic Cells Despite being the smallest and structurally simplest forms of life, prokaryotes are some of the most sophisticated and hardiest organisms on Earth. Because they were the first successful forms of life on the planet, prokaryotes have been subject to evolution by natural selection for over 3 billion years and so have evolved some robust survival strategies. They can survive in even the harshest environments and can evolve rapidly when environmental stress threatens. The rapid rise of drug-resistant bacteria since the introduction of antibiotics in the twentieth century is clear evidence of their adaptability. In 1995, scientists were able to revive and culture bacteria that had been trapped in fossilized tree sap (amber) for at least 25 million years. Despite this advance, most prokaryotes are difficult to study, for a surprisingly simple reason: scientists cannot keep them alive in a lab. For example, more than 90% of the known marine prokaryotes cannot be cultured by humans. No amount of nutrients can keep them alive outside their native environment. Cell biology principle 1 offers a plausible explanation: when they are removed from their natural habitat, they lose contact with the organisms necessary to keep them alive; they lose their teammates. Most aquatic bacteria exist as biofilms containing more than one species, and it is hard to replicate the environmental conditions that favor the symbiotic relationships these cells need. This reliance on teamwork is so pervasive that some prokaryotes depend on others to feed them: they consume the metabolic products of their neighbors (FIGURE 1-13a). Others go so far as to internalize each other (see FIGURE 1-13b), thereby providing direct evidence that eukaryotic cells likely evolved from similar mutualistic relationships between their prokaryotic ancestors. This is different from simply eating one’s neighbors. It is critical that both parties remain alive because each must play an active role in supporting the partnership.

FIGURE 1-13 Prokaryotic teamwork. Panel (a) shows archaea (orange) surrounded by bacteria (green); the archaea extract energy from methane (CH4), releasing hydrogen gas (H2) as a byproduct. The bacteria consume the H2, which allows the archaea to continue feeding on the methane. Panel (b) shows bacteria living within a prokaryote, the blue-green alga Pleurocapsa minor. (a) Panel (g) from Figure 2 on page 624 of the Boetius article: Boetius A, Ravenschlag K, Schubert CJ, Rickert D, Widdel F, Gieseke A, Amann R, Jørgensen BB, Witte U, Pfannkuche O. 2000. A marine microbial consortium apparently mediating anaerobic oxidation of methane. Nature 407:623–626. https://doi.org/10.1038/35036572; (b) Wujek DE. 1979. Intracellular bacteria in the blue-green alga Pleurocapsa minor. Trans Am Microsc Soc 98:143–145. https://doi.org/10.2307/3225953. Source: Figure 1, p.144 of Wujek article.

One of the most important benefits of combining to form a biofilm is that such a structure makes it easier for cells to capture and store chemical energy. This is based on the simple concept of concentration gradients: the principle of diffusion states that a solute at high concentration (including ions such as H+, OH–, Na+, and Cl–, which are abundant in seawater) will spontaneously move to areas of lower concentration until the concentrations are equal (i.e., reach equilibrium). Even the earliest cells had to develop some means of capturing energy, and one of the easiest ways to do so was to capture the energy of ions as they made this journey in seawater. The water escaping some hydrothermal vents in oceans contains lower concentrations of H+ ions and higher concentrations of OH– ions than the surrounding seawater (i.e., it is alkaline relative to the water surrounding it), creating a stable proton (H+ ion) gradient in the water near these vents. The earliest cells capitalized on this concentration difference by letting protons pass through a protein in their plasma membrane as they moved down their concentration gradient. This protein used the energy released by the moving protons to synthesize adenosine triphosphate (ATP), a nucleotide all cells use as an energy source. The LUCA developed this mechanism of capturing energy, and all organisms still use some version of it today. Although individual cells are capable of capturing proton gradient energy, if the cells pack closely together, more of them can benefit from the same concentration difference, as FIGURE 1-14 shows. Cells in biofilms secrete a matrix (or mat) of polysaccharides, proteins, nucleic acids, and lipids (collectively called extracellular polymeric substances) that helps secure them to a surface and remain close to their neighbors. Within this matrix, they can capitalize on the proton gradient in the surrounding water. This is one reason why it is so difficult to keep prokaryotes that live near hydrothermal vents alive in the lab: it is costly to replicate and maintain this proton gradient in an artificial environment. This also shows how the simple behavior of ions in water plays a central role in the evolution of life on Earth.
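The net movement of a solute down its concentration gradient can be illustrated with a toy calculation. The Python sketch below is a minimal model under stated assumptions, not a description of any real vent: it uses two compartments of equal volume, an arbitrary exchange rate, and arbitrary starting concentrations, and the function name is hypothetical.

```python
def equilibrate(high: float, low: float,
                rate: float = 0.1, steps: int = 100) -> tuple[float, float]:
    """Move solute between two equal-volume compartments, step by step.

    At each step, a fraction of the concentration difference flows from the
    more concentrated compartment to the less concentrated one.
    """
    for _ in range(steps):
        flow = rate * (high - low)  # net flow shrinks as the gradient shrinks
        high -= flow
        low += flow
    return high, low


final_high, final_low = equilibrate(10.0, 2.0)
print(round(final_high, 3), round(final_low, 3))
```

In this closed model the gradient disappears as the two concentrations converge on their average, and with it the energy a cell could capture. The vent environment described above differs in that the gradient is continually renewed by the water escaping the vent, which is why it can serve as a stable energy source.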

FIGURE 1-14 Biofilms help teams of prokaryotes capture chemical energy. This shows a possible scenario that promotes specialization and cooperation in teams of prokaryotes.

A logical extension of this arrangement is that, as biofilms grew more stable and developed more sophisticated methods for tapping the proton gradient, natural selection would favor them. This provided the evolutionary impetus for specialization within biofilms: some cells might develop better means of directing the protons to the energy-converting proteins (e.g., decreasing proton permeability in their membranes), while others may develop more efficient converting proteins. Eventually, the cells comprising a biofilm would develop methods for sharing their captured energy with their neighbors. Modern biofilms that live near thermal vents have done exactly this.

Competition for Resources and Faulty DNA Repair Drive Prokaryotes’ Evolution

Even the most cooperative groups run into trouble when resources become scarce. Biofilms cannot grow forever: eventually, new members of the group find there is a limit to how much ATP can be extracted from even the richest hydrothermal vent, and they may not survive when resources drop. This sets the stage for the next major step in the evolution of life on Earth: competition between teams of organisms. Fortunately, the same concept we discussed to explain how early protocells competed to survive applies here: evolution by natural selection. To explain how this happened, we need to account for the same three traits: variation within a population, heritability of this variation, and a survival advantage for possessing this variation. The variation arises from two primary sources: built-in errors in DNA replication and exchange of DNA among members of a population. Although our definition of life requires an organism to replicate and self-repair, neither of these traits must be 100% accurate. A simple principle of physics is that disorder in any closed system (such as an organism and its energy source) always increases over time. In this context, it means cells never replicate their DNA perfectly or fix all their mistakes; genetic replication errors are a fact of life. Because DNA stores genetic information, this variation is inherited by an organism’s descendants.

Prokaryotes even promote this genetic variation by sharing pieces of their DNA with their neighbors, which biologists call horizontal gene transfer. This accelerates the rate of evolutionary change by allowing members of a population to share the benefits of the variation without having to develop it themselves. An excellent example is antibiotic resistance. If a single bacterium mutates so that it is no longer susceptible to antibiotic drugs, it can quickly pass on that genetic information to its neighbors by copying the DNA encoding this resistance and physically transferring it to other cells it comes into contact with. This type of gene transfer can change prokaryotic populations dramatically, even giving rise to new species. The selective advantage of genetic variation within a population is, therefore, the determining factor as to how well organisms compete with one another.

We can use biological codes to help explain this phenomenon. When genetic variation gives rise to a new or modified biological code, this can provide a tremendous advantage over organisms that lack it. One example is the code called quorum sensing. Quorum sensing is the bacterial language through which the microorganisms communicate with each other to regulate the size of their communities. Signaling molecules exchanged among cells can contribute to the formation of biofilms, colony diversification, prokaryotic motility, and virulence factor production. In some cases, this communal living is vital for the survival of an organism: some prokaryotes can remain alive in their planktonic state but can only replicate when they are in a symbiotic environment such as a biofilm. This is our First Principle at work: teamwork promotes prokaryotic evolution by increasing genetic diversity in a population.
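The three requirements for natural selection named above (variation, heritability, and a survival advantage) can be illustrated with a small numerical sketch. The model below is a standard textbook simplification, not something taken from this chapter: it assumes two heritable variants that reproduce asexually, and it tracks how the frequency of the favored variant changes from one generation to the next. The function names and every number used (a 1% starting frequency, a 5% reproductive advantage, 200 generations) are hypothetical values chosen only for illustration.

```python
def next_frequency(p, w_variant, w_ancestor):
    """Return the variant's frequency after one generation of selection.

    p          -- current frequency of the variant (0 to 1)
    w_variant  -- relative reproductive success of the variant (assumed value)
    w_ancestor -- relative reproductive success of the ancestor (assumed value)
    """
    mean_fitness = p * w_variant + (1 - p) * w_ancestor
    return p * w_variant / mean_fitness


def simulate(p0, w_variant, w_ancestor, generations):
    """Return the variant's frequency at generations 0 through `generations`."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], w_variant, w_ancestor))
    return freqs


if __name__ == "__main__":
    # Hypothetical scenario: a rare variant with a small heritable advantage.
    trajectory = simulate(p0=0.01, w_variant=1.05, w_ancestor=1.00, generations=200)
    for g in (0, 50, 100, 150, 200):
        print(g, round(trajectory[g], 3))
```

Under these assumptions, a variant that begins as a tiny minority comes to dominate the population, whereas a variant with no advantage (equal values for `w_variant` and `w_ancestor`) stays at its starting frequency. The sketch ignores chance events, new mutations, and horizontal gene transfer, all of which matter in real populations.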

Eukaryotes Are Complex Cells Capable of Forming Multicellular Organisms

Eukaryotic cells comprise the third and most recent domain of biological organisms shown in Figure 1-3. An artist’s representation of the diversity of eukaryotic cells is shown in FIGURE 1-15. When viewed under a microscope, the most striking feature of eukaryotic cells is that their cytoplasm is highly organized. Even the simplest microscope reveals the presence of a large, often oval-shaped structure called the nucleus. Closer examination with more powerful microscopes allows us to see additional distinct structures in the cytosol, such as mitochondria (see FIGURE 1-16). These structures are generally classified into two groups:

FIGURE 1-15 Examples of different eukaryotic cell types, including structures specific to each.

FIGURE 1-16 Subcellular structures can be visualized with light microscopes or electron microscopes. Increasing magnification reveals the structure of a single mitochondrion in a eukaryotic cell in the intestine. Photos from top to bottom: © Sebastian Kaulitzki/Shutterstock; © David Litman/Shutterstock; © Jose Luis Calvo Martin & Jose Enrique Garcia-Mauriño Muzquiz/iStock/Getty Images; © Dlumen/iStock/Getty Images Plus/Getty Images. © Callista Images/Image Source/Getty Images.

■ Cytoplasmic structures that are surrounded by at least one distinct membrane are called organelles. The presence of these membranes allows the cell to create specialized compartments in the cytoplasm that are devoted to performing a subset of cellular tasks under optimized conditions. Like the plasma membrane, these membranes are selectively permeable to ions and other small molecules, which helps to create a unique internal environment optimally suited to the molecules contained inside. Because it is membrane bound, the nucleus is classified as an organelle. Additional organelles found in eukaryotic cells include the endoplasmic reticulum, Golgi apparatus, mitochondria, chloroplasts, endosomes, lysosomes, and peroxisomes. Each of these organelles contains unique molecules and performs a separate set of functions. We will spend the remainder of this book discussing the structure and function of these organelles in detail.

■ Large molecular complexes that are not enclosed in a separate membrane do not have a generic name (like organelle), but they do share one important trait: they represent specialized regions in the cytosol that are devoted to a subset of cellular tasks. For example, eukaryotic cells contain three different types of fibers that serve as a scaffold for the organization of the cytosol, thereby earning them the name cytoskeleton.

Careful arrangement of the cytoskeleton is essential for proper cell function; without it, muscle cells would not contract, and nerve cells would fall silent. We will focus on the cytoskeleton in Chapter 5 but will revisit it repeatedly throughout the book.

Eukaryotes Arose Through Teamwork by Prokaryotes

The fact that eukaryotes arose from teams of prokaryotes is unquestioned in biology. Exactly how this happened, however, is still a subject of much debate. Before we explore what is not known, let’s establish the facts that all scientists agree on:

■ Eukaryotic cells are so named because one of their most distinctive features, when viewed under a microscope, is the nucleus. (The name “eukaryote” is derived from the Greek eu-, meaning “true,” and karyo, meaning “kernel.” Prokaryotes are so named because they are thought to have arisen pro-, or “before,” the nucleus.) It is tempting to think that the appearance of the nucleus marked the onset of eukaryotic life. Yet, the first organelle to develop was the mitochondrion. This occurred approximately 1.8 billion years ago, almost 2 billion years after life appeared on Earth.

■ Eukaryotes have, on average, four times more genetic material than bacteria; the largest known amount of DNA in a eukaryotic cell is 10,000 times more than the largest known in a prokaryotic cell. This means that eukaryotic cells contain more genetic information than prokaryotes: eukaryotes have over 1,000 families of genes that are absent in prokaryotes. If we use the “cell as a busy city” analogy from Box 1-6, we can say that the library in a eukaryotic cell has both a larger and more complex “collection” than a corresponding prokaryotic library. This is an important consideration because it helps explain why eukaryotes are so much more structurally complex than prokaryotes: they have more information to work with.

■ Eukaryotic cells are on average 10 to 100 times larger than prokaryotic cells. While size alone does not necessarily confer any special advantage on eukaryotes, their larger size does reflect their increased structural stability, which can be a tremendous help. Borrowing from the busy city analogy, we can say that skyscrapers are far more complex than mere stacks of small buildings: they require specialized structures that allow them to remain functional under conditions that would break simple clusters of small buildings (or cells) apart.

Much remains to be explained for us to understand eukaryotic evolution fully, but common themes are emerging. For example, based on careful examination of the sequences of DNA in the nucleus and the small amount of DNA remaining in the mitochondrion, biologists have determined that the source of most nuclear DNA is an archaeal ancestor, and mitochondrial DNA traces back to a form of bacterium called an alphaproteobacterium. This strongly suggests that eukaryotes arose when an archaeal “host cell” internalized a bacterium that was capable of generating more ATP than it needed to survive; this bacterium became the mitochondrion. This explanation is known as the endosymbiont theory or endosymbiosis. (In 2015, scientists discovered an archaeal group called Lokiarchaeota, which is strikingly similar in genetic makeup to this proposed archaeal ancestor.) All eukaryotes share this genetic ancestry, strongly suggesting that the formation of the mitochondrion only happened once during evolution. Exactly when and how this happened are currently topics of debate. We know this occurred before the nucleus formed because the structure and molecular composition of the membrane surrounding the nucleus (called the nuclear envelope) in all eukaryotes are more similar to bacterial membranes than to archaeal membranes (FIGURE 1-17), and the source of this bacterial membrane is most likely the ancestral mitochondrion. (An alternative hypothesis is that the early eukaryote internalized a second bacterium that contributed to the nuclear membrane.) We also know this event triggered the “division of labor” that gives rise to our First Principle: once internalized, the mitochondrion no longer had to move through its environment (e.g., swim), find food, or defend itself against attack. This allowed it to specialize in creating ATP. In turn, the host cell no longer had to commit its plasma membrane to ATP synthesis; instead, it could further specialize to form stable attachments with its surroundings, including neighboring cells. Teamwork between Archaea and Bacteria was thus immortalized in the form of a new biological domain, Eukarya.

FIGURE 1-17 The nucleus arose by endosymbiosis, a process in which one prokaryotic cell engulfs another cell to form a primitive organelle such as the nucleus.

A second common theme in the theories of the origin of eukaryotes is that mitochondria appeared well before they started functioning the way most modern mitochondria do. As we will see in Chapter 10, modern mitochondria rely on oxygen (O2 gas) as the final electron acceptor when they convert food energy to ATP; depriving aerobic eukaryotes of oxygen can be fatal. But at the time the first mitochondria appeared, atmospheric oxygen levels were much, much lower than they are today. What, then, were mitochondria using instead of oxygen at that time? Again, the power of genetics helps us solve this riddle: bacteria that live in oxygen-poor environments today use proteins derived from genes that are very similar to those mitochondria use to create ATP, but these oxygen-poor bacteria replace oxygen with other biological molecules or even simple hydrogen ions. Researchers argue that if modern bacteria are using the same genes as the earliest mitochondria to make ATP in the absence of oxygen, the first mitochondria could have done so as well. This means that the earliest mitochondria were adaptable enough to use electron acceptors other than oxygen and that they later acquired the ability to use oxygen when it became available. (Some modern organisms can use both oxygen and its alternatives.) Because we have no direct proof that this occurred, this point remains open to some speculation.

A third element common to all models of eukaryotic evolution is that the first eukaryotic cell had roughly twice as many genes as its prokaryotic neighbors. And because many of the archaeal genes initially performed nearly the same functions as the corresponding genes in the bacterial ancestor, many of the genes in the eukaryotic DNA (the collection of all genetic information in an organism is called its genome) were redundant. In evolutionary terms, this fact is important because it provides a tremendous opportunity for the eukaryotic cell to mutate its genome without fear of dying out. Put simply, the first eukaryote could tolerate mutation of one of its genes as long as its “cousin” gene remained intact and functional; this genetic redundancy meant it could afford a lot of mistakes. This capability allowed for a lot of variety, including new genes not seen before in prokaryotes. For example, over 280 families of new genes arose during this period, and they contain both archaeal and bacterial sequences, demonstrating that they are the product of fusion between multiple genes. Not all genes survived this selection process: modern eukaryotes are missing some of the genes in their ancestral cells, and some bacterial genes have moved from the mitochondrial DNA to the nuclear chromosomes over the past billion years. Most modern eukaryotic genomes even include some of the “failed” (nonfunctional) genes inherited from these turbulent times. Exactly how this proliferation of genetic diversity unfolded over time remains unclear and is open for interpretation based on different models of eukaryotic evolution.

Fourth, teamwork in response to natural selection prompted yet more division of labor over time. Aided by careful analysis of the genetic history of the proteins that are associated with specific organelles and cellular functions, we can now propose a reasonable explanation for how organelles and other specializations developed. For example, the organelles and vesicles that make up the endomembrane system (covered in detail in Chapter 9) arose from the mitochondrial membrane, while the molecular machinery responsible for controlling membrane trafficking within this system arose from the archaeal ancestor. Similarly, the proteins that are responsible for replicating and processing DNA and RNA in the nucleus (discussed in Chapters 7 and 8) are archaeal, while the machinery for converting genetic information to proteins in the cytosol (this is called translation, covered in Chapter 8) is bacterial in origin.

Extension of the nuclear membrane gave rise to the endoplasmic reticulum. When cyanobacteria acquired the ability to harness sunlight as an energy source, they released molecular oxygen to the atmosphere as a product of photosynthesis. (Some eukaryotes later internalized these cyanobacteria, giving rise to chloroplasts and eventually plants.) When the levels of atmospheric oxygen rose high enough to pose a danger from reactive oxygen species (e.g., superoxide [O2•–] and hydrogen peroxide [H2O2]), which can damage DNA and proteins, the mitochondrial membrane budded to create the peroxisome, an organelle that neutralizes these dangerous compounds. The threat of damage from oxygen radicals may have also promoted the development of other organelles, such as the nucleus, which protects DNA from damage. Because these intermediate stages no longer survive, we cannot definitively determine the exact sequence of events that gave rise to these eukaryotic specializations.

The fifth theme in eukaryotic evolution is that, regardless of how many times the above events occurred, only a single eukaryotic precursor survived with all of the elements shared by every subsequent eukaryote. This organism is now appropriately called the last eukaryotic common ancestor, or LECA. (Remember that LUCA gave rise to LECA.) While its historical existence is unquestioned, it is unlikely we will ever find a living example of it. As we learn more about the evolution of eukaryotes, our definition of LECA will likely evolve as well. We are already applying what we know to develop new forms of life in the lab (see Box 1-14).
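The genetic redundancy argument made earlier in this section (a cell can tolerate a damaged gene as long as a functional “cousin” copy remains) can be expressed as simple arithmetic. The sketch below rests on an assumption that is not stated in the text: each copy of a gene is inactivated independently with the same probability. The function name and the 10% probability are hypothetical and serve only to show the shape of the argument.

```python
def loss_probability(p_inactivation, copies):
    """Chance that every copy of a gene is inactivated.

    Assumes each copy is hit independently with probability `p_inactivation`
    (an illustrative simplification, not a measured biological value).
    """
    return p_inactivation ** copies


if __name__ == "__main__":
    p = 0.10  # hypothetical chance that any one copy is inactivated
    for copies in (1, 2):
        print(copies, "copy/copies:", loss_probability(p, copies))
```

With a single copy, the function is lost whenever that one copy is hit. With two redundant copies, both must be hit, so under the independence assumption the chance of losing the function falls from p to p squared. That gap is the “room for mistakes” the text describes.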

BOX 1-13 FAQ: IS UNCERTAINTY IN THE MECHANISM OF EVOLUTION EVIDENCE THAT IT IS NOT TRUE?

Absolutely not. The evidence that evolution occurs, and has occurred since the beginning of life on Earth, is overwhelming. We can observe evolution happening all around us: the emergence of different influenza (flu) virus strains every year is clear evidence that the genetic material (DNA or RNA) in viruses changes rapidly and that the variants most able to infect humans proliferate and make us ill. This continual genetic variation is true for all organisms and is easily proved by comparing the nucleotide sequence of DNA or RNA in organisms over time. However, it is still not perfectly clear why a given variant occurs in a specific organism at a given time. Yet, not fully understanding the mechanisms that give rise to specific genetic variation in no way diminishes the strength of the evidence that it occurs.

BOX 1-14 ARTIFICIAL EUKARYOTE-LIKE CELLS ON THE HORIZON

Advances in biotechnology, especially genetic engineering, revolutionized the world in the 1980s. They enabled us to use viruses and bacteria to save human lives, increase crop yields, identify criminals, or solve paternity cases. Just as self-driving cars are expected to join “conventional” vehicles on our roads in the near future, artificial cells are expected to soon join the biological world alongside natural cells. By capitalizing on the concepts presented in this chapter, scientists can now synthesize viable cells from nonliving raw materials. Most of these human-made cells resemble simple prokaryotes and are capable of reproducing and performing the minimal functions required to live and reproduce. In the future, such cells could be designed as replacements for damaged natural cells (e.g., artificial red blood cells) or as delivery vehicles for targeted therapy (e.g., drug carriers).

A technology developed in 2018 allows scientists to create artificial cells that contain normal eukaryotic cells inside them to serve as “organelles” inside the artificial cells. The artificial cell membrane protects these normal cells from harsh external environments so that they can function in extreme conditions, much the way a spacesuit protects an astronaut during a spacewalk. It also protects against physical damage, similar to how the plasma membrane protects organelles or the skull protects the brain. Even when exposed to highly toxic, concentrated copper solutions, the inner living cells are not affected, thanks to their outer armor: the artificial cell protecting them.

Modified from Elani Y., et al., Constructing vesicle-based artificial cells with embedded living cells as organelle-like modules. Scientific Reports 2018 Mar 14;8(1):4564. doi: 10.1038/s41598-018-22263-3

Biofilms Support Prokaryotic and Eukaryotic Symbiosis

Biofilms promote teamwork, and not just among prokaryotes. They also include eukaryotes such as algae and fungi. The advent of high-throughput DNA sequencing technologies in the early twenty-first century allows scientists to compare genomes of organisms that share common habitats. These metagenomic studies reveal a high diversity of microorganisms present in biofilms, including a diverse spectrum of Eukarya and Bacteria. Some of these biofilms form in the human body, including on the surfaces of our respiratory and digestive tracts.

How organisms within a biofilm change over time can tell us a great deal about the evolutionary pressures that drive biological cooperation. As mentioned earlier, biofilms are resource-limited populations of organisms that must cooperate and compete to survive, so they provide an excellent opportunity to study cooperation and competition among populations within a single habitat. For example, one strain of microorganisms can alter the biofilm environment to allow a second strain to thrive or perish. Cells within the biofilm may successfully compete for space by producing substances that limit the growth or prevent the adherence of other cells. Other times, in order to save energy, cells in biofilms may stop producing specific substances; while this helps the current residents, it could make it more difficult for newly arrived members to survive.

The driver of biofilm evolution and symbiosis is, of course, genetic variation. For example, a strain of biofilm bacteria may develop faulty DNA repair mechanisms, making them more susceptible to mutation. While this may prove dangerous to an individual in the population, the increased mutation rate may benefit the entire biofilm, and with support, even the mutated cells may survive. This rationale is similar to the argument supporting increased mutation tolerance in eukaryotes; having copies of healthy, undamaged genes in neighboring cells increases the tolerance for mutation, with one proviso: the mutant cells must remain within the biofilm to survive. This, in turn, leads to mutations that favor long-term stability within a biofilm, such as quorum sensing, antibiotic resistance, enhanced motility, and greater adherence to the matrix of the biofilm.

One excellent example of competition among organisms is simply called cheating. Cheater cells within a biofilm develop methods for benefiting from the biofilm environment but contribute nothing in return. Genetic variation within a biofilm allows some cells to fight back by developing mechanisms that make it more difficult to cheat. One example is secreting a thicker biofilm matrix, which supports cells within the community, limits the diffusion of nutrients to the cheaters, and even prevents adherence of newly arrived cheaters.

Cooperation within biofilms can present important health problems for humans. For example, patients with the genetic disease cystic fibrosis secrete a thick, sticky layer of mucus into the lungs, pancreas, and other organs. These patients are at risk of serious lung infections because bacteria use the mucus as a matrix to form biofilms. As the bacteria mutate within this mucus, they can develop antibiotic resistance, making infection difficult to combat. FIGURE 1-18 shows the results of an experiment wherein a common bacterium found in human lung biofilms, Pseudomonas aeruginosa, was allowed to evolve over a 90-day period (approximately 600 generations) as a biofilm population. Notice that the three samples of the ancestral population evolved into very different shapes and sizes, reflecting the tremendous genetic diversity within the ancestral population. Subsequent tests confirmed that some of these descendant strains were far more competitive than (i.e., they outgrew) “native” Pseudomonas aeruginosa populations in biofilms.

FIGURE 1-18 Induced evolution of the biofilm bacterium Pseudomonas aeruginosa. Three cultures taken from the ancestor population PA14 were grown for 90 days as biofilms, with samples collected at 17, 45, and 90 days. Note the tremendous variation in size, shape, and color of the descendant biofilms. Some of these descendants proved to be far more successful in forming biofilms than the ancestor. Reproduced from Flynn KM, Dowell G, Johnson TM, Koestler BJ, Waters CM, Cooper VS. 2016. Evolution of ecological diversity in biofilms of Pseudomonas aeruginosa by altered cyclic diguanylate signaling. J Bacteriol 198:2608–2618. doi:10.1128/JB.00048-16
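The experiment summarized in Figure 1-18 ran for 90 days and spanned approximately 600 generations. A quick calculation, sketched below, converts those two figures into an average generation time. The variable and function names are illustrative; the only inputs are the two numbers quoted in the text, and because the generation count is approximate, so is the result.

```python
def average_generation_time_hours(days, generations):
    """Average time per generation, in hours, over the whole experiment."""
    return days * 24 / generations


def generations_per_day(days, generations):
    """Average number of generations completed per day."""
    return generations / days


if __name__ == "__main__":
    days = 90          # duration quoted for the experiment
    generations = 600  # approximate generation count quoted in the text
    print(average_generation_time_hours(days, generations), "hours per generation")
    print(round(generations_per_day(days, generations), 2), "generations per day")
```

On these figures, the biofilm populations turned over roughly every 3.6 hours on average, or between six and seven generations per day. This short generation time helps explain how so much visible diversity could accumulate in only three months.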

Researchers are fighting back as well, using experimental models such as that shown in Figure 1-18, to study how drug resistance arises and to test therapies for removing these bacteria from the body. In this case, we are competing against the very forces that gave rise to our own diversity, and the bacteria we are battling have a long and successful history of evolving to overcome such obstacles. Pandemic infections such as the bubonic plague, influenza, and smallpox have devastated human populations in the past, and infectious diseases remain a great health risk despite our best efforts to control them.

Macroorganismal Hosts Coevolve with Their Microbiomes to Create New Holobionts

The evidence presented above demonstrates that organisms depend on others for their survival and adaptation. This includes eukaryote–prokaryote teams in structures as simple as biofilms and as complex as the entire human body. It may be surprising to learn that even the healthiest humans are riddled with a tremendous variety of bacteria: by a long-cited estimate, approximately 90% of the cells in a human are prokaryotes, although more recent counts suggest the ratio is closer to one to one. The shorthand term for the microbial population within our bodies is microbiome. Microbiomes are not static: the constituents of a given person’s microbial makeup change over time and in response to alterations in the environment (e.g., changes in diet, medications, or even atmospheric conditions). This relationship is considered symbiotic, rather than an infection, because we rely on these microbes to remain healthy. The human microbiome has coevolved with humans to form a eukaryotic–prokaryotic team called a holobiont. In this lexicon, the nonmicrobial host is often called a macroorganism, meaning it is large enough to see with the naked eye. All macroorganisms, including humans, other animals, plants, and fungi, are holobionts. Eukaryote–prokaryote teamwork is now considered a prerequisite for nearly all multicellular life on Earth, even at the level of individual organisms.

The recognition of holobionts changes the way scientists view evolution. A human and his or her microbiome can now be viewed as a single “superorganism” capable of evolving as a unified collection of genes, collectively called the hologenome. The microbiome, composed of at least 20,000 different species, contributes nearly all of the genetic variation in this superorganism: whereas human cells have about 22,000 different genes (and these are 99.9% similar across individuals), current estimates suggest that the microbiome contains about 10 million genes. This is the logical extension of the theme introduced at the beginning of this chapter: viruses infected prokaryotes to form teams, prokaryotes formed clusters to specialize, specialized prokaryotes collaborated to form eukaryotes, eukaryotes teamed up with prokaryotes to form macroorganisms, and macroorganisms evolved as a single genetic unit to create ever more complex forms of life. This story is reflected in our DNA: humans contain genetic material that traces back to Archaea, Bacteria, and even viral origins. Evidence of complex, cross-species communication indicates that a wealth of biological codes guide the interactions among these genes.

We can, therefore, hypothesize that human health and disease can be traced to changes in our microbiome. Already the evidence is mounting in support of this concept: an array of diseases, including inflammatory and autoimmune disorders, may be linked to a loss of some elements of our microbiome. We also see evidence during pregnancy and childbirth: bacteria colonizing a female’s vagina during pregnancy (lactobacilli) produce a very acidic environment to ward off any potential pathogens. During childbirth, the newborn is exposed to microbes that help establish the first “colonies” of microbes in the baby (children delivered by caesarean section avoid the passage through the vaginal canal and often lack these early colonists). Scientists have also suggested that a lactating mother can provide nutrients in her milk to support the growth of specific bacteria in her baby’s gut that prevent colonization by harmful pathogens. Physicians are now capitalizing on this concept by “transplanting” the microbes of one individual to another to treat diseases (see Box 1-15). In effect, our microbiome is now analogous to a set of organs in our body. This means new therapies may target these organisms, rather than the human “host” cells, to treat diseases.

BOX 1-15 APPLIED CELL BIOLOGY: HEALING THE HUMAN HOLOBIONT WITH PROKARYOTIC TRANSPLANTATION

Clostridium difficile is a bacterial species common in the digestive tract of humans and is generally considered harmless. In some cases, however, this bacterium causes a disease called Clostridium difficile infection (CDI), which includes symptoms such as frequent watery diarrhea; severe stomach pain or tenderness; nausea; loss of appetite; low-grade fever; blood or pus in the stool; and even death. The most common treatment for CDI is oral antibiotics, which target the gut microbiome. When antibiotics prove ineffective, physicians may use a more sophisticated therapy called fecal microbiota transplantation (FMT). FMT is performed by transferring a sample of fecal matter from a healthy donor to the CDI patient during a colonoscopy or even via an oral capsule, and it is over 95% effective in curing CDI. Use the internet to learn more about FMT, CDI, antibiotic therapy, and fecal microbiota, and then apply your knowledge to answer the following questions:

1. If Clostridium difficile is common in the human digestive tract, why do only a small number of people develop CDI? That is, what are the most common causes of CDI?

2. Why does the proliferation of Clostridium difficile cause the symptoms listed above? How is this related to the concept of cooperation and competition in microbial populations?

3. Propose an explanation for why some populations of Clostridium difficile are resistant to antibiotic treatment.

4. Propose an explanation for why FMT is such an effective treatment for CDI.

Concept Check 3

An important theme in Section 1.4 is that teamwork promotes success at all levels of life, from prokaryotes to holobionts. If this is the case, does evolution favor a condition where all organisms on Earth collaborate to form a single organism that spans the entire planet? In the early 1970s, British chemist James E. Lovelock and U.S. biologist Lynn Margulis advanced such an idea, known as the Gaia hypothesis. While still hotly debated, some aspects of this hypothesis are supported by experimental evidence. Apply the concepts presented in this chapter, as well as any other reliable sources you find, to (a) build a case that all life is linked by an ever-increasing level of complex interactions, and (b) refute this hypothesis, on the grounds that such complexity cannot meet the requirements of Darwinian evolution.

▶ 1.5 Chapter Summary

Cells, the fundamental units of life, are complex and highly organized structures that possess two traits common to all living beings: the ability to self-replicate and the ability to self-correct. The remaining chapters in this book focus on the cellular mechanisms responsible for these two abilities, but before we explore the structure and function of cells in greater detail, it is important to understand that cells share common features. All cells contain water and are composed primarily of carbon-based molecules that can be categorized into four structural classes: lipids, sugars, amino acids, and nucleic acids. These simple building blocks form a tremendous number of different molecules, each of which performs a very small number of the tasks necessary for cells to remain alive. Over the course of evolution, cells acquired a common strategy to simplify these tasks: they clustered these molecules into teams that cooperate to perform a subset of cellular activities. The resulting structural complexity in these teams gives rise to the variation required for evolution to occur. In many cases, the structure–function relationship helps explain how these teams work. For example, prokaryotes developed mechanisms for adhering to each other and to solid surfaces to form stable clusters that are generally more successful than individual prokaryotic organisms. Archaeal cells acquired the ability to enclose teams of molecules, including at least one entire living bacterial organism, in membrane-bound compartments called organelles; this gave rise to eukaryotic organisms. Teams of eukaryotes and prokaryotes form organisms called holobionts, which coevolve as a single “superorganism.” These ideas form the foundation for understanding how and why cells look and act the way they do.

Chapter Study Questions

1. Using the bulleted list of “essential tasks” performed by cells as a guide, what additional tasks do cells have to perform? What additional tasks might be especially important for unicellular organisms (including prokaryotes) or for tissue cells (e.g., skin, heart muscle, brain)?

2. Most prokaryotes are generally considered to be less complex than most eukaryotic cells, yet they are the most abundant cells on Earth. How would one explain why simple prokaryotes are so abundant?

3. Explain the structure–function relationship. Then develop an example of this relationship in everyday (nonbiological) life. How applicable is this concept to the nonbiological structures that people interact with during a typical day?

4. What properties of the carbon atom permit it to form a wider variety of molecules than any other atom?

5. Carbon is one of the most abundant atoms in sugars and lipids, but sugars and lipids have very different chemical properties. How would one explain this difference?

6. Cows and people are mammals, but cows can use most grasses (which contain the β1,4-linked polysaccharide cellulose) as a food source, while humans cannot. What special abilities must cows have to enable them to eat grass? (Hint: cows and humans have different bacteria growing in their gastrointestinal tract.)

7. Provide an example of a biological code not specifically addressed in this chapter, and explain what makes it useful to organisms. What qualifies your example as a code?

8. Genetic evidence is cited multiple times in this chapter. What makes genetic sequence information more useful than other factors such as cell size, membrane structure, or the size of a population when discussing evolution?

9. Why is it important that mitochondria were the first organelles? What makes mitochondria even more important to a cell than a nucleus?

10. The advent of quorum sensing is considered one of the most important events in the evolution of life on Earth. Compare the way organisms coexisted before and after quorum sensing, and provide a brief explanation for why it persists to this day in prokaryotic populations.

Multiple-Choice Questions
1. Methane has the chemical formula CH₄. Which of the following statements about methane is false?
A. It has a lower molecular mass than water.
B. It forms hydrogen bonds with water.
C. Its four hydrogen atoms are arranged in a tetrahedral pattern around the carbon atom.
D. It is a gas at room temperature and one atmosphere of pressure.
E. It is a hydrocarbon.

2. Which statement best defines the concept of “life”?
A. Living beings must be able to replicate and self-repair.
B. Living beings must contain at least one polysaccharide.
C. Living beings must reproduce sexually.
D. Living beings must remain in equilibrium with their environment.
E. Living beings must remain at rest during reproduction.

3. Which one of the following statements about polysaccharides is true?
A. They form a portion of the backbone in nucleic acids.
B. They form peptide bonds with proteins.
C. They form the backbone of phospholipids.
D. They are held together by glycosidic bonds.
E. They line the inner membrane of a eukaryotic cell’s nucleus.

4. Which one of the following is an example of how cells store genetic information?
A. Cells assemble information-rich, complex polysaccharides from simple monosaccharides.
B. Cells catalyze information-rich, chemical reactions that convert DNA into monosaccharides and bases.
C. Cells encode information in the form of sugar–phosphate bonds in polysaccharides.
D. Cells transport materials into and out of their interior based on information in the extracellular environment.
E. Cells assemble long strands of nucleotides.

5. Which one of the following statements about water is false?
A. Water is a neutral molecule containing ten protons and ten electrons.
B. The H atoms and the O atom within water share electrons in covalent bonds.
C. Hydrogen atoms in water are more electronegative than oxygen atoms.
D. Water has a higher specific heat value than methane.
E. Water is a polar molecule with a partial positive charge on one end and a partial negative charge on another end.

6. Which one of the following statements about biofilms is true?
A. They are composed of only eukaryotic cells embedded in extracellular polymeric substances.
B. They are composed of only prokaryotic cells embedded in extracellular polymeric substances.
C. They are composed of both eukaryotic and prokaryotic cells embedded in extracellular polymeric substances.
D. They are composed of only living prokaryotic cells internalized within other prokaryotic cells.
E. They are composed of only prokaryotic cells living in a symbiotic relationship and close vicinity to other prokaryotic cells.

7. Which statement best defines the concept of a “holobiont”?
A. It is the aggregation of a macroscopic host (macrobiota) and myriad interacting microorganisms (microbiota).
B. It is a conglomeration of host prokaryotes and all their microbial symbionts.
C. It is an assemblage of the species within a single ecological unit.
D. It is a collection of microbiota and an archaeal host competing with each other.
E. It is a representation of imbalance in the microbiota–host equilibrium.

8. Which one of the following statements about silicon is false?
A. Silicon is the second most abundant atom on Earth.
B. Silicon–carbon bonds often form in organisms.
C. Silicon atoms can make four covalent bonds with other elements in forming chemical compounds.
D. Silicon can be incorporated into carbon-based molecules.
E. Silicon does not easily make and break bonds with oxygen.

9. Which one of the following statements about monomer bonds is false?
A. Amino acids are linked by peptide bonds in polypeptides.
B. Glycosidic bonds link monosaccharides in nucleotides.
C. Individuals lacking lactase are unable to break the glycosidic bond in the disaccharide lactose.
D. Nucleotides are linked by phosphodiester bonds in both DNA and RNA.
E. A glycosidic bond linking glucose and galactose can be broken by lactase.

10. Which statement best defines the concept of “code biology”?
A. It is a coded cellular language that requires quorum sensing to decode.
B. It is a set of rules transmitted from generation to generation in living organisms.
C. It is identical to the genetic code, which allows RNA signals to make proteins.
D. It is a set of rules for assembling biofilms.
E. It is a set of rules found only in holobionts.

References
Barbieri M. What is code biology? Biosystems 164:1–10. https://doi.org/10.1016/j.biosystems.2017.10.005
Esmaili S, Bass AD, Cloutier P, Sanche L, Huels MA. Synthesis of complex organic molecules in simulated methane rich astrophysical ices. J Chem Phys. 2017 Dec 14;147(22):224704. doi: 10.1063/1.5003898
Fiore M, Strazewski P. Prebiotic Lipidic Amphiphiles and Condensing Agents on the Early Earth. Life. 2016 Jun;6(2):17. doi: 10.3390/life6020017
Kan SBJ, et al. Directed evolution of cytochrome c for carbon–silicon bond formation: Bringing silicon to life. Science. 2016;354(6315):1048–1051. doi: 10.1126/science.aah6219

CHAPTER 2

DNA Is the Instruction Book for Life
Nucleic Acid Structure and Organization

CHAPTER OUTLINE
2.1 The Big Picture
2.2 All of the Information Necessary for Cells to Respond to Their External Environment Is Stored as DNA
2.3 DNA Is Carefully Packaged into Five Levels of Organization
2.4 Cells Chemically Modify DNA and Its Scaffold to Control Packaging
2.5 Chapter Summary

▶ 2.1 The Big Picture
One of the most important trends discussed in Chapter 1 is that, as organisms evolved into increasingly complex beings, the amount of genetic material (deoxyribonucleic acid, or DNA) they contained increased as well. This is not coincidental: the biological codes responsible for developing and sustaining this complexity rely on the molecules encoded by sequences of nucleotides in DNA to serve as the actors that obey the rules in biological codes. The more genetic material a cell has, the more information it can access, and the more variation it can generate. Accessing this information requires cells to follow one of the most ancient biological codes, the genetic code. DNA is one of the most critical molecules in living organisms because it contains the essential information passed down from generation to generation. The sequence of nucleotides in DNA reflects its evolutionary history, similar to written documents that can trace our genealogical history to our ancestors. (Indeed, genealogists use DNA sequences to trace family histories for exactly this reason.) Put more simply, sequences of nucleotides in DNA are the functional equivalent of words in a book, and every time a cell divides, it must replicate its entire library of genetic sequence information. Changing even a single letter in one word can change the meaning of the word, as well as the sentence and book that contain it. Nothing in nature is perfect: cells make errors when replicating their DNA and pass these errors on to their descendants. These changes constitute the genetic variation required for evolution to occur: variation in the words changes how biological codes are enacted and thereby changes the way organisms function. By studying when uncorrected errors in this sequence appeared over evolutionary history, biologists can reconstruct the time when the information in the library changed and thereby trace the history of all organisms back to LUCA (the last universal common ancestor). This concept forms the foundation of cell biology principle 2: DNA is the instruction book for life.

CELL BIOLOGY PRINCIPLE 2

DNA is the instruction book for life.

Photo courtesy of Andrew S. Mount, PhD; Jonathan Stewart; Nichole Hickman; Michael Groce; Okeanos Research Lab, Clemson University.

The implications of this principle are profound. Virtually every cellular function can be traced back to biological codes acting on ribonucleic acids (RNAs) and proteins encoded by DNA. Unpacking how this happens remains one of the primary challenges in cell and molecular biology, and the lessons we learn along the way directly impact nearly all aspects of modern life, including medicine, agriculture, manufacturing, and even warfare. The goal of this chapter is to explore the structure of nucleotides and nucleic acids, as well as their structural organization inside eukaryotic cells. This will serve as the foundation for addressing how DNA is replicated (Chapter 7) and how cells access DNA information (Chapters 8 and 12). We will frequently reference sections of these chapters to reinforce the concept that a carefully organized “library” is a prerequisite for cells to respond properly to challenges in their environment. This chapter covers four major ideas:
■ First, DNA is a storehouse of “cellular information.” Like code biology, the concept of biological information is also somewhat abstract; notice how a relatively simple linear code can produce an enormous number of different RNA and protein products.
■ Second, the linear structure of a DNA molecule is relatively simple. It can often be learned largely by memorization.
■ Third, DNA organization in a eukaryotic cell is relatively complex. To assist in working through this subject, we have divided the physical organization into five levels, from poorly organized to very highly organized.
■ Fourth, cells can chemically modify DNA and the proteins that organize it to help control their structure and function. Understanding this idea sets the stage for explaining how cells read specific sections of a DNA molecule at specific times.

▶ 2.2 All of the Information Necessary for Cells to Respond to Their External Environment Is Stored as DNA
Key Concepts
■ Cells store the information necessary to build RNA and proteins in the form of DNA, a simple nucleic acid found in all living organisms.
■ When a cell divides, each of the resulting daughter cells contains an almost exact copy of the parental cell DNA.
■ DNA must transmit its information to other molecules to be useful.
■ The functional unit of DNA information is called a gene.
■ The DNA information in genes is converted into matching, single strands of RNAs by a process called transcription; cells transcribe many different types of RNA from their DNA.
■ The sequence information in messenger RNAs (mRNA) is converted into a sequence of amino acids by a process called translation.
■ Mutations in DNA are passed on to RNAs and proteins, resulting in variations that are acted upon by natural selection.

As we stated in Chapter 1, to be alive, cells must have the ability to self-replicate and self-correct. This statement implies that cells can sense their surroundings, so they know when it is safe to replicate, and that they are aware of their functional state and know when they are damaged. To “know” these things, cells must contain information (see Box 2-1).

BOX 2-1 TIP Here is a helpful tip for every chapter in this book: before reading the chapter in detail, flip through the pages and note how many figures are devoted to each of the major subdivisions. Remember that it takes considerably more effort to generate a figure than it does to write an equivalent volume of text, so if an idea warrants a figure, it likely merits your attention. Looking at the figures before reading reveals a visual roadmap of some of the most important concepts in the chapter. By noting how many figures are used for each subdivision and what they contain, students will have a good idea of what to expect as they work their way through a chapter.

What is cellular information? The concept of information is fairly abstract. While we are all quite familiar with its use in our daily lives, we cannot point to it. Most often, when we refer to information, we mean the way it is represented in physical form: as written text, recorded sound, or images, for example. One of the best physical examples of information storage in our everyday lives is a library (see Box 2-2). Cells don’t use any of these forms to store information, but they nonetheless capture and store a tremendous amount of information about their surroundings and their physical state. This information is stored in the form of molecules. Just as we use different physical forms to store different kinds of data, cells use different types of molecules to store the different types of information they need to stay alive, as illustrated in FIGURE 2-1.

FIGURE 2-1 Some forms of information storage in cells.

BOX 2-2 THE LIBRARY OF CONGRESS ANALOGY, PART 1 The information stored in DNA is analogous to the letters, numbers, and other symbols used in books. Units of these books, such as chapters, are the genes in DNA. For an information storage system to be useful, someone has to open and read it; in a library, this is done by people, and in cells, this is done by proteins. A nucleus is similar to the Library of Congress in that entry is restricted and the removal of information is prohibited. So, proteins in the nucleus “copy” the genes in the DNA “books” by creating RNA copies; this is called transcription because the language of a DNA “book” is transcribed to a new form. FIGURE A shows how these concepts apply to both cells and our hypothetical library. While the source materials (books or genes) must remain within the structure (library or nucleus), several copies can be made rather quickly and distributed to numerous locations outside the library building. Note that in a cell’s “library,” many of the books resemble “instruction manuals” more than novels or textbooks. Thus, photocopies of these manuals are useful only if they perform a job. Most RNA molecules control cell behavior directly, analogous to reproducing a work of art and putting it on display. A notable exception is messenger RNA (mRNA), which requires conversion of its information into sequences of amino acids we call proteins. The reading of mRNA photocopies to build polypeptides and proteins is called translation for this reason: words are converted into physical objects (proteins) that do the actual work. The reading of mRNA is done by ribosomes, which can be thought of as factories (several proteins and RNA molecules working together as a team). Note that this strategy protects the books quite well, while also making them useful to the population.

FIGURE A The Library of Congress analogy for information storage and transmission in cells. © iStockphoto/Thinkstock.

The recent effort to digitize library holdings by scanning and photographing predigital-age works illustrates how humans have put the lessons of cell and molecular biology to use. Instead of storing information on paper, canvas, clay, stone, parchment, vinyl, and magnetic tape, all of this information is being converted into a simple binary computer code that can be readily reproduced and edited. Whereas binary code uses only two elements (zeros and ones) to store information, DNA uses four building blocks, called nucleotides. Digital libraries more closely resemble how cells manage information than conventional libraries, but even they must confront physical storage demands: server farms are highly specialized, organized spaces dedicated to storing and securing the hardware holding the encoded information. DNA has even been proposed as a digital information storage device to replace binary code (Panda et al., 2018). This analogy also demonstrates how mutations can affect cells. Imagine that someone (or even entire teams of people) tried to reproduce every letter of every book in the library. Chances are good that he or she would make several mistakes (even digital copies can contain errors), and the new books would have different words in them. If these mistakes are not corrected, all subsequent photocopies of these books would have different information, and many of the new words would be unrecognizable; this is why many mutations are harmful to cells. On very rare occasions, the mistakes yield changes that are useful, such that the RNA (and possibly proteins) resulting from these new books helps the cell better adapt to its environment. These cells have a slightly higher probability of surviving long enough to divide and pass on this new information to the next generation of cells. This is how evolution by natural selection works.

One of the most important molecules used to store cellular information is DNA. Cells use DNA to store information for constructing other complex biological molecules, such as RNAs and proteins. (How cells use DNA to construct these other molecules is discussed in Chapter 8.) DNA stores the most fundamental information necessary for life; every cell, and, therefore, every living organism, uses DNA for this purpose.

A Cell’s DNA Is Inherited
DNA stores information in the form of a linear sequence of repeating units, somewhat analogous to how a linear sequence of letters or characters in an alphabet can store information in the form of words, sentences, or paragraphs. DNA uses a very simple language, composed of only four molecules, commonly known as A, C, T, and G. As we will see below, each of these letters is a molecule known as a deoxyribonucleotide. These deoxyribonucleotides are attached end-to-end to form very large structures in cells. When cells replicate, they divide into two parts called daughter cells, each of which inherits a complete and nearly identical copy of the parental cell’s DNA. This requires the parental cell to replicate its DNA before cell division. (The mechanisms of DNA replication are discussed in Chapter 7.) In multicellular organisms such as humans, most cells (called somatic cells) only pass their DNA on to the cells that replace them during that organism’s lifetime; a specialized set of cells, often called germ cells (e.g., eggs and sperm), is usually responsible for passing DNA from one organism to its offspring.

Mutations in DNA Are Passed from Generation to Generation
When cells replicate their DNA, they frequently make mistakes, as seen in FIGURE 2-2. Some of these mistakes result in changes to the deoxyribonucleotide sequence. This is understandable, considering how many DNA replication events take place during an organism’s lifetime. For example, let’s take a look at the human body. Each human somatic cell (i.e., not eggs or sperm) contains approximately 12 billion (12 × 10⁹) deoxyribonucleotides in its DNA. The number of cells in an average human body at any given time is estimated to be about 3 × 10¹³, which replicate to form approximately 10¹⁶ cells during a person’s lifetime. Generating 10¹⁶ cells from a single fertilized egg, therefore, requires (12 × 10⁹) × 10¹⁶ = 12 × 10²⁵ nucleotides to be replicated in the correct order. Nothing in nature can achieve the level of accuracy necessary to replicate all of these nucleotides perfectly. In fact, DNA replication errors are actually quite common in humans, especially in rapidly dividing cells, such as blood cells or cells lining the digestive tract: the rate of inserting the wrong deoxyribonucleotide in a DNA sequence (often called a point mutation) is approximately 1 per every 2 × 10⁹ deoxyribonucleotides, or about six errors each time a cell replicates its 12 × 10⁹ deoxyribonucleotides.
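The arithmetic in this example can be checked with a short calculation. The following is a minimal sketch in Python; the variable names are hypothetical labels chosen for illustration, and the inputs are the approximate figures quoted in the text. The error rate is an assumption: one point mutation per 2 × 10⁹ deoxyribonucleotides is the rate that corresponds to about six errors per replication of a genome containing 12 × 10⁹ deoxyribonucleotides.

```python
# Approximate figures quoted in the text (all are rough estimates).
NUCLEOTIDES_PER_CELL = 12 * 10**9   # deoxyribonucleotides in one human somatic cell
CELLS_PER_LIFETIME = 10**16         # cells generated during a person's lifetime
NUCLEOTIDES_PER_ERROR = 2 * 10**9   # assumed: one point mutation per this many nucleotides

# Total nucleotides that must be copied to generate every cell in a lifetime.
total_replicated = NUCLEOTIDES_PER_CELL * CELLS_PER_LIFETIME

# Expected point mutations each time one cell copies its DNA.
errors_per_replication = NUCLEOTIDES_PER_CELL / NUCLEOTIDES_PER_ERROR

# Expected point mutations summed over all replications in a lifetime.
lifetime_errors = total_replicated / NUCLEOTIDES_PER_ERROR

print(f"Nucleotides replicated in a lifetime: {total_replicated:.1e}")
print(f"Point mutations per cell replication: {errors_per_replication:.0f}")
print(f"Point mutations over a lifetime: {lifetime_errors:.1e}")
```

Changing NUCLEOTIDES_PER_ERROR shows how sensitive the per-replication estimate is to the assumed error rate: a tenfold more accurate copying process would yield fewer than one error per replication.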

FIGURE 2-2 Mistakes in DNA replication may cause mutations.

Other changes in the original DNA template sequence can also occur: extra deoxyribonucleotides can be inserted, some can be left out; even large pieces of DNA can be accidentally deleted, added, and/or moved to another location in the DNA sequence. The net effect of these changes is that every cell, and every organism, is at least slightly different from its ancestors, siblings, and other relatives. This heterogeneity in DNA sequences contributes a great deal to the variation found in populations of organisms that undergo evolution by natural selection.

DNA Must Be Read to Be Useful
Like other forms of coded information, DNA is not useful in isolation. For example, the information in a book is meaningful only if it is read and put to use. The same principle applies to DNA in a cell: only the portions of DNA that are “read” are meaningful. Cells don’t literally read a DNA sequence, of course. Instead, they use proteins that bind to specific deoxyribonucleotide sequences in the DNA, and this binding changes the behavior of the proteins, as illustrated in FIGURE 2-3. The information is then converted into a useful form (an RNA molecule). Some of these proteins are responsible for transcribing deoxyribonucleotide sequences in DNA into RNA sequences; some of these RNA sequences are then translated into amino acid sequences in proteins, a topic we will cover in much greater detail in Chapter 8. We will see numerous other examples of how DNA information is used throughout the book.

FIGURE 2-3 DNA information is “read” by proteins.

For many years, scientists thought that a large percentage of the DNA in most eukaryotic cells was useless because they could find no evidence that proteins would bind to these regions. More recently, they discovered that these regions contain characteristic patterns of repeating DNA sequences. We now know that many of these sequences either bind to proteins directly or control the shape of neighboring DNA sequences that bind proteins. (These non-coding regions are analogous to the “behind-the-scenes” roles that individuals play in a theater production: without them, the performance could never take place.) Much of this DNA contains deoxyribonucleotides that are chemically altered (e.g., by methylation: see Section 2.4). These chemical modifications can have a profound impact on which portions of DNA are read by a cell. These modifications are so important that a new field of biology, called epigenetics, has emerged for studying how these modified sequences impact the phenotypes and behaviors of organisms. We will present more specific examples of epigenetic modifications later in this chapter (see Figure 2-23) and in Chapter 12.

DNA Information Is Packaged into Units Called Genes
The most familiar form of DNA information is a segment of deoxyribonucleotides known as a gene (FIGURE 2-4). Despite its common usage in many fields of biology, there is no universally accepted definition for this term. In this book, we will define a gene as the linear sequence of deoxyribonucleotides necessary for converting a portion of that sequence into a complementary sequence of ribonucleotides inside a cell. (In this context, two nucleic acid sequences are called complementary when they can form stable hydrogen bonds between their base pairs.) Less formally, we can say that a gene is a portion of DNA that can be converted into RNA, plus some additional sequences that are absolutely necessary for this conversion to take place. A gene is always a single linear sequence of deoxyribonucleotides on a single piece of DNA; a gene cannot be fragmented into different portions of DNA scattered throughout different DNA molecules. The average length of a human gene is 10,000–15,000 nucleotides (often abbreviated as base pairs, or bp), though there is considerable variation in size.

FIGURE 2-4 The smallest functional unit of DNA is a gene. A sample of eukaryotic DNA encoding a messenger RNA is illustrated.

Genes are the best-known units of biological inheritance. When Gregor Mendel (1822–1884) first discovered the principles of genetic inheritance in the mid-nineteenth century, he was studying how individual genes of pea plants in his garden were passed from generation to generation. A few decades later, Heinrich Wilhelm Gottfried Waldeyer-Hartz (1836–1921) discovered linear strands in the nucleus that changed color when he added stain to the cells, and he called these structures chromosomes (derived from the Greek word meaning “colored bodies”). Another 20 years passed before Thomas Hunt Morgan (1866–1945) determined that genes are arranged on chromosomes. A single chromosome may contain thousands of genes lined up one after another, each with its distinct packet of information (see Figure 2-4). The modern definition of a chromosome is a genetic element containing genes essential to cell function. Genetics, the field of study devoted to uncovering the mechanisms governing the inheritance and expression of genes, has contributed greatly to our understanding of cell behavior. We will discuss the mechanisms controlling gene expression in greater detail in Chapter 12.

Genes Are Transcribed into RNAs
All genes share one important trait: some portion of their deoxyribonucleotide sequence can be converted by a cell into a complementary sequence of ribonucleotides (also known as RNA). In many genes, the portion that is transcribed into RNA includes a region called the coding sequence. The coding sequence of many eukaryotic genes is broken up into segments, called exons, separated by noncoding sequences called introns. The process of synthesizing an RNA molecule is called transcription, illustrated in Figure 2-4. Currently, biologists recognize seven major classes of genes, according to the function of the transcribed RNAs they generate (see Box 2-3):

BOX 2-3 THE LIBRARY OF CONGRESS ANALOGY, PART 2 The library analogy works best for mRNAs, as mRNAs are the photocopies cells read to make proteins. However, the analogy doesn’t work too well for the rest of the RNAs, and this is because they are perfectly useful to cells without having to be translated into proteins. Most photocopies are pretty useless by themselves: they can be made into paper airplanes but not real planes. So think of these other RNAs as very special photocopies that function perfectly well without having to undergo translation.

■ Ribosomal RNA (or rRNA) molecules are an essential component of the large and small subunits of ribosomes, the molecular complexes that make proteins. In humans, the small subunit rRNA is about 1,900 nucleotides long (sometimes referred to as simply bases), and the large subunit rRNA is about 5,000 nucleotides long. Biologists have found over 1,200 introns at over 150 unique sites in the small and large subunit ribosomal RNA genes across all known species of organisms. These intron sequences must be clipped out before the mature rRNAs can function properly.
■ Messenger RNA (or mRNA) molecules, unlike rRNAs, do not play an active role in any cellular activity. Instead, they serve as templates for the assembly of the ribosomes that will build a specific new polypeptide. These are called messenger RNAs (mRNAs) because they are intermediates in the genetic code, carrying the encoded message from DNA to proteins. Put another way, mRNAs are short-lived, intermediate copies of DNA information that are translated into proteins. Human mRNAs average 2,500 nucleotides in length and contain an average of 7.8 introns.
■ Transfer RNA (or tRNA) molecules are bridge molecules that link amino acids to a specific three-nucleotide sequence on mRNA. They specifically deliver the correct amino acids to ribosomes, where they are added to the polypeptides being synthesized. (This is analogous to mail carriers and couriers, who ensure packages are delivered to the correct address.) Compared to the rRNAs and mRNAs they interact with, tRNAs are comparatively tiny (typically about 73–93 nucleotides long) and contain a small number of introns that can be traced back to the archaeal ancestor of eukaryotes (see the subsection Eukaryotes Arose Through Teamwork by Prokaryotes in Chapter 1).
■ MicroRNA (miRNA) molecules are noncoding RNAs with an average length of 22 nucleotides. The genes encoding miRNAs do not contain introns, but some are contained within the introns of genes encoding mRNAs. They perform many different functions in cells, but most inhibit translation by binding to a noncoding region of mRNA molecules and inhibiting the ribosome. Their discovery in 1993 revolutionized cell and molecular biology by introducing a fourth critical player (in addition to rRNA, mRNA, and tRNA) in the biological code that converts DNA into proteins. In addition to inhibiting translation of their own genes, cells can also use miRNAs to defend against infection by viruses. These molecules are covered in much greater detail in Chapter 11.
■ Short interfering RNA (siRNA) molecules are so-named because they are small (21–23 base pairs in length) and interfere with translation by inducing the destruction of targeted mRNAs. Biologists think these molecules evolved in plants and animals to silence genes of the viruses and other pathogens that infect them. Mammals, including humans, do not express siRNAs, but they inherited the ability to use foreign siRNAs from their evolutionary ancestors. This discovery in 2001 opened up a new and potentially powerful way of combating genetic diseases, such as cancer, by inserting synthetic siRNA molecules that target and inhibit cancer-causing genes in tumors.
■ Small nuclear RNAs (snRNAs) are slightly larger than miRNAs (100–200 nucleotides long) and function primarily in the nucleus, where they control transcription of mRNA genes and export of RNA and proteins from the nucleus to the cytosol. These are discussed in greater detail in Chapter 11.
■ The seventh and most diverse class of RNA molecules plays a multitude of regulatory roles, including splicing of introns from other RNAs and editing of gene sequences. X-inactive-specific transcript (Xist) RNA is a noncoding RNA that plays an essential role in inactivating X chromosomes, as we will discuss in Section 2.3. Another notable member of this class is called clustered regularly interspaced short palindromic repeats (CRISPR) RNA. We discuss CRISPR in much greater detail in Chapter 12.

Messenger RNAs Are Translated into Proteins
The combined product of rRNAs, tRNAs, mRNAs, and miRNAs working together is a newly synthesized chain of amino acids called a polypeptide, as shown in FIGURE 2-5. miRNAs help determine when a specific mRNA is available for translation. During translation, three-nucleotide-long codons in the coding sequence of mRNA are matched with anticodons in tRNA by ribosomes (including rRNA) to determine the sequence of amino acids in the resulting polypeptide. Even though 64 different codons are possible (four different nucleotides can fill each position, therefore 4³, or 4 × 4 × 4 = 64), each codon does not code for a unique amino acid. Instead, 3 codons (UAG, UAA, UGA) are designated as “stop” codons, which halt translation, and in the remaining 61 codons, redundancies ensure that an mRNA specifies only 20 different amino acids. The average size of a human gene coding sequence is about 500–600 codons, yielding a polypeptide of 500–600 amino acids in length. Therefore, an average-sized human gene can theoretically encode at least 20⁵⁰⁰ different polypeptide sequences (20 different amino acids per each of the 500 codons; see TABLE 2-1). In reality, the actual number of polypeptides produced by any single organism is far less than this because most of these polypeptides would not be useful to cells. Polypeptides (either as individuals or as assembled groups) form proteins, some of the most common molecular structures in cells, and proteins have some strict requirements for how their polypeptides must behave. We call these requirements the three traits of proteins, and we will discuss them in detail in Chapter 3.
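The codon counts in this paragraph can be reproduced by enumerating every possible triplet. The following is a minimal sketch in Python; the variable names are hypothetical, and the 500-codon length is the lower bound of the average coding sequence quoted above.

```python
from itertools import product

NUCLEOTIDES = "ACGU"                 # the four ribonucleotides found in mRNA
STOP_CODONS = {"UAG", "UAA", "UGA"}  # the three stop codons named in the text

# Every possible three-nucleotide codon: 4 choices at each of 3 positions.
codons = ["".join(triplet) for triplet in product(NUCLEOTIDES, repeat=3)]

# Codons that specify an amino acid rather than halting translation.
sense_codons = [codon for codon in codons if codon not in STOP_CODONS]

# Number of distinct polypeptides a 500-codon coding sequence could specify
# if any of the 20 amino acids may occupy each position.
AMINO_ACID_COUNT = 20
CODONS_IN_EXAMPLE = 500
possible_polypeptides = AMINO_ACID_COUNT ** CODONS_IN_EXAMPLE

print(len(codons))                      # total codons
print(len(sense_codons))                # codons that specify amino acids
print(len(str(possible_polypeptides)))  # number of decimal digits in 20**500
```

Printing the number of digits rather than the value itself makes the scale easier to grasp: 20⁵⁰⁰ is a number several hundred digits long, far more sequences than any organism could ever produce.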

FIGURE 2-5 An overview of translation in eukaryotes.

TABLE 2-1 Comparison of the Information Storage Capability of the Genetic Code, ASCII Computer Code, and Musical Notation

The information stored in DNA can undergo two transformations to become useful to cells. The first transformation, transcription, simply changes the DNA informational sequence into a complementary RNA informational sequence. This has a profound benefit for cells: relative to the typically very large DNA molecules, RNA molecules are quite small and can be transported to different regions in a cell relatively easily. Also, multiple RNA molecules can be generated from the same DNA sequence, so that a cell can configure its RNA profile by simply changing the number of RNA molecules it copies from each of its genes. This is one easy way for cells to become specialized. The second transformation occurs when mRNA information is used to build polypeptides. The process of converting the ribonucleotide sequence of mRNA into an amino acid sequence in a polypeptide is called translation. We will discuss the molecular events responsible for translation in much greater detail in Chapter 8. Note that mRNA is the only known type of RNA that is translated. This, too, offers great benefits to cells. Each DNA and RNA molecule is composed of only four different subunits (nucleotides), so the number of variations available is relatively small. By comparison, proteins are composed of twenty different subunits (amino acids), which permit them to form much greater numbers of different molecules. Translation allows the information in DNA to be expressed in a myriad different proteins.
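The two transformations can be sketched in a few lines of Python. This is a simplified illustration rather than a model of the cellular machinery: the function names and the short template sequence are hypothetical, the template strand is assumed to be written in the 3′→5′ direction so that the complementary mRNA reads 5′→3′, and translation is assumed to begin at the first AUG and end at the first stop codon. The codon table is the standard genetic code written in compact form, with amino acids given as one-letter abbreviations.

```python
from itertools import product

# Base-pairing rules used during transcription (DNA template base -> RNA base).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

# Standard genetic code, listed in UCAG order for each codon position.
# "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def transcribe(template_3_to_5: str) -> str:
    """Return the mRNA (5'->3') complementary to a DNA template written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)

def translate(mrna: str) -> str:
    """Read codons from the first AUG until a stop codon; return one-letter codes."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)

template = "TACCACGTAGACATT"  # hypothetical template strand, written 3'->5'
mrna = transcribe(template)
print(mrna)             # AUGGUGCAUCUGUAA
print(translate(mrna))  # MVHL
```

Note that the first step maps a four-letter alphabet onto another four-letter alphabet, whereas the second maps three-letter words onto a twenty-letter alphabet; this is where the expansion in molecular variety described above comes from.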

Mutations in DNA Give Rise to Variation in Proteins, Which Are Acted on by Natural Selection

Mutations such as those discussed previously have the potential to alter the structure and function of RNAs and/or proteins. If a mutation occurs in the coding sequence of a gene, this change in the DNA informational sequence is reflected by a corresponding change in the RNA informational sequence that arises from it through transcription. In some cases, point mutations have little or no effect on the structure and function of the RNA, but in other cases, the effects of mutations can be profound, resulting in the formation of a dramatically different informational RNA or even no RNA at all. Because three types of RNAs play a role in synthesizing proteins, mutations in their sequences have the potential to alter the sequence of amino acids created from an mRNA. This is especially true when a sequence in the coding region of an mRNA is altered because this region of mRNA determines the order of tRNAs it binds to and thus of amino acids in a new polypeptide. If the coding region of an mRNA is changed, the order of tRNAs that bind to it will be changed accordingly, and the resulting polypeptide will have a different sequence of amino acids, as shown in FIGURE 2-6.

FIGURE 2-6 Mutations can alter amino acid sequence and protein function. A classic example is the point mutation in the hemoglobin gene that causes sickle-cell disease. In this case, a single deoxyribonucleotide change in the DNA coding sequence of the hemoglobin gene (the 17th nucleotide is changed from A to T) causes a change in the mRNA and a single amino acid change in the hemoglobin protein (the sixth amino acid is changed from glutamic acid to valine). This tiny change causes red blood cells to adopt a characteristic “sickle” shape when the concentration of oxygen in the blood drops because the hemoglobin protein in the red blood cells changes from its normal tetrameric configuration to long polymer strands that distort the membrane of red blood cells.

Resulting sickle-shaped cells can get stuck in capillaries and/or hemolyzed (ripped open), thereby interfering with proper circulation and causing a great deal of pain (FIGURE 2-7).
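The sickle-cell mutation can be reproduced with a short Python sketch. The sequence below is the commonly cited beginning of the β-globin coding sequence, written without the initiator ATG so that the numbering matches the text (nucleotide 17, amino acid 6); treat it as an illustration rather than a reference sequence. The helper names and the abbreviated codon table are our own.

```python
# First seven codons of the beta-globin coding sequence (coding strand, 5'->3'),
# written without the initiator ATG so that numbering matches the text.
NORMAL = "GTGCACCTGACTCCTGAGGAG"

# Only the codons needed for this fragment, spelled as coding-strand DNA.
CODONS = {"GTG": "Val", "CAC": "His", "CTG": "Leu",
          "ACT": "Thr", "CCT": "Pro", "GAG": "Glu"}


def point_mutation(seq: str, position: int, new_base: str) -> str:
    """Replace the base at a 1-based position."""
    return seq[:position - 1] + new_base + seq[position:]


def translate(seq: str) -> list[str]:
    """Translate the fragment codon by codon using the abbreviated table."""
    return [CODONS[seq[i:i + 3]] for i in range(0, len(seq), 3)]


sickle = point_mutation(NORMAL, 17, "T")   # A -> T at nucleotide 17

pairs = zip(translate(NORMAL), translate(sickle))
for number, (before, after) in enumerate(pairs, start=1):
    marker = "  <-- changed" if before != after else ""
    print(number, before, after, marker)
```

Only the sixth position differs between the two translations, which shows how a single-nucleotide change can alter exactly one amino acid.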

FIGURE 2-7 A single-point mutation causes sickle-cell disease.

Mutations in tRNA or rRNA mitochondrial genes can also have significant effects in cells. (Recall from Chapter 1 that mitochondria and chloroplasts are descendants of prokaryotic organisms and, therefore, perform transcription and translation of their own genes.) Several mutations have been found in the gene encoding the tRNA that carries the amino acid leucine to ribosomes in mitochondria, causing a wide range of problems (such as diabetes, degeneration of muscle fibers and nerves, and stroke-like episodes). Likewise, a single-point mutation—the substitution of G for A at position 1,555—in the coding sequence of the gene for the rRNA found in the small ribosomal subunit (often called the 12S subunit) is associated with a form of deafness. The molecular mechanisms linking these mutations to their associated physiological problems are still not clear. There are as many different possible combinations of mutations in an organism as there are nucleotides in its genotype. Some can be harmful, or even fatal, including those that cause healthy cells to become cancer cells, but most mutations do not cause serious problems for living cells and organisms. The reason for this is fairly simple: typically, alterations in a DNA sequence either (1) have little to no effect on the structure and function of the RNAs and/or proteins produced by a cell or (2) have such a drastic impact on the cell that it (and possibly the entire multicellular organism in which it lives) quickly dies. As a result, most cells in a multicellular organism, or in a population of single-celled organisms, are subtly different from the previous generation of cells that divided to form them, as shown in FIGURE 2-8. Most scientists believe that a slow, steady rate of mutation persists for several rounds of cell division until enough mutations accumulate to generate a noticeably different cell type, and possibly even a new type of organism if these mutations are passed on from one generation to the next.

FIGURE 2-8 Mutations accumulate slowly in a population of cells. In the short term, this can have dire consequences for an organism: most cancers arise from cells that have acquired multiple mutations from their ancestral cells over the course of an individual’s lifetime (see Box 2-4). However, the same principles can have a positive impact over the long term. Each generation of cells and organisms is subtly different from that of its ancestors, and it is this variation that permits populations of organisms to adapt to changing environmental conditions. As discussed in Chapter 1 (see Box 1-3), evolution by natural selection only functions if a population of organisms contains some inheritable variation. The persistent mutation rate resulting from errors in DNA replication can, therefore, be viewed as an important tool to help ensure the survival of a species. In effect, every member of a population of organisms, including humans, can be viewed as an experiment in natural selection. We all speak the same genetic language, but we are all mutants, in one way or another.

BOX 2-4 FAQ: IS CANCER INHERITED OR NOT? The answer is both. Most cancers occur as a result of mutations that accumulate in a somatic cell, such as a skin or lung cell; these cancers are not inherited because skin and lung cells are not passed from parent to offspring. But if a germ cell (e.g., egg or sperm cell) has a mutation that increases the likelihood of developing cancer, this trait, like any other genetic trait, can be passed on to offspring.

Concept Check 1 Table 2-1 shows a comparison between the genetic code and two other codes commonly used in our everyday lives. Personal computers typically store numbers and symbols as magnetized particles on a hard drive, organized into bits and bytes, while sheet music stores musical performance instruction as symbols representing notes. What are the similarities and differences between these forms of information storage? In what ways is DNA a better or worse system than the others for storing information?

▶ 2.3 DNA Is Carefully Packaged into Five Levels of Organization

Key Concepts
■ The fundamental structural unit of DNA is a deoxyribonucleotide; combinations of the four possible deoxyribonucleotides are arranged sequentially to form a linear strand of DNA.
■ The simplest form of stable DNA in a cell is called a DNA double helix, formed by two strands of DNA oriented antiparallel (i.e., oriented in opposite directions) to one another and held together by hydrogen bonds between the atoms in the base portion of the deoxyribonucleotides.
■ Three different forms of double-helix DNA have been observed, suggesting that the shape of the double helix may vary in different regions of a DNA molecule.
■ The length of DNA in most cells is so great that complex packaging strategies are necessary to make the DNA fit inside the cells.
■ Cells construct a protein/RNA scaffold that protects and supports DNA; in prokaryotes, the resulting DNA-scaffold structure is called the nucleoid, and in eukaryotes, it is called chromatin. These structures are folded and twisted to further condense the DNA inside cells.
■ DNA organization is classified into five levels. The first level is simply the binding of two linear DNA strands to form a double-stranded double helix.
■ The second level of packaging is a beads-on-a-string configuration, wherein the DNA double helix is wrapped around a “spool” made up of histone proteins; this shortens the length of a DNA molecule approximately 7-fold. Chemical modification of the histone proteins is an important mechanism for controlling the expression of genes in eukaryotes.
■ The third level of packaging is the formation of 30- to 40-nm-thick fibers, made by twisting the beads-on-a-string structure into a coil. This shortens the length of a DNA strand approximately 42-fold.
■ The fourth level of packaging is called looped domains, formed by periodic attachment of 30- to 40-nm fibers to a protein/RNA complex that shortens DNA approximately 750-fold.
■ The fifth level of packaging is an organization of the Level 4 DNA/protein/RNA complex into a highly stable structure called a scaffold or matrix. This structure can undergo extensive compaction, also called condensation, when regions of DNA are not being read. This shortens DNA by up to 20,000-fold relative to Level 1.
■ DNA compaction beyond the fourth level silences gene expression in DNA; this hypercondensed DNA is only found in eukaryotes and is called heterochromatin.

Because DNA is heritable (i.e., passed on from a cell’s ancestors), it reflects the tremendous amount of information that has been gathered throughout billions of years of evolution by natural selection. Even the simplest cells have hundreds of thousands of nucleotides in their DNA. One of the smallest known genomes, that of the microbe Nanoarchaeum equitans, contains nearly 500,000 nucleotides. Remember that to be useful as a template, these nucleotides need to be accessible on demand. This presents a special challenge for cells: condensing DNA into a manageable size, while still permitting access to each nucleotide.

The solution to this challenge is complex. To better understand it, we will address the problem one level at a time. In this section of the chapter, we will focus on five levels of physical structure of the DNA molecule, increasing in complexity from a simple double helix to a highly compacted form called heterochromatin. In Section 2.4, we will discuss the chemical modifications cells use to control the behavior of proteins that organize DNA.

DNA Is a Linear Polymer of Deoxyribonucleotides Before we examine the packing of DNA, let’s have a closer look at DNA and how it is built (Box 2-5). We will start by examining the structure of a deoxyribonucleotide, the simplest building block in DNA, and then we will move up in complexity until we reach a complete double-stranded DNA molecule (see Box 2-6). Refer to TABLE 2-2 to keep track of the names of the different structures as we increase in complexity. BOX 2-5 THE LIBRARY OF CONGRESS ANALOGY, PART 3 In this section, we will learn how to create letters (deoxyribonucleotides) and link them together to form a language (DNA). We will then focus on how the letters are organized into words, sentences, books, bookshelves, bookcases, and so on, as we explore how DNA is packaged inside a cell.

BOX 2-6 TIP: STRUCTURE BEFORE JARGON Just as we found in Chapter 1, as we go through the structure of DNA, it is likely that many new words will come up. We strongly suggest focusing initially on the fundamentals of DNA structure, and then, once the concepts are understood, spend the necessary time memorizing the names.

TABLE 2-2 The Bases, Nucleosides, and Nucleotides of RNA and DNA

A deoxyribonucleotide is a fairly simple structure. One good way to become familiar with this structure is to practice drawing it (use a pencil because some erasing will be required). There are three steps to this procedure, and they are outlined in FIGURE 2-9. Let’s start by drawing ribose, a five-carbon sugar in a ring configuration (see Box 2-7). Be sure to number each of the carbons in this sugar in a clockwise fashion as shown.

FIGURE 2-9 A stepwise method for drawing a deoxyribonucleotide.

BOX 2-7 TIP Study Figure 2-9 carefully; there is a lot of information there. Chapter 1 discusses the basic structure and nomenclature of sugars (see Sugars Are Simple Carbohydrates) and nucleotides (see Nucleotides Are Complex Structures Containing a Sugar, a Phosphate Group, and a Base). One easy way to remember the ring structure of ribose is to draw it like a cartoon house: the peak of the roof is the oxygen linking the 1 carbon and the 4 carbon, the 5 carbon represents the top of a chimney, and the 1 carbon is where the nitrogenous base is attached to the roof, like a flag on a pole.

■ Remember that DNA is built from deoxyribonucleotides, so in step 1, we will take one oxygen away by replacing the hydroxyl group (–OH) on the 2′ carbon with hydrogen, yielding deoxyribose. Erase the hydroxyl group attached to the 2′ carbon, and replace it with a single hydrogen atom.
■ In step 2, we will attach a base to the deoxyribose. Draw a base near the 1′ carbon. We have four choices: two purines (adenine or guanine) or two pyrimidines (thymine or cytosine). Notice that all four of these bases contain a nitrogen atom bonded to a hydrogen atom that is pointing downward in the diagram; this is where we will attach the base to the deoxyribose. We can join our base to the rest of our structure by creating a covalent bond between the 1′ carbon on the deoxyribose molecule and the nitrogen atom on the base.

This reaction is called a dehydration (or condensation) reaction because it also yields a single water molecule as one of its products, as shown in Figure 2-9. The result is a deoxyribose sugar attached to a nitrogenous base; this structure is called a deoxyribonucleoside, or more simply, deoxynucleoside (see Box 2-8). BOX 2-8 TIP Many cell and molecular biologists use the word base as a shorthand term for the entire deoxyribonucleotide. A good example in this chapter is the term base pair. It’s easy to get confused when we first encounter it. Watch out for the word base, and make sure to know which structure is being referred to (a deoxyribonucleotide or just the base portion of one) when we use that term.

Because four different bases can be joined to the deoxyribose, there are also four deoxynucleosides. Deoxyadenosine and deoxyguanosine contain the purines adenine and guanine, respectively, and deoxycytidine and deoxythymidine contain the pyrimidines cytosine and thymine (see Box 2-9).

BOX 2-9 TIP Notice the difference in spelling between the base and the corresponding deoxynucleoside: although it would be convenient to simply put the term “deoxy” in front of the base to yield the corresponding name of the nucleoside, the rules of chemistry don’t allow it. One way to remember the difference in spelling is that for the purines, one inserts the letters “os” before the “-ine” when referring to the nucleoside (or deoxynucleoside), and for the pyrimidines, one inserts the letters “id” before the “-ine.” It helps that the word pyrimidine also follows this rule. Remember to think about this only after fully understanding the structures these words refer to.

■ In step 3, we attach a triphosphate to the deoxyribose. Draw a triphosphate group near the 5′ carbon as shown. We can create a covalent bond between the 5′ carbon and the nearest phosphate by performing another dehydration reaction. The product of this reaction gets a new name: it is now called a deoxyribonucleotide, or simply deoxynucleotide (see Box 2-10).

BOX 2-10 TIP Notice that the only difference in spelling between deoxynucleotide and deoxynucleoside is a single letter: s or t. Because “s” comes before “t” in the alphabet, it is easy to remember that the smaller structure contains the “s” and the bigger structure contains the “t.”

There are two additional things to remember about deoxy(ribo) nucleotides:

■ Deoxynucleotides have one, two, or three phosphate groups attached to the 5′ carbon, and each form gets its own name, as shown in FIGURE 2-10. For example, a deoxynucleotide that contains adenine as its base can be called deoxyadenosine monophosphate, deoxyadenosine diphosphate, or deoxyadenosine triphosphate, depending on the number of phosphate groups attached to it. The bond linking the phosphate to the 5′ carbon is called a phosphoester bond. In Figure 2-9, we drew deoxyguanosine triphosphate. There is no deoxynucleotide that does not have a phosphate attached to it, nor can more than three phosphates be attached to a deoxynucleotide; the number is always one, two, or three. Because the formal names of deoxynucleotides are rather long, most of the time we use abbreviations for them: dAMP, dADP, and dATP are the abbreviations for those that contain adenine, for example. Note that the lowercase “d” indicates that these are deoxynucleotides, to distinguish them from nucleotides that are not missing any oxygen atoms (such as those found in RNA). When we draw polymers of deoxynucleotides, we simplify the names even more by resorting to the familiar single-letter abbreviations (A, G, C, T).
■ RNAs are composed of nucleotide subunits that closely resemble the deoxynucleotides in DNA (see Figure 2-10). The two most important differences between the deoxynucleotides found in DNA and the corresponding nucleotides found in RNAs are (1) the pyrimidine uracil is used in place of thymine, and (2) the nucleotides in RNAs contain ribose rather than deoxyribose (hence the absence of the term deoxy when naming these structures). Similar to their deoxyribonucleotide counterparts, AMP, ADP, and ATP are the abbreviations for adenine ribonucleotides containing one, two, or three phosphates. In addition to serving as building blocks for RNA, ATP and GTP are sources of metabolic energy and help control the function of proteins, as we will see in later chapters.
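The naming rules above are regular enough to capture in a small lookup. The Python sketch below is a study aid built on the conventions described in this section; the function name `nucleotide_name` and the table layout are our own. Note that the nucleoside names are looked up rather than generated, because the spelling change from base to nucleoside is not perfectly regular (cytosine becomes cytidine, for example).

```python
# Base -> nucleoside name; the spelling changes are looked up, not computed.
NUCLEOSIDE = {
    "adenine": "adenosine",
    "guanine": "guanosine",
    "cytosine": "cytidine",
    "thymine": "thymidine",
    "uracil": "uridine",
}

# Number of phosphates -> (name prefix, abbreviation letter).
PHOSPHATES = {1: ("mono", "M"), 2: ("di", "D"), 3: ("tri", "T")}


def nucleotide_name(base: str, phosphates: int, deoxy: bool = True) -> tuple[str, str]:
    """Return (full name, abbreviation) for a nucleotide.

    deoxy=True names the DNA building block; deoxy=False names the RNA one."""
    if phosphates not in PHOSPHATES:
        raise ValueError("a nucleotide carries one, two, or three phosphates")
    prefix, letter = PHOSPHATES[phosphates]
    full = f"{'deoxy' if deoxy else ''}{NUCLEOSIDE[base]} {prefix}phosphate"
    abbreviation = f"{'d' if deoxy else ''}{base[0].upper()}{letter}P"
    return full, abbreviation


print(nucleotide_name("adenine", 3))               # DNA building block
print(nucleotide_name("uracil", 1, deoxy=False))   # RNA building block
```

Asking for zero or four phosphates raises an error, mirroring the rule that a nucleotide always carries one, two, or three.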

FIGURE 2-10 Distinctive features of ribonucleotides. The mono-, di-, and triphosphate forms of the uracil-containing ribonucleotide are indicated.

A Single Strand of DNA Is Held Together by Phosphodiester Bonds

Deoxynucleotides can be joined together linearly, via ester bonds, by performing dehydration reactions between the phosphate group of one and the hydroxyl group on the 3′ carbon of another (see Box 2-11). (We will discuss the exact mechanism of this reaction in much greater detail in Chapter 7.) The result is a string of deoxynucleotides linked together by an alternating sequence of phosphates and deoxyribose sugars (hence the familiar name “sugar-phosphate backbone”) held together by phosphodiester bonds, with the nitrogenous bases extending out to the side, as shown in FIGURE 2-11. Note also that this string of deoxynucleotides has different structures at the two ends: no matter how long the string is, one end always has an unbound 5′ carbon (no additional nucleotides attached to its 5′ carbon), and the other end has an unbound 3′ carbon. These are called the 5′ and 3′ ends of DNA (see Box 2-12).

BOX 2-11 TIP: CHEMISTRY NOMENCLATURE The definition of an ester bond is a covalent bond formed between an acid and an alcohol. In DNA and RNA, the phosphate group is the acid, and ribose or deoxyribose sugar is the alcohol. A single bond formed between a phosphate and a sugar is called a phosphoester bond. Two sugars attached to the same phosphate group form two phosphoester bonds, and these sugars are often described as being linked by a single phosphodiester bond.

BOX 2-12 THE HOLIDAY LIGHTS ANALOGY The linear string of nucleotides resembles a string of holiday lights: the wires represent the sugar-phosphate backbone, and the lightbulbs represent the bases. In this analogy, each bulb would be one of four colors to represent each of the four bases. One end of the wire would represent the 5′ end, and the other end would represent the 3′ end.

FIGURE 2-11 The general structure of a nucleic acid.

Just as the simple building block (monomer) of DNA has a name, so too does the polymer: our strand is now called a deoxyribonucleic acid. Note that this is a generic name and that the sequence of deoxynucleotides in the strand makes no difference: any linear sequence of the four deoxynucleotides, arranged in this 5′ to 3′ fashion, is called DNA (see Box 2-13). When we want to discuss both DNA and RNAs as a group, we use the more general term nucleic acids.

BOX 2-13 DNA SEQUENCING TECHNOLOGIES
One of the most important advances in biology in the past 50 years is the ability to “read” the linear sequence of nucleotides in DNA. The most commonly used “first-generation” technology for DNA sequencing was invented in 1977 by Frederick Sanger and his colleagues. The Sanger sequencing method relies on two simple principles: DNA cannot grow if the 3′ carbon on deoxyribose lacks a hydroxyl (–OH) group, and radioactive atoms can be detected on photographic film. During Sanger sequencing, multiple single strands of the piece of DNA being sequenced (the template) are mixed in a test tube with all of the cellular components necessary to replicate them, including all four deoxynucleotide triphosphates. As the chemical reaction takes place, a new piece of “complementary” DNA is synthesized to convert the single-stranded DNA template into a double-stranded helix. If one of the four types of deoxynucleotides being added lacks an –OH group on the 3′ end (these are called dideoxynucleotides because both the 2′ and 3′ carbons lack –OH groups), they can be inserted into the growing DNA chain, but no additional deoxynucleotides can be added to them. This shuts down DNA synthesis on that single template immediately. By “spiking” a test tube with a small amount of a specific dideoxynucleotide (e.g., dideoxycytidine), biologists can conclude that each time DNA synthesis stops in this reaction, the last deoxynucleotide to be added must be this dideoxynucleotide. Performing this reaction on thousands of copies of a DNA template yields many different copies of double-stranded DNA, each of which contains a known dideoxynucleotide at its 3′ end. The phosphate groups in the newly added dideoxynucleotides contain radioactive phosphorus (³²P), which makes the newly synthesized DNA radioactive. These copies are separated, based on their length, using gel electrophoresis. The separated pieces are visualized by drying the gel on a sheet of nitrocellulose paper and placing it next to X-ray film; the radioactive phosphorus exposes the film, so when it is later developed, small lines (called bands) appear on the film. Each line represents a different length of DNA that contains a known dideoxynucleotide at the 3′ end. When this procedure is repeated in four test tubes, each containing a small amount of one radioactive dideoxynucleotide (A, C, T, or G), the contents of all four tubes can be separated in different lanes of the same gel. The resulting pattern appearing on the developed X-ray film shows a series of bands that differ in length by one known dideoxynucleotide. Reading the gel band pattern from the smallest copy to the largest across all four lanes allows researchers to deduce the sequence of the newly synthesized DNA, from the 5′ end (smallest band) to the 3′ end (largest band). Because the two strands of a DNA helix are base-paired, researchers can also infer the sequence of the template that gave rise to the radioactive strands.

Later, radioactive dideoxynucleotides were replaced with fluorescently labeled versions, and these can be visualized with highly sensitive light detectors; radioactive gels were replaced with four-color readouts of DNA sequences. As technology improved, DNA amplification allowed hundreds of DNA templates to be sequenced at the same time, making it possible to sequence the entire genome of a cell with a robot. This technology produced the first complete human DNA sequence in 2001.

“Second-generation” DNA sequencing (also called next generation sequencing, or NGS) relies on immobilizing many copies of a single DNA template on a solid surface, adding the proteins necessary to replicate DNA, then washing the template with successive solutions containing only one type of deoxynucleotide triphosphate. If, for example, the template requires a T to be inserted in the newly synthesized strand at the 3′ end, washing the template with a solution containing deoxythymidine triphosphate but no other deoxynucleotides will result in the cleavage of the triphosphate on deoxythymidine, generating a new 3′ end containing T and a pyrophosphate (two phosphates cleaved off the 5′ triphosphate of the incoming deoxynucleotide). By measuring the amount of pyrophosphate generated after each wash with a known deoxynucleotide triphosphate solution, scientists can determine whether a given “base” was added to the growing DNA strand. Passing each type of “base” solution over the template in a repeated cycle yields a surge in pyrophosphate each time a base is added. Summing the bursts in pyrophosphate with each round of washes allows scientists to determine the order of base addition and, by extension, the 5′-to-3′ sequence of the growing DNA chain.

The advent of second-generation sequencing/NGS spawned a revolution in commercial DNA sequencing technologies. One of the most significant was replacing pyrophosphate detection with stepwise, fluorescence-based detection of bases as they are added to the growing 3′ end. This removes the need for repeated cyclical washes and allows scientists to observe the addition of bases by measuring the fluorescent color emitted by a “spot” of DNA template after each round of base addition. Now, millions of spots can be arrayed on a slide, exposed to all four types of fluorescently labeled deoxynucleotides at once, and the sequence of each spot is revealed by the release of color each time a new round of base addition is completed. The field of second-generation DNA sequencing is expanding rapidly, with new approaches appearing every year. Unlike the situation in the days of Frederick Sanger (1918–2013), now many different types of DNA sequencing are appearing in a highly competitive commercial market. The cost of sequencing an individual’s entire genome is dropping so rapidly that scientists estimate that over 2 billion humans will have their genome sequenced by the mid-2020s.
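The logic of reading a Sanger gel can be sketched in Python. This is an idealized illustration with a hypothetical template: it assumes that every possible chain-terminated fragment is produced and resolved as a band, and it ignores the primer and all experimental noise. The function names are our own.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def new_strand(template_3_to_5: str) -> str:
    """Strand synthesized on the template, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in template_3_to_5)


def sanger_lanes(template_3_to_5: str) -> dict[str, list[int]]:
    """For each dideoxynucleotide, list the lengths of the terminated fragments.

    A fragment of length n ends in a given dideoxynucleotide exactly when
    the n-th base of the newly synthesized strand is that base."""
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(new_strand(template_3_to_5), start=1):
        lanes[base].append(length)
    return lanes


def read_gel(lanes: dict[str, list[int]]) -> str:
    """Read bands from shortest to longest across all four lanes."""
    bands = sorted((length, base)
                   for base, lengths in lanes.items()
                   for length in lengths)
    return "".join(base for _, base in bands)


template = "TACGGATC"            # hypothetical template, written 3'->5'
lanes = sanger_lanes(template)
sequence = read_gel(lanes)       # new strand, 5'->3'
inferred_template = "".join(COMPLEMENT[base] for base in sequence)
print(lanes)
print(sequence, inferred_template)
```

Each lane holds the band positions for one dideoxynucleotide, and sorting all bands by length recovers the new strand; base pairing then gives back the template.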

FIGURE A Using NGS technology, scientists can read DNA sequences by measuring pyrophosphate release from specific nucleotides as DNA is synthesized. © nicolas_/E+/Getty Images.
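The pyrophosphate-based readout described in Box 2-13 can also be simulated. The Python sketch below is a simplified model under stated assumptions: one nucleotide type is washed in at a time in a fixed, hypothetical order (A, C, G, T), the polymerase adds that nucleotide as many times in a row as the template allows, and the signal is taken to be proportional to the number of bases added. The template is hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def pyrosequence(template_3_to_5: str, flow_order: str = "ACGT"):
    """Simulate cyclic washes; return (signals, sequence of the new strand).

    signals is a list of (nucleotide washed in, pyrophosphate units released)."""
    position = 0
    signals = []
    new_strand = []
    while position < len(template_3_to_5):
        for nucleotide in flow_order:
            added = 0
            # Keep adding while the next template base pairs with this nucleotide.
            while (position < len(template_3_to_5)
                   and COMPLEMENT[template_3_to_5[position]] == nucleotide):
                added += 1
                position += 1
            signals.append((nucleotide, added))
            new_strand.append(nucleotide * added)
    return signals, "".join(new_strand)


signals, sequence = pyrosequence("TACGGATC")   # hypothetical template, 3'->5'
for nucleotide, units in signals:
    print(nucleotide, "#" * units)
print(sequence)
```

Washes that release no pyrophosphate show that the offered base was not the next one required, while a double-height signal indicates two identical bases added in a row.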

Single strands of DNA, such as the one shown in Figure 2-11, are not stable in cells. Cells stabilize DNA by folding/organizing it. To keep track of how this occurs, we’ll identify five different levels and number them consecutively. Organization on this scale is a significant challenge for eukaryotic cells: storing genetic information, accumulated over billions of years of evolution, in the form of linear sequences of deoxyribonucleotides means that these sequences are very, very long. For example, the roughly 12 billion deoxyribonucleotides in an average human somatic cell form about 6 billion base pairs that, if laid end-to-end, would be a string over 2 meters long, hundreds of thousands of times the size of the cell containing it. If each of the deoxynucleotides were listed using its single-letter abbreviation, these letters would occupy more than a million pages of a typical book. How can all that information be packed into a single cell without it getting all jumbled up (see Box 2-14)? Here is where we describe Levels 1 through 5 of DNA organization.

BOX 2-14 THE LIBRARY OF CONGRESS ANALOGY, PART 4 In Section 2.3, we focus on how DNA information is stored in a highly organized, easy-to-use system. The library analogy applies quite well here: the books of information are stored on shelves, the shelves are arranged into rows, and the rows are assembled on floors of the building. Some books, because there is little or no demand for them, are stored away in shelves that are difficult to access. While the shelves do not encode much information themselves, they are essential for the books to be useful. Here, we will learn how cells build and arrange the protein/RNA “shelves” that support DNA.
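These figures can be checked with a few lines of arithmetic. The Python sketch below uses the 6 billion base pairs quoted above and the approximate fold reductions listed in the Key Concepts of this section; the rise of 0.34 nm per base pair and the figure of 3,000 characters per printed page are assumptions introduced here for illustration, and the fold reductions are treated as being relative to the Level 1 length.

```python
BASE_PAIRS = 6e9          # base pairs in a human somatic cell (from the text)
RISE_PER_BP_NM = 0.34     # assumed rise per base pair in B-DNA, in nanometers

# Length of the double helix if laid end to end, in meters.
length_m = BASE_PAIRS * RISE_PER_BP_NM * 1e-9
print(f"Level 1 length: {length_m:.2f} m")

# Approximate fold reductions quoted in the Key Concepts for Levels 2-5.
FOLD_REDUCTION = {2: 7, 3: 42, 4: 750, 5: 20_000}
for level, fold in FOLD_REDUCTION.items():
    print(f"Level {level}: about {length_m / fold * 1e3:.3g} mm")

CHARS_PER_PAGE = 3_000    # assumed number of characters on a typical printed page
pages = 2 * BASE_PAIRS / CHARS_PER_PAGE   # two deoxynucleotides per base pair
print(f"Pages needed to list every deoxynucleotide: {pages:,.0f}")
```

Under these assumptions the Level 1 length comes out just over 2 meters and the page count in the millions, consistent with the estimates in the text.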

Level 1: DNA Forms an Antiparallel Double Helix

The simplest form of stable DNA in cells, which we will call Level 1, is a double-stranded DNA molecule where the two strands run antiparallel to one another (one strand running 5′ to 3′ lies alongside one running 3′ to 5′) and are held together by hydrogen bonds between oxygen and nitrogen atoms in the complementary bases to form base pairs. The absence of the –OH group on the 2′ carbon of deoxyribose allows the two DNA strands to twist around one another to form a helix (most RNA molecules never form a double-stranded helix). FIGURE 2-12 shows three common representations of double-stranded DNA to highlight different structural features of the molecule. Figure 2-12a is a simple line drawing illustrating how complementary base pairing holds the two strands of DNA together. Figure 2-12b uses a three-dimensional drawing to demonstrate that the hydrogen bonds between complementary bases stabilize the two strands in the double helix.
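Antiparallel pairing has a practical consequence when sequences are written down: because both strands are conventionally written 5′ to 3′, the partner of a strand is its complement read in reverse. A minimal Python sketch, using a hypothetical sequence and a function name of our own choosing:

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def partner_strand(strand_5_to_3: str) -> str:
    """Return the complementary strand, also written 5'->3'.

    Pairing base by base gives the partner in 3'->5' order (antiparallel),
    so the result is reversed to follow the usual 5'->3' writing convention."""
    paired_3_to_5 = "".join(COMPLEMENT[base] for base in strand_5_to_3)
    return paired_3_to_5[::-1]


top = "ATGCCTAG"                  # hypothetical sequence, written 5'->3'
bottom = partner_strand(top)      # partner, written 5'->3'

# Display the duplex with the bottom strand running 3'->5' under the top strand.
print(f"5'-{top}-3'")
print(f"3'-{bottom[::-1]}-5'")
```

Applying the function twice returns the original sequence, which reflects the fact that each strand fully specifies the other.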

FIGURE 2-12 Level 1 of DNA organization is a double-stranded, antiparallel double helix held together by hydrogen bonds between base pairs. (a) A simple line drawing of the double-stranded double helix. A “ribbon-ladder” representation of the double helix is shown for reference. (b) A 3-D drawing showing the spatial arrangement of the nucleotides in a DNA double helix. (c) A space-filling model of the DNA double helix (B form), indicating the location and size of the major and minor grooves. (Photo) © Photodisc.

Practice drawing the ribbon-ladder form of DNA, as shown in Figure 2-12a. When doing so, make sure that the double helix contains two different grooves. The wider of these grooves is called the major groove, while the narrower is called the minor groove. These grooves are important because they form attachment sites for DNA binding proteins. DNA binding proteins often contain finger-like structures that fit into these grooves; these structures allow DNA binding proteins to slide back and forth in the grooves as they search for the specific sequence of deoxynucleotides they are targeting, as shown in Figure 2-3. Note that the twisting of the DNA strands results in a periodicity of approximately 3.4 nm (or 34 Å), with approximately 10.5 base pairs per turn of the helix. This matters because many DNA binding proteins bind to short DNA sequences (six base pairs or fewer). Because these sequences are shorter than a single turn of the helix, they seem more or less linear to DNA binding proteins, which makes them easy to detect.

Figure 2-12c is a space-filling model of the DNA double helix, and it illustrates one final, very important point that we will discuss in greater detail in later chapters. It shows that DNA is composed of a large number of atoms. This is important because changes in even a few atoms result in noticeable variations in the DNA structure. For example, a region of DNA composed of several A–T base pairs will have a slightly different shape than one composed of G–C base pairs, even though both pairs are perfectly aligned. The fact that each base pair imparts its shape to the overall molecule means that two segments of DNA made up of different deoxynucleotide sequences also have slightly different shapes. It is this difference in shape that allows proteins to “know” which regions of DNA to bind to. Stated more simply, every different sequence of deoxyribonucleotides has a unique shape. Proteins that bind to specific sequences of DNA can, therefore, slide along a strand of DNA until they find the exact shape that fits their binding site. Even very minor changes in the atomic structure of deoxynucleotides can have profound effects on overall DNA shape: one common cause of DNA mutation is the loss of a single amino group (–NH2) from the base of a single deoxynucleotide.

DNA Can Be Supercoiled to Form at Least Three Different Structures

Discovering the double-stranded, helical organization of DNA was one of the most significant advances in biology during the twentieth century. A considerable amount of the data used to deduce this structure came from crystals of DNA grown in the lab. These crystals also demonstrated that double-stranded DNA could form at least three different types of double helices. DNA adopts the configuration shown in FIGURE 2-13, called B-DNA, under conditions of high relative humidity (92%). A second configuration, called A-DNA, appears when DNA is crystallized under conditions of lower relative humidity. (The third type of double helix is called Z-DNA.) All three types of double helices have a major and a minor groove. Because the interior of a cell is entirely saturated with water, most of the DNA in a living cell likely adopts a shape very similar to B-DNA. Small regions of Z-DNA have been detected in cells, though it is still unclear what role this conformation plays in chromosome function. Both A-DNA and B-DNA helices are “right-handed,” meaning that if one holds a piece of it in front of one’s eye like a telescope, the helix will appear to turn in a clockwise fashion. Z-DNA is twisted in the opposite (“left-handed”) direction, so it has been suggested that Z-DNA regions may serve to reduce the amount of effort required to unwind B-DNA in areas of the chromosome that are frequently copied during transcription.

FIGURE 2-13 Three different forms of double-helical DNA (L to R: A form, B form, Z form). Note that all three forms have major and minor grooves. Courtesy of Richard Wheeler.

Level 2: DNA Is Bound to a Protein/RNA Scaffold

The next part of the strategy for packaging DNA in cells is to support it with an elaborate infrastructure made of proteins and RNA. The proteins and RNA don’t store any information, but without them, the DNA would be hopelessly tangled and altogether useless. The complex formed by these proteins, RNA, and their associated DNA is called chromatin in the nuclei of eukaryotes, and nucleoid in mitochondria, chloroplasts, prokaryotes, and Archaea. Proteins account for at least 50% of the mass of chromatin and are likely as abundant in the nucleoid.

Double-Stranded DNA Is Wrapped Around Histone Proteins to Form a Small Particle

The best-known set of structural proteins belongs to the histone family. These proteins are found in all organisms and are thought to be some of the earliest proteins to appear during evolution. When associated with DNA, they form spools called nucleosome core particles, similar to those used to store thread, string, wire, and the like. This is Level 2 of DNA organization. In eukaryotic cells, these spools are composed of two copies each of four different histones (named H2A, H2B, H3, and H4). The histones contain many positively charged amino acids, which attract them to the negatively charged backbone of DNA in a way that is largely sequence independent. The double-stranded DNA molecule is wrapped around a histone “spool” approximately 1.7 times (about 147 base pairs). This spool, plus the short stretch (20–50 base pairs) of DNA called linker DNA that lies between spools, is called a nucleosome, as shown in FIGURE 2-14. A linear arrangement of several nucleosomes forms a structure that looks like a string of beads (FIGURE 2-15). The addition of a “linker” histone, either H1 or H5, “pins” the DNA to the core particle, resulting in a structure called a chromatosome (see Figure 2-14). Similar structures are found in prokaryotic cells, where the DNA is wrapped around a different set of histone protein spools. Wrapping DNA in this fashion causes the DNA double-helical strand to become shorter and thicker: the length is reduced by approximately seven-fold, and the width increases from 2 nm to about 11 nm. These “beads-on-a-string” structures also contain proteins in addition to histones in both prokaryotes and eukaryotes.

FIGURE 2-14 Two representations of the chromatosome. The top panel shows the octameric arrangement of histones in a nucleosome core particle, with DNA wrapped around it. Histone H1 pins the DNA to the core particle, forming a chromatosome. The lower panel shows a computer model of DNA wrapped around the nucleosome core particle. Photos courtesy of E. N. Moudrianakis, Johns Hopkins University.

FIGURE 2-15 Nucleosomes are linked by short segments of linker DNA to form a “beads-on-a-string” arrangement. Proteins named switch/sucrose nonfermentable (SWI/SNF) can slide the core particles to widen and narrow the gaps between nucleosomes.

Does wrapping DNA around a spool have any negative consequences? Remember that the goal is to compact DNA without compromising a cell’s ability to access the genetic information. Because so much of the DNA comes into contact with the spool, most of it is inaccessible to other DNA-binding proteins, thereby negating the beneficial effects of these spools. In eukaryotes, DNA can be partially unwrapped from the nucleosome by members of the SWI/SNF family of proteins. These proteins use ATP energy to move the core particle a short distance along the DNA, thereby freeing up any base-pair sequences that may have been buried in the core particle. At least two other families of proteins participate in this type of chromatin remodeling as well; this is illustrated in Figure 2-15.

Histone Modifiers Control the Structure of Nucleosomes Chromatin remodeling, illustrated in FIGURE 2-16, is the term used to describe the process of displacing histones to control access to DNA. In some cases, this means that cells simply slide nucleosomes from one position in chromatin to another, which can change the spacing between nucleosomes. In other cases, entire nucleosomes are temporarily removed from a particular region of DNA. Large ATP-consuming complexes are primarily responsible for performing these remodeling activities. Humans have at least eight different complexes, and the best-known complex is called SWI/SNF.

FIGURE 2-16 Chromatin remodeling is a modification of histones to permit their removal or sliding along a piece of DNA. Opening up gaps between nucleosomes permits RNA polymerases and transcription factors to bind to promoters.

Level 3: DNA Is Twisted to Form Fibers

The beads-on-a-string structure we see in the microscope only appears when we break cells open and spread out their contents. In intact cells, chromatosomes are clustered together in a highly ordered fashion to form a series of similar configurations, all of which are called the 30-nm fiber. (In prokaryotes, similar 40-nm fibers form from clusters of the “beads” in the genophore.) These fibers represent Level 3 of DNA organization. Multiple configurations have the same name because scientists are still not entirely sure how many of the configurations are present in living cells (the electron microscope, which is a common tool for observing chromatin, cannot be used to view living cells). These fibers will form spontaneously in a test tube if the salt concentration of the buffer is kept high enough, demonstrating that no additional proteins or metabolic energy are required.

BOX 2-15 FAQ: WHAT IS THE DIFFERENCE BETWEEN A NUCLEOSOME AND A CHROMATOSOME?

In simple terms, a chromatosome is a nucleosome attached to a linker histone. The important distinction to keep in mind is that all of the mechanisms discussed here about controlling DNA access focus on the shape of the nucleosome, not the chromatosome. This means that loosening of DNA around the spool, sliding DNA over the surface of the spool, and even removing the spool altogether take place when the linker histone is not present, and thus this is all happening to a nucleosome. After the modification has occurred and the linker histone returns to bind the nucleosome, it again becomes a chromatosome. DNA reforms a 30-nm fiber when chromatosomes bind together via their linker histones.

The 30-nm fiber is held together by electrostatic interactions between different histones. For example, a positively charged region on histone H4 binds to a negatively charged region in a histone H2A/H2B complex in another nucleosome, drawing the two nucleosomes together and further compacting the chromosome. In addition, linker histones bind to each other, as well as to other proteins that serve as bridges between chromatosomes. This results in additional shortening and thickening of the chromosome and increases the packing density to about 42 times that of double-stranded DNA alone.

Level 4: DNA Fibers Attach to a Protein-RNA Scaffold

At Level 4 of DNA organization, the 30-nm fibers are attached to a protein-RNA scaffold (also called a matrix) that keeps them organized (FIGURE 2-17). Specifically, the fibers are attached to the scaffold at intervals of 10–30 µm, thereby forming so-called loop domains of approximately 60 kilobases in length. Exactly how these loops are attached to the scaffold is not known, but the attachment occurs at DNA sequences called MARs or SARs (for matrix or scaffold attachment regions, respectively). These sequences typically contain a large number of A–T base pairs but are otherwise not very similar. Many proteins have been isolated from the chromosome scaffold, and the structure is sensitive to enzymes that digest an unknown form of RNA, but it is not yet clear how these molecules assemble to form the mature structure. Loop domains are approximately 750-fold more compact than B-DNA. The prokaryotic chromosome is thought to consist entirely of Level 1–4 structures.

FIGURE 2-17 Level 4 of DNA organization can be seen in this electron micrograph of loop domains projecting outward from the protein/RNA scaffold (dark material at the bottom). Reproduced from Cell, vol. 12, Paulson, J. R., and Laemmli, U. K., The structure of histone..., pp. 817–828. Copyright 1977, with permission from Elsevier. Photo courtesy of Ulrich K. Laemmli, University of Geneva, Switzerland.

Level 5: Chromatin Is Packaged into Highly Condensed Chromosomes

Eukaryotic cells adopt an additional means for organizing DNA that most prokaryotes do not use: they cut their DNA up into several chromosomes. Human beings are considered to be diploid, meaning they contain two copies of each chromosome; these chromosomes range in length from approximately 47 million base pairs to nearly 250 million base pairs. They are organized as two copies each of 22 autosomes and 1 pair of sex chromosomes. We can tell by simply looking in a microscope that chromosomes are very large bundles of DNA with distinctive shapes that change throughout a cell’s life. During mitosis, these chromosomes condense to form their familiar X-shaped structures, and they decondense once mitosis is complete, as shown in FIGURE 2-18. This condensation/decondensation is analogous to packing one’s belongings into a small, compact space (e.g., a suitcase for a trip) and then unpacking once the journey is complete.

FIGURE 2-18 (a) Chromosomes at different levels of DNA compaction. These chromosomes have banded regions containing tightly wound chromatin and loose “puffs” where genes are easier to access. (b) This chromosome is fully condensed, as it would look during mitosis. (a) and (b) © Biophoto Associates/Science Source.

These observations reveal two important things. First, DNA organization is dynamic: chromosomes can be tightly bundled or loosely bundled as necessary during a cell’s lifetime. Second, this implies that there must be some machinery responsible for controlling this bundling. Consider how important this machinery is. During mitosis, a eukaryotic chromosome can be condensed into a structure that is about 15,000 to 20,000 times shorter than its unwound length. Changes in the length of a chromosome result from a complex series of folding and twisting events designed to prevent the individual strands from becoming tangled.
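To get a feel for what these packing ratios mean in physical terms, the sketch below converts a chromosome’s base-pair count into approximate lengths at each level of organization. It relies only on figures quoted in this chapter (3.4 nm and about 10.5 base pairs per helical turn, and the approximate fold-compaction values of 7, 42, 750, and 15,000 to 20,000). The dictionary labels and function names are illustrative, and the results are rough estimates rather than measurements.

```python
# Rough chromosome lengths at each packing level, using figures quoted in this chapter.
HELIX_PITCH_NM = 3.4   # length of one helical turn, in nanometers
BP_PER_TURN = 10.5     # approximate base pairs per helical turn

# Approximate fold-compaction relative to naked double-stranded DNA (values from the text).
PACKING_RATIO = {
    "Level 1 (double helix)": 1,
    "Level 2 (beads on a string)": 7,
    "Level 3 (30-nm fiber)": 42,
    "Level 4 (loop domains)": 750,
    "Mitotic chromosome (lower estimate)": 15_000,
    "Mitotic chromosome (upper estimate)": 20_000,
}


def contour_length_nm(base_pairs):
    """Length of the fully extended double helix, in nanometers."""
    return base_pairs * HELIX_PITCH_NM / BP_PER_TURN


def packed_lengths_um(base_pairs):
    """Approximate length, in micrometers, at each packing level."""
    naked_nm = contour_length_nm(base_pairs)
    return {level: naked_nm / ratio / 1000.0 for level, ratio in PACKING_RATIO.items()}


if __name__ == "__main__":
    # The longest human chromosomes contain nearly 250 million base pairs.
    for level, length_um in packed_lengths_um(250_000_000).items():
        print(f"{level:38s} {length_um:12.1f} µm")
```

Under these assumptions, a chromosome of 250 million base pairs is roughly 8 cm long as a naked double helix but only about 4 to 5 µm long when fully condensed for mitosis.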

Heterochromatin Is a Form of Tightly Packed DNA in Eukaryotic Cells Collectively, Levels 1–4 of DNA organization are called euchromatin in eukaryotes because they all share one important property: they can be easily accessed by proteins responsible for replicating the chromosomes in preparation for cell division or by proteins responsible for reading a strand of DNA to make RNA (i.e., transcription). In other words, DNA sequences organized as a form of euchromatin are easy to use (FIGURE 2-19). This helps explain why prokaryotic cells can alter their gene expression patterns fairly rapidly, compared to most eukaryotes: their entire DNA is easily accessible.

FIGURE 2-19 The five different levels of DNA organization in eukaryotic cells. Note that two types of Level 5 (heterochromatin) are shown: 5A is found in nondividing (interphase) cells, and 5B is found only in cells undergoing cell division (mitosis or meiosis). Photo (a) © Science Source; Photo (b) © Fancy Tapis/Shutterstock; Photo (c) Courtesy of Barbara Hamkalo, University of California, Irvine; Photo (d) Courtesy of Bruno Zimm and Ruth Kavenoff. Used with permission of Georgianna Zimm, University of California, San Diego; Photo (f) Courtesy of the Cell Image Library.

Eukaryotes go even further in compacting their chromosomes. Advancing to Level 5 is a very big step for these cells. Any portion of a chromosome that condenses past the point of loop domains becomes essentially inactive. Unlike the case in prokaryotes, however, a considerable amount of DNA in eukaryotic chromosomes is actually rather useless to a cell. This DNA may contain genes or fractions of genes that were important for our distant ancestors but are no longer useful. Or it may contain genes that are useful only during early embryonic development or in specialized cells (e.g., humans form gill-like structures only during early development; usually only nerve and muscle cells make neurotransmitter receptors while only bone cells make bone proteins). Once a cell commits to a specific developmental fate (e.g., nerve, muscle, or bone), it no longer needs access to at least some of the genes required for other fates (e.g., skin or liver). A nerve cell can afford to “pack up” the portions of its DNA containing instructions necessary for functioning as a liver cell because it does not ever expect to use them.

Cohesins and Condensins Help Control the Packaging State of Chromatin

The additional condensation of DNA is accomplished by twisting loop domains into shorter and thicker filaments. Several different structures are possible, depending on the extent of twisting used. The degree of this compaction has been estimated to be between 250- and 20,000-fold. We will break this range into two parts: those areas of condensed DNA found in cells when they are not actively dividing (a period also called interphase) are Level 5A; more compact chromosomes are required for cells to undergo the mitotic or meiotic phase of cell division, and this extra degree of compaction is Level 5B. Regardless of their size, all of these Level 5 structures are called heterochromatin, both to differentiate them from euchromatin and because they appear as blobs of varying darkness in an electron microscope (FIGURE 2-20). Unlike the condensation of chromatosomes to form 30-nm fibers, this condensation requires additional proteins.

FIGURE 2-20 Heterochromatin appears as dark patches in the nucleus during interphase. This is Level 5A of DNA organization. Photo courtesy of Edmund Puvion, Centre National de la Recherche Scientifique.

Two good examples are those belonging to the SMC (structural maintenance of chromosomes) family of proteins. One group, called condensins, is responsible for general chromosome structure, along with chromosome condensation during the prophase period of cell division (see Prophase Prepares the Cell for Division in Chapter 7). A second group, called cohesins, plays an important role in condensation of yeast chromosomes during cell division and regulates access to genes in virtually all eukaryotes. FIGURE 2-21 shows models of how they might accomplish this condensation. Cohesins bind two strands of DNA together by forming a ring-shaped structure that encloses Level 1 and 2 forms of DNA, and condensins are thought to condense DNA by forming similar ring-shaped structures enclosing Level 3 loops of DNA. While it is known that DNA condensation requires ATP energy, it is still not entirely clear exactly how that ATP is used. It is possible that these proteins hydrolyze ATP to gather the DNA into these rings.

FIGURE 2-21 Cohesins and condensins control the spatial arrangement of chromatin.

RNAs Play a Critical Role in Compacting Chromatin

Unfortunately, our understanding of how chromosomes are compacted is hampered by the same teamwork between molecules that forms our first principle of cell biology (discussed in Chapter 1). As chromosomes condense, the molecular teams responsible for regulating this process grow in both size and complexity. A very common method for deciphering how a molecular team works is to break it into its constituent parts and then reassemble it to figure out how each part interacts with the others. For condensed chromosomes, even breaking the scaffold into simpler parts is very difficult: it is resistant to most chemicals and remains largely intact even when almost everything else in a cell is broken apart. What we have learned so far is that a significant portion of the nuclear scaffold is made up of RNA molecules and that most of these belong to the seventh class of RNAs discussed in Section 2.2. These RNAs play two important roles in forming the scaffold and controlling the compaction of chromosomes. First, they fold back on themselves to form double-stranded RNA “stem-loop” structures that are stabilized by hydrogen bonds between bases. This makes them unexpectedly resistant to physical or chemical disruption; the stem-loop regions bind directly to proteins within the scaffold and help hold them in place. Second, these RNAs serve as binding partners for additional RNAs and proteins that trigger DNA condensation. One important class of these proteins is histone-modifying proteins (discussed in Section 2.4); the modifications to the histones, in turn, control the degree of compaction of the scaffold and thereby help determine whether a region of DNA is accessible for gene transcription. Other proteins link the scaffold to the inner surface of the nucleus, helping to stabilize it.

One of the best-studied nuclear scaffold RNAs is called X-inactive-specific transcript (Xist) (see Box 2-16). Xist is both necessary and sufficient to compress an X chromosome into a permanently inactive structure (often abbreviated as Xi) called a Barr body. Female mammals contain two copies of the X chromosome, and it is important that one of these be inactivated in every cell soon after fertilization; failure to silence one of the X chromosomes results in overexpression of sex-specific genes and failure to complete embryonic development. Xist is the consummate team player: scientists estimate it has at least thirty protein binding partners, forming hundreds of potential combinations (i.e., different teams).

BOX 2-16 CASE STUDY: DUCHENNE MUSCULAR DYSTROPHY IN A FEMALE PATIENT

Muscular dystrophies are a group of disorders that share clinical characteristics of progressive muscular weakness. Duchenne muscular dystrophy (DMD) is the most common type of muscular dystrophy. It includes delays in muscle development, wheelchair confinement, and cardiac and respiratory problems that prematurely end the patient’s life. DMD is an X chromosome–linked recessive disorder. Because girls inherit one X chromosome from each parent, they nearly always have at least one healthy X chromosome and therefore rarely exhibit DMD symptoms. Individuals who carry one healthy and one mutated version of the same gene are called carriers because they can transmit the mutated gene to their children without being affected themselves. If boys inherit a mutated X chromosome from their mother, they must develop DMD because they lack a healthy X chromosome to compensate for the mutated gene; boys cannot be carriers of X chromosome–linked disorders.

In this clinical case, we have a family of phenotypically normal parents and their three children: one son and a set of female identical twins. Identical twins arise from a single fertilized egg and therefore have identical genes. All three children appeared healthy from birth to adolescence. At age 16, one of the girls began displaying the classic clinical symptoms for DMD: muscle weakness and frequent tripping and falling while walking or running. The pediatrician did not suspect DMD since the patient was a girl. As the disease progressed, the affected sister was tested, and the diagnosis was confirmed as DMD. Genetic analysis of blood samples and skin biopsies confirmed that both girls were carriers for DMD. Everyone was puzzled by the test results. Why was only one sister affected by the mutation?

The most likely reason for this outcome is that X chromosome inactivation differed between the two girls. During early embryonic development, Xist RNA randomly inactivated one of the X chromosomes in each cell to form condensed, inactive Barr bodies in both girls. Note that either the functioning or the mutated X chromosome had an equal chance of being inactivated in each cell. In one girl, the mutated X chromosome was inactivated in most of her cells, while the opposite occurred in her sister. Determining what causes Xist RNA to select which chromosome to silence and which to leave unaffected remains a significant challenge. Despite recent progress, how Xist RNA localizes and interacts with the X chromosome is still not completely understood.
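The role of chance in this case can be illustrated with a simple probability calculation. The sketch below assumes, purely for illustration, that each of a small number of early embryonic cells independently silences one X chromosome with equal probability, and it computes how likely it is that most of those cells silence the healthy X. The number of founder cells and the threshold are hypothetical values chosen for the example, not measurements from this patient or from the literature.

```python
from math import comb


def prob_skewed(n_cells, min_healthy_silenced, p=0.5):
    """Probability that at least `min_healthy_silenced` of `n_cells` silence the healthy X.

    Assumes each cell chooses independently, with probability `p` of
    silencing the healthy X chromosome (a simple binomial model).
    """
    return sum(
        comb(n_cells, k) * p**k * (1 - p) ** (n_cells - k)
        for k in range(min_healthy_silenced, n_cells + 1)
    )


if __name__ == "__main__":
    # Hypothetical example: 10 founder cells, at least 8 of which silence the healthy X.
    print(f"{prob_skewed(10, 8):.4f}")
```

In this toy model, a strongly skewed outcome is uncommon but far from impossible (roughly 5% for the values shown), which fits the idea that two genetically identical sisters can end up with very different patterns of X chromosome inactivation.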

Study Questions
1. If we assume that the DMD mutation did not spontaneously occur in both twins, explain why the mother, but not the father, must be a carrier of the DMD mutation.
2. In addition to being coated with Xist RNA, what other changes occur on the inactivated X chromosome to ensure it remains compact and transcriptionally inactive? Search the internet for additional clues.
3. Given your answer to question 2, what are the challenges facing scientists who might try to develop a means of reversing X chromosome inactivation in DMD patients?

Despite this tremendous variation, scientists have arrived at some consensus as to how Xist functions. The Xist gene is on the X chromosome, and Xist transcripts bind to a protein in the X chromosome scaffold called scaffold attachment factor A (SAF-A), forming a “coat” that spreads out from the Xist gene until the entire X chromosome is covered. After SAF-A binds to Xist, SAF-A changes shape to attract additional proteins that further compact the chromosome.

Concept Check 2 Apply the five levels of DNA organization to the Library of Congress analogy: What would be reasonable approximations of these levels for books on a shelf? What happens in a library that distinguishes Level 4 and Level 5? What parts of the DNA organization do not readily fit into this analogy?

▶ 2.4 Cells Chemically Modify DNA and Its Scaffold to Control Packaging

Key Concepts
■ In addition to changing the physical organization of DNA, cells control DNA packing by chemically modifying DNA and the proteins in the DNA scaffold, including chromatin.
■ The best-known chemical modifications target Level 1 and 2 DNA organization.
■ Modifications of Level 1 and 2 DNA organizations can impact higher-level organization as well, including silencing of DNA in regions of heterochromatin.
■ Level 1 chemical modifications occur directly on the DNA double helix; the most common modification is the addition of a methyl group to deoxycytosines, which suppresses transcription.
■ Level 2 modifications occur primarily on histones and can affect chromatin condensation, gene transcription, and nucleosome assembly.

DNA and the proteins responsible for packaging it into Levels 1–5 of organization can be chemically modified to change their structure and function (see Cells Chemically Modify Proteins to Control Their Shape and Function in Chapter 3). This is an important way for cells to more carefully control which regions of a chromosome are available for sharing information and which are effectively closed off. (In our Library of Congress analogy, this is equivalent to prominently displaying some texts while placing others in storage.)

Chemical Modifications at Level 1 and Level 2 Can Affect DNA Packing Across All Levels of DNA Organization One of the most well-studied effects of these modifications is the formation of heterochromatin (Level 5 packaging) and subsequent repression of gene expression, often called gene silencing. This can occur by at least two mechanisms. In the first mechanism, shown in FIGURE 2-22, methyl groups are added directly to the bases adenine (in prokaryotes) or cytosine and guanine (in eukaryotes) in the DNA, a Level 1 process called DNA methylation. In mammals, these modifications occur most commonly on deoxycytosines that are adjacent to deoxyguanosines in a DNA strand. This is often abbreviated as CpG, with the p representing the phosphodiester bond holding the two deoxynucleotides together. (Regions of DNA that have a high proportion of CpG sequences are often called CpG islands.) DNA methylation silences gene expression because it directly prevents the binding of proteins that are required for transcription to take place (see Transcription Factors Promote the Expression of Genes in Chapter 12).
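The CpG notation above can be made concrete with a short Python sketch. It scans one strand, written 5′ to 3′, for a C immediately followed by a G and reports where those candidate methylation sites fall. The function names and the example sequence are made up for illustration, and the code says nothing about whether any given site is actually methylated in a cell.

```python
def cpg_sites(sequence):
    """Return 0-based positions of each C that is immediately followed by a G."""
    seq = sequence.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def cpg_fraction(sequence):
    """Fraction of adjacent base pairs (dinucleotides) along the strand that are CpG."""
    dinucleotides = len(sequence) - 1
    if dinucleotides <= 0:
        return 0.0
    return len(cpg_sites(sequence)) / dinucleotides


if __name__ == "__main__":
    example = "ACGTCGCGAATTCCGG"  # hypothetical sequence, not taken from a real genome
    print(cpg_sites(example))               # positions of candidate methylation sites
    print(round(cpg_fraction(example), 3))  # a crude measure of CpG richness
```

A region in which this fraction is unusually high compared with the rest of the genome would be a candidate CpG island; the cutoff used to make that call is a separate choice that this sketch does not attempt to define.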

FIGURE 2-22 Methylation of DNA triggers gene silencing.

A second common way to silence genes is to modify histone proteins in the nucleosome. A variety of proteins are capable of attaching relatively small molecules (methyl groups, acetyl groups, or phosphate groups) to the tails of histones, which causes them to change their shape, as shown in FIGURE 2-23. This change in shape alters the function of the proteins as well (see Cells Chemically Modify Proteins to Control Their Shape and Function in Chapter 3). FIGURE 2-24 shows an example of how histone modification can silence genes. An enzyme called histone deacetylase (HDAC) removes an acetyl group from histone H3; histone acetylation often activates transcription, so removal of the acetyl group partially inhibits transcription. Following this, a protein called histone methyltransferase attaches a methyl group to the ninth amino acid (a lysine, abbreviated K) on histone H3 (this is sometimes abbreviated as H3K9me). Finally, a third protein (the names vary in different species; in mammals, it is called heterochromatin protein 1, or HP1) attaches to the newly methylated histone H3. When this histone is deacetylated and methylated, its shape changes. This triggers a conformational change in the entire core particle, such that the DNA attached to the core particle can no longer be transcribed; it is now silenced. As additional nucleosomes next to the newly silenced section undergo the same modifications, the silencing can extend to larger stretches of DNA.
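Shorthand such as H3K9me packs four pieces of information into a few characters: the histone, the amino acid, its position, and the modification. The following sketch unpacks that shorthand with a regular expression. The list of accepted histones and the suffixes “me”, “ac”, and “ph” (for the methyl, acetyl, and phosphate groups mentioned above) are an assumed, non-exhaustive set chosen for illustration, and the function name is hypothetical.

```python
import re

# Illustrative pattern: histone name, one-letter amino acid code, position, modification.
# The histones and suffixes listed here are an assumed, non-exhaustive set.
MARK_PATTERN = re.compile(r"^(H1|H2A|H2B|H3|H4)([A-Z])(\d+)(me[1-3]?|ac|ph)$")

MODIFICATION_NAMES = {
    "me": "methylation",
    "ac": "acetylation",
    "ph": "phosphorylation",
}


def parse_histone_mark(mark):
    """Split a mark such as 'H3K9me' into its histone, residue, position, and modification."""
    match = MARK_PATTERN.match(mark)
    if match is None:
        raise ValueError(f"unrecognized histone mark: {mark!r}")
    histone, residue, position, modification = match.groups()
    return {
        "histone": histone,
        "residue": residue,            # one-letter amino acid code, e.g. K for lysine
        "position": int(position),     # counted from the start of the histone protein
        "modification": MODIFICATION_NAMES[modification[:2]],
    }


if __name__ == "__main__":
    print(parse_histone_mark("H3K9me"))
```

For H3K9me, the parser reports histone H3, residue K (lysine), position 9, and methylation, which matches the description in the text.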

FIGURE 2-23 Chemical modification of histone tails changes the shape of chromatosomes.

FIGURE 2-24 Histone modification can silence DNA to form heterochromatin.

Modification of a chromatosome can help ensure that a region of DNA is not accessible. FIGURE 2-25 shows an example in which a protein called Rap1 binds to a portion of DNA. The resulting change in Rap1’s shape allows two additional proteins named Sir3 and Sir4 to attach to it, and they in turn bind to histones H3 and H4. This creates a change in the shape of the core particle and facilitates the binding of additional Sir3/Sir4 complexes to adjacent nucleosomes. Note that in both strategies, changing the structure of the nucleosome core particle by altering the configuration of histones is a key step.

FIGURE 2-25 Rap1, Sir3, and Sir4 can silence DNA to form heterochromatin.

Some Regions of Eukaryotic Chromosomes Are Always Silenced

In eukaryotes, portions of the chromosomes are never active and are, therefore, called constitutive heterochromatin (in biology, constitutive means “constantly produced”). A minimal amount of transcription, such as for the siRNAs mentioned previously, takes place in these regions, but none of the resulting RNA transcripts is ever translated. Instead, these regions play important roles in maintaining the structure and organization of a chromosome during mitosis. For example, the centromere region of the chromosome is essential for proper attachment of the chromosome to the microtubule spindle during mitosis, and the telomere regions protect the ends of the chromosome from damage. Both of these regions bind to many different proteins but only unwind completely during DNA replication. They are discussed in greater detail in Chapter 7.

Recently, scientists discovered another ATP-dependent group of proteins that play a crucial role in chromosome condensation prior to mitosis. These proteins are responsible for reading highly repetitive sequences of DNA to synthesize siRNAs. This finding came as something of a surprise because until then most researchers believed that siRNAs only controlled gene transcription. While the exact mechanism has yet to be determined, the siRNA-forming proteins recruit and bind to still another group of proteins that modify histones. This, too, is an essential step in heterochromatin formation.

Concept Check 3 This section discusses altering the physical form of DNA as a means of controlling access to genetic information. A common expression is that humans are currently living in an age of information overload. Are these two ideas at all related? Can cells suffer from information overload? What would be the consequences if they did? How do cells control access to this information without being overwhelmed? Use the vocabulary in this section to answer these questions.

▶ 2.5 Chapter Summary To remain alive, cells must do two things: respond appropriately to external signals and internal programs and maintain their internal environment. Nearly all of the molecules responsible for these activities are either RNA or proteins, and the instructions for creating these molecules are stored in a relatively simple polymer, DNA. Each time a cell divides, the daughter cells inherit a copy of the parental cell DNA, which is slightly modified in each successive generation of cells by mistakes (mutations) made during the replication process. The instructions in DNA are organized into units called genes; cells read these genes to produce several types of RNAs (by transcription) and proteins (by translation of messenger RNA). Mutated genes produce altered RNAs and proteins when they are transcribed and translated, and it is these differences in RNAs and proteins that yield the variation in cellular phenotype that is acted upon by natural selection in each generation of cells and organisms. DNA is, therefore, the heritable material acted on by evolution. Because DNA encodes the instructions for producing all of the RNAs and proteins a cell will require during its lifetime (and may also encode additional unused genes), it is typically an enormous molecule relative to the cell that harbors it. The complete DNA molecule, called a chromosome, is made up of combinations of four subunits, called deoxyribonucleotides, which form two antiparallel strands held together by hydrogen bonds. One of these two strands contains the coding sequence of a gene. To ensure that the genes can be easily accessed while also compacting them enough to fit into a cell, DNA is supported by an elaborate protein/RNA scaffold, called genophore in prokaryotes and chromatin in eukaryotes. Histone proteins are at the heart of these scaffolds, and DNA wraps around “spools” of histones.

Chemical modifications of histones and DNA bases play an important role in controlling which sections of DNA molecules are read by the transcription machinery. In some eukaryotes, portions of the chromosomes are condensed and modified so that they are not capable of undergoing transcription. These regions are called heterochromatin, to distinguish them from the transcriptionally accessible regions called euchromatin. Prokaryotes do not form heterochromatin.

Chapter Study Questions
1. How many phosphoester bonds, phosphodiester bonds, and hydrogen bonds are shown in Figure 2-12a? Do not count bonds formed at the ends of the strands with atoms not shown.
2. Explain how mutations in a cell’s DNA can impact the proteins made by that cell’s progeny (i.e., daughter cells).
3. Propose a model for how/why DNA binding proteins can “read” DNA sequences in double-stranded DNA.
4. During heterochromatin formation, how is DNA “compaction” different from DNA “silencing”?
5. Briefly describe the functions of the protein scaffold in DNA organization.
6. Why is it important that the two strands of the DNA double helix are held together by hydrogen bonds instead of (much stronger) covalent bonds?
7. Have a careful look at the structure of the core particle in Figures 2-14, 2-15, and 2-16. Based on our discussion of the function of this structure, explain three advantages provided by its molecular composition and shape.
8. If the phenotype of the cell is a product of the genes it expresses at a given time, why don’t cells simply use a binary, on-off mechanism for controlling gene expression? In other words, why do eukaryotic cells use at least five levels of DNA organization instead of two?
9. Explain why DNA must always be synthesized in the 5′ to 3′ direction. Why is this important for current DNA sequencing technologies?
10. Based on your understanding of DNA structure, function, and replication, propose an explanation for why genomes of organisms tend to grow in size rather than shrink or remain constant over evolutionary time. How do organisms manage the useless genes they inherit from their ancestors?

Multiple-Choice Questions
1. What is a gene?
A. A sequence of nucleotides that wraps around a histone “spool.”
B. A sequence of nucleotides that includes the coding sequence for an RNA molecule, plus the sequences that control the timing and amount of RNAs generated from the coding sequence.
C. A sequence of nucleotides that encodes all of the polypeptides necessary to form a protein with quaternary structure.
D. A sequence of nucleotides that binds to proteins.
E. A sequence of nucleotides that encodes the alpha helix portion of each polypeptide.

2. A common assay (method) for isolating DNA binding proteins from cells is called chromatin immunoprecipitation (ChIP). The product of this experimental method is often a list of proteins and the nucleotide sequences (200–1,000 base pairs) attached to them. Which one of the following questions could be answered using a ChIP assay?
A. How many cells are in this sample?
B. How many nucleosomes does this sample of DNA have in it?
C. How big/long are the chromosomes in this sample?
D. Which DNA binding proteins bind most tightly to DNA?
E. Which nucleotide sequences do these proteins bind to?

3. Which statement best explains how natural selection promotes changes in DNA sequences in successive generations of cells?
A. Natural selection increases the mutation rate in cells, thereby increasing the genetic diversity of each successive cell generation. This diversity is essential for ensuring that the optimal cell phenotype will emerge in each generation. Therefore, natural selection directly alters DNA sequences.
B. Natural selection determines the size and shape of histone proteins. These, in turn, determine the shape of nucleosomes and have a direct impact on the packing state of DNA in cells. When histone composition of nucleosomes changes, the DNA sequences bound to these nucleosomes change as well.
C. Natural selection increases the rate of histone modification, thereby giving rise to a more diverse range of DNA packing in cells. This increased diversity, in turn, increases the chances that DNA will be damaged during replication, giving rise to changes in DNA sequences in successive generations of cells.
D. Natural selection increases the probability of survival in cells that express proteins best suited to allow cells to prosper in their environment. Because a cell’s external environment is never static, natural selection favors slightly different cells in successive generations. One of the major factors determining the phenotype of cells is the structure and function of the proteins they express, and these are directly influenced by the sequence of DNA encoding their genes.
E. Natural selection reflects a cell’s relative ability to overcome deleterious mutations by repairing them and allowing cells to reproduce. Thus, natural selection promotes changes in the DNA in successive generations of cells by favoring cells with the best DNA repair mechanisms; eventually, future generations of cells will be able to repair virtually all DNA damage.

4. Assume you analyze the hemoglobin gene of a sickle-cell patient and an unaffected (control) individual and find that both have single base-pair mutations in this gene. Which statement best explains why the mutation in the control individual is not causing sickle-cell disease?
A. The mutation in the sickle-cell patient occurs in a sequence of DNA that encodes the amino acid sequence of the hemoglobin protein; the mutation in the unaffected individual occurs in a region of the hemoglobin gene that does not.
B. The mutation in the sickle-cell patient was inherited from her parents; the mutation in the unaffected individual arose spontaneously when the individual was a child.
C. The mutation in the sickle-cell patient is in a region of DNA wrapped around a nucleosome core particle; the mutation in the unaffected individual is not.
D. The mutation in the sickle-cell patient is caused by deletion of a single nucleotide in the gene sequence; the mutation in the unaffected individual is caused by the addition of a single nucleotide in the gene sequence.
E. The mutation in the sickle-cell patient is found in euchromatin; the mutation in the unaffected individual is found in heterochromatin.

5. All nucleotides are composed of ________.
A. a base, DNA, and amino acids
B. a base, a sugar, and a scaffold
C. a base, a sugar, and a phosphate group
D. mRNA, rRNA, and tRNA
E. mRNA, rRNA, tRNA, siRNA, and miRNA

6. Choose the best answer that differentiates a nucleosome from a chromatosome.
A. A nucleosome is located only in the nucleus; a chromatosome is located in the nucleus, mitochondria, and chloroplasts.
B. A nucleosome consists of 167 base pairs of double-stranded DNA wrapped around eight histones; a chromatosome consists of a nucleosome plus linker DNA and a linker histone.
C. A chromatosome is identical to a nucleosome, except a chromatosome lacks a linker histone.
D. A nucleosome is a 167-base-pair-long double-stranded DNA with a linker histone.
E. A and B are both correct.

7. In double-stranded DNA, which kinds of bonds hold one complementary strand to the other?
A. Hydrogen bonds
B. Covalent bonds
C. Phosphodiester bonds
D. Ionic bonds
E. Hydrophobic and hydrophilic bonds

8. Which one of the following statements is true?
A. The deoxynucleotides comprising both strands in DNA are oriented unidirectionally, in a parallel and complementary fashion, with deoxynucleotides on opposite strands bound by double or triple hydrogen bonds.
B. The building blocks of DNA are called deoxynucleotides because they lack oxygen on their 5′ end when compared to building blocks of RNA.
C. DNA is composed of two strands of amino acid monomers organized antiparallel to each other, with one strand oriented 3′ to 5′, while the other strand is oriented 5′ to 3′.
D. DNA is a macromolecule made up of a nitrogenous base and a phosphate backbone, held together by hydrogen bonds between deoxyribose sugars.
E. None of the above statements is correct.

9. During dinucleotide polymerization in DNA, RNA, and oligonucleotides, covalent bonds are formed between neighboring nucleotides. Which carbons form these bonds?
A. 5′ carbon on the sugars and 3′ carbon on the nitrogenous bases
B. 1′ carbon and 2′ carbon on the sugars
C. 3′ carbon and 5′ carbon of the sugars, linked via a phosphate group
D. 3′ carbon and 5′ carbon of the sugars in DNA and 2′ carbon and 5′ carbon of the sugars in RNA
E. 3′ carbon of the sugars and a phosphorus atom

10. Under conditions of high relative humidity, DNA adopts the _____ configuration.
A. Z
B. 3′
C. B
D. 5′
E. dideoxy


CHAPTER 3

Proteins Are the Engines of Evolution
Structure and Functions of Polypeptides

CHAPTER OUTLINE
3.1 The Big Picture
3.2 Amino Acids Form Linear Polymers
3.3 Protein Structure Is Classified into Four Categories
3.4 Changing Protein Shape and Protein Function
3.5 Where Do Proteins Go to Die?
3.6 Chapter Summary

▶ 3.1 The Big Picture
The goal of this chapter is to introduce the general structure and function of proteins. We will begin at the most basic level and work our way up in complexity. Like Chapter 2, this chapter contains four main concepts:
■ The tremendous diversity of protein structure and function arises from simple polymers of amino acids. This is derived from the relatively simple chemistry used to create these polymers. Similar to our discussions of polysaccharides (Chapter 1), nucleic acids (Chapter 2), and lipids (Chapter 4), our strategy for understanding proteins begins by understanding how their fundamental building blocks are assembled. Learning the structure of amino acids and the peptide bonds holding them together is the primary objective of the first section. Heavy emphasis is placed on memorization of these structures and definitions because they apply to all proteins.
■ Proteins are dynamic structures that change shape. There are thousands of different proteins in a typical cell, making it virtually impossible to come up with unique descriptions for every one of these shapes. Instead, the spatial arrangements have been organized into categories, numbered 1 through 4 in increasing complexity, to give us a vocabulary for describing these changes in shape.
■ Proteins are only useful to cells if they change shape in predictable ways. Here we describe two general ways for inducing these shape changes. Because we cannot describe every type of shape change and function in proteins, we will classify proteins into categories according to the types of functions they perform.

■ Proteins have a limited lifetime. Here we address a fairly straightforward problem confronting all cells: What happens to proteins when they wear out? We’ll begin our discussion of the answer in this chapter and pick it up again in later chapters as we learn more about the inner workings of cells.
An important goal of this chapter is to permit students to understand the fundamental structure of all proteins and be able to describe the general spatial arrangement of every protein described in this book. All subsequent chapters assume this knowledge; pay close attention to the new vocabulary introduced here, and practice using it. One easy way to do this is to pick out a few proteins from the later chapters, especially those that appear in figures, and describe their structural organization with the vocabulary introduced in this chapter. At this stage, we’ll focus primarily on structure and address the function of proteins as they appear throughout the text.
We will also see numerous opportunities to apply cell biology principle 3: proteins are the engines of evolution. In Chapter 1, we emphasized that cells use the genetic code to translate the information stored in DNA into actions performed by proteins. Here, we will set the stage for how this happens by examining how the chemistry of proteins permits them to form a multitude of interactions with other molecules. It is these interactions that define protein function. As the diversity of DNA sequences increased over evolutionary history, so too did the variety of functions performed by the proteins they encoded, and natural selection favors some functions over others. Simply put, without proteins, evolution cannot occur.

CELL BIOLOGY PRINCIPLE 3

Proteins are the engines of evolution.

Photo courtesy of Andrew S. Mount, PhD; Jonathan Stewart; Nichole Hickman; Michael Groce; Okeanos Research Lab, Clemson University.

▶ 3.2 Amino Acids Form Linear Polymers
Key Concepts
■ Proteins are composed of polypeptides, which are linear polymers made up of twenty different amino acids.
■ Amino acids in polypeptides are linked together by peptide bonds.
■ Amino acids have a characteristic structural polarity that is reflected in the polarity of polypeptides.
■ In many cases, several polypeptides must join together to form a functional protein.
■ All proteins exhibit three characteristic traits: (1) they adopt at least two stable three-dimensional shapes, (2) they bind to at least one molecular target, and (3) they perform at least one cellular function.

Amino acids are the building blocks of proteins, analogous to the nucleotide building blocks of nucleic acids. Each amino acid contains a central carbon atom, called the α-carbon, attached to four different molecular structures. Two of these are functional groups (an amino group and a carboxylic acid group, hence the term amino acid), and one is a single hydrogen atom (see Box 3-1). The fourth structure, often called an amino acid side chain (commonly abbreviated R or R group), differs in each different amino acid. Proteins are constructed from twenty different amino acids. These twenty amino acids are classified into three or four groups, according to the chemical nature of their side chains, as shown in FIGURE 3-1.

BOX 3-1 TIP It may be helpful to review the structures of functional groups in Table 1-1 in Chapter 1.

FIGURE 3-1 (a)–(d) The twenty most common amino acids are classified into three classes based on the structure of their side chains. Examples of chemically modified side chains are shown in (e). The group containing the largest number of amino acids (with nine) is called the nonpolar (or hydrophobic) group. Note that these amino acid side chains are composed almost entirely of carbon and hydrogen atoms; the sulfur atom in methionine and the nitrogen atom in tryptophan do not impart enough polarity to attract water. Six amino acids belong to the polar group, and each of the polar side chains contains a polar functional group (hydroxyl, sulfhydryl, or amide). The ionic group of five amino acids is sometimes separated into two subgroups: two containing a negatively charged carboxylic acid functional group (acidic subgroup) and three containing a positively charged nitrogen-based functional group (amine or imine; the basic subgroup). All five of these side chains contain one charged atom (O– or N+).
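The grouping described in the Figure 3-1 caption can be written down as a small lookup table. The sketch below is illustrative only: the caption gives the group sizes (nine, six, and five) and names methionine and tryptophan as nonpolar, but the assignment of every other residue is an assumption based on the common textbook grouping, and the names SIDE_CHAIN_CLASSES, side_chain_class, and class_sizes are hypothetical.

```python
# Side-chain classes as summarized in the Figure 3-1 caption.
# ASSUMPTION: the per-residue assignment follows the common textbook grouping;
# the caption itself only states the group sizes and a few examples.
SIDE_CHAIN_CLASSES = {
    "nonpolar": ["Gly", "Ala", "Val", "Leu", "Ile", "Met", "Phe", "Trp", "Pro"],
    "polar": ["Ser", "Thr", "Tyr", "Cys", "Asn", "Gln"],  # hydroxyl, sulfhydryl, amide
    "ionic_acidic": ["Asp", "Glu"],                       # negatively charged
    "ionic_basic": ["Lys", "Arg", "His"],                 # positively charged
}


def side_chain_class(residue):
    """Return the class name for a three-letter amino acid code."""
    for class_name, members in SIDE_CHAIN_CLASSES.items():
        if residue in members:
            return class_name
    raise ValueError(f"unknown amino acid code: {residue!r}")


def class_sizes():
    """Return the number of amino acids in each class."""
    return {name: len(members) for name, members in SIDE_CHAIN_CLASSES.items()}


if __name__ == "__main__":
    print(class_sizes())
    print(side_chain_class("Met"))
```

Keeping the acidic and basic subgroups as separate keys mirrors the caption's remark that the ionic group is sometimes split in two, which is why the amino acids can be counted as either three or four groups.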

A Peptide Bond Joins Two Amino Acids Together
Proteins are composed of long, linear sequences of amino acids. These sequences are created by forming a covalent bond between the carboxylic acid group of one amino acid and the amino group of another. One good way to understand how this happens is to practice drawing it, as shown in FIGURE 3-2. Start by drawing the α-carbon of one amino acid, and then attach the four groups to it. In most drawings such as this one, the amino group is placed on the left side, the carboxylic acid group on the right side, a hydrogen atom either above or below, and the side chain opposite the hydrogen. In this case, it doesn’t matter which specific amino acid we are drawing, so we can use the abbreviation R to indicate that a side chain is present. Next, draw another amino acid next to the first and in the same orientation. Be sure that the carboxylic acid group of the amino acid on the left is near the amino group of the amino acid on the right. This placement is important because it makes it easier to see how these two functional groups undergo a chemical reaction to form a covalent bond between the carbon atom of the carboxylic acid group and the nitrogen atom of the amino group. This reaction, which results in the creation of a water molecule as a byproduct, is called a dehydration reaction (because water has been removed from the initial reactants). The resulting bond is called a peptide bond. Finally, add a box outlining the amide plane, as shown.
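Because each peptide bond releases one water molecule, the bookkeeping of the dehydration reaction can be checked with simple arithmetic. The following is a minimal sketch, not a full mass calculator: the molar masses are approximate values for the free amino acids (an assumption supplied for illustration, not taken from this chapter), only three residues are included, and the function names are hypothetical.

```python
# Approximate molar masses in g/mol (ASSUMED illustrative values).
WATER_MASS = 18.015
FREE_AMINO_ACID_MASS = {
    "Gly": 75.07,
    "Ala": 89.09,
    "Ser": 105.09,
}


def count_peptide_bonds(residues):
    """A linear chain of n amino acids contains n - 1 peptide bonds."""
    return max(len(residues) - 1, 0)


def linear_peptide_mass(residues):
    """Sum the free amino acid masses, then subtract one water per peptide bond."""
    total = sum(FREE_AMINO_ACID_MASS[r] for r in residues)
    return total - count_peptide_bonds(residues) * WATER_MASS


if __name__ == "__main__":
    dipeptide = ["Gly", "Ala"]  # written from amino end to carboxy end
    print(count_peptide_bonds(dipeptide), "peptide bond")
    print(round(linear_peptide_mass(dipeptide), 3), "g/mol")
```

The dipeptide weighs less than its two free amino acids combined by exactly one water molecule, which is the "dehydration" in the reaction's name.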

FIGURE 3-2 Creating a peptide bond between two amino acids. (a) A simplified depiction of amino acids to illustrate the formation of a peptide bond. (b) A three-dimensional representation of a peptide bond, demonstrating its planar structure. (c) A playing card analogy for polypeptides. Each card represents the plane of a peptide bond, and the strings linking them represent the alpha carbons. Adapted from D. Voet and J. G. Voet. Biochemistry, Third edition. John Wiley & Sons, Ltd., 2005. Original figure adapted from R. E. Marsh and J. Donohue, Adv Protein Chem. 22 (1967): 235–256.

A Peptide Bond Forms a Rigid, Planar Structure
Peptide bonds have three features that make them especially important in the structure of proteins:
■ Because the peptide bond’s carbon atom (labeled Co) is bound to three other atoms (α-carbon [labeled Cα], oxygen, and nitrogen), these atoms are arranged in a trigonal, planar configuration. They are enclosed in the yellow-shaded amide plane in Figure 3-2.
■ The bond distance between the Co carbon and nitrogen atoms is approximately 10% shorter than a typical carbon–nitrogen bond. As a result, this bond has a considerable double-bond character, and the nitrogen bonds are also forced into a trigonal, planar configuration. This is also shown in the amide plane in Figure 3-2. Because these two trigonal configurations overlap, all four affected atoms lie in the amide plane. This means that the bonds linking amino acids are not flexible. Keep in mind that each α-carbon is bound to four atoms, so the peptide bonds formed on either side of Cα are arranged in a tetrahedral fashion (see Figure 1-5).

■ Although the bonds linking amino acids are not flexible, the α-carbons can rotate around these bonds. This is known as torsional rotation. This rotation is very important because it allows a linear chain of amino acids to form many different shapes without compromising the planar structure of the peptide bonds. The drawings in panel (b) of Figure 3-2 show two such orientations. Note how different the two shapes are, even though the amide plane is in the same orientation. Viewed another way, if one fixed the R groups in place, the planes of the peptide bonds could rotate from one amino acid to the next. The point here is that the R groups of amino acids can be located in a great variety of locations with respect to the peptide bonds, and vice versa. Panel (c) of Figure 3-2 shows a visual analogy of how the peptide bond planes can rotate around the α-carbons, showing why strings of amino acids can adopt so many different shapes. This also permits linear sequences of amino acids to adopt characteristic shapes independent of the R groups in the sequences.
■ Atoms in peptide bonds are capable of forming hydrogen bonds. Notice that each peptide bond contains an oxygen atom, as well as a hydrogen atom, bound to a nitrogen atom. These atoms can form hydrogen bonds with other molecules, or even with other amino acids in the same protein. This is a very common way to stabilize the structure of proteins. Note that adjacent C=O and H–N groups in the same peptide bond do not form hydrogen bonds with one another because they are facing in opposite directions. FIGURE 3-3 shows many different hydrogen bonds that can be formed between amino acids.

FIGURE 3-3 Some typical hydrogen bonds in proteins. The amino acid residue that supplies the hydrogen atom is designated the donor, and the residue that binds the hydrogen atom is called the acceptor.

The Amino Acid Side Chain Does Not Participate in the Formation of a Peptide Bond
All twenty amino acids can form peptide bonds with one another, regardless of the side chain each contains. The reason for this is fairly simple: the peptide bond is formed between an amino group (or imino group: see proline in Figure 3-1) and a carboxylic acid group (which all twenty amino acids have, in the same orientation), but the side chain is not involved. Because each amino acid has only one amino group and one carboxylic acid group (ignoring the side chains), it can only form one or two peptide bonds, ensuring that the amino acid polymers found in proteins are always organized linearly. (There are no naturally occurring branching polypeptides, though some have been synthesized in laboratories.) Note that because of the tetrahedral orientation of the bonds formed by the α-carbon, even a straight sequence of amino acids has a characteristic zigzag shape to it.

Amino Acids Joined by a Peptide Bond Maintain Structural Polarity
Every amino acid, whether or not it is part of a peptide bond, always has an amino group and carboxylic acid group attached to the α-carbon. This means that amino acids have structural polarity; that is, the functional groups are attached to opposite-facing bonds formed by the α-carbon. Therefore, if several amino acids are linked together by peptide bonds, each of them maintains this polarity. Even the entire polymer has polarity: the amino acid at one end has a free (unbound) amino group, and the amino acid at the other end has a free carboxylic acid group, as shown in FIGURE 3-4. These two ends are called the amino terminus and the carboxy terminus of the polymer, respectively. They are abbreviated in many different ways, such as NH2 terminus, NH3+ terminus, COO− terminus, COOH terminus, N terminus, and C terminus, or even just N and C.

FIGURE 3-4 Conventions for drawing peptides. By convention, the amino terminus (N terminus) is on the left, and the carboxy terminus (C terminus) is on the right. Peptides are named as derivatives of the carboxy-terminal amino acid.

Definitions: Proteins Versus Polypeptides Versus Peptides Versus Subunits
For (mostly) historical reasons, several different names are used to describe amino acids held together by peptide bonds (see Box 3-2). Each name refers to a specific combination of amino acids. The smallest of these is two amino acids held together by a single peptide bond, and it is called a dipeptide. Adding more amino acids to the dipeptide results in the formation of a tripeptide (three amino acids), tetrapeptide (four amino acids), and so on, as shown in Figure 3-4 (see Box 3-3). Notice that the numerical portion of the name (di-, tri-, tetra-, etc.) refers to the number of amino acids in the structure, not to the number of peptide bonds. At some point, scientists stop counting the number of amino acids and use the word oligopeptide to designate a few amino acids held together by peptide bonds. There is no strict convention for how many amino acids make up an oligopeptide. (Think of the word group here.)
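The naming rule above, together with the left-to-right drawing convention of Figure 3-4, can be summarized in a few lines of code. This is a hedged sketch: the function name describe_peptide and the returned field names are hypothetical, only the three prefixes mentioned in the text are included, and longer chains are deliberately left without a cutoff because the text says no strict convention exists.

```python
# Prefixes count amino acids, not peptide bonds.
PEPTIDE_NAMES = {2: "dipeptide", 3: "tripeptide", 4: "tetrapeptide"}


def describe_peptide(residues):
    """Describe a linear peptide given its residues listed from N terminus to C terminus."""
    n = len(residues)
    if n < 2:
        raise ValueError("a peptide needs at least two amino acids")
    return {
        "amino_acids": n,
        "peptide_bonds": n - 1,  # always one fewer than the number of amino acids
        "name": PEPTIDE_NAMES.get(n, "oligopeptide or polypeptide (no strict cutoff)"),
        "n_terminus": residues[0],   # residue with the free amino group
        "c_terminus": residues[-1],  # residue with the free carboxylic acid group
    }


if __name__ == "__main__":
    print(describe_peptide(["Ala", "Gly", "Ser"]))
```

A tripeptide therefore reports three amino acids but only two peptide bonds, which is the distinction the prefix rule is meant to capture.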

BOX 3-2 TIP: EXPLORING PROTEIN NOMENCLATURE Search the internet for these terms: oligopeptide, polypeptide, peptide bond, protein subunit, and quaternary structure. Add what you learn from your search to the descriptions of these terms in this chapter. Of course, use your judgment as to the accuracy, and ask your instructor for suggestions for other search terms. One easy way to visualize 3D models of protein and polypeptide structures is to use the app called Cn3D, which is supported by the U.S. National Center for Biotechnology Information (NCBI).

BOX 3-3 FAQ: WHY ARE THERE SO MANY DIFFERENT WAYS TO DRAW A PROTEIN?
It is common practice for scientists to draw proteins with only the resolution necessary to illustrate whatever point they are making. In addition to the molecular diagrams used in Figure 3-4, there are at least six other ways to draw proteins, and each has its particular advantage. The simplest drawings, which look like simple blobs, are used when one is discussing the overall shape of a protein without considering its actual three-dimensional shape. Typically, these blobs are used to illustrate that a protein changes shape under different chemical conditions or when it binds a target, without detailing exactly what kind of shape change occurs. Figure 3-17 is a good example of a blob. Simple lines, as in Figure 3-5, represent the polypeptide backbone and add a bit more resolution by showing roughly where each amino acid in a polypeptide is located in space, but these are usually not very precise. Ribbon drawings (see Figure 3-8c) add more detail to a line drawing by showing where secondary structures form in a polypeptide, using the standard convention of a cylinder to represent an α-helix and a flat ribbon (sometimes shaped like an arrow pointing from amino terminus to carboxy terminus) to represent a strand of a β-sheet. Ball-and-stick diagrams are highly detailed and show where each atom of a polypeptide is located in three-dimensional space, plus the position of the covalent bonds holding them together; Figure 3-2b is a good example of a ball-and-stick model. Space-filling diagrams (sometimes referred to as van der Waals diagrams) expand on ball-and-stick diagrams by showing the volume occupied by each atom in space (roughly equivalent to the van der Waals radius), as in Figure 3-8a. Finally, sometimes proteins are drawn as cartoons when their three-dimensional structure is well known (e.g., microtubules, showing the microtubule spindle during mitosis). Because molecular diagrams, blobs, and line drawings do not represent the actual structure of proteins, they can be used when the actual structure is either unknown or does not matter. All of the other drawings require some knowledge of a protein’s actual structure and are therefore much more limited in use since the real three-dimensional structure of most proteins is unknown.

At yet another somewhat arbitrary point (typically ten amino acids or more), the structure is called a polypeptide. Regardless of its length, the repeating sequence of atoms linking amino acids is called the backbone. When discussing proteins, the term polypeptide is most appropriate because most proteins contain more than 20 amino acids (the smallest known human protein has 44), and some proteins contain over 1,500 amino acids.

Proteins Are Polymers of Amino Acids That Possess Three Important Traits

In many cases, a protein consists of a single polypeptide. But this is not always the case, so the two words are not interchangeable. When a protein is composed of two or more polypeptides, each polypeptide is called a subunit; the term protein applies only to the collection of all of the subunits when they become functional. To earn the name “protein,” any group of amino acids has to have three important traits, as illustrated in FIGURE 3-5:
■ PROTEIN TRAIT 1: All proteins adopt at least two stable three-dimensional shapes. It is obvious that any group of amino acids must have at least one shape and that the more amino acids it contains, the more possible shapes it can adopt (think of adding links to a chain). What distinguishes proteins from any other amino acid polymers is that proteins organize their amino acids into stable shapes. Put another way, each protein adopts a small number of shapes and tends to use only those shapes during its lifetime. Furthermore, every protein composed of the same amino acid sequence in its polypeptide(s) adopts the same small number of shapes. The protein hemoglobin, which we discussed in Chapter 2 (Figure 2-7), is an excellent example: every healthy hemoglobin molecule that has the same amino acid sequence uses the same small number of shapes, and it switches between these shapes depending on whether or not it is bound to oxygen; only mutant forms of hemoglobin form the polymeric fibers that cause sickle-cell disease.
■ PROTEIN TRAIT 2: All proteins bind to at least one molecular target. As we discussed in Chapter 2 (Messenger RNAs Are Translated into Proteins), proteins are the molecular machines that translate DNA information into some form of cellular activity. While the number of cellular activities performed by proteins is far too large to count, they all share one common property: they take place only if the proteins bind to some other molecule. Students are likely already familiar with one class of proteins where this property is easy to see: enzymes. The function of enzymes is to facilitate chemical reactions, and they do so by binding to the reactant(s) via an active site on the protein, where the chemical reaction takes place and the product(s) is/are released. An easy way to experience this is to put a starchy food in our mouth and leave it there for a minute without chewing. The enzymes present in saliva bind to the starch and break it down into smaller sugars, partially liquefying the food and making the starch much easier to digest. Note that these enzymes do not digest everything they come into contact with: proteins that bind either to a wide variety of molecules (low specificity) or to molecules not found in cells (excessive specificity) are of little use to cells. More details about enzymes and their role in metabolism are covered in Chapter 10.
■ PROTEIN TRAIT 3: All proteins perform at least one cellular function. As we will see in Chapter 8, cells expend a tremendous amount of their metabolic energy on the synthesis of proteins. For this strategy to survive evolution by natural selection, there has to be a substantial return on such a large investment. Cells benefit from making proteins because proteins that adopt useful shapes (sometimes called the native state) perform or control nearly every chemical and physical activity that takes place in cells. It is important to understand that each protein is a specialist, performing or assisting with just a few of the countless tasks necessary for a cell to survive and divide. Thousands of different proteins are necessary to maintain a cell in a healthy state.

FIGURE 3-5 The three traits of proteins.

Figure 3-5 also shows how these three traits are related; to perform a function, a native protein has to be able to adopt one stable shape when it is bound to its target and another shape when it is unbound. More simply, a protein has to be flexible enough to bind and release its target. The rearrangement of atomic forces that accompanies the binding and release of a target requires this flexibility. Other stable intermediate shapes (sometimes called states) may also be necessary for a protein to function properly. There is a balance between stability and flexibility in protein structure.
This is why the first protein trait is so important. Every time a cell synthesizes a polypeptide, there has to be a reasonable probability that the polypeptide will function like all of the other polypeptides of the same sequence (e.g., a newly synthesized hemoglobin will function like all of the other hemoglobins before it). If a polypeptide could adopt a different shape every time it was made, cells would have no way of controlling what it would do, and such a randomly organized polypeptide would not perform any cellular function well. Stated more simply, unstable proteins equal chaos, and chaos cannot survive billions of years of natural selection. If there were ever a protein that behaved unpredictably, it likely no longer exists in cells.
This is also why the second protein trait is important. For a protein to participate in any chemical reaction or physical activity, it has to come into contact with its designated target and bind to it predictably. Because of trait 1, proteins can form finely tuned regions in their structure (typically called binding sites) that only bind to their relevant target molecules. We saw a few examples of binding specificity in Chapter 2, where we discussed proteins that bind to and modify histones (see DNA Is “Silenced” in Heterochromatin). If a protein could bind to any molecule, it is highly unlikely it would ever be efficient enough to be of any real use to a cell. How proteins bind and interact with their targets is of such central importance to biology that an entire subdiscipline (biochemistry) is devoted to understanding these interactions (see Box 3-4).

BOX 3-4 FAQ: HOW IS PROTEIN BINDING DEFINED?
The study of binding between molecules is called kinetics and is typically covered in college-level chemistry and biochemistry courses. We can skip the details here, but it is important to remember that interactions between proteins and their targets occur through noncovalent bonds and are therefore reversible. This also means they are concentration dependent: the higher the concentration of the target, the more likely a protein will bind to it. Relative binding between two molecules is often expressed as a ratio between bound and unbound states of the two molecules at equilibrium, when the concentrations of the bound and unbound states do not change over time. If we wish to measure the relative binding strength (also called affinity) of molecule A for molecule B, one commonly used value is the dissociation constant, abbreviated Kd, and defined as the concentration (in moles per liter) of B at which half of the A is bound to B and the other half of A is unbound. For molecule A binding to molecule B, Kd = [A][B] / [AB], where [A] and [B] are the equilibrium concentrations of the unbound molecules and [AB] is the equilibrium concentration of the bound complex.

One of the easiest ways to measure protein binding is to observe the effects of an enzyme on its substrate over time. Some enzymes convert their substrates into a product that changes color, and we can use an instrument called a spectrophotometer to capture the interaction of these enzymes and their substrates by measuring the color change. The rate of substrate conversion (called Vi) is a function of substrate concentration. At very low concentrations of substrate, Vi is low because the enzymes rarely encounter their substrate; at higher substrate concentrations, the rate increases because substrate availability also increases. Eventually, adding more substrate does not increase Vi (i.e., the reaction reaches saturation), and this maximal rate is called Vmax. The substrate concentration that produces a Vi that is one-half of Vmax is designated the Michaelis–Menten constant, Km. The lower the Km, the greater the apparent affinity between enzyme and substrate.

Detecting interactions between proteins and their binding partners extends beyond enzymes and substrates (think of how histones form octameric complexes in nucleosome core particles: see Figure 2-14). The tools for detecting these interactions vary considerably, but the central point remains: binding causes a structural and functional change in the protein that depends on the concentration of its binding partner. This means every protein must have a dissociation constant for each of its binding partners. Dissociation constants for most molecular interactions in cells fall in the range of 10⁻⁶ to 10⁻⁹ M, and cells maintain the concentration of most of their proteins in this range. Some proteins bind very tightly to their targets and thus have a low Kd. The reciprocal ratio, called the association constant (abbreviated Ka), is sometimes used to describe binding affinity as well: binding interactions with low Ka values are considered to be weak interactions. The fraction of target bound by a protein therefore depends on both the location and the concentration of both binding partners.

FIGURE A Plotting the velocity of a chemical reaction as a function of substrate concentration reveals both the maximum velocity of the reaction (Vmax) and the Michaelis–Menten constant, Km.
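The relationships described in Box 3-4 can be made concrete with a few lines of code. The sketch below is only illustrative: it assumes simple one-to-one binding at equilibrium and uses the standard Michaelis–Menten rate equation, Vi = Vmax[S] / (Km + [S]), which is consistent with the definition of Km given in the box but is not written out in this chapter. The function names and all numerical values are hypothetical.

```python
def fraction_bound(free_b, kd):
    """Fraction of A bound to B at equilibrium (assumes 1:1 binding).

    free_b and kd must be in the same concentration units (e.g., M).
    """
    return free_b / (kd + free_b)


def association_constant(kd):
    """Ka is the reciprocal of Kd, so a low Kd corresponds to a high Ka."""
    return 1.0 / kd


def initial_rate(substrate, vmax, km):
    """Michaelis-Menten initial rate Vi at a given substrate concentration.

    substrate and km must share units; the result has the units of vmax.
    """
    return vmax * substrate / (km + substrate)


# Hypothetical values chosen only for illustration.
KD = 1e-6     # dissociation constant, in M
VMAX = 10.0   # maximal rate, arbitrary units
KM = 2.0      # Michaelis-Menten constant, arbitrary concentration units

for free_b in (1e-8, 1e-6, 1e-4):
    print(f"[B] = {free_b:.0e} M -> fraction of A bound = {fraction_bound(free_b, KD):.3f}")

for s in (0.2, 2.0, 20.0, 200.0):
    print(f"[S] = {s:>5} -> Vi = {initial_rate(s, VMAX, KM):.2f}")
```

When the concentration of unbound B equals Kd, exactly half of A is bound, and when the substrate concentration equals Km, Vi is one-half of Vmax. Raising the concentration further moves both values toward saturation without ever exceeding it.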

BOX 3-5 TIP: ANTHROPOMORPHISM IN ANALOGIES

Throughout the remainder of this chapter and in the rest of this book, we present many analogies for cellular activity in boxes such as this one. Because proteins participate in so many different cellular activities, it is often easy to compare these activities to everyday activities performed by people. Our experience is that students find these anthropomorphized analogies helpful in visualizing what is happening at the molecular level, but it is very important not to overinterpret the analogies. Molecules, including proteins, aren’t people: they can’t see, they don’t think, they don’t want, and they express no emotions. For these types of analogies to be effective, it is critical to see the underlying biology and chemistry they are helping to explain. As these analogies come up, be sure to be able to trace them back to actual science. If you are unable to, that’s a good sign that it’s time to leave the analogy and go back to examining the real cells obeying the laws of nature.

Subunits Are Polypeptides That Assemble Together to Form a Single Protein

To discriminate between proteins made up of a single polypeptide and those composed of two or more subunits (FIGURE 3-6), four additional terms are used. A protein containing a single polypeptide is called a monomer (adjective, monomeric), while a protein containing two subunits is called a dimer (adjective, dimeric). In some cases, two polypeptides may associate with one another long after they have been synthesized; when this happens, polypeptides are said to dimerize (e.g., two subunits of a dimeric protein can dimerize). If a protein contains three or more subunits, it can be named in a variety of ways: by number of subunits (trimer, tetramer, pentamer, etc.) or by the more general terms (multimer, multisubunit, oligomer, or polymer) (see Box 3-6).

FIGURE 3-6 Polypeptides may be proteins or protein subunits. B. E. Tropp. Biochemistry: Concepts and Applications, First edition. Brooks/Cole Publishing Company, 1997.

BOX 3-6 TIP: UNDERSTANDING PROTEIN NAMES

One of the most intimidating aspects of cell biology, especially for students new to the field, is the tremendous number of new words one has to learn to understand the subject matter. This is especially true for protein names. Let’s have a look at where all

these names come from. The generic names (dimer, oligomer, polymer) are borrowed from organic chemistry and apply to any molecule made up of many subunits (PVC, the material used to make many different types of pipes, is a polymer of vinyl chloride). The specific names for individual proteins don’t follow any formal rules, but some trends can help. Many protein names end with the suffix -in (e.g., actin, tubulin, lamin). This won’t help to determine what the protein does, but it will hint that a new word may be a protein name. A second common suffix is -ase, and this is applied to enzymes. Often, the name includes the substrate for the enzyme (e.g., collagenase, enolase, proteinase). A word we will see often is kinase, and this means an “enzyme that attaches a phosphate to... .” Kinases are often named for their substrates (the molecules that receive the phosphate), such as protein kinase. One can name a protein any way one likes. Traditionally, the person or group that first identifies and describes a new protein chooses a name for it and uses this name in their publications and lectures, until the greater scientific community eventually adopts it. When two or more groups discover the same protein at about the same time, a naming contest exists until everyone agrees to adopt one name. Protein names fall into three general categories. The oldest of these categories relies on the use of Latin and Greek prefixes and suffixes to describe the known or suspected function of the protein at the time it was discovered (e.g., laminin, fibronectin, hexokinase). Many of these names also include references to the organic chemistry in its structure or in the molecule it binds to (e.g., transglutaminase, superoxide dismutase). One of the longest of these names has over 1,100 letters and is virtually unpronounceable. A second category that appeared in the middle of the twentieth century uses English words in much the same fashion but introduces

acronyms (e.g., transforming growth factor [TGF], heat shock protein [HSP]). A third naming convention, which became popular beginning in the 1980s, has a sense of humor, borrows the acronym strategy, and pokes fun at the stodgy tone of the first category of names. One of the first proteins named this way was Sonic Hedgehog, after the video game of the same name, and this was followed by another protein named Tiggywinkle Hedgehog, after a character in a children’s book by Beatrix Potter. Other protein names in this class include Bad, Boo, CARDIAK, Casper, CLAP, DEDD, MADD, SODD, TANK, TRAMP, TRANCE, and TWEAK. When encountering a new protein name, don’t be intimidated (see Box 1-8). These days, remembering the name of a protein is far less important than understanding its three traits. If the name of a protein gets in the way of understanding its function, call it whatever name is easiest. Then, once you understand it, memorize the real name.

Concept Check 1 Do the three traits apply to all biological molecules? Briefly review the structure of nucleic acids (Chapter 2), and build an argument to answer this question: do the three traits of proteins also apply to nucleic acids?

▶ 3.3 Protein Structure Is Classified into Four Categories

Key Concepts
■ A vocabulary has been devised for describing types, or classes, of distinctive patterns in protein structure.
■ The primary structure of any polypeptide is the order (sequence) of its amino acids, from amino terminus to carboxy terminus.
■ The secondary structure of a polypeptide describes the three-dimensional arrangement of its primary sequence. The three categories of secondary structure are α-helix, β-sheet, and random coil. Some combinations of secondary structure are called motifs.
■ The tertiary structure of a polypeptide is the three-dimensional arrangement of secondary structures. Several different proteins share certain three-dimensional arrangements called domains.
■ Quaternary structure describes the three-dimensional arrangement of polypeptides in a multi-subunit protein.
■ Five classes of chemical bonds stabilize polypeptide structure.

Despite the myriad functions performed by proteins, all proteins share enough structural similarities that scientists have developed a vocabulary for describing these features. We can also use this vocabulary to compare different proteins and reach some general conclusions about how the structure of a protein can contribute to its function. There are four categories of protein structure, beginning with the simplest and ending with the most complex, as shown in FIGURE 3-7.

FIGURE 3-7 The four categories of protein structural organization. B. E. Tropp. Biochemistry: Concepts and Applications, First edition. Brooks/Cole Publishing Company, 1997.

Primary Structure Is Defined by the Linear Sequence of Amino Acids

As stated above, all naturally occurring proteins contain at least one polypeptide, which is composed of a linear sequence of amino acids. The first category of a protein’s structure (also called its 1° [primary] structure) is therefore simply the sequence of amino acids in its polypeptide(s). Remember that polypeptides are synthesized from the amino terminus to the carboxy terminus by ribosomes (see Chapter 2, Messenger RNAs Are Translated into Proteins, and Chapter 8, Proteins Are Synthesized by Ribosomes from an mRNA Template) by adding one amino acid at a time to the carboxy terminus of a growing polypeptide. Every polypeptide synthesized by a ribosome from a different mRNA has its own unique sequence of amino acids and corresponding 1° structure. When written out in conventional text, the primary sequence always reads left to right, with the amino terminus at the start and the carboxy terminus at the end.

Secondary Structure Is Defined by Regions of Repetitive, Predictable Organization in the Primary Structure

The orientation of the 1° structure in three-dimensional space is called the 2° (secondary) structure of a protein. As we stated earlier, the torsional rotation of the α-carbon atoms in peptide bonds permits great rotational freedom along the polypeptide backbone. However, somewhat surprisingly, only two characteristic orientations appear in proteins, and they are named the α-helix and β-sheet. A third type of 2° structure, called the random coil, consists of all of the rest of the configurations adopted by linear sequences of amino acids (i.e., anything that is not an α-helix or a β-sheet).

An α-Helix Is Shaped Like a Coil

The α-helix is the most common 2° structure and is shaped like a coil or spring. Because there are 3.6 amino acids per complete turn of the α-helix, the backbone atoms in every fourth amino acid come in relatively close contact and form hydrogen bonds (N–H···O=C) with one another. These hydrogen bonds stabilize the α-helix, and because they can form between any amino acids, there is no theoretical limit to how long an α-helix can be. Notice that the R groups project outward from the long axis of the helix. This has special importance for some membrane proteins, as we will discuss in Chapter 4 (see Membrane Proteins Associate with Membranes in Three Different Ways). In the so-called ribbon model diagrams of protein structure, α-helices are represented by a corkscrew shape, as shown in FIGURE 3-8C and Figure 3-7.
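The arithmetic behind the α-helix geometry can be sketched in a few lines. The example below takes the 3.6 residues per turn stated above and assumes the commonly described pairing in which the backbone C=O of residue i hydrogen-bonds with the backbone N–H of residue i + 4. The function names are hypothetical and chosen only for illustration.

```python
RESIDUES_PER_TURN = 3.6  # value given in the text


def degrees_per_residue():
    """Rotation around the helix axis contributed by each residue."""
    return 360.0 / RESIDUES_PER_TURN


def helix_turns(n_residues):
    """Number of complete and partial turns in a helix of n_residues."""
    return n_residues / RESIDUES_PER_TURN


def helix_hbond_pairs(n_residues, offset=4):
    """List backbone hydrogen-bond partners as (i, i + offset) pairs.

    Residues are numbered from 1 at the amino terminus. The offset of 4
    is an assumption reflecting the usual description of the alpha-helix.
    """
    return [(i, i + offset) for i in range(1, n_residues - offset + 1)]


# A hypothetical 18-residue helix.
print(f"{degrees_per_residue():.0f} degrees per residue")
print(f"{helix_turns(18):.1f} turns in 18 residues")
print(helix_hbond_pairs(18)[:3], "...")
```

Because every residue after the first few can pair with a partner four positions away, the pattern simply repeats as the helix grows, which is one way to see why there is no theoretical limit on helix length.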

FIGURE 3-8 Three ways of representing the structure of a protein. Structures from Protein Data Bank 1JVT. L. Vitagliano, et al., Proteins 46 (2002): 97–104. Prepared by B. E. Tropp.

A Beta Sheet Forms a Plane

A β-sheet is the second stable arrangement of an amino acid sequence. As in an α-helix, hydrogen bonds within the polypeptide backbone stabilize the β-sheet structure, but these hydrogen bonds do not form between every fourth amino acid; instead, they form between a row of “downstream” amino acids (these rows are often called strands) that align with the first row. There are two forms of β-sheets: a parallel β-sheet, which forms between amino acid sequences with the same structural polarity (i.e., the amino and carboxy-terminal ends of each strand of the β-sheet are aligned), and an antiparallel β-sheet, which contains strands with alternating polarities, as shown in Figure 3-7. In both types of β-sheets, notice that the zigzag pattern in the backbone of each strand aligns, giving the entire β-sheet the appearance of a plane, or sheet, that has been folded back and forth (i.e., pleated). These structures are often called β-pleated sheets for this reason. Notice too that the folds occur at the α-carbons, such that the R groups project outward from the sheets in rows. Also note that these rows of R groups alternate from one side of the sheet to the other at each fold. Like the α-helix, there is no theoretical limit to how many strands a β-sheet can have.

β-sheets can adopt a number of different shapes. The larger the β-sheet is, the more different shapes it can adopt. In one remarkable example, a β-sheet can be so large that it can fold into a tube-like shape, forming what is known as a β-barrel. Other characteristic shapes of β-sheets exist as well, and we’ll see some examples in later chapters. In many models of protein structure, the strands of a β-sheet are illustrated by ribbon-like shapes, as seen in Figure 3-8C. In some cases, the ribbons include an arrow pointed toward the carboxy terminus of the polypeptide to assist with orientation, as shown in FIGURE 3-9.

FIGURE 3-9 Examples of common motifs in proteins. α-helices are represented by coiled ribbons, and strands of β-sheets are represented by arrows pointed from the N terminus to the C terminus of the polypeptide.

Another important difference between the α-helix and the β-sheet is that the α-helix is formed by an uninterrupted sequence of amino acids, while the β-sheet, by necessity, is formed by strands of amino acids separated by sequences long enough to form the loops necessary for aligning the strands. The shortest of these loops contains only two amino acids and is called a hairpin loop; it connects adjacent strands of an antiparallel β-sheet. For parallel β-sheets, the loops must be even longer than the strands to allow for proper strand orientation. In some cases, these loops are extremely long and may even contain one or more α-helices.
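As a small illustration of how R groups alternate from one side of a β-sheet to the other, the sketch below assigns each residue in a single strand to one face of the sheet. The sequence, the function names, and the "up"/"down" labels are assumptions made for the example, not data from a real protein.

```python
def side_chain_faces(strand, first_face="up"):
    """Pair each residue in a beta-strand with the face its R group points to.

    Faces alternate at every residue, mirroring the pleated backbone.
    """
    faces = ("up", "down") if first_face == "up" else ("down", "up")
    return [(residue, faces[k % 2]) for k, residue in enumerate(strand)]


def residues_on_face(strand, face, first_face="up"):
    """Collect the residues whose R groups project from one face."""
    return "".join(r for r, f in side_chain_faces(strand, first_face) if f == face)


# Hypothetical strand written in one-letter amino acid code,
# amino terminus first.
strand = "VKVEVRV"
print("up:  ", residues_on_face(strand, "up"))
print("down:", residues_on_face(strand, "down"))
```

In this made-up strand, every other residue is valine, so all of the valine R groups end up on the same face while the charged residues occupy the opposite face. Alternation of this kind lets the two sides of a sheet have very different chemical properties.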

Motifs Are Specific Combinations of Secondary Structures

The fact that there is no theoretical limit to the size of α-helices or β-sheets raises an important question: why aren’t all proteins made up entirely of these structures? There is no definitive answer to this question yet, but one possibility is that such a protein would be too stable to be useful. Because proper protein function requires a balance between stability and flexibility, it makes sense that proteins would include α-helices and β-sheets but would mix them in with more flexible structures. These non-α-helix or non-β-sheet structures are, of course, highly variable in shape, but they have a common name: random coil. Although there is no specific definition for a random coil, most biologists consider it a type of 2° structure, as every part of a polypeptide must belong to one of these categories. The structural stability of random coils can vary considerably; until recently, scientists thought that especially disordered regions contributed little to the function of proteins, but evidence is accumulating that this is not the case (see Box 3-7).

BOX 3-7 MANY “DISORDERED” RANDOM COILS IN PROTEINS ARE FUNCTIONAL

The term random coil covers a wide range of structures in proteins. The key feature of random coils, flexibility, is often essential for proteins to switch between alternate functional shapes. In some cases, scientists have had difficulty assigning a role to the randomness: the principle of Darwinian evolution argues that useless variation is selected against and eventually disappears from proteins and the genes that encode them. Yet, about half of all human proteins contain apparently disordered regions: why do they persist? One clear sign that these regions are important is that they are indeed functional: mutations leading to changes in their amino acid sequence can contribute to several diseases, including cancer. In 2016, a team of cell biologists, computer scientists, mathematicians, and systems biologists developed a method for predicting the functionality of these regions, based on a simple concept called evolutionary coupling. Put simply, the

underlying assumption of this approach is that if seemingly “random” amino acid sequences persisted in related genes throughout evolution, there must be some evolutionary advantage (i.e., function) favoring these sequences: they are said to be evolutionarily coupled. After carefully aligning the amino acid sequences of disordered human proteins with thousands of related sequences from other species, then tracing how the sequences of the genes encoding these regions changed over evolutionary time, this team was able to identify key arrangements of coupled amino acids that persisted in these regions. They were then able to predict stable 2° and 3° structures that preserved these coupled arrangements. Based on these predicted structures, the team was also able to assign functional roles (e.g., DNA binding, enzymatic function, chaperone function) to these conserved regions within the apparent disorder. By relying on the functions of related proteins, the evolutionary coupling approach can infer functionality of a protein even when the structure is highly disordered. To learn more about this approach, search the internet for the terms evolutionary coupling and evolutionary constraint. (Source: Toth-Petroczy et al., 2016.)

FIGURE A Conceptual depiction of evolutionary coupling.
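One way to build intuition for evolutionary coupling is to measure how strongly two positions in a sequence alignment vary together. The sketch below uses mutual information between alignment columns as a deliberately simple stand-in. It is not the method used by Toth-Petroczy et al.; published approaches generally rely on more sophisticated statistical models. The toy alignment and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log2


def column(alignment, j):
    """Residues found at position j across all aligned sequences."""
    return [seq[j] for seq in alignment]


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    High values mean that knowing the residue at one position helps
    predict the residue at the other, a crude signal of covariation.
    """
    n = len(alignment)
    col_i, col_j = column(alignment, i), column(alignment, j)
    count_i, count_j = Counter(col_i), Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in count_ij.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((count_i[a] / n) * (count_j[b] / n)))
    return mi


def ranked_pairs(alignment):
    """All column pairs, sorted from most to least covarying."""
    length = len(alignment[0])
    scores = [((i, j), mutual_information(alignment, i, j))
              for i, j in combinations(range(length), 2)]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical alignment of related sequences (one-letter code).
# Positions 0 and 2 always change together, as do positions 1 and 3.
TOY_ALIGNMENT = ["KAEL", "KGEV", "RADL", "RGDV", "KAEL", "RGDV"]

for pair, score in ranked_pairs(TOY_ALIGNMENT):
    print(pair, f"{score:.2f} bits")
```

In the toy alignment, the pairs of positions that always change together score highest, while pairs that vary independently score near zero. A position that never changes would also score zero, which is a reminder that coupling and conservation are different signals.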

In many cases, α-helices, β-sheets, and random coils form characteristic substructures within a protein. These structures are referred to as motifs, and Figure 3-9 shows examples of some common motifs found in proteins. Some of these motifs are used to fit into the major groove and the minor groove of the DNA double helix (see Figure 2-12). Keep in mind that motifs are not proteins; they are simply parts of a protein that have a characteristic shape and contribute to the function of the protein.

Tertiary Structure Is Defined by the Arrangement of the Secondary Structures in Three Dimensions

The orientation of 2° structures in three-dimensional space is the 3° (tertiary) structure of a protein. For proteins composed of a single polypeptide, tertiary structure is essentially the overall shape of the entire protein. For multi-subunit proteins, the term 3° structure applies to each subunit but not to the overall protein. Determining the actual tertiary structure of a protein is still quite difficult and relies on a combination of several different techniques:
■ One of the oldest methods is growing crystals of purified proteins in a carefully controlled atmosphere and using a mathematical technique, known as Fourier transformation, to analyze a pattern of X-rays deflected off the crystals. However, this requires scientists to grow a large crystal of a protein in the lab; proteins do not crystallize under normal physiological conditions, so it is extremely difficult to find chemical conditions that will promote the formation of a crystal stable enough to withstand X-ray bombardment. As a result, relatively few proteins have been analyzed in this amount of detail.
■ The 3° structure of relatively small polypeptides (