*Pages 50
[60]*
*Year 2017*

- Author / Uploaded
- Wallace F Marshall


Series ISSN: 2375-7744

Colloquium Lectures on Quantitative Cell Biology

Series Editor: Wallace F. Marshall, University of California, San Francisco

Introduction to Quantitative Cell Biology Wallace F. Marshall, University of California, San Francisco


About Morgan & Claypool Publishers This volume is a printed version of a work that appears in the Colloquium Digital Library of Life Sciences. Colloquium books provide concise, original presentations of important research topics, authored by invited experts. All books are available in digital & print formats. For more information, visit store.morganclaypool.com


Colloquium Digital Library of Life Sciences The Colloquium Digital Library of Life Sciences is an innovative information resource for researchers, instructors, and students in the biomedical life science community, including clinicians. Each PDF e-book available in the Colloquium Digital Library is an accessible overview of a fast-moving basic science research topic, authored by a prominent expert in the field. They are intended as time-saving pedagogical resources for scientists exploring new areas outside of their specialty. They are also excellent tools for keeping current with advances in related fields, as well as refreshing one’s understanding of core topics in biomedical science. For the full list of available titles, please visit: colloquium.morganclaypool.com Each book is available on our website as a PDF download. Access is free for readers at institutions that license the Colloquium Digital Library. Please e-mail [email protected] for more information.


Colloquium Series on Quantitative Cell Biology

Series Editor: Wallace F. Marshall, Department of Biochemistry and Biophysics, University of California, San Francisco

A fundamental unsolved problem in biology is understanding how a living cell emerges from the multitude of molecular components. While cell biology has made great strides in enumerating all the components of the cell, this is only just the beginning, and the challenge we now face is understanding the cell as a complex, self-organizing system. To meet this challenge, we must take cell biology to a quantitative level, combining mathematical modeling with new methods in measurement and data analysis. The goal of this e-book series is to provide an overview of current approaches and challenges in the emerging field of Quantitative Cell Biology, in a way that will be accessible to readers from the biological sciences as well as the physical and computational sciences. These state-of-the-art volumes introduce readers to cutting-edge research in the field, including computational modeling and image analysis methods, while also discussing current understanding and open questions in the systems biology of cells. Each book is intended to be useful independent of the others, and the series as a whole will provide a comprehensive introduction for students and researchers who are new to the field.

Published titles: For titles please see the website, www.morganclaypool.com/toc/qcb/1/1.

Copyright © 2017 by Morgan & Claypool Life Sciences

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopy, recording, or any other), except for brief quotations in printed reviews, without the prior permission of the publisher.

Introduction to Quantitative Cell Biology
Wallace F. Marshall
www.morganclaypool.com

ISBN: 9781615046683 paperback
ISBN: 9781615046690 ebook
DOI: 10.4199/C00121ED1V01Y201409QCB002

A Publication in the Colloquium Series on Quantitative Cell Biology
Lecture #1
Series Editor: Wallace F. Marshall, Department of Biochemistry & Biophysics, University of California, San Francisco

Series ISSN
ISSN 2375-7744 print
ISSN 2375-7752 electronic

Introduction to Quantitative Cell Biology Wallace F. Marshall Director, Center for Cellular Construction Professor, Department of Biochemistry and Biophysics University of California San Francisco San Francisco, CA

COLLOQUIUM SERIES ON QUANTITATIVE CELL BIOLOGY #1


Abstract For the past decade or more, much of cell biology research has been focused on determining the key molecules involved in different cellular processes, an analytical problem that has been amenable to biochemical and genetic approaches. Now, we face an integrative problem of understanding how all of these molecules work together to produce living cells, a challenge that requires using quantitative approaches to model the complex interactions within a cell, and testing those models with careful quantitative measurements. This book is an introductory overview of the various approaches, methods, techniques, and models employed in quantitative cell biology, which are reviewed in greater detail in the other volumes in this e-book series. Particular emphasis is placed on the goals and purpose of quantitative analysis and modeling, and the special challenges that cell biology holds for understanding life at the physical level.

Key words quantitative cell biology, modeling, image analysis, numerical methods, biophysics, parameter estimation, model discrimination, exploratory data analysis, spatial statistics, ARMA models, machine learning, high content image-based screens, coarse-graining, boundary value problems, network models, rule-based modeling, agent based modeling, stochastic modeling, computer aided design, kinetochore, cytokinesis, endocytosis, computational fluid dynamics, level set method, multiscale modeling approaches, morphogens, differential equations, probability theory, statistical analysis


Contents

1. Overview: What Is Quantitative Cell Biology?
   1.1 Modeling to Bridge the Gap in Scales
   1.2 Emergent Properties and Self-Organization
   1.3 Predictive Understanding of Cellular Systems
   1.4 How Does Quantitative Cell Biology Differ from Systems Biology?

2. Quantifying Data
   2.1 Summarizing and Visualizing Large Numbers of Examples
   2.2 Allowing Samples to Be Compared Using Statistics
   2.3 Derived Quantities
   2.4 Testing Models
   2.5 Exploratory Data Analysis
       2.5.1 Spatial Statistics
       2.5.2 Time Series Analysis
   2.6 Machine Learning and “Big Data”
   2.7 When the Numbers, Themselves, Are Directly Relevant

3. Building and Using Models
   3.1 Role of Modeling in Cell Biology
   3.2 Types of Models
       3.2.1 Ordinary Differential Equations
       3.2.2 Partial Differential Equations
       3.2.3 Network Models
       3.2.4 Rule-Based Modeling
       3.2.5 Agent-Based Modeling
       3.2.6 Stochastic Modeling
   3.3 What to Do with Models Once You Have Them
       3.3.1 Testing Theories
       3.3.2 Regression and Parameter Estimation
       3.3.3 Proofs About Extreme Behaviors
       3.3.4 Establishing Equivalence with Known Systems
       3.3.5 From Models to Design Tools

4. Examples of Quantitative Cell Biology
   4.1 Counting Molecules in the Kinetochore
   4.2 Modeling Cytokinesis
   4.3 Understanding Forces in Endocytosis

5. Frontiers in Quantitative Cell Biology
   5.1 New Numerical Methods
   5.2 Multiscale Modeling of Cancer
   5.3 Measuring and Modeling Developmental Biology

6. How to Get Started in Quantitative Cell Biology
   6.1 Prerequisites
   6.2 Resources for Learning and Teaching
   6.3 Approaches to Interdisciplinary Research

References
Author Biography

chapter 1

Overview: What Is Quantitative Cell Biology?

The living cell is a mechanochemical machine of almost unimaginable complexity, but at some level, all of its activities can be reduced to a combination of biochemical processes and physical processes. The challenge in understanding cells is understanding how all the parts work together, and how physics and biochemistry combine to produce life.

The field of cell biology started out as a descriptive science, based on visualizing cellular structures by light and electron microscopy. In subsequent decades, work focused on determining the “parts list” of molecules that make up the cell, but this is still an essentially descriptive approach. One might think that determining the functions of the individual molecular components of a cell would provide all the answers, but it did not. The problem is that unlike the comparatively simple processes that biochemistry and molecular biology were able to explain, such as digestion of sugar or activation of gene expression at a particular promoter, cell biology can only be understood at the level of entire systems in which large numbers of molecules work together to produce emergent behaviors. The field of quantitative cell biology seeks to address the need for methods to measure, model, and predict complex behaviors at the level of cells and sub-cellular processes.

1.1

Modeling to bridge the gap in scales

Many of the most interesting unanswered questions in cell biology exist at a mesoscale between the atomic scale (measured in Angstroms) and the spatial scale of light microscopy (measured in microns). At this intermediate spatial scale, direct visualization of key events is extremely difficult because they involve groups of molecules too large to resolve by X-ray crystallography but too small to resolve by light microscopy. Furthermore, cellular processes are inherently cooperative, arising from interactions of large numbers of components and mediated by energy-consuming pathways that allow spontaneous generation of order from disorder. This combination of a gap in direct measurement and the importance of complex interactions of large numbers of components means that modeling and theoretical approaches are necessary to span this gap in understanding (Mogilner, 2006). The centrality of modeling in this process creates a need for quantitative measurements and methods for data analysis. The combined application of computational modeling and quantitative data analysis to fundamental questions in cell biology has created a new discipline: quantitative cell biology.

1.2

Emergent properties and self-organization

Traditional concepts and methods of biochemistry and molecular biology are extremely powerful for dissecting mechanism in cases where the observable process is the direct outcome of a single molecule or complex. Consider, for example, DNA replication. Since a single polymerase complex can drive the incorporation of nucleotides, it is possible to isolate the complex and determine its kinetic properties using simple enzymatic assays. There is no need to consider any other components in the cell, since the enzyme in itself is sufficient to produce the phenomenon under study. The key is that a process like DNA replication can be localized to a single point in the cell and assigned to a single enzymatic function.

But other processes are not subject to this type of localization. Cell division and motility, for example, are the collective result of hundreds of different molecular players distributed in broad swathes across the cell. Rather than being a direct outcome of a single enzymatic activity, these phenomena are emergent properties of huge molecular collectives. It is thus very hard, maybe even impossible, to truly understand emergent behaviors by considering one individual molecular species at a time, that is, using the conceptual framework of molecular biology. Does this mean such systems are incomprehensible? Not at all: emergent properties are studied all the time in condensed matter physics, and all it takes is a way to represent the pertinent level of organization at a suitable level of abstraction. For instance, solid–liquid phase transitions can be understood by thinking about order parameters and phase diagrams, without needing to consider the detailed behavior of every single molecule.

Self-organization is one type of emergent property that is highly germane to cell biology. Although the genome of a cell is often likened to a blueprint, it is more like a recipe book that specifies what components should be made, but not how they should be put together. The actual physical structure of the cell apparently results from self-organization of the components. For example, despite the apparent complexity of the mitotic spindles, bipolar spindles are able to self-assemble from components in vitro (Heald et al., 1996), and presumably, these same assembly processes play a key role in spindle assembly in living cells (Pavin and Tolic, 2016). Self-organization of structures is seen in physical systems as well. One of the goals of quantitative cell biology is to view cell structure and behavior as emergent, self-organizing phenomena, and to try to analyze them using mathematical and physical approaches similar to those used in condensed matter physics. This program of trying to understand cellular structure and behavior is a modern extension of the goals initially stated almost a century ago by Thompson (1942).

1.3

Predictive understanding of cellular systems

The goal of studying cell biology is to understand how cells work at a mechanistic level. Mechanisms are commonly described in words or in diagrams. Indeed, it is common practice to formulate a cartoon-like model to summarize our understanding about a given cell biological process. Such cartoons fill modern cell biology textbooks and review articles, and often occur in research articles as a way to encapsulate the conceptual take-home message of the work. But how does one decide whether an appealing-looking cartoon, or a convincingly worded descriptive model, actually corresponds to reality? How do we know whether we really understand how a system works? What makes science different from other branches of knowledge is that we test our tentative understanding with experiments, which ultimately amounts to asking whether our “model,” whether formulated as words or diagrams or in some other form, is sufficient to predict how the system will respond to defined perturbations. This is the very essence of what an experiment is. The trick becomes how to know whether or not a prediction has been satisfied. Sometimes, a prediction can be formulated as something that is very black-and-white: for example, we may predict that if gene X is knocked out, our cells will die. In other cases, however, the prediction may not be something as obvious and easy to decide as life or death.

1.4

How does quantitative cell biology differ from systems biology?

Systems biology and quantitative cell biology share the common feature of adapting tools traditionally employed in the physical sciences to better understand the complexity of living systems, but they differ greatly in the level of biological organization that they address. Systems biology currently places virtually all of its emphasis on molecular genetic regulatory pathways, and rarely if ever addresses the level of the entire cell, or even of sub-cellular components or organelles. As a result, the computational and analytical tools in common use in systems biology, as well as data standards and representations, are not directly applicable to quantitative cell biology data. The fields are therefore distinct.


However, quantitative cell biology has much to learn from systems biology as a field, particularly in the way that systems biology has made use of standardized data formats and modeling languages such as Systems Biology Markup Language (SBML). SBML is a formal language in which mathematical models of biochemical pathways can be represented in a machine-readable form that can then be used by many different software packages, for example, to visualize the pathway in network form, or to run simulations of the dynamic behavior of the network. In many respects, the types of data that systems biology studies, that is, gene expression profiles, are far easier to analyze than cell biology data, because the data usually lack a spatial component. Thus, quantitative cell biology poses unique challenges for understanding, representing, and manipulating data that systems biology of gene networks does not face. Developing formal languages to represent models of the spatially complex and dynamic behaviors of cells remains a challenging problem.

chapter 2

Quantifying Data

Traditionally, cell biology has relied heavily on image data and biochemical data, such as Western blots, usually with the goal of obtaining qualitative results such as showing that a particular protein localizes near a particular organelle and so on. Such approaches have yielded tremendous advances in our understanding of how cells work. However, there are a number of compelling reasons for wanting cell biological data to be more quantitative in nature, that is, results that are expressed in terms of numbers.

2.1

Summarizing and visualizing large numbers of examples

Perhaps the simplest reason for wanting numerical measurements is that these measurements allow large numbers of samples to be summarized and aggregated in a single figure. By having numerical measurements obtained as a function of time, for example, one can generate a graph showing the dynamics of the system, thus summarizing the results of many, perhaps hundreds, of time-lapse image series, all in a single plot. Although it might in principle be possible to look at all of the raw image sequences, this becomes prohibitive as the number and size of the files increase. Moreover, in scientific publications, it is not reasonable to expect every reader to go through the exercise of laboriously viewing hundreds of movies. Nobody will do it. Although some cell biology journals have emphasized the storage and distribution of raw data files, and certainly this is important for specialized purposes, in routine practice such files are essentially useless because they contain too much data. In contrast, simple graphical representations, based on numerical measurements, are far easier to view and draw conclusions from.

As an extension of this application, once a mathematical model is determined to represent key aspects of a system, it is sometimes possible to come up with combinations of system parameters that allow data to be re-plotted in a much simpler form. For example, Figure 1 shows how levels of gene expression change as a function of repressor concentration under many different conditions. The resulting graphs look like a mish-mash of different behaviors. But when the results are replotted in terms of a combination of system variables, the data all fall onto a single curve. Such “data collapse” allows huge numbers of experiments to be summarized on a single graph, which then allows outliers whose behavior is truly different to be more easily recognized.

Figure 1: Example of data collapse. (A) Change in gene expression versus repressor concentration in a series of different experiments in which the strength of binding sites and their total number is varied. Although increased repressor always gives decreased gene expression, the curves look dramatically different. (B) Replotting the data from panel A but using the fugacity, a calculated parameter that reflects available repressor (thus taking into account competition from other binding sites). In this case, all the data collapse onto a single curve, revealing the underlying consistent behavior. Reprinted figure from Weinert et al. (2014). Used with permission of the American Physical Society.
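The idea of data collapse can be illustrated with a toy calculation. The sketch below is not the fugacity calculation of Weinert et al. (2014), which accounts for competing binding sites. It assumes a much simpler repression curve, fold-change = 1 / (1 + R/K), where R is the repressor concentration and K is a hypothetical binding constant that differs between experiments. All function names and numbers are invented for illustration. Plotted against R, each value of K gives a different curve; plotted against the rescaled variable R/K, every point lands on the same curve.

```python
# Toy data collapse: assumed repression model, fold_change = 1 / (1 + R / K).
# K values and concentrations are made-up numbers for illustration only.

def fold_change(repressor, k):
    """Fold-change in expression for a simple (assumed) repression model."""
    return 1.0 / (1.0 + repressor / k)

binding_constants = [1.0, 10.0, 100.0]      # hypothetical K, one per experiment
concentrations = [0.1, 1.0, 10.0, 100.0]    # hypothetical repressor levels R

# Raw view: the same R gives a different fold-change in each experiment.
raw = {k: [fold_change(r, k) for r in concentrations] for k in binding_constants}

# Collapsed view: pair each measurement with the rescaled variable x = R / K.
collapsed = sorted(
    (r / k, fold_change(r, k)) for k in binding_constants for r in concentrations
)

def master_curve(x):
    """Single curve that every experiment should follow after rescaling."""
    return 1.0 / (1.0 + x)

max_deviation = max(abs(y - master_curve(x)) for x, y in collapsed)
print(f"largest deviation from master curve: {max_deviation:.2e}")
```

With real measurements the deviation would not vanish, and points lying far from the master curve would be the outliers worth a closer look.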


2.2

Allowing samples to be compared using statistics

The “typical image” has long been the bane of the cell biology literature. Given two different perturbations or conditions, one is faced with the challenge of how to prove that a difference that one sees is real and reproducible. Single images or data points, whether they be Western blots or qPCR (quantitative polymerase chain reaction) results or any other method, can never make a convincing case because spurious variation between experiments might be driving the variation, rather than true biological differences. The solution to this problem is statistics.

Statistical analysis can actually play a number of roles. First and foremost, statistics can be used to determine, in a rigorous manner, whether or not an observed difference is likely to be reproducible or simply a result of spurious variation between samples. The standard statistical toolbox requires numerical data as grist for its mathematical mill. If we have a hundred images of cells under condition X, and another hundred images of cells under condition Y, we must extract numerical measures from these images. Once we have a measurement of interest that lets us assign a number to each cell in each condition, we can then apply a range of standard statistical tests for differences in the mean. This can often be done using very simple approaches. For example, if one wants to ask whether a particular mutant has a larger nucleus than wild-type cells, one could measure nuclear diameter using image analysis packages such as ImageJ (https://imagej.net), and then feed the resulting measurements into simple comparison tests in a package such as Excel.

Another powerful use of statistics is to test for correlation of one measured value with another. For example, in the study of organelle size, there is a qualitative sense that larger cells have larger organelles. But to really prove this correlation, one must measure organelle size and cell size numerically, and then compute the correlation coefficient between the two measurements. Again, the main use of statistics is to test whether the correlation is strong enough that it is unlikely to have occurred as a fluke in a particular data set.

A less common application of statistics is to analyze statistical properties of a distribution to determine new features of the data. For example, studies of biological noise have used statistical analysis of the width of distributions of gene expression measurements to document variability in gene expression. This type of statistical application is quite distinct from tests of significance, and properly falls under the category of exploratory data analysis, to be discussed below.
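The two main uses of statistics described in this section can be sketched with the Python standard library alone. The measurements below are invented numbers, not real data. The permutation test is one simple choice of comparison test, used here because it needs no statistical tables; a t-test in a spreadsheet or statistics package would serve the same purpose. The function names are hypothetical.

```python
import random

def mean(values):
    """Arithmetic mean of a sequence of numbers."""
    return sum(values) / len(values)

def permutation_test(sample_a, sample_b, n_permutations=10000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns the fraction of random relabelings whose mean difference is at
    least as large as the observed one (an estimate of the p-value).
    """
    rng = random.Random(seed)
    observed = abs(mean(sample_a) - mean(sample_b))
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        if abs(mean(pooled[:n_a]) - mean(pooled[n_a:])) >= observed:
            hits += 1
    return hits / n_permutations

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical nuclear diameters in microns (e.g. exported from ImageJ).
wild_type = [5.1, 4.8, 5.3, 5.0, 4.9, 5.2, 5.0, 4.7]
mutant = [5.9, 6.1, 5.7, 6.3, 5.8, 6.0, 6.2, 5.6]
p_value = permutation_test(wild_type, mutant)

# Hypothetical cell sizes and organelle sizes (arbitrary units).
cell_size = [10.0, 12.0, 15.0, 18.0, 20.0, 25.0]
organelle_size = [1.1, 1.2, 1.6, 1.7, 2.1, 2.4]
r = pearson_r(cell_size, organelle_size)

print(f"permutation p-value: {p_value:.4f}")
print(f"Pearson r: {r:.3f}")
```

A small p-value says the difference in means is unlikely to arise from random relabeling of the cells. A correlation coefficient near 1 supports the qualitative impression that larger cells have larger organelles, although with only six points one would still want to test whether such a value could be a fluke.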


2.3

Derived quantities

Many quantities of interest can be directly measured using a particular type of experimental data. For example, in microscopy data, it is usually straightforward to measure the length of objects. Other quantities cannot be directly measured but must be calculated from other measurements. The simplest example, perhaps, is the measurement of speed. A time-lapse series of images, for example, of a vesicle moving on a microtubule, does not show us the speed directly. Instead, we can measure position at a series of time-points, then subtract the positions to get displacement, and then divide the displacement by the time to get the speed. This is a case where the biologically relevant quantity that we wish to know (vesicle speed) can only be obtained by performing a numerical calculation. This type of analysis is easiest to do when the moving objects are small and discrete, such as vesicles. When the moving object of interest is a continuum, for example the actin network at the leading edge of a cell, it is still possible to track the motion if the network can be sparsely labeled. In this so-called speckle microscopy approach, it is then possible to quantify the speed of the network as a function of spatial position by tracking the motion of individual foci of staining (see Figure 2).

Figure 2: Quantifying velocity of the actin network in a moving cell using speckle microscopy. Arrows indicate the direction of network motion at each point. From Ponti et al. (2004). Reprinted with permission from AAAS.

Other commonly seen examples are the calculation of diffusion constants from fluorescence recovery after photobleaching (FRAP) curves (Sprague and McNally, 2005; Lin and Othmer, 2017), from fluorescence correlation spectroscopy (FCS) data (Dittrich et al., 2001; Wang et al., 2006; Guan et al., 2015), or from analysis of mean squared displacement versus time of a diffusing particle (Qian et al., 1991). Such derived quantities are especially important in studies of cell mechanics, where key mechanical properties (such as the Young’s modulus of a microtubule) are calculated indirectly from measurements of bending angles (Gittes et al., 1993; Mickey and Howard, 1995; Hawkins et al., 2013).

An important aspect of these types of studies is that in order to extract the numerical value of a derived quantity, the data must be fit to curves predicted by theoretical models in which the quantity to be derived appears as a parameter. Using model fits to determine the likely value of a desired parameter is known as parameter estimation. Failure to fit the expected curve to the data can reveal important novel phenomena. For example, when movement of particles gives a mean squared displacement that is not proportional to time, this indicates anomalous diffusion, which suggests that additional processes, such as persistent motion, crowding, corrals, or reversible binding to an invisible lattice, may be at work (Saxton and Jacobson, 1997).

The use of derived quantities to extract insight in cell biology may seem less strange if we look back to the field of enzymology. Standard enzyme assays yield reaction velocities as a function of concentration, but this is seldom very interesting in itself. Rather, enzymologists use the numerical measurements of velocities to calculate the turnover rate (kcat) and the Michaelis constant (Km), which provide information about two distinct enzyme functional properties.

Related to the idea of derived quantities is the use of image data to compute key parameters of a process. For example, the rate of turnover of the actin network can be calculated based on the appearance and disappearance of speckles (Figure 3).

Figure 3: Visualizing the turnover of the actin network in a moving cell by quantifying the rate of speckle creation and disappearance. From Ponti et al. (2004). Reprinted with permission from AAAS.

In all of these cases, answering a biological question or obtaining a biological insight requires us to know the value of a quantity that cannot be directly measured but must be calculated from other measurements. This is only possible if we have those other measurements in numerical form. Thus, a key advantage of quantitative measurements for cell biology is that they allow us to calculate interesting and important derived quantities.
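As an illustration of a derived quantity, the following minimal Python sketch computes an average speed and a mean squared displacement (MSD) from a short list of positions. The track coordinates, the frame interval, and the function names are hypothetical values chosen for illustration; they are not data or code from any of the studies cited above.

```python
import math

def mean_speed(positions, dt):
    """Average speed from positions sampled every dt seconds."""
    steps = [math.dist(p, q) for p, q in zip(positions, positions[1:])]
    return sum(steps) / (dt * len(steps))

def mean_squared_displacement(positions, lag):
    """Time-averaged squared displacement at a lag given in frames."""
    squared = [math.dist(p, q) ** 2 for p, q in zip(positions, positions[lag:])]
    return sum(squared) / len(squared)

# Hypothetical vesicle track: (x, y) in micrometres, one frame every 0.5 s.
track = [(0.0, 0.0), (0.3, 0.4), (0.6, 0.8), (0.9, 1.2), (1.2, 1.6)]
frame_interval = 0.5

speed = mean_speed(track, frame_interval)    # micrometres per second
msd_1 = mean_squared_displacement(track, 1)  # lag of one frame
msd_2 = mean_squared_displacement(track, 2)  # lag of two frames
print(speed, msd_2 / msd_1)
```

For this straight-line track the MSD quadruples when the lag doubles, which is the signature of directed motion. For simple diffusion it would only double, so the ratio is one simple way to notice the departure from proportionality discussed above.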

2.4 Testing models

In the next chapter we will discuss the importance of mathematical modeling, as well as a range of different approaches for constructing models. But regardless of the model, there is always a challenge of determining whether or not a given model fits observed data. Because most models are mathematical or computational, they naturally produce numerical predictions, and hence one very important reason for wanting to have numerical data is that such data can be directly compared with the predictions of models. Even if it is possible for a model to produce more qualitative results such as synthetic image data, it would not be obvious how to compare the prediction to the observations. Ideally, one would use several models, each making a different predicted set of numerical data, and compare these to observed data to see that one model fits and the others do not. The ruling out of a model using comparison of predictions to numerical data is known as model discrimination.

The power of model discrimination is illustrated well by considering an historical example. Tycho Brahe spent years carefully measuring the position of stars and planets, essentially converting his visual impression of the night sky into a series of numbers. Using this data and his own measurements, Johannes Kepler eventually noticed that the orbit of Mars could be explained if the planet followed an elliptical, rather than circular, orbit. The elliptical orbit was not directly observed in the data. Rather, Kepler tried out a variety of different shapes for the orbit, calculated for each how the position should vary in the sky over time when observed from the Earth, and eventually found that an ellipse gave a better fit to the data than other orbital forms such as circles.

One big challenge for model discrimination is the fact that some models have more adjustable parameters than others. The more parameters a model has, the more likely it is to fit a set of data even if the model does not correctly describe the underlying process, simply because the range of curves that it could fit is larger. For this reason, one should be skeptical when anyone shows a fit of data by a model with a large number of parameters. This concern was epitomized by John von Neumann, who is supposed to have remarked: “With four parameters I can fit an elephant, and with five I can make him wiggle his trunk.” One method for model discrimination in which the number of parameters of different models is taken into account is the use of the Akaike Information Criterion (AIC), which has been applied to model discrimination in both medical data and cell biological modeling (Lindsey and Jones, 1998; Ludington et al., 2015).
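To make the parameter penalty concrete, here is a minimal Python sketch of an AIC comparison. It assumes independent Gaussian errors, in which case the criterion can be written (up to an additive constant) as AIC = n ln(RSS/n) + 2k, where RSS is the residual sum of squares, n the number of data points, and k the number of fitted parameters. The residual lists and parameter counts are hypothetical numbers invented for the example.

```python
import math

def aic_least_squares(residuals, n_params):
    """AIC for a least-squares fit, assuming independent Gaussian errors."""
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    return n * math.log(rss / n) + 2 * n_params

# Hypothetical residuals (data minus model) for two fits to the same 8 points.
residuals_simple = [0.10, -0.12, 0.08, -0.09, 0.11, -0.10, 0.09, -0.11]   # 2 parameters
residuals_complex = [0.09, -0.11, 0.08, -0.09, 0.10, -0.10, 0.09, -0.10]  # 5 parameters

aic_simple = aic_least_squares(residuals_simple, n_params=2)
aic_complex = aic_least_squares(residuals_complex, n_params=5)

# The model with the lower AIC is preferred.
preferred = "simple" if aic_simple < aic_complex else "complex"
print(aic_simple, aic_complex, preferred)
```

In this made-up case the five-parameter model has slightly smaller residuals, but not by enough to pay for its three extra parameters, so the simpler model has the lower AIC.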

2.5 Exploratory data analysis

Thus far, the applications of numerical data have been either to support conclusions that we may have already drawn from initial qualitative observation, or to calculate derived values that we knew would be of interest. But an increasingly important use of numerical data is to support exploratory data analysis, which collectively refers to a wide range of tools and approaches for detecting previously unknown behaviors or relations within a set of data. Testing for statistical correlation is a simple example: if we can measure different aspects of our data, we can ask which of those aspects are correlated. This has the potential to reveal surprising relations and possibly new causal links. Although correlation does not prove causality, it can at least suggest that a causal relation may exist and that further experiments to test for it would be worthwhile. But testing for correlation is just the tip of the iceberg in terms of the variety of exploratory data analyses that are possible.

2.5.1 Spatial Statistics

Cell biology is an inherently spatial problem, and often the data that we collect is given as a function of position. For such types of spatially-varying data, the field of spatial statistics becomes applicable. Although spatial statistics was originally developed in other fields such as forestry and geological prospecting, the same tools can be applied to look for patterns and interactions inside a cell. For example, spatial statistical approaches have been used to quantify clustering and distribution of organelles (Yang et al., 2011; Apte and Marshall, 2013). Spatial statistics become even more powerful when applied to time-varying data, allowing detection and visualization of novel dynamic patterns (Figure 4).

2.5.2 Time Series Analysis

Live-cell imaging allows time-lapse data to be collected to describe cell behavior, but the big challenge is what to do with all this data. For any given measured parameter, its value over time results in a time-series data set. Such time-series data can be analyzed by methods drawn from the field of signal processing, an approach that is common in neuroscience but still underexploited in cell biology.

Figure 4: Analysis of spatial data reveals different zones of cytoskeletal dynamics near the leading edge of a motile cell. For this analysis, three different quantities were compared at a series of time points before, during, and after cytoskeletal depolymerization. All three parameters show spatially non-uniform behavior. From Ponti et al., 2004. Reprinted with permission from AAAS.

One reason for analyzing time-series data from cells is to detect trends, such as periodicity, that might not be visually obvious due to noise. A second reason is to identify features of the data that can only be gleaned by comparing many different time-points, something the human eye is not always good at. For example, time-series analysis of protein transport into cilia has revealed avalanche-like features, including a power-law distribution of transport event sizes and strong correlations involving the dwell times before and after an import event as well as the quantity of protein imported (Ludington et al., 2013).

One approach to handling time-series data is to fit observed time series to statistical models, using model frameworks capable of emulating many types of dynamics. A classic example, common in digital signal analysis, is the class of autoregressive moving average (ARMA) models, which express the value at each time point as a function of the values at past time points plus a series of noise terms added at several prior time points. ARMA models have the advantage that they do not require detailed information about the internal dynamics of the system. This means that the model parameters may not be directly relatable to molecular entities. On the other hand, the parameters are still useful because they allow comparisons to be made between different conditions. For example, one can compare various mutants and ask which ARMA model parameters are affected in each case, thus allowing the mutants to be grouped based on similar time-series phenotypes (Jaqaman et al., 2006). One might question whether such advanced methods are really needed, but an illustration of the importance of this type of modeling is shown in Figure 5.

Figure 5: Justification for ARMA modeling of kinetochore microtubule dynamics. Experimental data from live-cell imaging are shown by the red line. The green line (labelled MTDI) shows a simulation of the same data based on the standard dynamic instability model. No combination of parameters can reproduce the actual appearance of the observed time series. More accurate modeling could be accomplished by taking prior time-series values into account using ARMA modeling. Reprinted from Jaqaman et al. (2006), with permission from Elsevier.
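The simplest member of the ARMA family is the first-order autoregressive model, AR(1), in which each value is a fraction phi of the previous value plus fresh noise. The Python sketch below simulates such a series and recovers phi by least squares. It is a toy under stated assumptions (a single autoregressive term, Gaussian noise, an arbitrary true value of 0.7), not the fitting procedure of Jaqaman et al. (2006), and the function names are hypothetical.

```python
import random

def simulate_ar1(phi, n, noise_sd=1.0, seed=0):
    """Simulate x[t] = phi * x[t-1] + Gaussian noise, starting from zero."""
    rng = random.Random(seed)
    series = [0.0]
    for _ in range(n - 1):
        series.append(phi * series[-1] + rng.gauss(0.0, noise_sd))
    return series

def fit_ar1(series):
    """Least-squares estimate of phi from consecutive pairs of values."""
    numerator = sum(now * before for now, before in zip(series[1:], series[:-1]))
    denominator = sum(before * before for before in series[:-1])
    return numerator / denominator

true_phi = 0.7  # assumed value for the demonstration
series = simulate_ar1(true_phi, n=5000)
phi_hat = fit_ar1(series)
print(phi_hat)
```

The estimated coefficient is a phenotype in the sense used above: two conditions can be compared by asking whether their fitted coefficients differ, without knowing the underlying molecular mechanism.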

2.6 Machine learning and “big data”

Exploratory data analysis, as discussed above, involves a scientist looking at the data and searching for trends or interesting unexpected features. A natural extension of this approach is to automate the search, so that algorithms replace humans in the search for connections within the data. This approach has come to be known as “big data” but is more properly referred to as machine learning. Given a set of data, we train a computer to recognize patterns in the data, and then we ask it to tell us what it has found. Machine learning is commonplace in bioinformatics, and is routinely used to identify protein domains or other interesting sequence features. Machine learning methods are also commonly applied in dealing with “-omics” data, particularly transcriptomics, by searching for sets of genes whose expression patterns vary in a similar way across different conditions. One nice example of a big data approach in cell biology was the automated analysis of cell shape space by Keren et al. (2008). Using the statistical method of principal components analysis (PCA), they found that cell shape variation in fish keratocytes could be represented almost entirely as a combination of just four fundamental modes of shape variation (Figure 6).

Figure 6: PCA of cell shape in fish keratocytes. (a) Images of several different cells. (b) Diagrams illustrating the four main modes of shape variation. Reprinted by permission from Macmillan Publishers Ltd, Keren et al. (2008).

Big data tools are a natural fit for high-content image-based screens, in which images are acquired under many different conditions (such as RNAi of every gene in the genome) and many numerical descriptions of cell shape are extracted for each condition. A simple application of machine learning in such a context is to learn how to recognize a “typical” cell based on the range of numerical features, so that unusual cells can be scored as “hits” in a screen. Machine learning has also proven itself highly useful in automatic segmentation, for example in recognizing cells within image data.
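The idea behind PCA, finding the directions along which a data set varies most, can be sketched in a few lines of Python for the special case of two measured features per cell. The measurements below are hypothetical, and real shape analyses such as the one described above work with many more dimensions and would normally use a numerical library; the closed-form 2x2 calculation here is only meant to show what the method computes.

```python
import math

def pca_2d(points):
    """PCA of 2-D data: returns ((variance on PC1, variance on PC2), PC1 unit vector)."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    # Entries of the sample covariance matrix.
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)
    # Eigenvalues of a symmetric 2x2 matrix in closed form.
    centre = (sxx + syy) / 2
    spread = math.hypot((sxx - syy) / 2, sxy)
    variances = (centre + spread, centre - spread)
    # Orientation of the axis of largest variance.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return variances, (math.cos(theta), math.sin(theta))

# Hypothetical measurements of two shape features for five cells.
cells = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2), (5.0, 9.8)]
variances, first_axis = pca_2d(cells)
explained = variances[0] / (variances[0] + variances[1])
print(explained, first_axis)
```

Here nearly all of the variance lies along a single axis, so one number per cell would capture most of the variation, which is the same kind of dimensionality reduction as representing shape with a few modes.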

2.7 When the numbers, themselves, are directly relevant

Sometimes, precise values of numbers are, in and of themselves, directly relevant for understanding mechanism(s). This is perhaps most evident in the biochemistry of multiprotein complexes, where the stoichiometry of the subunits can determine how the complex works.

A two-headed motor has a potential for processivity that a one-headed motor simply would not have. The number of sliding clamps associated with the replication fork has huge implications for how clamps are loaded and used during DNA synthesis, and the fact that there seem to be three polymerases per replication complex, rather than two as had always been assumed, solved the problem of how the lagging strand keeps pace with the leading strand (Reyes-Lamothe et al., 2010). Another interesting example is the kinetochore, where the number of potential microtubule binding domains plays a critical role in determining the ability of the kinetochore to remain attached to the spindle (Joglekar et al., 2008; Emanuele et al., 2005; Johnston et al., 2010). Sometimes, the stoichiometry of subunits can be determined by biochemical measurements or by direct imaging, for example, using cryoelectron microscopy. In other cases, various fluorescence-based methods are used to count the number of molecules in a complex within an intact cell. These are reviewed more thoroughly in another book in this series, Counting Molecules Within Cells (Coffman et al., 2014); for recent examples, see Chen et al. (2014).

Chapter 3: Building and Using Models

3.1 Role of modeling in cell biology

The purpose of a mathematical model is to express what is known or hypothesized about a system using a formal representation, from which new predictions can be determined using mathematics or computation. Such models are thus distinct from models based purely on cartoons or verbal descriptions. There are several advantages of a mathematical model over a purely verbal or cartoon model. First, it is usually the case that assumptions must be explicitly stated in the model in order to render it in mathematical form. This is not always the case: sometimes assumptions are implicitly built into the structure of the model, and in some cases, these implicit assumptions end up being critical for the model. But, in general, the exercise of rendering a model for a system as a series of equations or other formal representations tends to make our underlying assumptions more clearly visible. A second advantage of mathematical models is that the language of mathematics is universal, so once a model has been rendered as a formal mathematical system, any researcher studying the model can be reasonably sure that they are studying the same model as any other researcher. This is not going to be the case with verbal or pictorial models, which tend to rely on the interpretation of the reader or viewer. A third advantage of mathematical models is that predictions can in principle be obtained from the models in a mechanical manner that would come out the same if someone else were to start with the same model and ask the same questions. We say “in principle” because with even a very simple model, a great deal of scientific intuition and creativity can be required to decide what questions to ask using the model or what behaviors to analyze. Nevertheless, once the question has been explicitly stated, a formal model can be used to derive an answer.
Whether different people would always get the same answer is also not clear, especially when numerical solutions or computer simulations are employed. Depending on how “well-behaved” a system of equations is, the solution that one obtains could potentially depend strongly on the numerical approaches used. Finally, and as a consequence of the preceding features, mathematical models have the power to reveal unexpected behaviors or predictions that were not available to our limited intuition. This is usually when models are most valuable, for example when we construct a model that represents our pet theory for how a system operates, only to find that the modeled system is unstable or otherwise fails to work properly. This then forces us to reevaluate our assumptions and concepts. Sometimes, a model will show strange behaviors that lead to new experiments designed to detect these unexpected behaviors. This is also a highly useful role for a model, because it allows experimental science to proceed in new directions.

3.2 Types of models

Mathematics employs a wide range of structures and formalisms, which in turn lead to distinctly different flavors of mathematical models used for cell biology. Which modeling framework to use will depend on the level of abstraction at which one chooses to think about the cell. For example, one could try to model the whole cell at the atomic level, but this would be computationally difficult and possibly pointless, given the difficulty in extracting any meaningful understanding from the results. Instead, one tends to build models in which the variables of the model reflect entities at the level of abstraction of interest (molecules, organelles, cells, or tissues). In the course of building models at a given level, one ends up combining or averaging out details at finer levels. This process is known as coarse-graining. The level of coarse-graining is a key element in deciding which type of model to use. Other issues to be considered include whether one wants to represent the system as a deterministic or stochastic process, and whether spatial and temporal information needs to be accounted for. Here, we survey several broad classes of mathematical modeling frameworks, all of which are being actively applied in quantitative cell biology.

3.2.1 Ordinary Differential Equations

The type of model that most biologists are familiar with is a system of ordinary differential equations (ODEs), in which the state of a system is defined via a finite set of continuous variables, and for each state variable there is a differential equation that gives the rate of change of this variable as a function of the instantaneous values of the state variables. This type of model is the province of dynamical systems theory and is commonly employed in ecology as well as in electrical and mechanical engineering. An excellent example of the power of simple ODE models lies in the field of enzyme kinetics. Chemical rate equations are nothing more or less than ODEs that describe the rate of change of one or more chemical constituents as a function of the concentrations of the various reactants and products in the solution. When we use familiar terms like “first-order kinetics” or “second-order reaction” we are describing the structure of the differential equations in terms of how the different concentrations interact to produce the rate of reaction. ODE models thus have the great benefit of familiarity. They also are comparatively simple to understand and provide a convenient way to represent assumptions about the dynamics of a system. In many cases, the model includes not only the dynamical equations but also a set of initial conditions, and in this case the model is known as an initial value problem. Because ODE models are good at representing temporal dynamics, but do not usually include a spatial component, they have been most effectively applied to processes such as cell cycle regulation (Figure 7) in which temporal variation is the only thing that matters.

Figure 7: ODE model of the yeast cell cycle. Each plot shows the predicted level of a different cell cycle regulatory molecule as a function of time. The model used here consisted of 60 differential equations with a total of 71 parameters, which was then solved with an ODE solver in MATLAB. Reprinted Figure 3A from Barik et al. (2010), under CC-BY 3.0.

Once a set of ODEs has been determined, it can be solved either analytically or computationally. In some cases, it is possible to examine the ODEs and prove whether or not the system is stable, or capable of oscillations. This is a case where dynamical systems theory can make a strong impact in cell biology, but it tends to be easiest to analyze ODE systems when they only have a few variables. As the size of the model grows, one must rely more and more on computational solutions.
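As a small worked case of an initial value problem, the Python sketch below integrates first-order decay, dA/dt = -kA, with a classical fourth-order Runge-Kutta step and compares the result with the analytical solution A(t) = A(0) exp(-kt). The rate constant, step size, and function names are arbitrary choices made for illustration. A production model would normally rely on an established ODE solver rather than a hand-written integrator.

```python
import math

def rk4_step(f, t, y, h):
    """Advance y by one step of size h using the classical Runge-Kutta method."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h * k1 / 2)
    k3 = f(t + h / 2, y + h * k2 / 2)
    k4 = f(t + h, y + h * k3)
    return y + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6

def integrate(f, y0, t_end, h):
    """Integrate dy/dt = f(t, y) from t = 0 to t_end with fixed step h."""
    t, y = 0.0, y0
    for _ in range(round(t_end / h)):
        y = rk4_step(f, t, y, h)
        t += h
    return y

rate_constant = 0.5  # assumed first-order rate constant, per unit time

def first_order_decay(t, conc):
    """Right-hand side of dA/dt = -k * A."""
    return -rate_constant * conc

numeric = integrate(first_order_decay, y0=1.0, t_end=4.0, h=0.01)
exact = math.exp(-rate_constant * 4.0)
print(numeric, exact)
```

For a system this simple the analytical and computational routes agree closely. The computational route is the one that continues to work when the model grows to dozens of coupled equations.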

3.2.2 Partial Differential Equations

ODEs focus exclusively on change over time. But in cell biology, we are often concerned with how a quantity changes as a function not only of time but also of position in space. Partial differential equations (PDEs) allow us to represent systems in which variation over time is linked to variation over space. The classic example is diffusion, in which the rate of change of concentration at a point depends on the spatial variation in concentration in the vicinity of that point. Partial differential equations provide a straightforward way to represent such spatiotemporal variation and predict both the short-term and steady-state behavior of such a system. Importantly, the behavior of such a system depends on the assumptions that we make at the boundary of the system; hence these models are often known as boundary value problems. Boundary value problems have played a central role in understanding the formation of morphogen gradients in development (see below).
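The dependence on boundary conditions can be seen in a minimal finite-difference sketch of one-dimensional diffusion, written here in Python. It assumes a fixed concentration of 1 at the left boundary (a source) and 0 at the right boundary (a sink), and uses the simple explicit update, which is stable only when alpha = D dt/dx^2 is at most 0.5. The grid size, alpha, and the function name are illustrative assumptions.

```python
def diffuse_1d(n_points, alpha, n_steps, left=1.0, right=0.0):
    """Explicit finite-difference diffusion with fixed boundary concentrations.

    alpha is D * dt / dx**2 and must not exceed 0.5 for stability.
    """
    conc = [0.0] * n_points
    conc[0], conc[-1] = left, right
    for _ in range(n_steps):
        new = conc[:]  # boundary values are copied and never updated
        for i in range(1, n_points - 1):
            # Change is proportional to the local curvature of the profile.
            new[i] = conc[i] + alpha * (conc[i + 1] - 2 * conc[i] + conc[i - 1])
        conc = new
    return conc

profile = diffuse_1d(n_points=11, alpha=0.25, n_steps=2000)
print([round(c, 3) for c in profile])
```

With these boundary values the profile relaxes to a straight-line gradient between source and sink. Changing the boundary conditions changes the steady state, which is why the boundary assumptions are part of the model.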

3.2.3 Network Models

The numerous molecular components of a cell are often drawn in complex network diagrams, as exemplified by the metabolic networks that adorn the walls of most biochemistry labs. The real power of such network diagrams is that they can serve as a modeling framework. In a network model, one represents entities of interest (molecules, cells, etc.) as nodes in a graph, with links representing interactions between those entities. Computational methods can then be used to simulate the behavior of the network. In actuality, this often means reducing the network to a system of differential equations. Hence network models are best viewed not as a different type of model, but rather as a way to build a model schema in which many different differential equation models (with different sets of parameters) can be simultaneously visualized. Because network models are discrete models, it is possible to enumerate every possible network arrangement of a fixed number of nodes, and given the simplicity of network models it is often not too hard to simulate the performance of the network. This combination of enumerability and rapid evaluation allows optimally performing networks to be determined. An enumeration approach such as this can sometimes reveal novel or unexpected solutions to a given biological problem. One example of a network enumeration approach is given in Figure 8 (Ma et al., 2009). Another way that network models can be used is to identify network modules within larger biological network diagrams (Milo, 2002; Shen-Orr, 2002).

Figure 8: Enumerating network models of adaptation. Left panel shows the general model of three interacting nodes. Every possible pattern of connectivity among these nodes was independently simulated with a wide range of parameters determining interaction strength. Each resulting network was tested in dynamic simulations to ask whether it would yield an output that transiently responded to a step input and then adjusted back to baseline, thus showing adaptive behavior. The right panel shows the three main connectivity patterns that achieved adaptation over the widest range of connection strength parameters. Reprinted from Ma et al. (2009), with permission from Elsevier.
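The enumeration step itself is easy to sketch. The Python fragment below lists every wiring of three nodes in which each of the nine possible directed links (self-links included) is activating, absent, or inhibitory, and then keeps only the wirings in which the output node can be influenced by the input node. The node names, the sign encoding, and the reachability filter are assumptions made for this illustration; the sketch does not simulate any dynamics and is not a reproduction of the analysis in Figure 8.

```python
from itertools import product

NODES = ("A", "B", "C")  # A receives the input, C is read as the output
LINKS = [(src, dst) for src in NODES for dst in NODES]  # 9 directed links

def enumerate_topologies():
    """Yield every assignment of -1 (inhibit), 0 (absent), +1 (activate)."""
    for signs in product((-1, 0, 1), repeat=len(LINKS)):
        yield dict(zip(LINKS, signs))

def output_reachable(topology):
    """True if the input node A can influence the output node C."""
    direct = topology[("A", "C")] != 0
    via_b = topology[("A", "B")] != 0 and topology[("B", "C")] != 0
    return direct or via_b

total = 0
connected = 0
for topology in enumerate_topologies():
    total += 1
    if output_reachable(topology):
        connected += 1

print(total, connected)  # 19683 16038
```

Each surviving wiring would then be converted to differential equations and simulated over many parameter sets, which is where nearly all of the computational effort in such a study goes.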

3.2.4 Rule-Based Modeling

Network models discussed in the last section can be viewed as a way to represent an underlying system of differential equations in a more intuitive form that better reflects the way biochemists think about molecular interactions than would a set of equations written out on paper. This works because the two things that matter in biochemistry (the molecules present and how they interact) are directly reflected in the structure of the network diagram. In some cases, however, it may be more intuitive to represent the behavior of a system in terms of rules that describe what the parts of the system do (Sekar and Faeder, 2012). The rules can be specified in various programming languages, and give a higher degree of flexibility than a network model. Moreover, because the rules themselves specify how the system evolves over time, we are starting to step away from differential equations as the primary framework. In rule-based modeling, algorithms become the framework. This puts modeling into a regime that computer science can engage with. A number of rule-based modeling systems are currently available. For example, PySB is a rule-based modeling platform in which rules are expressed using Python code (Lopez et al., 2013). Although some rule-based methods like PySB use standard programming languages, in other cases the rules are expressed in custom-designed languages for representing biological interactions, one prominent example being BioNetGen (Faeder et al., 2009).

3.2.5 Agent-Based Modeling

One specific variant of rule-based modeling is a strategy for building models in which the behaviors of individual components of the system are independently simulated. The components in such a model are called “agents,” and the strategy is thus known as “agent-based modeling.” Agent-based modeling is a natural framework for thinking about populations of cells, because each cell becomes an “agent,” but the approach can also be applied at a sub-cellular level, in which case the agents are organelles or molecular structures. For example, a collection of microtubules can be represented by simulating the motion of each microtubule individually (Loughlin et al., 2010).
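A minimal agent-based sketch in Python is given below. Each agent is a hypothetical cell that follows one rule, taking a unit step in a random direction at every time step, and the population-level behavior is obtained only by simulating every agent. The class name, the rule, and the population size are assumptions chosen for illustration and are far simpler than the published models cited above.

```python
import math
import random

class CellAgent:
    """One agent: a cell that takes a fixed-size step in a random direction."""

    def __init__(self):
        self.x = 0.0
        self.y = 0.0

    def step(self, rng, step_size=1.0):
        angle = rng.uniform(0.0, 2.0 * math.pi)
        self.x += step_size * math.cos(angle)
        self.y += step_size * math.sin(angle)

def simulate_population(n_agents, n_steps, seed=0):
    """Advance every agent independently for n_steps time steps."""
    rng = random.Random(seed)
    agents = [CellAgent() for _ in range(n_agents)]
    for _ in range(n_steps):
        for agent in agents:
            agent.step(rng)
    return agents

population = simulate_population(n_agents=500, n_steps=100)
mean_sq_distance = sum(a.x ** 2 + a.y ** 2 for a in population) / len(population)
print(mean_sq_distance)  # expected to be near n_steps for unit steps
```

Richer behavior is added by changing the rule inside `step`, for example by making the direction depend on neighboring agents, without changing the surrounding simulation loop.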

3.2.6 Stochastic Modeling

Two features of processes at the cellular scale make cells different from macroscopic systems. One is that the numbers of molecules involved are small, such that random variation from one cell to another in copy number can have a large effect on function. The other is that interactions between proteins typically involve energies comparable to the thermal energy (kT), such that all pathways in the cell are constantly fluctuating. For this reason, cellular information processing is noisy, and a truly accurate representation of what is going on inside a cell requires ways to model the stochastic nature of biology at this size scale. The theory of stochastic processes provides an important set of analytical tools, but often can only be applied when the system is simplified. For predicting behavior of complex systems involving many different interactions, numerical methods have been developed that can greatly speed up the simulation process (Gillespie, 2007). Developing better and more efficient stochastic simulation methods remains an ongoing area of research in numerical methods. Figure 9 depicts an example of stochastic simulations of a biochemical oscillator using two different modeling methods.
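The exact stochastic simulation approach can be sketched for the simplest possible case, a molecule that is produced at a constant rate and degraded in proportion to its copy number. The Python code below follows the direct form of the Gillespie method: draw an exponentially distributed waiting time from the total reaction rate, then choose which reaction fires in proportion to its rate. The rate values, function names, and the birth-death system itself are illustrative assumptions and are much simpler than the oscillator shown in Figure 9.

```python
import random

def gillespie_birth_death(k_make, k_decay, t_end, n0=0, seed=0):
    """Exact stochastic simulation of production (k_make) and decay (k_decay * n)."""
    rng = random.Random(seed)
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        rate_make = k_make
        rate_decay = k_decay * n
        rate_total = rate_make + rate_decay
        t += rng.expovariate(rate_total)  # waiting time to the next reaction
        if t > t_end:
            break
        # Pick the reaction with probability proportional to its rate.
        if rng.random() * rate_total < rate_make:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts

def time_average(times, counts, t_end):
    """Average copy number, weighting each value by how long it persisted."""
    total = 0.0
    for i, n in enumerate(counts):
        t_next = times[i + 1] if i + 1 < len(times) else t_end
        total += n * (t_next - times[i])
    return total / (t_end - times[0])

times, counts = gillespie_birth_death(k_make=10.0, k_decay=1.0, t_end=200.0)
average_copies = time_average(times, counts, t_end=200.0)
print(average_copies)  # fluctuates around k_make / k_decay = 10
```

The deterministic ODE for the same system would settle at exactly k_make / k_decay. The stochastic trajectory instead keeps fluctuating around that value, and the fluctuations are largest relative to the mean when copy numbers are small.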

Figure 9: Stochastic simulation of a two-gene transcriptional oscillator system (Vilar et al., 2002). The top plot shows an exact solution calculated using the Gillespie method, whereas the lower plot shows a solution using an improved method that achieves higher speed by restricting the reactions that are considered at each step. Figure reprinted from Adalsteinsson et al. (2004), under CC-BY 4.0 license.

3.3 What to do with models once you have them

Development of a model is seldom an aim in itself. Rather, the model serves as a basis for testing whether our assumptions are in principle able to explain an observation. But often, it is possible to go further and use models to design new experiments and to gain new insights into how a system functions.

3.3.1 Testing Theories

Models can play a key role in testing theories. By theory, we mean any conceptual explanation for a biological phenomenon. Theories can vary in terms of the scale of description and the degree to which assumptions are made explicit. One power of models, as discussed previously, is the way models make our assumptions explicit. In some cases, the act of assessing assumptions during the construction of a model can be enough to suggest that the theory may be wrong (e.g., if it requires some unrealistic set of assumptions). On the other hand, if a model does give the expected qualitative behavior, then we are able to conclude that a given conceptual model and set of assumptions is, at least in principle, sufficient to explain the behavior. In this way, the plausibility of a conceptual idea can be confirmed. A classic example of using a mathematical model to prove sufficiency of a conceptual model is the analysis of the cell cycle by Tyson and Novak (2011), who showed that phosphorylation networks are, by themselves, sufficient to account for cell cycle transitions. It is much harder to prove that a conceptual model is not plausible, since one can always argue that some subtle deficiency in the detailed assumptions caused the model to fail.

Another way to test a theory is to compare the predictions of the theory with the results of experiments. When a theory makes a very dramatic qualitative prediction, it can be easy to compare prediction with results in a qualitative way. But, in other cases, it may not be obvious what a theory predicts for a given experiment. In such cases, models can be required to make concrete predictions of numerical outcomes, which can then be compared with measurements (Howard, 2014). For this purpose, both analytical models and simulations can be useful for making quantitative predictions (Howard, 2014).

3.3.2 Regression and Parameter Estimation

As discussed above in Chapter 2, one key task in quantitative cell biology is to estimate derived quantities, that is, numerical values that are important for developing and testing biological hypotheses but which cannot be measured directly. Modeling has an important part to play in this approach because models can be fit to numerical data in order to obtain optimal estimates for unknown parameters. Fitting models to data is not a trivial task, however, and if the wrong criteria are used for goodness of fit, it is possible to be fooled into thinking that one has estimated a parameter correctly. Hence, formal methods of data regression need to be used to ensure reliable parameter estimates (Jaqaman and Danuser, 2006). Figure 10 illustrates two potential concerns of model fitting. Despite these concerns, estimation of unknown parameters is a major application of mathematical models.

Figure 10: Potential challenges to model fitting. (A) Although some models clearly do not fit a given data set (red line), in other cases more than one model may give equally good fits (green and orange lines). In such cases, models with fewer parameters should be favored. (B) When available data only spans a limited range of values, it can be difficult to discriminate between models that show similar behavior in that limited range. Reprinted by permission from Macmillan Publishers Ltd: Jaqaman et al. (2006b).
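The limited-range problem can be reproduced with a few lines of Python. In the sketch below, noise-free data are generated from a saturating curve of Michaelis-Menten form and then fit with a straight line by ordinary least squares, once over a narrow range of the independent variable and once over a wide range. The curve, its parameter values, the sampling ranges, and the function names are all hypothetical choices for illustration.

```python
def saturating(x, vmax=1.0, km=1.0):
    """Hypothetical 'true' process: a hyperbolic saturation curve."""
    return vmax * x / (km + x)

def linear_fit_rss(xs, ys):
    """Fit y = intercept + slope * x by least squares; return the residual sum of squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

narrow_x = [0.02 * i for i in range(1, 11)]  # 0.02 to 0.2, well below km
wide_x = [0.5 * i for i in range(1, 11)]     # 0.5 to 5.0, spanning saturation

rss_narrow = linear_fit_rss(narrow_x, [saturating(x) for x in narrow_x])
rss_wide = linear_fit_rss(wide_x, [saturating(x) for x in wide_x])
print(rss_narrow, rss_wide)
```

Over the narrow range the wrong (linear) model leaves almost no residual, so realistic measurement noise would easily hide the difference between the two models. Over the wide range the misfit is far larger and the linear model can be rejected.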

3.3.3 Proofs About Extreme Behaviors

One of the things that makes math different from science is that in math, it is possible to prove absolute truth and falsehood, at least in many cases. This can allow mathematical models to make predictions about the possible range of system behaviors. For example, it may be possible to prove that no matter what parameter values are used, a system will never show stable oscillations, or will always converge to a single fixed point, and so on. In some cases, bounds can be put on possible values of a system variable. Such proofs are to be distinguished from simply running a large number of simulations and noting the range of behaviors. This latter approach is far less satisfactory since it leaves open the possibility that some other combination of parameters might give a completely different behavior.
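A one-variable case shows what such a proof looks like. Consider a hypothetical species produced at a constant rate and degraded in proportion to its concentration; the symbols below (x for concentration, alpha for the production rate, beta for the degradation rate constant) are introduced only for this illustration. Shifting the variable by the fixed point turns the equation into pure exponential decay, so the conclusion holds for every positive parameter value and every starting concentration, with no simulation required.

```latex
\begin{align}
\frac{dx}{dt} &= \alpha - \beta x, \qquad \alpha > 0,\ \beta > 0, \\
x^{*} &= \frac{\alpha}{\beta}, \qquad u(t) \equiv x(t) - x^{*}, \\
\frac{du}{dt} &= -\beta u \;\Longrightarrow\; u(t) = u(0)\, e^{-\beta t}, \\
\lim_{t \to \infty} x(t) &= \frac{\alpha}{\beta}
\quad \text{for every } \alpha, \beta > 0 \text{ and every } x(0).
\end{align}
```

Because u(t) never changes sign, the approach to the fixed point is monotonic, so this system can neither oscillate nor settle anywhere else, whatever parameter values are chosen.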

3.3.4 Establishing Equivalence with Known Systems One of the best ways to use a model is to show that it is, in some sense, formally equivalent to another model that is already well understood. This is useful because the better-understood model, with its long history of study, is likely to convey a conceptual grasp of the problem. Finding that a new system falls into the same class of models means that those insights can automatically carry over to the system under study. One productive example is the recognition that certain models of planar cell polarity end up being equivalent to the Ising model of ferromagnetism (Burak and Shraiman, 2009). The equivalence is established by giving each cell an orientation vector that corresponds to the magnetic dipole orientation of an element in an Ising lattice (Figure 11). This observation holds out the hope that some of the large body of theory that has developed around the Ising model may eventually lead to new ways to think about developmental biology, for example by encouraging researchers to explore how short-range interactions lead to long-range order, which was the main conceptual result of the Ising model. In addition to being able to ask similar questions, recognition of correspondence between models means that one can even implement similar approaches. For example, it is common to define an “order parameter” as a function of the combined orientations of units in the lattice, and then to analyze phase transitions by asking about continuity and derivatives of the order parameter as the system changes from disordered to ordered. The same equations can be used to analyze planar polarity in essentially the same way (Burak and Shraiman, 2009).

Figure 11: Representing planar polarity using a dipole moment, allowing it to be conceptualized with an Ising-like model. Arrows indicate the orientation of individual cells in a lattice analogous to the dipole moments of elements in an Ising lattice. Figure reprinted under CC-BY 4.0 license from Burak and Shraiman (2009).
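The idea of an order parameter can be made concrete with a minimal sketch. The code below assumes one common choice, the magnitude of the mean orientation vector of all cells; this is an illustrative assumption and not necessarily the definition used by Burak and Shraiman (2009). The function name and the test angles are hypothetical.

```python
import cmath
import math

def polar_order_parameter(angles):
    """Magnitude of the mean unit vector exp(i*theta) over all cells.

    Equals 1 when every cell points the same way and approaches 0
    when orientations cancel each other out.
    """
    if not angles:
        raise ValueError("need at least one orientation angle")
    mean_vector = sum(cmath.exp(1j * theta) for theta in angles) / len(angles)
    return abs(mean_vector)

# Hypothetical tissues: one fully aligned, one with evenly spread orientations.
aligned = [0.3] * 100
spread = [2 * math.pi * k / 100 for k in range(100)]

order_aligned = polar_order_parameter(aligned)
order_spread = polar_order_parameter(spread)
```

Tracking how such a quantity changes as a coupling strength or noise level is varied is what allows the transition from disordered to ordered tissue to be analyzed in the same way as a phase transition.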

3.3.5 From Models to Design Tools Models have long held a special place in engineering. What makes engineering different from tinkering is the use of models as design tools. Once the components of a system, such as transistors or logic gates, are represented by mathematical models, it becomes possible to use those models to guide the design process. For example, models can be used to explore component values, such as resistances or capacitances, to find the values that give a desired behavior. Models used in this way range from simple graphical estimates of transistor operating curves to highly complex computational models. Synthetic biology has long relied on trial and error, but as we learn to understand and predict cellular systems at a formal level, it will increasingly be possible to use mathematical models of molecular and cellular systems to design engineered cells. Currently, this approach is being used for genetic circuit design (Nielsen et al., 2016), but if the same overall idea of building design tools from models can be adapted to cellular structures, it should be possible to create a true computer-aided design (CAD) system for cell biology.
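The engineering use of a model as a design tool can be sketched with the simplest possible case. The example below assumes a first-order RC low-pass filter, whose cutoff frequency is 1/(2πRC), and searches a short list of hypothetical component values for the pair that comes closest to a target. The function names, the component lists, and the target are all illustrative assumptions.

```python
import math
from itertools import product

def cutoff_hz(r_ohm, c_farad):
    """Cutoff frequency of a first-order RC low-pass filter."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_farad)

def best_rc(target_hz, resistors, capacitors):
    """Return the (R, C) pair whose model-predicted cutoff is closest to the target."""
    return min(product(resistors, capacitors),
               key=lambda rc: abs(cutoff_hz(*rc) - target_hz))

# Hypothetical parts bin (ohms and farads) and design target.
resistors = [1e3, 2.2e3, 4.7e3, 10e3]
capacitors = [10e-9, 22e-9, 47e-9, 100e-9]
r_best, c_best = best_rc(1000.0, resistors, capacitors)
predicted = cutoff_hz(r_best, c_best)
```

The design loop never touches hardware: the model stands in for the circuit while candidate values are explored, which is the role one would hope models of cellular components could play in a CAD system for cells.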

Chapter 4: Examples of Quantitative Cell Biology

Models and quantitative measurements have been combined to analyze many different aspects of cell biology, including cellular decision-making (Atay and Skotheim, 2014), cell motility (Keren et al., 2008), and the establishment of cell polarity (Altschuler et al., 2008; Kravtsova and Dawes, 2014). In this section, we will briefly review several examples where quantitative approaches have led to new conceptual insight into important cell biological processes.

4.1 Counting molecules in the kinetochore

The kinetochore is a complex protein machine that lies at the heart of chromosome segregation, cell division, and the mitotic checkpoint. Understanding how the various kinetochore proteins work together to properly segregate chromosomes has long been a holy grail of cell biology, but gaining a mechanistic understanding of such a complex machine requires, among other things, that we know how many copies of each protein are present in the complex at different points in time. Methods have been developed for counting the number of proteins in various cellular structures (for a more extensive introduction to methods for counting molecules, refer to the volume by Coffman and Wu in this series), and these methods have been productively applied to kinetochores (Figure 12). These numerical results allow kinetochore function to be modeled more realistically, and also have direct implications for possible mechanisms. For example, by comparing the number of copies of a particular protein in a kinetochore with the number of kinetochore microtubules, it is possible to envision different arrangements of interactions between those proteins and the set of microtubules converging on the kinetochore. The ability to count molecules also allows mutations to be classified based on potential effects on kinetochore structure and allows the dynamics of kinetochore protein recruitment to be quantified.


Figure 12: Counting kinetochore proteins. Graph shows the fluorescent signal for two different yeast kinetochore proteins, plus the signal from cells in which two different kinetochore proteins, of equal copy number, are simultaneously labeled. The fact that the average intensity doubles in this case compared to when just one protein is tagged strongly supports the idea that fluorescence intensity is directly proportional to copy number. Figure reprinted from Joglekar et al. (2008), with permission from Elsevier.
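If fluorescence intensity is proportional to copy number, as the doubling in Figure 12 suggests, then an unknown copy number can be estimated by ratio against a standard of known copy number. The sketch below shows only that arithmetic; the intensity values and the copy number of the standard are made up for illustration, and real measurements would first need background subtraction and other corrections.

```python
from statistics import mean

def copies_from_reference(sample_intensities, reference_intensities, reference_copies):
    """Estimate copy number, assuming intensity is proportional to copy number."""
    intensity_per_copy = mean(reference_intensities) / reference_copies
    return mean(sample_intensities) / intensity_per_copy

# Made-up, background-subtracted intensities in arbitrary units.
reference = [98.0, 102.0, 100.0]   # hypothetical standard assumed to have 2 copies
sample = [395.0, 405.0, 400.0]     # protein of unknown copy number

estimate = copies_from_reference(sample, reference, reference_copies=2)
```

The estimate is only as good as the proportionality assumption and the knowledge of the standard, which is why a check like the two-protein labeling experiment in Figure 12 matters.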

4.2 Modeling cytokinesis

Cytokinesis is a dramatic example of an emergent behavior. Actin filaments and myosin motors work together with membrane dynamics pathways to pinch a cell into two daughter cells, somehow maintaining the integrity of the cell membrane. To do this, the cell must assemble an actin-myosin contractile ring, and this ring has to be positioned in the correct place relative to the mitotic spindle, so that the two daughter nuclei end up in the two daughter cells. The mechanisms of contractile ring assembly and placement have been extensively studied by experimental methods including both genetic analysis and physical perturbation (Green et al., 2012). A number of careful quantitative microscopy studies have provided a wealth of detailed information about the process of cytokinetic ring assembly, including counting the numbers of key molecular players and measuring their recruitment as a function of time (Wu and Pollard, 2005). With this much quantitative, physical, and molecular information in hand, the time is ripe to develop a truly mechanistic understanding of how cytokinesis works, and the only way to test whether a proposed mechanism can explain the complex process is to use models. In fission yeast, the observation that ring formation begins as a series of myosin “nodes” which then coalesce into the ring (Vavylonis et al., 2008) has provided the basis for a series of models based on mutual attraction of nodes through actin filaments, with the nodes ultimately pulled into a ring as the only stable solution (Bidone et al., 2014). Figure 13 shows several timepoints from an agent-based simulation of this node-attraction model. Fission yeast is a particularly simple case for modeling cytokinesis given the cylindrical cell shape, and now the challenge, in addition to testing the existing models, is developing models that can account for other cell types. A particular challenge is to model formation of actin rings that are not closed rings, such as those seen in comb jellies, since in those cases, the natural stability of a closed ring around a cylinder can no longer be relied on as an organizing principle.

Figure 13: Agent-based model of cytokinetic ring assembly in fission yeast. Cortical myosin nodes are plotted in red, actin filaments in green. Figure reprinted from Bidone et al. (2014), with permission of Elsevier.
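To give a feel for what an agent-based node-attraction model involves, here is a deliberately crude toy, not the published model of Vavylonis et al. (2008) or Bidone et al. (2014). It tracks only the position of each node along the long axis of the cell and assumes that any two nodes closer than a capture radius pull each other a fixed fraction of the way together. All names and parameter values are hypothetical.

```python
import random

def simulate_nodes(n_nodes=60, band_width=4.0, capture_radius=1.0,
                   pull_fraction=0.1, n_steps=5000, seed=0):
    """Toy 1D node attraction; returns (initial, final) axial positions."""
    rng = random.Random(seed)
    start = [rng.uniform(-band_width / 2, band_width / 2) for _ in range(n_nodes)]
    x = list(start)
    for _ in range(n_steps):
        i, j = rng.sample(range(n_nodes), 2)   # pick a random pair of nodes
        gap = x[j] - x[i]
        if abs(gap) <= capture_radius:         # only nearby nodes get connected
            x[i] += pull_fraction * gap        # each node moves toward the other
            x[j] -= pull_fraction * gap
    return start, x

def spread(positions):
    """Width of the band occupied by the nodes along the cell axis."""
    return max(positions) - min(positions)

start, end = simulate_nodes()
```

Even in this toy, the outcome depends on the parameters: a short capture radius can leave several separate clumps instead of one narrow band, which hints at why quantitative measurements are needed to constrain such models.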

4.3 Understanding forces in endocytosis

Endocytosis, the pinching off of membrane vesicles from the cell surface, is a geometrically interesting phenomenon that changes the topology of the plasma membrane, converting a single closed surface into two nested surfaces. Pinching off a vesicle from the membrane is an inherently mechanical process, but at the same time is linked to complex regulatory molecular pathways, and the big challenge is understanding how the molecular players drive the mechanics (Figure 14), which in turn has led to many questions about what forces make the process happen. Endocytosis is known to require actin, but the role that actin plays has not been entirely explained. For example, actin might serve as a scaffold to recruit other molecules. One possibility that has been considered is that actin might be involved in generating a force that would push membrane in from the surface, but another possibility is that membrane curvature changes could be driven by binding of coat proteins and/or by alterations in membrane lipid composition. Figure 15 shows results from an ODE model of endocytosis that takes into account coupling between membrane curvature and enzymatic activity of key endocytotic regulatory molecules (Liu et al., 2009). The ability to integrate the physical properties of membranes with the enzymatic properties of regulatory molecules illustrates the power of mathematical modeling in tying together different aspects of a complex system.

Figure 14: Relating mechanical events of endocytosis to molecular regulatory elements. Diagram emphasizes the importance of switching between positive and negative membrane curvature during the budding of a vesicle, and defines a set of geometric descriptors that can be used for modeling work. Figure reprinted from Liu et al. (2009). CC-BY 4.0.

Figure 15: Results of a mechanochemical model for endocytosis plotting recruitment of key molecular elements (red, blue, and green) along with the progression of the tip of the bud away from the cell surface (black) over time. The diagrams below the graph show how the shape changes during the course of the simulation. Figure reprinted from Liu et al. (2009). CC-BY 4.0.


In parallel with development of such mathematical models, quantitative image analysis has opened a new window into endocytosis, for example by revealing the relative timing of key steps in the process, allowing them to be ordered in a logical sequence (Figure 16), and by defining the distinct roles of apparently similar molecules in controlling different aspects of the dynamics and timing of endocytosis (Figure 17).

Figure 16: Rapid time-lapse imaging of individual endocytic events, using green fluorescent protein-tagged proteins, allows the order of recruitment to be determined. By using two tagged proteins, it is possible to use one as a reference standard and align the recruitment timing of other key proteins based on this reference standard. Figure reprinted from Stimpson et al. (2009). CC-BY 3.0.

Figure 17: Alteration in the timing of actin (top row) and capping protein (bottom row) in the aip1 deletion mutant (right column) compared to wild-type cells (left column). Fimbrin (green) is used as a reference standard to align the timing of actin and capping protein recruitment. This type of analysis can then be used to detect subtle alterations in timing as shown here. Figure reprinted from Berro and Pollard (2014). CC-BY 3.0.
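The reference-standard alignment described in Figures 16 and 17 can be sketched as follows. The code assumes that each event is recorded in two channels and that the time axis is shifted so that the reference channel peaks at time zero. Aligning on the peak is an assumption made for illustration, since published analyses may align on other features of the reference trace, and the traces below are made-up numbers.

```python
def peak_time(times, signal):
    """Time at which a signal reaches its maximum."""
    peak_index = max(range(len(signal)), key=signal.__getitem__)
    return times[peak_index]

def align_to_reference(times, reference_signal):
    """Shift the time axis so that the reference channel peaks at t = 0."""
    t_peak = peak_time(times, reference_signal)
    return [t - t_peak for t in times]

# Two hypothetical endocytic events: (times, reference channel, protein of interest).
event_a = ([0, 1, 2, 3, 4], [1, 5, 9, 4, 1], [0, 1, 3, 8, 2])
event_b = ([10, 11, 12, 13, 14], [5, 9, 4, 1, 0], [1, 3, 8, 2, 0])

lags = []
for times, reference, target in (event_a, event_b):
    shifted = align_to_reference(times, reference)
    lags.append(peak_time(shifted, target))   # delay of the target relative to the reference
```

Once every event is expressed on the shared reference clock, traces from different events and different protein pairs can be averaged and compared, which is what makes subtle timing shifts in mutants detectable.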


Chapter 5: Frontiers in Quantitative Cell Biology

5.1 New numerical methods

Cell biology entails the interaction of huge numbers of molecules, interactions that take place at many different length scales. In some cases, key events can be understood in purely biochemical terms, allowing methods of chemical kinetics to be brought to bear. But in other cases, the cell is better thought of in terms of continuum mechanics, for example by considering the cytoplasm as a viscoelastic medium. Because cells are in constant motion, they can in some sense be viewed as droplets of active fluids. For this reason, fluid dynamics represents a viable framework for modeling and analyzing cellular properties and behavior. But fluid dynamics also faces similar challenges to cell biology, in terms of having to deal with large (relative to molecular scale) systems that show dynamic behavior. Development of methods for computational fluid dynamics is currently an important research area in applied math, and one that holds tremendous potential for quantitative cell biology. Moreover, this is one area in which the challenges of cell biology can serve as novel problems for development of new mathematical methods.

One example of a mathematical tool under development for fluid dynamics, and which is now being applied to cells, is the level set method (LSM). In the level set method, a surface in n dimensions is represented using a function of n variables, such that the surface of an object (like a liquid droplet, bubble, or cell) is defined by all sets of coordinates for which the function has a particular constant value. This is where the method gets its name, since a level set refers to the set of points for which a function has a single value. If the function is created in an appropriate way, then level sets of the same function for different values can be used to predict how the surface will evolve over time. One advantage of this approach, compared with more traditional approaches of representing a surface as a linked set of points (like beads on a string), is that it automatically prevents crossing of surface points. Thus, an initial investment in constructing a function for the level set method ends up making simulations more robust. Figure 18 shows an example of the level set method applied to modeling cell motility. Computational fluid dynamics is one example of new numerical methods, developed for other fields, that are starting to make a difference in how we analyze living cells. Other examples include new methods for visualizing high dimensional data sets (van der Maaten and Hinton, 2008), methods for solving and simplifying complex models (Soliman et al., 2014), and algorithmic methods for assessing global stability of complex nonlinear systems (El-Samad et al., 2006).

Figure 18: Simulation of cell motility using level set formalism. The level set method was used to extend cell boundaries in a manner that avoids self-crossing of the boundary, while taking into account physical parameters derived from independent measurements. Different color lines indicate predicted cell contours as the cell polarizes and moves towards a source of chemoattractant on the right. Figure reprinted from Yang et al. (2008). CC-BY 4.0.
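The level set idea can be illustrated with the simplest case that has a closed-form answer. The sketch below assumes a circular "cell" whose boundary expands outward at a constant normal speed, with the level set function chosen as the signed distance to the initial boundary. Under those assumptions the boundary at time t is just the level set of the initial function at the value speed × t. Real cell motility models use position-dependent speeds and solve the evolution equation numerically; the names and numbers here are hypothetical.

```python
import math

def phi0(x, y, radius=1.0):
    """Signed distance to a circle: negative inside, zero on the boundary, positive outside."""
    return math.hypot(x, y) - radius

def phi(x, y, t, speed=0.5):
    """Level set function at time t for outward motion at constant normal speed."""
    return phi0(x, y) - speed * t

def boundary_radius(t, n_samples=2000, r_max=10.0):
    """Find where the zero level set crosses the +x axis by scanning for a sign change."""
    dr = r_max / n_samples
    for k in range(n_samples):
        r_lo, r_hi = k * dr, (k + 1) * dr
        if phi(r_lo, 0.0, t) <= 0.0 < phi(r_hi, 0.0, t):
            return 0.5 * (r_lo + r_hi)
    raise ValueError("no boundary found within r_max")

radius_at_start = boundary_radius(0.0)
radius_later = boundary_radius(2.0)
```

The boundary is never stored as a list of points. It is recovered on demand as the set where the function is zero, which is why boundary points cannot cross each other in this representation.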

5.2 Multiscale modeling of cancer

One of the key frontiers of cell biology in general is understanding how cellular behaviors drive tissue and organ scale behaviors. This has led to increasing interest in multiscale modeling, with cancer as an obvious “killer app.” Despite the enormous effort and resources spent on studying the molecular basis of cancer biology, the disease as a whole remains somewhat mysterious and hard to treat, in part because it is so complex and relies on interactions of processes that take place at many different scales. Although some of the features of cancer cells, such as resistance to apoptosis, can be understood on a single-cell level, other behaviors such as invasion and tumor morphology are emergent properties that result from collective behaviors of large numbers of cells (Figure 19). Another example is vascularization of tumors, which involves population-scale interactions between large numbers of tumor and host cells, and for which differences in behavior can have strong impacts on the outcome of treatment (Scott et al., 2016). Even cell-autonomous alterations in cancer cells result from complex network dynamics of cell regulatory pathways, and understanding these systems can provide new avenues for therapy (Bagheri et al., 2011). Coping with this type of complexity will require computational approaches to quantify behaviors and construct predictive models.


Figure 19: Computational modeling of tumor development. In this simulation, as cells proliferate the cell mass becomes mechanically unstable, leading to the outline of the initial mass breaking up into fingers which invade the surrounding tissue. In this type of system, invasivity is recognized as an emergent property of collective cell behaviors. Figure reprinted by permission from Macmillan Publishers Ltd: Anderson and Quaranta (2008).

5.3 Measuring and modeling developmental biology

Another example of an inherently multi-scale phenomenon in biology is development. The development of a multicellular organism involves molecular switches and signaling pathways at the smallest scale, and tissue and organ mechanics such as buckling (Savin et al., 2011) at the largest scale. In between, cell behaviors including oriented cell division (Tang et al., 2011; Mao et al., 2013), shape changes (Kam et al., 1991), migration, and cell–cell interactions play key roles. Putting all of these processes together will require multiscale modeling approaches. Moreover, although much research has focused on genes that control development, an embryo is a physical object, and much of the actual shape development is driven by physical processes, hence methods to integrate physical measurements and models become paramount.

One area in which quantitative measurements and mathematical modeling have had a major impact is the question of how positional information is transduced into gene expression within a developing embryo or tissue. In an early theoretical analysis of pattern formation, Alan Turing coined the term “morphogen” to describe a molecule whose concentration determined cell identity in a developing pattern (Turing, 1952). Such morphogens were subsequently discovered by developmental biologists in the form of positional determinant molecules such as bicoid, which visibly form concentration gradients spanning a tissue or embryo, and it has been proposed that this information is used by cells to determine their position relative to the source of the morphogen. However, several questions have remained controversial concerning this model. First, it is unclear whether the gradients are set up by pure diffusion or by some mixture of non-diffusive mechanisms (Lander et al., 2002). In many cases this diffusion can show anomalous features due to the complex geometry of the extracellular space through which the morphogens must move (Mueller et al., 2013). For intracellular morphogens, the motion and distribution of both the morphogen protein itself and the mRNA that encodes it appear to play a role in shaping the final protein gradient (Little et al., 2011). It is thus necessary to quantify the distribution of both the morphogen protein and the mRNA that encodes it (Figure 20). The second major question is how the morphogen gradient is interpreted. Careful quantitative measurements combined with information theory-based analysis have shown that virtually all of the theoretically available information in the bicoid gradient is extracted to give fine control of target gene expression levels (Gregor et al., 2007; Tkacik et al., 2008). One interesting suggestion has been that the gradient might be interpreted not just once it has reached steady state, but also at earlier stages while it is still forming (Bergmann et al., 2007). The mechanisms of gradient generation and interpretation remain at the forefront of quantitative developmental cell biology.

Figure 20: Quantification of bicoid mRNA contributing to the bicoid morphogen gradient in Drosophila oocytes using single molecule fluorescence in situ hybridization. Plot depicts distribution of bicoid mRNA particle intensity as a function of normalized position along the anterior-posterior axis. Different colors indicate different stages of embryonic development. Figure reprinted from Little et al. (2011). CC-BY 4.0.
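As a point of reference for the questions above, the simplest textbook picture of a morphogen gradient can be written down in a few lines. The sketch assumes production at one end, diffusion with coefficient D, and uniform first-order degradation at rate k, which at steady state gives an exponential profile with decay length sqrt(D/k). The text stresses that it is controversial whether real gradients form this way, so this is only the baseline model, and the parameter values are made up.

```python
import math

def decay_length(diffusion_coeff, degradation_rate):
    """Length scale of the steady-state gradient, lambda = sqrt(D / k)."""
    return math.sqrt(diffusion_coeff / degradation_rate)

def concentration(x, c0, lam):
    """Steady-state concentration at distance x from the source."""
    return c0 * math.exp(-x / lam)

def decode_position(c, c0, lam):
    """Position implied by a concentration reading, obtained by inverting the profile."""
    return -lam * math.log(c / c0)

# Hypothetical parameters in arbitrary but consistent units.
lam = decay_length(diffusion_coeff=1.0, degradation_rate=0.01)
reading = concentration(25.0, c0=1.0, lam=lam)
inferred_x = decode_position(reading, c0=1.0, lam=lam)
```

Inverting the profile in this way is the idealized version of "reading" positional information. Noise in the concentration reading translates into positional error, which is the quantity the information-theoretic analyses cited above set out to bound.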

Chapter 6: How to Get Started in Quantitative Cell Biology

6.1 Prerequisites

The most important prerequisite is to identify a cell biological problem for which the quantitative viewpoint is likely to yield new insights. This will generally mean a problem about which something is already known at the molecular level, so that tools are available to perturb and visualize the behavior of the system. Understanding whether or not a problem has been developed to this level will necessarily require some knowledge of cell biology. It is thus necessary for anyone wanting to get involved with quantitative cell biology to learn cell biology, at least to a certain level of detail.

Second, given the importance of imaging and image analysis as a central tool for quantitative measurements in cells, it is also a virtual necessity to learn microscopy techniques. Again, this can be taken to different levels of sophistication, and the existence of many different microscopy methods makes it hard to be an expert in everything. But it is imperative to learn at least one form of microscopy that can be used to obtain quantitative measurements through analysis of images.

Third, since quantitative cell biology makes extensive use of mathematical models, one should develop basic proficiency in at least one style of model building. The vast majority of mathematical modeling in biology ends up using differential equations, so a little training in that area is highly desirable. Differential equations can be studied at many levels, but at bare minimum, the researcher interested in quantitative cell biology should be able to look at a differential equation and understand what it means, even if solving it might be too hard. This should not be too daunting a task for anyone with a background in chemistry or biochemistry, because they will already have been exposed to reaction kinetics. To go a step further, it is worth learning how to solve the most basic of differential equations, namely first-order processes such as radioactive decay or exponential growth of bacterial cultures. Such equations are commonly discussed in introductory calculus textbooks, and come up all the time in cell biology. Another basic aspect of differential equations in which a small investment of study will yield a large payoff in comprehension is learning how to solve for the steady-state solution of a simple differential equation by setting the derivative to zero. In many cases, this is the main result of a mathematical model in biology, and anyone can learn how to find such solutions even if they have forgotten (or never learned) calculus. For biologists who want to learn more about using and solving differential equations, an excellent book based on cellular and biochemical examples is Segal (1987).

Another fundamental area of math that arises a lot in quantitative cell biology is probability. Again, probability theory is a vast subject that can be approached at many levels of formalism, but it would be useful for anyone who wants to study cells quantitatively to have at least some idea of how to calculate joint probabilities for two independent events which occur with probability p1 and p2, respectively. This level of probability, which is covered in introductory textbooks, is already enough to help the researcher comprehend much of the literature in quantitative cell biology. An excellent textbook on modeling cells using probabilistic methods, geared towards students with little formal training in probability, is Nelson (2014).
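The two differential-equation skills recommended above, solving a first-order process and finding a steady state by setting the derivative to zero, can both be seen in one small sketch. It assumes a hypothetical molecule produced at a constant rate and degraded in proportion to its amount, dx/dt = production − decay·x. Setting dx/dt = 0 gives the steady state x* = production/decay, and the exact solution relaxes exponentially towards it. The function names and parameter values are illustrative assumptions.

```python
import math

def steady_state(production, decay):
    """Solve 0 = production - decay * x for x."""
    return production / decay

def exact_solution(t, x0, production, decay):
    """Closed-form solution of dx/dt = production - decay * x with x(0) = x0."""
    x_star = steady_state(production, decay)
    return x_star + (x0 - x_star) * math.exp(-decay * t)

def euler(x0, production, decay, t_end, dt=1e-3):
    """Step the same equation forward numerically with the forward Euler method."""
    x = x0
    for _ in range(int(round(t_end / dt))):
        x += dt * (production - decay * x)
    return x

x_star = steady_state(production=2.0, decay=0.5)
x_exact = exact_solution(5.0, x0=0.0, production=2.0, decay=0.5)
x_numeric = euler(0.0, production=2.0, decay=0.5, t_end=5.0)
```

Finding the steady state required only algebra, while the numerical integration required no calculus beyond reading the equation as "change per unit time", which is the sense in which such models are accessible to anyone willing to make a small investment.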

6.2 Resources for learning and teaching The two main skills that we seek to teach in training researchers in quantitative cell biology are (A) being able to combine computational or physical approaches with biological experimental methods in one's own research, and (B) being able to work as part of an interdisciplinary team. Several educational efforts have been launched with such training in mind. Examples include the Q-Bio Summer School (http://q-bio.org/wiki/The_q-bio_Summer_School_and_Conference), which seeks to train individuals from a wide range of backgrounds in a common set of modeling approaches, and the Physiology course at the Marine Biological Laboratory (http://www.mbl.edu/physiology), which brings together students from biological and physical sciences and trains them to speak a common scientific language by working together on quantitative and physical cell biology research problems. A common theme that has emerged from these initiatives is the power of project-based learning, in which students learn new skills on a need-driven basis as they attempt to carry out projects that answer real scientific questions (Vale et al., 2012; Shekhar et al., 2014). This is a major shift from more traditional lecture-based teaching, and requires a major investment. More and more graduate programs are incorporating this type of project-based learning, often in the form of “boot camp” courses for incoming students.


6.3 Approaches to interdisciplinary research

There are currently two quite distinct approaches for working on a quantitative cell biology project. Individuals who were trained within traditional field boundaries are likely to be far more comfortable with either biological experimentation or computation, but not both. When someone trained in a single field sees the need to cross over boundaries between fields, the apparently simplest approach is to find a collaborator in the other field, and work together. This is a valuable and time-tested approach, but it is worth considering some of the challenges. First, collaboration requires a certain amount of extra effort to set up lines of communication between the collaborating groups. Second, individuals from different fields often speak different scientific languages, and it can take time and effort to establish enough common terminology to allow the work to proceed. These may seem like trivial problems, but those who have been engaged in such collaborations will know that they can in fact be quite serious. The alternative approach is for one individual to gain enough experience in both experimental and theoretical/computational approaches so that they can cross the boundary by themselves. This will hopefully become more commonplace as interdisciplinary training expands. Obviously, nobody can be an expert in everything, but those who are able to operate in both camps will have an advantage in terms of being able to get help on either side, because they will know the concepts and terminology of both groups. In either case, one clear need is for better ways to bring people together and break down barriers between fields.

References

Adalsteinsson, D., McMillen, D., and Elston, T.C. (2004). Biochemical network stochastic simulator (BioNetS): software for stochastic modeling of biochemical networks. BMC Bioinformatics 5, 24.

Altschuler, S.J., Angenent, S.B., Wang, Y., and Wu, L.F. (2008). On the spontaneous emergence of cell polarity. Nature 454, 886–9. doi:10.1038/nature07119

Anderson, A.R., and Quaranta, V. (2008). Integrative mathematical oncology. Nat. Rev. Cancer 8, 227–34. doi:10.1038/nrc2329

Apte, Z.S., and Marshall, W.F. (2013). Statistical method for comparing the level of intracellular organization between cells. Proc. Natl. Acad. Sci. U. S. A. 110, E1006–15. doi:10.1073/pnas.1212277109

Atay, O., and Skotheim, J.M. (2014). Modularity and predictability in cell signaling and decision making. Mol. Biol. Cell 25, 3445–3450. doi:10.1091/mbc.E14-02-0718

Bagheri, N., Shiina, M., Lauffenburger, D.A., and Korn, W.M. (2011). A dynamical systems model for combinatorial cancer therapy enhances oncolytic adenovirus efficacy by MEK-inhibition. PLoS Comp. Biol. 7, e1001085. doi:10.1371/journal.pcbi.1001085

Barik, D., Baumann, W.T., Paul, M.R., Novak, B., and Tyson, J.J. (2010). A model of yeast cell-cycle regulation based on multisite phosphorylation. Mol. Syst. Biol. 6, 405.

Bergmann, S., Sandler, O., Sberro, H., Shnider, S., Schejter, E., Shilo, B.Z., and Barkai, N. (2007). Pre-steady-state decoding of the bicoid morphogen gradient. PLoS Biol. 5, e46.

Berro, J., and Pollard, T.D. (2014). Synergies between Aip1p and capping protein subunits (Acp1p and Acp2p) in clathrin-mediated endocytosis and cell polarization in fission yeast. Mol. Biol. Cell 25, 3515–27. doi:10.1091/mbc.E13-01-0005

Bidone, T.C., Tang, H., and Vavylonis, D. (2014). Dynamic network morphology and tension buildup in a 3D model of cytokinetic ring assembly. Biophys. J. 107, 2618–2628. doi:10.1016/j.bpj.2014.10.034

Burak, Y., and Shraiman, B.I. (2009). Order and stochastic dynamics in Drosophila planar cell polarity. PLoS Comp. Biol. 5, e1000628. doi:10.1371/journal.pcbi.1000628

Chen, Y., Deffenbaugh, N.C., Anderson, C.T., and Hancock, W.O. (2014). Molecular counting by photobleaching in protein complexes with many subunits: best practices and application to the cellulose synthesis complex. Mol. Biol. Cell 25, 3630–3642. doi:10.1091/mbc.E14-06-1146

Coffman, V.C., Lee, I.J., and Wu, J.Q. (2014). Counting molecules within cells. Colloquium series on quantitative cell biology. Morgan and Claypool Life Sciences Publishers: San Rafael, CA (USA). doi:10.4199/C00109ED1V01Y201406QCB001

Dittrich, P., Malvezzi-Campeggi, F., Jahnz, M., and Schwille, P. (2001). Accessing molecular dynamics in cells by fluorescence correlation spectroscopy. Biol. Chem. 382, 491–4. doi:10.1515/BC.2001.061

El-Samad, H., Prajna, S., Papachristodoulou, A., Doyle, J., and Khammash, M. (2006). Advanced methods and algorithms for biological network analysis. Proc. IEEE 94, 832–53. doi:10.1109/JPROC.2006.871776

Emanuele, M.J., McCleland, M.L., Satinover, D.L., and Stukenberg, P.T. (2005). Measuring the stoichiometry and physical interactions between components elucidates the architecture of the vertebrate kinetochore. Mol. Biol. Cell 16, 4882–92. doi:10.1091/mbc.E05-03-0239

Faeder, J.R., Blinov, M.L., and Hlavacek, W.S. (2009). Rule-based modeling of biochemical systems with BioNetGen. Meth. Mol. Biol. 500, 113–67. doi:10.1007/978-1-59745-525-1_5

Gillespie, D.T. (2007). Stochastic simulation of chemical kinetics. Annu. Rev. Phys. Chem. 58, 35–55. doi:10.1146/annurev.physchem.58.032806.104637

Gittes, F., Mickey, B., Nettleton, J., and Howard, J. (1993). Flexural rigidity of microtubules and actin filaments measured from thermal fluctuations in shape. J. Cell Biol. 120, 923–34. doi:10.1083/jcb.120.4.923

Green, R.A., Paluch, E., and Oegema, K. (2012). Cytokinesis in animal cells. Ann. Rev. Cell Dev. Biol. 28, 29–58. doi:10.1146/annurev-cellbio-101011-155718

Gregor, T., Tank, D.W., Wieschaus, E.F., and Bialek, W. (2007). Probing the limits to positional information. Cell 130, 153–64. doi:10.1016/j.cell.2007.05.025

Guan, Y., Meurer, M., Raghavan, S., Rebane, A., Lindquist, J.R., Santos, S., Kats, I., Davidson, M.W., Mazitschek, R., Hughes, T.E., Drobizhev, M., Knop, M., and Shah, J.V. (2015). Live-cell multiphoton fluorescence correlation spectroscopy with an improved large Stokes shift fluorescent protein. Mol. Biol. Cell 26, 2054–66.

Hawkins, T.L., Sept, D., Mogessie, B., Straube, A., and Ross, J.L. (2013). Mechanical properties of doubly stabilized microtubule filaments. Biophys. J. 104, 1517–28. doi:10.1016/j.bpj.2013.02.026

Heald, R., Tournebize, R., Blank, T., Sandaltzopoulos, R., Becker, P., Hyman, A., and Karsenti, E. (1996). Self-organization of microtubules into bipolar spindles around artificial chromosomes in Xenopus egg extracts. Nature 382, 420–5.

Howard, J. (2014). Quantitative cell biology: the essential role of theory. Mol. Biol. Cell 25, 3438–40. doi:10.1091/mbc.E14-02-0715

Jaqaman, K., and Danuser, G. (2006). Linking data to models: data regression. Nat. Rev. Mol. Cell Biol. 7, 813–9. doi:10.1038/nrm2030

Jaqaman, K., Dorn, J.F., Jelson, G.S., Tytell, J.D., Sorger, P.K., and Danuser, G. (2006b). Comparative autoregressive moving average analysis of kinetochore microtubule dynamics in yeast. Biophys. J. 91, 2312–25. doi:10.1529/biophysj.106.080333

Joglekar, A.P., Salmon, E.D., and Bloom, K.S. (2008). Counting kinetochore protein numbers in budding yeast using genetically encoded fluorescent proteins. Meth. Cell Biol. 85, 127–51. doi:10.1016/S0091-679X(08)85007-8

Johnston, K., Joglekar, A., Hori, T., Suzuki, A., Fukagawa, T., and Salmon, E.D. (2010). Vertebrate kinetochore protein architecture: protein copy number. J. Cell Biol. 189, 937–43. doi:10.1083/jcb.200912022

Kam, Z., Minden, J.S., Agard, D.A., Sedat, J.W., and Leptin, M. (1991). Drosophila gastrulation: analysis of cell shape changes in living embryos by three-dimensional fluorescence microscopy. Development 112, 365–70.

Keren, K., Pincus, Z., Allen, G.M., Barnhart, E.L., Marriott, G., Mogilner, A., and Theriot, J.A. (2008). Mechanism of shape determination in motile cells. Nature 453, 475–80. doi:10.1038/nature06952

Kravtsova, N., and Dawes, A.T. (2014). Actomyosin regulation and symmetry breaking in a model of polarization in the early C. elegans embryo: symmetry breaking in cell polarization. Bull. Math. Biol. 76, 2426–48. doi:10.1007/s11538-014-0016-x

Lander, A.D., Nie, Q., and Wan, F.Y. (2002). Do morphogen gradients arise by diffusion? Dev. Cell 2, 785–96. doi:10.1016/S1534-5807(02)00179-X

Lin, L., and Othmer, H.G. (2017). Improving parameter inference from FRAP data: an analysis motivated by pattern formation in the Drosophila wing disc. Bull. Math. Biol. 79, 448–97. doi:10.1007/s11538-016-0241-6

Lindsey, J.K., and Jones, B. (1998). Choosing among generalized linear models applied to medical data. Stat. Med. 17, 59–68. doi:10.1002/(SICI)1097-0258(19980115)17:1%3C59::AID-SIM733%3E3.0.CO;2-7

Little, S.C., Tkacik, G., Kneeland, T.B., Wieschaus, E.F., and Gregor, T. (2011). The formation of the Bicoid morphogen gradient requires protein movement from anteriorly localized mRNA. PLoS Biol. 9, e1000596. doi:10.1371/journal.pbio.1000596

Liu, J., Sun, Y., Drubin, D.G., and Oster, G.F. (2009). The mechanochemistry of endocytosis. PLoS Biol. 7, e1000204. doi:10.1371/journal.pbio.1000204

Lopez, C.F., Muhlich, J.L., Bachman, J.A., and Sorger, P.K. (2013). Programming biological models in Python using PySB. Mol. Syst. Biol. 9, 646. doi:10.1038/msb.2013.1

Loughlin, R., Heald, R., and Nedelec, F. (2010). A computational model predicts Xenopus meiotic spindle organization. J. Cell Biol. 191, 1239–49. doi:10.1083/jcb.201006076

Ludington, W.B., Wemmer, K.A., Lechtreck, K.F., Witman, G.B., and Marshall, W.F. (2013). Avalanche-like behavior in ciliary import. Proc. Natl. Acad. Sci. U. S. A. 110, 3925–30. doi:10.1073/pnas.1217354110

Ludington, W.B., Ishikawa, H., Serebrenik, Y.V., Ritter, A., Hernandez-Lopez, R.A., Gunzenhauser, J., Kannegaard, E., and Marshall, W.F. (2015). A systematic comparison of mathematical models for inherent measurement of ciliary length: how a cell can measure length and volume. Biophys. J. 108, 1361–79. doi:10.1016/j.bpj.2014.12.051

Ma, W., Trusina, A., El-Samad, H., Lim, W.A., and Tang, C. (2009). Defining network topologies that can achieve biochemical adaptation. Cell 138, 760–73.

Mao, Y., Tournier, A.L., Hoppe, A., Kester, L., Thompson, B.J., and Tapon, N. (2013). Differential proliferation rates generate patterns of mechanical tension that orient tissue growth. EMBO J. 32, 2790–803. doi:10.1038/emboj.2013.197

Mickey, B. and Howard, J. (1995). Rigidity of microtubules is increased by stabilizing agents. J. Cell Biol. 130, 909–17. doi:10.1083/jcb.130.4.909

Milo, R., Shen-Orr, S., Itzkovitz, S., Kashtan, N., Chklovskii, D., and Alon, U. (2002). Network motifs: simple building blocks of complex networks. Science 298, 824–7. doi:10.1126/science.298.5594.824

Mueller, P., Rogers, K.W., Yu, S.R., Brand, M., and Schier, A.F. (2013). Morphogen transport. Development 140, 1621–38. doi:10.1242/dev.083519

Nelson, P.C. (2014). Physical models of living systems. W.H. Freeman and Company, New York. 384 pp.

Nielsen, A.A., Der, B.S., Shin, J., Vaidyanathan, P., Paralanov, V., Strychalski, E.A., Ross, D., Densmore, D., and Voigt, C.A. (2016). Genetic circuit design automation. Science 352, aac7341. doi:10.1126/science.aac7341

Pavin, N. and Tolic, I.M. (2016). Self-organization and forces in the mitotic spindle. Ann. Rev. Biophys. 45, 279–98. doi:10.1146/annurev-biophys-062215-010934

Ponti, A., Machacek, M., Gupton, S.L., Waterman-Storer, C.M., and Danuser, G. (2004). Two distinct actin networks drive the protrusion of migrating cells. Science 305, 1782–6. doi:10.1126/science.1100533

Qian, H., Sheetz, M.P., and Elson, E.L. (1991). Single particle tracking. Analysis of diffusion and flow in two-dimensional systems. Biophys. J. 60, 910–21. doi:10.1016/S0006-3495(91)82125-7

Reyes-Lamothe, R., Sherratt, D.J., and Leake, M.C. (2010). Stoichiometry and architecture of active DNA replication machinery in Escherichia coli. Science 328, 498–501. doi:10.1126/science.1185757

Savin, T., Kurpios, N.A., Shyer, A.E., Florescu, P., Liang, H., Mahadevan, L., and Tabin, C.J. (2011). On the growth and form of the gut. Nature 476, 57–62. doi:10.1038/nature10277

Saxton, M.J. and Jacobson, K. (1997). Single-particle tracking: applications to membrane dynamics. Ann. Rev. Biophys. Biomol. Struct. 26, 373–99. doi:10.1007/978-1-59745-397-4_6

Scott, J.G., Fletcher, A.G., Anderson, A.R., and Maini, P.K. (2016). Spatial metrics of tumor vascular organization predict radiation efficacy in a computational model. PLoS Comp. Biol. 12, e1004712.

Segel, L.A. (1987). Modeling dynamic phenomena in molecular and cell biology. Cambridge University Press, Cambridge.

Sekar, J.A. and Faeder, J.R. (2012). Rule-based modeling of signal transduction: a primer. Meth. Mol. Biol. 880, 129–218. doi:10.1007/978-1-61779-833-7_9

Shekhar, S., Zhu, L., Mazutis, L., Sgro, A.E., Fai, T.G., and Podolski, M. (2014). Quantitative biology: where modern biology meets physical sciences. Mol. Biol. Cell 25, 3482–5. doi:10.1091/mbc.E14-08-1286

Shen-Orr, S.S., Milo, R., Mangan, S., and Alon, U. (2002). Network motifs in the transcriptional regulation network of Escherichia coli. Nat. Genet. 31, 64–8. doi:10.1038/ng881

Soliman, S., Fages, F., and Radulescu, O. (2014). A constraint solving approach to model reduction by tropical equilibration. Algorithms Mol. Biol. 9, 24. doi:10.1186/s13015-014-0024-2

Sprague, B.L. and McNally, J.G. (2005). FRAP analysis of binding: proper and fitting. Trends Cell Biol. 15, 84–91. doi:10.1016/j.tcb.2004.12.001

Stimpson, H.E., Toret, C.P., Cheng, A.T., Pauly, B.S., and Drubin, D.G. (2009). Early-arriving Syp1p and Ede1p function in endocytic site placement and formation in budding yeast. Mol. Biol. Cell 20, 4640–51. doi:10.1091/mbc.E09-05-0429

Tang, N., Marshall, W.F., McMahon, M., Metzger, R.J., and Martin, G.R. (2011). Control of mitotic spindle angle by the RAS-regulated ERK1/2 pathway determines lung tube shape. Science 333, 342–5. doi:10.1126/science.1204831

Tkacik, G., Callan, C.G., and Bialek, W. (2008). Information flow and optimization in transcriptional regulation. Proc. Natl. Acad. Sci. U. S. A. 105, 12265–70. doi:10.1073/pnas.0806077105

Turing, A.M. (1952). The chemical basis of morphogenesis. Phil. Trans. Royal Soc. London B 237, 37–72. doi:10.1098/rstb.1952.0012

Tyson, C.B. and Novak, B. (2011). Irreversible transitions, bistability and checkpoint controls in the eukaryotic cell cycle: a systems-level understanding. In Handbook of Systems Biology, A.J.M. Walhout, M. Vidal, and J. Dekker eds. Academic Press, Waltham, MA. p 1–40. doi:10.1016/B978-0-12-385944-0.00014-9

Vale, R.D., DeRisi, J., Phillips, R., Mullins, R.D., Waterman, C., and Mitchison, T.J. (2012). Graduate education. Interdisciplinary graduate training in teaching labs. Science 338, 1542–3.

Van der Maaten, L.J.P. and Hinton, G.E. (2008). Visualizing high-dimensional data using t-SNE. J. Mach. Learn. Res. 9, 2579–605.

Vavylonis, D., Wu, J.Q., Hao, S., O’Shaughnessy, B., and Pollard, T.D. (2008). Assembly mechanism of the contractile ring for cytokinesis by fission yeast. Science 319, 97–100. doi:10.1126/science.1151086

Vilar, J., Kueh, H., Barkai, N., and Leibler, S. (2002). Mechanism of noise-resistance in genetic oscillators. Proc. Natl. Acad. Sci. U. S. A. 99, 5988–92. doi:10.1073/pnas.092133899

Wang, Z., Shah, J.V., Berns, M.W., and Cleveland, D.W. (2006). In vivo studies of dynamic intracellular processes using fluorescence correlation spectroscopy. Biophys. J. 91, 343–51. doi:10.1529/biophysj.105.077891

Weinert, F.M., Brewster, R.C., Rydenfelt, M., Phillips, R., and Kegel, W.K. (2014). Scaling of gene expression with transcription-factor fugacity. Phys. Rev. Lett. 113, 258101. doi:10.1103/PhysRevLett.113.258101

Wu, J.Q. and Pollard, T.D. (2005). Counting cytokinesis proteins globally and locally in fission yeast. Science 310, 310–4.

Yang, L., Effler, J.C., Kutscher, B.L., Sullivan, S.E., Robinson, D.N., and Iglesias, P.A. (2008). Modeling cellular deformations using the level set formalism. BMC Systems Biol. 2, 68. doi:10.1186/1752-0509-2-68

Yang, Y., Sage, T.L., Liu, Y., Ahmad, T.R., Marshall, W.F., Shiu, S.H., Froehlich, J.E., Imre, K.M., and Osteryoung, K.W. (2011). Clumped chloroplasts 1 is required for plastid segregation in Arabidopsis. Proc. Natl. Acad. Sci. U. S. A. 108, 18530–5.

Author Biography

Dr. Wallace F. Marshall is Professor of Biochemistry and Biophysics at the University of California San Francisco and founding director of the NSF-funded Center for Cellular Construction at UCSF. Dr. Marshall’s research focuses on understanding how the complex geometry of cells arises from the interplay of molecular and physical mechanisms, as well as how cell geometry relates to cell function. His laboratory has a particular interest in the mechanisms that control the size of organelles. Questions of cell geometry are inherently quantitative, and Dr. Marshall’s group employs an integrated combination of quantitative microscopy, image analysis, and computational modeling, together with genetic and biochemical methods.

Dr. Marshall received bachelor’s degrees in Electrical Engineering and Biochemistry at the State University of New York, Stony Brook, and his Ph.D. in Biochemistry at the University of California San Francisco. After postdoctoral training at Yale University in the Molecular, Cellular, and Developmental Biology department, he returned to UCSF in 2003 to start his faculty position. He was an organizer of the Cold Spring Harbor Computational Cell Biology conference, and for the past four years has been organizing a series of meetings and workshops on Quantitative Cell Biology funded by the National Science Foundation. He is also currently co-director of the Physiology Course at the Marine Biological Laboratory in Woods Hole, MA.