# A mathematical way to think about biology

- Apply physical sciences perspectives to biological research
- Be able to teach yourself quantitative biology
- Be able to communicate with mathematical and physical scientists

- Algebra
- Exposure to calculus (there is an appendix for students interested in review)

A mathematical way to think about biology comes to life in this lavishly illustrated video book. After completing these videos, students will be better prepared to collaborate in physical sciences-biology research. These lessons demonstrate a **physical sciences perspective:** training intuition by deriving equations from graphical illustrations.

*"Excellent site for both basic and advanced lessons on applying mathematics to biology."*

*-*Tweeted by the U.S. **National Cancer Institute**'s Office of Physical Sciences Oncology

- Undergraduate students
- Graduate students
- Postdoctoral scholars
- Lab managers
- Funding agency program staff
- Principal investigators and grant writers
- Citizen scientists
- Patient advocates
- Lifelong learners
- Integrative Cancer Biology Program members
- Physical Sciences Oncology Network members
- National Centers for Systems Biology members

Concepts of stochasticity underlie many of the models of dynamic systems explored in quantitative biology. We describe some of these ideas in this and the following three videos. In this video, we state that systems exhibiting deterministic dynamics can sample a messy variety of waiting times between chemical reaction events even when the motions of component parts are periodic. In particular, this can happen when the periods of motion of individual parts are incommensurate (that is, when pairs of periods form irrational ratios).

In a deterministic system with complicated interactions, small differences in initial conditions can quickly avalanche into qualitative differences in dynamics. Since initial conditions can only be measured with finite certainty, the dynamics of such systems are, for practical purposes, unpredictable after short times.

In the previous two videos, deterministic systems displayed dynamics with aspects associated with stochasticity. In contrast, some systems do not merely mimic aspects associated with stochasticity; they display indeterminism at a fundamental level. For example, when a collection of *completely* identical systems later displays heterogeneous outcomes, the systems are fundamentally indeterministic. They have no initial properties that can be used to discern which individual system will display which particular outcome.

Markov models are often used to develop mathematical models of systems that display, partially or more fully, aspects associated with stochasticity. Depending on how fully a system displays these aspects, the use of a Markov model might need to be recognized as a conceptual approximation. Icons that can represent the use of such models include spinning wheels of fortune and rolling dice.

In this and the following three videos, we work through a canonical problem from introductory systems biology coursework. For an example of this mathematical lesson, see Alon, Ch. 2.4, pp. 18-21. In this video, we animate a time sequence of translation and degradation events that cause the number of copies of a protein of interest in a cell to change over time.

We derive a differential equation approximating the time-rate of change of the number of copies of protein in the cell modeled by the animation in the previous video. This differential equation reads, d*x*/d*t* = *β* - *αx*. We depict aspects of this differential equation with a flowchart. It is important to remember that this differential equation does not represent all aspects of the stochastic dynamics in the toy model presented in the previous video.

We sketch a slope field corresponding to the differential equation derived in the previous video. We use this slope field to draw a qualitative curve describing how the number of copies of protein is expected to rise over time, when starting from an initial value of zero.

We obtain an analytic solution for the relationship between the number of copies of protein and time for the differential equation qualitatively investigated in the previous video. We find that the rise time, *T*<sub>1/2</sub>, is ln(2) divided by the degradation rate coefficient, *α*. The fact that the rise time is independent of the translation rate *β* is sometimes used as a pedagogical example of the importance of quantitative reasoning for gaining insights into biological dynamics that would be difficult to develop through natural-language and vaguely structured notional reasoning alone.
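As a small numerical illustration of this result, the sketch below evaluates the standard closed-form solution of d*x*/d*t* = *β* - *αx* with *x*(0) = 0, namely *x*(*t*) = (*β*/*α*)(1 - e^(-*αt*)), and checks that the level reaches half of its steady-state value at ln(2)/*α* regardless of *β*. The function names and the numerical values of *α* and *β* are assumptions chosen for illustration, not values from the videos.

```python
import math


def protein_level(t, beta, alpha):
    """Analytic solution of dx/dt = beta - alpha*x with x(0) = 0."""
    return (beta / alpha) * (1.0 - math.exp(-alpha * t))


def rise_time(alpha):
    """Time at which x reaches half of its steady-state level beta/alpha."""
    return math.log(2.0) / alpha


alpha = 0.5  # assumed degradation rate coefficient (1/time)
for beta in (1.0, 10.0, 100.0):  # assumed translation rates (copies/time)
    t_half = rise_time(alpha)
    fraction = protein_level(t_half, beta, alpha) / (beta / alpha)
    print(f"beta={beta:6.1f}  T_half={t_half:.4f}  x(T_half)/x_ss={fraction:.3f}")
```

Changing `beta` rescales the steady-state level but leaves `rise_time` untouched, because `rise_time` depends only on `alpha`.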

This video introduces collisional population dynamics and tabular game theory (comparative statics). The particular game in this example is the prisoner's dilemma. In this game, survival of the relatively most fit occurs simultaneously with a decrease in overall fitness. For a printable tutorial explaining how evolutionary game theoretic differential equations can be applied to analyze population dynamics, please refer to doi:10.1098/rsfs.2014.0037.

- Brief introduction to tabular game theory
- An outcome of the prisoner's dilemma is that D is stable and, as a consequence, receives a payoff lower than the maximum possible
- We give a taste of the idea that tabular game theory and the population dynamics from the preceding video are deeply connected. We state (1) that payoffs from tabular game theory can be associated with rate coefficients from the population dynamics in part 1a, and (2) that part 1a should be referred to as evolutionary game theory.
- The purpose is to inspire the audience to read in textbooks how this conceptual connection can be established.
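The points above can be sketched with a toy calculation. The payoff numbers below are a common textbook choice and are not taken from the video; the replicator equation used to evolve the population is likewise an assumption standing in for the population dynamics of the preceding video. The sketch checks that D is the best response to itself, and that the population-average payoff falls as D takes over.

```python
# Hypothetical payoff table (payoff to the row player). The values follow the
# textbook ordering T=5 > R=3 > P=1 > S=0 and are not taken from the video.
PAYOFF = {
    ("C", "C"): 3.0,
    ("C", "D"): 0.0,
    ("D", "C"): 5.0,
    ("D", "D"): 1.0,
}
STRATEGIES = ("C", "D")


def best_response(opponent):
    """Strategy that earns the row player the most against `opponent`."""
    return max(STRATEGIES, key=lambda me: PAYOFF[(me, opponent)])


def is_stable(strategy):
    """A strategy is stable here if it is the best response to itself."""
    return best_response(strategy) == strategy


def payoffs(x_d):
    """Return (mean, f_C, f_D) when a fraction x_d plays D and pairs meet at random."""
    x_c = 1.0 - x_d
    f_c = x_c * PAYOFF[("C", "C")] + x_d * PAYOFF[("C", "D")]
    f_d = x_c * PAYOFF[("D", "C")] + x_d * PAYOFF[("D", "D")]
    return x_c * f_c + x_d * f_d, f_c, f_d


def evolve(x_d, dt=0.01, steps=2000):
    """Euler steps of the replicator equation dx/dt = x (1 - x) (f_D - f_C)."""
    for _ in range(steps):
        _, f_c, f_d = payoffs(x_d)
        x_d += dt * x_d * (1.0 - x_d) * (f_d - f_c)
    return x_d


x_start = 0.01
x_end = evolve(x_start)
print("D stable:", is_stable("D"), " C stable:", is_stable("C"))
print("mean payoff before:", payoffs(x_start)[0], " after:", payoffs(x_end)[0])
```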

In the previous slide deck, we noted similarities between population dynamics and business transaction payoff pictures. In this and the next video, we provide deeper understanding of these connections. In this video, we derive the population dynamics equations in such a way that it is natural to say that cells being modeled repeatedly play games and are subject to game outcomes. For a printable tutorial describing interpretations that can be associated with evolutionary game theoretic differential equations, please see doi:10.1098/rsfs.2014.0038.

The first of five videos on introductory statistics, this module introduces probability distributions and averages. The average (also called "arithmetic mean") quantitatively expresses the notion of a central tendency among the results of an experiment.

Two variables are said to be statistically independent if the outcome of an experiment tracked by one variable does not affect the relative likelihoods of different outcomes of the experiment tracked by the other variable. In that case, the two-variable (joint) probability distribution factorizes into a product of two single-variable probability distribution functions.

The covariance of statistically independent variables is zero. The variance of a sum of statistically independent variables equals the sum of the variances of the variables. This identity is often used to derive uncertainty propagation formulas.
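These identities can be checked exactly on a small example. The sketch below builds a factorized joint distribution for a fair die and a fair coin (both chosen here purely for illustration) and uses exact fractions to confirm that the covariance vanishes and that the variances add.

```python
from fractions import Fraction
from itertools import product

# Hypothetical example variables: X is a fair die, Y is a fair coin (0 or 1).
p_x = {k: Fraction(1, 6) for k in range(1, 7)}
p_y = {0: Fraction(1, 2), 1: Fraction(1, 2)}

# Statistical independence: the joint distribution is the product p(x) * p(y).
joint = {(x, y): p_x[x] * p_y[y] for x, y in product(p_x, p_y)}


def expect(f):
    """Average of f(x, y) over the joint distribution."""
    return sum(p * f(x, y) for (x, y), p in joint.items())


mean_x = expect(lambda x, y: x)
mean_y = expect(lambda x, y: y)
var_x = expect(lambda x, y: (x - mean_x) ** 2)
var_y = expect(lambda x, y: (y - mean_y) ** 2)
cov_xy = expect(lambda x, y: (x - mean_x) * (y - mean_y))
var_sum = expect(lambda x, y: (x + y - mean_x - mean_y) ** 2)

print("cov(X, Y)       =", cov_xy)
print("var(X) + var(Y) =", var_x + var_y)
print("var(X + Y)      =", var_sum)
```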

This slide deck provides examples of how hypotheses about probabilistic processes can be used to discuss probability distributions and obtain theoretical values for averages and variances. In this first video, we describe the Bernoulli trial, which corresponds to the experiment in which a coin is flipped to determine on which of two sides it lands.

In the Poisson limit, we take a series of independent Bernoulli trials (giving rise to a binomial distribution) and allow the number of coin flips *N* to increase without bound while allowing the chance *p* of success on a particular coin flip to decrease toward zero in such a compensatory fashion that the average number of successes ("heads"), *Np*, is unchanged. Because the likelihood of "heads" on any given toss becomes vanishingly small, this limit is called the limit of rare events.
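A quick numerical sketch of this limit: hold the average number of successes fixed, increase the number of flips, and compare the binomial probabilities with the Poisson probabilities for the same average. The function names and the choice of an average of 3 are assumptions made for illustration.

```python
import math


def binomial_pmf(k, n, p):
    """Probability of k successes in n independent Bernoulli trials."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


def poisson_pmf(k, mean):
    """Poisson probability of k events when the average count is `mean`."""
    return math.exp(-mean) * mean**k / math.factorial(k)


def max_abs_difference(n, mean, k_max=20):
    """Largest pointwise gap between the two distributions for k = 0..k_max."""
    p = mean / n  # shrink p as n grows so that n * p stays fixed
    return max(
        abs(binomial_pmf(k, n, p) - poisson_pmf(k, mean)) for k in range(k_max + 1)
    )


mean = 3.0  # assumed average number of "heads"
for n in (10, 100, 1000, 10000):
    print(f"N={n:6d}  p={mean / n:.4f}  max |binomial - Poisson| = "
          f"{max_abs_difference(n, mean):.2e}")
```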

To study the combinatorics involved in an example where the central limit theorem applies, we will need to work with the factorials of large numbers. Stirling's approximation is an approximation for *n*! for large *n*. In this video, we motivate this approximation by comparing the expression for ln(*n*!) with an integral of the natural log function.
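To see how quickly the approximation improves, the sketch below compares ln(*n*!) with the leading terms *n* ln *n* - *n* that come from integrating the natural log. Here `math.lgamma(n + 1)` is used as the reference value of ln(*n*!); the function names are illustrative.

```python
import math


def ln_factorial(n):
    """Reference value of ln(n!), computed as lgamma(n + 1)."""
    return math.lgamma(n + 1)


def stirling_leading(n):
    """Leading terms of Stirling's approximation: n ln n - n."""
    return n * math.log(n) - n


def relative_error(n):
    exact = ln_factorial(n)
    return abs(exact - stirling_leading(n)) / exact


for n in (10, 100, 1000, 10000):
    print(f"n={n:6d}  ln(n!)={ln_factorial(n):12.3f}  "
          f"n ln n - n={stirling_leading(n):12.3f}  "
          f"relative error={relative_error(n):.2e}")
```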

The central limit theorem states that a Gaussian probability distribution arises when describing an overall variable that is a sum of a large number of independently randomly fluctuating variables, no small number of which dominate the fluctuations of the overall variable.

In some situations, when the number of coin tosses is large, Stirling's approximation can be applied to factorials that appear in the expression for the binomial distribution. The resulting expression is basically an exponential function of a quadratic function with a negative leading coefficient. This is the hallmark of a Gaussian distribution.

Because equipment in physics experiments is highly engineered, individual device contributions to measurement fluctuations might be "small." The overall fluctuations in the final measured quantity might be well approximated using a first-order Taylor expansion in terms of individual device fluctuations. Fluctuations in measurements are then sums over random variables and are thus potentially Gaussian distributed.

The levels of molecules in biological systems can approximate "temporary" steady-state values that equal products of rate coefficients and reactant concentrations. Since logarithms convert products into sums, the logarithms of the levels of some biological molecules can be normally distributed. Hence, the levels of the biological molecules are log-normally distributed.

In order to identify theoretical curves that closely imitate a set of experimental data, it is necessary to be able to quantify to what extent a set of data and a curve look similar. To address this need, we present the definition of the quantity chi-squared. For a given number of measurements, a smaller chi-squared indicates a closer match between the data and the curve of interest. In other words, a smaller chi-squared corresponds to a situation in which it looks more as though the data "came from" Gaussian distributions centered on the curve. The average chi-squared value across a number of experiments, each involving *M* measurements, is *M*.
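The claim about the average can be illustrated with a short simulation. The sketch below assumes the usual definition of chi-squared as the sum of squared, error-bar-normalized deviations between data and curve, draws data from Gaussian distributions centered on a hypothetical "correct" line, and averages chi-squared over many simulated experiments. The line, the error bars, and the seed are all arbitrary choices.

```python
import random


def chi_squared(ys, model_values, sigmas):
    """Sum over measurements of ((data - curve) / error bar) squared."""
    return sum(((y - f) / s) ** 2 for y, f, s in zip(ys, model_values, sigmas))


def simulate_average_chi_squared(num_measurements, num_experiments, seed=0):
    rng = random.Random(seed)
    xs = [float(i) for i in range(num_measurements)]
    model_values = [2.0 * x + 1.0 for x in xs]  # hypothetical "correct" curve
    sigmas = [0.5] * num_measurements           # hypothetical error bars
    total = 0.0
    for _ in range(num_experiments):
        # Each data point "comes from" a Gaussian centered on the curve.
        ys = [rng.gauss(f, s) for f, s in zip(model_values, sigmas)]
        total += chi_squared(ys, model_values, sigmas)
    return total / num_experiments


M = 10
print("M =", M, " average chi-squared =", simulate_average_chi_squared(M, 2000))
```

The printed average should land close to `M`, with scatter that shrinks as `num_experiments` grows.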

We slightly modify the definition of chi-squared developed in the previous video for the situation in which a "correct" curve has not been theoretically determined beforehand. We choose a "best guess" curve with corresponding best guess values of fitting parameters by minimizing chi-squared, which corresponds to maximizing likelihood.

Using the concepts developed in the preceding two videos, we present a checklist of steps necessary for performing fitting of mathematical curves to data with error bars. These steps include checking whether the reduced chi-squared value is in the neighborhood of unity and inspecting a plot of normalized residuals to check for systematic patterns. This algorithm is appropriate for general education undergraduate "teaching laboratory" courses.
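A minimal sketch of such a checklist for the simplest case, a straight line fitted to data with error bars. It uses the closed-form weighted least-squares solution (which minimizes chi-squared for a line), takes the reduced chi-squared to be chi-squared divided by the number of measurements minus the number of fitted parameters, and reports normalized residuals for inspection. The data values are made up for illustration.

```python
def fit_line(xs, ys, sigmas):
    """Weighted least-squares fit of y = a + b*x; this minimizes chi-squared."""
    ws = [1.0 / s**2 for s in sigmas]
    s_w = sum(ws)
    s_x = sum(w * x for w, x in zip(ws, xs))
    s_y = sum(w * y for w, y in zip(ws, ys))
    s_xx = sum(w * x * x for w, x in zip(ws, xs))
    s_xy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    delta = s_w * s_xx - s_x**2
    a = (s_xx * s_y - s_x * s_xy) / delta
    b = (s_w * s_xy - s_x * s_y) / delta
    return a, b


def fit_report(xs, ys, sigmas):
    a, b = fit_line(xs, ys, sigmas)
    # Normalized residual: (data - best-guess curve) / error bar.
    residuals = [(y - (a + b * x)) / s for x, y, s in zip(xs, ys, sigmas)]
    chi2 = sum(r**2 for r in residuals)
    dof = len(xs) - 2  # measurements minus the two fitted parameters
    return {
        "a": a,
        "b": b,
        "chi2": chi2,
        "reduced_chi2": chi2 / dof,
        "normalized_residuals": residuals,
    }


# Hypothetical measurements with error bars.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]
sigmas = [0.2] * 5
report = fit_report(xs, ys, sigmas)
print("intercept, slope:", report["a"], report["b"])
print("reduced chi-squared:", report["reduced_chi2"])
print("normalized residuals:", report["normalized_residuals"])
```

A reduced chi-squared far from unity, or residuals that trend systematically with `xs`, would each suggest revisiting the model or the error bars.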

In this and the following two videos, we present the stochastic simulation algorithm. To apply this algorithm, we need to specify the kinds of reactions that a system can undergo, we need to determine waiting times that elapse between consecutive reactions, and we need to determine the identities of the reactions that occur. In this first video, we illustrate how a system's possible reactions are specified by listing reaction rates and stoichiometries.
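The three ingredients can be sketched for a deliberately simple, hypothetical system: one molecular species produced at a constant rate and degraded at a rate proportional to its copy number. The reaction list, rate values, and function names below are assumptions for illustration and are not taken from the videos.

```python
import random

# Hypothetical two-reaction system for a single species with copy number x:
#   production:  rate beta,       stoichiometry +1
#   degradation: rate alpha * x,  stoichiometry -1
STOICHIOMETRY = (+1, -1)


def propensities(x, beta, alpha):
    return (beta, alpha * x)


def simulate(beta, alpha, t_end, x0=0, seed=0):
    rng = random.Random(seed)
    t, x = 0.0, x0
    times, counts = [t], [x]
    while True:
        rates = propensities(x, beta, alpha)
        total = sum(rates)
        t += rng.expovariate(total)  # waiting time until the next reaction
        if t > t_end:
            break
        # Identity of the reaction: chosen with probability rate / total.
        index = 0 if rng.random() * total < rates[0] else 1
        x += STOICHIOMETRY[index]
        times.append(t)
        counts.append(x)
    return times, counts


def time_average(times, counts, t_end):
    """Average copy number, weighting each value by how long it persisted."""
    total = 0.0
    for i, x in enumerate(counts):
        t_next = times[i + 1] if i + 1 < len(times) else t_end
        total += x * (t_next - times[i])
    return total / t_end


times, counts = simulate(beta=10.0, alpha=1.0, t_end=200.0)
print("time-averaged copy number:", time_average(times, counts, 200.0))
print("deterministic steady state beta/alpha:", 10.0 / 1.0)
```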

**Model:** RNA polymerase makes many (usually unsuccessful) independent attempts to initiate transcription. Once an mRNA strand is produced, it begins to make many (usually unsuccessful) independent attempts to be degraded.

**Outcome:** As in part a, mRNA copy numbers are Poisson distributed.

Simple quasispecies eigendemographics and eigenrates based on Bull, Meyers, and Lachmann, "Quasispecies made simple," *PLoS Comp Biol*, **1**(6):e61 (2005)

In this first video, we obtain discrete-time-step population dynamics equations by considering proliferation and mutation events at the level of the single cell.

We use eigenvalue-eigenvector analysis to describe the long-term steady-state population composition. We find that relative dominance in a population is determined, not merely by "fitness" alone, but also depends on the degree to which individuals "breed true."
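A toy two-type calculation illustrates the point about "breeding true." The model below is a hypothetical simplification (a fitter type whose offspring mutate one way into a less fit type, with no back mutation) and is not necessarily the parameterization used in the video or in the cited paper. Repeatedly applying the one-generation update and renormalizing converges to the dominant eigenvector of the update matrix, which gives the long-term composition.

```python
def step(n_fit, n_mut, w_fit, w_mut, q):
    """One generation: each type proliferates; a fraction (1 - q) of the
    offspring of the fitter type mutate into the other type."""
    new_fit = w_fit * q * n_fit
    new_mut = w_fit * (1.0 - q) * n_fit + w_mut * n_mut
    return new_fit, new_mut


def long_term_fraction_fit(w_fit, w_mut, q, generations=200):
    """Long-term fraction of the fitter type, found by repeated updating."""
    n_fit, n_mut = 0.5, 0.5
    for _ in range(generations):
        n_fit, n_mut = step(n_fit, n_mut, w_fit, w_mut, q)
        total = n_fit + n_mut
        n_fit, n_mut = n_fit / total, n_mut / total  # keep numbers bounded
    return n_fit


def growth_eigenvalues(w_fit, w_mut, q):
    """Eigenvalues of the triangular update matrix
    [[w_fit*q, 0], [w_fit*(1-q), w_mut]]."""
    return w_fit * q, w_mut


for q in (0.9, 0.4):  # hypothetical probabilities of breeding true
    print(f"q={q}: eigenvalues={growth_eigenvalues(2.0, 1.0, q)}, "
          f"long-term fraction of fitter type={long_term_fraction_fit(2.0, 1.0, q):.4f}")
```

With these numbers the type with twice the proliferation rate persists only when `w_fit * q` exceeds `w_mut`; at `q = 0.4` it is lost despite its higher "fitness."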

In this and the next video, we develop a familiarity with the representation of vector rotations using rotation matrices. This understanding is helpful for identifying dynamical systems that support oscillations in physics, engineering, and biology. A rotation operator rotates a vector by an angle without changing the length of the vector. A rotation matrix represents the action of a rotation operator on a vector.

How can we determine whether a dynamical system can be represented using something that looks like a rotation matrix? Rotation matrices have complex eigenvalues (except for rotations by integer multiples of 180°). We can determine whether a dynamical system supports rotational motion by determining whether the matrix representing the system's dynamics has complex eigenvalues.
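For 2x2 matrices this test can be carried out by hand from the trace and determinant. The sketch below builds a rotation matrix, confirms that it preserves vector length, and flags matrices whose eigenvalues have a nonzero imaginary part. The helper names and the example matrices are illustrative.

```python
import cmath
import math


def rotation_matrix(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def rotate(theta, v):
    """Apply the rotation matrix for angle theta to the 2-vector v."""
    m = rotation_matrix(theta)
    return (m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1])


def eigenvalues_2x2(m):
    """Roots of lambda^2 - trace*lambda + det = 0."""
    (a, b), (c, d) = m
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + root) / 2.0, (trace - root) / 2.0


def supports_rotation(m, tol=1e-12):
    """True if the matrix has eigenvalues with a nonzero imaginary part."""
    return any(abs(ev.imag) > tol for ev in eigenvalues_2x2(m))


print("length after rotation:", math.hypot(*rotate(0.7, (3.0, 4.0))))
print("rotation by 60 degrees:", eigenvalues_2x2(rotation_matrix(math.pi / 3)))
print("pure decay matrix rotates?", supports_rotation([[-1.0, 0.0], [0.0, -2.0]]))
```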

- Direction fields, quiver plots, and integral curves
- Numerical integration of systems of differential equations

**CAUTION:** I'm not familiar enough with numerical integration to know whether the particular example of the method for step-size adaptation in the video is used generally (or at all) in commonly available software packages. The purpose of the example was to show that it is possible to generate an error estimate (a) without knowledge of the actual solution and (b) by comparing the solutions from two numerical integration algorithms.

In this and the following three videos, we present a canonical introduction to an mRNA-protein system from systems biology 101. In the fourth video in this slide deck, we summarize the process of linear stability analysis that can be applied to systems of differential equations that can be expressed in the form of 2x2 matrix equations.

In this first video, we obtain the system of differential equations describing this model by presenting assumptions that mRNA molecules are transcribed and degraded and that copies of protein are translated and degraded.

Some of the trajectories in mRNA-protein level state space are one-dimensional (unbending). This insight allows us to learn that the dynamics of the vector in mRNA-protein state space are described by a linear combination of eigenvectors whose weighting coefficients are exponential functions of time, with rate constants equal to the corresponding eigenvalues.
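A sketch of this construction, assuming one common form of the model: mRNA degraded at rate `ALPHA_M` per copy, protein translated at rate `BETA_P` per mRNA copy and degraded at rate `ALPHA_P` per copy. The parameter names and values are hypothetical. The vector `u` holds deviations of the mRNA and protein levels from their steady-state values, so that its dynamics are d`u`/d*t* = *J* `u` with *J* = [[-`ALPHA_M`, 0], [`BETA_P`, -`ALPHA_P`]].

```python
import math

# Hypothetical rate coefficients; the construction needs ALPHA_M != ALPHA_P.
ALPHA_M, ALPHA_P, BETA_P = 1.0, 0.2, 5.0

# Eigenvalue/eigenvector pairs of J = [[-ALPHA_M, 0], [BETA_P, -ALPHA_P]].
EIGENPAIRS = [
    (-ALPHA_M, (ALPHA_P - ALPHA_M, BETA_P)),
    (-ALPHA_P, (0.0, 1.0)),
]


def decompose(u0):
    """Weights (c1, c2) such that u0 = c1*v1 + c2*v2."""
    (_, v1), (_, v2) = EIGENPAIRS
    c1 = u0[0] / v1[0]        # v2 has no mRNA component
    c2 = u0[1] - c1 * v1[1]   # v2 = (0, 1)
    return c1, c2


def deviation(t, u0):
    """u(t) as a sum of eigenvectors weighted by c_i * exp(eigenvalue_i * t)."""
    weights = decompose(u0)
    return tuple(
        sum(c * v[k] * math.exp(lam * t) for c, (lam, v) in zip(weights, EIGENPAIRS))
        for k in range(2)
    )


def velocity(u):
    """Right-hand side J*u of the linear system."""
    return (-ALPHA_M * u[0], BETA_P * u[0] - ALPHA_P * u[1])


u0 = (2.0, -3.0)  # hypothetical initial deviation (mRNA, protein)
for t in (0.0, 1.0, 5.0):
    print(f"t={t:4.1f}  u(t)={deviation(t, u0)}")
```

Starting exactly on one eigenvector sets the other weight to zero, so the trajectory stays on a straight line through the steady state.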

Adaptation is not absence of change; instead, it is the presence of eventually compensatory changes. See also: Ma, Trusina, El-Samad, Lim, and Tang, "Defining network topologies that can achieve biochemical adaptation," *Cell* **138**: 760-773 (2009).

In this video, we describe an example of an incoherent feed-forward loop molecular circuit topology, which, as we learn in the following two videos, supports adaptation. In the fourth video in this slide deck, we summarize the method of almost linear stability analysis that can be used to study systems in which the differential equations cannot be expressed in the form of a matrix equation with constant coefficients.

Adaptation is the eventual restoration of the level of a read out despite the lasting presence of a change in a stimulus that temporarily caused the read out to change. The incoherent feed-forward loop is one way to use three nodes to produce this effect. After the level of input A rises, activation of read out C rises, but inhibition of C through B also rises. The final steady-state level of read out C is unchanged. However, since the level of inhibitor B takes some time to rise, inhibition of C is temporarily insufficient to compensate for increased activation of C by A. Thus, the level of C is temporarily higher before it approaches its original value.
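This behavior can be reproduced with a very small simulation. The rate laws below (d*B*/d*t* = *A* - *B* and d*C*/d*t* = *A*/*B* - *C*) are a hypothetical minimal choice that makes the steady-state level of C independent of A; they are not necessarily the equations used in the videos. The sketch applies a step increase in A and integrates with simple Euler steps.

```python
def simulate_iffl(a_before=1.0, a_after=2.0, dt=0.001, t_end=20.0):
    """Return the time course of read out C after a step change in input A."""
    # Start at the steady state for the initial input: B = A and C = 1.
    b, c = a_before, 1.0
    a = a_after  # step change in A applied at t = 0 and held thereafter
    trace = [c]
    for _ in range(int(round(t_end / dt))):
        db = a - b       # A activates B; B decays
        dc = a / b - c   # A activates C, B inhibits C; C decays
        b += dt * db
        c += dt * dc
        trace.append(c)
    return trace


trace = simulate_iffl()
print("C before the step:", trace[0])
print("peak level of C:  ", max(trace))
print("C at the end:     ", trace[-1])
```

The read out rises while B lags behind A, then relaxes back toward its original value once B has caught up.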

We visualize nullclines and critical points in the BC phase portrait before and after a step change in A.

The system of differential equations describing the incoherent feed-forward loop in this example cannot be directly expressed in the form of a 2x2 matrix equation with constant coefficients. A power series expansion is used to identify higher-order terms that are neglected in the vicinity of the critical point. The remaining portion of the system of differential equations is linear and can be analyzed using eigenvalue-eigenvector methods. The dynamics obtained are consistent with the dynamics described more qualitatively in the previous video.

Even though an almost linear system is not exactly a linear system, the portions of the system that are not linear vanish with decreasing distance from the critical point of interest faster than the linear portion vanishes. The linear portion (which can be expressed using a matrix equation with constant coefficients) dominates near the critical point. The cribsheet of linear stability analysis can be used to classify a critical point of an almost linear system with two modifications. If application of linear stability analysis suggests a star or a degenerate node, the shapes of the trajectories should be checked by carefully graphing by hand. If application of linear stability analysis suggests a center, actual trajectories will circulate, but they need to be carefully graphed by hand to determine whether they sink inward, expand outward, or are closed.

In this and the following four videos, we present some concepts that can be used to design and recognize mathematical models that support oscillatory behavior. In this first video, we show that oscillations can be viewed as cyclic loops in a 2-dimensional plane. One way to arrange for a pair of variables *R* and *J* to perform oscillations is to let the time-derivative of each variable be proportional to the value of the other variable, with a negative sign in the coefficient of one of these differential equations.

The angles at which nullclines pass through the phase plane (e.g. steep vs. shallow) determine the relative arrangement of regions in which quivers point in the top-left, bottom-left, bottom-right, and top-right directions. By modifying the slopes of nullclines, and thus the relative positions of these regions, the qualitative dynamics of a dynamical system might be modified to support a stable star, a stable spiral, a closed loop, or even an unstable spiral. One way to understand how parameters affect trajectories is to understand how parameters affect the slopes that nullclines make when drawn in the phase plane.

Interfaces between the physical sciences and oncology have become especially active in recent years owing, in part, to the Physical Sciences-Oncology Centers (PSOC) Network funded by the U.S. National Cancer Institute. While physical and mathematical scientists have historically contributed to instrumentation and technology development in the medical sciences, the PSOC network also promotes the application of physical sciences ways of thinking to understanding basic cancer biology and cancer therapy.

- Liao D, Estévez-Salmerón L, and Tlsty T D 2012 Conceptualizing a tool to optimize therapy based on dynamic heterogeneity **†** *Phys. Biol.* **9**(6):065005 (doi:10.1088/1478-3975/9/6/065005) (open-access online)
- Liao D, Estévez-Salmerón L, and Tlsty T D 2012 Generalized principles of stochasticity can be used to control dynamic heterogeneity *Phys. Biol.* **9**(6):065006 (doi:10.1088/1478-3975/9/6/065006) (open-access online)

**†** The authors dedicate this paper to Dr Barton Kamen who inspired its initiation and enthusiastically supported its pursuit.

The research described in these articles was supported by award U54CA143803 from the US National Cancer Institute. The content is solely the responsibility of the authors and does not necessarily represent the official views of the US National Cancer Institute or the US National Institutes of Health.

(C) 2012-2013 David Liao (lookatphysics.com) CC-BY-SA (license updated 2013 March 27). When distributing this set of three videos under the Creative Commons license, please cite the full journal references above (including authors and dois) as well as the citation information for this video collection:

Title: Dynamic heterogeneity for the physical oncologist

Author of work: David Liao

The full citation of the papers (at least the first paper) is necessary because the journal *Phys. Biol.* has released these works under a CC-BY-NC-SA license. These papers are copyrighted and not public domain.

In the previous video, we asked whether phenotypic interconversion was a source of therapeutic failure or a therapeutic opportunity. In this video, we develop a graphical device, called a metronomogram, to understand that the dynamics of a phenotypically interconverting population (eventual reduction, expansion, or maintenance of population size) can depend on whether therapy is administered with sufficient frequency.

We use a simple lattice model of synchronous reproduction of annual plants to give an example of a kind of spatially-resolved modeling that is easy to program into personal computers for routine study. This example happens to use a "winner takes all" replacement rule. See Nowak and May, *Nature* (1992) for an article describing spatial patterns that can arise when using a "winner takes all" model. In this video, we see that heterogeneous coexistence (as distinguished from homogeneous dominance by a single subpopulation) can sometimes be promoted by spatial localization.

In this unit, we provide intuitional background for studying Jeremy England's recent paper, "Statistical physics of self-replication," at a level mostly appropriate for algebra-based high school physics courses. To understand the irreversibility of a macroscopic state change, it is important to compare the volumes of the portions of phase space corresponding to two macrostates within the volume of phase space that is kinetically accessible. In this video, we provide probabilistic language for describing the dynamic exploration of microstates of a universe.

In the models we will consider, the conditional probability of a transition from a microstate of the universe, *i*, to a microstate of the universe, *j*, is equal to the conditional probability of a transition from microstate *j* to microstate *i*. We refer to this assumption as an assumption of microscopic reversibility.

Transitions from a cluster of microstates of the universe associated with one microstate of the system to another cluster of microstates of the universe associated with another microstate of the system can be probabilistically favored to proceed in the forward direction. This occurs when the number of microstates in the final cluster is greater than the number of microstates in the initial cluster. Irreversibility equals the ratio of the number of microstates of the universe in the final cluster to the number of microstates of the universe in the initial cluster. Irreversibility increases with increasing heat exhausted to the reservoir when paths are taken in the forward direction.

The irreversibility of a transition from a macroscopic state to another macroscopic state depends on the numbers of microstates of the universe in the two macrostates. The irreversibility of such a transition effectively equals the ratio of the number of kinetically accessible microstates of the universe belonging to the second macrostate of interest to the number of kinetically accessible microstates of the universe belonging to the first macrostate of interest.