David's illustrations have been published in Science, Physical Review Letters, Molecular Pharmaceutics, Biosensors and Bioelectronics, and the Proceedings of the National Academy of Sciences.
University of California, San Francisco
Harvey Mudd College
BS, Physics, 2005
Advisor: Robert J. Cave
A mathematical way to think about biology comes to life in this lavishly illustrated video book. After completing these videos, students will be better prepared to collaborate in physical sciences-biology research. These lessons demonstrate a physical sciences perspective: training intuition by deriving equations from graphical illustrations.
"Excellent site for both basic and advanced lessons on applying mathematics to biology."
-Tweeted by the U.S. National Cancer Institute's Office of Physical Sciences Oncology
This is not a quick fix: It can take a couple months to work through this material at a comprehensible pace. We briefly review algebra and calculus, describe basic probabilistic modeling, explain how to solve dynamical systems, and then present an area of application in physical oncology. Even after viewing these sections, students will still need to invest significant effort in order to participate in multidisciplinary research. These videos provide starting points for conversation between biological and physical disciplines. Students may wish to return to these tutorials periodically for review as research proceeds. (PowerPoint files and backup links to videos, in case the Udemy versions experience technical difficulties, are available at the main website, lookatphysics.com.)
This video outlines contents from the course
In this and the following three videos, we will review the concept of quantity, which is represented by numbers. In this video, we review two ways in which we learned to think about numbers in elementary school. We used numbers to refer to the idea of having distinct manipulatives, and we used numbers to refer to the idea of labeling geographic locations with addresses.
The analysis of a system of particles displaying Bose-Einstein statistics is an example of a situation in which it is important to be aware of whether we are thinking of numbers in terms of distinct manipulatives or in terms of addresses on a street. Incorrectly assuming that atomic and subatomic particles are just as distinct as the plastic counting manipulatives from kindergarten leads to overestimating the number of ways that particles can be excited out of the lowest energy state. A system of particles that tends to occupy the lowest energy state in a way that is quantitatively consistent with thinking of numbers in terms of addresses (rather than thinking of particles as distinct manipulatives) is sometimes referred to as a Bose-Einstein condensate.
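To make the overcounting concrete, here is a minimal Python sketch that counts the ways to place a few particles among a few single-particle states under each way of thinking. The function names and the example sizes (3 particles, 2 states) are illustrative assumptions, not taken from the video.

```python
from math import comb


def ways_distinct(particles: int, states: int) -> int:
    """Count arrangements when every particle is a labeled object (a 'manipulative')."""
    return states ** particles


def ways_indistinguishable(particles: int, states: int) -> int:
    """Count arrangements when only the occupancy of each state matters."""
    # Standard "stars and bars" count of occupancy patterns.
    return comb(particles + states - 1, particles)


if __name__ == "__main__":
    k, m = 3, 2  # hypothetical example sizes
    print("labeled particles:", ways_distinct(k, m))
    print("occupancy patterns:", ways_indistinguishable(k, m))
```

With three particles and two states, exactly one arrangement has every particle in the lower state under either count, but it is one of 8 labeled arrangements versus one of 4 occupancy patterns. Treating the particles as labeled therefore inflates the number of excited arrangements.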
Numbers can be represented using a number line, a wedge, and place-value representation. The application of memorized rules for performing arithmetic on numerals formatted in place-value representation is called algorism.
Infinity is not a number. There is no tick mark on the number line labeled "infinity."
This slide deck presents aspects of quantitative "vocabulary" (variables) and quantitative "grammar" (functions and function composition) that will allow us to express quantitative reasoning in future slide decks. In this first of five videos, we note that it is cumbersome to describe quantitative relationships purely through the enumeration of repetitive examples involving concrete numbers. This difficulty can be addressed with the assistance of abstract "placeholder," "stand-in" symbols. A variable is a symbol that stands in for a number at once arbitrary, yet specific and particular. Using variables, we can communicate quantitative relationships concisely.
Functions are basic building-block sentences of mathematical reasoning. A function relates input values in a domain to output values in a codomain, and these associations can be depicted using plots. While different disciplines use slightly different definitions of a function, an essential stipulation familiar to scientists and mathematicians from a variety of fields is that a function associates each input value with precisely one output value.
Functions can be combined by using the output of one function as the input for another function. The resulting object is a composite function, which is one way to combine mathematical ideas to derive mathematical conclusions.
When two functions are called each other's inverses, they can be composed. The overall composite function has the property that the value entered as an input is returned as an output. The plot of the composition of inverse functions is the diagonal line y = x.
When we try to think of an inverse of the squaring function, we encounter two difficulties. One problem is that the reflection of the parabola y = x^2 is, in many places, double-valued, and, thus, not a function. Second, this plot does not explore negative input values. When we attempt to address this second difficulty, we develop the idea of the imaginary root i, which, when squared, gives -1. Knowledge of the imaginary root will help us to study oscillatory dynamics in a later slide deck.
Graphical and analytic understanding of solving the quadratic equation
Plotting quadratic functions
Completing the square
The geometry routinely used by physical scientists on a day-to-day basis is only a small portion of the typical high school course. Useful concepts include the notion of a flat space (as opposed to a curved space), as well as the Pythagorean theorem.
The unit circle is a circle of radius one centered at the origin of the xy-coordinate plane. The location of a point on a circle is specified by the angle θ it sweeps counterclockwise from the x axis. The location of a point is also specified using its corresponding x- and y-coordinates, which, in this context, are referred to as cos(θ) and sin(θ), respectively.
Using the Pythagorean theorem to relate the lengths of sides of triangles drawn in the context of a circle, we estimate π. We also provide a mnemonic for memorizing π to 6 digits. This allows us to understand that the tick marks on the horizontal axis of the function plots from the previous video correspond to numerical values.
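One classical way to carry out such an estimate is to inscribe a regular polygon in the unit circle and repeatedly double its number of sides. The sketch below assumes this polygon-doubling construction, which may differ from the triangles drawn in the video. Applying the Pythagorean theorem twice gives the new side length from the old one: new_side^2 = 2 - sqrt(4 - old_side^2).

```python
from math import sqrt


def estimate_pi(doublings: int) -> float:
    """Estimate pi as half the perimeter of a polygon inscribed in the unit circle."""
    sides = 6
    side_length = 1.0  # a regular hexagon inscribed in the unit circle has side 1
    for _ in range(doublings):
        # Pythagorean theorem applied to the right triangles formed by
        # the half-side, the apothem, and the radius.
        side_length = sqrt(2.0 - sqrt(4.0 - side_length ** 2))
        sides *= 2
    return sides * side_length / 2.0


if __name__ == "__main__":
    for d in (0, 2, 4, 10):
        print(6 * 2 ** d, "sides:", estimate_pi(d))
```

The hexagon alone gives 3, and each doubling tightens the estimate from below because the inscribed perimeter is always shorter than the circumference.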
Even though sine and cosine are fundamentally defined as functions that provide the y- and x-coordinates, respectively, of points on the unit circle, sine and cosine are also regarded as "trigonometric" functions, which describe the geometry of right triangles. We practice applying this perspective as we derive two examples of identities involving sine and cosine.
Gauss summation trick, which is used when counting the number of pairwise interactions in a population of components
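As a small check of the counting argument, the following Python sketch compares the closed-form sum 1 + 2 + ... + n = n(n + 1)/2 with a brute-force enumeration of pairs. The function names are illustrative assumptions.

```python
from itertools import combinations


def gauss_sum(n: int) -> int:
    """Closed form for 1 + 2 + ... + n."""
    return n * (n + 1) // 2


def pairwise_interactions(components: int) -> int:
    """Number of distinct pairs among the components: (n - 1) + (n - 2) + ... + 1."""
    return gauss_sum(components - 1)


def pairwise_interactions_brute_force(components: int) -> int:
    """Enumerate every unordered pair explicitly and count them."""
    return sum(1 for _ in combinations(range(components), 2))


if __name__ == "__main__":
    n = 20  # hypothetical population size
    print(pairwise_interactions(n), pairwise_interactions_brute_force(n))
```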
How many ways can we arrange n distinct objects in n slots? The answer is n (n - 1) (n - 2) . . . 3 * 2 * 1. Because this kind of calculation appears often in the study of probabilities, we give it a symbol called the factorial: n! = n (n - 1) (n - 2) . . . 3 * 2 * 1.
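A brief Python sketch can confirm the count by listing every arrangement explicitly. The helper name `count_arrangements` is an illustrative assumption.

```python
from itertools import permutations
from math import factorial


def count_arrangements(objects) -> int:
    """Count the orderings of distinct objects by enumerating all of them."""
    return sum(1 for _ in permutations(objects))


if __name__ == "__main__":
    letters = "abcd"  # four distinct objects
    print(count_arrangements(letters), factorial(len(letters)))
```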
Informally, when we say that the limit of a function as x approaches a is L, we mean that as x becomes arbitrarily close to a, the function becomes arbitrarily close to L. This idea is made more precise using the ε-δ definition.
When we say that the limit of a function at a value of x = a is infinity, we mean that as x becomes arbitrarily close to a, the value of the function becomes arbitrarily large.
When we say that a function has a limit of L "at" infinity, we mean that as x becomes arbitrarily large, the function becomes arbitrarily close to L.
When we say that a function has an infinite limit "at" infinity, we mean that as x becomes arbitrarily large, the function becomes arbitrarily large.
An example of a situation in which a function can fail to have a limit at a value of x = a is when the function jumps discontinuously in height at that value of x. One example of a situation in which a function can fail to have a limit at infinity is an oscillatory function that fails to approach a particular value of y = L because it keeps swinging with sustained amplitude up and down through y = L.
In this video, the outline for using the epsilon-delta definition to prove that the limit of a function has a particular value y = L at x = a has two main parts. First, we determine what range of y values the function takes when x is restricted to intervals on either side of the value x = a of interest. Then, we ask whether we can narrow these intervals sufficiently to ensure that the range of y values taken by the function is contained within a range of y values of interest centered at y = L. When we conclude that this can be done for any finite range of such y values, we conclude that the limit of interest exists.
The goal of this and the next 4 videos is to formalize an idea of "slope" and then to build a cribsheet of rules for studying the slopes of some example functions. In this video, we define the derivative, caution against interpreting differentials as numbers, and remark that derivatives do not always exist. It is important to become familiar with derivatives because they provide a basic vocabulary for talking about dynamical systems in the natural sciences (including in biology).
When a function depends on multiple independent variables, the "partial" symbol is reserved to denote slopes calculated by jiggling one independent variable at a time
This set of four videos introduces power series representations. Using a power series representation is like using decimal representation. Both techniques organize the description of the target object at levels of increasing refinement.
In this first video, we show that the second derivative corresponds to the curvature of a plot. In this way, we strengthen intuition that higher-order derivatives can also have geometric interpretations.
In these four videos, we develop a familiarity with integration that will later be useful for deducing functions of time (e.g. number of copies of a molecule as a function of time) using rates of change (e.g. the first derivative of the number of copies of a molecule with respect to time). In this first video, we develop the concept of the definite integral in terms of the area under a curve.
Two wrongs make a right:
Tear two differentials apart as though they retained meaning in isolation.
Slap on the smooth S integral sign as though it were a unit of meaning itself, even without a differential.
You get the same integral expression you would obtain long-hand using u-substitution or "change of variables" in integrals.
Compounding interest with arbitrarily small compounding periods
Power series representation of exp(x)
exp(0) = 1
(exp(x))^p = exp(px)
exp(x)exp(y) = exp(x+y)
Mnemonic for memorizing e = 2.718281828459045...
The natural logarithm is the inverse of the exponential
The indefinite integral of 1/x is ln(x) + C
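The properties listed above can be checked numerically. The sketch below assumes the usual compounding expression (1 + x/n)^n and the power series sum of x^k / k!. The function names and the chosen numbers of periods and terms are illustrative.

```python
from math import exp, factorial, log


def compound(x: float, periods: int) -> float:
    """Growth factor after compounding a rate x over many short periods."""
    return (1.0 + x / periods) ** periods


def exp_series(x: float, terms: int) -> float:
    """Partial sum of the power series for exp(x)."""
    return sum(x ** k / factorial(k) for k in range(terms))


if __name__ == "__main__":
    for n in (1, 10, 1000, 1_000_000):
        print(n, "periods:", compound(1.0, n))
    print("series:", exp_series(1.0, 20), "library:", exp(1.0))
    # The natural logarithm undoes the exponential.
    print(log(exp_series(2.0, 40)))
```

As the compounding periods shrink, the growth factor for x = 1 approaches e, and the partial sums of the power series approach the same value.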
Concepts of stochasticity underlie many of the models of dynamic systems explored in quantitative biology. We describe some of these ideas in this and the following three videos. In this video, we state that systems exhibiting deterministic dynamics can sample a messy variety of waiting times between chemical reaction events even when the motions of component parts are periodic. Particularly, this can happen when the periods of motion of individual parts are incommensurate (pairs of periods form ratios that are irrational).
In a deterministic system with complicated interactions, small differences in initial conditions can quickly avalanche into qualitative differences in dynamics. Since initial conditions can only be measured with finite certainty, the dynamics of such systems are, for practical purposes, unpredictable after short times.
In the previous two videos, deterministic systems displayed dynamics with aspects associated with stochasticity. In contrast, some systems do not merely mimic aspects associated with stochasticity but display indeterminism at a fundamental level. For example, when a collection of completely identical systems later displays heterogeneous outcomes, the systems are fundamentally indeterministic. They have no initial properties that can be used to discern which individual system will display which particular outcome.
Markov models are often used when developing mathematical models of systems which partially or more fully display aspects associated with stochasticity (depending on how fully a system displays aspects associated with stochasticity, the use of a Markov model might need to be recognized as a conceptual approximation). Icons that can represent the use of such models include spinning wheels of fortune and rolling dice.
In this and the following three videos, we present a canonical worked problem that is presented in introductory systems biology coursework. For an example of this mathematical lesson, see Alon, Ch. 2.4, pp. 18-21. In this video, we animate a time sequence of translation and degradation events that cause the number of copies of a protein of interest in a cell to change over time.
We derive a differential equation approximating the time-rate of change of the number of copies of protein in the cell modeled by the animation in the previous video. This differential equation reads, dx/dt = β - αx. We depict aspects of this differential equation with a flowchart. It is important to remember that this differential equation does not represent all aspects of the stochastic dynamics in the toy model presented in the previous video.
We sketch a slope field corresponding to the differential equation derived in the previous video. We use this slope field to draw a qualitative curve describing how the number of copies of protein is expected to rise over time, when starting from an initial value of zero.
We obtain an analytic solution for the relationship between the number of copies of protein and time for the differential equation qualitatively investigated in the previous video. We find that the rise time, T1/2, is ln(2) divided by the degradation rate coefficient, α. The fact that the rise time is independent of the translation rate β is sometimes used as a pedagogical example of the importance of quantitative reasoning for gaining insights into biological dynamics that would be difficult to develop through natural-language and vaguely-structured notional reasoning alone.
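The independence of the rise time from β can also be checked numerically. The sketch below integrates dx/dt = β - αx from x = 0 with a simple forward-Euler scheme and records when x first reaches half of the steady-state level β/α. The step size and parameter values are illustrative assumptions.

```python
from math import log


def rise_time(beta: float, alpha: float, dt: float = 1e-4) -> float:
    """Time for x to reach half of its steady-state value beta/alpha, starting from 0."""
    x, t = 0.0, 0.0
    half_level = 0.5 * beta / alpha
    while x < half_level:
        x += (beta - alpha * x) * dt  # forward-Euler update of dx/dt = beta - alpha*x
        t += dt
    return t


if __name__ == "__main__":
    alpha = 2.0  # hypothetical degradation rate coefficient
    for beta in (1.0, 50.0):  # two very different translation rates
        print("beta =", beta, "rise time =", rise_time(beta, alpha))
    print("ln(2)/alpha =", log(2) / alpha)
```

Changing β rescales the steady-state level but leaves the time needed to reach half of it essentially unchanged, in agreement with T1/2 = ln(2)/α.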
Using a collision picture to understand why reaction rates look like polynomials of reactant concentrations
Cooperativity of a simple (oversimplified) kind
How Hill functions, considered in combination with linear degradation, can support bistability
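A minimal sketch of the bistability idea follows, assuming a model of the form dx/dt = β x^n / (K^n + x^n) - αx. The parameter values (β = 2.5, K = 1, n = 2, α = 1) are hypothetical and chosen so that the fixed points fall at x = 0, 0.5, and 2. They are not taken from the video.

```python
def hill_production(x: float, beta: float = 2.5, K: float = 1.0, n: int = 2) -> float:
    """Sigmoidal production rate described by a Hill function."""
    return beta * x ** n / (K ** n + x ** n)


def settle(x0: float, alpha: float = 1.0, dt: float = 1e-3, steps: int = 50_000) -> float:
    """Integrate dx/dt = hill_production(x) - alpha*x with forward Euler and return the final x."""
    x = x0
    for _ in range(steps):
        x += (hill_production(x) - alpha * x) * dt
    return x


if __name__ == "__main__":
    for start in (0.4, 0.6):
        print("start at", start, "-> settles near", round(settle(start), 4))
```

Starting points on either side of the middle fixed point relax to different steady states, which is the signature of bistability.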
This video introduces collisional population dynamics and tabular game theory (comparative statics). The particular game in this example is the prisoner's dilemma. In this game, survival of the relatively most fit occurs simultaneously with a decrease in overall fitness. For a printable tutorial explaining how evolutionary game theoretic differential equations can be applied to analyze population dynamics, please refer to doi:10.1098/rsfs.2014.0037.
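The simultaneous takeover and decline can be sketched with a replicator-type equation, dx/dt = x (f_C - f_mean), where x is the fraction of cooperators. The payoff values below (3, 0, 5, 1) are a common textbook choice for the prisoner's dilemma and are an assumption here, as are the function names and step size.

```python
# Payoff to the row player for each pairing; hypothetical textbook values.
PAYOFF = {("C", "C"): 3.0, ("C", "D"): 0.0, ("D", "C"): 5.0, ("D", "D"): 1.0}


def fitnesses(x: float):
    """Return (cooperator fitness, defector fitness, population mean) for cooperator fraction x."""
    f_c = PAYOFF[("C", "C")] * x + PAYOFF[("C", "D")] * (1.0 - x)
    f_d = PAYOFF[("D", "C")] * x + PAYOFF[("D", "D")] * (1.0 - x)
    return f_c, f_d, x * f_c + (1.0 - x) * f_d


def simulate(x0: float = 0.9, dt: float = 0.01, steps: int = 2000):
    """Forward-Euler integration; returns a list of (cooperator fraction, mean fitness)."""
    x, history = x0, []
    for _ in range(steps):
        f_c, _, mean = fitnesses(x)
        history.append((x, mean))
        x += x * (f_c - mean) * dt
    return history


if __name__ == "__main__":
    trajectory = simulate()
    print("start:", trajectory[0])
    print("end:  ", trajectory[-1])
```

Defectors always have the higher fitness of the two types, so they take over, yet the population's mean fitness falls as they do.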
In the previous slide deck, we noted similarities between population dynamics and business transaction payoff pictures. In this and the next video, we provide deeper understanding of these connections. In this video, we derive the population dynamics equations in such a way that it is natural to say that cells being modeled repeatedly play games and are subject to game outcomes. For a printable tutorial describing interpretations that can be associated with evolutionary game theoretic differential equations, please see doi:10.1098/rsfs.2014.0038.
The first of five videos on introductory statistics, this module introduces probability distributions and averages. The average (also called "arithmetic mean") quantitatively expresses the notion of a central tendency among the results of an experiment.
The average of a sum is the sum of the averages. The average of a constant multiplied against a function is the constant multiplied by the average of the function. The average of a constant is the constant itself.
The variance of a function is the average of the square of the deviation of the function from its own average. For the purposes of theoretic calculations, it might be useful to express the variance using the "inside-out" computation formula described in this video.
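As a numerical check, the sketch below computes the variance in two ways, using the standard definition (the average squared deviation from the mean) and the identity variance = (average of the square) - (square of the average). It assumes that this identity is what the "inside-out" formula refers to. The fair six-sided die is an illustrative example.

```python
def average(values, probs) -> float:
    """Probability-weighted average of a list of values."""
    return sum(v * p for v, p in zip(values, probs))


def variance_definition(values, probs) -> float:
    """Average squared deviation from the mean."""
    mean = average(values, probs)
    return average([(v - mean) ** 2 for v in values], probs)


def variance_inside_out(values, probs) -> float:
    """Average of the square minus the square of the average."""
    return average([v ** 2 for v in values], probs) - average(values, probs) ** 2


if __name__ == "__main__":
    faces = [1, 2, 3, 4, 5, 6]
    probs = [1 / 6] * 6  # fair die
    print(variance_definition(faces, probs), variance_inside_out(faces, probs))
```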
Two variables are said to be statistically independent if the outcome of an experiment tracked by one variable does not affect the relative likelihoods of different outcomes of the experiment tracked by the other variable. The two-variable probability distribution factorizes into two probability distribution functions.
The covariance of statistically independent variables is zero. The variance of a sum of statistically independent variables equals the sum of the variances of the variables. This identity is often used to derive uncertainty propagation formulas.
This slide deck provides examples of how hypotheses about probabilistic processes can be used to discuss probability distributions and obtain theoretical values for averages and variances. In this first video, we describe the Bernoulli trial, which corresponds to the experiment in which a coin is flipped to determine on which of two sides it lands.
In this second video in this slide deck, we discuss the binomial distribution. This distribution describes the probability of getting x heads out of N coin tosses (Bernoulli trials), each individually having probability p of success.
In the Poisson limit, we take a series of [independent] Bernoulli trials (giving rise to a binomial distribution) and allow the number of coin flips N to increase without bound while allowing the chance p of success on a particular coin flip to decrease toward zero in such a compensatory fashion that the average number of successes ("heads") is unchanged. Because the likelihood of "heads" on any given toss becomes arbitrarily small, this limit is called the limit of rare events.
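The limit can be watched numerically by holding the mean Np fixed while N grows. In the sketch below, the mean of 2 and the values of N are illustrative assumptions. The Poisson formula used is the standard one, mean^x exp(-mean) / x!.

```python
from math import comb, exp, factorial


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of x successes in n independent Bernoulli trials."""
    return comb(n, x) * p ** x * (1.0 - p) ** (n - x)


def poisson_pmf(x: int, mean: float) -> float:
    """Probability of x events for a Poisson distribution with the given mean."""
    return mean ** x * exp(-mean) / factorial(x)


def max_gap(n: int, mean: float, x_max: int = 10) -> float:
    """Largest difference between the two distributions for x = 0..x_max, with p = mean / n."""
    p = mean / n
    return max(abs(binomial_pmf(x, n, p) - poisson_pmf(x, mean)) for x in range(x_max + 1))


if __name__ == "__main__":
    for n in (10, 100, 10_000):
        print("N =", n, "largest gap =", max_gap(n, 2.0))
```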
To study the combinatorics involved in an example where the central limit theorem applies, we will need to work with the factorials of large numbers. Stirling's approximation is an approximation for n! for large n. In this video, we motivate this approximation by comparing the expression for ln(n!) with an integral of the natural log function.
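The comparison can be sketched numerically. The integral of ln(x) from 1 to n equals n ln(n) - n + 1, which suggests the leading approximation ln(n!) ≈ n ln(n) - n. The additional term (1/2) ln(2πn) is the standard refinement and may go beyond what the video derives. The function names are illustrative assumptions.

```python
from math import lgamma, log, pi


def ln_factorial(n: int) -> float:
    """ln(n!) computed via the log-gamma function, since n! = Gamma(n + 1)."""
    return lgamma(n + 1)


def integral_of_ln(n: int) -> float:
    """Integral of ln(x) from 1 to n."""
    return n * log(n) - n + 1


def stirling_leading(n: int) -> float:
    """Leading-order Stirling approximation to ln(n!)."""
    return n * log(n) - n


def stirling_refined(n: int) -> float:
    """Stirling approximation including the (1/2) ln(2 pi n) term."""
    return stirling_leading(n) + 0.5 * log(2 * pi * n)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, ln_factorial(n), integral_of_ln(n), stirling_leading(n), stirling_refined(n))
```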
The central limit theorem states that a Gaussian probability distribution arises when describing an overall variable that is a sum of a large number of independently randomly fluctuating variables, no small number of which dominate the fluctuations of the overall variable.
Because equipment in physics experiments is highly engineered, individual device contributions to measurement fluctuations might be "small." The overall fluctuations in the final measured quantity might be well approximated using a first-order Taylor expansion in terms of individual device fluctuations. Fluctuations in measurements are thus sums over random variables, and thus, potentially Gaussian distributed.
The levels of molecules in biological systems can approximate "temporary" steady-state values that equal products of rate coefficients and reactant concentrations. Since logarithms convert products into sums, the logarithms of the levels of some biological molecules can be normally distributed. Hence, the levels of the biological molecules are log-normally distributed.
Standard deviation vs. sample standard deviation
Mean vs. sample mean
Standard deviation of the mean vs. standard error of the mean
"I quantitated staining intensity for 1 million cells from 5 patients, everything I measure is statistically significant!" It is quite possible that you need to use n = 5, instead of 5 million, for the √ n factor in the standard error.
In order to identify theoretical curves that closely imitate a set of experimental data, it is necessary to be able to quantify to what extent a set of data and a curve look similar. To address this need, we present the definition of the quantity chi-squared. For a given number of measurements, a smaller chi-squared indicates a closer match between the data and the curve of interest. In other words, a smaller chi-squared corresponds to a situation in which it looks more as though the data "came from" Gaussian distributions centered on the curve. The average chi-squared value across a number of experiments, each involving M measurements, is M.
We slightly modify the definition of chi-squared developed in the previous video for the situation in which a "correct" curve has not been theoretically determined beforehand. We choose a "best guess" curve with corresponding best guess values of fitting parameters by minimizing chi-squared, which corresponds to maximizing likelihood.
Using the concepts developed in the preceding two videos, we present a checklist of steps necessary for performing fitting of mathematical curves to data with error bars. These steps include checking whether the reduced chi-squared value is in the neighborhood of unity and inspecting a plot of normalized residuals to check for systematic patterns. This algorithm is appropriate for general education undergraduate "teaching laboratory" courses.
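The checklist can be sketched for the simplest case of a straight-line fit. The code below assumes the model y = slope * x + intercept, uses the closed-form weighted least-squares solution to minimize chi-squared, and divides by the number of measurements minus the two fitted parameters to obtain the reduced chi-squared. The data values and error bars are made up for illustration.

```python
# Hypothetical measurements with error bars (made-up numbers).
XS = [0.0, 1.0, 2.0, 3.0, 4.0]
YS = [1.1, 2.9, 5.2, 6.8, 9.1]
SIGMAS = [0.2, 0.2, 0.2, 0.2, 0.2]


def fit_line(xs, ys, sigmas):
    """Weighted least-squares line: the (slope, intercept) that minimizes chi-squared."""
    w = [1.0 / s ** 2 for s in sigmas]
    s0 = sum(w)
    sx = sum(wi * x for wi, x in zip(w, xs))
    sy = sum(wi * y for wi, y in zip(w, ys))
    sxx = sum(wi * x * x for wi, x in zip(w, xs))
    sxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
    delta = s0 * sxx - sx ** 2
    return (s0 * sxy - sx * sy) / delta, (sxx * sy - sx * sxy) / delta


def normalized_residuals(xs, ys, sigmas, slope, intercept):
    """Residuals divided by their error bars; inspect these for systematic patterns."""
    return [(y - (slope * x + intercept)) / s for x, y, s in zip(xs, ys, sigmas)]


def chi_squared(xs, ys, sigmas, slope, intercept):
    """Sum of squared normalized residuals."""
    return sum(r ** 2 for r in normalized_residuals(xs, ys, sigmas, slope, intercept))


def reduced_chi_squared(xs, ys, sigmas, slope, intercept, n_params=2):
    """Chi-squared per degree of freedom; values near 1 suggest a plausible fit."""
    return chi_squared(xs, ys, sigmas, slope, intercept) / (len(xs) - n_params)


if __name__ == "__main__":
    m, b = fit_line(XS, YS, SIGMAS)
    print("slope, intercept:", m, b)
    print("reduced chi-squared:", reduced_chi_squared(XS, YS, SIGMAS, m, b))
    print("normalized residuals:", normalized_residuals(XS, YS, SIGMAS, m, b))
```

A reduced chi-squared far above 1 suggests a poor model or underestimated error bars, while a value far below 1 suggests overestimated error bars.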
Dynamics of population fractions
Model: RNA polymerase makes many (usually unsuccessful) independent attempts to initiate transcription. Once an mRNA strand is produced, it becomes subject to many independent (usually unsuccessful) attempts at degradation.
Outcome: As in part a, mRNA copy numbers are Poisson distributed
Relative dominance in a population is determined, not merely by "fitness" alone, but also depends on the degree to which individuals "breed true."
In this and the next video, we develop a familiarity with the representation of vector rotations using rotation matrices. This understanding is helpful for identifying dynamical systems that support oscillations in physics, engineering, and biology. A rotation operator rotates a vector by an angle without changing the length of the vector. A rotation matrix represents the action of a rotation operator on a vector.
How can we determine whether a dynamical system can be represented using something that looks like a rotation matrix? Rotation matrices have complex eigenvalues. We can determine whether a dynamical system supports rotational motion by determining whether the matrix representing the system's dynamics has complex eigenvalues.
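For a two-by-two matrix, the test can be sketched directly from the characteristic polynomial, whose roots are determined by the trace and determinant. The function names and the example matrices below are illustrative assumptions.

```python
import cmath
from math import cos, sin


def eigenvalues_2x2(a: float, b: float, c: float, d: float):
    """Eigenvalues of the matrix [[a, b], [c, d]] from its trace and determinant."""
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace ** 2 - 4.0 * det)
    return (trace + root) / 2.0, (trace - root) / 2.0


def supports_rotation(a: float, b: float, c: float, d: float, tol: float = 1e-12) -> bool:
    """True when the eigenvalues have a nonzero imaginary part."""
    return any(abs(ev.imag) > tol for ev in eigenvalues_2x2(a, b, c, d))


if __name__ == "__main__":
    theta = 0.3  # hypothetical rotation angle in radians
    rotation = (cos(theta), -sin(theta), sin(theta), cos(theta))
    print(eigenvalues_2x2(*rotation))  # expected form: cos(theta) +/- i sin(theta)
    print(supports_rotation(0.0, -1.0, 1.0, 0.0))   # oscillator-like dynamics
    print(supports_rotation(-1.0, 0.0, 0.0, -2.0))  # pure decay along each axis
```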
CAUTION: I'm not familiar enough with numerical integration to know whether the particular example of the method for step-size adaptation in the video is used generally (or at all) in commonly available software packages. The purpose of the example was to show that it is possible to generate an error estimate (a) without knowledge of the actual solution and (b) by comparing the solutions from two numerical integration algorithms.
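The two points (a) and (b) can be illustrated with a small sketch that takes each step twice, once with the forward-Euler method and once with Heun's method, and uses the difference as an error estimate that never consults the exact solution. This pairing, the halving and doubling rule, and the tolerance are assumptions chosen for illustration. They are not claimed to match the video or any particular software package.

```python
from math import exp


def step_with_error(f, t: float, x: float, h: float):
    """Take one step with Euler and with Heun; return (Heun result, |Heun - Euler|)."""
    k1 = f(t, x)
    euler = x + h * k1
    k2 = f(t + h, euler)
    heun = x + 0.5 * h * (k1 + k2)
    return heun, abs(heun - euler)


def integrate(f, t0: float, x0: float, t_end: float, tol: float = 1e-6, h: float = 0.1) -> float:
    """Integrate dx/dt = f(t, x), shrinking or growing the step based on the error estimate."""
    t, x = t0, x0
    while t < t_end:
        last = h >= t_end - t
        if last:
            h = t_end - t  # do not step past the end point
        candidate, err = step_with_error(f, t, x, h)
        if err > tol:
            h *= 0.5  # estimated error too large: retry with a smaller step
            continue
        t = t_end if last else t + h
        x = candidate
        if err < tol / 4.0:
            h *= 2.0  # estimated error comfortably small: try a larger step next
    return x


if __name__ == "__main__":
    decay = lambda t, x: -x  # test problem with known solution exp(-t)
    print("numerical:", integrate(decay, 0.0, 1.0, 1.0))
    print("exact:    ", exp(-1.0))
```

The exact solution appears only in the final comparison, not inside the step-size logic.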
I am very interested in the course, but as a biologist I had little maths during my education years. I only have school-level maths, and it is now very rusty. The lectures are very fast, with little to no explanation if you get stuck. Maybe others are more familiar with the maths required here, so it is easier for them. From my point of view, the course could be divided into two or three parts, and more examples should have been included to make it more intuitive. The language used is really hard to understand. It is very technical.
I love this guy. He's giving me a terrific mathematical background and overview that I can handle.
Good organization. Good explanation. Good pronunciation.
Good language, and clear explanations. Requires work and practice, but that's how you learn math!