
Explore why causal data science matters, introduce directed acyclic graphs, and preview core concepts like Simpson's paradox, confounding, and transportability.
Represent causal structures with graphs, where nodes are variables and edges link them; understand directed versus undirected graphs, paths and cycles, and how directed acyclic graphs relate to causal interpretation.
Explore how directed acyclic graphs reveal conditional independence through d-separation, showing how chains, forks, and colliders block or open paths between variables for causal inference.
Explore causal inference by modeling interventions with the do operator in structural causal models. Trace post-intervention distributions of Y and counterfactual reasoning to predict outcomes of actions.
Explore practical causal inference with a simulated DAG in R, using tidyverse and ggdag to show how intervening on X shifts the distribution of Y and clarifies causation.
Explore how directed acyclic graphs yield testable d-separation implications for causal inference. Use conditional independence tests to assess data, discard incompatible graphs, and iteratively refine models.
Learn how to use dagitty for causal discovery by building a graph, deriving its d-separation implications, and testing conditional independencies with partial correlations to refine the model.
Explore considerations in causal discovery with acyclic graphs, including the limits of equivalence classes, the complexity of the PC algorithm, and conditional independence tests for different data types, with FCI as an alternative when causal sufficiency fails.
Explore confounding bias in causal data science with directed acyclic graphs, using back-door and front-door criteria, do calculus, and identification tasks to recover causal effects from observational data.
Apply the backdoor criterion to identify valid adjustment sets that block spurious paths from X to Y, enabling identification and estimation of causal effects from observational data.
Explain front-door adjustment in causal graphs, showing how Z intercepts all directed paths from X to Y, has no unblocked back-door path from X to Z, and, once X is conditioned on, has all back-door paths from Z to Y blocked.
Demonstrate practical causal inference in R using dagitty and ggdag to perform backdoor and front-door adjustment, do-calculus, and propensity-score weighting on DAGs with observed and unobserved factors.
Explore how z-identification extends instrumental variable ideas to identify causal effects in DAGs when X cannot be manipulated, using graphical criteria and do-calculus.
Apply z-identification in R to identify the causal effect of X on Y in a DAG with unobserved confounders, using the aux.effect algorithm and surrogate experiments on Z.
Apply directed acyclic graphs to diagnose and recover causal effects under selection bias, using selection diagrams (G_s) and do-calculus, while addressing collider bias and non-parametric methods.
Explore recovering conditional and interventional distributions from selection bias using selection diagrams, d-separation, and do-calculus, with a practical two-step strategy and examples.
Examine how transportability of causal knowledge across structurally different domains uses selection diagrams to compare source and target domains, and determine when experiments or observational data transport causal effects.
Learn how s-admissibility and do-calculus enable transportability of causal effects across populations using selection diagrams and reweighting. See practical examples from economics and education illustrating admissibility and do-calculus applications.
Explore transportability in causal data science, including z-transportability via surrogate experiments and meta-transportability, which combines heterogeneous source studies using do-calculus and selection diagrams.
Map causal effects to data using directed acyclic graphs and do-calculus. Learn to specify a query, build a DAG model, and assess data types for complete identification.
This course offers an introduction to causal data science with directed acyclic graphs (DAGs). DAGs combine mathematical graph theory with statistical probability concepts and provide a powerful approach to causal reasoning. Originally developed in computer science and artificial intelligence, they have recently gained traction in other scientific disciplines as well (such as machine learning, economics, finance, health sciences, and philosophy). DAGs make it possible to check the validity of causal statements using intuitive graphical criteria that do not require algebra. In addition, they open the possibility of fully automating the causal inference task with the help of special identification algorithms. As an encompassing framework for causal thinking, DAGs are becoming an essential tool for everyone interested in data science and machine learning.
The course provides an overview of the theoretical advances made in causal data science during the last thirty years. The focus lies on practical applications of the theory, and students will be equipped to apply causal data science methods in their own work. Hands-on examples in the statistical software R guide students through the material. There are no formal prerequisites, but a good working knowledge of basic statistics and some programming skills are a benefit.
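To give a flavor of the hands-on material, here is a minimal sketch of the idea behind the simulated-DAG lesson: intervening on X yields a different relationship with Y than simply observing X when a confounder is present. The course itself works in R; this sketch uses only the Python standard library, and the graph (Z -> X, Z -> Y, X -> Y), the coefficients, and the function names are illustrative assumptions rather than the course's own example.

```python
import random


def simulate(n, intervene, rng):
    """Draw n samples from the assumed DAG Z -> X, Z -> Y, X -> Y.

    If intervene is True, X is drawn independently of Z, mimicking do(X).
    """
    xs, ys = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)
        if intervene:
            x = rng.gauss(0, 1)        # do(X): the edge Z -> X is cut
        else:
            x = z + rng.gauss(0, 1)    # observational: X depends on Z
        # Assumed structural equation: the true effect of X on Y is 2.0
        y = 2.0 * x + 3.0 * z + rng.gauss(0, 1)
        xs.append(x)
        ys.append(y)
    return xs, ys


def slope(xs, ys):
    """Least-squares slope of y regressed on x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var


rng = random.Random(0)
naive = slope(*simulate(50_000, False, rng))
causal = slope(*simulate(50_000, True, rng))
print(f"observational slope:  {naive:.2f}")
print(f"interventional slope: {causal:.2f}")
```

Under these assumed equations the observational slope should land near 3.5, because the confounder Z inflates the association, while the interventional slope should land near the true effect of 2.0.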