
Explore probabilistic foundations of neural networks and deep learning, from probability theory to implementing and evaluating multi-layer perceptrons, CNNs, and RNNs (LSTM/GRU) with standard metrics.
Trace the history from machine learning to deep learning, explain artificial neural networks and learning concepts, and explore applications from medical diagnosis to AlphaFold protein structure prediction and synthetic data.
Explore probability distributions that underpin deep learning for discrete variables, including Bernoulli and categorical models. Learn how the Bernoulli parameter μ and the softmax function translate scores into probabilities for binary and multiclass predictions.
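As a rough illustration of the score-to-probability idea above, the sketch below maps a single score to a Bernoulli parameter with the logistic sigmoid and a list of scores to a categorical distribution with softmax. Using the sigmoid for the binary case is an assumption about how the module sets this up, and the function names and example scores are illustrative only.

```python
import math


def sigmoid(score):
    """Map a real-valued score to a probability in (0, 1).

    Branching on the sign keeps math.exp from overflowing for large |score|.
    """
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)
    return e / (1.0 + e)


def softmax(scores):
    """Map a list of scores to probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max improves numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Binary case: one score gives the Bernoulli parameter mu = p(class 1).
mu = sigmoid(0.0)

# Multiclass case: one score per class gives a categorical distribution.
probs = softmax([2.0, 1.0, 0.1])
```

Larger scores receive larger probabilities, and adding the same constant to every score leaves the softmax output unchanged, which is why subtracting the maximum is safe.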
Explore the Gaussian distribution for continuous variables and multivariate data, learn parameter estimation with maximum likelihood and log likelihood, and see its role in deep learning.
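A minimal sketch of maximum likelihood estimation for a univariate Gaussian, assuming independent and identically distributed samples. The data values and function names are made up for illustration and are not taken from the course.

```python
import math


def gaussian_mle(xs):
    """Return the maximum likelihood mean and variance of a 1-D Gaussian."""
    n = len(xs)
    mean = sum(xs) / n
    # The ML variance divides by n (not n - 1), so it is a biased estimate.
    var = sum((x - mean) ** 2 for x in xs) / n
    return mean, var


def gaussian_log_likelihood(xs, mean, var):
    """Log likelihood of the samples under N(mean, var)."""
    n = len(xs)
    sq_err = sum((x - mean) ** 2 for x in xs)
    return -0.5 * n * math.log(2.0 * math.pi * var) - sq_err / (2.0 * var)


data = [2.0, 4.0, 4.0, 6.0]  # illustrative samples
mean_ml, var_ml = gaussian_mle(data)
ll_at_ml = gaussian_log_likelihood(data, mean_ml, var_ml)
```

Any other choice of mean or variance gives a log likelihood no higher than `ll_at_ml` on the same data, which is what makes these estimates the maximum likelihood solution.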
Treat linear regression as a single-layer neural network, using weights and a bias to form a dot product, then apply basic non-linear functions and a probabilistic loss via likelihood.
Explore how single-layer networks perform regression through model fitting, maximum likelihood, and regularization, linking probability distributions, decision theory, and closed-form linear regression.
Explore single-layer networks for classification, covering logistic regression, softmax for multiple classes, and the math behind maximum likelihood and cross-entropy optimization using stochastic gradient descent (SGD).
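The sketch below shows one way logistic regression can be fitted by minimizing cross-entropy with stochastic gradient descent, using a single input feature. The toy data, learning rate, epoch count, and function names are all assumptions chosen for illustration, not the course's own implementation.

```python
import math
import random


def sigmoid(a):
    """Numerically stable logistic sigmoid."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)


def cross_entropy(w, b, xs, ts):
    """Mean binary cross-entropy (negative Bernoulli log likelihood)."""
    total = 0.0
    for x, t in zip(xs, ts):
        y = sigmoid(w * x + b)
        y = min(max(y, 1e-12), 1.0 - 1e-12)  # clamp to avoid log(0)
        total -= t * math.log(y) + (1 - t) * math.log(1.0 - y)
    return total / len(xs)


def train_sgd(xs, ts, lr=0.1, epochs=200, seed=0):
    """Fit weight w and bias b, updating on one example at a time."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            y = sigmoid(w * xs[i] + b)
            err = y - ts[i]  # gradient of the loss w.r.t. the pre-activation
            w -= lr * err * xs[i]
            b -= lr * err
    return w, b


# Illustrative toy data: negative inputs are class 0, positive are class 1.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ts = [0, 0, 0, 1, 1, 1]
w, b = train_sgd(xs, ts)
loss_before = cross_entropy(0.0, 0.0, xs, ts)
loss_after = cross_entropy(w, b, xs, ts)
```

The update uses the fact that, for a sigmoid output with cross-entropy loss, the gradient with respect to the pre-activation reduces to the prediction minus the target.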
Explore why deep neural networks learn hierarchical features in multi-layer perceptrons, how nonlinear activations such as sigmoid, tanh, and ReLU shape training (with ReLU mitigating the vanishing gradients that saturating sigmoid and tanh units can cause), and how data lies on manifolds.
The course Probabilistic Foundations of Neural Networks and Deep Learning is designed to give students a basic to advanced understanding of the concepts and applications of deep learning. Students learn the probabilistic foundations that underlie machine learning, including probability theory, standard distributions, and their parameters. The discussion continues with single-layer networks for regression and classification, which shows how simple models are linked to probability theory and loss functions.
Next, students will explore deep neural networks with a focus on the multi-layer perceptron (MLP) architecture, non-linear activation functions, and how network depth increases representational capacity. The course also discusses important issues such as the curse of dimensionality, regularization, and the use of decision theory to make optimal predictions. In the final session, students are introduced to representation learning, transfer learning, and various error functions relevant to modern model development.
Through a combination of mathematical and probabilistic theory and practical implementation, this course provides comprehensive skills for understanding, designing, and evaluating artificial neural network architectures. By the end of the course, participants are expected to be able to explain the basic principles of deep learning, implement regression and classification models, and understand the benefits of networks in representation learning and knowledge transfer.