
Welcome to the Neural Networks Fundamentals online course. In this lecture, we will focus on the background of neural networks.
Build a single-layer perceptron from scratch to solve logic gate problems, training with input features, weights, a threshold, and iterative weight updates driven by the learning rate and delta.
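The perceptron described above can be sketched in a few lines. This is a minimal illustration, not the course's own code: the function names, the AND gate as the example problem, the fixed threshold of 1.0, and the learning rate of 0.5 are all assumptions.

```python
# Minimal single-layer perceptron sketch (all names and values are illustrative).

def predict(inputs, weights, threshold):
    """Fire (return 1) when the weighted sum of the inputs exceeds the threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total > threshold else 0


def train(samples, learning_rate=0.5, threshold=1.0, max_epochs=100):
    """Repeat weight updates until every sample is classified correctly."""
    weights = [0.0, 0.0]
    for _ in range(max_epochs):
        errors = 0
        for inputs, target in samples:
            # delta is the difference between the expected and the actual output
            delta = target - predict(inputs, weights, threshold)
            if delta != 0:
                errors += 1
                weights = [w + learning_rate * delta * x
                           for w, x in zip(weights, inputs)]
        if errors == 0:
            break
    return weights


# Truth table of the AND gate: (inputs, expected output)
AND_GATE = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights = train(AND_GATE)
predictions = [predict(inputs, weights, 1.0) for inputs, _ in AND_GATE]
```

A single-layer perceptron like this one can only learn linearly separable gates such as AND and OR.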
Neural networks consist of two important units: nodes and weights. In this lecture, we will construct the node structure of a neural network model dynamically.
In this lecture, we continue to construct the neural network model. In the previous lecture, we constructed the node structure. Today, we'll create the connections between nodes. These connection units are also known as weights.
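One possible way to represent the nodes and weights described above is sketched here. The `Node` and `Weight` classes, the helper names, the random initialization range, and the omission of bias nodes are assumptions made for illustration; the course may structure its model differently.

```python
import random
from dataclasses import dataclass


@dataclass
class Node:
    layer: int          # index of the layer the node belongs to
    index: int          # position of the node inside its layer
    value: float = 0.0  # activation, filled in during a forward pass


@dataclass
class Weight:
    source: Node  # node the connection starts from
    target: Node  # node the connection ends at
    value: float  # strength of the connection


def build_nodes(layer_sizes):
    """Create one list of nodes per layer, e.g. [2, 3, 1] for a 2-3-1 network."""
    return [[Node(layer, i) for i in range(size)]
            for layer, size in enumerate(layer_sizes)]


def build_weights(layers, seed=0):
    """Connect every node to every node in the following layer."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    weights = []
    for current, following in zip(layers, layers[1:]):
        for source in current:
            for target in following:
                weights.append(Weight(source, target, rng.uniform(-1.0, 1.0)))
    return weights


layers = build_nodes([2, 3, 1])
weights = build_weights(layers)
```

Because the layer sizes are passed in as a list, the same two functions build a network of any depth.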
So far, we have constructed the neural network model, which basically consists of nodes and weights. In this lecture, we will apply feed-forward propagation through the nodes and weights to make predictions.
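A feed-forward pass can be sketched as follows. The matrix-style weight layout, the sigmoid activation, the bias handling, and the example weight values are assumptions chosen to keep the sketch short; they are not taken from the course.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def feed_forward(inputs, layers):
    """Propagate the inputs through every layer and return the output activations.

    `layers` is a list of weight matrices. Each row holds the weights feeding
    one node; the last entry of a row is treated as that node's bias weight
    (an assumption of this sketch).
    """
    activations = list(inputs)
    for weight_matrix in layers:
        with_bias = activations + [1.0]  # constant input for the bias weight
        activations = [
            sigmoid(sum(w * a for w, a in zip(row, with_bias)))
            for row in weight_matrix
        ]
    return activations


# Illustrative weights for a 2-2-1 network (values chosen arbitrarily).
hidden_weights = [[0.5, -0.4, 0.1], [0.3, 0.8, -0.2]]
output_weights = [[1.2, -0.7, 0.05]]

prediction = feed_forward([1.0, 0.0], [hidden_weights, output_weights])
```

With sigmoid outputs, every prediction falls strictly between 0 and 1.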
In the previous lecture, we applied feed-forward propagation and made predictions. The backpropagation algorithm updates the weights so that the network makes more accurate predictions. In this lecture, we will focus on its theory.
In this lecture, we will apply the backpropagation algorithm. Backpropagation calculates the derivative of the error with respect to each weight. Adjusting each weight by its derivative then optimizes the system and produces more accurate predictions. We will implement the algorithm in Python code.
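The idea can be illustrated on the smallest possible network: one input, one hidden node, and one output node. This sketch assumes sigmoid activations, a squared-error loss of 0.5 * (target - output)^2, and an update that subtracts the derivative scaled by a learning rate; none of these choices is confirmed by the course description.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w1, w2):
    """Return the hidden and output activations of a 1-1-1 network."""
    hidden = sigmoid(w1 * x)
    output = sigmoid(w2 * hidden)
    return hidden, output


def loss(x, target, w1, w2):
    _, output = forward(x, w1, w2)
    return 0.5 * (target - output) ** 2


def gradients(x, target, w1, w2):
    """Derivative of the loss with respect to each weight, via the chain rule."""
    hidden, output = forward(x, w1, w2)
    # error signal at the output node
    delta_output = -(target - output) * output * (1.0 - output)
    grad_w2 = delta_output * hidden
    # error signal propagated back to the hidden node
    delta_hidden = delta_output * w2 * hidden * (1.0 - hidden)
    grad_w1 = delta_hidden * x
    return grad_w1, grad_w2


def train(x, target, w1, w2, learning_rate=0.5, epochs=200):
    for _ in range(epochs):
        grad_w1, grad_w2 = gradients(x, target, w1, w2)
        w1 -= learning_rate * grad_w1  # step against the gradient
        w2 -= learning_rate * grad_w2
    return w1, w2
```

Comparing `gradients` against a finite-difference estimate of `loss` is a simple way to check that the derivatives were derived correctly.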
The backpropagation algorithm is complex to implement. Monitoring helps ensure that the implemented system works correctly. Today, we'll look at how to monitor the loss over backpropagation iterations.
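A minimal sketch of loss monitoring: record the loss at every epoch and check that it keeps falling. To stay short, the example trains a single linear weight on three made-up samples instead of a full network; the function names and the data are assumptions.

```python
def train_with_history(samples, learning_rate=0.05, epochs=20):
    """Fit y = weight * x by gradient descent and record the loss per epoch."""
    weight = 0.0
    history = []
    for _ in range(epochs):
        # mean squared error at the current weight
        epoch_loss = sum((target - weight * x) ** 2
                         for x, target in samples) / len(samples)
        history.append(epoch_loss)
        gradient = sum(-2.0 * x * (target - weight * x)
                       for x, target in samples) / len(samples)
        weight -= learning_rate * gradient
    return weight, history


def is_decreasing(history):
    """True when the loss never rises from one epoch to the next."""
    return all(later <= earlier
               for earlier, later in zip(history, history[1:]))


SAMPLES = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # made-up data following y = 2x

weight, history = train_with_history(SAMPLES)
_, diverging_history = train_with_history(SAMPLES, learning_rate=0.5)
```

A steadily falling history suggests the update rule is working, while a rising one (as with the deliberately oversized learning rate in the last line) signals a bug or a poor configuration.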
Even though the sigmoid function is one of the most common activation functions in neural networks, it is not without rivals. Some alternative activation functions may increase system accuracy. In this lecture, we will cover several activation functions and their effects on system performance.
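For reference, here is a sketch of the sigmoid next to two widely used alternatives, each paired with the derivative that backpropagation needs. The choice of tanh and ReLU is an assumption; the lecture may cover a different set of functions.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative(x):
    s = sigmoid(x)
    return s * (1.0 - s)


def tanh(x):
    return math.tanh(x)


def tanh_derivative(x):
    return 1.0 - math.tanh(x) ** 2


def relu(x):
    return max(0.0, x)


def relu_derivative(x):
    # the derivative at exactly 0 is undefined; 0.0 is used here by convention
    return 1.0 if x > 0 else 0.0


# Lookup table so that a network can switch activation by name.
ACTIVATIONS = {
    "sigmoid": (sigmoid, sigmoid_derivative),
    "tanh": (tanh, tanh_derivative),
    "relu": (relu, relu_derivative),
}
```

Keeping each function together with its derivative means the rest of the training code does not change when the activation is swapped.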
Explore softplus activation, defined as ln(1+e^x), its unbounded output, and derive that its derivative equals the sigmoid function, highlighting backpropagation relevance.
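The derivative claim follows from the chain rule: d/dx ln(1+e^x) = e^x / (1+e^x) = 1 / (1+e^-x), which is the sigmoid. The sketch below checks this numerically with a central difference; the helper name `numerical_derivative` and the step size are illustrative choices.

```python
import math


def softplus(x):
    return math.log(1.0 + math.exp(x))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def numerical_derivative(f, x, eps=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


# Largest gap between the estimated softplus derivative and the sigmoid
# over a few sample points.
max_gap = max(
    abs(numerical_derivative(softplus, x) - sigmoid(x))
    for x in (-3.0, -1.0, 0.0, 1.0, 3.0)
)
```

In backpropagation this means a softplus layer can reuse the sigmoid as its derivative.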
A large learning rate can overshoot and move away from the local minimum, whereas a small learning rate approaches the local minimum in baby steps. An adaptive learning rate helps reach the local minimum in fewer steps. In this lecture, we will apply this concept.
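One simple adaptive scheme is sketched below, as an assumption rather than the course's method: grow the learning rate slightly after every step that lowers the loss, and halve it (discarding the step) whenever the loss would rise. The quadratic loss and the constants are chosen only for illustration.

```python
def loss(w):
    return (w - 3.0) ** 2  # toy loss with its minimum at w = 3


def gradient(w):
    return 2.0 * (w - 3.0)


def adaptive_descent(w=0.0, learning_rate=1.5, steps=50, grow=1.05, shrink=0.5):
    """Gradient descent whose learning rate reacts to the loss."""
    current = loss(w)
    for _ in range(steps):
        candidate = w - learning_rate * gradient(w)
        candidate_loss = loss(candidate)
        if candidate_loss < current:
            # the step helped: keep it and be slightly bolder next time
            w, current = candidate, candidate_loss
            learning_rate *= grow
        else:
            # the step overshot: discard it and be more careful
            learning_rate *= shrink
    return w, learning_rate


final_w, final_rate = adaptive_descent()
```

The starting rate of 1.5 is deliberately too large for this loss, so the first step is rejected and the rate is halved before any progress is made.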
Gradient descent with a standard configuration can only be expected to reach a local minimum. There may be other minima, and the global minimum may differ from the point you have reached. The global minimum is the optimal set of weights: it produces the lowest loss and the most successful predictions. Momentum can help you escape a local minimum in which you are stuck, so that you might reach the global minimum. In this lecture, we will add momentum to our program.
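The momentum update itself is short. In this sketch, a velocity term accumulates a fraction of the previous step and adds it to the current gradient step. The toy quadratic loss, the coefficient of 0.9, and the function names are assumptions; the quadratic has a single minimum, so the example shows only the update rule, not an escape from a local minimum.

```python
def gradient(w):
    return 2.0 * (w - 3.0)  # gradient of the toy loss (w - 3)^2


def momentum_descent(w=0.0, learning_rate=0.1, momentum=0.9, steps=300):
    """Gradient descent with a velocity term that carries over between steps."""
    velocity = 0.0
    for _ in range(steps):
        # keep part of the previous step, then add the new gradient step
        velocity = momentum * velocity - learning_rate * gradient(w)
        w += velocity
    return w


with_momentum = momentum_descent()
without_momentum = momentum_descent(momentum=0.0)  # plain gradient descent
```

Because the velocity keeps the weight moving after the gradient shrinks, the same mechanism can carry it across a shallow dip on a loss surface with several minima.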
In this video, we'll explain why we need to normalize input features and outputs, and how to do it.
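Min-max scaling is one common way to normalize, sketched here as an assumption since the description does not say which method the lecture uses. Inputs are mapped to the range [0, 1], which suits activations such as the sigmoid, and predictions are mapped back to the original scale afterwards.

```python
def normalize(values):
    """Scale values to [0, 1]; also return the bounds needed to undo the scaling."""
    low, high = min(values), max(values)
    span = high - low
    if span == 0:
        # all values are identical, so there is nothing to scale
        return [0.0 for _ in values], low, high
    return [(v - low) / span for v in values], low, high


def denormalize(scaled, low, high):
    """Map values from [0, 1] back to the original range."""
    return [s * (high - low) + low for s in scaled]


scaled, low, high = normalize([10, 20, 30])
restored = denormalize(scaled, low, high)
```

The bounds must be computed on the training data and reused for new inputs, otherwise the network sees values on a different scale than it was trained on.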
We've covered backpropagation theory but skipped the proof. This is an optional lecture. Watch it if you want to know how backpropagation really works.
Deep learning will be part of every developer's toolbox in the near future. It will not be a tool just for experts.
In this course, we will develop our own deep learning framework in Python from zero to one, while the mathematical background of neural networks and deep learning is explained concretely. A hands-on programming approach makes the concepts easier to understand, so you will no longer have to rely on a high-level deep learning framework. Even though Python is used in the course, you can easily adapt the theory to any other programming language.