
Trace the evolution from variational autoencoders and generative adversarial networks to diffusion models, detailing generator and discriminator dynamics, training instability, and practical implementations.
Design and train deep learning image generators by building TensorFlow models, exploring variational autoencoders, GANs, Wasserstein GANs, CycleGAN, and diffusion models, and apply them to image super-resolution and face mask removal.
Build and train a variational autoencoder on MNIST, with an encoder outputting mu and log var and a decoder reconstructing 28x28 digits, then sample z to generate new digits.
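As a rough illustration of the sampling step described above, here is a minimal sketch in plain Python (no TensorFlow) of the reparameterization trick and the KL term commonly used in a VAE loss. The function names and the two-dimensional latent are assumptions made for this example, not the course's actual code.

```python
import math
import random

def sample_z(mu, log_var, rng=random):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1).

    sigma is recovered from the log variance as exp(0.5 * log_var).
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, 1)), summed over the latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))

# Hypothetical encoder outputs for a single image (2-D latent space).
mu, log_var = [0.5, -1.0], [0.0, -2.0]
z = sample_z(mu, log_var)            # latent vector that would go to the decoder
kl = kl_to_standard_normal(mu, log_var)
```

To generate a new digit after training, z would instead be drawn directly from N(0, 1) and passed to the decoder.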
Master the GAN loss and training setup by examining the discriminator and generator, real versus fake data, and the minimax game driving convergence in deep convolutional GANs (DCGANs).
This lecture demonstrates implementing ProGAN practices in TensorFlow, including progressive growth from 4x4 to 128x128, pixel normalization, equalized learning rate, minibatch standard deviation, and fade-in to stabilize the discriminator and generator.
Use the diffusers library to run RunwayML's Stable Diffusion 1.5 in an image-to-image pipeline, adjusting strength, guidance scale, and steps to balance prompt fidelity against preservation of the initial image.
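To make the real-versus-fake setup concrete, here is a minimal sketch of the two losses in plain Python. It assumes the discriminator outputs a probability that its input is real, and it uses the common non-saturating generator loss rather than the literal minimax form; both choices, and all function names, are assumptions for this example.

```python
import math

def bce(p, label, eps=1e-7):
    """Binary cross-entropy for one prediction p in (0, 1)."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))

def discriminator_loss(d_real, d_fake):
    """The discriminator wants D(real) -> 1 and D(fake) -> 0."""
    real_term = sum(bce(p, 1) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """Non-saturating form: the generator wants D(fake) -> 1."""
    return sum(bce(p, 1) for p in d_fake) / len(d_fake)

# Hypothetical discriminator outputs on a batch of two real and two fake images.
d_real, d_fake = [0.9, 0.8], [0.1, 0.2]
d_loss = discriminator_loss(d_real, d_fake)  # low: the discriminator is winning
g_loss = generator_loss(d_fake)              # high: the generator is being caught
```

The two losses pull in opposite directions on the same D(fake) values, which is the adversarial game in miniature.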
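Three of the ProGAN practices listed above reduce to short formulas. The sketch below shows them on plain Python lists instead of tensors; the function names are hypothetical, and a real TensorFlow implementation would apply the same arithmetic per pixel and per layer.

```python
import math

def pixel_norm(features, eps=1e-8):
    """Scale one pixel's channel vector to (approximately) unit RMS."""
    mean_sq = sum(f * f for f in features) / len(features)
    return [f / math.sqrt(mean_sq + eps) for f in features]

def fade_in(upsampled_old, new, alpha):
    """Blend the upsampled lower-resolution output with the new layer's output.

    alpha ramps from 0 (old path only) to 1 (new path only) during training.
    """
    return [(1.0 - alpha) * o + alpha * n for o, n in zip(upsampled_old, new)]

def equalized_lr_scale(fan_in, gain=math.sqrt(2.0)):
    """Runtime weight scale (He constant) applied to weights drawn from N(0, 1)."""
    return gain / math.sqrt(fan_in)

normed = pixel_norm([3.0, 4.0])                    # one pixel with two channels
blended = fade_in([0.0, 0.0], [1.0, 1.0], alpha=0.25)
scale = equalized_lr_scale(fan_in=3 * 3 * 512)     # 3x3 conv with 512 input channels
```

Fade-in matters because a newly added, randomly initialized resolution block would otherwise disrupt the already trained lower-resolution layers.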
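The effect of these three settings can be sketched without loading any model. The snippet below is an illustration in plain Python, under the assumption that the pipeline skips the first part of the noise schedule in proportion to strength and combines the prompt-conditioned and unconditioned noise predictions with classifier-free guidance. The function names are hypothetical and are not part of the diffusers API.

```python
def steps_actually_run(num_inference_steps, strength):
    """Number of denoising steps applied to the (partially noised) initial image.

    strength = 0 keeps the initial image; strength = 1 runs the full schedule,
    so little of the initial image survives.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    return min(int(num_inference_steps * strength), num_inference_steps)

def guided_noise(uncond, cond, guidance_scale):
    """Classifier-free guidance: push the prediction toward the prompt."""
    return [u + guidance_scale * (c - u) for u, c in zip(uncond, cond)]

steps = steps_actually_run(num_inference_steps=50, strength=0.75)
noise = guided_noise(uncond=[0.0, 1.0], cond=[1.0, 1.0], guidance_scale=7.5)
```

Lower strength preserves more of the initial image, while a higher guidance scale follows the prompt more closely at the cost of diversity.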
Image generation has come a long way. Back in the early 2010s, generating random 64x64 images was still very new. Today we are able to generate high-quality 1024x1024 images not only at random, but also by inputting text to describe the kind of image we wish to obtain.
In this course, we shall take you through an amazing journey in which you'll master different concepts with a step-by-step approach. We shall code together a wide range of Generative Adversarial Networks and even the Diffusion Model using TensorFlow 2, while observing best practices.
You shall work on several projects, including:
Digit generation with the Variational Autoencoder (VAE)
Face generation with DCGANs
Improving training stability with WGANs
Generating higher-quality images with the ProGAN and the Diffusion Model
Upscaling images with the SRGAN
Final Project: AI Interior Designer
You will build an application that can take any photo of an empty room and breathe life into it. We will architect a pipeline that truly understands the space.
Step 1: Scene Understanding. First, we’ll use the Depth Anything model to generate a precise depth map, giving our AI an understanding of the room's 3D geometry.
Step 2: Intelligent Masking. Next, we'll use a powerful combination of Grounding DINO and Segment Anything (SAM) to automatically detect and create masks for key areas like doors and windows.
Step 3: Controlled Generation. Finally, we will feed the original image, the depth map, and the segmentation masks into ControlNet with a Stable Diffusion Inpainting model. This allows us to tell the AI, "Generate a modern sofa here on the floor, respecting the room's depth and leaving the windows untouched." The result is a stunning, realistic, and context-aware interior design.
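One small piece of this pipeline is easy to sketch in isolation: turning the door and window masks from Step 2 into the mask an inpainting model edits. The snippet below assumes binary masks stored as nested lists, with 1 marking a detected region, and assumes the inpainting model regenerates pixels where its mask is 1. The helper names and the tiny 3x4 masks are hypothetical.

```python
def union(mask_a, mask_b):
    """Pixel-wise OR of two binary masks of the same shape."""
    return [[a | b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(mask_a, mask_b)]

def invert(mask):
    """Swap 0 and 1 so that protected pixels become non-editable."""
    return [[1 - v for v in row] for row in mask]

# Hypothetical 3x4 masks, as a segmentation model might return them.
window = [[0, 1, 1, 0],
          [0, 0, 0, 0],
          [0, 0, 0, 0]]
door = [[0, 0, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 0, 1]]

protected = union(window, door)   # regions to leave untouched
inpaint_mask = invert(protected)  # 1 = the model may generate here
```

In the full pipeline, a mask like this would be passed to the inpainting model alongside the original image and the depth map.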
If you are ready to move a step further in your career, this course is for you, and we are super excited to help you achieve your goals!
This course is offered to you by Neuralearn. And just like every other course by Neuralearn, we lay much emphasis on feedback. Your reviews and questions in the forum will help us improve this course. Feel free to ask as many questions as possible on the forum. We do our very best to reply in the shortest possible time.
Enjoy!!!