Computer Vision - Object Detection on Videos - Deep Learning

Rating: 4.2 (87 reviews)

Quick Starter on Object Detection and Image Classification on Videos using Deep Learning, OpenCV, YOLO and CNN Models

Created by Vineeta Vashistha

Last updated 1/2025

English

What you'll learn

Learn how to implement Video Analytics using Deep Learning concepts
Understand how to implement Object Detection Models on Videos using Python
Build your own Deep Learning model using Transfer Learning for Image Classification
Executable Code of Faster RCNN, YOLO, HOG and Haar Cascade for Object Detection
Build a technical solution containing both Object Detection and Image Classification
Develop Image Classification Model using InceptionV3 model architecture
Learn to implement SORT Framework for Object Tracking
Executable Code of SORT for People Footfall Tracking and Automatic Parking Management

Course content

13 sections • 94 lectures • 3h 25m total length

Learning Path • 1:38
Kick off with Python and OpenCV to build a base for image processing and OCR, then master advanced Python concepts, NumPy, pandas, and OpenCV techniques like thresholding, dilation, and erosion.
Course Starter • 6:18
Enable English captions to follow subtitles and adjust video speed for easier learning. Download resources via the folder icon, and review tools setup and download code in the course Q&A.
Udemy Review • 1:51
Discover how the Udemy review system shapes your computer vision course learning journey by encouraging you to assess the complete course content, sections, projects, and downloadable resources before rating.

Objectives • 0:47
Explore the objectives of video analytics, its architecture, and technical capabilities through real-world use cases in security, retail, autonomous cars, smart homes, and smart cities.
Video Analytics Overview • 2:15
Video analytics leverages deep learning to monitor real-time video streams, detect objects and movement patterns, and extract insights that enable smart decisions and security for homes, cities, and transportation.
Video Analytics Architecture • 3:17
Compare server-based and edge-based video analytics architectures and outline the frame-by-frame workflow from streaming to frame extraction, OpenCV or deep learning processing, and output video creation.
Video Analytics Use Cases Part - 1 • 3:50
Video Analytics Use Cases Part - 2 • 2:03
Explore real-time video analytics use cases, from autonomous cars with collision, traffic, and road sign detection to smart city crowd management, real-time parking control, and media video categorization and subtitling.

Objectives • 0:53
Discover the three-step video analytics workflow (capture, process, and save video), beginning with capture, then processing methods like get/set, read, waitKey, and grab/retrieve, and finally video codecs, FourCC, and saving video.
Capture Video • 1:20
Create a VideoCapture object to read from cameras, image sequences, or video files, using a device index or file path; verify it with cap.isOpened() and print 'Can't open Camera' on failure.
Processing Video - get/set method • 1:51
Use the get and set methods to query and update video capture properties, such as frame width and height, with examples like 640 by 480 and 320 by 240.
Processing Video - read method • 0:34
The cap.read() method reads each video frame, returning a success flag and the frame image. If the read is successful, proceed with operations; otherwise print an error and break.
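
The capture-and-read loop described above can be sketched roughly as follows. This is a minimal illustration, not the course's code: `FakeCapture` is a hypothetical stand-in that only mimics the `isOpened()`, `read()`, and `release()` methods of `cv2.VideoCapture`, so the loop can run without a camera or OpenCV installed, and `process_stream` is an illustrative helper name.

```python
class FakeCapture:
    """Hypothetical stand-in for cv2.VideoCapture (no camera needed)."""

    def __init__(self, frames):
        self._frames = list(frames)
        self._opened = True

    def isOpened(self):
        return self._opened

    def read(self):
        # Mirrors the (success_flag, frame) pair that cap.read() returns.
        if self._frames:
            return True, self._frames.pop(0)
        return False, None

    def release(self):
        self._opened = False


def process_stream(cap, handle_frame):
    """Read frames until a read fails; return how many were handled."""
    if not cap.isOpened():
        print("Can't open Camera")
        return 0
    handled = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            # End of stream or read error: report it and leave the loop.
            print("Can't read frame; stopping")
            break
        handle_frame(frame)
        handled += 1
    cap.release()
    return handled


seen = []
frame_count = process_stream(FakeCapture(["frame0", "frame1", "frame2"]), seen.append)
print(frame_count)
```

With OpenCV installed, the same function should accept a real capture object, for example `process_stream(cv2.VideoCapture(0), handler)`, since it relies only on those three methods.
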
Processing Video - waitKey method • 1:17
Explore how waitKey() controls key events during video processing, showing infinite wait with waitKey(0) and a 1-millisecond pause with waitKey(1) while cap.read() refreshes frames.
Processing Video - grab/retrieve method • 1:11
Explore the grab and retrieve methods for multi-camera environments without hardware synchronization. Call grab on each camera first, then call retrieve to decode the frames, so that the slower decoding step (such as Motion JPEG decompression) does not skew the capture timing between cameras.
Video Codec • 3:08
Define a video codec, explain encoding and decoding, and show how containers like MP4 hold H.264 video and AAC audio, with steps to check and install codecs.
Video Codec Timeline • 3:39
Trace the timeline of popular video codecs from MJPEG to H.265. Explore how each codec enabled DVD, Blu-ray, and UHD streaming with improved compression.
FourCC • 0:40
FourCC stands for four-character code and identifies the video codec using four bytes of alphanumeric characters; it can be given as individual characters or unpacked from a string with a star (for example, *'MJPG').
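
To make the four-byte idea concrete, here is a small pure-Python sketch of how four characters can be packed into one 32-bit integer, least significant byte first. The helper names `fourcc` and `fourcc_to_str` are hypothetical; the packing is meant to mirror what `cv2.VideoWriter_fourcc` computes, but treat that equivalence as an assumption to verify against your OpenCV build.

```python
def fourcc(c1, c2, c3, c4):
    """Pack four ASCII characters into one 32-bit code, lowest byte first."""
    return (
        (ord(c1) & 255)
        | ((ord(c2) & 255) << 8)
        | ((ord(c3) & 255) << 16)
        | ((ord(c4) & 255) << 24)
    )


def fourcc_to_str(code):
    """Recover the four characters from a packed code."""
    return "".join(chr((code >> (8 * i)) & 255) for i in range(4))


# The star unpacks the string into four separate character arguments.
mjpg = fourcc(*"MJPG")
print(mjpg, fourcc_to_str(mjpg))
```

The star form is ordinary Python argument unpacking, which is why `fourcc(*"MJPG")` and `fourcc("M", "J", "P", "G")` give the same code.
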
Save Video • 2:06
Initialize VideoWriter with a filename, FourCC (for example MJPG), fps, and a 640 by 360 frame size to save processed frames; read the frame size with cap.get(), cast images to 8-bit, and set isColor accordingly.
Video Analytics Quiz

Objectives • 0:26
Provide an overview of object detection models, with a focus on human detection, and cover Euclidean distance concepts needed to understand the machine learning solution we will build.
Object Detection Models • 2:28
Human Detection • 0:49
Identify and locate humans in images and videos by extracting features that quantify the human body and bounding boxes, then apply trained machine learning models to detect and track them.
Euclidean Distance • 0:54
Compute the Euclidean distance as the shortest distance between two points in an n-dimensional space using the Pythagorean theorem, where d connects (x1,y1) and (x2,y2) in an image.
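
As a quick illustration of the formula, the sketch below computes the distance between two points of any matching dimension. The function name `euclidean_distance` is chosen here for illustration and is not taken from the course code.

```python
import math


def euclidean_distance(p, q):
    """Straight-line distance between two points with the same number of coordinates."""
    if len(p) != len(q):
        raise ValueError("points must have the same dimension")
    # Square root of the summed squared coordinate differences (Pythagorean theorem).
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


d = euclidean_distance((0, 0), (3, 4))
print(d)  # 5.0, the classic 3-4-5 right triangle
```
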

Objectives • 0:50
Compare object detection models such as Haar cascade, HOG, R-CNN, SSD, and YOLO, including YOLOv3 and Tiny, and discuss accuracy, speed, and challenges.
Haar Cascade Classifier • 1:21
Learn how the Haar cascade classifier, a trained object detection algorithm, uses positive and negative images to train a cascade object detector that identifies full-body images within a video.
HOG Model • 2:00
Explore the histogram of oriented gradients (HOG) model for human detection in videos, using gradient-based global features, a sliding detection window, 8x8 pixel blocks, and an SVM classifier.
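
To see how the sliding-window geometry turns into a feature vector, here is a worked calculation of the HOG descriptor length. It assumes the common pedestrian-detection layout (64x128 window, 8x8 pixel cells grouped into 16x16 blocks, 8-pixel block stride, 9 orientation bins); these values are assumptions for illustration and may differ from the settings shown in the lecture.

```python
def hog_descriptor_length(window=(64, 128), block=(16, 16), stride=(8, 8),
                          cell=(8, 8), bins=9):
    """Number of values in a HOG feature vector for one detection window."""
    # How many block positions fit across and down the window.
    blocks_x = (window[0] - block[0]) // stride[0] + 1
    blocks_y = (window[1] - block[1]) // stride[1] + 1
    # Each block holds a histogram of `bins` values per cell.
    cells_per_block = (block[0] // cell[0]) * (block[1] // cell[1])
    return blocks_x * blocks_y * cells_per_block * bins


length = hog_descriptor_length()
print(length)  # 7 * 15 * 4 * 9 = 3780
```
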
RCNN and Fast RCNN Model • 3:42
Explain how R-CNN and Fast R-CNN balance accuracy and speed using mAP and FPS, detailing region proposals, CNN feature maps, and bounding-box regression on PASCAL VOC and COCO datasets.
Faster RCNN and R-FCN Model • 2:02
Study Faster R-CNN and R-FCN for video object detection, comparing region proposal networks with region-of-interest scoring, and review the mAP each achieves on the PASCAL VOC dataset.
SSD and YOLO Model • 2:38
Compare SSD's end-to-end CNN with YOLO's real-time single-shot grid detection. YOLO runs at 155 fps with 63.7% mAP, while SSD reaches 83.2% mAP but runs slower.
YOLOv3 and YOLOv3 Tiny Model • 2:05
Delve into YOLOv3 for video object detection with three scales and more bounding boxes, and compare it to YOLOv3 Tiny, noting logistic regression confidence, mAP, and speed.

Objectives • 1:36
Define a social distancing solution using object detection models and provide a detailed code walkthrough, tool setup, and downloadable resources for Haar cascade, HOG, YOLOv3 Tiny, and Faster R-CNN.
Introduction • 2:06
Leverage video analytics to monitor social distancing of 6 feet in real time, using CCTV camera feeds to highlight violations and enable automated social monitoring.
Tools Setup - Ubuntu • 1:22
Open the Ubuntu terminal to download Python 3.6 using the on-screen command, then install OpenCV with pip3 and Jupyter Notebook using the exact on-screen syntax.
Tools Setup - Windows • 1:17
Move to the Windows environment by installing Python 3.6 in the C drive, then install OpenCV with pip and the OpenCV contrib package, and finally install Jupyter Notebook.
Code Walkthrough for Haar Cascade Model - 1 • 0:46
Import math, os, and cv2, then load the Haar cascade classifier for video processing, and set up cv2.VideoWriter with the MJPG codec to write an .avi file that matches the input size.
Code Walkthrough for Haar Cascade Model - 2 • 3:30
Walk through the Haar cascade model: read and resize frames, convert BGR to grayscale, detect persons with the person cascade's detectMultiScale method, and draw red rectangles when the Euclidean distance between box centers is under 50.
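
The distance check at the heart of this step can be sketched without OpenCV. The example below assumes each detection is an `(x, y, w, h)` box, the layout `detectMultiScale` returns, and flags every person whose box center lies within 50 pixels of another. The helper names and the sample boxes are hypothetical; the course code may organize this differently.

```python
import math
from itertools import combinations


def box_center(box):
    """Center point of an (x, y, w, h) bounding box."""
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)


def find_violations(boxes, min_distance=50):
    """Return indices of boxes whose centers are closer than min_distance pixels."""
    centers = [box_center(b) for b in boxes]
    violators = set()
    for i, j in combinations(range(len(boxes)), 2):
        dx = centers[i][0] - centers[j][0]
        dy = centers[i][1] - centers[j][1]
        if math.hypot(dx, dy) < min_distance:
            violators.update((i, j))
    return violators


# Hypothetical detections: the first two people stand 30 pixels apart.
people = [(0, 0, 20, 40), (30, 0, 20, 40), (300, 0, 20, 40)]
too_close = find_violations(people)
print(sorted(too_close))
```

In a full pipeline, boxes whose index is in the returned set would be drawn in red and the rest left unmarked or drawn in another color.
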
Code Walkthrough for Haar Cascade Model - 3 • 0:35
Complete the code walkthrough by writing frames to the output file with the out object. Press q to quit, release the input and output objects, then close the cv2 windows using destroyAllWindows.
Demo Video • 0:42
Watch a demo video of the deep learning object detection solution showing social distancing violations. Red rectangular boundaries mark violators, such as P1 and P2, under the six-foot rule.
Download Code For Haar Cascade • 0:39
Download the Haar cascade XML file and social distancing Python code from the resource section, and run a lightweight person detection model on the input video with minimal hardware.
Code Changes For Hog Model • 0:50
Implement the HOG model for human detection by initializing the HOGDescriptor with its default people detector (getDefaultPeopleDetector), then apply the HOG detectMultiScale function to derive bounding boxes for social distancing.
Download Code For HOG Model • 0:16
Download the HOG model code for social distancing in Python, added in the resources section of this module. Download the input video file from the indicated reference path.
Code Walkthrough for YOLOv3 Tiny - Part 1 • 0:47
Walk through the social distancing solution with YOLOv3 Tiny, importing cv2 and numpy, calibrating distances, checking adjacency with isclose, and loading class labels, network configurations, and weights.
Code Walkthrough for YOLOv3 Tiny - Part 2 • 1:31
Load the YOLO network from disk with cv2.dnn.readNetFromDarknet, identify layer names and unconnected out layers, preprocess frames to a blob with blobFromImage, and obtain class probabilities through a pre-trained network.
Code Walkthrough for YOLOv3 Tiny - Part 3 • 1:16
Walk through looping over layer outputs and detections, extracting the class ID and confidence, thresholding at 0.5, scaling YOLO boxes to frame coordinates, and applying non-maximum suppression to prepare distance computations.
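
The box scaling and non-maximum suppression steps can be illustrated in plain Python. This sketch assumes YOLO-style detections given as normalized `(center_x, center_y, width, height)` values and uses a 0.3 overlap threshold chosen for illustration. OpenCV provides `cv2.dnn.NMSBoxes` for the suppression step; whether the course relies on it is an assumption, and the helper names below are hypothetical.

```python
def yolo_to_pixel_box(detection, frame_w, frame_h):
    """Convert normalized (cx, cy, w, h) to pixel (x, y, w, h) with a top-left origin."""
    cx, cy, w, h = detection
    box_w = int(w * frame_w)
    box_h = int(h * frame_h)
    x = int(cx * frame_w - box_w / 2)
    y = int(cy * frame_h - box_h / 2)
    return (x, y, box_w, box_h)


def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, score_threshold=0.5, iou_threshold=0.3):
    """Keep the highest-scoring boxes, dropping weak or heavily overlapping ones."""
    candidates = [i for i, s in enumerate(scores) if s >= score_threshold]
    candidates.sort(key=lambda i: scores[i], reverse=True)
    keep = []
    for i in candidates:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Hypothetical detections on a 640x480 frame.
raw = [(0.5, 0.5, 0.25, 0.5), (0.53125, 0.5, 0.25, 0.5),
       (0.125, 0.5, 0.125, 0.25), (0.875, 0.5, 0.125, 0.25)]
scores = [0.9, 0.8, 0.7, 0.3]
boxes = [yolo_to_pixel_box(d, 640, 480) for d in raw]
kept = non_max_suppression(boxes, scores)
print(boxes[0], kept)
```

Here the second box overlaps the first heavily and is suppressed, while the fourth falls below the 0.5 confidence threshold, so only indices 0 and 2 survive.
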
Code Walkthrough for YOLOv3 Tiny - Part 4 • 1:11
Walk through the YOLOv3 Tiny code for video object detection, using isclose to compare bounding box centers, drawing red or green boundaries and lines, and writing frames to output.
Download Code for YOLOv3 Tiny Solution • 0:27
Explore a YOLOv3 Tiny social distancing solution with Python code, using YOLOv3 Tiny weights, cfg, and COCO names from the resources, and view the output video.
Using PyCharm for Coding • 6:26
Launch PyCharm to set up projects, configure the Python interpreter and virtual environment, install packages or use requirements.txt, and run or debug code with breakpoints.
Code Walkthrough for Faster R-CNN • 10:37
Walk through implementing social distancing detection with Faster R-CNN, loading a pre-trained model, configuring PyCharm virtualenv, processing video frames, and visualizing detections with red or green bounding boxes.
Download Code For Faster R-CNN Solution • 1:30
Download and run the Faster R-CNN social distancing project: unzip faster-rcnn.zip, execute faster_rcnn_Social_Distancing.py with frozen_inference_graph.pb, requirements.txt, and the input video, then compare the Haar Cascade, HOG, YOLOv3 Tiny, and Faster R-CNN results.
Object Detection Quiz

Objectives • 0:40
Explore image classification and review classifiers like VGGNet, ResNet, Inception, and Xception, then learn transfer learning with pretrained CNNs and train a model using Inception V3 on Google Colab.
Image Classification • 1:29
Explore image classification in computer vision, which assigns a label to an image based on visual content and trains models to recognize target classes like cats, dogs, or handwritten digits.
Deep Learning Image Classifier - Part 1 • 2:10
Explore convolutional neural networks as deep learning image classifiers, detailing input, hidden, and output layers, convolution and pooling layers, and comparing VGGNet and ResNet variants on ImageNet.
Deep Learning Image Classifier - Part 2 • 1:30
Explore popular image classification models beyond VGGNet and ResNet, including Inception variants (Inceptionv1–Inceptionv4) and Xception, focusing on reducing parameter space and computational complexity through depthwise separable convolution.
Transfer Learning with Pretrained CNN • 1:26
Apply transfer learning to reuse knowledge for a related task, speeding training and improving performance, using pre-trained CNN weights from models like ResNet or InceptionV3 via Google Colab.
Google CoLab Setup • 1:09
Set up Google Colab for image classification by downloading face_mask_detection.zip, extracting the ipynb file, logging in with your Gmail ID, uploading the notebook, and enabling GPU runtime before running.
Using Google Colab • 11:17
Code Walkthrough For Model Training on CoLab Part - 1 • 0:22
Click the black arrow on each block in Google Colab to run code, and start the walkthrough by importing TensorFlow and Keras for model creation.
Code Walkthrough For Model Training on CoLab Part - 2 • 0:47
Walk through downloading train.zip and validation.zip, uploading them to Google Colab, unzipping, and using the augmentation code to create a dataset of faces with and without masks.
Code Walkthrough For Model Training on CoLab Part - 3 • 0:31
Walk through setting up training and validation paths in Colab, defining class directories for with mask and without mask, and calculating image counts for dataset insight.
Code Walkthrough For Model Training on CoLab Part - 4 • 1:15
Set up variables for pre-processing and training with Keras ImageDataGenerator to rescale images and load and resize data via flow_from_directory for training and validation.
Code Walkthrough For Model Training on CoLab Part - 5 • 0:17
Code Walkthrough For Model Training on CoLab Part - 6 • 5:21
Freeze the convolution base, flatten multi-dimensional outputs, apply normalization and dropout, and build a dense network ending in a sigmoid-activated single unit for binary face mask classification.
Code Walkthrough For Model Training on CoLab Part - 7 • 2:08
Stack the feature extractor with a Keras Sequential model, then compile it with the RMSprop optimizer and a learning rate, using binary cross-entropy with from_logits=True for the two classes, tracking total and count metrics.
Code Walkthrough For Model Training on CoLab Part - 8 • 2:17
Explore the fit_generator training loop, detailing steps_per_epoch, validation_data, and validation_steps, and learn how to save the trained .h5 model after 100 epochs in CoLab.
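
A common way to choose steps_per_epoch and validation_steps is to cover each dataset once per epoch. The sketch below shows that arithmetic; the sample counts and batch size are made-up numbers, and the course may set these values differently.

```python
import math


def steps_for(num_samples, batch_size):
    """Batches needed to see every sample once, counting a final partial batch."""
    return math.ceil(num_samples / batch_size)


# Hypothetical dataset sizes, not the course's actual counts.
train_steps = steps_for(1000, 32)
val_steps = steps_for(200, 32)
print(train_steps, val_steps)  # 32 7
```
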
Code Walkthrough For Model Training on CoLab Part - 9 • 0:43
Upload a test image to Colab, load the trained model, and classify the image to yield a positive or negative value, where positive means mask and negative means no mask.
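
The sign rule can be related to a probability with a short sketch. It assumes the model emits a raw score (a logit) rather than a probability, in which case a positive score corresponds to a sigmoid value above 0.5. The function names are hypothetical, and the label strings are borrowed from the later face mask walkthrough.

```python
import math


def sigmoid(z):
    """Map a raw score (logit) to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def label_from_logit(logit):
    """Positive score means mask, otherwise no mask (the rule described above)."""
    return "Wearing Mask" if logit > 0 else "No Mask"


for score in (2.3, -1.7):
    print(score, round(sigmoid(score), 3), label_from_logit(score))
```
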
Download Code • 0:43
Learn how to download and run the face mask detection project by unzipping face_mask_detection.zip, Train.zip, and Validation.zip, then execute the Jupyter notebook.

Objectives • 0:45
Define the face mask detection problem, propose a machine learning solution, set up the environment, reuse the model from the last module, and download the code for the image classification solution with a detailed codebook.
Face Mask Detection • 0:57
Detect people not wearing masks in a community by using face detection with a Haar cascade classifier and a trained model to distinguish masked versus unmasked faces.
Tools Setup • 1:20
Open the Ubuntu terminal, install Python 3 and pip, install OpenCV, verify with Python 3, and install TensorFlow 2.2 or newer to set up tools for video object detection.
Code Walkthrough - Face Mask Detection Part 1 • 1:44
Walk through a face mask detection pipeline by loading a trained .h5 model, setting up video capture and output with cv2, numpy, and TensorFlow, and configuring video writers.
Code Walkthrough - Face Mask Detection Part 2 • 3:14
Read video frames with cap.read, resize outputs, convert to grayscale with cv2.cvtColor, and detect faces using haarcascade_frontalface_alt2.xml via detectMultiScale returning x, y, w, h.
Code Walkthrough - Face Mask Detection Part 3 • 1:01
Crop faces from frames, record crops with coordinates, resize and normalize grayscale pixels to 0–1, and expand dimensions before feeding frames to the pretrained image classification model.
Code Walkthrough - Face Mask Detection Part 4 • 0:31
Pass detected face images to a classification model for prediction, yielding positive values for masked faces and negative for unmasked, and store results with the input images and face coordinates.
Code Walkthrough - Face Mask Detection Part 5 • 0:25
Perform a validation check to ensure the detected face count does not exceed predicted faces, then label faces on the frame as 'Wearing Mask' or 'No Mask' based on classification.
Code Walkthrough - Face Mask Detection Part 6 • 0:37
Conclude the face mask detection code walkthrough by resizing frames to the detected object size, writing frames to the output file, and exiting with 'q' before releasing resources with destroyAllWindows.
Download Code • 0:44
Explore object detection and image classification on videos using Haar cascade, an Inception model, and the accompanying Python code and demo video in the Resources section.
Image Classification Quiz

Objectives • 1:05
Explore object tracking, its uses, and how it differs from object detection, then implement the SORT framework with tools and code walkthroughs for people footfall tracking and automatic parking management.
Object Tracking • 3:39
Track moving objects across video frames using deep learning and ID assignments to predict trajectories; cover single and multiple object tracking with SORT and Deep SORT.
SORT Framework • 3:03
Learn to implement the SORT framework for real-time multi-object tracking-by-detection, using a Kalman filter and the Hungarian algorithm, with unique IDs and occlusion handling for footfall and parking analyses.
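
The ID-assignment idea behind tracking-by-detection can be shown with a deliberately simplified sketch. It is not SORT itself: it matches detections to existing tracks greedily by IoU instead of using the Hungarian algorithm, it has no Kalman filter motion prediction, and it drops unmatched tracks immediately rather than keeping them through occlusions. The class name, the 0.3 threshold, and the sample boxes are all hypothetical.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


class GreedyIouTracker:
    """Simplified tracker: greedy IoU matching stands in for Kalman + Hungarian."""

    def __init__(self, iou_threshold=0.3):
        self.iou_threshold = iou_threshold
        self.tracks = {}   # track id -> last known box
        self.next_id = 1   # IDs are never reused

    def update(self, detections):
        # Score every (track, detection) pair and try the best overlaps first.
        pairs = sorted(
            ((iou(box, det), tid, d)
             for tid, box in self.tracks.items()
             for d, det in enumerate(detections)),
            reverse=True,
        )
        assigned, used_tracks = {}, set()
        for score, tid, d in pairs:
            if score < self.iou_threshold:
                break
            if tid not in used_tracks and d not in assigned:
                assigned[d] = tid
                used_tracks.add(tid)
        updated = {}
        for d, det in enumerate(detections):
            tid = assigned.get(d)
            if tid is None:          # unmatched detection starts a new track
                tid = self.next_id
                self.next_id += 1
            updated[tid] = det
        self.tracks = updated
        return updated


tracker = GreedyIouTracker()
frame1 = tracker.update([(0, 0, 10, 10), (50, 50, 60, 60)])
frame2 = tracker.update([(52, 51, 62, 61), (1, 0, 11, 10)])
frame3 = tracker.update([(1, 0, 11, 10), (200, 200, 210, 210)])
footfall = tracker.next_id - 1   # number of distinct IDs issued so far
print(frame2, footfall)
```

Counting the distinct IDs issued gives a rough footfall figure; a real SORT implementation would additionally predict each track's position before matching and tolerate a few missed frames.
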
Tools Setup • 1:41
Set up your development environment for object detection on videos by installing Python, pip, OpenCV, PyCharm, and project dependencies from requirements.txt in Ubuntu.
Code Walkthrough - People Footfall Tracking • 7:59
Walk through a PyCharm-based object tracking project that uses YOLO with COCO and the SORT tracker to count pedestrians and cars in video frames.
Download Code • 1:39
Download the object_tracking.zip from resources, extract it to access main.py and sort.py (SORT framework), with pedestrian.mp4 and parking.mp4 videos and requirements.txt, then download yolov3.weights to the yolo-obj folder to run.
Object Tracking Quiz

Objectives • 0:35
Project Overview • 1:25
Code Walkthrough • 10:46
Walk through setting up a YOLOv3 license plate detection project in PyCharm, including virtual environments and dependencies, to detect plates in videos or a live feed using OpenCV and pytesseract.
Code Download Instructions • 0:48
Download and unzip the license plate detection yolov3.zip from resources, open in PyCharm to access main.py and model.py, then run main.py with the test_dataset and YOLO_utils (config, weights, class names).

Requirements

Basic Programming skills in Python

Description

Master Real-Time Object Detection with Deep Learning

Dive into the world of computer vision and learn to build intelligent video analytics systems. This comprehensive course covers everything from foundational concepts to advanced techniques, including:

Video Analytics Basics: Understand the 3-step process of capturing, processing, and saving video data.
Object Detection Powerhouse: Explore state-of-the-art object detection models like Haar Cascade, HOG, Faster RCNN, R-FCN, SSD, and YOLO.
Real-World Applications: Implement practical projects like people footfall tracking, automatic parking management, and real-time license plate recognition.
Deep Learning Mastery: Learn to train and deploy deep learning models for image classification and object detection using frameworks like TensorFlow and Keras.
Hands-On Experience: Benefit from line-by-line code walkthroughs and dedicated support to ensure a smooth learning journey.

Exciting News!

We've just added two new, hands-on projects to help you master real-world computer vision applications:

Real-Time License Plate Recognition System Using YOLOv3: Dive deep into real-time object detection and recognition.
Training a YOLOv3 Model for Real-Time License Plate Recognition: Learn to customize and train your own YOLOv3 model. Don't miss this opportunity to level up your skills!

Why Enroll?

Industry-Relevant Skills: Gain in-demand skills to advance your career in AI and machine learning.
Practical Projects: Build a strong portfolio with real-world applications.
Expert Guidance: Learn from experienced instructors and get personalized support.
Flexible Learning: Access course materials and assignments at your own pace.

Unlock the power of computer vision and start building intelligent systems today!

Who this course is for:

Beginners to Data Science
Machine Learning Professionals
Developers willing to transition into Machine Learning
Anyone looking to implement Machine Learning on Videos
Anyone looking to become more employable as a Data Scientist

Course content

Course Starter • 3 lectures • 10min

Introduction to Video Architecture and Use Cases • 5 lectures • 12min

Video Analytics and Processing with Codec • 10 lectures • 17min

Object Detection - Human Detection with Euclidean Distance • 4 lectures • 5min

Object Detection Models - Haar Cascade, HOG, Faster RCNN, R-FCN, SSD, YOLO • 7 lectures • 15min

Object Detection Implementation on Videos using Haar Cascade, HOG and YOLO • 19 lectures • 37min

Training Image Classification Model using Deep Learning on Google Colab • 17 lectures • 34min

Image Classification Implementation on Videos using Trained Inception V3 Model • 10 lectures • 11min

Object Tracking using SORT Framework • 6 lectures • 19min

Project - Real-Time License Plate Recognition System Using YOLOv3 • 4 lectures • 14min
