
Optical Flow Coding with OpenCV - Part One

A free video tutorial from Jose Portilla
Head of Data Science at Pierian Training
Instructor rating: 4.6 out of 5
86 courses
3,934,978 students

Learn more from the full course

Python for Computer Vision with OpenCV and Deep Learning

Learn the latest techniques in computer vision with Python, OpenCV, and Deep Learning!

14:03:33 of on-demand video • Updated March 2021

Understand basics of NumPy
Manipulate and open Images with NumPy
Use OpenCV to work with image files
Use Python and OpenCV to draw shapes on images and videos
Perform image manipulation with OpenCV, including smoothing, blurring, thresholding, and morphological operations.
Create Color Histograms with OpenCV
Open and Stream video with Python and OpenCV
Detect Objects, including corner, edge, and grid detection techniques with OpenCV and Python
Create Face Detection Software
Segment Images with the Watershed Algorithm
Track Objects in Video
Use Python and Deep Learning to build image classifiers
Work with Tensorflow, Keras, and Python to train on your own custom images.
English [CC]
(gentle music) -: Welcome back, everyone. Now that we have an understanding of the intuition behind optical flow, let's use OpenCV's built-in functions to program our own optical flow use cases. Let's head over to the Jupyter Notebook and get started.

All right, here we are in the notebook. What we're gonna do is start off with the Lucas-Kanade method for optical flow, and that's for the sparse points that we want to track. Essentially, we're just picking a few points and then tracking their flow throughout the video from frame to frame. We'll actually be tagging them, so you'll see a faint line being drawn as the objects move. And we're gonna choose points based off corner detection. Later on, in future object-tracking videos in this section, we'll actually detect the face first and then track that. But for right now, we'll keep it simple and use an object detection method we already know, which is just corner detection.

So I'll start off by importing numpy as np, and then importing cv2. Next, I'm gonna set some parameters for Shi-Tomasi corner detection. That's from the Good Features to Track paper that we already discussed, and this should feel a little familiar to you. Essentially, it's just a dictionary of corner tracking parameters, and I'm using some default values here. So we can set maxCorners; let's say we're gonna detect 10 corners in our image, just the best quality ones. We can also define a quality level, and we'll say the quality level we want here is 0.3. Obviously, you can play around with these. We'll say the minimum distance is seven, and the block size is also seven. Okay, so we have our parameters for the corner detection. Essentially what we're gonna do is detect 10 corners on the very first frame of the video and then track those.

All right, the next thing I'm going to do is set up another dictionary called lk_params. These are gonna be the parameters for the Lucas-Kanade optical flow function we call later on, and we're going to provide a couple of default values here. The first one is the window size. With winSize, you essentially have a trade-off between smaller windows and larger windows. If you have a smaller window, you're going to be more sensitive to noise, and you may miss larger motions: if the points you're trying to track are moving really fast, the movement from one frame to the next is large, and you may not catch it with a small window size. If you have a larger window size, you're gonna be able to catch those larger motions; however, it may not be as sensitive to smaller motions of the points, so you may think a point is standing still when it's actually moving in really small, harder-to-detect increments. So again, it's a trade-off between sensitivity to noise and small motions versus being able to catch those larger motions. That's our winSize, and we'll go ahead and use a default value of 200 by 200.

Next, we're going to provide a maxLevel. What this does is let the Lucas-Kanade method work with what's known as an image pyramid. If you check out the Wikipedia page on pyramids in image processing and scroll down, you'll notice an image that essentially describes what a pyramid is when it comes to image analysis.
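For reference, here's a rough sketch of the setup described so far, assuming the standard numpy and cv2 packages; the values simply mirror the ones mentioned in the narration, and the dictionary name corner_track_params is just a placeholder.

    import numpy as np
    import cv2

    # Shi-Tomasi corner detection parameters (from the Good Features to Track paper)
    corner_track_params = dict(maxCorners=10,     # track the 10 best corners
                               qualityLevel=0.3,  # minimum accepted corner quality
                               minDistance=7,     # minimum pixel distance between corners
                               blockSize=7)       # neighborhood size used for the corner test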
And the Lucas-Kanade method can actually use image pyramids for its analysis. If you leave maxLevel at zero, it just uses the original image. However, you can use higher maxLevels, and we'll provide a maxLevel of two. What this does is allow you to find optical flow at various resolutions of the image. So here we can see Level 1 is at half resolution, Level 2 is at a quarter resolution, and so on. We'll use a default value of maxLevel = 2.

The next thing we're gonna provide is the criteria. The way this criteria works is we'll say cv2.TERM_CRITERIA_EPS, then the pipe operator, essentially an "or", and then cv2.TERM_CRITERIA_COUNT. And right after that, we're gonna say 10 and 0.03. So what this does is provide two criteria for performing the Lucas-Kanade optical flow. We're providing the maximum number of iterations; that's the TERM_CRITERIA_COUNT, and it corresponds to the 10. We're also providing EPS, or epsilon, and that's the 0.03. More iterations means a more exhaustive search for the points: the count controls how many iterations you spend looking for these points in the current frame versus the previous frame. And if you have a smaller epsilon, you're going to finish earlier. Essentially, you can play around with these values of CRITERIA_COUNT versus CRITERIA_EPS; whichever one is satisfied first ends the search, so you're exchanging the speed of your tracking for the accuracy of your tracking. We've left some good default values here for you to play around with, but you'll often have to adjust these depending on what your actual video source is, so keep that in mind. Essentially, what we're doing here is playing around with some values and exchanging speed versus accuracy. Okay, so that's the parameter dictionary we're going to use later on when we actually call the built-in Lucas-Kanade function. Let's go ahead and run that and then continue.

Now the next step is to actually grab an image from our camera. A lot of this you'll already be familiar with. We'll say cv2.VideoCapture, and here you can just provide a video file, but in my case I'm going to stream live from the camera, so I'll say zero. Then we're gonna take ret, essentially indicating true or false, whether it's actually able to capture the video, and we'll call this the prev_frame of the video. So we're reading the very first frame and treating it as the previous frame, essentially saying: here are the points, here's the previous frame, and in the next frame we'll see if we can find those points again. So we're reading the very first frame and labeling it prev_frame. And in order to do this, we're also going to grab a grayscale version of that frame. We'll say prev_gray = cv2.cvtColor, pass in the previous frame, and convert it with COLOR_BGR2GRAY: blue, green, red, because it's OpenCV, to gray. Next, we're going to decide which points we actually want to track. So, here are the points to track. What we're going to do is use the corner tracking parameters to grab some good features to track. So we're gonna choose the top 10 corners and track those.
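Here's a sketch of the Lucas-Kanade parameter dictionary and the capture setup just described; again, the variable names are placeholders, and the 0 assumes your default USB camera.

    # Lucas-Kanade optical flow parameters
    lk_params = dict(winSize=(200, 200),  # window size: trade-off between noise sensitivity and large motions
                     maxLevel=2,          # pyramid levels: 0 = original image only, 2 = down to quarter resolution
                     criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 0.03))

    # Capture from the default camera (or pass a video file path instead of 0)
    cap = cv2.VideoCapture(0)

    # Read the very first frame and treat it as the "previous" frame
    ret, prev_frame = cap.read()
    prev_gray = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)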
So we'll say prevPts = cv2.goodFeaturesToTrack and pass in the previous grayscale frame. We're not gonna use a mask for this. Then I can pass in the corner tracking parameters simply by putting two asterisks there followed by the corner tracking parameters; that essentially allows you to unpack a dictionary into a function call like that. Okay, so we have the points we wanna track.

Then this mask that we're creating here is going to be for displaying the actual points and drawing lines. So this mask has more to do with actually drawing lines onto the video than with tracking points; consider it just a way of visualizing, and we'll be using it later on. We'll say np.zeros_like and pass in prev_frame. All this does is create a NumPy array of zeros that has the same shape as the frame you're looking at, so this creates a matching mask of the previous frame for drawing on later. Okay, so we have our points to track, we're reading in video data, we have the grayscale, and we have our mask that we can draw on and later recombine with the actual frame that we're looking at.

Now it's time to write the loop that does most of the work. We'll say while True, and then ret, frame = cap.read(), just as we did above. But notice that now this is called frame; this is our current frame, while the very first one is prev_frame. So we have ret, frame. Then we'll do the exact same thing of creating a grayscale, so we'll copy that line and paste it, but instead of prev_gray, I'm going to call it frame_gray, the current grayscale frame, and we'll pass in frame. So we're taking the current frame, and technically on the very first iteration those two will be the same. So we have frame_gray.

Next we're going to calculate the optical flow on this grayscale frame. We do that by saying nextPts, status, and err, for error; these are the three things returned by the built-in function calcOpticalFlowPyrLK, where PyrLK stands for Pyramid Lucas-Kanade. All we need to do here is pass in our previous grayscale image, the current grayscale image, and the points we wanna find from the previous frame in the next frame, which to start off with are the prevPts from up above (later on, we're gonna reassign them). Then the next parameter we just label as None; you can check out the docs if you wanna see all the parameters. There's prevPts and then the nextPts, but in this case we actually wanna figure out what the nextPts are, so we won't provide them; instead, they're gonna be spit out by the function itself, so we say None there. And after that, we just need all the parameters, which we've already defined in lk_params. Okay, so there's a lot going on with this function call, but really it's the same stuff we were talking about in the slides in the previous lecture: we pass in the previous frame and the current frame, both in grayscale, along with the prevPts, we leave the nextPts to be returned, and then there are the parameters we can adjust simply by adjusting this dictionary, which I encourage you to play around with. Okay, so we called the optical flow. And now that we've calculated it, what we're going to do is use the returned status array.
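Putting that together, the start of the loop might look roughly like this; it's a sketch following the narration, with the body of the loop continued in the next sketches.

    # Points to track: the strongest corners in the first grayscale frame
    prevPts = cv2.goodFeaturesToTrack(prev_gray, mask=None, **corner_track_params)

    # Mask for drawing the tracking lines, same shape as the frame
    mask = np.zeros_like(prev_frame)

    while True:
        # Grab the current frame and convert it to grayscale
        ret, frame = cap.read()
        frame_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

        # Find where the previous points ended up in the current frame
        nextPts, status, err = cv2.calcOpticalFlowPyrLK(prev_gray, frame_gray,
                                                        prevPts, None, **lk_params)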
So this status array is what's known as a status vector, where each element of the vector is set to one if the flow for the corresponding feature has been found; otherwise, it's set to zero. Let me show you what that looks like and how we can use it. I'll create a variable called good_new, and then say nextPts where status = 1, and then good_prev, where the prevPts has status = 1. Essentially, we're matching them up based on index location. And remember, the status vector is set to one if the flow for the corresponding feature has been found, so this is connecting where the previous points were to the next set of points.

Then we're going to say for i, and then as a tuple here, new and prev, in enumerate, and we're gonna zip together the good new points along with the good previous points. Then we're gonna calculate the x and y positions so we can actually draw little markers there. We'll say x_new, y_new = new, and then call .ravel (or ravel, depending on how you pronounce it). This is a NumPy method, and you can check out the docs here; it's very similar to flattening out an array. Essentially, it's the same as saying reshape(-1) where you're keeping the current order. So here we can see an array passed in with two dimensions, and after you call ravel on it, it's flattened out to a single array. We're gonna be using that because we want these values for drawing. So we'll come back up here to Jupyter Lab and do the same operation on the old points: we'll say x_prev and y_prev, set that equal to the prev, and call ravel on that.

What we're going to be doing is drawing some lines on our mask. We'll say cv2.line, pass in the mask, and draw a line from the new point to the previous point; this is essentially going to draw a little tracking line of the points as they move from frame to frame. So we go from x_new, y_new to x_prev, y_prev. Then we'll choose the color; let's go ahead and just make it green. And you can choose a thickness; we'll give it a thickness of three. Now on the actual current frame, I wanna draw a dot at the current position of the point we're tracking. We'll say frame = cv2.circle, and then pass in frame. We're only gonna draw this on the new points, 'cause that's our current location, so we'll use the new point's coordinates. Let's go ahead and give this a radius of eight and the color red, and we'll have it filled in, so we'll just say the thickness is -1.

So all this is doing is going through the good new points and the good previous points. This is kind of a complex NumPy array, because it's technically tracking 10 different points, since we asked it to track 10 different corners; it can be more or less than that, but keep that in mind. So it has a bunch of points to track, and we're essentially using enumerate to iterate through those good points, flattening them out using some NumPy, drawing a line connecting the previous point's position to the current point's position (x_new, y_new back to the previous point), and then drawing a circle where the point is in the current frame. Now, once we've done that, outside of the for loop we're gonna display this. We'll say img = cv2.add, and we're gonna add the frame and the mask.
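Continuing inside the while loop, the filtering and drawing steps might look like this. The status check is written here with the equality comparison (==) that the narration corrects a moment later, and the int() casts are added as a precaution because newer OpenCV releases require integer pixel coordinates.

        # Keep only the points whose flow was actually found (status == 1)
        good_new = nextPts[status == 1]
        good_prev = prevPts[status == 1]

        for i, (new, prev) in enumerate(zip(good_new, good_prev)):
            # Flatten each point into plain x, y values
            x_new, y_new = new.ravel()
            x_prev, y_prev = prev.ravel()

            # Green tracking line on the mask, from the new position back to the previous one
            mask = cv2.line(mask, (int(x_new), int(y_new)), (int(x_prev), int(y_prev)), (0, 255, 0), 3)

            # Filled red dot on the current frame at the point's current position
            frame = cv2.circle(frame, (int(x_new), int(y_new)), 8, (0, 0, 255), -1)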
So that adds the frame with the circles and the mask with the lines. Then we'll say cv2.imshow to display this; we can just call the window 'tracking', and we'll show that image that now has both the frame with the points and the mask with the lines. We'll add a little bit of functionality here just to make sure we can escape out of this. We'll say k = cv2.waitKey(30) & 0xFF, so wait 30 milliseconds, and then say if k == 27 we break; you could technically write that all in one line, and we've seen this before.

Now, the most important step here is that we need to update the prevPts to now be the current points. We just drew our lines and circles; now we need to make the current frame the previous frame for the next iteration, which means I'll say prev_gray = frame_gray and create it as a copy, that way I don't overwrite anything. Then we'll say prevPts = good_new, and we're gonna reshape that to -1, 1, 2, since that's the shape accepted by this calcOpticalFlowPyrLK function. And then outside of all of this, outside of the entire while loop, we'll say cv2.destroyAllWindows, essentially we're done once we hit the Escape key, and then we'll also release the capture.

Before we run this, we wanna fix a minor typo we made up here: we're actually checking for equality, not assigning status. So we're gonna check for status being equal to one, not reassigning status. A simple little typo there. Let's go ahead and save this and run it. What I would recommend is, before you actually do Shift + Enter to run the cell, look directly into your USB camera, so that hopefully some of the corners it detects have to do with your face; that way it will track a few points on your face. I'm going to look directly into my USB camera, do Shift + Enter here, and let me bring up what it's running. And there we go. So it tracked my eyes and my nostrils and a couple of other corners. I'm going to begin moving, and then you should see, and this looks really creepy from my angle, the points being tracked, as well as the lines moving and being drawn on that mask as we later combine it. And as you can tell, eventually, if you start moving really fast, it may actually lose some of those points. I'm pretty well lit here, but you can see that when I started turning my face, it kinda lost my left eyeball; it didn't really understand how to track that since it disappeared. So keep that in mind: if a point disappears, this optical flow won't be able to track it any further.

All right, so this was for a sparse set of points. Coming up next, we're going to look at dense optical flow, which is actually just a matter of swapping out that function call. Similar code, but we're gonna swap a single line out for a dense optical flow calculation. We'll cover that in the next lecture. I'll see you there.
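To round out the sketch from the previous blocks, the display, escape-key check, update step, and cleanup described in this lecture might look like this:

        # Combine the frame (with the circles) and the mask (with the lines) and display it
        img = cv2.add(frame, mask)
        cv2.imshow('tracking', img)

        # Wait 30 ms; break out of the loop if the Escape key (27) is pressed
        k = cv2.waitKey(30) & 0xFF
        if k == 27:
            break

        # The current frame and points become the "previous" ones for the next iteration
        prev_gray = frame_gray.copy()
        prevPts = good_new.reshape(-1, 1, 2)

    cv2.destroyAllWindows()
    cap.release()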