What is computer vision and why is it important?

A free video tutorial from Sundog Education by Frank Kane
Founder, Sundog Education. Machine Learning Pro
4.5 instructor rating • 22 courses • 442,733 students

Learn more from the full course

Autonomous Cars: Deep Learning and Computer Vision in Python

Learn OpenCV, Keras, object and lane detection, and traffic sign classification for self-driving cars

12:44:34 of on-demand video • Updated May 2020

  • Automatically detect lane markings in images
  • Detect cars and pedestrians using a trained classifier and with SVM
  • Classify traffic signs using Convolutional Neural Networks
  • Identify other vehicles in images using template matching
  • Build deep neural networks with TensorFlow and Keras
  • Analyze and visualize data with NumPy, Pandas, Matplotlib, and Seaborn
  • Process image data using OpenCV
  • Calibrate cameras in Python, correcting for distortion
  • Sharpen and blur images with convolution
  • Detect edges in images with Sobel, Laplace, and Canny
  • Transform images through translation, rotation, resizing, and perspective transform
  • Extract image features with HOG
  • Detect object corners with Harris
  • Classify data with machine learning techniques including regression, decision trees, Naive Bayes, and SVM
  • Classify data with artificial neural networks and deep learning
English [Auto] Hello everyone, and welcome to this section. We're going to cover the fundamentals of computer vision: how we can represent an image, how to obtain the features within an image, and how to obtain, for example, the gradients of an image. I'm very excited to discuss all these fascinating topics with you.
The first step is an introduction: what is computer vision? Computer vision is the science used to make computers understand what's happening within an image. In the case of self-driving cars, for example, that means knowing whether a pedestrian has been detected, which lane the car needs to be centered within, whether there is a traffic sign to read, and so on. So again, computer vision is a science that allows computers to understand images and videos and determine what the computer sees or recognizes. That's pretty much it in a nutshell.
Let's take a look at how humans use their eyes, or the video stream we acquire through our eyes all the time, and see how we can imitate that in a computerized fashion. Let's assume we have an image like this one, of horses on a farm with a bunch of mountains. The sensors we use are obviously our eyes, and all the data is sent to our interpreter, the brain, in our cortex. It works out what we are actually seeing, classifies what is in the image, and gives us its guesses: we are seeing a horse, or maybe a group of horses, or maybe a mountain, maybe a farm, maybe all of them together, and so on.
And that's how humans in general perceive, see, or recognize objects within an image. How can we do that in computerized form? It's actually the same thing. If we use the exact same image, we can use a camera to take one still photo or a video stream of photos. Within the computer, we're going to develop algorithms that classify the images, again into horses, mountains, farms, and so on. That's pretty much what we're going to do throughout the entire section.
So why is computer vision important? Computer vision is applicable pretty much everywhere, not just in self-driving cars. It can be used for pedestrian and car detection in self-driving cars, and it can be used for face recognition: Facebook, Snapchat, and similar apps all have to use computer vision techniques. Object detection ties back to our self-driving cars; as you can see here, you can detect that this is a pedestrian crossing, this is a cyclist crossing, and so on. You can use it for handwriting recognition. You can use it for license plate number detection, one of the most famous applications: an algorithm takes an image of a license plate and reads the numbers automatically. It simply divides the plate into its different numbers or characters and uses a machine learning algorithm to tell us that this is the letter A, this is an N, this is an R, and so on. And obviously there are Snapchat filters. So again, computer vision is everywhere around us.
The next question, or the most important question, is: why is it so challenging?
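
The license plate idea described above, dividing the plate into separate characters before a classifier names each one, can be sketched roughly as follows. This is only an illustration of the splitting step, under the assumption that the plate is already a clean black-and-white image and that characters are separated by empty columns. The function name `split_characters` and the toy `PLATE` data are made up for this example, and plain Python lists stand in for the NumPy or OpenCV arrays a real pipeline would likely use.

```python
def split_characters(plate):
    """Return (start, end) column ranges, one per character-like blob.

    `plate` is a hypothetical binary image: a list of rows, where 1 marks
    a character pixel and 0 marks background.
    """
    width = len(plate[0])
    # A column contains "ink" if any row has a foreground pixel in it.
    has_ink = [any(row[x] for row in plate) for x in range(width)]

    segments = []
    start = None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                     # a new character begins here
        elif not ink and start is not None:
            segments.append((start, x))   # the character ended at column x
            start = None
    if start is not None:
        segments.append((start, width))   # character touching the right edge
    return segments


# Toy 3x11 "plate" holding three blobs separated by empty columns.
PLATE = [
    [0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1],
    [0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1],
]

print(split_characters(PLATE))  # [(1, 3), (5, 6), (8, 11)]
```

Each column range could then be cropped out and handed to a classifier that decides which letter or digit it shows. That classification step is not shown here.
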
So I just went to Google and searched for cars. As you can see, this is basically a bunch of cars; these images are all of cars. We as humans can easily classify these images because we have an outstanding generalization capability. Even if we see a car from behind or from the front, in different colors, even if it's a cartoon, or even if it's very tiny, I can still tell you that it's a car. However, it's a very complex task to train a computer or an algorithm to do it. If we classify based on color, there are so many different colors. There is different lighting: here you might see a slightly darker image. There are different scales: this is a smaller image. And there are different orientations of the car: this car is seen from the front, that one from behind, and so on. That's why it's a very challenging task to train a computer to see the same way we do as humans. I'm going to go through all these challenges and show you how we can do all these image manipulations moving forward.
OK, great. Let's take a look at a couple of these challenges and dig a little bit deeper. The first one is viewpoint. This is an image of the CN Tower in Toronto, viewed from far away, and we can recognize it easily. However, the viewpoint can change: if we go underneath the CN Tower and look up, that's the image we're going to get. Or we can get a little bit closer and see it from yet another angle.
However, these three images are actually of the same object. It's very difficult to take all these images and get your computer to generalize, to become smart enough to tell from different viewpoints that these are all CN Towers. For us as humans it's very easy to do; for computers it's a little bit challenging, and I'm going to show you how to deal with such challenges as we move forward through the course. That's the first challenge, which is viewpoint.
The next one we'll call camera limitations. Obviously, the better the camera, the better we can do: with, let's say, a 1080p camera we can do better than with a 240p one. Here you can see the image is a little bit blurred; the resolution wasn't very great. As we increase the resolution, it gets better, because we get more pixels. We're going to show you how we can describe an actual image in a digital format. So camera limitations are one of the challenges as well.
The next limitation is obviously lighting. Again, here we have our exact same object, the CN Tower. In the morning, in daylight, maybe the machine learning or computer vision algorithm can easily tell us that this is the CN Tower. At night, when it's a little bit dark, the features for us are the same, but it's a bit more difficult for the algorithm to detect it under different lighting conditions. The same goes for self-driving cars: we want our cars to drive in the morning, at night, in snow conditions, everywhere, and still detect objects, pedestrians, and so on.
Perfect. The next challenge is what we call scaling.
So again, here is our object. What if we zoomed in to take an image like this, for instance? Or, if we are classifying pedestrians, we might have a tall pedestrian and a short one. How can the machine learning algorithm tell that all of these are pedestrians? The same goes for cars, whether a car is close by or far away. How can we classify all of that? That's the power of advanced computer vision algorithms, and scaling is obviously one of the challenges posed to them.
The last one is called object variation. Again, I went and looked up, let's say, chairs, and these are the images of chairs. Even if you show them to a baby, he can easily classify all of these as chairs. Why? Because our brain can easily generalize; it has been looking at thousands and hundreds of thousands of images. However, training a machine learning or computer vision algorithm to do this is a challenging task. We have different object variations: this chair looks completely different from this one, which looks completely different from that one, and there are different materials and textures. That's why computer vision in general is very challenging to do in a computerized fashion.
And that's pretty much all we have for this section. I hope you enjoyed it, and I can't wait to discuss future sections with you. Thank you, and see you in the next one.
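
To make the camera limitation and lighting challenges from this section a bit more concrete, here is a minimal sketch of why they are hard for a program that only sees pixel values. It assumes a tiny grayscale image stored as a list of rows with values from 0 (black) to 255 (white). The helper names `darken`, `downsample`, and `mean_abs_diff` and the `DAY` image are invented for this illustration; they are not taken from the course or from any library.

```python
def darken(image, amount):
    """Simulate lower lighting: subtract `amount` from every pixel, floor at 0."""
    return [[max(0, pixel - amount) for pixel in row] for row in image]


def downsample(image, factor):
    """Simulate a lower-resolution camera: keep every `factor`-th pixel."""
    return [row[::factor] for row in image[::factor]]


def mean_abs_diff(a, b):
    """Average absolute per-pixel difference between two same-sized images."""
    total = sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    count = len(a) * len(a[0])
    return total / count


# A 4x4 "daylight" image: a dark square (50) on a bright background (200).
DAY = [
    [200, 200, 200, 200],
    [200,  50,  50, 200],
    [200,  50,  50, 200],
    [200, 200, 200, 200],
]

night = darken(DAY, 120)       # same object, much less light
low_res = downsample(DAY, 2)   # same object, a quarter of the pixels

print(mean_abs_diff(DAY, night))  # 102.5
print(low_res)                    # [[200, 200], [200, 50]]
```

The object in the scene never changes, yet the night version differs from the daylight one by about 100 gray levels per pixel on average, and the low-resolution version keeps only one pixel of the dark square. A program that compared raw pixels directly would treat these as very different images, which is one way to see why generalizing across lighting and resolution takes more than pixel matching.
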