How does Viola-Jones work?

Thanks to the integral image, calculating the value of any Haar-like feature reduces to a simple difference between the sums of pixel values of two (or more) rectangles. Next, we use a machine learning algorithm known as AdaBoost. But why do we need a learning algorithm at all? Because even a small detection window contains an enormous number of possible Haar-like features, we use AdaBoost to identify the best ones. In the Viola-Jones algorithm, each Haar-like feature represents a weak learner. To decide the type and size of a feature that goes into the final classifier, AdaBoost checks the performance of every weak classifier supplied to it.
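To make this concrete, here is a minimal NumPy sketch (not code from the article; the function names are illustrative) of how the integral image turns any rectangle sum, and therefore any two-rectangle Haar-like feature, into a handful of array lookups:

```python
import numpy as np

def integral_image(img):
    # ii[y, x] = sum of all pixels above and to the left of (y, x), inclusive
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, x, y, w, h):
    # Sum of the w-by-h rectangle whose top-left pixel is (x, y),
    # using at most four lookups into the integral image.
    a = ii[y - 1, x - 1] if x > 0 and y > 0 else 0
    b = ii[y - 1, x + w - 1] if y > 0 else 0
    c = ii[y + h - 1, x - 1] if x > 0 else 0
    d = ii[y + h - 1, x + w - 1]
    return d - b - c + a

def two_rect_vertical_feature(ii, x, y, w, h):
    # A two-rectangle "edge" feature: brighter region on top, darker region
    # below (e.g. forehead above the eye region). Value = top sum - bottom sum.
    top = rect_sum(ii, x, y, w, h // 2)
    bottom = rect_sum(ii, x, y + h // 2, w, h // 2)
    return top - bottom

window = np.random.randint(0, 256, (24, 24)).astype(np.int64)  # a 24x24 detection window
ii = integral_image(window)
print(two_rect_vertical_feature(ii, x=4, y=4, w=12, h=8))
```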

To calculate the performance of a classifier, you evaluate it on all subregions of all the images used for training. Some subregions will produce a strong response in the classifier; those are classified as positives, meaning the classifier thinks they contain a human face.

Subregions that produce a weak response are classified as negatives, meaning the classifier thinks they do not contain a face. The classifiers that perform well are given higher importance, or weight. The final result is a strong classifier, also called a boosted classifier, that combines the best-performing weak classifiers. Ultimately, the algorithm sets a minimum threshold to determine whether something counts as a useful feature or not.
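As a rough illustration (not the article's code; names are illustrative), a weak learner in Viola-Jones is simply a threshold on one feature value, and the boosted classifier is a weighted vote of the selected weak learners, following the form given in the original paper:

```python
def weak_classify(feature_value, threshold, polarity):
    # h(x) = 1 (face) if polarity * feature_value < polarity * threshold, else 0
    return 1 if polarity * feature_value < polarity * threshold else 0

def strong_classify(feature_values, weak_learners):
    # weak_learners: list of (alpha, feature_index, threshold, polarity) tuples
    # chosen by AdaBoost; alpha is the weight (importance) of each weak learner.
    vote = sum(alpha * weak_classify(feature_values[i], t, p)
               for alpha, i, t, p in weak_learners)
    # Classify as a face when the weighted vote reaches half the total weight.
    return vote >= 0.5 * sum(alpha for alpha, _, _, _ in weak_learners)
```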

Even after AdaBoost has narrowed the pool down to the best features, it is still time-consuming to compute all of them for every region. The job of the cascade is to quickly discard non-faces and avoid wasting precious time and computation, thus achieving the speed necessary for real-time face detection.

We set up a cascaded system in which the process of identifying a face is divided into multiple stages. The first stage contains a classifier built from our best features; in other words, in the first stage a subregion passes through the strongest features, such as the one that identifies the nose bridge or the one that identifies the eyes.

The later stages contain all the remaining features. When an image subregion enters the cascade, it is evaluated by the first stage. If a subregion gets a "maybe", it is sent to the next stage of the cascade, and the process continues like this until the last stage is reached.

If all classifiers approve the image, it is finally classified as a human face and is presented to the user as a detection. So how does this increase speed? If the first stage gives a negative evaluation, the subregion is immediately discarded as not containing a human face. If it passes the first stage but fails the second, it is discarded as well.
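Here is a minimal sketch of that early-rejection logic (illustrative, not OpenCV's internals), reusing the weak_classify helper from the previous sketch; each stage is a small boosted classifier with its own threshold, and a window is dropped the moment any stage rejects it:

```python
def cascade_classify(feature_values, stages):
    # stages: list of (weak_learners, stage_threshold) pairs, cheapest stage first.
    # weak_learners uses the same (alpha, feature_index, threshold, polarity)
    # format as the strong classifier sketched above.
    for weak_learners, stage_threshold in stages:
        vote = sum(alpha * weak_classify(feature_values[i], t, p)
                   for alpha, i, t, p in weak_learners)
        if vote < stage_threshold:
            return False   # rejected early: almost all non-face windows exit here
    return True            # survived every stage: reported as a face
```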

In short, a subregion can be discarded at any stage of the cascade. In this section, we are going to implement the Viola-Jones algorithm using OpenCV and detect faces in our webcam feed in real time. We will also use the same approach to detect a person's eyes. You can refer to this article to learn about OpenCV and how to install it.

Instead of creating and training a model from scratch, we use the pre-trained Haar cascade classifier files that ship with OpenCV. Now let us start coding. The first step is to locate these files, which we do using Python's os module. The next step is to load our classifiers.
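As a sketch of this step (assuming the opencv-python package; the article's exact file paths are not shown, so the standard frontal-face and eye cascades are used), the bundled pre-trained XML files can be located and loaded like this:

```python
import os
import cv2

# Directory that ships with opencv-python and holds the pre-trained XML cascades
cascade_dir = cv2.data.haarcascades

faceCascade = cv2.CascadeClassifier(
    os.path.join(cascade_dir, "haarcascade_frontalface_default.xml"))
eyeCascade = cv2.CascadeClassifier(
    os.path.join(cascade_dir, "haarcascade_eye.xml"))

assert not faceCascade.empty() and not eyeCascade.empty(), "cascade XML files not found"
```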

We are using two classifiers: one for detecting the face and the other for detecting the eyes. Next, we need to get the frames from the webcam stream, which we do using the read function inside an infinite loop that keeps grabbing frames until we decide to close the stream. The return code tells us whether we have run out of frames, which will happen if we are reading from a file. The faceCascade object has a method detectMultiScale, which receives a frame image as an argument and runs the classifier cascade over the image (a consolidated code sketch of this loop appears at the end of this walkthrough).

The term MultiScale indicates that the algorithm looks at subregions of the image at multiple scales in order to detect faces of varying sizes. The variable faces now contains all the detections for the target image. Detections are saved as pixel coordinates: each detection is defined by the coordinates of its top-left corner and the width and height of the rectangle that encompasses the detected face. To show the detected face, we will draw a rectangle over it.

The coordinates indicate the row and column of pixels in the image, and we can read them directly from each entry of the faces variable. Now that we know the location of the face, we define a new region that contains just the person's face and name it faceROI.

Within faceROI, we detect the eyes and encircle them using the circle function. Finally, we display the resulting frame and set up a way to exit the infinite loop and close the video feed. This brings us to the end of this article, where we learned about the Viola-Jones algorithm and its implementation in OpenCV.
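Putting the steps of this walkthrough together, here is a consolidated sketch of the detection loop (parameter values such as scaleFactor and minNeighbors are typical choices rather than necessarily the article's); it assumes the faceCascade and eyeCascade objects loaded earlier:

```python
import cv2

videoCapture = cv2.VideoCapture(0)          # 0 selects the default webcam

while True:
    ret, frame = videoCapture.read()        # ret is False when no frame is available
    if not ret:
        break

    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = faceCascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

    for (x, y, w, h) in faces:
        # Draw a rectangle around the detected face
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)

        # Restrict the eye search to the face region (the faceROI)
        faceROI = gray[y:y + h, x:x + w]
        eyes = eyeCascade.detectMultiScale(faceROI)
        for (ex, ey, ew, eh) in eyes:
            center = (int(x + ex + ew // 2), int(y + ey + eh // 2))
            radius = int(max(ew, eh) // 2)
            cv2.circle(frame, center, radius, (255, 0, 0), 2)

    cv2.imshow("Viola-Jones face detection", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):   # press 'q' to exit the loop
        break

videoCapture.release()
cv2.destroyAllWindows()
```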

Cascade Classifier: A cascade classifier consists of multiple stages of filters and is used to decide whether the image region inside a sliding window is a face.

A detection window shifts around the whole image; for each position, Haar filter features are extracted (computed efficiently via the integral image) and then sent to the cascade classifier to decide whether the region is a face. The sliding window shifts pixel by pixel, and each time the window shifts, the image region within the window goes through the cascade classifier. Haar Filter: you can think of each filter as extracting features such as the eyes, the bridge of the nose, and so on. A cascade classifier consists of multiple stages of filters.

Each time the sliding window shifts, the new region within the window goes through the cascade classifier stage by stage. If the region fails to pass the threshold of any stage, the cascade classifier immediately rejects it as a non-face. If a region passes all stages successfully, it is classified as a face candidate, which may be refined by further processing.
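As an illustrative sketch of the scan just described (not OpenCV's actual implementation), the detection window can be slid across the image pixel by pixel and the whole scan repeated at several window sizes; classify_window below stands in for the cascade classifier:

```python
def sliding_window_detect(gray, classify_window, base=24, scale=1.25, step=1):
    # gray: 2-D grayscale image array; classify_window(window) -> True if the
    # region looks like a face (e.g. the cascade classifier sketched earlier).
    detections = []
    size = base
    while size <= min(gray.shape):
        for y in range(0, gray.shape[0] - size + 1, step):
            for x in range(0, gray.shape[1] - size + 1, step):
                if classify_window(gray[y:y + size, x:x + size]):
                    detections.append((x, y, size, size))
        size = int(size * scale)   # grow the window to catch larger faces
    return detections
```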

First, I suggest you read the source paper, "Rapid Object Detection using a Boosted Cascade of Simple Features", to get an overview of the method. There is also a Python implementation of the face detection algorithm by Paul Viola and Michael J. Jones.

Please explain, in a few words, how the Viola-Jones face detection method works.

Lienhart introduced an extended set of tilted Haar-like features: these are the standard Haar-like features rotated by 45 degrees. The scaling process is similar to the rounding used when resizing a Haar-like feature for larger or smaller windows; one difference is that, for a 45-degree tilted feature, using integer numbers of pixels for the height and width means the diagonal coordinates always fall on the same diagonal set of pixels. As a result, the number of differently sized 45-degree tilted features available is significantly reduced compared to the standard vertically and horizontally aligned features.

Can you explain a bit better what a visual feature is (the meaning of the squares in the picture) and what the final formula in your answer means?

OK, thanks a lot. I have one last question in reference to JCooper's answer: how can face detection with Haar features work if we have a white person with blue or green eyes, so that the difference in brightness between the nose and the eyes isn't considerable? @BlackShadow It still works pretty well because the eyes are set back in the head; the shadows from top-down lighting can still trigger the feature.
