**The Viola Jones Face Detection Algorithm**

WANG Yi-Qing, yiqing.wang@polytechnique.edu, ENS Cachan

This source code creates five command line programs.

* *train.cpp*: command-line program that performs cascade training
* *detect.cpp*: command-line program that performs face detection
* *TrainExamples.{cpp,h}*: implements a class for Adaboosting; main routine: adaboost()
* *Detector.{cpp,h}*: implements a class for storing the trained cascade for face detection
* *detectUtil.{cpp,h}*: implements the main routine for face detection: scan()
* *commonUtil.{cpp,h}*: implements general utilities such as an Eigen wrapper for PNG image IO
* *boostUtil.{cpp,h}*: implements the main routine for cascade construction: train()
* *io_png.{c,h}*: image input and output for PNG
* *startTraining.sh*: a bash script for cascade training. It creates several input files for train()
* *getParameters.sh*: a bash script for monitoring the training process

The code has been tested with g++ 4.6.3 under Ubuntu and no warning or error was issued.

*libpng* is required as well as the Eigen library. The latter will be automatically downloaded when the code is compiled for the first time, which, however, requires *wget* to work properly on your system. If it is not the case, please manually download the latest Eigen from its official website *http:

To compile with parallelization enabled, type `make OMP=1`. Otherwise, simply type `make`.

RELEASE is the default mode in the makefile. However, VERSION can be set to DEBUG for testing.

How to train a cascade
======================

Unpack the tarballs and issue ./startTraining.sh, which will download the required dataset.

When asked at the prompt

> start training right away? y/n

type *n* to defer the training run to a later time. In that case, run the same script again later and the required data and folders already in place will be used. Otherwise, type *y*.

The learned cascade will be stored in *Detector.cxx*. To use it, rename it *Detector.cpp* and compile again.

* To run the executable *train*, issue

> ./train > trainLog 2>&1 &

so that the training process is logged to trainLog. The log can also be inspected with getParameters.sh, which summarizes the overall training progress, including the accumulated false positive rate at each layer and the committee size per layer.

* To run the executable *detect*, issue

> ./detect example.png [number of layers] [threshold]

where example.png is the image subject to face detection, the first optional parameter defines the number of layers to be used in the cascade, and the second denotes the robustness threshold introduced in the post-processing section of the article. These two parameters default to 31 and 3, respectively.

These explanatory instructions are also given if one simply types

* rotated.png: one of the two rotated versions of the input image that are also scanned to increase the detection rate. It does not constitute a vital part of the final output.
* detectedraw.png: raw detection without the post-processing phase.
* ppRobust.png: the detected windows resulting from the robustness test described in the article applied to detectedraw.png.
* ppSkin.png: the detected windows resulting from the skin test described in the article applied to detectedraw.png.
* ppBoth.png: the detected windows resulting from the two previous tests applied to detectedraw.png.

This source code implements a detector that scans all the square subwindows inside a given image and highlights those presumably having a face.

This is an outline of how the face detection is performed in detectUtil.cpp:

scan(), detectUtil.cpp:

All the image subwindows are tested by tscan().

tscan(), detectUtil.cpp:

imread() reads in a gray image. In case of a color image, the RGB to grayscale conversion is performed.

The integral image of the image to scan is computed so as to accelerate its local sum and variance evaluation.

The main loop examines all the square subwindows inside the image.

area contains three parameters that characterize a subwindow, with (pos_i, pos_j) denoting its upper left corner's coordinates.

If the subwindow has a standard deviation smaller than FLAT_THRESHOLD, it is considered to be flat and thus labelled as a non-face. Otherwise, detectFace() runs the subwindow through the cascade.
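
The flatness test above relies on fast local sums. A minimal sketch of how integral images can provide the sum and standard deviation of a square subwindow is given below. It is an illustration under stated assumptions, not the code of detectUtil.cpp: the `Grid` type, the helper names, and the `flatThreshold` argument are hypothetical, and the original code works on Eigen matrices instead.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical image type for this sketch (the original code uses Eigen matrices).
using Grid = std::vector<std::vector<double>>;

// Pixel-wise square of an image, needed to evaluate the variance.
Grid squared(const Grid &img) {
    Grid out = img;
    for (auto &row : out)
        for (auto &v : row) v *= v;
    return out;
}

// Integral image padded with one leading row and column of zeros:
// ii[i][j] is the sum of img[r][c] over all r < i and c < j.
Grid integralImage(const Grid &img) {
    assert(!img.empty() && !img[0].empty());
    const std::size_t rows = img.size(), cols = img[0].size();
    Grid ii(rows + 1, std::vector<double>(cols + 1, 0.0));
    for (std::size_t i = 0; i < rows; ++i)
        for (std::size_t j = 0; j < cols; ++j)
            ii[i + 1][j + 1] = img[i][j] + ii[i][j + 1] + ii[i + 1][j] - ii[i][j];
    return ii;
}

// Sum over the square subwindow with upper left corner (pos_i, pos_j),
// obtained from four lookups whatever the window size.
double windowSum(const Grid &ii, std::size_t pos_i, std::size_t pos_j, std::size_t side) {
    return ii[pos_i + side][pos_j + side] - ii[pos_i][pos_j + side]
         - ii[pos_i + side][pos_j] + ii[pos_i][pos_j];
}

// Standard deviation of a subwindow, from the integral images of the
// pixel values (ii) and of their squares (ii2).
double windowStdDev(const Grid &ii, const Grid &ii2,
                    std::size_t pos_i, std::size_t pos_j, std::size_t side) {
    const double n = static_cast<double>(side * side);
    const double mean = windowSum(ii, pos_i, pos_j, side) / n;
    const double var = windowSum(ii2, pos_i, pos_j, side) / n - mean * mean;
    return std::sqrt(var > 0.0 ? var : 0.0); // guard against rounding below zero
}

// flatThreshold is an illustrative parameter chosen by the caller.
bool isFlat(const Grid &ii, const Grid &ii2, std::size_t pos_i, std::size_t pos_j,
            std::size_t side, double flatThreshold) {
    return windowStdDev(ii, ii2, pos_i, pos_j, side) < flatThreshold;
}
```

Both integral images are built once per image, after which every subwindow costs a constant number of lookups.
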

An empty cascade considers all the subwindows to have a face.

Otherwise, computeFeature() calculates the subwindow's features required by the decision stumps in the first layer and a weighted vote determines its label. If it is declared to be a face, let this subwindow go to the next layer. If not, reject the subwindow.
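
The layer-by-layer test just described can be sketched as follows. This is a simplified illustration rather than the actual detectFace(): the `Stump` and `Layer` types, the per-layer `shift`, and the orientation of the stump inequality are assumptions, and the features are taken from a precomputed vector whereas the original code evaluates them on demand with computeFeature().

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical decision stump: votes +1 (face) or -1 (non-face) on one feature.
struct Stump {
    std::size_t featureIndex; // which feature of the subwindow to read
    double threshold;         // decision boundary on that feature
    int toggle;               // +1 or -1, orientation of the inequality
    double weight;            // voting weight of the stump within its layer
};

// Hypothetical layer: a committee of stumps plus an offset on the vote.
struct Layer {
    std::vector<Stump> committee;
    double shift; // added to the weighted vote before taking its sign (assumption)
};

int stumpVote(const Stump &s, const std::vector<double> &features) {
    assert(s.featureIndex < features.size());
    return (features[s.featureIndex] >= s.threshold) ? s.toggle : -s.toggle;
}

// Returns true if the subwindow survives the first layersToUse layers.
bool passesCascade(const std::vector<Layer> &cascade,
                   const std::vector<double> &features,
                   std::size_t layersToUse) {
    const std::size_t n = std::min(layersToUse, cascade.size());
    for (std::size_t l = 0; l < n; ++l) {
        double vote = cascade[l].shift;
        for (const Stump &s : cascade[l].committee)
            vote += s.weight * stumpVote(s, features);
        if (vote < 0.0) return false; // rejected by this layer, stop early
    }
    return true; // an empty cascade accepts every subwindow
}
```

Because most subwindows are rejected by the first few layers, the early return is what keeps the scan affordable.
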

This part of the code implements training example collection and the Adaboost algorithm using the decision stump as its base learner. Note that here Adaboost does not produce a committee; it only adds one more stump to the committee, given the example weights at the time. How many stumps are needed to form a committee is left to boostUtil.cpp.

Construct a new decision stump with bestStump().

predictLabel() evaluates how the decision stump fares.

The training examples' weights are adjusted depending on the outcome.
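
To make the weight adjustment concrete, here is a minimal sketch of the textbook discrete Adaboost update for one round. It assumes labels and predictions in {-1, +1} and weights summing to one. The function name `reweight` is hypothetical, and the exact formula used in TrainExamples.cpp may differ in detail.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One round of the standard discrete Adaboost reweighting (an assumption about
// the update rule, not a transcription of the original code).
// Updates the example weights in place and returns the stump's voting weight.
double reweight(std::vector<double> &weights,
                const std::vector<int> &labels,
                const std::vector<int> &predictions) {
    assert(weights.size() == labels.size() && labels.size() == predictions.size());

    // Weighted error of the new stump on the training examples.
    double error = 0.0;
    for (std::size_t i = 0; i < weights.size(); ++i)
        if (predictions[i] != labels[i]) error += weights[i];

    // This sketch only covers a stump that is better than chance but not perfect.
    assert(error > 0.0 && error < 0.5);
    const double alpha = 0.5 * std::log((1.0 - error) / error);

    // Misclassified examples gain weight, correctly classified ones lose weight.
    double total = 0.0;
    for (std::size_t i = 0; i < weights.size(); ++i) {
        weights[i] *= std::exp(-alpha * labels[i] * predictions[i]);
        total += weights[i];
    }
    for (double &w : weights) w /= total; // renormalize to sum to one
    return alpha;
}
```

After this update the misclassified examples carry half of the total weight, so the next stump is pushed to focus on them.
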

Positive examples are provided as a supplementary material, indexed by a variable called
blacklisted as false negative by all the previous cascade layers. In the absence of such a cascade, all provided positives are used.

sampleNegatives() takes negative examples from a large image pool made up of grayscale images without human faces, in addition to the negatives (or false positives) left by the previous layers.

The training and validation examples are prepared differently:

All the positive training examples are assigned equal weight, positiveTotalWeight/nPositives, at the outset. Similarly, each negative training example is assigned the weight (1-positiveTotalWeight)/nNegatives.

Sorting is carried out because a decision stump can then be built in linear time.

Since there is no need to run Adaboost on the validation pool, to save memory, only their integral images are calculated to facilitate the generalization error estimation at a later stage.
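
The remark that sorting allows a decision stump to be built in linear time can be illustrated with the sketch below, which sweeps a threshold once over the sorted values of a single feature while updating the weighted error incrementally. The `StumpChoice` type and the function name `bestStumpOnFeature` are hypothetical, and the stump convention (predict a face when the value is at or above the threshold, or the reverse) is an assumption rather than the convention of bestStump().

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct StumpChoice {
    double threshold;
    int toggle;   // +1: predict face when value >= threshold; -1: the reverse
    double error; // weighted training error of this choice
};

// sorted: (feature value, example index) pairs in ascending order of value.
// labels are in {-1, +1}; weights are the current example weights.
StumpChoice bestStumpOnFeature(const std::vector<std::pair<double, std::size_t>> &sorted,
                               const std::vector<int> &labels,
                               const std::vector<double> &weights) {
    assert(!sorted.empty() && labels.size() == weights.size());

    // Threshold at the smallest value: toggle +1 predicts "face" everywhere,
    // so its error is the total weight of the negatives.
    double total = 0.0, errorPlus = 0.0;
    for (const auto &p : sorted) {
        total += weights[p.second];
        if (labels[p.second] < 0) errorPlus += weights[p.second];
    }
    StumpChoice best{sorted.front().first, +1, errorPlus};
    if (total - errorPlus < best.error) best = {sorted.front().first, -1, total - errorPlus};

    for (std::size_t k = 0; k < sorted.size(); ++k) {
        const std::size_t idx = sorted[k].second;
        // Example k moves below the threshold and is now predicted -1 by toggle +1.
        errorPlus += (labels[idx] > 0) ? weights[idx] : -weights[idx];
        // A threshold cannot separate equal values, so skip cuts between ties.
        if (k + 1 < sorted.size() && sorted[k + 1].first == sorted[k].first) continue;
        const double threshold = (k + 1 < sorted.size())
            ? 0.5 * (sorted[k].first + sorted[k + 1].first)
            : sorted[k].first + 1.0; // above every value
        if (errorPlus < best.error) best = {threshold, +1, errorPlus};
        if (total - errorPlus < best.error) best = {threshold, -1, total - errorPlus};
    }
    return best;
}
```

Each example is visited a constant number of times, so the cost per feature is linear once the values are sorted. Since the feature values of the training examples do not change between rounds, the sort only needs to be done once.
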

sampleNegatives(), TrainExamples.cpp:

previous layers. If there are not enough, negativeImagePaths is read in, from which we expect to collect new negative examples so that the total number of false positives gets to nNegatives.

tscan() is hence called to examine these negative images. If a false positive window larger than the example size is found, zoomOutNegative() tries to reduce its size so as to make it an acceptable example. If this shrunk version turns out to be a true negative, the larger window is rejected and the search continues.

To avoid repeated detections on images no longer able to provide good negatives, the entries of such images are deleted from negativeImagePaths. This operation is performed in the routine's final loop.

=====================

This source code implements the multi-layer cascade designed to reduce the false positive rate in an iterative fashion.

train(), boostUtil.cpp:

The inner loop focuses on constructing a new layer whose false positive rate should be maintained below a targeted level, subject to a constraint on the layer committee's size. If its committee can no longer grow, the false positive rate is then allowed to rise without sacrificing the false negative rate too much. The highlight of this part is the classifier shift selection. See the accompanying IPOL article for more details.

The outer loop records the constructed layers and creates Detector.cpp.

* `VectorXf readInCascade(vector< stumpRule > *&cascade)`: cascade
* `vector< pair< float, int > > * writeOrganizedFeatures(int featureCount, int sampleCount, RowVectorXf *&featureVectors)`: sort every feature and write them out in ascending order
* `void tscan(const char *file, int &nRows, int &nCols, int defaultLayerNumber, vector< stumpRule > *cascade, VectorXf &tweaks, vector< square > &toMark)`: scan the whole image using a cascade
* `void imread(const char *fileName, int &nRows, int &nCols, int &nChannels, MatrixXf *&image, bool outputGray)`: read in fileName as it is or gray
* `void highlight(const char *inputName, vector< square > &areas, int PPMode, float nFriends)`: highlight the faces so that we can see them
* `void computeHaarLikeFeatures(MatrixXf &image, VectorXf *&features, const char *toFile, bool enforceShape, bool inTrain)`: compute Haar-like features with integral image
* `double computeFeature(int featureIndex, square const &area, MatrixXld &integralImage, bool removeMean)`: compute the feature from an image subwindow
* `bool zoomOutNegative(MatrixXf *&image, int shrinkedSize, int defaultLayerNumber, vector< stumpRule > *cascade, VectorXf &tweaks)`: try to make an example out of a false positive window
* `void buildIntegralImage(MatrixXf &image, MatrixXld &integralImage)`: integral image in linear time
* `void scan(const char *file, int defaultLayerNumber, float nFriends)`: detect faces on an image
* `bool detectFace(square const &area, MatrixXld &integralImage, double varianceNormalizer, VectorXf &tweaks, vector< stumpRule > const *cascade, int defaultLayerNumber)`: detect face in an image subwindow
* `int main(int argc, char *const *argv)`
* `int readImagesFromPathFile(const char *pathFile, MatrixXf **&images, VectorXi *blackList, int sign)`: read in images in the pathFile
* `void train(trainParams const &target)`: train a cascade, see the definition of trainParams