Picture of face detection, from the OpenCV website…
This post is the first in a series of articles on training and using OpenCV HAAR classifiers, as implemented in a GSoC 2012 project. It presents no complex algorithms; instead, it focuses on the requirements and techniques for successful classifier training with OpenCV.
Samples used in OpenCV face detection
OpenCV’s face detector was trained on 3,000 negative pictures and 5,000 positive frontal-face pictures, and it achieves good accuracy. The 5,000 positives came from 1,000 individuals. Setting aside how difficult it was to collect so many pictures (which I will talk about in another post), this number is believed to be enough for satisfactory results because the most common features are covered, such as the dark eyes, the bright nose and the relative positions of these parts. Individual features, such as eye colour, nose height and cheek width, are less worth training on.
Face samples for download, as mentioned in Naotoshi’s tutorial, are listed at http://note.sonots.com/SciSoftware/haartraining.html#x15ebd98.
How these samples were used
OpenCV never documented how these samples were used. As OpenCV only provides the XML files, the only thing that is clear is how many stages were used in training with these samples. Neither the cropping of the pictures nor the final picture sizes were provided. As far as sample size is concerned,
Kuranov et al. state that a 20×20 sample size achieved the highest hit rate. Furthermore, they state that “For 18×18 four split nodes performed best, while for 20×20 two nodes were slightly better.” Thus, -w 20 -h 20 would be good.
How the picture cropping affects the quality of training will be discussed in a later post.
Method of training
To save some words, for the detailed sample-training procedure and an introduction to the parameters, please refer to Naotoshi’s tutorial. Here I introduce the .bat files used in my training. These files save a good deal of time otherwise spent typing command lines.
opencv_createsamples.exe -info ./contract_palm4/filelist.dat -vec ./contract_palm4/output.vec -num 7430 -bg ./neg3/filelist.dat -bgcolor 0 -w 20 -h 20
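The two filelist.dat files passed to the command above are plain-text lists. As a rough sketch, assuming the usual layout (a positive description file for -info has one line per image with the path, the number of objects, and an x y width height box for each object; a background list for -bg has one image path per line), they could be generated like this. The helper names and the example image paths are made up for illustration; paths are typically written relative to the list file.

```python
from typing import Iterable, Tuple

Box = Tuple[int, int, int, int]  # x, y, width, height in pixels


def info_line(image_path: str, boxes: Iterable[Box]) -> str:
    """Format one line of a positive description file (the -info argument)."""
    boxes = list(boxes)
    coords = " ".join(f"{x} {y} {w} {h}" for x, y, w, h in boxes)
    return f"{image_path} {len(boxes)} {coords}"


def bg_lines(image_paths: Iterable[str]) -> str:
    """Format a background list (the -bg argument): one image path per line."""
    return "\n".join(image_paths) + "\n"


# Hypothetical file names, only to show the resulting text.
print(info_line("img/palm_0001.jpg", [(140, 100, 45, 45)]))
print(bg_lines(["img/neg_0001.jpg", "img/neg_0002.jpg"]), end="")
```

Writing the returned strings to ./contract_palm4/filelist.dat and ./neg3/filelist.dat would give files in the shape the command expects, provided the boxes really mark the objects in each picture.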
opencv_haartraining.exe -data … -vec ./contract_palm4/output.vec -bg ./neg3/filelist.dat -nstages 20 -nsplits 2 -minhitrate 0.999 -maxfalsealarm 0.0005 -weighttrimming 0.95 -npos 7430 -nneg 4300 -w 20 -h 20 -mem 1024 -nonsym -mode ALL
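The -minhitrate and -maxfalsealarm values apply to each stage, not to the whole cascade. As a back-of-the-envelope sketch, assuming every stage exactly meets its target and the stages behave independently (a simplification), the overall rates for the settings above can be estimated by raising the per-stage values to the number of stages. The function name is mine, not part of OpenCV.

```python
def cascade_rates(nstages: int, minhitrate: float, maxfalsealarm: float):
    """Estimate whole-cascade rates from per-stage targets.

    Assumes each stage exactly hits its target and stages are independent,
    so the per-stage rates simply multiply across stages.
    """
    overall_hit = minhitrate ** nstages
    overall_false_alarm = maxfalsealarm ** nstages
    return overall_hit, overall_false_alarm


# Values taken from the training command: -nstages 20 -minhitrate 0.999 -maxfalsealarm 0.0005
hit, false_alarm = cascade_rates(20, 0.999, 0.0005)
print(f"estimated overall hit rate:    {hit:.4f}")
print(f"estimated overall false alarm: {false_alarm:.2e}")
```

Under these assumptions the cascade keeps roughly 98% of the positives, while the false alarm target is far stricter than the tool's default per-stage value of 0.5 would give, so each stage has much more work to do and training can take a long time.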
Finally, cascadeconvert.bat (opencv_cascadeconvert.exe is not included in the OpenCV installation):