Most developers are familiar with the popular APIs and development products provided by Google, such as the Google Maps API or Google Analytics. However, there are a lot of services that are far less well known, yet very helpful in specific cases, and it is definitely worth being aware of them. In this article I want to describe one of them: the face detection and tracking API, available as part of the Mobile Vision APIs.
Before diving deep into the subject, it might be a good idea to explain the difference between face detection, tracking and recognition. Detection is the analysis of a static image in order to find the position of a face in it, along with some more specific information, such as how much the detected person is smiling. Tracking is in fact almost the same as detection, only extended to video input. Face recognition, however, is a totally different beast: it describes the process of determining whether a face matches a previously saved pattern (most commonly another face). Unfortunately, this feature is currently unavailable in the API provided by Google.
To get started, we need to set up Google Play services and add the Mobile Vision dependency (the com.google.android.gms:play-services-vision artifact) to our Gradle build file.
The basic flow of face detection on a given image is very simple. First of all, we need to create a FaceDetector instance using the provided FaceDetector.Builder class, which accepts various configuration options, such as the minimal proportional size of a detected face or whether the level of smile should be classified. For example,
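FaceDetector detector = new FaceDetector.Builder(context)
        .setLandmarkType(FaceDetector.ALL_LANDMARKS)
        .setMode(FaceDetector.ACCURATE_MODE)
        .setProminentFaceOnly(true)
        .build();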
gives us a detector capable of finding specific points of interest on a detected face, such as the base of the nose or the left eye, performing accurate detection and searching only for the largest face in the image. It’s worth noting that turning on unnecessary features can have a serious impact on performance, so we should think about which features in our app are... well, not necessary.
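Of the builder options mentioned earlier, the minimal proportional face size is probably the least self-explanatory. As I understand the documentation of setMinFaceSize(), the value is the ratio of the head width to the image width, so a larger value lets the detector skip small faces and work faster. The arithmetic can be sketched in plain Java; the MinFaceSize class and its method are hypothetical names of mine, not part of the API.

```java
/** Hypothetical helper illustrating what a proportional minimum face size means. */
final class MinFaceSize {

    private MinFaceSize() {}

    /**
     * Approximate width, in pixels, of the smallest face the detector is asked
     * to look for, assuming the proportion is measured against the image width.
     */
    static int minFaceWidthPx(int imageWidthPx, float proportionalMinFaceSize) {
        return Math.round(imageWidthPx * proportionalMinFaceSize);
    }
}
```

For a 1920 pixel wide photo and a proportion of 0.1f, this gives a face roughly 192 pixels wide as the lower bound.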
Before we use our detector, we need to check whether it’s operational by calling its isOperational() method. When the Face API is used for the first time, the required library has to be installed, which might take some time, and until that finishes the detector simply won’t work.
At this point we have something that we can detect with, but we still need an object to detect from. For this task the API provides a special Frame class. To create such an object, we once again have to use the provided builder:
Frame frame = new Frame.Builder().setBitmap(bitmap).build();
Then, all we need to do is to get a list of detected Face objects:
SparseArray<Face> faces = detector.detect(frame);
Depending on the detector configuration, each returned Face object will carry a list of Landmark objects, a smiling probability, the probabilities that the left and right eyes are open, and the rotation angles of the head (or of whatever object the face has been detected on).
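These probabilities need a little care when read back. As far as I know, getIsSmilingProbability() and the two eye-open getters return a float between 0 and 1, or Face.UNCOMPUTED_PROBABILITY (-1.0f) when the classification was not enabled or could not be computed. Below is a minimal sketch of how such values could be interpreted. It works on plain floats so that it runs outside Android, and the class name, the labels and the 0.5 threshold are my own assumptions, not something the API prescribes.

```java
/** Hypothetical helper: turns Face API probabilities into something readable. */
final class ProbabilityLabels {

    // Mirrors Face.UNCOMPUTED_PROBABILITY from the Mobile Vision API.
    static final float UNCOMPUTED = -1.0f;

    // Assumed cut-off; tune it for your own use case.
    static final float THRESHOLD = 0.5f;

    private ProbabilityLabels() {}

    /** A value is usable only if it is not the "uncomputed" sentinel. */
    static boolean isComputed(float probability) {
        return probability != UNCOMPUTED;
    }

    /** Returns "unknown" when the smile classification was not computed. */
    static String describeSmile(float smilingProbability) {
        if (!isComputed(smilingProbability)) {
            return "unknown";
        }
        return smilingProbability >= THRESHOLD ? "smiling" : "not smiling";
    }

    /** True only if both eye probabilities were computed and reach the threshold. */
    static boolean bothEyesOpen(float leftOpenProbability, float rightOpenProbability) {
        return isComputed(leftOpenProbability)
                && isComputed(rightOpenProbability)
                && leftOpenProbability >= THRESHOLD
                && rightOpenProbability >= THRESHOLD;
    }
}
```

In an app, this would be called with the values read from a detected face, for example ProbabilityLabels.describeSmile(face.getIsSmilingProbability()).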
After finishing our work with the detector, we have to call its release() method, as FaceDetector uses native resources which have to be disposed of manually.
Face tracking is a little bit more complex, as it requires a constant video input. Google has provided some code samples to simplify development, which are available here: https://github.com/googlesamples/android-vision.
First of all, we need to create our FaceDetector, practically in the same way as in the previous example. Then we need to create a Processor, which will create Tracker instances capable of providing constant updates about the faces in the video. For the whole thing to work, the processor has to be assigned to the detector with a call to the detector.setProcessor() method.
However, we still don’t have a source of video. To fix that, we need to create an appropriate CameraSource for our detector, with code similar to
mCameraSource = new CameraSource.Builder(context, detector).build();
This will allow us to track faces from the camera. However, we still don’t have any preview. The FaceTracker and multi-tracker projects on GitHub contain a CameraSourcePreview class, which renders video from a CameraSource on a SurfaceView and also makes it possible to add our own overlay, drawn on top of the rendered view.
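Drawing on such an overlay involves one detail that is easy to miss: face positions are reported in the coordinates of the camera preview frame, not of the view on the screen, and the picture from the front camera is displayed mirrored. As far as I can tell, the GraphicOverlay class from the samples handles this in a similar way. The sketch below only illustrates the idea in plain Java: the OverlayMapper class and its methods are hypothetical, and it assumes the preview is stretched to fill the whole view.

```java
/** Hypothetical helper mapping preview-frame coordinates to view coordinates. */
final class OverlayMapper {

    private final float widthScale;
    private final float heightScale;
    private final int viewWidth;
    private final boolean frontFacing;

    OverlayMapper(int previewWidth, int previewHeight,
                  int viewWidth, int viewHeight, boolean frontFacing) {
        // How much the preview frame is stretched to fill the view.
        this.widthScale = (float) viewWidth / previewWidth;
        this.heightScale = (float) viewHeight / previewHeight;
        this.viewWidth = viewWidth;
        this.frontFacing = frontFacing;
    }

    /** Scales x and, for the front camera, flips it horizontally. */
    float toViewX(float previewX) {
        float scaled = previewX * widthScale;
        return frontFacing ? viewWidth - scaled : scaled;
    }

    /** Scales y; no flipping is needed vertically. */
    float toViewY(float previewY) {
        return previewY * heightScale;
    }
}
```

With a 640x480 preview shown in a 1280x960 view, a face at x = 100 ends up at x = 200 for the back camera and at x = 1080 for the front one.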
To properly grasp the whole concept, I strongly advise you to take a look at the GitHub projects, because at first all that theory might seem a little (over)complicated.
Of course, Mobile Vision isn’t the only existing solution to the face detection and tracking problem. For example, there is the OpenCV library, although it’s hard to learn and even harder to master. However, when simple landmark and smile detection isn’t enough, it might be a good idea to give it a try.