Implementing Face Detection with Google's Vision Library

Original post: Implementing Face Detection with Google Vision

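Preface

I previously wrote an article on face detection built on OpenCV's CascadeClassifier (Implementing Face Detection with OpenCV). This time, let's look at how to do face detection with Google's vision library.

When I first did face detection with Google's vision tooling, I used Google Mobile Vision. While preparing this post, I found a notice on the Mobile Vision site saying that the library is no longer maintained and has moved into Firebase's ML Kit library. So I went over to the Firebase site to take a look. Fortunately, the usage pattern is much the same as before.

Since Google no longer plans to maintain Mobile Vision, let's see how the vision part of Firebase's ML library is used.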

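Face detection overview

With the ML Kit face detection API we can detect faces in images and video frames and locate the positions of facial landmarks. With those positions we can build some fun features. For example, once the eyes are detected, we can apply special effects to them, like this:

This effect draws a big-eyed cartoon sticker at the detected eye positions. When the eyes open and close, the sticker opens and closes too, because the ML Kit API can also tell us whether each eye is open or closed.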

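Core features

Recognize and locate facial features: get the coordinates of the eyes, nose, mouth, cheeks, and so on.

Recognize facial expressions: detect whether a person is smiling and whether their eyes are open.

Track faces: each detected face is assigned an ID that stays the same across frames (relatively speaking: if the face leaves the screen and comes back, the ID changes). We can use this ID to track a specific face.

Process video frames in real time: self-explanatory, faces can be detected in real time.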
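To make the expression feature more concrete, here is a minimal sketch of how the eye-open probability could drive something like the sticker effect. The class name, the enum, and the 0.5 threshold are my own assumptions for illustration, not part of ML Kit. The sentinel value mirrors FirebaseVisionFace.UNCOMPUTED_PROBABILITY, which I believe is -1.0f; check the constant in the version you use.

```java
/** Hypothetical helper (not an ML Kit class): turns an eye-open probability into a state. */
final class EyeState {
    // Assumed to match FirebaseVisionFace.UNCOMPUTED_PROBABILITY; copied here so the
    // sketch compiles without the Firebase dependency.
    static final float UNCOMPUTED_PROBABILITY = -1.0f;

    // Assumed cut-off between "open" and "closed"; tune it for your own effect.
    static final float OPEN_THRESHOLD = 0.5f;

    enum State { OPEN, CLOSED, UNKNOWN }

    /** @param openProbability value in 0.0-1.0, or the sentinel when classification is off */
    static State classify(float openProbability) {
        if (openProbability == UNCOMPUTED_PROBABILITY) {
            return State.UNKNOWN; // classification disabled or not computed for this face
        }
        return openProbability >= OPEN_THRESHOLD ? State.OPEN : State.CLOSED;
    }
}
```

In a real app the argument would come from face.getLeftEyeOpenProbability() or face.getRightEyeOpenProbability(), with classification enabled in the detector options.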

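Note: Firebase's ML Kit is currently still in beta, so for now Mobile Vision is the more dependable choice.

Preparation

To use Firebase's ML Kit library for face detection, we first need to register our app with Firebase. There are two ways to add Firebase to a project: the convenient automatic way, using the Firebase Assistant, and the manual way. Let's start with the manual way. Go to the Firebase console and add a new project, as in the screen below:

After the project is created, Firebase generates a JSON configuration file (google-services.json) based on the package name and other information you entered. Download this file and put it in the module-level directory. Then, following the on-screen instructions, add the required libraries to the corresponding gradle files. The Firebase Assistant makes it easy to add Firebase to your project. It requires Android Studio 2.2 or later and is found under the Tools menu:

It automates the manual steps described above. For reasons of space I won't go through the steps in detail; see "Add Firebase to your project".

Once Firebase has been added to the project, add the ML Kit library to the app-level build.gradle file: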

dependencies {
  // ...

  implementation 'com.google.firebase:firebase-ml-vision:17.0.0'
}

Next, add the following to AndroidManifest.xml:

<application ...>
  ...
  <meta-data
      android:name="com.google.firebase.ml.vision.DEPENDENCIES"
      android:value="face" />
  <!-- To use multiple models: android:value="face,model2,model3" -->
</application>

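Note that ML Kit offers more than the face detection API. It also supports text recognition, barcode detection, and image labeling. I have tried barcode detection and it works well, with fairly high accuracy. Set android:value to the models you want to use. None of these APIs is hard to use, so if you are interested you can explore them on your own; I won't cover them here.

Implementation

Let's first look at the result:

1. First comes the usual camera work: obtaining the camera preview data. Because we want to detect faces in the preview in real time, we need the preview data. How to develop against the camera and obtain preview data is not covered here.

2. Once we have the camera preview data, we need to process it.

2.1 Before running detection, if you don't want the face detector's default configuration, you can configure it to suit your needs. This is done through the FirebaseVisionFaceDetectorOptions class, which holds the face detector's settings. The configurable settings are the detection mode, landmark detection, classification, the minimum face size, and face tracking.

Sample code: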

private void init() {
    previewWidth = ((MainActivity) mContext).getPreviewWidth();
    previewHeight = ((MainActivity) mContext).getPreviewHeight();
    // One NV21 frame takes width * height * 3 / 2 bytes
    paddingBuffer = new byte[previewWidth * previewHeight * 3 / 2];
    options = new FirebaseVisionFaceDetectorOptions.Builder()
            // Enable tracking; a unique ID per face is only available when tracking is on
            .setTrackingEnabled(true)
            // Detection mode: FAST_MODE or ACCURATE_MODE
            // FAST_MODE is faster but less accurate; ACCURATE_MODE is more accurate but slower
            .setModeType(FirebaseVisionFaceDetectorOptions.ACCURATE_MODE)
            // Minimum size of faces to detect, relative to the image
            .setMinFaceSize(0.15f)
            // Whether to detect facial landmarks such as the eyes, mouth, and ears
            // NO_LANDMARKS disables detection; ALL_LANDMARKS enables it
            .setLandmarkType(FirebaseVisionFaceDetectorOptions.NO_LANDMARKS)
            // Whether to run the classifier. When enabled, the detector reports the
            // probability (a float from 0.0 to 1.0) that the face is smiling and that
            // each eye is open; the larger the value, the more likely
            .setClassificationType(FirebaseVisionFaceDetectorOptions.NO_CLASSIFICATIONS)
            .build();

    metadata = new FirebaseVisionImageMetadata.Builder()
            // Image format of the preview data
            .setFormat(FirebaseVisionImageMetadata.IMAGE_FORMAT_NV21)
            .setWidth(previewWidth).setHeight(previewHeight)
            // Rotation constant 0-3 (degrees / 90); CameraApi is this project's own helper
            .setRotation(CameraApi.getInstance().getRotation() / 90)
            .build();

    // Get the face detector
    faceDetector = FirebaseVision.getInstance().getVisionFaceDetector(options);
}

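The init method above shows how FirebaseVisionFaceDetectorOptions is configured. (Note: landmarks and classification are turned off here, so the GIF above does not show facial landmarks or smile detection. A GIF with everything enabled appears later.) Each line is commented, and together with the settings described above the code should be easy to follow, so I won't go into more detail.

2.2 With FirebaseVisionFaceDetectorOptions configured, we use the options to create a FirebaseVisionFaceDetector: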
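The width * height * 3 / 2 buffer size in init comes from the NV21 layout: a full-resolution Y plane followed by an interleaved VU plane with one V and one U sample per 2x2 block of pixels. The small sketch below spells that arithmetic out. The class and method names are mine, and it assumes even dimensions, which camera preview sizes normally have.

```java
/** Hypothetical helper: computes the byte size of one NV21 preview frame. */
final class Nv21FrameSize {
    static int of(int width, int height) {
        if (width <= 0 || height <= 0 || width % 2 != 0 || height % 2 != 0) {
            throw new IllegalArgumentException("width and height must be positive and even");
        }
        int ySize = width * height; // one luma byte per pixel
        int vuSize = ySize / 2;     // one V and one U byte per 2x2 pixel block
        return ySize + vuSize;      // same as width * height * 3 / 2
    }
}
```

For a 1280x720 preview, Nv21FrameSize.of(1280, 720) gives 1382400 bytes.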

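// Get the face detector
faceDetector = FirebaseVision.getInstance().getVisionFaceDetector(options);

As the class name suggests, FirebaseVisionFaceDetector is a face detector. Let's look at its methods to see how it performs detection.

The detector has only a few methods. One is close, which releases resources; the other is the core face detection method we care about: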

Task<List<FirebaseVisionFace>>	
        detectInImage(FirebaseVisionImage image)
        //Detects human faces from the supplied image.

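This detection method takes a FirebaseVisionImage object, so let's see how a FirebaseVisionImage is created.

2.3 There are several ways to create a FirebaseVisionImage, each suited to a different scenario.

   1.  From a Bitmap object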

FirebaseVisionImage image = FirebaseVisionImage.fromBitmap(bitmap);

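Pay attention to the bitmap's orientation here: the image must be upright.

   2.  From a media.Image object
   For example, image data captured from the camera. Here too we need to pay attention to orientation, which is passed in as the rotation argument.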

FirebaseVisionImage image = FirebaseVisionImage.fromMediaImage(mediaImage, rotation);

   3.  From a ByteBuffer object (the approach used in this article)
 To create a FirebaseVisionImage this way, we also need a FirebaseVisionImageMetadata object that describes the data, as in the code below.

FirebaseVisionImageMetadata metadata = new FirebaseVisionImageMetadata.Builder()
        .setWidth(1280)
        .setHeight(720)
        .setFormat(FirebaseVisionImageMetadata.IMAGE_FORMAT_NV21)
        .setRotation(rotation)
        .build();

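The code above is easy to follow, so I will only make two points.

The first is the format. FirebaseVision currently supports only two formats:

IMAGE_FORMAT_NV21, which corresponds to NV21
IMAGE_FORMAT_YV12, which corresponds to YV12

Google's Mobile Vision library, by contrast, supports three formats: NV21, YV12, and NV16.

The second is the rotation value. According to the official documentation, it is obtained when setting the camera's displayOrientation. The key code is as follows: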

private void setRotation(Camera camera, Camera.Parameters parameters, int cameraId) {
    WindowManager windowManager = (WindowManager) activity.getSystemService(Context.WINDOW_SERVICE);
    int degrees = 0;
    int rotation = windowManager.getDefaultDisplay().getRotation();
    switch (rotation) {
      case Surface.ROTATION_0:
        degrees = 0;
        break;
      case Surface.ROTATION_90:
        degrees = 90;
        break;
      case Surface.ROTATION_180:
        degrees = 180;
        break;
      case Surface.ROTATION_270:
        degrees = 270;
        break;
      default:
        Log.e(TAG, "Bad rotation value: " + rotation);
    }

    CameraInfo cameraInfo = new CameraInfo();
    Camera.getCameraInfo(cameraId, cameraInfo);

    int angle;
    int displayAngle;
    if (cameraInfo.facing == Camera.CameraInfo.CAMERA_FACING_FRONT) {
      angle = (cameraInfo.orientation + degrees) % 360;
      displayAngle = (360 - angle) % 360; // compensate for it being mirrored
    } else { // back-facing
      angle = (cameraInfo.orientation - degrees + 360) % 360;
      displayAngle = angle;
    }

    // This corresponds to the rotation constants.
    this.rotation = angle / 90;

    camera.setDisplayOrientation(displayAngle);
    parameters.setRotation(angle);
  }

this.rotation is the value we need.

Once the FirebaseVisionImageMetadata is built, we can create the FirebaseVisionImage as follows:

FirebaseVisionImage image = FirebaseVisionImage.fromByteBuffer(buffer, metadata);
// Or: FirebaseVisionImage image = FirebaseVisionImage.fromByteArray(byteArray, metadata);

   4.  From a file

FirebaseVisionImage image;
try {
    image = FirebaseVisionImage.fromFilePath(context, uri);
} catch (IOException e) {
    e.printStackTrace();
}
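Going back to the rotation value used in option 3: the arithmetic inside setRotation can be pulled out into pure functions, which makes it easy to check on paper for a given device. This is only a sketch that restates the same formulas without the Android classes. The class and method names are mine, and sensorOrientation stands in for CameraInfo.orientation.

```java
/** Hypothetical helper restating the rotation arithmetic from setRotation. */
final class RotationMath {
    /** Angle in degrees by which the camera image must be rotated to appear upright. */
    static int imageAngle(int sensorOrientation, int displayDegrees, boolean frontFacing) {
        return frontFacing
                ? (sensorOrientation + displayDegrees) % 360
                : (sensorOrientation - displayDegrees + 360) % 360;
    }

    /** Rotation constant 0-3 (angle / 90) passed to FirebaseVisionImageMetadata. */
    static int imageRotation(int sensorOrientation, int displayDegrees, boolean frontFacing) {
        return imageAngle(sensorOrientation, displayDegrees, frontFacing) / 90;
    }

    /** Value for Camera.setDisplayOrientation; the front camera preview is mirrored. */
    static int displayOrientation(int sensorOrientation, int displayDegrees, boolean frontFacing) {
        int angle = imageAngle(sensorOrientation, displayDegrees, frontFacing);
        return frontFacing ? (360 - angle) % 360 : angle;
    }
}
```

For example, a back camera with a sensor orientation of 90 on a device held in portrait (display rotation 0) gives an image rotation of 1, while a front camera with a sensor orientation of 270 gives 3.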

2.4 Now that we have both a FirebaseVisionImage and a FirebaseVisionFaceDetector, we can detect faces.

Task<List<FirebaseVisionFace>> result =
        detector.detectInImage(image)
                .addOnSuccessListener(
                        new OnSuccessListener<List<FirebaseVisionFace>>() {
                            @Override
                            public void onSuccess(List<FirebaseVisionFace> faces) {
                                // Task completed successfully
                                // ...
                            }
                        })
                .addOnFailureListener(
                        new OnFailureListener() {
                            @Override
                            public void onFailure(@NonNull Exception e) {
                                // Task failed with an exception
                                // ...
                            }
                        });

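With the code above, the detection results arrive in the OnSuccessListener callback. Each result is wrapped in a FirebaseVisionFace object, from which we can get the information we want, such as the position of the face, the positions of facial landmarks (eyes, nose, ears, and so on), and whether the eyes are open or the person is smiling: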

for (FirebaseVisionFace face : faces) {
    Rect bounds = face.getBoundingBox();
    float rotY = face.getHeadEulerAngleY();  // Head is rotated to the right rotY degrees
    float rotZ = face.getHeadEulerAngleZ();  // Head is tilted sideways rotZ degrees

    // If landmark detection was enabled (mouth, ears, eyes, cheeks, and
    // nose available):
    FirebaseVisionFaceLandmark leftEar = face.getLandmark(FirebaseVisionFaceLandmark.LEFT_EAR);
    if (leftEar != null) {
        FirebaseVisionPoint leftEarPos = leftEar.getPosition();
    }

    // If classification was enabled:
    if (face.getSmilingProbability() != FirebaseVisionFace.UNCOMPUTED_PROBABILITY) {
        float smileProb = face.getSmilingProbability();
    }
    if (face.getRightEyeOpenProbability() != FirebaseVisionFace.UNCOMPUTED_PROBABILITY) {
        float rightEyeOpenProb = face.getRightEyeOpenProbability();
    }

    // If face tracking was enabled:
    if (face.getTrackingId() != FirebaseVisionFace.INVALID_ID) {
        int id = face.getTrackingId();
    }
}

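With the code above we can obtain the position information we want and then draw it on the device through a View.

That is the overall approach. Finally, here is one more demo of the result:

Additional notes

1. Face detection is time-consuming, so in real-time detection take care not to hurt the frame rate, and keep CPU usage down. There are a few ways to do this. We can reduce the image size by downscaling the frame. We can also turn off anything we don't need when configuring the options: for example, if landmark information isn't needed, set it to FirebaseVisionFaceDetectorOptions.NO_LANDMARKS, and set the mode to FAST_MODE to speed up detection.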
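Drawing the result usually needs one extra step, because the bounding box is expressed in image pixels while the overlay View has its own size. The sketch below shows one way that mapping could look. It assumes the preview fills the view without cropping, that imageWidth and imageHeight are the frame dimensions after rotation (so the image is upright), and that the front camera preview is mirrored horizontally. The class name and the array-based box are my own simplifications; on Android you would work with the Rect from getBoundingBox().

```java
/** Hypothetical helper: maps a face box from image pixels to overlay-view pixels. */
final class OverlayMapper {
    /**
     * @param box    {left, top, right, bottom} in image pixels
     * @param mirror true for a mirrored (front camera) preview
     * @return {left, top, right, bottom} in view pixels
     */
    static float[] toView(int[] box, int imageWidth, int imageHeight,
                          int viewWidth, int viewHeight, boolean mirror) {
        float scaleX = (float) viewWidth / imageWidth;
        float scaleY = (float) viewHeight / imageHeight;
        float left = box[0] * scaleX;
        float right = box[2] * scaleX;
        if (mirror) {
            // Flip around the vertical centre line; left and right swap roles.
            float flippedLeft = viewWidth - right;
            float flippedRight = viewWidth - left;
            left = flippedLeft;
            right = flippedRight;
        }
        return new float[] {left, box[1] * scaleY, right, box[3] * scaleY};
    }
}
```

The returned values can be passed to Canvas.drawRect in the overlay's onDraw.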

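2. I said above that Firebase's ML Kit is currently in beta and recommended Mobile Vision. That isn't the whole story, though: Mobile Vision has its own inconvenience, namely that a native library has to be downloaded before it can be used. See the official Mobile Vision note below: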

// Note: The first time that an app using face API is installed on a device, GMS will download a native library to the device in order to do detection. Usually this completes before the app is run for the first time. But if that download has not yet completed, then the above call will not detect any faces. isOperational() can be used to check if the required native library is currently available. The detector will automatically become operational once the library download completes on device.

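In other words, the first time we use Google's Mobile Vision library, a native library has to be downloaded through Google Play services (GMS) before the vision library can be used. (Note: there is also a version requirement for Google Play services. If the version is too old, the native library cannot be downloaded, so we need to call the appropriate method to check the version and, if it is too old, prompt an upgrade.) This is a real limitation, so choose according to your own needs.

3. OpenCV can also detect facial features.

4. I will later write an article on object tracking with OpenCV.

GoogleFaceDetectDemo project repository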
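As an illustration of the first note (shrinking the image before detection), here is a rough sketch that halves an NV21 frame in both dimensions by keeping every second sample. It is plain subsampling rather than proper filtering, the class name is my own, and it assumes the width and height are multiples of 4 so the output still has even dimensions.

```java
/** Hypothetical helper: halves an NV21 frame by dropping every second sample. */
final class Nv21Downscaler {
    static byte[] halve(byte[] src, int width, int height) {
        if (width % 4 != 0 || height % 4 != 0) {
            throw new IllegalArgumentException("width and height must be multiples of 4");
        }
        if (src.length < width * height * 3 / 2) {
            throw new IllegalArgumentException("buffer too small for an NV21 frame");
        }
        int outW = width / 2;
        int outH = height / 2;
        byte[] dst = new byte[outW * outH * 3 / 2];
        int d = 0;
        // Y plane: keep every second pixel of every second row.
        for (int y = 0; y < height; y += 2) {
            int row = y * width;
            for (int x = 0; x < width; x += 2) {
                dst[d++] = src[row + x];
            }
        }
        // VU plane: height / 2 rows of interleaved V,U pairs; keep every second
        // row and every second pair.
        int vuStart = width * height;
        for (int y = 0; y < height / 2; y += 2) {
            int row = vuStart + y * width;
            for (int x = 0; x < width; x += 4) {
                dst[d++] = src[row + x];     // V
                dst[d++] = src[row + x + 1]; // U
            }
        }
        return dst;
    }
}
```

If you detect on the halved frame, remember to pass the halved width and height to FirebaseVisionImageMetadata and to multiply the returned coordinates by 2 before drawing.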
