This post describes how to do face detection on the K230.
1. Understanding the AI vision development framework
The K230's KPU is an on-chip neural network processor. It performs convolutional neural network computation at low power, obtains the size, coordinates and class of detected targets in real time, can detect and classify faces or objects, and supports INT8 and INT16. The CanMV team has built a dedicated AI vision development framework around the K230. The framework structure is shown in the figure below:
In short, the Sensor (camera) outputs two image streams by default: one in YUV420 format that goes directly to Display, and one in RGB888 format that is fed to the AI part for processing. The AI part carries out the pre-processing, inference and post-processing of the task; the post-processing result is drawn onto an osd image instance, which is sent to Display for overlay, and the recognition result is finally shown on the HDMI, LCD or IDE frame buffer.
The main API interfaces of the AI vision development framework are:
●PipeLine: wraps the sensor and display into a fixed interface, used for image capture, drawing, and displaying result images.
●AI2D: interfaces related to pre-processing (Preprocess).
●AIBase: the main interface for model inference (typically subclassed, as sketched below).
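To make the division of labor concrete, here is a minimal sketch of how the three pieces compose. It mirrors the structure of the full face detection example later in this post; the model path, the shapes passed to build, and the pass-through postprocess are placeholders only.

from libs.PipeLine import PipeLine
from libs.AIBase import AIBase
from libs.AI2D import Ai2d
import nncase_runtime as nn
import ulab.numpy as np
import gc

class MyApp(AIBase):                                    # application-specific model wrapper
    def __init__(self, kmodel_path, model_input_size, rgb888p_size, display_size):
        super().__init__(kmodel_path, model_input_size, rgb888p_size, debug_mode=0)
        self.ai2d = Ai2d(0)                             # pre-processing helper
        self.ai2d.set_ai2d_dtype(nn.ai2d_format.NCHW_FMT, nn.ai2d_format.NCHW_FMT, np.uint8, np.uint8)

    def config_preprocess(self, input_image_size=None):
        # describe the pre-processing (resize/pad/crop), then build the ai2d pipeline
        self.ai2d.resize(nn.interp_method.tf_bilinear, nn.interp_mode.half_pixel)
        self.ai2d.build([1, 3, 1080, 1920], [1, 3, 320, 320])   # placeholder shapes

    def postprocess(self, results):
        return results                                  # placeholder: decode the model output here

pl = PipeLine(rgb888p_size=[1920, 1080], display_size=[800, 480], display_mode="lcd")
pl.create()
app = MyApp("/sdcard/examples/kmodel/your_model.kmodel", [320, 320], [1920, 1080], [800, 480])
app.config_preprocess()
while True:
    img = pl.get_frame()        # RGB888 frame for the AI path
    res = app.run(img)          # pre-processing + inference + post-processing
    pl.show_image()             # overlay the osd image on the display path
    gc.collect()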
2. Face detection
Face detection finds the faces in an image; both single and multiple faces are supported.
In this demo, the faces in the picture captured by the camera are marked with rectangles. The programming flow is as follows:
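In outline (summarizing the code below): create and initialize the PipeLine → load the kmodel and anchor data → configure pre-processing with config_preprocess → loop: grab a frame with get_frame(), run inference with run(), draw the detected face boxes on the osd image, show the result with show_image(), and collect garbage.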
The implementation code is as follows:
'''Face detection demo'''
from media.sensor import *
from libs.PipeLine import PipeLine, ScopedTiming
from libs.AIBase import AIBase
from libs.AI2D import Ai2d
import os
import ujson
from media.media import *
from time import *
import nncase_runtime as nn
import ulab.numpy as np
import time
import utime
import image
import random
import gc
import sys
import aidemo


# Custom face detection application, inheriting the AIBase base class
class FaceDetectionApp(AIBase):
    def __init__(self, kmodel_path, model_input_size, anchors, confidence_threshold=0.5, nms_threshold=0.2, rgb888p_size=[224,224], display_size=[1920,1080], debug_mode=0):
        super().__init__(kmodel_path, model_input_size, rgb888p_size, debug_mode)
        self.kmodel_path = kmodel_path                      # kmodel path
        self.model_input_size = model_input_size            # model input resolution
        self.confidence_threshold = confidence_threshold    # confidence threshold
        self.nms_threshold = nms_threshold                   # NMS threshold
        self.anchors = anchors                               # anchor (prior box) data
        # Width of the sensor's RGB888 output, aligned up to 16 pixels
        self.rgb888p_size = [ALIGN_UP(rgb888p_size[0], 16), rgb888p_size[1]]
        # Width of the display resolution, aligned up to 16 pixels
        self.display_size = [ALIGN_UP(display_size[0], 16), display_size[1]]
        self.debug_mode = debug_mode
        # Ai2d instance for pre-processing; input/output layout NCHW, dtype uint8
        self.ai2d = Ai2d(debug_mode)
        self.ai2d.set_ai2d_dtype(nn.ai2d_format.NCHW_FMT, nn.ai2d_format.NCHW_FMT, np.uint8, np.uint8)

    # Configure pre-processing: pad (letterbox) then resize to the model input size
    def config_preprocess(self, input_image_size=None):
        with ScopedTiming("set preprocess config", self.debug_mode > 0):
            ai2d_input_size = input_image_size if input_image_size else self.rgb888p_size
            top, bottom, left, right = self.get_padding_param()
            self.ai2d.pad([0, 0, 0, 0, top, bottom, left, right], 0, [104, 117, 123])
            self.ai2d.resize(nn.interp_method.tf_bilinear, nn.interp_mode.half_pixel)
            self.ai2d.build([1, 3, ai2d_input_size[1], ai2d_input_size[0]],
                            [1, 3, self.model_input_size[1], self.model_input_size[0]])

    # Post-processing: decode the raw model output into face boxes
    def postprocess(self, results):
        with ScopedTiming("postprocess", self.debug_mode > 0):
            post_ret = aidemo.face_det_post_process(self.confidence_threshold, self.nms_threshold,
                                                    self.model_input_size[1], self.anchors,
                                                    self.rgb888p_size, results)
            if len(post_ret) == 0:
                return post_ret
            else:
                return post_ret[0]

    # Draw detection results: map boxes from the sensor resolution to the display resolution
    def draw_result(self, pl, dets):
        with ScopedTiming("display_draw", self.debug_mode > 0):
            if dets:
                pl.osd_img.clear()
                for det in dets:
                    x, y, w, h = map(lambda v: int(round(v, 0)), det[:4])
                    x = x * self.display_size[0] // self.rgb888p_size[0]
                    y = y * self.display_size[1] // self.rgb888p_size[1]
                    w = w * self.display_size[0] // self.rgb888p_size[0]
                    h = h * self.display_size[1] // self.rgb888p_size[1]
                    pl.osd_img.draw_rectangle(x, y, w, h, color=(255, 255, 0, 255), thickness=2)
            else:
                pl.osd_img.clear()

    # Compute letterbox padding so the frame keeps its aspect ratio inside the model input
    def get_padding_param(self):
        dst_w = self.model_input_size[0]
        dst_h = self.model_input_size[1]
        ratio_w = dst_w / self.rgb888p_size[0]
        ratio_h = dst_h / self.rgb888p_size[1]
        ratio = min(ratio_w, ratio_h)
        new_w = int(ratio * self.rgb888p_size[0])
        new_h = int(ratio * self.rgb888p_size[1])
        dw = (dst_w - new_w) / 2
        dh = (dst_h - new_h) / 2
        top = int(round(0))
        bottom = int(round(dh * 2 + 0.1))
        left = int(round(0))
        right = int(round(dw * 2 - 0.1))
        return top, bottom, left, right


if __name__ == "__main__":
    # Display mode: "hdmi" (1920x1080) or "lcd" (800x480)
    display_mode = "lcd"
    if display_mode == "hdmi":
        display_size = [1920, 1080]
    else:
        display_size = [800, 480]

    # Face detection kmodel on the SD card
    kmodel_path = "/sdcard/examples/kmodel/face_detection_320.kmodel"

    # Detection parameters and anchor (prior box) data
    confidence_threshold = 0.5
    nms_threshold = 0.2
    anchor_len = 4200
    det_dim = 4
    anchors_path = "/sdcard/examples/utils/prior_data_320.bin"
    anchors = np.fromfile(anchors_path, dtype=np.float)
    anchors = anchors.reshape((anchor_len, det_dim))
    rgb888p_size = [1920, 1080]     # resolution of the RGB888 stream fed to the AI

    # Create and initialize the PipeLine (sensor + display)
    pl = PipeLine(rgb888p_size=rgb888p_size, display_size=display_size, display_mode=display_mode)
    pl.create()

    # Create the face detection application and configure pre-processing
    face_det = FaceDetectionApp(kmodel_path, model_input_size=[320, 320], anchors=anchors,
                                confidence_threshold=confidence_threshold, nms_threshold=nms_threshold,
                                rgb888p_size=rgb888p_size, display_size=display_size, debug_mode=0)
    face_det.config_preprocess()

    clock = time.clock()

    while True:
        clock.tick()
        img = pl.get_frame()            # get the current frame (RGB888)
        res = face_det.run(img)         # run inference on the current frame

        if res:
            print(res)

        face_det.draw_result(pl, res)   # draw detection boxes on the osd image
        pl.show_image()                 # show the result on the display
        gc.collect()
        print(clock.fps())              # print the frame rate
The main function grabs the current frame, runs AI inference on it, and, when faces are detected, draws and displays the results. It uses the face detection model /sdcard/examples/kmodel/face_detection_320.kmodel on the SD card. The AI processing stages are defined in the FaceDetectionApp class by config_preprocess, postprocess and get_padding_param.
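As a concrete example of the pre-processing: with rgb888p_size = [1920, 1080] and model_input_size = [320, 320], get_padding_param computes ratio = min(320/1920, 320/1080) ≈ 1/6, so the frame is scaled to roughly 320×180 and then padded with about 140 rows at the bottom (filled with the mean color [104, 117, 123]) to reach the 320×320 model input. draw_result then maps the detected boxes back from the 1920×1080 sensor resolution to the display resolution by multiplying each coordinate by display_size and integer-dividing by rgb888p_size per axis.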
Two photos were chosen for testing. With the program running and the camera pointed at the photos, the recognition results are shown below; all faces in the pictures are detected.
With this, the face detection function is implemented.