[Canaan CanMV K230 Review] Multi-color recognition, color counting & first use of AprilTag

王嘉辉 posted on 2024-10-27 15:36

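Multi-color recognition

We previously wrote a single-color recognition program. Multi-color recognition simply wraps that logic in a loop: each iteration switches to a different threshold and then draws the results, so blobs of each color get marked in turn.

The basic logic is that find_blobs is no longer called with a single fixed threshold. Instead it runs inside a loop of three iterations (only three colors are expected), and each iteration uses a different threshold. Each call finds all blobs that match the current threshold and returns them as a list of tuples. A second, inner loop then walks that list, drawing a bounding box around each blob and printing the matching label next to it.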
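To make the idea of a threshold concrete, here is a minimal sketch in plain Python of how one LAB pixel could be checked against one threshold tuple. It uses the (L Min, L Max, A Min, A Max, B Min, B Max) layout from this post's code. The helper name `in_threshold` is hypothetical, and inclusive bounds are an assumption; this is not the firmware's actual implementation.

```python
# Hypothetical helper (not part of the CanMV API): tests one LAB pixel
# against one threshold tuple. Inclusive bounds are an assumption.
def in_threshold(lab, threshold):
    l, a, b = lab
    l_min, l_max, a_min, a_max, b_min, b_max = threshold
    return (l_min <= l <= l_max and
            a_min <= a <= a_max and
            b_min <= b <= b_max)

RED = (30, 100, 15, 127, 15, 127)   # red threshold from the post
BLUE = (0, 40, 0, 90, -128, -20)    # blue threshold from the post

sample = (53, 80, 67)               # roughly the LAB value of pure red
print(in_threshold(sample, RED))    # True
print(in_threshold(sample, BLUE))   # False
```

A blob is then, roughly speaking, a connected group of pixels that all pass this kind of test for the same threshold.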

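This requires two additional lists: one holding the drawing color and one holding the text label for each of the colors.

The code is shown below.

Two new variables appear, colors1 and colors2, which store the RGB color values and the color name strings for red, green and blue.

Between taking the snapshot and displaying it, two loops have been added: the outer loop iterates over the colors, and the inner loop iterates over the blobs found for one color. Otherwise the program differs little from the single-color version.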

<pre>
<code class="language-python">import time, os, sys

from media.sensor import * # sensor module: camera interfaces
from media.display import * # display module: display interfaces
from media.media import * # media module: media resource management

# Color thresholds (L Min, L Max, A Min, A Max, B Min, B Max), LAB color space.
# The tuples below detect red, green and blue; tune them to improve recognition.
thresholds = [(30, 100, 15, 127, 15, 127), # red threshold
              (30, 100, -64, -8, 50, 70), # green threshold
              (0, 40, 0, 90, -128, -20)] # blue threshold

colors1 = [(255,0,0), (0,255,0), (0,0,255)] # drawing color for each threshold
colors2 = ['RED', 'GREEN', 'BLUE'] # text label for each threshold

sensor = None # defined up front so the finally block can always test it

try:

    sensor = Sensor() # construct the camera object
    sensor.reset() # reset and initialize the camera
    sensor.set_framesize(width=800, height=480) # frame size = LCD resolution (800x480), default channel 0
    sensor.set_pixformat(Sensor.RGB565) # output image format, default channel 0

    Display.init(Display.VIRT, sensor.width(), sensor.height()) # show images in the IDE frame buffer only

    MediaManager.init() # initialize the media resource manager

    sensor.run() # start the sensor

    clock = time.clock()

    while True:

        os.exitpoint() # check for an IDE interrupt

        clock.tick()

        img = sensor.snapshot() # capture one frame

        for i in range(3):

            # index 0, 1, 2 selects the red, green, blue threshold
            blobs = img.find_blobs([thresholds[i]])

            if blobs:

                for b in blobs: # draw a rectangle, a cross and a label
                    # b[0:4] is (x, y, w, h); b[5], b[6] are the blob center
                    img.draw_rectangle(b[0:4], thickness = 4, color = colors1[i])
                    img.draw_cross(b[5], b[6], thickness = 2)
                    img.draw_string_advanced(b[0], b[1]-35, 30, colors2[i], color = colors1[i])

        img.draw_string_advanced(0, 0, 30, 'FPS: '+str("%.3f"%(clock.fps())), color = (255, 255, 255))

        Display.show_image(img) # display the frame

        print(clock.fps()) # print the FPS

###################
# Release resources on IDE interrupt
###################
except KeyboardInterrupt as e:
    print("user stop: ", e)
except BaseException as e:
    print(f"Exception {e}")
finally:
    # sensor stop run
    if isinstance(sensor, Sensor):
        sensor.stop()
    # deinit display
    Display.deinit()
    os.exitpoint(os.EXITPOINT_ENABLE_SLEEP)
    time.sleep_ms(100)
    # release media buffer
    MediaManager.deinit()
</code></pre>

Demo result.

<div style="text-align: center;"></div>
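The nested-loop structure used in this section can also be tried off the device. The sketch below keeps the three parallel lists from the post but replaces the camera call with `fake_find_blobs`, a hypothetical stand-in that returns canned (x, y, w, h) boxes. Both that function and its data are assumptions made purely for illustration.

```python
thresholds = [(30, 100, 15, 127, 15, 127),   # red
              (30, 100, -64, -8, 50, 70),    # green
              (0, 40, 0, 90, -128, -20)]     # blue
colors = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]
labels = ['RED', 'GREEN', 'BLUE']

def fake_find_blobs(threshold_list):
    """Hypothetical stand-in for img.find_blobs(): returns canned boxes."""
    canned = {
        thresholds[0]: [(10, 20, 30, 40)],
        thresholds[1]: [],                                   # no green found
        thresholds[2]: [(100, 50, 20, 20), (200, 80, 25, 25)],
    }
    return canned[threshold_list[0]]

def annotate():
    """Outer loop: one pass per color. Inner loop: one pass per blob."""
    annotations = []
    for threshold, color, label in zip(thresholds, colors, labels):
        for box in fake_find_blobs([threshold]):
            annotations.append((label, color, box))
    return annotations

for label, color, box in annotate():
    print(label, color, box)
```

Iterating with zip keeps each threshold paired with its color and label, which avoids indexing three lists with the same counter.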

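Color recognition and counting

In everyday life and in production settings, many identical items may be scattered about and need counting, and such items usually share the same color. find_blobs returns a list of tuples, which has a convenient property: the length of that list is the number of blobs detected. The example set also includes a demo that detects yellow jumper caps, so let's run it as an experiment.

The code follows the single-color recognition program above; what changes is the color threshold. Here we can load an image containing jumper caps and use the threshold editor to obtain the threshold. After find_blobs runs, we have a list of blobs. The built-in len function (which returns the length of an object such as a tuple, a list or a dictionary) gives the number of entries in blobs, and a draw call then prints that count on the image. The implementation follows.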

<pre>
<code class="language-python">import time, os, sys

from media.sensor import * # sensor module: camera interfaces
from media.display import * # display module: display interfaces
from media.media import * # media module: media resource management

thresholds = [(18, 72, -13, 31, 18, 83)] # threshold for the yellow jumper caps

sensor = None # defined up front so the finally block can always test it

try:
    sensor = Sensor() # construct the camera object
    sensor.reset() # reset and initialize the camera
    sensor.set_framesize(width=800, height=480) # frame size = LCD resolution (800x480), default channel 0
    sensor.set_pixformat(Sensor.RGB565) # output image format, default channel 0

    Display.init(Display.VIRT, sensor.width(), sensor.height()) # show images in the IDE frame buffer only

    MediaManager.init() # initialize the media resource manager

    sensor.run() # start the sensor

    clock = time.clock()

    while True:

        os.exitpoint() # check for an IDE interrupt

        clock.tick()

        img = sensor.snapshot()
        blobs = img.find_blobs([thresholds[0]])

        if blobs: # mark every blob found
            for b in blobs:
                # b[0:4] is (x, y, w, h); b[5], b[6] are the blob center
                img.draw_rectangle(b[0:4])
                img.draw_cross(b[5], b[6])

        # show the FPS and the blob count
        img.draw_string_advanced(0, 0, 30, 'FPS: '+str("%.3f"%(clock.fps()))+' Num: '
                                 +str(len(blobs)), color = (255, 255, 255))

        Display.show_image(img)

        print(clock.fps()) # print the FPS

###################
# Release resources on IDE interrupt
###################
except KeyboardInterrupt as e:
    print("user stop: ", e)
except BaseException as e:
    print(f"Exception {e}")
finally:
    # sensor stop run
    if isinstance(sensor, Sensor):
        sensor.stop()
    # deinit display
    Display.deinit()
    os.exitpoint(os.EXITPOINT_ENABLE_SLEEP)
    time.sleep_ms(100)
    # release media buffer
    MediaManager.deinit()
</code></pre>

The image supplied with the example is used below to demonstrate the result.

<div style="text-align: center;"></div>
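The counting step in this section comes down to calling len on whatever find_blobs returned. Here is a minimal sketch in plain Python that can run off the device. The helper `count_label` is hypothetical and the blob tuples are made-up sample data. The sketch also assumes that an empty result is an empty list, so the count is simply 0.

```python
def count_label(blobs, fps):
    """Build the overlay text from the frame rate and the blob count."""
    return 'FPS: %.3f Num: %d' % (fps, len(blobs))

# Made-up sample data standing in for the result of find_blobs().
blobs = [(10, 20, 30, 40), (100, 50, 20, 20)]

print(count_label(blobs, 25.0))   # FPS: 25.000 Num: 2
print(count_label([], 25.0))      # FPS: 25.000 Num: 0
```

Because len works on any sized container, the same call would count the keys of a dictionary or the items of a tuple.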

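AprilTag recognition

An AprilTag is a visual fiducial marker used in augmented reality (AR), robot navigation and computer vision. Each tag is a black-and-white pattern that encodes a unique identifier. A camera can detect these tags and compute their position and orientation in 3D space. The tags are detected reliably even in poor conditions, and they are easy to generate and use, so they are widely used where spatial localization is needed.

On the K230, find_apriltags performs AprilTag recognition. Its prototype is find_apriltags([roi[, families=image.TAG36H11[, fx[, fy[, cx[, cy]]]]]]). Here roi is the region of the image to search, families is a bitmask of the tag families to decode, fx and fy are the camera focal lengths in pixels along x and y, and cx and cy are the image center in pixels.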
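The fx, fy, cx and cy parameters are easier to picture with numbers. The sketch below uses the common conversion from a lens focal length in millimeters to a focal length in pixels, and takes the image center as half the frame size. The helper name `focal_length_px` is hypothetical. The lens and sensor dimensions are illustrative values only, not the specifications of the K230 camera module.

```python
def focal_length_px(focal_mm, sensor_mm, image_px):
    """Focal length in pixels = (lens focal length / sensor size) * image size."""
    return (focal_mm / sensor_mm) * image_px

width, height = 320, 240          # frame size used in the AprilTag demo

# Illustrative lens and sensor numbers (assumptions, not K230 specifications).
LENS_MM = 2.8
SENSOR_W_MM = 3.984
SENSOR_H_MM = 2.952

fx = focal_length_px(LENS_MM, SENSOR_W_MM, width)
fy = focal_length_px(LENS_MM, SENSOR_H_MM, height)
cx = width * 0.5                  # image center, x
cy = height * 0.5                 # image center, y

print(fx, fy, cx, cy)
# On the device these might then be passed along the lines of
# img.find_apriltags(fx=fx, fy=fy, cx=cx, cy=cy); check the firmware docs first.
```

With real lens and sensor data, the same arithmetic gives the intrinsics that a pose estimate would need.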

The code is shown below.

<pre>
<code class="language-python">import time, math, os, gc

import image # explicit import for the TAG* family constants used below
from media.sensor import * # sensor module: camera interfaces
from media.display import * # display module: display interfaces
from media.media import * # media module: media resource management

# The apriltag code can handle up to 6 tag families at once.
# Each returned tag object carries its family and its id within that family.

tag_families = 0
tag_families |= image.TAG36H11

def family_name(tag):
    if(tag.family() == image.TAG16H5):
        return "TAG16H5"
    if(tag.family() == image.TAG25H7):
        return "TAG25H7"
    if(tag.family() == image.TAG25H9):
        return "TAG25H9"
    if(tag.family() == image.TAG36H10):
        return "TAG36H10"
    if(tag.family() == image.TAG36H11):
        return "TAG36H11"
    if(tag.family() == image.ARTOOLKIT):
        return "ARTOOLKIT"

sensor = None # defined up front so the finally block can always test it

try:

    sensor = Sensor(width=1280, height=960) # construct the camera object with a 4:3 aspect ratio
    sensor.reset() # reset and initialize the camera
    sensor.set_framesize(width=320, height=240) # frame size 320x240 (smaller than the 800x480 LCD), default channel 0
    sensor.set_pixformat(Sensor.RGB565) # output image format, default channel 0

    Display.init(Display.ST7701, to_ide=True) # show on the 3.5-inch MIPI LCD (800x480) and in the IDE frame buffer

    MediaManager.init() # initialize the media resource manager

    sensor.run() # start the sensor

    clock = time.clock()

    while True:

        os.exitpoint() # check for an IDE interrupt
        clock.tick()

        img = sensor.snapshot() # capture one frame

        for tag in img.find_apriltags(families=tag_families): # defaults to TAG36H11 if no family is given

            img.draw_rectangle(tag.rect(), color = (255, 0, 0), thickness=4)
            img.draw_cross(tag.cx(), tag.cy(), color = (0, 255, 0), thickness=2)
            # tag.rotation() is in radians; convert to degrees for printing
            print_args = (family_name(tag), tag.id(), (180 * tag.rotation()) / math.pi)
            print("Tag Family %s, Tag ID %d, rotation %f (degrees)" % print_args)

        # show the frame centered on the 800x480 LCD
        Display.show_image(img, x=round((800-sensor.width())/2), y=round((480-sensor.height())/2))

        print(clock.fps()) # print the frame rate

except KeyboardInterrupt as e:
    print(f"user stop")
except BaseException as e:
    print(f"Exception '{e}'")
finally:
    # sensor stop run
    if isinstance(sensor, Sensor):
        sensor.stop()
    # deinit display
    Display.deinit()

    os.exitpoint(os.EXITPOINT_ENABLE_SLEEP)
    time.sleep_ms(100)

    # release media buffer
    MediaManager.deinit()
</code></pre>

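The demo result is shown below. The print window lists each detected AprilTag together with its rotation angle, which could perhaps be used for some spatial localization. (I have not used AprilTag before and have been busy lately; next week I will read the related documentation to see how the underlying principle works.)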

<div style="text-align: center;"></div>
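The demo prints the tag rotation in degrees by computing (180 * rotation) / math.pi, which implies the tag reports its rotation in radians. As a small sketch in plain Python, the same conversion can be written with math.degrees. The helper name `rotation_degrees` is hypothetical.

```python
import math

def rotation_degrees(rotation_rad):
    """Convert a rotation in radians to degrees, as the demo's print line does."""
    return math.degrees(rotation_rad)

quarter_turn = math.pi / 2
print(rotation_degrees(quarter_turn))          # 90.0
print((180 * quarter_turn) / math.pi)          # 90.0, the form used in the demo
```

Both forms are equivalent. math.degrees just states the intent more directly.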

Jacktang posted on 2024-10-28 07:47

Using this angle information for some spatial localization is worth trying.


EEWorld (电子工程世界) Forum Archiver
