Multi-category Segmentation Method of Tomato Image Based on Improved DeepLabv3+

Fund Project:

National Natural Science Foundation of China (52065033) and Major Science and Technology Special Project of the Yunnan Provincial Department of Science and Technology (2022AG050002-4)

    Abstract:

    Accurate recognition of multiple target categories in tomato images is a technical prerequisite for automated picking. To address the low segmentation accuracy and large parameter counts of existing networks, a multi-category segmentation method for tomato images based on an improved DeepLabv3+ was proposed. The method combined GhostNet with the coordinate attention (CA) module to build CA-GhostNet as the backbone feature extraction network of DeepLabv3+, which reduced the number of network parameters and improved segmentation accuracy. A multi-branch decoding structure was also designed to improve the segmentation of small-target categories. On this basis, weights trained on a synthetic dataset were transferred to small-sample monocular and binocular datasets, and eight semantic categories, including fruit, main stem, lateral branch and hanging line, were segmented. The results showed that the mean intersection over union (MIoU) and mean pixel accuracy (MPA) of the improved DeepLabv3+ model were 68.64% and 78.59%, respectively, on the monocular dataset, and 73.00% and 80.59%, respectively, on the binocular dataset. In addition, the memory footprint of the proposed model was only 18.5 MB, and the inference time for a single image was 55 ms. Compared with the baseline model, the MIoU on the monocular and binocular datasets increased by 6.40 and 6.98 percentage points, respectively. Compared with HRNet, UNet and PSPNet, the memory footprint was reduced by 82%, 79% and 88%, respectively. The study can serve as a reference for intelligent picking and safe operation of tomato-picking robots.
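
    The abstract reports MIoU and MPA without defining them. The sketch below shows one common way to compute both from a pixel-level confusion matrix: per-class IoU is TP / (TP + FP + FN), per-class pixel accuracy is TP / (ground-truth pixels), and each is averaged over classes. It is an illustration only. The function names, the flat label lists, and the choice to skip classes with no pixels are assumptions, and the paper's own evaluation code may differ.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels; rows are ground-truth classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def miou_and_mpa(cm):
    """Return (MIoU, MPA) as fractions in [0, 1].

    Assumption: a class with no ground-truth and no predicted pixels is
    skipped rather than counted as zero.
    """
    n = len(cm)
    ious, accs = [], []
    for c in range(n):
        tp = cm[c][c]
        gt = sum(cm[c])                           # pixels labelled as class c
        pred = sum(cm[r][c] for r in range(n))    # pixels predicted as class c
        union = gt + pred - tp                    # TP + FP + FN
        if union > 0:
            ious.append(tp / union)
        if gt > 0:
            accs.append(tp / gt)
    if not ious or not accs:
        raise ValueError("confusion matrix contains no labelled pixels")
    return sum(ious) / len(ious), sum(accs) / len(accs)


if __name__ == "__main__":
    # Toy example with 3 classes and 6 pixels (not data from the paper).
    truth = [0, 0, 1, 1, 2, 2]
    preds = [0, 1, 1, 1, 2, 0]
    miou, mpa = miou_and_mpa(confusion_matrix(truth, preds, 3))
    print(f"MIoU = {miou:.4f}, MPA = {mpa:.4f}")
```

    In practice the label lists would be the flattened ground-truth and predicted masks over the eight semantic categories, accumulated across all test images before averaging.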

Cite this article

GU Wenjuan, WEI Jin, YIN Yanchao, LIU Xiaobao, DING Can. Multi-category Segmentation Method of Tomato Image Based on Improved DeepLabv3+[J]. Transactions of the Chinese Society for Agricultural Machinery, 2023, 54(12): 261-271.

History
  • Received: 2023-04-23
  • Published online: 2023-07-30