Real Time Visual Detection for Cluttered Targets Based on Deep Learning Acceleration Model

Fund Project:

National Natural Science Foundation of China (52375083), Chongqing Talents Program (CQYC20220207232/cstc2024ycjh-bgzxm0052), Major Scientific Research Project of the Chongqing Municipal Education Commission (KJZD-M202401101), Natural Science Foundation of Chongqing (cstc2021jcyj-msxmX0372), and Chongqing Technology Innovation and Application Project (CSTB2022TIAD-CUX0017)

    Abstract:

    On automated assembly lines for agricultural machinery, the on-chip resources of the embedded control platform are extremely limited, while deep learning detection systems based on convolutional neural networks have too many parameters to be ported directly to such a platform. To address this, a deep learning real-time detection method based on an improved ResNet18-SSD (single shot multi-box detector) and a field programmable gate array (FPGA) acceleration engine is proposed. To reduce the number of parameters while improving detection accuracy, a fast detection model based on ResNet18-SSD is proposed: an optimized ResNet18 network replaces the VGG16 front-end (backbone) network of the SSD model, and a multi-branch isomorphic structure and an asymmetric parallel residual structure are introduced so that the model can adapt to complex scenes such as occlusion and dim lighting. Provided the detection accuracy requirement is met, dynamic fixed-point quantization is used to reduce the model's data volume and improve the execution efficiency of the detection model. For the convolutional layers, which consume the most resources in the improved ResNet18-SSD model, an FPGA acceleration engine based on the Winograd algorithm is proposed to improve the real-time performance of detection. Through software-hardware co-design, the hardware accelerator and the lightweight software network are optimized jointly, so as to balance model size, acceleration performance, and accuracy in complex scenes. Experimental results on a Xilinx FPGA embedded platform show that the proposed method reaches a detection accuracy of 93.5%, and that at an operating frequency of 100 MHz the detection time for a single image is 80.232 ms, which meets the real-time requirement.
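
The dynamic fixed-point quantization mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the word length, the rounding rule, or how values are grouped, so the 8-bit signed word, round-to-nearest, and per-list grouping below are assumptions, and the function names are hypothetical. The idea shown is only the general one: every group of values shares one word length, but each group picks its own fractional length from its own value range.

```python
import math


def choose_frac_bits(values, word_bits=8):
    """Pick a fractional length so the largest magnitude fits a signed word.

    word_bits=8 is an assumed word length, not a figure from the paper.
    """
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return word_bits - 1
    # Bits needed left of the binary point (can be negative for small values).
    int_bits = math.floor(math.log2(max_abs)) + 1
    return word_bits - 1 - int_bits


def quantize(values, word_bits=8):
    """Return (integer codes, fractional length) for one group of values."""
    frac_bits = choose_frac_bits(values, word_bits)
    scale = 2.0 ** frac_bits
    q_min = -(1 << (word_bits - 1))
    q_max = (1 << (word_bits - 1)) - 1
    # Round to the nearest code, then saturate to the signed range.
    codes = [min(q_max, max(q_min, round(v * scale))) for v in values]
    return codes, frac_bits


def dequantize(codes, frac_bits):
    """Map integer codes back to real values."""
    scale = 2.0 ** frac_bits
    return [c / scale for c in codes]


# Two groups with different ranges end up with different fractional lengths.
weights = [0.75, -0.5, 0.125, 0.3]
activations = [2.5, -1.0, 0.3]
for name, group in (("weights", weights), ("activations", activations)):
    codes, frac_bits = quantize(group)
    print(name, codes, frac_bits, dequantize(codes, frac_bits))
```

Small-valued groups keep more fractional bits and large-valued groups keep more integer bits, which is why a short word length can preserve accuracy better than a single fixed format for the whole network.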

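The Winograd algorithm named in the abstract reduces the number of multiplications in a convolution, which is what makes it attractive for multiplier-limited FPGA fabric. The abstract does not give the tile size used by the acceleration engine, so the sketch below assumes the smallest common case, one-dimensional F(2,3): two outputs of a 3-tap filter are computed from four inputs with 4 multiplications instead of 6. The function names are hypothetical, and this is an illustration of the arithmetic only, not the paper's hardware design.

```python
def transform_filter(g):
    """Pre-transform a 3-tap filter once; it is reused for every tile."""
    g0, g1, g2 = g
    return (g0, (g0 + g1 + g2) / 2, (g0 - g1 + g2) / 2, g2)


def winograd_f23_tile(d, u):
    """Compute two outputs from four inputs with four multiplications."""
    d0, d1, d2, d3 = d
    m1 = (d0 - d2) * u[0]
    m2 = (d1 + d2) * u[1]
    m3 = (d2 - d1) * u[2]
    m4 = (d1 - d3) * u[3]
    return (m1 + m2 + m3, m2 - m3 - m4)


def winograd_conv1d(signal, g):
    """Valid-mode 1D convolution (CNN-style correlation) built from F(2,3) tiles."""
    n_out = len(signal) - 2
    if n_out < 2 or n_out % 2:
        raise ValueError("this sketch needs an even number of outputs")
    u = transform_filter(g)
    out = []
    # Tiles of four inputs overlap by two samples and yield two outputs each.
    for i in range(0, n_out, 2):
        out.extend(winograd_f23_tile(signal[i:i + 4], u))
    return out


def direct_conv1d(signal, g):
    """Reference implementation: three multiplications per output."""
    return [sum(signal[i + k] * g[k] for k in range(3))
            for i in range(len(signal) - 2)]


signal = [1, 2, 3, 4, 5, 6]
kernel = [2, 4, 6]
print(winograd_conv1d(signal, kernel))
print(direct_conv1d(signal, kernel))
```

The two-dimensional form nests the same transform along rows and columns; for a 3×3 kernel producing a 2×2 output tile it needs 16 multiplications where the direct method needs 36.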
Cite this article

YU Yongwei, CHEN Tianhao, DU Liuqing, FANG Rong. Real Time Visual Detection for Cluttered Targets Based on Deep Learning Acceleration Model[J]. Transactions of the Chinese Society for Agricultural Machinery (农业机械学报), 2025, 56(5): 617-624.

History
  • Received: 2024-11-19
  • Published online: 2025-05-10