Occlusion Suppression and Robotic Arm Path Planning for Apple-picking Robots Based on Reinforcement Learning

Fund Project: National Key Research and Development Program of China (NK2023150202)

    Abstract:

    Branch and leaf occlusion in complex orchard scenes degrades the localization accuracy of apple-picking robots and makes harvesting difficult. To address this problem, this study proposes a reinforcement learning-based occlusion suppression and path planning method built on a vision-motion collaborative active perception system. At the visual perception layer, a dynamic occlusion suppression calculation model (OSCM) is designed on top of the YOLO v8 model; it combines color histogram-based region segmentation with a boundary fitting estimation algorithm to generate, in real time, the occlusion rate of each target fruit and an occlusion suppression heatmap. At the motion decision layer, a depth camera is rigidly mounted on the end-effector of the robotic arm and a three-dimensional occlusion suppression matrix mapping mechanism is established: the depth space between the camera and the target is discretized into a three-dimensional grid matrix in which each cell stores the OSCM-derived occlusion rate and suppression priority. At the path planning layer, a deep reinforcement learning-driven obstacle avoidance decision model is proposed; a composite reward function combining an occlusion rate penalty with harvesting timeliness guides a double deep Q-network (DDQN) in searching the three-dimensional grid for the optimal suppression path. Experimental results show that in high-occlusion scenes with a branch-leaf occlusion rate of no less than 60%, the system reduced the occluded fruit area by 33% on average, raised the picking success rate from 67% to 94.7%, and cut the average picking time per fruit by 3.2 s relative to a heuristic method. By integrating the occlusion suppression mechanism with reinforcement learning decision-making in a closed loop, the study provides a method for research on the adaptability of agricultural robots to dynamic environments.
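
    The grid mapping, composite reward, and DDQN update named in the abstract could be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the `Cell` fields, the cell size, the reward weights, the discount factor, and every function name are assumptions, since the abstract gives no numeric settings. Only the Double DQN target rule (the online network selects the next action, the target network evaluates it) follows the standard formulation.

```python
from dataclasses import dataclass
import math


@dataclass
class Cell:
    """One cell of the 3D grid (hypothetical layout)."""
    occlusion_rate: float = 0.0  # in [0, 1], as an OSCM-like model might output
    priority: float = 0.0        # suppression priority


def point_to_cell(point, origin, cell_size):
    """Map a 3D point (metres, camera frame) to integer grid indices."""
    return tuple(math.floor((p - o) / cell_size) for p, o in zip(point, origin))


def composite_reward(cell, w_occ=1.0, step_cost=0.05):
    """Occlusion-rate penalty plus a fixed per-step cost standing in for timeliness.

    Both weights are illustrative assumptions.
    """
    return -w_occ * cell.occlusion_rate - step_cost


def double_dqn_target(reward, q_online_next, q_target_next, done, gamma=0.9):
    """Double DQN target for one transition.

    q_online_next / q_target_next are the Q-values of the next state from the
    online and target networks, given here as plain lists.
    """
    if done:
        return reward
    # The online network chooses the action; the target network scores it.
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    return reward + gamma * q_target_next[best_action]


# Example: store one occluded cell and compute its learning target.
grid = {}
idx = point_to_cell((0.12, -0.03, 0.47), origin=(0.0, -0.1, 0.0), cell_size=0.05)
grid[idx] = Cell(occlusion_rate=0.6, priority=1.0)
r = composite_reward(grid[idx])
y = double_dqn_target(r, q_online_next=[0.1, 0.5], q_target_next=[3.0, 2.0], done=False)
```

    In the example the target network's largest value (3.0) is not used, because the online network prefers the second action. That decoupling of action selection from action evaluation is what distinguishes Double DQN from plain DQN.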

Cite this article

MIAO Zhonghua, ZHOU Zifei, SUN Teng, GAO Xuan, LI Nan, ZHANG Wei. Occlusion Suppression and Robotic Arm Path Planning for Apple-picking Robots Based on Reinforcement Learning[J]. Transactions of the Chinese Society for Agricultural Machinery, 2026, 57(5): 95-104.

History
  • Received: 2025-07-13
  • Published online: 2026-03-01