Fusion of Res3D, BiLSTM and Attention Mechanism for Sheep Behavior Recognition Method
Authors: YUAN Hongbo, CAO Runliu, CHENG Man

Fund Project: Key Research and Development Program of Hebei Province (21327402D)

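    Summary:

    Recognizing animal behavior provides an important basis for disease prevention and rational feeding, and so helps improve attention to animal health and welfare. This paper proposes a deep learning network model (AdRes3D-BiLSTM) that fuses a three-dimensional residual convolutional neural network, a bidirectional long short-term memory network, and an attention mechanism. The AdRes3D-BiLSTM model performs recognition directly on video streams. Depthwise separable convolution and an attention mechanism are introduced in the AdRes3D part, which reduces the number of floating-point operations, makes the network more lightweight, and improves feature extraction in both the temporal and spatial dimensions. The extracted features are fed into the BiLSTM module, which filters and updates the temporal feature vectors in the forward and backward directions, and the sheep behavior is then recognized. Experimental results show that AdRes3D-BiLSTM reaches an overall recognition accuracy of 98.72% on five sheep behaviors (standing, lying, feeding, walking and ruminating), a frame rate of 52.79 f/s, and a model memory footprint of 28.03 MB. The results provide a new method and approach for animal behavior recognition based on video streams.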
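The saving from depthwise separable convolution can be illustrated with a short weight-count comparison. This is a minimal sketch, not the authors' implementation: the channel counts, kernel size, and output shape below are hypothetical values chosen for illustration, and the layer is assumed to use stride 1 with same padding and no bias.

```python
def conv3d_params(c_in, c_out, kernel=(3, 3, 3)):
    """Weights in a standard 3D convolution (bias ignored)."""
    kt, kh, kw = kernel
    return kt * kh * kw * c_in * c_out


def separable_conv3d_params(c_in, c_out, kernel=(3, 3, 3)):
    """Weights in a depthwise 3D convolution followed by a 1x1x1 pointwise one."""
    kt, kh, kw = kernel
    depthwise = kt * kh * kw * c_in   # one spatio-temporal filter per input channel
    pointwise = c_in * c_out          # 1x1x1 convolution that mixes channels
    return depthwise + pointwise


def multiply_accumulates(params, out_shape):
    """MACs for a stride-1, same-padded layer: each weight is used once per output position."""
    t, h, w = out_shape
    return params * t * h * w


# Hypothetical layer sizes, chosen only for illustration (not taken from the paper).
C_IN, C_OUT = 64, 128
OUT_SHAPE = (16, 56, 56)  # frames, height, width

standard = conv3d_params(C_IN, C_OUT)
separable = separable_conv3d_params(C_IN, C_OUT)
ratio = separable / standard

print(f"standard:  {standard} weights, {multiply_accumulates(standard, OUT_SHAPE)} MACs")
print(f"separable: {separable} weights, {multiply_accumulates(separable, OUT_SHAPE)} MACs")
print(f"separable / standard = {ratio:.4f}")
```

For a 3×3×3 kernel the ratio works out to 1/C_out + 1/27, so under these assumed sizes the separable layer needs roughly 4.5% of the weights and multiply-accumulates of the standard one.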

    Abstract:

    On intensive sheep farms, changes in behavior can indicate whether a sheep has a physical abnormality. For example, when sheep are sick, their rumination and feeding times change significantly, so behavioral observation is one way to assess their health. Identifying animal behavior can provide a basis for disease prevention and rational feeding, and so supports closer attention to animal health and welfare. Animal behavior recognition has therefore long been a focus for researchers and production managers. Traditional manual observation requires continuous human monitoring, and fatigue from long working hours tends to introduce subjective errors into the results. Sensor-based methods that require direct contact with the animal’s body tend to stress the animal, affecting its health and production performance. A deep learning network model, AdRes3D-BiLSTM, was proposed that combines a three-dimensional residual convolutional neural network, a bidirectional long short-term memory network, and an attention mechanism. The AdRes3D component introduced depthwise separable convolution, which reduces computational complexity and improves network efficiency. In addition, an actionnet attention mechanism based on motion principles was embedded in the AdRes3D section to direct the network’s focus toward behavioral detail. This improved the model’s ability to extract behavioral key points across consecutive video frames, and thus its capacity for feature extraction in both the temporal and spatial dimensions. The extracted feature vectors were then fed into the BiLSTM module, which filtered and updated the temporal features in both directions, and the sheep behaviors were finally recognized. A dataset of 6000 distinct videos was collected to train the proposed model.
This dataset covered different individual sheep, time periods, lighting conditions, and poses. An additional set of 1200 behavioral videos, distinct from those used in training, was selected as the test data. The experimental results showed that the AdRes3D-BiLSTM model achieved a comprehensive recognition accuracy of 98.72% across five basic sheep behaviors: standing, lying, feeding, walking, and ruminating. Compared with five other network architectures (C3D, R(2+1)D, Res3D, Res3D-LSTM, and Res3D-BiLSTM), the AdRes3D-BiLSTM model improved on every recognition metric. Relative to these models, in that order, precision improved by 11.32, 6.24, 4.34, 2.04 and 1.52 percentage points; recall improved by 11.78, 6.38, 4.38, 2.12 and 1.68 percentage points; F1-score improved by 11.70, 6.35, 4.38, 2.08 and 1.60 percentage points; and accuracy improved by 11.97, 6.33, 4.37, 2.32 and 2.01 percentage points. The method also reached a frame rate of 52.79 frames per second (FPS), a recognition speed that meets real-time processing requirements. In addition, a continuous 24-hour video segment was randomly selected from the collected videos to validate the model in a real-world environment.
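The four metrics reported above can be computed from a confusion matrix as in the following sketch. It assumes macro averaging over the five behavior classes, which the abstract does not specify, and the confusion matrix is made-up illustrative data, not the paper's results. The last part only subtracts the stated accuracy gains from 98.72% to show the baseline accuracies those gains imply.

```python
LABELS = ["standing", "lying", "feeding", "walking", "ruminating"]


def per_class_metrics(cm):
    """Return (precision, recall, f1) per class; cm[i][j] counts true class i predicted as j."""
    n = len(cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum: everything predicted as k
        actual_k = sum(cm[k])                          # row sum: everything truly k
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results.append((precision, recall, f1))
    return results


def summarize(cm):
    """Macro-averaged precision, recall and F1, plus overall accuracy."""
    per_class = per_class_metrics(cm)
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n))
    return {
        "precision": sum(p for p, _, _ in per_class) / n,
        "recall": sum(r for _, r, _ in per_class) / n,
        "f1": sum(f for _, _, f in per_class) / n,
        "accuracy": correct / total,
    }


# Made-up confusion matrix for illustration only (NOT data from the paper).
EXAMPLE_CM = [
    [48, 1, 0, 1, 0],
    [1, 47, 0, 0, 2],
    [0, 0, 49, 0, 1],
    [2, 0, 0, 48, 0],
    [0, 3, 1, 0, 46],
]
for name, value in summarize(EXAMPLE_CM).items():
    print(f"{name}: {value:.4f}")

# Baseline accuracies implied by the abstract: 98.72% minus each stated gain.
PROPOSED_ACCURACY = 98.72
ACCURACY_GAIN_PP = {
    "C3D": 11.97,
    "R(2+1)D": 6.33,
    "Res3D": 4.37,
    "Res3D-LSTM": 2.32,
    "Res3D-BiLSTM": 2.01,
}
implied_baselines = {
    model: round(PROPOSED_ACCURACY - gain, 2) for model, gain in ACCURACY_GAIN_PP.items()
}
print(implied_baselines)
```

A gain stated in percentage points is an absolute difference, so each baseline accuracy is recovered by plain subtraction rather than by dividing by a relative factor.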
This study provides new methods and ideas for animal behavior recognition based on video streams, and offers a basis for further research and practical implementation.
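The bidirectional processing that the abstract attributes to the BiLSTM module can be sketched as two LSTM passes over the same feature sequence, one in each direction, with the two hidden states paired at every time step. This is a simplified illustration under stated assumptions: the cell is scalar, the weights are hand-picked hypothetical values, and the same weights are reused for both directions, whereas a trained model operates on feature vectors with separately learned weights.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h, c, w):
    """One step of a scalar LSTM cell; w maps gate name -> (w_x, w_h, bias)."""
    def pre(name):
        w_x, w_h, b = w[name]
        return w_x * x + w_h * h + b

    i = sigmoid(pre("i"))    # input gate: how much new information to admit
    f = sigmoid(pre("f"))    # forget gate: how much old cell state to keep
    o = sigmoid(pre("o"))    # output gate: how much cell state to expose
    g = math.tanh(pre("g"))  # candidate cell value
    c = f * c + i * g
    h = o * math.tanh(c)
    return h, c


def run_lstm(seq, w):
    """Run the cell over seq from first to last element, returning every hidden state."""
    h, c, outputs = 0.0, 0.0, []
    for x in seq:
        h, c = lstm_step(x, h, c, w)
        outputs.append(h)
    return outputs


def bilstm(seq, w_fwd, w_bwd):
    """Pair the forward state (past context) with the backward state (future context)."""
    fwd = run_lstm(seq, w_fwd)
    bwd = run_lstm(seq[::-1], w_bwd)[::-1]  # process reversed, then realign to time order
    return list(zip(fwd, bwd))


# Hypothetical weights and per-frame features, for illustration only.
W = {
    "i": (0.5, 0.1, 0.0),
    "f": (0.4, 0.2, 1.0),
    "o": (0.6, 0.1, 0.0),
    "g": (0.9, 0.3, 0.0),
}
features = [0.2, 0.8, -0.5, 0.1]
states = bilstm(features, W, W)
for t, (forward_h, backward_h) in enumerate(states):
    print(t, round(forward_h, 4), round(backward_h, 4))
```

At each time step the forward state has seen only earlier frames and the backward state only later ones, which is why pairing them gives the classifier context from both sides of a frame.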

Cite this article

袁洪波,曹润柳,程曼.融合Res3D、BiLSTM和注意力机制的羊只行为识别方法[J].农业机械学报,2024,55(4):221-230.
YUAN Hongbo, CAO Runliu, CHENG Man. Fusion of Res3D, BiLSTM and Attention Mechanism for Sheep Behavior Recognition Method[J]. Transactions of the Chinese Society for Agricultural Machinery, 2024, 55(4): 221-230.

History
  • Received: 2023-08-20
  • Published online: 2024-04-10