Estimation of Pig Weight Based on Cross-modal Feature Fusion Model

Fund project: Ministry of Finance and Ministry of Agriculture and Rural Affairs of China, China Agriculture Research System (CARS-35)



Abstract:

To address the problem of accurate pig body-weight measurement, a cross-modality feature fusion model (CFF-ResNet) was proposed. It makes full use of the complementarity between the texture and contour information of visible-light (RGB) images and the spatial structure information of depth images, enabling contact-free intelligent weight measurement of pigs in a group-housing environment. First, top-view RGB and depth images of the pen were acquired and registered, and each target pig was segmented at the pixel level, from coarse to fine, with the EdgeFlow algorithm. Then, a two-stream architecture was built on the ResNet50 network; gates inserted inside the network form bidirectional connections that effectively combine the features of the RGB stream and the depth stream, achieving cross-modal feature fusion. Finally, each stream regressed an estimate of the pig's weight, and the two estimates were averaged to obtain the final measurement. For the experiments, data were collected from group-housed pigs at a boar breeding farm, and a dataset of 9842 registered RGB-depth image pairs was constructed, comprising 6909 training pairs and 2933 test pairs. The proposed model achieved a mean absolute error of 3.019 kg and an average accuracy of 96.132% on the test set. Compared with the RGB-based and depth-based single-modality baseline models, it reduced the mean absolute error by 18.095% and 12.569%, respectively. It also outperformed other existing pig weight measurement methods, namely a conventional image-processing model, an improved EfficientNetV2 model, an improved DenseNet201 model, and a BotNet+DBRB+PFC model, reducing the mean absolute error by 46.272%, 14.403%, 8.847%, and 11.414%, respectively. The results show that the model effectively learns cross-modal features, meets the high-accuracy requirement of pig weight measurement, and provides technical support for weight measurement of pigs in group-housing environments.
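The two error figures quoted above can be reproduced from per-animal predictions with simple metrics. The sketch below assumes that "average accuracy" means the mean per-animal relative accuracy; the paper's exact definition may differ, and the weights used here are hypothetical.

```python
# Evaluation metrics for weight regression: mean absolute error (MAE, kg)
# and average accuracy (assumed here to be mean per-animal relative
# accuracy, in percent).

def mae(y_true, y_pred):
    """Mean absolute error in kg."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def average_accuracy(y_true, y_pred):
    """Mean of per-animal relative accuracy, in percent (assumed definition)."""
    return 100.0 * sum(1.0 - abs(t - p) / t
                       for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy example with hypothetical ground-truth and predicted weights (kg):
truth = [80.0, 95.0, 110.0]
pred = [82.0, 93.5, 111.0]
print(mae(truth, pred))               # 1.5
print(round(average_accuracy(truth, pred), 2))  # 98.34
```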

    Abstract:

    In recent years, with the increasing scale of pig farming worldwide, farms urgently need automated livestock information management systems to ensure animal welfare. As one of the key indicators of pig growth, body weight helps farmers track the health status of their pigs. Traditional methods measure pig weight manually, which is time-consuming and laborious. With the development of image processing technology, estimating pig weight from images has opened a path toward intelligent weight measurement. However, most recent studies considered only one image modality, either RGB or depth, ignoring the complementary information between the two. To address this issue, a cross-modality feature fusion model, CFF-ResNet, was proposed. It makes full use of the complementarity between the texture and contour information of RGB images and the spatial structure information of depth images to estimate pig weight without human contact in a group-farming environment. Firstly, top-view RGB and depth images of the piggery were acquired, and the correspondence between the pixel coordinates of the two modalities was used to align them. The EdgeFlow algorithm was then used to segment each target pig at the pixel level in a coarse-to-fine manner, filtering out irrelevant background information. A two-stream architecture was constructed based on the ResNet50 network, and bidirectional connections were formed by inserting internal gates to effectively combine the features of the RGB and depth streams for cross-modal feature fusion. Finally, the two streams regressed pig weight predictions separately, and the final weight estimate was obtained by averaging them.
In the experiment, data were collected from a commercial pig farm in Henan, and a dataset of 9842 pairs of aligned RGB and depth images was constructed, comprising 6909 training pairs and 2933 test pairs. The experimental results showed that the mean absolute error of the proposed model on the test set was 3.019 kg, a reduction of 18.095% and 12.569% compared with the RGB-based and depth-based single-stream baseline models, respectively. The average accuracy of the proposed method reached 96.132%. Notably, the model adds no additional training parameters compared with directly using two single-stream models to process RGB and depth images separately. Its mean absolute error was 46.272%, 14.403%, 8.847%, and 11.414% lower than those of other existing methods: a conventional image-processing method, an improved EfficientNetV2 model, an improved DenseNet201 model, and a BotNet+DBRB+PFC model, respectively. In addition, to verify the effectiveness of cross-modal feature fusion, a series of ablation experiments explored alternative two-stream connections, including unidirectional and bidirectional, additive and multiplicative variants; the bidirectional additive connection performed best among all alternatives. These results show that the proposed model can effectively learn cross-modal features and meet the requirements of accurate pig weight measurement, providing effective technical support for pig weight measurement in group-farming environments.
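The gated bidirectional additive connection and the final averaging step described above can be sketched on plain feature vectors as follows. This is a minimal illustration, not the paper's implementation: the sigmoid gate form, the feature dimension, and all numeric values are assumptions (the actual gates operate on feature maps inside ResNet50 stages).

```python
# Sketch of a bidirectional additive cross-modal connection: each stream
# receives a gated copy of the other stream's features, matching the
# best-performing variant in the ablation study. Gate form is assumed.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def bidirectional_additive_fusion(rgb_feat, depth_feat, gate_rgb, gate_depth):
    """Exchange information between the RGB and depth streams.

    gate_rgb / gate_depth are per-channel gate logits (hypothetical);
    a sigmoid squashes them to [0, 1] before the additive transfer.
    """
    g_r = [sigmoid(g) for g in gate_rgb]    # controls depth -> RGB flow
    g_d = [sigmoid(g) for g in gate_depth]  # controls RGB -> depth flow
    rgb_out = [r + g * d for r, d, g in zip(rgb_feat, depth_feat, g_r)]
    depth_out = [d + g * r for r, d, g in zip(rgb_feat, depth_feat, g_d)]
    return rgb_out, depth_out

def fused_weight_estimate(weight_rgb, weight_depth):
    """Average the two streams' regressed weights (kg), as in the paper."""
    return 0.5 * (weight_rgb + weight_depth)

# Toy usage with hypothetical 4-dimensional features and zero gate logits
# (sigmoid(0) = 0.5, so half of each stream crosses over):
rgb, depth = [1.0, 2.0, 0.5, 0.0], [0.2, 1.0, 0.8, 0.4]
rgb2, depth2 = bidirectional_additive_fusion(rgb, depth, [0.0] * 4, [0.0] * 4)
print(fused_weight_estimate(92.4, 94.0))  # 93.2
```

The additive form keeps each stream's own features intact and only adds gated evidence from the other modality, which is why it can be inserted into both backbones without any extra regression parameters.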

Cite this article:

HE Wei, MI Yang, LIU Gang, DING Xiangdong, LI Tao. Estimation of Pig Weight Based on Cross-modal Feature Fusion Model[J]. Transactions of the Chinese Society for Agricultural Machinery, 2023, 54(S1): 275-282, 329.

History
  • Received: 2023-06-20
  • Published online: 2023-12-10