Cross-modal Image and Text Retrieval Method for Lycium Barbarum Pests by Integrating Attention Mechanism
Author: LIU Libo, ZHAO Feifei

Fund Project: National Natural Science Foundation of China (61862050) and Natural Science Foundation of Ningxia (2020AAC03031)

    Abstract:

    In recent years, with changing climatic conditions and the introduction of cultivation techniques, the planting area of Lycium has gradually expanded, and it has become one of the important economic crops in Ningxia and the wider northwestern region. Lycium hosts many insect species and has poor resistance to pests. It is highly susceptible to infestation, which severely affects yield and quality and causes serious economic losses. It is therefore important for the development of the industry to retrieve information about Lycium pests quickly and accurately so that control can be timely and accurate. To address the problem that existing crop pest retrieval systems support only a single modality, cross-modal retrieval was introduced into the field of Lycium pest retrieval, taking images and text descriptions of 17 common Lycium pests as the research object, and a cross-modal image and text retrieval method integrating the attention mechanism was proposed. Firstly, a Transformer model and a recurrent neural network (LSTM) were used to obtain fine-grained image and text feature sequences with context information, respectively. Then, the attention mechanism was used to aggregate the feature sequences and capture the salient semantic information in images and texts. Finally, in order to explore the semantic correlation between different modalities, a cross-media joint loss function was used to constrain the model. The experimental results showed that the averaged MAP (mean average precision) of the proposed method on the self-built Lycium pest image-text cross-modal dataset reached 0.458. Compared with eight existing methods, the averaged MAP was improved by 0.011~0.195, outperforming all of them. The proposed method can provide technical support and an algorithmic reference for diversified retrieval of crop pest information.
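    The aggregation and evaluation steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the dot-product scoring, the `query` context vector, and all function names are assumptions made for illustration, and the abstract does not specify how the attention weights or the averaged MAP are computed. The sketch pools a feature sequence into one vector with softmax attention weights and computes mean average precision from ranked retrieval labels.

```python
import math


def attention_pool(features, query):
    """Aggregate a sequence of feature vectors into a single vector.

    features: list of equal-length vectors (e.g. image-region or word features).
    query: hypothetical context vector of the same length (an assumption here;
           in a trained model it would be learned).
    Returns (pooled_vector, attention_weights).
    """
    # Score each sequence element by its dot product with the query vector.
    scores = [sum(f_i * q_i for f_i, q_i in zip(f, query)) for f in features]
    # Softmax over the sequence, shifted by the max for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the sequence elements, dimension by dimension.
    dim = len(features[0])
    pooled = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return pooled, weights


def average_precision(ranked_labels, query_label):
    """Average precision for one query, given retrieved labels in rank order."""
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label == query_label:
            hits += 1
            precision_sum += hits / rank
    # A query with no relevant item in the ranking contributes 0.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings):
    """rankings: list of (query_label, ranked_labels) pairs."""
    return sum(average_precision(r, q) for q, r in rankings) / len(rankings)


if __name__ == "__main__":
    pooled, weights = attention_pool([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
    print(pooled, weights)
    print(mean_average_precision([("a", ["a", "b"]), ("b", ["a", "b"])]))
```

    One plausible reading of "averaged MAP" is the mean of the image-to-text and text-to-image MAP values; under that assumption, `mean_average_precision` would be called once per retrieval direction and the two results averaged.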

Cite this article

LIU Libo, ZHAO Feifei. Cross-modal Image and Text Retrieval Method for Lycium Barbarum Pests by Integrating Attention Mechanism[J]. Transactions of the Chinese Society for Agricultural Machinery, 2022, 53(2): 299-308.

History
  • Received: 2020-12-28
  • Published online: 2021-07-09