Crop Variety Information Extraction Method Based on Integrated Adversarial Training

Fund Project: Henan Province Science and Technology Innovation Outstanding Talent Project
    Abstract:

    China has many crop varieties, poorly standardized variety resource information, and low model training accuracy. To address these problems, this study took seven crops (wheat, rice, maize, soybean, cotton, peanut, and rapeseed) as its objects and used parameters such as variety, morphology, yield, and quality as indicators to construct 83 crop variety entities. The data were annotated manually, and a four-layer network model for crop variety information extraction (BERT-PGD-BiLSTM-CRF) was proposed by incorporating adversarial training. The model uses BERT (bidirectional encoder representations from transformers), built on a deep bidirectional Transformer, as the pre-trained model to obtain semantic representations of characters and words. It applies PGD (projected gradient descent) adversarial training to add perturbations to the samples, which improves the robustness and generalization of the model. A bidirectional long short-term memory (BiLSTM) network learns long-distance text information, and a conditional random field (CRF) learns label constraint information. A comparison of the training results of 18 different information extraction models showed that the proposed BERT-PGD-BiLSTM-CRF model achieved a precision of 95.4%, a recall of 97.0%, and an F1 score of 96.2%. This indicates that the BERT-PGD-BiLSTM-CRF model with adversarial training can extract crop variety information effectively and provides a technical reference for agricultural information extraction.
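The PGD step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: a toy logistic scorer stands in for the BERT-BiLSTM-CRF stack, and the names `loss_and_grad`, `pgd_perturb`, `epsilon`, `alpha`, and `steps`, along with all numeric values, are assumptions chosen only for illustration. The sketch shows the general idea of taking several normalized gradient-ascent steps on an embedding and projecting the result back into an L2 ball around the original.

```python
import math

def l2_norm(vec):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(x * x for x in vec))

def loss_and_grad(emb, w, y):
    """Logistic loss of a toy linear scorer and its gradient w.r.t. the embedding.

    The scorer is a hypothetical stand-in for the real network.
    """
    z = sum(e * wi for e, wi in zip(emb, w))
    p = 1.0 / (1.0 + math.exp(-z))
    # Numerically stable form of log(1 + exp(z)) - y * z
    loss = max(z, 0.0) + math.log1p(math.exp(-abs(z))) - y * z
    grad = [(p - y) * wi for wi in w]
    return loss, grad

def pgd_perturb(emb, w, y, epsilon=1.0, alpha=0.5, steps=3):
    """Return a perturbed embedding inside an L2 ball of radius epsilon around emb."""
    adv = list(emb)
    for _ in range(steps):
        _, grad = loss_and_grad(adv, w, y)
        g_norm = l2_norm(grad)
        if g_norm == 0.0:
            break
        # Ascent step: move along the normalized gradient to increase the loss
        adv = [a + alpha * g / g_norm for a, g in zip(adv, grad)]
        # Projection: rescale the total perturbation if it leaves the epsilon-ball
        delta = [a - e for a, e in zip(adv, emb)]
        d_norm = l2_norm(delta)
        if d_norm > epsilon:
            delta = [epsilon * d / d_norm for d in delta]
        adv = [e + d for e, d in zip(emb, delta)]
    return adv

# Toy values (assumed, not taken from the paper)
emb = [0.5, -1.0, 0.25, 2.0]
w = [1.0, 0.5, -2.0, 0.75]
y = 1

clean_loss, _ = loss_and_grad(emb, w, y)
adv = pgd_perturb(emb, w, y)
adv_loss, _ = loss_and_grad(adv, w, y)
print(f"clean loss: {clean_loss:.4f}, adversarial loss: {adv_loss:.4f}")
```

In adversarial training, the loss on such perturbed embeddings would typically be added to the training objective so that the model learns to stay correct under small input shifts; how the paper combines the two is not stated in the abstract.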

Cite this article

XU Xin, MA Wenzheng, ZHANG Hao, MA Xinming, QIAO Hongbo. Crop Variety Information Extraction Method Based on Integrated Adversarial Training[J]. Transactions of the Chinese Society for Agricultural Machinery, 2023, 54(12): 272-279, 337.

History
  • Received: 2023-08-25
  • Published online: 2023-09-26