Joint Entity-relation Extraction Method for Citrus Diseases and Pests Based on DPNA-CASREL

Fund Project: National Key Research and Development Program of China (2023YFD2101001)

    Abstract:

    To address overlapping triples and the difficulty of extracting nested and complex entities from text in the citrus diseases and pests domain, a joint entity-relation extraction method based on DPNA-CASREL (dual-pointer network annotation-cascade binary tagging framework for relational triple extraction) is proposed. The encoder combines the pre-trained model RoBERTa-wwm-ext (robustly optimized BERT pre-training approach with whole word masking and extended training data) with a bi-directional long short-term memory network (BiLSTM) to obtain multi-dimensional vector encodings of the text. A decoding network with dual-pointer network annotation is designed around the characteristics of the citrus diseases and pests corpus: a multi-level pointer network annotation method is introduced in head entity decoding, and a complex entity labeling strategy is adopted in the tail entity decoding network to improve the extraction of complex entities. Entity-relation triples are thereby extracted jointly, which handles overlapping triples and nested entities. On a self-built citrus diseases and pests dataset, the DPNA-CASREL model reached a precision, recall, and F1-score of 82.12%, 81.97%, and 82.05%, respectively, outperforming the other models compared. Relative to CASREL, the F1-scores for nested and complex entity extraction improved by 8.16 and 6.58 percentage points, respectively, which indicates that the method effectively addresses entity nesting and unclear entity boundaries. The method can provide a basis for constructing a citrus diseases and pests knowledge graph and for other downstream tasks.
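
    The pointer-style annotation mentioned in the abstract can be illustrated with a small sketch. It assumes the generic start/end binary tagging used by CASREL-style decoders, where each start tag is paired with the nearest end tag at or after it. The multi-level variant shown here (one start/end tag pair per nesting level) is only an assumption about how nested entities could be separated; the abstract does not give the authors' exact scheme. The function names `decode_spans` and `decode_multilevel` and the example tokens are hypothetical.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # inclusive (start, end) token indices


def decode_spans(start_tags: List[int], end_tags: List[int]) -> List[Span]:
    """Pair each start tag with the nearest end tag at or after it."""
    spans: List[Span] = []
    for i, is_start in enumerate(start_tags):
        if is_start != 1:
            continue
        for j in range(i, len(end_tags)):
            if end_tags[j] == 1:
                spans.append((i, j))
                break  # nearest end only, as in single-level tagging
    return spans


def decode_multilevel(levels: List[Tuple[List[int], List[int]]]) -> List[Span]:
    """Assumed multi-level scheme: decode each tag level on its own, then merge."""
    spans: List[Span] = []
    for start_tags, end_tags in levels:
        spans.extend(decode_spans(start_tags, end_tags))
    return sorted(set(spans))


# Illustrative tokens only; not taken from the paper's dataset.
tokens = ["citrus", "canker", "leaf", "lesion"]

# Single level: both entities start at token 0, so only the inner span survives.
single = decode_spans([1, 0, 0, 0], [0, 1, 0, 1])

# Two levels: the outer span is tagged on level 0, the inner span on level 1.
level0 = ([1, 0, 0, 0], [0, 0, 0, 1])
level1 = ([1, 0, 0, 0], [0, 1, 0, 0])
nested = decode_multilevel([level0, level1])

print(single)
print(nested)
```

    With a single tag layer, two entities that share a start token collapse into the shorter span, which is one way nested entities get lost. Giving each nesting level its own tag layer keeps both spans recoverable, at the cost of more output layers to train.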

Cite this article

WU Yelan, YU Wanying, QIN Qing, LIAN Xiaoqin, YU Chongchong, WU Jingzhu. Joint Entity-relation Extraction Method for Citrus Diseases and Pests Based on DPNA-CASREL[J]. Transactions of the Chinese Society for Agricultural Machinery, 2026, 57(5): 398-406.

History
  • Received: 2024-10-31
  • Published online: 2026-03-01