Recognition of Animal Drug Pathogenicity Named Entity Based on Att-Aux-BERT-BiLSTM-CRF
Author: YANG Lu, ZHANG Tian, ZHENG Limin, TIAN Lijun

Fund Project:

Beijing Modern Agricultural Industry Technology System Innovation Team Project (BAIC02-2020) and National Key Research and Development Program of China (2017YFC1601803)

    Abstract:

    Building a knowledge graph of veterinary drug pathogenicity requires named entity recognition (NER) over veterinary drug text. Traditional NER methods rely on hand-crafted features, which are time-consuming and labor-intensive to design, and the available veterinary drug pathogenicity corpus is small. To address both problems, a veterinary drug text NER model, Att-Aux-BERT-BiLSTM-CRF, is proposed, which extends BERT-BiLSTM-CRF with an attention mechanism (Attention) and an auxiliary classification layer (Auxiliary layer). The text is first vectorized by the pre-trained BERT model, whose output is fed to a bi-directional long short-term memory (BiLSTM) network. The output of the BERT layer serves as the auxiliary classification layer and the output of the BiLSTM layer serves as the main classification layer (Main layer); the attention mechanism combines the two to improve overall performance. The combined output is then passed to a conditional random field (CRF), giving an end-to-end deep learning framework suited to entity recognition in the veterinary drug domain. The experiments used 10643 sentences (485711 characters) of veterinary drug text and targeted four entity types: animal, drug, adverse reaction, and intake mode. The results show that the model effectively identifies entities in veterinary drug pathogenicity text, with an F1 score of 96.7%.
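The abstract describes combining an auxiliary classification layer (on the BERT output) with a main classification layer (on the BiLSTM output) through attention before CRF decoding, but it does not give the exact formulation. The following is a minimal pure-Python sketch under stated assumptions: the attention is reduced to two hypothetical scalar scores normalized by softmax, the per-token label scores and the transition matrix are made-up toy values, and `LABELS`, `fuse_heads`, and `viterbi` are illustrative names, not the authors' implementation.

```python
import math

# Illustrative label subset (assumption); the paper uses four entity types.
LABELS = ["O", "B-DRUG", "I-DRUG"]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_heads(aux_emissions, main_emissions, aux_score, main_score):
    """Weighted sum of the two heads' per-token label scores.

    aux_score and main_score are hypothetical scalar attention scores;
    softmax turns them into weights that sum to 1.
    """
    w_aux, w_main = softmax([aux_score, main_score])
    return [
        [w_aux * a + w_main * m for a, m in zip(aux_row, main_row)]
        for aux_row, main_row in zip(aux_emissions, main_emissions)
    ]

def viterbi(emissions, transitions):
    """Best label path for CRF-style scores; transitions[i][j] scores i -> j."""
    n_labels = len(emissions[0])
    best = list(emissions[0])
    backptrs = []
    for row in emissions[1:]:
        step_best, step_ptr = [], []
        for j in range(n_labels):
            prev = max(range(n_labels), key=lambda i: best[i] + transitions[i][j])
            step_ptr.append(prev)
            step_best.append(best[prev] + transitions[prev][j] + row[j])
        best = step_best
        backptrs.append(step_ptr)
    path = [max(range(n_labels), key=lambda i: best[i])]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return path

# Toy scores for a 3-token sentence (made-up numbers, rows are tokens).
aux_emissions = [[2.0, 0.5, 0.1], [0.2, 1.5, 0.3], [0.4, 0.2, 1.0]]   # BERT head
main_emissions = [[1.0, 0.2, 0.0], [0.1, 2.0, 0.5], [0.3, 0.1, 2.0]]  # BiLSTM head
# Penalize the invalid BIO transition O -> I-DRUG.
transitions = [[0.0, 0.0, -10.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

fused = fuse_heads(aux_emissions, main_emissions, aux_score=0.0, main_score=0.0)
path = viterbi(fused, transitions)
print([LABELS[i] for i in path])  # ['O', 'B-DRUG', 'I-DRUG']
```

With equal attention scores both heads receive weight 0.5. In a trained model the weights would be learned, and the CRF transition scores would be parameters rather than hand-set values.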

Cite this article

YANG Lu, ZHANG Tian, ZHENG Limin, TIAN Lijun. Recognition of Animal Drug Pathogenicity Named Entity Based on Att-Aux-BERT-BiLSTM-CRF[J]. Transactions of the Chinese Society for Agricultural Machinery, 2022, 53(3): 294-300.

History
  • Received: 2021-02-02
  • Published online: 2022-03-10