Crop Disease Diagnosis Method Based on Fusion of Multiple Types of Data from Plant EMRs

Fund project: National Natural Science Foundation of China (62176261)

    Abstract:

    Rapid diagnosis of crop diseases is crucial for agricultural production. Plant electronic medical records (EMRs) record a large amount of information on disease symptoms, environmental characteristics and diagnostic prescriptions in both structured and unstructured forms, and thus provide a high-quality source of knowledge for intelligent disease diagnosis. However, their small sample sizes, the lack of publicly available datasets and the coexistence of multiple data types make related research difficult. To handle the mixed data types in plant EMRs, a crop disease diagnosis model based on BERT-MLP data fusion optimized with an attention mechanism (BERT-MLP data fusion model based on attention mechanism, BM-Att) was proposed. First, the BERT pre-trained language model was used to extract semantic features from the unstructured text of the EMR. Second, one-hot encoding and a multi-layer perceptron (MLP) were used to encode the structured data and expand the dimension of the resulting vectors. Finally, an attention mechanism was applied in the feature fusion stage to emphasize key features, and multiple fully connected layers produced the disease diagnosis. To validate the model, a dataset covering 15 diseases of four crops (tomato, cucumber, lettuce and watermelon) was constructed, and ablation experiments were conducted. BM-Att was also compared with common text-processing models (CNN, RCNN, AttRNN, FastText, Transformer, BERT and ERNIE) and with different data fusion strategies (BERT-ALEX, BERT-1dCNN, BERT-1dLSTM, BERT-1dAttLSTM, BERT-MLP, ERNIE-ALEX, ERNIE-1dCNN, ERNIE-1dLSTM, ERNIE-1dAttLSTM and ERNIE-MLP). The results showed that BM-Att achieved the best performance, with accuracy, precision, recall and F1-score (macro-averaged) of 95.82%, 96.38%, 95.48% and 95.85%, respectively, on the test set, indicating that it can diagnose crop diseases effectively. Adding the attention mechanism in the feature fusion stage raised the macro-averaged F1-score by 1.47 percentage points and significantly improved the classification of small-sample diseases such as lettuce downy mildew and watermelon nematode disease. The study can provide a reference for mining EMR data and for intelligent assisted disease diagnosis.
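
    The abstract reports macro-averaged scores, which weight each disease class equally regardless of how many samples it has. The following is a minimal sketch of how such scores are commonly computed, not the authors' evaluation code; the function name `macro_metrics` and the toy labels are hypothetical and are not taken from the paper's dataset.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall and F1."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # class p was predicted but is wrong
            fn[t] += 1  # true class t was missed

    def safe_div(a, b):
        return a / b if b else 0.0

    precisions, recalls, f1s = [], [], []
    for c in labels:
        prec = safe_div(tp[c], tp[c] + fp[c])
        rec = safe_div(tp[c], tp[c] + fn[c])
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(safe_div(2 * prec * rec, prec + rec))

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        # Macro averaging: unweighted mean of the per-class scores.
        "macro_precision": sum(precisions) / n,
        "macro_recall": sum(recalls) / n,
        "macro_f1": sum(f1s) / n,
    }


# Toy example (hypothetical labels): one frequent class, one small-sample class.
y_true = ["tomato_late_blight"] * 4 + ["lettuce_downy_mildew"] * 2
y_pred = ["tomato_late_blight"] * 4 + ["lettuce_downy_mildew", "tomato_late_blight"]
scores = macro_metrics(y_true, y_pred)
```

    Because every class counts equally, a single error on the small-sample class lowers macro recall and macro F1 far more than it lowers accuracy, which is why gains on small-sample diseases show up clearly in the macro-averaged F1. Macro F1 computed this way is the mean of per-class F1 scores, so it generally differs from the harmonic mean of macro precision and macro recall.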

Cite this article

DING Junqi, LI Bo, QIAO Yan, ZHANG Lingxian. Crop Disease Diagnosis Method Based on Fusion of Multiple Types of Data from Plant EMRs[J]. Transactions of the Chinese Society for Agricultural Machinery, 2023, 54(1): 196-204, 223.

History
  • Received: 2022-03-23
  • Published online: 2023-01-10