ZHENG Limin, QIAO Zhenduo, TIAN Lijun, YANG Lu. Multi-label Classification of Food Safety Regulatory Issues Based on BERT-LEAM[J]. Transactions of the Chinese Society for Agricultural Machinery, 2021, 52(7): 244-250, 158.
Multi-label Classification of Food Safety Regulatory Issues Based on BERT-LEAM
Received: 2020-09-29
DOI:10.6041/j.issn.1000-1298.2021.07.026
Keywords: food safety regulations; multi-label classification; BERT; BERT-LEAM
Funding: National Key Research and Development Program of China (2017YFC1601803)
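Authors and affiliations:
ZHENG Limin, China Agricultural University
QIAO Zhenduo, China Agricultural University
TIAN Lijun, China Agricultural University
YANG Lu, China Agricultural University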
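Abstract (translated from the Chinese version): In a food safety regulation question answering system, single-label text classification cannot fully capture the useful information contained in a question about food safety regulations. To improve on single-label classification, a multi-label text classification method based on BERT-LEAM (bidirectional encoder representations from transformers-label embedding attentive model) is proposed, reflecting the different food safety perspectives and levels that a question can involve. A multi-angle, hierarchical labeling scheme assigns multiple labels to each question text. The BERT pre-trained language model represents contextual feature information, and an attention mechanism learns the dependency between labels and text and aggregates the word embeddings, so that label information is used in the classification process. Experiments show that classification on the coarse-grained multi-label data set is clearly better than on the fine-grained multi-label data set, and that text feature representation with BERT outperforms Word2Vec. BERT-LEAM achieves F1-W values of 93.35% and 79.81% on the coarse-grained and fine-grained multi-label data sets, respectively, outperforming the other classification models.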
Abstract: Effective classification of questions about food safety regulations is key to building a food safety regulation question answering system. To improve on single-label text classification, a multi-label text classification method based on the bidirectional encoder representations from transformers-label embedding attentive model (BERT-LEAM) was proposed, reflecting the different food safety perspectives and levels that a question can involve. A multi-angle, hierarchical labeling method was used to assign multiple labels to a single question text, and the BERT pre-trained language model was introduced to represent contextual feature information. An attention mechanism learned the dependency between labels and text and aggregated the word embeddings, so that the labels were used in the text classification process. The experimental results showed that classification on the coarse-grained multi-label data set was better than on the fine-grained multi-label data set, and that text feature representation with BERT was better than with Word2Vec. BERT-LEAM achieved F1-W values of 93.35% and 79.81% on the coarse-grained and fine-grained multi-label data sets, respectively, which was better than the other classification models. Question classification for the food safety regulation question answering system was thus realized effectively with BERT-LEAM, laying the foundation for implementing the rest of the question answering system.
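The label-attentive aggregation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes token vectors that stand in for BERT outputs, label vectors in the same space, cosine similarity as the label-token compatibility score, and a softmax over tokens. The function names (`label_attentive_pool`, `predict_labels`), the toy vectors, and the linear layer weights are all hypothetical, and details such as phrase-level smoothing of the scores are omitted.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def label_attentive_pool(token_vecs, label_vecs):
    """Aggregate token vectors into one text vector with label-guided attention.

    token_vecs: list of token embeddings (assumed stand-ins for BERT outputs).
    label_vecs: list of label embeddings in the same vector space.
    """
    # Score each token by its best cosine match against any label.
    scores = [max(cosine(t, c) for c in label_vecs) for t in token_vecs]
    # Normalize the scores into attention weights over the tokens.
    weights = softmax(scores)
    dim = len(token_vecs[0])
    # The text vector is the attention-weighted sum of the token vectors.
    pooled = [sum(w * t[d] for w, t in zip(weights, token_vecs)) for d in range(dim)]
    return pooled, weights

def predict_labels(text_vec, layer_weights, layer_bias, threshold=0.5):
    """Multi-label decision: one independent sigmoid per label (hypothetical layer)."""
    probs = []
    for row, bias in zip(layer_weights, layer_bias):
        logit = sum(a * b for a, b in zip(text_vec, row)) + bias
        probs.append(1.0 / (1.0 + math.exp(-logit)))
    return [p >= threshold for p in probs], probs

# Toy data: three tokens and two labels in a 2-dimensional space (illustrative only).
tokens = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
labels = [[1.0, 0.0], [0.0, 1.0]]
text_vec, attn = label_attentive_pool(tokens, labels)
flags, probs = predict_labels(text_vec, [[1.0, 1.0], [-1.0, -1.0]], [0.0, 0.0])
print(attn, flags)
```

Because every label gets its own sigmoid rather than sharing one softmax, several labels can be assigned to the same question, which is what distinguishes the multi-label setting from single-label classification.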
