WANG Xuanhui, CHEN Jianyi, ZHENG Xilai, ZHU Cheng, WANG Xuanli, SHAN Chunzhi. Inversion of Cadmium Content in Agriculture Soil Based on SGA-RF Algorithm[J]. Transactions of the Chinese Society for Agricultural Machinery, 2018, 49(10): 261-269.
Inversion of Cadmium Content in Agriculture Soil Based on SGA-RF Algorithm
Submitted: 2018-04-12
DOI:10.6041/j.issn.1000-1298.2018.10.029
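Keywords (Chinese version): agricultural soil; cadmium concentration; characteristic wavelength selection; Spearman's rank correlation analysis; genetic algorithm; random forest
Funding: Key Program of the National Natural Science Foundation of China (41731280) and National Natural Science Foundation of China (11701310)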
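Authors and affiliations:
WANG Xuanhui: Ocean University of China; Qingdao Agricultural University
CHEN Jianyi: Qingdao Agricultural University
ZHENG Xilai: Ocean University of China
ZHU Cheng: China Unicom Ji’nan Software Research Institute
WANG Xuanli: Shanxi Institute of Technology
SHAN Chunzhi: North China Sea Environmental Monitoring Center, State Oceanic Administration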
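Abstract (Chinese version): In hyperspectral detection of heavy metals in agricultural soil, the high dimensionality and high redundancy of the near-infrared spectra of soil cadmium seriously degrade the accuracy and stability of hyperspectral inversion models. To address this problem, this paper proposes a genetic random forest feature selection algorithm based on Spearman correlation analysis (SGA-RF). The algorithm first applies a Spearman-correlation-based feature pre-selection method to the initial feature set, screening out a large number of redundant bands and retaining the feature bands most strongly correlated with cadmium. Then, in the fine selection stage, it uses a fitness evaluation method based on random forest, which combines the strong global search ability of the genetic algorithm with the high inversion ability of the random forest algorithm, improves the ability to distinguish similar individuals, and yields the optimal subset of feature bands with minimum redundancy and maximum discriminative power. To verify the effectiveness of the proposed algorithm, 124 representative soil samples from the Dagu River basin in Qingdao were selected as experimental subjects; SGA-RF reduced the original 2051 bands to the 37 most representative sensitive bands, and the resulting model was compared with models built using existing feature selection algorithms. The experimental results show that this feature selection method combined with a random forest regression model achieves a low root mean square error of prediction (0.0601), a high correlation coefficient (0.9502), and a residual predictive deviation of 2.03. As an important step in applying visible/near-infrared spectroscopy to the quantitative inversion of cadmium concentration in agricultural soil, the SGA-RF algorithm achieves good inversion performance with few sensitive bands and can provide a theoretical basis for monitoring heavy metal pollution in soil.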
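The pre-selection stage described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`rank`, `spearman`, `preselect_bands`), the fixed `keep` count, and the synthetic data are all assumptions made for illustration, and the sketch only ranks bands by relevance to the target without modelling redundancy between bands. Ties are broken by position rather than averaged, which is acceptable for continuous reflectance values.

```python
import numpy as np


def rank(values):
    """Return 0-based ranks of a 1-D array (ties broken by position)."""
    order = np.argsort(values, kind="stable")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(len(values))
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = rank(x), rank(y)
    rx -= rx.mean()
    ry -= ry.mean()
    denom = np.sqrt((rx ** 2).sum() * (ry ** 2).sum())
    return float((rx * ry).sum() / denom) if denom > 0 else 0.0


def preselect_bands(spectra, cadmium, keep):
    """Keep the `keep` bands whose |rho| with cadmium is largest.

    spectra: (n_samples, n_bands) reflectance matrix
    cadmium: (n_samples,) measured concentrations
    Returns (selected band indices in ascending order, rho per band).
    """
    rho = np.array([spearman(spectra[:, j], cadmium)
                    for j in range(spectra.shape[1])])
    top = np.argsort(-np.abs(rho))[:keep]
    return np.sort(top), rho


# Synthetic demonstration (not the paper's data): 124 samples, 50 bands,
# with the target depending monotonically on bands 3 and 17 only.
rng = np.random.default_rng(0)
spectra = rng.normal(size=(124, 50))
cadmium = np.exp(spectra[:, 3] - spectra[:, 17])
selected, rho = preselect_bands(spectra, cadmium, keep=5)
print(selected)
```

Because Spearman's coefficient depends only on ranks, the exponential relationship in the synthetic target does not weaken the score of the two informative bands, which is the usual reason to prefer it over Pearson correlation for a monotone but nonlinear response.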
WANG Xuanhui  CHEN Jianyi  ZHENG Xilai  ZHU Cheng  WANG Xuanli  SHAN Chunzhi
Ocean University of China; Qingdao Agricultural University, Qingdao Agricultural University, Ocean University of China, China Unicom Ji’nan Software Research Institute, Shanxi Institute of Technology, and The Environmental Monitoring Center of North China Sea
Key Words:agriculture soil  cadmium content  characteristic wavelength selection  Spearman’s rank correlation analysis  genetic algorithm  random forest
Abstract: In hyperspectral detection of heavy metal pollution in agricultural soils, the accuracy and stability of hyperspectral inversion models for soil cadmium are seriously affected by the high dimensionality and high redundancy of visible/NIR spectra. To solve this problem, a Spearman’s rank correlation analysis-based genetic algorithm using random forest (SGA-RF) was proposed to select characteristic wavelengths from hyperspectral data. In the first layer of feature selection, a Spearman correlation analysis-based method was applied to remove redundancy among the spectral features and retain the characteristic wavelengths most relevant to cadmium content. In the second layer, a new fitness function based on random forest was proposed, combining the strong global search ability of the genetic algorithm with the high inversion ability of random forest. Using this fitness function to evaluate the viability of individuals improved the ability to distinguish similar individuals and yielded an optimal spectral feature subset with minimum redundancy and maximum differentiation. To verify the validity of the proposed algorithm, a total of 124 representative soil samples were collected from the Dagu River Basin. The optimal feature subset, which contained 37 sensitive wavelengths, was used to build a soil available cadmium content inversion model, and its performance was compared with that of models built with existing feature selection methods. Results indicated that the method selected the fewest wavelength features while achieving a lower root mean square error of prediction (0.0601), a higher correlation coefficient (0.9502) and a residual predictive deviation of 2.03. As an important step in the quantitative inversion of cadmium concentration using visible/NIR spectra, the research could provide a theoretical basis for monitoring soil heavy metal pollution.
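The second selection layer and the reported metrics can be sketched as follows, under stated assumptions. The abstract does not give the fitness formula, so this sketch swaps in a least-squares model where the paper uses a random forest, scores each band mask by hold-out RMSE, and adds a small per-band penalty to favour compact subsets. The names `ga_select`, `holdout_rmse` and `cost`, the GA settings, and the synthetic data are all hypothetical. RPD is computed as the standard deviation of the reference values divided by the RMSE.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def rpd(y_true, y_pred):
    """Residual predictive deviation: SD of reference values / RMSE."""
    return float(np.std(y_true, ddof=1) / rmse(y_true, y_pred))


def holdout_rmse(mask, X_tr, y_tr, X_te, y_te):
    """Stand-in fitness: hold-out RMSE of a least-squares model on the
    masked bands. The paper uses a random forest here instead."""
    if not mask.any():
        return np.inf
    A_tr = np.column_stack([X_tr[:, mask], np.ones(len(X_tr))])
    A_te = np.column_stack([X_te[:, mask], np.ones(len(X_te))])
    coef, *_ = np.linalg.lstsq(A_tr, y_tr, rcond=None)
    return rmse(y_te, A_te @ coef)


def ga_select(n_bands, cost, pop_size=30, generations=40,
              p_mut=0.05, seed=0):
    """Minimise cost(mask) over boolean band masks with a simple GA:
    tournament selection, uniform crossover, bit-flip mutation, elitism."""
    rng = np.random.default_rng(seed)
    pop = rng.random((pop_size, n_bands)) < 0.5
    costs = np.array([cost(ind) for ind in pop])
    for _ in range(generations):
        children = [pop[np.argmin(costs)].copy()]  # elitism: keep the best
        while len(children) < pop_size:
            # Tournament of two for each parent.
            a, b = rng.integers(pop_size, size=2)
            p1 = pop[a] if costs[a] < costs[b] else pop[b]
            c, d = rng.integers(pop_size, size=2)
            p2 = pop[c] if costs[c] < costs[d] else pop[d]
            child = np.where(rng.random(n_bands) < 0.5, p1, p2)
            child ^= rng.random(n_bands) < p_mut  # bit-flip mutation
            children.append(child)
        pop = np.array(children)
        costs = np.array([cost(ind) for ind in pop])
    best = np.argmin(costs)
    return pop[best], float(costs[best])


# Synthetic demonstration (not the paper's data).
rng = np.random.default_rng(1)
X = rng.normal(size=(124, 20))
y = 2.0 * X[:, 2] - 3.0 * X[:, 5] + 1.5 * X[:, 11] + 0.1 * rng.normal(size=124)
X_tr, y_tr, X_te, y_te = X[:90], y[:90], X[90:], y[90:]


def cost(mask):
    # Hold-out error plus a small per-band penalty (an assumed way to
    # favour compact subsets; the abstract does not give the exact form).
    return holdout_rmse(mask, X_tr, y_tr, X_te, y_te) + 0.01 * mask.sum()


best_mask, best_cost = ga_select(X.shape[1], cost)
print(np.flatnonzero(best_mask), round(best_cost, 4))
```

Because `ga_select` only needs a callable that maps a boolean mask to a cost, a cross-validated random forest error could be substituted for `holdout_rmse` without changing the search loop.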

Transactions of the Chinese Society for Agricultural Machinery (Transactions of the CSAM), supervised by the China Association for Science and Technology (CAST) and sponsored by the CSAM and the Chinese Academy of Agricultural Mechanization Science (CAAMS), began publication in 1957. It is the earliest Chinese-language interdisciplinary journal combining agriculture and engineering. It closely follows the development of the agricultural engineering disciplines, and its published papers represent the highest academic level of agricultural engineering in China. To date, nearly 8,000 papers have been published. About 3,000 papers are submitted to the journal each year, of which only about 600 are accepted. Transactions of the CSAM covers a wide range of topics, including agricultural machinery, irrigation, electronics, robotics, agro-products engineering, biological energy, and agricultural structures and environment. The journal is indexed by many internationally well-known indexing services, such as EI Compendex, CA, and CSA.
