Abstract: Accurately and efficiently detecting straw coverage is crucial for soil protection and sustainable agriculture: straw coverage not only affects soil fertility and moisture retention but also plays a key role in controlling soil erosion and improving the ecological environment. However, existing straw coverage detection models are susceptible in practical applications to interference from natural environmental factors such as lighting and shadows. When straw and soil are highly similar in color and texture, the accuracy of these models drops significantly, leading to inaccurate coverage assessments and ultimately reducing the efficiency and reliability of farmland management decisions. To address the challenges posed by the diverse morphology of straw in images captured by vehicle-mounted cameras, including reflections and shadows, a novel semantic segmentation method, UMU-KAN, was proposed for detecting straw of varying scales in natural environments. The conventional dilated convolutions in the atrous spatial pyramid pooling module were replaced with depth-wise dilated separable convolutions to enhance the extraction of fine-grained straw detail. In addition, a strip pooling branch captured features of widely spaced straw more effectively, and skip connections integrated the feature information from the different branches to reduce information loss. Together, these improvements formed a mixed pooling dilated spatial pyramid module, applied to the top semantic layer of the backbone network to obtain multi-scale information for sparsely distributed straw. Furthermore, a unified attention fusion module was introduced in the decoding phase to restore detailed edge information in the straw segmentation, enabling the model to better learn features from different levels.
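The two branch types described above can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation; class names, channel counts, and the exact fusion of the strip-pooled features are assumptions made for illustration only. It shows (a) a depth-wise dilated convolution followed by a 1×1 point-wise convolution, the "depth-wise dilated separable" replacement for a standard dilated convolution, and (b) horizontal/vertical strip pooling with a skip connection back to the input.

```python
# Hypothetical sketch of the modules named in the abstract; names and
# hyperparameters are assumptions, not the published UMU-KAN code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DepthwiseDilatedSeparableConv(nn.Module):
    """Depth-wise dilated 3x3 conv (one filter per channel) + 1x1 point-wise conv."""
    def __init__(self, in_ch, out_ch, dilation):
        super().__init__()
        # groups=in_ch makes the 3x3 convolution depth-wise; padding=dilation
        # keeps the spatial size unchanged for a 3x3 kernel.
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, padding=dilation,
                                   dilation=dilation, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

class StripPooling(nn.Module):
    """Horizontal and vertical strip pooling for long, widely spaced structures."""
    def __init__(self, ch):
        super().__init__()
        self.pool_h = nn.AdaptiveAvgPool2d((1, None))  # collapse height to 1
        self.pool_v = nn.AdaptiveAvgPool2d((None, 1))  # collapse width to 1
        self.conv_h = nn.Conv2d(ch, ch, (1, 3), padding=(0, 1), bias=False)
        self.conv_v = nn.Conv2d(ch, ch, (3, 1), padding=(1, 0), bias=False)

    def forward(self, x):
        h, w = x.shape[2:]
        # Pool along one axis, refine with a 1-D conv, expand back to H x W.
        sh = F.interpolate(self.conv_h(self.pool_h(x)), size=(h, w),
                           mode="bilinear", align_corners=False)
        sv = F.interpolate(self.conv_v(self.pool_v(x)), size=(h, w),
                           mode="bilinear", align_corners=False)
        # Skip connection back to the input to reduce information loss.
        return torch.relu(sh + sv) + x
```

In a mixed pooling pyramid of this kind, several `DepthwiseDilatedSeparableConv` branches with different dilation rates would typically run in parallel with the `StripPooling` branch on the backbone's top feature map, and their outputs would be concatenated.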
Experimental results demonstrated that UMU-KAN achieved a mean intersection over union (mIoU) of 85.36% and a mean pixel accuracy (mPA) of 91.71% on the constructed straw dataset. Compared with the Unet, Swin-Unet, and DeepLabv3+ models, UMU-KAN improved mIoU by 4.20, 3.26, and 1.25 percentage points, respectively, and mPA by 3.58, 2.39, and 0.77 percentage points, respectively. Additionally, the parameter count of UMU-KAN was significantly lower than that of Swin-Unet and DeepLabv3+. UMU-KAN successfully achieved accurate detection of straw in images captured by agricultural machinery cameras, ensuring high detection efficiency even under dynamic and uncontrolled outdoor conditions. This not only highlighted the model’s adaptability and precision but also further demonstrated the significant developmental potential of the KAN architecture in the field of precision agriculture, contributing to the promotion of sustainable agricultural practices and enhancing the efficiency of agricultural management.