Abstract: To address the limitations of current methods in detecting chestnuts of varying scales under natural conditions, a multi-scale chestnut detection method, YOLO 11-MCS, was proposed based on an improved YOLO 11 model. First, a novel multi-scale key feature aggregation (MKFA) module was designed and integrated into the C3k2 module to form the C3k2-MKFA feature extraction module, which captured features at different scales and enhanced multi-scale feature extraction. Second, the CGAFPN network was introduced; it incorporated a small object detection layer through a content-guided attention module and increased the relative contribution of small chestnut objects to multi-scale object detection, overcoming the deficiencies of the original algorithm in multi-scale and small object detection. Finally, a shared convolution separated batch normalization (SCSB) detection head was presented, which used shared convolution and separated batch normalization structures to extract cross-scale features efficiently and enhance feature consistency across scales, improving multi-scale object detection performance. Experimental results showed that the improved model achieved a chestnut detection precision of 88.2%, a recall of 79.2%, and an average precision of 87.2%, improvements of 0.8, 5.9, and 5.5 percentage points, respectively, over the original YOLO 11 network. The model with channel-wise feature distillation achieved an average precision of 84.7% with a model size of 6.0 MB. When deployed on the Jetson Nano using the Infer inference library, the detection speed was 23 ms per image, meeting the requirements for chestnut detection.
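The abstract names channel-wise feature distillation without describing its loss. The sketch below illustrates one common formulation, offered as an assumption rather than the authors' implementation: each channel of a teacher and a student feature map is turned into a probability distribution over spatial positions with a temperature-scaled softmax, and the student is penalized by the KL divergence from the teacher, averaged over channels. The function names, the `temperature` parameter, and the flat-list representation of a feature map are hypothetical choices made for illustration.

```python
import math


def spatial_log_softmax(channel, temperature=1.0):
    """Log-probabilities over the spatial positions of one flattened channel."""
    scaled = [v / temperature for v in channel]
    peak = max(scaled)  # subtract the max for numerical stability
    log_norm = peak + math.log(sum(math.exp(v - peak) for v in scaled))
    return [v - log_norm for v in scaled]


def channel_wise_kd_loss(teacher, student, temperature=1.0):
    """Mean over channels of KL(teacher || student), scaled by temperature**2.

    teacher, student: lists of channels; each channel is a flat list of
    activations (an assumed layout, e.g. an H*W map flattened row by row).
    """
    if len(teacher) != len(student):
        raise ValueError("teacher and student need the same number of channels")
    total = 0.0
    for t_ch, s_ch in zip(teacher, student):
        if len(t_ch) != len(s_ch):
            raise ValueError("channels must have the same spatial size")
        log_p = spatial_log_softmax(t_ch, temperature)
        log_q = spatial_log_softmax(s_ch, temperature)
        # KL divergence between the two spatial distributions of this channel
        total += sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return (temperature ** 2) * total / len(teacher)
```

Under this formulation the loss is zero when the student reproduces the teacher's spatial distribution in every channel, and it is unaffected by a constant offset within a channel, because the softmax only depends on differences between positions.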