Abstract: Sugarcane is a globally important crop for both sugar production and bioenergy, and it is widely cultivated in tropical and subtropical regions. Effective disease diagnosis is essential for ensuring agricultural productivity and economic returns. To address the challenges posed by complex field environments (such as uneven lighting), as well as low recognition accuracy and limited detection efficiency, a novel sugarcane disease classification algorithm using a global-local attention mechanism (SDCA-GLAM) was proposed. To enhance model capacity, the linear projection layers in a modified Vision Transformer (ViT) were replaced with deformable convolution modules, enabling adaptive extraction of lesion textures and leaf-edge information. Re-parameterized convolution was incorporated to strengthen spatial positional encoding, and deep convolutional modules were embedded in the multilayer perceptron to extract high-dimensional semantic features. To improve both accuracy and efficiency, a parallel global-local self-attention architecture was designed: the local branch leveraged window attention to refine fine-grained textures, while the global branch reduced the spatial dimensions of the key/value vectors via pooling and aggregated critical region information using a hyperparameter α. Finally, LayerNorm was replaced with BatchNorm to reduce the memory and time overhead caused by frequent reshaping. Experimental results on an 11-class sugarcane leaf dataset demonstrated that SDCA-GLAM achieved an accuracy of 88.26%, a throughput of 1,620 images per second, and a model size of 2.76×10^7 parameters. The proposed method outperformed mainstream models in both accuracy and efficiency, making it suitable for real-time mobile deployment in field diagnosis of sugarcane conditions.
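The parallel global-local attention described above can be illustrated with a minimal single-head sketch. This is not the paper's implementation: it uses NumPy, omits the learned Q/K/V projections (Q = K = V = the input feature map), and assumes that α linearly blends the two branch outputs, which is only one plausible reading of the hyperparameter's role. The window and pool sizes are placeholder values.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Scaled dot-product attention: q (n, d), k/v (m, d) -> (n, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def global_local_attention(x, window=4, pool=4, alpha=0.5):
    """Hypothetical sketch: x is an (H, W, d) feature map.

    Local branch: attention restricted to non-overlapping window x window
    patches (fine-grained textures). Global branch: keys/values are
    average-pooled, shrinking the attention map (coarse context).
    """
    H, W, d = x.shape

    # --- local branch: windowed self-attention ---
    local = np.empty_like(x)
    for i in range(0, H, window):
        for j in range(0, W, window):
            patch = x[i:i + window, j:j + window].reshape(-1, d)
            out = attention(patch, patch, patch)
            local[i:i + window, j:j + window] = out.reshape(window, window, d)

    # --- global branch: pool K/V spatially before attending ---
    kv = x.reshape(H // pool, pool, W // pool, pool, d).mean(axis=(1, 3))
    kv = kv.reshape(-1, d)                       # (H*W / pool^2, d)
    glob = attention(x.reshape(-1, d), kv, kv).reshape(H, W, d)

    # alpha blends the branches (assumed role of the paper's hyperparameter)
    return alpha * glob + (1 - alpha) * local
```

Pooling the keys/values reduces the attention matrix from (H·W)×(H·W) to (H·W)×(H·W/pool²), which is the source of the efficiency gain the abstract attributes to the global branch.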