HRformer: Hierarchical Regression Transformer for Infrared Small-Target Detection
Abstract
Infrared small-target detection is the task of detecting small targets in infrared images with low signal-to-noise ratios and complex backgrounds. It is essential in applications such as maritime rescue and traffic management. However, because of low image resolution, small target size, and inconspicuous features, infrared targets are easily submerged in background noise and clutter, and accurately recovering the shape of small infrared targets remains a challenge. To address these problems, an infrared small-target detection algorithm based on a hierarchical regression transformer (HRformer) network is proposed. Specifically, the PixelUnShuffle operation downsamples the original image to produce the inputs of the different network levels, providing multiscale information while minimizing the loss of original image information. The PixelShuffle operation upsamples the output feature map of each level, improving the flexibility of the network. Next, a cross-attention fusion module with spatial and channel attention branches enables information interaction between features at different network levels, achieving efficient feature fusion and information complementarity. Finally, a local–global transformer structure is proposed that combines the ordinary Transformer, which has a large receptive field, with the window-based Transformer, which has low computational complexity, to further improve detection performance while reducing computational cost. The proposed structure can model global dependencies while extracting local context information. Experimental results show that the proposed algorithm achieves higher detection accuracy with fewer parameters than several advanced infrared small-target detection algorithms, which makes it suitable for practical applications.
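To illustrate why PixelUnShuffle-style downsampling can reduce spatial resolution without discarding pixel values, the following minimal sketch rearranges a single-channel image into r*r lower-resolution channels and then restores it exactly. It is an illustration only and not the HRformer implementation: the function names `pixel_unshuffle` and `pixel_shuffle`, the nested-list image representation, and the 4 x 4 toy input are assumptions made for this example.

```python
def pixel_unshuffle(image, r):
    """Rearrange an H x W image into r*r channels of size (H/r) x (W/r)."""
    h, w = len(image), len(image[0])
    if h % r or w % r:
        raise ValueError("height and width must be divisible by r")
    # Channel i * r + j collects the pixels at row offset i and column offset j
    # inside every r x r block, so no pixel value is dropped.
    return [
        [[image[y * r + i][x * r + j] for x in range(w // r)]
         for y in range(h // r)]
        for i in range(r)
        for j in range(r)
    ]


def pixel_shuffle(channels, r):
    """Inverse rearrangement: interleave r*r channels into one larger image."""
    if len(channels) != r * r:
        raise ValueError("expected r*r channels")
    h, w = len(channels[0]), len(channels[0][0])
    return [
        [channels[(y % r) * r + (x % r)][y // r][x // r] for x in range(w * r)]
        for y in range(h * r)
    ]


# Hypothetical 4 x 4 toy image with pixel values 0..15.
image = [[y * 4 + x for x in range(4)] for y in range(4)]
channels = pixel_unshuffle(image, 2)   # four 2 x 2 channels
restored = pixel_shuffle(channels, 2)  # back to one 4 x 4 image
```

In this sketch the round trip reproduces the input exactly, whereas strided or pooled downsampling would discard or average pixel values.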