Abstract:
Power equipment generates a large number of infrared images during operation. When the equipment in an infrared image is densely arranged, tilted, and has a large aspect ratio, a target detection network based on horizontal rectangular boxes can only give the approximate position of each target: the boxes tend to overlap neighboring targets and include redundant background information, so the detection results are not sufficiently accurate. To solve this problem, we introduce a rotated rectangular box mechanism into the RetinaNet target detection network, apply Mosaic data augmentation at the network input, and replace the ReLU function in the original backbone network with the Mish activation function, which gives a smoother gradient flow. A PAN module is added after the FPN module of the original model to further fuse image features. Finally, a data set is built from infrared images of power equipment collected on-site, and the improved model is compared with three target detection networks based on horizontal rectangular box positioning: Fast R-CNN, YOLOv3, and the original RetinaNet. The experiments show that the improved model detects inclined infrared targets of power equipment in dense scenes more accurately, and that its detection accuracy for multi-category power equipment is higher than that of the three comparison models.
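As a rough illustration of two ideas named in the abstract, the sketch below computes how much of a horizontal box is background when it encloses a tilted, elongated target, and evaluates the Mish activation in its standard form, x · tanh(ln(1 + e^x)). The function names and the (cx, cy, w, h, theta) box parameterization are assumptions made for this example; the abstract does not state how the rotated box is encoded or how the model is implemented.

```python
import math


def mish(x: float) -> float:
    """Mish activation: x * tanh(softplus(x)), softplus(x) = ln(1 + e^x)."""
    # Numerically stable softplus: max(x, 0) + log1p(exp(-|x|)).
    softplus = max(x, 0.0) + math.log1p(math.exp(-abs(x)))
    return x * math.tanh(softplus)


def rotated_box_corners(cx, cy, w, h, theta):
    """Corners of a w-by-h box centred at (cx, cy), rotated by theta radians.

    The five-parameter encoding is an assumption for illustration only.
    """
    c, s = math.cos(theta), math.sin(theta)
    half_extents = [(-w / 2, -h / 2), (w / 2, -h / 2),
                    (w / 2, h / 2), (-w / 2, h / 2)]
    # Rotate each corner offset about the centre, then translate.
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c)
            for dx, dy in half_extents]


def enclosing_horizontal_box(corners):
    """Smallest axis-aligned box (x_min, y_min, x_max, y_max) around corners."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return min(xs), min(ys), max(xs), max(ys)


def background_fraction(w, h, theta):
    """Share of the enclosing horizontal box not covered by the rotated box."""
    corners = rotated_box_corners(0.0, 0.0, w, h, theta)
    x0, y0, x1, y1 = enclosing_horizontal_box(corners)
    return 1.0 - (w * h) / ((x1 - x0) * (y1 - y0))


if __name__ == "__main__":
    # A 10:1 target tilted by 45 degrees, as a hypothetical elongated component.
    print(f"background fraction: {background_fraction(10.0, 1.0, math.pi / 4):.3f}")
    print(f"mish(-1.0) = {mish(-1.0):.4f}, relu(-1.0) = {max(-1.0, 0.0):.4f}")
```

For a target with a 10:1 aspect ratio tilted by 45 degrees, the enclosing horizontal box is roughly 83% background, which is the kind of redundancy a rotated box avoids. Unlike ReLU, Mish passes a small negative value for negative inputs and has no kink at zero.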