This article presents an automated system for recognizing surface defects on metal bars, covering the hardware, the recognition algorithm, and the integrated recognition system. First, a visual recognition hardware platform is established; this includes selecting the camera and lens and fixing the hardware model and brand. Because the bar surface is reflective, the light source scheme of the system is also described. For the recognition algorithm, Mask R-CNN serves as the base network for extracting metal bar features. In Mask R-CNN's feature extraction, reducing the channels of the pyramid features causes information loss, and fusing features across multiple levels causes aliasing. To address these problems, a bottom-up reverse fusion path is added to the original feature extraction network so that shallow feature information is fully used. In addition, a hierarchical attention mechanism and a difference region attention module are added along the path that produces the feature maps. Finally, in the experimental stage, the framework of the visual detection system is completed, and the results support the effectiveness of the proposed method in terms of recognition accuracy and speed.
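The bottom-up reverse fusion path can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes single-channel feature maps stored as nested lists, uses 2x2 average pooling as a stand-in for a learned stride-2 convolution, and fuses levels by element-wise addition. The names `downsample_2x`, `add_maps`, `bottom_up_fusion`, and `constant_map` are hypothetical and introduced only for this example.

```python
from typing import List

# Single-channel H x W feature map; a real network would use multi-channel tensors.
FeatureMap = List[List[float]]


def downsample_2x(fm: FeatureMap) -> FeatureMap:
    """Halve the spatial resolution with 2x2 average pooling.

    Assumption: this replaces the learned stride-2 convolution that a
    trained network would normally use at this step.
    """
    h, w = len(fm) // 2, len(fm[0]) // 2
    return [
        [
            (fm[2 * i][2 * j] + fm[2 * i][2 * j + 1]
             + fm[2 * i + 1][2 * j] + fm[2 * i + 1][2 * j + 1]) / 4.0
            for j in range(w)
        ]
        for i in range(h)
    ]


def add_maps(a: FeatureMap, b: FeatureMap) -> FeatureMap:
    """Element-wise sum of two feature maps of identical shape."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def bottom_up_fusion(pyramid: List[FeatureMap]) -> List[FeatureMap]:
    """Propagate shallow (fine) features upward into deeper (coarse) levels.

    `pyramid` is ordered fine -> coarse, and each level is assumed to be
    exactly half the height and width of the previous one.
    """
    fused = [pyramid[0]]  # the finest level is passed through unchanged
    for level in pyramid[1:]:
        fused.append(add_maps(level, downsample_2x(fused[-1])))
    return fused


def constant_map(size: int, value: float) -> FeatureMap:
    """Build a size x size map filled with one value (toy input only)."""
    return [[value] * size for _ in range(size)]


# Toy pyramid: 8x8, 4x4 and 2x2 levels with constant activations.
pyramid = [constant_map(8, 1.0), constant_map(4, 2.0), constant_map(2, 3.0)]
fused = bottom_up_fusion(pyramid)
```

In this toy pyramid each coarser level accumulates the pooled response of every finer level below it, which is the sense in which the reverse path carries shallow information into the deeper feature maps. The attention modules mentioned above are not modeled here.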