This article addresses the large assembly errors and the frequent omissions and mistakes that currently occur in the mechanical assembly process. Computer vision is introduced into the assembly process, and visual images are used to estimate assembly errors, thereby improving assembly accuracy. To this end, the neural network is improved by adding attention and measurement mechanisms, which strengthens its ability to extract and distinguish features from assembly images. Deep learning algorithms are then used to estimate the assembly features in the image. Simulation experiments show that the proposed algorithm achieves a 94.7% improvement in assembly accuracy and error estimation accuracy.