Non-motor vehicles are widely used in urban and rural transportation systems because of their portability, but violations involving them occur frequently and are difficult to supervise automatically, given their large numbers, varied styles, and small size. To address this problem, this paper presents a non-motor vehicle violation detection algorithm that combines efficient target detection with deliberate logical calculation. First, a fast and accurate target detection network is constructed by fusing two different types of attention mechanism. Specifically, the Squeeze-and-Excitation Network is employed to optimize the extraction of local features, which effectively reduces the error rate of target detection. Meanwhile, the Transformer Network is adopted to strengthen the extraction of global features, which improves target localization for tracking in high-density scenarios. Because global and local features are integrated through attention, the proposed network can accurately identify and locate numerous small targets in real time and avoid the identity switching caused by target occlusion. Finally, non-motor vehicle violations are recognized in real time through logical calculation over the target features and their motion trajectories. Experiments on datasets show that the detection accuracy of the proposed algorithm exceeds that of current mainstream algorithms; in particular, the accuracy on small targets such as Head and Helmet is higher than 92.2%. In addition, the number of ID switches is reduced by more than 60% compared with the classical Deep SORT algorithm. In real-life scenarios, the proposed algorithm also shows excellent accuracy and real-time performance in detecting non-motor vehicle violations.
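
To make the channel-attention idea concrete, the following is a minimal sketch of a generic Squeeze-and-Excitation block in plain Python. It is not the network proposed in this paper: the function name `se_block`, the weight matrices `w1` and `w2`, and the omission of bias terms are all assumptions made for illustration. It shows only the three standard steps: squeeze (global average pooling), excitation (a bottleneck of two fully connected layers ending in a sigmoid), and channel-wise rescaling.

```python
import math


def se_block(feature_map, w1, w2):
    """Generic squeeze-and-excitation channel reweighting (illustrative only).

    feature_map: list of C channels, each an H x W list of lists of floats.
    w1: hypothetical reduction weights, shape (C // r) x C, biases omitted.
    w2: hypothetical expansion weights, shape C x (C // r), biases omitted.
    Returns the rescaled feature map and the per-channel gates.
    """
    # Squeeze: global average pooling yields one descriptor per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation, step 1: bottleneck fully connected layer followed by ReLU.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(weights, squeezed)))
        for weights in w1
    ]
    # Excitation, step 2: expand back to C channels; sigmoid gives gates in (0, 1).
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(weights, hidden))))
        for weights in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates
```

Calling `se_block` on a two-channel feature map with a reduction ratio of 2 returns one gate per channel, so channels with a larger gate are passed through nearly unchanged while channels with a smaller gate are attenuated. In a trained detector the weights would be learned rather than supplied by hand.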