English Abstract
Purpose: With the advent of COVID-19, dynamic hand gesture recognition has become increasingly important in hospitals. To reduce the risk of contact infection, we develop a gesture recognition system for the electronic nursing whiteboard.
Methods: For better efficiency and accuracy, the system adopts MediaPipe to detect the hand in each image and extract its 21 key points; from these, we build a dataset pairing the original images with their corresponding 21 key points. For gesture recognition, we propose a dual-stream deep learning model: one stream consists of a 3DCNN (3D Convolutional Neural Network) followed by three ConvLSTM (Convolutional Long Short-Term Memory) layers, abbreviated as 3DCNN-ConvLSTM, and the other consists of three LSTM layers, abbreviated as LSTM. For convenience, the dual-stream model is abbreviated as 3DCNN-ConvLSTM+LSTM. For training, hand images reconstructed from the 21 key points in the dataset are fed into the first stream (3DCNN-ConvLSTM), the original 21 key points of each corresponding hand image are fed into the second stream (LSTM), and the features of the two streams are merged through multi-layer fusion to train the dual-stream model.
Results: The original 3DCNN-ConvLSTM model achieves 57.5% accuracy on grayscale images; an improved version of the 3DCNN-ConvLSTM model achieves 95% accuracy on the hand images reconstructed from the key points. Furthermore, the dual-stream 3DCNN-ConvLSTM+LSTM model reaches a higher accuracy of 97.5% and converges more quickly and smoothly. In addition, the dual-stream model raises the transmission rate by more than 146 times.
Conclusions: Our proposed model not only raises the accuracy from 57.5% to 97.5% but also requires only a tiny fraction of the original transmission time, below 0.685%.
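To illustrate the key-point extraction step described in the Methods, the following is a minimal sketch using the MediaPipe Hands solution in Python. The frame count, single-hand assumption, and the helper name extract_keypoint_sequence are illustrative choices, not details taken from the paper.

```python
# Minimal sketch: extracting the 21 hand key points per frame with MediaPipe Hands.
# OpenCV is assumed for video decoding; extract_keypoint_sequence is an
# illustrative helper name, not from the paper.
import cv2
import mediapipe as mp
import numpy as np

mp_hands = mp.solutions.hands

def extract_keypoint_sequence(video_path, max_frames=30):
    """Return an array of shape (frames, 21, 3) of normalized (x, y, z) key points."""
    frames = []
    cap = cv2.VideoCapture(video_path)
    with mp_hands.Hands(static_image_mode=False,
                        max_num_hands=1,
                        min_detection_confidence=0.5) as hands:
        while cap.isOpened() and len(frames) < max_frames:
            ok, frame = cap.read()
            if not ok:
                break
            # MediaPipe expects RGB input; OpenCV decodes frames as BGR.
            results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            if results.multi_hand_landmarks:
                lm = results.multi_hand_landmarks[0].landmark
                frames.append([[p.x, p.y, p.z] for p in lm])  # 21 landmarks
    cap.release()
    return np.array(frames, dtype=np.float32)
```

Each returned frame holds the 21 normalized key points that feed the LSTM stream and from which the hand images for the 3DCNN-ConvLSTM stream can be reconstructed.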
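The dual-stream architecture could be sketched in Keras as below. All layer widths, filter counts, sequence length, image size, number of gesture classes, and the use of a plain concatenation in place of the paper's multi-layer fusion are assumptions for illustration only, not the authors' exact configuration.

```python
# Illustrative Keras sketch of a dual-stream model in the spirit of 3DCNN-ConvLSTM+LSTM.
# Shapes, filter counts, and the concatenation-based fusion are assumed, not the paper's.
from tensorflow.keras import layers, models

SEQ_LEN, H, W = 30, 64, 64   # assumed frame count and reconstructed-image size
NUM_CLASSES = 8              # assumed number of gesture classes

# Stream 1: 3DCNN followed by stacked ConvLSTM layers on reconstructed hand images.
img_in = layers.Input(shape=(SEQ_LEN, H, W, 1))
x = layers.Conv3D(16, (3, 3, 3), padding="same", activation="relu")(img_in)
x = layers.MaxPooling3D((1, 2, 2))(x)
x = layers.ConvLSTM2D(32, (3, 3), padding="same", return_sequences=True)(x)
x = layers.ConvLSTM2D(32, (3, 3), padding="same", return_sequences=True)(x)
x = layers.ConvLSTM2D(32, (3, 3), padding="same", return_sequences=False)(x)
x = layers.GlobalAveragePooling2D()(x)

# Stream 2: stacked LSTMs on the raw 21 key points (x, y, z) per frame.
kp_in = layers.Input(shape=(SEQ_LEN, 21 * 3))
y = layers.LSTM(64, return_sequences=True)(kp_in)
y = layers.LSTM(64, return_sequences=True)(y)
y = layers.LSTM(64)(y)

# Fusion: a plain concatenation stands in for the paper's multi-layer fusion.
fused = layers.Concatenate()([x, y])
out = layers.Dense(NUM_CLASSES, activation="softmax")(fused)

model = models.Model([img_in, kp_in], out)
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
```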