English Abstract |
Chinese misspelling detection technology can be applied in fields such as education and publishing, and the topic has attracted considerable research attention. Although many recent studies have proposed deep learning models that improve detection accuracy, these models suffer from high false alarm rates. In real application scenarios, reducing false alarms is important because they degrade the user experience. A model with a low false alarm rate and high efficiency is therefore needed. This paper uses the BERT Single Sentence Tagging task model to address the Chinese misspelling detection problem. To train this model, methods for generating training data at scale were designed. Experiments showed that the proposed method achieves a false alarm rate of 0.0297 on the SIGHAN 2015 test set. Compared with previous methods that have low false alarm rates, this method has both the lowest false alarm rate and the highest recall. |