Abstract
In recent years, the Internet has provided more and more information for people in daily life. Due to the limitations of information retrieval techniques, however, the information retrieved may not be relevant or helpful to users. In the past few years, two research topics in natural language processing, question answering and machine comprehension, have attracted much attention because of their important applications in information retrieval and chatbots. In this paper, we use the pre-trained Google BERT model as a word embedding model to form semantic sentence features based on single words and phrases. Following different question answering strategies, we compute the cosine similarity between these sentence representations and choose the option with the highest similarity score as the machine-inferred answer. In experiments on the TOEFL-QA dataset for English and the Formosa Grand Challenge dataset for Chinese, the proposed method was compared with a bidirectional GRU and a strong alignment IR baseline, achieving accuracies of 34.87% and 57.5%, respectively. Despite the grammatical differences between languages, our model is capable of processing multilingual questions with performance comparable to existing methods.
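As a rough illustration of the answer selection step summarized above (a minimal sketch, not the exact pipeline used in this work), the following code encodes the passage, question, and each candidate option with a pre-trained BERT model and selects the option whose embedding is most cosine-similar to the passage-plus-question representation. The `bert-base-multilingual-cased` checkpoint and the mean-pooling choice are assumptions made for illustration only.

```python
import torch
from transformers import BertModel, BertTokenizer

# Assumed checkpoint for illustration; the thesis may use a different BERT variant.
MODEL_NAME = "bert-base-multilingual-cased"
tokenizer = BertTokenizer.from_pretrained(MODEL_NAME)
model = BertModel.from_pretrained(MODEL_NAME)
model.eval()


def embed(text: str) -> torch.Tensor:
    """Mean-pool the last-layer token embeddings into one sentence vector."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        outputs = model(**inputs)
    return outputs.last_hidden_state.mean(dim=1).squeeze(0)


def choose_answer(passage: str, question: str, options: list[str]) -> int:
    """Return the index of the option most similar to the passage plus question."""
    query_vec = embed(passage + " " + question)
    scores = [
        torch.nn.functional.cosine_similarity(query_vec, embed(opt), dim=0).item()
        for opt in options
    ]
    return max(range(len(options)), key=lambda i: scores[i])
```

A usage call such as `choose_answer(passage, question, ["A ...", "B ...", "C ...", "D ..."])` would return the index of the option with the highest cosine similarity score, corresponding to the machine-inferred answer described in the abstract.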