English Abstract |
The task of word sense disambiguation is to identify the correct sense of a word in context. In this paper, we define a new notion, classification information, based on Shannon's information theory. The classification information of a word is a pair consisting of the most probable class (MPC) and the discrimination score (DS). When deciding the sense of a target word, the MPC of a surrounding word is the sense of the target word most closely related to that surrounding word, and the DS measures how strongly the surrounding word is correlated with its MPC. When a new sentence containing the target polysemous word is given, the target word is assigned the most plausible sense based on the classification information of all surrounding words in the sentence. Experimental results show that the average accuracy of the proposed method is 84.6% on the Korean data set and 80.0% on the English data set. |
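The abstract names the two components of classification information but does not give their formulas. The following is a minimal sketch of one way the idea could be realized, not the authors' implementation. It assumes that the MPC of a surrounding word is the sense it co-occurs with most often in sense-tagged training sentences, that the DS is the maximum possible entropy (log2 of the number of senses) minus the entropy of the word's sense distribution, and that the final decision sums DS values per MPC. The function names `train_classification_info` and `disambiguate`, and the toy data, are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_classification_info(examples):
    """Build {surrounding_word: (MPC, DS)} from (sense, context_words) pairs.

    Assumption: DS = log2(number of senses) - entropy of P(sense | word).
    """
    counts = defaultdict(Counter)  # word -> Counter of senses it appeared with
    senses = set()
    for sense, words in examples:
        senses.add(sense)
        for word in set(words):  # count each word once per sentence
            counts[word][sense] += 1

    max_entropy = math.log2(len(senses)) if len(senses) > 1 else 0.0
    info = {}
    for word, sense_counts in counts.items():
        total = sum(sense_counts.values())
        entropy = -sum(
            (n / total) * math.log2(n / total) for n in sense_counts.values()
        )
        mpc = max(sense_counts, key=sense_counts.get)  # most frequent sense
        info[word] = (mpc, max_entropy - entropy)
    return info


def disambiguate(info, context_words):
    """Pick the sense with the largest summed DS (assumed decision rule)."""
    scores = Counter()
    for word in context_words:
        if word in info:
            mpc, ds = info[word]
            scores[mpc] += ds
    return max(scores, key=scores.get) if scores else None


# Toy sense-tagged contexts for the target word "bank" (illustrative only).
training = [
    ("finance", ["money", "deposit", "the"]),
    ("finance", ["loan", "money", "the"]),
    ("river", ["water", "shore", "the"]),
    ("river", ["fishing", "water", "the"]),
]
info = train_classification_info(training)
print(disambiguate(info, ["the", "water", "was", "cold"]))
```

Under these assumptions, a word such as "the" that appears equally often with every sense gets a DS of 0 and contributes nothing to the decision, while a word seen with only one sense gets the maximum DS.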