English abstract |
The lack of lexical information makes it difficult to improve the performance of a hidden Markov model for part-of-speech (POS) tagging. To alleviate this problem, this paper proposes a method for incorporating multiword units, a type of lexical information, into a hidden Markov model for POS tagging. It also proposes a method for extracting multiword units from a POS-tagged corpus. Here, a multiword unit is defined as a sequence of more than one word that frequently causes POS tagging errors. Our experiments show an error reduction rate of about 13%. |