English Abstract |
In natural language processing systems, different classifications of lexical categories lead to different sets of rules and thus to different kinds of analyses. The choice of a good category system is therefore very important to the efficiency and memory load of the overall parsing system. Unfortunately, although every parsing system has a set of lexical categories, the questions of whether these category systems are properly chosen, and of which factors should be used to evaluate the adequacy of a classification of lexical categories, have generally been ignored. The problem is worse in research areas such as Mandarin NLP, where many fundamental issues are only beginning to be explored and the lack of a good category system is apt to obstruct in-depth research. In this paper, we propose eight criteria for the classification of lexical categories in a syntax-oriented parsing system: syntax dominance, descriptive power, simplicity, explicitness, mutual exclusion, collective exhaustiveness, applicational efficiency, and conventionality. Each criterion is clearly defined and illustrated with Mandarin examples. The tradeoffs among these criteria are also considered. These criteria, together with the discussion of tradeoffs, are intended to serve as a guide for designing and evaluating a category system. |