Abstract |
This paper investigates whether lexical cohesion analysis is appropriate for assessing the readability of Chinese text. In addition to term frequency features, we derive features from the results of lexical chaining to capture lexical cohesion, where the E-HowNet lexical database is used to compute the semantic similarity between high-frequency nouns. Classification models for assessing the readability of Chinese text are learned from these features using support vector machines. We select articles from elementary school textbooks to train and test the classification models. The experiments compare the prediction results obtained with different feature sets. |