英文摘要 |
Text readability refers to the degree to which a text can be understood by its readers: the higher the readability of a text for readers, the better the the comprehension and learning retention can be achieved. In order to facilitate readers to digest and comprehend documents, researchers have long been developing readability models that can automatically and accurately estimate text readability. Conventional approaches to readability classification is to infer a readability model using a set of handcrafted features defined a priori and computed from the training documents, along with the readability levels of these documents. However, the use of handcrafted features requires special expertise and its applicability also is limited. With the recent advance of representation learning techniques, we can efficiently extract salient features from dcouments without recourse to specialized expertise, which offers a promising avenue of research on readability classification. In view of this, we in this paper propose two novel readability models built on top of a convolutional neural network based representation and the so-called fastText representation, respectively, which have the capability of effectively analyzing documents belonging to different domains and covering a wide variety of topics. A series of emperical experiments seem to demonstrate the utility of the proposed models in relation to several existing methods. |