English Abstract |
In this paper, a Phrase-Level-Building (PLB) mechanism is proposed to parse ill-formed sentences. By decomposing a syntactic tree into phrase-levels, this mechanism treats the task of parsing a sentence as the task of building the phrase-levels for that sentence. During parsing, a level-synchronous scoring function is used to prune less likely phrase-levels. As a result, instead of enumerating all possible parses, the PLB parser generates only the more likely tree groups, each of which is a set of partial parses that jointly derive the input. When none of the active phrase-levels in the search beam can be further reduced by any grammar rule, the process of building phrase-levels stops and a probabilistic scoring function is used to select the best tree group. Because the best tree group is selected within a wider scope (i.e., the whole sentence), this approach yields better results. Compared with the baseline system using a stochastic context-free grammar and the 'leftmost longest phrase first' heuristic (which operates in a narrow scope), the proposed PLB approach improves the precision of brackets in the tree group from 69.37% to 79.49%. The recall of brackets is also improved from 78.73% to 81.39%. |
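
The abstract reports bracket precision and recall but does not spell out how they are computed. The sketch below assumes the common labeled-bracket definition, in which a bracket is a (label, start, end) span and a predicted bracket counts as correct when an identical bracket appears in the reference parse. The function name `bracket_scores` and the toy brackets are hypothetical and are not taken from the paper.

```python
from collections import Counter


def bracket_scores(predicted, gold):
    """Return (precision, recall) over labeled brackets.

    Each bracket is a (label, start, end) tuple. This matching rule is an
    assumption for illustration; the paper may define a match differently.
    """
    pred_counts = Counter(predicted)
    gold_counts = Counter(gold)
    # Multiset intersection: a bracket is matched at most as many times
    # as it occurs in both the prediction and the reference.
    matched = sum((pred_counts & gold_counts).values())
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(gold) if gold else 0.0
    return precision, recall


# Toy example: the reference is one full parse, while the prediction is a
# tree group (partial parses) that lacks the top-level S bracket.
gold = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("NP", 3, 5)]
predicted = [("NP", 0, 2), ("VP", 2, 5), ("NP", 3, 5), ("PP", 2, 3), ("NP", 4, 5)]

precision, recall = bracket_scores(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case three of the five predicted brackets appear in the reference, and three of the four reference brackets are recovered, so spurious brackets lower precision while missing brackets lower recall.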