English Abstract |
Rescoring approaches for parsing aim to re-rank the candidate parse trees that a general parser produces for a given sentence. The quality of the re-ranking depends on the precision of the rescoring function. However, designing a function that reliably judges the quality of parse trees is challenging. Whichever method is used, treebanks are a widely used resource in parsing tasks. Most approaches use complex features to re-estimate the tree structures of a given sentence. Unfortunately, treebanks are generally small, which leads to the common problem of data sparseness. Learning from large-scale unlabeled data is therefore necessary and has proved useful in previous work. How to extract useful information from large-scale unannotated corpora remains an open research issue. Word embeddings have recently become increasingly popular and have proved valuable as a source of features in a broad range of NLP tasks. word2vec is among the most widely used word embedding models today. |