English Abstract |
This paper presents a study on the portability of our grammatical inference system, CAGC (Computer Assisted Grammar Construction). The CAGC system was developed [1] to generate broad-coverage grammars for large natural language corpora. It utilises both an extended Inside-Outside algorithm [2] and an automatic phrase bracketing (AUTO) technique [3], which is designed to provide the extended algorithm with constituent information during learning. The system is first trained and tested on the Wall Street Journal (WSJ) corpus and then, to study its portability, moved to the Brown Corpus to infer a Brown grammar. The experimental results presented in this paper demonstrate that both the CAGC inference technique and the initial grammar used in the system are transferable to the new corpus. |