Abstract
Noun-verb event frame (NVEF) knowledge, together with an NVEF
word-pair identifier [Tsai et al. 2002], constitutes a system that can be used to
support natural language processing (NLP) and natural language understanding
(NLU). In [Tsai et al. 2002a], we demonstrated that NVEF knowledge can be used
effectively to solve the Chinese word-sense disambiguation (WSD) problem with
93.7% accuracy for nouns and verbs. In [Tsai et al. 2002b], we showed that NVEF
knowledge can be applied to the Chinese syllable-to-word (STW) conversion
problem to achieve 99.66% accuracy for the NVEF-related portions of Chinese
sentences. In [Tsai et al. 2002a], we defined a collection of NVEF knowledge as an
NVEF word-pair (a meaningful NV word-pair) and its corresponding NVEF
sense-pairs. No existing method can extract collections of NVEF knowledge
from Chinese sentences fully automatically. We propose a method here for
automatically acquiring large-scale NVEF knowledge without human intervention
in order to identify a large, varied range of NVEF-sentences (sentences containing
at least one NVEF word-pair). The auto-generation of NVEF knowledge
(AUTO-NVEF) includes four major processes: (1) segmentation checking; (2)
Initial Part-of-Speech (IPOS) sequence generation; (3) NV knowledge generation;
and (4) NVEF knowledge auto-confirmation.
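
The definitions above (an NVEF word-pair with its sense-pairs, and an NVEF-sentence as a sentence containing at least one NVEF word-pair) can be illustrated with a minimal sketch. This is not the AUTO-NVEF pipeline or the NVEF word-pair identifier; it assumes a sentence that is already segmented into words and uses plain co-occurrence lookup. The class name, function names, example words, and sense labels are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NVEFKnowledge:
    """One collection of NVEF knowledge: a word-pair plus its sense-pairs."""
    noun: str
    verb: str
    # Each sense-pair is a (noun_sense, verb_sense) tuple; labels are hypothetical.
    sense_pairs: frozenset


def find_nvef_word_pairs(words, knowledge):
    """Return known (noun, verb) pairs whose noun and verb both occur in `words`.

    Assumption: simple co-occurrence in a pre-segmented sentence, with no
    part-of-speech or sense checking.
    """
    present = set(words)
    return [(k.noun, k.verb) for k in knowledge
            if k.noun in present and k.verb in present]


def is_nvef_sentence(words, knowledge):
    """A sentence counts as an NVEF-sentence if it contains at least one known pair."""
    return bool(find_nvef_word_pairs(words, knowledge))


# Illustrative knowledge base with a single made-up entry.
kb = [NVEFKnowledge("車子", "行駛", frozenset({("vehicle", "drive")}))]
print(is_nvef_sentence(["車子", "在", "路上", "行駛"], kb))
```

Storing the sense-pairs alongside each word-pair keeps the two parts of the definition together, so a lookup that finds a word-pair also yields the candidate senses for its noun and verb.
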
Our experimental results show that AUTO-NVEF achieved 98.52% accuracy for
news and 96.41% for specific text types, which included research reports, classical
literature and modern literature. AUTO-NVEF automatically discovered over
400,000 NVEF word-pairs from the 2001 United Daily News (2001 UDN) corpus.
We estimate that the NVEF knowledge acquired from 2001 UDN
helped to identify 54% of the NVEF-sentences in the Academia Sinica Balanced
Corpus (ASBC), and 60% in the 2001 UDN corpus.
We plan to expand NVEF knowledge so that it can identify more than 75% of
NVEF-sentences in ASBC. We will also apply the acquired NVEF knowledge to
support other NLP and NLU research, such as machine translation, shallow
parsing, syllable and speech understanding, and text indexing. The auto-generation
of bilingual, especially Chinese-English, NVEF knowledge will also be addressed
in our future work.