English Abstract |
In this paper, we present a multi-word terminology extractor for thematic corpora based on the co-occurrence of subterms. Starting from the basic properties of terminology, among which we emphasize the structural dependency relation between subterms, we propose a number of straightforward hypotheses as strategies for terminology recognition. The key idea for measuring structural dependency in a corpus-based approach is that a higher frequency of subterm co-occurrence may indicate higher structural dependency. The experimental results show that our algorithm can extract multi-word terms that correspond well to domain-specific concepts and notions. |