English Abstract |
This paper presents the construction of a Japanese-Chinese parallel corpus and an analysis of its collocations. In addition to describing the size of the data and the statistical methods used to compare the parallel corpus with CTLJ, the study analyzes differences in collocation usage. A self-designed KL-divergence function in the Collocation Tool is used to compare the original texts in a public corpus, CTLJ, and the parallel corpus. The results show that a collocation pair is frequently used if its usage-frequency score is both positive and high in CTLJ and in the parallel corpus. When a collocation pair has a negative score in CTLJ but a positive score in the parallel corpus, it is probably seldom used or still needs to be learned. Conversely, when a collocation pair has a positive score in CTLJ but a negative score in the parallel corpus, it is probably misused. Comparing CTLJ with the parallel corpus thus effectively identifies collocation pairs that are seldom used or easily misused. |
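
The decision rule in the abstract can be illustrated with a minimal sketch. The abstract does not specify the self-designed KL-divergence function, so the code below assumes a standard pointwise contribution, p * log(p / q), where p is a pair's smoothed relative frequency in the target corpus (CTLJ or the parallel corpus) and q is its smoothed relative frequency in the public reference corpus. The function names `kl_scores` and `classify`, the add-one smoothing, and the category labels are all hypothetical, and the "high score" condition is simplified here to a plain sign test.

```python
import math


def kl_scores(target_counts, reference_counts, smoothing=1.0):
    """Per-pair contribution p * log(p / q) to KL(target || reference).

    ASSUMPTION: this is the textbook pointwise KL term with additive
    smoothing, not the paper's own (unspecified) function.
    A positive score means the pair is relatively more frequent in the
    target corpus than in the reference corpus; negative means less.
    """
    vocab = set(target_counts) | set(reference_counts)
    t_total = sum(target_counts.values()) + smoothing * len(vocab)
    r_total = sum(reference_counts.values()) + smoothing * len(vocab)
    scores = {}
    for pair in vocab:
        p = (target_counts.get(pair, 0) + smoothing) / t_total
        q = (reference_counts.get(pair, 0) + smoothing) / r_total
        scores[pair] = p * math.log(p / q)
    return scores


def classify(ctlj_score, parallel_score):
    """Map the two score signs to the categories named in the abstract."""
    if ctlj_score > 0 and parallel_score > 0:
        return "often used"
    if ctlj_score < 0 and parallel_score > 0:
        return "seldom used or needs to be learned"
    if ctlj_score > 0 and parallel_score < 0:
        return "probably misused"
    return "undetermined"  # case not covered by the abstract


if __name__ == "__main__":
    # Made-up counts for three placeholder collocation pairs.
    public = {"pair_a": 10, "pair_b": 10, "pair_c": 10}
    ctlj = {"pair_a": 30, "pair_b": 2, "pair_c": 30}
    parallel = {"pair_a": 25, "pair_b": 20, "pair_c": 2}

    ctlj_s = kl_scores(ctlj, public)
    par_s = kl_scores(parallel, public)
    for pair in sorted(public):
        print(pair, classify(ctlj_s[pair], par_s[pair]))
```

With these invented counts, a pair that is over-represented in both corpora relative to the public corpus falls into the first category, while pairs whose signs disagree fall into the second or third. A real analysis would presumably also apply a magnitude threshold for the "high" condition, which the abstract mentions but does not quantify.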