English Abstract |
This paper presents an automatic system for converting text between simplified and traditional (complex) Chinese characters. The OCR-based system uses an efficient feature extraction algorithm to recognize printed Chinese characters in either form. A new postprocessing model is developed both to convert words meaningfully and to correct character recognition errors. Experimental results show average recognition rates of about 99.2% for single-font and 95.3% for multi-font character recognition. When tested on real documents printed in simplified Chinese characters, the recognition rate is 96.2% without contextual information. With the proposed language model applied in postprocessing, the text conversion rate improves to 97.8%. |