English Abstract
In the last decade of research and development in Natural Language Technology, basic tools have been put in place and are already impacting daily life. Speech recognition saves phone companies millions of dollars. Text-to-speech synthesis aids the blind. Massive resources for training and analysis are available in the form of annotated and analyzed corpora for spoken and written language. This explosion in applications has been largely due to new algorithms using statistical techniques and, above all, to the huge increase in computing power per dollar. Yet the goals of accurate information extraction, focused information retrieval, and fluent machine translation still remain tantalizingly out of reach. The next quantum leap in language processing capabilities must come from a closer integration of syntax and lexical semantics. However, the difficulty of achieving adequate hand-crafted semantic representations has limited the field of natural language processing to applications that can be contained within well-defined subdomains. The only escape from this limitation will be through the application of robust statistical techniques to the task of semantic representation.