English Abstract |
The National Palace Museum (NPM) holds a world-class collection of nearly 700,000 items. Its sheer size is not only a great challenge for digitization but also a high barrier for researchers in subsequent interpretation and application. Since 2017, the Department of Rare Books and Historical Documents has submitted the “Subordinate Program of Digitalizing Crucial Historical Documents in High Resolutions” to bid for the Executive Yuan’s Forward-looking Infrastructure Development Program. Under this program, the department’s main goal was to digitize at least 400,000 pages, adding up to nearly 2.4 million pages of digital files over the years. The long working hours and large volume of digital assets have prompted us to consider how new technologies can be leveraged to optimize the value-added applications of completed digital scans. One of the major milestones in digitizing documents is enabling full-text search. Because producing full text manually is resource-intensive and time-consuming, full-text retrieval is even harder to attain when the digital scanning itself is long overdue. To address this, artificial intelligence has been introduced to perform text recognition and assist metadata classification on the digital scans, speeding up digitization and opening more possibilities for subsequent value-added applications. Examples include connecting geographic data in the documents to a GIS (Geographic Information System) so that all Qing Dynasty archives can be retrieved by geographical location, or automatically linking the figures named in the documents to names in the Qing Dynasty archive database and automatically establishing their geopolitical relations or networks together with their titles, making research more convenient for scholars of Qing history.
Although it is still difficult for artificial intelligence to recognize and punctuate such documents perfectly on its own, a number of case studies exist in academia, and this paper offers some insights on that basis in the hope of facilitating the digitization of documents. |