Reuse of Structured Data: Semantics, Linking, and Implementation (結構資料的再次使用：語意、連結與實作)

黃韋菁; 李承錱; 莊庭瑞

Title	Reuse of Structured Data: Semantics, Linking, and Implementation (結構資料的再次使用：語意、連結與實作)
Authors	黃韋菁, 李承錱, 莊庭瑞
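Abstract (translated from Chinese)	Continually enriching data with semantics and links, and distributing over the World Wide Web structured data that both people and machines can process and understand, so as to increase the reuse value of datasets, is a topic of wide current interest. It is also the motivation and goal that moved this research from theoretical discussion to system implementation. This paper briefly reviews international projects and technical developments related to Linked Open Data (LOD), introduces five cross-domain knowledge bases and seven domain-specific knowledge bases built with the Linked Open Data approach, and analyzes how data quality, metadata, and data provenance relate to one another. The research also implements the website data.odw.tw, which hosts catalogue records of archived collection items, and designs an ontology (voc4odw) for converting semi-structured data into semantically rich linked data. On the one hand, the CKAN (The Comprehensive Knowledge Archive Network) dataset management system is extended to serve as a platform for storing and presenting linked data. The work emphasizes the staged conversion steps from the original catalogue records to semantically linked data, and the conversion programs for each step, together with the CKAN software code, are released as open source. On the other hand, because the source data are released under Creative Commons public licenses, the research results are released in the same way, so that the data and code can be preserved and developed on an open basis and freely reused and disseminated.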
English abstract	To increase the reuse value of existing datasets, it is becoming general practice to add semantic links among the records in a dataset and to link these records to external resources. The enriched datasets are published on the Web for both humans and machines to consume and re-purpose. In this paper, we make use of publicly available structured records from a digital archive catalogue and demonstrate a principled approach to converting the records into semantically rich, interlinked resources for all to reuse. While exploring the issues involved in reusing and re-purposing existing datasets, we review recent progress in the field of Linked Open Data (LOD) and examine twelve well-known knowledge bases built with a Linked Data approach. We also discuss the general issues of data quality, metadata vocabularies, and data provenance. The concrete outcomes of this research are: (1) a website, data.odw.tw, that hosts more than 840,000 semantically enriched catalogue records across multiple subject areas; (2) a lightweight ontology, voc4odw, for describing data reuse and provenance, among other things; and (3) a set of open source software tools, available to all, for performing the kind of data conversion and enrichment done in this research. We have used and extended CKAN (The Comprehensive Knowledge Archive Network) as a platform to host and publish Linked Data. Our extensions to CKAN are open sourced as well. Because the records we drew from the original catalogue are released under Creative Commons licenses, the semantically enriched resources we now re-publish on the Web are likewise free for all to reuse.
Pages	7-46
Keywords	Data Provenance, Data Quality, Knowledge Base, Linked Open Data (LOD), Ontology, Semantic Representation, CKAN
Journal	Journal of Library and Information Science (圖書館學與資訊科學)
Issue	April 2017 (Vol. 43, No. 1)
Publisher	Graduate Institute of Library and Information Studies, National Taiwan Normal University (國立臺灣師範大學圖書資訊學研究所)