English Abstract
Running a scientific workflow in the cloud generates a large volume of intermediate datasets, many of which contain valuable information for further study, yet the cost of storing all of them is prohibitively high given their enormous size. A feasible solution is to store some of the intermediate datasets and re-compute the others when needed; the intermediate dataset storage problem asks for a tradeoff that minimizes the total cost of storing or regenerating each intermediate dataset. This paper proposes a new cost model for the problem on general workflows, which incorporates delay tolerance, usage frequency, and transfer cost to make the model more general. Based on a directed acyclic graph describing the dependency relationships between datasets, a greedy algorithm for the problem is proposed and implemented. Experimental results demonstrate the effectiveness and efficiency of the algorithm.