Title
A Study on Predicting Web Page Update Rules with Data Mining (以資料挖礦法則預測網頁更新規則之研究)
Parallel title
Discovering Web Page Update Patterns with Data Mining
Authors 許秉瑜, 張維捷
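Chinese abstract
In the e-commerce era, many kinds of software agents search the network for information in order to build all sorts of websites. Because the volume of data is usually very large, deciding when such software should refresh the information it has collected becomes an important decision for the system administrator. The common practice today is fixed-interval updating, in which the refresh interval is a fixed period set by the user. If that interval is chosen poorly, however, the fetched pages may be identical to the previous copies (interval too short), or the page may already have been updated several times (interval too long), so that network resources are wasted or the data become stale. This paper therefore applies the data mining method for generating sequential association rules to discover the update pattern of each web page, and then fetches pages according to that pattern to validate it. Because a page's update pattern may change over time, a prediction pattern that never changes gradually loses accuracy. The study therefore also proposes an incremental method for updating the prediction rules, so that the rules reflect current conditions in a timely way without consuming excessive computing resources.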
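The fixed-interval trade-off described above can be made concrete with a small sketch. It is only an illustration under stated assumptions, not something taken from the paper: time is divided into equal slots, `updates[t]` is 1 when the page changed in slot `t`, and the function name `fixed_interval_cost` is hypothetical.

```python
def fixed_interval_cost(updates, interval):
    """Count wasted fetches and missed updates for a fixed fetch interval.

    updates[t] is 1 if the page changed during slot t, else 0 (assumed encoding).
    The agent fetches at the end of every `interval`-th slot.
    """
    wasted = 0   # fetches that found no change since the previous fetch
    missed = 0   # intermediate versions that were overwritten before being fetched
    pending = 0  # number of updates since the previous fetch
    for t, changed in enumerate(updates):
        pending += changed
        if (t + 1) % interval == 0:
            if pending == 0:
                wasted += 1
            else:
                missed += pending - 1
            pending = 0
    return wasted, missed


# A page that changes every third slot.
updates = [0, 0, 1] * 3
for interval in (1, 3, 9):
    print(interval, fixed_interval_cost(updates, interval))
```

With this toy page, fetching every slot wastes most fetches, fetching every nine slots loses intermediate versions, and only an interval that happens to match the page's rhythm avoids both costs.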
English abstract
In the e-commerce era, many agents roam the Internet to find the best prices, cluster related product information, and so on. Agents have to visit targeted web pages periodically to update their information. If agents visit pages too frequently, they end up reloading information they already have. On the other hand, if agents visit web pages too infrequently, the collected data may be out of date. To minimize out-of-date errors, agents tend to visit a site as often as possible. However, to minimize network traffic and database update cost, system administrators tend to reduce the number of visits as much as possible. To the best of our knowledge, no research has been directed at finding a scientific approach to this dilemma.
In this paper, we propose visiting web pages according to their past update patterns. That is, a page should be visited as soon as it is expected to have changed, and should not be visited at any other time. To discover the update patterns, we propose using the sequential association rules of data mining methodology. Association rules can find patterns implicit in the temporal update behavior. Each web page is associated with a sequence of binary digits denoting whether the page was updated in each agent fetching slot. We designed an algorithm to mine patterns from this sequence of binary digits. The patterns are composed of large item sequences and their related association rules. Each rule states that, under some precondition, the web page will change in the next time slot. If a precondition matches the current situation, an agent is sent to fetch the page. Besides computing patterns for existing pages, the system also updates its database dynamically to account for newly inserted and deleted pages.
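The rule-based fetching idea can be sketched roughly as follows. This is a simplified illustration and not the authors' algorithm: it assumes fixed-length preconditions (the last `window` bits of the history), and the names `mine_update_rules` and `should_fetch` and the thresholds `min_support` and `min_confidence` are hypothetical choices made for the example.

```python
from collections import Counter


def mine_update_rules(history, window=3, min_support=2, min_confidence=0.8):
    """Return {precondition: confidence} for rules 'precondition -> page changes next slot'.

    history is a list of 0/1 flags, one per fetching slot (1 = page was updated).
    A precondition is a tuple of `window` consecutive flags (assumed simplification).
    """
    seen = Counter()      # how often each precondition occurs with a following slot
    followed = Counter()  # how often that following slot contained an update
    for i in range(len(history) - window):
        pre = tuple(history[i:i + window])
        seen[pre] += 1
        followed[pre] += history[i + window]
    rules = {}
    for pre, count in seen.items():
        confidence = followed[pre] / count
        if count >= min_support and confidence >= min_confidence:
            rules[pre] = confidence
    return rules


def should_fetch(history, rules, window=3):
    """Send an agent only if the most recent slots match some rule's precondition."""
    return tuple(history[-window:]) in rules


history = [0, 0, 1] * 4            # page changes every third slot
rules = mine_update_rules(history)
print(rules)
print(should_fetch(history, rules))           # just updated, no rule matches
print(should_fetch(history + [0, 0], rules))  # two quiet slots after an update
```

The support threshold plays the role of the "large item sequence" filter, and the confidence threshold selects the rules that are reliable enough to trigger a fetch. Re-running the miner on a recent portion of the history is one naive way to keep rules current; the incremental scheme mentioned in the abstract is not specified there and is not reproduced here.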
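Pages 11-36
Keywords web page update; data mining; pattern; association rule; web mining; pattern discovery; WWW
Journal 電子商務學報
Issue 2003-09 (Vol. 5, No. 2)
Publisher 中華企業資源規劃學會 (Chinese Enterprise Resource Planning Society)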
 
