Chinese Abstract |
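The popularity of social networks has led many people to use Facebook as a medium for promoting activities. The purpose of this thesis is therefore to build an activity event extraction system for Facebook that helps users quickly grasp activity information. We improve the Web NER Model Generation tool of Huang et al. and use it to build extraction models for activity names and locations, then apply sequential pattern mining to find the start and end dates of activities. In addition, we attempt to improve location recognition accuracy using a large collection of Facebook check-in places. The experiments test 1,300 posts with manually labeled answers to assess both the activity event extraction performance and the named entity recognition performance of the system, and project the extracted activity locations onto latitude and longitude coordinates to evaluate how accurately the actual location of an activity is predicted. The experimental results show F1-scores of 0.727 and 0.694 for activity name and location extraction and 0.865 and 0.72 for start and end date extraction, with an overall activity event recognition rate of 0.708, indicating that using this system to consolidate activity events on Facebook and locate where they take place is quite feasible. |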
English Abstract |
The popularity of social networks has made them an attractive medium for promoting activities and advertising campaigns, and many people now use Facebook pages to announce their events. The purpose of this study is to extract activity events by constructing two named entity recognition models, one for activity names and one for locations, via a Web NER model generation tool. We enhance the tool by improving its tokenizer and alignment technique. In addition, we use a large database of Facebook check-in places to improve location name recognition. For entity relation extraction, we apply sequential pattern mining to find rules for start date, end date, and location coupling. We test activity event extraction performance on 1,300 manually labeled Facebook posts. The experimental results show F1-scores of 0.727 and 0.694 for activity name and location recognition, and 0.865 and 0.72 for start and end date extraction. The overall performance for activity event extraction is 0.708. |