
Unsupervised Multi-document Summarization for News Corpus with Key Synonyms and Contextual Embeddings
Authors: Yen-Hao Huang, Ratana Pornvattanavichai, Fernando Henrique Calderon Alvarado, Yi-Shin Chen
Information overload is one of the main challenges of consuming information from the Internet. The problem is no longer one of access to information; rather, the focus has shifted towards the quality of the retrieved data. Particularly in the news domain, multiple outlets report on the same news events but may differ in the details. This work assumes that different news outlets are likely to differ in their writing styles and word choices, and proposes a method that extracts sentences based on their key information by focusing on the shared synonyms in each sentence. Our method also attempts to reduce redundancy through hierarchical clustering and arranges the selected sentences with the proposed orderBERT. The results show that the proposed unsupervised framework improves coverage and coherence while reducing redundancy in the generated summary. Moreover, because the dataset was obtained by automatic scraping, we also propose a data refinement method to alleviate the problem of undesirable text that such scraping introduces.
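The redundancy-reduction step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the similarity measure, linkage, or threshold, so the token-set Jaccard similarity, the single-linkage rule, the `threshold` parameter, and the function names below are all assumptions chosen to keep the example dependency-free. The paper itself works with key synonyms and contextual embeddings rather than raw tokens.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity between two token sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_sentences(sentences, threshold=0.5):
    """Single-linkage agglomerative clustering over token-set similarity.

    Repeatedly merges the two most similar clusters until no pair
    reaches `threshold` (an assumed, illustrative cut-off).
    Returns a list of clusters, each a sorted list of sentence indices.
    """
    tokens = [set(s.lower().split()) for s in sentences]
    clusters = [[i] for i in range(len(sentences))]

    def linkage(c1, c2):
        # Single linkage: similarity of the closest pair of members.
        return max(jaccard(tokens[i], tokens[j]) for i in c1 for j in c2)

    while len(clusters) > 1:
        # Find the most similar pair of clusters.
        (x, y), best = max(
            (((x, y), linkage(clusters[x], clusters[y]))
             for x, y in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best < threshold:
            break
        clusters[x] = sorted(clusters[x] + clusters[y])
        del clusters[y]  # y > x, so index x stays valid
    return clusters


def select_representatives(sentences, threshold=0.5):
    """Keep the earliest sentence of each cluster, in original order."""
    clusters = cluster_sentences(sentences, threshold)
    return [sentences[min(c)] for c in sorted(clusters, key=min)]
```

Given two near-duplicate sentences from different outlets and one unrelated sentence, `select_representatives` keeps one sentence per cluster, which is the sense in which clustering reduces redundancy. Swapping `jaccard` for a cosine similarity over sentence embeddings would bring the sketch closer to the embedding-based setting the title describes.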
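Pages: 192-201
Journal: ROCLING Proceedings
Issue: 202112 (2021)
Publisher: The Association for Computational Linguistics and Chinese Language Processing (中華民國計算語言學學會)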



