Title (Chinese)
基於 BERT 的強健性抽取式摘要法
Parallel Title (English)
EBSUM: An Enhanced BERT-based Extractive Summarization Framework
Authors: 吳政育, 陳冠宇
Abstract (translated from Chinese)
Most current automatic summarization methods fall into two categories: extractive and abstractive summarization. Although abstractive methods can rewrite a document to form a summary, they are often not effective in practice; the difficulties lie in disfluent output, repeated words, and similar problems. Extractive methods instead select sentences directly from the document to form a summary, which avoids these shortcomings. Current extractive methods based on BERT (Bidirectional Encoder Representations from Transformers) mostly use BERT to obtain sentence representations and then fine-tune a model to select summary sentences. In this paper, we propose a novel enhanced BERT-based extractive summarization framework (EBSUM). It not only takes sentence position information into account and uses reinforcement learning to strengthen the connection between the summarization model and the evaluation metric, but also incorporates the concept of maximal marginal relevance (MMR) directly into the summarization model to avoid selecting redundant information. In experiments on the widely used CNN/DailyMail summarization dataset, EBSUM performs very well and achieves the best summarization results among the various classic neural-network-based models it is compared against.
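To make the scoring pipeline described above concrete, here is a minimal sketch, assuming PyTorch and the Hugging Face transformers library: each sentence is encoded with BERT's [CLS] vector and combined with a learned sentence-position embedding before scoring. The SentenceScorer head, its dimensions, and the toy document are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")
bert.eval()

class SentenceScorer(nn.Module):
    # Hypothetical scoring head: [CLS] vector plus a sentence-position embedding.
    def __init__(self, hidden=768, max_pos=128, pos_dim=32):
        super().__init__()
        self.pos_emb = nn.Embedding(max_pos, pos_dim)
        self.out = nn.Linear(hidden + pos_dim, 1)

    def forward(self, cls_vec, position):
        feats = torch.cat([cls_vec, self.pos_emb(position)], dim=-1)
        return self.out(feats).squeeze(-1)  # unnormalized extraction score

def encode(sentence):
    # Use BERT's [CLS] hidden state as the sentence representation.
    inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
    with torch.no_grad():
        return bert(**inputs).last_hidden_state[:, 0, :]  # shape (1, 768)

scorer = SentenceScorer()
document = ["The cat sat on the mat.", "It was a sunny day.", "The cat slept."]
scores = torch.cat([scorer(encode(s), torch.tensor([i]))
                    for i, s in enumerate(document)])
print(scores)  # one score per sentence; top-scoring sentences form the summary
```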
Abstract (English)
Automatic summarization methods can be categorized into two major streams: extractive summarization and abstractive summarization. Although abstractive summarization generates a short paragraph to express the original document, most of the generated summaries are hard to read. In contrast, extractive summarization extracts sentences from the given document to construct a summary. Recently, BERT (Bidirectional Encoder Representations from Transformers), a pre-trained language representation method, has been introduced to several NLP-related tasks and has achieved remarkable results. In the context of extractive summarization, BERT is usually used to obtain representations for sentences and documents, and a simple model is then employed to select potential summary sentences based on the inferred representations. In this paper, an enhanced BERT-based extractive summarization framework (EBSUM) is proposed. The major innovations are: first, EBSUM takes sentence position information into account; second, the model is trained with a reinforcement learning strategy in order to maximize the ROUGE score; third, the maximal marginal relevance (MMR) criterion is incorporated into the proposed EBSUM model to avoid redundant information. In the experiments, EBSUM outperforms several state-of-the-art models on the CNN/DailyMail corpus.
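The MMR criterion that both abstracts mention can be sketched independently of BERT. The toy NumPy implementation below follows the standard greedy formulation of maximal marginal relevance: each candidate is scored by its relevance minus its maximum similarity to already-selected sentences, weighted by a trade-off parameter. The function names, the lambda value, and the random embeddings are assumptions for illustration only; in the paper's full framework the relevance scores would come from the BERT-based model trained with reinforcement learning against a ROUGE reward, which is omitted here.

```python
import numpy as np

def mmr_select(sent_vecs, relevance, k=3, lam=0.7):
    """Greedy maximal marginal relevance over sentence embeddings.

    sent_vecs: (n, d) array of sentence vectors (e.g., from BERT)
    relevance: (n,) array of per-sentence relevance scores
    lam:       trade-off between relevance and redundancy
    """
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

    selected, candidates = [], list(range(len(sent_vecs)))
    while candidates and len(selected) < k:
        def mmr_score(i):
            # Redundancy = highest similarity to any sentence already chosen.
            redundancy = max((cos(sent_vecs[i], sent_vecs[j]) for j in selected),
                             default=0.0)
            return lam * relevance[i] - (1 - lam) * redundancy
        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return sorted(selected)  # report indices in document order

# Toy usage with random embeddings standing in for BERT representations.
rng = np.random.default_rng(0)
print(mmr_select(rng.normal(size=(5, 8)), rng.uniform(size=5), k=2))
```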
Pages: 19-35
Keywords: Auto-summarization; Extractive; BERT; Reinforcement Learning; Maximal Marginal Relevance (MMR)
Journal: International Journal of Computational Linguistics & Chinese Language Processing (中文計算語言學期刊)
Issue: Vol. 24, No. 2 (December 2019)
Publisher: The Association for Computational Linguistics and Chinese Language Processing (ACLCLP)
Previous article in this issue: 基於特徵粒度之訓練策略於中文口語問答系統之應用 (a feature-granularity-based training strategy for Chinese spoken question answering)
Next article in this issue: 適合漸凍人使用之語音轉換系統初步研究 (a preliminary study of a voice conversion system for ALS patients)
 
