GALs: A GAN-based Listwise Summarizer
Authors 郭家銍, 陳冠宇 (Guan-Yu Chen)
Extractive summarization selects a small set of sentences from a document so that together they represent the content of the whole text. Learning to rank originated in the field of information retrieval and has since been applied to a variety of ranking tasks. In this study, we treat extractive summarization as a listwise sentence ranking problem and propose a GAN-based Listwise Summarizer (GALs). GALs is built on the generative adversarial network (GAN) architecture: an extractive summarizer serves as the generator, and a discriminator learns to distinguish the generated summary from the ground-truth reference. Notably, the input to the discriminator is a set of surface features extracted from the generated summary and from the reference. The model is then optimized with a reinforcement learning (RL) strategy, in which the discriminator's prediction serves as the reward used to update the parameters of the whole model. GALs thus combines adversarial learning, listwise ranking, surface features of sentences and documents, and reinforcement learning, with the aim of offering a classic approach to summarization modeling. Experiments on the CNN/Daily Mail corpus show that GALs achieves a clear score improvement over conventional state-of-the-art models. We also present a detailed investigation and analysis of the parameters used in GALs.
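The training loop outlined in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a table of per-sentence scores in place of a neural summarizer, Plackett-Luce sampling as the listwise selection step, two illustrative surface features (mean relative position and mean sentence length), a logistic-regression discriminator, and a plain REINFORCE update with a constant baseline. All function names, features, and hyperparameters below are hypothetical.

```python
import math
import random

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sample_summary(scores, k, rng):
    """Plackett-Luce sampling: draw k sentence indices without replacement.
    Returns the picks, the log-probability of the ordered draw, and the
    gradient of that log-probability with respect to each score."""
    remaining = list(range(len(scores)))
    picked, log_prob = [], 0.0
    grad = [0.0] * len(scores)
    for _ in range(k):
        probs = softmax([scores[i] for i in remaining])
        j = rng.choices(range(len(remaining)), weights=probs)[0]
        log_prob += math.log(probs[j])
        for idx, p in zip(remaining, probs):
            grad[idx] -= p              # minus the softmax probability
        grad[remaining[j]] += 1.0       # plus one for the chosen sentence
        picked.append(remaining.pop(j))
    return picked, log_prob, grad

def surface_features(sentences, picked):
    """Assumed features: mean relative position, mean token count / 10."""
    n = len(sentences)
    pos = sum(i / n for i in picked) / len(picked)
    length = sum(len(sentences[i].split()) for i in picked) / len(picked)
    return [pos, length / 10.0]

def discriminator(features, weights, bias):
    """Probability that a feature vector comes from a reference summary."""
    z = sum(w * f for w, f in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def generator_step(scores, sentences, k, d_w, d_b, rng, lr=0.1, baseline=0.5):
    """REINFORCE: the discriminator's prediction is the reward."""
    picked, _, grad = sample_summary(scores, k, rng)
    reward = discriminator(surface_features(sentences, picked), d_w, d_b)
    new_scores = [s + lr * (reward - baseline) * g for s, g in zip(scores, grad)]
    return new_scores, picked, reward

def discriminator_step(d_w, d_b, real, fake, lr=0.1):
    """One gradient-ascent step on log D(real) + log(1 - D(fake))."""
    p_real = discriminator(real, d_w, d_b)
    p_fake = discriminator(fake, d_w, d_b)
    new_w = [w + lr * ((1 - p_real) * r - p_fake * f)
             for w, r, f in zip(d_w, real, fake)]
    new_b = d_b + lr * ((1 - p_real) - p_fake)
    return new_w, new_b

# Toy document; the reference summary is assumed to be the first two sentences.
sentences = [
    "The council approved the new transit plan on Monday",
    "The plan adds three bus lines and extends service hours",
    "Critics said the budget is too small",
    "A vote on funding is expected next month",
    "Weather was mild",
]
reference = [0, 1]
rng = random.Random(0)
scores = [0.0] * len(sentences)
d_w, d_b = [0.0, 0.0], 0.0
for _ in range(50):
    scores, picked, reward = generator_step(scores, sentences, 2, d_w, d_b, rng)
    d_w, d_b = discriminator_step(
        d_w, d_b,
        surface_features(sentences, reference),
        surface_features(sentences, picked),
    )
```

In this sketch the generator and discriminator are updated alternately, one step each per iteration. A real system would replace the score table with a learned sentence encoder and would likely use a richer feature set and a learned baseline.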
Pages 15-24
Keywords extractive summarization, listwise ranking, generative adversarial network (GAN), surface features
Journal ROCLING Proceedings
Issue 2019
Publisher The Association for Computational Linguistics and Chinese Language Processing (中華民國計算語言學學會)
