| English Abstract |
Meeting summarization aims to distill meaningful information from lengthy meeting transcripts into concise texts, allowing participants to grasp key points quickly. However, meeting transcripts often feature complex dialogue structures, such as incomplete sentences and information scattered across multiple utterances. In addition, their length often exceeds the maximum input length of pretrained language models. In this paper, we introduce a two-stage summarization framework designed for long inputs and complex dialogue structures. First, we extract key segments from the original transcript. Second, we generate the summary from these extracted segments. To address the complexity of dialogue structures, we employ dialogue discourse parsing to capture the relationships between utterances, which we represent as a tree-like structure. In the extraction stage, we select the more structured text as output to increase information density, thereby providing a more organized input for the summary generator. Experimental results demonstrate that our approach significantly improves the quality of the generated summaries. |