Auxiliary Loss to Attention Head for End to End Speaker Diarization (基於多重注意力機制的輔助損失函數用於端到端語者標記)

楊憶婷; 李俊廷; 陳柏琳

Title	基於多重注意力機制的輔助損失函數用於端到端語者標記
Parallel title	Auxiliary Loss to Attention Head for End to End Speaker Diarization
Authors	楊憶婷, 李俊廷 (Chun-Ting Lee), 陳柏琳
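Abstract (translated from Chinese)	This study proposes a novel auxiliary loss function for the self-attention end-to-end speaker diarization model (SA-EEND), aimed at accurate speaker label prediction in overlapping speech regions. Previous work lacked methods that fully exploit the speaker information within the model to strengthen auxiliary training, and did not account for differences in how frequently the various speech activity patterns occur. Using two tasks, the overall speech activity pattern and the per-speaker speech activity patterns, we adjust the weight matrices of the multi-head self-attention mechanism in the Transformer layers, and we choose loss functions that strengthen learning for the less frequent labels, in order to improve speaker discrimination. Experiments were conducted on Mini LibriSpeech; the gains were modest, but some progress was made.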
English abstract	This study introduces a novel auxiliary loss function for the Self-Attention End-to-End Speaker Diarization (SA-EEND) model, aiming at accurate speaker label prediction within overlapping speech regions. Previous research has lacked effective methods for leveraging the speaker information within the model to enhance auxiliary training, and has not accounted for differences in how often the various speech activity patterns occur. By treating both the overall speech activity pattern and the speech activity patterns of individual speakers as tasks, we adjust the weight matrices of the multi-head self-attention mechanism in the Transformer layers. We also select loss functions that improve learning for labels with fewer occurrences, with the goal of better speaker discrimination. Experiments were conducted on Mini LibriSpeech. Although the gains were limited, some progress was made.
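Pages	38-43
Keywords	speaker diarization (語者標記), end-to-end speaker diarization (端到端語者標記), attention mechanism (注意力機制), auxiliary loss function (輔助損失函數), end-to-end neural diarization, multi-head attention, auxiliary loss
Journal	ROCLING論文集 (ROCLING Proceedings)
Issue	October 2023 (2023 issue)
Publisher	中華民國計算語言學學會 (Association for Computational Linguistics and Chinese Language Processing)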