Chinese Abstract
This thesis studies how to incorporate multi-task learning (MTL) techniques into the parameter estimation of acoustic models, so as to improve the accuracy of meeting speech recognition. Our contributions are two-fold: 1) we conduct an empirical study on leveraging various auxiliary tasks to strengthen the performance of multi-task learning on meeting speech recognition; in addition, we investigate the synergy of combining multi-task learning with different acoustic models, such as deep neural network (DNN) and convolutional neural network (CNN) based acoustic models, with the expectation of increasing the generalization capability of acoustic modeling; 2) since the way the contributions (weights) of the different auxiliary tasks are adjusted during multi-task acoustic model training is far from optimal, we propose a model re-adaptation method to alleviate this problem. We carry out a series of experiments on the Mandarin Meeting Recording Corpus (MMRC) collected in Taiwan. Compared with several existing baselines, the experimental results reveal the effectiveness of our proposed methods.
English Abstract
This paper sets out to explore the use of multi-task learning (MTL) techniques for more accurate estimation of the parameters involved in neural network based acoustic models, so as to improve the accuracy of meeting speech recognition. Our main contributions are two-fold. First, we conduct an empirical study on leveraging various auxiliary tasks to enhance the performance of multi-task learning on meeting speech recognition. Furthermore, we also study the synergy of combining multi-task learning with disparate acoustic models, such as deep neural network (DNN) and convolutional neural network (CNN) based acoustic models, with the expectation of increasing the generalization ability of acoustic modeling. Second, since the way the contributions (weights) of different auxiliary tasks are modulated during acoustic model training is far from optimal and, in practice, a matter of heuristic judgment, we propose a simple model adaptation method to alleviate this problem. A series of experiments has been carried out on the Mandarin meeting recording (MMRC) corpus, and the results reveal the effectiveness of our proposed methods in relation to several existing baselines.
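To make the weighting issue concrete, multi-task acoustic model training of this kind typically minimizes a weighted combination of the primary task loss (e.g., senone classification) and the auxiliary task losses. The formulation below is a standard sketch with assumed notation ($\theta$ for the shared model parameters, $\lambda_k$ for the per-task weights), not necessarily the exact objective used in this work:
\[
\mathcal{L}_{\mathrm{MTL}}(\theta) \;=\; \mathcal{L}_{\mathrm{primary}}(\theta) \;+\; \sum_{k=1}^{K} \lambda_{k}\,\mathcal{L}_{\mathrm{aux},k}(\theta), \qquad \lambda_{k} \ge 0 .
\]
In such a setup the weights $\lambda_k$ are usually set by heuristic search, and the model adaptation step proposed here is intended to compensate for suboptimal choices of these weights after multi-task training.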