Investigating End-to-End Speech Recognition for Mispronunciation Detection and Diagnosis

張修瑞; 羅天宏; 劉慈恩; 陳柏琳

Title	Investigating End-to-End Speech Recognition for Mispronunciation Detection and Diagnosis
Parallel title	Investigating on Computer-Assisted Pronunciation Training Leveraging End-to-End Speech Recognition Techniques
Authors	張修瑞、羅天宏、劉慈恩、陳柏琳
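Abstract (translated from Chinese)	Computer-assisted pronunciation training (CAPT) systems perform two tasks: mispronunciation detection and mispronunciation diagnosis. In previous work, both tasks relied mainly on forced alignment from a traditional speech recognition system: the phone segments produced by forced alignment are scored against either all observed phones or only the more confusable phones to compute goodness of pronunciation (GOP) scores, which then serve as the basis for judging pronunciation quality. However, the training pipeline of a traditional speech recognition system is long and complex. In recent years, end-to-end speech recognition systems have greatly simplified this pipeline, and their performance is approaching that of traditional systems. This paper therefore works within an end-to-end architecture and examines how two approaches affect mispronunciation detection: (1) using the confidence score produced during recognition, and (2) using the speech recognition output directly. Experimental results show that, compared with earlier approaches built on traditional speech recognition, the end-to-end architecture requires fewer training steps and substantially improves detection and diagnosis performance.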
English abstract	One of the primary tasks of a computer-assisted pronunciation training (CAPT) system is mispronunciation detection and diagnosis. Previous research on CAPT mostly relies on a forced-alignment procedure, usually conducted with the acoustic models of a traditional speech recognition system, whose phone segments are used to calculate goodness of pronunciation (GOP) scores for the phones of spoken words with respect to a text prompt. However, the training process of a traditional speech recognition system is complicated. In recent years, end-to-end speech recognition systems have not only greatly simplified this process but are also catching up with traditional speech recognition in performance. In view of this, this paper sets out to conduct mispronunciation detection and diagnosis on the strength of end-to-end speech recognition. To this end, we design and develop two mispronunciation detection methods: 1) a method leveraging a recognition confidence measure; 2) a method based simply on speech recognition results. A series of experiments showed that leveraging an end-to-end speech recognition architecture for mispronunciation detection and diagnosis not only reduced the training steps originally required for traditional speech recognition but also improved detection and diagnosis performance significantly.
Pages	266-280
Keywords	end-to-end speech recognition, acoustic model, mispronunciation detection, mispronunciation diagnosis
Journal	ROCLING論文集 (ROCLING Proceedings)
Issue	2019
Publisher	中華民國計算語言學學會 (The Association for Computational Linguistics and Chinese Language Processing)
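
The GOP-based baseline mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one common GOP formulation (the average log posterior of the canonical phone over its force-aligned segment, minus the best average log posterior among all phones), and the names `gop_score` and `is_mispronounced`, the toy posteriors, and the threshold of -1.0 are hypothetical values chosen only for illustration.

```python
import math


def gop_score(frame_posteriors, canonical_phone):
    """GOP of one aligned segment (illustrative formulation, see lead-in).

    frame_posteriors: list of dicts mapping phone -> posterior probability,
    one dict per frame in the force-aligned segment.
    Returns a value <= 0; 0 means the canonical phone is the best match.
    """
    if not frame_posteriors:
        raise ValueError("segment must contain at least one frame")
    eps = 1e-10  # floor to avoid log(0)
    n_frames = len(frame_posteriors)
    phones = frame_posteriors[0].keys()
    # Average log posterior of every candidate phone over the segment.
    avg_log = {
        p: sum(math.log(max(f.get(p, 0.0), eps)) for f in frame_posteriors) / n_frames
        for p in phones
    }
    return avg_log[canonical_phone] - max(avg_log.values())


def is_mispronounced(frame_posteriors, canonical_phone, threshold=-1.0):
    """Flag the segment when its GOP falls below a (hypothetical) threshold."""
    return gop_score(frame_posteriors, canonical_phone) < threshold


# Toy two-frame segment with two candidate phones (made-up posteriors).
segment = [{"ae": 0.8, "eh": 0.2}, {"ae": 0.7, "eh": 0.3}]
print(gop_score(segment, "ae"))         # canonical phone matches best: 0.0
print(is_mispronounced(segment, "eh"))  # prompt expected "eh", speaker closer to "ae"
```

In practice the threshold would be tuned per phone on held-out data. The confidence-score method studied in the paper replaces the forced-alignment step with scores taken from an end-to-end recognizer, which this sketch does not attempt to model.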
