Exploring Language Model Combination Strategies for Mandarin-English Code-Switching Speech Recognition

林韋廷; 陳柏琳

Title	Exploring Language Model Combination Strategies for Mandarin-English Code-Switching Speech Recognition
Parallel title	Exploring Disparate Language Model Combination Strategies for Mandarin-English Code-Switching ASR
Authors	林韋廷, 陳柏琳
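Abstract (translated from Chinese)	"Code-switching (CS) is a common phenomenon in multilingual societies. In Taiwan, for example, the official language is Mandarin Chinese, but residents often mix English words, phrases, or sentences into everyday conversation. Transcribing code-switched speech is still regarded as an important and challenging task in automatic speech recognition (ASR). One of the most direct and effective ways to improve CS ASR performance is to improve its language model. We therefore propose several language model combination strategies, applied at different stages, for Mandarin-English code-switching ASR. The experimental setup in this paper uses two Mandarin-English CS language models and one monolingual Mandarin language model: the CS language models are trained on data from the same domain as the test sets, while the monolingual model is trained on a large general Mandarin corpus. Using combination strategies at different stages, we investigate whether ASR can combine the respective strengths of the different language models to perform well across different tasks. Three combination strategies are studied: N-gram language model combination, decoding graph combination, and word lattice combination. A series of experiments on several corpora from enterprise application domains confirms that combining language models does enable CS ASR to perform well on different test sets."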
English abstract	Code-switching (CS) speech is a common language phenomenon in multilingual societies. For example, the official language in Taiwan is Mandarin Chinese, but everyday conversations are often mingled with English words, phrases, or sentences. It is generally agreed that transcription of CS speech remains an important challenge for the current development of automatic speech recognition (ASR). One of the most straightforward and feasible ways to improve the performance of CS ASR is to improve the language model (LM) involved in ASR. Given these observations, we put forward disparate strategies that combine various language models at different stages of the ASR process. Our experimental configuration consists of two CS (i.e., mixed Mandarin Chinese and English) language models and one monolingual (i.e., Mandarin Chinese) language model, where the two CS language models are domain-specific and the monolingual language model is trained on a general text collection. By combining language models at different stages of the ASR process, we aim to determine whether the ASR system can integrate the strengths of the various language models to achieve improved performance across different tasks. More specifically, three strategies for combining language models are investigated, namely simple N-gram language model combination, decoding graph combination, and word lattice combination. A series of ASR experiments conducted on CS speech corpora compiled from different industrial application scenarios confirms the utility of the aforementioned LM combination strategies.
Pages	1-13
Keywords	code-switching, language model, automatic speech recognition, decoding graph, word lattice
Journal	ROCLING Proceedings (ROCLING論文集)
Issue	2020
Publisher	The Association for Computational Linguistics and Chinese Language Processing (中華民國計算語言學學會)
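
The abstract names N-gram language model combination as the first of its three strategies but does not say which combination scheme was used. As a rough illustration only, the sketch below shows linear interpolation, one common way to combine language models, reduced to unigram models for brevity. The toy sentences, the function names, and the interpolation weight of 0.5 are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def train_unigram(tokens, vocab):
    """Add-one smoothed unigram probabilities over a fixed vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}


def interpolate(lm_a, lm_b, weight_a):
    """Linear interpolation: P(w) = weight_a * P_a(w) + (1 - weight_a) * P_b(w)."""
    return {w: weight_a * lm_a[w] + (1.0 - weight_a) * lm_b[w] for w in lm_a}


def perplexity(lm, tokens):
    """Per-token perplexity of a token sequence under a unigram model."""
    log_prob = sum(math.log(lm[w]) for w in tokens)
    return math.exp(-log_prob / len(tokens))


# Hypothetical toy corpora: an in-domain code-switched text and a general Mandarin text.
cs_text = "我 要 開 meeting 討論 deadline".split()
general_text = "我 要 去 學校 討論 功課".split()
vocab = sorted(set(cs_text) | set(general_text))

cs_lm = train_unigram(cs_text, vocab)
general_lm = train_unigram(general_text, vocab)
# The weight 0.5 is an arbitrary assumption; in practice it is usually tuned on held-out data.
combined_lm = interpolate(cs_lm, general_lm, weight_a=0.5)

test_tokens = "我 要 去 meeting".split()
for name, lm in [("cs", cs_lm), ("general", general_lm), ("combined", combined_lm)]:
    print(name, round(perplexity(lm, test_tokens), 3))
```

The interpolated model stays a valid probability distribution because both inputs sum to one over the same vocabulary. Decoding graph combination and word lattice combination act at later stages of the recognizer and are not covered by this sketch.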