Improving Low-Resource Speech Recognition through Multilingual Fine-Tuning with Language Identifiers and Self-Training

Karol Nowakowski; Michal Ptaszynski

Title	Improving Low-Resource Speech Recognition through Multilingual Fine-Tuning with Language Identifiers and Self-Training
Authors	Karol Nowakowski; Michal Ptaszynski
Abstract	Previous work has demonstrated that multilingual fine-tuning of a pretrained multilingual speech representation model can lead to improved speech recognition accuracy when there is extremely little target language data available. In this paper we show that fine-tuning on labeled speech data from multiple languages sharing common phonological traits, preprocessed by attaching a language identifier to each speech sample, yields competitive results compared to monolingual fine-tuning, even if a moderate amount of target language data is available. In order to further improve the performance of our system, we apply self-training using unlabeled speech data. Our results indicate that fine-tuning a speech recognition model jointly on a combination of multilingual data and pseudo-labeled data yields superior performance compared to using any of the two augmentation techniques individually. We also find that models fine-tuned on multilingual data with language identifiers produce better results even if explicit information about language identity is not provided at inference time.
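Pages	63-70
Keywords	Speech recognition, Underresourced language, Ainu, Multilingual learning, Transfer learning, Cross-lingual transfer, Language identifiers, Self-training
Publication	ROCLING Proceedings (ROCLING論文集)
Issue	202310 (2023 issue)
Publisher	Association for Computational Linguistics and Chinese Language Processing (中華民國計算語言學學會)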