月旦知識庫
Title
通過半監督學習改進端到端台語語音至中文文字翻譯
Parallel title
Improving End-to-end Taiwanese-Speech-to-Chinese-Text Translation by Semi-supervised Learning
Abstract
Taiwanese speech recognition faces two main challenges: the lack of large, publicly available Taiwanese speech corpora, and the absence of a unified writing system for Taiwanese. The former leaves speech recognition tasks short of training data, while the latter makes output formats inconsistent and hard to read. This study therefore takes translation from Taiwanese speech to Chinese text as its task, and builds a speech translation model by combining a pre-trained speech model with an end-to-end deep learning architecture. Our method starts from a small corpus of Taiwanese speech paired with Chinese text, collects a large amount of unpaired Taiwanese speech, and designs several algorithms that use the unpaired data to improve the translation system. The study explores four directions for improvement: the end-to-end speech translation model, pre-trained speech model features, iterative training, and corpus cleaning. Experimental results show that each of these methods effectively improves Taiwanese-speech-to-Chinese-text translation performance.
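The abstract does not spell out the iterative training or corpus cleaning algorithms, so the following is only a minimal sketch of one common semi-supervised pattern (pseudo-labeling with a confidence filter), not the authors' method. The function name `iterative_self_training`, the `train` callable, and the `min_confidence` threshold are all hypothetical and introduced purely for illustration.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]                       # (utterance id, Chinese text)
Model = Callable[[str], Tuple[str, float]]   # utterance -> (translation, confidence)


def iterative_self_training(
    paired: List[Pair],
    unpaired: List[str],
    train: Callable[[List[Pair]], Model],
    min_confidence: float = 0.8,
    rounds: int = 3,
) -> Model:
    """Hypothetical pseudo-labeling loop (an assumption, not the paper's algorithm).

    1. Train a seed model on the small paired corpus.
    2. Translate the unpaired speech with the current model.
    3. Keep only confident pseudo-labels (a simple stand-in for corpus cleaning).
    4. Retrain on paired + kept pseudo-labeled data, and repeat.
    """
    model = train(paired)
    for _ in range(rounds):
        pseudo: List[Pair] = []
        for utterance in unpaired:
            text, confidence = model(utterance)
            if confidence >= min_confidence:
                pseudo.append((utterance, text))
        # Pseudo-labels are rebuilt every round, so early mistakes can be dropped later.
        model = train(paired + pseudo)
    return model
```

In practice `train` would wrap fine-tuning of a real speech translation model, and the confidence score would come from that model (for example, a length-normalized log-probability); both are left abstract here because the source gives no such details.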
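Pages 21-28
Keywords 端到端語音翻譯; 半監督式學習; 語料清洗; End-to-end speech translation; Semi-supervised learning; Corpus cleaning
Journal ROCLING論文集 (ROCLING Proceedings)
Issue 202310 (2023 issue)
Publisher 中華民國計算語言學學會 (Association for Computational Linguistics and Chinese Language Processing)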