Taiwanese-Accented Mandarin and English Multi-Speaker Talking-Face Synthesis System (臺灣口音中英雙語之多語者影音合成系統)

Chia-Hsuan Lin (林珈萱); Jian-Peng Liao; Cho-Chun Hsieh; Kai-Chun Liao; Chun-Hsin Wu

Title	臺灣口音中英雙語之多語者影音合成系統
Parallel title	Taiwanese-Accented Mandarin and English Multi-Speaker Talking-Face Synthesis System
Authors	Chia-Hsuan Lin (林珈萱), Jian-Peng Liao, Cho-Chun Hsieh, Kai-Chun Liao, Chun-Hsin Wu
Abstract (translated from Chinese)	This paper proposes a multi-speaker talking-face synthesis system that combines voice cloning and lip-syncing. Given short speech and video clips of an arbitrary speaker, the system uses zero-shot transfer learning to turn text into a talking-face video of that speaker, with support for real-time translation. In addition, we trained several Taiwanese-accented models on open-source corpora and propose using Zhuyin (Bopomofo) as the text embedding of the synthesizer to improve the system's ability to synthesize sentences that mix Chinese and English. With this system, users can build a wide range of applications, and the research and application of this technique are novel in the field of audiovisual synthesis.
English abstract	This paper proposes a multi-speaker talking-face synthesis system. The system incorporates voice cloning and lip-syncing technology to achieve text-to-talking-face generation by acquiring audio and video clips of any speaker and using zero-shot transfer learning. In addition, we used open-source corpora to train several Taiwanese-accented models and proposed using Mandarin Phonetic Symbols (Bopomofo) as the character embedding of the synthesizer to improve the system's ability to synthesize Chinese-English code-switched sentences. Through our system, users can create rich applications. Also, the research on this technology is novel in the audiovisual speech synthesis field.
Pages	40-48
Keywords	Multi-Speaker TTS, Speaker Verification, Voice Cloning, Code-Switching, Lip-Syncing, Talking-Face Generation
Journal	ROCLING Proceedings (ROCLING論文集)
Issue	202212 (2022 issue)
Publisher	Association for Computational Linguistics and Chinese Language Processing (中華民國計算語言學學會)
Previous article in this issue	結合詞向量技術與分群演算法於信用卡商戶名稱辨識
Next article in this issue	Is Character Trigram Overlapping Ratio Still the Best Similarity Measure for Aligning Sentences in a Paraphrased Corpus?
