Toward Constructing a Multilingual Speech Corpus for Taiwanese (Min-nan), Hakka, and Mandarin

Title	Toward Constructing a Multilingual Speech Corpus for Taiwanese (Min-nan), Hakka, and Mandarin
Authors	Lyu, Ren-yuan; Liang, Min-siong; Chiang, Yuang-chin
Abstract	The Formosa speech database (ForSDat) is a multilingual speech corpus collected at Chang Gung University and sponsored by the National Science Council of Taiwan. The corpus is intended to cover the three most frequently used languages in Taiwan: Taiwanese (Min-nan), Hakka, and Mandarin. The goal of this 3-year project is to collect a phonetically rich speech corpus from more than 1,800 speakers, totaling hundreds of hours of speech. The first version of the corpus, containing speech from 600 speakers of Taiwanese and Mandarin, was recently completed and is ready for release. It contains about 49 hours of speech and 247,000 utterances.
Pages	1-12
Keywords	Phonetic alphabet; Pronunciation lexicon; Phonetically balanced word; Speech corpus
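Journal	International Journal of Computational Linguistics and Chinese Language Processing (中文計算語言學期刊)
Issue	August 2004 (Vol. 9, No. 2)
Publisher	Association for Computational Linguistics and Chinese Language Processing (中華民國計算語言學學會)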
Next article in this issue	Multiple-Translation Spotting for Mandarin-Taiwanese Speech-to-Speech Translation