月旦知識庫
 
  1. 熱門:
 
首頁 臺灣期刊   法律   公行政治   醫事相關   財經   社會學   教育   其他 大陸期刊   核心   重要期刊 DOI文章
中文計算語言學期刊 本站僅提供期刊文獻檢索。
  【月旦知識庫】是否收錄該篇全文,敬請【登入】查詢為準。
最新【購點活動】


篇名
Construction and Automatization of a Minnan Child Speech Corpus with some Research Findings
作者 Jane S. Tsay (Jane S. Tsay)
中文摘要
Taiwanese Child Language Corpus (TAICORP) is a corpus based on spontaneous conversations between young children and their adult caretakers in Minnan (Taiwan Southern Min) speaking families in Chiayi County, Taiwan. This corpus is special in several ways: (1) It is a Minnan corpus; (2) It is a speech-based corpus; (3) It is a corpus of a language that does not yet have a conventionalized orthography; (4) It is a collection of longitudinal child language data; (5) It is one of the largest child corpora in the world with about two million syllables in 497,426 lines (utterances) based on about 330 hours of recordings. Regarding the format, TAICORP adopted the Child Language Data Exchange System (CHILDES) [MacWhinney and Snow 1985; MacWhinney 1995] for transcribing and coding the recordings into machine-readable text. The goals of this paper are to introduce the construction of this speech-based corpus and at the same time to discuss some problems and challenges encountered. The development of an automatic word segmentation program with a spell-checker is also discussed. Finally, some findings in syllable distribution are reported.
起訖頁 411-441
關鍵詞 MinnanTaiwan Southern MinTaiwaneseSpeech CorpusChild LanguageCHILDESAutomatic Word Segmentation
刊名 中文計算語言學期刊  
期數 200712 (12:4期)
出版單位 中華民國計算語言學學會
該期刊-上一篇 Some Studies on Min-Nan Speech Processing
該期刊-下一篇 Automatic Pronunciation Assessment for Mandarin Chinese: Approaches and System Overview
 

新書閱讀



最新影音


優惠活動




讀者服務專線:+886-2-23756688 傳真:+886-2-23318496
地址:臺北市館前路28 號 7 樓 客服信箱
Copyright © 元照出版 All rights reserved. 版權所有,禁止轉貼節錄