使用生成對抗網路於強健式自動語音辨識的應用 (Exploiting Generative Adversarial Network for Robustness Automatic Speech Recognition)

楊明璋; 趙福安; 羅天宏; 陳柏琳

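Title	使用生成對抗網路於強健式自動語音辨識的應用
Parallel title	Exploiting Generative Adversarial Network for Robustness Automatic Speech Recognition
Authors	楊明璋、趙福安、羅天宏、陳柏琳
Abstract (translated from Chinese)	Over the past few years, advances in deep learning have produced impressive results in many fields, and automatic speech recognition (ASR) is no exception. Despite these substantial improvements, noise still degrades recognition accuracy to a considerable degree. Background speech (babble), trains, bus stations, car noise, and restaurant ambience are all examples of environmental noise that readily affects recognition results. Research on robustness techniques for speech recognition therefore remains important. Prior work on robustness falls into two broad categories: feature-based and model-based. Feature-based techniques can be further divided into feature normalization and speech enhancement. This study applies a generative adversarial network (GAN) as a speech enhancement method operating on modulation spectrum features. The goal is to transform speech features corrupted by noisy environments or channel effects into features close to those of speech recorded in clean conditions. Compared with the original Mel-frequency cepstral coefficient (MFCC) features, this method effectively improves the recognition rate.
Abstract (English)	In the recent past, deep learning techniques have reached record-breaking performance in a wide variety of applications, including automatic speech recognition (ASR). Even though cutting-edge ASR systems evaluated on a few benchmark tasks have already reached human-like performance, in reality they are not robust, in the way that humans are, to disparate types of environmental noise such as babble, train, bus station, car driving, and restaurant noise, among others. In view of this, this paper embarks on an effort to develop effective enhancement methods, stemming from the so-called generative adversarial networks (GAN), for use in the modulation domain of speech feature vector sequences. A series of experiments conducted on the Aurora-4 database and task seem to demonstrate the practical merits of our methods.
Pages	212-225
Keywords	Generative Adversarial Network, Speech Enhancement, Robustness Techniques, Robust Speech Recognition
Journal	ROCLING Proceedings (ROCLING論文集)
Issue	2019
Publisher	Association for Computational Linguistics and Chinese Language Processing (中華民國計算語言學學會)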