Title
使用低通時序列語音特徵訓練理想比率遮罩法之語音強化
Parallel title
Employing Low-Pass Filtered Temporal Speech Features for the Training of Ideal Ratio Mask in Speech Enhancement
Authors 陳彥同, 洪志偉
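Abstract (translated from Chinese)
Among the many deep-learning-based speech enhancement methods, masking-based methods estimate a mask that is multiplied with the spectrogram of the noisy speech, so that the resulting spectrogram contains less noise and a relatively clean speech signal can be reconstructed. For the input features of the deep model that learns the mask, many features long used in speech recognition, such as Mel-frequency cepstral coefficients, amplitude modulation spectrograms, and perceptual linear prediction coefficients, are suitable choices and yield masks that enhance speech effectively. In addition, low-pass filtering the temporal sequence of speech features has traditionally been used to suppress the distortion caused by noise. In this study, we therefore low-pass filter the temporal sequences of various speech features with the discrete wavelet transform, use the filtered sequences to train the deep model for the speech mask, and examine whether the learned mask enhances the spectrogram of the original noisy speech more effectively. In our preliminary experiments under babble noise, the deep model trained on the low-pass filtered feature sequences improves the quality and intelligibility of the test speech more effectively than the model trained on the original feature sequences.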
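The masking step described in the abstract can be illustrated with a small NumPy sketch. The abstract does not give the mask formula, so the definition used here, (S^2 / (S^2 + N^2))^beta with beta = 0.5, is an assumption based on a common formulation of the ideal ratio mask. The function names and the toy data are hypothetical. In the study a deep model predicts the mask, whereas here it is computed directly from known clean and noise magnitudes.

```python
import numpy as np

def ideal_ratio_mask(clean_mag, noise_mag, beta=0.5, eps=1e-12):
    """Assumed IRM definition per time-frequency bin: (S^2 / (S^2 + N^2)) ** beta."""
    clean_pow = clean_mag ** 2
    noise_pow = noise_mag ** 2
    # eps guards against division by zero in bins where both powers vanish.
    return (clean_pow / (clean_pow + noise_pow + eps)) ** beta

def apply_mask(noisy_mag, mask):
    """Element-wise product of the mask and the noisy magnitude spectrogram."""
    return noisy_mag * mask

# Toy magnitudes (4 frequency bins x 6 frames), not real speech data.
rng = np.random.default_rng(0)
clean_mag = rng.random((4, 6))
noise_mag = rng.random((4, 6))
# Crude stand-in for the noisy magnitude; real mixtures add complex spectra.
noisy_mag = clean_mag + noise_mag

mask = ideal_ratio_mask(clean_mag, noise_mag)
enhanced_mag = apply_mask(noisy_mag, mask)
```

Because the mask values lie between 0 and 1, the product never exceeds the noisy magnitude in any bin: bins dominated by noise are attenuated strongly, while bins dominated by speech are left almost unchanged.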
English abstract
The masking-based speech enhancement method seeks a multiplicative mask that is applied to the spectrogram of the input noise-corrupted utterance, and a deep neural network (DNN) is often used to learn the mask. In particular, features commonly used for automatic speech recognition can serve as the DNN input to learn a well-behaved mask that significantly reduces the noise distortion of the processed utterances. This study proposes preprocessing the input speech features of the ideal ratio mask (IRM)-based DNN by lowpass filtering in order to attenuate the noise components. Specifically, we employ the discrete wavelet transform (DWT) to decompose the temporal speech feature sequence and scale down the detail coefficients, which correspond to the high-pass portion of the sequence. Preliminary experiments conducted on a subset of the TIMIT corpus show that the proposed method makes the resulting IRM achieve higher speech quality and intelligibility for babble noise-corrupted signals than the original IRM, indicating that lowpass filtered temporal feature sequences can train a superior IRM network for speech enhancement.
Pages 35-47
Keywords Speech Enhancement, Temporal Feature Sequence, Lowpass Filtering, Ideal Ratio Mask, Wavelet Transform
Journal 中文計算語言學期刊
Issue December 2021 (Vol. 26, No. 2)
Publisher 中華民國計算語言學學會
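The lowpass preprocessing described in the abstract, decomposing each temporal feature sequence with the DWT and scaling down the detail coefficients, can be sketched as follows. The abstract does not state the wavelet, the number of decomposition levels, or the scaling factor, so the one-level Haar wavelet and the `detail_scale` parameter below are assumptions chosen for illustration, and all function names are hypothetical. The sketch also assumes an even number of frames.

```python
import numpy as np

def haar_dwt(x):
    """One-level Haar DWT along the last (time) axis; length must be even."""
    even, odd = x[..., 0::2], x[..., 1::2]
    approx = (even + odd) / np.sqrt(2.0)   # low-pass (approximation) half
    detail = (even - odd) / np.sqrt(2.0)   # high-pass (detail) half
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleave the reconstructed even and odd samples."""
    even = (approx + detail) / np.sqrt(2.0)
    odd = (approx - detail) / np.sqrt(2.0)
    out = np.empty(approx.shape[:-1] + (2 * approx.shape[-1],))
    out[..., 0::2] = even
    out[..., 1::2] = odd
    return out

def lowpass_features(features, detail_scale=0.0):
    """Attenuate the detail coefficients of each feature trajectory.

    features: array of shape (num_feature_dims, num_frames).
    detail_scale: 1.0 leaves the sequence unchanged, 0.0 removes the detail band.
    """
    approx, detail = haar_dwt(features)
    return haar_idwt(approx, detail_scale * detail)

# Toy feature matrix: 3 feature dimensions x 8 frames, not real speech features.
rng = np.random.default_rng(0)
features = rng.standard_normal((3, 8))
smoothed = lowpass_features(features, detail_scale=0.5)
```

With `detail_scale=1.0` the original sequence is recovered, and with `detail_scale=0.0` each pair of adjacent frames is replaced by its average, so intermediate values trade off smoothing against preserving fast temporal changes.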