
RCRNN-based Sound Event Detection System with Specific Speech Resolution
Authors: Sung-Jen Huang, Yih-Wen Wang, Chia-Ping Chen, Chung-Li Lu, Bo-Cheng Chan
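Abstract (translated from the Chinese): The goal of sound event detection is to label the sound events in an audio recording along with their time boundaries. Building on the mean-teacher framework for semi-supervised learning, we propose an RCRNN network architecture with residual connections and an attention mechanism, which can be trained on large amounts of weakly labeled and unlabeled data. Among the many sound events, speech carries richer information, so we use specific time-frequency parameters to extract acoustic features for that class, and we apply data augmentation and post-processing to further improve performance. On the DCASE 2021 Task 4 validation set, the proposed system achieves PSDS (Polyphonic Sound Detection Score) scenario 1, PSDS scenario 2, and event-based F1-score of 38.2%, 58.2%, and 44.3%, respectively, outperforming the baseline's 33.8%, 52.9%, and 40.7%.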
Abstract (English): A sound event detection (SED) system outputs the sound events in an audio signal together with their time boundaries. We propose an RCRNN-based SED system with residual connections and a convolutional block attention mechanism, built on the mean-teacher framework for semi-supervised learning. The neural network can be trained with a large amount of weakly labeled and unlabeled data. In addition, we consider that the speech event carries more information than other sound events, so we use a specific time-frequency resolution to extract the acoustic features of the speech event. Furthermore, we apply data augmentation and post-processing to improve performance. On the DCASE 2021 Task 4 validation set, the proposed system achieves a PSDS (Polyphonic Sound Detection Score) scenario 2 of 57.6% and an event-based F1-score of 41.6%, outperforming the baseline scores of 52.7% and 40.7%.
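Pages: 118-123
Keywords: Sound event detection, Mean teacher model, CBMA, Speech
Journal: ROCLING Proceedings (ROCLING論文集)
Issue: 202112 (2021 issue)
Publisher: Association for Computational Linguistics and Chinese Language Processing (中華民國計算語言學學會)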



