English Abstract |
This paper presents a new speech enhancement approach derived from the factor analysis (FA) framework. FA is a data analysis model in which the relevant common factors are extracted from the observations: a factor loading matrix is estimated, and a model error term is introduced for each observation. Interestingly, FA is a subspace approach that properly represents noisy speech. It partitions the space of noisy speech into a principal subspace containing the clean speech and a complementary (minor) subspace containing the residual speech and noise. We show that FA is a more general data model than the signal subspace approach. To perform FA speech enhancement, we present a perceptual optimization procedure that minimizes the signal distortion subject to the energies of the residual speech and the noise being kept below specified levels. Importantly, we also present a hypothesis testing approach for optimally performing the subspace decomposition. In the experiments, we implement perceptual FA speech enhancement on the Aurora2 corpus. We find that the proposed approach achieves favorable speech recognition rates, especially when the signal-to-noise ratio is below 5 dB. |