Blind speech source separation via nonlinear time-frequency masking

Source: Chinese Journal of Acoustics
Aiming at the underdetermined convolutive mixture model, a blind speech source separation method based on nonlinear time-frequency masking is proposed, which exploits the approximate W-disjoint orthogonality (W-DO) property among independent speech signals in the time-frequency domain. In this method, the mixture signals observed by multiple microphones are first normalized in the time-frequency domain so that they are independent of frequency. A dynamic clustering algorithm is then adopted to obtain the active source information in each time-frequency slot, and a nonlinear function of the deflection angle from the cluster center is selected for time-frequency masking. Finally, blind separation of the mixed speech signals is achieved by the inverse short-time Fourier transform (STFT). This method not only solves the frequency permutation problem encountered in most classic frequency-domain blind separation techniques, but also suppresses the spatial direction diffusion of the separation matrix. Simulation results demonstrate that the proposed method outperforms the typical BLUES method, with the signal-to-noise-ratio gain (SNRG) increasing by 1.58 dB on average.
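The masking step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual nonlinear function, the definition of the deflection angle, or any parameter values, so everything below is an assumption made for illustration: the angle is taken as the Hermitian angle between a normalized observation vector and a cluster center, the mask is a logistic function of that angle, and the names `deflection_angle`, `nonlinear_mask`, `masks_for_slot`, `theta_t`, and `slope` are hypothetical. The cluster centers are simply given here rather than produced by the dynamic clustering algorithm.

```python
import cmath
import math


def deflection_angle(x, c):
    """Angle in radians between complex vectors x and c.

    Assumption: the Hermitian angle, which ignores any common phase factor.
    """
    inner = sum(ci.conjugate() * xi for xi, ci in zip(x, c))
    norm_x = math.sqrt(sum(abs(xi) ** 2 for xi in x))
    norm_c = math.sqrt(sum(abs(ci) ** 2 for ci in c))
    # Clamp to 1.0 so rounding error cannot push acos out of its domain.
    cos_theta = min(1.0, abs(inner) / (norm_x * norm_c))
    return math.acos(cos_theta)


def nonlinear_mask(theta, theta_t=math.pi / 8, slope=20.0):
    """Hypothetical soft mask: close to 1 for small angles, close to 0 for large ones."""
    return 1.0 / (1.0 + math.exp(slope * (theta - theta_t)))


def masks_for_slot(x, centroids):
    """Mask value of one time-frequency observation vector for each cluster."""
    return [nonlinear_mask(deflection_angle(x, c)) for c in centroids]


# Two hypothetical cluster centers for a two-microphone array.
centroids = [
    [1 / math.sqrt(2), 1 / math.sqrt(2)],
    [1 / math.sqrt(2), -1 / math.sqrt(2)],
]

# One observation vector dominated by the first source, with an arbitrary common phase.
phase = cmath.exp(1j * 0.7)
x = [phase * 0.72, phase * 0.69]

m = masks_for_slot(x, centroids)
print(m)  # large weight for the first cluster, near zero for the second
```

In a full pipeline, each mask would multiply the STFT of a reference microphone at every time-frequency slot before the inverse STFT. A soft mask of this kind attenuates slots that lie far from a cluster center instead of assigning every slot wholly to one source, which is one plausible reading of "nonlinear" in the abstract.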