ISCA Archive Interspeech 2020

Sub-Band Knowledge Distillation Framework for Speech Enhancement

Xiang Hao, Shixue Wen, Xiangdong Su, Yun Liu, Guanglai Gao, Xiaofei Li

In single-channel speech enhancement, methods based on full-band spectral features have been widely studied, while only a few methods pay attention to non-full-band spectral features. In this paper, we explore a knowledge distillation framework based on sub-band spectral mapping for single-channel speech enhancement. First, we divide the full frequency band into multiple sub-bands and pre-train an elite-level sub-band enhancement model (teacher model) for each sub-band. Each teacher model is dedicated to processing its own sub-band. Next, under the teacher models' guidance, we train a general sub-band enhancement model (student model) that works for all sub-bands. Without increasing the number of model parameters or the computational complexity, the student model's performance is further improved. To evaluate the proposed method, we conducted extensive experiments on an open-source dataset. The experimental results show that the guidance from the elite-level teacher models dramatically improves the student model's performance, which exceeds that of the full-band model while employing fewer parameters.
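To make the training procedure described in the abstract concrete, below is a minimal PyTorch sketch of the sub-band distillation idea: the spectrum is split into sub-bands, one pre-trained teacher serves each sub-band, and a single shared student is trained with a supervised loss plus a distillation term toward each teacher's output. All names, the band count, the toy architecture, and the loss weighting are illustrative assumptions; the paper's actual models and objective differ.

import torch
import torch.nn as nn

# Assumed settings for illustration only (not the paper's configuration).
N_FREQ = 256          # STFT frequency bins
N_SUBBANDS = 4        # number of sub-bands
BAND = N_FREQ // N_SUBBANDS

class SubBandEnhancer(nn.Module):
    """Toy spectral-mapping model: noisy sub-band magnitude -> clean estimate."""
    def __init__(self, band_size: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(band_size, hidden), nn.ReLU(),
            nn.Linear(hidden, band_size),
        )

    def forward(self, x):            # x: (batch, frames, band_size)
        return self.net(x)

# Steps 1-2: one pre-trained teacher per sub-band (pre-training omitted here).
teachers = [SubBandEnhancer(BAND) for _ in range(N_SUBBANDS)]
for t in teachers:
    t.eval()

# Step 3: a single student model shared across all sub-bands.
student = SubBandEnhancer(BAND)
opt = torch.optim.Adam(student.parameters(), lr=1e-3)
mse = nn.MSELoss()

def train_step(noisy_mag, clean_mag, alpha=0.5):
    """noisy_mag / clean_mag: (batch, frames, N_FREQ) magnitude spectra.
    Loss = supervised term + alpha * distillation term toward each teacher."""
    loss = torch.zeros(())
    for b, teacher in enumerate(teachers):
        lo, hi = b * BAND, (b + 1) * BAND
        x, y = noisy_mag[..., lo:hi], clean_mag[..., lo:hi]
        with torch.no_grad():
            y_teacher = teacher(x)   # frozen teacher guidance
        y_student = student(x)       # same student for every sub-band
        loss = loss + mse(y_student, y) + alpha * mse(y_student, y_teacher)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Example usage with random tensors standing in for STFT magnitudes.
noisy = torch.rand(8, 100, N_FREQ)
clean = torch.rand(8, 100, N_FREQ)
print(train_step(noisy, clean))

Because the student is applied band by band with shared weights, it adds no parameters or computation at inference time relative to a single sub-band model, which matches the abstract's claim.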


doi: 10.21437/Interspeech.2020-1539

Cite as: Hao, X., Wen, S., Su, X., Liu, Y., Gao, G., Li, X. (2020) Sub-Band Knowledge Distillation Framework for Speech Enhancement. Proc. Interspeech 2020, 2687-2691, doi: 10.21437/Interspeech.2020-1539

@inproceedings{hao20b_interspeech,
  author={Xiang Hao and Shixue Wen and Xiangdong Su and Yun Liu and Guanglai Gao and Xiaofei Li},
  title={{Sub-Band Knowledge Distillation Framework for Speech Enhancement}},
  year=2020,
  booktitle={Proc. Interspeech 2020},
  pages={2687--2691},
  doi={10.21437/Interspeech.2020-1539}
}