Abstract: Low speech intelligibility in noisy listening conditions makes communication with others more difficult. Various strategies have been suggested for modifying a speech signal before it is presented in a noisy listening environment, with the goal of increasing its intelligibility. A state-of-the-art approach, referred to as Spectral Shaping and Dynamic Range Compression (SSDRC), relies on modifying the spectral and temporal structure of clean speech and has been shown to considerably improve the intelligibility of speech in noisy listening conditions. In this paper, we present a non-causal Wavenet-like model for mapping clean speech samples to samples generated by SSDRC. A successful non-linear mapping function has the potential to be used a) to improve the intelligibility of noisy speech and b) in Wavenet-based speech synthesizers as a model-based intelligibility improvement layer. Objective and subjective results show that the Wavenet-based mapping function reproduces the intelligibility gains of SSDRC while substantially improving the quality of the modified signal compared to that obtained with SSDRC.
Acknowledgment: This work was funded by the E.U. Horizon2020 Grant Agreement 675324, Marie Sklodowska-Curie Innovative Training Network, ENRICH.