DEGREASE (ANR-23-CE23-0009) is a 45-month project (2024/04 - 2028/01) funded by the French National Research Agency (ANR) within the Young Researcher program (JCJC) and coordinated by Simon Leglaive.
Figure 1: Illustration of the speech enhancement task.
DEGREASE stands for deep generative and inference models for weakly-supervised speech enhancement. Speech enhancement consists of improving the quality and intelligibility of a speech signal in a recording degraded by, for instance, interfering sound sources and reverberation (see Figure 1). Speech enhancement finds applications in various technologies for human and machine listening (hearing aids, assistive listening, vocal assistants, smartphones, smart homes, etc.).
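To make the degradation concrete, here is a minimal sketch of how a noisy and reverberant recording is commonly simulated: the speech is convolved with a room impulse response, and noise is added at a chosen signal-to-noise ratio. The function name `simulate_degraded_recording`, the random stand-in signals, and the toy impulse response are assumptions made for illustration only; they are not taken from the DEGREASE project.

```python
import numpy as np


def simulate_degraded_recording(speech, noise, rir, snr_db):
    """Hypothetical helper: reverberate `speech` and add `noise` at `snr_db` dB."""
    # Reverberation: convolve the dry speech with the room impulse response
    # and truncate to the original length.
    reverberant = np.convolve(speech, rir)[: len(speech)]
    # Trim the noise to the same length (assumes it is at least as long).
    noise = noise[: len(reverberant)]
    # Scale the noise so that the speech-to-noise power ratio equals snr_db.
    speech_power = np.mean(reverberant**2)
    noise_power = np.mean(noise**2)
    gain = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    scaled_noise = gain * noise
    return reverberant + scaled_noise, reverberant, scaled_noise


rng = np.random.default_rng(0)
speech = rng.standard_normal(16000)  # stand-in for 1 s of speech at 16 kHz
noise = rng.standard_normal(16000)   # stand-in for an interfering source
# Toy impulse response: random taps with an exponentially decaying envelope.
rir = np.exp(-np.arange(800) / 100.0) * rng.standard_normal(800)

noisy, reverberant, scaled_noise = simulate_degraded_recording(
    speech, noise, rir, snr_db=5.0
)
```

A speech enhancement system receives only `noisy` and has to estimate the underlying speech signal.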
Figure 2: The (now) conventional approach to supervised speech enhancement.
In recent years, there has been great progress in speech enhancement thanks to deep learning models trained in a supervised manner. Supervised speech enhancement involves three main ingredients, as illustrated in Figure 2:
Figure 3: High-level overview of the methodology proposed in DEGREASE.
The scientific ambition of the DEGREASE project is to develop speech enhancement methods that can leverage real unlabeled recordings of noisy and reverberant speech at training time and that can adapt to new acoustic conditions at test time. To reach this objective, we propose a methodology at the crossroads of audio signal processing, probabilistic graphical modeling, and deep learning, based on deep generative and inference models specifically designed for processing multi-microphone speech signals.
The probabilistic generative modeling approach will allow us to consider the clean speech signals as partially-observed variables during training. Models will thus be learned in a semi-supervised manner at training time, and they will be adapted in an unsupervised manner at test time. Speech enhancement will be achieved by inverting the learned generative model, i.e., performing inference.
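One generic way to write down such a semi-supervised objective is the variational formulation sketched below. It is an illustrative assumption, not the project's confirmed model: $x$ denotes a noisy recording, $s$ the clean speech, $p_\theta$ a generative model, $q_\phi$ an inference model, $\mathcal{D}_L$ a set of pairs $(x, s)$ in which the clean speech is observed, and $\mathcal{D}_U$ a set of real recordings $x$ without clean references.

```latex
\mathcal{L}(\theta, \phi)
  = \sum_{(x, s) \in \mathcal{D}_L} \log p_\theta(x, s)
  + \sum_{x \in \mathcal{D}_U}
    \mathbb{E}_{q_\phi(s \mid x)}
    \big[ \log p_\theta(x, s) - \log q_\phi(s \mid x) \big],
\qquad
\hat{s} = \mathbb{E}_{q_\phi(s \mid x)}[s]
  \approx \mathbb{E}_{p_\theta(s \mid x)}[s].
```

The second sum is an evidence lower bound on $\log p_\theta(x)$, so unlabeled recordings contribute to training even though $s$ is not observed. Under the same assumptions, unsupervised adaptation at test time would amount to further maximizing this bound on the test recording alone, and the enhanced signal $\hat{s}$ is obtained by inference on $s$ given $x$.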
The outcomes of the DEGREASE project are expected to help build more reliable speech technologies that can work optimally in diverse and uncontrolled acoustic environments.
Simon Leglaive - Principal Investigator
Sofiene Kammoun - PhD Student
Louis Bahrman - Postdoctoral Researcher
Modeling strategies for speech enhancement in the latent space of a neural audio codec
Sofiene Kammoun, Xavier Alameda-Pineda, Simon Leglaive
arXiv preprint arXiv:2510.26299, 2025. Accepted at IEEE ICASSP 2026.
Paper | Webpage

Objective and subjective evaluation of speech enhancement methods in the UDASE task of the 7th CHiME challenge
Simon Leglaive, Matthieu Fraticelli, Hend ElGhazaly, Léonie Borne, Mostafa Sadeghi, Scott Wisdom, Manuel Pariente, John R. Hershey, Daniel Pressnitzer, Jon P. Barker
Computer Speech & Language, vol. 89, 2025
Paper | GitHub | Data

AnCoGen: Analysis, control and generation of speech with a masked autoencoder
Samir Sadok, Simon Leglaive, Laurent Girin, Gaël Richard, Xavier Alameda-Pineda
IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP), Hyderabad, India, 2025
Paper | Webpage | Code
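
Débruitage de parole semi-supervisé par modélisation générative dans un espace de représentation discret des signaux audio [Semi-supervised speech denoising via generative modeling in a discrete representation space of audio signals]
Sofiene Kammoun, Simon Leglaive
XXXe Colloque GRETSI, Strasbourg, France, August 2025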