Simon Leglaive

Tenured Assistant Professor
CentraleSupélec, IETR (UMR CNRS 6164)

simon.leglaive@centralesupelec.fr
+33 (0)1 75 31 65 51
CentraleSupélec - Rennes Campus
Avenue de la Boulaie
CS 47601
F-35576 Cesson-Sévigné Cedex

Journals

2025

A vector quantized masked autoencoder for audiovisual speech emotion recognition
Samir Sadok, Simon Leglaive, Renaud Séguier
Computer Vision and Image Understanding, vol. 257, 2025
Paper | Webpage

Objective and subjective evaluation of speech enhancement methods in the UDASE task of the 7th CHiME challenge
Simon Leglaive, Matthieu Fraticelli, Hend ElGhazaly, Léonie Borne, Mostafa Sadeghi, Scott Wisdom, Manuel Pariente, John R. Hershey, Daniel Pressnitzer, Jon P. Barker
Computer Speech & Language, vol. 89, 2025
Paper | GitHub | Data

2024

A multimodal dynamical variational autoencoder for audiovisual speech representation learning
Samir Sadok, Simon Leglaive, Laurent Girin, Xavier Alameda-Pineda, Renaud Séguier
Neural Networks, 2024
Paper | Webpage | Code

2023

Learning and controlling the source-filter representation of speech with a variational autoencoder
Samir Sadok, Simon Leglaive, Laurent Girin, Xavier Alameda-Pineda, Renaud Séguier
Speech Communication, vol. 148, 2023
Paper | Webpage | Code

2022

Unsupervised speech enhancement using dynamical variational autoencoders
Xiaoyu Bie, Simon Leglaive, Xavier Alameda-Pineda, Laurent Girin
IEEE/ACM Transactions on Audio, Speech and Language Processing, vol. 30, 2022
Paper | Webpage | Code

2021

Dynamical variational autoencoders: A comprehensive review
Laurent Girin, Simon Leglaive, Xiaoyu Bie, Julien Diard, Thomas Hueber, Xavier Alameda-Pineda
Foundations and Trends in Machine Learning, vol. 15, no. 1-2, 2021
Paper | Code

2020

Audio-visual speech enhancement using conditional variational auto-encoders
Mostafa Sadeghi, Simon Leglaive, Xavier Alameda-Pineda, Laurent Girin, Radu Horaud
IEEE/ACM Transactions on Audio, Speech and Language Processing, vol. 28, 2020
Paper | Code

2019

Noise power spectral density estimation using long short-term memory
Xiaofei Li, Simon Leglaive, Laurent Girin, Radu Horaud
IEEE Signal Processing Letters, vol. 26, no. 6, 2019
Paper

2018

Student's t source and mixing models for multichannel audio source separation
Simon Leglaive, Roland Badeau, Gaël Richard
IEEE/ACM Transactions on Audio, Speech and Language Processing, vol. 26, no. 6, 2018
Paper | Webpage | Code

2016

Multichannel audio source separation with probabilistic reverberation priors
Simon Leglaive, Roland Badeau, Gaël Richard
IEEE/ACM Transactions on Audio, Speech and Language Processing, vol. 24, no. 12, 2016
Paper | Webpage | Code

International conferences

2026

Test-time adaptation for speech enhancement with an autoregressive speech prior
Sofiene Kammoun, Simon Leglaive, Xavier Alameda-Pineda, Timo Gerkmann
Accepted to the International Workshop on Acoustic Signal Enhancement (IWAENC), Cremona, Italy, 2026

Masked autoregressive speech enhancement with continuous neural audio codec representations
Yoto Fujita, Simon Leglaive, Laurent Girin
Accepted to the International Workshop on Acoustic Signal Enhancement (IWAENC), Cremona, Italy, 2026

Modeling strategies for speech enhancement in the latent space of a neural audio codec
Sofiene Kammoun, Xavier Alameda-Pineda, Simon Leglaive
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Barcelona, Spain, 2026
Paper | Webpage

2025

MEGA: Masked generative autoencoder for human mesh recovery
Guénolé Fiche, Simon Leglaive, Xavier Alameda-Pineda, Francesc Moreno-Noguer
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, USA, 2025
Oral presentation (3.3% of accepted papers)
Paper | Webpage

AnCoGen: Analysis, control and generation of speech with a masked autoencoder
Samir Sadok, Simon Leglaive, Laurent Girin, Gaël Richard, Xavier Alameda-Pineda
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Hyderabad, India, 2025
Paper | Webpage | Code

2024

VQ-HPS: Human pose and shape estimation in a vector-quantized latent space
Guénolé Fiche, Simon Leglaive, Xavier Alameda-Pineda, Antonio Agudo, Francesc Moreno-Noguer
European Conference on Computer Vision (ECCV), Milano, Italy, 2024
Paper | Webpage

Towards improving speech emotion recognition using synthetic data augmentation from emotion conversion
Karim M. Ibrahim, Antony Perzo, Simon Leglaive
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Seoul, Korea, 2024
Paper

2023

Motion-DVAE: Unsupervised learning for fast human motion denoising
Guénolé Fiche, Simon Leglaive, Xavier Alameda-Pineda, Renaud Séguier
ACM SIGGRAPH Conference on Motion, Interaction and Games (ACM MIG), 2023
Paper | Webpage

SwimXYZ: A large-scale dataset of synthetic swimming motions and videos
Guénolé Fiche, Vincent Sevestre, Camila Gonzalez-Barral, Simon Leglaive, Renaud Séguier
ACM SIGGRAPH Conference on Motion, Interaction and Games (ACM MIG), 2023
Paper | Webpage

The CHiME-7 UDASE task: Unsupervised domain adaptation for conversational speech enhancement
Simon Leglaive, Léonie Borne, Efthymios Tzinis, Mostafa Sadeghi, Matthieu Fraticelli, Scott Wisdom, Manuel Pariente, Daniel Pressnitzer, John R. Hershey
The 7th International Workshop on Speech Processing in Everyday Environments (CHiME), Dublin, Ireland, 2023
Paper | Website | Slides

Unsupervised speech enhancement with deep dynamical generative speech and noise models
Xiaoyu Lin, Simon Leglaive, Laurent Girin, Xavier Alameda-Pineda
Interspeech, Dublin, Ireland, August 2023
Paper

A vector quantized masked autoencoder for speech emotion recognition
Samir Sadok, Simon Leglaive, Renaud Séguier
IEEE ICASSP 2023 Workshop on Self-Supervision in Audio, Speech and Beyond (SASB), Rhodes, Greece, June 2023
Paper | Webpage | Code

Speech modeling with a hierarchical transformer dynamical VAE
Xiaoyu Lin, Xiaoyu Bie, Simon Leglaive, Laurent Girin, Xavier Alameda-Pineda
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Rhodes, Greece, June 2023
Paper

2022

Expectation-maximization based defense mechanism for distributed model predictive control
Rafael Accácio Nogueira, Romain Bourdais, Simon Leglaive, Hervé Guéguen
IFAC Conference on Networked Systems (NecSys22), Zürich, Switzerland, July 2022
Paper

2021

On speech sparsity for computational efficiency and noise reduction in hearing aids
Adrien Llave, Simon Leglaive
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA), Tokyo, Japan, December 2021
Paper | Webpage

A benchmark of dynamical variational autoencoders applied to speech spectrogram modeling
Xiaoyu Bie, Laurent Girin, Simon Leglaive, Thomas Hueber, Xavier Alameda-Pineda
Interspeech, Brno, Czech Republic, September 2021
Paper | Code

2020

Localization cues preservation in hearing aids by combining noise reduction and dynamic range compression
Adrien Llave, Simon Leglaive, Renaud Séguier
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA), Auckland, New Zealand, December 2020
Paper | Webpage

A recurrent variational autoencoder for speech enhancement
Simon Leglaive, Xavier Alameda-Pineda, Laurent Girin, Radu Horaud
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Barcelona, Spain, May 2020
Paper | Webpage | Code | Slides | Video

2019

Notes on the use of variational autoencoders for speech and audio spectrogram modeling
Laurent Girin, Thomas Hueber, Fanny Roche, Simon Leglaive
International Conference on Digital Audio Effects (DAFx), Birmingham, UK, September 2019
Paper

Semi-supervised multichannel speech enhancement with variational autoencoders and non-negative matrix factorization
Simon Leglaive, Laurent Girin, Radu Horaud
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Brighton, UK, May 2019
Paper | Webpage | Code | Slides

Speech enhancement with variational autoencoders and alpha-stable distributions
Simon Leglaive, Umut Şimşekli, Antoine Liutkus, Laurent Girin, Radu Horaud
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Brighton, UK, May 2019
Paper | Webpage | Code | Poster

2018

A variance modeling framework based on variational autoencoders for speech enhancement
Simon Leglaive, Laurent Girin, Radu Horaud
IEEE International Workshop on Machine Learning for Signal Processing (MLSP), Aalborg, Denmark, September 2018
Paper | Supp. document | Webpage | Code | Slides

Alpha-stable low-rank plus residual decomposition for speech enhancement
Umut Şimşekli, Halil Erdoğan, Simon Leglaive, Antoine Liutkus, Roland Badeau, Gaël Richard
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Calgary, Canada, April 2018
Paper

2017

Separating time-frequency sources from time-domain convolutive mixtures using non-negative matrix factorization
Simon Leglaive, Roland Badeau, Gaël Richard
IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NY, USA, October 2017
Paper | Supp. document | Webpage | Code | Slides

Semi-blind Student's t source separation for multichannel audio convolutive mixtures
Simon Leglaive, Roland Badeau, Gaël Richard
European Signal Processing Conference (EUSIPCO), Kos Island, Greece, August 2017
Paper | Webpage | Code | Slides

Alpha-stable multichannel audio source separation
Simon Leglaive, Umut Şimşekli, Antoine Liutkus, Roland Badeau, Gaël Richard
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), New Orleans, LA, USA, March 2017
Paper | Supp. document | Poster

Multichannel audio source separation: variational inference of time-frequency sources from time-domain observations
Simon Leglaive, Roland Badeau, Gaël Richard
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), New Orleans, LA, USA, March 2017
Paper | Webpage | Code | Slides

2016

Autoregressive moving average modeling of late reverberation in the frequency domain
Simon Leglaive, Roland Badeau, Gaël Richard
European Signal Processing Conference (EUSIPCO), Budapest, Hungary, August 2016
Paper | Poster

2015

Multichannel audio source separation with probabilistic reverberation modeling
Simon Leglaive, Roland Badeau, Gaël Richard
IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NY, USA, October 2015
Paper | Slides

Singing voice detection with deep recurrent neural networks
Simon Leglaive, Romain Hennequin, Roland Badeau
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Brisbane, Australia, April 2015
Paper | Slides

Technical reports

2022

HiT-DVAE: Human motion generation via hierarchical transformer dynamical VAE
Xiaoyu Bie, Wen Guo, Simon Leglaive, Laurent Girin, Francesc Moreno-Noguer, Xavier Alameda-Pineda
arXiv preprint arXiv:2204.01565, 2022
Paper

National conferences

2025

Débruitage de parole semi-supervisé par modélisation générative dans un espace de représentation discret des signaux audio
Sofiene Kammoun, Simon Leglaive
XXXe Colloque GRETSI, Strasbourg, France, August 2025

2023

Étude sur l’inversion de StyleGAN dans un contexte de détection d’hypertrucages
Matthieu Delmas, Amine Kacete, Stéphane Paquelet, Simon Leglaive, Renaud Séguier
XXIXe Colloque GRETSI, Grenoble, France, August-September 2023
Paper

2022

Les auto-encodeurs variationnels dynamiques et leur application à la modélisation de spectrogrammes de parole
Laurent Girin, Simon Leglaive, Xiaoyu Bie, Julien Diard, Thomas Hueber, Xavier Alameda-Pineda
XXXIVe Journées d’Études sur la Parole (JEP), Île de Noirmoutier, France, June 2022
Paper

2017

Séparation de sources audio en milieu réverbérant : Factorisation en matrices non-négatives et représentation temporelle du mélange convolutif
Simon Leglaive, Roland Badeau, Gaël Richard
XXVIe Colloque GRETSI, Juan-Les-Pins, France, September 2017
Paper | Webpage | Code | Slides

2015

A priori probabiliste anéchoïque pour la séparation sous-déterminée de sources sonores en milieu réverbérant
Simon Leglaive, Roland Badeau, Gaël Richard
XXVe Colloque GRETSI, Lyon, France, September 2015
Paper | Poster

Theses

2017

Modèles de mélange pour la séparation multicanale de sources sonores en milieu réverbérant
Ph.D. Thesis
Defended on December 12, 2017
Manuscript | Slides

2014

Extraction d'information dans la musique pour l'automatisation de la séparation de la voix chantée : vers des méthodes d'apprentissage profond
Master Thesis
Work carried out at Audionamix
Manuscript