Title: Decentralized collaborative learning for speech recognition in a context of privacy protection
The candidate will contribute to the DEEP-PRIVACY project funded by the French National Research Agency (ANR). DEEP-PRIVACY aims at developing private-by-design acoustic models for use in automatic speech recognition systems. More and more devices, mobile or not, are becoming part of our daily lives. These devices use voice technology to let users access IT services through natural interaction (Apple Siri, Amazon Alexa, Google Home, etc.). With the stated goal of improving the quality of their services, which include automatic speech recognition, the companies that offer such solutions usually collect private audio data from these devices. This constitutes a real risk for privacy.
The PhD proposal is to study and propose approaches for distributed and decentralized learning of neural acoustic models in which user data remain on the user’s device.
A first objective consists in applying different approaches from the literature to the particular case of acoustic models in speech processing tasks, and comparing them experimentally. We will focus on machine learning algorithms in which data are collected locally on every peer and are not transmitted to a central server. Communications are restricted to models or weight updates computed locally on the device. Both centralized federated learning approaches [Konečný et al. 2016, Leroy et al. 2018] and purely decentralized approaches [Bellet et al. 2017] will be studied. Comparisons will cover several trade-offs, for instance between model size and communication costs.
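As a rough illustration of the two families of approaches mentioned above, the following minimal sketch contrasts a server-side weighted average of locally trained weights with a pairwise peer-to-peer averaging step. It is a toy example under stated assumptions, not the method of any cited paper: the model is a plain linear regressor, and the function names (local_update, federated_average, gossip_step) are hypothetical. In both cases only weights leave the device, never the local data.

```python
from typing import List, Sequence, Tuple

Weights = List[float]
Example = Tuple[Sequence[float], float]  # (features, target)


def local_update(weights: Weights, data: Sequence[Example], lr: float = 0.1) -> Weights:
    """One gradient step of a least-squares fit, computed on the device only."""
    grad = [0.0] * len(weights)
    for x, y in data:
        err = sum(w * xi for w, xi in zip(weights, x)) - y
        for i, xi in enumerate(x):
            grad[i] += 2.0 * err * xi / len(data)
    return [w - lr * g for w, g in zip(weights, grad)]


def federated_average(client_weights: Sequence[Weights],
                      client_sizes: Sequence[int]) -> Weights:
    """Centralized setting: a server averages client weights,
    weighting each client by its number of local examples."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def gossip_step(w_a: Weights, w_b: Weights) -> Tuple[Weights, Weights]:
    """Decentralized setting (illustrative): two peers exchange weights
    and both adopt the pairwise average, with no central server."""
    avg = [(a + b) / 2.0 for a, b in zip(w_a, w_b)]
    return avg, list(avg)


if __name__ == "__main__":
    # Two hypothetical devices, each holding its own private data.
    data_a = [([1.0], 2.0)]
    data_b = [([1.0], 4.0), ([2.0], 8.0), ([3.0], 12.0)]
    global_w = [0.0]
    updates = [local_update(global_w, data_a), local_update(global_w, data_b)]
    print(federated_average(updates, [len(data_a), len(data_b)]))
    print(gossip_step(updates[0], updates[1]))
```

The communication cost in this sketch is one weight vector per device per round, which is why model size appears directly in the trade-offs to be compared.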
Beyond the implementation and experimentation of federated learning and distributed collaborative learning approaches for automatic speech recognition, a study of the nature of the information conveyed during these exchanges (paralinguistic, phonetic, lexical) will also be conducted to assess the level of privacy protection provided by the proposed approaches. More precisely, the aim is to determine whether, and with what measurable accuracy, the speaker, the phonemes, or even the words uttered can be recognized by analyzing the values of the exchanged updates.
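To make concrete why exchanged updates can carry private information, here is a deliberately simple sketch. It assumes a linear model with a bias term, a squared-error loss, and an update computed from a single example; these choices and the function names are illustrative assumptions, not the analysis planned in the project. Under these assumptions the weight gradient is the input scaled by the bias gradient, so an observer of the update can recover the input exactly.

```python
from typing import List, Sequence, Tuple


def single_example_gradient(w: Sequence[float], b: float,
                            x: Sequence[float], y: float) -> Tuple[List[float], float]:
    """Gradient of the squared error (w.x + b - y)^2 for one example."""
    err = sum(wi * xi for wi, xi in zip(w, x)) + b - y
    grad_w = [2.0 * err * xi for xi in x]  # d/dw = 2 * err * x
    grad_b = 2.0 * err                     # d/db = 2 * err
    return grad_w, grad_b


def reconstruct_input(grad_w: Sequence[float], grad_b: float) -> List[float]:
    """Recover x from the shared gradients, since grad_w = grad_b * x."""
    if grad_b == 0.0:
        raise ValueError("zero error: the gradient carries no information about x")
    return [g / grad_b for g in grad_w]


if __name__ == "__main__":
    # Hypothetical private feature vector held on the device.
    private_x = [0.5, -1.0, 2.0]
    grad_w, grad_b = single_example_gradient([0.0, 0.0, 0.0], 0.0, private_x, 1.0)
    print(reconstruct_input(grad_w, grad_b))
```

Real acoustic models are deep networks trained on batches, so leakage there is less direct and has to be quantified empirically, which is the purpose of the study described above.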
We will also investigate how to locally adapt the acoustic models to the voice and speech characteristics of the user in order to obtain personalized models. The amount of data and the adaptation time required will be studied.
More generally, the thesis will keep a critical eye on the advantages and disadvantages of collaborative machine learning in a real application setting [Bhowmick et al. 2018; Hitaj et al. 2018]. In view of the results of these experiments and analyses, the purpose of the thesis will be to propose solutions that overcome these disadvantages, and an effective framework that combines privacy protection with improved acoustic modeling for speech recognition in a distributed deployment context.
Technical skills required:
• Master’s degree in machine learning or in computer science
• Background in deep learning and in statistics
• Experience with deep learning tools is a plus (preferably PyTorch)
• Good programming skills (preferably in Python)
• Experience in speech and/or speaker recognition is a plus
This PhD thesis fits within the scope of a collaborative project (project DEEP-PRIVACY, funded by the French National Research Agency) involving the LIA (Laboratoire Informatique d’Avignon), the MAGNET team of Inria Lille – Nord Europe, the LIUM (Laboratoire d’Informatique de l’Université du Mans) and the MULTISPEECH team of Inria Nancy – Grand-Est.
This PhD position is in collaboration with Avignon University, and will be co-supervised by Yannick Estève (https://cv.archives-ouvertes.
Knowledge of the French language is not required.
Three-year work contract, with a net salary of approximately 1685€/month, financial support for international research training and conference participation, and a contribution to research costs.
Duration: 3 years
Doctoral School: Sciences et Agrosciences (Avignon University)
[Bellet et al. 2017] Bellet, A., Guerraoui, R., Taziki, M., & Tommasi, M. (2017). Personalized and private peer-to-peer machine learning. arXiv preprint arXiv:1705.08435
[Bhowmick et al. 2018] A. Bhowmick, J. Duchi, J. Freudiger, G. Kapoor, and R. Rogers. Protection Against Reconstruction and Its Applications in Private Federated Learning. arXiv preprint https://arxiv.org/abs/1812.009
[Hitaj et al. 2018] B. Hitaj, G. Ateniese, F. Perez-Cruz. Deep Models Under the GAN: Information Leakage from Collaborative Deep Learning. arXiv preprint https://arxiv.org/abs/1702.074
[Konečný et al. 2016] J. Konečný, H. B. McMahan, F. X. Yu, P. Richtárik, A. T. Suresh, & D. Bacon. (2016). Federated learning: Strategies for improving communication efficiency. arXiv preprint https://arxiv.org/abs/1610.054
[Leroy et al. 2018] D. Leroy, A. Coucke, T. Lavril, T. Gisselbrecht, J. Dureau. Federated Learning for Keyword Spotting. arXiv preprint https://arxiv.org/abs/1810.055