BOFFIN TTS: Few-Shot Speaker Adaptation by Bayesian Optimization

Moss, Henry and Aggarwal, Vatsal and Prateek, Nishant and González, Javier and Barra-Chicote, Roberto (2020) BOFFIN TTS: Few-Shot Speaker Adaptation by Bayesian Optimization. In: ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, pp. 7639-7643. ISBN 9781509066322

Full text: boffin.pdf (Accepted Version, 629kB), available under License Creative Commons Attribution-NonCommercial.

Abstract

We present BOFFIN TTS (Bayesian Optimization For FIne-tuning Neural Text To Speech), a novel approach for few-shot speaker adaptation. Here, the task is to fine-tune a pre-trained TTS model to mimic a new speaker using a small corpus of target utterances. We demonstrate that there does not exist a one-size-fits-all adaptation strategy, with convincing synthesis requiring a corpus-specific configuration of the hyper-parameters that control fine-tuning. By using Bayesian optimization to efficiently optimize these hyper-parameter values for a target speaker, we are able to perform adaptation with an average 30% improvement in speaker similarity over standard techniques. Results indicate, across multiple corpora, that BOFFIN TTS can learn to synthesize new speakers using less than ten minutes of audio, achieving the same naturalness as produced for the speakers used to train the base model.
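The idea of tuning fine-tuning hyper-parameters with Bayesian optimization can be illustrated with a minimal sketch. This is not the authors' implementation: this page does not state the search space, the objective, or the tooling used in the paper. The example below assumes a single hyper-parameter (a log learning rate), a hypothetical `toy_validation_loss` standing in for "fine-tune the model, then score it on held-out target audio", and a generic Gaussian-process surrogate with an expected-improvement acquisition function.

```python
import math
import numpy as np

def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two 1-D arrays of points."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_new, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    K = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    K_s = rbf_kernel(x_obs, x_new)
    mean = K_s.T @ np.linalg.solve(K, y_obs)
    var = 1.0 - np.sum(K_s * np.linalg.solve(K, K_s), axis=0)
    return mean, np.sqrt(np.clip(var, 1e-12, None))

def expected_improvement(mean, std, best):
    """Expected improvement over the best observed value (minimisation)."""
    z = (best - mean) / std
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return (best - mean) * cdf + std * pdf

def toy_validation_loss(log_lr):
    """Hypothetical stand-in for an expensive fine-tune-and-evaluate run."""
    return (log_lr + 3.0) ** 2 + 0.1 * np.sin(5.0 * log_lr)

def bayes_opt(objective, low, high, n_init=3, n_iter=10, seed=0):
    """Minimise `objective` on [low, high] with a GP surrogate and EI."""
    rng = np.random.default_rng(seed)
    x_obs = rng.uniform(low, high, size=n_init)
    y_obs = np.array([objective(x) for x in x_obs])
    grid = np.linspace(low, high, 200)

    def scale(x):
        # Map the search interval to [0, 1] so the kernel length scale is generic.
        return (x - low) / (high - low)

    for _ in range(n_iter):
        # Standardise observations so the unit-variance GP prior is reasonable.
        y_std = (y_obs - y_obs.mean()) / (y_obs.std() + 1e-9)
        mean, std = gp_posterior(scale(x_obs), y_std, scale(grid))
        ei = expected_improvement(mean, std, y_std.min())
        x_next = grid[np.argmax(ei)]  # most promising candidate on the grid
        x_obs = np.append(x_obs, x_next)
        y_obs = np.append(y_obs, objective(x_next))

    best = np.argmin(y_obs)
    return x_obs[best], y_obs[best], x_obs, y_obs

best_x, best_y, xs, ys = bayes_opt(toy_validation_loss, low=-5.0, high=-1.0)
print(f"best log learning rate: {best_x:.3f}, loss: {best_y:.4f}")
```

In a real adaptation setting each call to the objective would be a full fine-tuning run, which is why a sample-efficient search of this kind is attractive; the kernel, grid search over candidates, and evaluation budget here are illustrative choices only.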

Item Type: Contribution in Book/Report/Proceedings
Additional Information: ©2020 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Departments: Faculty of Science and Technology > Mathematics and Statistics
ID Code: 145409
Deposited By: ep_importer_pure
Deposited On: 29 Oct 2020 16:15
Refereed?: Yes
Published?: Published
Last Modified: 10 Dec 2025 19:17
URI: https://eprints.lancs.ac.uk/id/eprint/145409
