Using speech synthesis to explain automatic speaker recognition : a new application of synthetic speech

Brown, Georgina and Kirchhübel, Christin and Cuthbert, Ramiz (2023) Using speech synthesis to explain automatic speaker recognition : a new application of synthetic speech. In: Interspeech 2023, 2023-08-20 - 2023-08-24, Convention Centre.

Full text not available from this repository.

Abstract

Some speech synthesis systems make use of zero-shot adaptation to generate speech based on a target speaker. These systems produce speaker embeddings in the same way that speaker embeddings (often called 'x-vectors') are produced in automatic speaker recognition systems. This commonality between the two technologies could lower barriers that constrain the use of automatic speaker recognition systems in forensic speech analysis casework. A key barrier to the use of automatic speaker recognition in the forensic context is the issue of explainability, including what information about the voice a system uses in order to arrive at conclusions. This paper sets out a new approach that could be used to effectively communicate this type of information to audiences in the legal setting. Specifically, it is proposed that exposing listeners to synthetic speech produced by a zero-shot adaptation system could illustrate what aspects of the voice an automatic speaker recognition system captures.
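
As a rough illustration of the speaker embeddings the abstract refers to, the sketch below compares fixed-length embedding vectors with cosine similarity. This is an assumption made for illustration only: the abstract does not state how the paper's systems score embeddings, and the function name and toy three-dimensional vectors are hypothetical (real x-vectors typically have several hundred dimensions and come from a trained neural network).

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimensionality")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings; not taken from the paper or any real system.
questioned = [0.9, 0.1, 0.3]     # embedding of a questioned recording
known_same = [0.8, 0.2, 0.25]    # embedding of a recording from a similar voice
known_other = [-0.2, 0.9, 0.1]   # embedding of a recording from a different voice

score_same = cosine_similarity(questioned, known_same)
score_other = cosine_similarity(questioned, known_other)

# A higher score indicates that the two embeddings point in a more similar direction.
print(f"similar voice:   {score_same:.3f}")
print(f"different voice: {score_other:.3f}")
```

In a zero-shot synthesis system, an embedding of the same kind would instead be passed to the synthesiser as conditioning input, which is the commonality the abstract proposes to exploit for explanation.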

Item Type:

Contribution to Conference (Paper)

Journal or Publication Title:

Interspeech 2023

Uncontrolled Keywords:

Research Output Funding/no_not_funded

Subjects:

No - not funded

Departments:

Faculty of Arts & Social Sciences > Linguistics & English Language

ID Code:

205297

Deposited By:

ep_importer_pure

Deposited On:

25 Oct 2023 14:00

Refereed?:

Yes

Published?:

Published

Last Modified:

15 Jul 2024 08:51

URI:

https://eprints.lancs.ac.uk/id/eprint/205297