IRCAM 2021 - ViaDialog - David Guennec
Sept. 10, 2021
Talk title:
Towards helpful, customer-specific Text-To-Speech synthesis
Abstract:
Automatic speech synthesis began to reach the general public in the 1990s. We have all had to deal with automated answering-system voices that were, at first, painful to listen to. Today, however, advances in both language understanding and the acoustic quality of speech synthesis have enabled giant leaps forward. New voice services are rapidly improving in quality and capability, with voices that are ever more human and expressive.
In this presentation, we will briefly review recent advances in speech synthesis. After this introduction, we will discuss how to customize text-to-speech voices to meet customer needs on several levels. First, we will consider the main components of oral expression, such as language, speech style, language register and gender. Second, we will look at issues relating to the utterance itself, which are mainly prosodic (pitch manipulation, speech rate). Finally, we will conclude with a discussion of the secondary elements that need to be taken into account to best meet the needs of customers and end-users of synthesized voice in our constantly changing world.
Speaker information:
Name: David GUENNEC
Mini bio: A computer scientist with a passion for the history of sound reproduction, David Guennec specializes in new voice technologies. After a PhD in speech synthesis, he turned to building voice assistants that integrate the entire voice processing chain, from speech recognition through natural language understanding to synthesis. He currently works at ViaDialog, where he focuses on speech synthesis and recognition.