Unveiling conversational voice design

AI voices are now part of our everyday lives, but how are they designed? Mozilla Festival’s event ‘A Primer on Conversational Voice Design’ gave us the lowdown. 

We hear and interact with AI voices on our phones, in elevators, and at train or bus stops, and because these interactions are so commonplace we take the voices for granted. Eight billion digital voice assistants could be in use by 2023, and the Covid-19 crisis has increased the need for touchless devices with voice assistants. Once a tech gimmick of an imagined future, voices such as Siri and Alexa are now our close and invisible companions.

But the work that goes into these voices is far from effortless. Voice design requires an understanding of dialogue and speech, so professional voice designers often come from screenwriting or journalistic backgrounds, and it is a growing career field. Sentence structure and the interplay of questions and answers are things voice designers must painstakingly analyse when building voice assistants. If the user gives an assistant an order such as ‘Hey Google, book conference room 3 for Monday’, the AI needs to repeat the key words and give the user a satisfying, confirmatory answer in one go: ‘Conference room 3 is booked for Monday’.
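The confirmation pattern in that booking example can be sketched in a few lines of Python. This is only an illustration of echoing the user's key words back in a single answer, not how Google Assistant works: the pattern `BOOKING_PATTERN` and the function `confirm_booking` are hypothetical names invented here, and a real assistant would rely on a trained language model rather than a regular expression.

```python
import re

# Hypothetical pattern covering one kind of request. A real assistant would
# use a trained language-understanding model, not a regular expression.
BOOKING_PATTERN = re.compile(
    r"book (?P<room>conference room \d+) for (?P<day>\w+)",
    re.IGNORECASE,
)


def confirm_booking(utterance: str) -> str:
    """Return a one-sentence confirmation that echoes the user's key words."""
    match = BOOKING_PATTERN.search(utterance)
    if match is None:
        # No room or day recognised, so ask instead of guessing.
        return "Sorry, which room and day would you like to book?"
    room = match.group("room")
    day = match.group("day")
    # Capitalise the first letter so the echoed phrase can open the sentence.
    room = room[0].upper() + room[1:]
    return f"{room} is booked for {day}."


print(confirm_booking("Hey Google, Book conference room 3 for Monday"))
```

Repeating the room and the day in the reply lets the user check, without a second exchange, that the assistant heard the request correctly.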

Voice assistants work best for immediate requests such as setting reminders, asking for translations or turning on lights. The Mozilla audience were shown a video of a paralysed man whose quality of life improved after Google Home devices were installed, allowing him to complete tasks independently using just his voice. In this sense, voice assistants can be a huge advantage in people’s lives.

Where AI voices fall short, however, is with complex requests that ask for advice or comparison, like ‘Which is the best Thai restaurant near me?’. They also struggle with words that carry multiple intents or meanings, such as Apple the brand and apple the fruit. Designers must train the AI to ask for specificity and clarification on word choices so that the dialogue has a clear direction, or users will get frustrated. There is also an accessibility gap for non-native English speakers, whose overall accuracy rate in using Alexa is 80%. You can find more information on this here.
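As a rough sketch of what asking for clarification can look like, the Python below checks a request for a word with more than one meaning and, if it finds one, returns a follow-up question. The lookup table `AMBIGUOUS_TERMS` and the function `clarification_prompt` are hypothetical and exist only for this example; real assistants resolve ambiguity with trained models and context, not a hand-written dictionary.

```python
from typing import Optional

# Hypothetical table of words that carry more than one meaning.
AMBIGUOUS_TERMS = {
    "apple": ["Apple the brand", "apple the fruit"],
}


def clarification_prompt(utterance: str) -> Optional[str]:
    """Return a follow-up question if the request contains an ambiguous word."""
    for raw_word in utterance.lower().split():
        # Strip surrounding punctuation so "apple?" still matches "apple".
        word = raw_word.strip(".,?!")
        senses = AMBIGUOUS_TERMS.get(word)
        if senses:
            return f"Do you mean {' or '.join(senses)}?"
    # Nothing ambiguous was found, so no clarification is needed.
    return None


print(clarification_prompt("Tell me about apple"))
```

Asking one short question at the point of ambiguity keeps the dialogue moving in a clear direction instead of returning an answer about the wrong thing.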

The boundary between AI and humanity has long been a favourite trope of science fiction, and this is another matter that voice designers must carefully consider. We don’t want our voice assistants to sound too human, as this is uncanny, but they must still understand our needs. The principles of voice design cover personality (the assistant has character and attributes that make you want to trust it), vocality (Siri pinging when you hit the button), and predictability (it learns from your previous requests).

Questions have also been raised about why voice assistants are often female. The creator of Siri, Dag Kittlaus, is said to have named the voice after a female colleague, and the name itself means ‘beautiful woman who will lead you to victory’ in Norwegian. Surely it is no coincidence that a technology created to be subservient has been established as female, a choice not helped by the fact that only 22% of people building AI are women. That said, a genderless voice is difficult to produce: it has been attempted, but hasn’t taken off in the same way as Siri or Alexa’s compliant cadences.

It is important to understand anything that functions as a significant part of your everyday life, and voice assistants are no exception. As ever, the technology is impressive but still has room for improvement in terms of its biases and gaps in accessibility, and only humans can be blamed for this. 

Header Image Credit: Photo by Omid Armin on Unsplash

Author

Claire Jenns

Claire Jenns Kickstart Team

English Literature graduate, loves reading, writing and travel.
