
In July 2019, Voicebot, along with Pulse Labs and Voices, brought together 240 people to understand their preferences for voice assistant voices, evaluate different voice user experiences (voice UX), and see how people use voice apps.

User experience always has subjective elements, so it is not an exact science, and the number of people in this study is fairly small.

Some of the findings were quite predictable, while others are very interesting and may come in handy for anyone who has already entered the voice market or is thinking of doing so.

Understanding more about what users of this new technology prefer can be a great advantage in starting off on the right foot.

Do voice technology users prefer human or synthetic voices? Do they prefer male or female voices? When does content get too long? How much does people’s age affect these preferences?

Let's find out right away from the report data.

Length of content and type of voice apps: here’s what to choose

Human voice or synthetic voice?

It is no surprise that the human voice received a rating almost 72% higher than the synthetic one.

The respondent's age has a greater influence on this preference than the respondent's gender.

For the 18-29 age group, the human voice was rated 80.1% higher; for ages 30-59, 66% higher; and for those aged 60+, 91.4% higher.

As for gender, the data was closer: 72.9% for women and 69.6% for men.

In addition to the distinction between human and synthetic voices, the researchers wanted to understand whether the gender of the voice affected user sentiment.

For the synthetic voice, there were practically no differences between the female and male versions.

The recurring adjectives were: robotic, monotonous, boring.

For human voices, too, reactions to the male and female versions were similar.

The most frequent adjective was enthusiastic for the male voice and friendly for the female one.

Energetic, pleasant, natural, professional, and exciting were some of the other recurring words.

In general, therefore, those who listened to human voices had much more positive feelings: the preference expressed in the earlier ratings was confirmed by people's actual reactions.

Male voice or female voice?

Okay, we’ve found that we prefer to hear a human voice, but what about gender?

Do people prefer a male or female voice?

The finding here is interesting: for human voices there is a slight advantage for the male voice (2.3%), but for synthetic voices there is more tolerance for the female one (a 12.5% advantage over the male one).

This is also confirmed when the study participants' responses are broken down by gender and by age group.

This is a subject that is talked about a lot.

Amazon said it chose a female voice for Alexa because it was the preferred choice for early technology testers.

The data from this research seem to confirm this.

We can assume that, because Siri has had a female voice for years and Alexa also has a female voice, we have grown used to it, and that this influences the answers.

Long content or short content?

Do smart speaker owners prefer long or short content and dialogue? How does their assessment of the experience change with length?

To find an answer, participants were divided into four groups.

Each group listened to a different combination of content consisting of an introduction (short or long) followed by a list (long or short).

81.8% rated the content from the long/long combination too long.

Among those who listened to the short/short combination, only 18.2% rated it too short.

In general, each combination involving a shorter part was judged to be the right length by between 50 and 56% of people on average.

However, if you want to include a long part in the dialogue, it is far better to put it in the introduction than to have a long list of items.

In terms of overall skill rating, the group that heard short content (short/short combination) gave a rating 16% higher than the long/long group.

An interesting fact about tolerance for dialogue length emerges from the difference between synthetic and human voices.

Long content delivered by a human voice received a similar rating to short content delivered by a synthetic voice.

Finally, users' call-to-action recall was evaluated, and here there was no contest.

Long content with a human voice was far more effective (32.5%) than content with synthetic voices (12.1% for short content and 14.3% for long content).

Conclusions on Voice Apps

Many more tests and experiments would be needed to draw definitive conclusions.

Based on what we know so far, however, you can already adapt your voice strategy to increase the chances that your users have a positive experience.

Content with a human voice and short dialogues are what you should focus on.

If using a human voice is not feasible for you right now, at least try to keep the dialogues rather short.

Successful voice apps depend on many elements, and putting them all together is not easy.

IPERVOX is ready to help you on your journey in the world of voice.

In the meantime you can read other guides that can help you out.

To find out how to create a good voice design for your voice apps, click here; to learn how to improve the reviews of your Alexa skill, click here; or to get some advice on how to promote your Alexa skill, click here.
