
There are many elements that make a voice app extremely good or extremely bad.

But probably the most important of all is Voice Design.

What is it?

Keep reading to find out.

The importance of a good Voice Design

In web or mobile applications (visual applications in general), users usually find themselves in front of an interface with text, links, buttons and various cues on what to do and how to move to accomplish a task.

That can’t happen with voice-only applications (those with no visual feedback to accompany the audio).

The risk is leaving users disoriented and frustrated because they don’t know how to proceed or what to do.

Conversations can go in different directions and take different paths depending on the user who takes part in them.

That’s why, in a voice-based project, it is very important to create a good Voice Design, that is, to model the interactions on ordinary conversations so that they feel as natural as possible.

To do this we must:

  • Write a sample conversation on paper (scripting) that the user might have with the voice app, and read it aloud with another person, over and over, to check that it makes sense and flows
  • Start with a simple, ideal script that begins with the request and ends with the completion of the required action, and in which the conversation between the user and the voice app is completely predictable
  • Find and test the different directions the conversation can take, adding nuances, variations, etc.
  • Finally, structure an optimal Voice Design that makes people’s lives easier and responds quickly to what the user asks for, whichever path the interaction takes

This scripting work is what lets your voice app make a leap in quality.

Now let’s go deeper into the principles of Voice Design.
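Once a prototype exists, the paper scripts can also be replayed against it automatically. The sketch below is only an illustration of that idea: `toy_app`, `replay`, and the timer dialogue are hypothetical names and content, not part of any real voice platform. It checks an ideal script and one variation against a stand-in app.

```python
from typing import Callable, List, Tuple

# A script is a list of (user utterance, expected app reply) turns.
Script = List[Tuple[str, str]]


def toy_app(utterance: str) -> str:
    """Hypothetical stand-in for the real voice app."""
    text = utterance.lower()
    if "timer" in text and "minute" not in text:
        # The user asked for a timer but gave no duration yet.
        return "For how long?"
    if "minute" in text:
        return "Timer set."
    return "Sorry, I didn't get that."


def replay(script: Script, app: Callable[[str], str]) -> List[str]:
    """Return a description of every turn where the app departs from the script."""
    problems = []
    for turn, (said, expected) in enumerate(script, start=1):
        actual = app(said)
        if actual != expected:
            problems.append(f"turn {turn}: expected {expected!r}, got {actual!r}")
    return problems


# The simple, ideal script: request, follow-up question, completion.
HAPPY_PATH: Script = [("Set a timer", "For how long?"), ("Ten minutes", "Timer set.")]
# A variation: the user gives everything in one sentence.
VARIATION: Script = [("Set a timer for ten minutes", "Timer set.")]

if __name__ == "__main__":
    for name, script in (("happy path", HAPPY_PATH), ("variation", VARIATION)):
        print(name, replay(script, toy_app) or "ok")
```

A replay like this does not replace reading the script aloud with a person. It only catches regressions once the wording has been agreed on.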

Principles of Voice Design to create a great voice app

Voice apps must offer a conversational experience to the user and not the classic command → action experience.

They basically have to simulate conversations between people.

To do this, every interaction must always refer to the shared context: at any given moment, both the human and the machine have to know what they are talking about.

For example, if Alexa is playing audio content, both the user and Alexa know this, so if the user says “Alexa, next”, both parties have enough information to understand that it’s time to switch to the next song.
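The “next” example can be sketched in a few lines. This is a minimal illustration under assumed names (`SharedContext`, `handle_next`), not how Alexa is actually implemented: the same one-word utterance gets a different answer depending on what the shared context contains.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SharedContext:
    """What both the user and the voice app know right now (hypothetical model)."""
    now_playing: Optional[str] = None
    queue: List[str] = field(default_factory=list)


def handle_next(context: SharedContext) -> str:
    """Interpret the bare utterance "next" using the shared context."""
    if context.now_playing is None:
        # Without shared context, "next" is ambiguous, so ask instead of guessing.
        return "Next what? Nothing is playing right now."
    if not context.queue:
        return f"{context.now_playing} was the last track in the queue."
    # Both sides know music is playing, so "next" means the next track.
    context.now_playing = context.queue.pop(0)
    return f"Playing {context.now_playing}."


if __name__ == "__main__":
    ctx = SharedContext(now_playing="Song A", queue=["Song B"])
    print(handle_next(ctx))
    print(handle_next(SharedContext()))
```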

Conversational partners such as voice apps must have the ability to:

  1. build meaning from a shared context.
  2. evolve the data they operate on.
  3. converge on a deeper and deeper understanding of the other party.
  4. remember new information beyond a single conversation session.

The life cycle of a conversation between human and voice app is divided into two main situations:

  1. First approach, that is, the first time the user opens the voice app (an excellent opportunity for the app to introduce itself and collect the first information it needs to offer a personalized experience in the future).
  2. Return use, that is, every later time the user comes back to the voice app. Here the interaction must be more immediate thanks to the information collected in the past, so that the user feels a relationship with the voice app has formed and is encouraged to come back again.
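The two situations can be told apart with a stored user profile. The following is a sketch under stated assumptions: `PROFILES` is an in-memory stand-in for whatever persistent storage the platform offers, and `open_app`, `remember_topic`, and the news scenario are hypothetical.

```python
from typing import Dict

# In-memory stand-in for persistent storage (assumption); a real voice app
# would keep this in a database keyed by the platform's user identifier.
PROFILES: Dict[str, Dict[str, str]] = {}


def open_app(user_id: str) -> str:
    """Choose the opening line based on whether the user has been here before."""
    profile = PROFILES.get(user_id)
    if profile is None:
        # First approach: introduce the app and start collecting preferences.
        PROFILES[user_id] = {}
        return "Welcome! I can read you the news. What topic do you like most?"
    topic = profile.get("favorite_topic")
    if topic:
        # Return use: skip the introduction and use what was learned earlier.
        return f"Welcome back! Here is the latest {topic} news."
    return "Welcome back! What topic would you like today?"


def remember_topic(user_id: str, topic: str) -> None:
    """Store a preference so that later sessions can be more immediate."""
    PROFILES.setdefault(user_id, {})["favorite_topic"] = topic


if __name__ == "__main__":
    print(open_app("user-1"))          # first approach
    remember_topic("user-1", "sports")
    print(open_app("user-1"))          # return use
```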

4 must-have conversational skills for a good voice app:

  1. Be personal, giving the user the ability to move back and forth in a dialogue in the most natural way possible, adapting to the changing context and to the user’s preferred way of speaking.
  2. Be adaptable, solving a problem for the user while being careful not to create others. In a word: simplify.
  3. Be reliable, responding with information in the context the user asked about, and confirming what was understood only when strictly necessary.
  4. Be available, remembering and leveraging past information and interactions related to the specific context to deliver better and faster responses.

4 questions to ask yourself during the conception phase of the voice app

  1. Does the idea create a conversation and cooperate with the user?
  2. Is the voice app actually solving a problem? Does it make the user’s life easier?
  3. Does the voice app reduce the friction users face in consuming and understanding information?
  4. Does the voice app remember past information and interactions in order to adapt and improve the user experience?

Identify your users

You create the voice app to provide a service to people, so the voice app needs to be thought out and designed for them.

You need to understand who these people are, what situation they are in when they activate the voice app, and what preferences they have.

People’s desires will become the intents of the voice app. To identify who will use the voice app, ask yourself these 4 questions for each user type:

  • Who are the people who will use my voice app? Background, interests, motivation, etc.
  • What do they expect from the voice app? Note: Many people don’t know what they want when they interact with a voice app, but they still have expectations.
  • When and where are they most likely to use the voice app? Home, work, car, morning, afternoon, dinner, etc.
  • How do they interact with the voice app? Do they use a dialect, everyday language, etc.?

You do not need to identify every potential user of your voice app in every possible situation. Describing 3-4 typical people or situations is enough.

By answering these questions you will get a guide both for defining the interaction script with the voice assistant and for optimizing it through testing.

So remember:

Do not create unnatural voice apps that follow a flowchart, like a phone menu. You are building conversational voice apps, whose features operate independently of one another and adapt to the situation.

Users must be able to access, connect, jump, stop and ask in whatever way and combination is most natural for them.

Imagine that every situation (question-answer) is a card in a deck of cards. Now spread the deck of cards on the table in front of you.

The power of designing voice experiences this way is that you can take any set of cards, combine them in any order and offer a different experience every time.
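In code, the deck-of-cards idea roughly corresponds to independent handlers that share a session instead of one fixed flowchart. The sketch below is an illustration only: the `DECK` registry, the `card` decorator, and the table-booking scenario are all hypothetical. Each card works on its own, and the user can play the cards in any order.

```python
from typing import Callable, Dict

Session = Dict[str, str]
Card = Callable[[Session, str], str]

# The deck: every card is registered by name and knows nothing about the others.
DECK: Dict[str, Card] = {}


def card(name: str) -> Callable[[Card], Card]:
    """Decorator that adds a handler to the deck under the given name."""
    def register(handler: Card) -> Card:
        DECK[name] = handler
        return handler
    return register


@card("set_city")
def set_city(session: Session, value: str) -> str:
    session["city"] = value
    return f"Got it, {value}."


@card("set_date")
def set_date(session: Session, value: str) -> str:
    session["date"] = value
    return f"Okay, {value}."


@card("book")
def book(session: Session, value: str) -> str:
    # This card only checks what the session already holds; it does not care
    # in which order the other cards were played.
    missing = [slot for slot in ("city", "date") if slot not in session]
    if missing:
        return f"I still need the {' and '.join(missing)}."
    return f"Booked a table in {session['city']} for {session['date']}."


def play(session: Session, card_name: str, value: str = "") -> str:
    """Play one card; unknown requests get a graceful fallback."""
    handler = DECK.get(card_name)
    if handler is None:
        return "Sorry, I can't help with that yet."
    return handler(session, value)


if __name__ == "__main__":
    session: Session = {}
    for name, value in (("book", ""), ("set_date", "Friday"),
                        ("set_city", "Rome"), ("book", "")):
        print(play(session, name, value))
```

Because no card assumes a particular predecessor, the user can give the date before the city, or ask to book first and fill in the details afterwards.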


I hope that after reading this article you have a better idea of how to take care of the Voice Design of your voice app.

It’s good to know the basics when you have to create your voice app, but doing it completely on your own may not be the best solution.

It’s not that easy to become a very good Voice Designer and that’s why there are just a few in the world.

In addition to an experienced developer, a Voice Designer is what it takes to create a very good and effective voice app.

But these two professionals can cost you around $20,000, and it can take 3 months of work to see your voice app published.

These entry barriers were way too high and many content creators and businesses couldn’t afford a voice app.

That’s why IPERVOX was born. We have compressed hours of development into just 3 simple steps that anyone can do.

IPERVOX is the easiest way to create a voice app, and you can try it for free by clicking “Start for free” below.

I’ll see you inside.

CEO and Founder of IPERVOX
