Arun George is a voice user experience designer and UX consultant. He has worked with Fortune 500 companies to help improve their speech applications by making them more user-centered and conversational. In this guest post, Arun talks about improving communication with voice user experience design, from blueprints to prototyping, and the importance of VUI to user-centered design.
Voice is the most natural way of communicating, so it’s no wonder that the future of User Interfaces will revolve around voice interaction. I have worked with Fortune 500 companies to help improve their speech applications and I want to share my take on how to create a world-class experience for your users.
What is a Voice User Interface?
A Voice User Interface, or VUI (pronounced “voo-e”), is simply an application that the user interacts with by speaking. Most of us know voice interfaces from automated phone systems. Sadly, a large number of phone systems have a very badly designed interface. That’s because the developers who design these systems don’t understand how to design for voice experience.
Whether you’re building a speech application or a chatbot, it’s important to understand the persona you’re trying to create. Your voice system’s persona, or personality, is an extension of your brand’s image and plays an important role.
A great vocal persona is not just about having a pretty voice; it’s about connecting with the user on the other end. When we hear a voice, we unconsciously make a lot of assumptions about that person. These assumptions include how intelligent that person might be or which region or country they’re from. Based on the book Voice User Interface Design and my personal experience working on voice, here are some things you should think about:
- Role: What’s the role of the application? Is it an assistant helping get things done? Is it a stock advisor? Bank clerk?
- Company Brand: The persona you select should represent your brand’s image.
- Target Audience: A good persona should be familiar to your users. For a compelling persona, we need to consider demographics, attitudes, lifestyle of the user, etc.
Designing the Blueprint
While it’s very tempting to design your voice experience using a flow-chart, it’s not the best way to achieve a great experience. Although it’s important to have a flow-chart explaining the workflow or process, the conversational design prompts shouldn’t be focused on that. Remember that speech happens in context and is not based on logic. If you design your voice interface just on call-flows or flow-charts, then you are in a very poor position.
For example, in a conversation, we always go forward and never backward. How many automated systems have you heard with prompts such as, ‘If you want to return to the main menu, press 1’? The key ingredient in making a design naturally conversational is understanding linguistics. In a conversational design interface, the focus is on building an interaction based on the way people speak, not how they write. A lot of voice interfaces are written in the way we write rather than the way we speak.
Designing a VUI dialog
So how do we get started? In voice interfaces, we don’t have wireframes: we have sample dialogs. But before we get to those, let’s start with user stories. Start by describing scenarios that your users will find useful. Identify the major scenarios to describe the purpose of your interface and the ways your users can interact with it.
Next, start crafting a sample dialog for your scenarios. A sample dialog is a script that shows a conversational flow between the system and the user. You can think of this as the VUI version of a wireframing process. A sample dialog can be something you show your client to get feedback before you start prototyping.
Here’s a sample dialog:
System: Thanks for calling BankOMatic. I looked up your number and found an account. I see you have a payment that’s due today. Are you calling about that?
System: OK. For security, tell me the last digits of your social security number.
System: Thanks. Let me look it up.
Found it. You have a payment of $40.98, that’s due today. Would you like to make a payment now? Just say ‘yes’ or ‘no’.
System: All right, I see you have a VISA card ending in 789 on file, would you like me to use that?
Caller: Yes, please.
System: Just a second… All done! I’ve put that through. Just so you know, it might take up to 48 hours for that payment to post to your account. Now, if that’s all you needed, feel free to hang up and thanks for using BankOMatic. For help with anything else, just say ‘help me with something else’.
The dialog above follows a conversational design approach. In a conversation, we tend to use contractions (e.g., ‘You’re’, ‘I’m’). We do this unconsciously, so it’s important to be aware of it: the way we speak is different from the way we write, so please do keep that in mind when designing your VUIs.
Another idea to keep in mind is what’s called a discourse marker. A discourse marker is a word or phrase that you can use to connect and arrange what you say or write (e.g., ‘anyway’, ‘now’, etc.). In the dialog above, I have used the word ‘Now’ to make an obvious transition from one idea to another.
Documenting and Prototyping a VUI
Prototyping for voice can be tricky, since not a lot of voice prototyping tools are readily available. Some tools available for prototyping VUIs are Sayspring and API.AI. I can’t comment on any of these tools since I use internal tools to prototype and not external ones.
When it comes to documenting your designs, you should think of each interaction as a series of states. Each state should have:
- State Name – The unique state name you can reference back to.
- State Type – Type of the state – e.g. transfer state, data look up state, etc.
- Initial Prompt – Your main initial prompt that your user will hear.
- Error Prompt(s) – What are the prompts you’re going to be playing if the user says something out of context?
- State Conditions – Conditions point to the next state on success (e.g. the user selects order status and the condition points to the next state, ‘OrderStatusPrompt’).
State Name: Welcome Alexa
State Type: Presentation
Initial Prompt: Welcome to Alexa Skills. What can I help you with?
Error Prompt: Sorry – I didn’t understand. You can tell me things like…
State Conditions: Order Status -> OrderStatusPrompt
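The same documentation can also be kept as data, which makes it easier to check and reuse while prototyping. Here is a minimal sketch, assuming a hypothetical `DialogState` class whose fields mirror the list above; the class and field names are my own and not part of any particular tool.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DialogState:
    """One documented state of a voice interaction (names are illustrative)."""
    name: str            # unique state name you can reference back to
    state_type: str      # e.g. "Presentation", "Transfer", "Data look up"
    initial_prompt: str  # main prompt the user hears first
    error_prompts: List[str] = field(default_factory=list)
    # Maps what the user selected to the name of the next state.
    conditions: Dict[str, str] = field(default_factory=dict)

    def next_state(self, selection: str) -> Optional[str]:
        """Return the next state's name, or None if nothing matches."""
        return self.conditions.get(selection)


# The example state documented above, expressed as data.
welcome = DialogState(
    name="Welcome Alexa",
    state_type="Presentation",
    initial_prompt="Welcome to Alexa Skills. What can I help you with?",
    error_prompts=["Sorry – I didn’t understand. You can tell me things like…"],
    conditions={"Order Status": "OrderStatusPrompt"},
)

if __name__ == "__main__":
    print(welcome.next_state("Order Status"))
```

When `next_state` returns `None`, the user said something out of context, which is the cue to play one of the error prompts.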
When it comes to designing error strategies, one powerful approach is to move from general to specific. Let’s have a look at this:
System: What’s your date of birth?
System: Just tell me your date of birth using 2 digits for the month, 2 digits for the day, and 4 for the year.
As you can see, the error strategy escalates from general to specific rather than giving all of the information right away. In other words, your errors should be progressive and helpful. Another approach is known as rapid re-prompt. In this approach, we don’t give away all of the details; instead, the system responds with something simple, such as, ‘I’m sorry?’
Here’s an example:
Bot: How can I help you?
Bot: I’m sorry?
The disadvantage, as you know, is that the user doesn’t get all of the information in detail at first. The best place to use this is during open-ended prompts, such as ‘How can I help you?’, where the user often doesn’t consider it an error.
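Both strategies come down to choosing a prompt based on how many times the user has not been understood. The sketch below is an assumption about how that might be wired up, not a description of any specific platform: `error_prompt` and `HELP_ERRORS` are hypothetical names, and pairing a rapid re-prompt with a more detailed follow-up is my own combination of the two prompts quoted in this article.

```python
from typing import List


def error_prompt(prompts: List[str], failed_attempts: int) -> str:
    """Pick the error prompt for the given number of failed attempts (1-based).

    Prompts are ordered from general to specific. Once the list runs out,
    the most specific prompt is repeated.
    """
    if not prompts:
        raise ValueError("at least one error prompt is required")
    if failed_attempts < 1:
        raise ValueError("failed_attempts must be at least 1")
    index = min(failed_attempts, len(prompts)) - 1
    return prompts[index]


# Open-ended question: rapid re-prompt first, more detail only if needed.
HELP_ERRORS = [
    "I’m sorry?",
    "Sorry – I didn’t understand. You can tell me things like…",
]

if __name__ == "__main__":
    for attempt in (1, 2, 3):
        print(attempt, error_prompt(HELP_ERRORS, attempt))
```

Keeping the prompts in an ordered list makes the escalation easy to review with a client alongside the sample dialogs.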
There are many ways you can design a great experience for voice. Remember that the end focus is on users like you and me. So, it’s important to understand not only the target users, but also the context in which the dialogs will appear.
In other words, study the way we speak and write. Study user-centered design processes and learn how to approach the challenge in a human way. It’s never the limitation of speech technology that’s responsible for a horrible voice experience. It’s usually designers not knowing how to apply these processes that results in a less-than-desirable voice interface. Hopefully this article will help you design for mere mortals.