From Voice-First Development by Ann Thyme-Gobbel
This article delves into capturing and documenting voice UI design.
Capturing and documenting VUI design
VUI designs are typically captured and documented in three ways, each with its own strengths and purposes – you should make use of all three:
- Dialog flows
- Sample dialogs
- Detailed design specification
Let’s look briefly at each one.
Some dialog flows are high-level overviews of the core intents and experience; others are highly detailed. Dialog flows are the best tool for everyone on the team to establish, for example, where data access takes place and the overall sequence of steps in multi-step dialogs with multiple slots. You want to settle on the overall logic flow before getting into the detailed design. When you add prompt wording, use a notation that signifies draft status. We often use all caps or angle brackets. Once your detailed design is done, you might update your flows with the prompt wording. Figure 1 shows a few close-to-real-world examples, scrubbed to protect the innocent.
Figure 1 VUI dialog flows take many shapes – literally
Sample dialogs are snapshots of representative voice interactions between users and your voice app. Each dialog should include clear descriptions of the context and condition for that dialog. They should illustrate both happy paths and error conditions. Above all, they should include both a written version (for convenience) and audio recordings (for relevance). They’re great tools for illustrating the intended style and features of any voice app. Sample dialogs and dialog flows are great tools for making user stories concrete. Figure 2 gives you a taste of what sample dialogs might look like.
Figure 2 Sample dialog examples. If multi-modal, include description of all modes.
Detailed design specifications
Your detailed VUI design needs to cover enough detail to make clear to developers what the intended behavior should be for every context and condition. This is obviously true for any design in any modality, but it’s probably less obvious how to do this for voice than for a visual interface where you include images of screens, exact measurements, and color codes. Detailed VUI design documentation should cover the following at a minimum:
- Every intent, with its name and description
- For each intent:
  - Archetype utterances, and any expected words
  - Slot names and values, required and optional
- Outcome and next step for every combination of context and conditions:
  - Context: user identity or category, user preferences, environment, previous user request and result (dialog path), current system activity, and so on
  - Conditions: behavior for each expected user request status (recognized and handled, recognized and not handled, not recognized, and so on)
- Prompts: reference labels and exact wording for each context and condition, including error handling, retries, randomization, and so on
- Logic and pseudocode describing behavior clearly and consistently
- Data needs: type, format, from where and saved to where
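To make the checklist above more concrete, here is a minimal sketch of how one intent's specification could be captured as structured data. Everything in it is an assumption made for illustration: the class names `Slot` and `IntentSpec`, the `transfer_funds` intent, the context and condition labels, and the prompt labels and wording are hypothetical, not a format prescribed by any platform or by this article.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Slot:
    """One slot of an intent (hypothetical structure)."""
    name: str
    values: list[str]          # expected values; empty list means open-ended
    required: bool = True


@dataclass
class IntentSpec:
    """One intent's entry in a detailed design spec (hypothetical structure)."""
    name: str
    description: str
    archetype_utterances: list[str]
    slots: list[Slot] = field(default_factory=list)
    # (context, condition) -> (prompt label, exact wording)
    prompts: dict[tuple[str, str], tuple[str, str]] = field(default_factory=dict)

    def missing_required_slots(self, filled: dict[str, str]) -> list[str]:
        """Names of required slots the user has not filled yet."""
        return [s.name for s in self.slots if s.required and s.name not in filled]

    def uncovered(self, contexts: list[str], conditions: list[str]) -> list[tuple[str, str]]:
        """Context/condition combinations that have no prompt specified."""
        return [(c, k) for c in contexts for k in conditions
                if (c, k) not in self.prompts]


# Illustrative data only; labels and wording are made up.
transfer = IntentSpec(
    name="transfer_funds",
    description="Move money between the user's own accounts",
    archetype_utterances=["transfer fifty dollars from checking to savings"],
    slots=[
        Slot("amount", []),
        Slot("from_account", ["checking", "savings"]),
        Slot("to_account", ["checking", "savings"]),
        Slot("memo", [], required=False),
    ],
    prompts={
        ("known_user", "recognized_and_handled"):
            ("transfer_confirm_01", "Okay, transferring that now."),
        ("known_user", "not_recognized"):
            ("transfer_retry_01", "Sorry, how much would you like to transfer?"),
    },
)

missing = transfer.missing_required_slots({"amount": "50"})
gaps = transfer.uncovered(
    ["known_user"],
    ["recognized_and_handled", "recognized_and_not_handled", "not_recognized"],
)
```

In this sketch, `missing` lists the required slots still to be collected and `gaps` lists the context and condition combinations that have no prompt yet, which is one way to spot holes in a spec before development starts.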
The tools you use to document your design obviously depend on where you work, who you work with, and what platform and voice service you’re using. Don’t use a documentation style or tool that ignores limitations on the design or implementation. Don’t use one that makes it difficult to capture what can be done either! You’ll probably use different approaches for multi-step dialog tasks versus one-step “one and done” requests.
VUI design documentation approaches
Let’s take a look at your most common options for documenting VUI designs:
Approach: Self-contained design tools
Description: Graphical software tools for VUI design. The goal is to minimize coding and attention to low-level details so that you can quickly create a functional VUI dialog that runs on Alexa, Google, or both. Examples: BotTalk, Invocable (was Storyline), PullString, SaySpring (now part of Adobe), Voice Apps, Voiceflow, and others.
Pros: Fast. Can lay out a design and turn it into a functional dialog. Great for concepts and simple contexts.
Cons: Typically limited in functionality and features. Many are not mature and are still changing. All emphasize speed of creation over flexibility and quality design. Some favor one platform over another. Some ramp-up time needed.
Take-home: Great for smaller projects and sample dialogs. Evolving – find what works for you and doesn’t limit you.
Approach: Platform-specific tools
Description: Tools provided for a specific platform or voice service. Examples: Google Dialogflow (was api.ai), and others.
Pros: Limited to what’s available through the platform or voice service, which automatically keeps you from designing out-of-scope features.
Cons: Limited to what’s available through the platform or voice service. Most are still evolving and can be unstable. Usually involves some coding, which can be a drawback for some designers. Some ramp-up time needed.
Take-home: If you know you’ll design for one of the common platforms, these tools are well worth looking at.
Approach: Proprietary in-house tools
Description: Highly flexible and powerful VUI design tools, often created by a large effort over a long time. Examples: Nuance Application Studio (NAS), and others.
Pros: Powerful and mature. Typically ties together every kind of design documentation, automatically updating changes in all relevant places to keep things synchronized. Can generate code and prompt lists, even a prototyping setup. Provides features corresponding to what’s available in the platform, and only those features.
Cons: Not accessible unless you’re in a relationship with the company that controls the tool. Auto-generated code isn’t optimized for production unless everything is streamlined, and it typically needs rework for larger, realistic, complex systems. Tied to a particular company’s platform rather than generalized. Ramp-up time can be steep.
Take-home: Top of the top – use them if you have access.
Approach: Standard documentation tools
Description: Standard office software, sometimes with added tailored scripts. Examples: text (Word), spreadsheets (Excel, Google Docs), flow (Visio, Balsamiq). Can hook to databases and generate prompt lists.
Pros: Completely flexible; can be tailored exactly to your needs. Not dependent on others changing feature availability. No training ramp-up time needed.
Cons: You have to do all the work. Any auto-generation is limited to what you create. You must make sure everyone referencing the documentation uses the same notation.
Take-home: Probably the most widespread VUI design tools today – don’t underestimate the power of simplicity.
In figure 3 you can see a couple of examples from different VUI design specifications. The spreadsheet approach is best for broad-but-shallow interactions, like initial open-ended interactions that are no more than one or two turns long. It provides an easy view of the big picture and of patterns across contexts. For longer transactional dialogs, a combination of flow and a word processing format with clickable links is often the better choice. In bigger implementations, you’ll probably combine both.
Figure 3 VUI dialog specs can be a few tabs in a spreadsheet or hundreds of pages of documentation
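As a rough illustration of the kind of "tailored script" that can sit on top of a spreadsheet-based spec, the sketch below reads rows exported as CSV and builds a prompt list, flagging any wording still in draft notation (all caps or angle brackets, as described for dialog flows). The column names, prompt labels, and sample rows are assumptions invented for this example; a real spec would define its own layout.

```python
import csv
import io

# Hypothetical CSV export of a spec tab; columns and rows are made up.
SPEC_CSV = """intent,context,condition,label,wording
transfer_funds,known_user,recognized_and_handled,transfer_confirm_01,"Okay, transferring that now."
transfer_funds,known_user,not_recognized,transfer_retry_01,<ASK AMOUNT AGAIN>
check_balance,known_user,recognized_and_handled,balance_read_01,READ BALANCE
"""


def is_draft(wording: str) -> bool:
    """True if the wording uses draft notation: <angle brackets> or ALL CAPS."""
    text = wording.strip()
    if text.startswith("<") and text.endswith(">"):
        return True
    letters = [ch for ch in text if ch.isalpha()]
    return bool(letters) and all(ch.isupper() for ch in letters)


def build_prompt_list(csv_text: str) -> dict[str, str]:
    """Map each prompt label to its wording, rejecting duplicate labels."""
    prompts: dict[str, str] = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        label = row["label"]
        if label in prompts:
            raise ValueError(f"Duplicate prompt label: {label}")
        prompts[label] = row["wording"]
    return prompts


prompts = build_prompt_list(SPEC_CSV)
drafts = sorted(label for label, wording in prompts.items() if is_draft(wording))

for label, wording in prompts.items():
    status = "DRAFT" if label in drafts else "final"
    print(f"{label}\t{status}\t{wording}")
```

A script like this is only as good as the notation behind it, which is why everyone referencing the documentation needs to use the same conventions.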
As you can see, you have lots of choices. There’s no ‘One Way’ of documenting VUI designs that all voice practitioners use. You need to determine what works for you based on many factors: your work style, team structure and familiarity with voice, company demands, voice platform, infrastructure, existing designs, tool availability, type of voice interaction, multimodality, or content. We’ve used all the approaches mentioned here in various combinations, with varying results and levels of happiness. Investigate your options – they’ll constantly change as tools come and go. Make use of anything already in place, such as design patterns or existing designs for similar voice dialogs. Those may impact your documentation choice, or make it faster to get started. At times you’ll have little or no choice because the approaches and formats have already been set.
How to review a VUI design, as told by a developer-turned-VUI-designer: “If there’s a paired dialog flow, I always start with that. For me it makes the most sense. A lot of developers don’t care about the dialog flow and I have no *#! idea why. I always have. Big picture, then smaller details.
I can’t emphasize the dialog flow’s importance enough, like with mixed audiences. Business folks understand them. It’s the easiest doc for the largest range of people to understand. It’s great for orienting people in a bigger system. It’s helpful to go back and forth between the dialog flow and VUI spec, even having them both up at once allows people to reference what you went over.
I like to go through the spec module by module, first explaining at a high level what the module does such that when we get into the details, people can connect that to what I said earlier. The first few modules might take longer but then if it’s clear and predictable (like similar conditions written the same way), even a non-techie person can learn the lingo and follow any detailed VUI design. Then I usually go node-by-node to a point. When there’s a million conditions it may become more “In scenario A, we go down this path” and follow that through, then go back to scenario B. Specs must have clickable links which allow you to step through any scenario.
Explaining what to look out for and why helps folks who have less of a tech perspective, even with easy modules. Like “this is your transfer funds module…we have checks in place to ensure there’s enough money, that this business rule is caught here, etc. etc.” Talking about business rules piques their interest because everyone knows what a transfer funds module does. Those small details upfront help get them interested and also let them know you “got it.”
I’m a developer who always cared about design. I like when developers say, “If you do this part this way, the experience is the same but it’ll be much easier for me to code.” That’s a good sign they understand, are paying attention, and can compromise.”
Many tools today are aimed at letting developers jump in with little or no voice or design expertise. It’s cool, and one reason for the explosion of voice development, but for enterprise-level voice systems, you’ll probably need something else to coordinate and track details across teams. And you’ll certainly start with design, not development.
The detailed design references another important set of documented information: the VUI style guide. The core purpose of the style guide is to establish consistency. This means consistency on every level: in prompt wording and delivery within and across conversations, in VUI behavior, interpretation, and relative to any branding. Style guides are crucial when there’s more than one designer working on portions of a VUI or on related VUIs for a company, but even a lone designer needs a style guide to keep dialogs consistent. Developers and speech scientists need it to ensure that user behavior is handled consistently and that words and phrases are consistently interpreted.
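One small part of that consistency, prompt wording, can be checked mechanically. The sketch below is a hedged illustration rather than an established tool: it assumes a style guide that lists a preferred term alongside variants to avoid, and it scans prompt wording for the variants. The rule table, prompt labels, and wording are all hypothetical.

```python
import re

# Hypothetical style-guide rules: preferred term -> variants to avoid.
STYLE_RULES = {
    "sorry": ["apologies", "my bad"],
    "transfer": ["wire", "move money"],
}

# Hypothetical prompts, keyed by reference label.
PROMPTS = {
    "transfer_confirm_01": "Okay, I'll transfer that now.",
    "transfer_retry_01": "Apologies, how much should I wire?",
}


def style_violations(prompts, rules):
    """Return (label, avoided term, preferred term) for each rule a prompt breaks."""
    findings = []
    for label, wording in prompts.items():
        for preferred, avoided in rules.items():
            for term in avoided:
                # Whole-word, case-insensitive match on the avoided variant.
                if re.search(rf"\b{re.escape(term)}\b", wording, flags=re.IGNORECASE):
                    findings.append((label, term, preferred))
    return findings


findings = style_violations(PROMPTS, STYLE_RULES)
for label, term, preferred in findings:
    print(f"{label}: found '{term}', style guide prefers '{preferred}'")
```

A check like this covers only wording; consistency in delivery, VUI behavior, and interpretation still depends on people reading and applying the style guide.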