Google’s CEO, Sundar Pichai, shared his thoughts about voice search.
“From a user’s standpoint, they are looking for information, looking to get things done,” he said, “so they may ask a question on voice, later when they pick up their phone they want continuity, so we think of this as an end-to-end thing,” one that “needs to be thought about as a part of a user’s overall search journey.”
Spot on! In a post last week, I looked at the digital assistant ecosystem. But voice search is about more than just the right interaction and interface. An end-to-end solution also needs to cater for security, privacy, functionality, fragmentation and customization. Past concerns (great insights in the comments section) remain valid today. Thankfully, technology has moved forward.
First, let’s talk about security. New but subtle advances in voice recognition are tackling complex problems around identity and security. Researchers are looking at “converting voice to kind of like a barcode to identify every human.” But is this enough when, as a user and customer, our own life savings are at stake? To create products for everyday use, banks require even more stringent security than law enforcement, analyzing not only the voice and environment, but also “physical features of the head producing the sound – larynx type, nasal passages etc.” Is the technology perfect? Probably not, but steps are being taken to tackle issues such as fraud, impersonation and miscommunication. For example, Alibaba’s Tmall Genie comes with built-in “voice-based user identification” and can “learn from past interactions, helping it improve in its ability to serve as your digital assistant.” Such voice recognition features should also enable the use of voice assistance in crowded, noisy places.
Along with security, another critical concern users have is privacy. The primary concern is that people don’t want to be listened to all the time. The major players have policies and guidelines around the privacy of voice recordings, covering what is transmitted and when, how long the recordings are kept, and how much control users have over keeping and deleting them, but these are already being tested in the real world. The challenge, though, isn’t just about voice recordings – it is also about data privacy and access to other devices in the home. While the privacy vs convenience issue is not new, it does create new models, opportunities and segments to target. Just read the pitch by Andy Rubin’s new venture: “We designed Essential Home to directly talk to your devices over your in-home network as much as possible in order to limit sending data to the cloud. Essential Home’s proactive assistant also runs its AI engine locally on the device.”
The next biggest challenge is functionality – what does voice offer other than the ability to set reminders, check the news and play your favorite tracks? A common complaint about voice services is the limited number of applications available for them.
If you expect a complete package to be available, you will be disappointed. It is too early in the game. From an Agile / Lean perspective – why should Google, Amazon or Apple build all the use cases they can think of only to find out that 90% of the use cases they built were wrong? Wouldn’t it be better to build an MVP and an API that can be quickly adapted to meet use cases as they are discovered? Wrap each set of use cases into an “app” and you have an ecosystem. Allow a subscription model and third-party development and you have a marketplace.
Which assistant should I get? From a user’s perspective, functionality is key. But are businesses in a position to develop and maintain voice solutions across multiple platforms? Will they have the financial will to be early adopters and invest knowing that some of these platforms will eventually die out? These questions are likely to give rise to a new breed of cross-platform tools for integration, modelling, design, analytics and testing. Though such tools have a long way to go before earning an “awesome” label, there are already prototyping tools that offer some level of cross-functional adaptability.
Beyond the Customer Journey
Google is not the only one thinking about voice assistants and the customer journey. Expedia is exploring a “real time” travel assistant. To deliver the kind of digital assistant that is our own personal Jarvis, it needs to handle “unstructured queries” across a variety of environments while pulling from a variety of sources. This structure highlights the possibility of various digital assistants specialized in different tasks – finance, travel, hotels, and driving. The question, then, is not just about designing a customer journey, but about providing a great and accurate experience within that journey. Just listen to Dara speak about voice (21:30 onwards): “And if they ask a question, and you have a bad answer, first time maybe they will be okay with it, third time this is a complete waste of time I am going away.”
In the field of voice, building a great user experience is not just about designing smart and engaging voice interactions, but also about aligning all elements, from devices and displays to the underlying data, to deliver useful information and actions to the consumer. If the voice interaction works but any of the other elements don’t align, voice will feel “wrong”. In fact, designing the voice interactions might end up being the easy part – after all, we interact with other human beings through voice on a daily basis and are familiar with how we like to receive and deliver information, as well as with the limitations of voice. The challenge may lie in creating processes that deliver consistent results.