Conversational artificial intelligence (AI) is fundamentally transforming how we interact with technology. Beginning with Siri in 2011 and evolving over the past half-decade—first in smartphone operating systems (OSes) and now natively available in a range of different form factors—digital or virtual assistants have become a more ingrained element in our everyday user experience (UX). As we get used to virtual assistants doing more things for us and existing wherever we are, the next question is: Where do they go from here?
Amazon's Alexa, Apple's Siri, Microsoft's Cortana, and Google Assistant have become our voice-activated helpers when searching for information, organizing our schedules, and completing all manner of everyday tasks in more intuitive ways. This could mean anything from sending a hands-free text or playing music through an Amazon Echo to ordering an Uber ride or a pizza through your Google Home. These simple types of actions are just the beginning. We're also starting to see the value of this kind of AI in more complex scenarios and business settings.
Cortana and Google Assistant are getting smarter, with contextual reminders and recommendations geared toward optimizing productivity along with fascinating innovations incorporating computer vision and other machine learning (ML) algorithms. Alexa is building out a diverse ecosystem of third-party skills, and Google and Microsoft have followed suit.
Alexa Skills Kit, Cortana Skills, and Actions on Google give companies and developers the tools to apply the voice tech to everything from email marketing and e-commerce to expense tracking and fleet management. These business applications and use cases are only what we've seen so far. PCMag spoke to execs from Amazon, Google, and Microsoft to understand their virtual assistant vision, how the tech is evolving, and what these companies believe businesses can do with voice-capable AI helpers.
What Are Virtual Assistants?
Whether you call it a digital assistant, virtual assistant, or voice assistant, you're essentially talking about an agent or app that lets users ask questions or give commands using natural language (NL). A virtual assistant can exist in many different forms, be it on your smartphone or desktop, within a connected speaker, or called up in specific apps and services when you need it. Google Assistant, for instance, is now available on Android devices such as the Google Pixel and in the Google Allo chat app, and is on its way to both iPhones and TVs.
"We were looking for a natural way for people to interact with technology, which is how we came up with a conversational interface to talk to a machine," said Gummi Hafsteinsson, Product Management Director for Google Assistant. "Conversation is the most natural interface you can think of and it's something we all know how to do very naturally."
PCMag spoke to Hafsteinsson at Mobile World Congress in Barcelona earlier this year. Hafsteinsson began his career at Google, helping to launch Google Maps and Google Voice Search. He then spent a few years at Apple, working on Siri before founding smart messaging startup Emu, which was acquired by Google in 2014. Hafsteinsson has been back at Google working on Google Assistant ever since.
"Google Assistant is now in Allo, in the car, in Android TV, in the new version of Android Wear…we're expanding the kind of devices you can get Assistant on," said Hafsteinsson. "In an Allo group chat, you can ask the assistant for travel information and share it in the conversation. Assistant also interacts with third-party services like Uber."
Hafsteinsson said one of the challenges virtual assistants seek to solve is how you create an interface that works without having to understand each service you need to talk to. Google is building intelligence into its entire product stack, and with Google Assistant, the company is trying to keep the UX simple while introducing increasingly complex data sets, integrations, and automated logic.
"The only interface that works is one you don't have to learn," said Hafsteinsson. "The assistant is a natural interface to anything in the wild, be it services or devices. And we orchestrate all those things together in a way that works for you."
One important distinction is between a virtual assistant and a chatbot. Chatbots provide a very similar type of experience: a conversational interface through which a user interacts with a virtual agent. But their scope is narrower than that of virtual assistants, which are more ubiquitously available across digital experiences and serve a wider range of uses.
Leaving the means of natural-language communication out of it (chatbots generally use text-based chat as opposed to voice-activated conversation), you can think of a chatbot as a targeted application of what a virtual assistant can do. Microsoft CEO Satya Nadella and David Marcus, Head of Messaging Products at Facebook, have both expressed similar sentiments, positioning chatbots as a new version of an app. If you look at the Facebook Messenger and Microsoft bot directories, that logic holds true.
Chatbots are beginning to get a lot smarter, but for businesses, their primary function is as a virtual agent for a specific app, brand, or service. Chatbots help customers do things such as book travel, shop and complete e-commerce transactions, or get customer support information and submit helpdesk tickets through a conversational interface. If a chatbot is a virtual agent set to task within a specific app, then a virtual assistant is what happens when you give the AI free rein throughout an OS.
The value of a virtual assistant is having it there wherever you are, giving you the tailored information you need sometimes before you even know you need it. In the case of Cortana, the AI's origins date back to flesh-and-blood personal assistants and how they function in a business.
"When we went to develop Cortana, which started out on Windows Phone before we brought it to Windows 10, the originators actually sat down with assistants," said Laura Jones, Senior Marketing Manager for Cortana, during a meeting at the company's Redmond campus last month. "They did all these interviews with personal assistants on what they do. That was the starting point for creating a true digital assistant versus just a voice interface."
Microsoft, more than the other tech giants building out this tech, has deeper roots in business software and productivity. Cortana is enabled across a number of Microsoft's apps and services, from Microsoft Power BI to Skype for Business, to provide immediate contextual responses to business queries without leaving the app you're in. Cortana (along with Alexa and Google Assistant) is also cross-platform. Cortana is available as an app on Android and iOS as well as through the Cortana Device SDK. Jones said a virtual assistant is only as good as its ability to be with you wherever you are.
"One of the early learnings is that all the personal assistants keep what they know about someone so they can better help them," said Jones. "So one of the things we wanted to do was actually personalize the experience to someone, so the assistance you were getting from Cortana was unique to you and could proactively track important things for you."
The simplest way Cortana does this is through reminders. Scheduling, reminders, and lists are a top-of-mind business use case for virtual assistants. Jones talked about using Cortana in a touchscreen device such as the Microsoft Surface Pro. Smart Sticky Notes in the Windows 10 Anniversary Update let you write something like, "Call my boss at 3pm," either by typing or by writing a note with the stylus as part of Windows Ink. Cortana will then add that reminder to keep track of the task.
Microsoft is also working with Wunderlist (soon to be Microsoft To-Do) to integrate Cortana and sync lists across devices. This is all part of a more proactive strategy, using both contextual data and location-based reminders to help users manage their emails, schedule, and day-to-day commitments.
"We've kind of taken reminders to a whole other level where we're proactively reminding people about commitments they've made in email," said Jones. "So, if I sent an email saying I would update this section of a report, Cortana picks that out and reminds me. We think this is a really compelling enterprise scenario. Think about how many commitments you make to someone at work as opposed to your consumer life. Cortana will remind me multiple times a day that I set up this meeting or I have to follow-up with so-and-so, and then deep-links me into the email so I can either follow-up right away or set a reminder."
Microsoft is looking to expand this even further to dynamically create Cortana to-do lists and surface information based on data throughout Office 365. Cortana is already fully integrated into the Microsoft Edge browser (you can right-click on any highlighted word in Edge to Ask Cortana), and can search for documents or people across apps such as OneDrive for Business and SharePoint.
The Windows 10 Creators Update also integrated Cortana with Microsoft Azure Active Directory (AAD) to bring the AI capabilities to enterprise users who may not have had access to it before. These kinds of integrations also extend to Power BI, which lets you pull Cortana data into business intelligence (BI) queries and reports. That's not to be confused with the Cortana Intelligence Suite, a separate enterprise offering that builds machine learning (ML) and predictive analytics into business apps.
Beyond that, Jones said the Cortana team is working with Microsoft Research through projects such as Calendar.help (which is still in Preview) to automate processes like scheduling meetings with contacts outside your organization. The team is also working with the Microsoft IT Division (MSIT) development team to create experiences specific to Cortana that pull in a range of apps and contextual data.
"Once we start to grow Cortana's presence in work, think about waking up in the morning to ask Cortana for your schedule on a speaker device," said Jones. "We want that to flow across and, if I ask Cortana to remind me of something when I get to work, I want to open my PC and get that."
Jones gave a demo along those lines at Microsoft Build last month, asking Cortana through the new Harman Kardon Invoke speaker about upcoming appointments and to-do items.
How Virtual Assistants Get Smarter
The more tasks you teach and program an AI to perform, the more it will be able to do. In this respect, virtual assistants have something in common with the deep learning process by which ML algorithms and neural networks are trained on massive data sets. Training virtual assistants to perform specific business tasks is easier; all you have to do is open up the ecosystem to third-party skills development.
Amazon is the standard-bearer in this regard. The Alexa Skills Kit has been available since 2015 and lets companies and developers apply Alexa to whatever business environment or process they desire. As a result, there's already a wide selection of available business skills that companies can simply enable and start using—and that ecosystem is growing.
"We've seen people build skills using Alexa skills to automate conference rooms and we've seen people connecting to their emails and calendars. Companies are re-purposing Alexa for lots of enterprise scenarios," said David Limp, Senior Vice President of Devices at Amazon, who oversees Alexa development across all of Amazon's devices.
Google and Microsoft have followed Amazon's lead on this front with Actions on Google and Cortana Skills, respectively. These toolkits let you build particular skills but they're also evolving to incorporate natural language processing (NLP) and features such as proactive suggestions to recommend a skill to users in the right context (for instance, a reminder to order food for a regular lunch meeting).
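To make the idea of a skill concrete, here is a minimal, hypothetical sketch in Python. It does not use the Alexa Skills Kit, Actions on Google, or Cortana Skills APIs; the `Skill` and `Assistant` classes, the `food_order` skill, and the keyword matching are all assumptions made for illustration, standing in for the far richer NLP those platforms apply. It shows only the general shape: third parties register handlers, the assistant routes a request to a matching handler, and a simple context rule proactively suggests a skill.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Skill:
    """A hypothetical third-party skill: a name, trigger phrases, and a handler."""
    name: str
    triggers: List[str]
    handler: Callable[[str], str]


class Assistant:
    """Toy assistant that routes utterances to registered skills."""

    def __init__(self) -> None:
        self.skills: Dict[str, Skill] = {}

    def register(self, skill: Skill) -> None:
        self.skills[skill.name] = skill

    def handle(self, utterance: str) -> str:
        # Naive substring matching stands in for real intent recognition.
        text = utterance.lower()
        for skill in self.skills.values():
            if any(trigger in text for trigger in skill.triggers):
                return skill.handler(utterance)
        return "Sorry, I don't have a skill for that yet."

    def suggest(self, context: Dict[str, str]) -> Optional[str]:
        # Illustrative proactive rule: an upcoming lunch meeting
        # prompts the (hypothetical) food-ordering skill.
        next_meeting = context.get("next_meeting", "").lower()
        if "lunch" in next_meeting and "food_order" in self.skills:
            return "You have a lunch meeting coming up. Want me to order food?"
        return None


def build_demo_assistant() -> Assistant:
    assistant = Assistant()
    assistant.register(
        Skill(
            name="food_order",
            triggers=["order lunch", "order food"],
            handler=lambda utterance: "Placing your usual lunch order.",
        )
    )
    return assistant


if __name__ == "__main__":
    demo = build_demo_assistant()
    print(demo.handle("Please order lunch for the team"))
    print(demo.suggest({"next_meeting": "Weekly lunch sync"}))
```

In this sketch, adding a new business capability means registering another `Skill` rather than changing the assistant itself, which is the ecosystem effect the toolkits are aiming for.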
PCMag recently caught up with Limp at the Wired Business Conference in New York City. Limp said Amazon is not actively building any Alexa capabilities specifically geared toward enterprise business, because it doesn't have to. The benefit of having the strongest ecosystem is that businesses will take the platform and build what they want, and in some cases, pair that experience with Amazon's hardware at scale.
"In enterprise areas, we've seen a bunch of fascinating uses already," added Limp. "Wynn Hotels is putting an Echo in every hotel room, so imagine all the use cases to automate that room. Another entertainment one is at one of the stadiums [Safeco Field] in Seattle, where the stadium is putting them in sky boxes to get scores and order food, or they can write custom skills for it."
The next step, according to Limp, is making all of those skills (currently more than 13,000) easily discoverable and manageable. App stores have never quite solved longtail app discoverability. After that, he said it's about integrating and standardizing across virtual assistant platforms to tie together this "world of a million skills."
"Our view is, it's not homogeneous. Our hope and first efforts were to open up the environment to skills to augment Alexa, but over time, you could imagine invoking Cortana or another assistant," said Limp. "You should be able to say, 'Alexa, ask Siri X.' That's a very real use case. The difference between Alexa, Siri, Cortana, and Google Assistant are more horizontal than anything else. Opening standards around how these AIs interoperate is super important. Whether that's the formulation of standards or making sure that our APIs [application programming interfaces] are open and published. We see the world looking like that."
Interoperability aside, the fact is that this space is still only a few years old. Amazon launched The Alexa Fund in 2015 to spur innovation in the space, committing to invest up to $100 million in venture capital (VC) funding in both startups and established brands pushing the boundaries of what voice and virtual assistant tech can do. Google and Microsoft are both heavily invested in continued research as well.
Both Microsoft's Jones and Google's Hafsteinsson talked about the need to improve natural language processing and conversational understanding before virtual assistants become not only widely adopted by consumers, but truly ingrained in everyday business operations.
"Overall, the industry is still really early," said Microsoft's Jones. "Adoption is early. We're seeing a growing interest in the home and a big opportunity in productivity and work, but it's not just about understanding what I'm saying. It's about understanding the intent, especially as we bring this technology to even more conversational platforms."
Google's Hafsteinsson agrees that we're only scratching the surface of the different contexts and new use cases for this tech. More important than adoption numbers, he feels, is that the presence and contextual convenience of virtual assistants will gradually shift our behavior.
"In the last two years, we've seen this explosion in the technology. But you have to understand that its future is not based on numbers," said Hafsteinsson. "It's based on the advantages it brings to human beings. Ignore it at your own risk."