Amazon’s Alexa is set to receive a major upgrade that will bring its conversational capabilities more in line with modern chatbots like Google Bard or OpenAI’s ChatGPT, Dave Limp, SVP of Amazon Devices & Services, announced during the company’s 2023 Devices event on Wednesday. The long-running digital assistant will soon be driven by a purpose-built large language model (LLM) that will be available on nearly every new Echo device.
“Our latest model has been specifically optimized for voice,” Limp told the assembled crowd, “and the things we know our customers love — like having access to real-time information, efficiently controlling their smart home, and getting the most out of their home entertainment.”
Amazon is itself no stranger to genAI technology, having spent more than a decade researching its “ambient intelligence” systems. Generative AI models, specifically the Alexa Teacher Model, have long driven the background functions of Alexa devices. “With generative AI within reach, we started doubling down on the home about nine years ago, and we had an epiphany,” Limp said. “We realized that all the investments in R&D in the consumer electronics industry were being funneled into mobile phones. The SoCs, the displays, the chipsets, the sensors — it was all being optimized for the phone.”
“That was understandable,” he conceded. “It’s a multi-billion-dollar-a-year industry. But at the same time, the place where you spend the vast majority of your life — your home — was virtually forgotten.”
The new model will be both “larger and more generalized,” Limp said, and will “help us take the next steps towards a remarkably different customer experience.” To that end, Amazon set out to design the LLM based on five foundational capabilities and then tune the model specifically for voice applications rather than mobile screens.
In practice, the voice optimizations mean you won’t have to repeat the “Alexa” wake word every time you talk to the assistant. Customers enrolled in the company’s Visual ID system will just need to face the screen before they start talking. What’s more, the new Alexa will be more forgiving of stumbling or pause-filled speech, and it will soon modulate its tone and emotion to match the context of the conversation.
The LLM will also be “connected to hundreds of thousands of real-world devices and services via APIs,” the company’s release reads. “It also enhances Alexa’s ability to process nuance and ambiguity—much like a person would—and intelligently take action.” As such, users will soon be able to program complex requests, like “Alexa, every weeknight at 9 PM, make an announcement that it’s bedtime for the kids, dim the lights upstairs, turn on the porch light, and switch on the fan in the bedroom,” all using just spoken commands.
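To get a sense of what a spoken request like that has to resolve into behind the scenes, here is a minimal, purely hypothetical Python sketch that models the bedtime routine above as a scheduled trigger plus a list of device actions. The class names, fields, and device identifiers are illustrative assumptions for this article only; they are not Amazon’s actual Alexa Routines or Smart Home APIs.

```python
# Hypothetical sketch: how a spoken routine request might map to structured
# data an assistant could hand off to device APIs. Not Amazon's real API.
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Trigger:
    """A recurring schedule, e.g. 9 PM on weeknights."""
    time: str                                  # 24-hour clock, e.g. "21:00"
    days: List[str] = field(
        default_factory=lambda: ["MON", "TUE", "WED", "THU", "FRI"]
    )


@dataclass
class Action:
    """One command aimed at a device or device group."""
    device: str                                # e.g. "upstairs_lights"
    command: str                               # e.g. "announce", "dim", "turn_on"
    value: Optional[str] = None                # optional payload (text, level, etc.)


@dataclass
class Routine:
    name: str
    trigger: Trigger
    actions: List[Action]


# The spoken request quoted in the article, expressed as structured actions.
bedtime = Routine(
    name="kids_bedtime",
    trigger=Trigger(time="21:00"),
    actions=[
        Action(device="all_echos", command="announce",
               value="It's bedtime for the kids"),
        Action(device="upstairs_lights", command="dim", value="30%"),
        Action(device="porch_light", command="turn_on"),
        Action(device="bedroom_fan", command="turn_on"),
    ],
)

if __name__ == "__main__":
    # Print the plan the assistant would need to execute each weeknight.
    for action in bedtime.actions:
        detail = f" ({action.value})" if action.value else ""
        print(f"{action.command} -> {action.device}{detail}")
```

The point of the sketch is simply that a single conversational sentence has to be decomposed into a schedule and several independent device calls, which is the kind of nuance-and-ambiguity handling the release describes.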
Limp tried to show off those natural conversation capabilities during an on-stage demonstration Wednesday, but Alexa was not particularly cooperative, flatly ignoring two of his spoken prompts and forcing him to sheepishly repeat himself.
The new model is far from Amazon’s only genAI project. The company recently released a generative model to help its e-commerce sellers write product listings and incorporated a slew of AI-based features into its Thursday Night Football broadcasts at the start of the NFL season. It has also weathered criticism from the Writers Guild of America for allowing AI-generated book listings that infringe heavily on copyrighted works (and occasionally recommend eating suspect mushrooms).
The new LLM will be available to existing Echo owners as a free preview on the devices they already own, as well as on every new Echo device sold, starting in 2024.