Amazon: Alexa

Introduction

Alexa is a cloud-based voice service that can help you with tasks, entertainment, general information, and more.

Alexa does many things and is always learning. Alexa can do even more when you enable Alexa skills. Alexa is available on home devices, like Echo and Amazon Fire TV, and some third-party products, like apps and watches.

A few things that Alexa can do are:

Play music
Make Alexa-to-Alexa calls and messages
Play audiobooks
Get information
Control connected home devices
Shop on Amazon
Order food
Review your calendar events

Alexa Skills Kit (ASK)

Polly

Amazon Polly is a service that turns text into lifelike speech. It lets you create applications that talk, enabling you to build entirely new categories of speech-enabled products. As an Amazon AI service, it uses advanced deep learning technologies to synthesize speech that sounds like a human voice. Amazon Polly includes dozens of lifelike voices across a variety of languages, so you can select the ideal voice and build speech-enabled applications that work in many different countries.

Amazon Polly delivers the consistently fast response times required to support real-time, interactive dialog. It is also easy to use: you send the text you want converted into speech to the Amazon Polly API, and Amazon Polly returns an audio stream that your application can play directly or store in a standard audio file format, such as MP3. You pay only for the number of characters you convert to speech, and you can cache and save the generated audio to replay offline or redistribute. This low cost per character, together with the lack of restrictions on storing and reusing voice output, makes Amazon Polly a cost-effective way to enable text-to-speech everywhere.
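The request-and-response flow described above can be sketched in Python. This is a minimal illustration, assuming the boto3 Polly client and its synthesize_speech call with the Text, OutputFormat, and VoiceId parameters. The helper name synthesize_to_mp3, the voice choice "Joanna", and the FakePolly stand-in are assumptions made for this example so that it runs without AWS credentials.

```python
import io


def synthesize_to_mp3(polly_client, text, out_path, voice_id="Joanna"):
    """Send text to Polly and write the returned audio stream to a file.

    polly_client is expected to behave like boto3.client("polly").
    Returns the number of audio bytes written.
    """
    response = polly_client.synthesize_speech(
        Text=text,
        OutputFormat="mp3",
        VoiceId=voice_id,
    )
    # AudioStream is a file-like object; read() yields the encoded audio.
    audio_bytes = response["AudioStream"].read()
    with open(out_path, "wb") as f:
        f.write(audio_bytes)
    return len(audio_bytes)


class FakePolly:
    """Hypothetical stand-in client that returns placeholder bytes, not real audio."""

    def synthesize_speech(self, Text, OutputFormat, VoiceId):
        return {"AudioStream": io.BytesIO(b"fake-mp3:" + Text.encode("utf-8"))}


# With AWS credentials configured, a real client could be passed instead:
#   import boto3
#   synthesize_to_mp3(boto3.client("polly"), "Hello from Polly", "hello.mp3")
```

Passing the client in as an argument keeps the helper easy to exercise offline. Because the audio is saved to a file, it can be replayed later without another API call.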

Amazon Polly Web Site

Lex

Amazon Lex is a service for building conversational interfaces into any application using voice and text. It provides deep learning-based automatic speech recognition (ASR) to convert speech to text, and natural language understanding (NLU) to recognize the intent of that text, so you can build applications with highly engaging user experiences and lifelike conversational interactions. With Amazon Lex, the same deep learning technologies that power Amazon Alexa are available to any developer, enabling you to quickly and easily build sophisticated, natural-language conversational bots (“chatbots”).
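To make the intent-recognition step concrete, the following Python sketch sends one line of user text to a bot and reads back the recognized intent and the bot's replies. It assumes the boto3 Lex V2 runtime client ("lexv2-runtime") and its recognize_text call, which may differ from the Lex version this text was written for. The helper name ask_bot, the FakeLex stand-in, the "OrderFood" intent, and the bot identifiers are hypothetical and exist only for illustration.

```python
def ask_bot(lex_client, bot_id, bot_alias_id, session_id, text, locale_id="en_US"):
    """Send user text to a Lex bot; return (intent_name, reply_texts).

    lex_client is expected to behave like boto3.client("lexv2-runtime").
    """
    response = lex_client.recognize_text(
        botId=bot_id,
        botAliasId=bot_alias_id,
        localeId=locale_id,
        sessionId=session_id,
        text=text,
    )
    # The NLU result: which intent the service matched to the utterance.
    intent = response.get("sessionState", {}).get("intent", {})
    # The bot's replies, if it produced any.
    replies = [m["content"] for m in response.get("messages", []) if "content" in m]
    return intent.get("name"), replies


class FakeLex:
    """Hypothetical stand-in client with a single hard-coded intent."""

    def recognize_text(self, botId, botAliasId, localeId, sessionId, text):
        if "pizza" in text.lower():
            return {
                "sessionState": {"intent": {"name": "OrderFood"}},
                "messages": [{"contentType": "PlainText", "content": "What size?"}],
            }
        return {"sessionState": {}, "messages": []}


# With AWS credentials and a deployed bot, a real client could be passed instead:
#   import boto3
#   ask_bot(boto3.client("lexv2-runtime"), "<bot id>", "<alias id>", "user-1", "Hi")
```

The application only supplies text and a session identifier. Mapping the utterance to an intent is done by the service, which is the NLU role described above.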

Speech recognition and natural language understanding are among the most challenging problems in computer science, requiring sophisticated deep learning algorithms, massive amounts of training data, and substantial infrastructure. Amazon Lex democratizes these deep learning technologies by putting the power of Amazon Alexa within reach of all developers. Harnessing these technologies, Amazon Lex enables you to define entirely new categories of products made possible through conversational interfaces.

As a fully managed service, Amazon Lex scales automatically, so you don’t need to worry about managing infrastructure. With Amazon Lex, you pay only for what you use. There are no upfront commitments or minimum fees.