AWS steps up AI focus

AWS has announced three new features for its artificial intelligence portfolio at re:Invent 2016 in Las Vegas this week.

Jamie Davies

December 1, 2016


Artificial intelligence could have wide-ranging impacts on businesses throughout the world, and while many have been sceptical about its effectiveness, the technology is starting to make waves. Google recently announced changes to the way its translation tool uses AI, IBM’s Watson appears to be making steady progress, and Salesforce’s Einstein is paying off with the company growing year-on-year.

Considering the normalization of artificial intelligence and its growing acceptance in the technology world, it wasn’t going to be long before the public cloud leader made itself known. Sure enough, just in time for Christmas, the AWS team has put forward three new features for developers to play with.

Firstly, Polly is making its debut in the US East (Northern Virginia), US West (Oregon), US East (Ohio), and EU (Ireland) regions. Polly is a text-to-speech service which uses machine learning to recognize context, environment and location. The AWS team’s aim was to create a service which takes text and converts it to lifelike speech that you can use in your own tools and applications. Lifelike is the big focus here.

“Polly was designed to address many of the more challenging aspects of speech generation,” said AWS Chief Evangelist Jeff Barr. “For example, consider the difference in pronunciation of the word “live” in the phrases “I live in Seattle” and “Live from New York.” Polly knows that this pair of homographs are spelled the same but are pronounced quite differently.

“Or, what about the ‘St.’ Depending on the language and the context, this could mean (and should be pronounced) as either ‘street’ or ‘saint’. Again, Polly knows what to do here. Polly can also deal with units, fractions, abbreviations, currencies, dates, times, and other speech components in sophisticated, language-specific fashion.

“In order to do this, we worked with professional, native speakers of each target language. We asked each speaker to pronounce a myriad of representative words and phrases in their chosen language, and then disassembled the audio into sound units known as diphones.”

From a pricing perspective, Polly can be used for 5 million characters per month free of charge, after which it’s $0.000004 per character, or about $0.004 per minute of generated audio. Converting the full text of ‘The Adventures of Huckleberry Finn’ to speech would cost roughly $2.40.
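The arithmetic behind those figures can be checked with a short Python sketch. The per-character price and free allowance are the ones quoted above; the 600,000-character book length is an assumption, back-calculated from the $2.40 figure rather than measured, and `polly_cost` is a hypothetical helper, not part of any AWS SDK.

```python
from decimal import Decimal

# Figures quoted in the article.
PRICE_PER_CHAR = Decimal("0.000004")   # USD per character
FREE_CHARS_PER_MONTH = 5_000_000       # monthly free allowance

def polly_cost(chars: int, apply_free_tier: bool = False) -> Decimal:
    """Estimate the cost in USD of synthesizing `chars` characters.

    Hypothetical helper for illustration only. With apply_free_tier=True,
    the monthly free allowance is subtracted before pricing.
    """
    billable = max(0, chars - FREE_CHARS_PER_MONTH) if apply_free_tier else chars
    return PRICE_PER_CHAR * billable

# Assumption: about 600,000 characters, the length implied by the $2.40 estimate.
BOOK_CHARS = 600_000

if __name__ == "__main__":
    print("List price:", polly_cost(BOOK_CHARS))
    print("Within free tier:", polly_cost(BOOK_CHARS, apply_free_tier=True))
```

On these assumptions the $2.40 estimate is the list price before the free allowance; a single book of that length would fit inside the 5 million free characters for the month.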

Secondly, the team has introduced Amazon Rekognition, its image recognition API. The tool is built on deep learning and has already been trained on thousands of objects and scenes. The API also breaks down the image into components when using facial recognition, so if it is wrong, you can see why.

While this is not a ground-breaking advancement in the AI world, it was necessary. Google, Microsoft, IBM and a number of other competitors offer the same functionality, so while the AWS team is not setting itself apart with this launch, it was required to make sure it didn’t fall behind.

Rekognition is now available in the US East (Northern Virginia), US West (Oregon), and EU (Ireland) regions, though Barr did not give pricing details.

Finally, Amazon Lex has been released to allow developers to build their own conversational applications, such as chat bots or other web & mobile applications that support engaging, lifelike interactions.

Lex is built on the same deep learning technologies that power Amazon Alexa, which has generally been receiving positive reviews. Using Lex, chat bots can be connected with Facebook Messenger right now, with Slack and Twilio integration currently in the works.

Although the consumer applications may be attractive and the more obvious implementation, the team has been working to allow Lex to work with AWS Lambda functions to implement business logic for a bot. The bot can potentially be connected to enterprise applications and data. In theory, this could be seen as a simple route to automating routine tasks within the enterprise.
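As a rough illustration of what that business logic might look like, here is a minimal Python sketch of a Lambda handler fulfilling a bot intent. It assumes the event and response shapes of the original Lex Lambda integration (`currentIntent`, `slots`, and a `dialogAction` of type `Close`); the `CheckOrderStatus` intent, the `OrderId` slot, and the in-memory `ORDERS` table are hypothetical stand-ins for a real enterprise data source.

```python
# Hypothetical in-memory stand-in for an enterprise system of record.
ORDERS = {"1001": "shipped", "1002": "processing"}

def lookup_order_status(order_id):
    """Return the status for an order ID, or None if it is unknown."""
    return ORDERS.get(order_id)

def close(fulfillment_state, text):
    """Build a response that ends the conversation with a plain-text message."""
    return {
        "dialogAction": {
            "type": "Close",
            "fulfillmentState": fulfillment_state,  # "Fulfilled" or "Failed"
            "message": {"contentType": "PlainText", "content": text},
        }
    }

def lambda_handler(event, context):
    """Entry point invoked by the bot once an intent is ready for fulfilment."""
    intent = event["currentIntent"]
    slots = intent.get("slots") or {}

    # "CheckOrderStatus" and "OrderId" are assumed names for this sketch.
    if intent["name"] != "CheckOrderStatus":
        return close("Failed", "Sorry, I can't help with that yet.")

    order_id = slots.get("OrderId")
    status = lookup_order_status(order_id)
    if status is None:
        return close("Failed", f"I couldn't find order {order_id}.")
    return close("Fulfilled", f"Order {order_id} is {status}.")
```

In a real deployment the lookup would call out to the enterprise application rather than a dictionary, but the flow (read the intent and slots, run the business logic, return a closing message) would stay the same.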

“In conjunction with the newly announced SaaS integration for AWS Mobile Hub, you can build enterprise productivity bots that provide conversational interfaces to the accounts, contacts, leads, and other enterprise data stored in the SaaS applications that you are already using,” said Barr.

“Putting it all together, you now have access to all of the moving parts needed to build fully integrated solutions that start at the mobile app and go all the way to the fulfilment logic.”

This is another example of incremental development in a technology which is slowly finding its way to normalization and mass market acceptance. AWS may not be making waves with the AI announcements, but all these small ripples will add up. After all, there is a reason AWS is running away with market share in the public cloud segment.
