Amazon Polly

Initial release November 29, 2016
Available in 29 languages
Type Speech synthesis
Website aws.amazon.com/polly/

Amazon Polly is a cloud service from Amazon Web Services, a subsidiary of Amazon.com, that converts text into spoken audio. [1] [2] [3] It allows developers to build speech-enabled applications and products. [4] It was launched in November 2016 [5] [6] [7] and, as of 2019, included 60 voices across 29 languages, [8] [9] some of which are higher-quality Neural Text-to-Speech voices. Its users include Duolingo, a language education platform. [10]
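
As an illustration of how a developer might request speech from the service, the following is a minimal sketch in Python using the `synthesize_speech` operation of the AWS SDK (boto3). It is not taken from the cited sources. The helper names `synthesize_to_bytes` and `make_polly_client`, the voice "Joanna", and the region "us-east-1" are assumptions chosen for the example, and running it against the real service requires boto3 and valid AWS credentials.

```python
from typing import Any


def synthesize_to_bytes(client: Any, text: str, voice_id: str = "Joanna",
                        engine: str = "standard") -> bytes:
    """Ask Polly to speak `text` and return the resulting MP3 bytes.

    `client` is expected to behave like a boto3 Polly client. The default
    voice is an assumption for this sketch, not a recommendation.
    """
    response = client.synthesize_speech(
        Text=text,
        OutputFormat="mp3",
        VoiceId=voice_id,
        Engine=engine,  # "neural" requests a Neural Text-to-Speech voice
    )
    # AudioStream is a file-like streaming body; read it fully into memory.
    return response["AudioStream"].read()


def make_polly_client(region_name: str = "us-east-1") -> Any:
    """Create a real Polly client (hypothetical helper; region is assumed)."""
    # Imported lazily so the helper above can be exercised without boto3.
    import boto3
    return boto3.client("polly", region_name=region_name)
```

A caller would typically pass `make_polly_client()` as the first argument and write the returned bytes to a file opened in binary mode. Taking the client as a parameter keeps the function easy to exercise with a stand-in object when no AWS account is available.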

References

  1. Dignan, Larry (September 20, 2018). "Amazon's slew of new Echo, Alexa devices obscures new developer tools, features". ZDNet.
  2. "Amazon Polly". TechTarget. Retrieved September 22, 2018.
  3. Perez, Sarah (February 8, 2018). "Amazon launches a Polly WordPress plugin that turns blog posts into audio, including podcasts". TechCrunch.
  4. Alawadhi, Neha (August 29, 2018). "AWS announces addition of Hindi language support for Amazon Polly". moneycontrol.com.
  5. "AWS makes Amazon Polly talk in Hindi in addition to Indian English". Digit. August 29, 2018.
  6. "Amazon announces three new AI services called Lex, Polly and recognition for AWS". Firstpost.com. December 1, 2016.
  7. Lardinois, Frederic (November 30, 2016). "Amazon launches Amazon AI to bring its machine learning smarts to developers". TechCrunch.
  8. "Amazon Polly Adds Arabic Language Support". aws.amazon.com. Archived from the original on April 29, 2019.
  9. "Languages Supported by Amazon Polly - Amazon Polly". docs.aws.amazon.com. Retrieved September 6, 2019.
  10. "Powering Language Learning on Duolingo with Amazon Polly". May 12, 2017.