List of speech recognition software

Speech recognition software is available for many computing platforms, operating systems, use models, and software licenses. The lists below group such software by platform and type.

Acoustic models and speech corpus (compilation)

The following list presents notable speech recognition software engines with a brief synopsis of characteristics.

| Application name | Description | Open-source | License | Operating system | Programming language | Supported language, note | Offline or online |
|---|---|---|---|---|---|---|---|
| CMU Sphinx | HMM | Yes | BSD style | Cross-platform | Java | English, German, French, Mandarin, Russian | Offline |
| HTK | HMM neural net | No | HTK specific | Cross-platform | C | English; version 3.5 released December 2015 | |
| Julius | HMM trigrams | Yes | BSD style, non-commercial | Cross-platform | C | Japanese, English | Offline |
| Kaldi | Neural net | Yes | Apache | Cross-platform | C++ | English | |
| RWTH ASR | RWTH Aachen University | No | RWTH ASR, non-commercial use only | Linux, macOS | C++ | English | |
| Whisper | Encoder/decoder transformer | Yes | MIT license | Cross-platform | Python | Multilingual | Online (through API) and offline |
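
As an illustration of the offline mode noted for Whisper above, the following is a minimal sketch rather than a definitive recipe. It assumes the open-source `openai-whisper` Python package is installed (and `ffmpeg` for decoding audio files); the function name `transcribe_offline` and the default model size `"base"` are choices made for this example only.

```python
from typing import Any, Optional


def transcribe_offline(audio_path: str,
                       model: Optional[Any] = None,
                       model_name: str = "base") -> str:
    """Return the transcript of an audio file using a locally loaded model.

    `model` may be any object with a `transcribe(path)` method that returns
    a dict containing a "text" key; when omitted, a Whisper model is loaded.
    """
    if model is None:
        # Assumption: the `openai-whisper` package is installed.
        # It is imported lazily so the function can be tried with a stub model.
        import whisper
        model = whisper.load_model(model_name)
    # Whisper's transcribe() returns a dict; the full transcript is under "text".
    result = model.transcribe(audio_path)
    return result["text"].strip()
```

Calling `transcribe_offline("meeting.wav")`, where `meeting.wav` is a placeholder file name, would run recognition on the local machine once the model weights have been downloaded.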

Macintosh

| Application name | Description | Open-source | License | Price | Note |
|---|---|---|---|---|---|
| Dragon for Mac (discontinued 2018) | macOS; by Nuance | No | Proprietary | | |
| Dragon Dictate (discontinued) | macOS; by Nuance | No | Proprietary | | |
| MacSpeech Scribe (discontinued) | Transcription from recorded text; acquired by Nuance | | | | |
| iListen (discontinued) | PowerPC Macintosh; discontinued by MacSpeech; acquired by Nuance | | | | |
| Speakable items | Included with macOS | | | | |
| ViaVoice (discontinued) | IBM product; acquired by Nuance | | | | |
| Voice Navigator | Original GUI voice control; 1989 | | | | |
| Weesper Neon Flow | Voice-dictation software for macOS 11+; 100% offline processing; cross-app dictation | No | Proprietary | €5/month (free trial) | Provides local speech-to-text on device; available for macOS and Windows [1] |

Cross-platform web apps based on Chrome

The following list presents notable speech recognition software that operates in a Chrome browser as a web app. These apps make use of the HTML5 Web Speech API. [2]

| Application name | Description | Open-source | License | Price | Note |
|---|---|---|---|---|---|
| Speechmatics [3] | Cloud-based and on-premise automatic speech recognition | No | Proprietary | From £0.06 per minute of audio | |

Mobile devices and smartphones

Many mobile phone handsets, including feature phones and smartphones such as iPhones and BlackBerrys, have basic dial-by-voice features built in. Many third-party apps have implemented natural-language speech recognition support, including:

| Application name | Description | Open-source | License | Price | Note |
|---|---|---|---|---|---|
| Assistant.ai | Assistant for Android, iOS and Windows Phone | No | Proprietary, freeware | Free | Discontinued |
| Dragon Dictation | | No | Proprietary, freeware | Free | |
| Google Now | Android voice search | No | Proprietary, freeware | Free | |
| Google Voice Search | | No | Proprietary, freeware | Free | |
| Microsoft Cortana | Microsoft voice search | No | Proprietary, freeware | Free | |
| Siri Personal Assistant | Apple's virtual personal assistant | No | Proprietary, freeware | Free | |
| Alexa – Amazon Echo | Amazon's personal assistant | No | Proprietary | | |
| SILVIA | Android and iOS | No | | | |
| Vlingo | | | | | |

Windows

Windows built-in speech recognition

Windows Speech Recognition version 8.0 by Microsoft is built into Windows Vista, Windows 7, Windows 8 and Windows 10. It is available only in English, French, Spanish, German, Japanese, Simplified Chinese, and Traditional Chinese, and only in the corresponding language version of Windows: the speech recognition engine for one language cannot be used on a version of Windows in another language. Windows 7 Ultimate and Windows 8 Pro allow the system language to be changed, which in turn changes which speech engine is available. Windows Speech Recognition evolved into Cortana, a personal assistant included in Windows 10.
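
The language restriction described above can be expressed as a small check. This is only an illustrative model of the rule as stated in the text, not Microsoft code; the names `SUPPORTED_LANGUAGES` and `engine_usable` are hypothetical.

```python
# Languages for which the built-in recognizer is offered, per the text above.
SUPPORTED_LANGUAGES = frozenset({
    "English", "French", "Spanish", "German",
    "Japanese", "Simplified Chinese", "Traditional Chinese",
})


def engine_usable(engine_language: str, windows_language: str) -> bool:
    """Return True only if the engine language is supported and
    matches the language version of Windows in use."""
    return (engine_language in SUPPORTED_LANGUAGES
            and engine_language == windows_language)
```

Under this model, `engine_usable("German", "English")` is false, which reflects why changing the system language (possible on Windows 7 Ultimate and Windows 8 Pro) also changes which engine is available.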

Windows 7, 8, 10, 11 third-party speech recognition

Microsoft Speech API

The first version of the Microsoft Speech API was released for Windows NT 3.51 and Windows 95 in 1995; it was then part of Windows up to Windows Vista. This initial version already contained Direct Speech Recognition and Direct Text To Speech APIs, which applications could use to control engines directly, as well as simplified 'higher-level' Voice Command and Voice Talk APIs. Speech recognition functionality was included as part of Microsoft Office and on Tablet PCs running Microsoft Windows XP Tablet PC Edition. It can also be downloaded as part of the Speech SDK 5.1 for Windows applications, but because the SDK is aimed at developers building speech applications, it lacks any user interface of its own (numerous applications were available) and is thus unsuitable for end users.
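
To give a sense of how an application drives a speech engine through the Speech API, here is a minimal text-to-speech sketch. It uses the SAPI 5 COM automation object `SAPI.SpVoice`, not the 1995 Direct APIs described above, and assumes a Windows machine with the third-party `pywin32` package installed; the wrapper name `speak` is hypothetical.

```python
from typing import Any, Optional


def speak(text: str, voice: Optional[Any] = None) -> None:
    """Speak `text` through a SAPI voice object.

    `voice` may be any object exposing a `Speak(text)` method; when omitted,
    the SAPI 5 automation voice is created via COM (Windows only).
    """
    if voice is None:
        # Assumption: pywin32 is installed and the code runs on Windows.
        import win32com.client
        voice = win32com.client.Dispatch("SAPI.SpVoice")
    voice.Speak(text)
```

On a suitable system, `speak("Hello")` would read the string aloud with the default system voice. Speech recognition through the same API involves additional recognizer and grammar objects and is not shown here.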

Built-in software

Interactive voice response

The following are interactive voice response (IVR) systems:

Unix-like x86 and x86-64 speech transcription software

Discontinued software

See also

References

  1. "Download – Weesper Neon Flow". Weesper Neon Flow. Retrieved 2025-10-29.
  2. "Web Speech API Specification". dvcs.w3.org. Archived from the original on 2016-06-21.
  3. Orlowski, Andrew. "Total recog: British AI makes universal speech breakthrough". The Register. Situation Publishing. Retrieved 17 May 2018.
  4. "Speech Recognition Software for Windows PC – Braina". www.brainasoft.com. Archived from the original on 2015-04-07.
  5. "Dynamic Faceting-List of Most 57 Speech Recognition SWs and Web Services". Archived from the original on February 13, 2019. Retrieved February 23, 2019.
  6. O'Neill, Mark (2013-11-06). "Control your PC with these 5 speech recognition programs". PC World. Archived from the original on 2014-01-01. Retrieved 2013-12-30.
  7. "Interactive Voice Response". Genesys. Archived from the original on 2016-10-14.
  8. [dead link]
  9. Lavie, A.; Waibel, A.; Levin, L.; Finke, M.; Gates, D.; Gavalda, M.; Zeppenfeld, T.; Zhan, Puming (1 April 1997). "Janus-III: speech-to-speech translation in multiple languages". 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing. Vol. 1. IEEE Xplore. pp. 99–102. CiteSeerX 10.1.1.36.6967. doi:10.1109/ICASSP.1997.599557. ISBN 978-0-8186-7919-3. S2CID 1514209.
  10. "Mozilla DeepSpeech github page, discontinued". 2025-09-08.
  11. "A TensorFlow implementation of Baidu's DeepSpeech architecture". Mozilla. 2017-12-05. Retrieved 2017-12-05.
  12. "Download – Weesper Neon Flow". Weesper Neon Flow. Retrieved 2025-10-29.
  13. "IBM - Embedded ViaVoice - Embedded ViaVoice - Software". Archived from the original on 2010-08-08. Retrieved 2010-06-29.
  14. "Nuance product support for Microsoft Windows 7". Nuance Communications, Customer Help. Retrieved 2019-03-16.
  15. "ViaVoice for Mac OS X on Intel Chipset". Nuance Communications, Customer Help. Retrieved 2019-03-16.