IBM ViaVoice

ViaVoice
Developer(s) IBM
Initial release August 1997 [1] [2] [3] [4] [5] [6] [7]
Stable release 10.5 / 2005
Operating system Microsoft Windows, macOS
Type Voice recognition
License Proprietary
Website IBM ViaVoice website

IBM ViaVoice was a range of language-specific continuous speech recognition software products offered by IBM. The current version is designed primarily for use in embedded devices. The latest stable version of IBM ViaVoice was 9.0, and it could transfer text directly into Word.

The most important step for correct use of this software is the so-called 'quick training' and 'enrollment': the user reads many specific words and sentences so that the software can adapt to that user's voice and intonation. Enrollment lasts an hour or more and can be split into several sessions. Users could improve decoding accuracy by reading prepared texts of a few hundred sentences; the recorded data was used to tune the acoustic model to that specific user. In addition, user-specific text files could be parsed to tune the language model. Correction of misrecognised words was also used to improve subsequent decoding accuracy.
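The idea of tuning a language model from user-specific text can be illustrated with a minimal sketch. This is not ViaVoice's actual method, which the source does not describe; it assumes a simple unigram model and linear interpolation, and the names `tokenize`, `adapt_unigram_model`, and `weight` are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def adapt_unigram_model(base_probs, user_text, weight=0.3):
    """Blend a base unigram model with word frequencies from the user's text.

    base_probs: dict mapping word -> probability (assumed to sum to 1).
    weight: share of probability mass given to the user's own word usage
            (an illustrative assumption, not a documented ViaVoice setting).
    """
    counts = Counter(tokenize(user_text))
    total = sum(counts.values())
    if total == 0:
        # No usable user text: keep the base model unchanged.
        return dict(base_probs)

    adapted = {}
    for word in set(base_probs) | set(counts):
        user_p = counts[word] / total
        adapted[word] = (1 - weight) * base_probs.get(word, 0.0) + weight * user_p
    return adapted


if __name__ == "__main__":
    base = {"the": 0.5, "patient": 0.1, "cat": 0.4}
    adapted = adapt_unigram_model(base, "the patient reviewed the chart")
    for word, prob in sorted(adapted.items(), key=lambda kv: -kv[1]):
        print(f"{word}: {prob:.2f}")
```

Words that appear only in the user's text, such as "chart" here, receive non-zero probability after adaptation, which is the effect that parsing user text files is meant to achieve.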

Editions

Individual language editions may have different features, specifications, technical support, and microphone support. Some of the products or editions available are:

IBM ViaVoice 98 was available in Home, Office and Executive editions in the following languages:

Chinese, French, German, Italian, Japanese, Spanish, UK English, US English. The Executive Edition allows users to dictate into most Windows applications and control them by voice.

Although designed for Windows 95, 98 and NT 4.0, it also works well on Windows 7.

The Executive package includes:

History

Prior to the development of ViaVoice, IBM launched a product in 1993 named the IBM Personal Dictation System (later renamed VoiceType), [8] which ran on Windows, AIX, and OS/2. [9] ViaVoice was first introduced to the general public in 1997. Two years later, in 1999, IBM released a free-of-charge version of ViaVoice. [10]

In 2003, IBM awarded ScanSoft, which owned the competing product Dragon NaturallySpeaking, exclusive global distribution rights to ViaVoice Desktop products for Windows and Mac OS X. [11] Two years later, Nuance merged with ScanSoft. [12]

References

  1. Scannell, Ed (16 June 1997). "Client: IBM dictation software package gives computers a voice". InfoWorld. p. 36. "At the same announcement last week, IBM also unveiled ViaVoice, a general-purpose continuous-speech dictation product [...] ViaVoice is expected to carry a suggested retail price of $199, and it will ship by the end of August"
  2. "Intranet Applications: Briefs". Network World. 25 August 1997. p. 21. "IBM's release last week of its ViaVoice speech recognition product [...]"
  3. Horowitz, Alan S. (27 October 1997). "Can we talk?". Computerworld. p. 80. "[...] and ViaVoice from IBM, which hit the market in September"
  4. Garza, Victor R. (8 September 1997). "Product Reviews: Continuous speech-recognition software: NaturallySpeaking edges out ViaVoice with hands-free editing". InfoWorld. p. 116.
  5. "tap, tap, tap talk, talk, talk Free your hands. And your mind will follow" (Advertisement). PC Magazine. 23 September 1997. p. 20.
  6. Haskin, David (21 October 1997). "First Looks: Voice Recognition Software: Talking to Your PC". PC Magazine. p. 64.
  7. Bühler, Christian; Knops, Harry, eds. (December 1997). Assistive Technology on the Threshold of the New Millennium. 4.3. p. 290. ISBN 9781586030018. "IBM : Speech Application Programming Interface Reference, IBM ViaVoice * Developer Tools"
  8. "Pioneering Speech Recognition". ibm.com. Retrieved 2020-09-06.
  9. "IBM VoiceType Dictation for OS/2 Version 1.1" (PDF). ardent-tool.com. 1994-11-08. Retrieved 2020-09-06.
  10. Micahel, Alex. "Text to speech". Retrieved 18 May 2023.
  11. "ScanSoft and IBM to Expand Server, Embedded and Desktop Speech Offerings". April 1, 2003.
  12. "IBM Desktop ViaVoice". IBM. 2003. Retrieved 29 November 2017.