Self-voicing

Last updated December 30, 2024

A self-voicing application is an application that provides an aural interface without requiring a separate screen reader. Self-voicing applications can be an important form of assistive technology, useful to those who have difficulty reading or seeing.

Talking web browsers are a prominent group of self-voicing applications. Traditionally, they have been purpose-built, as was the case with:

pwWebSpeak, originally developed by The Productivity Works in Princeton, New Jersey (now obsolete)^[citation needed]
Simply Web (also now obsolete)^[citation needed]
Home Page Reader (HPR) from IBM (discontinued)^[citation needed]
Connect Outloud from Freedom Scientific^[1]
WebAnywhere from the University of Washington^[2]

A more recent trend has been the addition of self-voicing capabilities to mainstream web browsers through free add-ons. In 2004, Opera Software created a self-voicing and speech-recognition extension for the Windows version of its web browser.^[3] In 2005, Charles L. Chen devised Fire Vox, an extension that adds speech capabilities to the Mozilla Firefox web browser on Mac, Windows, or Linux.^[4]

A second important category consists of broader self-voicing applications that function as what T. V. Raman calls "complete audio desktops",^[5] with editing, browsing, and even gaming capabilities. These include Raman's own Emacspeak enhancement for Emacs.

References

[1] "Freedom Scientific Connect Outloud".
[2] "Archived copy". Archived from the original on 2016-05-23. Retrieved 2016-01-15.
[3] Opera Sings with IBM's Speech Technology: New version of Opera Embeds ViaVoice from IBM (Opera press release, 23 March 2004). Accessed 2007-02-03.
[4] Charles L. Chen, About Fire Vox. Accessed 2007-02-03.
[5] T. V. Raman, Emacspeak - The Complete Audio Desktop. Accessed 2007-02-03.
