Screen reader

An example of someone using a screen reader showing documents that are inaccessible, readable and accessible

A screen reader is a form of assistive technology (AT) [1] that renders text and image content as speech or braille output. Screen readers are essential to people who are blind, [2] and are useful to people who are visually impaired, [2] are illiterate, or have a learning disability. [3] Screen readers are software applications that attempt to convey what people with normal eyesight see on a display to their users via non-visual means, such as text-to-speech, [4] sound icons, [5] or a braille device. [2] They do this by applying a wide variety of techniques, including interacting with dedicated accessibility APIs, using various operating system features (such as inter-process communication and querying user interface properties), and employing hooking techniques. [6]

Microsoft Windows operating systems have included the Microsoft Narrator screen reader since Windows 2000, though separate products are more popular on that operating system: Freedom Scientific's commercially available JAWS screen reader and ZoomText screen magnifier, and the free and open-source screen reader NVDA by NV Access. [7] Apple Inc.'s macOS, iOS, and tvOS include VoiceOver as a built-in screen reader, while Google's Android provides the TalkBack screen reader and its ChromeOS can use ChromeVox. [8] Similarly, Android-based devices from Amazon provide the VoiceView screen reader. There are also free and open-source screen readers for Linux and Unix-like systems, such as Speakup and Orca.

Types

Command-line (text)

In early operating systems, such as MS-DOS, which employed command-line interfaces (CLIs), the screen display consisted of characters mapping directly to a screen buffer in memory and a cursor position. Input was by keyboard. All this information could therefore be obtained from the system either by hooking the flow of information around the system and reading the screen buffer or by using a standard hardware output socket [9] and communicating the results to the user.
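The idea of reading text straight out of a screen buffer can be illustrated with a minimal sketch. It assumes an 80x25 text mode in which each cell occupies two bytes (a code page 437 character followed by a colour attribute); the buffer here is simulated in memory, and the function names are hypothetical rather than taken from any historical screen reader.

```python
# Hypothetical sketch: recover text from a character-cell screen buffer.
# Assumption: 80x25 text mode, two bytes per cell (character, attribute).

COLS, ROWS = 80, 25

def make_buffer(lines):
    """Build a simulated screen buffer from a list of strings."""
    buf = bytearray()
    for row in range(ROWS):
        text = lines[row] if row < len(lines) else ""
        text = text[:COLS].ljust(COLS)
        for ch in text:
            buf += ch.encode("cp437") + b"\x07"  # 0x07: light grey on black
    return bytes(buf)

def read_row(buf, row):
    """Return the text of one screen row, dropping the attribute bytes."""
    start = row * COLS * 2
    cells = buf[start:start + COLS * 2]
    chars = cells[0::2]  # every second byte is a character code
    return chars.decode("cp437").rstrip()

def line_at_cursor(buf, cursor):
    """Return the text of the row containing the cursor, given (row, col)."""
    row, _col = cursor
    return read_row(buf, row)

screen = make_buffer(["C:\\>dir", " Volume in drive C is DOS"])
print(line_at_cursor(screen, (1, 0)))
```

A real screen reader of that era would pass the recovered string to a speech synthesizer instead of printing it.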

In the 1980s, the Research Centre for the Education of the Visually Handicapped (RCEVH) at the University of Birmingham developed a screen reader for the BBC Micro and NEC Portable. [10] [11]

Graphical

Off-screen models

With the arrival of graphical user interfaces (GUIs), the situation became more complicated. A GUI has characters and graphics drawn on the screen at particular positions, and therefore there is no purely textual representation of the graphical contents of the display. Screen readers were therefore forced to employ new low-level techniques, gathering messages from the operating system and using these to build up an "off-screen model", a representation of the display in which the required text content is stored. [12]

For example, the operating system might send messages to draw a command button and its caption. These messages are intercepted and used to construct the off-screen model. The user can switch between the controls (such as buttons) available on the screen, and the captions and control contents are read aloud, shown on a refreshable braille display, or both.
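The button example can be sketched as a small data structure. The message type, field names, and reading-order rule below are assumptions made for illustration; real off-screen models intercept far lower-level drawing calls and track much more state.

```python
from dataclasses import dataclass, field

@dataclass
class DrawText:
    """Hypothetical intercepted message: text drawn at a screen position."""
    x: int
    y: int
    text: str
    control: str = "text"  # for example "button" or "label"

@dataclass
class OffScreenModel:
    """Stores the latest text drawn at each position."""
    items: list = field(default_factory=list)

    def on_message(self, msg):
        # A new draw at the same position replaces the old content.
        self.items = [m for m in self.items if (m.x, m.y) != (msg.x, msg.y)]
        self.items.append(msg)

    def reading_order(self):
        # Top to bottom, then left to right.
        return sorted(self.items, key=lambda m: (m.y, m.x))

    def describe(self, index):
        """Text to speak for the control at the given reading-order index."""
        item = self.reading_order()[index]
        return f"{item.text} {item.control}"

model = OffScreenModel()
model.on_message(DrawText(200, 300, "Cancel", "button"))
model.on_message(DrawText(100, 300, "OK", "button"))
model.on_message(DrawText(100, 300, "Save", "button"))  # caption redrawn
print(model.describe(0))
```

Moving between controls then amounts to stepping through `reading_order()` and speaking each description.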

Screen readers can also communicate information on menus, controls, and other visual constructs to permit blind users to interact with these constructs. However, maintaining an off-screen model is a significant technical challenge; hooking the low-level messages and maintaining an accurate model are both difficult tasks.[citation needed]

Accessibility APIs

Operating system and application designers have attempted to address these problems by providing ways for screen readers to access the display contents without having to maintain an off-screen model. These involve providing alternative, accessible representations of what is being displayed on the screen, accessed through an API. Existing APIs include Microsoft Active Accessibility (MSAA) on Windows and the accessibility APIs of Android, [13] Apple platforms, [14] and Java. [15]

Screen readers can query the operating system or application for what is currently being displayed and receive updates when the display changes. For example, a screen reader can be told that the current focus is on a button and be given the button caption to communicate to the user. This approach is considerably easier for the developers of screen readers, but it fails when applications do not comply with the accessibility API: for example, Microsoft Word does not comply with the MSAA API, so screen readers must still maintain an off-screen model for Word or find another way to access its contents.[citation needed] One approach is to use available operating system messages and application object models to supplement accessibility APIs.
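The query-and-notify pattern described here can be sketched without reference to any particular platform. The classes below are hypothetical stand-ins, not the interfaces of MSAA or any other real accessibility API; they only show how a focus-change notification carrying a role and a name might reach a screen reader.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class AccessibleNode:
    """Hypothetical object an application exposes for one UI element."""
    role: str
    name: str

class AccessibilityBus:
    """Hypothetical stand-in for a platform API: tracks focus, notifies listeners."""

    def __init__(self):
        self._focus: Optional[AccessibleNode] = None
        self._listeners: List[Callable[[AccessibleNode], None]] = []

    def subscribe_focus_changed(self, callback):
        self._listeners.append(callback)

    def get_focus(self):
        # Query path: ask what currently has focus.
        return self._focus

    def set_focus(self, node):
        # Called from the application side when focus moves.
        self._focus = node
        for callback in self._listeners:
            callback(node)

class ScreenReader:
    def __init__(self, bus):
        self.spoken = []  # stands in for speech or braille output
        bus.subscribe_focus_changed(self.on_focus)

    def on_focus(self, node):
        self.spoken.append(f"{node.name}, {node.role}")

bus = AccessibilityBus()
reader = ScreenReader(bus)
bus.set_focus(AccessibleNode(role="button", name="Save"))
print(reader.spoken)
```

The sketch also shows the failure mode mentioned above: if an application never reports its elements to the bus, the reader receives nothing to announce.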

Screen readers can be assumed to be able to access all display content that is not intrinsically inaccessible. Web browsers, word processors, icons, windows, and email programs are just some of the applications used successfully by screen reader users. However, according to some users,[who?] using a screen reader is considerably more difficult than using a GUI, and many applications have specific problems resulting from the nature of the application (e.g. animations) or failure to comply with accessibility standards for the platform (e.g. Microsoft Word and Active Accessibility).[citation needed]

Self-voicing programs and applications

Some programs and applications have voicing technology built in alongside their primary functionality. These programs are termed self-voicing and can be a form of assistive technology if they are designed to remove the need to use a screen reader.[ citation needed ]

Cloud-based

Some telephone services allow users to interact with the internet remotely. For example, TeleTender can read web pages over the phone and does not require special programs or devices on the user side.[ citation needed ]

Virtual assistants can sometimes read out written documents (textual web content, PDF documents, e-mails, etc.). The best-known examples are Apple's Siri, Google Assistant, and Amazon Alexa.

Web-based

A relatively new development in the field is web-based applications like Spoken-Web that act as web portals, managing content like news updates, weather, science and business articles for visually impaired or blind computer users.[citation needed] Other examples are ReadSpeaker and BrowseAloud, which add text-to-speech functionality to web content.[citation needed] The primary audience for such applications is those who have difficulty reading because of learning disabilities or language barriers.[citation needed] Although functionality remains limited compared to equivalent desktop applications, the major benefit is to increase the accessibility of these websites when viewed on public machines where users do not have permission to install custom software, giving people greater "freedom to roam".[citation needed]

This functionality depends on the quality of the software but also on the logical structure of the text. Headings, punctuation, and alternative text attributes for images are crucial for good vocalization. A website may also look good thanks to two-dimensional positioning with CSS, yet its standard linearization (obtained, for example, by disabling CSS and JavaScript in the browser) may not be comprehensible.[citation needed]
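To show why headings and alternative text matter, the following minimal sketch linearizes a fragment of HTML into the sequence of phrases a text-to-speech tool might receive. It uses Python's standard `html.parser` module; the announcement wording and the `Linearizer` class are assumptions for illustration and do not reproduce any particular product's behaviour.

```python
from html.parser import HTMLParser

class Linearizer(HTMLParser):
    """Flattens markup into spoken phrases in source order."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.output = []
        self._prefix = ""

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in self.HEADINGS:
            self._prefix = f"Heading level {tag[1]}: "
        elif tag == "img":
            alt = attrs.get("alt")
            if alt is None:
                # No alt attribute at all: nothing useful can be said.
                self.output.append("Image (no description)")
            elif alt:
                self.output.append(f"Image: {alt}")
            # An empty alt marks a decorative image, which is skipped.

    def handle_endtag(self, tag):
        if tag in self.HEADINGS:
            self._prefix = ""

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse whitespace
        if text:
            self.output.append(self._prefix + text)

def linearize(html):
    parser = Linearizer()
    parser.feed(html)
    parser.close()
    return parser.output

page = ('<h1>Weather</h1><p>Sunny   today.</p>'
        '<img src="a.png" alt="Sun icon"><img src="b.png">')
lines = linearize(page)
print(lines)
```

The second image has no alternative text, so the listener learns only that an image exists, which is the situation the paragraph above warns about.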

Customization

Most screen readers allow the user to select whether most punctuation is announced or silently ignored. Some screen readers can be tailored to a particular application through scripting. One advantage of scripting is that it allows customizations to be shared among users, increasing accessibility for all. JAWS enjoys an active script-sharing community, for example.[citation needed]
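A punctuation setting of this kind can be sketched as a text filter applied before speech synthesis. The level names, the set of symbols in each level, and the spoken names below are hypothetical and are not taken from any specific screen reader.

```python
# Hypothetical spoken names for a few punctuation marks.
PUNCTUATION_NAMES = {
    ".": "dot",
    ",": "comma",
    "!": "exclamation mark",
    "?": "question mark",
    ";": "semicolon",
    ":": "colon",
}

# Hypothetical levels: which marks are announced at each setting.
LEVELS = {
    "none": set(),
    "some": {"!", "?"},
    "all": set(PUNCTUATION_NAMES),
}

def prepare_for_speech(text, level="some"):
    """Replace announced punctuation with its name and drop the rest."""
    announced = LEVELS[level]
    out = []
    for ch in text:
        if ch in PUNCTUATION_NAMES:
            if ch in announced:
                out.append(f" {PUNCTUATION_NAMES[ch]} ")
            else:
                out.append(" ")  # silently ignored
        else:
            out.append(ch)
    return " ".join("".join(out).split())  # tidy up spacing

print(prepare_for_speech("Wait, what?", "all"))
```

In practice a synthesizer may still use unannounced punctuation to shape pauses and intonation; this sketch ignores that.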

Verbosity

Verbosity is a feature of screen reading software that supports vision-impaired computer users. Speech verbosity controls enable users to choose how much speech feedback they wish to hear. Specifically, verbosity settings allow users to construct a mental model of web pages displayed on their computer screen. Based on verbosity settings, a screen-reading program informs users of certain formatting changes, such as when a frame or table begins and ends, where graphics have been inserted into the text, or when a list appears in the document. The verbosity settings can also control the level of descriptiveness of elements, such as lists, tables, and regions. [16] For example, JAWS provides low, medium, and high web verbosity preset levels. The high web verbosity level provides more detail about the contents of a webpage. [17]
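As a rough illustration of verbosity presets, the sketch below announces structural elements only when the chosen level enables them. The three level names mirror the low, medium, and high presets mentioned above, but which element kinds each level includes is an assumption made for this example, not the actual JAWS configuration.

```python
# Hypothetical presets: element kinds announced at each verbosity level.
VERBOSITY = {
    "low": {"heading"},
    "medium": {"heading", "list", "table"},
    "high": {"heading", "list", "table", "graphic", "frame"},
}

def announce(elements, level):
    """Return spoken phrases for (kind, text) pairs in reading order.

    Plain text is always spoken. Other kinds are prefixed with their
    kind only when the verbosity level enables that kind.
    """
    enabled = VERBOSITY[level]
    spoken = []
    for kind, text in elements:
        if kind != "text" and kind in enabled:
            spoken.append(f"{kind}: {text}")
        else:
            spoken.append(text)
    return spoken

page = [
    ("heading", "News"),
    ("text", "Three stories today."),
    ("list", "Top stories"),
    ("graphic", "Company logo"),
]
print(announce(page, "low"))
print(announce(page, "high"))
```

At the low level the listener hears the list and graphic text without being told what kind of element it came from; the high level adds that context at the cost of more speech.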

Language

Some screen readers can read text in more than one language, provided that the language of the material is encoded in its metadata. [18]
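On the web, language metadata is commonly carried by the HTML `lang` attribute. The following sketch, which assumes that attribute is the metadata in question, splits a page into runs of text tagged with the language in effect so that each run could be sent to a matching voice. The `LanguageRuns` class is hypothetical and handles only simple, well-nested markup.

```python
from html.parser import HTMLParser

class LanguageRuns(HTMLParser):
    """Collects (language, text) runs, inheriting lang from ancestors."""

    # Elements with no closing tag; they must not affect the stack.
    VOID = {"br", "img", "hr", "input", "meta", "link"}

    def __init__(self, default_lang="en"):
        super().__init__()
        self._stack = [default_lang]
        self.runs = []

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID:
            return
        lang = dict(attrs).get("lang")
        # An element without its own lang inherits the enclosing one.
        self._stack.append(lang or self._stack[-1])

    def handle_endtag(self, tag):
        if tag not in self.VOID and len(self._stack) > 1:
            self._stack.pop()

    def handle_data(self, data):
        text = " ".join(data.split())
        if text:
            self.runs.append((self._stack[-1], text))

def language_runs(html, default_lang="en"):
    parser = LanguageRuns(default_lang)
    parser.feed(html)
    parser.close()
    return parser.runs

doc = ('<html lang="en"><body><p>Hello</p>'
       '<p lang="fr">Bonjour <b>tout le monde</b></p>'
       '<p>Bye</p></body></html>')
runs = language_runs(doc)
print(runs)
```

If the markup omits the `lang` attribute, every run falls back to the default language, which is why the encoding of language in the material matters.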

Screen reading programs like JAWS, NVDA, and VoiceOver also include language verbosity, which automatically adjusts speech output settings to the language of the content. For example, if a user navigated to a website based in the United Kingdom, the text would be read with an English accent.[citation needed]

References

  1. "Types of Assistive Technology Products". Microsoft Accessibility. Retrieved June 13, 2016.
  2. "Screen reading technology". AFB. Retrieved February 23, 2022.
  3. "Screen Readers and how they work with E-Learning". Virginia.gov. Archived from the original on November 13, 2018. Retrieved March 31, 2019.
  4. "Hear text read aloud with Narrator". Microsoft. Retrieved June 13, 2016.
  5. Coyier, Chris (October 29, 2007). "Accessibility Basics: How Does Your Page Look To A Screen Reader?". CSS-Tricks. Retrieved June 13, 2016.
  6. "What is a Screen Reader". Nomensa. Retrieved July 9, 2017.
  7. "Screen Reader User Survey #9". WebAIM. Retrieved July 1, 2021.
  8. "ChromeVox". Google. Retrieved March 9, 2020.
  9. "Talking Terminals. BYTE, September 1982". Archived from the original on June 25, 2006. Retrieved September 7, 2006.
  10. Paul Blenkhorn, "The RCEVH project on micro-computer systems and computer assisted learning", British Journal of Visual Impairment, 4/3, 101-103 (1986). Free HTML version at Visugate.
  11. "Access to personal computers using speech synthesis. RNIB New Beacon No.76, May 1992". March 3, 2014.
  12. According to "Making the GUI Talk" (by Richard Schwerdtfeger, BYTE December 1991, p. 118-128), the first screen reader to build an off-screen model was outSPOKEN.
  13. Implementing Accessibility on Android.
  14. Apple Accessibility API.
  15. "Oracle Technology Network for Java Developers – Oracle Technology Network – Oracle".
  16. Zong, Jonathan; Lee, Crystal; Lundgard, Alan; Jang, JiWoong; Hajas, Daniel; Satyanarayan, Arvind (2022). "Rich Screen Reader Experiences for Accessible Data Visualization". Computer Graphics Forum. 41 (3): 15–27. arXiv:2205.04917. doi:10.1111/cgf.14519. ISSN 0167-7055. S2CID 248665696.
  17. "JAWS Web Verbosity". www.freedomscientific.com. Retrieved November 6, 2022.
  18. Chris Heilmann (March 13, 2008). "Yahoo! search results now with natural language support". Yahoo! Developer Network Blog. Archived from the original on January 25, 2009. Retrieved February 28, 2015.