A screen reader is a form of assistive technology (AT) [1] that renders text and image content as speech or braille output. Screen readers are essential to people who are blind, [2] and are useful to people who are visually impaired, [2] illiterate, or who have a learning disability. [3] Screen readers are software applications that attempt to convey to their users, through non-visual means, what people with normal eyesight see on a display; these means include text-to-speech, [4] sound icons, [5] and braille devices. [2] They do this by applying a wide variety of techniques, including interacting with dedicated accessibility APIs, using various operating system features (such as inter-process communication and querying user interface properties), and employing hooking techniques. [6]
Microsoft Windows operating systems have included the Microsoft Narrator screen reader since Windows 2000, though separate products such as Freedom Scientific's commercially available JAWS screen reader and ZoomText screen magnifier, and the free and open source screen reader NVDA by NV Access, are more popular for that operating system. [7] Apple Inc.'s macOS, iOS, and tvOS include VoiceOver as a built-in screen reader, while Google's Android provides the TalkBack screen reader and its ChromeOS can use ChromeVox. [8] Similarly, Android-based devices from Amazon provide the VoiceView screen reader. There are also free and open source screen readers for Linux and Unix-like systems, such as Speakup and Orca.
Around 1978, Al Overby of IBM Raleigh developed a prototype of a talking terminal, known as SAID (for Synthetic Audio Interface Driver), for the IBM 3270 terminal. [9] SAID read the ASCII values of the display in a stream and spoke them through a vocal tract synthesizer the size of a suitcase, and it cost around $10,000. [10] Dr. Jesse Wright, a blind research mathematician, and Jim Thatcher, formerly his graduate student at the University of Michigan, both working as mathematicians for IBM, adapted this as an internal IBM tool for use by blind people. After the early IBM Personal Computer (PC) was released in 1981, Thatcher and Wright developed a software equivalent of SAID, called PC-SAID, or Personal Computer Synthetic Audio Interface Driver. This was renamed and released in 1984 as IBM Screen Reader, which became the proprietary eponym for that general class of assistive technology. [10]
In early operating systems, such as MS-DOS, which employed command-line interfaces (CLIs), the screen display consisted of characters mapping directly to a screen buffer in memory and a cursor position. Input was by keyboard. All this information could therefore be obtained from the system either by hooking the flow of information around the system and reading the screen buffer or by using a standard hardware output socket [11] and communicating the results to the user.
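The same buffer-reading principle can be illustrated with the modern Windows Console API. The following C++ sketch is a minimal illustration rather than production screen-reader code: it copies one row of characters out of the console screen buffer, where a real screen reader would then pass that text on to a speech synthesizer or braille display.

```cpp
// Sketch: reading text directly from a console screen buffer, the same
// principle DOS-era screen readers applied to video memory.
#include <windows.h>
#include <string>
#include <vector>

// Read one row of the active console screen buffer as plain text.
std::string ReadConsoleRow(SHORT row) {
    HANDLE console = GetStdHandle(STD_OUTPUT_HANDLE);

    CONSOLE_SCREEN_BUFFER_INFO info;
    if (!GetConsoleScreenBufferInfo(console, &info)) return {};

    std::vector<char> chars(info.dwSize.X);  // one cell per column
    DWORD charsRead = 0;
    COORD origin = {0, row};                 // column 0 of the requested row

    // Copies the characters (without color attributes) out of the buffer.
    if (!ReadConsoleOutputCharacterA(console, chars.data(),
                                     static_cast<DWORD>(chars.size()),
                                     origin, &charsRead)) {
        return {};
    }
    return std::string(chars.data(), charsRead);
}
```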
In the 1980s, the Research Centre for the Education of the Visually Handicapped (RCEVH) at the University of Birmingham developed a Screen Reader for the BBC Micro and NEC Portable. [12] [13]
With the arrival of graphical user interfaces (GUIs), the situation became more complicated. A GUI has characters and graphics drawn on the screen at particular positions, and therefore there is no purely textual representation of the graphical contents of the display. Screen readers were therefore forced to employ new low-level techniques, gathering messages from the operating system and using these to build up an "off-screen model", a representation of the display in which the required text content is stored. [14]
For example, the operating system might send messages to draw a command button and its caption. These messages are intercepted and used to construct the off-screen model. The user can switch between controls (such as buttons) available on the screen and the captions and control contents will be read aloud and/or shown on a refreshable braille display.
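A minimal C++ sketch of such an off-screen model follows. The structure and the OnDrawText entry point are hypothetical simplifications: a real screen reader would populate the model from low-level hooks on the operating system's drawing routines and would track far more state, such as fonts, colors, and invalidated regions.

```cpp
// Simplified sketch of an off-screen model: a store of text runs captured
// from intercepted drawing calls.
#include <string>
#include <vector>

struct TextRun {
    int x, y;           // screen position the text was drawn at
    std::wstring text;  // the caption or content that was drawn
};

class OffScreenModel {
public:
    // Hypothetical hook callback, invoked whenever a draw message
    // for text is intercepted.
    void OnDrawText(int x, int y, const std::wstring& text) {
        runs_.push_back({x, y, text});
    }

    // Answer queries such as "what text was drawn at the focus point?"
    std::wstring TextAt(int x, int y) const {
        for (const TextRun& run : runs_) {
            if (run.x == x && run.y == y) return run.text;
        }
        return L"";
    }

private:
    std::vector<TextRun> runs_;
};
```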
Screen readers can also communicate information on menus, controls, and other visual constructs to permit blind users to interact with these constructs. However, maintaining an off-screen model is a significant technical challenge; hooking the low-level messages and maintaining an accurate model are both difficult tasks.[ citation needed ]
Operating system and application designers have attempted to address these problems by providing ways for screen readers to access the display contents without having to maintain an off-screen model. These involve the provision of alternative, accessible representations of what is being displayed on the screen, accessed through an API. Existing APIs include Microsoft Active Accessibility (MSAA), Microsoft UI Automation, IAccessible2, the Apple Accessibility API, AT-SPI on Linux and other Unix-like systems, and the Java Access Bridge.
Screen readers can query the operating system or application for what is currently being displayed and receive updates when the display changes. For example, a screen reader can be told that the current focus is on a button, and what the button caption is, so that this can be communicated to the user. This approach is considerably easier for the developers of screen readers, but fails when applications do not comply with the accessibility API: for example, Microsoft Word does not comply with the MSAA API, so screen readers must still maintain an off-screen model for Word or find another way to access its contents.[ citation needed ] One approach is to use available operating system messages and application object models to supplement accessibility APIs.
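As an illustration of the query approach, the following C++ sketch asks Microsoft Active Accessibility for the accessible name of the foreground window's client area, roughly what a screen reader does before announcing the control under focus. It is a minimal example with no error reporting beyond skipping failed calls.

```cpp
// Sketch: querying MSAA for the accessible name of the foreground window.
#include <windows.h>
#include <oleacc.h>
#include <iostream>
#pragma comment(lib, "oleacc.lib")

int main() {
    CoInitialize(nullptr);

    IAccessible* accessible = nullptr;
    HRESULT hr = AccessibleObjectFromWindow(
        GetForegroundWindow(), OBJID_CLIENT, IID_IAccessible,
        reinterpret_cast<void**>(&accessible));

    if (SUCCEEDED(hr) && accessible) {
        VARIANT self;
        self.vt = VT_I4;
        self.lVal = CHILDID_SELF;  // ask about the object itself, not a child

        BSTR name = nullptr;
        if (SUCCEEDED(accessible->get_accName(self, &name)) && name) {
            std::wcout << L"Accessible name: " << name << std::endl;
            SysFreeString(name);
        }
        accessible->Release();
    }
    CoUninitialize();
    return 0;
}
```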
Screen readers can be assumed to be able to access all display content that is not intrinsically inaccessible. Web browsers, word processors, icons and windows, and email programs are just some of the applications used successfully by screen reader users. However, according to some users,[ who? ] using a screen reader is considerably more difficult than using a GUI visually, and many applications have specific problems resulting from the nature of the application (e.g. animations) or from failure to comply with accessibility standards for the platform (e.g. Microsoft Word and Active Accessibility).[ citation needed ]
Some programs and applications have voicing technology built in alongside their primary functionality. These programs are termed self-voicing and can be a form of assistive technology if they are designed to remove the need to use a screen reader.[ citation needed ]
Some telephone services allow users to interact with the internet remotely. For example, TeleTender can read web pages over the phone and does not require special programs or devices on the user side.[ citation needed ]
Virtual assistants can sometimes read out written documents (textual web content, PDF documents, e-mails, etc.). The best-known examples are Apple's Siri, Google Assistant, and Amazon Alexa.
A relatively new development in the field is web-based applications, such as Spoken-Web, that act as web portals, managing content like news updates, weather, and science and business articles for blind or visually impaired computer users.[ citation needed ] Other examples are ReadSpeaker and BrowseAloud, which add text-to-speech functionality to web content.[ citation needed ] The primary audience for such applications is people who have difficulty reading because of learning disabilities or language barriers.[ citation needed ] Although functionality remains limited compared to equivalent desktop applications, the major benefit is increased accessibility of such websites when viewed on public machines where users do not have permission to install custom software, giving people greater "freedom to roam".[ citation needed ]
This functionality depends not only on the quality of the software but also on the logical structure of the text. Use of headings, punctuation, alternative text for images, and so on is crucial for good vocalization. A website may also have an attractive look because of appropriate two-dimensional positioning with CSS, yet its standard linearization, for example with CSS and JavaScript suppressed in the browser, may not be comprehensible.[ citation needed ]
Most screen readers allow the user to select whether most punctuation is announced or silently ignored. Some screen readers can be tailored to a particular application through scripting. One advantage of scripting is that it allows customizations to be shared among users, increasing accessibility for all. JAWS enjoys an active script-sharing community, for example.[ citation needed ]
Verbosity is a feature of screen reading software that supports vision-impaired computer users. Speech verbosity controls enable users to choose how much speech feedback they wish to hear. Specifically, verbosity settings allow users to construct a mental model of web pages displayed on their computer screen. Based on verbosity settings, a screen-reading program informs users of certain formatting changes, such as when a frame or table begins and ends, where graphics have been inserted into the text, or when a list appears in the document. The verbosity settings can also control the level of descriptiveness of elements, such as lists, tables, and regions. [18] For example, JAWS provides low, medium, and high web verbosity preset levels. The high web verbosity level provides more detail about the contents of a webpage. [19]
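The gating logic behind such presets can be sketched as follows; the event names and the three-level scheme here are invented for illustration and do not mirror JAWS or any other product's internals.

```cpp
// Hypothetical illustration of verbosity presets: each level decides which
// structural events get spoken alongside the text content.
#include <iostream>
#include <string>

enum class Verbosity { Low, Medium, High };

// Decide whether a given structural event should be announced.
bool ShouldAnnounce(Verbosity level, const std::string& event) {
    if (level == Verbosity::High) return true;  // announce every event
    if (level == Verbosity::Medium)
        return event == "table-start" || event == "list-start";
    return false;                               // Low: content only
}

int main() {
    Verbosity level = Verbosity::Medium;
    for (const std::string& event : {"table-start", "graphic", "list-start"}) {
        std::cout << event << ": "
                  << (ShouldAnnounce(level, event) ? "spoken" : "skipped")
                  << '\n';
    }
}
```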
Some screen readers can read text in more than one language, provided that the language of the material is encoded in its metadata. [20]
Screen reading programs like JAWS, NVDA, and VoiceOver also include language verbosity, which automatically detects the language of the material and adjusts the speech output accordingly. For example, if a user navigated to a website based in the United Kingdom, the text would be read with a British English accent.[ citation needed ]
A personal digital assistant (PDA) is a multi-purpose mobile device which functions as a personal information manager. Following a boom in the 1990s and 2000s, PDAs were largely displaced in the late 2000s by the widespread adoption of more capable smartphones, in particular those based on iOS and Android.
A refreshable braille display or braille terminal is an electro-mechanical device for displaying braille characters, usually by means of round-tipped pins raised through holes in a flat surface. Visually impaired computer users who cannot use a standard computer monitor can use it to read text output. Deafblind computer users may also use refreshable braille displays.
Scroll Lock is a lock key on most IBM-compatible computer keyboards. Depending on the operating system, it may be used for different purposes, and applications may assign functions to the key or change their behavior depending on its toggling state. The key is not frequently used, and therefore some reduced or specialized keyboards lack Scroll Lock altogether.
A computer terminal is an electronic or electromechanical hardware device that can be used for entering data into, and transcribing data from, a computer or a computing system. Most early computers only had a front panel to input or display bits and had to be connected to a terminal to print or input text through a keyboard. Teleprinters were used as early-day hard-copy terminals and predated the use of a computer screen by decades. The computer would typically transmit a line of data which would be printed on paper, and accept a line of data from a keyboard over a serial or other interface. Starting in the mid-1970s with microcomputers such as the Sphere 1, Sol-20, and Apple I, display circuitry and keyboards began to be integrated into personal and workstation computer systems, with the computer handling character generation and outputting to a CRT display such as a computer monitor or, sometimes, a consumer TV, but most larger computers continued to require terminals.
A function key is a key on a computer or terminal keyboard that can be programmed to cause the operating system or an application program to perform certain actions, a form of soft key. On some keyboards/computers, function keys may have default actions, accessible on power-on.
Computer accessibility refers to the accessibility of a computer system to all people, regardless of disability type or severity of impairment. The term accessibility is most often used in reference to specialized hardware or software, or a combination of both, designed to enable the use of a computer by a person with a disability or impairment.
A screen magnifier is software that interfaces with a computer's graphical output to present enlarged screen content. By enlarging part of a screen, people with visual impairments can better see words and images. This type of assistive technology is useful for people with some functional vision; people with visual impairments and little or no functional vision usually use a screen reader.
An input method is an operating system component or program that enables users to generate characters not natively available on their input devices by using sequences of characters that are available to them. Using an input method is usually necessary for languages that have more graphemes than there are keys on the keyboard.
In computer science, message queues and mailboxes are software-engineering components typically used for inter-process communication (IPC), or for inter-thread communication within the same process. They use a queue for messaging – the passing of control or of content. Group communication systems provide similar kinds of functionality.
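A minimal C++ sketch of the pattern, reduced to inter-thread communication within one process, follows: a producer posts a message and a consumer blocks until it arrives. Real IPC queues add process boundaries, persistence, and richer message types.

```cpp
// Minimal message queue: producer posts, consumer waits.
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <queue>
#include <string>
#include <thread>

class MessageQueue {
public:
    void Post(std::string message) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(message));
        }
        ready_.notify_one();  // wake a waiting consumer
    }

    std::string Wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !queue_.empty(); });
        std::string message = std::move(queue_.front());
        queue_.pop();
        return message;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::string> queue_;
};

int main() {
    MessageQueue queue;
    std::thread producer([&] { queue.Post("focus changed: OK button"); });
    std::cout << queue.Wait() << '\n';  // blocks until the message arrives
    producer.join();
}
```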
In computing, a text-based user interface (TUI) is a retronym describing a type of user interface (UI) that was common as an early form of human–computer interaction, before the advent of bitmapped displays and modern conventional graphical user interfaces (GUIs). Like modern GUIs, they can use the entire screen area and may accept mouse and other inputs. They may also use color and often structure the display using box-drawing characters such as ┌ and ╣. The modern context of use is usually a terminal emulator.
Windows Console is the infrastructure for console applications in Microsoft Windows. An instance of a Windows Console has a screen buffer and an input buffer. It allows console apps to run inside a window or in hardware text mode. The user can switch between the two using the Alt+↵ Enter key combination. The text mode is unavailable in Windows Vista and later. Starting with Windows 10, however, a native full-screen mode is available.
Common User Access (CUA) is a standard for user interfaces to operating systems and computer programs. It was developed by IBM and first published in 1987 as part of their Systems Application Architecture. Used originally in the MVS/ESA, VM/CMS, OS/400, OS/2 and Microsoft Windows operating systems, parts of the CUA standard are now implemented in programs for other operating systems, including variants of Unix. It is also used by Java AWT and Swing.
Job Access With Speech (JAWS) is a computer screen reader program for Microsoft Windows that allows blind and visually impaired users to read the screen either with a text-to-speech output or by a refreshable Braille display. JAWS is produced by the Blind and Low Vision Group of Freedom Scientific.
VoiceOver is a screen reader built into Apple Inc.'s macOS, iOS, tvOS, watchOS, and iPod operating systems. By using VoiceOver, the user can access their Macintosh or iOS device based on spoken descriptions and, in the case of the Mac, the keyboard. The feature is designed to increase accessibility for blind and low-vision users, as well as for users with dyslexia.
Microsoft Active Accessibility (MSAA) is an application programming interface (API) for user interface accessibility. MSAA was introduced as a platform add-on to Microsoft Windows 95 in 1997. MSAA is designed to help Assistive Technology (AT) products interact with standard and custom user interface (UI) elements of an application, as well as to access, identify, and manipulate an application's UI elements. AT products work with MSAA-enabled applications in order to provide better access for individuals who have physical or cognitive difficulties, impairments, or disabilities. Some examples of AT products are screen readers for users with limited sight, on-screen keyboards for users with limited physical access, or narrators for users with limited hearing. MSAA can also be used by automated testing tools and computer-based training applications.
The Speech Application Programming Interface or SAPI is an API developed by Microsoft to allow the use of speech recognition and speech synthesis within Windows applications. To date, a number of versions of the API have been released, which have shipped either as part of a Speech SDK or as part of the Windows OS itself. Applications that use SAPI include Microsoft Office, Microsoft Agent and Microsoft Speech Server.
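A minimal C++ sketch of speech synthesis through SAPI's ISpVoice interface follows; it speaks a single fixed string and omits the error handling and voice selection a real application would need.

```cpp
// Sketch: speaking a string through SAPI's ISpVoice COM interface.
#include <windows.h>
#include <sapi.h>

int main() {
    CoInitialize(nullptr);

    ISpVoice* voice = nullptr;
    HRESULT hr = CoCreateInstance(CLSID_SpVoice, nullptr, CLSCTX_ALL,
                                  IID_ISpVoice,
                                  reinterpret_cast<void**>(&voice));
    if (SUCCEEDED(hr)) {
        // Synchronous call: returns after the utterance finishes.
        voice->Speak(L"Screen reader test.", 0, nullptr);
        voice->Release();
    }
    CoUninitialize();
    return 0;
}
```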
NonVisual Desktop Access (NVDA) is a free and open-source, portable screen reader for Microsoft Windows. The project was started by Michael Curran in 2006.
Braille technology is assistive technology which allows blind or visually impaired people to read, write, or manipulate braille electronically. This technology allows users to do common tasks such as writing, browsing the Internet, typing in braille and printing in text, engaging in chat, downloading files and music, using electronic mail, burning music, and reading documents. It also allows blind or visually impaired students to complete all assignments in school like the rest of their sighted classmates, and allows them to take courses online. It enables professionals to do their jobs and teachers to lecture using hardware and software applications. The advances in braille technology are meaningful because blind people can access more texts, books, and libraries, and it also facilitates the printing of braille texts.
A command-line interface (CLI) is a means of interacting with a computer program by inputting lines of text called command-lines. Command-line interfaces emerged in the mid-1960s, on computer terminals, as an interactive and more user-friendly alternative to the non-interactive interface available with punched cards.
Data scraping is a technique where a computer program extracts data from human-readable output coming from another program.