Reading machine

A reading machine is a piece of assistive technology that allows blind people to access printed materials. It scans text, converts the image into text by means of optical character recognition and uses a speech synthesizer to read out what it has found.

Development

The first prototype of a reading machine, called the optophone, was developed by Dr. Edmund Edward Fournier d'Albe of Birmingham University in 1913. Five vertically aligned photodetectors were used to scan a line of printed text. Each cell generated a different tone (G, C, D, E, G8) when detecting black print, so that each character was associated with a specific time-varying chord of tones. With some practice, blind users were able to interpret this audio output as a meaningful message. However, the reading speed of this device was very slow (approximately one word per minute). [1] [2]
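The scanning principle described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not a reconstruction of d'Albe's device: the tone labels come from the text, but the assignment of tones to detector rows, the function name `optophone_chords`, and the sample bitmap are all hypothetical.

```python
# Tone labels as listed in the text; which detector row maps to which
# tone is an assumption made for this illustration.
TONES = ("G", "C", "D", "E", "G8")


def optophone_chords(glyph):
    """Return the chord (tuple of tone labels) sounded for each column.

    `glyph` is a list of five equal-length strings, one per photodetector
    row, where '#' marks black print and any other character marks paper.
    Columns are scanned left to right, as a line of text would be.
    """
    if len(glyph) != len(TONES):
        raise ValueError("expected one bitmap row per photodetector")
    width = len(glyph[0])
    if any(len(row) != width for row in glyph):
        raise ValueError("all rows must have the same width")

    chords = []
    for col in range(width):
        # A detector sounds its tone only when it sees black print.
        chord = tuple(
            tone for tone, row in zip(TONES, glyph) if row[col] == "#"
        )
        chords.append(chord)
    return chords


# Hypothetical 5-row bitmap of a capital "L".
LETTER_L = [
    "#..",
    "#..",
    "#..",
    "#..",
    "###",
]

for chord in optophone_chords(LETTER_L):
    print(chord)
```

In this sketch the vertical stroke of the "L" sounds all five tones at once, and the horizontal stroke then sounds a single tone for two columns. A listener would have to learn to recognize such a time-varying pattern as a letter.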

From 1944 until the 1970s, new prototypes of the reading machine were developed at Haskins Laboratories under contract from the Veterans Administration. The research project was conducted by Caryl Parker Haskins, Franklin S. Cooper and Alvin Liberman. Their first attempts to improve the optophone all ended in failure, [2] and users were still unable to read more than 5 words per minute on average, even after long training sessions. [1] This observation led Liberman to suppose that the limitation was cognitive rather than technical, and to formulate his motor theory of speech perception. He realized that the speech signal was not heard like an acoustic "alphabet" or "cipher," but as a "code" of overlapping speech gestures, due to coarticulation. Therefore, a reading machine cannot simply convert printed characters into a series of abstract sounds; rather, it must identify the characters and produce speech as output using a speech synthesizer.

The first commercial reading machine for the blind was developed by Kurzweil Computer Products (later acquired by Xerox Corporation) in 1975. Walter Cronkite used this machine to give his signature sign-off, "And that's the way it is, January 13, 1976." [3]

In the mid-1960s, Francis F. Lee joined Dr. Samuel Jefferson Mason's Cognitive Information Processing Group in the Research Laboratory of Electronics at the Massachusetts Institute of Technology to work on a reading machine for the blind, the first system that would scan text and produce continuous speech. [4] Early reading machines were large, desk-based devices found in libraries, schools, and hospitals or owned by wealthy individuals. As of 2009, a cellphone running Kurzweil-National Federation of the Blind software could work as a reading machine. [5]

Related Research Articles

<span class="mw-page-title-main">Refreshable braille display</span> Device for displaying braille characters

A refreshable braille display or braille terminal is an electro-mechanical device for displaying braille characters, usually by means of round-tipped pins raised through holes in a flat surface. Visually impaired computer users who cannot use a standard computer monitor can use it to read text output. Deafblind computer users may also use refreshable braille displays.

<span class="mw-page-title-main">Ray Kurzweil</span> American author, inventor and futurist (born 1948)

Raymond Kurzweil is an American computer scientist, author, inventor, and futurist. He is involved in fields such as optical character recognition (OCR), text-to-speech synthesis, speech recognition technology, and electronic keyboard instruments. He has written books on health, artificial intelligence (AI), transhumanism, the technological singularity, and futurism. Kurzweil is a public advocate for the futurist and transhumanist movements and gives public talks to share his optimistic outlook on life extension technologies and the future of nanotechnology, robotics, and biotechnology.

Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech synthesizer, and can be implemented in software or hardware products. A text-to-speech (TTS) system converts normal language text into speech; other systems render symbolic linguistic representations like phonetic transcriptions into speech. The reverse process is speech recognition.

<span class="mw-page-title-main">Optical character recognition</span> Computer recognition of visual text

Optical character recognition (OCR) converts images of printed text into machine-readable text. OCR software can turn scanned documents into editable, shareable, searchable files such as PDFs. For example, when you scan a receipt, the scan is saved as a picture on your computer. The words in the image cannot be searched, edited, or counted, but OCR can convert the image to a text document with the content stored as text. OCR software can extract data from scanned documents, camera photos, and image-only PDFs. It makes static material editable and reduces the need for manual data entry.

<span class="mw-page-title-main">Kurzweil Music Systems</span> American electronic musical instrument manufacturer

Kurzweil Music Systems is an American company that produces electronic musical instruments. It was founded in 1982 by Stevie Wonder (musician), Ray Kurzweil (innovator) and Bruce Cichowlas.

The Optacon is an electromechanical device that enables blind people to read printed material that has not been transcribed into Braille. The device consists of two parts: a scanner which the user runs over the material to be read, and a finger pad which translates the words into vibrations felt on the finger tips. The Optacon was conceived by John Linvill, a professor of Electrical Engineering at Stanford University, and developed with researchers at Stanford Research Institute. Telesensory Systems manufactured the device from 1971 until it was discontinued in 1996. Although effective once mastered, it was expensive and took many hours of training to reach competency. In 2005, TSI suddenly shut down. Employees were "walked out" of the building and lost accrued vacation time, medical insurance, and all benefits. Customers could not buy new machines or get existing machines fixed. Some work was done by other companies but no device with the versatility of the Optacon had been developed as of 2007. Many blind people continue to use their Optacons to this day. The Optacon offers capabilities that no other device offers including the ability to see a printed page or computer screen as it truly appears including drawings, typefaces, and specialized text layouts.

<span class="mw-page-title-main">Votrax</span>

Votrax International, Inc., or just Votrax, was a speech synthesis company located in the Detroit, Michigan area from 1971 to 1996. It began as a division of Federal Screw Works from 1971 to 1973. In 1974, it was given the Votrax name and moved to Troy, Michigan and, in 1980, split off of its parent company entirely and became Votrax International, Inc., which produced speech products up until 1984.

The K-NFB Reader is a handheld electronic reading device for the blind. It was developed in a partnership between Ray Kurzweil and the National Federation of the Blind.

<span class="mw-page-title-main">Haskins Laboratories</span>

Haskins Laboratories, Inc. is an independent 501(c) non-profit corporation, founded in 1935 and located in New Haven, Connecticut, since 1970. Haskins has formal affiliation agreements with both Yale University and the University of Connecticut; it remains fully independent, administratively and financially, of both Yale and UConn. Haskins is a multidisciplinary and international community of researchers that conducts basic research on spoken and written language. A guiding perspective of their research is to view speech and language as emerging from biological processes, including those of adaptation, response to stimuli, and conspecific interaction. Haskins Laboratories has a long history of technological and theoretical innovation, from creating systems of rules for speech synthesis and development of an early working prototype of a reading machine for the blind to developing the landmark concept of phonemic awareness as the critical preparation for learning to read an alphabetic writing system.

Alvin Meyer Liberman was an American psychologist, born in St. Joseph, Missouri. His ideas set the agenda for fifty years of psychological research in speech perception.

The pattern playback is an early talking device that was built by Dr. Franklin S. Cooper and his colleagues, including John M. Borst and Caryl Haskins, at Haskins Laboratories in the late 1940s and completed in 1950. There were several different versions of this hardware device. Only one currently survives. The machine converts pictures of the acoustic patterns of speech in the form of a spectrogram back into sound. Using this device, Alvin Liberman, Frank Cooper, and Pierre Delattre were able to discover acoustic cues for the perception of phonetic segments. This research was fundamental to the development of modern techniques of speech synthesis, reading machines for the blind, the study of speech perception and speech recognition, and the development of the motor theory of speech perception.

Kurzweil Education is an American-based company that provides educational technology.

Franklin Seaney Cooper was an American physicist and inventor who was a pioneer in speech research.

Ignatius G. Mattingly (1927–2004) was a prominent American linguist and speech scientist. Prior to his academic career, he was an analyst for the National Security Agency from 1955 to 1966. He was a Lecturer and then Professor of Linguistics at the University of Connecticut from 1966 to 1996 and a researcher at Haskins Laboratories from 1966 until his death in 2004. He is best known for his pioneering work on speech synthesis and reading and for his theoretical work on the motor theory of speech perception in conjunction with Alvin Liberman. He received his B.A. in English from Yale University in 1947, his M.A. in Linguistics from Harvard University in 1959, and his Ph.D. in English from Yale University in 1968.

Michael Studdert-Kennedy (1927–2017) was an American psychologist and speech scientist. He was well known for his contributions to studies of speech perception, the motor theory of speech perception, and the evolution of language, among other areas. He was a professor emeritus of psychology at the University of Connecticut and a professor emeritus of linguistics at Yale University. He was president of Haskins Laboratories in New Haven, Connecticut, from 1986 to 1992. He was also a member of the Haskins Laboratories Board of Directors and was chairman of the board from 1988 until 2001. He was the son of the priest and Christian socialist Geoffrey Studdert-Kennedy.

Donald P. Shankweiler is an eminent psychologist and cognitive scientist who has done pioneering work on the representation and processing of language in the brain. He is a Professor Emeritus of Psychology at the University of Connecticut, a Senior Scientist at Haskins Laboratories in New Haven, Connecticut, and a member of the Board of Directors at Haskins. He is married to well-known American philosopher of biology, psychology, and language Ruth Millikan.

The Kurzweil K250, manufactured by Kurzweil Music Systems, was an early electronic musical instrument which produced sound from sampled sounds compressed in ROM, faster than common mass storage such as a disk drive. Acoustic sounds from brass, percussion, string and woodwind instruments as well as sounds created using waveforms from oscillators were utilized. Designed for professional musicians, it was invented by Raymond Kurzweil, founder of Kurzweil Computer Products, Inc., Kurzweil Music Systems and Kurzweil Educational Systems with consultation from Stevie Wonder; Lyle Mays, an American jazz pianist; Alan R. Pearlman, founder of ARP Instruments Inc.; and Robert Moog, inventor of the Moog synthesizer.

Leonard Katz (1938–2017) was an American experimental psychologist, born in Boston, Massachusetts. He was a professor of psychology at the University of Connecticut (1965–2006) and then professor emeritus until 2017. He was a Fellow of the American Association for the Advancement of Science and the Association for Psychological Science.

Telesensory Systems, Inc. (TSI) was an American corporation that invented, designed, manufactured, and distributed technological aids for blind and low vision persons. TSI's products helped visually impaired people work independently with computers and with ordinary printed materials.

The motor theory of speech perception is the hypothesis that people perceive spoken words by identifying the vocal tract gestures with which they are pronounced rather than by identifying the sound patterns that speech generates. It originally claimed that speech perception is done through a specialized module that is innate and human-specific. Though the idea of a module has been qualified in more recent versions of the theory, the idea remains that the role of the speech motor system is not only to produce speech articulations but also to detect them.

References

  1. Shankweiler, D; Fowler, CA (February 2015). "Seeking a reading machine for the blind and discovering the speech code" (PDF). History of Psychology. 18 (1): 78–99. doi:10.1037/a0038299. PMID 25528275. S2CID 2347141.
  2. Cooper, FS; Gaitenby, JH; Nye, PW (May 1984). "Evolution of reading machines for the blind: Haskins Laboratories' research as a case history". Journal of Rehabilitation Research and Development. 21 (1): 51–87. PMID 6396402.
  3. "Kurzweil Computer Products". Kurzweil Tech.
  4. "RLE Timeline 1960-1979". Retrieved 3 January 2015.
  5. "Mobile Products". KNFB Reader. Archived from the original on 2010-08-15. Retrieved 2010-10-19.