Subvocal recognition

Electrodes used in subvocal speech recognition research at NASA's Ames Research Lab.

Subvocal recognition (SVR) is the process of detecting subvocalization and converting the result into digital output, either aural or text-based.[1]

Concept

A set of electrodes is attached to the skin of the throat and, without the user opening their mouth or uttering a sound, the words are recognized by a computer.

Subvocal speech recognition deals with electromyograms, which differ from speaker to speaker; even the positioning of an electrode can throw off consistency. To improve accuracy, researchers in this field rely on statistical models that get better at pattern matching the more often a subject "speaks" through the electrodes, but even then there are lapses. Researchers at Carnegie Mellon University found that the same "speaker" with an accuracy rate of 94% one day could see that rate drop to 48% a day later; between two different speakers it drops even further.[citation needed]
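The statistical pattern-matching idea can be illustrated with a toy sketch: repeated "utterances" of each word are averaged into a per-word template, and a new signal is assigned to the nearest template. The three-value feature vectors and word labels below are invented for illustration; real systems extract features from raw electromyogram signals and use far richer models than nearest-template matching.

```python
from math import dist

# Invented per-word EMG feature vectors: several "utterances" per word.
# Real systems derive such features from raw electromyogram signals.
training = {
    "yes": [(0.9, 0.2, 0.1), (0.8, 0.3, 0.2), (1.0, 0.1, 0.1)],
    "no":  [(0.1, 0.8, 0.9), (0.2, 0.9, 0.8), (0.1, 1.0, 0.9)],
}

# Average each word's repetitions into a template; more repetitions give
# a more stable template, mirroring how accuracy improves the more often
# a subject "speaks" through the electrodes.
templates = {
    word: tuple(sum(vals) / len(vals) for vals in zip(*reps))
    for word, reps in training.items()
}

def recognize(features):
    """Return the word whose template is nearest to the feature vector."""
    return min(templates, key=lambda w: dist(templates[w], features))

print(recognize((0.85, 0.25, 0.15)))  # -> yes
print(recognize((0.15, 0.85, 0.85)))  # -> no
```

A shifted electrode would distort every incoming feature vector relative to the stored templates, which is one intuition for the day-to-day accuracy drops described above.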

The technology is relevant wherever audible speech is impossible: for astronauts, Navy SEALs underwater, fighter pilots, and emergency workers charging into loud, harsh environments. At Worcester Polytechnic Institute in Massachusetts, research is underway into using subvocal information as a control source for computer music instruments.[citation needed]

Research and patents

With a grant from the U.S. Army, research into synthetic telepathy using subvocalization is taking place at the University of California, Irvine under lead scientist Mike D'Zmura.[2]

NASA's Ames Research Laboratory in Mountain View, California, under the supervision of Charles Jorgensen, is conducting subvocalization research.[citation needed]

The Brain Computer Interface R&D program at the Wadsworth Center, under the New York State Department of Health, has confirmed the ability to decipher consonants and vowels from imagined speech, allowing for brain-based communication,[3] though it uses EEGs rather than subvocalization techniques.

US patents on silent communication technologies include US Patent 6587729, "Apparatus for audibly communicating speech using the radio frequency hearing effect";[4] US Patent 5159703, "Silent subliminal presentation system";[5] US Patent 6011991, "Communication system and method including brain wave analysis and/or use of brain activity";[6] and US Patent 3951134, "Apparatus and method for remotely monitoring and altering brain waves".[7] The latter two rely on brain wave analysis.

Related Research Articles

Cochlear implant

A cochlear implant (CI) is a surgically implanted neuroprosthesis that provides a person who has moderate-to-profound sensorineural hearing loss with sound perception. With the help of therapy, cochlear implants may allow for improved speech understanding in both quiet and noisy environments. A CI bypasses acoustic hearing by direct electrical stimulation of the auditory nerve. Through everyday listening and auditory training, cochlear implants allow both children and adults to learn to interpret those signals as speech and sound.

A universal translator is a device common to many science fiction works, especially on television. First described in Murray Leinster's 1945 novella "First Contact", the translator's purpose is to offer an instant translation of any language.

Lip reading, also known as speechreading, is a technique of understanding a limited range of speech by visually interpreting the movements of the lips, face and tongue without sound. Estimates of the range of lip reading vary, with some figures as low as 30% because lip reading relies on context, language knowledge, and any residual hearing. Although lip reading is used most extensively by deaf and hard-of-hearing people, most people with normal hearing process some speech information from sight of the moving mouth.

Brain–computer interface

A brain–computer interface (BCI), sometimes called a brain–machine interface (BMI), is a direct communication pathway between the brain's electrical activity and an external device, most commonly a computer or robotic limb. BCIs are often directed at researching, mapping, assisting, augmenting, or repairing human cognitive or sensory-motor functions. They are often conceptualized as a human–machine interface that skips the intermediary component of the physical movement of body parts, although they also raise the possibility of the erasure of the discreteness of brain and machine. Implementations of BCIs range from non-invasive and partially invasive to invasive, based on how close electrodes get to brain tissue.

Subvocalization

Subvocalization, or silent speech, is the internal speech typically made when reading; it provides the sound of the word as it is read. This is a natural process when reading, and it helps the mind to access meanings to comprehend and remember what is read, potentially reducing cognitive load.

Brain implants, often referred to as neural implants, are technological devices that connect directly to a biological subject's brain – usually placed on the surface of the brain, or attached to the brain's cortex. A common purpose of modern brain implants and the focus of much current research is establishing a biomedical prosthesis circumventing areas in the brain that have become dysfunctional after a stroke or other head injuries. This includes sensory substitution, e.g., in vision. Other brain implants are used in animal experiments simply to record brain activity for scientific reasons. Some brain implants involve creating interfaces between neural systems and computer chips. This work is part of a wider research field called brain–computer interfaces.

Neuroprosthetics is a discipline related to neuroscience and biomedical engineering concerned with developing neural prostheses. Neural prostheses are sometimes contrasted with brain–computer interfaces, which connect the brain to a computer rather than to a device meant to replace missing biological functionality.

Graeme Milbourne Clark is an Australian Professor of Otolaryngology at the University of Melbourne. His work in ENT surgery, electronics, and speech science contributed to the development of the multiple-channel cochlear implant. His invention was later marketed by Cochlear Limited.

The Greenwood function correlates the position of the hair cells in the inner ear with the frequencies that stimulate their corresponding auditory neurons. Empirically derived in 1961 by Donald D. Greenwood, the relationship has been shown to hold throughout mammalian species when scaled to the appropriate cochlear spiral lengths and audible frequency ranges. Moreover, the Greenwood function provides the mathematical basis for the surgical placement of cochlear implant electrode arrays within the cochlea.
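For humans the Greenwood function is commonly written F(x) = A(10^(ax) − k), where x is the proportional distance along the basilar membrane from apex (x = 0) to base (x = 1); the constants below (A = 165.4 Hz, a = 2.1, k = 0.88) are the standard human fit, quoted here as an illustration of how position maps to frequency.

```python
def greenwood_frequency(x, A=165.4, a=2.1, k=0.88):
    """Characteristic frequency (Hz) at proportional position x along
    the human basilar membrane (0 = apex, 1 = base)."""
    return A * (10 ** (a * x) - k)

# The apex responds to low frequencies, the base to high ones,
# spanning roughly the 20 Hz - 20 kHz range of human hearing:
print(greenwood_frequency(0.0))  # ~19.8 Hz
print(greenwood_frequency(1.0))  # ~20.7 kHz
```

Inverting this mapping tells a surgeon roughly which frequency band each contact of an implanted electrode array will stimulate at a given insertion depth.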

An auditory brainstem implant (ABI) is a surgically implanted electronic device that provides a sense of sound to a person who is profoundly deaf due to retrocochlear hearing impairment. In Europe, ABIs have been used in children and adults, including patients with neurofibromatosis type II.

Cyborg

A cyborg, a portmanteau of cybernetic and organism, is a being with both organic and biomechatronic body parts. The term was coined in 1960 by Manfred Clynes and Nathan S. Kline. In contrast to biorobots and androids, the term cyborg applies to a living organism that has restored function or enhanced abilities due to the integration of some artificial component or technology that relies on feedback.

A silent speech interface is a device that allows speech communication without using the sound made when people vocalize their speech sounds. As such, it is a type of electronic lip reading. It works by having a computer identify the phonemes an individual pronounces from non-auditory sources of information about their speech movements, which are then used to recreate the speech via speech synthesis.

Neurostimulation is the purposeful modulation of the nervous system's activity using invasive or non-invasive means. Neurostimulation usually refers to the electromagnetic approaches to neuromodulation.

Neurotrophic electrode

The neurotrophic electrode is an intracortical device designed to read the electrical signals that the brain uses to process information. It consists of a small, hollow glass cone attached to several electrically conductive gold wires. The term neurotrophic means "relating to the nutrition and maintenance of nerve tissue" and the device gets its name from the fact that it is coated with Matrigel and nerve growth factor to encourage the expansion of neurites through its tip. It was invented by neurologist Dr. Philip Kennedy and was successfully implanted for the first time in a human patient in 1996 by neurosurgeon Roy Bakay.

Imagined speech is thinking in the form of sound: "hearing" one's own voice silently to oneself, without the intentional movement of any extremities such as the lips, tongue, or hands. Logically, imagined speech has been possible since the emergence of language; however, the phenomenon is most associated with its investigation through signal processing and detection within electroencephalograph (EEG) data, as well as data obtained using other non-invasive brain–computer interface (BCI) devices.

Frank H. Guenther is an American computational and cognitive neuroscientist whose research focuses on the neural computations underlying speech, including characterization of the neural bases of communication disorders and development of brain–computer interfaces for communication restoration. He is currently a professor of speech, language, and hearing sciences and biomedical engineering at Boston University.

Ingeborg Hochmair

Ingeborg J. Hochmair-Desoyer is an Austrian electrical engineer and the CEO and CTO of hearing implant company MED-EL. Dr Hochmair and her husband Prof. Erwin Hochmair co-created the first micro-electronic multi-channel cochlear implant in the world. She received the Lasker-DeBakey Clinical Medical Research Award for her contributions towards the development of the modern cochlear implant. She also received the 2015 Russ Prize for bioengineering.

Claude-Henri Chouard

Claude-Henri Chouard is a French surgeon. An otologist, he has been a full member of the Académie Nationale de Médecine since 1999. He was director of the AP-HP Laboratory of Auditory Prosthesis and director of the ENT Research Laboratory at Paris-Saint-Antoine University Hospital from 1967 to 2001. He was also head of the institution's ENT Department from 1978 to 1998. In 1982, he was elected a member of the International Collegium ORL-AS. He achieved worldwide recognition in the late 1970s thanks to the work completed by his Paris laboratory's multidisciplinary team on the multichannel cochlear implant. This implanted electronic hearing device was developed at Saint-Antoine and alleviates bilateral total deafness. When implanted early in young children, it can also help overcome the spoken language problems associated with deafness.

Monita Chatterjee is an auditory scientist and the Director of the Auditory Prostheses & Perception Laboratory at Boys Town National Research Hospital. She investigates the basic mechanisms underlying auditory processing by cochlear implant listeners.

Neural dust is a hypothetical class of nanometer-sized devices operated as wirelessly powered nerve sensors; it is a type of brain–computer interface. The sensors may be used to study, monitor, or control the nerves and muscles and to remotely monitor neural activity. In practice, a medical treatment could introduce thousands of neural dust devices into human brains. The term is derived from "smart dust", as the sensors used as neural dust may also be defined by this concept.

References

  1. Shirley, John (2013-05-01). New Taboos. PM Press. ISBN 9781604868715. Retrieved 14 April 2017.
  2. "Army developing 'synthetic telepathy'". NBC News. 13 October 2008.
  3. Pei, Xiaomei; Barbour, Dennis L.; Leuthardt, Eric C.; Schalk, Gerwin (2011). "Decoding vowels and consonants in spoken and imagined words using electrocorticographic signals in humans". Journal of Neural Engineering. 8 (4): 046028. Bibcode:2011JNEng...8d6028P. doi:10.1088/1741-2560/8/4/046028. PMC 3772685. PMID 21750369.
  4. US Patent 6587729, "Apparatus for audibly communicating speech using the radio frequency hearing effect".
  5. US Patent 5159703, "Silent subliminal presentation system".
  6. US Patent 6011991, "Communication system and method including brain wave analysis and/or use of brain activity".
  7. US Patent 3951134, "Apparatus and method for remotely monitoring and altering brain waves".