Janet M. Baker

Alma mater: Tufts University; Carnegie Mellon University
Known for: Dragon Systems
Spouse: James K. Baker
Fields: speech recognition
Institutions: MIT Media Lab; Harvard Medical School; Dragon Systems
Thesis: "A new time-domain analysis of human speech and other complex waveforms" (1975)
Doctoral advisor: Raj Reddy

Janet MacIver Baker is an American computer scientist, neuroscientist, and entrepreneur. With her husband, James K. Baker, she co-founded Dragon Systems, and the two are jointly credited with the creation of Dragon NaturallySpeaking.[1]

In 2012, she and her husband received the IEEE James L. Flanagan Speech and Audio Processing Award.[2]

Baker is currently affiliated with the MIT Media Lab and Harvard Medical School as a visiting scientist and lecturer.[3]


References

  1. "History of Speech Recognition". Dragon Medical Transcription. Archived from the original on 2015-08-13. Retrieved 17 January 2015.
  2. "IEEE James L. Flanagan Speech and Audio Processing Award Recipients". IEEE. Retrieved 2 July 2017.
  3. Blanding, Michael (Fall 2012). "Speechless". Tufts Magazine. Retrieved 2 July 2017.