John Makhoul

Prototype telephone for the Network Voice Protocol, signed by John Makhoul

John Makhoul is a Lebanese-American computer scientist who works in the field of speech and language processing. His work on linear predictive coding was used in the establishment of the Network Voice Protocol, which enabled the transmission of speech signals over the ARPANET. [1] Makhoul is recognized for his contributions across speech and language processing, including speech analysis, speech coding, speech recognition and speech understanding. He has made a number of significant contributions to the mathematical modeling of speech signals, including his work on linear prediction and vector quantization. His patented work on the direct application of speech recognition techniques to accurate, language-independent optical character recognition (OCR) has had a substantial impact on how quickly OCR systems can be created for multiple languages. [2]

Makhoul is a Chief Scientist at BBN Technologies, where he has led several research projects, including the DARPA GALE program. [3]

Early life and education

Makhoul was born in Deirmimas, a village in southern Lebanon, and did his early schooling in Lebanon. During his high school years, he spent one year as an exchange student at a high school in Foley, Minnesota. He attended the American University of Beirut, graduating with a Bachelor of Engineering degree in Electrical Engineering in 1964. He then received a Master of Science degree in Electrical Engineering from Ohio State University in 1965 and completed his PhD at MIT in 1970. Makhoul has worked at BBN Technologies since then. [4] [5]

Awards and honors

Throughout his career, Makhoul has received several awards and honors. He is a Fellow of the IEEE for contributions to the theory of linear prediction and its applications to spectral estimation, speech analysis and data compression [6] and a Fellow of the Acoustical Society of America. [7] [8] In 2013, he became a Fellow of the International Speech Communication Association (ISCA). [9]

Makhoul's 1975 Proceedings of the IEEE paper on linear prediction was named a "Citation Classic" by the Institute for Scientific Information. His other honors include the 1978 IEEE Senior Award, the 1982 IEEE Technical Achievement Award, the 1988 Society Award of the IEEE Signal Processing Society, and the 2000 IEEE Third Millennium Medal. [10]

In 2009, Makhoul received the IEEE James L. Flanagan Speech and Audio Processing Award, which recognizes outstanding contributions to the advancement of speech and/or audio signal processing. [11]

In 2016, he received the ISCA Medal for "leadership and extensive contributions to speech and language processing". [12]

Related Research Articles

Linear predictive coding (LPC) is a method used mostly in audio signal processing and speech processing for representing the spectral envelope of a digital signal of speech in compressed form, using the information of a linear predictive model.

Lawrence R. Rabiner is an electrical engineer working in the fields of digital signal processing and speech processing; in particular in digital signal processing for automatic speech recognition. He has worked on systems for AT&T Corporation for speech recognition.

Gunnar Fant

Carl Gunnar Michael Fant was a leading researcher in speech science in general and speech synthesis in particular who spent most of his career as a professor at the Swedish Royal Institute of Technology (KTH) in Stockholm. He was a first cousin of the actors and directors George Fant and Kenne Fant.

Thomas Huang: Chinese-American engineer and computer scientist (1936–2020)

Thomas Shi-Tao Huang was a Chinese-born American computer scientist, electrical engineer, and writer. He was a researcher and professor emeritus at the University of Illinois at Urbana-Champaign (UIUC). Huang was one of the leading figures in computer vision, pattern recognition and human computer interaction.

Kenneth N. Stevens: American computer scientist

Kenneth Noble Stevens was the Clarence J. LeBel Professor of Electrical Engineering and Computer Science, and professor of health sciences and technology at the research laboratory of electronics at MIT. Stevens was head of the speech communication group in MIT's research laboratory of electronics (RLE), and was one of the world's leading scientists in acoustic phonetics.

James Loton Flanagan was an American electrical engineer. He was Rutgers University's vice president for research until 2004. He was also director of Rutgers' Center for Advanced Information Processing and the Board of Governors Professor of Electrical and Computer Engineering. He is known for co-developing adaptive differential pulse-code modulation (ADPCM) with P. Cummiskey and Nikil Jayant at Bell Labs.

Bishnu S. Atal is an Indian physicist and engineer. He is a noted researcher in acoustics, and is best known for developments in speech coding. He advanced linear predictive coding (LPC) during the late 1960s to 1970s, and developed code-excited linear prediction (CELP) with Manfred R. Schroeder in 1985.

Fumitada Itakura is a Japanese scientist. He did pioneering work in statistical signal processing, and its application to speech analysis, synthesis and coding, including the development of the linear predictive coding (LPC) and line spectral pairs (LSP) methods.

Manfred R. Schroeder

Manfred Robert Schroeder was a German physicist, most known for his contributions to acoustics and computer graphics. He wrote three books and published over 150 articles in his field.

Roberto Pieraccini: Italian-American computer scientist

Roberto Pieraccini is an Italian and US electrical engineer working in the field of speech recognition, natural language understanding, and spoken dialog systems. He has been an active contributor to speech language research and technology since 1981. He is currently the Chief Scientist of Uniphore, a conversational automation technology company.

Nelson Harold Morgan is an American computer scientist and professor in residence (emeritus) of electrical engineering and computer science at the University of California, Berkeley. Morgan is the co-inventor of the Relative Spectral (RASTA) approach to speech signal processing, first described in a technical report published in 1991.

The Medical Intelligence and Language Engineering Laboratory, also known as MILE lab, is a research laboratory at the Indian Institute of Science, Bangalore under the Department of Electrical Engineering. The lab is known for its work on Image processing, online handwriting recognition, Text-To-Speech and Optical character recognition systems, all of which are focused mainly on documents and speech in Indian languages. The lab is headed by A. G. Ramakrishnan.

Shrikanth Narayanan

Shrikanth Narayanan is an Indian-American professor at the University of Southern California. He is an interdisciplinary engineer-scientist with a focus on human-centered signal processing and machine intelligence, with speech and spoken language processing at its core. A prolific award-winning researcher, educator, and inventor, with hundreds of publications and a number of patents to his credit, he has pioneered several research areas, including computational speech science, speech and human language technologies, audio, music and multimedia engineering, human sensing and imaging technologies, emotions research and affective computing, behavioral signal processing, and computational media intelligence. His technical contributions cover a range of applications in defense and intelligence, security, health, education, media, and the arts, including technologies that facilitate awareness and support of diversity and inclusion. His award-winning patents have contributed to the proliferation of speech technologies on the cloud and on mobile devices and have enabled novel emotion-aware artificial intelligence technologies.

Biing Hwang "Fred" Juang is a communication and information scientist, best known for his work in speech coding, speech recognition and acoustic signal processing. He joined Georgia Institute of Technology in 2002 as Motorola Foundation Chair Professor in the School of Electrical & Computer Engineering.

Steve Young (software engineer): British researcher (born 1951)

Stephen John Young is a British researcher, Professor of Information Engineering at the University of Cambridge and an entrepreneur. He is one of the pioneers of automated speech recognition and statistical spoken dialogue systems. He served as the Senior Pro-Vice-Chancellor of the University of Cambridge from 2009 to 2015, responsible for planning and resources. From 2015 to 2019, he held a joint appointment between his professorship at Cambridge and Apple, where he was a senior member of the Siri development team.

Mads Græsbøll Christensen

Mads Græsbøll Christensen is a Danish professor of audio processing at the Department of Architecture, Design & Media Technology, Aalborg University, where he is also head and founder of the Audio Analysis Lab, which conducts research in audio and acoustic signal processing. Before that, he worked at the Department of Electronic Systems at Aalborg University and held visiting positions at Philips Research Labs, ENST, UCSB, and Columbia University. He has published extensively on these topics in books, scientific journals and conference proceedings, and he has given tutorials and keynote talks at major international scientific conferences.

Mari Ostendorf is a professor of electrical engineering in the area of speech and language technology and the vice provost for research at the University of Washington.

Lori Faith Lamel is a speech processing researcher known for her work with the TIMIT corpus of American English speech and for her work on voice activity detection, speaker recognition, and other non-linguistic inferences from speech signals. She works for the French National Centre for Scientific Research (CNRS) as a senior research scientist in the Spoken Language Processing Group of the Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur.

Abeer Alwan is an American electrical engineer and speech processing researcher. She is a professor of electrical and computer engineering in the UCLA Henry Samueli School of Engineering and Applied Science, and vice chair for undergraduate affairs in the Department of Electrical & Computer Engineering.

Chin-Hui Lee is an information scientist, best known for his work in speech recognition, speaker recognition and acoustic signal processing. He joined Georgia Institute of Technology in 2002 as a professor in the School of Electrical and Computer Engineering.

References

  1. "Linear Predictive Coding and the Internet Protocol, A survey of LPC and a History of Realtime Digital Speech on Packet Networks" (PDF).
  2. "Citations for ISCA Medalists".
  3. Anderson, Nate. "Defense Department funds massive speech recognition and translation program".
  4. "Understanding speech: an interview with John Makhoul". IEEE Signal Processing Magazine. 22 (3): 76–79. 2005-05-09. doi:10.1109/MSP.2005.1425901. ISSN 1053-5888.
  5. "BBN Technologies' John Makhoul, Pioneer in Speech Signal Processing, Receives 2009 IEEE James L. Flanagan Speech and Audio Processing Award" (Press release). Retrieved 23 January 2018.
  6. "IEEE Fellows 1980 | IEEE Communications Society".
  7. "IEEE Fellows 1980".
  8. "New Fellows of the Acoustical Society of America—65 (3–6), 851(N), 1071(N), 1344(N), 1591(N)". The Journal of the Acoustical Society of America. 65 (3): 851. 1979-03-01. doi:10.1121/1.382511. ISSN 0001-4966.
  9. "ISCA Fellows, 2013".
  10. "John Makhoul, BBN Technologies Chief Scientist, Awarded IEEE's Highest Award in Speech" (Press release).
  11. "IEEE James L. Flanagan Speech and Audio Processing Award Recipients". Institute of Electrical and Electronics Engineers (IEEE).
  12. "ISCA Medalists".