Larry Heck

Born: Larry Paul Heck
Nationality: American
Alma mater: Georgia Institute of Technology, Texas Tech University
Scientific career
Institutions: Viv, Samsung, Google, Microsoft, Yahoo!, Nuance Communications, Stanford Research Institute
Thesis: A Subspace Approach to the Automatic Design of Pattern Recognition Systems for Mechanical System Monitoring (1991)
Doctoral advisor: Prof. James H. McClellan

Larry Paul Heck is the Rhesa Screven Farmer, Jr., Advanced Computing Concepts Chair, Georgia Research Alliance Eminent Scholar, Chief Scientist of the AI Hub, Executive Director of the Machine Learning Center, and Professor at the Georgia Institute of Technology. His career spans many of the sub-disciplines of artificial intelligence, including conversational AI, speech recognition and speaker recognition, natural language processing, web search, online advertising and acoustics. He is best known for his role as a co-founder of the Microsoft Cortana personal assistant and his early work in deep learning for speech processing.

Education and career

Larry Heck was born in Havre, Montana. After receiving the Bachelor of Science in electrical engineering at Texas Tech University, he was admitted to graduate school at the Georgia Institute of Technology in 1986. Heck received the MSEE in 1989 and the PhD in 1991 under advisor Prof. James H. McClellan. [1]

From 1992 to 1998, he was a senior research engineer at SRI International, initially with the Acoustics and Radar Technology Lab (ARTL) and later with the Speech Technology and Research (STAR) Lab. Funded by the US government's NSA and DARPA, Heck led the SRI team that was the first to successfully create large-scale deep neural network (DNN) technology in the field of speech processing and the first to deploy a major industrial application of deep learning. [2] The deep learning technology was used to win the 1998 National Institute of Standards and Technology Speaker Recognition evaluation. [3]

From 1998 to 2005, he was vice president of R&D at Nuance Communications, where he led the company's efforts in speech recognition, natural language processing, speaker recognition, and speech synthesis technology.

From 2005 to 2008, he was vice president of search & advertising sciences at Yahoo!, responsible for the company's search and advertising quality. In 2008, Heck worked with Yahoo! Research to merge it with his search and advertising sciences organization, forming Yahoo! Labs.

Beginning in 2009, he was the chief scientist of speech products at Microsoft. In this role, he established the vision, mission and long-range plan and hired the initial team to create Microsoft's digital personal assistant, Cortana. [4] Heck was named a Microsoft Distinguished Engineer in 2012 and joined Microsoft Research that same year.

In 2014, he joined Google as a principal research scientist, where he founded the deep learning-based conversational AI team "Deep Dialogue". The team works on advanced research for the Google Assistant.

In 2017, Heck joined Samsung as SVP and co-head of global AI Research. In 2019, he became head of Bixby North America and the CEO of Viv Labs, an independent subsidiary of Samsung.

In 2021, Heck returned to the Georgia Institute of Technology as a Professor.

Awards and honors

Larry Heck was named Fellow of the Institute of Electrical and Electronics Engineers (IEEE) in 2016 [5] for leadership in application of machine learning to spoken and text language processing.

Heck received the 2017 Academy of Distinguished Engineering Alumni Award from the Georgia Institute of Technology. In the same year, he also received the Texas Tech University Whitacre College of Engineering Distinguished Engineer Award.

References

  1. Larry Heck at the Mathematics Genealogy Project
  2. Heck, L.; Konig, Y.; Sonmez, M.; Weintraub, M. (2000). "Robustness to Telephone Handset Distortion in Speaker Recognition by Discriminative Feature Design" (PDF). Speech Communication. 31 (2): 181–192. doi:10.1016/S0167-6393(99)00077-1.
  3. Doddington, G.; Przybocki, M.; Martin, A.; Reynolds, D. (2000). "The NIST speaker recognition evaluation – Overview, methodology, systems, results, perspective". Speech Communication. 31 (2): 225–254. doi:10.1016/S0167-6393(99)00080-1.
  4. Microsoft Research (April 17, 2014). "Anticipating More from Cortana". Microsoft Research blogs. Archived from the original on March 3, 2016.
  5. "2016 elevated fellow" (PDF). IEEE Fellows Directory.