Kurt Keutzer

Born: Kurt William Keutzer, November 9, 1955 (age 68)
Nationality: American
Alma mater: Maharishi University of Management (B.S.); Indiana University Bloomington (M.S. and PhD) [1]
Awards: Fellow, IEEE
Fields: Computer science
Institutions: Bell Labs, Synopsys, University of California, Berkeley, DeepScale

Kurt Keutzer (born November 9, 1955) is an American computer scientist.

Early life and education

Kurt Keutzer grew up in Indianapolis, Indiana.[citation needed] He earned a bachelor's degree in mathematics from Maharishi University of Management (formerly Maharishi International University) in 1978,[2] and a PhD in computer science from Indiana University in 1984.[3]

Career

Keutzer joined Bell Labs in 1984, where he worked on logic synthesis.[2] In 1991, he joined the electronic design automation company Synopsys, where he was promoted to chief technology officer.[2][4] He subsequently joined the University of California, Berkeley, as a professor in 1998.[2]

His research at Berkeley has focused on the intersection of high-performance computing and machine learning. Working with his graduate students, Keutzer developed FireCaffe, which scaled the training of deep neural networks to over 100 GPUs; later, using the LARS and LAMB optimizers, the group scaled training to more than 1,000 servers.[5][6][7][8] Keutzer and his students also developed compact deep neural networks such as SqueezeNet, SqueezeDet, and SqueezeSeg, which can run efficiently on mobile devices.[9]
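
The LARS optimizer mentioned above scales each layer's learning rate by the ratio of its weight norm to its gradient norm, which is what allows training with very large batches. The following is a minimal illustrative sketch of that layer-wise scaling rule in Python; the coefficient names and values (trust_coeff, weight_decay) are assumptions chosen for the example, not taken from the cited papers.

    import numpy as np

    def lars_update(weights, grads, base_lr=0.1, trust_coeff=0.001, weight_decay=5e-4):
        """One illustrative LARS-style step: scale each layer's step size by the
        ratio of its weight norm to its (regularized) gradient norm."""
        updated = []
        for w, g in zip(weights, grads):
            w_norm = np.linalg.norm(w)
            g_norm = np.linalg.norm(g)
            # Layer-wise "trust ratio": layers with large weights and small
            # gradients take proportionally larger steps.
            if w_norm > 0 and g_norm > 0:
                local_lr = trust_coeff * w_norm / (g_norm + weight_decay * w_norm)
            else:
                local_lr = 1.0
            updated.append(w - base_lr * local_lr * (g + weight_decay * w))
        return updated

    # Example: two "layers" of random weights and matching gradients.
    rng = np.random.default_rng(0)
    layers = [rng.normal(size=(64, 32)), rng.normal(size=(10, 64))]
    grads = [rng.normal(size=w.shape) for w in layers]
    layers = lars_update(layers, grads)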

Keutzer co-founded DeepScale with his PhD student Forrest Iandola in 2015 and served as the company's chief strategy officer.[10] The firm focused on developing deep neural networks for advanced driver-assistance systems in passenger cars.[11]

On October 1, 2019, electric vehicle manufacturer Tesla, Inc. purchased DeepScale to augment and accelerate its self-driving vehicle work.[12]

Honors and awards

Keutzer was named a Fellow of the IEEE in 1996.[13]

He received the DAC Most Influential Paper (MIP) award (24th DAC, 1987) for his publication "Dagon: technology binding and local optimization by DAG matching".[14]

Books by Keutzer

Algorithms and Techniques for VLSI Layout Synthesis (Springer, 1988) [15]
Logic Synthesis [16]
Closing the Gap Between ASIC & Custom: Tools and Techniques for High-Performance ASIC Design (Springer, 2002) [17] [18]
Static Crosstalk-Noise Analysis: For Deep Sub-Micron Digital Designs (Springer, 2004) [20]
Building ASIPs: The Mescal Methodology (Springer, 2005) [21]
Closing the Power Gap between ASIC & Custom: Tools and Techniques for Low Power Design (2007) [19]

Related Research Articles

Neural network (machine learning): computational model used in machine learning, based on connected, hierarchical functions

In machine learning, a neural network is a model inspired by the structure and function of biological neural networks in animal brains.

Application-specific integrated circuit: integrated circuit customized for a specific task

An application-specific integrated circuit (ASIC) is an integrated circuit (IC) chip customized for a particular use rather than for general-purpose use, such as a chip designed to run in a digital voice recorder or a high-efficiency video codec. Application-specific standard product (ASSP) chips are intermediate between ASICs and industry-standard integrated circuits like the 7400 series or the 4000 series. ASIC chips are typically fabricated using metal–oxide–semiconductor (MOS) technology.

Mixed-signal integrated circuit

A mixed-signal integrated circuit is any integrated circuit that has both analog circuits and digital circuits on a single semiconductor die. Their usage has grown dramatically with the increased use of cell phones, telecommunications, portable electronics, and automobiles with electronics and digital sensors.

A recurrent neural network (RNN) is an artificial neural network in which connections between nodes can form cycles, so that the output of some nodes affects subsequent input to those same nodes. In contrast to a uni-directional feedforward neural network, an RNN can use its internal state (memory) to process arbitrary sequences of inputs, which makes it applicable to tasks such as unsegmented, connected handwriting recognition or speech recognition. Recurrent networks have an infinite impulse response, whereas convolution-based networks have a finite impulse response; both classes exhibit temporal dynamic behavior. A finite impulse recurrent network is a directed acyclic graph that can be unrolled and replaced with a strictly feedforward neural network, while an infinite impulse recurrent network is a directed cyclic graph that cannot be unrolled.
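
As a concrete illustration of the recurrence and unrolling described above, the following minimal Python sketch computes the hidden state of a vanilla RNN over a short sequence; the dimensions and the tanh nonlinearity are illustrative choices, not a reference implementation.

    import numpy as np

    def rnn_step(x_t, h_prev, W_x, W_h, b):
        # The new hidden state depends on the current input and the previous
        # state, which is what gives the network its internal memory.
        return np.tanh(x_t @ W_x + h_prev @ W_h + b)

    # Unrolling: process a length-5 sequence of 3-dimensional inputs
    # with a 4-dimensional hidden state.
    rng = np.random.default_rng(0)
    W_x, W_h, b = rng.normal(size=(3, 4)), rng.normal(size=(4, 4)), np.zeros(4)
    h = np.zeros(4)
    for x_t in rng.normal(size=(5, 3)):
        h = rnn_step(x_t, h, W_x, W_h, b)
    print(h)  # final hidden state after the whole sequence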

Hardware acceleration: specialized computer hardware

Hardware acceleration is the use of computer hardware designed to perform specific functions more efficiently when compared to software running on a general-purpose central processing unit (CPU). Any transformation of data that can be calculated in software running on a generic CPU can also be calculated in custom-made hardware, or in some mix of both.

Giovanni De Micheli is a research scientist in electronics and computer science. He is credited with the invention of the Network on a Chip design automation paradigm and with the creation of algorithms and design tools for electronic design automation (EDA). He is Professor and Director of the Integrated Systems Laboratory at École Polytechnique Fédérale de Lausanne (EPFL), Switzerland. Previously, he was Professor of Electrical Engineering at Stanford University. He was Director of the Electrical Engineering Institute at EPFL from 2008 to 2019 and program leader of the Swiss federal Nano-Tera.ch program. He holds a Nuclear Engineer degree, and an M.S. and a Ph.D. degree in Electrical Engineering and Computer Science, completed under Alberto Sangiovanni-Vincentelli.

An application-specific instruction set processor (ASIP) is a component used in system on a chip design. The instruction set architecture of an ASIP is tailored to benefit a specific application. This specialization of the core provides a tradeoff between the flexibility of a general purpose central processing unit (CPU) and the performance of an application-specific integrated circuit (ASIC).

ESS Technology: former speech-synthesizer company now known for its Sabre DAC chips

ESS Technology Incorporated is a private manufacturer of computer multimedia products, audio DACs, and ADCs based in Fremont, California, with R&D centers in Kelowna, British Columbia, Canada, and Beijing, China. It was founded by Forrest Mozer in 1983. Robert L. Blair is the CEO and President of the company.

A physical neural network is a type of artificial neural network in which an electrically adjustable material is used to emulate the function of a neural synapse or a higher-order (dendritic) neuron model. "Physical" neural network is used to emphasize the reliance on physical hardware used to emulate neurons as opposed to software-based approaches. More generally the term is applicable to other artificial neural networks in which a memristor or other electrically adjustable resistance material is used to emulate a neural synapse.

Deep learning: branch of machine learning

Deep learning is the subset of machine learning methods based on neural networks with representation learning. The adjective "deep" refers to the use of multiple layers in the network. Methods used can be either supervised, semi-supervised or unsupervised.

NanGate, Inc. was a privately held, Silicon Valley–based United States company dealing in electronic design automation (EDA) for electrical engineering and electronics until its acquisition by Silvaco, Inc. in 2018. NanGate was founded in October 2004 by a group of semiconductor professionals with backgrounds at Intel Corporation and Vitesse Semiconductor Corp. The company received capital investments from a group of Danish business angels and venture capital companies, and was owned and controlled by its management following a management buy-out in 2012. NanGate marketed a range of software products and design services for the design and optimization of standard cell libraries and application-specific integrated circuits, with a market focus on standard cell library design and optimization for 14–28 nanometer CMOS processes.

A convolutional neural network (CNN) is a regularized type of feedforward neural network that learns features directly from data by optimizing its filters (kernels). Vanishing and exploding gradients, seen during backpropagation in earlier neural networks, are mitigated by using regularized weights shared over fewer connections. For example, a fully connected layer requires 10,000 weights for each neuron processing an image of 100 × 100 pixels, whereas a single convolution kernel covering 5 × 5 tiles requires only 25 learnable weights, regardless of image size. Higher-layer features are extracted from wider context windows than lower-layer features.
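
As a worked version of the parameter counts in the paragraph above, the short sketch below compares the weights needed by a single fully connected neuron over a 100 × 100 image with those of one shared 5 × 5 convolution kernel; the numbers simply restate the example in the text.

    # Parameter-count comparison for a 100 x 100 input image.
    image_h, image_w = 100, 100
    kernel_h, kernel_w = 5, 5

    # A fully connected neuron needs one weight per input pixel.
    fc_weights_per_neuron = image_h * image_w        # 10,000

    # A convolution kernel is shared across every position in the image,
    # so its weight count does not depend on the image size.
    conv_weights_per_kernel = kernel_h * kernel_w    # 25

    print(fc_weights_per_neuron, conv_weights_per_kernel)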

Google Brain was a deep learning artificial intelligence research team under the umbrella of Google AI, a research division at Google dedicated to artificial intelligence. Formed in 2011, it combined open-ended machine learning research with information systems and large-scale computing resources. It created tools such as TensorFlow, which allow neural networks to be used by the public, and multiple internal AI research projects, and aimed to create research opportunities in machine learning and natural language processing. It was merged into former Google sister company DeepMind to form Google DeepMind in April 2023.

Approximate computing is an emerging paradigm for energy-efficient and/or high-performance design. It includes a plethora of computation techniques that return a possibly inaccurate result rather than a guaranteed accurate result, and that can be used for applications where an approximate result is sufficient for their purpose. One example of such a situation is a search engine, where no exact answer may exist for a certain query and hence many answers may be acceptable. Similarly, occasional dropping of some frames in a video application can go undetected due to the perceptual limitations of humans. Approximate computing is based on the observation that, in many scenarios, although performing exact computation requires a large amount of resources, allowing bounded approximation can provide disproportionate gains in performance and energy while still achieving acceptable result accuracy. For example, in the k-means clustering algorithm, allowing only a 5% loss in classification accuracy can provide 50 times the energy saving of fully accurate classification.

An AI accelerator, deep learning processor, or neural processing unit (NPU) is a class of specialized hardware accelerator or computer system designed to accelerate artificial intelligence and machine learning applications, including artificial neural networks and machine vision. Typical applications include algorithms for robotics, Internet of Things, and other data-intensive or sensor-driven tasks. They are often manycore designs and generally focus on low-precision arithmetic, novel dataflow architectures or in-memory computing capability. As of 2024, a typical AI integrated circuit chip contains tens of billions of MOSFETs.

PyTorch is a machine learning library based on the Torch library, used for applications such as computer vision and natural language processing, originally developed by Meta AI and now part of the Linux Foundation umbrella. It is recognized as one of the two most popular machine learning libraries alongside TensorFlow, offering free and open-source software released under the modified BSD license. Although the Python interface is more polished and the primary focus of development, PyTorch also has a C++ interface.
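
As a brief illustration of the Python interface described above, the following minimal example builds and runs a small feedforward network with PyTorch's nn module; the layer sizes and batch size are arbitrary choices for demonstration.

    import torch
    from torch import nn

    # A small two-layer network assembled from standard PyTorch modules.
    model = nn.Sequential(
        nn.Linear(8, 16),   # 8 input features -> 16 hidden units
        nn.ReLU(),
        nn.Linear(16, 2),   # 16 hidden units -> 2 output scores
    )

    x = torch.randn(4, 8)   # a batch of 4 examples with 8 features each
    out = model(x)
    print(out.shape)        # torch.Size([4, 2])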

Neural architecture search (NAS) is a technique for automating the design of artificial neural networks (ANN), a widely used model in the field of machine learning. NAS has been used to design networks that are on par with or outperform hand-designed architectures. Methods for NAS can be categorized according to the search space, search strategy, and performance estimation strategy used.

In computer vision, SqueezeNet is the name of a deep neural network for image classification that was released in 2016. SqueezeNet was developed by researchers at DeepScale, University of California, Berkeley, and Stanford University. In designing SqueezeNet, the authors' goal was to create a smaller neural network with fewer parameters while achieving competitive accuracy.
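
Much of the parameter saving described above comes from SqueezeNet's "Fire" module, which first squeezes the channel count with 1 × 1 convolutions and then expands with a mix of 1 × 1 and 3 × 3 convolutions. Below is a simplified PyTorch sketch of such a module; the channel counts are illustrative rather than an exact reproduction of the published architecture.

    import torch
    from torch import nn

    class Fire(nn.Module):
        """Simplified SqueezeNet-style Fire module: a 1x1 "squeeze" convolution
        followed by parallel 1x1 and 3x3 "expand" convolutions whose outputs
        are concatenated along the channel dimension."""
        def __init__(self, in_ch, squeeze_ch, expand_ch):
            super().__init__()
            self.squeeze = nn.Conv2d(in_ch, squeeze_ch, kernel_size=1)
            self.expand1x1 = nn.Conv2d(squeeze_ch, expand_ch, kernel_size=1)
            self.expand3x3 = nn.Conv2d(squeeze_ch, expand_ch, kernel_size=3, padding=1)
            self.relu = nn.ReLU(inplace=True)

        def forward(self, x):
            x = self.relu(self.squeeze(x))
            return torch.cat(
                [self.relu(self.expand1x1(x)), self.relu(self.expand3x3(x))], dim=1
            )

    # Example: a 96-channel feature map squeezed to 16 channels, expanded to 64 + 64.
    y = Fire(96, 16, 64)(torch.randn(1, 96, 56, 56))
    print(y.shape)  # torch.Size([1, 128, 56, 56])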

DeepScale, Inc. was an American technology company headquartered in Mountain View, California, that developed perceptual system technologies for automated vehicles. On October 1, 2019, the company was acquired by Tesla, Inc.

Forrest N. Iandola is an American computer scientist specializing in efficient AI.

References

  1. "Kurt Keutzer Ph.D.: Executive Profile & Biography". Bloomberg. Retrieved 2019-07-21.
  2. Keutzer, Kurt. "Faculty Page". UC Berkeley. Retrieved 2019-07-21.
  3. Keutzer, Kurt (1984). From paracomputer to ultracomputer (PhD thesis). Indiana University, Bloomington. OSTI 5401984.
  4. "Announcing AI@The House". The House. 2018-01-17. Retrieved 2019-07-21.
  5. Iandola, Forrest N.; Ashraf, Khalid; Moskewicz, Matthew W.; Keutzer, Kurt (2015). "FireCaffe: Near-linear acceleration of deep neural network training on compute clusters". arXiv:1511.00175 [cs.CV].
  6. "Supercomputing speeds up deep learning training". Tech Xplore. 2017-11-13. Retrieved 2019-07-21.
  7. James, Mike (2017-09-21). "ImageNet Training Record - 24 Minutes". i-programmer. Retrieved 2019-07-21.
  8. You, Yang; et al. (2017-09-14). "ImageNet Training in Minutes". arXiv:1709.05011 [cs.CV].
  9. Niedermeyer, Edward (2019-10-01). "Tesla Beefs Up Autonomy Effort With DeepScale Acqui-Hire". The Drive. Retrieved 2019-10-10.
  10. Iandola, Forrest (2018-03-23). "DeepScale's plan to put Deep Learning in mass-produced cars". DeepScale Blog. Retrieved 2019-10-10.
  11. Yoshida, Junko (2018-01-09). "Visteon Works with DNN Vanguard DeepScale". EE Times. Retrieved 2018-04-07.
  12. Kolodny, Lora (2019-10-01). "Tesla is buying computer vision start-up DeepScale in a quest to create truly driverless cars". CNBC. Retrieved 2019-10-10.
  13. "33rd DAC Awards" (PDF). DAC. 1996. Retrieved 2019-10-10.
  14. "DAC Most Influential Paper (MIP) Award". DAC. 1987. Retrieved 2019-10-10.
  15. Hill, Dwight; Shugard, Don; Fishburn, John; Keutzer, Kurt (1988-11-30). Algorithms and Techniques for VLSI Layout Synthesis. Springer. ISBN 0-89838-301-3.
  16. "Logic Synthesis". Amazon. Retrieved 2019-07-21.
  17. Chinnery, David; Keutzer, Kurt (2002-06-30). Closing the Gap Between ASIC & Custom: Tools and Techniques for High-Performance ASIC Design. Springer. ISBN 1-4020-7113-2.
  18. Goering, Richard (2002-05-28). "Closing the custom gap". EE Times. Retrieved 2019-07-21.
  19. Closing the Power Gap between ASIC & Custom: Tools and Techniques for Low Power Design, 2007 Edition. Springer. ISBN 0-387-25763-2.
  20. Chen, Pinhong; Kirkpatrick, Desmond A.; Keutzer, Kurt (2004-06-07). Static Crosstalk-Noise Analysis: For Deep Sub-Micron Digital Designs. Springer. ISBN 1-4020-8091-3.
  21. Gries, Matthias; Keutzer, Kurt (2005-06-28). Building ASIPs: The Mescal Methodology. Springer. ISBN 0-387-26057-9.