Kurt Keutzer | |
---|---|
Born | Kurt William Keutzer, November 9, 1955 |
Nationality | American |
Alma mater | Maharishi University of Management (B.S.); Indiana University Bloomington (M.S., PhD) [1] |
Awards | Fellow, IEEE |
Scientific career | |
Fields | Computer science |
Institutions | Bell Labs, Synopsys, University of California, Berkeley, DeepScale |
Kurt Keutzer (born November 9, 1955) is an American computer scientist.
Kurt Keutzer grew up in Indianapolis, Indiana.[citation needed] He earned a bachelor's degree in mathematics from Maharishi University of Management (formerly Maharishi International University) in 1978, [2] and a PhD in computer science from Indiana University in 1984. [3]
Keutzer joined Bell Labs in 1984, where he worked on logic synthesis. [2] In 1991, he joined the electronic design automation company Synopsys, where he was promoted to chief technology officer. [2] [4] He subsequently joined the University of California, Berkeley as a professor in 1998. [2]
His research at Berkeley has focused on the intersection of high-performance computing and machine learning. Working with graduate students, Keutzer developed FireCaffe, which scaled the training of deep neural networks to more than 100 GPUs; later, using the LARS and LAMB optimizers, the group scaled training to more than 1,000 servers. [5] [6] [7] [8] Keutzer and his students also developed deep neural networks such as SqueezeNet, SqueezeDet, and SqueezeSeg, which can run efficiently on mobile devices. [9]
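The LARS and LAMB optimizers mentioned above make very large batches usable by scaling each layer's learning rate with a per-layer "trust ratio". The sketch below illustrates that layer-wise scaling idea in NumPy; the function name, constants, and single-layer update are illustrative assumptions, not the published algorithms' exact form.

```python
import numpy as np

def layerwise_trust_ratio(weights, grads, eta=0.001, weight_decay=1e-4, eps=1e-9):
    """Schematic LARS-style scaling: layers whose gradients are small relative
    to their weight norm still receive reasonably sized update steps."""
    w_norm = np.linalg.norm(weights)
    g_norm = np.linalg.norm(grads + weight_decay * weights)
    return eta * w_norm / (g_norm + eps)

# Illustrative use: one gradient-descent step for a single layer.
rng = np.random.default_rng(0)
w = rng.normal(size=(256, 128))
g = rng.normal(size=(256, 128)) * 1e-3          # unusually small gradient
base_lr = 0.1
w -= base_lr * layerwise_trust_ratio(w, g) * g  # scaled update
```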
Keutzer co-founded DeepScale with his PhD student Forrest Iandola in 2015 and served as the company's chief strategy officer. [10] The firm focused on developing deep neural networks for advanced driver-assistance systems in passenger cars. [11]
On October 1, 2019, electric vehicle manufacturer Tesla, Inc. purchased DeepScale to augment and accelerate its self-driving vehicle work. [12]
Keutzer was named a Fellow of the IEEE in 1996. [13]
He received the Design Automation Conference (DAC) Most Influential Paper (MIP) award for his publication "Dagon: Technology Binding and Local Optimization by DAG Matching", presented at the 24th DAC in 1987. [14]
In machine learning, a neural network is a model inspired by the structure and function of biological neural networks in animal brains.
An application-specific integrated circuit (ASIC) is an integrated circuit (IC) chip customized for a particular use, such as running a digital voice recorder or a high-efficiency video codec, rather than for general-purpose use. Application-specific standard product chips are intermediate between ASICs and industry-standard integrated circuits like the 7400 series or the 4000 series. ASICs are typically fabricated as metal–oxide–semiconductor (MOS) integrated circuit chips.
A mixed-signal integrated circuit is any integrated circuit that has both analog circuits and digital circuits on a single semiconductor die. Their usage has grown dramatically with the increased use of cell phones, telecommunications, portable electronics, and automobiles with electronics and digital sensors.
A recurrent neural network (RNN) is one of the two broad types of artificial neural network, characterized by the direction of the flow of information between its layers. In contrast to the uni-directional feedforward neural network, it is a bi-directional artificial neural network, meaning that it allows the output from some nodes to affect subsequent input to the same nodes. The ability to use internal state (memory) to process arbitrary sequences of inputs makes RNNs applicable to tasks such as unsegmented, connected handwriting recognition or speech recognition. The term "recurrent neural network" refers to the class of networks with an infinite impulse response, whereas "convolutional neural network" refers to the class with a finite impulse response. Both classes of networks exhibit temporal dynamic behavior. A finite-impulse recurrent network is a directed acyclic graph that can be unrolled and replaced with a strictly feedforward neural network, while an infinite-impulse recurrent network is a directed cyclic graph that cannot be unrolled.
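As a minimal illustration of the internal state described above, the following NumPy sketch unrolls a simple recurrent cell over a sequence, with the hidden state h carrying information from one step to the next. The weight shapes and the tanh nonlinearity are illustrative assumptions, not a specific published architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
input_size, hidden_size, seq_len = 4, 8, 5

# Illustrative parameters of a simple recurrent cell.
W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
b_h = np.zeros(hidden_size)

xs = rng.normal(size=(seq_len, input_size))  # input sequence
h = np.zeros(hidden_size)                    # internal state (memory)

for x_t in xs:
    # The same weights are reused at every step; h feeds back into itself.
    h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)

print(h.shape)  # (8,) -- the final state summarizes the whole sequence
```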
Hardware acceleration is the use of computer hardware designed to perform specific functions more efficiently than software running on a general-purpose central processing unit (CPU). Any transformation of data that can be calculated in software running on a generic CPU can also be calculated in custom-made hardware, or in some mix of both.
Giovanni De Micheli is a research scientist in electronics and computer science. He is credited with the invention of the Network on a Chip design automation paradigm and with the creation of algorithms and design tools for Electronic Design Automation (EDA). He is Professor and Director of the Integrated Systems Laboratory at École Polytechnique Fédérale de Lausanne (EPFL), Switzerland. Previously, he was Professor of Electrical Engineering at Stanford University. He was Director of the Electrical Engineering Institute at EPFL from 2008 to 2019 and program leader of the Swiss federal Nano-Tera.ch program. He holds a Nuclear Engineer degree, and an M.S. and a Ph.D. in Electrical Engineering and Computer Science earned under Alberto Sangiovanni-Vincentelli.
An application-specific instruction set processor (ASIP) is a component used in system on a chip design. The instruction set architecture of an ASIP is tailored to benefit a specific application. This specialization of the core provides a tradeoff between the flexibility of a general purpose central processing unit (CPU) and the performance of an application-specific integrated circuit (ASIC).
ESS Technology Incorporated is a private manufacturer of computer multimedia products, audio DACs, and ADCs, based in Fremont, California, with R&D centers in Kelowna, British Columbia, Canada, and Beijing, China. It was founded by Forrest Mozer in 1983. Robert L. Blair is the CEO and President of the company.
A physical neural network is a type of artificial neural network in which an electrically adjustable material is used to emulate the function of a neural synapse or a higher-order (dendritic) neuron model. "Physical" neural network is used to emphasize the reliance on physical hardware used to emulate neurons as opposed to software-based approaches. More generally the term is applicable to other artificial neural networks in which a memristor or other electrically adjustable resistance material is used to emulate a neural synapse.
Deep learning is a subset of machine learning methods based on neural networks with representation learning. The adjective "deep" refers to the use of multiple layers in the network. The methods used can be supervised, semi-supervised, or unsupervised.
NanGate, Inc. was a privately held, Silicon Valley–based United States company dealing in Electronic Design Automation (EDA) for electrical engineering and electronics until its acquisition by Silvaco, Inc. in 2018. NanGate was founded in October 2004 by a group of semiconductor professionals with backgrounds at Intel Corporation and Vitesse Semiconductor Corp. The company received capital investments from a group of Danish business angels and venture capital companies, and was owned and controlled by its management following a management buyout in 2012. NanGate marketed a range of software products and design services for the design and optimization of standard cell libraries and application-specific integrated circuits, with a market focus on standard cell library design and optimization for 14–28 nanometer CMOS processes.
A convolutional neural network (CNN) is a regularized type of feed-forward neural network that learns features by itself via filter optimization. Vanishing and exploding gradients, seen during backpropagation in earlier neural networks, are prevented by using regularized weights over fewer connections. For example, a single neuron in a fully connected layer would require 10,000 weights to process an image of 100 × 100 pixels, whereas a convolution kernel that processes 5 × 5 tiles requires only 25 learnable weights, shared across all positions in the image. Higher-layer features are extracted from wider context windows, compared to lower-layer features.
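The weight-sharing arithmetic above can be made concrete with a small sketch: a fully connected neuron over a 100 × 100 image needs one weight per pixel, while a 5 × 5 convolution kernel reuses the same 25 weights at every position. The image, kernel values, and naive convolution loop below are arbitrary illustrations.

```python
import numpy as np

image = np.random.rand(100, 100)

# Fully connected: one weight per input pixel for a single neuron.
fc_weights_per_neuron = image.size            # 10,000

# Convolution: one 5x5 kernel shared across all positions.
kernel = np.random.rand(5, 5)
conv_weights = kernel.size                    # 25

# Naive "valid" convolution to show the same kernel reused everywhere.
out = np.zeros((96, 96))
for i in range(96):
    for j in range(96):
        out[i, j] = np.sum(image[i:i + 5, j:j + 5] * kernel)

print(fc_weights_per_neuron, conv_weights)    # 10000 25
```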
Google Brain was a deep learning artificial intelligence research team under the umbrella of Google AI, a research division at Google dedicated to artificial intelligence. Formed in 2011, it combined open-ended machine learning research with information systems and large-scale computing resources. It created tools such as TensorFlow, which allow neural networks to be used by the public, and multiple internal AI research projects, and aimed to create research opportunities in machine learning and natural language processing. It was merged into former Google sister company DeepMind to form Google DeepMind in April 2023.
Approximate computing is an emerging paradigm for energy-efficient and/or high-performance design. It includes a plethora of computation techniques that return a possibly inaccurate result rather than a guaranteed accurate result, and that can be used for applications where an approximate result is sufficient for the purpose. One example of such a situation is a search engine, where no exact answer may exist for a certain query and hence many answers may be acceptable. Similarly, occasional dropping of some frames in a video application can go undetected due to the perceptual limitations of humans. Approximate computing is based on the observation that in many scenarios, although performing exact computation requires a large amount of resources, allowing bounded approximation can provide disproportionate gains in performance and energy while still achieving acceptable result accuracy. For example, in the k-means clustering algorithm, allowing only a 5% loss in classification accuracy can provide a 50-times energy saving compared to fully accurate classification.
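A simple way to see the accuracy-for-efficiency trade-off is sampling (one form of the broader idea): computing a statistic over a subset of the data instead of all of it. The sketch below is a generic illustration of this trade-off, not a specific published technique.

```python
import random

data = [random.gauss(0.0, 1.0) for _ in range(1_000_000)]

# Exact computation: touch every element.
exact_mean = sum(data) / len(data)

# Approximate computation: sample 5% of the elements (about 20x less work).
sample = random.sample(data, k=len(data) // 20)
approx_mean = sum(sample) / len(sample)

# The approximate answer is usually within a small tolerance of the exact one.
print(abs(exact_mean - approx_mean))
```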
An AI accelerator, deep learning processor, or neural processing unit (NPU) is a class of specialized hardware accelerator or computer system designed to accelerate artificial intelligence and machine learning applications, including artificial neural networks and machine vision. Typical applications include algorithms for robotics, Internet of Things, and other data-intensive or sensor-driven tasks. They are often manycore designs and generally focus on low-precision arithmetic, novel dataflow architectures or in-memory computing capability. As of 2024, a typical AI integrated circuit chip contains tens of billions of MOSFETs.
PyTorch is a machine learning library based on the Torch library, used for applications such as computer vision and natural language processing, originally developed by Meta AI and now part of the Linux Foundation umbrella. It is recognized as one of the two most popular machine learning libraries alongside TensorFlow, offering free and open-source software released under the modified BSD license. Although the Python interface is more polished and the primary focus of development, PyTorch also has a C++ interface.
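A minimal example of the PyTorch Python interface, assuming the torch package is installed: it builds a small linear model, runs a forward pass, and uses automatic differentiation to compute gradients.

```python
import torch

model = torch.nn.Linear(4, 2)          # small linear layer
x = torch.randn(3, 4)                  # batch of 3 inputs
y = model(x)                           # forward pass
loss = y.pow(2).mean()                 # toy scalar objective
loss.backward()                        # autograd fills parameter .grad fields

print(model.weight.grad.shape)         # torch.Size([2, 4])
```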
Neural architecture search (NAS) is a technique for automating the design of artificial neural networks (ANN), a widely used model in the field of machine learning. NAS has been used to design networks that are on par with or outperform hand-designed architectures. Methods for NAS can be categorized according to the search space, the search strategy, and the performance estimation strategy used; a schematic example of the simplest search strategy appears below.
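The sketch below illustrates random search, the simplest search strategy: it samples architectures from a tiny hand-written search space and keeps the best one according to a stand-in evaluation function. The search space, the evaluate_architecture proxy, and the scoring are all hypothetical placeholders.

```python
import random

# Hypothetical search space: a few discrete architectural choices.
search_space = {
    "num_layers": [2, 4, 8],
    "width": [64, 128, 256],
    "kernel_size": [3, 5, 7],
}

def evaluate_architecture(arch):
    """Stand-in for training and validating a model (hypothetical proxy score)."""
    return random.random() - 0.001 * arch["num_layers"] * arch["width"]

best_arch, best_score = None, float("-inf")
for _ in range(20):                       # random search: sample and evaluate
    arch = {name: random.choice(opts) for name, opts in search_space.items()}
    score = evaluate_architecture(arch)
    if score > best_score:
        best_arch, best_score = arch, score

print(best_arch, best_score)
```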
In computer vision, SqueezeNet is the name of a deep neural network for image classification that was released in 2016. SqueezeNet was developed by researchers at DeepScale, University of California, Berkeley, and Stanford University. In designing SqueezeNet, the authors' goal was to create a smaller neural network with fewer parameters while achieving competitive accuracy.
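For illustration, a pretrained SqueezeNet can be loaded through the torchvision library (assuming torchvision, in a recent version, is installed) and its parameter count compared with a larger classifier such as ResNet-50; the rough parameter figures in the comment are approximate.

```python
import torch
from torchvision import models

squeezenet = models.squeezenet1_1(weights=models.SqueezeNet1_1_Weights.DEFAULT)
resnet = models.resnet50(weights=None)   # weights not needed just to count parameters

def count_params(net: torch.nn.Module) -> int:
    return sum(p.numel() for p in net.parameters())

# SqueezeNet 1.1 has roughly 1.2M parameters versus roughly 25M for ResNet-50.
print(count_params(squeezenet), count_params(resnet))
```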
DeepScale, Inc. was an American technology company headquartered in Mountain View, California, that developed perceptual system technologies for automated vehicles. On October 1, 2019, the company was acquired by Tesla, Inc.
Forrest N. Iandola is an American computer scientist specializing in efficient AI.