The Nvidia Drive PX is a series of computers aimed at providing autonomous car and driver assistance functionality powered by deep learning. [1] The platform was introduced at the Consumer Electronics Show (CES) in Las Vegas in January 2015. [2] An enhanced version, the Drive PX 2, was introduced at CES a year later, in January 2016. [3]
Nvidia Corporation is an American technology company incorporated in Delaware and based in Santa Clara, California. It designs graphics processing units (GPUs) for the gaming and professional markets, as well as system on a chip units (SoCs) for the mobile computing and automotive market. Its primary GPU product line, labeled "GeForce", is in direct competition with Advanced Micro Devices' (AMD) "Radeon" products. Nvidia expanded its presence in the gaming industry with its handheld Shield Portable, Shield Tablet and Shield Android TV.
Deep learning is part of a broader family of machine learning methods based on learning data representations, as opposed to task-specific algorithms. Learning can be supervised, semi-supervised or unsupervised.
CES is an annual trade show organized by the Consumer Technology Association (CTA). Held in January at the Las Vegas Convention Center in Las Vegas, Nevada, United States, the event typically hosts presentations of new products and technologies in the consumer electronics industry.
The first of Nvidia's autonomous chips was announced at CES 2015, based on the Maxwell GPU microarchitecture. [4] The line-up consisted of two platforms: Drive CX for digital cockpits and Drive PX for (semi-)autonomous driving.
Maxwell is the codename for a GPU microarchitecture developed by Nvidia as the successor to the Kepler microarchitecture. The Maxwell architecture was introduced in later models of the GeForce 700 series and is also used in the GeForce 800M series, GeForce 900 series, and Quadro Mxxx series, all manufactured in 28 nm.
The Drive CX was based on a single Tegra X1 SoC (System on a Chip) and was marketed as a digital cockpit computer, providing a rich dashboard, navigation and multimedia experience. Early Nvidia press releases reported that the Drive CX board would be capable of carrying either a Tegra K1 or a Tegra X1. [5]
A system on a chip or system on chip is an integrated circuit that integrates all components of a computer or other electronic system. These components typically include a central processing unit (CPU), memory, input/output ports and secondary storage – all on a single substrate. It may contain digital, analog, mixed-signal, and often radio frequency signal processing functions, depending on the application. As they are integrated on a single substrate, SoCs consume much less power and take up much less area than multi-chip designs with equivalent functionality. Because of this, SoCs are very common in the mobile computing and edge computing markets. Systems on chip are commonly used in embedded systems and the Internet of Things.
The first version of Drive PX is based on two Tegra X1 SoCs. It is targeted at (semi-)autonomous cars and has been adopted by Toyota. [6]
Toyota Motor Corporation is a Japanese multinational automotive manufacturer headquartered in Toyota City, Aichi, Japan. In 2017, Toyota's corporate structure consisted of 364,445 employees worldwide and, as of September 2018, was the sixth-largest company in the world by revenue. As of 2017, Toyota is the world's second-largest automotive manufacturer. Toyota was the world's first automobile manufacturer to produce more than 10 million vehicles per year which it has done since 2012, when it also reported the production of its 200-millionth vehicle. As of July 2014, Toyota was the largest listed company in Japan by market capitalization and by revenue.
Drive PX platforms based on the Pascal GPU microarchitecture were first announced at CES 2016. [7] This time, only a new version of Drive PX was announced, albeit in multiple configurations.
Pascal is the codename for a GPU microarchitecture developed by Nvidia, as the successor to the Maxwell architecture. The architecture was first introduced in April 2016 with the release of the Tesla P100 (GP100) on April 5, 2016, and is primarily used in the GeForce 10 series, starting with the GeForce GTX 1080 and GTX 1070, which were released on May 17, 2016 and June 10, 2016 respectively. Pascal is manufactured using the 16 nm FinFET process.
The Nvidia Drive PX 2 is based on one or two Tegra X2 SoCs, where each SoC contains 2 Denver cores, 4 ARM Cortex-A57 cores and a GPU from the Pascal generation. [8] There are two real-world board configurations: AutoCruise, with a single SoC, and AutoChauffeur, with two.
Nvidia has further proposed achieving fully autonomous driving by combining multiple AutoChauffeur boards and connecting them via interfaces such as UART, CAN, LIN, FlexRay, USB, 1 Gbit Ethernet or 10 Gbit Ethernet. According to board block diagrams that can be found on the web, derived custom PCB designs additionally have the option of linking the Tegra X2 processors through a PCIe bus bridge.
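As a back-of-envelope illustration of the interconnect options listed above, the sketch below picks the slowest standard link whose nominal peak rate still meets a required inter-board data rate. The rates are the well-known nominal maxima of each standard (real-world throughput is lower), and the 2 MB/s example payload is purely hypothetical.

```python
# Nominal peak rates (bit/s) of the link types named in the text.
# These are standard maxima, not measured Drive PX 2 throughput.
LINK_RATES = {
    "LIN": 19_200,
    "UART (115.2 kbaud)": 115_200,
    "CAN": 1_000_000,
    "FlexRay": 10_000_000,
    "USB 2.0": 480_000_000,
    "1 Gbit Ethernet": 1_000_000_000,
    "10 Gbit Ethernet": 10_000_000_000,
}

def pick_link(required_bps: float) -> str:
    """Return the slowest link whose nominal rate covers the requirement."""
    for name, rate in sorted(LINK_RATES.items(), key=lambda kv: kv[1]):
        if rate >= required_bps:
            return name
    raise ValueError("no single listed link is fast enough")

# Hypothetical example: streaming 2 MB/s (16 Mbit/s) of fused object data
# between two AutoChauffeur boards.
print(pick_link(2 * 8 * 1_000_000))  # -> "USB 2.0"
```

In practice the choice would also weigh latency, determinism and automotive qualification (CAN and FlexRay are designed for in-vehicle control traffic), not raw bandwidth alone.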
All Tesla Motors vehicles manufactured from mid-October 2016 include a Drive PX 2, which is used for neural net processing to enable Enhanced Autopilot and full self-driving functionality. [9] Another application is Roborace. [10] Disassembling the Nvidia-based control unit from a recent Tesla car showed that Tesla was using a modified single-chip Drive PX 2 AutoCruise, with a GP106 GPU added as an MXM module. The chip markings strongly suggested that the CPU SoC was the Tegra X2 Parker. [11] [12]
Systems based on the Volta GPU microarchitecture were first announced at CES 2017. [13]
The first Volta-based Drive PX system was announced at CES 2017 as the Xavier AI Car Supercomputer. [13] It was presented again at CES 2018 as Drive PX Xavier. [14] [15] Initial reports of the Xavier SoC suggested a single chip with processing power similar to the Drive PX 2 AutoChauffeur system. [16] In 2017, however, the performance of the Xavier-based system was revised upward, to 50% greater than the Drive PX 2 AutoChauffeur system. [13] Drive PX Xavier is reported to deliver 30 INT8 TOPS of performance while consuming only 30 watts of power. [17] This is spread across two distinct units: the iGPU, rated at 20 INT8 TOPS as published early on, and the later-announced, newly introduced DLA, which provides an additional 10 INT8 TOPS.
In October 2017, Nvidia and partner development companies announced the Drive PX Pegasus system, based upon two Xavier CPU/GPU devices and two post-Volta (Turing) generation GPUs. The companies stated that the third-generation Drive PX system would be capable of Level 5 autonomous driving, with a total of 320 INT8 TOPS of AI computational power and a 500 W TDP. [18] [19]
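A quick sanity check of the published figures: Xavier's 30 INT8 TOPS (20 from the iGPU plus 10 from the DLA) at 30 W works out to 1 TOPS/W, while Pegasus's 320 INT8 TOPS at 500 W works out to 0.64 TOPS/W. The sketch below, using only the numbers stated in the text, makes that arithmetic explicit.

```python
# Efficiency comparison from the figures quoted above (a sketch, not
# official Nvidia data; TDP is a power ceiling, not measured draw).
def tops_per_watt(tops: float, watts: float) -> float:
    return tops / watts

xavier_tops = 20 + 10                        # iGPU + DLA contributions
xavier_eff = tops_per_watt(xavier_tops, 30)  # 30 TOPS / 30 W = 1.0
pegasus_eff = tops_per_watt(320, 500)        # 320 TOPS / 500 W = 0.64

print(f"Xavier:  {xavier_eff:.2f} INT8 TOPS/W")
print(f"Pegasus: {pegasus_eff:.2f} INT8 TOPS/W")
```

The lower headline efficiency of Pegasus is unsurprising: its 500 W budget also covers two discrete post-Volta GPUs and their memory, not just the two Xavier SoCs.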
| Nvidia provided reference board | Drive CX | Drive PX | Drive PX 2 (AutoCruise) | Drive PX 2 (Tesla) | Drive PX 2 (AutoChauffeur) | Drive PX 2 (Tesla 2.5) | Drive PX Xavier [15] | Drive PX Pegasus [20] |
|---|---|---|---|---|---|---|---|---|
| GPU microarchitecture | Maxwell (28 nm) | Maxwell (28 nm) | Pascal (16 nm) | Pascal (16 nm) | Pascal (16 nm) | Pascal (16 nm) | Volta (12 nm) | Volta (12 nm) |
| Introduced | January 2015 | January 2015 | September 2016 [21] | October 2016 [22] | January 2016 | August 2017 [23] | January 2017 | October 2017 |
| Computing | 1x Tegra X1 | 2x Tegra X1 | 1x Tegra X2 (Parker) + 1x Pascal GPU | 1x Tegra X2 (Parker) + 1x Pascal GPU | 2x Tegra X2 (Parker) + 2x Pascal GPU | 2x Tegra X2 (Parker) + 1x Pascal GPU [24] | 1x Tegra Xavier [25] | 2x Tegra Xavier |
| CPU | 4x Cortex-A57 + 4x Cortex-A53 | 8x Cortex-A57 + 8x Cortex-A53 | 2x Denver + 4x Cortex-A57 | 2x Denver + 4x Cortex-A57 | 4x Denver + 8x Cortex-A57 | 4x Denver + 8x Cortex-A57 | 8x Carmel ARM64 [25] | 16x Carmel ARM64 |
| GPU | 2 SMM Maxwell, 256 CUDA cores | 4 SMM Maxwell, 512 CUDA cores | 1x Parker GPGPU (1x 2 SM Pascal, 256 CUDA cores) | 1x Parker GPGPU (1x 2 SM Pascal, 256 CUDA cores, on an MXM slot [11]) | 2x Parker GPGPU (2x 2 SM Pascal, 512 CUDA cores) | 1x Parker GPGPU | 1x Volta iGPU (512 CUDA cores) [25] | 2x Volta iGPU (512 CUDA cores) + 2x Turing dGPU |
| Accelerator | | | | | | | 1x DLA [25] | 2x DLA |
| Memory | | | 8 GB LPDDR4 [27] | | 16 GB LPDDR4 [27] | | LPDDR4 [25] | |
| Storage | | | 64 GB eMMC [27] | | 128 GB eMMC [27] | | | |
| Performance | | | 4 FP32 TFLOPS | 4 FP32 TFLOPS | 16 FP16 TFLOPS / 8 FP32 TFLOPS | 4 FP32 TFLOPS | 20 INT8 TOPS, 1.3 FP32 TFLOPS (GPU); 10 INT8 TOPS, 5 FP16 TFLOPS (DLA) [25] | 320 INT8 TOPS (total) [30] |
| TDP | | 20 W [29] | 40 W (SoC portion: 10 W [21]) | 40 W (SoC portion: 10 W [21]) | 80 W [31] [32] [29] [33] (SoC portion: 20 W [21]) | 60 W [31] [32] [29] (SoC portion: 20 W [21]) | 30 W [25] | 500 W [30] |
Note: the dGPUs and memory are stand-alone semiconductor devices; all other components, in particular the ARM cores, iGPU and DLA, are integrated into the listed main computing device(s).
GeForce is a brand of graphics processing units (GPUs) designed by Nvidia. As of the GeForce 20 series, there have been fifteen iterations of the design. The first GeForce products were discrete GPUs designed for add-on graphics boards, intended for the high-margin PC gaming market, and later diversification of the product line covered all tiers of the PC graphics market, ranging from cost-sensitive GPUs integrated on motherboards, to mainstream add-in retail boards. Most recently, GeForce technology has been introduced into Nvidia's line of embedded application processors, designed for electronic handhelds and mobile handsets.
A graphics processing unit (GPU) is a specialized electronic circuit designed to rapidly manipulate and alter memory to accelerate the creation of images in a frame buffer intended for output to a display device. GPUs are used in embedded systems, mobile phones, personal computers, workstations, and game consoles. Modern GPUs are very efficient at manipulating computer graphics and image processing. Their highly parallel structure makes them more efficient than general-purpose CPUs for algorithms that process large blocks of data in parallel. In a personal computer, a GPU can be present on a video card or embedded on the motherboard. In certain CPUs, they are embedded on the CPU die.
Nvidia Quadro is Nvidia's brand for graphics cards intended for use in workstations running professional computer-aided design (CAD), computer-generated imagery (CGI), digital content creation (DCC) applications, scientific calculations and machine learning.
CUDA is a parallel computing platform and application programming interface (API) model created by Nvidia. It allows software developers and software engineers to use a CUDA-enabled graphics processing unit (GPU) for general purpose processing — an approach termed GPGPU. The CUDA platform is a software layer that gives direct access to the GPU's virtual instruction set and parallel computational elements, for the execution of compute kernels.
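For illustration only: real CUDA kernels are written in C/C++ and dispatched across thousands of GPU threads, but the programming model described above, one kernel body executed by many threads that each select their data element via a global thread index, can be mimicked in plain Python. The sketch below serializes the "threads" of a SAXPY kernel; the function and variable names are invented for the example.

```python
# Sketch of the CUDA execution model in plain Python (not real CUDA).
def saxpy_kernel(i, a, x, y, out):
    # Body that one CUDA thread would run for global index i
    # (i = blockIdx.x * blockDim.x + threadIdx.x in CUDA C).
    if i < len(x):           # bounds guard, as in real CUDA kernels
        out[i] = a * x[i] + y[i]

def launch(kernel, n_threads, *args):
    # On a GPU these iterations run in parallel across CUDA cores;
    # here we serialize them just to show the semantics.
    for i in range(n_threads):
        kernel(i, *args)

x, y = [1.0, 2.0, 3.0], [10.0, 20.0, 30.0]
out = [0.0] * 3
launch(saxpy_kernel, 4, 2.0, x, y, out)  # 4 "threads"; the extra one idles
print(out)  # [12.0, 24.0, 36.0]
```

The bounds guard matters because CUDA launches threads in fixed-size blocks, so the thread count usually exceeds the data size; surplus threads must do nothing.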
Tegra is a system on a chip (SoC) series developed by Nvidia for mobile devices such as smartphones, personal digital assistants, and mobile Internet devices. The Tegra integrates an ARM architecture central processing unit (CPU), graphics processing unit (GPU), northbridge, southbridge, and memory controller onto one package. Early Tegra SoCs are designed as efficient multimedia processors, while more recent models emphasize performance for gaming and machine learning applications, without sacrificing power efficiency.
Project Denver is the codename of a microarchitecture designed by Nvidia that implements the ARMv8-A 64/32-bit instruction sets using a combination of simple hardware decoder and software-based binary translation where "Denver's binary translation layer runs in software, at a lower level than the operating system, and stores commonly accessed, already optimized code sequences in a 128 MB cache stored in main memory". Denver is a very wide in-order superscalar pipeline. Its design makes it suitable for integration with other SIPs cores into one die constituting a system on a chip (SoC).
Nvidia Tesla is Nvidia's brand name for their products targeting stream processing or general-purpose graphics processing units (GPGPU). Products use GPUs from the G80 series onward. The underlying Tesla microarchitecture of the GPUs and the Tesla product line are named after pioneering electrical engineer Nikola Tesla.
NVLink is a wire-based communications protocol serial multi-lane near-range communication link developed by Nvidia. Unlike PCI Express, a device can consist of multiple NVLinks, and devices use mesh networking to communicate instead of a central Hub.
Volta, an Nvidia-developed GPU microarchitecture codename, succeeds Pascal. It was first announced as a future roadmap ambition in March 2013; however, the first product was not announced until May 2017. The architecture is named after Alessandro Volta. It was Nvidia's first chip to feature Tensor Cores, specially designed cores with superior deep learning performance over regular CUDA cores.
An AI accelerator is a class of microprocessor or computer system designed as hardware acceleration for artificial intelligence applications, especially artificial neural networks, machine vision and machine learning. Typical applications include algorithms for robotics, internet of things and other data-intensive or sensor-driven tasks. They are often manycore designs and generally focus on low-precision arithmetic, novel dataflow architectures or in-memory computing capability. A number of vendor-specific terms exist for devices in this category, and it is an emerging technology without a dominant design. AI accelerators can be found in many devices, such as smartphones, tablets, and computers, all around the world.
Nvidia DGX-1 is a line of Nvidia-produced servers and workstations that specialize in using GPGPU to accelerate deep learning applications. The servers feature 8 GPUs based on Pascal or Volta daughter cards with HBM2 memory, connected by an NVLink mesh network.
Tesla Autopilot, also known as Enhanced Autopilot after a second hardware version started to be shipped, is an advanced driver-assistance system feature offered by Tesla that provides lane centering, adaptive cruise control, self-parking, the ability to automatically change lanes with driver confirmation, and the ability to summon the car to and from a garage or parking spot.
Vibrante is the name of a Linux distribution created by Nvidia and used for at least their Drive PX 2 platform series. The name is listed as a registered trademark of Nvidia. It first appeared around 2010, when it labeled a fairly general multimedia engine covering audio, video and 3D display, developed in close cooperation with Audi. At Nvidia's TechDay in December 2015, the distribution was reported at version 3.0 for the Jetson TK1 Pro and Drive CX, and at version 4.0 for the Drive CX and PX platforms; the Jetson TK1 is mentioned as running the Linux4Tegra package instead. Companies such as Toradex have built and published sample application code on top of it. Abbreviations of Vibrante Linux such as V3L, V3Le or V4L, with the digit denoting the version, along with terms like L4T, appear assigned to certain devices in some history and release documents, e.g. for Nvidia VisionWorks. Nvidia's VisionWorks toolkit can run on top of Vibrante, Vibrante is one of the targets that OpenCV4Tegra can run upon, and the Nvidia PerfKit package also supports Vibrante.
Nvidia Jetson is a series of embedded computing boards from Nvidia. The Jetson TK1, TX1 and TX2 models all carry a Tegra processor from Nvidia that integrates an ARM architecture central processing unit (CPU). Jetson is a low-power system and is especially designed for accelerating machine learning applications.
Kar-go is an autonomous delivery vehicle, designed and built by the Academy of Robotics, a UK company registered in Wales. The vehicle uses self-drive/driverless car technology to drive itself to locations where it will deliver packages autonomously.
Sierra or ATS-2 is a supercomputer built for the Lawrence Livermore National Laboratory for use by the National Nuclear Security Administration as the second Advanced Technology System. It is primarily used for predictive applications in stockpile stewardship, helping to assure the safety, reliability and effectiveness of the United States' nuclear weapons.
AImotive is an autonomous vehicle technology company working towards Level 5 autonomous vehicles. The company describes its approach as "vision-first", a system that primarily relies on cameras and artificial intelligence to detect its surroundings. The technology is designed to be implemented by automobile manufacturers to create fully autonomous vehicles, which can operate in all conditions and locations. In September 2017, PSA Group teamed up with AImotive.
Turing is the codename for a graphics processing unit (GPU) microarchitecture developed by Nvidia as the successor to the Volta architecture. It is named after the prominent mathematician and computer scientist Alan Turing. The architecture was first introduced in August 2018 at SIGGRAPH 2018, along with professional workstation Quadro RTX products based on it, and one week later at Gamescom, along with consumer GeForce RTX products based on it. The architecture introduces the first consumer products capable of real-time ray tracing, a longstanding goal of the computer graphics industry. Key elements include dedicated artificial intelligence processors and dedicated ray-tracing processors. Turing leverages Microsoft's DXR, OptiX and Vulkan for access to ray tracing. In February 2019, Nvidia released the GeForce 16 series of GPUs, which utilizes the Turing design but, lacking the ray-tracing and artificial intelligence cores, provides a lower-cost alternative to the 20 series.