Data processing unit

[Image: SolidRun's SolidNet OCP-8K SmartNIC]

A data processing unit (DPU) is a programmable computer processor that tightly integrates a general-purpose CPU with network interface hardware.[1] They are sometimes called "IPUs" ("infrastructure processing units") or "SmartNICs".[2] They can be used in place of traditional NICs to relieve the main CPU of complex networking responsibilities and other "infrastructural" duties; although their features vary, they may be used to perform encryption/decryption, serve as a firewall, handle TCP/IP, process HTTP requests, or even function as a hypervisor or storage controller.[1][3] These devices can be attractive to cloud computing providers whose servers might otherwise spend a significant amount of CPU time on these tasks, cutting into the cycles they can provide to guests.[1]
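
As a rough illustration of that division of labour, the toy sketch below models a DPU-style card performing firewall filtering and decryption before the host application ever touches a packet. All class and field names are invented for the example and do not correspond to any vendor's API.

```python
# Illustrative sketch only: a toy model of how infrastructural work a host CPU
# would otherwise perform (firewall filtering, decryption) is delegated to a
# DPU-like device before the application sees the traffic. Names are invented.
from dataclasses import dataclass

@dataclass
class Packet:
    src_ip: str
    payload: bytes
    encrypted: bool = False

class ToyDPU:
    """Stands in for the DPU's on-card CPU and accelerators."""
    def __init__(self, blocked_ips: set[str], key: int):
        self.blocked_ips = blocked_ips
        self.key = key  # toy XOR "key"; real DPUs use dedicated crypto engines

    def process(self, pkt: Packet) -> Packet | None:
        if pkt.src_ip in self.blocked_ips:        # firewall rule enforced on the card
            return None
        if pkt.encrypted:                          # decryption handled on the card
            pkt.payload = bytes(b ^ self.key for b in pkt.payload)
            pkt.encrypted = False
        return pkt

def application(pkt: Packet) -> None:
    # The host CPU runs only application logic; the infrastructure work was offloaded.
    print(f"app received {pkt.payload!r} from {pkt.src_ip}")

dpu = ToyDPU(blocked_ips={"10.0.0.99"}, key=0x5A)
traffic = [
    Packet("10.0.0.99", b"drop me"),
    Packet("10.0.0.7", bytes(b ^ 0x5A for b in b"hello"), encrypted=True),
]
for incoming in traffic:
    cleared = dpu.process(incoming)
    if cleared is not None:
        application(cleared)
```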

Examples of DPUs

Azure Boost DPU

In 2024, Microsoft introduced the Azure Boost DPU, a custom-designed data processing unit aimed at optimizing network and infrastructure efficiency across its Azure cloud platform. This DPU offloads network-related tasks such as packet processing, security enforcement, and traffic management from central CPUs, enabling better performance for application workloads.[4][5]

Key Features

  • Network Optimization: The Azure Boost DPU improves network throughput and reduces latency by handling packet processing on the card, offloading that work from the host CPUs.[6]
  • Security Capabilities: It integrates advanced isolation techniques to secure multi-tenant environments, protecting sensitive workloads.[5]
  • Hyperscale Adaptability: Designed for large-scale data centers, the DPU supports Azure’s hyperscale infrastructure, ensuring scalability for modern cloud applications.[4]

Industry Context

The Azure Boost DPU aligns with the trend of custom silicon development in hyperscale cloud environments. Similar to AWS’s Nitro System and NVIDIA’s BlueField DPUs, Microsoft’s DPU focuses on enhancing cloud efficiency while addressing rising energy and security demands.[5] This innovation positions Microsoft alongside other cloud leaders leveraging DPUs to optimize data center operations and provide cost-effective, high-performance solutions for customers.[6]

Impact on Cloud Computing

The introduction of DPUs like Azure Boost reflects a broader shift in the cloud computing industry toward offloading specific functions from general-purpose processors to specialized hardware. Microsoft’s Azure Boost DPU represents its strategy to reduce costs, enhance security, and achieve sustainability goals while improving infrastructure efficiency.[4][5]

Related Research Articles

TCP offload engine (TOE) is a technology used in some network interface cards (NICs) to offload processing of the entire TCP/IP stack to the network controller. It is primarily used with high-speed network interfaces, such as gigabit Ethernet and 10 Gigabit Ethernet, where the processing overhead of the network stack becomes significant. TOEs are often used as a way to reduce the overhead associated with Internet Protocol (IP) storage protocols such as iSCSI and Network File System (NFS).
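
On Linux, the partial offloads that ordinary NICs expose (checksumming, segmentation, receive offload) are reported through ethtool; the sketch below simply parses that report. It assumes a Linux host with ethtool installed and an interface named eth0, and it does not cover full TOE, which is vendor-specific.

```python
# Sketch: list which offload features a Linux NIC currently reports, by parsing
# `ethtool -k` output (assumes Linux, ethtool installed, an interface named "eth0").
import subprocess

def nic_offloads(iface: str = "eth0") -> dict[str, str]:
    # `ethtool -k IFACE` prints one "feature-name: on/off" line per offload.
    out = subprocess.run(["ethtool", "-k", iface],
                         capture_output=True, text=True, check=True).stdout
    features = {}
    for line in out.splitlines()[1:]:             # first line is a header
        if ":" in line:
            name, _, state = line.partition(":")
            features[name.strip()] = state.split("[")[0].strip()  # drop "[fixed]" notes
    return features

if __name__ == "__main__":
    for name, state in sorted(nic_offloads().items()):
        if "offload" in name or "segmentation" in name:
            print(f"{name}: {state}")
```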

Software multitenancy is a software architecture in which a single instance of software runs on a server and serves multiple tenants. Systems designed in such a manner are "shared". A tenant is a group of users who share common access, with specific privileges, to the software instance. With a multitenant architecture, a software application is designed to provide every tenant a dedicated share of the instance, including its data, configuration, user management, tenant-specific functionality, and non-functional properties. Multitenancy contrasts with multi-instance architectures, where separate software instances operate on behalf of different tenants.
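
A minimal sketch of the shared-instance idea, with an in-memory store standing in for a real database and invented names throughout: one running service keeps per-tenant data and configuration, and every operation is scoped by a tenant identifier.

```python
# Sketch of multitenancy: one service instance, per-tenant data and configuration,
# every request scoped by a tenant identifier. All names are illustrative.
from collections import defaultdict

class MultiTenantStore:
    """One shared service instance holding per-tenant data and configuration."""

    def __init__(self):
        self._rows = defaultdict(list)   # tenant_id -> that tenant's records
        self._config = {}                # tenant_id -> tenant-specific settings

    def configure(self, tenant_id: str, **settings) -> None:
        self._config[tenant_id] = settings

    def insert(self, tenant_id: str, record: dict) -> None:
        self._rows[tenant_id].append(record)

    def query(self, tenant_id: str) -> list:
        # A tenant can only ever read its own partition of the shared instance.
        return list(self._rows[tenant_id])

store = MultiTenantStore()                   # a single running instance...
store.configure("acme", theme="dark")        # ...serving multiple tenants
store.configure("globex", theme="light")
store.insert("acme", {"invoice": 1})
store.insert("globex", {"invoice": 7})
assert store.query("acme") == [{"invoice": 1}]   # no cross-tenant visibility
```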

Marvell Technology, Inc. is an American company, headquartered in Santa Clara, California, which develops and produces semiconductors and related technology. Founded in 1995, the company had more than 6,500 employees as of 2024, with over 10,000 patents worldwide, and an annual revenue of $5.5 billion for fiscal 2024.

Microsoft Azure SQL Database is a managed cloud database (PaaS) based on Microsoft SQL Server, provided as part of Microsoft Azure. The service handles database management functions for cloud-based Microsoft SQL Server databases, including upgrading, patching, backups, and monitoring, without user involvement.

Cloud computing is the on-demand availability of computer system resources, especially data storage and computing power, without direct active management by the user. Large clouds often have functions distributed over multiple locations, each of which is a data center. Cloud computing relies on sharing of resources to achieve coherence and typically uses a pay-as-you-go model, which can help in reducing capital expenses but may also lead to unexpected operating expenses for users.

Microsoft Azure, or just Azure, is the cloud computing platform developed by Microsoft. It offers management, access, and development of applications and services to individuals, companies, and governments through its global infrastructure. It provides a range of capabilities, including software as a service (SaaS), platform as a service (PaaS), and infrastructure as a service (IaaS). Microsoft Azure supports many programming languages, tools, and frameworks, including Microsoft-specific and third-party software and systems.

Dynamic Infrastructure is an information technology concept related to the design of data centers, whereby the underlying hardware and software can respond dynamically and more efficiently to changing levels of demand. In other words, data center assets such as storage and processing power can be provisioned to meet surges in users' needs. The concept has also been referred to as Infrastructure 2.0 and Next Generation Data Center.

Accton Technology Corporation is a Taiwanese company in the electronics industry that primarily engages in the development and manufacture of networking and communication solutions, as an original equipment manufacturer (OEM) or original design manufacturer (ODM) partner. Accton has manufacturing plants in Taiwan (Hsinchu), China (Shenzhen), and Vietnam, supported by research and development centers in Taiwan, Shanghai, and California. Its products include 100G, 400G, and 800G switches designed for data center applications, along with wireless devices and artificial intelligence acceleration hardware.

OpenNebula is an open-source cloud computing platform for managing heterogeneous data center, public cloud, and edge computing infrastructure resources. OpenNebula manages on-premises and remote virtual infrastructure to build private, public, or hybrid implementations of Infrastructure as a Service and multi-tenant Kubernetes deployments. The two primary uses of the OpenNebula platform are data center virtualization and cloud deployments based on the KVM hypervisor, LXD/LXC system containers, and AWS Firecracker microVMs. The platform is also capable of offering the cloud infrastructure necessary to operate a cloud on top of existing VMware infrastructure. In early June 2020, OpenNebula announced the release of a new Enterprise Edition for corporate users, along with a Community Edition. OpenNebula CE is free and open-source software, released under the Apache License version 2. OpenNebula CE comes with free access to patch releases containing critical bug fixes but without access to the regular EE maintenance releases. Upgrades to the latest minor/major version are available only to CE users with non-commercial deployments or with significant open-source contributions to the OpenNebula community. OpenNebula EE is distributed under a closed-source license and requires a commercial subscription.

Google Compute Engine (GCE) is the infrastructure as a service (IaaS) component of Google Cloud Platform which is built on the global infrastructure that runs Google's search engine, Gmail, YouTube and other services. Google Compute Engine enables users to launch virtual machines (VMs) on demand. VMs can be launched from the standard images or custom images created by users. Google Compute Engine can be accessed via the Developer Console, RESTful API or command-line interface (CLI).

Kubernetes is an open-source container orchestration system for automating software deployment, scaling, and management. Originally designed by Google, the project is now maintained by a worldwide community of contributors, and the trademark is held by the Cloud Native Computing Foundation.

Computation offloading is the transfer of resource-intensive computational tasks to a separate processor, such as a hardware accelerator, or to an external platform, such as a cluster, grid, or cloud. Offloading to a coprocessor can be used to accelerate applications such as image rendering and mathematical calculations. Offloading computing to an external platform over a network can provide computing power and overcome the hardware limitations of a device, such as limited computational power, storage, and energy.
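
The pattern can be shown in miniature with Python's standard library: the sketch below hands a CPU-bound task from the caller to a pool of worker processes, which stands in here for a coprocessor or remote cluster (the function and workload are invented for the example).

```python
# Sketch: offload a CPU-bound computation from the caller to worker processes.
# The process pool stands in for a coprocessor or remote platform.
from concurrent.futures import ProcessPoolExecutor
import math

def heavy_task(n: int) -> float:
    # Placeholder for expensive work (e.g. rendering or a numeric simulation).
    return sum(math.sqrt(i) for i in range(n))

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:                     # the "offload target"
        futures = [pool.submit(heavy_task, 2_000_000) for _ in range(4)]
        # The caller is free to do other work while the results are computed elsewhere.
        results = [f.result() for f in futures]
    print(f"received {len(results)} offloaded results")
```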

An AI accelerator, deep learning processor or neural processing unit (NPU) is a class of specialized hardware accelerator or computer system designed to accelerate artificial intelligence and machine learning applications, including artificial neural networks and computer vision. Typical applications include algorithms for robotics, Internet of Things, and other data-intensive or sensor-driven tasks. They are often manycore designs and generally focus on low-precision arithmetic, novel dataflow architectures or in-memory computing capability. As of 2024, a typical AI integrated circuit chip contains tens of billions of MOSFETs.
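
The low-precision arithmetic these chips favour can be illustrated without special hardware: the sketch below quantizes float32 vectors to int8, accumulates the dot product in int32, and rescales the result, trading a little accuracy for much cheaper multiply-accumulate operations. The scheme is illustrative and not any particular accelerator's format.

```python
# Sketch of int8 quantization: scale float vectors into int8, do the multiply-
# accumulate in integers, then rescale. Illustrative only, not a specific NPU format.
import numpy as np

def quantize(x: np.ndarray):
    # Map the value range onto signed 8-bit integers; keep the scale for later.
    scale = float(np.max(np.abs(x))) / 127.0 or 1.0
    return np.round(x / scale).astype(np.int8), scale

rng = np.random.default_rng(0)
a = rng.standard_normal(256).astype(np.float32)
b = rng.standard_normal(256).astype(np.float32)

qa, sa = quantize(a)
qb, sb = quantize(b)
acc = np.dot(qa.astype(np.int32), qb.astype(np.int32))   # cheap integer multiply-accumulate
approx = acc * sa * sb                                     # rescale back to a float result
print(f"float32 dot = {float(np.dot(a, b)):.3f}, int8 approximation = {float(approx):.3f}")
```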

Serverless computing is a cloud computing execution model in which the cloud provider allocates machine resources on demand, taking care of the servers on behalf of their customers. Serverless is a misnomer in the sense that servers are still used by cloud service providers to execute code for developers. However, developers of serverless applications are not concerned with capacity planning, configuration, management, maintenance, fault tolerance, or scaling of containers, virtual machines, or physical servers. When an app is not in use, there are no computing resources allocated to the app. Pricing is based on the actual amount of resources consumed by an application. It can be a form of utility computing.
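
In practice the developer typically supplies little more than a handler invoked once per event; the sketch below uses the AWS-Lambda-style (event, context) signature, with a small local loop standing in for the provider's invocation machinery.

```python
# Sketch: a serverless-style handler plus a toy "platform" that invokes it per event.
# The (event, context) shape follows the common AWS-Lambda-style convention.
import json

def handler(event: dict, context: object) -> dict:
    # All application logic lives in the handler; no server management anywhere.
    name = event.get("name", "world")
    return {"statusCode": 200, "body": json.dumps({"message": f"hello, {name}"})}

if __name__ == "__main__":
    # A toy driver: the cloud provider would invoke the handler once per request.
    for event in [{"name": "Azure"}, {}]:
        print(handler(event, context=None))
```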

Fungible Inc. is a technology company headquartered in Santa Clara, California. The company develops hardware and software to improve the performance, reliability and economics of data centers.

IBM Cloud is a set of cloud computing services for business offered by the information technology company IBM.

Ampere Computing LLC is an American fabless semiconductor company based in Santa Clara, California that develops processors for servers operating in large scale environments. It was founded in 2017 by Renée James.

Nvidia BlueField is a line of data processing units (DPUs) designed and produced by Nvidia. Initially developed by Mellanox Technologies, the BlueField IP was acquired by Nvidia in March 2019, when Nvidia acquired Mellanox Technologies for US$6.9 billion. The first Nvidia-produced BlueField cards, named BlueField-2, were shipped for review shortly after their announcement at VMworld 2019, and were officially launched at GTC 2020. Also launched at GTC 2020 was the Nvidia BlueField-2X, an Nvidia BlueField card with an Ampere-generation graphics processing unit (GPU) integrated onto the same card. BlueField-3 and BlueField-4 DPUs were first announced at GTC 2021, with tentative launch dates of 2022 and 2024 respectively.

The ARM Neoverse is a group of 64-bit ARM processor cores licensed by Arm Holdings. The cores are intended for datacenter, edge computing, and high-performance computing use. The group consists of ARM Neoverse V-Series, ARM Neoverse N-Series, and ARM Neoverse E-Series.

Cilium is a cloud-native technology for networking, observability, and security. It is built on the kernel technology eBPF, which was originally adopted for better networking performance and now underpins many additional features for different use cases. The core networking component has evolved from only providing a flat Layer 3 network for containers to including advanced networking features, such as BGP and service mesh, within a Kubernetes cluster, across multiple clusters, and connecting with the world outside Kubernetes. Hubble was created as the network observability component, and Tetragon was later added for security observability and runtime enforcement. Cilium runs on Linux and is one of the first eBPF applications to be ported to Microsoft Windows through the eBPF on Windows project.
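
The kind of kernel-side programmability this builds on can be seen with a much smaller eBPF program; the sketch below uses the separate bcc toolkit rather than Cilium itself to attach a probe to the clone() syscall and print a line whenever it fires. It assumes a Linux host with root privileges and the bcc Python bindings installed.

```python
# Sketch: a minimal eBPF program loaded with the bcc toolkit (not part of Cilium).
# Requires Linux, root privileges, and the bcc Python bindings.
from bcc import BPF

program = r"""
int trace_clone(void *ctx) {
    bpf_trace_printk("process created\n");   // runs in the kernel on every hit
    return 0;
}
"""

b = BPF(text=program)
# Attach the probe to the architecture-specific symbol for the clone() syscall.
b.attach_kprobe(event=b.get_syscall_fnname("clone"), fn_name="trace_clone")
print("Tracing clone() ... Ctrl-C to stop")
b.trace_print()   # stream the lines written by bpf_trace_printk
```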

References

  1. Davie, Bruce (November 24, 2021). "SmartNICs, IPUs, DPUs de-hyped: Why and how cloud giants are offloading work from server CPUs". The Register. Retrieved 2023-07-11.
  2. Sharwood, Simon (May 23, 2023). "Google Cloud upgrades with next-gen accelerator that embiggens its VMs". The Register. Retrieved 2023-07-11. …Infrastructure Processing Unit – the same kind of kit that others call SmartNICs or Data Processing Units…
  3. "Definition of SmartNIC". PCMag. Ziff Davis. Retrieved 2023-07-11.
  4. "Enhancing Infrastructure Efficiency with Azure Boost DPU". Microsoft Tech Community. November 19, 2024. Retrieved November 19, 2024.
  5. "Microsoft debuts custom chips to boost data center security and power efficiency". VentureBeat. November 19, 2024. Retrieved November 19, 2024.
  6. "New in-house chips round out Microsoft's portfolio". TechCrunch. November 19, 2024. Retrieved November 19, 2024.