Torus interconnect

Last updated

A torus interconnect is a switch-less network topology for connecting processing nodes in a parallel computer system.

Contents

Diagram of a 3-dimensional torus interconnect. It is not limited to 8 nodes but can consist of any number of nodes in a similar rectilinear array. 2x2x2torus.svg
Diagram of a 3-dimensional torus interconnect. It is not limited to 8 nodes but can consist of any number of nodes in a similar rectilinear array.

Introduction

In geometry, a torus is created by revolving a circle about an axis coplanar to the circle. While this is a general definition in geometry, the topological properties of this type of shape describes the network topology in its essence.

Geometry illustration

The following images are 1D, and 2D torus. 1D torus is a simple circle, and 2D torus has the shape of a doughnut. The animation below illustrates how a 2D torus is generated from a rectangle by connecting its two pairs of opposite edges. Here the concept of torus is used to describe essentially the beginning and ending of a sequence of nodes are connected, like a doughnut. To better illustrate the concept, and understand what the topology means in network interconnect, we give 3 examples of parallel interconnected nodes using torus topology. At one dimension, a torus topology is equivalent to a ring interconnect network, of a shape of a circle. At 2D, it is equivalent to a 2D mesh, but with extra connection at the edge nodes, which is the definition of 2D torus.

Torus network topology

We can generalize the rule from the figures above. Torus interconnect is a switch-less topology that can be seen as a mesh interconnect with nodes arranged in a rectilinear array of N = 2, 3, or more dimensions, with processors connected to their nearest neighbors, and corresponding processors on opposite edges of the array connected.[1] In this lattice, each node has 2N connections. This topology got the name from the fact that the lattice formed in this way is topologically homogeneous to an N-dimensional torus.

Visualization

The first 3 dimensions of torus network topology are easier to visualize and are described below:

Higher-dimensional arrays are difficult to visualize but we can see from above rule that each higher dimension adds another pair of nearest neighbor connections to each node.

Performance

A number of supercomputers on the TOP500 list use three-dimensional torus networks, e.g. IBM's Blue Gene/L and Blue Gene/P, and the Cray XT3. [1] IBM's Blue Gene/Q uses a five-dimensional torus network. Fujitsu's K computer and the PRIMEHPC FX10 use a proprietary three-dimensional torus 3D mesh interconnect called Tofu. [2]

3D Torus performance simulation

Sandeep Palur and Dr. Ioan Raicu from Illinois Institute of Technology conducted experiments to simulate 3D torus performance. Their experiments ran on a computer with 250GB RAM, 48 cores and x86_64 architecture. The simulator they used was ROSS (Rensselaer’s Optimistic Simulation System). They mainly focused on three aspects:

They concluded that throughput decreases with the increase of servers and network size. Otherwise, throughput increases with the increase of message size. [3]

6D Torus product performance

Fujitsu Limited developed a 6D torus computer model called "Tofu". In their model, a 6D torus can achieve 100 GB/s off-chip bandwidth, 12 times higher scalability than a 3D torus, and high fault tolerance. The model is used in the K computer and Fugaku. [4]

Advantages and disadvantages

Advantages

Higher speed, lower latency
Because of the connection of opposite edges, data have more options to travel from one node to another which greatly increased speed.
Better fairness
In a 4×4 mesh interconnect, the longest distance between nodes is from upper left corner to lower right corner. Each datum takes 6 hops to travel the longest path. But in a 4×4 Torus interconnect, upper left corner can travel to lower right corner with only 2 hops
Lower energy consumption
Since data tend to travel fewer hops, the energy consumption tends to be lower.

Disadvantages

Complexity of wiring
Extra wires can make the routing process in the physical design phase more difficult. If we want to lay out more wires on chip, it is likely that we need to increase the number of metal layers or decrease density on chip, which is more expensive. Otherwise, the wires that connect opposite edges can be much longer than other wires. This inequality of link lengths can cause problems because of RC delay.
Cost
While long wrap-around links may be the easiest way to visualize the connection topology, in practice, restrictions on cable lengths often make long wrap-around links impractical. Instead, directly connected nodes—including nodes that the above visualization places on opposite edges of a grid, connected by a long wrap-around link—are physically placed nearly adjacent to each other in a folded torus network. [5] [6] Every link in the folded torus network is very short—almost as short as the nearest-neighbor links in a simple grid interconnect—and therefore low-latency. [7]

See also

Related Research Articles

Grid network

A grid network is a computer network consisting of a number of computer systems connected in a grid topology.

Network topology Arrangement of the elements of a communication network

Network topology is the arrangement of the elements of a communication network. Network topology can be used to define or describe the arrangement of various types of telecommunication networks, including command and control radio networks, industrial fieldbusses and computer networks.

Torus Doughnut-shaped surface of revolution

In geometry, a torus is a surface of revolution generated by revolving a circle in three-dimensional space about an axis that is coplanar with the circle.

Scalable Coherent Interface High-speed interconnect standard for shared memory multiprocessing and message passing

The Scalable Coherent Interface or Scalable Coherent Interconnect (SCI), is a high-speed interconnect standard for shared memory multiprocessing and message passing. The goal was to scale well, provide system-wide memory coherence and a simple interface; i.e. a standard to replace existing buses in multiprocessor systems with one with no inherent scalability and performance limitations.

Multiple instruction, multiple data Computing technique employed to achieve parallelism

In computing, multiple instruction, multiple data (MIMD) is a technique employed to achieve parallelism. Machines using MIMD have a number of processors that function asynchronously and independently. At any time, different processors may be executing different instructions on different pieces of data.

In parallel computer architectures, a systolic array is a homogeneous network of tightly coupled data processing units (DPUs) called cells or nodes. Each node or DPU independently computes a partial result as a function of the data received from its upstream neighbours, stores the result within itself and passes it downstream. Systolic arrays were first used in Colossus, which was an early computer used to break German Lorenz ciphers during World War II. Due to the classified nature of Colossus, they were independently invented or rediscovered by H. T. Kung and Charles Leiserson who described arrays for many dense linear algebra computations for banded matrices. Early applications include computing greatest common divisors of integers and polynomials. They are sometimes classified as multiple-instruction single-data (MISD) architectures under Flynn's taxonomy, but this classification is questionable because a strong argument can be made to distinguish systolic arrays from any of Flynn's four categories: SISD, SIMD, MISD, MIMD, as discussed later in this article.

Mesh networking Network with multiple links between nodes

A mesh network is a local area network topology in which the infrastructure nodes connect directly, dynamically and non-hierarchically to as many other nodes as possible and cooperate with one another to efficiently route data to and from clients.

Switched fabric or switching fabric is a network topology in which network nodes interconnect via one or more network switches. Because a switched fabric network spreads network traffic across multiple physical links, it yields higher total throughput than broadcast networks, such as the early 10BASE5 version of Ethernet and most wireless networks such as Wi-Fi.

Mesh generation Subdivision of space into cells

Mesh generation is the practice of creating a mesh, a subdivision of a continuous geometric space into discrete geometric and topological cells. Often these cells form a simplicial complex. Usually the cells partition the geometric input domain. Mesh cells are used as discrete local approximations of the larger domain. Meshes are created by computer algorithms, often with human guidance through a GUI, depending on the complexity of the domain and the type of mesh desired. A typical goal is to create a mesh that accurately captures the input domain geometry, with high-quality (well-shaped) cells, and without so many cells as to make subsequent calculations intractable. The mesh should also be fine in areas that are important for the subsequent calculations.

In distributed computing, leader election is the process of designating a single process as the organizer of some task distributed among several computers (nodes). Before the task has begun, all network nodes are either unaware which node will serve as the "leader" of the task, or unable to communicate with the current coordinator. After a leader election algorithm has been run, however, each node throughout the network recognizes a particular, unique node as the task leader.

Multistage interconnection networks (MINs) are a class of high-speed computer networks usually composed of processing elements (PEs) on one end of the network and memory elements (MEs) on the other end, connected by switching elements (SEs). The switching elements themselves are usually connected to each other in stages, hence the name.

The topology of an electronic circuit is the form taken by the network of interconnections of the circuit components. Different specific values or ratings of the components are regarded as being the same topology. Topology is not concerned with the physical layout of components in a circuit, nor with their positions on a circuit diagram; similarly to the mathematic concept of topology, it is only concerned with what connections exist between the components. There may be numerous physical layouts and circuit diagrams that all amount to the same topology.

An equivalent impedance is an equivalent circuit of an electrical network of impedance elements which presents the same impedance between all pairs of terminals as did the given network. This article describes mathematical transformations between some passive, linear impedance networks commonly found in electronic circuits.

QPACE is a massively parallel and scalable supercomputer designed for applications in lattice quantum chromodynamics.

K computer Supercomputer in Kobe, Japan

The K computer – named for the Japanese word/numeral "kei" (京), meaning 10 quadrillion (1016) – was a supercomputer manufactured by Fujitsu, installed at the Riken Advanced Institute for Computational Science campus in Kobe, Hyōgo Prefecture, Japan. The K computer was based on a distributed memory architecture with over 80,000 compute nodes. It was used for a variety of applications, including climate research, disaster prevention and medical research. The K computer's operating system was based on the Linux kernel, with additional drivers designed to make use of the computer's hardware.

In graph drawing and geometric graph theory, a Tutte embedding or barycentric embedding of a simple, 3-vertex-connected, planar graph is a crossing-free straight-line embedding with the properties that the outer face is a convex polygon and that each interior vertex is at the average of its neighbors' positions. If the outer polygon is fixed, this condition on the interior vertices determines their position uniquely as the solution to a system of linear equations. Solving the equations geometrically produces a planar embedding. Tutte's spring theorem, proven by W. T. Tutte (1963), states that this unique solution is always crossing-free, and more strongly that every face of the resulting planar embedding is convex. It is called the spring theorem because such an embedding can be found as the equilibrium position for a system of springs representing the edges of the graph.

QPACE2

QPACE 2 is a massively parallel and scalable supercomputer. It was designed for applications in lattice quantum chromodynamics but is also suitable for a wider range of applications..

Hypercube internetwork topology

In computer networking, hypercube networks are a type of network topology used to connect multiple processors with memory modules and accurately route data. Hypercube networks consist of 2m nodes, which form the vertices of squares to create an internetwork connection. A hypercube is basically a multidimensional mesh network with two nodes in each dimension. Due to similarity, such topologies are usually grouped into a k-ary d-dimensional mesh topology family, where d represents the number of dimensions and k represents the number of nodes in each dimension.

Butterfly network Technique to link multiple computers into a high-speed network

A butterfly network is a technique to link multiple computers into a high-speed network. This form of multistage interconnection network topology can be used to connect different nodes in a multiprocessor system. The interconnect network for a shared memory multiprocessor system must have low latency and high bandwidth unlike other network systems, like local area networks (LANs) or internet for three reasons:

Torus fusion (tofu) is a proprietary computer network topology for supercomputers developed by Fujitsu. It is a variant of the torus interconnect. The system has been used in the K computer and the Fugaku supercomputer.

References

  1. N. R. Agida et al. 2005 Blue Gene/L Torus Interconnection Network, IBM Journal of Research and Development, Vol 45, No 2/3 March–May 2005 page 265 "Archived copy" (PDF). Archived from the original (PDF) on 2011-08-15. Retrieved 2012-02-09.{{cite web}}: CS1 maint: archived copy as title (link)
  2. Fujitsu Unveils Post-K Supercomputer HPC Wire Nov 7 2011
  3. Sandeep, Palur; Raicu, Dr. Ioan. "Understanding Torus Network Performance through Simulations" (PDF). Retrieved 28 November 2016.
  4. Inoue, Tomohiro. "The 6D Mesh/Torus Interconnect of K Computer" (PDF). Fujitsu. Retrieved 28 November 2016.
  5. "Small-World Torus Topology".
  6. Pavel Tvrdik. "Topics in parallel computing: Embeddings and simulations of INs: Optimal embedding of tori into meshes".
  7. "The 3D Torus architecture and the Eurotech approach".