Single-chip Cloud Computer

The Single-Chip Cloud Computer (SCC) is a computer processor created by Intel Corporation in 2009 that features 48 distinct physical cores. [1] A core is the component of a processor that executes instructions. The SCC's cores communicate through an on-chip architecture similar to that of a cloud computing data center. The SCC resulted from an Intel research project on multi-core processors and parallel processing, in which Intel also aimed to explore integrating the design and architecture of large cloud data centers into a single processing chip. The name "Single-chip Cloud Computer" reflects this concept. [2]

Uses

The SCC is used for research. It can run the Linux operating system but cannot boot Windows. [3] Applications of the SCC include web servers, data informatics, bioinformatics, and financial analytics. [4]

Technical details

Intel modeled the chip's architecture on large cloud data centers: the cores are distributed across the chip but can communicate directly with one another. The chip comprises 48 P54C Pentium cores connected by a 4×6 2D mesh of 24 tiles, arranged in four rows and six columns. Each tile contains two cores, a router, and a 16 KB message-passing buffer (MPB) shared by the two cores (8 KB per core). [5] The router lets each core communicate directly with the others, without sending data back to main memory to be rerouted to other cores. [3]

The SCC contains 1.3 billion transistors built on a 45 nm process and draws 25 to 125 watts of power depending on processing demand. For comparison, the cited overview gives a figure of 156 watts for an Intel i7 processor. [6]

Each chip includes four DDR3 memory controllers connected to the 2D mesh, which together can address 64 GB of random-access memory. The DDR3 memory also facilitates communication among tiles. The chip can activate and deactivate specific tiles to conserve power when they are not in use. With software written to exploit these features, the result is a processor that combines high speed with energy efficiency and resembles a network of cloud computers. [7]
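The tile layout described above can be made concrete with a little arithmetic. The following Python sketch is illustrative only: the row-major core numbering and the function names `tile_of_core` and `mesh_hops` are assumptions made for this example, not Intel's documented numbering or API. It uses only the figures given in the text (24 tiles in four rows and six columns, two cores and 16 KB of MPB per tile) and counts the minimum number of mesh links between two tiles as a Manhattan distance.

```python
from dataclasses import dataclass

# Figures taken from the article text.
MESH_COLUMNS = 6                    # tiles per row
MESH_ROWS = 4                       # rows of tiles
CORES_PER_TILE = 2
MPB_BYTES_PER_TILE = 16 * 1024      # 16 KB message-passing buffer per tile

TOTAL_TILES = MESH_COLUMNS * MESH_ROWS
TOTAL_CORES = TOTAL_TILES * CORES_PER_TILE
TOTAL_MPB_BYTES = TOTAL_TILES * MPB_BYTES_PER_TILE


@dataclass(frozen=True)
class TilePosition:
    column: int
    row: int


def tile_of_core(core_id: int) -> TilePosition:
    """Map a core ID to its tile.

    Assumption: cores are numbered row-major, two per tile. This numbering
    is hypothetical and chosen only to illustrate the layout.
    """
    if not 0 <= core_id < TOTAL_CORES:
        raise ValueError(f"core_id must be in 0..{TOTAL_CORES - 1}")
    tile_index = core_id // CORES_PER_TILE
    return TilePosition(column=tile_index % MESH_COLUMNS,
                        row=tile_index // MESH_COLUMNS)


def mesh_hops(core_a: int, core_b: int) -> int:
    """Minimum number of mesh links between the tiles of two cores.

    In a 2D mesh this is the Manhattan distance between the tiles;
    two cores on the same tile are zero hops apart.
    """
    a, b = tile_of_core(core_a), tile_of_core(core_b)
    return abs(a.column - b.column) + abs(a.row - b.row)
```

Under these assumptions, `TOTAL_CORES` evaluates to 48, matching the core count in the text, and `mesh_hops(0, 47)` gives the corner-to-corner distance of the mesh.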

Modes of operation

The SCC comes with RCCE, a simple message-passing interface provided by Intel supporting basic message-buffering operations. [5] The SCC operates in two modes: processor mode and mesh mode.
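To illustrate what message buffering through a small on-chip buffer involves, the Python sketch below moves a message through a fixed-size buffer one chunk at a time. It is a toy model, not the RCCE API: the class `CoreMailbox` and the function `transfer` are hypothetical names introduced for this example, and the 8 KB capacity is the per-core share of the MPB mentioned in the technical details.

```python
from typing import Optional, Tuple

MPB_SHARE_BYTES = 8 * 1024  # per-core share of the tile's 16 KB MPB (from the text)


class CoreMailbox:
    """Toy stand-in for one core's slice of the message-passing buffer.

    Hypothetical model: holds at most one chunk of up to `capacity` bytes.
    """

    def __init__(self, capacity: int = MPB_SHARE_BYTES) -> None:
        self.capacity = capacity
        self._chunk: Optional[bytes] = None

    def put(self, chunk: bytes) -> None:
        """Sender side: place one chunk in the buffer."""
        if len(chunk) > self.capacity:
            raise ValueError("chunk larger than buffer")
        if self._chunk is not None:
            raise RuntimeError("buffer still holds an unread chunk")
        self._chunk = chunk

    def take(self) -> bytes:
        """Receiver side: remove and return the buffered chunk."""
        if self._chunk is None:
            raise RuntimeError("buffer is empty")
        chunk, self._chunk = self._chunk, None
        return chunk


def transfer(message: bytes, mailbox: CoreMailbox) -> Tuple[bytes, int]:
    """Move `message` through `mailbox` in buffer-sized chunks.

    Returns the reassembled message and the number of chunks used.
    Sender and receiver alternate in one loop here; on real hardware
    they would run on different cores and synchronize with flags.
    """
    received = bytearray()
    chunks = 0
    for start in range(0, len(message), mailbox.capacity):
        mailbox.put(message[start:start + mailbox.capacity])  # sender writes
        received += mailbox.take()                            # receiver reads
        chunks += 1
    return bytes(received), chunks
```

The point of the sketch is that a message larger than the buffer has to be split and sent in several rounds, which is why the buffer size matters for communication-heavy programs.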

Processor mode

In processor mode, the cores are on, executing code from system memory and performing programmed I/O (input/output) through the system interface, which is connected to the system board FPGA. Software running on the SCC's embedded management console loads memory and configures the processor for bootstrapping (the initial startup sequence). [6]

Mesh mode

In mesh mode, cores are turned off, leaving only the routers, transistors, and RAM controllers active. These components send and receive large packets of data, and there is no memory map. [6]

The future

Intel intends to share this technology with other companies, including HP, Yahoo, and Microsoft, fostering collaborative research on the SCC to advance the technology efficiently. The goal is to make the SCC scalable to 100+ cores, potentially achieved by enabling communication between individual chips. Intel aims to enhance parallel programming productivity and power management, leveraging the chip's architecture and numerous cores. Further experimentation is planned on this architecture and similar chip architectures to develop many-core scalable processors maximizing processing power while maintaining energy efficiency. [4]

References

  1. "SCCC PDF from intel.cn" (PDF). Intel | China. Retrieved 27 December 2023.
  2. Ng, Jason. "Intel Demonstrates 48-Core 'Single-Chip Cloud Computer'". Daily Tech. Retrieved 30 October 2014.
  3. Corley, Anne-Marie (9 February 2010). "Intel Lifts the Hood on its 'Single-Chip Cloud Computer'". IEEE Spectrum. IEEE. Retrieved 30 October 2014.
  4. "Intel Labs Announces Single-chip Cloud Computing Experimental Chip". YouTube. Intel. Retrieved 11 November 2014.
  5. Pichel, Juan; Rivera, Francisco (2013). "Sparse matrix–vector multiplication on the Single-Chip Cloud Computer many-core processor". Journal of Parallel and Distributed Computing. 73 (12): 1539–1550. doi:10.1016/j.jpdc.2013.07.017.
  6. Nerurkar, Nishad; Mhatre, Aniket. "Overview of the Intel Single Chip Cloud-Computer" (PDF). Retrieved 30 October 2014.
  7. Matson, Tim. "Using Intel's Single-Chip Cloud Computer (SCC)" (PDF). Intel.com. Intel Corporation. Retrieved 30 October 2014.