SPARC

Designer: Sun Microsystems (acquired by Oracle Corporation) [1] [2]
Bits: 64-bit (32 → 64)
Introduced: 1986 (production); 1987 (shipments)
Version: V9 (1993) / OSA2017
Design: RISC
Type: Register–Register
Encoding: Fixed
Branching: Condition code
Endianness: Bi (Big → Bi)
Page size: 8 KB (4 KB → 8 KB)
Extensions: VIS 1.0, 2.0, 3.0, 4.0
Open: Yes, and royalty free
Registers
General-purpose: 31 (G0 = 0; non-global registers use register windows)
Floating point: 32 (usable as 32 single-precision, 32 double-precision, or 16 quad-precision)

A Sun UltraSPARC II microprocessor (1997)

SPARC (Scalable Processor ARChitecture) is a reduced instruction set computer (RISC) instruction set architecture originally developed by Sun Microsystems. [1] [2] Its design was strongly influenced by the experimental Berkeley RISC system developed in the early 1980s. First developed in 1986 and released in 1987, [3] [2] SPARC was one of the most successful early commercial RISC systems, and its success led to the introduction of similar RISC designs from many vendors through the 1980s and 1990s.

The first implementation of the original 32-bit architecture (SPARC V7) was used in Sun's Sun-4 computer workstation and server systems, replacing their earlier Sun-3 systems based on the Motorola 68000 series of processors. SPARC V8 added a number of improvements that were part of the SuperSPARC series of processors released in 1992. SPARC V9, released in 1993, introduced a 64-bit architecture and was first released in Sun's UltraSPARC processors in 1995. Later, SPARC processors were used in symmetric multiprocessing (SMP) and non-uniform memory access (CC-NUMA) servers produced by Sun, Solbourne, and Fujitsu, among others.

The design was turned over to the SPARC International trade group in 1989, and since then its architecture has been developed by its members. SPARC International is also responsible for licensing and promoting the SPARC architecture, managing SPARC trademarks (including SPARC, which it owns), and providing conformance testing. SPARC International was intended to grow the SPARC architecture to create a larger ecosystem; SPARC has been licensed to several manufacturers, including Atmel, Bipolar Integrated Technology, Cypress Semiconductor, Fujitsu, Matsushita and Texas Instruments. Due to SPARC International, SPARC is fully open, non-proprietary and royalty-free.

As of 2024, the latest commercial high-end SPARC processors are Fujitsu's SPARC64 XII (introduced in September 2017 for its SPARC M12 server) and Oracle's SPARC M8 (introduced in September 2017 for its high-end servers).

On September 1, 2017, after a round of layoffs that started in Oracle Labs in November 2016, Oracle terminated SPARC design after completing the M8. Much of the processor core development group in Austin, Texas, was dismissed, as were the teams in Santa Clara, California, and Burlington, Massachusetts. [4] [5]

Fujitsu, which has already shifted to producing its own ARM-based CPUs, will also discontinue its SPARC production. Its roadmap calls for two "enhanced" versions of the older SPARC M12 server, in 2020–22 (formerly planned for 2021) and again in 2026–27, followed by end-of-sale of its UNIX servers in 2029 and of its mainframes a year later, and end-of-support in 2034, "to promote customer modernization". [6]

Features

The SPARC architecture was heavily influenced by the earlier RISC designs, including the RISC I and II from the University of California, Berkeley and the IBM 801. These original RISC designs were minimalist, including as few features or op-codes as possible and aiming to execute instructions at a rate of almost one instruction per clock cycle. This made them similar to the MIPS architecture in many ways, including the lack of instructions such as multiply or divide. Another feature of SPARC influenced by this early RISC movement is the branch delay slot.

The SPARC processor usually contains as many as 160 general-purpose registers. According to the "Oracle SPARC Architecture 2015" specification an "implementation may contain from 72 to 640 general-purpose 64-bit" registers. [7] At any point, only 32 of them are immediately visible to software — 8 are a set of global registers (one of which, g0, is hard-wired to zero, so only seven of them are usable as registers) and the other 24 are from the stack of registers. These 24 registers form what is called a register window, and at function call/return, this window is moved up and down the register stack. Each window has eight local registers and shares eight registers with each of the adjacent windows. The shared registers are used for passing function parameters and returning values, and the local registers are used for retaining local values across function calls.

The "scalable" in SPARC comes from the fact that the SPARC specification allows implementations to scale from embedded processors up through large server processors, all sharing the same core (non-privileged) instruction set. One of the architectural parameters that can scale is the number of implemented register windows; the specification allows from three to 32 windows to be implemented, so the implementation can choose to implement all 32 to provide maximum call stack efficiency, or to implement only three to reduce cost and complexity of the design, or to implement some number between them. Other architectures that include similar register file features include Intel i960, IA-64, and AMD 29000.

The architecture has gone through several revisions. It gained hardware multiply and divide functionality in version 8. [8] [9] 64-bit addressing and data were added in the version 9 SPARC specification published in 1994. [10]

In SPARC version 8, the floating-point register file has 16 double-precision registers. Each of them can be used as two single-precision registers, providing a total of 32 single-precision registers. An odd–even number pair of double-precision registers can be used as a quad-precision register, thus allowing 8 quad-precision registers. SPARC Version 9 added 16 more double-precision registers (which can also be accessed as 8 quad-precision registers), but these additional registers cannot be accessed as single-precision registers. No SPARC CPU implements quad-precision operations in hardware as of 2024. [11]

Tagged add and subtract instructions perform adds and subtracts on values checking that the bottom two bits of both operands are 0 and reporting overflow if they are not. This can be useful in the implementation of the run time for ML, Lisp, and similar languages that might use a tagged integer format.
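The tag check described above can be modeled in a few lines of Python. This is only a sketch of the behavior as stated in the text (overflow is reported when the low two bits of either operand are nonzero, or when the signed 32-bit addition overflows); the `box_int` helper and its "integer stored as n << 2" scheme are hypothetical, chosen to illustrate how a Lisp or ML runtime might use the instruction.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def tagged_add(a, b):
    """Model a tagged add of two 32-bit words; return (result, overflow).

    Overflow is reported if either operand has a nonzero 2-bit tag or if
    the signed 32-bit addition itself overflows.
    """
    a &= MASK32
    b &= MASK32
    tag_mismatch = ((a | b) & 0b11) != 0
    total = to_signed32(a) + to_signed32(b)
    arithmetic_overflow = not (-(1 << 31) <= total < (1 << 31))
    return total & MASK32, tag_mismatch or arithmetic_overflow


def box_int(n):
    """Hypothetical tagging scheme: integer n is stored as n << 2 (tag 00)."""
    return (n << 2) & MASK32
```

With this scheme, adding two boxed integers yields a correctly boxed sum with no overflow, while adding a boxed integer to a word whose tag bits are not 00 (for example a pointer) is flagged, so the runtime can fall back to a slower generic path.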

The endianness of the 32-bit SPARC V8 architecture is purely big-endian. The 64-bit SPARC V9 architecture uses big-endian instructions, but can access data in either big-endian or little-endian byte order, chosen either at the application instruction (load–store) level or at the memory page level (via an MMU setting). The latter is often used for accessing data from inherently little-endian devices, such as those on PCI buses.
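The difference between the two byte orders can be illustrated with Python's standard `struct` module. The byte values below are arbitrary, and the snippet only shows how the same four bytes in memory are interpreted under each ordering; it does not model the SPARC load instructions or MMU settings themselves.

```python
import struct

# Four bytes as they sit in memory, lowest address first (arbitrary values).
raw = bytes([0x12, 0x34, 0x56, 0x78])

# Big-endian view: the byte at the lowest address is the most significant.
big_endian_word = struct.unpack(">I", raw)[0]

# Little-endian view: the byte at the lowest address is the least significant,
# as a PCI device would typically expect.
little_endian_word = struct.unpack("<I", raw)[0]
```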

History

There have been three major revisions of the architecture. The first published version was the 32-bit SPARC version 7 (V7) in 1986. SPARC version 8 (V8), an enhanced SPARC architecture definition, was released in 1990. The main differences between V7 and V8 were the addition of integer multiply and divide instructions, and an upgrade from 80-bit "extended-precision" floating-point arithmetic to 128-bit "quad-precision" arithmetic. SPARC V8 served as the basis for IEEE Standard 1754-1994, an IEEE standard for a 32-bit microprocessor architecture.

SPARC version 9, the 64-bit SPARC architecture, was released by SPARC International in 1993. It was developed by the SPARC Architecture Committee consisting of Amdahl Corporation, Fujitsu, ICL, LSI Logic, Matsushita, Philips, Ross Technology, Sun Microsystems, and Texas Instruments. Newer specifications always remain compliant with the full SPARC V9 Level 1 specification.

In 2002, the SPARC Joint Programming Specification 1 (JPS1) was released by Fujitsu and Sun, describing processor functions which were identically implemented in the CPUs of both companies ("Commonality"). The first CPUs conforming to JPS1 were the UltraSPARC III by Sun and the SPARC64 V by Fujitsu. Functionalities which are not covered by JPS1 are documented for each processor in "Implementation Supplements".

At the end of 2003, JPS2 was released to support multicore CPUs. The first CPUs conforming to JPS2 were the UltraSPARC IV by Sun and the SPARC64 VI by Fujitsu.

In early 2006, Sun released an extended architecture specification, UltraSPARC Architecture 2005. This includes not only the non-privileged and most of the privileged portions of SPARC V9, but also all the architectural extensions developed through the processor generations of UltraSPARC III, IV and IV+, as well as CMT extensions starting with the UltraSPARC T1 implementation.

In 2007, Sun released an updated specification, UltraSPARC Architecture 2007, to which the UltraSPARC T2 implementation complied.

In August 2012, Oracle Corporation made available a new specification, Oracle SPARC Architecture 2011, which besides the overall update of the reference, adds the VIS 3 instruction set extensions and hyperprivileged mode to the 2007 specification. [12]

In October 2015, Oracle released SPARC M7, the first processor based on the new Oracle SPARC Architecture 2015 specification. [7] [13] This revision includes VIS 4 instruction set extensions and hardware-assisted encryption and silicon secured memory (SSM). [14]

SPARC architecture has provided continuous application binary compatibility from the first SPARC V7 implementation in 1987 through the Sun UltraSPARC Architecture implementations.

Among various implementations of SPARC, Sun's SuperSPARC and UltraSPARC-I were very popular, and were used as reference systems for SPEC CPU95 and CPU2000 benchmarks. The 296 MHz UltraSPARC-II is the reference system for the SPEC CPU2006 benchmark.

Architecture

SPARC is a load–store architecture (also known as a register–register architecture); except for the load/store instructions used to access memory, all instructions operate on the registers, in accordance with the RISC design principles.

A SPARC processor includes an integer unit (IU) that performs integer load, store, and arithmetic operations. [15] :9 [10] :15–16 It may include a floating-point unit (FPU) that performs floating-point operations [15] :9 [10] :15–16 and, for SPARC V8, may include a co-processor (CP) that performs co-processor-specific operations; the architecture does not specify what functions a co-processor would perform, other than load and store operations. [15] :9

Registers

The SPARC architecture has an overlapping register window scheme. At any instant, 32 general-purpose registers are visible. A Current Window Pointer (CWP) variable in the hardware points to the current set. The total size of the register file is not part of the architecture, allowing more registers to be added as the technology improves, up to a maximum of 32 windows in SPARC V7 and V8 as CWP is 5 bits and is part of the PSR register.

In SPARC V7 and V8, CWP is usually decremented by the SAVE instruction (used during a procedure call to open a new stack frame and switch the register window), or incremented by the RESTORE instruction (switching back to the caller's window before returning from the procedure). Trap events (interrupts, exceptions or TRAP instructions) and RETT instructions (returning from traps) also change the CWP. For SPARC V9, the CWP register is decremented during a RESTORE instruction, and incremented during a SAVE instruction. This is the opposite of PSR.CWP's behavior in SPARC V8. This change has no effect on nonprivileged instructions.

Window Addressing
Register group | Mnemonic | Register address | Availability
global | G0...G7 | R[0]...R[7] | always the same ones, G0 being zero always
out | O0...O7 | R[8]...R[15] | to be handed over to, and returned from, the called subroutine, as its "in"
local | L0...L7 | R[16]...R[23] | truly local to the current subroutine
in | I0...I7 | R[24]...R[31] | handed over from the caller, and returned to the caller, as its "out"

SPARC registers are shown in the figure above.
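The overlap between a caller's "out" registers and its callee's "in" registers can be sketched as a small Python model. It follows the V7/V8 convention described above (SAVE decrements CWP, RESTORE increments it). The class name, the default of eight windows, and the physical layout of the windowed array are assumptions made for illustration; window overflow and underflow traps, and the addition that SAVE and RESTORE also perform, are deliberately left out.

```python
class RegisterFile:
    """Toy model of overlapping register windows (V8-style CWP direction)."""

    def __init__(self, nwindows=8):
        self.nwindows = nwindows
        self.cwp = 0                           # current window pointer
        self.globals = [0] * 8                 # R[0]..R[7]
        self.windowed = [0] * (16 * nwindows)  # 8 outs + 8 locals per window

    def _slot(self, r):
        """Map visible register R[8]..R[31] to an index in self.windowed."""
        if 8 <= r <= 15:                       # outs of the current window
            return 16 * self.cwp + (r - 8)
        if 16 <= r <= 23:                      # locals of the current window
            return 16 * self.cwp + 8 + (r - 16)
        # ins share storage with the outs of the caller's window (cwp + 1)
        return 16 * ((self.cwp + 1) % self.nwindows) + (r - 24)

    def read(self, r):
        if r < 8:
            return 0 if r == 0 else self.globals[r]  # G0 always reads as zero
        return self.windowed[self._slot(r)]

    def write(self, r, value):
        if r == 0:
            return                             # writes to G0 are discarded
        if r < 8:
            self.globals[r] = value
        else:
            self.windowed[self._slot(r)] = value

    def save(self):
        """Enter a callee: move to the next window (CWP decrements in V8)."""
        self.cwp = (self.cwp - 1) % self.nwindows

    def restore(self):
        """Return to the caller's window (CWP increments in V8)."""
        self.cwp = (self.cwp + 1) % self.nwindows
```

In this model a value written to O0 (R[8]) before `save()` is read back as I0 (R[24]) afterwards, and a value the callee leaves in I0 is visible to the caller as O0 after `restore()`, while each window's locals stay private.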

There is also a non-windowed Y register, used by the multiply-step, integer multiply, and integer divide instructions. [15] :32

A SPARC V8 processor with an FPU includes 32 32-bit floating-point registers, each of which can hold one single-precision IEEE 754 floating-point number. An even–odd pair of floating-point registers can hold one double-precision IEEE 754 floating-point number, and a quad-aligned group of four floating-point registers can hold one quad-precision IEEE 754 floating-point number. [15] :10

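A SPARC V9 processor with an FPU includes 32 32-bit (single-precision) floating-point registers, 32 64-bit (double-precision) floating-point registers, and 16 128-bit (quad-precision) floating-point registers. [10] :36–40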

The registers are organized as a set of 64 32-bit registers, with the first 32 being used as the 32-bit floating-point registers, even–odd pairs of all 64 registers being used as the 64-bit floating-point registers, and quad-aligned groups of four floating-point registers being used as the 128-bit floating-point registers.

Floating-point registers are not windowed; they are all global registers. [10] :36–40

Instruction formats

All SPARC instructions occupy a full 32-bit word and start on a word boundary. Four formats are used, distinguished by the first two bits. All arithmetic and logical instructions have 2 source operands and 1 destination operand. [16] RD is the "destination register", where the output of the operation is deposited. The majority of SPARC instructions have at least this register, so it is placed near the "front" of the instruction format. RS1 and RS2 are the "source registers", which may or may not be present, or replaced by a constant.

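SPARC instruction formats
Type | Fields, from bit 31 (left) down to bit 0 (right)
SETHI format | 00 | RD (5 bits) | 100 | Immediate constant (22 bits)
I Branch format | 00 | A (1 bit) | icc (4 bits) | 010 | Displacement constant (22 bits)
F Branch format | 00 | A (1 bit) | fcc (4 bits) | 110 | Displacement constant (22 bits)
C Branch format | 00 | A (1 bit) | ccc (4 bits) | 111 | Displacement constant (22 bits)
CALL disp | 01 | PC-relative displacement (30 bits)
Arithmetic register | 10 | RD (5 bits) | opcode (6 bits) | RS1 (5 bits) | 0 | 0 (8 bits) | RS2 (5 bits)
Arithmetic immediate | 10 | RD | opcode | RS1 | 1 | Immediate constant (13 bits)
FPU operation | 10 | FD | 110100/110101 | FS1 | opf (9 bits) | FS2
CP operation | 10 | RD | 110110/110111 | RS1 | opc (9 bits) | RS2
JMPL register | 10 | RD | 111000 | RS1 | 0 | 0 (8 bits) | RS2
JMPL immediate | 10 | RD | 111000 | RS1 | 1 | Immediate constant (13 bits)
LD/ST register | 11 | RD | opcode | RS1 | 0 | 0 (8 bits) | RS2
LD/ST immediate | 11 | RD | opcode | RS1 | 1 | Immediate constant (13 bits)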
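As an illustration of the arithmetic formats in the table, the following Python sketch packs and unpacks the fields of a format "10" instruction. The function names are hypothetical. Register numbers follow the window addressing table (L1 is R[17], L2 is R[18], L3 is R[19]), and the example assumes the 6-bit opcode for ADD is 000000, as given in the SPARC V8 manual.

```python
def encode_arith(op3, rd, rs1, rs2=None, simm13=None):
    """Encode an arithmetic instruction (leading bits 10) as a 32-bit word."""
    word = (0b10 << 30) | (rd << 25) | (op3 << 19) | (rs1 << 14)
    if simm13 is not None:
        if not -4096 <= simm13 <= 4095:
            raise ValueError("immediate does not fit in 13 signed bits")
        word |= (1 << 13) | (simm13 & 0x1FFF)   # i = 1, immediate form
    else:
        word |= rs2                             # i = 0, register form
    return word


def decode_arith(word):
    """Split an arithmetic instruction word back into its fields."""
    fields = {
        "op": word >> 30,
        "rd": (word >> 25) & 0x1F,
        "op3": (word >> 19) & 0x3F,
        "rs1": (word >> 14) & 0x1F,
        "i": (word >> 13) & 1,
    }
    if fields["i"]:
        raw = word & 0x1FFF
        # Sign-extend the 13-bit immediate.
        fields["simm13"] = raw - 0x2000 if raw & 0x1000 else raw
    else:
        fields["rs2"] = word & 0x1F
    return fields


ADD = 0b000000  # assumed op3 value for ADD (SPARC V8)
add_l1_l2_l3 = encode_arith(ADD, rd=19, rs1=17, rs2=18)   # add %L1,%L2,%L3
inc_l1 = encode_arith(ADD, rd=17, rs1=17, simm13=1)       # add %L1,1,%L1
```

Under these assumptions the two words come out as 0xA6044012 and 0xA2046001 respectively.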

Instructions

Loads and stores

Load and store instructions have a three-operand format, in that they have two operands representing values for the address and one operand for the register to read or write to. The address is formed by adding the two address operands. The second address operand may be a constant or a register. Loads take the value at the address and place it in the register specified by the third operand, whereas stores take the value in the register specified by the first operand and place it at the address. To make this more obvious, the assembly language indicates address operands using square brackets with a plus sign separating the operands, instead of using a comma-separated list. Examples: [16]

ld [%L1+%L2],%L3  !load the 32-bit value at address %L1+%L2 and put the value into %L3
ld [%L1+8],%L2    !load the value at %L1+8 into %L2
ld [%L1],%L2      !as above, but no offset, which is the same as +%G0
st %L1,[%I2]      !store the value in %L1 into the location stored in %I2
st %G0,[%I1+8]    !clear the memory at %I1+8

Due to the widespread use of non-32-bit data, such as 16-bit or 8-bit integral data or 8-bit bytes in strings, there are instructions that load and store 16-bit half-words and 8-bit bytes, as well as instructions that load 32-bit words. During a load, those instructions will read only the byte or half-word at the indicated location and then either fill the rest of the target register with zeros (unsigned load) or with the value of the uppermost bit of the byte or half-word (signed load). During a store, those instructions discard the upper bits in the register and store only the lower bits. There are also instructions for loading double-precision values used for floating-point arithmetic, reading or writing eight bytes from the indicated register and the "next" one, so if the destination of a load is L1, L1 and L2 will be set. The complete list of instructions for 32-bit SPARC is LD, ST, LDUB (unsigned byte), LDSB (signed byte), LDUH (unsigned half-word), LDSH (signed half-word), LDD (load double), STB (store byte), STH (store half-word), STD (store double). [16]
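The zero-fill and sign-fill behavior of the narrow loads, and the truncation performed by narrow stores, can be sketched in Python as follows. The `load` and `store` helpers are hypothetical and model a big-endian 32-bit register and a flat byte array only; alignment checks and traps are not modeled.

```python
def load(memory, addr, size, signed):
    """Read `size` bytes (big-endian) at addr and extend to a 32-bit value.

    signed=False fills the upper bits with zeros (LDUB/LDUH style);
    signed=True copies the top bit of the datum (LDSB/LDSH style).
    """
    value = int.from_bytes(memory[addr:addr + size], "big", signed=signed)
    return value & 0xFFFFFFFF


def store(memory, addr, size, reg):
    """Write only the low `size` bytes of a register (STB/STH style)."""
    low_bits = reg & ((1 << (8 * size)) - 1)
    memory[addr:addr + size] = low_bits.to_bytes(size, "big")


memory = bytearray([0xFF, 0x80, 0x01, 0x00])
unsigned_byte = load(memory, 0, 1, signed=False)   # zero-filled
signed_byte = load(memory, 0, 1, signed=True)      # sign-filled
signed_half = load(memory, 1, 2, signed=True)      # sign-filled half-word
store(memory, 3, 1, 0x12345678)                    # keeps only the low byte
```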

In SPARC V9, registers are 64-bit, and the LD instruction, renamed LDUW, clears the upper 32 bits in the register and loads the 32-bit value into the lower 32 bits, and the ST instruction, renamed STW, discards the upper 32 bits of the register and stores only the lower 32 bits. The new LDSW instruction sets the upper bits in the register to the value of the uppermost bit of the word and loads the 32-bit value into the lower bits. The new LDX instruction loads a 64-bit value into the register, and the STX instruction stores all 64 bits of the register.

The LDF, LDDF, and LDQF instructions load a single-precision, double-precision, or quad-precision value from memory into a floating-point register; the STF, STDF, and STQF instructions store a single-precision, double-precision, or quad-precision floating-point register into memory.

The memory barrier instruction, MEMBAR, serves two interrelated purposes: it articulates order constraints among memory references and facilitates explicit control over the completion of memory references. For example, all effects of the stores that appear prior to the MEMBAR instruction must be made visible to all processors before any loads following the MEMBAR can be executed. [17]

ALU operations

Arithmetic and logical instructions also use a three-operand format, with the first two being the operands and the last being the location to store the result. The middle operand can be a register or a 13-bit signed integer constant; the other operands are registers. Any of the register operands may point to G0; pointing the result to G0 discards the results, which can be used for tests. Examples include: [16]

add %L1,%L2,%L3   !add the values in %L1 and %L2 and put the result in %L3
add %L1,1,%L1     !increment %L1
add %G0,%G0,%L4   !clear any value in %L4

The list of mathematical instructions is ADD, SUB, AND, OR, XOR, and negated versions ANDN, ORN, and XNOR. One quirk of the SPARC design is that most arithmetic instructions come in pairs, with one version setting the NZVC condition code bits in the status register, and the other not setting them, with the default being not to set the codes. This is so that the compiler has a way to move instructions around when trying to fill delay slots. If one wants the condition codes to be set, this is indicated by adding cc to the instruction: [16]

subcc %L1,10,%G0  !compare %L1 to 10 and ignore the result, but set the flags

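add and sub also have another modifier, X, which indicates that the operation should also include the carry bit left by an earlier instruction (add with carry, subtract with carry), as needed for multi-word arithmetic:

addx %L1,100,%L1  !add 100 plus the current carry bit to the value in %L1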

SPARC V7 does not have multiplication or division instructions, but it does have MULSCC, which does one step of a multiplication, testing one bit and conditionally adding the multiplicand to the product. This was because MULSCC can complete in one clock cycle, in keeping with the RISC philosophy. SPARC V8 added UMUL (unsigned multiply), SMUL (signed multiply), UDIV (unsigned divide), and SDIV (signed divide) instructions, with both versions that do not update the condition codes and versions that do. MULSCC and the multiply instructions use the Y register to hold the upper 32 bits of the product; the divide instructions use it to hold the upper 32 bits of the dividend. The RDY instruction reads the value of the Y register into a general-purpose register; the WRY instruction writes the value of a general-purpose register to the Y register. [15] :32 SPARC V9 added MULX, which multiplies two 64-bit values and produces a 64-bit result, SDIVX, which divides a 64-bit signed dividend by a 64-bit signed divisor and produces a 64-bit signed quotient, and UDIVX, which divides a 64-bit unsigned dividend by a 64-bit unsigned divisor and produces a 64-bit unsigned quotient; none of those instructions use the Y register. [10] :199
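The role of the Y register in the V8 multiply and divide instructions can be sketched in Python. The function names are hypothetical and the model covers only the unsigned variants: the product is split into a low word (the destination register) and a high word (Y), and the divide treats Y and the first source register as one 64-bit dividend. Clamping the quotient to 32 bits on overflow is an assumption based on the overflow behavior described in the V8 manual, and Python's `ZeroDivisionError` stands in for the division-by-zero trap.

```python
MASK32 = 0xFFFFFFFF


def umul(rs1, rs2):
    """Unsigned 32x32 multiply; return (rd, y) = (low word, high word)."""
    product = (rs1 & MASK32) * (rs2 & MASK32)
    return product & MASK32, product >> 32


def udiv(y, rs1, rs2):
    """Divide the 64-bit dividend formed by (y, rs1) by rs2.

    Returns the 32-bit quotient, clamped to 0xFFFFFFFF if it does not fit
    (assumed overflow behavior).
    """
    dividend = ((y & MASK32) << 32) | (rs1 & MASK32)
    quotient = dividend // (rs2 & MASK32)
    return min(quotient, MASK32)


low, high = umul(0xFFFFFFFF, 0xFFFFFFFF)   # largest possible product
```

Dividing the 64-bit product back by one of the factors, `udiv(high, low, 0xFFFFFFFF)`, recovers the other factor.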

Branching

Conditional branches test condition codes in a status register, as seen in many instruction sets such as the IBM System/360 architecture and successors and the x86 architecture. This means that a test and branch is normally performed with two instructions; the first is an ALU instruction that sets the condition codes, followed by a branch instruction that examines one of those flags. The SPARC does not have specialized test instructions; tests are performed using normal ALU instructions with the destination set to %G0. For instance, to test if a register holds the value 10 and then branch to code that handles it, one would write:

subcc %L1,10,%G0 !subtract 10 from %L1, setting the zero flag if %L1 is 10
be WASEQUAL      !if the zero flag is set, branch to the address marked WASEQUAL

In a conditional branch instruction, the icc or fcc field specifies the condition being tested. The 22-bit displacement field is the address, relative to the current PC, of the target, in words, so that conditional branches can go forward or backward up to 8 megabytes. The ANNUL (A) bit is used to get rid of some delay slots. If it is 0 in a conditional branch, the delay slot is executed as usual. If it is 1, the delay slot is only executed if the branch is taken. If it is not taken, the instruction following the conditional branch is skipped.
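The displacement arithmetic can be made concrete with a short Python sketch. The helper names are hypothetical; the code only implements what the paragraph states, namely a signed 22-bit word displacement added to the address of the branch itself, and it assumes 32-bit addresses.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def branch_target(pc, disp22):
    """Target address of a conditional branch located at address pc."""
    return (pc + 4 * sign_extend(disp22, 22)) & 0xFFFFFFFF


def encode_disp22(pc, target):
    """Compute the 22-bit displacement field for a branch from pc to target."""
    delta = target - pc
    if delta % 4:
        raise ValueError("branch targets must be word-aligned")
    words = delta // 4
    if not -(1 << 21) <= words < (1 << 21):
        raise ValueError("target is out of range for a 22-bit displacement")
    return words & 0x3FFFFF


# Maximum backward reach in bytes: 2**21 words of 4 bytes each.
MAX_REACH_BYTES = 4 * (1 << 21)
```

Since the field is signed, half of its range addresses words before the branch and half after it, which gives the reach of 8 megabytes in either direction.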

There are a wide variety of conditional branches: BA (branch always, essentially a jmp), BN (branch never), BE (equals), BNE (not equals), BL (less than), BLE (less or equal), BLEU (less or equal, unsigned), BG (greater), BGE (greater or equal), BGU (greater unsigned), BPOS (positive), BNEG (negative), BCC (carry clear), BCS (carry set), BVC (overflow clear), BVS (overflow set). [15] :119–120
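How the integer branches relate to the NZVC flags can be sketched in Python. The `subcc` function below models a 32-bit subtract that sets the condition codes, and the table gives the flag test for each branch as commonly documented for SPARC V8; the function and table names are hypothetical, and the flag formulas should be read as a sketch to be checked against the architecture manual rather than as a definitive reference.

```python
MASK32 = 0xFFFFFFFF


def subcc(a, b):
    """Model a 32-bit subtract that sets the N, Z, V and C flags."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    return {
        "N": result >> 31,                          # result is negative
        "Z": int(result == 0),                      # result is zero
        "V": ((a ^ b) & (a ^ result)) >> 31,        # signed overflow
        "C": int(a < b),                            # unsigned borrow
    }


BRANCH_TAKEN = {
    "ba":   lambda f: True,
    "bn":   lambda f: False,
    "be":   lambda f: bool(f["Z"]),
    "bne":  lambda f: not f["Z"],
    "bl":   lambda f: bool(f["N"] ^ f["V"]),
    "ble":  lambda f: bool(f["Z"] or (f["N"] ^ f["V"])),
    "bg":   lambda f: not (f["Z"] or (f["N"] ^ f["V"])),
    "bge":  lambda f: not (f["N"] ^ f["V"]),
    "bgu":  lambda f: not (f["C"] or f["Z"]),
    "bleu": lambda f: bool(f["C"] or f["Z"]),
    "bpos": lambda f: not f["N"],
    "bneg": lambda f: bool(f["N"]),
    "bcc":  lambda f: not f["C"],
    "bcs":  lambda f: bool(f["C"]),
    "bvc":  lambda f: not f["V"],
    "bvs":  lambda f: bool(f["V"]),
}


def taken(branch, flags):
    """Return True if the named branch would be taken for these flags."""
    return BRANCH_TAKEN[branch](flags)
```

For example, comparing 0xFFFFFFFF with 1 makes both `bl` and `bgu` true: the first operand is -1 when read as signed, but the largest 32-bit value when read as unsigned.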

The FPU and CP have sets of condition codes separate from the integer condition codes and from each other; two additional sets of branch instructions were defined to test those condition codes. Adding an F to the front of the branch instruction in the list above performs the test against the FPU's condition codes, [15] :121–122 while, in SPARC V8, adding a C tests the flags in the otherwise undefined CP. [15] :123–124

The CALL (jump to subroutine) instruction uses a 30-bit program counter-relative word offset. As the target address specifies the start of a word, not a byte, 30 bits are all that is needed to reach any address in the 4 gigabyte address space. [16] The CALL instruction deposits the return address in register R15, also known as output register O7.

The JMPL (jump and link) instruction is a three-operand instruction, with two operands representing values for the target address and one operand for a register in which to deposit the return address. The address is created by adding the two address operands to produce a 32-bit address. The second address operand may be a constant or a register.

Large constants

As the instruction opcode takes up some bits of the 32-bit instruction word, there is no way to load a 32-bit constant using a single instruction. This is significant because addresses are manipulated through registers and they are 32 bits wide. To work around this, the special-purpose SETHI instruction copies its 22-bit immediate operand into the high-order 22 bits of any specified register, and sets each of the low-order 10 bits to 0. In general use, SETHI is followed by an or instruction with only the lower 10 bits of the value set. To simplify writing this pair, the assembler includes the %hi(X) and %lo(X) macros. For example: [16]

sethi %hi(0x89ABCDEF),%L1       !sets the upper 22 bits of L1
or    %L1,%lo(0x89ABCDEF),%L1   !sets the lower 10 bits of L1 by ORing

The hi and lo macros are evaluated at assembly time, not at runtime, so they carry no performance cost, yet they make it clearer that L1 is being set to a single value, not two unrelated ones. To make this even easier, the assembler also includes a "synthetic instruction", set, that performs these two operations in a single line:

set   0x89ABCDEF,%L1

This outputs the two instructions above if the value does not fit in a 13-bit signed immediate; otherwise it emits a single or instruction that combines the value with %G0. [16]
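The split performed by %hi and %lo is simple bit arithmetic, sketched here in Python. The function names mirror the assembler macros but are otherwise hypothetical, and `sethi_then_or` only models the register value after the two-instruction sequence, not the instruction encodings.

```python
def hi(value):
    """Upper 22 bits of a 32-bit value, as used by %hi(X)."""
    return (value & 0xFFFFFFFF) >> 10


def lo(value):
    """Lower 10 bits of a 32-bit value, as used by %lo(X)."""
    return value & 0x3FF


def sethi_then_or(value):
    """Register contents after sethi %hi(value) followed by or %lo(value)."""
    reg = hi(value) << 10    # sethi: high 22 bits set, low 10 bits cleared
    reg |= lo(value)         # or: fill in the low 10 bits
    return reg
```

For the constant used above, `hi(0x89ABCDEF)` is 0x226AF3 and `lo(0x89ABCDEF)` is 0x1EF, and combining them reproduces the original value.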

Synthetic instructions

As noted earlier, the SPARC assembler uses "synthetic instructions" to ease common coding tasks. Additional examples include (among others): [16]

SPARC synthetic instructions
mnemonic | actual output | purpose
nop | sethi 0,%g0 | do nothing
clr %reg | or %g0,%g0,%reg | set a register to zero
clr [address] | st %g0,[address] | set a memory address to zero
clrh [address] | sth %g0,[address] | set the half-word at memory address to zero
clrb [address] | stb %g0,[address] | set the byte at memory address to zero
cmp %reg1,%reg2 | subcc %reg1,%reg2,%g0 | compare two registers, set codes, discard results
cmp %reg,const | subcc %reg,const,%g0 | compare register with constant
mov %reg1,%reg2 | or %g0,%reg1,%reg2 | copy value from one register to another
mov const,%reg | or %g0,const,%reg | copy constant value into a register
inc %reg | add %reg,1,%reg | increment a register
inccc %reg | addcc %reg,1,%reg | increment a register, set conditions
dec %reg | sub %reg,1,%reg | decrement a register
deccc %reg | subcc %reg,1,%reg | decrement a register, set conditions
not %reg | xnor %reg,%g0,%reg | flip the bits in a register
neg %reg | sub %g0,%reg,%reg | two's complement a register
tst %reg | orcc %reg,%g0,%g0 | test whether the value in a register is > 0, 0, or < 0

SPARC architecture licensees

The following organizations have licensed the SPARC architecture:

Implementations

Name (codename)ModelFrequency (MHz)Arch. versionYearTotal threads [note 1] Process (nm)Transistors (millions)Die size (mm2)IO pinsPower (W)Voltage (V)L1 Dcache (KB)L1 Icache (KB)L2 cache (KB)L3 cache (KB)
SPARC MB86900 Fujitsu [1] [3] [2] 14.2833V719861×1=113000.112560128 (unified)nonenone
SPARCVarious [note 2] 14.2840V71989–19921×1=18001300~0.11.81602560128 (unified)nonenone
MN10501 (KAP) Solbourne Computer, Matsushita [18] 33–36V81990–19911x1=11.0 [19] 880–256none
microSPARC I (Tsunami)TI TMS390S104050V819921×1=18000.8225?2882.5524nonenone
SuperSPARC I (Viking)TI TMX390Z50 / Sun STP10203360V819921×1=18003.129314.3516200–2048none
SPARClite Fujitsu MB8683x66108V8E19921×1=1144, 1762.5/3.3–5.0 V, 2.5–3.3 V1, 2, 8, 161, 2, 8, 16nonenone
hyperSPARC (Colorado 1)Ross RT620A4090V819931×1=15001.55?08128–256none
microSPARC II (Swift)Fujitsu MB86904 / Sun STP101260125V819941×1=15002.323332153.3816nonenone
hyperSPARC (Colorado 2)Ross RT620B90125V819941×1=14001.53.308128–256none
SuperSPARC II (Voyager)Sun STP10217590V819941×1=18003.12991616201024–2048none
hyperSPARC (Colorado 3)Ross RT620C125166V819951×1=13501.53.308512–1024none
TurboSPARC Fujitsu MB86907160180V819961×1=13503.013241673.51616512none
UltraSPARC (Spitfire)Sun STP1030143167V919951×1=14703.831552130 [note 3] 3.31616512–1024none
UltraSPARC (Hornet)Sun STP1030200V919951×1=14205.22655213.31616512–1024none
hyperSPARC (Colorado 4)Ross RT620D180200V819961×1=13501.73.31616512none
SPARC64 Fujitsu (HAL)101118V919951×1=1400Multichip286503.8128128
SPARC64 II Fujitsu (HAL)141161V919961×1=1350Multichip286643.3128128
SPARC64 III Fujitsu (HAL) MBCS70301250330V919981×1=124017.62402.564648192
UltraSPARC IIs (Blackbird)Sun STP1031250400V919971×1=13505.414952125 [note 4] 2.516161024 or 4096none
UltraSPARC IIs (Sapphire-Black)Sun STP1032 / STP1034360480V919991×1=12505.412652121 [note 5] 1.9161610248192none
UltraSPARC IIi (Sabre)Sun SME1040270360V919971×1=13505.4156587211.916162562048none
UltraSPARC IIi (Sapphire-Red)Sun SME1430333480V919981×1=12505.458721 [note 6] 1.916162048none
UltraSPARC IIe (Hummingbird)Sun SME1701400500V919991×1=1180 Al37013 [note 7] 1.5–1.71616256none
UltraSPARC IIi (IIe+) (Phantom)Sun SME1532550650V920001×1=1180 Cu37017.61.71616512none
SPARC64 GP Fujitsu SFCB81147400563V920001×1=118030.22171.81281288192
SPARC64 GP --600810V91×1=115030.21.51281288192
SPARC64 IVFujitsu MBCS80523450810V920001×1=11301281282048
UltraSPARC III (Cheetah)Sun SME1050600JPS120011×1=1180 Al293301368531.664328192none
UltraSPARC III (Cheetah)Sun SME1052750900JPS120011×1=1130 Al2913681.664328192none
UltraSPARC III Cu (Cheetah+)Sun SME10569001200JPS120011×1=1130 Cu29232136850 [note 8] 1.664328192none
UltraSPARC IIIi (Jalapeño)Sun SME160310641593JPS120031×1=113087.5206959521.364321024none
SPARC64 V (Zeus)Fujitsu11001350JPS120031×1=1130190289269401.21281282048
SPARC64 V+ (Olympus-B)Fujitsu16502160JPS120041×1=1904002972796511281284096
UltraSPARC IV (Jaguar)Sun SME116710501350JPS220041×2=21306635613681081.35643216384none
UltraSPARC IV+ (Panther)Sun SME1167A15002100JPS220051×2=2902953361368901.16464204832768
UltraSPARC T1 (Niagara)Sun SME190510001400UA200520054×8=32903003401933721.38163072none
SPARC64 VI (Olympus-C)Fujitsu21502400JPS220072×2=4905404221201501.1128×2128×240966144none
UltraSPARC T2 (Niagara 2)Sun SME1908A10001600UA200720078×8=64655033421831951.11.58164096none
UltraSPARC T2 Plus (Victoria Falls)Sun SME1910A12001600UA200720088×8=646550334218318164096none
SPARC64 VII (Jupiter) [20] Fujitsu24002880JPS220082×4=86560044515064×464×46144none
UltraSPARC "RK" (Rock) [21] Sun SME18322300????canceled [22] 2×16=3265?3962326??32322048?
SPARC64 VIIIfx (Venus) [23] [24] Fujitsu2000JPS2 / HPC-ACE20091×8=845760513127158?32×832×86144none
LEON2FT Atmel AT697F100V820091×1=118019611.8/3.31632none
SPARC T3 (Rainbow Falls)Oracle/Sun1650UA200720108×16=12840 [25] ????371?139?8166144none
Galaxy FT-1500 NUDT (China)1800UA2007?201?8×16=12840????????65?16×1616×16512×164096
SPARC64 VII+ (Jupiter-E or M3) [26] [27] Fujitsu2667–3000JPS220102×4=86516064×464×412288none
LEON3FT Cobham Gaisler GR712RC100V8E20111×2=21801.5 [note 9] 1.8/3.34x4Kb4x4Kbnonenone
R1000 MCST (Russia)1000JPS220111×4=490180128151, 1.8, 2.532162048none
SPARC T4 (Yosemite Falls) [28] Oracle2850–3000OSA201120118×8=6440855403?240?16×816×8128×84096
SPARC64 IXfx [29] [30] [31] Fujitsu1850JPS2 / HPC-ACE20121x16=164018704841442110?32×1632×1612288none
SPARC64 X (Athena) [32] Fujitsu2800OSA2011 / HPC-ACE20122×16=32282950587.51500270?64×1664×1624576none
SPARC T5 Oracle3600OSA201120138×16=128281500478???16×1616×16128×168192
SPARC M5 [33] Oracle3600OSA201120138×6=48283900511???16×616×6128×649152
SPARC M6 [34] Oracle3600OSA201120138×12=96284270643???16×1216×12128×1249152
SPARC64 X+ (Athena+) [35] Fujitsu3200–3700OSA2011 / HPC-ACE20142×16=322829906001500392?64×1664×1624Mnone
SPARC64 XIfx [36] Fujitsu2200JPS2 / HPC-ACE220141×(32+2)=34203750?1001??64×3464×3412M×2none
SPARC M7 [37] [38] Oracle4133OSA201520158×32=25620>10,000????16×3216×32256×2465536
SPARC S7 [39] [40] Oracle4270OSA201520168×8=6420????????16×816×8256×2+256×416384
SPARC64 XII [41] Fujitsu4250OSA201? / HPC-ACE20178×12=962055007951860??64×1264×12512×1232768
SPARC M8 [42] [43] Oracle5000OSA201720178×32=25620?????32×3216×32128×32+256×865536
LEON4 Cobham Gaisler GR740250 [note 10] V8E20171×4=4321.2/2.5/3.34x44x42048none
R2000 MCST (Russia)2000?20181×8=828500??????none
LEON5 Cobham Gaisler V8E2019????16–8192none

Notes:

  1. Threads per core × number of cores
  2. Various SPARC V7 implementations were produced by Fujitsu, LSI Logic, Weitek, Texas Instruments, Cypress and Temic. A SPARC V7 processor generally consisted of several discrete chips, usually comprising an integer unit (IU), a floating-point unit (FPU), a memory management unit (MMU) and cache memory. Conversely, the Atmel (now Microchip Technology) TSC695 is a single-chip SPARC V7 implementation.
  3. @167 MHz
  4. @250 MHz
  5. @400 MHz
  6. @440 MHz
  7. max. @500 MHz
  8. @1200 MHz
  9. excluding I/O buses
  10. nominal; specification from 100 to 424 MHz depending on attached RAM capabilities
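
Note 1 defines the "Total threads" column as threads per core × number of cores. The following is a minimal Python sketch of how such a cell could be checked for internal consistency. It assumes the cell is written as `T×C=N`, with either `×` or `x` as the multiplication sign (both appear in the table); the function name `check_threads` is hypothetical, and cells with parenthesized core counts such as `1×(32+2)=34` are deliberately not handled.

```python
import re

def check_threads(cell: str) -> bool:
    """Return True if a 'threads×cores=total' cell is internally consistent."""
    # Accept either the multiplication sign or a plain 'x' between the factors.
    m = re.fullmatch(r"\s*(\d+)\s*[×x]\s*(\d+)\s*=\s*(\d+)\s*", cell)
    if m is None:
        raise ValueError(f"unrecognized thread cell: {cell!r}")
    threads_per_core, cores, total = map(int, m.groups())
    return threads_per_core * cores == total

# Cells taken from the table rows above.
for cell in ["8×8=64", "1x16=16", "8×32=256", "2×16=32"]:
    print(cell, check_threads(cell))
```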

Operating system support

SPARC machines have generally used Sun's SunOS, Solaris, JavaOS, or OpenSolaris (including its derivatives illumos and OpenIndiana), but other operating systems have also been used, such as NeXTSTEP, RTEMS, FreeBSD, OpenBSD, NetBSD, and Linux.

In 1993, Intergraph announced a port of Windows NT to the SPARC architecture, [44] but it was later cancelled.

In October 2015, Oracle announced a "Linux for SPARC reference platform". [45]

Open source implementations

Several fully open source implementations of the SPARC architecture exist.

A fully open source simulator for the SPARC architecture also exists.

Supercomputers

For HPC workloads, Fujitsu builds specialized SPARC64 fx processors with a new set of instruction extensions called HPC-ACE (High Performance Computing – Arithmetic Computational Extensions).

Fujitsu's K computer ranked No.1 in the TOP500 June 2011 and November 2011 lists. It combines 88,128 SPARC64 VIIIfx CPUs, each with eight cores, for a total of 705,024 cores, almost twice as many as any other system in the TOP500 at that time. The K computer was more powerful than the next five systems on the list combined, and had the highest performance-to-power ratio of any supercomputer system. [46] It also ranked No.6 in the Green500 June 2011 list, with a score of 824.56 MFLOPS/W. [47] In the November 2012 release of TOP500, the K computer ranked No.3, using by far the most power of the top three. [48] It ranked No.85 on the corresponding Green500 release. [49] The newer HPC processors, SPARC64 IXfx and XIfx, are used in the PRIMEHPC FX10 and FX100 supercomputers, respectively.
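
The total core count quoted for the K computer follows directly from the CPU count and the cores per CPU. A short Python check, using only the figures stated in the paragraph above (the variable names are illustrative):

```python
cpus = 88_128        # SPARC64 VIIIfx CPUs in the K computer
cores_per_cpu = 8    # each CPU has eight cores

total_cores = cpus * cores_per_cpu
print(f"{total_cores:,}")  # 705,024
```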

Tianhe-2 (TOP500 No.1 as of November 2014 [50]) has a number of nodes with Galaxy FT-1500 OpenSPARC-based processors developed in China. However, those processors did not contribute to the LINPACK score. [51] [52]

References

  1. "Fujitsu to take ARM into the realm of Super". The CPU Shack Museum. June 21, 2016. Archived from the original on June 30, 2019. Retrieved June 30, 2019.
  2. "Timeline". SPARC International. Archived from the original on April 24, 2019. Retrieved June 30, 2019.
  3. "Fujitsu SPARC". cpu-collection.de. Archived from the original on August 6, 2016. Retrieved June 30, 2019.
  4. Vaughan-Nichols, Steven J. (September 5, 2017). "Sun set: Oracle closes down last Sun product lines". ZDNet. Archived from the original on September 10, 2017. Retrieved September 11, 2017.
  5. Nichols, Shaun (August 31, 2017). "Oracle finally decides to stop prolonging the inevitable, begins hardware layoffs". The Register. Archived from the original on September 12, 2017. Retrieved September 11, 2017.
  6. "Roadmap: Fujitsu Global". www.fujitsu.com. Retrieved February 15, 2022.
  7. "Oracle SPARC Architecture 2015: One Architecture ... Multiple Innovative Implementations" (PDF). Draft D1.0.0. January 12, 2016. Archived (PDF) from the original on April 24, 2016. Retrieved June 13, 2016. IMPL. DEP. #2-V8: An Oracle SPARC Architecture implementation may contain from 72 to 640 general-purpose 64-bit R registers. This corresponds to a grouping of the registers into MAXPGL + 1 sets of global R registers plus a circular stack of N_REG_WINDOWS sets of 16 registers each, known as register windows. The number of register windows present (N_REG_WINDOWS) is implementation dependent, within the range of 3 to 32 (inclusive).
  8. "SPARC Options", Using the GNU Compiler Collection (GCC), GNU, archived from the original on January 9, 2013, retrieved January 8, 2013
  9. SPARC Optimizations With GCC, OSNews, February 23, 2004, archived from the original on May 23, 2013, retrieved January 8, 2013
  10. Weaver, D. L.; Germond, T., eds. (1994). The SPARC Architecture Manual, Version 9. SPARC International, Inc.: Prentice Hall. ISBN 0-13-825001-4. Retrieved May 27, 2023.
  11. "SPARC Behavior and Implementation". Numerical Computation Guide – Sun Studio 10. Sun Microsystems, Inc. 2004. Archived from the original on January 25, 2022. Retrieved September 24, 2011. There are four situations, however, when the hardware will not successfully complete a floating-point instruction: ... The instruction is not implemented by the hardware (such as ... quad-precision instructions on any SPARC FPU).
  12. "Oracle SPARC Architecture 2011" (PDF), Oracle Corporation, May 21, 2014, archived (PDF) from the original on September 24, 2015, retrieved November 25, 2015
  13. Soat, John. "SPARC M7 Innovation". Oracle web site. Oracle Corporation. Archived from the original on September 5, 2015. Retrieved October 13, 2015.
  14. "Software in Silicon Cloud - Oracle". www.oracle.com. Archived from the original on January 21, 2019. Retrieved January 21, 2019.
  15. The SPARC Architecture Manual, Version 8. SPARC International, Inc. 1992. Retrieved May 27, 2023.
  16. "SPARC Fundamental Instructions".
  17. "SPARC64™ IXfx Extensions Fujitsu Limited Ver 12, 2 Dec. 2013" (PDF). pp. 103–104. Retrieved December 17, 2023.
  18. "Floodgap Retrobits presents the Solbourne Solace: a shrine to the forgotten SPARC". www.floodgap.com. Archived from the original on December 1, 2020. Retrieved January 14, 2020.
  19. Sager, D.; Hinton, G.; Upton, M.; Chappell, T.; Fletcher, T.D.; Samaan, S.; Murray, R. (2001). "A 0.18 μm CMOS IA32 microprocessor with a 4 GHz integer execution unit". 2001 IEEE International Solid-State Circuits Conference. Digest of Technical Papers. ISSCC (Cat. No.01CH37177). San Francisco, CA, USA: IEEE. pp. 324–325. doi:10.1109/ISSCC.2001.912658. ISBN 978-0-7803-6608-4.
  20. FX1 Key Features & Specifications (PDF), Fujitsu, February 19, 2008, archived (PDF) from the original on January 18, 2012, retrieved December 6, 2011
  21. Tremblay, Marc; Chaudhry, Shailender (February 19, 2008), "A Third-Generation 65nm 16-Core 32-Thread Plus 32-Scout-Thread CMT SPARC(R) Processor" (PDF), OpenSPARC, Sun Microsystems, archived from the original on January 16, 2013, retrieved December 6, 2011
  22. Vance, Ashlee (June 15, 2009), "Sun Is Said to Cancel Big Chip Project", The New York Times, archived from the original on November 4, 2011, retrieved May 23, 2010
  23. "Fujitsu shows off SPARC64 VII", heise online, August 28, 2008, archived from the original on May 23, 2013, retrieved December 6, 2011
  24. Barak, Sylvie (May 14, 2009), "Fujitsu unveils world's fastest CPU", The Inquirer, archived from the original on May 17, 2009, retrieved December 6, 2011
  25. "Sparc T3 processor" (PDF), Oracle Corporation, archived (PDF) from the original on April 24, 2016, retrieved December 6, 2011
  26. Morgan, Timothy Prickett (December 3, 2010), "Ellison: Sparc T4 due next year", The Register, archived from the original on March 7, 2012, retrieved December 6, 2011
  27. "SPARC Enterprise M-series Servers Architecture" (PDF), Fujitsu, April 2011, archived (PDF) from the original on March 4, 2016, retrieved November 5, 2011
  28. Morgan, Timothy Prickett (August 22, 2011), "Oracle's Sparc T4 chip", The Register, archived from the original on November 30, 2011, retrieved December 6, 2011
  29. Morgan, Timothy Prickett (November 21, 2011), "Fujitsu parades 16-core Sparc64 super stunner", The Register, archived from the original on November 24, 2011, retrieved December 8, 2011
  30. "Fujitsu Launches PRIMEHPC FX10 Supercomputer", Fujitsu, November 7, 2011, archived from the original on January 18, 2012, retrieved February 3, 2012
  31. "Ixfx Download" (PDF). fujitsu.com. Archived (PDF) from the original on May 18, 2015. Retrieved May 17, 2015.
  32. "Images of SPARC64" (PDF). fujitsu.com. Archived (PDF) from the original on April 22, 2016. Retrieved August 29, 2017.
  33. "Oracle Products" (PDF). oracle.com. Archived (PDF) from the original on March 8, 2017. Retrieved August 29, 2017.
  34. "Oracle SPARC products" (PDF). oracle.com. Archived (PDF) from the original on September 26, 2018. Retrieved August 29, 2017.
  35. "Fujitsu Presentation pdf" (PDF). fujitsu.com. Archived (PDF) from the original on April 22, 2016. Retrieved August 29, 2017.
  36. "Fujitsu Global Images" (PDF). fujitsu.com. Archived from the original (PDF) on May 18, 2015. Retrieved August 29, 2017.
  37. "M7: Next Generation SPARC. Hotchips 26" (PDF). swisdev.oracle.com. Archived (PDF) from the original on October 31, 2014. Retrieved August 12, 2014.
  38. "Oracle's SPARC T7 and SPARC M7 Server Architecture" (PDF). oracle.com. Archived (PDF) from the original on November 6, 2015. Retrieved October 10, 2015.
  39. Vinaik, Basant; Puri, Rahoul (August 24, 2015). "Hot Chips – August 23–25, 2015 – Conf. Day1 – Oracle's Sonoma Processor: Advanced low-cost SPARC processor for enterprise workloads" (PDF). hotchips.org. Archived (PDF) from the original on October 9, 2022. Retrieved January 25, 2022.
  40. "Blueprints revealed: Oracle crams Sparc M7 and InfiniBand into cheaper 'Sonoma' chips". theregister.co.uk. Archived from the original on August 29, 2017. Retrieved August 29, 2017.
  41. "Documents at Fujitsu" (PDF). fujitsu.com. Archived (PDF) from the original on August 29, 2017. Retrieved August 29, 2017.
  42. "Oracle's New SPARC Systems Deliver 2-7x Better Performance, Security Capabilities, and Efficiency than Intel-based Systems". oracle.com. Archived from the original on September 18, 2017. Retrieved September 18, 2017.
  43. "SPARC M8 Processor" (PDF). oracle.com. Archived (PDF) from the original on February 28, 2019. Retrieved September 18, 2017.
  44. McLaughlin, John (July 7, 1993), "Intergraph to Port Windows NT to SPARC", The Florida SunFlash, 55 (11), archived from the original on July 23, 2014, retrieved December 6, 2011
  45. Project: Linux for SPARC - oss.oracle.com, October 12, 2015, archived from the original on December 8, 2015, retrieved December 4, 2015
  46. "TOP500 List (1-100)", TOP500, June 2011, archived from the original on June 23, 2011, retrieved December 6, 2011
  47. "The Green500 List", Green500, June 2011, archived from the original on July 3, 2011
  48. "Top500 List – November 2012 | TOP500 Supercomputer Sites", TOP500, November 2012, archived from the original on November 13, 2012, retrieved January 8, 2013
  49. "The Green500 List – November 2012 | The Green500", Green500, November 2012, archived from the original on June 6, 2016, retrieved January 8, 2013
  50. "Tianhe-2 (MilkyWay-2)", TOP500, May 2015, archived from the original on May 26, 2015, retrieved May 27, 2015
  51. Keane, Andy, "Tesla Supercomputing" (mp4), Nvidia, archived from the original on February 25, 2021, retrieved December 6, 2011
  52. Thibodeau, Patrick (November 4, 2010). "U.S. says China building 'entirely indigenous' supercomputer". Computerworld. Archived from the original on October 11, 2012. Retrieved August 28, 2017.