Multimedia Acceleration eXtensions

Last updated

The Multimedia Acceleration eXtensions or MAX are instruction set extensions to the Hewlett-Packard PA-RISC instruction set architecture (ISA). MAX was developed to improve the performance of multimedia applications that were becoming more prevalent during the 1990s.

Contents

MAX instructions operate on 32- or 64-bit SIMD data types consisting of multiple 16-bit integers packed in general purpose registers. The available functionality includes additions, subtractions and shifts.

The first version, MAX-1, was for the 32-bit PA-RISC 1.1 ISA. The second version, MAX-2, was for the 64-bit PA-RISC 2.0 ISA.

Notability

The approach is notable because the set of instructions is much smaller than in other multimedia CPUs, and also more general-purpose. The small set and simplicity of the instructions reduce the recurring costs of the electronics, as well as the costs and difficulty of the design. The general-purpose nature of the instructions increases their overall value. These instructions require only small changes to a CPU's arithmetic-logic unit. A similar design approach promises to be a successful model for the multimedia instructions of other CPU designs. [1] [2] [3] The set is also small because the CPU already included powerful shift and bit-manipulation instructions: "Shift pair" which shifts a pair of registers, "extract" and "deposit" of bit fields, and all the common bit-wise logical operations (and, or, exclusive-or, etc.). [2]

This set of multimedia instructions has proven its performance, as well. In 1996 the 64-bit "MAX-2" instructions enabled real-time performance of MPEG-1 and MPEG-2 video while increasing the area of a RISC CPU by only 0.2%. [1]

Implementations

MAX-1 was first implemented with the PA-7100LC in 1994. It is usually attributed as being the first SIMD extensions to an ISA. The second version, MAX-2, was for the 64-bit PA-RISC 2.0 ISA. It was first implemented in the PA-8000 microprocessor released in 1996. [1]

The basic approach to the arithmetic in MAX-2 is to "interrupt the carries" between the 16-bit subwords, and choose between modular arithmetic, signed and unsigned saturation. This requires only small changes to the arithmetic logic unit. [2]

MAX-1

InstructionDescription
HADDParallel add with modulo arithmetic
HADD,ssParallel add with signed saturation
HADD,usParallel add with unsigned saturation
HSUBParallel subtract with modulo arithmetic
HSUB,ssParallel subtract with signed saturation
HSUB,usParallel subtract with unsigned saturation
HAVEParallel average
HSHLADDParallel shift left and add with signed saturation
HSHRADDParallel shift right and add with signed saturation

MAX-2

MAX-2 instructions are register-to-register instructions that operate on multiple integers in 64-bit quantities. All have a one cycle latency in the PA-8000 microprocessor and its derivatives. Memory accesses are via the standard 64-bit loads and stores.

The "MIX" and "PERMH" instructions are a notable innovation because they permute words in the register set without accessing memory. This can substantially speed many operations. [2]

InstructionDescription
HADDParallel add with modulo arithmetic
HADD,ssParallel add with signed saturation
HADD,usParallel add with unsigned saturation
HSUBParallel subtract with modulo arithmetic
HSUB,ssParallel subtract with signed saturation
HSUB,usParallel subtract with unsigned saturation
HSHLADDParallel shift left and add with signed saturation
HSHRADDParallel shift right and add with signed saturation
HAVGParallel average
HSHRParallel shift right signed
HSHR,uParallel shift right unsigned
HSHLParallel shift left
MIXMix 16-bit sub-words in a 64-bit word; MIX Left, Ra,Rb,Rc, Rc:=a1,b1,a3,b3; MIX Right, Rc:=a2,b2,a4,b4 [2]
MIXWMix 32-bit sub-words in a 64-bit word; e.g. MIXW Left, Ra,Rb,Rc, Rc:=a1,a2,b1,b2; MIXW Right, Rc:=a3,a4,b3,b4 [2]
PERMHPermute 16-bit sub-words of the source in any possible permutation in the destination register, including repetitions. [2]

Related Research Articles

<span class="mw-page-title-main">Central processing unit</span> Central computer component which executes instructions

A central processing unit (CPU)—also called a central processor or main processor—is the most important processor in a given computer. Its electronic circuitry executes instructions of a computer program, such as arithmetic, logic, controlling, and input/output (I/O) operations. This role contrasts with that of external components, such as main memory and I/O circuitry, and specialized coprocessors such as graphics processing units (GPUs).

<span class="mw-page-title-main">DEC Alpha</span> 64-bit RISC instruction set architecture

Alpha is a 64-bit reduced instruction set computer (RISC) instruction set architecture (ISA) developed by Digital Equipment Corporation (DEC). Alpha was designed to replace 32-bit VAX complex instruction set computers (CISC) and to be a highly competitive RISC processor for Unix workstations and similar markets.

MIPS is a family of reduced instruction set computer (RISC) instruction set architectures (ISA) developed by MIPS Computer Systems, now MIPS Technologies, based in the United States.

<span class="mw-page-title-main">PA-RISC</span> Instruction set architecture by Hewlett-Packard

Precision Architecture RISC (PA-RISC) or Hewlett Packard Precision Architecture, is a general purpose computer instruction set architecture (ISA) developed by Hewlett-Packard from the 1980s until the 2000s.

<span class="mw-page-title-main">Reduced instruction set computer</span> Processor executing one instruction in minimal clock cycles

In computer engineering, a reduced instruction set computer (RISC) is a computer architecture designed to simplify the individual instructions given to the computer to accomplish tasks. Compared to the instructions given to a complex instruction set computer (CISC), a RISC computer might require more instructions in order to accomplish a task because the individual instructions are written in simpler code. The goal is to offset the need to process more instructions by increasing the speed of each instruction, in particular by implementing an instruction pipeline, which may be simpler to achieve given simpler instructions.

In computer science, an instruction set architecture (ISA), also called computer architecture, is an abstract model of a computer. A device that executes instructions described by that ISA, such as a central processing unit (CPU), is called an implementation.

<span class="mw-page-title-main">Single instruction, multiple data</span> Type of parallel processing

Single instruction, multiple data (SIMD) is a type of parallel processing in Flynn's taxonomy. SIMD can be internal and it can be directly accessible through an instruction set architecture (ISA), but it should not be confused with an ISA. SIMD describes computers with multiple processing elements that perform the same operation on multiple data points simultaneously.

AltiVec is a single-precision floating point and integer SIMD instruction set designed and owned by Apple, IBM, and Freescale Semiconductor — the AIM alliance. It is implemented on versions of the PowerPC processor architecture, including Motorola's G4, IBM's G5 and POWER6 processors, and P.A. Semi's PWRficient PA6T. AltiVec is a trademark owned solely by Freescale, so the system is also referred to as Velocity Engine by Apple and VMX by IBM and P.A. Semi.

<span class="mw-page-title-main">MMX (instruction set)</span> Instruction set designed by Intel

MMX is a single instruction, multiple data (SIMD) instruction set architecture designed by Intel, introduced on January 8, 1997 with its Pentium P5 (microarchitecture) based line of microprocessors, named "Pentium with MMX Technology". It developed out of a similar unit introduced on the Intel i860, and earlier the Intel i750 video pixel processor. MMX is a processor supplementary capability that is supported on IA-32 processors by Intel and other vendors as of 1997.

Visual Instruction Set, or VIS, is a SIMD instruction set extension for SPARC V9 microprocessors developed by Sun Microsystems. There are five versions of VIS: VIS 1, VIS 2, VIS 2+, VIS 3 and VIS 4.

SuperH is a 32-bit reduced instruction set computing (RISC) instruction set architecture (ISA) developed by Hitachi and currently produced by Renesas. It is implemented by microcontrollers and microprocessors for embedded systems.

<span class="mw-page-title-main">Emotion Engine</span> Central processing unit by Sony Computer Entertainment and Toshiba

The Emotion Engine is a central processing unit developed and manufactured by Sony Computer Entertainment and Toshiba for use in the PlayStation 2 video game console. It was also used in early PlayStation 3 models sold in Japan and North America to provide PlayStation 2 game support. Mass production of the Emotion Engine began in 1999 and ended in late 2012 with the discontinuation of the PlayStation 2.

In computer architecture, 128-bit integers, memory addresses, or other data units are those that are 128 bits wide. Also, 128-bit central processing unit (CPU) and arithmetic logic unit (ALU) architectures are those that are based on registers, address buses, or data buses of that size.

The R8000 is a microprocessor chipset developed by MIPS Technologies, Inc. (MTI), Toshiba, and Weitek. It was the first implementation of the MIPS IV instruction set architecture. The R8000 is also known as the TFP, for Tremendous Floating-Point, its name during development.

SIMD within a register (SWAR), also known by the name "packed SIMD" is a technique for performing parallel operations on data contained in a processor register. SIMD stands for single instruction, multiple data. Flynn's 1972 taxonomy categorises SWAR as "pipelined processing".

The MDMX, also known as MaDMaX, is an extension to the MIPS architecture released in October 1996 at the Microprocessor Forum.

Hombre is a RISC chipset for the Amiga, designed by Commodore, which was intended as the basis of a range of Amiga personal computers and multimedia products, including a successor to the Amiga 1200, a next generation game machine called CD64 and a 3D accelerator PCI card. Hombre was canceled along with the bankruptcy of Commodore International.

<span class="mw-page-title-main">PA-8000</span> HP microprocessor

The PA-8000 (PCX-U), code-named Onyx, is a microprocessor developed and fabricated by Hewlett-Packard (HP) that implemented the PA-RISC 2.0 instruction set architecture (ISA). It was a completely new design with no circuitry derived from previous PA-RISC microprocessors. The PA-8000 was introduced on 2 November 1995 when shipments began to members of the Precision RISC Organization (PRO). It was used exclusively by PRO members and was not sold on the merchant market. All follow-on PA-8x00 processors are based on the basic PA-8000 processor core.

<span class="mw-page-title-main">PA-7100LC</span>

The PA-7100LC is a microprocessor that implements the PA-RISC 1.1 instruction set architecture (ISA) developed by Hewlett-Packard (HP). It is also known as the PCX-L, and by its code-name, Hummingbird. It was designed as a low-cost microprocessor for low-end systems. The first systems to feature the PA-7100LC were introduced in January 1994. These systems used 60 and 80 MHz parts. A 100 MHz part debuted in June 1994. The PA-7100LC was the first PA-RISC microprocessor to implement the MAX-1 multimedia instructions, an early single instruction, multiple data (SIMD) multimedia instruction set extension that provided instructions for improving the performance of MPEG video decoding.

RISC-V is an open standard instruction set architecture (ISA) based on established reduced instruction set computer (RISC) principles. Unlike most other ISA designs, RISC-V is provided under royalty-free open-source licenses. A number of companies are offering or have announced RISC-V hardware; open source operating systems with RISC-V support are available, and the instruction set is supported in several popular software toolchains.

References

  1. 1 2 3 Lee, Ruby B. (August 1996). "Subword Parallelism with MAX-2" (PDF). IEEE Micro. 16 (4): 51–59. doi:10.1109/40.526925 . Retrieved 21 September 2014.
  2. 1 2 3 4 5 6 7 Lee, Ruby; Huck, Jerry (February 25, 1996). "64-bit and multimedia extensions in the PA-RISC 2.0 architecture". COMPCON '96. Technologies for the Information Superhighway Digest of Papers. pp. 152–160. doi:10.1109/CMPCON.1996.501762. ISBN   0-8186-7414-8. S2CID   13081443.
  3. Lee, Ruby B. (April 1995). "Accelerating Multimedia with Enhanced Microprocessors" (PDF). IEEE Micro. 15 (2): 22–32. doi:10.1109/40.372347 . Retrieved 21 September 2014.