3DNow!

Last updated
3DNow!
3dnow-logo.svg
Design firm Advanced Micro Devices
Introduced1998
Type instruction set architecture

3DNow! is a deprecated extension to the x86 instruction set developed by Advanced Micro Devices (AMD). It adds single instruction multiple data (SIMD) instructions to the base x86 instruction set, enabling it to perform vector processing of floating-point vector operations using vector registers. This improvement enhances the performance of many graphics-intensive applications. The first microprocessor to implement 3DNow! was the AMD K6-2, introduced in 1998. In appropriate applications, this enhancement raised the speed by about 2–4 times. [1]

Contents

However, the instruction set never gained much popularity, and AMD announced in August 2010 that support for 3DNow! would be dropped in future AMD processors, except for two instructions, PREFETCH and PREFETCHW. [2] These two instructions are also available in Bay-Trail Intel processors. [3]

History

3DNow! was developed at a time when 3D graphics were becoming mainstream in PC multimedia and games. Realtime display of 3D graphics depended heavily on the host CPU's floating-point unit (FPU) to perform floating-point calculations, a task in which AMD's K6 processor was easily outperformed by its competitor, the Intel Pentium II.

As an enhancement to the MMX instruction set, the 3DNow! instruction-set augmented the MMX SIMD registers to support common arithmetic operations (add/subtract/multiply) on single-precision (32-bit) floating-point data. Software written to use AMD's 3DNow! instead of the slower x87 FPU could execute up to four times faster, depending on the instruction mix.

Versions

3DNow!

The first implementation of 3DNow! technology contains 21 new instructions that support SIMD floating-point operations. The 3DNow! data format is packed, single-precision, floating-point. The 3DNow! instruction set also includes operations for SIMD integer operations, data prefetch, and faster MMX-to-floating-point switching. Later, Intel would add similar (but incompatible) instructions to the Pentium III, known as SSE (Streaming SIMD Extensions).

3DNow! floating-point instructions are the following:

  • PI2FD  Packed 32-bit integer to floating-point conversion
  • PF2ID  Packed floating-point to 32-bit integer conversion
  • PFCMPGE  Packed floating-point comparison, greater or equal
  • PFCMPGT  Packed floating-point comparison, greater
  • PFCMPEQ  Packed floating-point comparison, equal
  • PFACC  Packed floating-point accumulate
  • PFADD  Packed floating-point addition
  • PFSUB  Packed floating-point subtraction
  • PFSUBR  Packed floating-point reverse subtraction
  • PFMIN  Packed floating-point minimum
  • PFMAX  Packed floating-point maximum
  • PFMUL  Packed floating-point multiplication
  • PFRCP  Packed floating-point reciprocal approximation
  • PFRSQRT  Packed floating-point reciprocal square root approximation
  • PFRCPIT1  Packed floating-point reciprocal, first iteration step
  • PFRSQIT1  Packed floating-point reciprocal square root, first iteration step
  • PFRCPIT2  Packed floating-point reciprocal/reciprocal square root, second iteration step

3DNow! integer instructions are the following:

3DNow! performance-enhancement instructions are the following:

3DNow! extensions

There is little or no evidence that the second version of 3DNow! was ever officially given its own trade name. This has led to some confusion in documentation that refers to this new instruction set. The most common terms are Extended 3DNow!, Enhanced 3DNow! and 3DNow!+. The phrase "Enhanced 3DNow!" can be found in a few locations on the AMD website but the capitalization of "Enhanced" appears to be either purely grammatical or used for emphasis on processors that may or may not have these extensions (the most notable of which references a benchmark page for the K6-III-P that does not have these extensions). [4] [5]

This extension to the 3DNow! instruction set was introduced with the first-generation Athlon processors. The Athlon added five new 3DNow! instructions and 19 new MMX instructions. Later, the K6-2+ and K6-III+ (both targeted at the mobile market) included the five new 3DNow! instructions, leaving out the 19 new MMX instructions. The new 3DNow! instructions were added to boost DSP. The new MMX instructions were added to boost streaming media.

The 19 new MMX instructions are a subset of Intel's SSE instruction set. In AMD technical manuals, AMD segregates these instructions apart from the 3DNow! extensions. [4] In AMD customer product literature, however, this segregation is less clear where the benefits of all 24 new instructions are credited to enhanced 3DNow! technology. [6] This has led programmers to come up with their own name for the 19 new MMX instructions. The most common appears to be Integer SSE (ISSE). [7] SSEMMX and MMX2 are also found in video filter documentation from the public domain sector. ISSE could also refer to Internet SSE, an early name for SSE.

3DNow! extension DSP instructions are the following:

MMX extension instructions (Integer SSE) are the following:

  • MASKMOVQ  Streaming (cache bypass) store using byte mask
  • MOVNTQ  Streaming (cache bypass) store
  • PAVGB  Packed average of unsigned byte
  • PAVGW  Packed average of unsigned word
  • PMAXSW  Packed maximum signed word
  • PMAXUB  Packed maximum unsigned byte
  • PMINSW  Packed minimum signed word
  • PMINUB  Packed minimum unsigned byte
  • PMULHUW  Packed multiply high unsigned word
  • PSADBW  Packed sum of absolute byte differences
  • PSHUFW  Packed shuffle word
  • PEXTRW  Extract word into integer register
  • PINSRW  Insert word from integer register
  • PMOVMSKB  Move byte mask to integer register
  • PREFETCHNTA  Prefetch using the NTA reference
  • PREFETCHT0  Prefetch using the T0 reference
  • PREFETCHT1  Prefetch using the T1 reference
  • PREFETCHT2  Prefetch using the T2 reference
  • SFENCE  Store fence

3DNow! Professional

3DNow! Professional is a trade name used to indicate processors that combine 3DNow! technology with a complete SSE instructions set (such as SSE, SSE2 or SSE3). [8] The Athlon XP was the first processor to carry the 3DNow! Professional trade name, and was the first product in the Athlon family to support the complete SSE instruction set (for the total of: 21 original 3DNow! instructions; five 3DNow! extension DSP instructions; 19 MMX extension instructions; and 52 additional SSE instructions for complete SSE compatibility). [9]

3DNow! and the Geode GX/LX

The Geode GX and Geode LX added two new 3DNow! instructions which is absent in all other processors.

3DNow! "professional" instructions unique to the Geode GX/LX are the following:

Advantages and disadvantages

One advantage of 3DNow! is that it is possible to add or multiply the two numbers that are stored in the same register. With SSE, each number can only be combined with a number in the same position in another register. This capability, known as horizontal in Intel terminology, was the major addition to the SSE3 instruction set.

A disadvantage with 3DNow! is that 3DNow! instructions and MMX instructions share the same register-file, whereas SSE adds 8 new independent registers (XMM0XMM7).

Because MMX/3DNow! registers are shared by the standard x87 FPU, 3DNow! instructions and x87 instructions cannot be executed simultaneously. However, because it is aliased to the x87 FPU, the 3DNow! and MMX register states can be saved and restored by the traditional x87 F(N)SAVE and F(N)RSTOR instructions. This arrangement allowed operating systems to support 3DNow! with no explicit modifications, whereas SSE registers required explicit operating system support to properly save and restore the new XMM registers (via the added FXSAVE and FXRSTOR instructions.)

The FX* instructions from SSE provide a functional superset of the older x87 save and restore instructions. They can save not only SSE register states but also the x87 register states (hence are applicable also for MMX and 3DNow! operations where supported).

On AMD Athlon XP and K8-based cores (i.e. Athlon 64), assembly programmers have noted that it is possible to combine 3DNow! and SSE instructions to reduce register pressure, but in practice it is difficult to improve performance due to the instructions executing on shared functional units. [10]

Processors supporting 3DNow!

Related Research Articles

<span class="mw-page-title-main">Athlon</span> Brand of microprocessors by AMD

Athlon is the brand name applied to a series of x86-compatible microprocessors designed and manufactured by AMD. The original Athlon was the first seventh-generation x86 processor and the first desktop processor to reach speeds of one gigahertz (GHz). It made its debut as AMD's high-end processor brand on June 23, 1999. Over the years AMD has used the Athlon name with the 64-bit Athlon 64 architecture, the Athlon II, and Accelerated Processing Unit (APU) chips targeting the Socket AM1 desktop SoC architecture, and Socket AM4 Zen microarchitecture. The modern Zen-based Athlon with a Radeon Graphics processor was introduced in 2019 as AMD's highest-performance entry-level processor.

<span class="mw-page-title-main">AMD K6</span> Computer microprocessor

The K6 microprocessor was launched by AMD in 1997. The main advantage of this particular microprocessor is that it was designed to fit into existing desktop designs for Pentium-branded CPUs. It was marketed as a product that could perform as well as its Intel Pentium II equivalent but at a significantly lower price. The K6 had a considerable impact on the PC market and presented Intel with serious competition.

<span class="mw-page-title-main">Pentium (original)</span> Intel microprocessor

The Pentium is a x86 microprocessor introduced by Intel on March 22, 1993. It is the first CPU using the Pentium brand. Considered the fifth generation in the 8086 compatible line of processors, its implementation and microarchitecture was internally called P5.

x86 Family of instruction set architectures

x86 is a family of complex instruction set computer (CISC) instruction set architectures initially developed by Intel based on the Intel 8086 microprocessor and its 8088 variant. The 8086 was introduced in 1978 as a fully 16-bit extension of Intel's 8-bit 8080 microprocessor, with memory segmentation as a solution for addressing more memory than can be covered by a plain 16-bit address. The term "x86" came into being because the names of several successors to Intel's 8086 processor end in "86", including the 80186, 80286, 80386 and 80486 processors. Colloquially, their names were "186", "286", "386" and "486".

<span class="mw-page-title-main">Single instruction, multiple data</span> Type of parallel processing

Single instruction, multiple data (SIMD) is a type of parallel processing in Flynn's taxonomy. SIMD can be internal and it can be directly accessible through an instruction set architecture (ISA), but it should not be confused with an ISA. SIMD describes computers with multiple processing elements that perform the same operation on multiple data points simultaneously.

<span class="mw-page-title-main">MMX (instruction set)</span> Instruction set designed by Intel

MMX is a single instruction, multiple data (SIMD) instruction set architecture designed by Intel, introduced on January 8, 1997 with its Pentium P5 (microarchitecture) based line of microprocessors, named "Pentium with MMX Technology". It developed out of a similar unit introduced on the Intel i860, and earlier the Intel i750 video pixel processor. MMX is a processor supplementary capability that is supported on IA-32 processors by Intel and other vendors as of 1997.

In computing, Streaming SIMD Extensions (SSE) is a single instruction, multiple data (SIMD) instruction set extension to the x86 architecture, designed by Intel and introduced in 1999 in their Pentium III series of central processing units (CPUs) shortly after the appearance of Advanced Micro Devices (AMD's) 3DNow!. SSE contains 70 new instructions, most of which work on single precision floating-point data. SIMD instructions can greatly increase performance when exactly the same operations are to be performed on multiple data objects. Typical applications are digital signal processing and graphics processing.

Visual Instruction Set, or VIS, is a SIMD instruction set extension for SPARC V9 microprocessors developed by Sun Microsystems. There are five versions of VIS: VIS 1, VIS 2, VIS 2+, VIS 3 and VIS 4.

x86 assembly language is the name for the family of assembly languages which provide some level of backward compatibility with CPUs back to the Intel 8008 microprocessor, which was launched in April 1972. It is used to produce object code for the x86 class of processors.

<span class="mw-page-title-main">AMD K6-2</span> CPU by AMD

The K6-2 is an x86 microprocessor introduced by AMD on May 28, 1998, and available in speeds ranging from 266 to 550 MHz. An enhancement of the original K6, the K6-2 introduced AMD's 3DNow! SIMD instruction set and an upgraded system-bus interface called Super Socket 7, which was backward compatible with older Socket 7 motherboards. It was manufactured using a 250 nanometer process, ran at 2.2 volts, and had 9.3 million transistors.

<span class="mw-page-title-main">AMD K6-III</span> Microprocessor series by AMD

The K6-III was an x86 microprocessor line manufactured by AMD that launched on February 22, 1999. The launch consisted of both 400 and 450 MHz models and was based on the preceding K6-2 architecture. Its improved 256 KB on-chip L2 cache gave it significant improvements in system performance over its predecessor the K6-2. The K6-III was the last processor officially released for desktop Socket 7 systems, however later mobile K6-III+ and K6-2+ processors could be run unofficially in certain socket 7 motherboards if an updated BIOS was made available for a given board. The Pentium III processor from Intel launched 6 days later.

SSE2 is one of the Intel SIMD processor supplementary instruction sets introduced by Intel with the initial version of the Pentium 4 in 2000. It extends the earlier SSE instruction set, and is intended to fully replace MMX. Intel extended SSE2 to create SSE3 in 2004. SSE2 added 144 new instructions to SSE, which has 70 instructions. Competing chip-maker AMD added support for SSE2 with the introduction of their Opteron and Athlon 64 ranges of AMD64 64-bit CPUs in 2003.

SSE3, Streaming SIMD Extensions 3, also known by its Intel code name Prescott New Instructions (PNI), is the third iteration of the SSE instruction set for the IA-32 (x86) architecture. Intel introduced SSE3 in early 2004 with the Prescott revision of their Pentium 4 CPU. In April 2005, AMD introduced a subset of SSE3 in revision E of their Athlon 64 CPUs. The earlier SIMD instruction sets on the x86 platform, from oldest to newest, are MMX, 3DNow!, SSE, and SSE2.

The x86 instruction set refers to the set of instructions that x86-compatible microprocessors support. The instructions are usually part of an executable program, often stored as a computer file and executed on the processor.

The AMD Family 10h, or K10, is a microprocessor microarchitecture by AMD based on the K8 microarchitecture. The first third-generation Opteron products for servers were launched on September 10, 2007, with the Phenom processors for desktops following and launching on November 11, 2007 as the immediate successors to the K8 series of processors.

x87 is a floating-point-related subset of the x86 architecture instruction set. It originated as an extension of the 8086 instruction set in the form of optional floating-point coprocessors that work in tandem with corresponding x86 CPUs. These microchips have names ending in "87". This is also known as the NPX. Like other extensions to the basic instruction set, x87 instructions are not strictly needed to construct working programs, but provide hardware and microcode implementations of common numerical tasks, allowing these tasks to be performed much faster than corresponding machine code routines can. The x87 instruction set includes instructions for basic floating-point operations such as addition, subtraction and comparison, but also for more complex numerical operations, such as the computation of the tangent function and its inverse, for example.

Supplemental Streaming SIMD Extensions 3 is a SIMD instruction set created by Intel and is the fourth iteration of the SSE technology.

Advanced Vector Extensions are SIMD extensions to the x86 instruction set architecture for microprocessors from Intel and Advanced Micro Devices (AMD). They were proposed by Intel in March 2008 and first supported by Intel with the Sandy Bridge processor shipping in Q1 2011 and later by AMD with the Bulldozer processor shipping in Q3 2011. AVX provides new features, new instructions, and a new coding scheme.

AVX-512 are 512-bit extensions to the 256-bit Advanced Vector Extensions SIMD instructions for x86 instruction set architecture (ISA) proposed by Intel in July 2013, and first implemented in the 2016 Intel Xeon Phi x200, and then later in a number of AMD and other Intel CPUs. AVX-512 consists of multiple extensions that may be implemented independently. This policy is a departure from the historical requirement of implementing the entire instruction block. Only the core extension AVX-512F is required by all AVX-512 implementations.

References

  1. "Effectively Utilizing 3DNow in Linux". Linux Journal. December 1, 1999. Archived from the original on 2011-06-07. Retrieved 2010-10-03.
  2. "3DNow Instructions are Being Deprecated | AMD Developer Central". Blogs.amd.com. 2010-08-18. Archived from the original on 2010-10-24. Retrieved 2010-10-03.
  3. "IntelE38xx - MinnowBoard Wiki". Archived from the original on 11 February 2017. Retrieved 13 February 2017.
  4. 1 2 "AMD Extensions to the 3DNow and MMX Instruction Sets Manual" (PDF). Advanced Micro Devices, Inc. March 2000. Archived (PDF) from the original on 2008-05-17. Retrieved 2008-06-07.
  5. "Mobile AMD-K6-III-P Processor-Based Notebook: Ziff-Davis CPUmark 99". Archived from the original on 2008-07-24. Retrieved 2008-06-07. Incorrect title on page: Mobile AMD-K6-III+ and Mobile AMD-K6-2+ Processors with Enchanced [ sic ] 3DNow! Technology
  6. "AMD Athlon Processor Product Brief". Advanced Micro Devices, Inc. Archived from the original on 2008-02-25. Retrieved 2008-06-08.
  7. "ISSE". AviSynth . Archived from the original on 2017-07-02. Retrieved 2017-07-19.
  8. "Explaining the new 3DNow! Professional Technology". Advanced Micro Devices, Inc. Archived from the original on 2009-01-21. Retrieved 2008-06-08.
  9. "AMD Athlon XP Architectural Features". Advanced Micro Devices, Inc. Archived from the original on 2008-02-25. Retrieved 2008-06-08.
  10. Larry Lewis (9 July 2003). "3DNow+ vs SSE on Athlon XP". Newsgroup:  comp.sys.ibm.pc.hardware.chips. Usenet:   ad82cd69.0307090931.25391323@posting.google.com. Archived from the original on 2012-10-03. Retrieved 4 January 2023 via Google Groups.

Further reading