4D vector

Last updated

In computer science, a 4D vector is a 4-component vector data type. Uses include homogeneous coordinates for 3-dimensional space in computer graphics, and red green blue alpha (RGBA) values for bitmap images with a color and alpha channel (as such they are widely used in computer graphics). They may also represent quaternions (useful for rotations) although the algebra they define is different.

Contents

Computer hardware support

Some microprocessors have hardware support for 4D vectors with instructions dealing with 4 lane single instruction, multiple data (SIMD) instructions, usually with a 128-bit data path and 32-bit floating point fields. [1]

Specific instructions (e.g., 4 element dot product) may facilitate the use of one 128-bit register to represent a 4D vector. For example, in chronological order: Hitachi SH4, PowerPC VMX128 extension, [2] and Intel x86 SSE4. [3]

Some 4-element vector engines (e.g., the PS2 vector units) went further with the ability to broadcast components as multiply sources, and cross product support. [4] [5] Earlier generations of graphics processing unit (GPU) shader pipelines used very long instruction word (VLIW) instruction sets tailored for similar operations.

Software support

SIMD use for 4D vectors can be conveniently wrapped in a vector maths library (commonly implemented in C or C++) [6] [7] [8] commonly used in video game development, along with 4×4 matrix support. These are distinct from more general linear algebra libraries in other domains focussing on matrices of arbitrary size. Such libraries sometimes support 3D vectors padded to 4D or loading 3D data into 4D registers, with arithmetic mapped efficiently to SIMD operations by per platform intrinsic function implementations. There is choice between AOS and SOA approaches given the availability of 4 element registers, versus SIMD instructions that are usually tailored toward homogenous data.

Shading languages for graphics processing unit (GPU) programming usually have a 4D datatypes (along with 2D, 3D) with x-y-z-w accessors including permutes or swizzle access, e.g., allowing easy swapping of RGBA or ARGB formats, accessing two 2D vectors packed into one 4D vector, etc. [9] Modern GPUs have since moved to scalar single instruction, multiple threads (SIMT) pipelines (for more efficiency in general-purpose computing on graphics processing units (GPGPU)) but still support this programming model. [10]

See also

Related Research Articles

<span class="mw-page-title-main">Single instruction, multiple data</span> Type of parallel processing

Single instruction, multiple data (SIMD) is a type of parallel processing in Flynn's taxonomy. SIMD can be internal and it can be directly accessible through an instruction set architecture (ISA), but it should not be confused with an ISA. SIMD describes computers with multiple processing elements that perform the same operation on multiple data points simultaneously.

<span class="mw-page-title-main">MMX (instruction set)</span> Instruction set designed by Intel

MMX is a single instruction, multiple data (SIMD) instruction set architecture designed by Intel, introduced on January 8, 1997 with its Pentium P5 (microarchitecture) based line of microprocessors, named "Pentium with MMX Technology". It developed out of a similar unit introduced on the Intel i860, and earlier the Intel i750 video pixel processor. MMX is a processor supplementary capability that is supported on IA-32 processors by Intel and other vendors as of 1997. AMD also added MMX instruction set in its K6 processor.

In computing, Streaming SIMD Extensions (SSE) is a single instruction, multiple data (SIMD) instruction set extension to the x86 architecture, designed by Intel and introduced in 1999 in their Pentium III series of central processing units (CPUs) shortly after the appearance of Advanced Micro Devices (AMD's) 3DNow!. SSE contains 70 new instructions, most of which work on single precision floating-point data. SIMD instructions can greatly increase performance when exactly the same operations are to be performed on multiple data objects. Typical applications are digital signal processing and graphics processing.

In computing, a vector processor or array processor is a central processing unit (CPU) that implements an instruction set where its instructions are designed to operate efficiently and effectively on large one-dimensional arrays of data called vectors. This is in contrast to scalar processors, whose instructions operate on single data items only, and in contrast to some of those same scalar processors having additional single instruction, multiple data (SIMD) or SIMD within a register (SWAR) Arithmetic Units. Vector processors can greatly improve performance on certain workloads, notably numerical simulation and similar tasks. Vector processing techniques also operate in video-game console hardware and in graphics accelerators.

SSE2 is one of the Intel SIMD processor supplementary instruction sets introduced by Intel with the initial version of the Pentium 4 in 2000. SSE2 instructions allow the use of XMM (SIMD) registers on x86 instruction set architecture processors. These registers can load up to 128 bits of data and perform instructions, such as vector addition and multiplication, simultaneously.

<span class="mw-page-title-main">Coprocessor</span> Type of computer processor

A coprocessor is a computer processor used to supplement the functions of the primary processor. Operations performed by the coprocessor may be floating-point arithmetic, graphics, signal processing, string processing, cryptography or I/O interfacing with peripheral devices. By offloading processor-intensive tasks from the main processor, coprocessors can accelerate system performance. Coprocessors allow a line of computers to be customized, so that customers who do not need the extra performance do not need to pay for it.

<span class="mw-page-title-main">Volume rendering</span> Representing a 3D-modeled object or dataset as a 2D projection

In scientific visualization and computer graphics, volume rendering is a set of techniques used to display a 2D projection of a 3D discretely sampled data set, typically a 3D scalar field.

<span class="mw-page-title-main">Emotion Engine</span> Central processing unit by Sony Computer Entertainment and Toshiba

The Emotion Engine is a central processing unit developed and manufactured by Sony Computer Entertainment and Toshiba for use in the PlayStation 2 video game console. It was also used in early PlayStation 3 models sold in Japan and North America to provide PlayStation 2 game support. Mass production of the Emotion Engine began in 1999 and ended in late 2012 with the discontinuation of the PlayStation 2.

In computer science, stream processing is a programming paradigm which views streams, or sequences of events in time, as the central input and output objects of computation. Stream processing encompasses dataflow programming, reactive programming, and distributed data processing. Stream processing systems aim to expose parallel processing for data streams and rely on streaming algorithms for efficient implementation. The software stack for these systems includes components such as programming models and query languages, for expressing computation; stream management systems, for distribution and scheduling; and hardware components for acceleration including floating-point units, graphics processing units, and field-programmable gate arrays.

Supplemental Streaming SIMD Extensions 3 is a SIMD instruction set created by Intel and is the fourth iteration of the SSE technology.

SSE4 is a SIMD CPU instruction set used in the Intel Core microarchitecture and AMD K10 (K8L). It was announced on September 27, 2006, at the Fall 2006 Intel Developer Forum, with vague details in a white paper; more precise details of 47 instructions became available at the Spring 2007 Intel Developer Forum in Beijing, in the presentation. SSE4 extended the SSE3 instruction set which was released in early 2004. All software using previous Intel SIMD instructions are compatible with modern microprocessors supporting SSE4 instructions. All existing software continues to run correctly without modification on microprocessors that incorporate SSE4, as well as in the presence of existing and new applications that incorporate SSE4.

<span class="mw-page-title-main">Larrabee (microarchitecture)</span> Canceled Intel GPGPU chip

Larrabee is the codename for a cancelled GPGPU chip that Intel was developing separately from its current line of integrated graphics accelerators. It is named after either Mount Larrabee or Larrabee State Park in Whatcom County, Washington, near the town of Bellingham. The chip was to be released in 2010 as the core of a consumer 3D graphics card, but these plans were cancelled due to delays and disappointing early performance figures. The project to produce a GPU retail product directly from the Larrabee research project was terminated in May 2010 and its technology was passed on to the Xeon Phi. The Intel MIC multiprocessor architecture announced in 2010 inherited many design elements from the Larrabee project, but does not function as a graphics processing unit; the product is intended as a co-processor for high performance computing.

ARB assembly language is a low-level shading language, which can be characterized as an assembly language. It was created by the OpenGL Architecture Review Board (ARB) to standardize GPU instructions controlling the hardware graphics pipeline.

<span class="mw-page-title-main">VideoCore</span> Low-power mobile multimedia processor

VideoCore is a series of low-power mobile multimedia processors originally developed by Alphamosaic Ltd and now owned by Broadcom. Alphamosaic marketed its first version as a two-dimensional DSP architecture that makes it flexible and efficient enough to decode a number of multimedia codecs in software while maintaining low power usage. The semiconductor intellectual property core has been found so far only on Broadcom SoCs.

Graphics Core Next (GCN) is the codename for a series of microarchitectures and an instruction set architecture that were developed by AMD for its GPUs as the successor to its TeraScale microarchitecture. The first product featuring GCN was launched on January 9, 2012.

Adaptive scalable texture compression (ASTC) is a lossy block-based texture compression algorithm developed by Jørn Nystad et al. of ARM Ltd. and AMD.

Intel Advisor is a design assistance and analysis tool for SIMD vectorization, threading, memory use, and GPU offload optimization. The tool supports C, C++, Data Parallel C++ (DPC++), Fortran and Python languages. It is available on Windows and Linux operating systems in form of Standalone GUI tool, Microsoft Visual Studio plug-in or command line interface. It supports OpenMP. Intel Advisor user interface is also available on macOS.

This is a glossary of terms relating to computer graphics.

In computing, an array of structures (AoS), structure of arrays (SoA) or array of structures of arrays (AoSoA) are contrasting ways to arrange a sequence of records in memory, with regard to interleaving, and are of interest in SIMD and SIMT programming.

Permute instructions, part of bit manipulation as well as vector processing, copy unaltered contents from a source array to a destination array, where the indices are specified by a second source array. The size (bitwidth) of the source elements is not restricted but remains the same as the destination size.

References

  1. "intel SSE intrinsics".
  2. "Putting It All Together: Anatomy of the XBox 360 Game Console (see VMX128 dot product)" (PDF).
  3. "intel SSE4 dot product".
  4. "VU0 user manual" (PDF).
  5. "feasibility study on using the playstation 2 for scientific computing" (PDF).
  6. "sce vectormath". GitHub . 19 April 2022.
  7. "GLM (vector maths library)".
  8. "Microsoft DirectX Maths".
  9. "GLSL data types & swizzling".
  10. "AMD graphics core next".