Framewave

Last updated
Framewave
Developer(s) Advanced Micro Devices (AMD)
Initial releaseSeptember 19, 2007;16 years ago (2007-09-19)
Final release
1.3.1 / July 9, 2009;14 years ago (2009-07-09)
Repository sourceforge.net/projects/framewave
Written in C, C++
Operating system Linux, macOS, Solaris, Windows
Platform IA-32, x86-64
Available inEnglish
Type Library
License Apache 2.0
Website framewave.sourceforge.net

Framewave (formerly AMD Performance Library (APL)) is computer software, a high-performance optimized programming library, consisting of low level application programming interfaces (APIs) for image processing, signal processing, JPEG, and video functions. These APIs are programmed with task level parallelization (multi-threading) and instruction-level parallelism single instruction, multiple data (SIMD) for maximum performance on multi-core processors from Advanced Micro Devices (AMD).

Contents

Framewave is free and open-source software released under the Apache License version 2.0, which is compatible with the GNU General Public License 3.0. [1]

Overview

The AMD Performance Library was developed by Advanced Micro Devices (AMD) as a collection of popular software routines designed to accelerate application development, debugging, and optimization on x86 class processors. It includes simple arithmetic routines, and more complex functions for applications such as image and signal processing. APL is available as a static library for 32- or 64-bit versions of GNU Compiler Collection (GCC) 4.1 and Microsoft Visual Studio 2005, and as a 32- or 64-bit dynamic library for the operating systems Linux, Solaris, and Windows.

In 2008, AMD deprecated the APL library in favor of an open-source derivative named Framewave. [1] [2] [3]

Framewave is available as 32- and 64-bit static libraries for GCC 4.3 and Microsoft Visual Studio 2008, and as 32- and 64-bit dynamic libraries for the operating systems Linux, macOS, Solaris, and Windows. Relative to Framewave 1.0, noticeable performance gains occurred in several APIs, including JPEG.

Features

Framewave consists of the following main components: [4]

APL 1.1

Released on 2007-09-19, APL 1.1 added these feature enhancements: [5]

See also

Related Research Articles

x86 Family of instruction set architectures

x86 is a family of complex instruction set computer (CISC) instruction set architectures initially developed by Intel based on the Intel 8086 microprocessor and its 8088 variant. The 8086 was introduced in 1978 as a fully 16-bit extension of Intel's 8-bit 8080 microprocessor, with memory segmentation as a solution for addressing more memory than can be covered by a plain 16-bit address. The term "x86" came into being because the names of several successors to Intel's 8086 processor end in "86", including the 80186, 80286, 80386 and 80486 processors. Colloquially, their names were "186", "286", "386" and "486".

<span class="mw-page-title-main">Single instruction, multiple data</span> Type of parallel processing

Single instruction, multiple data (SIMD) is a type of parallel processing in Flynn's taxonomy. SIMD can be internal and it can be directly accessible through an instruction set architecture (ISA), but it should not be confused with an ISA. SIMD describes computers with multiple processing elements that perform the same operation on multiple data points simultaneously.

SSE2 is one of the Intel SIMD processor supplementary instruction sets introduced by Intel with the initial version of the Pentium 4 in 2000. SSE2 instructions allow the use of XMM (SIMD) registers on x86 instruction set architecture processors. These register can load up to 128 bits of data and perform instructions, such as vector addition and multiplication, simultaneously.

<span class="mw-page-title-main">MinGW</span> Free and open-source software for developing applications in Microsoft Windows

MinGW, formerly mingw32, is a free and open source software development environment to create Microsoft Windows applications.

<span class="mw-page-title-main">Graphics processing unit</span> Specialized electronic circuit; graphics accelerator

A graphics processing unit (GPU) is a specialized electronic circuit initially designed to accelerate computer graphics and image processing. After their initial design, GPUs were found to be useful for non-graphic calculations involving embarrassingly parallel problems due to their parallel structure. Other non-graphical uses include the training of neural networks and cryptocurrency mining.

Open64 is a free, open-source, optimizing compiler for the Itanium and x86-64 microprocessor architectures. It derives from the SGI compilers for the MIPS R10000 processor, called MIPSPro. It was initially released in 2000 as GNU GPL software under the name Pro64. The following year, University of Delaware adopted the project and renamed the compiler to Open64. It now mostly serves as a research platform for compiler and computer architecture research groups. Open64 supports Fortran 77/95 and C/C++, as well as the shared memory programming model OpenMP. It can conduct high-quality interprocedural analysis, data-flow analysis, data dependence analysis, and array region analysis. Development has ceased, although other projects can use the project's source.

In computer software, in compiler theory, an intrinsic function, also called built-in function or builtin function, is a function (subroutine) available for use in a given programming language whose implementation is handled specially by the compiler. Typically, it may substitute a sequence of automatically generated instructions for the original function call, similar to an inline function. Unlike an inline function, the compiler has an intimate knowledge of an intrinsic function and can thus better integrate and optimize it for a given situation.

AMD Core Math Library (ACML) is an end-of-life software development library released by AMD, replaced by many open source libraries, including AMD libm 4.0. This library provides mathematical routines optimized for AMD processors.

<span class="mw-page-title-main">CUDA</span> Parallel computing platform and programming model

Compute Unified Device Architecture (CUDA) is a proprietary parallel computing platform and application programming interface (API) that allows software to use certain types of graphics processing units (GPUs) for accelerated general-purpose processing, an approach called general-purpose computing on GPUs (GPGPU). CUDA API and its runtime: The CUDA API is an extension of the C programming language that adds the ability to specify thread-level parallelism in C and also to specify GPU device specific operations (like moving data between the CPU and the GPU). CUDA is a software layer that gives direct access to the GPU's virtual instruction set and parallel computational elements for the execution of compute kernels. In addition to drivers and runtime kernels, the CUDA platform includes compilers, libraries and developer tools to help programmers accelerate their applications.

LatticeMico32 is a 32-bit microprocessor reduced instruction set computer (RISC) soft core from Lattice Semiconductor optimized for field-programmable gate arrays (FPGAs). It uses a Harvard architecture, which means the instruction and data buses are separate. Bus arbitration logic can be used to combine the two buses, if desired.

The PowerPC e500 is a 32-bit microprocessor core from Freescale Semiconductor. The core is compatible with the older PowerPC Book E specification as well as the Power ISA v.2.03. It has a dual issue, seven-stage pipeline with FPUs, 32/32 KiB data and instruction L1 caches and 256, 512 or 1024 KiB L2 frontside cache. Speeds range from 533 MHz up to 1.5 GHz, and the core is designed to be highly configurable and meet the specific needs of embedded applications with features like multi-core operation interface for auxiliary application processing units (APU).

Intel Integrated Performance Primitives is an extensive library of ready-to-use, domain-specific functions that are highly optimized for diverse Intel architectures. Its royalty-free APIs help developers: Take advantage of Single Instruction, Multiple Data (SIMD) instructions.

AMD FireStream was AMD's brand name for their Radeon-based product line targeting stream processing and/or GPGPU in supercomputers. Originally developed by ATI Technologies around the Radeon X1900 XTX in 2006, the product line was previously branded as both ATI FireSTREAM and AMD Stream Processor. The AMD FireStream can also be used as a floating-point co-processor for offloading CPU calculations, which is part of the Torrenza initiative. The FireStream line has been discontinued since 2012, when GPGPU workloads were entirely folded into the AMD FirePro line.

Oracle Developer Studio, formerly named Oracle Solaris Studio, Sun Studio, Sun WorkShop, Forte Developer, and SunPro Compilers, is the Oracle Corporation's flagship software development product for the Solaris and Linux operating systems. It includes optimizing C, C++, and Fortran compilers, libraries, and performance analysis and debugging tools, for Solaris on SPARC and x86 platforms, and Linux on x86/x64 platforms, including multi-core systems.

mediaLib is a portable low level library for accelerating multimedia applications, with interfaces in C. It was developed by Sun Microsystems and open-sourced under the CDDL license as part of the OpenSolaris project.

Advanced Vector Extensions are SIMD extensions to the x86 instruction set architecture for microprocessors from Intel and Advanced Micro Devices (AMD). They were proposed by Intel in March 2008 and first supported by Intel with the Sandy Bridge processor shipping in Q1 2011 and later by AMD with the Bulldozer processor shipping in Q3 2011. AVX provides new features, new instructions, and a new coding scheme.

The FMA instruction set is an extension to the 128 and 256-bit Streaming SIMD Extensions instructions in the x86 microprocessor instruction set to perform fused multiply–add (FMA) operations. There are two variants:

An AES instruction set is a set of instructions that are specifically designed to perform AES encryption and decryption operations efficiently. These instructions are typically found in modern processors and can greatly accelerate AES operations compared to software implementations. An AES instruction set includes instructions for key expansion, encryption, and decryption using various key sizes.

AVX-512 are 512-bit extensions to the 256-bit Advanced Vector Extensions SIMD instructions for x86 instruction set architecture (ISA) proposed by Intel in July 2013, and first implemented in the 2016 Intel Xeon Phi x200, and then later in a number of AMD and other Intel CPUs. AVX-512 consists of multiple extensions that may be implemented independently. This policy is a departure from the historical requirement of implementing the entire instruction block. Only the core extension AVX-512F is required by all AVX-512 implementations.

Bit manipulation instructions sets are extensions to the x86 instruction set architecture for microprocessors from Intel and AMD. The purpose of these instruction sets is to improve the speed of bit manipulation. All the instructions in these sets are non-SIMD and operate only on general-purpose registers.

References

  1. 1 2 "AMD Accelerates Application Development with Inaugural Release of Open Source Performance Library". AMD. February 20, 2008. Retrieved 2008-02-20.
  2. "AMD Performance Library (APL)". AMD. February 20, 2008. Archived from the original on January 24, 2008. Retrieved 2008-02-20.
  3. "The Framewave Project". AMD. February 20, 2008. Archived from the original on 2008-02-25. Retrieved 2008-02-20.
  4. "AMD Performance Library (APL)". AMD. February 1, 2007. Archived from the original on 2007-06-02. Retrieved 2007-06-05.
  5. "APL Product Features". AMD. September 19, 2007. Archived from the original on 2007-10-15. Retrieved 2007-09-19.