Intel Array Building Blocks

Developer(s): Intel
Initial release: May 17, 2010
Preview release: 1.0 beta 6 / August 25, 2011
Written in: C++
Operating system: Windows, Linux
Type: Library/framework
Website: software.intel.com/en-us/articles/intel-array-building-blocks

Intel Array Building Blocks (also known as ArBB) was a C++ library developed by Intel Corporation for exploiting the data-parallel portions of programs to take advantage of multi-core processors, graphics processing units and Intel Many Integrated Core Architecture processors. ArBB provided a generalized vector-parallel programming solution designed to avoid direct dependencies on particular low-level parallelism mechanisms or hardware architectures. It was oriented to applications that require data-intensive mathematical computations. By default, ArBB programs could not create data races or deadlocks.
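
As a rough illustration of the programming style, the following sketch is modeled on examples published with the ArBB beta releases; the header, the dense<f32> container type, and the bind/call entry points are recalled from those examples and may have varied between versions:

    #include <arbb.hpp>
    using namespace arbb;

    // Element-wise kernel over a dense (1-D) container. Operations on
    // whole containers are data-parallel by construction, which is how
    // ArBB ruled out data races and deadlocks by default.
    void scale(dense<f32>& out, dense<f32> in) {
        out = in * 2.0f + 1.0f;
    }

    int main() {
        float data[4] = {1.0f, 2.0f, 3.0f, 4.0f};
        dense<f32> a;
        bind(a, data, 4);      // wrap an existing C array without copying
        dense<f32> b;
        call(scale)(b, a);     // JIT-compile the kernel and execute it
        return 0;
    }

Because the kernel is expressed as whole-container math rather than an explicit loop, the runtime is free to vectorize it and spread it across cores.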

History

Intel Ct was a parallel programming model developed by Intel in 2007 for its future multi-core processors as part of the Tera-Scale research program.[1] In April 2009, Intel announced that "Ct [is] to appear in programmer tools by end of the year".[2] On August 19, 2009, Intel acquired RapidMind, a privately held company founded and headquartered in Waterloo, Ontario, Canada.[3] In September 2010, Intel Array Building Blocks (ArBB) was introduced as the result of merging the Intel Ct and RapidMind technologies.[4][5] The first version of ArBB supported Microsoft Windows and Linux, and the Intel, Microsoft Visual C++ and GCC C++ compilers.

In computing, a parallel programming model is an abstraction of parallel computer architecture with which it is convenient to express algorithms and their composition in programs. The value of a programming model can be judged on its generality (how well a range of different problems can be expressed for a variety of different architectures) and its performance (how efficiently the compiled programs can execute). The implementation of a parallel programming model can take the form of a library invoked from a sequential language, an extension to an existing language, or an entirely new language.

Intel Tera-Scale is a research program by Intel that focuses on processors and platforms exploiting the inherent parallelism of emerging visual-computing applications, which require teraFLOPS of parallel computing performance to process terabytes of data quickly. Parallelism, the concept of performing multiple tasks simultaneously, increases both the efficiency of central processing units (CPUs) and the amount of data processed each second; to exploit it, a CPU must handle multiple threads, which in turn requires multiple cores. Consumer-grade computers conventionally have 2–8 cores, and workstation-grade machines somewhat more, still far too few for teraFLOPS-level performance, so many more cores must be added. As a result of the program, two prototypes were manufactured to test the feasibility of having many more cores than the conventional amount, and they proved successful.

In October 2012 the project was discontinued in favour of other Intel projects: Cilk Plus and Threading Building Blocks.[6]

Threading Building Blocks (TBB) is a C++ template library developed by Intel for parallel programming on multi-core processors. Using TBB, a computation is broken down into tasks that can run in parallel. The library manages and schedules threads to execute these tasks.
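
A minimal sketch of that task-based style, using TBB's long-standing parallel_for interface (the vector size and loop body here are illustrative):

    #include <tbb/parallel_for.h>
    #include <vector>

    int main() {
        std::vector<float> v(1 << 20, 1.0f);
        // parallel_for splits the index range [0, v.size()) into chunks
        // and schedules them as tasks on TBB's worker threads.
        tbb::parallel_for(std::size_t(0), v.size(),
                          [&](std::size_t i) { v[i] = 2.0f * v[i] + 1.0f; });
        return 0;
    }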

See also

Cilk, Cilk++ and Cilk Plus are general-purpose programming languages designed for multithreaded parallel computing. They are based on the C and C++ programming languages, which they extend with constructs to express parallel loops and the fork–join idiom (a minimal sketch appears after this list).

Intel Array Visualizer (version 3.1) is scientific graphics software.

Intel Parallel Building Blocks (PBB) was a collection of three programming solutions designed for multithreaded parallel computing. PBB consisted of Cilk Plus, Threading Building Blocks (TBB) and Intel Array Building Blocks (ArBB).
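
As referenced from the Cilk entry above, a minimal fork–join sketch in Cilk Plus syntax; it requires a Cilk Plus-capable compiler (such as Intel's), and the function and values are illustrative:

    #include <cilk/cilk.h>
    #include <cstdio>

    // Naive Fibonacci: cilk_spawn forks a branch that may run on another
    // worker thread; cilk_sync joins before the results are combined.
    long fib(long n) {
        if (n < 2) return n;
        long x = cilk_spawn fib(n - 1);
        long y = fib(n - 2);
        cilk_sync;
        return x + y;
    }

    int main() {
        std::printf("fib(20) = %ld\n", fib(20));
        return 0;
    }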

Related Research Articles

Parallel computing

Parallel computing is a type of computation in which many calculations or the execution of processes are carried out simultaneously. Large problems can often be divided into smaller ones, which can then be solved at the same time. There are several different forms of parallel computing: bit-level, instruction-level, data, and task parallelism. Parallelism has long been employed in high-performance computing, but it has gained broader interest due to the physical constraints preventing frequency scaling. As power consumption by computers has become a concern in recent years, parallel computing has become the dominant paradigm in computer architecture, mainly in the form of multi-core processors.

OpenMP

OpenMP is an application programming interface (API) that supports multi-platform shared memory multiprocessing programming in C, C++, and Fortran, on most platforms, instruction set architectures and operating systems, including Solaris, AIX, HP-UX, Linux, macOS, and Windows. It consists of a set of compiler directives, library routines, and environment variables that influence run-time behavior.
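
For illustration, a minimal C++ example of the directive style; the loop and reduction are illustrative, but the parallel-for directive and reduction clause are standard OpenMP:

    #include <cstdio>

    int main() {
        double sum = 0.0;
        // The directive asks the compiler to run loop iterations across a
        // team of threads; reduction(+:sum) combines per-thread partials.
        #pragma omp parallel for reduction(+:sum)
        for (int i = 1; i <= 1000000; ++i)
            sum += 1.0 / i;
        std::printf("harmonic(1e6) = %f\n", sum);
        return 0;
    }

Built without OpenMP support (e.g. without -fopenmp), the pragma is ignored and the program runs sequentially, which is part of OpenMP's design.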

In computing, hardware acceleration is the use of computer hardware specially made to perform some functions more efficiently than is possible in software running on a general-purpose CPU. Any transformation of data or routine that can be computed purely in software on a generic CPU can also be computed purely in custom-made hardware, or in some mix of both. An operation can be computed faster in application-specific hardware designed or programmed for it than in software running on a general-purpose processor. Each approach has advantages and disadvantages. The implementation of computing tasks in hardware to decrease latency and increase throughput is known as hardware acceleration.

Automatic parallelization, also auto parallelization or autoparallelization (or simply parallelization, when automation is implied by context), refers to converting sequential code into multi-threaded or vectorized code in order to utilize multiple processors simultaneously in a shared-memory multiprocessor (SMP) machine. The goal of automatic parallelization is to relieve programmers from the tedious and error-prone manual parallelization process. Though the quality of automatic parallelization has improved in the past several decades, fully automatic parallelization of sequential programs by compilers remains a grand challenge due to its need for complex program analysis and the unknown factors during compilation.
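
The kind of loop such compilers target is one with no cross-iteration dependences; a hypothetical example:

    // Every iteration writes a distinct element and reads nothing written
    // by another iteration. With the (non-standard but widely supported)
    // __restrict qualifiers ruling out aliasing, an auto-parallelizing
    // compiler (e.g. icc with -parallel) may multithread or vectorize it.
    void saxpy(float* __restrict x, const float* __restrict y,
               float a, int n) {
        for (int i = 0; i < n; ++i)
            x[i] = a * x[i] + y[i];
    }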

Intel C++ Compiler, also known as icc or icl, is a group of C and C++ compilers from Intel available for Windows, Mac, Linux, FreeBSD and Intel-based Android devices.

Data parallelism

Data parallelism is parallelization across multiple processors in parallel computing environments. It focuses on distributing the data across different nodes, which operate on the data in parallel. It can be applied to regular data structures like arrays and matrices by working on each element in parallel. It contrasts with task parallelism, another form of parallelism.
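
A compact way to show the idea in standard C++ (C++17 parallel algorithms; the operation is illustrative):

    #include <algorithm>
    #include <execution>
    #include <vector>

    int main() {
        std::vector<int> v(1000000, 1);
        // The same operation is applied independently to every element;
        // the par policy lets the library split elements across threads.
        std::transform(std::execution::par, v.begin(), v.end(), v.begin(),
                       [](int x) { return 2 * x; });
        return 0;
    }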

Task parallelism is a form of parallelization of computer code across multiple processors in parallel computing environments. Task parallelism focuses on distributing tasks—concurrently performed by processes or threads—across different processors. In contrast to data parallelism which involves running the same task on different components of data, task parallelism is distinguished by running many different tasks at the same time on the same data. A common type of task parallelism is pipelining which consists of moving a single set of data through a series of separate tasks where each task can execute independently of the others.
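
By contrast, a task-parallel sketch runs different operations concurrently over the same data (standard C++; the two tasks are illustrative):

    #include <algorithm>
    #include <future>
    #include <numeric>
    #include <vector>

    int main() {
        std::vector<double> data(1000000, 0.5);
        // Two different tasks execute concurrently on the same
        // (read-only) data.
        auto sum = std::async(std::launch::async, [&] {
            return std::accumulate(data.begin(), data.end(), 0.0);
        });
        auto maxv = std::async(std::launch::async, [&] {
            return *std::max_element(data.begin(), data.end());
        });
        double s = sum.get(), m = maxv.get();  // join both tasks
        return (s > 0.0 && m > 0.0) ? 0 : 1;
    }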

RapidMind Inc. was a privately held company founded and headquartered in Waterloo, Ontario, Canada, acquired by Intel in 2009. It provided a software product that aims to make it simpler for software developers to target multi-core processors and accelerators such as graphics processing units (GPUs).

The Sieve C++ Parallel Programming System is a C++ compiler and parallel runtime designed and released by Codeplay that aims to simplify the parallelization of code so that it may run efficiently on multi-processor or multi-core systems. It is an alternative to other well-known parallelisation methods such as OpenMP, the RapidMind Development Platform and Threading Building Blocks (TBB).

Parallel Extensions

Parallel Extensions was the development name for a managed concurrency library developed by a collaboration between Microsoft Research and the CLR team at Microsoft. The library was released in version 4.0 of the .NET Framework. It is composed of two parts: Parallel LINQ (PLINQ) and Task Parallel Library (TPL). It also consists of a set of coordination data structures (CDS) – sets of data structures used to synchronize and co-ordinate the execution of concurrent tasks.

Intel Parallel Studio XE is a software development product developed by Intel that facilitates native code development on Windows, macOS and Linux in C++ and Fortran for parallel computing. Parallel programming enables software programs to take advantage of multi-core processors from Intel and other processor vendors.

Concurrent Collections is a programming model for software frameworks to expose parallelism in applications. The concept originated from tagged stream processing development with HP TStreams.

Software is said to exhibit scalable parallelism if it can make use of additional processors to solve larger problems, i.e., it is software for which Gustafson's law holds. Consider a program whose execution time is dominated by one or more loops, each of which updates every element of an array, for example, the following finite-difference heat-equation stencil calculation:

    for t := 0 to T do
        for i := 1 to N-1 do
            new(i) := (A(i-1) + A(i) + A(i) + A(i+1)) * .25  // explicit forward-difference with R = 0.25
        end
        for i := 1 to N-1 do
            A(i) := new(i)
        end
    end
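
A C++/OpenMP rendering of the same stencil (array names follow the pseudocode; the function name is illustrative) makes the scalability visible: each sweep over i has independent iterations, so a larger N admits more processors, which is what Gustafson's law describes:

    #include <vector>

    void heat_steps(std::vector<double>& A, int T) {
        const int n = static_cast<int>(A.size()) - 1;
        std::vector<double> next(A.size());
        for (int t = 0; t < T; ++t) {      // time steps stay sequential
            #pragma omp parallel for       // iterations over i are independent
            for (int i = 1; i < n; ++i)
                next[i] = (A[i-1] + 2.0 * A[i] + A[i+1]) * 0.25;
            #pragma omp parallel for
            for (int i = 1; i < n; ++i)
                A[i] = next[i];
        }
    }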

Outline of C++: an overview of and topical guide to C++.

Fork–join model

In parallel computing, the fork–join model is a way of setting up and executing parallel programs, such that execution branches off in parallel at designated points in the program, to "join" (merge) at a subsequent point and resume sequential execution. Parallel sections may fork recursively until a certain task granularity is reached. Fork–join can be considered a parallel design pattern. It was formulated as early as 1963.
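
A minimal fork–join sketch in standard C++ (the work function and split point are illustrative):

    #include <cstdio>
    #include <future>

    static long sum_range(long lo, long hi) {
        long s = 0;
        for (long i = lo; i < hi; ++i) s += i;
        return s;
    }

    int main() {
        // Fork: one branch runs asynchronously while the caller continues.
        auto left = std::async(std::launch::async, sum_range, 0L, 500000L);
        long right = sum_range(500000L, 1000000L);
        // Join: wait for the forked branch, then resume sequentially.
        long total = left.get() + right;
        std::printf("%ld\n", total);
        return 0;
    }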

References

  1. "The Many Flavors of Data Parallelism", Anwar Ghuloum (2007-09-06). Retrieved on 2010-09-14. Archived April 12, 2010, at the Wayback Machine .
  2. "Intel’s Ct to appear in programmer tools by end of the year", insideHPC (2009-04-08). Retrieved on 2010-09-14.
  3. "RapidMind + Intel", Intel Blog (2009-08-19). Retrieved on 2010-09-14.
  4. "Intel Flexes Parallel Programming Muscles" Archived 2010-09-06 at the Wayback Machine ., HPCwire (2010-09-02). Retrieved on 2010-09-14.
  5. "Parallel Studio 2011: Now We Know What Happened to Ct, Cilk++, and RapidMind", Dr. Dobbs Journal (2012-08-06). Retrieved on 2010-09-14.
  6. "Intel® Array Building Blocks" Intel Article. Retrieved on 2013-09-04.