RaftLib

RaftLib
Original author(s)	Jonathan Beard
Initial release	late 2014
Stable release	0.9 / January 2020
Preview release	1.0a / May 18, 2020;2 years ago
Written in	C++
Operating system	Linux, macOS, Windows
Type	Data analytics, HPC, Signal Processing, Machine Learning, Algorithms, Big Data
License	Apache License 2.0
Website	www.raftlib.io

Last updated July 05, 2022

RaftLib^[1] is a portable parallel processing system that aims to provide extreme performance while increasing programmer productivity. It enables a programmer to assemble a massively parallel program (both local and distributed) using simple iostream-like operators. RaftLib handles threading, memory allocation, memory placement, and auto-parallelization of compute kernels.^[2] It enables applications to be constructed from chains of compute kernels forming a task and pipeline parallel compute graph. Programs are authored in C++ (although other language bindings are planned).

Example

Here is a Hello World example for demonstration purposes:^[3]

#include<raft>#include<raftio>#include<cstdlib>#include<string>classhi:publicraft::kernel{public:hi():raft::kernel(){output.addPort<std::string>("0");}virtualraft::kstatusrun(){output["0"].push(std::string("Hello World\n"));return(raft::stop);}};intmain(intargc,char**argv){/** instantiate print kernel **/raft::print<std::string>p;/** instantiate hello world kernel **/hihello;/** make a map object **/raft::mapm;/** add kernels to map, both hello and p are executed concurrently **/m+=hello>>p;/** execute the map **/m.exe();return(EXIT_SUCCESS);}

Related Research Articles

A "Hello, world!" program is generally a computer program that outputs or displays the message "Hello, world!". A small piece of code in most general-purpose programming languages, this program is used to illustrate a language's basic syntax. "Hello, world!" programs are often the first a student learns to write in a given language, and they can also be used as a sanity check to ensure computer software intended to compile or run source code is correctly installed, and that its operator understands how to use it.

In computer programming, lazy initialization is the tactic of delaying the creation of an object, the calculation of a value, or some other expensive process until the first time it is needed. It is a kind of lazy evaluation that refers specifically to the instantiation of objects or other resources.

x86 assembly language is the name for the family of assembly languages which provide some level of backward compatibility with CPUs back to the Intel 8008 microprocessor, which was launched in April 1972. It is used to produce object code for the x86 class of processors.

Message Passing Interface (MPI) is a standardized and portable message-passing standard designed to function on parallel computing architectures. The MPI standard defines the syntax and semantics of library routines that are useful to a wide range of users writing portable message-passing programs in C, C++, and Fortran. There are several open-source MPI implementations, which fostered the development of a parallel software industry, and encouraged development of portable and scalable large-scale parallel applications.

D, also known as dlang, is a multi-paradigm system programming language created by Walter Bright at Digital Mars and released in 2001. Andrei Alexandrescu joined the design and development effort in 2007. Though it originated as a re-engineering of C++, D is a profoundly different language —features of D can be considered streamlined and expanded-upon ideas from C++, however D also draws inspiration from other high-level programming languages, notably Java, Python, Ruby, C#, and Eiffel.

In computing, particularly in the context of the Unix operating system and its workalikes, fork is an operation whereby a process creates a copy of itself. It is an interface which is required for compliance with the POSIX and Single UNIX Specification standards. It is usually implemented as a C Standard Library (libc) wrapper to the fork, clone, or other system calls of the kernel. Fork is the primary method of process creation on Unix-like operating systems.

OpenMP is an application programming interface (API) that supports multi-platform shared-memory multiprocessing programming in C, C++, and Fortran, on many platforms, instruction-set architectures and operating systems, including Solaris, AIX, FreeBSD, HP-UX, Linux, macOS, and Windows. It consists of a set of compiler directives, library routines, and environment variables that influence run-time behavior.

Copy-on-write (COW), sometimes referred to as implicit sharing or shadowing, is a resource-management technique used in computer programming to efficiently implement a "duplicate" or "copy" operation on modifiable resources. If a resource is duplicated but not modified, it is not necessary to create a new resource; the resource can be shared between the copy and the original. Modifications must still create a copy, hence the technique: the copy operation is deferred until the first write. By sharing resources in this way, it is possible to significantly reduce the resource consumption of unmodified copies, while adding a small overhead to resource-modifying operations.

A function pointer, also called a subroutine pointer or procedure pointer, is a pointer that points to a function. As opposed to referencing a data value, a function pointer points to executable code within memory. Dereferencing the function pointer yields the referenced function, which can be invoked and passed arguments just as in a normal function call. Such an invocation is also known as an "indirect" call, because the function is being invoked indirectly through a variable instead of directly through a fixed identifier or address.

General-purpose computing on graphics processing units is the use of a graphics processing unit (GPU), which typically handles computation only for computer graphics, to perform computation in applications traditionally handled by the central processing unit (CPU). The use of multiple video cards in one computer, or large numbers of graphics chips, further parallelizes the already parallel nature of graphics processing.

In computer programming, an entry point is the place in a program where the execution of a program begins, and where the program has access to command line arguments.

Modern C++ Design: Generic Programming and Design Patterns Applied is a book written by Andrei Alexandrescu, published in 2001 by Addison-Wesley. It has been regarded as "one of the most important C++ books" by Scott Meyers.

In computer science, stream processing is a programming paradigm which views data streams, or sequences of events in time, as the central input and output objects of computation. Stream processing encompasses dataflow programming, reactive programming, and distributed data processing. Stream processing systems aim to expose parallel processing for data streams and rely on streaming algorithms for efficient implementation. The software stack for these systems includes components such as programming models and query languages, for expressing computation; stream management systems, for distribution and scheduling; and hardware components for acceleration including floating point units, graphics processing units, and field-programmable gate arrays.

A class in C++ is a user-defined type or data structure declared with keyword class that has data and functions as its members whose access is governed by the three access specifiers private, protected or public. By default access to members of a C++ class is private. The private members are not accessible outside the class; they can be accessed only through methods of the class. The public members form an interface to the class and are accessible outside the class.

In computer programming, a pure function is a function that has the following properties:

the function return values are identical for identical arguments, and
the function application has no side effects.

Charm++ is a parallel object-oriented programming paradigm based on C++ and developed in the Parallel Programming Laboratory at the University of Illinois at Urbana–Champaign. Charm++ is designed with the goal of enhancing programmer productivity by providing a high-level abstraction of a parallel program while at the same time delivering good performance on a wide variety of underlying hardware platforms. Programs written in Charm++ are decomposed into a number of cooperating message-driven objects called chares. When a programmer invokes a method on an object, the Charm++ runtime system sends a message to the invoked object, which may reside on the local processor or on a remote processor in a parallel computation. This message triggers the execution of code within the chare to handle the message asynchronously.

CPU time is the amount of time for which a central processing unit (CPU) was used for processing instructions of a computer program or operating system, as opposed to elapsed time, which includes for example, waiting for input/output (I/O) operations or entering low-power (idle) mode. The CPU time is measured in clock ticks or seconds. Often, it is useful to measure CPU time as a percentage of the CPU's capacity, which is called the CPU usage. CPU time and CPU usage have two main uses.

In computer programming, variadic templates are templates that take a variable number of arguments.

In software engineering, the module pattern is a design pattern used to implement the concept of software modules, defined by modular programming, in a programming language with incomplete direct support for the concept.

In the C++ programming language, input/output library refers to a family of class templates and supporting functions in the C++ Standard Library that implement stream-based input/output capabilities. It is an object-oriented alternative to C's FILE-based streams from the C standard library.

References

↑ "RaftLib: A C++ Template Library for High Performance Stream Parallel Processing" (PDF). Retrieved 2016-08-10.
↑ "Online Modeling and Tuning of Parallel Stream Processing Systems" (PDF). Retrieved 2016-08-10.
↑ "Hello World Example" . Retrieved 2016-08-10.

External links

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[1] "RaftLib: A C++ Template Library for High Performance Stream Parallel Processing" (PDF). Retrieved 2016-08-10.

[2] "Online Modeling and Tuning of Parallel Stream Processing Systems" (PDF). Retrieved 2016-08-10.

[3] "Hello World Example" . Retrieved 2016-08-10.

[1]

[2]

[3]

v t e Parallel computing
General	Distributed computing Parallel computing Massively parallel Cloud computing High-performance computing Multiprocessing Manycore processor GPGPU Computer network Systolic array
Levels	Bit Instruction Thread Task Data Memory Loop Pipeline
Multithreading	Temporal Simultaneous (SMT) Speculative (SpMT) Preemptive Cooperative Clustered multi-thread (CMT) Hardware scout
Theory	PRAM model PEM model Analysis of parallel algorithms Amdahl's law Gustafson's law Cost efficiency Karp–Flatt metric Slowdown Speedup
Elements	Process Thread Fiber Instruction window Array data structure
Coordination	Multiprocessing Memory coherence Cache coherence Cache invalidation Barrier Synchronization Application checkpointing
Programming	Stream processing Dataflow programming Models Implicit parallelism Explicit parallelism Concurrency Non-blocking algorithm
Hardware	Flynn's taxonomy SISD SIMD Array processing (SIMT) Pipelined processing Associative processing MISD MIMD Dataflow architecture Pipelined processor Superscalar processor Vector processor Multiprocessor symmetric asymmetric Memory shared distributed distributed shared UMA NUMA COMA Massively parallel computer Computer cluster Grid computer Hardware acceleration
APIs	Ateji PX Boost Chapel HPX Charm++ Cilk Coarray Fortran CUDA Dryad C++ AMP Global Arrays GPUOpen MPI OpenMP OpenCL OpenHMPP OpenACC Parallel Extensions PVM pthreads RaftLib ROCm UPC TBB ZPL
Problems	Automatic parallelization Deadlock Deterministic algorithm Embarrassingly parallel Parallel slowdown Race condition Software lockout Scalability Starvation
Category: Parallel computing

RaftLib

Contents

Example

Related Research Articles

References

External links