OpenHMPP (HMPP [1] for Hybrid Multicore Parallel Programming) is a programming standard for heterogeneous computing. Based on a set of compiler directives, the standard is a programming model designed to handle hardware accelerators without the complexity associated with GPU programming. The directive-based approach was chosen because directives enable a loose relationship between application code and the use of a hardware accelerator (HWA).
The OpenHMPP directive-based programming model offers a syntax to offload computations on hardware accelerators and to optimize data movement to/from the hardware memory.
The model is based on work initiated by CAPS (Compiler and Architecture for Embedded and Superscalar Processors), a joint project of INRIA, CNRS, the University of Rennes 1 and the INSA of Rennes.
OpenHMPP is based on the concept of codelets, functions that can be remotely executed on HWAs.
A codelet has the following properties:
These properties ensure that a codelet RPC can be remotely executed by a HWA. This RPC and its associated data transfers can be asynchronous.
HMPP provides synchronous and asynchronous RPC. Implementation of asynchronous operation is hardware dependent.
HMPP considers two address spaces: the host processor one and the HWA memory.
The OpenHMPP directives may be seen as “meta-information” added to the application source code. This meta-information is safe, i.e. it does not change the original code behavior. The directives address the remote execution (RPC) of a function as well as the transfers of data to/from the HWA memory.
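Because the directives are ordinary pragmas, a C compiler without HMPP support is expected to ignore them and build a program that runs entirely on the host. The sketch below illustrates this with a hypothetical codelet labelled `axpy`; the function names, the label and the wrapper are assumptions made for illustration, and the directive parameters follow the syntax described later in this article.

```c
/* Hypothetical codelet: y[i] = a * x[i] + y[i].
   The pragma is meta-information only: a compiler without HMPP support
   ignores it and compiles scale_add as an ordinary host function. */
#pragma hmpp axpy codelet, args[y].io=inout, target=CUDA
static void scale_add(int n, float a, float x[n], float y[n])
{
    int i;
    for (i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}

/* Hypothetical wrapper marking the call site of the codelet. */
void run_axpy(int n, float a, float *x, float *y)
{
#pragma hmpp axpy callsite, args[x;y].size={n}
    scale_add(n, a, x, y);
}
```

Calling `run_axpy(3, 2.0f, x, y)` from host code gives the same values in `y` whether or not the directives are honoured; only the place where the loop runs is meant to differ.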
The table below introduces the OpenHMPP directives. OpenHMPP directives address different needs: some of them are dedicated to declarations and others are dedicated to the management of the execution.
| | Control flow instructions | Directives for data management |
|---|---|---|
| Declarations | codelet, group | resident, map, mapbyname |
| Operational directives | callsite, synchronize, region | allocate, release, advancedload, delegatedstore |
One of the fundamental points of the HMPP approach is the concept of directives and their associated labels which makes it possible to expose a coherent structure on a whole set of directives disseminated in an application.
There are two kinds of labels:
In order to simplify the notations, regular expressions will be used to describe the syntax of the HMPP directives.
The color convention below is used for the description of syntax directives:
The general syntax of OpenHMPP directives is:
#pragma hmpp <grp_label> [codelet_label]? directive_type [,directive_parameters]* [&]
!$hmpp <grp_label> [codelet_label]? directive_type [,directive_parameters]* [&]
Where:
<grp_label>
: is a unique identifier naming a group of codelets. If no groups are defined in the application, this label can be omitted. A legal label name must follow this grammar: [a-z,A-Z,_][a-z,A-Z,0-9,_]*. Note that the “< >” characters belong to the syntax and are mandatory for this kind of label.
codelet_label
: is a unique identifier naming a codelet. A legal label name must follow this grammar: [a-z,A-Z,_][a-z,A-Z,0-9,_]*
directive
: is the name of the directive;
directive_parameters
: designates some parameters associated with the directive. These parameters may be of different kinds and specify either arguments given to the directive or a mode of execution (asynchronous versus synchronous, for example);
[&]
: is a character used to continue the directive on the next line (same for C and FORTRAN).

The parameters associated with a directive may be of different types. Below are the directive parameters defined in OpenHMPP:
version = major.minor[.micro]
: specifies the version of the HMPP directives to be considered by the preprocessor.
args[arg_items].size={dimsize[,dimsize]*}
: specifies the size of a non-scalar parameter (an array).
args[arg_items].io=[in|out|inout]
: indicates that the specified function arguments are either input, output or both. By default, unqualified arguments are inputs.
cond = "expr"
: specifies an execution condition as a boolean C or Fortran expression that needs to be true in order to start the execution of the group or codelets.
target=target_name[:target_name]*
: specifies which targets to try to use in the given order.
asynchronous
: specifies that the codelet execution is not blocking (default is synchronous).
args[<arg_items>].advancedload=true
: indicates that the specified parameters are preloaded. Only in or inout parameters can be preloaded.
args[arg_items].noupdate=true
: specifies that the data is already available on the HWA, so no transfer is done for the considered argument.
args[<arg_items>].addr="<expr>"
: <expr> is an expression that gives the address of the data to upload.
args[<arg_items>].const=true
: indicates that the argument is to be uploaded only once.

A codelet directive declares a computation to be remotely executed on a hardware accelerator. For the codelet directive:
The syntax of the directive is:
#pragma hmpp <grp_label> codelet_label codelet [, version = major.minor[.micro]?]? [, args[arg_items].io=[in|out|inout]]* [, args[arg_items].size={dimsize[,dimsize]*}]* [, args[arg_items].const=true]* [, cond = "expr"] [, target=target_name[:target_name]*]
More than one codelet directive can be added to a function in order to specify different uses or different execution contexts. However, there can be only one codelet directive for a given call site label.
The callsite directive specifies how to use a codelet at a given point in the program.
The syntax of the directive is:
#pragma hmpp <grp_label> codelet_label callsite [, asynchronous]? [, args[arg_items].size={dimsize[,dimsize]*}]* [, args[arg_items].advancedload=[true|false]]* [, args[arg_items].addr="expr"]* [, args[arg_items].noupdate=true]*
An example is shown here:

    /* declaration of the codelet */
    #pragma hmpp simple1 codelet, args[outv].io=inout, target=CUDA
    static void matvec(int sn, int sm, float inv[sm], float inm[sn][sm], float *outv)
    {
        int i, j;
        /* outv[i] += sum over j of inv[j] * inm[i][j] */
        for (i = 0; i < sn; i++) {
            float temp = outv[i];
            for (j = 0; j < sm; j++) {
                temp += inv[j] * inm[i][j];
            }
            outv[i] = temp;
        }
    }

    int main(int argc, char **argv)
    {
        int n;
        ........
        /* codelet use */
    #pragma hmpp simple1 callsite, args[outv].size={n}
        matvec(n, m, myinc, inm, myoutv);
        ........
    }
In some cases, a specific management of the data throughout the application is required (CPU/GPU data movements optimization, shared variables...).
The group directive allows the declaration of a group of codelets. The parameters defined in this directive are applied to all codelets belonging to the group. The syntax of the directive is:
#pragma hmpp <grp_label> group [, version = <major>.<minor>[.<micro>]?]? [, target = target_name[:target_name]*]? [, cond = "expr"]?
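A minimal sketch of a group declaration, assuming two hypothetical codelets labelled `vadd` and `vmul` in a group labelled `<vecops>`. The target is set once on the group and is meant to apply to both codelets. All names are illustrative, and a real program would typically also need the allocate and release directives described below; a compiler without HMPP support ignores the pragmas and runs both functions on the host.

```c
/* Group-level parameters: intended to apply to every codelet in <vecops>. */
#pragma hmpp <vecops> group, target=CUDA

/* Hypothetical codelet: element-wise sum. */
#pragma hmpp <vecops> vadd codelet, args[c].io=out
static void vec_add(int n, float a[n], float b[n], float c[n])
{
    int i;
    for (i = 0; i < n; i++) {
        c[i] = a[i] + b[i];
    }
}

/* Hypothetical codelet: element-wise product. */
#pragma hmpp <vecops> vmul codelet, args[c].io=out
static void vec_mul(int n, float a[n], float b[n], float c[n])
{
    int i;
    for (i = 0; i < n; i++) {
        c[i] = a[i] * b[i];
    }
}

/* Hypothetical wrapper with one call site per codelet. */
void add_then_mul(int n, float *a, float *b, float *sum, float *prod)
{
#pragma hmpp <vecops> vadd callsite, args[a;b;c].size={n}
    vec_add(n, a, b, sum);
#pragma hmpp <vecops> vmul callsite, args[a;b;c].size={n}
    vec_mul(n, a, b, prod);
}
```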
When using a HWA, the main bottleneck is often the data transfers between the HWA and the main processor.
To limit the communication overhead, data transfers can be overlapped with successive executions of the same codelet by using the asynchronous property of the HWA.
The allocate directive locks the HWA and allocates the needed amount of memory.
#pragma hmpp <grp_label> allocate [,args[arg_items].size={dimsize[,dimsize]*}]*
The release directive specifies when to release the HWA for a group or a stand-alone codelet.
#pragma hmpp <grp_label> release
The advancedload directive prefetches data before the remote execution of the codelet.
#pragma hmpp <grp_label> [codelet_label]? advancedload,args[arg_items] [,args[arg_items].size={dimsize[,dimsize]*}]* [,args[arg_items].addr="expr"]* [,args[arg_items].section={[subscript_triplet,]+}]* [,asynchronous]
The delegatedstore directive is a synchronization barrier that waits for an asynchronous codelet execution to complete and then downloads the results.
#pragma hmpp <grp_label> [codelet_label]? delegatedstore,args[arg_items] [,args[arg_items].addr="expr"]* [,args[arg_items].section={[subscript_triplet,]+}]*
The synchronize directive waits for the completion of an asynchronous callsite execution. For the synchronize directive, the codelet label is always mandatory and the group label is required if the codelet belongs to a group.
#pragma hmpp <grp_label> codelet_label synchronize
In the following example, the device initialization, memory allocation and upload of the input data are done only once, outside the loop, rather than in each iteration.
The synchronize directive waits for the asynchronous execution of the codelet to complete before another iteration is launched. Finally, the delegatedstore directive outside the loop downloads the sgemm result.
    int main(int argc, char **argv)
    {
    #pragma hmpp sgemm allocate, args[vin1;vin2;vout].size={size,size}
    #pragma hmpp sgemm advancedload, args[vin1;vin2;vout], args[m,n,k,alpha,beta]

        for (j = 0; j < 2; j++) {
    #pragma hmpp sgemm callsite, asynchronous, args[vin1;vin2;vout].advancedload=true, args[m,n,k,alpha,beta].advancedload=true
            sgemm(size, size, size, alpha, vin1, vin2, beta, vout);
    #pragma hmpp sgemm synchronize
        }

    #pragma hmpp sgemm delegatedstore, args[vout]
    #pragma hmpp sgemm release
    }
These directives map together all the arguments that share the given name across the whole group.
The types and dimensions of all mapped arguments must be identical.
The map directive maps several arguments on the device.
#pragma hmpp <grp_label> map, args[arg_items]
The mapbyname directive is similar to the map directive, except that the arguments to be mapped are specified directly by their name. The mapbyname directive is equivalent to multiple map directives.
#pragma hmpp <grp_label> mapbyname [,variableName]+
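As a hedged sketch of mapping by name, assume a hypothetical group `<pipe>` with two codelets that both take an argument named `v` of the same type and dimension. The `mapbyname` line is intended to map both arguments onto the same device data, so the output of the first codelet can serve as input to the second. The labels, function names and directive placement are assumptions for illustration; without HMPP support the pragmas are ignored and the code runs on the host.

```c
#pragma hmpp <pipe> group, target=CUDA
/* Intended to map every argument named v in the group together. */
#pragma hmpp <pipe> mapbyname, v

/* Hypothetical codelet: fill v with a constant. */
#pragma hmpp <pipe> init codelet, args[v].io=out
static void fill(int n, float v[n], float value)
{
    int i;
    for (i = 0; i < n; i++) {
        v[i] = value;
    }
}

/* Hypothetical codelet: double every element of v in place. */
#pragma hmpp <pipe> twice codelet, args[v].io=inout
static void double_all(int n, float v[n])
{
    int i;
    for (i = 0; i < n; i++) {
        v[i] = 2.0f * v[i];
    }
}

/* Hypothetical wrapper chaining the two codelets on the same array. */
void fill_and_double(int n, float *v, float value)
{
#pragma hmpp <pipe> init callsite, args[v].size={n}
    fill(n, v, value);
#pragma hmpp <pipe> twice callsite, args[v].size={n}
    double_all(n, v);
}
```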
The resident directive declares some variables as global within a group. Those variables can then be directly accessed from any codelet belonging to the group. This directive applies to the declaration statement just following it in the source code.
The syntax of this directive is:
#pragma hmpp <grp_label> resident [, args[::var_name].io=[in|out|inout]]* [, args[::var_name].size={dimsize[,dimsize]*}]* [, args[::var_name].addr="expr"]* [, args[::var_name].const=true]*
The notation ::var_name, with the prefix ::, indicates an application's variable declared as resident.
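The following sketch shows how a resident declaration might look, assuming a hypothetical group `<filt>` and a global array `weights` that a codelet reads directly instead of receiving it as an argument. The names, the filter itself and the exact placement of the directives are assumptions for illustration; a compiler without HMPP support ignores the pragmas and treats `weights` as an ordinary global.

```c
#pragma hmpp <filt> group, target=CUDA

/* The resident directive applies to the declaration that follows it. */
#pragma hmpp <filt> resident, args[::weights].io=in, args[::weights].size={3}
float weights[3] = {0.25f, 0.5f, 0.25f};

/* Hypothetical codelet: 3-point weighted average using the resident
   array. Border elements are copied unchanged. Assumes n >= 1. */
#pragma hmpp <filt> smooth codelet, args[out].io=out
static void smooth3(int n, float in[n], float out[n])
{
    int i;
    out[0] = in[0];
    out[n - 1] = in[n - 1];
    for (i = 1; i < n - 1; i++) {
        out[i] = weights[0] * in[i - 1]
               + weights[1] * in[i]
               + weights[2] * in[i + 1];
    }
}

/* Hypothetical wrapper marking the call site. */
void run_smooth(int n, float *in, float *out)
{
#pragma hmpp <filt> smooth callsite, args[in;out].size={n}
    smooth3(n, in, out);
}
```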
A region merges the codelet and callsite directives. The goal is to avoid restructuring the code to build a codelet. Therefore, all the attributes available for codelet or callsite directives can be used on region directives.
In C language:
#pragma hmpp [<MyGroup>] [label] region [, args[arg_items].io=[in|out|inout]]* [, cond = "expr"] [, args[arg_items].const=true]* [, target=target_name[:target_name]*] [, args[arg_items].size={dimsize[,dimsize]*}]* [, args[arg_items].advancedload=[true|false]]* [, args[arg_items].addr="expr"]* [, args[arg_items].noupdate=true]* [, asynchronous]? [, private=[arg_items]]* { C BLOCK STATEMENTS }
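A minimal sketch of a region, assuming a hypothetical label `scale`: the loop stays where it is in the calling function, and the directive marks the block that follows as the code to offload, so no separate codelet function has to be written. The function name and parameter choices are illustrative only; a compiler without HMPP support ignores the pragma and executes the block on the host.

```c
/* Hypothetical function: multiply every element of v by factor.
   The region directive applies to the block statement that follows it. */
void scale_in_place(int n, float factor, float *v)
{
    int i;
#pragma hmpp scale region, args[v].io=inout, args[v].size={n}, target=CUDA
    {
        for (i = 0; i < n; i++) {
            v[i] = v[i] * factor;
        }
    }
}
```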
The OpenHMPP Open Standard is based on HMPP Version 2.3 (May 2009, CAPS entreprise).
The OpenHMPP directive-based programming model is implemented in:
OpenHMPP is used by HPC actors in Oil & Gas, Energy, Manufacturing, Finance, Education & Research.