The C preprocessor is the macro preprocessor for several computer programming languages, such as C, Objective-C, C++, and a variety of Fortran languages. The preprocessor provides inclusion of header files, macro expansions, conditional compilation, and line control.
The language of preprocessor directives is only weakly related to the grammar of C, and so is sometimes used to process other kinds of text files. [1]
The preprocessor was introduced to C around 1973 at the urging of Alan Snyder and also in recognition of the usefulness of the file inclusion mechanisms available in BCPL and PL/I. Its original version offered only file inclusion and simple string replacement using #include
and #define
for parameterless macros, respectively. It was extended shortly after, firstly by Mike Lesk and then by John Reiser, to incorporate macros with arguments and conditional compilation. [2]
The C preprocessor was part of a long macro-language tradition at Bell Labs, which was started by Douglas Eastwood and Douglas McIlroy in 1959. [3]
Preprocessing is defined by the first four (of eight) phases of translation specified in the C Standard.
In the fourth phase, directives are executed, macros are expanded, and _Pragma operators are handled.
One of the most common uses of the preprocessor is to include another source file:
#include <stdio.h>

int main(void)
{
    printf("Hello, World!\n");
    return 0;
}
The preprocessor replaces the line #include <stdio.h>
with the textual content of the file 'stdio.h', which declares the printf()
function among other things.
This can also be written using double quotes, e.g. #include "stdio.h"
. If the filename is enclosed within angle brackets, the file is searched for in the standard compiler include paths. If the filename is enclosed within double quotes, the search path is expanded to include the current source file directory. C compilers and programming environments all have a facility that allows the programmer to define where include files can be found. This can be introduced through a command-line flag, which can be parameterized using a makefile, so that a different set of include files can be swapped in for different operating systems, for instance.
By convention, include files are named with either a .h
or .hpp
extension. However, there is no requirement that this be observed. Files with a .def
extension may denote files designed to be included multiple times, each time expanding the same repetitive content; #include "icon.xbm"
is likely to refer to an XBM image file (which is at the same time a C source file).
#include
often compels the use of #include
guards or #pragma once
to prevent double inclusion.
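As a minimal sketch of the guard idiom, the following single file simulates including the same header twice by repeating its contents. The header name point.h, the guard macro POINT_H, and the struct are hypothetical and chosen only for illustration.

```c
/* Simulated contents of a hypothetical header "point.h", first inclusion. */
#ifndef POINT_H
#define POINT_H

struct point {
    int x;
    int y;
};

#endif /* POINT_H */

/* Simulated second inclusion: POINT_H is already defined, so the
   preprocessor skips everything up to the matching #endif and the
   struct is not defined a second time. */
#ifndef POINT_H
#define POINT_H

struct point {
    int x;
    int y;
};

#endif /* POINT_H */

/* Uses the single definition of struct point that survived. */
int point_sum(struct point p)
{
    return p.x + p.y;
}
```

Without the guard, the second copy of the struct definition would be a redefinition error.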
The if–else directives #if
, #ifdef
, #ifndef
, #else
, #elif
, and #endif
can be used for conditional compilation. #ifdef
and #ifndef
are simple shorthands for #if defined(...)
and #if !defined(...)
.
#if VERBOSE >= 2
    printf("trace message");
#endif
Most compilers targeting Microsoft Windows implicitly define _WIN32
. [4] This allows code, including preprocessor commands, to compile only when targeting Windows systems. A few compilers define WIN32
instead. For such compilers that do not implicitly define the _WIN32
macro, it can be specified on the compiler's command line, using -D_WIN32
.
#ifdef __unix__ /* __unix__ is usually defined by compilers targeting Unix systems */
# include <unistd.h>
#elif defined _WIN32 /* _WIN32 is usually defined by compilers targeting 32 or 64 bit Windows systems */
# include <windows.h>
#endif
The example code tests if a macro __unix__
is defined. If it is, the file <unistd.h>
is then included. Otherwise, it tests if a macro _WIN32
is defined instead. If it is, the file <windows.h>
is then included.
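The same tests can select code as well as headers. The sketch below is one possible arrangement; the function name and the returned labels are hypothetical, and the extra check for __APPLE__ is an assumption added here because not every Unix-like target defines __unix__.

```c
/* Returns a short, hypothetical label for the platform being targeted.
   Exactly one branch survives preprocessing. */
const char *target_platform(void)
{
#if defined(_WIN32)
    return "windows";
#elif defined(__unix__) || defined(__APPLE__)  /* __APPLE__: added assumption */
    return "unix-like";
#else
    return "unknown";
#endif
}
```

Only the selected return statement reaches the compiler; the other branches are discarded before compilation proper begins.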
A more complex #if
example can use operators; for example:
#if !(defined __LP64__ || defined __LLP64__) || defined _WIN32 && !defined _WIN64
    // we are compiling for a 32-bit system
#else
    // we are compiling for a 64-bit system
#endif
Translation can also be caused to fail by using the #error
directive:
#if RUBY_VERSION == 190
# error 1.9.0 not supported
#endif
There are two types of macros: object-like and function-like. Object-like macros do not take parameters; function-like macros do (although the list of parameters may be empty). The generic syntax for declaring an identifier as a macro of each type is, respectively:
#define <identifier> <replacement token list>                    // object-like macro
#define <identifier>(<parameter list>) <replacement token list>  // function-like macro, note parameters
The function-like macro declaration must not have any whitespace between the identifier and the first, opening parenthesis. If whitespace is present, the macro will be interpreted as object-like with everything starting from the first parenthesis added to the token list.
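The effect of a stray space can be made visible by turning the expansions into strings. This is only an illustrative sketch: the macro names STR, XSTR, SQUARE and BROKEN are hypothetical, and the exact spacing inside the resulting strings may vary between preprocessors.

```c
#define STR(x)  #x
#define XSTR(x) STR(x)          /* expands its argument, then stringifies it */

#define SQUARE(x) ((x) * (x))   /* function-like: no space before '(' */
#define BROKEN (x) ((x) * (x))  /* object-like: replacement list starts at "(x)" */

/* SQUARE(3) is expanded with x replaced by 3 before being stringified,
   so the text contains 3 and no x. */
const char *square_text = XSTR(SQUARE(3));

/* BROKEN takes no arguments; its whole replacement list, including the
   intended parameter list "(x)", is substituted unchanged. */
const char *broken_text = XSTR(BROKEN);
```

Printing the two strings shows that the text of BROKEN still begins with the literal (x), which is why an attempted call such as BROKEN(3) does not behave like SQUARE(3).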
A macro definition can be removed with #undef
:
#undef <identifier> // delete the macro
Whenever the identifier appears in the source code it is replaced with the replacement token list, which can be empty. For an identifier declared to be a function-like macro, it is only replaced when the following token is also a left parenthesis that begins the argument list of the macro invocation. The exact procedure followed for expansion of function-like macros with arguments is subtle.
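The left-parenthesis rule can be seen when a function and a function-like macro share a name. In the sketch below (all names are hypothetical) the macro deliberately computes a different value from the function so that the two can be told apart.

```c
/* An ordinary function, defined before the macro of the same name. */
static int twice(int x)
{
    return x + x;
}

/* Function-like macro with the same name; the added 1000 exists only
   to make the macro's result distinguishable from the function's. */
#define twice(x) ((x) * 2 + 1000)

int call_with_parenthesis(void)
{
    return twice(3);          /* name followed by '(': the macro is expanded */
}

int call_without_parenthesis(void)
{
    int (*fn)(int) = twice;   /* name not followed by '(': no expansion */
    return fn(3);             /* calls the real function */
}
```

Here call_with_parenthesis() yields 1006 and call_without_parenthesis() yields 6.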
Object-like macros were conventionally used as part of good programming practice to create symbolic names for constants; for example:
#define PI 3.14159
instead of hard-coding numbers throughout the code. An alternative in both C and C++, especially in situations in which a pointer to the number is required, is to apply the const
qualifier to a global variable. This causes the value to be stored in memory, instead of being substituted by the preprocessor.
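A brief sketch of the difference, using hypothetical names: the macro is replaced by a literal and has no address, whereas the const object can be passed by pointer.

```c
#define PI_MACRO 3.14159                  /* substituted textually; has no address */

static const double pi_const = 3.14159;   /* an object stored in memory */

/* Takes a pointer, so it needs an object rather than a literal. */
double scale_by(const double *factor, double value)
{
    return *factor * value;
}

double circumference(double radius)
{
    /* &PI_MACRO would not compile; &pi_const is valid. */
    return scale_by(&pi_const, 2.0 * radius);
}

double circumference_macro(double radius)
{
    return PI_MACRO * 2.0 * radius;       /* the literal is pasted in here */
}
```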
An example of a function-like macro is:
#define RADTODEG(x) ((x) * 57.29578)
This defines a radians-to-degrees conversion which can be inserted in the code where required; for example, RADTODEG(34)
. This is expanded in-place, so that repeated multiplication by the constant is not shown throughout the code. The macro here is written as all uppercase to emphasize that it is a macro, not a compiled function.
The second x
is enclosed in its own pair of parentheses to avoid the possibility of incorrect order of operations when it is an expression instead of a single value. For example, the expression RADTODEG(r+1)
expands correctly as ((r+1)*57.29578)
; without parentheses, (r+1*57.29578)
gives precedence to the multiplication.
Similarly, the outer pair of parentheses maintain correct order of operation. For example, 1/RADTODEG(r)
expands to 1/((r)*57.29578)
; without parentheses, 1/(r)*57.29578
gives precedence to the division.
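The two pitfalls can be compared side by side. In this sketch the deliberately under-parenthesized variants RADTODEG_NO_INNER and RADTODEG_NO_OUTER, and the wrapper functions, are hypothetical names introduced for illustration.

```c
#define RADTODEG(x)          ((x) * 57.29578)  /* fully parenthesized */
#define RADTODEG_NO_INNER(x) (x * 57.29578)    /* parameter unprotected */
#define RADTODEG_NO_OUTER(x) (x) * 57.29578    /* result unprotected */

double good_sum(double r)
{
    return RADTODEG(r + 1);            /* ((r + 1) * 57.29578) */
}

double bad_sum(double r)
{
    return RADTODEG_NO_INNER(r + 1);   /* (r + 1 * 57.29578) */
}

double good_reciprocal(double r)
{
    return 1 / RADTODEG(r);            /* 1 / ((r) * 57.29578) */
}

double bad_reciprocal(double r)
{
    return 1 / RADTODEG_NO_OUTER(r);   /* 1 / (r) * 57.29578 */
}
```

With r equal to 1, good_sum gives about 114.59 while bad_sum gives about 58.30; with r equal to 2, bad_reciprocal gives about 28.65 instead of a value below 0.01.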
Function-like macro expansion occurs in the following stages:
This may produce surprising results:
#define HE HI
#define LLO _THERE
#define HELLO "HI THERE"
#define CAT(a,b) a##b
#define XCAT(a,b) CAT(a,b)
#define CALL(fn) fn(HE,LLO)

CAT(HE, LLO)  // "HI THERE", because concatenation occurs before normal expansion
XCAT(HE, LLO) // HI_THERE, because the tokens originating from parameters ("HE" and "LLO") are expanded first
CALL(CAT)     // "HI THERE", because this evaluates to CAT(a,b)
Certain symbols are required to be defined by an implementation during preprocessing. These include __FILE__
and __LINE__
, predefined by the preprocessor itself, which expand into the current file and line number. For instance, the following:
// debugging macros so we can pin down message origin at a glance
// is bad
#define WHERESTR "[file %s, line %d]: "
#define WHEREARG __FILE__, __LINE__
#define DEBUGPRINT2(...) fprintf(stderr, __VA_ARGS__)
#define DEBUGPRINT(_fmt, ...) DEBUGPRINT2(WHERESTR _fmt, WHEREARG, __VA_ARGS__)
// OR
// is good
#define DEBUGPRINT(_fmt, ...) fprintf(stderr, "[file %s, line %d]: " _fmt, __FILE__, __LINE__, __VA_ARGS__)

DEBUGPRINT("hey, x=%d\n", x);
prints the value of x
, preceded by the file and line number to the error stream, allowing quick access to which line the message was produced on. Note that the WHERESTR
argument is concatenated with the string following it. The values of __FILE__
and __LINE__
can be manipulated with the #line
directive. The #line
directive determines the line number and the file name of the line below. For example:
#line 314 "pi.c"
printf("line=%d file=%s\n", __LINE__, __FILE__);
generates the printf
function call:
printf("line=%d file=%s\n",314,"pi.c");
Source code debuggers refer also to the source position defined with __FILE__
and __LINE__
. This allows source code debugging when C is used as the target language of a compiler for a totally different language.
The first C Standard specified that the macro __STDC__
be defined to 1 if the implementation conforms to the ISO Standard and 0 otherwise, and the macro __STDC_VERSION__
defined as a numeric literal specifying the version of the Standard supported by the implementation. Standard C++ compilers support the __cplusplus
macro. Compilers running in non-standard mode must not set these macros or must define others to signal the differences.
Other Standard macros include __DATE__
, the current date, and __TIME__
, the current time.
The second edition of the C Standard, C99, added support for __func__
, which contains the name of the function definition within which it is contained, but because the preprocessor is agnostic to the grammar of C, this must be done in the compiler itself using a variable local to the function.
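A short sketch of __func__ in use; the function name report_name is hypothetical. Because __func__ behaves like a static array local to the function, returning a pointer to it is safe.

```c
/* Returns the name of this function as the compiler recorded it.
   __func__ acts like a static const char array declared at the start
   of the function body, so the pointer stays valid after returning. */
const char *report_name(void)
{
    return __func__;
}
```

Calling report_name() yields the string "report_name".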
Macros that can take a varying number of arguments (variadic macros) are not allowed in C89, but were introduced by a number of compilers and standardized in C99. Variadic macros are particularly useful when writing wrappers to functions taking a variable number of parameters, such as printf
, for example when logging warnings and errors.
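As an illustration, the hypothetical macro LOG_TO below wraps snprintf and prefixes a severity tag. It is a sketch rather than a recommended logging design, and it assumes at least one argument is passed for the variable part, as C99 requires.

```c
#include <stdio.h>

/* Hypothetical logging macro: forwards a format string and its
   arguments to snprintf, prefixing a severity tag. The tag and the
   format must be string literals so that they can be concatenated. */
#define LOG_TO(buf, size, level, fmt, ...) \
    snprintf((buf), (size), "[" level "] " fmt, __VA_ARGS__)

/* Formats an example warning into a static buffer and returns it. */
const char *example_warning(int code)
{
    static char buf[64];
    LOG_TO(buf, sizeof buf, "warn", "disk %s is %d%% full (code %d)", "sda", 93, code);
    return buf;
}
```

For example, example_warning(7) produces "[warn] disk sda is 93% full (code 7)".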
One little-known usage pattern of the C preprocessor is known as X-Macros. [5] [6] [7] An X-Macro is a header file. Commonly, these use the extension .def
instead of the traditional .h
. This file contains a list of similar macro calls, which can be referred to as "component macros." The include file is then referenced repeatedly.
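The idea can be sketched in a single file by holding the list in a macro instead of a separate .def file; this is a common variant of the pattern rather than the include-file form described above, and all names here are hypothetical.

```c
/* The list is written once. Each entry invokes X, which is supplied
   differently at every point of use. The same entries could instead
   live in a .def file that is included repeatedly. */
#define COLOR_LIST(X) \
    X(RED)            \
    X(GREEN)          \
    X(BLUE)

/* First expansion: enumeration constants COLOR_RED, COLOR_GREEN, ... */
#define AS_ENUM(name) COLOR_##name,
enum color { COLOR_LIST(AS_ENUM) COLOR_COUNT };
#undef AS_ENUM

/* Second expansion: matching strings, in the same order. */
#define AS_STRING(name) #name,
static const char *const color_names[] = { COLOR_LIST(AS_STRING) };
#undef AS_STRING

/* Looks up the printable name of a color, or "?" if out of range. */
const char *color_name(enum color c)
{
    int i = (int)c;
    return (i >= 0 && i < COLOR_COUNT) ? color_names[i] : "?";
}
```

Adding a color to COLOR_LIST updates both the enumeration and the name table, which keeps the two in step.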
Many compilers define additional, non-standard macros, although these are often poorly documented. A common reference for these macros is the Pre-defined C/C++ Compiler Macros project, which lists "various pre-defined compiler macros that can be used to identify standards, compilers, operating systems, hardware architectures, and even basic run-time libraries at compile-time."
The #
operator (known as the stringification operator or stringizing operator) converts a token into a C string literal, escaping any quotes or backslashes appropriately.
Example:
#define str(s) #s

str(p = "foo\n";) // outputs "p = \"foo\\n\";"
str(\n)           // outputs "\n"
If stringification of the expansion of a macro argument is desired, two levels of macros must be used:
#define xstr(s) str(s)
#define str(s) #s
#define foo 4

str(foo)  // outputs "foo"
xstr(foo) // outputs "4"
A macro argument cannot be combined with additional text and then stringified. However, a series of adjacent string constants and stringified arguments can be written: the C compiler will then combine all the adjacent string constants into one long string.
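A small sketch of that technique, with a hypothetical macro name and message text: the stringified argument sits between two ordinary string literals, and the compiler joins all three.

```c
/* #cond becomes a string literal, which the compiler then merges with
   the adjacent literals on either side into one string. */
#define DESCRIBE(cond) "check failed: " #cond " (see log)"

const char *range_message = DESCRIBE(x >= 0 && x < 10);
```

Here range_message holds "check failed: x >= 0 && x < 10 (see log)".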
The ##
operator (known as the "Token Pasting Operator") concatenates two tokens into one token.
Example:
#define DECLARE_STRUCT_TYPE(name) typedef struct name##_s name##_t

DECLARE_STRUCT_TYPE(g_object); // Outputs: typedef struct g_object_s g_object_t;
The #error
directive outputs a message through the error stream.
#error "error message"
C23 introduces the #embed
directive for binary resource inclusion. [8] This allows binary files (like images) to be included into the program without them being valid C source files (like XBM), without requiring processing by external tools like xxd -i
and without the use of string literals which have a length limit on MSVC. Similarly to xxd -i
the directive is replaced by a comma-separated list of integers corresponding to the data of the specified resource. More precisely, if an array of type unsigned char
is initialized using an #embed
directive, the result is the same as if the resource were written to the array using fread
(unless a parameter changes the embed element width to something other than CHAR_BIT
). Apart from the convenience, #embed
is also easier for compilers to handle, since they are allowed to skip expanding the directive to its full form due to the as-if rule.
The file to be embedded can be specified in an identical fashion to #include
, meaning, either between chevrons or between quotes. The directive also allows certain parameters to be passed to it to customise its behaviour, which follow the file name. The C standard defines the following parameters and implementations may define their own. The limit
parameter is used to limit the width of the included data. It is mostly intended to be used with "infinite" files like urandom. The prefix
and suffix
parameters allow the programmer to specify a prefix and suffix to the embedded data, which is used if and only if the embedded resource is not empty. Finally, the if_empty
parameter replaces the entire directive if the resource is empty (which happens if the file is empty or a limit of 0 is specified). All standard parameters can also be surrounded by double underscores, just like standard attributes in C23, for example __prefix__
is interchangeable with prefix
. Implementation-defined parameters use a form similar to attribute syntax (e.g., vendor::attr
) but without the square brackets. While all standard parameters require an argument to be passed to them (e.g., limit requires a width), this is generally optional and even the set of parentheses can be omitted if an argument is not required, which might be the case for some implementation-defined parameters.
All C, C++, and Objective-C implementations provide a preprocessor, as preprocessing is a required step for those languages, and its behavior is described by official standards for these languages, such as the ISO C standard.
Implementations may provide their own extensions and deviations, and vary in their degree of compliance with written standards. Their exact behavior may depend on command-line flags supplied on invocation. For instance, the GNU C preprocessor can be made more standards compliant by supplying certain flags. [9]
The #pragma
directive is a compiler-specific directive, which compiler vendors may use for their own purposes. For instance, a #pragma
is often used to allow suppression of specific error messages, manage heap and stack debugging and so on. A compiler with support for the OpenMP parallelization library can automatically parallelize a for
loop with #pragma omp parallel for
.
C99 introduced a few standard #pragma
directives, taking the form #pragma STDC ...
, which are used to control the floating-point implementation. The alternative, macro-like form _Pragma(...)
was also added.
Many implementations also support a #warning directive, which prints a diagnostic message without halting compilation (C23 adds #warning to the standard for this purpose). A typical use is to warn about the usage of some old code, which is now deprecated and only included for compatibility reasons; for example:
// GNU, Intel and IBM
#warning "Do not use ABC, which is deprecated. Use XYZ instead."
// Microsoft
#pragma message("Do not use ABC, which is deprecated. Use XYZ instead.")
GCC provides #include_next for chaining headers of the same name. [13]
There are some preprocessor directives that have been added to the C preprocessor by the specifications of some languages and are specific to said languages.
Objective-C adds #import, which is like #include but only includes the file once. A common vendor pragma with a similar functionality in C is #pragma once.
C++20 adds directives for modules whose syntax does not begin with a # character; instead, they start with import and module respectively, optionally preceded by export.
As the C preprocessor can be invoked separately from the compiler with which it is supplied, it can be used separately, on different languages. Notable examples include its use in the now-deprecated imake system and for preprocessing Fortran. However, such use as a general purpose preprocessor is limited: the input language must be sufficiently C-like. [9] The GNU Fortran compiler automatically calls "traditional mode" (see below) cpp before compiling Fortran code if certain file extensions are used. [16] Intel offers a Fortran preprocessor, fpp, for use with the ifort compiler, which has similar capabilities. [17]
CPP also works acceptably with most assembly languages and Algol-like languages. This requires that the language syntax not conflict with CPP syntax, which means no lines starting with #
and that double quotes, which cpp interprets as string literals and thus ignores, don't have syntactical meaning other than that. The "traditional mode" (acting like a pre-ISO C preprocessor) is generally more permissive and better suited for such use. [18]
The C preprocessor is not Turing-complete, but it comes very close: recursive computations can be specified, but with a fixed upper bound on the amount of recursion performed. [19] However, the C preprocessor is not designed to be, nor does it perform well as, a general-purpose programming language. As the C preprocessor does not have features of some other preprocessors, such as recursive macros, selective expansion according to quoting, and string evaluation in conditionals, it is very limited in comparison to a more general macro processor such as m4.
Having said that, you can often get away with using cpp on things which are not C. Other Algol-ish programming languages are often safe (Ada, etc.) So is assembly, with caution. -traditional-cpp mode preserves more white space, and is otherwise more permissive. Many of the problems can be avoided by writing C or C++ style comments instead of native language comments, and keeping macros simple.