Inline function

In the C and C++ programming languages, an inline function is one qualified with the keyword inline; this serves two purposes:

  1. It serves as a compiler directive that suggests (but does not require) that the compiler substitute the body of the function inline by performing inline expansion, i.e. by inserting the function code at the address of each function call, thereby saving the overhead of a function call. In this respect it is analogous to the register storage class specifier, which similarly provides an optimization hint. [1]
  2. The second purpose of inline is to change linkage behavior; the details of this are complicated. This is necessary due to the C/C++ separate compilation + linkage model, specifically because the definition (body) of the function must be duplicated in all translation units where it is used, to allow inlining during compiling, which, if the function has external linkage, causes a collision during linking (it violates uniqueness of external symbols). C and C++ (and dialects such as GNU C and Visual C++) resolve this in different ways. [1]

Example

An inline function can be written in C or C++ like this:

inline void swap(int *m, int *n)
{
    int tmp = *m;
    *m = *n;
    *n = tmp;
}

Then, a statement such as the following:

swap(&x, &y);

may be translated into (if the compiler decides to do the inlining, which typically requires optimization to be enabled):

int tmp = x;
x = y;
y = tmp;

When implementing a sorting algorithm doing lots of swaps, this can increase the execution speed.
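As a minimal sketch of the sorting case, the swap function can be used inside a simple bubble sort. The function name bubble_sort is hypothetical and chosen only for illustration. The swap function is declared static inline here, an assumption made so that the single-file example links whether or not the compiler actually inlines the call.

```c
#include <stddef.h>

/* static inline: a local out-of-line copy is emitted only if the
   compiler decides not to inline a call (assumption for this sketch). */
static inline void swap(int *m, int *n)
{
    int tmp = *m;
    *m = *n;
    *n = tmp;
}

/* Hypothetical helper: sorts a[0..len-1] in ascending order.
   Each out-of-order neighbouring pair is exchanged with swap(). */
void bubble_sort(int *a, size_t len)
{
    for (size_t i = 0; i + 1 < len; i++)
        for (size_t j = 0; j + 1 < len - i; j++)
            if (a[j] > a[j + 1])
                swap(&a[j], &a[j + 1]);
}
```

Whether the calls in the inner loop are expanded inline still depends on the compiler and its optimization settings.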

Standard support

C++ and C99, but not its predecessors K&R C and C89, have support for inline functions, though with different semantics. In both cases, inline does not force inlining; the compiler is free to choose not to inline the function at all, or only in some cases. Different compilers vary in how complex a function they can manage to inline. Mainstream C++ compilers like Microsoft Visual C++ and GCC support an option that lets the compilers automatically inline any suitable function, even those not marked as inline functions. However, simply omitting the inline keyword to let the compiler make all inlining decisions is not possible, since the linker will then complain about duplicate definitions in different translation units. This is because inline not only gives the compiler a hint that the function should be inlined, it also has an effect on whether the compiler will generate a callable out-of-line copy of the function (see storage classes of inline functions).

Nonstandard extensions

GNU C, as part of the dialect gnu89 that it offers, has support for inline as an extension to C89. However, the semantics differ from both those of C++ and C99. armcc in C90 mode also offers inline as a non-standard extension, with semantics different from gnu89 and C99.

Some implementations provide a means by which to force the compiler to inline a function, usually through implementation-specific declaration specifiers such as __forceinline in Microsoft Visual C++ and __attribute__((__always_inline__)) in gcc and clang.

Indiscriminate use of forced inlining can result in larger code (a bloated executable file), minimal or no performance gain, and in some cases even a loss in performance. Moreover, the compiler cannot inline the function in all circumstances, even when inlining is forced; in this case both gcc and Visual C++ generate warnings.

Forcing inlining is useful if:

For code portability, the following preprocessor directives may be used:

#ifdef _MSC_VER
#define forceinline __forceinline
#elif defined(__GNUC__)
#define forceinline inline __attribute__((__always_inline__))
#elif defined(__clang__)
#if __has_attribute(__always_inline__)
#define forceinline inline __attribute__((__always_inline__))
#else
#define forceinline inline
#endif
#else
#define forceinline inline
#endif
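A short usage sketch may help here. It repeats a reduced form of the macro so that the block is self-contained; the function names square and sum_of_squares are hypothetical. The helper is also declared static, which is an assumption of this sketch that avoids any dependence on the linkage rules discussed later.

```c
/* Reduced form of the portability macro (sketch only). */
#if defined(_MSC_VER)
#define forceinline __forceinline
#elif defined(__GNUC__)
#define forceinline inline __attribute__((__always_inline__))
#else
#define forceinline inline
#endif

/* Hypothetical small helper that should be expanded at every call site. */
static forceinline int square(int x)
{
    return x * x;
}

/* Hypothetical caller: returns the sum of the squares of v[0..n-1]. */
int sum_of_squares(const int *v, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        total += square(v[i]);
    return total;
}
```

On compilers that recognise neither specifier, the macro falls back to plain inline, so the code still compiles but inlining is only a hint.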

Storage classes of inline functions

static inline has the same effects in all C dialects and C++. It will emit a locally visible (out-of-line copy of the) function if required.

Regardless of the storage class, the compiler can ignore the inline qualifier and generate a function call in all C dialects and C++.

The effect of the storage class extern when applied or not applied to inline functions differs between the C dialects [2] and C++. [3]

C99

In C99, a function defined inline will never, and a function defined extern inline will always, emit an externally visible function. Unlike in C++, there is no way to ask for an externally visible function shared among translation units to be emitted only if required.

If inline declarations are mixed with extern inline declarations or with unqualified declarations (i.e., without inline qualifier or storage class), the translation unit must contain a definition (no matter whether unqualified, inline, or extern inline) and an externally visible function will be emitted for it.

A function defined inline requires exactly one function with that name somewhere else in the program which is either defined extern inline or without qualifier. If more than one such definition is provided in the whole program, the linker will complain about duplicate symbols. If, however, it is lacking, the linker does not necessarily complain, because, if all uses could be inlined, it is not needed. But it may complain, since the compiler can always ignore the inline qualifier and generate calls to the function instead, as typically happens if the code is compiled without optimization. (This may be the desired behavior, if the function is supposed to be inlined everywhere by all means, and an error should be generated if it is not.) A convenient way is to define the inline functions in header files and create one .c file per function, containing an extern inline declaration for it and including the respective header file with the definition. It does not matter whether the declaration is before or after the include.
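The header-plus-source arrangement described above can be sketched in a single file, assuming the code is compiled with C99 (or later) inline semantics. The names max_int and largest are hypothetical. In a real project the first part would live in a header and the extern inline declaration in exactly one .c file that includes it.

```c
/* Part 1: what would normally be in a header (hypothetical max_int.h).
   On its own this is an inline definition and emits no external symbol. */
inline int max_int(int a, int b)
{
    return a > b ? a : b;
}

/* Part 2: what would normally be in exactly one .c file.
   This declaration makes the definition above an external definition
   in this translation unit, so a callable copy is emitted here. */
extern inline int max_int(int a, int b);

/* Hypothetical caller: works whether or not the call is inlined,
   because an external definition of max_int now exists.
   Assumes n >= 1. */
int largest(const int *v, int n)
{
    int best = v[0];
    for (int i = 1; i < n; i++)
        best = max_int(best, v[i]);
    return best;
}
```

Without the extern inline declaration, a build without optimization may fail at link time with an undefined reference to max_int, as the paragraph above explains.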

To prevent unreachable code from being added to the final executable if all uses of a function were inlined, it is advised [3] to put the object files of all such .c files with a single extern inline function into a static library file, typically with ar rcs, then link against that library instead of the individual object files. That causes only those object files to be linked that are actually needed, in contrast to linking the object files directly, which causes them to be always included in the executable. However, the library file must be specified after all the other object files on the linker command line, since calls from object files specified after the library file to the functions will not be considered by the linker. Calls from inline functions to other inline functions will be resolved by the linker automatically (the s option in ar rcs ensures this).

An alternative solution is to use link time optimization instead of a library. gcc provides the flag -Wl,--gc-sections to omit sections in which all functions are unused. This will be the case for object files containing the code of a single unused extern inline function. However, it also removes any and all other unused sections from all other object files, not just those related to unused extern inline functions. (It may be desired to link functions into the executable that are to be called by the programmer from the debugger rather than by the program itself, e.g., for examining the internal state of the program.) With this approach, it is also possible to use a single .c file with all extern inline functions instead of one .c file per function. Then the file has to be compiled with -fdata-sections -ffunction-sections. However, the gcc manual page warns about that, saying "Only use these options when there are significant benefits from doing so."

Some recommend an entirely different approach, which is to define functions as static inline instead of inline in header files. [2] Then, no unreachable code will be generated. However, this approach has a drawback in the opposite case: Duplicate code will be generated if the function could not be inlined in more than one translation unit. The emitted function code cannot be shared among translation units because it must have different addresses. This is another drawback; taking the address of such a function defined as static inline in a header file will yield different values in different translation units. Therefore, static inline functions should only be used if they are used in only one translation unit, which means that they should only go to the respective .c file, not to a header file.

gnu89

gnu89 semantics of inline and extern inline are essentially the exact opposite of those in C99, [4] with the exception that gnu89 permits redefinition of an extern inline function as an unqualified function, while C99 inline does not. [5] Thus, gnu89 extern inline without redefinition is like C99 inline, and gnu89 inline is like C99 extern inline; in other words, in gnu89, a function defined inline will always and a function defined extern inline will never emit an externally visible function. The rationale for this is that it matches variables, for which storage will never be reserved if defined as extern and always if defined without. The rationale for C99, in contrast, is that it would be astonishing if using inline had a side effect (always emitting a non-inlined version of the function) that is contrary to what its name suggests.

The remarks for C99 about the need to provide exactly one externally visible function instance for inlined functions and about the resulting problem with unreachable code apply mutatis mutandis to gnu89 as well.

gcc up to and including version 4.2 used gnu89 inline semantics even when -std=c99 was explicitly specified. [6] With version 5, [5] gcc switched from gnu89 to the gnu11 dialect, effectively enabling C99 inline semantics by default. To use gnu89 semantics instead, they have to be enabled explicitly, either with -std=gnu89 or, to only affect inlining, -fgnu89-inline, or by adding the gnu_inline attribute to all inline declarations. To ensure C99 semantics, either -std=c99, -std=c11, -std=gnu99 or -std=gnu11 (without -fgnu89-inline) can be used. [3]
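Code that must build under either set of semantics can check which one is active. The sketch below relies on the predefined macros __GNUC_GNU_INLINE__ and __GNUC_STDC_INLINE__ that GCC documents for this purpose; the macro INLINE_MODEL and the function inline_model are hypothetical names. Compilers that define neither macro fall into the "unknown" branch.

```c
/* Select a label according to the inline semantics in effect. */
#if defined(__GNUC_GNU_INLINE__)
#define INLINE_MODEL "gnu89"      /* e.g. -std=gnu89 or -fgnu89-inline */
#elif defined(__GNUC_STDC_INLINE__)
#define INLINE_MODEL "c99"        /* e.g. -std=c99, -std=gnu11 */
#else
#define INLINE_MODEL "unknown"    /* compiler does not define either macro */
#endif

/* Hypothetical accessor: returns the label chosen at compile time. */
const char *inline_model(void)
{
    return INLINE_MODEL;
}
```

The same preprocessor test could guard alternative declarations of an inline function in a shared header.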

C++

In C++, a function defined inline will, if required, emit a function shared among translation units, typically by putting it into the common section of the object file for which it is needed. The function must have the same definition everywhere, always with the inline qualifier. In C++, extern inline is the same as inline. The rationale for the C++ approach is that it is the most convenient way for the programmer, since no special precautions for elimination of unreachable code must be taken and, like for ordinary functions, it makes no difference whether extern is specified or not.

The inline qualifier is automatically added to a function defined as part of a class definition.

armcc

armcc in C90 mode provides extern inline and inline semantics that are the same as in C++: Such definitions will emit a function shared among translation units if required. In C99 mode, extern inline always emits a function, but like in C++, it will be shared among translation units. Thus, the same function can be defined extern inline in different translation units. [7] This matches the traditional behavior of Unix C compilers [8] for multiple non-extern definitions of uninitialized global variables.

Restrictions

Taking the address of an inline function requires code for a non-inlined copy of that function to be emitted in any case.
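A minimal sketch of this situation follows; all function names are hypothetical. Passing twice as a function pointer means an addressable, out-of-line copy is generally needed, while the direct call in direct_call may still be expanded inline.

```c
/* Small function that is a natural candidate for inlining. */
static inline int twice(int x)
{
    return 2 * x;
}

/* Hypothetical helper: applies f to every element of v[0..n-1]. */
void map_ints(int *v, int n, int (*f)(int))
{
    for (int i = 0; i < n; i++)
        v[i] = f(v[i]);
}

/* The address of twice is taken here, so the compiler generally
   has to emit a callable copy of the function. */
void double_all(int *v, int n)
{
    map_ints(v, n, twice);
}

/* A direct call such as this one can still be inlined. */
int direct_call(int x)
{
    return twice(x);
}
```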

In C99, an inline or extern inline function must not access static global variables or define non-const static local variables. const static local variables may or may not be different objects in different translation units, depending on whether the function was inlined or whether a call was made. Only static inline definitions can reference identifiers with internal linkage without restrictions; those will be different objects in each translation unit. In C++, both const and non-const static locals are allowed and they refer to the same object in all translation units.
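The permitted case can be sketched as follows; the function name next_id is hypothetical. Because the function is static inline, it may define a modifiable static local, but each translation unit that includes such a definition gets its own independent counter.

```c
/* Allowed in C99: the function has internal linkage, so it may define
   a non-const static local. The counter is private to this translation
   unit and is not shared with other units that define the same function. */
static inline unsigned next_id(void)
{
    static unsigned counter = 0;
    return ++counter;
}
```

Removing static from the function definition would make the static local a constraint violation in C99, which compilers typically diagnose.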

gcc cannot inline functions if [3]

  1. they are variadic,
  2. use alloca
  3. use computed goto
  4. use nonlocal goto
  5. use nested functions
  6. use setjmp
  7. use __builtin_longjmp
  8. use __builtin_return, or
  9. use __builtin_apply_args

According to Microsoft's documentation on MSDN, Visual C++ cannot inline a function (not even with __forceinline) if

  1. The function or its caller is compiled with /Ob0 (the default option for debug builds).
  2. The function and the caller use different types of exception handling (C++ exception handling in one, structured exception handling in the other).
  3. The function has a variable argument list.
  4. The function uses inline assembly, unless compiled with /Og, /Ox, /O1, or /O2.
  5. The function is recursive and not accompanied by #pragma inline_recursion(on). With the pragma, recursive functions are inlined to a default depth of 16 calls. To reduce the inlining depth, use inline_depth pragma.
  6. The function is virtual and is called virtually. Direct calls to virtual functions can be inlined.
  7. The program takes the address of the function and the call is made via the pointer to the function. Direct calls to functions that have had their address taken can be inlined.
  8. The function is also marked with the naked __declspec modifier.

Problems

Besides the problems with inline expansion in general (see Inline expansion § Effect on performance), inline functions as a language feature may not be as valuable as they appear, for a number of reasons:

Quotes

A function declaration ... with an inline specifier declares an inline function. The inline specifier indicates to the implementation that inline substitution of the function body at the point of call is to be preferred to the usual function call mechanism. An implementation is not required to perform this inline substitution at the point of call; however, even if this inline substitution is omitted, the other rules for inline functions defined by 7.1.2 shall still be respected.

ISO/IEC 14882:2011, the C++11 standard, section 7.1.2

A function declared with an inline function specifier is an inline function ... Making a function an inline function suggests that calls to the function be as fast as possible. The extent to which such suggestions are effective is implementation-defined (footnote: For example, an implementation might never perform inline substitution, or might only perform inline substitutions to calls in the scope of an inline declaration.)

... An inline definition does not provide an external definition for the function, and does not forbid an external definition in another translation unit. An inline definition provides an alternative to an external definition, which a translator may use to implement any call to the function in the same translation unit. It is unspecified whether a call to the function uses the inline definition or the external definition.

ISO 9899:1999(E), the C99 standard, section 6.7.4

References

  1. Meyers, Randy (July 1, 2002). "The New C: Inline Functions".
  2. "Inline Functions in C".
  3. "Using the GNU Compiler Collection (GCC): Inline".
  4. "Josef "Jeff" Sipek » GNU inline vs. C99 inline".
  5. "Porting to GCC 5 - GNU Project".
  6. "Ian Lance Taylor - Clean up extern inline".
  7. "Documentation – Arm Developer".
  8. gcc manual page, description of -fno-common