Include directive

Last updated

Many programming languages and other computer files have a directive, often called include, import, or copy, that causes the contents of the specified file to be inserted into the original file. These included files are called header files or copybooks. They are often used to define the physical layout of program data, pieces of procedural code, and/or forward declarations while promoting encapsulation and the reuse of code or data.

Contents

Header files

In computer programming, a header file is a file that allows programmers to separate certain elements of a program's source code into reusable files. Header files commonly contain forward declarations of classes, subroutines, variables, and other identifiers. Programmers who wish to declare standardized identifiers in more than one source file can place such identifiers in a single header file, which other code can then include whenever the header contents are required. This is to keep the interface in the header separate from the implementation. [1]

The C standard library and the C++ standard library traditionally declare their standard functions in header files.

Some recently created compiled languages (such as Java and C#) do not use forward declarations; identifiers are recognized automatically from source files and read directly from dynamic library symbols. This means header files are not needed.

Purpose

The include directive allows libraries of code to be developed which help to:

Example

An example situation which benefits from the use of an include directive is when referring to functions in a different file. Suppose there is some C source file containing a function add, which is referred to in a second file by first declaring its external existence and type (with a function prototype) as follows:

intadd(int,int);inttriple(intx){returnadd(x,add(x,x));}

One drawback of this approach is that the function prototype must be present in all files that use the function. Another drawback is that if the return type or arguments of the function are changed, all of these prototypes would need to be updated. Putting the prototype in a single, separate file avoids these issues. Assuming the prototype is moved to the file add.h, the second source file can then become:

#include"add.h"inttriple(intx){returnadd(x,add(x,x));}

Now, every time the code is compiled, the latest function prototypes in add.h will be included in the files using them, avoiding potential errors.

Language support

C/C++

In the C and C++ programming languages, the #include preprocessor directive causes the compiler to replace that line with the entire text of the contents of the named source file (if included in quotes: "") or named header (if included in angle brackets: <>); [2] a header doesn't need to be a source file. [3] Inclusion continues recursively on these included contents, up to an implementation-defined nesting limit. Headers need not have names corresponding to files: in C++ standard headers are typically identified with words, like "vector", hence #include <vector>, while in C standard headers have identifiers in the form of filenames with a ".h" extension, as in #include <stdio.h>. A "source file" can be any file, with a name of any form, but is most commonly named with a ".h" extension and called a "header file" (sometimes ".hpp" or ".hh" to distinguish C++ headers), though files with .c, .cc, and .cpp extensions may also be included (particularly in the single compilation unit technique), and sometimes other extensions are used.

These two forms of #include directive can determine which header or source file to include in an implementation-defined way. In practice, what is usually done is that the angle-brackets form searches for source files in a standard system directory (or set of directories), and then searches for source files in local or project-specific paths (specified on the command line, in an environment variable, or in a Makefile or other build file), while the form with quotes does not search in a standard system directory, only searching in local or project-specific paths. [4] In case there is no clash, the angle-brackets form can also be used to specify project-specific includes, but this is considered poor form. The fact that headers need not correspond to files is primarily an implementation technicality, and is used to omit the .h extension in including C++ standard headers; in common use, "header" means "header file".

For example:

#include<stdio.h>  // Include the contents of the standard header 'stdio.h' (probably a file 'stdio.h').#include<vector>  // Include the contents of the standard header 'vector' (probably a file 'vector.h').#include"user_defined.h"  // Include the contents of the file 'user_defined.h'.

In C and C++, problems may be faced if two (or more) include files contain the same third file. One solution is to avoid include files from including any other files, possibly requiring the programmer to manually add extra include directives to the original file. Another solution is to use include guards. [5]

Since C++20, headers can also be imported as header units, that is, separate translation units synthesized from a header. [6] They are meant to be used alongside modules. The syntax used in that case is

exportoptional import header-name;

For example:

import<stdio.h>;// supporting this is optionalimport<vector>;// supporting this is mandated by the standardexportimport"user_defined.h";

Header units are provided for all the C++ standard library headers. [7]

COBOL

COBOL (and also RPG IV) allows programmers to copy copybooks into the source of the program in a similar way to header files, but it also allows for the replacement of certain text in them with other text. The COBOL keyword for inclusion is COPY, and replacement is done using the REPLACING ... BY ... clause. An include directive has been present in COBOL since COBOL 60, but changed from the original INCLUDE [8] to COPY by 1968. [9]

Fortran

Fortran does not require header files per se. However, Fortran 90 and later have two related features: include statements and modules. The former can be used to share a common file containing procedure interfaces, much like a C header, although the specification of an interface is not required for all varieties of Fortran procedures. This approach is not commonly used; instead, procedures are generally grouped into modules that can then be referenced with a use statement within other regions of code. For modules, header-type interface information is automatically generated by the compiler and typically put into separate module files, although some compilers have placed this information directly into object files. The language specification itself does not mandate the creation of any extra files, even though module procedure interfaces are almost universally propagated in this manner.

Pascal

Most Pascal compilers support the $i or $include compiler directive, in which the $i or $include directive immediately follows the start of a comment block in the form of

Where the $i or $include directive is not case sensitive, and filename.pas or filename.inc is the name of the file to be included. (It has been common practice to name Pascal's include files with the extension .inc, but this is not required.) Some compilers, to prevent unlimited recursion, limit invoking an include file to a certain number, prohibit invoking itself or any currently open file, or are limited to a maximum of one include file at a time, e.g. an include file cannot include itself or another file. However, the program that includes other files can include several, just one at a time.

PHP

In PHP, the include directive causes another PHP file to be included and evaluated. [10] Similar commands are require, which upon failure to include will produce a fatal exception and halt the script, [11] and include_once and require_once, which prevent a file from being included or required again if it has already been included or required, avoiding the C's double inclusion problem.

Other languages

There are many forms of the include directive, such as:

Modern languages (e.g. Haskell and Java) tend to avoid copybooks or includes, preferring modules and import/export systems for namespace control. Some of these languages (such as Java and C#) do not use forward declarations and, instead, identifiers are recognized automatically from source files and read directly from dynamic library symbols (typically referenced with import or using directives), meaning header files are not needed.

See also

Related Research Articles

In computing, a namespace is a set of signs (names) that are used to identify and refer to objects of various kinds. A namespace ensures that all of a given set of objects have unique names so that they can be easily identified.

<span class="mw-page-title-main">Library (computing)</span> Collection of non-volatile resources used by computer programs

In computer science, a library is a collection of non-volatile resources used by computer programs, often for software development. These may include configuration data, documentation, help data, message templates, pre-written code and subroutines, classes, values or type specifications. In IBM's OS/360 and its successors they are referred to as partitioned data sets.

A filename extension, file name extension or file extension is a suffix to the name of a computer file. The extension indicates a characteristic of the file contents or its intended use. A filename extension is typically delimited from the rest of the filename with a full stop (period), but in some systems it is separated with spaces.

In computer science, imperative programming is a programming paradigm of software that uses statements that change a program's state. In much the same way that the imperative mood in natural languages expresses commands, an imperative program consists of commands for the computer to perform. Imperative programming focuses on describing how a program operates step by step, rather than on high-level descriptions of its expected results.

The C preprocessor is the macro preprocessor for several computer programming languages, such as C, Objective-C, C++, and a variety of Fortran languages. The preprocessor provides inclusion of header files, macro expansions, conditional compilation, and line control.

<span class="mw-page-title-main">D (programming language)</span> Multi-paradigm system programming language

D, also known as dlang, is a multi-paradigm system programming language created by Walter Bright at Digital Mars and released in 2001. Andrei Alexandrescu joined the design and development effort in 2007. Though it originated as a re-engineering of C++, D is a profoundly different language — features of D can be considered streamlined and expanded-upon ideas from C++, however D also draws inspiration from other high-level programming languages, notably Java, Python, Ruby, C#, and Eiffel.

In computer programming, a global variable is a variable with global scope, meaning that it is visible throughout the program, unless shadowed. The set of all global variables is known as the global environment or global state. In compiled languages, global variables are generally static variables, whose extent (lifetime) is the entire runtime of the program, though in interpreted languages, global variables are generally dynamically allocated when declared, since they are not known ahead of time.

The C standard library or libc is the standard library for the C programming language, as specified in the ISO C standard. Starting from the original ANSI C standard, it was developed at the same time as the C library POSIX specification, which is a superset of it. Since ANSI C was adopted by the International Organization for Standardization, the C standard library is also called the ISO C library.

In compiler construction, name mangling is a technique used to solve various problems caused by the need to resolve unique names for programming entities in many modern programming languages.

Modular programming is a software design technique that emphasizes separating the functionality of a program into independent, interchangeable modules, such that each contains everything necessary to execute only one aspect of the desired functionality.

The GNU Scientific Library is a software library for numerical computations in applied mathematics and science. The GSL is written in C; wrappers are available for other programming languages. The GSL is part of the GNU Project and is distributed under the GNU General Public License.

In computer programming, an entry point is the place in a program where the execution of a program begins, and where the program has access to command line arguments.

Managed Extensions for C++ or Managed C++ is a now-deprecated set of language extensions for C++, including grammatical and syntactic extensions, keywords and attributes, to bring the C++ syntax and language to the .NET Framework. These extensions were created by Microsoft to allow C++ code to be targeted to the Common Language Runtime (CLR) in the form of managed code, as well as continue to interoperate with native code.

In the C and C++ programming languages, an #include guard, sometimes called a macro guard, header guard or file guard, is a particular construct used to avoid the problem of double inclusion when dealing with the include directive.

A foreign function interface (FFI) is a mechanism by which a program written in one programming language can call routines or make use of services written or compiled in another one. An FFI is often used in contexts where calls are made into binary dynamic-link library.

Platform Invocation Services, commonly referred to as P/Invoke, is a feature of Common Language Infrastructure implementations, like Microsoft's Common Language Runtime, that enables managed code to call native code.

This comparison of programming languages compares the features of language syntax (format) for over 50 computer programming languages.

Tcov is a source code coverage analysis and statement-by-statement profiling tool for software written in Fortran, C and C++. Tcov generates exact counts of the number of times each statement in a program is executed and annotates source code to add instrumentation. It is a standard utility, provided free of cost with Sun Studio software.

In software engineering, the module pattern is a design pattern used to implement the concept of software modules, defined by modular programming, in a programming language with incomplete direct support for the concept.

Although C++ is one of the most widespread programming languages, many prominent software engineers criticize C++ for being overly complex and fundamentally flawed. Among the critics have been: Robert Pike, Joshua Bloch, Linus Torvalds, Donald Knuth, Richard Stallman, and Ken Thompson. C++ was widely adopted and implemented as a systems language through most of its existence, it has been used to build many pieces of very important software.

References

  1. Alan Griffiths (2005). "Separating Interface and Implementation in C++". ACCU. Retrieved 2013-05-07.
  2. C11 standard, 6.10.2 Source file inclusion, pp. 164–165
  3. C11 standard, 7.1.2 Standard headers, p. 181, footnote 182: "A header is not necessarily a source file, nor are the < and > delimited sequences in header names necessarily valid source file names.
  4. Stallman, Richard M. (July 1992). "The C Preprocessor" (PDF). Archived from the original (PDF) on 4 September 2012. Retrieved 19 February 2014.
  5. Pike, Rob (21 Feb 1989), Notes on programming in C, Cat-v document archive, retrieved 9 Dec 2011
  6. "Merging Modules - P1103R3" (PDF).
  7. "P1502R1 - Standard library header units for C++20".
  8. "COBOL Initial Specifications for a COmmon Business Oriented Language" (PDF). Department of Defense. April 1960. p. IX-9. Archived from the original (PDF) on 12 February 2014. Retrieved 11 February 2014.
  9. "The COPY Statement". CODASYL COBOL Journal of Development 1968. July 1969. LCCN   73601243.
  10. "include". php.net. The PHP Group. Retrieved 20 February 2014.
  11. "require". php.net. The PHP Group. Retrieved 20 February 2014.