Sizeof

Last updated January 30, 2025

sizeof is a unary operator in the C and C++ programming languages that evaluates to the storage size of an expression or a data type, measured in units the size of char. Consequently, the expression sizeof(char) evaluates to 1. The number of bits in type char is specified by the preprocessor macro CHAR_BIT, defined in the standard include file limits.h. On most modern computing platforms this is eight bits. The result of sizeof is a value of the unsigned integer type size_t.

Purpose

Many programs must know the storage size of a particular datatype. Though for any given implementation of C or C++ the size of a particular datatype is constant, the sizes of even primitive types in C and C++ may be defined differently for different platforms of implementation. For example, runtime allocation of array space may use the following code, in which the sizeof operator is applied to the type name int:

    int *pointer = malloc(10 * sizeof(int));

In this example, function malloc allocates memory and returns a pointer to the memory block. The size of the block allocated is equal to the number of bytes for a single object of type int multiplied by 10, providing space for ten integers.

It is generally not safe to assume the size of any datatype. For example, even though most implementations of C and C++ on 32-bit systems define type int to be four octets, this size may change when code is ported to a different system, breaking the code. The exception to this is the data type char, which always has the size 1 in any standards-compliant C implementation. In addition, it is frequently difficult to predict the sizes of compound datatypes such as a struct or union, due to padding. The use of sizeof enhances readability, since it avoids unnamed numeric constants (magic numbers).

An equivalent syntax for allocating the same array space uses the dereferenced pointer as the operand, this time applying the operator to an expression rather than to a type name:

    int *pointer = malloc(10 * sizeof *pointer);

Use

The operator sizeof produces the required memory storage space of its operand when the code is compiled. The operand is written following the keyword sizeof and may be the symbol of a storage space, e.g., a variable, an expression, or a type name. Parentheses for the operand are optional, except when specifying a type name. The result of the operator is the size of the operand in bytes, or the size of the memory storage requirement. For expressions, it evaluates to the representation size for the type that would result from evaluation of the expression, which is not performed.

For example, since sizeof (char) is defined to be 1^[1] and assuming the integer type is four bytes long, the following code fragment prints 1,4:

    char c;
    printf("%zu,%zu\n", sizeof c, sizeof(int));

Certain standard header files, such as stddef.h, define size_t to denote the unsigned integral type of the result of a sizeof expression. The printf length modifier z is intended to format that type.

sizeof cannot be used in C preprocessor expressions, such as #if, because it is an element of the programming language, not of the preprocessor syntax, which has no data types.

The following example in C++ uses the operator sizeof with variadic templates.

    #include <cstddef>

    template <typename... Args>
    std::size_t GetSize(Args&&... args)
    {
        /* Get size of parameter pack. */
        std::size_t Count = sizeof...(Args);
        return Count;
    }

sizeof can be used with variadic templates in C++11 and above on a parameter pack to determine the number of arguments.

Application to arrays

When sizeof is applied to the name of an array, the result is the number of bytes required to store the entire array. This is one of the few exceptions to the rule that the name of an array is converted to a pointer to the first element of the array, and is possible just because the actual array size is fixed and known at compile time, when the sizeof operator is evaluated. The following program uses sizeof to determine the size of a declared array, avoiding a buffer overflow when copying characters:

    #include <stdio.h>

    int main(int argc, char **argv)
    {
        char buffer[10]; /* Array of 10 chars */

        /* Copy at most 9 characters from argv[1] into buffer,
         * null-terminate the buffer. */
        snprintf(buffer, sizeof buffer, "%s", argv[1]);

        return 0;
    }

Here, sizeof buffer is equivalent to 10 * sizeof buffer [0], which evaluates to 10, because the size of the type char is defined as 1.

C99 adds support for flexible array members to structures. This form of array declaration is allowed as the last element in structures only, and differs from normal arrays in that no length is specified to the compiler. For a structure named s containing a flexible array member named a, sizeof s is therefore typically equal to offsetof (s, a), although an implementation may add trailing padding:

    #include <stdio.h>

    struct flexarray {
        char val;
        int array[]; /* Flexible array member; must be last element of struct */
    };

    int main(int argc, char **argv)
    {
        printf("sizeof (struct flexarray) == %zu\n", sizeof(struct flexarray));
        return 0;
    }

In this case the sizeof operator returns the size of the structure, including any padding, but without any storage allowed for the array. Most platforms produce the following output:

sizeof (struct flexarray) == 4

C99 also allows variable length arrays that have the length specified at runtime,^[2] although the feature was made optional in later versions of the C standard. In such cases, the sizeof operator is evaluated in part at runtime to determine the storage occupied by the array.

    #include <stddef.h>

    size_t flexsize(int n)
    {
        char b[n + 3];   /* Variable length array */
        return sizeof b; /* Execution time sizeof */
    }

    int main(void)
    {
        size_t size = flexsize(10); /* flexsize returns 13 */
        return 0;
    }

sizeof can be used to determine the number of elements in an array, by dividing the size of the entire array by the size of a single element. This should be used with caution: when an array is passed to a function, it "decays" to a pointer, and inside that function sizeof returns the size of the pointer, not the total size of the array. As an example with a proper array:

    #include <stdio.h>

    int main(void)
    {
        int tab[10];
        printf("Number of elements in the array: %zu\n", sizeof tab / sizeof tab[0]); /* yields 10 */
        return 0;
    }

Incomplete types

sizeof can only be applied to "completely" defined types. With arrays, this means that the dimensions of the array must be present in its declaration, and that the type of the elements must be completely defined. For structs and unions, this means that there must be a member list of completely defined types. For example, consider the following two source files:

    /* file1.c */
    int arr[10];
    struct x { int one; int two; };
    /* more code */

    /* file2.c */
    extern int arr[];
    struct x;
    /* more code */

Both files are perfectly legal C, and code in file1.c can apply sizeof to arr and struct x. However, it is illegal for code in file2.c to do this, because the definitions in file2.c are not complete. In the case of arr, the code does not specify the dimension of the array; without this information, the compiler has no way of knowing how many elements are in the array, and cannot calculate the array's overall size. Likewise, the compiler cannot calculate the size of struct x because it does not know what members it is made up of, and therefore cannot calculate the sum of the sizes of the structure's members (and padding). If the programmer provided the size of the array in its declaration in file2.c, or completed the definition of struct x by supplying a member list, this would allow the application of sizeof to arr or struct x in that source file.

Object members

C++11 introduced the possibility to apply the sizeof operator to specific members of a class without the necessity to instantiate an object to achieve this.^[3] The following example, for instance, yields 4 and 8 on most platforms.

    #include <iostream>

    struct foo {
        int a;
        int b;
    };

    int main()
    {
        std::cout << sizeof foo::a << "\n" << sizeof(foo) << "\n";
    }

Variadic template packs

C++11 introduced variadic templates; the keyword sizeof followed by ellipsis returns the number of elements in a parameter pack.

    #include <iostream>

    template <typename... Args>
    void print_size(Args... args)
    {
        std::cout << sizeof...(args) << "\n";
    }

    int main()
    {
        print_size();                          // outputs 0
        print_size("Is the answer", 42, true); // outputs 3
    }

Implementation

When applied to a fixed-length datatype or variable, expressions with the operator sizeof are evaluated during program compilation; they are replaced by constant result-values. The C99 standard introduced variable-length arrays (VLAs), which required evaluation for such expressions during program execution. In many cases, the implementation specifics may be documented in an application binary interface (ABI) document for the platform, specifying formats, padding, and alignment for the data types, to which the compiler must conform.

Structure padding

When calculating the size of any object type, the compiler must take into account any required data structure alignment to meet efficiency or architectural constraints. Many computer architectures do not support multiple-byte access starting at any byte address that is not a multiple of the word size, and even when the architecture allows it, usually the processor can fetch a word-aligned object faster than it can fetch an object that straddles multiple words in memory.^[4] Therefore, compilers usually align data structures to at least a word boundary, and also align individual members to their respective boundaries. In the following example, the structure student is likely to be aligned on a word boundary, which is also where the member grade begins, and the member age is likely to start at the next word address. The compiler accomplishes the latter by inserting padding bytes between members as needed to satisfy the alignment requirements. There may also be padding at the end of a structure to ensure proper alignment in case the structure is used as an element of an array.

Thus, the aggregate size of a structure in C can be greater than the sum of the sizes of its individual members. For example, on many systems the following code prints 8:

    struct student {
        char grade; /* char is 1 byte long */
        int age;    /* int is 4 bytes long */
    };

    printf("%zu", sizeof(struct student));

References

[1] "C99 standard (ISO/IEC9899)" (PDF). ISO/IEC. 7 September 2007. 6.5.3.4.3, p. 80. Retrieved 31 October 2010.
[2] "WG14/N1124 Committee Draft ISO/IEC 9899" (PDF). 6 May 2005. 6.5.3.4 The sizeof operator.
[3] "N2253 Extending sizeof to apply to non-static data members without an object (Revision 1)".
[4] Rentzsch, Jonathan (8 February 2005). "Data alignment: Straighten up and fly right". IBM. Retrieved 29 September 2014.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
