C dynamic memory allocation

Last updated

C dynamic memory allocation refers to performing manual memory management for dynamic memory allocation in the C programming language via a group of functions in the C standard library, namely malloc, realloc, calloc, aligned_alloc and free. [1] [2] [3]

Contents

The C++ programming language includes these functions; however, the operators new and delete provide similar functionality and are recommended by that language's authors. [4] Still, there are several situations in which using new/delete is not applicable, such as garbage collection code or performance-sensitive code, and a combination of malloc and placement new may be required instead of the higher-level new operator.

Many different implementations of the actual memory allocation mechanism, used by malloc, are available. Their performance varies in both execution time and required memory.

Rationale

The C programming language manages memory statically, automatically, or dynamically. Static-duration variables are allocated in main memory, usually along with the executable code of the program, and persist for the lifetime of the program; automatic-duration variables are allocated on the stack and come and go as functions are called and return. For static-duration and automatic-duration variables, the size of the allocation must be compile-time constant (except for the case of variable-length automatic arrays [5] ). If the required size is not known until run-time (for example, if data of arbitrary size is being read from the user or from a disk file), then using fixed-size data objects is inadequate.

The lifetime of allocated memory can also cause concern. Neither static- nor automatic-duration memory is adequate for all situations. Automatic-allocated data cannot persist across multiple function calls, while static data persists for the life of the program whether it is needed or not. In many situations the programmer requires greater flexibility in managing the lifetime of allocated memory.

These limitations are avoided by using dynamic memory allocation, in which memory is more explicitly (but more flexibly) managed, typically by allocating it from the free store (informally called the "heap"),[ citation needed ] an area of memory structured for this purpose. In C, the library function malloc is used to allocate a block of memory on the heap. The program accesses this block of memory via a pointer that malloc returns. When the memory is no longer needed, the pointer is passed to free which deallocates the memory so that it can be used for other purposes.

The original description of C indicated that calloc and cfree were in the standard library, but not malloc. Code for a simple model implementation of a storage manager for Unix was given with alloc and free as the user interface functions, and using the sbrk system call to request memory from the operating system. [6] The 6th Edition Unix documentation gives alloc and free as the low-level memory allocation functions. [7] The malloc and free routines in their modern form are completely described in the 7th Edition Unix manual. [8] [9]

Some platforms provide library or intrinsic function calls which allow run-time dynamic allocation from the C stack rather than the heap (e.g. alloca() [10] ). This memory is automatically freed when the calling function ends.

Overview of functions

The C dynamic memory allocation functions are defined in stdlib.h header (cstdlib header in C++). [1]

FunctionDescription
mallocallocates the specified number of bytes
aligned_allocallocates the specified number of bytes at the specified alignment
reallocincreases or decreases the size of the specified block of memory, moving it if necessary
callocallocates the specified number of bytes and initializes them to zero
freereleases the specified block of memory back to the system

Differences between malloc() and calloc()

Usage example

Creating an array of ten integers with automatic scope is straightforward in C:

intarray[10];

However, the size of the array is fixed at compile time. If one wishes to allocate a similar array dynamically without using a variable-length array, which is not guaranteed to be supported in all C11 implementations, the following code can be used:

int*array=malloc(10*sizeof(int));

This computes the number of bytes that ten integers occupy in memory, then requests that many bytes from malloc and assigns the result to a pointer named array (due to C syntax, pointers and arrays can be used interchangeably in some situations).

Because malloc might not be able to service the request, it might return a null pointer and it is good programming practice to check for this:

int*array=malloc(10*sizeof(int));if(array==NULL){fprintf(stderr,"malloc failed\n");return-1;}

When the program no longer needs the dynamic array, it must eventually call free to return the memory it occupies to the free store:

free(array);

The memory set aside by malloc is not initialized and may contain cruft: the remnants of previously used and discarded data. After allocation with malloc, elements of the array are uninitialized variables. The command calloc will return an allocation that has already been cleared:

int*array=calloc(10,sizeof(int));

With realloc we can resize the amount of memory a pointer points to. For example, if we have a pointer acting as an array of size and we want to change it to an array of size , we can use realloc.

int*arr=malloc(2*sizeof(int));arr[0]=1;arr[1]=2;arr=realloc(arr,3*sizeof(int));arr[2]=3;

Note that realloc must be assumed to have changed the base address of the block (i.e. if it has failed to extend the size of the original block, and has therefore allocated a new larger block elsewhere and copied the old contents into it). Therefore, any pointers to addresses within the original block are also no longer valid.

Type safety

malloc returns a void pointer (void *), which indicates that it is a pointer to a region of unknown data type. The use of casting is required in C++ due to the strong type system, whereas this is not the case in C. One may "cast" (see type conversion) this pointer to a specific type:

int*ptr,*ptr2;ptr=malloc(10*sizeof(*ptr));/* without a cast */ptr2=(int*)malloc(10*sizeof(*ptr));/* with a cast */

There are advantages and disadvantages to performing such a cast.

Advantages to casting

Disadvantages to casting

Common errors

The improper use of dynamic memory allocation can frequently be a source of bugs. These can include security bugs or program crashes, most often due to segmentation faults.

Most common errors are as follows: [15]

Not checking for allocation failures
Memory allocation is not guaranteed to succeed, and may instead return a null pointer. Using the returned value, without checking if the allocation is successful, invokes undefined behavior. This usually leads to crash (due to the resulting segmentation fault on the null pointer dereference), but there is no guarantee that a crash will happen so relying on that can also lead to problems.
Memory leaks
Failure to deallocate memory using free leads to buildup of non-reusable memory, which is no longer used by the program. This wastes memory resources and can lead to allocation failures when these resources are exhausted.
Logical errors
All allocations must follow the same pattern: allocation using malloc, usage to store data, deallocation using free. Failures to adhere to this pattern, such as memory usage after a call to free (dangling pointer) or before a call to malloc (wild pointer), calling free twice ("double free"), etc., usually causes a segmentation fault and results in a crash of the program. These errors can be transient and hard to debug – for example, freed memory is usually not immediately reclaimed by the OS, and thus dangling pointers may persist for a while and appear to work.

In addition, as an interface that precedes ANSI C standardization, malloc and its associated functions have behaviors that were intentionally left to the implementation to define for themselves. One of them is the zero-length allocation, which is more of a problem with realloc since it is more common to resize to zero. [16] Although both POSIX and the Single Unix Specification require proper handling of 0-size allocations by either returning NULL or something else that can be safely freed, [17] not all platforms are required to abide by these rules. Among the many double-free errors that it has led to, the 2019 WhatsApp RCE was especially prominent. [18] A way to wrap these functions to make them safer is by simply checking for 0-size allocations and turning them into those of size 1. (Returning NULL has its own problems: it otherwise indicates an out-of-memory failure. In the case of realloc it would have signaled that the original memory was not moved and freed, which again is not the case for size 0, leading to the double-free.) [19]

Implementations

The implementation of memory management depends greatly upon operating system and architecture. Some operating systems supply an allocator for malloc, while others supply functions to control certain regions of data. The same dynamic memory allocator is often used to implement both malloc and the operator new in C++. [20]

Heap-based

Implementation of legacy allocators was commonly done using the heap segment. The allocator would usually expand and contract the heap to fulfill allocation requests.

The heap method suffers from a few inherent flaws:

dlmalloc and ptmalloc

Doug Lea has developed the public domain dlmalloc ("Doug Lea's Malloc") as a general-purpose allocator, starting in 1987. The GNU C library (glibc) is derived from Wolfram Gloger's ptmalloc ("pthreads malloc"), a fork of dlmalloc with threading-related improvements. [21] [22] [23] As of November 2023, the latest version of dlmalloc is version 2.8.6 from August 2012. [24]

dlmalloc is a boundary tag allocator. Memory on the heap is allocated as "chunks", an 8-byte aligned data structure which contains a header, and usable memory. Allocated memory contains an 8- or 16-byte overhead for the size of the chunk and usage flags (similar to a dope vector). Unallocated chunks also store pointers to other free chunks in the usable space area, making the minimum chunk size 16 bytes on 32-bit systems and 24/32 (depends on alignment) bytes on 64-bit systems. [22] [24] :2.8.6, Minimum allocated size

Unallocated memory is grouped into "bins" of similar sizes, implemented by using a double-linked list of chunks (with pointers stored in the unallocated space inside the chunk). Bins are sorted by size into three classes: [22] [24] :Overlaid data structures

Game developer Adrian Stone argues that dlmalloc, as a boundary-tag allocator, is unfriendly for console systems that have virtual memory but do not have demand paging. This is because its pool-shrinking and growing callbacks (sysmalloc/systrim) cannot be used to allocate and commit individual pages of virtual memory. In the absence of demand paging, fragmentation becomes a greater concern. [27]

FreeBSD's and NetBSD's jemalloc

Since FreeBSD 7.0 and NetBSD 5.0, the old malloc implementation (phkmalloc by Poul-Henning Kamp) was replaced by jemalloc, written by Jason Evans. The main reason for this was a lack of scalability of phkmalloc in terms of multithreading. In order to avoid lock contention, jemalloc uses separate "arenas" for each CPU. Experiments measuring number of allocations per second in multithreading application have shown that this makes it scale linearly with the number of threads, while for both phkmalloc and dlmalloc performance was inversely proportional to the number of threads. [28]

OpenBSD's malloc

OpenBSD's implementation of the malloc function makes use of mmap. For requests greater in size than one page, the entire allocation is retrieved using mmap; smaller sizes are assigned from memory pools maintained by malloc within a number of "bucket pages", also allocated with mmap. [29] [ better source needed ] On a call to free, memory is released and unmapped from the process address space using munmap. This system is designed to improve security by taking advantage of the address space layout randomization and gap page features implemented as part of OpenBSD's mmap system call, and to detect use-after-free bugs—as a large memory allocation is completely unmapped after it is freed, further use causes a segmentation fault and termination of the program.

The GrapheneOS project initially started out by porting OpenBSD's memory allocator to Android's Bionic C Library. [30]

Hoard malloc

Hoard is an allocator whose goal is scalable memory allocation performance. Like OpenBSD's allocator, Hoard uses mmap exclusively, but manages memory in chunks of 64 kilobytes called superblocks. Hoard's heap is logically divided into a single global heap and a number of per-processor heaps. In addition, there is a thread-local cache that can hold a limited number of superblocks. By allocating only from superblocks on the local per-thread or per-processor heap, and moving mostly-empty superblocks to the global heap so they can be reused by other processors, Hoard keeps fragmentation low while achieving near linear scalability with the number of threads. [31]

mimalloc

An open-source compact general-purpose memory allocator from Microsoft Research with focus on performance. [32] The library is about 11,000 lines of code.

Thread-caching malloc (tcmalloc)

Every thread has a thread-local storage for small allocations. For large allocations mmap or sbrk can be used. TCMalloc, a malloc developed by Google, [33] has garbage-collection for local storage of dead threads. The TCMalloc is considered to be more than twice as fast as glibc's ptmalloc for multithreaded programs. [34] [35]

In-kernel

Operating system kernels need to allocate memory just as application programs do. The implementation of malloc within a kernel often differs significantly from the implementations used by C libraries, however. For example, memory buffers might need to conform to special restrictions imposed by DMA, or the memory allocation function might be called from interrupt context. [36] This necessitates a malloc implementation tightly integrated with the virtual memory subsystem of the operating system kernel.

Overriding malloc

Because malloc and its relatives can have a strong impact on the performance of a program, it is not uncommon to override the functions for a specific application by custom implementations that are optimized for application's allocation patterns. The C standard provides no way of doing this, but operating systems have found various ways to do this by exploiting dynamic linking. One way is to simply link in a different library to override the symbols. Another, employed by Unix System V.3, is to make malloc and free function pointers that an application can reset to custom functions. [37]

The most common form on POSIX-like systems is to set the environment variable LD_PRELOAD with the path of the allocator, so that the dynamic linker uses that version of malloc/calloc/free instead of the libc implementation.

Allocation size limits

The largest possible memory block malloc can allocate depends on the host system, particularly the size of physical memory and the operating system implementation.

Theoretically, the largest number should be the maximum value that can be held in a size_t type, which is an implementation-dependent unsigned integer representing the size of an area of memory. In the C99 standard and later, it is available as the SIZE_MAX constant from < stdint.h >. Although not guaranteed by ISO C, it is usually 2^(CHAR_BIT * sizeof(size_t)) - 1.

On glibc systems, the largest possible memory block malloc can allocate is only half this size, namely 2^(CHAR_BIT * sizeof(ptrdiff_t) - 1) - 1. [38]

Extensions and alternatives

The C library implementations shipping with various operating systems and compilers may come with alternatives and extensions to the standard malloc interface. Notable among these is:

See also

Related Research Articles

<span class="mw-page-title-main">Buffer overflow</span> Anomaly in computer security and programming

In programming and information security, a buffer overflow or buffer overrun is an anomaly whereby a program writes data to a buffer beyond the buffer's allocated memory, overwriting adjacent memory locations.

C is a general-purpose computer programming language. It was created in the 1970s by Dennis Ritchie, and remains very widely used and influential. By design, C's features cleanly reflect the capabilities of the targeted CPUs. It has found lasting use in operating systems, device drivers, and protocol stacks, but its use in application software has been decreasing. C is commonly used on computer architectures that range from the largest supercomputers to the smallest microcontrollers and embedded systems.

<span class="mw-page-title-main">Memory management</span> Computer memory management methodology

Memory management is a form of resource management applied to computer memory. The essential requirement of memory management is to provide ways to dynamically allocate portions of memory to programs at their request, and free it for reuse when no longer needed. This is critical to any advanced computer system where more than a single process might be underway at any time.

Berkeley sockets is an application programming interface (API) for Internet sockets and Unix domain sockets, used for inter-process communication (IPC). It is commonly implemented as a library of linkable modules. It originated with the 4.2BSD Unix operating system, which was released in 1983.

A heap overflow, heap overrun, or heap smashing is a type of buffer overflow that occurs in the heap data area. Heap overflows are exploitable in a different manner to that of stack-based overflows. Memory on the heap is dynamically allocated at runtime and typically contains program data. Exploitation is performed by corrupting this data in specific ways to cause the application to overwrite internal structures such as linked list pointers. The canonical heap overflow technique overwrites dynamic memory allocation linkage and uses the resulting pointer exchange to overwrite a program function pointer.

<span class="mw-page-title-main">C syntax</span> Set of rules defining correctly structured programs

The syntax of the C programming language is the set of rules governing writing of software in C. It is designed to allow for programs that are extremely terse, have a close relationship with the resulting object code, and yet provide relatively high-level data abstraction. C was the first widely successful high-level language for portable operating-system development.

<span class="mw-page-title-main">Pointer (computer programming)</span> Object which stores memory addresses in a computer program

In computer science, a pointer is an object in many programming languages that stores a memory address. This can be that of another value located in computer memory, or in some cases, that of memory-mapped computer hardware. A pointer references a location in memory, and obtaining the value stored at that location is known as dereferencing the pointer. As an analogy, a page number in a book's index could be considered a pointer to the corresponding page; dereferencing such a pointer would be done by flipping to the page with the given page number and reading the text found on that page. The actual format and content of a pointer variable is dependent on the underlying computer architecture.

In computing, mmap(2) is a POSIX-compliant Unix system call that maps files or devices into memory. It is a method of memory-mapped file I/O. It implements demand paging because file contents are not immediately read from disk and initially use no physical RAM at all. The actual reads from disk are performed after a specific location is accessed, in a lazy manner. After the mapping is no longer needed, the pointers must be unmapped with munmap(2). Protection information—for example, marking mapped regions as executable—can be managed using mprotect(2), and special treatment can be enforced using madvise(2).

Address space layout randomization (ASLR) is a computer security technique involved in preventing exploitation of memory corruption vulnerabilities. In order to prevent an attacker from reliably redirecting code execution to, for example, a particular exploited function in memory, ASLR randomly arranges the address space positions of key data areas of a process, including the base of the executable and the positions of the stack, heap and libraries.

In computing, a data segment is a portion of an object file or the corresponding address space of a program that contains initialized static variables, that is, global variables and static local variables. The size of this segment is determined by the size of the values in the program's source code, and does not change at run time.

<span class="mw-page-title-main">Stack-based memory allocation</span> Form of computer memory allocation

Stacks in computing architectures are regions of memory where data is added or removed in a last-in-first-out (LIFO) manner.

The Boehm–Demers–Weiser garbage collector, often simply known as the Boehm GC or Boehm collector, is a conservative garbage collector for C and C++ developed by Hans Boehm, Alan Demers, and Mark Weiser.

In the C++ programming language, new and delete are a pair of language constructs that perform dynamic memory allocation, object construction and object destruction.

In computer science, manual memory management refers to the usage of manual instructions by the programmer to identify and deallocate unused objects, or garbage. Up until the mid-1990s, the majority of programming languages used in industry supported manual memory management, though garbage collection has existed since 1959, when it was introduced with Lisp. Today, however, languages with garbage collection such as Java are increasingly popular and the languages Objective-C and Swift provide similar functionality through Automatic Reference Counting. The main manually managed languages still in widespread use today are C and C++ – see C dynamic memory allocation.

sizeof is a unary operator in the programming languages C and C++. It generates the storage size of an expression or a data type, measured in the number of char-sized units. Consequently, the construct sizeof (char) is guaranteed to be 1. The actual number of bits of type char is specified by the preprocessor macro CHAR_BIT, defined in the standard include file limits.h. On most modern computing platforms this is eight bits. The result of sizeof has an unsigned integer type that is usually denoted by size_t.

mtrace is the memory debugger included in the GNU C Library.

In computer programming, a variable-length array (VLA), also called variable-sized or runtime-sized, is an array data structure which length is determined at runtime, instead of at compile time. In the language C, the VLA is said to have a variably modified data type that depends on a value.

brk and sbrk are basic memory management system calls used in Unix and Unix-like operating systems to control the amount of memory allocated to the data segment of the process. These functions are typically called from a higher-level memory management library function such as malloc. In the original Unix system, brk and sbrk were the only ways in which applications could acquire additional data space; later versions allowed this to also be done using the mmap call.

In C++ computer programming, allocators are a component of the C++ Standard Library. The standard library provides several data structures, such as list and set, commonly referred to as containers. A common trait among these containers is their ability to change size during the execution of the program. To achieve this, some form of dynamic memory allocation is usually required. Allocators handle all the requests for allocation and deallocation of memory for a given container. The C++ Standard Library provides general-purpose allocators that are used by default, however, custom allocators may also be supplied by the programmer.

qsort is a C standard library function that implements a sorting algorithm for arrays of arbitrary objects according to a user-provided comparison function. It is named after the "quicker sort" algorithm, which was originally used to implement it in the Unix C library, although the C standard does not require it to implement quicksort.

References

  1. 1 2 7.20.3 Memory management functions (PDF). ISO/IEC 9899:1999 specification (Technical report). p. 313.
  2. Summit, Steve. "Chapter 11: Memory Allocation". C Programming Notes. Retrieved 2020-07-11.
  3. "aligned_alloc(3) - Linux man page".
  4. Stroustrup, Bjarne (2008). Programming: Principles and Practice Using C++. Addison Wesley. p. 1009. ISBN   978-0-321-54372-1.
  5. "gcc manual". gnu.org. Retrieved 2008-12-14.
  6. Brian W. Kernighan, Dennis M. Ritchie, The C Programming Language, Prentice-Hall, 1978; Section 7.9 (page 156) describes calloc and cfree, and Section 8.7 (page 173) describes an implementation for alloc and free.
  7. alloc(3)    Version 6 Unix Programmer's Manual
  8. malloc(3)    Version 7 Unix Programmer's Manual
  9. Anonymous, Unix Programmer's Manual, Vol. 1, Holt Rinehart and Winston, 1983 (copyright held by Bell Telephone Laboratories, 1983, 1979); The man page for malloc etc. is given on page 275.
  10. alloca(3)    FreeBSD Library Functions Manual
  11. calloc(3)    Linux Programmer's Manual – Library Functions
  12. 1 2 "Casting malloc". Cprogramming.com. Retrieved 2007-03-09.
  13. "clang: lib/StaticAnalyzer/Checkers/MallocSizeofChecker.cpp Source File". clang.llvm.org. Retrieved 2018-04-01.
  14. "comp.lang.c FAQ list · Question 7.7b". C-FAQ. Retrieved 2007-03-09.
  15. Reek, Kenneth (1997-08-04). Pointers on C (1 ed.). Pearson. ISBN   9780673999863.
  16. "MEM04-C. Beware of zero-length allocations - SEI CERT C Coding Standard - Confluence". wiki.sei.cmu.edu.
  17. "POSIX.1-2017: malloc". pubs.opengroup.org. Retrieved 2019-11-29.
  18. Awakened (2019-10-02). "How a double-free bug in WhatsApp turns to RCE" . Retrieved 2019-11-29.
  19. Felker, Rich [@RichFelker] (2019-10-03). "Wow. The WhatsApp RCE was the wrong behavior for realloc(p,0) so many implementations insist on" (Tweet). Retrieved 2022-08-06 via Twitter.
  20. Alexandrescu, Andrei (2001). Modern C++ Design: Generic Programming and Design Patterns Applied. Addison-Wesley. p. 78.
  21. "Wolfram Gloger's malloc homepage". malloc.de. Retrieved 2018-04-01.
  22. 1 2 3 Kaempf, Michel (2001). "Vudo malloc tricks". Phrack (57): 8. Archived from the original on 2009-01-22. Retrieved 2009-04-29.
  23. "Glibc: Malloc Internals". sourceware.org Trac. Retrieved 2019-12-01.
  24. 1 2 3 Lee, Doug. "A Memory Allocator" . Retrieved 2019-12-01. HTTP for Source Code
  25. "Malloc Tunable Parameters". GNU . Retrieved 2009-05-02.
  26. Sanderson, Bruce (2004-12-12). "RAM, Virtual Memory, Pagefile and all that stuff". Microsoft Help and Support.
  27. Stone, Adrian. "The Hole That dlmalloc Can't Fill". Game Angst. Retrieved 2019-12-01.
  28. Evans, Jason (2006-04-16). "A Scalable Concurrent malloc(3) Implementation for FreeBSD" (PDF). Retrieved 2012-03-18.
  29. "libc/stdlib/malloc.c". BSD Cross Reference, OpenBSD src/lib/.
  30. "History | GrapheneOS". grapheneos.org. Retrieved 2023-03-02.
  31. Berger, E. D.; McKinley, K. S.; Blumofe, R. D.; Wilson, P. R. (November 2000). Hoard: A Scalable Memory Allocator for Multithreaded Applications (PDF). ASPLOS-IX. Proceedings of the ninth international conference on Architectural support for programming languages and operating systems. pp. 117–128. CiteSeerX   10.1.1.1.4174 . doi:10.1145/378993.379232. ISBN   1-58113-317-0.
  32. Microsoft releases optimized malloc() as open source - Slashdot
  33. TCMalloc homepage
  34. Ghemawat, Sanjay; Menage, Paul; TCMalloc : Thread-Caching Malloc
  35. Callaghan, Mark (2009-01-18). "High Availability MySQL: Double sysbench throughput with TCMalloc". Mysqlha.blogspot.com. Retrieved 2011-09-18.
  36. "kmalloc()/kfree() include/linux/slab.h". People.netfilter.org. Retrieved 2011-09-18.
  37. Levine, John R. (2000) [October 1999]. "Chapter 9: Shared libraries". Linkers and Loaders. The Morgan Kaufmann Series in Software Engineering and Programming (1 ed.). San Francisco, USA: Morgan Kaufmann. ISBN   1-55860-496-0. OCLC   42413382. Archived from the original on 2012-12-05. Retrieved 2020-01-12. Code: Errata:
  38. "malloc: make malloc fail with requests larger than PTRDIFF_MAX". Sourceware Bugzilla. 2019-04-18. Retrieved 2020-07-30.
  39. "Why is the use of alloca() not considered good practice?". stackoverflow.com. Retrieved 2016-01-05.
  40. Amarasinghe, Saman; Leiserson, Charles (2010). "6.172 Performance Engineering of Software Systems, Lecture 10". MIT OpenCourseWare. Massachusetts Institute of Technology. Archived from the original on 2015-06-22. Retrieved 2015-01-27.
  41. "alloca(3) - Linux manual page". man7.org. Retrieved 2016-01-05.
  42. posix_memalign   System Interfaces Reference, The Single UNIX Specification , Version 4 from The Open Group