Slab allocation


Slab allocation is a memory management mechanism intended for the efficient memory allocation of objects. Compared with earlier mechanisms, it reduces fragmentation caused by allocations and deallocations. The technique retains memory that held a data object of a certain type and reuses it for subsequent allocations of objects of the same type. It is analogous to an object pool, but applies only to memory, not to other resources.

History

Slab allocation was first introduced in the Solaris 2.4 kernel by Jeff Bonwick. [1] It is now widely used by many Unix and Unix-like operating systems including FreeBSD [2] and Linux. [3] However, the original SLAB implementation in Linux has been deprecated in favor of the newer SLUB allocator and is slated for removal from the kernel; the slab allocation technique itself remains in use. [4]

Basis

Slab allocation makes the costly (in CPU time) initialization and destruction of kernel data objects infrequent, work which can outweigh the cost of allocating memory for them. [1] When the kernel creates and deletes objects often, this initialization overhead can cause significant performance drops. Object caching means the functions that initialize object state are invoked less frequently: when a slab-allocated object is released after use, the slab allocation system typically keeps it cached (rather than doing the work of destroying it), ready for reuse the next time an object of that type is needed, thus avoiding the work of constructing and initializing a new object.
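
In the Linux kernel, for example, this behaviour is exposed through the kmem_cache interface: the constructor passed to kmem_cache_create() runs when objects are set up on a slab, not on every allocation. The sketch below is illustrative rather than real kernel code; struct foo and its fields are invented, and a caller is expected to hand objects back in their constructed state.

    #include <linux/errno.h>
    #include <linux/init.h>
    #include <linux/slab.h>
    #include <linux/spinlock.h>

    /* Illustrative object type; the fields are hypothetical. */
    struct foo {
        spinlock_t lock;
        int refcount;
    };

    static struct kmem_cache *foo_cache;

    /* Constructor: invoked when objects are added to a slab, not on every
     * kmem_cache_alloc() call, so the initialized state is reused. */
    static void foo_ctor(void *obj)
    {
        struct foo *f = obj;

        spin_lock_init(&f->lock);
        f->refcount = 0;
    }

    static int __init foo_cache_init(void)
    {
        foo_cache = kmem_cache_create("foo_cache", sizeof(struct foo),
                                      0, SLAB_HWCACHE_ALIGN, foo_ctor);
        return foo_cache ? 0 : -ENOMEM;
    }

    static void foo_example(void)
    {
        struct foo *f = kmem_cache_alloc(foo_cache, GFP_KERNEL);

        if (f) {
            /* ... use f ... */
            kmem_cache_free(foo_cache, f);  /* back to the cache, still constructed */
        }
    }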

With slab allocation, a cache for a certain type or size of data object has a number of pre-allocated "slabs" of memory; within each slab there are memory chunks of fixed size suitable for the objects. [5] The slab allocator keeps track of these chunks, so that when it receives a request to allocate memory for a data object of a certain type, it can usually satisfy the request with a free slot (chunk) from an existing slab. When the allocator is asked to free the object's memory, it simply adds the slot to the containing slab's list of free (unused) slots. The next call to create an object of the same type (or allocate memory of the same size) returns that memory slot (or some other free slot) and removes it from the list of free slots. This process eliminates the need to search for suitable memory space and greatly alleviates memory fragmentation. In this context, a slab is one or more contiguous pages in memory containing pre-allocated memory chunks.
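
A minimal userspace sketch of this bookkeeping follows. The names (toy_cache, SLAB_BYTES and so on) are invented for illustration, only a single slab is shown, and free slots are tracked with an explicit stack of slot indices; a real allocator manages many slabs per cache and grows on demand.

    #include <stddef.h>
    #include <stdlib.h>

    #define SLAB_BYTES 4096          /* one page-sized slab (assumed) */

    struct toy_cache {
        size_t  obj_size;            /* fixed object size for this cache   */
        size_t  nr_slots;            /* slots carved out of the slab       */
        char   *slab;                /* contiguous memory backing the slab */
        size_t *free_stack;          /* indices of free slots              */
        size_t  nr_free;             /* entries currently on free_stack    */
    };

    static int toy_cache_init(struct toy_cache *c, size_t obj_size)
    {
        c->obj_size = obj_size;
        c->nr_slots = SLAB_BYTES / obj_size;
        c->slab = malloc(SLAB_BYTES);
        c->free_stack = malloc(c->nr_slots * sizeof(size_t));
        if (!c->slab || !c->free_stack)
            return -1;               /* error cleanup elided in this sketch */
        c->nr_free = c->nr_slots;
        for (size_t i = 0; i < c->nr_slots; i++)
            c->free_stack[i] = i;    /* every slot starts out free */
        return 0;
    }

    static void *toy_cache_alloc(struct toy_cache *c)
    {
        if (c->nr_free == 0)
            return NULL;             /* a real allocator would add a new slab */
        size_t slot = c->free_stack[--c->nr_free];
        return c->slab + slot * c->obj_size;
    }

    static void toy_cache_free(struct toy_cache *c, void *obj)
    {
        size_t slot = ((char *)obj - c->slab) / c->obj_size;
        c->free_stack[c->nr_free++] = slot;   /* slot is immediately reusable */
    }

Allocating an object is then just popping a slot index, and freeing is pushing it back, so neither operation has to search for space.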

Implementation

Understanding the slab allocation algorithm requires defining and explaining some terms:

  1. Cache: a cache is a store of objects of a single, specific type, such as semaphores, process descriptors, or file objects; it provides fast access to ready-to-use objects of that type.
  2. Slab: a slab is a contiguous piece of memory, usually made of several virtually contiguous pages. The slab is the actual container of the data for objects of the kind held by its cache.

When a program sets up a cache, it allocates a number of objects to the slabs associated with that cache. This number depends on the size of the associated slabs.
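
For example, a 4 KiB slab holding 256-byte objects has room for sixteen of them. A small, hedged sketch of that calculation (the helper name is invented, and real allocators also reserve space for per-slab metadata):

    #include <stddef.h>

    /* Illustrative only: align must be a power of two. */
    static size_t objs_per_slab(size_t slab_size, size_t obj_size, size_t align)
    {
        size_t stride = (obj_size + align - 1) & ~(align - 1);  /* round up  */
        return slab_size / stride;          /* e.g. 4096 / 256 = 16 objects */
    }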

Slabs may exist in one of the following states:

  1. empty – all objects on a slab marked as free
  2. partial – slab consists of both used and free objects
  3. full – all objects on a slab marked as used

Initially, the system marks each slab as "empty". When the process calls for a new kernel object, the system tries to find a free location for that object on a partial slab in a cache for that type of object. If no such location exists, the system allocates a new slab from contiguous virtual pages and assigns it to a cache. The new object is allocated from this slab, and the slab is then marked as "partial".
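
The following sketch mirrors that path with invented names and simplified list handling: allocation prefers a partial slab, falls back to an empty one, and only then grows the cache with a new slab, refiling the slab under its new state afterwards.

    #include <stddef.h>
    #include <stdlib.h>

    #define OBJS_PER_SLAB 16            /* assumed capacity of one slab */

    enum slab_state { SLAB_EMPTY, SLAB_PARTIAL, SLAB_FULL };

    struct slab {
        enum slab_state state;
        size_t nr_free;                 /* free slots left on this slab */
        struct slab *next;              /* next slab in its state list  */
    };

    struct cache {
        struct slab *empty, *partial, *full;
    };

    static struct slab *pop(struct slab **list)
    {
        struct slab *s = *list;
        if (s)
            *list = s->next;
        return s;
    }

    static void push(struct slab **list, struct slab *s)
    {
        s->next = *list;
        *list = s;
    }

    /* In a real allocator this would map fresh contiguous pages and carve
     * them into slots; here only the bookkeeping record is created. */
    static struct slab *grow_cache(void)
    {
        struct slab *s = malloc(sizeof(*s));
        if (s) {
            s->state = SLAB_EMPTY;
            s->nr_free = OBJS_PER_SLAB;
            s->next = NULL;
        }
        return s;
    }

    /* Pick a slab to satisfy an allocation, mirroring the text above. */
    static struct slab *slab_for_alloc(struct cache *c)
    {
        struct slab *s = pop(&c->partial);   /* 1. prefer a partial slab        */
        if (!s)
            s = pop(&c->empty);              /* 2. otherwise reuse an empty one */
        if (!s)
            s = grow_cache();                /* 3. else allocate a new slab     */
        return s;
    }

    /* After one slot has been handed out, refile the slab by its new state. */
    static void refile_after_alloc(struct cache *c, struct slab *s)
    {
        s->nr_free--;
        s->state = s->nr_free ? SLAB_PARTIAL : SLAB_FULL;
        push(s->nr_free ? &c->partial : &c->full, s);
    }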

The allocation takes place quickly, because the system builds the objects in advance and readily allocates them from a slab.

Implementation techniques

Free lists

A slab represents one memory allocation made to the cache from the machine; its size is customarily a multiple of the page size. The slab is divided into a number of entries, which are then handed out by the cache as the client code requests memory for new objects. It is therefore necessary to keep track of which parts of the slab are free and which are already occupied. This is generally done using "free lists": lists of free entries in the slab, ready to store new objects.

The free list may be a separate data structure, such as an array of indices indicating which entries of the slab are free, or it may be embedded within the slab. The Linux SLUB allocator threads the free list through the free objects themselves: each free object holds a pointer to the next free one. [6]
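
A rough illustration of the embedded approach follows (a simplification of what SLUB does; the names are invented). The first bytes of each free slot hold a pointer to the next free slot, so the free list needs no storage beyond the slab itself; the slot size is assumed to be at least sizeof(void *) and pointer-aligned.

    #include <stddef.h>

    struct embedded_slab {
        void *free_head;            /* first free slot, or NULL when full */
    };

    static void embedded_init(struct embedded_slab *s, char *mem,
                              size_t obj_size, size_t nr_slots)
    {
        s->free_head = NULL;
        for (size_t i = 0; i < nr_slots; i++) {
            void *slot = mem + i * obj_size;
            *(void **)slot = s->free_head;   /* link slot into the free list */
            s->free_head = slot;
        }
    }

    static void *embedded_alloc(struct embedded_slab *s)
    {
        void *slot = s->free_head;
        if (slot)
            s->free_head = *(void **)slot;   /* unlink the slot */
        return slot;
    }

    static void embedded_free(struct embedded_slab *s, void *slot)
    {
        *(void **)slot = s->free_head;       /* push the slot back */
        s->free_head = slot;
    }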

Slab sizes

Operating systems may use different slab sizes and internal layouts depending on the size of the objects to be stored. Large slabs are given a different layout from small slabs because it lets them pack better into page-size units, which helps with fragmentation. For example, objects that are at least 1/8 of the page size on a given machine may use a "large slab" layout with explicit free lists, while smaller objects may use a "small slab" layout with the free-list tracking embedded in the slab itself. Bonwick's original presentation of the slab allocator already made this distinction between large-slab and small-slab layouts. [1]
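
A hedged sketch of that size threshold, loosely following the 1/8-page rule mentioned above (the page size, constant and helper name are illustrative):

    #include <stdbool.h>
    #include <stddef.h>

    #define PAGE_SIZE 4096u             /* assumed page size for the example */

    /* Illustrative policy: small objects keep free-list bookkeeping inside
     * the slab; objects of at least 1/8 page use an explicit, off-slab free
     * list so the slab itself packs cleanly into whole pages. */
    static bool use_explicit_freelist(size_t obj_size)
    {
        return obj_size >= PAGE_SIZE / 8;   /* e.g. >= 512 bytes on 4 KiB pages */
    }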

Systems using slab allocation

Slab allocation, or a variant of it, is used by a number of systems, including Solaris [1] [11], FreeBSD [2], Linux [3], GNU Mach [7], the kernel of the Nintendo Switch [8], HP-UX 11i [9], and Perl 5 [10].

Notes

  1. Jeff Bonwick, The Slab Allocator: An Object-Caching Kernel Memory Allocator (1994)
  2. FreeBSD Kernel Developer's Manual
  3. M. Tim Jones, Anatomy of the Linux slab allocator Archived 2 October 2013 at the Wayback Machine
  4. Vlastimil Babka, remove the SLAB allocator
  5. Abraham Silberschatz et al.: Operating system concepts. Wiley: 2004. ISBN 0-471-69466-5
  6. Lameter, Christoph. "Slab allocators in the Linux Kernel: SLAB, SLOB, SLUB" (PDF). LinuxCon/Düsseldorf 2014 (Revision Oct 3, 2014).
  7. "Gnu Mach Allocator Documentation".
  8. "Console Security – Switch (34c3)". media.ccc.de. Retrieved 28 December 2017.
  9. Chris Cooper and Chris Moore, HP-UX 11i Internals, Upper Saddle River, New Jersey: Prentice Hall PTR, 2004, ISBN 0-13-032861-8, p. 334.
  10. "Perl5-Porters Weekly: 2012 June 17". Retrieved 18 November 2012.
  11. Bonwick, Jeff. "Magazines and Vmem: Extending the Slab Allocator to Many CPUs and Arbitrary Resources". USENIX 2001. Retrieved 18 November 2012.
