Log rotation

Last updated

In information technology, log rotation is an automated process used in system administration in which log files are compressed, moved (archived), renamed or deleted once they are too old or too big (there can be other metrics that can apply here). New incoming log data is directed into a new fresh file (at the same location). [1]

The main purpose of log rotation is to restrict the volume of the log data to avoid overflowing the record store, while keeping the log files small enough so viewers can still open them.

Servers which run large applications, such as LAMP stacks, often log every request: in the face of bulky logs, log rotation provides a way to limit the total size of the logs retained while still allowing analysis of recent events.

Even though some arguments in favor of log rotation imply that maintaining smaller files increases writing performance, the size of a file doesn’t affect its writing performance. The reason is that in most modern filesystem implementations, the kernel knows the size of a file, and appending data can happen after performing a seek syscall to position the pointer at the end of the file which is a constant time operation.

Methods of archival

In Linux log rotation is typically performed using the logrotate command. [2] [3] The command can be used to email logs to a systems administrator after log rotation. Dated logs may also be compressed.

In FreeBSD and macOS the newsyslog command is used. [4] It has the ability to trigger rotation based on file size, time or interval (or any combination thereof). It can compress the archives and send a signal to a process to reset logging.

The command is often run as a cron job, which has the effect of fully automatic log rotation.

Typically, a new logfile is created periodically, and the old logfile is renamed by appending a "1" to the name. Each time a new log file is started, the numbers in the file names of old logfiles are increased by one, so the files "rotate" through the numbers (thus the name "log rotation"). Old logfiles whose number exceeds a threshold can then be deleted or archived off-line to save space.

Related Research Articles

NTFS is a proprietary journaling file system developed by Microsoft. Starting with Windows NT 3.1, it is the default file system of the Windows NT family.

ext3, or third extended filesystem, is a journaled file system that is commonly used by the Linux kernel. It used to be the default file system for many popular Linux distributions. Stephen Tweedie first revealed that he was working on extending ext2 in Journaling the Linux ext2fs Filesystem in a 1998 paper, and later in a February 1999 kernel mailing list posting. The filesystem was merged with the mainline Linux kernel in November 2001 from 2.4.15 onward. Its main advantage over ext2 is journaling, which improves reliability and eliminates the need to check the file system after an unclean shutdown. Its successor is ext4.

Disk partitioning Creation of separate accessible storage areas on a raw computer storage device

Disk partitioning or disk slicing is the creation of one or more regions on secondary storage, so that each region can be managed separately. These regions are called partitions. It is typically the first step of preparing a newly installed disk, before any file system is created. The disk stores the information about the partitions' locations and sizes in an area known as the partition table that the operating system reads before any other part of the disk. Each partition then appears to the operating system as a distinct "logical" disk that uses part of the actual disk. System administrators use a program called a partition editor to create, resize, delete, and manipulate the partitions. Partitioning allows the use of different filesystems to be installed for different kinds of files. Separating user data from system data can prevent the system partition from becoming full and rendering the system unusable. Partitioning can also make backing up easier. A disadvantage is that it can be difficult to properly size partitions, resulting in having one partition with too much free space and another nearly totally allocated.

rsync File synchronization protocol and software

rsync is a utility for efficiently transferring and synchronizing files between a computer and an external hard drive and across networked computers by comparing the modification times and sizes of files. It is commonly found on Unix-like operating systems. Rsync is written in C as a single threaded application. The rsync algorithm is a type of delta encoding, and is used for minimizing network usage. Zlib may be used for additional data compression, and SSH or stunnel can be used for security. Rsync is the facility typically used for synchronizing software repositories on mirror sites used by package management systems. It is also one of the fastest ways to delete a large number of files with a binary application, especially when compared to the Linux standard applications rm and find.

In computing, tar is a computer software utility for collecting many files into one archive file, often referred to as a tarball, for distribution or backup purposes. The name is derived from "tape archive", as it was originally developed to write data to sequential I/O devices with no file system of their own. The archive data sets created by tar contain various file system parameters, such as name, timestamps, ownership, file-access permissions, and directory organization.

Defragmentation

In the maintenance of file systems, defragmentation is a process that reduces the degree of fragmentation. It does this by physically organizing the contents of the mass storage device used to store files into the smallest number of contiguous regions. It also attempts to create larger regions of free space using compaction to impede the return of fragmentation. Some defragmentation utilities try to keep smaller files within a single directory together, as they are often accessed in sequence.

Utility software is software designed to help to analyze, configure, optimize or maintain a computer. It is used to support the computer infrastructure - in contrast to application software, which is aimed at directly performing tasks that benefit ordinary users. However, utilities often form part of the application systems. For example, a batch job may run user-written code to update a database and may then include a step that runs a utility to back up the database, or a job may run a utility to compress a disk before copying files.

File system Format or program for storing files and directories

In computing, a file system or filesystem controls how data is stored and retrieved. Without a file system, data placed in a storage medium would be one large body of data with no way to tell where one piece of data stops and the next begins. By separating the data into pieces and giving each piece a name, the data is easily isolated and identified. Taking its name from the way paper-based data management system is named, each group of data is called a "file." The structure and logic rules used to manage the groups of data and their names is called a "file system."

File attributes are a type of meta-data that describe and may modify how files and/or directories in a filesystem behave. Typical file attributes may, for example, indicate or specify whether a file is visible, modifiable, compressed, or encrypted. The availability of most file attributes depends on support by the underlying filesystem where attribute data must be stored along with other control structures. Each attribute can have one of two states: set and cleared. Attributes are considered distinct from other metadata, such as dates and times, filename extensions or file system permissions. In addition to files, folders, volumes and other file system objects may have attributes.

tail is a program available on Unix, Unix-like systems, FreeDOS and MSX-DOS used to display the tail end of a text file or piped data.

Sparse file

In computer science, a sparse file is a type of computer file that attempts to use file system space more efficiently when the file itself is partially empty. This is achieved by writing brief information (metadata) representing the empty blocks to disk instead of the actual "empty" space which makes up the block, using less disk space. The full block size is written to disk as the actual size only when the block contains "real" (non-empty) data.

Yaffs is a file system designed and written by Charles Manning for the company Aleph One.

NILFS or NILFS2 is a log-structured file system implementation for the Linux kernel. It is being developed by Nippon Telegraph and Telephone Corporation (NTT) CyberSpace Laboratories and a community from all over the world. NILFS was released under the terms of the GNU General Public License (GPL).

sync is a standard system call in the Unix operating system, which commits all data in the kernel filesystem to non-volatile storage buffers, i.e., data which has been scheduled for writing via low-level I/O system calls. Higher-level I/O layers such as stdio may maintain separate buffers of their own.

The ext4 journaling file system or fourth extended filesystem is a journaling file system for Linux, developed as the successor to ext3.

In computing, a log file is a file that records either events that occur in an operating system or other software runs, or messages between different users of a communication software. Logging is the act of keeping a log. In the simplest case, messages are written to a single log file.

A trim command allows an operating system to inform a solid-state drive (SSD) which blocks of data are no longer considered to be 'in use' and therefore can be erased internally.

A journaling file system is a file system that keeps track of changes not yet committed to the file system's main part by recording the goal of such changes in a data structure known as a "journal", which is usually a circular log. In the event of a system crash or power failure, such file systems can be brought back online more quickly with a lower likelihood of becoming corrupted.

A disk utility is a utility program that allows a user to perform various functions on a computer disk, such as disk partitioning and logical volume management, as well as multiple smaller tasks such as changing drive letters and other mount points, renaming volumes, disk checking, and disk formatting, which are otherwise handled separately by multiple other built-in commands. Each operating system (OS) has its own basic disk utility, and there are also separate programs which can recognize and adjust the different filesystems of multiple OSes. Types of disk utilities include disk checkers, disk cleaners and disk space analyzers

Append-only is a property of computer data storage such that new data can be appended to the storage, but where existing data is immutable.

References

  1. Kent, Karen; Souppaya, Murugiah (September 2006). "Guide to Computer Security Management: Recommendations of the National Institute of Standards and Technology" (PDF). Special Publication 800-92. Gaithersburg, MD: National Institute of Standards and Technology: Technology Administration, U.S. Department of Commerce. p. 3. Retrieved 2014-10-02. Log rotation is closing a log file and opening a new log file when the first file is considered to be complete. Log rotation is typically performed according to a schedule (e.g., hourly, daily, weekly) or when a log file reaches a certain size.
  2. "logrotate (8) - Linux man page" . Retrieved February 10, 2013.
  3. "Log rotation" . Retrieved February 10, 2013.
  4. "newsyslog (8) - maintain system log files to manageable sizes".