Stat (system call)


stat() is a Unix system call that returns file attributes about an inode. The semantics of stat() vary between operating systems. As an example, the Unix command ls uses this system call to retrieve information on files, including their access, modification, and status-change times (atime, mtime, and ctime).

stat appeared in Version 1 Unix. It is among the few original Unix system calls to change, with Version 4's addition of group permissions and larger file size. [1]

stat() functions

The C POSIX library header sys/stat.h, found on POSIX and other Unix-like operating systems, declares the stat() functions, as well as related functions called fstat() and lstat(). The functions take a pointer to a struct stat buffer argument, which is used to return the file attributes. On success, the functions return zero, and on error, −1 is returned and errno is set appropriately.

The stat() and lstat() functions take a filename argument. If the file is a symbolic link, stat() returns attributes of the eventual target of the link, while lstat() returns attributes of the link itself. The fstat() function takes a file descriptor argument instead, and returns attributes of the file that it identifies.
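The difference between the three calls can be sketched as follows. This is a minimal illustration assuming a POSIX.1-2008 system; the helper names is_symlink and same_file_via_fd are hypothetical and not part of any standard API.

```c
#define _XOPEN_SOURCE 700   /* request POSIX.1-2008 declarations */

#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the path itself is a symbolic link,
 * 0 if it is not, and -1 if lstat() fails (errno is set by lstat()).
 * lstat() is used because stat() would follow the link. */
int is_symlink(const char *path)
{
    struct stat sb;

    assert(path != NULL);
    if (lstat(path, &sb) == -1)
        return -1;
    return S_ISLNK(sb.st_mode) ? 1 : 0;
}

/* Hypothetical helper: returns 1 if stat() on the path and fstat() on a
 * descriptor opened from the same path report the same device and inode,
 * 0 if they differ, and -1 on error. */
int same_file_via_fd(const char *path)
{
    struct stat by_name, by_fd;
    int fd, rc;

    assert(path != NULL);
    if (stat(path, &by_name) == -1)
        return -1;
    fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;
    rc = fstat(fd, &by_fd);
    close(fd);
    if (rc == -1)
        return -1;
    return by_name.st_dev == by_fd.st_dev && by_name.st_ino == by_fd.st_ino;
}
```

A caller would typically check for -1 first and consult errno, for example with perror(), before interpreting the result.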

The family of functions was extended to implement large file support. Functions named stat64(), lstat64() and fstat64() return attributes in a struct stat64 structure, which represents file sizes with a 64-bit type, allowing the functions to work on files 2 GiB and larger (up to 8 EiB). When the _FILE_OFFSET_BITS macro is defined to 64, these 64-bit functions are available under the original names.

The functions are defined as:

    int stat(const char *filename, struct stat *buf);
    int lstat(const char *filename, struct stat *buf);
    int fstat(int filedesc, struct stat *buf);

stat structure

This structure is defined in sys/stat.h header file as follows, although implementations are free to define additional fields: [2]

    struct stat {
        mode_t          st_mode;
        ino_t           st_ino;
        dev_t           st_dev;
        dev_t           st_rdev;
        nlink_t         st_nlink;
        uid_t           st_uid;
        gid_t           st_gid;
        off_t           st_size;
        struct timespec st_atim;
        struct timespec st_mtim;
        struct timespec st_ctim;
        blksize_t       st_blksize;
        blkcnt_t        st_blocks;
    };

POSIX.1 does not require st_rdev, st_blocks and st_blksize members; these fields are defined as part of XSI option in the Single Unix Specification.

In older versions of the POSIX.1 standard, the time-related fields were defined as st_atime, st_mtime and st_ctime, and were of type time_t. In the 2008 version of the standard, these fields were renamed to st_atim, st_mtim and st_ctim, respectively, and given the type struct timespec, which provides a higher-resolution time unit. For the sake of compatibility, implementations can define the old names in terms of the tv_sec member of struct timespec. For example, st_atime can be defined as st_atim.tv_sec. [2]
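The relationship between the old and new field names can be illustrated with a short sketch. It assumes a POSIX.1-2008 system that provides both st_mtim and the st_mtime compatibility name; the helper names mtime_of and mtime_seconds are hypothetical.

```c
#define _XOPEN_SOURCE 700   /* request POSIX.1-2008 declarations */

#include <assert.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical helper: copies the full-resolution modification time of
 * the named file into *out. Returns 0 on success, -1 on error. */
int mtime_of(const char *path, struct timespec *out)
{
    struct stat sb;

    assert(path != NULL && out != NULL);
    if (stat(path, &sb) == -1)
        return -1;
    *out = sb.st_mtim;          /* seconds plus nanoseconds */
    return 0;
}

/* Hypothetical helper: whole-second modification time read through the
 * legacy name, or (time_t)-1 on error. */
time_t mtime_seconds(const char *path)
{
    struct stat sb;

    assert(path != NULL);
    if (stat(path, &sb) == -1)
        return (time_t)-1;
    return sb.st_mtime;         /* typically an alias for st_mtim.tv_sec */
}
```

The tv_nsec member is always in the range 0 to 999,999,999, but whether it carries real sub-second information depends on the file system.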

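The struct stat structure includes at least the following members:

  st_dev: identifier of the device containing the file
  st_ino: inode number
  st_mode: file type and protection mode
  st_nlink: number of hard links to the file
  st_uid: user identifier of the owner
  st_gid: group identifier of the owner
  st_rdev: device identifier (if the file is a special file)
  st_size: total file size, in bytes
  st_atim: time of last access
  st_mtim: time of last modification
  st_ctim: time of last status change
  st_blksize: preferred block size for file system I/O
  st_blocks: number of blocks allocated for the file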

The st_mode field is a bit field. It combines the file access modes and also indicates any special file type. There are many macros to work with the different mode flags and file types.
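A brief sketch of how the standard macros from sys/stat.h can be applied to st_mode follows. The function names file_type_name and perm_string are hypothetical; the S_IS* test macros and the S_I* permission constants are defined by POSIX.

```c
#define _XOPEN_SOURCE 700   /* request POSIX.1-2008 declarations */

#include <assert.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: maps the file-type bits of a mode to a label. */
const char *file_type_name(mode_t mode)
{
    if (S_ISREG(mode))  return "regular file";
    if (S_ISDIR(mode))  return "directory";
    if (S_ISLNK(mode))  return "symbolic link";
    if (S_ISCHR(mode))  return "character device";
    if (S_ISBLK(mode))  return "block device";
    if (S_ISFIFO(mode)) return "FIFO";
    if (S_ISSOCK(mode)) return "socket";
    return "unknown";
}

/* Hypothetical helper: writes the nine permission bits of a mode as a
 * NUL-terminated string such as "rwxr-xr-x" into out (10 bytes). */
void perm_string(mode_t mode, char out[10])
{
    static const mode_t bits[9] = {
        S_IRUSR, S_IWUSR, S_IXUSR,
        S_IRGRP, S_IWGRP, S_IXGRP,
        S_IROTH, S_IWOTH, S_IXOTH
    };

    assert(out != NULL);
    memcpy(out, "rwxrwxrwx", 10);       /* copies the terminating NUL too */
    for (int i = 0; i < 9; i++)
        if (!(mode & bits[i]))
            out[i] = '-';               /* clear letters for unset bits */
}
```

The sketch ignores the set-user-ID, set-group-ID and sticky bits, which ls additionally folds into the execute positions.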

Criticism of atime

Reading a file changes its atime, which eventually requires a disk write; this has been criticized as inconsistent with a read-only file system. The file system cache may significantly reduce this activity, to one disk write per cache flush.

Linux kernel developer Ingo Molnár publicly criticized the concept and performance impact of atime in 2007, [4] [5] and in 2009 the relatime mount option became the default, which addresses this criticism. [6] The behavior behind the relatime mount option offers sufficient performance for most purposes and, after extensive discussion, is not expected to break any significant applications. [7] Initially, relatime only updated atime if atime < mtime or atime < ctime; that was subsequently modified to also update atimes that were 24 hours old or older, so that tmpwatch and Debian's popularity counter (popcon) would behave properly. [8]
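The relatime rule as stated above can be written as a small predicate. This is a sketch of the described policy at whole-second resolution, not the kernel's implementation; the function names and the one-second granularity are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Sketch of the relatime decision: update atime if the file was modified
 * or had its status changed after the last recorded access, or if the
 * recorded access is at least 24 hours old. */
bool relatime_should_update(time_t last_access, time_t last_modify,
                            time_t last_change, time_t now)
{
    const double one_day = 24.0 * 60.0 * 60.0;

    if (last_access < last_modify || last_access < last_change)
        return true;
    return difftime(now, last_access) >= one_day;
}

/* Small self-check exercising the three cases of the rule. */
void relatime_self_check(void)
{
    const time_t day = 24 * 60 * 60;

    /* modified after the last access */
    assert(relatime_should_update(100, 200, 100, 300));
    /* read an hour ago and unchanged since */
    assert(!relatime_should_update(1000, 500, 500, 1000 + 3600));
    /* recorded access is exactly 24 hours old */
    assert(relatime_should_update(1000, 500, 500, 1000 + day));
}
```

Under this rule a file that is read repeatedly causes at most one atime write per day unless it is modified in between.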

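Current versions of the Linux kernel support four mount options, which can be specified in fstab:

  strictatime: always update atime, which conforms to the behavior defined by POSIX
  relatime: update atime only if the previous atime is older than the mtime or ctime, or is more than 24 hours in the past
  nodiratime: never update atime of directories, but do update atime of other files
  noatime: never update atime of any file or directory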

Current versions of Linux, macOS, Solaris, FreeBSD, and NetBSD support a noatime mount option in /etc/fstab, which causes the atime field never to be updated. Turning off atime updating breaks POSIX compliance and some applications, such as mbox-driven "new mail" notifications [9] and some file-usage-watching utilities, notably tmpwatch.

The noatime option on OpenBSD behaves more like Linux relatime. [10]

Version 4.0 of the Linux kernel mainline, which was released on April 12, 2015, introduced the new mount option lazytime. It allows POSIX-style atime updates to be performed in-memory and flushed to disk together with some non-time-related I/O operations on the same file; atime updates are also flushed to disk when some of the sync system calls are executed, or before the file's in-memory inode is evicted from the filesystem cache. Additionally, it is possible to configure for how long atime modifications can remain unflushed. That way, lazytime retains POSIX compatibility while offering performance improvements. [11] [12]

ctime

It is tempting to believe that ctime originally meant creation time; [13] however, while early Unix did have modification and creation times, the latter was changed to be access time before there was any C structure in which to call anything ctime. The file systems retained just the access time (atime) and modification time (mtime) through 6th edition Unix. The ctime timestamp was added in the file system restructuring that occurred with Version 7 Unix, and has always referred to inode change time. It is updated any time file metadata stored in the inode changes, such as file permissions, file ownership, and creation and deletion of hard links. POSIX also mandates that a write() of more than zero bytes update ctime (last status change) as well as the modification time. [14] In some implementations, ctime is affected by renaming a file, despite filenames not being stored in inodes: both original Unix, which implemented renaming by making a link (updating ctime) and then unlinking the old name (updating ctime again), and modern Linux tend to do this.

Unlike atime and mtime, ctime cannot be set to an arbitrary value with utime(), as used by the touch utility, for example. Instead, when utime() is used, or for any other change to the inode other than an update to atime caused by accessing the file, the ctime value is set to the current time.
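This behavior can be observed with a short sketch. It uses utimensat(), the POSIX.1-2008 successor to utime(), rather than utime() itself. The helper names, the scratch-file approach and the chosen timestamp are assumptions made for illustration.

```c
#define _XOPEN_SOURCE 700   /* request POSIX.1-2008 declarations */

#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: sets atime and mtime of the named file to `when`
 * and stores the resulting attributes in *after. Returns 0 or -1. */
int backdate(const char *path, time_t when, struct stat *after)
{
    struct timespec times[2] = {
        { .tv_sec = when, .tv_nsec = 0 },   /* new atime */
        { .tv_sec = when, .tv_nsec = 0 },   /* new mtime */
    };

    assert(path != NULL && after != NULL);
    if (utimensat(AT_FDCWD, path, times, 0) == -1)
        return -1;
    return stat(path, after);
}

/* Hypothetical demonstration: creates a scratch file, back-dates it to
 * an arbitrary instant in September 2001, and removes it again.
 * Returns 1 if mtime took the requested value while ctime stayed later,
 * 0 if not, and -1 on error. */
int ctime_tracks_change(const char *scratch_path)
{
    const time_t past = 1000000000;
    struct stat sb;
    int fd, rc;

    fd = open(scratch_path, O_CREAT | O_WRONLY | O_TRUNC, 0600);
    if (fd == -1)
        return -1;
    close(fd);
    rc = backdate(scratch_path, past, &sb);
    unlink(scratch_path);
    if (rc == -1)
        return -1;
    return sb.st_mtime == past && sb.st_ctime > sb.st_mtime;
}
```

On a system with a correctly set clock, the back-dated file is expected to show the requested atime and mtime while its ctime reflects the moment of the utimensat() call.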

Time granularity

Example

    #include <stdio.h>
    #include <stdlib.h>
    #include <stdint.h>
    #include <time.h>
    #include <sys/types.h>
    #include <pwd.h>
    #include <grp.h>
    #include <sys/stat.h>

    int main(int argc, char *argv[])
    {
        struct stat sb;
        struct passwd *pwuser;
        struct group *grpnam;

        if (argc < 2) {
            fprintf(stderr, "Usage: %s: file ...\n", argv[0]);
            exit(EXIT_FAILURE);
        }

        for (int i = 1; i < argc; i++) {
            if (-1 == stat(argv[i], &sb)) {
                perror("stat()");
                exit(EXIT_FAILURE);
            }

            if (NULL == (pwuser = getpwuid(sb.st_uid))) {
                perror("getpwuid()");
                exit(EXIT_FAILURE);
            }

            if (NULL == (grpnam = getgrgid(sb.st_gid))) {
                perror("getgrgid()");
                exit(EXIT_FAILURE);
            }

            printf("%s:\n", argv[i]);
            /* The integer types behind ino_t, uid_t, nlink_t and off_t vary
               between platforms, so cast to the widest types for printf. */
            printf("\tinode: %ju\n", (uintmax_t)sb.st_ino);
            printf("\towner: %ju (%s)\n", (uintmax_t)sb.st_uid, pwuser->pw_name);
            printf("\tgroup: %ju (%s)\n", (uintmax_t)sb.st_gid, grpnam->gr_name);
            printf("\tperms: %o\n",
                   (unsigned int)(sb.st_mode & (S_IRWXU | S_IRWXG | S_IRWXO)));
            printf("\tlinks: %ju\n", (uintmax_t)sb.st_nlink);
            printf("\tsize: %jd\n", (intmax_t)sb.st_size);
            /* ctime() returns a string that already ends in a newline */
            printf("\tatime: %s", ctime(&sb.st_atim.tv_sec));
            printf("\tmtime: %s", ctime(&sb.st_mtim.tv_sec));
            printf("\tctime: %s", ctime(&sb.st_ctim.tv_sec));
            printf("\n");
        }

        return 0;
    }

References

  1. McIlroy, M. D. (1987). A Research Unix reader: annotated excerpts from the Programmer's Manual, 1971–1986 (PDF) (Technical report). CSTR. Bell Labs. 139.
  2. Stevens & Rago 2013, p. 94.
  3. "<sys/stat.h>". The Open Group Base Specifications Issue 6, IEEE Std 1003.1, 2004 Edition. The Open Group. 2004.
  4. Kernel Trap: Linux: Replacing atime With relatime, by Jeremy, August 7, 2007
  5. Once upon atime, LWN, by Jonathan Corbet, August 8, 2007
  6. Linux kernel 2.6.30, Linux Kernel Newbies
  7. That massive filesystem thread, LWN, by Jonathan Corbet, March 31, 2009
  8. Relatime Recap, Valerie Aurora
  9. http://www.mail-archive.com/mutt-users@mutt.org/msg24912.html "the shell's $MAIL monitor ... depends on atime, pronouncing new email with atime($MAIL) < mtime($MAIL)"
  10. "mount(2) - OpenBSD manual pages". openbsd.org. April 27, 2018. Retrieved September 26, 2018.
  11. "Linux kernel 4.0, Section 1.5. 'lazytime' option for better update of file timestamps". kernelnewbies.org. May 1, 2015. Retrieved May 2, 2015.
  12. Jonathan Corbet (November 19, 2014). "Introducing lazytime". LWN.net. Retrieved May 2, 2015.
  13. "BSTJ version of C.ACM Unix paper".
  14. "pwrite, write - write on a file". Upon successful completion, where nbyte is greater than 0, write() shall mark for update the last data modification and last file status change timestamps
  15. "stat(2) - Linux manual page". man7.org. Retrieved February 27, 2015.
  16. Andreas Jaeger (December 2, 2002), struct stat.h with nanosecond resolution, mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.
  17. MSDN: File Times