Standard streams

Last updated

In computer programming, standard streams are preconnected input and output communication channels [1] between a computer program and its environment when it begins execution. The three input/output (I/O) connections are called standard input (stdin), standard output (stdout) and standard error (stderr). Originally I/O happened via a physically connected system console (input via keyboard, output via monitor), but standard streams abstract this. When a command is executed via an interactive shell, the streams are typically connected to the text terminal on which the shell is running, but can be changed with redirection or a pipeline. More generally, a child process inherits the standard streams of its parent process.

Contents

Application

The standard streams for input, output, and error in a common default configuration Stdstreams-notitle.svg
The standard streams for input, output, and error in a common default configuration

Users generally know standard streams as input and output channels that handle data coming from an input device, or that write data from the application. The data may be text with any encoding, or binary data. When a program is run as a daemon, its standard error stream is redirected into a log file, typically for error analysis purposes.

Streams may be used to chain applications, meaning that the output stream of one program can be redirected to be the input stream to another application. In many operating systems this is expressed by listing the application names, separated by the vertical bar character, for this reason often called the pipeline character. A well-known example is the use of a pagination application, such as more, providing the user control over the display of the output stream on the display.

Background

In most operating systems predating Unix, programs had to explicitly connect to the appropriate input and output devices. OS-specific intricacies caused this to be a tedious programming task. On many systems it was necessary to obtain control of environment settings, access a local file table, determine the intended data set, and handle hardware correctly in the case of a punch card reader, magnetic tape drive, disk drive, line printer, card punch, or interactive terminal.

One of Unix's several groundbreaking advances was abstract devices, which removed the need for a program to know or care what kind of devices it was communicating with[ citation needed ]. Older operating systems forced upon the programmer a record structure and frequently non-orthogonal data semantics and device control. Unix eliminated this complexity with the concept of a data stream: an ordered sequence of data bytes which can be read until the end of file. A program may also write bytes as desired and need not, and cannot easily declare their count or grouping.

Another Unix breakthrough was to automatically associate input and output to terminal keyboard and terminal display, respectively, by default[ citation needed ] — the program (and programmer) did absolutely nothing to establish input and output for a typical input-process-output program (unless it chose a different paradigm). In contrast, previous operating systems usually required some—often complex—job control language to establish connections, or the equivalent burden had to be orchestrated by the program.[ citation needed ]

Since Unix provided standard streams, the Unix C runtime environment was obliged to support it as well. As a result, most C runtime environments (and C's descendants), regardless of the operating system, provide equivalent functionality.

Standard input (stdin)

Standard input is a stream from which a program reads its input data. The program requests data transfers by use of the read operation. Not all programs require stream input. For example, the dir and ls programs (which display file names contained in a directory) may take command-line arguments, but perform their operations without any stream data input.

Unless redirected, standard input is inherited from the parent process. In the case of an interactive shell, that is usually associated with the input device of a terminal (or pseudo terminal) which is ultimately linked to a user's keyboard.

On POSIX systems, the file descriptor for standard input is 0 (zero); the POSIX <unistd.h> definition is STDIN_FILENO; the corresponding C <stdio.h> abstraction is provided via the FILE* stdin global variable. Similarly, the global C++ std::cin variable of type <iostream> provides an abstraction via C++ streams. Similar abstractions exist in the standard I/O libraries of practically every programming language.

Standard output (stdout)

Standard output is a stream to which a program writes its output data. The program requests data transfer with the write operation. Not all programs generate output. For example, the file rename command (variously called mv , move , or ren ) is silent on success.

Unless redirected, standard output is inherited from the parent process. In the case of an interactive shell, that is usually the text terminal which initiated the program.

The file descriptor for standard output is 1 (one); the POSIX <unistd.h> definition is STDOUT_FILENO; the corresponding C <stdio.h> variable is FILE* stdout; similarly, the C++ <iostream> variable is std::cout.

Standard error (stderr)

Standard error is another output stream typically used by programs to output error messages or diagnostics. It is a stream independent of standard output and can be redirected separately.

This solves the semi-predicate problem, allowing output and errors to be distinguished, and is analogous to a function returning a pair of values – see Semipredicate problem § Multivalued return. The usual destination is the text terminal which started the program to provide the best chance of being seen even if standard output is redirected (so not readily observed). For example, output of a program in a pipeline is redirected to input of the next program or a text file, but errors from each program still go directly to the text terminal so they can be reviewed by the user in real time. [2]

It is acceptable and normal to direct standard output and standard error to the same destination, such as the text terminal. Messages appear in the same order as the program writes them, unless buffering is involved. For example, in common situations the standard error stream is unbuffered but the standard output stream is line-buffered; in this case, text written to standard error later may appear on the terminal earlier, if the standard output stream buffer is not yet full.

The file descriptor for standard error is defined by POSIX as 2 (two); the <unistd.h> header file provides the symbol STDERR_FILENO; [3] the corresponding C <stdio.h> variable is FILE* stderr. The C++ <iostream> standard header provides two variables associated with this stream: std::cerr and std::clog, the former being unbuffered and the latter using the same buffering mechanism as all other C++ streams.

Bourne-style shells allow standard error to be redirected to the same destination that standard output is directed to using

 2>&1

csh-style shells allow standard error to be redirected to the same destination that standard output is directed to using

>&

Standard error was added to Unix in the 1970s after several wasted phototypesetting runs ended with error messages being typeset instead of displayed on the user's terminal. [4]

Timeline

1950s: Fortran

Fortran has the equivalent of Unix file descriptors: By convention, many Fortran implementations use unit numbers UNIT=5 for stdin, UNIT=6 for stdout and UNIT=0 for stderr. In Fortran-2003, the intrinsic ISO_FORTRAN_ENV module was standardized to include the named constants INPUT_UNIT, OUTPUT_UNIT, and ERROR_UNIT to portably specify the unit numbers.

! FORTRAN 77 examplePROGRAM MAININTEGER NUMBERREAD(UNIT=5,*)NUMBERWRITE(UNIT=6,'(A,I3)')' NUMBER IS: ',NUMBEREND
! Fortran 2003 exampleprogram mainuse iso_fortran_envimplicit noneinteger::numberread(unit=INPUT_UNIT,*)numberwrite(unit=OUTPUT_UNIT,'(a,i3)')'Number is: ',numberend program

1960: ALGOL 60

ALGOL 60 was criticized for having no standard file access.[ citation needed ]

1968: ALGOL 68

ALGOL 68's input and output facilities were collectively referred to as the transput. [5] Koster coordinated the definition of the transput standard. The model included three standard channels: stand in, stand out, and stand back.

Example
# ALGOL 68 example # main:(   REAL number;   getf(stand in,($g$,number));   printf(($"Number is: "g(6,4)"OR "$,number)); # OR #   putf(stand out,($" Number is: "g(6,4)"!"$,number));   newline(stand out) ) 
Input:Output:
3.14159
Number is: +3.142 OR Number is: +3.142!

1970s: C and Unix

In the C programming language, the standard input, output, and error streams are attached to the existing Unix file descriptors 0, 1 and 2 respectively. [6] In a POSIX environment the <unistd.h> definitions STDIN_FILENO, STDOUT_FILENO or STDERR_FILENO should be used instead rather than magic numbers. File pointers stdin, stdout, and stderr are also provided.

Ken Thompson (designer and implementer of the original Unix operating system) modified sort in Version 5 Unix to accept "-" as representing standard input, which spread to other utilities and became a part of the operating system as a special file in Version 8. Diagnostics were part of standard output through Version 6, after which Dennis M. Ritchie created the concept of standard error. [7]

1995: Java

In Java, the standard streams are referred to by System.in (for stdin), System.out (for stdout), and System.err (for stderr). [8]

publicstaticvoidmain(Stringargs[]){try{BufferedReaderbr=newBufferedReader(newInputStreamReader(System.in));Strings=br.readLine();doublenumber=Double.parseDouble(s);System.out.println("Number is:"+number);}catch(Exceptione){System.err.println("Error:"+e.getMessage());}}

2000s: .NET

In C# and other .NET languages, the standard streams are referred to by System.Console.In (for stdin), System.Console.Out (for stdout) and System.Console.Error (for stderr). [9] Basic read and write capabilities for the stdin and stdout streams are also accessible directly through the class System.Console (e.g. System.Console.WriteLine() can be used instead of System.Console.Out.WriteLine()).

System.Console.In, System.Console.Out and System.Console.Error are System.IO.TextReader (stdin) and System.IO.TextWriter (stdout, stderr) objects, which only allow access to the underlying standard streams on a text basis. Full binary access to the standard streams must be performed through the System.IO.Stream objects returned by System.Console.OpenStandardInput(), System.Console.OpenStandardOutput() and System.Console.OpenStandardError() respectively.

// C# examplepublicstaticintMain(string[]args){try{strings=System.Console.In.ReadLine();doublenumber=double.Parse(s);System.Console.Out.WriteLine("Number is: {0:F3}",number);return0;// If Parse() threw an exception}catch(ArgumentNullException){System.Console.Error.WriteLine("No number was entered!");}catch(FormatException){System.Console.Error.WriteLine("The specified value is not a valid number!");}catch(OverflowException){System.Console.Error.WriteLine("The specified number is too big!");}return-1;}
' Visual Basic .NET examplePublicFunctionMain()AsIntegerTryDimsAsString=System.Console.[In].ReadLine()DimnumberAsDouble=Double.Parse(s)System.Console.Out.WriteLine("Number is: {0:F3}",number)Return0' If Parse() threw an exceptionCatchexAsSystem.ArgumentNullExceptionSystem.Console.[Error].WriteLine("No number was entered!")Catchex2AsSystem.FormatExceptionSystem.Console.[Error].WriteLine("The specified value is not a valid number!")Catchex3AsSystem.OverflowExceptionSystem.Console.[Error].WriteLine("The specified number is too big!")EndTryReturn-1EndFunction

When applying the System.Diagnostics.Process class one can use the instance properties StandardInput, StandardOutput, and StandardError of that class to access the standard streams of the process.

2000 - : Python (2 or 3)

The following example, written in Python, shows how to redirect the standard input both to the standard output and to a text file.

#!/usr/bin/env pythonimportsys# Save the current stdout so that we can revert sys.stdout# after we complete our redirectionstdin_fileno=sys.stdinstdout_fileno=sys.stdout# Redirect sys.stdout to the filesys.stdout=open("myfile.txt","w")ctr=0forinpsinstdin_fileno:ctrs=str(ctr)# Prints to the redirected stdout ()sys.stdout.write(ctrs+") this is to the redirected --->"+inps+"\n")# Prints to the actual saved stdout handlerstdout_fileno.write(ctrs+") this is to the actual  --->"+inps+"\n")ctr=ctr+1# Close the filesys.stdout.close()# Restore sys.stdout to our old saved file handlersys.stdout=stdout_fileno

GUIs

Graphical user interfaces (GUIs) do not always make use of the standard streams; they do when GUIs are wrappers of underlying scripts and/or console programs, for instance the Synaptic package manager GUI, which wraps apt commands in Debian and/or Ubuntu. GUIs created with scripting tools like Zenity and KDialog by KDE project [10] make use of stdin, stdout, and stderr, and are based on simple scripts rather than a complete GUI programmed and compiled in C/C++ using Qt, GTK, or other equivalent proprietary widget framework.

The Services menu, as implemented on NeXTSTEP and Mac OS X, is also analogous to standard streams. On these operating systems, graphical applications can provide functionality through a system-wide menu that operates on the current selection in the GUI, no matter in what application.

Some GUI programs, primarily on Unix, still write debug information to standard error. Others (such as many Unix media players) may read files from standard input. Popular Windows programs that open a separate console window in addition to their GUI windows are the emulators pSX and DOSBox.

GTK-server can use stdin as a communication interface with an interpreted program to realize a GUI.

The Common Lisp Interface Manager paradigm "presents" GUI elements sent to an extended output stream.

See also

Related Research Articles

A Berkeley (BSD) socket is an application programming interface (API) for Internet domain sockets and Unix domain sockets, used for inter-process communication (IPC). It is commonly implemented as a library of linkable modules. It originated with the 4.2BSD Unix operating system, which was released in 1983.

<span class="mw-page-title-main">C shell</span> Unix shell

The C shell is a Unix shell created by Bill Joy while he was a graduate student at University of California, Berkeley in the late 1970s. It has been widely distributed, beginning with the 2BSD release of the Berkeley Software Distribution (BSD) which Joy first distributed in 1978. Other early contributors to the ideas or the code were Michael Ubell, Eric Allman, Mike O'Brien and Jim Kulp.

The C programming language provides many standard library functions for file input and output. These functions make up the bulk of the C standard library header <stdio.h>. The functionality descends from a "portable I/O package" written by Mike Lesk at Bell Labs in the early 1970s, and officially became part of the Unix operating system in Version 7.

In some operating systems, the null device is a device file that discards all data written to it but reports that the write operation succeeded. This device is called /dev/null on Unix and Unix-like systems, NUL: or NUL on CP/M and DOS, nul on OS/2 and newer Windows systems, NIL: on Amiga operating systems, and NL: on OpenVMS. In Windows Powershell, the equivalent is $null. It provides no data to any process that reads from it, yielding EOF immediately. In IBM operating systems DOS/360 and successors and also in OS/360 and successors such files would be assigned in JCL to DD DUMMY.

<span class="mw-page-title-main">Text-based user interface</span> Type of interface based on outputting to or controlling a text display

In computing, text-based user interfaces (TUI), is a retronym describing a type of user interface (UI) common as an early form of human–computer interaction, before the advent of bitmapped displays and modern conventional graphical user interfaces (GUIs). Like modern GUIs, they can use the entire screen area and may accept mouse and other inputs. They may also use color and often structure the display using box-drawing characters such as ┌ and ╣. The modern context of use is usually a terminal emulator.

dd is a command-line utility for Unix, Plan 9, Inferno, and Unix-like operating systems and beyond, the primary purpose of which is to convert and copy files. On Unix, device drivers for hardware and special device files appear in the file system just like normal files; dd can also read and/or write from/to these files, provided that function is implemented in their respective driver. As a result, dd can be used for tasks such as backing up the boot sector of a hard drive, and obtaining a fixed amount of random data. The dd program can also perform conversions on the data as it is copied, including byte order swapping and conversion to and from the ASCII and EBCDIC text encodings.

<span class="mw-page-title-main">Redirection (computing)</span> Form of interprocess communication

In computing, redirection is a form of interprocess communication, and is a function common to most command-line interpreters, including the various Unix shells that can redirect standard streams to user-specified locations. The concept of redirection is quite old, dating back to the earliest operating systems (OS). A discussion of the design goals for redirection can be found already in the 1971 description of the input-output subsystem of the Multics OS. However, prior to the introduction of UNIX OS with its "pipes", redirection in operating systems was hard or even impossible to do.

In Unix and Unix-like computer operating systems, a file descriptor is a process-unique identifier (handle) for a file or other input/output resource, such as a pipe or network socket.

In computing, vectored I/O, also known as scatter/gather I/O, is a method of input and output by which a single procedure call sequentially reads data from multiple buffers and writes it to a single data stream (gather), or reads data from a data stream and writes it to multiple buffers (scatter), as defined in a vector of buffers. Scatter/gather refers to the process of gathering data from, or scattering data into, the given set of buffers. Vectored I/O can operate synchronously or asynchronously. The main reasons for using vectored I/O are efficiency and convenience.

<span class="mw-page-title-main">Pipeline (Unix)</span> Mechanism for inter-process communication using message passing

In Unix-like computer operating systems, a pipeline is a mechanism for inter-process communication using message passing. A pipeline is a set of processes chained together by their standard streams, so that the output text of each process (stdout) is passed directly as input (stdin) to the next one. The second process is started as the first process is still executing, and they are executed concurrently.

inetd is a super-server daemon on many Unix systems that provides Internet services. For each configured service, it listens for requests from connecting clients. Requests are served by spawning a process which runs the appropriate executable, but simple services such as echo are served by inetd itself. External executables, which are run on request, can be single- or multi-threaded. First appearing in 4.3BSD, it is generally located at /usr/sbin/inetd. inetd is based on the (service) activator pattern

In computing, tee is a command in command-line interpreters (shells) using standard streams which reads standard input and writes it to both standard output and one or more files, effectively duplicating its input. It is primarily used in conjunction with pipes and filters. The command is named after the T-splitter used in plumbing.

In computing, exec is a functionality of an operating system that runs an executable file in the context of an already existing process, replacing the previous executable. This act is also referred to as an overlay. It is especially important in Unix-like systems, although it also exists elsewhere. As no new process is created, the process identifier (PID) does not change, but the machine code, data, heap, and stack of the process are replaced by those of the new program.

In Unix-like operating systems, a device file, device node, or special file is an interface to a device driver that appears in a file system as if it were an ordinary file. There are also special files in DOS, OS/2, and Windows. These special files allow an application program to interact with a device by using its device driver via standard input/output system calls. Using standard system calls simplifies many programming tasks, and leads to consistent user-space I/O mechanisms regardless of device features and functions.

Toybox is a free and open-source software implementation of over 200 Unix command line utilities such as ls, cp, and mv. The Toybox project was started in 2006, and became a 0BSD licensed BusyBox alternative. Toybox is used for most of Android's command-line tools in all currently supported Android versions, and is also used to build Android on Linux and macOS. All of the tools are tested on Linux, and many of them also work on BSD and macOS.

The script command is a Unix utility that records a terminal session. It dates back to the 1979 3.0 Berkeley Software Distribution (BSD).

<span class="mw-page-title-main">Command-line interface</span> Computer interface that uses text

A command-line interface (CLI) is a means of interacting with a computer program by inputting lines of text called command-lines. Command-line interfaces emerged in the mid-1960s, on computer terminals, as an interactive and more user-friendly alternative to the non-interactive interface available with punched cards.

SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA. Both simple and advanced tools are provided, supporting complex tasks like variant calling and alignment viewing as well as sorting, indexing, data extraction and format conversion. SAM files can be very large, so compression is used to save space. SAM files are human-readable text files, and BAM files are simply their binary equivalent, whilst CRAM files are a restructured column-oriented binary container format. BAM files are typically compressed and more efficient for software to work with than SAM. SAMtools makes it possible to work directly with a compressed BAM file, without having to uncompress the whole file. Additionally, since the format for a SAM/BAM file is somewhat complex - containing reads, references, alignments, quality information, and user-specified annotations - SAMtools reduces the effort needed to use SAM/BAM files by hiding low-level details.

In Unix-like operating systems, dup and dup2 system calls create a copy of a given file descriptor. This new descriptor actually does not behave like a copy, but like an alias of the old one.

cat (Unix) Unix command utility

cat is a standard Unix utility that reads files sequentially, writing them to standard output. The name is derived from its function to (con)catenate files . It has been ported to a number of operating systems.

References

  1. D. M. Ritchie, "A Stream Input-Output System", AT&T Bell Laboratories Technical Journal, 68(8), October 1984.
  2. "What are stdin, stdout and stderr in Linux? | CodePre.com". 2 December 2021. Retrieved 8 April 2022.
  3. "<unistd.h>". The Open Group Base Specifications Issue 6IEEE Std 1003.1, 2004 Edition. The Open Group. 2004.
  4. Johnson, Steve (2013-12-11). "[TUHS] Graphic Systems C/A/T phototypesetter" (Mailing list). Archived from the original on 2020-09-25. Retrieved 2020-11-07.
  5. "Revised Report on the Algorithmic Language Algol 68", edited by A. van Wijngaarden, B.J. Mailloux, J.E.L. Peck, C.H.A. Koster, M. Sintzoff, C.H. Lindsey, L.G.L.T. Meertens and R.G. Fisker, Section 10.3.
  6. "Stdin(3): Standard I/O streams - Linux man page". die.net. Archived from the original on Jun 8, 2023.
  7. McIlroy, M. D. (1987). A Research Unix reader: annotated excerpts from the Programmer's Manual, 1971–1986 (PDF) (Technical report). CSTR. Bell Labs. 139. Archived (PDF) from the original on Dec 15, 2023.
  8. "System (Java Platform SE 7)". Oracle Help Center. Retrieved 20 July 2012.
  9. ".NET Framework 4.7.1, mscorlib, console.cs". Reference Source - Microsoft. Archived from the original on Dec 10, 2017. Retrieved 2017-12-10.
  10. Kißling, Kristian (2009). "Adding graphic elements to your scripts with Zenity and KDialog". Linux Magazine . Retrieved 2021-04-11.

Sources