End-of-file

Last updated

In computing, end-of-file (EOF) [1] is a condition in a computer operating system where no more data can be read from a data source. The data source is usually called a file or stream.

Contents

Details

In the C standard library, the character-reading functions such as getchar return a value equal to the symbolic value (macro) EOF to indicate that an end-of-file condition has occurred. The actual value of EOF is implementation-dependent and must be negative (it is commonly −1, such as in glibc [2] ). Block-reading functions return the number of bytes read, and if this is fewer than asked for, then the end of file was reached or an error occurred (checking of errno or dedicated function, such as ferror is required to determine which).

EOF character

Input from a terminal never really "ends" (unless the device is disconnected), but it is useful to enter more than one "file" into a terminal, so a key sequence is reserved to indicate end of input. In UNIX, the translation of the keystroke to EOF is performed by the terminal driver, so a program does not need to distinguish terminals from other input files. By default, the driver converts a Control-D character at the start of a line into an end-of-file indicator. To insert an actual Control-D (ASCII 04) character into the input stream, the user precedes it with a "quote" command character (usually Control-V). AmigaDOS is similar but uses Control-\ instead of Control-D.

In DOS and Windows (and in CP/M and many DEC operating systems such as the PDP-6 monitor, [3] RT-11, VMS or TOPS-10 [4] ), reading from the terminal will never produce an EOF. Instead, programs recognize that the source is a terminal (or other "character device") and interpret a given reserved character or sequence as an end-of-file indicator; most commonly, this is an ASCII Control-Z, code 26. Some MS-DOS programs, including parts of the Microsoft MS-DOS shell (COMMAND.COM) and operating-system utility programs (such as EDLIN), treat a Control-Z in a text file as marking the end of meaningful data, and/or append a Control-Z to the end when writing a text file. This was done for two reasons:

In the ANSI X3.27-1969 magnetic tape standard, the end of file was indicated by a tape mark, which consisted of a gap of approximately 3.5 inches of tape followed by a single byte containing the character 0x13 (hex) for nine-track tapes and 017 (octal) for seven-track tapes. [5] The end-of-tape, commonly abbreviated as EOT, was indicated by two tape marks. This was the standard used, for example, on IBM 360. The reflective strip that was used to announce impending physical end of tape was also called an EOT marker.

See also

Related Research Articles

<span class="mw-page-title-main">ASCII</span> American character encoding standard

ASCII, an acronym for American Standard Code for Information Interchange, is a character encoding standard for electronic communication. ASCII codes represent text in computers, telecommunications equipment, and other devices. ASCII has just 128 code points, of which only 95 are printable characters, which severely limit its scope. The set of available punctuation had significant impact on the syntax of computer languages and text markup. ASCII hugely influenced the design of character sets used by modern computers, including Unicode which has over a million code points, but the first 128 of these are the same as ASCII.

In computing and telecommunications, a control character or non-printing character (NPC) is a code point in a character set that does not represent a written character or symbol. They are used as in-band signaling to cause effects other than the addition of a symbol to the text. All other characters are mainly graphic characters, also known as printing characters, except perhaps for "space" characters. In the ASCII standard there are 33 control characters, such as code 7, BEL, which rings a terminal bell.

A disk operating system (DOS) is a computer operating system that resides on and can use a disk storage device, such as a floppy disk, hard disk drive, or optical disc. A disk operating system provides a file system for organizing, reading, and writing files on the storage disk, and a means for loading and running programs stored on that disk. Strictly speaking, this definition does not include any other functionality, so it does not apply to more complex OSes, such as Microsoft Windows, and is more appropriately used only for older generations of operating systems.

<span class="mw-page-title-main">Booting</span> Process of starting a computer

In computing, booting is the process of starting a computer as initiated via hardware such as a physical button on the computer or by a software command. After it is switched on, a computer's central processing unit (CPU) has no software in its main memory, so some process must load software into memory before it can be executed. This may be done by hardware or firmware in the CPU, or by a separate processor in the computer system. On some systems a power-on reset (POR) does not initiate booting and the operator must initiate booting after POR completes. IBM uses the term Initial Program Load (IPL) on some product lines.

In telecommunications, an End-of-Transmission character (EOT) is a transmission control character. Its intended use is to indicate the conclusion of a transmission that may have included one or more texts and any associated message headings.

<span class="mw-page-title-main">CP/M</span> Discontinued family of computer operating systems

CP/M, originally standing for Control Program/Monitor and later Control Program for Microcomputers, is a mass-market operating system created in 1974 for Intel 8080/85-based microcomputers by Gary Kildall of Digital Research, Inc. CP/M is a disk operating system and its purpose is to organize files on a magnetic storage medium, and to load and run programs stored on a disk. Initially confined to single-tasking on 8-bit processors and no more than 64 kilobytes of memory, later versions of CP/M added multi-user variations and were migrated to 16-bit processors.

<span class="mw-page-title-main">History of operating systems</span> Aspect of computing history

Computer operating systems (OSes) provide a set of functions needed and used by most application programs on a computer, and the links needed to control and synchronize computer hardware. On the first computers, with no operating system, every program needed the full hardware specification to run correctly and perform standard tasks, and its own drivers for peripheral devices like printers and punched paper card readers. The growing complexity of hardware and application programs eventually made operating systems a necessity for everyday use.

<span class="mw-page-title-main">ANSI escape code</span> Method used for display options on video text terminals

ANSI escape sequences are a standard for in-band signaling to control cursor location, color, font styling, and other options on video text terminals and terminal emulators. Certain sequences of bytes, most starting with an ASCII escape character and a bracket character, are embedded into text. The terminal interprets these sequences as commands, rather than text to display verbatim.

RT-11 is a discontinued small, low-end, single-user real-time operating system for the full line of Digital Equipment Corporation PDP-11 16-bit computers. RT-11 was first implemented in 1970. It was widely used for real-time computing systems, process control, and data acquisition across all PDP-11s. It was also used for low-cost general-use computing.

<span class="mw-page-title-main">KIM-1</span> Single-board computer produced by MOS Technology in 1976

The KIM-1, short for Keyboard Input Monitor, is a small 6502-based single-board computer developed and produced by MOS Technology, Inc. and launched in 1976. It was very successful in that period, due to its low price and easy-access expandability.

In computing, tar is a computer software utility for collecting many files into one archive file, often referred to as a tarball, for distribution or backup purposes. The name is derived from "tape archive", as it was originally developed to write data to sequential I/O devices with no file system of their own, such as devices that use magnetic tape. The archive data sets created by tar contain various file system parameters, such as name, timestamps, ownership, file-access permissions, and directory organization. POSIX abandoned tar in favor of pax, yet tar sees continued widespread use.

A text file is a kind of computer file that is structured as a sequence of lines of electronic text. A text file exists stored as data within a computer file system.

<span class="mw-page-title-main">Newline</span> Special characters in computing signifying the end of a line of text

A newline is a control character or sequence of control characters in character encoding specifications such as ASCII, EBCDIC, Unicode, etc. This character, or a sequence of characters, is used to signify the end of a line of text and the start of a new one.

OS/8 is the primary operating system used on the Digital Equipment Corporation's PDP-8 minicomputer.

Peripheral Interchange Program (PIP) was a utility to transfer files on and between devices on Digital Equipment Corporation's computers. It was first implemented on the PDP-6 architecture by Harrison "Dit" Morse early in the 1960s. It was subsequently implemented for DEC's operating systems for PDP-10, PDP-11, and PDP-8 architectures. In the 1970s and 1980s Digital Research implemented PIP on CP/M and MP/M.

<span class="mw-page-title-main">RSTS/E</span> Computer operating system

RSTS is a multi-user time-sharing operating system developed by Digital Equipment Corporation for the PDP-11 series of 16-bit minicomputers. The first version of RSTS was implemented in 1970 by DEC software engineers that developed the TSS-8 time-sharing operating system for the PDP-8. The last version of RSTS was released in September 1992. RSTS-11 and RSTS/E are usually referred to just as "RSTS" and this article will generally use the shorter form. RSTS-11 supports the BASIC programming language, an extended version called BASIC-PLUS, developed under contract by Evans Griffiths & Hart of Boston. Starting with RSTS/E version 5B, DEC added support for additional programming languages by emulating the execution environment of the RT-11 and RSX-11 operating systems.

dd is a command-line utility for Unix, Plan 9, Inferno, and Unix-like operating systems and beyond, the primary purpose of which is to convert and copy files. On Unix, device drivers for hardware and special device files appear in the file system just like normal files; dd can also read and/or write from/to these files, provided that function is implemented in their respective driver. As a result, dd can be used for tasks such as backing up the boot sector of a hard drive, and obtaining a fixed amount of random data. The dd program can also perform conversions on the data as it is copied, including byte order swapping and conversion to and from the ASCII and EBCDIC text encodings.

A fat binary is a computer executable program or library which has been expanded with code native to multiple instruction sets which can consequently be run on multiple processor types. This results in a file larger than a normal one-architecture binary file, thus the name.

In computer data, a substitute character (␚) is a control character that is used to pad transmitted data in order to send it in blocks of fixed size, or to stand in place of a character that is recognized to be invalid, erroneous or unrepresentable on a given device. It is also used as an escape sequence in some programming languages.

In Unix-like operating systems, a device file, device node, or special file is an interface to a device driver that appears in a file system as if it were an ordinary file. There are also special files in DOS, OS/2, and Windows. These special files allow an application program to interact with a device by using its device driver via standard input/output system calls. Using standard system calls simplifies many programming tasks, and leads to consistent user-space I/O mechanisms regardless of device features and functions.

References

  1. Pollock, Wayne. "Shell Here Document Overview". hccfl.edu. Archived from the original on 2014-05-29. Retrieved 2014-05-28.
  2. "The GNU C Library". www.gnu.org.
  3. "Table of IO Device Characteristics - Console or Teletypewriters". PDP-6 Multiprogramming System Manual (PDF). Maynard, Massachusetts, USA: Digital Equipment Corporation (DEC). 1965. p. 43. DEC-6-0-EX-SYS-UM-IP-PRE00. Archived (PDF) from the original on 2014-07-14. Retrieved 2014-07-10. (1+84+10 pages)
  4. "5.1.1.1. Device Dependent Functions - Data Modes - Full-Duplex Software A(ASCII) and AL(ASCII Line)". PDP-10 Reference Handbook: Communicating with the Monitor - Time-Sharing Monitors (PDF). Vol. 3. Digital Equipment Corporation (DEC). 1969. pp. 5-3 – 5-6 [5-5 (431)]. Archived (PDF) from the original on 2011-11-15. Retrieved 2014-07-10. (207 pages)
  5. "Tape Transfer (Pre-1977): Exchange Media: MARC 21 Specifications for Record Structure, Character Sets, and Exchange Media (Library of Congress)". www.loc.gov.