End-of-Transmission character

Last updated

In telecommunication, an End-of-Transmission character (EOT) is a transmission control character. Its intended use is to indicate the conclusion of a transmission that may have included one or more texts and any associated message headings. [1]

Contents

An EOT is often used to initiate other functions, such as releasing circuits, disconnecting terminals, or placing receive terminals in a standby condition. [1] Its most common use today is to cause a Unix terminal driver to signal end of file and thus exit programs that are awaiting input.

In ASCII and Unicode, the character is encoded at U+0004<control-0004>. It can be referred to as Ctrl+D, ^D in caret notation. Unicode provides the character U+2404SYMBOL FOR END OF TRANSMISSION for when EOT needs to be displayed graphically. [2] In addition, U+2301ELECTRIC ARROW can also be used as a graphic representation of EOT; it is defined in Unicode as "symbol for End of Transmission". [3]

Meaning in Unix

The EOT character in Unix is different from the Control-Z in DOS. The DOS Control-Z byte is actually sent and/or placed in files to indicate where the text ends. In contrast, the Control-D causes the Unix terminal driver to signal the EOF condition, which is not a character, while the byte has no special meaning if actually read or written from a file or terminal.

In Unix, the end-of-file character (by default EOT) causes the terminal driver to make available all characters in its input buffer immediately; normally the driver would collect characters until it sees an end-of-line character. If the input buffer is empty (because no characters have been typed since the last end-of-line or end-of-file), a program reading from the terminal reads a count of zero bytes. In Unix, such a condition is understood as having reached the end of the file.

This can be demonstrated with the cat program on Unix-like operating systems such as Linux: Run the cat command with no arguments, so it accepts its input from the keyboard and prints output to the screen. Type a few characters without pressing ↵ Enter, then type Ctrl+D. The characters typed to that point are sent to cat, which then writes them to the screen. If Ctrl+D is typed without typing any characters first, the input stream is terminated and the program ends. An actual EOT is obtained by typing Ctrl+V then Ctrl+D.

If the terminal driver is in "raw" mode, it no longer interprets control characters, and the EOT character is sent unchanged to the program, which is free to interpret it any way it likes. A program may then decide to handle the EOT byte as an indication that it should end the text; this would then be similar to how Ctrl+Z is handled by DOS programs.

Usage in mainframe computer system communications protocols

The EOT character is used in legacy communications protocols by mainframe computer manufacturers such as IBM, Burroughs Corporation, and the BUNCH. Terminal transmission control protocols such as IBM 3270 Poll/Select, or Burroughs TD830 Contention Mode protocol use the EOT character to terminate a communications sequence between two cooperating stations (such as a host multiplexer or Input/Output terminal).

A single Poll (ask the station for data) or Select (send data to the station) operation will include two round-trip send-reply operations between the polling station and the station being polled, the final operation being transmission of a single EOT character to the initiating station.

See also

Related Research Articles

<span class="mw-page-title-main">ASCII</span> American character encoding standard

ASCII, abbreviated from American Standard Code for Information Interchange, is a character encoding standard for electronic communication. ASCII codes represent text in computers, telecommunications equipment, and other devices. Because of technical limitations of computer systems at the time it was invented, ASCII has just 128 code points, of which only 95 are printable characters, which severely limited its scope. Modern computer systems have evolved to use Unicode, which has millions of code points, but the first 128 of these are the same as the ASCII set.

<span class="mw-page-title-main">ASCII art</span> Computer art form using text characters

ASCII art is a graphic design technique that uses computers for presentation and consists of pictures pieced together from the 95 printable characters defined by the ASCII Standard from 1963 and ASCII compliant character sets with proprietary extended characters. The term is also loosely used to refer to text-based visual art in general. ASCII art can be created with any text editor, and is often used with free-form languages. Most examples of ASCII art require a fixed-width font such as Courier for presentation.

In computing and telecommunication, a control character or non-printing character (NPC) is a code point in a character set that does not represent a written character or symbol. They are used as in-band signaling to cause effects other than the addition of a symbol to the text. All other characters are mainly graphic characters, also known as printing characters, except perhaps for "space" characters. In the ASCII standard there are 33 control characters, such as code 7, BEL, which rings a terminal bell.

UTF-8 is a variable-length character encoding standard used for electronic communication. Defined by the Unicode Standard, the name is derived from Unicode Transformation Format – 8-bit.

<span class="mw-page-title-main">ANSI escape code</span> Method used for display options on video text terminals

ANSI escape sequences are a standard for in-band signaling to control cursor location, color, font styling, and other options on video text terminals and terminal emulators. Certain sequences of bytes, most starting with an ASCII escape character and a bracket character, are embedded into text. The terminal interprets these sequences as commands, rather than text to display verbatim.

A text file is a kind of computer file that is structured as a sequence of lines of electronic text. A text file exists stored as data within a computer file system. In operating systems such as CP/M and DOS, where the operating system does not keep track of the file size in bytes, the end of a text file is denoted by placing one or more special characters, known as an end-of-file (EOF) marker, as padding after the last line in a text file. On modern operating systems such as Microsoft Windows and Unix-like systems, text files do not contain any special EOF character, because file systems on those operating systems keep track of the file size in bytes. Most text files need to have end-of-line delimiters, which are done in a few different ways depending on operating system. Some operating systems with record-orientated file systems may not use new line delimiters and will primarily store text files with lines separated as fixed or variable length records.

<span class="mw-page-title-main">Newline</span> Special characters in computing signifying the end of a line of text

A newline is a control character or sequence of control characters in character encoding specifications such as ASCII, EBCDIC, Unicode, etc. This character, or a sequence of characters, is used to signify the end of a line of text and the start of a new one.

The null character is a control character with the value zero. It is present in many character sets, including those defined by the Baudot and ITA2 codes, ISO/IEC 646, the C0 control code, the Universal Coded Character Set, and EBCDIC. It is available in nearly all mainstream programming languages. It is often abbreviated as NUL. In 8-bit codes, it is known as a null byte.

In computing, end-of-file (EOF) is a condition in a computer operating system where no more data can be read from a data source. The data source is usually called a file or stream.

uuencoding is a form of binary-to-text encoding that originated in the Unix programs uuencode and uudecode written by Mary Ann Horton at the University of California, Berkeley in 1980, for encoding binary data for transmission in email systems.

A bell character is a device control code originally sent to ring a small electromechanical bell on tickers and other teleprinters and teletypewriters to alert operators at the other end of the line, often of an incoming message. Though tickers punched the bell codes into their tapes, printers generally do not print a character when the bell code is received. Bell codes are usually represented by the label "BEL". They have been used since 1870.

Caret notation is a notation for control characters in ASCII. The notation assigns ^A to control-code 1, sequentially through the alphabet to ^Z assigned to control-code 26 (0x1A). For the control-codes outside of the range 1–26, the notation extends to the adjacent, non-alphabetic ASCII characters.

The C0 and C1 control code or control character sets define control codes for use in text by computer systems that use ASCII and derivatives of ASCII. The codes represent additional information about the text, such as the position of a cursor, an instruction to start a new line, or a message that the text has been received.

tail is a program available on Unix, Unix-like systems, FreeDOS and MSX-DOS used to display the tail end of a text file or piped data.

In computer data, a substitute character (␚) is a control character that is used to pad transmitted data in order to send it in blocks of fixed size, or to stand in place of a character that is recognized to be invalid, erroneous or unrepresentable on a given device. It is also used as an escape sequence in some programming languages.

Binary Synchronous Communication is an IBM character-oriented, half-duplex link protocol, announced in 1967 after the introduction of System/360. It replaced the synchronous transmit-receive (STR) protocol used with second generation computers. The intent was that common link management rules could be used with three different character encodings for messages.

On personal computers with numeric keypads that use Microsoft operating systems, such as Windows, many characters that do not have a dedicated key combination on the keyboard may nevertheless be entered using the Alt code. This is done by pressing and holding the Alt key, then typing a number on the keyboard's numeric keypad that identifies the character and then releasing Alt.

The delete control character is the last character in the ASCII repertoire, with the code 127. It is supposed to do nothing and was designed to erase incorrect characters on paper tape. It is denoted as ^? in caret notation and is U+007F in Unicode.

cat (Unix) Unix command utility

cat is a standard Unix utility that reads files sequentially, writing them to standard output. The name is derived from its function to (con)catenate files. It has been ported to a number of operating systems.

References

  1. 1 2 "end-of-transmission character (EOT)". Federal Standard 1037C . 1996. Archived from the original on 2020-11-23. Retrieved 2009-03-15.
  2. "Control Pictures" (PDF). Archived (PDF) from the original on 2019-01-18. Retrieved 2013-04-06.
  3. "Miscellaneous Technical" (PDF). Archived (PDF) from the original on 2019-12-30. Retrieved 2013-04-07.