ANPA-1312

ANPA-1312 is a 7-bit news agency text markup specification published by the Newspaper Association of America, designed to standardize the content and structure of text news articles.

It was last modified in 1989 and is still the most common method of transmitting news to newspapers, web sites and broadcasters from news agencies in North and South America. Although the specification provides for 1200 bit-per-second transmission speeds, modern transmission technology removes any speed limitations.

Using fixed metadata fields and a series of control and other special characters, ANPA-1312 was designed to feed text stories to both teleprinters and computer-based news editing systems.

Although the specification was based on the 7-bit ASCII character set, some characters were redefined as traditional newspaper characters, e.g. small fractions and typesetting codes. As such, it was a bridge between older typesetting methods, newspaper traditions and newer technology.

Perhaps the best known part of ANPA-1312 was the category code system, which allowed articles to be categorized by a single letter. For example, sports articles were assigned category S, and articles about politics were assigned P. Many newspapers found the system convenient and sorted both incoming news agency and staff articles by ANPA-1312 categories.
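The sorting practice described above can be illustrated with a minimal sketch. It assumes stories arrive as (category code, headline) pairs, which is a simplification rather than the actual ANPA-1312 header layout. The function name `sort_by_category` and the sample headlines are hypothetical, and only the two codes named in the text (S and P) are used.

```python
from collections import defaultdict

# Only the two category letters mentioned in the text; the full code
# list is defined by the specification and is not reproduced here.
CATEGORY_NAMES = {"S": "sports", "P": "politics"}


def sort_by_category(stories):
    """Group (category_code, headline) pairs by single-letter category."""
    bins = defaultdict(list)
    for code, headline in stories:
        # Normalizing case is an assumption made for this sketch.
        bins[code.upper()].append(headline)
    return dict(bins)


if __name__ == "__main__":
    wire = [("S", "Cup final tonight"), ("P", "Budget vote delayed"), ("s", "Transfer news")]
    for code, headlines in sort_by_category(wire).items():
        print(code, CATEGORY_NAMES.get(code, "unknown"), headlines)
```

A newsroom system could apply the same grouping to both agency and staff copy, as the paragraph above describes.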

Superseded in the early 1990s by the IPTC Information Interchange Model and later by the XML-based News Industry Text Format, ANPA-1312 remained popular in North America, due in part to its widespread support by The Associated Press and the reluctance of newspapers to invest in new computers or software modifications. The Associated Press retired ANPA as a delivery option in 2023.

A modified version, under the same name, was implemented by several news agencies after the vendor of some early computer systems altered the specification for its own purposes.

An international standard, IPTC 7901, is widely used in Europe and is closely related to ANPA-1312.

C0 control codes

ANPA-1312 redefines several of the ASCII C0 control characters, as listed below.[1]

| Seq | Dec | Hex | Replaced | Abbrev | Name | Description |
|-----|-----|-----|----------|--------|------|-------------|
| ^I | 09 | 09 | HT | FO | Formatting | Used in tabular data to move to the next tabulation position (retaining "Tab" semantics in this regard), and in standard formats to denote the next phase. The current IPTC specification instead recommends using regular ASCII C0 controls, and using the US control as a column break in tables. |
| ^K | 11 | 0B | VT | ECD | End of Instruction | Delimits the end of a typographical instruction intended for the typesetting device. |
| ^L | 12 | 0C | FF | SCD | Start of Instruction | Delimits the start of a typographical instruction intended for the typesetting device. |
| ^M | 13 | 0D | CR | QL | Quad Left | Terminates a line, indicating that it should be left-aligned. The current IPTC specification instead recommends using the < CR LF sequence. |
| ^N | 14 | 0E | SO | UR | Upper Rail | Starts an emphasised region of text. Used in Scandinavian journalistic text transmission as of 1975;[1] IPTC recommendations as of 1976 used FT2 and FT3 instead. The current IPTC specification instead recommends using regular ASCII C0 controls, and marking up this function with the ^ character. |
| ^O | 15 | 0F | SI | LR | Lower Rail | Ends an emphasised region of text. Used in Scandinavian journalistic text transmission as of 1975;[1] IPTC recommendations as of 1976 used FT1 instead. The current IPTC specification instead recommends marking up this function with the @ character. |
| ^X | 24 | 18 | CAN | KW | Kill Word | Deletes the preceding word (deletes back to and including the last space, or back to and excluding the previous line break, whichever it encounters first). Retains "Cancel" semantics in this respect, but has a more specific function. |
| ^\ | 28 | 1C | FS | SS | Super Shift | Non-locking shift code. |
| ^] | 29 | 1D | GS | QC | Quad Centre | Terminates a line, indicating that it should be centred. |
| ^^ | 30 | 1E | RS | QR | Quad Right | Terminates a line, indicating that it should be right-aligned. |
| ^_ | 31 | 1F | US | JY | Justify | Terminates a line which is to be justified. |
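As an illustration of how a receiver might interpret the controls in the table above, the following minimal sketch splits a byte stream into lines at the four quad/justify codes and applies Kill Word as described. The names `ANPA_C0`, `LINE_ENDS`, `kill_word` and `split_lines` are hypothetical, as is the choice to render the remaining controls as visible tags such as `<UR>`. It is not taken from the specification and ignores header fields and typesetting instructions.

```python
# ANPA-1312 reinterpretations of ASCII C0 controls, copied from the table.
ANPA_C0 = {
    0x09: ("FO", "Formatting"),
    0x0B: ("ECD", "End of Instruction"),
    0x0C: ("SCD", "Start of Instruction"),
    0x0D: ("QL", "Quad Left"),
    0x0E: ("UR", "Upper Rail"),
    0x0F: ("LR", "Lower Rail"),
    0x18: ("KW", "Kill Word"),
    0x1C: ("SS", "Super Shift"),
    0x1D: ("QC", "Quad Centre"),
    0x1E: ("QR", "Quad Right"),
    0x1F: ("JY", "Justify"),
}

# Controls that terminate a line, with the alignment each one requests.
LINE_ENDS = {0x0D: "left", 0x1D: "centre", 0x1E: "right", 0x1F: "justify"}


def kill_word(line: str) -> str:
    """Delete back to and including the last space in the current line.

    `line` holds only text since the previous line break, so when no
    space is found everything back to that break is removed.
    """
    cut = line.rfind(" ")
    return line[:cut] if cut >= 0 else ""


def split_lines(data: bytes) -> list[tuple[str, str]]:
    """Return (text, alignment) pairs for each line in `data`."""
    lines = []
    buf = ""
    for byte in data:
        if byte in LINE_ENDS:
            lines.append((buf, LINE_ENDS[byte]))
            buf = ""
        elif byte == 0x18:
            buf = kill_word(buf)
        elif byte in ANPA_C0:
            # Assumption: show other controls as tags instead of acting on them.
            buf += f"<{ANPA_C0[byte][0]}>"
        else:
            buf += chr(byte)
    if buf:
        # Trailing text with no terminating control code.
        lines.append((buf, "unterminated"))
    return lines


if __name__ == "__main__":
    sample = b"HEADLINE\x1dfirst wrold\x18 word\x0d"
    print(split_lines(sample))
```

Keeping the current line in its own buffer means Kill Word never reaches past a line break, which matches the "whichever it encounters first" rule in the table.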

References

  1. Sveriges Standardiseringskommission (1975). NATS Control set for newspaper text transmission (PDF). ITSCJ/IPSJ. ISO-IR-7.