NETDATA


NETDATA is a file format used primarily for data transfer and storage on IBM mainframe systems, although implementations are available for other systems.



Description

NETDATA files are 80-byte card image files containing unloaded file data plus metadata to allow the original file to be reconstituted on the receiving system. A complete NETDATA file consists of a number of control records, followed by data records and terminated by a trailer record. All records have the same format: a length, a flags byte, and then, starting at byte 2, either a control record identifier with its text units or file data.


Control records

Control records have a six-character EBCDIC identifier in bytes 2-7, following the length and flags. They contain a number of self-defining fields called text units. Each text unit consists of a two-byte key identifying the text unit, a two-byte big-endian binary count of the length-data pairs that follow for this key (usually one), and then, for each pair, a two-byte length field followed by text unit data of that length. Implementations are expected to ignore any text unit that is not relevant to the receiving system.
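The text unit layout described above can be sketched as a small parser. This is a minimal illustration, not a reference implementation: it assumes the key and length fields are big-endian like the count, the function name `parse_text_units` is hypothetical, and the key values in the sample are arbitrary placeholders rather than real NETDATA keys.

```python
import struct
from typing import List, Tuple


def parse_text_units(buf: bytes) -> List[Tuple[int, List[bytes]]]:
    """Parse consecutive text units into (key, [data, ...]) pairs.

    Assumed layout per text unit: 2-byte key, 2-byte big-endian count,
    then `count` pairs of (2-byte length, data of that length).
    """
    units = []
    pos = 0
    while pos < len(buf):
        key, count = struct.unpack_from(">HH", buf, pos)
        pos += 4
        values = []
        for _ in range(count):
            (length,) = struct.unpack_from(">H", buf, pos)
            pos += 2
            values.append(buf[pos:pos + length])
            pos += length
        units.append((key, values))
    return units


# Two text units with placeholder keys 0x1111 and 0x2222 (not real keys).
sample = bytes.fromhex(
    "1111" "0001" "0002" "0050"          # one length-data pair, 2 bytes of data
    "2222" "0002" "0001" "C1" "0001" "C2"  # two length-data pairs, 1 byte each
)

for key, values in parse_text_units(sample):
    print(hex(key), values)
```

A receiver that follows the rule about ignoring irrelevant text units could simply skip any key it does not recognize, since every unit carries enough length information to step past it.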


Header Control Record
The header record must be the first record of a NETDATA file. It has the identifier "INMR01". It identifies the sender (node or host, user id, and timestamp), gives the length of the control record segments, and names the target (receiving) node and user id. It may optionally contain a request for acknowledgement of receipt, the version number of the data format, the number of files in the transmission, and a "user parameter string". CMS allows only one file per transmission, but TSO/E and other systems may allow more than one.

File Utility Control Record
This record describes how the file's data is to be reconstituted. Its identifier is "INMR02". Bytes 8-11 contain the big-endian binary number of the file to which this record applies. If there are multiple files in a transmission, they are numbered starting with one. The rest of this record describes the file's format and one or more steps ("utility programs") that must be executed in order to rebuild the file. The text units identify the file's organization (INMDSORG: sequential, partitioned, etc.), its fixed or maximum record length (INMLRECL), its record format (INMRECFM: fixed, variable, etc.), the approximate size of the file (INMSIZE), and the utility program name(s) (INMUTILN). It may also contain the file's block size, creation date, number of directory blocks, name, expiration date, file mode number, last change date, last reference date, member name list (for partitioned datasets), a note file, and a user parameter string.
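As a rough illustration of the byte offsets given above, the following sketch reads the identifier from bytes 2-7 and, for an "INMR02" record, the file number from bytes 8-11. It assumes that Python's `cp037` codec is an acceptable stand-in for the EBCDIC code page in use; the function names are hypothetical, and the length and flags bytes in the sample are zero placeholders that the sketch does not interpret.

```python
import struct


def control_record_id(record: bytes) -> str:
    """Return the six-character identifier in bytes 2-7, decoded from EBCDIC.

    Assumption: code page 037 is used here as a representative EBCDIC variant.
    """
    return record[2:8].decode("cp037")


def inmr02_file_number(record: bytes) -> int:
    """Return the big-endian file number held in bytes 8-11 of an INMR02 record."""
    if control_record_id(record) != "INMR02":
        raise ValueError("not a File Utility Control Record")
    (number,) = struct.unpack_from(">I", record, 8)
    return number


# Sample record: placeholder length and flags bytes, identifier, file number 1.
sample = bytes([0x00, 0x00]) + "INMR02".encode("cp037") + struct.pack(">I", 1)

print(control_record_id(sample))   # identifier decoded from EBCDIC
print(inmr02_file_number(sample))  # file number from bytes 8-11
```

The text units that describe the file's organization, record length, and utility programs would start after byte 11 and are not examined here.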

Data Control Record
The Data Control Record immediately precedes the data and describes its format, similar to the File Utility Control Record. Its identifier is "INMR03". This record is ignored by CMS, but is used by TSO/E. It contains the file's organization (INMDSORG), its record length (INMLRECL), its record format (INMRECFM), and the file size (INMSIZE).

User Control Record
The User Control record can appear at any point in the data stream. Its identifier is "INMR04". If present it is ignored by CMS, but may be used by other systems. It contains only a User Parameter String (INMUSERP).

Trailer Control Record
This record marks the end of the file. Its identifier is "INMR06". No other data is defined for this record.

Acknowledgement Control Record
This record has an id of "INMR07". It is used by the receiving system to acknowledge receipt of a transmission. It contains one of the text units File Name (INMDSNM) or Note File (INMTERM) plus, optionally, the Origin Time Stamp (INMFTIME).

A note file (sometimes called a "PROFS note") "is a short communication, the kind usually done by letter." [2]

Data records

Data records (identified by their flag value) follow the Data Control Record, if present, and precede the Trailer Control Record. Records can be any size up to INMLRECL. They are sent as multiple segments of up to 253 bytes, split into 80-byte records for transmission, and reassembled by the receiver. Settings of the flags byte in each segment mark it as the beginning, the end, or the whole of a record of the file. Bytes of a record can contain any bit pattern; no character values are reserved.
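The segmentation and reassembly described above might look like the sketch below. Several details are assumptions not stated in this article: the flag bit values (`0x80` first segment, `0x40` last segment, `0x20` control record), a one-byte length that counts the length and flags bytes themselves, segments packed back to back across 80-byte card images, and zero padding in the final card. All function names are hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple

# Assumed flag bit values; check them against the format reference before use.
FIRST, LAST, CONTROL = 0x80, 0x40, 0x20


def split_record(data: bytes, flags: int = 0, max_payload: int = 253) -> bytes:
    """Encode one record as one or more segments of up to 253 data bytes."""
    chunks = [data[i:i + max_payload] for i in range(0, len(data), max_payload)] or [b""]
    out = bytearray()
    for n, chunk in enumerate(chunks):
        seg_flags = flags
        if n == 0:
            seg_flags |= FIRST
        if n == len(chunks) - 1:
            seg_flags |= LAST
        # Assumed: the length byte counts itself and the flags byte.
        out += bytes([len(chunk) + 2, seg_flags]) + chunk
    return bytes(out)


def to_cards(stream: bytes, width: int = 80) -> List[bytes]:
    """Cut the segment stream into fixed 80-byte card images, zero padded."""
    padded = stream + b"\x00" * (-len(stream) % width)
    return [padded[i:i + width] for i in range(0, len(padded), width)]


def reassemble(cards: Iterable[bytes]) -> Iterator[Tuple[bool, bytes]]:
    """Yield (is_control, record) for each complete record in the card images."""
    stream = b"".join(cards)
    pos = 0
    current = bytearray()
    while pos < len(stream):
        length = stream[pos]
        if length < 2:  # cannot be a segment, so treat it as trailing padding
            break
        flags = stream[pos + 1]
        if flags & FIRST:
            current = bytearray()
        current += stream[pos + 2:pos + length]
        pos += length
        if flags & LAST:
            yield bool(flags & CONTROL), bytes(current)


big = bytes(range(256)) * 2  # 512 bytes, so it needs three segments
cards = to_cards(split_record(b"abc") + split_record(big))
records = list(reassemble(cards))
print(len(cards), [len(r) for _, r in records])
```

Because each segment carries its own length, the payload never needs escaping, which is consistent with the statement that record bytes can contain any bit pattern.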


References

  1. IBM Corporation. "z/VM: CMS Macros and Functions Reference". IBM Knowledge Center. Retrieved Sep 5, 2019.
  2. IBM Corporation. "z/VM: CMS Commands and Utilities Reference". Retrieved Sep 6, 2019.