Octet (computing)

octet
Unit system: units derived from the bit
Unit of: digital information, data size
Symbol: o
In primary units of information: 1 o = 8 bits

The octet is a unit of digital information in computing and telecommunications that consists of eight bits. The term is often used when byte might be ambiguous, as the byte has historically denoted storage units of a variety of sizes.


The term octad(e) for eight bits is no longer common.[1][2]

Definition

The international standard IEC 60027-2, chapter 3.8.2, states that a byte is an octet of bits. However, the unit byte has historically been platform-dependent and has represented a variety of storage sizes in the history of computing. Due to the influence of several major computer architectures and product lines, the byte became overwhelmingly associated with eight bits. This meaning of byte is codified in such standards as ISO/IEC 80000-13. While byte and octet are often used synonymously, those working with certain legacy systems are careful to avoid the ambiguity.

Octets can be represented using number systems of varying bases such as the hexadecimal, decimal, or octal number systems. The binary value of all eight bits set (or activated) is 11111111₂, equal to the hexadecimal value FF₁₆, the decimal value 255₁₀, and the octal value 377₈. One octet can be used to represent decimal values ranging from 0 to 255.
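
The correspondence can be checked with a short Python sketch (standard library only; the variable name value is chosen purely for illustration):

    # One octet with all eight bits set, shown in each of the bases above.
    value = 0b11111111          # binary literal: all eight bits set

    print(bin(value))           # 0b11111111 (binary)
    print(hex(value))           # 0xff       (hexadecimal FF)
    print(value)                # 255        (decimal)
    print(oct(value))           # 0o377      (octal 377)

    assert value == 0xFF == 255 == 0o377
    assert 0 <= value <= 255    # the range a single octet can represent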

The term octet (symbol: o[nb 1]) is often used when the use of byte might be ambiguous. It is frequently used in the Request for Comments (RFC) publications of the Internet Engineering Task Force to describe storage sizes of network protocol parameters. The earliest example is RFC 635 from 1974. In 2000, Bob Bemer claimed to have earlier proposed the usage of the term octet for "8-bit bytes" when he headed software operations for Cie. Bull in France from 1965 to 1966.[3]

In France, French Canada and Romania, octet is used in common language instead of byte when the eight-bit sense is required; for example, a megabyte (MB) is termed a megaoctet (Mo).

A variable-length sequence of octets, as in Abstract Syntax Notation One (ASN.1), is referred to as an octet string.
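
As a minimal sketch of what such an octet string can look like when serialized, the following Python fragment builds the DER encoding of a short ASN.1 OCTET STRING (tag 0x04, one length octet, then the content octets); the helper name der_octet_string is hypothetical, and long-form lengths are deliberately omitted:

    def der_octet_string(payload: bytes) -> bytes:
        # DER short form only: tag 0x04, one length octet (< 128), then contents.
        if len(payload) >= 128:
            raise ValueError("long-form lengths are beyond this sketch")
        return bytes([0x04, len(payload)]) + payload

    encoded = der_octet_string(bytes([0x01, 0x02, 0x03]))
    print(encoded.hex())   # 0403010203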

Octad

Historically, in Western Europe, the term octad (or octade) was used to specifically denote eight bits,[2][1] a usage no longer common. Early examples of usage exist in British,[2] Dutch and German sources of the 1960s and 1970s, and throughout the documentation of Philips mainframe computers.[1] Similar terms are triad for a grouping of three bits and decade for ten bits.

Unit multiples

Unit multiples of the octet may be formed with SI prefixes and with the binary prefixes (power-of-two prefixes) standardized by the International Electrotechnical Commission in 1998.

SI prefixes
1 kilooctet (ko) = 10³ octets = 1000 octets
1 megaoctet (Mo) = 10⁶ octets = 1000 ko = 1 000 000 octets
1 gigaoctet (Go) = 10⁹ octets = 1000 Mo = 1 000 000 000 octets
1 teraoctet (To) = 10¹² octets = 1000 Go = 1 000 000 000 000 octets
1 petaoctet (Po) = 10¹⁵ octets = 1000 To = 1 000 000 000 000 000 octets
1 exaoctet (Eo) = 10¹⁸ octets = 1000 Po = 1 000 000 000 000 000 000 octets
1 zettaoctet (Zo) = 10²¹ octets = 1000 Eo = 1 000 000 000 000 000 000 000 octets
1 yottaoctet (Yo) = 10²⁴ octets = 1000 Zo = 1 000 000 000 000 000 000 000 000 octets

Binary prefixes
1 kibioctet (Kio, also written Ko, as distinct from ko) = 2¹⁰ octets = 1024 octets
1 mebioctet (Mio) = 2²⁰ octets = 1024 Kio = 1 048 576 octets
1 gibioctet (Gio) = 2³⁰ octets = 1024 Mio = 1 073 741 824 octets
1 tebioctet (Tio) = 2⁴⁰ octets = 1024 Gio = 1 099 511 627 776 octets
1 pebioctet (Pio) = 2⁵⁰ octets = 1024 Tio = 1 125 899 906 842 624 octets
1 exbioctet (Eio) = 2⁶⁰ octets = 1024 Pio = 1 152 921 504 606 846 976 octets
1 zebioctet (Zio) = 2⁷⁰ octets = 1024 Eio = 1 180 591 620 717 411 303 424 octets
1 yobioctet (Yio) = 2⁸⁰ octets = 1024 Zio = 1 208 925 819 614 629 174 706 176 octets
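
A short Python sketch (standard library only, names chosen for the example) makes the difference between the two kinds of multiples concrete, comparing one megaoctet with one mebioctet:

    SI_MEGAOCTET  = 10 ** 6   # 1 Mo  = 1 000 000 octets
    BIN_MEBIOCTET = 2 ** 20   # 1 Mio = 1 048 576 octets

    print(BIN_MEBIOCTET - SI_MEGAOCTET)           # 48576 octets difference
    print(f"{BIN_MEBIOCTET / SI_MEGAOCTET:.6f}")  # 1.048576, i.e. about 4.9% larger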

Use in Internet Protocol addresses

The octet is used in representations of Internet Protocol computer network addresses. [4] An IPv4 address consists of four octets, usually displayed individually as a series of decimal values ranging from 0 to 255, each separated by a full stop (dot). Using octets with all eight bits set, the representation of the highest-numbered IPv4 address is 255.255.255.255.
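
A minimal Python sketch of this dotted-decimal notation, using the highest-numbered address mentioned above (the variable name octets is illustrative):

    octets = bytes([255, 255, 255, 255])      # four octets, every bit set

    print(".".join(str(o) for o in octets))   # 255.255.255.255
    print(int.from_bytes(octets, "big"))      # 4294967295, i.e. 2**32 - 1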

An IPv6 address consists of sixteen octets, displayed in hexadecimal representation (two hexadecimal digits per octet), with pairs of octets (16-bit groups, also known as hextets) separated by colon characters (:) for readability, such as 2001:0db8:0000:0000:0123:4567:89ab:cdef.[5]
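
A minimal Python sketch of this grouping, reproducing the example address above without the zero compression that the text representation of IPv6 addresses also permits:

    octets = bytes.fromhex("20010db8000000000123456789abcdef")   # sixteen octets

    # Pair the octets into eight 16-bit hextets of four hexadecimal digits each.
    hextets = [f"{octets[i]:02x}{octets[i + 1]:02x}" for i in range(0, 16, 2)]
    print(":".join(hextets))   # 2001:0db8:0000:0000:0123:4567:89ab:cdef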

See also

Notes

  1. However, the IEC 80000-13 symbol "o" for the octet can be confused with the suffix "o" used to indicate octal numbers in Intel convention.

References

  1. 1 2 3 "Philips - Philips Data Systems' product range - April 1971" (PDF). Philips. 1971. Archived from the original (PDF) on 2016-03-04. Retrieved 2016-10-03.
  2. 1 2 3 Williams, R. H. (1969-01-01). British Commercial Computer Digest: Pergamon Computer Data Series. Pergamon Press. ISBN   1483122107. 978-1483122106.
  3. Bemer, Robert William (2000-08-08). "Why is a byte 8 bits? Or is it?". Computer History Vignettes. Archived from the original on 2017-04-03. Retrieved 2017-05-15. […] I came to work for IBM, and saw all the confusion caused by the 64-character limitation. Especially when we started to think about word processing, which would require both upper and lower case. […] I even made a proposal (in view of STRETCH, the very first computer I know of with an 8-bit byte) that would extend the number of punch card character codes to 256 […]. So some folks started thinking about 7-bit characters, but this was ridiculous. With IBM's STRETCH computer as background, handling 64-character words divisible into groups of 8 (I designed the character set for it, under the guidance of Dr. Werner Buchholz, the man who DID coin the term "byte" for an 8-bit grouping). […] It seemed reasonable to make a universal 8-bit character set, handling up to 256. In those days my mantra was "powers of 2 are magic". And so the group I headed developed and justified such a proposal […] The IBM 360 used 8-bit characters, although not ASCII directly. Thus Buchholz's "byte" caught on everywhere. I myself did not like the name for many reasons. The design had 8 bits moving around in parallel. But then came a new IBM part, with 9 bits for self-checking, both inside the CPU and in the tape drives. I exposed this 9-bit byte to the press in 1973. But long before that, when I headed software operations for Cie. Bull in France in 1965-66, I insisted that "byte" be deprecated in favor of "octet". […]
  4. Kozierok, Charles M. (2005-09-20) [2001]. "The TCP/IP Guide - Binary Information and Representation: Bits, Bytes, Nibbles, Octets and Characters - Byte versus Octet". 3.0. Archived from the original on 2017-04-03. Retrieved 2017-04-03.
  5. Hinden, R.; Deering, S. (February 2006). IP Version 6 Addressing Architecture. Network Working Group. doi:10.17487/RFC4291. RFC 4291. Draft Standard. Obsoletes RFC 3513. Updated by RFC 5952, 6052, 7136, 7346, 7371 and 8064.