Bitstream

Last updated

A bitstream (or bit stream), also known as binary sequence, is a sequence of bits.

Contents

A bytestream is a sequence of bytes. Typically, each byte is an 8-bit quantity, and so the term octet stream is sometimes used interchangeably. An octet may be encoded as a sequence of 8 bits in multiple different ways (see bit numbering) so there is no unique and direct translation between bytestreams and bitstreams.

Bitstreams and bytestreams are used extensively in telecommunications and computing. For example, synchronous bitstreams are carried by SONET, and Transmission Control Protocol transports an asynchronous bytestream.

Relationship to bytestreams

In practice, bitstreams are not used directly to encode bytestreams; a communication channel may use a signalling method that does not directly translate to bits (for instance, by transmitting signals of multiple frequencies) and typically also encodes other information such as framing and error correction together with its data.[ citation needed ]

Examples

The term bitstream is frequently used to describe the configuration data to be loaded into a field-programmable gate array (FPGA). Although most FPGAs also support a byte-parallel loading method as well, this usage may have originated based on the common method of configuring the FPGA from a serial bit stream, typically from a serial PROM or flash memory chip. The detailed format of the bitstream for a particular FPGA is typically proprietary to the FPGA vendor.

In mathematics, several specific infinite sequences of bits have been studied for their mathematical properties; these include the Baum–Sweet sequence, Ehrenfeucht–Mycielski sequence, Fibonacci word, Kolakoski sequence, regular paperfolding sequence, Rudin–Shapiro sequence, and Thue–Morse sequence.

On most operating systems, including Unix-like and Windows, standard I/O libraries convert lower-level paged or buffered file access to a bytestream paradigm. In particular, in Unix-like operating systems, each process has three standard streams, which are examples of unidirectional bytestreams. The Unix pipe mechanism provides bytestream communications between different processes.

Compression algorithms often code in bitstreams, as the 8 bits offered by a byte (the smallest addressable unit of memory) may be wasteful. Although typically implemented in low-level languages, some high-level languages such as Python [1] and Java [2] offer native interfaces for bitstream I/O.

One well-known example of a communication protocol which provides a byte-stream service to its clients is the Transmission Control Protocol (TCP) of the Internet protocol suite, which provides a bidirectional bytestream.

The Internet media type for an arbitrary bytestream is application/octet-stream. Other media types are defined for bytestreams in well-known formats.

Flow control

Often the contents of a bytestream are dynamically created, such as the data from the keyboard and other peripherals (/dev/tty), data from the pseudorandom number generator (/dev/urandom), etc.

In those cases, when the destination of a bytestream (the consumer) uses bytes faster than they can be generated, the system uses process synchronization to make the destination wait until the next byte is available.

When bytes are generated faster than the destination can use them and the producer is a software algorithm, the system pauses it with the same process synchronization techniques. When the producer supports flow control, the system only sends the ready signal when the consumer is ready for the next byte. When the producer can not be paused—a keyboard or some hardware that does not support flow control—the system typically attempts to temporarily store the data until the consumer is ready for it, typically using a queue. Often the receiver can empty the buffer before it gets completely full. A producer that continues to produce data faster than it can be consumed, even after the buffer is full, leads to unwanted buffer overflow, packet loss, network congestion, and denial of service.

See also

Related Research Articles

In telecommunications, asynchronous communication is transmission of data, generally without the use of an external clock signal, where data can be transmitted intermittently rather than in a steady stream. Any timing required to recover data from the communication symbols is encoded within the symbols.

The Real-time Transport Protocol (RTP) is a network protocol for delivering audio and video over IP networks. RTP is used in communication and entertainment systems that involve streaming media, such as telephony, video teleconference applications including WebRTC, television services and web-based push-to-talk features.

The Transmission Control Protocol (TCP) is one of the main protocols of the Internet protocol suite. It originated in the initial network implementation in which it complemented the Internet Protocol (IP). Therefore, the entire suite is commonly referred to as TCP/IP. TCP provides reliable, ordered, and error-checked delivery of a stream of octets (bytes) between applications running on hosts communicating via an IP network. Major internet applications such as the World Wide Web, email, remote administration, and file transfer rely on TCP, which is part of the Transport layer of the TCP/IP suite. SSL/TLS often runs on top of TCP.

UTF-8 is a variable-length character encoding standard used for electronic communication. Defined by the Unicode Standard, the name is derived from Unicode Transformation Format – 8-bit.

<span class="mw-page-title-main">Synchronous optical networking</span> Standardized protocol

Synchronous Optical Networking (SONET) and Synchronous Digital Hierarchy (SDH) are standardized protocols that transfer multiple digital bit streams synchronously over optical fiber using lasers or highly coherent light from light-emitting diodes (LEDs). At low transmission rates data can also be transferred via an electrical interface. The method was developed to replace the plesiochronous digital hierarchy (PDH) system for transporting large amounts of telephone calls and data traffic over the same fiber without the problems of synchronization.

A frame is a digital data transmission unit in computer networking and telecommunication. In packet switched systems, a frame is a simple container for a single network packet. In other telecommunications systems, a frame is a repeating structure supporting time-division multiplexing.

In telecommunications and computer networking, a network packet is a formatted unit of data carried by a packet-switched network. A packet consists of control information and user data; the latter is also known as the payload. Control information provides data for delivering the payload. Typically, control information is found in packet headers and trailers.

<span class="mw-page-title-main">Universal asynchronous receiver-transmitter</span> Computer hardware device

A Universal Asynchronous Receiver-Transmitter is a peripheral device for asynchronous serial communication in which the data format and transmission speeds are configurable. It sends data bits one by one, from the least significant to the most significant, framed by start and stop bits so that precise timing is handled by the communication channel. The electric signaling levels are handled by a driver circuit external to the UART. Common signal levels are RS-232, RS-485, and raw TTL for short debugging links. Early teletypewriters used current loops.

Abstract Syntax Notation One (ASN.1) is a standard interface description language (IDL) for defining data structures that can be serialized and deserialized in a cross-platform way. It is broadly used in telecommunications and computer networking, and especially in cryptography.

In computing, Deflate is a lossless data compression file format that uses a combination of LZ77 and Huffman coding. It was designed by Phil Katz, for version 2 of his PKZIP archiving tool. Deflate was later specified in RFC 1951 (1996).

AES3 is a standard for the exchange of digital audio signals between professional audio devices. An AES3 signal can carry two channels of pulse-code-modulated digital audio over several transmission media including balanced lines, unbalanced lines, and optical fiber.

In computer programming, Base64 is a group of binary-to-text encoding schemes that transforms binary data into a sequence of printable characters, limited to a set of 64 unique characters. More specifically, the source binary data is taken 6 bits at a time, then this group of 6 bits is mapped to one of 64 unique characters.

<span class="mw-page-title-main">Newline</span> Special characters in computing signifying the end of a line of text

A newline is a control character or sequence of control characters in character encoding specifications such as ASCII, EBCDIC, Unicode, etc. This character, or a sequence of characters, is used to signify the end of a line of text and the start of a new one.

uuencoding is a form of binary-to-text encoding that originated in the Unix programs uuencode and uudecode written by Mary Ann Horton at the University of California, Berkeley in 1980, for encoding binary data for transmission in email systems.

In software engineering, a pipeline consists of a chain of processing elements, arranged so that the output of each element is the input of the next; the name is by analogy to a physical pipeline. Usually some amount of buffering is provided between consecutive elements. The information that flows in these pipelines is often a stream of records, bytes, or bits, and the elements of a pipeline may be called filters; this is also called the pipe(s) and filters design pattern. Connecting elements into a pipeline is analogous to function composition.

A binary-to-text encoding is encoding of data in plain text. More precisely, it is an encoding of binary data in a sequence of printable characters. These encodings are necessary for transmission of data when the communication channel does not allow binary data or is not 8-bit clean. PGP documentation uses the term "ASCII armor" for binary-to-text encoding when referring to Base64.

In computer networks, a syncword, sync character, sync sequence or preamble is used to synchronize a data transmission by indicating the end of header information and the start of data. The syncword is a known sequence of data used to identify the start of a frame, and is also called reference signal or midamble in wireless communications.

In computer networking, an Ethernet frame is a data link layer protocol data unit and uses the underlying Ethernet physical layer transport mechanisms. In other words, a data unit on an Ethernet link transports an Ethernet frame as its payload.

A variable-length quantity (VLQ) is a universal code that uses an arbitrary number of binary octets to represent an arbitrarily large integer. A VLQ is essentially a base-128 representation of an unsigned integer with the addition of the eighth bit to mark continuation of bytes. VLQ is identical to LEB128 except in endianness. See the example below.

ARINC 818: Avionics Digital Video Bus (ADVB) is a video interface and protocol standard developed for high bandwidth, low-latency, uncompressed digital video transmission in avionics systems. The standard, which was released in January 2007, has been advanced by ARINC and the aerospace community to meet the stringent needs of high performance digital video. The specification was updated and ARINC 818-2 was released in December 2013, adding a number of new features, including link rates up to 32X fibre channel rates, channel-bonding, switching, field sequential color, bi-directional control and data-only links.

References

  1. "Bitstream". Python Software Foundation. Archived from the original on 2016-09-08.
  2. "Class BitSet". Oracle. Archived from the original on 2016-11-30.