Bit slicing

Bit slicing is a technique for constructing a processor from modules of smaller bit width in order to increase the word length; in theory, a central processing unit (CPU) of arbitrary width n can be built this way. Each component module processes one bit field, or "slice", of an operand. Grouped together, the modules can process the full word length required by a given software design.

Bit slicing more or less died out due to the advent of the microprocessor. Recently it has been used in arithmetic logic units (ALUs) for quantum computers and as a software technique, e.g. for cryptography in x86 CPUs. [1]

Operational details

Bit-slice processors (BSPs) usually include a 1-, 2-, 4-, 8- or 16-bit arithmetic logic unit (ALU) and control lines (including carry or overflow signals, which are internal to the processor in non-bit-sliced CPU designs).

For example, two 4-bit ALU chips could be arranged side by side, with control lines between them, to form an 8-bit ALU. The resulting width need not be a power of two: three 1-bit units can make a 3-bit ALU, [2] and thus a 3-bit (or n-bit) CPU, although no 3-bit CPU, nor any CPU with a larger odd number of bits, has been manufactured and sold in volume. Four 4-bit ALU chips could be used to build a 16-bit ALU, and it would take eight chips to build a 32-bit ALU. The designer could add as many slices as required to handle longer word lengths.

A microsequencer or control ROM would be used to execute logic that provides the data and control signals regulating the function of the component ALUs.
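The cascading of slices described above can be illustrated with a small software model. This is only a sketch under simplifying assumptions: each slice performs addition alone, the carry ripples from one slice to the next, and the names `alu_slice_add` and `sliced_add` are hypothetical rather than taken from any real chip's documentation.

```python
def alu_slice_add(a, b, carry_in, width=4):
    """Model one slice: add two width-bit fields plus an incoming carry."""
    mask = (1 << width) - 1
    total = (a & mask) + (b & mask) + carry_in
    # Low bits are the slice's result; the bit above them is the carry-out.
    return total & mask, total >> width


def sliced_add(x, y, slices=4, width=4):
    """Chain slices from least to most significant, passing the carry along."""
    mask = (1 << width) - 1
    result, carry = 0, 0
    for i in range(slices):
        shift = i * width
        part, carry = alu_slice_add((x >> shift) & mask,
                                    (y >> shift) & mask,
                                    carry, width)
        result |= part << shift
    return result, carry


# Four 4-bit slices behave like a 16-bit adder.
total, carry_out = sliced_add(0x1234, 0x0FFF)
print(hex(total), carry_out)
```

Changing `slices` and `width` models other configurations, such as eight 4-bit slices for a 32-bit word or three 1-bit slices for a 3-bit ALU. Real bit-slice parts typically also offered logic operations, shifts and faster carry schemes, none of which this sketch attempts.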

Known bit-slice microprocessors:

U830C

Historical necessity

Bit slicing, although not called that at the time, was also used in computers before large-scale integrated circuits (LSI, the predecessor to today's VLSI, or very-large-scale integration circuits). The first bit-sliced machine was EDSAC 2, built at the University of Cambridge Mathematical Laboratory in 1956–1958.

From the mid-1970s to the late 1980s there was some debate over how much bus width a computer system needed in order to function. Silicon chip technology and parts were much more expensive than today. Using multiple simpler, and thus less expensive, ALUs was seen as a way to increase computing power in a cost-effective manner. While 32-bit microprocessors were being discussed at the time, few were in production.

The UNIVAC 1100 series mainframes (one of the oldest series, originating in the 1950s) has a 36-bit architecture, and the 1100/60 introduced in 1979 used nine Motorola MC10800 4-bit ALU [12] chips to implement the needed word width while using modern integrated circuits. [16]

At the time 16-bit processors were common but expensive, and 8-bit processors, such as the Z80, were widely used in the nascent home-computer market.

Combining components to produce bit-slice products allowed engineers and students to create more powerful and complex computers at a more reasonable cost, using off-the-shelf components that could be custom-configured. The complexities of creating a new computer architecture were greatly reduced when the details of the ALU were already specified (and debugged).

The main advantage was that bit slicing made it economically possible in smaller processors to use bipolar transistors, which switch much faster than NMOS or CMOS transistors. This allowed much higher clock rates, where speed was needed (for example, for DSP functions or matrix transformation) or, as in the Xerox Alto, the combination of flexibility and speed, before discrete CPUs were able to deliver that.

Modern use

Software use on non-bit-slice hardware

In more recent times, the term bit slicing was reused by Matthew Kwan [17] to refer to the technique of using a general-purpose CPU to implement multiple parallel simple virtual machines using general logic instructions to perform single-instruction multiple-data (SIMD) operations. This technique is also known as SIMD within a register (SWAR).

This was initially in reference to Eli Biham's 1997 article A Fast New DES Implementation in Software, [18] which achieved significant gains in performance of DES by using this method.
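The software technique can be sketched in a few lines. The example below is an illustrative assumption, not code from Kwan or Biham: it transposes a list of small integers so that each machine word holds one bit position from every input (one input per "lane"), then adds all lanes at once using only AND, OR and XOR. The helper names `to_slices`, `from_slices` and `bitsliced_add` are hypothetical.

```python
def to_slices(values, bits):
    """Transpose inputs: slice i packs bit i of every lane's value."""
    slices = []
    for i in range(bits):
        word = 0
        for lane, v in enumerate(values):
            word |= ((v >> i) & 1) << lane
        slices.append(word)
    return slices


def from_slices(slices, lanes):
    """Inverse transpose: rebuild each lane's integer from the slices."""
    return [sum(((s >> lane) & 1) << i for i, s in enumerate(slices))
            for lane in range(lanes)]


def bitsliced_add(a, b):
    """Ripple-carry addition on every lane at once, modulo 2**len(a)."""
    carry = 0
    out = []
    for x, y in zip(a, b):
        out.append(x ^ y ^ carry)            # sum bit for all lanes
        carry = (x & y) | (carry & (x ^ y))  # carry bit for all lanes
    return out


xs = [3, 7, 12, 15]
ys = [1, 9, 5, 15]
sums = from_slices(bitsliced_add(to_slices(xs, 4), to_slices(ys, 4)), len(xs))
print(sums)  # 4-bit sums of each pair, wrapping modulo 16
```

Each logical instruction acts as a bank of one-bit gates working on all lanes in parallel, so a 64-bit register could in principle process 64 independent inputs per instruction. Python integers are unbounded, so the lane count here is limited only by the length of the input list.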

Bit-sliced quantum computers

To simplify the circuit structure and reduce the hardware cost of quantum computers (proposed to run the MIPS32 instruction set), a 50 GHz superconducting "4-bit bit-slice arithmetic logic unit (ALU) for 32-bit rapid single-flux-quantum microprocessors was demonstrated". [19]

References

  1. Benadjila, Ryad; Guo, Jian; Lomné, Victor; Peyrin, Thomas (2014-03-21) [2013-07-15]. "Implementing Lightweight Block Ciphers on x86 Architectures". Cryptology Archive. Report 2013/445. Archived from the original on 2017-08-17. Retrieved 2019-12-28.
  2. "How to Create a 1-bit ALU". www.cs.umd.edu. Archived from the original on 2017-05-08. […] Here's how you would put three 1-bit ALU to create a 3-bit ALU […]
  3. "3002 - The CPU Shack Museum". cpushack.com. Retrieved 2017-11-05.
  4. "Technology Leadership - Bipolar Microprocessor" (PDF). Signetics. S2.95. Retrieved 2021-10-11.
  5. "IMP-4 - National Semiconductor". en.wikichip.org. Retrieved 2017-11-05.
  6. Klar, Rainer (1989) [1988-10-01]. "5.2 Der Mikroprozessor, ein Universal-Rechenautomat". Digitale Rechenautomaten – Eine Einführung in die Struktur von Computerhardware [Digital Computers – An Introduction into the structure of computer hardware]. Sammlung Göschen (in German). Vol. 2050 (4th reworked ed.). Berlin, Germany: Walter de Gruyter & Co. p. 198. ISBN 3-11011700-2. (320 pages)
  7. "6701 - The CPU Shack Museum". cpushack.com. Retrieved 2017-11-05.
  8. "5700/6700 - Monolithic Memories". en.wikichip.org. Retrieved 2017-11-05.
  9. "File:MMI 5701-6701 MCU (August, 1974).pdf" (PDF). en.wikichip.org. Retrieved 2017-11-05.
  10. "5701/6701 4-Bit Expandable Bipolar Microcontroller Aug74" (PDF). Retrieved 2021-05-24.
  11. "SN74S481". The CPU Shack Museum. Retrieved 2017-11-05.
  12. Mueller, Dieter (2012). "The MC10800". 6502.org. Archived from the original on 2018-07-18. Retrieved 2017-11-05.
  13. Kurth, Rüdiger; Groß, Martin; Hunger, Henry, eds. (2021-09-27) [2006]. "Integrierte Schaltkreise" [Integrated Circuits]. robotrontechnik.de (in German). Archived from the original on 2021-12-03. Retrieved 2021-12-07.
  14. Oppelt, Dirk (2016). "Eastern Bloc DEC PDP". cpu-collection.de. Nuremberg, Germany. Archived from the original on 2016-08-09. Retrieved 2021-12-07.
  15. Salomon, Peter (2007-06-25). "Einsatzgebiete des U830C und Chipsatz" [Applications of the U830C and chipset]. Robotrontechnik-Forum (in German). Archived from the original on 2019-11-10. Retrieved 2021-12-07.
  16. "Computers Sperry Univac 1100/60 System" (PDF). Delran, NJ, USA: Datapro Research Corporation. January 1983. 70C-877-12. Retrieved 2021-10-11.
  17. "Bitslice DES". darkside.com.au. Retrieved 2017-11-05.
  18. Biham, Eli (1997). "A Fast New DES Implementation in Software". cs.technion.ac.il. Retrieved 2017-11-05.
  19. Tang, Guang-Ming; Takata, Kensuke; Tanaka, Masamitsu; Fujimaki, Akira; Takagi, Kazuyoshi; Takagi, Naofumi (January 2016) [2015-12-09]. "4-bit Bit-Slice Arithmetic Logic Unit for 32-bit RSFQ Microprocessors". IEEE Transactions on Applied Superconductivity. 26 (1): 2507125. Bibcode:2016ITAS...2607125T. doi:10.1109/TASC.2015.2507125. S2CID 25478156. 1300106. […] 4-bit bit-slice arithmetic logic unit (ALU) for 32-bit rapid single-flux-quantum microprocessors was demonstrated. The proposed ALU covers all of the ALU operations for the MIPS32 instruction set. […] It consists of 3481 Josephson junctions with an area of 3.09 × 1.66 mm². It achieved the target frequency of 50 GHz and a latency of 524 ps for a 32-bit operation, at the designed DC bias voltage of 2.5 mV […] Another 8-bit parallel ALU has been designed and fabricated with target processing frequency of 30 GHz […] To achieve comparable performance to CMOS parallel microprocessors operating at 2–3 GHz, 4-bit bit-slice processing should be performed with a clock frequency of several tens of gigahertz. Several bit-serial arithmetic circuits have been successfully demonstrated with high-speed clocks of above 50 GHz […]

Further reading