Microcode is a computer hardware technique that interposes a layer of organisation between the CPU hardware and the programmer-visible instruction set architecture of the computer. As such, the microcode is a layer of hardware-level instructions that implement higher-level machine code instructions or internal state machine sequencing in many digital processing elements. Microcode is used in general-purpose central processing units, although in current desktop CPUs it is only a fallback path for cases that the faster hardwired control unit cannot handle.
Microcode typically resides in special high-speed memory and translates machine instructions, state machine data or other input into sequences of detailed circuit-level operations. It separates the machine instructions from the underlying electronics so that instructions can be designed and altered more freely. It also facilitates the building of complex multi-step instructions, while reducing the complexity of computer circuits. Writing microcode is often called microprogramming and the microcode in a particular processor implementation is sometimes called a microprogram.
More extensive microcoding allows small and simple microarchitectures to emulate more powerful architectures with wider word length, more execution units and so on, which is a relatively simple way to achieve software compatibility between different products in a processor family.
Some hardware vendors, especially IBM, use the term microcode as a synonym for firmware. In that usage, all code within a device is termed microcode regardless of whether it is microcode or machine code; for example, hard disk drives are said to have their microcode updated, though they typically contain both microcode and firmware.
The lowest layer in a computer's software stack is traditionally raw binary machine code instructions for the processor. Microcode sits one level below this. To avoid confusion, each microprogram-related element is differentiated by the micro prefix: microinstruction, microassembler, microprogrammer, microarchitecture, etc.
Engineers normally write the microcode during the design phase of a processor, storing it in a read-only memory (ROM) or programmable logic array (PLA) structure, or in a combination of both. However, machines also exist that have some or all microcode stored in SRAM or flash memory. In the context of computers, the memory holding the microcode is traditionally called the control store, and it can be either read-only or read-write memory; a read-write control store is called a writable control store. In the latter case, the CPU initialization process loads microcode into the control store from another storage medium, with the possibility of altering the microcode to correct bugs in the instruction set, or to implement new machine instructions.
Complex digital processors may also employ more than one (possibly microcode-based) control unit in order to delegate sub-tasks that must be performed essentially asynchronously in parallel. A high-level programmer, or even an assembly programmer, does not normally see or change microcode. Unlike machine code, which often retains some backward compatibility among different processors in a family, microcode only runs on the exact electronic circuitry for which it is designed, as it constitutes an inherent part of the particular processor design itself.
Microprograms consist of series of microinstructions, which control the CPU at a very fundamental level of hardware circuitry. For example, a single typical horizontal microinstruction might specify the following operations:
To control all of the processor's features simultaneously in one cycle, the microinstruction is often wider than 50 bits; e.g., 128 bits on a 360/85 with an emulator feature. Microprograms are carefully designed and optimized for the fastest possible execution, as a slow microprogram would result in a slow machine instruction and degraded performance for related application programs that use such instructions.
Microcode was originally developed as a simpler method of developing the control logic for a computer. Initially, CPU instruction sets were hardwired. Each step needed to fetch, decode, and execute the machine instructions (including any operand address calculations, reads, and writes) was controlled directly by combinational logic and rather minimal sequential state machine circuitry. While such hard-wired processors were very efficient, the need for powerful instruction sets with multi-step addressing and complex operations (see below) made them difficult to design and debug; highly encoded and varied-length instructions can contribute to this as well, especially when very irregular encodings are used.
Microcode simplified the job by allowing much of the processor's behaviour and programming model to be defined via microprogram routines rather than by dedicated circuitry. Even late in the design process, microcode could easily be changed, whereas hard-wired CPU designs were very cumbersome to change. Thus, this greatly facilitated CPU design.
From the 1940s to the late 1970s, a large portion of programming was done in assembly language; higher-level instructions mean greater programmer productivity, so an important advantage of microcode was the relative ease with which powerful machine instructions can be defined. The ultimate extension of this is "Directly Executable High Level Language" designs, in which each statement of a high-level language such as PL/I is entirely and directly executed by microcode, without compilation. The IBM Future Systems project and Data General Fountainhead Processor are examples of this. During the 1970s, CPU speeds grew more quickly than memory speeds, and numerous techniques such as memory block transfer, memory pre-fetch and multi-level caches were used to alleviate this. High-level machine instructions, made possible by microcode, helped further, as fewer, more complex machine instructions require less memory bandwidth. For example, an operation on a character string can be done as a single machine instruction, thus avoiding multiple instruction fetches.
Architectures with instruction sets implemented by complex microprograms included the IBM System/360 and Digital Equipment Corporation VAX. The approach of increasingly complex microcode-implemented instruction sets was later called CISC. An alternate approach, used in many microprocessors, is to use PLAs or ROMs (instead of combinational logic) mainly for instruction decoding, and let a simple state machine (without much, or any, microcode) do most of the sequencing. The MOS Technology 6502 is an example of a microprocessor using a PLA for instruction decode and sequencing. The PLA is visible in photomicrographs of the chip, and its operation can be seen in the transistor-level simulation.
Microprogramming is still used in modern CPU designs. In some cases, after the microcode is debugged in simulation, logic functions are substituted for the control store. [citation needed] Logic functions are often faster and less expensive than the equivalent microprogram memory.
A processor's microprograms operate on a more primitive, totally different, and much more hardware-oriented architecture than the assembly instructions visible to normal programmers. In coordination with the hardware, the microcode implements the programmer-visible architecture. The underlying hardware need not have a fixed relationship to the visible architecture. This makes it easier to implement a given instruction set architecture on a wide variety of underlying hardware micro-architectures.
The IBM System/360 has a 32-bit architecture with 16 general-purpose registers, but most of the System/360 implementations actually use hardware that implements a much simpler underlying microarchitecture. For example, the System/360 Model 30 has 8-bit data paths to the arithmetic logic unit (ALU) and main memory and implements the general-purpose registers in a special unit of higher-speed core memory, and the System/360 Model 40 has 8-bit data paths to the ALU and 16-bit data paths to main memory and also implements the general-purpose registers in a special unit of higher-speed core memory. The Model 50 has full 32-bit data paths and implements the general-purpose registers in a special unit of higher-speed core memory. [citation needed] The Model 65 through the Model 195 have larger data paths and implement the general-purpose registers in faster transistor circuits. In this way, microprogramming enabled IBM to design many System/360 models with substantially different hardware and spanning a wide range of cost and performance, while making them all architecturally compatible. This dramatically reduces the number of unique system software programs that must be written for each model.
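As a rough illustration of how a narrow data path can implement a wider architecture, the following Python sketch performs one 32-bit addition as four passes through an 8-bit ALU, chaining the carry between passes. It is a simplified model under stated assumptions, not the actual microprogram of any System/360 model; the function names `alu8_add` and `add32_on_8bit_datapath` are hypothetical.

```python
def alu8_add(a, b, carry_in):
    """Model an 8-bit ALU addition: return (8-bit sum, carry out)."""
    total = a + b + carry_in
    return total & 0xFF, total >> 8


def add32_on_8bit_datapath(x, y):
    """Add two 32-bit values one byte at a time, low byte first.

    Each loop iteration stands in for one microprogram step.
    Returns (32-bit result, final carry, number of ALU passes).
    """
    result = 0
    carry = 0
    passes = 0
    for byte_index in range(4):
        shift = 8 * byte_index
        a = (x >> shift) & 0xFF
        b = (y >> shift) & 0xFF
        partial, carry = alu8_add(a, b, carry)
        result |= partial << shift
        passes += 1
    return result, carry, passes
```

The programmer sees a single 32-bit add in either case; only the number of internal steps, and therefore the speed, differs between a narrow and a wide implementation.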
A similar approach was used by Digital Equipment Corporation (DEC) in their VAX family of computers. As a result, different VAX processors use different microarchitectures, yet the programmer-visible architecture does not change.
Microprogramming also reduces the cost of field changes to correct defects (bugs) in the processor; a bug can often be fixed by replacing a portion of the microprogram rather than by changes being made to hardware logic and wiring.
In 1947, the design of the MIT Whirlwind introduced the concept of a control store as a way to simplify computer design and move beyond ad hoc methods. The control store is a diode matrix: a two-dimensional lattice, where one dimension accepts "control time pulses" from the CPU's internal clock, and the other connects to control signals on gates and other circuits. A "pulse distributor" takes the pulses generated by the CPU clock and breaks them up into eight separate time pulses, each of which activates a different row of the lattice. When the row is activated, it activates the control signals connected to it.
Described another way, the signals transmitted by the control store are being played much like a player piano roll. That is, they are controlled by a sequence of very wide words constructed of bits, and they are "played" sequentially. In a control store, however, the "song" is short and repeated continuously.
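The diode-matrix idea can be sketched in a few lines of Python. This is only an illustrative model: the signal names and the matrix contents are hypothetical, and the pulse distributor is reduced to a counter that selects one of eight rows per clock pulse and then wraps around, so the short "song" repeats.

```python
# Hypothetical control signal names, one per column of the matrix.
SIGNALS = ["read_memory", "load_register", "alu_add", "write_memory"]

# One row per time pulse; a 1 marks a diode joining that pulse to a signal.
DIODE_MATRIX = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]


def pulse_distributor(clock_pulses):
    """Yield the row selected by each clock pulse, cycling through all rows."""
    for pulse in range(clock_pulses):
        yield pulse % len(DIODE_MATRIX)


def active_signals(row):
    """Return the control signals wired to the given row."""
    return [name for name, diode in zip(SIGNALS, DIODE_MATRIX[row]) if diode]


def play(clock_pulses):
    """Return the list of signals activated on each successive clock pulse."""
    return [active_signals(row) for row in pulse_distributor(clock_pulses)]
```

Calling `play(16)` yields the same eight-step pattern twice, which is the fixed, unconditional sequencing that later designs extended.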
In 1951, Maurice Wilkes enhanced this concept by adding conditional execution, a concept akin to a conditional in computer software. His initial implementation consisted of a pair of matrices: the first one generated signals in the manner of the Whirlwind control store, while the second matrix selected which row of signals (the microprogram instruction word, so to speak) to invoke on the next cycle. Conditionals were implemented by providing a way that a single line in the control store could choose from alternatives in the second matrix. This made the control signals conditional on the detected internal signal. Wilkes coined the term microprogramming to describe this feature and distinguish it from a simple control store.
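A minimal sketch of the two-matrix scheme, assuming a toy microprogram that counts a value down to zero: the first table plays the role of the signal matrix, the second chooses the next row, and one row offers two alternatives selected by an internal flag. The row contents, signal names and the function `run_microprogram` are hypothetical and are not taken from Wilkes's design.

```python
# First matrix: control signals emitted by each row (hypothetical names).
SIGNAL_MATRIX = {
    0: ["load_counter"],
    1: ["decrement_counter"],
    2: ["test_counter_zero"],
    3: ["done"],
}

# Second matrix: the row to invoke next. A tuple gives two alternatives,
# (next row if the flag is clear, next row if the flag is set).
NEXT_MATRIX = {
    0: 1,
    1: 2,
    2: (1, 3),
    3: None,  # end of this toy microprogram
}


def run_microprogram(initial_count, max_steps=100):
    """Run the toy microprogram and return the sequence of rows visited."""
    counter = 0
    zero_flag = False
    row = 0
    trace = []
    for _ in range(max_steps):
        trace.append(row)
        signals = SIGNAL_MATRIX[row]
        if "load_counter" in signals:
            counter = initial_count
        if "decrement_counter" in signals:
            counter -= 1
        if "test_counter_zero" in signals:
            zero_flag = counter == 0
        next_row = NEXT_MATRIX[row]
        if next_row is None:
            break
        if isinstance(next_row, tuple):
            # Conditional: the detected internal signal picks the alternative.
            next_row = next_row[1] if zero_flag else next_row[0]
        row = next_row
    return trace
```

With an initial count of 2 the rows visited are 0, 1, 2, 1, 2, 3: the loop through rows 1 and 2 repeats until the flag selects the second alternative.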
Each microinstruction in a microprogram provides the bits that control the functional elements that internally compose a CPU. The advantage over a hard-wired CPU is that internal CPU control becomes a specialized form of a computer program. Microcode thus transforms a complex electronic design challenge (the control of a CPU) into a less complex programming challenge. To take advantage of this, a CPU is divided into several parts:
There may also be a memory address register and a memory data register, used to access the main computer storage. Together, these elements form an "execution unit". Most modern CPUs have several execution units. Even simple computers usually have one unit to read and write memory, and another to execute user code. These elements could often be brought together as a single chip. This chip comes in a fixed width that would form a "slice" through the execution unit. These are known as "bit slice" chips. The AMD Am2900 family is one of the best known examples of bit slice elements. The parts of the execution units and the execution units themselves are interconnected by a bundle of wires called a bus.
Programmers develop microprograms using basic software tools. A microassembler allows a programmer to define the table of bits symbolically. Because of its close relationship to the underlying architecture, "microcode has several properties that make it difficult to generate using a compiler." [citation needed] A simulator program is intended to execute the bits in the same way as the electronics, and allows much more freedom to debug the microprogram. After the microprogram is finalized and extensively tested, it is sometimes used as the input to a computer program that constructs logic to produce the same data. This program is similar to those used to optimize a programmable logic array. Even without fully optimal logic, heuristically optimized logic can vastly reduce the number of transistors from the number required for a ROM control store. This reduces the cost of producing a CPU and the electricity it consumes.
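To make the role of a microassembler concrete, here is a minimal two-pass sketch in Python. The field names, field widths, symbol tables and source syntax are all assumptions chosen for illustration; real microassemblers are specific to one processor's control word.

```python
# Hypothetical control word layout: (field name, width in bits, symbols).
FIELDS = [
    ("src_a", 2, {"NONE": 0, "PC": 1, "MAR": 2, "MDR": 3}),
    ("src_b", 2, {"NONE": 0, "1": 1}),
    ("dest", 2, {"NONE": 0, "PC": 1, "MAR": 2, "MDR": 3}),
    ("alu_op", 2, {"NOP": 0, "COPY": 1, "ADD": 2}),
    ("jump_type", 1, {"NEXT": 0, "JMP": 1}),
]
ADDRESS_BITS = 4  # width of the trailing jump-address field (assumed)

EXAMPLE_SOURCE = """
# Lines starting with a number sign are comments.
Decode:
    NONE, NONE, NONE, NOP, NEXT, NONE
Jump:
    MDR, NONE, MAR, COPY, NEXT, NONE
    MAR, 1, PC, ADD, JMP, Decode
"""


def assemble(source):
    """Translate symbolic microinstructions into a list of integer words."""
    labels = {}
    instructions = []
    # Pass 1: strip comments, record label addresses, collect instructions.
    for line in source.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        if line.endswith(":"):
            labels[line[:-1]] = len(instructions)
        else:
            instructions.append([token.strip() for token in line.split(",")])
    # Pass 2: pack each instruction's fields into one control word.
    words = []
    for tokens in instructions:
        word = 0
        for (_name, width, symbols), token in zip(FIELDS, tokens):
            word = (word << width) | symbols[token]
        target = tokens[len(FIELDS)]
        address = 0 if target == "NONE" else labels[target]
        words.append((word << ADDRESS_BITS) | address)
    return words
```

For example, `assemble(EXAMPLE_SOURCE)` produces three 13-bit words, the "table of bits" that would be loaded into the control store or handed to a simulator.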
Microcode can be characterized as horizontal or vertical, referring primarily to whether each microinstruction controls CPU elements with little or no decoding (horizontal microcode) or requires extensive decoding by combinatorial logic before doing so (vertical microcode). Consequently, each horizontal microinstruction is wider (contains more bits) and occupies more storage space than a vertical microinstruction.
"Horizontal microcode has several discrete micro-operations that are combined in a single microinstruction for simultaneous operation." Horizontal microcode is typically contained in a fairly wide control store; it is not uncommon for each word to be 108 bits or more. On each tick of a sequencer clock a microcode word is read, decoded, and used to control the functional elements that make up the CPU.
In a typical implementation a horizontal microprogram word comprises fairly tightly defined groups of bits. For example, one simple arrangement might be:
| Register source A | Register source B | Destination register | Arithmetic and logic unit operation | Type of jump | Jump address |
For this type of micromachine to implement a JUMP instruction with the address following the opcode, the microcode might require two clock ticks. The engineer designing it would write microassembler source code looking something like this:
    # Any line starting with a number-sign is a comment
    # This is just a label, the ordinary way assemblers symbolically represent a
    # memory address.
    InstructionJUMP:
          # To prepare for the next instruction, the instruction-decode microcode has already
          # moved the program counter to the memory address register. This instruction fetches
          # the target address of the jump instruction from the memory word following the
          # jump opcode, by copying from the memory data register to the memory address register.
          # This gives the memory system two clock ticks to fetch the next
          # instruction to the memory data register for use by the instruction decode.
          # The sequencer instruction "next" means just add 1 to the control word address.
       MDR, NONE, MAR, COPY, NEXT, NONE
          # This places the address of the next instruction into the PC.
          # This gives the memory system a clock tick to finish the fetch started on the
          # previous microinstruction.
          # The sequencer instruction is to jump to the start of the instruction decode.
       MAR, 1, PC, ADD, JMP, InstructionDecode
          # The instruction decode is not shown, because it is usually a mess, very particular
          # to the exact processor being emulated. Even this example is simplified.
          # Many CPUs have several ways to calculate the address, rather than just fetching
          # it from the word following the op-code. Therefore, rather than just one
          # jump instruction, those CPUs have a family of related jump instructions.
For each tick it is common to find that only some portions of the CPU are used, with the remaining groups of bits in the microinstruction being no-ops. With careful design of hardware and microcode, this property can be exploited to parallelise operations that use different areas of the CPU; for example, in the case above, the ALU is not required during the first tick, so it could potentially be used to complete an earlier arithmetic instruction.
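The two-tick JUMP microroutine above can be traced with a small simulator. The Python sketch below is a simplified model under stated assumptions: memory is idealized so that the memory data register is refreshed as soon as the memory address register changes, the jump-type and jump-address fields are omitted, and the function name `run_jump_microroutine` is hypothetical.

```python
def run_jump_microroutine(memory, operand_address):
    """Trace the two JUMP microinstructions on a toy register set.

    `memory` maps addresses to words. `operand_address` is the word after
    the jump opcode, which holds the jump target. Returns the registers.
    """
    # State left behind by instruction decode: MAR points at the operand
    # word, and the memory system has delivered that word into MDR.
    regs = {
        "PC": operand_address,
        "MAR": operand_address,
        "MDR": memory[operand_address],
    }
    # (source A, source B, destination, operation) for each tick.
    microprogram = [
        ("MDR", None, "MAR", "COPY"),  # tick 1: target address -> MAR
        ("MAR", 1, "PC", "ADD"),       # tick 2: target + 1 -> PC
    ]
    for src_a, src_b, dest, operation in microprogram:
        a = regs[src_a]
        b = src_b if src_b is not None else 0
        # COPY passes source A through unchanged; ADD sums both sources.
        regs[dest] = a if operation == "COPY" else a + b
        if dest == "MAR":
            # Idealized memory: the fetch from the new address completes
            # immediately instead of taking further clock ticks.
            regs["MDR"] = memory[regs["MAR"]]
    return regs
```

Starting from a jump whose operand word at address 11 holds the target 40, the routine ends with MAR at 40, the instruction at address 40 in MDR ready for decode, and PC at 41.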
In vertical microcode, each microinstruction is significantly encoded – that is, the bit fields generally pass through intermediate combinatory logic that, in turn, generates the actual control and sequencing signals for internal CPU elements (ALU, registers, etc.). This is in contrast with horizontal microcode, in which the bit fields themselves either directly produce the control and sequencing signals or are only minimally encoded. Consequently, vertical microcode requires smaller instruction lengths and less storage, but requires more time to decode, resulting in a slower CPU clock.
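The storage trade-off can be sketched numerically. In the Python example below, a horizontal field spends one bit per control line, while a vertical field stores a binary code that a decoder expands into the individual lines. The control line names are hypothetical, and the decoder shown asserts exactly one line at a time, which is only one possible encoding.

```python
# Hypothetical control lines driven by one field of the microinstruction.
CONTROL_LINES = [
    "reg_read", "reg_write", "alu_add", "alu_sub",
    "mem_read", "mem_write", "pc_load", "halt",
]


def horizontal_width(line_count):
    """Bits needed when every control line has its own bit."""
    return line_count


def vertical_width(line_count):
    """Bits needed to encode the index of one control line."""
    return max(1, (line_count - 1).bit_length())


def decode_vertical(code):
    """Model the decoding logic: expand a code into one-hot control lines."""
    return [1 if index == code else 0 for index in range(len(CONTROL_LINES))]
```

For eight lines the encoded field needs 3 bits instead of 8, at the price of the extra decoding stage and, in this encoding, of being able to assert only one line per microinstruction.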
Some vertical microcode is just the assembly language of a simple conventional computer that is emulating a more complex computer. Some processors, such as DEC Alpha processors and the CMOS microprocessors on later IBM System/390 mainframes and z/Architecture mainframes, use machine code, running in a special mode that gives it access to special instructions, special registers, and other hardware resources not available to regular machine code, to implement some instructions and other functions, such as page table walks on Alpha processors. This is called PALcode on Alpha processors and millicode on IBM mainframe processors.
Another form of vertical microcode has two fields:
| Field select | Field value |
The field select selects which part of the CPU will be controlled by this word of the control store. The field value actually controls that part of the CPU. With this type of microcode, a designer explicitly chooses to make a slower CPU to save money by reducing the unused bits in the control store; however, the reduced complexity may increase the CPU's clock frequency, which lessens the effect of an increased number of cycles per instruction.
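A minimal sketch of this trade-off, assuming a CPU with four controllable parts: one horizontal word that drives several parts at once is rewritten as a sequence of (field select, field value) words, one per part actually used. The part names and the function `to_field_select_words` are hypothetical.

```python
# Hypothetical CPU parts; the field select is the index into this list.
PARTS = ["alu", "registers", "memory", "sequencer"]


def to_field_select_words(horizontal_word):
    """Split one horizontal word into (field select, field value) words.

    `horizontal_word` maps part names to values; parts that are absent or
    set to None are no-ops and produce no control store word.
    """
    words = []
    for select, part in enumerate(PARTS):
        value = horizontal_word.get(part)
        if value is not None:
            words.append((select, value))
    return words
```

A horizontal word that uses the ALU and memory becomes two narrow words, so no bits are stored for the idle parts, but the work now takes two control store cycles instead of one.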
As transistors became cheaper, horizontal microcode came to dominate the design of CPUs using microcode, with vertical microcode being used less often.
When both vertical and horizontal microcode are used, the horizontal microcode may be referred to as nanocode or picocode.
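One common way to combine the two levels is for each narrow microinstruction to select one of a small set of wide nanoinstructions. The Python sketch below estimates the storage saved under that assumption; the model in which a microinstruction consists only of a pointer, and all of the sizes used, are hypothetical simplifications.

```python
def single_level_store_bits(micro_words, word_width):
    """Storage for a purely horizontal control store."""
    return micro_words * word_width


def two_level_store_bits(micro_words, distinct_nano_words, nano_width):
    """Storage when narrow microwords point into a wide nanocode store."""
    pointer_width = max(1, (distinct_nano_words - 1).bit_length())
    micro_bits = micro_words * pointer_width
    nano_bits = distinct_nano_words * nano_width
    return micro_bits + nano_bits
```

With 1024 microinstructions that use only 64 distinct 100-bit control patterns, the single-level store needs 102,400 bits, while the two-level arrangement needs 12,544 bits. The saving depends entirely on how often wide patterns repeat.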
A few computers were built using "writable microcode". In this design, rather than storing the microcode in ROM or hard-wired logic, the microcode is stored in a RAM called a writable control store or WCS. Such a computer is sometimes called a writable instruction set computer or WISC.
Many experimental prototype computers use writable control stores; there are also commercial machines that use writable microcode, such as the Burroughs Small Systems, early Xerox workstations, the DEC VAX 8800 ("Nautilus") family, the Symbolics L- and G-machines, a number of IBM System/360 and System/370 implementations, some DEC PDP-10 machines, and the Data General Eclipse MV/8000.
Many more machines offer user-programmable writable control stores as an option, including the HP 2100, DEC PDP-11/60 and Varian Data Machines V-70 series minicomputers. The IBM System/370 includes a facility called Initial-Microprogram Load (IML or IMPL) that can be invoked from the console, as part of power-on reset (POR) or from another processor in a tightly coupled multiprocessor complex.
Some commercial machines, for example the IBM 360/85, have both a read-only storage and a writable control store for microcode.
WCS offers several advantages including the ease of patching the microprogram and, for certain hardware generations, faster access than ROMs can provide. User-programmable WCS allows the user to optimize the machine for specific purposes.
Starting with the Pentium Pro in 1995, several x86 CPUs have writable Intel Microcode. This, for example, has allowed bugs in the Intel Core 2 and Intel Xeon microcodes to be fixed by patching their microprograms, rather than requiring the entire chips to be replaced. A second prominent example is the set of microcode patches that Intel offered for some of their processor architectures of up to 10 years in age, in a bid to counter the security vulnerabilities discovered in their designs, Spectre and Meltdown, which became public at the start of 2018. A microcode update can be installed by Linux, FreeBSD, Microsoft Windows, or the motherboard BIOS.
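On Linux x86 systems, the revision of the currently loaded microcode is typically reported per logical CPU in a `microcode` field of `/proc/cpuinfo`. The following Python sketch extracts those revisions from such text. The helper name `microcode_revisions` and the sample text are hypothetical, and the field may be absent on other architectures or kernels.

```python
def microcode_revisions(cpuinfo_text):
    """Return the set of microcode revisions found in cpuinfo-style text."""
    revisions = set()
    for line in cpuinfo_text.splitlines():
        key, separator, value = line.partition(":")
        if separator and key.strip() == "microcode":
            # The value is normally printed in hexadecimal, e.g. 0xf0.
            revisions.add(int(value.strip(), 16))
    return revisions


# Made-up sample in the usual "key : value" layout, for illustration only.
SAMPLE_CPUINFO = """\
processor\t: 0
model name\t: Example CPU
microcode\t: 0xf0
processor\t: 1
model name\t: Example CPU
microcode\t: 0xf0
"""
```

On a real system the function could be given the contents of `/proc/cpuinfo`; more than one revision in the result would suggest that not all logical CPUs are running the same microcode.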
The design trend toward heavily microcoded processors with complex instructions began in the early 1960s and continued until roughly the mid-1980s. At that point the RISC design philosophy started becoming more prominent.
A CPU that uses microcode generally takes several clock cycles to execute a single instruction, one clock cycle for each step in the microprogram for that instruction. Some CISC processors include instructions that can take a very long time to execute. Such variations interfere with both interrupt latency and, what is far more important in modern systems, pipelining.
When designing a new processor, a hardwired control RISC has the following advantages over microcoded CISC:
There are counterpoints as well:
Many RISC and VLIW processors are designed to execute every instruction (as long as it is in the cache) in a single cycle. This is very similar to the way CPUs with microcode execute one microinstruction per cycle. VLIW processors have instructions that behave similarly to very wide horizontal microcode, although typically without such fine-grained control over the hardware as provided by microcode. RISC instructions are sometimes similar to the narrow vertical microcode.
Microcoding has been popular in application-specific processors such as network processors, microcontrollers, digital signal processors, channel controllers, disk controllers, network interface controllers, graphics processing units, and in other hardware.
Modern CISC implementations, such as the x86 family, decode instructions into dynamically buffered micro-operations ("μops") with an instruction encoding similar to RISC or traditional microcode. A hardwired instruction decode unit directly emits μops for common x86 instructions, but falls back to a more traditional microcode ROM for more complex or rarely used instructions.
For example, an x86 might look up μops from microcode to handle complex multistep operations such as loop or string instructions, floating point unit transcendental functions or unusual values such as denormal numbers, and special purpose instructions such as CPUID.
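The split between the hardwired decoder and the microcode ROM can be pictured as a two-stage lookup. The Python sketch below is purely schematic: the instruction strings, the μop names and both tables are hypothetical and do not describe any actual x86 implementation.

```python
# Hypothetical: common instructions the hardwired decoder handles directly.
HARDWIRED_UOPS = {
    "add reg, reg": ["alu_add"],
    "mov reg, mem": ["load"],
    "add mem, reg": ["load", "alu_add", "store"],
}

# Hypothetical: complex or rare instructions expanded from microcode ROM.
MICROCODE_ROM = {
    "rep movsb": ["load_byte", "store_byte", "update_pointers", "loop_if_count"],
    "cpuid": ["read_id_data", "write_result_registers"],
}


def decode(instruction):
    """Return (path, micro-operations) for one instruction."""
    if instruction in HARDWIRED_UOPS:
        return "hardwired", HARDWIRED_UOPS[instruction]
    if instruction in MICROCODE_ROM:
        # Fallback path: fetch the micro-operation sequence from the ROM.
        return "microcode", MICROCODE_ROM[instruction]
    raise ValueError(f"unknown instruction: {instruction}")
```

In this sketch `decode("add reg, reg")` takes the hardwired path, while `decode("cpuid")` falls back to the microcode table.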
A central processing unit (CPU), also called a central processor or main processor, is the electronic circuitry within a computer that executes instructions that make up a computer program. The CPU performs basic arithmetic, logic, controlling, and input/output (I/O) operations specified by the instructions in the program. The computer industry used the term "central processing unit" as early as 1955. Traditionally, the term "CPU" refers to a processor, more specifically to its processing unit and control unit (CU), distinguishing these core elements of a computer from external components such as main memory and I/O circuitry.
The control unit (CU) is a component of a computer's central processing unit (CPU) that directs the operation of the processor. It tells the computer's memory, arithmetic and logic unit and input and output devices how to respond to the instructions that have been sent to the processor.
A control store is the part of a CPU's control unit that stores the CPU's microprogram. It is usually accessed by a microsequencer. Early types of control store took the form of diode arrays that were accessed via address decoders, a form of read-only memory; later designs stored the microcode in read-write memory called a writable control store. The outputs generally had to go through a register to prevent a race condition from occurring. The register was clocked by the clock signal of the system it was running on.
Processor design is the design engineering task of creating a processor, a key component of computer hardware. It is a subfield of computer engineering and electronics engineering (fabrication). The design process involves choosing an instruction set and a certain execution paradigm and results in a microarchitecture, which might be described in e.g. VHDL or Verilog. For microprocessor design, this description is then manufactured employing some of the various semiconductor device fabrication processes, resulting in a die which is bonded onto a chip carrier. This chip carrier is then soldered onto, or inserted into a socket on, a printed circuit board (PCB).
A complex instruction set computer is a computer in which single instructions can execute several low-level operations or are capable of multi-step operations or addressing modes within single instructions. The term was retroactively coined in contrast to reduced instruction set computer (RISC) and has therefore become something of an umbrella term for everything that is not RISC, from large and complex mainframe computers to simplistic microcontrollers where memory load and store operations are not separated from arithmetic instructions. A modern RISC processor can therefore be much more complex than, say, a modern microcontroller using a CISC-labeled instruction set, especially in the complexity of its electronic circuits, but also in the number of instructions or the complexity of their encoding patterns. The only typical differentiating characteristic is that most RISC designs use uniform instruction length for almost all instructions, and employ strictly separate load/store-instructions.
The 8086 is a 16-bit microprocessor chip designed by Intel between early 1976 and June 8, 1978, when it was released. The Intel 8088, released July 1, 1979, is a slightly modified chip with an external 8-bit data bus, and is notable as the processor used in the original IBM PC design.
A microassembler is a computer program that helps prepare a microprogram, called firmware, to control the low level operation of a computer in much the same way an assembler helps prepare higher level code for a processor. The difference is that the microprogram is usually only developed by the processor manufacturer and works intimately with the computer hardware. On a microprogrammed computer the microprogram implements the operations of the instruction set in which any normal program is written. The use of a microprogram allows the manufacturer to fix certain mistakes, including working around hardware design errors, without modifying the hardware. Another means of employing microassembler-generated microprograms is in allowing the same hardware to run different instruction sets. After it is assembled, the microprogram is then loaded to a control store to become part of the logic of a CPU's control unit.
Machine code is a computer program written in machine language instructions that can be executed directly by a computer's central processing unit (CPU). Each instruction causes the CPU to perform a very specific task, such as a load, a store, a jump, or an arithmetic logic unit (ALU) operation on one or more units of data in central processing unit (CPU) registers or memory.
A floating-point unit is a part of a computer system specially designed to carry out operations on floating-point numbers. Typical operations are addition, subtraction, multiplication, division, and square root. Some FPUs can also perform various transcendental functions such as exponential or trigonometric calculations, but the accuracy can be very low, so that some systems prefer to compute these functions in software.
An instruction set architecture (ISA) is an abstract model of a computer. It is also referred to as architecture or computer architecture. A realization of an ISA, such as a central processing unit (CPU), is called an implementation.
In computer architecture and engineering, a sequencer or microsequencer generates the addresses used to step through the microprogram of a control store. It is used as a part of the control unit of a CPU or as a stand-alone generator for address ranges.
In computer engineering, microarchitecture, also called computer organization and sometimes abbreviated as µarch or uarch, is the way a given instruction set architecture (ISA) is implemented in a particular processor. A given ISA may be implemented with different microarchitectures; implementations may vary due to different goals of a given design or due to shifts in technology.
The Burroughs B1000 Series was a series of mainframe computers, built by the Burroughs Corporation, and originally introduced in the 1970s with continued software development until 1987. The series consisted of three major generations which were the B1700, B1800, and B1900 series machines. They were also known as the Burroughs Small Systems, by contrast with the Burroughs Large Systems and the Burroughs Medium Systems.
The Nord-100 was a 16-bit minicomputer series made by Norsk Data, introduced in 1979. It shipped with the Sintran III operating system, and the architecture was based on, and backwards compatible with, the Nord-10 line.
The Orion was a series of 32-bit super-minicomputers designed and produced in the 1980s by High Level Hardware Limited (HLH), a company based in Oxford, UK. The company produced four versions of the machine:
In computer central processing units, micro-operations are detailed low-level instructions used in some designs to implement complex machine instructions.
The history of general-purpose CPUs is a continuation of the earlier history of computing hardware.
Joel McCormack is the designer of the NCR Corporation version of the p-code machine, which is a kind of stack machine popular in the 1970s as the preferred way to implement new computing architectures and languages such as Pascal and BCPL. The NCR design shares no common architecture with the Pascal MicroEngine designed by Western Digital but both were meant to execute the UCSD p-System.[1,2]
In computer engineering, computer architecture is a set of rules and methods that describe the functionality, organization, and implementation of computer systems. Some definitions of architecture define it as describing the capabilities and programming model of a computer but not a particular implementation. In other definitions computer architecture involves instruction set architecture design, microarchitecture design, logic design, and implementation.
Intel microcode is microcode that runs inside x86 processors made by Intel. Since the P6 microarchitecture introduced in the mid-1990s, the microcode programs can be patched by the operating system or BIOS firmware to work around bugs found in the CPU after release. Intel had originally designed microcode updates for processor debugging under its design for testing (DFT) initiative.
4096 x 75-bit SRAM writeable control store: 74-bit microinstruction with 1 parity bit (18 fields)