Operand forwarding (or data forwarding) is an optimization in pipelined CPUs that reduces the performance loss from pipeline stalls caused by data hazards. [1] [2] A data hazard can stall the pipeline when the current operation has to wait for the result of an earlier operation that has not yet finished.
It is very common that an instruction requires a value computed by the immediately preceding instruction. It may take a few clock cycles to write a result to the register file and then read it back for the subsequent instruction. To improve performance, the register file write/read is bypassed. The result of an instruction is forwarded directly to the execute stage of a subsequent instruction.
    ADD A B C  # A = B + C
    SUB D C A  # D = C - A
If these two assembly pseudocode instructions run in a pipeline without operand forwarding, the pipeline stalls after fetching and decoding the second instruction, waiting until the result of the addition has been written and can be read.
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|
| Fetch ADD | Decode ADD | Read Operands ADD | Execute ADD | Write result | | | |
| | Fetch SUB | Decode SUB | stall | stall | Read Operands SUB | Execute SUB | Write result |
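With operand forwarding, the SUB instruction takes the result of the ADD directly instead of waiting for it to be read back from the register file, which removes one of the two stall cycles:
| 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|
| Fetch ADD | Decode ADD | Read Operands ADD | Execute ADD | Write result | | |
| | Fetch SUB | Decode SUB | stall | Read Operands SUB: use result from previous operation | Execute SUB | Write result |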
In some cases, operand forwarding eliminates all stalls from such read-after-write data hazards: [3] [4] [5]
| 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|
| Fetch ADD | Decode ADD | Read Operands ADD | Execute ADD | Write result | |
| | Fetch SUB | Decode SUB | Read Operands SUB: use result from previous operation | Execute SUB | Write result |
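The three timings above can be reproduced with a small scheduling model. The following is a minimal sketch, not a description of any real CPU: it assumes a five-stage in-order pipeline (fetch, decode, read operands, execute, write) with one instruction per stage, and the function names `schedule` and `stalls` as well as the mode labels `"none"`, `"write"` and `"execute"` are hypothetical names chosen for this illustration. The modes differ only in the earliest cycle at which a dependent instruction may occupy its read-operands stage.

```python
STAGES = ["fetch", "decode", "read", "execute", "write"]

def schedule(program, forwarding="none"):
    """Return one {stage: cycle} dict per instruction.

    program    -- list of (op, dest, src1, src2) tuples
    forwarding -- "none", "write" or "execute" (labels assumed for this sketch)
    """
    if forwarding not in ("none", "write", "execute"):
        raise ValueError(f"unknown forwarding mode: {forwarding!r}")
    ready = {}     # register -> earliest cycle a consumer may be in "read"
    timeline = []
    for _op, dest, *sources in program:
        prev = timeline[-1] if timeline else None
        cycles = {}
        t = 0
        for k, stage in enumerate(STAGES):
            t += 1  # one cycle after this instruction's previous stage
            if prev is not None:
                # In-order pipeline: wait until the previous instruction has
                # moved on to the next stage (or finished, for the last stage).
                if k + 1 < len(STAGES):
                    t = max(t, prev[STAGES[k + 1]])
                else:
                    t = max(t, prev[stage] + 1)
            if stage == "read":
                # Read-after-write hazard: wait until every source is available.
                t = max([t] + [ready.get(reg, 0) for reg in sources])
            cycles[stage] = t
        if forwarding == "none":
            ready[dest] = cycles["write"] + 1   # read back after the write
        elif forwarding == "write":
            ready[dest] = cycles["write"]       # bypass during the write cycle
        else:
            ready[dest] = cycles["execute"]     # bypass from the execute output
        timeline.append(cycles)
    return timeline

def stalls(entry):
    """Stall cycles spent between decode and read operands."""
    return entry["read"] - entry["decode"] - 1

program = [("ADD", "A", "B", "C"), ("SUB", "D", "C", "A")]
for mode in ("none", "write", "execute"):
    sub = schedule(program, mode)[-1]
    print(mode, "stalls:", stalls(sub), "total cycles:", sub["write"])
```

Under these assumptions the SUB instruction stalls for 2, 1 and 0 cycles, and the pair completes in 8, 7 and 6 cycles respectively, matching the three tables.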
The CPU control unit must implement logic to detect dependencies where operand forwarding makes sense. A multiplexer can then be used to select the proper register or flip-flop to read the operand from.
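The dependency check and the multiplexer can be illustrated in software. This is a simplified sketch under stated assumptions: a single pending result from the immediately preceding instruction, held in a pipeline latch and not yet written to the register file. The names `PendingResult`, `needs_forwarding` and `select_operand` are hypothetical and only serve the example; real hardware performs the comparison and selection with combinational logic rather than function calls.

```python
from typing import NamedTuple, Optional

class PendingResult(NamedTuple):
    """Result of the previous instruction, not yet written to the register file."""
    dest: str
    value: int

def needs_forwarding(src: str, pending: Optional[PendingResult]) -> bool:
    # Dependency detection: the source register of the current instruction
    # matches the destination register of the pending result.
    return pending is not None and pending.dest == src

def select_operand(src: str, register_file: dict, pending: Optional[PendingResult]) -> int:
    # Multiplexer: choose the forwarded value or the register file value.
    if needs_forwarding(src, pending):
        return pending.value
    return register_file[src]

# Register A still holds a stale value because ADD has not written back yet.
register_file = {"A": 0, "B": 2, "C": 5, "D": 0}

# ADD A B C: the result sits in the pipeline latch.
pending = PendingResult("A", register_file["B"] + register_file["C"])

# SUB D C A: C comes from the register file, A is forwarded.
c = select_operand("C", register_file, pending)
a = select_operand("A", register_file, pending)
d = c - a
print(d)  # -2; reading the stale A from the register file would give 5 instead
```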