CPU Pipeline Stages

Modern CPUs don't execute one instruction completely before starting the next. Instead, they overlap execution using a pipeline — the same way an assembly line lets a car factory build multiple cars simultaneously, each at a different station.

The Classic 5-Stage RISC Pipeline

Most computer architecture courses teach the five-stage pipeline popularized by MIPS and DLX:

Stage	Abbreviation	What Happens
Instruction Fetch	IF	Read the next instruction from memory using the Program Counter (PC)
Instruction Decode	ID	Decode the opcode; read source register values
Execute	EX	ALU performs the operation or calculates a memory address
Memory Access	MEM	Reads (load) or writes (store) data memory
Write Back	WB	Writes the result back to the destination register

Interactive Simulator

Step through clock cycles to watch five instructions flow through all five stages simultaneously. Hover over any colored cell for a description of what's happening in that stage.

[Interactive simulator: a space-time diagram that steps five instructions (ADD R1, R2, R3; LW R4, 0(R1); SUB R5, R4, R2; SW R5, 4(R1); BEQ R1, R0, +8) through IF → ID → EX → MEM → WB over clock cycles CC1 to CC9.]
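The space-time diagram can also be reproduced offline. What follows is a minimal sketch, not the simulator's actual code: it assumes an ideal pipeline with no hazards or stalls, so instruction i simply enters IF in cycle i + 1 and advances one stage per cycle. The names space_time_rows and render are hypothetical helpers introduced for illustration.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

# The five instructions used in the simulator.
PROGRAM = [
    "ADD R1, R2, R3",
    "LW R4, 0(R1)",
    "SUB R5, R4, R2",
    "SW R5, 4(R1)",
    "BEQ R1, R0, +8",
]


def space_time_rows(instructions):
    """Return one row per instruction; row[c] is the stage in cycle c + 1, or ""."""
    if not instructions:
        return []
    total_cycles = len(instructions) + len(STAGES) - 1
    rows = []
    for i in range(len(instructions)):
        row = [""] * total_cycles
        for s, stage in enumerate(STAGES):
            row[i + s] = stage  # instruction i enters IF in cycle i + 1
        rows.append(row)
    return rows


def render(instructions):
    """Format the rows as a plain-text table with CC1, CC2, ... column headers."""
    rows = space_time_rows(instructions)
    if not rows:
        return ""
    width = max(len(text) for text in instructions)
    cycles = len(rows[0])
    header = " " * width + " | " + " ".join(f"CC{c + 1:<2}" for c in range(cycles))
    lines = [header]
    for text, row in zip(instructions, rows):
        lines.append(f"{text:<{width}} | " + " ".join(f"{cell:<4}" for cell in row))
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(PROGRAM))
```

Five instructions through five stages occupy 5 + 5 - 1 = 9 cycles, which matches the CC1 to CC9 range of the diagram.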

Why Pipeline?

Without pipelining, each instruction takes 5 cycles. With a full pipeline, one instruction completes every cycle once the pipeline is filled:

No pipeline:   I1: 5 cycles, I2 starts at cycle 6, I3 at cycle 11 ...
With pipeline: I1 completes at cycle 5, I2 at cycle 6, I3 at cycle 7 ...

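Throughput approaches 1 instruction per cycle (IPC = 1) as the instruction count grows. Real processors exceed this with superscalar execution (multiple pipelines issuing several instructions per cycle), usually combined with out-of-order execution.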
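The cycle counts above follow a simple formula, sketched here under the assumption of a hazard-free pipeline in which every stage takes exactly one cycle. The function names are hypothetical and exist only for this illustration.

```python
def unpipelined_cycles(n, stages=5):
    """Each instruction runs all stages before the next one starts."""
    return n * stages


def pipelined_cycles(n, stages=5):
    """The first instruction fills the pipeline; each later one finishes a cycle after."""
    if n == 0:
        return 0
    return stages + (n - 1)


def speedup(n, stages=5):
    """Ratio of unpipelined to pipelined cycle counts for n instructions."""
    return unpipelined_cycles(n, stages) / pipelined_cycles(n, stages)


def ipc(n, stages=5):
    """Instructions completed per cycle over the whole run."""
    return n / pipelined_cycles(n, stages)


if __name__ == "__main__":
    for n in (1, 3, 10, 1000):
        print(n, unpipelined_cycles(n), pipelined_cycles(n), round(speedup(n), 2))
```

For three instructions this gives 15 cycles without pipelining and 7 with it, in line with the timeline above. As n grows, the speedup approaches the stage count (5) and IPC approaches 1, but neither reaches it because of the fill time.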

Pipeline Hazards

Pipelining introduces hazards — situations where the next instruction can't start in the immediately following cycle.

Structural Hazard

Two instructions need the same hardware resource at the same time.

Example: A CPU with a single memory port can't fetch a new instruction (IF stage) while another instruction is reading from memory (MEM stage) in the same cycle.

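Fix: Separate instruction memory (I-cache) from data memory (D-cache) so both stages can access memory in the same cycle. Without that separation, one of the two instructions must stall for a cycle.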
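To see where such a conflict lands, the sketch below scans an instruction sequence for cycles in which IF and MEM would both need a single shared memory port. It assumes the ideal schedule (instruction i is in IF in cycle i + 1 and in MEM in cycle i + 4) and does not model how a stall would shift later instructions. The helper memory_port_conflicts and the MEMORY_OPS set are hypothetical names.

```python
# Opcodes that access data memory in the MEM stage (assumed subset for illustration).
MEMORY_OPS = {"LW", "SW"}


def memory_port_conflicts(opcodes):
    """Return (cycle, mem_index, fetch_index) for each IF/MEM clash on one memory port.

    Instruction i (0-based) is fetched in cycle i + 1 and reaches MEM in cycle i + 4,
    so a load or store at index i clashes with the fetch of instruction i + 3.
    """
    conflicts = []
    for i, op in enumerate(opcodes):
        if op not in MEMORY_OPS:
            continue
        mem_cycle = i + 4
        fetch_index = mem_cycle - 1  # instruction that would be in IF during mem_cycle
        if fetch_index < len(opcodes):
            conflicts.append((mem_cycle, i, fetch_index))
    return conflicts


if __name__ == "__main__":
    program = ["ADD", "LW", "SUB", "SW", "BEQ"]
    for cycle, mem_i, fetch_i in memory_port_conflicts(program):
        print(f"cycle {cycle}: {program[mem_i]} in MEM vs fetch of {program[fetch_i]}")
```

For the five-instruction sequence used earlier, the only clash is in cycle 5, where LW is in MEM while BEQ is being fetched. The SW reaches MEM after the last fetch, so it causes no conflict here.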

Data Hazard (RAW — Read After Write)

An instruction reads a register that a previous instruction hasn't written yet.

ADD R1, R2, R3    # writes R1 (available after WB at cycle 5)
SUB R4, R1, R5   # reads R1 (needs it in ID at cycle 3) ← RAW hazard!

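Fix 1 (Stalling): Insert NOP bubbles until the value is ready. Every bubble is a cycle in which no instruction completes.

Fix 2 (Forwarding/Bypassing): Route the EX output of the producing instruction directly to the EX input of the consuming instruction, so most dependencies need no stall. The exception is a load followed immediately by a use of its result: the data arrives only at the end of MEM, so one stall cycle remains.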
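The cost of each fix can be estimated with a small in-order scheduling model. This is a sketch under stated assumptions, not a description of any particular CPU: without forwarding, a consumer may read a register in ID during the same cycle the producer is in WB (register file written in the first half of the cycle, read in the second); with forwarding, ALU results are available after EX and load results after MEM. The tuple layout and the count_stalls name are hypothetical.

```python
def count_stalls(program, forwarding):
    """Count bubble cycles for a list of (opcode, dest_or_None, [source registers])."""
    last_writer = {}  # register name -> index of the most recent instruction writing it
    id_cycle = []     # cycle in which each instruction reads its registers in ID
    for i, (op, dest, sources) in enumerate(program):
        # In-order pipeline: ID one cycle after the previous instruction's ID.
        earliest = id_cycle[-1] + 1 if id_cycle else 2
        for reg in sources:
            if reg not in last_writer:
                continue
            producer = last_writer[reg]
            producer_id = id_cycle[producer]
            if forwarding:
                # Consumer's EX must follow the producer's EX (ALU) or MEM (load).
                gap = 2 if program[producer][0] == "LW" else 1
            else:
                # Consumer's ID must not precede the producer's WB (ID + 3).
                gap = 3
            earliest = max(earliest, producer_id + gap)
        id_cycle.append(earliest)
        if dest is not None:
            last_writer[dest] = i
    if not id_cycle:
        return 0
    ideal_last_id = len(program) + 1  # ID cycle of the last instruction with no stalls
    return id_cycle[-1] - ideal_last_id


if __name__ == "__main__":
    alu_pair = [("ADD", "R1", ["R2", "R3"]), ("SUB", "R4", ["R1", "R5"])]
    load_use = [("LW", "R4", ["R1"]), ("SUB", "R5", ["R4", "R2"])]
    print(count_stalls(alu_pair, forwarding=False), count_stalls(alu_pair, forwarding=True))
    print(count_stalls(load_use, forwarding=False), count_stalls(load_use, forwarding=True))
```

Under these assumptions the ADD/SUB pair needs two bubbles without forwarding and none with it, while a load followed directly by a use still needs one bubble even with forwarding. A register file without the split-cycle behavior would add one more bubble to the no-forwarding cases.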

Control Hazard (Branch Hazard)

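A branch instruction changes the PC, but by the time its outcome is known the pipeline has already fetched instructions from the sequential path, which may be the wrong one. The count depends on where the branch resolves: two instructions when the result is known at the end of EX, as in the example below, and more in deeper pipelines.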

BEQ R1, R0, target    # branch result known at end of EX
<fetched speculatively — may need to flush>

Fix: Branch prediction (see the Branch Prediction page).
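The performance cost of control hazards is commonly estimated with a simple CPI model, sketched here. It assumes fetch is redirected in the cycle after the branch resolves and that every mispredicted branch costs the full flush penalty. The branch fraction and misprediction rate in the usage example are made-up illustrative values, not measurements, and the function names are hypothetical.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def flush_penalty(resolve_stage):
    """Wrong-path instructions fetched when a branch resolves at the end of resolve_stage."""
    # One instruction is fetched per cycle while the branch moves from IF
    # to the resolving stage, so the penalty equals that stage's index.
    return STAGES.index(resolve_stage)


def effective_cpi(branch_fraction, mispredict_rate, penalty_cycles, base_cpi=1.0):
    """Average cycles per instruction once misprediction flushes are included."""
    return base_cpi + branch_fraction * mispredict_rate * penalty_cycles


if __name__ == "__main__":
    penalty = flush_penalty("EX")
    # Illustrative assumption: 20% of instructions are branches, 10% are mispredicted.
    print(penalty, effective_cpi(0.20, 0.10, penalty))
```

With the branch resolved at the end of EX the penalty is two cycles, so these illustrative rates raise CPI from 1.0 to about 1.04. Resolving the branch earlier or predicting it more accurately both shrink the added term.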

Key Insight

The pipeline is a throughput optimization, not a latency one. Each instruction still takes 5 cycles; you just get one completing per cycle at steady state. Hazards break this ideal, and handling them is a major reason CPU microarchitecture is complex.
