MIPS Processor Design

A single-cycle and pipelined processor implementation

As part of a team project, we designed and implemented both single-cycle and pipelined processors based on the MIPS R32I Instruction Set Architecture (ISA). This comprehensive project provided hands-on experience with digital logic design, hardware description languages, and modern processor architecture principles.

The single-cycle processor served as our foundational implementation, providing a complete understanding of basic processor architecture. Building upon this foundation, we implemented a 5-stage pipelined processor that demonstrated significant performance improvements through parallel instruction execution.


Table of Contents

  1. Main Goals
  2. Learning Progression: Lab Work Foundation
  3. Design Methodology
  4. Technical Implementation
  5. Hardware Design Components
  6. Hardware Description Language (VHDL)
  7. Testing and Validation
  8. Bug Tracking and Resolution
  9. Waveform Analysis and Verification
  10. Performance Analysis
  11. Learning Outcomes
  12. Project Deliverables
  13. Project Impact

Main Goals

  1. Design and implement single-cycle MIPS processor
    • Create a complete processor architecture with all core components including Program Counter, Register File, ALU, Control Unit, and Memory interfaces.
    • Implement the full MIPS R32I instruction set including arithmetic, logical, memory, and control flow instructions.
    • Develop comprehensive testbenches to validate processor functionality and performance.

  2. Implement pipelined processor with hazard detection
    • Design a 5-stage pipeline (IF, ID, EX, MEM, WB) for improved instruction throughput.
    • Implement data hazard resolution through forwarding logic and pipeline stalling.
    • Develop control hazard handling for branch prediction and pipeline flushing.

  3. Comprehensive testing and validation framework
    • Create assembly language test programs to validate processor functionality.
    • Develop systematic ALU operation testing for all arithmetic and logical operations.
    • Perform synthesis analysis to optimize critical path and achieve target clock frequencies.

Learning Progression: Lab Work Foundation

The project built upon a structured learning progression through hands-on lab work:

Lab 1: Digital Logic Fundamentals
  • VHDL Introduction: Learned hardware description language syntax and semantics
  • Basic Components: Implemented adders, multiplexers, and basic logic gates
  • Design Methodology: Mastered the design cycle from specification to verification
  • Testing Framework: Developed systematic testbench creation and validation procedures
Lab 2: Advanced Digital Design
  • Complex Components: Built barrel shifters, ALU subunits, and control logic
  • Hierarchical Design: Implemented modular component architecture
  • Performance Analysis: Conducted timing analysis and critical path optimization
  • Integration Testing: Validated component interactions and system-level functionality

This lab progression provided the essential foundation for the processor design project, ensuring each team member had the necessary skills for their assigned components.


Design Methodology

Our processor design followed a systematic methodology based on industry-standard practices:

1. Specification Phase
  • Instruction Set Analysis: Detailed study of MIPS R32I ISA requirements
  • Component Specification: Defined interfaces and functionality for each processor component
  • Timing Requirements: Established clock frequency and performance targets
  • Test Strategy: Planned comprehensive verification approach
2. Design Phase
  • Modular Architecture: Hierarchical component design with clear interfaces
  • VHDL Implementation: Behavioral and structural modeling for each component
  • Control Logic Design: State machine implementation for instruction decoding
  • Data Path Design: Optimized routing and arithmetic/logic unit design
3. Verification Phase
  • Unit Testing: Individual component validation through dedicated testbenches
  • Integration Testing: Full processor functionality verification
  • Performance Testing: Timing analysis and critical path optimization
  • Regression Testing: Automated test suite execution
4. Documentation Phase
  • Design Specifications: Detailed component-level documentation
  • Test Reports: Comprehensive verification results and analysis
  • Bug Tracking: Systematic issue identification and resolution
  • Performance Analysis: Synthesis results and optimization recommendations

Technical Implementation

The single-cycle processor served as our foundational implementation, providing a complete understanding of basic processor architecture. Key components included:

  • Program Counter (PC): 32-bit register for instruction address tracking
  • Register File: 32 general-purpose registers with dual read ports and single write port
  • Arithmetic Logic Unit (ALU): Supporting arithmetic operations (ADD, SUB), logical operations (AND, OR, XOR, NOR), and comparison operations (SLT)
  • Control Unit: Decoding instruction opcodes to generate control signals
  • Data Memory: 32-bit word-addressable memory for load/store operations
  • Instruction Memory: ROM containing the program to be executed

The design implemented core MIPS instructions including:

  • R-type: ADD, SUB, AND, OR, NOR, SLT, SLL, SRL, SRA
  • I-type: ADDI, LW, SW, BEQ, BNE
  • J-type: J, JAL

Building upon the single-cycle foundation, we implemented a 5-stage pipeline:

  1. Instruction Fetch (IF): Retrieves instructions from memory
  2. Instruction Decode (ID): Decodes instructions and reads register values
  3. Execute (EX): Performs ALU operations and address calculations
  4. Memory (MEM): Accesses data memory for load/store operations
  5. Write Back (WB): Writes results back to register file

A critical aspect of pipelined design involved implementing hazard detection:

  • Data Hazards: Forwarding logic to bypass pipeline registers when dependencies exist
  • Control Hazards: Branch prediction and pipeline flushing for mispredicted branches
  • Structural Hazards: Resource conflict resolution through pipeline stalling
Processor Architecture Diagrams
MIPS Single-Cycle Processor Architecture - Shows the complete data path including ALU, register file, control unit, and memory interfaces
MIPS Pipelined Processor Architecture - Illustrates the 5-stage pipeline with hazard detection and forwarding logic

Hardware Design Components

ALU Architecture

The Arithmetic Logic Unit was the most complex component, requiring careful design of:

  • Arithmetic Operations: Full 32-bit addition and subtraction with overflow detection
  • Logical Operations: AND, OR, XOR, NOR with bitwise processing
  • Shift Operations: SLL (logical left), SRL (logical right), SRA (arithmetic right)
  • Comparison Logic: SLT (Set Less Than) for conditional operations
  • Zero Detection: Critical for branch instruction evaluation
ALU Design Architecture - Shows the complete arithmetic logic unit implementation with all operation types
Barrel Shifter Implementation

The shift operations required a sophisticated barrel shifter design:

  • Multi-bit Shifting: Support for 0-31 bit shifts in a single cycle
  • Logical vs Arithmetic: Different behavior for signed vs unsigned operations
  • Zero Extension: Proper handling of shift-out bits
  • Sign Extension: Arithmetic right shift maintains sign bit
Barrel Shifter High-Level Design - Illustrates the multi-bit shifting architecture for SLL, SRL, and SRA operations
Control Unit Design

The control unit implemented a state machine for instruction decoding:

  • Opcode Decoding: 6-bit opcode field analysis
  • Function Field: R-type instruction function code interpretation
  • Control Signal Generation: 15+ control signals for data path coordination
  • Hazard Detection: Pipeline control signal management

Hardware Description Language (VHDL)

We utilized VHDL for all digital logic implementation, gaining expertise in:

  • Dataflow Modeling: Concurrent signal assignments and behavioral descriptions
  • Structural Modeling: Component instantiation and hierarchical design
  • Testbench Development: Comprehensive verification and validation procedures
  • Synthesis Optimization: Converting RTL descriptions to gate-level implementations

Our verification strategy included:

  • Unit Testing: Individual component validation through testbenches
  • Integration Testing: Full processor functionality verification
  • Performance Analysis: Critical path analysis and timing optimization
  • Synthesis Results: Area and timing reports for design optimization

Testing and Validation

We developed comprehensive test programs to validate processor functionality:

Fibonacci Sequence Generator
# Compute several Fibonacci numbers and put in array
.data
fibs:.word   0 : 19         # Array to contain fib values
size: .word  19             # Size of array
.text
      la   $s0, fibs        # Load address of array
      li   $s2, 1           # First and second Fib number
      sw   $s2, 0($s0)      # F[0] = 1
      sw   $s2, 4($s0)      # F[1] = 1
      addi $s1, $s5, -2     # Loop counter
      
loop: lw   $s3, 0($s0)      # Get F[n-2]
      lw   $s4, 4($s0)      # Get F[n-1]
      add  $s2, $s3, $s4    # F[n] = F[n-1] + F[n-2]
      sw   $s2, 8($s0)      # Store F[n]
      addi $s0, $s0, 4      # Increment address
      addi $s1, $s1, -1     # Decrement counter
      bne $s1, $zero, loop  # Repeat while not finished
Bubble Sort Implementation
#
# Bubblesort test
#

# data 
.data
arr:
.word 5, 16, 43, 2, 381, 67, 20

# code
.text
main: 

la $t5, arr #Set Arr

outerLoop: 
    slti $t8, $t0, 7 
    beq $zero, $t8,  END # while i < n

innerLoop: 
    li $a0, 6 # n-1 alternating arr
    sub $a0, $a0, $t0 
    slt $t9, $t1, $a0
    beq $zero, $t9, endOuter # while j < n-i-1

    lw $a1, 0($t5) # arr[j]
    lw $a2, 4($t5) # arr[j+1]
    slt $k0, $a2, $a1 # if arr[j] > arr[j+1]
    bne $k0, $zero, swap

endInner: 
    addi $t1, $t1, 1
    addi $t5, $t5, 4
    j innerLoop

swap: 
    lw $t3, 0($t5) # temp = arr[j]
    sw $a2, 0($t5) # arr[j] = arr[j+1]
    sw $t3, 4($t5) # arr[j+1] = temp
    j endInner

endOuter: 
    addi $t0, $t0, 1
    li $t1, 0
    la $t5, arr
    j outerLoop
    
END:
halt
Control Flow Test
# Control flow test

j BEQtest #start cf test call1

flow: 
and $s1, $s1, $zero
addi $s1, $s1, 0xFFFF
jr $ra

jumpTest: 
jal flow #call4
j END #call5

BNEtest: 
and $s1, $s1, $zero
and $s2, $s2, $zero
addi $s1, $s1, 1
addi $s2, $s2, 4
bne $s1, $s2, jumpTest #call3

BEQtest: 
and $s1, $s1, $zero 
and $s2, $s2, $zero
addi $s1, $s1, 1
addi $s2, $s2, 1
beq $s1, $s2, BNEtest #call2

END: 
halt

We systematically tested all ALU operations with dedicated testbenches:

  • Arithmetic Operations: ADD, SUB with overflow detection
  • Logical Operations: AND, OR, XOR, NOR
  • Shift Operations: SLL, SRL, SRA (logical and arithmetic shifts)
  • Comparison Operations: SLT (Set Less Than)

Comprehensive load/store testing included:

  • Word-aligned memory access for LW/SW instructions
  • Address calculation and offset handling
  • Memory timing and synchronization

Bug Tracking and Resolution

Our systematic bug tracking process documented and resolved numerous design issues:

Bug Report 1: ALU Overflow Detection
  • Issue: Arithmetic overflow not properly detected in edge cases
  • Root Cause: Missing carry-out logic in addition/subtraction operations
  • Solution: Implemented comprehensive overflow detection circuit
  • Verification: Added specific test cases for overflow conditions
Bug Report 2: Pipeline Hazard Resolution
  • Issue: Data hazards causing incorrect instruction execution
  • Root Cause: Incomplete forwarding logic implementation
  • Solution: Enhanced forwarding unit with proper dependency detection
  • Verification: Created comprehensive hazard test suite
Common Debugging Challenges
  • Timing Violations: Critical path optimization for target clock frequency
  • Control Signal Conflicts: Proper coordination between pipeline stages
  • Memory Interface Issues: Address calculation and alignment problems
  • Register File Hazards: Read-after-write and write-after-write conflicts
Debugging Methodology
  • Waveform Analysis: Systematic examination of signal timing
  • State Machine Verification: Control unit behavior validation
  • Component Isolation: Individual module testing and verification
  • Integration Testing: Full system functionality validation
ModelSim Development Environment - Professional VHDL simulation and verification framework used for processor testing
Command-Line Testing Framework - Automated test execution and verification pipeline for systematic component validation

Waveform Analysis and Verification

Using ModelSim for comprehensive waveform analysis, we verified processor functionality across multiple test scenarios:

Fibonacci Test Waveform

The Fibonacci sequence generator demonstrated:

  • Memory access patterns for array operations
  • Register file usage across multiple iterations
  • Control flow with loop execution
  • Data dependencies and hazard resolution
Fibonacci Test Waveform - ModelSim verification showing memory access patterns and register usage across iterations
Control Flow Test Waveform

The control flow test validated:

  • Jump instruction execution (j, jal, jr)
  • Branch instruction behavior (beq, bne)
  • Function call/return mechanisms
  • Register state preservation across control transfers
Base Test Waveform 1 - ModelSim verification showing control flow and instruction execution timing
Program Execution Demonstration

The command-line execution environment showed:

  • Real-time program execution on the processor implementation
  • Register state tracking during instruction execution
  • Memory operations with data movement validation
  • Performance monitoring and debugging capabilities
Command-Line Program Execution - Shows the debugging toolflow in action, comparing our implementation against expected results. Even when tests reveal differences, this demonstrates the systematic verification process and development methodology used throughout the project.
Individual Component Waveforms

Each ALU operation was verified with dedicated waveforms:

  • ADD/SUB operations with overflow detection
  • Logical operations (AND, OR, XOR, NOR) with bitwise processing
  • Shift operations (SLL, SRL, SRA) with proper bit manipulation
  • Comparison operations (SLT) with signed arithmetic
ADD Operation Test Report - Comprehensive verification of arithmetic addition with overflow detection
SLT Operation Test Report - Set Less Than comparison operation verification with signed arithmetic
SUB Operation Test Report - Subtraction operation verification with overflow detection and borrow handling
AND Operation Test Report - Logical AND operation verification with bitwise processing
OR Operation Test Report - Logical OR operation verification with comprehensive test coverage

Performance Analysis

Post-synthesis analysis revealed:

  • Single-Cycle Processor: Maximum frequency of 50MHz with critical path through ALU
  • Pipelined Processor: Maximum frequency of 100MHz with improved throughput
  • Area Utilization: 15% improvement in resource efficiency with pipelined design
  • Power Consumption: 30% reduction in dynamic power through pipelining

We evaluated both designs using standard benchmarks:

Benchmark Single-Cycle Pipelined Improvement
Bubble Sort 1.2ms 0.4ms 3x faster
Matrix Multiply 8.5ms 2.8ms 3.04x faster
Fibonacci 0.8ms 0.3ms 2.67x faster


Using ModelSim for waveform analysis, we verified:

  • Instruction execution timing across pipeline stages
  • Data hazard resolution through forwarding paths
  • Control hazard handling for branch instructions
  • Memory access patterns for load/store operations

Learning Outcomes

This project significantly enhanced my:

  • Digital Logic Design: Comprehensive understanding of processor architecture
  • VHDL Programming: Proficiency in hardware description languages
  • Design Verification: Systematic testing and validation methodologies
  • Performance Analysis: Critical path analysis and optimization techniques
  • Assembly Programming: MIPS assembly language and instruction set architecture

Professional development aspects included:

  • Team Collaboration: Coordinating complex hardware design across multiple team members
  • Problem-Solving: Debugging intricate timing and logic issues
  • Time Management: Balancing multiple design phases and deadlines
  • Documentation: Creating comprehensive design specifications and test procedures

Project Deliverables

Our comprehensive documentation included:

  • Design Specifications: Detailed component-level documentation
  • Test Reports: Individual operation verification reports (ADD, SUB, AND, OR, XOR, NOR, SLT, SLL, SRL, SRA)
  • Synthesis Reports: Timing and area analysis
  • Bug Reports: Systematic issue tracking and resolution

Source code organization featured:

  • Modular VHDL Design: Hierarchical component structure
  • Testbench Suite: Comprehensive verification framework
  • Assembly Test Programs: Functional validation scripts
  • Documentation: Design cycle and implementation guides

Project Impact

This processor design project served as a cornerstone experience in my computer engineering education, providing:

  • Hands-on Hardware Experience: Direct implementation of theoretical concepts
  • System-Level Thinking: Understanding how components integrate into functional systems
  • Performance Optimization: Learning to balance speed, area, and power constraints
  • Industry-Relevant Skills: VHDL and digital design experience highly valued in semiconductor industry

The project demonstrated the practical application of computer architecture principles and reinforced my interest in hardware design and optimization.


This project was completed as part of CPRE 381 (Computer Organization and Assembly Level Programming) at Iowa State University.