A bit about computer architecture

CS 147, Fall Semester 2007

Robert Correll


Overview

  • RISC microprocessor design

  • Diagnostic testing

  • Software development

  • Microprocessor features

  • System-on-Chip (SoC)


RISC microprocessor design

  • 12 members on the team:

    • Design Manager (1)

    • ASIC Design Engineers (9)

    • Diagnostics Manager (1)

    • Software Engineer (1)

  • Culture:

    • High-tech (Verilog)

    • Very quiet


Embedded 32-bit microprocessor

  • Earns Editor's Choice Award

  • Microprocessor Report Names IDT’s RC32364 Best Embedded Processor for Price/Performance

  • (Volume 12, Number 7, June 1, 1998)


Embedded processor-based applications

  • Low-end routers and switches

  • Cellular base stations

  • Consumer multimedia game systems


Device Overview

  • MIPS-II RISC architecture with enhancements

    • Scalar 5-stage pipeline minimizes branch and load delays

    • DSP engine capable of doing 1 multiply accumulate instruction every 2 clock cycles
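As a rough feel for the stated rate of one multiply-accumulate every 2 clock cycles, the sketch below estimates the cycle cost of a dot product and the peak MAC rate. The function names and the 100 MHz clock in the usage note are assumptions for illustration only; load, store, and loop overhead are ignored.

```python
def mac_cycles(num_macs, cycles_per_mac=2):
    """Cycles spent purely on multiply-accumulates (no load/loop overhead)."""
    return num_macs * cycles_per_mac


def peak_macs_per_second(clock_hz, cycles_per_mac=2):
    """Upper bound on MAC throughput at a given pipeline clock."""
    return clock_hz // cycles_per_mac
```

For example, a 64-tap dot product would need `mac_cycles(64)` = 128 cycles of MAC work, and a hypothetical 100 MHz clock would cap throughput at 50 million MACs per second.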


Device Overview (continued)

  • Enhanced instruction set architecture

    • MIPS-IV compatible conditional move instructions

    • MIPS-IV superset PREF (prefetch) instruction

    • Fast multiplier with atomic multiply-add, multiply-sub

    • Count leading zero/one instructions
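The behaviour of the conditional move and count-leading instructions can be sketched in a few lines. This is a behavioural model in Python, not the hardware implementation; the function names are chosen here for illustration, and registers are modelled as plain 32-bit integers.

```python
MASK32 = 0xFFFFFFFF


def clz32(x):
    """Count leading zero bits of a 32-bit value (32 when x is 0)."""
    return 32 - (x & MASK32).bit_length()


def clo32(x):
    """Count leading one bits of a 32-bit value (32 when x is all ones)."""
    return clz32(~x & MASK32)


def movn(rd, rs, rt):
    """MOVN: the new value of rd is rs when rt is non-zero, else rd is unchanged."""
    return rs if rt != 0 else rd


def movz(rd, rs, rt):
    """MOVZ: the new value of rd is rs when rt is zero, else rd is unchanged."""
    return rs if rt == 0 else rd
```

Conditional moves let a compiler replace a short branch with straight-line code, which avoids the branch delay entirely.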


Device Overview (continued)

  • Large, efficient on-chip caches

    • Separate 8KB Instruction cache and 2KB Data cache

    • 2-way set associative

    • Write-back and write-through support on a per page basis

    • Optional cache locking, with per line resolution, to facilitate deterministic response

    • Simultaneous instruction and data fetch in each clock cycle achieves over 1 GB/sec bandwidth
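To make the cache geometry concrete, the sketch below derives the number of sets and the tag/index/offset split for a 2-way set-associative cache. The 8KB and 2KB sizes and the 2-way associativity come from the slide; the 16-byte line size is an assumption made here, since the slide does not state it, and the function names are illustrative.

```python
def cache_geometry(size_bytes, ways, line_bytes):
    """Return (sets, offset_bits, index_bits) for a set-associative cache.

    Assumes size, ways, and line size are powers of two.
    """
    sets = size_bytes // (ways * line_bytes)
    offset_bits = line_bytes.bit_length() - 1
    index_bits = sets.bit_length() - 1
    return sets, offset_bits, index_bits


def split_address(addr, offset_bits, index_bits):
    """Split an address into (tag, set index, byte offset within the line)."""
    offset = addr & ((1 << offset_bits) - 1)
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset


LINE_BYTES = 16  # assumed line size, not given on the slide
ICACHE = cache_geometry(8 * 1024, 2, LINE_BYTES)
DCACHE = cache_geometry(2 * 1024, 2, LINE_BYTES)
```

Under that assumption the instruction cache has 256 sets and the data cache 64. A lookup reads both ways of the indexed set and compares each stored tag with the tag bits of the address.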


Device Overview (continued)

  • Flexible MMU with 32-page TLB

    • Variable page size

    • Enhanced write algorithm support

    • Variable number of locked entries

    • No performance penalty for address translation


Device Overview (continued)

  • Flexible bus interface allows simple, low-cost designs

    • Bus interface runs at a fraction of pipeline rate

    • Programmable port-width interface (8-, 16-, 32-bit memory and I/O regions)

    • Programmable bus turnaround (BTA) times

    • Supports single datum or burst transactions

    • Selectable system byte-ordering
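Two of the bus features above lend themselves to a small worked example: how many transfers an access needs on a narrower port, and how byte ordering changes the byte sequence of a word. This is a simplified sketch; the function names are hypothetical, and alignment, burst rules, and bus turnaround timing are not modelled.

```python
def bus_transfers(access_bytes, port_bits):
    """Transfers needed to move access_bytes over a port of the given width."""
    port_bytes = port_bits // 8
    return -(-access_bytes // port_bytes)  # ceiling division


def word_bytes(value, byteorder):
    """Bytes of a 32-bit word in address order; byteorder is 'big' or 'little'."""
    return list(value.to_bytes(4, byteorder))
```

A 32-bit word takes 4 transfers on an 8-bit region, 2 on a 16-bit region, and 1 on a 32-bit region. This is why a cheap 8-bit boot device can share the bus with 32-bit memory at the cost of longer accesses.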


RC32364 Block Diagram


Diagnostic Testing

  • Began with 300 tests and behavior model

  • Downloaded 10 to 40 new tests per day

  • One test per directory

  • Build each test

  • Run each test on an RTL model

  • Debug and track failures

  • Finished with more than 3,000 tests


Software Development

  • Test Release System

    • Automated regression process

    • Distributed jobs based upon cycle counts

    • Provided customized history reports

  • Accumulated load per signal utility

  • Test vectors

  • Many other value-added scripts

  • Diagnostic tests
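One plausible way to distribute regression jobs by cycle count is a greedy longest-first assignment: sort the tests by cycle count and always hand the next one to the least-loaded host. The slides do not say which algorithm the Test Release System used, so the sketch below is an assumption, and the test names and host numbering are made up.

```python
import heapq


def distribute_tests(cycle_counts, num_hosts):
    """Assign tests to hosts, longest first, each to the least-loaded host.

    cycle_counts maps test name -> simulated cycle count.
    Returns {host: (total_cycles, [test names])}.
    """
    # Heap entries are (load, host, tests); host ids are unique, so the
    # list is never reached when tuples are compared.
    heap = [(0, host, []) for host in range(num_hosts)]
    heapq.heapify(heap)
    ordered = sorted(cycle_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, cycles in ordered:
        load, host, tests = heapq.heappop(heap)
        tests.append(name)
        heapq.heappush(heap, (load + cycles, host, tests))
    return {host: (load, tests) for load, host, tests in heap}
```

Balancing on cycle counts rather than test counts matters because a few long-running diagnostics would otherwise dominate the wall-clock time of a regression run.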


CPU Instruction Set


Load Link Store Conditional Opcodes

    li    $9, 1
    sw    $9, 0($6)
    .word 0xc0850000        # opcode for: ll $5, 0($4)
    bne   $5, $0, Fail      # verify sem = 0
    li    $5, 2
    li    $9, 2
    sw    $9, 0($6)
    .word 0xe0850000        # opcode for: sc $5, 0($4)
    bne   $5, $8, Fail      # verify sc indicates success
    li    $8, 2
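The two `.word` values are consistent with the standard MIPS I-type encoding of `ll` (opcode 0x30) and `sc` (opcode 0x38), which the sketch below checks. It also includes a deliberately simplified model of load-linked/store-conditional semantics. The class and method names are hypothetical, and the model tracks a single linked address rather than the processor's actual link-bit behaviour.

```python
LL_OPCODE = 0x30
SC_OPCODE = 0x38


def encode_itype(opcode, base, rt, offset):
    """Encode a MIPS I-type instruction: opcode | base | rt | 16-bit offset."""
    return (opcode << 26) | (base << 21) | (rt << 16) | (offset & 0xFFFF)


class LLSCMemory:
    """Toy memory with one link register, enough to show why sc can fail."""

    def __init__(self):
        self.words = {}
        self.link = None  # address being monitored, or None

    def ll(self, addr):
        """Load-linked: read the word and start monitoring its address."""
        self.link = addr
        return self.words.get(addr, 0)

    def other_store(self, addr, value):
        """A store by some other agent; it breaks a link to the same address."""
        self.words[addr] = value
        if self.link == addr:
            self.link = None

    def sc(self, addr, value):
        """Store-conditional: store only if the link is intact; return 1 or 0."""
        ok = self.link == addr
        if ok:
            self.words[addr] = value
        self.link = None
        return 1 if ok else 0
```

A semaphore acquire loops on `ll`, tests the value, then attempts `sc` and retries when it returns 0.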


CPU Pipeline Architecture


CPU Pipeline Stages

  • 1I - Instruction Fetch, Phase one

    • Instruction address translation begins

  • 2I - Instruction Fetch, Phase two

    • Instruction cache fetch begins

    • Instruction address translation continues


CPU Pipeline Stages (continued)

  • 1R - Register Fetch, Phase one

    • The instruction cache fetch finishes.

    • The instruction cache tag is checked against the physical page frame number obtained from the address translation.


CPU Pipeline Stages (continued)

  • 2R - Register Fetch, Phase two

    • The instruction decoder decodes the instruction.

    • Any required operands are fetched from the register file.

    • Make a decision to either issue or slip (for an interlock condition).

    • For a branch, the branch address is calculated.


CPU Pipeline Stages (continued)

  • 1A - Execution, Phase one

    • Any result from the A or D stages is bypassed.

    • The arithmetic logic unit (ALU) starts the integer arithmetic, logical or shift operation.

    • The ALU calculates the data virtual address for load and store instructions.

    • The ALU determines whether the branch condition is true.


CPU Pipeline Stages (continued)

  • 2A - Execution, Phase two

    • The integer arithmetic, logical or shift operation will complete.

    • A data cache access will start.

    • Store data is shifted to the specified byte position(s).

    • The data virtual to physical address translation will start.


CPU Pipeline Stages (continued)

  • 1D - Data Fetch, Phase one

    • The data cache access will continue.

    • The data address translation completes.

  • 2D - Data Fetch, Phase two

    • The data cache access will finish and the data is then shifted down and extended.

    • The data cache tag is checked against the physical address for any data cache access.


CPU Pipeline Stages (continued)

  • 1W - Write Back, Phase one

    • The processor uses this phase internally to resolve all exceptions in preparation for the register file write.

  • 2W - Write Back, Phase two

    • For register-to-register and load instructions, the result is written back to the register file.

    • Branch instructions perform no operation during this stage.
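The overlap of the five stages described above (I, R, A, D, W) can be visualised with a small chart generator. This is an idealised sketch that assumes one instruction issues per cycle with no stalls or slips, and it does not show the two phases within each stage. The function name is illustrative.

```python
STAGES = ["I", "R", "A", "D", "W"]


def pipeline_chart(num_instructions):
    """One string per instruction; character c is the stage occupied in cycle c."""
    total_cycles = num_instructions + len(STAGES) - 1
    chart = []
    for i in range(num_instructions):
        row = ["."] * total_cycles
        for s, stage in enumerate(STAGES):
            row[i + s] = stage  # instruction i enters stage s in cycle i + s
        chart.append("".join(row))
    return chart


if __name__ == "__main__":
    for line in pipeline_chart(3):
        print(line)
```

Each instruction still takes five cycles from fetch to write back, but once the pipeline is full one instruction completes every cycle, so n instructions need n + 4 cycles in this ideal case.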


Activities during each ALU pipeline stage...


...for load, store, and branch instructions.


Stall Conditions

  • Detected after the R pipe-stage.

  • The processor will resolve the condition.

    • Detect cache miss

    • Start moving dirty cache line data to write buffer

    • Get first doubleword into cache and restart pipeline

    • Load remainder of cache line into cache


Slip Conditions

  • Slipped instructions are retried on subsequent cycles

    • Detect cache miss

    • Get entire cache line into cache

    • Continue pipeline

    • Inserted NOP instructions


Memory Management Unit (MMU)

  • Generates translation lookaside buffer (TLB) exceptions such as:

    • TLB refill

    • TLB invalid

    • TLB modified

  • Offers the following advantages:

    • Variable page size

    • Enhanced Write Algorithm support

    • Mapping of a larger portion of the virtual address space

    • Variable number of locked entries
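The three TLB exceptions listed above map naturally onto the outcomes of a lookup: no matching entry (refill), a matching entry that is not valid (invalid), and a write to a page not marked writable (modified). The sketch below is a simplified model under stated assumptions: each entry carries its own page size, and address-space identifiers, global bits, and paired even/odd pages are omitted. All names are hypothetical.

```python
from dataclasses import dataclass


class TLBRefill(Exception):
    """No entry matches the virtual address."""


class TLBInvalid(Exception):
    """An entry matches but is not marked valid."""


class TLBModified(Exception):
    """A write hit an entry that is not marked dirty (writable)."""


@dataclass
class TLBEntry:
    vpn: int          # virtual page number (vaddr >> page_bits)
    pfn: int          # physical frame number
    page_bits: int    # log2 of the page size; varies per entry
    valid: bool = True
    dirty: bool = True


def translate(tlb, vaddr, is_write=False):
    """Translate vaddr using a list of TLBEntry, raising on TLB exceptions."""
    for entry in tlb:
        if vaddr >> entry.page_bits == entry.vpn:
            if not entry.valid:
                raise TLBInvalid(hex(vaddr))
            if is_write and not entry.dirty:
                raise TLBModified(hex(vaddr))
            offset = vaddr & ((1 << entry.page_bits) - 1)
            return (entry.pfn << entry.page_bits) | offset
    raise TLBRefill(hex(vaddr))
```

In this model a refill handler would append a new entry and retry the access. Larger pages let a fixed number of entries map more of the virtual address space.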


32-bit Virtual Address Translation


TLB Register Format


TLB Register Field Descriptions


MMU Register Descriptions


Range of wired and random entries


User Mode Address Space


Kernel Mode Address Space


CPU Exception Processing

  • Begins when the processor receives and detects exceptions such as:

    • address translation errors

    • arithmetic overflows

    • I/O interrupts

    • system calls

  • Processor suspends normal instruction sequence and enters Kernel mode


CPU Exception Processing (continued)

  • Processor then disables interrupts

  • Processor forces execution of a software handler located at a fixed address

  • The handler may save processor context:

    • program counter contents

    • current operating mode (User or Kernel mode)

    • interrupt status (enabled or disabled)
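The exception entry sequence described above can be modelled as a small state change. This is an illustrative sketch only: the `CPU` class, its field names, and the vector address are assumptions, and on the real processor the context lives in coprocessor 0 registers rather than a dictionary.

```python
from dataclasses import dataclass

# Illustrative fixed handler address; consult the user manual for real vectors.
HANDLER_VECTOR = 0x80000180


@dataclass
class CPU:
    pc: int = 0
    kernel_mode: bool = False
    interrupts_enabled: bool = True
    saved: dict = None  # context captured at exception entry


def take_exception(cpu, cause):
    """Suspend the normal instruction sequence and enter the handler."""
    cpu.saved = {
        "pc": cpu.pc,
        "kernel_mode": cpu.kernel_mode,
        "interrupts_enabled": cpu.interrupts_enabled,
        "cause": cause,
    }
    cpu.kernel_mode = True
    cpu.interrupts_enabled = False
    cpu.pc = HANDLER_VECTOR


def return_from_exception(cpu):
    """Restore the saved context and resume at the saved program counter."""
    cpu.pc = cpu.saved["pc"]
    cpu.kernel_mode = cpu.saved["kernel_mode"]
    cpu.interrupts_enabled = cpu.saved["interrupts_enabled"]
    cpu.saved = None
```

Disabling interrupts on entry gives the handler a window to save this context before another exception can overwrite it.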


Exception Processing Registers...


Basic CP0 Registers


Exception Priority


Cache Organization, Operation, and Coherency


Primary I-Cache Line Format


Primary D-Cache Line Format


Conceptual Primary Cache Lookup Seq.


Primary Cache Data and Tag Organization


Primary Cache States


Clocking, Reset, and Initialization Interfaces


Timing Illustration of MasterClock-to-PClock Multiply by 2


EJTAG (In-circuit Emulator) Interface


EJTAG Block Diagram


System-on-Chip (SoC)


SoC (continued)


SoC (continued)


Summary

  • RISC microprocessor design

  • Diagnostic testing

  • Software development

  • Microprocessor features

  • System-on-Chip (SoC)


References

  • IDT™ 79RC32364 RISController™ Advanced Architecture, 32-bit Embedded Microprocessor, User’s Reference Manual, 1999, http://www.idt.com/products/files/10750/79RC32364_MA_38374.pdf?CFID=1729583&CFTOKEN=95787432

  • IDT™ Interprise™ 79RC32351 Integrated Communications Processor Data Sheet, 2004, http://www.idt.com/products/files/10702/RC32351_DS_23066.pdf?CFID=1729583&CFTOKEN=95787432


References (continued)

  • IDT™ Interprise™ 79RC32365 Integrated Communications Processors User Reference Manual, 2004, http://www.idt.com/products/files/10712/79RC32365_MA_12022.pdf?CFID=1729583&CFTOKEN=95787432

  • IDT™ Interprise™ 79RC32435 Integrated Communications Processor Data Sheet, 2006, http://www.idt.com/products/files/571508/32435_ds.pdf?CFID=1729583&CFTOKEN=95787432

