ece 526 network processing systems design
Download
Skip this Video
Download Presentation
ECE 526 – Network Processing Systems Design

Loading in 2 Seconds...

play fullscreen
1 / 25

ECE 526 – Network Processing Systems Design - PowerPoint PPT Presentation


  • 103 Views
  • Uploaded on

ECE 526 – Network Processing Systems Design. IXP XScale and Microengines Chapter 18 & 19: D. E. Comer. Overview. Recalled Packet processing functions (forwarding, queuing…) Traditional network processing systems (CPU + NICs) General network processor architecture and tradeoffs

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about ' ECE 526 – Network Processing Systems Design' - brac


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
ece 526 network processing systems design

ECE 526 – Network Processing Systems Design

IXP XScale and Microengines

Chapter 18 & 19: D. E. Comer

overview
Overview
  • Recalled
    • Packet processing functions (forwarding, queuing…)
    • Traditional network processing systems (CPU + NICs)
    • General network processor architecture and tradeoffs
    • Intel IXP network processors overall architecture
  • Focus on individual components of Intel IXP chip
    • Control processor (slow path): XScale core
      • Overall architecture
      • Typical functions
      • Processor features
    • Packet processing processor (fast path): Microengines
      • Architecture and features
      • Differences to conventional processors
      • Pipelining and multi-threading

ECE 526

purpose of control processor
Purpose of Control Processor
  • Functions typically executed by embedded control proc:
    • Bootstrapping
    • Exception handling
    • Higher-layer protocol processing
    • Interactive debugging
    • Diagnostics and logging
    • Memory allocation
    • Application programs (if needed)
    • User interface and/or

interface to the GPP

    • Control of packet processors
    • Other administrative functions

ECE 526

xscale memory architecture
XScale Memory Architecture
  • Memory architecture
    • Uses 32-bit linear address space
    • configurable endian mode
    • Byte addressable
  • Memory Mapping
    • Allocation of address space (2^32) to different system components
    • Accesses to memory is translated into access to component
    • Needs to be carefully crafted
  • XScale assumes byte addressable memory
    • Underlying memory uses different size (SDRAM)
    • How does this work?
  • Support for Virtual Memory
    • For demand paging to secondary storage

ECE 526

shared memory address issues
Shared Memory Address Issues
  • Memory is shared between XScale and Microengines
  • Same data, but different addresses
  • What impact does this have?
    • Pointers need to be translated
    • Data structures with pointers can not be shared

ECE 526

microengines
Microengines
  • Microengines are data-path packet processors IXP
  • IXP 2400 have 8 Microengines
  • Simpler than XScale
  • Low level device

as a micro-sequencer

  • Optimized for

packet processing

  • More complex to use
  • Often abbreviated as uE

ECE 526

ue functions
uE Functions
  • uEs handle ingress and egress packet processing:
    • Packet ingress from physical layer hardware
    • Checksum verification
    • Header processing and classification
    • Packet buffering in memory
    • Table lookup and forwarding
    • Header modification
    • Checksum computation
    • Packet egress to physical layer hardware

ECE 526

ue architecture
uE Architecture
  • uE characteristics:
    • Programmable microcontroller
    • RISC design
    • 256 general-purpose registers
    • 512 transfer registers
    • 128 next neighbor registers
    • Hardware support for 8 threads and context switching
    • 640 words of local memory
    • Control of an Arithmetic and Logic Unit
    • Direct access to various functional units
    • A unit to compute a Cyclic Redundancy Check (CRC)

ECE 526

ue as micro sequencer
uE as Micro-sequencer
  • Micro-sequencer does not contain native instructions for possible operations
    • Instead of using instructions, uE invokes functional units to perform operations
    • Control unit is much “simpler”
  • Example 1:
    • uE does not have ADD R2,R3 instruction
    • Instead: ALU ADD R2, R3
    • “ALU” indicates that ALU should be used
    • “ADD” is a parameter to ALU
  • Example 2:
    • Memory access not by simple LOAD R2, 0xdeadbeef
    • Instead: SRAM LOAD R2, 0xdeadbeef
  • Altogether similar to normal processor, but more basic

ECE 526

ue instruction set
uE Instruction Set
  • General
    • ALU and etc
  • Brach and Jump
    • BR: branch unconditionally
  • CAM
    • CAM_CLEAR: clear all entries in local memories
  • I/O and context swap
    • SCRATCH (read and write)
  • For detail see Figure 19.1, 19.2, Comer.

ECE 526

ue memories
uE Memories
  • uEs: viewing memories differently than XScale does
    • Does not map memories and I/O devices into a liner address space
    • Does not view memories as a seamless, uniform repository
  • uE ISA: requiring a separate instruction for each type of memory and I/O device
    • SRAM[read, $$x, address1, address2…]
  • Programmer: required binding of data items to specific type of memory permanently.

ECE 526

execution pipeline
Execution Pipeline
  • What is pipeline?
  • Why pipeline is employed?
    • One instruction is executed per cycle if pipeline is proper designed
  • uEs use five-stage or six-stage pipeline:

ECE 526

pipelining
Pipelining

ECE 526

pipelining problems
Pipelining Problems
  • Possible sources of pipelining problems
    • Data dependencies
    • Control dependencies
    • Resource dependencies
    • Memory accesses
  • How pipelining problem impact system performance
  • How these impact can be removed or reduced
    • Remove the sources so that no stall happened
    • Hide the impact of pipelining stall

ECE 526

pipeline stalls
Pipeline Stalls
  • K: ALU ADD R2, R1, R2
  • K+1 ALU ADD R3, R2, R3
  • Control dependencies, memory have even bigger impact

ECE 526

hardware threads
Hardware Threads
  • uEs support 8 hardware thread contexts
    • One thread can execute at any given time
    • When stall occurs, uE can switch to other thread (if not stalled)
  • Very low overhead for context switch
    • “Zero-cycle context switch”
    • Effectively can take around three cycles due to pipeline flush
  • Switching rules
    • If thread stalls, check if next is ready for processing
    • Keep trying until ready thread is found
    • If none is available, stall uE and wait for any thread to unblock
  • Improves overall throughput
  • Questions:
    • Why not 16, 32 threads
    • why not have 48 uEs with 1 thread?

ECE 526

summary
Summary
  • Control processor (slow path): XScale core
      • Overall architecture
      • Typical functions
      • Processor features
  • Packet processing processor (fast path): Microengines
      • Architecture and features
      • Differences to conventional processors
      • Pipelining and multi-threading

ECE 526

lab3 brief
Lab3 Brief
  • Intel Reference Systems
  • SDK Tutorial
  • Lab 3

ECE 526

intel reference systems
Intel Reference Systems
  • Hardware Testbed
    • IXP2400 network processors
    • QDRM-SRAM, Flash ROM and other memories
    • 1G optical ethernet ports
    • 100M ethernet management port
    • Serial interface
    • PCI interfaces
  • SDK (software development kit)
    • Compiler
    • Assembler, linker
    • Simulator
    • Reference codes

ECE 526

lab3 forwarding counting classification
Lab3: Forwarding, Counting & Classification
  • Goal: to explore the basic functionalities of the IXP2400 software development kit and Microengines.
  • 3 parts:
    • Part I: collecting a number of workload statistics from the IXP SDK simulator. Follow steps of lab instruction.
    • Part II: adding one counting block to count the number of packets.
    • Part III: implementing a simple packet classification mechanism.
  • Tools: All three parts require access to a machine that has the Intel SDK installed. If you want, you can also request an installation CD for your own machine, check with TA.

ECE 526

part i forwarding simulation
Part I: Forwarding Simulation
  • run an implementation of IP forwarding on the IXP2400 simulator. All the code is provided to you.
  • collect a set of workload statistics that are reported by the simulator.

ECE 526

part ii forwarding and counting
Part II: Forwarding and Counting
  • modify above applications by adding counter block
  • store how many packets are received.

ECE 526

part iii classification and counting
Part III: Classification and Counting
  • classifying packets based on the packet header information. There are four types of traffic that are considered in this lab:
    • Web traffic over TCP over IPv4
    • Non-Web traffic over TCP over IPv4
    • UDP over IPv4
    • IPv6
  • modifying the code to report the number of packets

in each type.

ECE 526

how to do lab3
How to do Lab3
  • Windows machine with SDK installed
  • Download lab instructions and source code from blackboard
  • Start early.
  • Very exciting lab.
  • Due day
    • Part I and Part II 10/13
    • Part III 10/20

ECE 526

ad