1 / 24

CS 162 Computer Architecture Lecture 8: Introduction to Network Processors (II)

CS 162 Computer Architecture Lecture 8: Introduction to Network Processors (II). Instructor: L.N. Bhuyan www.cs.ucr.edu/~bhuyan/cs162. Outline. Introduction to NP Systems Relevant Applications Design Issues and Challenges Relevant Software and Benchmarks

bat
Download Presentation

CS 162 Computer Architecture Lecture 8: Introduction to Network Processors (II)

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CS 162 Computer Architecture Lecture 8: Introduction toNetwork Processors (II) Instructor: L.N. Bhuyan www.cs.ucr.edu/~bhuyan/cs162

  2. Outline • Introduction to NP Systems • Relevant Applications • Design Issues and Challenges • Relevant Software and Benchmarks • A case study: Intel IXP network processors

  3. What are Network Processors • Any device that executes programs to handle packets in a data network • Examples • Processors on router line cards • Processors in network access equipment

  4. Why Network Processors • Current Situation • Data rates are increasing • Protocols are becoming more dynamic and sophisticated • Protocols are being introduced more rapidly • Processing Elements • GP(General-purpose Processor) • Programmable, Not optimized for networking applications • ASIC(Application Specific Integrated Circuit) • high processing capacity, long time to develop, Lack the flexibility • NP(Network Processor) • achieve high processing performance • programming flexibility • Cheaper than GP

  5. Outline • Introduction to NP Systems • Relevant Applications • Design Issues and Challenges • Relevant Software and Benchmarks • A case study: Intel IXP network processors

  6. Organizing Processor Resources • Design decisions: • High-level organization • ISA and micro architecture • Memory and I/O integration • Today’s commercial NPs: • Chip multiprocessors • Most are multithreaded • Exploit little ILP (Cisco does) • No cache • Micro-programmed

  7. Architectural Comparisons • High-level organizations • Aggressive superscalar (SS) • Fine-grained multithreaded (FGMT) • Chip multiprocessor (CMP) • Simultaneous multithreaded (SMT) Ref: [NPRD]

  8. Architectural Comparisons (cont.) Simultaneous Multithreading Multiprocessing Superscalar Fine-Grained Coarse-Grained Time (processor cycle) Thread 1 Thread 3 Thread 5 Thread 2 Thread 4 Idle slot

  9. Tasks and Services Three Benchmarks used in the experiment Ref: [NPRD]

  10. Performance Evaluation • Systems must support some form of concurrent packet-level parallelism • SMT and CMP are nearly equivalent, with SMT always coming out ahead Forwarding: IP Forward Authentication: MD5 Encryption: 3DES SS FGMT CMP SMT • Workloads have little ILP • Need to exploit packet-level parallelism • CMP and SMT do just that Ref: [NPRD]

  11. Example Toaster System: Cisco 10000 • Almost all data plane operations execute on the programmable XMC • Pipeline stages are assigned tasks – e.g. classification, routing, firewall, MPLS • Classic SW load balancing problem • External SDRAM shared by common pipe stages Ref: [NPT]

  12. IBM PowerNP • 16 pico-procesors and 1 powerPC • Each pico-processor • Support 2 hardware threads • 3 stage pipeline : fetch/decode/execute • Dyadic Processing Unit • Two pico-processors • 2KB Shared memory • Tree search engine • Focus is layers 2-4 • PowerPC 405 for control plane operations • 16K I and D caches • Target is OC-48 Ref: [NPT]

  13. C-Port C-5 Chip Architecture Ref: [NPT]

  14. Some Challenges • Intelligent Design • Given a selection of programs, a target network link speed, the ‘best’ design for the processor • Least area • Least power • Most performance • Write efficient multithreaded programs • NPs have • Heterogeneous computer resources • Non-uniform memory • Multiple interacting threads of execution • Real-time constraints • Make use of resources • How to use special instructions and hardware assists • Compilers • Hand-coded • Multithreaded programs • Manage access to shared state • Synchronization between threads Ref: [NPRD]

  15. Outline • Introduction to NP Systems • Relevant Applications • Design Issues and Challenges • Relevant Software and Benchmarks • A case study: Intel IXP network processors

  16. NP Software • Teja • NPU vendor-neutral software tools • Key is a GUI-based state-machine tool • CLICK router • From MIT, supports a specialized development model • Zebra • Open source routing environment • Supporting most of the key IP routing protocols in SW • IP Fusion Inc. is providing commercial support • LVL7 • Closed source – i.e. traditional commercial – complete IP solutions Ref: [NPT]

  17. Benchmarks for Network Processors • NetBench • 10 applications • http://cares.icsl.ucla.edu/NetBench • CommBench • 8 networking and communications applications • http://ccrc.wustl.edu/~wolf/cb/ • EEMBC • http://www.eembc.org/benchmark • MediaBench • Transcoders • Some communications applications Ref: [NPT]

  18. Outline • Introduction to NP Systems • Relevant Applications • Design Issues and Challenges • Relevant Software and Benchmarks • A case study: Intel IXP network processors

  19. IXP1200 Block Diagram • StrongARM processing core • Microengines introduce new ISA • I/O • PCI • SDRAM • SRAM • IX : PCI-like packet bus • On chip FIFOs • 16 entry 64B each Ref: [NPT]

  20. IXP1200 Microengine • 4 hardware contexts • Single issue processor • Explicit optional context switch on SRAM access • Registers • All are single ported • Separate GPR • 256*6 = 1536 registers total • 32-bit ALU • Can access GPR or XFER registers • Shared hash unit • 1/2/3 values – 48b/64b • For IP routing hashing • Standard 5 stage pipeline • 4KB SRAM instruction store – not a cache! • Barrel shifter Ref: [NPT]

  21. DDR DRAM controller ME0 ME1 Scratch /Hash /CSR ME3 ME2 XScale Core MSF Unit PCI ME4 ME5 QDR SRAM controller ME7 ME6 IXP 2400 Block Diagram • XScale core replaces StrongARM • Microengines • Faster • More: 2 clusters of 4 microengines each • Local memory • Next neighbor routes added between microengines • Hardware to accelerate CRC operations and Random number generation • 16 entry CAM

  22. Different Types of Memory Ref: [NPRD]

  23. IXA Software Framework External Processors Control Plane Protocol Stack Control Plane PDK XScale Core C/C++ Language Core Components Core Component Library Resource Manager Library Microengine Pipeline Microblock Library Microengine C Language Micro block Micro block Micro block Protocol Library Utility Library Hardware Abstraction Library Ref: [NPRD]

  24. References • [NPT] W. H. Mangione-Smith, G. Memik Network Processor Technologies • [NPRD] Patrick Crowley, Raj Yavatkar An Introduction to Network Processor Research & Design, HPCA-9 Tutorial

More Related