
ECE 526 – Network Processing Systems Design

ECE 526 – Network Processing Systems Design. Network Processor Introduction. Chapters 11 and 12: D. E. Comer. Goal: understanding the inefficiency of 1st, 2nd, and 3rd generation network processing systems; scalability plus flexibility.



Presentation Transcript


  1. ECE 526 – Network Processing Systems Design • Network Processor Introduction • Chapters 11 and 12: D. E. Comer

  2. Goal • Understand the inefficiency of 1st, 2nd, and 3rd generation network processing systems • Scalability plus flexibility • Recognize the necessity of a new solution: 4th generation (network processor technology) • Learning • courage to appreciate the challenges • skill to characterize the “real” problem • art to propose an engineering solution • Be aware that “network processor” is currently a conceptual and general term

  3. Recall: 1st Generation • 1st generation network processing system • Feasibility study • Design a software router • Data rate: 10 Gbps • Assume small packets (64 B) • Assume each packet needs 10,000 instructions to process • Can the Intel 80986@2007 do the job? • CPU: 24 GHz • MIPS: 125,000 (million instructions per second) • 1 billion transistors … • Conclusion: not feasible • What is the real problem here?
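The feasibility question on this slide can be checked with a few lines of arithmetic. The sketch below uses only the slide's numbers (10 Gbps, 64-byte packets, 10,000 instructions per packet, 125,000 MIPS); the function name is illustrative, and Ethernet framing overhead (preamble, inter-frame gap) is ignored as a simplifying assumption.

```python
# Feasibility check for a software router, using the numbers on the slide.
# Assumption: framing overhead is ignored, so every bit on the link is packet data.

LINK_RATE_BPS = 10e9        # 10 Gbps link
PACKET_BYTES = 64           # small packets
INSTR_PER_PACKET = 10_000   # processing cost per packet
CPU_MIPS = 125_000          # rating of the hypothetical CPU on the slide


def required_mips(link_rate_bps, packet_bytes, instr_per_packet):
    """Return the instruction rate (in MIPS) needed to keep up with the link."""
    packets_per_second = link_rate_bps / (packet_bytes * 8)
    return packets_per_second * instr_per_packet / 1e6


needed = required_mips(LINK_RATE_BPS, PACKET_BYTES, INSTR_PER_PACKET)
print(f"packets per second: {LINK_RATE_BPS / (PACKET_BYTES * 8):,.0f}")
print(f"required MIPS:      {needed:,.1f}")
print(f"available MIPS:     {CPU_MIPS:,}")
print(f"feasible:           {needed <= CPU_MIPS}")
```

Under these assumptions the link delivers about 19.5 million packets per second, which needs roughly 195,000 MIPS, well above the 125,000 MIPS available. This agrees with the slide's "not feasible" conclusion.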

  4. The Real Problem • Technology push: uneven • Link bandwidth is scaling much faster than CPU and memory technology • Transistor scaling and VLSI technology help, but not enough • Application pull: harder • More complex applications are required • Processing complexity is defined as the number of instructions and the number of memory accesses needed to process one packet

  5. What is the ideal platform? • Structured ASIC • Reconfigurable co-processors • Network processor • FPGA

  6. 2nd and 3rd Generations • 2nd generation: offloading and decentralization • 3rd generation: further offloading and use of specialized devices (ASIC + embedded processors) • Problems: loss of flexibility and very high cost. Why?

  7. Why not ASIC? • High cost to develop • Network processing is a moderate-quantity market • Long time to market • Network processing services change quickly • Difficult to simulate • Complex protocols • Expensive and time-consuming to change • Little reuse across products • Limited reuse across versions • No consensus on framework or supporting chips • Requires expertise

  8. Network Processors • Question: where does an NP gain its higher performance from, compared with a conventional processor?

  9. Instruction Set: minimality • Not as general as RISC and CISC processors • E.g. no floating point instructions • Optimized for packet processing functions only • Not specific to a protocol or part of a protocol • Seek a minimal set of instructions sufficient to handle arbitrary protocols, • plus specific instructions for protocol processing • Example: atomic operations • A hard problem, to be covered later

  10. Architecture: multiprocessor • Parallelism • The network processing workload is highly parallel by nature • Flow-level • Queue-level • Packet-level • Protocol-level • Pipelining • Pipelining helps system performance at the cost of longer delay • Is this acceptable? • System-on-chip • Processing: RISC cores • Memory: registers, cache, instruction store, scratch pad, SRAM and SDRAM • I/O: network / switch fabric interfaces • Question: how hard is it to build and use such NPs?
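The pipelining trade-off on this slide (better throughput, longer delay) can be illustrated with a toy model. Everything in this sketch is an assumption made for illustration: 4 µs of work per packet, an even split across four stages, and a fixed 0.2 µs handoff cost per stage for passing the packet between stages. Latency is measured for a single packet through an otherwise idle pipeline.

```python
# Toy model of pipelining: throughput is limited by the slowest stage,
# while one packet's latency is the sum of all stage times.
# All numbers are hypothetical and chosen only to show the trade-off.

def stage_times(total_work_us, n_stages, handoff_us):
    """Split the work evenly and add a fixed handoff cost to every stage."""
    return [total_work_us / n_stages + handoff_us] * n_stages


def latency_us(stages):
    """Time for one packet to traverse every stage of an idle pipeline."""
    return sum(stages)


def throughput_pkts_per_us(stages):
    """Steady-state rate: one packet leaves per slowest-stage time."""
    return 1.0 / max(stages)


single = stage_times(4.0, 1, 0.0)   # one processor does all the work
piped = stage_times(4.0, 4, 0.2)    # four stages, each paying a handoff cost

for name, stages in (("single", single), ("pipelined", piped)):
    print(f"{name:9s} latency = {latency_us(stages):.1f} us, "
          f"throughput = {throughput_pkts_per_us(stages):.2f} pkt/us")
```

In this model the pipelined design raises throughput from 0.25 to about 0.83 packets per µs, while per-packet latency grows from 4.0 µs to about 4.8 µs. Whether that extra delay is acceptable is the question the slide poses.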

  11. Typical Processing

  12. Case Study: IPv4 Packet Forwarding • 2-port router (2 Gbps) • Xilinx Virtex-II Pro FPGA (2VP30) • IP lookup: longest prefix match (trie lookup algorithm) • [Figure: task graph with nodes From (0), From (1), Lookup, IPRoute, To (0), To (1); trie lookup example with a root and nodes a–e, a prefix table (hex : binary), and numbered memory accesses]
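The case study names longest prefix match with a trie as its lookup method. The sketch below is a minimal one-bit-per-level binary trie, which is an assumption made for brevity and not the structure from the slide's figure (that figure appears to step through hex digits). The class names, the routes, and the next-hop labels are hypothetical.

```python
import ipaddress


class TrieNode:
    """One node per prefix bit; next_hop is set only where a route ends."""
    __slots__ = ("children", "next_hop")

    def __init__(self):
        self.children = [None, None]
        self.next_hop = None


class RouteTable:
    """Binary trie for IPv4 longest-prefix-match lookup."""

    def __init__(self):
        self.root = TrieNode()

    def insert(self, prefix, next_hop):
        net = ipaddress.ip_network(prefix)
        bits = int(net.network_address)
        node = self.root
        for i in range(net.prefixlen):
            bit = (bits >> (31 - i)) & 1
            if node.children[bit] is None:
                node.children[bit] = TrieNode()
            node = node.children[bit]
        node.next_hop = next_hop

    def lookup(self, address):
        bits = int(ipaddress.ip_address(address))
        node = self.root
        best = node.next_hop            # default route, if one was inserted
        for i in range(32):
            node = node.children[(bits >> (31 - i)) & 1]
            if node is None:
                break
            if node.next_hop is not None:
                best = node.next_hop    # remember the longest match so far
        return best


# Hypothetical routes for a 2-port router.
table = RouteTable()
table.insert("0.0.0.0/0", "port 0 (default)")
table.insert("10.0.0.0/8", "port 1")
table.insert("10.1.0.0/16", "port 0")

for addr in ("10.1.2.3", "10.2.0.1", "192.168.1.1"):
    print(addr, "->", table.lookup(addr))
```

Each trie level visited corresponds to one memory access, which is why hardware implementations usually consume several bits per level to shorten the walk.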

  13. Multiprocessor for Header Processing • FIFO queues • [Figure: block diagram with labels Packet Reception, Packet Transmission, four sets of Verify / Lookup-1 / Lookup-2 / Transmit stages, BRAM, FSL, OPB, RS232, Timer, LEDs]

  14. Typical Use of NPs

  15. System Implementation Space

  16. Memory Architecture • Memory access bottleneck • Memory is area-consuming • Limited on-chip memory • Limited bandwidth to off-chip memory: pin and package cost • Off-chip memory access is slow: about 100 cycles • Possible solutions • Profile the application's memory access pattern • Propose a heterogeneous memory architecture • Memory-aware mapping • Transactional memory (project topic)
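A rough cost model shows why the 100-cycle off-chip access dominates. This sketch assumes one cycle per instruction and fully blocking memory accesses with no overlap; the per-packet counts (500 instructions, 6 off-chip accesses) are hypothetical and chosen only for illustration.

```python
# Rough per-packet cycle count when every off-chip access stalls the processor.
# Assumptions: 1 cycle per instruction, no overlap of computation and memory.

OFFCHIP_LATENCY_CYCLES = 100  # figure quoted on the slide


def cycles_per_packet(instructions, offchip_accesses,
                      latency=OFFCHIP_LATENCY_CYCLES, cpi=1.0):
    """Compute cycles plus memory stall cycles for one packet."""
    return instructions * cpi + offchip_accesses * latency


def stall_fraction(instructions, offchip_accesses,
                   latency=OFFCHIP_LATENCY_CYCLES, cpi=1.0):
    """Share of the per-packet time spent waiting on off-chip memory."""
    total = cycles_per_packet(instructions, offchip_accesses, latency, cpi)
    return offchip_accesses * latency / total


total = cycles_per_packet(500, 6)
waiting = stall_fraction(500, 6)
print(f"cycles per packet: {total:.0f}")
print(f"fraction stalled on memory: {waiting:.0%}")
```

With these hypothetical counts, more than half of the per-packet time is memory stall, which is the motivation for keeping hot data on chip and for hiding latency with multiple processing contexts.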

  17. Application Mapping • Current approach: fixed topology, assembly coding, and hand-tuning

  18. Basic Steps for Mapping • Application description • High-level optimizations • Task graph • (platform specific) • Profile • Architecture configuration • HW / SW partitioning • Task allocation • Data layout • Communication assignment • Compilation / Synthesis • [Figure: task graph with nodes From (0), From (1), Lookup, IPRoute, To (0), To (1) mapped onto a platform of PE, FPGA, and MEM blocks]

  19. Summary • Network Processor • Special-purpose, programmable hardware device • Optimized for network processing • Building block of network processing systems • Fundamental ideas • Flexibility through programmability • Scalability through parallelism and pipelining • Here, NP is a concept • We will study an example network processor soon

  20. For Next Class & Announcements • Read Comer: chapters 13 and 14 • Lab 1 total grade reduced to 82 • HW 1 due Wed. • Project topic will be announced after Wed.
