IXP Lab 2012: Part 1 Network Processor Brief
Outline • Network Processor • Intel IXP2400 • Processing Element • Register • Memory Interface • IXP Programming • Language • Programming Model • Programming Syntax NCKU CSIE CIAL Lab
Router Development (1) • Software Based • General Purpose Processor • Flexible • Poor Performance … • Hardware Based • ASIC • Best Performance • Long Development Time
Router Development (2) • Network Processor (NPU) Based • Balance of both • How? • Parallel processors • Multi-threaded cores • Programmable processors with non-programmable coprocessors
Network Processor Overview • For high-speed packet processing • Comprises multiple cores for parallel execution • Multi-Threaded Cores • Reduced Instruction Set • Multiple Memory Interfaces
Hierarchical Layer • Data-Plane • Fast-Path • Slow-Path • Control-Plane • Routing Protocol • Management-Plane • Monitor Applications • User Interface
Data-Plane • Fast-Path • General Packet Handling • As fast as possible • Slow-Path • Exception Packet Handling • Packet with options • Local TCP/IP Stack
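The fast-path/slow-path split above can be sketched as a small classifier. This is a minimal illustration in plain C, not code from the IXA SDK: the names `path_t` and `classify_ipv4` are hypothetical, and it assumes only the two exception cases named on the slide (packets with IP options, packets addressed to the local TCP/IP stack), plus malformed headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { PATH_FAST, PATH_SLOW } path_t;   /* hypothetical names */

/* Decide which path handles an IPv4 packet.
 * hdr points at the IPv4 header in network byte order,
 * local_addr is this router's own address in host byte order. */
path_t classify_ipv4(const uint8_t *hdr, size_t len, uint32_t local_addr)
{
    assert(hdr != NULL);

    if (len < 20)
        return PATH_SLOW;               /* truncated header: exception */

    unsigned version = hdr[0] >> 4;     /* upper nibble of first byte */
    unsigned ihl     = hdr[0] & 0x0f;   /* header length in 32-bit words */

    if (version != 4 || ihl != 5)
        return PATH_SLOW;               /* IP options present, or malformed */

    uint32_t dst = ((uint32_t)hdr[16] << 24) | ((uint32_t)hdr[17] << 16) |
                   ((uint32_t)hdr[18] << 8)  |  (uint32_t)hdr[19];

    if (dst == local_addr)
        return PATH_SLOW;               /* destined for the local TCP/IP stack */

    return PATH_FAST;                   /* general packet handling */
}
```

A real fast path would check more conditions; the point here is only that the common case falls through with a few comparisons, while anything unusual is handed off.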
Internet eXchange Processor • First Generation • IXP1200, IXP1240, IXP1250 • Second Generation • IXP2400, IXP2800, IXP2850 • IXP2805, IXP2855 • Others • IXP4XX
Network Flow Processor • By Netronome • Derived from the Intel IXP2XXX • NFP-3240, NFP-3216
Intel IXP2400 Block Diagram
IXP2400 Overview • Functional Block • Processing Element • Memory Interfaces • Coprocessors • Other Interfaces • Hierarchical View
Processing Element • Programmability • Hierarchical Processing Elements • XScale • Microengine (ME)
XScale • RISC-based processor (ARMv5TE) • Real-time OS • MontaVista Linux • ME Management • Control ME execution • Resource Management
MicroEngine (1) • Eight MEs per IXP2400 (work in parallel) • Eight Threads per ME • The ME instruction set is reduced and targets packet processing only • Not as powerful as a general-purpose processor • No floating-point instructions • No divide instruction
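Because the ME has no divide instruction, division has to be built from shifts, compares and subtracts when it is really needed. The following is a minimal sketch of the classic shift-and-subtract method in plain C; the name `soft_udiv32` is hypothetical and this is not claimed to be what the MicroC compiler or SDK libraries emit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Unsigned 32-bit division using only shift, compare and subtract.
 * Returns the quotient; stores the remainder if remainder is non-NULL. */
uint32_t soft_udiv32(uint32_t dividend, uint32_t divisor, uint32_t *remainder)
{
    uint32_t quotient = 0;
    uint32_t rem = 0;
    int bit;

    assert(divisor != 0);

    /* Bring the dividend in one bit at a time, most significant bit first. */
    for (bit = 31; bit >= 0; bit--) {
        rem = (rem << 1) | ((dividend >> bit) & 1u);
        if (rem >= divisor) {
            rem -= divisor;
            quotient |= (uint32_t)1 << bit;
        }
    }

    if (remainder != NULL)
        *remainder = rem;
    return quotient;
}
```

The loop always runs 32 iterations, so on a packet-processing fast path it is usually better to avoid division altogether, for example by sizing tables to a power of two so that a shift or a mask can be used instead.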
MicroEngine (2) • No OS • Not interactive • Managed by XScale • Code Store (4K Instructions) • Executing
MicroEngine Threads • Concurrent execution • Non-preemptive • Round-robin scheduling • Each thread owns a private set of registers • Zero-overhead context switching
Registers of ME • 256 GPRs • 256 SRAM Transfer Registers (128 Read, 128 Write) • 256 DRAM Transfer Registers (128 Read, 128 Write) • 128 Next Neighbor Registers
Context Switch • Register contents need not be swapped out and back in during a context switch • With this mechanism, another thread can swap in and do useful work to cover the long latency while the previous thread is swapped out waiting for a memory request it issued
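The latency-hiding effect of non-preemptive, round-robin threads can be illustrated with a toy simulation. This is a sketch under stated assumptions, not a model of the real hardware: every thread repeats "compute for a fixed number of cycles, then issue one memory request and swap out", context switches cost zero cycles, and the function name `simulate_busy_cycles` is hypothetical.

```c
#include <assert.h>

#define MAX_THREADS 8u

/* Count the cycles in which the ME does useful work during total_cycles.
 * Each thread computes for `compute` cycles, then waits `latency` cycles
 * for a memory reference while other ready threads run. */
unsigned simulate_busy_cycles(unsigned n_threads, unsigned compute,
                              unsigned latency, unsigned total_cycles)
{
    unsigned ready_at[MAX_THREADS] = { 0 };  /* cycle at which each thread is ready */
    unsigned now = 0, busy = 0, next = 0;

    assert(n_threads >= 1 && n_threads <= MAX_THREADS);
    assert(compute > 0);

    while (now < total_cycles) {
        unsigned i, chosen = n_threads;      /* n_threads means "none ready" */

        /* Round robin: first ready thread, starting after the last one run. */
        for (i = 0; i < n_threads; i++) {
            unsigned t = (next + i) % n_threads;
            if (ready_at[t] <= now) {
                chosen = t;
                break;
            }
        }

        if (chosen == n_threads) {           /* every thread is waiting on memory */
            now++;
            continue;
        }

        /* Non-preemptive: the thread runs until it issues its memory request. */
        unsigned run = compute;
        if (run > total_cycles - now)
            run = total_cycles - now;
        busy += run;
        now += run;

        ready_at[chosen] = now + latency;    /* swapped out until the data returns */
        next = (chosen + 1) % n_threads;
    }
    return busy;
}
```

With 10 compute cycles per request and a 90-cycle latency (roughly the SRAM figure quoted in this deck), one thread keeps the ME busy for 100 of 1000 cycles, while eight threads reach 800 of 1000 in this model.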
Memory Interface of IXP2400 • Local Memory • Smallest and Fastest • Scratchpad • Passes packet handles • SRAM • Holds data structures for packet processing • DRAM • Largest and Slowest • Holds packet contents
Local Memory • Per ME • Not accessible to other MEs • Not accessible to XScale • Size: 2560 Bytes (640 LWs) • Usage • Variable Spilling • Caching • Latency: 3 cycles
Scratchpad • On-Chip Memory • Shared by all MEs • Size: 16KB (Fixed) • Usage: • Scratchpad • Scratch Ring (Hardware FIFO) • Latency: ~60 cycles
SRAM • Off-Chip Memory • Shared by all MEs (2 channels) • Size: 64 MB (maximum per channel) • Usage: • Hardware FIFO • Holds data structures • Holds packet meta-data • Latency: ~90 cycles
DRAM • Off-Chip Memory • Shared by all MEs (1 channel) • Size: 1 GB (maximum) • Usage: • Holds whole packet contents • Alternative space for data structures • Latency: ~120 cycles
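The four memory types above trade capacity against latency. As a rough aid for placing a data structure, the sizes and latencies quoted on these slides can be put in a table and searched for the fastest level that is large enough. This is a sketch only: `mem_level_t` and `fastest_level_that_fits` are hypothetical names, the SRAM entry is the per-channel maximum, and in practice not all of a level's capacity is free for one structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const char *name;
    uint64_t    capacity_bytes;   /* maximum size quoted on the slides */
    unsigned    latency_cycles;   /* approximate access latency */
} mem_level_t;

/* Ordered from fastest to slowest. */
static const mem_level_t ixp2400_mem[] = {
    { "Local Memory", 2560u,                      3   },
    { "Scratchpad",   16u * 1024u,                60  },
    { "SRAM",         64ull * 1024u * 1024u,      90  },  /* per channel */
    { "DRAM",         1024ull * 1024u * 1024u,    120 },
};

/* Return the fastest level whose capacity can hold `bytes`, or NULL. */
const mem_level_t *fastest_level_that_fits(uint64_t bytes)
{
    size_t i;
    size_t n = sizeof ixp2400_mem / sizeof ixp2400_mem[0];

    assert(bytes > 0);

    for (i = 0; i < n; i++) {
        if (bytes <= ixp2400_mem[i].capacity_bytes)
            return &ixp2400_mem[i];
    }
    return NULL;   /* larger than any single memory */
}
```

For example, a 4 KB lookup table does not fit in Local Memory but fits in the Scratchpad, whereas a 1 MB table would have to go to SRAM.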
Coprocessor • MSF (Media Switch Fabric) • Receive Packet to DRAM • Transmit Packet from DRAM • SHaC • Scratchpad • Hash Unit • CAP
Packet META-DATA (1) • Data for processing packets • How to identify a packet? • Packet Handle • Packet Temporal Information • Not related to packet content • Meta-data • Input Port, Output Port • Packet address information in DRAM
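One way to picture the handle and meta-data described above is a small record per packet plus a rule that turns a handle into a DRAM address. The layout below is purely illustrative and is not the SDK's actual meta-data format: the field names, `BUF_SIZE`, `DRAM_BUF_BASE`, and the idea that a handle is a buffer index are all assumptions made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BUF_SIZE       2048u         /* assumed size of one DRAM packet buffer */
#define DRAM_BUF_BASE  0x00100000u   /* assumed start of the buffer pool in DRAM */

typedef uint32_t pkt_handle_t;       /* assumed: index of the packet's buffer */

/* Per-packet information that is not part of the packet content itself. */
typedef struct {
    uint16_t input_port;
    uint16_t output_port;
    uint16_t offset;                 /* where packet data starts inside the buffer */
    uint16_t length;                 /* packet length in bytes */
} pkt_meta_t;

/* DRAM address of the first byte of the packet. */
uint32_t pkt_dram_addr(pkt_handle_t handle, const pkt_meta_t *meta)
{
    assert(meta != NULL);
    assert(meta->offset < BUF_SIZE);
    return DRAM_BUF_BASE + handle * BUF_SIZE + meta->offset;
}
```

With this arrangement only the small handle needs to travel between MEs; the bulky packet stays in DRAM and the meta-data can be looked up from the handle.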
Packet META-DATA (2) • How to pass this information between MEs? • Hardware FIFO • Scratch Ring • SRAM Ring • Next-Neighbor Ring • Issues
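The rings listed above behave like fixed-size FIFOs of 32-bit words: a producer ME puts packet handles in, and a consumer ME gets them out in the same order. The following single-threaded software model shows that behaviour only; it is a sketch with hypothetical names (`ring_t`, `ring_put`, `ring_get`), whereas the real rings are hardware-managed and accessed through SDK intrinsics.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SLOTS 8u   /* must be a power of two for the index arithmetic below */

typedef struct {
    uint32_t slot[RING_SLOTS];
    uint32_t head;      /* count of handles taken out so far */
    uint32_t tail;      /* count of handles put in so far */
} ring_t;

void ring_init(ring_t *r)
{
    assert(r != NULL);
    r->head = 0;
    r->tail = 0;
}

/* Append a handle; returns false when the ring is full. */
bool ring_put(ring_t *r, uint32_t handle)
{
    assert(r != NULL);
    if (r->tail - r->head == RING_SLOTS)
        return false;
    r->slot[r->tail % RING_SLOTS] = handle;
    r->tail++;
    return true;
}

/* Remove the oldest handle; returns false when the ring is empty. */
bool ring_get(ring_t *r, uint32_t *handle)
{
    assert(r != NULL && handle != NULL);
    if (r->head == r->tail)
        return false;
    *handle = r->slot[r->head % RING_SLOTS];
    r->head++;
    return true;
}
```

The full and empty cases are the ones a pipeline design has to plan for: a full ring means the downstream stage is too slow, and an empty ring means the downstream stage is idle.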
Hierarchical View (Setting #1) • Only one IXP2400-based board • Data-Plane • Fast-Path: Microengine • Slow-Path: XScale • Control-Plane • XScale • Management-Plane • XScale
Hierarchical View (Setting #2) • Multiple IXP2400-based boards • Data-Plane • Fast-Path: Microengine • Slow-Path: XScale • Control-Plane • CPU • Management-Plane • CPU
Programming IXP2400 • XScale • Programming with C • Microengine • Programming with MicroC or Microcode • We will focus on this part!
IDE Tool: IXA SDK Workbench
ME Language • MicroC • Subset of ANSI C • Only a limited part of the standard C library is implemented • Intrinsic library supporting IXP-specific operations • Microcode • High-level assembly
Programming Model (1) • Receive – Processing – Transmit • Intel has provided sample code for receive and transmit. • We focus only on the processing part. • RX → PROCESSING → TX
Programming Model (2) • Processing ME • Pipeline Model • Parallel Model • Mixed Model • RX → PROCESSING → TX
Pipeline Model • RX → PROC #1 → PROC #2 → TX • Each stage controls the whole resource of its ME • Hard to balance the load between stages
Parallel Model • RX → PROC #1 or PROC #2 (in parallel) → TX • Balance is easy • Higher Performance • Resource is limited
Mixed Model • Blocks: RX, PROC #1, PROC #2, PROC #3, TX
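The balance argument behind the pipeline and parallel models can be put into a back-of-the-envelope calculation. The sketch below is a deliberately simple model and not a measurement: it assumes a fixed cycle cost per stage, one ME per pipeline stage, and ignores ring overhead, memory contention and code-store limits. Both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Pipeline: one ME per stage, so the slowest stage sets the packet rate. */
unsigned pipeline_cycles_per_packet(const unsigned *stage_cycles, size_t n_stages)
{
    unsigned worst = 0;
    size_t i;

    assert(stage_cycles != NULL && n_stages > 0);

    for (i = 0; i < n_stages; i++) {
        if (stage_cycles[i] > worst)
            worst = stage_cycles[i];
    }
    return worst;
}

/* Parallel: every ME runs all stages on its own packet, so the total
 * work per packet is shared evenly among n_mes engines. */
double parallel_cycles_per_packet(const unsigned *stage_cycles, size_t n_stages,
                                  unsigned n_mes)
{
    unsigned total = 0;
    size_t i;

    assert(stage_cycles != NULL && n_stages > 0 && n_mes > 0);

    for (i = 0; i < n_stages; i++)
        total += stage_cycles[i];
    return (double)total / (double)n_mes;
}
```

With two stages costing 30 and 90 cycles, the two-ME pipeline is held to one packet per 90 cycles by its slow stage, while two MEs in parallel average one packet per 60 cycles in this model. The price is that each parallel ME must hold the code and state for every stage, which is the "resource is limited" point above.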
MicroC Example 1 (1)
    /* Defined on the next slide. */
    void reverse_array(volatile int *old, volatile int *new, int size);

    void main()
    {
        /* Both arrays are placed in SRAM and shared by all threads of the ME. */
        __declspec(shared sram) int old_array[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
        __declspec(shared sram) int new_array[sizeof(old_array)/sizeof(int)];

        global_label("start_reverse");
        reverse_array(old_array, new_array, sizeof(old_array)/sizeof(int));
        global_label("end_reverse");
    }
MicroC Example 1 (2)
    /* Copy old[] into new[] in reverse order; size is the element count. */
    void reverse_array(volatile int *old, volatile int *new, int size)
    {
        int index;

        for (index = 0; index < size; index++) {
            new[index] = old[size - index - 1];
        }
    }
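MicroC Example 2
    /* Issue an SRAM read of 2 longwords (8 bytes, matching the current*8 stride).
       With sig_done the call does not block; completion raises the signal. */
    sram_read(&sram_egt_dim1_2_node,
              (__declspec(sram) unsigned int *)(PACKET_CLASSIFICATION_SRAM_BASE1 + current*8),
              2, sig_done, &sram_read_sig_dim1_2);

    /* Swap this thread out until the read has completed. */
    __wait_for_all(&sram_read_sig_dim1_2);

    /* The data is now valid in the transfer registers. */
    temp = sram_egt_dim1_2_node.next_dim;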
1. Copy IXA_SDK_3.51 and ixp_book to D:\, then reboot • 3. Press [Ctrl+Enter] to enter the recovery card administrator mode • 4. Password: davidchang • 5. Extract ixasdk351cd1windows.zip, ixasdk351cd3.zip, and ixasdk351framework.zip, then install them in that order (a reboot is required after installing cd1) • 6. Copy the ixp_book directory to C:\