
Network Processors



Presentation Transcript


  1. Network Processors (Harsh Chilwal)

  2. Evolution: Cellular phone generations
     • 1G: 900 MHz; voice
     • 2G: 900 MHz and 1800 MHz; voice
     • 2.5G: 900-1800 MHz; voice, tiny Internet
     • 3G: 900-1800-1900 MHz; smart phone, full web service
     • Data rate axis labels: 12 kb/s, 170, 1000

  3. Evolution: 3G cellular phones
     [Diagram: mobile stations (MS) connect to base stations (BS), which connect through a base station controller (BSC) to the network. Labels: 12 kb/s, 100 MS, 5 Mb/s, 10 BS, 100 Mb/s.]

  4. Evolution: 3G cellular phones
     [Diagram: the same MS, BS, BSC, network topology, now with network processors (NP). Labels: 1 Mb/s, 500 MS, 500 Mb/s, 100 BS, 50 Gbit/s.]

  5. Evolution: Networks
     [Chart: bandwidth (Mb/s, log scale from 0.1 to 100,000) versus year (1980 to 2005), with an "NP" marker. DS = digital signal, OC = optical carrier.]
     • DS0: 64 kb/s
     • DS1: 1.5 Mb/s (x24)
     • DS3: 44 Mb/s (x28)
     • OC12: 622 Mb/s (x12)
     • OC192: 10 Gb/s (x16)
     • OC768: 40 Gb/s (x4)
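The multipliers on this chart can be checked with a little arithmetic. The following is a minimal sketch in Python. The exact line rates used here (1.544 Mb/s for DS1, 44.736 Mb/s for DS3, and so on) are the standard nominal values, not the rounded figures on the slide, and the names `HIERARCHY` and `tributary_fill` are made up for illustration.

```python
# (level, nominal line rate in bit/s, number of lower-level tributaries it carries)
HIERARCHY = [
    ("DS0", 64_000, None),
    ("DS1", 1_544_000, 24),
    ("DS3", 44_736_000, 28),
    ("OC12", 622_080_000, 12),
    ("OC192", 9_953_280_000, 16),
    ("OC768", 39_813_120_000, 4),
]


def tributary_fill(levels):
    """For each level, return the fraction of its line rate taken up by
    the tributaries of the level below (multiplier * lower rate / rate)."""
    fill = {}
    for (_, low_rate, _), (name, rate, mult) in zip(levels, levels[1:]):
        fill[name] = mult * low_rate / rate
    return fill


if __name__ == "__main__":
    for name, fraction in tributary_fill(HIERARCHY).items():
        print(f"{name}: tributaries use {fraction:.1%} of the line rate")
```

At the lower levels the tributaries fill less than the whole line rate because the remainder is framing and overhead; OC192 and OC768 are exact multiples of the level below.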

  6. Networking Trends
     • Increasing network traffic.
     • New, sophisticated protocols are being introduced at a rapid pace.
     • New applications must be supported to provide new services.
     • Convergence of voice and data networks is introducing many changes in the communication industry.
     • Increasing time-to-market (TTM) pressures.
     • Decreasing product life cycles.

  7. General Purpose Processor based Software Router
     • Benefits
        • Flexible for upgrading the system
        • Easy to support additional interfaces
        • Quick to develop new products with a short TTM
     • The core processor performs all the routing functions
     • Drawbacks
        • Cannot scale to higher bandwidths; tops out at OC-12 speeds
        • Can support complex network operations (traffic engineering, QoS, etc.) only with a major reduction in performance

  8. ASIC based Routers
     • Benefits
        • Provide wire-speed performance
        • Provide high speed
     • Drawbacks
        • Lack flexibility; difficult to meet changing market needs and demands
        • Long design cycles increase TTM and reduce the product life cycle (PLC)
        • A design change or a design failure carries more risk
        • The ASIC must be replaced to provide new functionality
        • Complex network operations are still executed in software

  9. Network Processor based boxes
     • Promise to provide both performance and flexibility
     • Comprise many packet processing elements that support multiple threads
     • Achieve higher performance through pipelining and parallel processing, across both threads and packet processing elements
     • Bring in flexibility through software programming
     • Easy to add features

  10. Network Processor

  11. Basic Architecture of Network Processors

  12. Basic architecture (contd.)
     [Diagram labels: multiple streams, dispatcher, RISC, look-aside co-processors CP1 to CP4, com-engine, merger]
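The dispatcher and merger named on this slide, together with the parallel packet processing elements described earlier, can be illustrated with a small sketch. This is a toy model in Python under stated assumptions, not the architecture of any particular chip: packets are hypothetical `(sequence number, TTL)` tuples, each "engine" only decrements the TTL, dispatch is round-robin, and the merger restores the original order by sequence number.

```python
from concurrent.futures import ThreadPoolExecutor


def engine(batch):
    """Hypothetical packet processing element: decrement the TTL of each packet."""
    return [(seq, ttl - 1) for seq, ttl in batch]


def dispatch(packets, n_engines):
    """Spread packets over the engines round-robin, one lane per engine."""
    lanes = [[] for _ in range(n_engines)]
    for i, pkt in enumerate(packets):
        lanes[i % n_engines].append(pkt)
    return lanes


def merge(lanes):
    """Recombine the per-engine outputs into one stream, ordered by sequence number."""
    return sorted((pkt for lane in lanes for pkt in lane), key=lambda p: p[0])


def run(packets, n_engines=4):
    lanes = dispatch(packets, n_engines)
    with ThreadPoolExecutor(max_workers=n_engines) as pool:
        processed = list(pool.map(engine, lanes))
    return merge(processed)


if __name__ == "__main__":
    print(run([(seq, 64) for seq in range(10)]))
```

Real devices often dispatch by flow hash rather than round-robin so that packets of one flow stay in order without a reordering step; the round-robin choice here is only to keep the sketch short.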

  13. Intro: Systems and Protocols: Relation with Standards
     Systems
     • IETF / ForCES WG:
        • Data / Forwarding Plane
        • Control Plane
     • NPF:
        • Service Layer
           • System wide
           • No awareness of where things are
        • Functional Layer
           • Awareness of where things are
        • Operational Layer
           • Interface Management
     Protocols
     • IETF:
        • IPv4
        • MPLS
        • PPP/L2TP
        • IPv6
        • MIBs
     • ITU-T/ANSI/ATM Forum:
        • ATM
     • IEEE:
        • Ethernet

  14. OSI Network Architecture
     [Diagram: DATA passes between Network A and Network B through the seven layers on each side, with a header added per layer: 7 Application (AH), 6 Presentation (PH), 5 Session (SH), 4 Transport (TH), 3 Network (NH), 2 Data Link (DH), 1 Physical]
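The header-per-layer idea in this slide can be shown in a few lines. The sketch below, in Python, treats headers as short text tags (AH, PH, SH, TH, NH, DH as on the slide) joined with a `|` separator; the separator, the function names, and the omission of a physical-layer header are assumptions made for illustration.

```python
# Layers from top to bottom with the header tag each one adds on the way down.
LAYERS = [
    ("Application", "AH"),
    ("Presentation", "PH"),
    ("Session", "SH"),
    ("Transport", "TH"),
    ("Network", "NH"),
    ("Data Link", "DH"),
]


def encapsulate(data: str) -> str:
    """Sender side: each layer prepends its header to what it got from above."""
    frame = data
    for _, header in LAYERS:
        frame = f"{header}|{frame}"
    return frame


def decapsulate(frame: str) -> str:
    """Receiver side: each layer strips its own header, bottom layer first."""
    for name, header in reversed(LAYERS):
        prefix = header + "|"
        if not frame.startswith(prefix):
            raise ValueError(f"{name} layer expected header {header}")
        frame = frame[len(prefix):]
    return frame


if __name__ == "__main__":
    wire = encapsulate("DATA")
    print(wire)
    print(decapsulate(wire))
```

A network processor typically works on the lower headers (data link, network, transport) of such a frame and leaves the payload untouched unless the application requires it.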

  15. Typical Applications
     • WAN/LAN switching and routing, multi-service switches, multi-layer switches, aggregators
     • Web caching, load balancing, web switching, content-based load balancers
     • QoS solutions
     • VoIP gateways
     • 2.5G and 3G wireless infrastructure equipment
     • Security: firewall, VPN, encryption, access control
     • Storage solutions
     • Residential gateways

  16. Software Framework

  17. Commonalities in Interpretation? Scene setting: why specs are not enough
     • Two NPU vendors want to promote their solutions with some ‘numbers’
     • Both chip architectures comprise (commonalities in building blocks):
        • RISC engines
        • Hardware support engines
        • Various types of interfaces
        • Support for internal and external memory
     • They report the following data (commonalities in specifications):
        • Aggregate MIPS
        • Max number of lookups per second
        • ...

  18. Specifications

  19. Test scenario
     • What is measured? Performance in packets per second as the forwarding information base (FIB) grows in size.
     • The starting application is IPv4.
     • Next, counters are added for per-flow billing purposes.
     • Next, load balancing is introduced as an additional feature.
     • Finally, encryption becomes an additional requirement for 2% of the data that is being forwarded.
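The first two steps of this scenario, an IPv4 lookup against a FIB followed by per-flow counters, can be sketched as follows. This is a minimal Python illustration, assuming a FIB stored as one hash table per prefix length and a flow identified by the (source, destination) pair; the class and function names are hypothetical and no NPU works exactly this way.

```python
import ipaddress
from collections import Counter


class Fib:
    """Toy forwarding information base with longest-prefix-match lookup."""

    def __init__(self):
        self.routes = {}  # prefix length -> {network address as int: next hop}

    def add(self, prefix: str, next_hop: str) -> None:
        net = ipaddress.ip_network(prefix)
        self.routes.setdefault(net.prefixlen, {})[int(net.network_address)] = next_hop

    def lookup(self, addr: str):
        a = int(ipaddress.ip_address(addr))
        # Try the longest prefixes first; each probe is one table reference.
        for plen in sorted(self.routes, reverse=True):
            mask = (0xFFFFFFFF << (32 - plen)) & 0xFFFFFFFF
            hop = self.routes[plen].get(a & mask)
            if hop is not None:
                return hop
        return None


def forward(fib: Fib, counters: Counter, src: str, dst: str):
    """Count the packet against its flow, then look up the next hop."""
    counters[(src, dst)] += 1  # the extra memory reference added for billing
    return fib.lookup(dst)


if __name__ == "__main__":
    fib = Fib()
    fib.add("0.0.0.0/0", "default")
    fib.add("10.0.0.0/8", "A")
    fib.add("10.1.0.0/16", "B")
    flows = Counter()
    print(forward(fib, flows, "192.0.2.1", "10.1.2.3"), dict(flows))
```

Each added feature in the scenario adds work of this kind per packet: the counter is one more memory update, and a longer FIB means more prefix lengths or larger tables to probe.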

  20. Performance curves: IPv4
     [Plot: performance (Mpps, axis 10 to 30) versus FIB size (K entries, axis 50 to 150), one curve each for NPU A and NPU B]

  21. Performance curves: IPv4 + counters
     [Plot: performance (Mpps, axis 10 to 30) versus FIB size (K entries, axis 50 to 150), one curve each for NPU A and NPU B. Annotation: "Requires more memory references"]

  22. Performance curves: IPv4 + counters + load balancing
     [Plot: performance (Mpps, axis 10 to 30) versus FIB size (K entries, axis 50 to 150), one curve each for NPU A and NPU B. Annotation: "Requires even more memory references"]

  23. Performance curves: IPv4 + counters + load balancing + encryption
     [Plot: performance (Mpps, axis 10 to 30) versus FIB size (K entries, axis 50 to 150), one curve each for NPU A and NPU B. Annotations: "No extra references and resources available"; "A does not have sufficient resources"]

  24. Architecture A: IPv4 + counters + LB + crypto
     [Diagram labels: two OC-192 POS interfaces, 3 MIPS cores, internal memories, LU, hash, key extract, count, sched, external buffer memory]

  25. Architecture B: IPv4 + counters + LB + crypto
     [Diagram labels: two 10GE interfaces, 10 MIPS cores, IMEM, LB, memory interface, external buffer memory]

  26. Specifications - revisited

  27. So
     • No clear value statement could be made in favor of either NPU solution
        • NPU A achieves higher throughput but with limited flexibility
        • NPU B achieves lower throughput but is more flexible
     • Were the provided specs accurate?
        • Yes. The devices performed up to spec.
        • Although NPU B looks better on paper at first sight, more resources have to be consumed for less performant results.
        • There is a cost associated with flexibility.
     • Were the provided specs relevant?
        • No. They represent granular maximum performances.
        • For ‘real world’ applications,
           • some resources could not be maximally consumed
           • some resources were over-consumed

  28. Benchmarking considerations
     • Processor core metrics are not always relevant for networking applications
        • They might be relevant for NPU B, since its functionality relies almost totally on those cores.
        • They are definitely not relevant for NPU A, since it has extensive additional hardware support for specific functions.
     GRANULARITY: Highly granular specifications, data, or benchmarking information can give a misleading picture of the actual performance capabilities of the DUT. Since network processing devices are designed with specific applications in mind, benchmarks must exist for those specific applications.

  29. Benchmarking considerations
     • External factors affect NPD performance (where you don’t always suspect it)
        • A forwarding application relies on FIB lookups to determine the destination of a packet
        • The size of the FIB table can influence performance in many ways
           • Usage of multiple memory banks
           • Increasing number of hash collisions
     EXTERNAL FACTORS: Benchmarks should include parameters that take into account external factors relevant to the particular applications being benchmarked.
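The link between FIB size and hash collisions mentioned on this slide can be made concrete with a small experiment. The sketch below, in Python, assumes a hash table with a fixed 2**16 buckets and a multiplicative hash; both choices, and the use of consecutive integers as stand-in keys, are illustrative assumptions rather than a description of either NPU.

```python
BUCKET_BITS = 16  # hypothetical table with 2**16 buckets, fixed in hardware


def bucket(key: int) -> int:
    """Multiplicative hash of a 32-bit key down to BUCKET_BITS bits."""
    return ((key * 2654435761) & 0xFFFFFFFF) >> (32 - BUCKET_BITS)


def collisions(n_entries: int) -> int:
    """Count entries that land in a bucket that is already occupied."""
    used = set()
    clashes = 0
    for key in range(n_entries):
        b = bucket(key)
        if b in used:
            clashes += 1
        else:
            used.add(b)
    return clashes


if __name__ == "__main__":
    for size in (50_000, 100_000, 150_000):
        print(f"{size} entries: {collisions(size)} collisions")
```

Every collision means at least one extra memory reference to resolve the lookup, which is one way a larger FIB lowers the packets-per-second figure even though the device specification has not changed.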

  30. Benchmarking considerations
     • Interfaces present performance boundary conditions
        • Ethernet applications require inter-frame gaps, which result in more relaxed pps numbers
     INTERFACES: Benchmarks should also specify the types of interfaces being used, since those interfaces by themselves affect the maximum performance figures.
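The effect of the inter-frame gap on packets per second is simple to work out. The sketch below, in Python, assumes the usual Ethernet per-frame overhead of 20 bytes on the wire (8 bytes of preamble and start-of-frame delimiter plus a 12-byte inter-frame gap) and minimum-size 64-byte frames; the function name and the zero-overhead comparison case are illustrative assumptions.

```python
ETHERNET_OVERHEAD_BYTES = 8 + 12  # preamble/SFD + inter-frame gap


def max_pps(link_bits_per_s: float, frame_bytes: int, overhead_bytes: int) -> float:
    """Upper bound on packets per second for back-to-back frames of one size."""
    return link_bits_per_s / ((frame_bytes + overhead_bytes) * 8)


if __name__ == "__main__":
    with_gap = max_pps(10e9, 64, ETHERNET_OVERHEAD_BYTES)
    without_gap = max_pps(10e9, 64, 0)
    print(f"10 Gb/s, 64-byte frames, Ethernet overhead: {with_gap / 1e6:.2f} Mpps")
    print(f"10 Gb/s, 64-byte frames, no overhead:       {without_gap / 1e6:.2f} Mpps")
```

At the same nominal bit rate, the Ethernet case asks the processor for roughly a quarter fewer packets per second than a link with no per-frame gap, which is why a benchmark figure means little without the interface type.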

  31. Benchmarking considerations
     • Combinations of applications or minor extensions have a completely different impact on the two network processing devices
        • NPU A has a lot of well-engineered hardware support that can offer additional services, BUT it fails almost completely when additional computing resources are required
        • NPU B is very ‘soft’; performance degrades slowly when additional services are requested and shows no abrupt peaks in the performance curves
     HEADROOM: Benchmarks should combine applications as they occur in the real world to give a ‘sense’ of the headroom available to support real-world scenarios. It is, however, very hard to define a metric for headroom.

  32. CommBench: A Telecommunications Benchmark for NPs
     • HPAs (header-processing applications): RTR, FRAG, DRR, TCP
     • PPAs (payload-processing applications): CAST, ZIP, REED, JPEG

  33. Benchmark Characteristics – Code & Computational Kernel Sizes

  34. Benchmark Characteristics: Computational Complexity
     • N(a,l): number of instructions per byte required for application a operating on a packet of length l
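The instructions-per-byte metric on this slide turns directly into a processing-power estimate. The sketch below, in Python, divides an instruction count by the packet length and then scales by a link rate to get the MIPS a processor would need to keep up; the function names and the example numbers are made up for illustration and are not CommBench measurements.

```python
def instructions_per_byte(instructions: int, packet_len: int) -> float:
    """Instructions executed for one packet divided by its length in bytes."""
    return instructions / packet_len


def required_mips(n_al: float, link_bits_per_s: float) -> float:
    """Millions of instructions per second needed to process a link at full rate,
    given the instructions-per-byte figure n_al."""
    bytes_per_s = link_bits_per_s / 8
    return n_al * bytes_per_s / 1e6


if __name__ == "__main__":
    # Hypothetical application: 6400 instructions for a 64-byte packet.
    n_al = instructions_per_byte(6400, 64)
    print(f"{n_al:.0f} instructions/byte")
    print(f"{required_mips(n_al, 8e6):.0f} MIPS needed at 8 Mb/s")
```

Header-processing applications tend to have a cost that is roughly fixed per packet, so their per-byte figure falls as packets get longer, while payload-processing applications touch every byte and stay roughly constant.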

  35. Benchmark Characteristics – Instruction Set Characteristics

  36. Benchmark Characteristics – Memory Hierarchy

  37. Example System: Cisco Toaster 10000
     • Almost all data plane operations execute on the programmable XMC
     • Pipeline stages are assigned tasks, e.g. classification, routing, firewall, MPLS
        • Classic SW load balancing problem
     • External SDRAM shared by common pipe stages

  38. Example System: IXP 2400
     [Diagram labels: XScale core, microengines ME0 to ME7, DDR DRAM controller, QDR SRAM controller, Scratch/Hash/CSR, MSF unit, PCI]
     • XScale core replaces StrongARM
     • Microengines
        • Faster
        • More: 2 clusters of 4 microengines each
        • Local memory
        • Next-neighbor routes added between microengines
        • Hardware to accelerate CRC operations and random number generation
        • 16-entry CAM

  39. References
     • Patrick Crowley et al., Network Processor Design.
     • Tilman Wolf and Mark Franklin, CommBench: A Telecommunications Benchmark for Network Processors. Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS). http://www.ecs.umass.edu/ece/wolf/papers/commbench.pdf
     • Network Processing Forum, Benchmarking.
     • www.wipro.com/pdf_files/networkprocessors_wipro_solPPT.pdf
     • http://intrage.insatlse.fr/~etienne/netpro.ppt
