Deep Packet Inspection Which Implementation Platform?

Deep Packet Inspection Which Implementation Platform? Sarang Dharmapurikar, Cisco

Implementation Platform
• Several choices, each with some pros and cons
  • ASICs
  • FPGAs
  • Network processors
  • Graphics processors (nVidia)
  • Multi-core, multi-threaded commodity processors
• Needs evaluation with respect to
  • Cost
  • Speed
  • Overall system performance (DPI is just a small piece of the puzzle)
  • Ease of use and upgrading
• A hardware-software co-design approach
  • Profile a DPI system and push some components into hardware if the overall speedup is worthwhile (Amdahl’s law)
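As a rough illustration of the Amdahl's law argument in the last bullet: if a fraction p of per-packet time is moved to hardware that runs it s times faster, the overall speedup is 1 / ((1 - p) + p / s). The sketch below applies that formula to a made-up profile; the stage names, the time fractions, and the 10x hardware factor are all assumptions for illustration, not measurements from any DPI system.

```python
def amdahl_speedup(offload_fraction: float, hw_speedup: float) -> float:
    """Overall speedup when `offload_fraction` of run time runs `hw_speedup` times faster."""
    if not 0.0 <= offload_fraction <= 1.0:
        raise ValueError("offload_fraction must be in [0, 1]")
    if hw_speedup <= 0.0:
        raise ValueError("hw_speedup must be positive")
    return 1.0 / ((1.0 - offload_fraction) + offload_fraction / hw_speedup)


# Hypothetical profile: fraction of per-packet time spent in each stage.
profile = {
    "tcp_reassembly": 0.25,
    "normalization": 0.25,
    "pattern_matching": 0.50,
}

# Assume (for illustration only) that hardware makes the offloaded stage 10x faster.
for stage, fraction in profile.items():
    print(f"{stage}: {amdahl_speedup(fraction, 10.0):.2f}x overall")
```

With these assumed numbers, even a 10x faster pattern matcher yields less than a 2x gain for the whole system, which is the point of profiling before committing a component to hardware.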

ASIC
• Examples: ClassiPi, NetLogic, Tarari, some Cisco ASICs
• Requires too much investment
  • NRE close to a million dollars!
• A long design cycle
  • Most of the time is consumed in verification
• Hard to upgrade
  • Algorithms evolve
  • It is hard to build a flexible enough ASIC
• Applications get locked to a platform
  • Migrating to a new platform requires a lot of software rewriting

FPGA
• Very flexible but expensive and power-consuming
  • Virtex-5 offers 330,000 lookup table units
  • 4MB of SRAM
• The latest Xilinx FPGAs contain multiple PowerPC cores
  • Possible to design hybrid hw/sw systems
  • The components that assist DPI, such as TCP reassembly, normalization, and flow classification, can be done in hardware
• Several FPGA platforms for networking acceleration available today
  • NetFPGA
  • FPX
• Need to be careful in the DPI approach
  • Raw signature matching techniques that use FPGA logic resources for each signature won’t scale

Network Processors
• Intel IXP2850
  • 16 micro-engines with
    • 2KB D$, 8KB I$, and a 16-entry CAM
  • An integrated XScale processor for the control path
    • 32KB I$ and 32KB D$
  • 2 crypto units
  • 16KB shared scratch pad SRAM
• Cisco QuantumFlow processor
  • 40 packet processing engines (PPEs), each @ 1.2 GHz
  • 4 threads per PPE
  • Dedicated hardware for queuing, buffering, IP lookup, and classification

Commodity processors
• Really powerful server-class processors coming up
• Intel’s Nehalem
  • 8 cores
  • 2 threads per core
  • 32KB L1, 256KB L2, 10+MB of shared L3 cache
• Sun’s Niagara2
  • 8 cores
  • 8 threads per core!
  • 16KB I$ and 8KB D$ per core, 4MB shared L2 cache
  • Integrated cryptographic coprocessor units
• Need to think multi-core, multi-threaded
  • Think in terms of a complete system, not just pattern matching
  • Which core should do what?
  • Need to design cache-friendly data structures
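One possible reading of "cache-friendly data structures" is to keep a matcher's transition table in a single contiguous array instead of pointer-linked nodes, so each input byte costs one indexed read. The sketch below does this for an Aho-Corasick DFA. The choice of Aho-Corasick, the function names, and the sample patterns are assumptions made for illustration; the slides do not name a specific algorithm. Python is used only to show the layout, since the cache benefit itself would show up in a compiled implementation.

```python
from array import array
from collections import deque

ALPHABET = 256  # one column per possible byte value


def build_flat_dfa(patterns):
    """Build an Aho-Corasick DFA stored as one contiguous transition table.

    Returns (table, outputs): table[state * 256 + byte] is the next state,
    outputs[state] is the tuple of patterns that end at that state.
    """
    # Phase 1: build the trie with per-state dicts (construction time only).
    goto = [{}]
    out = [set()]
    for pat in patterns:
        state = 0
        for byte in pat:
            nxt = goto[state].get(byte)
            if nxt is None:
                nxt = len(goto)
                goto[state][byte] = nxt
                goto.append({})
                out.append(set())
            state = nxt
        out[state].add(pat)

    # Phase 2: breadth-first pass that fills the flat table row by row.
    n = len(goto)
    table = array("I", [0]) * (n * ALPHABET)  # every entry starts at the root
    fail = [0] * n
    queue = deque()
    for byte, nxt in goto[0].items():
        table[byte] = nxt
        queue.append(nxt)
    while queue:
        state = queue.popleft()
        base = state * ALPHABET
        fbase = fail[state] * ALPHABET
        # Start from the failure state's row, then overwrite with real edges.
        table[base:base + ALPHABET] = table[fbase:fbase + ALPHABET]
        for byte, nxt in goto[state].items():
            fail[nxt] = table[fbase + byte]
            out[nxt] |= out[fail[nxt]]
            table[base + byte] = nxt
            queue.append(nxt)
    return table, [tuple(sorted(s)) for s in out]


def scan(table, outputs, payload):
    """Yield (end_offset, pattern) for every match in payload (bytes)."""
    state = 0
    for i, byte in enumerate(payload):
        state = table[state * ALPHABET + byte]  # one array read per input byte
        for pat in outputs[state]:
            yield i + 1, pat


table, outputs = build_flat_dfa([b"he", b"she", b"hers"])
print(list(scan(table, outputs, b"ushers")))
# → [(4, b'he'), (4, b'she'), (6, b'hers')]
```

The trade-off is memory: a full row of 256 entries per state grows quickly with the signature set, so whether the table stays within the cache sizes listed above depends on the number of states.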

Conclusion
• While hardware can assist DPI systems, building proprietary hardware is not a good idea
  • Let’s understand the “actual” performance needs
  • Let’s not be misguided by “marketing” needs
• Need to think of hardware-software co-design
  • Requires careful profiling of DPI systems to identify the components that can be pushed to hardware
• Need to design algorithms for multi-core, multi-threaded processors
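One common way to approach the multi-core point is to partition work by flow, so that every packet of a connection reaches the same core and per-flow state (reassembly buffers, matcher state) needs no cross-core locking. The sketch below is a minimal illustration under that assumption; the `FiveTuple` type, the key format, and the use of CRC32 as the hash are hypothetical choices, not something the slides prescribe.

```python
import zlib
from collections import namedtuple

# Hypothetical packet header summary used only for this sketch.
FiveTuple = namedtuple("FiveTuple", "src_ip dst_ip src_port dst_port proto")


def flow_key(pkt: FiveTuple) -> bytes:
    """Direction-independent key so both halves of a connection match."""
    a = (pkt.src_ip, pkt.src_port)
    b = (pkt.dst_ip, pkt.dst_port)
    lo, hi = sorted((a, b))  # order the endpoints so A->B and B->A agree
    return f"{lo[0]}:{lo[1]}|{hi[0]}:{hi[1]}|{pkt.proto}".encode()


def pick_worker(pkt: FiveTuple, n_workers: int) -> int:
    """Map a packet to a worker index; one flow always maps to one worker."""
    return zlib.crc32(flow_key(pkt)) % n_workers


request = FiveTuple("10.0.0.1", "192.0.2.7", 40000, 80, "tcp")
reply = FiveTuple("192.0.2.7", "10.0.0.1", 80, 40000, "tcp")
print(pick_worker(request, 8) == pick_worker(reply, 8))
# → True
```

Hashing spreads flows evenly only on average, so a few heavy flows can still overload one core; that is one of the things profiling a complete system would reveal.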
