Gnort: High Performance Intrusion Detection Using Graphics Processors

Giorgos Vasiliadis, Spiros Antonatos, Michalis Polychronakis, Evangelos Markatos, Sotiris Ioannidis

Institute of Computer Science

Foundation for Research and Technology Hellas

General Idea
  • Increase the processing throughput of intrusion detection systems by offloading pattern matching operations to the GPU.

Giorgos Vasiliadis ICS-FORTH

Introduction
  • The problem
    • Network Intrusion Detection Systems (NIDS) rely on string matching to detect and prevent well-known attacks
    • String matching accounts for up to 75% of total CPU processing time
  • String Matching Algorithms
    • Aho-Corasick
  • Specialized hardware devices (NP, FPGAs, ASICs)
    • Complex to modify and program
    • Poor flexibility
  • Graphics Cards
    • Easy to program
    • Powerful and ubiquitous
    • Researchers have begun exploring ways to tap their power for non-graphics applications

Why use the GPU?
  • The GPU is specialized for compute-intensive, highly parallel computation

NVIDIA GeForce SIMD Architecture
  • Many Multiprocessors
  • Each multiprocessor contains many Stream Processors
  • Memory model
    • Shared On-Chip Memory
      • 1 cycle
    • Constant Memory
      • 400-600 cycles; 1 cycle if cached
    • Texture Memory
      • 400-600 cycles; 1 cycle if cached
    • Global Device Memory
      • 400-600 cycles

The GPU can be used as a general-purpose processor, capable of executing many threads in parallel

The Aho-Corasick Algorithm
  • Used in most modern NIDSes
    • Scans for multiple patterns simultaneously
  • Preprocess all patterns to build a state machine
  • The state machine is used to scan for multiple patterns simultaneously in linear time
    • Complexity is independent of the number of patterns

Example: P={he, she, his, hers}
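As a rough illustration of the state machine described above, the following Python sketch builds an Aho-Corasick automaton for the example pattern set and scans a string with it. It is not the Snort or Gnort implementation; the names `build_automaton` and `search` are made up for this example.

```python
from collections import deque

def build_automaton(patterns):
    """Build goto, failure and output tables for a set of patterns."""
    goto = [{}]      # goto[state][char] -> next state
    fail = [0]       # failure link of each state
    out = [set()]    # patterns that end at each state

    # Phase 1: insert every pattern into a trie.
    for pat in patterns:
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append(set())
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].add(pat)

    # Phase 2: breadth-first pass to compute failure links.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] |= out[fail[nxt]]   # inherit matches ending at the suffix state
    return goto, fail, out

def search(text, automaton):
    """Return (start offset, pattern) for every match; one pass over the text."""
    goto, fail, out = automaton
    state, matches = 0, []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pat in out[state]:
            matches.append((i - len(pat) + 1, pat))
    return matches
```

Each input character advances the automaton once, regardless of how many patterns were loaded, which is the property the slide refers to.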

Mapping Aho-Corasick on GPU
  • How to represent the state machine?
  • Snort represents each state as an array of pointers
    • It is difficult to map them on the GPU memory
  • Transform the state machine into a 2D array
    • Can easily be bound to texture memory
      • Texture fetches are cached
        • Aho-Corasick exhibits strong locality of reference
      • Random-access memory reads
    • Using texture memory improves GPU execution time by about 19%
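To make the 2D-array idea concrete, here is a minimal Python sketch of a pointer-free state table: one row per state and one column per byte value, so each input byte costs a single array lookup. The layout (256 columns plus a separate `final` list) and the names `build_table` and `scan` are assumptions for illustration, not necessarily the layout used in Gnort.

```python
from collections import deque

ALPHABET = 256  # one column per possible byte value

def build_table(patterns):
    """Return (table, final); table[s][b] is the next state from state s on byte b."""
    table = [[0] * ALPHABET]   # row 0 is the start state
    final = [False]
    # Insert patterns as a trie; 0 doubles as "no edge" because no edge leads to the root.
    for pat in patterns:
        s = 0
        for b in pat:
            if table[s][b] == 0:
                table.append([0] * ALPHABET)
                final.append(False)
                table[s][b] = len(table) - 1
            s = table[s][b]
        final[s] = True
    # Breadth-first pass: fill missing edges from the failure state's row,
    # so scanning never has to follow failure links.
    fail = [0] * len(table)
    queue = deque(t for t in table[0] if t)
    while queue:
        s = queue.popleft()
        final[s] = final[s] or final[fail[s]]
        for b in range(ALPHABET):
            t = table[s][b]
            if t:
                fail[t] = table[fail[s]][b]
                queue.append(t)
            else:
                table[s][b] = table[fail[s]][b]
    return table, final

def scan(payload, table, final):
    """Return the end offsets at which at least one pattern matches."""
    s, hits = 0, []
    for i, b in enumerate(payload):
        s = table[s][b]        # a single array lookup per input byte
        if final[s]:
            hits.append(i)
    return hits
```

Because the table is a plain rectangular array of integers, it can be copied to the device as one block, which is what makes it a natural fit for a cached, read-only memory such as texture memory.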

Parallelizing Packet Searching (1/2)
  • Assigning a Single Packet to each Multiprocessor
  • Each packet is copied to the shared memory of the Multiprocessor
  • Stream Processors search different parts of the packet concurrently
  • Overlapping computation
    • Matching patterns may span consecutive chunks of the packet, so adjacent chunks are searched with some overlap
  • Same amount of work per Stream Processor
    • Stream Processors will be synchronized
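The chunking scheme above can be sketched in Python as follows. This is a sequential stand-in for the parallel search, and it uses a naive `bytes.find` matcher in place of Aho-Corasick to stay short. The names `search_chunk` and `search_packet`, the `num_threads` parameter, and the choice of an overlap of (longest pattern length - 1) bytes are assumptions made for this example.

```python
def search_chunk(payload, patterns, start, end, overlap):
    """Work of one hypothetical thread: report matches that begin in [start, end)."""
    # Read past the chunk end so matches that straddle the boundary are still seen.
    window = payload[start:min(len(payload), end + overlap)]
    found = []
    for pat in patterns:
        pos = window.find(pat)
        while pos != -1 and start + pos < end:
            found.append((start + pos, pat))
            pos = window.find(pat, pos + 1)
    return found

def search_packet(payload, patterns, num_threads=4):
    """Split one packet into equal chunks and search each chunk independently."""
    overlap = max(len(p) for p in patterns) - 1
    chunk = -(-len(payload) // num_threads)   # ceiling division
    matches = []
    for t in range(num_threads):              # sequential stand-in for parallel threads
        start = t * chunk
        end = min(len(payload), start + chunk)
        if start < end:
            matches.extend(search_chunk(payload, patterns, start, end, overlap))
    return sorted(matches)
```

Each chunk only reports matches that begin inside it, so the overlapping bytes are scanned twice but no match is reported twice.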

Parallelizing Packet Searching (2/2)
  • Assigning a Single Packet to each Stream Processor
  • Each packet is processed by a different Stream Processor
  • No overlapping computation
  • Different amount of work per Stream Processor
    • Stream processors of the same Multiprocessor will have to wait until all have finished
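The cost of uneven packet sizes in this scheme can be estimated with a simple model. The Python sketch below assumes that search time is proportional to packet length and that a group of `lanes` stream processors is busy until its longest packet is finished; both the model and the name `lane_utilization` are illustrative assumptions, not measurements from the presentation.

```python
def lane_utilization(packet_lengths, lanes=8):
    """Fraction of lane-time spent on useful work when packets run in lockstep groups."""
    useful = total = 0
    for i in range(0, len(packet_lengths), lanes):
        group = packet_lengths[i:i + lanes]
        useful += sum(group)
        total += max(group) * lanes   # every lane waits for the longest packet in its group
    return useful / total if total else 1.0
```

With equal-sized packets the model gives full utilization; a single large packet in a group of small ones leaves most lanes idle for much of the time.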

Software Mapping
  • Packets are transferred to the GPU in batches
    • Performs much better than transferring each packet separately
    • Packets are stored in a buffer that is copied to the GPU when it gets full
  • Use page-locked memory to store the packets
    • Higher transfer throughput from host to device
    • Copies are performed using DMA, without occupying the CPU
      • CPU and GPU execution can overlap
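The batching step can be sketched in Python as a buffer that is handed off in one piece once it fills up. The class name `PacketBatcher`, the `capacity` parameter, and the `transfer` callback (a stand-in for the host-to-device copy) are hypothetical; page-locked memory and DMA are not modeled here.

```python
class PacketBatcher:
    """Collects packets in one contiguous buffer and hands full batches to `transfer`."""

    def __init__(self, capacity, transfer):
        self.capacity = capacity      # buffer size in bytes (hypothetical parameter)
        self.transfer = transfer      # stand-in for the host-to-device copy
        self.buffer = bytearray()
        self.offsets = []             # start offset of each packet in the buffer

    def add(self, packet):
        if self.buffer and len(self.buffer) + len(packet) > self.capacity:
            self.flush()              # buffer is full: ship the whole batch at once
        self.offsets.append(len(self.buffer))
        self.buffer += packet

    def flush(self):
        if self.buffer:
            self.transfer(bytes(self.buffer), list(self.offsets))
            self.buffer.clear()
            self.offsets.clear()
```

Keeping the per-packet offsets alongside the buffer lets the receiving side locate each packet inside the batch after a single copy.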

Evaluation (1/2)
  • Scalability as a function of the number of patterns
  • We ran Snort using randomly generated patterns
    • All patterns are matched against every packet
  • The payload trace contained 800-byte UDP packets with random payloads
  • Throughput remains constant as the number of patterns increases
  • 2.4x faster than the CPU

Evaluation (2/2)
  • Throughput as a function of packet size
  • Ran Snort using 1000 random patterns
    • All patterns are matched against every packet
  • 2.3 Gbit/s for full packets
  • 3.2x faster than the CPU
  • The two GPU implementations do not differ significantly in performance

Evaluation with real input and rules
  • Experimental setup
    • Two PCs connected via a 1 Gbit/s Ethernet switch
  • To directly compare with prior work [Jacob et al], we re-implemented the Knuth-Morris-Pratt (KMP) and Boyer-Moore (BM) algorithms on the GPU.

Evaluation with real input and rules
  • Snort loaded about 8000 patterns.
  • Preprocessors and PCRE were disabled
  • Original Snort (AC) cannot process all packets at rates higher than 300 Mbit/s
  • GPU-assisted Snort (AC1, AC2) begins to lose packets at 600 Mbit/s
    • A 2x improvement in sustainable rate
  • The KMP and BM algorithms from [Jacob et al] perform worse in all cases

Conclusion
  • Graphics cards can be used effectively to speed up Network Intrusion Detection Systems.
    • Low-cost
    • Easy programming
  • Future work includes
    • Transferring packets directly from the NIC to the GPU
    • Utilizing multiple GPUs on multi-slot motherboards

Thank you

Any questions?

[email protected]
