
Packet Classification # 3

Ozgur Ozturk

CSE 581: Internet Technology

Winter 2002

Packet Classification # 3 CSE 581: Internet Technology (Winter 2002) Ozgur Ozturk 02/11/02


Introduction

  • Importance

    • Identify the context of packets → apply the necessary actions

    • Differentiated services

  • Memory and Time Efficiency

    • Must handle thousands of rules

    • Must run at wire speed (no queuing)


Packet Classification # 3: Paper List

  • T. Lakshman, D. Stiliadis, "High-Speed Policy-based Packet Forwarding Using Efficient Multi-dimensional Range Matching" [Bit-Parallelism]

    • http://www.bell-labs.com/user/stiliadi/filter/paper.html

  • F. Baboescu, G. Varghese, "Scalable Packet Classification" [ABV: Aggregated Bit Vector]

  • M. Buddhikot, S. Suri, M. Waldvogel, "Space Decomposition Techniques for Fast Layer-4 Switching" [Space Decomposition]

  • V. Srinivasan, G. Varghese, S. Suri, M. Waldvogel, "Fast and Scalable Layer Four Switching" [Paper4]


Bit-Parallelism Paper: Introduction

  • Presents packet classification schemes

    • evaluated by a traffic-independent, worst-case performance metric

    • handle a few thousand rules at rates of millions of packets per second, using range matches on more than 4 packet header fields


Bit-Parallelism Paper: Requirement for Real-Time Operation

  • Traditional router architectures

    • flow-cache architectures to classify packets

    • identified flows are expected to arrive in near future

    • Current backbone routers

      • the number of active flows is extremely high

        • e.g. 256K flows on OC-3 links

      • Caches are implemented as hash tables

        • hash tables scale well to that size


Bit-Parallelism Paper: Requirement for Real-Time Operation (2), Hash-Table Problems

  • A good hash function is non-trivial to design

    • 100 to 200 bits of header must be randomly distributed over no more than 20 to 24 bits of hash index

    • the distribution of header values is unknown

  • Performance of cache-based schemes is heavily traffic dependent

  • Malicious users

    • can exploit the limitations of hashing algorithms and caching techniques

  • Packet queuing delays are acceptable only after classification


Bit-Parallelism Paper: Packet Classification Constraints

  • Scale to large routers with Gigabit links.

  • Process at wire-speed

    • 75% of packets are smaller than the typical TCP packet size (552 bytes)

    • Nearly half are 40 to 44 bytes (TCP ACKs)

  • Rules on several fields, specifying ranges, exact matches and prefixes

    • Two prefix fields in some cases

  • Allow arbitrary priorities for policies, so that multiple matches can be resolved

  • Optimize for lookups, sacrifice update performance

    • ratio of lookup rate to update rate: 10^7


Bit-Parallelism Paper: Packet Classification Constraints (2)

  • Memory access time is the dominant factor in worst-case lookup execution time

  • Amenable to hardware implementation

  • Time vs. Space


Bit-Parallelism Paper: General Packet Classification

  • Decomposable search to perform multi-dimensional search for packet filtering

    • k-dimensional query → a set of 1-dimensional queries on 1-dimensional intervals

    • Exploit parallelism where possible

    • Seek a poly-logarithmic solution

  • Packet header fields → k dimensions

  • Filters → overlapping regions in the k-dimensional space


Bit-Parallelism Paper: Efficiency of Proposed Algorithms

  • 1st Algorithm

    • Memory: k*n^2 + O(n) bits

    • Time: log(2n)+1

    • Memory access: n/w

  • 2nd Algorithm

    • Memory reduced to O(n log n) bits

    • Time increases by a constant factor

    • Can be optimized for time and memory budget

    • Exploit on-chip memory in traffic-independent manner, to speed up worst case.


Notation

  • Rule r_m in k dimensions

    • r_m = (e_{1,m}, e_{2,m}, …, e_{k,m})

    • each e_{i,m} is a range in dimension i


Bit-Parallelism Paper: Algorithm Demo on 2-D, Preprocessing 1


Bit-Parallelism Paper: Algorithm Demo on 2-D, Preprocessing 2

At most 2n+1 intervals for n rules


Bit-Parallelism Paper: Algorithm Demo on 2-D, Preprocessing 3

A set of rules is formed for each region


Bit-Parallelism Paper: Algorithm Demo on 2-D, Online 1

  • Packet P1 = (x*, y*) is to be classified

    • find the intervals that x* and y* belong to

      • binary search → log(2n+1)+1 comparisons per dimension

    • Form the intersection of the rule sets of those intervals

      • conjunction (bitwise AND) of the corresponding bit vectors

    • Pick the highest-priority entry in the resulting bit vector
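
The online steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the paper's implementation: the function names `preprocess` and `classify`, the example rules, and the convention that rule 0 has the highest priority are all assumptions made for the example. Python integers stand in for the n-bit bitmaps.

```python
from bisect import bisect_right

def preprocess(rules, k):
    """rules: list of k-tuples of inclusive (lo, hi) ranges, highest priority first."""
    dims = []
    for j in range(k):
        # Boundaries at which the set of overlapping rules can change (at most 2n).
        points = sorted({r[j][0] for r in rules} | {r[j][1] + 1 for r in rules})
        # bitmaps[0] covers values below points[0]; bitmaps[i] covers the
        # interval that starts at points[i - 1]. At most 2n+1 intervals.
        bitmaps = [0]
        for start in points:
            bits = 0
            for m, r in enumerate(rules):
                lo, hi = r[j]
                if lo <= start <= hi:
                    bits |= 1 << m  # rule m overlaps this interval
            bitmaps.append(bits)
        dims.append((points, bitmaps))
    return dims

def classify(dims, packet):
    result = -1  # all ones, so the first AND keeps the first bitmap
    for (points, bitmaps), value in zip(dims, packet):
        i = bisect_right(points, value)  # binary search for the interval
        result &= bitmaps[i]             # intersect the rule sets
    if result == 0:
        return None
    # Lowest set bit = lowest rule index = highest priority (by assumption).
    return (result & -result).bit_length() - 1

# Hypothetical 2-D rule set: (x range, y range) per rule.
rules = [((2, 5), (1, 3)), ((0, 7), (0, 7)), ((4, 9), (2, 6))]
dims = preprocess(rules, 2)
print(classify(dims, (4, 2)), classify(dims, (6, 5)), classify(dims, (10, 0)))
```

In hardware the AND would be performed w bits at a time, which is where the n/w memory accesses per dimension come from; the Python integer hides that detail.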


Bit-Parallelism Paper: Algorithm Demo on 2-D, Online 2

  • Maximum set cardinality = O(n)

  • The intersection step examines every rule at least once → time complexity O(n)

  • With bit-level parallelism

    • The bitmaps representing the sets are stored, for each dimension j, in a (2n+1) × n array B_j[i, 1..n] (row i holds the set R_{i,j})

    • k*n/w memory accesses

  • Different processing elements for each dimension in hardware implementation

    • Prototype


Different Processing Elements for Each Dimension in Hardware Implementation: Prototype


Bit-Parallelism Paper, Algorithm 2: Packet Classification Based on Incremental Reads

  • Algorithm utilizes incremental reads to reduce required memory

  • Allows time-space optimization and increases localization for off-chip SDRAM and wide on-chip memory implementations

  • Consider a specific dimension j

    • Assume maximum 2n+1 non-overlapping intervals

    • Each interval has a corresponding n-bit bitmap, with the positions of the 1s indicating the filter rules that overlap this interval

    • Adjacent intervals’ corresponding bitmaps differ in only one bit

    • A single bitmap and 2n pointers of size log n to the differing bits can be used to reconstruct any bitmap
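
A minimal sketch of the incremental-read idea for one dimension, under stated assumptions: the names `build_incremental` and `bitmap_for` and the example ranges are hypothetical, and the single stored bitmap is taken to be the (empty) bitmap of the interval below the first boundary. Each boundary stores only the index of the rule whose bit flips there.

```python
from bisect import bisect_right

def build_incremental(ranges):
    """ranges: inclusive (lo, hi) per rule in one dimension."""
    events = []
    for m, (lo, hi) in enumerate(ranges):
        events.append((lo, m))      # bit m turns on at lo
        events.append((hi + 1, m))  # bit m turns off after hi
    events.sort()
    points = [p for p, _ in events]    # 2n interval boundaries
    pointers = [m for _, m in events]  # 2n rule indices, log n bits each
    base = 0                           # the one bitmap that is stored in full
    return points, pointers, base

def bitmap_for(points, pointers, base, value):
    """Rebuild the bitmap of the interval containing value by replaying flips."""
    i = bisect_right(points, value)
    bits = base
    for m in pointers[:i]:
        bits ^= 1 << m  # adjacent intervals differ in exactly this bit
    return bits

# Hypothetical ranges for three rules.
ranges = [(2, 5), (0, 7), (4, 9)]
points, pointers, base = build_incremental(ranges)
print(bin(bitmap_for(points, pointers, base, 4)))
```

Replaying up to 2n pointers is the extra read cost that buys the smaller memory footprint; storing a full bitmap every l intervals, as in the generalization, bounds the number of flips that must be replayed.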


Bit-Parallelism Paper, Algorithm 2: Packet Classification Based on Incremental Reads (2)

  • Reduces the space requirement from O(n^2) to O(n log n)

  • Further generalization

    • (2n+1)/l bitmaps instead of 1

    • (2n+1)/2l pointers needed

    • Choose l as needed

      • l = 2n+1 → memory reduces to O(n log n)

        • Memory accesses increase: n/w → 2n log n / w

  • The trade-off decision depends on the on-chip/off-chip memory ratio.


    Bit-Parallelism Paper, Algorithm 2: Special Case, 2-D Classification

    • Necessary for best-effort traffic aggregation in Internet backbone

    • Determine next hop and resource allocations based on destination and source addresses only

      • Longest prefix match lookups

        • Restrict source prefix ranges to powers of 2 in order to reduce space

        • space requirement O(n) with trie implementation

    • Virtual intervals

      • Map intervals of prefix lengths to both dimensions, sorted by length

      • “Virtual Intervals” allow a worst-case lookup time of O(l_s + log n), where l_s is the number of possible prefix lengths

    • Multicast group identification requires only two additional memory accesses


    Bit-Parallelism Paper: Conclusions

    • Packet classification, or filtering, is a useful primitive in connectionless networks for providing differentiated services and policy-based routing

      • More recently, also for security and active processing

    • Two multi-dimensional range matching algorithms allow millions of packets per second to be processed against a set of thousands of filter rules

      • Robust and predictable worst-case performance

    • Efficient 2-D algorithm for backbone routers with hundreds of thousands of routing entries

    • Algorithms demonstrate that there may be no need to restrict filtering to edge routers


    Paper4: Layer Four Switching

    • A traditional router performs lookups based on the destination address

    • Layer four switching provides increased flexibility: it lets a router distinguish between kinds of traffic and treat them differently:

      • Block traffic from a dangerous site

      • Provide QoS for certain traffic

      • Give preferential treatment to certain traffic (say, database flows).

    • Difficulties: it needs layer four header information, which may not always be available

      • any modification of the layer four header may cause problems

      • it is unclear how to obtain header information when the packet is encrypted

    • Some variants of layer four switching (L4S):

      • Firewall

      • Reservation protocols such as RSVP

      • Routing based on traffic type, say web traffic


    Paper4: The Best Matching Filter Problem

    • A packet P has k distinct header fields for lookup: H[1], … , H[k]

    • The filter database of a layer 4 router consists of a finite set of filters F_1, F_2, …, F_N; each filter F_i has an associated directive act_i

    • Match: P matches a filter F if each field of P matches the corresponding field of F

    • Cost: used to determine an unambiguous best match when several filters match (say, the order of the filters)

    • An address range can always be converted into a sequence of prefixes, so prefix matching can be used
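
The last bullet, converting a range into prefixes, can be illustrated with a short sketch. The function name `range_to_prefixes` and the (value, prefix length) output format are choices made for this example, not something taken from the paper.

```python
def range_to_prefixes(lo, hi, width):
    """Return (value, prefix_len) pairs that exactly cover the inclusive range [lo, hi]."""
    prefixes = []
    while lo <= hi:
        # Largest power-of-two block that is aligned at lo ...
        size = lo & -lo if lo else 1 << width
        # ... shrunk until it no longer runs past hi.
        while size > hi - lo + 1:
            size >>= 1
        # A block of 2^b addresses is a prefix of length width - b.
        prefixes.append((lo, width - (size.bit_length() - 1)))
        lo += size
    return prefixes

# Example on 4-bit values: the range [1, 14] needs 6 prefixes.
print(range_to_prefixes(1, 14, 4))
```

For W-bit fields the worst case is 2W - 2 prefixes per range (6 for W = 4 in the example), so one range rule can expand into several prefix filters.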

    A filter database

    Filter  Dest  Src  DP   SP   Flags
    F1      M     *    25   *    *
    F2      M     *    53   *    UDP
    F3      M     S    53   *    *
    F4      M     *    23   *    *
    F5      T1    T0   123  123  UDP
    F6      *     Net  *    *    *
    F7      Net   *    *    *    TCP-ACK
    F8      *     *    *    *    *

    A packet example:

    (M, S, UDP, 53, 125)
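
As a baseline, the best matching filter can be found by a linear scan over the example database. This is a sketch under explicit assumptions: field values are treated as opaque tokens that match only themselves or the wildcard (any containment between names such as Net and individual hosts is ignored), earlier filters are assumed to have lower cost, and the example packet is assumed to be written in the order (Dest, Src, Flags, DP, SP), so it is reordered here to the table's column order.

```python
WILDCARD = "*"

# (name, (Dest, Src, DP, SP, Flags)), listed in order of increasing cost (assumed).
FILTERS = [
    ("F1", ("M", "*", "25", "*", "*")),
    ("F2", ("M", "*", "53", "*", "UDP")),
    ("F3", ("M", "S", "53", "*", "*")),
    ("F4", ("M", "*", "23", "*", "*")),
    ("F5", ("T1", "T0", "123", "123", "UDP")),
    ("F6", ("*", "Net", "*", "*", "*")),
    ("F7", ("Net", "*", "*", "*", "TCP-ACK")),
    ("F8", ("*", "*", "*", "*", "*")),
]

def matches(fields, packet):
    # A filter matches when every field is a wildcard or equals the packet's field.
    return all(f == WILDCARD or f == p for f, p in zip(fields, packet))

def best_match(packet):
    # Linear search: the first matching filter is the least-cost one.
    for name, fields in FILTERS:
        if matches(fields, packet):
            return name
    return None

# The example packet (M, S, UDP, 53, 125), reordered to (Dest, Src, DP, SP, Flags).
packet = ("M", "S", "53", "125", "UDP")
print(best_match(packet))
```

The scan costs O(N) per packet, which is the cost the trie-based and cross-producting schemes on the following slides try to avoid.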


    Paper4: Set Pruning Trees (1)

    • Build a trie on the destination prefixes in the database

    • Each valid prefix in the destination trie points to a trie containing some source prefixes.

    • A single filter may apply under multiple destination prefixes, and is then copied into multiple source tries.

    • Memory space: O(N^2)

    • Time complexity: O(N)


    Set Pruning Trees (2)

    [Figure: destination trie (Dest-Trie) whose valid prefixes each point to a source trie (Src-Trie); filters F1 to F7 are stored at source-trie nodes, several of them copied into more than one source trie. E.g.: looking for (001, 001).]


    Avoid the Memory Blowup (1)

    • Avoid the copying by having each destination prefix D point to a source trie that stores the filters whose destination field is exactly D

    • When searching, the lookup may need to go back to the destination trie multiple times

    • Time complexity: O(W^2)

    • Space complexity: O(NW)
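
A minimal sketch of this copy-free scheme with backtracking. The `Node` class, the `insert` and `lookup` helpers, and the four example filters are all hypothetical (the slide's figure did not survive), prefixes are written as bit strings, and a lower cost number is assumed to mean a better filter.

```python
class Node:
    def __init__(self):
        self.child = {}       # "0" / "1" -> Node
        self.src_trie = None  # destination nodes only: root of the source trie
        self.filter = None    # source nodes only: (cost, name)

def insert(root, dst, src, cost, name):
    node = root
    for bit in dst:
        node = node.child.setdefault(bit, Node())
    if node.src_trie is None:
        node.src_trie = Node()
    node = node.src_trie  # filter is stored only under its exact destination prefix
    for bit in src:
        node = node.child.setdefault(bit, Node())
    if node.filter is None or cost < node.filter[0]:
        node.filter = (cost, name)

def lookup(root, dst, src):
    best = None
    dnode, depth = root, 0
    while dnode is not None:  # every destination prefix on the path, up to W + 1 nodes
        snode, i = dnode.src_trie, 0
        while snode is not None:  # restart from the source-trie root: up to W + 1 steps
            if snode.filter and (best is None or snode.filter < best):
                best = snode.filter
            snode = snode.child.get(src[i]) if i < len(src) else None
            i += 1
        dnode = dnode.child.get(dst[depth]) if depth < len(dst) else None
        depth += 1
    return best[1] if best else None

root = Node()
for dst, src, cost, name in [("0", "10", 1, "F1"), ("0", "01", 2, "F2"),
                             ("00", "0", 3, "F3"), ("", "00", 4, "F4")]:
    insert(root, dst, src, cost, name)
print(lookup(root, "001", "001"))
```

The nested loops make the O(W^2) lookup bound visible: each of the up to W + 1 destination prefixes triggers a fresh source-trie walk of up to W + 1 steps.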


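    Avoid the Memory Blowup (2)

    [Figure: destination trie (Dest-Trie) in which each prefix D points to one source trie (Src-Trie) holding only the filters whose destination field is exactly D; filters F1 to F7 each appear once. E.g.: looking for (001, 001).]

    Memory requirement = O(NW)

    Lookup worst case = O(W^2)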


    Improving Search Time: Basic Grid-of-Tries (1)

    • Basic idea:

      • Use pre-computation and switch pointers (in the lower-level tries) to speed up the search in a later source trie based on the search in an earlier source trie (i.e., remember the previous search result)

    • Role of the switch pointer

      • Allows the length of the matching source prefix to grow without restarting at the root of the next ancestor source trie.

      • Stored filter: node (D,S) stores the least-cost filter whose dest field is a prefix of D and whose src field is a prefix of S

    • Time complexity: 2W

    • Space complexity: O(NW)


    Improving Search Time: Basic Grid-of-Tries (2)

    [Figure: the destination trie (Dest-Trie) and its source tries (Src-Trie) holding filters F1 to F7, with switch pointers added between source tries; two nodes are labelled x and y. E.g.: looking for (001, 001).]


    Further Improvement & Extension

    • Use some faster scheme for destination address matching

      • Time complexity O(W) → O(log W)

    • Use multi-bit tries for source address matching

      • Time complexity O(W) → O(W/k)

    • Extend Grid-of-tries to handle protocol and port fields

      • 3 GOT copies for TCP, UDP and OTHER respectively,

      • 4 hash tables for 4 port combinations:

        • both unspecified, destination only, source only, both specified


    Cross-Producting (1)

    • How-to

      • Slice the filter database into columns, the i-th column storing all distinct prefixes in field i.

      • Build a cross-product table of all k columns

      • Pre-compute the least-cost filter that matches each cross-product entry

      • When a packet comes in, do a best matching prefix lookup for each field separately

      • Combine the per-field results to find the corresponding entry in the cross-product table

    • Discussion

      • Very fast (for matching)

      • Problem: memory explosion, up to N^k entries

      • Solution: on-demand cross-producting
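
The how-to steps can be sketched on the eight-filter example database. As before, this is an illustration under assumptions rather than the paper's code: field values are opaque tokens that match only themselves or the wildcard, "*" plays the role of the default entry, earlier filters have lower cost, and the names `COLUMNS`, `TABLE` and `lookup` are made up for the example.

```python
from itertools import product

# (name, (Dest, Src, DP, SP, Flags)), in order of increasing cost (assumed).
FILTERS = [
    ("F1", ("M", "*", "25", "*", "*")),
    ("F2", ("M", "*", "53", "*", "UDP")),
    ("F3", ("M", "S", "53", "*", "*")),
    ("F4", ("M", "*", "23", "*", "*")),
    ("F5", ("T1", "T0", "123", "123", "UDP")),
    ("F6", ("*", "Net", "*", "*", "*")),
    ("F7", ("Net", "*", "*", "*", "TCP-ACK")),
    ("F8", ("*", "*", "*", "*", "*")),
]

# Step 1: slice the database into columns of distinct values per field.
COLUMNS = [sorted({fields[i] for _, fields in FILTERS} | {"*"}) for i in range(5)]

def least_cost_filter(entry):
    for name, fields in FILTERS:
        if all(f == "*" or f == e for f, e in zip(fields, entry)):
            return name
    return None

# Steps 2 and 3: build the cross-product table and pre-compute the answer per entry.
TABLE = {entry: least_cost_filter(entry) for entry in product(*COLUMNS)}

def lookup(packet):
    # Steps 4 and 5: best match per field, then a single table probe.
    key = tuple(v if v in col else "*" for v, col in zip(packet, COLUMNS))
    return TABLE[key]

print(len(TABLE))
# The packet (M, S, UDP, 25, 57), reordered to (Dest, Src, DP, SP, Flags).
print(lookup(("M", "S", "25", "57", "UDP")))
```

With 4, 4, 5, 2 and 3 distinct values in the five columns, the table already has 480 entries for 8 filters, which is the memory explosion noted in the discussion.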


    Cross-Producting (2)

    Filter database:

    Filter  Dest  Src  DP   SP   Flags
    F1      M     *    25   *    *
    F2      M     *    53   *    UDP
    F3      M     S    53   *    *
    F4      M     *    23   *    *
    F5      T1    T0   123  123  UDP
    F6      *     Net  *    *    *
    F7      Net   *    *    *    TCP-ACK
    F8      *     *    *    *    *

    Distinct prefixes per field:

    Dest prefixes:      M, T1, Net, Default
    Src prefixes:       S, T0, Net, Default
    DestPort prefixes:  25, 53, 23, 123, Default
    SrcPort prefixes:   123, Default
    Flags prefixes:     UDP, TCP-ACK, Default

    Cross-product table:

    Num  CrossProduct                                  Matching Filter
    1    M, S, 25, 123, UDP                            F1
    2    M, S, 25, 123, TCP-ACK                        F1
    3    M, S, 25, 123, default                        F1
    4    M, S, 25, default, UDP                        F1
    5    M, S, 25, default, TCP-ACK                    F1
    6    M, S, 25, default, default                    F1
    …    …                                             …
    479  default, default, default, default, TCP-ACK   F8
    480  default, default, default, default, default   F8

    E.g. Looking for:

    (M,S,UDP,25,57)


    Paper4: Conclusions

    • Grid-of-tries (GOT) gives scalable (linear) storage and fast lookups for destination-source (D-S) filters.

      • More general filters → high lookup cost

    • Cross-producting has higher variance, because it needs caching, but is faster on average for lookups.

    • Hybrid scheme combines flexibility with efficiency.


    ABV: "Scalable Packet Classification", F. Baboescu, G. Varghese

    • GOAL

      • Packet classification

        • scalable (in the number of rules, up to 100,000)

        • wire speed

    • Past Work

      • Linear-time search

      • Linear amount of TCAMs

      • Lucent scheme (bit vector)

        • worst case doesn't scale


    ABV: Solution

    • Aggregated Bit Vector

      • improvement on Lucent bit vector

      • rule aggregation

      • rule rearrangement

    • Rule Aggregation

      • bit vectors are sparse

        • i.e., few rules match

      • Some compression scheme
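
A rough sketch of how rule aggregation can skip most of a sparse bit vector. The aggregate size `A = 4`, the helper names and the example vectors are assumptions for illustration; the bit vectors are plain Python lists with index 0 as the highest-priority rule.

```python
A = 4  # aggregate size in bits (hypothetical; would follow the memory word size)

def aggregate(bits):
    """One aggregate bit per A-bit chunk: the OR of the bits in that chunk."""
    return [int(any(bits[i:i + A])) for i in range(0, len(bits), A)]

def abv_match(vectors):
    """vectors: one bit list per dimension. Returns (first common rule, chunks read)."""
    n = len(vectors[0])
    aggs = [aggregate(v) for v in vectors]
    chunks_read = 0
    for c in range(len(aggs[0])):
        # Only chunks whose aggregate bit is set in every dimension are fetched.
        if all(a[c] for a in aggs):
            chunks_read += 1
            for i in range(c * A, min((c + 1) * A, n)):
                if all(v[i] for v in vectors):
                    return i, chunks_read
            # Otherwise it was a false match: the set bits were for different rules.
    return None, chunks_read

# Two sparse 12-bit vectors: only rule 9 matches in both dimensions.
v1 = [0, 1, 0, 0,  0, 0, 0, 0,  0, 1, 0, 0]
v2 = [0, 0, 1, 0,  0, 0, 0, 0,  0, 1, 0, 0]
print(abv_match([v1, v2]))
```

In the example the first chunk is a false match (both aggregate bits are set, but for different rules), the second chunk is skipped entirely, and the match is found in the third.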


    ABV: Solution (continued)

    • Rule Rearrangement

      • overlap is rare

      • place rules w/ common values together

      • sort out rule ordering later


    ABV: Comparing ABV with BV of Lucent


    ABV: Results

    • At least an order of magnitude faster than BV

    • Scales well for memory access


    Paper # 3: "Space Decomposition Techniques for Fast Layer-4 Switching", M. Buddhikot, S. Suri, M. Waldvogel

    • A new scheme, based on space decomposition, whose search time is comparable to the best existing schemes, but which also offers fast worst-case filter update time.

    • Three key ideas

      • innovative data-structure based on quadtrees for a hierarchical representation of the recursively decomposed search space

      • fractional cascading and precomputation to improve packet classification time

      • prefix partitioning to improve update time


    Space Decomposition: Evaluation

    • Depending on the actual requirements of the system in which the algorithm is deployed, a single parameter can be used to trade off search time against update time.

    • Amenable to fast software and hardware implementation.

    • For N two-dimensional filters specified using prefixes of up to W bits in length, the Area-based Quadtree (AQT) data structure requires O(N) space, O(αW) search time, and O(α N^(1/α)) update time, where α is the tunable parameter

    • Both the average and worst-case search times and memory consumption are comparable or better than other schemes known in the literature.
