Summary statistics sumstats framework
This presentation is the property of its rightful owner.
Sponsored Links
1 / 11

Summary Statistics (SumStats framework) PowerPoint PPT Presentation


  • 95 Views
  • Uploaded on
  • Presentation posted in: General

Summary Statistics (SumStats framework). What is SumStats?. Summary Statistics From Wikipedia: Summary statistics are used to summarize a set of observations, in order to communicate the largest amount as simply as possible. 2. Motivations. Load balancing made previous techniques all fail

Download Presentation

Summary Statistics (SumStats framework)

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Summary statistics sumstats framework

Summary Statistics(SumStats framework)


What is sumstats

What is SumStats?

  • Summary Statistics

  • From Wikipedia:

    • Summary statistics are used to summarize a set of observations, in order to communicate the largest amount as simply as possible

2


Motivations

Motivations

  • Load balancing made previous techniques all fail

    • SumStats framework hides cluster abstraction

  • Better and more repeatable interface and approach for measurement and thresholding

  • Give more people the ability to write real world deployable measurement scripts

3


Approach

Approach

  • Discrete time slices (epochs)

  • Only streaming algorithms allowed

  • Every measurement must be merge-able for cluster support

  • Probabilistic data structures

    • HyperLogLog & Top-K now

4


Why do any of this

Why do any of this?

Measurement is fun!

5


Sumstats based notices

SumStats Based Notices

200.29.31.26 had 349 failed logins on 2 FTP servers in 14m47s

92.253.122.14 scanned at least 29 unique hosts on port 445/tcp in 1m4s

88.124.212.10 scanned at least 41 unique hosts on port 445/tcp in 1m13s

212.55.8.177 scanned at least 75 unique hosts on port 5900/tcp in 0m36s

200.30.130.101 scanned at least 66 unique hosts on port 445/tcp in 1m20s

107.22.92.186 scanned at least 64 unique hosts on port 443/tcp in 0m1s

5.254.140.123 scanned at least 29 unique hosts on port 102/tcp in 4m1s

122.211.164.196 scanned 15 unique ports of host 75.89.37.60 in 0m5s

6


Summary statistics sumstats framework

Observation - type a

Observation - type a

Observation - type b

Observation - type a

Observation - type b

Reducer has: predicate, key normalizer

calculates: sum, avg, etc...

Reducer has: predicate, key normalizer

calculates: sum, avg, etc...

SumStat has: epoch, thresholds, epoch-end callbacks,

7


Observations

Observations

  • Observations observe a single point of data

    • An HTTP request

    • A DNS lookup

    • An ICMP message

8


Reducers

Reducers

  • Reducers collect observations and apply calculations to them

    • Sum of Content-Length headers

    • Unique number of DNS requests

9


Sumstat

SumStat

  • A SumStat collates multiple reducers

    • Set thresholds at ratios between reducers

      • e.g. Ratio of unique DNS requests and unique HOST names seen in HTTP traffic

    • Handle results from Reducers and do something

10


Summary statistics sumstats framework

Observation - type a

Observation - type a

Observation - type b

Observation - type a

Observation - type b

Reducer has: predicate, key normalizer

calculates: sum, avg, etc...

Reducer has: predicate, key normalizer

calculates: sum, avg, etc...

SumStat has: epoch, thresholds, epoch-end callbacks,

11


  • Login