Detecting Adversaries Using Metafeatures

Chad Mills

Program Manager

Windows Live Safety Platform

Microsoft

Content Filter

Assumption:

  • Spam words continue to appear in spam messages
  • Good words continue to appear in good messages

Feature weights (feature, weight):

  • (dollars, 0.2)
  • (million, 0.1)
  • (transfer, 0.1)
  • (guardian, 0.03)
  • (community, -0.01)
  • (social, -0.01)
  • (fellow, -0.01)
  • (March, -0.08)

Example messages:

  • million, dollars, transfer, guardian: score 0.37
  • March, community, social, fellow: score -0.11
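The scoring on this slide can be sketched as a sum of per-word weights. This is a minimal illustration, not the production filter: the function name `content_score` and the choice to score unknown words as 0 are assumptions. The weights are the ones shown on the slide. The listed weights for the first message's four words add up to 0.43 rather than the 0.37 on the slide, so the sketch only checks the second message.

```python
# Word weights as shown on the slide; positive means spammy, negative means good.
WEIGHTS = {
    "dollars": 0.2, "million": 0.1, "transfer": 0.1, "guardian": 0.03,
    "community": -0.01, "social": -0.01, "fellow": -0.01, "March": -0.08,
}

def content_score(words, weights=WEIGHTS):
    """Sum the weights of the words in a message.

    Assumption: words missing from the table contribute 0.
    """
    return sum(weights.get(w, 0.0) for w in words)

good_message = ["March", "community", "social", "fellow"]
print(round(content_score(good_message), 2))  # -0.11
```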


<style>

<br Bij board bar atteindre jYST GCS re sonrisa fuse Kiviuq padded />

<br Star Honolulu />

<br Ons apporter />

opens NRSU syringe />

<br Jerusalem comfort HTTPS 2604 confidence Miles />

<br 27 mails Qty backwards Meditations bans sedative ect salve <br insightful />

Korean relations header greeting Airllines Phantom CVS Rae 504 1009 perf<br graphiques />

undertaking paced Liquidation reduction />

From: "Chelsea Clark" <easyMoneySurveys@pointracer.com>

Subject: Get PaidFor yourOpinion

Finding good words

  • Spammy Words (Free, Nigeria, Viagra) + Good Message = Borderline Spam Message
  • Borderline Spam + Unknown Words (late, click, commissioner) = Inbox, so these are Good Words
  • Borderline Spam + Unknown Words (newsletter, select, month) = Junk Folder, so these are Non-Good Words
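The probing procedure on this slide can be sketched as a loop over groups of unknown words. Everything here is hypothetical: the toy weights, the junk threshold, the good-message words `meeting` and `agenda`, and the `lands_in_inbox` oracle all stand in for a real filter the sender cannot see. Integer weights are used only to keep the arithmetic exact.

```python
# Hypothetical filter weights (not from the slides).
TOY_WEIGHTS = {
    "Free": 10, "Nigeria": 10, "Viagra": 10,   # spammy words
    "meeting": -15, "agenda": -15,             # words of the good message
    "late": -2, "click": -1, "commissioner": -2,
    "newsletter": 0, "select": 0, "month": 0,
}

def lands_in_inbox(words, weights=TOY_WEIGHTS, junk_threshold=0):
    """Toy oracle: a message reaches the inbox if its score is below the threshold."""
    return sum(weights.get(w, 0) for w in words) < junk_threshold

def probe_words(borderline, candidate_groups, oracle=lands_in_inbox):
    """Append each group of unknown words to the borderline message and
    sort the group by where the combined message is delivered."""
    good, non_good = [], []
    for group in candidate_groups:
        (good if oracle(borderline + group) else non_good).extend(group)
    return good, non_good

borderline = ["Free", "Nigeria", "Viagra", "meeting", "agenda"]  # scores exactly 0
groups = [["late", "click", "commissioner"], ["newsletter", "select", "month"]]
good_words, non_good_words = probe_words(borderline, groups)
print(good_words)      # ['late', 'click', 'commissioner']
print(non_good_words)  # ['newsletter', 'select', 'month']
```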

Application: Chaff

Chaff Spam

  • [spam content]
  • newsletter peers month select these
  • late click commissioner media
  • smoothly off close support before
  • okay sponsor rock go by ads
  • none cases text membership

Legitimate Mail

March is all about the Zune community. This month, you can help create a new feature for The Social, get tips from a fellow Zune user and find out the winners of the Your Zune Your Choice Awards.

Example Metafeatures
  • Sum of weights (content filter score)
  • Average weight
  • Standard Deviation
  • Percent of words that are good
  • Percent of words that are spam
  • Number of features
  • Maximum feature weight
  • Number of strong spam words
  • Etc.
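The metafeatures listed above can be computed from the list of word weights in a single message. This is a minimal sketch under stated assumptions: the dictionary keys, the use of the population standard deviation, and the `STRONG_SPAM` cutoff of 0.1 are illustrative choices, not values from the talk. The sign convention (positive is spammy, negative is good) follows the Content Filter slide.

```python
from statistics import mean, pstdev

STRONG_SPAM = 0.1  # hypothetical cutoff for a "strong spam word"

def metafeatures(weights):
    """Summarize the distribution of feature weights in one message."""
    if not weights:
        raise ValueError("need at least one feature weight")
    n = len(weights)
    return {
        "sum": sum(weights),                      # content filter score
        "average": mean(weights),
        "std_dev": pstdev(weights),               # population standard deviation
        "pct_good": sum(w < 0 for w in weights) / n,
        "pct_spam": sum(w > 0 for w in weights) / n,
        "num_features": n,
        "max_weight": max(weights),
        "num_strong_spam": sum(w >= STRONG_SPAM for w in weights),
    }

# Spam words plus good-word chaff, using the weights from the Content Filter slide.
spam_with_chaff = [0.2, 0.1, 0.1, 0.03, -0.08, -0.01, -0.01, -0.01]
for name, value in metafeatures(spam_with_chaff).items():
    print(name, round(value, 3))
```

In this mixed message half the words are good and half are spam, while three strong spam words remain. A pattern like this is what the metafeatures are meant to expose.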
Metafeatures

Features (feature, weight):

  • (dollars, 0.2)
  • (million, 0.1)
  • (transfer, 0.1)
  • (community, -0.01)
  • (social, -0.01)
  • (fellow, -0.01)
  • (guardian, 0.03)
  • (March, -0.08)

Metafeatures (Metafeature, weight):

  • (Sum: 0.37, 1.0)
  • (σ: 0.09, 0.8)
  • (Max: 0.2, 0.1)
  • (Sum: -0.11, -0.8)
  • (σ: 0.04, -0.6)
  • (Max: -0.1, -0.3)

Example messages:

  • million, dollars, transfer, guardian: Sum: 0.37, σ: 0.09, Max: 0.2; metafeature score 1.9
  • March, community, social, fellow: Sum: -0.11, σ: 0.04, Max: -0.1; metafeature score -1.7
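The second-level scoring on this slide can be sketched by treating each metafeature value as a feature with its own weight and summing those weights. This is only an illustration: keying the table on exact value labels and the name `metafeature_score` are assumptions, and a real system would presumably group metafeature values into ranges. The weights are the ones shown on the slide.

```python
# (metafeature name, value label) -> weight, as shown on the slide.
METAFEATURE_WEIGHTS = {
    ("Sum", "0.37"): 1.0, ("sigma", "0.09"): 0.8, ("Max", "0.2"): 0.1,
    ("Sum", "-0.11"): -0.8, ("sigma", "0.04"): -0.6, ("Max", "-0.1"): -0.3,
}

def metafeature_score(metafeatures, weights=METAFEATURE_WEIGHTS):
    """Sum the weights of a message's metafeatures.

    Assumption: metafeature values with no entry in the table contribute 0.
    """
    return sum(weights.get(mf, 0.0) for mf in metafeatures)

spam_message = [("Sum", "0.37"), ("sigma", "0.09"), ("Max", "0.2")]
good_message = [("Sum", "-0.11"), ("sigma", "0.04"), ("Max", "-0.1")]
print(round(metafeature_score(spam_message), 1))  # 1.9
print(round(metafeature_score(good_message), 1))  # -1.7
```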

Evaluation Data
  • Hotmail Feedback Loop
    • Messages classified by recipients
  • Training Set: 1,800,000 messages
    • Ending on 5/20/07
  • Evaluation Set: 50,000 messages
    • Data from 5/21/07
Evaluation Results

45% improvement in true positives (TP) at low false positive (FP) levels
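A comparison of this kind can be sketched by fixing a false positive budget and measuring the true positive rate each filter achieves within it. The slides do not say how the 45% figure was computed, so the relative-improvement formula below is an assumption. The scores and labels are toy values made up for illustration, not evaluation data.

```python
def tp_rate_at_fp(scores, labels, max_fp_rate):
    """Best true positive rate over all thresholds whose false positive
    rate stays at or below max_fp_rate. labels: True means spam."""
    spam = [s for s, is_spam in zip(scores, labels) if is_spam]
    good = [s for s, is_spam in zip(scores, labels) if not is_spam]
    best = 0.0
    for threshold in sorted(set(scores)):
        fp_rate = sum(s >= threshold for s in good) / len(good)
        if fp_rate <= max_fp_rate:
            tp_rate = sum(s >= threshold for s in spam) / len(spam)
            best = max(best, tp_rate)
    return best

# Toy data: four spam messages followed by four good messages.
labels = [True] * 4 + [False] * 4
base_scores = [0.9, 0.4, 0.2, 0.1, 0.3, 0.0, -0.1, -0.2]  # hypothetical base filter
new_scores = [0.9, 0.5, 0.4, 0.1, 0.3, 0.0, -0.1, -0.2]   # hypothetical improved filter

base_tp = tp_rate_at_fp(base_scores, labels, max_fp_rate=0.0)
new_tp = tp_rate_at_fp(new_scores, labels, max_fp_rate=0.0)
relative_improvement = (new_tp - base_tp) / base_tp
print(base_tp, new_tp, relative_improvement)  # 0.5 0.75 0.5
```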

Qualitative Results
  • At a reasonable False Positive rate:
    • 98% of unique catches are chaff spam
    • Caught 99.5% of chaff spam missed by regular content filter
    • Similar types of False Positives as regular filter
  • Challenges Remaining
    • Primarily just helped on spam with chaff
    • Relies on base content filter to detect spam with obfuscated content (e.g. v1agra) or naïve spam without any chaff
Conclusions
  • Spam messages with good word chaff have unnatural weight distributions
  • Metafeatures are able to identify and catch these messages
  • This resulted in a 45% improvement in TP
  • Gains were limited to spam with good word chaff