Foundations of Adversarial Learning
Daniel Lowd, University of Washington
Christopher Meek, Microsoft Research
Pedro Domingos, University of Washington

Motivation
Many adversarial problems:
Spam filtering
Intrusion detection
Malware detection
New ones every year!
From: spammer@example.com
Cheap mortgage now!!!
Feature Weights:
cheap = 1.0
mortgage = 1.5
Total score = 2.5 > 1.0 (threshold): Spam
From: spammer@example.com
Cheap mortgage now!!! Cagliari Sardinia
Feature Weights:
cheap = 1.0
mortgage = 1.5
Cagliari = -1.0
Sardinia = -1.0
Total score = 0.5 < 1.0 (threshold): OK
From: spammer@example.com
Cheap mortgage now!!! Cagliari Sardinia
Feature Weights:
cheap = 1.5
mortgage = 2.0
Cagliari = -0.5
Sardinia = -0.5
Total score = 2.5 > 1.0 (threshold): Spam
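The three scoring examples can be reproduced with a few lines of Python. This is a minimal sketch, assuming the filter simply sums the weights of the words present and that the weights for Cagliari and Sardinia are negative (the totals of 0.5 and 2.5 only add up that way). The names `score`, `classify`, `first_filter`, and `retrained_filter` are illustrative and not from the slides.

```python
THRESHOLD = 1.0

def score(message, weights):
    """Sum the weights of the known words that occur in the message."""
    words = message.lower().replace("!", " ").split()
    return sum(weights.get(word, 0.0) for word in words)

def classify(message, weights, threshold=THRESHOLD):
    """Label a message Spam when its score exceeds the threshold."""
    return "Spam" if score(message, weights) > threshold else "OK"

# Weights as shown on the slides; the minus signs are an assumption.
first_filter = {"cheap": 1.0, "mortgage": 1.5,
                "cagliari": -1.0, "sardinia": -1.0}
retrained_filter = {"cheap": 1.5, "mortgage": 2.0,
                    "cagliari": -0.5, "sardinia": -0.5}

plain = "Cheap mortgage now!!!"
padded = "Cheap mortgage now!!! Cagliari Sardinia"

for name, weights, message in [
    ("first filter, plain spam", first_filter, plain),
    ("first filter, padded spam", first_filter, padded),
    ("retrained filter, padded spam", retrained_filter, padded),
]:
    print(name, score(message, weights), classify(message, weights))
```

Under these assumptions the padded message slips past the first filter and is caught again after retraining, which is the back-and-forth the examples illustrate.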

[Figure: plots of the instance space with axes X1 and X2, showing an instance x and the + and − classes]
Instance space: X = {X1, X2, …, Xn}. Each Xi is a feature. Instances x ∈ X (e.g., emails).
Classifier: c(x): X → {+, −}, with c ∈ C, the concept class (e.g., linear classifier).
Adversarial cost function: a(x): X → R, with a ∈ A (e.g., more legible spam is better).

Classifier’s Task: Choose a new c’(x) to minimize (cost-sensitive) error.
Adversary’s Task: Choose x to minimize a(x) subject to c(x) = −.
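When the classifier is known and the instance space is tiny, the adversary's task can be solved by brute force. The sketch below assumes Boolean word features, the example weights with negative signs on Cagliari and Sardinia, and a hypothetical cost in which adding a word costs 1 and removing a word from the ideal message costs 3 (so legible spam is preferred). None of these cost values come from the slides.

```python
from itertools import product

FEATURES = ["cheap", "mortgage", "cagliari", "sardinia"]
WEIGHTS = {"cheap": 1.0, "mortgage": 1.5, "cagliari": -1.0, "sardinia": -1.0}
THRESHOLD = 1.0

IDEAL = (1, 1, 0, 0)   # the message the adversary would like to send
ADD_COST = 1.0         # hypothetical cost of adding a word
REMOVE_COST = 3.0      # hypothetical cost of dropping a word from IDEAL

def c(x):
    """Classifier: '+' (spam) iff the weighted sum exceeds the threshold."""
    total = sum(WEIGHTS[f] for f, present in zip(FEATURES, x) if present)
    return "+" if total > THRESHOLD else "-"

def a(x):
    """Adversarial cost: price of the changes made relative to IDEAL."""
    cost = 0.0
    for xi, ideal in zip(x, IDEAL):
        if xi and not ideal:
            cost += ADD_COST
        elif ideal and not xi:
            cost += REMOVE_COST
    return cost

def adversary_best(classifier, cost):
    """Enumerate every instance and keep the cheapest one labeled '-'."""
    negatives = [x for x in product((0, 1), repeat=len(FEATURES))
                 if classifier(x) == "-"]
    return min(negatives, key=cost)

best = adversary_best(c, a)
print(best, a(best))
```

With these assumed costs the cheapest negative instance keeps both spam words and adds both place names, matching the padded message in the example. Enumeration is exponential in the number of features, so it only serves to make the objective concrete.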
From: spammer@example.com
Cheap mortgage now!!!
Learned weights: cheap = 1.0, mortgage = 1.5, Cagliari = -1.0, Sardinia = -1.0
From: spammer@example.com
Cheap mortgage now!!! Cagliari Sardinia
Learned weights: cheap = 1.5, mortgage = 2.0, Cagliari = -0.5, Sardinia = -0.5
“If you know the enemy and know yourself, you need not fear the result of a hundred battles.”
Sun Tzu, 500 BC

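Adversary’s Task: Minimize a(x) subject to c(x) = −
Problem: The adversary doesn’t know c(x)!
[Figure: instance space with axes X1 and X2; one + and one − instance are known, the labels of the other instances are unknown (?)]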
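One way an adversary might cope with an unknown c(x) is to probe it with queries and observe only the returned labels. The sketch below assumes such query access, continuous features, and a single decision boundary crossing on the segment between a known positive and a known negative instance. It uses a plain binary search; the function name `find_boundary` and the hidden weights are hypothetical.

```python
def find_boundary(query, x_pos, x_neg, tol=1e-6):
    """Binary-search the segment from x_pos to x_neg using label queries.

    Returns two nearby points: the last one found labeled '+' and the
    first one found labeled '-'.
    """
    def point(t):
        # t = 0 gives x_pos, t = 1 gives x_neg.
        return [p + t * (n - p) for p, n in zip(x_pos, x_neg)]

    if query(point(0.0)) != "+" or query(point(1.0)) != "-":
        raise ValueError("need one positive and one negative endpoint")

    lo, hi = 0.0, 1.0  # invariant: point(lo) is '+', point(hi) is '-'
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if query(point(mid)) == "+":
            lo = mid
        else:
            hi = mid
    return point(lo), point(hi)

# Hypothetical hidden classifier, used only to answer queries in this demo.
HIDDEN_W = [1.0, 2.0]
HIDDEN_T = 1.0

def hidden_classifier(x):
    total = sum(w * xi for w, xi in zip(HIDDEN_W, x))
    return "+" if total > HIDDEN_T else "-"

barely_pos, barely_neg = find_boundary(hidden_classifier,
                                       [1.0, 1.0], [0.0, 0.0])
print(barely_pos, barely_neg)
```

The adversary never reads `HIDDEN_W` or `HIDDEN_T`; it learns where the boundary lies from about twenty labels.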
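Goal: find a negative instance whose cost is within a factor of k of the minimum.
[Figure: the adversary’s ideal instance xa in the instance space with axes X1 and X2]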
Linear classifier: c(x) = + iff (w · x > T)
Linear cost function:
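The two definitions can be written down directly. The sketch below takes the classifier rule as stated and assumes the linear cost is a weighted L1 distance from the adversary's ideal instance xa, a(x) = Σ ai |xi − xai|. That form, the per-feature costs, and the helper `within_factor` are assumptions made for illustration.

```python
def c(x, w, T):
    """Linear classifier: '+' iff w . x > T."""
    return "+" if sum(wi * xi for wi, xi in zip(w, x)) > T else "-"

def a(x, xa, costs):
    """Assumed linear cost: weighted L1 distance from the ideal instance xa."""
    return sum(ai * abs(xi - xai) for ai, xi, xai in zip(costs, x, xa))

def within_factor(candidate_cost, minimal_cost, k):
    """True when a candidate's cost is at most k times the minimal cost."""
    return candidate_cost <= k * minimal_cost

# Features: cheap, mortgage, Cagliari, Sardinia (signs are an assumption).
w = [1.0, 1.5, -1.0, -1.0]
T = 1.0
xa = [1, 1, 0, 0]            # ideal message: "Cheap mortgage"
costs = [1.0, 1.0, 1.0, 1.0] # hypothetical unit cost per changed feature

padded = [1, 1, 1, 1]        # add both place names
shortened = [1, 0, 0, 0]     # drop "mortgage"

print(c(xa, w, T), c(padded, w, T), c(shortened, w, T))
print(a(padded, xa, costs), a(shortened, xa, costs))
print(within_factor(a(padded, xa, costs), a(shortened, xa, costs), k=2))
```

With unit costs, dropping one word is the cheaper way to reach the negative class, and the padded message is within a factor of 2 of that cost.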
[Figure: instances x, y, y’ and the ideal instance xa relative to the classifier c(x), with feature weights wi, wj, wk, wl, wm, wp]
Iteratively reduce the cost in two ways:
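A minimal sketch of such an iterative reduction, assuming Boolean features, strictly positive per-feature change costs, label queries only, and that the two ways are (1) undoing a change that turns out to be unnecessary and (2) replacing two changes with a single cheaper one. These two moves are an assumption of this sketch, and the names `reduce_cost`, `flip`, and the hidden weights are hypothetical.

```python
from itertools import combinations

def flip(x, indices):
    """Return a copy of x with the given Boolean features toggled."""
    out = list(x)
    for i in indices:
        out[i] = 1 - out[i]
    return out

def cost(x, xa, costs):
    """Total price of the features where x differs from the ideal xa."""
    return sum(costs[i] for i in range(len(x)) if x[i] != xa[i])

def cheaper_negative(query, xa, x, costs):
    """Try both moves once; return a cheaper negative instance or None."""
    changed = [i for i in range(len(x)) if x[i] != xa[i]]
    unchanged = [i for i in range(len(x)) if x[i] == xa[i]]
    # Way 1 (assumed): undo a single change if the label stays negative.
    for i in changed:
        cand = flip(x, [i])
        if query(cand) == "-":
            return cand
    # Way 2 (assumed): swap two changes for one cheaper change.
    for j, k in combinations(changed, 2):
        for m in unchanged:
            if costs[m] < costs[j] + costs[k]:
                cand = flip(x, [j, k, m])
                if query(cand) == "-":
                    return cand
    return None

def reduce_cost(query, xa, y, costs):
    """Start from a negative instance y and apply moves until none helps."""
    x = list(y)
    if query(x) != "-":
        raise ValueError("starting instance must be labeled negative")
    while True:
        better = cheaper_negative(query, xa, x, costs)
        if better is None:
            return x
        x = better

# Demo with a hypothetical hidden filter over five word features:
# cheap, mortgage, Cagliari, Sardinia, meeting.
HIDDEN_W = [1.0, 1.5, -1.0, -1.0, -2.0]

def query(x):
    return "+" if sum(w * xi for w, xi in zip(HIDDEN_W, x)) > 1.0 else "-"

xa = [1, 1, 0, 0, 0]
start = [1, 1, 1, 1, 0]
costs = [3.0, 3.0, 1.0, 1.0, 1.0]
result = reduce_cost(query, xa, start, costs)
print(result, cost(result, xa, costs))
```

Every accepted move strictly lowers the cost, so the loop terminates. In the demo the two added place names are swapped for one stronger word, halving the cost.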
Score scale from “good” to “spammy”:
Original legit.: Hi, mom!
“Barely legit.”: now!!!
“Barely spam”: mortgage now!!!
Original spam: Cheap mortgage now!!!
[Figure: the “barely spam” message relative to the Spam/Legitimate threshold, with good words and less good words]
Key idea: use spammy words to sort the good words.
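A minimal sketch of this idea, assuming the adversary can only submit messages and read back the label. It first shrinks a spam message until one more removal would make it legitimate, then tests candidate words against that "barely spam" message, and finally ranks the good words by how many spammy words each can offset. The hidden weights, the word lists, and the ranking rule are assumptions for illustration.

```python
# Hypothetical hidden filter; the adversary only ever calls query().
HIDDEN = {"cheap": 1.0, "mortgage": 1.5, "cagliari": -1.0, "sardinia": -0.8,
          "meeting": -0.6, "free": 1.0, "offer": 0.2, "deal": 0.2, "win": 0.2}
THRESHOLD = 1.0

def query(words):
    """Black-box filter: returns only the label of a list of words."""
    total = sum(HIDDEN.get(w, 0.0) for w in words)
    return "spam" if total > THRESHOLD else "legit"

def barely_spam(spam_words):
    """Drop words one at a time; return the last message still labeled spam."""
    current = list(spam_words)
    if query(current) != "spam":
        raise ValueError("starting message must be labeled spam")
    while current:
        shorter = current[1:]
        if query(shorter) == "legit":
            return current
        current = shorter
    raise ValueError("even the empty message is labeled spam")

def good_words(base, candidates):
    """A word is good if adding it alone flips the base message to legit."""
    return [w for w in candidates if query(base + [w]) == "legit"]

def strength(base, word, spammy):
    """Count the spammy words that can be added before spam is flagged again."""
    message = base + [word]
    count = 0
    for s in spammy:
        message = message + [s]
        if query(message) == "spam":
            break
        count += 1
    return count

def rank_good_words(base, candidates, spammy):
    """Sort the good words from strongest to weakest."""
    good = good_words(base, candidates)
    return sorted(good, key=lambda w: strength(base, w, spammy), reverse=True)

base = barely_spam(["cheap", "mortgage", "now"])
candidates = ["meeting", "the", "sardinia", "free", "cagliari"]
spammy = ["offer", "deal", "win"]

print(base)
print(good_words(base, candidates))
print(rank_good_words(base, candidates, spammy))
```

Because the base message sits just above the threshold, a single query per candidate is enough to tell whether a word is good, and the spammy words act as a ruler for comparing good words with each other.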
[Figure: good words ordered from better to worse relative to the Spam/Legitimate threshold]
Cost: a linear combination of words added and words removed.