Midterm Exam: next Wednesday, in class, closed book. Topics: Perceptrons, Decision Trees, SVMs, Computational Learning Theory.

PAC Learnability. Consider a concept class C defined over an instance space X (containing instances of length n), and a learner L using a hypothesis space H.
CS446, Spring 06
We want this probability to be smaller than δ, that is:

|H| (1 − ε)^m < δ

ln|H| + m ln(1 − ε) < ln(δ)

(With e^(−x) = 1 − x + x²/2 − …, we have e^(−x) > 1 − x, so ln(1 − ε) < −ε; replacing ln(1 − ε) by −ε gives a safer bound.)

(A gross over-estimate.)
It is called Occam's razor because it indicates a preference towards small hypothesis spaces.
What kind of hypothesis spaces do we want? Large? Small?
m > (1/ε)(ln|H| + ln(1/δ))
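Plugging numbers into this bound is straightforward; the sketch below (function name and example values are mine, not from the slides) computes the number of examples the bound calls for:

```python
from math import ceil, log

def occam_sample_complexity(h_size, epsilon, delta):
    """Occam's razor bound: m > (1/epsilon) * (ln|H| + ln(1/delta)) examples
    suffice for any consistent hypothesis to have true error < epsilon
    with probability at least 1 - delta."""
    return ceil((log(h_size) + log(1.0 / delta)) / epsilon)

# e.g. |H| = 2^10 hypotheses, epsilon = 0.1, delta = 0.05
print(occam_sample_complexity(2 ** 10, 0.1, 0.05))  # -> 100
```

Note that the dependence on |H| and 1/δ is only logarithmic, while the dependence on 1/ε is linear.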
How do we find the consistent hypothesis h?
Example: n = 4, k = 2; monotone k-CNF. Each new variable corresponds to one of the C(4,2) = 6 pairwise disjunctions (x_i ∨ x_j), so learning reduces to learning a monotone conjunction over the new variables; labels l are unchanged.
Original examples: (0000, l) (1010, l) (1110, l) (1111, l)
New examples: (000000, l) (111101, l) (111111, l) (111111, l)
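The transformation above can be sketched as follows (the function name is mine; it maps each example to the pairwise disjunctions (x_i ∨ x_j), i < j, in lexicographic order of (i, j), which reproduces the new examples on the slide):

```python
from itertools import combinations

def to_clause_features(x):
    """Map an n-bit example to one bit per monotone 2-literal clause
    (x_i or x_j), i < j, so that learning a monotone 2-CNF over the
    original variables becomes learning a monotone conjunction over
    the new variables."""
    bits = [int(b) for b in x]
    return ''.join(str(bits[i] | bits[j])
                   for i, j in combinations(range(len(bits)), 2))

for x in ['0000', '1010', '1110', '1111']:
    print(to_clause_features(x))  # -> 000000, 111101, 111111, 111111
```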
Unbiased learning: consider the hypothesis space of all Boolean functions on n features.
There are 2^(2^n) different functions, so the term ln|H| in the bound is exponential in n.
The bound is not tight, so this is NOT a proof; but it is possible to prove that the number of required examples does grow exponentially.
k-CNF: conjunctions of any number of clauses, where each disjunctive clause has at most k literals.
k-clause-CNF: conjunctions of at most k disjunctive clauses.
k-term-DNF: disjunctions of at most k conjunctive terms.
[Figure: concept class C drawn inside the larger hypothesis space H.]
Importance of representation:
Concepts that cannot be learned using one representation can sometimes be learned using another (more expressive) representation.
Attractiveness of k-term-DNF for human concepts
[Three figures: points in the plane (axes X, Y) labeled + and −, setting up the rectangle-learning problem.]
Will we be able to learn the target rectangle?
Some close approximation?
Some low-loss approximation?
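One natural learner here (the slides only pose the question, so this is an illustrative sketch, not the slides' answer) is the tightest-fit rectangle: return the smallest axis-aligned rectangle containing all positive examples.

```python
def tightest_rectangle(positives):
    """Smallest axis-aligned rectangle containing all positive points,
    as (x_min, x_max, y_min, y_max)."""
    xs = [x for x, _ in positives]
    ys = [y for _, y in positives]
    return (min(xs), max(xs), min(ys), max(ys))

def inside(rect, point):
    """Predict + iff the point falls inside the learned rectangle."""
    x0, x1, y0, y1 = rect
    x, y = point
    return x0 <= x <= x1 and y0 <= y <= y1

rect = tightest_rectangle([(1, 1), (2, 3), (4, 2)])
print(rect)                  # (1, 4, 1, 3)
print(inside(rect, (3, 2)))  # True
print(inside(rect, (5, 2)))  # False
```

Because the learned rectangle is always contained in the target, it can only err by mislabeling positives near the target's border, which is what makes a close (low-loss) approximation achievable.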
[Figure: points on the real line (origin 0) labeled + and −, with a threshold marked a.]
Shattering
[Two figures: points on the real line (origin 0) labeled + and −, each with a threshold marked a.]
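Shattering can be checked by brute force (names and sample values below are mine). For threshold functions f_a(x) = 1 iff x > a, a single point is shattered, but no two points are: the labeling (+, −) with the + point on the left is unreachable, since x > a is monotone. The pool below contains one threshold per region between the sample points, so for these points it is exhaustive up to equivalence.

```python
def threshold(a):
    """Concept f_a(x) = 1 iff x > a."""
    return lambda x: x > a

def shatters(points, hypotheses):
    """True iff the hypothesis pool realizes every labeling of the points."""
    labelings = {tuple(h(p) for p in points) for h in hypotheses}
    return len(labelings) == 2 ** len(points)

pool = [threshold(a) for a in (-1.0, 0.5, 1.5, 3.0)]
print(shatters([1.0], pool))       # True: both labelings of one point
print(shatters([1.0, 2.0], pool))  # False: (+, -) is unreachable
```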
This is the set of functions (concept class) considered here
[Figure: points on the real line, labeled − outside and + between the two markers a and b.]
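The same brute-force check (again an illustrative sketch, names mine) for the interval class f_{a,b}(x) = 1 iff a ≤ x ≤ b: two points are shattered, but three are not, since no single interval can produce the labeling (+, −, +).

```python
from itertools import combinations

def interval(a, b):
    """Concept f_{a,b}(x) = 1 iff a <= x <= b."""
    return lambda x: a <= x <= b

def shatters(points, hypotheses):
    """True iff the hypothesis pool realizes every labeling of the points."""
    labelings = {tuple(h(p) for p in points) for h in hypotheses}
    return len(labelings) == 2 ** len(points)

# one candidate endpoint in each region between the sample points,
# so the pool is exhaustive up to equivalence for these points
endpoints = (0.0, 0.5, 1.5, 2.5, 3.5)
pool = [interval(a, b) for a, b in combinations(endpoints, 2)]

print(shatters([1.0, 2.0], pool))       # True
print(shatters([1.0, 2.0, 3.0], pool))  # False: (+, -, +) impossible
```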
[Two figures: points on the real line labeled + and −, with markers a and b; one shows the labeling +, −, +.]
[Two figures: points in the plane labeled + and −; the second includes a +, −, −, + configuration.]