
Vapnik-Chervonenkis Dimension



  1. Vapnik-Chervonenkis Dimension Part II: Lower and Upper bounds

  2. PAC Learning model • There exists a distribution D over domain X • Examples: <x, c(x)> • Goal: • With high probability (1-δ) • find h in H such that • error(h,c) < ε
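A minimal Python sketch of this setup (the domain, distribution, and the particular c and h below are hypothetical placeholders, not from the slides): examples are drawn i.i.d. from D and labeled by the target c, and error(h,c) is the probability mass on which h and c disagree.

    import random

    def draw_examples(D, c, m):
        # m labeled examples <x, c(x)>, with x drawn i.i.d. from D
        return [(x, c(x)) for x in (D() for _ in range(m))]

    def estimate_error(h, c, D, trials=100000):
        # Monte Carlo estimate of error(h,c) = Pr_D[h(x) != c(x)]
        return sum(h(x) != c(x) for x in (D() for _ in range(trials))) / trials

    # Hypothetical instance: X = [0,1), target threshold 0.5, hypothesis 0.6
    D = random.random
    c = lambda x: int(x >= 0.5)
    h = lambda x: int(x >= 0.6)
    print(draw_examples(D, c, 3))
    print(estimate_error(h, c, D))   # ~0.1, the mass where h and c disagree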

  3. Definitions: Projection • Given a concept c over X • associate it with a set (all positive examples) • Projection (sets) • For a concept class C and subset S: ΠC(S) = { c ∩ S | c ∈ C } • Projection (vectors) • For a concept class C and S = {x1, … , xm}: ΠC(S) = { <c(x1), … , c(xm)> | c ∈ C }
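A small sketch of the vector projection for a finite class (the threshold class below is a hypothetical example, not from the slides):

    def projection(C, S):
        # Vector projection ΠC(S): the set of labelings that C induces on S
        return {tuple(c(x) for x in S) for c in C}

    # Hypothetical finite class: threshold concepts t = 0..4 on integers
    C = [lambda x, t=t: int(x >= t) for t in range(5)]
    S = [1, 3]
    print(projection(C, S))   # {(1, 1), (0, 1), (0, 0)}: 3 of the 4 labelings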

  4. Definition: VC-dim • Clearly |ΠC(S)| ≤ 2^m • C shatters S if |ΠC(S)| = 2^m • VC dimension of a class C: • The size d of the largest set S that is shattered by C. • Can be infinite. • For a finite class C • VC-dim(C) ≤ log2 |C|
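The same idea gives a brute-force shattering test and VC-dimension computation for small finite cases (a sketch; function names are mine):

    from itertools import combinations, product

    def shatters(C, S):
        # C shatters S iff all 2^|S| labelings appear in the projection
        return {tuple(c(x) for x in S) for c in C} == set(product((0, 1), repeat=len(S)))

    def vc_dim(C, X):
        # Largest size of a subset of X shattered by C (brute force)
        d = 0
        for k in range(1, len(X) + 1):
            if any(shatters(C, S) for S in combinations(X, k)):
                d = k
        return d

    # Thresholds on a line shatter any single point but no pair
    X = list(range(6))
    C = [lambda x, t=t: int(x >= t) for t in range(7)]
    print(vc_dim(C, X))   # 1: the labeling (1, 0) is never realized on a pair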

  5. Lower bounds: Setting • Static learning algorithm: • asks for a sample S of size m(ε,δ) • Based on S selects a hypothesis

  6. Lower bounds: Setting • Theorem: • If VC-dim(C) = ∞ then C is not learnable. • Proof: • Let m = m(0.1,0.1) • Find 2m points which are shattered (set T) • Let D be the uniform distribution on T • Set ct(xi)=1 with probability ½. • Expected error ¼. • Finish proof!
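One way to finish (the slide leaves this as an exercise; this is the standard argument, not taken from the deck): the sample reveals at most m of the 2m shattered points, and on each unseen point the random target ct agrees with any fixed hypothesis with probability ½, so

    \mathbb{E}[\mathrm{error}(h, c_t)]
      \;\ge\; \Pr[x \text{ unseen}] \cdot \tfrac{1}{2}
      \;\ge\; \tfrac{1}{2} \cdot \tfrac{1}{2} \;=\; \tfrac{1}{4},

while learnability with ε = δ = 0.1 would force E[error] ≤ δ·1 + (1-δ)·ε = 0.19 < ¼, a contradiction.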

  7. Lower Bound: Feasible • Theorem • If VC-dim(C) = d+1, then m(ε,δ) = Ω(d/ε) • Proof: • Let T be a set of d+1 points which is shattered. • Let the distribution D be: • z0 with prob. 1-8ε • zi with prob. 8ε/d

  8. Continue • Set ct(z0)=1 and ct(zi)=1 with probability ½ • Expected error 2ε • Bound confidence • for accuracy ε
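Where the 2ε comes from (a sketch, under the assumption m ≤ d/(16ε), the regime the lower bound rules out): in expectation the sample hits at most 8εm ≤ d/2 of the points z1, …, zd, so at least d/2 of them are unseen, and their random labels are independent of the hypothesis:

    \mathbb{E}[\mathrm{error}(h, c_t)]
      \;\ge\; \underbrace{\frac{d}{2} \cdot \frac{8\varepsilon}{d}}_{\text{unseen mass}} \cdot \frac{1}{2}
      \;=\; 2\varepsilon .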

  9. Lower Bound: Non-Feasible • Theorem • For two hypotheses m(ε,δ) = Ω((log 1/δ)/ε²) • Proof: • Let H = {h0, h1}, where hb(x) = b • Two distributions: • D0: Pr[<x,1>] = ½ - γ and Pr[<x,0>] = ½ + γ • D1: Pr[<x,1>] = ½ + γ and Pr[<x,0>] = ½ - γ
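A hedged simulation of this two-point game (the majority-vote decision rule and all names below are my choices, not the slide's): the learner sees m labels of x and must decide whether D0 or D1 generated them. The majority rule fails with probability roughly exp(-2γ²m) by Hoeffding's inequality, so pushing the failure probability below δ needs m = Ω((log 1/δ)/γ²); taking γ = Θ(ε) gives the theorem.

    import random

    def failure_prob(gamma, m, trials=20000):
        # Under D1 each label of x is 1 w.p. 1/2 + gamma; the majority rule
        # guesses D1 iff more than half the labels are 1. Estimate its error.
        fails = 0
        for _ in range(trials):
            ones = sum(random.random() < 0.5 + gamma for _ in range(m))
            if 2 * ones <= m:   # majority says 0 -> wrong under D1
                fails += 1
        return fails / trials

    for m in (10, 100, 1000):
        print(m, failure_prob(0.05, m))   # drops sharply once m >> 1/gamma^2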

  10. Epsilon net • Epsilon bad concepts • Bε(c) = { h | error(h,c) > ε } • A set of points S is an ε-net w.r.t. D if • for every h in Bε(c) • there exists a point x in S • such that h(x) ≠ c(x)
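A small checker for this definition over a finite weighted domain (the toy class, weights, and names are hypothetical):

    def error(h, c, points, weight):
        # error(h,c) under D, with D given by explicit point weights
        return sum(w for x, w in zip(points, weight) if h(x) != c(x))

    def is_eps_net(S, C, c, points, weight, eps):
        # S is an eps-net iff every eps-bad h disagrees with c somewhere on S
        bad = [h for h in C if error(h, c, points, weight) > eps]
        return all(any(h(x) != c(x) for x in S) for h in bad)

    # Toy instance: 4 equally weighted points, threshold class, target t = 2
    points, weight = [0, 1, 2, 3], [0.25] * 4
    C = [lambda x, t=t: int(x >= t) for t in range(5)]
    c = lambda x: int(x >= 2)
    print(is_eps_net([1, 2], C, c, points, weight, 0.25))   # True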

  11. Sample size • Event A: • The sample S1 is not an ε-net, |S1| = m. • Assume A holds • Let h be an ε-bad consistent hypothesis. • Sample an additional sample S2 • with probability at least 1/2 • h makes at least εm/2 errors on S2 • for m = |S2| = O(1/ε)
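The 1/2 is a Chernoff-style tail bound (a sketch; constants not tightened): the number of errors h makes on S2 dominates a Binomial(m, ε) variable, so

    \Pr\Big[\#\{\text{errors on } S_2\} < \tfrac{\varepsilon m}{2}\Big]
      \;\le\; e^{-\varepsilon m/8}
      \;\le\; \tfrac{1}{2}
      \quad\text{once } m \ge \tfrac{8\ln 2}{\varepsilon} = O(1/\varepsilon).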

  12. continues • Event B: • There exists h in Bε(c) • and h consistent with S1 • h has ≥ εm/2 errors on S2 • Pr[B | A] ≥ 1/2 • 2 Pr[B] ≥ Pr[A] (since Pr[B] ≥ Pr[B|A]·Pr[A]) • Let F be the projection of C to S1 ∪ S2 • F = ΠC(S1 ∪ S2)

  13. Error set • ER(h) = { x : x ∈ S1 ∪ S2 and c(x) ≠ h(x) } • |ER(h)| ≥ εm/2 • Event A: • ER(h) ∩ S1 = ∅ • Event B: • ER(h) ∩ S1 = ∅ • ER(h) ∩ S2 = ER(h)

  14. Combinatorial problem • 2m black and white balls • exactly l black balls • Consider a random partition into S1 and S2 • The probability that all the black balls land in S2
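This probability has a direct bound (a standard computation the slide leaves implicit). For a uniformly random split into two halves of size m:

    \Pr[\text{all } l \text{ black balls fall in } S_2]
      \;=\; \prod_{i=0}^{l-1} \frac{m-i}{2m-i}
      \;\le\; 2^{-l},

since each factor is at most 1/2.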

  15. Completing the proof • Probability of B • Pr[B] ≤ |F| 2^(-l) ≤ |F| 2^(-εm/2) • Probability of A • Pr[A] ≤ 2 Pr[B] ≤ 2|F| 2^(-εm/2) • Confidence δ ≥ Pr[A] • Sample • m = O( (1/ε) log 1/δ + (1/ε) log |F| ) • Need to bound |F| !!!
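Rearranging δ ≥ 2|F|·2^(-εm/2) for m (a one-line derivation) recovers the stated sample size:

    \delta \;\ge\; 2|F|\,2^{-\varepsilon m/2}
    \;\iff\;
    m \;\ge\; \frac{2}{\varepsilon}\Big(\log_2 |F| + \log_2 \frac{2}{\delta}\Big)
    \;=\; O\!\Big(\frac{1}{\varepsilon}\log\frac{1}{\delta} + \frac{1}{\varepsilon}\log |F|\Big).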

  16. Bounding |F| • Define: • J(m,d) = J(m-1,d) + J(m-1,d-1) • J(m,0) = 1 and J(0,d) = 1 • Solving the recursion: J(m,d) = Σi=0..d (m choose i) = O(m^d) • Claim (Sauer's Lemma): • Let VC-dim(C) = d and |S| = m, • then |ΠC(S)| ≤ J(m,d)
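A quick sketch checking the closed form against the recursion on small values (function names are mine):

    from functools import lru_cache
    from math import comb

    @lru_cache(maxsize=None)
    def J(m, d):
        # J(m,d) = J(m-1,d) + J(m-1,d-1), with J(m,0) = J(0,d) = 1
        if m == 0 or d == 0:
            return 1
        return J(m - 1, d) + J(m - 1, d - 1)

    def closed_form(m, d):
        # Sum_{i=0}^{d} C(m,i)
        return sum(comb(m, i) for i in range(d + 1))

    assert all(J(m, d) == closed_form(m, d) for m in range(20) for d in range(10))
    print(J(10, 3))   # 176 = 1 + 10 + 45 + 120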
