
CSL758 Instructors: Naveen Garg, Kavitha Telikepalli Scribe: Neha Dahiya March 7, 2008.


  1. CSL758 Instructors: Naveen Garg Kavitha Telikepalli Scribe: Neha Dahiya March 7, 2008. Winnowing Algorithm

  2. Concept Class • A sequence of instances, each having n binary attributes along with a result of (+) or (-), is presented to the algorithm to train it to predict the result. • The goal is to come up with an adaptive strategy. • It is assumed that a disjunction of r literals accurately describes whether a particular instance is in the required group or not. • For example: if x1, x2, x3, x4 and x5 are the attributes of instances, where xi = 1 if attribute ‘i’ is present, then n = 5. If x1 ∨ x2 ∨ x5 exactly determines whether the instance is in the required group or not, then r = 3.
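As a concrete illustration of the concept class, the sketch below labels instances by the slide's example disjunction x1 ∨ x2 ∨ x5 (the function name `label` and the 0-indexed list representation are our own, not from the lecture):

```python
# Target disjunction x1 ∨ x2 ∨ x5 over n = 5 binary attributes (r = 3).
# We represent an instance as a 0-indexed list, so x1, x2, x5 are x[0], x[1], x[4].

def label(x):
    """Return +1 if the instance satisfies the disjunction, else -1."""
    return +1 if (x[0] or x[1] or x[4]) else -1

label([1, 0, 0, 0, 0])   # x1 present            → +1
label([0, 0, 1, 1, 0])   # only x3, x4 present   → -1
```

The learner never sees this rule; it only sees instances together with the (+)/(-) labels the rule produces.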

  3. The Winnowing algorithm • Initialize weights w1 = 1, w2 = 1, …, wn = 1. • For any input instance: • If ∑wi*xi >= n, then declare the current example as (+). • Else, declare the current example as (-). • Now check the actual result. • If our result matches the actual result, make no change. • If we declared (+) and the actual result was (-), then halve the weights of those attributes which were present in the current example. • If we declared (-) and the actual result was (+), then double the weights of those attributes which were present in the current example.
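The steps above can be sketched as follows (a minimal illustration in our own notation: `examples` is a sequence of `(x, label)` pairs, where `x` is a list of n binary attributes and `label` is +1 or -1):

```python
def winnow_train(examples, n):
    """Run one pass of the winnowing algorithm over `examples`.

    Returns the final weight vector and the number of mistakes made.
    """
    w = [1.0] * n                                # w1 = w2 = ... = wn = 1
    mistakes = 0
    for x, actual in examples:
        score = sum(w[i] * x[i] for i in range(n))
        predicted = +1 if score >= n else -1     # threshold at n
        if predicted == actual:
            continue                             # correct prediction: no change
        mistakes += 1
        if predicted == +1:                      # declared (+), actual (-):
            for i in range(n):                   # halve weights of present attributes
                if x[i]:
                    w[i] /= 2
        else:                                    # declared (-), actual (+):
            for i in range(n):                   # double weights of present attributes
                if x[i]:
                    w[i] *= 2
    return w, mistakes
```

For instance, on a single (+) example with only x1 present, the score is 1 < n, so the algorithm predicts (-), makes a mistake, and doubles w1.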

  4. Upper Bound on # of mistakes • Now we try to find an upper bound on the total no. of mistakes that the winnowing algorithm can make. • The mistakes can be of two types: • Type 1 Mistake: We declare an example as (-) when it was actually (+). • Type 2 Mistake: We declare an example as (+) when it was actually (-).

  5. Bound on Type 1 Mistakes • On a Type 1 mistake, we double the weights of the attributes present in the current example. • None of the relevant attributes (attributes present in the disjunction) ever gets its weight reduced. • This is because we reduce the weight of an attribute only when an example containing it causes a type 2 mistake, and an instance containing a relevant attribute can never cause a type 2 mistake, since such an instance is always a (+) example. • The weight of a relevant attribute is doubled only while it is below n: once it reaches n, no instance containing that attribute will be declared (-), as ∑wi*xi >= n will always be satisfied (weights are always positive).

  6. Bound on Type 1 Mistakes • So, for any given relevant attribute, the weight will not be doubled more than log2 n times. • There are r such attributes, and every type 1 mistake doubles the weight of at least one relevant attribute (the example was (+), so it must contain some relevant attribute). • So, we can't make more than r*log2 n mistakes of type 1.

  7. Bound on Type 2 Mistakes • Let the no. of type 2 mistakes be C. • Let's do an amortized analysis of the function W = ∑wi. • Its initial value = n. • On a type 1 mistake, W increases by at most n. (∑wi*xi was less than n before that instance, so doubling the weights of the attributes present in the current instance increases W by at most n.) • On a type 2 mistake, W decreases by at least n/2. (∑wi*xi >= n for that instance, so halving the weights of the attributes present decreases W by at least n/2.)

  8. Bound on Type 2 Mistakes • At any point in time, W is positive. So the value subtracted from W must be less than the initial value plus the value added to W. • Initial value = n. • Value added to W <= n * (no. of type 1 mistakes) <= n*r*log2 n. • Value subtracted from W >= (n/2) * (no. of type 2 mistakes) = C*n/2. • So C*n/2 <= n*r*log2 n + n, • which gives C <= 2*r*log2 n + 2. • So the upper bound on type 2 mistakes is 2*r*log2 n + 2.

  9. Upper bound on # of mistakes • Total no. of mistakes = No. of Type 1 mistakes + No. of Type 2 mistakes <= r*log2 n + 2*r*log2 n + 2 = 3*r*log2 n + 2. So the total no. of mistakes made by the winnowing algorithm is O(r log n).
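The bound can be sanity-checked empirically. The sketch below (our own code, not part of the lecture) runs one pass of the algorithm over every instance labeled by a small disjunction and compares the mistake count against 3*r*log2 n + 2:

```python
from math import log2

def winnow_mistakes(n, relevant):
    """One pass of winnowing over all 2^n instances, labeled (+) iff the
    instance contains some attribute in `relevant` (0-indexed indices).
    Returns the total number of mistakes made."""
    w = [1.0] * n
    mistakes = 0
    for v in range(2 ** n):
        x = [(v >> i) & 1 for i in range(n)]          # the v-th instance
        actual = +1 if any(x[i] for i in relevant) else -1
        predicted = +1 if sum(w[i] * x[i] for i in range(n)) >= n else -1
        if predicted != actual:
            mistakes += 1
            factor = 2.0 if actual == +1 else 0.5     # double on type 1, halve on type 2
            for i in range(n):
                if x[i]:
                    w[i] *= factor
    return mistakes

n, relevant = 8, [0, 3]                               # r = 2 relevant attributes
bound = 3 * len(relevant) * log2(n) + 2               # 3*r*log2 n + 2 = 20
print(winnow_mistakes(n, relevant), "<=", bound)
```

By the argument above, the mistake count must stay at or below the bound no matter in what order the instances are presented.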
