
Chapter 8 – Naïve Bayes



  1. Chapter 8 – Naïve Bayes DM for Business Intelligence

  2. Naïve Bayes Characteristics Theorem named after Reverend Thomas Bayes (1702 – 1761) Data-driven, not model-driven (i.e., lets the data “create” the training model) Is Supervised Learning (i.e., has a defined Output Variable) Is naïve because it assumes that each of its inputs is independent of the others (an assumption that rarely holds true in real life) Example: an illness may be diagnosed as Flu if a patient has a virus, a fever, and congestion (even though the virus may actually be causing the other two symptoms). Also naïve because it ignores the existence of other symptoms.

  3. Naïve Bayes: The Basic Idea For a given new record to be classified, find other records like it (i.e., same values for input variables) What is the prevalent class (i.e., Output Variable value) among those records? Assign that class to your new record

  4. Usage • Requires Categorical input variables • Numerical input variables must be binned and converted to Categorical variables (see the binning sketch below) • Can be used with very large data sets • “Exact” Bayes requires records in your dataset that exactly match the record you are trying to classify • Bayes can be “modified” to estimate class probabilities even when no exactly matching record exists in your dataset
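
A minimal sketch of the binning step (pandas and the cutoff of 4 here are my own assumptions; the slides don't prescribe a tool):

```python
import pandas as pd

# Hypothetical firm revenues in $M; the cutoff of 4 is an arbitrary choice.
revenue = pd.Series([1.2, 5.8, 3.3, 9.1])

# Convert the numerical variable into the categorical "Size" predictor.
size = pd.cut(revenue, bins=[0, 4, float("inf")], labels=["small", "large"])
print(size.tolist())  # ['small', 'large', 'small', 'large']
```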

  5. Exact Bayes Classifier Relies on finding other records that share same input variable values as record-to-be-classified. Want to find “probability of belonging to class C, given specified values of predictors/input variables” Even with large data sets, may be hard to find other records that exactly match your record, in terms of predictor values.

  6. Solution – Modified Naïve Bayes • Assume independence of predictor variables (within each class) • Use multiplication rule • Find same probability that record belongs to class C, given predictor values, without limiting calculation to records that share all those same values

  7. Modified Bayes Calculations • Take a record, and note its predictor values • Find the probabilities of those predictor values occurring across all records in C1 • Multiply them together, then by the proportion of records belonging to C1 • Do the same for C2, C3, etc. • Prob. of belonging to C1 is the value from step (3) divided by the sum of all such values across C1 … Cn • Establish & adjust a “cutoff” prob. for the class of interest (a sketch of these steps follows below)
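
A minimal sketch of those five steps (my own illustration, not code from the book; records are (predictor-dict, class-label) pairs):

```python
from collections import Counter

def naive_bayes_probs(records, new_record):
    """records: list of (predictors, label) pairs; returns P(class | new_record)."""
    class_counts = Counter(label for _, label in records)
    scores = {}
    for c, n_c in class_counts.items():
        score = n_c / len(records)  # proportion of records belonging to class c
        for var, value in new_record.items():
            # probability that this predictor value occurs among class-c records
            matches = sum(1 for preds, label in records
                          if label == c and preds[var] == value)
            score *= matches / n_c
        scores[c] = score
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}  # step 5: normalize
```

Applied to the fraud example on the following slides, this reproduces the 0.53 computed on slide 13.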

  8. Example: Financial Fraud Target variable: Audit finds “Fraud,” or “No fraud” Predictors: Prior pending legal charges (yes/no) Size of firm (small/large) Goal: Classify a firm as being Fraudulent or Truthful, if Size = “Small” and Prior Charges = “Yes”

  9. Prior [The original slide shows the training table of ten firms, with columns for Prior Legal Charges (y/n), Size (small/large), and audit outcome (Fraudulent/Truthful); the table itself is not preserved in this transcript.]
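
The counts quoted on the surrounding slides pin down one consistent dataset; a hypothetical reconstruction (my assumption, not necessarily the book's exact table):

```python
# 4 frauds (3 with prior charges, 1 small) and 6 truthfuls (1 with prior
# charges, 4 small); exactly two small firms with prior charges, one per class.
firms = [
    ({"charges": "y", "size": "small"}, "fraud"),
    ({"charges": "y", "size": "large"}, "fraud"),
    ({"charges": "y", "size": "large"}, "fraud"),
    ({"charges": "n", "size": "large"}, "fraud"),
    ({"charges": "y", "size": "small"}, "truthful"),
    ({"charges": "n", "size": "small"}, "truthful"),
    ({"charges": "n", "size": "small"}, "truthful"),
    ({"charges": "n", "size": "small"}, "truthful"),
    ({"charges": "n", "size": "large"}, "truthful"),
    ({"charges": "n", "size": "large"}, "truthful"),
]
```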

  10. Exact Bayes Calculations Goal: classify (as “fraudulent” or as “truthful”) a small firm with prior charges filed. There are 2 firms like that, one fraudulent and the other truthful. There is no “majority” class, so . . . P(fraud | prior charges=y, size=small) = ½ = 0.50 Note: the calculation is limited to the two firms matching those exact characteristics
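
Under that reconstruction, the exact-Bayes calculation is just a filter and a class proportion:

```python
# Exact Bayes: restrict to records matching the new firm, take the class proportion.
matching = [label for preds, label in firms
            if preds == {"charges": "y", "size": "small"}]
print(len(matching))                            # 2 matching firms
print(matching.count("fraud") / len(matching))  # 0.5
```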

  11. Modified Naïve Bayes Calculations Page 172 formula:

$$P(C_i \mid x_1, \ldots, x_p) = \frac{P(x_1, \ldots, x_p \mid C_i)\,P(C_i)}{P(x_1, \ldots, x_p \mid C_1)\,P(C_1) + \cdots + P(x_1, \ldots, x_p \mid C_m)\,P(C_m)}$$

The numerator multiplies the proportion of each Input Variable within Frauds by the proportion of Frauds (Output) within Total Records; the denominator adds the corresponding product for Truthfuls. In words: P(Fraud | Prior Charges & Small Firm) = proportion of “Prior Charges,” “Small,” & “Fraud” to Total Records, divided by {same} + proportion of “Prior Charges,” “Small,” & “Truthful” to Total Records.
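
The “naïve” simplification that slide 13 applies (standard naïve Bayes, stated explicitly here) replaces the joint conditional probability in that formula with a product of single-variable proportions within each class:

```latex
P(x_1, \ldots, x_p \mid C_i) \approx P(x_1 \mid C_i) \times P(x_2 \mid C_i) \times \cdots \times P(x_p \mid C_i)
```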

  12. Prior [The training table again, now annotated with the class totals:] 4 Fraudulent Firms and 6 Truthful Firms (out of 10 total records)

  13. Modified Naïve Bayes Calculations Same goal as before. Compute 2 quantities: Proportion of “prior charges = y” among frauds, times proportion of “small” among frauds, times proportion of frauds = 3/4 * 1/4 * 4/10 = 0.075 Proportion of “prior charges = y” among truthfuls, times proportion of “small” among truthfuls, times proportion of truthfuls = 1/6 * 4/6 * 6/10 = 0.067 P(fraud | prior charges, small) = 0.075 / (0.075 + 0.067) = 0.53
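
The same arithmetic, checked with exact fractions (and, under the reconstructed dataset, by the slide-7 sketch):

```python
from fractions import Fraction as F

p_fraud    = F(3, 4) * F(1, 4) * F(4, 10)  # = 3/40  = 0.075
p_truthful = F(1, 6) * F(4, 6) * F(6, 10)  # = 4/60 ~= 0.067
print(float(p_fraud / (p_fraud + p_truthful)))  # 0.5294... -> 0.53

# Equivalently, with the sketch from slide 7 and the reconstructed firms list:
# naive_bayes_probs(firms, {"charges": "y", "size": "small"})
# -> {'fraud': 0.529..., 'truthful': 0.470...}
```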

  14. Modified Naïve Bayes, cont. • Note that the probability estimate (0.53) does not differ greatly from the exact result of 0.50 • But ALL records are used in the calculation, not just those matching the predictor values • This makes the calculation practical in most circumstances • Relies on the assumption of independence between predictor variables within each class

  15. Independence Assumption • Not strictly justified (variables often correlated with one another) • Often “good enough”

  16. Advantages • Handles purely categorical data well • Works well with very large data sets • Simple & computationally efficient

  17. Shortcomings • Requires a large number of records • Problematic when a predictor category is not present in the training data: assigns 0 probability to the response, ignoring the information in the other variables (illustrated below)
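
A hypothetical illustration of that failure mode: if no training fraud happened to be a small firm, the product zeroes out no matter how strong the other predictors are (the numbers below are invented for the illustration):

```python
p_charges_given_fraud = 3 / 4   # strong signal for fraud
p_small_given_fraud   = 0 / 4   # "small" never appears among the training frauds
p_fraud_prior         = 4 / 10

score = p_fraud_prior * p_charges_given_fraud * p_small_given_fraud
print(score)  # 0.0 -- fraud is ruled out, ignoring the "prior charges" signal
```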

  18. On the other hand… • Probability rankings are more accurate than the exact probability estimates: good for applications using lift (e.g., response to a mailing), less so for applications requiring exact probabilities (e.g., credit scoring)

  19. Summary • No statistical models involved • Naïve Bayes (like k-NN) pays attention to complex interactions and local structure • Computational challenges remain
