
Improve Naïve Bayesian Classifier by Discriminative Training

Kaizhu Huang, Zhangbing Zhou,

Irwin King, Michael R. Lyu

Oct. 2005

ICONIP 2005

Outline
  • Background
    • Classifiers
      • Discriminative classifiers: Support Vector Machines
      • Generative classifiers: Naïve Bayesian Classifiers
  • Motivation
  • Discriminative Naïve Bayesian Classifier
  • Experiments
  • Discussions
  • Conclusion


[Figure: SVM]

Background
  • Discriminative Classifiers
    • Directly maximize a discriminative function or posterior function
    • Example: Support Vector Machines

Background
  • Generative Classifiers
    • Model the joint distribution of the features for each class, P(x|C), and then use Bayes' rule to construct the posterior classifier P(C|x), where C is the class label and x is the feature vector.
    • Example: Naïve Bayesian Classifiers
      • Model the distribution for each class under the assumption that each feature of the data is independent of the other features, given the class label.

By Bayes' rule, P(C|x) = P(x|C) P(C) / P(x), where P(x) is constant w.r.t. C, so P(C|x) ∝ P(x|C) P(C).

Combining the independence assumption, P(x|C) = ∏_i P(x_i|C), gives the Naïve Bayesian classifier P(C|x) ∝ P(C) ∏_i P(x_i|C).
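
To make the factorized posterior concrete, here is a minimal sketch (not the authors' code; the array names and data layout are illustrative assumptions) of scoring a discrete Naïve Bayes model:

    import numpy as np

    def nb_log_posterior(x, class_priors, cond_probs):
        # x            : integer-coded feature vector, one value per feature
        # class_priors : array of P(C) for each class
        # cond_probs   : one [n_features, n_values] table of P(x_i = v | C) per class
        scores = np.log(class_priors).copy()
        for c in range(len(class_priors)):
            for i, v in enumerate(x):
                scores[c] += np.log(cond_probs[c][i, v])
        return scores  # argmax over the classes gives the NB prediction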

Background
  • Comparison

Example of missing information:

[Figure] From left to right: the original digit, the 50%-missing digit, the 75%-missing digit, and the occluded digit.

Background
  • Why are generative classifiers not as accurate as discriminative classifiers?

[Figure: Scheme for generative classifiers in two-category classification tasks. The training set is split into subset D1, labeled as Class 1, and subset D2, labeled as Class 2; a distribution P1 is estimated to approximate D1 and a distribution P2 to approximate D2, independently of each other; Bayes' rule is then constructed for classification. The interaction between the two classes is needed here but is not modeled.]

It is incomplete for generative classifiers to approximate only the within-class information: the discriminative information between classes is discarded.

Background
  • Why are generative classifiers superior to discriminative classifiers in handling missing information problems?
    • SVM lacks the ability to reason under such uncertainty.
    • NB can conduct uncertainty inference under the estimated distribution.

Let A be the full feature set, T the subset of A that is missing, and A-T thus the known features. Under the independence assumption, NB marginalizes out the missing features, so P(C | {x_i : i in A-T}) ∝ P(C) ∏_{i in A-T} P(x_i | C); the factors of the missing features sum to one and simply drop out.
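
As an illustration of this marginalization (a sketch under the same assumed data layout as the earlier snippet, not the paper's implementation), missing features are simply skipped because their factors sum to one:

    import numpy as np

    def nb_log_posterior_missing(x, missing, class_priors, cond_probs):
        # missing : set of feature indices in T, the unobserved subset of A
        scores = np.log(class_priors).copy()
        for c in range(len(class_priors)):
            for i, v in enumerate(x):
                if i in missing:
                    continue  # sum_v P(x_i = v | C) = 1, so the factor drops out
                scores[c] += np.log(cond_probs[c][i, v])
        return scores  # posterior scores computed from the known features A-T only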


Motivation

  • It seems that a good classifier should combine the strategies of discriminative classifiers and generative classifiers.
  • Our work trains one of the generative classifiers, the Naïve Bayesian Classifier, in a discriminative way.


Discriminative Naïve Bayesian Classifier

[Figure: Working scheme of the Naïve Bayesian Classifier. The training set is split into sub-set D1, labeled as Class 1, and sub-set D2, labeled as Class 2; the distribution P1 is estimated to approximate D1 and the distribution P2 to approximate D2; Bayes' rule is then used for classification. Interaction between the two estimates is needed!]

[Equations: Mathematical explanation of the Naïve Bayesian Classifier. Its training amounts to estimating each class-conditional distribution under the normalization constraints, which is easily solved by the Lagrange multiplier method.]

Discriminative Naïve Bayesian Classifier (DNB)
  • Optimization function of DNB

[Equation: the optimization function of DNB, a data-approximation term combined with a divergence term between the two class distributions.]

  • On one hand, minimizing this function tries to approximate the dataset as accurately as possible.
  • On the other hand, the optimization also tries to enlarge the divergence between the classes (see the sketch after this list).
  • Optimizing the joint distribution directly lets DNB inherit the ability of NB to handle missing information problems.
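
The exact objective is given in the paper; the snippet below is only a rough sketch of the idea under assumed notation (empirical class distributions D1_hat and D2_hat, model distributions P1 and P2, and a hypothetical trade-off weight lam): fit each class while enlarging the divergence between the two class models.

    import numpy as np

    def dnb_objective(P1, P2, D1_hat, D2_hat, lam=1.0, eps=1e-12):
        # Data-approximation terms: KL(empirical || model) for each class.
        fit = (np.sum(D1_hat * np.log((D1_hat + eps) / (P1 + eps))) +
               np.sum(D2_hat * np.log((D2_hat + eps) / (P2 + eps))))
        # Divergence term between the two class models, to be enlarged.
        div = np.sum(P1 * np.log((P1 + eps) / (P2 + eps)))
        return fit - lam * div  # minimizing favors good fit AND large divergence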

Discriminative Naïve Bayesian Classifier (DNB)
  • Complete Optimization problem

Nonlinear optimization problem under linear constraints (the normalization constraints on the estimated distributions).

Discriminative Naïve Bayesian Classifier (DNB)
  • Solving the optimization problem
    • Use the Rosen gradient-projection method (a simplified sketch follows below).
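
The paper derives the exact gradient and projection matrix for Rosen's method; the sketch below is a simplified stand-in (names and step size are illustrative) that takes a plain gradient step and then projects the parameters back onto the probability simplex, so the linear normalization constraints stay satisfied.

    import numpy as np

    def project_to_simplex(v):
        # Euclidean projection of v onto {p : p >= 0, sum(p) = 1}.
        u = np.sort(v)[::-1]                 # sort in descending order
        css = np.cumsum(u)
        j = np.arange(1, len(v) + 1)
        rho = np.nonzero(u - (css - 1.0) / j > 0)[0][-1]
        theta = (css[rho] - 1.0) / (rho + 1.0)
        return np.maximum(v - theta, 0.0)

    def projected_gradient_step(P, grad, step=0.01):
        # One simplified projected-gradient update on a distribution P.
        return project_to_simplex(P - step * grad)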

Discriminative Naïve Bayesian Classifier (DNB)

[Equations: the gradient of the objective function and the projection matrix used by the Rosen gradient-projection method.]

Experimental results
  • Experimental Setup
    • Datasets
      • 4 benchmark datasets from the UCI machine learning repository
    • Experimental Environments
      • Platform: Windows 2000
      • Development tool: Matlab 6.5

Without information missing
  • Observations
    • DNB outperforms NB on every dataset.
    • Compared with SVM, DNB wins on 2 datasets and loses on the other 2.
    • SVM outperforms DNB on Segment and Satimage.

With information missing
  • Scheme
    • DNB uses the marginalized distribution over the known features, Eq. (5), to conduct inference when there is information missing.
    • SVM sets the missing features to 0 (the default way to process unknown features in LIBSVM).

With information missing

Setup: randomly discard features, gradually increasing from a small percentage to a large percentage (a sketch of this masking step follows below).
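
A sketch of how such a masking step could be implemented (the function name, seed, and percentages are illustrative, not from the paper):

    import numpy as np

    def random_missing_mask(X, frac, seed=0):
        # Mark a fraction `frac` of the entries of X as missing (True = missing).
        rng = np.random.default_rng(seed)
        return rng.random(X.shape) < frac

    # Example: evaluate under increasing missing rates.
    # for frac in (0.1, 0.2, 0.3, 0.4, 0.5):
    #     mask = random_missing_mask(X_test, frac)
    #     ... run NB / DNB inference using only the unmasked features ...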

[Figure: Error rate on Iris with missing information]

[Figure: Error rate on Vote with missing information]

With information missing

[Figure: Error rate on Satimage with missing information]

[Figure: Error rate on DNA with missing information]

Summary of Experiment Results
  • Observations
    • NB demonstrates a robust ability in handling missing information problems.
    • DNB inherits the ability of NB to handle missing information problems while achieving higher classification accuracy than NB.
    • SVM cannot deal with missing information problems easily.

Discussion
  • Can DNB be extended to general Bayesian Network (BN) Classifiers?
    • A structure-learning problem will be involved; direct application of DNB will encounter difficulties since the structure is not fixed in restricted BNs.
    • Finding optimal General Bayesian Network Classifiers is an NP-complete problem.
  • Discriminative training of constrained Bayesian Network Classifiers is possible…

Conclusion
  • We develop a novel model, the Discriminative Naïve Bayesian Classifier.
    • It outperforms the Naïve Bayesian Classifier when no information is missing.
    • It outperforms SVMs in handling missing information problems.
