THE HONG KONG UNIVERSITY OF SCIENCE & TECHNOLOGY
CSIT 600N: Reasoning and Decision under Uncertainty, Summer 2010

Nevin L. Zhang
Room 3504, phone: 2358-7015
Email: lzhang@cs.ust.hk
Home page

L09: Probabilistic Models (PMs) for Classification and Clustering

PMs for Classification

PMs for Clustering: Continuous data

PMs for Clustering: Discrete data

Classification
The problem:

Given data:

Find mapping

(A1, A2, …, An) ↦ C

Possible solutions:

ANN

Decision tree (Quinlan)

(SVM: Continuous data)

Bayesian Networks for Classification
  • The Naïve Bayes model often performs well in practice (a minimal sketch follows this list)
  • Drawbacks of Naïve Bayes:
    • Assumes attributes are mutually independent given the class variable
    • This assumption is often violated, leading to double counting of evidence.
  • Fixes:
    • General BN classifiers
    • Tree augmented Naïve Bayes (TAN) models
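To make the Naïve Bayes model concrete, here is a minimal count-based sketch for discrete attributes. The data layout and function names are illustrative assumptions, not from the lecture; Laplace smoothing is added so unseen attribute values do not zero out a class:

```python
import math
from collections import Counter, defaultdict

# Train by counting. data: list of (attribute_tuple, class_label) pairs.
def train_nb(data):
    class_counts = Counter(c for _, c in data)
    attr_counts = Counter()            # (attr_index, value, class) -> count
    attr_values = defaultdict(set)     # attr_index -> set of observed values
    for attrs, c in data:
        for i, a in enumerate(attrs):
            attr_counts[(i, a, c)] += 1
            attr_values[i].add(a)
    return class_counts, attr_counts, attr_values

# Classify by arg max_c log P(C=c) + sum_i log P(A_i = a_i | C = c).
def classify_nb(model, attrs, alpha=1.0):
    class_counts, attr_counts, attr_values = model
    n = sum(class_counts.values())
    scores = {}
    for c, nc in class_counts.items():
        score = math.log(nc / n)
        for i, a in enumerate(attrs):
            k = len(attr_values[i])    # number of values of attribute i
            score += math.log((attr_counts[(i, a, c)] + alpha) / (nc + alpha * k))
        scores[c] = score
    return max(scores, key=scores.get)
```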
Bayesian Networks for Classification
  • General BN classifier
    • Treat class variable just as another variable
    • Learn a BN.
    • Classify the next instance based on values of variables in the Markov blanket of the class variable.
    • Often performs poorly: only the variables in the Markov blanket of the class variable influence classification, so information in the remaining attributes goes unused (see the sketch below)
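The Markov blanket of the class variable consists of its parents, its children, and its children's other parents. A minimal sketch of extracting it, assuming the DAG is given as a dict mapping each node to its parent list (the representation and the example network are assumptions, not from the lecture):

```python
# Markov blanket of node x in a DAG given as {node: list_of_parents}.
def markov_blanket(parents, x):
    children = [v for v, ps in parents.items() if x in ps]
    co_parents = {p for c in children for p in parents[c] if p != x}
    return set(parents[x]) | set(children) | co_parents

# Hypothetical example: class C with parent A, child B, and B's other parent D.
dag = {"A": [], "C": ["A"], "D": [], "B": ["C", "D"]}
print(markov_blanket(dag, "C"))  # {'A', 'B', 'D'}
```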
Bayesian Networks for Classification
  • Tree-Augmented Naïve Bayes (TAN) model
    • Capture dependence among attributes using a tree structure.
    • During learning,
      • First learn a tree among the attributes using the Chow-Liu algorithm (sketched after this list)
        • Special structure learning problem, easy
      • Add class variable and estimate parameters
    • Classification
      • arg max_c P(C=c|A1=a1, …, An=an)
      • BN inference
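A sketch of the tree-learning step. The Chow-Liu construction reduces to a maximum-weight spanning tree; for TAN the edge weights are the conditional mutual informations I(Ai; Aj | C), assumed precomputed here (how they are estimated from data is omitted):

```python
# Maximum-weight spanning tree over attributes 0..n_attrs-1 (Kruskal).
# weights: dict mapping attribute pairs (i, j) to I(A_i; A_j | C).
def chow_liu_skeleton(n_attrs, weights):
    parent = list(range(n_attrs))        # union-find for cycle detection
    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u
    tree = []
    # Greedily keep the heaviest edges that join two different components.
    for (i, j), w in sorted(weights.items(), key=lambda kv: -kv[1]):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j))
    return tree
```

The result has n_attrs - 1 undirected edges; rooting it at any attribute directs the tree, and the class variable is then added as a parent of every attribute.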
Outline

PMs for Classification

PMs for Clustering: Continuous data

  • Gaussian distributions
  • Parameter estimation for Gaussian distributions
  • Gaussian mixtures
  • Learning Gaussian mixtures

PMs for Clustering: Discrete data


http://www-stat.stanford.edu/~naras/jsm/NormalDensity/NormalDensity.html

  • Real-world example of Normal Distributions?
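For reference alongside the applet, the univariate normal density N(x; μ, σ²) can be evaluated directly (a minimal sketch):

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    # N(x; mu, sigma^2) = exp(-(x - mu)^2 / (2 sigma^2)) / (sigma * sqrt(2 pi))
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

print(normal_pdf(0.0))  # peak of the standard normal, about 0.3989
```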
Outline

PMs for Classification

PMs for Clustering: Continuous data

  • Gaussian distributions
  • Parameter estimation for Gaussian distributions
  • Gaussian mixtures
  • Learning Gaussian mixtures

PMs for Clustering: Discrete data

Example

Data:


Mean vector

Covariance Matrix
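A minimal sketch of how the mean vector and covariance matrix are estimated by maximum likelihood; the data matrix here is made up for illustration (the slide's own numbers are not reproduced):

```python
import numpy as np

X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 3.0]])               # rows are data points (made-up numbers)

mu = X.mean(axis=0)                      # mean vector: average of the rows
centered = X - mu
Sigma = centered.T @ centered / len(X)   # MLE covariance (divide by N, not N-1)

print(mu)     # [2. 2.]
print(Sigma)  # [[2/3, 1/3], [1/3, 2/3]]
```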

Outline

PMs for Classification

PMs for Clustering: Continuous data

  • Gaussian distributions
  • Parameter estimation for Gaussian distributions
  • Gaussian mixtures
  • Learning Gaussian mixtures

PMs for Clustering: Discrete data

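To make the outline items "Gaussian mixtures" and "Learning Gaussian mixtures" concrete, here is a minimal EM sketch for a univariate Gaussian mixture. The univariate restriction, the initialization, and the variance floor are illustrative assumptions, not the lecture's exact formulation:

```python
import math, random

def em_gmm_1d(xs, k, iters=50):
    n = len(xs)
    mean = sum(xs) / n
    pi = [1.0 / k] * k                   # mixing weights P(component j)
    mu = random.sample(xs, k)            # means initialized at data points
    var = [sum((x - mean) ** 2 for x in xs) / n] * k
    for _ in range(iters):
        # E-step: responsibilities r[i][j] = P(component j | x_i)
        r = []
        for x in xs:
            ps = [pi[j] / math.sqrt(2 * math.pi * var[j])
                  * math.exp(-(x - mu[j]) ** 2 / (2 * var[j])) for j in range(k)]
            s = sum(ps)
            r.append([p / s for p in ps])
        # M-step: re-estimate weights, means, and variances
        for j in range(k):
            nj = sum(r[i][j] for i in range(n))
            pi[j] = nj / n
            mu[j] = sum(r[i][j] * xs[i] for i in range(n)) / nj
            var[j] = sum(r[i][j] * (xs[i] - mu[j]) ** 2 for i in range(n)) / nj + 1e-6
    return pi, mu, var
```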
L09: Probabilistic Models (PMs) for Classification and Clustering

PMs for Classification

PMs for Clustering: Continuous data

PMs for Clustering: Discrete data

  • A generalization
Latent Tree Models
  • LC (latent class) models
    • Make the local independence assumption: attributes are mutually independent given the latent class variable
    • This assumption is often violated in practice
  • LT (latent tree) models generalize LC models
    • Relax the independence assumption
    • Each latent variable gives a way to partition the data: multidimensional clustering (an EM sketch for the basic LC model follows this list)
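A minimal EM sketch for the basic LC model over discrete data, where P(A1, …, An) = Σ_c P(C=c) Π_i P(Ai | C=c). The data layout, the random initialization, and the smoothing are illustrative assumptions:

```python
import math, random

def em_lca(data, k, n_vals, iters=100):
    # data: list of tuples of discrete values in {0, ..., n_vals - 1}
    n, m = len(data), len(data[0])
    pi = [1.0 / k] * k                   # P(C = c)
    # theta[c][i][v] = P(A_i = v | C = c), randomly initialized then normalized
    theta = [[[random.random() + 0.5 for _ in range(n_vals)]
              for _ in range(m)] for _ in range(k)]
    for c in range(k):
        for i in range(m):
            s = sum(theta[c][i])
            theta[c][i] = [t / s for t in theta[c][i]]
    for _ in range(iters):
        # E-step: posterior over the latent class for each instance
        post = []
        for row in data:
            ps = [pi[c] * math.prod(theta[c][i][v] for i, v in enumerate(row))
                  for c in range(k)]
            s = sum(ps)
            post.append([p / s for p in ps])
        # M-step: re-estimate pi and theta (with Laplace smoothing)
        for c in range(k):
            nc = sum(post[t][c] for t in range(n))
            pi[c] = nc / n
            for i in range(m):
                for v in range(n_vals):
                    num = sum(post[t][c] for t in range(n) if data[t][i] == v)
                    theta[c][i][v] = (num + 1.0) / (nc + n_vals)
    return pi, theta
```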
ICAC Data

// 31 variables, 1200 samples

C_City: s0 s1 s2 s3 // very common, quite common, uncommon, ..

C_Gov: s0 s1 s2 s3

C_Bus: s0 s1 s2 s3

Tolerance_C_Gov: s0 s1 s2 s3 //totally intolerable, intolerable, tolerable,...

Tolerance_C_Bus: s0 s1 s2 s3

WillingReport_C: s0 s1 s2 // yes, no, depends

LeaveContactInfo: s0 s1 // yes, no

I_EncourageReport:s0 s1 s2 s3 s4 // very sufficient, sufficient, average, ...

I_Effectiveness: s0 s1 s2 s3 s4 //very e, e, a, in-e, very in-e

I_Deterrence: s0 s1 s2 s3 s4 // very sufficient, sufficient, average, ...

…..

-1 -1 -1 0 0 -1 -1 -1 -1 -1 -1 0 -1 -1 -1 0 1 1 -1 -1 2 0 2 2 1 3 1 1 4 1 0 1.0

-1 -1 -1 0 0 -1 -1 1 1 -1 -1 0 0 -1 1 -1 1 3 2 2 0 0 0 2 1 2 0 0 2 1 0 1.0

-1 -1 -1 0 0 -1 -1 2 1 2 0 0 0 2 -1 -1 1 1 1 0 2 0 1 2 -1 2 0 1 2 1 0 1.0

….

Latent Structure Discovery

Y2: Demographic info; Y3: Tolerance toward corruption

Y4: ICAC performance; Y7: ICAC accountability

Y5: Change in level of corruption; Y6: Level of corruption

Interpreting Partition
  • Information curves (see the sketch after this list):
    • The partition given by Y2 is based on Income, Age, Education, and Sex
    • Interpretation: Y2 represents a partition of the population based on demographic information
    • Y3 represents a partition based on tolerance toward corruption
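Information curves are built from the mutual information between a latent variable and each attribute; the attributes with the highest I(Y; A) are the ones the partition is based on. A minimal sketch of the empirical quantity, assuming hard cluster assignments (the lecture's exact computation may differ):

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    # Empirical I(X; Y) in bits between two equal-length discrete sequences.
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())
```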
Interpreting Clusters

Y2=s0: Low income youngsters; Y2=s1: Women with no/low income

Y2=s2: people with good education and good income;

Y2=s3: people with poor education and average income

Interpreting Clustering

Y3=s0: people who find corruption totally intolerable; 57%

Y3=s1: people who find corruption intolerable; 27%

Y3=s2: people who find corruption tolerable; 15%

Interesting finding:

Y3=s2: 29+19=48% find C-Gov totally intolerable or intolerable; 5% for C-Bus

Y3=s1: 54% find C-Gov totally intolerable; 2% for C-Bus

Y3=s0: Same attitude toward C-Gov and C-Bus

People who are tough on corruption are equally tough toward C-Gov and C-Bus.

People who are relaxed about corruption are more relaxed toward C-Bus than C-Gov.

Relationship Between Dimensions

Interesting finding: relationship between demographic background and tolerance toward corruption

Y2=s2 (good education and good income): the least tolerant; 4% find corruption tolerable

Y2=s3 (poor education and average income): the most tolerant; 32% find corruption tolerable

The other two classes are in between.

Result of LCA
  • The partition found is not meaningful
  • Reason:
    • The local independence assumption does not hold
  • Another way to look at it:
    • LCA assumes that all the manifest variables jointly define a meaningful way to cluster the data
    • This is obviously not true for the ICAC data
    • Instead, one should look for subsets of variables that do define meaningful partitions and perform cluster analysis on them
    • This is what we do with LTA (latent tree analysis)