Bayesian classifiers

Presentation Transcript
Bayesian Classification: Why?
  • Probabilistic learning: calculates explicit probabilities for hypotheses; among the most practical approaches to certain types of learning problems
  • Incremental: Each training example can incrementally increase/decrease the probability that a hypothesis is correct. Prior knowledge can be combined with observed data.
  • Probabilistic prediction: Predict multiple hypotheses, weighted by their probabilities
  • Standard: Even when Bayesian methods are computationally intractable, they can provide a standard of optimal decision making against which other methods can be measured
Bayes Theorem
  • Given training data D, the posterior probability of a hypothesis h, P(h|D), follows Bayes theorem: P(h|D) = P(D|h)·P(h) / P(D)
  • MAP (maximum a posteriori) hypothesis: h_MAP = argmax over h in H of P(h|D) = argmax over h in H of P(D|h)·P(h) (see the sketch below)
  • Practical difficulty: requires initial knowledge of many probabilities and carries significant computational cost
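
To make the MAP rule concrete, here is a minimal Python sketch; the priors and likelihoods are made-up illustrative numbers, not values from the slides:

    # Bayes theorem and the MAP rule with two candidate hypotheses.
    # All numbers below are hypothetical.
    priors = {"h1": 0.6, "h2": 0.4}        # P(h)
    likelihoods = {"h1": 0.2, "h2": 0.5}   # P(D|h) for the observed data D

    # Unnormalized posteriors P(D|h) * P(h); P(D) is a shared constant,
    # so it does not affect the argmax.
    scores = {h: likelihoods[h] * priors[h] for h in priors}
    h_map = max(scores, key=scores.get)    # argmax_h P(D|h) * P(h)

    p_d = sum(scores.values())             # P(D), assuming h1/h2 exhaustive and exclusive
    posteriors = {h: s / p_d for h, s in scores.items()}  # P(h|D)
    print(h_map, posteriors)               # h2 {'h1': 0.375, 'h2': 0.625}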
Naïve Bayesian Classification
  • If the i-th attribute is categorical: P(di|C) is estimated as the relative frequency of samples having value di for the i-th attribute in class C
  • If the i-th attribute is continuous: P(di|C) is estimated through a Gaussian density function
  • Computationally easy in both cases (a minimal sketch follows)
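
A minimal Python sketch of both estimators (the function names are mine, not the slides'):

    import math
    from collections import Counter

    # Categorical attribute: P(di|C) as the relative frequency of value di
    # among the class-C training samples.
    def categorical_prob(class_values, value):
        return Counter(class_values)[value] / len(class_values)

    # Continuous attribute: P(di|C) through a Gaussian density whose mean
    # and variance are fitted to the class-C training samples.
    def gaussian_prob(class_values, value):
        n = len(class_values)
        mean = sum(class_values) / n
        var = sum((v - mean) ** 2 for v in class_values) / n
        return math.exp(-((value - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)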
Play-tennis example: estimating P(xi|C)
  • Class priors: P(p) = 9/14, P(n) = 5/14
  • outlook: P(sunny|p) = 2/9, P(sunny|n) = 3/5; P(overcast|p) = 4/9, P(overcast|n) = 0; P(rain|p) = 3/9, P(rain|n) = 2/5
  • temperature: P(hot|p) = 2/9, P(hot|n) = 2/5; P(mild|p) = 4/9, P(mild|n) = 2/5; P(cool|p) = 3/9, P(cool|n) = 1/5
  • humidity: P(high|p) = 3/9, P(high|n) = 4/5; P(normal|p) = 6/9, P(normal|n) = 2/5
  • windy: P(true|p) = 3/9, P(true|n) = 3/5; P(false|p) = 6/9, P(false|n) = 2/5
Play-tennis example: classifying X
  • An unseen sample X = <rain, hot, high, false>
  • P(X|p)·P(p) = P(rain|p)·P(hot|p)·P(high|p)·P(false|p)·P(p) = 3/9·2/9·3/9·6/9·9/14 = 0.010582
  • P(X|n)·P(n) = P(rain|n)·P(hot|n)·P(high|n)·P(false|n)·P(n) = 2/5·2/5·4/5·2/5·5/14 = 0.018286
  • Since 0.018286 > 0.010582, sample X is classified into class n (don't play); the arithmetic is reproduced in the sketch below
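
A minimal Python sketch reproducing the slide's arithmetic:

    # X = <rain, hot, high, false>, probabilities taken from the table above.
    p = (3/9) * (2/9) * (3/9) * (6/9) * (9/14)   # P(X|p) * P(p)
    n = (2/5) * (2/5) * (4/5) * (2/5) * (5/14)   # P(X|n) * P(n)
    print(round(p, 6), round(n, 6))              # 0.010582 0.018286
    print("don't play" if n > p else "play")     # don't play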
The independence hypothesis…
  • … makes computation possible
  • … yields optimal classifiers when satisfied
  • … but is seldom satisfied in practice, as attributes (variables) are often correlated.
  • Attempts to overcome this limitation:
    • Bayesian networks, which combine Bayesian reasoning with causal relationships between attributes
Bayesian Belief Networks (I)

[Network figure: nodes Age, FamilyH, Mass, Diabetes, Insulin, Glucose]

The conditional probability table for the variable Mass, given FamilyH (FH) and Age (A):

         (FH, A)   (FH, ~A)   (~FH, A)   (~FH, ~A)
  M       0.7       0.8        0.5        0.1
  ~M      0.3       0.2        0.5        0.9

Applying Bayesian nets
  • When all but one variable are known, the remaining one can be inferred from the network, e.g. the probability of Diabetes given the other variables:
    • P(D|A,F,M,G,I)

From Jiawei Han's slides

Bayesian belief network
  • Find the joint probability over a set of variables, making use of conditional independence whenever it is known (a minimal sketch follows the figure)

[Figure: a small example belief network over variables a, b, c, d, e, with a conditional probability table for b given a and d; variable e is independent of d given b.]

From Jiawei Han's slides
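
To illustrate how the joint factorizes over parents, here is a minimal Python sketch of P(x1, ..., xn) = product of P(xi | parents(xi)); the chain a -> b -> e and its numbers are hypothetical, chosen only to mirror the figure's point that e is independent of the rest given b:

    # Joint probability in a belief network factorizes over parents:
    # P(x1..xn) = product over i of P(xi | parents(xi)).
    # Hypothetical chain a -> b -> e: given b, e is independent of a.
    p_a = {True: 0.4, False: 0.6}          # P(a)
    p_b_given_a = {True: 0.7, False: 0.1}  # P(b=True | a)
    p_e_given_b = {True: 0.9, False: 0.2}  # P(e=True | b)

    def joint(a, b, e):
        pa = p_a[a]
        pb = p_b_given_a[a] if b else 1 - p_b_given_a[a]
        pe = p_e_given_b[b] if e else 1 - p_e_given_b[b]
        return pa * pb * pe                # three small factors, not one 2^3-entry table

    print(joint(True, True, False))        # 0.4 * 0.7 * 0.1 = 0.028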

Bayesian Belief Networks (II)
  • A Bayesian belief network allows a subset of the variables to be conditionally independent
  • A graphical model of causal relationships
  • Several cases of learning Bayesian belief networks:
    • Given both the network structure and all the variables: easy
    • Given the network structure but only some of the variables: use gradient descent / EM algorithms
    • When the network structure is not known in advance: learning the structure of the network is harder
The k-Nearest Neighbor Algorithm
  • All instances correspond to points in the n-D space.
  • The nearest neighbors are defined in terms of Euclidean distance.
  • The target function can be discrete- or real-valued.
  • For discrete-valued targets, k-NN returns the most common value among the k training examples nearest to xq (a minimal sketch follows this list).
  • Voronoi diagram: the decision surface induced by 1-NN for a typical set of training examples.
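
A minimal k-NN classifier in Python (the toy data is hypothetical; math.dist needs Python 3.8+):

    import math
    from collections import Counter

    # k-NN: label a query point xq by the majority vote of the k training
    # examples nearest to it under Euclidean distance.
    def knn_classify(training, xq, k=3):
        nearest = sorted(training, key=lambda pl: math.dist(pl[0], xq))[:k]
        return Counter(label for _, label in nearest).most_common(1)[0][0]

    train = [((0, 0), "-"), ((1, 0), "-"), ((4, 4), "+"), ((5, 4), "+"), ((4, 5), "+")]
    print(knn_classify(train, (4, 3), k=3))  # '+'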

[Figure: training examples labeled + and -, with query point xq; the decision surface induced by 1-NN is the Voronoi diagram.]

From Jiawei Han's slides

Discussion on the k-NN Algorithm
  • For continuous-valued target functions, k-NN returns the mean of the values of the k nearest neighbors
  • Distance-weighted nearest neighbor algorithm:
    • Weight the contribution of each of the k neighbors according to its distance to the query point xq, giving greater weight to closer neighbors
    • The same weighting applies to real-valued target functions (a sketch follows this list)
  • Robust to noisy data, since predictions average over the k nearest neighbors
  • Curse of dimensionality: the distance between neighbors can be dominated by irrelevant attributes
    • To overcome it, stretch the axes or eliminate the least relevant attributes

From Jiawei Han's slides
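
A minimal sketch of the distance-weighted variant for a real-valued target, weighting each neighbor by 1/d^2 (a common choice, not one mandated by the slides):

    import math

    # Distance-weighted k-NN regression: closer neighbors get larger weights.
    def knn_regress(training, xq, k=3):
        nearest = sorted(training, key=lambda pv: math.dist(pv[0], xq))[:k]
        num, den = 0.0, 0.0
        for point, value in nearest:
            d = math.dist(point, xq)
            if d == 0:
                return value               # query coincides with a training point
            w = 1 / d**2                   # greater weight to closer neighbors
            num += w * value
            den += w
        return num / den

    train = [((0,), 1.0), ((1,), 2.0), ((2,), 3.0), ((5,), 10.0)]
    print(round(knn_regress(train, (1.5,), k=3), 3))  # 2.421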