Lecture 2. Bayesian Decision Theory
Multivariate normal distribution
Discriminant functions for normal distributions
Discriminant function for discrete distributions
Normal density
Reminder: the covariance matrix is symmetric and positive semidefinite. Entropy is the measure of uncertainty; the normal distribution has the maximum entropy over all distributions with a given mean and variance.
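The formulas behind this slide are presumably the standard ones, for a d-dimensional x with mean μ and covariance Σ:

```latex
p(x) = \frac{1}{(2\pi)^{d/2} \,|\Sigma|^{1/2}}
       \exp\!\left[ -\tfrac{1}{2} (x-\mu)^t \Sigma^{-1} (x-\mu) \right],
\qquad
H(p) = -\int p(x) \ln p(x)\, dx
```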
Normal density
Let Σ be the covariance matrix; then it has k pairs of eigenvalues and eigenvectors, and Σ can be decomposed in terms of them. Since Σ is positive semidefinite, every eigenvalue satisfies λi ≥ 0; zero is achieved when the data does not occupy the entire k-dimensional space.
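A reconstruction of the missing decomposition, writing Φ for the matrix whose columns are the eigenvectors φi and Λ = diag(λ1, …, λk):

```latex
\Sigma = \Phi \Lambda \Phi^t = \sum_{i=1}^{k} \lambda_i \,\phi_i \phi_i^t,
\qquad \lambda_i \ge 0
```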
Normal density
Whitening transform: a linear transform that maps an arbitrary normal distribution onto a spherical one (identity covariance).
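The whitening transform is presumably the standard Aw = ΦΛ^(-1/2); after applying it, the covariance becomes the identity:

```latex
A_w = \Phi \Lambda^{-1/2},
\qquad
A_w^t \,\Sigma\, A_w
= \Lambda^{-1/2} \Phi^t \left( \Phi \Lambda \Phi^t \right) \Phi \Lambda^{-1/2}
= I
```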
Discriminant function
Features → discriminant functions gi(x), i = 1, …, c.
Assign class i if gi(x) > gj(x) for all j ≠ i.
The decision surface is defined by gi(x) = gj(x).
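The decision rule can be sketched in a few lines; the two discriminant functions below are made-up placeholders for illustration, not from the slides:

```python
# Minimal sketch of a discriminant-function classifier:
# assign x to the class whose discriminant g_i(x) is largest.
# g0 and g1 are hypothetical one-dimensional discriminants.

def g0(x):
    return -(x - 1.0) ** 2   # peaks at x = 1

def g1(x):
    return -(x - 4.0) ** 2   # peaks at x = 4

def classify(x, discriminants):
    # argmax over class index i of g_i(x)
    scores = [g(x) for g in discriminants]
    return scores.index(max(scores))

print(classify(0.5, [g0, g1]))  # closer to 1 -> class 0
print(classify(3.8, [g0, g1]))  # closer to 4 -> class 1
```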
Normal density
To achieve minimum-error-rate classification (zero-one loss), we use the discriminant functions gi(x) = ln p(x | ωi) + ln P(ωi). This is the log of the numerator in Bayes' formula; the log is used because we only compare the gi's, and log is monotone. When a normal density is assumed, the discriminant expands into a closed form.
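The expanded form for a normal class-conditional density p(x | ωi) = N(μi, Σi) is presumably the standard one:

```latex
g_i(x) = -\tfrac{1}{2}(x-\mu_i)^t \Sigma_i^{-1} (x-\mu_i)
         - \tfrac{d}{2}\ln 2\pi
         - \tfrac{1}{2}\ln|\Sigma_i|
         + \ln P(\omega_i)
```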
Discriminant function for normal density
Case 1: Σi = σ²I. The discriminant function is linear. (In the original slides, blue boxes mark terms that are constant across classes and therefore irrelevant to the comparison.)
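A reconstruction of the standard result for this case (not verbatim from the slides). Substituting Σi = σ²I and dropping class-independent terms:

```latex
g_i(x) = -\frac{\lVert x - \mu_i \rVert^2}{2\sigma^2} + \ln P(\omega_i)
```

Since the x^t x term is the same for every class, it too can be dropped, leaving a linear discriminant:

```latex
g_i(x) = w_i^t x + w_{i0},
\qquad
w_i = \frac{\mu_i}{\sigma^2},
\qquad
w_{i0} = -\frac{\mu_i^t \mu_i}{2\sigma^2} + \ln P(\omega_i)
```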
Discriminant function for normal density
The decision surface is where gi(x) = gj(x). With equal priors, x0 is the midpoint between the two means. The decision surface is a hyperplane, perpendicular to the line between the means.
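The standard form of this boundary (reconstructed, not verbatim from the slides):

```latex
w^t (x - x_0) = 0,
\qquad
w = \mu_i - \mu_j,
\qquad
x_0 = \tfrac{1}{2}(\mu_i + \mu_j)
      - \frac{\sigma^2}{\lVert \mu_i - \mu_j \rVert^2}
        \ln\frac{P(\omega_i)}{P(\omega_j)}\,(\mu_i - \mu_j)
```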
Discriminant function for normal density
"Linear machine": decision surfaces are hyperplanes.
Discriminant function for normal density
With unequal prior probabilities, the decision boundary shifts toward the mean of the less likely class.
Discriminant function for normal density
Case 2: Σi = Σ (all classes share the same covariance matrix).
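For this case the discriminant is again linear (standard result, reconstructed):

```latex
g_i(x) = -\tfrac{1}{2}(x-\mu_i)^t \Sigma^{-1}(x-\mu_i) + \ln P(\omega_i)
```

Dropping the class-independent quadratic term x^t Σ^(-1) x gives:

```latex
g_i(x) = w_i^t x + w_{i0},
\qquad
w_i = \Sigma^{-1}\mu_i,
\qquad
w_{i0} = -\tfrac{1}{2}\mu_i^t \Sigma^{-1}\mu_i + \ln P(\omega_i)
```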
Discriminant function for normal density
Setting gi(x) = gj(x), the decision boundary is again a hyperplane through a point x0 on the line between the means.
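The standard boundary for the shared-covariance case (reconstructed, not verbatim):

```latex
w^t (x - x_0) = 0,
\qquad
w = \Sigma^{-1}(\mu_i - \mu_j),
\qquad
x_0 = \tfrac{1}{2}(\mu_i + \mu_j)
      - \frac{\ln\!\left[ P(\omega_i)/P(\omega_j) \right]}
             {(\mu_i - \mu_j)^t \Sigma^{-1} (\mu_i - \mu_j)}\,(\mu_i - \mu_j)
```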
Discriminant function for normal density The hyperplane is generally not perpendicular to the line between the means.
Discriminant function for normal density
Case 3: Σi arbitrary. The decision boundaries are hyperquadrics (hyperplanes, pairs of hyperplanes, hyperspheres, hyperellipsoids, hyperparaboloids, hyperhyperboloids).
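In the arbitrary-covariance case the discriminant is quadratic in x (standard result, reconstructed):

```latex
g_i(x) = x^t W_i\, x + w_i^t x + w_{i0},
\qquad
W_i = -\tfrac{1}{2}\Sigma_i^{-1},
\qquad
w_i = \Sigma_i^{-1}\mu_i,
\qquad
w_{i0} = -\tfrac{1}{2}\mu_i^t \Sigma_i^{-1}\mu_i
         - \tfrac{1}{2}\ln|\Sigma_i|
         + \ln P(\omega_i)
```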
Discriminant function for normal density
Decision boundary: gi(x) = gj(x), which is a hyperquadric in general.
Discriminant function for normal density
Extension to the multi-class case.
Discriminant function for discrete features
Discrete features: x = [x1, x2, …, xd]^t, xi ∈ {0, 1}.
pi = P(xi = 1 | ω1), qi = P(xi = 1 | ω2).
Assuming the features are conditionally independent, the likelihood is:
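The product form of the likelihoods under conditional independence (reconstructed, not verbatim from the slides):

```latex
P(x \mid \omega_1) = \prod_{i=1}^{d} p_i^{x_i} (1-p_i)^{1-x_i},
\qquad
P(x \mid \omega_2) = \prod_{i=1}^{d} q_i^{x_i} (1-q_i)^{1-x_i}
```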
Discriminant function for discrete features The discriminant function: The likelihood ratio:
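Taking the log of the likelihood ratio times the prior ratio gives a discriminant that is linear in x (standard result, reconstructed); decide ω1 if g(x) > 0:

```latex
g(x) = \sum_{i=1}^{d} w_i x_i + w_0,
\qquad
w_i = \ln\frac{p_i (1-q_i)}{q_i (1-p_i)},
\qquad
w_0 = \sum_{i=1}^{d} \ln\frac{1-p_i}{1-q_i}
      + \ln\frac{P(\omega_1)}{P(\omega_2)}
```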
Discriminant function for discrete features So the decision surface is again a hyperplane.
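A runnable sketch of this linear discriminant for binary features; the probabilities and priors below are made-up numbers for illustration:

```python
from math import log

def binary_discriminant(x, p, q, prior1, prior2):
    """Linear discriminant g(x) for independent binary features.
    p[i] = P(x_i = 1 | class 1), q[i] = P(x_i = 1 | class 2).
    Decide class 1 if g(x) > 0, class 2 otherwise."""
    # weights w_i = ln[ p_i (1 - q_i) / (q_i (1 - p_i)) ]
    w = [log(pi * (1 - qi) / (qi * (1 - pi))) for pi, qi in zip(p, q)]
    # bias w_0 = sum_i ln[(1 - p_i) / (1 - q_i)] + ln(P1 / P2)
    w0 = sum(log((1 - pi) / (1 - qi)) for pi, qi in zip(p, q))
    w0 += log(prior1 / prior2)
    return sum(wi * xi for wi, xi in zip(w, x)) + w0

# Hypothetical example: three binary features, equal priors.
p = [0.8, 0.7, 0.6]   # feature-on probabilities under class 1
q = [0.2, 0.3, 0.4]   # feature-on probabilities under class 2
g = binary_discriminant([1, 1, 0], p, q, 0.5, 0.5)
print("class 1" if g > 0 else "class 2")
```

Because g is linear in x, the decision surface g(x) = 0 is a hyperplane, matching the slide's conclusion.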