
Pattern Recognition: Statistical and Neural

Presentation Transcript


  1. Nanjing University of Science & Technology. Pattern Recognition: Statistical and Neural. Lonnie C. Ludeman. Lecture 10, Sept 28, 2005.

  2. Review 3: MAP, MPE, Bayes, and Neyman-Pearson Classification Rules. Decide C1 if the likelihood ratio l(x) = p(x | C1) / p(x | C2) exceeds the threshold N; otherwise decide C2. The thresholds are N_MAP = P(C2) / P(C1), N_MPE = P(C2) / P(C1), and N_BAYES = (C22 - C12) P(C2) / [ (C11 - C21) P(C1) ], while the Neyman-Pearson threshold N_NP is set so that the false alarm probability, the integral of p(x | C2) over R1(N_NP), meets the allowed level.
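As a numerical illustration of these review thresholds (my own sketch, not from the slides; the priors and the cost convention C_ij = cost of deciding class i when class j is true are assumptions):

```python
import numpy as np

def thresholds(p_c1, p_c2, c11=0.0, c12=1.0, c21=1.0, c22=0.0):
    """Thresholds for the likelihood ratio test l(x) = p(x|C1)/p(x|C2) > N.

    Defaults are 0-1 costs, under which the Bayes rule reduces to MPE.
    """
    n_map = p_c2 / p_c1
    n_mpe = p_c2 / p_c1                       # MAP and MPE give the same threshold
    n_bayes = (c22 - c12) * p_c2 / ((c11 - c21) * p_c1)
    return n_map, n_mpe, n_bayes

# Example with assumed priors P(C1) = 0.6, P(C2) = 0.4
print(thresholds(0.6, 0.4))                   # all three equal 2/3 with 0-1 costs
```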

  3. Lecture 10 Topics: 1. Gaussian random variables and vectors; 2. General Gaussian problem: 2-class case (special cases: quadratic and linear classifiers); 3. Mahalanobis distance; 4. General Gaussian problem: M-class case (special cases: quadratic and linear classifiers).

  4. Gaussian (Normal) Random Variable: X ~ N(m, σ²). X is a Gaussian (normal) random variable if its probability density function p_X(x) is given by p_X(x) = (1 / (√(2π) σ)) exp{ -(x - m)² / (2σ²) }, where X is the random variable, m is the mean value, and σ² is the variance.
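A minimal sketch of evaluating this density (my own illustration; the values m = 1.0 and σ = 2.0 are assumed):

```python
import numpy as np

def gaussian_pdf(x, m, sigma):
    """Univariate Gaussian density p_X(x) with mean m and variance sigma**2."""
    return np.exp(-(x - m) ** 2 / (2.0 * sigma ** 2)) / (np.sqrt(2.0 * np.pi) * sigma)

# Evaluate at the mean; the peak value is 1 / (sqrt(2*pi) * sigma)
print(gaussian_pdf(1.0, m=1.0, sigma=2.0))    # approx 0.1995
```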

  5. General Gaussian Density: x ~ N(M, K). The random vector X is normally (Gaussian) distributed if its density function p(x) is given by p(x) = (1 / ((2π)^{N/2} |K|^{1/2})) exp( -½ (x - M)^T K^{-1} (x - M) ), where x = [x1, x2, ..., xN]^T is the pattern vector, M = [m1, m2, ..., mN]^T is the mean vector, and K is the covariance matrix with entries k11, k12, ..., k1N in the first row through kN1, kN2, ..., kNN in the last row.
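The N-dimensional density can be computed directly from this formula; the following is a small sketch (my own, not from the slides) with an assumed 2-dimensional mean vector M and covariance matrix K chosen only for illustration:

```python
import numpy as np

def multivariate_gaussian_pdf(x, M, K):
    """General Gaussian density p(x) for x ~ N(M, K), with K positive definite."""
    N = len(M)
    diff = x - M
    quad = diff @ np.linalg.inv(K) @ diff                 # (x - M)^T K^{-1} (x - M)
    norm = (2.0 * np.pi) ** (N / 2.0) * np.sqrt(np.linalg.det(K))
    return np.exp(-0.5 * quad) / norm

# Assumed example: N = 2
M = np.array([1.0, -1.0])
K = np.array([[2.0, 0.5],
              [0.5, 1.0]])
print(multivariate_gaussian_pdf(np.array([0.0, 0.0]), M, K))
```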

  6. Properties of the Covariance Matrix. For j, k = 1, 2, ..., N: k_jk = E[ (x_j - m_j)(x_k - m_k) ] is the covariance, and k_jj = E[ (x_j - m_j)² ] is the component variance. K is a positive definite matrix, so K has positive eigenvalues.
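As a quick numerical check of these properties (my own illustration using randomly generated data, not part of the lecture):

```python
import numpy as np

# Assumed example: estimate K from made-up samples and inspect its properties
rng = np.random.default_rng(0)
samples = rng.normal(size=(500, 3))            # 500 observations of a 3-vector

K = np.cov(samples, rowvar=False)              # estimates k_jk = E[(x_j - m_j)(x_k - m_k)]
variances = np.diag(K)                         # k_jj, the component variances
eigenvalues = np.linalg.eigvalsh(K)            # all positive for a positive definite K

print(variances)
print(eigenvalues)
print(np.all(eigenvalues > 0))                 # True (up to sampling effects)
```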

  7. General Gaussian Problem: 2-Class Case. The random vector X is normally (Gaussian) distributed under both classes. C1: X ~ N(M1, K1), with p(x | C1) = (1 / ((2π)^{N/2} |K1|^{1/2})) exp( -½ (x - M1)^T K1^{-1} (x - M1) ). C2: X ~ N(M2, K2), with p(x | C2) = (1 / ((2π)^{N/2} |K2|^{1/2})) exp( -½ (x - M2)^T K2^{-1} (x - M2) ).

  8. General Gaussian Framework: 2-Class Case. A. Assumptions: C1: X ~ N(M1, K1) with prior P(C1); C2: X ~ N(M2, K2) with prior P(C2). B. Performance measure: MAP, P(error), risk, P_D, or minimax. C. Optimum classification: minimize or maximize the chosen performance measure.

  9. Optimum Decision Rule: 2-Class Gaussian. The optimum decision rule is a likelihood ratio test, with the threshold T determined by the type of performance measure: p(x | C1) / p(x | C2) = [ |K2|^{1/2} exp( -½ (x - M1)^T K1^{-1} (x - M1) ) ] / [ |K1|^{1/2} exp( -½ (x - M2)^T K2^{-1} (x - M2) ) ]. Decide C1 if this ratio exceeds T; otherwise decide C2.

  10. Optimum Decision Rule: 2-Class Gaussian (Quadratic Processing). Decide C1 if -(x - M1)^T K1^{-1} (x - M1) + (x - M2)^T K2^{-1} (x - M2) > T1; otherwise decide C2, where T1 = 2 ln T + ln |K1| - ln |K2| and T is the optimum threshold for the type of performance measure used.
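A sketch of this quadratic rule in Python (the function name, test point, and class parameters below are my own assumptions, not from the slides):

```python
import numpy as np

def classify_2class_quadratic(x, M1, K1, M2, K2, T):
    """Quadratic 2-class Gaussian rule: decide C1 if
    -(x-M1)^T K1^{-1} (x-M1) + (x-M2)^T K2^{-1} (x-M2) > T1, else C2."""
    d1, d2 = x - M1, x - M2
    stat = -d1 @ np.linalg.inv(K1) @ d1 + d2 @ np.linalg.inv(K2) @ d2
    T1 = 2.0 * np.log(T) + np.log(np.linalg.det(K1)) - np.log(np.linalg.det(K2))
    return "C1" if stat > T1 else "C2"

# Assumed example parameters; with equal priors (MPE), T = P(C2)/P(C1) = 1
M1, K1 = np.array([0.0, 0.0]), np.eye(2)
M2, K2 = np.array([3.0, 3.0]), 2.0 * np.eye(2)
print(classify_2class_quadratic(np.array([0.5, 0.2]), M1, K1, M2, K2, T=1.0))  # expect C1
```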

  11. The threshold T equals N_MAP, N_BAYES, N_MPE, or N_NP, where N_MAP = N_MPE = P(C2) / P(C1), N_BAYES = (C22 - C12) P(C2) / [ (C11 - C21) P(C1) ], and N_NP is chosen so that the integral of p(x | C2) over R1(N_NP) meets the specified false alarm level.

  12. Mahalanobis Distance: Definition. Given two N-vectors x and y, the Mahalanobis distance d_MAH(x, y) with respect to a positive definite matrix A is defined by d_MAH²(x, y) = (x - y)^T A^{-1} (x - y). If A is the identity matrix, then d_MAH(x, y) reduces to the Euclidean distance: d_EUCLIDEAN²(x, y) = (x - y)^T (x - y).
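A small sketch of this definition (the matrix A and the vectors below are assumed example values):

```python
import numpy as np

def mahalanobis_sq(x, y, A):
    """Squared Mahalanobis distance (x - y)^T A^{-1} (x - y)."""
    d = x - y
    return d @ np.linalg.inv(A) @ d

x = np.array([2.0, 1.0])
y = np.array([0.0, 0.0])
A = np.array([[4.0, 0.0],
              [0.0, 1.0]])
print(mahalanobis_sq(x, y, A))            # 2.0: the x1 direction counts less due to larger variance
print(mahalanobis_sq(x, y, np.eye(2)))    # 5.0: reduces to squared Euclidean distance when A = I
```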

  13. 2-Class Gaussian, Special Case 1: K1 = K2 = K (Equal Covariance Matrices). Decide C1 if (M1 - M2)^T K^{-1} x > T2; otherwise decide C2 (linear processing), where T2 = ln T + ½ ( M1^T K^{-1} M1 - M2^T K^{-1} M2 ) and T is the optimum threshold for the type of performance measure used.
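A sketch of this equal-covariance linear rule (example means, shared covariance, and threshold are assumed, not from the slides):

```python
import numpy as np

def classify_2class_linear(x, M1, M2, K, T):
    """Linear rule for K1 = K2 = K: decide C1 if (M1 - M2)^T K^{-1} x > T2, else C2."""
    Kinv = np.linalg.inv(K)
    stat = (M1 - M2) @ Kinv @ x
    T2 = np.log(T) + 0.5 * (M1 @ Kinv @ M1 - M2 @ Kinv @ M2)
    return "C1" if stat > T2 else "C2"

# Assumed example: shared covariance K and threshold T = 1 (equal priors, MPE)
M1, M2 = np.array([0.0, 0.0]), np.array([2.0, 2.0])
K = np.array([[1.0, 0.3],
              [0.3, 1.0]])
print(classify_2class_linear(np.array([1.8, 1.9]), M1, M2, K, T=1.0))  # expect C2
```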

  14. 2-Class Gaussian, Case 2: K1 = K2 = K = σ²I (Equal Scaled Identity Covariance Matrices). Decide C1 if (M1 - M2)^T x > T3; otherwise decide C2 (linear processing), where T3 = σ² ln T + ½ ( M1^T M1 - M2^T M2 ) and T is the optimum threshold for the type of performance measure used.

  15. 2-Class Gaussian, Case 3: K1 = K2 = K = σ²I, MPE or Bayes with 0-1 costs and P(C1) = P(C2). Decide C1 if (M1 - M2)^T x > T4; otherwise decide C2 (linear processing), where T4 = ½ ( M1^T M1 - M2^T M2 ). Here the optimum threshold is T = 1, so the ln T term vanishes.
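A sketch of this Case 3 rule (my own example values); with these assumptions the rule amounts to assigning x to the class whose mean is closer:

```python
import numpy as np

def classify_2class_nearest_mean(x, M1, M2):
    """Case 3 rule (K = sigma^2 I, equal priors, 0-1 costs): decide C1 if
    (M1 - M2)^T x > 0.5 * (M1^T M1 - M2^T M2), i.e. pick the closer mean."""
    T4 = 0.5 * (M1 @ M1 - M2 @ M2)
    return "C1" if (M1 - M2) @ x > T4 else "C2"

# Assumed example means; the point below is closer to M1
M1, M2 = np.array([0.0, 0.0]), np.array([4.0, 0.0])
print(classify_2class_nearest_mean(np.array([1.0, 3.0]), M1, M2))  # C1
```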

  16. General Gaussian: M-Class Case. A. Assumptions: C1: X ~ N(M1, K1), with prior P(C1) and p(x | C1) = (1 / ((2π)^{N/2} |K1|^{1/2})) exp( -½ (x - M1)^T K1^{-1} (x - M1) ); C2: X ~ N(M2, K2), with prior P(C2) and p(x | C2) = (1 / ((2π)^{N/2} |K2|^{1/2})) exp( -½ (x - M2)^T K2^{-1} (x - M2) ); and so on through class CM.

  17. (continued) CM: X ~ N(MM, KM), with prior P(CM) and p(x | CM) = (1 / ((2π)^{N/2} |KM|^{1/2})) exp( -½ (x - MM)^T KM^{-1} (x - MM) ). B. Performance measure: P(error). C. Decision rule: minimum P(error).

  18. General Gaussian: M-Class Case. C. Optimum MPE Decision Rule (Derivation). Select class Ck if p(x | Ck) P(Ck) > p(x | Cj) P(Cj) for all j ≠ k, where p(x | Cj) P(Cj) = P(Cj) exp{ -½ (x - Mj)^T Kj^{-1} (x - Mj) } / ( (2π)^{N/2} |Kj|^{1/2} ).

  19. M-Class General Gaussian (continued). Define the equivalent statistic S_j(x) for j = 1, 2, ..., M: S_j(x) = P(Cj) exp{ -½ (x - Mj)^T Kj^{-1} (x - Mj) } / |Kj|^{1/2}. Another equivalent statistic is Q_j(x) for j = 1, 2, ..., M: Q_j(x) = (x - Mj)^T Kj^{-1} (x - Mj) - 2 ln P(Cj) + ln |Kj|; select class Cj if Q_j(x) is MINIMUM. The first term is d_MAH²(x, Mj), the remaining terms are a bias, and the rule is a quadratic operation on the observation vector x.
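A sketch of this quadratic M-class rule (the example means, covariances, and priors are my own assumptions):

```python
import numpy as np

def classify_quadratic_mclass(x, means, covs, priors):
    """M-class Gaussian MPE rule: select the class with minimum
    Q_j(x) = (x - Mj)^T Kj^{-1} (x - Mj) - 2 ln P(Cj) + ln |Kj|."""
    Q = []
    for Mj, Kj, Pj in zip(means, covs, priors):
        d = x - Mj
        Q.append(d @ np.linalg.inv(Kj) @ d - 2.0 * np.log(Pj) + np.log(np.linalg.det(Kj)))
    return int(np.argmin(Q)) + 1   # 1-based class index as on the slides

# Assumed example: M = 3 classes in 2 dimensions
means  = [np.array([0.0, 0.0]), np.array([3.0, 0.0]), np.array([0.0, 3.0])]
covs   = [np.eye(2), 2.0 * np.eye(2), np.array([[1.0, 0.5], [0.5, 1.0]])]
priors = [0.5, 0.25, 0.25]
print(classify_quadratic_mclass(np.array([0.2, 2.6]), means, covs, priors))  # expect class 3
```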

  20. M-Class Gaussian, Case 1: K1 = K2 = ... = KM = K. Define the equivalent statistic S_j'(x) for j = 1, 2, ..., M: S_j'(x) = P(Cj) exp{ -½ (x - Mj)^T K^{-1} (x - Mj) }. Define the equivalent statistic S_j''(x) for j = 1, 2, ..., M: S_j''(x) = (x - Mj)^T K^{-1} (x - Mj) - 2 ln P(Cj).

  21. Gaussian M-Class, Case 1: K1 = K2 = ... = KM = K. Equivalent decision rule: compute (x - Mj)^T K^{-1} (x - Mj) - 2 ln P(Cj) for each class and select the class Cj with the minimum value. The first term is d_MAH²(x, Mj); the second is a bias.

  22. Case 1a: K1 = K2 = ... = KM = K (continued). Expanding the quadratic form, (x - Mj)^T K^{-1} (x - Mj) = x^T K^{-1} x - x^T K^{-1} Mj - Mj^T K^{-1} x + Mj^T K^{-1} Mj. The term x^T K^{-1} x is the same for each class and can be dropped, so select class Cj if -2 Mj^T K^{-1} x + Mj^T K^{-1} Mj - 2 ln P(Cj) is minimum.

  23. M-Class Gaussian, Case 1: K1 = K2 = ... = KM = K. Equivalent rule: select class Cj if L_j(x) is MAXIMUM, where L_j(x) = Mj^T K^{-1} x - ½ Mj^T K^{-1} Mj + ln P(Cj). The first term is a dot product, the remaining terms are a bias, and the rule is a linear operation on the observation vector x.
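A sketch of this equal-covariance linear M-class rule (example means, shared covariance, and priors are assumed values):

```python
import numpy as np

def classify_linear_mclass(x, means, K, priors):
    """Equal-covariance M-class rule: select the class with maximum
    L_j(x) = Mj^T K^{-1} x - 0.5 * Mj^T K^{-1} Mj + ln P(Cj)."""
    Kinv = np.linalg.inv(K)
    L = [Mj @ Kinv @ x - 0.5 * Mj @ Kinv @ Mj + np.log(Pj)
         for Mj, Pj in zip(means, priors)]
    return int(np.argmax(L)) + 1   # 1-based class index as on the slides

# Assumed example: three classes sharing the same covariance matrix K
means  = [np.array([0.0, 0.0]), np.array([3.0, 0.0]), np.array([0.0, 3.0])]
K      = np.array([[1.0, 0.2], [0.2, 1.0]])
priors = [1/3, 1/3, 1/3]
print(classify_linear_mclass(np.array([2.5, 0.4]), means, K, priors))  # expect class 2
```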

  24. Summary: 1. Gaussian random variables and vectors; 2. General Gaussian problem: 2-class case (special cases: quadratic and linear classifiers); 3. Mahalanobis distance; 4. General Gaussian problem: M-class case (special cases: quadratic and linear classifiers).

  25. End of Lecture 10
