Chapter 2 Decision Functions

- Contents
2.1 Basic concepts

2.2 Linear decision functions

2.3 Generalized decision functions

2.4 Geometric discussions

2.5 Orthogonal functions


- A simple example
- Two classes $C_1$ and $C_2$
- Two-dimensional feature vector
- $x = (x_1, x_2)^T$

- Figure 2.1.1
- Clearly separable by a straight line
- $d(x) = w_1 x_1 + w_2 x_2 + w_3 = 0$

- Decision rule
- $d(x) > 0 \Rightarrow x \in C_1$
- $d(x) < 0 \Rightarrow x \in C_2$

- $d(x)$ is called the linear decision function

- $n$-dimensional Euclidean vector space ($R^n$)
- Decision function represented by a hyperplane
- $n$-dimensional feature vector
- $x = (x_1, x_2, \dots, x_n)^T$

- hyperplane
- $d(x) = w_1 x_1 + w_2 x_2 + \dots + w_n x_n + w_{n+1} = 0$

- Decision rule
- $d(x) > 0 \Rightarrow x \in C_1$
- $d(x) < 0 \Rightarrow x \in C_2$

- vector notation
- augmented vector $x = (x_1, x_2, \dots, x_n, 1)^T$, with $w = (w_1, w_2, \dots, w_n, w_{n+1})^T$
- $d(x) = w^T x$
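
The augmented-vector form of the linear decision function can be sketched in a few lines of Python. This is only an illustration, not code from the slides: the names `linear_decision` and `classify`, the string labels, and the example weights (the line $x_1 + x_2 - 1 = 0$) are assumptions chosen for the demonstration.

```python
def linear_decision(w, x):
    """Evaluate d(x) = w^T x_aug, where x_aug = (x_1, ..., x_n, 1)."""
    if len(w) != len(x) + 1:
        raise ValueError("w must have one more component than x")
    x_aug = list(x) + [1.0]  # append the constant 1 for the bias weight w_{n+1}
    return sum(wi * xi for wi, xi in zip(w, x_aug))


def classify(w, x):
    """Apply the two-class rule: d(x) > 0 -> C1, d(x) < 0 -> C2."""
    d = linear_decision(w, x)
    if d > 0:
        return "C1"
    if d < 0:
        return "C2"
    return "boundary"  # d(x) = 0: the point lies on the hyperplane


# Hypothetical weights (w1, w2, w3) for the line x1 + x2 - 1 = 0.
example_w = [1.0, 1.0, -1.0]
```

With these weights, `classify(example_w, [2.0, 2.0])` falls on the positive side and `classify(example_w, [0.0, 0.0])` on the negative side.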

- Nonlinear decision functions
- Figure 2.1.2
- boundary is a circle
- $d(x) = 1 - x_1^2 - x_2^2 = 0$

- Decision rule (note that it is the same as the previous ones)
- $d(x) > 0 \Rightarrow x \in C_1$
- $d(x) < 0 \Rightarrow x \in C_2$

- More than two classes
- $m$ pattern classes $\{C_1, C_2, \dots, C_m\}$ in $R^n$
- Definition 2.1.1
- If a surface $d(x) = 0$, $x \in R^n$, separates $C_i$ from the remaining $C_j$, $j \ne i$
- i.e.,
- $d(x) > 0 \Rightarrow x \in C_i$
- $d(x) < 0 \Rightarrow x \in C_j$, $j \ne i$

- then $d(x)$ is called a decision function of $C_i$

- Example 2.1.2
- Figure 2.1.4

- Two cases
- Absolute separation
- Pairwise separation

- Absolute separation
- If each class $C_i$ has a linear decision function $d_i(x)$ for $1 \le i \le m$
- i.e.
- $d_i(x) = w_i^T x > 0$, $x \in C_i$
- $d_i(x) = w_i^T x < 0$, otherwise

- Then absolute separation exists among $C_1, \dots, C_m$ (they are absolutely separable)
- Example 2.2.1
- Figure 2.2.1

- Absolute separation (Continued)
- How do we classify an incoming pattern $x$?
- Classify $x$ into $C_1$ if
- $d_1(x) > 0$
- $d_2(x) < 0$
- $d_3(x) < 0$

- Definition 2.2.1 (decision region)
- $D_i = \{x \mid d_i(x) > 0;\ d_j(x) < 0,\ j \ne i\}$, $1 \le i \le m$
- Example 2.2.2
- Figure 2.2.2
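
The decision regions of Definition 2.2.1 translate directly into a membership test. The sketch below is illustrative only: the function names, the choice to return `None` for a pattern that falls in no region, and the three example weight vectors are assumptions, not taken from the slides.

```python
def dot_augmented(w, x):
    """Compute w^T x_aug with x_aug = (x_1, ..., x_n, 1)."""
    return sum(wi * xi for wi, xi in zip(w, list(x) + [1.0]))


def classify_absolute(weights, x):
    """Return the 1-based class index i such that d_i(x) > 0 and
    d_j(x) < 0 for every j != i, or None if x lies in no region D_i."""
    values = [dot_augmented(w, x) for w in weights]
    positive = [i for i, d in enumerate(values) if d > 0]
    negative = [i for i, d in enumerate(values) if d < 0]
    if len(positive) == 1 and len(negative) == len(values) - 1:
        return positive[0] + 1
    return None


# Hypothetical decision functions in R^2:
#   d1(x) =  x1 - 1,  d2(x) = -x1 - 1,  d3(x) = x2 - 1
example_weights = [
    [1.0, 0.0, -1.0],
    [-1.0, 0.0, -1.0],
    [0.0, 1.0, -1.0],
]
```

For these weights the point (2, 0) satisfies only $d_1 > 0$, while (0, 0) makes every $d_i$ negative and (2, 2) makes two of them positive, so neither of the latter two belongs to any decision region.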

- A case of no absolute separation
- Figure 2.2.3

- How do we classify an incoming pattern $x$?

- Pairwise separation
- Each pair of classes is separable by a linear function
- The pair $C_i$ and $C_j$ is separable by $d_{ij}$ if
- $d_{ij}(x) > 0$ for all $x \in C_i$
- $d_{ij}(x) < 0$ for all $x \in C_j$

- Consequently, for all $x \in C_i$
- $d_{ij}(x) > 0$ for all $j \ne i$

- Decision rule
- classify $x$ into $C_i$ if
- $d_{ij}(x) > 0$ for all $j \ne i$

- Example 2.2.4
- Figure 2.2.4

- Pairwise separation (Continued)
- Definition 2.2.2 (decision region)
- $D_i = \{x \mid d_{ij}(x) > 0,\ j \ne i\}$, $1 \le i \le m$
- Example 2.2.5
- Figure 2.2.5

- Union of decision regions
- not the whole space
- the part not covered is the rejection region
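
The pairwise rule and its rejection region can be illustrated with a short Python sketch. Everything specific here is an assumption made for the example: the function names, the convention $d_{ji}(x) = -d_{ij}(x)$ used to fill in the reversed pairs, and the three pairwise functions, which are hypothetical and chosen so that a rejection region exists.

```python
def make_pairwise(base):
    """base maps (i, j) with i < j to a callable d_ij.
    Fill in the reversed pairs, assuming d_ji(x) = -d_ij(x)."""
    full = dict(base)
    for (i, j), f in base.items():
        full[(j, i)] = lambda x, f=f: -f(x)
    return full


def classify_pairwise(d, m, x):
    """Return class i (1-based) if d_ij(x) > 0 for all j != i,
    otherwise None (x lies in the rejection region)."""
    for i in range(1, m + 1):
        if all(d[(i, j)](x) > 0 for j in range(1, m + 1) if j != i):
            return i
    return None


# Hypothetical pairwise decision functions in R^2.
example_d = make_pairwise({
    (1, 2): lambda x: -x[0] - x[1] + 5.0,
    (1, 3): lambda x: -x[0] + 3.0,
    (2, 3): lambda x: -x[0] + x[1],
})
```

With these functions the point (1, 1) is assigned to $C_1$, while (2.9, 2.5) satisfies the rule for no class and is rejected.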


- Generalized decision functions
- boundaries of high complexity require nonlinear surfaces
- $d(x) = w_1 f_1(x) + w_2 f_2(x) + \dots + w_n f_n(x) + w_{n+1}$
- $f_i(x)$, $1 \le i \le n$: scalar functions of the pattern $x$, $x \in R^n$
- vector notation
- $d(x) = \sum_{i=1}^{n+1} w_i f_i(x) = w^T x^*$
where $x^* = (f_1(x), f_2(x), \dots, f_n(x), f_{n+1}(x))^T$ with $f_{n+1}(x) = 1$, and $w^T = (w_1, w_2, \dots, w_n, w_{n+1})$

- polynomial classifier is popularly used
- $f_i(x)$ are polynomials
- e.g. $f_1(x) = x_1$, $f_2(x) = x_1^2$, $f_3(x) = x_1 x_2$, …
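
A generalized decision function is a linear function of the transformed vector $x^*$, which the following Python sketch makes explicit. The function name and the particular feature functions are assumptions for illustration; the weights are chosen so that the result reproduces the circular boundary $d(x) = 1 - x_1^2 - x_2^2$ from the nonlinear example earlier in the chapter.

```python
def generalized_decision(w, features, x):
    """Evaluate d(x) = w_1 f_1(x) + ... + w_n f_n(x) + w_{n+1}.

    features is the list [f_1, ..., f_n]; the constant feature
    f_{n+1}(x) = 1 is appended here, so w needs n + 1 entries.
    """
    if len(w) != len(features) + 1:
        raise ValueError("w must have one more entry than features")
    x_star = [f(x) for f in features] + [1.0]  # transformed pattern x*
    return sum(wi * fi for wi, fi in zip(w, x_star))


# Assumed feature functions: f1(x) = x1^2, f2(x) = x2^2.
circle_features = [lambda x: x[0] ** 2, lambda x: x[1] ** 2]
circle_w = [-1.0, -1.0, 1.0]  # gives d(x) = 1 - x1^2 - x2^2
```

The decision rule itself is unchanged: only the mapping from $x$ to $x^*$ is nonlinear, so any method that handles linear decision functions can be reused on $x^*$.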


- Quadratic decision functions
- 2nd-order polynomial classifier
- e.g. 2-D patterns ($n = 2$), $x = (x_1, x_2)^T$
- $d(x) = w_1 x_1^2 + w_2 x_1 x_2 + w_3 x_2^2 + w_4 x_1 + w_5 x_2 + w_6$

- for patterns $x \in R^n$
- $d(x) = \sum_{i=1}^{n} w_{ii} x_i^2 + \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} w_{ij} x_i x_j + \sum_{i=1}^{n} w_i x_i + w_{n+1}$
- number of terms = $(n+1)(n+2)/2$
- e.g. $n = 2$: 6 terms, $n = 3$: 10 terms, …, $n = 10$: 66 terms, …
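
The term count can be checked by building the quadratic feature vector explicitly. The sketch below is illustrative: the function names are assumptions, and the ordering of the features (squares, then cross terms, then linear terms, then the constant) follows the general $n$-dimensional formula rather than the ordering of $w_1, \dots, w_6$ in the 2-D example.

```python
from itertools import combinations


def quadratic_features(x):
    """Return [x_i^2 ...] + [x_i x_j for i < j] + [x_i ...] + [1]."""
    n = len(x)
    squares = [xi * xi for xi in x]
    cross = [x[i] * x[j] for i, j in combinations(range(n), 2)]
    return squares + cross + list(x) + [1.0]


def quadratic_decision(w, x):
    """Evaluate the quadratic decision function for weights in feature order."""
    feats = quadratic_features(x)
    if len(w) != len(feats):
        raise ValueError("expected %d weights" % len(feats))
    return sum(wi * fi for wi, fi in zip(w, feats))


def expected_terms(n):
    """Closed-form term count (n + 1)(n + 2) / 2."""
    return (n + 1) * (n + 2) // 2
```

For any `n`, `len(quadratic_features([0.0] * n))` should equal `expected_terms(n)`, which gives a quick sanity check on both the enumeration and the formula.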

- Quadratic decision functions (Continued)
- in case of order $m$
- $f_i(x) = x_{i_1}^{e_1} x_{i_2}^{e_2} \cdots x_{i_m}^{e_m}$
- Theorem 2.3.1
- $d^m(x) = \sum_{i_1=1}^{n} \sum_{i_2=i_1}^{n} \cdots \sum_{i_m=i_{m-1}}^{n} w_{i_1 i_2 \dots i_m} x_{i_1} x_{i_2} \cdots x_{i_m} + d^{m-1}(x)$
where $d^0(x) = w_{n+1}$

- proof by mathematical induction

- Example 2.3.1
- Example 2.3.2
- number of terms = $(n+m)!/(n!\,m!)$
- matrix notation (quadratic case)
- $d(x) = x^T A x + x^T b + c$
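
The recursive structure of Theorem 2.3.1 and the term count can be illustrated by enumerating the index tuples directly. This is a sketch under assumptions: the function names are made up for the example, and weights are stored in a dictionary keyed by the ordered index tuple, with the empty tuple holding the constant term $w_{n+1}$.

```python
from itertools import combinations_with_replacement
from math import comb


def monomial_index_sets(n, m):
    """All index tuples i_1 <= i_2 <= ... <= i_k (1-based), for k = 0..m.

    Degree k contributes the terms that are new at order k; degree 0 is
    the constant term, represented by the empty tuple.
    """
    terms = []
    for k in range(m + 1):
        terms.extend(combinations_with_replacement(range(1, n + 1), k))
    return terms


def polynomial_decision(weights, x, m):
    """Evaluate an order-m polynomial decision function.

    weights maps an index tuple to its coefficient; missing tuples
    are treated as zero.
    """
    total = 0.0
    for idx in monomial_index_sets(len(x), m):
        term = weights.get(idx, 0.0)
        for i in idx:
            term *= x[i - 1]
        total += term
    return total
```

The number of enumerated tuples equals `comb(n + m, m)`, which is the same quantity as $(n+m)!/(n!\,m!)$; for example, `len(monomial_index_sets(3, 2))` and `comb(5, 2)` agree.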


- Importance of geometric interpretation of decision function’s properties
- hyperplanes
- dichotomies

- Hyperplanes
- linear decision functions
- in 2-D, straight line
- in 3-D, plane
- in n-D where n>3, hyperplane

- Figure 2.4.1
- hyperplane $H$
- unit normal vector $n$
- points on the hyperplane: $P$, $Q$
- vectors associated with $P$ and $Q$: $y$, $x$

- normal vector $n$
- $n = w_0 / |w_0|$, where $w_0 = (w_1, \dots, w_n)^T$ (equation 2.4.7)

- distance of an arbitrary point $R$ (with associated vector $z$) from $H$
- $D_z = |(w_0^T / |w_0|)(z - y)| = |(w_0^T z + w_{n+1}) / |w_0||$ (equation 2.4.11)


- Figure labels (Example 2.4.1): point (1, 2); axis intercepts 5/4 and 5/3

- Hyperplanes (Continued)
- Example 2.4.1
- $3x_1 + 4x_2 - 5 = 0$ in $R^2$
- $|w_0| = 5$
- $n = (3/5, 4/5)^T$
- $D_{(1,2)} = |3 \cdot 1 + 4 \cdot 2 - 5| / 5 = 1.2$
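
The numbers in Example 2.4.1 can be reproduced with a small Python sketch of the unit-normal and distance formulas. The function names and the split of the weights into `w0` (the first $n$ weights) and `bias` (the constant $w_{n+1}$) are assumptions made for the illustration.

```python
import math


def norm(v):
    """Euclidean length |v|."""
    return math.sqrt(sum(c * c for c in v))


def unit_normal(w0):
    """Unit normal n = w0 / |w0| of the hyperplane w0^T x + bias = 0."""
    length = norm(w0)
    return [c / length for c in w0]


def distance_to_hyperplane(w0, bias, z):
    """Distance |w0^T z + bias| / |w0| of the point z from the hyperplane."""
    return abs(sum(wi * zi for wi, zi in zip(w0, z)) + bias) / norm(w0)


# Example 2.4.1: 3*x1 + 4*x2 - 5 = 0 and the point (1, 2).
w0, bias = [3.0, 4.0], -5.0
d = distance_to_hyperplane(w0, bias, [1.0, 2.0])
```

Here `norm(w0)` evaluates to 5, `unit_normal(w0)` to approximately (0.6, 0.8), and `d` to approximately 1.2, in agreement with the example.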

- Example 2.4.2
- $2x_1 - x_2 + 2x_3 - 7 = 0$ in $R^3$
- exclude the patterns $y$ whose distance from the hyperplane is less than 0.01
- i.e. those with $|(2y_1 - y_2 + 2y_3 - 7) / |w_0|| = |(2y_1 - y_2 + 2y_3 - 7) / 3| < 0.01$
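
The exclusion rule of Example 2.4.2 amounts to filtering patterns by their distance from the hyperplane. The sketch below is illustrative only: the function names and the sample patterns are hypothetical, and only the hyperplane coefficients and the 0.01 threshold come from the example.

```python
import math


def distance_to_hyperplane(w0, bias, z):
    """Distance |w0^T z + bias| / |w0| of the point z from the hyperplane."""
    length = math.sqrt(sum(c * c for c in w0))
    return abs(sum(wi * zi for wi, zi in zip(w0, z)) + bias) / length


def keep_outside_margin(patterns, w0, bias, margin):
    """Keep only the patterns whose distance from the hyperplane is >= margin."""
    return [p for p in patterns if distance_to_hyperplane(w0, bias, p) >= margin]


# Hyperplane of Example 2.4.2: 2*x1 - x2 + 2*x3 - 7 = 0, so |w0| = 3.
w0, bias = [2.0, -1.0, 2.0], -7.0

# Hypothetical patterns: two lie on or very near the hyperplane.
patterns = [(3.5, 0.0, 0.0), (3.51, 0.0, 0.0), (0.0, 0.0, 0.0), (3.6, 0.0, 0.0)]
kept = keep_outside_margin(patterns, w0, bias, 0.01)
```

The first two patterns lie within 0.01 of the hyperplane and are dropped, while the other two are kept.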
