Chapter 15 System Errors Revisited

Ali Erol, 10/19/2005. Quantify the accuracy of FAR and FRR estimates. Confidence intervals are a well-known technique used in statistical analysis. See references [22], [23].

Presentation Transcript


  1. Chapter 15: System Errors Revisited. Ali Erol, 10/19/2005

  2. System Errors Revisited • Quantify the accuracy of FAR and FRR estimates. • Confidence intervals are a well-known technique used in statistical analysis. • See references [22], [23]. • The first three authors' algorithm [23] has been experimentally demonstrated to provide better confidence interval estimates.

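  3. FAR/FRR • Definition: $\mathrm{FRR}(x) = \mathrm{Prob}(s_m \le x \mid H_0) = F(x)$ and $\mathrm{FAR}(y) = \mathrm{Prob}(s_n > y \mid H_a) = 1 - \mathrm{Prob}(s_n \le y \mid H_a) = 1 - G(y)$ • We need • $F(x) = \mathrm{Dist}(x)$: genuine (matching) score distribution function (DF) • $G(y) = \mathrm{Dist}(y)$: imposter (non-matching) score DF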

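  4. FAR/FRR • Instead we have • Set of genuine scores $X = \{X_1, X_2, \ldots, X_M\}$ • Set of imposter scores $Y = \{Y_1, Y_2, \ldots, Y_N\}$ • We estimate $F$ and $G$ by the empirical distributions $\hat{F}(x) = \frac{1}{M}\,\#\{X_i \le x\}$ and $\hat{G}(y) = \frac{1}{N}\,\#\{Y_j \le y\}$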
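
A minimal sketch of estimating the two error rates from score sets, assuming the usual empirical-distribution estimate (count the scores on the wrong side of the threshold and divide by the set size). The function names and the score values are hypothetical and serve only as illustration.

```python
def frr_estimate(genuine_scores, x):
    """Fraction of genuine scores at or below threshold x (empirical F(x))."""
    return sum(1 for s in genuine_scores if s <= x) / len(genuine_scores)


def far_estimate(imposter_scores, y):
    """Fraction of imposter scores above threshold y (1 - empirical G(y))."""
    return sum(1 for s in imposter_scores if s > y) / len(imposter_scores)


# Hypothetical similarity scores, for illustration only.
X = [0.91, 0.85, 0.40, 0.77, 0.95]                    # genuine scores, M = 5
Y = [0.10, 0.35, 0.52, 0.20, 0.05, 0.61, 0.15, 0.30]  # imposter scores, N = 8

print(frr_estimate(X, 0.5), far_estimate(Y, 0.5))
```

With so few scores the estimates can only take a handful of values, which is exactly why their accuracy needs to be quantified.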

  5. Problem • What is the accuracy of these error rates? • It depends on: • The number of biometric samples • The quality of the samples • The data collection procedure (e.g. 10 consecutive samples) • The subjects involved, the acquisition device, etc.

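  6. An Estimation Problem • Given • $x$: a random variable ($F(x)$ denotes $\mathrm{Dist}(x)$) • $X = \{X_1, X_2, \ldots, X_M\}$: sample set • Estimate $\theta = E(x)$ • Solution: $\hat{\theta} = \bar{X} = \frac{1}{M}\sum_{i=1}^{M} X_i$ (unbiased estimator) • Error: $r = \hat{\theta} - \theta$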

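  7. Biased/Unbiased Estimators • For an unbiased estimator we have $E(\hat{\theta}) = \theta$ • Example: Gaussian model. Estimate the mean $\mu$ and variance $\sigma^2$ using the maximum likelihood criterion, i.e. maximize $\mathrm{Prob}(X \mid \mu, \sigma)$ • $\hat{\mu} = \frac{1}{M}\sum_{i=1}^{M} X_i$ (unbiased estimator) • $\hat{\sigma}^2 = \frac{1}{M}\sum_{i=1}^{M} (X_i - \hat{\mu})^2$ (biased estimator) • $s^2 = \frac{1}{M-1}\sum_{i=1}^{M} (X_i - \hat{\mu})^2$ (unbiased estimator)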
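
The difference between the biased and unbiased variance estimators can be checked numerically. This is a small sketch using Python's standard `statistics` module; the sample values are hypothetical.

```python
import statistics

# Hypothetical sample set, for illustration only.
X = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
M = len(X)

mu_hat = statistics.fmean(X)           # sample mean: unbiased for the mean
var_ml = statistics.pvariance(X)       # divides by M: the ML estimate, biased
var_unbiased = statistics.variance(X)  # divides by M - 1: unbiased

# The two variance estimates differ exactly by the factor M / (M - 1).
print(mu_hat, var_ml, var_unbiased, var_ml * M / (M - 1))
```

The gap between the two variance estimates shrinks as M grows, so the bias matters mostly for small sample sets.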

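  8. Confidence Interval • Assume $F(x)$ is given; then $\mathrm{Dist}(r)$ of the error $r = \hat{\theta} - \theta$ can be calculated • $r$ is a function of $\hat{\theta}$, which is a function of $x$ • Calculate, with $(1-\alpha)100\%$ certainty (next slide), $r \in [\eta_1(\alpha, X), \eta_2(\alpha, X)]$ • Which leads to a $(1-\alpha)100\%$ confidence interval for $\theta$ given by $\theta \in [\hat{\theta} - \eta_2(\alpha, X),\ \hat{\theta} - \eta_1(\alpha, X)]$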

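  9. Confidence Interval • Example • Discard $\alpha/2$ of the probability mass at the lower and higher ends of $\mathrm{Dist}(r)$ • Find the $r$ values corresponding to the interval boundaries (called quantiles): $\mathrm{Prob}(q(\alpha/2) \le r \le q(1-\alpha/2)) = 1 - \alpha$ • [Figure: $\mathrm{Dist}(r)$ plotted against $r$]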

  10. Confidence Interval • Interpretation: • Generate sample sets $X$ from $F(x)$ • Calculate the confidence interval for each $X$ • $(1-\alpha)100\%$ of these intervals contain $\theta$.

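  11. Parametric Method • The $X_i$ are identically distributed • Assume the $X_i$ are independent (not true in general) • Then $\hat{\theta}$ can be taken to be normally distributed by the central limit theorem (large $M$) • Result: $\theta \in [\hat{\theta} - z\,\hat{\sigma}/\sqrt{M},\ \hat{\theta} + z\,\hat{\sigma}/\sqrt{M}]$, where $\hat{\sigma}$ is the sample standard deviation • E.g. for 95% confidence, $z = 1.96$ • The interval gets smaller with increasing $M$ and $\alpha$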
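
A minimal sketch of the parametric interval, assuming the standard normal-approximation form (estimate plus or minus z times the sample standard deviation over the square root of M). The function name `parametric_ci` and the sample values are hypothetical.

```python
import math
import statistics


def parametric_ci(samples, alpha=0.05):
    """Normal-approximation (1 - alpha) confidence interval for the mean."""
    m = len(samples)
    theta_hat = statistics.fmean(samples)
    sigma_hat = statistics.stdev(samples)  # sample std. dev. (M - 1 divisor)
    z = statistics.NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    half_width = z * sigma_hat / math.sqrt(m)
    return theta_hat - half_width, theta_hat + half_width


# Hypothetical sample set, for illustration only.
X = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(parametric_ci(X, alpha=0.05))
print(parametric_ci(X, alpha=0.10))  # larger alpha gives a narrower interval
```

For an error rate at a fixed threshold, the samples would be the 0/1 indicators of an error for each score. The independence assumption still applies, which is the weakness the later slides address.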

  12. Non-Parametric Method • Assume $F(x)$ is available. • [Figure: density $f(x)$ of the random variable, from which the sample set $X$ and additional sample sets are drawn]

  13. Non-Parametric Method • FACT: For large B we have • Define error to be • Calculate Dist(r) • Solution:

  14. Non-Parametric Method • Interval calculation: sorting and counting • [Figure: $\mathrm{Dist}(r)$ against $r$, with $\alpha/2$ cut off at each end]
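
The sorting-and-counting step can be sketched as a Monte Carlo simulation. The sketch assumes that the error is defined as r = (estimate) minus (true value), that F(x) is known so additional sample sets can be drawn from it directly, and that the estimator is the sample mean. The function `error_quantiles` and its `draw` argument are hypothetical names.

```python
import random
import statistics


def error_quantiles(draw, theta, m, b=2000, alpha=0.1, seed=0):
    """Monte Carlo quantiles q(alpha/2) and q(1 - alpha/2) of r = theta_hat - theta.

    draw(rng) returns one sample from the known distribution F(x);
    theta is the true mean E(x), available because F(x) is known.
    """
    rng = random.Random(seed)
    errors = []
    for _ in range(b):
        sample_set = [draw(rng) for _ in range(m)]          # one additional set
        errors.append(statistics.fmean(sample_set) - theta)  # its error r
    errors.sort()                                # sorting ...
    lo = errors[int(b * alpha / 2)]              # ... and counting alpha/2 in
    hi = errors[int(b * (1 - alpha / 2)) - 1]    # from each end
    return lo, hi


# Illustration with F(x) taken to be a standard normal, so theta = 0.
q_lo, q_hi = error_quantiles(lambda rng: rng.gauss(0.0, 1.0), theta=0.0, m=100)

# For an observed estimate theta_hat, r in [q_lo, q_hi] translates to
# theta in [theta_hat - q_hi, theta_hat - q_lo].
theta_hat = 0.07  # hypothetical observed estimate
print(theta_hat - q_hi, theta_hat - q_lo)
```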

  15. Bootstrap Method • $F(x)$ is not available; all we have is $X$ • How do we generate the additional sample sets? • Solution (i.e. the bootstrap method): sampling with replacement (SR) from $X$. • Put the samples in a bag; draw one, record it, and put it back. • Draw $M$ samples from $X$, and repeat this $B$ times. Some samples $X_i$ may be missing from any given set.
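
A minimal sketch of the bootstrap procedure using `random.choices` for sampling with replacement. It assumes the interval is read directly off the sorted bootstrap estimates (the percentile form), which is one common variant and not necessarily the exact one used in the references. The name `bootstrap_ci` and the sample values are hypothetical.

```python
import random
import statistics


def bootstrap_ci(samples, estimator=statistics.fmean, b=1000, alpha=0.1, seed=0):
    """Percentile bootstrap (1 - alpha) confidence interval for an estimator."""
    rng = random.Random(seed)
    m = len(samples)
    # Draw M samples with replacement, B times, and re-estimate each time.
    estimates = sorted(estimator(rng.choices(samples, k=m)) for _ in range(b))
    lo = estimates[int(b * alpha / 2)]            # discard alpha/2 at the low end
    hi = estimates[int(b * (1 - alpha / 2)) - 1]  # and alpha/2 at the high end
    return lo, hi


# Hypothetical sample set, for illustration only.
X = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(bootstrap_ci(X))
```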

  16. Bootstrap Method (Imperfections) • The $X_i$ are not independent. • In sampling with replacement (SR), the dependence between samples is not replicated. • Effect of treating dependent samples as independent: • The variance of $\hat{\theta}$ comes out smaller • This leads to smaller CIs

  17. Subset Bootstrap • Potential sources of dependency • All samples from the same person (e.g. multiple fingers) • All samples from the same biometric (e.g. one finger) • Partition $X$ into independent subsets • Apply sampling with replacement (SR) to the subsets.
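
A minimal sketch of the subset idea, assuming that "apply SR on subsets" means resampling whole subsets (one per finger or per person) with replacement and pooling their scores before re-estimating. The name `subset_bootstrap_ci` and the score values are hypothetical.

```python
import random
import statistics


def subset_bootstrap_ci(subsets, estimator=statistics.fmean, b=1000, alpha=0.1, seed=0):
    """Percentile bootstrap CI where the resampling unit is a whole subset.

    subsets: list of lists of scores, one inner list per finger or per person.
    """
    rng = random.Random(seed)
    n = len(subsets)
    estimates = []
    for _ in range(b):
        # Resample subsets with replacement, keeping each subset intact so
        # that the dependence inside a subset is carried along.
        drawn = rng.choices(subsets, k=n)
        pooled = [score for subset in drawn for score in subset]
        estimates.append(estimator(pooled))
    estimates.sort()
    return estimates[int(b * alpha / 2)], estimates[int(b * (1 - alpha / 2)) - 1]


# Hypothetical scores: five subsets whose members are strongly dependent.
subsets = [[0.1] * 4, [0.9] * 4, [0.5] * 4, [0.3] * 4, [0.7] * 4]
print(subset_bootstrap_ci(subsets))
```

Because the 20 scores here behave like only five independent observations, resampling them individually would understate the spread of the estimate.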

  18. Subset Bootstrap (An Example) • Fingerprint database • $P$ persons • $c$ fingers per person → $D = cP$ fingers • $d$ samples per finger • DB size $= cPd$ • Matching pairs • $d(d-1)$ per finger • $cd(d-1)$ per person • $cPd(d-1) = Dd(d-1)$ total • Using a symmetric or an asymmetric matcher does not make any difference [23].

  19. Subset Bootstrap (An Example) • $X_1 \subset X_2$ • $X_1$: $P=10$, $c=2$, $D=20$, $d=8$ → $M=1120$ • $X_2$: $P=50$, $c=2$, $D=100$, $d=8$ → $M=5600$ • Finger-based partition: set the subsets to be the samples from the same finger (i.e. $D$ subsets of $d(d-1)$ matching scores) • Person-based partition: set the subsets to be the samples from the same person (i.e. $P$ subsets of $cd(d-1)$ matching scores)
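
The set sizes quoted above follow from the matching-pair counts d(d-1) per finger and cd(d-1) per person. A short check, with a hypothetical helper name:

```python
def matching_pairs(persons, fingers_per_person, samples_per_finger):
    """Return matching-pair counts (per finger, per person, total)."""
    d = samples_per_finger
    per_finger = d * (d - 1)                      # ordered pairs within one finger
    per_person = fingers_per_person * per_finger  # c fingers per person
    total = persons * per_person                  # P persons
    return per_finger, per_person, total


print(matching_pairs(10, 2, 8))  # X1: P=10, c=2, d=8
print(matching_pairs(50, 2, 8))  # X2: P=50, c=2, d=8
```

The finger-based partition of X1 therefore has 20 subsets of 56 scores, and the person-based partition has 10 subsets of 112 scores.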

  20. Subset Bootstrap (An Example) • We expect • CI1 (light gray) to be larger than CI2 (dark gray) • Because $X_1$ has a smaller number of samples • CI2 (dark gray) to be contained in CI1 (light gray) • Because $X_1 \subset X_2$ • The intervals are larger for person-based partitioning • There is dependency between fingers of the same person

  21. CIs for FAR/FRR • Calculate CIs for each threshold $T = t_0$ and a given $\alpha$

  22. CI for FRR • Given the genuine score set $X$ • Generate $B$ bootstrap sets from $X$ • Calculate the FRR estimate at $t_0$ for each bootstrap set • Sort and count

  23. CI for FAR • Given the imposter score set $Y$ • Generate $B$ bootstrap sets from $Y$ • Calculate the FAR estimate at $t_0$ for each bootstrap set • Sort and count
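
The generate, calculate, sort-and-count recipe for FRR and FAR can be sketched with one helper. This version assumes plain sampling with replacement over individual scores and a percentile interval; a subset version would resample whole subsets instead. The name `error_rate_ci`, the threshold, and the score values are hypothetical.

```python
import random


def error_rate_ci(scores, is_error, b=1000, alpha=0.1, seed=0):
    """Percentile bootstrap CI for the fraction of scores flagged by is_error."""
    rng = random.Random(seed)
    n = len(scores)
    rates = sorted(
        sum(1 for s in rng.choices(scores, k=n) if is_error(s)) / n
        for _ in range(b)
    )
    return rates[int(b * alpha / 2)], rates[int(b * (1 - alpha / 2)) - 1]


# Hypothetical similarity scores and threshold, for illustration only.
X = [0.91, 0.85, 0.40, 0.77, 0.95]                    # genuine scores
Y = [0.10, 0.35, 0.52, 0.20, 0.05, 0.61, 0.15, 0.30]  # imposter scores
t0 = 0.5

frr_ci = error_rate_ci(X, lambda s: s <= t0)  # false reject: genuine score <= t0
far_ci = error_rate_ci(Y, lambda s: s > t0)   # false accept: imposter score > t0
print(frr_ci, far_ci)
```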

  24. Subset Bootstrap for FAR • Imposter scores $Y$ are not independent • We are using multiple impressions of the same finger. • Let $I_{xk}$ be the $k$th finger impression from subject $x$; then $\mathrm{sim}(I_{a1}, I_{b1})$, $\mathrm{sim}(I_{a1}, I_{b2})$, $\mathrm{sim}(I_{a2}, I_{b3})$ are not statistically independent • To get independent scores, use each finger only once; for $D$ fingers we then have only $D/2$ such pairs • There is actually dependency between $X$ and $Y$ as well

  25. Subset Bootstrap for FAR • Fingerprint database • $P$ persons • $c$ fingers per person → $D = cP$ fingers • $d$ samples per finger • DB size $= cPd$ • Non-matching pairs • $N = d^2 D(D-1) = P[(dc)^2(P-1) + d^2 c(c-1)]$ • $d^2(D-1)$ per finger • $(dc)^2(P-1) + d^2 c(c-1)$ per person
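
The two expressions for the number of non-matching pairs can be checked against each other numerically. The helper name is hypothetical; the example values P=10, c=2, d=8 are those of the smaller database X1 from the earlier example slide.

```python
def non_matching_pairs(persons, fingers_per_person, samples_per_finger):
    """Return non-matching pair counts (per finger, per person, total)."""
    P, c, d = persons, fingers_per_person, samples_per_finger
    D = c * P                                  # total number of fingers
    per_finger = d**2 * (D - 1)                # one finger against all others
    per_person = (d * c)**2 * (P - 1) + d**2 * c * (c - 1)
    total = D * per_finger                     # N = d^2 D (D - 1)
    # Counting by person must give the same total as counting by finger.
    assert total == P * per_person
    return per_finger, per_person, total


print(non_matching_pairs(10, 2, 8))
```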

  26. Subset Bootstrap for FAR • [Figure: DB partition $I_1, \ldots, I_i, \ldots, I_N$; block $I_i$ is matched against every other block, giving $Y_1 = I_i \times I_1, \ldots, Y_{N-1} = I_i \times I_N$] • Finger ($N = D$): take $I_i$ ($d$ elements) and match it against each $I_k$, $k \ne i$ ($d^2$ pairs each); this gives $d^2(D-1)$ pairs. Repeat with all $I_i$ to construct the subsets $Y_k$ • Person ($N = P$): take $I_i$ ($cd$ elements) and match it against each $I_k$, $k \ne i$ ($(dc)^2$ pairs each); this gives $(dc)^2(P-1)$ pairs. Inside $I_i$ we have $d^2 c(c-1)$ pairs. Repeat with all $I_i$ to construct the subsets $Y_k$ • Not completely independent: we use each $I_i$ many times.

  27. Subset Bootstrap for FRR • The person-based subset gives a better estimate

  28. How good are the CIs? • There exists a true confidence interval (At the beginning we assumed F(x) is known) • The CI we calculate is just one estimate. • How accurate is that estimate?

  29. How good are the CIs? • We estimate E(x) • Ideal Test: Assume F(x) is available • Generate • Calculate • Assume and test if

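  30. How good are the CIs? • Practical Test (for comparison) • 1. Randomly split $X$ into two subsets $X_a$ and $X_b$ • 2. Calculate the estimate $\hat{\theta}_b$ from $X_b$ and the confidence interval $\mathrm{CI}_a$ from $X_a$ • 3. Test whether $\hat{\theta}_b \in \mathrm{CI}_a$ • Repeat 1-3 many times and count the number of hits, i.e. estimate the probability of falling into $\mathrm{CI}_a$ • The hit rate is not equal to the confidence. Assume the estimates have a normal distribution. • The higher the hit rate, the better the estimates are.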
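
One way to see why the hit rate differs from the confidence level is a back-of-the-envelope calculation. It rests on assumptions that go beyond the slide: the two estimates are independent, normally distributed around the same true value with the same variance (equal-size halves), and the interval from the first half is exact. The difference of the two estimates then has twice the variance of either one, so the hit probability is 2Φ(z/√2) − 1 rather than 1 − α. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def expected_hit_rate(alpha):
    """Probability that an independent estimate lands inside a (1 - alpha) CI
    centred on another estimate of equal variance, under normality."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # CI half-width in standard deviations
    # Difference of two independent estimates has std. dev. sqrt(2) * sigma.
    return 2 * NormalDist().cdf(z / math.sqrt(2)) - 1


print(expected_hit_rate(0.1))
```

Under these assumptions a 90% interval gives a hit rate of roughly 75%, noticeably below the nominal confidence.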

  31. How good are the CIs? • $\alpha = 0.1$ • Person-based partitioning provides more accurate confidence intervals • 73.10% is very close to the expected value
