##### Evaluation Techniques in Computer Vision

**Evaluation Techniques in Computer Vision**
EE4H, M.Sc 0407191 Computer Vision
Dr. Mike Spann
m.spann@bham.ac.uk
http://www.eee.bham.ac.uk/spannm

**Contents**
• Why evaluate?
• Images: synthetic/natural?
• Noise
• Example 1. Evaluation of thresholding/segmentation methods
• Example 2. Evaluation of optical flow methods

**Why evaluate?**
• Computer vision algorithms are complex and difficult to analyse mathematically
• Evaluation is usually through measurement of the algorithm's performance on test images
• Use of a range of images to establish the performance envelope
• Comparison with existing algorithms
• Performance on degraded (noise-added) images (robustness)
• Sensitivity to algorithm parameter settings

**Test images**
• Real images
  • 'Ground truth' difficult to establish
• Pseudo-real images
  • Could be synthetic objects moving against a real background
  • Often a good compromise
• Synthetic images
  • Noise and illumination variation over object surfaces hard to model realistically

**Simple synthetic images**
• Simple 'object-background' synthetic images are used to evaluate thresholding and segmentation algorithms
• They obey a very simple image model (piecewise constant + Gaussian noise)
• Unrealistic in practice: real images are not like this!

(Figure panels: Zero noise, Low noise, Medium noise)

**Pseudo-real images**
• More realistic object/background images are better for evaluating segmentation algorithms
• Images of natural objects in natural illumination
• Ground truth can be established using hand segmentation tools (such as those built into many image processing packages)

(Figure panels: Screws, Keys, Cars, Washers)

**Simple synthetic edges**
• Again, piecewise constant + Gaussian noise image model
• 'Ideal' step edge
• Precise edge location, but not achievable by finite-aperture imaging systems

(Figure panels: Low noise, Medium noise, High noise)

**Pseudo-real edges**
• More realistic edge profiles can be created by smoothing an ideal step edge

(Figure: step edge * Gaussian filter = smoothed edge profile)

**Pseudo-real movies**
• The 'yosemite' sequence is a computer-generated movie rendering a fly-through of the Yosemite valley
• Background clouds are real
• Enables true flow (ground truth) to be determined
• Used extensively in the evaluation of optical flow algorithms
• yosemite.avi
• yosemite_flow.avi

**Noise**
• Often used to evaluate the 'robustness' of algorithms
• Additive noise is usual in optical images, but multiplicative noise is more realistic in sonar/radar images
  • Noise level proportional to signal level
• Usual noise model is independent random variables (usually Gaussian)
  • Correlated noise is often more realistic

• Standard noise model is zero-mean, independent and identically distributed (iid) Gaussian (normal) random variables
• Characterised by the variance $\sigma^2$
• Probability distribution of the rv's:

$$p(n) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{n^2}{2\sigma^2}\right)$$

• Noise level is characterised by the signal-to-noise ratio (SNR)
• Usually expressed in dB
• Defined as:

$$\mathrm{SNR} = 10 \log_{10}\left(\frac{S}{\sigma^2}\right)$$

• $S$ is the mean-square grey level, defined (for an $N \times N$ pixel image $g(x, y)$) as

$$S = \frac{1}{N^2} \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} g^2(x, y)$$

(Figure panels: dB, 30 dB, 0 dB)

**Noise (mean-square error)**
• We can regard the mean-square error (difference) between two images as noise
• Often used to evaluate image compression algorithms by comparing the original and decompressed images
• Image differences can also be expressed as the peak signal-to-noise ratio (PSNR) in dB by taking the signal level as 255

**Other types of noise**
• The other main category of (additive) noise is impulse (sometimes called 'salt and pepper') noise
• Characterised by the impulse rate (spatial density of noise impulses) and the mean-square amplitude of the impulses
• Can normally be filtered out easily using median filters

(Figure panels: Original, Salt and pepper noise, De-speckled)

• There are many other types of noise which can be considered in algorithm evaluation
• Essentially more sophisticated and realistic probability distributions of the noise rv's
• For example, a 'generalised' Gaussian model is often considered to model 'heavy-tailed' distributions
• However, in my humble opinion, a more realistic source of noise is the deviation of the illumination variation across object surfaces from the 'ideal'

**Evaluation of thresholding & segmentation methods**
• Segmentation and thresholding algorithms essentially group pixels into regions (or classes)
• Simplest case is object/background
• Simple evaluation metrics just quantify the number of misclassified pixels
• For basic image models such as constant grey level in object/background regions plus iid Gaussian noise, the probability of error can be computed analytically

• For a simple object/background image:

(Equation image not preserved)

• Misclassification probability is a function of the threshold T
• For a simple constant region grey-level model plus additive iid Gaussian noise we can easily derive an analytical expression for this probability
• Not very useful in practice, as the image model is limited and we also require the ground truth
• More useful simply to measure the misclassification error as a function of threshold

• Usual to represent correct classification probabilities and false alarm probabilities jointly within a receiver operating characteristic (ROC) curve
• For example, the ROC curve shows how these vary as a function of threshold for an object/background classification

(Figure: ROC curve. Vertical axis: probability of correct classification, 0.0 to 1.0. Horizontal axis: probability of false alarm, 0.0 to 1.0. Curve end points labelled T=0 and T=255)

• More useful methods of evaluation can be found by taking account of the application of the segmentation
• Segmentation is rarely an end in itself but a component in an overall machine vision system
• Also, the level of under- or over-segmentation of an algorithm needs to be determined

(Figure panels: Ground truth, Under-segmentation, Over-segmentation)

• Under-segmentation is bad, as distinct regions are merged
• Over-segmentation can be acceptable, as sub-regions comprising a single ground truth region can be merged using 'high-level' knowledge
• Also, the level of over-segmentation can be controlled by parameter settings of the algorithm

• A possible segmentation metric is to quantify correctly detected regions, over-segmentation and under-segmentation
• Depends upon some threshold setting T
• Region rather than pixel based
• Used in Koester and Spann's paper (IEEE Trans. PAMI, 2000) to evaluate range image segmentations

• Correct detection
  • At least T% of the pixels in region $k$ of the segmented image are marked as pixels in region $j$ of the ground truth image
  • And vice versa

(Figure panels: Segmentation, GT image)

• Over-segmentation
  • Region $j$ in the ground truth image corresponds to regions $k_1, k_2, \ldots, k_m$ in the segmented image if:
  • At least T% of the pixels in each region $k_i$ are marked as pixels of region $j$
  • At least T% of the pixels in region $j$ are marked as pixels in the union of regions $k_1, k_2, \ldots, k_m$

(Figure panels: GT image, Segmentation)

• Under-segmentation
  • Regions $j_1, j_2, \ldots, j_m$ in the ground truth image correspond to region $k$ in the segmented image if:
  • At least T% of the pixels in region $k$ are marked as pixels in the union of regions $j_1, j_2, \ldots, j_m$
  • At least T% of the pixels in each region $j_i$ are marked as pixels in region $k$

(Figure panels: GT image, Segmentation)

• The metric also allows us to quantify missed and noise regions
  • Missed regions: regions in the ground truth image not found in the segmented image
  • Noise regions: regions in the segmented image not found in the ground truth image
• Overall, the average number of correct, over-segmented, under-segmented, missed and noise regions can be quantified over an image database and different algorithms compared

**Evaluation of optical flow methods**
• Optical flow algorithms compute the 2D optical flow vector at each pixel using consecutive frames in a video sequence
• Optical flow algorithms are notoriously non-robust
• Crucial to evaluate the effectiveness of any method used (or any new method devised)
• Ground truth is usually difficult to come by

• This simple error measurement naturally amplifies errors when the flow vectors are large (for the same relative flow error)
• Can normalize the error by the product of the magnitudes of the ground truth flow and the flow estimate

• Often the ground truth is not available
• A useful (but often crude) way of comparing the quality of two optical flow fields is to compute the displaced frame difference (DFD) statistic
• Uses the two consecutive frames of a sequence from which the flows were computed

• DFD is a crude estimate because it says nothing about the accuracy of the motion field directly, only about the quality of the pixel mapping from one frame to the next
• It also says nothing about the confidence attached to the optical flow estimates
• However, it is the basis of motion compensation algorithms for most of the current video compression standards (MPEG, H.261, etc.)

• In optical flow estimation, as in other types of estimation algorithms, we are often interested in the quality of the estimates
• In classic estimation theory, we often compute confidence limits on estimates
• We can say with a certain degree of confidence (say 90%) that the parameter lies within certain bounds
• We usually assume that the quantities we are estimating follow some known probability distribution (for example chi-squared)

• In the case of optical flow vectors, confidence regions are ellipses in 2 dimensions
• They essentially characterise the distribution of the estimation error
• Assuming a normal distribution of the flow error, confidence ellipses can be drawn for any confidence limit
• Orientation and shape of the ellipses are determined by the covariance matrix defining the normal distribution
• The eigenvalues of the covariance matrix determine the axis lengths of the ellipse for a particular confidence limit

(Figure: 99%, 90% and 70% confidence ellipses)

(Figure panels: Yosemite; Yosemite true flow; Yosemite flow (L&K); Yosemite flow (L&K), confidence thresholded)

**Conclusions**
• Evaluation in computer vision is a difficult and often controversial topic
• I would suggest 3 rules of thumb to consider when evaluating your work for the purposes of assignments
  • Consider your test data carefully. Make it as realistic as possible
  • Make your evaluations as 'application driven' as possible
  • Make your algorithms 'self-evaluating' if possible through the use of confidence statistics
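
The noise slides describe the signal-to-noise ratio in dB and the PSNR with the signal level taken as 255. A minimal sketch of both follows, assuming images are plain 2D lists of grey levels and that SNR is 10·log10(mean-square grey level / noise variance). The function names (`mean_square`, `add_gaussian_noise`, `measured_snr_db`, `psnr_db`) are illustrative and not from the slides, and the noisy values are deliberately not clipped to 0..255.

```python
import math
import random


def mean_square(image):
    """Mean-square grey level of a 2D list of pixel values."""
    pixels = [p for row in image for p in row]
    return sum(p * p for p in pixels) / len(pixels)


def mean_square_error(a, b):
    """Mean-square difference between two images of the same size."""
    diffs = [(x - y) ** 2
             for row_a, row_b in zip(a, b)
             for x, y in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


def add_gaussian_noise(image, target_db, rng):
    """Add zero-mean iid Gaussian noise whose variance gives the target SNR (dB)."""
    sigma = math.sqrt(mean_square(image) / 10 ** (target_db / 10))
    return [[p + rng.gauss(0.0, sigma) for p in row] for row in image]


def measured_snr_db(clean, noisy):
    """SNR in dB, treating the mean-square error as the noise variance."""
    return 10 * math.log10(mean_square(clean) / mean_square_error(clean, noisy))


def psnr_db(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mean_square_error(a, b)
    return math.inf if err == 0 else 10 * math.log10(peak ** 2 / err)
```

Passing a seeded `random.Random` instance as `rng` keeps a noise experiment repeatable across algorithm comparisons.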
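
Measuring the misclassification error and the ROC as a function of threshold, as the thresholding slides suggest, can be sketched on the simple piecewise-constant + Gaussian noise model. This is an illustration under assumptions: the object is a bright central square, a pixel is classified as object when its value exceeds T, and all function names are hypothetical.

```python
import random


def make_synthetic(size, obj_level, bg_level, sigma, rng):
    """Bright central square on a darker background, plus iid Gaussian noise.

    Returns (image, truth), where truth[r][c] is True for object pixels.
    """
    lo, hi = size // 4, 3 * size // 4
    truth = [[lo <= r < hi and lo <= c < hi for c in range(size)]
             for r in range(size)]
    image = [[(obj_level if truth[r][c] else bg_level) + rng.gauss(0.0, sigma)
              for c in range(size)] for r in range(size)]
    return image, truth


def roc_point(image, truth, threshold):
    """(false alarm probability, correct classification probability) at one threshold."""
    hits = false_alarms = n_obj = n_bg = 0
    for img_row, gt_row in zip(image, truth):
        for value, is_obj in zip(img_row, gt_row):
            detected = value > threshold
            if is_obj:
                n_obj += 1
                hits += detected
            else:
                n_bg += 1
                false_alarms += detected
    return false_alarms / n_bg, hits / n_obj


def roc_curve(image, truth, thresholds):
    """One ROC point per threshold, in the order the thresholds are given."""
    return [roc_point(image, truth, t) for t in thresholds]


def misclassification_rate(image, truth, threshold):
    """Fraction of pixels whose thresholded label disagrees with the ground truth."""
    wrong = total = 0
    for img_row, gt_row in zip(image, truth):
        for value, is_obj in zip(img_row, gt_row):
            wrong += (value > threshold) != is_obj
            total += 1
    return wrong / total
```

Sweeping `thresholds` over `range(0, 256)` traces the curve from the T=0 end (everything labelled object) towards the T=255 end (almost nothing labelled object).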
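
The region-based definitions of correct detection and over-segmentation can be turned into a small sketch. It assumes the ground truth and the segmentation are 2D lists of region labels of the same size and that T is given as a fraction (0.8 for 80%). It additionally requires at least two sub-regions before reporting over-segmentation. The function names are hypothetical, and this is not the implementation used in the cited paper.

```python
from collections import Counter


def region_overlaps(gt, seg):
    """Region sizes in each label image and pixel counts for every (j, k) overlap."""
    sizes_gt, sizes_seg, overlap = Counter(), Counter(), Counter()
    for gt_row, seg_row in zip(gt, seg):
        for j, k in zip(gt_row, seg_row):
            sizes_gt[j] += 1
            sizes_seg[k] += 1
            overlap[(j, k)] += 1
    return sizes_gt, sizes_seg, overlap


def correct_detections(gt, seg, t):
    """Pairs (j, k) where the overlap covers at least t of region k and of region j."""
    sizes_gt, sizes_seg, overlap = region_overlaps(gt, seg)
    return sorted((j, k) for (j, k), n in overlap.items()
                  if n >= t * sizes_seg[k] and n >= t * sizes_gt[j])


def over_segmentations(gt, seg, t):
    """Map each over-segmented ground truth region j to its sub-regions k_1..k_m."""
    sizes_gt, sizes_seg, overlap = region_overlaps(gt, seg)
    result = {}
    for j in sizes_gt:
        # Segmented regions lying (at least a fraction t) inside region j.
        parts = sorted(k for k in sizes_seg
                       if overlap[(j, k)] >= t * sizes_seg[k])
        covered = sum(overlap[(j, k)] for k in parts)
        # Their union must cover at least a fraction t of region j.
        if len(parts) >= 2 and covered >= t * sizes_gt[j]:
            result[j] = parts
    return result
```

Because the under-segmentation definition mirrors the over-segmentation one with the two images swapped, `over_segmentations(seg, gt, t)` lists under-segmented regions under the same assumptions.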
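
For the optical flow slides, the sketch below shows one plausible form of a ground-truth error and of the displaced frame difference. Both are assumptions rather than the slides' own formulas: the error is taken as the mean magnitude of the vector difference, and the DFD as the mean-square difference between each pixel in the first frame and the pixel it maps to in the second, with flow vectors `(u, v)` rounded to whole pixels and out-of-frame mappings skipped.

```python
import math


def mean_endpoint_error(flow_est, flow_true):
    """Mean magnitude of the difference between estimated and true flow vectors."""
    errors = [math.hypot(ue - ut, ve - vt)
              for row_e, row_t in zip(flow_est, flow_true)
              for (ue, ve), (ut, vt) in zip(row_e, row_t)]
    return sum(errors) / len(errors)


def displaced_frame_difference(frame1, frame2, flow):
    """Mean-square difference between frame1 and the flow-displaced frame2.

    flow[y][x] is (u, v): horizontal and vertical displacement in pixels.
    """
    rows, cols = len(frame1), len(frame1[0])
    total = 0.0
    count = 0
    for y in range(rows):
        for x in range(cols):
            u, v = flow[y][x]
            x2, y2 = x + round(u), y + round(v)
            # Ignore pixels that the flow maps outside the second frame.
            if 0 <= x2 < cols and 0 <= y2 < rows:
                total += (frame2[y2][x2] - frame1[y][x]) ** 2
                count += 1
    return total / count if count else math.inf
```

Two candidate flow fields for the same frame pair can then be compared by their DFD values, with the caveat from the slides that a low DFD only indicates a good pixel mapping, not an accurate motion field.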
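
The link between the covariance matrix and a confidence ellipse can be made concrete with a short sketch. It assumes a 2D normal flow error with a symmetric 2×2 covariance matrix, and uses the fact that the chi-squared quantile for 2 degrees of freedom has the closed form −2·ln(1 − p). The function name and return convention are illustrative only.

```python
import math


def confidence_ellipse(cov, confidence):
    """Semi-axes and orientation of the confidence ellipse of a 2D normal error.

    cov is a symmetric 2x2 covariance matrix [[a, b], [b, c]] and confidence
    is a probability strictly between 0 and 1.
    Returns (semi_major, semi_minor, angle), with angle in radians measured
    from the x axis to the major axis.
    """
    (a, b), (_, c) = cov
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    centre = (a + c) / 2
    radius = math.hypot((a - c) / 2, b)
    lam_major, lam_minor = centre + radius, max(centre - radius, 0.0)
    # Chi-squared quantile with 2 degrees of freedom.
    scale = -2 * math.log(1 - confidence)
    angle = 0.5 * math.atan2(2 * b, a - c)
    return math.sqrt(scale * lam_major), math.sqrt(scale * lam_minor), angle
```

Calling it with confidence levels 0.70, 0.90 and 0.99 on the same covariance gives three nested ellipses with the same orientation and axis ratio, differing only in scale.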