
ViPER Video Performance Evaluation Toolkit


Presentation Transcript


  1. ViPER: Video Performance Evaluation Toolkit viper-toolkit.sf.net

  2. Performance Evaluation • Ideal: • Fully automated • Repeatable • Can be used to compare results without access to the product • Predictive validity • Useful for the task • General enough to cover any task

  3. Reality • Evaluation cannot be fully automated for most domains; it may be subjective or objective. • Results from subjective studies cannot be easily extended, if at all. • Ground truth is hard to gather and lossy, and evaluation metrics are hard to formulate. • It is often difficult to determine what is really being measured.

  4. The ViPER Toolkit • Unified video performance evaluation resource, including: • ViPER-GT – a Java toolkit for marking up videos with truth data. • ViPER-PE – a command line tool for comparing truth data to result data. • A set of scripts for running several sets of results with different options and generating graphs.

  5. The Video Performance Evaluation Resource [Architecture diagram: a Ground Truth Editor produces Truth Data, and multiple Video Analysis Algorithms produce Result Data; the Performance Evaluation Tool, configured with a Schema Mapping, Metrics, and Filters, compares them to produce Evaluation Results.]

  6. ViPER (Evaluator View) [Diagram: Truth Data passes through a Filter Template, and each set of Result Data through a Schema Mapping; Performance Evaluation Scripts drive the Performance Evaluation Tool with a Metrics Template and per-run Metrics and Params, yielding one set of Evaluation Results per run.]

  7. ViPER (Developer View) [Diagram: the developer's algorithm produces Result Data for evaluation.]

  8. ViPER File Format • Represents data as set of descriptors, which the user defines in a schema. • Each descriptor has a set of attributes, which may take on different values over the file. • Like a temporally qualified relational database for each media file, where each row is an instance of a descriptor.
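
As a rough illustration of this data model, here is a minimal sketch (all names are hypothetical, not the actual ViPER API): a descriptor instance is one "row", carrying static attributes plus dynamic attributes whose values are qualified by framespans.

```python
from dataclasses import dataclass, field

@dataclass
class Descriptor:
    """One row of the temporally qualified relational database."""
    dtype: str        # FILE, CONTENT, or OBJECT
    name: str         # user-defined name from the schema, e.g. "Text"
    framespan: tuple  # (first_frame, last_frame) of this instance
    static: dict = field(default_factory=dict)   # attribute -> value
    dynamic: dict = field(default_factory=dict)  # attribute -> {(f0, f1): value}

    def value_at(self, attr, frame):
        """Look up an attribute's value at a given frame."""
        if attr in self.static:
            return self.static[attr]
        for (f0, f1), v in self.dynamic.get(attr, {}).items():
            if f0 <= frame <= f1:
                return v
        return None

# A text OBJECT whose bounding box moves partway through frames 10-30:
text = Descriptor("OBJECT", "Text", (10, 30),
                  static={"TYPE": "OVERLAY"},
                  dynamic={"POSITION": {(10, 19): (5, 5, 40, 12),
                                        (20, 30): (8, 6, 40, 12)}})
print(text.value_at("POSITION", 25))  # (8, 6, 40, 12)
```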

  9. ViPER File Format: Descriptors • Descriptor types: • FILE (video-level information) • CONTENT (descriptors of the scene): static attribute values; a single instance of one type for any frame. • OBJECT (descriptors of instances, including events): attributes are dynamic by default; multiple instances can exist at a single frame.

  10. Attributes • Attribute Types: • Strings, numbers, booleans, and enumerations • Shape types, including bounding boxes and polygons • Relations (Foreign keys)

  11. ViPER: Ground Truth Editing viper-toolkit.sf.net

  12. Ground Truth Editing

  13. The Difficulty of Authoring Ground Truth • Ground truth is tedious and time-consuming to edit. • Ground truth is lossy.

  14. A Generic Video Annotation Tool • Lets the user specify the task and the interpretation. • Provides a

  15. Competition • VideoAnnEx • IBM AlphaWorks MPEG-7 Editor • OntoLog (OWL) • Jon Heggland’s RDF Video Ontology Editor • Informedia • CMU Digital Video Library • PhotoStuff • Still image annotation for the semantic web

  16. Time Line View • Provides summary of ground truth. • Direct manipulation across frames. • Feedback for indirect manipulation.

  17. Time Line View • Provides summary of ground truth. • Direct manipulation. • Quick editing of activities, events, and other CONTENT descriptors. • Some ability to modify descriptors with dynamic attributes directly, if not the attribute values. • Feedback for indirect manipulation. • Easier to notice massive changes.

  18. Enhanced Keyboard Editing • Support for real-time mark-up of events and activities. • Keys for creating and deleting activities. • Keys for controlling the rate of display (jog dials). • Enhanced mark-up of spatial data. • Keys for creating and editing a single descriptor's attributes. • Overall attempt to minimize effort in a GOMS model. • Mouse events unnecessary except for polygon editing.

  19. Enhanced Keyboard Editing: Real-time Example • User assigns keys to three content types; each key toggles that content type between off and on. • Forward and back keys accelerate/decelerate video playback, which may skip frames, rewind, etc. • In paused mode, space advances to the next frame. • A USB jog dial might be useful.
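
A minimal sketch of this real-time loop, under assumed key names (hypothetical throughout; not ViPER-GT's actual bindings):

```python
# Sketch of the real-time mark-up loop described above; key names,
# rates, and state layout are hypothetical, not ViPER-GT's code.
PLAYBACK_RATES = [-4, -2, -1, 0, 1, 2, 4]  # negative = rewind, 0 = paused

state = {"frame": 0, "rate": 3, "active": set()}  # "rate" indexes PLAYBACK_RATES

def handle_key(key):
    if key in ("1", "2", "3"):      # user-assigned CONTENT-type toggle keys
        state["active"] ^= {key}    # flip that content type on/off
    elif key == "forward":          # accelerate playback
        state["rate"] = min(state["rate"] + 1, len(PLAYBACK_RATES) - 1)
    elif key == "back":             # decelerate (may cross into rewind)
        state["rate"] = max(state["rate"] - 1, 0)
    elif key == "space" and PLAYBACK_RATES[state["rate"]] == 0:
        state["frame"] += 1         # paused: step to the next frame

for key in ("1", "forward", "back", "space"):
    handle_key(key)
print(state)  # {'frame': 1, 'rate': 3, 'active': {'1'}}
```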

  20. Enhanced Keyboard Editing: Spatial Example • Mode selection: Control-d cycles through descriptor types; Control-a cycles through attribute types; Control-s cycles through available descriptors. • Editing: Control-n creates a new descriptor of the given descriptor type; Control-f creates a new attribute of the given type if none exists; arrow keys move, arrow+modifier resizes.

  21. Frame View

  22. Schema Editor

  23. ViPER-GT Internals [Architecture diagram: ViPER-GT, a video ground truth annotation tool, sits on the ViPER Metadata API (AppLoader, Plug-In Manager, Jena, core GT API), with a pure Java MPEG decoder, native decoders (VirtualDub, QuickTime, JMF), a schema editor, and plug-ins.]

  24. Latest Version in Series

  25. Latest Version in Series • Schema editor. • Timeline view. • Supports undo/redo. • New video annotation widget.

  26. GTF Inputter (Original V-GT)

  27. ViPER: Performance Evaluation viper-toolkit.sf.net

  28. PE Methodology • Ground truth and results are represented by sets of descriptive records. • Target: an object or content record delineated temporally in the ground truth, along with a set of attributes (possibly spatial). • Candidate: an object or content record delineated temporally in the results, along with a set of attributes (possibly spatial). • Requirements: matching records that are close enough to satisfy a given set of constraints on: • Temporal range • Spatial location of the object • Values of attributes in a datatype-specific parameter space

  29. Detection and Localization • Detection: whether a target object or content record is properly identified. • Localization: how well the target is detected. • Simplest level: a target is detected if its temporal range overlaps the temporal range of a single candidate. • Qualifiers and localization constraints: • Temporal overlap must meet a certain tolerance (percentage or frame count). • Spatial attributes must overlap within a tolerance on a frame-by-frame basis. • Non-spatial attributes must be within a given tolerance.
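
For intuition, a hedged sketch of the simplest detection rule with a percentage tolerance (helper names are hypothetical, not ViPER-PE's implementation):

```python
def overlap_frames(a, b):
    """Number of frames shared by two inclusive (first, last) frame ranges."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)

def detected(target, candidates, min_fraction=0.5):
    """A target counts as detected if a single candidate's temporal range
    covers at least min_fraction of the target's frames."""
    t_len = target[1] - target[0] + 1
    return any(overlap_frames(target, c) / t_len >= min_fraction
               for c in candidates)

print(detected((10, 29), [(25, 40)]))            # 5/20 = 0.25  -> False
print(detected((10, 29), [(25, 40), (12, 26)]))  # 15/20 = 0.75 -> True
```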

  30. Temporal Localization • Metrics: • Overlap coefficient: number or percentage of target frames detected. • Dice coefficient: number or percentage of frames in common; a similarity measure. • Extent coefficient: deviation in the endpoint locations of the ranges. [Diagram: a target frame range compared against candidate ranges.]
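
A sketch of these range metrics under common definitions (overlap normalized by the target, Dice by both ranges; ViPER-PE's exact formulations may differ):

```python
def to_set(r):
    """Inclusive (first, last) frame range -> set of frame numbers."""
    return set(range(r[0], r[1] + 1))

def overlap_coefficient(target, candidate):
    t, c = to_set(target), to_set(candidate)
    return len(t & c) / len(t)        # fraction of target frames detected

def dice_coefficient(target, candidate):
    t, c = to_set(target), to_set(candidate)
    return 2 * len(t & c) / (len(t) + len(c))

def extent_deviation(target, candidate):
    """Deviation in the endpoint locations of the two ranges, in frames."""
    return abs(target[0] - candidate[0]) + abs(target[1] - candidate[1])

t, c = (10, 29), (15, 34)
print(overlap_coefficient(t, c))  # 15/20 = 0.75
print(dice_coefficient(t, c))     # 2*15/40 = 0.75
print(extent_deviation(t, c))     # 5 + 5 = 10
```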

  31. Attribute Localization • Each datatype has its own metric: • Svalue: edit distance. • Point: Euclidean distance. • Bbox, obox, circle: overlap and dice coefficients. • Bvalue, lvalue: exact match [0,1]. • Remainder: absolute difference. • Object correspondence: optimal subset. • Temporal constraints: frame-by-frame tolerance; virtual candidate. [Diagram: target and candidate shapes.]
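
For the box types, a hedged sketch of area-based overlap and dice coefficients (axis-aligned boxes only; ViPER's oriented boxes would need a polygon intersection):

```python
def area(b):
    """Box given as (x, y, width, height)."""
    return b[2] * b[3]

def intersection_area(a, b):
    w = max(0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    h = max(0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    return w * h

def box_overlap(target, candidate):
    return intersection_area(target, candidate) / area(target)

def box_dice(target, candidate):
    return (2 * intersection_area(target, candidate)
            / (area(target) + area(candidate)))

t, c = (0, 0, 20, 10), (10, 0, 20, 10)  # two half-overlapping boxes
print(box_overlap(t, c))  # 100/200 = 0.5
print(box_dice(t, c))     # 2*100/400 = 0.5
```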

  32. Reporting Metrics • Detections: • List of correct, missed, and false detections. • Summary of absolute detection scores as a percentage. • Summary of overall precision and recall. • Localization: • Optimal subset of matching frames. • Frame-by-frame tolerance. • Mean, median, standard deviation, and maximum values reported. • Issues: many-to-one and many-to-many matches.
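
Precision and recall follow their standard definitions here (a sketch; the counts come from the detection lists above):

```python
def precision_recall(correct, missed, false_detections):
    """correct: matched target/candidate pairs; missed: unmatched targets;
    false_detections: unmatched candidates."""
    precision = correct / (correct + false_detections)  # how many detections are right
    recall = correct / (correct + missed)               # how many targets were found
    return precision, recall

print(precision_recall(correct=8, missed=2, false_detections=4))  # ~(0.667, 0.8)
```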

  33. Evaluation Using “Gtfc” • Used to provide basic evaluation mechanisms. • Requires configuration: • Equivalence classes • Evaluation specification • Reports attribute- and descriptor-level recall and precision.

  34. Evaluation Configuration • Set up equivalencies, evaluate subsets of the GT, and select which performance measures to allow:

  #BEGIN_EQUIVALENCE
  DISSOLVE : FADE-IN FADE-OUT TRANSLATE
  #END_EQUIVALENCE

  #BEGIN_EVALUATION_LIST
  CONTENT Shot-Change
     TYPE: CUT FADE-IN FADE-OUT
  OBJECT Text
     TYPE: FULL OVERLAY SCENE
     *POSITION
     *CONTENT
     *MOTION
  #END_EVALUATION_LIST

  35. Video Evaluation • Provides metrics to judge the correctness of: • Values of attributes • Ranges of frames (temporal) • Detection and localization of objects (spatial) • Moving objects (spatio-temporal) • Degree of correctness related to: • Similarity of, or distance between, descriptors • Cost of transformation between result and ideal data • Performance metrics reported as % of correct/incorrect instances

  40. Metric and Tolerance Specification • Specification in the evaluation parameter file:

  descriptor-type descriptor-name [METRIC TOLERANCE]
     attribute1 [METRIC TOLERANCE]
     attribute2 [METRIC TOLERANCE]
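A hypothetical instance of this grammar, reusing the Text object from the configuration above (illustrative only; the exact metric keywords and tolerance values in a real ViPER-PE parameter file may differ):

  OBJECT Text [dice 0.5]
     POSITION [overlap 0.7]
     TYPE [exact 1.0]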

  41. Match Scenarios [Diagram: target/candidate match scenarios: correct detections, a missed target, and a false detection.]

  42. Error Graphs

  43. Localization Graphs

  44. Enhanced Don't Care Example • In activity detection, certain segments are often more important than others: • The moment someone enters or exits the scene. • The moment a thief grabs a bag. • These segments might be marked up explicitly as part of an activity descriptor and treated as important during the evaluation.

  45. Enhanced Don't Care Regions • For object evaluation, Don't Care currently applies only to entire descriptors. • It needs to apply to dynamic attributes at a per-frame level, as it does for framewise evaluations. • Enhanced rules are needed for computing don't-care regions spatially and temporally. For example: • The region of the body not part of the torso or head is unimportant. • Frames before this event are unimportant.
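
As a sketch of what per-frame don't-care filtering could look like when scoring a temporal range (hypothetical; this is not how ViPER-PE currently computes it):

```python
def masked_temporal_dice(target, candidate, dont_care):
    """Dice coefficient over frame sets, ignoring don't-care frames."""
    t = target - dont_care      # drop don't-care frames before scoring
    c = candidate - dont_care
    if not t and not c:
        return 1.0              # nothing left to score
    return 2 * len(t & c) / (len(t) + len(c))

target    = set(range(10, 30))  # ground-truth frames 10-29
candidate = set(range(18, 40))  # detected frames 18-39
dont_care = set(range(28, 45))  # e.g. "frames after this event are unimportant"
print(masked_temporal_dice(target, candidate, dont_care))  # 20/28 ~ 0.714
```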

  46. Scripting ViPER • RunEvaluation • Runs sets of comparisons with different input parameters.
