1 / 6

Thomas Swift

Thomas Swift. Week 9. Aladdin Dataset Problems. Original feature representation: Histograms of Dollar Spatio -Temporal features. Problem: Histograms don’t preserve temporal information in video sequence; doesn’t take advantage of strength of DI.

breck
Download Presentation

Thomas Swift

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Thomas Swift Week 9

  2. Aladdin Dataset Problems • Original feature representation: Histograms of Dollar Spatio-Temporal features. • Problem: Histograms don’t preserve temporal information in video sequence; doesn’t take advantage of strength of DI. • DI is very time sensitive (order sensitive in vector case), good at following features throughout time sequence. • Results: Very poor. Continuously classified all queries as same couple of actions.

  3. Aladdin Dataset Continued • New feature representation: concatenated Dollar S-T descriptors at interest points in video frames. • Retains temporal info; should work well with DI. • Problem: RUN TIME! • DI has exponential run time. • L = 1,000; T = 1.3 secs • L = 10,000; T = 60 secs (46 times longer) • L = 100,000; T = 8270 secs = 2.25 hrs (137 times longer) • Vector lengths for video representation generally 500,000-900,000 in length. • 15 event types * 100 training examples per event * 300 queries (20 per event) = unrealistic run time. • Results: Effort abandoned (for now).

  4. Oregon State Football Dataset • 23 videos- split into 20% testing (~5) and 80% training (~18) through K_folds cross validation • Loop through all K_folds testing/training divisions (each video tested once) • 23 videos split into 5 groundtruth classes (play types) • Results: 21/23 classified correctly (91.3%) • Really 21/21, as 2 of the 5 classes only have 1 example, and so only exist in training or testing set, and so can’t be matched to its own class

  5. More Extensive Testing • Rather than splitting dataset into training and testing, compute DI between each play and all other plays but itself; find highest score. • Results: 12/23(21)- 52.1%(57.1%) • Further extension: from computed DI scores, group scores by play type and take means; find highest mean score among types. • Results: 7/23(21)- 30.4%(33.3%) • Problem: Play type 3 is an anomaly- has run & pass components, only 1 example • Of 14 misclassified plays, 13 were classified as type 3 • Of these 13, second best choice was correct for 6 of them • For pass queries, DI scores were higher for all pass plays than all run plays

  6. Next Week • Complete registration of plays to static field model, and then recompute optical flow and rerun program (should improve results!). • Salman is implementing paper. • I have obtained C code for that paper. • Do time warping to get uniform vector lengths. • Only using 15-40% of vector in most cases. • Get better (larger and more diverse) dataset. • Get more compact vector representation for videos in Aladdin Dataset. • Subh recommended PCA to reduce dimensionality.

More Related