
Introduction


rory

Presentation Transcript


  1. Introduction
     Problem: classifying attributes and actions in still images
     Model:
     • collection of part templates
     • specific scale-space locations (human centric)
     • discriminative learning
     • sparse activation

  2. Motivation (figure panels: Train, Test, Train, Test)

  3. Overview: Mining Parts & Learning Templates; Image Scoring

  4. Formulation Dataset: Model: fractional multiples of width and height Objective:

  5. Model: fractional multiples of width and height; d = 1000 (figure labels: Model; Part 1, Part 2, Part 3, ..., parts)

  6. Model & Scoring. Image Scoring. Model. Optimization: greedy selection, with a 0.33 overlap constraint and sparse activation
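     The greedy selection under an overlap constraint mentioned on this slide can be sketched as follows. This is only an illustration, not the presenters' implementation: the slide does not say how overlap is measured, so intersection over union is assumed here, and the names `iou`, `greedy_select`, `image_score` and the parameter `k` (number of parts allowed to fire) are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes (x0, y0, x1, y1).

    Assumption: the slide does not define its overlap measure; IoU is
    used here purely for illustration.
    """
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def greedy_select(candidates, k, max_overlap=0.33):
    """Pick up to k (score, box) pairs, best score first.

    A candidate is skipped when it overlaps an already chosen box by
    more than max_overlap. Keeping only k parts gives sparse activation.
    """
    chosen = []
    for score, box in sorted(candidates, key=lambda c: c[0], reverse=True):
        if len(chosen) == k:
            break
        if all(iou(box, other) <= max_overlap for _, other in chosen):
            chosen.append((score, box))
    return chosen


def image_score(candidates, k, max_overlap=0.33):
    """Image score as the sum of the selected part scores (an assumption)."""
    return sum(score for score, _ in greedy_select(candidates, k, max_overlap))


if __name__ == "__main__":
    parts = [
        (0.9, (0, 0, 10, 10)),
        (0.8, (1, 1, 11, 11)),    # overlaps the first box heavily
        (0.5, (20, 20, 30, 30)),
        (0.1, (40, 40, 50, 50)),
    ]
    print(greedy_select(parts, k=2))
```

     In this toy input the second candidate is rejected because it overlaps the best one by more than 0.33, so the two selected parts are the first and the third.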

  7. Model Initialization
     1) Randomly sample the positive training images for patch positions:
     2) Initialize model parts: perfect case: worst case:
     3) BoF features, normalized; 10^5 patches.
     4) Pruning: remove unused parts
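     A minimal sketch of the sampling and pruning steps, under stated assumptions. The slide does not give the sampling distribution or the patch sizes, so uniform sampling and the `scales` values below are illustrative only; patch positions are stored as fractions of image width and height, and every function name is hypothetical.

```python
import random


def sample_patches(image_sizes, n_patches, scales=(0.25, 0.5), seed=0):
    """Sample patch positions from positive training images.

    image_sizes: list of (width, height), one per positive image.
    Returns tuples (image_index, fx, fy, s), where fx, fy and s are
    fractions of the image width and height. The scales are assumed
    values, not taken from the slides.
    """
    rng = random.Random(seed)
    patches = []
    for _ in range(n_patches):
        idx = rng.randrange(len(image_sizes))
        s = rng.choice(scales)
        fx = rng.uniform(0.0, 1.0 - s)  # keep the patch inside the image
        fy = rng.uniform(0.0, 1.0 - s)
        patches.append((idx, fx, fy, s))
    return patches


def to_pixels(patch, image_sizes):
    """Convert a fractional patch to pixel coordinates (x0, y0, x1, y1)."""
    idx, fx, fy, s = patch
    w, h = image_sizes[idx]
    return (fx * w, fy * h, (fx + s) * w, (fy + s) * h)


def prune(parts, use_counts):
    """Drop parts that were never selected during scoring."""
    return [p for p, c in zip(parts, use_counts) if c > 0]


if __name__ == "__main__":
    sizes = [(100, 200), (50, 50)]
    patches = sample_patches(sizes, n_patches=5)
    for p in patches:
        print(p, to_pixels(p, sizes))
```

     The slide mentions on the order of 10^5 patches; the small count above is only to keep the example quick to run.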

  8. Learning k = 4

  9. Experiments: Willow 7 Human Actions; 27 Human Attributes (HAT); Stanford 40 Human Actions

  10. Implementation
      Features:
      • VLFeat dense SIFT
      • step size: 4 pixels
      • square patches (8 to 40 pixels)
      • k-means vocabulary of size 1000
      • explicit feature map + Bhattacharyya (Hellinger, square-root) kernel
      Baseline: 4-level spatial pyramid
      Immediate context:
      • expand the human bounding boxes by 50% in both width and height
      Full image context:
      • full-image classifier uses a 4-level SPM with an exponential χ² kernel
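      Two of the steps on this slide are simple enough to sketch. The explicit feature map for the Bhattacharyya (Hellinger) kernel is the element-wise square root of an L1-normalized histogram, so a plain dot product of mapped vectors equals the kernel value. The box expansion assumes the 50% growth is split evenly around the box centre, which the slide does not state. Function names are hypothetical.

```python
import math


def hellinger_map(hist):
    """Explicit feature map: square root of the L1-normalized histogram.

    hist must hold non-negative bin counts (e.g. a bag-of-features
    histogram). The mapped vector has unit L2 norm.
    """
    total = sum(hist)
    if total == 0:
        return [0.0] * len(hist)
    return [math.sqrt(v / total) for v in hist]


def bhattacharyya(h1, h2):
    """Bhattacharyya kernel, computed as a dot product of mapped vectors."""
    return sum(a * b for a, b in zip(hellinger_map(h1), hellinger_map(h2)))


def expand_box(box, frac=0.5):
    """Grow a box (x0, y0, x1, y1) by frac in width and height.

    Assumption: the growth is symmetric about the box centre. The
    result is not clipped to the image borders.
    """
    x0, y0, x1, y1 = box
    dw = (x1 - x0) * frac / 2.0
    dh = (y1 - y0) * frac / 2.0
    return (x0 - dw, y0 - dh, x1 + dw, y1 + dh)


if __name__ == "__main__":
    print(bhattacharyya([1, 3], [3, 1]))
    print(expand_box((10, 10, 30, 50)))
```

      With the explicit map a linear classifier can be trained on the mapped vectors instead of a kernel machine, which is presumably why the slide lists it alongside the kernel.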

  11. Qualitative Results

  12. Willow Actions

  13. Database of Human Attributes (HAT)

  14. Stanford 40 Actions

  15. Learned Parts - I. In each row, the first image is the patch used to initialize the part and the remaining images are its top-scoring patches.

  16. Learned Parts - II. In each row, the first image is the patch used to initialize the part and the remaining images are its top-scoring patches.

  17. Learned Parts - III. In each row, the first image is the patch used to initialize the part and the remaining images are its top-scoring patches.
