
The Bar Of Soap: A multifunctional device with natural mode switching

Brandon Taylor, Daniel Smalley, Jeevan Kalanithi, Matt Adcock. MAS622J, 2006. MIT Media Lab. What is the Bar of Soap? A device that can be your phone, your camera, your iPod, your everything.


Presentation Transcript


  1. The Bar Of Soap: A multifunctional device with natural mode switching. Brandon Taylor, Daniel Smalley, Jeevan Kalanithi, Matt Adcock. MAS622J, 2006. MIT Media Lab.

  2. What is the Bar of Soap? A device that can be your phone, your camera, your iPod, your everything. It stays simple to use because: • The controls appear on the fly based on its mode; at rest, it looks like an undifferentiated block. • The Bar figures out what mode it should be in based on how the user holds it.

  3. Our Pattern Recognition Problem. How does the Bar of Soap figure out which mode it should be in? How can we make the device cheap, flexible, and general across users? • Build a Bar of Soap prototype • Use data generated by its sensors to detect “pose” • Build classifiers to figure out pose based on these data • We want to simplify the device: • What sensors can we eliminate? • What classifiers are cheap to run on an embedded device? • What classifiers allow users to add new modes on the fly?

  4. The Data • 48 Binary Buttons Determine Hand Position • 3-Axis Accelerometer Orients the Device Relative to Gravity • Data is sampled continuously at ~3 Hz and transmitted to a host PC via Bluetooth

  5. Data Collection • Single-user data set • 5 classes: camera, phone, game pad, PDA, remote control • 40 samples per class

  6. Data Prep • Interested in static poses • Averaged over the last n samples • Chose n = 4 to filter out button glitches
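The averaging step might look like the following minimal sketch (the helper name and buffer layout are hypothetical; each raw frame is assumed to be 48 button bits plus 3 accelerometer values):

```python
import numpy as np

def pose_vector(samples, n=4):
    """Average the last n raw samples (48 button bits + 3 accelerometer
    axes = 51 values each) into one static-pose feature vector; n = 4
    smooths out momentary button glitches."""
    recent = np.asarray(samples, dtype=float)[-n:]   # keep only the last n readings
    return recent.mean(axis=0)                       # 51-dimensional pose vector

# Hypothetical usage: a rolling buffer filled from the Bluetooth stream.
buffer = np.zeros((10, 51))
x = pose_vector(buffer)   # averaged buttons land in [0, 1]; accel stays analog
```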

  7. Data Reduction • Raw data has 51 dimensions • Important not to lose track of where the information is • Need to handle scaling between binary buttons & analog accelerometer data • Looked at: Accelerometer Only; Button Groups Only; Fisher & PCA; ‘Knowledgeable’ reduction
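For the PCA branch of that comparison, a minimal sketch using scikit-learn; the retained component count and the standardization step are assumptions, not taken from the slides:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# X: (num_poses, 51) pose vectors (placeholder data here). Standardizing
# first addresses the scaling mismatch between the binary button features
# and the analog accelerometer axes.
X = np.random.rand(200, 51)
X_std = StandardScaler().fit_transform(X)
pca = PCA(n_components=10).fit(X_std)            # component count is a guess
X_pca = pca.transform(X_std)                     # reduced features
print(pca.explained_variance_ratio_.cumsum())    # how much variance is kept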

  8. “Knowledgeable Reduction” • Button Grouping • Right/Left Hand Symmetry • 180 Degree Rotational Symmetry
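One way such a reduction might be implemented is to pair each left-side button with its mirror image and sum the pair, so mirrored grips produce the same features. The pairing below is purely hypothetical; the slides do not give the actual button layout.

```python
import numpy as np

def fold_left_right(buttons, left_idx, right_idx):
    """Fold right/left hand symmetry: each mirrored pair of buttons is
    replaced by its sum, so a left-handed and a right-handed grip map to
    the same reduced feature vector."""
    b = np.asarray(buttons, dtype=float)
    return b[left_idx] + b[right_idx]        # 24 features instead of 48

# Hypothetical layout: buttons 0-23 mirror buttons 24-47.
left_idx = np.arange(24)
right_idx = np.arange(24, 48)
reduced = fold_left_right(np.zeros(48), left_idx, right_idx)
```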

  9. Discriminability Across Reduction Methods [figures: accelerometer-only features; buttons-only features]

  10. Discriminability of the “Knowledgeable Reduction” [figure]

  11. Reduction Makes It Easier To Classify. Let’s try two techniques: • Templates: cheaper, but stupid • Neural Nets: better, but expensive
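A “template” here can be read as a nearest-class-mean classifier. A minimal sketch of that reading (not necessarily the authors’ exact implementation):

```python
import numpy as np

class TemplateClassifier:
    """Nearest-class-mean 'template' matching: store one mean vector
    (template) per class, classify a pose by its closest template."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.templates_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict(self, X):
        # Euclidean distance from every sample to every class template.
        d = np.linalg.norm(X[:, None, :] - self.templates_[None, :, :], axis=2)
        return self.classes_[d.argmin(axis=1)]
```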

  12. PCA Reduced Data

  13. Danger for Template matching • The mean for the green class sits resolutely in the red class.

  14. “Knowledgeable Reduction” • Button Grouping • Right/Left Hand Symmetry • 180 Degree Rotational Symmetry • So we can correct for symmetries manually or we could try using Fisher and see what happens…

  15. Symmetry/Fisher Reduced Data (Even better separation than ‘Knowledgeable Reduction’!)

  16. Is Template Matching Enough? • Template : 59.3% • Neural Net: 91.8%

  17. Is Template Matching Enough? • On the earlier reduction: Template 59.3%, Neural Net 91.8% • After Fisher reduction: Template 99.6%, Neural Net 97.6% • Yes, template matching appears to be good enough … for ‘Fisherized’, single-person data.
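For the neural-net side of the comparison, a minimal sketch using scikit-learn’s MLPClassifier; the architecture, hidden-layer size, and 80/20 split are assumptions, since the slides do not specify the network:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

# X_reduced: Fisher- or PCA-reduced pose vectors, y: the five mode labels
# (placeholder data here).
X_reduced, y = np.random.rand(200, 4), np.random.randint(0, 5, 200)
X_tr, X_te, y_tr, y_te = train_test_split(X_reduced, y, test_size=0.2, random_state=0)

# One small hidden layer; its size (10 units) is a guess, not from the slides.
net = MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000, random_state=0)
net.fit(X_tr, y_tr)
print("test accuracy:", net.score(X_te, y_te))
```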

  18. (Aside): What if we had unlabeled data?

  19. My guess might be:

  20. My guess might be: (4 wrong)

  21. K-means Guess (2 wrong)

  22. Mixture of Gaussians

  23. Mixture of Gaussians (not so good)
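A minimal sketch of the two unsupervised approaches from this aside, using scikit-learn; the reduced feature dimension and parameters are placeholders:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

X = np.random.rand(200, 4)   # reduced, unlabeled pose vectors (placeholder data)

# K-means with one cluster per expected mode.
km = KMeans(n_clusters=5, n_init=10, random_state=0).fit(X)
km_labels = km.labels_

# Mixture of Gaussians fit by EM; full covariances can overfit when there
# are only a few dozen samples per class.
gmm = GaussianMixture(n_components=5, covariance_type='full', random_state=0).fit(X)
gmm_labels = gmm.predict(X)
```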

  24. Rationally Reduced, Trained on Jim / Tested on Jim • Template 82.2% • Neural Net 92.5%

  25. Fisher Reduced, Trained on Jim / Tested on Jim • Template 97.5% • Neural Net 97.5%

  26. Non-Parametric Classifiers • Advantage: one need not do explicit training. • Disadvantage: computation is expensive. • Two classifiers were implemented: K-Nearest Neighbors and Parzen Windows

  27. Good Old K-Nearest Neighbors. First problem: with five classes, how do we deal with ties? • Just pick one randomly • Break ties based on mean distances • Or always use a “weighted vote”, where each neighbor’s vote is weighted by its distance to the sample (closer neighbors count more). The choice does not seem to matter too much for kNN…

  28. Good Old K-Nearest Neighbors. kNN performs well on the single user across values of k… • Best performance (dumb voting or smart tie-break voting): • k = 9: 95.0% on test data • 96.77% on leave-one-out cross-validation
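A minimal sketch of this setup with leave-one-out cross-validation (scikit-learn; weights='distance' stands in for the weighted vote, and the data here are placeholders):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import LeaveOneOut, cross_val_score

X, y = np.random.rand(200, 4), np.random.randint(0, 5, 200)   # placeholder data

# k = 9 matched the best single-user result on the slides; weights='distance'
# makes closer neighbors count more in the vote.
knn = KNeighborsClassifier(n_neighbors=9, weights='distance')
scores = cross_val_score(knn, X, y, cv=LeaveOneOut())
print("leave-one-out accuracy:", scores.mean())
```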

  29. Good Old K-Nearest Neighbors. At k = 9, only the camera class is confused. [confusion matrix: 1 = camera, 2 = game controller, 3 = PDA, 4 = phone, 5 = remote control]

  30. Parzen Windows. Instead of picking some predefined number of nearest neighbors, draw a hypersphere around your test sample, see which training points fall within it, and classify based on those points. Advantage: outlier points are not misclassified. Advantage: new classes can be identified on the fly (more on that later). Disadvantage: what is the right volume size?
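A minimal sketch of a fixed-radius (hypersphere) classifier in this spirit; the window shape and voting details on the slides may differ:

```python
import numpy as np

def parzen_classify(x, X_train, y_train, radius, classes):
    """Classify x from the training points falling inside a hypersphere of
    the given radius; return None if the window is empty (possible novel
    class). A simple unweighted vote is used here; the slides also explore
    a distance-weighted vote."""
    d = np.linalg.norm(X_train - x, axis=1)
    inside = d <= radius
    if not inside.any():
        return None                      # nothing nearby: unknown / new class
    votes = [(y_train[inside] == c).sum() for c in classes]
    return classes[int(np.argmax(votes))]
```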

  31. Parzen Windows. The voting scheme really matters!

  32. Parzen Windows. Parzen windows do well once the volume reaches a certain size, and scale matters! [figures: Parzen on unnormalized accelerometer data; the upper figure’s volume axis ranges from 0 to 50, the left-hand figure’s from 0 to 4] Best result: 96.41% at volume size 0.9 with the weighted voting scheme.

  33. Parzen Windows. At v = 0.9, only the camera gets confused. [confusion matrix: 1 = camera, 2 = game controller, 3 = PDA, 4 = phone, 5 = remote control]

  34. Parzen Windows. Can we recognize new classes “on the fly”? • Scenario: What if a user wants to associate some new pose with a new self-designed or downloaded mode? • Test this with “leave-one-class-out” validation • As we add the novel class into the data set, can the Parzen classifier recognize it as a new class?

  35. Parzen Windows. Can we recognize new classes “on the fly”? We can recognize novel classes on the fly! But optimal volumes vary across classes…
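The leave-one-class-out check could be sketched as follows, reusing the hypothetical parzen_classify helper from the sketch above: hold out one class, train on the rest, and count how often held-out samples land in an empty window (i.e. are flagged as a possible new mode):

```python
import numpy as np

def leave_one_class_out_novelty(X, y, radius, classes):
    """For each class c, train on all other classes and measure how often
    samples of c fall in an empty window (recognized as a possible new mode)."""
    rates = {}
    for c in classes:
        train = y != c
        others = [k for k in classes if k != c]
        novel = 0
        for x in X[~train]:
            if parzen_classify(x, X[train], y[train], radius, others) is None:
                novel += 1
        rates[c] = novel / (~train).sum()
    return rates   # fraction of held-out samples flagged as novel, per class
```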

  36. Multi-Class Linear Discriminants One-versus-one c(c-1) discriminant functions One-versus-the-rest c discriminant functions

  37. Multi-Class Linear Discriminants One-versus-one c(c-1) discriminant functions One-versus-the-rest c discriminant functions

  38. Multi-Class Linear Discriminants One-versus-one c(c-1) discriminant functions One-versus-the-rest c discriminant functions

  39. Multi-Class Linear Discriminants 1 or 2? One-versus-one c(c-1) discriminant functions One-versus-the-rest c discriminant functions

  40. Multi-Class Linear Discriminants Generalised Linear Discriminant function!

  41. Two Classes (min MSE). Stack the augmented training samples as rows of a matrix $Y$ (rows from the second class negated) and set the target vector $b$ to all ones; solve for the weight vector $a = (w_0, w_1, w_2, w_3, \dots)^T$ in the least-squares sense:
  $$ Y a = b, \qquad a = Y^{\dagger} b $$
  Then $g(\mathbf{x}) = a^T \mathbf{x}$, and we ask: is $g(\mathbf{x}) > 0$?

  42. Multiple Classes (min MSE). Stack the augmented samples, grouped by class, as rows of $Y$, and let $B$ hold one-hot target rows (a 1 in the column of the sample’s class, 0 elsewhere); solve for the weight matrix $A$, whose columns $(w_0, w_1, w_2, w_3, \dots)^T$ give one discriminant per class:
  $$ Y A = B, \qquad A = Y^{\dagger} B $$
  Then $G(\mathbf{x}) = A^T \mathbf{x} = (g_1, g_2, g_3, g_4, g_5)$; pick the class with the maximum index of $G(\mathbf{x})$.
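A minimal numpy sketch of the $A = Y^{\dagger} B$ solution from this slide, with one-hot target rows in $B$; encoding the class labels as integers 0 to c-1 is an assumption for the sketch:

```python
import numpy as np

def fit_mse_discriminant(X, y, num_classes):
    """Minimum-squared-error multi-class linear discriminant: solve
    Y A = B in the least-squares sense via the pseudoinverse, with Y the
    augmented samples and B the one-hot target matrix."""
    Y = np.hstack([np.ones((len(X), 1)), X])          # prepend the bias term
    B = np.eye(num_classes)[y]                        # one one-hot row per sample
    A = np.linalg.pinv(Y) @ B                         # A = Y† B
    return A

def predict(A, X):
    G = np.hstack([np.ones((len(X), 1)), X]) @ A      # G(x) for every sample
    return G.argmax(axis=1)                           # pick the max index g_i
```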

  43. GLD: Results (Single User, 80% / 20% split). Testing Performance: 87.5% (Train/Validate Performance: 92.9%)

  44. Quick summary of classifier performance. Overall, the classifiers’ performance is impressive (~90%), especially considering that random guessing would yield 20% accuracy. Other design considerations beyond accuracy would determine which classifier to use.

  45. Now, how well can training on one user generalize to lots of other users? Scenario: We build our devices and build our classifier based on some small set of users. Will this classifier work for all of the potential “customers” out in the world? We collected data from 14 different people and tested this hypothesis…

  46. Rationally Reduced, Trained on Jim / Tested on World • Template 75.4% • Neural Net 79.0%

  47. Fisher Reduced, Trained on Jim / Tested on World • Template 61.5% • Neural Net 59.0%

  48. K-Nearest Neighbors: performance on all user data… At k = 4: 75.8%. [confusion matrix, actual vs. estimated class: 1 = camera, 2 = game controller, 3 = PDA, 4 = phone, 5 = remote control]

  49. Parzen Windows: performance on all user data… At volume 0.9: 72.3%. [confusion matrix: 1 = camera, 2 = game controller, 3 = PDA, 4 = phone, 5 = remote control]

  50. GLD: Results (Single User / Multiple Users). Testing Performance: 70.3% (Train/Validate Performance: 93.3%)
