
Cheap User Modeling for Adaptive Systems


Presentation Transcript


1. Cheap User Modeling for Adaptive Systems Presented by: Frank Hines Topics in CS Spring 2011 Primary reference: Orwant, J. (1996). For want of a bit the user was lost: Cheap user modeling. IBM Systems Journal, 35(3,4), 398-416.

2. Limitless Information • 100s of channels

3. One Size Fits All? “People have limited cognitive time they want to spend on picking a movie.” - Reed Hastings, CEO of Netflix

4. Information Overload! • Paradox of choice • Increased dissatisfaction • Increased fatigue • Increased anxiety • Lowered productivity • Lowered concentration • Lowered quality

5. Can We Limit to the Most Relevant Info? Learning Toolbox • User modeling is NOT strictly content filtering! • Timing/performance • Prioritization • Formatting [Diagram: Doppelgänger pipeline — sensors feed user models {U1, U2}; processing & filtering selects from content {a,b,c,d,e,f} to yield per-user presentations, e.g. U1 → {a,c,d,f}, U2 → {b,c,e}]

6. Overview • What is meant by adaptation? • What is a user model? • What can we predict? • Just how predictable are we?

7. Adaptation • Adaptation is a sign of intelligence • Adaptation in nature • Usability vs. personalization [Diagram: commonalities and differences with current software]

8. Adaptation in Software “One of the worst software design blunders in the annals of computing” – Smithsonian Magazine [Screenshots: Newsmap, Jadeite]

  9. Adaptation in Conversation Human-Human interaction (discourse) • Human-Computer interaction • Vocabulary (age) • Speech volume (noise) • Speech rate (time pressure) • Syntactic structure (cultural affiliation) • Topic (interests, knowledge)

  10. Models “The sciences do not try to explain, they hardly even try to interpret, they mainly make models.” - John von Neumann

11. User Model • A framework to “simulate” a user and predict that user’s actions • A mathematical relationship among variables • NOT necessarily a cognitive representation • Models typically include: knowledge, beliefs, goals, plans, schedules, behaviors, abilities, preferences • Example: GRUNDY (Rich, 1979) recommended books from inferred personality traits
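A user model in this sense is just a predictive data structure. A minimal sketch in Python (every field name here is hypothetical, not taken from Doppelgänger):

```python
from dataclasses import dataclass, field

@dataclass
class UserModel:
    """Illustrative per-user prediction bundle; all field names are hypothetical."""
    interests: dict = field(default_factory=dict)       # topic -> estimated interest in [0, 1]
    location_probs: dict = field(default_factory=dict)  # place -> probability of being there
    schedule: dict = field(default_factory=dict)        # event -> predicted time

    def predicted_interest(self, topic: str) -> float:
        # Fall back to a neutral prior for topics never observed.
        return self.interests.get(topic, 0.5)

u = UserModel(interests={"technology": 0.9, "sports": 0.4})
print(u.predicted_interest("technology"))  # 0.9
print(u.predicted_interest("weather"))     # 0.5 (neutral prior)
```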

12. What can we predict? • Events • Interest • Location • Behavior

13. How can we predict an event? Predict the next observation as a function of past ones: s_n = f(s_{n-1}); s_n = f(s_{n-1}, s_{n-2}); in general, s_n = f(s_{n-1}, s_{n-2}, …, s_{n-j})

14. Linear Prediction • Discrete time series • Predicts future values from a linear function of past values • Canonical example: tidal activity • Other examples: sunspots, speech processing, stock prices, branch prediction, oil detection

15. Linear Prediction 1. Compute the autocorrelation vector R 2. Compute the predictor coefficients a_k from R 3. Compute the next observation: ŝ_n = Σ_k a_k s_{n-k}
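A minimal sketch of those three steps, following the standard autocorrelation formulation in Makhoul (1975); the toy signal and variable names are illustrative:

```python
import numpy as np

def linear_predict(signal, order):
    """Predict the next sample of a discrete time series via linear prediction."""
    s = np.asarray(signal, dtype=float)
    n = len(s)
    # 1. Autocorrelation vector R[0..order].
    R = np.array([np.dot(s[:n - k], s[k:]) for k in range(order + 1)])
    # 2. Predictor coefficients a_k: solve the Toeplitz normal equations.
    T = np.array([[R[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(T, R[1:])
    # 3. Predicted next observation: weighted sum of the last `order` samples.
    return float(np.dot(a, s[:-order - 1:-1]))

# A noisy sinusoid: the predictor extrapolates the oscillation.
t = np.arange(200)
sig = np.sin(0.3 * t) + 0.05 * np.random.randn(200)
print(linear_predict(sig, order=4), np.sin(0.3 * 200))  # prediction vs. true next value
```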

16. Correlation [Figure: autocorrelation of the signal with no shift, shifted by one observation, by two observations, …, by n observations]

17. Use in Doppelgänger • Relevant news chosen and collated beforehand • Tailored to the length of time the user has available • Can determine when the user is expected to read email • Problems: confidence decreases as predictions advance into the future [Plots: inter-arrival time, session duration]

18. How can we predict interest? • Sports articles: 4 out of 10 ‘Likes’ • Technology articles: 9 out of 10 ‘Likes’

  19. News Topic Interest by Section

20. Beta Distribution • Describes uncertainty about a probability • Based on hits & misses • Normalized so the area under the curve = 1 • Yields a mean (the rating), a variance, and a confidence
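A minimal sketch of the hit/miss rating, assuming the common Beta(H+1, M+1) parameterization (a uniform prior; the exact priors used in Doppelgänger may differ):

```python
def beta_rating(hits: int, misses: int):
    """Mean rating and variance of a Beta(hits + 1, misses + 1) interest estimate."""
    a, b = hits + 1, misses + 1                   # +1 encodes a uniform prior (assumption)
    mean = a / (a + b)                            # the estimated interest (rating)
    var = a * b / ((a + b) ** 2 * (a + b + 1))    # shrinks as observations accumulate
    return mean, var

# From slide 18: technology 9 of 10 'Likes', sports 4 of 10.
print(beta_rating(9, 1))   # high mean, small variance: confidently interested
print(beta_rating(4, 6))   # lower mean
print(beta_rating(0, 0))   # no data: mean 0.5, maximal uncertainty
```

With few observations the distribution is wide (low confidence); as hits and misses accumulate, the variance shrinks even when the mean barely moves.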

21. Rating and Confidence [Plots: Beta densities for H=1, M=1; H=2, M=2; H=5, M=5; H=10, M=10] As observations increase, confidence (height) increases and variance (width) decreases. [Plots: H=5, M=25 vs. H=25, M=5] The rating skews relative to the hit/miss distribution.

22. Use in Doppelgänger • Measuring topical interest • Problems: • Equal weight on ratings over time • Binary classification of topics • Credit assignment when an article has multiple classifications • Binary yes/no feedback

  23. How can we keep track of location/state? • We can use Markov Models

24. Markov Models • Directed graph: set of states, initial probabilities, transition probabilities • For each discrete time step, the state advances • Stationary random process • Markov property: no memory of past states traversed [Diagram: four states (0-3) with labeled transition probabilities]
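A minimal sketch of stepping such a chain; the transition matrix below is hypothetical, since the slide's diagram did not survive transcription:

```python
import random

# Hypothetical transition matrix over four states 0..3; each row sums to 1.
P = [
    [0.0, 0.6, 0.3, 0.1],
    [0.5, 0.0, 0.4, 0.1],
    [0.2, 0.2, 0.2, 0.4],
    [0.9, 0.0, 0.1, 0.0],
]

def step(state: int) -> int:
    """Advance one discrete time step using the transition probabilities."""
    return random.choices(range(4), weights=P[state])[0]

# Simulate: by the Markov property, each step depends only on the
# current state, not on the path taken to reach it.
state, path = 0, [0]
for _ in range(10):
    state = step(state)
    path.append(state)
print(path)
```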

25. Modeling a Student [Table: probability transition matrix over the student states E, H, ST, T, SL]

26. Uses in Doppelgänger • Physical location tracking • Printing priority • Phone call routing • Pre-fetching content • Website page navigation [Map: Media Lab locations]

27. What if we cannot observe the underlying states? Can we infer state from observable output? • Yes: we can use “Hidden” Markov Models! • We can use this technique to infer behavior

28. Hidden Markov Models [Diagram: hidden states emitting output symbols with symbol emission probabilities]

  29. Extremely Useful Technique • Speech Recognition • Part of Speech Tagging • DNA Sequencing • Biological Particle Identification • Too many other areas to list!

30. Questions We Can Ask • What is the probability of a symbol sequence? → Forward algorithm (evaluation) • What is the most likely state sequence to have generated a symbol sequence? → Viterbi algorithm (decoding) • What transition/emission probabilities maximize the likelihood of a symbol sequence? → Baum-Welch algorithm (learning)

31. Forward Algorithm • Naively there is an exponential number of state sequences • How do we solve it in polynomial time? Dynamic programming: the forward algorithm [Trellis: states s1-s3 over output symbols x1-x4]
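A minimal sketch over a hypothetical 3-state, 4-symbol HMM (all probabilities illustrative). The vector alpha holds, for each state, the probability of having emitted the symbols so far and ending in that state, which is what collapses the exponential number of paths into one pass per symbol:

```python
import numpy as np

def forward(pi, A, B, obs):
    """P(symbol sequence) via dynamic programming (the forward algorithm).

    pi:  initial state probabilities, shape (S,)
    A:   transitions, A[i, j] = P(next state j | state i)
    B:   emissions, B[i, k] = P(symbol k | state i)
    obs: sequence of symbol indices
    """
    alpha = pi * B[:, obs[0]]          # prob. of the first symbol, per ending state
    for o in obs[1:]:
        # One matrix product sums over all predecessor states at once.
        alpha = (alpha @ A) * B[:, o]
    return float(alpha.sum())

pi = np.array([0.5, 0.3, 0.2])
A = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.5, 0.2],
              [0.2, 0.3, 0.5]])
B = np.array([[0.5, 0.2, 0.2, 0.1],
              [0.1, 0.4, 0.4, 0.1],
              [0.2, 0.2, 0.1, 0.5]])
print(forward(pi, A, B, [0, 1, 2, 3]))   # P(x1 x2 x3 x4)
```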

32. Viterbi Algorithm • Via dynamic programming (similar to Forward) • Instead of summing over all previous paths, only the maximum probability is stored • A backpointer is stored at each step for path reconstruction [Trellis: states s1-s3 over symbols x1-x4; most probable state sequence: s2, s1, s3, s2]
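A minimal Viterbi sketch: the same trellis with max in place of sum, plus backpointers (reusing the hypothetical HMM from the forward sketch):

```python
import numpy as np

def viterbi(pi, A, B, obs):
    """Most likely hidden state sequence for an observed symbol sequence."""
    S, T = len(pi), len(obs)
    delta = np.log(pi) + np.log(B[:, obs[0]])    # log-probs avoid underflow
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + np.log(A)      # scores[i, j]: best path into j via i
        back[t] = scores.argmax(axis=0)          # remember the best predecessor of j
        delta = scores.max(axis=0) + np.log(B[:, obs[t]])
    path = [int(delta.argmax())]                 # best final state
    for t in range(T - 1, 0, -1):                # follow backpointers in reverse
        path.append(int(back[t][path[-1]]))
    return path[::-1]

# Reusing the hypothetical HMM from the forward-algorithm sketch:
pi = np.array([0.5, 0.3, 0.2])
A = np.array([[0.7, 0.2, 0.1], [0.3, 0.5, 0.2], [0.2, 0.3, 0.5]])
B = np.array([[0.5, 0.2, 0.2, 0.1], [0.1, 0.4, 0.4, 0.1], [0.2, 0.2, 0.1, 0.5]])
print(viterbi(pi, A, B, [0, 1, 2, 3]))           # most probable hidden state sequence
```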

33. Use in Doppelgänger • Determine the “working” (i.e., psychological) state • Class of task being performed • More importantly, how much attention is demanded [Diagram: HMM with hidden states such as “Hacking” emitting observable output symbols]

34. What do we do if we do not have enough data about a particular user? • Substitute a small amount of information from many other users

35. Cluster Analysis • More computationally expensive than the previous tools • But doesn’t change as often • Useful when there is little or no information about a user • Based on correlations between users • Constructs communities by gathering a few bits from many people • Similar to popular “collaborative filtering” techniques

  36. K-Means Clustering
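Since the slide's figure did not survive, here is a minimal sketch of Lloyd's k-means over hypothetical topic-interest vectors, the kind of clustering that lets a sparse user borrow predictions from their community:

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Lloyd's algorithm: alternate assigning users and moving centroids."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]  # random initial centers
    for _ in range(iters):
        # Assign each user to the nearest centroid (Euclidean distance).
        labels = np.linalg.norm(X[:, None] - centroids[None], axis=2).argmin(axis=1)
        # Move each centroid to the mean of its community.
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids

# Rows are users; columns are interest in (sports, technology), as on slide 18.
X = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8], [0.15, 0.85]])
labels, centroids = kmeans(X, k=2)
print(labels)      # which community each user belongs to
print(centroids)   # each community's average interest profile
```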

37. Prediction Toolbox • Linear prediction → events • Beta distribution → interest • Markov model → location • Hidden Markov model → behavior • Cluster analysis → when all else fails

38. Just how predictable are we? • Netflix Prize (2006): improve the recommendation algorithm (Cinematch) by 10% for $1,000,000 • Winner: BellKor’s Pragmatic Chaos • Solution: independent convergence, fusing 107 independent algorithmic predictions

  39. The ‘Napoleon Dynamite’ Effect “Human beings are very quirky and individualistic, and wonderfully idiosyncratic. And while I love that about human beings, it makes it hard to figure out what they like.” - Reed Hastings, CEO of Netflix

40. Criticisms of Primary Article • Where is the empirical evaluation of the techniques? • vs. other techniques? • vs. other cheap or expensive approaches? • vs. non-adaptive systems? • Concessions: • Orwant’s motivation was to galvanize cheap user modeling techniques • The techniques have been validated in other realms and in industry

41. References • Orwant, J. (1996). For want of a bit the user was lost: Cheap user modeling. IBM Systems Journal, 35(3,4), 398-416. • Makhoul, J. (1975). Linear prediction: A tutorial review. Proceedings of the IEEE, 63(4), 561-580. • Rabiner, L. R. (1989). A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2), 257-286. • Singh, V., Marinescu, D. C., & Baker, T. S. (2004). Image segmentation for automatic particle identification in electron micrographs based on hidden Markov random field models and expectation maximization. Journal of Structural Biology, 145, 123-141. • Many other references not shown here • If interested, email me at frankHines@knights.ucf.edu

  42. Jon Orwant • Ph.D. • C.T.O. • Engineering Mgr.

  43. Sharing Standards & Privacy • Protocol development • User Markup Language • Passive sensors as an invasion of privacy • Informed consent • Access to personal data • Accessor keywords • Access Control Lists
