1 / 41

Breaking the Laws of Action in the User Interface

Breaking the Laws of Action in the User Interface. Per-Ola Kristensson Department of Computer and Information Science Linköpings universitet, Sweden also IBM Almaden Research Center, California, USA Advisor: Shumin Zhai. What do I do?. Improve the performance of stylus keyboards Faster

toril
Download Presentation

Breaking the Laws of Action in the User Interface

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Breaking the Laws of Actionin the User Interface Per-Ola Kristensson Department of Computer and Information Science Linköpings universitet, Sweden also IBM Almaden Research Center, California, USA Advisor: Shumin Zhai

  2. What do I do? • Improve the performance of stylus keyboards • Faster • Less error-prone • Fluid interaction

  3. Background: Pen Computing • Great premise, but many failures • Text entry is slow and error-prone • Commercial pen UIs are micro-versions of the desktop GUI • Can research help?

  4. Text entry on mobile computers • How do we write text efficiently “off-the-desktop” • Explosion of mobile computers – smart phones, PDAs, Tablet PCs, handheld video game consoles • Can we achieve QWERTY touch-typing speed? • The stylus keyboard is the fastest pen-based text entry method that we know of

  5. Why not handwriting or speech recognition? • Handwriting recognition [Tappert et al. 1990] • Limited to about 15 wpm [Card et al. 1983] • Speech recognition [Rabiner 1993] • Difficult to convert the acoustic signal to text • Error correction [Karat et al. 1999] • Dictation and cognitive resources [Sheiderman 2001] • Privacy

  6. Modeling Stylus Keyboard Performance using the Laws of Action • Fitts’ law – speed-accuracy trade-off in pointing • Crossing law [Fitts 1954, Accot and Zhai 2002]

  7. How to “Break” Fitts’ Law • D/W relationship in Fitts’ law • Break D – minimize the distance the pen travels • Break W – maximize the target size

  8. The QWERTY Stylus Keyboard • “Obvious” approach – transplanting QWERTY to the pen user interface

  9. Example: Breaking D • The distance between frequently related keys should be minimized • A model of stylus keyboard performance: • Fitts’ law • Digraph statistics (the probability that one key is followed by another) • Using the model compute optimal configuration [Getschow et al. 1986, Lewis et al. 1992]

  10. Example: ATOMIK • Optimized by a Fitts’ law – digraph model using simulated annealing [Zhai et al. 2002]

  11. Elastic Stylus Keyboard Breaking the Fitts’ Law W Constraint [Kristensson and Zhai 2005]

  12. Problems with Stylus Keyboards • Error prone • Unlike physical keyboards, stylus keyboards lack tactile sensation feedback • Off by one pixel results in an error • Bounded by the Fitts’ law accuracy trade-off • Trying to be faster than what Fitts’ law predicts results in more errors

  13. Two Observations • Not all key combinations on a stylus keyboard are likely • A lexicon defines legal combinations • Stylus taps are “continuous” variables • Stylus taps form high resolution patterns • Words in the lexicon form geometrical patterns • Using pattern matching we can identify the user’s input

  14. Example h e t j w r the n

  15. Elastic Stylus Keyboard (ESK) • Pen-gesture as delimiter • Edit-distance generalized to comparing point sequences instead of strings • Handles erroneous insertions and deletions • Indexing to allow efficient computation, despite quadratic complexity of the matching algorithm • Can search 57K lexicon in real time on a 1 GHz Tablet PC

  16. ESK Video Demonstration Video

  17. Can an ESK “break” Fitts’ Law? • Regular QWERTY stylus keyboarding has an average estimated expert speed of 34.2 wpm • Since we relax or “break” the W constraint in Fitts’ law (the radius of the key), can we do better?

  18. SHARK Shorthand Breaking the Crossing Law W Constraint [Zhai and Kristensson 2003, Kristensson and Zhai 2004]

  19. SHARK • Shorthand-Aided Rapid Keyboarding • Typing on a stylus keyboard is a verbatim process • Instead of tapping the letter keys of a word • …gesture the patterns directly

  20. Writing the word “system” as a shorthand gesture Gradual transition from tracing the keys to open-loop gesture recall

  21. SHARK Video Demonstration Video

  22. Advantages • Users can be productive while training: in-use learning • The most frequently used words in a user’s vocabulary get practiced the most • Easy mode for novices (visual-guided) • Fast mode for experts (memory recall) • Transition from novice to expert is continuous • Keyboard acts as a mnemonic device

  23. Empirical “records” (wpm)

  24. Breaking the Laws of Action Tapping vs. Gesturing Visualization of the “Wiggle Room” More Advanced Interfaces in the Future?

  25. The “Sloppiness Space” • Pattern recognition accuracy depends on how similar patterns are • Larger lexicon = more confusable patterns? • … but in fact, most confusable words in SHARK and ESK are very frequent, since frequent words tend to be smaller • How does a user know how the limits of the system?

  26. What is a Recognition Error? • Speed accuracy trade-off • How fast people can do gestures? • How “sloppy” people get? • What is “reasonable”? • Users pushing the system beyond any chance of recognition

  27. Going beyond the Laws of Action… • Relaxes visual attention • Movements can be more imprecise • Movements can be faster (corollary to 2) • Tapping and gesturing patterns – where is the difference?

  28. C A N ESK vs. SHARK • Gesturing “can” vs. “an” • Tapping “can” vs. “an”

  29. Gesturing Lacks Delimitation Information

  30. Speed and Learning • Less information = faster articulation? • Chunking • Tapping = sequentially entering small “chunks” of information • Gesturing = one chunk of information • Motor memory, different muscles involved, more feedback when gesturing than tapping

  31. On-Going and Planned Future Work • Evaluating ESK and SHARK – in controlled laboratory studies, and “in the wild” • Comparative study between tapping and gesturing patterns • Speed comparison is easy • … but study learning is harder! • Studying effects of trying to visualize the limits in the system

  32. Why is this Work Important? • Gain insight in gesture and point-and-click interfaces in general • The “paradigm” of gesturing patterns can be used to develop more advanced interfaces • Video demo (if time)

  33. Thank You!

  34. SHARK vs. Marking menu • Multi-channel pattern recognition vs. angular direction • Thousands of words vs. dozens of commands • Continuous vs. binary novice-expert transition (marking menus have delayed feedback)

  35. Optimizing stylus keyboard for SHARK • Have tried, non-trivial • Less room for improvement • Computationally challenging – measuring ambiguity in a large vocabulary • Optimization would be highly dependent on classifier and its parameters

  36. Feedback • Recognized word is drawn on the keyboard • Presents ideal gesture on keyboard • Morphing of user’s pen trace towards the recognized sokgraph • The animation suggests to a user which parts of a gesture that are the farthest away from the ideal sokgraph

  37. Evaluation • Can user’s learn the sokgraphs? • Expanding Rehearsal Interval (ERI) training • Users can on average learn 15 sokgraphs per 45 minute training session

  38. Recognition architecture Shape Location … • Integration using the Gaussian probability density function and Bayes’ rule • Standard deviation is a parameter adjusting the contribution of a channel Integration

  39. QWERTY vs. ATOMIK

  40. Preprocessing and pruning • Smoothing (filtering) • Equidistant re-sampling to a fixed N number of points • Normalization in scale and translation (for shape channel and pruning) • Pruning scheme

  41. Using higher level language regularity • Bigram language model • Viterbi decoding of most likely word sequence • Problem of highly accurate recognition data being integrated with noisy statistics • Integration using a Gaussian function, again, Sigma is an empirical parameter

More Related