
Portable Vision-Based HCI

A portable vision-based HCI system that operates on a projected interface and detects the user's hand motion in real time from the video camera of a PDA/smartphone. The system aims to provide a method efficient enough to run on portable devices, enabling a more intuitive way of manipulating data.

jackieo

Presentation Transcript


  1. Portable Vision-Based HCI: A Real-Time Hand Mouse System on Portable Devices • 連矩鋒 (Burt C.F. Lien) • Department of Computer Science and Information Engineering, National Taiwan University

  2. Problems • A portable vision-based HCI • Hand mouse operating on a projected interface • Real-time detection of user hand motion from the video camera of the user's PDA/smartphone (target platform) • Needs a method efficient enough to run on portable devices

  3. Why important • Vision-based HCI is a more instinctive way to manipulate data

  4. Related Works I • A Portable System for Anywhere Interactions • Sukaviriya et al., IBM Research • Real-time hand tracking using a set of cooperative classifiers based on Haar-like features • Barczak et al., Institute of Information & Mathematical Sciences, Massey University

  5. Everywhere Display (IBM) • Figure 1: Interactive store application

  6. Related Works II • P. Viola and M. Jones. "Rapid Object Detection Using a Boosted Cascade of Simple Features," 2001 • P. Viola and M. Jones. "Robust Real-Time Object Detection" • P. Viola and M. Jones. "Robust Real-Time Face Detection" • P. Viola, M. Jones, and D. Snow. "Adaboost-Based Real-Time Pedestrian Detection" • James W. Davis. "Hierarchical Motion History Images for Recognizing Human Motion," IEEE Workshop on Detection and Recognition of Events in Video (EVENT'01), p. 39, 2001 • Tim Weingaertner, Stefan Hassfeld, Ruediger Dillmann. "Human Motion Analysis: A Review," IEEE Workshop on Motion of Non-Rigid and Articulated Objects (NAM '97), p. 0090, 1997

  7. Reference codes • Intel OpenCV Libraries • Motion Template • Motion History Image

  8. Contribution • An efficient method to run a real-time vision-based HCI system on a portable device • Experimental result: typically 5~7% CPU usage (Intel Pentium M processor 730, 1.6 GHz) at 640x320 resolution (3 fps) • The motion method used in this system needs no training process. This removes the training effort and can make object detection more robust to lighting changes, even with a blurred image.

  9. Target Devices

  10. System Configuration • Wireless projector • Projected contents • Data transmission • Hand motion capture and interpretation • Interactive interface

  11. Platform and Tools • Platform (prototype) • “Laptop” + “Low Cost Camera (USB) – NT300” • Software tools • “MS VC++” + “Intel OpenCV library”

  12. Assumptions • The screen is rectangular • The background is static most of the time • Only one user

  13. AdaBoost (old version) • To recognize a "hand" • AdaBoost training (1397 hand images + 3000 background images) • Training an 11-stage classifier takes 2 days (Viola & Jones: on the order of weeks) • Result: the classifier is too weak to recognize a hand, and the false-alarm rate is high

  14. Haartraining Result • Original test image • Outline of the hand stressed manually • Background darkened

  15. Motion Template • Abandon the AdaBoost-trained classifier • Motion Template • Motion History Image: image ring buffer (N=3) • To reduce computation (remove complex mathematical computation and replace it with simple heuristics) • To acquire and record the front edge of a motion • To define orientation (for different instructions) • To detect a "touch" behavior (density drop rate)
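
The image ring buffer and the image-difference step it feeds could be sketched roughly as follows. This is a minimal pure-Python illustration, not the presenter's code: frames are assumed to be lists of rows of grayscale values, and the names `FrameRing` and `silhouette` and the threshold of 30 are hypothetical.

```python
from collections import deque


class FrameRing:
    """Keeps only the last n grayscale frames (n=3 on the slide)."""

    def __init__(self, n=3):
        self.frames = deque(maxlen=n)  # the oldest frame is dropped automatically

    def push(self, frame):
        self.frames.append(frame)

    def silhouette(self, threshold=30):
        """Binary mask of pixels that differ between the oldest and newest frame.

        The threshold is an assumed value, not taken from the slides.
        """
        if len(self.frames) < 2:
            return None  # not enough history yet
        old, new = self.frames[0], self.frames[-1]
        return [
            [1 if abs(a - b) > threshold else 0 for a, b in zip(row_old, row_new)]
            for row_old, row_new in zip(old, new)
        ]
```

Comparing against the oldest buffered frame rather than the previous one makes slow hand movement easier to pick up at the low frame rates reported later in the deck.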

  16. Motion History Image • Each pixel (x,y) in the MHI is marked with the current timestamp if the detection function signals object presence (or motion) in the current video image I(x,y); the remaining timestamps in the MHI are removed if they are older than the decay value. • This update function is called for every new video frame analyzed in the sequence.
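
The update rule described on this slide can be written out in a few lines. The sketch below is a plain-Python illustration under stated assumptions: the MHI and the silhouette are lists of rows, `update_mhi` is a hypothetical name, and `duration` stands for the decay value. The deck itself relies on the OpenCV motion-template code, so this is an explanation of the rule rather than the system's implementation.

```python
def update_mhi(mhi, silhouette, timestamp, duration):
    """Update a motion history image in place.

    mhi        -- rows of float timestamps (0.0 means "no recent motion")
    silhouette -- rows of 0/1 flags, 1 where motion was detected this frame
    timestamp  -- time of the current frame
    duration   -- decay value: entries older than this are cleared
    """
    for y, row in enumerate(silhouette):
        for x, moving in enumerate(row):
            if moving:
                mhi[y][x] = timestamp          # stamp fresh motion
            elif mhi[y][x] < timestamp - duration:
                mhi[y][x] = 0.0                # forget stale motion
    return mhi
```

For example, with `timestamp=5.0` and `duration=2.0`, a moving pixel is stamped 5.0, a pixel last seen moving at 1.0 is cleared, and one last seen at 4.0 is kept.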

  17. Silhouette

  18. Motion trajectory • Note: the last 50 front edges are recorded

  19. System Flow Chart • Start • Capture from camera • Noise filter • Find the screen (edge detection) • Image diff • MHI update • Find front point • Motion interpretation • Mouse/keyboard events

  20. Find the Screen • The projected screen is located during initialization • Algorithm • Canny edge detection • Find the screen • Find all the squares in the image and choose the biggest one • Adaptive • Re-detect the screen every 10 seconds in case the camera has moved
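
The "choose the biggest one" step can be illustrated on its own. The sketch below assumes an earlier edge and contour stage (not shown) has already produced candidate shapes as lists of corner points; `polygon_area`, `pick_screen`, and the `min_area` cutoff are hypothetical names and values, not part of the original system.

```python
def polygon_area(points):
    """Area of a simple polygon [(x, y), ...] via the shoelace formula."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def pick_screen(quads, min_area=1000.0):
    """Return the largest four-corner candidate, or None if none qualifies.

    min_area is an assumed cutoff that discards tiny spurious squares.
    """
    candidates = [q for q in quads if len(q) == 4 and polygon_area(q) >= min_area]
    return max(candidates, key=polygon_area, default=None)
```

Calling `pick_screen` again on a timer would correspond to the adaptive re-detection mentioned on the slide.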

  21. Position (pixel) Mapping • Screen mapping (camera and computer) • Define the scale for coordinate translation • e.g. 800x600 (camera resolution) → 1280x800 (computer resolution) • scale-x = 1280/800 • scale-y = 800/600 • Figure: the camera image (800x600) with its origin, the detected screen with its new origin, and the computer display (1280 wide)
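
The coordinate translation amounts to a shift followed by a per-axis scale. The sketch below uses the slide's 800x600 and 1280x800 figures as defaults; the function name `camera_to_display` and the `origin` parameter (the top-left corner of the detected screen in camera pixels) are assumptions added for illustration.

```python
def camera_to_display(pt, cam_size=(800, 600), disp_size=(1280, 800), origin=(0, 0)):
    """Map a camera pixel to a display pixel.

    pt        -- (x, y) in camera coordinates
    cam_size  -- camera resolution (width, height)
    disp_size -- computer resolution (width, height)
    origin    -- assumed new origin of the detected screen, in camera pixels
    """
    x, y = pt
    ox, oy = origin
    cam_w, cam_h = cam_size
    disp_w, disp_h = disp_size
    # scale-x = disp_w / cam_w, scale-y = disp_h / cam_h, as on the slide
    return ((x - ox) * disp_w / cam_w, (y - oy) * disp_h / cam_h)
```

With the defaults, the camera centre (400, 300) maps to (640.0, 400.0), the centre of the display. If the detected screen fills only part of the camera image, the scale would more naturally be computed from the detected screen's size rather than the full camera resolution.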

  22. Event definition • To define mouse or keyboard events • Mouse click • If the image density drops dramatically (by more than 70%~80%), the position of the last front edge is taken as the coordinate of a mouse click • Page Up (PgUp) • If the above action happens from the left side of the screen, it is defined as a "PgUp" event • Close the current Windows application • Three consecutive error detections within 8 seconds
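
The click and PgUp rules might be expressed as below. This is a sketch under several assumptions: "density" is taken to mean the fraction of motion pixels in the mask, the 0.75 threshold is picked from the middle of the slide's 70%~80% range, and "from the left side" is interpreted as the front edge lying within an assumed left margin of the screen. The names `density` and `classify_event` are hypothetical.

```python
def density(mask):
    """Fraction of non-zero pixels in a binary motion mask."""
    total = sum(len(row) for row in mask)
    if total == 0:
        return 0.0
    return sum(1 for row in mask for v in row if v) / total


def classify_event(prev_density, cur_density, front_edge, screen_width,
                   drop_threshold=0.75, left_margin=0.1):
    """Return ("click", (x, y)), ("page_up", None), or None.

    drop_threshold and left_margin are assumed values, not from the slides.
    """
    if prev_density <= 0:
        return None
    drop = (prev_density - cur_density) / prev_density
    if drop <= drop_threshold:
        return None  # motion did not stop abruptly enough to count as a touch
    x, y = front_edge
    if x < left_margin * screen_width:
        return ("page_up", None)
    return ("click", (x, y))
```

The close-application rule (three consecutive errors within 8 seconds) is left out of this sketch.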

  23. Noise filtering • False positives • Motion trajectories are recorded to filter out false positive signals (partly implemented) • Signal bouncing • A 10-second debounce interval is applied after a valid mouse/keyboard event is detected
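
The signal-bouncing rule can be sketched as a small gate that rejects events arriving too soon after the last accepted one. The class name `Debouncer` and its interface are hypothetical; only the 10-second interval comes from the slide.

```python
class Debouncer:
    """Suppresses events that follow an accepted event too closely."""

    def __init__(self, interval=10.0):
        self.interval = interval      # seconds; 10 on the slide
        self.last_accepted = None     # timestamp of the last accepted event

    def accept(self, timestamp):
        """Return True if the event at `timestamp` should be acted on."""
        if (self.last_accepted is not None
                and timestamp - self.last_accepted < self.interval):
            return False
        self.last_accepted = timestamp
        return True
```

Rejected events do not restart the interval here, so a steady stream of bounces cannot block input indefinitely; whether the original system behaved this way is not stated.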

  24. Performance • CPU: Pentium M Processor 730 (1.6 GHz) • HaarDetectObjects (typical) • 5 fps (640x480): 80% CPU usage • 3 fps (640x480): 30% CPU usage • 3 fps (640x480, hand+face classifier): 50% CPU usage • Motion Template (typical) • 3 fps (176x144): 2~5% CPU usage • 3 fps (640x480): 5~8% CPU usage • 3 fps (800x600): 10% CPU usage

  25. System Limitations • High error rate when the hand moves fast • Can be solved by increasing the frame rate • An unexpected stop in the middle of the screen causes a false alarm • Shadows affect correctness • If the screen is not detected well, or if the mapping is distorted, position accuracy will be very low

  26. Future Work • To improve the accuracy • To port the system to a handheld device • To advance to a real steerable interface (something like "Minority Report") in which a user can drag icons directly on the projected screen

  27. Q&A
