This project develops a navigation solution for visually impaired individuals through auditory feedback. By combining visual information processing from a webcam with 3D sound creation, we aim to provide users with a spatial auditory environment that allows them to navigate effectively. Using computer vision algorithms, the system converts visual input into auditory signals that convey important information about the surroundings, facilitating safer and more independent mobility. This multimodal approach not only improves navigation capabilities but also contributes to ongoing research in cognitive science and acoustics.
ARD Presentation, December 2010 AISN: Auditory Imaging for Sightless Navigation http://www.cs.bgu.ac.il/~royif/AISN
Project Team Academic Advisor: Dr. Yuval Elovici Technical Advisor: Dr. Rami Puzis Team Members: Yakir Dahan, Royi Freifeld, Vitali Sepetnitsky
The Problem Domain • Most of our everyday navigation depends heavily on the visual feedback we get from our environment • When the ability to see the surroundings is lost due to visual impairment, the ability to navigate is impaired as well
Existing Solutions • Physical sense: • White Cane • Guide Dog • Sensory substitution: • Warning of obstacles (e.g. Optical Radar) • Sonar-like image scanning (e.g. The vOICe)
Vision and Main Goals • Sightless navigation by sensory substitution • Development of an application that allows a person to navigate, relying primarily on the sense of hearing • Integration with a spatial auditory environment • Providing a flexible environment for future research
Our Solution A combination of visual information processing and 3D sound creation and positioning: • Taking a stream of frames from a web-camera • Processing the frames and retrieving visual information relevant to the user • Creating appropriate sounds according to the recognized information • Performing an auditory spatialization of the sounds and informing the user about the locations of the detected information
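As an illustration of the first two steps, here is a minimal sketch (not the project's actual code) of the vision half of the pipeline: grabbing frames from a web-camera with OpenCV and extracting feature points whose image coordinates can later be mapped to 3D sound positions. All parameter values are placeholders.

#include <opencv2/opencv.hpp>
#include <iostream>
#include <vector>

int main() {
    cv::VideoCapture camera(0);                     // default web-camera
    if (!camera.isOpened()) return 1;

    cv::Mat frame, gray;
    std::vector<cv::Point2f> features;

    while (camera.read(frame)) {
        cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);
        // Points of interest; max 20 points, quality 0.01 and min distance
        // 10 px are placeholder parameters, not the project's values.
        cv::goodFeaturesToTrack(gray, features, 20, 0.01, 10);

        for (const cv::Point2f& p : features) {
            // Normalize to [-1, 1] so the audio stage can place a sound
            // source left/right (x) and up/down (y) of the listener.
            float x = 2.0f * p.x / gray.cols - 1.0f;
            float y = 1.0f - 2.0f * p.y / gray.rows;
            std::cout << "feature at (" << x << ", " << y << ")\n";
        }
    }
    return 0;
}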
External Interfaces: Hardware and Software • OpenCV • OpenAL • MATLAB engine library
System Users • End Users • Visually impaired (or even blind) people who use the system to hear their physical environment • Configuration Users • System installation and initial tuning, such as creating user profiles, will be done by configuration users who are able to see the operations they perform • Researchers • Cognitive science researchers who wish to conduct experiments regarding 3D sound
Functional Requirements: Core Functionality • For all users (especially the researcher): • Support several types of computer vision and image processing algorithms for extracting the following information: • Feature points (points of interest) • Contours • Blobs (regions that are either darker or brighter than their surroundings) • Provide a utility for adding new implementations of the above algorithms according to a predefined API (see the sketch below) • Support specific configurability options for each algorithm type
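The predefined API is not specified here; the following is a hedged sketch of what such a plug-in interface could look like in C++. The interface name, the DetectedItem structure, and its fields are assumptions for illustration only.

#include <opencv2/opencv.hpp>
#include <string>
#include <vector>

// One detected item (feature point, contour centroid, or blob) with a screen
// location and an intensity value the audio stage can map to volume or pitch.
struct DetectedItem {
    cv::Point2f location;
    float       brightness;
};

class VisionAlgorithm {
public:
    virtual ~VisionAlgorithm() {}
    // Human-readable name shown in the configuration UI.
    virtual std::string name() const = 0;
    // Algorithm-specific settings, e.g. "maxCorners=20;quality=0.01".
    virtual void configure(const std::string& options) = 0;
    // Extract items of interest from one camera frame.
    virtual std::vector<DetectedItem> process(const cv::Mat& frame) = 0;
};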
Functional Requirements (cont.): Core Functionality (cont.) • For all users: • Create appropriate sounds according to the following features: • Location • Brightness • Color • Support sound spatialization using OpenAL API implementations and HRTF datasets conforming to a predefined format • Allow installing new HRTF datasets and OpenAL implementations to improve the quality of sound localization and for research purposes
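To make the spatialization requirement concrete, here is a minimal, self-contained OpenAL sketch that synthesizes a short tone and positions it to the listener's upper left; with an HRTF-enabled OpenAL implementation the source is rendered binaurally over headphones. The tone frequency, duration, and position are arbitrary example values.

#include <AL/al.h>
#include <AL/alc.h>
#include <chrono>
#include <cmath>
#include <thread>
#include <vector>

int main() {
    // Open the default audio device and create a context.
    ALCdevice*  device  = alcOpenDevice(nullptr);
    ALCcontext* context = alcCreateContext(device, nullptr);
    alcMakeContextCurrent(context);

    // Fill a buffer with a 250 ms, 440 Hz tone; the buffer must be mono
    // for OpenAL's 3D positioning to take effect.
    const int rate = 44100;
    std::vector<short> samples(rate / 4);
    for (std::size_t i = 0; i < samples.size(); ++i)
        samples[i] = static_cast<short>(
            32000 * std::sin(2.0 * 3.14159265 * 440.0 * i / rate));

    ALuint buffer, source;
    alGenBuffers(1, &buffer);
    alBufferData(buffer, AL_FORMAT_MONO16, samples.data(),
                 static_cast<ALsizei>(samples.size() * sizeof(short)), rate);
    alGenSources(1, &source);
    alSourcei(source, AL_BUFFER, buffer);

    // Place the source to the listener's upper left and play it.
    alSource3f(source, AL_POSITION, -1.0f, 0.5f, -1.0f);
    alSourcePlay(source);
    std::this_thread::sleep_for(std::chrono::milliseconds(300));

    alDeleteSources(1, &source);
    alDeleteBuffers(1, &buffer);
    alcMakeContextCurrent(nullptr);
    alcDestroyContext(context);
    alcCloseDevice(device);
    return 0;
}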
Functional Requirements: Operational Functionality • For the configuration user: • Ability to install the system along with all the peripheral software and an initial set of HRTF datasets • User profile management: • Support creation of user profiles, which store the system settings optimized for the user's preferences • Support the ability to view the settings stored in a user profile • Support the ability to modify and delete profiles • Supply a set of predefined (default) profiles used for initial system configuration • Ability to initialize the system according to a given user profile and switch between profiles
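The contents of a profile are not fixed by these requirements; the sketch below shows one plausible shape for the stored settings, with field names and a plain-text format that are purely illustrative.

#include <fstream>
#include <string>

// Hypothetical settings a user profile might hold.
struct UserProfile {
    std::string name;            // profile identifier
    std::string algorithm;       // selected vision algorithm
    std::string hrtfDataset;     // HRTF dataset used for spatialization
    float       masterVolume;    // 0.0 .. 1.0
};

void saveProfile(const UserProfile& p, const std::string& path) {
    std::ofstream out(path);
    out << "name="      << p.name         << '\n'
        << "algorithm=" << p.algorithm    << '\n'
        << "hrtf="      << p.hrtfDataset  << '\n'
        << "volume="    << p.masterVolume << '\n';
}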
Functional Requirements: Operational Functionality (cont.) • For the blind user: • Support an extensive training mechanism for: • 3D sound perception • Environment understanding • Support the following training types: • Visualizing random shapes • Visualizing pre-defined image files • Fully immersive use of the system with emphasis on a selected feature • For the researcher: • Support defining a training experiment task • Support recording the task results and retrieving them later
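The "random shapes" training type could be implemented by synthesizing frames and feeding them into the same pipeline that normally receives camera input. The following OpenCV sketch, with arbitrary sizes and shape choices, illustrates the idea.

#include <opencv2/opencv.hpp>
#include <cstdlib>

// Draw one random bright shape on a dark canvas; the trainee then has to
// tell, from the spatialized sound alone, where the shape is located.
cv::Mat randomShapeFrame(int width = 640, int height = 480) {
    cv::Mat canvas = cv::Mat::zeros(height, width, CV_8UC3);
    cv::Point center(std::rand() % width, std::rand() % height);
    int radius = 20 + std::rand() % 60;
    if (std::rand() % 2 == 0)
        cv::circle(canvas, center, radius, cv::Scalar(255, 255, 255), -1);
    else
        cv::rectangle(canvas, center - cv::Point(radius, radius),
                      center + cv::Point(radius, radius),
                      cv::Scalar(255, 255, 255), -1);
    return canvas;
}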
Non-Functional Requirements: Performance Constraints (partial) • Speed requirements: • Response time: The system will produce a 3D sound according to a frame taken by the camera within 0.1 seconds at most (we will strive for 0.03 seconds, i.e. 30 fps) • Training speed: • A simple training session, aimed at reaching 50% recognition accuracy, should take no more than 30 minutes for a blind user • A blind user should pass at least 80% of the accuracy tests after 2 days of extensive system usage • A regular user should pass at least 80% of the accuracy tests after 3 days of extensive system usage
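One straightforward way to check the response-time budget during development is to time the full frame-to-sound path. In the sketch below, processFrame() and emitSound() are hypothetical stand-ins for the system's vision and audio stages.

#include <chrono>
#include <iostream>

void processFrame() { /* vision stage: extract features from one frame */ }
void emitSound()    { /* audio stage: spatialize and play the sound    */ }

int main() {
    auto t0 = std::chrono::steady_clock::now();
    processFrame();
    emitSound();
    auto t1 = std::chrono::steady_clock::now();
    double ms = std::chrono::duration<double, std::milli>(t1 - t0).count();
    std::cout << "frame-to-sound latency: " << ms << " ms"
              << (ms <= 100.0 ? " (within the 0.1 s budget)" : " (too slow)")
              << '\n';
    return 0;
}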
Non-Functional Requirements (cont.): Performance Constraints (cont.) • Portability requirements: • Currently the system is designed to be deployed on Microsoft Windows (XP / Vista / 7 and later) operating systems only • The system will be compatible with 32/64-bit machines that have web-camera and audio drivers installed • Capacity requirements: • The system should work on machines with at least 1 GB of RAM • The system will support many different OpenAL implementations and HRTF datasets; the only limit is hard-disk capacity
Non-Functional Requirements (cont.): Look, Feel and Use Constraints • User interface requirements: • The UI should be easy to use even for users who are not familiar with computer technology • The user interface will be in English • Documentation and help: • Extensive documentation will be provided, along with an installation guide • Operations will be implemented as wizards • Error messages will be heard via headphones
Non-Functional Requirements (cont.): SE Project + Platform Constraints • The application core and the UI will be written in C++ using the .NET 3.5 Framework and the Visual Studio 10.0 IDE • MATLAB will be used as a computational engine • During the development stage a simple home-made device will be used (a PC web-camera strapped to the top of a pair of headphones) • For demo and testing purposes, a real device will be supplied by DT labs: spy sunglasses (sunglasses with a tiny camera hidden in the nose bridge)
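As an illustration of the "MATLAB as a computational engine" constraint, the sketch below opens a MATLAB session from C++ via the MATLAB engine library (engine.h) and evaluates an arbitrary expression; the computation itself is a placeholder, not the project's actual workload.

#include <engine.h>
#include <iostream>

int main() {
    Engine* ep = engOpen(nullptr);      // start a local MATLAB session
    if (!ep) { std::cerr << "MATLAB engine failed to start\n"; return 1; }

    // Run a computation inside MATLAB and read the result back.
    engEvalString(ep, "y = fft(sin(0:0.01:2*pi));");
    mxArray* y = engGetVariable(ep, "y");
    if (y) {
        std::cout << "received " << mxGetNumberOfElements(y) << " samples\n";
        mxDestroyArray(y);
    }
    engClose(ep);
    return 0;
}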
Usage Scenarios Use Cases: UC-1: Visualize Environment A blind user starts the visualization process
Usage Scenarios (cont.) Use Cases: UC-2: Train A blind user performs a training process
Usage Scenarios (cont.) Use Cases: UC-3: Choose a User Profile A blind user chooses an existing user profile in order to perform training or to use the system
Usage Scenarios (cont.) Use Cases: UC-4: Visualize Image The core of the visualization process
Questions? Thank you!