The Phone Reader converts images of text into text using OCR and reads the result aloud on the device, serving visually impaired users, people who cannot read, and language learners. The project covers the detailed design and architecture of the OCR and TTS modules. Next steps are to integrate the modules in an emulator and then port them to a mobile platform.
Phone Reader Project
By: Hossein and Hadi Shayesteh
Supervisor: Mr. James Connan
Introduction
Phone Reader
• Converting an image to text using OCR
• Reading the text aloud using TTS
Potential Users
• Intended for blind and illiterate users
• People dealing with a foreign language
HLD / LLD of OCR (diagram)
HLD / LLD of OCR
Feature Extraction
• Assigning PixelNum to each class
• Calculating LociNum using the Characteristic Loci approach
• Creating a Feature Vector for each segment
Classification
• Creating 38 classes
• Creating a Feature Vector for each class
• Applying a Binary Mask
• Calculating the Euclidean Distance
• Classifying the input
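The classification steps above can be sketched as a nearest-prototype search. This is only an illustrative sketch, not the project's implementation: the slides do not say how the binary mask is applied, so the code assumes the mask selects which feature positions contribute to the distance, d = sqrt(sum_i m_i (x_i - c_i)^2). The names `masked_euclidean_distance`, `classify`, `prototypes`, and `masks` are hypothetical, and the toy data uses two classes instead of 38.

```python
import math

def masked_euclidean_distance(features, prototype, mask):
    """Euclidean distance over the positions where mask is 1 (assumed mask semantics)."""
    if not (len(features) == len(prototype) == len(mask)):
        raise ValueError("vectors and mask must have the same length")
    return math.sqrt(
        sum(m * (f - p) ** 2 for f, p, m in zip(features, prototype, mask))
    )

def classify(features, prototypes, masks):
    """Return the label of the class whose feature vector is nearest to the input."""
    return min(
        prototypes,
        key=lambda label: masked_euclidean_distance(
            features, prototypes[label], masks[label]
        ),
    )

# Toy data: hypothetical class feature vectors and per-class binary masks.
prototypes = {"A": [4.0, 1.0, 0.0], "B": [1.0, 5.0, 2.0]}
masks = {"A": [1, 1, 0], "B": [1, 1, 1]}

# The third feature is ignored for class "A" because its mask bit is 0.
label = classify([4.0, 2.0, 9.0], prototypes, masks)
print(label)
```

With a per-class mask, a class can ignore features that are unreliable for it; a single global mask would be an equally plausible reading of the slide.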
Low Level Design / TTS
Word Pronunciation (UnitConcatenator)
• Accepts the text
• Checks the database for the word's pronunciation
• Falls back to letter-to-sound rules if the word is not in the database
• Outputs a sequence of phonemes
• Passes the pronunciation to the prosody stage
Play Audio
• Engine receives the phonemes
• Loads the digital audio from a database
• Adjusts pitch, timing, and volume
• Sends the audio to the sound card
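The word pronunciation stage above amounts to a dictionary lookup with a rule-based fallback. The sketch below illustrates that control flow only. The lexicon entries, the phoneme symbols, and the one-phoneme-per-letter rule table are all made-up placeholders, and `pronounce`, `letter_to_sound`, and `text_to_phonemes` are hypothetical names rather than the project's API.

```python
# Hypothetical pronunciation database: word -> phoneme sequence.
LEXICON = {
    "phone": ["F", "OW", "N"],
    "read": ["R", "IY", "D"],
}

# Toy letter-to-sound rules: one phoneme per letter (real rules are context-sensitive).
LETTER_TO_SOUND = {
    "a": "AE", "b": "B", "c": "K", "d": "D",
    "e": "EH", "n": "N", "o": "AA", "t": "T",
}

def letter_to_sound(word):
    """Fallback: map each known letter to a phoneme, skipping unknown letters."""
    return [LETTER_TO_SOUND[ch] for ch in word if ch in LETTER_TO_SOUND]

def pronounce(word):
    """Look the word up in the lexicon; fall back to letter-to-sound rules."""
    key = word.lower()
    if key in LEXICON:
        return LEXICON[key]
    return letter_to_sound(key)

def text_to_phonemes(text):
    """Split text into words and concatenate their phoneme sequences."""
    phonemes = []
    for word in text.split():
        phonemes.extend(pronounce(word.strip(".,;:!?")))
    return phonemes

print(text_to_phonemes("Read cat."))
```

In the design described on the slide, the resulting phoneme sequence would then go to the prosody stage, which is not modeled here.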
Overall Software Architecture
OCR Package
TTS Package
Main Method
• Captures the text image
• Invokes the OCR package
• Sends the extracted text to the TTS package
• Reads out the text
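The main method's flow can be sketched as a small pipeline that wires the two packages together. This is a minimal sketch under assumptions: `read_aloud`, `recognize_text`, and `speak` are hypothetical names standing in for the OCR and TTS package entry points, which the slides do not specify.

```python
def read_aloud(image, recognize_text, speak):
    """Run OCR on the captured image, then pass the extracted text to TTS."""
    text = recognize_text(image)   # stand-in for the OCR package
    if text.strip():               # skip speech when nothing was recognized
        speak(text)                # stand-in for the TTS package
    return text

# Usage with stand-in components instead of real OCR and TTS engines.
spoken = []
result = read_aloud(b"fake image bytes", lambda img: "hello world", spoken.append)
print(result, spoken)
```

Passing the OCR and TTS steps in as callables keeps the main method independent of either engine, which matches the plan of developing in an emulator first and porting to a mobile platform later.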
Project Plan
Term 3
• Implementing the OCR and TTS engines in an emulator environment
• Integrating the OCR and TTS engines
Term 4
• Porting the complete package to the mobile platform
• Testing the final package
Project Demo
Functionality Demo
• Demonstrating project functionality using a Win32 application
User Interface Demo
• Demonstrating the project's user interface using a mobile emulator