Developing Computer Vision Applications for Android

This presentation contains video demos of two Computer Vision-based applications: a Document Scanner and a Video Lecture Note Taker. It outlines the Computer Vision components used in each application and details the libraries used to build each component.

vonistudios

Presentation Transcript


  1. Developing Computer Vision Applications for Android: Video Lecture Notes Generator and Document Scanner
     Teyvonia Thomas

  2. Document Scanner Features
     ● Automatic Document Border Detection
     ● Automatic Rectification (converts the image of the document to a fronto-parallel view)
     ● Automatic Document Enhancement
     ● Other Document Enhancements, such as Black/White, Grayscale, Brightness, and Contrast
     ● Automatic Text Detection: the user can then share the text in documents
     ● PDF Generation
     ● Share a PDF or multiple images of a Document
     ● Multi-page Scanning
     ● Multi-document Generation

  3. Document Scanning Pipeline
     Input RGB Image → Canny Edge Detection → Contour Extraction → Crop and Brighten → Output
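The slide does not say how the document border is chosen from the extracted contours. A minimal sketch of one common approach, assuming each contour has already been simplified to a polygon, is to keep the largest four-corner polygon. The names `polygon_area`, `pick_document_border`, and the `min_area` threshold are hypothetical and not taken from the app.

```python
def polygon_area(points):
    """Shoelace formula: area of a simple polygon given as [(x, y), ...]."""
    total = 0.0
    n = len(points)
    for k in range(n):
        x1, y1 = points[k]
        x2, y2 = points[(k + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def pick_document_border(contours, min_area=1000.0):
    """Return the largest four-corner contour, or None if none is large enough.

    `min_area` is an assumed cut-off (in square pixels) that rejects small
    rectangles such as stamps or logos; it is not a value from the slides.
    """
    quads = [c for c in contours if len(c) == 4]
    best = max(quads, key=polygon_area, default=None)
    if best is None or polygon_area(best) < min_area:
        return None
    return best


# Toy contours: a page-sized quadrilateral, a small square, and a triangle.
page = [(10, 10), (210, 12), (208, 300), (8, 298)]
stamp = [(50, 50), (70, 50), (70, 70), (50, 70)]
triangle = [(0, 0), (400, 0), (0, 400)]

border = pick_document_border([stamp, triangle, page])
```

Here `border` is the `page` polygon: the triangle is ignored because it does not have four corners, and the small square loses on area.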

  4. Document Enhancement
     Black/White Enhancement (using Thresholding): Adaptive Thresholding, Simple Binary Thresholding
     Brightness (β) and Contrast (α) Enhancement: output_img(i,j) = α · input_img(i,j) + β
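The enhancement operations on this slide can be sketched in plain Python on a grayscale image stored as a list of rows. This is an illustration only, not the app's implementation: the function names, the clamping to 0..255, and the `radius` and `offset` parameters of the adaptive variant are assumptions.

```python
def adjust_brightness_contrast(img, alpha, beta):
    """Apply output_img(i,j) = alpha * input_img(i,j) + beta, clamped to 0..255."""
    return [
        [max(0, min(255, int(round(alpha * p + beta)))) for p in row]
        for row in img
    ]


def binary_threshold(img, thresh):
    """Simple binary thresholding: one global cut-off for the whole image."""
    return [[255 if p > thresh else 0 for p in row] for row in img]


def adaptive_threshold(img, radius=1, offset=5):
    """Adaptive thresholding: compare each pixel with its local mean minus offset."""
    h, w = len(img), len(img[0])
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            # Collect the neighbourhood, clipped at the image borders.
            window = [
                img[y][x]
                for y in range(max(0, i - radius), min(h, i + radius + 1))
                for x in range(max(0, j - radius), min(w, j + radius + 1))
            ]
            local_mean = sum(window) / len(window)
            row.append(255 if img[i][j] > local_mean - offset else 0)
        out.append(row)
    return out


img = [[10, 100], [200, 250]]
brighter = adjust_brightness_contrast(img, alpha=1.5, beta=20)
black_white = binary_threshold(img, thresh=127)
```

With α = 1.5 and β = 20, the pixel 10 becomes 35 and the pixel 200 saturates at 255. The adaptive variant uses a different cut-off for each neighbourhood, which tends to cope better with unevenly lit pages than a single global threshold.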

  5. Text Detection

  6. Video Note Taker Features
     ● Automatically extracts unique pages of notes from videos with thousands of frames
     ● e.g. the 2 unique pages of notes were extracted in a few seconds from the 8-minute video lecture on the previous slide
     ● Pages are displayed as thumbnails below the video lecture
     ● Clicking on a page thumbnail takes the user to a full-page view
     ● Clicking on any line in the notes (from the page view) takes the user back to the point in the video where that line was being written and discussed by the lecturer (dynamic video lecture seeking)

  7. Feature Detection, Descriptor Extraction, Descriptor Matching
     (Figures: FAST feature, FREAK descriptor, BRISK descriptor)
     1. Feature Detection using FAST (Features from Accelerated Segment Test)
     2. Descriptor Extraction using BRISK/ORB/FREAK
        A binary descriptor is composed of three parts:
        1. A sampling pattern: where to sample points in the region around the feature.
        2. Orientation compensation: some mechanism to measure the orientation of the keypoint and rotate it to compensate for rotation changes.
        3. Sampling pairs: the pairs to compare when building the final descriptor.
     3. Binary Descriptor Matching
        Hamming Distance = sum(XOR(string1, string2))
        e.g. the Hamming distance between 1011101 and 1001001 is 2.
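The Hamming distance used for binary descriptor matching can be sketched as follows. The helper names are illustrative, and the brute-force search in `match_descriptor` is an assumption about how matches are found; the slides only name the distance.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count differing positions of two equal-length bit strings."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def hamming_distance_int(a: int, b: int) -> int:
    """Same distance for descriptors packed into integers: popcount of XOR."""
    return bin(a ^ b).count("1")


def match_descriptor(query: int, candidates: list) -> tuple:
    """Brute-force nearest neighbour: (index, distance) of the closest candidate."""
    best = min(
        range(len(candidates)),
        key=lambda k: hamming_distance_int(query, candidates[k]),
    )
    return best, hamming_distance_int(query, candidates[best])


# The example from the slide: the strings differ in two positions.
d = hamming_distance("1011101", "1001001")  # 2

# Closest of three candidate descriptors to the query 1011101.
index, distance = match_descriptor(0b1011101, [0b0000000, 0b1001001, 0b1011100])
```

Because the distance reduces to an XOR followed by a bit count, binary descriptors are cheap to compare, which suits mobile devices.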

  8. Video Note Taking Method
     Feature Detection and Matching in Consecutive Frames
     New Page Detection and Generation based on the ratio of features matched between consecutive frames and the average displacement of corresponding features (for boards that “move” during lectures)
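The slide names the two signals used for new page detection but not how they are combined. The sketch below is one possible reading, under stated assumptions: a frame starts a new page when too few features from the previous frame are matched, or when the matched features have shifted far on average. The function name `is_new_page` and both thresholds are hypothetical.

```python
from math import hypot


def is_new_page(num_prev_features, matches, min_match_ratio=0.3, max_avg_shift=40.0):
    """Decide whether the current frame starts a new page of notes.

    `matches` is a list of ((x_prev, y_prev), (x_cur, y_cur)) point pairs for
    features matched between the previous and current frame. Both threshold
    values are assumed for illustration, not taken from the app.
    """
    if num_prev_features == 0 or not matches:
        return True  # nothing in common with the previous frame

    # Signal 1: ratio of previous-frame features that found a match.
    match_ratio = len(matches) / num_prev_features
    if match_ratio < min_match_ratio:
        return True

    # Signal 2: average displacement of corresponding features,
    # which grows when the board itself moves.
    avg_shift = sum(
        hypot(cx - px, cy - py) for (px, py), (cx, cy) in matches
    ) / len(matches)
    return avg_shift > max_avg_shift


steady = [((0, 0), (1, 0)), ((10, 10), (10, 11)), ((5, 5), (5, 5))]
moved = [((0, 0), (0, 100)), ((10, 10), (10, 110))]

same_page = is_new_page(4, steady)        # most features matched, little motion
erased = is_new_page(100, steady)         # only 3 of 100 features matched
board_moved = is_new_page(2, moved)       # all matched, but shifted by 100 px
```

In this reading, `same_page` is False while `erased` and `board_moved` are True. A real application would need to tune both thresholds on actual lecture footage.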

  9. Computer Vision Libraries for Android used to create the 2 Apps
     ● OpenCV4Android SDK
     ● Tesseract (“tess-two”)

  10. What's in OpenCV?
     Image Segmentation, Face Detection, People Detection, Image Stitching, Object Detection and Matching, Image Inpainting, Background Subtraction, Motion Tracking

  11. Questions?
