Advanced OCR with OmniPage and FineReader

PowerPoint Slideshow about 'Advanced OCR with OmniPage and FineReader' - kuniko


Presentation Transcript
Overview
  • Optical character recognition
  • Structural recognition
  • Options
  • Loading
  • Zoning
  • OCR
  • Editing
Optical Character Recognition (OCR)
  • OCR turns pictures of text into e-text
  • Does well unless…
    • The picture is fuzzy
    • The contrast is poor
    • The font is unusual
    • The font is too small or too large
    • The material has unusual characters
Structural Recognition
  • Analyzes the layout of the page
    • Columns
    • Headings
    • Graphics
    • Tables
  • Usually does fairly well, unless the layout is non-standard
Programs that Run OCR
  • Programs for consumers
    • Kurzweil 1000, 3000
    • OpenBook
    • Intel Reader
    • Many others…
  • Programs for production
    • ABBYY FineReader
    • Nuance OmniPage
Consumer Programs
  • Highly automated
  • Designed for individuals who have print disabilities
  • Are not good production tools
    • Do not provide flexibility
    • Do not allow much overriding
    • Interfaces not designed for editing
Production Programs in General
  • A good program for production allows you to…
    • Control the zones (areas or blocks of text and graphics)
      • Add, delete, change
    • Edit easily
    • Improve recognition
Preferred Programs
  • ABBYY FineReader
    • Relatively easy to learn
    • Fairly intuitive
    • Good structural recognition
  • Nuance OmniPage
    • Less intuitive but more accessible
    • Often does better with technical materials
Both Good Tools
  • If you can afford to have both, it’s nice, but not absolutely necessary.
  • If you have both, run a couple of test pages through each to see which does better on a particular job.
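
One way to make that comparison less subjective is to proofread a test page by hand and score each program's output against it. The following is a minimal Python sketch, assuming each program's result has been exported as plain text. The function names `similarity` and `better_engine` and the sample strings are hypothetical, and the score is a rough similarity measure from the standard `difflib` module, not a formal OCR accuracy metric.

```python
from difflib import SequenceMatcher


def similarity(reference: str, ocr_output: str) -> float:
    """Return a rough 0.0-1.0 similarity between two texts."""
    # Collapse runs of whitespace so line-wrapping differences are not counted.
    ref = " ".join(reference.split())
    out = " ".join(ocr_output.split())
    return SequenceMatcher(None, ref, out, autojunk=False).ratio()


def better_engine(reference: str, outputs: dict[str, str]) -> str:
    """Return the name of the output closest to the proofread reference."""
    return max(outputs, key=lambda name: similarity(reference, outputs[name]))


if __name__ == "__main__":
    # Hypothetical sample data: a hand-corrected line and two OCR results.
    reference = "The quick brown fox jumps over the lazy dog."
    outputs = {
        "engine_a": "The quick brown fox jumps over the lazy dog.",
        "engine_b": "Tbe qu1ck brown fax jumps ovcr the 1azy dog.",
    }
    for name, text in outputs.items():
        print(name, round(similarity(reference, text), 3))
    print("closest:", better_engine(reference, outputs))
```

A score like this only ranks the outputs on the pages you tested, so it is still worth glancing at the kinds of errors each program makes.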
Under the Hood
  • For best results with a program, set up your options before you begin!
  • Tools > Options
Lots of Languages
  • FineReader and OmniPage handle multiple languages.
  • For foreign-language material, turn on every language that appears in the book.
    • The program will then recognize the diacritical marks.
    • Turn on what you need, but only what you need.
Math
  • If you are running OCR on math, try turning on Greek.
    • Greek will allow the program to recognize alphas, deltas, sigmas, etc.
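
If you are unsure which alphabets a book actually uses, a quick check on a hand-typed or already proofread sample can help you decide what to turn on. The sketch below is a rough heuristic in Python, not a feature of either OCR program: the function name `scripts_used` is hypothetical, and it simply takes the first word of each letter's Unicode name as the script.

```python
import unicodedata


def scripts_used(text: str) -> set[str]:
    """Return the script names (e.g. LATIN, GREEK) of the letters in text."""
    scripts = set()
    for ch in text:
        if ch.isalpha():
            # Unicode names start with the script, e.g. "GREEK SMALL LETTER ALPHA".
            name = unicodedata.name(ch, "")
            if name:
                scripts.add(name.split()[0])
    return scripts


if __name__ == "__main__":
    # Hypothetical sample: a line of math that mixes Latin and Greek letters.
    print(sorted(scripts_used("Δx = αβ")))
```

If `GREEK` shows up for a math sample, that is a hint that turning on Greek is worth trying. A few symbols have names that do not begin with a script (the micro sign, for example), so treat the result as a guide only.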
Another Decision
  • Detect page orientation or not?
    • Detection does not always get it right
    • Try it if many of your pages are rotated
Considerations
  • You may or may not want to keep headers and footers.
    • I generally keep them so that I can pull the page numbers.
  • You may want to keep the page breaks.
    • Retaining page breaks helps to maintain one-to-one page correspondence with the book.
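
A simple sanity check on page correspondence is to count the pages in the exported text. This Python sketch assumes the export is plain text with each retained page break written as a form-feed character (`\f`). That is an assumption about the export format, not something either program guarantees, and the function names are hypothetical.

```python
def split_pages(exported_text: str) -> list[str]:
    """Split exported text into pages on the form-feed character."""
    # Assumption: the export marks each retained page break with "\f".
    return exported_text.split("\f")


def check_page_count(exported_text: str, pages_scanned: int) -> bool:
    """Return True when the export has one text page per scanned page."""
    return len(split_pages(exported_text)) == pages_scanned


if __name__ == "__main__":
    # Hypothetical three-page export.
    sample = "Page one text\fPage two text\fPage three text"
    print(check_page_count(sample, 3))
```

If the export ends with a trailing page break, the split produces an extra empty page, so strip it first or allow for it in the count.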
Fitting Everything
  • In some cases, you may need to work with a custom paper size to fit everything onto one page.
  • This feature can help when you are retaining all of the content on the page but not its layout.
Loading Files
  • “Open”
    • Opens saved program files
  • “Load”
    • Loads image files to process
  • Note that the same distinction between program files and other files comes up when saving!
Wizards Are Evil…
  • Do not rely on the automation
  • Load the image file and choose the processes you want
Workspace
  • The program has three primary areas
  • Pages Pane
    • Either thumbnails or details
    • Allows simple navigation of pages
  • Image Pane
    • Shows your graphic (the page image)
  • Text Pane
    • Area where the text from OCR will show
More Accessible
  • Both programs have a detail view.
    • Shows text instead of graphics
  • Detail view is more accessible for screen readers.
  • Otherwise, it is personal preference.
Two Ways to Save
  • To save the program file so you can reopen it later in the OCR program, choose File > Save
    • This saves your work file.
  • You save your converted file during the last phase of the processing.
Production Tips
  • Work with dual monitors
    • Check that your computer and video card support them
  • Stretching an OCR program across two monitors is a HUGE time-saver!
  • Learn to use keyboard shortcuts.
    • They save tons of time!