
OCR at INIS: Database Production & Imaging Group, Yves Reynaud, Y.Reynaud-Pulido@iaea.org



Presentation Transcript


  1. OCR at INIS. Database Production & Imaging Group. Yves Reynaud, Y.Reynaud-Pulido@iaea.org

  2. Some OCR features: we can find the needle in the haystack
     • OCR offers a basic search of an unstructured document.
     • OCR brings your digitized collection to life.
     • OCR adds extra value to your image.

  3. OCR is computer software that:
     • translates images of handwritten or typewritten text into machine-editable text;
     • translates pictures of characters into a standard encoding scheme that represents them (e.g. ASCII or Unicode).
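To make the idea of a "standard encoding scheme" on slide 3 concrete, here is a minimal sketch of what the recognized text looks like once it is machine-editable. The function name `encode_recognized` and the sample string are assumptions for illustration; they do not come from the presentation.

```python
def encode_recognized(text):
    """Return (character, Unicode code point, UTF-8 bytes) for each character.

    Hypothetical helper: it only shows how recognized characters are
    represented once they are text rather than pixels.
    """
    return [(ch, f"U+{ord(ch):04X}", ch.encode("utf-8")) for ch in text]


# Sample input is an assumption; any recognized string would do.
for ch, code_point, raw in encode_recognized("OCR"):
    print(ch, code_point, raw.hex())
```

Once a character is stored as a code point rather than as a picture, it can be searched, copied, and indexed like any other text.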

  4. Scanned image (paper or micrographic) • Vector image (created from a native application), shown here as a raster image for the sake of comparison

  5. “Do not see the trees (letters); try to see the forest (sentences)” F0R 488UR1N6 7H3 L0N63V17Y 0F 1NF0RM4710N, P3RH4P8 7H3 M087 1MP0R74N7 R0L3 1N 7H3 0P3R4710N 0F 4 D16174L 4RCH1V3 18 M4N461N6 7H3 1D3N717Y, 1N736R17Y 4ND QU4L17Y 0F 7H3 4RCH1V38 1783LF 48 4 7RU873D 80URC3 0F 7H3 CUL7UR4L R3C0RD.

  6. Verdana FOR ASSURING THE LONGEVITY OF INFORMATION, PERHAPS THE MOST IMPORTANT ROLE IN THE OPERATION OF A DIGITAL ARCHIVE IS MANAGING THE IDENTITY, INTEGRITY AND QUALITY OF THE ARCHIVES ITSELF AS A TRUSTED SOURCE OF THE CULTURAL RECORD.

  7. Brush Script MT (Windows Font) FOR ASSURING THE LONGEVITY OF INFORMATION, PERHAPS THE MOST IMPORTANT ROLE IN THE OPERATION OF A DIGITAL ARCHIVE IS MANAGING THE IDENTITY, INTEGRITY AND QUALITY OF THE ARCHIVES ITSELF AS A TRUSTED SOURCE OF THE CULTURAL RECORD.
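Slides 5 to 7 show the same sentence in three renderings. A human reads slide 5 by context; a program needs the substitution spelled out. The sketch below uses a digit-to-letter table inferred by comparing slide 5 with slide 6. The table and the name `decode_leet` are assumptions for illustration.

```python
# Digit-to-letter substitutions inferred by comparing slide 5 with slide 6.
# This table is an assumption for illustration, not part of the presentation.
LEET_TABLE = str.maketrans("0134678", "OIEAGTS")


def decode_leet(text):
    """Replace look-alike digits with the letters they stand in for."""
    return text.translate(LEET_TABLE)


sample = "7H3 L0N63V17Y 0F 1NF0RM4710N"
print(decode_leet(sample))  # prints "THE LONGEVITY OF INFORMATION"
```

The mapping works here only because the text contains no genuine digits. A real document mixes both, which is why context matters.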

  8. PCs ≠ Humans
     • OCR compares patterns and selects the closest match; it can be forced to a specific context, but this requires customization.
     • People adapt to circumstances and can work around misspellings if the context is clear.
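The "compare patterns and select the closest match" idea on slide 8 can be sketched with toy bitmaps. This is not how any particular OCR product works; the 5x3 templates, the Hamming-distance measure, and the `allowed` parameter (standing in for "forcing a specific context") are all assumptions chosen to keep the example small.

```python
# Toy 5x3 glyph templates ("1" = ink, "0" = background). Hypothetical data.
TEMPLATES = {
    "I": ("111", "010", "010", "010", "111"),
    "L": ("100", "100", "100", "100", "111"),
    "T": ("111", "010", "010", "010", "010"),
}


def hamming(a, b):
    """Count the pixels that differ between two equally sized bitmaps."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def closest_match(glyph, allowed=None):
    """Return the template character with the fewest differing pixels.

    `allowed` restricts the candidates, a crude stand-in for customizing
    the recognizer to a known context.
    """
    candidates = [ch for ch in TEMPLATES if allowed is None or ch in allowed]
    return min(candidates, key=lambda ch: hamming(glyph, TEMPLATES[ch]))


# A scanned "I" with one pixel lost in the top-right corner.
noisy = ("110", "010", "010", "010", "111")
print(closest_match(noisy))                      # prints "I"
print(closest_match(noisy, allowed={"L", "T"}))  # prints "T"
```

The second call shows the cost of a wrong context: the matcher still returns its best candidate, even though the true character was excluded.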

  9. True or false? Usually, an image is adequately sampled if each letter is at least two pixels in thickness:
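The two-pixel rule of thumb on slide 9 can be checked with simple arithmetic: a stroke of a given physical width covers width / 25.4 × dpi pixels. The sketch below assumes a hypothetical stroke width of 0.2 mm and a 300 dpi scan; neither figure comes from the presentation.

```python
MM_PER_INCH = 25.4


def pixels_across_stroke(stroke_mm, dpi):
    """Number of pixels spanning a stroke of the given width at this resolution."""
    return stroke_mm / MM_PER_INCH * dpi


def min_dpi_for_stroke(stroke_mm, min_pixels=2.0):
    """Lowest resolution at which the stroke spans at least `min_pixels` pixels."""
    return min_pixels * MM_PER_INCH / stroke_mm


# Hypothetical values: a 0.2 mm stroke scanned at 300 dpi.
print(round(pixels_across_stroke(0.2, 300), 2))  # prints 2.36
print(round(min_dpi_for_stroke(0.2)))            # prints 254
```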

  10. Zoom in

  11. Zoom in

  12. Results from OCR
      • It is in this context that I…
      • … and an additional protocol on the basis…

  13. Chinese in pixels

  14. Chinese vector images from OCR 滤器

  15. Arabic in pixels

  16. Arabic vector images from OCR هذ ا وشملت

  17. OCR initiatives
      • IMPACT: http://www.impact-project.eu/home/
      • EUROPEANA: http://www.europeana.eu/portal/

  18. Thank you
