How to Configure OCR in Alfresco

A simple OCR action for Alfresco. OCR is a very useful feature for any enterprise content management (ECM) system or software. This presentation shows how to configure it in Alfresco Community Edition.

contcentric

Presentation Transcript


  1. How to Configure OCR in Alfresco

     OCR (Optical Character Recognition) is the recognition of printed or written text characters by a computer. It recognizes the characters in images or scanned documents, which makes images that contain text searchable. OCR is a very useful feature for any ECM product or software. In this blog, we will see how to configure it in Alfresco Community Edition. We have tested this with Alfresco versions 5.1.f and 5.2.e; it should also work with other nearby versions.

     Read the blog: OCR in Alfresco [Video]

     Prerequisites:
     1. Alfresco Community / Enterprise Edition installed and running
     2. Basic knowledge of Alfresco administration

     Steps to Configure Tesseract:

     Note: The code sections for these 7 steps are not reproduced in this transcript. See the original source: Configuring OCR in Alfresco.

     1. Download and install Tesseract.
     2. Stop the Alfresco Tomcat server.
     3. Download the Linux / Windows context file and place at
     4. Place ocr.bat (Windows) or ocr.sh (Linux) at <ALFRESCO-HOME>/
     5. If the current user does not have read or execute permissions on ocr.sh, grant them.
     6. Add following properties in the alfresco-global.properties file located at
     7. Start the Tomcat server.

     Note: Existing files in Alfresco will not be OCRed; you have to upload new image files to test.

     Important:
     1. Make sure you are passing the correct arguments in the context file (entries in the context file differ between Windows and Linux).
     2. Check that your .bat or .sh commands work properly.
     3. Verify that Tesseract creates a text file for the image file. To verify this, go to the directory where Tesseract is installed and run the following command:

        tesseract ./<image file-name> ./<text file-name> -l eng

        If the text file is created with content in it, your Tesseract installation is working.

     Call: India +91 9925144200, USA +1 (732) 927-5544; Email: sales@contcentric.com
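The manual check in "Important" item 3 and the permission check in step 5 could also be scripted. The following is only a minimal sketch, not part of the original post: the function names are hypothetical, and it assumes the tesseract executable is on the PATH. It relies on the fact that Tesseract appends ".txt" to the output base name it is given.

```python
import os
import subprocess
from pathlib import Path


def build_tesseract_command(image, output_base, lang="eng"):
    """Return the argument list for: tesseract <image> <output_base> -l <lang>."""
    return ["tesseract", str(image), str(output_base), "-l", lang]


def tesseract_produces_text(image, output_base, lang="eng", run=subprocess.run):
    """Run Tesseract and report whether it wrote a non-empty text file.

    Tesseract appends ".txt" to the output base name, so the result is
    looked up at <output_base>.txt. The `run` parameter is a hypothetical
    hook so the command runner can be swapped out (for example in tests).
    """
    run(build_tesseract_command(image, output_base, lang), check=True)
    text_file = Path(str(output_base) + ".txt")
    if not text_file.is_file():
        return False
    # Whitespace-only output counts as "no text recognized".
    return text_file.read_text(encoding="utf-8").strip() != ""


def is_readable_and_executable(script_path):
    """Step 5: check that the current user can read and execute the script."""
    return os.access(script_path, os.R_OK | os.X_OK)
```

For example, calling tesseract_produces_text("scan.png", "scan-out") with a real scanned image (the file names here are placeholders) should return True when Tesseract is working, and is_readable_and_executable can be pointed at the ocr.sh file placed in step 4.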

  2. Comment here if your content is still not searchable. We are happy to hear about your ECM challenges, as we love solving them. Contact us!
