Optical Character Recognition for Logistics Reporting

Contributors: Joy Kamunyori, Mike Frost, Ashraf Islam. A recording of the WebEx session can be found here: https://jsi.webex.com/jsi/lsr.php?AT=pb&SP=MC&rID=75382732&rKey=f3bc9ca3232b8b42

Presentation Transcript


  1. Optical Character Recognition for Logistics Reporting. Contributors: Joy Kamunyori, Mike Frost, Ashraf Islam. A recording of the WebEx session can be found here: https://jsi.webex.com/jsi/lsr.php?AT=pb&SP=MC&rID=75382732&rKey=f3bc9ca3232b8b42

  2. Testing Methodology: Perform & Document; Select Tools; Collect Forms

  3. OCR Tools
     • OmniPage Professional 18 (desktop-based, licensed)
     • Abbyy FineReader 11 (desktop-based, licensed)
     • Tesseract-OCR (desktop-based, open-source)
     • Evernote (mobile phone–based, free)
     • Captricity (web-based, paid)

  4. Testing Protocol
     • Pass field-filled logistics management information system (LMIS) form through application
     • Fill out blank LMIS form carefully and pass through
     • Record number of correctly vs. incorrectly identified fields (numeric)
     • Calculate character recognition accuracy rates
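The last two protocol steps can be sketched in a few lines of Python. The slides do not give the exact formula, so this sketch assumes the accuracy rate is the share of numeric fields recognized correctly out of all fields scored. The function name and the sample tally are hypothetical.

```python
def accuracy_rate(correct: int, incorrect: int) -> float:
    """Return the share of fields recognized correctly, as a percentage.

    Assumption: accuracy = correct / (correct + incorrect), counted per field.
    """
    total = correct + incorrect
    if total == 0:
        raise ValueError("no fields were scored")
    return 100.0 * correct / total


# Hypothetical tally for one form: 13 numeric fields read correctly, 87 misread.
print(f"{accuracy_rate(13, 87):.0f}%")  # prints "13%"
```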

  5. Form 1: Tanzania Essential Medicines R&R

  6. Form 2: Tanzania Essential Medicines Supplementary Form

  7. Form 3: Zimbabwe ARV R&R

  8. OmniPage Professional 18
     • Licensed tool: $499.99
     • General impressions:
       • Easy to use after initial orientation
       • Fast processing (less than 1 minute)
       • Can verify/validate recognized text

  9. Interface

  10. Output

  11. OmniPage Professional 18: accuracy rates (numerical fields)
      • Forms filled out in the field:
        • TZ essential medicines: 13%
        • TZ supplementary form: 21%
      • Forms filled out by tester:
        • TZ essential medicines: 53%
        • TZ supplementary form: 76%

  12. Abbyy FineReader 11
      • Licensed tool: $169.99
      • General impressions:
        • Easy to use after initial orientation, but harder to learn than OmniPage
        • Fast processing (1–3 minutes)
        • Can verify/validate recognized text

  13. Interface

  14. Output

  15. Abbyy FineReader 11: accuracy rates (numerical fields)
      • Forms filled out in the field:
        • TZ essential medicines: 10%
        • TZ supplementary form: 10%
      • Forms filled out by tester:
        • TZ essential medicines: 39%
        • TZ supplementary form: 43%

  16. Tesseract-OCR
      • Open-source tool
      • General impressions:
        • Does not have a graphical user interface
        • Is a command-line tool; must be run from the command line
        • Difficult for users who are not familiar with the command line
        • Requires input file in image format (e.g., .png, .jpg)

  17. Tesseract-OCR In the example below, we ran Tesseract with a scanned image file and an output file to hold the recognized text:

  18. Interface (screenshot callouts: program install location, program name, scanned image, output text file name)
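The command labeled in the screenshot has three parts: the program name, the scanned image, and the output text file name. A minimal Python sketch of the same invocation follows. It assumes Tesseract is installed and on the PATH. The file names and both helper functions are hypothetical, not taken from the slides.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image: Path, output_base: Path, program: str = "tesseract") -> list:
    """Build the argument list: program name, scanned image, output base name."""
    return [program, str(image), str(output_base)]


def run_ocr(image: Path, output_base: Path) -> Path:
    """Run Tesseract and return the path of the text file it writes."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract is not installed or not on PATH")
    subprocess.run(build_tesseract_command(image, output_base), check=True)
    # Tesseract appends ".txt" to the output base name.
    return output_base.with_name(output_base.name + ".txt")


# Hypothetical file names; substitute a real scan before calling run_ocr().
command = build_tesseract_command(Path("scanned_form.png"), Path("recognized_text"))
print(" ".join(command))
```

Building the argument list separately from running it lets the command be inspected without Tesseract installed.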

  19. Source File

  20. Output

  21. Evernote
      • Can send pictures of documents
      • Not useful for character recognition or data entry
      • Allows tagging on the image, e.g., district/facility

  22. Captricity
      • Web-based, paid service
      • Offers several tiers of pricing:
        • “Pay as you go”: $0.01 per field
          • discounts as number of fields increases
        • “Premier” tier: $335/month for 50,000 fields
          • $0.0067 per field
        • “Enterprise” tier: custom tier, depending on volume
          • provides dedicated account manager and support
          • volume discounts
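The per-field figure for the Premier tier, and the volume at which it becomes cheaper than paying per field, can be checked with simple arithmetic. This sketch uses only the prices on the slide. It ignores the pay-as-you-go volume discounts, because the slide does not state their size. The variable names are illustrative.

```python
PAY_AS_YOU_GO_PER_FIELD = 0.01    # dollars per field, from the slide
PREMIER_MONTHLY_FEE = 335.00      # dollars per month, from the slide
PREMIER_FIELD_ALLOWANCE = 50_000  # fields included per month, from the slide

# Per-field cost when the full monthly allowance is used.
premier_per_field = PREMIER_MONTHLY_FEE / PREMIER_FIELD_ALLOWANCE

# Monthly volume at which the flat fee equals undiscounted pay-as-you-go pricing.
break_even_fields = PREMIER_MONTHLY_FEE / PAY_AS_YOU_GO_PER_FIELD

print(f"Premier per-field cost at full use: ${premier_per_field:.4f}")
print(f"Premier is cheaper above about {break_even_fields:,.0f} fields per month")
```

Under these assumptions, the Premier tier pays off once a program processes more than roughly 33,500 fields per month.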

  23. Captricity Process:
      • User creates template for form
      • System creates digital fingerprint from template
      • System compares uploaded form to digital fingerprint
      • System fixes skews, or flips form, if needed
      • Human validation is done field by field
        • validators never see the entire form
        • preserves privacy
      • Output in .csv file

  24. Captricity General impressions:
      • Initially time-intensive
        • must separate forms into single files, one per page
        • must set up a template for each page, e.g., a one-page form took 10 minutes to create
      • Requires Internet connection
      • Approximately 24-hour turnaround the first time
        • turnaround time is reduced after first processing

  25. Interface

  26. Output

  27. Captricity: accuracy rates (numerical fields)
      • Forms filled out in the field:
        • TZ essential medicines: 65%
        • TZ supplementary form: 99%
        • Zim antiretrovirals: 52%
      • Forms filled out by tester:
        • TZ essential medicines: 98%
        • TZ supplementary form: 100%
        • Zim antiretrovirals: 98%

  28. Research conclusion: Captricity looks most promising. Digging deeper…

  29. Captricity Positives
      • Shows best results
      • Validation of output is critical
      • Fast turnaround time
      • Digitization is accurate
        • data entry staff did not introduce new errors
      • Cloud storage can store data indefinitely
      • Output in .csv format (readable by a database)
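To illustrate what "readable by a database" can mean in practice, here is a minimal Python sketch that loads a .csv export into an SQLite table. The slides do not show Captricity's actual column layout, so the column names, table name, and sample rows below are invented for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical export: column names and values are invented for illustration.
SAMPLE_CSV = """facility,product,stock_on_hand,quantity_requested
Facility A,Product X,120,300
Facility B,Product X,45,150
"""


def load_csv_into_db(csv_text: str, conn: sqlite3.Connection) -> int:
    """Insert each CSV row into a table and return the number of rows loaded."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS lmis_report ("
        "facility TEXT, product TEXT, stock_on_hand INTEGER, quantity_requested INTEGER)"
    )
    rows = [
        (r["facility"], r["product"], int(r["stock_on_hand"]), int(r["quantity_requested"]))
        for r in csv.DictReader(io.StringIO(csv_text))
    ]
    conn.executemany("INSERT INTO lmis_report VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
loaded = load_csv_into_db(SAMPLE_CSV, conn)
total_requested = conn.execute("SELECT SUM(quantity_requested) FROM lmis_report").fetchone()[0]
print(loaded, total_requested)  # prints "2 450"
```

Converting the numeric columns with `int()` makes the load fail on any field that still holds a misread character, which is one way to catch values that slipped past validation.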

  30. Captricity Negatives
      • Requires Internet connection; must be used at higher levels of supply chain
      • Setup is time-intensive; must:
        • split up forms
        • create templates
        • rotate to landscape
      • Validation/reconciliation can be time-consuming
      • Cost can be high, but discounts are available for high volume
      • Cheaper than hiring data entry clerks?

  31. Use Cases for LMIS Reporting Using Captricity

  32. Use Case 1
      • Central database
      • District: Upload and verify
      • SDP/CHW: Send paper report

  33. Use Case 2
      • Central database
      • District: Upload and verify
      • SDP/CHW: Take photo of form

  34. Use Case 3
      • Central database
      • Central: Upload and verify
      • District: Aggregate reports
      • SDP/CHW: Send paper report

  35. Thank You! Questions?
