
Overview

Market Leader: “Intelligent Capture & Exchange” Solutions


Information comes in many forms…

Structured Content

  • Information is predictable

  • Location of information is predictable

  • Examples:

  • Waybill

  • Traffic Citations

  • Tax Forms

  • Mail Order Forms

  • Applications

  • Insurance Claims


Information comes in many forms…

Semi-Structured Content

  • Information is predictable

  • Location of information is NOT predictable

  • Examples

  • Accounts Payable

  • Accounts Receivable

  • Transportation

  • Bills of Lading

  • Medical Billing


Information comes in many forms…

Unstructured Content

  • Information is NOT predictable

  • Location of information is NOT predictable

  • Examples

  • Mortgage Folders

  • Medical Records

  • Email Classification

  • Digital Mailroom

  • Litigation Support


Where Did Kofax Classification / Separation Originate?

The technology was funded by In-Q-Tel, the venture capital group that invests in startups on behalf of the CIA.


Enabling the automation of Document Classification Processes

  • Processing millions of captured foreign documents

  • Automating the categorization of content to expedite linguistic activities

  • Connecting to an internal content management solution

Transformation Modules


Kofax Transformation - Advanced Document Separation

  • Automatically identify document type and individual document boundaries (start/end) within a batch of multiple documents

  • Goal: Perform separation/recognition just as if physical separator sheets were inserted between each document

  • Uses multiple classification and separation techniques, applied in a waterfall sequence


KTM Advanced Document Separation Process

Typical Process Flow

[Figure: process flow with the stages Scan/Import, Classify & Separate, Document Review, Extraction, Data Validation, and Release.]


Vector Space Machines Under the Hood

Warning:

The following slides may require pocket protectors.


Automatic Document ID and Indexing

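[Figure: pages in a batch, each tagged S, M, or E with a confidence percentage (for example S 90%, E 90%, S 65%, M 70%, E 85%). Low-confidence candidates (10% to 30%) appear alongside the high-confidence ones, and the final page sequence shown is S E S M E S E.]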


Automatic Document ID and Indexing

[Figure: automatic document ID and indexing. Page identification labels each page (S E S M E S E), document separation groups the labeled pages into documents, and index fields such as Date, SSN, and Last Name are extracted.]
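The step from page identification to document separation can be sketched as below. This is a minimal illustration, assuming S, M, and E stand for the start, middle, and end page of a document (the transcript does not spell the letters out). The function name `split_documents` is hypothetical and not part of any Kofax API.

```python
def split_documents(labels):
    """Group 1-based page numbers into documents from per-page S/M/E labels."""
    documents, current = [], []
    for page, label in enumerate(labels, start=1):
        if label == "S" and current:
            # A new start page closes any document that is still open.
            documents.append(current)
            current = []
        current.append(page)
        if label == "E":
            # An end page closes the current document.
            documents.append(current)
            current = []
    if current:
        documents.append(current)
    return documents


# The label sequence shown on the slide.
print(split_documents(["S", "E", "S", "M", "E", "S", "E"]))
```

With the slide's sequence, the seven pages fall into three documents: pages 1-2, 3-5, and 6-7.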




Classification “Waterfall” Technique

Using multiple classification engines:

  • Performance is optimized by attempting the fastest classification techniques first and accepting a result only if the engine is very confident

  • Mohomine text classification is used as the “catch-all” method: very accurate, with the widest reach, but dependent on full-page OCR

[Figure: an eight-page batch passes through four engines in sequence. After each engine, pages without a confident result are marked “?” and go on to the next engine; pages already classified are marked “N/A”.]

  • INDICIUS Barcode Recognition (1 ms): First Form X

  • INDICIUS Image Classification (20 ms): First Form Y, First Form Z

  • INDICIUS Pattern Matching (200 ms): Last Form X, Last Form Y, Last Form Z

  • mohoClassifier (mC) (1000 ms): Middle Form X, Middle Form Z
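The waterfall idea can be sketched as follows: try the engines from fastest to slowest and accept the first result that clears a confidence threshold. This is only an illustration under stated assumptions. The four engine functions are hypothetical stand-ins that read pre-computed values from a dictionary, and the 0.9 threshold is an assumed value, not one given in the slides.

```python
def classify_waterfall(page, engines, threshold=0.9):
    """Try engines in order; return the first confident (engine, label, confidence)."""
    for name, engine in engines:
        result = engine(page)  # (label, confidence) or None
        if result is not None and result[1] >= threshold:
            label, confidence = result
            return name, label, confidence
    return None  # no engine was confident; the page needs manual review


# Hypothetical stand-in engines, ordered from fastest to slowest.
def barcode_engine(page):
    code = page.get("barcode")
    return (code, 1.0) if code else None

def image_engine(page):
    return page.get("layout_guess")

def pattern_engine(page):
    return page.get("pattern_guess")

def text_engine(page):
    return page.get("text_guess")  # assumed to depend on full-page OCR

ENGINES = [
    ("barcode", barcode_engine),
    ("image", image_engine),
    ("patterns", pattern_engine),
    ("text", text_engine),
]

pages = [
    {"barcode": "First Form X"},
    {"layout_guess": ("First Form Y", 0.95)},
    # The image engine is not confident here, so the page falls through to text.
    {"layout_guess": ("First Form Z", 0.60), "text_guess": ("Middle Form Z", 0.93)},
]
for page in pages:
    print(classify_waterfall(page, ENGINES))
```

The ordering matters because the slow engine is only paid for on pages the cheap engines could not settle.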


How do we actually build a model?

Business

Dictionary

SAN JOSE, Calif. (AP) -- One week after firing its top executive, Hewlett-Packard Co. reported quarterly earnings that were essentially flat, and the interim chief executive acknowledged, “There is work to be done.”

For the three months ended Jan. 31, HP reported a profit of $943 million, or 32 cents per share, only 0.7 percent higher than the $936 million, or 30 cents per share, it earned in the first fiscal quarter…

NEW YORK (Reuters) - Former WorldCom Inc. finance chief Scott Sullivan, who has become the star witness against Bernard Ebbers, admitted on Wednesday to a history of lies, saying he had deceived shareholders, analysts and the board while his staff undertook an $11 billion accounting fraud.

Sharply questioned by the lead attorney for Ebbers, the one-time chief executive officer …

Sports

Saying this was a "sad, regrettable day," Commissioner Gary Bettman announced today that the National Hockey League was canceling the season because negotiators had failed to come to an agreement with the players' union on salary caps.

With his announcement, the N.H.L. becomes the first major pro sports league in North America to lose an entire season to a labor dispute…

PARIS (AP) -- Still hungry to race but wary he is not in the best shape, Lance Armstrong wants to take his Tour de France record to even mightier heights: He will try for a seventh straight title this summer.

Armstrong had left open the possibility he wouldn't compete this year in cycling's showcase event to pursue other races. But in an announcement Wednesday on the Web site of his Discovery Channel team the Tour's only six-time winner…

Technology

A new battery-powered Etch A Sketch will rely on digital electronics for a speedy interpretation of each knob twist. It is designed, its makers say, to transmit data along a wire plugged into a television set that will display every line and detail in real time, with accompanying sounds and optional color. It will cost $20, twice the price of the traditional Etch A Sketch.

"I think the kids are becoming more advanced in…

SAN FRANCISCO, Feb. 15 - Late in the summer of 1973, two young scientists in the nascent field of computer networks hunkered down in a conference room of the Cabana Hyatt Hotel in Palo Alto, Calif., a clean but bland stopping place for salesmen and the parents of students at nearby Stanford University. Their goal was to thrash out a way to make different, isolated computer networks talk to each other….
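Learning categories from labeled example texts, as on this slide, can be sketched with a simple vector-space model. The sketch below is an assumption about the general approach, not the Mohomine algorithm: it sums word counts per category into one term vector and assigns new text to the category with the highest cosine similarity. The training snippets are shortened from the slide's examples, and all function names are illustrative.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())

def train(examples):
    """Sum word counts per category to form one term vector per category."""
    model = {}
    for category, text in examples:
        model.setdefault(category, Counter()).update(tokenize(text))
    return model

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def classify(model, text):
    """Return the best-matching category and the score for every category."""
    vector = Counter(tokenize(text))
    scores = {category: cosine(vector, centroid) for category, centroid in model.items()}
    return max(scores, key=scores.get), scores

EXAMPLES = [
    ("Business", "Hewlett-Packard Co. reported quarterly earnings that were essentially flat"),
    ("Business", "HP reported a profit of $943 million, or 32 cents per share"),
    ("Sports", "the National Hockey League was canceling the season"),
    ("Sports", "Lance Armstrong wants to take his Tour de France record"),
    ("Technology", "A new battery-powered Etch A Sketch will rely on digital electronics"),
    ("Technology", "computer networks talk to each other"),
]

model = train(EXAMPLES)
category, scores = classify(model, "The league canceled the hockey season")
print(category)
```

A production model would need far more examples per category, plus term weighting and stop-word handling, which this sketch leaves out.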


The Problem: Document Separation

Separation of unstructured documents is a significant expense for a high-volume capture system

  • Typical ‘structured’ recognition technologies are not applicable

  • Manual insertion of separator sheets is the primary solution today

  • 50% of document preparation labor is spent sorting documents and inserting separator pages

Where does one document stop and the next begin?

[Figure: a stack of pages with three candidate document boundaries, each marked “Here?”]


How Document Separation Works

[Figure: for a five-page batch, mohoClassifier (mC) returns candidate page labels with confidences; FSM constraints and a best-path analysis then select the final sequence and split it into documents.]

Other candidate labels shown in the figure: Middle Form Y (53%), Last Form R (81%), First Form C (85%), Middle Form X (69%), Last Form E (98%), Middle Form C (17%), First Form Y (75%), Middle Form X (92%), First Form C (27%), Last Form R (92%)

mC Result (as listed on the slide): Middle Form X (92%), Last Form Y (95%), First Form X (97%), First Form Y (84%), Last Form X (95%)

FSM Constraints:

  • A “First” page must be followed by “Middle” or “Last” of same type

  • After a “Last” page must come a “First”

  • Custom Business Rules

Best Path Analysis: Form X, Form Y
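The constraint-based best-path step can be sketched with a small dynamic program over the per-page candidates. Several things here are assumptions: FSM is read as finite-state machine, a “Middle” page is treated like a “First” page for what may follow it, the batch must start with “First” and end with “Last”, and a path is scored by the product of its confidences. The page-to-candidate assignment reuses confidence values from the slide but is illustrative, since the transcript does not preserve which candidate belongs to which page.

```python
import math

def allowed(prev, cur):
    """Constraints from the slide, plus the assumed rule for Middle pages."""
    (p_pos, p_form), (c_pos, c_form) = prev, cur
    if p_pos in ("First", "Middle"):
        return c_pos in ("Middle", "Last") and c_form == p_form
    return c_pos == "First"  # after a "Last" page must come a "First"

def best_path(pages):
    """pages: list of dicts mapping (position, form) -> confidence in (0, 1]."""
    # best[label] = (log score, path) for the best valid path ending in label
    best = {label: (math.log(conf), [label])
            for label, conf in pages[0].items() if label[0] == "First"}
    for candidates in pages[1:]:
        step = {}
        for label, conf in candidates.items():
            options = [(score + math.log(conf), path + [label])
                       for prev, (score, path) in best.items() if allowed(prev, label)]
            if options:
                step[label] = max(options, key=lambda o: o[0])
        best = step
    finals = [entry for label, entry in best.items() if label[0] == "Last"]
    return max(finals, key=lambda o: o[0])[1] if finals else None

# Illustrative candidates for five pages (assignment to pages is assumed).
PAGES = [
    {("First", "X"): 0.97, ("First", "C"): 0.85},
    {("Middle", "X"): 0.92, ("Last", "R"): 0.81},
    {("Last", "X"): 0.95, ("Last", "E"): 0.98},
    {("First", "Y"): 0.84, ("Middle", "X"): 0.69},
    {("Last", "Y"): 0.95, ("Last", "R"): 0.92},
]
print(best_path(PAGES))
```

On page 3 the highest-confidence candidate is Last Form E (98%), but it cannot follow a Form X page, so the best valid path keeps Last Form X and yields Form X (pages 1-3) followed by Form Y (pages 4-5).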


Customer Success Story

  • Residential mortgage processing, 12 Million images/month

  • Each customer folder: ~100 pages, 60-80 doc types

  • Before automatic document separation

    • 60 people doing document separation and preparation

    • 16 people to review (QC) a customer folder

    • 8.25 minutes per folder to review

  • With automatic document separation

    • 10 people doing document separation and preparation

    • 3 people to review (exceeded goal to reduce staff to 8)

    • 2 minutes per folder to review

    • Exceeded processing goal targets at each step

    • $420,000 annual savings in labor

    • $100,000 annual savings in separator sheet consumables
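The before and after figures above can be combined in a short calculation. It assumes the staff numbers are headcounts that can be added together and that the two savings figures are separate and additive; neither is stated explicitly on the slide.

```python
before = {"prep_staff": 60, "review_staff": 16, "review_minutes": 8.25}
after = {"prep_staff": 10, "review_staff": 3, "review_minutes": 2.0}

staff_before = before["prep_staff"] + before["review_staff"]
staff_after = after["prep_staff"] + after["review_staff"]
review_time_cut = 1 - after["review_minutes"] / before["review_minutes"]
total_savings = 420_000 + 100_000  # labor plus separator sheet consumables

print(f"Staff: {staff_before} -> {staff_after}")
print(f"Review time per folder cut by {review_time_cut:.1%}")
print(f"Total annual savings: ${total_savings:,}")
```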


Capabilities Overview

  • Classification

    • Content (text)

    • Layout (topography)

    • Combination of the above

  • Extraction

    • Rules (format, database)

    • Learn-by-example

    • Templates

  • Any document

    • Structured (inc. legacy forms)

    • Semi-structured, e.g. invoices

    • Unstructured documents, e.g. correspondence


Key Applications/Use Cases

  • Invoices (AP automation)

    • Speed up AP process and reduce manual keying

    • Pre-configured solution already available

  • Sales Orders

    • Improve sales order process and accuracy

  • ‘Mailroom’ applications/Workflow automation

    • Automatic classification and routing

    • Indexing (<= 3 fields) for archive

    • No need for pre-sorting

  • Image to archive automation

    • Automatic classification and indexing for storage in a document management (DM) system

    • ‘Better, quicker, more accurate batch capture’

  • Business process automation

    • Full data capture

    • Straight-through processing

  • Semi-structured and unstructured documents

    • Invoices and credit notes

    • Correspondence

    • Reports


Kofax KTM Differentiators

  • Integrated with Kofax Capture (offering HA, xx)

  • Learn-by-example extraction

  • Learn-by-example classification

  • Continuous supervised learning in production

  • Single product for all document types that is upgradable


Kofax Solution Strengths

  • Market leader

  • Out-of-the-box

  • Unlimited import options

  • VRS integrated with “QC Later”

  • Better Recognition/Multiple Document Types

  • API Integrated export

    • Secure handling of images & data

  • Out-of-the-box reports

  • You won’t outgrow it

Kofax Capture Overview

