Graphics Recognition – from Re-engineering to Retrieval

Karl Tombre, Bart Lamiroy

LORIA, France

Document Analysis in the IR era

  • Information is at the core of industrial strategies

  • A lot of digital or digitized information, but often in very “poor” formats

  • The challenge: not necessarily re-engineering documents, but enriching poorly structured information, adding a (limited) amount of semantics, and building indexes

  • Purposes: browsing, navigation, indexing

  • DAR methods and tools useful, but must be adapted

Specific challenges of large-scale IR applications

  • Genericity: we cannot necessarily build a complete and exhaustive a priori model of contextual knowledge (ontology)

  • Adaptability: various input data – scanned paper, PDF, DXF, HTML, GIF… – various resolutions

  • Robustness: “back-office” applications

  • Efficiency: online searching in heterogeneous data

  • Scaling: methods have to scale to increasing number of symbols/features

DAR and IR

  • Media without (or with very little) contextual knowledge

  • Image-based indexing and retrieval, indexing of video sequences

  • Documents do explicitly convey information from one person to another person

  • Much more structure, syntax and semantics

DAR and IR – some examples

  • Indexing and/or searching scanned text without OCR

  • Similarities, signatures

  • Query or index on layout structure

  • Table spotting

  • Keyword spotting

What about Graphics Recognition?

  • Subfield of DAR, for graphics-rich documents

  • Numerous methods for various analysis and recognition problems

    • Raster-to-vector conversion

    • Text/graphics separation

    • Symbol recognition

  • Many specific technical areas: maps, architectural drawings, engineering drawings, diagrams and schematics, …

Graphics recognition methods

  • Text/graphics separation

Graphics recognition and IR applications

  • Usual text-based indexing and retrieval still useful

  • But need for access to other kinds of information:

    • Symbols

    • Text-drawing connections

    • Description-illustration connections

Some contributions

  • Syeda-Mahmood – maintenance drawings

IEEE Trans. on PAMI 21(8):737-751, Aug. 1999

Some contributions

  • Arias et al., Najman et al. – use of information contained in legend / title block

Proc. GREC’01, Kingston (Ontario, Canada), pp. 19-26, Sept. 2001

Some contributions

  • Samet & Soffer – symbols from legend

IEEE Trans. on PAMI 18(8):783-798, Aug. 1996

Some contributions

  • Müller & Rigoll – graphical retrieval in database of engineering drawings

Proc. ICDAR’99, Bangalore (India), pp. 697-700, Sept. 1999

Some contributions

  • Boose et al. (Boeing) – Generation of Layered Illustrated Parts Drawings

Proc. GREC’03, Barcelona, pp. 139-144

Wishful thinking?

Symbol DB

Or even better…

Symbol recognition

Before we move on: the 1st contest on symbol recognition was held last week. See the IAPR TC10 homepage for further details.

  • Natural features for indexing and retrieval

  • Most methods work with known databases of reference symbols – what about interactive querying of arbitrary symbols?

  • From segmentation followed by recognition, to segmentation-free recognition, or segmenting while recognizing

  • Scalability

    • Efficiency / complexity

    • Discrimination power

  • Signatures

Image-based signatures

  • Compute invariant signatures on binary document image

    • F-signatures (ICDAR’01)

    • Radon transform: R-signatures [Tabbone & Wendling]

    • Ridgelets [Ramos Terrades & Valveny – GREC’03] – aka wavelet transform of Radon transform
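To make the idea of a Radon-based signature concrete, here is a minimal sketch in plain Python. It assumes the common definition of an R-signature as the sum over ρ of the squared Radon projection at each angle θ. The function name `r_signature`, the nearest-bin discretization, and the pixel-list input are illustrative assumptions, not the implementation of the cited authors.

```python
import math
from collections import Counter

def r_signature(pixels, n_angles=180):
    """Return one value per angle: the sum of squared projection counts.

    pixels   -- iterable of (x, y) coordinates of foreground pixels
    n_angles -- number of angles sampled uniformly in [0, pi)
    """
    signature = []
    for k in range(n_angles):
        theta = math.pi * k / n_angles
        c, s = math.cos(theta), math.sin(theta)
        # Discrete Radon projection: count pixels falling in each rho bin.
        bins = Counter(round(x * c + y * s) for x, y in pixels)
        # Integrate the squared projection over rho.
        signature.append(sum(count * count for count in bins.values()))
    return signature

# A horizontal segment of 10 pixels (illustrative data).
segment = [(x, 5) for x in range(10)]
sig = r_signature(segment)
print(sig[0], sig[90])
```

In this sketch the projection along the segment (index 90) concentrates all pixels in one bin, while the projection across it (index 0) spreads them out, so the signature peaks at the segment's orientation. A rotation of the shape shifts the signature cyclically, which is what makes such signatures usable for orientation-tolerant matching.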

R-signatures

Detection of arrowheads [Girardeau & Tabbone]

DEA degree thesis, INPL, Nancy, Jul. 2002

R-signatures

Another example [Girardeau & Tabbone]

Ridgelets

[Ramos Terrades & Valveny – GREC’03]

Proc. GREC’03, Barcelona, pp. 202-211

Vector-based signatures

[Dosch & Lladós – GREC’03]

  • Based on set of basic graphical features:

    • Parallelism

    • Overlap

    • Collinearity

    • T- and V-junctions

  • Quality factor associated with the various relations

  • Match signatures of reference symbols with signatures of buckets
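The relations listed above can be sketched as follows, for two of them only (parallelism and V-junctions). The tolerance values, the linear quality factor, and all function names are hypothetical choices for illustration; the cited work may define them differently.

```python
import math
from itertools import combinations

def direction(seg):
    """Undirected orientation of a segment, in [0, pi)."""
    (x1, y1), (x2, y2) = seg
    return math.atan2(y2 - y1, x2 - x1) % math.pi

def parallel_quality(a, b, tol=math.radians(5)):
    """Quality factor in [0, 1]: 1 for exactly parallel, 0 beyond the tolerance."""
    d = abs(direction(a) - direction(b))
    d = min(d, math.pi - d)
    return max(0.0, 1.0 - d / tol)

def is_v_junction(a, b, eps=1.0):
    """Two non-parallel segments that share an endpoint (within eps)."""
    shared = any(math.dist(p, q) <= eps for p in a for q in b)
    return shared and parallel_quality(a, b) == 0.0

def signature(segments):
    """Accumulate relation evidence over all pairs of segments."""
    sig = {"parallel": 0.0, "v_junction": 0}
    for a, b in combinations(segments, 2):
        sig["parallel"] += parallel_quality(a, b)
        sig["v_junction"] += int(is_v_junction(a, b))
    return sig

# Illustrative symbol: the four sides of a square.
square = [((0, 0), (10, 0)), ((10, 0), (10, 10)),
          ((10, 10), (0, 10)), ((0, 10), (0, 0))]
square_sig = signature(square)
print(square_sig)
```

Matching then reduces to comparing such small signature vectors between a reference symbol and a bucket of the drawing, which is far cheaper than matching the underlying segments directly.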

Vector-based signatures

Proc. GREC’03, pp. 159-169

Towards symbol spotting

  • Pre-compute – or compute on the spot – a set of basic signatures

  • Can be sufficient for symbol spotting and retrieval

  • Followed by classical symbol recognition if more discrimination is needed

Symbol spotting

  • [Jabari & Tabbone]: graph matching through probabilistic relaxation, with nodes = segments and edges = relations

DEA degree thesis, INPL, Nancy, Jul. 2003
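As a rough illustration of graph matching by probabilistic relaxation, the sketch below uses a generic relaxation-labeling update: each scene segment holds a probability for every model segment, and that probability is reinforced when neighbouring assignments are compatible with the observed relations. The binary compatibility rule, the relation names, and the toy graphs are assumptions made for this example; this is not the method of the cited thesis.

```python
def relation(graph, a, b):
    """Relation label between two nodes, or None if they are unrelated."""
    return graph.get((a, b)) or graph.get((b, a))

def relax(model_nodes, model_rel, scene_nodes, scene_rel, iterations=10):
    # p[i][m]: belief that scene segment i corresponds to model segment m.
    p = {i: {m: 1.0 / len(model_nodes) for m in model_nodes} for i in scene_nodes}
    for _ in range(iterations):
        new_p = {}
        for i in scene_nodes:
            scores = {}
            for m in model_nodes:
                support = 0.0
                for j in scene_nodes:
                    if j == i:
                        continue
                    for n in model_nodes:
                        # Compatible when both graphs carry the same relation.
                        if n != m and relation(scene_rel, i, j) == relation(model_rel, m, n):
                            support += p[j][n]
                scores[m] = p[i][m] * support
            total = sum(scores.values())
            # Keep the previous beliefs if nothing supports any label.
            new_p[i] = {m: (v / total if total else p[i][m]) for m, v in scores.items()}
        p = new_p
    return p

# Toy model symbol and toy scene (illustrative data).
model_rel = {("a", "b"): "parallel", ("b", "c"): "junction"}
scene_rel = {(0, 1): "parallel", (1, 2): "junction"}
beliefs = relax(["a", "b", "c"], model_rel, [0, 1, 2], scene_rel)
best = {i: max(beliefs[i], key=beliefs[i].get) for i in beliefs}
print(best)
```

All beliefs are updated simultaneously from the previous iteration, so the result does not depend on the order in which segments are visited.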

Symbol spotting

  • [Jabari & Tabbone] : another example

Combining Text and Graphics

  • Extracting Text/Graphics relationships within document

  • Using Text matching for inter-document relationships

  • Transitive inter-document Graphics matching

  • No need for complex graphics matching

  • Restricted to well known document types

Example: continuation of Wiring Diagrams (Boeing)

  • [Baum et al. – GREC’03]

Proc. GREC’03, Barcelona, pp. 132-138

Scan2XML Example

Proc. GREC’01, Kingston (Ontario, Canada), pp. 312-325

Indexing and Semantics

  • Signature + metric

  • Semantics = measured distance to signature

  • Applies only to homogeneous contexts

    • Pre-segmented images

    • Pre-determined image classes

    • Implicit application of domain knowledge

    • ...

  • Semantics = Syntax

Example

Signature type A

Metric M

Signature value l

Semantics1 = (1, 1)

Semantics2 = (2, 2)

M(l,s1) < m1 ?

M(l,s2) < m2 ?

semantics = measurement to reference value
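The test M(l, s1) < m1 can be written out as a small lookup, sketched below. It assumes the signature values are numeric vectors and that the metric M is the Euclidean distance. The reference values, thresholds, and the name `interpret` are hypothetical.

```python
import math

def metric_m(a, b):
    """Metric M, assumed here to be the Euclidean distance."""
    return math.dist(a, b)

# Hypothetical reference table: label -> (reference signature s_i, threshold m_i).
REFERENCES = {
    "Semantics1": ((0.0, 0.0), 1.0),
    "Semantics2": ((5.0, 5.0), 1.5),
}

def interpret(value, references=REFERENCES, metric=metric_m):
    """Return every label whose reference lies within its threshold of value."""
    return [label for label, (ref, threshold) in references.items()
            if metric(value, ref) < threshold]

print(interpret((0.3, 0.4)))
print(interpret((5.0, 6.0)))
print(interpret((2.5, 2.5)))
```

A value that falls outside every threshold simply receives no interpretation, which is acceptable as long as all documents share the same signature type and metric.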

Heterogeneous Document Bases

  • Semantics do not have a unique syntax anymore

  • Syntax metrics may be context sensitive

  • Semantics = Syntax + Context

    Context needs to be considered

Example

Context 1:

Signature type A

Metric M

Context 2:

Signature type B

Metric N

Signature value l

What if M(l,s1) < m1 and N(l,t2) < n2 ?

(s1, m1) = Semantics1 = (t1, n1)

(s2, m2) = Semantics2 = (t2, n2)
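The conflict raised here can be reproduced with a small sketch in which the same signature value is interpreted under two contexts, each with its own metric and reference table. The choice of Euclidean distance for M and Manhattan distance for N, and all numeric values, are assumptions made only to trigger the ambiguity.

```python
import math

def metric_m(a, b):
    """Metric M of context 1, assumed Euclidean."""
    return math.dist(a, b)

def metric_n(a, b):
    """Metric N of context 2, assumed Manhattan."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Hypothetical contexts: name -> (metric, {label: (reference, threshold)}).
CONTEXTS = {
    "context1": (metric_m, {"Semantics1": ((0.0, 0.0), 2.0),
                            "Semantics2": ((6.0, 6.0), 2.0)}),
    "context2": (metric_n, {"Semantics1": ((9.0, 9.0), 2.0),
                            "Semantics2": ((1.0, 1.0), 2.0)}),
}

def interpret(value, context):
    """Interpret a signature value under one explicitly chosen context."""
    metric, references = CONTEXTS[context]
    return [label for label, (ref, threshold) in references.items()
            if metric(value, ref) < threshold]

value = (1.0, 0.5)
by_context = {name: interpret(value, name) for name in CONTEXTS}
print(by_context)
```

The same value maps to Semantics1 in one context and to Semantics2 in the other, so the measurement alone cannot decide; the context has to be part of the interpretation.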

A step to taking into account context

(while consolidating existing approaches)

Component Algebra:

  • Image Analysis = Pipeline

  • Syntax + algorithm = semantics



Syntax and semantics need not be distinguished

Component Algebra

  • Components:

    Known and implemented document analysis algorithms, taking input data from one domain and producing data in another domain.

  • Application Context:

    Set of all available Components.

  • Semantics:

    Data sets needed by or produced by Components.
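One way to read these definitions is as a directed graph whose nodes are data domains and whose edges are components. The sketch below, under that assumption, searches the graph for a chain of components that turns one kind of data into another. The component names and domains are invented for illustration and do not come from an existing system.

```python
from collections import deque

# Hypothetical application context: (component name, input domain, output domain).
COMPONENTS = [
    ("binarize", "grayscale image", "binary image"),
    ("text_graphics_separation", "binary image", "graphics layer"),
    ("vectorize", "graphics layer", "vectors"),
    ("vector_signatures", "vectors", "signatures"),
    ("radon_signatures", "binary image", "signatures"),
    ("match_symbols", "signatures", "symbol labels"),
]

def find_pipeline(source, target, components=COMPONENTS):
    """Breadth-first search for the shortest component chain from source to target."""
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        domain, path = queue.popleft()
        if domain == target:
            return path
        for name, src, dst in components:
            if src == domain and dst not in seen:
                seen.add(dst)
                queue.append((dst, path + [name]))
    return None  # no chain of components connects the two domains

pipeline = find_pipeline("grayscale image", "symbol labels")
print(pipeline)
```

Adding a component only adds an edge, and a different application context (a different component set) yields different paths between the same two domains.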

Component Algebra is a Graph

Advantages

  • Each node is a semantic concept, semantic relationships are explicitly expressed.

  • Structure may support automatic reasoning and knowledge inference.

  • Context is embedded in components, different contexts give different paths in the graph.

  • Highly scalable and open architecture.

  • Bridge between signal-level document analysis and high-level document representation.

However ...

The formalism exists, the realization doesn't (yet)

  • What about parametrization?

  • How context-independent can you get?

  • What about "guessing" context appropriateness?

  • How to design fully interoperable components?

Conclusion

  • A lot of DA methods – and more specifically GR methods – can be of direct use in IR, indexing and browsing applications

  • Specific challenges

    • Scaling and efficiency

    • Heterogeneous sets of documents

    • Incomplete domain knowledge

    • Symbol spotting

    • On-the-fly symbol searching

  • Sketch of open framework for including document semantics when context can be heterogeneous