
A TaLISMAN: Automatic Text and LIne Segmentation of historical MANuscripts

Ruggero Pintus¹, Ying Yang², Enrico Gobbetti¹ and Holly Rushmeier²

¹ CRS4

² Yale University

ruggero.pintus@crs4.it, ying.yang.yy368@yale.edu, enrico.gobbetti@crs4.it, holly.rushmeier@yale.edu


Given a book, we extract the per-page text leading (the inter-line spacing) and image features.
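As an illustration of what a per-page leading estimate can look like, the sketch below takes the lag that maximizes the autocorrelation of a horizontal ink projection. This is a common generic technique and an assumption here, not necessarily the method used in the presentation; the names `row_profile` and `estimate_leading` and the synthetic page are hypothetical.

```python
def row_profile(image):
    """Amount of ink per row; image is a list of rows with 1 = ink, 0 = background."""
    return [sum(row) for row in image]


def estimate_leading(profile, min_lag=2):
    """Return the lag (in rows) with the highest autocorrelation of the profile.

    Ties (for example at multiples of the true period) are resolved in favor
    of the smallest lag, because only a strictly better score replaces it.
    """
    n = len(profile)
    mean = sum(profile) / n
    centered = [p - mean for p in profile]
    best_lag, best_score = None, float("-inf")
    for lag in range(min_lag, n // 2 + 1):
        # Average product of the profile with a copy of itself shifted by `lag`
        score = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / (n - lag)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Synthetic page (assumption): 2 inked rows followed by 3 blank rows, 6 times
page = []
for _ in range(6):
    page += [[1] * 10, [1] * 10, [0] * 10, [0] * 10, [0] * 10]

leading = estimate_leading(row_profile(page))
print(leading)
```

On real manuscript pages the profile is noisy, so the image would normally be binarized and the profile smoothed before looking for the peak.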

We select the most salient pages and image descriptors, and we compute a rough text segmentation that we use to train an SVM classifier.

We then apply the trained classifier to all of the original features to obtain a fine segmentation.
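The coarse-to-fine idea (train on a rough labeling, then predict on every feature) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the two-dimensional features, the threshold heuristic in `rough_labels`, and the small sub-gradient linear SVM are all hypothetical stand-ins, and a real system would more likely rely on an established SVM library.

```python
def rough_labels(features, low=0.3, high=0.7):
    """Hypothetical coarse heuristic: label only the confident samples.

    Returns (X, y) with y = +1 (text) or -1 (background). Samples whose
    first feature lies strictly between `low` and `high` are left unlabeled.
    """
    X, y = [], []
    for f in features:
        if f[0] >= high:
            X.append(f)
            y.append(1)
        elif f[0] <= low:
            X.append(f)
            y.append(-1)
    return X, y


def score(model, x):
    """Signed distance-like score w.x + b of a linear model."""
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, epochs=500, lr=0.05, lam=0.01):
    """Linear SVM trained by sub-gradient descent on the L2-regularized hinge loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            violated = yi * score((w, b), xi) < 1
            # Regularization step: shrink the weights slightly
            w = [wj * (1 - lr * lam) for wj in w]
            if violated:
                # Hinge-loss step: move the boundary away from this sample
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def predict(model, x):
    return 1 if score(model, x) >= 0 else -1


# Made-up features: three clear text samples, three clear background
# samples, and two ambiguous ones that the rough labeling skips.
features = [(0.9, 0.8), (0.8, 0.9), (0.85, 0.75),
            (0.1, 0.2), (0.2, 0.1), (0.15, 0.25),
            (0.55, 0.6), (0.45, 0.4)]

X, y = rough_labels(features)                   # rough segmentation
model = train_linear_svm(X, y)                  # train on confident samples only
fine = [predict(model, f) for f in features]    # fine segmentation of all features
print(fine)
```

The point of the design is that the classifier, once trained on the confident samples, also assigns a label to the ambiguous samples that the rough heuristic could not decide.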

We convert these sparse text positions into a dense text region representation, and we finally extract text blocks and lines.
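One simple way to picture the sparse-to-dense step is to stamp a small square around every detected text position and then read line extents off the rows that contain any text. The sketch below does only that; the square dilation, the `radius` parameter, and the row-run line extraction are assumptions made for illustration, and block extraction is not covered.

```python
def densify(points, width, height, radius=1):
    """Turn sparse (x, y) text positions into a dense binary mask.

    Each point marks a (2 * radius + 1)-pixel square, clipped to the page.
    """
    mask = [[0] * width for _ in range(height)]
    for x, y in points:
        for yy in range(max(0, y - radius), min(height, y + radius + 1)):
            for xx in range(max(0, x - radius), min(width, x + radius + 1)):
                mask[yy][xx] = 1
    return mask


def extract_lines(mask):
    """Return (first_row, last_row) for every run of consecutive non-empty rows."""
    lines, start = [], None
    for y, row in enumerate(mask):
        if any(row):
            if start is None:
                start = y
        elif start is not None:
            lines.append((start, y - 1))
            start = None
    if start is not None:
        lines.append((start, len(mask) - 1))
    return lines


# Made-up sparse detections lying on two text lines (rows 2 and 7)
points = [(1, 2), (3, 2), (5, 2), (1, 7), (3, 7), (5, 7)]
mask = densify(points, width=8, height=10, radius=1)
lines = extract_lines(mask)
print(lines)
```

In this toy case the dilation closes the gaps between neighboring detections on the same line while leaving the blank rows between lines empty, which is what makes the row-run split work.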

Evaluated on a heterogeneous corpus: ~3K pages, ~4K blocks, ~66K lines.

Robust to:

- Different writing styles

- High layout variability: one, two or more columns, marginalia, calendars

- Presence of capital letters, portraits, ornamental bands, graphical content

- Aging: holes, spots, ink bleed-through, fading, missing parts, damage

Figure panels: Original, Text regions, Text blocks, Text lines.
