1 / 31

Gamera A Toolkit for Structured Document Recognition including Music

GameraA is a toolkit for structured document recognition, including optical music recognition. It allows domain experts to create custom recognition applications and provides image processing tools, document and symbol segmentation, feature extraction, classification, and semantic analysis.

mmargaret
Download Presentation

Gamera A Toolkit for Structured Document Recognition including Music

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. GameraA Toolkit for Structured Document Recognition including Music Ichiro Fujinaga McGill University with Michael Droettboom, Karl MacMillan, G. Sayeed Choudhury, Tim DiLauro, Mark Patton, Teal Anderson Levy Project II Digital Knowledge Center Sheridan Libraries Johns Hopkins University

  2. Content • Levy Project • Levy Sheet Music Collection • Digital Workflow Management • Optical Music Recognition • Gamera • Guido / NoteAbility

  3. Lester S. Levy Collection

  4. Lester S. Levy Collection • North American sheet music (1780–1960) • Digitized 29,000 pieces • including “The Star-Spangle Banner” and “Yankee Doodle” • Database of: • text index records • images of music (8bit gray) • lyrics (first lines of verse and chorus) • color images of cover sheets (32bit)http://levysheetmusic.mse.jhu.edu

  5. Digital Workflow Management • Reduce the manual intervention for large-scale digitization projects • Creation of data repository (text, image, sound) • Optical Music Recognition (OMR) • Gamera • XML-based metadata • composer, lyricist, arranger, performer, artist, engraver, lithographer, dedicatee, and publisher • cross-references for various forms of names, pseudonyms • authoritative versions of names and subject terms • Music and lyric search engines • Analysis toolkit

  6. Optical Music Recognition (OMR) • Trainable open-source OMR system in development since 1984 • Staff recognition and removal • Lyric removal • Stems and notehead removal • Music symbol classifier • Score reconstruction • Lyric classifier?

  7. The problem • Suitable OCR for lyrics not found • Commercial OCR systems are often inadequate for non-standard documents • The market for specialized recognition of historical documents is very small • Researchers performing document recognition often “re-invent” the basic image processing wheel

  8. The solution • Provide easy to use tools to allow domain experts (people with specialized knowledge of a collection) to create custom recognition applications • Generalize OMR for structured documents

  9. Introducing Gamera • Framework for creation of structured document recognition system • Designed for domain experts • Image processing tools (filters, binarizations, etc.) • Document segmentation and analysis • Symbol segmentation and classification • Feature extraction and selection • Classifier selection and combiners • Syntactical and semantic analysis Generalized Algorithms and Methods for Enhancement and Restoration of Archives

  10. Features of Gamera • Portability (Unix, Windows, Mac) • Extensibility (Python and C++ plugins) • Easy-to-use (experts and programmers) • Open source • Graphic User Interface • Interactive / Batchable (scripts)

  11. Scripting Environment (Python) Automatic Plugin Wrapper (Boost) Architecture of Gamera Graphic User Interface (wxWindows) Plugins (Python) Plugins (C++) GAMERA Core (C++)

  12. Example of C++ Plugin // Number of pixels in matrix #include “gamera.hh” #ifdef __area_wrap__ #define NARGS 1 #define ARG1_ONEBIT #endif using namespace Gamera; template <class T> feature_t area(T &m) { return feature_t(m.nrows() * m.ncols()); }

  13. Example of Python Plugin // This filters a list of CC objects import gamera def filter_wide(ccs, max_width): tmp = [] for x in ccs: if x.ncols() > max_width: x.fill_matrix(0) else: tmp.append(x) return tmp

  14. Gamera: Interface(screenshot in Linux)

  15. Gamera: Interface(screenshot in Linux)

  16. Histogram(screenshot in Linux)

  17. Thresholding(screenshot in Linux)

  18. Thresholding(screenshot in Linux)

  19. Staff removal: Lute tablature

  20. Classifier: Lute(screenshot in Linux)

  21. Staff removal: Neums

  22. Classifier: Neums(screenshot in Linux)

  23. Greek example

  24. “A formal language for score-level representation” Plain text: readable, platform independent Extensible and flexible Adequate representation NoteServer: Web/Windows GUIDO/XML NoteAbility (K. Hamel) GUIDO Music Notation FormatH. Hoos, K. Renz, J. Kilian

  25. GUIDO: An example { [ \beamsOff | \clef<"treble"> \key<"D"> f#*1/8. g*1/16 | a*1/4. d2*1/8 d*1/4. c#*1/8 | e1*1/2 _*1/4 f#*1/8. g*1/16 | c#2*1/4. b1*1/8 a*1/4. g*1/8 | | e#*1/2 f#*1/4 f#*1/8. g*1/16 | a*1/4. d2*1/8 d*1/4. c#*1/8 | e1*1/2 _*1/4 f#*1/8 g | c#2*1/4. b1*1/8 a*1/4. c#*1/8 ], …

  26. NoteAbility Demo

  27. Conclusions • Gamera allows rapid development of domain-specific document recognition applications • Domain experts can customize and control all aspects of the recognition process • Includes an easy-to-use interactive environment for experimentation • Beta version available on Linux • OS X version in preparation

  28. Acknowledgements • National Science Foundation • Institute of Museum and Library Services • The Levy Family

  29. OMR: Classifier • Connected-component analysis • Feature extraction, e.g: • Width, height, aspect ratio • Number of holes • Central moments • k-nearest neighbor classifier • Genetic algorithm

  30. Overall Architecture for OMR Image File Staff removal Segmentation Recognition K-NN Classifier Output Symbol Name Optimization Genetic Algorithm K-nn Classifier Knowledge Base Feature Vectors Best Weight Vector Off-line

More Related