Nikola Tesla Museum Clipping Library. Saša Malkov Nenad Mitić Žarko Mijajlović. 3rd SEEDI Int.Conf. Cetinje, Montenegro 14. September 2007. Clipping Library. Nikola Tesla Museum possesses a rich collection of newspaper clippings on work and life of Nikola Tesla
Nikola Tesla Museum Clipping Library Saša Malkov Nenad Mitić Žarko Mijajlović 3rd SEEDI Int.Conf. Cetinje, Montenegro 14. September 2007.
Clipping Library • The Nikola Tesla Museum possesses a rich collection of newspaper clippings on the life and work of Nikola Tesla • The clipping library was collected by Nikola Tesla himself, with the support of his personal secretary • One part of the library is organized in books, while many clippings are not organized
Digital Library Prototype • The Digitization Group at the Faculty of Mathematics undertook the development of a digital clipping library prototype • Primary goals: • Problem analysis • Identification of appropriate solutions
Problems • Significant variation in the sources and quality of the materials • Data and metadata organization and modeling • Data access
Differences in sources and preservation level • Different digitization techniques produce different results, depending on paper type, print type and preservation level • Different target formats are considered • Digital image formats • PDF • DjVu format
Data organization • File systems are not appropriate • Complex data and metadata access • Limited search capabilities • Databases allow • Simpler access • Advanced searching
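The contrast between file-system and database storage can be illustrated with a minimal sketch. It assumes a single hypothetical table named clipping whose column names are invented here for illustration, and it uses Python's built-in sqlite3 module as a stand-in for the actual DBMS; none of this is taken from the prototype itself.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions,
# not the prototype's actual data model.
SCHEMA = """
CREATE TABLE clipping (
    id          INTEGER PRIMARY KEY,
    caption     TEXT NOT NULL,
    publication TEXT,
    language    TEXT,
    published   TEXT,   -- ISO date string, e.g. '1900-01-02'
    scan        BLOB    -- digitized image, PDF or DjVu bytes
);
"""


def open_library(path=":memory:"):
    """Open (or create) the library database and install the schema."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_clipping(conn, caption, publication, language, published, scan_bytes):
    """Store one clipping: metadata and binary scan go into the same row."""
    cur = conn.execute(
        "INSERT INTO clipping (caption, publication, language, published, scan) "
        "VALUES (?, ?, ?, ?, ?)",
        (caption, publication, language, published, scan_bytes),
    )
    conn.commit()
    return cur.lastrowid


def clippings_in_language(conn, language):
    """Return (id, caption) pairs for one language, oldest first."""
    rows = conn.execute(
        "SELECT id, caption FROM clipping WHERE language = ? ORDER BY published",
        (language,),
    )
    return rows.fetchall()
```

Keeping the scan and its metadata in one row means a single query can both filter and fetch, which is the "simpler access" the slide refers to; with plain files the metadata would have to be tracked separately.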
Automatic text extraction • Primary problems: • Different languages • Wide variety of highly stylized fonts used in the period • Low material quality due to aging • Several OCR systems were evaluated • No OCR software was satisfactory, primarily because of the low readability of the material • A significant amount of manual correction is necessary
Searching • Multi-criteria searching is essential, including searching by • Metadata • Caption • Key words • Publications • Language • Period • The clipping content • Manual corrections of text are essential • Efficiency requires the application of some indexing method
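Since OCR output needs manual correction, one possible arrangement is to keep the raw OCR text next to the corrected text and to prefer the corrected version whenever it exists. The sketch below is only an assumption about how that could be modeled; the class and method names are hypothetical, and the similarity measure (difflib's SequenceMatcher ratio) is a generic stand-in for estimating how much correction was needed.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional


@dataclass
class ClippingText:
    """Raw OCR output plus an optional manually corrected version (hypothetical model)."""

    ocr_text: str
    corrected_text: Optional[str] = None

    def searchable_text(self) -> str:
        # Prefer the manual correction; fall back to raw OCR output.
        if self.corrected_text is not None:
            return self.corrected_text
        return self.ocr_text

    def ocr_similarity(self) -> Optional[float]:
        # Ratio in [0, 1]; values near 1 mean little correction was needed.
        # Returns None while the text has not been corrected yet.
        if self.corrected_text is None:
            return None
        return SequenceMatcher(None, self.ocr_text, self.corrected_text).ratio()
```

Tracking the similarity per clipping would give a rough way to compare OCR systems on the same material, although the presentation does not say how the evaluation was actually carried out.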
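A search that combines several optional criteria can be sketched as a query builder that adds one condition per supplied criterion. The table layout, column names and function names below are illustrative assumptions, and sqlite3 again stands in for the actual DBMS. The two indexes only help the language and period filters; substring matching on captions and content, as written here, would still scan the table, so a real system would likely need a full-text index for that part.

```python
import sqlite3


def open_demo_db():
    """Create an in-memory demo database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE clipping (
            id INTEGER PRIMARY KEY,
            caption TEXT, publication TEXT, language TEXT,
            published TEXT,   -- ISO date string
            content TEXT      -- extracted (and corrected) clipping text
        );
        CREATE INDEX idx_clipping_language  ON clipping(language);
        CREATE INDEX idx_clipping_published ON clipping(published);
        """
    )
    return conn


def search(conn, caption=None, publication=None, language=None,
           date_from=None, date_to=None, content=None):
    """Return ids of clippings matching every criterion that was supplied."""
    clauses, params = [], []
    if caption is not None:
        clauses.append("caption LIKE ?")
        params.append(f"%{caption}%")
    if publication is not None:
        clauses.append("publication = ?")
        params.append(publication)
    if language is not None:
        clauses.append("language = ?")
        params.append(language)
    if date_from is not None:
        clauses.append("published >= ?")
        params.append(date_from)
    if date_to is not None:
        clauses.append("published <= ?")
        params.append(date_to)
    if content is not None:
        clauses.append("content LIKE ?")
        params.append(f"%{content}%")
    # Only fixed clause strings are interpolated; user values stay parameters.
    where = " AND ".join(clauses) if clauses else "1 = 1"
    sql = f"SELECT id FROM clipping WHERE {where} ORDER BY id"
    return [row[0] for row in conn.execute(sql, params)]
```

For example, search(conn, language="en", date_from="1900-01-01") would return only English clippings published from 1900 onward, while omitting all arguments returns every clipping.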
The solution – DBMS • The prototype is based on the IBM DB2 DBMS • Advanced SQL implementation • Efficient handling of binary content • High concurrency level • High reliability • Good prior experience • Free licensing terms
The solution – User interface • The Web application concept is • Rich in content and visual presentation • Customizable • Portable • Relatively simple to implement
The solution – Application • The library prototype is implemented in the functional programming language Wafl • Wafl is designed for automatic document generation and is particularly customized for Web development • It features very simple and efficient database access