
Slovenian Biographical Lexicon – From a Digital Edition to an On-Line Application

Jan Jona Javoršek*, Tomaž Erjavec*, Petra Vide Ogrin**
* Jožef Stefan Institute, Ljubljana, Slovenia
** Slovenian Academy of Sciences and Arts, Library, Ljubljana, Slovenia




  1. Slovenian Biographical Lexicon – From a Digital Edition to an On-Line Application
     Jan Jona Javoršek*, Tomaž Erjavec*, Petra Vide Ogrin**
     *Jožef Stefan Institute, Ljubljana, Slovenia
     **Slovenian Academy of Sciences and Arts, Library, Ljubljana, Slovenia
     INFuture 2009, Zagreb

  2. Outline
     • Digitization
     • Encoding methodology
     • XML–TEI structure
     • On-line application
     • Future plans

  3. Slovenian Biographical Lexicon
     • The printed version comprises 15 volumes plus an index, published over a long period (1925–1991)
     • Includes notable figures important for Slovenian cultural life, from the beginnings to the present day
     • Covers 5,042 biographical entries describing over 5,100 persons, because some entries cover whole families
     • Data in the articles are checked against the relevant primary sources

  4. Example page from SBL

  5. Encoding methodology
     • Use of open standards and software
     • Use of TEI P5: specific elements for describing biographical and prosopographical data, e.g. <birth>, <death>, <date>, <placeName>, <sex>, <faith>, <occupation>, <floruit>
     • Up-conversion into TEI–XML: OpenOffice with the TEI OO package (XSLT stylesheets) → TEI–XML document (basic structure)
     • Semi-automatic extraction of metadata: Perl and XSLT, plus manual intervention
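The semi-automatic extraction step can be pictured with a small sketch. The slides name Perl and XSLT as the tools actually used; the Python below is only an illustration of the idea, and the input format, the `HEAD` pattern and the `extract_person` helper are all hypothetical. Real SBL headwords are in Slovenian and much less regular, which is presumably why manual intervention was needed.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical pattern for a simplified, English-language entry head.
# It is an assumption made for this sketch, not the real SBL layout.
HEAD = re.compile(
    r"(?P<surname>\w+) (?P<forename>\w+), (?P<occupation>[\w ]+), "
    r"b\. (?P<birth>\d{4}-\d{2}-\d{2}) in (?P<birth_place>\w+), "
    r"d\. (?P<death>\d{4}-\d{2}-\d{2}) in (?P<death_place>\w+)"
)

def extract_person(head_line):
    """Turn one entry head into a TEI-style <person> element, or None."""
    m = HEAD.match(head_line)
    if m is None:
        return None  # no match: leave the entry for manual encoding
    person = ET.Element("person", {"n": "main"})
    name = ET.SubElement(person, "persName")
    ET.SubElement(name, "surname").text = m["surname"]
    ET.SubElement(name, "forename").text = m["forename"]
    ET.SubElement(person, "occupation").text = m["occupation"]
    for tag in ("birth", "death"):
        event = ET.SubElement(person, tag)
        ET.SubElement(event, "date", {"when": m[tag]})
        ET.SubElement(event, "placeName").text = m[tag + "_place"]
    return person

if __name__ == "__main__":
    line = "Prešeren France, poet, b. 1800-12-03 in Vrba, d. 1849-02-08 in Kranj"
    print(ET.tostring(extract_person(line), encoding="unicode"))
```

Lines that do not match return `None`, so they can be routed to a human editor instead of being encoded wrongly.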

  6. SBL article structure
     <div>
       <listPerson>
         <person n="main">
           <!-- other elements for biographical data: birth, death, occupation … -->
         </person>
         <person n="author">
           <!-- author's name -->
         </person>
       </listPerson>
       <p>
         <!-- the annotated text of the article -->
       </p>
     </div>
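As a rough illustration of how an article with this structure might be read programmatically, the sketch below parses a made-up article and separates the subject of the entry from its author. The sample content, the `summarize` helper and the use of `<persName>` inside `<person>` are assumptions; the sketch also omits the TEI namespace that a real TEI P5 document would declare.

```python
import xml.etree.ElementTree as ET

# Made-up article following the skeleton on the slide (no TEI namespace).
ARTICLE = """
<div>
  <listPerson>
    <person n="main">
      <persName>Example Person</persName>
      <occupation>poet</occupation>
    </person>
    <person n="author">
      <persName>Example Author</persName>
    </person>
  </listPerson>
  <p>Annotated article text.</p>
</div>
"""

def summarize(article_xml):
    """Collect the subject, the article author and the text of one article."""
    div = ET.fromstring(article_xml)
    main = div.find("listPerson/person[@n='main']")
    author = div.find("listPerson/person[@n='author']")
    return {
        "subject": main.findtext("persName"),
        "occupation": main.findtext("occupation"),
        "author": author.findtext("persName"),
        "text": div.findtext("p"),
    }

if __name__ == "__main__":
    print(summarize(ARTICLE))
```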

  7. INFuture 2009, Zagreb

  8. Examples of attribute values for <persName>, with counts
     @type
     • adopted: 2
     • artistic: 21
     • incorrect: 6
     • married: 193
     • monastic: 4
     • nickname: 37
     • operosorum: 21
     • partisan: 96
     • pseudo: 2350
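A tally like the one on this slide could be produced by counting `@type` values over all `<persName>` elements. The following is a minimal sketch under that assumption; the sample document and the `count_name_types` helper are hypothetical, and the TEI namespace is again left out for brevity.

```python
from collections import Counter
import xml.etree.ElementTree as ET

# Tiny made-up sample; the real corpus is far larger.
SAMPLE = """
<div>
  <persName type="pseudo">A</persName>
  <persName type="pseudo">B</persName>
  <persName type="married">C</persName>
  <persName>D</persName>
</div>
"""

def count_name_types(xml_text):
    """Count <persName> elements by @type; untyped names get their own bucket."""
    root = ET.fromstring(xml_text)
    return Counter(el.get("type", "(untyped)") for el in root.iter("persName"))

if __name__ == "__main__":
    for name_type, n in count_name_types(SAMPLE).most_common():
        print(name_type, n)
```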

  9. SBL online application
     • Fedora Commons: extensible framework for storage, management and dissemination of complex objects and object relationships
     • Repository plus a digital library of biographical articles, enabling browsing and searching
     • Fedora Generic Search: provides an interface between an external search system and the Fedora Commons API
     • Solr: search system based on the Apache Lucene search and indexing library
     • OAI-PMH, REST and SOAP protocols
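To make the protocol layer more concrete, here is a small sketch that only builds request URLs, without contacting any server. `verb=ListRecords` with `metadataPrefix=oai_dc` is standard OAI-PMH, and `q`, `rows` and `wt` are standard Solr query parameters. The base URLs and the `surname` field name are assumptions, since the slides do not give the actual endpoints or index schema.

```python
from urllib.parse import urlencode

# Hypothetical endpoints; the real SBL installation may use different paths.
OAI_BASE = "http://example.org/oai"
SOLR_BASE = "http://example.org/solr/select"

def oai_list_records(metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return OAI_BASE + "?" + urlencode(params)

def solr_query(field, value, rows=10):
    """Build a Solr select URL for an exact phrase in one (assumed) field."""
    params = {"q": f'{field}:"{value}"', "rows": rows, "wt": "json"}
    return SOLR_BASE + "?" + urlencode(params)

if __name__ == "__main__":
    print(oai_list_records())
    print(solr_query("surname", "Trubar"))
```

Keeping URL construction separate from the HTTP call makes the requests easy to inspect and test before pointing them at a live repository.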

  10. Example entry

  11. Advanced search options

  12. Advanced search
     • Drop-down menus for occupations, backed by an integrated taxonomy
     • Drop-down menus for place names: search by different categories, e.g. country, district, settlement; multilingual search for some places, e.g. Gradec (Slovenian) – Graz (German)
     • Search by forename, by surname, and by the different language forms of a person's name
     • Search by role name, e.g. bishop, or by titles of nobility, e.g. count, knight, baron
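The multilingual place-name search can be thought of as a lookup from any language variant to one canonical place. The sketch below shows that idea only; the `PLACES` gazetteer, its keys and both helper functions are hypothetical, and the slides do not say how the application stores name variants.

```python
# Hypothetical gazetteer: canonical key -> name variants by language code.
PLACES = {
    "graz": {"sl": "Gradec", "de": "Graz"},
    "ljubljana": {"sl": "Ljubljana", "de": "Laibach"},
}

def build_variant_index(places):
    """Map every case-folded name variant to its canonical place key."""
    index = {}
    for key, names in places.items():
        for name in names.values():
            index[name.casefold()] = key
    return index

def find_place(query, index):
    """Return the canonical key for a place name in any known language."""
    return index.get(query.strip().casefold())

if __name__ == "__main__":
    idx = build_variant_index(PLACES)
    print(find_place("Gradec", idx), find_place("Graz", idx))
```

With such an index, a search for either "Gradec" or "Graz" resolves to the same place, so entries match regardless of the language the user types.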

  13. Future plans
     • Expansion and normalization of the numerous abbreviations; the difficulty is that Slovenian is a highly inflectional language
     • Named Entity Recognition: to enable (semi-)automatic extraction and encoding of personal and place names occurring in the full text
     • Encode other information in the full text: relatives within SBL, person disambiguation, links within SBL and to external sources, e.g. COBISS bibliographical records, Wikisource (online literature publication)
     • Map place names on an atlas, e.g. Google Maps
     • Slovenian Biographical Hub: SBL joined by other biographical resources

  14. Thank you! (Hvala!) Welcome to beta: http://nl.ijs.si/fedora/sbl
