
The Universities’ Collection Databases



Presentation Transcript


  1. The Universities’ Collection Databases. “The Universities’ Collection Databases” denotes all databases developed by the Unit for digital documentation at the Arts Faculty, University of Oslo. The databases contain data from archaeology, anthropology, botany, zoology, numismatics, history, history of art, and lexicography. The databases are accessible via specially developed end-user applications and via the WWW.

  2. The Universities’ Collection Databases. This presentation gives an overview of: • A common user interface • Samples from some of the databases

  3. Implementation • The databases are implemented in Oracle 8.1.7, without using any specific object-oriented features • The object types (and the table structures) are defined in a common meta database • All databases are accessed via a common framework • The common framework gets design and structure information from the meta database. All queries are generated automatically on the basis of the information in the meta database. • Each user is granted access via a user database • The user interface program checks the meta database for new versions of modules and upgrades itself automatically via the net. • New databases are added regularly • A WWW version is being developed
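
The slides do not show how queries are generated from the meta database, so the following is only a minimal sketch of the general idea. The `ObjectType` class, the `ARTIFACT` table, and its column names are hypothetical and chosen for illustration; only the named bind syntax (`:name`) is standard Oracle usage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectType:
    """One object type as a meta database might describe it (hypothetical shape)."""
    name: str
    table: str
    columns: tuple


# Hypothetical metadata entry; the real table and column names are not in the slides.
ARTIFACT = ObjectType(
    name="artifact",
    table="ARTIFACT",
    columns=("ID", "ARTIFACT_TYPE", "COUNTY"),
)


def build_query(obj_type, criteria):
    """Generate a parameterised SELECT from the metadata and the search criteria."""
    # Reject search fields that the metadata does not define for this object type.
    unknown = [col for col in criteria if col not in obj_type.columns]
    if unknown:
        raise ValueError(f"columns not defined for {obj_type.name}: {unknown}")
    sql = f"SELECT {', '.join(obj_type.columns)} FROM {obj_type.table}"
    if criteria:
        where = " AND ".join(f"{col} = :{col.lower()}" for col in criteria)
        sql += f" WHERE {where}"
    # Values travel separately as bind variables, never inside the SQL text.
    binds = {col.lower(): value for col, value in criteria.items()}
    return sql, binds


sql, binds = build_query(ARTIFACT, {"ARTIFACT_TYPE": "ring"})
print(sql)
print(binds)
```

Because the SELECT list and the permitted search fields both come from the metadata entry, adding a new database in this sketch only requires a new `ObjectType` record, not new query code.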

  4. The users have their personal navigator for quick access to databases of interest. Each database has an associated object type.

  5. The users can add their own folders or categories to the navigator

  6. Choose a database (archaeological artifacts and finds). Search for the artifact type “ring”.

  7. Click on a column title to sort the result grid

  8. Drag and drop a column title to group the rows in the grid. 9 rings found in the county “Akershus”.

  9. Double click to view detailed information (show the object viewer)

  10. The artifacts found together with the selected ring (in the same find event)

  11. The users can export the data as HTML, Excel or according to the users’ predefined report templates

  12. The result grid exported to Excel

  13. The users can define report templates

  14. Drag and drop result rows onto a predefined report template of the corresponding object type to create a report

  15. The report is ready to be printed

  16. Click to save pointers to selected rows in the result grid. A list can hold pointers to a manually selected set of objects or to a dynamic (query-defined) set. The pointers can be of a single object type or of different types. In the latter case the type of the list will be the common supertype.
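
The rule that a mixed list takes the common supertype of its pointers can be illustrated with a small sketch. The type hierarchy below is hypothetical (the slides do not show the actual hierarchy in the meta database), and the `Pointer` class is an assumed shape for a stored pointer.

```python
from dataclasses import dataclass

# Hypothetical type hierarchy: child type -> parent type (None marks the root).
SUPERTYPE = {
    "object": None,
    "artifact": "object",
    "ring": "artifact",
    "coin": "artifact",
    "place_name": "object",
}


def ancestors(type_name):
    """Return the type itself followed by its supertypes up to the root."""
    chain = []
    while type_name is not None:
        chain.append(type_name)
        type_name = SUPERTYPE[type_name]
    return chain


def common_supertype(type_names):
    """Return the most specific type that every given type is or inherits from."""
    names = list(type_names)
    if not names:
        raise ValueError("an empty list has no type")
    # Start from the first type's chain (most specific first) and keep only
    # the types that also appear in every other type's chain.
    candidates = ancestors(names[0])
    for name in names[1:]:
        others = set(ancestors(name))
        candidates = [t for t in candidates if t in others]
    return candidates[0]


@dataclass(frozen=True)
class Pointer:
    """A stored reference to one object (assumed shape)."""
    object_type: str
    object_id: int


def list_type(pointers):
    """Type of a stored list: the common supertype of everything it points to."""
    return common_supertype(p.object_type for p in pointers)


print(list_type([Pointer("ring", 1), Pointer("coin", 2)]))
print(list_type([Pointer("ring", 1), Pointer("place_name", 7)]))
```

In this sketch a list of rings and coins is typed as `artifact`, while a list mixing a ring with a place name falls back to the root type `object`.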

  17. Click on the list icon to see the content of a list. In the system, a stored list is just a (sub)database and can be queried.

  18. Additional pointers can be added to an existing list

  19. Click on the explorer icon to get an overview of users and data sources (databases and stored lists)

  20. Click to see both the result grid and the object viewer. Select another database (here: place name excerpts).

  21. Click to switch windows. Display the object corresponding to the next row in the result grid.

  22. The users can create and store their personal result grid design. The tree structure reflects the structure of the object type as defined in the meta database.

  23. The users can create and store their personal query form design

  24. Linguistic and lexicographic applications • Lexicographic archives • Lexical databases • Dictionary databases • Editing tools • The Meta Dictionary - a tool for the field linguist or lexicographer • The Norwegian Dictionary project • Text corpus tools

  25. Lexical archives • The database for the traditional word slip collection of the Norwegian Dictionary project • Main collection: 2,900,000 facsimiles • Regional collection: 187,000 facsimiles • The database is linked to the Meta Dictionary

  26. Head word • Part of speech • Literature references • Place of utterance • Facsimiles

  27. Morphological databases • Lists with lemmata and inflected forms for the two Norwegian written languages (bokmål, nynorsk) • Basis for a two-level morpho-syntactic tagger • Produced in collaboration with the Text Laboratory at the Arts Faculty, Univ. of Oslo • Bokmål: 156,000 lemmata, 1.2 million inflected forms • Nynorsk: 123,000 lemmata, 896,000 inflected forms • The databases are linked to the Meta Dictionary

  28. Lemma. Paradigm codes and generated inflected forms.
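
As a rough illustration of how a paradigm code can drive the generation of inflected forms, the sketch below attaches suffix lists to a lemma. The codes `m1` and `n1` and the table layout are hypothetical; the actual paradigm codes used in the morphological databases are not shown in the slides. The two example nouns follow regular bokmål inflection.

```python
# Hypothetical paradigm codes mapped to suffixes, in the order of LABELS.
PARADIGMS = {
    "m1": ("", "en", "er", "ene"),  # regular masculine noun: ring, ringen, ringer, ringene
    "n1": ("", "et", "", "ene"),    # monosyllabic neuter noun: hus, huset, hus, husene
}
LABELS = ("sg indef", "sg def", "pl indef", "pl def")


def inflect(lemma, code):
    """Generate the inflected forms of a lemma from its paradigm code."""
    suffixes = PARADIGMS[code]
    return {label: lemma + suffix for label, suffix in zip(LABELS, suffixes)}


print(inflect("ring", "m1"))
print(inflect("hus", "n1"))
```

Storing one code per lemma instead of every inflected form keeps the lemma list compact; the full form lists can be regenerated whenever a paradigm is corrected.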

  29. Dictionary databases • Database tools for two major Norwegian dictionaries • The entire process from editing to camera-ready manuscript • The tools are integrated in the common framework • The manuscripts are linked to the Meta Dictionary

  30. The dictionary entry

  31. Fields for different information categories. Graphical representation of the definition structure. The editing tools are for the time being not part of the common framework. A WYSIWYG presentation of the entry.

  32. The entries can be viewed in their running context. The program generates the head word part of the entry based on the lemma and part-of-speech marking. Navigation buttons.

  33. A set of entries (or the entire manuscript) can be typeset in the PDF format and presented on the screen. • The entries are exported from the database as XML documents, converted via TeX and DVI to PDF, and sent back to the user.
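
The first step of such a pipeline, turning an exported XML entry into TeX source, might look like the sketch below. The element names (`entry`, `headword`, `pos`, `definition`) and the TeX layout are assumptions made for illustration; the slides do not show the actual XML schema or the TeX macros used. The later TeX to DVI to PDF steps would be handled by external tools and are not shown.

```python
import xml.etree.ElementTree as ET

# Characters that need a backslash in TeX running text. Backslash, tilde and
# caret need special commands and are deliberately not handled in this sketch.
TEX_SPECIALS = {"&": r"\&", "%": r"\%", "$": r"\$", "#": r"\#",
                "_": r"\_", "{": r"\{", "}": r"\}"}


def tex_escape(text):
    """Escape the common TeX special characters in plain text."""
    return "".join(TEX_SPECIALS.get(ch, ch) for ch in text)


def entry_to_tex(xml_text):
    """Convert one exported dictionary entry (hypothetical schema) to a TeX paragraph."""
    entry = ET.fromstring(xml_text)
    head = tex_escape(entry.findtext("headword", default=""))
    pos = tex_escape(entry.findtext("pos", default=""))
    definition = tex_escape(entry.findtext("definition", default=""))
    # Head word in bold, part of speech in italics, then the definition text.
    return rf"\textbf{{{head}}} \textit{{{pos}}} {definition}\par"


sample = ("<entry><headword>hus</headword><pos>n.</pos>"
          "<definition>house &amp; home</definition></entry>")
print(entry_to_tex(sample))
```

Escaping happens after XML parsing, so an ampersand written as `&amp;` in the export reaches TeX as `\&` rather than as a raw alignment character.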

  34. The Norwegian Dictionary • A national dictionary project (nynorsk) • To be finished in 12 volumes by year 2014 • DOK is developing the software solutions • The dictionary manuscript is linked to the Meta Dictionary

  35. Graphical representation of the entry. The full text based on the structure of the entry.

  36. Each part of the dictionary entry has its own data entry form. Data entry form for the head word part. The entry text is updated automatically and continuously (“Artikkelteksten vert kontinuerleg oppdatert”).

  37. Skard’s dictionary • Defines the 1938 orthography • 32,000 entries • The dictionary is linked to the Meta Dictionary

  38. The Meta Dictionary • A tool for systematising weakly normalized languages and a tool for the development of the Norwegian Dictionary (NO2014) • Interlinks different lexical databases • 521,000 headwords (NO2014) • The backbone of the NO2014 project

  39. 924 slips about the word “hus” (house)

  40. Word forms/lemmata written in different dialects and/or according to changing orthographies

  41. Word compound analysis

  42. Object viewer according to the type of lexical resource (here: slips). Links to other lexical resources.

  43. Tool for fast normalization of the head words in the Meta Dictionary. Each project assistant has to normalize 300 entries a day. All links are manually checked.

  44. Norwegian (Nynorsk) electronic text corpus: Background • Editorial requirements for NO2014 • Design and implementation • Unit for digital documentation, DOK • Work began in August 2002 and will continue according to the tasks assigned to the unit by NO2014 for one year • daniel.ridings@muspro.uio.no

  45. Norwegian (Nynorsk) electronic text corpus: Long-term goals • The definitive corpus for New Norwegian for lexicography and for other domains using electronic resources • A corpus access system that can be reused for other languages and text collections • Incorporation of robust methods from computational linguistics with the goal of creating a linguistic workbench, over and above a corpus workbench

  46. Norwegian (Nynorsk) electronic text corpus: Application area • Editorial work within NO2014 • Headword selection • Choice of examples • Examples are catalogued in the Meta Dictionary • Sense division • Firth: Knowing a word by the company it keeps. • Aided by the refined collection of examples

  47. Norwegian (Nynorsk) electronic text corpus: Integration with the Meta Dictionary • Excerpts refined by • Methods from computational linguistics • Human interaction • Eventually a selection will be made for publication, but in the framework of the Meta Dictionary, even those that were excluded from publication will remain available for other application areas • Communication with the editing software through the Meta Dictionary

  48. Norwegian (Nynorsk) electronic text corpus: Design • Representative corpus based on specifications produced by the EU language resources project LE-PAROLE • SGML markup in accordance with PAROLE’s specifications, based on TEI • One-to-one mapping between the PAROLE format and a database structure defined in Oracle

  49. Norwegian (Nynorsk) electronic text corpus: Status • 25,000,000+ words • Dag og Tid (newspaper) • 21,000,000 words • Legacy data • approx. 5,000,000 (literature) • Existing agreements • Weekly deliveries from Dag og Tid • Samlaget (publishing house) • Syn og Segn (monthly magazine)
