
New Century, New Metadata




Presentation Transcript

  1. New Century, New Metadata Thomas Krichel http://openlib.org/home/krichel University of Surrey, Hitotsubashi University and Long Island University

  2. Why Metadata • Fun • Information retrieval • Support organization of social process

  3. Crisis of Author Self-archiving • Formal archiving: small, metadata poor • Informal archiving: information retrieval difficult, lack of support infrastructure

  4. Improving formal archiving • Strengthen the metadata provision • Broaden the mission of archiving • Allow usage of archived material in many user services • Better report on archive material usage • Strengthen the relationship with overlay services

  5. Improving Informal Archiving • Build standardized metadata supply format • Harvest that metadata into larger digital libraries • Offer archival backup for papers

  6. Metadata to Support Self-archiving • Simple to compose: intuitive vocabulary that is specific to the academic process, e.g. “author” instead of “creator” • Widely applicable: all disciplines and publication forms • High quality, i.e. controlled

  7. Metadata Control • Any processing that is done to the metadata before its inclusion in a user service. • Essential in a situation where metadata is harvested.

  8. Types of Control • Syntactic control • Relational control • Retrieval control • Identity control • Verity control • Accession control

  9. Basic Model • Four different record types: Document, Group, Person, Organization

  10. Group and document • There is only one document type. • Groups are used to refine the status of the document. • Group construct meant to be defined by librarians, publishers and other intermediaries.

  11. Person and Institution • Person and institution admit very similar attributes • It is hoped that organizational information will be contributed by intermediaries.

  12. Implementation of Basic Model • RePEc • 100000 documents • 100 groups (series) • 500 authors • 5000 institutions • Example • http://ideas.uqam.ca/EDIRC/data/frbgvus.html • Possible to do the same thing for ReLIS

  13. Basic Grammar • XML syntax • Three groups of XML elements • Nouns: elements for the items described • Adjectives: elements that describe nouns • Verbs: elements that relate nouns

  14. Modular Design
      <person>
        <isauthorof>
          <document>
            <ispublishedby>
              <organization>
                <hasmember>
                  <person></person>
                </hasmember>
              </organization>
            </ispublishedby>
          </document>
        </isauthorof>
      </person>
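
The nesting on slide 14 alternates nouns and verbs: each noun contains a verb, which in turn contains the next noun. The following is a minimal sketch of checking that pattern, not the author's software (slide 18 lists parsing and validation software as still to be done). The noun set comes from slide 9; the verb set contains only the verbs that appear on slides 14 and 15 and is therefore incomplete. The helper names `nesting_path` and `alternates` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Nouns are the four record types from slide 9.
NOUNS = {"document", "group", "person", "organization"}
# Only the verbs shown on slides 14 and 15; the full list is not given.
VERBS = {"isauthorof", "ispublishedby", "hasmember", "hasauthor"}

SLIDE_14 = (
    "<person><isauthorof><document><ispublishedby>"
    "<organization><hasmember><person></person></hasmember>"
    "</organization></ispublishedby></document></isauthorof></person>"
)


def nesting_path(element):
    """Return the tags met when following the first child at each level."""
    path = [element.tag]
    children = list(element)
    while children:
        element = children[0]
        path.append(element.tag)
        children = list(element)
    return path


def alternates(path):
    """Return True if the tags alternate noun, verb, noun, ... from a noun."""
    for position, tag in enumerate(path):
        expected = NOUNS if position % 2 == 0 else VERBS
        if tag not in expected:
            return False
    return True


path = nesting_path(ET.fromstring(SLIDE_14))
print(" > ".join(path))
print(alternates(path))
```

Adjectives are left out of this sketch; a fuller check would have to allow them as children of nouns alongside verbs.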

  15. Relational Design • <person id="kmarxthered"><email>k.marx@highgate.london.uk</email></person> • <document id="kapital"><title>Das Kapital</title><hasauthor><person id="kmarxthered"/></hasauthor></document>
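
In the relational design, a noun nested inside a verb carries only an id, and the full record is stored once elsewhere. The sketch below illustrates how such a reference might be resolved. It assumes the two records from slide 15 are wrapped in a `<records>` root element, which is not part of the slides, and the helper names `index_records` and `author_emails` are hypothetical.

```python
import xml.etree.ElementTree as ET

# The two records from slide 15, wrapped in an assumed <records> root
# so that they form a single well-formed XML document.
RECORDS = """<records>
  <person id="kmarxthered">
    <email>k.marx@highgate.london.uk</email>
  </person>
  <document id="kapital">
    <title>Das Kapital</title>
    <hasauthor><person id="kmarxthered"/></hasauthor>
  </document>
</records>"""


def index_records(root):
    """Map each top-level record's id to its element."""
    return {rec.get("id"): rec for rec in root if rec.get("id")}


def author_emails(document, index):
    """Follow hasauthor references and collect the authors' email addresses."""
    emails = []
    for reference in document.findall("./hasauthor/person"):
        target = index.get(reference.get("id"))
        if target is None:
            continue  # unresolved identifier: skip rather than guess
        email = target.findtext("email")
        if email:
            emails.append(email.strip())
    return emails


root = ET.fromstring(RECORDS)
index = index_records(root)
print(index["kapital"].findtext("title"))
print(author_emails(index["kapital"], index))
```

Identifiers that do not resolve are silently skipped here; the resolvability rules for identifiers are listed on slide 17 as an open problem.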

  16. Other features • A lang qualifier applies to all elements; its value is an ISO 639-1 code if it has two letters and the bibliographic variant of ISO 639-2 if it has three letters. • Nouns have an id. • Verbs have startdate and enddate qualifiers, and of course have an id. • Adjectives can have child elements.
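
The lang rule above can be sketched as a small function that decides which standard a code belongs to from its length alone. This is an illustration under assumptions: the function name `lang_code_scheme` is hypothetical, and the check does not verify that the code is actually registered in either standard.

```python
def lang_code_scheme(code):
    """Name the standard a lang value belongs to, judged by length only.

    Two letters are read as ISO 639-1, three letters as the bibliographic
    variant of ISO 639-2. Membership in the official code lists is not
    checked here.
    """
    if not (code.isascii() and code.isalpha()):
        raise ValueError(f"not a letters-only language code: {code!r}")
    if len(code) == 2:
        return "ISO 639-1"
    if len(code) == 3:
        return "ISO 639-2/B"
    raise ValueError(f"expected two or three letters, got {code!r}")


print(lang_code_scheme("de"))   # German in ISO 639-1
print(lang_code_scheme("ger"))  # German in the bibliographic ISO 639-2 variant
```

A stricter validator would look the value up in the published code tables, since a three-letter terminological code such as "deu" passes this length check but is not the bibliographic form.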

  17. Remaining Problems • Resolvability rules for identifiers • Dates and history • Subject classification using the group mechanism • Aliasing of element names

  18. To be done… • Complete list of verbs and adjectives • Schema design • Parsing and validation software. • Conversion with test collection ReLIS.

  19. Collaboration is welcome Thanks for listening. Have a happy New Year.
