ISNI Assignment - PowerPoint PPT Presentation

alma-kirkland
isni annual general assembly frankfurt 2014 n.

Presentation Transcript

  1. ISNI Assignment. ISNI Annual General Assembly, Frankfurt, October 2014. Janifer Gatenby, EMEA Program Manager Metadata, OCLC

  2. Assigned 8 million • Provisional: Possible 701,157 • Provisional: Unassigned 9,953,505

  3. ISNI Assignment: Batch loading • Independent matching sources • 3 VIAF sources

  4. ISNI Matching • Criteria: name; title; partial title; rare title word; date; publisher; personal affiliation; organisation affiliation; ISBN, ISWC, ISAN, DOI + other name identifier (e.g. IPI, VIAF, IPD); instrument; linked entities; Dewey classification • Scores are collected from each judge (ice skating style) • Scores are lowered for common surnames and common titles • Score > .85 = match • Score > .6 but < .85 = possible match
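
The thresholds on this slide can be sketched in a few lines of Python. The slide does not say how the judges' scores are combined or how large the reduction for common names is, so the plain average and the `common_name_penalty` parameter below are assumptions made for illustration only. The two thresholds (.85 and .6) are the only values taken from the slide.

```python
from statistics import mean

MATCH_THRESHOLD = 0.85     # from the slide: score > .85 = match
POSSIBLE_THRESHOLD = 0.60  # from the slide: score > .6 but < .85 = possible match


def combine_scores(judge_scores, common_name_penalty=0.0):
    """Combine per-criterion scores into one value between 0 and 1.

    Assumption: a plain average minus a flat penalty. The real ISNI
    combination rule is not described in the slides.
    """
    if not judge_scores:
        return 0.0
    combined = mean(judge_scores.values()) - common_name_penalty
    return max(0.0, min(1.0, combined))


def classify(score):
    """Map a combined score to the three outcomes named on the slide."""
    if score > MATCH_THRESHOLD:
        return "match"
    # The slide leaves a score of exactly .85 undefined; it is treated
    # here as a possible match (assumption).
    if score > POSSIBLE_THRESHOLD:
        return "possible match"
    return "no match"


if __name__ == "__main__":
    scores = {"name": 1.0, "title": 0.8}  # hypothetical judge scores
    print(classify(combine_scores(scores)))
    print(classify(combine_scores(scores, common_name_penalty=0.1)))
```

With these hypothetical inputs, the same pair of records is classified as a match for a rare name and only as a possible match once the common-name penalty is applied.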

  5. ISNI Assignment: Batch loading • Unique name • Single source

  6. Central database - Trust + % confidence • Publicly accessible: www.isni.org • Assignment is curated • Authoritative • Unique • Trustful • Persistent • Assigned ≈ 8 million • Provisional: Possible ≈ 638,000 • Provisional: Unassigned 9+ million • Matching algorithms • Data sampling • Anomaly checks • Quality assurance processes • End User input notes - % confidence • Assignment only if confident

  7. Confidence • The two main problems for maintaining persistence are • duplicates needing to be merged • undifferentiated identities needing to be split • ISNI errs on the side of making duplicates rather than mixed identities • Thus the batch load process (usually) makes a provisional record • where there is no match (for fear of making a duplicate assignment) • where there is a low confidence match (for fear of making a mixed identity or a duplicate assignment) • where a matching record already has another local ID for the same source, regardless of the strength of the match (for fear of making a mixed identity)
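
The three provisional cases listed on this slide can be read as a small decision function. The sketch below is one possible reading, not the actual batch load program: the `Candidate` type, the reason strings and the final "attach" outcome are hypothetical, and the .85 threshold is borrowed from the matching slide. The slide itself says "(usually)", so real behaviour has exceptions that are not modelled here.

```python
from dataclasses import dataclass, field
from typing import Optional

MATCH_THRESHOLD = 0.85  # from the ISNI Matching slide


@dataclass
class Candidate:
    """Best matching database record for an incoming record (hypothetical type)."""
    score: float
    # Maps a source code to the local ID that source already has on the record.
    local_ids_by_source: dict = field(default_factory=dict)


def batch_decision(source: str, best: Optional[Candidate]) -> str:
    """Return the outcome for one incoming batch record."""
    if best is None:
        # No match: a provisional record avoids a duplicate assignment.
        return "provisional: no match"
    if source in best.local_ids_by_source:
        # Same source already present, regardless of match strength:
        # a provisional record avoids a mixed identity.
        return "provisional: same source already on matching record"
    if best.score <= MATCH_THRESHOLD:
        # Low-confidence match: avoids a mixed identity or a duplicate.
        return "provisional: low confidence match"
    # Assumption: a confident match is attached to the existing record.
    return "attach to matching record"


if __name__ == "__main__":
    existing = Candidate(score=0.95, local_ids_by_source={"SRC_A": "a-123"})
    print(batch_decision("SRC_B", existing))
    print(batch_decision("SRC_A", existing))
```

The same-source check runs before the score check because the slide states that this case applies regardless of the strength of the match.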

  8. Procedures for maximizing assignment • Refinement of matching algorithms • E.g. introduced rare title word • Now ignoring a date of birth of 1900 • Re-import program • Rematch with new rules • Rematch after new data added • ISNI Quality Team: Data sampling • Assessing impact of single source • Recommendations for program changes • New criteria • Assessing uncommon surname assignment • Rules for online rich assignment

  9. Online: Guarantee assignment – Personal Name • ISNIs will be automatically assigned where there are no possible matches in these cases: • There are matches with a database record with a different source • A personal name is unique and includes a surname and forename • The request includes an “isNot” statement • The metadata supplied is considered rich as per these cases: • Full date of birth and death supplied • Year of birth + 1 title or instrument + 1 related name (co-author or affiliated institution) • 1 title or instrument + 1 external URL link of type encyclopaedia, home page (not social network page) + 1 related name (co-author or affiliated institution) • The request is resolving a possible match by including a PPN
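
The three "rich metadata" cases for personal names can be expressed as a single predicate. This is a minimal sketch under stated assumptions: the `PersonRequest` fields, the date format ("YYYY" or "YYYY-MM-DD") and the link type labels are hypothetical, and "1 title or instrument" is read as "at least one". None of these names come from the ISNI request schema.

```python
from dataclasses import dataclass, field

# Link types that count towards richness, per the slide (labels are assumed).
QUALIFYING_LINK_TYPES = {"encyclopaedia", "home page"}


@dataclass
class PersonRequest:
    """Hypothetical, simplified view of an online request for a person."""
    birth_date: str = ""  # "YYYY" or "YYYY-MM-DD" (assumed format)
    death_date: str = ""
    titles: list = field(default_factory=list)
    instruments: list = field(default_factory=list)
    related_names: list = field(default_factory=list)  # co-authors, institutions
    link_types: list = field(default_factory=list)


def is_full_date(value: str) -> bool:
    """True for a year-month-day date such as 1950-03-21."""
    parts = value.split("-")
    return len(parts) == 3 and all(p.isdigit() for p in parts)


def is_rich_person(req: PersonRequest) -> bool:
    """Apply the three richness cases from the slide."""
    year = req.birth_date[:4]
    has_birth_year = len(year) == 4 and year.isdigit()
    has_work = bool(req.titles or req.instruments)
    has_related = bool(req.related_names)
    has_link = any(t in QUALIFYING_LINK_TYPES for t in req.link_types)

    full_dates = is_full_date(req.birth_date) and is_full_date(req.death_date)
    year_case = has_birth_year and has_work and has_related
    link_case = has_work and has_link and has_related
    return full_dates or year_case or link_case


if __name__ == "__main__":
    req = PersonRequest(birth_date="1950", titles=["A title"],
                        related_names=["A co-author"])
    print(is_rich_person(req))
```

A social network link does not satisfy the third case, which follows the slide's explicit exclusion.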

  10. Online: Guarantee assignment – Organisation Name • ISNIs will be automatically assigned where there are no possible matches in these cases: • There are matches with a database record with a different source • An organisation name is unique and does not consist only of abbreviations • The metadata supplied is considered rich as per these cases: • Includes LOCODE & Organisation type & Organisation URL • The request is resolving a possible match by including a PPN
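
The organisation rules can be sketched the same way. The slide does not define what counts as an abbreviation, so the heuristic in `only_abbreviations` (short, all-uppercase tokens, dots ignored) is purely an assumption, as are the `OrganisationRequest` field names. The richness rule itself (LOCODE and organisation type and organisation URL) is taken from the slide.

```python
from dataclasses import dataclass


@dataclass
class OrganisationRequest:
    """Hypothetical, simplified view of an online request for an organisation."""
    name: str
    locode: str = ""
    organisation_type: str = ""
    url: str = ""


def looks_like_abbreviation(token: str) -> bool:
    """Assumed heuristic: up to six uppercase letters, dots ignored."""
    letters = token.replace(".", "")
    return letters.isalpha() and letters.isupper() and len(letters) <= 6


def only_abbreviations(name: str) -> bool:
    """True when every word in the name looks like an abbreviation."""
    tokens = name.split()
    return bool(tokens) and all(looks_like_abbreviation(t) for t in tokens)


def is_rich_organisation(req: OrganisationRequest) -> bool:
    """Rich when LOCODE, organisation type and URL are all supplied."""
    return bool(req.locode and req.organisation_type and req.url)


if __name__ == "__main__":
    req = OrganisationRequest(name="Example Library", locode="DEFRA",
                              organisation_type="library",
                              url="https://example.org")
    print(only_abbreviations(req.name), is_rich_organisation(req))
```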

  11. Maximizing assignment • Enter a request record online (Web page or via API) • Batch loaded records – passive method • Quality Team manual fixes • OCLC periodic re-match runs • Matches from later batch loading & online activity • Batch loaded records – active method • Resolve possible matches found by the system • Search the database for candidate records for merging • Enrich a record with URLs to external sources such as author’s web pages, Wikipedia, IMDB, MusicBrainz, Discogs, etc.

  12. Finding possible matches

  13. Resolving Possible Matches

  14. Compare Screen

  15. Adding a new record – Michel Calame

  16. Adding a new record

  17. Adding a new record

  18. Adding a new record for an Organisation

  19. New Organisation form

  20. Adding your source to an existing record

  21. Adding your source to an existing record

  22. Correcting and enriching: These are all the same person. The second has an incorrect DOB = 1900

  23. Enriching: You can add a source note or general note to any database record; your code does not need to be present

  24. Reporting errors: The general note will trigger an email to the ISNI Quality Team for attention

  25. Atom Pub API (Machine to machine) • Requests and replacements (you can replace your existing data citing local identifier) • Request • Atom Pub Header • Content = Request in the ISNI XML Request schema • Documentation • ISNI Atom Pub API guidlines.doc • ISNI request.xsd (XML schema) • ISNI request schema.doc (describes the schema) • ISNI response.xsd (XML schema) • ISNI response schema.doc (describes the schema)
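
As a rough illustration of the request structure described on this slide, the sketch below wraps an XML payload in an Atom entry using Python's standard library. It does not send anything and does not reflect the real ISNI schema: the `<Request>` payload and its child element are placeholders, and the set of Atom header elements the service requires is defined in the ISNI Atom Pub API guidelines, which are not reproduced here. Only the Atom namespace URI is standard.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"  # standard Atom namespace


def build_atom_entry(request_xml: str, title: str = "ISNI request") -> bytes:
    """Wrap an XML request payload in a minimal Atom entry.

    Assumption: title + content is enough for illustration. The real
    service may require further header elements (id, updated, author).
    """
    # Use an explicit prefix so the un-namespaced payload is not pulled
    # into the Atom namespace when serialised.
    ET.register_namespace("atom", ATOM_NS)
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title
    content = ET.SubElement(entry, f"{{{ATOM_NS}}}content",
                            {"type": "application/xml"})
    content.append(ET.fromstring(request_xml))
    return ET.tostring(entry, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    # Placeholder payload: real element names come from ISNI request.xsd.
    payload = "<Request><placeholder>local-id-123</placeholder></Request>"
    print(build_atom_entry(payload).decode("utf-8"))
```

A real client would validate the payload against ISNI request.xsd before wrapping it, and would parse the reply against ISNI response.xsd.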

  26. Documentation: Data Submission

  27. ISNI Charges

  28. What is requested from ISNI Data Contributors? • Act on notifications (new assignments, changed assignments, errors and queries) • Assist in reviewing possible matches (exact matches, then possible matches) • Add a note to any record found with an error • Supply URI • Ingest ISNIs • Keep data up to date (become a RAG or use the services of an existing one)