LOCAH Project and Considerations of Linked Data Approaches. 29th March 2011, JISC Managing Research Data International Workshop, Birmingham, UK. Adrian Stevenson, LOCAH Project Manager.

  1. LOCAH Project and Considerations of Linked Data Approaches • 29th March 2011 • JISC Managing Research Data International Workshop, Birmingham, UK • Adrian Stevenson, LOCAH Project Manager • UKOLN is supported by: [sponsor logos]

  2. “The term Linked Data refers to a set of best practices for publishing and connecting structured data on the Web.” • “the Semantic Web is the goal or end result… Linked Data provides the means to reach that goal” • From ‘Linked Data: The Story So Far’, Bizer, Heath and Berners-Lee, 2009

  3. The goal of Linked Data is to enable people to share structured data on the Web as easily as they can share documents today. Bizer/Cyganiak/Heath Linked Data Tutorial, linkeddata.org

  4. In essence, it marks a shift in thinking from publishing data in human readable HTML documents to machine readable documents. That means that machines can do a little more of the thinking work for us. http://www.linkeddatatools.com/semantic-web-basics

  5. But haven’t we been putting linked data on the web for years? • In CSV, relational databases, XML etc.? • Well yes, but these approaches are not so easy to integrate • Web 2.0 mashups work against a fixed set of data sources • Linked Data applications operate on top of an unbound, global data space.

  6. So what’s been happening?

  7. Data.gov.uk Officially launched 21st January 2010

  8. BBC Music

  9. A little bit of the techy stuff

  10. Linked Data is … • A way of publishing data on the web that: • Encourages reuse • Reduces redundancy • Maximises inter-connectedness • Enables network effects • So how is this achieved?

  11. Presentational tagging – HTML
      <h1>Manchester Physiotherapy Centre</h1>
      <p>Welcome to the Manchester Physiotherapy Centre home page. Do you feel pain? Have you had an injury? Let our staff take care of your body and soul.</p>
      <h2>Consultation hours</h2>
      Mon 11am - 7pm<br/>
      Tue 11am - 7pm<br/>
      Wed 3pm - 7pm<br/>
      Thu 11am - 7pm<br/>
      Fri 11am - 3pm
      <p>Please note that we will not be offering consultation during the weeks of the <a href=". . .">Olympic</a> games.</p>

  12. Semantic tagging
      <company>
        <treatmentOffered>Physiotherapy</treatmentOffered>
        <companyName>Manchester Physiotherapy Centre</companyName>
        <staff>
          <therapist>Lisa Davenport</therapist>
          <therapist>Steve Matthews</therapist>
          <secretary>Kelly Townsend</secretary>
        </staff>
      </company>
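
A minimal sketch of why the semantic version is easier for a machine to use: because each element name says what its content means, a program can pull out the company name or the staff roles without guessing from layout. The record is the one on the slide; the helper name `staff_by_role` is made up for illustration.

```python
import xml.etree.ElementTree as ET

# The semantically tagged record from the slide, unchanged.
RECORD = """
<company>
  <treatmentOffered>Physiotherapy</treatmentOffered>
  <companyName>Manchester Physiotherapy Centre</companyName>
  <staff>
    <therapist>Lisa Davenport</therapist>
    <therapist>Steve Matthews</therapist>
    <secretary>Kelly Townsend</secretary>
  </staff>
</company>
""".strip()


def staff_by_role(xml_text):
    """Group staff names by the element name that describes their role."""
    root = ET.fromstring(xml_text)
    roles = {}
    for person in root.find("staff"):
        roles.setdefault(person.tag, []).append(person.text)
    return roles


company = ET.fromstring(RECORD)
print(company.findtext("companyName"))
print(staff_by_role(RECORD))
```

The same questions asked of the HTML on the previous slide would need text scraping, since `<h1>` and `<p>` only describe presentation.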

  13. Linked Data Design Issues • URIs • LD Design Issues • Triples http://www.w3.org/DesignIssues/LinkedData.html

  14. URIs and HTTP • A ‘Uniform Resource Identifier’ (URI) provides a simple and extensible means for identifying a resource - RFC 3986 • HTTP URIs can be ‘de-referenced’ • A URL is a type of URI • HTTP URIs are used for “real world” things • http://adrianstevenson.com/id/me • http://dbpedia.org/page/Tim_Berners-Lee
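
As a rough illustration of 'de-referencing', the sketch below builds an HTTP GET request for one of the URIs on the slide and asks for RDF/XML rather than HTML through the `Accept` header. It only constructs the request and does not send it; whether a given server still answers, or honours that media type, is an assumption. The function name `rdf_request` is hypothetical.

```python
from urllib.request import Request


def rdf_request(uri):
    """Build (but do not send) a GET request asking for an RDF/XML description."""
    return Request(uri, headers={"Accept": "application/rdf+xml"})


req = rdf_request("http://adrianstevenson.com/id/me")
print(req.get_method(), req.full_url)
print("Accept:", req.get_header("Accept"))
```

Passing `req` to `urllib.request.urlopen` would perform the lookup; a Linked Data server typically responds with, or redirects to, a document describing the identified thing.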

  15. RDF • Resource Description Framework • a language for representing information about resources on the Web • RDF can be used to represent things identified on the Web, even when they cannot be directly retrieved on the Web • Describes relations using ‘triples’ • http://www.w3.org/TR/REC-rdf-syntax/

  16. Triples • Triple statements • ‘Things’ have ‘properties’ with ‘values’ • Subject – Predicate – Object • Triples are the basis of RDF • [Diagram: example triples. Repository, Provides Access To, ArchivalResource; Keith Richards, Is Member Of, The Rolling Stones]
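
The subject, predicate, object shape can be sketched with plain Python tuples. Reading the diagram labels as the two triples below is an assumption, and the predicate spellings and the `match` helper are illustrative only; real RDF would use URIs for each term.

```python
# Each triple is (subject, predicate, object).
TRIPLES = [
    ("Keith Richards", "isMemberOf", "The Rolling Stones"),
    ("Repository", "providesAccessTo", "ArchivalResource"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that agree with every term that is not None."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# "What is Keith Richards a member of?"
for _, _, group in match(TRIPLES, s="Keith Richards", p="isMemberOf"):
    print(group)
```

Leaving a term as `None` acts like a variable in a query pattern, which is the idea that SPARQL builds on.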

  17. BBC Music

  18. LOCAH Project

  19. What is the LOCAH Project? • Linked Open Copac and Archives Hub • Funded by #JiscEXPO 2/10 ‘Expose’ call • 1 year project. Started August 2010 • http://blogs.ukoln.ac.uk/locah/ tag: #locah

  20. What are the Archives Hub and Copac? • National data services • The Archives Hub is an aggregation of archival descriptions from archive repositories across the UK • http://archiveshub.ac.uk • Copac provides access to the merged library catalogues of libraries throughout the UK, including all national libraries • http://copac.ac.uk

  21. What is LOCAH Doing? • Part 1: Exposing Archives Hub & Copac data as Linked Data • Part 2: Creating a prototype visualisation • Part 3: Reporting on opportunities and barriers

  22. LOCAH Linked Data • If something is identified, it can be linked to • We can then take items from one dataset and link them to items from other datasets • [Diagram: linked datasets. BBC, Copac, VIAF, DBpedia, GeoNames, Archives Hub]

  23. The Linking benefits of Linked Data • [Diagram: linked items. BBC:Cranford, Copac:Cranford, VIAF:Dickens, DBpedia:Gaskell, Hub:Gaskell, Geonames:Manchester, DBpedia:Dickens, Hub:Dickens]
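
One way to picture the linking benefit is as a small graph of "same thing" links between datasets. In the sketch below the node labels come from the slide, but which nodes are actually joined in the diagram is not recoverable from the transcript, so the links in `SAME_AS` are assumptions, as is the helper name `equivalents`.

```python
from collections import defaultdict, deque

# Assumed equivalence links between records in different datasets.
SAME_AS = [
    ("Hub:Dickens", "VIAF:Dickens"),
    ("VIAF:Dickens", "DBpedia:Dickens"),
    ("Hub:Gaskell", "DBpedia:Gaskell"),
]


def equivalents(links, start):
    """Follow links in both directions and collect every reachable identifier."""
    neighbours = defaultdict(set)
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


print(sorted(equivalents(SAME_AS, "Hub:Dickens")))
```

Starting from an Archives Hub record, an application can reach descriptions of the same person held elsewhere without any of the datasets having been merged.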

  24. Archives Hub Model (as at 14/2/2011) • [Diagram: data model. Entities: Finding Aid, EAD Document, Repository (Agent), Place, PostcodeUnit, ArchivalResource, Level, Language, Biographical History, Extent, Creation, TemporalEntity, Concept, ConceptScheme, Agent, Person, Family, Organisation, Object, Book, Genre, Function, Birth, Death. Relationship labels: in, administeredBy/administers, maintainedBy/maintains, encodedAs/encodes, hasPart/partOf, accessProvidedBy/providesAccessTo, topic/page, hasBiogHist/isBiogHistFor, level, language, at time, origination, product of, associatedWith, extent, inScheme, representedBy, Is-a, foaf:focus, participates in]

  25. Enhancing our data • Already have some links: • lexvo.org URIs for languages of archival materials • reference.data.gov.uk URIs for time periods • Postcodes, using both UK Postcodes URIs and Ordnance Survey URIs • Virtual International Authority File (VIAF) • Matches and links widely-used authority files - http://viaf.org/ • DBpedia • Also looking at: • Library of Congress Subject Headings

  26. http://data.archiveshub.ac.uk/id/archivalresource/gb1086skinner

  27. http://data.archiveshub.ac.uk/doc/person/ncarules/chamberlainarthurneville1869-1940statesman
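
The URIs in this deck come in two forms: `/id/` for the thing itself and `/doc/` for a document about it. A sketch of that mapping, assuming the only difference is the first path segment (the slides show both forms but do not state the rule, and `doc_uri` is a hypothetical helper):

```python
from urllib.parse import urlsplit, urlunsplit


def doc_uri(id_uri):
    """Map an /id/ URI for a thing to the /doc/ URI of the document describing it."""
    parts = urlsplit(id_uri)
    if not parts.path.startswith("/id/"):
        raise ValueError("not an /id/ URI: " + id_uri)
    new_path = "/doc/" + parts.path[len("/id/"):]
    return urlunsplit(parts._replace(path=new_path))


print(doc_uri("http://data.archiveshub.ac.uk/id/archivalresource/gb1086skinner"))
```

Keeping the two apart lets statements about the archive (its extent, its creator) stay separate from statements about the web document (its type, its licence).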

  28. How are we creating the Visualisation Prototype? • Based on researcher use cases • Data queried from SPARQL endpoint • Use tools such as Simile, Many Eyes, Google Charts • Also looking at custom built prototype
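
Querying a SPARQL endpoint usually means sending the query text as a `query` parameter over HTTP. The sketch below builds such a GET URL without sending it. The endpoint address is a placeholder, not the project's actual endpoint, and the query is a generic "first ten triples" pattern rather than one taken from the LOCAH use cases.

```python
from urllib.parse import urlencode

# Placeholder endpoint; the slides do not give the real address.
ENDPOINT = "http://example.org/sparql"

QUERY = """
SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
LIMIT 10
""".strip()


def sparql_get_url(endpoint, query):
    """Return the GET URL that carries the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


print(sparql_get_url(ENDPOINT, QUERY))
```

A visualisation tool would fetch this URL, typically asking for JSON or XML results, and plot the returned bindings.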

  29. Use Case Slide http://www.w3.org/2005/Incubator/lld/wiki/Use_Case_LOCAH

  30. Visualisation Prototype • Using Timemap • Google Maps and Simile • http://code.google.com/p/timemap/ • Early stages with this • Will give location and ‘extent’ of archive • Will link through to Archives Hub

  31. Some issues • Data Modelling • Sustainability • Provenance • Licensing

  32. Data Modelling Challenges • Archival description is hierarchical and multi-level • Archives Hub: inconsistencies in data and lack of standardisation • There's no content standard in the UK

  33. Sustainability • Can you rely on data sources long-term? • Ed Summers at the Library of Congress created http://lcsh.info • Linked Data interface for LOC subject headings • People started using it

  34. Library of Congress Subject Headings

  35. Provenance • Triples create individual statements • OK if data ‘watermarked’ <http://data.archiveshub.ac.uk/doc/archivalresource/gb1086skinner> rdf:type foaf:Document • But can often be a problem

  36. Licensing • Nature of Linked Data: each triple as a piece of data • ‘Ownership’ of data • Hard to track attribution • We’re using CC BY-NC 2.0 for now

  37. Questions? Slides available at http://slidesha.re/fT6QIe

  38. Attribution and CC License • Sections of this presentation adapted from materials created by other members of the LOCAH Project • This presentation available under Creative Commons Non Commercial-Share Alike: http://creativecommons.org/licenses/by-nc/2.0/uk/
