
LinkSphere: P2P Cross Database Search -- Architecture and Issues Hugo Mills University of Reading


Presentation Transcript


  1. LinkSphere: P2P Cross Database Search -- Architecture and Issues • Hugo Mills • University of Reading

  2. LinkSphere • Linking Researchers and their Data • Social networking for researchers • Cross-database search • Mostly Arts and Humanities datasets • “Promoting serendipity” • Access by and presentation of datasets to wider audiences

  3. Datasets • Museums Archives • Archaeology: Silchester Excavation, IADB • Ure Museum of Classical Archaeology • CentAUR: ePrints • Library • Beckett Collection • Cole Museum of Zoology • Film Collection • Herbarium • Typography Collections

  4. Tycho • Fully asynchronous peer-to-peer communications framework • Written in Java • Fully distributed • Robust • “A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable.” (Leslie Lamport) • Has a simple distributed data store (“Virtual Registry”) for client metadata

  5. Tycho • (Relatively) lightweight • 3 MiB for a fully functional system • Fast • Flexible, Extensible • Bootstrap handlers • Additional message types • VR extensions • Alternative communication protocols • Discovery of core mediators via Bonjour/ZeroConf

  6. XDB System Architecture • [Diagram; labels recovered from the slide: Search App (×2) • REST search API • Tycho Core • Meta (×4) • VR (×4) • JDBC, ..., Web API, SPARQL • Repo (×4)]

  7. User Interface • Main UI is web-based • Uses AJAX • Currently embedded within the LinkSphere project site • Will ultimately move to the SNS • Any UI possible using the REST API

  8. Issues • Getting the data is hard • Implementation problems • Maintenance problems • Admin problems • Social problems • Legal problems

  9. “Muddling along” • Archive of material for intra-departmental use only • Some legal issues involved • Group of technicians administering the data • Poor quality data • Excel spreadsheet(!) • Reluctant to have index of material made public

  10. “Not ready yet” • Big university projects • New systems, (potentially) large data sets • MERL museums archive (AdLib) • Data all loaded from previous systems • Access modules not yet installed • CentAUR publications archive (ePrints 3) • Very little data available yet

  11. “Works For Me” • Custom web application • PHP, sophisticated • External developer • No documentation • MySQL underneath

  12. “It works, but...” (part 1) • Non-technical users • Admins are Mac-only, desktop-only people • FileMaker Pro • DB structure and UI developed externally • No documentation • This has bad implications

  13. “It works, but...” (part 2) • Completely custom application • External developer • No documentation (again) • Large lump of write-only Perl • Custom data store • Not SQL. Not XML. Not RDF. • No external access

  14. Unreachable data • Uncommunicative systems • Custom applications • Developers/administrators AWOL • Custom data models • Lost passwords • Excel spreadsheets • See also, “Uncommunicative”

  15. Unreachable data • Private data • Legal issues • Possessive owners • Internal use only • Poor quality • No data!

  16. Conclusions • Building the software is easy • There is still lots of hard-to-reach data out there • Issues are largely not technical • More outreach to A&H areas needed

  17. Acknowledgements and thanks • LinkSphere team: Mark Baker, Shirley Williams, Pat Parslow (Reading), Claire Warwick, Melissa Terras, Claire Ross (UCL) • Repository owners at Reading: Amy Smith (Ure Museum), Guy Baxter (University Archivist), Mary Dyson, Hadj Messelles (Typography), Jonathan Bignell (Film Studies), Alison Sutton (CentAUR), Mike Fulford, Amanda Clarke (Silchester) • JISC VRE 3 programme

  18. Tycho Architecture • [Diagram: a network of interconnected nodes labelled C, M and VR]

  19. REST Interface • /api/query • POST to start new query asynchronously • /api/query/query_id • GET for query metadata • DELETE to cancel query (or it will time out naturally) • /api/query/query_id/start/finish • GET a range of results from the query • Feedback API coming soon
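
The query endpoints on slide 19 can be sketched as a set of request builders. This is only an illustration under stated assumptions: the base URL, the function names, and the POST body are hypothetical, and the slides do not say what format the query body or the responses take, nor whether the start/finish range is inclusive. The sketch builds `urllib.request.Request` objects without sending them.

```python
from urllib.parse import quote
from urllib.request import Request

# Hypothetical base URL; the slides do not give a host or port.
BASE = "http://localhost:8080"


def _query_path(query_id: str) -> str:
    # Percent-encode the id so it is safe to use as a single path segment.
    return f"{BASE}/api/query/{quote(query_id, safe='')}"


def start_query(body: bytes) -> Request:
    """POST /api/query: start a new query asynchronously.

    The body format is an assumption; the slides do not specify it.
    """
    return Request(f"{BASE}/api/query", data=body, method="POST")


def query_metadata(query_id: str) -> Request:
    """GET /api/query/query_id: fetch metadata about a query."""
    return Request(_query_path(query_id), method="GET")


def cancel_query(query_id: str) -> Request:
    """DELETE /api/query/query_id: cancel a query before it times out."""
    return Request(_query_path(query_id), method="DELETE")


def query_results(query_id: str, start: int, finish: int) -> Request:
    """GET /api/query/query_id/start/finish: fetch a range of results."""
    return Request(f"{_query_path(query_id)}/{start}/{finish}", method="GET")
```

Because the query runs asynchronously, a client would presumably start it, poll the metadata endpoint, and then page through results with successive start/finish ranges; passing one of these objects to `urllib.request.urlopen` would send it.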

  20. REST Interface • /api/repository • GET list of repositories currently online • /api/repository/repo_id • GET for repository metadata • Link to repository itself • Link to LinkSphere description of it
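
The repository endpoints on slide 20 follow the same pattern. A minimal sketch, assuming a hypothetical base URL and hypothetical function names; the response format and the shape of the repository metadata are not described in the slides, so nothing is parsed here.

```python
from urllib.parse import quote
from urllib.request import Request


def list_repositories(base: str) -> Request:
    """GET /api/repository: list the repositories currently online."""
    return Request(f"{base.rstrip('/')}/api/repository", method="GET")


def repository_metadata(base: str, repo_id: str) -> Request:
    """GET /api/repository/repo_id: fetch metadata for one repository.

    Per the slide, the metadata links to the repository itself and to
    the LinkSphere description of it.
    """
    # Percent-encode the id so it is safe to use as a single path segment.
    segment = quote(repo_id, safe="")
    return Request(f"{base.rstrip('/')}/api/repository/{segment}", method="GET")
```

Since only repositories that are currently online are listed, a client should probably re-fetch the list rather than cache it for long.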
