Providing useful perspectives onto massive digital collections

Mark Gahegan, Tawan Banchuen, Will Smart, Brandon Whitehead. Centre for eResearch, University of Auckland, New Zealand. BeSTGRID.


  1. Providing useful perspectives onto massive digital collections. Mark Gahegan, Tawan Banchuen, Will Smart, Brandon Whitehead. Centre for eResearch, University of Auckland, New Zealand. BeSTGRID

  2. There are 2 rules for success in life:
     • Never share everything you know
     There is always missing information/knowledge…

  3. Vannevar Bush, As We May Think (1945): “There is a growing mountain of research. But there is increased evidence that we are being bogged down today as specialization extends. The investigator is staggered by the findings and conclusions of thousands of other workers - conclusions which he cannot find time to grasp, much less to remember, as they appear. Professionally our methods of transmitting and reviewing the results of research are generations old and by now are totally inadequate for their purpose…” “…A record, if it is to be useful to science, must be continuously extended, it must be stored, and above all it must be consulted.”

  4. The knowledge explosion: Geological Research. Source: Sarah E. Fratesi (2008), “Scientific Journals as Fossil Traces of Sweeping Change in the Structure and Practice of Modern Geology”, Journal of Research Practice, Volume 4, Issue 1, Article M1.

  5. The Pain
     • Digital collections are growing at a geometric rate…
     • How much useful information is never used because it cannot be found?
     • How much time & money is lost trying to interpret (or re-interpret) acquired data?
     • How much effort & uncertainty is involved in trying to understand the work of others?

  6. Our systems compartmentalise our understanding: this is a very bad thing
     Our systems:
     • Databases
     • Analysis & simulation tools
     • Document repositories (e.g. articles, books, theses)
     • Ontologies
     • Visualisations
     • Organisation charts
     • Calendars
     • Wikis / Blogs / email
     Meaning resides across ALL of these places, but it is difficult to extract and connect

  7. All the resources in the collections housed in the GEON cyber-infrastructure

  8. Things to bear in mind (drivers)…
     • The sheer volume of resources in digital collections.
     • The dynamic nature of the collection catalogs in an e-Infrastructure.
     • The need to support multiple search strategies to find useful resources.
     • The need to help explain what resources mean, or to contextualise them in some way, so they can be used appropriately.
     • The need to capture new connections and evolving understanding.

  9. A knowledge gateway to an eResearch community: examples from geoscience

  10. Conceptual Universe of GEON: now organised by themes

  11. Navigating through the themes

  12. GEON: Institutions, Personnel, PIs, Co-PIs, grad students

  13. What has person A contributed? (Kai Lin: GEON researcher)

  14. Complete conceptual neighbourhood of a resource (an article in this case)

  15. Perspectives as filters Perspectives filter an information space according to particular situations. Perspectives A and B preferentially select different types of resources and relations; the ability to view perspectives can show how someone else made sense of a given set of resources.
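A minimal sketch of the filtering idea on slide 15, assuming a perspective can be modelled as a set of selected resource types plus a set of selected relation types. The `Perspective` class, the type names, and the toy resources are all hypothetical and are not taken from the GEON system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Perspective:
    """A named filter over resource types and relation types (hypothetical structure)."""
    name: str
    resource_types: frozenset
    relation_types: frozenset

    def apply(self, types, edges):
        """Keep edges whose relation is selected and whose two endpoints both have a selected type."""
        return [
            (s, rel, o)
            for (s, rel, o) in edges
            if rel in self.relation_types
            and types.get(s) in self.resource_types
            and types.get(o) in self.resource_types
        ]


# Toy information space (all names are illustrative assumptions).
types = {
    "person_a": "Person",
    "article_1": "Article",
    "dataset_1": "Dataset",
    "tectonics": "Theme",
}
edges = [
    ("person_a", "created", "article_1"),
    ("article_1", "about", "tectonics"),
    ("person_a", "used", "dataset_1"),
]

# Two perspectives that preferentially select different resources and relations.
perspective_a = Perspective("authorship", frozenset({"Person", "Article"}), frozenset({"created"}))
perspective_b = Perspective("topics", frozenset({"Article", "Theme"}), frozenset({"about"}))

view_a = perspective_a.apply(types, edges)
view_b = perspective_b.apply(types, edges)
```

Applying both perspectives to the same information space gives two different, smaller graphs, which is the behaviour the slide describes for perspectives A and B.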

  16. Who used a particular article? (its user community)

  17. Multiple perspectives: What did A create that B used?
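One way to read the question on slide 17 is as the intersection of two perspectives: the resources A created and the resources B used. The sketch below assumes relations named `created` and `used` and a flat list of (subject, relation, object) edges; these names and the toy data are hypothetical.

```python
def created_by_and_used_by(edges, creator, user):
    """Return the resources that `creator` created and `user` used."""
    created = {o for s, rel, o in edges if s == creator and rel == "created"}
    used = {o for s, rel, o in edges if s == user and rel == "used"}
    return created & used


# Toy usage data (illustrative only).
edges = [
    ("person_a", "created", "article_1"),
    ("person_a", "created", "dataset_1"),
    ("person_b", "used", "article_1"),
    ("person_b", "used", "dataset_2"),
    ("person_b", "created", "dataset_2"),
]

shared = created_by_and_used_by(edges, "person_a", "person_b")
```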

  18. Perspectives (1) as SPARQL Queries
     Consider the following SPARQL query:
       PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
       PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
       PREFIX theme: <http://www.geovista.psu.edu/cV/themes.owl#>
       SELECT ?x WHERE { ?x rdf:type rdfs:Class . ?x rdfs:subClassOf theme:Geosciences }
     Here “rdf” and “rdfs” are namespace prefixes for W3C’s RDF and RDFS languages, and “theme” is the namespace prefix for an ontology that defines subjects of interest within the geosciences. Executing this query on the ontology returns all the resources that are classes and sub-classes of the resource “theme:Geosciences”.
     A global perspective based on the query is defined precisely as follows. Let T denote the original ontology (a set of RDF statements of the form [subject, predicate, object]), Q an RDF query, S(Q) the set of subject constants in Q (empty in the above example, since all subjects are variables), P(Q) the set of predicate constants in Q (including “rdf:type” and “rdfs:subClassOf” in the example), and O(Q) the set of object constants in Q (including “rdfs:Class” and “theme:Geosciences” in the example). If R denotes the set of resources returned by the query, then a global perspective is a set of RDF statements, denoted PS, in which every statement [subject, predicate, object] satisfies the following conditions:
     • [subject, predicate, object] ∈ T;
     • if predicate ∈ P(Q), then subject ∈ R ∪ S(Q) and object ∈ R ∪ O(Q);
     • if predicate ∉ P(Q), then subject ∈ R and object ∈ R.
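The definition on slide 18 can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' implementation: it takes the global perspective to be the largest subset of T that meets the conditions, it represents statements as tuples of prefixed names, and the toy ontology (including the `theme:relatedTo` predicate and the individual theme names) is invented for the example. The result set R is written out by hand rather than obtained from a SPARQL engine.

```python
def global_perspective(T, R, S_Q, P_Q, O_Q):
    """Return the statements of T that satisfy the global perspective conditions.

    T   : set of (subject, predicate, object) statements
    R   : set of resources returned by the query
    S_Q : subject constants of the query
    P_Q : predicate constants of the query
    O_Q : object constants of the query
    """
    PS = set()
    for s, p, o in T:
        if p in P_Q:
            # Query predicates may also link to the query's own constants.
            keep = (s in R or s in S_Q) and (o in R or o in O_Q)
        else:
            # Any other predicate must connect two returned resources.
            keep = s in R and o in R
        if keep:
            PS.add((s, p, o))
    return PS


# Toy ontology (hypothetical statements).
T = {
    ("theme:Seismology", "rdf:type", "rdfs:Class"),
    ("theme:Seismology", "rdfs:subClassOf", "theme:Geosciences"),
    ("theme:Geology", "rdf:type", "rdfs:Class"),
    ("theme:Geology", "rdfs:subClassOf", "theme:Geosciences"),
    ("theme:Seismology", "theme:relatedTo", "theme:Geology"),
    ("theme:Biology", "rdf:type", "rdfs:Class"),
    ("theme:Biology", "theme:relatedTo", "theme:Geology"),
}

# Constants of the example query and the result it would return on T.
S_Q = set()
P_Q = {"rdf:type", "rdfs:subClassOf"}
O_Q = {"rdfs:Class", "theme:Geosciences"}
R = {"theme:Seismology", "theme:Geology"}

PS = global_perspective(T, R, S_Q, P_Q, O_Q)
```

In this toy case the two statements about `theme:Biology` are dropped, because `theme:Biology` is neither in R nor a constant of the query, while the `theme:relatedTo` link between the two returned classes is kept.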

  19. Perspectives (2): truncated concepts become properties [Figure: graphs of nodes numbered 1 to 16, with labels A and B]

  20. Perspectives (3): different facets of a concept are ‘externalised’ [Figure: an image resource shown in two views, linked by “Fold in” and “Fold out”, with nodes labelled user A, user B, user C, country, and content (m, n, o, p)]
     Properties: Date: ddmmyyyy; Scale: 1:xxxxxxxx; Country: “………..”; Content: (m, n, o, p)
     Properties: Date: ddmmyyyy; User: (A, B, C); Scale: 1:xxxxxxxx
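A rough sketch of the fold in / fold out idea from slides 19 and 20, assuming a resource carries a dictionary of properties and a list of (subject, relation, object) edges to neighbouring concepts. The function names and data layout are hypothetical; the placeholder values are the ones shown on the slide.

```python
def fold_in(properties, edges, node, relation):
    """Collapse every `relation` edge leaving `node` into a list-valued property."""
    folded = [o for s, rel, o in edges if s == node and rel == relation]
    remaining = [(s, rel, o) for s, rel, o in edges if not (s == node and rel == relation)]
    return {**properties, relation: folded}, remaining


def fold_out(properties, edges, node, key):
    """Externalise the list-valued property `key` as edges from `node`."""
    values = properties.get(key, [])
    remaining_props = {k: v for k, v in properties.items() if k != key}
    return remaining_props, edges + [(node, key, v) for v in values]


# View with users as separate nodes and content as a property.
props = {"date": "ddmmyyyy", "scale": "1:xxxxxxxx", "content": ["m", "n", "o", "p"]}
edges = [("image", "user", "A"), ("image", "user", "B"), ("image", "user", "C")]

# Fold the users in, then fold the content out.
props_in, edges_in = fold_in(props, edges, "image", "user")
props_out, edges_out = fold_out(props_in, edges_in, "image", "content")
```

After both steps the users have become a property of the image and the content items have become separate nodes, which mirrors the second property list on the slide.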

  21. Which topics are closely related?

  22. Authors folded into themes, Themes connected together by author properties
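One possible reading of slide 22 is that two themes are connected when they share authors, with the number of shared authors acting as a rough signal of how closely the topics are related. The sketch below assumes authorship is available as (author, theme) pairs; the function name, the weighting by shared-author count, and the toy data are all assumptions.

```python
from collections import defaultdict
from itertools import combinations


def theme_links(authorship):
    """Map each pair of themes to the number of authors they share (pairs with none are omitted)."""
    authors_by_theme = defaultdict(set)
    for author, theme in authorship:
        authors_by_theme[theme].add(author)  # fold the author into the theme

    links = {}
    for t1, t2 in combinations(sorted(authors_by_theme), 2):
        shared = authors_by_theme[t1] & authors_by_theme[t2]
        if shared:
            links[(t1, t2)] = len(shared)
    return links


# Toy authorship records (illustrative only).
authorship = [
    ("author_1", "seismology"),
    ("author_1", "tectonics"),
    ("author_2", "tectonics"),
    ("author_2", "seismology"),
    ("author_3", "hydrology"),
    ("author_3", "tectonics"),
]

links = theme_links(authorship)
```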

  23. Intersecting research interests of a science community

  24. Conclusions
     • Organising resources according to different knowledge facets holds considerable promise
     • We may not be able to capture what things mean directly, but we can provide some signifiers (clues)
     • Usage data can provide very strong signifiers
     • Intuitive navigation metaphors are needed
     • Strong filters (perspectives) are needed to avoid overcrowding & confusion, and to better match the users’ conceptual models
     • We need strong identifiers for digital resources, so we can find references to them from many systems
     • This facilitates value-added services (semantic web-search, maps, use cases, provenance, text descriptions, markup tools, etc.)
