



Using Pivots to Explore Heterogeneous Collections

A Case Study in Musicology

Daniel Alexander Smith

8 December 2009


musicSpace

  • IAM Group, School of Electronics and Computer Science

  • Music, School of Humanities


Outline

  • How musicologists use data

  • Limitations of existing approaches

  • Our data extraction and integration methodology

  • Interface walkthrough

musicSpace Tasks

  • Triage data partners' sources

  • Extract information

  • Map data sources to schemas/ontologies

  • Produce interface over aggregated data

  • Customise interface based on feedback

Intractable research questions

  • Which scribes have created manuscripts of a composer’s works, and which other composers’ works have they inscribed?

  • Which poets have had their poems set to music by Schubert, which of these musical settings were only published posthumously, and where can I find recordings of them?

  • Which electroacoustic works were published within five years of their premiere?
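
Once premiere and publication dates from different sources sit in one record, the third question reduces to a date comparison. The following is a minimal Python sketch of that idea, not the musicSpace implementation; the records, titles, and field names (`premiere`, `published`) are hypothetical placeholders.

```python
from datetime import date

# Hypothetical integrated records; titles and dates are placeholders, not real data.
works = [
    {"title": "Work A", "premiere": date(1970, 5, 1), "published": date(1973, 2, 1)},
    {"title": "Work B", "premiere": date(1980, 3, 10), "published": date(1990, 1, 1)},
    {"title": "Work C", "premiere": date(1995, 6, 15), "published": None},  # never published
]

def published_within(works, years):
    """Return titles published on or after the premiere and at most `years` years later."""
    hits = []
    for w in works:
        prem, pub = w["premiere"], w["published"]
        if pub is None or pub < prem:
            continue
        # Shift the publication date back by `years` and compare as tuples,
        # which avoids constructing an invalid date such as 29 February.
        if (pub.year - years, pub.month, pub.day) <= (prem.year, prem.month, prem.day):
            hits.append(w["title"])
    return hits

print(published_within(works, 5))
```

The comparison is only possible because both dates are present in the same record, which is the point of integrating the sources first.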

Why they are intractable (1)

  • Need to consult several sources

  • Metadata from one source cannot be used to guide searches of another source

  • Solution: Integrate sources

Why they are intractable (2)

  • They are multi-part queries, and need to be broken down with results collated manually

  • Requires pen and paper!

  • Solution: Optimally interactive UI

Why they are intractable (3)

  • Insufficient granularity of metadata and/or search options

  • Solution: Increase granularity

Previous work

  • Comb-e-Chem modelled chemistry data

  • We use a similar approach

  • Translated this work to the arts

  • Musicology modelled using Semantic Web technologies

Musicology Data Sources

  • Disparate data

  • How to pull them together and view on demand

Data and Info Management problems

  • Sources allow searching, but not over everything

  • Data export (typically MARC) reveals extra fields hidden amongst the metadata, e.g. characters in an opera, document types

  • Sometimes viewable on original site, but not searchable

  • Offering the extracted metadata is already a benefit, even with one source

Grove Extraction Example

  • More complicated, as Grove is a full-text encyclopaedia

  • Some digitisation via Grove Music Online

  • Weak semantic metadata extraction

  • Thus we performed some data entry


  • Domain Expert + Technologist partnership

  • This will be the case for some time to come

  • Technology should automate tasks as far as possible, making the domain expert’s job less onerous

Metadata mapping

  • Domain experts devise single schema

  • Provide mappings of fields in a particular data source to that unified schema

  • Enables an interface across all sources
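
The mapping step above can be sketched as a per-source lookup table that renames native fields into the unified schema. This is an illustrative Python sketch only; the source names, field names, and records are hypothetical and do not come from the project.

```python
# Hypothetical mappings from each source's native field names to one unified schema.
MAPPINGS = {
    "source_a": {"composer_name": "composer", "work_title": "title"},
    "source_b": {"creator": "composer", "name": "title", "scribe": "copyist"},
}

def to_unified(source, record):
    """Rename a record's fields via its source's mapping; unmapped fields are dropped."""
    mapping = MAPPINGS[source]
    return {mapping[key]: value for key, value in record.items() if key in mapping}

unified = [
    to_unified("source_a", {"composer_name": "Composer X", "work_title": "Work A"}),
    to_unified("source_b", {"creator": "Composer X", "name": "Work A",
                            "scribe": "Copyist P", "shelfmark": "MS 1"}),
]

for record in unified:
    print(record)
```

Note that the hypothetical `shelfmark` field disappears in the second record because the unified schema has no slot for it; recovering it would mean revising the schema and the mappings.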


  • New source comes online with information not covered by unified schema

  • Have to make changes to all mappings to ensure accurate coverage

New Approach: Pivoting

  • Marking up a single source, versus pushing all to a single schema

  • Use a pivot instead to situate metadata for integration

  • Essentially means that the interface does the heavy lifting of integration

  • Reduced effort by domain experts
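
One possible reading of the pivot idea, sketched in Python: each source keeps its own field names, the only markup per source is a declaration of which field holds the pivot value, and the interface groups records by that value at query time. The slides do not specify the implementation, so the source names, fields, and the `pivot_index` helper are assumptions made for illustration.

```python
from collections import defaultdict

# Each hypothetical source keeps its own schema; nothing is renamed.
SOURCES = {
    "manuscripts": [{"creator": "Composer X", "scribe": "Copyist P"}],
    "recordings": [{"composer": "Composer X", "label": "Label L"}],
}

# The only per-source markup: which native field carries the pivot value.
PIVOT_FIELD = {"manuscripts": "creator", "recordings": "composer"}

def pivot_index(sources, pivot_field):
    """Group records from every source under their pivot value."""
    index = defaultdict(list)
    for name, records in sources.items():
        for record in records:
            index[record[pivot_field[name]]].append((name, record))
    return index

index = pivot_index(SOURCES, PIVOT_FIELD)
for source, record in index["Composer X"]:
    print(source, record)
```

Adding a new source in this sketch means adding one entry to `PIVOT_FIELD`; existing sources are not touched.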

Interface Video

  • Find a composer

  • See all copyists of their manuscripts

  • Choose a copyist and see which other composers’ works that copyist has copied
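
The three steps of the walkthrough correspond to two chained lookups over manuscript records. A minimal Python sketch, assuming manuscripts reduced to hypothetical (composer, copyist) pairs; the names are placeholders and the functions are not part of the musicSpace interface.

```python
# Hypothetical manuscript records as (composer, copyist) pairs.
MANUSCRIPTS = [
    ("Composer X", "Copyist P"),
    ("Composer X", "Copyist Q"),
    ("Composer Y", "Copyist P"),
    ("Composer Z", "Copyist Q"),
]

def copyists_of(composer):
    """Step 2: all copyists who produced manuscripts of this composer's works."""
    return sorted({copyist for comp, copyist in MANUSCRIPTS if comp == composer})

def other_composers(copyist, exclude):
    """Step 3: pivot on a copyist to find the other composers they copied."""
    return sorted({comp for comp, cop in MANUSCRIPTS
                   if cop == copyist and comp != exclude})

print(copyists_of("Composer X"))
print(other_composers("Copyist P", exclude="Composer X"))
```

Done by hand, the second lookup is the part that would otherwise be collated with pen and paper, once per copyist.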

Thank you

http://ecs.soton.ac.uk/projects/musicspace

[email protected]