Discovering & Dealing with Data

Discovering & Dealing with Data. Presented by Kimberly Silk, MLS, Data Librarian, Martin Prosperity Institute, University of Toronto. Agenda: the MPI information environment; common data sources & authority; data management, discovery and access; What is Open Data? Big Data?

Presentation Transcript
Discovering & Dealing with Data

Presented by

Kimberly Silk, MLS, Data Librarian, Martin Prosperity Institute, University of Toronto

Agenda
  • The MPI information environment
  • Common data sources & authority
  • Data management, discovery and access
  • What is Open Data? Big Data?
  • Fun with data visualization
  • Q & A
About the MPI
  • The Martin Prosperity Institute is an economic think tank; we are part of the Rotman School within the University of Toronto
  • My client group consists of grad students, post-docs, visiting faculty and researchers who use social-science data to support their research
  • To support their research process, I procure, curate, preserve and make discoverable data sets.
  • The MPI has its own data repository, which has grown to 4 TB in size.
Data Sources
  • Common & very authoritative sources
    • StatsCan via the Data Liberation Initiative
    • Bureau of Labor Statistics, Bureau of Economic Analysis, American FactFinder (Census)
    • OECD eLibrary
    • World Bank
    • Int’l sources such as UK Data Archive, Swedish National Data Service, etc.
    • Pew Research Center
    • Gallup
More data sources
  • Less authoritative??
    • Chinese Data Center
    • Rolling Stone
    • MySpace
    • CrunchBase
Data Challenge: Discovery

Lots of research data being collected and added, but no method to manage it, catalogue it, or make it findable

Demands from various clients: faculty, students, researchers, staff, administration

The shared network drive was no longer effective

Show & Share…
  • We want the world to see our data catalogue
  • But, we don’t want the world to be able to copy or change what’s in the catalogue, or the catalogue itself
  • We need to manage access to our data: Who are you? Where are you from? Why do you want the data? What are you going to do with it? Will you share your results?
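The access questions above could be captured as a structured request record. The following is a minimal sketch in Python, not the MPI's actual lending agreement: the `AccessRequest` class, its field names, and the `missing_answers` helper are all hypothetical and exist only to illustrate the idea.

```python
from dataclasses import dataclass, fields


@dataclass
class AccessRequest:
    """Hypothetical request form mirroring the questions on the slide."""
    name: str                 # Who are you?
    affiliation: str          # Where are you from?
    purpose: str              # Why do you want the data?
    intended_use: str         # What are you going to do with it?
    will_share_results: bool  # Will you share your results?


def missing_answers(request: AccessRequest) -> list[str]:
    """Return the names of the text fields that were left blank."""
    return [
        f.name
        for f in fields(request)
        if isinstance(getattr(request, f.name), str)
        and not getattr(request, f.name).strip()
    ]


# Illustrative request with one unanswered question.
request = AccessRequest(
    name="A. Researcher",
    affiliation="Example University",
    purpose="",
    intended_use="Regional labour-market analysis",
    will_share_results=True,
)
print(missing_answers(request))
```

A data librarian could review such a record before granting access; the decision itself stays with a person, not the code.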
Data Discovery Platforms
  • I reviewed several platforms that would work in an academic environment:
    • Nesstar – developed in Norway by Norwegian Social Science Data Services, used by StatsCan, UK Data Archive, NORC at UChicago
    • Islandora – open source system developed at UPEI, based on Fedora
    • ODESI – proprietary system developed and used by Scholars Portal
    • Dataverse – Open source system developed by the Institute for Quantitative Social Science at Harvard, used by NBER, and many academic think tanks.
  • Dataverse was a good choice since we could install an instance at UToronto, in the UToronto cloud, and I could manage it myself
  • It was free, and my colleagues at Scholars Portal were interested in installing it, so I was the perfect guinea pig
  • Slowly, I am cataloguing my data collection; I have set up a lending agreement, and it’s working very well.
  • Demo:
Open Data
  • Open data is the idea that certain data should be freely available for everyone to use, reuse, and redistribute without restriction.
  • Governments around the world have begun to “open up” some of their data: US, UK, New Zealand, Norway, Russia, Australia, Morocco, Netherlands, Chile, Spain, Uruguay, France, Brazil, Estonia, Portugal, etc.
  • State and municipal levels of government have also created open data sites.
Open Data Opportunities…
  • Governments open up their data to foster better citizenship and improve transparency
  • Open Data can spur grass-roots innovation: citizens use open data in software programs that solve problems, such as finding a local daycare, knowing when the next bus will come, reporting crime on the fly, or watching congressional proceedings in real time.
… and Challenges
  • Open Data takes commitment. Successful implementations have a dedicated team of people who decide what data to release according to usefulness and demand.
  • The data must be anonymized, cleansed and released in a non-proprietary format.
  • Organizations must be prepared to listen to citizens, be responsive, and troubleshoot.
  • Open data is a public service.
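The release requirements above (anonymized, cleansed, non-proprietary format) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a complete anonymization method: the column names in `IDENTIFYING_COLUMNS` and the sample rows are hypothetical, and dropping direct identifiers is only a first step, since combinations of the remaining fields can still identify people.

```python
import csv
import io

# Hypothetical column names; which fields identify a person depends on the data set.
IDENTIFYING_COLUMNS = {"name", "email"}


def prepare_for_release(rows):
    """Drop identifying columns, trim whitespace, and skip empty records."""
    cleaned = []
    for row in rows:
        kept = {
            key: value.strip()
            for key, value in row.items()
            if key not in IDENTIFYING_COLUMNS
        }
        if any(kept.values()):
            cleaned.append(kept)
    return cleaned


def to_csv(rows):
    """Serialize rows as CSV text, a plain non-proprietary format."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample records for illustration only.
raw = [
    {"name": "Jane Doe", "email": "jane@example.org", "ward": " 12 ", "service": "daycare "},
    {"name": "", "email": "", "ward": "", "service": ""},
]
released = prepare_for_release(raw)
print(to_csv(released))
```

CSV is used here because any tool can read it; other open formats would serve the same purpose.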
Big Data
  • Big Data is a collection of data sets that is too large for the average database management tool (Access and Excel, for instance).
  • Examples come from meteorology, genomics and physics. At MPI we wrestle with large GIS data sets (maps and satellite data), and deal with data at the terabyte (1 trillion bytes) level.
  • Larger data sets deal with petabytes (1 quadrillion bytes) and exabytes (1 quintillion bytes).
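To put these units side by side, the short Python sketch below expresses the 4 TB MPI repository mentioned earlier in each unit. It assumes the decimal definitions given on the slide (powers of 1,000); the variable names are illustrative.

```python
# Decimal (SI) units, matching the definitions on the slide.
UNITS = {
    "terabyte": 10**12,  # 1 trillion bytes
    "petabyte": 10**15,  # 1 quadrillion bytes
    "exabyte": 10**18,   # 1 quintillion bytes
}

# The MPI repository size stated earlier in the presentation.
repository_bytes = 4 * UNITS["terabyte"]

# Express the same repository in each unit.
for unit, size in UNITS.items():
    print(f"{repository_bytes / size:g} {unit}s")
```

Each step up the scale is a factor of 1,000, so a 4 TB repository is a small fraction of a single petabyte.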
Data Visualizations
  • The visual representation of data: literally, a picture can say a thousand [numbers]
  • Edward Tufte is a key pioneer:
  • Fantastic examples at Flowing Data:
  • RSA Animate:
Q & A (and Thank You!)

Kimberly Silk, MLS, Data Librarian, Martin Prosperity Institute, University of