Methods for Data Discovery – Portals

A portal facilitates access to, and assimilation of, data. A portal is not simply a web site: it offers services such as data reformatting, subsetting, brokering, etc.

marly

Presentation Transcript


  1. Methods for Data Discovery – Portals
     • Portal facilitates access to and also assimilation of data
     • Portal is not simply a web site: it offers services such as data reformatting, subsetting, brokering, etc.
     • Portal is not just a collection of information and links: portal takes you elsewhere through a service
     • Portal answers questions: abstracts data or does simple analysis
     • Identify phases:
       • Phase 1: need a simple presence (web page) to start: avoid initial overreaching
       • Could be multiple portals/interfaces
     • Define discovery
       • Identifying what you know you want
       • Also, importantly, “accidental” discoveries that derive from the broad scope of disciplines and nations
     • PIs want “definitive datasets”: vetted for quality, coverage, etc.
     • Metadata is key
       • In US, 10% of all IT spending is for metadata generation
       • 85% of data is unstructured
     • Need a new means (other than a list returned from a search) to present the data to the users
     • Vetted datasets
       • Desired and useful
       • Danger of cliques taking control
       • Root of ‘vet’ also leads to ‘veto’; overreaching?
     • A desired interface: a list that is classified and aggregated
     • Who are the users? Don’t forget education and outreach community

  2. Methods for Data Discovery – Portals
     • IPY legacy:
       • Need long term stewardship of metadata and data
     • Define audiences: scientists and public
       • Public needs access to information products
     • Phase 0: list of datasets and datacenters
     • Phase 1: metadata for datasets
     • Phase 2: publications
     • Phase 3: services: visualizations
     • Start with a single data center (?) NSIDC?
     • Stages:
       • 1. IPY project honeycomb charts: identify sources of data
         • Done by 2007
         • Science base
         • Dataflows:
           • Regional focus, discipline focus which point to archive or individuals
       • 2. Complementary Portals (links)
       • 3. Services that allow discovery (esp. databases) of unexpected connections
         • Search – access
         • Interactive – community tools
         • Visualization
         • Integrative

  3. Methods for Data Discovery – Portals
     • Portal must be accessible through search engines (Google)
       • Alignment of commercial interests with IPY
       • GoogleBase as a metadata service
     • Target audiences: scientists and education and outreach
     • Also recognize that
       • We are not designing a portal; we are actually designing a process
       • Portal captures user interaction and uses this to enhance future use (e.g. Amazon)
       • Need to address ontology, metadata design, data collection design early in the process; counterpoint: we don’t have enough a priori information to design
       • Data managers come up with good plans, but implementation is spotty, unless compelled
     • Location is a common element that could tie discovery and integration together
     • Involve projects in classifying the honeycomb and building the initial lists in Stage 1

  4. Methods for Data Discovery – Portals
     • Addendums following group discussion
     • Who is going to do this? (Implementation plan)
       • Agencies
       • National Committees
       • PIs
       • DIS
       • Arctic Council working groups
       • International bodies
       • NGOs
     • Use lessons learned from groups like ice coring, oceanographers, etc. who are already good at sharing data
     • All of this goes into the “funding agency data management letter”; can this be articulated in time?
       • Letter needs to go to agency IPY point of contact.
       • Three questions
         • Who is responsible for IPY?
         • How will info be used?
         • Where will info go? (ipy.org)
     • Create metadata to describe portals
       • AMD is an example for metadata and services descriptions
     • Enable search of portals
       • Annotate with keywords to limit search results
         • Geographic focus
         • Stakeholders
         • Disciplines
     • Create an online mechanism for users to input list of portals and annotate them; that is, put the burden on the community
       • Suggestions: use GCMD and AMD
       • Use this to solicit feedback and ideas that are desired by the user community
