NCDC Station Metadata System

Jeff Arnfield

Active Archive Branch

National Climatic Data Center

Asheville, NC


NCDC’s Role

  • Nation’s focal point and scorekeeper for information about weather and climate variations and changes

  • Currently ingest and archive 150 terabytes (150,000 gigabytes) of data every year

  • Maintain the metadata necessary to interpret these data

NCDC - scorekeeper for the nation's climate


Metadata: Data About Data

  • Information necessary to describe and interpret collections of data and the observing systems that report them

    • Station location

    • Station configuration

    • Observing standards

    • Reporting protocols

    • Data inventories

    • Data product documentation

    • History of changes over time


Current Observing Systems

  • NWS cooperative observing system

  • National surface observing system

  • Upper air observing system

  • Climate Reference Network (CRN)

  • Marine observing system

  • Profiler

  • Precipitation observing network

  • Radar

  • Global Climate Observing System

  • Satellite systems

  • Special national/international experiments


Why Is Metadata Important?

  • Critical to NCDC's ingest, archive and access systems

  • Gives data users perspective on reported data values

    • Station moves & changes to surroundings

    • Sensor changes

    • Quality control algorithms

  • Helps in selection of stations, data products for study
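
One way to picture how change history gives perspective on a reported value is a lookup of the configuration in effect on the observation date. The following is a minimal sketch, not NCDC's implementation; the `StationChange` type, the `config_in_effect` function, and the sample history are all hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class StationChange:
    effective: date    # first day the change applies
    description: str   # e.g. a station move or a sensor change


# Hypothetical history for one station, sorted by effective date.
HISTORY = [
    StationChange(date(1950, 1, 1), "station established"),
    StationChange(date(1978, 6, 15), "station moved"),
    StationChange(date(1996, 3, 1), "temperature sensor replaced"),
]


def config_in_effect(history, when):
    """Return the latest change on or before `when`, or None if none applies."""
    dates = [change.effective for change in history]
    index = bisect_right(dates, when)
    return history[index - 1] if index else None
```

A user comparing values from 1975 and 1980 could call `config_in_effect` for each date and see that a station move falls between them.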


The Rest Is Just Metadata

42


2000: Need and Opportunity Meet

  • Reviewed metadata holdings and systems

    • Strengths

    • Shortcomings

    • Opportunities for improvement

  • Climate Database Modernization Program (CDMP)

    • Partnership with private industry

    • Increase digital data holdings

    • Improve database quality

    • Improve access to and utilization of data


Reality Can Be Ugly


Initial Situation

  • Metadata distributed in a combination of formal and ad hoc systems

  • Different systems may contain information about the same station

  • Multiple sources and procedures for the same metadata result in discrepancies

  • Data freshness and accuracy vary

  • Updates may not affect all similar data
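
The discrepancy problem described above can be illustrated with a short sketch that groups values by station and field, then reports any field on which the sources disagree. The source names, station ID, and field names are hypothetical, and this is only one possible way to surface such conflicts.

```python
from collections import defaultdict


def find_discrepancies(records):
    """Find fields where sources disagree about the same station.

    `records` is an iterable of (source, station_id, field, value) tuples.
    Returns {(station_id, field): {source: value}} for conflicting fields only.
    """
    seen = defaultdict(dict)
    for source, station_id, field, value in records:
        seen[(station_id, field)][source] = value
    return {
        key: by_source
        for key, by_source in seen.items()
        if len(set(by_source.values())) > 1
    }


# Hypothetical records from two systems describing the same station.
SAMPLE = [
    ("system_a", "000001", "elevation_m", 650),
    ("system_b", "000001", "elevation_m", 655),
    ("system_a", "000001", "name", "EXAMPLE STATION"),
    ("system_b", "000001", "name", "EXAMPLE STATION"),
]
```

Calling `find_discrepancies(SAMPLE)` flags the elevation, where the two systems differ, and leaves the matching name out of the result.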


The Problem of Scope

  • The human mind is so complex and things are so tangled up with each other that, to explain a blade of straw, one would have to take to pieces an entire universe. . . . A definition is a sack of flour compressed into a thimble.

    • Rémy de Gourmont (1858–1915)


Existing Station History Systems

  • SHIPS – Station History Information Production System

    • Highly detailed, with heavy quality control

    • Coop, ASOS and some surface weather observing sites

    • UNIX on old Sun workstation

    • Empress database

  • Other ad hoc and project-specific systems

    • Database

    • Flat text files, word processing files

    • Many paper records

    • Some are essentially static

  • Access via lists, Cliserv and Web CliServ


Shortcomings of the SHIPS System

  • Developed to meet Coop data ingest and publication needs

  • Database design not normalized

  • New networks may require structure changes

  • Lack of keys, data integrity checks

  • Cumbersome interface with limited queries

  • No “query only” option, no outside access

  • Ad hoc queries are complex


Challenges

  • Technical, cultural and logistical

  • Metadata conflicts and inconsistencies

  • Complex table key, versioning, attribution

  • Security and audit requirements

  • Informal knowledge base, imprecise terms

  • Loose system documentation

  • Geographically dispersed team

  • Resource competition


Metadata Project Goals

  • Strategic architecture to manage metadata

  • Leverage CDMP project tasks and resources

  • Accommodate imperfect, real world metadata

  • Accept new information without modification

  • Flexible queries for dispersed users

  • Modular for multi-organization participation

  • Deliver useful releases within a year


Technological Foundation

  • Normalized relational database

  • Oracle 8i database, CASE design tools

  • Model entire subject, not one instance

  • Surrogate keys minimize dependencies

  • Enforce business rules in database

    • Declarative, triggers, stored procedures

  • Accept flawed data, identify and correct

  • Separate database, application servers
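
To make the ideas of surrogate keys and database-enforced business rules concrete, here is a small sketch using Python's built-in `sqlite3` module as a stand-in for the Oracle database named above. The table names, column names, and rules are assumptions chosen for illustration and are not taken from the actual NCDC design. The sketch covers only the rule-enforcement side, not the handling of flawed data.

```python
import sqlite3

# Hypothetical schema: every name below is illustrative only.
SCHEMA = """
CREATE TABLE station (
    station_key INTEGER PRIMARY KEY,   -- surrogate key, no business meaning
    name        TEXT NOT NULL
);
CREATE TABLE station_identifier (
    station_key INTEGER NOT NULL REFERENCES station(station_key),
    id_type     TEXT NOT NULL,         -- which network issued the ID
    id_value    TEXT NOT NULL,
    UNIQUE (id_type, id_value)         -- declarative rule
);
CREATE TABLE station_location (
    station_key INTEGER NOT NULL REFERENCES station(station_key),
    latitude    REAL NOT NULL CHECK (latitude BETWEEN -90 AND 90),
    longitude   REAL NOT NULL CHECK (longitude BETWEEN -180 AND 180),
    begin_date  TEXT NOT NULL,         -- ISO 8601 dates sort as text
    end_date    TEXT                   -- NULL means still in effect
);
-- Rule enforced by a trigger rather than a declarative constraint.
CREATE TRIGGER location_dates_ordered
BEFORE INSERT ON station_location
WHEN NEW.end_date IS NOT NULL AND NEW.end_date < NEW.begin_date
BEGIN
    SELECT RAISE(ABORT, 'end_date precedes begin_date');
END;
"""


def open_db():
    """Create an in-memory database with the sketch schema loaded."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(SCHEMA)
    return conn
```

Because identifiers live in their own table and refer to the station only through `station_key`, an ID can change or a new network can be added without touching the rows that describe the station itself.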


Technological Foundation (Cont’d)

  • Similar query needs for research and maintenance

  • Web-based solution

    • Distributed access

    • Easy administration, maintenance

    • Standard interface minimizes training

  • ColdFusion web-based environment

  • Crystal Reports for flexible output


Station History Subject Areas

  • Identity

    • Names

    • IDs

    • Period of record

  • Location

    • Lat/Lon, elevation

    • Geographic descriptors

    • Exposure, topography

  • Classification

  • Observers

  • Equipment

  • Observing Practices

    • Phenomena

    • Schedule

    • Reporting protocols

  • Data programs

  • Administration

  • Supporting documents

    • Forms

    • Photos
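
As a rough illustration of the Identity subject area, the sketch below models a station whose identifiers each carry their own period of record, so that the IDs valid on a given date can be listed. All class names, field names, and ID types are hypothetical and do not come from the NCDC design.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Period:
    begin: date
    end: Optional[date] = None  # None means the period is still open

    def contains(self, when: date) -> bool:
        return self.begin <= when and (self.end is None or when <= self.end)


@dataclass(frozen=True)
class Identifier:
    id_type: str    # issuing network, e.g. "COOP"
    value: str
    period: Period  # period of record for this identifier


@dataclass
class Station:
    name: str
    identifiers: List[Identifier] = field(default_factory=list)

    def ids_on(self, when: date) -> List[str]:
        """List the identifiers that were valid on `when`."""
        return [
            f"{i.id_type}:{i.value}"
            for i in self.identifiers
            if i.period.contains(when)
        ]
```

The same pattern of attaching a period to each fact could extend to names, locations, and equipment, which is one way to keep a history of changes over time.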


Current Status

  • Modeling and requirements workshops held

  • Contractors familiarized with subject

  • Hardware, development software installed

  • SHIPS ported to Oracle, Web-accessible

  • Anomalies being identified and corrected

  • First cut database design completed

  • Interface prototyping in progress

  • Testing NWS Coop metadata acquisition


2001: A Metadata Odyssey

  • Design document complete – 1st Qtr

  • Physical DB design complete – 1st Qtr

  • Initial release of new system – 2nd Qtr

  • Automated Coop QC, ingest – 3rd Qtr

  • Access to document images – 3rd Qtr

  • Intensive manual QC, updates – 4th Qtr

  • Merge pre-1948, pre-1890 Coop – 4th Qtr
