
LTER Information Management Committee Meeting, July 23-25, 2013

Considering best practices in managing sensor data. Don Henshaw H.J. Andrews Experimental Forest. LTER Information Management Committee Meeting, July 23-25, 2013.



Presentation Transcript


  1. Considering best practices in managing sensor data
     Don Henshaw, H.J. Andrews Experimental Forest
     Ecological Information Management (EIM) 2008

  2. Common themes from participating sites
     Joint NERC Environmental Sensor Network/Sensor NIS Workshop, Hubbard Brook Experimental Forest, NH, October 25-27, 2011
     • Greatest needs:
       • Middleware between sensor/data logger and database/applications
       • Programming support
       • Training workshops to disseminate knowledge and solutions
       • Ways to share experiences with useful software and tools
       • Clearinghouse for sharing code and solutions
       • Knowledge base (web page) organized by topic (http://wiki.esipfed.org/index.php/EnviroSensing_Cluster)

  3. Joint NERC Environmental Sensor Network/LTER SensorNIS Workshop, October 25-27th, 2011

  4. ESIP EnviroSensing Cluster: Building a sensor network resource guide through community participation
     • Online resource guide outline:
       • Sensor, site, and platform selection
       • Data acquisition and transmission
       • Sensor management, tracking, and documentation
       • Streaming data management middleware
       • Sensor data quality assurance/quality control (QA/QC)
       • Sensor data archiving
     Software Tools for Sensor Networks, April 23-26, 2013

  5. Sensor, site, and platform selection
     • Problem statement:
       • A vast array of possible sensor/hardware packages exists for multiple science applications
       • PIs, technicians, and specialists must communicate and work together in considering options and planning
       • Deployment may depend on interacting factors, e.g., permitting, geography, access
     • Considerations: seasonal weather patterns, power sources, communications options, land ownership, distance from the managing institution, available personnel/expertise, and potential expansion/future-proofing

  6. Data acquisition and transmission
     • Problem statement: manual downloads of environmental sensor data may not be sufficient to assure data security or data integrity, or to allow direct control of devices
     • Considerations:
       • need for immediate access
       • need for one- or two-way transmission methods
       • bandwidth requirements to transfer the data
       • need for line-of-sight communication or repeaters
       • hardware and network protocols
       • power consumption of the system components
       • physical and network security requirements

  7. Sensor management, tracking, and documentation
     • Problem statement:
       • Documentation of field procedures needs to be sufficient to withstand personnel changes over time
       • Sensor issues and problems need to be communicated quickly among field technicians, lead investigators, and data managers
       • Sensor histories are typically tracked in field notebooks or field check sheets and are essential for internal review of data streams, but are often inaccessible to data handlers
       • Noted field problems may provide insight into quality control issues and data behavior, and should be captured in data qualifier flags

  8. Sensor management, tracking, and documentation
     • Develop protocols for installation, calibration, maintenance, and removal of sensors
     • Track sensor events and history:
       • Record sensor events and failures, deployment information, calibration events, maintenance history, operational dates, etc.
       • Record sensor descriptions, methodology changes, sampling frequency, geo-location, photo points, etc.
     • Documentation:
       • Standardize field notebooks or field checklists
       • Build log files or databases for annotating sensor events, e.g., timestamp (or range), datalogger ID, sensor ID, event category, description, and note taker
     Software Tools for Sensor Networks Training, 1 May 2012
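The event-log fields listed on the slide (timestamp, datalogger ID, sensor ID, event category, description, note taker) map naturally onto a small structured record. A minimal sketch in Python, assuming a CSV log file; the class name, field names, and `events.csv` convention are illustrative, not part of any site's actual system:

```python
import csv
from dataclasses import dataclass, asdict

@dataclass
class SensorEvent:
    # Fields mirror the slide: timestamp (or range), logger, sensor,
    # event category, free-text description, and who recorded it.
    timestamp: str
    datalogger_id: str
    sensor_id: str
    category: str        # e.g., "calibration", "maintenance", "failure"
    description: str
    recorded_by: str

def append_event(path, event):
    """Append one event to a CSV log, writing a header row on first use."""
    fields = list(asdict(event))
    try:
        new_file = open(path).readline() == ""
    except FileNotFoundError:
        new_file = True
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        if new_file:
            writer.writeheader()
        writer.writerow(asdict(event))
```

A plain-text log like this stays readable to both field technicians and data handlers, which addresses the accessibility problem raised on the previous slide.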

  9. Sensor data quality assurance and quality control (QA/QC)
     • Preventative QA measures in the field are desirable
     • Automated QC is necessary for:
       • near real-time use of data
       • efficient processing of high-volume data streams
     • Manual methods are unavoidable: a hybrid QC system will include subsequent manual inspection and additional QC checking
     • A QC system must:
       • apply qualifier flags to sensor data
       • accommodate feedback into policies and procedures
       • assure that all QC workflows are documented

  10. Quality assurance: preventative measures
     • Routine calibration and maintenance
       • Anticipate common repairs and stock replacement parts
       • Record known events that may impact measurements
     • Continuous monitoring and evaluation of the sensor network
       • Early detection of problems
       • Automated alerts; in situ web cams
     • Sensor redundancy
       • Ideal: triple the sensors, triple the loggers
       • Practical: cheaper, lower-resolution sensors, or correlated (proxy) sensors
       • Alternative: datalogger-independent sensor spot checks; portable instrument package

  11. Quality control on streaming data: possible checks in near real-time
     • Timestamp integrity (date/time)
       • Sequential, fixed intervals, i.e., checks for time-step or frequency variation
     • Range checks
       • Sensor specifications identify impossible values, not merely unlikely ones
       • Seasonal/reasonable historic values
     • Internal (plausibility) checks
       • E.g., TMAX - TMIN > 0; snow depth > snow water equivalent
       • Consistency of derived values
     • Variance checks
       • Sigma (standard deviation), delta/step (difference of subsequent pairs), and change-in-slope checks
       • E.g., outlier detection; an indicator of sensor degradation
       • Sensitivity is specific to site and sensor type
     • Persistence checks
       • Check for repeating values that may indicate sensor failure, e.g., freezing, sensor capacity issues
     • Spatial checks
       • Use correlations with redundant or nearby sensors, e.g., to check for sensor drift
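Three of the checks above (range, step/delta, persistence) are simple enough to sketch directly. The functions and flag letters below are illustrative, not from any particular site's QC system; here "A" stands for accepted and "S" for suspicious:

```python
def range_check(values, lo, hi):
    """Flag values outside physically possible sensor limits."""
    return ["S" if not (lo <= v <= hi) else "A" for v in values]

def step_check(values, max_step):
    """Flag jumps between consecutive readings larger than max_step."""
    flags = ["A"]
    for prev, cur in zip(values, values[1:]):
        flags.append("S" if abs(cur - prev) > max_step else "A")
    return flags

def persistence_check(values, max_repeats):
    """Flag runs of identical values that may indicate a stuck sensor."""
    flags = ["A"] * len(values)
    run_len = 1
    for i in range(1, len(values)):
        run_len = run_len + 1 if values[i] == values[i - 1] else 1
        if run_len > max_repeats:
            flags[i] = "S"
    return flags
```

As the slide notes, the thresholds (`lo`, `hi`, `max_step`, `max_repeats`) must be tuned per site and sensor type; there are no universal values.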

  12. Quality control on streaming data: data qualifiers (data flags)
     • Many vocabularies of data flags exist
     • A good approach:
       • A rich vocabulary of fine-grained, site-specific flags for streaming data, intended to guide local review
       • A simpler vocabulary of flags for "final" data for public consumption, e.g., 'Accepted', 'Missing', 'Estimated', 'Suspicious', plus an estimate of uncertainty
     • Certain types of qualifiers may be better kept as data columns, e.g., method shifts, sensor shifts
     • Place key documentation as close to the data value as possible
     Image from Campbell et al., BioScience, in press.
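The two-tier flag approach amounts to a mapping from fine-grained internal codes onto the simple public vocabulary named on the slide. A minimal sketch; the internal flag names are hypothetical, only the four public qualifiers come from the slide:

```python
# Hypothetical fine-grained site flags collapsed onto the simple
# public vocabulary ('Accepted', 'Missing', 'Estimated', 'Suspicious').
PUBLIC_FLAG = {
    "range_fail": "Suspicious",
    "step_fail": "Suspicious",
    "stuck_sensor": "Suspicious",
    "gap_filled": "Estimated",
    "no_data": "Missing",
    None: "Accepted",
}

def publish_flag(site_flag):
    """Collapse a detailed internal flag to a public-facing qualifier.

    Unknown internal codes default to 'Suspicious' so that an
    unmapped flag is never silently published as accepted.
    """
    return PUBLIC_FLAG.get(site_flag, "Suspicious")
```

Keeping the mapping in one table also documents, next to the data, how internal review decisions translate into the published record.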

  13. Sensor data archiving
     • Archiving strategies:
       • create well-documented data snapshots
       • assign unique, persistent identifiers
       • maintain data and metadata versioning
       • store data in text-based formats
     • Partner with community-supported archives, e.g., the LTER NIS, or federated archive initiatives such as DataONE
     • Best practices:
       • develop an archival data management plan
       • implement a sound data backup plan
       • archive raw data (though they need not be kept online)
       • make data publicly available once appropriate QA/QC procedures have been applied
       • assign a QC level to published data sets
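The first three strategies (documented snapshot, unique identifier, versioning) can be combined in one small routine. A sketch under simple assumptions: the identifier scheme and sidecar fields are illustrative, and real archives such as the LTER NIS or DataONE assign their own persistent identifiers:

```python
import hashlib
import json
import os
import time

def snapshot(csv_text, dataset_id, version, outdir="."):
    """Write a text-format data snapshot plus a small metadata sidecar.

    The snapshot is plain CSV (text-based, per the slide); the sidecar
    records an identifier unique to this version, a creation date, and
    a checksum so the archived copy can be verified later.
    """
    path = os.path.join(outdir, f"{dataset_id}_v{version}.csv")
    with open(path, "w") as f:
        f.write(csv_text)
    meta = {
        "identifier": f"{dataset_id}.v{version}",  # unique per version
        "created": time.strftime("%Y-%m-%d"),
        "sha256": hashlib.sha256(csv_text.encode()).hexdigest(),
        "format": "text/csv",
    }
    with open(path + ".json", "w") as f:
        json.dump(meta, f, indent=2)
    return path
```

Writing a fresh snapshot for each version, rather than overwriting, is what makes the identifiers persistent and the history auditable.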

  14. Quality control on streaming data: quality levels
     • Quality control is performed at multiple levels
     • Level 0 (raw streaming data)
       • Raw data; no QC and no data qualifiers (flags) applied
       • Preservation of original data streams is essential
     • Level 1 (QC applied, qualifiers added)
       • Provisional level (near real-time preparation): if released, provisional data must be labeled clearly
       • Published level (delayed release): the QC process is complete and the data are unlikely to change
     • Level 2 (gap-filled or estimated data)
       • Involves interpretation and may be controversial
       • Desirable when generating summarized data, but transparency is critical: flag estimated values
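The Level-1 to Level-2 step (gap filling with flagged estimates) can be illustrated with the simplest possible estimator. A minimal sketch, assuming missing readings arrive as `None` and flags use the public vocabulary from the earlier slide; real gap-filling methods are usually more sophisticated than linear interpolation:

```python
def gap_fill(values, flags):
    """Fill isolated missing readings by averaging the two neighbours,
    flagging each estimate so Level-2 data stay transparent.

    Only single gaps with valid values on both sides are filled;
    longer gaps are left as-is for manual review.
    """
    out_values, out_flags = list(values), list(flags)
    for i in range(1, len(values) - 1):
        if (values[i] is None
                and values[i - 1] is not None
                and values[i + 1] is not None):
            out_values[i] = (values[i - 1] + values[i + 1]) / 2
            out_flags[i] = "Estimated"
    return out_values, out_flags
```

Because the original Level-0/Level-1 series is preserved separately, a controversial estimate can always be traced back and revised.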

  15. Streaming data management middleware
     • Definition/purpose:
       • "Middleware," in the context of sensor networks, is computer software that enables communication and management of data from field sensors to a client such as a database or a website
       • The purpose of middleware includes the collection, analysis, and visualization of data
       • Middleware components are chained together into a scientific workflow
     • Examples:
       • Reading, reformatting, and exporting different data types or structures (input/output)
       • Automated QA/QC on data streams
       • Integration of field notes and documentation with the data
       • Archiving
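The idea of chaining middleware components into a workflow can be sketched as simple function composition. The stage names and behaviors below are assumptions for illustration, not any real middleware API:

```python
def pipeline(*stages):
    """Chain middleware stages (e.g., read -> QC -> export) into one workflow."""
    def run(data):
        for stage in stages:
            data = stage(data)
        return data
    return run

# Illustrative stages: parse a raw logger line, apply a crude range
# filter, and export the surviving values as text.
parse = lambda text: [float(x) for x in text.split(",")]
qc = lambda vals: [v for v in vals if -40 <= v <= 60]
export = lambda vals: ",".join(f"{v:.1f}" for v in vals)

workflow = pipeline(parse, qc, export)
```

Real middleware (DataTurbine, Kepler, the GCE toolbox, discussed next) provides the same chaining with streaming transport, provenance tracking, and GUIs on top.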

  16. Streaming data management middleware
     • Proprietary middleware/software:
       • Campbell Scientific LoggerNet
         • functionality to set up and configure a network of loggers
         • tools to program, visualize, monitor, and publish data
       • Vista Engineering: Vista Data Vision (VDV)
         • tools to store and organize data from various loggers
         • visualization, alarming, reporting, and web publishing features
       • YSI EcoNet (for YSI monitoring instrumentation)
         • delivery of data from the field to the YSI web server
         • visualization, reports, alarms, and email notification tools
       • NexSens iChart
         • Windows-based data acquisition software package
         • interfaces with popular products such as YSI, OTT, and ISCO sensors

  17. Sensor data management middleware: open-source environments for streaming data
     • GCE Data Toolbox for MATLAB (proprietary platform/limited open source)
       • GUI, visualization, metadata-based analysis; manages QA/QC rules and qualifiers; tracks provenance
     • Open Source DataTurbine Initiative
       • Streaming data engine that receives data from various sources and sends it to analysis and visualization tools, databases, etc.
     • Kepler Project (open source)
       • GUI; reuse and share analytical components/workflows with other users; tracks provenance; integrates software components and data sources

  18. Sensor Management Best Practices Workshop participants
     • Don Henshaw (AND), organizer
     • Corinna Gries (NTL), organizer
     • Renee Brown (SEV)
     • Adam Kennedy (AND)
     • Richard Cary (CWT)
     • Mary Martin (HBR)
     • Christine Laney (UTEP, JRN)
     • Jennifer Morse (NWT)
     • Chris Jones (DataONE)
     • Branko Zdravkovic (Univ of Saskatchewan)
     • Scotty Strachan (Univ of Nevada-Reno)
     • Jordan Read (USGS), vtc
     • Wade Sheldon (GCE), vtc
