
Enhanced File Consistency Checking

Enhanced File Consistency Checking. ADMT #10, Toulouse, France, 30 September - 2 October 2009. Mark Ignaszewski, FNMOC.


Presentation Transcript


  1. Enhanced File Consistency Checking ADMT #10 – Toulouse, France 30 September - 2 October 2009 Mark Ignaszewski FNMOC ADMT-10 30 September – 3 October 2009

  2. Background
     • Two types of format and consistency check failures:
        • Errors: These block distribution of the data on the GDAC until corrected.
        • Warnings: Things we’d like to see corrected, but the data is distributed on the GDAC as is
     • Unless otherwise noted, the tests generate errors and block data distribution

  3. Basic Format Checking
     • Basic format checking has not changed
     • All file types are checked to ensure the Dimensions, Variables, and Attributes conform to the Argo specification
        • Including “highly-desirable parameter” checks
     • All strings checked for NULL characters (Warning)
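
The NULL-character warning can be illustrated with a minimal Python sketch. It assumes the string variables have already been read from the file as raw bytes; the function name `find_null_strings` and the dictionary layout are hypothetical and are not part of the checker described in the slides.

```python
def find_null_strings(string_vars):
    """Return the names of string variables that contain a NULL byte.

    string_vars maps a variable name to its raw bytes (an assumed layout).
    A hit is reported as a warning, not an error.
    """
    return sorted(name for name, raw in string_vars.items() if b"\x00" in raw)


sample = {
    "PLATFORM_NUMBER": b"1900123 ",       # space padded: fine
    "INST_REFERENCE": b"APEX\x00\x00\x00\x00",  # NULL padded: warning
}
warnings = find_null_strings(sample)
```

Scanning the raw bytes rather than decoded text avoids depending on how a particular reader strips or converts padding.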

  4. Enhanced Profile Checking
     • Meta-data checks
     • Date checks
     • QC code checks
     • <PARAM> variable checks
     • D-mode specific file checks

  5. Meta-data Checks
     • PLATFORM_NUMBER:
        • 5 or 7 numeric digits (second digit “9” for 7)
     • DATA_STATE_INDICATOR:
        • One of the recommended codes from reference table 6
     • DIRECTION: ‘A’ or ‘D’
     • DATA_CENTRE: Valid for the DAC
     • DATA_MODE: ‘A’, ‘D’, or ‘R’
     • INST_REFERENCE: Set (Warning)
     • POSITIONING_SYSTEM: Set (Warning)
     • WMO_INST_TYPE: Valid (ref table 8) (Warning)
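
A few of the meta-data checks on this slide can be sketched in Python as follows. This is only an illustration: the function name `check_metadata` and the dictionary input are assumptions, and the checks that depend on reference tables 6 and 8 or on the per-DAC DATA_CENTRE list are left out because those tables are not reproduced in the slides.

```python
import re

# 5 digits, or 7 digits whose second digit is "9".
PLATFORM_RE = re.compile(r"\d{5}|\d9\d{5}")


def check_metadata(meta):
    """Return (errors, warnings) for a dict of already-decoded strings.

    The dict input is an assumed layout, not the checker's real interface.
    """
    errors, warnings = [], []
    if not PLATFORM_RE.fullmatch(meta.get("PLATFORM_NUMBER", "").strip()):
        errors.append("PLATFORM_NUMBER")
    if meta.get("DIRECTION") not in ("A", "D"):
        errors.append("DIRECTION")
    if meta.get("DATA_MODE") not in ("A", "D", "R"):
        errors.append("DATA_MODE")
    # "Set" checks only generate warnings.
    for name in ("INST_REFERENCE", "POSITIONING_SYSTEM"):
        if not meta.get(name, "").strip():
            warnings.append(name)
    return errors, warnings


good = {
    "PLATFORM_NUMBER": "1900123",
    "DIRECTION": "A",
    "DATA_MODE": "R",
    "INST_REFERENCE": "APEX",
    "POSITIONING_SYSTEM": "ARGOS",
}
result = check_metadata(good)
```

Keeping errors and warnings in separate lists mirrors the two failure types from the Background slide: only the first list would block distribution.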

  6. Date Checks
     • All dates are checked for validity and consistency
     • String date settings checked for validity
        • DATE_UPDATE / DATE_CREATION / HISTORY_DATE / CALIBRATION_DATE
        • 14 digit strings; valid (e.g., seconds must be 0 to 59)
     • DATE_UPDATE and DATE_CREATION must be set
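
The string-date validity test can be sketched as below. The sketch assumes the 14 digits are laid out as year, month, day, hour, minute, second (YYYYMMDDHHMISS); the function name `is_valid_date_string` is hypothetical.

```python
from datetime import datetime


def is_valid_date_string(text):
    """True if text is exactly 14 ASCII digits that name a real date and time."""
    if len(text) != 14 or not (text.isascii() and text.isdigit()):
        return False
    year, month, day = int(text[0:4]), int(text[4:6]), int(text[6:8])
    hour, minute, second = int(text[8:10]), int(text[10:12]), int(text[12:14])
    try:
        # datetime() rejects out-of-range fields, e.g. second=60 or 31 February.
        datetime(year, month, day, hour, minute, second)
    except ValueError:
        return False
    return True


ok = is_valid_date_string("20090930123059")
too_short = is_valid_date_string("200909301230")  # missing seconds
```

Constructing a `datetime` from the individual fields catches both range errors (seconds above 59) and calendar errors (a day that does not exist in the month) in one step.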

  7. Consistency of Dates
     [Timeline figure. Labels, in order of appearance: Jan 1, 1995; JULD; “Within 12 hours”; JULD_LOCATION; DATE_CREATION; HISTORY_DATE; “No order imposed”; CALIBRATION_DATE; DATE_UPDATE; “Within 2 days (Warning)”; GDAC file time]
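
The timeline on this slide survives only as a sequence of labels, so the following Python sketch encodes one plausible reading of it and should be treated as an assumption throughout: JULD no earlier than 1 January 1995, JULD_LOCATION within 12 hours of JULD, JULD no later than DATE_CREATION, DATE_CREATION no later than DATE_UPDATE, every HISTORY_DATE and CALIBRATION_DATE no later than DATE_UPDATE (with no order imposed between those two), and DATE_UPDATE within 2 days of the GDAC file time as a warning. The function name and argument layout are hypothetical.

```python
from datetime import datetime, timedelta

EARLIEST_JULD = datetime(1995, 1, 1)


def check_date_consistency(juld, juld_location, date_creation, date_update,
                           history_and_calib_dates, gdac_file_time):
    """Return (errors, warnings) under the assumed reading of the timeline."""
    errors, warnings = [], []
    if juld < EARLIEST_JULD:
        errors.append("JULD before 1995-01-01")
    if abs(juld_location - juld) > timedelta(hours=12):
        errors.append("JULD_LOCATION not within 12 hours of JULD")
    if juld > date_creation:
        errors.append("JULD after DATE_CREATION")
    if date_creation > date_update:
        errors.append("DATE_CREATION after DATE_UPDATE")
    # HISTORY_DATE and CALIBRATION_DATE are not ordered relative to each other.
    if any(d > date_update for d in history_and_calib_dates):
        errors.append("HISTORY_DATE or CALIBRATION_DATE after DATE_UPDATE")
    if abs(gdac_file_time - date_update) > timedelta(days=2):
        warnings.append("DATE_UPDATE not within 2 days of GDAC file time")
    return errors, warnings


juld = datetime(2009, 9, 1, 12)
report = check_date_consistency(
    juld=juld,
    juld_location=juld + timedelta(hours=2),
    date_creation=datetime(2009, 9, 2),
    date_update=datetime(2009, 9, 10),
    history_and_calib_dates=[datetime(2009, 9, 5)],
    gdac_file_time=datetime(2009, 9, 11),
)
```

Two of these rules, JULD after DATE_CREATION and HISTORY_DATE or CALIBRATION_DATE after DATE_UPDATE, match the failures listed on the date-check results slide; the others rest only on the label order.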

  8. QC Code Checks
     • JULD_QC and POSITION_QC: Valid values

  9. <PARAM> Checks
     • STATION_PARAMETER:
        • Only valid parameter names
        • No “blank” entries
        • No duplicate entries
        • PRES, TEMP, PSAL are required
     • Check that the <PARAM> variables exist for every STATION_PARAMETER
     • Check that no other <PARAM> variables (with data) exist in the file.
     • If mode = ‘A’ or ‘D’: Check that all <PARAM>_ADJUSTED have data
        • Subject to the D-mode “QC=4” rules in the QC manual
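
The STATION_PARAMETER rules above can be sketched in Python as follows. The function name, the `valid_names` set, and the `file_params` mapping (parameter name to "has data") are all hypothetical stand-ins; the real list of valid parameter names comes from the Argo reference tables and is not reproduced here. The *_ADJUSTED rule is omitted because it depends on the QC manual.

```python
REQUIRED = ("PRES", "TEMP", "PSAL")


def check_station_parameters(station_params, valid_names, file_params):
    """Return a list of error messages for one profile.

    station_params: names from STATION_PARAMETERS (may be space padded)
    valid_names:    set of allowed parameter names (assumed input)
    file_params:    dict of <PARAM> variable name -> True if it holds data
    """
    errors = []
    names = [p.strip() for p in station_params]
    if any(not n for n in names):
        errors.append("blank entry")
    names = [n for n in names if n]
    for n in sorted(set(names)):
        if n not in valid_names:
            errors.append(f"invalid name: {n}")
        if names.count(n) > 1:
            errors.append(f"duplicate entry: {n}")
        if n not in file_params:
            errors.append(f"variable missing: {n}")
    for r in REQUIRED:
        if r not in names:
            errors.append(f"required parameter missing: {r}")
    # Variables that carry data but are not declared in STATION_PARAMETERS.
    for name, has_data in sorted(file_params.items()):
        if has_data and name not in names:
            errors.append(f"undeclared variable with data: {name}")
    return errors


valid = {"PRES", "TEMP", "PSAL", "CNDC", "DOXY"}
clean = check_station_parameters(
    ["PRES    ", "TEMP    ", "PSAL    "],
    valid,
    {"PRES": True, "TEMP": True, "PSAL": True},
)
```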

  10. <PARAM> Checks (continued)
     • <PARAM>_QC and _ADJUSTED_QC
        • Only valid QC codes - No “fill values”
        • Missing data flagged with 0, 4, 9
        • Real-time profiles: Only codes 0 through 4
        • Required parameters (PRES, TEMP, PSAL):
           • Cannot be code 0 if data is not missing
     • PROFILE_<PARAM>_QC:
        • Valid value
        • Correct value
     • Check that N_PARAM and N_LEVEL are not larger than necessary (Warning)
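
The per-level QC rules can be sketched as below. Several details are assumptions and not statements from the slides: the set of valid codes (`"01234589"`), the fill value `99999.0` used to detect missing data, and the reading that the "codes 0 through 4" rule applies only to real-time levels that actually hold data. The function name `check_param_qc` is hypothetical, and the PROFILE_<PARAM>_QC and dimension checks are not covered.

```python
REQUIRED = ("PRES", "TEMP", "PSAL")


def check_param_qc(param, values, qc_codes, real_time,
                   fill_value=99999.0, valid_codes="01234589"):
    """Return a list of (level, message) errors for one <PARAM>/<PARAM>_QC pair."""
    valid = set(valid_codes)  # assumed code set; a blank or fill QC is invalid
    errors = []
    for level, (value, qc) in enumerate(zip(values, qc_codes)):
        missing = value == fill_value
        if qc not in valid:
            errors.append((level, "invalid QC code"))
        elif missing and qc not in {"0", "4", "9"}:
            errors.append((level, "missing data must be flagged 0, 4, or 9"))
        elif not missing and real_time and qc not in {"0", "1", "2", "3", "4"}:
            errors.append((level, "real-time QC must be 0 through 4"))
        elif not missing and qc == "0" and param in REQUIRED:
            errors.append((level, "required parameter has data but QC is 0"))
    return errors


clean = check_param_qc("TEMP", [10.0, 99999.0], ["1", "9"], real_time=True)
```

Each level reports at most one problem here, which keeps the output short; a production checker might prefer to report every violated rule.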

  11. D-mode File Checks
     • DATA_MODE = “D”
     • DATA_STATE_INDICATOR = “2C” or “2C+”
     • Same <PARAM> in PARAMETERS as in STATION_PARAMETERS
     • If PRES_ADJUSTED_QC = 4, TEMP_ADJUSTED_QC and PSAL_ADJUSTED_QC = 4
        • *_ADJUSTED = missing
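
The “QC = 4” rule on this slide can be illustrated with a short Python sketch. It assumes the rule is applied level by level and that "missing" means the adjusted value equals a fill value of `99999.0`; both points, along with the function name `check_dmode_qc4`, are assumptions and are not taken from the slides.

```python
FILL_VALUE = 99999.0  # assumed fill value marking missing adjusted data


def check_dmode_qc4(pres_adj_qc, temp_adj_qc, psal_adj_qc,
                    pres_adj, temp_adj, psal_adj):
    """Return (level, message) errors for levels where PRES_ADJUSTED_QC is '4'."""
    errors = []
    for level, pres_qc in enumerate(pres_adj_qc):
        if pres_qc != "4":
            continue
        if temp_adj_qc[level] != "4" or psal_adj_qc[level] != "4":
            errors.append((level, "TEMP/PSAL_ADJUSTED_QC must also be 4"))
        adjusted = (pres_adj[level], temp_adj[level], psal_adj[level])
        if any(v != FILL_VALUE for v in adjusted):
            errors.append((level, "*_ADJUSTED must be missing"))
    return errors


clean = check_dmode_qc4(
    pres_adj_qc=["1", "4"], temp_adj_qc=["1", "4"], psal_adj_qc=["1", "4"],
    pres_adj=[5.0, FILL_VALUE], temp_adj=[12.3, FILL_VALUE],
    psal_adj=[35.1, FILL_VALUE],
)
```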

  12. D-mode File Checks
     • SCIENTIFIC_CALIB_COMMENT and CALIBRATION_DATE set
        • for every <PARAM> and N_CALIB
     • At least one HISTORY record
        • HISTORY_INSTITUTION and _DATE set
     • Plus, the <PARAM> and date checks previously discussed

  13. Results
     • Tested every cycle with a 1 or 5 in the ten’s digit
        • 015, 055, 115, 155
     • Warnings
        • 9 DACs have problems with NULLs in strings
           • KORDI seems to be OK
           • Some only in few variables, some in many variables
        • A couple “N_LEVELS too large”
           • KORDI sets the variable larger than necessary a lot
        • “INST_REFERENCE” not set
           • 1 JMA file

  14. Results: Date checks
     • JULD after DATE_CREATION
        • Coriolis – only a few files
        • INCOIS – many files – large time differences
     • HISTORY_DATE and/or CALIB_DATE after DATE_UPDATE
        • CSIO – Many files – Big time differences
     • Invalid dates:
        • AOML, INCOIS: Bad values
        • MEDS: Too short (missing seconds)

  15. Results: <PARAM> checks
     • ‘A’ or ‘D’: *_ADJUSTED not set
        • Identified some missing *_ADJUSTED data
        • Was not handling the “QC = 4” rule correctly
     • PROFILE_<PARAM>_QC: Incorrect values
        • AOML, Coriolis, CSIO, JMA, MEDS
     • Missing variables
        • CSIRO, INCOIS (DOXY_ADJUSTED_ERROR)
     • <PARAM>_QC and *_ADJUSTED_QC
        • Numerous inconsistencies reported
        • Some illegal values

  16. Results: D-file checks
     • DATA_STATE_INDICATOR: Coriolis and MEDS:
        • Question about “2C+”
        • A few MEDS files set to “2B”

  17. Results: D-file checks: PARAMETER and SCIENTIFIC_CALIBRATION_*
     • PARAMETER or CALIB_DATE not set for many files
        • PRES, TEMP: AOML, CSIO, MEDS
        • TEMP, CNDC: CSIRO
        • PRES, TEMP, PSAL: JMA
     • SCI_CALIB_COMMENT not set for many files
        • TEMP, CNDC: CSIRO
        • PRES, TEMP: JMA
     • N_CALIB larger than necessary in many files
        • BODC (many files), Coriolis (few files)
     • No calibration information in some D-files: Coriolis

  18. Plan
     • Implement in routine processing AS ADVISORY
        • 20 Oct 2009
     • Transition to IFREMER: Oct-Nov 2009
     • Implement as operational checker: end-Nov 2009

  19. Still Needed
     • <PARAM>_ADJUSTED_ERROR: Set
     • Cross-file checks:
        • Cycle-to-cycle:
           • Consistent positions and times
           • Duplicates
        • Meta-data file comparisons
     • Greylist: Should QC be checked?
