Storm Tracking with Remote Data and Distributed Computing
Lizzie Froude and Kevin Hodges
ESSC, University of Reading
Email: email@example.com
Homepage: http://www.nerc-essc.ac.uk/~lsrf
Outline
• Background and motivation
  – Meteorology
  – eScience
• TRACK Internet Service
  – What it does
  – How it works
  – Demo
• Use of Internet Service
Prediction of Storms
• PhD aim: to explore the prediction of storms
• Storm (extratropical cyclone)
  – Important for day-to-day weather in the midlatitudes, through their presence or absence
  – Bring stormy, wet and windy weather
  – Provide essential rainfall
  – Can cause large amounts of damage (flooding, strong winds)
  – E.g. the Great Storm of October 1987 hit southern England and north-west France. It caused severe damage and 18 people died. It was badly predicted.
TRACK
• Storm identification and tracking software (Kevin Hodges)
• Identifies feature points (low pressure centres or vorticity maxima and minima) through a time series of data
• Links the points together to form trajectories of the storms' paths (storm tracks)
• We use TRACK to investigate the prediction of storms
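To make the "link points into trajectories" step concrete, here is a minimal sketch of one way to do it. It is not TRACK's own algorithm: it is a simple greedy nearest-neighbour stand-in, and the function name `link_tracks`, the flat (x, y) coordinates and the `max_dist` threshold are all assumptions made for illustration.

```python
import math


def link_tracks(frames, max_dist):
    """Greedy nearest-neighbour linking (illustrative, not TRACK's method).

    frames   -- list of time steps; each is a list of (x, y) feature points
    max_dist -- largest allowed displacement between consecutive time steps
    Returns a list of tracks, each a list of (t, x, y) tuples.
    """
    active = []    # tracks that were extended at the previous time step
    finished = []  # tracks that found no continuation
    for t, points in enumerate(frames):
        unused = list(points)
        next_active = []
        for track in active:
            _, x0, y0 = track[-1]
            best, best_d = None, max_dist
            # Pick the closest unclaimed point within max_dist.
            for p in unused:
                d = math.hypot(p[0] - x0, p[1] - y0)
                if d <= best_d:
                    best, best_d = p, d
            if best is None:
                finished.append(track)
            else:
                unused.remove(best)
                track.append((t, best[0], best[1]))
                next_active.append(track)
        # Every point left over starts a new track.
        for p in unused:
            next_active.append([(t, p[0], p[1])])
        active = next_active
    return finished + active


if __name__ == "__main__":
    frames = [[(0, 0), (10, 10)], [(1, 0), (11, 10)], [(2, 0)]]
    for track in link_tracks(frames, max_dist=2):
        print(track)
```

A greedy match like this can make poor choices when storms are close together, which is one reason a real tracking scheme would use a more careful matching criterion.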
eScience Problem
• Meteorological datasets are getting larger
  – E.g. ensemble prediction systems: multiple forecasts run from slightly different initial states
• Distributed archiving
  – Difficult to generate diagnostics from one location
• eScience PhD aim: to develop an Internet Service that allows people to run TRACK with remote datasets and distributed computing
TRACK Internet Service
• Web browser interface to the TRACK program (Java servlet and JSP)
• Uses remote datasets (OPeNDAP)
  – NCEP re-analysis and NCEP ensemble prediction system
• Construct a list of jobs, which are run on multiple computers (Condor)
• Progress of each job can be monitored
• Computed storm tracks can be downloaded
• Storm tracks can be plotted in the browser
OPeNDAP (DODS)
• OPeNDAP = Open-source Project for a Network Data Access Protocol (http://www.opendap.org/)
• Allows data to be accessed over the Internet using a client/server model
• Data analysis programs (e.g. TRACK) that use data access APIs such as netCDF can be converted into OPeNDAP clients by linking them with the OPeNDAP versions of the API libraries
• Remote data are accessed in the same way as local data, but with a URL instead of a local path
• Sub-sampling facility selects a specific section of the data by appending information to the URL
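As an illustration of the URL-based sub-sampling described above, the sketch below builds a dataset URL with an index-range constraint appended, in the `variable[start:stride:stop]` form used by OPeNDAP constraint expressions. The server address, the variable name `slp` and the helper name `subset_url` are hypothetical and not taken from the TRACK service.

```python
def subset_url(base_url, variable, slices):
    """Append an index-range constraint to an OPeNDAP dataset URL.

    slices -- one (start, stride, stop) tuple per dimension of the
              variable; indices are zero-based and stop is inclusive.
    """
    ranges = "".join(
        f"[{start}:{stride}:{stop}]" for start, stride, stop in slices
    )
    return f"{base_url}?{variable}{ranges}"


if __name__ == "__main__":
    # Hypothetical server and variable, for illustration only.
    url = subset_url(
        "http://example.org/dods/reanalysis.nc",
        "slp",
        [(0, 1, 3), (10, 1, 20), (0, 1, 143)],  # time, lat, lon
    )
    print(url)
```

A client linked against the OPeNDAP version of its data library would then open this URL where it would otherwise open a local file path.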
Aggregation Server
• NCEP re-analysis data are stored as yearly files (Jan-Dec). What about DJF (December-February) seasons, which span two files?
• OPeNDAP aggregation server used to create an aggregated dataset by effectively merging individual files so that they appear as one large file
• Files can be local or remote
• NCEP re-analysis data aggregated using the aggregation server at ESSC
• Individual 1-year files can be treated as one large 50+ year file
• Use sub-sampling to select any time period within the 50+ years of data
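Once the yearly files appear as one continuous time axis, a DJF season becomes a single contiguous index range. The sketch below computes that range. The archive start date (1 January 1948) and the 6-hourly time step are assumptions made for the example, as are the function names; the real values should be read from the aggregated dataset's time coordinate.

```python
from datetime import datetime, timedelta


def time_index(when, start, step_hours):
    """Zero-based index of `when` on a regular time axis starting at `start`."""
    delta = when - start
    return int(delta.total_seconds() // (step_hours * 3600))


def djf_index_range(year, start, step_hours=6):
    """Inclusive (first, last) indices for the DJF season that begins
    in December of `year` and ends at the last step of February."""
    first = datetime(year, 12, 1)
    last = datetime(year + 1, 3, 1) - timedelta(hours=step_hours)
    return (
        time_index(first, start, step_hours),
        time_index(last, start, step_hours),
    )


if __name__ == "__main__":
    archive_start = datetime(1948, 1, 1)  # assumed start of the aggregation
    print(djf_index_range(1948, archive_start))
```

The resulting pair can be used as the time range of a sub-sampling request, so the season is fetched in one go even though it crosses a file boundary on the server.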
Condor
• Condor is a software system that can manage a large collection of jobs using the computational power of machines in a network
• The user submits a list of jobs and Condor decides where and when to run them
• In the TRACK Internet Service, the user constructs a job list and then submits it to the ESSC Condor pool
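A job list of this kind is typically handed to Condor as a submit description file with one `queue` statement per job. The sketch below generates such a file as text. The wrapper script name `run_track.sh`, its arguments and the output file names are hypothetical; the slides do not say how the service itself builds its job lists.

```python
def make_submit_description(executable, job_args):
    """Build the text of a Condor submit description file.

    executable -- program every job runs (hypothetical wrapper script)
    job_args   -- one argument string per job
    """
    lines = [
        "universe = vanilla",
        f"executable = {executable}",
        "log = track_jobs.log",
        "",
    ]
    for i, args in enumerate(job_args):
        # Settings given before each `queue` apply to the job it queues.
        lines += [
            f"arguments = {args}",
            f"output = job_{i}.out",
            f"error = job_{i}.err",
            "queue",
            "",
        ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical arguments: one job per DJF season.
    jobs = [f"--season DJF --year {year}" for year in (1990, 1991, 1992)]
    print(make_submit_description("run_track.sh", jobs))
```

The generated text would be saved to a file and passed to `condor_submit`, after which Condor chooses the machines and the order in which the jobs run.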
Use of TRACK Internet Service
• Used to compute storm tracks from large amounts of NCEP (National Centers for Environmental Prediction) ensemble forecast data
• Distributed computing helped with the large amount of data processing
• Accessing remote data reduced the amount of data that needed to be stored locally
• Statistics have been generated from the computed storm tracks
• These provide detailed information about the prediction of storms, e.g.
  – Position of storms is predicted better than intensity
  – Forecast storms generally move too slowly
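One statistic behind a statement such as "forecast storms move too slowly" is the mean propagation speed along a track. The sketch below shows how it could be computed from track points. It assumes each point is (hours, latitude, longitude) in degrees and uses a spherical Earth of radius 6371 km; the sample tracks are made up, and this is not the verification method used in the study.

```python
import math

EARTH_RADIUS_KM = 6371.0  # spherical-Earth assumption


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = (math.sin(dlat / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def mean_speed_kmh(track):
    """Mean speed along a track of (hours, lat, lon) points."""
    distance = sum(
        great_circle_km(a[1], a[2], b[1], b[2])
        for a, b in zip(track, track[1:])
    )
    elapsed = track[-1][0] - track[0][0]
    return distance / elapsed


if __name__ == "__main__":
    # Made-up tracks, for illustration only.
    analysis = [(0, 50.0, -30.0), (6, 51.0, -26.0), (12, 52.0, -22.0)]
    forecast = [(0, 50.0, -30.0), (6, 50.8, -27.0), (12, 51.5, -24.0)]
    print(mean_speed_kmh(analysis), mean_speed_kmh(forecast))
```

Comparing this value between matched forecast and analysis tracks, averaged over many storms, gives a simple measure of a speed bias.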