Phillip E. Shafer Henry E. Fuelberg Florida State University April 4, 2007

Developing Gridded Forecast Guidance for Warm Season Lightning over Florida Using the Perfect Prognosis Method and Mesoscale Model Output

Accurate Lightning Forecasts Are Important to FPL
• Lightning leads to outages.
• FPL crews should be ready to respond.
• Don’t want unneeded crews.

Project Timeline
• Phase I: 1 August 2002 – 31 July 2003
• Phase II: 1 June 2003 – 31 May 2004
• Phase III: 1 June 2004 – 31 May 2005
• Phase IV: 1 June 2005 – present

Phase I: Develop Lightning Climatologies
[Figure: flashes/km^2/regime day; panels: Southeast Flow, Southwest Flow]

Detailed Climatologies For Dispatch Centers

Phases II & III
• Equations derived for 11 FP&L service areas.
• Morning radiosonde parameters used as predictors for afternoon lightning in each area.
• -- Miami, Tampa, Jacksonville, Cape Canaveral
• Generally, the sounding closest to each forecast area was used.

Phase 3 Sample Forecast

Forecasts Useful to FP&L Dispatchers:
“I can state, unequivocally, that SSD uses the forecasts daily as an integral part of the resource and switching decision making process.”
“On many occasions, we may have not held resources based on the projected weather levels only but the lightning forecasts were solid enough to override that decision- to our advantage I might add.”

Phase IV: Space and Time Varying Guidance for all of Florida

Presentation Outline
• 1. Motivation and Objectives
• 2. Background
• 3. Data
• 4. Model Development
• 5. Model Parameters
• 6. Results for Dependent Data
• 7. Results for Independent Test
• 8. Summary & Conclusions

1. Motivation and Objectives

Motivation
• Lightning is one of the leading causes of weather-related fatalities in the U.S.
• Lightning can cause damage to trees and utility lines, leading to disruptions in power and communications.
• Florida is the lightning capital of the U.S.
• Many heavily populated areas are vulnerable.
• Skillful probabilistic guidance in the 3-12 h time frame would have many potential societal benefits.
[Figure: flashes km^-2 warm season^-1, 1989-2006 (May-September)]

Objectives
• 1. Use the perfect prognosis (PP) method to develop a high-resolution gridded forecast guidance product for warm season cloud-to-ground (CG) lightning for all of Florida:
• -- Equations to produce spatial probability forecasts for one or more CG flashes, and the probability of exceeding various flash count percentile thresholds.
• -- 10 x 10 km grid, 3-h intervals
• 2. Evaluate the utility and skill of the PP scheme when applied to forecast output from several mesoscale models during an independent test period (2006 warm season).

2. Background

Myriad Factors
• The sea breeze is usually the dominant forcing mechanism over Florida during the warm season.
• Interactions between the sea breeze, the prevailing wind, and coastline curvature have been shown to influence lightning patterns (e.g., Lopez and Holle 1987; Hodanish et al. 1997; Camp et al. 1998; Lericos et al. 2002).
• Myriad other factors influence the timing and location of convection and lightning:
• -- Local thermal circulations (e.g., water conservation areas, lakes, rivers, etc.)
• -- Urban effects (e.g., Westcott 1995; Steiger et al. 2002)
• -- Thunderstorm outflows

Cloud Microphysics
• Lightning ultimately is governed by cloud microphysical processes that are poorly resolved by NWP models.
• Factors influencing cloud electrification are poorly understood.
• Several hypotheses have been proposed:
• -- Precipitation hypothesis (Reynolds et al. 1957)
• -- Convection hypothesis (Vonnegut 1963)
• -- Non-inductive ice-ice collision mechanism (Williams 1985)
• Hypotheses depend on a vigorous updraft and robust ice phase for charge generation (Price and Rind 1992, 1993).
• But, advances have been made in our understanding of the factors influencing lightning production.

Statistical Studies
• A variety of statistical techniques has been used to develop forecast models for thunderstorms and lightning:
• -- Multiple linear regression often employed in earlier studies (less computationally demanding).
• -- Binary logistic regression more appropriate when predictand is “yes” or “no” (e.g., Mazany et al. 2002; Bothwell 2002; Lambert et al. 2005; Shafer and Fuelberg 2006).
• -- Classification and regression trees (e.g., Burrows et al. 2004).
• Many studies have used data from morning soundings to forecast afternoon lightning.
• Data from NWP models are more location- and time-specific.

Model Output Statistics
• Model Output Statistics (MOS): Objective forecasting technique in which statistical relationships are determined between a predictand and variables forecast by an NWP model.
• Advantage: Model biases and local climatology are automatically built into the equations. Usually the method of choice when practical.
• Drawback: NWP models are constantly changing. Any modifications to the NWP model that change systematic model errors require redevelopment of the MOS equations.

Perfect Prognosis
• Perfect prognosis (PP): Statistical relationships are determined between observations of the predictand and observed atmospheric predictors.
• Advantages:
• -- Equations are developed without NWP forecasts (i.e., they are model independent).
• -- Equations can be used with any NWP model and forecast projection, even as the models change.
• Drawback: Assumes a “perfect” forecast of the predictors by the NWP model and thus does not account for model biases.
• Bothwell (2002): Used PP method to develop forecast equations for CG lightning over the western U.S.

3. Data

Study Domain
• Grid spacing = 10 km
• Only land grid points used in model development

Lightning Data
• The dependent variable
• National Lightning Detection Network
• System-wide upgrades in 1995 & 2002
• 1995-2005 warm seasons used to develop climatological predictors.
• 2002-2005 warm seasons used in equation development.
• Data quality controlled for duplicate flashes and non-CG discharges.
• Flashes summed within a 10-km radius of each grid point during each 3-h period (e.g., 0000-0259 UTC, …, 2100-2359 UTC).
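The flash summation described in the last bullet can be sketched roughly as follows. This is a minimal illustration only, not the authors' processing code: the function names, the haversine distance, and the (lat, lon, hour) flash tuples are assumptions, and the sketch handles a single day.

```python
import math
from collections import Counter

EARTH_RADIUS_KM = 6371.0

def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in km between two points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def period_index(hour_utc):
    """0 for 0000-0259 UTC, 1 for 0300-0559 UTC, ..., 7 for 2100-2359 UTC."""
    return hour_utc // 3

def sum_flashes(flashes, grid_points, radius_km=10.0):
    """Count flashes within radius_km of each grid point in each 3-h period.

    flashes: iterable of (lat, lon, hour_utc) tuples (hypothetical layout).
    grid_points: list of (lat, lon) tuples.
    Returns a Counter keyed by (grid_index, period_index).
    """
    totals = Counter()
    for lat, lon, hour in flashes:
        for g, (glat, glon) in enumerate(grid_points):
            if distance_km(lat, lon, glat, glon) <= radius_km:
                totals[(g, period_index(hour))] += 1
    return totals
```

Because the search radius equals the grid spacing, a flash can count toward more than one grid point; the brute-force double loop is kept for clarity rather than speed.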

Lightning Predictands
• 3-h flash totals transformed into binary variables.
• “1” if one or more flashes or “0” if no lightning
• Binary variables also assigned based on whether the flash total exceeds the 50th, 75th, 90th, and 95th percentiles for a given 3-h period.
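A small sketch of how such binary predictands might be built from 3-h flash totals. Two points are assumptions, not stated on the slide: percentiles are computed over cases with at least one flash, and a nearest-rank percentile definition is used. The function names are hypothetical.

```python
import math

def nearest_rank_percentile(values, p):
    """Nearest-rank percentile (assumed definition) of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

def make_predictands(counts, percentiles=(50, 75, 90, 95)):
    """Turn 3-h flash totals into binary predictands.

    "any" is 1 for one or more flashes, else 0.
    "pXX" is 1 when the total exceeds the XXth percentile of the
    nonzero totals (assumption: percentiles taken over lightning cases).
    """
    nonzero = [c for c in counts if c > 0]
    out = {"any": [1 if c >= 1 else 0 for c in counts]}
    for p in percentiles:
        threshold = nearest_rank_percentile(nonzero, p)
        out[f"p{p}"] = [1 if c > threshold else 0 for c in counts]
    return out
```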

Rapid Update Cycle (RUC)
• “Observed” atmospheric predictors derived from RUC analyses during 2002-2005 warm seasons (May-Sept).
• RUC data sources:
• -- 1. Atmospheric Radiation Measurement (ARM) Program (http://www.arm.gov/xds/static/ruc.stm)
• -- 2. National Climatic Data Center (http://nomads.ncdc.noaa.gov)
• 20-km, 50-level, hourly version (RUC20) implemented at NCEP during April 2002, with improvements in the analysis/physics.
• 13-km version (RUC13) implemented at NCEP on 28 June 2005 with further improvements in the analysis/physics.
• ~1.2 TB of RUC GRIB data was acquired and processed!

Model Analyzed Predictors
• Plethora of RUC-analyzed predictors investigated for possible inclusion in candidate predictor pool.
• Parameters found useful in previous studies were examined:
• -- Temperature (layer thickness, temperature advection, cold cloud thickness, etc.)
• -- Moisture (moisture flux convergence, theta-e advection, PW, layer mean RH, etc.)
• -- Stability (most unstable CAPE in various layers, CIN, best lifted index, Showalter Stability Index, TT, KI, temperature and theta-e lapse rates, etc.)
• -- Wind (wind divergence, vorticity, vorticity advection, layer mean U and V components, layer mean speed, layer shear).

Model Analyzed Predictors
• All parameters calculated from the RUC 0-h temperature, dew point, wind, height, and surface pressure fields valid every three hours (e.g., 0000 UTC, 0300 UTC, …, 2100 UTC).
• Fields interpolated to array of 10 km grid points and transformed into a vertical sounding.
• RUC cloud hydrometeor profiles found to be unusable.
• Assumption: The model analyses give the best estimate of the state of the atmosphere at the analysis time, and thus can be treated as “observations” for purposes of developing the PP equations.
• We focused mainly on parameters that are well handled by today’s NWP models.

Statistics Software
• S-PLUS version 6.1 for Windows.
• Statistical Package for the Social Sciences (SPSS) version 11.5 for Windows.
• Both are state-of-the-art software packages with a wide range of analysis and modeling capabilities.

4. Model Development

Map Type Predictors
• Pattern type lightning frequencies developed and used as candidate predictors.
• Capture local enhancements due to interactions between the low-level wind, thermal circulations, and coastline topography, which are not well resolved by NWP models.
• 3-hourly observed sea-level pressure (SLP) fields used for pattern classification; the SLP pattern implies the direction and speed of low-level flow.
• SLP fields obtained from RUC analyses spanning the 1998-2005 warm seasons (~1224 days).
• Simple correlation technique used to develop the map types (e.g., Lund 1963, Reap 1994).

Map Type Predictors
• SLP fields interpolated to array of 100 km grid points.

Map Type Predictors
• 5 map types developed using correlation threshold of 0.70.
• 2 dominant types (A and B) comprise ~44% of the sample.
• ~22% unclassified at a threshold of 0.70.
• Relative lightning frequencies and the unconditional mean number of flashes calculated for each map type and 3-h period.
• When developing equations, all unclassified maps were assigned the type with which they were most correlated.
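The assignment step of the map typing can be illustrated with a short sketch. It covers only how a single SLP field might be matched to already-derived type composites by correlation; it does not reproduce how the types themselves were derived. The function names and the flat-list field layout are assumptions.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def classify(slp, composites, threshold=0.70):
    """Match an SLP field to the most correlated map type.

    slp: SLP values on the coarse grid, flattened to a list.
    composites: dict mapping type name -> composite SLP field (same layout).
    Returns (best_type, correlation, classified), where classified is False
    when the best correlation falls below the threshold. In that case
    best_type is still the most correlated type, which is how unclassified
    maps are assigned during equation development.
    """
    scored = [(name, pearson(slp, comp)) for name, comp in composites.items()]
    best_type, best_r = max(scored, key=lambda item: item[1])
    return best_type, best_r, best_r >= threshold
```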

Map Type Predictors: Type A
[Figure panels: Type A composite; mean no. flashes, 1800-2059 UTC]
• High northeast of Florida
• Prevailing E-SE flow
• Most lightning confined to West Coast and east of Lake Okeechobee.

Map Type Predictors: Type B
[Figure panels: Type B composite; mean no. flashes, 1800-2059 UTC]
• Ridge over South Florida
• SW flow across the state
• Lightning focused along East Coast and Big Bend region.

Map Type Predictors: Type C
[Figure panels: Type C composite; mean no. flashes, 1800-2059 UTC]
• Transition between A and B
• SE flow over South Florida, S-SW flow across the north.
• Lightning maxima evident along both coasts.

Map Type Predictors: Type D
[Figure panels: Type D composite; mean no. flashes, 1800-2059 UTC]
• High north of Florida, lower pressure to the SE.
• Most common after cold frontal passage.
• Dry NE flow confines most lightning to South Florida.

Map Type Predictors: Type E
[Figure panels: Type E composite; mean no. flashes, 1800-2059 UTC]
• Variation of type B: lobe of high pressure over Gulf.
• W-NW flow across the state.
• Lightning confined to East Coast and Big Bend, with less coverage than type B.

Generalized Linear Models (GLMs)
• Multiple linear regression (MLR) assumptions of constant variance and Gaussian residuals rarely are met with count data, which can lead to undesirable and sometimes nonsensical results.
• We considered several regression methods:
• -- Forecasting one or more flashes: Binary Logistic Regression
• -- Forecasting the amount of lightning: Poisson and Negative Binomial Regression
• GLMs can be used for response variables that follow any probability distribution in the exponential family (e.g., Normal, Binomial, Poisson, Negative Binomial, etc.).
• GLMs accommodate non-Gaussian distributions of residuals and non-constant variance.

Binary Logistic Regression
• Most appropriate when predictand is “yes” or “no”
• Logit link function relates the log odds to a linear combination of predictors.
• Probabilities bounded on the interval [0,1]
• Accommodates Bernoulli distribution of residuals.
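As a rough illustration of how a fitted binary logistic regression equation turns predictor values into a probability, the sketch below applies the inverse of the logit link. The function name, predictor names, and coefficient values are hypothetical placeholders, not values from this study.

```python
import math

def blr_probability(intercept, coefs, predictors):
    """Probability of one or more flashes from a logistic regression equation.

    The linear predictor eta is the log odds; the inverse logit maps it
    to a probability strictly between 0 and 1.
    """
    eta = intercept + sum(coefs[name] * predictors[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical coefficients and predictor values, for illustration only.
example_coefs = {"pw": 0.8, "lifted_index": -0.4}
example_case = {"pw": 1.5, "lifted_index": -3.0}
example_prob = blr_probability(-2.0, example_coefs, example_case)
```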

Poisson Regression
• More appropriate model for count data
• Log link function linearizes the expected value (μ) of the dependent variable (y)
• Poisson probability model assumes that events occur randomly and at a constant average rate (μ), with Var(y) = φμ, where φ is a dispersion parameter.
• Poisson model assumes φ = 1

Count Distribution
• Strongly skewed distribution
• Many cases with 10 or fewer flashes, few with 100 or more.
• Very large variance (~80 times greater than the mean)
• Data significantly over-dispersed with respect to Poisson model (dispersion parameter φ >> 1)
• Likely cause: counts generated by an inhomogeneous Poisson process; count rates vary in space and time.
[Figure: count distribution for the 1800-2059 UTC period, cases with one or more flashes]
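A quick check for over-dispersion is the ratio of the sample variance to the sample mean, which is near 1 for Poisson-distributed counts. The sketch below is a crude unconditional estimate (it ignores the predictors), and the function name is an assumption.

```python
def dispersion_ratio(counts):
    """Sample variance divided by sample mean for a list of counts.

    Values near 1 are consistent with a Poisson model; values much
    greater than 1 indicate over-dispersion.
    """
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return variance / mean
```

For a skewed toy sample such as [1, 1, 2, 3, 5, 8, 120], the ratio is far above 1, which is the kind of behavior the slide reports for the flash counts.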

Negative Binomial Regression
• Alternative probability model with shape parameter θ
• Var(y) is a quadratic function of the mean μ:
• Var(y_i | μ[x_i]) = φ (μ[x_i] + θ^-1 μ[x_i]^2), where φ is a dispersion parameter
• More accurately characterizes the uncertainty in the predicted count than does the Poisson model.

Poisson vs. Negative Binomial
• Poisson model poorly represents count distribution
• Negative binomial captures large number of cases with 10 or fewer flashes.

Negative Binomial Regression
• NB distribution has been used in previous studies to model thunderstorms at KSC.
• No known study has used the NB as the probability model for lightning counts.
• Since the count distribution is left-truncated at y=1, we can treat y-1 as having a NB distribution.
• Probabilities for each y-1 must sum to 1.
• The fitted distribution gives the probability of exceeding any count threshold T.
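The exceedance probability can be sketched under stated assumptions: y - 1 follows a negative binomial distribution in the common mean/shape parameterization (mean mu, shape theta, variance mu + mu^2/theta), and "exceeding T" means y > T. The slide's own equation is not available here, so this is an illustration, not the authors' formula; the function names are hypothetical.

```python
import math

def nb_pmf(k, mu, theta):
    """Negative binomial probability of count k, with mean mu and shape theta."""
    log_p = (math.lgamma(k + theta) - math.lgamma(theta) - math.lgamma(k + 1)
             + theta * math.log(theta / (theta + mu))
             + k * math.log(mu / (theta + mu)))
    return math.exp(log_p)

def prob_exceed(threshold, mu, theta):
    """P(y > threshold) when y >= 1 and y - 1 ~ NB(mu, theta).

    mu is the mean of the shifted count y - 1 (assumption).
    y > T is equivalent to y - 1 >= T, so the NB probabilities
    for 0 .. T-1 are summed and subtracted from 1.
    """
    if threshold < 1:
        return 1.0
    cdf = sum(nb_pmf(k, mu, theta) for k in range(threshold))
    return max(0.0, 1.0 - cdf)
```

With theta = 1 the distribution reduces to a geometric distribution, which gives a convenient hand check: for mu = 1, the probability that y exceeds 3 is 0.125.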

Regionalized Approach
• Domain first divided into 9 areas, with separate models developed for each.
• Best results achieved by consolidating 9 areas into 4 larger regions:
• -- East Coast
• -- West Coast
• -- Panhandle
• -- Alabama & Georgia
• Regions overlap to minimize problems at regional boundaries.

Final Candidate Predictors
• Long list of candidate predictors contains redundant information.
• Principal component analysis used to select subset of predictors with less mutual correlation.
• Correlations with lightning predictands are low; no single observed predictor is a good indicator of lightning.
• Power terms and cross products (interactions) also calculated and included in final predictor pool.
• 3-h change in each parameter also included (trend indicators).
• Map type predictors
• Climatological predictors
[Correlations for East Coast region, 1800-2059 UTC period]
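The derived predictors mentioned above (power terms, cross products, and 3-h changes) could be generated along these lines. This is a sketch under assumptions: only squared terms and pairwise products are shown, and the naming scheme and function name are hypothetical.

```python
from itertools import combinations

def expand_predictors(current, previous):
    """Add squared terms, 3-h changes, and pairwise cross products.

    current, previous: dicts mapping predictor name -> value at the
    current analysis time and at the analysis three hours earlier.
    """
    out = dict(current)
    for name, value in current.items():
        out[f"{name}^2"] = value * value            # power term
        out[f"d3h_{name}"] = value - previous[name]  # 3-h trend
    for a, b in combinations(sorted(current), 2):
        out[f"{a}*{b}"] = current[a] * current[b]    # interaction
    return out
```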

Equation Development
• Combination of forward stepwise selection and cross-validation used to develop BLR and NB models.
• Even years (2002 and 2004) used as “learning” sample.
• Odd years (2003 and 2005) used as “evaluation” sample.
• Procedure identifies best combination of predictors that is most likely to generalize to independent data, and not over-fit the dependent sample.
• Models containing only climatology and persistence (L-CLIPER) also developed as a benchmark for assessing forecast skill.
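The general idea of combining forward selection with a learning/evaluation split can be sketched as below. This is a simplified greedy version under assumptions, not the authors' exact procedure: the `score` callable is hypothetical and is expected to fit a model on the learning sample with the given predictors and return a skill measure on the evaluation sample (higher is better).

```python
def split_by_year(cases):
    """Even years form the learning sample, odd years the evaluation sample."""
    learning = [c for c in cases if c["year"] % 2 == 0]
    evaluation = [c for c in cases if c["year"] % 2 == 1]
    return learning, evaluation

def forward_select(candidates, score, min_gain=0.0):
    """Greedy forward selection driven by evaluation-sample skill.

    At each step, add the candidate that raises score() the most;
    stop when no candidate improves the score by more than min_gain.
    """
    chosen = []
    best = score(chosen)
    remaining = list(candidates)
    while remaining:
        trials = [(score(chosen + [c]), c) for c in remaining]
        top_score, top = max(trials, key=lambda t: t[0])
        if top_score - best <= min_gain:
            break
        chosen.append(top)
        remaining.remove(top)
        best = top_score
    return chosen, best
```

Stopping on evaluation-sample skill rather than learning-sample fit is what guards against over-fitting the dependent sample in this sketch.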

5. Model Parameters

Model Parameters
[Tables: most important parameters for 1800-2059 UTC; BLR models for one or more CG flashes; NB models for the amount of lightning]

Model Parameters
[Same tables, annotated: Moisture]
