paos.colorado/research/wavelets/

WAVELET ANALYSIS http://paos.colorado.edu/research/wavelets/

C. Torrence and G.P. Compo 1998. A Practical Guide to Wavelet Analysis. Bulletin of the American Meteorological Society:Vol. 79, No. 1, pp. 61–78. Abstract: A practical step-by-step guide to wavelet analysis is given, with examples taken from time series of the El Niño-Southern Oscillation (ENSO). The guide includes a comparison to the windowed Fourier transform, the choice of an appropriate wavelet basis function, edge effects due to finite-length time series, and the relationship between wavelet scale and Fourier frequency. New statistical significance tests for wavelet power spectra are developed by deriving theoretical wavelet spectra for white and red noise processes and using these to establish significance levels and confidence intervals. It is shown that smoothing in time or scale can be used to increase the confidence of the wavelet spectrum. Empirical formulae are given for the effect of smoothing on significance levels and confidence intervals. Extensions to wavelet analysis such as filtering, the power Hovmöller, cross-wavelet spectra, and coherence are described. The statistical significance tests are used to give a quantitative measure of changes in ENSO variance on interdecadal time scales. Using new datasets that extend back to 1871, the Niño3 sea surface temperature and the Southern Oscillation Index show significantly higher power during 1880–1920 and 1960–90, and lower power during 1920–60, as well as a possible 15-yr modulation of variance. The power Hovmöller of sea level pressure shows significant variations in 2–8-yr wavelet power in both longitude and time. http://paos.colorado.edu/research/wavelets/bams_79_01_0061.pdf

Introduction • If a signal is stationary then we can apply FFT or Fourier expansion or Harmonic analysis to obtain the energy at the various frequencies • Many time series in geophysics exhibit non-stationarity in their statistics. • While the series may contain dominant periodic signals, these signals can vary in both amplitude and frequency over long periods of time.

Non-stationarity Sea surface temperatures in the equatorial Pacific Ocean Figure 1. Sea surface temperatures averaged over the NINO3 region in the eastern Pacific (5°S-5°N, 90°W-150°W). Blue curve is low-pass filtered (>12 months) SST. Yellow background curve is running 15-year variance, plotted at mid-point of 15-year period. Curve has been reproduced upside-down to show "envelope" of variance. The dominant "mode" of variability is the El Niño-Southern Oscillation (ENSO), shown by the high-frequency "spikes" on a time scale of 2-7 years. Superimposed on this signal are much longer inter-decadal fluctuations. The inter-decadal fluctuations have the effect of modulating the amplitude and frequency of occurrence of El Niño events. http://paos.colorado.edu/research/wavelets/wavelet1.html

Process 15-yr Variance • Ideally, one would like to separate the shorter period oscillations from the longer. [Note: the word "oscillation" is used here to indicate any repeating fluctuation in the time-series, regardless of whether the fluctuation repeats on a regular basis or not. In our example the El Niño is definitely a non-periodic (irregular) signal.] 1880-1920 1920-50 • The simplest method for analyzing non-stationarity of a time-series would be to compute statistics such as the mean and variance for different time periods and see if they are significantly different. In the Figure above we have also plotted the running 15-year variance, as a measure of total power inherent in the signal versus time. One can see that ENSO had more variance during 1880-1920 and also since 1950, with a relatively quiet period during 1920-1950.

Process • While the running variance tells us what the overall strength of the signal was at certain times, it suffers from two major defects: • [Time Localization] The shape of the curve is highly dependent on the length of the window used. Fifteen years was chosen above as a compromise between either too-much smoothness (say using a 30-year window), or too-little (a 5-year window). An ideal method would allow different window sizes depending on the scales that one is interested in. • [Frequency Localization] The running variance contains no information on the frequency of a periodic signal, only its amplitude (and that only if the window is chosen widely).

A simple example • t = [0:1023]; • fo = 0.01; • a = 10; • f = fo*5*abs(cos(2*pi*t/1024)); • x=a*cos(2*pi*f.*t); • figure • subplot(211); plot(t,x) • subplot(212); plot(t,f)

Windowed FFT One possibility would be to do a windowed (or running) Fourier transform (WFT), using a certain window size and sliding it along in time, computing the FFT at each time using only the data within the window. This would solve the second problem (frequency localization), but would still be dependent on the window size used. 0 T 2T 3T 4T T/2 3/2*T 5/2*T 7/2*T PSD PSD PSD PSD Freq Freq Freq Freq f1=1/(2T) fN=N/(2T)

FFT vs Wavelet • WFT provides an inconsistent treatment of different frequencies: • at low frequencies there so few oscillations within the window that the frequency localization is lost, while • at high frequencies there are so many oscillations that the time localization is lost. • WFT relies on the assumption that the signal can be decomposed into sinusoidal components. • Wavelet analysis attempts to solve these problems by decomposing a timeseries into time/frequency space simultaneously. One gets information on both the amplitude of any "periodic" signals within the series, and how this amplitude varies with time.

Wavelets • Measure of the stationarity of a time series can be provided by calculating the running variance using a fixed-width window. • We could repeat the analysis with a variety of window widths. By smoothly varying the window width one could then build a picture of the changes in variance versus both time and window width. • Simple "boxcar" shape of the window function introduces edge effects such as ringing., but still there is no information on what is going on within the box, but only recovery of the average energy. • In the Figure above we see an example of a wave "packet", of finite duration and with a specific frequency. • We can use this window function for our analysis of variance. This "wavelet" has the advantage of: (i) incorporating a wave of a certain period, and (ii) being finite in extent. In fact, the wavelet shown above (the Morlet wavelet) is a Sine wave (green curve) multiplied by a Gaussian envelope (red curve). • Morlet wavelet of arbitrary width and amplitude, with time along the x-axis. • Construction of the Morlet wavelet (blue dashed) as a Sine curve (green) modulated by a Gaussian (red).

As seen in the above equation , the transformed signal is a function of two variables, τand s , the translation and scale parameters, respectively. ψ(t) is the transforming function, and it is called the mother wavelet . The term mother wavelet gets its name due to two important properties of the wavelet: The term wavelet means a small wave . The smallness refers to the condition that this (window) function is of finite length ( compactly supported). The wave refers to the condition that this function is oscillatory . The term mother implies that the functions with different region of support that are used in the transformation process are derived from one main function, or the mother wavelet. In other words, the mother wavelet is a prototype for generating the other window functions. The term translation is related to the location of the window, as the window is shifted through the signal. This term, obviously, corresponds to time information in the transform domain. However, we do not have a frequency parameter; instead, we have scale parameter which is defined as “1/frequency”. The parameter scale in the wavelet analysis is similar to the scale used in maps. As in the case of maps, high scales correspond to a non-detailed global view (of the signal), and low scales correspond to a detailed view. Similarly, in terms of frequency, low frequencies (high scales) correspond to a global information of a signal (that usually spans the entire signal), whereas high frequencies (low scales) correspond to a detailed information of a hidden pattern in the signal (that usually lasts a relatively short time).

How it works (1) • Assuming that the total width of this wavelet is about 15 years, we can find the correlation between this curve and the first 15 years of our time series shown in Figure 1. • This single number gives us a measure of the projection of this wave packet on our data during the 1876-1890 period, i.e. how much [amplitude] does our 15-year period resemble a Sine wave of this width [frequency]. By sliding this wavelet along our time series (translation) one can then construct a new time series of the projection amplitude versus time. • Finally, one can then vary the "scale" of the wavelet by changing its width. This is the real advantage of wavelet analysis over a moving Fourier spectrum. For a window of a certain width, the sliding FFT is fitting different numbers of waves, i.e. there can be many high-frequency waves within a window, while the same window can only contain a few (or less than one) low-frequency waves. The wavelet analysis always uses a wavelet of the exact same shape, only the size scales up or down with the size of the window. • In addition to the amplitude of any periodic signals, we would also like information on the phase. In practice, the Morlet wavelet shown in Figure 2a is defined as the product of a complex exponential wave and a Gaussian envelope:

How it works (2) • where π is the wavelet value at non-dimensional time η, and ω0 is the wavenumber. This is the basic wavelet function, but we now need some way to change the overall size as well as slide the entire wavelet along in time. We thus define the "scaled wavelets" as (eqn. 2.2): • where s is the "dilation" parameter used to change the scale, and n is the translation parameter used to slide in time. The factor ofs-1/2 is a normalization to keep the total energy of the scaled wavelet constant. • We are given a time series X, with values of xn, at time index n. Each value is separated in time by a constant time interval δt. The wavelet transform Wn(s) is just the inner product (or convolution) of the wavelet function with our original time-series (eqn. 2.3): • where the asterisk (*) denotes complex conjugate. • The above integral can be evaluated for various values of the scale s(usually taken to be multiples of the lowest possible frequency), as well as all values of n between the start and end dates. A two-dimensional picture of the variability can then be constructed by plotting the wavelet amplitude and phase.

Wavelet Power Spectrum • Figure 3 shows the power (absolute value squared) of the wavelet transform for the NINO3 SST data. • The (absolute value)2 gives information on the relative power at a certain scale and a certain time. • A plot of the amplitude and phase would show the actual oscillations of the individual wavelets, rather than just their magnitude. Fig 3. (a) Time series of El Niño sea surface temperature. (b) The wavelet power spectrum, using the Morlet wavelet. The x-axis is the wavelet location in time. The y-axis is the wavelet period in years. The black contours are the 10% significance regions, using a red-noise background spectrum. The red areas indicate that high El Niño activity occurred during 1880-1920 and 1965-present, while 1920-1960 was relatively calm.

Comparing Figures 1 and 3 it is now much clearer that there was large power in the 2-7 year ENSO period during both the earlier and latter parts of this century. In addition we can now see hints of a 16-year oscillation as well as power at even lower frequencies. • The wavelet transform also gives information on changes in frequency that may have occurred. Thus, from 1960-1990 the ENSO time band (2-7 years) seems to have undergone a slow oscillation in period from a 3-year period between events back in 1965 up to about a 5-year period in the early 1980s. Fig. 3 Fig. 1

Algorithms (1) Setting the Scales • Previously we saw how one can use a typical wavelet (the Morlet) to decompose a time series into time-frequency phase space. We now turn to the actual computation of the wavelet transform. • The first thing to consider is the shape of the wavelet. For decomposing the NINO3 SST data, we chose the Morlet wavelet because: • it is commonly used, • it's simple, • it looks like a wave. • There are an infinite number of other mother wavelets that could be chosen (see Farge 1992 for examples). • For the Morlet wavelet transform, where the mother wavelet is:

Cont… (2) Choose the wavenumberw0: it gives the number of oscillations within the wavelet itself. One condition of the wavelet transform is that the average of the wavelet itself must be zero. In practice, if we choose w0=6, then the errors due to non-zero mean are smaller than the typical computer round-off errors (Farge 1992). (3) Our next choice is a set of scaling parameters s, such that we adequately sample all the frequencies present in our time series. We first choose the smallest resolvable scale, s0, as some multiple of our time resolution, dt. For the NINO3 SST data, we have seasonal data, thus dt = 0.25 years. The smallest wavelet we could possibly resolve is 2dt, thus we choose s0 = 2dt = 0.5 years. The larger scales (longer periods, sj) are chosen as power-of-two multiples of this smallest scale: (4) The largest scale (sJ) chosen should be less than 1/2 the length of the entire time series. The choice of scales for the NINO3 SST data is shown on the right-hand axis of Figure 3. In theory, if these scales are chosen wisely, then one can construct an orthogonal complete basis set. In reality, one usually over-samples the scales so as to provide information on frequencies in between the orthogonal scales. Thus, in Figure 3, the scales have been over-sampled by choosing 10 sub-scales within each scale.

Algorithms cont… • It is possible to compute the wavelet transform in the time domain using Eqn (2.3). However, it is much simpler to use the fact that the wavelet transform is the convolution between the two functions x and π, and to carry out the wavelet transform in Fourier space using the Fast Fourier Transform (FFT). In the Fourier domain, the wavelet transform is simply, (3.2) • where the ^ indicates the Fourier transform (FT), and the Fourier transform of the time series is given by: (3.3) • The angular frequency is defined as: • (3.4)

Algorithms cont… • To use formula (3.3), the FT of the wavelet function should be known analytically (see Table with Wavelets). • In addition, in order to ensure that the wavelet transforms at each scale s are directly comparable to each other and to the transforms of other time-series, the wavelet function at each scale s is normalized to have unit energy, as follows: (3.5) • Unlike the convolution, the FFT method allows the computation of all n points simultaneously, and can be efficiently coded using any standard FFT package.

Step by Step The steps to compute the wavelet transform for a time series are thus: • Choose a mother wavelet; • Find the Fourier transform of the mother wavelet; • Find the Fourier transform of the time series; • Choose a minimum scale s0, and all other scales; • For each scale, do the following: • Using Eqn. 3.5 (or whatever is appropriate for your mother wavelet), compute the daughter wavelet at that scale; • Normalize the daughter wavelet by dividing by the square-root of the total wavelet variance (the total of (ψ)2 should then be one, thus preserving the variance of the time series); • Multiply by the Fourier Transform of your time series; • Using Eqn. 3.2, inverse transform back to real space; • Make a contour plot.

MorletvsMexican hat

Tip (padding with zeros) • One problem with performing the wavelet transform in Fourier space is that this assumes the time series is periodic. The result is that signals in the wavelet transform at one end of the time series will get wrapped around to the other end. This effect is more pronounced at larger scales as the influence of each wavelet extends further in time. One way to avoid this is to pad one end of the time series with zeroes. A clever method is to pad with enough zeroes to make the length of the time series equal to a power of two, and thereby speed up the FFT as well.

Error Estimates Important Note: When applying the significance and confidence tests from "A Practical Guide to Wavelet Analysis", you do not need to use a Monte Carlo simulation. Analytical formulae are given in the paper for the statistical distribution of wavelet power. These formulae are correct, assuming that the underlying distribution of the original time series is Gaussian. The following description of Monte Carlo is given for those who want more background on the method, or if your time series cannot be assumed to be Gaussian. The Monte Carlo method (or simulation) was used in "A Practical Guide to Wavelet Analysis" to verify that the wavelet power spectrum was indeed chi-square distributed. The method was also used to determine the empirical formulae for time-averaging and scale-averaging (paper Sections 5a and 5b). The Monte Carlo method (or simulation) is a statistical method for finding out the answer to a problem that is too difficult to solve analytically, or for verifying the analytical solution. It is called Monte Carlo because of the gambling casinos in that city, and because the Monte Carlo method is related to rolling dice.

Wavetest.m

Monte Carlo Method Here's an example: If you roll two dice, then the chances of getting a total of "two" (a "one" on each) are 1 in 36. This is easy to figure out. But if you didn't know the answer, you could use the Monte Carlo method. You just roll two dice thousands of times, and add up how many times you got a total of "two". Eventually, your fraction of "two's" to "total rolls" will approach 1/36. In our paper, we wanted to verify that the statistical distribution for the wavelet power was chi-square. Here's the Monte Carlo method we used: • We created 100,000 random time series, each with 512 points. • Then we took the wavelet transform for each one, and computed the wavelet power. • We then took a time slice from the middle (time n=256), • At each scale, we sorted all 100,000 wavelet powers into increasing order. • You can then make a plot of wavelet power versus number. • If you then look at what the wavelet power was for number 95,000 out of 100,000, then 95% of the wavelet power is below that value, and only 5% is above. • This 95% level is the 95% confidence level (or the 5% significance level). The method can be generalized to any process where the statistical distribution is unknown, yet one needs to determine confidence or significance levels.

Time-Series Gaussian Distribution Wavelet Power Spectrum Chi-Square Distribution Chi-Square Results

paos.colorado/research/wavelets/