
Oceanography 569 Oceanographic Data Analysis Laboratory

Oceanography 569 Oceanographic Data Analysis Laboratory. Kathie Kelly Applied Physics Laboratory 515 Ben Hall IR Bldg class web site: faculty.washington.edu/kellyapl/classes/ocean569_2014/. Organization. 1 lecture, 1 lab period (2 hrs) per week


Presentation Transcript


  1. Oceanography 569: Oceanographic Data Analysis Laboratory. Kathie Kelly, Applied Physics Laboratory, 515 Ben Hall IR Bldg. Class web site: faculty.washington.edu/kellyapl/classes/ocean569_2014/

  2. Organization • 1 lecture, 1 lab period (2 hrs) per week • Exercise assigned in lab, finish by following lecture • Presentation of solution in lecture session • One class project completed individually • Grade based on presentations and project • Office hours by appointment

  3. Materials. Available on class web site: • PowerPoint notes • mfiles & mat files for exercises • specialized functions (mfiles) • example solutions (following week). Text: “Modeling Methods for Marine Science” by Glover, Jenkins & Doney • on reserve in Physics Library • a good reference to buy

  4. General Procedure for Data Analysis • Define analysis goal • Characterize data • Prepare data • Errors and error propagation • Statistical analyses • Combine data with model (prognostic, diagnostic, statistical)

  5. Exercise 1: Aegean Sea temperatures. Analysis goal: create continuous 3-m time series • Daily satellite SST maps • 5 buoys (POSEIDON) • 3-m 3-hourly temperatures (with gaps)

  6. Exercise 1: Characterize Data. 3-m: higher resolution, but gaps. SST: continuous, but only daily. What happens when the data are “merged”? To make a consistent series, what is sacrificed?

  7. Exercise 1: Data discrepancies. Compare apples & apples: average 3-m to daily. What are the characteristics of the differences? How can the differences be reconciled?
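The "average 3-m to daily" step can be sketched as follows. This is a minimal Python illustration, not the class's MATLAB code; the function name `daily_means` and the `min_samples` threshold are assumptions introduced here. Missing values are flagged with NaN and skipped, in the same spirit as nanmean.

```python
import math
from collections import defaultdict


def daily_means(times_days, temps, min_samples=4):
    """Average sub-daily samples into daily means, skipping NaN gaps.

    times_days  : sample times in fractional days (0.0, 0.125, ...)
    temps       : temperatures, NaN where data are missing
    min_samples : hypothetical threshold; days with fewer valid
                  samples are dropped rather than averaged
    """
    buckets = defaultdict(list)
    for t, x in zip(times_days, temps):
        if not math.isnan(x):
            buckets[math.floor(t)].append(x)  # group by calendar day
    return {day: sum(v) / len(v)
            for day, v in sorted(buckets.items())
            if len(v) >= min_samples}
```

Dropping sparsely sampled days is one possible choice; a day with only afternoon samples would otherwise give a daily mean biased by the diurnal cycle.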

  8. Periodic Signals
     • Robust way to estimate periodic signals, especially for gappy data
     • fit_harmonics: fit to cosines with periods L, L/2, etc. (cf. Fourier series)
     • [amp,phase,frac,offset,da]=fit_harmonics(data,time,nharm,L,cutoff);
     • For nharm=n: d_periodic = amp(1)*cos(2*pi*t/L+phase(1)) + amp(2)*cos(2*pi*2*t/L+phase(2)) + ... + amp(n)*cos(2*pi*n*t/L+phase(n)) + offset
     • includes jth term only if frac(tion) of variance removed > cutoff/100
     • returns anomaly: da = data - d_periodic
     • Note: offset is not the same as mean(data)
     • Remove mean using fit_harmonics if strong seasonal cycle!
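The idea behind a harmonic fit to gappy data can be sketched with an ordinary least-squares fit. The Python code below is an illustration only and is not the class's fit_harmonics mfile: the names `fit_harmonics_sketch` and `solve` are hypothetical, and the variance-fraction cutoff test is omitted. Each harmonic is fit as a cosine plus a sine term and then converted to the amplitude and phase form amp*cos(2*pi*j*t/L + phase).

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def fit_harmonics_sketch(data, time, nharm, L):
    """Least-squares fit of offset + nharm harmonics of period L.

    NaN values in data are treated as gaps and excluded from the fit.
    Returns (amp, phase, offset, da), where da = data - fitted signal.
    """
    def basis(t):
        row = [1.0]  # constant term -> offset
        for j in range(1, nharm + 1):
            w = 2 * math.pi * j * t / L
            row += [math.cos(w), math.sin(w)]
        return row

    pts = [(t, d) for t, d in zip(time, data) if not math.isnan(d)]
    G = [basis(t) for t, _ in pts]
    m = 2 * nharm + 1
    # Normal equations: (G^T G) coef = G^T d
    GtG = [[sum(g[i] * g[k] for g in G) for k in range(m)] for i in range(m)]
    Gtd = [sum(g[i] * d for g, (_, d) in zip(G, pts)) for i in range(m)]
    coef = solve(GtG, Gtd)

    offset = coef[0]
    amp, phase = [], []
    for j in range(nharm):
        a, b = coef[1 + 2 * j], coef[2 + 2 * j]
        amp.append(math.hypot(a, b))
        # a*cos(w) + b*sin(w) = A*cos(w + phi) with phi = atan2(-b, a)
        phase.append(math.atan2(-b, a))

    # Anomaly at every input time; gaps stay NaN
    da = [d - sum(c * g for c, g in zip(coef, basis(t)))
          for t, d in zip(time, data)]
    return amp, phase, offset, da
```

Because the fit uses only the valid samples, gaps do not need to be filled first, which is why this kind of estimate suits gappy series better than a plain Fourier transform.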

  9. Exercise 1: Fix discrepancies. Find & remove seasonal cycle in difference. Result: daily average temperature that matches the seasonal cycle of the 3-m series

  10. Other goals • Continuous SST with a diurnal cycle: use 3-m temperature to find diurnal cycle • Correct SST for aliasing from undersampling the diurnal cycle • Create non-seasonal temperature anomalies

  11. Aliasing. SST sampling aliases the diurnal cycle. “Nyquist frequency”: corresponds to a period of 2*Δt; signals with shorter periods are aliased. Example: sample a diurnal temperature signal using 26-hr intervals
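The 26-hr example can be worked through numerically. The Python sketch below uses hypothetical helper names (`alias_period`, `is_resolved`) and the standard frequency-folding rule: the apparent frequency is the distance from the true frequency to the nearest multiple of the sampling frequency. For a 24-hr signal sampled every 26 hr, |1/24 - 1/26| = 1/312, so the diurnal cycle shows up as a spurious 312-hr (13-day) oscillation.

```python
import math


def alias_period(signal_period, dt):
    """Apparent period of a sinusoid of signal_period sampled every dt.

    Both arguments use the same time unit (e.g. hours).
    """
    f = 1.0 / signal_period          # true frequency
    fs = 1.0 / dt                    # sampling frequency
    f_alias = abs(f - round(f / fs) * fs)  # fold to nearest multiple of fs
    return math.inf if f_alias == 0 else 1.0 / f_alias


def is_resolved(signal_period, dt):
    """True if the period is longer than the Nyquist period 2*dt."""
    return signal_period > 2 * dt


apparent = alias_period(24.0, 26.0)   # diurnal cycle, 26-hr sampling
```

With 3-hourly sampling the Nyquist period is 6 hr, so the same function returns the true 24-hr period unchanged.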

  12. Matlab functions • datenum: converts yyyy,mm,dd to serial date numbers (days counted from year 0); also datestr, datevec, datetick('x') • imagesc: bit map that shows each image pixel, scaled to colormap (cf. pcolor, which interpolates pixels to a grid) • NaN, “not a number”: use to flag invalid data; nanmean, nansum, etc. then ignore NaNs. NaN values are not plotted. To find valid data: ind=find(~isnan(data)); • fit_harmonics(data,time,nharm,L,cutoff): use to find any periodic signal in the data, using the time axis, period L and a cutoff (% of variance explained)

  13. Statistics of Observations: “random” variables. Are these observations of random variables? Will removing the mean make them random?

  14. Statistical Definitions: mean. The sample mean is given by $\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$. The mean of the parent population is given by $\mu = \lim_{N\to\infty} \frac{1}{N}\sum_{i=1}^{N} x_i$, but we never know it since the sample is finite. For this class, the mean will refer to the sample mean, regardless of the symbol. The factor N here is the number of degrees of freedom.

  15. Statistical Definitions: variance. The sample variance is given by $s^2 = \frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})^2$, where $\bar{x}$ is the sample mean and s is the standard deviation of x. The variance of the parent population corresponds to an infinite number of samples, N. The N-1 factor occurs because using the sample mean “uses up” one of the degrees of freedom of the data set. In this class we will refer to the sample variance.

  16. Exercise 2: Periodic Signals. Need to remove non-random components. Both have periodic signals (seasonal, not random)

  17. Caution: the mean of data with a periodic component is biased if the sample contains incomplete cycles. Use “offset” from fit_harmonics instead
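A small synthetic example shows the size of this effect. The Python sketch below builds an idealized seasonal series with made-up values (offset 10, amplitude 2, period 365 days; all assumptions for illustration) and compares the sample mean over exactly one cycle with the mean over one and a quarter cycles.

```python
import math
import statistics


def seasonal_series(n_days, offset=10.0, amp=2.0, period=365.0):
    """Idealized daily series: offset + amp*cos(2*pi*t/period)."""
    return [offset + amp * math.cos(2 * math.pi * t / period)
            for t in range(n_days)]


# Exactly one full cycle: the cosine averages out, mean equals the offset
mean_full = statistics.fmean(seasonal_series(365))

# One and a quarter cycles: the extra warm quarter biases the mean high
mean_partial = statistics.fmean(seasonal_series(456))

bias = mean_partial - 10.0
```

The true offset is 10 in both cases, but the partial-cycle mean is pulled toward whichever part of the cycle is over-represented. An offset estimated jointly with the harmonics does not suffer from this.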

  18. Exercise 2: Probability Distributions (histogram). Both non-seasonal SST and non-seasonal rain are random variables. Is either of these normally distributed?

  19. Normal Distribution for Random Variable. Why do we want a normal distribution? Least-squares fit, correlations, and optimal interpolation have error estimates based on the assumption of normal distributions of random data and/or errors

  20. Exercise 2: Making a variable more normal. [Figure: distributions of rain and of log(rain)]
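The effect of the log transform can be checked with a simple skewness estimate. The Python sketch below uses synthetic, strictly positive "rain" values drawn from a lognormal distribution as a stand-in for the class data (an assumption; real rain records contain zeros, which must be handled before taking the log). The function name `skewness` is introduced here for illustration.

```python
import math
import random
import statistics


def skewness(x):
    """Sample skewness: mean of cubed standardized deviations."""
    m = statistics.fmean(x)
    s = statistics.pstdev(x)
    return statistics.fmean([((v - m) / s) ** 3 for v in x])


rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Synthetic positive-valued data with a long right tail
rain = [rng.lognormvariate(0.0, 1.0) for _ in range(5000)]
log_rain = [math.log(r) for r in rain]

skew_raw = skewness(rain)       # strongly positive: long right tail
skew_log = skewness(log_rain)   # near zero: roughly symmetric
```

A skewness near zero is necessary but not sufficient for normality, so a histogram or a cumulative distribution comparison is still worth doing.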

  21. Exercise 2: distributions for modified variable. [Figure: distribution of rain, and of rain converted to deciles (uniform)]

  22. Exercise 2: test for normal cumulative distribution

  23. To edit or not to edit • For a truly normal distribution, 0.3% of the data are more than 3 standard deviations from the mean • “Three-sigma edit”: remove data more than 3 std dev from mean • Best to justify edits in terms of: likely error sources and characteristics; spikes; unphysical values; comparisons with other variables

  24. Exercise 2: Edit data (3-sigma outliers) • Procedure for removing suspicious data: 1) remove known signals (diurnal, seasonal, trends) 2) check for normal distribution 3) compute σ (standard deviation) 4) remove data more than 3*σ from mean 5) do not iterate!
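The last three steps of the procedure can be sketched in a few lines. This Python illustration assumes known signals have already been removed from the input; the function name `three_sigma_edit` is hypothetical. Outliers are replaced by NaN rather than deleted so the series keeps its time alignment, and the edit is applied in a single pass.

```python
import math
import statistics


def three_sigma_edit(values):
    """Flag values more than 3 standard deviations from the mean as NaN.

    Mean and sigma are computed once from the valid (non-NaN) data;
    the edit is NOT repeated with the updated statistics.
    """
    valid = [v for v in values if not math.isnan(v)]
    mean = statistics.mean(valid)
    sigma = statistics.stdev(valid)  # sample standard deviation (N-1)
    return [v if math.isnan(v) or abs(v - mean) <= 3 * sigma else math.nan
            for v in values]
```

Iterating would shrink σ after each pass and keep removing valid data from the tails, which is why the procedure stops after one pass.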

  25. Central Limit Theorem. Why is the Normal distribution commonly used? Underlying distributions may be unknown or non-Normal, BUT if a measurement (or error) is the sum of many processes, its distribution will approach Normal. Example: distribution of the mean of X for different distributions as the number of samples increases
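The example on this slide can be reproduced with a short simulation. The Python sketch below draws from an exponential distribution, chosen here as an assumed example of a strongly skewed, non-Normal parent, and tracks the skewness of the sample mean as the number of samples per mean grows. The helper names are hypothetical.

```python
import random
import statistics


def skewness(x):
    """Sample skewness: mean of cubed standardized deviations."""
    m = statistics.fmean(x)
    s = statistics.pstdev(x)
    return statistics.fmean([((v - m) / s) ** 3 for v in x])


def sample_means(n, trials, rng):
    """Means of n exponential draws, repeated for the given number of trials."""
    return [statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
            for _ in range(trials)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Skewness of the distribution of the mean for increasing n
skew_by_n = {n: skewness(sample_means(n, 4000, rng)) for n in (1, 5, 30)}
```

For an exponential parent the skewness of the mean falls off as 2/sqrt(n), so the distribution of the mean becomes steadily more symmetric and Normal-like as n increases.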
