TIME SERIES ANALYSIS

EECS 731: INTRODUCTION TO DATA SCIENCE. Lecturer: Prof. Nicole Beckage. Team One members: Al-Smadi, Adi; De Berner, Aime; Duckworth, Ryan; Nelakurthi, Pavan; Nguyen, Phuong.

Presentation Transcript


  1. TIME SERIES ANALYSIS EECS 731: INTRODUCTION TO DATA SCIENCE Lecturer: Prof. Nicole Beckage Team One members: Al-Smadi, Adi; De Berner, Aime; Duckworth, Ryan; Nelakurthi, Pavan; Nguyen, Phuong. Aime

  2. CONTENTS • Introduction • Methodology • Prior • Markov Model • Seasonality • Time Series Models • Conclusion Aime

  3. INTRODUCTION • “From the earliest times man has measured the passage of time with candles or clepsydras or clocks, has constructed calendars, sometimes with remarkable accuracy, and had recorded the progress of his race in the form of annals” Quoted from Maurice Kendall, “Time Series”, Hafner Press, (1976), pg 1 Aime

  4. TIME SERIES • A set of quantitative observations arranged in chronological order (assumption: time is a discrete variable) • A collection of observations of well-defined data items obtained through repeated measurements over time • A sequence of numerical data points in successive order Aime

  5. Time Series Analysis / FORECASTING • Time Series Analysis: • Use of methods for analyzing time series data to extract their characteristics and meaningful statistics • A branch of Statistics that generally deals with structural dependencies between observed data of random phenomena and related parameters (observed phenomena indexed by time) • Time Series Forecasting: • Use of a model to predict future values based on previous observed values Aime

  6. TERMINOLOGY • Ergodicity: the assumption that sample moments calculated from a time series with a finite number of observations T converge to the corresponding population moments as T → ∞ (“consistency properties”) • Stationarity: a statistical equilibrium condition, required for a stochastic process to be ergodic. The relationship between two observations depends only on the distance (lag) between them and does not change over time. • Stochastic process: the data-generating process, i.e., the collection of all possible realizations (probability theory) Aime

  7. TYPES OF TIME SERIES • Continuous vs. Discrete • Observations made continuously vs. observations made only at certain times (or aggregated over discrete time intervals) • Stationary vs. Non-Stationary • Data that fluctuate around a constant level • Series whose cycle parameters (e.g., length, amplitude, phase) change over time • Deterministic vs. Stochastic • Future values can be predicted exactly • Future values are only partly determined by past values and must be described by a probability distribution Aime

  8. TYPES OF TIME SERIES • Seasonal vs Non-seasonal • Linear vs Non-linear • Univariate vs Multivariate • Chaotic (Randomly distributed and non-periodic) Aime

  10. Goals of Time Series ANALYSIS • Descriptive Analysis: identify the trends and patterns in a time series, by plotting it or by using more complex techniques • Spectral Analysis: describe the variation in a time series that is accounted for by cyclic components (estimates of frequency content versus noise) • Forecasting: predict future values from previous behavior (a model is built and predictions are given within certain confidence limits) • Intervention Analysis: “Change in a Time Series before and after a certain event” • Explanative Analysis (Cross Correlation): identify the mechanisms that produce an estimate. “What is the relationship between two Time Series datasets?” Aime

  11. Forecasting Methodology • Forecasting: Causal Models and Time Series Models • Causal Models: Regression • Time Series Models: Trend, Seasonal, Cyclical, Random Phuong

  12. Time-Series Method Structure • Time Series Models: Trend Models, Cyclical Variation, Seasonal Variation, Random Variation • Diagram labels: Markov Model, “Prior”, Error/Noise, Seasonality Phuong

  13. THE NOTION OF “PRIOR” • The probability assigned to an event based on established beliefs about it, before new evidence or information arrives. • It is the unconditional probability that is assigned before any relevant evidence is taken into account. • It is the mathematical base for prediction. Pavan

  14. Posterior Probability • Prior probabilities are the original probabilities of an outcome, which will be updated with new information to create posterior probabilities. • Bayes' theorem calculates the renormalized pointwise product of the prior and the likelihood function to produce the posterior probability distribution, which is the conditional distribution of the uncertain quantity given the data. Pavan

  15. Bayes' theorem: $P(H \mid E) = P(E \mid H)\,P(H) / P(E)$. This relates the probability of the hypothesis before getting the evidence, P(H), to the probability of the hypothesis after getting the evidence, P(H|E). For this reason, P(H) is called the prior probability, while P(H|E) is called the posterior probability. The factor that relates the two, P(E|H)/P(E), is called the likelihood ratio. Pavan
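
The prior-to-posterior update described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name, the rain/clouds scenario, and all numbers are hypothetical, and P(E) is expanded with the law of total probability.

```python
def posterior(prior, likelihood, likelihood_given_not_h):
    """Return P(H|E) from P(H), P(E|H), and P(E|not H) using Bayes' theorem."""
    # Law of total probability: P(E) = P(E|H) P(H) + P(E|not H) P(not H)
    p_evidence = likelihood * prior + likelihood_given_not_h * (1.0 - prior)
    return likelihood * prior / p_evidence


# Hypothetical numbers: prior belief in rain is 0.3; clouds are seen on
# 80% of rainy mornings and on 20% of dry mornings.
p_rain_given_clouds = posterior(prior=0.3, likelihood=0.8, likelihood_given_not_h=0.2)
print(round(p_rain_given_clouds, 4))  # 0.6316
```

Seeing clouds roughly doubles the belief in rain here, because the evidence is four times more likely under the hypothesis than under its complement.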

  16. Example Pavan

  17. Forecasting bias • A forecast bias occurs when there are consistent differences between actual outcomes and previously generated forecasts of those quantities; that is, forecasts may have a general tendency to be too high or too low. • Bias often arises when personal human judgment or ideology is injected into the forecast. • https://www.youtube.com/watch?v=gn4nRCC9TwQ Pavan

  18. N-step ahead • Let $D_1, D_2, \ldots, D_n, \ldots$ be the past values of the series to be predicted (for example, demands). If we are making a forecast during period $t$ (for the future), assume we have observed $D_t$, $D_{t-1}$, etc. • Let $F_{t,t+\tau}$ be the forecast made in period $t$ for the demand in period $t+\tau$, where $\tau = 1, 2, 3, \ldots$ • Then $F_{t-1,t}$ is the forecast made in $t-1$ for $t$, and $F_{t,t+1}$ is the forecast made in $t$ for $t+1$ (one step ahead). • Use the shorthand notation $F_t = F_{t-1,t}$. Pavan

  19. Forecasting Error • The forecast error in period $t$, $e_t$, is the difference between the forecast for demand in period $t$ and the actual value of demand in $t$. • For a multiple-step-ahead forecast made $\tau$ periods earlier: $e_t = F_{t-\tau,t} - D_t$. • For a one-step-ahead forecast: $e_t = F_t - D_t$. Pavan
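
As a small illustration of the one-step-ahead error (forecast minus actual demand), the sketch below computes the errors for a short series together with two common summaries. The data and function names are hypothetical. A mean error far from zero is one simple sign of the forecast bias discussed earlier.

```python
def forecast_errors(forecasts, actuals):
    """One-step-ahead errors e_t = F_t - D_t for paired forecasts and demands."""
    if len(forecasts) != len(actuals):
        raise ValueError("forecasts and actuals must have the same length")
    return [f - d for f, d in zip(forecasts, actuals)]


def mean_error(errors):
    """Average signed error; positive means forecasts tend to be too high."""
    return sum(errors) / len(errors)


def mean_absolute_error(errors):
    """Average error magnitude, ignoring direction."""
    return sum(abs(e) for e in errors) / len(errors)


# Hypothetical forecasts and observed demands for four periods
forecasts = [100, 110, 120, 130]
actuals = [95, 112, 115, 125]

errors = forecast_errors(forecasts, actuals)
print(errors)                       # [5, -2, 5, 5]
print(mean_error(errors))           # 3.25
print(mean_absolute_error(errors))  # 4.25
```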

  20. MARKOV MODEL • Named after a Russian mathematician: Andrey Markov [1856 – 1922] • Future state depends only on the current state, not on events that occurred before it. • Future is independent of past, given the present. • If you know the exact state of the world now and want to predict the future, knowledge about the past isn't useful, because all knowledge about the past is wrapped up in the current state. • Temporal Data (Sequence of data) • Weather • Finance • Language • Music • Assume discrete time and a discrete state space Ryan
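
A minimal sketch of a discrete-time, discrete-state Markov chain follows, using weather as the example domain. The two states and their transition probabilities are made up for illustration. Both functions look only at the current state, which is the Markov property in code form.

```python
import random

# Hypothetical two-state weather chain: TRANSITIONS[current][next] = probability
TRANSITIONS = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}


def step_distribution(dist, transitions):
    """Advance a probability distribution over states by one time step."""
    nxt = {state: 0.0 for state in transitions}
    for state, p in dist.items():
        for target, q in transitions[state].items():
            nxt[target] += p * q
    return nxt


def simulate(start, transitions, n_steps, rng):
    """Sample a path; each move depends only on the current state."""
    path = [start]
    for _ in range(n_steps):
        options = transitions[path[-1]]
        path.append(rng.choices(list(options), weights=list(options.values()))[0])
    return path


# Distribution two days after a sunny day
dist = {"sunny": 1.0, "rainy": 0.0}
for _ in range(2):
    dist = step_distribution(dist, TRANSITIONS)
print(dist)  # sunny is about 0.72, rainy about 0.28

path = simulate("sunny", TRANSITIONS, 5, random.Random(0))
print(path)
```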

  21. Hidden Markov Model • "Hidden Markov model is a Markov chain for which the state is only partially observable." • One common use is for speech recognition. • Observed data is speech audio. • Hidden state is the spoken text. • Viterbi algorithm finds the most likely sequence of spoken words from the speech audio. Ryan
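
The Viterbi algorithm mentioned above can be sketched compactly. Real speech recognition works on far larger models, so the sketch below uses a toy HMM instead: the states, observations, and all probabilities are illustrative assumptions, not values from the slides.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state sequence for the observations."""
    # best[t][s] = (probability of the best path ending in s at time t, previous state)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][r][0] * trans_p[r][s] * emit_p[s][obs], r) for r in states
            )
        best.append(column)
    # Backtrack from the most probable final state
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    return list(reversed(path))


# Toy model (hypothetical numbers): hidden health state, observed symptom
states = ["Healthy", "Fever"]
start_p = {"Healthy": 0.6, "Fever": 0.4}
trans_p = {
    "Healthy": {"Healthy": 0.7, "Fever": 0.3},
    "Fever": {"Healthy": 0.4, "Fever": 0.6},
}
emit_p = {
    "Healthy": {"normal": 0.5, "cold": 0.4, "dizzy": 0.1},
    "Fever": {"normal": 0.1, "cold": 0.3, "dizzy": 0.6},
}

decoded = viterbi(["normal", "cold", "dizzy"], states, start_p, trans_p, emit_p)
print(decoded)  # ['Healthy', 'Healthy', 'Fever']
```

For long sequences, an implementation would normally work with log-probabilities to avoid numerical underflow; plain products are kept here for readability.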

  22. Markov Decision Process • Markov chain in which state transitions depend on the current state and on an action vector that is applied to the system. • Related to reinforcement learning • Can be solved by value iteration Ryan
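
Value iteration can be sketched for a tiny finite MDP as below. The two-state "low/high" problem, its rewards, and the discount factor are all hypothetical, chosen so the answer is easy to check by hand: staying in "high" forever is worth 2 / (1 - 0.9) = 20, and paying 1 to get there from "low" is worth -1 + 0.9 * 20 = 17.

```python
def value_iteration(states, actions, transition, reward, gamma=0.9, tol=1e-8):
    """Return (values, policy) for a finite MDP.

    transition[s][a] is a list of (probability, next_state) pairs;
    reward[s][a] is the immediate reward for taking action a in state s.
    """
    def q_value(s, a, values):
        # Immediate reward plus discounted expected value of the next state
        return reward[s][a] + gamma * sum(p * values[n] for p, n in transition[s][a])

    values = {s: 0.0 for s in states}
    while True:
        new_values = {s: max(q_value(s, a, values) for a in actions) for s in states}
        delta = max(abs(new_values[s] - values[s]) for s in states)
        values = new_values
        if delta < tol:
            break
    policy = {s: max(actions, key=lambda a: q_value(s, a, values)) for s in states}
    return values, policy


# Hypothetical MDP: "work" moves from low to high at a cost; "wait" in high pays off
states = ["low", "high"]
actions = ["wait", "work"]
transition = {
    "low": {"wait": [(1.0, "low")], "work": [(1.0, "high")]},
    "high": {"wait": [(1.0, "high")], "work": [(1.0, "high")]},
}
reward = {
    "low": {"wait": 0.0, "work": -1.0},
    "high": {"wait": 2.0, "work": 1.0},
}

values, policy = value_iteration(states, actions, transition, reward)
print(policy)  # {'low': 'work', 'high': 'wait'}
```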

  23. Partially Observable Markov Decision Process (POMDP) • State of the system is only partially observed. • Computationally hard: solving a POMDP exactly is intractable in general (finite-horizon POMDPs are PSPACE-complete), so no fast exact solution is known. • Useful for agents and robotics. • Markov Random Field (Markov Network) • Generalization of a Markov chain to multiple dimensions. • Each state depends on its neighbors' states in multiple directions, whereas in a Markov chain only the previous state is considered. Ryan

  24. Hierarchical Markov Models • Can be applied to categorize human behavior. • Example: Observations of time & location on campus can be interpreted to determine activity. • At Allen Field House in the afternoon or evening → Watching a basketball event. • At a cafe around noon → Eating lunch Ryan

  25. SEASONALITY Definitions: • Seasonality is a characteristic of a time series in which the data experience regular and predictable changes that recur every calendar year • A seasonal pattern exists when a series is influenced by seasonal factors (e.g., the quarter of the year, the month, or the day of the week). • Examples of seasonal periods: days (Monday, Tuesday, etc.), seasons (Fall, Winter, etc.), months (Jan, Feb, etc.), quarters (First, Second, etc.) Adi

  26. Seasonality Example of Quarterly Seasons “Seasonal Variation In time series, that part of the movement which is assigned to the effect of the seasons on the year” Adi

  27. Seasonality • General: • Many time series display seasonality. By seasonality, we mean periodic fluctuations. • If seasonality is present, it must be incorporated into the time series model. • For example, glaciers tend to melt during the summer, and the melting declines once the summer ends. Thus, a time series of a glacier’s mass will typically show mass reduction during summers. Adi

  28. Seasonality • Seasonality Detection Techniques: • The following graphical techniques can be used to detect seasonality: • Run sequence plot; • Seasonal subseries plot; • Multiple box plots; • The autocorrelation plot; Adi

  29. SEASONALITY DETECTION TECHNIQUES • Run sequence plot • Run sequence plot can be used to answer the following questions • Are there any shifts in location? • Are there any shifts in variation? • Purpose: Check for Shifts in Location and Scale and Outliers “Last Third of Data Shows a Shift of Location” Adi

  30. SEASONALITY DETECTION TECHNIQUES • Seasonal subseries plot can provide answers to the following questions: • Do the data exhibit a seasonal pattern? • What is the nature of the seasonality? • Is there a within-group pattern (e.g., do January and July exhibit similar patterns)? • Are there any outliers once seasonality has been accounted for? • Purpose: a tool for detecting seasonality in a time series; it allows you to detect both between-group and within-group patterns. • Example seasonal subseries plot: the series peaks in May, decreases steadily through September, and then rises again until the next May peak. Adi

  31. SEASONALITY DETECTION TECHNIQUES • Multiple box plots • Multiple box plots can be used to answer the following questions: • Does the location differ between subgroups? • Does the variation differ between subgroups? • Purpose: check for location and variation shifts. Multiple box plots can be drawn together to compare multiple data sets or to compare groups in a single data set. • Example: this box plot reveals that machine has a significant effect on energy with respect to location and possibly variation Adi

  32. SEASONALITY DETECTION TECHNIQUES • The autocorrelation plot • The autocorrelation plot can provide answers to the following questions: • Are the data random? • Is an observation related to an observation twice-removed? (etc.) • Is the observed time series autoregressive? • What is an appropriate model for the observed time series? • Purpose: Check Randomness. This plot shows that the time series is not random, but rather has a high degree of autocorrelation Adi
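
The quantity drawn in an autocorrelation plot can be computed directly. The sketch below uses the usual sample estimate r_k = c_k / c_0 on a synthetic series with a known period of 12; the series and function name are hypothetical. Values near zero at every lag would suggest randomness, while a strong positive value at the seasonal lag suggests periodic structure.

```python
import math


def autocorrelation(series, lag):
    """Sample autocorrelation r_k = c_k / c_0 at the given lag."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = sum(series) / n
    # c_0 is the sum of squared deviations (zero for a constant series)
    c0 = sum((x - mean) ** 2 for x in series)
    ck = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return ck / c0


# Synthetic monthly-style series: ten full cycles of period 12
series = [math.sin(2 * math.pi * t / 12) for t in range(120)]
acf = [autocorrelation(series, k) for k in range(25)]

print(round(acf[12], 2))  # strongly positive at the full period
print(round(acf[6], 2))   # strongly negative at the half period
```

Because the same divisor c_0 is used at every lag, the values shrink slightly as the lag grows even for a perfectly periodic series.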

  33. SEASONALITY DETECTION TECHNIQUES • Choosing the best technique: • The run sequence plot is a recommended first step for analyzing any time series. • Seasonality is shown more clearly by the seasonal subseries plot or the box plot. • Both the seasonal subseries plot and the box plot assume that the seasonal periods are known. • If the period is not known, the autocorrelation plot can help. Adi • Reference: Engineering Statistics Handbook, online reference, http://www.itl.nist.gov/div898/handbook/pmc/section4/pmc443.htm

  34. SEASONALITY DETECTION TECHNIQUES Example • First, run sequence plot: no obvious periodic patterns are apparent in the run sequence plot. • Second, seasonal subseries plot: the means for each month are relatively close and show no obvious pattern. • Third, box plot: due to the rather large number of observations, the box plot shows the difference between months better than the seasonal subseries plot. Adi

  35. Time Series Models • Additive Model: X = T + S + C + R, where: X = Original Data, T = Trend Value, S = Seasonal Variation, C = Cyclical Variation, R = Random Variation Phuong

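  36. Multiplicative Model • Observed value in the time series is the product of its components • For annual data: $X_i = T_i \times C_i \times R_i$ • For quarterly data: $X_i = T_i \times S_i \times C_i \times R_i$ • Where: $T_i$ = Trend, $C_i$ = Cyclical, $R_i$ = Random, $S_i$ = Seasonal Phuong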
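
To make the difference between the additive and multiplicative models concrete, the sketch below builds a short quarterly series both ways. It assumes the quarterly multiplicative form X = T × S × C × R, which follows from the "product of components" description but is an assumption here. All component values are made up, and the cyclical and random parts are set to neutral values (0 or 1) so the seasonal effect is easy to see.

```python
def combine_additive(trend, seasonal, cyclical, random_part):
    """X = T + S + C + R, applied element-wise."""
    return [t + s + c + r for t, s, c, r in zip(trend, seasonal, cyclical, random_part)]


def combine_multiplicative(trend, seasonal, cyclical, random_part):
    """X = T * S * C * R, applied element-wise (non-trend components are ratios near 1)."""
    return [t * s * c * r for t, s, c, r in zip(trend, seasonal, cyclical, random_part)]


# Hypothetical quarterly data covering two years
trend = [100, 102, 104, 106, 108, 110, 112, 114]
seasonal_offsets = [10, -5, -10, 5] * 2          # additive: same units as the data
seasonal_indexes = [1.10, 0.95, 0.90, 1.05] * 2  # multiplicative: ratios around 1
zeros, ones = [0] * 8, [1.0] * 8                 # neutral cyclical and random parts

x_add = combine_additive(trend, seasonal_offsets, zeros, zeros)
x_mul = combine_multiplicative(trend, seasonal_indexes, ones, ones)
print(x_add)  # [110, 97, 94, 111, 118, 105, 102, 119]

# Dividing out the seasonal index leaves T * C * R (here, just the trend)
deseasonalized = [x / s for x, s in zip(x_mul, seasonal_indexes)]
```

In the additive series the seasonal swing stays the same size as the trend grows, while in the multiplicative series it grows in proportion to the trend.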

  37. Time Series Methodologies • Time Series: is there a trend? • No: Smoothing Methods (Moving Average, Exponential Smoothing) • Yes: Trend Models (Linear, Quadratic, Exponential, Auto-Regressive) Phuong

  38. Moving Average Graph (figure: Sales vs. Year; series: Actual) Phuong https://www.analyticsvidhya.com/blog/2015/12/complete-tutorial-time-series-modeling/

  39. Moving Average -- example • 3-month MA: (Oct + Nov + Dec)/3 = 258.33 • 6-month MA: (Jul + Aug + … + Dec)/6 = 249.33 • 12-month MA: (Jan + Feb + … + Dec)/12 = 205.33
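
The same calculation can be written as a short function. The monthly figures behind the averages on the slide are not given, so the sales values below are hypothetical and will not reproduce those numbers. The sketch only shows the mechanics of averaging the most recent 3, 6, or 12 observations.

```python
def moving_average(series, window):
    """Mean of the most recent `window` observations, usable as the next-period forecast."""
    if not 1 <= window <= len(series):
        raise ValueError("window must be between 1 and len(series)")
    return sum(series[-window:]) / window


# Hypothetical monthly sales, January through December
sales = [120, 130, 150, 170, 180, 200, 220, 240, 250, 260, 270, 270]

ma3 = moving_average(sales, 3)    # (Oct + Nov + Dec) / 3
ma6 = moving_average(sales, 6)    # (Jul + ... + Dec) / 6
ma12 = moving_average(sales, 12)  # (Jan + ... + Dec) / 12
print(round(ma3, 2), round(ma6, 2), round(ma12, 2))  # 266.67 251.67 205.0
```

A shorter window reacts faster to recent changes, while a longer window smooths more heavily and lags behind a rising series.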

  40. What about Weighted Moving Averages? • This method looks at past data and assigns more importance to some observations than to others • Weighting factors must add to one • Recent data can be weighted higher than older data, or specific data above others • Example: to forecast staffing for a Tuesday, we could use data from the last four weeks. • Weights on previous Tuesdays: T-1 is .25; T-2 is .20; T-3 is .15; T-4 is .10; the average of all other days is weighted .30.
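
The Tuesday staffing example can be sketched with the weights given above (.25, .20, .15, .10 for the last four Tuesdays and .30 for the average of all other days). The staffing counts and the function name are hypothetical.

```python
def weighted_tuesday_forecast(last_four_tuesdays, other_days,
                              weights=(0.25, 0.20, 0.15, 0.10), other_weight=0.30):
    """Weighted moving average forecast for the next Tuesday.

    last_four_tuesdays is ordered most recent first (T-1, T-2, T-3, T-4);
    other_days holds the observations for all non-Tuesday days.
    """
    if len(last_four_tuesdays) != len(weights):
        raise ValueError("need one observation per Tuesday weight")
    if abs(sum(weights) + other_weight - 1.0) > 1e-9:
        raise ValueError("weighting factors must add to one")
    other_average = sum(other_days) / len(other_days)
    weighted_tuesdays = sum(w * x for w, x in zip(weights, last_four_tuesdays))
    return weighted_tuesdays + other_weight * other_average


# Hypothetical staffing counts
tuesdays = [40, 36, 44, 38]      # T-1, T-2, T-3, T-4
other_days = [28, 32, 30, 30]    # average is 30

forecast = weighted_tuesday_forecast(tuesdays, other_days)
print(round(forecast, 2))  # 36.6
```

The check that the weights add to one mirrors the requirement on the slide. Without it, the forecast would be scaled up or down relative to the data.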

  41. Exponential Smoothing (figure: Attendance vs. Year; series: Actual) Phuong https://www.analyticsvidhya.com/blog/2015/12/complete-tutorial-time-series-modeling/

  43. Characteristics of Time Series Data • Data are NOT necessarily INDEPENDENT and NOT necessarily IDENTICALLY distributed • ORDERING is very important. • Changing the order could change the meaning of the data -> DEPENDENCY Phuong

  44. Time Series Forecasting Horizons • Long Term • Five years or more into the future • E.g., plant location and product planning • Medium Term • 1 season to 2 years • E.g., sales forecasts • Short Term • 1 day to 1 year or less than 1 season • E.g., staffing levels and inventory levels Phuong

  45. When Should Time Series Analysis Best Be Used? • Deterministic factors are NOT READILY AVAILABLE. • Consider a UNIVARIATE time series – the same variable collected over time. Phuong

  46. How to Apply Time Series Analysis? • 1. Given a continuous signal, we can sample its values at equal time intervals. • E.g., human electrocardiography • 2. The value of the state variable accumulates during some time interval. • E.g., daily rainfall • 3. Some processes are inherently discrete. • E.g., trains arriving at the station at discrete moments in time Phuong

  47. Applications of Time Series Analysis • Economic Forecasting • Sales Forecasting • Budgetary Analysis • Stock Market Analysis • Yield Projections • Process and Quality Control • Inventory Studies • Workload Projections • Utility Studies • Census Analysis Phuong

  48. Application Software • Spreadsheets • Microsoft Excel, Quattro Pro, Lotus 1-2-3, etc. • Statistical packages • SPSS, SAS, NCSS, Minitab, etc. • Specialty forecasting packages • Forecast Master, Forecast Pro, etc. Phuong

  49. Examples of Time series data • Number of babies born in each hour. • Daily closing price of a stock. • The monthly trade balance of the U.S. for each year. • GDP of the country, measured each year. Phuong

  50. Marketing Example: wine sales of a company • State variable: monthly wine sales (figure: wine sales by month) Phuong http://home.vicnet.net.au/~norca/Red_Wine.htm
