Discovering and describing relationships

Discovering and Describing Relationships

Farideh Dehkordi-Vakil



Exploring Relationships between Two Quantitative Variables

  • Scatter plots

    • Represent the relationship between two different continuous variables measured on the same subjects.

    • Each point in the plot represents the values for one subject for the two variables.


Exploring Relationships between Two Quantitative Variables

  • Example:

    Data reported by the Organisation for Economic Co-operation and Development (OECD) on its 29 member nations in 1998.

    • Per capita gross domestic product is on the x-axis.

    • Per capita health care expenditure is on the y-axis.


Exploring Relationships between Two Quantitative Variables

  • We can describe the overall pattern of a scatter plot by its

    • Form or shape

    • Direction

    • Strength


Exploring Relationships between Two Quantitative Variables

  • Form or shape

    • The form shown by the scatter plot is linear if the points lie in a straight-line pattern.

  • Strength

    • The relationship is strong if the points lie close to a line, with little scatter.


Exploring Relationships between Two Quantitative Variables

  • Direction

    • Positive and negative association

      • Two variables are positively associated when above-average values of one variable tend to occur in individuals with above-average values of the other variable, and below-average values of both also tend to occur together.

      • Two variables are negatively associated when above-average values of one tend to occur in subjects with below-average values of the other, and vice versa.
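The definition of direction can be illustrated with a minimal Python sketch. It sums the products of each subject's deviations from the two means: the sum is positive when above-average values tend to occur together and negative when they tend to occur with below-average values. The function name `association_direction` and the data are hypothetical and are not taken from the presentation.

```python
from statistics import mean


def association_direction(x, y):
    """Classify the direction of association from the summed co-deviations."""
    x_bar, y_bar = mean(x), mean(y)
    # Each product is positive when a subject is on the same side of both means.
    co_deviation = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    if co_deviation > 0:
        return "positive"
    if co_deviation < 0:
        return "negative"
    return "none"


# Hypothetical data for five subjects (illustration only).
hours_studied = [1, 2, 3, 4, 5]
test_score = [52, 55, 61, 64, 70]
errors_made = [9, 7, 6, 4, 2]

print(association_direction(hours_studied, test_score))
print(association_direction(hours_studied, errors_made))
```

The sign of this sum is also the sign of the correlation coefficient, because the correlation coefficient divides the same sum by a positive quantity.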



Exploring Relationships between Two Quantitative Variables

  • Per capita health care example

    • “subjects” studied are countries

    • Form of relationship is roughly linear

    • The direction is positive

    • The relationship is strong.


Correlation

  • It is often useful to have a measure of the degree of association between two variables. For example, you may believe that sales are affected by expenditures on advertising, and want to measure the degree of association between sales and advertising.

    • The correlation coefficient is a numeric measure of the direction and strength of the linear relationship between two continuous variables.

    • The notation for the sample correlation coefficient is r.


Correlation

  • There are several alternative ways to write the algebraic expression for the correlation coefficient. The following is one.

    $$ r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - \left(\sum X\right)^2\right]\left[n\sum Y^2 - \left(\sum Y\right)^2\right]}} $$

    • X and Y represent the two variables of interest, for example advertising and sales, or per capita gross domestic product and per capita health care expenditure.

    • n is the number of subjects in the sample.

    • The notation for the population correlation coefficient is ρ.
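As a rough illustration, the sample correlation can be computed in Python from the five sums ΣX, ΣY, ΣXY, ΣX² and ΣY², using the computational form r = [nΣXY − ΣXΣY] / sqrt([nΣX² − (ΣX)²][nΣY² − (ΣY)²]). This is one of the several equivalent expressions; the function name `pearson_r` and the advertising and sales figures are hypothetical.

```python
import math


def pearson_r(x, y):
    """Sample correlation coefficient r from the computational (sums) formula."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    # The denominator is zero when either variable is constant; r is undefined then.
    return numerator / denominator


# Hypothetical advertising and sales figures (illustration only).
advertising = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]
r = pearson_r(advertising, sales)
print(round(r, 4))  # 0.7746
```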



Correlation

  • Facts about correlation coefficient

    • r has no unit.

    • r > 0 indicates a positive association; r < 0 indicates a negative association

    • r is always between –1 and +1

    • Values of r near 0 imply a very weak linear relationship

    • Correlation measures only the strength of linear association.
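Two of these facts can be checked with a short Python sketch: changing the unit of measurement leaves r unchanged, and a perfect but non-linear relationship can still give r = 0. The helper `pearson_r` below uses the deviation form of the formula, and all data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Sample correlation coefficient from deviations about the means."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    s_yy = sum((b - y_bar) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


# r has no unit: heights in metres and in centimetres give the same r.
height_m = [1.5, 1.6, 1.7, 1.8]
weight_kg = [50, 58, 63, 72]
height_cm = [100 * h for h in height_m]
r_metres = pearson_r(height_m, weight_kg)
r_centimetres = pearson_r(height_cm, weight_kg)
print(round(r_metres, 4), round(r_centimetres, 4))

# r measures only linear association: y = x**2 is an exact relationship,
# but on a range symmetric about zero the linear correlation is 0.
x = [-2, -1, 0, 1, 2]
y = [v * v for v in x]
r_curve = pearson_r(x, y)
print(r_curve)
```

A value of r near 0 therefore rules out a linear pattern only, which is why the scatter plot should be examined alongside the coefficient.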


Correlation

  • We could perform a hypothesis test to determine whether the value of a sample correlation coefficient (r) gives us reason to believe that the population correlation (ρ) is significantly different from zero.

  • The hypothesis test would be

    H0: ρ = 0

    Ha: ρ ≠ 0


Correlation

  • The test statistic would be

    $$ t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}} $$

    • The test statistic has a t-distribution with n - 2 degrees of freedom.

  • Reject H0 if $|t| > t_{\alpha/2,\,n-2}$.


Example: Do wages rise with experience?

  • Many factors affect the wages of workers: the industry they work in, their type of job, their education and experience, and changes in general levels of wages. We will look at a sample of 59 married women who hold customer service jobs in Indiana banks. The following table gives their weekly wages at a specific point in time and their length of service with their employer, in months. The size of the place of work is recorded simply as “large” (100 or more workers) or “small.” Because industry, job type, and the time of measurement are the same for all 59 subjects, we expect to see a clear relationship between wages and length of service.



Example: Do wages rise with experience?

  • The correlation between wages and length of service for the 59 bank workers is r = 0.3535.

  • We expect a positive correlation between length of service and wages in the population of all married female bank workers. Is the sample result convincing that this is true?


Example: Do wages rise with experience?

  • To compute the correlation, we need:

  • Replacing these in the formula

  • We want to test

    H0: ρ = 0
    Ha: ρ > 0

    The test statistic is

    $$ t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}} = \frac{0.3535\sqrt{57}}{\sqrt{1-0.3535^2}} \approx 2.853 $$


Example: Do wages rise with experience?

  • Comparing t = 2.853 with critical values from the t table with n - 2 = 57 degrees of freedom helps us make our decision.

  • Conclusion:

    • Since P(t > 2.853) < .005, we reject H0.

    • There is a positive correlation between wages and length of service.
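The value t = 2.853 can be reproduced from r = 0.3535 and n = 59 using the statistic t = r·sqrt(n − 2) / sqrt(1 − r²). The following Python sketch does only that arithmetic; the function name `correlation_t_statistic` is hypothetical, and the comparison against the t table is left to the reader.

```python
import math


def correlation_t_statistic(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    if n <= 2 or not -1.0 < r < 1.0:
        raise ValueError("need n > 2 and -1 < r < 1")
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r ** 2)


# Values from the bank wages example: r = 0.3535 for 59 workers.
t = correlation_t_statistic(0.3535, 59)
degrees_of_freedom = 59 - 2
print(round(t, 3), degrees_of_freedom)  # 2.853 57
```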


Correlograms: An Alternative Method of Data Exploration

  • In evaluating time series data, it is useful to look at the correlation between successive observations over time.

  • This measure of correlation is called autocorrelation and may be calculated as follows:

    $$ r_k = \frac{\sum_{t=k+1}^{n}(y_t-\bar{y})(y_{t-k}-\bar{y})}{\sum_{t=1}^{n}(y_t-\bar{y})^2} $$

    • r_k = autocorrelation coefficient for a k-period lag.

    • ȳ = mean of the time series.

    • y_t = value of the time series at period t.

    • y_{t-k} = value of the time series k periods before period t.

    • n = number of observations in the series.
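A minimal Python sketch of the lag-k autocorrelation follows. It assumes the usual sample form, in which the numerator sums (y_t − ȳ)(y_{t−k} − ȳ) over t = k+1, …, n and the denominator sums (y_t − ȳ)² over the whole series. The function name `autocorrelation` and the short series are hypothetical.

```python
def autocorrelation(y, k):
    """Lag-k sample autocorrelation coefficient r_k of the series y."""
    n = len(y)
    if not 0 < k < n:
        raise ValueError("lag k must satisfy 0 < k < len(y)")
    y_bar = sum(y) / n
    # Python lists are 0-based, so periods k+1..n correspond to indices k..n-1.
    numerator = sum((y[t] - y_bar) * (y[t - k] - y_bar) for t in range(k, n))
    denominator = sum((value - y_bar) ** 2 for value in y)
    return numerator / denominator


# Hypothetical five-period series (illustration only).
series = [2, 4, 6, 8, 10]
r1 = autocorrelation(series, 1)
print(round(r1, 2))  # 0.4
```

Note that the denominator uses all n observations while the numerator has only n − k terms, so r_k shrinks toward zero as k approaches n even for a strongly patterned series.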


Correlograms: An Alternative Method of Data Exploration

  • Autocorrelation coefficients for different time lags can be used to answer the following questions about a time series.

    • Are the data random?

      • In this case the autocorrelations between y_t and y_{t-k} for any lag are close to zero. The successive values of the time series are not related to each other.


Correlograms: An Alternative Method of Data Exploration

  • Is there a trend?

    • If the series has a trend, y_t and y_{t-k} are highly correlated.

    • The autocorrelation coefficients are significantly different from zero for the first few lags and then gradually drop toward zero.

    • The autocorrelation coefficient for lag 1 is often very large (close to 1).

    • A series that contains a trend is said to be non-stationary.
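The behaviour described for a trending series can be seen in a small Python sketch. It builds a hypothetical series with a pure linear trend (40 periods) and computes the sample autocorrelations at a few lags; the helper `autocorrelation` is an assumed implementation of the usual sample formula, not code from the presentation.

```python
def autocorrelation(y, k):
    """Lag-k sample autocorrelation coefficient r_k of the series y."""
    n = len(y)
    y_bar = sum(y) / n
    numerator = sum((y[t] - y_bar) * (y[t - k] - y_bar) for t in range(k, n))
    denominator = sum((value - y_bar) ** 2 for value in y)
    return numerator / denominator


# Hypothetical series with a steady upward trend: 1, 2, ..., 40.
trend_series = list(range(1, 41))

r1 = autocorrelation(trend_series, 1)
r5 = autocorrelation(trend_series, 5)
r10 = autocorrelation(trend_series, 10)

# The lag-1 coefficient is close to 1 and later lags decline gradually.
print(round(r1, 3), round(r5, 3), round(r10, 3))
```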


Correlograms: An Alternative Method of Data Exploration

  • Is there a seasonal pattern?

    • If a series has a seasonal pattern, there will be a significant autocorrelation coefficient at the seasonal time lag or multiples of the seasonal lag.

    • The seasonal lag is 4 for quarterly data and 12 for monthly data.
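To illustrate the seasonal lag, the Python sketch below builds a hypothetical quarterly series that repeats the same four-quarter pattern for six years and computes r_k for lags 1 to 8. The helper `autocorrelation` is an assumed implementation of the usual sample formula; the data are invented for the illustration.

```python
def autocorrelation(y, k):
    """Lag-k sample autocorrelation coefficient r_k of the series y."""
    n = len(y)
    y_bar = sum(y) / n
    numerator = sum((y[t] - y_bar) * (y[t - k] - y_bar) for t in range(k, n))
    denominator = sum((value - y_bar) ** 2 for value in y)
    return numerator / denominator


# Hypothetical quarterly data: the same seasonal pattern repeated for 6 years.
quarterly_pattern = [12, 18, 25, 15]
quarterly_series = quarterly_pattern * 6

acf = {k: autocorrelation(quarterly_series, k) for k in range(1, 9)}
for lag, r_k in acf.items():
    print(lag, round(r_k, 3))

# The largest coefficient occurs at the seasonal lag (4 for quarterly data),
# with a second peak at its multiple, lag 8.
peak_lag = max(acf, key=acf.get)
```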



Correlograms: An Alternative Method of Data Exploration

  • Is it stationary?

    • A stationary time series is one whose basic statistical properties, such as the mean and variance, remain constant over time.

    • Autocorrelation coefficients for a stationary series decline to zero fairly rapidly, generally after the second or third time lag.


Correlograms: An Alternative Method of Data Exploration

  • To determine whether the autocorrelation at lag k is significantly different from zero, the following hypotheses and rule of thumb may be used.

    • H0: ρ_k = 0, Ha: ρ_k ≠ 0

    • For any k, reject H0 if $|r_k| > 2/\sqrt{n}$,

    • where n is the number of observations.

    • This rule of thumb is for α = 5%.
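The rule of thumb can be applied mechanically, as in the following Python sketch. It assumes the common form of the rule, which rejects H0 at roughly the 5% level when |r_k| exceeds 2/sqrt(n). The function name `significant_lags` and the autocorrelation values are hypothetical and are not results from the presentation.

```python
import math


def significant_lags(acf, n):
    """Return the lags whose |r_k| exceeds the approximate 5% bound 2/sqrt(n)."""
    critical_value = 2.0 / math.sqrt(n)
    return [lag for lag, r_k in acf.items() if abs(r_k) > critical_value]


# Hypothetical autocorrelations for a series of n = 12 observations.
n_observations = 12
example_acf = {1: 0.82, 2: 0.61, 3: 0.40, 4: -0.21}

critical = 2.0 / math.sqrt(n_observations)
print(round(critical, 3))  # 0.577
print(significant_lags(example_acf, n_observations))  # [1, 2]
```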


Correlograms: An Alternative Method of Data Exploration

  • The hypothesis test developed to determine whether a particular autocorrelation coefficient is significantly different from zero is:

  • Hypotheses

    • H0: ρ_k = 0, Ha: ρ_k ≠ 0

  • Test statistic:

    $$ t = \frac{r_k - 0}{1/\sqrt{n-k}} $$

  • Reject H0 if $|t| > t_{\alpha/2}$ with n - k degrees of freedom.


Correlograms: An Alternative Method of Data Exploration

  • The plot of the autocorrelations versus time lag is called a correlogram.

  • The horizontal axis is the time lag.

  • The vertical axis is the autocorrelation coefficient.

  • Patterns in a correlogram are used to analyze key features of the data.
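A correlogram is normally drawn with plotting software, but its content can be sketched as plain text in Python. In the sketch below each lag gets one row and the bar length is proportional to |r_k|, so the axes are transposed relative to the usual plot. The function name `text_correlogram` and the autocorrelation values are hypothetical.

```python
def text_correlogram(acf, width=20):
    """Return one text row per lag; bar length is proportional to |r_k|."""
    rows = []
    for lag, r_k in enumerate(acf, start=1):
        bar = "#" * round(abs(r_k) * width)
        sign = "-" if r_k < 0 else "+"
        rows.append(f"lag {lag:2d} {r_k:+.2f} {sign}{bar}")
    return rows


# Hypothetical autocorrelations for lags 1 to 4 (illustration only).
example_acf = [0.80, 0.55, 0.30, -0.25]
rows = text_correlogram(example_acf)
for row in rows:
    print(row)
```

Bars that shrink slowly across the lags suggest a trend, while isolated long bars at lag 4 or lag 12 suggest a quarterly or monthly seasonal pattern.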


Example: Mobile Home Shipments

  • Correlograms for the mobile home shipments

  • Note that this is quarterly data.


Example: Japanese Exchange Rate

  • As the world’s economy becomes increasingly interdependent, various exchange rates between currencies have become important in making business decisions. For many U.S. businesses, the Japanese exchange rate (in yen per U.S. dollar) is an important decision variable. A time series plot of the Japanese yen to U.S. dollar exchange rate is shown below. On the basis of this plot, would you say the data are stationary? Is there any seasonal component to this time series?



Example: Japanese Exchange Rate

  • Here is the autocorrelation structure for EXRJ.

  • With a sample size of 12, the critical value is $2/\sqrt{12} \approx 0.577$.

  • This is the approximate 95% critical value for rejecting the null hypothesis of zero autocorrelation at lag k.


Example: Japanese Exchange Rate

  • The correlogram for EXRJ is given below.


Example: Japanese Exchange Rate

  • Since the autocorrelation coefficients fall below the critical value after just two periods, we can conclude that there is no trend in the data.


Example: Japanese Exchange Rate

  • To check for seasonality at α = .05

  • The hypotheses are:

    • H0: ρ_12 = 0, Ha: ρ_12 ≠ 0

  • Test statistic is:

  • Reject H0 if


Example: Japanese Exchange Rate

  • Since

  • We do not reject H0; therefore, seasonality does not appear to be an attribute of the data.

