Correlation and Regression. February 2013. GA Medicaid AA Study.
Study & Abstract. Graphic presentation of data. Statistical Analyses. Practical Example.
Study & Abstract
Graphic presentation of data.
Assume: You have two articles that provide conflicting evidence regarding the imposition of formulary controls for mental health drugs. One says prior authorization causes “gaps” in therapy. The other says that when you impose restrictions, drug costs go down, as do the costs of other services such as hospital, ER, and doctor visits.
Conceptualize a Scatter Diagram
Pearson Correlation Coefficient
Linear regression analysis
Hypothesis testing of relationship between x and y
Correlation and regression are the two most common methods to describe the relationship between two quantitative variables (x and y)
The correlation coefficient measures the strength of the relationship between the two variables, provided that the relationship is linear
Regression analysis produces an equation that allows the estimation of the value of y for any given x.
A good way to understand the relationship between two variables is by graphing them
The absolute value of r ranges from 0 (no linear association between x and y) to 1 (perfectly linear association)
The value of r can be positive or negative, so r itself ranges from -1 to +1
The closer the value of r is to +1 or -1, the stronger the relationship is and the more nearly it approximates a straight line.
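As a minimal sketch of how r behaves at its extremes, the snippet below computes the Pearson coefficient from the usual sums of squares and cross-products. The function name pearson_r and the data are hypothetical, chosen only so that one series rises exactly linearly with x and the other falls exactly linearly.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired samples xs and ys."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products about the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when x or y has zero variance")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]    # rises exactly linearly with x
y_down = [10, 8, 6, 4, 2]  # falls exactly linearly with x

r_up = pearson_r(x, y_up)      # perfectly linear, positive
r_down = pearson_r(x, y_down)  # perfectly linear, negative
print(r_up, r_down)
```

Real data will almost never fall exactly on a line, so values strictly between -1 and +1 are the norm.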
Although correlation tells us the strength of the relationship between x and y, it doesn’t allow us to predict the value of y for a given value of x
Instead of using r, we can use a linear equation to represent the relationship between x and y
ŷ = a + bx, where a is the intercept and b is the slope (the sample estimate of β1)
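One common way to obtain a and b is ordinary least squares; the slides do not say which fitting method is used, so treat this as an assumption. The sketch below uses a hypothetical helper fit_line and made-up data, then predicts y for a new x.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for y-hat = a + b*x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("slope is undefined when all x values are equal")
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept
    return a, b

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_line(x, y)
y_hat_at_6 = a + b * 6  # estimated y for a given x, here x = 6
print(round(a, 2), round(b, 2), round(y_hat_at_6, 2))
```

For this data the fit is roughly a = 0.05 and b = 1.99, so each one-unit increase in x is associated with an estimated increase of about 1.99 in y.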
H0: β1 = 0 (i.e., there is no relationship between x and y)
H1: β1 ≠ 0
Calculate t statistic with n-2 df
Compare to critical value
Reject H0 if the absolute value of t is greater than the critical value (c.v.)
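The steps above can be sketched as follows, assuming a least-squares fit and a two-sided test. The helper slope_t_test and the data are hypothetical, and the critical value is passed in by hand from a t table (3.182 for 3 df at a two-sided 5% level) rather than computed.

```python
import math

def slope_t_test(xs, ys, critical_value):
    """t test of H0: beta1 = 0 for a simple linear regression.

    Returns (t, df, reject). critical_value is the two-sided critical
    value for df = n - 2, looked up by the caller in a t table.
    """
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least 3 paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # estimated slope
    a = mean_y - b * mean_x  # estimated intercept
    df = n - 2
    # Residual sum of squares around the fitted line
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    se_b = math.sqrt((sse / df) / sxx)  # standard error of the slope
    t = b / se_b
    return t, df, abs(t) > critical_value

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

t, df, reject = slope_t_test(x, y, critical_value=3.182)
print(round(t, 1), df, reject)
```

Comparing the absolute value of t to the critical value matters because a strongly negative slope is as much evidence against H0 as a strongly positive one.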
In graphs a and b, the regression coefficient is not significantly different from zero: H0: β1 = 0 cannot be rejected.
In graphs c and d, H0: β1 = 0 is rejected.
Logistic regression is commonly used in biomedical research.
It is a variant of multiple regression in which the dependent variable (the one being studied) is a discrete, nominal variable, for example 0 or 1, representing the presence or absence of some characteristic.
For example, you could do a study of long-term statin use where the dependent variable (left side of the equation) is all-cause mortality and the independent variable (right side) is length of statin use.
So the dependent variable could be 0 vs. 1 for dead or alive.
The reference category would be patients who were alive at the end of the study, and the regression result will be expressed as an “odds ratio.” In other words, are the odds of dying by some point in the future higher or lower with statin use?
So, if the odds ratio were 0.78, you could interpret this as 22% lower odds of dying associated with statin use.
Conversely, if the odds ratio were 1.22, you could conclude there were 22% higher odds of dying associated with statin use.
Odds ratios are assessed for significance using chi-square analysis, and p values are reported.
Note: Which way the percentages go depends on which outcome, life or death, you set up as the reference category.
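The percentage reading of an odds ratio is its distance from 1, expressed as a percent change in the odds of the outcome being modeled relative to the reference category. A minimal sketch of that arithmetic follows; the function name percent_change_in_odds is hypothetical. Note that this is a change in odds, not in probability.

```python
def percent_change_in_odds(odds_ratio):
    """Percent change in the odds implied by an odds ratio.

    Negative values mean lower odds, positive values mean higher odds.
    """
    if odds_ratio <= 0:
        raise ValueError("an odds ratio must be positive")
    return (odds_ratio - 1.0) * 100.0

# The two odds ratios used in the example above
for or_value in (0.78, 1.22):
    change = percent_change_in_odds(or_value)
    direction = "lower" if change < 0 else "higher"
    print(f"OR = {or_value}: {abs(change):.0f}% {direction} odds")
```

An odds ratio of exactly 1 gives a change of 0%, meaning the odds are the same as in the reference category.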
Correlation and regression are methods to measure the relationship between two variables
Regression analysis can also study the relationship of more than two variables, such as in the example we will walk through in class