Data Transformation For Normality

An assumption of our analysis is that the data is normally distributed. If the data is not normally distributed, then you must apply a transformation to bring it closer to normal: Log(Y), 1/Y, or SQRT(Y).

dunn

Presentation Transcript


  1. Data Transformation For Normality
     • An assumption of our analysis is that the data is normally distributed
     • If the data is not normally distributed, then you must do a transformation to get normal data:
     • Log(Y), 1/Y, SQRT(Y)
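As a minimal sketch, the three transformations can be applied to a small right-skewed sample and compared. The sample values are hypothetical (not from the deck), Python is used only for illustration, the log is assumed to be base 10, and skewness is just one rough symptom of non-normality rather than a formal test.

```python
import math

def skewness(values):
    """Moment-based skewness: 0 for a symmetric sample, > 0 for a long right tail."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical right-skewed sample; all values must be > 0 for Log(Y) and 1/Y.
y = [1, 2, 2, 3, 3, 4, 5, 8, 15, 40]

transforms = {
    "Y": lambda v: v,              # untransformed
    "Log(Y)": math.log10,          # assumed base-10 log
    "1/Y": lambda v: 1 / v,        # reciprocal
    "SQRT(Y)": math.sqrt,          # square root
}

for name, func in transforms.items():
    transformed = [func(v) for v in y]
    print(f"{name:8s} skewness = {skewness(transformed):.2f}")
```

Which transformation works best depends on the data, so it is worth checking each candidate rather than assuming one in advance.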

  2. Log transformation reduces larger numbers by a greater percentage than smaller numbers.
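A short worked example may make this concrete. It assumes a base-10 log (the slide does not state the base) and uses the hypothetical values 10, 100, and 1000.

```python
import math

def percent_reduction(y):
    """Percentage by which log10(y) is smaller than y itself."""
    return 100 * (y - math.log10(y)) / y

for y in (10, 100, 1000):
    print(f"Y = {y:4d}  Log(Y) = {math.log10(y):.0f}  reduction = {percent_reduction(y):.1f}%")
# Y =   10  Log(Y) = 1  reduction = 90.0%
# Y =  100  Log(Y) = 2  reduction = 98.0%
# Y = 1000  Log(Y) = 3  reduction = 99.7%
```

Because the large values shrink proportionally more, a long right tail is pulled in toward the rest of the data.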

  3. Residual Plots
     • Residual value = Observed - Predicted
     • For the regression equation: Y = (2.197*X) - 395.32
     • Our YObs = 76 and X = 208. Based on our equation, though: if X = 208, then YPred = (2.197*208) - 395.32 = 61.656
     • Our residual for this X = 76 - 61.656 = 14.344
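The residual calculation can be sketched in a few lines of Python. The coefficients are taken exactly as printed on the slide, so results may differ in the later decimals from values computed with unrounded coefficients; the function names are illustrative only.

```python
SLOPE = 2.197        # m, as printed on the slide
INTERCEPT = -395.32  # b, as printed on the slide

def predict_weight(length_mm):
    """Predicted weight (g) from the fitted line Y = m*X + b."""
    return SLOPE * length_mm + INTERCEPT

def residual(observed_weight, length_mm):
    """Residual = Observed - Predicted."""
    return observed_weight - predict_weight(length_mm)

print(f"Predicted: {predict_weight(208):.3f}")  # Predicted: 61.656
print(f"Residual:  {residual(76, 208):.3f}")    # Residual:  14.344
```

A positive residual means the observed fish is heavier than the line predicts for its length.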

  4. Residual = Wt - Pred_Wt, where Pred_Wt is based on Y = (2.197*X) - 395.32

  5. Residual Plots [Figure: weight not transformed; x-axis: Length]
     • If there is a pattern to a residual plot, then you should do a data transformation.

  6. Residual = LogWt - P_LogWt, where P_LogWt is based on Y = (0.0053*X) + 0.8357
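The same residual idea carries over to the log scale. The sketch below assumes that the log is base 10 and that the fish from the earlier residual example (length 208 mm, weight 76 g) belongs to this data set; neither is stated on the slide.

```python
import math

LOG_SLOPE = 0.0053      # as printed on the slide
LOG_INTERCEPT = 0.8357  # as printed on the slide

def predict_log_weight(length_mm):
    """Predicted log10(weight) from the line fitted to transformed data."""
    return LOG_SLOPE * length_mm + LOG_INTERCEPT

log_observed = math.log10(76)               # LogWt (assumed base 10)
log_predicted = predict_log_weight(208)     # P_LogWt
log_residual = log_observed - log_predicted

print(f"LogWt     = {log_observed:.4f}")
print(f"P_LogWt   = {log_predicted:.4f}")
print(f"Residual  = {log_residual:.4f}")
# Back-transform the prediction to grams by undoing the log.
print(f"Predicted weight = {10 ** log_predicted:.1f} g")
```

Residuals on the log scale are in log units, so they should be compared with each other rather than with residuals from the untransformed fit.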

  7. Residual Plots [Figure: weight transformed; x-axis: Length]
     • The residual plot of normally distributed data should not have an obvious pattern.

  8. [Figure: two panels, "Weight Not Transformed" and "Weight Transformed", each plotted against Length]

  9. More on Regression
     Equation: Y = mX + b; here Y = (2.197*X) - 395.32
     • What does this equation tell us?
     • The predicted value of Y for a given X
     • If X = 220, then Y = (2.197*220) - 395.32 = 88.02
     • What does b reflect?
     • It is where the regression line crosses the Y axis, where X = 0.
     • Y = (2.197*0) - 395.32 = -395.32
     • This says the weight of a white trout is -395.32 g when its length is 0 mm. Does that make sense to you?
     • Also, a white trout that weighs 0 g should be about 180 mm long (solve the regression for X when Y = 0)?
     • You cannot extrapolate beyond your data set!
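The intercept and the "zero-weight" length can be checked directly. This sketch uses the coefficients exactly as printed (rounded), so values computed from unrounded coefficients may differ slightly.

```python
SLOPE = 2.197        # m, as printed on the slide
INTERCEPT = -395.32  # b, as printed on the slide

def predict_weight(length_mm):
    """Predicted weight (g) from Y = m*X + b."""
    return SLOPE * length_mm + INTERCEPT

# b is the predicted Y at X = 0.
print(f"Weight at 0 mm:   {predict_weight(0):.2f} g")    # Weight at 0 mm:   -395.32 g

# Solve 0 = m*X + b for X to find where the line crosses the X axis.
length_at_zero_weight = -INTERCEPT / SLOPE
print(f"Length at 0 g:    {length_at_zero_weight:.2f} mm")  # Length at 0 g:    179.94 mm

print(f"Weight at 220 mm: {predict_weight(220):.2f} g")  # Weight at 220 mm: 88.02 g
```

Both the negative weight and the 180 mm weightless fish come from evaluating the line far outside the lengths that were actually measured.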

  10. What does m reflect?
      • It represents how much Y changes with a change in X
      • For every 1 mm increase in length, the predicted weight increases by 2.197 g (NOT by (2.197*X)-395.32).
      • What if length increases by 10 mm? Weight increases by 10 * 2.197 = 21.97 g
      • What if length increases by 23 mm? Weight increases by 23 * 2.197 = 50.531 g
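A quick check shows why the intercept drops out: the change in predicted weight depends only on the change in length, not on the starting length. The starting lengths below (200 mm and 250 mm) are arbitrary illustrative values.

```python
SLOPE = 2.197        # m, as printed on the slide
INTERCEPT = -395.32  # b, as printed on the slide

def predict_weight(length_mm):
    """Predicted weight (g) from Y = m*X + b."""
    return SLOPE * length_mm + INTERCEPT

def weight_change(length_change_mm):
    """Change in predicted weight for a given change in length: m * delta X."""
    return SLOPE * length_change_mm

print(f"{weight_change(10):.3f}")  # 21.970
print(f"{weight_change(23):.3f}")  # 50.531

# The same 10 mm increase gives the same change from any starting length.
for start in (200, 250):
    change = predict_weight(start + 10) - predict_weight(start)
    print(f"from {start} mm: {change:.3f}")
```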

  11. Comparing Two Regression Lines

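  12. First thing to do is to use regression to get slopes.
        proc sort; by sex;     /* BY-group processing needs data sorted by sex */
        proc reg; by sex;      /* fit a separate regression for each sex */
        model weight=width;
        run;
      This will give us the slopes for male and female.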

  13. Comparing the Growth Rate
      • m for female = 6.17430
      • m for male = 7.95468
      • The difference is 6.17430 - 7.95468 = -1.7803834
      • Significant?

  14. Proc GLM (General Linear Models)
      • A blend of ANOVA and Regression in SAS
        proc sort; by sex;
        proc reg; by sex;
        model weight=width;
        run;
      Will give us the slopes for male and female.
        proc glm; class sex;
        model weight=sex width sex*width / solution;   /* sex*width tests whether the slope differs by sex */
        run;
      Can use Proc GLM in place of Proc Reg. Will compare the slopes for male and female.
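To show what the slope comparison is doing, here is a rough Python sketch that fits a line to each sex and forms a t statistic for the difference in slopes using a pooled residual variance. This is a hand-rolled stand-in, not SAS output: the crab widths and weights are hypothetical, the function names are illustrative, and the p-value would still need to be looked up from a t table with the reported degrees of freedom.

```python
import math

def fit_line(x, y):
    """Least-squares line for one group; returns slope, intercept, Sxx, SSE."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return slope, intercept, sxx, sse

def slope_difference_t(x1, y1, x2, y2):
    """t statistic and degrees of freedom for slope1 - slope2 (pooled variance)."""
    b1, _, sxx1, sse1 = fit_line(x1, y1)
    b2, _, sxx2, sse2 = fit_line(x2, y2)
    df = len(x1) + len(x2) - 4           # two slopes and two intercepts estimated
    pooled_var = (sse1 + sse2) / df
    std_err = math.sqrt(pooled_var * (1 / sxx1 + 1 / sxx2))
    return (b1 - b2) / std_err, df

# Hypothetical data, for illustration only.
female_width = [10, 12, 14, 16, 18]
female_weight = [61, 72, 86, 97, 110]
male_width = [10, 12, 14, 16, 18]
male_weight = [80, 95, 112, 127, 144]

female_slope = fit_line(female_width, female_weight)[0]
male_slope = fit_line(male_width, male_weight)[0]
t_stat, df = slope_difference_t(female_width, female_weight, male_width, male_weight)

print(f"female slope = {female_slope:.2f}")  # female slope = 6.15
print(f"male slope   = {male_slope:.2f}")    # male slope   = 8.00
print(f"t = {t_stat:.2f} on {df} df")
```

A large absolute t value relative to the critical value for those degrees of freedom corresponds to a small P for the difference in slopes.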

  15. Output to Look For
      • Because P < 0.05, we can say the two slopes are significantly different from one another and that male crabs are heavier than females of equal width.
