Next three lectures

Today: Surface level
Next week (John): Level 2 (underlying logic, mathematical formulas)
Also today: Exposure to some R code (no need to take notes)

Inferential statistics
Scientific research is about (causal) relationships.
Scientific research is about predictions.
We do inferential statistics to examine our predictions.
We do inferential statistics to test whether the data provide empirical evidence for our hypotheses.
→ The inferential test will give us a YES/NO answer.

Inferential statistics
DATA = MODEL + ERROR
DATA: what we measure; what we are trying to explain; our dependent variable
MODEL: our hypothesis; explained variance; our predictor variable(s)
ERROR: by how far we are off; unexplained variance; our "residuals"

Inferential statistics
DATA = MODEL1 + ERROR1
DATA = MODEL2 + ERROR2
Usually: One model has greater explanatory power but is more complex than the other model. The price we pay is complexity.
Question: Is it worth it to pay the higher price?

Inferential statistics
Bottle 1: $8. Bottle 2: $10.
Bottle 1 tastes like vinegar. Bottle 2 has a phenomenal taste that remains on your tongue for 3 sec.
Which one would you choose?

Inferential statistics
Bottle 1: $8. Bottle 2: $36.
Bottle 2 tastes slightly better than Bottle 1, but the difference in taste is hardly noticeable.
Which one would you choose?

Inferential statistics
Explanatory power = percentage of variance explained
Your model has much better explanatory power (explains much more variance) and is only slightly more complex (involves estimating one additional parameter) than mine.
→ ?
Your model has slightly better explanatory power (explains slightly more variance), but is much more complex (involves estimating many additional parameters) than mine.
→ ?

The model comparison approach
The construct we are trying to explain (predict): Subjective well-being (swb)
Variance of swb
Pooja …

A series of models
With every model, a two-step process:
Step 1: Make the best predictions, given the information you have
Step 2: Compute the total prediction error
Our first model (see R script): swb = B0 + e (here: B0 = 0)
Price: P = 0
Step 2 …
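The two-step process can be sketched in a few lines of Python (the lecture's own R script is not reproduced here). This is a minimal sketch, assuming "total prediction error" means the sum of squared errors (SSE). The function name and the three scores are made up for illustration and are not the lecture's data.

```python
def total_prediction_error(observed, predicted):
    """Step 2: sum of squared differences between observed and predicted scores."""
    return sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))


# Hypothetical scores, NOT the lecture's dataset.
scores = [3, 5, 7]

# Step 1 for the null model (B0 = 0): predict 0 for every participant.
null_predictions = [0 for _ in scores]

# Step 2: total prediction error of the null model.
sse_null = total_prediction_error(scores, null_predictions)
print(sse_null)  # 83
```

With no information bought (P = 0), every score counts in full as error, which is why the null model's SSE is so large.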

The stupid model = the null model = Model 0
swb = B0 + e (here: B0 = 0)
Price: P = 0
Total Prediction Error: SSE = 438
Is this a good model? → Not a meaningful question.
Q?
Let's try another model …
Pooja …

The basic model = the mean-only model = Model 1
Buy one piece of information (the mean of Y)
Step 1: Make the best predictions, given the information you have
→ Predict the mean for all participants
Step 2: Compute the total prediction error
→ Total Prediction Error: SSE = 88

The basic model = the mean-only model = Model 1
swb = b0 + e (here: b0 = 5)
Price: P = 1 (number of parameters estimated)
Total Prediction Error: SSE = 88
Is this a good model?
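Why the mean is the best single prediction can be checked numerically: among all constant predictions, the mean gives the smallest sum of squared errors. The sketch below uses made-up scores (not the lecture's data) and a hypothetical helper named `sse`.

```python
def sse(observed, constant):
    """Total prediction error when every participant gets the same prediction."""
    return sum((y - constant) ** 2 for y in observed)


# Hypothetical scores, NOT the lecture's dataset.
scores = [3, 5, 7]
mean_score = sum(scores) / len(scores)  # the one parameter we "buy" (b0)

# Compare the mean against a few other constant predictions.
for guess in (0, 4, mean_score, 6):
    print(guess, sse(scores, guess))
```

Any constant other than the mean yields a larger total prediction error, so one estimated parameter (P = 1) is best spent on the mean.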

Model comparison
Model 0: swb = B0 + e, P = 0, SSE = 438 (compact model)
Model 1: swb = b0 + e, P = 1, SSE = 88 (augmented model)
Is it worth it to buy one additional piece of information?
Is it worth it to estimate one additional parameter (and thus to have a more complex model)?
"Magic formula": F = 51.70; p < .0001; yes
t = 7.19
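The formula itself is not legible in this copy of the slide. One standard form of the model-comparison F statistic, which reproduces the reported values, is sketched below. The subscripts C and A (compact and augmented model) and the symbol n (number of participants) are labels chosen here. The value n = 14 is an inference, not a number given on the slides: with B0 = 0 and b0 equal to the mean, SSE drops by n * b0^2, so 438 - 88 = 25n.

```latex
F = \frac{(SSE_C - SSE_A)/(P_A - P_C)}{SSE_A/(n - P_A)}
  = \frac{(438 - 88)/(1 - 0)}{88/(14 - 1)}
  \approx 51.70,
\qquad
t = \sqrt{F} \approx 7.19
```

The numerator is the error reduction per additional parameter; the denominator is the error that remains per parameter that could still be added. The relation t = sqrt(F) applies because the two models differ by exactly one parameter.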

Model comparison
Model 0: swb = B0 + e, P = 0, SSE = 438 (compact model)
Model 1: swb = b0 + e, P = 1, SSE = 88 (augmented model)
With every model comparison we will consider
- The mathematical interpretation
- The conceptual interpretation
Here: Are the subjective well-being scores on average reliably different from zero? The answer is yes (not a terribly meaningful hypothesis with the present dataset).
Q?

Model 2
Buy two pieces of information (the mean of Y and the scores on another variable)
Step 1: Make the best predictions, given the information you have
→ Use the equation predict2 = b0 + b1 * comp
predict2 = 2.05 + 1.18 * comp

Model 2
predict2 = 2.05 + 1.18 * comp
comp: 0 1 2 3 4
predict2: ? ? ?
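The table can be filled in by plugging each comp value into the equation. A minimal Python sketch follows; the coefficients come from the slide, and writing `predict2` as a function is an illustrative choice.

```python
def predict2(comp, b0=2.05, b1=1.18):
    """Predicted swb for a given self-complexity score (Model 2 from the slide)."""
    return b0 + b1 * comp


for comp in range(5):  # comp = 0, 1, 2, 3, 4
    print(comp, round(predict2(comp), 2))
```

Each one-point increase in comp raises the predicted swb by 1.18, and a participant with comp = 0 is predicted to score 2.05.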

Model 2
predict2 = swb_hat = 2.05 + 1.18 * comp (swb_hat: the predicted value of swb)

Model 2
predict2 = 2.05 + 1.18 * comp
Step 2: Compute the total prediction error
→ Total Prediction Error: SSE = 58.49

Model comparison
Compact model (Model 1): swb = b0 + e (= 5 + e), P = 1, SSE = 88
Augmented model (Model 2): swb = b0 + b1*comp + e (= 2.05 + 1.18*comp + e), P = 2, SSE = 58.49
Is it worth it to estimate one additional parameter (and thus to have a more complex model)?
"Magic formula": F = 6.05; p < .03; ?
t = 2.46
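Both comparisons in this lecture can be reproduced from the reported SSE and P values. The sketch below assumes one standard form of the model-comparison F statistic: error reduction per added parameter, divided by remaining error per remaining degree of freedom. The function name is hypothetical. The sample size n = 14 is not stated on the slides; it is inferred from the drop in SSE between Model 0 and Model 1 (438 - 88 = n * 5^2).

```python
import math


def model_comparison_f(sse_compact, sse_augmented, p_compact, p_augmented, n):
    """F statistic for comparing a compact model with an augmented model."""
    error_reduction_per_parameter = (sse_compact - sse_augmented) / (p_augmented - p_compact)
    remaining_error_per_df = sse_augmented / (n - p_augmented)
    return error_reduction_per_parameter / remaining_error_per_df


N = 14  # ASSUMPTION: inferred from the reported SSEs, not stated on the slides

f_0_vs_1 = model_comparison_f(438, 88, 0, 1, N)    # Model 0 vs Model 1
f_1_vs_2 = model_comparison_f(88, 58.49, 1, 2, N)  # Model 1 vs Model 2

# With exactly one added parameter, t is the square root of F.
print(round(f_0_vs_1, 2), round(math.sqrt(f_0_vs_1), 2))
print(round(f_1_vs_2, 2), round(math.sqrt(f_1_vs_2), 2))
```

The p-values are not computed here; they would come from the F distribution with (1, n - P) degrees of freedom, where P is the augmented model's parameter count.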

Model comparison
Compact model (Model 1): swb = b0 + e (= 5 + e), P = 1, SSE = 88
Augmented model (Model 2): swb = b0 + b1*comp + e (= 2.05 + 1.18*comp + e), P = 2, SSE = 58.49
With every model comparison we will consider
- The mathematical interpretation
- The conceptual interpretation
Here: Is there a relationship between self-complexity and subjective well-being?
Answer: Yes, there is a statistically significant relationship between self-complexity and subjective well-being.

Model 2
swb = b0 + b1*comp + e (here: b0 = 2.05 and b1 = 1.18)
b0: the "intercept"
b1: the "slope"
Price: P = 2
Total Prediction Error: SSE = 58.49
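Where an intercept and a slope come from can be sketched as follows, assuming the estimates are ordinary least-squares values of the kind R's `lm()` returns (the lecture's R script is not shown, so this is an assumption). The data and the helper `fit_line` are made up for illustration and do not reproduce the lecture's b0 = 2.05 and b1 = 1.18.

```python
def fit_line(x, y):
    """Ordinary least-squares intercept (b0) and slope (b1) for y = b0 + b1*x + e."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx                # slope: change in y per one-unit change in x
    b0 = mean_y - b1 * mean_x     # intercept: predicted y when x = 0
    return b0, b1


# Hypothetical data, NOT the lecture's dataset.
comp = [0, 1, 2, 3]
swb = [2, 3, 5, 6]

b0, b1 = fit_line(comp, swb)
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(comp, swb))
print(round(b0, 2), round(b1, 2), round(sse, 2))
```

For these made-up scores the mean-only model would have SSE = 10, so the second parameter removes most of the prediction error.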
