An overview of data editing for statistical analysis: transformations that give equal variance, normal distributions, and linear relationships. Covers graphical checks and statistical tests such as the Kolmogorov-Smirnov test, variance-stabilizing reciprocal, square-root, and logarithmic transformations, outliers, and repeated measures.
 
                
Lecture 4 Data editing
List of nice things • Equal variance • Normal distribution • Linear relationship • Independent variables • Other requirements
Equal variance • Can be achieved by weighting the groups. • A transformation is often better. • If x is transformed into y, then the variance of y is equal across groups if:
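The condition is commonly derived with a first-order Taylor expansion (the delta method). The block below is a sketch under that assumption and is not necessarily the formula shown on the original slide; the symbols f and μ are introduced here for illustration.

```latex
\begin{align*}
y &= f(x), \qquad \mu = E(x) \\
\operatorname{var}(y) &\approx \bigl(f'(\mu)\bigr)^{2}\,\operatorname{var}(x) \\
\operatorname{var}(y) \text{ is constant} &\iff f'(\mu) \propto \frac{1}{\sqrt{\operatorname{var}(x)}}
\end{align*}
```

For example, if var(x) is proportional to μ^2, then f'(μ) is proportional to 1/μ, which gives f(x) = log(x).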
Linearizing transformations • Performed to find the connection between two variables
Normalizing Transformations • Many statistical methods assume normal distributions, but are fairly robust to departures from normality. • Transformations to equal variance and linear relations may fix the normal distribution issue as well
Example 1 • Is there a relation between the number of apples given to a teacher and the grade given by a teacher? • Normality, equal variance and linear relation? • Graphical indications! • Scatterplot and histograms • Standardized residuals (linear regression) • P-P plots (Q-Q plots) • Does it look nice?
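A minimal sketch of the standardized residuals mentioned above, using only the Python standard library. The apples and grades are made-up numbers, not the lecture's data, and "standardized" is assumed to mean the residual divided by the residual standard deviation; some packages also correct for leverage.

```python
import math
from statistics import mean

def standardized_residuals(x, y):
    """Fit y = a + b*x by least squares and return residuals divided by s."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx                      # slope
    a = y_bar - b * x_bar              # intercept
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    # Residual standard deviation with n - 2 degrees of freedom
    s = math.sqrt(sum(e ** 2 for e in residuals) / (len(x) - 2))
    return [e / s for e in residuals]

# Hypothetical data: apples given and grade received
apples = [0, 1, 2, 3]
grades = [1, 3, 2, 4]
z = standardized_residuals(apples, grades)
print([round(v, 3) for v in z])
```

Values far outside roughly ±2, or a pattern when plotted against x, suggest that one of the assumptions does not hold.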
Kolmogorov-Smirnov test • Tests whether a sample comes from a normal distribution. • Tests if the observed distribution function S(x) is significantly different from a hypothetical distribution function F(x). • The Lilliefors modification is used because the parameters of F(x) are estimated from the sample.
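The test statistic can be sketched as the largest vertical distance between S(x) and a normal F(x) fitted to the sample. This is a standard-library illustration with simulated data, not the lecture's own code; the critical value 0.886/√n is the large-sample approximation usually quoted for the Lilliefors test at the 5% level, and is used here as an assumption for n above about 30.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def ks_statistic_normal(sample):
    """Return D = max |S(x) - F(x)|, with F a normal fitted to the sample."""
    xs = sorted(sample)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    d = 0.0
    for i, x in enumerate(xs):
        f = fitted.cdf(x)
        # S(x) jumps from i/n to (i+1)/n at x, so check both sides of the step
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

random.seed(4)
n = 200
normal_sample = [random.gauss(50, 10) for _ in range(n)]
skewed_sample = [random.expovariate(1.0) for _ in range(n)]

d_normal = ks_statistic_normal(normal_sample)
d_skewed = ks_statistic_normal(skewed_sample)
critical = 0.886 / math.sqrt(n)   # approximate 5% Lilliefors critical value

print("normal sample D:", round(d_normal, 3))
print("skewed sample D:", round(d_skewed, 3))
print("approx. critical value:", round(critical, 3))
```

Because the mean and standard deviation are estimated from the same data, D must be compared with Lilliefors critical values rather than the ordinary Kolmogorov-Smirnov table.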
Transformations • Most statistical models are linear. • Equal variance. • Power transformations with exponent c: • c = 1: no transformation • c = -1: reciprocal • c = ½: square root • c -> 0: logarithmic
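One way to see why c -> 0 corresponds to the logarithm is the Box-Cox form of the power transformation, (x^c - 1)/c. Using this rescaled form is an assumption made here for illustration; the slide may simply have used x^c. The function name is hypothetical.

```python
import math

def power_transform(x, c):
    """Box-Cox style power transformation of a positive value x."""
    if x <= 0:
        raise ValueError("x must be positive")
    if abs(c) < 1e-12:
        return math.log(x)            # limiting case c -> 0
    return (x ** c - 1.0) / c

x = 5.0
for c in (1.0, 0.5, 0.1, 0.001, 0.0, -1.0):
    print(c, power_transform(x, c))
```

As c shrinks towards 0 the printed values approach log(5), and c = -1 gives 1 - 1/x, which is the reciprocal up to sign and shift.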
Example 3 • Pain caused by mechanical pressure. • Equal variance? • Normal distribution? • Linear relationship? • Independent variables? • Logarithmic transformation • Stabilizes the variance if var(x) is proportional to E(x)^2 • If x has an increasing slope • x is positive and positively skewed
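The variance-stabilizing claim for the logarithm can be checked with a small simulation. This sketch uses simulated log-normal data (not the pain data of the example) whose standard deviation is 20% of the mean, so var(x) is proportional to E(x)^2; the function name and parameters are illustrative.

```python
import math
import random
from statistics import variance

random.seed(1)

def simulate_group(mu, cv=0.2, n=2000):
    """Positive data with standard deviation cv * mu (variance proportional to mean^2)."""
    sigma = math.sqrt(math.log(1.0 + cv ** 2))
    # lognormvariate(m, s) has mean exp(m + s^2 / 2); choose m so the mean equals mu
    m = math.log(mu) - sigma ** 2 / 2.0
    return [random.lognormvariate(m, sigma) for _ in range(n)]

rows = []
for mu in (1.0, 10.0, 100.0):
    x = simulate_group(mu)
    raw_var = variance(x)
    log_var = variance([math.log(v) for v in x])
    rows.append((mu, raw_var, log_var))
    print(f"mean {mu:6.1f}  var(x) {raw_var:10.4f}  var(log x) {log_var:.4f}")
```

The raw variances differ by several orders of magnitude between groups, while the variances after the log transformation are nearly identical.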
Example 4 • Fuel consumption • Reciprocal transformation • Stabilizes the variance if var(x) is proportional to E(x)^4 • If x has an increasing slope • x is positive and positively skewed
Square-root transformation • Stabilize the variance if var(x) is proportional to E(x) • If the underlying mechanism follows a Poisson distribution
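A simulation sketch of the Poisson case, where var(x) equals E(x). The Poisson sampler below is Knuth's multiplication method, written out so that only the standard library is needed; the chosen means are arbitrary.

```python
import math
import random
from statistics import variance

def poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

rng = random.Random(0)
results = {}
for lam in (10, 40, 90):
    counts = [poisson(lam, rng) for _ in range(3000)]
    raw_var = variance(counts)
    sqrt_var = variance([math.sqrt(c) for c in counts])
    results[lam] = (raw_var, sqrt_var)
    print(f"mean {lam:3d}  var(x) {raw_var:7.2f}  var(sqrt x) {sqrt_var:.3f}")
```

The raw variance grows with the mean, while the variance of the square roots stays close to 1/4 for all three groups.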
Outliers • Check the data • Check the lab book • Trimmed mean • Common sense and your vast experience
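A small sketch of the trimmed mean, assuming the usual definition: sort the values, drop the same proportion from each end, and average the rest. The measurements are invented, with one value that looks like a misplaced decimal point.

```python
from statistics import mean

def trimmed_mean(values, proportion=0.1):
    """Mean after dropping the lowest and highest `proportion` of the values."""
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    xs = sorted(values)
    k = int(len(xs) * proportion)      # number of values cut from each end
    return mean(xs[k:len(xs) - k])

# Hypothetical measurements; 40.0 is a suspected typing error for 4.0
data = [4.1, 3.9, 4.0, 4.2, 3.8, 4.0, 4.1, 3.9, 4.0, 40.0]
print("ordinary mean:", mean(data))
print("10% trimmed mean:", trimmed_mean(data, 0.1))
```

The ordinary mean is pulled up to about 7.6 by the single outlier, while the 10% trimmed mean stays near 4.0.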
Repeated Measures • Measurements are repeated on the same subject • Variation from subjects can be identified in the model -> more accurate description
An Example • Between-subject Factor: • cognitive style: field-independent or field-dependent • Within-subject Factors: • Type: Form and Color • Condition: Normal, Congruent, and Incongruent • Within-subject factors are randomized
A suitable model • where y_ijkl represents the observation for the ith subject in the lth group (l = 1, 2), under the jth type condition (j = 1, 2), and the kth cue condition (k = 1, 2, 3) • Fixed effects: • α_j, β_k, γ_l represent the main effects of type, cue, and group, • (αβ)_jk, (αγ)_jl, (βγ)_kl, (αβγ)_jkl the interactions. • Random effects: • u_i represents the effect of subject i • ε_ijkl the residual
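Written out from the terms listed above, the model would plausibly take the following form. This is a reconstruction, not a copy of the slide's equation; the overall mean μ is an assumption, since it is not named in the list of effects.

```latex
\begin{equation*}
\begin{aligned}
y_{ijkl} = {} & \mu + \alpha_j + \beta_k + \gamma_l
  + (\alpha\beta)_{jk} + (\alpha\gamma)_{jl} + (\beta\gamma)_{kl} \\
  & + (\alpha\beta\gamma)_{jkl} + u_i + \varepsilon_{ijkl}
\end{aligned}
\end{equation*}
```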
ARGHH! Another assumption! • 1. Normality • 2. Equal variance • 3. Sphericity: • The variances of the differences between all pairs of the repeated measurements are equal. • A sufficient (but not necessary) condition is that the covariance matrix of the repeated measures has the so-called compound symmetry pattern: the variances of each repeated measurement are equal and the covariances between all pairs of repeated measures are equal.