Scientific Methods 1

Scientific Methods 1. ‘Scientific evaluation, experimental design & statistical methods’ COMP80131 Lecture 6: Statistical Methods-Significance. Barry & Goran. www.cs.man.ac.uk/~barry/mydocs/myCOMP80131.

Presentation Transcript


  1. Scientific Methods 1 ‘Scientific evaluation, experimental design & statistical methods’ COMP80131 Lecture 6: Statistical Methods-Significance Barry & Goran www.cs.man.ac.uk/~barry/mydocs/myCOMP80131 COMP80131-SEEDSM6

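  2. Continuous random processes • Characterised by probability density functions (pdf). • Uniform pdf: the prob of the random variable x lying between a and b is the area under pdf(x) between a and b. [figure: uniform pdf(x) against x, with a and b marked] • Gaussian (Normal) pdf with mean m & std dev σ: 68% of the probability lies within m ± σ, 95.5% within m ± 2σ and 99.7% within m ± 3σ. [figure: Gaussian pdf(x) against x, with m - σ, m and m + σ marked]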
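
The 68%, 95.5% and 99.7% figures on slide 2 can be checked numerically. The following is a minimal Python sketch rather than part of the original slides; it assumes the standard identity that a Gaussian variable lies within k standard deviations of its mean with probability erf(k/√2), and the function name `prob_within` is made up for illustration.

```python
import math

def prob_within(k):
    """Probability that a Gaussian variable lies within k std devs of its mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within m ± {k}σ: {100 * prob_within(k):.2f}%")
```

The result does not depend on m or σ, which is why the slide can quote a single set of percentages for every Gaussian pdf.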

  3. pdf & Histograms
     Ru = rand(10000,1);    % 10000 uniform samples
     hist(Ru,20);
     Rg = randn(10000,1);   % 10000 Gaussian samples with m=0, std=1
     figure;                % new window, so the first histogram is kept
     hist(Rg,20);

  4. Convert histogram to estimate of pdf • Divide each column by the number of samples. • Then divide by the width of the bins, so that the total area of the columns is 1. • For a better approximation, increase the number of bins (and the number of samples).

  5. MATLAB illustration: histogram as pdf estimate
     Rg = randn(100000,1);          % 100000 Gaussian samples with m=0, std=1
     widthBin = 0.2;
     X = -4 : widthBin : 4;
     H = hist(Rg,X);                % histogram with bins centred on elements of X
     figure(2); bar(X,(H/100000)/widthBin);
     ylabel('pdf estimate');
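
The same histogram-to-pdf conversion can be sketched in Python using only the standard library. This is an illustrative analogue of the MATLAB on slide 5, not a translation supplied by the lecturers: the function name `pdf_estimate`, the seed and the clamping of out-of-range values into the end bins are all assumptions made here.

```python
import random

def pdf_estimate(samples, lo=-4.0, hi=4.0, width=0.2):
    """Histogram-based pdf estimate with bins centred on lo, lo+width, ..., hi."""
    n_bins = round((hi - lo) / width) + 1
    counts = [0] * n_bins
    for x in samples:
        i = round((x - lo) / width)        # index of the nearest bin centre
        i = min(max(i, 0), n_bins - 1)     # out-of-range values go into the end bins
        counts[i] += 1
    centres = [lo + i * width for i in range(n_bins)]
    # Divide by the number of samples, then by the bin width.
    estimate = [c / len(samples) / width for c in counts]
    return centres, estimate

random.seed(1)  # arbitrary fixed seed so the run is repeatable
samples = [random.gauss(0.0, 1.0) for _ in range(100000)]
centres, estimate = pdf_estimate(samples)
print(f"pdf estimate at x = 0: {estimate[20]:.3f}")
```

Because each count is divided by both the sample count and the bin width, the bar areas sum to 1, which is what makes the result comparable with a true pdf.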

  6. Gaussian (normal) pdf • Measurements {xi} of many naturally occurring phenomena tend to be normally distributed with some mean m & std σ. • Let zi = (xi - m)/σ. • Then {zi} will have a standard normal pdf with mean = 0 & std = 1.
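
A small Python sketch of the standardisation on slide 6 follows. The values m = 5 and σ = 2, the sample size and the seed are hypothetical choices for illustration only.

```python
import random
import statistics

random.seed(2)          # arbitrary fixed seed
m, sigma = 5.0, 2.0     # hypothetical population mean and std
x = [random.gauss(m, sigma) for _ in range(20000)]

# Standardise each measurement: z_i = (x_i - m) / sigma
z = [(xi - m) / sigma for xi in x]

z_mean = statistics.fmean(z)
z_std = statistics.pstdev(z)
print(f"mean of z = {z_mean:.3f}, std of z = {z_std:.3f}")
```

With a finite sample the mean and std of z are only close to 0 and 1, not exactly equal to them.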

  7. Plot true standard normal pdf [figure: Gaussian pdf against x, for x from -4 to 4]
     Mean = 0; Std = 1;
     K = 1/( Std*sqrt(2*pi) );
     X = -4*Std : widthBin : 4*Std;   % widthBin = 0.2, as set on slide 5
     for I = 1:length(X)
         G(I) = K * exp( -(X(I)-Mean)^2 / (2*Std^2) );
     end
     figure(4); plot(X,G); ylabel('pdf');

  8. Cumulative distribution function (cdf) • Probability of a Gaussian variable (m=0, std=1) being < x. • No closed-form formula for this. • Use MATLAB function: normcdf(X,m,std)
     Plot Gaussian cdf:
     X = -4:0.1:4;
     C = normcdf(X,0,1);
     figure(1); plot(X,C); grid on;
     xlabel('x'); ylabel('prob that var < x');

  9. Complementary Gaussian cdf [figure: prob that var > x against x, for x from -4 to 4] • This is just 1 - normcdf(x,m,σ). • It is the prob of a Gaussian random variable (mean = m, std = σ) being > x.

  10. Complementary error function • Some call the complementary Gaussian cdf (m=0, σ=1) the ‘complementary error function’ Q(z). • But ‘erfc’ is also called this. • Q(z) = comp-Gaussian cdf = 0.5 × erfc(z/√2). • Used to rely on tables & graphs of Q(z). • When m ≠ 0 or σ ≠ 1, use Q((z-m)/σ).
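
The relation between Q(z), erfc and the Gaussian cdf can be sketched in Python as below. `math.erfc` and `statistics.NormalDist` are standard-library features; the helper names `q_function` and `comp_cdf` are made up here, and `NormalDist(m, sigma).cdf(x)` is used as a stand-in for MATLAB's normcdf(x,m,std).

```python
import math
from statistics import NormalDist

def q_function(z):
    """Complementary cdf of the standard Gaussian: prob(var > z) for m=0, std=1."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def comp_cdf(x, m=0.0, sigma=1.0):
    """Complementary cdf for general mean m and std sigma, via Q((x - m)/sigma)."""
    return q_function((x - m) / sigma)

# Cross-check against 1 - cdf for the standard Gaussian.
print(q_function(1.0), 1.0 - NormalDist().cdf(1.0))
```

Using erfc directly, rather than subtracting a cdf from 1, keeps more precision far out in the tail, where the cdf is very close to 1.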

  12. Use of ‘normcdf’ function [figure: pdf with D & E marked on the x-axis] Prob of random var being between D & E is: normcdf(E, m, σ) - normcdf(D, m, σ)

  13. Tail of distribution [figure: pdf with D marked on the x-axis] Prob of random variable being greater than D is: 1 - normcdf(D, m, σ)
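
The interval and tail probabilities of slides 12 and 13 can be sketched in Python as differences of cdf values. The function names are hypothetical, and `statistics.NormalDist` stands in for MATLAB's normcdf.

```python
from statistics import NormalDist

def prob_between(d, e, m=0.0, sigma=1.0):
    """Prob of a Gaussian variable (mean m, std sigma) lying between d and e."""
    dist = NormalDist(m, sigma)
    return dist.cdf(e) - dist.cdf(d)

def prob_greater(d, m=0.0, sigma=1.0):
    """Prob of a Gaussian variable (mean m, std sigma) being greater than d."""
    return 1.0 - NormalDist(m, sigma).cdf(d)

print(f"prob(-1 < z < 1) = {prob_between(-1.0, 1.0):.4f}")
print(f"prob(z > 2)      = {prob_greater(2.0):.5f}")
```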

  14. An Engineering Question [figure: voltage against time t, with levels +1 and +1/2 marked] • Rectangular 1 V & 0 V pulses are used to transmit a binary signal. • Affected by additive white Gaussian noise (AWGN). • Mean of noise = 0 & power (variance) σ² = 0.01. • Estimate the bit-error probability. • A bit-error may occur if noise adds a voltage > 0.5 V to 0 V or < -0.5 V to 1 V. • Assume same no. of 1’s & 0’s.

  15. Solution
     prob(error) = prob(bit = 0) × prob(noise > 0.5) + prob(bit = 1) × prob(noise < -0.5)
                 = 0.5 × prob(noise > 0.5) + 0.5 × prob(noise < -0.5)
                 = prob(noise > 0.5)   because of symmetry
                 = 1 - normcdf(0.5, 0, 0.1) = 2.9 × 10⁻⁷
     Or, using the graph of Q(z/σ) on the next page: prob(error) = Q(0.5/σ) = Q(0.5/0.1) = Q(5) ≈ 3 × 10⁻⁷
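
The bit-error probability can be reproduced with a few lines of Python. This sketch evaluates Q(0.5/σ) through `math.erfc`; the variable names are illustrative, and the 0.5 V threshold and variance 0.01 are taken from the question on slide 14.

```python
import math

variance = 0.01                 # noise power from the question
sigma = math.sqrt(variance)     # noise std = 0.1
threshold = 0.5                 # noise must exceed this to flip a 0 V pulse

# prob(error) = prob(noise > 0.5) = Q(0.5 / sigma)
p_error = 0.5 * math.erfc((threshold / sigma) / math.sqrt(2))
print(f"bit-error probability = {p_error:.2e}")
```

Evaluating the tail with erfc avoids the subtraction 1 - cdf, which loses precision when the answer is as small as it is here.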

  16. / Q(z/) z/ COMP80131-SEEDSM6

  17. Back to sampling • A population has true mean μ & std σ. • Take a sample of N measurements from it, say N = 50. • Calculate sample-mean m1 & std s1. • If population is Gaussian, pdf of m1 will be Gaussian with mean = μ & std = s1 = σ/√N. • Can confirm by taking more samples to obtain sample-means & stds: m1, m2, …, mM & s1, s2, …, sM. • Estimate mean & std of sample-means. • Also mean & std of sample-variances (later).
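
The claim that the sample-mean has std σ/√N can be checked by simulation, as slide 17 suggests. The Python sketch below is one way of doing that; μ = 10, σ = 2, M = 2000 and the seed are hypothetical values chosen only for illustration.

```python
import math
import random
import statistics

random.seed(3)            # arbitrary fixed seed
mu, sigma = 10.0, 2.0     # hypothetical population mean and std
N, M = 50, 2000           # sample size and number of repeated samples

sample_means = []
for _ in range(M):
    sample = [random.gauss(mu, sigma) for _ in range(N)]
    sample_means.append(statistics.fmean(sample))

mean_of_means = statistics.fmean(sample_means)
std_of_means = statistics.stdev(sample_means)
expected_std = sigma / math.sqrt(N)
print(f"mean of sample-means = {mean_of_means:.3f} (population mean {mu})")
print(f"std of sample-means  = {std_of_means:.3f} (sigma/sqrt(N) = {expected_std:.3f})")
```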

  18. Significance testing • Assume pop-mean μ (‘mu’) may change with pop-std fixed at σ. • Assume we know pop-std σ & that it will not change. • We can only take one sample of 50 values. • Calculate m1 to decide whether pop-mean has changed. • Null Hypothesis: it has not changed, i.e. new pop-mean μNew = μ. • If Null Hyp is true, pdf of sample-mean is on next slide.

  19. Concept of a ‘null-hypothesis’ • A null-hypothesis is an assumption that is made and then tested by a set of experiments designed to reveal that it is likely to be false, if it is false. • Testing is done by considering how probable the results are, assuming the null hypothesis is true. • If the results appear very improbable the researcher may conclude that the null-hypothesis is likely to be false. • This is usually the outcome the researcher hopes for when he or she is trying to prove that a new technique is likely to have some value.

  20. pdf of sample-mean m1 [figure: Gaussian pdf of m1, centred on μ with std s1 = σ/√50; horizontal axis marked from μ - 2s1 to μ + 4s1] Assume the value we got was m1 = μ + 2.5s1. How unlikely is that if the Null Hyp is true?

  21. p-value • “Probability of obtaining a test result at least as extreme as the one that was actually observed, assuming that the null hypothesis is true”. • Reject the null hypothesis if the p-value is less than some value α (significance level) which is often 0.05 or 0.01. • When the null-hypothesis is rejected, the result is said to be statistically significant. • Here the (one-sided) p-value is 1 - normcdf(m1, μ, s1) with s1 = σ/√N • = 1 - normcdf(μ + 2.5s1, μ, s1) • = 1 - normcdf(2.5, 0, 1) = 0.0062 • This is less than 0.01, so reject the NH at the 1% significance level. • Conclude that the mean has changed.
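
The p-value calculation on slide 21 can be sketched in Python as follows. The function name `p_value_upper` and the numbers μ = 100, σ = 15 are hypothetical; only the offset of 2.5 × s1 and N = 50 come from the slides, and the result depends on that offset alone.

```python
import math
from statistics import NormalDist

def p_value_upper(m1, mu, sigma, n):
    """One-sided p-value: prob of a sample-mean >= m1 if the pop-mean is still mu."""
    s1 = sigma / math.sqrt(n)            # std of the sample-mean
    return 1.0 - NormalDist(mu, s1).cdf(m1)

mu, sigma, n = 100.0, 15.0, 50           # hypothetical population values
s1 = sigma / math.sqrt(n)
p = p_value_upper(mu + 2.5 * s1, mu, sigma, n)
print(f"p-value = {p:.4f}")
print("reject null hypothesis at 1% level" if p < 0.01 else "cannot reject at 1% level")
```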

  22. Our 2 assumptions • That was easy because we made 2 assumptions: population is Gaussian & pop-std is known to us. • Now need to eliminate these 2 assumptions. • We have some help from the Central Limit Theorem:

  23. Central Limit Theorem • If samples of size N are ‘randomly’ chosen from a pop with mean μ & std σ, the pdf of their mean, m1 say, approaches a Normal (Gaussian) pdf with mean μ & std σ/√N as N is made larger & larger. • Regardless of whether the population is Gaussian or not! • Previous example can be made to work for non-Gaussian pop provided N is ‘large enough’. • More on this next week.
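
A quick simulation illustrates the Central Limit Theorem for a clearly non-Gaussian population. This Python sketch assumes a uniform population on [0, 1), which has mean 0.5 and std √(1/12); the choice of population, N = 50, M = 5000 and the seed are illustrative assumptions, not values from the lecture.

```python
import math
import random
import statistics

random.seed(4)                     # arbitrary fixed seed
N, M = 50, 5000                    # sample size and number of samples
mu = 0.5                           # mean of the uniform population on [0, 1)
sigma = math.sqrt(1.0 / 12.0)      # std of the uniform population
s1 = sigma / math.sqrt(N)          # std of the sample-mean predicted by the CLT

means = [statistics.fmean([random.random() for _ in range(N)]) for _ in range(M)]

# For a Gaussian pdf about 68% of values lie within 1 std and 95.5% within 2.
within_1 = sum(abs(m - mu) < s1 for m in means) / M
within_2 = sum(abs(m - mu) < 2 * s1 for m in means) / M
print(f"fraction within 1 std: {within_1:.3f}, within 2 std: {within_2:.3f}")
```

If the two fractions are close to the Gaussian values, the sample-means are behaving as the theorem predicts even though the individual measurements are uniformly distributed.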

  24. Another example • Assume we wish to find out if a proposed technique designed to benefit users of a system is likely to have any value. • Divide the users into two groups and offer the proposed technique to one group and something different to the other group. • The null-hypothesis would be that the proposed technique offers no measurable advantage over the other techniques.

  25. The testing • This would be carried out by looking for differences between the sets of results obtained for each of the two groups. • Careful experimental design will try to eliminate differences not caused by the techniques being compared. • Must take a large number of users in each group & randomize the way the users are assigned to groups. • Once other differences have been eliminated as far as possible, any remaining difference will hopefully be indicative of the effectiveness of the techniques being investigated. • The vital question is whether the remaining differences are likely to be due to the advantages of the new technique, or to the inevitable random variations that arise from the other factors. • Are the differences statistically significant? • Can employ a statistical significance test to find out.

  26. Failure of the experiment • If the results are not found to look improbable under the null-hypothesis, i.e. if the differences between the two groups are not statistically significant, then no conclusion can be drawn. • The null-hypothesis could be true, or it could still be false. • It would be a mistake to conclude that the ‘null-hypothesis’ has been proved likely to be true in this circumstance. • It is quite possible that the results of the experiment give insufficient evidence to draw any conclusions at all.

  27. Question: fair coin test • Checking whether a coin is fair. • Suppose we obtain heads 14 times out of 20 flips. • The p-value for this test result would be the probability of a fair coin landing on heads at least 14 times out of 20 flips. • From the binomial distribution formula (Lecture 4), this is: (20C14 + 20C15 + 20C16 + 20C17 + 20C18 + 20C19 + 20C20) / 2^20 = 0.058 • This is the probability that a fair coin would give a result as extreme or more extreme than 14 heads out of 20.
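
The binomial sum on slide 27 is easy to evaluate exactly. A minimal Python sketch, using `math.comb` from the standard library (the function name `p_at_least` is made up for illustration):

```python
from math import comb

def p_at_least(k, n):
    """Prob of a fair coin giving at least k heads in n flips."""
    return sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n

p = p_at_least(14, 20)
print(f"p-value for 14 or more heads out of 20 = {p:.3f}")
```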

  28. Significance test for fair coin question • Reject null-hypothesis if p-value ≤ α. • If α = 0.05, rejection of null-hypothesis is: “at the 5% (significance) level”. • Probability of wrongly rejecting null-hypothesis (Type 1 error) will be equal to α. • This is considered sufficiently low. • In this case, p-value > 0.05, therefore observation is consistent with null-hypothesis & we cannot reject it. • Cannot conclude that coin is likely to be unfair. • But we have NOT proved that coin is likely to be fair. • 14 heads out of 20 flips can be ascribed to chance alone. • It falls within the range of what could happen 95% of the time with a fair coin.
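
To see where the 5% boundary lies for this one-sided coin test, the tail probabilities can be scanned for the smallest number of heads that would lead to rejection. The Python sketch below is an illustration under the slide's rule "reject if p-value ≤ α"; the names `p_at_least`, `reject_14` and `critical` are hypothetical.

```python
from math import comb

def p_at_least(k, n):
    """Prob of a fair coin giving at least k heads in n flips."""
    return sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n

alpha = 0.05
n = 20

# Decision for the observed result of 14 heads.
reject_14 = p_at_least(14, n) <= alpha

# Smallest number of heads whose p-value falls at or below alpha.
critical = min(k for k in range(n + 1) if p_at_least(k, n) <= alpha)

print(f"reject for 14 heads? {reject_14}")
print(f"smallest count of heads that would be rejected at the 5% level: {critical}")
```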

  29. Questions from Lecture 2 • Analyse the fictitious exam results & comment on features. • Compute means, stds & vars for each subject & histograms for the distributions. • Make observations about performance in each subject & overall. • Do marks support the hypothesis that people good at Music are also good at Maths? • Do they support the hypothesis that people good at English are also good at French? • Do they support the hypothesis that people good at Art are also good at Maths? • If you have access to only 50 rows of this data, investigate the same hypotheses. • What conclusions could you draw, and with what degree of certainty?

  30. Questions from L4 • A patient goes to a doctor with a bad cough & a fever. The doctor needs to decide whether he has ‘swine flu’. Let statement S = ‘has bad cough and fever’ & statement F = ‘has swine flu’. The doctor consults his medical books and finds that about 40% of patients with swine-flu have these same symptoms. Assuming that, currently, about 1% of the population is suffering from swine-flu and that currently about 5% have bad cough and fever (due to many possible causes including swine-flu), we can apply Bayes theorem to estimate the probability of this particular patient having swine-flu. • A doctor in another country knows from his text-books that for 40% of patients with swine-flu, the statement S, ‘has bad cough and fever’ is true. He sees many patients and comes to believe that the probability that a patient with ‘bad cough and fever’ actually has swine-flu is about 0.1 or 10%. If there were reason to believe that, currently, about 1% of the population have a bad cough and fever, what percentage of the population is likely to be suffering from swine-flu?
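
Both questions on slide 30 are applications of Bayes theorem, P(F|S) = P(S|F) × P(F) / P(S). The Python sketch below is one possible working, not an official solution; the function name `bayes_posterior` is made up, and the probabilities are those quoted in the questions.

```python
def bayes_posterior(p_s_given_f, p_f, p_s):
    """Bayes theorem: P(F|S) = P(S|F) * P(F) / P(S)."""
    return p_s_given_f * p_f / p_s

# First question: P(S|F) = 0.40, P(F) = 0.01, P(S) = 0.05; find P(F|S).
p_f_given_s = bayes_posterior(0.40, 0.01, 0.05)

# Second question: P(S|F) = 0.40, P(F|S) = 0.10, P(S) = 0.01; rearrange for P(F).
p_f = 0.10 * 0.01 / 0.40

print(f"P(swine flu | cough and fever) = {p_f_given_s:.2f}")
print(f"P(swine flu) in the other country = {100 * p_f:.2f}%")
```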
