
Spatial Statistics and Spatial Knowledge Discovery. First law of geography [Tobler]: Everything is related to everything, but nearby things are more related than distant things.




Presentation Transcript


    1. Spatial Statistics and Spatial Knowledge Discovery First law of geography [Tobler]: Everything is related to everything, but nearby things are more related than distant things. Drowning in Data yet Starving for Knowledge [Naisbitt-Rogers] Lecture 3: More Basic Statistics with R Pat Browne The earliest use of this quote is by John Naisbitt (www.naisbitt.com/) in his 1982 book Megatrends - he wrote "we are drowning in information, but we are starved for knowledge". In 1985, Rutherford D. Rogers, a librarian at Yale, was quoted in the New York Times: "We're drowning in information and starving for knowledge."

    2. Population & Sample Statistics often involves selecting a random (or representative) subset of a population called a sample.

    3. Degrees of freedom (df) http://www3.imperial.ac.uk/naturalsciences/research/statisticsusingr Total freedom for the first four numbers; no choice on the fifth number. Four degrees of freedom with 5 numbers. Generally, (n-1) df if we estimate the mean from a sample of size n. More generally, df is the sample size n minus the number of parameters estimated from the data.

    4. Degrees of Freedom We had total freedom in selecting the first four numbers, but we had no choice in selecting the fifth number. We have four degrees of freedom when selecting five numbers. In general we have (n-1) DOF if we estimate the mean from a sample of size n. DOF is the sample size, n, minus the number of parameters, p, estimated from the data.
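A minimal sketch of this idea in R (the four values and the target mean are invented for illustration): once the mean and n-1 values are fixed, the last value has no freedom left.

```r
# Four freely chosen values (made-up numbers)
x4 <- c(4, 7, 2, 9)
# Suppose the mean of five values must equal 5
m <- 5
# The fifth value is then forced: it must make the sum equal 5 * m
x5 <- 5 * m - sum(x4)
mean(c(x4, x5))   # exactly 5, with only 4 degrees of freedom used
```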

    5. Recall Permutations & Combinations P(n,r) = n!/(n-r)! The permutations (sequences) of a, b, and c taken 2 at a time number 3*2 = 6: <a,b>,<b,a>,<a,c>,<c,a>,<b,c>,<c,b>. C(n,r) = n!/(r!(n-r)!) The combinations (sets) of a, b, and c taken 2 at a time number (3*2)/(2*1) = 3: {a,b},{a,c},{b,c}. ab is a distinct permutation from ba, but they are the same combination.
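These counts can be checked directly in base R with factorial, choose, and combn:

```r
# P(3,2): ordered selections of 2 from 3
factorial(3) / factorial(3 - 2)     # 6 permutations
# C(3,2): unordered selections of 2 from 3
choose(3, 2)                        # 3 combinations
# List the combinations explicitly; combn returns one per column
combn(c("a", "b", "c"), 2)
```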

    6. Probability Calculations Conditional probability: P(A|B) = P(A ∩ B)/P(B) (probability of A, given B). Test for independence: P(A ∩ B) = P(A)P(B). Calculation of union: P(A ∪ B) = P(A) + P(B) − P(A ∩ B).
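A quick sketch verifying these rules on an assumed example, two fair coins, where events on different coins are independent:

```r
# Equiprobable sample space for two coin flips
S <- expand.grid(c1 = c("H", "T"), c2 = c("H", "T"))
A <- S$c1 == "H"   # event: first coin is heads
B <- S$c2 == "H"   # event: second coin is heads
# Independence: P(A and B) equals P(A)P(B)
mean(A & B) == mean(A) * mean(B)
# Union rule: P(A or B) = P(A) + P(B) - P(A and B)
mean(A | B) == mean(A) + mean(B) - mean(A & B)
# Conditional probability P(A|B)
mean(A & B) / mean(B)   # 0.5
```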

    7. Frequency Table One way of organizing raw data is to use a frequency table (or frequency distribution), which shows the number of times that an individual item occurs or the number of items that fall within a given range or interval.

    8. Frequency Table

    9. Histogram with class interval

    10. Random variables and probability distributions. Suppose you toss a coin two times. There are four possible outcomes: HH, HT, TH, and TT. Let the variable X represent the number of heads that result from this experiment. The variable X can take on the values 0, 1, or 2. In this example, X is a random variable because its value is determined by the outcome of a statistical experiment.

    11. Random variables and probability distributions. A probability distribution is a table (or an equation) that links each outcome of a statistical experiment with its probability of occurrence. The table below associates each outcome (the number of heads) with its probability; it is an example of a probability distribution.
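The two-coin distribution on this slide can be reproduced with dbinom (two trials, success probability 0.5):

```r
# Distribution of X = number of heads in two fair coin tosses
x <- 0:2
p <- dbinom(x, size = 2, prob = 0.5)
data.frame(heads = x, probability = p)   # 0.25, 0.50, 0.25
sum(p)                                    # the probabilities sum to 1
```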

    12. Mean The arithmetic mean is the sum of the values in a data set divided by the number of elements in that data set: x̄ = (Σ xi)/n, or for grouped data x̄ = (Σ fixi)/(Σ fi), where fi denotes the frequency of xi.

    13. Variance & Standard Deviation List A: 12,10,9,9,10 List B: 7,10,14,11,8 The mean (x̄) of both A and B is 10, but the values of A are more closely clustered around the mean than those in B (there is greater dispersion or spread in B). We use the standard deviation to measure this spread.
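The two lists from the slide can be compared directly in R; the means agree but the spread does not:

```r
# The two lists from the slide: identical means, different spread
A <- c(12, 10, 9, 9, 10)
B <- c(7, 10, 14, 11, 8)
c(mean(A), mean(B))   # both 10
c(sd(A), sd(B))       # B has the larger standard deviation
```

Note that R's sd uses the n-1 (sample) denominator, which ties back to the degrees-of-freedom slides.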

    14. Variance & Standard Deviation The variance is always non-negative and is zero only when all values are equal: variance = Σ(xi − x̄)²/n. The standard deviation is the square root of the variance.

    15. Variance of a frequency distribution http://en.wikipedia.org/wiki/Expected_value http://www.youtube.com/watch?v=j__Kredt7vY

    16. Median The median is the middle value. If the elements are sorted the median is: Median = valueAt[(n+1)/2] Median = average(valueAt[n/2], valueAt[n/2+1]) For odd and even n respectively.
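Both cases are handled automatically by R's median function; a short check with made-up values:

```r
# Odd n: the middle sorted value
median(c(9, 12, 10, 9, 10))   # 10
# Even n: the average of the two middle sorted values
median(c(7, 10, 14, 11))      # (10 + 11) / 2 = 10.5
```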

    17. Mode The mode is the class or class value which occurs most frequently. We can have bimodal or multimodal collections of data.

    18. Trials with 2 possible outcomes. Outcome = success or failure. Let p be the probability of success; then q = 1 - p is the probability of failure. Often we are interested in the number of successes without considering their order. The probability of exactly k successes in n repeated trials is: b(k; n, p) = C(n,k) p^k q^(n-k)

    19. Bernoulli Trials: Example John hits the target with probability p = 1/4, and fires n = 6 times. What is the probability John hits the target at least once?
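One way to answer this in R is via the complement: P(at least one hit) = 1 - P(no hits).

```r
# P(at least one hit in six shots), p = 1/4
1 - dbinom(0, size = 6, prob = 1/4)   # 1 - (3/4)^6, about 0.822
```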

    20. Bernoulli Trials: Example The probability that Mary hits the target is p = 1/4, and she fires n = 6 times. What is the probability Mary hits the target more than 4 times? In Excel: =(6)*((1/4)^5)*((3/4)^1)+(1/4)^6
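The same sum P(X = 5) + P(X = 6) can be computed in R with dbinom or pbinom, matching the spreadsheet formula (note that C(6,5) = 6 is the leading factor there):

```r
# P(more than 4 hits out of 6), p = 1/4
sum(dbinom(5:6, size = 6, prob = 1/4))
# Equivalently, the upper tail beyond 4:
pbinom(4, size = 6, prob = 1/4, lower.tail = FALSE)
```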

    21. Tossing Dice in R The rep function generates repeats: six one-sixths, the probability of a die landing on any one of its faces. die <- 1:6 p.die <- rep(1/6,6) The total probability sums to 1: sum(p.die)

    22. Tossing Dice in R die <- 1:6 p.die <- rep(1/6,6) s <- table(sample(die, size=1000, prob=p.die, replace=T)) barX <- barplot(s, ylim=c(0,200)) lbls <- sprintf("%0.1f%%", s/sum(s)*100) text(x=barX, y=s+10, label=lbls) Copy the above code and run it in R several times.

    23. Tossing Dice in R Represent the die as a vector with values 1 to 6: > die <- 1:6 Throw the die 10 times; note the replacement: > sample(die, size=10, prob=p.die, replace=T) [1] 1 1 1 2 1 6 6 2 5 1 Calculate the expected value: > sum(die*p.die) [1] 3.5 If we sample twice we usually get distinct samples: > sam1 <- sample(die, size=10, prob=p.die, replace=T) > sam2 <- sample(die, size=10, prob=p.die, replace=T)

    24. Tossing Dice in R R code to throw 1000 dice and make a bar chart of their values: s <- table(sample(die, size=1000, prob=p.die, replace=T)) lbls <- sprintf("%0.1f%%", s/sum(s)*100) barX <- barplot(s, ylim=c(0,200)) text(x=barX, y=s+10, label=lbls) Print s and sum(s): > s 1 2 3 4 5 6 160 155 170 173 164 178 > sum(s) [1] 1000

    25. Tossing Dice in R Expected value of a discrete random variable X is the weighted average of the values in the range of X. For a die it is: 1*(1/6)+2*(1/6)+3*(1/6)+4*(1/6)+5*(1/6)+6*(1/6) = 3.5 Or more simply: (1+2+3+4+5+6)/6 = 3.5

    26. Random Variable A random variable X on a finite sample space S is a function from S to the real numbers. Let S be the sample space of outcomes from tossing two coins. Then mapping a is: S={HH,HT,TH,TT} (assume HT≠TH) Xa(HH)=1, Xa(HT)=2, Xa(TH)=3, Xa(TT)=4 The range (image) of Xa is: S'={1,2,3,4} From: http://www.stats.gla.ac.uk/steps/glossary/probability_distributions.html A discrete random variable is one which may take on only a countable number of distinct values such as 0, 1, 2, 3, 4, ... Discrete random variables are usually (but not necessarily) counts. If a random variable can take only a finite number of distinct values, then it must be discrete. Examples of discrete random variables include the number of children in a family, the Friday night attendance at a cinema, the number of patients in a doctor's surgery, and the number of defective light bulbs in a box of ten. A coin is tossed ten times. The random variable X is the number of tails that are noted. X can only take the values 0, 1, ..., 10, so X is a discrete random variable. S = Sample Space (list of outcomes); n = size of the space (how many outcomes).

    27. Random Variable Let S be sample space of outcomes from tossing two coins, where we are interested in the number of heads. Mapping b is: S={HH,HT,TH,TT} Xb(HH)=2, Xb(HT)=1, Xb(TH)=1, Xb(TT)=0 The range (image) of X is: S’’={0,1,2}

    28. Random Variable A random variable is a function that maps a finite sample space to a numeric value. The numeric value has a finite probability space of real numbers, where probabilities are assigned to the new space according to the following rule: pi = P(xi) = sum of the probabilities of the points in S whose image is xi. More generally, a random variable is a function from a sample space to the measurable space of possible values of the variable.

    29. Random Variable The function assigning pi to xi can be given as a table called the distribution of the random variable. For an equiprobable space, pi = P(xi) = (number of points in S whose image is xi) / (number of points in S), for i = 1,2,3,...,n; this gives the distribution of X.

    30. Random Variable The equiprobable space generated by tossing a pair of fair dice consists of 36 ordered pairs(1): S={(1,1),(1,2),(1,3)...(6,6)} Let X be the random variable which assigns to each element of S the sum of the two integers: 2,3,4,5,6,7,8,9,10,11,12 (1)In a set of ordered pairs <2,2> only appears once, whereas <1,3> and <3,1> are considered distinct. These pairs all sum to 4, showing that there is not a 1:1 mapping between the sample space and the random variable: here, three elements in the sample space map to one element in the distribution of the random variable.

    31. Random Variable Continuing with the sum of the two dice: there is only one point whose image is 2, giving P(2)=1/36. There are two points whose image is 3, giving P(3)=2/36 (<1,2>≠<2,1>, but their sums are equal). Below is the distribution of X.
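The full distribution of the dice-sum can be built in R by enumerating the 36 ordered pairs:

```r
# Distribution of X = sum of two fair dice
S <- expand.grid(d1 = 1:6, d2 = 1:6)   # the 36 equiprobable ordered pairs
X <- S$d1 + S$d2
dist <- table(X) / nrow(S)
dist[c("2", "3", "7")]   # 1/36, 2/36, 6/36
```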

    32. Example: Random Variable A box contains 9 good items and 3 defective items (12 items in total). Three items are selected at random from the box. Let X be the random variable that counts the number of defective items in a sample. X can take the values 0-3. Below is the distribution of X. There are choose(9,3) = 84 samples of size 3 with 0 defective items.

    33. Example: Random Variable There are choose(12,3) different samples of size 3. There are choose(9,3) = 84 samples of size 3 with 0 defective. There are choose(9,2)*3 = 108 samples of size 3 with 1 defective. There are choose(3,2)*9 = 27 samples of size 3 with 2 defective. There is 1 sample of size 3 with 3 defective. Order is not important and there are no duplicates.
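These counts can be verified with choose, and they sum to the total number of samples:

```r
# Counts of defectives when 3 items are drawn from 9 good + 3 defective
counts <- c(choose(9, 3),                  # 84 samples with 0 defective
            choose(9, 2) * choose(3, 1),   # 108 samples with 1 defective
            choose(9, 1) * choose(3, 2),   # 27 samples with 2 defective
            choose(3, 3))                  # 1 sample with 3 defective
sum(counts) == choose(12, 3)               # 220 samples in total
```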

    34. Functions of a Random Variable If X is a random variable then so is Y=f(X). P(yk) = sum of the probabilities of the xi such that yk=f(xi).

    35. Expectation and variance of a random variable Let X be a discrete random variable over sample space S. X takes values x1,x2,x3,...,xt with respective probabilities p1,p2,p3,...,pt. An experiment which generates S is repeated n times and the numbers x1,x2,x3,...,xt occur with frequencies f1,f2,f3,...,ft (Σfi = n). If n is large then one expects fi/n ≈ pi. http://en.wikipedia.org/wiki/Expected_value http://www.youtube.com/watch?v=j__Kredt7vY

    36. Expectation of a random variable So the sample mean (Σ fixi)/n becomes Σ xipi. This final formula is the population mean, expectation, or expected value of X, denoted µ or E(X).

    37. Variance of a random variable The variance of X is denoted σ² or Var(X): Var(X) = Σ(xi − µ)²pi. The standard deviation is σ = √Var(X).

    38. Expected value, Variance, Standard Deviation E(X) = µ = µx = Σ xipi Var(X) = σ² = σ²x = Σ(xi − µ)²pi SD(X) = σx = √Var(X)
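These three formulas can be applied directly to the fair-die distribution from the earlier slides:

```r
# E(X), Var(X) and SD(X) for one fair die
x <- 1:6
p <- rep(1/6, 6)
mu   <- sum(x * p)             # 3.5
varX <- sum((x - mu)^2 * p)    # 35/12, about 2.917
sdX  <- sqrt(varX)             # about 1.708
```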

    39. Relation between population and sample mean. If we select a sample of size N at random from a population, then it is possible to show that the expected value of the sample mean m equals the population mean µ. This rule differs slightly for variance: the expected value of the (uncorrected) sample variance is (N-1)/N times the population variance (close to 1 for large N). The sample mean makes a good estimator of the population mean, as its expected value is the same as the population mean.

    40. Example: Random Variable A box contains 9 good items and 3 defective items (12 items in total). Three items are selected at random from the box. Let X be the random variable that counts the number of defective items in a sample. X can take the values 0-3. Below is the distribution of X. There are choose(9,3) = 84 samples of size 3 with 0 defective items.

    41. Example: Random Variable There are choose(12,3) different samples of size 3. There are choose(9,3) = 84 samples of size 3 with 0 defective. There are choose(9,2)*3 = 108 samples of size 3 with 1 defective. There are choose(3,2)*9 = 27 samples of size 3 with 2 defective. There is 1 sample of size 3 with 3 defective. Order is not important and there are no duplicates.

    42. Example: Random Variable & Expected Value µ is the expected value of the number of defective items in a sample of size 3. µ = E(X) = 0(84/220)+1(108/220)+2(27/220)+3(1/220) = 132/220 = ? Var(X) = 0²(84/220)+1²(108/220)+2²(27/220)+3²(1/220) − µ² = ? SD(X) = sqrt(Var(X)) = ?
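One way to fill in the question marks above, reusing the counts from the previous slide:

```r
# Expected value, variance, and SD for the defective-items example
x <- 0:3
p <- c(84, 108, 27, 1) / 220
mu   <- sum(x * p)              # 132/220 = 0.6
varX <- sum(x^2 * p) - mu^2     # about 0.663
sdX  <- sqrt(varX)              # about 0.814
```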

    43. Fair Game 1? If a prime number appears on a fair die the player wins that value; if a non-prime appears the player loses that value. Is the game fair? (Is E(X)=0?) S={1,2,3,4,5,6} E(X) = 2(1/6)+3(1/6)+5(1/6)+(-1)(1/6)+(-4)(1/6)+(-6)(1/6) = -1/6 Note: 1 is not prime, 2 is prime. A game is fair if E(X) = 0; if E(X) > 0 it is favourable to the player, and if E(X) < 0 it is unfavourable to the player.

    44. Fair Game 2? A player tosses two fair coins. The player wins €2 if two heads occur and €1 if one head occurs, and loses €3 if no heads occur. Find the expected value of the game. How would you test whether or not the game is fair? Is the game fair? Show the sample space and distribution.

    45. Fair Game 2? Sample space S = {HH,HT,TH,TT}; each point has probability ¼. X(HH) = 2, X(HT) = X(TH) = 1, X(TT) = -3 E(X) = 2(1/4)+1(2/4)-3(1/4) = 0.25 The game is fair if E(X)=0; this game favours the player because E(X)>0.
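The same expectation, computed from the four equiprobable outcomes:

```r
# Expected winnings for the two-coin game
payoff <- c(HH = 2, HT = 1, TH = 1, TT = -3)   # each outcome has probability 1/4
EX <- sum(payoff * 1/4)
EX            # 0.25 > 0, so the game favours the player
```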

    46. Distribution Example Five cards are numbered 1 to 5. Two cards are drawn at random. Let X denote the sum of the numbers drawn. Find (a) the distribution of X and (b) the mean, variance, and standard deviation. There are choose(5,2) = 10 ways of drawing two cards at random.

    47. Distribution Example The ten equiprobable sample points with their corresponding X-values are: {1,2}→3, {1,3}→4, {1,4}→5, {1,5}→6, {2,3}→5, {2,4}→6, {2,5}→7, {3,4}→7, {3,5}→8, {4,5}→9.

    48. Distribution Example(3) The distribution is: P(3) = P(4) = P(8) = P(9) = 1/10 and P(5) = P(6) = P(7) = 2/10.

    49. Distribution Example(4) The distribution is:
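The card-drawing distribution from slides 46-49 can be built with combn, which lists the ten equiprobable pairs:

```r
# X = sum of two cards drawn without replacement from cards 1..5
pairs <- combn(1:5, 2)                  # the 10 equiprobable draws, one per column
X <- colSums(pairs)                     # sums range from 3 to 9
dist <- table(X) / ncol(pairs)          # the distribution of X
mu <- sum(as.numeric(names(dist)) * as.numeric(dist))   # E(X) = 6
```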

    50. Identically Distributed variables Same probability distributions. A discrete random variable is one which may take on only a countable number of distinct values such as 0, 1, 2, 3, 4, ... Discrete random variables are usually (but not necessarily) counts. If a random variable can take only a finite number of distinct values, then it must be discrete. Examples of discrete random variables include the number of children in a family, the Friday night attendance at a cinema, the number of patients in a doctor's surgery, and the number of defective light bulbs in a box of ten. Two events A and B are statistically independent if the chance that they both happen simultaneously is the product of the chances that each occurs individually; equivalently, learning that one event occurs gives no information about whether the other event occurred. Two random variables X and Y are identically distributed if they have the same probability distribution. A collection of two or more random variables {X1, X2, ...} is independent and identically distributed if the variables have the same probability distribution and are independent. Theorem: the following two statements are equivalent: (1) the random variables X and Y are identically distributed; (2) FX(x) = FY(x) for every x.

    51. Binomial Distribution A random variable Xn is defined on a sample space S. We count the number of successful outcomes of n repeated trials of a success-or-failure experiment. The distribution of Xn is: P(Xn = k) = C(n,k) p^k q^(n-k), where the probability of success in a trial is p = 1 - q.

    52. Binomial Distribution E(Xn ) = np Var(Xn)=npq SD(Xn)=sqrt(Var(Xn))

    53. Binomial Distribution If a fair die is tossed 180 times the expected number of 6's is: µ = E(X) = np = 180(1/6) = 30 The standard deviation is: σ = sqrt(npq) = sqrt(180 × (1/6) × (5/6)) = sqrt(25) = 5
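Checking the two numbers in R with the binomial formulas E(X) = np and SD(X) = sqrt(npq):

```r
# Expected sixes and SD in 180 tosses of a fair die
n <- 180
p <- 1/6
q <- 1 - p
n * p              # 30 expected sixes
sqrt(n * p * q)    # standard deviation 5
```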

    54. Normal Distribution http://www.bmj.com/cgi/content/full/331/7521/903 The standard deviation (often SD) is a measure of variability. When we calculate the standard deviation of a sample, we are using it as an estimate of the variability of the population from which the sample was drawn. For data with a normal distribution, about 95% of individuals will have values within 2 standard deviations of the mean, the other 5% being equally scattered above and below these limits. Contrary to popular misconception, the standard deviation is a valid measure of variability regardless of the distribution. About 95% of observations of any distribution usually fall within the 2 standard deviation limits, though those outside may all be at one end. We may choose a different summary statistic, however, when data have a skewed distribution.

    55. The expected value is the mean of a sampling distribution of a statistic. The number of heads after a fair coin is tossed 6 times: E(X) = (0×1.5%)+(1×9.3%)+(2×23.4%)+(3×31.2%)+(4×23.4%)+(5×9.3%)+(6×1.5%) = 3 6 trials, Prob H = 0.5. http://www.youtube.com/watch?v=j__Kredt7vY

    56. L7: Review: Permutations & Combinations The number of distinguishable permutations of the word TITLE. The number of 2-permutations of the word HOGS. List the 2-combinations of the word HOGS. N = total number of letters; ni are the frequencies of each different letter. Need to be aware of double counting T: 5!/2! = 120/2 = 60. HOGS is a set, so we can use: Permutations(n,r) = n!/(n-r)! = 4!/(4-2)! = 4×3 = 12. {H,O},{H,G},{H,S},{O,G},{O,S},{G,S} (6 elements; combinations are sets).

    57. Machine Learning

    58. Correct and Incorrect Interpretations

    59. Data and a Linear Model (see Lab1) Regression in R on Youtube: http://www.youtube.com/watch?v=3gRQsBZSxUo&feature=more_related

    61. Measurements, Observations, Variables, Values

    62. Descriptive Statistics A good statistical model should be simpler than the original data, make the most of the data, and communicate accurately without distortion. The mean is a measure of central tendency. The median is the central value when values are sorted. The standard deviation is a measure of dispersion. When the distribution of values is skewed, the mean can be an unreliable measure of central tendency, and the median becomes the preferred reporting method.

    63. Descriptive Statistics The mean is sensitive to sample size.

    64. Descriptive Statistics

    66. Descriptive Statistics

    67. Normal Distribution in R

    68. Normal Distribution in R The height of one hundred people was measured in centimetres, with mean = 170, sd = 8. We can program this in R: ht <- seq(150,190,0.1) #Note type is "l" for line plot(ht,dnorm(ht,170,8), type="l",ylab="Probability density",xlab="height") dnorm gives the probability density of the normal distribution; pnorm gives its cumulative distribution function. Both accept mean and sd arguments.

    69. Normal Distribution in R > plot(ht,pnorm(ht,170,8), type="l",ylab="Cumulative Distribution Function",xlab="height") > plot(ht,dnorm(ht,170,8), type="l",ylab="Probability density",xlab="height") dnorm: given a set of values it returns the height of the probability density at each point. If you only give the points it assumes a mean of zero and a standard deviation of one; there are options to use different values for the mean and standard deviation. pnorm: given a number or a list it computes the probability that a normally distributed random number will be less than that number. This function also goes by the rather ominous title of the "Cumulative Distribution Function". It accepts the same options as dnorm.

    70. Z What is the probability that a randomly selected individual will be: taller than a particular height; shorter than a particular height; between two heights? We answer these questions using R's pnorm function. We first convert a height to a z value, where: z = (y − ȳ)/s

    71. Z

    72. Standard Normal Distribution Find the probability that someone is less than 160cm: z = (160-170)/8 = -1.25, pnorm(-1.25) = 0.1 Find the probability that someone is greater than 185cm: z = (185-170)/8 = 1.875, 1-pnorm(1.875) = 0.03
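The same two answers can be obtained without standardising by hand, since pnorm accepts mean and sd directly:

```r
# Heights ~ N(170, 8): tail probabilities via pnorm
pnorm(160, mean = 170, sd = 8)        # P(height < 160), about 0.106
1 - pnorm(185, mean = 170, sd = 8)    # P(height > 185), about 0.030
pnorm((160 - 170) / 8)                # same first answer via the z-score
```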

    73. T-Test The t-test assesses whether the means of two groups are statistically different from each other. If there is a less than 5% chance (p-value < 0.05) of getting the observed differences by chance, we reject the null hypothesis and say we found a statistically significant difference between the two groups.
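A sketch of this in R using t.test on two simulated groups; the group sizes, means, and seed are assumptions chosen so the difference is clearly detectable:

```r
# Two-sample t-test on made-up data (means 10 vs 14, n = 30 each)
set.seed(42)
groupA <- rnorm(30, mean = 10, sd = 2)
groupB <- rnorm(30, mean = 14, sd = 2)
res <- t.test(groupA, groupB)
res$p.value < 0.05    # TRUE here: reject the null hypothesis of equal means
```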

    74. T-Test

    75. Correlation

    76. Correlation

    77. Confidence Intervals A confidence interval spans a value below and a value above the sample mean. Confidence intervals are used to infer the mean of a wider population from a sample. A 95% interval means that if the study were conducted 100 times, about 95 of the computed intervals would contain the population mean. Confidence intervals are wider if the sample is small and if the data are highly variable.

    78. Confidence Intervals A survey was conducted on the rate of work-related stress in a 12-month period (per 100,000 employed). The mean was 780 per 100,000 employed and the confidence limits are 700 to 860. This means that, with 95% confidence, the population mean number of people self-reporting work-related stress in the 12 months falls between these values.
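A sketch back-checking the slide's interval: assuming it is a standard 95% interval of the form mean ± 1.96 × SE, the implied standard error can be recovered from the limits.

```r
# Slide values: mean 780, 95% limits 700 and 860 (per 100,000 employed)
m  <- 780
se <- (860 - 700) / (2 * 1.96)          # implied standard error, about 40.8
c(lower = m - 1.96 * se, upper = m + 1.96 * se)   # reproduces 700 and 860
```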

    79. Confidence Intervals

    80. simpleR: Using R for Introductory Statistics, by John Verzani. Topics: Univariate Data; Bivariate Data; Linear Regression; Random Data; Simulations; Exploratory Data Analysis; Confidence Interval Estimation; Hypothesis Testing; Two-sample Tests; Regression Analysis; Multiple Linear Regression; Analysis of Variance.

    81. Correct and Incorrect Interpretations

    82. Data and a Linear Model (see Lab1)
