227.407 - PowerPoint PPT Presentation

oshin
Presentation Transcript

    1. 1 227.407 Biometrics Alasdair Noble a.d.noble@massey.ac.nz AgHortA 2.72 Xt 4351 Introduce myself Stats Group in IIS&T Statistical Consulting part of my job Work with Mark Stevenson, Cord Heuer, Nigel French and their students.

    2. 2 French Veterinary Epidemiology! For the non-Francophiles:- The chicken sneezes and all of the others rush away and the farmer says:- “No – that’s not funny – you understand”

    3. 3 Office Hours 8.00am – 9.30am Tuesdays Or any other time that I am free. I would encourage you to come and ask if you are having problems I am lecturing 2 papers for the first half then only 227.407 Thursdays will not be a great time to catch me as I am very likely to be in Wellington – Explain Lunchtime is usually 12.00 – 1.00 In early - go home early

    4. 4 Introduction Text Book Aviva Petrie and Paul Watson Good points – Bad points – Computing Minitab R (Rcmdr) Discuss Text book Then Demonstrate Excel, Minitab, R. Data Petrels. Sure you can use my data. The 723 bycatch white-chinned petrels were all made available by Christopher J R Robertson and I collected all the data. The bycatch birds were caught predominantly by bottom longliners, but also by tuna longliners, squid trawlers and general fish trawlers between 2000-2003. The main conclusions reached were that the cluster of bycatch white-chinned petrels caught close to Antipodes Island during the breeding season ('Antipodes Island cluster group') were significantly larger in most external measurements than the cluster caught close to the Auckland Islands during the breeding season ('Auckland Island cluster group'). Birds caught during the breeding season were used because the current literature suggests petrels stay closer to breeding grounds during the breeding season. Satellite-tracked white-chinned petrels in the Atlantic and Indian Oceans stayed relatively close to breeding grounds during the breeding season. Using discriminant analysis we obtained two functions (after cross validation): 1) a function that could discriminate between 'Auckland and Antipodes Island cluster group' males 93.5% using culmen length and tail length (a positive value = Antipodes cluster group, negative = Auckland cluster group); 2) a function that could discriminate between 'Auckland and Antipodes Island cluster group' females 92.0% using head and bill length, culmen depth at the base and wing length (a positive value = Antipodes cluster group, negative = Auckland cluster group). I also have a function for differentiating the 'Auckland and Antipodes Island cluster groups' if the sex is not known using head and bill length, tarsus length and tail length (after cross validation 91.4%).
I could also discriminate between 'Auckland Island cluster group' males and females 95.5% using head and bill length and culmen depth at the base; and between 'Antipodes Island cluster group' males and females 91.8% using head and bill length, head width, culmen depth at the base and right MTC length. We also found that the 'Antipodes Island cluster group' related closest in size to study skins of specimens collected from the Antipodes Island, and the 'Auckland Island cluster group' related closest in size to study skins of specimens collected from the Auckland Islands. We found this by putting the study skin measurements for each skin in the functions for differentiating Auckland and Antipodes cluster group males and females. This suggested that the bycatch birds caught close to those breeding Islands were likely to be from those Islands. In the lab, if we know the sex (which is easy on bycatch birds via dissection), then we can use the functions for discriminating 'Auckland and Antipodes Island cluster group' males and females to give an indication as to which breeding population the birds are from. Based on these results and the taxonomy of the species, I suggest there are two groups of different sized white-chinned petrels in NZ waters. I can only make an assumption, as this process now needs to be done on breeding birds at the Auckland and Antipodes Islands to determine if there really is a difference between the two populations. If there is a real difference then there could be two (or more) subspecies or species, though it is impossible to determine at this stage. More research is needed on NZ white-chinned petrels, i.e. measurements taken from live birds on each island during the breeding season and feathers for DNA analysis. Some breeding biology and life history data would be necessary as well.
I also found that there was only a small amount of error between two observers measuring the same sample of white-chinned petrels, and that this error was biologically insignificant. Therefore, if the same measuring techniques are used, data from two or more observers can be pooled. Anyway, those are most of the main conclusions I reached with this research. Feel free to contact me for any more information. Cheers Mark Fraser

    5. 5 Data Analysis Exploratory Looking for features of the data, some may be expected, some may not. Inferential (we will look at this later) Confirming hypotheses previously posited

    6. 6 EDA We may have some preconceived ideas which may be investigated or we may have little idea. Graphical displays and simple statistics are all that are needed. Conclusions drawn are tentative.

    7. 7 Graphs Univariate: Dotplot Boxplot Stem and Leaf Plot Histogram Frequency/Relative Frequency Distributions Bargraph Pie Chart Time Series Plot Bivariate Scatter Plot

    18. 18 Numerical Statistics Numbers calculated from the data to summarise it. Location Mean Median Geometric mean Mode Spread Standard deviation (Variance) Interquartile Range
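
As a rough illustration of the summaries listed on this slide, the sketch below computes each one with Python's standard `statistics` module. The sample values are hypothetical (they are not the petrel data), and quartile conventions differ between packages, so the interquartile range may not match Minitab or R exactly.

```python
import statistics

# Hypothetical sample (not the course data): eight tail lengths in mm.
data = [112, 115, 115, 118, 120, 121, 124, 131]

# Measures of location
mean = statistics.mean(data)
median = statistics.median(data)
geometric_mean = statistics.geometric_mean(data)
mode = statistics.mode(data)

# Measures of spread (sample versions, dividing by n - 1)
variance = statistics.variance(data)
std_dev = statistics.stdev(data)

# Interquartile range: upper quartile minus lower quartile.
# This uses the module's default quartile method; other software may differ.
q1, _q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

print(f"mean={mean:.2f} median={median} geometric mean={geometric_mean:.2f} mode={mode}")
print(f"variance={variance:.2f} sd={std_dev:.2f} IQR={iqr:.2f}")
```

The standard deviation is the square root of the variance, which is why the slide lists variance in brackets beside it.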

    19. 19 Numerical Statistics These are only summaries. Do not focus on means.

    20. 20 Tables Can be very powerful if used carefully Useful for categorical variables Often contain “Counts”

    23. 23 Definitions of Probability Long Run Relative Frequency Repeated experiments or observation “Equally Likely Outcomes” Dice, Grecian urns, packs of cards Subjective Probability Your degree of belief
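
The long run relative frequency idea can be sketched with a small simulation. This is only an illustration, assuming a fair six-sided die; the function name `relative_frequency_of_six` is made up for this example.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable

def relative_frequency_of_six(n_rolls):
    """Roll a fair six-sided die n_rolls times; return the proportion of sixes."""
    sixes = sum(1 for _ in range(n_rolls) if random.randint(1, 6) == 6)
    return sixes / n_rolls

# With more rolls the proportion settles towards 1/6, the
# "equally likely outcomes" value for a fair die.
for n in (10, 100, 1000, 100000):
    print(n, round(relative_frequency_of_six(n), 4))
```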

    24. 24 Notation Probability of an Event A is written P(A)

    25. 25 Simple Rules

    27. 27 Conditional Probability

    28. 28 More Conditional Probability and Bayes' Theorem

    29. 29 Conditional Probability

    30. 30 The Monty Hall Problem (aka Teaching Statistics by Chocolate)

    31. 31 Avoiding the Theory!! Two points:- There is mathematical theory supporting the statistics that we use. You may find in the future that you can no longer avoid it and have to bite the bullet. Some vets are very mathematical!!!

    32. 32 Continuous Distributions The Normal Distribution Characterised by the mean and Standard deviation Unimodal and symmetric Mean median and mode are equal Changing the mean shifts the curve horizontally Changing the standard deviation alters the width/height of the curve

    35. 35 Normal Distribution Probability Calculations Total Area under the curve is 1 Different tables give different probabilities as they refer to different areas. Petrie and Watson p.199 (unnumbered!) give “2-tailed probabilities”, which is a little unusual. Check carefully any tables that you use.

    36. 36 The Normal Curve Areas under the normal curve give us the probability of an event happening. How likely is it that we will get a result between 25 and 28?

    37. 37 Areas Under Normal Curves Step 1: Draw a rough diagram and guess the answer. Step 2: Find the number of standard deviations that the end of each interval is from the mean. Step 3: Look up the areas for these numbers in the tables and write them on the diagram. Step 4: Add or subtract areas to give the required answer.

    38. 38 Using Standard Normal Tables Z values are plotted on the horizontal axis. Areas under the normal curve are given in the table from the z value back to -infinity. Find the area for:- z < 1.30; z < -0.75; z > 1.23. Find the z value if the area is 0.6734.
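
The table lookups on this slide can be checked in software. The sketch below uses `statistics.NormalDist` from the Python standard library in place of a printed table; like the table described here, its `cdf` gives the area from the z value back to -infinity. The variable names are chosen only for this example.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Area to the left of a z value comes straight from the cdf.
area_below_130 = z.cdf(1.30)
area_below_m075 = z.cdf(-0.75)

# Area to the right is 1 minus the area to the left,
# because the total area under the curve is 1.
area_above_123 = 1 - z.cdf(1.23)

# Going from an area back to a z value uses the inverse cdf.
z_for_area = z.inv_cdf(0.6734)

print(round(area_below_130, 4), round(area_below_m075, 4))
print(round(area_above_123, 4), round(z_for_area, 2))
```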

    39. 39 Calculations for “Real” Data Most data does not have a mean of 0 and a standard deviation of 1. To convert to z values subtract the mean and divide by the standard deviation. Then look up z values in the table as before.

    40. 40 Examples Mean is 15, standard deviation is 3. What is the proportion:- Less than 10; Less than 19; Greater than 14; Greater than 22; Between 13 and 20

    41. 41 Solution:- Step 2: ______ is ______ above/below the mean. ______ is ______ SD above/below the mean. Step 3: From table z = ______ gives an area of ______. Proportion is ______.
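
The five examples with mean 15 and standard deviation 3 can be worked through the same way: convert each value to a z score, look up the area below it, then add or subtract areas. The sketch below does this with `statistics.NormalDist` instead of a printed table; the helper names `z_score` and `area_below` are made up for this example.

```python
from statistics import NormalDist

mean, sd = 15, 3
standard_normal = NormalDist()  # mean 0, standard deviation 1

def z_score(x):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

def area_below(x):
    """Area under the normal curve to the left of x, via its z score."""
    return standard_normal.cdf(z_score(x))

less_than_10 = area_below(10)
less_than_19 = area_below(19)
greater_than_14 = 1 - area_below(14)       # right-hand area
greater_than_22 = 1 - area_below(22)
between_13_and_20 = area_below(20) - area_below(13)  # subtract the two areas

for label, value in [
    ("less than 10", less_than_10),
    ("less than 19", less_than_19),
    ("greater than 14", greater_than_14),
    ("greater than 22", greater_than_22),
    ("between 13 and 20", between_13_and_20),
]:
    print(f"{label}: {value:.4f}")
```

Answers read from a table may differ slightly in the last decimal place, because tables round z to two decimal places.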

    42. 42 Approximate Normal Probabilities

    43. 43 Approximate Normal Probabilities

    44. 44 Approximate Normal Probabilities

    45. 45 Assessing Normality Look at a graph Symmetry Single Mode “Bell shape”

    46. 46 Assessing Normality Find proportions: about 68% within 1 standard deviation, about 95% within 2 standard deviations, about 99.7% within 3 standard deviations
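
As a sketch of this check, the code below computes the proportions a normal distribution predicts within 1, 2 and 3 standard deviations (roughly 68.3%, 95.4% and 99.7%) and compares them with the proportions observed in a sample. The sample is hypothetical and far too small for a real assessment; it only shows the mechanics.

```python
from statistics import NormalDist, mean, stdev

standard_normal = NormalDist()

def normal_proportion_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

def sample_proportion_within(data, k):
    """Proportion of observations within k sample standard deviations of the sample mean."""
    centre, spread = mean(data), stdev(data)
    inside = [x for x in data if abs(x - centre) <= k * spread]
    return len(inside) / len(data)

# Hypothetical measurements, for illustration only.
sample = [112, 115, 115, 118, 120, 121, 124, 131]

for k in (1, 2, 3):
    print(k, round(normal_proportion_within(k), 4), sample_proportion_within(sample, k))
```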

    47. 47 Assessing Normality Normal probability Plot

    48. 48 Discrete Probability Distributions A discrete set of possible outcomes with a probability attached to each possible outcome. Two will be considered:- Binomial Distribution Commonly for events which happen with a fixed probability Poisson Distribution Commonly for “Count” data

    49. 49 Binomial Distribution Four assumptions Fixed number of trials n Only two outcomes Fixed probability of “success” p Events are independent Eg Proportion of cows in a herd with a disease Associated with each event is a probability of success

    50. 50 Binomial Distribution Mean = np Variance = np(1-p) Standard deviation = √(np(1-p)) For large sample sizes the binomial can be approximated with a Normal distribution Large means np and n(1-p) are both greater than 30

    51. 51 Example The prevalence of Leptospira in the cattle population is approximately 30%. What is the probability that 5 or more of the next 10 cows tested will be positive?
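
One way to work this example is to add up the binomial probabilities for 5, 6, ..., 10 positive cows. The sketch below assumes the binomial conditions hold (10 independent tests, each with probability 0.30 of being positive); the function name `binomial_pmf` is made up for this example.

```python
from math import comb, sqrt

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials, each with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.30

# P(5 or more positive) = P(5) + P(6) + ... + P(10)
prob_5_or_more = sum(binomial_pmf(k, n, p) for k in range(5, n + 1))

# Mean and standard deviation of the number of positives
expected_positives = n * p
sd_positives = sqrt(n * p * (1 - p))

print(f"P(5 or more positive) = {prob_5_or_more:.4f}")
print(f"mean = {expected_positives:.1f}, sd = {sd_positives:.3f}")
```

The probability comes out at roughly 0.15, so 5 or more positives out of 10 would not be especially surprising at this prevalence.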

    52. 52 Poisson Distribution A count of the number of independent events occurring randomly over time or space The mean and variance are equal Often “The number of something per something else.” Again as sample sizes increase the Poisson can be approximated by a Normal distribution

    53. 53 Example If cases of a disease are independent, new cases found per day should follow a Poisson distribution. Given a mean number of new cases per day of 4, what is the probability of finding 10 new cases on a particular day?
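
This example can be worked directly from the Poisson probability formula, P(X = k) = e^(-mean) × mean^k / k!. The sketch below assumes daily case counts really do follow a Poisson distribution with mean 4; the function name `poisson_pmf` is made up for this example.

```python
from math import exp, factorial

def poisson_pmf(k, mean):
    """Probability of exactly k events when the count is Poisson with the given mean."""
    return exp(-mean) * mean**k / factorial(k)

mean_cases_per_day = 4
prob_10_cases = poisson_pmf(10, mean_cases_per_day)

print(f"P(exactly 10 new cases) = {prob_10_cases:.4f}")
```

The result is about 0.005, so a day with 10 new cases would be unusual if the mean really is 4 and cases are independent.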