Introduction to statistics in medicine – Part 1

Presentation Transcript


  1. Introduction to statistics in medicine – Part 1 Arier Lee

  2. Introduction • Who am I • Who do I work with • What do I do

  3. Why do we need statistics? • Sample • Population

  4. The important role of statistics in medicine • Statistics pervades every aspect of medical research • Medical practice and research generate lots of data • Research involves asking lots of questions with strong statistical aspects • The evaluation of new treatments, procedures and preventative measures relies on statistical concepts in both design and analysis • Statisticians are consulted at an early stage of a medical study

  5. Research process • Research question • Primary and secondary endpoints • Study design • Sampling and/or randomisation scheme • Power and sample size calculation • Pre-define analysis methods • Analyse data • Interpret results • Disseminate

  6. Bias • A form of systematic error that can affect scientific research • Selection bias – reduced by well-defined inclusion/exclusion criteria and randomisation • Assessment bias – reduced by blinding • Response bias, loss-to-follow-up bias – reduced by maximising response • Questionnaire bias – reduced by careful wording and good interviewer training

  7. Some common data types • Continuous: age, weight, height, blood pressure • Percentages: % of households owning a dog • Counts: number of pre-term babies • Binary: yes/no, male/female, sick/healthy • Ordinal: taste of biscuits (strongly dislike, dislike, neutral, like, strongly like) • Nominal categorical: ethnicity (European, Maori, Pacific Islander, Chinese etc.)

  8. Descriptive statistics for continuous data – the average • Mean: (sum of values)/(number in group) • Median: the middle value, the 50th percentile • Mode: the value that occurs most often • Example: 3 4 7 8 8 8 9 11 11 13 21 23 24 gives median=9, mode=8, mean=11.54
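The three averages on this slide can be checked with a short Python sketch. It uses only the standard-library `statistics` module and the thirteen values shown on the slide; the variable names are illustrative and not part of the original presentation.

```python
import statistics

# The thirteen values shown on the slide, already in ascending order.
values = [3, 4, 7, 8, 8, 8, 9, 11, 11, 13, 21, 23, 24]

mean_value = statistics.mean(values)      # (sum of values) / (number in group)
median_value = statistics.median(values)  # middle value of the sorted data
mode_value = statistics.mode(values)      # value that occurs most often

print(f"mean   = {mean_value:.2f}")
print(f"median = {median_value}")
print(f"mode   = {mode_value}")
```

With an odd number of values the median is simply the middle one (the 7th of 13 here). The mean is pulled above the median by the few large values at the upper end.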

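  9. Descriptive statistics for continuous data – the spread • Range: the minimum and maximum values • Interquartile range: quartiles divide the data into quarters; the interquartile range runs from the first quartile (Q1) to the third (Q3) • Standard deviation: a statistic that tells us how far the data are spread from the mean (for normally distributed data, about 95% lie within 2 SD of the mean), $s = \sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 / (n-1)}$ • Example (18 numbers, with the quartiles Q1, Q2 and Q3 marked): 0, 1, 2, 5, 8, 8, 9, 10, 12, 14, 18, 20, 21, 23, 25, 27, 34, 43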
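As a rough illustration of these measures of spread, the sketch below computes the range, quartiles and standard deviation for the 18 numbers on the slide. It assumes Python 3.8 or later for `statistics.quantiles`. Quartile conventions differ between software packages, so the Q1 and Q3 values produced here may not match the positions marked on the original slide; the variable names are illustrative.

```python
import math
import statistics

# The 18 numbers shown on the slide.
data = [0, 1, 2, 5, 8, 8, 9, 10, 12, 14, 18, 20, 21, 23, 25, 27, 34, 43]

# Range: minimum and maximum values.
data_min, data_max = min(data), max(data)

# Quartiles using the module's default method (other conventions exist).
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

# Sample standard deviation written out from the formula with the n - 1 divisor.
n = len(data)
x_bar = sum(data) / n
sd_manual = math.sqrt(sum((x - x_bar) ** 2 for x in data) / (n - 1))

# The library function uses the same n - 1 divisor.
sd_library = statistics.stdev(data)

print(f"range: {data_min} to {data_max}")
print(f"Q1 = {q1}, Q2 (median) = {q2}, Q3 = {q3}, IQR = {iqr}")
print(f"SD (formula) = {sd_manual:.2f}, SD (library) = {sd_library:.2f}")
```

Q2 is the median, which for an even number of values is the average of the two middle values (12 and 14 here).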

  10. Estimation • Estimation: determine the value of a variable and its likely range (i.e. 95% confidence intervals) • Statistical inference is the process of generalising results calculated from a sample to a population • We are interested in some numerical characteristic of a population (called a parameter), e.g. the mean height or the proportion of pregnant women with hypertension • We take a sample from the population and calculate an estimate of this parameter

  11. Estimation – a simple example • We want to estimate the mean height of 10-year-old boys • Take a random sample of 100 ten-year-old boys and calculate the sample mean • The mean height of our random sample is 141cm • Based on our random sample, we estimate that the mean height of 10-year-old boys is 141cm
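The height example can be sketched as a small simulation. The slides give only the sample size (100) and the sample mean (141cm), so everything else here is an assumption made for illustration: the "true" population mean, the standard deviation of 6.5cm, the normal shape of the heights and the random seed. The 95% confidence interval uses the common approximation of the sample mean plus or minus 1.96 standard errors.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Assumed population values, for illustration only (not from the slides).
TRUE_MEAN_CM = 141.0
TRUE_SD_CM = 6.5
SAMPLE_SIZE = 100

# Draw one random sample of heights from the assumed population.
sample = [random.gauss(TRUE_MEAN_CM, TRUE_SD_CM) for _ in range(SAMPLE_SIZE)]

# Point estimate of the population mean.
sample_mean = statistics.mean(sample)

# Standard error of the mean and an approximate 95% confidence interval.
sample_sd = statistics.stdev(sample)
standard_error = sample_sd / math.sqrt(SAMPLE_SIZE)
ci_lower = sample_mean - 1.96 * standard_error
ci_upper = sample_mean + 1.96 * standard_error

print(f"estimated mean height: {sample_mean:.1f} cm")
print(f"approximate 95% CI: {ci_lower:.1f} to {ci_upper:.1f} cm")
```

In a real study only the sample is observed. The population values are set here purely so that the estimate can be compared with a known truth.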

  12. Distribution of Data • It is essential to know the distribution of your data so you can choose the appropriate statistical method to analyse the data • Data can be distributed (spread out) in different ways • Continuous data: There are many cases when the data tends to be around a central value with no bias to the left or right – normal distribution

  13. Distribution of data – Normal distribution • Many parametric methods assume the data are normally distributed • Bell curve • Peak at a central value • Symmetric about the centre • Mean=median=mode • The distribution can be described by two parameters – mean and standard deviation

  14. Standard deviation • Standard deviation – shows how much variation or ‘dispersion’ exists in the data • For normally distributed data, about 95% of the values lie within 2 standard deviations of the mean

  15. A simulated example – Birth weight • Histogram of birth weight (figure not reproduced) • Mean=3250g, SD=550g
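A simulation along the lines of this slide might look like the sketch below. The mean (3250g) and SD (550g) come from the slide; the normal shape, the number of simulated births (10,000) and the seed are assumptions, since the slide does not say how its data were generated. The sketch checks the "about 95% within 2 SD" rule both empirically and from the theoretical normal distribution.

```python
import random
from statistics import NormalDist

MEAN_G = 3250  # from the slide
SD_G = 550     # from the slide

random.seed(42)  # assumed seed, for repeatability only

# Simulate birth weights from a normal distribution (assumed shape).
weights = [random.gauss(MEAN_G, SD_G) for _ in range(10_000)]

# Interval covering 2 standard deviations either side of the mean.
lower = MEAN_G - 2 * SD_G
upper = MEAN_G + 2 * SD_G

# Share of simulated weights that fall inside the interval.
fraction_within = sum(lower <= w <= upper for w in weights) / len(weights)

# Theoretical share for a normal distribution.
dist = NormalDist(MEAN_G, SD_G)
theoretical_within = dist.cdf(upper) - dist.cdf(lower)

print(f"interval: {lower} g to {upper} g")
print(f"simulated share within 2 SD:   {fraction_within:.3f}")
print(f"theoretical share within 2 SD: {theoretical_within:.4f}")
```

The theoretical share is slightly above 95% because exactly 95% corresponds to about 1.96 standard deviations, which is usually rounded to 2.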

  16. Some other common distributions • Binomial distribution – gestational diabetes (Yes/No) • Uniform distribution – throwing a die, equal (uniform) probability for each of the six sides • And many, many more…

  17. Sampling variability • Because of random sampling, the estimated value will be just an estimate – not exactly the same as the true value • If repeated samples are taken from a population then each sample, and hence its sample mean and standard deviation, will be different. This is known as sampling variability

  18. Sampling variability • In practice we do not repeat the sampling to measure sampling variability; instead we endeavour to obtain a random sample and use statistical theory to quantify the error • Fundamental principle that justifies our estimate as reasonable: if it were possible to repeat a study over and over again, in the long run the estimates from each study would be distributed around the true value • If we have a random sample then the sampling variability depends on the size of the sample and the underlying variability of the variable being measured
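The idea of repeating a study over and over can be imitated on a computer. The sketch below is one possible illustration, not the presenter's own method. It assumes a normal population with mean 3250 and SD 550 (borrowed from the birth weight slide), 2,000 repeated samples, and sample sizes of 25 and 100. The function name `sample_means` and the seed are hypothetical.

```python
import math
import random
import statistics

random.seed(7)  # assumed seed, for repeatability only

# Assumed population, borrowed from the birth weight example.
POP_MEAN = 3250
POP_SD = 550


def sample_means(sample_size, repeats=2000):
    """Draw `repeats` random samples and return the mean of each one."""
    means = []
    for _ in range(repeats):
        sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return means


means_small = sample_means(25)
means_large = sample_means(100)

# Spread of the sample means: this is the sampling variability.
spread_small = statistics.stdev(means_small)
spread_large = statistics.stdev(means_large)

# Statistical theory predicts a spread of SD / sqrt(n), the standard error.
theory_small = POP_SD / math.sqrt(25)
theory_large = POP_SD / math.sqrt(100)

print(f"n=25:  spread of sample means = {spread_small:.1f} (theory {theory_small:.1f})")
print(f"n=100: spread of sample means = {spread_large:.1f} (theory {theory_large:.1f})")
print(f"average of the n=100 sample means = {statistics.fmean(means_large):.1f}")
```

The sample means cluster around the true value, and their spread shrinks as the sample size grows. This is why, in practice, a single random sample plus the standard error formula can stand in for repeated sampling.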
