
# HIM 3200 Midterm Review - PowerPoint PPT Presentation


Presentation Transcript

### HIM 3200 Midterm Review

Dr. Burton

• Types of data

• Normal distribution

• Variance

• Standard deviation and z scores

• 2 X 2 table

• Hypothesis testing H0: HA:

• t-test

• Pearson r/Linear regression

• Chi square

• Frequency

• Incidence

• The frequency of new occurrences of disease, injury, or death in the study population during the time being examined.

• Prevalence

• The number of persons in a defined population who had a specified disease or condition

• Point prevalence (at a particular point in time.)

• Period prevalence (the sum of the point prevalence at the beginning of the interval plus the incidence during the interval.)


• Risk

• “The proportion of persons who are unaffected at the beginning of a study period but who undergo the risk event during the study period.”

• Risk event:

• Death

• Disease

• Injury

• Cohort:

• Persons at risk for the event.


• Rates

• “The frequency of events that occur in a defined time period, divided by the average population at risk.”

$$\text{Rate} = \frac{\text{Numerator}}{\text{Denominator}} \times \text{Constant multiplier}$$

The constant multiplier is usually 100, 1000, 10,000 or 100,000.

Types of rates

Incidence rates (e.g., per 1,000)

Prevalence rates (proportional, e.g., 20%)

Incidence density (frequency of new events per person-time)
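As a minimal sketch of the rate formula, the following Python function divides the numerator by the denominator and applies the constant multiplier. The function name `rate` and the case counts are hypothetical and used only for illustration.

```python
def rate(numerator, denominator, multiplier=1000):
    """Events divided by the average population at risk, scaled by a constant multiplier."""
    if denominator <= 0:
        raise ValueError("denominator (population at risk) must be positive")
    # Multiply first so that whole-number inputs stay exact for as long as possible.
    return numerator * multiplier / denominator


# Hypothetical figures: 25 new cases in an average population at risk of 5,000.
incidence_per_1000 = rate(25, 5000, multiplier=1000)
print(f"Incidence rate: {incidence_per_1000} per 1,000")
```

Changing `multiplier` to 100 expresses the same quantity as a percentage, which is the usual form for a prevalence rate.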

• Bias is a differential error

• A nonrandom, systematic, or consistent error in which the values tend to be inaccurate in a particular direction.

• Nondifferential errors are random errors

• Three most problematic forms of bias in medicine:

• 1. Selection (Sampling) Bias: The following are biases that distort results because of the selection process

• Admission rate (Berkson's) bias

• Distortions in risk ratios occur as a result of different hospital admission rates among cases with the risk factor, cases without the risk factor, and controls with the risk factor, causing greatly different risk-factor probabilities to interfere with the outcome of interest.

• Nonresponse bias

• e.g., noncompliance of people who have scheduled interviews in their homes.

• Lead-time bias

• A time differential between diagnosis and treatment among sample subjects may result in erroneous attribution of higher survival rates to superior treatment rather than early detection.


• 2. Information (misclassification) Bias

• Recall bias

• Differentials in memory capabilities of sample subjects

• Interview bias

• Blinding of interviewers to diseased and control subjects is often difficult.

• Unacceptability bias


• 3. Confounding

• A confounding variable has a relationship with both the dependent and independent variables that masks or potentiates the effect of the variable on the study.

• “Late-look bias” results from selecting fewer individuals with severe disease because they died before detection.

• “Length bias” occurs in screening programs, which tend to select less aggressive cases for treatment.

2 X 2 table comparing the test results of two observers

| | Observer No. 1: Positive | Observer No. 1: Negative | Total |
|---|---|---|---|
| Observer No. 2: Positive | a | b | a + b |
| Observer No. 2: Negative | c | d | c + d |
| Total | a + c | b + d | a + b + c + d |

| | Disease + | Disease − | Total |
|---|---|---|---|
| Test + | A | B | A + B |
| Test − | C | D | C + D |
| Total | A + C | B + D | |

Sensitivity = A/(A + C)

Specificity = D/(B + D)

False-positive rate = B/(B + D)

False-negative rate = C/(A + C)

Positive predictive value = A/(A + B)

Negative predictive value = D/(C + D)

Accuracy = (A + D)/(A + B + C + D)
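The formulas above can be sketched in Python as follows. The function name `diagnostic_metrics` and the cell counts are hypothetical; the letters a, b, c, d stand for the cells A, B, C, D of the 2 X 2 table.

```python
def diagnostic_metrics(a, b, c, d):
    """Test-performance measures from the four cells of a 2 X 2 table.

    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    """
    return {
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "false_positive_rate": b / (b + d),
        "false_negative_rate": c / (a + c),
        "positive_predictive_value": a / (a + b),
        "negative_predictive_value": d / (c + d),
        "accuracy": (a + d) / (a + b + c + d),
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = diagnostic_metrics(a=80, b=30, c=20, d=870)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that sensitivity and the false-negative rate sum to 1, as do specificity and the false-positive rate.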

• Nominal variables: e.g., Social Security Number (123 45 6789, 312 65 8432, 555 44 7777); Blood Type (A, B, AB, O)

• Dichotomous (Binary) variables: e.g., WNL / Not WNL; Normal / Abnormal; Accept / Reject

• Discrete variables

• Ordinal (Ranked) variables: e.g., Strongly agree, agree, neutral, disagree, strongly disagree (coded a b c d e or 1 2 3 4 5)

• Continuous (Dimensional) variables: e.g., Temperature (32° F), Height, Blood Pressure, Weight

• Ratio variables: a continuous scale that has a true zero point

• Risks and Proportions as variables

• Mode: the value with the highest number of observations in a data set.

• Median: the middle observation when data have been arranged from highest to lowest.

• Mean (arithmetic): the average value of all observed values.

$$\bar{x} = \frac{\sum x_i}{N}$$

Sum = $\sum$

Observed values = $x_i$

Total number of observations = $N$

Raw data and results of cholesterol levels in 26 subjects (p. 115)

| Measure | Value |
|---|---|
| Number of observations, N | 26 |
| Initial HDL values | 31, 41, 44, 46, 47, 47, 48, 48, 49, 52, 53, 54, 57, 58, 58, 60, 60, 62, 63, 64, 67, 69, 70, 77, 81, 90 mg/dl |
| Highest value | 90 mg/dl |
| Lowest value | 31 mg/dl |
| Mode | 47, 48, 58, 60 mg/dl |
| Median | (57 + 58)/2 = 57.5 mg/dl |
| Sum of the values, $\sum x_i$ | 1496 mg/dl |
| Mean, $\bar{x}$ | 1496/26 = 57.5 mg/dl |
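The mode, median, and mean of the HDL data can be checked with Python's standard `statistics` module. This is a sketch under one assumption: the list `hdl` holds the 26 values that are consistent with the reported N, the reported sum of 1496 mg/dl, and the four reported modes.

```python
import statistics

# Assumed data: the 26 HDL values (mg/dl) consistent with the reported sum of 1496.
hdl = [31, 41, 44, 46, 47, 47, 48, 48, 49, 52, 53, 54, 57,
       58, 58, 60, 60, 62, 63, 64, 67, 69, 70, 77, 81, 90]

modes = statistics.multimode(hdl)   # every value tied for the highest count
median = statistics.median(hdl)     # average of the two middle values when N is even
mean = statistics.mean(hdl)         # sum of the values divided by N

print("N      =", len(hdl))
print("Sum    =", sum(hdl))
print("Modes  =", modes)
print("Median =", median)
print("Mean   =", round(mean, 1))
```

Because four values are tied at two observations each, the data set is multimodal, which is why `multimode` is used rather than `mode`.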

• The median is the 50th percentile.

• The 75th percentile is the point where 75% of observations lie below and 25% are above. (3rd quartile, Q3)

• The 25th percentile is the point where 25% of observations lie below and 75% are above. (1st quartile, Q1)

• Interquartile range (Q3 – Q1)

Interquartile range (Q3 – Q1) for the same 26 HDL values (p. 115): 64 – 48 = 16 mg/dl
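One way to reproduce the quartiles behind this interquartile range is the median-of-halves method: Q1 is the median of the lower half of the ordered data and Q3 is the median of the upper half. The sketch below assumes that method (other quartile conventions give slightly different values) and assumes the 26-value HDL list consistent with the reported sum of 1496 mg/dl. The helper name `quartiles_by_halves` is hypothetical.

```python
import statistics

# Assumed data: the 26 HDL values (mg/dl) consistent with the reported sum of 1496.
hdl = [31, 41, 44, 46, 47, 47, 48, 48, 49, 52, 53, 54, 57,
       58, 58, 60, 60, 62, 63, 64, 67, 69, 70, 77, 81, 90]


def quartiles_by_halves(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    ordered = sorted(values)
    half = len(ordered) // 2       # for odd N the middle value is left out of both halves
    lower = ordered[:half]
    upper = ordered[-half:]
    return statistics.median(lower), statistics.median(upper)


q1, q3 = quartiles_by_halves(hdl)
iqr = q3 - q1
print("Q1 =", q1, "Q3 =", q3, "IQR =", iqr)
```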

Measures of dispersion based on the mean.

• Mean deviation = $\dfrac{\sum |x_i - \bar{x}|}{N}$

• Variance = $s^2 = \dfrac{\sum (x_i - \bar{x})^2}{N - 1}$, where $N - 1$ is the degrees of freedom

• Standard deviation = $s = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{N - 1}}$

Dispersion results for the same 26 HDL values (p. 115)

| Measure | Value |
|---|---|
| Sum of squares (TSS) | 4,298.46 (mg/dl)² |
| Variance, $s^2$ | 4,298.46/25 = 171.94 (mg/dl)² |
| Standard deviation, s | $\sqrt{171.94}$ = 13.1 mg/dl |
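The sum of squares, variance, and standard deviation can be reproduced step by step. This sketch again assumes the 26-value HDL list that matches the reported sum of 1496 mg/dl; the variable names are illustrative.

```python
import math

# Assumed data: the 26 HDL values (mg/dl) consistent with the reported sum of 1496.
hdl = [31, 41, 44, 46, 47, 47, 48, 48, 49, 52, 53, 54, 57,
       58, 58, 60, 60, 62, 63, 64, 67, 69, 70, 77, 81, 90]

n = len(hdl)
mean = sum(hdl) / n

# Total sum of squares: squared deviations of each value from the mean.
tss = sum((x - mean) ** 2 for x in hdl)

# Sample variance divides by the degrees of freedom, N - 1.
variance = tss / (n - 1)

# Standard deviation is the square root of the variance.
sd = math.sqrt(variance)

# Mean deviation uses absolute deviations and divides by N.
mean_deviation = sum(abs(x - mean) for x in hdl) / n

print("TSS      =", round(tss, 2))
print("Variance =", round(variance, 2))
print("SD       =", round(sd, 1))
print("Mean deviation =", round(mean_deviation, 2))
```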

• μ stands for the mean in a theoretical distribution

• σ stands for the standard deviation in a theoretical population.

[Figure: normal curve with the horizontal axis marked from μ − 3σ to μ + 3σ and the corresponding z scores from −3 to +3]

Three Common Areas Under the Curve

• Three Normal distributions with different areas
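As a sketch of how z scores relate to areas under the normal curve, the code below converts a value to a z score with z = (x − μ)/σ and computes the area within 1, 2, and 3 standard deviations of the mean. It assumes that these are the three areas the slide refers to. The function names and the example value, mean, and standard deviation are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1


def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma


def area_within(k):
    """Area under the standard normal curve between -k and +k."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


# Illustrative values: an observation of 90 with mean 57.5 and SD 13.1.
print("z =", round(z_score(90, 57.5, 13.1), 2))

for k in (1, 2, 3):
    print(f"Area within ±{k} SD: {area_within(k):.4f}")
```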

• Tests are designed to determine the probability that a finding represents a true deviation from what is expected.

• This chapter focuses on the justification for and interpretation of the p value, which is designed to minimize type I error.

• Science is based on the following principles:

• Previous experience serves as the basis for developing hypotheses;

• Hypotheses serve as the basis for developing predictions;

• Predictions must be subjected to experimental or observational testing.

Hypothesis testing

| Decision | H0 true | H0 false |
|---|---|---|
| Accept H0 | a: Correct | b: Type II error |
| Reject H0 | c: Type I error | d: Correct |

Alpha error: rejecting the null H0 when it is true

Beta error: accepting the null H0 when it is false

Power (the probability that a test detects differences that actually exist) can be determined by using the formula 1 – beta (1 – β).

A power of 80% is usually acceptable.

1. State the question in terms of:

H0: there is no difference or relationship (null)

Ha: there is a difference or relationship (alternative)

2. Decide on an appropriate research design and statistic

3. Select the significance (alpha) level and “N”

4. Collect data

5. Analyze and perform calculations to get the P-value

6. Draw and state conclusions by comparing alpha with the P-value
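The final steps, computing a P-value and comparing it with alpha, can be sketched with a two-sided one-sample z test. The z test is chosen here only because it needs nothing beyond the standard library; it is an assumption for illustration, not one of the tests named in this review. All input numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Return (z, p_value, reject_h0) for H0: population mean equals mu0.

    Assumes the population standard deviation sigma is known.
    """
    standard_error = sigma / sqrt(n)
    z = (sample_mean - mu0) / standard_error
    # Two-sided P-value: probability of a z at least this extreme in either tail.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value, p_value < alpha


# Hypothetical inputs: sample mean 57.5, null value 52, sigma 13.1, N = 26.
z, p_value, reject_h0 = two_sided_z_test(57.5, 52, 13.1, 26, alpha=0.05)
print(f"z = {z:.2f}, P = {p_value:.4f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

The decision rule is the same for any test statistic: reject H0 when the P-value is smaller than the alpha chosen before the data were collected.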

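[Figure: normal curve with the horizontal axis marked in standard deviations (−3σ to +3σ) and the corresponding z scores (−3 to +3)]

Probability

| z score | 1 | 2 | 3 |
|---|---|---|---|
| Upper tail | .1587 | .0228 | .0013 |
| Two-tailed | .3173 | .0455 | .0027 |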
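The tail probabilities for z scores of 1, 2, and 3 can be recomputed from the standard normal distribution. This is a small sketch using `statistics.NormalDist`; the helper names `upper_tail` and `two_tailed` are hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def upper_tail(z):
    """Probability of a value greater than z under the standard normal curve."""
    return 1 - std_normal.cdf(z)


def two_tailed(z):
    """Probability of a value more extreme than z in either direction."""
    return 2 * upper_tail(abs(z))


for z in (1, 2, 3):
    print(f"z = {z}: upper tail = {upper_tail(z):.4f}, two-tailed = {two_tailed(z):.4f}")
```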

Student’s t-test: to compare the means of two small (n < 30) independent samples.

Paired t-test: to compare the means of two paired samples (e.g., before and after).

F-test: to compare the means of three or more samples or groups.

Chi-square test: to compare two or more independent proportions.

Correlation coefficient: measures the strength of the association between two variables.

Regression analysis: Provides an equation that estimates the change in a dependent variable (y) per unit change in an independent variable (x).
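The correlation coefficient and the regression equation can both be computed from the same sums of squares and cross-products. The sketch below does this by hand for a small, made-up data set; the function name and the x and y values are hypothetical.

```python
from math import sqrt


def pearson_and_regression(x, y):
    """Return (r, slope, intercept) for paired observations x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)                         # variation in x
    syy = sum((yi - mean_y) ** 2 for yi in y)                         # variation in y
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))  # co-variation
    r = sxy / sqrt(sxx * syy)             # strength of the linear association
    slope = sxy / sxx                     # change in y per unit change in x
    intercept = mean_y - slope * mean_x   # predicted y when x is 0
    return r, slope, intercept


# Made-up paired data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r, slope, intercept = pearson_and_regression(x, y)
print(f"r = {r:.4f}")
print(f"y = {intercept:.2f} + {slope:.2f}x")
```

Here r describes how tightly the points follow a straight line, while the slope gives the estimated change in y for each one-unit change in x.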