### Bayesian Statistics: Asking the “Right” Questions

Michael L. Raymer, Ph.D.

Statistical Games

“The defendant’s DNA is consistent with the evidentiary sample, and the defendant’s DNA type occurs with a frequency of one in 10,000,000,000.”

“Only about 0.1% of wife batterers actually murder their wives. Therefore, evidence of abuse and battering should not be admissible in a murder trial.”

M. Raymer – WSU, FBS

The Question

- “Given the evidentiary DNA type and the defendant’s DNA type, what is the probability that the evidence sample contains the defendant’s DNA?”
- Information available:
- How common is each allele in a particular population?
- CPI, RMP etc.

An Example Problem

- Suppose the rate of breast cancer is 1%
- Mammograms detect breast cancer in 80% of cases where it is present
- 10% of the time, mammograms will indicate breast cancer in a healthy patient
- If a woman has a positive mammogram result, what is the probability that she has breast cancer?

Determining Probabilities

- Counting all possible outcomes
- If you flip a coin 4 times, what is the probability that you will get heads twice?
- TTTT THTT HTTT HHTT
- TTTH THTH HTTH HHTH
- TTHT THHT HTHT HHHT
- TTHH THHH HTHH HHHH

- P(2 heads) = 6/16 = 0.375
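
The counting argument can be checked by brute force. The following is a minimal sketch that lists all 16 sequences and counts those with exactly two heads; the variable names are illustrative and not part of the original slides.

```python
from itertools import product

# Enumerate every sequence of 4 coin flips: 2**4 = 16 equally likely outcomes.
outcomes = ["".join(flips) for flips in product("HT", repeat=4)]

# Keep the sequences that contain exactly two heads.
two_heads = [seq for seq in outcomes if seq.count("H") == 2]

# Probability = favorable outcomes / all outcomes.
p_two_heads = len(two_heads) / len(outcomes)
print(len(two_heads), len(outcomes), p_two_heads)  # 6 16 0.375
```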

Statistical Preliminaries

- Frequency and Probability
- We can guess at probabilities by counting frequencies:
- P(heads) = 0.5

- The law of large numbers: the more samples we take the closer we will get to 0.5.
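
A small simulation illustrates the idea. This is a sketch only: it assumes a fair coin, and the helper name `estimate_p_heads` and the fixed seed are arbitrary choices made here so that runs are repeatable.

```python
import random

def estimate_p_heads(n_flips, seed=0):
    """Estimate P(heads) as the observed frequency of heads in n_flips fair flips."""
    rng = random.Random(seed)  # fixed seed (arbitrary) so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

# Larger samples should give frequencies that settle closer to 0.5.
for n in (10, 100, 1_000, 100_000):
    print(n, estimate_p_heads(n))
```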

Distributions

- Counting frequencies gives us distributions

- Gaussian distribution (continuous)
- Binomial distribution (discrete)

Density Estimation

- Parametric
- Assume a Gaussian (e.g.) distribution.
- Estimate the parameters (μ, σ).

- Non-parametric
- Histogram sampling
- Bin size is critical
- Gaussian smoothing can help
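
The two approaches can be contrasted on a toy data set. The measurements below are invented for illustration, and the `histogram` helper is a hypothetical name; the sketch fits a Gaussian by estimating its mean and standard deviation, then bins the same data at two widths to show how the bin size changes the picture.

```python
import statistics
from collections import Counter

# Hypothetical fruit diameters in inches; invented for illustration only.
diameters = [2.9, 3.1, 3.4, 3.0, 3.3, 3.6, 3.2, 3.1, 3.5, 3.2]

# Parametric: assume a Gaussian and estimate its two parameters.
mu = statistics.mean(diameters)
sigma = statistics.stdev(diameters)  # sample standard deviation
gaussian = statistics.NormalDist(mu, sigma)
print(f"mu={mu:.3f} sigma={sigma:.3f} density at 3.2 = {gaussian.pdf(3.2):.3f}")

# Non-parametric: count samples per bin.
def histogram(data, bin_width):
    """Count samples per bin; keys are the left edge of each bin."""
    counts = Counter((x // bin_width) * bin_width for x in data)
    return dict(sorted(counts.items()))

# Bin widths are exact binary fractions to avoid floating-point edge cases.
print(histogram(diameters, 0.5))   # coarse bins hide the shape
print(histogram(diameters, 0.25))  # finer bins show more structure
```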

Combining Probabilities

- Non-overlapping outcomes: P(A or B) = P(A) + P(B)
- Possible overlap: P(A or B) = P(A) + P(B) − P(A & B)
- Independent events (the product rule): P(A & B) = P(A) × P(B)

Product Rule Example

- P(Engine > 200 H.P.) = 0.2
- P(Color = red) = 0.3
- Assuming independence:
- P(Red & Fast) = 0.2 × 0.3 = 0.06

- 1/4 × 1/10 × 1/6 × 1/8 × 1/5 = 1/9,600 ≈ 1/10,000
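
Both products can be reproduced with exact fractions. This is a sketch that simply applies the product rule under the independence assumption stated above; the variable names are illustrative.

```python
from fractions import Fraction
from math import prod

# P(Engine > 200 H.P.) = 0.2 and P(Color = red) = 0.3, assumed independent.
red_and_fast = Fraction(2, 10) * Fraction(3, 10)
print(red_and_fast, float(red_and_fast))  # 3/50 0.06

# The five per-feature frequencies from the last bullet, also assumed independent.
frequencies = [Fraction(1, 4), Fraction(1, 10), Fraction(1, 6), Fraction(1, 8), Fraction(1, 5)]
joint = prod(frequencies)  # product rule: multiply the individual probabilities
print(joint)               # 1/9600, roughly 1 in 10,000
```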

Statistical Decision Making

- One variable:

A ring was found at the scene of the crime. The ring is size 11. The defendant’s ring size is also 11. If a random ring were left at the crime scene, what is the probability that it would have been size 11?

Multiple Variables

- Assume independence:
- Note what happens to significant digits!

The ring is size 11, and also made of platinum.

Which Question?

- If a fruit has a diameter of 4”, how likely is it to be an apple?

[Figure labels: 4” Fruit; Apples]

“Inverting” the question

Given an apple, what is the probability that it will have a diameter of 4”?

Given a 4” diameter fruit, what is the probability that it is an apple?
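
The two questions generally have different answers. The sketch below uses made-up fruit counts (every number is an assumption, chosen only to make the arithmetic easy) to show the two conditional probabilities side by side.

```python
# Hypothetical fruit counts, invented for illustration only.
n_apples = 1000       # all apples
n_apples_4in = 50     # apples that are 4 inches across
n_other_4in = 450     # non-apples that are 4 inches across

# Given an apple, how likely is a 4-inch diameter?
p_4in_given_apple = n_apples_4in / n_apples

# Given a 4-inch fruit, how likely is it to be an apple?
p_apple_given_4in = n_apples_4in / (n_apples_4in + n_other_4in)

print(p_4in_given_apple, p_apple_given_4in)  # 0.05 0.1
```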

Forensic DNA Evidence

- Given alleles (17, 17), (19, 21), (14, 15.1), what is the probability that a DNA sample belongs to Bob?
- Find all (17, 17), (19, 21), (14, 15.1) individuals: how many of them are Bob?
- How common are 17, 19, 21, 14, and 15.1 in “the population”?

Conditional Probabilities

- For related events, we can express probability conditionally: P(A | B) = P(A & B) / P(B)
- Statistical independence: P(A | B) = P(A)

Bayesian Decision Making

- Terminology
- We have an object, and we want to decide if it belongs to a class
- Is this fruit a type of apple?
- Does this DNA come from a Caucasian American?
- Is this car a sports car?

- We measure features of the object (evidence):
- Size, weight, color
- Alleles at various loci

A Simple Example

- You are given a fruit with a diameter of 4” – is it a pear or an apple?
- To begin, we need to know the distributions of diameters for pears and apples.

A Key Problem

- We based this decision on P(diameter | class), the class-conditional probability

- What we really want to use is P(class | diameter), the posterior probability

- What if we found the fruit in a pear orchard?
- We need to know the prior probability of finding an apple or a pear!
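
The effect of the orchard can be sketched numerically. The Gaussian parameters below are assumptions invented for this example (the actual diameter distributions are not given in the transcript), as is the helper name `p_apple_given_diameter`. With equal priors the 4” fruit looks like an apple; with a prior that strongly favors pears, the same evidence points the other way.

```python
from statistics import NormalDist

# Hypothetical diameter distributions in inches; these parameters are assumptions.
apple = NormalDist(mu=3.5, sigma=0.5)
pear = NormalDist(mu=3.0, sigma=0.5)

def p_apple_given_diameter(d, prior_apple):
    """Posterior P(apple | diameter) from Bayes' rule with two classes."""
    joint_apple = apple.pdf(d) * prior_apple        # P(d | apple) * P(apple)
    joint_pear = pear.pdf(d) * (1.0 - prior_apple)  # P(d | pear) * P(pear)
    return joint_apple / (joint_apple + joint_pear)

print(round(p_apple_given_diameter(4.0, prior_apple=0.5), 3))   # equal priors
print(round(p_apple_given_diameter(4.0, prior_apple=0.05), 3))  # pear orchard
```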

Prior Probabilities

- Prior probability + Evidence → Posterior probability
- Without evidence, what is the “prior probability” that a fruit is an apple?
- What is the prior probability that a DNA sample comes from the defendant?

Bayes Rule Example

Bayes Rule Example

Posing the question

- What are the classes?
- What is the evidence?
- What is the prior probability?
- What is the class-conditional probability?

An Example Problem

- Suppose the rate of breast cancer is 1%
- Mammograms detect breast cancer in 80% of cases where it is present
- 10% of the time, mammograms will indicate breast cancer in a healthy patient
- If a woman has a positive mammogram result, what is the probability that she has breast cancer?

Practice Problem Revisited

- Classes: healthy, cancer
- Evidence: positive mammogram (pos), negative mammogram (neg)

- If a woman has a positive mammogram result, what is the probability that she has breast cancer?

A Counting Argument

- Suppose we have 1000 women
- 10 will have breast cancer
- 8 of these will have a positive mammogram

- 990 will not have breast cancer
- 99 of these will have a positive mammogram

- Of the 107 women with a positive mammogram, 8 have breast cancer
- 8/107 ≈ 0.075 = 7.5%

Solution
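
One way to write out the calculation is with Bayes' rule, using the numbers from the problem statement (1% cancer rate, 80% detection rate, 10% false-positive rate). The notation here is an assumption and may differ from what the original slide showed.

```latex
\begin{aligned}
P(\text{cancer} \mid \text{pos})
  &= \frac{P(\text{pos} \mid \text{cancer})\,P(\text{cancer})}
          {P(\text{pos} \mid \text{cancer})\,P(\text{cancer})
           + P(\text{pos} \mid \text{healthy})\,P(\text{healthy})} \\
  &= \frac{0.8 \times 0.01}{0.8 \times 0.01 + 0.1 \times 0.99}
   = \frac{0.008}{0.107} \approx 0.075
\end{aligned}
```

A positive mammogram therefore corresponds to roughly a 7.5% chance of cancer, because healthy patients so heavily outnumber patients with cancer.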

An Example Problem

- Suppose the chance of a randomly chosen person being guilty is 0.001
- When a person is guilty, a DNA sample will match that individual 99% of the time
- For an innocent individual, a DNA sample will exhibit a false match 0.0001 of the time (0.01%)
- If a DNA test demonstrates a match, what is the probability of guilt?

Solution
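
A counting argument works here too. The sketch below assumes a hypothetical population of 10,000,000 people, a size chosen only so that every count is a whole number, and applies the three rates from the problem statement.

```python
# Hypothetical population size, chosen so that all counts are whole numbers.
population = 10_000_000

guilty = population // 1_000        # prior of 0.001 -> 10,000 people
innocent = population - guilty      # 9,990,000 people

true_matches = guilty * 99 // 100   # 99% of the guilty match -> 9,900
false_matches = innocent // 10_000  # 0.0001 of the innocent match -> 999

# Of everyone who matches, what fraction is guilty?
p_guilty_given_match = true_matches / (true_matches + false_matches)
print(true_matches, false_matches, round(p_guilty_given_match, 3))  # 9900 999 0.908
```

Under these numbers a match corresponds to roughly a 91% probability of guilt, not the 99.99% that the false-match rate alone might suggest.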

Marginal Distributions

Combining Marginals

- Assuming independent features:
- If we assume independence and use Bayes rule, we have a Naïve Bayes decision maker (classifier).
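
A Naïve Bayes decision maker can be sketched in a few lines. Everything numeric below is hypothetical: the priors, the per-feature conditional probabilities, and the feature names are invented to echo the "is this car a sports car?" question, and `naive_bayes_posteriors` is an illustrative helper rather than the presenter's implementation.

```python
from math import prod

# Hypothetical priors for the two classes.
priors = {"sports": 0.1, "other": 0.9}

# Hypothetical P(feature is present | class), one entry per feature.
conditionals = {
    "sports": {"engine_over_200hp": 0.8, "red": 0.4},
    "other": {"engine_over_200hp": 0.15, "red": 0.25},
}

def naive_bayes_posteriors(observed_features):
    """Posterior for each class, treating the observed features as independent."""
    # Independence lets us multiply the per-feature conditionals (product rule).
    scores = {
        cls: priors[cls] * prod(conditionals[cls][f] for f in observed_features)
        for cls in priors
    }
    total = sum(scores.values())  # P(evidence), the shared denominator
    return {cls: score / total for cls, score in scores.items()}

result = naive_bayes_posteriors(["engine_over_200hp", "red"])
print({cls: round(p, 3) for cls, p in result.items()})
```

With these made-up numbers the two classes end up nearly tied, since the low prior for sports cars offsets the strong evidence.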

Bayes Decision Rule

- Provably optimal when the features (evidence) follow Gaussian distributions and are independent.

Forensic DNA

- Classes: DNA from defendant, DNA not from defendant
- Evidence: Allele matches at various loci
- Assumption of independence

- Prior Probabilities?
- Assumed equal (0.5)
- What is the true prior probability that an evidence sample came from a particular individual?
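
How much the prior matters can be seen by sweeping it. This sketch borrows the 99% match rate and 0.0001 false-match rate from the earlier guilt example as stand-ins (an assumption; real figures depend on the case), and the helper name `p_source_given_match` is hypothetical.

```python
# Match and false-match rates borrowed from the earlier guilt example (assumed).
P_MATCH_GIVEN_SOURCE = 0.99
P_MATCH_GIVEN_NOT_SOURCE = 0.0001

def p_source_given_match(prior):
    """P(sample came from the defendant | match) for a given prior."""
    numerator = P_MATCH_GIVEN_SOURCE * prior
    return numerator / (numerator + P_MATCH_GIVEN_NOT_SOURCE * (1.0 - prior))

# The "assumed equal" prior of 0.5, then one in a thousand and one in a million.
for prior in (0.5, 1e-3, 1e-6):
    print(f"prior={prior:g}  posterior={p_source_given_match(prior):.4f}")
```

The same match evidence yields a posterior near certainty with a prior of 0.5, but below 1% when the prior is one in a million.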

The Importance of Priors

Likelihood Ratios

- When deciding between two possibilities, we don’t need the exact probabilities. We only need to know which one is greater.
- The denominator for all the classes is always equal.
- Can be eliminated
- Useful when there are many possible classes
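
The point about the shared denominator can be shown directly. In this sketch the priors and class-conditional probabilities are hypothetical values for a 4” fruit; the class with the largest unnormalized score is also the class with the largest posterior, so the denominator never has to be computed to make the decision.

```python
# Hypothetical priors and class-conditional probabilities for a 4-inch fruit.
priors = {"apple": 0.5, "pear": 0.3, "orange": 0.2}
likelihoods = {"apple": 0.10, "pear": 0.02, "orange": 0.30}

# Unnormalized scores: P(evidence | class) * P(class).
scores = {cls: likelihoods[cls] * priors[cls] for cls in priors}
best = max(scores, key=scores.get)

# Ratio of the two leading scores; a value above 1 favors the numerator class.
ratio_orange_to_apple = scores["orange"] / scores["apple"]

# Dividing by the shared denominator P(evidence) rescales every score equally,
# so the winner is unchanged.
evidence = sum(scores.values())
posteriors = {cls: score / evidence for cls, score in scores.items()}

print(best, round(ratio_orange_to_apple, 2))
```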

Likelihood Ratio Example

From alleles to identity:

- It is relatively easy to find the allele frequencies in the population
- Marginal probability distributions

- Independence assumption
- Class conditional probabilities

- Equal prior probabilities
- Bayesian posterior probability estimate

A Key Advantage

- The oldest citation:

T. Bayes. “An essay towards solving a problem in the doctrine of chances.” Phil. Trans. Roy. Soc., 53, 1763.
