Bayesian Statistics: Asking the “Right” Questions

Michael L. Raymer, Ph.D.

Statistical Games

“The defendant’s DNA is consistent with the evidentiary sample, and the defendant’s DNA type occurs with a frequency of one in 10,000,000,000.”

“Only about 0.1% of wife batterers actually murder their wives. Therefore, evidence of abuse and battering should not be admissible in a murder trial.”

M. Raymer – WSU, FBS

The Question
  • “Given the evidentiary DNA type and the defendant’s DNA type, what is the probability that the evidence sample contains the defendant’s DNA?”
  • Information available:
    • How common is each allele in a particular population?
    • CPI, RMP etc.

An Example Problem
  • Suppose the rate of breast cancer is 1%
  • Mammograms detect breast cancer in 80% of cases where it is present
  • 10% of the time, mammograms will indicate breast cancer in a healthy patient
  • If a woman has a positive mammogram result, what is the probability that she has breast cancer?

Results
  • 75% -- 3
  • 50% -- 1
  • 25% -- 2
  • <10% -- a lot

Determining Probabilities
  • Counting all possible outcomes
  • If you flip a coin 4 times, what is the probability that you will get heads twice?
    • TTTT THTT HTTT HHTT
    • TTTH THTH HTTH HHTH
    • TTHT THHT HTHT HHHT
    • TTHH THHH HTHH HHHH
  • P(2 heads) = 6/16 = 0.375
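The counting argument above can be checked by brute-force enumeration. This is a minimal sketch; the function name count_heads_outcomes is made up for illustration and is not from the slides.

```python
from itertools import product

def count_heads_outcomes(flips: int, heads: int) -> tuple[int, int]:
    """Return (favourable, total) outcomes for exactly `heads` heads in `flips` fair flips."""
    # Every sequence of H/T of the given length is one equally likely outcome.
    outcomes = list(product("HT", repeat=flips))
    favourable = sum(1 for outcome in outcomes if outcome.count("H") == heads)
    return favourable, len(outcomes)

favourable, total = count_heads_outcomes(4, 2)
probability = favourable / total
print(f"P(2 heads) = {favourable}/{total} = {probability}")
```

Run as written, this prints P(2 heads) = 6/16 = 0.375, matching the table of 16 outcomes.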

Statistical Preliminaries
  • Frequency and Probability
    • We can guess at probabilities by counting frequencies:
      • P(heads) = 0.5
    • The law of large numbers: the more samples we take the closer we will get to 0.5.
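A small simulation can illustrate the law of large numbers for a fair coin. This is only a sketch: the seed and the sample sizes are arbitrary choices, and heads_frequency is a hypothetical helper name.

```python
import random

def heads_frequency(num_flips: int, rng: random.Random) -> float:
    """Simulate `num_flips` fair coin flips and return the observed fraction of heads."""
    heads = sum(rng.random() < 0.5 for _ in range(num_flips))
    return heads / num_flips

rng = random.Random(0)  # fixed seed (arbitrary) so the run is repeatable
for n in (10, 1_000, 100_000):
    # The observed frequency tends to settle near 0.5 as n grows.
    print(n, heads_frequency(n, rng))
```

Small samples can land far from 0.5; the larger runs should stay close to it.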

Distributions
  • Counting frequencies gives us distributions

Gaussian Distribution (Continuous)

Binomial Distribution (Discrete)

Density Estimation
  • Parametric
    • Assume a Gaussian (e.g.) distribution.
    • Estimate the parameters (μ, σ).
  • Non-parametric
    • Histogram sampling
    • Bin size is critical
    • Gaussian smoothing can help
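The two approaches can be contrasted on synthetic data. In this sketch the data are drawn from a Gaussian with made-up parameters (mean 3.0, standard deviation 0.5), and histogram_density is a hypothetical helper, not something from the slides.

```python
import random
import statistics
from collections import Counter

rng = random.Random(0)
# Synthetic samples; the true mean and standard deviation are invented for illustration.
samples = [rng.gauss(3.0, 0.5) for _ in range(1000)]

# Parametric: assume a Gaussian and estimate its two parameters from the data.
mu_hat = statistics.mean(samples)
sigma_hat = statistics.stdev(samples)

def histogram_density(data, bin_width):
    """Non-parametric estimate: density in each bin = count / (n * bin_width)."""
    counts = Counter(int(x // bin_width) for x in data)
    n = len(data)
    return {b * bin_width: c / (n * bin_width) for b, c in sorted(counts.items())}

coarse = histogram_density(samples, 1.0)  # few wide bins: smooth but blurry
fine = histogram_density(samples, 0.1)    # many narrow bins: detailed but noisy
print(mu_hat, sigma_hat, len(coarse), len(fine))
```

Comparing coarse and fine shows why bin size is critical: the same data give very different pictures of the density.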

Combining Probabilities
  • Non-overlapping outcomes: P(A or B) = P(A) + P(B)
  • Possible Overlap: P(A or B) = P(A) + P(B) − P(A & B)
  • Independent Events (the Product Rule): P(A & B) = P(A) × P(B)

Product Rule Example
  • P(Engine > 200 H.P.) = 0.2
  • P(Color = red) = 0.3
  • Assuming independence:
    • P(Red & Fast) = 0.2 × 0.3 = 0.06
  • 1/4 * 1/10 * 1/6 * 1/8 * 1/5 ≈ 1/10,000
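The arithmetic on this slide can be reproduced exactly with rational numbers. This is a minimal sketch that assumes the events are independent, as the slide does; joint_probability is a hypothetical helper name.

```python
from fractions import Fraction
from math import prod

def joint_probability(probabilities):
    """Product rule: multiply the probabilities of events assumed to be independent."""
    return prod(probabilities, start=Fraction(1))

# P(Red & Fast) from the slide: 0.2 × 0.3
red_and_fast = joint_probability([Fraction(2, 10), Fraction(3, 10)])
# The five-factor product from the slide.
five_way = joint_probability([Fraction(1, d) for d in (4, 10, 6, 8, 5)])

print(red_and_fast, float(red_and_fast))
print(five_way)
```

The exact five-factor product is 1/9600, which the slide rounds to roughly 1/10,000.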

Statistical Decision Making
  • One variable:

A ring was found at the scene of the crime. The ring is size 11. The defendant’s ring size is also 11. If a random ring were left at the crime scene, what is the probability that it would have been size 11?

Multiple Variables
  • Assume independence:
    • Note what happens to significant digits!

The ring is size 11, and also made of platinum.

Which Question?
  • If a fruit has a diameter of 4”, how likely is it to be an apple?

[Figure: 4” Fruit and Apples]

“Inverting” the question

Given an apple, what is the probability that it will have a diameter of 4”?

Given a 4” diameter fruit, what is the probability that it is an apple?

Forensic DNA Evidence
  • Given alleles (17, 17), (19, 21), (14, 15.1), what is the probability that a DNA sample belongs to Bob?
    • Find all (17,17), (19,21), (14,15.1) individuals, how many of them are Bob?
    • How common are 17, 19, 21, 14, and 15.1 in “the population”?

Conditional Probabilities
  • For related events, we can express probability conditionally:
  • Statistical Independence:

Bayesian Decision Making
  • Terminology
    • We have an object, and we want to decide if it belongs to a class
      • Is this fruit a type of apple?
      • Does this DNA come from a Caucasian American?
      • Is this car a sports car?
    • We measure features of the object (evidence):
      • Size, weight, color
      • Alleles at various loci

Bayesian Notation
  • Feature/Evidence Vector:
  • Classes & Posterior Probability:

A Simple Example
  • You are given a fruit with a diameter of 4” – is it a pear or an apple?
  • To begin, we need to know the distributions of diameters for pears and apples.

Maximum Likelihood

[Figure: class-conditional distributions P(x) plotted against diameter, 1” to 6”]

A Key Problem
  • We based this decision on P(x | class) (class conditional)
  • What we really want to use is P(class | x) (posterior probability)

  • What if we found the fruit in a pear orchard?
  • We need to know the prior probability of finding an apple or a pear!

Prior Probabilities
  • Prior probability + Evidence → Posterior Probability
  • Without evidence, what is the “prior probability” that a fruit is an apple?
  • What is the prior probability that a DNA sample comes from the defendant?

The heart of it all
  • Bayes Rule

Bayes Rule

or
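For reference, a standard statement of Bayes rule is given below. The symbols are assumptions chosen for illustration rather than notation recovered from the slides: x is the evidence (feature vector), ω_j is a class, P(ω_j) is its prior probability, and p(x | ω_j) is the class-conditional probability.

```latex
P(\omega_j \mid x) = \frac{p(x \mid \omega_j)\, P(\omega_j)}{p(x)},
\qquad
p(x) = \sum_{k} p(x \mid \omega_k)\, P(\omega_k)
```

In words: posterior = (class-conditional probability × prior) / evidence, where the evidence term sums the numerator over all classes.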

Example Revisited
  • Is it an ordinary apple or an uncommon pear?

Bayes Rule Example

Bayes Rule Example

Posing the question
  • What are the classes?
  • What is the evidence?
  • What is the prior probability?
  • What is the class-conditional probability?

An Example Problem
  • Suppose the rate of breast cancer is 1%
  • Mammograms detect breast cancer in 80% of cases where it is present
  • 10% of the time, mammograms will indicate breast cancer in a healthy patient
  • If a woman has a positive mammogram result, what is the probability that she has breast cancer?

Practice Problem Revisited
  • Classes: healthy, cancer
  • Evidence: positive mammogram (pos), negative mammogram (neg)
  • If a woman has a positive mammogram result, what is the probability that she has breast cancer?

A Counting Argument
  • Suppose we have 1000 women
    • 10 will have breast cancer
      • 8 of these will have a positive mammogram
    • 990 will not have breast cancer
      • 99 of these will have a positive mammogram
    • Of the 107 women with a positive mammogram, 8 have breast cancer
      • 8/107 0.075 = 7.5%

Solution
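One way to work the mammogram problem with Bayes rule is sketched below, using the three numbers stated in the problem (1% prior, 80% detection rate, 10% false-positive rate). The function and parameter names are hypothetical, and the two-class form assumes every woman is either "cancer" or "healthy".

```python
def posterior(prior: float, p_evidence_given_class: float, p_evidence_given_other: float) -> float:
    """Two-class Bayes rule: P(class | evidence)."""
    numerator = p_evidence_given_class * prior
    # Total probability of the evidence across both classes.
    evidence = numerator + p_evidence_given_other * (1.0 - prior)
    return numerator / evidence

p_cancer_given_pos = posterior(
    prior=0.01,                   # P(cancer)
    p_evidence_given_class=0.80,  # P(pos | cancer)
    p_evidence_given_other=0.10,  # P(pos | healthy)
)
print(f"P(cancer | pos) = {p_cancer_given_pos:.3f}")
```

The result is 0.008 / 0.107, about 7.5%, which agrees with counting 8 of 107 positive mammograms among 1000 women.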

An Example Problem
  • Suppose the chance of a randomly chosen person being guilty is .001
  • When a person is guilty, a DNA sample will match that individual 99% of the time.
  • .0001 of the time, a DNA sample will exhibit a false match for an innocent individual
  • If a DNA test demonstrates a match, what is the probability of guilt?

Solution
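The DNA problem can be worked the same way. This sketch uses exact fractions built from the three numbers in the problem statement; the variable names are made up for illustration.

```python
from fractions import Fraction

prior_guilty = Fraction(1, 1000)              # P(guilty) = .001
p_match_given_guilty = Fraction(99, 100)      # P(match | guilty) = 99%
p_match_given_innocent = Fraction(1, 10000)   # P(match | innocent) = .0001

# Joint probabilities of a match with each class.
match_and_guilty = p_match_given_guilty * prior_guilty
match_and_innocent = p_match_given_innocent * (1 - prior_guilty)

# Bayes rule: divide by the total probability of a match.
p_guilty_given_match = match_and_guilty / (match_and_guilty + match_and_innocent)
print(p_guilty_given_match, f"{float(p_guilty_given_match):.3f}")
```

Under these assumptions the posterior is 1100/1211, roughly 0.908, so even with a false-match rate of 1 in 10,000 a match leaves about a 9% chance of innocence.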

Marginal Distributions

Combining Marginals
  • Assuming independent features:
  • If we assume independence and use Bayes rule, we have a Naïve Bayes decision maker (classifier).
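A Naïve Bayes decision maker of this kind might look like the sketch below. All of the numbers are hypothetical and chosen only for illustration; the function name and the two features (diameter and color) are assumptions, not values from the slides.

```python
from math import prod

def naive_bayes_posteriors(priors, likelihoods):
    """priors: {class: P(class)}; likelihoods: {class: [P(feature_i | class), ...]}."""
    # Independence assumption: the joint class-conditional is the product of the marginals.
    scores = {c: priors[c] * prod(likelihoods[c]) for c in priors}
    total = sum(scores.values())  # the evidence term shared by all classes
    return {c: s / total for c, s in scores.items()}

# Hypothetical inputs (invented for this example).
priors = {"apple": 0.5, "pear": 0.5}
likelihoods = {
    "apple": [0.30, 0.60],  # P(diameter = 4" | apple), P(color = red | apple)
    "pear": [0.10, 0.05],   # P(diameter = 4" | pear),  P(color = red | pear)
}
posteriors = naive_bayes_posteriors(priors, likelihoods)
print(posteriors)
```

With these made-up inputs the apple class receives most of the posterior probability; changing the priors or marginals shifts the decision accordingly.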

Bayes Decision Rule
  • Provably optimum when the features (evidence) follow Gaussian distributions, and are independent.

Forensic DNA
  • Classes: DNA from defendant, DNA not from defendant
  • Evidence: Allele matches at various loci
    • Assumption of independence
  • Prior Probabilities?
    • Assumed equal (0.5)
    • What is the true prior probability that an evidence sample came from a particular individual?

The Importance of Priors
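One way to see how much the prior matters is to hold the match rates from the earlier DNA example fixed (99% true match, .0001 false match) and vary only the prior. This is an illustrative sketch; the three priors tried here are arbitrary choices, and posterior is a hypothetical helper.

```python
def posterior(prior, p_match_given_source=0.99, p_match_given_other=0.0001):
    """P(sample came from this person | match), for a two-class problem."""
    numerator = p_match_given_source * prior
    return numerator / (numerator + p_match_given_other * (1.0 - prior))

# Equal priors, one in a thousand, one in a million.
for prior in (0.5, 0.001, 0.000001):
    print(f"prior = {prior:<8g} posterior = {posterior(prior):.4f}")
```

The same evidence gives a posterior near 0.9999 with an assumed prior of 0.5, about 0.908 with a prior of 0.001, and under 0.01 with a prior of one in a million.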

Likelihood Ratios
  • When deciding between two possibilities, we don’t need the exact probabilities. We only need to know which one is greater.
  • The denominator for all the classes is always equal.
    • Can be eliminated
    • Useful when there are many possible classes
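The idea of dropping the shared denominator can be sketched with the mammogram numbers from the earlier example. The function name decide and the dictionary layout are assumptions made for illustration.

```python
def decide(priors, likelihoods):
    """Pick the class with the largest P(evidence | class) × P(class); no denominator needed."""
    scores = {c: likelihoods[c] * priors[c] for c in priors}
    return max(scores, key=scores.get), scores

priors = {"cancer": 0.01, "healthy": 0.99}
likelihood_pos = {"cancer": 0.80, "healthy": 0.10}  # P(pos | class)

best, scores = decide(priors, likelihood_pos)
ratio = scores["cancer"] / scores["healthy"]
print(best, ratio)
```

The ratio is 0.008 / 0.099, well below 1, so "healthy" is the more probable class after a positive result even though the exact posteriors were never computed.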

Likelihood Ratio Example

Likelihood Ratio Example

From alleles to identity:
  • It is relatively easy to find the allele frequencies in the population
    • Marginal probability distributions
  • Independence assumption
    • Class conditional probabilities
  • Equal prior probabilities
    • Bayesian posterior probability estimate


Thank you.

A Key Advantage
  • The oldest citation:

T. Bayes. “An essay towards solving a problem in the doctrine of chances.” Phil. Trans. Roy. Soc., 53, 1763.
