Cold Hits: The Bayesian Perspective • S. L. Zabell, Northwestern University • Dayton Conference, August 13, 2005
Motivating Example • Database size: 10,000 • Match probability: 1 in 100,000 • Suspect population: 1,000,000
The NRC II Approach • Np = 10,000 × (1/100,000) = 1/10 • Suggests modest evidence
But consider … • One expects about 10 people in the suspect population to have the type • 1,000,000 × (1/100,000) = 10 • The database search has identified one of them • So the probability this person is the perp is about 1 in 10 (!)
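The two calculations above can be reproduced in a few lines (numbers are from the slides; the variable names are mine):

```python
# Figures from the motivating example.
database_size = 10_000
match_prob = 1 / 100_000         # random-match probability (RMP)
suspect_population = 1_000_000

# NRC II "Np" adjustment: expected number of matches within the database.
Np = database_size * match_prob
print(Np)                        # about 0.1 -- looks like modest evidence

# But the expected number of people with the type in the whole
# suspect population is:
expected_carriers = suspect_population * match_prob
print(expected_carriers)         # about 10

# The search identified one of roughly ten such people, so a naive
# estimate that the matched person is the perpetrator is about 1 in 10.
print(1 / expected_carriers)     # about 0.1
```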
This seems paradoxical … • Is the database a random sample? • Analysis assumes it is. • Intuition: database more likely to contain perp. • Is this legally appropriate?
Bayesian inference • Due to the Reverend Thomas Bayes • Models changes in belief • Lingo: prior and posterior odds
Brief Historical Interlude • Laplace (1749 - 1827) • Championed this viewpoint • Many practical applications • Dominant view in 19th century • Attacked in 20th century by • R. A. Fisher • Jerzy Neyman
Gradual Rehabilitation • Frank Ramsey (1926) • Bruno de Finetti (1937) • I. J. Good (“Probability and the Weighing of Evidence”, 1950) • L. J. Savage (“The Foundations of Statistics”, 1954) … and many others
The Enigma • Modified commercial device • Offline encryption • Short tactical messages • Army, Navy, Luftwaffe, Abwehr versions
“TUNNY” • Lorenz SZ (“Schlüsselzusatz”) 40/42 • Online teleprinter encryption device • Long messages (several thousand characters) • Used by Hitler and his generals • Came into general use 1942
Links • System in extensive use • By time of Normandy invasion, • 26 links • 2 central exchanges • Example: JELLYFISH: Berlin - Paris (Oberbefehlshaber West)
“General Report on Tunny” • In-house report written in 1945 • > 500 pages long • I. J. Good, D. Michie, G. Timms • Declassified in 2000
Some basic terms • Probability: p • Odds: p/(1 - p) • Initial (prior) odds • Final (posterior) odds
Bayes’s Theorem • posterior odds = likelihood ratio × prior odds
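In code, the odds form of Bayes's theorem is a one-liner (a sketch; the function names are mine):

```python
def to_odds(p):
    """Convert a probability p to odds p / (1 - p)."""
    return p / (1 - p)

def posterior_odds(prior_odds, likelihood_ratio):
    """Odds form of Bayes's theorem: posterior odds = LR x prior odds."""
    return likelihood_ratio * prior_odds

# Example: a prior probability of 0.5 gives prior odds of 1 (i.e. 1 : 1);
# a likelihood ratio of 10 then gives posterior odds of 10 (10 : 1).
print(posterior_odds(to_odds(0.5), 10))  # 10.0
```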
Theorem is not controversial • It is a simple consequence of axioms • Status of prior odds at issue • Must they be objective frequencies OR • Can they be subjective degrees of belief?
Example 1: The blood matches • Suspect and evidence match • P[E | H0] = p (RMP) • P[E | H1] = 1 • LR = 1/p
Hypothetical • Prior odds of guilt: 2 to 1 • RMP p: 1 in 400,000 • LR 1/p: 400,000 • Posterior odds: LR × prior odds = 800,000 to 1
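The hypothetical works out as follows (same numbers as the slide):

```python
rmp = 1 / 400_000     # random-match probability p
lr = 400_000          # likelihood ratio 1/p
prior_odds = 2        # prior odds of guilt, 2 to 1

# posterior odds = LR x prior odds
print(lr * prior_odds)  # 800000, i.e. 800,000 to 1
```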
Example 2: Paternity • LR is “paternity index” (PI) • Probability of paternity is • PI x INITIAL PROBABILITY OF PATERNITY
PI ≠ “probability of paternity” • These are only the same provided • prior odds are “50-50” (1 : 1) • This may be appropriate in civil cases • Mother and father on equal footing • NOT appropriate in criminal cases • Contrary to “presumption of innocence”
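The point can be made concrete: converting the paternity index (a likelihood ratio) and a prior into a posterior probability shows how much the answer depends on that prior (a sketch; the function name and the PI value of 100 are illustrative):

```python
def prob_paternity(pi, prior_prob):
    """Posterior probability of paternity from the paternity index (an LR)
    and a prior probability, via the odds form of Bayes's theorem."""
    prior_odds = prior_prob / (1 - prior_prob)
    post_odds = pi * prior_odds
    return post_odds / (1 + post_odds)

# With the conventional 50-50 prior, the posterior is PI / (PI + 1):
print(prob_paternity(100, 0.5))   # about 0.99

# With a skeptical prior (as the presumption of innocence might demand),
# the very same PI yields a far weaker conclusion:
print(prob_paternity(100, 0.01))  # about 0.50
```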
Example 3: Database search • Balding and Donnelly: LR unchanged • This makes sense: • P[E | H0] the same (match probability p) • P[E | H1] the same (1) • So their ratio is still the same (1/p).
Isn’t this paradoxical? • Common intuition: difference between • “Probable cause” scenario • “Database trawl” scenario • Paradox resolved: • LR the same • The priors are different
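The resolution of the paradox can be illustrated numerically: the likelihood ratio is identical in both scenarios, and only the priors differ (a sketch with illustrative numbers; the priors chosen here are mine):

```python
lr = 100_000  # likelihood ratio 1/p -- the same in both scenarios

# "Probable cause": other evidence already points at this suspect,
# so take even prior odds (1 : 1) as an illustration.
probable_cause_prior = 1

# "Database trawl": the suspect starts as just one of ~1,000,000
# plausible sources, so prior odds of roughly 1 in a million.
trawl_prior = 1 / 999_999

print(lr * probable_cause_prior)  # 100000 -- overwhelming posterior odds
print(lr * trawl_prior)           # about 0.1 -- still odds against guilt
```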
Problems in practical use • “Suspect population” ill-defined • Assigning probabilities (not uniform) • Communicating this to a jury • But the qualitative insight is key
Didn’t Balding and Donnelly say … • that the evidence is stronger in this case? • Yes: some individuals are ruled out • People in the databank who don’t match • Realistically this rarely matters • Exception: a jailhouse homicide, where most of the suspect pool is in the database
Relatives and databank searches • Suspect population: 1,000,000 • No matches in databank of 100,000 • Three close calls • LR for sibs: 25,000
This means • Odds for the three relatives increase • from 1 in 1,000,000 to 1 in 40 • Odds for everyone else decrease • Proportionately
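The update for each relative follows directly from the odds form of Bayes's theorem, using the numbers from the slides:

```python
prior_odds = 1 / 1_000_000   # each person in the suspect population, a priori
lr_sibling = 25_000          # likelihood ratio for a full sibling

posterior = lr_sibling * prior_odds
print(posterior)             # about 0.025, i.e. odds of 1 in 40
```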