
INTERPRETING COMPLEX DNA PROFILE EVIDENCE:  BAYESIAN NETWORKS TO THE RESCUE

Philip Dawid, University of Cambridge


Presentation Transcript


  1. INTERPRETING COMPLEX DNA PROFILE EVIDENCE: BAYESIAN NETWORKS TO THE RESCUE • Philip Dawid • University of Cambridge

  2. Difficulties of Formalizing Reasoning • Classical logic does not readily handle “non-monotonic” reasoning • Reasoning with uncertainty is especially delicate • but specification and manipulation of probabilities appear problematic

  3. Example: “Explaining Away” • Burglar alarm is ringing • Break-in? • Earthquake? • Radio reports earthquake in vicinity • report → earthquake • earthquake → alarm • alarm → break-in • So report → break-in ???
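
The "explaining away" effect can be checked numerically with a minimal sketch. All probabilities below are illustrative assumptions, not values from the talk, and the function names are hypothetical. The joint distribution is factorised along the causal directions (break-in and earthquake cause the alarm; the earthquake causes the radio report) and the posterior is obtained by brute-force enumeration.

```python
from itertools import product

# Illustrative probabilities (assumptions, not taken from the talk).
P_BREAK_IN = 0.01
P_EARTHQUAKE = 0.001
# P(alarm rings | break_in, earthquake)
P_ALARM = {(True, True): 0.99, (True, False): 0.95,
           (False, True): 0.30, (False, False): 0.001}
# P(radio reports an earthquake | earthquake)
P_REPORT = {True: 0.90, False: 0.0001}


def joint(break_in, earthquake, alarm, report):
    """Joint probability, factorised along the causal network."""
    p = P_BREAK_IN if break_in else 1 - P_BREAK_IN
    p *= P_EARTHQUAKE if earthquake else 1 - P_EARTHQUAKE
    p_alarm = P_ALARM[(break_in, earthquake)]
    p *= p_alarm if alarm else 1 - p_alarm
    p_report = P_REPORT[earthquake]
    p *= p_report if report else 1 - p_report
    return p


def posterior_break_in(**evidence):
    """P(break_in = True | evidence), by enumerating all 16 worlds."""
    names = ("break_in", "earthquake", "alarm", "report")
    numerator = denominator = 0.0
    for values in product([True, False], repeat=4):
        world = dict(zip(names, values))
        if any(world[name] != value for name, value in evidence.items()):
            continue  # world contradicts the evidence
        p = joint(**world)
        denominator += p
        if world["break_in"]:
            numerator += p
    return numerator / denominator


prior = posterior_break_in()
after_alarm = posterior_break_in(alarm=True)
after_report = posterior_break_in(alarm=True, report=True)
print(prior, after_alarm, after_report)
```

With these assumed numbers, the alarm raises the probability of a break-in sharply, and the radio report then lowers it again: the earthquake "explains away" the alarm, which is the opposite of what naive rule chaining suggests.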

  4. PROBABILISTIC REASONING IN INTELLIGENT SYSTEMS: Networks of Plausible Inference • Pearl 1988

  5. Go with the (causal) flow ?

  6. BAYESIAN NETWORKS • Handle complex problems involving probabilistic uncertainty • Modular structure • Intuitive graphical representation • Precise semantics • relevance (conditional independence) • Correct accounting for evidence • Computational algorithms • elegant and efficient

  7. AN APPLICATION • Forensic Identification • DNA Profiling • Disputed Paternity

  8. A typical DNA profile

  9. Disputed Paternity • We have DNA data D from a disputed child c, its mother m and the putative father pf • If the true father tf is not pf, he is a “random” alternative father af • Straightforward to compute the evidence (LIKELIHOOD RATIO) in favor of paternity (Essen-Möller 1938)
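
For a single marker, the likelihood ratio in a standard trio can be sketched as follows. This is a minimal illustration, assuming Mendelian transmission without mutation and a random alternative father whose transmitted allele follows the population allele frequencies. The function names, allele labels and frequencies are hypothetical.

```python
from itertools import product


def transmit(genotype):
    """Distribution of the allele a parent passes on (each gene with probability 1/2)."""
    dist = {}
    for allele in genotype:
        dist[allele] = dist.get(allele, 0.0) + 0.5
    return dist


def child_prob(child, mother_gamete, father_gamete):
    """P(unordered child genotype | gamete distributions of mother and father)."""
    target = tuple(sorted(child))
    total = 0.0
    for (m_allele, pm), (f_allele, pf) in product(mother_gamete.items(),
                                                  father_gamete.items()):
        if tuple(sorted((m_allele, f_allele))) == target:
            total += pm * pf
    return total


def paternity_lr(child, mother, putative_father, allele_freqs):
    """LR = P(child | m, pf is the father) / P(child | m, random father)."""
    numerator = child_prob(child, transmit(mother), transmit(putative_father))
    # A random alternative father transmits an allele drawn from the population.
    denominator = child_prob(child, transmit(mother), allele_freqs)
    return numerator / denominator


# Hypothetical allele frequencies and genotypes for one marker.
freqs = {12: 0.1, 14: 0.3, 15: 0.6}
lr = paternity_lr(child=(12, 14), mother=(12, 15),
                  putative_father=(14, 14), allele_freqs=freqs)
print(lr)
```

In this example the child must have received allele 14 from the father, so the ratio reduces to 1 / (frequency of allele 14). The sketch does not guard against a child genotype that is incompatible with the mother, which would make the denominator zero.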

  10. MISSING DNA DATA • What if we cannot obtain DNA from the suspect (or other relevant individual)? • Sometimes we can obtain indirect information by DNA profiling of relatives • But analysis is complex and subtle…

  11. Network Representation • We have DNA data D from a disputed child c, its mother m and the putative father pf • If pf is not the true father tf, this is a “random” alternative father af • Building blocks: founder, child, query • [Network diagram: founder nodes, a query node, a child node and a hypothesis node]

  12. Complex Paternity Case • We have DNA from a disputed child c1 and its mother m1 but not from the putative father pf • We do have DNA from c2, an undisputed child of pf, and from her mother m2, as well as from two undisputed full brothers b1 and b2 of pf • Building blocks: founder, child, query • [Pedigree network diagram built from founder, child, query and hypothesis nodes]

  13. Object-Oriented Bayesian Network HUGIN 6 • Each building block (founder/child/query) in a pedigree can be an INSTANCE of a generic CLASS network — which can itself have further structure • The pedigree is built up using simple mouse clicks to insert new nodes/instances and connect them up • Genotype data are entered and propagated using simple mouse clicks

  14. Under the microscope… • Each CLASS is itself a Bayesian Network, with internal structure • Recursive: can contain instances of further class networks • Communication via input and output nodes

  15. Lowest Level Building Blocks • gene: DNA MARKER having associated repertory of alleles together with their frequencies • genotype: GENOTYPE consisting of maximum and minimum of paternal and maternal genes • mendel: MENDELIAN SEGREGATION: child’s gene copies paternal or maternal gene, according to outcome of fair coin flip
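
The three lowest-level blocks can be read generatively, as in the sampling sketch below. It only mirrors the verbal descriptions on the slide and is not the HUGIN class networks themselves. The allele labels and frequencies are hypothetical.

```python
import random

# Hypothetical allele repertory with frequencies for one marker.
ALLELE_FREQS = {12: 0.1, 14: 0.3, 15: 0.6}


def gene(rng, freqs=ALLELE_FREQS):
    """gene: draw one allele from the population frequencies."""
    alleles = list(freqs)
    weights = [freqs[a] for a in alleles]
    return rng.choices(alleles, weights=weights, k=1)[0]


def genotype(paternal_gene, maternal_gene):
    """genotype: unordered pair, stored as (minimum, maximum)."""
    return (min(paternal_gene, maternal_gene), max(paternal_gene, maternal_gene))


def mendel(rng, paternal_gene, maternal_gene):
    """mendel: copy the paternal or the maternal gene on a fair coin flip."""
    return paternal_gene if rng.random() < 0.5 else maternal_gene


rng = random.Random(0)  # fixed seed so the run is repeatable
pg, mg = gene(rng), gene(rng)
print(genotype(pg, mg), mendel(rng, pg, mg))
```

Storing the genotype as (minimum, maximum) discards the parental origin of each gene, which matches what a DNA profile actually reveals.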

  16. founder • FOUNDER INDIVIDUAL represented by a pair of genes pgin and mgin (instances of gene) sampled independently from population distribution, and combined in instance gt of genotype
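
Because the two founder genes are sampled independently, the founder's genotype distribution can be computed exactly, as in this minimal sketch. The allele frequencies are hypothetical and the function name simply echoes the slide's block name.

```python
def founder(freqs):
    """Exact genotype distribution for two independently sampled genes."""
    dist = {}
    for pgin, p in freqs.items():          # paternal gene
        for mgin, q in freqs.items():      # maternal gene
            gt = (min(pgin, mgin), max(pgin, mgin))
            dist[gt] = dist.get(gt, 0.0) + p * q
    return dist


freqs = {12: 0.1, 14: 0.3, 15: 0.6}  # hypothetical allele frequencies
gt_dist = founder(freqs)
print(gt_dist)
```

Heterozygous genotypes receive probability 2pq and homozygous ones p², which are the familiar Hardy-Weinberg proportions.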

  17. child • CHILD INDIVIDUAL • paternal [maternal] gene selected by instances fmeiosis [mmeiosis] of mendel from father’s [mother’s] two genes, and combined in instance cgt of genotype
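
The child block can likewise be written as an exact calculation: two independent mendel selections followed by the genotype combination. This is a sketch assuming no mutation, and the example genotypes are hypothetical.

```python
def mendel(parent_genotype):
    """Distribution of the gene passed on: each of the parent's genes with probability 1/2."""
    dist = {}
    for g in parent_genotype:
        dist[g] = dist.get(g, 0.0) + 0.5
    return dist


def child(father_genotype, mother_genotype):
    """Exact distribution of the child's genotype cgt given both parents' genotypes."""
    dist = {}
    for pg, p in mendel(father_genotype).items():      # fmeiosis
        for mg, q in mendel(mother_genotype).items():  # mmeiosis
            cgt = (min(pg, mg), max(pg, mg))
            dist[cgt] = dist.get(cgt, 0.0) + p * q
    return dist


cgt_dist = child(father_genotype=(12, 14), mother_genotype=(12, 15))
print(cgt_dist)
```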

  18. query • QUERY INDIVIDUAL • Choice of true father’s paternal gene tfpg [maternal gene mfpg] as either that of f1 or that of f2, according as tf=f1? is true or false.
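
The query block is a selector controlled by the hypothesis node. The sketch below shows the deterministic choice and the mixture it induces on the true father's genotype when the hypothesis is uncertain. Function names, genotypes and the prior are hypothetical.

```python
def query(tf_is_f1, f1_genes, f2_genes):
    """Return the true father's (paternal, maternal) genes: f1's if tf=f1? is true, else f2's."""
    return f1_genes if tf_is_f1 else f2_genes


def query_mixture(prior_f1, f1_dist, f2_dist):
    """Distribution of the true father's genotype when tf=f1? has probability prior_f1."""
    keys = set(f1_dist) | set(f2_dist)
    return {k: prior_f1 * f1_dist.get(k, 0.0) + (1 - prior_f1) * f2_dist.get(k, 0.0)
            for k in keys}


# f1: putative father with known genotype; f2: random alternative father.
f1_dist = {(14, 14): 1.0}
f2_dist = {(12, 12): 0.01, (12, 14): 0.06, (12, 15): 0.12,
           (14, 14): 0.09, (14, 15): 0.36, (15, 15): 0.36}
tf_dist = query_mixture(0.5, f1_dist, f2_dist)
print(query(True, (14, 14), (12, 15)), tf_dist[(14, 14)])
```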

  19. Complex Paternity Case • Measurements for 12 DNA markers on all 6 individuals • Enter data, “propagate” through system • Overall Likelihood Ratio in favour of paternity: 1300 • [Pedigree network diagram built from founder, child, query and hypothesis nodes]
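
One common way to reach an overall figure from several markers is to multiply per-marker likelihood ratios, which is valid only if the markers are treated as independent. The talk does not state how its overall value was assembled, so the sketch below is an assumption, and the per-marker values are invented purely to multiply to 1300.

```python
import math


def combine_lrs(marker_lrs):
    """Overall LR as the product of per-marker LRs (assumes independent markers)."""
    return math.prod(marker_lrs)


def posterior_probability(lr, prior=0.5):
    """Convert an LR and a prior probability of paternity into a posterior probability."""
    prior_odds = prior / (1 - prior)
    posterior_odds = lr * prior_odds
    return posterior_odds / (1 + posterior_odds)


# Invented per-marker values; only their product matches the slide's 1300.
overall = combine_lrs([2.0, 5.0, 130.0])
print(overall, posterior_probability(overall))
```

The conversion to a posterior probability depends entirely on the prior. The even prior used as the default here is a convention for illustration, not a recommendation.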

  20. MORE COMPLEX DNA CASES • Mutation • Silent/missed alleles,… • Mixed crime stains • rape • scuffle • Multiple perpetrators and stains • Database search • Contamination, laboratory errors • …

  21. MUTATION • mendel + appropriate network mut to describe mutation process
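
A mutation step can be composed with mendel as in the sketch below. The talk does not specify its mutation model, so this assumes the simplest one: with probability `rate` the transmitted allele changes to one of the other alleles, chosen uniformly. The rate and allele labels are hypothetical.

```python
def mendel(parent_genotype):
    """Distribution of the gene selected for transmission (fair coin flip)."""
    dist = {}
    for g in parent_genotype:
        dist[g] = dist.get(g, 0.0) + 0.5
    return dist


def mut(gene_dist, alleles, rate):
    """Apply one mutation step under an assumed uniform mutation model."""
    out = {a: 0.0 for a in alleles}
    n_other = len(alleles) - 1
    for g, p in gene_dist.items():
        out[g] += p * (1 - rate)             # allele transmitted unchanged
        for a in alleles:
            if a != g:
                out[a] += p * rate / n_other  # allele mutates to a
    return out


alleles = [12, 14, 15]
transmitted = mut(mendel((14, 14)), alleles, rate=0.003)
print(transmitted)
```

With mutation allowed, a father of genotype 14/14 can pass on allele 12 with small but non-zero probability, so an apparently incompatible trio no longer yields a likelihood ratio of exactly zero.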

  22. COMBINATION • Can combine any or all of above features (and others), by using all appropriate subnetworks • Can use any desired pedigree network • no visible difference at top level • Simply enter data (and desired parameter-values) and propagate…

  23. Paternity testing

  24. Paternity testing with brother too

  25. Consider additional evidence (likelihood ratio) LRB carried by the brother’s data B • Overall likelihood ratio is LRoverall = LRD × LRB, where D denotes data on triplet (pf, c, m) and LRD is the likelihood ratio based on D alone
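
The way the brother's data enter the overall likelihood ratio follows from the chain rule of probability. In the sketch below, the symbols $H_1$ (pf is the true father) and $H_0$ (a random alternative father) are introduced here for illustration and are not the talk's notation.

```latex
\[
LR_{\text{overall}}
  = \frac{P(D, B \mid H_1)}{P(D, B \mid H_0)}
  = \underbrace{\frac{P(D \mid H_1)}{P(D \mid H_0)}}_{LR_D}
    \times
    \underbrace{\frac{P(B \mid D, H_1)}{P(B \mid D, H_0)}}_{LR_B}
\]
```

Read this way, $LR_B$ is the additional evidence from B once the triplet data D have been taken into account, so it is conditional on D rather than computed from B in isolation.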

  26. Incompatible triplet* • mgt = 12/15 • pfgt = 14 • cgt = 12 • B = p22 = .0003 p12 = .0003 • *Maximum LRoverall is 1027, at p(silent) = 0.0000642

  27. Extensions • Estimation of mutation rates from paternity data • Peak area data • mixtures • contamination • low copy number

  28. Thanks to: Julia Mortera • Paola Vicard • Steffen Lauritzen • Robert Cowell • and The Leverhulme Trust

  29. and especially to JUDEA PEARL who made it all possible
