Multi-locus Match Probability Dependencies
Bruce Weir and Edward Zhao, University of Washington, bsweir@uw.edu
2018 NIJ R&D Symposium
Supported in part by NIJ 2017-DN-BX-0136.
What are match probabilities? [Vallone: https://www.nist.gov/document-7351]
Will match probabilities keep decreasing? [Ge et al, Investigative Genetics 3:1-14, 2012]
Will match probabilities keep decreasing?
How do these match probabilities address the observation of Donnelly:
“after the observation of matches at some loci, it is relatively much more likely that the individuals involved are related (precisely because matches between unrelated individuals are unusual) in which case matches observed at subsequent loci will be less surprising. That is, knowledge of matches at some loci will increase the chances of matches at subsequent loci, in contrast to the independence assumption.”
[Donnelly, Heredity 75:26-64, 1995]
Are match probabilities independent over loci?
Is the problem that we keep on multiplying match probabilities over loci under the assumption they are independent? Can we even test that assumption for 10 or more loci?
Or is our standard “random match probability” not the appropriate statistic to be reporting in casework? Is it actually appropriate to report statements such as:
The approximate incidence of this profile is 1 in 810 quintillion Caucasians, 1 in 4.9 sextillion African Americans and 1 in 410 quadrillion Hispanics.
Putting “match” back in “match probability”
Let’s reserve “match” for a statement we make about two profiles and take “match probability” to mean the probability that two profiles match. This requires calculations about pairs of profiles.
If the source of an evidence profile is unknown (e.g. is not the person of interest), then the match probability is the probability this unknown person has the profile already seen in the POI. No two profiles are truly independent, and their dependence affects match probabilities across loci.
Likelihood ratios use match probabilities
As with many other issues in forensic genetics, the issue of multi-locus match probability dependencies is best addressed by comparing the probabilities of the evidence under alternative propositions:
Hp: the person of interest is the source of the evidence DNA profile.
Hd: an unknown person is the source of the evidence DNA profile.
Write the profiles of the POI and the source of the evidence as Gs and Gc. The evidence is the pair of profiles Gc, Gs.
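Likelihood ratios use match probabilities
The likelihood ratio is
\[
\begin{aligned}
\mathrm{LR} &= \frac{\Pr(E \mid H_p)}{\Pr(E \mid H_d)} \\
&= \frac{\Pr(G_c, G_s \mid H_p)}{\Pr(G_c, G_s \mid H_d)} \\
&= \frac{1}{\Pr(G_c \mid G_s, H_d)} \\
&= \frac{1}{\text{Match probability}}
\end{aligned}
\]
providing Gc = Gs under Hp. The match probability is the chance an unknown person has the evidence profile given that the POI has the profile: this is not the profile probability.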
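The step from the ratio of joint probabilities to 1 / Pr(Gc | Gs, Hd) can be spelled out as below. This is a sketch of the usual argument, not taken from the slide; it assumes that the probability of the POI's profile Gs is the same under both propositions, which the slide does not state explicitly.

```latex
% Factor the joint probability under either proposition H:
\Pr(G_c, G_s \mid H) = \Pr(G_c \mid G_s, H)\,\Pr(G_s \mid H)

% Assumption: \Pr(G_s \mid H_p) = \Pr(G_s \mid H_d), so these factors cancel.
% Under H_p the POI is the source, so \Pr(G_c \mid G_s, H_p) = 1 when G_c = G_s.
\mathrm{LR}
  = \frac{\Pr(G_c \mid G_s, H_p)\,\Pr(G_s \mid H_p)}
         {\Pr(G_c \mid G_s, H_d)\,\Pr(G_s \mid H_d)}
  = \frac{1}{\Pr(G_c \mid G_s, H_d)}
```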
Special Cases: Use of Sample Allele Frequencies
The match probability is usually estimated using allele frequencies from a database representing some broad class of people, such as “Caucasian” or “African American” or “Hispanic.”
The population relevant for a particular crime may be a narrower class of people. There is population structure. If p are the allele frequencies in the database, the match probabilities are estimated as
\[
\Pr(AA \mid AA) = \frac{[3\theta + (1-\theta)p_A][2\theta + (1-\theta)p_A]}{(1+\theta)(1+2\theta)}
\]
\[
\Pr(AB \mid AB) = \frac{2[\theta + (1-\theta)p_A][\theta + (1-\theta)p_B]}{(1+\theta)(1+2\theta)}
\]
Can these be multiplied over loci?
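The two single-locus formulas can be evaluated directly. The sketch below is illustrative only: the function names and the allele frequencies (0.1 and 0.2) are hypothetical and are not taken from the talk. With θ = 0 the expressions reduce to the profile probabilities p_A² and 2 p_A p_B, and any θ > 0 makes the match probability larger.

```python
def match_prob_homozygote(p_a, theta):
    """Pr(AA | AA): chance an unknown person is AA given the POI is AA."""
    numerator = (3 * theta + (1 - theta) * p_a) * (2 * theta + (1 - theta) * p_a)
    return numerator / ((1 + theta) * (1 + 2 * theta))


def match_prob_heterozygote(p_a, p_b, theta):
    """Pr(AB | AB): chance an unknown person is AB given the POI is AB."""
    numerator = 2 * (theta + (1 - theta) * p_a) * (theta + (1 - theta) * p_b)
    return numerator / ((1 + theta) * (1 + 2 * theta))


# Hypothetical database allele frequencies, chosen only for illustration.
p_a, p_b = 0.1, 0.2

for theta in (0.0, 0.01, 0.03):
    hom = match_prob_homozygote(p_a, theta)
    het = match_prob_heterozygote(p_a, p_b, theta)
    print(f"theta={theta:<5} Pr(AA|AA)={hom:.5f}  Pr(AB|AB)={het:.5f}")
```

The sketch stops at single-locus values on purpose: whether these can be multiplied over loci is the question the slide raises.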
Empirical dependencies: 2849 20-locus profiles
Empirical dependencies: Y-STR profiles
Theoretical dependencies: No mutation
The probability an individual is homozygous AABB at loci A, B is [equation missing from the transcript], where η is the identity disequilibrium. It can be non-zero even for pairs of loci that are unlinked and/or in linkage equilibrium.
Sampling among parents or gametes and/or the inclusion of random elements in the uniting gametes leads to a correlation in identity by descent even between unlinked loci because genes at both loci are of necessity included in each gamete.
[Weir & Cockerham, Genetics 63:711-742, 1969.]
Theoretical dependencies: Mutation [Laurie & Weir, Theoretical Population Biology 63:207-219, 2003.]
Theoretical dependencies: Mutation
“Between-locus dependencies in finite populations can lead to under-estimates of genotypic match probabilities when using the product rule, even for unlinked loci.
The three-locus ratio is greater than one and is greater than the corresponding two-locus ratio for large mutation rates. These results provide evidence that between-locus dependency effects are magnified when considering more loci.
High mutation rates mean that specific mutants are likely to be recent and rare. Hence, if two individuals share alleles at one locus, they are more likely to be related through recent pedigree, and hence more likely to share alleles at a second locus.”
[Laurie & Weir, Theoretical Population Biology 63:207-219, 2003.]
One population simulated data: θ = 0.001
One population simulated data: θ = 0.01
2849 US profiles
[Figure panels: θ = 0, θ = 0.001, θ = 0.01]
15,000 Australian Profiles
[Figure: expected versus observed counts, in three panels: Theta = 0.00, Theta = 0.01, Theta = 0.03]
Numbers of five-locus matches among nine-locus profiles.
[Weir, Journal of Forensic Sciences 49:1009-1014, 2004.]
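The "observed" counts in a study of this kind come from comparing every pair of profiles in a database and recording how many loci match. The sketch below shows one way that tally might be computed; it is not the authors' code. The function names and the toy three-locus profiles are hypothetical, and it assumes each genotype is stored as a sorted pair of alleles so that equal genotypes compare equal.

```python
from collections import Counter
from itertools import combinations


def count_matching_loci(profile_1, profile_2):
    """Number of loci at which two profiles carry the same genotype."""
    return sum(g1 == g2 for g1, g2 in zip(profile_1, profile_2))


def pairwise_match_counts(profiles):
    """Tally, over all pairs of profiles, the number of matching loci."""
    counts = Counter()
    for prof_1, prof_2 in combinations(profiles, 2):
        counts[count_matching_loci(prof_1, prof_2)] += 1
    return counts


# Toy data (hypothetical): three profiles typed at three loci.
# Each genotype is a sorted tuple of two alleles (an assumption of this sketch).
profiles = [
    ((12, 14), (8, 9), (15, 15)),
    ((12, 14), (8, 9), (16, 17)),
    ((11, 13), (8, 9), (15, 15)),
]

counts = pairwise_match_counts(profiles)
for n_loci in sorted(counts):
    print(f"{counts[n_loci]} pair(s) match at {n_loci} loci")
```

A database of n profiles gives n(n-1)/2 pairs, so 15,000 profiles yield 112,492,500 comparisons. The expected counts plotted against these observations depend on the chosen θ and are not computed here.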
Conclusions
• Profile probabilities decrease at the same rate as number of loci increases.
• Match probabilities are not profile probabilities.
• Match probabilities decrease more slowly as number of loci increases.
• “Theta correction” may accommodate multi-locus dependencies.
• Empirical studies need much larger databases.