Ballistics DNA

Alain Beauchamp, Ph.D.

- PART I: Correlation score and probability
- PART II: Ballistic probability model
- PART III: How could we implement a probability model in a ballistic system?
- Conclusion and future work

- Strengths and limitations of the current correlation score
- Why are correlation scores hard to interpret?
- Benefits of a probability “score”

- Over the last 15 years, the correlation score has been at the core of FT’s ballistic systems
- Strength of a correlation score: useful as a ranking tool

- Can compare score values computed with the same reference “A” (and same type of mark)
- Score(A against B) > Score(A against C) means that B looks more similar to A than C does

- Limitations of a correlation score:
- Correlation score is hard to interpret
- Not useful as an intrinsic similarity measure

- Examples:
- Cannot compare score values computed with different references (same type of mark)
- Score(A-B) > Score(C-D) DOES NOT mean that the A-B pair looks more similar than the C-D pair

- Cannot compare score values computed from different marks
- Score(A-B) for the Firing Pin > Score(A-B) for the Breech Face DOES NOT mean that B looks more similar to A on the Firing Pin than on the Breech Face

5 reasons:

- 1] Different algorithms for different marks
- Characteristics of the correlatable features and the geometry are very different
- FP/BF: circular contour and a wide variety of features
- Ejector/Rimfire: polygonal contour
- Bullets: striae only

- 2] Algorithms change over time

3] No unique cartridge or bullet score

- More than 1 score per exhibit
- Cartridge cases:
- BF/FP/Ejector scores

- Bullets (Land)
- MaxPhase2D, PeakPhase2D, PeakScore2D
- 3DScore

- The number of scores per exhibit is expected to increase in the future
- Cartridge cases: 3D scores
- Bullets:
- Added 3D Land score
- GEA scores?

- 4] Effect of the database size
- As the database size increases, the probability of finding non matches that look similar to a given reference increases
- The probability of finding a known match in the Top 10 decreases even if the score does not change

- The score value alone is not sufficient. The database size is an important factor as well.
- “Universal law”, not only in ballistics systems
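
A minimal sketch of why the database size matters on its own. It assumes (for illustration only, this is not stated in the slides) that the N non match scores behave as independent draws from one distribution with cumulative distribution function F. The chance that a known match with score s outranks all of them is then F(s)^N, which shrinks as N grows even though s is unchanged. A standard normal stands in for the real non match distribution, and the score 3.5 is a hypothetical value.

```python
import math


def normal_cdf(x: float) -> float:
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def prob_first(score: float, n_non_matches: int) -> float:
    """Probability that `score` exceeds n independent non match scores.

    Assumption: non match scores are independent standard normal draws.
    """
    return normal_cdf(score) ** n_non_matches


if __name__ == "__main__":
    # Same score, growing database: the probability of ranking first drops.
    for n in (500, 2_000, 10_000):
        print(f"N = {n:>6}: P(first) = {prob_first(3.5, n):.3f}")
```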


- 5] Each reference has its own “score response”.
Example:

- If two cartridges A and B are correlated against the same large database (with no match in it), we sometimes get two very different lists of scores
- For example, the scores associated with A could be greater than the scores associated with B


- Experiment: Correlate 9LG bullets against the same large database (800 non matches) with BulletTRAX-3D
- Compare their non match score distribution
- Significant differences
- high score region
- position of the peak

- Each bullet has its own statistical distribution of non match scores
- No universal “score response” common to all bullets

[Figure: non match score distributions of two 9LG bullets, Bullet #A and Bullet #B]

- Each of the previous problems can be solved using probabilities (in principle)
- Different Algorithms:
- Probability is a common concept for all score types

- Algorithms change over time
- Probability value may still change, but slightly

- Distinct score response for each bullet/cartridge
- Probability is a common concept for all exhibits

- Effect of database size
- Statistical models based on relevant data could quantify this effect

- More than 1 score per bullet/cartridge
- Compute a probability for each score and combine them to find a unique probability for the bullet/cartridge


- Assume
- we have a BF and a FP score for a pair of cartridge cases AND
- the 2 following probabilities are known
- P(FP): Confirmed match according to FP
- P(BF): Confirmed match according to BF

- 4 possible scenarios
- Confirmed match according to BOTH FP and BF
- Confirmed match according to FP ONLY
- Confirmed match according to BF ONLY
- Not a confirmed match

- FP/BF marks provide independent information
- A combined probability is computed by assuming independent information
- P(Combined) = 1 - (1 - P(BF)) (1 - P(FP))

- Results:
- A mark with a low probability has almost no effect on the combined probability
- As we add marks, the combined probability never decreases

- Easy to generalize for 3 independent marks
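
The combination rule can be sketched for any number of independent marks as one minus the product of the individual "not a match" probabilities. The function name and the probability values below are hypothetical, chosen only for illustration.

```python
from math import prod
from typing import Iterable


def combine_independent(probabilities: Iterable[float]) -> float:
    """Combine match probabilities of marks assumed to be independent.

    Returns 1 - product(1 - p), i.e. the probability that at least one
    mark indicates a match.
    """
    probs = list(probabilities)
    if any(not 0.0 <= p <= 1.0 for p in probs):
        raise ValueError("probabilities must lie in [0, 1]")
    return 1.0 - prod(1.0 - p for p in probs)


if __name__ == "__main__":
    # Hypothetical values for BF, FP and Ejector.
    print(combine_independent([0.6, 0.5]))        # two marks
    print(combine_independent([0.6, 0.5, 0.3]))   # three marks
    print(combine_independent([0.6, 0.0]))        # a mark with P = 0 changes nothing
```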

- The 4 bullet scores are not computed from independent information
- They are computed from the same areas on the bullet

- A combined probability cannot be computed by assuming independent information
- Keep the highest probability only (conservative)
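
A short sketch of why the conservative rule matters for the bullet scores. With four hypothetical probabilities of 0.5 each, treating them as independent would give 0.9375, while keeping the highest gives 0.5. The score names come from the slides; the values and function names are assumptions made for illustration.

```python
from math import prod
from typing import Dict


def combine_conservative(probabilities: Dict[str, float]) -> float:
    """Keep only the highest probability (scores share the same areas)."""
    if not probabilities:
        raise ValueError("at least one probability is required")
    return max(probabilities.values())


def combine_as_if_independent(probabilities: Dict[str, float]) -> float:
    """What the independence formula would give; overstated for bullets."""
    return 1.0 - prod(1.0 - p for p in probabilities.values())


if __name__ == "__main__":
    # Hypothetical probabilities for the four bullet score types.
    bullet = {"MaxPhase2D": 0.5, "PeakPhase2D": 0.5,
              "PeakScore2D": 0.5, "3DScore": 0.5}
    print("conservative:", combine_conservative(bullet))
    print("as if independent:", combine_as_if_independent(bullet))
```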

- The probability of being a match is a more meaningful concept than correlation score
- In principle, using probabilities solves the problems found with the interpretation of correlation scores
- Probabilities of individual marks can be combined nicely

- Challenge: Compute the probability of being a match for individual marks
- Two main unknowns:
- How to deal with the individual score response of each cartridge/bullet
- How to predict the effect of database size


- Goal and constraints of the model
- Hypothesis
- Tests and results

- Project started in 2003
- Goal: Develop a model which
- Converts the correlation score of a mark into a probability of being a match

- Current constraints
- We only have databases of sister pairs

- Tests with BulletTRAX-3D scores
- The model should reproduce the performance observed in the large database study
- As the database size increases, the probability of finding a known match in the first position should decrease

- Any mathematical or physical model starts with a small number of hypotheses/laws/axioms
- Need hypotheses for the (3D bullet) ballistic model
- Need to find something common to all bullet score distributions
- However, each bullet has its own score response

- Non Match Statistical distribution

- Experiment already discussed: correlate 9LG bullets against the same large database (800 non matches)

- Compare their non match score distribution (3D)
- Differences
- in the high score region
- in the position of the peak

- Similarity:
- The distributions have a similar shape


[Figure: non match score distributions (3D) of two 9LG bullets, Bullet #1 and Bullet #2]

- Core Hypothesis: the non match score distribution of all bullets
- Has the same universal “shape” (up to a shift and stretch factor)
- This shape is independent of caliber, material and quality of the marks

- The core hypothesis can be broken into two hypotheses

- Hypothesis I:
- The non match score distribution of each bullet is fully characterized by only two parameters:
- its mean (position of the peak)
- its width

- Hypothesis II:
- If we remove the effect of these 2 parameters, the non match score distributions of bullets are strictly identical
- The effect of the 2 parameters is removed as follows
- Shift the overall distribution to the same peak position for every bullet
- Shrink or expand the overall distribution to get the same width for every bullet

- The effect of the 2 parameters is removed as follows
- Shift the mean to 0
- Shrink to unit width
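
In symbols, and as a sketch only: write s for a raw non match score of bullet b, with mean \mu_b and width w_b (these symbols are introduced here and do not appear in the slides). The two steps map each score to a normalized score z, and Hypotheses I and II together say that the density f_b of bullet b is a shifted and stretched copy of a single density g shared by all bullets:

```latex
z = \frac{s - \mu_b}{w_b},
\qquad
f_b(s) = \frac{1}{w_b}\, g\!\left(\frac{s - \mu_b}{w_b}\right)
```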

- Get very similar distributions!
- Small variations due to limited data

[Figure: non match score distributions of 9LG Bullet #1 and Bullet #2 after shifting and rescaling]

- 4 steps:
- Compute 3D correlation scores from a large database study with BulletTRAX-3D
- 4 calibers, 2 materials/compositions

- Compute the individual parameters for each bullet (Hypothesis I)
- mean and width of its non match score distribution

- Determine a Universal Non Match score distribution (Hypothesis II)

- By simulations, predict the performance of the correlation algorithm as a function of database size


*Pittsburgh bullets database (Allegheny County Coroner’s Office, Forensic Laboratory Division)

- For each bullet
- get an approximation of the universal distribution (Hypothesis II)

- The scores are normalized by this process

- For each bullet:
- Mean and width are computed
- The distribution is
- Shifted so that its mean is 0
- Rescaled to unit width

Add up the “approximated” universal distributions found for all bullets

Smooth shape even in high score region

Universal Normalized distribution for non match scores
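
A minimal sketch of the pooling step, under stated assumptions: the "width" is taken to be the population standard deviation (the slides do not say which measure of width is used), and the toy scores are hypothetical. Each bullet's non match scores are normalized with that bullet's own mean and width, then all normalized scores are merged into one sample that approximates the universal distribution.

```python
import statistics
from typing import Dict, List, Sequence


def normalize(scores: Sequence[float]) -> List[float]:
    """Shift a bullet's non match scores to mean 0 and rescale to unit width."""
    mean = statistics.fmean(scores)
    width = statistics.pstdev(scores)  # assumption: width = standard deviation
    if width == 0.0:
        raise ValueError("all scores are identical; width is zero")
    return [(s - mean) / width for s in scores]


def pool_normalized(scores_by_bullet: Dict[str, Sequence[float]]) -> List[float]:
    """Merge the normalized scores of every bullet into one sorted sample."""
    pooled: List[float] = []
    for scores in scores_by_bullet.values():
        pooled.extend(normalize(scores))
    return sorted(pooled)


if __name__ == "__main__":
    # Two hypothetical bullets with very different raw score ranges.
    toy = {
        "bullet_1": [10, 12, 14, 16, 18],
        "bullet_2": [100, 110, 120, 130, 140],
    }
    print(pool_normalized(toy))
```

Pooling is what gives a smooth shape in the high score region: a single bullet contributes few scores there, but many bullets together contribute enough.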

- The simulation reproduces the operations done in a real large database study
- Real study (with sister pairs)
- For each reference bullet
- Introduce its known match in the database of size N
- Compute all correlation scores between the reference and (N+1) bullets in the database
- Find the rank of the known match

- Compute the performance of the correlation algorithm (number of known matches at the first position)


- Simulation:
- For each reference bullet
- Select randomly N non match (normalized) correlation scores from the universal score distribution
- Normalize the (known) score of its known match by using the reference’s individual parameters (mean and width of its non match score distribution)

- Introduce the normalized score of its known match in the (generated) non-match score list
- Find the rank of the known match

- Compute the simulated performance of the correlation algorithm

- Repeat the same process for several database sizes N
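
The simulation steps can be sketched as follows. This is an illustration under assumptions, not the presenter's implementation: the universal distribution is represented by a list of normalized non match scores sampled with replacement, the reference data and the Gaussian stand-in for the universal sample are hypothetical, and ties are counted against the known match.

```python
import random
from typing import Sequence, Tuple

# (known match score, mean of non match scores, width of non match scores)
Reference = Tuple[float, float, float]


def simulate_first_rate(
    universal: Sequence[float],
    references: Sequence[Reference],
    n: int,
    trials: int = 50,
    seed: int = 0,
) -> float:
    """Fraction of simulated searches where the known match ranks first."""
    rng = random.Random(seed)
    hits = 0
    for match_score, mean, width in references:
        # Normalize the known match score with the reference's own parameters.
        z_match = (match_score - mean) / width
        for _ in range(trials):
            # Draw N normalized non match scores from the universal sample.
            non_matches = rng.choices(universal, k=n)
            if z_match > max(non_matches, default=float("-inf")):
                hits += 1
    return hits / (len(references) * trials)


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical stand-in for the universal normalized distribution.
    universal = [rng.gauss(0.0, 1.0) for _ in range(5_000)]
    references = [(9.0, 5.0, 1.5), (8.0, 4.0, 1.0), (6.5, 3.0, 1.2)]
    for n in (100, 1_000):
        print(n, simulate_first_rate(universal, references, n))
```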

Probability that the sister is at the first position as a function of its “normalized” score S

- Dark circles: experimental data
- Dark curve: result from the model

- Gray curves: Prediction for other database sizes


Summary of the figure

- If the sister has a “normalized score” = 8
- The probability of being in first position is
- 90% for N = 500
- 70% for N = 2K
- 20% for N = 10K

- If we want the sister to be at the first position with a 95% probability,
- its score must be
- 9 for N = 500
- 10 for N = 2K
- 12 for N = 10K


- A statistical model of non match scores was built
- a database of 2000 bullets, 4 calibers, 2 compositions/materials
- 3D correlation on BulletTRAX-3D

- Hypothesis:
- The non match score distribution has the same shape for all bullets (except for a shift and stretch factor)
- The model computes the probability that the sister with a given score is in first position
- The prediction agrees with the actual performance in the large database study
- Performance decreases as the database size increases

How could we implement a probabilistic model in a ballistic system?

- Correlate a given bullet against a large database
- From the (large) list of scores, compute the two characteristic parameters of the reference bullet
- mean and width of its non match score distribution

- Compute the probability that the bullet in the first position is a match by using
- The universal non match score distribution
- The two characteristic parameters computed previously
- The actual score of the bullet at the first position
- Information about match score distributions (not yet known)

- Repeat the same process for all score types
- MaxPhase2D, PeakPhase2D, PeakScore2D
- 3DScore

- Combine the 4 probabilities into a unique probability for the bullet
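
A hedged sketch of the first part of this workflow for one score type. It assumes that every score except the top one comes from a non match, that the width is the population standard deviation, and that the universal distribution is available as a list of normalized non match scores; all names and numbers are hypothetical. It stops short of a match probability, since that also needs the match score distributions, which the slides describe as not yet known.

```python
import statistics
from typing import Sequence


def normalized_top_score(scores: Sequence[float]) -> float:
    """Normalize the top score with the mean and width of the other scores.

    Assumption: all scores below the top one are non match scores.
    """
    if len(scores) < 3:
        raise ValueError("need the top score plus at least two others")
    ordered = sorted(scores, reverse=True)
    top, rest = ordered[0], ordered[1:]
    mean = statistics.fmean(rest)
    width = statistics.pstdev(rest)  # assumption: width = standard deviation
    if width == 0.0:
        raise ValueError("non match scores are identical; width is zero")
    return (top - mean) / width


def fraction_reaching(z: float, universal: Sequence[float]) -> float:
    """Share of universal normalized non match scores that reach z."""
    return sum(1 for u in universal if u >= z) / len(universal)


if __name__ == "__main__":
    scores = [10, 12, 14, 16, 18, 40]    # hypothetical correlation results
    universal = [-1.0, 0.0, 1.0, 2.0]    # hypothetical universal sample
    z = normalized_top_score(scores)
    print(z, fraction_reaching(z, universal))
```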

- Improve the model with new large database studies (new calibers)
- Test on cartridges
- Get a better knowledge of sister score distributions
- The current study was done with sister pairs only

- Use the model to improve correlation algorithms