
Deoxyribonucleic acid (DNA) Biometrics CPSC 4600 Biometrics and Cryptography - PowerPoint PPT Presentation



PowerPoint Slideshow about ' Deoxyribonucleic acid (DNA) Biometrics CPSC 4600 Biometrics and Cryptography' - emmly


Presentation Transcript

Deoxyribonucleic acid (DNA) Biometrics
CPSC 4600 Biometrics and Cryptography


DNA

  • DNA analysis is no longer confined to genetic and medical research.

  • Criminal Forensics:

    • Forensic science relies heavily on the ability of DNA to identify the source of biological substances and determine who is most likely to have committed a crime.

    • This ability to identify an individual is enhanced by the variety of substances that contain DNA, including blood, hair, urine, bone, teeth, and tissues.


DNA

  • Criminal Forensics:

    • Using saliva, the FBI was able to match DNA samples from letters mailed to relatives by Theodore Kaczynski with DNA obtained from stamps on letters mailed by the Unabomber (University and Airline Bomber).

    • Identification of specimens using DNA has had other benefits: in one third of the cases where this technique has been used, DNA analysis has exonerated people wrongly accused of crimes.


DNA

  • Establishing paternity

    • DNA analysis is now a common tool for establishing paternity, and it has been called on to identify remains after tragedies such as airline accidents.

  • Investigating migration of human beings and genetic disease

    • Anthropologists are using DNA analysis to study the migration of human beings across the oceans.

    • Historians employ these techniques to identify genetic disease in famous individuals.

  • Tracking endangered species

    • Wildlife biologists use the variation of DNA sequences between species to track endangered species.


Features of DNA

  • DNA is composed of FOUR different chemical building blocks called "bases". These four bases are:

    • adenine (A)

    • guanine (G)

    • thymine (T)

    • cytosine (C)

  • Within each strand, the bases are joined together by strong covalent bonds. The two strands are held together in a double helix because bases with complementary shapes can pair with each other.



Features of DNA (cont’d)

  • Adenine is able to pair with Thymine and Guanine pairs with Cytosine.

  • Complementary base pairs are found along the entire length of the DNA duplex.

  • The complementary nature of the two strands provides a basis for copying genetic information and for passing this information on to offspring.
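The pairing rule can be sketched in a few lines of Python. This is only an illustration and not part of the original slides; the names `PAIRS`, `complement_strand`, and `reverse_complement` are hypothetical. The second function reverses the result because the two strands of the duplex run in opposite directions.

```python
# Watson-Crick pairing: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_strand(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand."""
    return "".join(PAIRS[base] for base in strand)


def reverse_complement(strand: str) -> str:
    """Complement read in the opposite direction (the strands are antiparallel)."""
    return complement_strand(strand)[::-1]


print(complement_strand("AGGCCTC"))   # TCCGGAG
print(reverse_complement("AGGCCTC"))  # GAGGCCT
```

Because each base determines its partner, one strand is enough to rebuild the other, which is what makes copying possible.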


Features of DNA (cont’d)

  • Information is stored in DNA in the sequence of bases just as information can be stored in a book in the sequence of letters.

  • The human genome contains approximately 3 billion base pairs of DNA organized in 23 chromosomes; most human cells carry two copies of it, as 23 pairs of chromosomes.

  • Every person inherits one set of 23 chromosomes from the mother and one set of 23 chromosomes from the father.


Techniques used for DNA fingerprinting

  • Isolating the DNA in question from the rest of the cellular material in the nucleus.

  • Cutting the DNA into several pieces of different sizes.

  • Sorting the DNA pieces by size.

  • Denaturing the DNA, so that all of the DNA is rendered single-stranded. This can be done either by heating or chemically treating the DNA in the gel.

  • Blotting the DNA.

  • Detecting the DNA sequence, e.g. AGGCCTC.

  • More: http://protist.biology.washington.edu/fingerprint/dnaintro.html
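The cutting and sorting steps can be mimicked in software. The sketch below is an illustration under stated assumptions: it uses a single restriction enzyme with the EcoRI recognition site GAATTC, cut after the first G, and the sample sequence and the function name `digest` are made up. It splits a sequence at every site and lists the fragments by length, the ordering that a gel produces physically.

```python
def digest(dna: str, site: str = "GAATTC", cut_offset: int = 1) -> list[str]:
    """Cut dna at every occurrence of site, cut_offset bases into the site."""
    fragments = []
    start = 0
    pos = dna.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(dna[start:cut])
        start = cut
        pos = dna.find(site, pos + 1)
    fragments.append(dna[start:])
    return fragments


sample = "AGGCGAATTCCTCAGGAATTCTTGACCA"  # made-up sequence with two sites
fragments = digest(sample)

# Sort by size, as a gel would separate them.
for fragment in sorted(fragments, key=len):
    print(len(fragment), fragment)
```

Two individuals whose DNA differs at a recognition site would produce different fragment lengths, which is the variation that fingerprinting detects.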


Polymerase Chain Reaction (PCR) for DNA Fingerprinting

  • Often DNA samples obtained from crime scenes are too small in quantity or too degraded by sunlight or high temperature to be analyzed by the restriction fragment length polymorphism (RFLP) method.

  • These samples are subjected to a different fingerprinting technique known as PCR.

  • PCR is a valuable technique because it provides a method for producing millions of copies of small regions of DNA.
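The "millions of copies" figure follows from simple doubling arithmetic. The sketch below assumes an idealized reaction in which every cycle exactly doubles the number of copies (real reactions are less efficient); the function name `cycles_needed` is hypothetical.

```python
def cycles_needed(target_copies: int, starting_copies: int = 1) -> int:
    """Smallest number of ideal doubling cycles that reaches target_copies."""
    cycles = 0
    copies = starting_copies
    while copies < target_copies:
        copies *= 2  # each ideal cycle doubles the template
        cycles += 1
    return cycles


print(cycles_needed(1_000_000))  # 20, since 2**20 = 1,048,576
```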


DNA Matching -- Sequence Alignment

AGGCTATCACCTGACCTCCAGGCCGATGCCC

TAGCTATCACGACCGCGGTCGATTTGCCCGAC

-AGGCTATCACCTGACCTCCAGGCCGA--TGCCC---

TAG-CTATCAC--GACCGC--GGTCGATTTGCCCGAC

Definition

Given two strings x = x1x2...xM and y = y1y2...yN, an alignment is an assignment of gaps to positions 0,…, M in x and 0,…, N in y, so as to line up each letter in one sequence with either a letter or a gap in the other sequence.
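One way to read the definition is as a checklist: both gapped rows must have the same length, no column may pair a gap with a gap, and removing the gaps must give back the original sequences. The sketch below applies that checklist to the two sequences shown above; `is_alignment` is a hypothetical helper, not something from the slides.

```python
def is_alignment(x: str, y: str, row_x: str, row_y: str, gap: str = "-") -> bool:
    """True if (row_x, row_y) is an alignment of x and y."""
    same_length = len(row_x) == len(row_y)
    no_double_gap = all(not (a == gap and b == gap) for a, b in zip(row_x, row_y))
    restores = row_x.replace(gap, "") == x and row_y.replace(gap, "") == y
    return same_length and no_double_gap and restores


x = "AGGCTATCACCTGACCTCCAGGCCGATGCCC"
y = "TAGCTATCACGACCGCGGTCGATTTGCCCGAC"
row_x = "-AGGCTATCACCTGACCTCCAGGCCGA--TGCCC---"
row_y = "TAG-CTATCAC--GACCGC--GGTCGATTTGCCCGAC"

print(is_alignment(x, y, row_x, row_y))  # True
```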


What is a good alignment?

Alignment:

The “best” way to match the letters of one sequence with those of the other

How do we define “best”?

Alignment:

A hypothesis that the two sequences come from a common ancestor through sequence edits

Parsimonious explanation:

Find the minimum number of edits that transform one sequence into the other


Scoring Function

  • Sequence edits:

    AGGCCTC

    • Mutations AGGACTC

    • Insertions AGGGCCTC

    • Deletions AGG .CTC

      Scoring Function:

      Match: +m

      Mismatch: -s

      Gap: -d

      Score F = (# matches) × m - (# mismatches) × s - (# gaps) × d
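The scoring function can be applied mechanically to any gapped alignment. A minimal sketch, with `score_alignment` a hypothetical helper and m, s, d passed as positive magnitudes as in the formula above:

```python
def score_alignment(row_x: str, row_y: str, m: int = 1, s: int = 1, d: int = 1) -> int:
    """F = (#matches)*m - (#mismatches)*s - (#gaps)*d for two gapped rows."""
    matches = mismatches = gaps = 0
    for a, b in zip(row_x, row_y):
        if a == "-" or b == "-":
            gaps += 1
        elif a == b:
            matches += 1
        else:
            mismatches += 1
    return matches * m - mismatches * s - gaps * d


# The three edits of AGGCCTC listed above, each scored with m = s = d = 1.
print(score_alignment("AGGCCTC", "AGGACTC"))    # one mutation:  6 - 1 = 5
print(score_alignment("AGG-CCTC", "AGGGCCTC"))  # one insertion: 7 - 1 = 6
print(score_alignment("AGGCCTC", "AGG-CTC"))    # one deletion:  6 - 1 = 5
```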


How do we compute the best alignment?

x (length M): AGTGCCCTGGAACCCTGACGGTGGGTCACAAAACTTCTGGA

y (length N): AGTGACCTGGGAAGACCCTGACCCTGGGTCACAAAACTC

Too many possible alignments: $O(2^{M+N})$


DNA Matching -- Dot matrix method

  • The dot matrix method (dot plot method) is a graphical way of comparing two sequences.

  • In a dot matrix, two sequences to be compared are represented as horizontal and vertical axes of a two-dimensional diagram.

  • The comparison is done by scanning each residue of one sequence for similarity with all residues in the other sequence.


Dot matrix method (cont’d)

  • If a residue match is found, a dot is placed within the graph. Otherwise, the matrix positions will be left blank.

  • When the two sequences have substantial regions of similarity, many dots line up to form contiguous diagonal lines, which reveal the sequence alignment.

  • If there are interruptions in the middle of a diagonal line, they will indicate insertions and deletions. Parallel diagonal lines represent repetition.

Basically

Diagonal lines = alignment

Non-diagonal lines = gaps
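A dot matrix is easy to produce as text. The sketch below is an illustration only: `dot_matrix` is a hypothetical helper, matches are drawn as `*`, and blank positions are drawn as `.` so that they stay visible. It reuses the short sequences AGGCCTC and AGGACTC from the scoring slide.

```python
def dot_matrix(seq_x: str, seq_y: str) -> list[str]:
    """One text row per residue of seq_y; '*' marks a match with seq_x."""
    return ["".join("*" if a == b else "." for a in seq_x) for b in seq_y]


seq_x = "AGGCCTC"  # horizontal axis
seq_y = "AGGACTC"  # vertical axis

print("  " + seq_x)
for residue, row in zip(seq_y, dot_matrix(seq_x, seq_y)):
    print(residue + " " + row)
```

In the printed grid the main diagonal is filled except at the fourth position, where the two sequences differ by one substitution.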


Dynamic Programming

Dynamic programming is a method that determines optimal alignment between two sequences.

Suppose we wish to align x1……xM with y1……yN.

Let F(i, j) = optimal score of aligning x1……xi with y1……yj.


Dynamic Programming (cont’d)

Three steps:

1. Create a two-dimensional alignment grid, as in the dot matrix method.

2. Accumulate scores in the matrix for matches and mismatches between the sequences.

3. Trace back through the matrix in reverse order to identify the highest-scoring path.


Dynamic Programming (cont’d)

Notice three possible cases:

  1. xi aligns to yj

     x1……xi-1 xi

     y1……yj-1 yj

     $$F(i,j) = F(i-1,j-1) + \begin{cases} m, & \text{if } x_i = y_j \\ -s, & \text{if not} \end{cases}$$

  2. xi aligns to a gap

     x1……xi-1 xi

     y1……yj -

     $$F(i,j) = F(i-1,j) - d$$

  3. yj aligns to a gap

     x1……xi -

     y1……yj-1 yj

     $$F(i,j) = F(i,j-1) - d$$

[Figure: cell F(i, j) is reached from F(i-1, j-1) with +m/-s, from F(i-1, j) with -d, and from F(i, j-1) with -d.]

Match: +m

Mismatch: -s

Gap: -d


Dynamic Programming (cont’d)

  • How do we know which case is correct?

    Inductive assumption: F(i, j-1), F(i-1, j), F(i-1, j-1) are optimal.

    Then,

    $$F(i,j) = \max \begin{cases} F(i-1,j-1) + s(x_i,y_j) \\ F(i-1,j) - d \\ F(i,j-1) - d \end{cases}$$

    where s(xi, yj) = m if xi = yj, and -s if not.

Match: +m

Mismatch: -s

Gap: -d


Intuitive understanding of the algorithm

F(i, j) is the maximum score from one of the three directions:

  • from F(i-1, j-1), diagonally: +m/-s

  • from F(i-1, j): -d

  • from F(i, j-1): -d

Match: +m

Mismatch: -s

Gap: -d


Example

x = AGTA, y = ATA

m = 1, s = 1, d = 1

F(i,j)   i = 0    1    2    3    4
j = 0        0   -1   -2   -3   -4
j = 1       -1    1    0   -1   -2
j = 2       -2    0    0    1    0
j = 3       -3   -1   -1    0    2

Optimal Alignment:

F(4,3) = 2

AGTA

A-TA


Example (cont’d)

x = AGTA, y = ATA

m = 1, s = 1, d = 1

Optimal Alignment:

F(4,3) = 2

AGTA

A-TA

Score = 3 matches, 0 mismatches, 1 gap

= 3×1 - 0×1 - 1×1 = 2


The Needleman-Wunsch Matrix

[Figure: a grid with x1 … xM along one axis and y1 … yN along the other.]

Every nondecreasing path from (0,0) to (M, N) corresponds to an alignment of the two sequences.

Can think of it as a divide-and-conquer algorithm.


The Needleman-Wunsch Algorithm

  • Initialization.

    • F(0, 0) = 0

    • F(0, j) = -j × d

    • F(i, 0) = -i × d

  • Main Iteration. Filling in partial alignments

    • For each i = 1……M

      For each j = 1……N

      $$F(i,j) = \max \begin{cases} F(i-1,j) - d & \text{[case 1]} \\ F(i,j-1) - d & \text{[case 2]} \\ F(i-1,j-1) + s(x_i,y_j) & \text{[case 3]} \end{cases}$$

      $$\mathrm{Ptr}(i,j) = \begin{cases} \text{UP}, & \text{if [case 1]} \\ \text{LEFT}, & \text{if [case 2]} \\ \text{DIAG}, & \text{if [case 3]} \end{cases}$$

  • Termination. F(M, N) is the optimal score, and from Ptr(M, N) we can trace back the optimal alignment.
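The initialization, main iteration, and traceback can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name is hypothetical, m, s, and d are positive magnitudes, ties are broken in favor of the diagonal move (an arbitrary choice), and the defaults match the AGTA / ATA example.

```python
def needleman_wunsch(x: str, y: str, m: int = 1, s: int = 1, d: int = 1):
    """Global alignment; returns (optimal score, aligned x, aligned y)."""
    M, N = len(x), len(y)
    # F[i][j] = best score of aligning x[:i] with y[:j]
    F = [[0] * (N + 1) for _ in range(M + 1)]
    ptr = [[None] * (N + 1) for _ in range(M + 1)]
    for i in range(1, M + 1):
        F[i][0], ptr[i][0] = -i * d, "UP"
    for j in range(1, N + 1):
        F[0][j], ptr[0][j] = -j * d, "LEFT"

    for i in range(1, M + 1):
        for j in range(1, N + 1):
            sub = m if x[i - 1] == y[j - 1] else -s
            candidates = [
                (F[i - 1][j - 1] + sub, "DIAG"),  # x_i aligned to y_j
                (F[i - 1][j] - d, "UP"),          # x_i aligned to a gap
                (F[i][j - 1] - d, "LEFT"),        # y_j aligned to a gap
            ]
            F[i][j], ptr[i][j] = max(candidates, key=lambda c: c[0])

    # Trace back from (M, N) to (0, 0), building the rows in reverse.
    row_x, row_y = [], []
    i, j = M, N
    while i > 0 or j > 0:
        move = ptr[i][j]
        if move == "DIAG":
            row_x.append(x[i - 1])
            row_y.append(y[j - 1])
            i, j = i - 1, j - 1
        elif move == "UP":
            row_x.append(x[i - 1])
            row_y.append("-")
            i -= 1
        else:
            row_x.append("-")
            row_y.append(y[j - 1])
            j -= 1
    return F[M][N], "".join(reversed(row_x)), "".join(reversed(row_y))


print(needleman_wunsch("AGTA", "ATA"))  # (2, 'AGTA', 'A-TA')
```

The nested loops visit each of the M × N cells once, and the two tables hold one entry per cell.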


Performance

  • Time:

    O(NM)

  • Space:

    O(NM)


The local alignment problem

Given two strings x = x1……xM and y = y1……yN, find substrings x’, y’ whose similarity (optimal global alignment value) is maximum.

e.g. x = aaaacccccgggg

     y = cccgggaaccaacc



The Smith-Waterman algorithm

Idea: Ignore badly aligning regions

Modifications to Needleman-Wunsch:

Initialization: F(0, j) = F(i, 0) = 0

Iteration:

$$F(i,j) = \max \begin{cases} 0 \\ F(i-1,j) - d \\ F(i,j-1) - d \\ F(i-1,j-1) + s(x_i,y_j) \end{cases}$$


The Smith-Waterman algorithm (cont’d)

Termination:

  • If we want the best local alignment:

    $$F_{OPT} = \max_{i,j} F(i,j)$$

  • If we want all local alignments scoring > t:

    For all i, j find F(i, j) > t, and trace back


Smith-Waterman Algorithm (Example)

  • Align S1 = ATCTCGTATGATG with S2 = GTCTATCAC

  • s(xi, yj) = m if xi = yj, and -s if not

  • d = 1

[Figure: Smith-Waterman score matrix with S1 along the columns and S2 along the rows; the first row and column are all 0, and the highest-scoring cell has value 10.]


An example of Smith Waterman

Align with DP: A T T G C and A G G C

Match: +1

Mismatch: 0

Gap: -1

[Figure: score matrix for the two sequences; the highest-scoring cell has value 2.]

Score = 3 matches + 1 mismatch + 1 gap

= 3×1 + 1×0 + 1×(-1) = 2




Issues and concerns

  • Excessive focus on the biometric itself can eclipse weaknesses elsewhere in the technology. An attacker could:

    • plant DNA at the scene of the crime

    • associate another person's identity with their own biometrics, thereby impersonating that person without arousing suspicion

    • interfere with the interface between a biometric device and the host system, so that a "fail" message gets converted to a "pass".


Identity theft and privacy issues

  • Two types of privacy concerns:

    • Informational privacy. Relates to the unauthorized collection, storage, and usage of biometric information. For example, if someone’s iris scan is stolen and lets someone else access personal information or financial accounts, the damage could be irreversible.

    • Personal privacy. Relates to an inherent discomfort individuals may feel when encountering biometric technology.

    • The former is more critical.


Defining Application-Specific Privacy Risk: The BioPrivacy Impact Framework

  • Certain types of biometric deployments are more prone than others to lead to privacy-invasive uses, while other types of deployments have little or no bearing on privacy.

  • Biometrics, in and of themselves, are neither a protector nor an enemy of privacy.

  • The type of deployment determines the relation between biometrics and privacy.


Biometric Deployments

  • Overt versus Covert

    • User awareness and consent,

    • Notices and signs

    • A covert system should not permanently store biometric information collected from individuals who do not match watch lists.

  • Opt-in versus Mandatory

    • A mandatory system runs greater privacy risks than a voluntary or opt-in system.

    • Choice over whether one wants to provide one’s personal info is a central privacy principle.


Biometric Deployments (cont’d)

  • Verification versus Identification

    • Identification (1:N) is more susceptible to privacy-related abuse than a system only capable of 1:1 matching.

  • Fixed Duration versus Indefinite Duration

    • When deployed for an indefinite duration, the risk increases.

  • Public Sector versus Private Sector

    • Data held in the public sector are more likely to be misused.


Biometric Deployments (cont’d)

  • Citizen, Employee, Traveler, Student, Customer, Individual

  • User ownership versus Institutional Ownership of Biometric Data

  • Personal Storage versus Storage in Template Database


Sociological concerns

  • Physical concerns:

    • Biometric technology could cause physical harm to the individuals using it, or the instruments could be unsanitary.

  • Personal information concerns:

    • Personal information taken through biometric methods could be misused, tampered with, or sold, e.g. by criminals stealing, rearranging, or copying the biometric data.

    • The data obtained using biometrics could be used in unauthorized ways without the individual's consent.


Sociological concerns (cont’d)

  • Society's fears about using biometrics will continue over time. As the public becomes more educated about the practices and the methods become more widely used, these concerns will become more and more evident.

  • Biometric technology is being used at border crossings equipped with electronic readers, which read the chip in the card and verify the information present in the card and on the passport.

  • Biometric methods increase the efficiency and accuracy of identifying people at border crossings. CANPASS, run by Canada Customs, is currently used at some major airports, where kiosks take digital pictures of a person’s eye as a means of identification.


Conclusions

  • Despite these misgivings, biometric systems have the potential to identify individuals with a very high degree of certainty.

  • Forensic DNA evidence enjoys a particularly high degree of public trust at present.

  • Substantial claims are also being made for iris recognition technology, which has the capacity to discriminate between individuals with identical DNA, such as monozygotic twins.

