Identity in the Census


PowerPoint Slideshow about 'Identity in the Census' - Patman


Presentation Transcript

Identity in the Census

Finding people in more than one census

What is Identity?
  • A unique set of identifiers
What is an Identifier?
  • Any measurable attribute
  • In Census - Name, Age, Sex, Birth State
  • AND Household characteristics
Basic Record Linking
  • Generalize identifiers to block
  • Compare within a block more specifically to match
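
The two steps above can be sketched in Python. This is a minimal illustration, not the presenter's implementation: the field names, the choice of sex plus surname initial as the block key, the two-year age tolerance, and both 1870 records are assumptions made up for the example.

```python
from collections import defaultdict

def block_key(rec):
    # Generalized identifier (assumed choice): sex plus surname initial.
    return rec["sex"].lower() + rec["surname"][0].upper()

def link(census_a, census_b, year_gap=10, tolerance=2):
    """Block census_b, then compare each census_a record within its block."""
    blocks = defaultdict(list)
    for rec in census_b:
        blocks[block_key(rec)].append(rec)

    matches = []
    for a in census_a:
        for b in blocks.get(block_key(a), []):
            # More specific comparison, done only inside the block.
            same_name = (a["given"].lower() == b["given"].lower()
                         and a["surname"].lower() == b["surname"].lower())
            age_ok = abs((b["age"] - a["age"]) - year_gap) <= tolerance
            if same_name and age_ok:
                matches.append((a, b))
    return matches

census_1860 = [
    {"given": "John", "surname": "Keyes", "sex": "m", "age": 30},
    {"given": "Hannah", "surname": "Keyes", "sex": "f", "age": 27},
]
# Hypothetical 1870 records, invented for illustration only.
census_1870 = [
    {"given": "John", "surname": "Keyes", "sex": "m", "age": 41},
    {"given": "Mary", "surname": "Lane", "sex": "f", "age": 22},
]

matches = link(census_1860, census_1870)
```

Only records sharing a block key are ever compared, so the number of detailed comparisons drops from every pair to pairs within each block.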

Why?

  • GEDCOM was about exchange – we’ve abandoned that in favor of linkage
  • Local conclusions, remote evidence
Household Characteristics?
  • Oldest male
  • Oldest female
  • Oldest boy
  • Oldest girl
Types of Identifiers
  • Cultural
  • Biological
Cultural Identifiers
  • Surname
  • Given Name
  • Family Role
Biological Identifiers
  • Sex
  • Age
  • Parent / Child roles
Coding Identifiers
  • Soundex
  • Initials
  • Birth year
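
As an illustration of the first coding, here is a short sketch of standard American Soundex in Python. The slides do not say which Soundex variant was used, so this choice is an assumption, and the function name is hypothetical.

```python
# Letter-to-digit table for standard American Soundex.
SOUNDEX_CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        SOUNDEX_CODES[ch] = digit

def soundex(name):
    """Return the 4-character Soundex code of a name ('' if it has no letters)."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0]
    prev = SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = SOUNDEX_CODES.get(ch, "")
        if digit:
            if digit != prev:      # collapse runs of the same code
                code += digit
            prev = digit
        elif ch not in "HW":       # vowels and Y break a run; H and W do not
            prev = ""
        if len(code) == 4:
            break
    return code.ljust(4, "0")      # pad short codes with zeros

print(soundex("Keyes"))   # K200
print(soundex("Keelan"))  # K450
```

Spelling variants that sound alike tend to share a code, which is what makes the code usable as a blocking key.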
Why code identifiers?
  • Because exact matching doesn’t work
  • Expressions of identifiers vary from record to record (granularity, etc.)
  • To speed up comparisons by allowing blocking on a matched code
Examples: Carroll Co AR
  • 1860 and 1870
  • Surnames beginning with K and L
What kind of keys did you use?
  • qry1860OldWoman
  • Sex (f)
  • Initial of Surname
  • Initial of coded first name
  • Estimated birth year / 5 (whole-number part)
  • Example: Mary Keelan, age 13 in 1860 (estimated birth year 1847; 1847 / 5 = 369)
  • Key: fKM369
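
The key construction above can be sketched as a small Python function. The function and parameter names are hypothetical, and the sketch assumes the "coded first name" is simply the given name as recorded; the slides do not describe how first names were coded.

```python
def person_key(sex, surname, given, age, census_year, bucket=5):
    """Build a key: sex + surname initial + given-name initial + birth year // bucket."""
    birth_year = census_year - age           # estimated birth year
    return (sex.lower()
            + surname[0].upper()
            + given[0].upper()
            + str(birth_year // bucket))     # integer division drops the remainder

print(person_key("f", "Keelan", "Mary", 13, 1860))  # fKM369
```

Dividing the birth year by 5 places ages that differ by a year or two in the same bucket most of the time, though two close birth years can still fall on either side of a bucket boundary.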
1860 Family
  • John KEYES, 30
  • Hannah KEYES, 27
  • Housekey = mKJ366fKH366
  • Less granular key = mKJfKH
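
A sketch of how such a household key might be assembled in Python, assuming it concatenates the key of the oldest male with the key of the oldest female, in that order (inferred from the example, not stated in the slides). The record layout and function names are hypothetical.

```python
def person_key(rec, census_year, with_birth=True, bucket=5):
    """Sex + surname initial + given-name initial, optionally + birth year // bucket."""
    key = rec["sex"].lower() + rec["surname"][0].upper() + rec["given"][0].upper()
    if with_birth:
        key += str((census_year - rec["age"]) // bucket)
    return key

def house_key(household, census_year, with_birth=True):
    """Concatenate the keys of the oldest male and the oldest female, if present."""
    parts = []
    for sex in ("m", "f"):
        members = [r for r in household if r["sex"].lower() == sex]
        if members:
            oldest = max(members, key=lambda r: r["age"])
            parts.append(person_key(oldest, census_year, with_birth))
    return "".join(parts)

family_1860 = [
    {"given": "John", "surname": "Keyes", "sex": "m", "age": 30},
    {"given": "Hannah", "surname": "Keyes", "sex": "f", "age": 27},
]

print(house_key(family_1860, 1860))                    # mKJ366fKH366
print(house_key(family_1860, 1860, with_birth=False))  # mKJfKH
```

Dropping the birth-year bucket gives the less granular key, which matches more households at the cost of more false candidates.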
Easy Match
  • Surname Soundex
  • First initial
  • Birth Year
  • Easy List
Other Matches
  • Universe 778 records
  • Key 2 – mKJ – 449 matches
  • EasyList – 78 matches
  • Key 3 – mKJ382 – 38 matches
  • Mom and oldest boy – key3 – 8 matches – 1 right
  • Housekey – mLA368fLE368 – 4 matches – 3 right
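
One way to compare the two keys for which the slide reports correct matches is the share of candidate matches that were right (precision). The short Python sketch below uses only the counts from the slide; the function name and dictionary labels are hypothetical.

```python
def precision(matches, right):
    """Fraction of candidate matches that turned out to be correct."""
    return right / matches

# (matches, right) as reported on the slide.
results = {
    "Mom and oldest boy (key 3)": (8, 1),
    "Housekey mLA368fLE368": (4, 3),
}

for label, (matches_found, right) in results.items():
    print(f"{label}: {precision(matches_found, right):.3f}")
# Mom and oldest boy (key 3): 0.125
# Housekey mLA368fLE368: 0.750
```

On these counts the household key returns fewer candidates, and a much larger share of them are right.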
Work to do
  • Measure the effectiveness of different sets of identifiers
  • Scale the algorithms to larger data sets
  • Abandon linking for a Cartesian Event Space