SJTU CMGPD 2012 Methodological Lecture

abiola

Presentation Transcript


  1. SJTU CMGPD 2012 Methodological Lecture. Day 9: Kinship

  2. Ancestry identifiers: Specific patrilineal ancestors
     • In the Basic file:
       • FATHER_ID
       • GRANDFATHER_ID
     • In the Kinship file:
       • F_ID_1 – same as FATHER_ID
       • F_ID_2 – same as GRANDFATHER_ID
       • F_ID_3 – Great-grandfather
       • F_ID_4 – Great-great-grandfather

  3. Ancestry identifiers: Specific patrilineal ancestors
     • Wives of paternal ancestors:
       • M_ID_1 – Mother (same as MOTHER_ID in Basic)
       • M_ID_2 – Paternal grandmother: father's mother (fm)
       • M_ID_3 – Paternal great-grandmother (ffm)
       • M_ID_4 – Paternal great-great-grandmother (fffm)

  4. Ancestry identifiers: Inferred ancestors
     • Most identifiers refer to actual individuals observed in the dataset.
     • In some cases, the existence of a common ancestor whose death predated the earliest available register is inferred from relationship codes.
       • Brothers in the earliest available register are inferred to have a common father.
       • Cousins in the earliest available register are inferred to have a common grandfather.
     • For grouping purposes, such an ancestor is assigned an identifier that does not refer to anyone observed in the dataset (there is no corresponding PERSON_ID).
     • FATHER_ID_IMPUTED and GRANDFATHER_ID_IMPUTED are flags indicating that the IDs do not refer to anyone observed in the dataset.
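
The distinction between observed and inferred fathers can be illustrated with a minimal sketch on toy data. The variable names follow the slide, but the ID values and the 0/1 coding of FATHER_ID_IMPUTED are assumptions made for illustration; check the codebook for the actual coding.

```stata
* Toy data: names follow the slides; values and the 0/1 flag coding are assumed
clear
input str6 PERSON_ID str6 FATHER_ID byte FATHER_ID_IMPUTED
"100001" "900001" 1
"100002" "900001" 1
"100003" "100001" 0
"100004" "100001" 0
"100005" "-99"    0
end

* Inferred IDs still work for grouping: brothers share the same FATHER_ID
bysort FATHER_ID: generate sons_in_group = _N if FATHER_ID != "-99"

* Inferred fathers have no record of their own, so only attempt to look up
* the father's characteristics when the flag says he was actually observed
generate byte father_observed = FATHER_ID != "-99" & FATHER_ID_IMPUTED == 0

list, sepby(FATHER_ID)
```

In this toy example the two brothers with the inferred father "900001" are still grouped together, but only the sons of "100001" are marked as having an observed father.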

  5. Distributions of men by numbers of descendants

  6. * Sons: keep one record per man, then count men by FATHER_ID
     use "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0001\27063-0001-Data.dta" if SEX == 2 & PRESENT
     bysort PERSON_ID: keep if _n == 1
     keep FATHER_ID
     keep if FATHER_ID != "-99"
     bysort FATHER_ID: generate sons = _N
     bysort FATHER_ID: keep if _n == 1
     rename FATHER_ID PERSON_ID
     save Sons, replace

     * Grandsons: same steps, counting by GRANDFATHER_ID
     use "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0001\27063-0001-Data.dta" if SEX == 2 & PRESENT
     bysort PERSON_ID: keep if _n == 1
     keep GRANDFATHER_ID
     keep if GRANDFATHER_ID != "-99"
     bysort GRANDFATHER_ID: generate grandsons = _N
     bysort GRANDFATHER_ID: keep if _n == 1
     rename GRANDFATHER_ID PERSON_ID
     save Grandsons, replace

     * Great-grandsons: F_ID_3 comes from the Kinship file, merged on RECORD_NUMBER
     use "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0001\27063-0001-Data.dta" if SEX == 2 & PRESENT
     bysort PERSON_ID: keep if _n == 1
     merge 1:1 RECORD_NUMBER using "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0004\27063-0004-Data.dta", keepusing(F_ID_3) keep(match master)
     keep F_ID_3
     keep if F_ID_3 != "-99"
     * Drop the first two characters so the ID can later be matched to PERSON_ID
     replace F_ID_3 = substr(F_ID_3,3,.)
     bysort F_ID_3: generate ggrandsons = _N
     bysort F_ID_3: keep if _n == 1
     rename F_ID_3 PERSON_ID
     save GGrandsons, replace

  7. * Great-great-grandsons: same steps as for F_ID_3, using F_ID_4
     use "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0001\27063-0001-Data.dta" if SEX == 2 & PRESENT
     bysort PERSON_ID: keep if _n == 1
     merge 1:1 RECORD_NUMBER using "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0004\27063-0004-Data.dta", keepusing(F_ID_4) keep(match master)
     keep F_ID_4
     keep if F_ID_4 != "-99"
     replace F_ID_4 = substr(F_ID_4,3,.)
     bysort F_ID_4: generate gggrandsons = _N
     bysort F_ID_4: keep if _n == 1
     rename F_ID_4 PERSON_ID
     save GGGrandsons, replace

     * Base sample: men whose first record is dated 1810 or earlier
     use "C:\Users\Cameron Campbe\Documents\Baqi\CMGPD-LN from ICPSR\ICPSR_27063\DS0001\27063-0001-Data.dta" if SEX == 2 & PRESENT
     bysort PERSON_ID (YEAR): keep if _n == 1 & YEAR <= 1810
     keep PERSON_ID

     * Attach the descendant counts; men with no match have zero descendants
     merge 1:1 PERSON_ID using Sons, keep(match master)
     replace sons = 0 if sons == .
     drop _merge
     merge 1:1 PERSON_ID using Grandsons, keep(match master)
     replace grandsons = 0 if grandsons == .
     drop _merge
     merge 1:1 PERSON_ID using GGrandsons, keep(match master)
     replace ggrandsons = 0 if ggrandsons == .
     drop _merge
     merge 1:1 PERSON_ID using GGGrandsons, keep(match master)
     replace gggrandsons = 0 if gggrandsons == .
     drop _merge

  8. * Top-code each count at 20, then count the men at each value
     replace sons = 20 if sons >= 20
     bysort sons: generate first_in_sons = _n == 1
     bysort sons: generate sons_number = _N
     label variable sons_number "Sons"

     replace grandsons = 20 if grandsons >= 20
     bysort grandsons: generate first_in_grandsons = _n == 1
     bysort grandsons: generate grandsons_number = _N
     label variable grandsons_number "Grandsons"

     replace ggrandsons = 20 if ggrandsons >= 20
     bysort ggrandsons: generate first_in_ggrandsons = _n == 1
     bysort ggrandsons: generate ggrandsons_number = _N
     label variable ggrandsons_number "Great-grandsons"

     replace gggrandsons = 20 if gggrandsons >= 20
     bysort gggrandsons: generate first_in_gggrandsons = _n == 1
     bysort gggrandsons: generate gggrandsons_number = _N
     label variable gggrandsons_number "Great-great-grandsons"

     * One line per generation, with the number of men on a log scale
     twoway line sons_number sons if first_in_sons, sort yscale(log) ///
         || line grandsons_number grandsons if first_in_grandsons, sort ///
         || line ggrandsons_number ggrandsons if first_in_ggrandsons, sort ///
         || line gggrandsons_number gggrandsons if first_in_gggrandsons, sort ///
         ||, scheme(s1mono) xtitle("Number of descendants") ///
             ytitle("Number of men") ylabel(1 10 100 1000 10000)

  9. Kinship variables for grouping: Uses
     • Controlling for kin group membership
       • Via random-effects models
       • Alongside village, household, other levels
       • Multiple levels are computationally demanding
       • Often need tricks to collapse observations or otherwise reduce the dataset
     • Computation of explanatory variables
       • Aggregate measures of kin network status to use as right-hand side variables
     • Units of analysis in their own right
       • See yesterday
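
As an illustration of computing an aggregate kin-network measure for use as a right-hand side variable, here is a minimal sketch on toy data. FOUNDER_ID, YEAR and PERSON_ID are names from the slides; has_position is a hypothetical 0/1 status indicator and all values are invented.

```stata
* Toy data: has_position is a hypothetical 0/1 status indicator
clear
input str4 FOUNDER_ID int YEAR str6 PERSON_ID byte has_position
"F1" 1800 "P1" 1
"F1" 1800 "P2" 0
"F1" 1800 "P3" 0
"F2" 1800 "P4" 0
"F2" 1800 "P5" 0
end

* Totals within each kin group in each register year
bysort FOUNDER_ID YEAR: egen kin_total = total(has_position)
bysort FOUNDER_ID YEAR: generate kin_n = _N

* Share of OTHER kin with the status: subtract the index person's own value
* so the measure is not mechanically correlated with his own outcome
generate kin_share_others = (kin_total - has_position) / (kin_n - 1) if kin_n > 1

list, sepby(FOUNDER_ID YEAR)
```

Excluding the index person is a design choice of this sketch, not something the slides prescribe. The same pattern should work with UNIQUE_YI_HU or UNIQUE_GROUP in place of FOUNDER_ID to define a wider kin network.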

  10. Kinship variables for grouping: Ascending order of kin distance
     • FOUNDER_ID: descent from a common male ancestor in the registers
     • FOUNDER_INFERRED_ID: descent from a common male ancestor inferred from relationship codes in the earliest available register
     • UNIQUE_YI_HU: descent from members of the same yihu in the earliest available register
     • UNIQUE_GROUP: descent from members of adjacent yihu with the same surname in the earliest available register

  11. Numbers and average sizes of units
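
A table of this kind could be computed along the following lines. This is a sketch on invented toy data, not the code behind the original slide; only the variable names FOUNDER_ID and UNIQUE_GROUP come from the slides.

```stata
* Toy data: six records nested in four FOUNDER_ID groups and two UNIQUE_GROUPs
clear
input str6 PERSON_ID str4 FOUNDER_ID str4 UNIQUE_GROUP
"P1" "F1" "G1"
"P2" "F1" "G1"
"P3" "F2" "G1"
"P4" "F3" "G2"
"P5" "F3" "G2"
"P6" "F4" "G2"
end

foreach v of varlist FOUNDER_ID UNIQUE_GROUP {
    * Tag one record per unit and attach the unit's size to every record
    egen byte tag_`v' = tag(`v')
    bysort `v': generate size_`v' = _N
    * Summarizing the tagged records gives the number of units and mean size
    quietly summarize size_`v' if tag_`v'
    display "`v': " r(N) " units, mean size " %4.2f r(mean)
}
```

In the toy data there are four FOUNDER_ID units averaging 1.5 records each and two UNIQUE_GROUP units averaging 3. In the actual data the records are person-year observations, so "size" here means observations per unit unless the data are first reduced to one record per person.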

  12. Kinship variables for grouping: FOUNDER_ID
     • PERSON_ID of the earliest male ancestor located in the registers
     • Most narrowly defined grouping variable
       • Based on descent from a single observed individual
     • Many extinctions
       • Within one or two generations
       • Causes the average size of groups defined by FOUNDER_ID to rise over time
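
One rough way to look at extinction is to compare the last year each group appears with the last year in the data. The sketch below uses invented toy data and a deliberately crude definition: a group is flagged as extinct if it is last seen before the final year in the dataset. With real registers, a group can also vanish because of missing registers or out-migration, so this flag is an assumption, not a definition taken from the slides.

```stata
* Toy data: three descent groups observed in up to three register years
clear
input str4 FOUNDER_ID int YEAR
"F1" 1789
"F1" 1792
"F1" 1795
"F2" 1789
"F2" 1792
"F3" 1789
end

* First and last year in which each group has any record
bysort FOUNDER_ID: egen first_year = min(YEAR)
bysort FOUNDER_ID: egen last_year = max(YEAR)
egen byte one_per_group = tag(FOUNDER_ID)

* Crude flag: group disappears before the final year present in the data
summarize YEAR, meanonly
generate byte extinct = last_year < r(max)

tabulate extinct if one_per_group
```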

  13. * Observations per FOUNDER_ID, top-coded at 200, one value per group
     bysort FOUNDER_ID: generate founder_id_obs = _N
     bysort FOUNDER_ID: generate first_in_founder_id = _n == 1
     replace founder_id_obs = 200 if founder_id_obs > 200
     histogram founder_id_obs if first_in_founder_id, width(10) scheme(s1mono) ///
         xtitle("Number of observations with same FOUNDER_ID") fraction

  14. * Mean number of observations per FOUNDER_ID in each register year
     bysort FOUNDER_ID YEAR: generate founder_id_obs_year = _N
     bysort FOUNDER_ID YEAR: keep if _n == 1
     collapse founder_id_obs_year, by(YEAR)
     line founder_id_obs_year YEAR, scheme(s1mono) ///
         ytitle("Mean number of observations per FOUNDER_ID") ylabel(0(2)12)

  15. Kinship variables for grouping: FOUNDER_INFERRED_ID
     • Uses the earliest available inferred ancestor
       • Based on relationship codes in the earliest available register
     • Useful for grouping records in the earliest registers
       • Until 1789, relationships were recorded relative to the head of the yihu, not the linghu.
       • This allowed for inference of common ancestry.
     • Average size of groups defined by FOUNDER_INFERRED_ID increases over time because of the extinction of smaller groups

  16. * Observations per FOUNDER_INFERRED_ID, top-coded at 200, one value per group
     bysort FOUNDER_INFERRED_ID: generate founder_id_obs = _N
     bysort FOUNDER_INFERRED_ID: generate first_in_founder_id = _n == 1
     replace founder_id_obs = 200 if founder_id_obs > 200
     histogram founder_id_obs if first_in_founder_id, width(10) scheme(s1mono) ///
         xtitle("Number of observations with same FOUNDER_INFERRED_ID") fraction

  17. * Mean number of observations per FOUNDER_INFERRED_ID in each register year
     bysort FOUNDER_INFERRED_ID YEAR: generate founder_id_obs_year = _N
     bysort FOUNDER_INFERRED_ID YEAR: keep if _n == 1
     collapse founder_id_obs_year, by(YEAR)
     line founder_id_obs_year YEAR, scheme(s1mono) ///
         ytitle("Mean number of observations per FOUNDER_INFERRED_ID") ylabel(0(2)12)

  18. Kinship variables for grouping: UNIQUE_YI_HU
     • Descendants of members of the same yihu in the earliest available register
     • Clusters are much larger than those defined by FOUNDER_ID or FOUNDER_INFERRED_ID

  19. * Mean number of observations per UNIQUE_YI_HU in each register year
     bysort UNIQUE_YI_HU YEAR: generate founder_id_obs_year = _N
     bysort UNIQUE_YI_HU YEAR: keep if _n == 1
     collapse founder_id_obs_year, by(YEAR)
     line founder_id_obs_year YEAR, scheme(s1mono) ///
         ytitle("Mean number of observations per UNIQUE_YI_HU") ylabel(0(5)60)

  20. Kinship variables for grouping: UNIQUE_GROUP
     • Descendants of members of consecutive yihu in the earliest available register who have the same surname
     • Most stable over time in terms of size and number
     • Ideal for analysis of change over the long term

  21. * Mean number of observations per UNIQUE_GROUP in each register year
     bysort UNIQUE_GROUP YEAR: generate founder_id_obs_year = _N
     bysort UNIQUE_GROUP YEAR: keep if _n == 1
     collapse founder_id_obs_year, by(YEAR)
     line founder_id_obs_year YEAR, scheme(s1mono) ///
         ytitle("Mean number of observations per UNIQUE_GROUP") ylabel(0(5)60)
