SJTU CMGPD 2012 Methodological Lecture Day 3

Position and Status Variables

Variables for position

  • The basic and analytic files include a variety of indicator variables for whether a male holds a position

  • These are based on the statuses recorded in the registers

    • File with hanyu pinyin for raw occupations has been released

      • DS 6

    • Occupations with original Chinese characters are released as PDF

      • Turned out to be difficult to include Chinese characters in the released data

Variables for position

  • In the original data, entries included the official positions held by males.

  • Coders assigned a numeric code to each new position, and entered the code into the dataset.

    • Codes started again for each new dataset

  • Transcribed the original Chinese into a codebook

  • Can use DATASET and POSITION_CODE to look up original Chinese in the appendix to the Analytic release codebook

  • DS 6 allows merging of hanyu pinyin for code, if you want to create your own position variables from the originals.

Position variables

  • We have provided a variety of flag variables identifying different kinds of position

  • We have a separate file that specifies the hanyu pinyin and Chinese characters for each combination of dataset and numeric position code.

  • This file provides flag and other variables describing characteristics of positions.

  • These flags are merged back into the main file to provide variables for analysis.

Created Position Variables

  • HAS_POSITION

    • Any salaried official position or purchased title

    • Doesn’t include miding, piding, etc. Those were statuses, not salaried official positions


    • Imputed income based on stipends associated with the position(s) held by an individual

  • RANK

    • Bureaucratic rank, based on specification of pin in the position

Position variables

  • BI_TIE_SHI, ZHI_SHI_REN, and flags for specific positions

  • JUAN, DING_DAI etc. for presence of modifiers

  • EXAMINATION for any examination-related title

  • NO_STATUS indicates that no status at all was recorded for a male, even though we would have expected one.

Name variables






Creating New Variables

  • DS-6 contains pinyin for positions

  • DATASET and POSITION_CODE are the basis of a merge back to the data files

  • POSITION_PINYIN is the ‘raw’ position, as transcribed by the coders

  • POSITION_CORE is a stripped down version that includes modifiers

  • Chinese characters are in an appendix to the Analytic File codebook
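
The merge on DATASET and POSITION_CODE might look like the following sketch. The rows, the pinyin strings, and the temporary lookup file standing in for DS 6 are all invented for illustration; in practice the using file would be the released DS 6 file, whose name is not given here.

```stata
* Toy stand-in for the DS 6 lookup: one row per DATASET x POSITION_CODE.
* (Values are invented for illustration, not taken from the release.)
clear
input DATASET POSITION_CODE str20 POSITION_PINYIN
1 1 "zhi shi ren"
1 2 "bi tie shi"
2 1 "mu jiang"
end
tempfile positions
save `positions'

* Toy stand-in for a main data file: several records can share a code.
clear
input PERSON_ID DATASET POSITION_CODE
101 1 1
102 1 2
103 2 1
104 2 1
end

* Many records in the main file match one row in the lookup,
* so this is a many-to-one (m:1) merge on the two key variables.
merge m:1 DATASET POSITION_CODE using `positions', keep(master match)
list PERSON_ID DATASET POSITION_CODE POSITION_PINYIN _merge
```

Because codes start again in each dataset, POSITION_CODE alone is not a unique key; both variables are needed. Checking _merge after the merge shows whether any records failed to find a match in the lookup.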

Creating new variables

  • Stata lets you search strings for particular values, and return an indicator if the value is found.

  • Can use this for occupations of special interest

  • For example,

    • generate artisan = index(POSITION_PINYIN,"jiang") > 0

    • generate juanna = index(POSITION_PINYIN,"juanna") > 0

  • Can code positions manually using Chinese characters in the appendix of the Analytic File codebook
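
A self-contained sketch of the string search, using invented pinyin strings rather than values from the release. It uses strpos(), the current name for the older index() function, and adds a stricter rule (artisan_strict, a hypothetical variable) to show one way of narrowing a match that turns out to be too broad.

```stata
* Toy pinyin strings, invented for illustration.
clear
input str20 POSITION_PINYIN
"mu jiang"
"tie jiang"
"jiang jun"
"juanna jiansheng"
""
end

* strpos() returns the position of the first match, or 0 if there is none.
generate artisan = strpos(POSITION_PINYIN, "jiang") > 0
generate juanna  = strpos(POSITION_PINYIN, "juanna") > 0

* Inspect which raw strings were flagged before using the indicator:
* a substring search also catches unrelated titles such as "jiang jun".
tabulate POSITION_PINYIN if artisan

* A stricter, hypothetical rule: "jiang" must be the final syllable.
generate artisan_strict = regexm(POSITION_PINYIN, " jiang$")
```

Tabulating the flagged strings is a cheap check that the search term picks up only the occupations intended.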

Studying attainment

  • We have mainly used event-history analysis

    • Determinants of chances of attaining position by next register

    • Allows for consideration of time-varying characteristics

      • Characteristics of kin

  • An alternative would be to look at determinants of attaining a position by a specific age, with one observation per person
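
The one-observation-per-person alternative could be set up roughly as follows. This is a sketch on invented records: the cutoff of 30 sui and the variables position_by_30 and seen_by_30 are assumptions made for the example, not part of the released files.

```stata
* Toy person-register records, invented for illustration.
clear
input PERSON_ID AGE_IN_SUI HAS_POSITION
1 22 0
1 25 0
1 28 1
1 31 1
2 24 0
2 27 0
2 33 0
3 35 1
end

* 1 if the person is ever recorded with a position at or before age 30 sui.
* The cutoff of 30 is an arbitrary choice for this sketch.
bysort PERSON_ID: egen position_by_30 = max(HAS_POSITION == 1 & AGE_IN_SUI <= 30)

* Flag people observed at least once by the cutoff, so that those who
* first appear only at later ages can be set aside rather than counted as 0.
bysort PERSON_ID: egen seen_by_30 = max(AGE_IN_SUI <= 30)

* Keep one observation per person.
bysort PERSON_ID (AGE_IN_SUI): keep if _n == 1
list PERSON_ID position_by_30 seen_by_30
```

Unlike the event-history setup, this layout has no natural place for time-varying characteristics; covariates have to be fixed at a single point, such as the first observation.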

Creating variables to identify attainment of position by next register

generate at_risk_position = SEX == 2 & PRESENT & NEXT_3 & HAS_POSITION == 0

bysort PERSON_ID (YEAR): generate next_position = at_risk_position & HAS_POSITION[_n+1]

bysort AGE_IN_SUI: egen total_at_risk_position = total(at_risk_position)

bysort AGE_IN_SUI: egen total_next_position = total(next_position)

generate p_next_position = total_next_position/total_at_risk_position

bysort AGE_IN_SUI: generate first_in_age = _n == 1

twoway line p_next_position AGE_IN_SUI if AGE_IN_SUI >= 1 & AGE_IN_SUI <= 80 & first_in_age, ytitle("Proportion attaining position by next register") scheme(s1mono)


  • bysort groups the records in the dataset according to the values of the specified variables.

  • Each set of records defined by a unique value of the specified variables is treated as a distinct block of records when the command is executed.

  • If a variable is in parentheses, the data is sorted on that variable, but not divided according to the unique values of that variable.

  • [ ] allows access to values from other observations in the same block. [1] draws the value of a variable from the first record in the block, [_N] from the last record, [_n+1] from the next record, and so forth.

  • _n refers to the position of the current record within the block; _N is the total number of records in the block.

  • Create a variable with the record number within x:

    • bysort x (y): generate a = _n

  • Create a flag identifying the first record within x:

    • bysort x (y): generate b = _n == 1

  • Create a flag identifying the last record within x:

    • bysort x (y): generate c = _N == _n

  • Create a variable with the total number of records with that unique value of x:

    • bysort x (y): generate d = _N

  • Create a variable with the y from the next record within x:

    • bysort x (y): generate e = y[_n+1]
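
The five commands above can be tried on a small invented dataset. The values of x and y here are arbitrary and are entered out of order on purpose, so the effect of sorting on (y) is visible in the listing.

```stata
* Toy data, invented for illustration: two blocks of x, unsorted on y.
clear
input x y
1 30
1 10
1 20
2 50
2 40
end

bysort x (y): generate a = _n          // record number within x
bysort x (y): generate b = _n == 1     // flag: first record within x
bysort x (y): generate c = _N == _n    // flag: last record within x
bysort x (y): generate d = _N          // number of records within x
bysort x (y): generate e = y[_n+1]     // y from the next record within x

list, sepby(x)
```

In the listing, e is missing on the last record of each block, because [_n+1] does not reach across into the next value of x.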














