Multilevel models for family data

Tom O’Connor, Jon Rasbash
A presentation to the Research Methods Festival, Oxford, July 2004.
Work conducted for the ESRC Research Methods Programme project: Methodologies for Studying Families and Family Effects: the systematic assessment of research designs and data analytic strategies

The presentation looks at three applications of multilevel modelling to family data:
1. Using multilevel models to explore the determinants of differential parental treatment of children.
2. Extending multilevel models to include genetic effects.
3. Applying multilevel models developed to handle social network data to family relationship data.

Application 1: Understanding the sources of differential parenting: the role of child and family level effects
Jenny Jenkins, Jon Rasbash and Tom O’Connor, Developmental Psychology 2003(1) 99-113

Mapping multilevel terminology to psychological terminology
• Level 2: family, shared environment. Variables: family SES, marital problems
• Level 1: child, non-shared environment, child specific. Variables: age, sex, temperament

Background
• Recent studies in developmental psychology and behavioural genetics emphasise that non-shared environment is much more important than shared environment in explaining children’s adjustment. This has led to a focus on non-shared environment (Plomin et al., 1994; Turkheimer & Waldron, 2000).
• Has this meant that we have ignored the role of the shared family context, both empirically and conceptually?

Background
• One key aspect of the non-shared environment that has been investigated is differential parental treatment of siblings.
• Differential treatment predicts differences in sibling adjustment.
• What are the sources of differential treatment?
• Child specific/non-shared: age, temperament, biological relatedness
• Can family level shared environmental factors influence differential treatment?

The Stress/Resources Hypothesis
Do family contexts (shared environment) increase or decrease the extent to which children within the same family are treated differently?
“Parents have a finite amount of resources in terms of time, attention, patience and support to give their children. In families in which most of these resources are devoted to coping with economic stress, depression and/or marital conflict, parents may become less consciously or intentionally equitable and more driven by preferences or child characteristics in their childrearing efforts” (Henderson et al., 1996).
This is the hypothesis we wish to test. We operationalised the stress/resources hypothesis using four contextual variables: socioeconomic status, single parenthood, large family size, and marital conflict.

How differential parental treatment has been analysed
Previous analyses in the literature exploring the sources of differential parental treatment ask the mother to rate her treatment (positive or negative) of each of two siblings. The difference between these two treatment scores is then analysed. This approach has several major limitations…

The sibling pair difference model for exploring determinants of differential parenting
$y_{1i} - y_{2i} = \beta_0 + \beta_1 x_{1i} + e_i$
where $y_{1i}$ and $y_{2i}$ are parental ratings for siblings 1 and 2 in family $i$, and $x_{1i}$ is a family level variable, for example family SES.
Problems
• One measurement per family makes it impossible to separate shared and non-shared random effects.
• All information about the magnitude of the response is lost: the rating pairs (2,4) and (22,24) yield the same difference.
• It is not possible to introduce level 1 (non-shared) variables, since the data have been aggregated to level 2.
• Families with more than two children cannot be handled.
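The loss of magnitude information noted in the second problem can be made concrete with a minimal sketch. The function names are illustrative only, and the ratings are the two pairs quoted on the slide.

```python
def difference_score(rating_sib1, rating_sib2):
    """Response used by the sibling pair difference model."""
    return rating_sib1 - rating_sib2


def family_mean(rating_sib1, rating_sib2):
    """Overall level of parenting in the family, which the difference discards."""
    return (rating_sib1 + rating_sib2) / 2


low_family = (2, 4)     # both siblings rated low
high_family = (22, 24)  # both siblings rated high

low_diff = difference_score(*low_family)
high_diff = difference_score(*high_family)

# The two families are indistinguishable once only the difference is kept,
# although their overall levels of parenting differ by 20 points.
print(low_diff, high_diff)
print(family_mean(*low_family), family_mean(*high_family))
```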

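With a multilevel model…
$y_{ij} = \beta_0 + \beta_1 x_{1ij} + \beta_2 x_{2j} + u_j + e_{ij}$, with $\mathrm{var}(u_j) = \sigma_u^2$ and $\mathrm{var}(e_{ij}) = \sigma_e^2$
where $y_{ij}$ is the j’th mother’s rating of her treatment of her i’th child, $x_{1ij}$ are child level (non-shared) variables, $x_{2j}$ are family level (shared) variables, and $u_j$ and $e_{ij}$ are family and child (shared and non-shared environment) random effects.
Note that the level 1 variance $\sigma_e^2$ is now a measure of differential parenting.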

Advantages of the multilevel approach
• Can handle more than two kids per family.
• Unconfounds family and child, allowing estimation of family and child level fixed and random effects.
• Can model parenting level and differential parenting in the same model.

Overall Survey Design
• National Longitudinal Survey of Children and Youth (NLSCY)
• Statistics Canada survey, representative sample of children across the provinces
• Nested design includes up to 4 children per family
• PMK respondent
• 4-11 year old children
• Criteria: another sibling in the age range, living with at least one biological parent, 4 years of age or older
• 8,474 children
• 3,860 families
• 4-child families = 60, 3-child = 630, 2-child = 3157

Measures of parental treatment of child
Derived from factor analyses.
• PMK report of positive parenting: frequency of praise of child, talk or play focusing on child, activities enjoyed together; α = .81
• PMK report of negative parenting: frequency of disapproval, annoyance, anger, mood related punishment; α = .71
• Will talk today about positive parenting
PMK is the person most knowledgeable about the child.

Child specific factors
• Age
• Gender
• Child position in family
• Negative emotionality
• Biological relatedness to father and mother
Family context factors
• Socioeconomic status
• Family size
• Single parent status
• Marital dissatisfaction

Model 1: Null Model
The baseline estimate of differential parenting (the level 1 variance) is 3.8. We can now add further shared and non-shared explanatory variables and judge their effect on differential parenting by the reduction in the level 1 variance.
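To illustrate how a null model separates family (level 2) and child (level 1) variance, here is a minimal simulation sketch. It makes several assumptions that are not in the presentation: a balanced design with two children per family, a hypothetical family variance of 2.0, and a simple method-of-moments (one-way ANOVA) estimator rather than the estimation procedure used for the NLSCY analysis. Only the level 1 variance of 3.8 is taken from the slide.

```python
import random
from statistics import mean


def simulate_families(n_families, n_children, var_family, var_child, seed=0):
    """Simulate y_ij = u_j + e_ij for a balanced set of families."""
    rng = random.Random(seed)
    sd_u, sd_e = var_family ** 0.5, var_child ** 0.5
    data = []
    for _ in range(n_families):
        u_j = rng.gauss(0.0, sd_u)  # shared family effect
        data.append([u_j + rng.gauss(0.0, sd_e) for _ in range(n_children)])
    return data


def anova_variance_components(data):
    """Method-of-moments estimates (family variance, child variance), balanced data."""
    n_families = len(data)
    n_children = len(data[0])
    grand_mean = mean(y for family in data for y in family)
    family_means = [mean(family) for family in data]
    ms_between = (n_children
                  * sum((m - grand_mean) ** 2 for m in family_means)
                  / (n_families - 1))
    ms_within = (sum((y - m) ** 2
                     for family, m in zip(data, family_means) for y in family)
                 / (n_families * (n_children - 1)))
    var_child = ms_within                                # level 1: differential parenting
    var_family = (ms_between - ms_within) / n_children   # level 2: shared environment
    return var_family, var_child


# var_family=2.0 is a hypothetical value chosen for illustration only.
families = simulate_families(4000, 2, var_family=2.0, var_child=3.8, seed=1)
est_family, est_child = anova_variance_components(families)
print(round(est_family, 2), round(est_child, 2))
```

The within-family mean square recovers the level 1 variance, which is the quantity the slides interpret as differential parenting.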

Model 2 : expanded model

Positive parenting
Child level predictors
• Strongest predictor of positive parenting is age. Younger siblings get more attention. This relationship is moderated by family membership.
• Non-bio mother and non-bio father reduce positive parenting.
• Oldest sibling > youngest sibling > middle siblings
Family level predictors
• Household SES increases positive parenting.
• Marital dissatisfaction, increasing family size, and mixed or all-girl sibships all decrease positive parenting.
• Lone parenthood has no effect.

Differential parenting
Modelling age reduced the level 1 variance (our measure of differential parenting) from 3.8 to 2.3, a reduction of 40%. Other explanatory variables, both child specific and family (shared environment), provide no significant reduction in the level 1 variation. Does this mean that there is no evidence to support the stress/resources hypothesis?
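The quoted reduction can be checked with a short calculation. The helper name below is illustrative only.

```python
def proportional_reduction(variance_before, variance_after):
    """Share of the original variance removed by adding predictors."""
    return (variance_before - variance_after) / variance_before


# Level 1 variance before and after modelling age, as reported on the slide.
reduction = proportional_reduction(3.8, 2.3)
print(f"{reduction:.1%}")
```

The result is a little under 40%, which matches the rounded figure in the text.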

Testing the stress/resources hypothesis
• The mean and the variance are modelled simultaneously. So far we have modelled the mean in terms of shared environment, but not the variance.
• We can elaborate model 2 by allowing the level 1 variance to be a function of the family level variables household socioeconomic status, large family size, and marital conflict.
The reduction in the deviance is 78 on 7 df.
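A deviance reduction is usually judged against a chi-square distribution with degrees of freedom equal to the number of added parameters. The sketch below is a small standard-library helper for the upper-tail probability. The function `chi2_sf` is written for this illustration and is not part of any package used in the original analysis. It also assumes the usual likelihood ratio asymptotics, which may be only approximate for variance parameters.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        k, q = 2, math.exp(-half)              # closed form for df = 2
    else:
        k, q = 1, math.erfc(math.sqrt(half))   # closed form for df = 1
    # Recurrence: Q(k + 2) = Q(k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q


# Deviance reduction of 78 on 7 df, as reported on the slide.
p_value = chi2_sf(78.0, 7)
print(p_value)
```

The resulting p-value is far below any conventional significance threshold, so the variance function adds a great deal to the fit.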

Graphically …

Conclusion
• We have found strong support for the stress/resources hypothesis. That is, although differential parenting is a child specific factor that drives differential adjustment, differential parenting itself is influenced by family as well as child specific factors.
• This challenges the current tendency in developmental psychology and behavioural genetics to focus on child specific factors.
• Multilevel models fitting complex level 1 variation need to be employed to uncover these relationships.

Application 2: Including genetic effects in multilevel models

Background
We have recently been applying multilevel models to family data, in collaboration with developmental psychologists. They asked: can we include genetic effects in these models?
There is a long tradition of quantitative genetics, arguably begun with Fisher’s 1918 paper “The correlation between relatives on the supposition of Mendelian inheritance”. This work has been developed by others and applied in animal and plant genetics, evolutionary biology, human genetics and behavioural genetics.

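The basic multilevel model for kids within families
$y_{ij} = \beta_0 + u_j + e_{ij}$, with $\mathrm{var}(u_j) = \sigma_u^2$ and $\mathrm{var}(e_{ij}) = \sigma_e^2$
Given the standard independence assumptions of multilevel models, the covariance of two children ($i_1$ and $i_2$) in the same family is
$\mathrm{cov}(y_{i_1 j}, y_{i_2 j}) = \sigma_u^2$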

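Extending the basic model to include genetic effects
$y_{ij} = \beta_0 + g_{ij} + u_j + e_{ij}$
where $g_{ij}$ is a genetic effect for the i’th child in the j’th family. For two individuals ($i_1$, $i_2$), the standard independence assumptions would give $\mathrm{cov}(g_{i_1 j}, g_{i_2 j}) = 0$.
BUT the genetic covariance of two individuals in the same family is clearly not zero, since there is a non-zero probability that they share the same genes. Instead $\mathrm{cov}(g_{i_1 j}, g_{i_2 j}) = F$. What is F? This is where Fisher’s 1918 paper comes in.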

A very little genetics background
First remember, humans have 23 pairs of chromosomes. A gene is a sequence of DNA at a given location (locus) on a chromosome. In a population there might be multiple different versions of a gene. For example, with two versions of a gene, denoted by A and a, there are 3 possible genotypes: AA, Aa and aa. (Note that Aa is functionally equivalent to aA.) We can think of the genes as conferring values on an individual for a trait.

Given… a number of (strong) assumptions:
1. A metric trait is influenced by a large number of genes at a large number of loci (effectively infinite).
2. The effects of the genes add up within and across loci.
3. The genes are transmitted independently from parents to progeny.
4. The population being studied is mating at random.
5. The population being studied is in evolutionary equilibrium. That is, gene frequencies are not changing across generations.
6. There is no correlation between genetic and environmental effects.
Corrections to the theory exist for all these assumptions, but I fear they are seldom used (in behavioural genetics), are often difficult to implement and have not been thoroughly evaluated.

Then..
With a lot of complicated argument and algebra, Fisher shows that
$\mathrm{cov}(g_{i_1 j}, g_{i_2 j}) = r(i_1, i_2)\,\sigma_g^2$
where $r(i_1, i_2)$ is the relationship coefficient between two individuals and equals 0, 0.125, 0.25, 0.5 and 1 for unrelated individuals, cousins, half-sibs, full sibs and MZ twins respectively. Thus the greater the relationship between individuals, the greater their genetic covariance and therefore their phenotypic covariance.
An individual’s $g_{ij}$ is the sum of the effects of all their genes. The variance of these $g_{ij}$ is the additive genetic variance ($\sigma_g^2$). The size of the additive genetic variance compared to other environmental variances is often of interest.

Data example
277 full-sib pairs, 109 half-sib pairs, 130 unrelated pairs, 93 DZ twins and 99 MZ twins, aged between 9 and 18 years. Analysis of depression scores:

Parameter           Model 1 Est(se)    Model 2 Est(se)
Fixed
  Intercept         0.008 (0.017)      0.02 (0.017)
Random
  Shared env        0.086 (0.011)      0.018 (0.017)
  Non-shared env    0.198 (0.011)      0.069 (0.010)
  Genetic           -                  0.209 (0.022)
Deviance            2165.88            2129.2

The total variance in the two models is effectively the same: 0.275 in model 1 and 0.285 in model 2. In model 2, which includes genetic effects, 70% of the family level variation and 60% of the child level variation are re-assigned to the genetic variance.
The model is like an autocorrelation (time-series) model, except that the covariance decays as a function of genetic distance rather than distance in time between measurements. We can use the same estimation machinery.
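To show how the relationship coefficient feeds into the covariance structure, the sketch below computes the sibling covariance implied by the Model 2 estimates for each pair type, assuming the covariance between two children in the same family is r times the genetic variance plus the shared environment variance. The relatedness of 0.5 for DZ twins is the standard value for full siblings and is an addition here. The function and dictionary names are illustrative.

```python
# Relationship coefficients r for the pair types in the data example.
RELATEDNESS = {
    "unrelated": 0.0,
    "half sibs": 0.25,
    "full sibs": 0.5,
    "DZ twins": 0.5,   # assumption: DZ twins share genes like full sibs
    "MZ twins": 1.0,
}

# Model 2 variance estimates from the table above.
VAR_SHARED = 0.018
VAR_GENETIC = 0.209


def sibling_covariance(r, var_genetic=VAR_GENETIC, var_shared=VAR_SHARED):
    """Implied covariance of two children in one family: r * sigma_g^2 + sigma_u^2."""
    return r * var_genetic + var_shared


implied = {pair: sibling_covariance(r) for pair, r in RELATEDNESS.items()}
for pair, cov in implied.items():
    print(f"{pair:10s} {cov:.4f}")
```

Under these estimates the implied covariance rises steadily with relatedness, which is the pattern the genetic term is designed to capture.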

Adding covariates

Parameter           Model 3 Est(se)
Fixed
  Intercept         -0.285 (0.087)
  Age               0.011 (0.006)
  Mat_neg           0.157 (0.024)
  Pat_neg           0.216 (0.026)
  Girl              0.158 (0.028)
  Stepfam           0.105 (0.029)
Random
  Shared env        0.0035 (0.014)
  Non-shared env    0.70 (0.096)
  Genetic           0.148 (0.020)
Deviance            1780.95

From the fixed effects we see that depression scores increase with child age and with paternal and maternal negativity; girls and children in stepfamilies also have higher depression scores. The largest drop in the variance when these explanatory variables are introduced occurs in the genetic variance.

Why the drop in the genetic variance?
The largest drop in the genetic variance occurs when paternal and maternal negativity are added to the model as covariates. Pike et al. (1996) analyse the same data using a series of genetically calibrated bivariate structural equation models. Two of the models they consider are for maternal negativity and depression, and for paternal negativity and depression. In each of these two models they find that 15% of the genetic variance in depression is due to a genetic component shared with parental negativity.
When we add paternal and maternal negativity to our model as fixed effects, we are sweeping out any common genetic effects shared by parental negativity and adolescent depression. We are also taking account of any environmental correlations whereby sibling pairs of greater relatedness experience more similar parental treatment. Both these factors will reduce the remaining additive genetic variance in the model.

Complex variation and gene environment interactions
Currently our model for the variance partitions the variance into three sources: family, child and genetic. The model for the variance can be further elaborated to allow each of the three sources of variation to be modelled as a function of explanatory variables, where the variables may be measured at any level.

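Gene environment interaction with paternal negativity
We now elaborate model 3 to allow all three variances to be a function of paternal negativity. That is:
$\sigma^2_{u(ij)} = \alpha_0^{(u)} + \alpha_1^{(u)}\,\mathrm{pat\_neg}_{ij}$
$\sigma^2_{e(ij)} = \alpha_0^{(e)} + \alpha_1^{(e)}\,\mathrm{pat\_neg}_{ij}$
$\sigma^2_{g(ij)} = \alpha_0^{(g)} + \alpha_1^{(g)}\,\mathrm{pat\_neg}_{ij}$   (4)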

Results from model 4

Parameter           Model 4 Est(se)
Fixed
  Intercept         -0.273 (0.080)
  Age               0.011 (0.005)
  Mat_neg           0.170 (0.024)
  Pat_neg           0.210 (0.028)
  Girl              0.159 (0.027)
  Stepfam           0.097 (0.028)
Random              Constant           Pat_neg
  Shared env        0.0006 (0.014)     -0.017 (0.019)
  Non-shared env    0.073 (0.009)      0.0078 (0.010)
  Genetic           0.155 (0.021)      0.093 (0.023)
Deviance            1740.42

Including the three extra parameters reduces the deviance by 19.5. This reduction is almost entirely driven by the gene environment interaction term $\alpha_1^{(g)}$, the coefficient of paternal negativity in the genetic variance. Removing the corresponding non-shared and shared environment terms, $\alpha_1^{(e)}$ and $\alpha_1^{(u)}$, from model 4 changes the deviance by only 1.5. The significant coefficient constitutes a gene-environment interaction because it implies that the genetic variance changes as a function of paternal negativity.
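The following sketch evaluates the three variance functions at a few values of paternal negativity. It assumes each variance is a linear function of paternal negativity, with the first random-part estimate as the constant and the second as the slope. The paternal negativity values of -1, 0 and 1 are hypothetical, since the scale of the variable is not given in the transcript.

```python
# (constant, slope on paternal negativity) from the random part of Model 4.
MODEL4_VARIANCE_TERMS = {
    "shared": (0.0006, -0.017),
    "non_shared": (0.073, 0.0078),
    "genetic": (0.155, 0.093),
}


def variance_at(source, pat_neg):
    """Variance for one source at a given paternal negativity (assumed linear form)."""
    constant, slope = MODEL4_VARIANCE_TERMS[source]
    return constant + slope * pat_neg


# Hypothetical paternal negativity values chosen for illustration.
for pat_neg in (-1.0, 0.0, 1.0):
    row = {source: round(variance_at(source, pat_neg), 4)
           for source in MODEL4_VARIANCE_TERMS}
    print(pat_neg, row)
```

Under this assumed form the genetic variance rises with paternal negativity while the other two sources barely move. A linear variance function is not constrained to stay positive, so values outside the observed range of the predictor should not be over-interpreted.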

Graphing the gene environment interaction
One explanation of GxE interactions is in terms of conditional gene expression. Suppose we have a gene A which gets switched on when an individual is subject to persistent high levels of cortisol. If some of the population have the A gene and some don’t, then this genetic variation only manifests in individuals under persistent high levels of stress.

Model Extensions
The multilevel model with genetic effects is flexible and can be adapted to a variety of situations where population structures have further nested or crossed random classifications in addition to the standard behavioural genetics situation of children within families. For example:
• Time: repeated measures on kids within families
• Institutions: schools, hospitals
• Space: areas
• Multiple observers
A complex example is given in the next section.

Application 3: Applying social network models to family relationship data: some preliminary work

Substantive focus: trait-like versus context specific behaviour
A question in personality theory is to what extent particular emotions and behaviours are trait-like, in that they are constant across different environments, and to what extent they are context sensitive, in that behaviours expressed by individuals are specific to particular environments (Magnusson, 1990; Magnusson & Endler, 1977). Clinically, trait-like behaviours are harder to change since, by definition, altering the environment will tend not to change the behaviour.

Stability of behaviours over time
Studies in personality have shown that happiness and positivity are very stable over time (Costa, McCrae & Zonderman, 1987). Findings for the stability of negativity vary as a function of which aspect of negativity is being considered, with some aspects such as physical aggression showing high stability (Broidy et al., 2003) and other aspects such as whining showing low stability (Capaldi et al., 1994). Trait-like behaviours are often considered to be driven by genetic factors (Plomin, 1994; Tellegen et al., 1988).

Stability of behaviours across family members
In this presentation we develop multilevel models to explore the stability of an individual’s behaviour across their family members. With 2 kids and 2 parents per family there are 12 relationship scores per family; a relationship is made up of an actor and a partner:
c1->c2, c1->m, c1->f, c2->c1, c2->m, c2->f, m->c1, m->c2, m->f, f->c1, f->c2, f->m
We look at the traits of positivity and negativity as responses. Note that positivity and negativity are not mirror images of the same underlying construct. Clinically and statistically they show independent patterns, with evidence from neuropsychology that they are controlled by different brain systems (Cacioppo, Larsen, Smith & Berntson, 2004).

The data
Non-Shared Environment Adolescent Development (NEAD) data set, Reiss et al. (1994).
• 2 wave longitudinal family study, designed for testing hypotheses about genetic and environmental effects
• 277 full-sib pairs, 109 half-sib pairs, 130 unrelated pairs, 93 DZ twins and 99 MZ twins, aged between 9 and 18 years
• Wave 2 followed 3 years after wave 1, and any families where the older sib was older than 18 were not followed up.
• A wide range of self-report, parental-report and observer variables were collected.
• All families had 2 parents and 2 kids of the same sex.
• We focus here on data on relationship quality collected by observers.

Within family structure
We start with 12 relationship scores in each family. These can be classified by actor, partner and dyad.
[Unit diagram for Family 1: actors c1, c2, m, f; relationships c1->c2, c1->m, c1->f, c2->c1, c2->m, c2->f, m->c1, m->c2, m->f, f->c1, f->c2, f->m; partners c1, c2, m, f; dyads d1 to d6]
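The counts of 12 relationships and 6 dyads follow directly from having four family members, as the short sketch below shows. The member labels are those used on the slide. The helper `dyad_of` is an illustrative name.

```python
from itertools import combinations, permutations

MEMBERS = ["c1", "c2", "m", "f"]  # two children, mother, father

# A relationship is directed: an actor behaving towards a partner.
relationships = [f"{actor}->{partner}" for actor, partner in permutations(MEMBERS, 2)]

# A dyad is the unordered pair that contains both directions of a relationship.
dyads = [frozenset(pair) for pair in combinations(MEMBERS, 2)]


def dyad_of(actor, partner):
    """Dyad to which the directed relationship actor->partner belongs."""
    return frozenset((actor, partner))


print(len(relationships), len(dyads))
print(relationships)
```

Each dyad therefore contributes exactly two relationship scores, one in each direction, which is what allows reciprocity to be estimated.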

Diagrams to represent the structure
The relationship scores are contained within a cross-classification of actor, dyad and partner, and all of this structure is nested within families. This structure can be shown diagrammatically with:
• A unit diagram: one node per unit
• A classification diagram: one node per classification
[Diagrams: unit diagram for Family 1 (actors, relationships, partners and dyads d1 to d6); classification diagram with nodes family, actor, partner, dyad and relationship score]

The multilevel social relations model (Snijders and Kenny, 1999)
The relationship score is modelled as the sum of:
• family: the effect for the m’th family
• actor: the effect of the j’th actor in the m’th family
• partner: the effect of the k’th partner in the m’th family
• dyad: the effect of the l’th dyad in the m’th family
• relationship: the residual relationship effect, conditional on actor j, partner k, dyad l and family m
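The symbols on this slide are not preserved in the transcript. One plausible way to write the model, using notation assumed here ($f$, $a$, $p$, $d$ and $e$ for the family, actor, partner, dyad and residual relationship effects), is sketched below. For simplicity the sketch treats the five effects as mutually independent, which may be more restrictive than the model actually fitted.

```latex
\begin{align*}
y_{i(jkl)m} &= \beta_0 + f_m + a_{jm} + p_{km} + d_{lm} + e_{i(jkl)m} \\
\operatorname{var}\bigl(y_{i(jkl)m}\bigr) &= \sigma_f^2 + \sigma_a^2 + \sigma_p^2 + \sigma_d^2 + \sigma_e^2
\end{align*}
```

Here $y_{i(jkl)m}$ is the i’th relationship score, involving actor $j$, partner $k$ and dyad $l$ in family $m$.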

Interpretation of variance components
• Family: the extent to which family level factors affect all the relationships in a family.
• Actor: the extent to which individuals act similarly across relationships with other family members (actor stability, trait-like behaviour).
• Partner: we actually have two traits operating. In addition to the trait of acting in a common way towards other family members, we also have the trait of elicitation from other family members. The greater the partner variance component, the greater the evidence for such a trait operating.
• Dyad: the extent to which relationship quality is specific to the dyad. A high dyad random effect means that the relationship score from joe->fred is similar to that from fred->joe. In social network theory this is known as reciprocity. Reciprocity is a context specific effect (non trait-like).
• Relationship: residual variation across relationships in relationship quality.

Results of SRM in more detail
Table shows variance partition coefficients.
For positivity, 44% of the variability is attributable to actors, indicating that individuals act in a consistent way across relationships with other family members. There is a strong actor trait component to positivity.
For negativity, 41% of the variability is attributable to dyad, indicating that the dyad is an important structure in determining negativity in relationships. There is a strong context specific component to negativity.
There is little evidence of an elicitation or partner trait for either response. At the family level there are stronger effects for negativity than positivity.
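A variance partition coefficient is simply each variance component divided by the total variance. The sketch below shows the calculation. The variance components in it are made-up illustrative numbers, not the NEAD estimates, since the results table is not preserved in this transcript.

```python
def variance_partition_coefficients(components):
    """Share of the total variance attributable to each classification."""
    total = sum(components.values())
    return {name: value / total for name, value in components.items()}


# Hypothetical variance components, for illustration only (not the NEAD results).
illustrative_components = {
    "family": 0.5,
    "actor": 2.0,
    "partner": 0.25,
    "dyad": 1.0,
    "relationship": 1.25,
}

vpc = variance_partition_coefficients(illustrative_components)
for name, share in vpc.items():
    print(f"{name:12s} {share:.0%}")
```

The coefficients sum to one, so a large actor share necessarily leaves less variability to be attributed to dyads, partners, families or residual relationship effects.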

Adding fixed effects for role
The basic unit, a relationship, has an actor and a partner. Actors and partners are classified into the roles of children, mothers and fathers by the two categorical variables actor_role and partner_role. We use child as the reference category for both the actor_role and partner_role variables.
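With child as the reference category, each role variable contributes two dummy variables, giving four fixed effects in total. A minimal sketch of this coding follows. The dummy names `a_moth`, `a_fath`, `p_moth` and `p_fath` follow the abbreviations used later in the slides, while the function names are illustrative.

```python
ROLE_OF_MEMBER = {"c1": "child", "c2": "child", "m": "mother", "f": "father"}

# Non-reference categories and their abbreviations; child is the reference.
NON_REFERENCE_ROLES = {"mother": "moth", "father": "fath"}


def dummy_code(role, prefix):
    """Dummy variables for one role variable, with child coded as all zeros."""
    return {f"{prefix}_{abbrev}": int(role == level)
            for level, abbrev in NON_REFERENCE_ROLES.items()}


def design_row(actor, partner):
    """Fixed-effect design row for the relationship actor->partner."""
    row = {"intercept": 1}
    row.update(dummy_code(ROLE_OF_MEMBER[actor], "a"))    # actor_role dummies
    row.update(dummy_code(ROLE_OF_MEMBER[partner], "p"))  # partner_role dummies
    return row


print(design_row("m", "c1"))   # mother acting towards child 1
print(design_row("c1", "c2"))  # child to child: the reference combination
```

A child-to-child relationship has all four dummies equal to zero, so the intercept is the expected score for that combination.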

Including actor and partner roles: positivity
Modelling actor and partner role improves the log-likelihood by over 1000 units for 4 df. The effect is dominated by the actor role categories, with mothers and then fathers being much more positive as actors than the reference category, child. These actor_role variables explain over 50% of the actor level variance. Adding interactions between actor_role and partner_role does not improve the model. Since we have explained actor level variance, this means actor role explains some of the trait component of relationship positivity.

Graphing actor and partner role effects for positivity
The graph shows actor_role having a big effect on relationship quality and partner_role having a marginal effect.
[Graph: separate lines for actor child, actor m and actor f]

Including actor and partner roles: negativity
Now an interaction is required between actor_role and partner_role. Note that the interaction categories a_moth*p_moth and a_fath*p_fath structurally do not exist. Modelling actor and partner role and the interaction improves the log-likelihood by 500 units for 6 df. Note that the main drop in the variance occurs at the dyad level, which reduces by 15%. This means modelling actor and partner roles has explained context specific variation in relationship quality for negativity.
