The use of graphical models in multi-dimensional longitudinal data

Volkert Siersma, Department of Biostatistics, University of Copenhagen
http://biostat.ku.dk/~vosi/
14th European Young Statisticians Meeting, Debrecen, August 22-26, 2005

Type 2 diabetes (T2DM) in General Practice
Diabetes Care in General Practice (DCGP) study*: T2DM is an increasingly common illness that is linked to considerable excess mortality. There are many indications that treatment (…) can postpone the development of diabetic complications. T2DM is treated primarily in general practice, where the results are not satisfactory.
RCT: structured vs. routine care**. 1428 newly diagnosed T2DM patients were included by 600 Danish GPs. The structured care group was reviewed regularly, every third month, for a period of about 6 years. This observed cohort inspires the following discussion.
* http://www.gpract.ku.dk/Ansatte/olivarius.htm#diabetes
** Olivarius, N.d.F., Beck-Nielsen, H., Andreassen, A.H., Horder, M. and Pedersen, P.A. (2001) Randomised controlled trial of structured personal care of type 2 diabetes mellitus. BMJ, 323(7319): 970-975

Weight control in T2DM patients
Body weight is related to T2DM in the sense that it is a general risk factor for insulin resistance. To avoid diabetes-related complications, e.g. heart disease, kidney failure, ulcers, blindness, body weight is to be kept down to acceptable levels. In the structured care arm of DCGP the GPs were urged to motivate their patients to control their body weight through a goal-setting scheme: a weight goal was set to be attained at the next session. Body weight is often measured as Body Mass Index (BMI): weight (kg) divided by height (m) squared.
Normal: 20-25 kg/m²
Obese: > 35 kg/m²

3-monthly consultations
In overweight patients, the general practitioner was prompted to get agreement on a small, realistic weight reduction, and to follow up on this*.
[Illustration: patient weighing 99 kg; agreed goal: lose 2 kg]
• We must control your weight!
• Next time we meet you'll have:
  • …kept current weight.
  • …lost x kg.
  • ...ah, forget about it.
• Let's set our next appointment in about 3 months…
* From article draft Olivarius, N.d.F., on weight development in DCGP

3-monthly consultations: next consultation
[Illustration: patient now weighing 97 kg; next goal: ?]
• Very fine, you've lost 2 kg!
• Next time we meet you'll have:
  • …kept current weight.
  • …lost x kg.
  • ...ah, forget about it.
• Let's set our next appointment in about 3 months…
How does one decide on an appropriate goal?

The effect of the weight control programme
Not the effect of a single goal, but the effect of a sequence of goals, a goal-setting strategy, has to be evaluated. This strategy has to be evaluated by the degree to which certain long-term goals have been fulfilled. We have observed specific sequences in our data.
[Illustration: 55 kg]

Other examples of control under uncertainty
The data from the weight control programme in the DCGP study are but one example of a wide range of instances showing up in many fields.
Public health: Most chronic diseases feature regular follow-up. Data can be taken from large follow-up studies, national registers (DK), or (potentially) from electronic patient records.
Engineering: Control theory to steer mechanical systems, e.g. robots. Data is input from sensors. Steering is adjusted to the input stream. Uncertainty arises from measurement error.
Finance: Strategic management of companies. Yearly budget and balanced scorecard to reach long-term goals. Data can be taken from yearly reports.

Probability framework
$W_t$ = body weight at session t, with history $H(W_t) = W_0,\dots,W_{t-1}$
$G_t$ = weight goal at session t, with history $H(G_t) = G_0,\dots,G_{t-1}$
The body weights and goals come sequentially and are independent of the future. The joint probability can be written as a product over sessions:
$$P(W_0,\dots,W_T,G_0,\dots,G_T) = \prod_{t} P(W_t \mid H(G_t), H(W_t))\, P(G_t \mid H(G_t), H(W_t), W_t)$$
The first factor is the weight conditional on weight goal (LW); the second factor is the feedback (LG).
In general both $W_t$ and $G_t$ can be multidimensional to introduce (baseline) covariates, etc.
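The sequential factorisation can be read as a recipe for generating data one session at a time. The following is a minimal sketch under that reading; the function names and both toy models are hypothetical stand-ins invented for illustration, not the models used in the study.

```python
import random

def simulate_sequence(n_sessions, weight_model, goal_model, rng):
    """Draw (W_0..W_T, G_0..G_T) one session at a time.

    weight_model(w_hist, g_hist, rng) plays the role of P(W_t | H(G_t), H(W_t)) (LW).
    goal_model(w_hist, g_hist, w_t, rng) plays the role of
    P(G_t | H(G_t), H(W_t), W_t) (LG, the feedback).
    """
    weights, goals = [], []
    for _ in range(n_sessions):
        w_t = weight_model(weights, goals, rng)       # histories hold sessions 0..t-1
        g_t = goal_model(weights, goals, w_t, rng)    # goal may also use W_t itself
        weights.append(w_t)
        goals.append(g_t)
    return weights, goals

# Hypothetical toy models (assumptions, for illustration only).
def toy_weight_model(w_hist, g_hist, rng):
    if not w_hist:
        return 99.0  # start weight in kg
    # Move halfway towards the previous goal, plus noise.
    return w_hist[-1] + 0.5 * (g_hist[-1] - w_hist[-1]) + rng.gauss(0.0, 1.0)

def toy_goal_model(w_hist, g_hist, w_t, rng):
    # Deterministic feedback: ask for a 2 kg loss from the weight just measured.
    return w_t - 2.0

rng = random.Random(0)
weights, goals = simulate_sequence(8, toy_weight_model, toy_goal_model, rng)
```

Neither model ever sees a value from a later session, which is the "independent of the future" assumption in code form.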

Simple modelling of weight
Two regression models for weight conditional on weight goal.
[Table of regression coefficients for the two models not preserved in the transcript]
Hidden analysis ?!?
* Coefficients interpreted as mean weight (BMI) difference at next session
** Coefficients interpreted as mean weight (BMI) difference on top of the expected trend

Confounding
(Classical) statistics concerns correlation, NOT causation. We learn this in our first course in statistics, but we then often proceed to interpret statistical results causally anyway! The regression coefficient of the weight goal in the first model does show a valid correlation, but this correlation is mostly attributable to the weight on the basis of which the weight goal was set.
Confounder: a variable that influences both the treatment and the treatment outcome (epidemiology).
A statistical analysis may be interpreted causally when all(?) confounders are included in the analysis, e.g. setting no goal is better than setting a high weight goal (second model).

Causal results when confounding is identified
$P(O, E, C) = P(O \mid C, E)\, P(E \mid C)\, P(C)$
This can be represented by a graph on the nodes O, C and E with arrows C → E, C → O and E → O.
Suppose an odd new law is enforced: a (random) device decides who will be exposed and who will not.*
$P_d(O, E, C) = P(O \mid C, E)\, P_d(E)\, P(C)$
The device removes the C → E arrow. As before, $P_d(O, E, C)$ is obtained by sampling from the population $P(C)$ and the device $P_d(E)$, and inserting these in $P(O \mid C, E)$ modelled from the data(!)
* Glymour, C. (2001) The mind's arrows. MIT Press, London UK
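To make the difference between the two factorisations concrete, here is a small sketch with binary O, E and C. All probability tables are invented for illustration. It compares the risk difference between exposed and unexposed in the observed population with the one obtained once the device assigns exposure.

```python
# Hypothetical probability tables (assumptions, for illustration only).
p_c = {0: 0.6, 1: 0.4}                      # P(C = c)
p_e_given_c = {0: 0.2, 1: 0.8}              # P(E = 1 | C = c)
p_o_given_ce = {(0, 0): 0.1, (0, 1): 0.2,   # P(O = 1 | C = c, E = e), keyed by (c, e)
                (1, 0): 0.5, (1, 1): 0.6}

def observational_risk(e):
    """P(O=1 | E=e) under P(O|C,E) P(E|C) P(C): C still drives exposure."""
    num = den = 0.0
    for c in (0, 1):
        p_e = p_e_given_c[c] if e == 1 else 1.0 - p_e_given_c[c]
        num += p_o_given_ce[(c, e)] * p_e * p_c[c]
        den += p_e * p_c[c]
    return num / den

def device_risk(e):
    """P_d(O=1 | E=e) under P(O|C,E) P_d(E) P(C): E no longer depends on C."""
    return sum(p_o_given_ce[(c, e)] * p_c[c] for c in (0, 1))

observed_difference = observational_risk(1) - observational_risk(0)
causal_difference = device_risk(1) - device_risk(0)
```

In these invented tables exposure raises the risk by 0.1 within each level of C. The observed difference is much larger because C makes both exposure and outcome more likely. Under the device the difference is exactly 0.1.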

Decision theoretic framework
$$P_d(W_0,\dots,W_T,G_0,\dots,G_T) = \prod_{t} P(W_t \mid H(G_t), H(W_t))\, P_d(G_t \mid H(G_t), H(W_t), W_t) = P(C_0) \prod_{t} P(W_t \mid C_t)\, S(G_t \mid C_t, W_t)$$
where $C_t \subset H(G_t) \cup H(W_t)$, with $G_{t-1} \in C_t$, is a subset of the observed history up to t such that all relations $K \to W_t$, $K \in C_t$, are causal.
$S(\cdot)$ is an often deterministic strategy that we want to evaluate. We model LW and decide on LG.
$P(C_0)$ is the distribution of the start values of the simulation. Often $P(C_0 = c) = 1$ for certain values c and the strategy $S(\cdot)$ is evaluated conditional on $C_0$, e.g. individualised treatment.

Evaluation of a strategy
• A yield Y is a function of a series of weights and weight goals. Examples include:
  • normality: BMI<25 after 10 sessions
  • stability: sum of weight differences.
• A weight development for certain start values c under a certain strategy S(.) is simulated several times to get an empirical estimate of the distribution of the yield.
• This distribution, or characteristics of this distribution, can be contrasted with a similarly derived distribution of the yield of a null strategy, i.e. "no goal set", or indeed any other interesting strategy.
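The evaluation procedure above can be sketched as a small Monte Carlo loop. The transition rule below is a hypothetical stand-in for a fitted model of weight given history and goal. The strategies and all numeric constants are likewise assumptions chosen only to make the sketch run.

```python
import random

def simulate_bmi(strategy, start_bmi, n_sessions, rng):
    """Simulate one BMI trajectory under a goal-setting strategy.

    The transition rule is an invented placeholder for P(W_t | C_t);
    it is NOT the model from the study.
    """
    bmi = [start_bmi]
    for _ in range(n_sessions):
        goal = strategy(bmi[-1])                      # None means "no goal set"
        drift = 0.0 if goal is None else 0.3 * (goal - bmi[-1])
        bmi.append(bmi[-1] + drift + rng.gauss(0.0, 0.5))
    return bmi

def normality_yield(bmi, threshold=25.0):
    """Yield Y = 1 if the final BMI is below the threshold, else 0."""
    return 1 if bmi[-1] < threshold else 0

def estimate_yield(strategy, start_bmi, n_sessions, n_sims, seed=0):
    """Empirical mean of the yield over n_sims simulated trajectories."""
    rng = random.Random(seed)
    total = sum(
        normality_yield(simulate_bmi(strategy, start_bmi, n_sessions, rng))
        for _ in range(n_sims)
    )
    return total / n_sims

def null_strategy(bmi):
    return None            # "no goal set"

def small_steps(bmi):
    return bmi - 1.0       # always ask for one BMI unit less

p_null = estimate_yield(null_strategy, start_bmi=32.0, n_sessions=20, n_sims=2000)
p_goal = estimate_yield(small_steps, start_bmi=32.0, n_sessions=20, n_sims=2000)
```

Comparing `p_goal` with `p_null` is the contrast with the null strategy described above. Returning the full list of simulated yields instead of their mean would give the empirical distribution rather than a single characteristic.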

Optimisation of a strategy
A strategy can be viewed as a function $G_t = s_\theta(C_t, W_t)$ of weight and relevant history with parameter vector θ. Optimising yield w.r.t. the strategy parameters is a difficult, often high-dimensional, optimisation problem.
Heuristic search methods:
• Start with a sensible strategy
• Set as current strategy
• Evaluate neighbouring strategies
• Choose best of these
• Repeat until convergence
A collection of generic strategies should be constructed for fast evaluation of intuitive strategies, as start values for the optimisation, and as base camps for other strategies.
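The heuristic search steps can be sketched as a greedy local search. The parameterisation of θ, the neighbourhood and the yield surface below are all hypothetical. In practice the yield would come from simulation and would be noisy.

```python
def hill_climb(start, neighbours, evaluate, max_iter=1000):
    """Greedy local search: move to the best neighbour until none improves."""
    current, current_yield = start, evaluate(start)
    for _ in range(max_iter):
        candidates = [(evaluate(s), s) for s in neighbours(current)]
        best_yield, best = max(candidates, key=lambda pair: pair[0])
        if best_yield <= current_yield:
            break  # converged: no neighbouring strategy does better
        current, current_yield = best, best_yield
    return current, current_yield

# Hypothetical parameter vector theta = (kg to lose per session,
# BMI below which no goal is set); both are assumptions for illustration.
def neighbours(theta):
    loss, cutoff = theta
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    return [(loss + dl, cutoff + dc) for dl, dc in steps
            if 0 <= loss + dl <= 6 and 20 <= cutoff + dc <= 35]

def toy_yield(theta):
    # Invented smooth surface with a single peak at (2, 25).
    loss, cutoff = theta
    return -((loss - 2) ** 2) - 0.5 * (cutoff - 25) ** 2

best_theta, best_value = hill_climb((5, 30), neighbours, toy_yield)
```

On a surface with several local optima this greedy search stops at the first one it reaches, which is one reason to start from several generic strategies.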

Scope of the optimal strategy
Since any strategy is evaluated conditional on the start values $C_0$, an optimal strategy is only optimal for patients with these start values.
Different yields result in different optimal strategies. Weighted sums of yields can find balanced optimal strategies, or optimal strategies under constraints.

Strategy analysis
• Operationalised optimisation could take the form of a black-box on-line data mining exercise.
• Strategy analysis on a more general level is wanted in many cases:
  • An overview of the yields of various generic strategies.
  • An overview of the strategy effect for the most usual combinations of start values.
  • A description or visualisation of the optimal strategy.

Weight control: simple example
A log-linear model for $P(W_t \mid W_{t-1}, W_{t-2}, G_{t-1}, G_{t-2}, Z)$, where Z are baseline characteristics: age, sex, kidney function, heart disease, HbA1c level at diagnosis.
$C_0 = (W_0, W_1, G_0, Z)$
No goal was set at diagnosis, so $G_0$ is a special category of its own.
Below we present a tentative analysis result for men with 30<BMI<35 at the first two post-diagnosis sessions, without a heart condition, and with good HbA1c levels and kidney function. The aim of the analysis is to normalise body weight within 5 years.

Weight control: simple example continued
Estimated (10,000 simulations) probability of normal body weight (BMI<25) after 5 years (20 sessions)

Weight control: simple example continued
The effect of brute force: B&C
null: 0.0098; full: 0.1504; min: 0.0000; max: 0.1999
300 iterations of a simulated annealing instance, starting from generic strategy A&B.
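For readers unfamiliar with simulated annealing, a minimal generic sketch follows. It is not the instance used for the results above. The cooling schedule, the one-dimensional strategy parameter and the toy yield table are all assumptions made for illustration.

```python
import math
import random

def simulated_annealing(start, neighbours, evaluate, n_iter=300, t0=1.0, seed=0):
    """Maximise evaluate() by random local moves with a cooling temperature."""
    rng = random.Random(seed)
    current, current_yield = start, evaluate(start)
    best, best_yield = current, current_yield
    for i in range(n_iter):
        temperature = t0 * (1.0 - i / n_iter)   # linear cooling, stays above zero
        candidate = rng.choice(neighbours(current))
        candidate_yield = evaluate(candidate)
        delta = candidate_yield - current_yield
        # Always accept improvements; accept a worse candidate with
        # probability exp(delta / temperature), which shrinks as it cools.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_yield = candidate, candidate_yield
        if current_yield > best_yield:
            best, best_yield = current, current_yield
    return best, best_yield

# Hypothetical yields for strategy parameter values 0..10, with a local
# optimum at index 2 and the global optimum at index 8.
TOY_YIELDS = [0.00, 0.03, 0.05, 0.02, 0.01, 0.04, 0.10, 0.15, 0.20, 0.12, 0.06]

def toy_neighbours(k):
    return [j for j in (k - 1, k + 1) if 0 <= j < len(TOY_YIELDS)]

def toy_evaluate(k):
    return TOY_YIELDS[k]

best_k, best_yield = simulated_annealing(2, toy_neighbours, toy_evaluate)
```

Unlike a greedy search, the occasional acceptance of a worse strategy lets the search leave a local optimum such as index 2 in this toy table. Whether it does so in a given run depends on the seed and the cooling schedule.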

Modelling the weight conditional on weight goal
$$P_d(W_0,\dots,W_T,G_0,\dots,G_T) = \prod_{t} P(W_t \mid H(G_t), H(W_t))\, P_d(G_t \mid H(G_t), H(W_t), W_t)$$
where now $W_t$ are all outcomes observed in the data, including weight, stacked in a vector. Denote by $H_t$ all data that could have been observed but were not.
This equality already assumes that the observed data carry all information on the process, i.e. $W_t \perp H_{t-1} \mid H(G_t), H(W_t)$: no unmeasured confounders.
The challenge is to find a $C_t \subset H(G_t) \cup H(W_t)$ with $G_{t-1} \in C_t$. Graphical models form the intuitive framework.

Causality in graphical models
Graphical models are not statistical models, but descriptions of the structure of statistical models. Nodes represent variables. The absence of a line between two variables means that these variables are independent conditional on all other variables in the graph. An important result is that the two variables are also independent conditional on a separating set of variables, i.e. a collection of variables through which all paths between the two variables lead.
Graphical models, or rather their directed variety, DAGs, have a direct causal interpretation: arrows are causal relations.

Compounding
A relationship $G \to W$ can be modelled using W and pa(W), the parent nodes of W, with G ∈ pa(W), or indeed any set C containing G that separates W from or(W), the elements of an(W), the ancestor nodes of W, that have no parents themselves. Then all relationships $K \to W$, K ∈ C, are causal.
[Graph on the nodes W, A, B, C, D, E, in which or(W) = {E}]
Such a set C is called compounding as it compounds the outcome from its root causes. Examples: {D}, {B, D}, {B, C, D}
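The definitions of pa(W), an(W) and or(W) translate directly into a few graph routines. The DAG below is hypothetical: the original figure is not preserved, so the edges were invented to be compatible with or(W) = {E} and the listed example sets. "Separates" is read here as "blocks every directed path from or(W) to W", which is an assumption about the intended definition.

```python
# Hypothetical DAG (an assumption), stored as child -> list of parents.
parents = {
    "W": ["D"],
    "D": ["B", "C"],
    "B": ["A"],
    "C": ["A"],
    "A": ["E"],
    "E": [],
}

def ancestors(node, parents):
    """an(node): all nodes with a directed path into node."""
    found, stack = set(), list(parents[node])
    while stack:
        p = stack.pop()
        if p not in found:
            found.add(p)
            stack.extend(parents[p])
    return found

def orphans(node, parents):
    """or(node): ancestors of node that have no parents themselves."""
    return {a for a in ancestors(node, parents) if not parents[a]}

def separates(candidate, node, parents):
    """True if every directed path from or(node) to node passes through candidate."""
    blocked, roots = set(candidate), orphans(node, parents)
    seen, stack = set(), [node]
    while stack:
        current = stack.pop()
        for p in parents[current]:      # walk backwards along the arrows
            if p in blocked or p in seen:
                continue
            if p in roots:
                return False            # reached a root cause without crossing candidate
            seen.add(p)
            stack.append(p)
    return True

checks = {name: separates(set(name), "W", parents) for name in ("D", "BD", "BCD", "B")}
```

In this invented graph {D}, {B, D} and {B, C, D} all separate W from E, while {B} alone does not, because the path through C stays open.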

Characteristics of a suitable compounding set
There is no unique compounding set; we have to choose one suited to our aims.
Large models need a lot of data and make simulations cumbersome.
Many intermediate variables make modelling inefficient.
The variables in the compounding set are the ones that matter for the strategy. They should be measurable at low cost/effort.

Baseline vs. Trailing
Baseline variable Z: $P(C_0, Z) \prod_{t} P(W_t \mid C_t, Z)\, S(G_t \mid C_t, W_t)$
Trailing variable $K_t$: $P(C_0, K_0) \prod_{t} P(W_t \mid C_t, K_{t-1})\, P(K_t \mid C_t, H(K_t))\, S(G_t \mid C_t, W_t, K_t)$
Does not depend on Z or start values $C_0$. Strategy gets more interesting. Now we need to model the progression of $K_t$: the outcome, stacked $W_t$ and $K_t$, becomes multidimensional.

Lags
In general it is a good idea to include at least the lagged values of the outcome variable, e.g. weight.
One lag models the individual weight level and allows one to remove the variables that influence weight level from the compounding set. Two lags model the individual weight trend and allow one to remove the variables that influence weight trend from the compounding set.
Beware: many lags induce chaos.
Including lags of the outcome may justify not adding many more variables, as the lags carry much information on the outcome.
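As a small illustration of what "including lags" means for the data layout, the sketch below builds one row per session with the outcome $W_t$, its lagged values and the previous goal. The function name and the example numbers are hypothetical.

```python
def lagged_rows(weights, goals, n_lags=2):
    """Build (outcome, predictors) rows: W_t against W_{t-1}..W_{t-n_lags} and G_{t-1}."""
    rows = []
    for t in range(n_lags, len(weights)):
        lags = [weights[t - k] for k in range(1, n_lags + 1)]
        rows.append((weights[t], lags + [goals[t - 1]]))
    return rows

# Invented example series (kg), for illustration only.
weights = [99, 97, 96, 96, 94]
goals = [97, 95, 95, 94, 93]
rows = lagged_rows(weights, goals)
```

With two lags the first usable row is session 2, so each extra lag costs one observation per patient.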

Identifying a compounding set
A tentative strategy to identify a compounding set is to include $G_{t-1}$, two lags of body weight, $W_{t-1}$ and $W_{t-2}$, and a large collection of baseline values Z.
Usually a large check-up is done at diagnosis, yielding information that is not measured at each follow-up. This can be used in Z. Other elements of Z are known constants, e.g. sex.
Trailing variables are only included when relevant for the context, e.g. weight is controlled to keep blood glucose levels down, so it would be very relevant to include blood glucose in the analysis.
From this large set, eliminate (selectively) with your favourite model selection procedure, e.g. backwards elimination.

Models
Graphical models can be directly identified as special instances of known statistical models, thus providing an inference engine.
Continuous variables: modelling of the inverse covariance matrix in multivariate normal distributions.
Discrete (ordinal) variables: modelling of table probabilities by log-linear models (used in the example).
Mixed variables: stratified multivariate normal distributions.
Dedicated software exists for all types.
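For the continuous case, the link between the graph and the inverse covariance matrix can be checked on a tiny example. The chain X → Y → Z with unit-variance noise is a hypothetical example, and the exact-arithmetic inversion routine is only a sketch. A zero in the inverse covariance matrix corresponds to a missing line in the graph.

```python
from fractions import Fraction

def invert(matrix):
    """Gauss-Jordan inversion with exact rational arithmetic (assumes non-singular input)."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I].
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]   # right half now holds the inverse

# Covariance of (X, Y, Z) for the chain X -> Y -> Z with
# X = e0, Y = X + e1, Z = Y + e2 and independent unit-variance noise.
sigma = [[1, 1, 1],
         [1, 2, 2],
         [1, 2, 3]]
precision = invert(sigma)
```

X and Z are marginally correlated (covariance 1), yet the (X, Z) entry of `precision` is zero: X and Z are independent given Y, matching the missing line between X and Z in the graph.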

Models continued
• The models that are needed for our purpose should feature:
  • Non-linearity: many effects are non-linear; indeed the trade-off between benefit and harm is often the motivation for individual treatment strategies.
  • Interactions: effects changing with other variables are natural. Motivation to adhere to a goal decreases over time; obese people lose 6 kg more easily than people with normal weight.
  • Multidimensionality: a disease is never characterised by one measurement alone.
• For regression-type models this poses a huge problem. For log-linear models this comes naturally.

Using graphical models
• The graphical model serves as a simulation engine.
• Inference on the graphical model is used to identify a suitable compounding set.
• Inference on the graphical model can reveal factors to be included in or excluded from a strategy.
• Examination of interactions can reveal influences of passing time and unrealistic goal setting.

References
Dynamic models with feedback:
Murphy SA (2003) Optimal dynamic treatment regimes (with discussion). J. R. Statist. Soc. B 65, part 2, 331-366.
Robins JM (1986) A new approach to causal inference in mortality studies with a sustained exposure period - application to control of the healthy worker survivor effect. Mathematical Modelling 7, 1393-1512.
Cowell RG, Dawid AP, Lauritzen SL, Spiegelhalter DJ (1999) Discrete multistage decision networks. In: Probabilistic networks and expert systems. Springer, New York, chapter 8, 155-188.
Diggle PJ, Heagerty P, Liang K-Y, Zeger SL (2002) Time dependent covariates. In: Analysis of longitudinal data, second ed. Oxford statistical science series 25, chapter 12, 245-281.
Causality:
Pearl J (2000) Causality: Models, Reasoning and Inference. Cambridge University Press, Cambridge.
Graphical models:
Whittaker J (1990) Graphical models in applied multivariate analysis. John Wiley and Sons, Chichester.
Edwards D (1995, 2000) Introduction to graphical modelling. Springer, New York.
Lauritzen SL (1996) Graphical models. Clarendon Press, Oxford.
Cox DR, Wermuth N (1996) Multivariate dependencies. Chapman & Hall, London.
