This user-friendly introduction explores the usefulness of causal modelling for monitoring and assessing natural resources.
A user-friendly introduction to causal modelling: is it useful for natural resource monitoring and assessment? Bill Shipley, Département de biologie, Université de Sherbrooke, Sherbrooke (Qc), Canada. Bill.Shipley@USherbrooke.ca
[Figure: Pop size → Number of churches and Pop size → Number of murders, shown again in a new causal context...] Passive prediction works ONLY if the underlying causal processes are constant.
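As a concrete illustration (my own minimal simulation, not from the talk), here is a sketch of how a common cause produces the churches-murders correlation, and how that correlation fades once population size is held constant statistically. All numbers and distributions are invented.

```python
# Minimal confounding sketch: churches and murders are both driven by
# population size, so they correlate even though neither causes the other.
import numpy as np

rng = np.random.default_rng(1)
n = 5000
pop = rng.lognormal(mean=10, sigma=1, size=n)    # town population size
churches = rng.poisson(pop / 2000.0)             # more people -> more churches
murders = rng.poisson(pop / 5000.0)              # more people -> more murders

print(np.corrcoef(churches, murders)[0, 1])      # clearly positive

# "Holding population size constant" statistically: correlate the residuals
# of each variable after regressing out pop (a crude partial correlation).
def residuals(y, x):
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

print(np.corrcoef(residuals(churches, pop), residuals(murders, pop))[0, 1])  # near zero
```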
The 3-D object (hidden from view) casts a 2-D shadow (what the audience sees).
The "3-D" causal process (hidden from view): a causal graph linking A, B, C, D, E. The "2-D" correlational shadow (what the scientist sees): B & C correlated, but independent given A; A & D correlated, but independent given B & C; and so on...
R.A. Fisher, Statistical Methods for Research Workers (1925). 15 plots with treatment (+fertilizer & water), 15 plots without treatment (+water only), assigned using random numbers. Treatment: 80 ± 6 g; control: 55 ± 6 g; t-test: p < 0.0001. Question: does nitrogen fertilizer cause crop growth?
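For readers who want to see the analysis step, here is a hedged sketch of Fisher's design with made-up plot weights (the means and p-value on the slide are only mimicked, not reproduced from real data):

```python
# Hypothetical numbers in the spirit of Fisher's design: 15 fertilized and
# 15 unfertilized plots, assigned at random, compared with a t-test.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
treatment = rng.normal(loc=80, scale=8, size=15)   # +fertilizer & water
control = rng.normal(loc=55, scale=8, size=15)     # +water only

t, p = stats.ttest_ind(treatment, control)
print(f"t = {t:.2f}, p = {p:.2g}")
# Randomization licenses the inference "fertilizer -> crop growth": with random
# assignment, no other variable systematically differs between the groups.
```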
Experimental (observational) unit: the UNIT to which the treatment (N fertilizer) is applied. The unit contains many other variables (variable 1, variable 2, ..., variable n): N, P, K..., worms, ...
Logic of causal inference in the randomized experiment: a causal claim runs from the external treatment to any variable in the experimental unit that shows a significant change. No causal inferences can be made between variables within the experimental unit.
THE PLANT: Nitrogen fertilizer → Nitrogen absorption → Photosynthetic enzymes → Carbon fixation → Seed yield (the causal processes hidden inside the experimental unit).
[Figure: Scenarios 1-3, three alternative causal structures linking fertilizer addition, nitrogen absorption, and photosynthetic enzymes.]
The experimental method ("La méthode expérimentale"): Claude Bernard, 1813-1878.
Bernard's example, three variables: the color of blood in the renal vein before entering the kidney, the active/inactive state of the kidney, and the color of blood in the renal vein upon exiting the kidney.
[Figure: two alternative causal structures linking these three variables; in the second, one causal arrow is eliminated (X).]
The controlled experiment:
1. Hypothesize a causal structure (e.g. A → B → C).
2. Measure the correlations between the variables in their natural state.
3. Predict how these correlations will change if various physical manipulations hold constant different variables.
4. Compare the new correlations, after controlling the variables, to the predictions assuming the causal structure.
5. If any of the predicted changes in the correlational structure disagree with the observed changes, then reject the causal structure.
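A minimal simulation sketch of steps 3-5, assuming the hypothesized structure is the linear chain A → B → C with coefficients invented for illustration: in the natural state A and C are correlated; physically fixing B should destroy that correlation if the hypothesis is right.

```python
# Sketch: compare the natural state with a physical manipulation that
# holds B constant (hypothetical linear system, A -> B -> C).
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Natural state
A = rng.normal(size=n)
B = 0.8 * A + rng.normal(size=n)
C = 0.8 * B + rng.normal(size=n)
print("natural r(A,C):", np.corrcoef(A, C)[0, 1])        # clearly non-zero

# Physical manipulation: the experimenter fixes B at the same value for every unit
A2 = rng.normal(size=n)           # A still varies freely
B2 = np.zeros(n)                  # B held constant
C2 = 0.8 * B2 + rng.normal(size=n)
print("r(A,C) with B held constant:", np.corrcoef(A2, C2)[0, 1])   # near zero
```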
Example variables: sex, body size in autumn, survival to spring.
[Figure: Causal hypothesis 1 and Causal hypothesis 2, two alternative causal structures linking sex, body size in autumn, survival to spring, and other (unmeasured) causes.]
Quantity and quality of summer forage → Body weight in the autumn → Probability of survival until spring. [Figure: a joint density surface Z = f(X,Y) = f(X)·f(Y), i.e. X and Y are independent.]
“residuals of Y given X”
Statistical conditioning ("holding constant") doesn't always give the same answer as physical conditioning. When does it? When doesn't it?
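One way to see what statistical conditioning does, using the forage / body weight / survival chain from the earlier slide as a toy linear system (coefficients invented): compute the "residuals of Y given X" and correlate them, which is a crude partial correlation.

```python
# Statistical conditioning made concrete: remove from each variable the part
# that body weight predicts, then examine what is left.
import numpy as np

def residuals(y, x):
    """Residuals of y after an OLS regression on x (with intercept)."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

rng = np.random.default_rng(3)
n = 5000
forage = rng.normal(size=n)                     # summer forage
weight = 0.9 * forage + rng.normal(size=n)      # autumn body weight
survival = 0.9 * weight + rng.normal(size=n)    # survival propensity

# Partial correlation of forage and survival, statistically holding weight constant
r = np.corrcoef(residuals(forage, weight), residuals(survival, weight))[0, 1]
print(f"partial r(forage, survival | weight) = {r:.3f}")   # near zero: weight screens off forage
```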
Communication between strangers Universal translator
Hypothesis testing: from the "3-D" causal process (a causal graph over A, B, C, D, E) to its "2-D" correlational shadow: B & C independent given A; A & D independent given B & C; B & E independent given D; and so on...
Hypothesis generation: from the "2-D" correlational shadow (B & C independent given A; A & D independent given B & C; B & E independent given D; and so on...) back to candidate "3-D" causal processes.
d-separation: the (almost) universal translator between a causal graph over A, B, C, D, E and its correlational shadow (B & C independent given A; A & D independent given B & C; B & E independent given D; and so on...).
The language of statistics. [Figure: two joint density surfaces. Panel A: Z = f(X,Y) = f(X)·f(Y), X and Y independent. Panel B: Z = f(X,Y) ≠ f(X)·f(Y), X and Y dependent.]
The dangers of mistranslation between languages (French "demande" vs. English "demand"). The language of statistics consistently mistranslates "X → Y" (X causes Y) as "Y = f(X)" (Y is a function of X). Probability distributions:
• deal only in information content conditional on other information,
• NOT in causal relationships;
• there is no notion of a causal (asymmetric) relationship in probability theory.
Mis-translating between human languages: "Bill Gates worth $1,000,000,000" → (machine translation into another language) → (machine translation back into English) → "Payment request for doors in the fence worth $1,000,000,000".
Mis-translating between cause and correlation. The causal process Rain → Mud (plus other causes of mud): Mud (cm) = 0.1·Rain (cm) + N(0, 0.1). The mistranslation: Rain (cm) = 10·Mud (cm) + N(0, 1), read as if mud caused rain.
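A sketch of the mistranslation in code (hypothetical data generated from the slide's causal equation): both regressions describe the joint distribution, but only an intervention distinguishes cause from effect.

```python
# The causal process is Rain -> Mud, yet you can regress either variable on
# the other and get a good fit. The regressions are symmetric; the causal
# claim is not.
import numpy as np

rng = np.random.default_rng(7)
n = 10_000
rain = rng.uniform(0, 10, size=n)                  # cm of rain
mud = 0.1 * rain + rng.normal(0, 0.1, size=n)      # cm of mud (causal equation)

print(np.polyfit(rain, mud, 1))   # slope close to 0.1
print(np.polyfit(mud, rain, 1))   # the reverse regression also fits, with a large slope

# An intervention tells them apart: making the mud deeper does not change the rain.
mud_set = rng.uniform(0, 1, size=n)                # experimenter sets the mud depth
print(np.corrcoef(mud_set, rain)[0, 1])            # ~0: setting the mud leaves rain unchanged
```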
1. Express causal claims using graph theory (directed acyclic graphs, DAGs), e.g. A → B → C. Property: asymmetric relationships.
2. Apply a graph-theoretic operator (d-separation) to this graph: A_||_C|B (A is separated from C given B in the graph).
3. If two vertices (X, Y) in this DAG are d-separated given a set Q of other vertices, then the variables X and Y are probabilistically independent given the conditioning set Q in ANY multivariate probability distribution generated by the DAG.
4. There always exists a basis set B of d-separation claims for the DAG that together completely specify the joint probability distribution over the variables represented by the DAG: B = {A_||_C|B, ...} implies P(X,Y,Z).
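Why step 3 holds for the simplest case A → B → C: a short, standard derivation (not specific to this talk) from the factorization that the DAG generates.

```latex
% The DAG A -> B -> C generates the factorization
%     P(A,B,C) = P(A) P(B|A) P(C|B).
% Conditioning on B then separates A from C:
\[
P(A, C \mid B)
  = \frac{P(A)\,P(B \mid A)\,P(C \mid B)}{P(B)}
  = \frac{P(A)\,P(B \mid A)}{P(B)}\; P(C \mid B)
  = P(A \mid B)\, P(C \mid B),
\]
% which is exactly the independence claim A _||_ C | B that d-separation reads off the graph.
```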
5. Test the predicted and observed independence claims implied by the graphical model: if there are significant differences, reject the causal model; if there aren't, tentatively accept the causal model (and continue testing...).
6. Now translate the graphical model into prediction equations. For A → B → C: A = e1, B = f(A) + e2, C = f(B) + e3. If the arrow from A to B is removed: A = e1, B = e2, C = f(B) + e3.
7. The independence claims in the DAG are local; therefore, to change the causal structure, simply rewrite the DAG and go back to step 6.
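A sketch of step 6 in code, simulating the structural equations for A → B → C with invented linear forms for f and then checking the single independence claim A_||_C|B that the DAG predicts.

```python
# Turn the DAG A -> B -> C into prediction equations, generate data from them,
# and check the d-separation claim A _||_ C | B (coefficients are hypothetical).
import numpy as np

rng = np.random.default_rng(11)
n = 20_000
A = rng.normal(size=n)                 # A = e1
B = 0.7 * A + rng.normal(size=n)       # B = f(A) + e2
C = 0.7 * B + rng.normal(size=n)       # C = f(B) + e3

def partial_corr(x, y, z):
    """Correlation of x and y after regressing each on z (with intercept)."""
    Z = np.column_stack([np.ones_like(z), z])
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return np.corrcoef(rx, ry)[0, 1]

print("r(A,C)     =", round(np.corrcoef(A, C)[0, 1], 3))   # non-zero
print("r(A,C | B) =", round(partial_corr(A, C, B), 3))     # ~0, as the DAG predicts
```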
d-separation: a few definitions.
Directed path: if you can follow the arrows from i to j, then there is a directed path from i to j. In the example graph there is a directed path from A to C but NOT from A to E, and from E to C but NOT from E to A.
Undirected path: if you can go from i to j while ignoring the direction of the arrows, then there is an undirected path from i to j, e.g. from A to E and from E to A.
d-separation: a few definitions.
Non-collider vertex: a vertex on a path whose two adjacent arrows do not both point into it (e.g. the middle vertex of A → B → C).
Unshielded collider vertex: a vertex into which two arrows on the path point, and whose two parents are not directly connected (e.g. the middle vertex of A → B ← C when A and C are not adjacent).
Shielded collider vertex: a collider whose two parents are also directly connected to each other.
d-separation: causal children. The causal children of a vertex are the vertices that receive an arrow directly from it. [Figure: which vertices are, and are not, causal children of A and of E.]
d-separation: causal ancestors. The causal ancestors of a vertex are all vertices from which a directed path leads to it. [Figure: the causal ancestors of C.]
d-separation: the state of a vertex.
A non-collider vertex allows causal influence to flow through it (naturally ON); conditioning on it (holding it constant) blocks causal influence through it (turns it OFF), e.g. B in A → B → C.
A collider vertex prevents causal influence from flowing through it (naturally OFF); conditioning on it (holding it constant) allows causal influence through it (turns it ON), e.g. B in A → B ← C.
d-separation: the water-hose example (Rain → Mud ← Water hose).
Case 1: It rained. Therefore there was mud. But this tells you nothing about the water hose.
Case 2: It didn't rain, yet there was mud. Therefore the water hose was on.
Knowing the state of the collider (mud) makes rain and the water hose informative about each other.
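The same point as a quick simulation (probabilities invented): rain and the water hose are marginally independent, but become dependent once we condition on mud.

```python
# Collider Rain -> Mud <- Water hose: conditioning on mud creates dependence.
import numpy as np

rng = np.random.default_rng(5)
n = 100_000
rain = rng.random(n) < 0.3          # it rained on 30% of days
hose = rng.random(n) < 0.3          # hose on during 30% of days, independently
mud = rain | hose                   # mud appears if either happened

# Unconditionally, rain tells you nothing about the hose:
print(hose[rain].mean(), hose[~rain].mean())               # both ~0.30

# Conditioning on the collider (looking only at muddy days) creates dependence:
print(hose[mud & rain].mean(), hose[mud & ~rain].mean())   # ~0.30 vs 1.00
```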
d-separation: the test. Are X and Y d-separated given a set Q = {A, B, ...} of conditioning vertices?
1. List all undirected paths between X and Y. For each such undirected path:
2. Are there any non-colliders along this path that are in Q? If yes, the path is blocked; go to the next undirected path.
3. Are all colliders along this path, or causal children of those colliders, in Q? If no, the path is blocked; go to the next undirected path.
If all undirected paths between X and Y are blocked by Q, then X and Y are d-separated by Q. If X and Y are d-separated by Q, then they are probabilistically independent given Q in any probability distribution generated by the graph.
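Below is a compact Python sketch that follows this recipe literally: enumerate the undirected paths and apply the two blocking rules. It is my own illustration, not code from the talk; for the collider rule it checks the collider and its causal descendants, the standard d-separation criterion. The example DAG is the one used on the following slides (A → B → D → E, A → C → D).

```python
# Literal path-based d-separation check for small DAGs.

def descendants(dag, v):
    """All vertices reachable from v by following arrows."""
    out, stack = set(), [v]
    while stack:
        node = stack.pop()
        for child in dag.get(node, ()):
            if child not in out:
                out.add(child)
                stack.append(child)
    return out

def undirected_paths(dag, x, y):
    """All simple paths from x to y, ignoring arrow direction."""
    nbrs = {}
    for a, children in dag.items():
        for b in children:
            nbrs.setdefault(a, set()).add(b)
            nbrs.setdefault(b, set()).add(a)
    paths, stack = [], [(x, [x])]
    while stack:
        node, path = stack.pop()
        if node == y:
            paths.append(path)
            continue
        for nxt in nbrs.get(node, ()):
            if nxt not in path:
                stack.append((nxt, path + [nxt]))
    return paths

def d_separated(dag, x, y, q):
    """True if x and y are d-separated given the set q in the DAG."""
    q = set(q)
    for path in undirected_paths(dag, x, y):
        blocked = False
        for i in range(1, len(path) - 1):
            prev, v, nxt = path[i - 1], path[i], path[i + 1]
            collider = v in dag.get(prev, ()) and v in dag.get(nxt, ())
            if not collider and v in q:
                blocked = True          # rule 2: conditioned non-collider blocks
                break
            if collider and not ({v} | descendants(dag, v)) & q:
                blocked = True          # rule 3: unconditioned collider blocks
                break
        if not blocked:
            return False                # at least one open path
    return True

# The example DAG: A -> B -> D -> E, A -> C -> D
dag = {"A": {"B", "C"}, "B": {"D"}, "C": {"D"}, "D": {"E"}, "E": set()}
print(d_separated(dag, "B", "C", {"A"}))        # True
print(d_separated(dag, "B", "C", {"A", "D"}))   # False: conditioning on the collider D opens B -> D <- C
print(d_separated(dag, "A", "E", {"D"}))        # True
```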
d-separation example (non-collider): Are B & C d-separated given A? B_||_C|{A}? YES. B & C are d-separated given A, therefore B & C will be independent conditional on A. [Figure: the graph before and after conditioning on A.]
d-separation example (collider): Are B & C d-separated given D? B_||_C|{D}? NO. B & C are not d-separated given D, therefore B & C will be dependent conditional on D. [Figure: the graph before and after conditioning on D.]
d-separation: how many independence claims does the graph imply?
A_||_E|{D}? YES. A_||_E|{D,B}? YES. E_||_B|{D}? YES.
B_||_C|{A,D}? NO. B_||_C|{A,E}? NO. D_||_A|{B}? NO.
... and so on, for every unique pair (X, Y) conditioned on every unique subset of the remaining variables: 10 pairs × [1 + 3 + 3 + 1] conditioning sets = 80 possible claims.
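The count of 80 can be checked directly (a trivial enumeration, added for illustration):

```python
# Every unordered pair of the five variables, conditioned on every subset
# of the remaining three.
from itertools import combinations, chain

vars_ = ["A", "B", "C", "D", "E"]

def subsets(items):
    return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))

claims = [(x, y, q)
          for x, y in combinations(vars_, 2)
          for q in subsets([v for v in vars_ if v not in (x, y)])]
print(len(claims))   # 10 pairs x (1 + 3 + 3 + 1) subsets = 80
```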
d-separation basis set: the smallest set of d-separation claims in a DAG that, together, imply all the others. If you know the basis set, then you can specify the entire structure of the joint probability distribution generated by the directed acyclic graph; therefore you can test the causal structure by testing the d-separation claims in the basis set.
A special basis set: BU = {X_||_Y | Pa(X) ∪ Pa(Y)} for every pair of vertices (X, Y) not directly connected (each unique pair of non-adjacent vertices, conditioned on the union of the parents of both).
For the example graph: BU = {A_||_D|{B,C}, A_||_E|{D}, B_||_C|{A}, B_||_E|{A,D}, C_||_E|{A,D}}.
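Constructing this special basis set is mechanical; here is an illustrative sketch for the example graph (the parent sets are read directly off the DAG):

```python
# Build BU: one claim per non-adjacent pair, conditioned on the union of both parent sets.
from itertools import combinations

parents = {"A": set(), "B": {"A"}, "C": {"A"}, "D": {"B", "C"}, "E": {"D"}}

def adjacent(x, y):
    return x in parents[y] or y in parents[x]

basis = [(x, y, sorted(parents[x] | parents[y]))
         for x, y in combinations(sorted(parents), 2)
         if not adjacent(x, y)]
for x, y, q in basis:
    print(f"{x} _||_ {y} | {{{', '.join(q)}}}")
# A _||_ D | {B, C}
# A _||_ E | {D}
# B _||_ C | {A}
# B _||_ E | {A, D}
# C _||_ E | {A, D}
```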
d-sep tests of a causal structure:
1. List the basis set BU: A_||_D|{B,C}, A_||_E|{D}, B_||_C|{A}, B_||_E|{A,D}, C_||_E|{A,D}.
2. Convert each claim to a probabilistic claim: rA,D|{B,C} = 0, rA,E|{D} = 0, rB,C|{A} = 0, rB,E|{A,D} = 0, rC,E|{A,D} = 0.
3. Calculate the probability of each claim in the data: p1 = 0.23, p2 = 0.50, p3 = 0.001, p4 = 0.45, p5 = 0.12.
4. Calculate C = -2 Σ ln(pi) = 23.98, with k = 5 claims.
IF all d-sep claims in the graph are true in the data, then C follows a chi-squared distribution with 2k degrees of freedom. THEREFORE, if the probability of C is below the significance level, the causal structure is rejected by the data; if it is above the significance level, the causal structure is consistent with the data.
Here, a chi-squared value of 23.98 with 10 degrees of freedom gives p = 0.008: REJECT the causal structure.
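The slide's arithmetic can be reproduced with Fisher's C statistic, C = -2 Σ ln(pi), compared to a chi-squared distribution with 2k degrees of freedom (illustrative code; scipy is assumed to be available):

```python
# Combine the k = 5 independence-test p-values from the basis set.
import math
from scipy import stats

p_values = [0.23, 0.50, 0.001, 0.45, 0.12]
k = len(p_values)

C = -2 * sum(math.log(p) for p in p_values)
p_overall = stats.chi2.sf(C, df=2 * k)

print(f"C = {C:.2f}, df = {2*k}, p = {p_overall:.3f}")
# C = 23.98, df = 10, p = 0.008  -> reject the causal structure at the 5% level
```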
Claude Bernard Ronald Fisher Karl Pearson Sewall Wright Judea Pearl Clark Glymour