Ockham’s Razor in Causal Discovery: A New Explanation - PowerPoint PPT Presentation

Presentation Transcript

  1. Ockham’s Razor in Causal Discovery: A New Explanation. Kevin T. Kelly and Conor Mayo-Wilson, Department of Philosophy, Joint Program in Logic and Computation, Carnegie Mellon University. www.hss.cmu.edu/philosophy/faculty-kelly.php

  2. I. Prediction vs. Policy

  3. Predictive Links • Correlation or co-dependency allows one to predict Y from X. (Cartoon: a scientist tells a policy maker, “Ash trays linked to lung cancer!”)

  4. Policy • Policy manipulates X to achieve a change in Y. (Cartoon: “Ash trays linked to lung cancer! Prohibit ash trays!”)

  5. Policy • Policy manipulates X to achieve a change in Y. (Cartoon: “We failed!”)

  6. Correlation is not Causation • Manipulation of X can destroy the correlation of X with Y. (Cartoon: “We failed!”)

  7. Standard Remedy • Randomized controlled study: that’s what happens if you carry out the policy.

  8. Infeasibility • Expense • Morality (Cartoon: “Let me force a few thousand children to eat lead.”)

  9. Infeasibility • Expense • Morality (Cartoon: “Just joking!”)

  10. Ironic Alliance (Industry: “Ha! You will never prove that lead affects IQ…”)

  11. Ironic Alliance (Industry: “And you can’t throw my people out of work on a mere whim.”)

  12. Ironic Alliance (Industry: “So I will keep on polluting, which will never settle the matter because it is not a randomized trial.”)

  13. II. Causes From Correlations

  14. Causal Discovery • Patterns of conditional correlation can imply unambiguous causal conclusions (Pearl, Spirtes, Glymour, Scheines, etc.). (Diagram: Protein A, Protein B, Protein C, Cancer protein. “Eliminate protein C!”)

  15. Basic Idea • Causation is a directed, acyclic network over variables. • What makes a network causal is a relation of compatibility between networks G and joint probability distributions p. (Diagram: networks over X, Y, Z compatible with p.)

  16. Compatibility • Joint distribution p is compatible with directed, acyclic network G iff: • Causal Markov Condition: each variable X is independent of its non-effects given its immediate causes. • Faithfulness Condition: every conditional independence relation that holds in p is a consequence of the Causal Markov Condition. (Diagram: a network over V, W, X, Y, Z.)

  17. Common Cause (B ← A → C) • B yields info about C (Faithfulness); • B yields no further info about C given A (Markov).

  18. Causal Chain (B → A → C) • B yields info about C (Faithfulness); • B yields no further info about C given A (Markov).

  19. Common Effect (B → A ← C) • B yields no info about C (Markov); • B yields extra info about C given A (Faithfulness).
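Slides 17–19 can be checked numerically: in a linear-Gaussian model, the Markov and Faithfulness patterns show up as zero versus non-zero partial correlations, computable exactly from the implied covariance matrices. A minimal Python sketch (not from the presentation; the coefficients 0.8 and 0.5 and the unit error variances are illustrative assumptions):

```python
import numpy as np

def partial_corr(cov, i, j, k):
    """Correlation of variables i and j after conditioning on k,
    computed from a population covariance matrix."""
    def rho(a, b):
        return cov[a, b] / np.sqrt(cov[a, a] * cov[b, b])
    rij, rik, rjk = rho(i, j), rho(i, k), rho(j, k)
    return (rij - rik * rjk) / np.sqrt((1 - rik ** 2) * (1 - rjk ** 2))

# Chain B -> A -> C (A = 0.8*B + e, C = 0.5*A + e, unit-variance errors).
# Implied covariance over (B, A, C):
chain = np.array([
    [1.0,  0.8,  0.40],
    [0.8,  1.64, 0.82],
    [0.4,  0.82, 1.41],
])
# B and C are correlated, but independent given A (Markov):
# partial_corr(chain, 0, 2, 1) is 0 (up to rounding)

# Collider B -> A <- C (A = B + C + e). Covariance over (B, C, A):
collider = np.array([
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 1.0],
    [1.0, 1.0, 3.0],
])
# B and C are marginally independent, but dependent given A (Faithfulness):
# partial_corr(collider, 0, 1, 2) = -0.5
```

For the chain, the correlation of B and C vanishes once A is given (Markov); for the collider, B and C are uncorrelated marginally but become dependent once A is conditioned on (Faithfulness), matching the distinctive pattern on the next slide.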

  20. Distinguishability • The common cause and the causal chain yield the same independence pattern (indistinguishable); the common effect pattern is distinctive.

  21. Immediate Connections • There is an immediate causal connection between X and Y iff X is dependent on Y given every subset of the variables containing neither X nor Y (Spirtes, Glymour and Scheines). (Diagrams: adjacent X and Y, where no conditioning set breaks the dependency; non-adjacent X and Y, where some conditioning set breaks the dependency.)
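The adjacency criterion above can be sketched as a brute-force check against an independence oracle (a hypothetical stand-in for statistical tests; the variable names X, Y, Z are illustrative):

```python
from itertools import chain, combinations

def immediately_connected(x, y, variables, indep):
    """Adjacency criterion: x and y share an immediate causal connection
    iff they remain dependent given *every* subset of the remaining
    variables. indep(a, b, s) is an independence oracle."""
    rest = [v for v in variables if v not in (x, y)]
    subsets = chain.from_iterable(
        combinations(rest, r) for r in range(len(rest) + 1))
    return all(not indep(x, y, frozenset(s)) for s in subsets)

# Oracle for the truth X -> Z -> Y: only X and Y are independent,
# and only once Z is conditioned on.
def indep(a, b, s):
    return {a, b} == {"X", "Y"} and "Z" in s

variables = ["X", "Z", "Y"]
# immediately_connected("X", "Z", variables, indep) -> True
# immediately_connected("X", "Y", variables, indep) -> False
```

The empty conditioning set leaves X and Y dependent, but conditioning on Z breaks the dependency, so the X–Y pair fails the criterion and only the two chain links survive as immediate connections.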

  22. Recovery of Skeleton • Apply the preceding condition to recover every non-oriented immediate causal connection. (Diagram: the true network over X, Y, Z and the recovered undirected skeleton.)

  23. Orientation of Skeleton • Look for the distinctive pattern of common effects. (Diagram: the common effect in the truth is recovered as oriented edges in the skeleton.)

  24. Orientation of Skeleton • Look for the distinctive pattern of common effects. • Draw all deductive consequences of these orientations. (Diagram: Y is not a common effect, so the remaining Z–Y orientation must be downward.)
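The common-effect orientation step in slides 23–24 can be sketched as follows. This is a simplified illustration, not the authors' code; it assumes the separating sets found during skeleton recovery were recorded:

```python
def orient_colliders(adj, sepset):
    """Orient each unshielded triple x - z - y as x -> z <- y whenever z
    is absent from the set that separated x and y.
    adj: skeleton as {var: set of neighbours};
    sepset: {frozenset({x, y}): separating set} for non-adjacent pairs."""
    arrows = set()  # (tail, head) pairs
    for z, nbrs in adj.items():
        ordered = sorted(nbrs)
        for i, x in enumerate(ordered):
            for y in ordered[i + 1:]:
                if y not in adj[x]:  # the triple is unshielded
                    if z not in sepset[frozenset((x, y))]:
                        arrows.add((x, z))
                        arrows.add((y, z))
    return arrows

# Skeleton X - Z - Y where X and Y were separated by the empty set:
skeleton = {"X": {"Z"}, "Y": {"Z"}, "Z": {"X", "Y"}}
collider_case = orient_colliders(skeleton, {frozenset(("X", "Y")): set()})
# -> {("X", "Z"), ("Y", "Z")}: both edges point into Z
chain_case = orient_colliders(skeleton, {frozenset(("X", "Y")): {"Z"}})
# -> set(): Z screened X off from Y, so the triple stays unoriented
```

When Z appears in the separating set, the pattern is consistent with a chain or common cause and no arrow is drawn; when it does not, Z must be a common effect, which is exactly the distinctive pattern the slides exploit.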

  25. Causation from Correlation • The following network is causally unambiguous if all variables are observed. Protein A Protein C Cancer protein Protein B

  26. Causation from Correlation • The red arrow is also immune to latent confounding causes Protein A Protein C Cancer protein Protein B

  27. Brave New World for Policy • Experimental (confounder-proof) conclusions from correlational data! Protein A Protein C Cancer protein Protein B Eliminate protein C!

  28. III. The Catch

  29. Metaphysics vs. Inference • The above results all assume that the true statistical independence relations for p are given. • But they must be inferred from finite samples. (Diagram: sample, then inferred statistical dependencies, then causal conclusions.)

  30. Problem of Induction • Independence is indistinguishable from sufficiently small dependence at sample size n. (Diagram: the same data are compatible with dependence and with independence.)

  31. Bridging the Inductive Gap • Assume conditional independence until the data show otherwise. • Ockham’s razor: assume no more causal complexity than necessary.
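One concrete way to "assume conditional independence until the data show otherwise" is to test the sample correlation with Fisher's z-transform; this is an illustrative stand-in, not a procedure stated on the slides. It also exhibits the problem on slide 30: a fixed weak dependence is eventually detected, while one shrinking like 1/sqrt(n) never is:

```python
import math

def judged_independent(r, n, z_crit=1.96):
    """Ockham-style bridge: treat two variables as independent unless the
    Fisher z-transformed sample correlation r clears the critical value
    at sample size n."""
    return math.sqrt(n - 3) * abs(math.atanh(r)) < z_crit

# A fixed weak correlation is eventually detected...
# judged_independent(0.1, 100)    -> True  (still looks independent)
# judged_independent(0.1, 10000)  -> False (dependence detected)
# ...but a dependence shrinking like 1/sqrt(n) is never detected:
# judged_independent(1 / math.sqrt(10**6), 10**6) -> True
```

At every sample size there are dependencies too small to detect, so the razor's verdict of "independent" can always be overturned later, which is the instability the next slides describe.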

  32. Inferential Instability • No guarantee that small dependencies will not be detected later. • Can have spectacular impact on prior causal conclusions.

  33. Current Policy Analysis (Diagram: Protein A, Protein B, Protein C, Cancer protein. “Eliminate protein C!”)

  34. As Sample Size Increases… Protein A Protein C Cancer protein weak Protein B Protein D Rescind that order!

  35. As Sample Size Increases Again… Protein E Protein A weak Protein C Cancer protein weak Protein B weak Protein D Eliminate protein C again!

  36. As Sample Size Increases Again… Protein E Protein A weak Protein C Cancer protein weak Protein B weak Etc. Protein D Eliminate protein C again!

  37. Typical Applications • Linear Causal Case: each variable X is a linear function of its parents and a normally distributed hidden variable called an “error term”. The error terms are mutually independent. • Discrete Multinomial Case: each variable X takes on a finite range of values.
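In the linear case, the covariance a network implies can be computed in closed form, which is what makes the patterns of conditional correlation on the earlier slides exact facts about p. An illustrative sketch (the chain structure and coefficients are assumptions, not from the slides):

```python
import numpy as np

def implied_covariance(B):
    """Population covariance of the linear case on slide 37:
    X = B X + e with independent, unit-variance error terms, so
    X = (I - B)^{-1} e and Cov(X) = (I - B)^{-1} (I - B)^{-T}."""
    n = B.shape[0]
    A = np.linalg.inv(np.eye(n) - B)
    return A @ A.T

# Chain 0 -> 1 -> 2 with coefficients 0.8 and 0.5:
B = np.zeros((3, 3))
B[1, 0] = 0.8   # X1 = 0.8 * X0 + e1
B[2, 1] = 0.5   # X2 = 0.5 * X1 + e2
sigma = implied_covariance(B)
# sigma[0, 2] = 0.8 * 0.5 = 0.4: the chain transmits correlation
```

Every compatible distribution in this family is summarized by such a matrix, so conditional independence facts reduce to exact algebraic facts about it.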

  38. An Optimistic Concession • No unobserved latent confounding causes. (Diagram: Genetics as a hidden common cause of Smoking and Cancer.)

  39. Causal Flipping Theorem • No matter what a consistent causal discovery procedure has seen so far, there exists a pair (G, p) satisfying the above assumptions such that the current sample is arbitrarily likely in p and the procedure produces arbitrarily many opposite conclusions in p about an arbitrary causal arrow in G as sample size increases. (Cartoon: “oops, I meant…” repeated.)

  40. Causal Flipping Theorem • Every consistent causal inference method is covered. • Therefore, multiple instability is an intrinsic feature of the causal discovery problem. (Cartoon: “oops, I meant…” repeated.)

  41. The Crooked Course "Living in the midst of ignorance and considering themselves intelligent and enlightened, the senseless people go round and round, following crooked courses, just like the blind led by the blind." Katha Upanishad, I. ii. 5.

  42. Extremist Reaction • Since causal discovery cannot lead straight to the truth, it is not justified. I must remain silent. Therefore, I win.

  43. Moderate Reaction • Many explanations have been offered to make sense of the here-today-gone-tomorrow nature of medical wisdom — what we are advised with confidence one year is reversed the next — but the simplest one is that it is the natural rhythm of science. • (Do We Really Know What Makes us Healthy?, NY Times Magazine, Sept. 16, 2007).

  44. Skepticism Inverted • Unavoidable retractions are justified because they are unavoidable. • Avoidable retractions are not justified because they are avoidable. • So the best possible methods for causal discovery are those that minimize causal retractions. • The best possible means for finding the truth are justified.

  45. Larger Proposal • The same holds for Ockham’s razor in general when the aim is to find the true theory.

  46. IV. Ockham’s Razor

  47. Which Theory is Right? ???

  48. Ockham Says: Choose the Simplest!

  49. But Why? Gotcha!

  50. Puzzle • An indicator must be sensitive to what it indicates.