
Simplicity and Truth: an Alternative Explanation of Ockham's Razor

Simplicity and Truth: an Alternative Explanation of Ockham's Razor. Kevin T. Kelly Conor Mayo-Wilson Department of Philosophy Joint Program in Logic and Computation Carnegie Mellon University www.hss.cmu.edu/philosophy/faculty-kelly.php. I. The Simplicity Puzzle. Which Theory is Right?.


Presentation Transcript


  1. Simplicity and Truth: an Alternative Explanation of Ockham's Razor Kevin T. Kelly Conor Mayo-Wilson Department of Philosophy Joint Program in Logic and Computation Carnegie Mellon University www.hss.cmu.edu/philosophy/faculty-kelly.php

  2. I. The Simplicity Puzzle

  3. Which Theory is Right? ???

  4. Ockham Says: Choose the Simplest!

  5. But Why? Gotcha!

  6. Puzzle • An indicator must be sensitive to what it indicates. simple

  7. Puzzle • A reliable indicator must be sensitive to what it indicates. complex

  8. Puzzle • But Ockham’s razor always points at simplicity. simple

  9. Puzzle • But Ockham’s razor always points at simplicity. complex

  10. Puzzle • How can a broken compass help you find something unless you already know where it is? complex

  11. Standard Accounts • 1. Prior Simplicity Bias: Bayes, BIC, MDL, MML, etc. • 2. Risk Minimization: SRM, AIC, cross-validation, etc.

  12. 1. Prior Simplicity Bias The simple theory is more plausible now because it was more plausible yesterday.

  13. More Subtle Version • Simple data are a miracle in the complex theory but not in the simple theory. Regularity: retrograde motion of Venus at solar conjunction Has to be! P C

  14. However… • e would not be a miracle given $P(\theta)$. Why not this? P C

  15. The Real Miracle • Ignorance about model: $p(C) \approx p(P)$; + Ignorance about parameter setting: $p(P(\theta) \mid P) \approx p(P(\theta') \mid P)$; = Knowledge about C vs. $P(\theta)$: $p(P(\theta)) \ll p(C)$. Lead into gold. Perpetual motion. Free lunch. Ignorance is knowledge. War is peace. I love Big Bayes.
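
The arithmetic behind this slide can be sketched in a few lines. The setup is an assumption made for illustration: the simple theory C has no free parameter, the complex theory P has a parameter cut into `n_settings` equally likely settings, and `indifference_prior` is a hypothetical helper name. Indifference at both levels then yields a strong opinion about C versus any single setting of P.

```python
from fractions import Fraction

def indifference_prior(n_settings):
    """Flat prior over the models {C, P}, then flat over P's parameter settings.

    Returns (p(C), p(one particular setting of P)).
    """
    p_C = Fraction(1, 2)                 # ignorance about model
    p_P = Fraction(1, 2)
    p_setting_given_P = Fraction(1, n_settings)  # ignorance about setting
    p_each_setting = p_P * p_setting_given_P
    return p_C, p_each_setting

p_C, p_theta = indifference_prior(1000)
# The ratio grows linearly with the number of settings.
print(p_C, p_theta, p_C / p_theta)
```

With 1000 settings, C comes out 1000 times more probable than any particular setting of P, although nothing but indifference went in.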

  16. Standard Paradox of Indifference • Ignorance of red vs. not-red + Ignorance over not-red = Knowledge about red vs. white. Knognorance = all the privileges of knowledge with none of the responsibilities. Yeah!

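  17. The Ellsberg Paradox [Figure: urn with proportions 1/3, ?, ?]

  18. Human Preference [Figure: the same urn; two comparisons among bets on a, b, and c, one marked > and the other <]

  19. Human View [Figure: as before, with each bet labeled knowledge or ignorance; the bet labeled knowledge is preferred in both comparisons]

  20. Bayesian View [Figure: as before, with every bet labeled knognorance and both comparisons marked >]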
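
A minimal sketch of why no single prior reproduces the human pattern, assuming the standard three-color Ellsberg urn: outcome a has known chance 1/3, while b and c share the remaining 2/3 in unknown proportion. The assumed preferences are "a over b" and "b or c over a or c"; the function name and grid size are illustrative only.

```python
from fractions import Fraction

def rationalizing_priors(steps=300):
    """Scan candidate values of p(b) and keep those matching both preferences."""
    third = Fraction(1, 3)
    found = []
    for k in range(steps + 1):
        p_b = Fraction(2, 3) * Fraction(k, steps)
        p_c = Fraction(2, 3) - p_b
        prefers_a_over_b = third > p_b                    # needs p(b) < 1/3
        prefers_bc_over_ac = (p_b + p_c) > (third + p_c)  # needs p(b) > 1/3
        if prefers_a_over_b and prefers_bc_over_ac:
            found.append(p_b)
    return found

print(rationalizing_priors())  # no candidate survives
```

The two preferences demand p(b) < 1/3 and p(b) > 1/3 at once, so the scan comes back empty for any grid.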

  21. In Any Event The coherentist foundations of Bayesianism have nothing to do with short-run truth-conduciveness. Not so loud!

  22. Bayesian Convergence • Too-simple theories get shot down… Updated opinion Theories Complexity

  23. Bayesian Convergence • Plausibility is transferred to the next-simplest theory… Updated opinion Plink! Theories Complexity Blam!

  26. Bayesian Convergence • The true theory is nailed to the fence. Updated opinion Zing! Theories Complexity Blam!
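
The "Blam! Plink!" sequence can be imitated with a toy update. This is a sketch under stated assumptions, not the construction in the talk: theory k says exactly k effects will ever appear, the prior weight of theory k is proportional to 2^-k, and the data seen so far are treated as equally likely under every theory they do not refute.

```python
from fractions import Fraction

def posterior_after(effects_seen, max_complexity=5):
    """Renormalize a simplicity-biased prior over the unrefuted theories."""
    prior = {k: Fraction(1, 2 ** k) for k in range(max_complexity + 1)}
    # Theories positing fewer effects than already observed are shot down.
    total = sum(w for k, w in prior.items() if k >= effects_seen)
    return {k: (w / total if k >= effects_seen else Fraction(0))
            for k, w in prior.items()}

for seen in range(4):
    post = posterior_after(seen)
    favorite = max(post, key=post.get)
    print(seen, favorite, float(post[favorite]))
```

After each refutation the largest share of opinion sits on the simplest surviving theory, and that preference is inherited from the prior weights rather than produced by the update.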

  27. Convergence • But alternative strategies also converge: • Anything in the short run is compatible with convergence in the long run.

  28. Summary of Bayesian Approach • Prior-based explanations of Ockham’s razor are circular and based on a faulty model of ignorance. • Convergence-based explanations of Ockham’s razor fail to single out Ockham’s razor.

  29. 2. Risk Minimization • Ockham’s razor minimizes expected distance of empirical estimates from the true value. Truth

  30. Unconstrained Estimates • Centered on truth but spread around it. Pop! Pop! Pop! Pop! Unconstrained aim

  31. Constrained Estimates • Off-center but less spread. Truth Clamped aim

  32. Constrained Estimates • Off-center but less spread • Overall improvement in expected distance from truth… Pop! Pop! Pop! Pop! Truth Clamped aim

  33. Doesn’t Find True Theory • The theory that minimizes estimation risk can be quite false. Four eyes! Clamped aim
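
The trade-off between centering and spread can be made concrete with the usual bias-variance decomposition. The setting is assumed for illustration: a single noisy observation X with mean `true_mean` and standard deviation `noise_sd`, estimated by `shrink * X`. A shrink of 1 is the unconstrained aim; a shrink of 0 clamps the estimate at zero.

```python
def mse(shrink, true_mean, noise_sd):
    """Expected squared distance of shrink * X from true_mean."""
    bias = (shrink - 1.0) * true_mean     # how far off-center the aim is
    variance = (shrink * noise_sd) ** 2   # how widely the shots spread
    return bias ** 2 + variance

true_mean, noise_sd = 0.5, 1.0
unconstrained = mse(1.0, true_mean, noise_sd)  # centered, fully spread
clamped = mse(0.0, true_mean, noise_sd)        # off-center, no spread
print(unconstrained, clamped)
```

With these numbers the clamped estimate has the lower risk even though it corresponds to the false claim that the mean is zero.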

  34. Makes Sense …when loss of an answer is similar in nearby distributions. Close is good enough! Loss Similarity p

  35. But Truth Matters …when loss of an answer is discontinuous with similarity. Loss Close is no cigar! Similarity p

  36. E.g. Science If you want true laws, false laws aren’t good enough.

  37. E.g. Science You must be a philosopher. This is a machine learning conference.

  38. E.g., Causal Data Mining Protein A Protein C Cancer protein Protein B Now you’re talking! I’m on a cilantro-only diet to get my protein C level under control. Practical enough?

  39. Central Idea Correlation does imply causation if there are multiple variables, some of which are common effects. [Pearl, Spirtes, Glymour and Scheines] Protein A Protein C Cancer protein Protein B

  40. Core Assumptions • Joint distribution p is causally compatible with directed acyclic graph G iff: • Causal Markov Condition: each variable X is independent of its non-effects given its immediate causes. • Faithfulness Condition: no other conditional independence relations hold in p.

  41. Tell-tale Dependencies • Given F, H gives some info about C (Faithfulness). • Given C, F1 gives no further info about F2 (Markov). [Figure: one graph over C, H, F and one over C, F1, F2]
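
Both tell-tale patterns can be checked exactly on small discrete examples. The graphs and numbers are assumptions made for illustration, since the slide's figure is not recoverable here: a common effect C -> F <- H with F = C or H, and a common cause F1 <- C -> F2 in which each F copies C but flips with probability 1/4.

```python
from fractions import Fraction
from itertools import product

half = Fraction(1, 2)

def prob(joint, event):
    return sum(p for outcome, p in joint.items() if event(outcome))

def cond(joint, event, given):
    return prob(joint, lambda o: event(o) and given(o)) / prob(joint, given)

# Assumed common effect: C and H are independent fair coins, F = C or H.
# Outcomes are tuples (c, h, f).
collider = {(c, h, c | h): half * half for c, h in product([0, 1], repeat=2)}
p_c_given_f = cond(collider, lambda o: o[0] == 1, lambda o: o[2] == 1)
p_c_given_f_h = cond(collider, lambda o: o[0] == 1,
                     lambda o: o[2] == 1 and o[1] == 1)

# Assumed common cause: outcomes are tuples (c, f1, f2).
flip = Fraction(1, 4)
fork = {}
for c, f1, f2 in product([0, 1], repeat=3):
    p_f1 = 1 - flip if f1 == c else flip
    p_f2 = 1 - flip if f2 == c else flip
    fork[(c, f1, f2)] = half * p_f1 * p_f2
p_f2_given_c = cond(fork, lambda o: o[2] == 1, lambda o: o[0] == 1)
p_f2_given_c_f1 = cond(fork, lambda o: o[2] == 1,
                       lambda o: o[0] == 1 and o[1] == 1)

print(p_c_given_f, p_c_given_f_h)      # differ: H informs about C given F
print(p_f2_given_c, p_f2_given_c_f1)   # equal: F1 adds nothing given C
```

Learning H changes the probability of C once F is known, while learning F1 leaves the probability of F2 unchanged once C is known.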

  42. Common Applications • Linear Causal Case: each variable X is a linear function of its parents and a normally distributed hidden variable called an “error term”. The error terms are mutually independent. • Discrete Multinomial Case: each variable X takes on a finite range of values.

  43. A Very Optimistic Assumption • No unobserved latent confounding causes. I’ll give you this one. What’s he up to? [Figure nodes: Genetics, Smoking, Cancer]

  44. Current Nutrition Wisdom Protein A Protein C Cancer protein Protein B Are you kidding? It’s dripping with Protein C! English Breakfast?

  45. As the Sample Increases… Protein A Protein C Cancer protein weak Protein B Protein D This situation approximates the last one. So who cares? I do! Out of my way!

  46. As the Sample Increases Again… Protein E Protein A weak Protein C Cancer protein weak Protein B weak Protein D Wasn’t that last approximation to the truth good enough? Aaack! I’m poisoned!

  47. Causal Flipping Theorem • No matter what a consistent causal discovery procedure has seen so far, there exists a pair G, p satisfying the assumptions so that the current sample is arbitrarily likely and the procedure produces arbitrarily many opposite conclusions in p as sample size increases. Oops, I meant… Oops, I meant… Oops, I meant…
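
A rough sketch of why weak dependencies surface only at larger samples, which is the mechanism the flipping scenario exploits. It is not the theorem or its proof. The detection rule is an assumption: a two-sided Fisher z-test at the 5% level, applied as if the sample correlation equaled the true one; `min_sample_to_detect` is a hypothetical helper.

```python
import math

def min_sample_to_detect(correlation, z_crit=1.96):
    """Smallest n with atanh(|r|) * sqrt(n - 3) > z_crit."""
    z = math.atanh(abs(correlation))
    return math.floor((z_crit / z) ** 2 + 3) + 1

for r in (0.5, 0.05, 0.01):
    print(r, min_sample_to_detect(r))
```

Each weaker dependence needs a much larger sample before it registers, so a procedure that reads off structure from detected dependencies can be led to revise its conclusion every time another weak edge crosses the threshold.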

  48. The Wrong Reaction • The demon undermines justification of science. • He must be defeated to forestall skepticism. • Bayesian circularity • Classical instrumentalism Urk! Grrrr!

  49. Another View • Many explanations have been offered to make sense of the here-today-gone-tomorrow nature of medical wisdom — what we are advised with confidence one year is reversed the next — but the simplest one is that it is the natural rhythm of science. • (Do We Really Know What Makes us Healthy, NY Times Magazine, Sept. 16, 2007).

  50. Zen Approach • Get to know the demon. • Locate the justification of Ockham’s razor in his power.
