Geometrization of Inference: Embedding in Hilbert Space

Model: { p(x|)} Truth: t(x) t  Geometrization of Inference

Embedding in Hilbert Space: the Fisher information metric is automagically induced on the tangent bundle!
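A minimal numerical sketch of the claim on this slide. It assumes the embedding is the usual square-root map θ ↦ 2·sqrt(p(x|θ)) (the slide does not spell out the map) and uses a Bernoulli model as a hypothetical example; the function names are illustrative only. The squared Hilbert norm of the tangent vector should agree with the Fisher information 1/(θ(1−θ)).

```python
import math

def embed(theta):
    """Square-root embedding of Bernoulli(theta) as a vector (x=0, x=1)."""
    return (2.0 * math.sqrt(1.0 - theta), 2.0 * math.sqrt(theta))

def induced_metric(theta, h=1e-6):
    """Squared norm of d(embed)/d(theta), by central finite differences."""
    lo, hi = embed(theta - h), embed(theta + h)
    tangent = [(b - a) / (2.0 * h) for a, b in zip(lo, hi)]
    return sum(t * t for t in tangent)

def fisher_information(theta):
    """Closed-form Fisher information of the Bernoulli model."""
    return 1.0 / (theta * (1.0 - theta))

# The two columns should agree up to finite-difference error.
for theta in (0.1, 0.3, 0.5, 0.9):
    print(theta, induced_metric(theta), fisher_information(theta))
```

The factor 2 in the embedding is what makes the induced metric equal the Fisher information rather than a quarter of it.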

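The Volume Form as Prior. A hypothesis space M is said to be regular when (M,g) is a smooth orientable Riemannian manifold. A k-dim regular M has a volume form $dV$. In arbitrary (orientation preserving) θ coordinates, $dV = \sqrt{\det g(\theta)}\, d\theta^1 \wedge \cdots \wedge d\theta^k$, so the volume of (M,g) is $\mathrm{vol}(M) = \int \sqrt{\det g(\theta)}\, d\theta$.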
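As a small worked case, here is a sketch of the volume prior for a one-parameter Bernoulli model (a hypothetical example, not taken from the slides). It assumes the standard coordinate expression sqrt(det g(θ)) for the volume density, with g the Fisher information 1/(θ(1−θ)). The total volume of this model is π, so the normalised volume prior is sqrt(det g(θ))/π.

```python
import math

def sqrt_det_g(theta):
    """sqrt(det g) for Bernoulli(theta); g is the 1x1 Fisher information."""
    return 1.0 / math.sqrt(theta * (1.0 - theta))

def volume(n=200_000):
    """Midpoint-rule estimate of the total volume over 0 < theta < 1."""
    h = 1.0 / n
    return h * sum(sqrt_det_g((i + 0.5) * h) for i in range(n))

def volume_prior(theta):
    """Volume density normalised by the exact total volume, which is pi."""
    return sqrt_det_g(theta) / math.pi

print(volume())           # close to pi; the rule converges slowly at the endpoints
print(volume_prior(0.5))  # 2/pi
```

When the total volume is infinite the same density can still be used as an improper prior, but it can no longer be normalised this way.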

Model Volume prior (for c = 0.1)

n=100  n=500  n=100  n=1000  No learning
FLAT    0.025 +/- 0.020   -0.032 +/- 0.016   0.048 +/- 0.24   0.039 +/- 0.24
VOLUME  0.025 +/- 0.0084  -0.011 +/- 0.007
n=10000  n=1000

Ex: Simple Logistic Regression (Racine’s data)
Dose x (log g/ml)   No. of animals n   No. of deaths y
-0.863              5                  0
-0.296              5                  1
-0.053              5                  3
 0.727              5                  5
Racine’s data; observations independent. log(odds of death) = a + b x, where logit(p) = log(p/(1-p)). Need: Ignorance Prior on (a,b).
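The model on this slide can be sketched in code as follows. The data are the four dose groups listed above. The volume density sqrt(det g(a,b)) is shown as one candidate ignorance prior, using the standard Fisher information of binomial logistic regression; whether this is exactly the prior used in the talk is an assumption, and all function names are illustrative.

```python
import math

DOSE = [-0.863, -0.296, -0.053, 0.727]  # x, log g/ml
ANIMALS = [5, 5, 5, 5]                   # n
DEATHS = [0, 1, 3, 5]                    # y

def prob_death(a, b, x):
    """Inverse logit: p such that logit(p) = a + b*x."""
    return 1.0 / (1.0 + math.exp(-(a + b * x)))

def log_likelihood(a, b):
    """Binomial log-likelihood, omitting the constant binomial coefficients."""
    total = 0.0
    for x, n, y in zip(DOSE, ANIMALS, DEATHS):
        p = prob_death(a, b, x)
        total += y * math.log(p) + (n - y) * math.log(1.0 - p)
    return total

def fisher_information(a, b):
    """Entries (g_aa, g_ab, g_bb) of the 2x2 Fisher information matrix."""
    g_aa = g_ab = g_bb = 0.0
    for x, n in zip(DOSE, ANIMALS):
        p = prob_death(a, b, x)
        w = n * p * (1.0 - p)  # binomial variance weight for this dose group
        g_aa += w
        g_ab += w * x
        g_bb += w * x * x
    return g_aa, g_ab, g_bb

def volume_density(a, b):
    """Unnormalised volume prior density sqrt(det g(a, b))."""
    g_aa, g_ab, g_bb = fisher_information(a, b)
    return math.sqrt(g_aa * g_bb - g_ab * g_ab)

print(log_likelihood(0.0, 0.0), volume_density(0.0, 0.0))
```

The density depends on the design (the doses and group sizes) but not on the observed deaths, which is what one would expect of a prior. For extreme values of (a, b) the logarithms in `log_likelihood` can overflow or hit log(0); a production version would work on the logit scale directly.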

Ignorance for Logistic Regression (Racine’s data)
MCMC: 250k samples
mean a = 0.12, sd = 3.7
mean b = 0.63, sd = 10.0
corr[a,b] = 0.51

Volumes of Bitnets (bitnets = DAGs of bits)

Worse < BIC < AIC < CIC < Best. % of correct segmentations vs. N, based on 100 reps for each N. Params chosen at random each time.

.jpg  .aiff  .txt  .gz
The Iliad: BOOK I
Sing, O goddess, the anger of Achilles son of Peleus, that brought countless ills upon the Achaeans. Many a brave soul did it send hurrying down to Hades, and many a hero did it yield a prey to dogs and vultures, for so were the counsels of Jove fulfilled from the day on which the son of Atreus, king of men, and great Achilles, first fell out with one another…
+ + 01100…0 + CIC

MDL: bold pragmatism. Forget about the data being generated by a probability distribution. This is just a CODING GAME!! The best model is the one providing the shortest code for the observed data. Data is all there is!
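A toy version of the coding game, as a hedged sketch. Two hypothetical models compete to encode a binary sequence: a fair coin with no parameters, and a Bernoulli model with one fitted parameter. The (1/2)·log2(n) bits charged for the parameter is a common asymptotic approximation and is an assumption here, not something stated on the slide.

```python
import math

def fair_coin_bits(bits):
    """Code length under a fair coin: exactly one bit per symbol."""
    return float(len(bits))

def bernoulli_bits(bits):
    """Two-part code length: data given the fitted theta, plus parameter cost."""
    n, ones = len(bits), sum(bits)
    theta = ones / n
    data_bits = 0.0
    for count, prob in ((ones, theta), (n - ones, 1.0 - theta)):
        if count:  # a zero count contributes nothing and would hit log2(0)
            data_bits -= count * math.log2(prob)
    return data_bits + 0.5 * math.log2(n)

def best_model(bits):
    """Return the name of the model with the shorter code, and both lengths."""
    lengths = {"fair coin": fair_coin_bits(bits), "Bernoulli": bernoulli_bits(bits)}
    return min(lengths, key=lengths.get), lengths

skewed = [1] * 90 + [0] * 10
balanced = [0, 1] * 50
print(best_model(skewed))
print(best_model(balanced))
```

On the skewed sequence the extra parameter pays for itself; on the balanced one it does not, so the simpler model wins.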

There Is a Problem: the shortest description length of a sequence is NON-COMPUTABLE!! It can only be approximated with MODELS.
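One way to see the point about approximation: any real compressor is a fixed model, and its output length is a computable upper bound on the description length, never the true minimum. The sketch below uses Python's standard `zlib` module as a stand-in model; the choice of compressor and the test sequences are assumptions for illustration.

```python
import random
import zlib

def description_length_bits(data: bytes) -> int:
    """Bits in the zlib-compressed form: an upper bound under this one model."""
    return 8 * len(zlib.compress(data, 9))

# A highly regular sequence and a pseudo-random one of the same length.
regular = b"01" * 5000
rng = random.Random(0)  # fixed seed so the example is reproducible
noisy = bytes(rng.getrandbits(8) for _ in range(10000))

print(description_length_bits(regular))
print(description_length_bits(noisy))
```

The pseudo-random bytes look incompressible to `zlib`, even though a far shorter description exists (the seed plus the generator code), which is exactly the gap between a model-based code and the non-computable optimum.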

Data and Theory are Entangled. There is no data in a vacuum. Data is a logical proposition with truth values only relative to a given domain of discourse. A sequence 0110011110… is NOT DATA, just as the number 2.4 is not data unless it is understood as “the result of such and such experiment is 2.4”. Data is theory laden. Theory is data laden. IMHO

I have a brain. I obs. x. I want to understand x. I need to predict future x’. How?

No immaculate obs. No theoretical vacuum. No fact w/o fiction. No data w/o theory. dataTheory

Why? Data is a logical proposition in a domain of discourse.

Data must have meaning; by meaning I mean Theory.

Theory = explanation = compressing code = Probability distribution
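The last two equalities can be made concrete with Shannon code lengths: a probability distribution p assigns each outcome a code length of ceil(-log2 p) bits, and these lengths satisfy Kraft's inequality, so a prefix code with them exists. Conversely, code lengths define probabilities proportional to 2^(-length). The sketch below uses a hypothetical four-symbol distribution; the names are illustrative.

```python
import math

def code_lengths(dist):
    """Shannon code lengths ceil(-log2 p) for every symbol with p > 0."""
    return {s: math.ceil(-math.log2(p)) for s, p in dist.items() if p > 0}

def kraft_sum(lengths):
    """Sum of 2^-length; a prefix code with these lengths exists iff it is <= 1."""
    return sum(2.0 ** -l for l in lengths.values())

def lengths_to_dist(lengths):
    """The other direction: code lengths define probabilities 2^-length, normalised."""
    z = kraft_sum(lengths)
    return {s: 2.0 ** -l / z for s, l in lengths.items()}

theory = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
lengths = code_lengths(theory)
print(lengths)                   # {'a': 1, 'b': 2, 'c': 3, 'd': 3}
print(kraft_sum(lengths))        # 1.0
print(lengths_to_dist(lengths))
```

For this dyadic example the round trip recovers the original distribution exactly; for general probabilities the ceiling costs less than one bit per symbol.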

obs. hidden

The Statistical Manifold
data manifold finite measures

Sufficient map

Canonical Example:

Ignorance = Independence & Uniformity
spread concentrate

Max (Ignorance) s.t. Whatever is known (I forgot to mention that Bayes Theorem follows from this as a special case and in 2 very different ways!)
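A classic small instance of "maximise ignorance subject to what is known", offered as a sketch only. It uses Shannon entropy as a stand-in measure of ignorance (the talk's own measure may differ) and a hypothetical constraint: a six-sided die whose mean is known to be 4.5. The entropy maximiser under a mean constraint has the form p_k ∝ exp(λ·k), so only λ has to be found.

```python
import math

FACES = range(1, 7)

def gibbs(lam):
    """Distribution p_k proportional to exp(lam * k) on the faces 1..6."""
    weights = [math.exp(lam * k) for k in FACES]
    z = sum(weights)
    return [w / z for w in weights]

def mean(p):
    """Expected face value under p."""
    return sum(k * pk for k, pk in zip(FACES, p))

def max_entropy_die(target_mean, lo=-20.0, hi=20.0, iters=200):
    """Bisection on lam; the mean of gibbs(lam) increases with lam."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mean(gibbs(mid)) < target_mean:
            lo = mid
        else:
            hi = mid
    return gibbs(0.5 * (lo + hi))

p = max_entropy_die(4.5)
print([round(pk, 4) for pk in p], mean(p))
```

With a target mean of 3.5 nothing beyond normalisation is known and the procedure returns the uniform distribution, matching the "Independence & Uniformity" reading of ignorance.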

What’s New?
• Objective proc. for transforming prior info into prior distributions.
• A new understanding of Data, Prior, and Likelihood.
• Optimality of scalar field Conjugate Priors for Exp. Fam.
• The discovery of Antidata and virtual data.
• Optimality of Priors with tails following power laws.
• Evaporation of the Bayesian/Freq. divide.
• A dent at the Mind/Body problem.
• A justification of Perelman’s Action. (That proved Thurston’s Geometrization Conjecture.)
• A Geometric Theory of Ignorance.
• The solution to a 260-year-old problem: Objective Quantification of Ignorance in Statistical Inference.

Spinning Ed!
