# Geometrization of Inference

Model: { p(x|θ) }. Truth: t(x).




## Presentation Transcript
Embedding in Hilbert Space

Fisher Information metric automagically induced on the tangent bundle!
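The transcript drops the formulas here. One standard way to see the induced metric (a sketch consistent with the slide, not necessarily the talk's own derivation) is the square-root embedding of the model into Hilbert space,

```latex
\theta \;\mapsto\; \psi_\theta = 2\sqrt{p(x\mid\theta)} \;\in\; L^2 ,
```

whose pullback of the $L^2$ inner product is exactly the Fisher Information metric:

```latex
g_{ij}(\theta)
  = \langle \partial_i \psi_\theta,\, \partial_j \psi_\theta \rangle
  = \int \frac{\partial_i p \,\partial_j p}{p}\, dx
  = E_\theta\!\left[\, \partial_i \log p \;\partial_j \log p \,\right].
```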

The Volume Form as Prior

A hypothesis space M is said to be regular when (M, g) is a smooth orientable Riemannian manifold. A k-dim regular M has volume form:

$$dV = \sqrt{\det g(\theta)}\; d\theta^1 \wedge \cdots \wedge d\theta^k$$

In arbitrary (orientation preserving) theta coordinates the volume of (M, g) is:

$$\mathrm{vol}(M) = \int_M \sqrt{\det g(\theta)}\; d\theta^1 \cdots d\theta^k$$
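As a concrete instance of the volume form as a prior, here is a minimal sketch for the Bernoulli model (my illustrative choice; the talk's own model and its c = 0.1 constant are not reproduced). In one dimension the volume prior is just √g(p), and normalizing it gives the Jeffreys prior Beta(1/2, 1/2):

```python
import math

def fisher_info_bernoulli(p):
    # Fisher information of a Bernoulli(p) model: g(p) = 1 / (p (1 - p))
    return 1.0 / (p * (1.0 - p))

def volume_prior_unnormalized(p):
    # Volume element sqrt(det g) -- in one dimension just sqrt(g(p))
    return math.sqrt(fisher_info_bernoulli(p))

# Total volume of the Bernoulli model under the Fisher metric.
# Analytically: integral_0^1 dp / sqrt(p (1 - p)) = pi, so the
# normalized volume prior is Beta(1/2, 1/2) -- the Jeffreys prior.
eps, m = 1e-8, 1_000_000
total = 0.0
for k in range(m):
    # midpoint rule on (eps, 1 - eps); the integrand is singular but
    # integrable at the endpoints
    p = eps + (1 - 2 * eps) * (k + 0.5) / m
    total += volume_prior_unnormalized(p) * (1 - 2 * eps) / m
print(total)  # close to pi
```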

[Figure: the model and its volume prior (for c = 0.1), with simulation results at n = 100, 500, 1000, and 10000 comparing a FLAT prior (0.025 ± 0.020, −0.032 ± 0.016; 0.048 ± 0.24, 0.039 ± 0.24) against the VOLUME prior (0.025 ± 0.0084, −0.011 ± 0.007); a “No learning” case is also shown.]

Ex: Simple Logistic Regression

Racine’s data:

| Dose x (log g/ml) | No. of animals n | No. of deaths y |
|---|---|---|
| −0.863 | 5 | 0 |
| −0.296 | 5 | 1 |
| −0.053 | 5 | 3 |
| 0.727 | 5 | 5 |

The y are independent, with log(odds of death) = a + b x, where logit(p) = log p/(1−p).

Need: Ignorance Prior on (a, b)
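Before putting a prior on (a, b), the likelihood itself can be checked with a maximum-likelihood fit to Racine’s data. A minimal Newton’s-method sketch in Python (the solver and starting point are my choices, not from the slides):

```python
import math

# Racine's data: dose x (log g/ml), animals n, deaths y per group
xs = [-0.863, -0.296, -0.053, 0.727]
ns = [5, 5, 5, 5]
ys = [0, 1, 3, 5]

def fit_logistic(a=0.0, b=0.0, iters=50):
    """Maximum-likelihood fit of logit(p) = a + b*x by Newton's method.
    The MLE is finite here because the groups are not linearly separated."""
    for _ in range(iters):
        ga = gb = 0.0                  # gradient of the log-likelihood
        haa = hab = hbb = 0.0          # negative Hessian entries
        for x, n, y in zip(xs, ns, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            ga += y - n * p
            gb += (y - n * p) * x
            w = n * p * (1.0 - p)      # binomial weight
            haa += w
            hab += w * x
            hbb += w * x * x
        det = haa * hbb - hab * hab
        a += (hbb * ga - hab * gb) / det   # Newton step: H^{-1} grad
        b += (haa * gb - hab * ga) / det
    return a, b

a_hat, b_hat = fit_logistic()
print(a_hat, b_hat)   # roughly a in 0.8-0.9, b in 7-8 for this data
```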

Ignorance for Logistic Regression (Racine’s data)

MCMC: 250k samples
mean a = 0.12, sd = 3.7
mean b = 0.63, sd = 10.0
corr[a, b] = 0.51
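A minimal random-walk Metropolis sketch of the kind of MCMC run reported above. Assumptions to note: a FLAT prior stands in for the talk’s ignorance prior, and the proposal scales and chain length are my choices; the slide’s 250k-sample run is not reproduced.

```python
import math
import random

# Racine's data: dose x (log g/ml), animals n, deaths y per group
xs = [-0.863, -0.296, -0.053, 0.727]
ns = [5, 5, 5, 5]
ys = [0, 1, 3, 5]

def log1pexp(t):
    # numerically stable log(1 + e^t)
    return t + math.log1p(math.exp(-t)) if t > 0 else math.log1p(math.exp(t))

def log_post(a, b):
    # binomial log-likelihood of logit(p) = a + b*x; a FLAT prior on
    # (a, b) is used here as a stand-in for the talk's volume prior
    ll = 0.0
    for x, n, y in zip(xs, ns, ys):
        eta = a + b * x
        ll += y * eta - n * log1pexp(eta)
    return ll

random.seed(1)
a, b = 0.0, 0.0
cur = log_post(a, b)
samples, accepted = [], 0
for _ in range(25_000):
    # random-walk proposal; step sizes (1.0, 5.0) are ad hoc,
    # roughly matched to the posterior spread
    a_new = a + random.gauss(0.0, 1.0)
    b_new = b + random.gauss(0.0, 5.0)
    prop = log_post(a_new, b_new)
    if math.log(random.random()) < prop - cur:   # Metropolis accept
        a, b, cur = a_new, b_new, prop
        accepted += 1
    samples.append((a, b))

burn = samples[5_000:]   # discard burn-in
mean_a = sum(s[0] for s in burn) / len(burn)
mean_b = sum(s[1] for s in burn) / len(burn)
print(mean_a, mean_b, accepted / 25_000)
```

The posterior mean of b comes out positive, as it must: deaths increase with dose.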

Volumes of Bitnets (bitnets = DAGs of bits)

Worse < BIC < AIC < CIC < Best

% of correct segmentations vs. N. Based on 100 reps for each N. Params at random each time.
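For reference, the two classical criteria in the ranking above are computable in a couple of lines (lower is better for both); CIC is the talk’s own criterion and is not reproduced here:

```python
import math

def aic(loglik, k):
    # Akaike information criterion: 2k - 2*loglik
    return 2 * k - 2 * loglik

def bic(loglik, k, n):
    # Bayesian (Schwarz) information criterion: k*ln(n) - 2*loglik
    return k * math.log(n) - 2 * loglik

# Example: a model with log-likelihood -100, 3 parameters, 50 observations
a = aic(-100.0, 3)
b = bic(-100.0, 3, 50)
print(a)  # 206.0
print(b)  # about 211.74: BIC penalizes parameters more once n > e^2
```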

.jpg  .aiff  .txt  .gz

“Sing, O goddess, the anger of Achilles son of Peleus, that brought countless ills upon the Achaeans. Many a brave soul did it send hurrying down to Hades, and many a hero did it yield a prey to dogs and vultures, for so were the counsels of Jove fulfilled from the day on which the son of Atreus, king of men, and great Achilles, first fell out with one another…”

[Diagram: the sources above are each encoded as a binary string 01100…0]

CIC

MDL: bold pragmatism

Forget about the data being generated by a probability distribution. This is just a CODING GAME!!

Best model is the one providing the shortest code for the observed data.

Data is all there is!
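The “shortest code” idea can be made concrete with a toy two-part code. A minimal sketch, assuming the standard (1/2)·log₂ n asymptotic cost per parameter (my choice of approximation, not from the slides):

```python
import math

def two_part_codelength(bits):
    """Two-part MDL code length (in bits) for a binary string under a
    Bernoulli model: data cost at the MLE plus ~(1/2) log2 n for the
    single parameter (a standard asymptotic approximation)."""
    n, ones = len(bits), sum(bits)
    if ones in (0, n):          # degenerate MLE: data cost is 0
        data_cost = 0.0
    else:
        t = ones / n
        data_cost = -(ones * math.log2(t) + (n - ones) * math.log2(1 - t))
    return data_cost + 0.5 * math.log2(n)

bits = [1] * 30 + [0] * 70      # a compressible sequence: 30% ones
mdl = two_part_codelength(bits)
print(mdl)  # well below the 100 bits of the raw encoding
```

The Bernoulli model wins over the raw 100-bit encoding, so MDL would prefer it — exactly the coding game the slide describes.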

There Is a Problem

The shortest description length of a sequence is NON-COMPUTABLE!! And can only be approximated with MODELS.

Data and Theory are Entangled

There is no data in the vacuum.

Data is a logical proposition with truth values only relative to a given domain of discourse.

A sequence 0110011110… is NOT DATA, just as the number 2.4 is not data unless it is understood as “the result of such and such experiment is 2.4”.

IMHO

I have a brain. I observe x. I want to understand x. I need to predict future x’. How?

• no immaculate observations
• no theoretical vacuum
• no fact w/o fiction
• no data w/o theory

### Why?

Data is a logical proposition in a domain of discourse.

### Data must have meaning

By “meaning” I mean Theory:

Theory = explanation = compressing code = probability distribution

The Statistical Manifold

[Diagram: observed and hidden variables; the data manifold within the space of finite measures]

Max (Ignorance) s.t. whatever is known.
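The “Max (Ignorance) s.t. whatever is known” recipe is the maximum-entropy program. In its standard form (a sketch; the constraint functions f_j and values F_j are generic placeholders, not symbols from the slides):

```latex
\max_{p}\; \Big(-\int p \log p\Big)
\quad \text{s.t.} \quad
\int p = 1, \qquad \int f_j\, p = F_j ,
```

whose solution by Lagrange multipliers is the exponential family

```latex
p(x) = \frac{1}{Z(\lambda)} \exp\Big(\sum_j \lambda_j f_j(x)\Big),
\qquad
Z(\lambda) = \int \exp\Big(\sum_j \lambda_j f_j(x)\Big)\, dx .
```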

(I forgot to mention that Bayes’ Theorem follows from this as a special case, and in 2 very different ways!)

Objective procedure for transforming prior information into prior distributions.

• A new understanding of Data, Prior, and Likelihood.
• Optimality of scalar field Conjugate Priors for Exp. Fam.
• The discovery of Antidata and virtual data.
• Optimality of Priors with tails following power laws.
• Evaporation of the Bayesian/Freq. divide.
• A dent at the Mind/Body problem.
• A justification of Perelman’s Action. (That proved Thurston’s Geometrization Conjecture.)
• A Geometric Theory of Ignorance.
• The solution to a 260-year-old problem: Objective Quantification of Ignorance in Statistical Inference.