Information Geometry of Self-organizing maximum likelihood

Bernoulli 2000 Conference at Riken on 27 October, 2000

Information Geometry of Self-organizing maximum likelihood

Shinto Eguchi, ISM, GUAS

This talk is based on joint research with

Dr Yutaka Kano, Osaka Univ

Consider a statistical model:

Maximum Likelihood Estimation (MLE)

(Fisher, 1922)

Consistency, efficiency, sufficiency, unbiasedness, invariance, information

Take an increasing function Ψ.

Ψ-MLE

[Figure: normal density fitted to the given data by MLE and by Ψ-MLE, and the same comparison when the data contain an outlier. Horizontal axis from -3 to 3; vertical axis from 0.1 to 0.4.]
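The contrast between MLE and Ψ-MLE under an outlier can be sketched numerically. The slides' own choice of Ψ did not survive, so the sketch below assumes the power-type function Ψ(z) = (exp(βz) - 1)/β and a normal location model with known σ; under those assumptions the estimating equation reduces to a weighted mean in which each point is weighted by f(x; μ)^β. The function names, the data, and the value of β are illustrative only.

```python
import math

def mle_mean(xs):
    """Ordinary MLE of a normal mean: the sample average."""
    return sum(xs) / len(xs)

def psi_mle_mean(xs, beta=0.5, sigma=1.0, iters=200):
    """Fixed-point iteration for a normal mean under an ASSUMED Psi.

    Assumption (not from the slides): Psi(z) = (exp(beta*z) - 1)/beta
    and sigma known, so each point gets weight f(x; mu)**beta and
    points far from the current mu are smoothly downweighted.
    """
    ordered = sorted(xs)
    mu = ordered[len(ordered) // 2]  # start from a middle order statistic
    for _ in range(iters):
        w = [math.exp(-beta * (x - mu) ** 2 / (2 * sigma ** 2)) for x in xs]
        mu = sum(wi * x for wi, x in zip(w, xs)) / sum(w)
    return mu

# Hypothetical data: nine well-behaved points plus one gross outlier.
data = [-1.2, -0.7, -0.3, -0.1, 0.0, 0.2, 0.4, 0.8, 1.1]
contaminated = data + [10.0]

print("MLE, clean data:          ", mle_mean(data))
print("MLE, with outlier:        ", mle_mean(contaminated))
print("Psi-MLE, clean data:      ", psi_mle_mean(data))
print("Psi-MLE, with outlier:    ", psi_mle_mean(contaminated))
```

The sample mean is pulled toward the outlier, while the weighted estimate barely moves because the outlier's weight is negligible. The result of the iteration can depend on the starting value, which is why a middle order statistic is used here rather than the mean.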

Examples

KL-divergence

(1)

(2)

-divergence

-divergence

(3)

Pythagorean theorem

[Figure: three points labeled f, g, h, and the unit square with corners (0,0), (1,0), (0,1), (1,1) containing the point (t, s).]

(Pf)

Differential geometry of

Riemann metric

Affine connection

Conjugate affine connection

Csiszár’s divergence

-divergence

Amari’s α-divergence

Ψ-likelihood function

Kullback-Leibler and maximum likelihood
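The relation named in this heading can be written out in two lines. The notation is an assumption, since the slide's formulas did not survive: g denotes the data-generating density and f(x, θ) the model density.

```latex
\begin{align*}
D_{\mathrm{KL}}(g, f_\theta)
  &= \int g(x) \log \frac{g(x)}{f(x,\theta)} \, dx
   = \int g(x) \log g(x) \, dx - \int g(x) \log f(x,\theta) \, dx, \\
\hat{\theta}_{\mathrm{MLE}}
  &= \arg\max_\theta \frac{1}{n} \sum_{i=1}^{n} \log f(x_i,\theta)
   \approx \arg\min_\theta D_{\mathrm{KL}}(g, f_\theta).
\end{align*}
```

The first integral does not depend on θ, and the sample average of log f(x_i, θ) estimates the second integral, so maximizing the likelihood amounts to minimizing an empirical version of the KL-divergence.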

M-estimation (Huber, 1964, 1983)

Another definition of Ψ-likelihood

Take a positive function k(x, θ) and define

The Ψ-likelihood equation is a weighted score with integrability.
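A worked form of the weighted-score statement may help. The definition on the slide was lost, so the block below assumes one standard form of a likelihood built from an increasing function Ψ, with a correction term b_Ψ(θ) that keeps the estimating equation unbiased; the symbols ℓ, s, Ξ, and b_Ψ are introduced here for illustration.

```latex
\begin{align*}
L_\Psi(\theta) &= \frac{1}{n}\sum_{i=1}^{n} \Psi\bigl(\ell(x_i,\theta)\bigr) - b_\Psi(\theta),
\qquad \ell(x,\theta) = \log f(x,\theta), \\
b_\Psi(\theta) &= \int \Xi\bigl(\ell(x,\theta)\bigr)\,dx,
\qquad \Xi(z) = \int_{-\infty}^{z} e^{s}\,\Psi'(s)\,ds, \\
\frac{\partial L_\Psi}{\partial\theta}
 &= \frac{1}{n}\sum_{i=1}^{n} k(x_i,\theta)\,s(x_i,\theta)
  - \mathrm{E}_\theta\bigl[k(X,\theta)\,s(X,\theta)\bigr] = 0,
\qquad k(x,\theta) = \Psi'\bigl(\ell(x,\theta)\bigr).
\end{align*}
```

Here s(x, θ) is the ordinary score, the derivative of ℓ with respect to θ. Under this assumed form the weight k is positive because Ψ is increasing, and the equation is integrable in the sense that it is the gradient of the single function L_Ψ. With Ψ(z) = z the weight is identically 1, the expectation term vanishes, and the ordinary likelihood equation is recovered.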

Consistency of Ψ-MLE

Fisher consistency

ε-contamination model of

Influence function

Asymptotic efficiency

Robustness or Efficiency
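The trade-off in this heading can be made concrete for one special case. The sketch below is an illustration under stated assumptions, not the slides' own computation: it takes the normal location model with known variance and the power-type weight f(x; μ)^β (that is, Ψ(z) = (exp(βz) - 1)/β), for which the influence function and the asymptotic efficiency relative to the MLE have closed forms. The function names are hypothetical.

```python
import math

def efficiency(beta):
    """Asymptotic efficiency relative to the MLE.

    Assumed setting: normal mean, known variance, weight f**beta.
    Follows from E[psi^2] / (E[d psi / d mu])^2 with
    psi(x, mu) = (x - mu) * exp(-beta * (x - mu)**2 / 2).
    """
    return (1 + 2 * beta) ** 1.5 / (1 + beta) ** 3

def influence(y, beta):
    """Influence function at N(0, 1) for a contaminating point y."""
    return (1 + beta) ** 1.5 * y * math.exp(-beta * y * y / 2)

for beta in (0.0, 0.1, 0.25, 0.5, 1.0):
    print(f"beta={beta:4.2f}  efficiency={efficiency(beta):.3f}  "
          f"influence at y=3: {influence(3.0, beta):.3f}  "
          f"at y=10: {influence(10.0, beta):.3g}")
```

At β = 0 the estimator is the MLE: full efficiency, but an influence function that grows linearly in y. For β > 0 the influence of a distant point decays to zero, at the cost of an efficiency below 1 that falls further as β increases.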

Generalized linear model

Regression model

Estimating equation

Bernoulli regression

Logistic regression

Misclassification model

[Figure labels: MLE, MLE, Logistic]

Discrimination

Group I = from

Group II from

[Figure: Group I and Group II samples with mislabeled and misclassified points. Labels: Mislabel, Misclassification, 5, 35, 5 data, 35 data.]

Poisson regression

Ψ-likelihood function

ε-contamination model

Canonical link

Neural network

Input

Output

Maximum likelihood

Ψ-maximum likelihood

Classical procedure for PCA

Let the data be given off-line (all observations available at once).

Self-organizing procedure

Classic procedure

Self-organizing procedure
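The difference between the two procedures can be sketched in two dimensions. The update rules on the slides were lost, so everything below is an assumption: the classical procedure is taken to be an eigen-decomposition of the sample covariance, and the self-organizing procedure is taken to be an iteration that recomputes the mean and covariance with weights exp(-β r²/2), where r² is each point's squared residual from the current principal axis. All function names and the data are hypothetical.

```python
import math

def weighted_pca_2d(points, weights):
    """Weighted mean and leading principal direction of 2-D points."""
    sw = sum(weights)
    mx = sum(w * x for w, (x, y) in zip(weights, points)) / sw
    my = sum(w * y for w, (x, y) in zip(weights, points)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, (x, y) in zip(weights, points)) / sw
    syy = sum(w * (y - my) ** 2 for w, (x, y) in zip(weights, points)) / sw
    sxy = sum(w * (x - mx) * (y - my) for w, (x, y) in zip(weights, points)) / sw
    # Major-axis angle of a symmetric 2x2 matrix in closed form.
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return (mx, my), (math.cos(angle), math.sin(angle))

def classical_pca(points):
    """Classical procedure: equal weights, one pass over off-line data."""
    return weighted_pca_2d(points, [1.0] * len(points))

def self_organizing_pca(points, beta=0.5, iters=100):
    """ASSUMED self-organizing rule: reweight by residual, then refit."""
    mean, direction = classical_pca(points)
    for _ in range(iters):
        weights = []
        for x, y in points:
            dx, dy = x - mean[0], y - mean[1]
            along = dx * direction[0] + dy * direction[1]
            resid2 = dx * dx + dy * dy - along * along
            weights.append(math.exp(-beta * resid2 / 2))
        mean, direction = weighted_pca_2d(points, weights)
    return mean, direction

# Hypothetical data: five points near a line and one point off it.
points = [(-2.0, -1.1), (-1.0, -0.4), (0.0, 0.1),
          (1.0, 0.4), (2.0, 1.0), (0.5, 3.0)]
print("classical:      ", classical_pca(points))
print("self-organizing:", self_organizing_pca(points))
```

With β = 0 every weight equals 1 and the iteration returns the classical solution. For β > 0 the weights are determined by the current fit itself, which is one reading of "self-organizing"; an iteration of this kind can converge to different solutions from different starting values, so the classical start used here is a design choice rather than a requirement.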

Independent Component Analysis (Minami & Eguchi, 2000)

Theorem (Semiparametric consistency)

(Pf)

The Ψ-likelihood satisfies semiparametric consistency.

[Figure: usual method (blue dots) and self-organizing method (blue & red dots). Labels: "150 the exponential power", "50". Data source: http://www.ai.mit.edu/people/fisher/ica_data/]

Concluding remark

Bias potential function

Ψ-sufficiency

Ψ-factorizable

Ψ-exponential family

Ψ-EM algorithm

Ψ-Regression analysis

Ψ-Discriminant analysis

Ψ-PCA

Ψ-ICA

?

!