### Axioms and Algorithms for Inferences Involving Probabilistic Independence

Dan Geiger, Azaria Paz, and Judea Pearl,

Information and Computation 91(1), March 1991, 128-141.

Presentation by Guy Moses & Omer Weissbrod

for the course 236372 - Bayesian Networks, Computer Science Faculty, Technion – winter 2009

partially based on the presentation by Ilan Gronau

What’s ahead?

Introduction – some definitions, notations and reminders.

Proof of Completeness – “if it’s true, it can be proved”.

Preparations for the Membership Algorithm – more definitions, and some theoretical groundwork.

The Membership Algorithm – description, proof of correctness, complexity analysis.

Definitions

- U (universe) – a set of random variables with probability distribution P.
- X, Y – finite sets of random variables: X = {x1,…,xn}, Y = {y1,…,ym}.
- P(X,Y) = P(X)·P(Y) – shorthand for the equality Pr{x1=a1,…, xn=an, y1=b1, …, ym=bm} = Pr{x1=a1,…, xn=an} · Pr{y1=b1, …, ym=bm} for every choice of a1, …, an, b1, …, bm.
- (X,Y) – shorthand for P(X,Y) = P(X)·P(Y). This is called an independence statement.

Note that X and Y are disjoint sets of variables (X ∩ Y = ∅).
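To make the shorthand concrete, here is a minimal sketch of checking an independence statement (X,Y) against an explicit joint table. The table layout (a dict from assignment tuples to probabilities) and the helper names `marginal` and `satisfies` are illustrative assumptions, not part of the presentation.

```python
from itertools import product

def marginal(joint, names, subset):
    """Marginalise a joint table {assignment tuple: probability} onto `subset`."""
    idx = [names.index(v) for v in subset]
    out = {}
    for assignment, p in joint.items():
        key = tuple(assignment[i] for i in idx)
        out[key] = out.get(key, 0.0) + p
    return out

def satisfies(joint, names, X, Y, tol=1e-12):
    """True iff P(X,Y) = P(X)*P(Y) for every choice of values for X and Y."""
    X, Y = list(X), list(Y)
    pX = marginal(joint, names, X)
    pY = marginal(joint, names, Y)
    pXY = marginal(joint, names, X + Y)
    return all(abs(pXY.get(a + b, 0.0) - pX[a] * pY[b]) <= tol
               for a in pX for b in pY)

# Toy universe: a and b are independent fair coins, c is a copy of a.
names = ["a", "b", "c"]
joint = {(a, b, a): 0.25 for a, b in product([0, 1], repeat=2)}
print(satisfies(joint, names, ["a"], ["b"]))   # a and b are independent
print(satisfies(joint, names, ["a"], ["c"]))   # c copies a, so they are dependent
```

The check loops over every pair of value assignments, which mirrors the "for every choice of a1, …, bm" clause in the definition.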

Notations

- σ – a specific independence statement of the form (X,Y)
- Σ – a set of independence statements of the form (X,Y): Σ = {σ1, … , σk}
- XY – shorthand notation for the union X ∪ Y
- “P satisfies σ = (X,Y)” means: P(X,Y) = P(X)·P(Y) for that specific P.

Soundness and Completeness

Definitions:

- Σ ⊨ σ iff every distribution that satisfies Σ also satisfies σ.
- Σ ⊢ σ iff σ ∈ cl(Σ), i.e. there exists a derivation chain σ1,…,σn = σ s.t. for each j, either σj ∈ Σ or σj is derived by an axiom from the previous statements.

For a set of axioms A:

Soundness: A is sound iff for every Σ and σ: Σ ⊢ σ ⇒ Σ ⊨ σ.

Completeness: A is complete iff for every Σ and σ: Σ ⊨ σ ⇒ Σ ⊢ σ.

Completeness, alternative definition: A is complete iff for every Σ and every σ ∉ cl(Σ) there exists a distribution Pσ that satisfies cl(Σ) and does not satisfy σ.

Independence Axioms

We saw (in 1st lecture) that axioms 1a-1d are sound (always infer correctly).

Today we’ll show they are complete (can derive every true statement).
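The axiom list itself did not survive in this transcript. The sketch below encodes the four axioms in the form used by the derivations later in the deck (trivial independence, symmetry, decomposition and mixing); the numbering 1a-1d in the docstrings and all function names are assumptions made for illustration.

```python
def stmt(X, Y):
    """An independence statement (X, Y) as a pair of disjoint frozensets."""
    X, Y = frozenset(X), frozenset(Y)
    assert not (X & Y), "X and Y must be disjoint"
    return (X, Y)

def trivial(X):
    """1a, trivial independence: (X, {}) always holds."""
    return stmt(X, ())

def symmetry(s):
    """1b, symmetry: (X, Y) => (Y, X)."""
    X, Y = s
    return (Y, X)

def decomposition(s, keep):
    """1c, decomposition: (X, YZ) => (X, Y), where Y = `keep`."""
    X, YZ = s
    keep = frozenset(keep)
    assert keep <= YZ, "can only keep variables already on the right side"
    return (X, keep)

def mixing(s1, s2):
    """1d, mixing: (X, Y) and (XY, Z) => (X, YZ)."""
    X, Y = s1
    XY, Z = s2
    assert XY == X | Y, "left side of the second premise must be XY"
    return (X, Y | Z)

# (a, b) and (ab, c) give (a, bc); decomposition then recovers (a, c).
s = mixing(stmt("a", "b"), stmt("ab", "c"))
print(s == stmt("a", "bc"))
print(decomposition(s, "c") == stmt("a", "c"))
```

Each function returns the conclusion of one axiom application, so a derivation chain is simply a sequence of such calls.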

Minimal Statement

- Definition: σ=(X,Y) ∉ cl(Σ) is minimal if for every non-empty X′, Y′ s.t. X′ ⊆ X, Y′ ⊆ Y, X′Y′ ≠ XY we have (X′,Y′) ∈ cl(Σ).
- For every σ=(X,Y) ∉ cl(Σ) we can find an appropriate minimal σ′=(X′,Y′) ∉ cl(Σ) through iterative decomposition.
- Observation: P satisfies σ ⇒ P satisfies σ′ (decomposition soundness). Therefore: P doesn’t satisfy σ′ ⇒ P doesn’t satisfy σ.
- Our plan: given an arbitrary σ ∉ cl(Σ), we will find a distribution P that satisfies cl(Σ) but doesn’t satisfy σ′. This will prove completeness (using the alternative completeness definition and the observation above).
- To simplify notation, we will assume WLOG that σ=(X,Y) is already minimal.
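The "iterative decomposition" step can be sketched as follows, assuming a membership oracle for the closure is available. The function name `minimal_statement`, the callable `in_closure`, and the toy oracle are all hypothetical and only serve the illustration.

```python
def minimal_statement(X, Y, in_closure):
    """Shrink a statement (X, Y) outside the closure until every proper
    non-empty sub-statement is inside it. `in_closure(X, Y)` is an assumed
    oracle that must be closed under decomposition."""
    X, Y = set(X), set(Y)
    assert not in_closure(frozenset(X), frozenset(Y))
    shrunk = True
    while shrunk:
        shrunk = False
        for side in (X, Y):
            for v in sorted(side):
                if len(side) == 1:        # both sides must stay non-empty
                    break
                side.remove(v)
                if in_closure(frozenset(X), frozenset(Y)):
                    side.add(v)           # dropping v made it derivable: keep v
                else:
                    shrunk = True         # still underivable: keep the smaller statement
    return frozenset(X), frozenset(Y)

# Toy oracle: a statement is underivable exactly when it separates a from c.
def toy_oracle(X, Y):
    return not (("a" in X and "c" in Y) or ("c" in X and "a" in Y))

print(minimal_statement("ab", "cd", toy_oracle))
```

Removing one variable at a time is enough because any smaller sub-statement is a decomposition of one of the single-removal statements.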

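Completeness Proof

Let σ=(X,Y) ∉ cl(Σ) be a minimal statement where X={x1,…,xn}, Y={y1,…,ym}, and Z={z1,z2,…,zk} stands for the rest of the variables in U.

We will construct Pσ as follows: all variables except x1 are independent fair coins (probability ½ for each of their two values).

x1 is defined as the parity of the other variables in XY: x1 = (x2 + … + xn + y1 + … + ym) mod 2.

Part 1: Pσ does not satisfy σ

We will inspect the following scenario: x1=1, all other variables are 0. The parity constraint forces x1=0 in this case, so the joint probability is 0, while P(x1, … , xn) = 0.5^n and P(y1, … , ym) = 0.5^m:

P(x1, … , xn, y1, … , ym) = 0 ≠ 0.5^n · 0.5^m = P(x1, … , xn)·P(y1, … , ym)

Therefore, Pσ does not satisfy σ, as required.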
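The scenario in Part 1 can be checked numerically. The sketch below assumes x1 is the sum modulo 2 of the remaining variables of X and Y (the definition is missing from the transcript, so this is stated as an assumption); `parity_distribution` and `prob` are illustrative helper names.

```python
from itertools import product

def parity_distribution(X, Y, Z):
    """Joint table over X+Y+Z: every variable except X[0] is a fair coin,
    and X[0] is the sum mod 2 of the remaining variables of X and Y (assumed)."""
    names = list(X) + list(Y) + list(Z)
    free = names[1:]                       # all variables except x1
    n_xy = len(X) + len(Y) - 1             # free variables that feed the parity
    joint = {}
    for values in product([0, 1], repeat=len(free)):
        x1 = sum(values[:n_xy]) % 2
        joint[(x1,) + values] = 0.5 ** len(free)
    return names, joint

def prob(names, joint, assignment):
    """Marginal probability of a partial assignment {variable: value}."""
    idx = {names.index(v): val for v, val in assignment.items()}
    return sum(p for row, p in joint.items()
               if all(row[i] == val for i, val in idx.items()))

X, Y, Z = ["x1", "x2"], ["y1"], ["z1"]
names, joint = parity_distribution(X, Y, Z)
p_xy = prob(names, joint, {"x1": 1, "x2": 0, "y1": 0})   # the inspected scenario
p_x = prob(names, joint, {"x1": 1, "x2": 0})
p_y = prob(names, joint, {"y1": 0})
print(p_xy, p_x * p_y)   # joint probability vs. product of the marginals
```

With n=2 and m=1 the product of the marginals is 0.5^2 · 0.5^1, while the joint probability of the scenario is zero.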

Completeness Proof – cont’d

Part 2: Pσ satisfies cl(Σ)

Let (V,W) ∈ cl(Σ). We will show that P(V,W)=P(V)·P(W). This is done by inspecting different scenarios:

Scenario 1: either V or W contains only elements of Z. We will assume WLOG that W contains only elements of Z.

All variables in Z are independent of every other variable under Pσ, and therefore: P(V,W) = P(V)·P(W).

[Figure: layout of the X, Y and Z variables with the sets V and W marked (Scenario 1).]

Completeness Proof – cont’d

Part 2: Pσ satisfies cl(Σ) – continued

Scenario 2: Both V and W contain elements of X ∪ Y, but V ∪ W doesn’t contain all elements of X ∪ Y.

Without full information about the assignments of the variables in X ∪ Y, x1 could turn out to be 0 or 1 with probability ½ each, so the variables of V ∪ W are mutually independent fair coins, and therefore: P(V,W) = P(V)·P(W).
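Scenarios 1 and 2 together say that a statement (V,W) can only fail under the constructed distribution if V ∪ W covers all of X ∪ Y. The brute-force sketch below checks this on a four-variable example, again assuming x1 is the parity of the other variables in X and Y; the variable names and helpers are illustrative.

```python
from itertools import product

names = ["x1", "x2", "y1", "z1"]          # X = {x1, x2}, Y = {y1}, Z = {z1}
XY = {"x1", "x2", "y1"}
# Assumed construction: x1 = (x2 + y1) mod 2; x2, y1, z1 are independent fair coins.
joint = {((x2 + y1) % 2, x2, y1, z1): 0.125
         for x2, y1, z1 in product([0, 1], repeat=3)}

def marginal(subset):
    idx = [names.index(v) for v in subset]
    out = {}
    for row, p in joint.items():
        key = tuple(row[i] for i in idx)
        out[key] = out.get(key, 0.0) + p
    return out

def independent(V, W):
    pV, pW, pVW = marginal(V), marginal(W), marginal(V + W)
    return all(abs(pVW.get(a + b, 0.0) - pV[a] * pW[b]) < 1e-12
               for a in pV for b in pW)

violated = []
# Label each variable 1 (in V), 2 (in W) or 0 (in neither) to enumerate disjoint pairs.
for labels in product([0, 1, 2], repeat=len(names)):
    V = [v for v, lab in zip(names, labels) if lab == 1]
    W = [v for v, lab in zip(names, labels) if lab == 2]
    if V and W and not independent(V, W):
        violated.append((V, W))

# Every violated statement covers all of X u Y, as Scenarios 1 and 2 predict.
print(all(XY <= set(V) | set(W) for V, W in violated))
```

The enumeration is exponential in the number of variables, so it is only meant as a sanity check on tiny universes.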

[Figure: layout of the X, Y and Z variables with the sets V and W marked (Scenario 2).]

Completeness Proof – cont’d

Part 2: Pσ satisfies cl(Σ) – continued

Scenario 3: Both V and W contain elements of X ∪ Y, and (X ∪ Y) ⊆ (V ∪ W).

We will show a derivation chain for σ=(X,Y), contradicting our original assumption that σ ∉ cl(Σ):

Mark: (V,W) = (X_V Y_V Z_V, X_W Y_W Z_W) ∈ cl(Σ)

where: Y = Y_V ∪ Y_W, X = X_V ∪ X_W, Z_V ∪ Z_W ⊆ Z, V = X_V Y_V Z_V, W = X_W Y_W Z_W

Remove all z’s by decomposition: (X_V Y_V, X_W Y_W) ∈ cl(Σ)

Due to minimality of σ=(X,Y): (X_V,Y_V) ∈ cl(Σ) and (X_W,Y) ∈ cl(Σ)

By mixing: (X_V,Y_V) ∧ (X_V Y_V, X_W Y_W) ⇒ (X_V, Y_V X_W Y_W) = (X_V, X_W Y)

By mixing (with symmetry): (X_W,Y) ∧ (X_W Y, X_V) ⇒ (Y, X_V X_W) = (Y,X) = σ (up to symmetry)
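The Scenario 3 derivation can be replayed mechanically on a small concrete instance. The helpers `sym`, `dec` and `mix` below are illustrative encodings of symmetry, decomposition and mixing, and the single-variable sets chosen for the pieces of V and W are assumptions made only for this example.

```python
def sym(s):
    """Symmetry: (X, Y) => (Y, X)."""
    return (s[1], s[0])

def dec(s, keep_left, keep_right):
    """Decomposition on both sides (using symmetry): keep only the given subsets."""
    left, right = frozenset(keep_left), frozenset(keep_right)
    assert left <= s[0] and right <= s[1]
    return (left, right)

def mix(s1, s2):
    """Mixing: (X, Y) and (XY, Z) => (X, YZ)."""
    assert s2[0] == s1[0] | s1[1]
    return (s1[0], s1[1] | s2[1])

f = frozenset
XV, XW, YV, YW = f({"x1"}), f({"x2"}), f({"y1"}), f({"y2"})
ZV, ZW = f({"z1"}), f({"z2"})
X, Y = XV | XW, YV | YW

given = (XV | YV | ZV, XW | YW | ZW)     # (V, W), assumed to be in cl(Sigma)
no_z = dec(given, XV | YV, XW | YW)      # remove all z's by decomposition
a = (XV, YV)                             # in cl(Sigma) by minimality of sigma
b = (XW, Y)                              # in cl(Sigma) by minimality of sigma
step1 = mix(a, no_z)                     # (XV, YV XW YW) = (XV, XW Y)
step2 = mix(sym(b), sym(step1))          # (Y, XW) and (Y XW, XV) => (Y, XW XV)
print(sym(step2) == (X, Y))
```

The assertions inside `mix` and `dec` fail if a premise does not have the shape the axiom requires, which makes the replay a small proof check.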

[Figure: layout of the X, Y and Z variables with the sets V and W marked (Scenario 3).]

Completeness Proof – Summary

Reminder: Completeness, alternative definition: A is complete iff for every Σ and every σ ∉ cl(Σ) there exists a distribution Pσ that satisfies cl(Σ) and does not satisfy σ.

We’ve shown: given a minimal σ ∉ cl(Σ), there exists a distribution Pσ that obeys:

- Pσ does not satisfy σ.
- Pσ satisfies cl(Σ).

Given a non-minimal σ ∉ cl(Σ), we will derive its minimal statement σ′, and devise a distribution Pσ′ that satisfies cl(Σ) but does not satisfy σ′. Due to soundness of decomposition, Pσ′ cannot satisfy σ either.

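Scope of Completeness

[Figure: classes of distributions: discrete p.d.’s, normal p.d.’s, binary p.d.’s.]

The proof uses Pσ, a binary p.d. (probability distribution function); therefore:

- The axioms 1a-1d are complete for binary p.d.’s, and for any class of distributions that contains them, such as discrete p.d.’s.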

However, for normal p.d.’s, the axiom set 1a-1d is not complete. A stronger axiom is required:

replace mixing: (X,Y) ∧ (XY,Z) ⇒ (X,YZ)

with: (X,Y) ∧ (X,Z) ⇒ (X,YZ)

Some more Definitions and Tools

Definition: Span

span(σ): the set of elements represented in statement σ.

Example: span((x1x2, x3x4)) = {x1,x2,x3,x4}

span(Σ): the set of elements represented in all statements of Σ.

Example: span({(x1,x2),(x1,x3)}) = {x1,x2,x3}

Definition: Projection

The projection of σ on X, denoted σ(X), is the statement derived from σ by removing all elements not in X from σ.

Example: if σ=(x1x2x3, x4x5) and X={x2,x3,x4} then σ(X)=(x2x3, x4).

The projection of Σ on X, denoted Σ(X), is {σ(X) | σ ∈ Σ}.
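Span and projection translate directly into set operations. The following sketch represents a statement as a pair of frozensets and reproduces the examples from the slides; the function names are chosen here for illustration.

```python
def span(statement):
    """Variables mentioned in a single statement (X, Y)."""
    X, Y = statement
    return X | Y

def span_of_set(statements):
    """Variables mentioned in any statement of a collection."""
    return frozenset().union(*(span(s) for s in statements))

def project(statement, keep):
    """Remove every variable not in `keep` from both sides of the statement."""
    X, Y = statement
    keep = frozenset(keep)
    return (X & keep, Y & keep)

def project_set(statements, keep):
    """Project every statement of a collection."""
    return {project(s, keep) for s in statements}

f = frozenset
print(sorted(span((f({"x1", "x2"}), f({"x3", "x4"})))))
print(sorted(span_of_set([(f({"x1"}), f({"x2"})), (f({"x1"}), f({"x3"}))])))
sigma = (f({"x1", "x2", "x3"}), f({"x4", "x5"}))
print(project(sigma, {"x2", "x3", "x4"}) == (f({"x2", "x3"}), f({"x4"})))
```

Note that projecting a set of statements can merge several statements into one and can produce trivial statements with an empty side.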

Projection Lemma: Σ ⊢ σ iff Σ′ ⊢ σ, where Σ′ = Σ(span(σ))

(⇐) If Σ′ ⊢ σ then clearly Σ ⊢ σ, because all the statements in Σ′ can be derived from the statements in Σ by decomposition.

Projection Lemma – cont’d (Σ′ = Σ(span(σ)), s = span(σ))

(⇒) If Σ ⊢ σ then there is a derivation chain for σ: σ1, σ2, … , σk = σ.

For each j:

if σi ⊢ σj, i<j (by symmetry or decomposition),

then σi(s) ⊢ σj(s) by symmetry or decomposition respectively.

Similarly,

if σj is derived from σi and σl by mixing,

then σj(s) is derived from σi(s), σl(s) by mixing.

Statements of Σ project to statements of Σ′ and trivial statements stay trivial, so σ1(s), … , σk(s) is a derivation chain from Σ′ whose last statement is σ(s) = σ.

Observations from the projection lemma:

- Variables not in σ are unnecessary for determining whether Σ ⊢ σ.
- The problem of verifying whether Σ ⊢ σ can be simplified to the problem of verifying whether Σ′ ⊢ σ, where Σ′ = Σ(span(σ)).
- This problem can be solved with a possibly reduced time and space complexity.

Conditions for Inference of Independence

Main claim: for a given σ, we have Σ′ ⊢ σ iff:

1. σ is trivial: σ=(X,∅) (up to symmetry)
OR

2. σ is in Σ′: σ ∈ Σ′ (up to symmetry)
OR

3. σ is derivable from Σ′: there exists σ′ ∈ Σ′ s.t. span(σ) = span(σ′), and for σ′=(AP,BQ), σ=(AQ,BP) (A, B, Q, P may be empty): Σ′ ⊢ (A,P), Σ′ ⊢ (B,Q) (up to symmetry)

Proof of Main Claim

(⇐) If 1. σ is trivial*, or 2. σ is in* Σ′, then the proof is immediate (* up to symmetry).

Otherwise, 3. there exists σ′ ∈ Σ′ s.t. span(σ) = span(σ′), and for σ′=(AP,BQ), σ=(AQ,BP): Σ′ ⊢ (A,P), Σ′ ⊢ (B,Q).

We will show a constructive proof under these conditions, i.e. given that Σ′ ⊢ (AP,BQ), Σ′ ⊢ (A,P), Σ′ ⊢ (B,Q) (symmetry is applied freely between steps):

- (A,P) ∧ (AP,BQ) ⇒ (A,PBQ) by mixing
- (B,Q) ∧ (AP,BQ) ⇒ (APB,Q) by mixing, ⇒ (PB,Q) by decomposition
- (PB,Q) ∧ (A,PBQ) ⇒ (AQ,PB) = (AQ,BP) = σ by mixing
- We’ve proven this direction.
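The three-step derivation of σ=(AQ,BP) from (AP,BQ), (A,P) and (B,Q) can be replayed on concrete sets. The helpers and the one-element sets used for A, P, B, Q are assumptions for the sake of the example; symmetry is applied explicitly wherever the slides use it silently.

```python
def sym(s):
    """Symmetry: (X, Y) => (Y, X)."""
    return (s[1], s[0])

def dec(s, keep_left, keep_right):
    """Decomposition on both sides (using symmetry): keep only the given subsets."""
    left, right = frozenset(keep_left), frozenset(keep_right)
    assert left <= s[0] and right <= s[1]
    return (left, right)

def mix(s1, s2):
    """Mixing: (X, Y) and (XY, Z) => (X, YZ)."""
    assert s2[0] == s1[0] | s1[1]
    return (s1[0], s1[1] | s2[1])

f = frozenset
A, P, B, Q = f("a"), f("p"), f("b"), f("q")
sigma_prime = (A | P, B | Q)                 # the statement taken from Sigma'
sigma = (A | Q, B | P)                       # the statement to derive

step1 = mix((A, P), sigma_prime)             # (A, PBQ)
step2 = sym(mix((Q, B), sym(sigma_prime)))   # (Q, B) and (QB, AP) => (Q, BAP), i.e. (APB, Q)
step2 = dec(step2, P | B, Q)                 # (PB, Q)
step3 = mix(step2, sym(step1))               # (PB, Q) and (PBQ, A) => (PB, QA)
print(sym(step3) == sigma)
```

The chain uses a constant number of axiom applications per level, which is what later allows a derivation of length O(k) to be read off the recursion.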

Proof of Main Claim – cont’d

(⇒) Given Σ′ ⊢ σ, if 1. σ is trivial* or 2. σ is in* Σ′, then the proof is immediate.

Otherwise, since no axiom can add new variables to a statement, there must exist σ′ ∈ Σ′ s.t. span(σ) = span(σ′) in the derivation chain of σ.

Also, by decomposition:

σ = (AQ,BP) ⇒ (A,P)

σ = (AQ,BP) ⇒ (Q,B)

so Σ′ ⊢ (A,P) and Σ′ ⊢ (B,Q).

Conclusions from Claim

- We’ve seen that, after discarding unneeded variables, it is possible to tell whether Σ′ ⊢ σ (when it’s not immediately obvious) by:
- Finding another statement σ′ ∈ Σ′ for which span(σ) = span(σ′),
- Verifying that Σ′ ⊢ (A,P), Σ′ ⊢ (B,Q) when σ′=(AP,BQ), σ=(AQ,BP).

- This suggests using a recursive “divide and conquer” approach.

The Membership Algorithm

Procedure Find(Σ,σ):

1. Set Σ′ := Σ(span(σ)).
2. If σ is trivial, or σ ∈ Σ′ (up to symmetry), then Find(Σ,σ) := TRUE.
3. Else if for all non-trivial σ′ ∈ Σ′: span(σ) ≠ span(σ′), then Find(Σ,σ) := FALSE.
4. Else there exists σ′ ∈ Σ′: span(σ) = span(σ′), with σ′=(AP,BQ), σ=(AQ,BP). Set σ1 := (A,P), σ2 := (B,Q), and Find(Σ,σ) := Find(Σ′,σ1) ∧ Find(Σ′,σ2).
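A minimal Python sketch of the procedure follows, with statements represented as pairs of frozensets. It takes the first matching statement in step 4 (an arbitrary choice) and is meant as an illustration of the four steps under these representation assumptions, not as the authors' implementation; all helper names are hypothetical.

```python
def span(s):
    return s[0] | s[1]

def project(s, keep):
    return (s[0] & keep, s[1] & keep)

def is_trivial(s):
    return not s[0] or not s[1]

def same(s, t):
    """Equality of statements up to symmetry."""
    return s == t or s == (t[1], t[0])

def find(sigma_set, sigma):
    """Return True iff sigma is derivable from sigma_set."""
    # Step 1: project the statements onto span(sigma).
    projected = {project(s, span(sigma)) for s in sigma_set}
    # Step 2: trivial statements and members of the projected set are derivable.
    if is_trivial(sigma) or any(same(sigma, s) for s in projected):
        return True
    # Steps 3 and 4: look for a non-trivial statement with the same span.
    for s in projected:
        if is_trivial(s) or span(s) != span(sigma):
            continue
        (left, right), (X, Y) = s, sigma
        A, Q = X & left, X & right      # s = (AP, BQ), sigma = (AQ, BP)
        P, B = Y & left, Y & right
        return find(projected, (A, P)) and find(projected, (B, Q))
    return False                        # step 3: no statement has the same span

f = frozenset
Sigma = {(f("a"), f("b")), (f("ab"), f("c"))}
print(find(Sigma, (f("a"), f("bc"))))                              # mixing gives (a, bc)
print(find({(f("a"), f("b")), (f("b"), f("c"))}, (f("a"), f("c"))))  # not derivable
```

Both sub-statements in step 4 are strictly smaller than σ, because the chosen statement is non-trivial and differs from σ, so the recursion terminates.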

Algorithm Correctness Proof

We will prove that Find(Σ,σ) = TRUE ⇔ σ ∈ cl(Σ) by induction on k=|σ|, the number of variables in σ.

Induction base: if k=1 then σ is trivial, therefore the algorithm will return TRUE in step 2 and σ ∈ cl(Σ).

Induction assumption: Find(Σ,σ′) = TRUE ⇔ σ′ ∈ cl(Σ) for each σ′ with |σ′|<k.

Induction step: Find(Σ,σ) = TRUE iff either:

1. Step 2 returns TRUE ⇔ σ is trivial or σ ∈ Σ′ ⇒ σ ∈ cl(Σ).

2. Step 4 returns TRUE

iff Find(Σ′,σ1) = TRUE ∧ Find(Σ′,σ2) = TRUE (according to algorithm’s definition)

iff σ1 ∈ cl(Σ′) ∧ σ2 ∈ cl(Σ′) (according to induction assumption)

iff σ ∈ cl(Σ′) (according to main claim)

iff σ ∈ cl(Σ) (according to projection lemma)

Complexity Analysis

Definitions:

n = the number of distinct variables in Σ.

k = the number of distinct variables in σ.

- First projection cost: O(|Σ|·n) – happens only once.
- Recursive step: T(k) ≤ |Σ|·k + T(k1) + T(k2), where k1+k2=k, k1=|σ1|, k2=|σ2|.
- Can be shown by induction: T(k) ≤ |Σ|·k·(depth of recursion)
- Worst case analysis: T(k) ≤ |Σ|·k·k = |Σ|·k²
- Total run time is bounded by O(|Σ|·n + |Σ|·k²), which is also O(|Σ|·n²) since k ≤ n.
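The worst-case bound can be sanity-checked by evaluating the recurrence directly. The sketch below assumes a base cost T(1) = |Σ| and unit constants, both of which are illustrative choices rather than figures from the presentation, and maximises over every split k1 + k2 = k.

```python
from functools import lru_cache

SIGMA_SIZE = 5   # |Sigma|, an arbitrary illustrative value

@lru_cache(maxsize=None)
def worst_case(k):
    """Largest T(k) = |Sigma|*k + T(k1) + T(k2) over all splits k1 + k2 = k,
    with the assumed base case T(1) = |Sigma|."""
    if k == 1:
        return SIGMA_SIZE
    return SIGMA_SIZE * k + max(worst_case(k1) + worst_case(k - k1)
                                for k1 in range(1, k))

for k in (1, 2, 4, 8, 16):
    print(k, worst_case(k), SIGMA_SIZE * k * k)   # recurrence value vs. |Sigma|*k^2
```

The maximum is reached by the most unbalanced splits, which is why choosing a balanced split, when one exists, can lower the running time.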

Improvements and Variations

- Instead of arbitrarily choosing σ′, find one whose sub-statements {A,B,P,Q} have balanced size (can improve run-time complexity).
- Using the derivation chain presented in the constructive proof, the algorithm can also return a derivation chain for σ with a length of O(k).

Variations (contd.)

The algorithm can be expanded into a polynomial algorithm for the following problems:

- Given two sets Σ1 and Σ2, is cl(Σ1) ⊆ cl(Σ2)? Is cl(Σ1) = cl(Σ2)?
- Minimize the size of Σ while preserving cl(Σ): start with a maximal-size statement σ and remove from Σ all statements derivable from it. Repeat with the next largest statement, etc.
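The closure comparison reduces to membership queries, since cl(Σ1) ⊆ cl(Σ2) holds exactly when every statement of Σ1 is derivable from Σ2. The sketch below pairs that observation with a compact membership test written for this example; the function names and the frozenset representation are assumptions.

```python
def find(sigma_set, sigma):
    """Compact membership test: True iff sigma is derivable from sigma_set."""
    X, Y = sigma
    s_span = X | Y
    projected = {(L & s_span, R & s_span) for L, R in sigma_set}
    if not X or not Y or (X, Y) in projected or (Y, X) in projected:
        return True
    for L, R in projected:
        if L and R and (L | R) == s_span:
            # (L, R) = (AP, BQ) and sigma = (AQ, BP): recurse on (A, P) and (B, Q).
            return (find(projected, (X & L, Y & L))
                    and find(projected, (Y & R, X & R)))
    return False

def closure_subset(sigma1, sigma2):
    """cl(sigma1) is contained in cl(sigma2) iff sigma2 derives every statement of sigma1."""
    return all(find(sigma2, s) for s in sigma1)

def closure_equal(sigma1, sigma2):
    return closure_subset(sigma1, sigma2) and closure_subset(sigma2, sigma1)

f = frozenset
S1 = {(f("a"), f("bc"))}
S2 = {(f("a"), f("b")), (f("ab"), f("c"))}
print(closure_subset(S1, S2))   # (a, bc) follows from S2 by mixing
print(closure_subset(S2, S1))   # (ab, c) does not follow from (a, bc)
```

Each comparison costs one membership query per statement, so it stays polynomial whenever the membership test is.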
