Pattern evaluation and process control




Pattern Evaluation and Process Control

Wei-Min Shen

Information Sciences Institute

University of Southern California

UCLA Data Mining Short Course



Outline

  • Intuition of Interestingness

  • Principles for Measuring Interestingness

  • Existing Measurement Systems

  • Minimal Description Length Principle

  • Methods for Process Control


Why Is a Pattern “Interesting”?

  • I did not know X before

  • It contradicts my thinking (surprise)

  • It is supported by the majority of the data

  • It is an exception to the usual cases

  • Occam’s Razor: Simple is better

  • More?


The Types of Classification Rule

  • Let h be a hypothesis and e the evidence; then, with respect to any given tuple, we have

    • Characteristic rule: h → e

    • Discriminate rule: e → h

  • e and h can be interpreted as sets of tuples satisfying e and h respectively


A Few Definitions

  • Given a discriminate rule R: e → h

    • |e| is the cover of the rule

    • |h∧e|/|e| is the confidence, reliability, or certainty factor of the rule

  • R is “X% complete” if |h∧e|/|h| = X% (e satisfies X% of h)

  • R is “Y% discriminate” if |¬h∧e|/|¬h| = (100−Y)% (e satisfies (100−Y)% of ¬h)
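These definitions can be computed directly when e and h are treated as the sets of tuples satisfying them. The sketch below is illustrative, not from the slides; the sets are made up:

```python
def rule_measures(e, h, universe):
    # e, h: sets of tuples satisfying the evidence and the hypothesis
    not_h = universe - h
    cover = len(e)                               # |e|
    confidence = len(h & e) / len(e)             # |h ∧ e| / |e|
    completeness = len(h & e) / len(h)           # X%: |h ∧ e| / |h|
    # Y% discriminate: |¬h ∧ e| / |¬h| = (100 − Y)%
    discriminate = 1 - len(not_h & e) / len(not_h)
    return cover, confidence, completeness, discriminate

universe = set(range(10))
h = {0, 1, 2, 3, 4}              # tuples satisfying h
e = {2, 3, 4, 5}                 # tuples satisfying e
print(rule_measures(e, h, universe))  # (4, 0.75, 0.6, 0.8)
```

Here the rule e → h covers 4 tuples, is 75% reliable, 60% complete, and 80% discriminate.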


Principles for Measuring “I”

  • 1. I = 0 if h and e are statistically independent

    • e and h have no relation at all

  • 2. I increases monotonically with |h∧e| when |h|, |¬h|, and |e| remain the same

    • I relates to reliability


Principles for Measuring “I”

  • 3. I monotonically with |h|(or|e|) when |he|, |e| (or |h|), and |¬h| remain the same

    • I relates to completeness

  • 4. I monotonically with |e| when reliability |he|/|e|, |h|, and |¬h| remain the same

    • I relates to cover when reliability is the same


Treat Discriminate and Characteristic Rules Differently

  • Principles 1,2,3,4 apply to both discriminate and characteristic rules

  • 5. Treat discriminate and characteristic rules differently

    • Rule  E       H    Discrim  Complete
    • A     Fever   Flu  80%      30%
    • B     Sneeze  Flu  30%      80%

  • As a discriminate rule, I(A) > I(B)

  • As a characteristic rule, I(B) > I(A)


Existing Measurement Systems

  • RI (Piatetsky-Shapiro 91)

  • J (Smyth and Goodman 92)

  • CE (Hong and Mao 91)

  • IC++ (Kamber and Shinghal 96)


IC++ Measurement for Characteristic Rules

  • Given h, e, let rule d: e → h and rule c: h → e

  • Nec(d) = P(¬e|h)/P(¬e|¬h)

  • Suf(d) = P(e|h)/P(e|¬h)

  • for h → e: C++ = if 0 ≤ Nec(d) < 1 then (1 − Nec(d))·P(h), else 0

  • for h → ¬e: C+− = if 0 ≤ Suf(d) < 1 then (1 − Suf(d))·P(h), else 0

  • for ¬h → e: C−+ = if 1 < Nec(d) < ∞ then (1 − 1/Nec(d))·P(¬h), else 0

  • for ¬h → ¬e: C−− = if 1 < Suf(d) < ∞ then (1 − 1/Suf(d))·P(¬h), else 0


Minimal Description Length Principle

  • The goodness of a theory or hypothesis (H) relative to a set of data (D) is measured as:

    • The sum of

      • The length of H

      • The length of the explanation of D using H

    • Assuming both use the optimal coding scheme


The Derivation of MDL

  • Based on probability theory, the best hypothesis H with respect to D is:

    • the one that maximizes P(H)P(D|H)

    • or maximizes log P(H) + log P(D|H)

    • or minimizes −log P(H) − log P(D|H)

  • Since the optimal encoding length of an element is −log of its probability, we obtain MDL:

    • the min of |coding1(H)| + |coding2(D|H)|
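A toy numeric sketch of this trade-off (all probabilities are invented): code lengths in bits are −log₂ of the assumed probabilities, and MDL picks the hypothesis with the smaller total.

```python
import math

def total_description_length(p_h, p_d_given_h):
    # |coding1(H)| + |coding2(D|H)| in bits, under optimal coding
    return -math.log2(p_h) - math.log2(p_d_given_h)

# a simple hypothesis that explains the data only loosely ...
simple = total_description_length(p_h=0.25, p_d_given_h=0.01)
# ... versus a complex hypothesis that fits the data tightly
complex_fit = total_description_length(p_h=0.001, p_d_given_h=0.5)
print(simple, complex_fit)  # MDL prefers the smaller sum
```

With these numbers the simple hypothesis wins: its looser fit costs fewer bits than the complex hypothesis's longer description.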


An Illustration of MDL

[Figure: the same set of points fitted by a one-line theory and a two-line theory]

One-line theory: explanation length = 294.9

Two-line theory: explanation length = 298.7


Fit Points with Lines

  • Theory = lines (#,angle,length,center)

  • Explanation: for each point:

    • the line it belongs to

    • the position on the line

    • the distance to line

  • Notice that the current coding is (x,y)

  • It is different if we choose the coding (r, θ)


    Process Control

    • The Goal: to predict future from past

    • The Given: the past data sequence

    • The methods:

      • Adaptive Control Theory

      • Chaotic theory

      • State Machines


    Chaotic Theory

    • The data sequence may appear chaotic

    • The underlying model may be very simple

    • Extremely sensitive to initial conditions

    • Difficult to make long term prediction

    • Short term prediction is possible


    An Example Chaotic Sequence

    [Figure: an example chaotic sequence s(k), ranging over 0.0–1.0, plotted for time steps k = 0 to 100]

    The simple logistic map model:

    s(k+1) = a·s(k)·(1 − s(k)), where a = 4
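The map above is easy to simulate. The sketch below (starting values are made up) shows how a 10⁻⁶ perturbation of the initial condition gets amplified, which is why only short-term prediction is possible:

```python
def logistic_sequence(s0, a=4.0, n=50):
    # iterate s(k+1) = a * s(k) * (1 - s(k))
    seq = [s0]
    for _ in range(n):
        seq.append(a * seq[-1] * (1.0 - seq[-1]))
    return seq

a_run = logistic_sequence(0.400000)
b_run = logistic_sequence(0.400001)  # initial condition shifted by 1e-6
gap = max(abs(x - y) for x, y in zip(a_run, b_run))
print(gap)  # the tiny initial gap has grown by orders of magnitude
```

Even though both trajectories follow the same simple deterministic rule, they become useless as predictors of each other after a few dozen steps.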


    Steps of Using Chaotic Theory

    • Reconstruction of state space:

      • x_k = [x_k, x_{k−τ}, …, x_{k−(m−1)τ}]^T

      • where τ is a time delay and m is the embedding dimension

    • Takens’ theorem: one can always find an embedding dimension m ≥ 2[d]+1, where [d] is the integer part of the attractor’s dimension, that preserves the invariant measures

    • Central task: choose m and τ
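The reconstruction step can be sketched directly (the series below is a stand-in for real observed data): build the delay vectors x_k from a scalar series given m and τ.

```python
def delay_embed(series, m, tau):
    # x_k = [x_k, x_{k-tau}, ..., x_{k-(m-1)tau}]
    # for every k with enough history
    start = (m - 1) * tau
    return [[series[k - j * tau] for j in range(m)]
            for k in range(start, len(series))]

series = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]  # stand-in for an observed sequence
vectors = delay_embed(series, m=3, tau=2)
print(vectors[0])  # [4, 2, 0]
```

Each reconstructed state vector packs m delayed samples of the scalar observable, which is what lets the hidden state be recovered.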


    State Machine Approach

    • Identify the number of states by clustering all points in the sequence

    • Construct a transition function by learning from the sequence


    Construction & Synchronization

    • Environment = (A, P, Q, r) where |P|<|Q|

    • Model = (A, P, S, t)

      • Visibly equivalent

      • Perfect

      • Synchronized

    • The Construction problem

      • when and how to construct new model states

    • The Synchronization problem

      • how to determine which model state is current


    Learning with a Reset Button

    • Two environmental states p and q (which may appear the same to the learner) are different if and only if there exists a sequence e of actions that leads from p and from q to states that are visibly different

    • The interaction with the environment

      • Membership Query

      • Equivalence Query: “yes” or a counter example


    Observation Table

    • Model states: {row(s) : s in S}

    • Initial state: row(λ)

    • Final states: {row(s) : s in S and T(s) = 1}

    • Transitions: δ(row(s), a) = row(s·a)

    • Closed table: for all s in S and a in A, there exists s' in S with row(s·a) = row(s')

    • Consistent table: row(s) = row(s') implies row(s·a) = row(s'·a) for every a in A

    [Diagram: the observation table, with rows indexed by S and S×A (action sequences from the initial state), columns indexed by E (experiments), and entries given by the observations T]
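The closed/consistent checks can be sketched in a few lines (the table representation is invented for illustration): T maps access strings to observations, S holds the row indices, A the actions, and E the experiments.

```python
def row(T, s, E):
    # the row of access string s: its observation under each experiment
    return tuple(T[s + e] for e in E)

def is_closed(T, S, A, E):
    # every extended row row(s·a) must equal some row(s') with s' in S
    s_rows = {row(T, s, E) for s in S}
    return all(row(T, s + a, E) in s_rows for s in S for a in A)

def is_consistent(T, S, A, E):
    # equal rows must remain equal after any one-action extension
    return all(row(T, s1 + a, E) == row(T, s2 + a, E)
               for s1 in S for s2 in S
               if row(T, s1, E) == row(T, s2, E)
               for a in A)

# toy table over one action 'a' and the empty experiment ''
T = {'': 0, 'a': 1, 'aa': 1}
print(is_closed(T, S=[''], A=['a'], E=['']))       # False: row('a') is new
print(is_closed(T, S=['', 'a'], A=['a'], E=['']))  # True once 'a' is in S
```

When the table is not closed, the fix is exactly what L* does: promote the offending access string into S.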


    L* Algorithm

    • Initialize T for λ and each action in A

    • Loop: use membership queries to make T complete, closed, and consistent; if EQ(T) = w (a counterexample), then add w and all its prefixes into S; until EQ(T) = yes


    The Little Prince Example

    • A counterexample ftf for M3 (Fig 5.3): the model ends at rose, but the real observation is volcano

    • An inconsistency in T4 (Tab 5.5), where row(f) = row(ft) but row(f·f) ≠ row(ft·f)


    Homing Sequence

    • L* is limited by its need for a reset button

    • Homing sequence h: if the observation sequences from two executions of h are the same, then the two executions end in the same state

    • Let q<h> be the observation sequence and q·h the ending state; then h is defined by:

    • for all p, q: [p<h> = q<h>] implies [p·h = q·h]

    • e.g., {fwd} is a homing seq for the Little Prince
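The definition can be checked directly on a small automaton (the toy two-state machine below is invented, not the Little Prince example): run h from every state and verify that equal observation sequences imply equal ending states.

```python
def run(delta, obs, state, h):
    # execute action sequence h from `state`;
    # return (observation sequence, ending state)
    out = []
    for a in h:
        state = delta[(state, a)]
        out.append(obs[state])
    return tuple(out), state

def is_homing(delta, obs, states, h):
    # p<h> = q<h> must imply p·h = q·h for every pair of states
    end_for_obs = {}
    for p in states:
        o, end = run(delta, obs, p, h)
        if end_for_obs.setdefault(o, end) != end:
            return False  # same observations, different ending states
    return True

# two states that swap under action 'f'
delta = {(0, 'f'): 1, (1, 'f'): 0}
print(is_homing(delta, obs={0: 'A', 1: 'B'}, states=[0, 1], h=['f']))  # True
print(is_homing(delta, obs={0: 'A', 1: 'A'}, states=[0, 1], h=['f']))  # False
```

In the second call both runs of h observe the same thing yet end in different states, so ['f'] fails the homing property there.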


    Properties of Homing Seq

    • Every FDA has a homing sequence

    • It can be constructed from an FDA by appending action sequences (each of length < n) that distinguish a pair of states

    • The length of this construction is at most n²

    • There are FDAs whose shortest h is of length about n²

    • h can be used as a reset

    • h cannot guarantee reaching a fixed state


    L* with a Homing Sequence h

    • Every time a reset is needed, repeat h until you see the desired observation sequence

    • Or for each possible observation sequence of h, make a copy of L* (see Fig 5.6)


    Learning the Homing Sequence

    • If h is not a homing sequence, we may discover that the same observation sequence produced by executing h can lead to two different states p and q, i.e., there is a sequence of actions x such that p<x> ≠ q<x>

    • Then a better approximation of the homing sequence is h·x


    L* + Learning h

    • Assume a homing sequence h; initially h = λ

    • When h is shown to be incorrect, extend h, discard all copies of L*, and start again

    • When h is incorrect, there exists x such that q·h<x> ≠ p·h<x>, even if q<h> = p<h>


    Learning h and the Model

    • Rivest and Schapire’s algorithm (Fig 5.7)

    • Little Prince Example (notice the inconsistency produced by ff in Fig 5.10)
