Some Ideas for Detecting Spurious Observations Based on Mixture Models

Jim Lynch

NISS/SAMSI & University of South Carolina


Work with Dave Dickey and Francisco Vera

Very Preliminary Ideas

Primarily motivated by Dave’s American Airlines data and Proschan’s (1963) paper on pooling to explain a decreasing failure rate and, to a lesser extent, by M. J. Bayarri’s talk on multiple testing


Outline

  • 1. Introduction

  • 2. Mixture Models

  • 3. Some Ideas

  • 4. Simulations

  • 5. The American Airlines Data


Introduction: Some Motivation from the AA Data (Largest Log Vol Removed)

  • Some Time Series Diagnostics Suggest That Log Volume Ratio is an MA(1)

  • Fit an MA(1) to the Log Volume Ratio of the AA Data

  • Look At The Residuals
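The fit-then-inspect-residuals step can be sketched as follows. This is only an illustration under stated assumptions: the AA series is not reproduced in the slides, so a simulated MA(1) series stands in for the log volume ratio, and the fit uses a simple conditional-sum-of-squares grid search rather than whatever software was used for the talk. The function names and the 3 sd flagging rule are hypothetical.

```python
import random

def ma1_residuals(x, mu, theta):
    """Residuals e_t = x_t - mu - theta * e_{t-1}, started from e_0 = 0."""
    resid, prev = [], 0.0
    for value in x:
        e = value - mu - theta * prev
        resid.append(e)
        prev = e
    return resid

def fit_ma1_css(x, steps=200):
    """Grid search over theta in (-1, 1) minimising the conditional sum of squares."""
    mu = sum(x) / len(x)
    best_theta, best_css = 0.0, float("inf")
    for k in range(1, steps):
        theta = -1.0 + 2.0 * k / steps
        css = sum(e * e for e in ma1_residuals(x, mu, theta))
        if css < best_css:
            best_theta, best_css = theta, css
    return mu, best_theta

# Hypothetical stand-in for the log volume ratio series: a simulated MA(1).
random.seed(1)
true_theta = 0.6
shocks = [random.gauss(0.0, 1.0) for _ in range(501)]
series = [shocks[t] + true_theta * shocks[t - 1] for t in range(1, 501)]

mu_hat, theta_hat = fit_ma1_css(series)
residuals = ma1_residuals(series, mu_hat, theta_hat)

# Flag residuals more than 3 residual standard deviations from zero (an assumed rule).
scale = (sum(e * e for e in residuals) / len(residuals)) ** 0.5
flagged = [t for t, e in enumerate(residuals) if abs(e) > 3.0 * scale]
print(f"theta_hat = {theta_hat:.2f}; indices with |residual| > 3 sd: {flagged}")
```

A residual-size rule like this only catches extreme values; the rest of the talk is about spurious observations that need not be extreme.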


Introduction

  • Detecting spurious observations is an important area of research and has implications for anomaly detection (AD).

  • The term spurious observation is used to distinguish it from an outlier, since outliers are usually extreme observations in the data while a spurious observation need not be.

    • E.g., one could imagine that sophisticated intruders into computer systems would make sporadic intrusions and try to mimic normal behavior as closely as possible


Introduction

  • Goal

    • To develop approaches to detect very transient spurious events where the objectives are

      • To detect when there are spurious events present and, if possible,

      • To identify them


Introduction

  • The Basic Data Analytic Model

    • $X_1, \dots, X_n$ iid $\sim f_p = (1-p) f_0 + p f_1$

      • $f_0$ is the background model

      • $f_1$ models the spurious behavior

      • The likelihood is then $L(p) = \prod_{i=1}^{n} \left[ (1-p) f_0(X_i) + p f_1(X_i) \right]$


Introduction

  • A “More Realistic” Model

    • Generate a configuration C with probability p(C)

    • Given C, for $i \in C$, the $X_i$ are iid $\sim f_0$ and, for $i \in C^c$, the $X_i$ are iid $\sim f_1$

      • $C$ and $C^c$ model a spatial or temporal (e.g., a change-point) pattern

      • You are “pooling” observations based on the configuration C

      • The likelihood is then $L = \sum_{C} p(C) \prod_{i \in C} f_0(X_i) \prod_{i \in C^c} f_1(X_i)$
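As a concrete instance of the configuration model, the sketch below takes the temporal case where C is everything before a change point k and C^c is everything from k on. The uniform prior over k, the choice f0 = N(0,1) and f1 = N(1,1), and the simulated data are all assumptions made for illustration; the slides do not specify p(C).

```python
import math
import random

def normal_logpdf(x, mean):
    """Log density of N(mean, 1)."""
    return -0.5 * math.log(2.0 * math.pi) - 0.5 * (x - mean) ** 2

def changepoint_posterior(x, mean0=0.0, mean1=1.0):
    """Posterior over k, where C = {0..k-1} is background and C^c = {k..n-1} is spurious.

    Assumes a uniform prior p(C) over the n + 1 possible change points.
    """
    n = len(x)
    log0 = [normal_logpdf(v, mean0) for v in x]
    log1 = [normal_logpdf(v, mean1) for v in x]
    # log of prod_{i in C} f0(x_i) * prod_{i in C^c} f1(x_i) for each k
    log_lik = [sum(log0[:k]) + sum(log1[k:]) for k in range(n + 1)]
    top = max(log_lik)
    weights = [math.exp(v - top) for v in log_lik]  # subtract the max for stability
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical data: 25 background points followed by 5 spurious points.
random.seed(7)
true_k = 25
data = [random.gauss(0.0, 1.0) for _ in range(true_k)] + \
       [random.gauss(1.0, 1.0) for _ in range(5)]

post = changepoint_posterior(data)
k_map = max(range(len(post)), key=post.__getitem__)
print(f"most probable change point: {k_map} (true value {true_k})")
```

With only five shifted points and a one-unit shift, the posterior is typically diffuse, which matches the talk's remark that identification is harder than detection.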


Introduction: Some Approaches for Analyzing the “MR” Model

  • Envision that the data are the effects of pooling observations from f0 and f1.

  • Treat the data as if they come from a mixture model and use that model to determine the mle, $p^*$, of the mixing proportion.

    • Use $p^*$ to test $H_0: p = 0$ versus $H_1: p > 0$. (Under $H_0$ and the mixture model, $n^{1/2} p^*$ converges in distribution to $X$, where $X = 0$ with probability .5 and $X = |N(0, I_0^{-1})|$ with probability .5.)

    • If $H_0$ is rejected, see if the mixture model can give insights into the configuration $C_j$

      • E.g., do an empirical Bayes analysis with prior $p(C_j) = (1-p^*)^j (p^*)^{n-j}$. Then
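A minimal sketch of this approach, assuming f0 = N(0,1) and f1 = N(1,1) are known and using the n=10 simulated sample listed later in the talk. The EM iteration, the function names, and the closed form I0 = e - 1 (the variance of f1/f0 under f0 for this particular pair) are not from the slides. Under a product prior of the form (1-p*)^j (p*)^(n-j), the posterior over configurations factorizes, so each point's posterior probability of being spurious is p* f1(x) / f_{p*}(x); that is what the last line computes.

```python
import math

def normal_pdf(x, mean):
    """Density of N(mean, 1)."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)

def f0(v):
    return normal_pdf(v, 0.0)  # assumed background model

def f1(v):
    return normal_pdf(v, 1.0)  # assumed spurious model

def fit_mixing_proportion(x, iterations=500):
    """EM updates for p in (1 - p) f0 + p f1, with f0 and f1 held fixed."""
    p = 0.5
    for _ in range(iterations):
        resp = [p * f1(v) / ((1.0 - p) * f0(v) + p * f1(v)) for v in x]
        p = sum(resp) / len(x)
    return p

# The n=10 sample from the simulation slide.
x = [-0.34964, -1.77582, -0.92900, 0.58061, -0.36032,
     2.51937, 0.59549, 1.16238, 0.76632, 1.57752]
p_star = fit_mixing_proportion(x)

# Asymptotic test of H0: p = 0 using the half-point-mass, half-|normal| limit.
# For p_star > 0 the tail probability is 1 - Phi(sqrt(n * I0) * p_star).
info0 = math.e - 1.0  # I0 for this specific f0, f1 pair (assumption of the sketch)
z = math.sqrt(len(x) * info0) * p_star
p_value = 0.5 * math.erfc(z / math.sqrt(2.0))

posterior_spurious = [p_star * f1(v) / ((1.0 - p_star) * f0(v) + p_star * f1(v))
                      for v in x]
print(f"p* = {p_star:.3f}, approximate p-value = {p_value:.3f}")
```

With n=10 the asymptotic p-value is crude at best; it is shown only to make the limiting distribution concrete.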


Introduction: Another Approach

  • Since $f_1$ models the spurious behavior, $p \approx 0$

  • $p \approx 0$ suggests using the locally most powerful (LMP) test statistic for testing $H_0: p = 0$ versus $H_1: p > 0$ as the basis for discovering whether spurious observations are present

  • The test statistic is essentially related to the gradient plot introduced by Lindsay (1983) to determine when a finite mixture mle is the global mixture mle in the mixed distribution model
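A small sketch of the LMP statistic, the sum over observations of (f1(x) - f0(x)) / f0(x), evaluated on the n=10 simulated sample from later in the talk. It assumes f0 = N(0,1) and f1 = N(1,1), for which f1/f0 = exp(x - 1/2). The Monte Carlo calibration under H0 is an illustrative choice, not something the slides describe.

```python
import math
import random

def likelihood_ratio(v):
    """f1(v) / f0(v) for f0 = N(0, 1) and f1 = N(1, 1)."""
    return math.exp(v - 0.5)

def lmp_statistic(x):
    """Sum over i of (f1(x_i) - f0(x_i)) / f0(x_i)."""
    return sum(likelihood_ratio(v) - 1.0 for v in x)

# The n=10 sample from the simulation slide.
x = [-0.34964, -1.77582, -0.92900, 0.58061, -0.36032,
     2.51937, 0.59549, 1.16238, 0.76632, 1.57752]
observed = lmp_statistic(x)

# Calibrate by simulating the statistic under H0 (all points from f0).
random.seed(0)
reps = 5000
null_stats = [lmp_statistic([random.gauss(0.0, 1.0) for _ in x])
              for _ in range(reps)]
p_value = sum(s >= observed for s in null_stats) / reps
print(f"LMP statistic = {observed:.3f}, Monte Carlo p-value = {p_value:.3f}")
```

The statistic has mean zero under H0, so large positive values point toward p > 0.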


Introduction: Another Approach

  • The basis of this approach

    • use the gradient plot to determine if the one point mixture mle is the global mixture mle

    • When it isn’t, this suggests that some spurious behavior is present

      • One can then use the components in the mle mixed distribution to calculate “assignment probabilities” to the data to indicate what observations might be considered spurious

      • The examples indicate that detecting the presence of spurious observations seems to be considerably simpler than identifying which ones they are
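The gradient check for a one-point fit can be sketched as follows, assuming the mixing is over N(θ, 1) densities, so that the one-point mle is the sample mean. The gradient function is taken in Lindsay's form, D(θ; Q) = Σ_i (f_θ(x_i) / f_Q(x_i) - 1). The data are the n=10 simulated sample from later in the talk; the grid is an arbitrary choice.

```python
import math

def normal_pdf(x, mean):
    """Density of N(mean, 1)."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)

def mixture_pdf(x, support, weights):
    """Density of the mixture sum_j weights[j] * N(support[j], 1)."""
    return sum(w * normal_pdf(x, m) for m, w in zip(support, weights))

def gradient(theta, data, support, weights):
    """D(theta; Q) = sum_i (f_theta(x_i) / f_Q(x_i) - 1)."""
    return sum(normal_pdf(v, theta) / mixture_pdf(v, support, weights) - 1.0
               for v in data)

# The n=10 sample from the simulation slide.
data = [-0.34964, -1.77582, -0.92900, 0.58061, -0.36032,
        2.51937, 0.59549, 1.16238, 0.76632, 1.57752]

xbar = sum(data) / len(data)  # one-point mle in the unit-variance normal family
lo, hi = min(data), max(data)
grid = [lo + k * (hi - lo) / 200 for k in range(201)]
curve = [gradient(t, data, [xbar], [1.0]) for t in grid]

sup_d = max(curve)
if sup_d > 1e-8:
    print(f"sup D = {sup_d:.3f} > 0: the one-point mle is not the global mixture mle")
else:
    print("sup D is (numerically) 0: the one-point mle is the global mixture mle")
```

Plotting `curve` against `grid` gives the gradient plot; any excursion above zero indicates that adding a support point there would increase the likelihood.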


Introduction: Mining Data Graphs

  • Data (Maguire, Pearson and Wynn, 1952): Time Between Accidents with 10 or more fatalities

  • At the right are the gradient plots for the 2 and 3 point mixture mle’s and the assignment function for the 3 pt mle (mixing over exponentials)

  • The 2 and 3 pt mixture mle’s

    • $\mu$: 592.9, 166.2; $p$: .175, .825

    • $\mu$: 595.5, 171.6, 29.1; $p$: .171, .806, .023
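An assignment function for the 3-point fit can be sketched as the posterior probability of each component given an interval length t. The sketch assumes the three location values are exponential means and the second row gives their mixing weights; the evaluation times are arbitrary, since the accident data themselves are not reproduced in the slides.

```python
import math

def exponential_pdf(t, mean):
    """Exponential density parameterised by its mean."""
    return math.exp(-t / mean) / mean

def assignment(t, means, weights):
    """Posterior probability of each component given an observed interval t."""
    parts = [w * exponential_pdf(t, m) for m, w in zip(means, weights)]
    total = sum(parts)
    return [part / total for part in parts]

# 3-point mixture mle reported on the slide (read as means and weights).
means = [595.5, 171.6, 29.1]
weights = [0.171, 0.806, 0.023]

for t in (1.0, 50.0, 200.0, 1000.0, 2000.0):  # hypothetical evaluation times
    probs = assignment(t, means, weights)
    print(t, [round(q, 3) for q in probs])
```

Long intervals are assigned almost entirely to the component with the largest mean, while short intervals are split mainly between the two smaller-mean components.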


Mixture Models

  • $X_1, \dots, X_n$ iid $\sim f_p = (1-p) f_0 + p f_1$

    • $f_0$ is the background model

    • $f_1$ models the spurious behavior

    • Since the spurious observations are sporadic/transient, $p \approx 0$

  • Denote the log likelihood by $\phi(f(X_1), \dots, f(X_n)) = \phi(\mathbf{f}) = \log \prod_i f(X_i)$

  • Denote the gradient function of $\phi$ by $\Phi(f_1; f_0) = \sum_{i=1}^{n} \frac{f_1(X_i) - f_0(X_i)}{f_0(X_i)}$


Mixture Models: LMP

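  • Lemma. The locally most powerful test for testing $H_0: p = 0$ versus $H_1: p > 0$ is based on $\Phi(f_1; f_0)$.

  • Proof. The LMP test for testing $H_0: p = p_0$ versus $H_1: p > p_0$ is based on the statistic

    $\frac{\partial}{\partial p} \log \prod_i f_p(X_i) \Big|_{p = p_0} = \sum_{i=1}^{n} \frac{f_1(X_i) - f_0(X_i)}{f_{p_0}(X_i)}$

    For $p_0 = 0$ this reduces to

    $\sum_{i=1}^{n} \frac{f_1(X_i) - f_0(X_i)}{f_0(X_i)} = \Phi(f_1; f_0)$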


Mixture Model

  • The Function $\Phi(f_1; f_0)$

    • Plays a prominent role in the analysis of data from mixture models, where it is essentially the gradient function.

    • Introduced by Lindsay (1983a, 1983b, 1995) to determine when the mle for the mixing distribution with a finite number of points is the global mixture mle.


Mixture Model: Framework

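  • Family of densities $\{f_\theta : \theta \in \Theta\}$.

    • $M$ is the set of probability measures on $\Theta$.

    • The mixed distribution over the family with mixing distribution $Q$ is given by $f_Q(x) = \int f_\theta(x) \, dQ(\theta)$

    • For $X_1, \dots, X_n$ iid from $f_Q$, the likelihood and log likelihood are given by

      • $L(Q) = \prod_i f_Q(X_i)$ and $\phi(\mathbf{f}_Q) = \log \prod_i f_Q(X_i)$

      • $\mathbf{f}_Q = (f_Q(X_1), \dots, f_Q(X_n))$.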


Mixture Model: Framework

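  • The Directional Derivative

    • The derivative of $\phi$ at $\mathbf{f}_Q$ in the direction of $\mathbf{f}_{Q'}$ is $\Phi(f_{Q'}; f_Q) = \sum_{i=1}^{n} \frac{f_{Q'}(X_i) - f_Q(X_i)}{f_Q(X_i)}$

    • For a point mass at $\theta$, write $D(\theta; Q) = \Phi(f_\theta; f_Q)$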


Mixture Model: A Diagnostic

  • Theorem 4.1 of Lindsay (1983a)

    • A. The following three conditions are equivalent:

      • $Q^*$ maximizes $L(Q)$

      • $Q^*$ minimizes $\sup_\theta D(\theta; Q)$

      • $\sup_\theta D(\theta; Q^*) = 0$.

    • B. Let $f^* = f_{Q^*}$. The point $(f^*, f^*)$ is a saddle point of $\Phi$, i.e.,

      $\Phi(f_{Q'}; f^*) \le 0 = \Phi(f^*; f^*) \le \Phi(f^*; f_{Q''})$ for $Q', Q'' \in M$.

    • C. The support of $Q^*$ is contained in the set of $\theta$ for which $D(\theta; Q^*) = 0$.


Mixture Model: The Assignment/Membership Function


Simulations: n=10, 5 points N(0,1), 5 points N(1,1)

  Generating mean   Observation
  0                 -0.34964
  0                 -1.77582
  0                 -0.92900
  0                  0.58061
  0                 -0.36032
  1                  2.51937
  1                  0.59549
  1                  1.16238
  1                  0.76632
  1                  1.57752


Simulations: n=10, 5 points N(0,1), 5 points N(1,1)

  $\mu$        $p$
  -.487880     .388813
   .929969     .611187


Simulations: The Assignment Function
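The plot for this slide did not survive in the transcript. As a rough stand-in, the sketch below computes assignment probabilities for the n=10 sample from the fitted two-point mixture reported for that simulation (locations -.487880 and .929969 with weights .388813 and .611187). It assumes the components are N(μ, 1) densities and that the assignment function is the usual posterior component probability; neither detail is confirmed by the slides.

```python
import math

def normal_pdf(x, mean):
    """Density of N(mean, 1)."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)

def assignment(x, means, weights):
    """Posterior probability of each fitted component given the observation x."""
    parts = [w * normal_pdf(x, m) for m, w in zip(means, weights)]
    total = sum(parts)
    return [part / total for part in parts]

# n=10 sample with the generating mean (0 or 1) of each point.
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
data = [-0.34964, -1.77582, -0.92900, 0.58061, -0.36032,
        2.51937, 0.59549, 1.16238, 0.76632, 1.57752]

# Fitted two-point mixture reported for this sample.
means = [-0.487880, 0.929969]
weights = [0.388813, 0.611187]

upper_prob = [assignment(v, means, weights)[1] for v in data]
for label, value, prob in zip(labels, data, upper_prob):
    print(f"generated from N({label},1)  x = {value:8.5f}  P(upper component) = {prob:.3f}")
```

Because the two components share a variance, the probability of the upper component increases with x, so points from N(0,1) that happen to fall high are assigned the same way as points from N(1,1).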


Simulations: n=30, 25 points N(0,1), 5 points N(1,1)

  $\mu$        $p$
  -0.05537     0.867670
   2.05801     0.132330


Simulations: n=30, 25 points N(0,1), 5 points N(1,1)


Simulations: Another n=30, 25 points N(0,1), 5 points N(1,1)

  $\mu$        $p$
  0.78767      0.921009
  3.30559      0.078991


Simulations: Another n=30, 25 points N(0,1), 5 points N(1,1)


AA Data

  • Francisco will discuss this and some other simulations in a moment.


Closing Comments

  • Is there an analogue (or alternative) of these ideas for the SCAN (or for the SCAN framework)?

    • As an alternative, view the problem as having several (two) mechanisms creating observations

      • background

      • infectious material is present.

    • Just consider that the data are a pooling from all these sites. See if the data are a 2-component mixture. If they are, try to “assign” the sites to these components. (You might use a thresholding of the assignment function to do this, or p in the LMP test statistic.)

    • Instead of the assignment function, consider the following based on the LMP test statistic. Define $L_i = (f_1(X_i) - f_0(X_i))/f_0(X_i)$. Let $L_{(1)} < L_{(2)} < \dots < L_{(n)}$ and let $j(i)$ denote the inverse rank, i.e., $L_{(i)} = L_{j(i)}$. For mixture or scanning purposes, consider the sets $C_i = \{j(n), \dots, j(n-i+1)\} = \{k : L_{(n-i+1)} \le L_k\}$. For mixtures with mle $p^*$, assign $C_i$ to $f_1$ and $C_i^c$ to $f_0$, where $np^* \approx i$. For scanning purposes, look through the increasing sequence of sets $C_i$ for a spatial pattern to emerge.
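The ranking construction in the last bullet can be sketched as follows, again with f0 = N(0,1) and f1 = N(1,1) and the n=10 simulated sample. The value p* = 0.5 is a hypothetical placeholder rather than a fitted value, and the function name is invented for the example.

```python
import math

def lmp_terms(x):
    """L_i = (f1(x_i) - f0(x_i)) / f0(x_i) for f0 = N(0, 1), f1 = N(1, 1)."""
    return [math.exp(v - 0.5) - 1.0 for v in x]

def nested_candidate_sets(terms):
    """C_1, C_2, ..., C_n: indices of the i largest L values, for each i."""
    order = sorted(range(len(terms)), key=lambda k: terms[k], reverse=True)
    return [order[:i] for i in range(1, len(terms) + 1)]

# The n=10 sample from the simulation slide (indices 5..9 came from N(1,1)).
x = [-0.34964, -1.77582, -0.92900, 0.58061, -0.36032,
     2.51937, 0.59549, 1.16238, 0.76632, 1.57752]

terms = lmp_terms(x)
sets = nested_candidate_sets(terms)

p_star = 0.5                    # hypothetical mixing proportion, for illustration
size = round(len(x) * p_star)   # choose i with n * p_star approximately equal to i
assigned_to_f1 = set(sets[size - 1])
print("assigned to f1:", sorted(assigned_to_f1))

# For scanning, one would step through sets[0], sets[1], ... and look for a pattern.
```

For this pair of densities the L values are increasing in x, so the sets simply collect the largest observations; with other choices of f0 and f1 the ordering need not follow the size of the observation.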


References

Ferguson, T. S. (1967) Mathematical Statistics: A Decision Theoretical Approach. Academic Press, NY.

Grego, J., Hsi, Hsiu-Li, and Lynch, J. D. (1990). A strategy for analyzing mixed and pooled exponentials. Applied Stochastic Models and Data Analysis, 6, 59-70.

Lindsay, B.G. (1983a). The geometry of mixture likelihoods: a general theory. Ann. Statist., 11, 86-94.

Lindsay, B.G. (1983b). The geometry of mixture likelihoods, Part II: the exponential family. Ann. Statist., 11, 783-792.

Lindsay, B.G. (1995). Mixture Models: Theory, Geometry & Applications. NSF-CBMS lecture series, IMS/ASA.

Maguire, B.A., Pearson, E.S., and Wynn, A.H.A. (1952) The time interval between industrial accidents. Biometrika, 39, 168-180.

Proschan, F. (1963). Theoretical explanation of decreasing failure rate. Technometrics, 5, 375-383.
