
Change Detection in Data Streams by Testing Exchangeability

Shen-Shyang Ho, JPL/Caltech. The research is part of the author’s PhD dissertation (in computer science) at George Mason University. Conference travel is partially sponsored by a NASA Postdoctoral Program (NPP) Travel Grant.


Presentation Transcript


  1. Change Detection in Data Streams by Testing Exchangeability. Shen-Shyang Ho, JPL/Caltech. The research is part of the author’s PhD dissertation (in computer science) at George Mason University. Conference travel is partially sponsored by a NASA Postdoctoral Program (NPP) Travel Grant.

  2. Outline • Introduction • Previous Work (Statistics and Machine Learning/Data Mining/Computer Vision) • Intuition • Background (Exchangeability/Martingale) • Methodology • Comparison and Experimental Results • Application I: Adaptive Support Vector Machine (Classification Model) • Application II: Video Shot Change Detection (Cluster Model)

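  3. Introduction Let $X_1, X_2, \ldots, X_n$ be a sequence of independent p-dimensional random vectors with parameters $\theta_1, \theta_2, \ldots, \theta_n$. Test the following hypothesis: $H_0: \theta_1 = \theta_2 = \cdots = \theta_n$ against $H_1$: there exists $k$, $1 \le k < n$, such that $\theta_1 = \cdots = \theta_k \ne \theta_{k+1} = \cdots = \theta_n$. Assumption: Data vectors are observed sequentially.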

  4. Introduction

  5. Previous Work
     • Statistics: Sequential analysis is statistical inference under the assumption that the number of observations/samples required is not predetermined.
       • Sequential Probability Ratio Test – A. Wald (1945)
       • Application: Quality Control (Military/Manufacturing)
       • CUSUM (Cumulative Sum) – E. S. Page (1954)
       • Refer to the journal “Sequential Analysis: Design Methods and Applications” for recent research.
       • Most recent issue (vol 27, no 2, 2008): 3 out of 6 papers are on structural change, minimax methods for change-point detection problems, and multidecision quickest change-point detection.
     • Machine Learning/Data Mining:
       • Applications: Concept Drift Problem, Adaptive Classifier, Anomaly in Internet Traffic, Video-Shot Change Detection
       • Proposed methodology is usually problem-specific
       • Monitoring error, sliding window, weighted data, ensemble classifier …
       • Statistical methods: Likelihood ratio method, Bayesian methods, Hypothesis Testing …

  6. Related Data Mining/Machine Learning/Computer Vision Research
     • Xiuyao Song, Mingxi Wu, Christopher M. Jermaine, Sanjay Ranka: Statistical change detection for multi-dimensional data. KDD 2007: 667-676
     • Kolter, J.Z. and Maloof, M.A.: Dynamic Weighted Majority: An ensemble method for drifting concepts. Journal of Machine Learning Research 8: 2755-2790, 2007.
     • Klinkenberg, Ralf and Joachims, Thorsten: Detecting Concept Drift with Support Vector Machines. Proceedings of the Seventeenth International Conference on Machine Learning (ICML): 487-494, 2000.
     • Bi Song, Namrata Vaswani, Amit K. Roy Chowdhury: Closed-Loop Tracking and Change Detection in Multi-Activity Sequences. CVPR 2007
     • Paul L. Rosin: Thresholding for Change Detection. ICCV 1998: 274-279
     • Balachander Krishnamurthy, Subhabrata Sen, Yin Zhang, Yan Chen: Sketch-based change detection: methods, evaluation, and applications. Internet Measurement Conference 2003: 234-247
     • Tsuyoshi Idé, Keisuke Inoue: Knowledge Discovery from Heterogeneous Dynamic Systems using Change-Point Correlations. SDM 2005
     • Tsuyoshi Idé, Koji Tsuda: Change-Point Detection using Krylov Subspace Learning. SDM 2007
     • Daniel Kifer, Shai Ben-David, Johannes Gehrke: Detecting Changes in Data Streams. Proc. 30th VLDB Conference, 2004.
     • ...

  7. Motivation “Lack of Exchangeability” implies “Change in Data Distribution/Model” 9/20/2014 7

  8. Intuition • 1 2 3 4 5 6 7 8 9 10 • 1 9 3 5 2 6 7 4 8 10 • 2 3 4 5 6 7 8 9 10 • 1 9 3 5 2 6 7 2 8 10 • Identically distributed but may be dependent

  9. Background • Vovk et al.’s work on “Testing Exchangeability On-Line” (ICML 2003) and “Algorithmic Learning in a Random World” (Springer): • Testing the exchangeability assumption in an online mode. • Explicit martingale for testing the hypothesis of exchangeability (refer to http://www.vovk.net (conformal prediction))

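  10. Background Let $\{Z_i : 1 \le i < \infty\}$ be a sequence of random variables. A finite sequence of random variables $Z_1, \ldots, Z_n$ is exchangeable if the joint distribution $p(Z_1, \ldots, Z_n)$ is invariant under any permutation of the indices of the random variables. A martingale is a sequence of random variables $\{M_i : 0 \le i < \infty\}$ such that $M_n$ is a measurable function of $Z_1, \ldots, Z_n$ for all $n = 0, 1, \ldots$ (in particular, $M_0$ is a constant value) and the conditional expectation of $M_{n+1}$ given $M_0, \ldots, M_n$ is equal to $M_n$, i.e., $E(M_{n+1} \mid M_0, \ldots, M_n) = M_n$.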

  11. Background

  12. Methodology - Strangeness • Strangeness measures how well each data point seen so far is represented by a data model compared with the other points • Applicable to classification, regression, or cluster models • Measures diversity/disagreement, i.e., the higher the strangeness of a point, the less likely it is to come from the model • Condition for a valid strangeness measure: the strangeness value of a data point at a particular time instance should be independent of the order in which it is observed with respect to the other data points.

  13. Classification Model • Strangeness (K-NN) • Strangeness (SVM): Lagrange multiplier • Example stream over time t: A (t = 1 to 1000): aaaaa…aaaaa, B (t = 1001 to 2000): bbbbbb…….bbbbb, C (t = 2001 to 3000): ccccc…cccccc
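The slide names k-NN as a strangeness measure for a classification model without giving a formula. A minimal sketch follows, assuming the ratio form that is common in the conformal prediction literature: the summed distance to the k nearest points of the same class divided by the summed distance to the k nearest points of other classes. The function name `knn_strangeness`, the default `k=3`, and the toy data are illustrative assumptions, not taken from the presentation.

```python
import math

def knn_strangeness(point, label, data, labels, k=3):
    """Assumed k-NN strangeness: same-class distance sum / other-class distance sum.

    `data` holds the other points seen so far and must contain at least one
    point of another class at nonzero distance. Larger values mean stranger.
    """
    same = sorted(math.dist(point, x) for x, y in zip(data, labels) if y == label)
    other = sorted(math.dist(point, x) for x, y in zip(data, labels) if y != label)
    # Only the sorted distances are used, so the arrival order of `data`
    # does not affect the result (the validity condition for strangeness).
    return sum(same[:k]) / sum(other[:k])

# Toy example: two well-separated classes.
data = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]
print(knn_strangeness((0.5, 0.5), "a", data, labels))  # small: fits class a
print(knn_strangeness((0.5, 0.5), "b", data, labels))  # large: strange for class b
```

A point that sits among its own class gets a value below 1, while the same point carrying the wrong label gets a value well above 1.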

  14. Classification Model Strangeness (SVM): Lagrange Multiplier

  15. Cluster Model Strangeness of a data vector in a cluster
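The formula for the cluster-model strangeness is not reproduced on this slide. As a hedged illustration, the sketch below assumes one simple choice: the Euclidean distance from each data vector to the cluster mean. The function name `cluster_strangeness` and the sample points are hypothetical.

```python
import math

def cluster_strangeness(points):
    """Assumed cluster strangeness: distance of each point to the cluster mean."""
    dim = len(points[0])
    center = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    return [math.dist(p, center) for p in points]

points = [(0, 0), (2, 0), (1, 1), (10, 10)]
scores = cluster_strangeness(points)
# The outlier (10, 10) receives the largest strangeness.
print(scores.index(max(scores)))
```

Because the mean does not depend on the order of the points, each point's score is the same for any permutation of the data, which is the validity condition stated for strangeness measures.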

  16. Regression Model where is the regression function and is the error estimation function for at (Papadopoulos et al., Inductive Confidence Machines for Regression, ECML, LNAI 2430, pp 345-356, 2002)

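  17. Methodology The p-value of a new point $z_i$ given the previously seen data points $z_1, \ldots, z_{i-1}$: $p_i = \frac{\#\{j : s_j > s_i\} + \tau_i \, \#\{j : s_j = s_i\}}{i}$, with $j$ ranging over $1, \ldots, i$, • where $s_j$ is the strangeness measure for $z_j$ • and $\tau_i$ is randomly chosen from [0,1] for each new point • $\tau_i$ is necessary so that the sequence of p-values is uniformly distributed in [0,1] for any strangeness measure (Vovk, 2003)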
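The p-value computation can be sketched in a few lines. The version below assumes the randomized form used in conformal prediction: count the strangeness values larger than the new point's, add a random fraction of the ties (the new point ties with itself), and divide by the number of points seen. The names `randomized_pvalue` and `tau` are illustrative.

```python
import random

def randomized_pvalue(strangeness, tau=None):
    """p-value of the newest point, whose strangeness is the last list entry."""
    if tau is None:
        tau = random.random()  # fresh random number for each new point
    s_new = strangeness[-1]
    greater = sum(1 for s in strangeness if s > s_new)
    equal = sum(1 for s in strangeness if s == s_new)  # includes the new point
    return (greater + tau * equal) / len(strangeness)

# The newest point (strangeness 10) is the strangest of four, so its
# p-value is small: (0 + 0.5 * 1) / 4.
print(randomized_pvalue([1, 2, 3, 10], tau=0.5))
```

Passing `tau` explicitly makes the result reproducible; in a running detector it would be left as `None` so that each point gets its own random draw.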

  18. Methodology

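  19. Methodology Consider the null hypothesis $H_0$: “no change in the data stream” against the alternative hypothesis $H_1$: “a change occurs in the data stream”. The test for change continues as long as $0 < M_n < \lambda$, where $M_n$ is the martingale value after $n$ points and $\lambda$ is a positive threshold. One rejects the null hypothesis when $M_n \ge \lambda$.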
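The stopping rule on this slide can be sketched as a loop that updates a martingale with each new p-value and raises an alarm once it reaches a threshold. The martingale itself is not spelled out in this transcript, so the sketch assumes the randomized power martingale of Vovk et al., in which each p-value p multiplies the running value by epsilon * p^(epsilon - 1). The values `epsilon=0.92` and `threshold=20.0`, the reset after each alarm, the 1-D strangeness `distance_to_mean`, and all function names are illustrative assumptions.

```python
import random

def distance_to_mean(values):
    """Assumed 1-D strangeness: absolute distance to the mean of the points seen."""
    mean = sum(values) / len(values)
    return [abs(v - mean) for v in values]

def detect_changes(stream, strangeness_fn, epsilon=0.92, threshold=20.0, seed=0):
    """Return the indices at which the martingale reaches the threshold."""
    rng = random.Random(seed)
    seen, martingale, alarms = [], 1.0, []
    for t, x in enumerate(stream):
        seen.append(x)
        scores = strangeness_fn(seen)
        s_new = scores[-1]
        greater = sum(1 for s in scores if s > s_new)
        equal = sum(1 for s in scores if s == s_new)
        p = (greater + rng.random() * equal) / len(scores)
        p = max(p, 1e-12)  # guard against a zero p-value
        # Power martingale update (assumed form): small p-values inflate it.
        martingale *= epsilon * p ** (epsilon - 1)
        if martingale >= threshold:
            alarms.append(t)             # reject the null hypothesis here
            seen, martingale = [], 1.0   # restart the test after the alarm
    return alarms

# Synthetic stream: the mean jumps from 0 to 5 at index 200.
data_rng = random.Random(1)
stream = [data_rng.gauss(0, 1) for _ in range(200)]
stream += [data_rng.gauss(5, 1) for _ in range(200)]
alarms = detect_changes(stream, distance_to_mean)
print(alarms)
```

Under exchangeability the p-values are uniform and the martingale tends to stay small; after the change, new points are consistently strange, their p-values are small, and the product grows until it crosses the threshold. Recomputing all strangeness values at every step is quadratic in the stream length, which is acceptable for a sketch but not for long streams.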

  20. Methodology

  21. Methodology

  22. Experimental Result – Performance Measure

  23. Experimental Result – Varying

  24. Experimental Result – Varying Strangeness

  25. Experimental Result – Varying • Linearly Non-separable Classification Model • Linearly Separable Classification Model

  26. Experimental Result • Ringnorm/Twonorm (change in dataset every 1000 points) • Nursery Categorical Dataset (change in class compositions every 1000 points)

  27. Experimental Result

  28. Experimental Result – Different Methods

  29. Application: Adaptive SVM

  30. Application: Adaptive SVM • Simulated USPS 3-Digit Image Data Stream over time t: 01120120…0340033404…156556115…77789987…

  31. Application: Adaptive SVM • A (blue): True change point known to the SVM • B (red): Adaptive SVM using the martingale method • C (magenta): SVM using a sliding window of size 250 • D (black): SVM using a sliding window of size 500 • E (green): SVM using a sliding window of size 1000

  32. Application: Video-Shot Change Detection Martingale Change Detection using multiple features (MVMT: Multiple-view martingale test)

  33. Application: Video-Shot Change Detection • Histogram Intersection (HI) • Chi-Square Measure • Euclidean Distance (ED)
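The three measures listed on this slide compare feature histograms of video frames. A minimal sketch of each follows, assuming normalized histograms of equal length. The chi-square measure has several variants in the literature; the symmetric form used here is one assumption among them, and the function names and sample histograms are illustrative.

```python
import math

def histogram_intersection(h1, h2):
    """Sum of bin-wise minima; equals 1.0 for identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def chi_square(h1, h2):
    """Symmetric chi-square measure (assumed variant); empty bins are skipped."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)

def euclidean(h1, h2):
    """Euclidean distance between the two histograms."""
    return math.dist(h1, h2)

h1 = [0.5, 0.5, 0.0]
h2 = [0.25, 0.25, 0.5]
print(histogram_intersection(h1, h2), chi_square(h1, h2), euclidean(h1, h2))
```

Histogram intersection is a similarity (larger means more alike), while the chi-square measure and Euclidean distance are dissimilarities, so their directions differ when they are turned into strangeness values.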

  34. Reference
     • S.-S. Ho and H. Wechsler, Detecting Change-Points in Unlabeled Data Streams using Martingale, Proc. 20th Int. Joint Conf. on Artificial Intelligence (IJCAI 2007), Hyderabad, India, Jan. 6 - 12, 2007.
     • S.-S. Ho, A Martingale Framework for Concept Change Detection in Time-Varying Data Streams, Proc. Int. Conf. on Machine Learning (ICML 2005), Bonn, Germany, Aug. 7 - 11, 2005.
     • S.-S. Ho and H. Wechsler, Adaptive Support Vector Machine for Time-Varying Data Streams Using the Martingale, Proc. Int. Joint Conf. on Artificial Intelligence (IJCAI 2005), Edinburgh, Scotland, July 30 - Aug. 5, 2005.
     • S.-S. Ho and H. Wechsler, On the Detection of Concept Change in Time-Varying Data Streams by Testing Exchangeability, Proc. Conference on Uncertainty in Artificial Intelligence (UAI 2005), Edinburgh, Scotland, July 26 - 29, 2005.
     • http://shenshyang.googlepages.com/codes (MATLAB codes + datasets)

  35. Acknowledgement • Harry Wechsler, PhD Advisor (George Mason University) • Volodya Vovk (Royal Holloway, University of London) • Alexander Gammerman (Royal Holloway, University of London) • Oak Ridge Associated Universities (ORAU)
