Learning under concept drift: an overview



Zhimin He

iTechs – ISCAS

2013-03-21

Agenda
  • What’s Concept Drift
  • Causes of a Concept Drift
  • Types of Concept Drift
  • Detecting and Handling Concept Drift
  • Implications for Software Engineering Research
Definitions
  • Prediction
    • $X_t \in \mathbb{R}^p$ is a vector in p-dimensional feature space observed at time $t$, and $y_t$ is the corresponding label.
    • We call $X_t$ an instance and a pair $(X_t, y_t)$ a labeled instance. We refer to the instances $(X_1, \ldots, X_t)$ as historical data and to the instance $X_{t+1}$ as the target (or testing) instance.
    • The task is to predict a label $y_{t+1}$ for the target instance $X_{t+1}$.
Definitions (cont.)
  • Concept Drift
    • Every instance $X_t$ is generated by a source $S_t$.
    • If all the data is sampled from the same source, i.e. $S_1 = S_2 = \ldots = S_{t+1} = S$, we say that the concept is stable.
    • If $S_i \neq S_j$ for some two time points $i$ and $j$, we say that there is a concept drift.
Causes of Concept Drift
  • Let $X \in \mathbb{R}^p$ be an instance in p-dimensional feature space, with $X \in c_i$, where $c_1, c_2, \ldots, c_k$ is the set of class labels.
  • The optimal classifier for assigning $X$ to a class $c_i$ is determined by the prior probabilities of the classes $P(c_i)$ and the class-conditional probability density functions $p(X \mid c_i)$, $i = 1, \ldots, k$.
  • Concept / data source:
    • a set of prior probabilities of the classes and class-conditional pdfs: $S = \{(P(c_1), p(X \mid c_1)), \ldots, (P(c_k), p(X \mid c_k))\}$
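The priors and class-conditional densities that make up a source determine the class posteriors through Bayes' rule, which is why a change in the source can change the optimal classifier. The marginal density $p(X)$ and the decision rule $\hat{c}(X)$ are notation assumed here; they do not appear on the slide.

```latex
p(c_i \mid X) = \frac{P(c_i)\, p(X \mid c_i)}{p(X)},
\qquad
p(X) = \sum_{j=1}^{k} P(c_j)\, p(X \mid c_j),
\qquad
\hat{c}(X) = \arg\max_{i} \; P(c_i)\, p(X \mid c_i)
```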
Causes of Concept Drift (cont.)
  • Concept drift may occur in three ways:
    • Class priors $P(c_i)$ might change over time.
    • The distributions of one or several classes $p(X \mid c_i)$ might change (virtual drift).
    • The posterior distributions of the class memberships $p(c_i \mid X)$ might change (real drift).
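A small numeric sketch of how the first two kinds of change can carry through to the third. The two-class Gaussian source, its parameters, and the helper names `gaussian_pdf` and `posterior` are all assumptions made for illustration. In this toy setup both changes move the posterior at x = 0, which need not happen in general.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def posterior(x, priors, means, stds):
    """Return p(c_i | x) for every class via Bayes' rule."""
    joint = [p * gaussian_pdf(x, m, s) for p, m, s in zip(priors, means, stds)]
    total = sum(joint)  # marginal density p(x)
    return [j / total for j in joint]

# Hypothetical source: equal priors, unit-variance classes centred at -1 and +1.
before = posterior(0.0, priors=[0.5, 0.5], means=[-1.0, 1.0], stds=[1.0, 1.0])

# Change 1: the class priors shift, the class-conditional densities stay fixed.
prior_shift = posterior(0.0, priors=[0.9, 0.1], means=[-1.0, 1.0], stds=[1.0, 1.0])

# Change 2: the class-conditional density of the second class moves away.
cond_shift = posterior(0.0, priors=[0.5, 0.5], means=[-1.0, 3.0], stds=[1.0, 1.0])

print(before, prior_shift, cond_shift)
```

At x = 0 the two classes are equally likely under the original source; after either change the first class becomes the more probable one.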
Types of Concept Drift
  • Types:
    • Sudden drift
    • Gradual drift
    • Incremental drift
    • Reoccurring contexts
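The four types can be illustrated with a synthetic one-dimensional stream. This is a sketch under assumptions: the function name `sample_stream`, the two source means (0.0 and 5.0), and the change interval are hypothetical, and the slide itself gives no definitions beyond the names. Sudden drift switches sources at one point in time, gradual drift samples the new source with growing probability, incremental drift moves the source in small steps, and a reoccurring context brings the old source back.

```python
import random

def sample_stream(kind, n=1000, start=400, end=600, seed=0):
    """Return n 1-D instances whose source drifts between `start` and `end`."""
    rng = random.Random(seed)
    old_mean, new_mean = 0.0, 5.0  # hypothetical old and new sources
    stream = []
    for t in range(n):
        # Fraction of the transition completed, clipped to [0, 1].
        progress = min(1.0, max(0.0, (t - start) / (end - start)))
        if kind == "sudden":
            mean = old_mean if t < start else new_mean
        elif kind == "gradual":
            # Both sources stay intact; the new one is sampled more and more often.
            mean = new_mean if rng.random() < progress else old_mean
        elif kind == "incremental":
            # The source itself moves a little at every step.
            mean = old_mean + progress * (new_mean - old_mean)
        elif kind == "reoccurring":
            # The old source becomes active again after the interval.
            mean = new_mean if start <= t < end else old_mean
        else:
            raise ValueError(f"unknown drift type: {kind}")
        stream.append(rng.gauss(mean, 1.0))
    return stream

for kind in ("sudden", "gradual", "incremental", "reoccurring"):
    data = sample_stream(kind)
    print(kind, round(sum(data[:400]) / 400, 2), round(sum(data[600:]) / 400, 2))
```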
Detecting and Handling Concept Drift
  • Detecting
    • Monitoring the raw data
    • Monitoring parameters of learners
    • Monitoring prediction errors of learners
  • Handling
    • Ensemble learning
    • Instance selection
    • Instance weights
    • Training windows
    • Training windows are naturally suitable for sudden concept drift, while ensembles are more flexible in terms of change type.
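Two items from the lists above, monitoring prediction errors and training windows, can be sketched together in a test-then-train loop. Everything named here is an assumption for illustration: the classes `ErrorRateMonitor` and `WindowedCentroidClassifier`, the window sizes, the 0.2 threshold, and the toy stream whose labels flip at t = 300. The detection rule (recent error rate compared with the older error rate) is a simple heuristic chosen for brevity, not a method taken from the slides.

```python
from collections import deque

class ErrorRateMonitor:
    """Flags drift when the recent error rate exceeds the older one by `threshold`."""
    def __init__(self, window=50, threshold=0.2):
        self.window = window
        self.threshold = threshold
        self.recent = deque(maxlen=window)
        self.ref_errors = 0
        self.ref_count = 0

    def update(self, error):
        """Feed 1 for a wrong prediction, 0 for a correct one."""
        if len(self.recent) == self.window:
            # The oldest recent outcome moves into the reference statistics.
            self.ref_errors += self.recent[0]
            self.ref_count += 1
        self.recent.append(error)
        if self.ref_count < self.window:
            return False  # not enough history to compare against
        ref_rate = self.ref_errors / self.ref_count
        recent_rate = sum(self.recent) / len(self.recent)
        return recent_rate - ref_rate > self.threshold

class WindowedCentroidClassifier:
    """Nearest-centroid classifier trained only on the last `size` labeled instances."""
    def __init__(self, size=50):
        self.window = deque(maxlen=size)

    def learn(self, x, y):
        self.window.append((x, y))  # the oldest pair is dropped automatically

    def predict(self, x):
        sums, counts = {}, {}
        for xi, yi in self.window:
            sums[yi] = sums.get(yi, 0.0) + xi
            counts[yi] = counts.get(yi, 0) + 1
        if not counts:
            return None  # nothing learned yet
        return min(counts, key=lambda c: abs(x - sums[c] / counts[c]))

clf = WindowedCentroidClassifier(size=50)
monitor = ErrorRateMonitor(window=50, threshold=0.2)
alarms = []
for t in range(600):
    x = -1.0 if t % 2 == 0 else 1.0
    flipped = t >= 300  # hypothetical sudden drift: the labels swap
    y = ("b" if x < 0 else "a") if flipped else ("a" if x < 0 else "b")
    y_hat = clf.predict(x)               # test first ...
    if monitor.update(int(y_hat != y)):
        alarms.append(t)
    clf.learn(x, y)                      # ... then train
print(alarms[:1], clf.predict(-1.0), clf.predict(1.0))
```

Because the classifier only ever sees the last 50 instances, it recovers from the label swap on its own once the window has turned over; the monitor reports the change while the errors are still accumulating.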
Detecting and Handling Concept Drift (cont.)
  • Overall solution for learning under concept drift
Implications for SE Research
  • Concept drift is a fundamental issue for SE predictions
    • Cost estimation, defect prediction…
    • Especially in the cross-company/cross-project context
    • It can harm the performance of prediction models
  • Detecting and handling concept drift is a challenging task!
    • Quality problems of SE data, e.g., insufficient data
    • The data generation context is highly unstable.
  • It has become an increasingly popular research topic in the SE field!
    • E.g., Burak Turhan [JESE 2012], Jayalath Ekanayake [MSR 2009, JESE 2011]
References
  • I. Zliobaite, “Learning under Concept Drift: an Overview,” Technical report, 2009
  • A. Dries and U. Rückert, “Adaptive Concept Drift Detection,” Statistical Analysis and Data Mining, 2009
  • L. Minku, A. White, and X. Yao, “The impact of diversity on on-line ensemble learning in the presence of concept drift,” IEEE Transactions on Knowledge and Data Engineering, 2009
  • M. Kelly, D. Hand, and N. Adams, “The impact of changing populations on classifier performance,” KDD, 1999