
A Novel Local Patch Framework for Fixing Supervised Learning Models

Yilei Wang¹, Bingzheng Wei², Jun Yan², Yang Hu², Zhi-Hong Deng¹, Zheng Chen²

¹ Peking University

² Microsoft Research Asia

- Motivation & Background
- Problem Definition & Algorithm Overview
- Algorithm Details
- Experiments - Classification
- Experiments - Search Ranking
- Conclusion

- Supervised Learning:
- Machine Learning task of inferring a function from labeled training data

- Prediction Error:
- No matter how strong a learning model is, it will suffer from prediction errors.
- Causes: noise in the training data, a dynamically changing data distribution, weaknesses of the learner

- Feedback from User:
- A good signal that helps a learning model find its limitations and improve accordingly

- Automatically fix model prediction errors using the failure cases in feedback data.
- Input:
- A well-trained supervised model (we call it the Mother Model)
- A collection of failure cases from the feedback dataset

- Output:
- A new model that has learned to automatically fix the model bugs exposed by the failure cases
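A minimal sketch of how the failure-case input could be assembled, assuming the mother model is any callable that returns a label and the feedback data is a list of (features, label) pairs. The names `collect_failure_cases` and `toy_model` are hypothetical and used only for illustration.

```python
def collect_failure_cases(mother_model, feedback):
    """Split feedback data into failure and success cases.

    mother_model: callable mapping a feature vector to a predicted label
                  (a stand-in for the trained mother model).
    feedback:     list of (features, label) pairs.
    """
    failures, successes = [], []
    for features, label in feedback:
        if mother_model(features) == label:
            successes.append((features, label))
        else:
            failures.append((features, label))
    return failures, successes


# Toy mother model (assumption): predicts 1 when the first feature is positive.
toy_model = lambda x: 1 if x[0] > 0 else 0
feedback = [([1.0, 0.2], 1), ([-0.5, 0.1], 1), ([2.0, -1.0], 0), ([-1.0, 0.0], 0)]
failures, successes = collect_failure_cases(toy_model, feedback)
```

The success cases are kept as well, since later steps need them to keep patches away from regions where the mother model is already right.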

- Previous Works
- Model Retraining
- Model Aggregation
- Incremental Learning

- Learning models are generally optimized globally
- Fixing the old prediction errors therefore tends to introduce new ones

- Our key idea: learning to fix the model locally using patches

(Figure: globally re-optimizing the model to fix old errors creates regions labeled "New Error".)

- Our proposed Local Patch Framework (LPF) aims to learn a new model $g(x) = f(x) + \sum_{i=1}^{N} K_i(x)\, p_i(x)$
- $f$: the original mother model
- $p_i$: the $i$-th patch model
- $K_i$: Gaussian distribution defined by a centroid and a range
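A minimal sketch of how a patched model could score an input, assuming the new model adds Gaussian-weighted patch outputs to the mother model's score. This additive form, the unnormalized Gaussian, and the names `gaussian_weight` and `patched_predict` are assumptions for illustration, not the authors' implementation.

```python
import math


def gaussian_weight(x, centroid, sigma):
    """Gaussian weight of x for a patch with the given centroid and range."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def patched_predict(x, mother_model, patches):
    """Mother model score plus Gaussian-weighted patch corrections.

    patches: list of (centroid, sigma, patch_model) triples.
    """
    score = mother_model(x)
    for centroid, sigma, patch_model in patches:
        score += gaussian_weight(x, centroid, sigma) * patch_model(x)
    return score


mother = lambda x: 0.5 * x[0]                  # toy mother model (assumption)
patches = [([1.0, 1.0], 0.5, lambda x: 1.0)]   # one constant-output patch
near = patched_predict([1.0, 1.0], mother, patches)
far = patched_predict([10.0, 10.0], mother, patches)
```

Close to the centroid the patch contributes fully, while far away its weight decays towards zero, so the mother model's prediction is left essentially unchanged.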

- Failure Case Collection
- Learning Patch Regions/Failure Case Clustering
- Cluster the failure cases into N groups through subspace learning, compute the centroid and range of every group, then define the patches accordingly

- Learning Patch Model
- Learn a patch model using only the data samples that are sufficiently close to the patch centroid
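A minimal sketch of restricting patch training to nearby samples, assuming "sufficiently close" means within a fixed multiple of the patch range in Euclidean distance. The cutoff rule and the `max_sigmas` parameter are hypothetical; the slides do not state the exact criterion.

```python
import math


def select_patch_samples(samples, centroid, sigma, max_sigmas=2.0):
    """Keep samples whose Euclidean distance to the centroid is at most
    max_sigmas * sigma. The cutoff rule is an assumption for illustration."""
    selected = []
    for features, label in samples:
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(features, centroid)))
        if dist <= max_sigmas * sigma:
            selected.append((features, label))
    return selected


samples = [([0.1, 0.0], 1), ([0.9, 0.0], 0), ([5.0, 5.0], 1)]
local = select_patch_samples(samples, centroid=[0.0, 0.0], sigma=0.5)
```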

Algorithm Details

- Failure cases may be diffusely distributed
- Small N = large patch range → many success cases will be patched
- Big N = small patch range → high computational complexity

- How to make the trade-off?

- Our solution to diffusion: Metric Learning
- Learn a distance metric, i.e. subspace, for failure cases, such that the similar failure cases will aggregate, and keep distant from the success cases.
(Red circle = failure cases; blue circle = success cases)

Key idea of the patch model learning

- (Left): The cases in original data space.
- (Middle): The cases mapped to the learned subspace.
- (Right): Repair the failure cases using a single patch.

- Conditional distribution over
- Ideal distribution
- Learn to satisfy
- Which is equivalent to maximize
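One formulation that fits the four steps above is the softmax-over-distances objective used in collapsing-classes metric learning. The block below is a sketch under that assumption: the notation ($A$ for the learned linear map, $y_i$ for the group label of case $i$, $n_i$ for the number of other cases sharing that label) is introduced here for illustration and is not taken from the slides.

```latex
% Assumed notation: x_i are cases, y_i is the group label of case i,
% A is the learned linear map (the subspace), n_i = |{ j != i : y_j = y_i }|.
\begin{align}
d_{ij}^{A} &= \lVert A x_i - A x_j \rVert^2 \\
p^{A}(j \mid i) &= \frac{\exp(-d_{ij}^{A})}{\sum_{k \neq i} \exp(-d_{ik}^{A})},
  \qquad j \neq i \\
p_0(j \mid i) &=
  \begin{cases}
    1 / n_i & \text{if } y_j = y_i \\
    0       & \text{otherwise}
  \end{cases} \\
\min_{A} \sum_i \mathrm{KL}\!\left[\, p_0(\cdot \mid i) \,\middle\|\, p^{A}(\cdot \mid i) \right]
  &\;\Longleftrightarrow\;
  \max_{A} \sum_i \frac{1}{n_i} \sum_{j \neq i,\; y_j = y_i} \log p^{A}(j \mid i)
\end{align}
```

The equivalence holds because the entropy of $p_0$ does not depend on $A$, so minimizing the KL divergence only changes the cross-entropy term.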

- Algorithm:
- 1. Initialize each failure case with a random group
- 2. Repeat the following steps:
- a) For the given clusters, perform the metric learning step
- b) Update the centroids of the groups, and re-assign each failure case to its closest centroid
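The alternating procedure above can be sketched as a k-means-style loop with a pluggable metric learning step. This is an illustration under stated assumptions: `learn_metric` is a hypothetical hook that returns a projection function, the identity map stands in for it when none is given, and re-seeding an empty group with a random case is a choice made here, not one described in the slides.

```python
import random


def sq_dist(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))


def mean(points):
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def cluster_failures(failures, n_groups, learn_metric=None, n_iters=10, seed=0):
    """Alternate a metric learning step with centroid update and re-assignment.

    learn_metric(failures, groups) must return a function that maps a point
    into the learned subspace; the identity map is used when it is None
    (a placeholder, not the metric learning step of the slides).
    """
    rng = random.Random(seed)
    # Step 1: initialize each failure case with a random group.
    groups = [rng.randrange(n_groups) for _ in failures]
    centroids = []
    for _ in range(n_iters):
        # Step 2a: metric learning for the current clusters.
        project = learn_metric(failures, groups) if learn_metric else (lambda p: p)
        mapped = [project(p) for p in failures]
        # Step 2b: update centroids, then re-assign each case to the closest one.
        centroids = []
        for g in range(n_groups):
            members = [m for m, grp in zip(mapped, groups) if grp == g]
            # Re-seed an empty group with a random case so it stays usable.
            centroids.append(mean(members) if members else list(rng.choice(mapped)))
        new_groups = [min(range(n_groups), key=lambda g: sq_dist(m, centroids[g]))
                      for m in mapped]
        if new_groups == groups:
            break
        groups = new_groups
    return groups, centroids


failures = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
groups, centroids = cluster_failures(failures, n_groups=2)
```

On return, every failure case is assigned to the centroid nearest to it in the projected space, which is the state step b) leaves behind.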

- Local Patch Region:
- For each cluster $i$, we define a corresponding patch with $z_i$ as its centroid and $\sigma_i^2$ as its variance
- Gaussian weight: $K_i(x) = \exp\left(-\frac{\lVert x - z_i \rVert^2}{2\sigma_i^2}\right)$

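- Objective: $\min_{w} J(w) = \sum_j \ell\big(y_j,\ g(x_j)\big)$, with $g(x) = f(x) + \sum_i K_i(x)\, p_i(x; w_i)$ (mother model $f$, Gaussian weights $K_i$, patch models $p_i$)
- Where $w_i$ are the parameters of the patch models and $y_j$ are the labels

- Update parameter:
- For $\partial J / \partial w_i$, we have $\frac{\partial J}{\partial w_i} = \sum_j \frac{\partial \ell\big(y_j, g(x_j)\big)}{\partial g(x_j)}\, K_i(x_j)\, \frac{\partial p_i(x_j; w_i)}{\partial w_i}$
- Notice: $\partial p_i / \partial w_i$ is dependent on the specific patch model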
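A minimal sketch of one parameter update for a single patch, assuming a linear patch model and squared loss. Both choices, along with the learning rate and the function names, are assumptions for illustration, since the update depends on the specific patch model.

```python
import math


def gaussian_weight(x, centroid, sigma):
    sq = sum((a - b) ** 2 for a, b in zip(x, centroid))
    return math.exp(-sq / (2.0 * sigma ** 2))


def patch_gradient_step(w, data, mother_model, centroid, sigma, lr=0.1):
    """One gradient descent step for a single linear patch p(x) = w . x
    under squared loss (both choices are assumptions for illustration)."""
    grad = [0.0] * len(w)
    for x, y in data:
        k = gaussian_weight(x, centroid, sigma)
        patch_out = sum(wi * xi for wi, xi in zip(w, x))
        residual = mother_model(x) + k * patch_out - y
        # Chain rule: dL/dw = 2 * residual * K(x) * dp/dw, with dp/dw = x.
        for d, xd in enumerate(x):
            grad[d] += 2.0 * residual * k * xd
    return [wi - lr * gi for wi, gi in zip(w, grad)]


def total_loss(w, data, mother_model, centroid, sigma):
    loss = 0.0
    for x, y in data:
        k = gaussian_weight(x, centroid, sigma)
        pred = mother_model(x) + k * sum(wi * xi for wi, xi in zip(w, x))
        loss += (pred - y) ** 2
    return loss


mother = lambda x: 0.0                        # toy mother model that errs on this data
data = [([1.0, 0.0], 1.0), ([1.1, 0.1], 1.0)]
w = [0.0, 0.0]
before = total_loss(w, data, mother, [1.0, 0.0], 1.0)
for _ in range(50):
    w = patch_gradient_step(w, data, mother, [1.0, 0.0], 1.0)
after = total_loss(w, data, mother, [1.0, 0.0], 1.0)
```

Only the patch parameters are updated; the mother model is treated as fixed, which is what keeps the fix local.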

Experiments

- Dataset
- Three randomly selected UCI datasets
- Spambase, Waveform, Optical Digit Recognition
- Converted to binary classification datasets
- ~5000 instances in each dataset
- Split into: 60% training, 20% feedback, 20% test
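A minimal sketch of the 60/20/20 split, assuming a simple shuffled partition with a fixed seed. The slides do not say how the split was drawn, so the shuffling and the name `split_dataset` are assumptions.

```python
import random


def split_dataset(instances, seed=0):
    """Shuffle and split into 60% training, 20% feedback, 20% test."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 6 // 10
    n_feedback = len(shuffled) * 2 // 10
    train = shuffled[:n_train]
    feedback = shuffled[n_train:n_train + n_feedback]
    test = shuffled[n_train + n_feedback:]
    return train, feedback, test


train, feedback, test = split_dataset(range(5000))
```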

- Baseline Algorithm
- SVM
- Logistic Regression
- SVM - retrained with training + feedback data
- Logistic Regression - retrained with training + feedback data
- SVM - Incremental Learning
- Logistic Regression - Incremental Learning

- Classification accuracy on feedback dataset
- Classification accuracy on test dataset

- Number of Patches
- Data-sensitive; in our experiments the best N is 2

- Dataset
- Data from a commonly used commercial search engine
- ~14,126 <q, d> (query, document) pairs
- With 5-grade relevance labels

- Metrics
- NDCG@{1,3,5}
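A minimal sketch of a ranking metric evaluated at cutoffs 1, 3, and 5, assuming the metric is NDCG with the common exponential gain 2^rel - 1 and a log2 position discount. The gain and discount choices are assumptions; the slides do not specify them.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k positions."""
    return sum((2 ** rel - 1) / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k):
    """relevances: graded labels (e.g. 0-4) in the order the model ranked them."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


ranked_labels = [3, 4, 0, 2, 1]   # toy 5-grade labels for one query
scores = {k: ndcg_at_k(ranked_labels, k) for k in (1, 3, 5)}
```

In a ranking experiment this would be averaged over all queries in the test set.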

- Baseline Algorithm
- GBDT
- GBDT + IL (Incremental Learning)

- We proposed
- The local model fixing problem
- A novel patch framework for fixing the failure cases in the feedback dataset from a local view

- The experimental results demonstrate the effectiveness of our proposed Local Patch Framework

Thank you!