Loading in 2 Seconds...

Comparing the Parallel Automatic Composition of Inductive Applications with Stacking Methods

Loading in 2 Seconds...

- 69 Views
- Uploaded on

Download Presentation
## PowerPoint Slideshow about 'Comparing the Parallel Automatic Composition of Inductive Applications with Stacking Methods' - astrid

**An Image/Link below is provided (as is) to download presentation**

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript

### Comparing the Parallel Automatic Composition of Inductive Applications with Stacking Methods

### Thank you!

Hidenao Abe & Takahira Yamaguchi

Shizuoka University, JAPAN

{hidenao,yamaguti}@ks.cs.inf.shizuoka.ac.jp

Contents

- Introduction
- Constructive meta-learning based on method repositories
- Case study with common data sets
- Conclusion

Parallel and Distributed Computing for Machine Learning 2003

データセット

学習

Data Set

学習

Learning Algorithms

Overview of meta-learning schemeMeta-learning

algorithm

Meta Knowledge

Meta-Learning

Executor

A Better result to the given data set

than its done

by each single learning algorithm

Parallel and Distributed Computing for Machine Learning 2003

Selective meta-learning scheme

- Integrating base-level classifiers, which are learned with different training data sets generating by
- “Bootstrap Sampling” (bagging)
- weighting ill-classified instances (boosting)
- Integrating base-level classifiers, which are learned with different algorithms, with
- simple voting (voting)
- constructing meta-classifier with a meta data set and a meta-level learning algorithm (stacking, cascading)

They don’t work well, when no base-level learning

algorithm works well to the given data set !!

-> Because they don’t de-compose base-level learning algorithms.

Parallel and Distributed Computing for Machine Learning 2003

Contents

- Introduction
- Constructive meta-learning based on method repositories
- Basic idea of our constructive meta-learning
- An implementation of the constructive meta-learning called CAMLET
- Parallel execution and search for CAMLET
- Case Study with common data sets
- Conclusion

Parallel and Distributed Computing for Machine Learning 2003

Organization

Search &

Composition

+

C4.5

CS

VS

AQ15

Basic Idea of ourConstructive Meta-LearningAnalysis of Representative

Inductive Learning Algorithms

Parallel and Distributed Computing for Machine Learning 2003

Organization

Search &

Composition

+

CS

AQ15

Basic Idea ofConstructive Meta-LearningAnalysis of Representative Inductive Learning Algorithms

Parallel and Distributed Computing for Machine Learning 2003

Organization

Search &

Composition

+

Automatic Composition

of Inductive Applications

Organizing inductive learning methods,

control structures and treated objects.

Basic Idea ofConstructive Meta-LearningAnalysis of Representative

Inductive Learning Algorithms

Parallel and Distributed Computing for Machine Learning 2003

Analysis of Representative Inductive Learning Algorithms

We have analyzed the following 8 learning algorithms:

- Version Space
- AQ15
- ID3
- C4.5
- Boosted C4.5
- Bagged C4.5
- Neural Network with back propagation
- Classifier Systems

We have identified 22 specific inductive learning methods.

Parallel and Distributed Computing for Machine Learning 2003

Inductive Learning Method Repository

Parallel and Distributed Computing for Machine Learning 2003

Data Type Hierarchy

Organization of input/output/reference data types

for inductive learning methods

Objects

classifier-set

If-Then rules

tree (Decision Tree)

network (Neural Net)

data-set

training data set

validation data set

test data set

Parallel and Distributed Computing for Machine Learning 2003

Identifying Inductive Learning Methods and Control Structures

Modifying

training and

validation data sets

Generating

Training &

Validation

Data Sets

Evaluating

classifier sets

to test

data set

Generating

a classifier

set

Evaluating

a classifier

set

Modifying

a classifier

set

Modifying

a classifier

set

Parallel and Distributed Computing for Machine Learning 2003

CAMLET:a Computer Aided Machine Learning Engineering Tool

Data Sets, Goal Accuracy

CAMLET

Method Repository

Data Type Hierarchy

Control Structures

Construction

Instantiation

Compile

User

Go to

the goal accuracy?

Go & Test

Yes

No

Refinement

Inductive application, etc.

Parallel and Distributed Computing for Machine Learning 2003

CAMLET with a parallel environment

Data Sets, Goal Accuracy

CAMLET

Method Repository

Data Type Hierarchy

Control Structures

Composition Level

Construction

Specifications

Data Sets

Execution

Level

Instantiation

Compile

User

Go to

the goal accuracy?

Go & Test

Yes

No

Results

Refinement

Inductive application, etc.

Parallel and Distributed Computing for Machine Learning 2003

Parallel Executions of Inductive Applications

Initializing

Composition Level is

necessary one process

Constructing specs

Sending specs

Execution Level is necessary one or more process elements

R

R

R

R

Refining & Sending

Executing

Receiving

Refining the spec

Sending refined one

Waiting

Receiving a result

Ex

Ex

Ex

R

Ex

Ex

Ex

Ex

Ex

Ex

S

Ex

R: receiving ,Ex: executing, S: sending

Parallel and Distributed Computing for Machine Learning 2003

Refinement of Inductive Applications with GA

Executed Specs &

Their Results

t Generation

- Transform executed specifications to chromosomes
- Select parents with “Tournament Method”
- Crossover the parents and Mutate one of their children
- Execute children’s specification, transforming chromosomes to specs
- * If CAMLET can execute more than two I.A at the same time, some slow I.A will be added to t+2 or later Generation.

Selection

execute

& add

Parents

Crossover and

Mutation

Children

t+1 Generation

X Generation

Parallel and Distributed Computing for Machine Learning 2003

CAMLET on a parallel GA refinement

Data Sets, Goal Accuracy

CAMLET

Method Repository

Data Type Hierarchy

Control Structures

Composition Level

Construction

Specifications

Data Sets

Execution

Level

Instantiation

Compile

User

Go to

the goal accuracy?

Go & Test

Yes

No

Results

Refinement

Inductive application, etc.

Parallel and Distributed Computing for Machine Learning 2003

Contents

- Introduction
- Constructive meta-learning based on method repositories
- Case Study with common data sets
- Accuracy comparison with common data sets
- Evaluation of parallel efficiencies in CAMLET
- Conclusion

Parallel and Distributed Computing for Machine Learning 2003

Set up of accuracy comparison with common data sets

- We have used two kinds of common data sets
- 10 Statlog data sets
- 32 UCI data sets (distributed with WEKA)
- We have input goal accuracies to each data set as the criterion.
- CAMLET has output just one specification of the best inductive application to each data set and its performance.
- CAMLET has searched about 6,000 inductive applications for the best one, executing up to one hundred inductive applications.
- 10 “Execution Level” PEs have been used to each data set.
- Stacking methods implemented in WEKA data mining suites

Parallel and Distributed Computing for Machine Learning 2003

Statlog common data sets

To generate 10-fold cross validation data sets’ pairs,

We have used SplitDatasetFilter implemented in WEKA.

Parallel and Distributed Computing for Machine Learning 2003

Setting up of Stacking methods to Statlog common data sets

Meta-level Learning Algorithms

J4.8

(with pruning, C=0.25)

Naïve Bayes

Classification with

Linear Regression

(with all of features)

J4.8 (with pruning, C=0.25)

Naïve Bayes

Part (with pruning, C=0.25)

IBk (k=5)

Bagged J4.8 (Sampling rate=55%, 5 Iterations)

Bagged J4.8 (Sampling rate=55%, 10 Iterations)

Boosted J4.8 (5 Iterations)

Boosted J4.8 (10 Iterations)

Base-level Learning Algorithms

Parallel and Distributed Computing for Machine Learning 2003

Evaluation on accuracies to Statlog common data sets

Inductive applications composed by CAMLET have shown us as

good performance as that of stacking with Classification

via Linear Regression.

Parallel and Distributed Computing for Machine Learning 2003

Setting up of Stacking methods to UCI common data sets

Meta-level Learning Algorithms

J4.8

(with pruning, C=0.25)

Naïve Bayes

Linear Regression

(with all of features)

J4.8 (with pruning, C=0.25)

Naïve Bayes

Part (with pruning, C=0.25)

IBk (k=5)

Bagged J4.8 (Sampling rate=55%, 10 Iterations)

Boosted J4.8 (10 Iterations)

Base-level Learning Algorithms

Parallel and Distributed Computing for Machine Learning 2003

Evaluation on accuracies to UCI common data sets

Parallel and Distributed Computing for Machine Learning 2003

Analysis of parallel efficiency of CAMLET

- We have analyzed parallel efficiencies of executions of the case study with “Statlog common data sets”.
- We have used 10 processing elements to execute each inductive applications.
- Parallel Efficiency is calculated as the following formula:

1<=x<100, n:#fold, e:Execution Time of Each PE

Parallel and Distributed Computing for Machine Learning 2003

Parallel Efficiency of the case study with StatLog data sets

Parallel Efficiencies of 10-fold CV data sets

Parallel Efficiencies of training-test data sets

Parallel and Distributed Computing for Machine Learning 2003

Why the parallel efficiency of “satimage” data set was so low?

- Because:
- - CAMLET could not find satisfactory inductive application to this
- data set within 100 executions of inductive applications.
- In the end of search, inductive applications with large computational
- cost badly affected to parallel efficiency.

Parallel and Distributed Computing for Machine Learning 2003

Contents

- Introduction
- Constructive meta-learning based on method repositories
- Case Study with common data sets
- Conclusion

Parallel and Distributed Computing for Machine Learning 2003

Conclusion

- CAMLET has been implemented as a tool for “Constructive Meta-Learning” scheme based on method repositories.
- CAMLET shows us a significant performance as a meta-learning scheme.
- We are extending the method repository to construct data mining application of whole data mining process.
- The parallel efficiency of CAMLET has been achieved greater than 5.93 times.
- It should be improved
- with methods to equalize each execution time such as load balancing or/and three-layered architecture.
- with more efficient search method to find satisfactory inductive application faster.

Parallel and Distributed Computing for Machine Learning 2003

Parallel and Distributed Computing for Machine Learning 2003

分類器

ベースレベル

分類器

ベースレベル

分類器

ベースレベル

分類器

Base-level

classifiers

Base-level

classifiers

Stacking: training and predictionTraining phase

Prediction phase

Training Set

Repeating

with CV

Training Set

Test Set

Learning of

Base-level classifiers

Training Set

(Int.)

Test Set

(Int.)

Learning of

Base-level classifiers

Transformation

Meta-level

Test Set

Transformation

Prediction

Meta-level

Training Set

Meta-level

classifiers

Meta-level learning

Meta-level

classifier

Results of Preduction

Parallel and Distributed Computing for Machine Learning 2003

How to transforme base-level attributes to meta-level attributes

Learning with Algorithm 1

Classifier by

Algorithm 1

- Getting prediction probability to each class value with the
- classifier.
- 2. Adding base-level class value of each base-level instance

A algorithm

C class values

#Meta-level Att. = AC

Parallel and Distributed Computing for Machine Learning 2003

Meta-level classifiers (Ex.)

The prediction probability of”0”

With classifier by algorithm 1

＞0.1

≦0.1

The prediction probability of”1”

With classifier by algorithm 2

The prediction probability of”1”

With classifier by algorithm 1

＞0.95

≦0.95

≦0.9

＞0.9

The prediction probability of”0”

With classifier by algorithm 3

Prediction

“1”

Prediction

“1”

Prediction

“0”

．．．．

Parallel and Distributed Computing for Machine Learning 2003

Accuracies of 8 base-level learning algorithms to Statlog data sets

CAMLET

Base-level algorithms

Stacking

Base-level algorithms

Accuracy（％）

Accuracy（％）

Parallel and Distributed Computing for Machine Learning 2003

C4.5 Decision Tree without pruning (reproduction of CAMLET)

Generating training and validation data sets

with a void validation set

1. Select just one

control structure

2. Fill each method with

specific methods

3. Instantiate this spec.

4. Compile the spec.

5. Execute its executable

code

6. Refine the spec.

(if needed)

Generating a classifier set (decision tree)

with entropy + information ratio

Evaluating a classifier set

with set evaluation

Evaluating classifier sets to test data set

with single classifier set evaluation

Parallel and Distributed Computing for Machine Learning 2003

Download Presentation

Connecting to Server..