Automatic Method for Data Preprocessing for the GAME Inductive Modelling Method

Miroslav Čepek

cepekm1@fel.cvut.cz

Miloslav Pavlicek, Pavel Kordik

Miroslav Šnorek

Computational Intelligence Group

Department of Computer Science and Engineering

Faculty of Electrical Engineering

Czech Technical University in Prague

ICIM 2008


Automatic Preprocessing

  • The GAME neural network (like all other data mining methods) depends heavily on data preprocessing.

  • Preprocessing involves the selection, setup and ordering of preprocessing methods.

  • We want to automate this stage.

  • We will use a genetic algorithm to find the optimal sequence of methods.


GAME Neural Network

  • Group of Adaptive Models Evolution (GAME) uses inductive modelling.

  • The structure of the model is created in an inductive way (data-driven modelling).


Main Ideas of Automatic Preprocessing

  • The main idea is to use genetic algorithms to find the optimal order and optimal setup of data preprocessing methods.

  • In the first stage we will use a simple genetic algorithm.

  • Because we want to find the sequence that best fits the GAME ANN, we will use a reduced GAME ANN for fitness function evaluation.


Single Individual in Automatic Preprocessing

  • Each individual in our automatic preprocessing consists of a list of preprocessing methods.

    • Each method can be applied to different attributes.

    • Each method has its own setup.

    • Methods are applied one by one.

    • Some methods change the structure of the dataset (e.g. PCA) and must be treated separately.
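
One way to picture such an individual is as an ordered list of steps, each holding a method, the attributes it touches and its setup. The sketch below is only an illustration under that assumption: the names `Step`, `apply_individual`, `clip_max` and `min_max_scale` are hypothetical and are not taken from the authors' implementation, and structure-changing methods such as PCA are left out.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence

Dataset = List[List[float]]

@dataclass
class Step:
    """One gene (hypothetical encoding): method, target attributes, setup."""
    name: str
    func: Callable[[Dataset, Sequence[int], Dict[str, float]], Dataset]
    attributes: Sequence[int]
    config: Dict[str, float] = field(default_factory=dict)

def clip_max(data, attributes, config):
    """Cap the chosen attributes at config['max']."""
    out = [row[:] for row in data]
    for a in attributes:
        for row in out:
            row[a] = min(row[a], config["max"])
    return out

def min_max_scale(data, attributes, config):
    """Rescale the chosen attributes linearly to the range [0, 1]."""
    out = [row[:] for row in data]
    for a in attributes:
        column = [row[a] for row in data]
        low, high = min(column), max(column)
        span = (high - low) or 1.0  # avoid division by zero on constant columns
        for row in out:
            row[a] = (row[a] - low) / span
    return out

def apply_individual(steps, data):
    """Apply the methods one by one, in chromosome order."""
    for step in steps:
        data = step.func(data, step.attributes, step.config)
    return data

data = [[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]]
clip = Step("clip_max", clip_max, [1], {"max": 20.0})
scale = Step("min_max_scale", min_max_scale, [0, 1])

clip_then_scale = apply_individual([clip, scale], data)
scale_then_clip = apply_individual([scale, clip], data)
```

The two results differ, which is why the order of the methods is part of what the genetic algorithm has to search over.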


GA for Automatic Preprocessing

  • The genetic algorithm proceeds in the standard way, as shown below.

[Figure: diagram of the genetic algorithm, not included in the transcript]
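
A conventional generational loop can be sketched as follows. This is a generic textbook form and may differ from the authors' diagram; the function `run_ga`, its parameters and the toy bit-string problem are assumptions made only to keep the example runnable.

```python
import random

def run_ga(init, fitness, select, crossover, mutate,
           pop_size=20, generations=30, seed=0):
    """Generic generational GA loop; all operators are supplied by the caller."""
    rng = random.Random(seed)
    population = [init(rng) for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        # Score the population once so selection can reuse the values.
        scored = [(fitness(ind), ind) for ind in population]
        offspring = []
        while len(offspring) < pop_size:
            parent_a = select(scored, rng)
            parent_b = select(scored, rng)
            offspring.append(mutate(crossover(parent_a, parent_b, rng), rng))
        population = offspring
        candidate = max(population, key=fitness)
        if fitness(candidate) > fitness(best):
            best = candidate  # remember the best individual seen so far
    return best

# Toy problem used only to exercise the loop: maximise the number of 1 bits.
def toy_init(rng):
    return [rng.randint(0, 1) for _ in range(8)]

def toy_select(scored, rng):
    # Return the fitter of two randomly drawn individuals.
    return max(rng.sample(scored, 2), key=lambda pair: pair[0])[1]

def toy_crossover(a, b, rng):
    cut = rng.randint(1, len(a) - 1)
    return a[:cut] + b[cut:]

def toy_mutate(ind, rng):
    # Flip each bit with a small probability.
    return [bit ^ 1 if rng.random() < 0.05 else bit for bit in ind]

best = run_ga(toy_init, sum, toy_select, toy_crossover, toy_mutate)
```

In the preprocessing setting, `init` would build a random list of preprocessing methods and `fitness` would train reduced GAME models on the preprocessed data.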


GA Properties

  • Selection: tournament selection

    • Several individuals are drawn at random from the population, and the one with the highest fitness is selected.

  • Crossover: standard one-point crossover.

  • Mutation

    • adds or removes preprocessing methods from an individual.

    • changes the order of methods.

    • changes the configuration of methods.
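
The three operators could look roughly like this when a chromosome is a variable-length list of methods. It is a minimal sketch under several assumptions: the method names in `METHODS`, the single `param` per gene, and the choice of a common cut point no larger than the shorter parent are all illustrative and not taken from the presentation.

```python
import random

# Hypothetical method names, for illustration only.
METHODS = ["normalise", "impute_mean", "remove_outliers", "discretise"]

def tournament_select(population, fitnesses, k, rng):
    """Draw k individuals at random and return the fittest of them."""
    contenders = rng.sample(range(len(population)), k)
    winner = max(contenders, key=lambda i: fitnesses[i])
    return population[winner]

def one_point_crossover(a, b, rng):
    """Head of parent a joined to the tail of parent b at one cut point."""
    cut = rng.randint(0, min(len(a), len(b)))
    return a[:cut] + b[cut:]

def mutate(individual, rng):
    """Apply one randomly chosen mutation; the input list is left untouched."""
    child = [dict(gene) for gene in individual]
    op = rng.choice(["add", "remove", "swap", "config"])
    if op == "add":
        gene = {"method": rng.choice(METHODS), "param": rng.random()}
        child.insert(rng.randint(0, len(child)), gene)
    elif op == "remove" and len(child) > 1:
        child.pop(rng.randrange(len(child)))
    elif op == "swap" and len(child) > 1:
        i, j = rng.sample(range(len(child)), 2)
        child[i], child[j] = child[j], child[i]
    elif op == "config" and child:
        child[rng.randrange(len(child))]["param"] = rng.random()
    # If the chosen operation is not applicable, the copy is returned unchanged.
    return child

rng = random.Random(42)
parent_a = [{"method": m, "param": 0.5} for m in METHODS[:3]]
parent_b = [{"method": m, "param": 0.1} for m in METHODS]
child = one_point_crossover(parent_a, parent_b, rng)
mutant = mutate(parent_a, rng)
```

With a tournament size equal to the population size, `tournament_select` always returns the fittest individual; smaller tournaments lower the selection pressure.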


Fitness Recalculation

  • Fitness is the average accuracy of several simple GAME models generated from data preprocessed by the given individual.

    • The accuracy of the models is not always the same, because a genetic algorithm is involved in training.

    • Using several models gives more consistent results.

  • We assume that a better simple model also means better complex models.
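
The averaging step can be sketched as below. Training a GAME model is replaced by a stand-in scorer that returns a noisy accuracy, so `fitness`, `preprocess`, `train_and_score`, `n_models` and `noisy_scorer` are all hypothetical names rather than part of the GAME software.

```python
import random
from statistics import mean

def fitness(individual, preprocess, train_and_score, n_models=5, seed=0):
    """Average the accuracy of several models trained on the preprocessed data."""
    data = preprocess(individual)
    # Each model gets its own random generator, mimicking stochastic training.
    scores = [train_and_score(data, random.Random(seed + i))
              for i in range(n_models)]
    return mean(scores)

def noisy_scorer(data, rng):
    """Stand-in for training one simple model: accuracy 0.8 plus small noise."""
    return 0.8 + rng.uniform(-0.05, 0.05)

value = fitness(["normalise"], lambda individual: individual, noisy_scorer)
```

Averaging over `n_models` runs reduces the influence of a single lucky or unlucky training run on the fitness of an individual.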


Outline of the Experiment

  • The complete dataset is split into a training part and a testing part.

  • A given portion of values is removed from the training data, and three variants are compared:

    • Several GAME models are created on the raw data.

    • Instances with missing values are removed, then several GAME models are created.

    • Automatic preprocessing is performed: the best individual is selected, its preprocessing methods are applied, and several GAME models are created.
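
The data preparation for this experiment can be sketched as follows. The helper names, the 20% test fraction, the use of `None` to mark a missing value and the decision to make every cell eligible for removal are assumptions for illustration; the presentation does not specify them.

```python
import random

def split(data, test_fraction, rng):
    """Shuffle the rows and split them into a training and a testing part."""
    rows = data[:]
    rng.shuffle(rows)
    n_test = int(len(rows) * test_fraction)
    return rows[n_test:], rows[:n_test]

def remove_values(rows, portion, rng):
    """Replace a given portion of all cells with None (a missing value)."""
    out = [list(row) for row in rows]
    cells = [(i, j) for i, row in enumerate(out) for j in range(len(row))]
    for i, j in rng.sample(cells, round(len(cells) * portion)):
        out[i][j] = None
    return out

def drop_incomplete(rows):
    """Baseline variant: keep only instances without missing values."""
    return [row for row in rows if None not in row]

rng = random.Random(0)
dataset = [[float(i), float(i * 2), float(i % 3), float(i % 5)]
           for i in range(100)]
train, test = split(dataset, 0.2, rng)
train_missing = remove_values(train, 0.05, rng)
train_complete = drop_incomplete(train_missing)
```

Here `train_missing` corresponds to the raw-data variant and `train_complete` to the variant with incomplete instances removed; the third variant would pass `train_missing` through the best evolved sequence of preprocessing methods.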


Artificial Data


Best Chromosomes

The best individuals for selected amounts of missing values. Part a) shows the best chromosome for 1% of missing values, part b) for 5% of missing values and part c) for 20% of missing values.


Best Chromosomes

  • Chromosomes for simple problems (few missing values) are quite simple.

  • Chromosomes for complicated problems (many missing values) are quite complicated.

  • In this sense our algorithm works.



Results

  • The graph shows that GAME is unable to handle missing values: results on the raw data are quite poor.

  • When instances with missing data are removed, accuracy increases sharply.

  • When automatic preprocessing is used, accuracy is even better.


Conclusion

  • We proposed an algorithm for automatic selection and ordering of data preprocessing methods.

  • We performed the first experiment with our method.

  • It works for artificial data; in future we have to show that it also works for more complicated, real-world data.


Thank you for your attention.

cepekm1@fel.cvut.cz

