
Meta Learning: For Classification

Daniel Spohn

drspohn@student.ysu.edu

Youngstown State University

03-23-06

Introduction - Meta-Learning

Meta-learning is the process of improving classification results by applying additional methods on top of the base learning algorithms.

  • Bagging, stacking, and boosting all involve running multiple classification algorithms or running the same algorithm multiple times.
  • Meta-learning can also be applied to the characteristics of a dataset to determine which algorithm is most appropriate for learning from it.

Stacking
  • Using multiple algorithms and combining their results.
  • Output from the previous layer is passed on as input to the next layer.
  • Ex. The output of a decision tree can be used as input for a neural network.
  • Combining algorithms in this way helps offset the weaknesses of any individual algorithm (see the sketch below).
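As a rough illustration of the idea (a scikit-learn sketch, not the Weka tooling referenced elsewhere in this talk), a decision tree and a naive Bayes learner can serve as level-0 models whose predictions feed a level-1 neural-network combiner; the dataset here is synthetic.

```python
# Minimal stacking sketch with scikit-learn; synthetic data stands in for a real dataset.
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=14, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Level-0 learners: a decision tree and naive Bayes. Their predictions are
# passed as input to the level-1 learner, here a small neural network.
stack = StackingClassifier(
    estimators=[
        ("tree", DecisionTreeClassifier(max_depth=5, random_state=0)),
        ("nb", GaussianNB()),
    ],
    final_estimator=MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0),
)
stack.fit(X_train, y_train)
print("stacked accuracy:", stack.score(X_test, y_test))
```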

Boosting
  • The base algorithm is run repeatedly until misclassification is low.
  • The instances that are difficult to classify are given higher weight and the algorithm is run again.
  • As an alternative, boosting can be run on random sub-samples drawn each round instead of using instance weights (see the sketch below).
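A minimal boosting sketch with scikit-learn's AdaBoost, standing in for the Weka AdaBoost used later in the talk; the data are synthetic and the default base learner is a depth-1 decision tree.

```python
# Minimal AdaBoost sketch with scikit-learn on synthetic, imbalanced data.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=14, weights=[0.76, 0.24], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each round fits the (default) shallow decision tree, then increases the weight
# of the instances that round misclassified, so the next round focuses on them.
boosted = AdaBoostClassifier(n_estimators=50, random_state=0)
boosted.fit(X_train, y_train)
print("boosted accuracy:", boosted.score(X_test, y_test))
```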

Bagging
  • Bagging (Bootstrap Aggregation)
    • Implements “Voting”
    • The same learning algorithm is run independently on multiple bootstrap samples (random subsets) of the dataset.
    • The results of these runs are then combined by voting to create the final model (see the sketch below).
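A minimal bagging sketch with scikit-learn (again a stand-in for the Weka implementation): each ensemble member is trained on its own bootstrap sample and the predictions are combined by voting.

```python
# Minimal bagging sketch with scikit-learn on synthetic data.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=14, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 25 copies of the default base learner (a decision tree), each trained on a
# bootstrap sample of the training set; their votes are aggregated at predict time.
bagged = BaggingClassifier(n_estimators=25, bootstrap=True, random_state=0)
bagged.fit(X_train, y_train)
print("bagged accuracy:", bagged.score(X_test, y_test))
```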

Meta-Classifiers
  • Meta-Learning using Meta-Classifiers
    • Meta-classifiers here are characteristics of the dataset itself (e.g., number of columns, types of attributes, etc.).
    • Using these meta-classifiers, a dataset is compared to previously analyzed datasets, and a ranking is produced showing the estimated effectiveness of the available classification algorithms (see the sketch below).
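The sketch below only illustrates the general idea of ranking algorithms from dataset characteristics; the meta-features and the stored "experience" table are invented for illustration and are far simpler than what Weka MetaL actually computes.

```python
# Toy sketch of algorithm ranking from dataset characteristics (meta-features).
import numpy as np
import pandas as pd

def meta_features(df: pd.DataFrame) -> np.ndarray:
    """Describe a dataset by a few simple characteristics."""
    n_rows, n_cols = df.shape
    n_numeric = df.select_dtypes(include="number").shape[1]
    n_nominal = n_cols - n_numeric
    return np.array([np.log(n_rows), n_cols, n_numeric, n_nominal], dtype=float)

# Hypothetical "experience base": characteristics of previously analyzed datasets
# together with the algorithm that performed best on each (entries are made up).
experience = [
    (np.array([10.8, 14.0, 6.0, 8.0]), "LogitBoost"),
    (np.array([7.2, 20.0, 20.0, 0.0]), "J48"),
    (np.array([9.5, 8.0, 0.0, 8.0]), "NaiveBayes"),
]

def rank_algorithms(new_df: pd.DataFrame) -> list:
    """Rank stored recommendations by similarity of dataset characteristics."""
    f = meta_features(new_df)
    ordered = sorted(experience, key=lambda entry: np.linalg.norm(entry[0] - f))
    return [algo for _, algo in ordered]

# A tiny synthetic frame standing in for a new, unseen dataset.
new_df = pd.DataFrame({"age": [25, 40, 31], "hours": [40, 50, 35], "sex": ["M", "F", "F"]})
print(rank_algorithms(new_df))
```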

Datasets - Adults
  • Adult.arff
    • 48,842 instances
    • 14 attributes (6 continuous, 8 nominal)
    • Contains information on adults such as age, gender, ethnicity, marital status, education, native country, etc.
    • The instances are classified into either “Salary >50K” or “Salary <= 50K”

Datasets – Census Income
  • census-income.arff
    • 199,523 instances
    • 40 attributes (7 continuous, 33 nominal)
    • Demographic information and monetary information from the 1994 and 1995 surveys conducted by the U.S. Census Bureau.
    • The instances are classified into either “Income >50K” or “Income <= 50K”

Datasets - Census Income
  • The dataset has already been split into a training set and a testing set (2/3 training, 1/3 testing).
  • This dataset contains missing values, and some attributes may need to be discretized to improve the performance and effectiveness of the algorithms.
  • Additionally, some attributes may need to be removed because they are irrelevant (a preprocessing sketch follows).
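A minimal sketch of this kind of cleanup, assuming the ARFF file has been exported to CSV; the column names, the missing-value marker, and the bin edges are placeholders, not the ones actually used in the talk.

```python
# Hypothetical cleanup sketch with pandas; column names are placeholders.
import pandas as pd

df = pd.read_csv("census-income.csv", na_values=["?"])  # assume '?' marks missing values

# Drop an attribute judged irrelevant (hypothetical column name).
df = df.drop(columns=["instance_weight"], errors="ignore")

# Fill missing nominal values with each column's most frequent category.
for col in df.select_dtypes(include="object").columns:
    df[col] = df[col].fillna(df[col].mode().iloc[0])

# Discretize a continuous attribute into fixed groups.
df["age_group"] = pd.cut(df["age"], bins=[0, 25, 45, 65, 120],
                         labels=["<=25", "26-45", "46-65", "65+"])
```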

Analyzing Adult Dataset
  • Running this dataset through Weka MetaL (meta-classifiers) produces a ranking of the candidate algorithms.
  • The top-ranked algorithm is LogitBoost.

Analyzing Adult Dataset
  • Using Weka MetaL's top-ranked algorithm, LogitBoost, 84.68% of instances are correctly classified.
  • Using Weka MetaL's lowest-ranked algorithm, ZeroR, 76.07% of instances are correctly classified (a baseline comparison sketch follows).
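The comparison pattern can be sketched in scikit-learn, where DummyClassifier(strategy="most_frequent") plays the role of ZeroR; LogitBoost itself is not in scikit-learn, so a gradient-boosted model is used as a rough analogue on synthetic, Adult-like imbalanced data.

```python
# Sketch: boosted model versus a ZeroR-style majority-class baseline.
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Class imbalance roughly like Adult (~76% / 24%).
X, y = make_classification(n_samples=5000, n_features=14, weights=[0.76, 0.24], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)  # ZeroR analogue
boosted = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

print("baseline accuracy:", baseline.score(X_test, y_test))
print("boosted accuracy: ", boosted.score(X_test, y_test))
```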

Analyzing Census Dataset
  • Took a sample of 9,953 records (of 199,523).
  • Of the 40 attributes, selected the 9 judged most important for classification.
  • Grouped the attributes containing continuous values into static groups (a sketch of these steps follows).
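A minimal sketch of the sampling and attribute-selection step with pandas; the column names below are illustrative placeholders, not the actual nine attributes selected in the talk. Grouping the continuous attributes could then be done with pd.cut as in the earlier preprocessing sketch.

```python
# Hypothetical sampling and attribute-selection sketch; column names are placeholders.
import pandas as pd

df = pd.read_csv("census-income.csv", na_values=["?"])

# Take a ~5% random sample of the 199,523 records.
sample = df.sample(n=9953, random_state=0)

# Keep a hand-picked subset of attributes believed to matter most for the class.
keep = ["age", "education", "occupation", "sex", "capital_gains",
        "capital_losses", "weeks_worked", "citizenship", "income"]
sample = sample[[c for c in keep if c in sample.columns]]
```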

Analyzing Census Dataset
  • Running J48 results in:
    • <=50K: 99% correctly classified
    • >50K: 17% correctly classified
  • Running AdaBoost with J48 as the base classifier:
    • <=50K: 98% correctly classified
    • >50K: 36% correctly classified (see the per-class sketch below)
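The per-class comparison can be sketched with scikit-learn stand-ins (a plain decision tree for J48, AdaBoost over trees for Weka's AdaBoost); per-class recall in the report plays the role of "% correctly classified" for each income group. The data here are synthetic and heavily imbalanced.

```python
# Sketch: per-class evaluation of a single tree versus AdaBoost on imbalanced data.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Heavy imbalance, roughly like census-income (most records are <=50K).
X, y = make_classification(n_samples=10000, n_features=9, weights=[0.94, 0.06], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
boosted = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)

# Per-class recall shows how each classifier handles the minority (>50K-like) class.
print(classification_report(y_test, tree.predict(X_test)))
print(classification_report(y_test, boosted.predict(X_test)))
```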

Summary
  • Meta-learning helps improve results over the basic algorithms.
  • Using meta-characteristics of the Adult dataset to determine an appropriate algorithm, I achieved almost 85% correct classification.
  • Using boosting, AdaBoost with J48 correctly classified more than twice as many instances in the “>50K” group as J48 alone.
