# Meta Learning: For Classification



### Meta Learning: For Classification

Daniel Spohn

drspohn@student.ysu.edu

Youngstown State University

03-23-06

Introduction - Meta-Learning

A process for improving classification results by applying additional methods on top of the base algorithms.

• Bagging, Stacking and Boosting all involve running multiple classification algorithms or running the same algorithm multiple times.
• Meta-Learning can also be applied to a dataset (table) to determine which algorithm is most appropriate for learning from it.


Stacking
• Using multiple algorithms and combining their results.
• Output from the previous layer is passed on as input to the next layer.
• Ex. The output of a decision tree can be used as input for a neural network.
• This combining of algorithms helps reduce the problems of an individual algorithm.
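The layering described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the setup used in the presentation: the level-0 learners are one-level decision trees (stumps) and the level-1 learner is a simple lookup table standing in for the neural network. All function names and the toy data are assumptions. Practical stacking usually trains the level-1 learner on held-out predictions rather than on the training data itself.

```python
from collections import Counter, defaultdict

def train_stump(rows, labels, feature):
    """Level-0 learner: a one-level decision tree on a single feature."""
    values = sorted({r[feature] for r in rows})
    best = None
    for t in values[1:]:  # skip the lowest value so the stump always splits
        for hi, lo in ((1, 0), (0, 1)):
            correct = sum((hi if r[feature] >= t else lo) == y
                          for r, y in zip(rows, labels))
            if best is None or correct > best[0]:
                best = (correct, t, hi, lo)
    _, t, hi, lo = best
    return lambda r: hi if r[feature] >= t else lo

def train_meta(level1_inputs, labels):
    """Level-1 learner: majority label for each combination of base outputs."""
    votes = defaultdict(Counter)
    for outputs, y in zip(level1_inputs, labels):
        votes[outputs][y] += 1
    default = Counter(labels).most_common(1)[0][0]
    table = {o: c.most_common(1)[0][0] for o, c in votes.items()}
    return lambda outputs: table.get(outputs, default)

# Toy data (hypothetical): the label is 1 only when both features are 1.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 0, 1]

# Layer 1: one stump per feature. Layer 2: trained on the stumps' outputs.
bases = [train_stump(rows, labels, f) for f in range(2)]
level1_inputs = [tuple(b(r) for b in bases) for r in rows]
meta = train_meta(level1_inputs, labels)

def stacked(r):
    return meta(tuple(b(r) for b in bases))

stacked_predictions = [stacked(r) for r in rows]
```

On this toy data each stump alone misclassifies one instance, while the stacked model separates all four, which is the kind of weakness-covering the last bullet refers to.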


Boosting
• The algorithm is run iteratively until the misclassification rate is low.
• The instances that are difficult to classify are given higher weight and the algorithm is run again.
• As an alternative, boosting can be run with random sub-samples each time, instead of weights.
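The reweighting idea can be sketched as a small AdaBoost loop over decision stumps. This is an illustrative sketch under assumptions: the data is a one-dimensional toy set, the stump learner and function names are made up for the example, and the loop runs a fixed number of rounds instead of stopping once the error is low.

```python
import math

def best_stump(xs, ys, weights):
    """Pick the threshold and polarity with the lowest weighted error."""
    best = None
    for t in sorted(set(xs)):
        for polarity in (1, -1):
            preds = [polarity if x >= t else -polarity for x in xs]
            err = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
            if best is None or err < best[0]:
                best = (err, t, polarity)
    return best

def adaboost(xs, ys, rounds):
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []  # list of (alpha, threshold, polarity)
    for _ in range(rounds):
        err, t, pol = best_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, t, pol))
        # Misclassified instances gain weight; correct ones lose weight.
        weights = [w * math.exp(-alpha * y * (pol if x >= t else -pol))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    score = sum(a * (p if x >= t else -p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1

# Toy data (hypothetical): no single threshold separates the two classes.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

one_round = adaboost(xs, ys, rounds=1)
three_rounds = adaboost(xs, ys, rounds=3)
errors_one = sum(predict(one_round, x) != y for x, y in zip(xs, ys))
errors_three = sum(predict(three_rounds, x) != y for x, y in zip(xs, ys))
```

A single stump misclassifies three of the ten instances here; after three rounds of reweighting, the weighted vote of the stumps classifies all ten correctly.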


Bagging
• Bagging (Bootstrap Aggregation)
• Implements “Voting”
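• Multiple models, typically built with the same algorithm, are trained independently on different bootstrap samples (random samples drawn with replacement) of the dataset.
• The models' predictions are then combined by voting to produce the final classification.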
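A minimal sketch of bootstrap sampling plus voting, in plain Python. The base learner (nearest class mean on one numeric feature), the function names, and the toy data are all assumptions chosen to keep the example short; any classifier could take the base learner's place.

```python
import random
from collections import Counter

def train_nearest_mean(sample):
    """Base learner: predict the class whose mean is closest to x."""
    sums, counts = Counter(), Counter()
    for x, y in sample:
        sums[y] += x
        counts[y] += 1
    means = {y: sums[y] / counts[y] for y in counts}
    return lambda x: min(means, key=lambda y: abs(x - means[y]))

def bagging(data, n_models, seed=0):
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap sample: same size as the data, drawn with replacement.
        sample = [rng.choice(data) for _ in data]
        models.append(train_nearest_mean(sample))

    def vote(x):
        # Each model casts one vote; the most common class wins.
        return Counter(m(x) for m in models).most_common(1)[0][0]

    return vote

# Toy data (hypothetical): one numeric feature, two classes.
data = [(1.0, "low"), (1.5, "low"), (2.0, "low"),
        (8.0, "high"), (8.5, "high"), (9.0, "high")]
bagged = bagging(data, n_models=11)
```

An odd number of models avoids ties in a two-class vote; `bagged(1.2)` and `bagged(8.8)` then return the majority decision of the eleven models.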


Meta-Classifiers
• Meta-Learning using Meta-Classifiers
• Meta-Classifiers are characteristics of the dataset itself (e.g., number of columns, types of attributes).
• Using the Meta-Classifiers, a dataset is compared to previously analyzed datasets, and a ranking is produced showing the estimated effectiveness of classification algorithms.
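One plausible way to turn dataset characteristics into a ranking is a nearest-neighbour lookup over past datasets. The sketch below assumes that approach; it is not a description of how Weka MetaL works internally. The choice of characteristics, the `history` entries, and the algorithm names `algo_a` and `algo_b` are hypothetical placeholders. Only the Adult figures (48842 instances, 14 attributes, 8 nominal) come from the slides.

```python
import math

def meta_features(n_instances, n_attributes, n_nominal):
    """Describe a dataset by a few simple characteristics."""
    return (math.log10(n_instances), n_attributes, n_nominal / n_attributes)

# Hypothetical history of previously analyzed datasets.
# Every number and algorithm name here is made up for illustration.
history = [
    {"features": meta_features(500, 4, 0),
     "accuracy": {"algo_a": 0.90, "algo_b": 0.95}},
    {"features": meta_features(50000, 15, 9),
     "accuracy": {"algo_a": 0.85, "algo_b": 0.80}},
]

def rank_algorithms(features, history):
    # Find the most similar past dataset and reuse its algorithm ordering.
    nearest = min(history, key=lambda h: math.dist(features, h["features"]))
    scores = nearest["accuracy"]
    return sorted(scores, key=scores.get, reverse=True)

adult = meta_features(48842, 14, 8)  # figures from the Adult dataset slide
ranking = rank_algorithms(adult, history)
```

Here the Adult dataset is closest to the second, larger history entry, so the ranking follows that entry's ordering of algorithms.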


Datasets - Adults
• Adult.arff
• 48842 instances
• 14 attributes (6 continuous, 8 nominal)
• Contains information on adults such as age, gender, ethnicity, marital status, education, native country, etc.
• The instances are classified into either “Salary >50K” or “Salary <= 50K”


Datasets - Census Income
• census-income.arff
• 199,523 instances
• 40 attributes (7 continuous, 33 nominal)
• Demographic information and monetary information from the 1994 and 1995 surveys conducted by the U.S. Census Bureau.
• The instances are classified into either “Income >50K” or “Income <= 50K”


Datasets - Census Income
• The dataset has already been split into a training set and a testing set (2/3 training, 1/3 testing).
• This dataset contains missing values, and some attributes may need to be discretized to improve the performance and effectiveness of the algorithm.
• Additionally, some attributes may need to be removed because they are irrelevant.
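Two of the cleaning steps mentioned above, filling missing values and dropping irrelevant attributes, can be sketched as follows. This is an assumed approach (mode imputation), not necessarily what was done for the presentation. The column names and rows are hypothetical; the `?` marker is how ARFF files denote a missing value.

```python
from collections import Counter

MISSING = "?"  # ARFF marks missing values with a question mark

def impute_mode(rows, column):
    """Replace missing values in a column with its most frequent value."""
    observed = [r[column] for r in rows if r[column] != MISSING]
    mode = Counter(observed).most_common(1)[0][0]
    return [{**r, column: mode} if r[column] == MISSING else r for r in rows]

def drop_columns(rows, columns):
    """Remove attributes judged irrelevant for classification."""
    return [{k: v for k, v in r.items() if k not in columns} for r in rows]

# Hypothetical rows; "record_id" stands in for an irrelevant attribute.
rows = [
    {"workclass": "Private", "record_id": 1, "income": "<=50K"},
    {"workclass": "?", "record_id": 2, "income": ">50K"},
    {"workclass": "Private", "record_id": 3, "income": "<=50K"},
    {"workclass": "State-gov", "record_id": 4, "income": "<=50K"},
]
cleaned = drop_columns(impute_mode(rows, "workclass"), {"record_id"})
```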


Analyzing Adult Dataset
• Running this dataset through Weka MetaL (Meta Classifiers) produces a ranking of classification algorithms.
• The top-ranked algorithm is LogitBoost.


Analyzing Adult Dataset
• When using Weka MetaL's top-ranked algorithm, LogitBoost, 84.68% of instances are correctly classified.
• When using Weka MetaL's lowest-ranked algorithm, ZeroR, 76.07% of instances are correctly classified.
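ZeroR ignores every attribute and always predicts the most frequent class, so its accuracy is simply the share of the majority class in the evaluated data. That makes it a useful baseline for the LogitBoost figure. A minimal sketch, using an illustrative 76/24 class split rather than the real Adult data:

```python
from collections import Counter

def zero_r(train_labels):
    """ZeroR: always predict the most frequent class seen in training."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return lambda _instance: majority

def accuracy(classifier, instances, labels):
    correct = sum(classifier(x) == y for x, y in zip(instances, labels))
    return correct / len(labels)

# Illustrative labels only: 76 of 100 instances in the majority class.
labels = ["<=50K"] * 76 + [">50K"] * 24
instances = [None] * len(labels)  # attributes are irrelevant to ZeroR

baseline = zero_r(labels)
baseline_accuracy = accuracy(baseline, instances, labels)
```

Any algorithm worth using should beat this baseline; the gap between the two reported figures is what the attributes actually contribute.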


Analyzing Census Dataset
• Took a sample of 9,953 records (of 199,523)
• Of the 40 attributes, selected 9 of the most important to help classification.
• Discretized the continuous attributes by grouping their values into fixed ranges (bins).
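Sampling and grouping continuous values can be sketched like this. The cut points, bin labels, and function names are hypothetical; the slides do not say which attributes were grouped or where the boundaries were placed.

```python
import random

def sample_rows(rows, k, seed=0):
    """Draw k distinct rows at random (without replacement)."""
    return random.Random(seed).sample(rows, k)

def bin_value(value, edges, labels):
    """Map a number to the label of the first range it falls below."""
    for edge, label in zip(edges, labels):
        if value < edge:
            return label
    return labels[-1]  # at or above the last edge

# Hypothetical cut points for an age-like attribute.
AGE_EDGES = [25, 45, 65]
AGE_LABELS = ["<25", "25-44", "45-64", "65+"]

subset = sample_rows(list(range(100)), k=5)
binned = [bin_value(age, AGE_EDGES, AGE_LABELS) for age in (18, 25, 50, 70)]
```

With these cut points, the four example ages map to `<25`, `25-44`, `45-64`, and `65+` respectively.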


Analyzing Census Dataset
• Running J48 results in:
• <=50K: 99% correctly classified
• >50K: 17% correctly classified
• Running AdaBoost with J48 as the classifier:
• <=50K: 98% correctly classified
• >50K: 36% correctly classified
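The figures above are per-class accuracies, which matter because a high overall accuracy can hide a poorly classified minority class. The sketch below computes both on made-up counts (90 majority instances, 10 minority instances); it does not reproduce the Census results.

```python
from collections import Counter

def per_class_accuracy(actual, predicted):
    """Fraction of each class's instances that were classified correctly."""
    totals, correct = Counter(), Counter()
    for a, p in zip(actual, predicted):
        totals[a] += 1
        correct[a] += (a == p)
    return {c: correct[c] / totals[c] for c in totals}

# Illustrative counts only, not the Census data.
actual = ["<=50K"] * 90 + [">50K"] * 10
# A classifier that finds only 2 of the 10 minority instances.
predicted = ["<=50K"] * 90 + [">50K"] * 2 + ["<=50K"] * 8

by_class = per_class_accuracy(actual, predicted)
overall = sum(a == p for a, p in zip(actual, predicted)) / len(actual)
```

In this example the overall accuracy is 92% even though only 20% of the minority class is found, the same pattern as the J48 results on the Census sample.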


Summary
• Meta Learning helps improve results over the basic algorithms.
• Using Meta Characteristics on the Adult dataset to determine an appropriate algorithm, I achieved almost 85% correct classification.
• Using boosting on the Census dataset, AdaBoost with J48 correctly classified more than twice as many instances of the second group (>50K) as J48 alone: 36% versus 17%.
