
Introduction to Weka

CS4705 – Natural Language Processing, Thursday, September 28. Introduction to Weka: a Java-based machine learning tool with three modes of operation (GUI, command line, and API; the API is not discussed here). To run: java -Xmx1024M -jar ~cs4705/bin/weka.jar &

laith-nunez


Presentation Transcript


  1. CS4705 – Natural Language Processing, Thursday, September 28: Introduction to Weka

  2. What is Weka?
     • Java-based machine learning tool
     • 3 modes of operation
         • GUI
         • Command line
         • API (not discussed here)
     • To run:
         java -Xmx1024M -jar ~cs4705/bin/weka.jar &

  3. Weka homepage
     • http://www.cs.waikato.ac.nz/ml/weka/

  4. .arff file format
     • http://www.cs.waikato.ac.nz/~ml/weka/arff.html
         @relation name
         @attribute attrName {numeric, string, <nominal>, date}
         ...
         @data
         a,b,c,d,e
     • <nominal> := {class1,class2,...,classN}
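To make the layout above concrete, here is a minimal Python sketch that writes a small ARFF file. The helper name `write_arff`, the relation name, and the toy rows are all hypothetical and chosen only for illustration; the sketch also assumes values contain no spaces or commas, so it does no quoting.

```python
import io


def write_arff(out, relation, attributes, rows):
    """Write a minimal ARFF document to the file-like object `out`.

    attributes: list of (name, type) pairs; type is a string such as
                "numeric" or "string", or a list of nominal class labels.
    rows:       list of value lists, one value per attribute.
    Assumption: values need no quoting (no spaces or commas).
    """
    out.write(f"@relation {relation}\n\n")
    for name, attr_type in attributes:
        if isinstance(attr_type, (list, tuple)):
            # Nominal attribute: {class1,class2,...,classN}
            attr_type = "{" + ",".join(attr_type) + "}"
        out.write(f"@attribute {name} {attr_type}\n")
    out.write("\n@data\n")
    for row in rows:
        if len(row) != len(attributes):
            raise ValueError("row length does not match attribute count")
        out.write(",".join(str(value) for value in row) + "\n")


# Hypothetical toy data, loosely modeled on iris.arff.
buf = io.StringIO()
write_arff(
    buf,
    "toy_iris",
    [
        ("sepallength", "numeric"),
        ("petallength", "numeric"),
        ("class", ["Iris-setosa", "Iris-versicolor"]),
    ],
    [[5.1, 1.4, "Iris-setosa"], [7.0, 4.7, "Iris-versicolor"]],
)
arff_text = buf.getvalue()
print(arff_text)
```

Passing an open file instead of the `io.StringIO` buffer would produce a file that can be loaded with 'Open file...' in the Explorer or passed to `-t` on the command line.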

  5. Example ARFF files
     • http://sourceforge.net/projects/weka
     • iris.arff
     • cmc.arff

  6. To classify with the Weka GUI
     • Run the Weka GUI
     • Click 'Explorer'
     • 'Open file...'
     • Select the 'Classify' tab
     • 'Choose' a classifier
     • Confirm options
     • Click 'Start'
     • Wait...
     • Right-click on the Result list entry
         • 'Save result buffer'
         • 'Save model'

  7. Classify
     • Some classifiers to start with:
         • NaiveBayes
         • JRip
         • J48
         • SMO
     • Find references by selecting a classifier
     • Use cross-validation!

  8. Analyzing results
     • Important tools for Homework 2
         • Accuracy
             • “Correctly classified instances”
         • Confusion matrix
         • Save model
         • Visualization
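As a rough illustration of how accuracy relates to the confusion matrix, the Python sketch below recomputes accuracy and per-class precision and recall from a matrix. It assumes rows are actual classes and columns are predicted classes, as in Weka's "classified as" layout; the counts are made up for illustration and are not results from any real run.

```python
def accuracy(matrix):
    """Fraction of instances on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


def per_class_recall(matrix):
    """For each actual class (row): correct / all instances of that class."""
    return [row[i] / sum(row) if sum(row) else 0.0
            for i, row in enumerate(matrix)]


def per_class_precision(matrix):
    """For each predicted class (column): correct / all predictions of it."""
    n = len(matrix)
    result = []
    for j in range(n):
        column_total = sum(matrix[i][j] for i in range(n))
        result.append(matrix[j][j] / column_total if column_total else 0.0)
    return result


# Hypothetical 3-class confusion matrix: rows = actual, columns = predicted.
confusion = [
    [50, 0, 0],
    [0, 47, 3],
    [0, 2, 48],
]

print("accuracy: ", accuracy(confusion))
print("recall:   ", per_class_recall(confusion))
print("precision:", per_class_precision(confusion))
```

The accuracy value corresponds to the “Correctly classified instances” percentage; the per-class numbers show which classes are being confused with each other.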

  9. Running Weka from the command line
     • Running an N-fold cross-validation experiment
         java -cp ~cs4705/bin/weka.jar weka.classifiers.bayes.NaiveBayes -t trainingdata.arff -x N
     • Using a predefined test set
         java -cp ~cs4705/bin/weka.jar weka.classifiers.bayes.NaiveBayes -t trainingdata.arff -T testingdata.arff

  10. Saving the model
     • java -cp ~cs4705/bin/weka.jar weka.classifiers.bayes.NaiveBayes -t trainingdata.arff -d output.model
     • Classifying a test set
         java -cp ~cs4705/bin/weka.jar weka.classifiers.bayes.NaiveBayes -l input.model -T testingdata.arff
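When running many experiments, it can help to build these command lines in a script rather than typing them. The Python sketch below assembles the argument list for the four invocations shown on slides 9 and 10. The function `weka_command` and its parameter names are hypothetical; only the flags `-t`, `-T`, `-x`, `-d`, and `-l` and the file names come from the slides.

```python
import shlex


def weka_command(classifier, jar, train=None, test=None, folds=None,
                 save_model=None, load_model=None):
    """Build the argument list for one Weka command-line run.

    Only the flags that appear on the slides are supported:
    -t (training file), -T (test file), -x (number of folds),
    -d (save model to file), -l (load model from file).
    """
    cmd = ["java", "-cp", jar, classifier]
    if train:
        cmd += ["-t", train]
    if test:
        cmd += ["-T", test]
    if folds is not None:
        cmd += ["-x", str(folds)]
    if save_model:
        cmd += ["-d", save_model]
    if load_model:
        cmd += ["-l", load_model]
    return cmd


JAR = "~cs4705/bin/weka.jar"  # path from the slides; adjust for your setup
NB = "weka.classifiers.bayes.NaiveBayes"

cross_validation = weka_command(NB, JAR, train="trainingdata.arff", folds=10)
train_and_save = weka_command(NB, JAR, train="trainingdata.arff",
                              save_model="output.model")
classify = weka_command(NB, JAR, load_model="input.model",
                        test="testingdata.arff")

for cmd in (cross_validation, train_and_save, classify):
    # shlex.join (Python 3.8+) renders the list as a shell-style string.
    print(shlex.join(cmd))

# To actually run one of these (not executed here), something like
#   subprocess.run(cmd, check=True)
# would work after expanding "~" in the jar path with os.path.expanduser.
```

Keeping the command as a list rather than a single string avoids shell quoting problems if a file name contains spaces.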

  11. Analyzing results
     • Get predictions from test data
         java -cp ~cs4705/bin/weka.jar weka.classifiers.bayes.NaiveBayes -l input.model -T testingdata.arff -p range
     • Then DIY with scripts
         • awk and sed will be your friends
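The "DIY with scripts" step could also be done in Python instead of awk and sed. The sketch below parses prediction output and recomputes accuracy. The exact output of `-p` differs between Weka versions, so the format here is an assumption: one instance per line, with the instance number followed by actual and predicted classes written as `index:label`. The sample text is invented for illustration; check it against what your Weka version prints and adjust the parser accordingly.

```python
def parse_predictions(text):
    """Return a list of (instance_number, actual, predicted) tuples.

    Assumed line format (varies by Weka version):
        <inst#> <index>:<actual> <index>:<predicted> [+] <confidence>
    Header lines and anything that does not match are skipped.
    """
    results = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        if ":" not in fields[1] or ":" not in fields[2]:
            continue
        actual = fields[1].split(":", 1)[1]
        predicted = fields[2].split(":", 1)[1]
        results.append((int(fields[0]), actual, predicted))
    return results


# Invented sample in the assumed format, not output from a real run.
sample_output = """\
 inst#     actual  predicted error prediction
     1 1:Iris-setosa 1:Iris-setosa       1
     2 2:Iris-versicolor 3:Iris-virginica   +   0.6
     3 3:Iris-virginica 3:Iris-virginica       0.9
"""

predictions = parse_predictions(sample_output)
errors = [p for p in predictions if p[1] != p[2]]
correct = len(predictions) - len(errors)

print(f"{correct} of {len(predictions)} correct")
for inst, actual, predicted in errors:
    print(f"instance {inst}: actual={actual} predicted={predicted}")
```

In practice the text would come from redirecting the command's output to a file and reading that file in place of `sample_output`.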

  12. Getting predictions from cross-validation
     • “Output Predictions” doesn't cut it.
     • export CLASSPATH=~cs4705/bin/:~cs4705/bin/weka.jar
     • java callClassifier weka.classifiers.bayes.NaiveBayes -t trainingdata.arff
