WEKA 3.5.5

(source: Machine Learning with WEKA)

What is WEKA?

  • Weka is a collection of machine learning algorithms for data mining tasks.

  • Weka contains tools for

    • data pre-processing,

    • classification,

    • regression,

    • clustering,

    • association rules, and

    • visualization.

  • It is also well-suited for developing new machine learning schemes.



  • A dataset is roughly equivalent to a two-dimensional spreadsheet or database table.

  • A dataset is a collection of examples.

  • The external representation of an Instances class is an ARFF file, which consists of a header describing the attribute types, followed by the data as a comma-separated list.

Dataset - ARFF

  • The ARFF Header Section

    The ARFF Header section of the file contains the relation declaration and attribute declarations.

    • The @relation Declaration

      The relation name is defined as the first line.

    • The @attribute Declarations

      Each attribute in the data set has its own @attribute statement, which uniquely defines the attribute's name and its data type. The order in which the attributes are declared determines their column position in the data section of the file.
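To illustrate how declaration order maps to column position, here is a minimal Python sketch. The sample text and the helper name parse_header are assumptions made for illustration and are not part of WEKA. The sketch handles only unquoted names.

```python
# Minimal sketch: read an ARFF header and record each attribute's column position.
# SAMPLE and parse_header are illustrative assumptions, not part of WEKA.

SAMPLE = """\
@relation iris
@attribute sepallength numeric
@attribute class {Iris-setosa,Iris-versicolor,Iris-virginica}
@data
5.1,Iris-setosa
"""


def parse_header(text):
    relation = None
    columns = {}  # attribute name -> zero-based column position
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("%"):  # skip blank lines and % comments
            continue
        keyword = line.split(None, 1)[0].lower()  # keywords are matched case-insensitively
        if keyword == "@relation":
            relation = line.split(None, 1)[1]
        elif keyword == "@attribute":
            name = line.split(None, 2)[1]
            columns[name] = len(columns)  # declaration order gives the column index
        elif keyword == "@data":
            break  # the header ends where the data section starts
    return relation, columns


relation, columns = parse_header(SAMPLE)
print(relation, columns)
```

The first declared attribute lands in column 0 and the second in column 1, which is the order the values must follow on every instance line.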

ARFF - Header Section

ARFF - Data Types

  • The <datatype> can be any of the following types:

    • Numeric: can be real or integer numbers.

      • integer is treated as numeric

      • real is treated as numeric

    • Nominal

    • String

    • Date

  • The keywords numeric, real, integer, string and date are case insensitive.

ARFF - Data Types Example

  • @ATTRIBUTE sepallength NUMERIC

  • @ATTRIBUTE class {Iris-setosa,Iris-versicolor,Iris-virginica}

  • @ATTRIBUTE LCC string

  • @attribute <name> date [<date-format>]

    default format: yyyy-MM-dd'T'HH:mm:ss
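The type rules above can be sketched in Python as follows. The function name attribute_kind is hypothetical, and the strptime pattern is this sketch's assumed Python equivalent of the default format yyyy-MM-dd'T'HH:mm:ss. WEKA itself interprets the pattern with Java's date formatting.

```python
from datetime import datetime

# Assumed Python equivalent of the ARFF default date format yyyy-MM-dd'T'HH:mm:ss
DEFAULT_DATE_FORMAT = "%Y-%m-%dT%H:%M:%S"


def attribute_kind(datatype):
    """Return 'numeric', 'nominal', 'string' or 'date' for an ARFF <datatype>."""
    text = datatype.strip()
    if not text:
        raise ValueError("empty datatype")
    if text.startswith("{") and text.endswith("}"):
        return "nominal"  # a brace-enclosed list of allowed values
    keyword = text.split(None, 1)[0].lower()  # type keywords are case insensitive
    if keyword in ("numeric", "real", "integer"):
        return "numeric"  # real and integer are treated as numeric
    if keyword in ("string", "date"):
        return keyword
    raise ValueError("unrecognized datatype: " + datatype)


print(attribute_kind("NUMERIC"))
print(attribute_kind("{Iris-setosa,Iris-versicolor,Iris-virginica}"))
print(attribute_kind("string"))
print(attribute_kind("date"))

# Parse a value written in the default date format.
stamp = datetime.strptime("2007-03-01T12:30:00", DEFAULT_DATE_FORMAT)
print(stamp)
```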

ARFF - Data Section

ARFF - Data Section ..

  • The ARFF Data section of the file contains the data declaration line and the actual instance lines.

    • The @data Declaration

      The @data declaration is a single line denoting the start of the data segment in the file.

    • The instance data

      • Each instance on a single line

      • Attribute values delimited by commas

      • The order of the values matches the order of the attribute declarations in the header section

      • Missing values are represented by a single question mark

      • Values of string and nominal attributes are case sensitive, and any that contain spaces must be quoted
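As a rough illustration of these rules, the following Python sketch reads a single instance line. It assumes single-quote quoting and uses the standard csv module. The helper name parse_instance is hypothetical, and values are returned as strings without type conversion.

```python
import csv


def parse_instance(line):
    """Split one ARFF instance line into values; '?' becomes None."""
    # csv handles the comma delimiter and single-quoted values containing spaces.
    reader = csv.reader([line], quotechar="'", skipinitialspace=True)
    values = next(reader)
    # Simplification: a quoted '?' is also treated as missing in this sketch.
    return [None if value == "?" else value for value in values]


row = parse_instance("5.1,?,'Iris setosa'")
print(row)
```

Because nominal and string values are case sensitive, the sketch deliberately leaves their case unchanged.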

Create an ARFF file

Create an ARFF file ..
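One possible way to create a small ARFF file programmatically is sketched below in Python. The helper names quote and to_arff, the output file name example.arff, and the two toy rows are assumptions made for illustration. Values that themselves contain quote characters are not escaped in this sketch.

```python
import os
import tempfile


def quote(value):
    """Render one value: None becomes '?', values with spaces are quoted."""
    if value is None:
        return "?"  # missing values are written as a single question mark
    text = str(value)
    return "'" + text + "'" if " " in text else text


def to_arff(relation, attributes, rows):
    """Build ARFF text from (name, datatype) pairs and rows of values."""
    lines = ["@relation " + quote(relation), ""]
    for name, datatype in attributes:
        lines.append("@attribute " + quote(name) + " " + datatype)
    lines += ["", "@data"]
    for row in rows:
        # Values follow the order of the attribute declarations.
        lines.append(",".join(quote(value) for value in row))
    return "\n".join(lines) + "\n"


attributes = [
    ("sepallength", "numeric"),
    ("class", "{Iris-setosa,Iris-versicolor,Iris-virginica}"),
]
rows = [[5.1, "Iris-setosa"], [None, "Iris-virginica"]]  # toy rows for illustration

text = to_arff("iris", attributes, rows)
path = os.path.join(tempfile.gettempdir(), "example.arff")  # hypothetical file name
with open(path, "w") as handle:
    handle.write(text)
print(text)
```

A file written this way can then be opened in WEKA like any other ARFF file.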

WEKA 3.5.5



  • LogWindow: opens a log window that captures everything printed to stdout or stderr. Useful in environments like MS Windows, where WEKA is not started from a terminal.

  • Exit: closes WEKA.

Program .. LogWindow



  • Explorer: for exploring data with WEKA.

  • Experimenter: for performing experiments and conducting statistical tests between learning schemes.

  • KnowledgeFlow: supports essentially the same functions as the Explorer but with a drag-and-drop interface. One advantage is that it supports incremental learning.

  • SimpleCLI: provides a simple command-line interface that allows direct execution of WEKA commands on operating systems that do not provide their own command-line interface.



  • ArffViewer: an MDI application for viewing ARFF files in spreadsheet format.

  • SqlViewer: an SQL worksheet for querying databases via JDBC.

  • EnsembleLibrary: an interface for generating setups for Ensemble Selection.









  • Plot: for plotting a 2D plot of a dataset.

  • ROC: displays a previously saved ROC curve.

  • TreeVisualizer: for displaying directed graphs, e.g., a decision tree.

  • GraphVisualizer: visualizes XML BIF or DOT format graphs, e.g., for Bayesian networks.

  • BoundaryVisualizer: allows the visualization of classifier decision boundaries in two dimensions.



  • Minimize: minimizes all current windows.

  • Restore: restores all minimized windows.
