
WEKA 3.5.5

(source: Machine Learning with WEKA)


What is WEKA?

  • Weka is a collection of machine learning algorithms for data mining tasks.

  • Weka contains tools for

    • data pre-processing,

    • classification,

    • regression,

    • clustering,

    • association rules, and

    • visualization.

  • It is also well-suited for developing new machine learning schemes.


Dataset

  • A dataset is roughly equivalent to a two-dimensional spreadsheet or database table.

  • A dataset is a collection of examples.

  • The external representation of an Instances class is an ARFF file, which consists of a header describing the attribute types, followed by the data as comma-separated lists.


Dataset - ARFF

  • The ARFF Header Section

    The ARFF Header section of the file contains the relation declaration and attribute declarations.

    • The @relation Declaration

      The relation name is defined on the first line of the file.

    • The @attribute Declarations

      Each attribute in the data set has its own @attribute statement, which uniquely defines the attribute's name and its data type. The order in which the attributes are declared determines their column positions in the data section of the file.


ARFF - Header Section
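
As a rough illustration of the header layout described above, the following Python sketch reads a header and returns the relation name together with the attributes in declaration order. It does not use WEKA itself. The function name parse_header and the relation name iris are assumptions made for this example; the two attribute lines are taken from the data types example in this presentation. Attribute names that contain spaces are not handled.

```python
# Minimal sketch, not WEKA's own parser: reads @relation and @attribute lines.
# "iris" as a relation name is an assumption made for illustration.
SAMPLE_HEADER = """\
@RELATION iris
@ATTRIBUTE sepallength NUMERIC
@ATTRIBUTE class {Iris-setosa,Iris-versicolor,Iris-virginica}
@DATA
"""


def parse_header(text):
    """Return (relation_name, [(attribute_name, datatype), ...])."""
    relation = None
    attributes = []  # list order mirrors the column order in the data section
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        parts = line.split(None, 1)
        keyword = parts[0].lower()
        rest = parts[1].strip() if len(parts) > 1 else ""
        if keyword == "@relation":
            relation = rest
        elif keyword == "@attribute":
            fields = rest.split(None, 1)
            if len(fields) != 2:
                raise ValueError("attribute needs a name and a datatype: " + line)
            attributes.append((fields[0], fields[1]))
        elif keyword == "@data":
            break  # the header ends where the data section starts
    return relation, attributes


relation, attributes = parse_header(SAMPLE_HEADER)
for column, (name, datatype) in enumerate(attributes):
    print(column, name, datatype)
```

The position of each attribute in the returned list is the column index that its values would occupy in the data section.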


ARFF - Data Types

  • The <datatype> can be any of the types:

    • Numeric: can be real or integer numbers.

      • integer is treated as numeric

      • real is treated as numeric

    • Nominal

    • String

    • Date

  • The keywords numeric, real, integer, string and date are case insensitive.
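
The four data types and the case-insensitive keywords can be sketched in Python as below. This is an illustration only, not WEKA code: the function name classify_datatype is hypothetical, and the sketch assumes that a nominal type is written as a brace-enclosed, comma-separated list of values without quoted entries.

```python
def classify_datatype(datatype):
    """Map an ARFF <datatype> string to (kind, detail).

    kind is one of "numeric", "nominal", "string", "date".
    detail is the value list for nominal, the optional format for date,
    and None otherwise.
    """
    text = datatype.strip()
    if not text:
        raise ValueError("empty datatype")
    # Assumption: nominal types are written as {value1,value2,...}
    if text.startswith("{") and text.endswith("}"):
        values = [value.strip() for value in text[1:-1].split(",")]
        return "nominal", values
    keyword = text.split()[0].lower()  # keywords are case insensitive
    if keyword in ("numeric", "real", "integer"):
        return "numeric", None  # integer and real are treated as numeric
    if keyword == "string":
        return "string", None
    if keyword == "date":
        date_format = text[len("date"):].strip() or None
        return "date", date_format
    raise ValueError("unknown datatype: " + datatype)


print(classify_datatype("NUMERIC"))
print(classify_datatype("Integer"))
print(classify_datatype("date yyyy-MM-dd"))
```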


ARFF - Data Types Example

  • @ATTRIBUTE sepallength NUMERIC

  • @ATTRIBUTE class {Iris-setosa,Iris-versicolor,Iris-virginica}

  • @ATTRIBUTE LCC string

  • @attribute <name> date [<date-format>]

    default format: yyyy-MM-dd'T'HH:mm:ss
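
The default date format appears to follow Java's date pattern conventions. For readers working outside Java, a value in that format could be parsed in Python roughly as follows. The strptime pattern is a hand translation and therefore an assumption, as are the function name parse_arff_date and the sample timestamp.

```python
from datetime import datetime

# Assumed Python equivalent of the default pattern yyyy-MM-dd'T'HH:mm:ss
DEFAULT_ARFF_DATE_FORMAT = "%Y-%m-%dT%H:%M:%S"


def parse_arff_date(value, fmt=DEFAULT_ARFF_DATE_FORMAT):
    """Parse one date value; raises ValueError if it does not match fmt."""
    return datetime.strptime(value.strip(), fmt)


# The timestamp below is made up for illustration.
parsed = parse_arff_date("2007-03-15T12:30:00")
print(parsed.year, parsed.month, parsed.day, parsed.hour, parsed.minute)
```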


ARFF - Data Section

  • The ARFF Data section of the file contains the data declaration line and the actual instance lines.

    • The @data Declaration

      The @data declaration is a single line denoting the start of the data segment in the file.

    • The instance data

      • Each instance is represented on a single line.

      • Attribute values are delimited by commas.

      • The values must appear in the same order as the attribute declarations in the header section.

      • Missing values are represented by a single question mark.

      • Values of string and nominal attributes are case sensitive, and any value that contains a space must be quoted.
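
The rules for instance lines can be sketched as a small Python splitter. This is a simplified illustration rather than WEKA's parser: the name split_instance is hypothetical, both single and double quotes are accepted as an assumption, escape sequences inside quoted values are not handled, and the sample line is made up.

```python
def split_instance(line):
    """Split one instance line into values; an unquoted '?' becomes None."""
    values = []
    current = []        # characters of the value being read
    quote = None        # the quote character that is currently open, if any
    was_quoted = False  # True once the current value has used quotes

    def finish():
        token = "".join(current)
        if was_quoted:
            return token  # quoted values are kept exactly as written
        token = token.strip()
        return None if token == "?" else token  # '?' marks a missing value

    for ch in line.strip():
        if quote is not None:
            if ch == quote:
                quote = None
            else:
                current.append(ch)  # commas and spaces are literal inside quotes
        elif ch in ("'", '"'):
            quote = ch
            was_quoted = True
        elif ch == ",":
            values.append(finish())
            current = []
            was_quoted = False
        elif ch.isspace() and (was_quoted or not current):
            continue  # skip padding around a value
        else:
            current.append(ch)
    values.append(finish())
    return values


row = split_instance("5.1,?,'Iris setosa',Iris-virginica")
print(row)
```

Values are returned as plain strings and are never lower-cased, so Iris-setosa and iris-setosa remain different values, in line with the case sensitivity noted above.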


Create an ARFF file
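
One way to produce an ARFF file without the WEKA tools is to write the header and data sections as plain text. The Python sketch below assumes the layout described in the previous slides. The names to_arff and quote_value are hypothetical, the relation name iris and the two data rows are made up for illustration, and the quoting rule is simplified to wrapping values that contain a space in single quotes.

```python
def quote_value(value):
    """Format one value for the data section (simplified quoting)."""
    if value is None:
        return "?"  # missing values are written as a single question mark
    text = str(value)
    # Assumption: single quotes are enough; embedded quotes are not escaped.
    return "'" + text + "'" if " " in text else text


def to_arff(relation, attributes, rows):
    """Build ARFF text from a relation name, (name, datatype) pairs and rows."""
    lines = ["@relation " + relation, ""]
    for name, datatype in attributes:
        lines.append("@attribute " + name + " " + datatype)
    lines.append("")
    lines.append("@data")
    for row in rows:
        if len(row) != len(attributes):
            raise ValueError("row length does not match the number of attributes")
        lines.append(",".join(quote_value(value) for value in row))
    return "\n".join(lines) + "\n"


# Illustrative values only; they are not taken from the presentation.
arff_text = to_arff(
    "iris",
    [
        ("sepallength", "NUMERIC"),
        ("class", "{Iris-setosa,Iris-versicolor,Iris-virginica}"),
    ],
    [[5.1, "Iris-setosa"], [None, "Iris-virginica"]],
)
print(arff_text)
```

The resulting text can be saved with a .arff extension and opened in a text editor to check it against the header and data rules above.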



WEKA 3.5.5


Program

  • LogWindow: Opens a log window that captures everything printed to stdout or stderr. Useful in environments such as MS Windows, where WEKA is not started from a terminal.

  • Exit: Closes WEKA.


Program - LogWindow


Applications

  • Explorer: An environment for exploring data with WEKA.

  • Experimenter: An environment for performing experiments and conducting statistical tests between learning schemes.

  • KnowledgeFlow: Supports essentially the same functions as the Explorer, but with a drag-and-drop interface. One advantage is that it supports incremental learning.

  • SimpleCLI: Provides a simple command-line interface that allows direct execution of WEKA commands on operating systems that do not provide their own command-line interface.


Tools

  • ArffViewer: An MDI application for viewing ARFF files in spreadsheet format.

  • SqlViewer: An SQL worksheet for querying databases via JDBC.

  • EnsembleLibrary: An interface for generating setups for Ensemble Selection.


ArffViewer


SqlViewer


EnsembleLibrary


Visualization

  • Plot: Draws a 2D plot of a dataset.

  • ROC: Displays a previously saved ROC curve.

  • TreeVisualizer: Displays directed graphs, e.g., a decision tree.

  • GraphVisualizer: Visualizes XML BIF or DOT format graphs, e.g., for Bayesian networks.

  • BoundaryVisualizer: Visualizes classifier decision boundaries in two dimensions.


Windows

  • Minimize: Minimizes all current windows.

  • Restore: Restores all minimized windows.

