
Clustering in R

Clustering in R. Xue li CS548 showcase. Source. http://www.statmethods.net/advstats/cluster.html http://www.r-project.org/ http://cran.r-project.org/web/packages/cluster/index.html http://cran.r-project.org/web/packages/. Introduction to R.


Presentation Transcript


  1. Clustering in R • Xue Li • CS548 showcase

  2. Source • http://www.statmethods.net/advstats/cluster.html • http://www.r-project.org/ • http://cran.r-project.org/web/packages/cluster/index.html • http://cran.r-project.org/web/packages/

  3. Introduction to R • R is a free software programming language and software environment for statistical computing and graphics (from Wikipedia) • Two main audiences: statisticians and data miners • Two main applications: developing statistical tools and data analysis

  4. • If you already know another programming language, R will be easy to pick up • If you do not, R is a good place to start

  5. Packages and functions • List of available CRAN packages by name: http://cran.r-project.org/web/packages/available_packages_by_name.html
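As a minimal sketch of how a CRAN package is typically brought into a session, the snippet below loads the "cluster" package and lists a few of the functions it exports. The install line is commented out on the assumption that "cluster" is already present, since it normally ships with R as a recommended package.

```r
# Install once from CRAN if the package is missing (needs network access):
# install.packages("cluster")

# Load the package in every session that uses it.
library(cluster)

# Inspect what the package provides.
exported <- ls("package:cluster")
head(exported)
```

The same two steps (install once, load per session) apply to other packages such as "fpc", which usually has to be installed first.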

  6. Clustering • Packages: “cluster”, “fpc”… • Functions: “kmeans”, “dist”, “daisy”, “hclust”…
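The four functions named above can be tried on R's built-in iris data. This is only an illustrative sketch: the choice of dataset, the three clusters, and the linkage method are assumptions, not taken from the slides. kmeans(), dist() and hclust() come from the stats package that loads with R, while daisy() comes from "cluster".

```r
library(cluster)                       # provides daisy()

x <- scale(iris[, 1:4])                # numeric attributes, standardized

d_euclid <- dist(x)                    # Euclidean distances between rows
d_daisy  <- daisy(iris[, 1:4], metric = "manhattan")  # alternative dissimilarity

set.seed(1)                            # k-means starts from random centers
km <- kmeans(x, centers = 3, nstart = 10)

hc <- hclust(d_euclid, method = "average")  # agglomerative clustering
```

Here km$cluster holds one cluster label per row, and hc can be passed to plot() to draw a dendrogram.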

  7. Main steps • Data preparation (missing values, nominal attributes…) • K-means • Hierarchical clustering • Plotting/Visualization • Validation/Evaluation
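The steps above can be sketched end to end as follows. The iris dataset, the artificially inserted missing value, k = 3, and Ward linkage are all assumptions made for the demonstration, and comparing clusters against the known species labels is just one possible way to validate.

```r
# 1. Data preparation: drop incomplete rows, keep numeric attributes.
dat <- iris
dat[5, "Sepal.Length"] <- NA           # artificial missing value for the demo
dat <- na.omit(dat)                    # simplest treatment: remove the row
x <- scale(dat[, 1:4])                 # standardize the numeric columns

# 2. K-means
set.seed(42)
km <- kmeans(x, centers = 3, nstart = 25)

# 3. Hierarchical clustering, cut into the same number of groups
hc <- hclust(dist(x), method = "ward.D2")
hc_groups <- cutree(hc, k = 3)

# 4. Plotting (only in an interactive session, so batch runs create no files)
if (interactive()) {
  plot(x[, 1:2], col = km$cluster, main = "k-means clusters")
  plot(hc, labels = FALSE, main = "Ward dendrogram")
}

# 5. Validation: cross-tabulate cluster labels against the known classes
km_table <- table(kmeans = km$cluster, species = dat$Species)
hc_table <- table(hclust = hc_groups, species = dat$Species)
print(km_table)
print(hc_table)
```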

  8. Disadvantages • Cannot handle nominal attributes and missing values directly • Does not provide an evaluation matrix directly
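One possible workaround, sketched here under assumptions, is to compute Gower dissimilarities with daisy(), which accepts factors and missing values, and to build the evaluation by hand with silhouette() from the same package. The toy data frame and its column names are hypothetical.

```r
library(cluster)

# Hypothetical mixed-type data with a nominal column and missing values.
df <- data.frame(
  height = c(150, 152, NA, 180, 183, 178),
  weight = c(50, 52, 51, 80, 85, 79),
  colour = factor(c("red", "red", "red", "blue", "blue", NA))
)

# Gower dissimilarity uses only the attributes present in both rows.
d <- daisy(df, metric = "gower")

# Hierarchical clustering works directly on the dissimilarity object.
hc <- hclust(d, method = "average")
groups <- cutree(hc, k = 2)

# Manual evaluation: average silhouette width (closer to 1 is better).
sil <- silhouette(groups, d)
avg_width <- summary(sil)$avg.width
print(groups)
print(avg_width)
```

Note that kmeans() itself still needs complete numeric input, so this route relies on dissimilarity-based methods instead.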

  9. Advantages • Can handle large datasets • We can write our own functions (easier than writing Java for Weka)
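As an illustration of how little code a custom function needs, the sketch below defines a hypothetical helper, wss_by_k (not part of any package), that runs k-means for several values of k and returns the total within-cluster sum of squares for each. Such a curve is often used to choose k with the "elbow" heuristic.

```r
# Hypothetical helper: total within-cluster sum of squares for each k.
wss_by_k <- function(x, ks = 1:6, nstart = 10) {
  sapply(ks, function(k) {
    kmeans(x, centers = k, nstart = nstart)$tot.withinss
  })
}

set.seed(7)
x <- scale(iris[, 1:4])                # standardized numeric attributes
wss <- wss_by_k(x)
print(round(wss, 1))
```

Plotting wss against k with plot(1:6, wss, type = "b") would show where adding clusters stops paying off.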
