MATLAB/R Dictionary
R meetup NYC, January 7, 2010
Harlan Harris, harlan@harris.name, @HarlanH
Marck Vaisman, marck@vaisman.us, @wahalulu
MATLAB and the MATLAB logo are registered trademarks of The MathWorks.
About MATLAB

What is MATLAB
• Commercial numerical programming language with simulation and visualization
• One million users (engineers, scientists, academics)
• MATrix LABoratory: specializes in matrix operations
• MathWorks: base product & add-ons
• Open-source Octave project

MATLAB History
• Developed by Cleve Moler (Math/CS professor at UNM) in the 1970s as a higher-level numerical programming language (vs. Fortran LINPACK)
• Adopted by engineers for signal processing and control modeling
• Multipurpose programming language
Notes
• Today’s focus: compare MATLAB and R for data analysis, and contrast them as programming languages
• MATLAB is a base product plus many toolboxes
  • Base includes: descriptive statistics, covariance and correlation, linear and nonlinear regression
  • Statistics toolbox adds: dataset and categorical arrays (like data.frames and factors), more visualizations, distributions, ANOVA, multivariate regression, hypothesis tests
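For comparison, the base-level features listed above (descriptive statistics, correlation, linear regression) and the data.frame/factor structures are all available in base R. The sketch below uses a small made-up data set; the column names and values are hypothetical and only there for illustration.

```r
# Hypothetical toy data: weight is an exact linear function of height.
df <- data.frame(
  height = c(1.5, 1.6, 1.7, 1.8),
  weight = c(50, 58, 66, 74),
  group  = factor(c("a", "a", "b", "b"))  # factor = categorical variable
)

mean_height <- mean(df$height)            # descriptive statistic
hw_cor <- cor(df$height, df$weight)       # correlation
fit <- lm(weight ~ height, data = df)     # linear regression
slope <- unname(coef(fit)["height"])

levels(df$group)                          # the categories of the factor
```

In MATLAB the data.frame and factor parts of this would need the Statistics toolbox, per the notes above.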
• Interactive programming: scripts and a read-evaluate-print loop
• Similar representations of data
  • Both use vectors/arrays as the primary data structures
  • MATLAB is based on 2-D matrices; R is based on 1-D vectors
• Both prefer vectorized functions to for loops
• Variables are dynamically typed and need no declaration
• Most MATLAB functionality can be done in R, and most R functionality in MATLAB
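A minimal R sketch of two of the points above: a vectorized expression replacing a for loop, and the fact that a plain R vector is 1-D until it is given dimensions. The variable names are made up for illustration.

```r
x <- c(1, 2, 3, 4)

# Explicit loop: accumulate the sum of squares element by element.
loop_total <- 0
for (xi in x) {
  loop_total <- loop_total + xi^2
}

# Vectorized: x^2 squares every element, sum() adds them up.
vec_total <- sum(x^2)

# A plain vector has no dim attribute; matrix() adds one.
vector_is_1d <- is.null(dim(x))
m <- matrix(x, nrow = 2)   # filled column by column
dim(m)
```

In MATLAB the same `x` would already be a 1-by-4 matrix, which is the 2-D versus 1-D difference noted above.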
Example: k-means clustering of Fisher Iris data

Fisher Iris Dataset
sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,setosa
4.9,3.0,1.4,0.2,setosa
4.7,3.2,1.3,0.2,setosa
4.6,3.1,1.5,0.2,setosa
…
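The code from this slide did not survive in the transcript, so the following is only a sketch of how the example might look in R, not the presenters' version. It assumes R's built-in `iris` data frame (whose columns are named `Sepal.Length` etc. rather than the CSV headers shown above) and the `kmeans` function from the standard stats package; the choice of three centers, the seed, and `nstart = 10` are assumptions.

```r
set.seed(42)                       # k-means starts from random centers

features <- iris[, 1:4]            # the four numeric measurement columns
fit <- kmeans(features, centers = 3, nstart = 10)

# Cross-tabulate the cluster assignments against the known species.
table(cluster = fit$cluster, species = iris$Species)
```

The cluster numbers themselves are arbitrary labels; only the grouping matters when comparing against species.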
Functions

R:
    minmax <- function(z, opt = 12) {
      # functions are assigned to variables;
      # opt shows a default argument value (unused here)
      ret <- list(min = min(z), max = max(z))
      ret  # last statement is the return value
    }

    # if minmax was created in the current environment
    x <- minmax(c(1, 30, 3))
    smallest <- x$min

MATLAB:
    function [a, b] = minmax(z)
      % one externally visible function per .m file!
      % assign to the formal return names
      a = min(z);
      b = max(z);
    end

    % if minmax.m is on the path
    [smallest, largest] = minmax([1 30 3]);
Object-Oriented Programming

MATLAB
• Formerly: objects were defined by a directory tree, with one method per file
• As of 2008: new classdef syntax resembles other languages

R
• S3 classes: a class attribute plus a function-naming convention
  • class(object)
  • plot.lm()
• S4 classes: formal definitions + methods
• R.oo, proto, etc.
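To make the S3/S4 distinction concrete, here is a minimal sketch in R. The class names (`temperature`, `Point`) and the generic `norm2` are hypothetical, invented for illustration. The S3 half relies on the naming convention (`print.temperature` is found because the object's class attribute is `"temperature"`); the S4 half declares the class and its method formally.

```r
library(methods)  # S4 support; already attached in most R sessions

# S3: an object is an ordinary value with a class attribute.
new_temperature <- function(celsius) {
  structure(list(celsius = celsius), class = "temperature")
}

# S3 method for the print() generic, located by its name.
print.temperature <- function(x, ...) {
  cat("Temperature:", x$celsius, "C\n")
  invisible(x)
}

t1 <- new_temperature(21.5)
print(t1)

# S4: the class, generic, and method are declared explicitly.
setClass("Point", representation(x = "numeric", y = "numeric"))
setGeneric("norm2", function(obj) standardGeneric("norm2"))
setMethod("norm2", "Point", function(obj) sqrt(obj@x^2 + obj@y^2))

p <- new("Point", x = 3, y = 4)
norm2(p)
```

This is the same dispatch mechanism that makes `plot()` call `plot.lm()` for an object of class `"lm"`.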
Other notes
• R.matlab package
• Graphics
  • MATLAB has much better 3-D/interactive graphics support
  • R has ggplot2 and much better statistical graphics
Additional Resources
• Will Dwinnell, Data Mining in MATLAB
• Computerworld article on Cleve Moler
• MathWorks
• MATLAB Central
• Comparison of Data Analysis packages (http://anyall.org/blog/2009/02/comparison-of-data-analysis-packages-r-matlab-scipy-excel-sas-spss-stata/)
• R.matlab package
• Stack Overflow
References used for this talk
• David Hiebeler, MATLAB/R Reference document: http://www.math.umaine.edu/~hiebeler/comp/matlabR.html
• http://www.cyclismo.org/tutorial/R/index.html
• http://www.stat.berkeley.edu/~spector/R.pdf
• MATLAB documentation
• http://www.r-cookbook.com/node/23