Relevant software and getting it installed
Relevant software and getting it installed. Peter Fox Data Analytics – ITWS-4963/ITWS-6965 Week 1b, January 24, 2014. Admin info (keep/ print this slide). Class: ITWS-4963/ITWS 6965 Hours: 12:00pm-1:50pm Tuesday/ Friday Location: SAGE 3101 Instructor: Peter Fox


Presentation Transcript


Relevant software and getting it installed.

Peter Fox

Data Analytics – ITWS-4963/ITWS-6965

Week 1b, January 24, 2014


Admin info (keep/print this slide)

  • Class: ITWS-4963/ITWS-6965

  • Hours: 12:00pm-1:50pm Tuesday/Friday

  • Location: SAGE 3101

  • Instructor: Peter Fox

  • Instructor contact: [email protected], 518.276.4862 (do not leave a msg)

  • Contact hours: Monday** 3:00-4:00pm (or by email appt)

  • Contact location: Winslow 2120 (sometimes Lally 207A announced by email)

  • TA: Lakshmi Chenicheri [email protected]

  • Web site: http://tw.rpi.edu/web/courses/DataAnalytics/2014

    • Schedule, lectures, syllabus, reading, assignments, etc.


Today

  • Install application software

  • Get some data and read, explore, etc.

  • Install data technology and related software


GNU R

  • RStudio – http://www.rstudio.com/ide/download/ (see R-intro.html in the manuals)

    • Manuals - http://cran.r-project.org/doc/manuals/

    • Libraries – at the command line use library(), or select the Packages tab and check/uncheck as needed

    • http://cran.r-project.org/doc/manuals/R-lang.html


SciPy/NumPy/IPython (NB)

  • Windows/Linux

    • http://scipy.org/install.html

  • If you have a Mac

    • Anaconda – http://continuum.io/downloads (preferred)

      • Use Launcher to install Spyder (and the IPython Qt console)

    • Do you have MacPorts installed? Check with '$ which port'

    • No? (sorry – ask me for details…)

      • Install Xcode (from http://developer.apple.com/download - you will need to register - academic)

      • http://www.macports.org/install.php

  • Also see the individual packages on the install page.

  • http://scipy.org/getting-started.html
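
Once the installers have run, a quick check from Python can confirm which packages are importable. This is an illustrative sketch rather than part of the official install instructions. The helper name check_packages is hypothetical, and the code assumes each package exposes a __version__ attribute (it reports "unknown" when one does not).

```python
import importlib


def check_packages(names):
    """Return {name: version string, or None if the import fails}."""
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = None  # not installed, or the install is broken
        else:
            # Not every module defines __version__, so fall back to a marker.
            report[name] = getattr(module, "__version__", "unknown")
    return report


# Import names for the packages mentioned on this slide.
for name, version in check_packages(["numpy", "scipy", "IPython"]).items():
    print(name, "->", version if version is not None else "NOT INSTALLED")
```

Run it in Spyder or from a terminal; anything reported as NOT INSTALLED is worth revisiting on the install page.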


Matlab

  • http://dotcio.rpi.edu/services/software-labs

  • Student version

  • The license works within the RPI network, so you may need the VPN when outside it

  • R for Matlab users: http://mathesaurus.sourceforge.net/octave-r.html


Files

  • http://escience.rpi.edu/data/DA

  • This is where the files for assignments and exercises will be placed


Exercises – getting data in

  • RStudio

    • Read in a CSV file (there are two ways to do this) - GPW3_GRUMP_SummaryInformation_2010.csv

    • Read in an Excel file (directly, or after converting it to CSV) - 2010EPI_data.xls (2010EPI_data tab)

    • See if you can plot some variables

    • Anything in common between them?


Exercises

  • SciPy

    • In Spyder, read in a MATLAB file:

      • import scipy.io as sio

      • mat_contents = sio.loadmat('Williams40.mat')

      • mat_contents

      • Explore – plot, etc.

    • Read in a CSV file (your choice)

    • Write it out as a MATLAB file, i.e. with sio.savemat (see the File I/O help: http://docs.scipy.org/doc/scipy/reference/tutorial/io.html)

    • http://docs.scipy.org/doc/scipy/reference/tutorial/stats.html - start looking
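
The CSV-in, MATLAB-file-out steps of this exercise can be sketched as below. This is a minimal illustration, not the course solution: it assumes a purely numeric CSV with a single header row, the helper names csv_to_mat and list_mat_variables are hypothetical, and the demo writes its own tiny CSV to a temporary directory instead of using the course data files (whose contents are not shown here).

```python
import os
import tempfile

import numpy as np
import scipy.io as sio


def csv_to_mat(csv_path, mat_path, var_name="data"):
    """Read a numeric CSV and save it as one variable in a .mat file."""
    # Assumption: exactly one header row and only numeric columns.
    table = np.genfromtxt(csv_path, delimiter=",", skip_header=1)
    sio.savemat(mat_path, {var_name: table})
    return table


def list_mat_variables(mat_path):
    """Return {variable name: array shape} for a .mat file."""
    contents = sio.loadmat(mat_path)
    # loadmat adds bookkeeping keys such as __header__; skip those.
    return {key: value.shape for key, value in contents.items()
            if not key.startswith("__")}


# Demo on a throwaway CSV so the sketch runs without the course files.
with tempfile.TemporaryDirectory() as workdir:
    csv_path = os.path.join(workdir, "example.csv")
    mat_path = os.path.join(workdir, "example.mat")
    with open(csv_path, "w") as handle:
        handle.write("x,y\n1.0,2.0\n3.0,4.0\n5.0,6.0\n")
    csv_to_mat(csv_path, mat_path)
    print(list_mat_variables(mat_path))
```

Calling list_mat_variables on a file such as Williams40.mat is a quick way to see which arrays it holds before plotting anything.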


Exercises

  • Matlab

    • Read in two different datasets:

      • sw40_30s.mat or sw29adcp.mat

      • UChicago30.mat or Williams40.mat

    • Explore them…

    • Read in the CSV files


If time or for fun…

  • se_eqs.xls

    • Plot it

    • Fit it

  • PRESSURE.xls

    • Plot it

    • Smooth it

    • Fit it …
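
One possible reading of "smooth it" and "fit it", sketched with NumPy only. The slides do not say which smoothing or fitting method is intended, so the moving average and straight-line least-squares fit here are assumptions, as are the helper names. The data are synthetic because the contents of se_eqs.xls and PRESSURE.xls are not shown; plotting is left out to keep the sketch free of extra dependencies.

```python
import numpy as np


def moving_average(y, window=5):
    """Smooth y with an unweighted moving average (output is shorter)."""
    kernel = np.ones(window) / window
    # mode="valid" keeps only positions where the window fully overlaps y.
    return np.convolve(y, kernel, mode="valid")


def fit_line(x, y):
    """Least-squares straight line; returns (slope, intercept)."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return slope, intercept


# Synthetic stand-in data: a noisy line with slope 2 and intercept 1.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 101)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=x.size)

smoothed = moving_average(y, window=5)
slope, intercept = fit_line(x, y)
print("points before/after smoothing:", y.size, smoothed.size)
print("fitted slope and intercept:", slope, intercept)
```

With real spreadsheet columns loaded into x and y, the same two calls apply unchanged; a higher deg in np.polyfit gives a polynomial fit if a line is too crude.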


Install-fest… continues

  • http://projects.apache.org/indexes/category.html#database

    • Hadoop (MapReduce)

    • Pig (http://wiki.apache.org/pig/RunPig )

    • Hive (http://hive.apache.org/releases.html)

      • https://cwiki.apache.org/confluence/display/Hive/GettingStarted

      • https://cwiki.apache.org/confluence/display/Hive/Tutorial

      • https://cwiki.apache.org/confluence/display/Hive/LanguageManual

    • Cassandra (binaries from DataStax)

  • And MongoDB - http://www.mongodb.org/


Objective

  • Get a good feel for the complexity and maturity of the data and tools environments

  • See some real data and start to consider what it will take to work with it

  • Big and complex data means time and memory, and laptops can only do so much

  • We’ll soon look at the intersections like RHadoop: https://github.com/RevolutionAnalytics/RHadoop/wiki
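
To make the point about memory concrete, a back-of-envelope estimate helps before loading a dataset on a laptop. The sketch below assumes a dense table of 8-byte floating-point values with no overhead, which is a simplification; the function name and the example sizes are illustrative and do not describe the course datasets.

```python
def dense_array_gib(rows, cols, bytes_per_value=8):
    """Approximate in-memory size, in GiB, of a dense numeric table."""
    return rows * cols * bytes_per_value / 2**30


# Illustrative table sizes only.
for rows, cols in [(10**5, 50), (10**7, 50), (10**9, 50)]:
    print(rows, "rows x", cols, "cols:", round(dense_array_gib(rows, cols), 2), "GiB")
```

When the estimate approaches the machine's RAM, that is the cue to sample the data or move to the distributed tools listed in the install-fest.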


No more reading this week

  • Complete the installs as best you can

  • Pick your preferred application and data software and read up on them, try some examples

