Introduction to Data Science – INFO 480 – Drexel University’s iSchool

Sean P. Goggins, PhD

April 30, 2013

Week Five

What is Data Science?
  • Storytelling
  • Database Theory – How you organize your data has a big influence on what you can do with it.
  • Agile Manifesto – Key thing is iterative development; it’s a technology value system.
  • Spiral Dynamics – What we view as fact and what we desire emerge from the data presented to us.

Credit: http://www.datascientists.net/what-is-data-science

Tonight
  • Share your transformation software on GitHub
  • Share how you approached the assignment with the class (individually)
    • Ask questions
    • Make sure you understand everyone’s approach
    • Help each other – the result, not the language or technique used to transform the data, is what matters
  • Use the network scripts from week one to transform your transformed data (that’s right!) into networks, working in groups of 3
Week Five
  • Software Sharing #1 (Share scripts produced in week 3 using an open source software configuration management tool).
    • Students will refine and then share their scripts with other students
    • Included in the assignment is a 500-word explanation of how their script could be improved, optimized, and adapted to other data of a similar type.
    • The “read me” file distributed with the script will explain to another user how to apply the script to the data distributed in assignment one, including specific technical details.
Using GitHub for Software Sharing
  • Creating a GitHub Account
  • Creating a GitHub Project
  • Using the GitHub Desktop client
  • Committing & Syncing
  • The Pull Request
  • Sharing Your Software!
    • For my repository:
    • Create a directory with your name under “student Files”
    • Put your assignment in there
    • Create a “pull request”
Discuss Homework
  • Analysis Questions. Write up a short essay with tables or graphs if needed to describe how you would:
    • Build a network using the scripts from week 1 against the mention connections, or against the Reply-To connections, in this sample data. What transformations are required? How would you filter the data? Use the actual data to ground your thinking. Feel free to write or modify the R code samples from the first two weeks to experiment. Some of you will be more comfortable doing this; some will be more comfortable addressing the question conceptually. This is OK.
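
For anyone who wants to experiment before answering, here is a minimal sketch of the kind of transformation the question asks about. It is written in Python rather than the R used in class, and it is not the week 1 script. The sample records and the field names `author`, `text`, and `reply_to` are hypothetical; the real sample data may be laid out differently.

```python
import re
from collections import Counter

# Hypothetical sample records. The field names "author", "text", and
# "reply_to" are assumptions, not the schema of the course data.
posts = [
    {"author": "ana", "text": "@ben have you tried the week 1 script?", "reply_to": None},
    {"author": "ben", "text": "Yes, thanks @ana and @cy", "reply_to": "ana"},
    {"author": "cy", "text": "No mentions here", "reply_to": "ben"},
    {"author": "ana", "text": "@ben one more question", "reply_to": None},
]

# A mention is assumed to look like "@name" inside the post text.
MENTION = re.compile(r"@(\w+)")


def mention_edges(records):
    """Count directed (author, mentioned user) pairs, skipping self-mentions."""
    edges = Counter()
    for rec in records:
        for target in MENTION.findall(rec["text"]):
            if target != rec["author"]:
                edges[(rec["author"], target)] += 1
    return edges


def reply_edges(records):
    """Count directed (author, replied-to user) pairs, skipping empty replies."""
    edges = Counter()
    for rec in records:
        target = rec.get("reply_to")
        if target and target != rec["author"]:
            edges[(rec["author"], target)] += 1
    return edges
```

Both functions return a weighted edge list, which is one common input shape for network tools. Filtering here means dropping self-loops and posts without a reply target; other filters, such as a minimum edge count, are choices you would need to justify from the data.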

Individual Presentations

Informally by you!

Motivation

Underpants Gnomes

With much discourtesy from the US TV Program “South Park”

Group Informatics Described

Identify Key Information Brokers

Weight Connections Based on Time Distance, Grouped By Topic and informed by analysis of time distance between posts.

Methodological Approach
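
One way to picture "weight connections based on time distance, grouped by topic" is the rough Python sketch below. It is not the Group Informatics method itself. The exponential decay, the `half_life_hours` parameter, and the field names `author`, `topic`, and `time` are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime


def time_weighted_edges(records, half_life_hours=24.0):
    """Link each post's author to authors of earlier posts in the same topic.

    The weight halves every `half_life_hours` of gap between the two posts.
    This decay rule is an assumption chosen for illustration only.
    """
    by_topic = defaultdict(list)
    for rec in records:
        by_topic[rec["topic"]].append(rec)

    weights = defaultdict(float)
    for topic_posts in by_topic.values():
        # Sort a copy so the caller's records are left untouched.
        ordered = sorted(topic_posts, key=lambda r: r["time"])
        for i, later in enumerate(ordered):
            for earlier in ordered[:i]:
                if earlier["author"] == later["author"]:
                    continue  # ignore self-connections
                gap_hours = (later["time"] - earlier["time"]).total_seconds() / 3600
                weights[(later["author"], earlier["author"])] += 0.5 ** (gap_hours / half_life_hours)
    return dict(weights)


# Hypothetical posts: two in one topic a day apart, one in another topic.
sample = [
    {"author": "ana", "topic": "r-scripts", "time": datetime(2013, 4, 30, 9, 0)},
    {"author": "ben", "topic": "r-scripts", "time": datetime(2013, 5, 1, 9, 0)},
    {"author": "cy", "topic": "github", "time": datetime(2013, 4, 30, 10, 0)},
]
weights = time_weighted_edges(sample)
```

With these sample posts, only `ben` and `ana` share a topic, so they form the only edge. Authors who accumulate high total weight across many topics would be candidates for the "key information brokers" named above, though how to rank them is left open here.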

Week Six
  • Week 6: Sharing Data Preparation Results and Tools
    • Readings and Assignments Due:
    • Presentation involves sharing data with other people in a way that is visually insightful. Students will be asked to bring an example of a visualization of data from a website or news organization, and make a short presentation about what makes the visualization insightful.
    • Data Visualization Example Presentation
    • Chapters 4-7 of “The Anarchist in the Library: How the Clash Between Freedom and Control is Hacking the Real World and Crashing the System”.