Getting Started with Stata

2/11/2010

Tom Tomberlin

Nealia Khan

Learning Technologies Center

Harvard Graduate School of Education


Agenda

  • Overview of Stata

  • Getting Started

  • ‘Do’ files

  • Basic data cleaning

  • Basic data management

  • Beginning analysis

  • Special topics (time permitting)


Overview

Why use Stata?

  • Availability

  • Can self-program, or use menus

  • Cutting-edge statistical methods (including user-defined functions)

  • Publication-quality graphics


Stats and Graphics


Getting Started

  • A word about programming in and using Stata

  • Stata is case sensitive, so Myvar is different from myvar

  • All commands in Stata are lower-case

  • “and” = &, “or” = |, “not” = !

  • Assignment is “=”, testing for equality is “==”
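
The points above can be sketched in a few lines. This is an illustration only: it uses the auto dataset that ships with Stata rather than the workshop data, and the variable name cheap_foreign and the 5000 cutoff are invented for the example.

```stata
* Illustrative sketch on Stata's bundled auto dataset (not the workshop data)
sysuse auto, clear

* Names are case sensitive: price exists here, Price would be a different variable
summarize price

* "=" assigns a value; "==" tests for equality inside an if qualifier
* cheap_foreign and the 5000 cutoff are hypothetical
generate byte cheap_foreign = 0
replace cheap_foreign = 1 if foreign == 1 & price < 5000

* "|" means or, "!" means not
count if foreign == 1 | rep78 == 5
count if !missing(rep78)
```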


Windows in Stata


Getting Started

  • Opening Stata

  • Opening Data:

    • Stata formatted data

      • “use” command

    • Comma-separated values

      • “insheet using”

    • Tab-delimited values

      • “insheet using”

    • Flat files

      • Create a dictionary
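
As a rough sketch, the commands listed above might look like this in practice. All file names here are hypothetical placeholders. The slides date from 2010; in Stata 13 and later, insheet was superseded by import delimited, although insheet still runs.

```stata
* Hypothetical file names throughout; substitute your own paths

* Stata-formatted data
use "mydata.dta", clear

* Comma-separated values
insheet using "mydata.csv", comma clear

* Tab-delimited values
insheet using "mydata.txt", tab clear

* Flat file described by a dictionary file that you have written
infile using "mydata.dct", clear
```

The clear option drops whatever dataset is already in memory before the new file is read.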


Apply Your Knowledge

  • Exercise 1:

  • Open Stata

  • Using the insheet command, open the comma-separated values data file located at

    • F:\workshops\SATdata.csv

      • (HINT: all Stata commands must be written in lower case.

      • Don’t forget to put pathnames in quotes!)


Examining Data

  • Look at your data – did our data import correctly?

    • How are our data measured?

    • What kinds of variables do we have?

  • How would we describe the distribution of our data?

    • Graphs

      • Histograms

      • Scatterplots

    • Charts/Tables

      • Frequency tables

      • Cross-tabs


Looking at Data

  • There are several ways to look at our data in Stata

    • Editor

    • Browser

    • Stata commands

      • codebook

      • des

      • Tables of frequency and distribution

      • Graphs of distribution
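
A short sketch of these ways of looking at data, again using Stata's bundled auto dataset as a stand-in for the workshop file. The choice of variables is arbitrary.

```stata
* Illustrative sketch on the bundled auto dataset
sysuse auto, clear

* Variable names, storage types, and labels (des is an abbreviation of describe)
describe

* Ranges, missing-value counts, and example values for selected variables
codebook price rep78

* Open the read-only Data Browser; edit opens the Data Editor instead
browse

* A frequency table and a graph of a distribution
tabulate rep78
histogram price
```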


Examining Data

  • Let’s look at how the variable ‘csat’ is distributed

    • hist csat

    • tab csat


Do-files

What are do-files?

Do-files are essentially a list of all the commands that you wish to run and the settings that you would like to apply.

  • Why use them?

    • Replication

    • Collaboration

    • Audit trail

    • Help

  • How to create and run one


Do-files

  • Creating and running a do-file


Do files

  • EXERCISE 2: Create a simple do-file from the commands that you have already entered.

    (HINT: you must clear the data in memory before opening a new dataset.)
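
One possible shape for such a do-file is sketched below. The file names exercise2.do and exercise2.log are hypothetical; the data path and the csat variable come from the earlier slides. The log commands are an optional extra that supports the audit-trail use mentioned above.

```stata
* exercise2.do (hypothetical file name)

* Close any log left open by an earlier run, then start a fresh text log
capture log close
log using "exercise2.log", replace text

* Clear the data in memory before opening a new dataset
clear

* Re-run the commands entered interactively so far
insheet using "F:\workshops\SATdata.csv", comma
describe
histogram csat
tabulate csat

log close
```

To run it, type do exercise2.do in the Command window, or open the file in the Do-file Editor and run it from there.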


Basic Data Cleaning

  • Labeling

    • To label a variable: label var varname "label"

    • To label values:

      • label define labelname 1 "high" 0 "low"

      • label val varname labelname

  • Renaming

    • ren varname1 varname2

  • Recoding

    • recode varname oldvalue=newvalue

  • Generating a new variable

    • gen newvarname=somevalue

  • Replacing values of an already generated variable

    • replace newvarname=somevalue
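
The cleaning commands above, sketched on Stata's bundled auto dataset. The new names (high_mpg, hilo, repair_record, price_k) and the cutoffs are invented for illustration.

```stata
* Illustrative sketch on the bundled auto dataset; new names are hypothetical
sysuse auto, clear

* Label a variable
label variable mpg "Miles per gallon"

* Define a set of value labels, then attach it to a variable
generate byte high_mpg = (mpg > 25) if !missing(mpg)
label define hilo 1 "high" 0 "low"
label values high_mpg hilo

* Rename a variable
rename rep78 repair_record

* Recode one value into another (here, 5 becomes 4)
recode repair_record (5 = 4)

* Generate a new variable, then replace some of its values
generate price_k = price / 1000
replace price_k = 10 if price_k > 10 & !missing(price_k)
```

Label text needs straight double quotes; the curly quotes that slide software often substitutes will cause an error if pasted into Stata.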


Basic Data Management

  • Subsetting

    • keep

    • drop

    • if

  • Merging

    • merge

    • must sort both files by the linkage variable!

    • ex: merge linkage_var using "F:\workshops\newfile"
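
A minimal sketch of subsetting and merging, built from Stata's bundled auto dataset so that it runs without the workshop files. It is meant to be run as a do-file. The merge line uses the merge 1:1 form introduced in Stata 11, which is an assumption about your Stata version; the form shown on the slide, with no 1:1, is the older syntax and is the one that requires both files to be sorted first.

```stata
* Illustrative sketch; run as a do-file. Splits auto into two files, then merges them.
sysuse auto, clear

* First file: keep only the linkage variable and price
keep make price
sort make
tempfile prices
save `prices'

* Second file: subset rows with if, then drop a variable
sysuse auto, clear
keep make mpg foreign
keep if foreign == 1
drop foreign
sort make

* Merge on the linkage variable (make uniquely identifies rows in both files)
merge 1:1 make using `prices'

* _merge records whether each row came from memory, the using file, or both
tabulate _merge
```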


Basic Data Cleaning

  • EXERCISE 3:

    • Generate a dichotomous variable called hi_score from the csat variable, where 1 indicates a score greater than 922 and 0 indicates a score less than or equal to 922.

    • Label its values as 0 = low and 1 = high.
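
One way to approach this exercise is sketched below, assuming the SAT data from Exercise 1 are already in memory. The label name hilo is hypothetical. The if !missing(csat) qualifier matters because Stata treats missing values as larger than any number, so without it a missing csat would be coded as 1.

```stata
* Assumes the workshop SAT data (with csat) are in memory; hilo is a hypothetical label name

* 1 if csat > 922, 0 if csat <= 922, missing if csat is missing
generate byte hi_score = (csat > 922) if !missing(csat)

* Attach the value labels 0 = low, 1 = high
label define hilo 0 "low" 1 "high"
label values hi_score hilo

* Check the result
tabulate hi_score, missing
```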


Beginning Analysis

  • Univariate analysis

    • summarize

    • histogram

    • table

  • Bivariate analysis

    • tabulate

    • pwcorr

    • ttest
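
The commands above, sketched on Stata's bundled auto dataset. The variable choices are arbitrary and only show the general shape of each call.

```stata
* Illustrative sketch on the bundled auto dataset
sysuse auto, clear

* Univariate analysis
summarize price, detail
histogram price
table rep78

* Bivariate analysis
* Two-way table with a chi-squared test of independence
tabulate foreign rep78, chi2

* Pairwise correlations with significance levels
pwcorr price mpg weight, sig

* Two-sample t test of mean price across the two groups of foreign
ttest price, by(foreign)
```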


Apply Your Knowledge

  • EXERCISE 4:

  • Generate a histogram of the expense variable

  • Generate a two-way table to see whether the distribution of expense differs across the values of your newly created hi_score variable

  • If you have time, see if there is a significant correlation between scores on SATs and the average amount of money that each state spends on education.


Beginning Analysis

  • Multivariate models

    • Linear regression

      • regress depvar indepvar1 indepvar2 … indepvarN

    • Logistic regression

      • logit depvar indepvar1 indepvar2 … indepvarN
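
A concrete instance of each pattern, using Stata's bundled auto dataset. The choice of dependent and independent variables is for illustration only.

```stata
* Illustrative sketch on the bundled auto dataset
sysuse auto, clear

* Linear regression: continuous dependent variable listed first
regress price mpg weight

* Logistic regression: binary (0/1) dependent variable listed first
logit foreign mpg weight
```

In both commands the first variable after the command name is the dependent variable and the rest are independent variables.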


Apply Your Knowledge

  • Exercise 5:

    Generate two scatterplots: one to look at the relationship between expense and csat, and one to look at expense and hi_score.

    Depending on your estimation of the relationship (linear or not), run the appropriate regression to test for the relative effect of expense on either csat or hi_score
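
A possible starting point for this exercise, assuming the SAT data are in memory and hi_score has been created as in Exercise 3. Which model is appropriate is left to your reading of the plots; both are shown only to illustrate the syntax.

```stata
* Assumes the workshop SAT data are in memory and hi_score already exists

* Scatterplots: the y variable is listed first, then the x variable
scatter csat expense
scatter hi_score expense

* Continuous outcome: linear regression
regress csat expense

* Binary outcome: logistic regression
logit hi_score expense
```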


Thanks

Questions?

Gutman Library, room 323a&b

http://www.isites.harvard.edu/icb/icb.do?keyword=ltc

