Advanced Stata Workshop

Advanced Stata Workshop FHSS Research Support Center

Presentation Layout • Visualization and Graphing • Macros and Looping • Panel and Survey Data • Postestimation

Visualization and Graphing in Stata

Intro To Graphing In Stata “graph” is often optional. So is “twoway” in this case. Note: Nearly all graphing commands start with “graph”, and “twoway” is a large family of graphs.
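
As a sketch of the point above, assuming the auto dataset that ships with Stata (an assumption; the slide's own example is not shown here), the following three commands should draw the same scatter plot:

```stata
* Assumed example data: auto.dta, installed with Stata
sysuse auto, clear

* Full form: "graph" and "twoway" both typed out
graph twoway scatter mpg weight

* "graph" omitted
twoway scatter mpg weight

* "graph" and "twoway" both omitted
scatter mpg weight
```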

Creating Multiple Graphs with “by():” Note that the value label is displayed above the graphs, and the variable label is displayed in the bottom right hand corner.
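
A minimal sketch of by(), again assuming the auto dataset (the variable names are illustrative, not taken from the slide):

```stata
* Assumed example data: auto.dta, installed with Stata
sysuse auto, clear

* One scatter plot per value of foreign; each panel is titled with
* the value label of foreign
scatter mpg weight, by(foreign)
```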

Overlaying “twoway” graphs The || tells Stata to put the second graph on top of the first one – order matters! You don’t need to type “twoway” twice; it applies to both. This is another way of writing the command – it doesn’t matter which one you use.
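
The two ways of writing an overlay can be sketched as follows, assuming the auto dataset (an illustrative choice):

```stata
* Assumed example data: auto.dta, installed with Stata
sysuse auto, clear

* Form 1: || separates the plots; the linear fit is drawn on top of the scatter
twoway scatter mpg weight || lfit mpg weight

* Form 2: each plot in its own parentheses; same result
twoway (scatter mpg weight) (lfit mpg weight)
```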

"by()" statements with overlaid graphs “qfitci” is a type of graph which plots the prediction line from a quadratic regression, and adds a confidence interval. The “stdf” option specifies that the confidence interval be created on the basis of the standard error of the forecast. stdf is an option of qfitci. by(foreign) is an option of twoway.

"by()" statements with overlaid graphs Another way of writing the previous command is: So: This way is easier to read. This way is easier to type.

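A sketch of both notations for an overlaid graph with by(), assuming the auto dataset. Presumably the parenthesized form is the one that is easier to read and the || form is the one that is easier to type:

```stata
* Assumed example data: auto.dta, installed with Stata
sysuse auto, clear

* Parentheses notation: stdf belongs to qfitci, by(foreign) belongs to twoway
twoway (qfitci mpg weight, stdf) (scatter mpg weight), by(foreign)

* || notation: the final "||," hands by(foreign) to twoway as a whole
twoway qfitci mpg weight, stdf || scatter mpg weight ||, by(foreign)
```
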
Graphs with Many Options and Overlays You can make pretty impressive graphs just from code, if you overlay the graphs and specify certain options like: multiple axes, notes, titles and subtitles, axis titles and labels, and legends.

Code for Previous Graph This may look scary, but it is actually fairly straightforward. See the accompanying do-file for explanation of each component.
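
This is not the code for the workshop's graph, only a smaller sketch of the same kinds of options, assuming the auto dataset. All titles and labels are made up for illustration. Run it from a do-file, since /// line continuation does not work in the Command window:

```stata
* Assumed example data: auto.dta, installed with Stata
sysuse auto, clear

twoway (scatter mpg weight, yaxis(1))                /// plot 1 on the left axis
       (lfit price weight, yaxis(2)),                /// plot 2 on a second y axis
       title("Mileage and price by weight")          ///
       subtitle("1978 automobile data")              ///
       ytitle("Mileage (mpg)", axis(1))              ///
       ytitle("Fitted price (USD)", axis(2))         ///
       xtitle("Weight (lbs.)")                       ///
       xlabel(2000(1000)5000)                        ///
       note("Source: auto.dta shipped with Stata")   ///
       legend(order(1 "Observed mileage" 2 "Linear fit of price"))
```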

Using the Graph Editor It is often easier to make changes in the graph editor than to specify all the options in code. Let’s make graph 1 into graph 2 by using the graph editor tools.

Recording Edits in the Graph Editor Before you start making changes, click the record button. After you are done, click it again, and save your changes as a recording so you can “play” them back later. We will save this recording as advanced_workshop_1.

Play Your Graph Recording You can create a graph, open the graph editor, click the green play button, and then play back your recorded edits. Or, you can play your edits right from the code: You can run your recorded edits on a graph of a different type, though in this case not all of your edits will make sense: You can also run all of your recorded edits on a different graph, and just change the title:
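
A sketch of playing a recording from code. It assumes a recording named advanced_workshop_1 has already been saved where Stata can find it; the auto dataset and the variables are illustrative:

```stata
* Assumed example data: auto.dta, installed with Stata
sysuse auto, clear

* Draw a graph and apply the recorded edits in one step
* (fails if the recording advanced_workshop_1 does not exist)
scatter mpg weight, play(advanced_workshop_1)

* Or draw the graph first, then play the recording on the current graph
scatter mpg weight
graph play advanced_workshop_1

* The same recording on a different type of graph; some edits may not apply
histogram mpg, play(advanced_workshop_1)
```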

Storing and Moving Your Recordings Graph recordings are stored as .grec files in your “personal” folder, under the “grec” folder. Type “personal” to see where this is; normally it is C:\ado\personal. So by default Stata should store your .grec files in C:\ado\personal\grec. Unfortunately, if you are not faculty, you are probably using lab computers to use Stata, and when they are re-imaged, you will lose the files in your grec folder. So you can store the recordings on your flash drive by clicking the Browse button when you save your recording. Now, when you are in the graph editor and click the play button, your recording will not appear in the list because it is not stored where Stata knows to look for it. Never fear, just click Browse, and navigate to where your .grec file is. If you want your recording to be available right from code, as in play(advanced_workshop_1), you will need to move it (at least temporarily) to the “grec” folder, or write the directory location in the code: play(E:\flashdrive\Graph Recordings\advanced_workshop_1)

Using Schemes in Graphing Recordings are great if you are going to be making the same kind of graph a lot. But a recording for a scatter plot will hardly affect a histogram at all, and might even make it look terrible. If you want to change the look of all graphs that you make, you may want to make a scheme. Schemes are text files which tell Stata how to draw graphs.

More on Schemes Schemes are very powerful, because they let you implement a certain look without specifying a long series of options in every graph, or running every graph through the graph editor. However, creating schemes is fairly time-consuming. For more on creating your own schemes, see: http://www3.eeg.uminho.pt/economia/nipe/2010_Stata_UGM/papers/Rising.pdf and http://www.ats.ucla.edu/stat/stata/seminars/stata_graph/graphsem.txt
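
Before writing your own scheme, it may help to try the ones already installed. A sketch, assuming the auto dataset and the built-in schemes s1mono, economist, and s2color:

```stata
* Assumed example data: auto.dta, installed with Stata
sysuse auto, clear

* List the schemes installed on this machine
graph query, schemes

* Use a scheme for a single graph
scatter mpg weight, scheme(s1mono)

* Change the scheme for all graphs drawn from now on
set scheme economist
histogram mpg

* Switch back to s2color (the default in older Stata versions; newer ones may differ)
set scheme s2color
```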

Manipulating Graphs: Memory vs. Disk • When you draw a graph, it is stored in memory, under the name Graph. • If you draw another graph, it replaces the previous one in memory, and is now called Graph. • If you want to have multiple graphs up at the same time, you can use the name() option. • graph save writes a copy of your graph from memory to disk, saving it as a .gph file. • graph dir lists all graphs in memory and on disk (in the current directory). • graph drop drops a graph from memory. Graphs contain the data they represent, so if the dataset is large, they can actually take up quite a bit of memory.
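
The commands in this list can be sketched as follows, assuming the auto dataset; the graph names scatter1 and hist1 and the file name scatter1.gph are made up for illustration:

```stata
* Assumed example data: auto.dta, installed with Stata
sysuse auto, clear

* Keep two graphs in memory at once by naming them
scatter mpg weight, name(scatter1, replace)
histogram mpg, name(hist1, replace)

* List graphs in memory and .gph files in the current directory
graph dir

* Write the graph named scatter1 to disk
graph save scatter1 "scatter1.gph", replace

* Drop one graph from memory
graph drop hist1

* Load the saved graph back from disk and display it
graph use "scatter1.gph"
```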

Manipulating Graphs: Demo Graph manipulation commands are quite useful for exploratory analysis. See do file for code.

More Example Graphs Note: Annotated code is in the do file for all of these Histogram, with overlaid normal distribution
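
A one-line sketch of such a histogram, assuming the auto dataset rather than the workshop's do-file:

```stata
* Assumed example data: auto.dta, installed with Stata
sysuse auto, clear

* Histogram of mpg with a normal density curve overlaid
histogram mpg, normal
```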

More Example Graphs Use graph bar to make bar graphs
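
A minimal sketch, assuming the auto dataset (the variables are illustrative):

```stata
* Assumed example data: auto.dta, installed with Stata
sysuse auto, clear

* One bar per category of foreign, showing mean mpg
graph bar (mean) mpg, over(foreign)
```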

More Example Graphs Use graph combine to combine 3 graphs into one:
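
A sketch of combining three named graphs, assuming the auto dataset; the names g1, g2, and g3 and the title are made up for illustration:

```stata
* Assumed example data: auto.dta, installed with Stata
sysuse auto, clear

* Build three graphs in memory without displaying them
scatter mpg weight, name(g1, replace) nodraw
histogram mpg, name(g2, replace) nodraw
graph bar (mean) mpg, over(foreign) name(g3, replace) nodraw

* Put them side by side in one figure
graph combine g1 g2 g3, rows(1) title("Three views of mileage")
```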

More Example Graphs Graph matrix is a great alternative to a correlation matrix to investigate relationships between variables
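
A minimal sketch, assuming the auto dataset (the variable list is illustrative):

```stata
* Assumed example data: auto.dta, installed with Stata
sysuse auto, clear

* Scatter plot of every pair of variables; half shows only the lower triangle
graph matrix price mpg weight length, half
```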

More Example Graphs Get data labels (called marker labels in Stata) from the values of another variable
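
A minimal sketch, assuming the auto dataset, where make holds the name of each car:

```stata
* Assumed example data: auto.dta, installed with Stata
sysuse auto, clear

* Label each point with the value of make; restricted to foreign cars
* so the labels stay readable
scatter mpg weight if foreign == 1, mlabel(make)
```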

More Example Graphs Xtline from a panel data set can overlay lines for each value of panel variable.
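
A sketch, assuming the Grunfeld panel dataset available through webuse (an illustrative choice; it needs an internet connection):

```stata
* Assumed example data: Grunfeld investment panel from the Stata website
webuse grunfeld, clear

* Declare the panel structure: company is the panel variable, year the time variable
xtset company year

* One line per company, all drawn on the same axes; limited to four companies
xtline invest if company <= 4, overlay
```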

Macros • Macros come in two general types: • Globals • Exist until Stata is closed • Locals • Exist until the end of the do file • Other types of macros exist, but are rarely used

global vs. local Creating the global Creating the local - References to locals have to be enclosed in a left single quote (`) and a right single quote (') - References to globals have to begin with a $ End of the do file The local no longer exists Conversely, the global still exists
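
A sketch of creating and referencing both kinds of macro, assuming the auto dataset; the macro names myglobal and mylocal are made up for illustration. Run the whole block together from a do-file, since the local disappears when the do-file ends:

```stata
* Assumed example data: auto.dta, installed with Stata
sysuse auto, clear

* Create a global and a local, each holding a list of variable names
global myglobal "mpg weight"
local mylocal "price length"

* Reference the global with a leading $
summarize $myglobal

* Reference the local with a left quote (`) and a right quote (')
summarize `mylocal'
```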

When do we need “for” loops? • If a Stata program involves repetitive actions on a group of variables, files, or other items • Examples • Creating new variables • Recoding missing values on a list of variables • Merging multiple datasets • Labeling variables

Determining what macros already exist The local we created General macros automatically created by Stata The global we created
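
A sketch of listing macros, using the hypothetical names myglobal and mylocal. In the listing, locals appear with a leading underscore:

```stata
* Hypothetical macros created only for this illustration
global myglobal "mpg weight"
local mylocal "price length"

* List every macro currently defined: system macros, globals, and locals
macro list

* List only specific macros; a local is referred to as _name
macro list myglobal _mylocal
```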

Foreach • Syntax of foreach command • foreach lname {in | of varlist} variables { commands referring to `lname' } • The open brace must appear on the same line as the foreach; • Nothing may follow the open brace except, of course, comments; the first command to be executed must appear on a new line; • The close brace must appear on a line by itself

Differences in Using -in- option and -of varlist- option in the -foreach- command • foreach i in variable1-variable5 { Stata commands } • There is only one item in the list, the literal text “variable1-variable5” • foreach i of varlist variable1-variable5 { Stata commands } • There are five variables, variable1 through variable5
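
The difference can be seen by counting iterations. A sketch, assuming the auto dataset, in which price through weight are six adjacent variables; the counters n_of and n_in are made up for illustration:

```stata
* Assumed example data: auto.dta, installed with Stata
sysuse auto, clear

* of varlist: price-weight is expanded into the variables it spans
local n_of = 0
foreach v of varlist price-weight {
    display "of varlist item: `v'"
    local ++n_of
}

* in: price-weight is treated as one literal word, not as a variable range
local n_in = 0
foreach v in price-weight {
    display "in item: `v'"
    local ++n_in
}

display "of varlist looped `n_of' times; in looped `n_in' time"
```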

Stata commands in recoding variables
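
A sketch of recoding a missing-value code across several variables with foreach. The data are generated on the spot, and the variable names q1 and q2 and the code -9 are assumptions for illustration:

```stata
* Build a tiny illustrative dataset in which -9 stands for "missing"
clear
set obs 5
generate q1 = cond(_n == 2, -9, _n)
generate q2 = cond(_n == 4, -9, 10 * _n)

* Replace the code -9 with Stata's system missing value in each variable
foreach v of varlist q1 q2 {
    replace `v' = . if `v' == -9
}

list
```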

Using macros to store variable names Global for ind.vars

Global for ind.vars
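
A sketch of storing independent variables in a global, assuming the auto dataset; the name indvars and the models are made up for illustration:

```stata
* Assumed example data: auto.dta, installed with Stata
sysuse auto, clear

* Store the list of independent variables once
global indvars "weight length foreign"

* Reuse the same list in several models
regress mpg $indvars
regress price $indvars
```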

Running Parallel lists with macros Create a local called “1” Create local called “2” Create macro 3 = # of words in macro 1 Extracting word `I’ from local “1” Extracting word `I’ from local “2” Using the new locals in a display command with other text Results
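
The steps above can be sketched as follows. The contents of the two lists and the names a and b are made up for illustration. Run the block together from a do-file:

```stata
* Two parallel lists, stored in locals named 1 and 2
local 1 "mpg weight length"
local 2 "Mileage Weight Length"

* Local 3 holds the number of words in local 1
local 3 : word count `1'

* Walk through both lists in step
forvalues i = 1/`3' {
    local a : word `i' of `1'
    local b : word `i' of `2'
    display "Variable `a' is described as `b'"
}
```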

Creating a program in Stata Program name Command name First command to be run when the program is implemented Second command to be run when the program is implemented Telling Stata that there are no more commands to be used as part of the program

Invoke the program by simply typing the program name and then running in Stata. Results
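
A sketch of defining and invoking a small program; the name hello and its two commands are made up for illustration:

```stata
* Drop any earlier definition so the do-file can be rerun without error
capture program drop hello

* Define the program: name first, then the commands it runs, then end
program define hello
    display "Hello"
    display "from a Stata program"
end

* Invoke the program by typing its name
hello
```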

SVYset and SVY Prefix

Simple vs. Complex Sample • Many statistical techniques assume a simple random sample • Simple random sample: each element of the population has an equal probability of being sampled.

Complex Survey • Sampling weights • inverse probability of being sampled • represent the number of elements in the population that each sampled element stands for • Clustering • groups sampled together • primary sampling units (PSU): first-level clusters • Stratification • groups of clusters: strata • strata sampled separately

Example • States, Counties, Schools, Students • sample states in different regions • sample counties within each state • sample schools within each county • sample students from schools

svyset • svyset psu? [pweight=?], strata(?) fpc(?) || psu?, fpc(?) • psu = primary sampling unit • pweight = probability weight • fpc = finite population correction (total # of units in the stratum or cluster that the sampled units are drawn from) • || = next stage

SVYSET Examples • use http://www.stata-press.com/data/r12/multistage • svyset county [pw=sampwgt], strata(state) fpc(ncounties) || school, fpc(nschools) • save highschool • use highschool • svyset

SVY Prefix Examples • svy: proportion race sex • svy: tab race sex, ci • svy: tab race sex, count ci • svy, subpop(if sex==1): mean weight height • svy, subpop(if sex==2): mean weight height, over(race) • svy: reg weight sex Note: subpop() is preferred over an “if” statement because Stata then keeps all cases when estimating standard errors
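
To see the note in practice, one might run the same mean both ways and compare the standard errors. A sketch using the multistage dataset and svyset command from the previous slide (it needs an internet connection):

```stata
* Same data and design as on the svyset slide
use http://www.stata-press.com/data/r12/multistage, clear
svyset county [pw=sampwgt], strata(state) fpc(ncounties) || school, fpc(nschools)

* Preferred: the full sample is kept for variance estimation
svy, subpop(if sex==1): mean weight

* For comparison: "if" drops the other cases before estimation
svy: mean weight if sex==1
```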

Take-home Message • Ask what sampling design was used for your data before running any analysis. • If the data come from a complex survey, consider svyset or multilevel modeling.

xtset and xt Prefix

xtset: Declare Panel Data • xtset panelvar (specify the unit observed repeatedly) • xtset panelvar timevar [, tsoptions] (also specify the time variable) • xtset (display the current xtset settings) • xtset, clear (clear the xtset settings) Menu: Statistics > Longitudinal/panel data > Setup and utilities > Declare dataset to be panel data

Time-Unit Options • [unitoptions] specify units of time: clocktime, daily, weekly, monthly, quarterly, halfyearly, yearly… • [deltaoption] specify duration between observations: delta(#), e.g. delta(2); delta((exp)), e.g. delta((7*24)); delta(# units), e.g. delta(10 min) or delta(7 days)
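
A sketch of xtset with a delta option, assuming the nlswork dataset from Stata Press, with idcode as the panel variable and year as the time variable (it needs an internet connection):

```stata
* Assumed example data: NLS young women panel from Stata Press
use http://www.stata-press.com/data/r12/nlswork, clear

* Declare panel and time variables
xtset idcode year

* Display the current settings
xtset

* State explicitly that observations are one time unit apart
xtset idcode year, delta(1)
```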

xtdescribe: Pattern of xt Data • xtdescribe [if] [in] [, options] • Options: patterns(#), e.g. p(10), displays at most 10 patterns; width(#), e.g. w(80), displays 80 columns of the pattern • Menu: Statistics > Longitudinal/panel data > Setup and utilities > Describe pattern of xt data

Examples • use http://www.stata-press.com/data/r12/nlswork • xtset • browse • xtdes, p(20) • xtsum hours • xttab race • xtreg ln_w grade age ttl_exp tenure south, mle

Postestimation in Stata
