
Introduction to Stata



Presentation Transcript


  1. Introduction to Stata
     Max Perez Leon Quinoso, Brian Fried
     StatLab

  2. Exporting a Dataset
     • Create a folder named IntroStata on the desktop. Let's put all files in that folder.
     • Exporting is very simple: we can use StatTransfer (which usually comes with Stata) or export the data directly from Stata.

  3. Working Directory
     • Stata always has a default working directory.
     • Use cd to change the working directory to the IntroStata folder (the exact path will be different for each user):
         cd "C:\Users\MPLQ\Desktop\IntroStata"

  4. Calling a dataset
     • There are different ways to call a dataset. With the full path:
         use "C:\Users\MPLQ\Desktop\IntroStata\example1.dta"
     • If we have defined the working directory, we do not need to specify the whole path. Notice that we are also using the command "clear":
         clear
         use example1.dta
         clear
         insheet using example1.csv

  5. Examining data
     • Browsing/Describing data
         browse
         br
         br patient department
         edit
         list
         list patient age
         desc
         desc age
         desc score*
         codebook
         tab patient
         tab age
         tab department
         tab depart

  6. Examining data
         tab age survey
         sum score2000
         sum score2000, detail
         sum score2000, det
         sum score*
         br *t
         sort score2000
         gsort survey
         gsort - survey
         gsort -survey score2000

  7. Qualifiers
     • Using "in" and "if"
         browse in 1/4
         browse if age<=20
         br if age<20 | age>=40
         br if (age<15 | age>=40) & department == "FES"
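The qualifiers on this slide can be tried without the course files. The following is a minimal sketch that assumes Stata's bundled auto dataset (loaded with sysuse) in place of example1.dta; the variable names come from that dataset, not from the slides. It is meant to be run as one do file, because the local macros it defines do not survive between separate runs.

```stata
* Sketch on the bundled auto dataset (an assumption; not the slides' example1.dta)
sysuse auto, clear

* "in" restricts by observation number, "if" by a logical condition
list make price in 1/4
list make price mpg if mpg >= 35

* Combine conditions with | (or) and & (and); parentheses control the order
count if (mpg < 15 | mpg >= 35) & foreign == 1

* Missing numeric values are treated as larger than any number,
* so a ">=" condition also matches observations where rep78 is missing
count if rep78 >= 4
local with_missing = r(N)
count if rep78 >= 4 & !missing(rep78)
local without_missing = r(N)
local diff = `with_missing' - `without_missing'
display "Observations matched only because rep78 is missing: `diff'"
```

Adding & !missing(varname) to any "greater than" condition is a cheap habit that avoids this surprise.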

  8. Do files
     • It is not good practice to use the command window for our research. We should keep a file that stores all our commands and lets us run our procedures efficiently.
     • Important shortcuts:
         • "Ctrl+D" (visible: runs the commands and shows the output)
         • "Ctrl+R" (invisible: runs the commands without showing the output)
     • If we select some lines, the shortcuts will only run the commands in those lines. If we do not select any lines, they will run the whole do file.
     • Comments start with an asterisk.

  9. How do I usually start a Do file?
         clear
         set mem 100m
         set more off
         cd "C:\Users\MPLQ\Desktop\IntroStata"
         use example1

  10. Creating/Modifying variables
         generate doubscore = score2000 * 2
         gen av_score = (score2000 + score2001 + score2002)/3
         gen ones = 1
         gen indicator = score2000<.5
     • Missing values. Be aware that missing values are different from zero:
         gen small = av_score if av_score<0.4
         tab small
         tab small, m

  11. Creating/Modifying variables
     • Operations with missing values give missing values
         gen small_modify = small * 10
         replace small_modify = 0 if small_modify==.
         replace av_score = 1000 if av_score<0.5
     • Renaming variables
         rename doubscore score_double
     • Creating dummy variables
         tab survey, gen(Dsurvey)
         br
         br *survey*
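To see how missing values propagate through generate and replace, here is a small sketch. It assumes the bundled auto dataset rather than example1.dta, and the new variable names (cheap, cheap_k, good_repair) are made up for illustration.

```stata
* Sketch on the bundled auto dataset (an assumption; not the slides' data)
sysuse auto, clear

* Keep price only for cheaper cars; every other observation becomes missing (.)
generate cheap = price if price < 5000
count if missing(cheap)

* Arithmetic on a missing value stays missing
generate cheap_k = cheap / 1000
count if missing(cheap_k)

* Recode missing to zero only when zero is really what you mean
replace cheap_k = 0 if missing(cheap_k)

* A logical expression returns 0 or 1, but a missing rep78 would pass
* "rep78 >= 4", so restrict the indicator to non-missing observations
generate good_repair = rep78 >= 4 if !missing(rep78)
tabulate good_repair, missing
```

The last tabulate uses the missing option so that the observations left as missing are shown instead of silently dropped from the table.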

  12. Creating/Modifying variables
     • Egen
         egen mscore2000 = max(score2000)
         br *score2000
         egen Dscore2000 = max(score2000), by(department)
         br score2000 department Dscore2000
     • Bysort
         bysort department: tab score2000
         bysort survey: sum score2001
         gen index = _n
         br
         bysort department: gen dep_index = _n
     • Collapse
         collapse (count) patient (mean) score2000 (sum) score2001 (sd) score2002, by(department)
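The difference between egen with by(), bysort with _n, and collapse is easier to see side by side. This sketch assumes the bundled auto dataset, with foreign playing the role of department; the new variable names are illustrative only.

```stata
* Sketch on the bundled auto dataset (an assumption; not the slides' data)
sysuse auto, clear

* egen with by(): the group maximum is repeated on every row of the group
egen max_price = max(price), by(foreign)

* bysort: _n is the position within each group, _N is the group size.
* Putting price in parentheses sorts by it inside each group.
bysort foreign (price): generate rank_in_group = _n
bysort foreign: generate group_size = _N
list make foreign price max_price rank_in_group group_size in 1/5

* collapse replaces the data in memory with one row per group,
* so preserve/restore is used here to get the original data back
preserve
collapse (count) n_cars = price (mean) mean_price = price (sd) sd_mpg = mpg, by(foreign)
list
restore
```

egen and bysort keep one row per observation, while collapse changes the unit of observation to the group. That is why the slide's warning about saving over the original data matters most after collapse.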

  13. Labeling
     • Label a variable to convey more information
         label var survey "Patients Survey in 2010"
         desc
     • Label values of categorical variables
         label define ex_label 1 "low" 2 "medium" 3 "high"
         label values survey ex_label
         desc
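As a quick illustration of variable labels versus value labels, the sketch below assumes the bundled auto dataset and builds a hypothetical three-level variable price_cat, so there is something categorical to label. The cutoffs and label names are arbitrary.

```stata
* Sketch on the bundled auto dataset (an assumption; not the slides' data)
sysuse auto, clear

* Variable label: describes the variable itself
label variable mpg "Fuel economy (miles per gallon)"
describe mpg

* Hypothetical categorical variable: 1, 2 or 3 depending on price
generate price_cat = 1 + (price >= 5000) + (price >= 10000)

* Value labels: define the mapping once, then attach it to the variable
label define price_lbl 1 "low" 2 "medium" 3 "high"
label values price_cat price_lbl

* The stored values are still 1, 2, 3; only the display changes
tabulate price_cat
tabulate price_cat, nolabel
```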

  14. Numeric/Strings
     • Notice that the variable department is a string variable
         desc
     • Sometimes we would like to store a string variable as a numeric variable without losing the information contained in the strings.
         encode department, gen(dep_num)
         desc
     • We can convert the numeric variable back to a string
         decode dep_num, gen(department2)
         br department dep_num department2
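A round trip through encode and decode shows that no information is lost. This sketch assumes the bundled auto dataset, where make is a string variable; make_num and make2 are names invented for the example.

```stata
* Sketch on the bundled auto dataset (an assumption; not the slides' data)
sysuse auto, clear

* encode numbers the distinct strings and stores the text as value labels
encode make, generate(make_num)
describe make make_num

* Same rows, shown with and without the value labels
list make make_num in 1/3
list make make_num in 1/3, nolabel

* decode turns the labelled numeric variable back into a string
decode make_num, generate(make2)
assert make == make2
```

encode is meant for text categories. A string that merely contains digits (for example "12") is usually better converted with destring, which keeps the numeric value instead of assigning an arbitrary code.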

  15. Reshape
     • Long and wide format. Our original data is in wide format.
         reshape long score, i(patient) j(year)
         reshape wide score, i(patient) j(year)
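The reshape commands are easier to follow on a dataset small enough to list in full. The sketch below types in three hypothetical patients with invented scores; only the variable names (patient, score2000 to score2002) follow the slides.

```stata
* Hypothetical wide data: one row per patient, one score column per year
clear
input patient score2000 score2001 score2002
1 0.30 0.45 0.50
2 0.70 0.65 0.80
3 0.20 0.40 0.35
end

* Wide to long: "score" is the common stub, the suffix becomes the new variable year
reshape long score, i(patient) j(year)
list, sepby(patient)

* Long back to wide: one row per patient again
reshape wide score, i(patient) j(year)
list
```

In long format there is one row per patient-year, so three patients observed in three years give nine rows.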

  16. Append/Merge
     • Open file example2
         use example2, clear
         use example1, clear
         append using example2
     • Open file example3. Watch out: we are saving and replacing example3, now sorted by patient identification number.
         use example3, clear
         sort patient
         save, replace
         use example1, clear
         merge patient using C:\Users\MPLQ\Desktop\IntroStata\example3.dta

  17. Append/Merge
         use example1, clear
         append using example2
         sort patient
         merge patient using C:\Users\MPLQ\Desktop\IntroStata\example3.dta
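The slides use the older "merge varlist using" form, which needs both files sorted by the key. Stata 11 and later also accept "merge 1:1", which states the expected match type and does not need a prior sort. The sketch below shows that form on small made-up files; the patients, scores and ages are all hypothetical, and it should be run as one do file so the temporary files stay available.

```stata
* Two hypothetical patient files with the same variables
clear
input patient score
1 .3
2 .7
end
tempfile first
save `first'

clear
input patient score
3 .9
4 .4
end

* append stacks the rows of the saved file under the data in memory
append using `first'

* A hypothetical third file with a different variable for some patients
preserve
clear
input patient age
1 34
2 51
5 27
end
tempfile ages
save `ages'
restore

* merge adds columns by matching on patient; _merge reports the result:
* 1 = only in memory, 2 = only in the using file, 3 = matched
merge 1:1 patient using `ages'
tabulate _merge
list, sepby(_merge)
```

Tabulating _merge after every merge is a simple way to catch patients that appear in only one of the files.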

  18. Log file: Printing results in .txt
         clear
         set mem 100m
         set more off
         cd "C:\Users\MPLQ\Desktop\IntroStata"
         use example1
         capture log close
         log using history, replace text
         tab survey
         log close
         log using history, text append
         tab department
         log close

  19. Saving
     • Be very careful when saving data. You could be overwriting your original data and months of hard work. Always keep a copy of your original data in a separate folder.
         save final_database
         save final_database, replace
         use final_database, clear
         save, replace

  20. Simple Statistics
     • Mean
         sum score2000, det
         return list
     • Correlations
         corr score2000 score2001 score2002
         corr score*
         corr score*, covar
     • Regression
         reg score2000 score2001 score2002
         reg score2000 score2001 score2002, noconstant
         reg score2000 score2001 score2002, robust
         ereturn list
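The point of return list and ereturn list is that the stored results can be reused in later commands. Here is a short sketch, assuming the bundled auto dataset instead of the score variables; price_centered is an illustrative name.

```stata
* Sketch on the bundled auto dataset (an assumption; not the slides' data)
sysuse auto, clear

* summarize stores its results in r(); return list shows what is available
summarize price, detail
return list

* Use a stored result right away, before another command overwrites r()
generate double price_centered = price - r(mean)

* Estimation commands store their results in e(); coefficients are in _b[]
regress price weight mpg, robust
ereturn list
display "N = " e(N) "  R-squared = " e(r2)
display "Slope on weight: " _b[weight]
```

Stored results are replaced by the next command of the same class, so it is safest to copy anything needed later into a variable, scalar or local macro immediately.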

  21. Graphs
     • It is very useful to use the Stata menus to obtain the command lines.
         scatter score2000 score2001
         graph matrix score*
         graph bar (count) patient, over(survey)

  22. Help file
     • Text within brackets [] indicates optional restrictions or options.
     • Underlined sections indicate acceptable abbreviations.
         help tab
         help help
         help gen

  23. Things to look up
     • Local-macro variables
     • Foreach command (loops)
     • Regular expressions (very useful when working with strings)
     • Commands
         • #delimit
         • return list
         • ereturn list
         • macro list
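As a starting point for the first two topics, here is a minimal sketch of a local macro and a foreach loop. It assumes the bundled auto dataset, and the macro name outcomes and the log_ variables are invented for the example. Run it as one do file: a local macro exists only within the run that defines it.

```stata
* Sketch on the bundled auto dataset (an assumption; not the slides' data)
sysuse auto, clear

* A local macro stores text; it is referenced as `name'
local outcomes price mpg weight

* foreach repeats the block once per element, with `v' set to that element
foreach v of local outcomes {
    quietly summarize `v'
    display "`v': mean = " %9.2f r(mean)
    generate log_`v' = ln(`v')
}

* macro list shows the macros that are currently defined
macro list
```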

  24. WebPages
     • Stata's YouTube channel: http://www.youtube.com/user/statacorp/featured
     • http://survey-design.com.au/tips.html
     • http://www.ats.ucla.edu/stat/stata/
     • http://data.princeton.edu/stata/
     • http://dss.princeton.edu/online_help/stats_packages/stata/
