
Automating the Production of Descriptive Tables at Statistics Canada



  1. Automating the Production of Descriptive Tables at Statistics Canada
     mog.ado, a user-written program with quality controls
     Questions and comments may be sent to the author at matt.hurst@statcan.gc.ca

  2. Contents
     • Environment in which mog was developed: Statistics Canada
     • Purpose of mog
     • Examples
     • Options: present and future
     Statistics Canada • Statistique Canada

  3. Statistics Canada
     • Statistics Canada produces statistics that help Canadians better understand their country: its population, resources, economy, society and culture
     • Objective statistical information is vital to an open and democratic society. It provides a solid foundation for informed decisions by elected representatives, businesses, unions and non-profit organizations, as well as individual Canadians
     • As Canada’s central statistical agency, Statistics Canada is legislated to serve this function for the whole of Canada and each of the provinces
     • In addition to conducting a Census every five years, there are about 350 active surveys on virtually all aspects of Canadian life
     • Data uses include: GDP, CPI, unemployment rate; health, social and education statistics
     • We at Statistics Canada are committed to protecting the confidentiality of all information entrusted to us and to ensuring that the information we deliver is timely and relevant to Canadians
     • Visit us at www.statcan.gc.ca for more information
     Source: http://www.statcan.gc.ca/about-apercu/overview-apercu-eng.htm

  4. Collection and Dissemination
     • Collecting data (census, administrative data and surveys)
        • Questionnaire development, testing, collection, and data processing
     • Check data
        • Verification (errors in processing, coding mistakes)
        • Certification (compare estimates to other data sources)
     • Preparations for dissemination (e.g. for an analysis made on the data)
        • Reliability of the estimates is acceptable
        • Suppression (confidentiality of respondents is being protected)
        • Significance testing between estimates

  5. Purpose of mog
     • mog is designed to automate three dissemination quality control steps: reliability, suppression, and significance testing
     • It also displays estimates by up to two other classification variables in tabular form
     • Result: a table giving estimates (mean or total) of one variable over one or two other categorical variables
     • Useful for simple, descriptive statistics

  6. Example I
     Make a table showing the mean of “retired” by age and education categories (similar to “table education age, c(m retired)”), but with quality control checks

     mog retired education age, nodetail survey dec(0)

     Means of retired by education and age
     Estimation technique for standard errors: linearized

     Table               45 to 65   66 to 75   Over 75
     doctorate/maste~    20         87^        88^
     diploma/certifi~    16         86^        97^
     some university~    18         83^        92^
     high school dip~    18         88^        79^
     some secondary/~    26         76*^       76*^

     Notes
     * significantly different from the reference group of the variable educ5, category number 1, p < .05
     ^ significantly different from the reference group of the variable age3, category number 1, p < .05

     The data in the table is not real.

  7. Example II
     Same as Example I with additional options

     mog retired education age, nodetail ///
         survey dec(0) ref2(2) pubs pubdichot underscores varwidth(40)

     Means of retired by education and age
     Estimation technique for standard errors: linearized

     Table                                        45_to_65   66_to_75   Over_75
     doctorate/masters/bachelor's_degree          20^E       87X        88X
     diploma/certificate_from_community_colle~    16^        86         97^X
     some_university/community_college            18^E       83X        92X
     high_school_diploma                          18^        88X        79
     some_secondary/elementary/no_schooling       26^        76*        76*

     Notes
     * significantly different from the reference group of the variable educ5, category number 1, p < .05
     ^ significantly different from the reference group of the variable age3, category number 2, p < .05

     The data in the table is not real.

  8. Example I: the Long Way
     • At Statistics Canada, creating the table in our example so that it meets key confidentiality and quality requirements (there are others) would require running the following commands:
        • One table command to create a table of estimates
        • One mean command and one estimates table command to examine the individual significance of the 15 estimates
        • 22 test or lincom commands requiring visual inspection of results
        • One tabulate command and a visual inspection of 15 cell counts
     • In total, that is 26 lines of code and 52 numbers to inspect visually, as opposed to 1 line of code to run mog and 15 estimates to inspect, all in one place
     • The work multiplies for each table you have
     • All of the above needs to be done again if the sample changes
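     The totals on this slide can be reproduced with a little arithmetic. The sketch below is in Python rather than Stata. Its breakdown of the 22 tests (every non-reference education row against row 1 within each age column, plus every non-reference age column against column 1 within each education row) is an assumption that happens to match the slide's figures; the slide itself does not spell it out.

```python
# Dimensions of the example table: 5 education rows by 3 age columns.
rows, cols = 5, 3

estimates = rows * cols            # cells whose reliability must be checked
# Assumed breakdown of the pairwise tests against the reference categories:
row_tests = (rows - 1) * cols      # each education level vs. level 1, per age column
col_tests = (cols - 1) * rows      # each age group vs. group 1, per education row
tests = row_tests + col_tests
cell_counts = rows * cols          # one observation count per cell

# Commands: table, mean, estimates table, the test/lincom calls, tabulate.
commands = 1 + 1 + 1 + tests + 1
numbers_to_inspect = estimates + tests + cell_counts

print(tests, commands, numbers_to_inspect)  # 22 26 52
```

     Under the same assumption, a 6-by-4 table would need 38 tests, which illustrates how quickly the manual work grows with table size.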

  9. Copying Process
     • Select the table rows from the mog output
     • Right click and select:
        • “copy table” if copying to a spreadsheet or word processor (in a Word table, select enough rows and columns in the table into which you are copying)
     • Other options include:
        • “copy text” if copying to a word processor where you will use a fixed width font
        • “copy html” if copying to a location where you want a table to be automatically generated
     • mog’s underscores option is useful when value labels have spaces: it ensures that the correct number of columns is created

  10. Other Options
     • Display Options:
        • Number of decimal places displayed; number rounding
        • Control of column width (although columns will automatically enlarge if large numbers/many decimal places are to be displayed)
        • Reshow table by typing mog with no arguments
        • Reshow table with different reference groups (or other display options) without re-estimating the variances (time saver when bootstrapping)
        • Can show quality control symbols that indicate:
           • individual statistical significance of results at two user-defined thresholds (e.g. F = do not publish if cv > 1/3, E = publish with warning if 1/3 >= cv >= 1/6); and
           • whether the estimate is based on enough observations (e.g. X if too few)
        • The cut-offs and symbols can be changed as per the user’s needs
        • Statistics Canada surveys have “User Guides” that indicate these values
     • Analysis
        • Significance level used for tests between classification levels can be changed (.05, .01, …)
        • mog is “byable”
        • Will use svyset information in variance estimation via “survey” option (not through svy prefix)
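     The quality control symbols described above amount to a simple rule on the coefficient of variation (cv = standard error / estimate) and the cell count. The following is a minimal Python sketch of that rule, not mog's actual code: the function name quality_flag is hypothetical, the cv cut-offs are the slide's example values, and min_obs=10 is a made-up placeholder (the real minimum comes from each survey's User Guide).

```python
def quality_flag(estimate, std_error, n_obs,
                 cv_suppress=1/3, cv_warn=1/6, min_obs=10):
    """Return the quality symbols to print beside an estimate.

    F: cv above cv_suppress (do not publish)
    E: cv between cv_warn and cv_suppress (publish with warning)
    X: fewer than min_obs observations (min_obs is an assumed placeholder)
    """
    # A zero estimate has an undefined cv; treat it as unreliable.
    cv = abs(std_error / estimate) if estimate != 0 else float("inf")

    flags = ""
    if cv > cv_suppress:
        flags += "F"
    elif cv >= cv_warn:
        flags += "E"
    if n_obs < min_obs:
        flags += "X"
    return flags


print(repr(quality_flag(20, 4, 50)))   # cv = 0.20: publish with warning
print(repr(quality_flag(20, 8, 50)))   # cv = 0.40: do not publish
print(repr(quality_flag(87, 3, 5)))    # reliable cv, but too few observations
print(repr(quality_flag(86, 2, 40)))   # no flag
```

     Keeping the cut-offs as parameters mirrors the slide's point that thresholds and symbols can be changed to suit the user's needs.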

  11. Future Options
     • Save table as a csv file
     • Show standard errors/t-ratios under estimates
     • Harmonize syntax with Stata: use over() option to specify classification variables
     • Use estimates based on different populations by one classification variable
     • Use with proportion command
     • Find alternative to the underscores option

  12. Requests for the Program
     • Contact me directly at matt.hurst@statcan.gc.ca and I will send you the program
     • Please send me any comments you may have on bugs, wording, inconsistencies, etc.
     • After receiving enough feedback, I will update the program and make it available online at one of the Stata program archive sites
