
Data Rescue in Canada, A Case Study


Presentation Transcript


  1. Data Rescue in Canada, A Case Study IASSIST Cornell University June 2nd, 2010 Jane Fry (Carleton University)

  2. Summary • Data Rescue • The CRIC rescue process • Previous data rescues • The latest data rescue • So what! IASSIST, June 2010

  3. Why do data rescue? • Many reasons • Department closes • Funding is over • No data archive • No-one to get data into shape • Future data dissemination • Maintaining the collection

  4. Background of CRIC • CRIC = The Centre for Research and Information on Canada • established in 1996 • part of the Canadian Unity Council (CUC) • managed their research and communications activities • CRIC’s research activities centred around the theme of Canadian unity • a variety of issues related to the political, economic, and social union of Canada

  5. How did it happen? • A phone call was received in 2004 • CRIC closing its offices in 2005 • Does Carleton U. want the data? • We had a contact at CRIC who had previously seen the effectiveness of using DDI for marking up datasets • Only a few conditions: • clean the data; • archive the collection; • disseminate it for research and teaching purposes only; and • maintain the collection. • Carleton U. agreed to CRIC’s conditions – why? • without preservation the data would be lost • preserving the files would create a rich resource for teaching and research

  6. And then... • CRIC delivered the rest of the goods • Boxes filled with documentation, video tapes, ... • CDs filled with everything from their computers • Included raw data, datasets with no background information, invitations to social events, ... • *Bonus* • Carleton U. received a modest stipend to go towards getting the collection in shape • What we didn’t realize was how long those few conditions would take to fulfill!

  7. The operation • Carleton U. dedicated one summer student full-time to clean up the collection, that is, make it ready for public use. • This included: • listing all the contents of the boxes; • going through the CDs and matching up files; • deleting confidential information; • anonymizing the data; and • cleaning up the variable and value labels in the datasets.
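One of the clean-up steps listed on this slide, removing directly identifying fields before release, might look roughly like the sketch below. It is only an illustration under stated assumptions: the column names and the sample record are hypothetical, the slides do not describe the tools actually used, and real anonymization usually needs more than dropping columns.

```python
# Hypothetical names of columns that directly identify a respondent.
DIRECT_IDENTIFIERS = {"name", "phone", "postal_code"}


def strip_identifiers(rows, identifiers=DIRECT_IDENTIFIERS):
    """Return copies of the rows with identifying columns removed.

    Column names are compared case-insensitively; the input rows are
    left unchanged.
    """
    return [
        {key: value for key, value in row.items() if key.lower() not in identifiers}
        for row in rows
    ]


# Hypothetical survey record, for illustration only.
rows = [{"respondent": 1, "Name": "A. Person", "phone": "555-0100", "q1": 1}]
cleaned = strip_identifiers(rows)
print(cleaned)
```

Returning new dictionaries rather than editing in place keeps the original files intact, which matters when the raw deposit is itself part of the archive.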

  8. The end result • CRIC surveys include: • Portrait of Canada Series (2000 – 2005); • Looking West, 2001; • Borderlines, 2002; • Canada and World Affairs, 2002; • Charter of Rights and Freedoms, 2002; • The Globe and Mail Survey on “The New Canada”, 2003; • Survey on Official Languages, 2003; and • Attitudes towards Internationalism and Federalism, 2003.

  9. Previous data rescues • Gallup Canada • 1945 - 2000 • when people call Gallup Canada for data, they are referred to Carleton U. • they don’t hold any of the older data • Example: “Some people say the winters in this country are getting warmer. From your own experience, would you say this was true, or not true?” • True - 70.4%, Not True - 29.6% • Date of Survey - February 1955, #241

  10. Previous data rescues (cont’d) • Listening to Canadians (Canadian Information Office), 2000 – 2003 • the result of a phone call received at Carleton U. • the government department is now defunct • surveys measured Canadians’ views on public policy priorities • questions included respondents’ opinions on government spending in different departments, such as health, education, and the military

  11. IASSIST, June 2010

  12. The latest data rescue • Last fall, Carleton U. was contacted about rescuing another series of datasets • the Canada Millennium Scholarship Foundation (CMSF) • Started operations in 1998 • 10-year government mandate, starting in 2000 • Closed up shop on March 31st, 2010 • Wanted their information to be preserved

  13. What is CMSF? • Created by an act of Parliament in 1998 to provide awards to students for post-secondary education annually for ten years. • When its mandate was completed in January 2010, the Foundation had distributed $325 million in the form of bursaries and scholarships each year throughout Canada in support of post-secondary education.

  14. What is CMSF? (cont’d) • In addition, the Foundation conducted research into post-secondary access, via the Millennium Research Program. • One of the most important research databases is MESA (Measuring the Effectiveness of Student Aid), a longitudinal study of students who received bursaries based on need as assessed by applications to provincial student aid.

  15. Content of CMSF • Both publications and survey microdata • Publications include • Research Series • Research Notes • Price of Knowledge publication • Annual Reports • Other Reports and Publications • Press releases

  16. CMSF Sidenote • The process of making sure we have all the publications is quite labour intensive. • We started by downloading all the publications from their website before it was taken down. • We then compared the CD sent from CMSF with the information we had previously retrieved from the web. • It was good that we had both sources of information because the two lists were not identical. • Between the two sources of information we were able to come up with a comprehensive list. • We learned this lesson from our CRIC data rescue • a student requested a certain CRIC publication that we didn’t have. We were eventually able to find it, but it took a lot of digging.
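The comparison described on this slide, checking the website download against the CD and merging the two into one list, can be sketched with set operations. This is a minimal illustration, not the procedure Carleton U. actually followed; the function name and the sample titles are hypothetical.

```python
def reconcile(web_titles, cd_titles):
    """Compare two publication lists.

    Returns (only_on_web, only_on_cd, combined), each a sorted list.
    Titles are lower-cased and whitespace-normalized so that trivially
    different spellings of the same title are treated as a match.
    """
    web = {" ".join(title.split()).lower() for title in web_titles}
    cd = {" ".join(title.split()).lower() for title in cd_titles}
    only_on_web = sorted(web - cd)   # downloaded from the site, absent from the CD
    only_on_cd = sorted(cd - web)    # on the CD, never found on the site
    combined = sorted(web | cd)      # the comprehensive list
    return only_on_web, only_on_cd, combined


# Hypothetical titles, for illustration only.
web_list = ["Annual Report 2005", "Research Note 1", "Press Release A"]
cd_list = ["annual report 2005", "Research Note 2"]

only_on_web, only_on_cd, combined = reconcile(web_list, cd_list)
print("Only on the website:", only_on_web)
print("Only on the CD:", only_on_cd)
print("Comprehensive list size:", len(combined))
```

The two difference lists are the useful part in practice: each entry is a publication that one source would have missed on its own.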

  17. The end result • Access to the forthcoming CMSF surveys • Student Financial Survey, 2001-2002 • Canadian College Student Survey Series, 2001-2006 • Post-Secondary Education: Cultural, Scholastic and Economic Drivers, 2004 • Survey of Secondary School Students, 2004 • Ontario College Applicant Survey Series, 2004-2006 • Ontario University Applicant Survey, 2005 • Class of 2003 High School Follow-up Survey, 2005 • The MESA Project (Measuring the Effectiveness of Student Assistance), Cycle 1 and 2, 2006-2008 • Canadian Career College Students Survey, 2008-2009

  18. CMSF Bonus • We received a modest stipend • We were able to hire a student full-time this summer • These datasets will be available at the end of the summer – hopefully! • We are indoctrinating another person into the mission of data rescue.

  19. Why do data rescue? • Review from beginning • Department closes • Funding is over • No data archive • No-one to get data into shape • Future data dissemination • Maintaining the collection

  20. Why do data rescue? (cont’d) • CRIC, LTC, CMSF • links on the web are long gone • Carleton U Data Centre link comes up • Carleton U community has free and open access to the data • Carleton U Data Centre can allow access to others for teaching and research purposes • By the way, … • Before we decided to do data rescue, we were already receiving data from these organizations, so these rescues didn’t come from thin air

  21. Were the data rescues entirely successful? • Depends on your definition of success • Do we have all the information? • No! • We didn’t get to CRIC and LTC soon enough • we have some datasets with no background information so they cannot be used. • we have some datasets with incomplete variable and value labels, and some with missing variable and value labels, so they cannot be used
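A quick audit can show which variables in a rescued dataset have the label problems this slide describes. The sketch below assumes the codebook has already been read into a plain dictionary. That layout, the function name, and the second sample variable are all hypothetical; nothing here reflects the software used at Carleton U.

```python
def audit_labels(variables):
    """Report variables with a missing label or with codes that have no value label.

    `variables` maps a variable name to a dict with the (assumed) keys
    "label", "value_labels" (code -> text) and "codes" (codes seen in the data).
    """
    problems = {}
    for name, meta in variables.items():
        issues = []
        if not (meta.get("label") or "").strip():
            issues.append("missing variable label")
        value_labels = meta.get("value_labels", {})
        unlabeled = sorted(c for c in meta.get("codes", []) if c not in value_labels)
        if unlabeled:
            issues.append(f"unlabeled codes: {unlabeled}")
        if issues:
            problems[name] = issues
    return problems


# Hypothetical codebook entries, for illustration only.
codebook = {
    "q1": {
        "label": "Winters getting warmer?",
        "value_labels": {1: "True", 2: "Not true"},
        "codes": [1, 2],
    },
    "q2": {"label": "", "value_labels": {1: "Yes"}, "codes": [1, 2, 9]},
}

report = audit_labels(codebook)
for name, issues in report.items():
    print(name, "->", "; ".join(issues))
```

A report like this only shows where the gaps are. Filling them still depends on finding the original questionnaire or documentation.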

  22. Next question • When do data rescues happen? • In the data life cycle, rescuing data can happen at any point in time • It happened at the end of the CRIC lifecycle • certain information has been lost forever but more could have been saved if we had known about it sooner • Luckily, it happened before the end of the CMSF lifecycle

  23. What happens next? • That is, what happens once the data rescue is completed and the data is in shape • Little on-going work for Carleton U • We will continue to ensure technological and operational management of the stored data • Met international standards (DDI) to preserve this data • Use the Nesstar interface for the data • Easy data download • All associated metadata (that we have) is available with the data

  24. Other benefits of data rescue • Students have been totally indoctrinated into the importance of data and data rescue. • Even though some of the series only cover a short time span (e.g. CRIC – 5 years, LTC – 3 years), there is a good balance of topics on issues that continue to be relevant today.

  25. In Summary • Data rescue is important! • We have to ask ourselves – • How many more datasets are out there that we can rescue? • Whose responsibility is it to keep data alive? • If we can do data rescue, we should!

  26. Thank you! Jane Fry Data Rescuer Carleton University http://www.library.carleton.ca/ssdata/ jane_fry@carleton.ca
