
Improving Data Collection Quality: Minimum Requirements for Generalized Software

This presentation discusses the adoption of a mixed-mode strategy by the Italian National Statistical Institute - ISTAT, including questionnaire design and software solutions. It identifies criteria for creating an integrated data collection system based on generalized functions. The presentation also explores the state of the art in mixed-mode surveys and future directions.


Presentation Transcript


  1. Improve the quality on data collection: minimum requirements for a generalised software independently from the mode
     UNECE Seminar on New Frontiers for Statistical Data Collection, Geneva, 2 November 2012
     Authors: M. Murgia (murgia@istat.it), A. Nunnari (nunnari@istat.it)
     Presented by: M. Murgia

  2. Content of the presentation
     • Aims of this presentation
     • Mixed-mode: state of the art in Istat
     • Adoption of a mixed-mode strategy
     • A software solution to support mixed-mode surveys
     • Requirements for a generalised software and evaluation criteria
     • Results and future directions

  3. Aims of this presentation
     • To describe the mixed-mode strategy adopted by the Italian NSI, Istat, in terms of both questionnaire design and software solution.
     • To identify criteria that help create an integrated data collection system based on generalised functions covering all steps of the data collection phase.
     In short: how technology can best support methodology.

  4. Mixed-mode: state of the art in Istat
     What we mean by mixed-mode
     The combined use of any data collection techniques for questionnaire administration as well as for the reminder and/or follow-up phases. A mixed-mode strategy therefore affects the entire phase, from the design of the data collection methodology to the finalisation of collected data (sub-processes 2.3 to 4.4 of the GSBPM).
     Why use a mixed-mode strategy?
     It is a potentially optimal solution for tackling, all together, the problems of:
     • budget cuts
     • low response rates
     • low landline coverage

  5. Mixed-mode: state of the art in Istat
     Mixed-mode experiences in Istat
     Oldest experiences: a mix of traditional data collection modes
     • Business target: Mail-CATI surveys
     • Population target: CATI-CAPI (Labour Force Survey)
     More recently: web has been included in the mix
     • Business target: Mail-WEB and WEB-CATI surveys
     • Population target (even more recently):
       - CATI-WEB: 2009 PhD graduates survey
       - Mail-WEB: 2011 Population Census

  6. Mixed-mode: state of the art in Istat
     The near future
     • The oldest mixed-mode approach is based on “traditional” techniques applied sequentially.
     • This approach is no longer suitable for tackling the budget, response rate and landline coverage problems. They can only be addressed by a different approach that involves:
       - any order of mode mixing: parallel or sequential
       - any type of data collection technique: traditional (mail, CATI, CAPI) and less traditional (web)
       - any type of data collection instrument: traditional (PC) and innovative (mobile phones, smartphones, tablets, etc.)
     Istat is moving toward this approach.

  7. Mixed-mode: state of the art in Istat
     The near future
     Methodological (questionnaire design) and technical (web and new hardware) issues still have to be solved by Istat:
     Web in mixed mode
     • Many experiences with business surveys, but with a “simple” questionnaire design strategy guided by one main technique; the other modes are used only in a second step to cover missing strata.
     • Two experiences with population surveys, with minor or no concerns about questionnaire design and few issues in terms of response rate:
       - the PhD graduates survey used two different questionnaires (different surveyed phenomena); the Population Census used a main, mail-specific questionnaire;
       - high response rates for web, more than 30% in both cases, owing to the high education level of respondents in the PhD graduates survey and a massive advertising campaign for the Population Census.

  8. Mixed-mode: state of the art in Istat
     The near future
     Methodological (questionnaire design) and technical (web and new hardware) issues still have to be solved by Istat:
     Mobile phones in mixed mode
     Mobile phones are used in CATI surveys for respondents with no landline phone, but there is no experience in combining them with other methods. Combined use means addressing:
       - methodological issues (sampling frame, coverage, survey environment), which are outside the scope of this presentation;
       - technical issues: adapting the questionnaire layout to a smaller screen resolution when used in mixed mode with web. The same problems arise with smartphones, tablets, etc.

  9. Mixed-mode: state of the art in Istat
     The near future
     Methodological (questionnaire design) and technical (web and new hardware) issues still have to be solved by Istat:
     Mobile phones in mixed mode
     Moreover, for population surveys, the use of web and mobile phones is both a help and a necessity in solving the budget, response rate and landline coverage problems, as shown in the picture (not reproduced in this transcript).

  10. Adoption of a mixed-mode strategy
      What is needed (methodology and technology)
      • The design of a data collection strategy aimed at containing the portion of non-sampling error due to mode effect.
        Mode effect = differences in collected data that are due to the characteristics of the mode (measurement error) and not to real differences. It is therefore always present in collected data, even when a single mode is used. In mixed-mode, care is needed not to increase non-sampling error by adding a mix of measurement errors.
      • The use of data collection instruments that are mode insensitive.

  11. Adoption of a mixed-mode strategy
      Methodological side of the world
      TASK: to create an optimally designed mixed-mode strategy aimed at reducing the risk of greater non-sampling error due to coverage, measurement, frame and non-response errors.
      Technological side of the world
      TASK: to create an optimal data collection system able to implement the designed strategy and aimed at reducing the risk of greater non-sampling error due to complexity.
      Complexity = duplication of effort to implement the questionnaire across modes. Complexity increases survey costs, delays in data delivery and measurement errors.

  12. Adoption of a mixed-mode strategy
      Methodological side of the world
      A data collection strategy can be designed in different ways, each affecting the questionnaire design differently. Literature review (De Leeuw, Dillman):
      • One main mode, the others are secondary or auxiliary
        Questionnaire design: mode-enhancement construction approach
      • All modes are equally important
        Questionnaire design:
        - mode-specific construction or maximisation method
        - uni-mode approach
        - generalised mode design
      In generalised mode design, the questionnaire is purposely designed to be different for each mode in order to reach cognitive equivalence of the perceived stimulus: “… the same offered stimulus is not necessarily the same perceived stimulus” (De Leeuw 2005).

  13. Adoption of a mixed-mode strategy
      Methodological side of the world
      The generalised approach is not easy. It requires:
      • identifying the differences between modes that influence the cognitive process of answering;
      • using cognitive tests to demonstrate that different question formats elicit equivalent answers.
      But it seems able to meet Istat's needs (to combine any data collection mode and instrument).
      How to implement it? Through the creation of an integrated data collection system, based on generalised functions covering all steps of data collection: the technological side of the world.

  14. A software solution to support mixed-mode strategy
      Technological side of the world
      TASK: to reduce complexity
      HOW:
      1) Consolidation: i) to create from scratch a single, all-purpose, highly integrated system, or ii) to use available tools and make them speak common languages, share the same data representations and meet functional standards.
      2) Generalisation: to abandon ad hoc procedures in favour of generalised ones. Three main dimensions of generalisation:
         - data collection technique
         - class of respondents
         - software and hardware platforms
      3) Questionnaire abstraction: to design the questionnaire independently of its implementation in any collection mode (user side); the system should be flexible enough to support any mode-specific changes to the questionnaire.
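As an illustration of questionnaire abstraction, the sketch below defines a question once, independently of the mode, and leaves the mode-specific presentation to separate renderers. It is a minimal sketch and not Istat's system: the `Question` class, the `render_web` and `render_cati` functions, the `EMP_STATUS` question and its wording are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Question:
    """Mode-independent description of one question (hypothetical model)."""
    name: str
    text: str
    options: list[str] = field(default_factory=list)
    # Optional mode-specific wording, keyed by mode name.
    overrides: dict[str, str] = field(default_factory=dict)

    def wording(self, mode: str) -> str:
        """Return the mode-specific wording if one exists, else the default."""
        return self.overrides.get(mode, self.text)


def render_web(q: Question) -> str:
    """Self-administered layout: all options are shown at once."""
    lines = [q.wording("WEB")]
    lines += [f"( ) {opt}" for opt in q.options]
    return "\n".join(lines)


def render_cati(q: Question) -> str:
    """Interviewer script: options are read out in a single sentence."""
    script = f"INTERVIEWER, READ: {q.wording('CATI')}"
    if len(q.options) > 1:
        script += " Is it " + ", ".join(q.options[:-1]) + f" or {q.options[-1]}?"
    return script


# One renderer per mode; adding a mode means adding a renderer, not a questionnaire.
RENDERERS = {"WEB": render_web, "CATI": render_cati}

employment = Question(
    name="EMP_STATUS",
    text="What is your current employment status?",
    options=["employed", "unemployed", "inactive"],
    overrides={"CATI": "Which of the following best describes your current employment status?"},
)

for mode, render in RENDERERS.items():
    print(f"--- {mode} ---")
    print(render(employment))
```

The `overrides` field is one possible way to allow the mode-specific changes mentioned in point 3 while keeping a single questionnaire definition.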

  15. A software solution to support mixed-mode strategy
      Technological side of the world
      How to achieve the three objectives (consolidation, generalisation and questionnaire abstraction)?
      Istat case study
      • A workgroup defines the requirements that a data collection system must meet in order to be considered generalised.
      • Requirements have to be applicable to the entire data collection process.
      • Four areas for requirements:
        - Survey units management: managing the information needed to contact the respondent and to logically define them as a user in the system (name, address, phone, e-mail, username and password, etc.).
        - Electronic questionnaire: the instrument used to collect micro-data.
        - Data collection management: real-time administrative tools for conducting and monitoring the data collection process (management of user grants, first validation tools, questionnaire tracking systems, reporting tools, etc.).
        - Communication facilities: tools for exchanging information with survey respondents (helpdesk system, content management system, automatic reminder management, etc.).

  16. A software solution to support mixed-mode strategy
      Istat case study: requirements for the electronic questionnaire
      Present situation in Istat: many data collection tools are available, developed independently for specific contexts and with different technologies.
      First we need to know:
      1) what is already available: to take an accurate inventory in order to avoid redundancy and duplication of effort;
      2) what we should require from it: to define standards and requirements;
      3) what already meets these standards, fully or partially, or can most easily and cost-effectively be brought into compliance.
      To answer these three questions we collected information on the tools and, at the same time, started to devise the functional requirements against which to evaluate them. Meetings with IT project managers were held to create a feedback process.

  17. Requirements for a generalised software and evaluation criteria
      Istat case study: requirements for the electronic questionnaire
      Evaluation form: criteria
      The tools differ widely and are not directly comparable. The evaluation form was designed to cover most topics while trying to obtain a common, minimum but exhaustive set of information.
      Two categories of evaluation criteria:
      • Cross-sectional criteria: evaluate the actual facilities provided by the tools at time t.
      • Longitudinal criteria: take into account potential assets in order to assess possible lines of development.

  18. Requirements for a generalised software and evaluation criteria
      Istat case study: requirements for the electronic questionnaire
      Evaluation form: cross-sectional criteria
      1. Usability
      2. Flexibility
      3. Completeness of functions
      4. Generalisation of functions
      5. Integration with XML data representation model
      6. Independence from proprietary systems
      7. Cross-browser compatibility
      8. Platform compatibility

  19. Requirements for a generalised software and evaluation criteria
      Istat case study: requirements for the electronic questionnaire
      Evaluation form: cross-sectional criteria
      1. Usability
         A primary requisite is that even non-computer experts (statistical researchers) can use the authoring tools without the mediation of dedicated IT personnel. Results: an increase in quality and a reduction in training and support costs.
         Criteria to evaluate usability:
         • availability of user documentation;
         • presence of a user interface;
         • ability to reuse and modify existing objects and data:
           - metadata already defined;
           - templates of questionnaire layout.
      2. Flexibility
         Adaptability of the software to multiple classes of respondents and data acquisition techniques. Capability to handle questionnaires with different degrees of complexity and different ways of administering questions. Abstraction of the questionnaire object would help flexibility and would make it possible to apply adaptive collection techniques (e.g. mode-switching).
         Criteria to evaluate flexibility:
         • completeness of functions (criterion 3);
         • presence of a metadata-driven architecture;
         • modularity.

  20. Requirements for a generalised software and evaluation criteria
      Istat case study: requirements for the electronic questionnaire
      Evaluation form: cross-sectional criteria
      3. Completeness of functions
         • to implement all types of question formats;
         • to implement all types of check rules (both client-side and server-side);
         • to manage question sequences dynamically (smart branching, skip-and-fill, etc.);
         • to manage linkage to external archives for look-ups, form pre-fill, cross-referencing, etc.;
         • to perform computer-assisted coding through optimised text matching algorithms;
         • to allow the controlled upload of the requested data as files (ASCII, spreadsheet, etc.);
         • to allow the respondent to export the questionnaire (empty or filled) for reference, printing and/or archiving;
         • to allow the respondent or the interviewer to complete the survey in multiple sessions (saving and later retrieving the questionnaire);
         • to enable questionnaire sharing (concurrent access to a single questionnaire);
         • to implement loop functions: question loops, page or block loops, questionnaire loops, loop-and-merge facilities;
         • to support multilingualism.
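Two of the functions listed above, dynamic question sequences and check rules, can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the behaviour of any tool evaluated by Istat: the question names (`HAS_JOB`, `HOURS`, `SEEKING`), the routing table and the single check rule are hypothetical.

```python
from typing import Callable, Optional

Answers = dict[str, object]

# Routing table (hypothetical): for each question, a function that looks at the
# answers collected so far and returns the next question, or None to stop.
ROUTING: dict[str, Callable[[Answers], Optional[str]]] = {
    "HAS_JOB": lambda a: "HOURS" if a["HAS_JOB"] == "yes" else "SEEKING",
    "HOURS": lambda a: None,
    "SEEKING": lambda a: None,
}

# Check rules (hypothetical): a description and a predicate that must hold.
# A rule is skipped when the question it refers to was never asked.
CHECKS: list[tuple[str, Callable[[Answers], bool]]] = [
    ("weekly hours must be between 1 and 100",
     lambda a: "HOURS" not in a or 1 <= a["HOURS"] <= 100),
]


def run_interview(first: str, provided: Answers) -> Answers:
    """Walk the routing table, recording only the questions actually asked."""
    answers: Answers = {}
    current: Optional[str] = first
    while current is not None:
        answers[current] = provided[current]
        current = ROUTING[current](answers)
    return answers


def failed_checks(answers: Answers) -> list[str]:
    """Return the descriptions of all check rules that the answers violate."""
    return [description for description, holds in CHECKS if not holds(answers)]


# An employed respondent is routed to HOURS and never sees SEEKING.
asked = run_interview("HAS_JOB", {"HAS_JOB": "yes", "HOURS": 120, "SEEKING": "no"})
print(list(asked))
print(failed_checks(asked))
```

Because the routing and the checks are plain data rather than mode-specific code, the same definitions could in principle be evaluated on the client side, on the server side, or by an interviewer's CATI or CAPI application.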

  21. Requirements for a generalised software and evaluation criteria
      Istat case study: requirements for the electronic questionnaire
      Evaluation form: cross-sectional criteria
      4. Generalisation of functions
         Each software component has to be able to deal with a changing environment by allowing variable data to be introduced into the system through parameterisation.
      5. Integration with XML data representation model
         The system must support data interoperability at three levels:
         a) data exchange between items in the same data collection toolbox;
         b) data exchange with the tools used in other phases of the statistical process;
         c) integration with tools for data collection and transmission used at European and international level.
         Interoperability must also be ensured along two dimensions:
         - Syntactic interoperability: sharing of formats and transmission protocols for the exchange of data.
         - Semantic interoperability: definition and supply of the metadata necessary to interpret shared data.
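The two dimensions of interoperability can be illustrated with a small XML round trip. In this sketch the shared XML syntax stands for syntactic interoperability, and the `label` attribute carried with each item stands for the metadata needed for semantic interoperability. The element and attribute names (`response`, `item`, `survey`, `unit`, `mode`, `label`) are assumptions made up for the example and do not follow any specific standard or Istat schema.

```python
import xml.etree.ElementTree as ET


def to_xml(survey: str, unit_id: str, mode: str,
           answers: dict[str, object], labels: dict[str, str]) -> str:
    """Serialise one response; each item carries its descriptive label."""
    root = ET.Element("response", {"survey": survey, "unit": unit_id, "mode": mode})
    for name, value in answers.items():
        item = ET.SubElement(root, "item", {"name": name, "label": labels[name]})
        item.text = str(value)
    return ET.tostring(root, encoding="unicode")


def from_xml(document: str) -> dict[str, str]:
    """Read the answers back; values come back as text."""
    root = ET.fromstring(document)
    return {item.get("name"): item.text for item in root.findall("item")}


# Hypothetical survey, unit identifier and variables.
document = to_xml(
    survey="EXAMPLE_SURVEY",
    unit_id="U001",
    mode="WEB",
    answers={"HAS_JOB": "yes", "HOURS": 38},
    labels={"HAS_JOB": "Currently employed", "HOURS": "Usual weekly working hours"},
)
print(document)
print(from_xml(document))
```

Recording the collection mode alongside the data, as the `mode` attribute does here, is one way a downstream tool could later take mode into account.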

  22. Requirements for a generalised software and evaluation criteria
      Istat case study: requirements for the electronic questionnaire
      Evaluation form: cross-sectional criteria
      6. Independence from proprietary systems
         Several advantages:
         • lower costs of acquisition and management;
         • greater independence from suppliers;
         • greater control over the code, which enhances its flexibility;
         • open technologies make it possible to share solutions with the developer communities of other NSIs.
      7. Cross-browser compatibility
         The software should ensure that the layout and functionality of the web questionnaire do not vary with the browser used to access it. A cross-browser application ensures that the questionnaire can be filled in smoothly even when the specifications of the respondent's client environment are unknown or not directly controllable, as in business or household/population surveys.
      8. Platform compatibility
         The progressive widening of the range of users adopting emerging technologies makes platform compatibility an essential requirement: users must be able to fill in the electronic questionnaire easily even on devices such as PDAs, netbooks, laptops, smartphones, tablets, MIDs and UMPCs.

  23. Requirements for a generalised software and evaluation criteria
      Istat case study: requirements for the electronic questionnaire
      Evaluation form: longitudinal criteria
      • Generalisability of available ad hoc functions
      • Modularity of functions
      • Logical and semantic abstraction
      • Compliance with recognised standards for data and metadata description

  24. Results and future directions
      Data collected from the returned forms and from the meetings with IT project managers were merged, normalised and formatted as tables in order to enhance comparability.
      The result, largely expected, was that none of the surveyed software tools was found to be fully compliant with all the proposed requirements.
      The resulting data represent a solid foundation for continuing the analysis of the toolboxes and for defining the enterprise architecture standards for data collection in Istat.
      Final results are expected by the end of 2012, as they are part of the “Stat2015” project, which aims at the standardisation and industrialisation of the entire cycle of Istat statistical processes, according to a model based on a metadata-driven and service-oriented architecture.
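The merge, normalise and tabulate step described above could be sketched as follows. The criteria are the eight cross-sectional criteria from the evaluation form; everything else is an assumption: the tool names `tool_A` and `tool_B` and their yes/no values are invented placeholders, not Istat's data or results, and treating an unanswered criterion as not met is a design choice of this sketch.

```python
CRITERIA = [
    "usability",
    "flexibility",
    "completeness of functions",
    "generalisation of functions",
    "integration with XML data representation model",
    "independence from proprietary systems",
    "cross-browser compatibility",
    "platform compatibility",
]

# Placeholder evaluation forms (invented values, for illustration only).
forms = {
    "tool_A": {c: True for c in CRITERIA} | {"platform compatibility": False},
    "tool_B": {"usability": True, "flexibility": False},  # incomplete form
}


def normalise(form: dict[str, bool]) -> dict[str, bool]:
    """Give every form the same criteria; a missing answer counts as not met."""
    return {criterion: bool(form.get(criterion, False)) for criterion in CRITERIA}


def unmet(form: dict[str, bool]) -> list[str]:
    """List the criteria that a tool does not satisfy."""
    normalised = normalise(form)
    return [criterion for criterion in CRITERIA if not normalised[criterion]]


# One row per tool, so that tools described differently become comparable.
for tool, form in forms.items():
    missing = unmet(form)
    status = "fully compliant" if not missing else f"{len(missing)} criteria unmet"
    print(f"{tool:8} {status}")

fully_compliant = [tool for tool, form in forms.items() if not unmet(form)]
print("Fully compliant tools:", fully_compliant)
```

A real evaluation would probably need graded rather than yes/no scores, and would also have to cover the longitudinal criteria.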
