Learn about the data life cycle stages, data collection methods, analysis, sharing, and key considerations. Discover how to identify stakeholders, plan data use, and avoid data manipulation. Improve decision-making skills with this comprehensive guide.
Data for Decision-Making Module 3: Introduction to the Data Lifecycle – Collecting Data
Overview • Review key concepts • Introducing the data life-cycle • Case studies • Activity 3.1 • Understanding data collection methods • Key considerations in designing a data collection plan • Lying with data • Debrief
Review • Data • Data for decision-making • Stakeholders • Assessment • Mission • Vision • Data producer • Data consumer
Steps to Using Data for Decision-Making • Identify a problem or research question • Assess the data available to you and your data needs • Identify stakeholders • Plan for how data will be used, analyzed, and shared
Introducing key concepts • Lifecycle
Introducing key concepts • Collecting data • Analyzing data • Sharing data
Introducing key concepts • Data collection: data collection is the process of gathering information in a systematic way. Collected data are generally intended to answer questions and/or evaluate outcomes. • Data analysis: data analysis is the process of inspecting, cleaning, transforming, and visualizing data with the goal of discovering useful information, suggesting conclusions, and supporting decision-making. • Data analysis is made up of several stages: • Inspecting the data • Cleaning the data • Transforming the data • Visualizing the data
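The four analysis stages above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a prescribed workflow: the survey rows, the field names (`district`, `visits`), and the function names are all hypothetical and invented for this example.

```python
from statistics import mean

# Hypothetical survey responses, invented for illustration only.
raw_rows = [
    {"district": "North", "visits": "12"},
    {"district": "north ", "visits": "8"},   # inconsistent label
    {"district": "South", "visits": ""},     # missing value
    {"district": "South", "visits": "5"},
]

def inspect(rows):
    """Inspect: count the rows and the missing 'visits' values."""
    missing = sum(1 for r in rows if not r["visits"].strip())
    return {"rows": len(rows), "missing_visits": missing}

def clean(rows):
    """Clean: drop rows with missing values and normalize district labels."""
    return [
        {"district": r["district"].strip().title(), "visits": r["visits"].strip()}
        for r in rows
        if r["visits"].strip()
    ]

def transform(rows):
    """Transform: convert text to numbers and average visits per district."""
    groups = {}
    for r in rows:
        groups.setdefault(r["district"], []).append(int(r["visits"]))
    return {district: mean(values) for district, values in groups.items()}

def visualize(summary):
    """Visualize: a plain-text bar chart, one '#' per average visit."""
    return "\n".join(
        f"{district:<6} {'#' * round(value)} {value}"
        for district, value in sorted(summary.items())
    )

report = inspect(raw_rows)
summary = transform(clean(raw_rows))
chart = visualize(summary)
print(report)
print(chart)
```

Keeping each stage as its own function makes it easier to document what was done to the data at each step, which matters when the data are later shared.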
Introducing key concepts • Data sharing: data sharing is the process of making data that are used in problem solving, research, or evaluation available to others.
Understanding Data Collection Methods • Primary data: information collected by you or your team. • Secondary data: information that is collected by a third party.
Primary Data Sources • Quantitative • Surveys • Experiments • Observation • Qualitative • Interviews • Focus groups • Observation • Case studies
Secondary Data Sources • Journals • Books • Newspapers • Records • Previous reports and analyses
Introducing Key Concepts • Protocols are systematic plans for how a set of operations are to be carried out • Data protocols are systematic plans for how data are to be collected, stored, and described
Introducing key concepts • Metadata: information that describes, explains, or gives context for other data. They are provided to make it easier to interpret, use, and manage data.
Introducing key concepts • Metadata are important because they add context to data. Metadata are the key to primary data being reused as secondary data. Metadata can be: • Descriptive metadata (such as who created the data, what the data were created for, where the data were collected, and when the data were collected) • Administrative metadata (why the data were collected)
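As a rough sketch of what a metadata record might look like in practice, the example below stores the descriptive and administrative items listed above alongside a dataset and flags any that were left blank. The class name, field names, and sample values are hypothetical and not part of any metadata standard.

```python
from dataclasses import dataclass, asdict

@dataclass
class DatasetMetadata:
    # Descriptive metadata: who, what for, where, when.
    creator: str
    purpose: str
    location: str
    collected_on: str
    # Administrative metadata: why the data were collected.
    reason_collected: str

def missing_fields(meta):
    """Return the names of metadata fields that were left blank."""
    return [name for name, value in asdict(meta).items() if not value.strip()]

# Hypothetical record, invented for illustration only.
meta = DatasetMetadata(
    creator="Field survey team",
    purpose="Measure clinic wait times",
    location="",                      # left blank on purpose
    collected_on="2019-03-01",
    reason_collected="Annual service review",
)

gaps = missing_fields(meta)
print(gaps)
```

A check like this can be run before a dataset is shared, so that someone using it later as secondary data has the context they need.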
Key Considerations in Designing a Data Collection Plan • What questions or problems are you trying to address? • What do you need to know? • When should you collect new data (primary), and when should you use existing data (secondary)? • What instruments will you need to create? • Who will be involved in data collection, and for how long? • What documentation will be needed to use the data again?
Introducing key concepts Sample resources: • Data Management Plan tool: https://dmponline.dcc.ac.uk/ • Following best practices in choosing a sample (size, diversity, relevant population, etc.) https://resolutionresearch.com/page/results-calculate/ • Searching for secondary data http://datasupport.researchdata.nl/en/start-de-cursus/iv-gebruiksfase/zoeken-naar-data/ • What are databases? How to design one? www.dartmouth.edu/~bknauff/dwebd/2004-02/DB-intro.pdf
Lying with Data • The same data can easily be manipulated and used to tell different, opposing, or inaccurate stories • Misuse can be intentional or unintentional • We can face many pressures in our work • Respectfully talk with your colleagues and supervisors if you feel data are being used incorrectly or inappropriately
Lying with Data • There are many ways data can be used to mislead • Always keep your data radar active! • A few examples:
Lying with Data • Correlation vs. causation • Misleading visualizations • Using bad data • Selective storytelling
Lying with Data • A correlation describes a relationship between two or more variables. It does not, however, mean that one variable impacts the other. • Causation shows that the change in one variable is the result of a change in the other. In other words, a change in one causes a change in the other.
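To make the distinction concrete, the sketch below computes a Pearson correlation between two series that move together without one causing the other. The numbers are invented for illustration: both hypothetical series (ice cream sales and sunburn cases) are assumed to rise with hot weather, a third factor that does not appear in the calculation.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical weekly figures, invented for illustration only.
ice_cream_sales = [50, 55, 63, 70, 76]
sunburn_cases = [4, 5, 7, 9, 10]

r = pearson_r(ice_cream_sales, sunburn_cases)
print(f"correlation: {r:.2f}")
```

The coefficient comes out close to 1, yet selling ice cream does not cause sunburn. A strong correlation is a reason to look for an explanation, not evidence of one.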
Lying with Data • [Figures: spurious correlation charts] Source: http://www.tylervigen.com/spurious-correlations
Lying with Data • Correlation vs causation • Misleading visualizations • Using bad data • Selective storytelling
Lying with Data • [Figures: average number of weekly hours worked at main job] Source: http://callingbullshit.org/tools/tools_misleading_axes.html
Lying with Data • Misleading visualizations • A good resource is http://callingbullshit.org/tools/tools_misleading_axes.html
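One common way a chart misleads is a bar-chart axis that does not start at zero. The sketch below, using hypothetical values rather than the figures from the linked resource, estimates how much taller one bar looks than another for a given axis starting point. The function name `visual_ratio` is made up for this example.

```python
def visual_ratio(smaller, larger, axis_start=0.0):
    """How many times taller the larger bar appears than the smaller one
    when the vertical axis begins at axis_start."""
    if axis_start >= smaller:
        raise ValueError("axis_start must be below the smaller value")
    return (larger - axis_start) / (smaller - axis_start)

# Hypothetical averages of weekly hours worked, invented for illustration.
group_a, group_b = 35.0, 37.0

honest = visual_ratio(group_a, group_b)                      # axis starts at 0
truncated = visual_ratio(group_a, group_b, axis_start=34.0)  # axis starts at 34

print(f"axis from 0:  bar B looks {honest:.2f}x as tall as bar A")
print(f"axis from 34: bar B looks {truncated:.2f}x as tall as bar A")
```

With the axis starting at 0, a two-hour difference looks like the roughly 6% gap it is. With the axis starting at 34, the same difference makes one bar three times as tall as the other.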
Lying with Data • Correlation vs causation • Misleading visualizations • Using bad data • Selective storytelling
Lying with Data • Key takeaways • Misusing data can be intentional or unintentional • Be careful with your data and the conclusions you draw from them • Do not manipulate your data to fit the story you want to tell; stay open to other stories • Be critical of how others use data • Be honest if you make a mistake